Architecture
Most computer-based bioinformatic analyses follow at least three canonical steps. First, the
data of interest must be read; second, the data is processed and analyzed; finally, the results
are returned and presented. Every software developer has to implement all of these parts,
although much of the code could be reused. Data import in particular, such as parsing different
file formats or accessing databases, lends itself to reuse. The same holds for data
presentation, for example as a Receiver Operating Characteristic curve (ROC curve) or as a
histogram of a data distribution. Finally, it is desirable that common jobs, for instance
comparing the distribution of amino acids or visualizing the structural classes of proteins,
can be applied to various data sources.
Here a Java library is presented that is intended to help researchers answer bioinformatic
questions faster by providing a software development framework.
It connects different modules through standardized interfaces, promotes reusable code
instead of throw-away scripts, simplifies access to a variety of data sources, and includes
a large collection of statistical test classes and helper classes for recurring tasks.
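The three steps described above (read, process, present) can be illustrated with a minimal
sketch of how modules might be connected through standardized interfaces. This is not the
library's actual API: the names DataSource, Engine, Presenter, AminoAcidCounter, TextHistogram
and PipelineSketch are hypothetical and were chosen only for this example, and the two short
sequences are made-up sample input.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: all type names here are assumptions, not the library's API.
public class PipelineSketch {

    /** Input layer: reads the data of interest. */
    interface DataSource<T> {
        T read();
    }

    /** Engine layer: processes and analyzes the data. */
    interface Engine<T, R> {
        R process(T input);
    }

    /** Visualisation layer: presents the result, here simply as text. */
    interface Presenter<R> {
        String present(R result);
    }

    /** Counts how often each amino acid letter occurs in a list of sequences. */
    static class AminoAcidCounter implements Engine<List<String>, Map<Character, Integer>> {
        @Override
        public Map<Character, Integer> process(List<String> sequences) {
            Map<Character, Integer> counts = new TreeMap<>(); // sorted by residue letter
            for (String sequence : sequences) {
                for (char residue : sequence.toCharArray()) {
                    counts.merge(residue, 1, Integer::sum);
                }
            }
            return counts;
        }
    }

    /** Renders the counts as a plain-text histogram, one line per residue. */
    static class TextHistogram implements Presenter<Map<Character, Integer>> {
        @Override
        public String present(Map<Character, Integer> counts) {
            StringBuilder out = new StringBuilder();
            for (Map.Entry<Character, Integer> entry : counts.entrySet()) {
                out.append(entry.getKey()).append(' ')
                   .append("#".repeat(entry.getValue())).append('\n');
            }
            return out.toString();
        }
    }

    /** Wires the three layers together; each layer can be swapped independently. */
    public static <T, R> String run(DataSource<T> source, Engine<T, R> engine,
                                    Presenter<R> presenter) {
        return presenter.present(engine.process(source.read()));
    }

    public static void main(String[] args) {
        // Made-up sample input; a real source would parse a file or query a database.
        DataSource<List<String>> source = () -> List.of("MKVL", "MAVK");
        System.out.print(run(source, new AminoAcidCounter(), new TextHistogram()));
    }
}
```

Because the layers only know each other through the three interfaces, the same
AminoAcidCounter could be fed from a different data source, or its result passed to a
different presenter, without changing the other parts. The sketch needs Java 11 or later
(for String.repeat).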
Contents:
- Architecture and Interfaces
- General design patterns
- Input layer
- Engine layer
- Visualisation layer
- Statistical tests