Architecture



Most computer-based bioinformatic analyses follow at least three canonical steps. First, the data of interest must be read. Second, the data are processed and analyzed. Finally, the results are returned and presented. Every software developer has to implement all of these parts, although much of the code is suitable for reuse. Data import in particular, such as parsing different formats or accessing databases, lends itself to reuse. The same holds for data presentation, for example as a Receiver Operating Characteristic (ROC) curve or as a histogram of a data distribution. Finally, it is desirable that common jobs, for instance comparing the distribution of amino acids or visualizing the structural classes of proteins, can be applied to various data sources.

This page presents a Java library intended to help researchers answer bioinformatic questions faster by providing a software development framework.
It connects different modules through standardized interfaces, promotes reusable code over throw-away scripts, simplifies access to a variety of data sources, and includes a large number of statistical tests and helper classes for recurring tasks.
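
The idea of connecting modules through standardized interfaces can be illustrated with a minimal sketch of the three canonical steps (read, analyze, present). All names below (`PipelineSketch`, `DataSource`, `Analysis`, `Presenter`, `run`) are hypothetical and chosen for illustration only; they are not taken from the library's actual API.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names are taken from the library itself.
class PipelineSketch {

    /** Step 1: reads records from some source (file, database, ...). */
    interface DataSource<T> {
        List<T> read();
    }

    /** Step 2: turns the records into a result. */
    interface Analysis<T, R> {
        R analyze(List<T> records);
    }

    /** Step 3: renders a result for the user. */
    interface Presenter<R> {
        String present(R result);
    }

    /** Connects the three steps; each step can be swapped independently. */
    static <T, R> String run(DataSource<T> source,
                             Analysis<T, R> analysis,
                             Presenter<R> presenter) {
        return presenter.present(analysis.analyze(source.read()));
    }

    /** Example analysis: counts how often each residue letter occurs. */
    static Map<Character, Integer> countResidues(List<String> sequences) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (String sequence : sequences) {
            for (char residue : sequence.toCharArray()) {
                counts.merge(residue, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Example presenter: plain-text histogram, one line per residue. */
    static String textHistogram(Map<Character, Integer> counts) {
        StringBuilder out = new StringBuilder();
        counts.forEach((residue, n) ->
                out.append(residue).append(' ').append("*".repeat(n)).append('\n'));
        return out.toString();
    }

    public static void main(String[] args) {
        // In-memory source standing in for a file parser or database client.
        DataSource<String> source = () -> List.of("MKV", "MKK");
        System.out.print(run(source,
                PipelineSketch::countResidues,
                PipelineSketch::textHistogram));
    }
}
```

Because `run` depends only on the three interfaces, the same residue-count analysis could be applied to a different data source, or the same data could be rendered by a different presenter, without touching the other steps.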

Content: