haskell
This is a collection of data structures and algorithms
useful for building bioinformatics-related tools
and utilities.
Current features include: a Sequence data type supporting protein and
nucleotide sequences and conversion between them (as of version 0.4,
different kinds of sequence have different types); support for quality
data; reading and writing Fasta-formatted files; reading the TwoBit and
phd formats and Roche/454 SFF files; rudimentary (i.e. unoptimized) support
for alignments, including dynamic adjustment of scores based on sequence
quality; Blast output parsing; partly implemented single-linkage clustering
and multiple alignment; and reading Gene Ontology (GO) annotations (GOA) and
definitions\/hierarchy.
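To illustrate what "different kinds of sequence have different types" can mean in practice, here is a minimal, self-contained sketch using phantom type tags. Every name in it (Nuc, Amino, Sequence, revCompl, translate) is a hypothetical stand-in chosen for illustration and is not taken from the library's actual API. The codon table is deliberately cut down to four entries.

```haskell
-- Hypothetical sketch: these names are illustrative, not the library's API.
import Data.Char (toUpper)

-- Empty types used only as phantom tags for the kind of sequence.
data Nuc
data Amino

-- A sequence carries a label and its residues; the tag 'k' exists
-- only at the type level and has no runtime representation.
data Sequence k = Seq { seqLabel :: String, seqData :: String }
  deriving (Show, Eq)

-- Complement a single nucleotide; unrecognised characters map to 'N'.
complement :: Char -> Char
complement c = case toUpper c of
  'A' -> 'T'
  'T' -> 'A'
  'C' -> 'G'
  'G' -> 'C'
  _   -> 'N'

-- Reverse complement is only defined for nucleotide sequences.
revCompl :: Sequence Nuc -> Sequence Nuc
revCompl (Seq l d) = Seq l (reverse (map complement d))

-- Translate using a tiny codon table (an assumption made for brevity).
-- Codons outside the table become 'X'; a trailing partial codon is dropped.
translate :: Sequence Nuc -> Sequence Amino
translate (Seq l d) = Seq l (go (map toUpper d))
  where
    go (a:b:c:rest) = codon [a, b, c] : go rest
    go _            = []
    codon "ATG" = 'M'
    codon "TGG" = 'W'
    codon "AAA" = 'K'
    codon "TAA" = '*'
    codon _     = 'X'

main :: IO ()
main = do
  let dna = Seq "example" "atgaaatgg" :: Sequence Nuc
  print (seqData (revCompl dna))   -- prints "CCATTTCAT"
  print (seqData (translate dna))  -- prints "MKW"
```

With this kind of design, an expression such as revCompl (translate dna) is rejected by the type checker, because a protein sequence cannot be passed where a nucleotide sequence is expected.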
The Darcs repository is at: <http://malde.org/~ketil/biohaskell/biolib>.