Sequence data manipulation


The basic purpose of this tool is format conversion: sequences in common formats such as FASTA, PHYLIP, MEGA, and CLUSTAL can be converted from one format to another.
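As an illustration of what one such conversion involves, here is a minimal sketch in Python that reads FASTA text and writes sequential, relaxed PHYLIP. It is not the tool's own code. The function names `parse_fasta` and `to_phylip` are hypothetical, and the sketch assumes the input is already aligned and that the first word of each description is a usable taxon name.

```python
def parse_fasta(text):
    """Return a list of (description, sequence) pairs from FASTA text."""
    records = []
    desc, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if desc is not None:
                records.append((desc, "".join(chunks)))
            desc, chunks = line[1:].strip(), []
        elif desc is not None:
            chunks.append(line)
    if desc is not None:
        records.append((desc, "".join(chunks)))
    return records


def to_phylip(records):
    """Write aligned records as sequential, relaxed PHYLIP text."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("PHYLIP needs at least one record, all of equal length")
    lines = [f"{len(records)} {lengths.pop()}"]
    for desc, seq in records:
        words = desc.split()
        # Assumption: the first word of the description serves as the name.
        name = words[0] if words else "unnamed"
        lines.append(f"{name} {seq}")
    return "\n".join(lines) + "\n"
```

For example, `to_phylip(parse_fasta(open("input.fasta").read()))` would return the PHYLIP text for a hypothetical file `input.fasta`. Strict PHYLIP limits names to 10 characters, which is one way the short-description problem arises.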


This tool also handles a special case: the problem of short descriptions in phylogeny trees. The phylogeny trees produced contain only short descriptions, which leads to redundant (duplicate) labels in most cases. The tool solves this with a coding part and a decoding part. The coding part takes a set of input sequences in FASTA, CLUSTAL, or MEGA format and replaces the whole description of each sequence with a short description. It then generates the converted sequences, carrying the short descriptions, in the same format as the input, together with a match table that pairs each original description with its short description. The decoding part takes the match table and the tree data, and replaces each short description in the tree data with the original description from the match table.
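The coding and decoding parts can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's implementation. The short-description scheme (a prefix plus a zero-padded counter), the dictionary used as the match table, and the names `encode` and `decode` are all hypothetical. The tree is assumed to be Newick text.

```python
import re


def encode(records, prefix="S"):
    """Replace each description with a short unique ID.

    Returns the renamed records and a match table mapping short -> original.
    """
    width = max(4, len(str(len(records))))
    short_records, match_table = [], {}
    for i, (desc, seq) in enumerate(records, start=1):
        short = f"{prefix}{i:0{width}d}"  # e.g. S0001 (assumed scheme)
        short_records.append((short, seq))
        match_table[short] = desc
    return short_records, match_table


def decode(tree_text, match_table):
    """Put the original descriptions back into the tree text."""
    if not match_table:
        return tree_text
    # Match whole short IDs only, so S0001 is never found inside S00010.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in match_table) + r")\b"
    )
    return pattern.sub(lambda m: match_table[m.group(1)], tree_text)
```

The renamed records go to the tree-building program, and `decode` is applied to the tree it returns. Original descriptions containing spaces, parentheses, commas, or colons would need quoting or cleaning before they are valid Newick labels; the sketch leaves that out.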


The tool can also replace special characters and whitespace in the sequence description with an underscore character '_'. Currently this option works for the Nexus and FASTA sequence formats.
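A minimal sketch of this replacement in Python is given below. It assumes that "special characters" means anything other than ASCII letters, digits, and the underscore, and that each such character becomes its own underscore rather than runs being collapsed. Both points are assumptions, and the name `sanitize_description` is hypothetical.

```python
import re


def sanitize_description(desc):
    """Replace every character outside A-Z, a-z, 0-9 and '_' with '_'."""
    return re.sub(r"[^A-Za-z0-9_]", "_", desc)
```

Under these assumptions, `sanitize_description("Homo sapiens")` returns `Homo_sapiens`. Using the pattern `[^A-Za-z0-9_]+` instead would collapse each run of special characters into a single underscore.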

An option is provided to remove redundant sequences from a given data set in any of the major sequence formats. The result is a non-redundant sequence file.
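One way such a filter could work is sketched below in Python. The sketch assumes that two records are redundant when their sequences are identical, ignoring letter case, and that the first occurrence is the one kept. The tool's actual criterion is not stated here, and the name `remove_redundant` is hypothetical.

```python
def remove_redundant(records):
    """Keep the first record for each distinct sequence, preserving order."""
    seen = set()
    unique = []
    for desc, seq in records:
        key = seq.upper()  # assumption: comparison ignores letter case
        if key not in seen:
            seen.add(key)
            unique.append((desc, seq))
    return unique
```

The records are (description, sequence) pairs, so the output can be written back in whichever format the input used.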