Software Development :
Many resources for handling large volumes of genome data are available over the internet. Sites such as NCBI and EMBL are well known, and there are many highly specialized databases. For my research, however, I found that I had to develop several programs of my own. These were written in C for convenience. More recently we have been slowly shifting to Perl, which is well suited to bioinformatics. Below is an overview of some of the software developed in my lab.
DNAPRO1 :
It takes a mathematical-linguistic approach to genome data. I can use it to rank and order an arbitrary set of n-tuples over any length of the genome sequence, working in either the overlapping or the non-overlapping mode. It is designed to analyse nucleotide as well as peptide chains.
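The counting and ranking step can be illustrated with a short C sketch. This is not the DNAPRO1 source. It assumes a nucleotide-only alphabet encoded in base 4, and the names `rank_tuples`, `TupleCount` and `MAX_N` are hypothetical, introduced here for illustration only. Windows containing any other character (for example `N`) are skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_N 8  /* arbitrary cap for this sketch: 4^8 = 65536 counters */

typedef struct {
    unsigned index;       /* base-4 code of the n-tuple (A=0, C=1, G=2, T=3) */
    unsigned long count;  /* number of times the n-tuple was seen */
} TupleCount;

/* Map a nucleotide to 0..3, or -1 for any other character. */
static int base_code(char c)
{
    switch (c) {
    case 'A': case 'a': return 0;
    case 'C': case 'c': return 1;
    case 'G': case 'g': return 2;
    case 'T': case 't': return 3;
    default:            return -1;
    }
}

/* Encode seq[pos .. pos+n-1] as a base-4 integer; -1 if it holds a non-ACGT character. */
static long encode_tuple(const char *seq, size_t pos, int n)
{
    long code = 0;
    for (int i = 0; i < n; i++) {
        int b = base_code(seq[pos + i]);
        if (b < 0)
            return -1;
        code = code * 4 + b;
    }
    return code;
}

/* qsort comparator: higher count first, ties broken by smaller tuple code. */
static int by_count_desc(const void *a, const void *b)
{
    const TupleCount *x = a, *y = b;
    if (x->count != y->count)
        return (x->count < y->count) ? 1 : -1;
    return (x->index > y->index) - (x->index < y->index);
}

/* Count every n-tuple in seq and return all 4^n of them, most frequent first.
 * overlapping != 0: the window advances by 1; otherwise it advances by n.
 * The caller frees the result; *n_out receives 4^n. Returns NULL on allocation failure. */
TupleCount *rank_tuples(const char *seq, int n, int overlapping, size_t *n_out)
{
    assert(n >= 1 && n <= MAX_N);
    size_t total = (size_t)1 << (2 * n);
    TupleCount *tab = calloc(total, sizeof *tab);
    if (!tab)
        return NULL;
    for (size_t i = 0; i < total; i++)
        tab[i].index = (unsigned)i;

    size_t len = strlen(seq);
    size_t step = overlapping ? 1 : (size_t)n;
    for (size_t pos = 0; pos + (size_t)n <= len; pos += step) {
        long code = encode_tuple(seq, pos, n);
        if (code >= 0)
            tab[code].count++;
    }
    qsort(tab, total, sizeof *tab, by_count_desc);
    *n_out = total;
    return tab;
}
```

For the sequence `ACACAC` with n = 2, the overlapping mode sees AC, CA, AC, CA, AC, while the non-overlapping mode sees only AC, AC, AC, so the two modes can rank the same tuples quite differently. A peptide version would need a 20-letter alphabet in place of the base-4 encoding.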
DNAPRO2 :
An extremely useful program that looks at correlations, over arbitrary distances along genomes or peptide chains, between tuples of length n and tuples of length m. It has powerful graphics support that helps us visualize the underlying structure of the sequences in a few pictures.
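One common way to define such a correlation, assumed here because the text does not give DNAPRO2's exact formula, is C(d) = P(u at i and v at i+d) - P(u at i) P(v at i+d) for a given n-tuple u and m-tuple v. The C sketch below estimates it by direct counting. The function name `tuple_correlation` is hypothetical, and the code works on any character alphabet, so it applies to nucleotide and peptide strings alike.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 1 if the pattern pat (length plen) occurs in seq starting at position pos. */
static int occurs_at(const char *seq, size_t pos, const char *pat, size_t plen)
{
    return memcmp(seq + pos, pat, plen) == 0;
}

/* Estimate C(d) = P(u at i, v at i+d) - P(u at i) * P(v at i+d).
 * Probabilities are relative frequencies over all start positions i for
 * which both windows fit inside seq. Returns 0.0 if no such position exists. */
double tuple_correlation(const char *seq, const char *u, const char *v, size_t d)
{
    size_t len = strlen(seq), n = strlen(u), m = strlen(v);
    assert(n > 0 && m > 0);

    /* Both seq[i .. i+n-1] and seq[i+d .. i+d+m-1] must lie inside the sequence. */
    size_t span = (n > d + m) ? n : d + m;
    if (len < span)
        return 0.0;
    size_t positions = len - span + 1;

    size_t cu = 0, cv = 0, cuv = 0;
    for (size_t i = 0; i < positions; i++) {
        int hu = occurs_at(seq, i, u, n);
        int hv = occurs_at(seq, i + d, v, m);
        cu  += (size_t)hu;
        cv  += (size_t)hv;
        cuv += (size_t)(hu && hv);
    }

    double total = (double)positions;
    return (double)cuv / total - ((double)cu / total) * ((double)cv / total);
}
```

Calling this for d = 1, 2, 3, ... gives a correlation profile for one tuple pair that can then be plotted. On the periodic sequence `ATATATAT`, for instance, the pair (A, T) is positively correlated at d = 1 and negatively correlated at d = 2.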
Entropic Segments :
Another program written in C looks at the patch distribution over the chains. The segmentation algorithm follows two different routes: the usual Jensen-Shannon divergence, and an entropy based on Fourier spectral weights, the so-called structure factors. In the usual entropic segmentation algorithms, where to stop is an important issue. I find I can stop at the boundaries of correlation zones. When I Fourier-transform the correlations over the zones, I pick up information on the distribution of the patches. I am now looking at specialized E. coli databases, trying to relate the entropic segments to known, biologically established boundaries.