Project Overview

Biological Problem:

There is a need for a simple, easy-to-use application that searches for homologous genes in several target organisms, given a set of genes in a pre-described pathway (i.e., wyn). Our software addresses this need and presents the resulting gene distances concisely in graphical form.

Background:

Determining evolutionary relationships between organisms from their physical characteristics is a long-established practice in biology. With advances in bioinformatics, it is now possible to postulate evolutionary relationships between organisms by comparing their genetic information. Orthologs are genes in different organisms that are derived from the same ancestral gene. Assuming constant mutation rates, the evolutionary distance between two organisms can be approximated by the sequence similarity between their orthologs. Thus, sequence comparison of single orthologs from different organisms has yielded phylogenetic trees showing their evolutionary relationships.
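As a rough illustration of how sequence similarity can be turned into a distance, the sketch below computes the fraction of differing sites between two sequences. It assumes the sequences are already aligned to the same length and uses the simple proportion of mismatches; the function name p_distance and the example sequences are hypothetical, and this is not necessarily the metric our program uses.

```python
def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Fraction of differing sites between two pre-aligned sequences.

    Assumption: both inputs come from the same alignment, so they have
    equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip columns where either sequence has a gap.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    mismatches = sum(1 for x, y in pairs if x != y)
    return mismatches / len(pairs)


# Two made-up orthologs that differ at 2 of 8 aligned sites.
example_distance = p_distance("ACGTACGT", "ACGAACGA")
print(example_distance)  # 0.25
```

A smaller value means the two orthologs are more similar, which under the constant-rate assumption suggests a more recent common ancestor.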

However, results from single-gene analysis are sensitive to the choice of gene. Orthologs with essential cellular functions are well conserved, while those with no essential function are more tolerant of mutations. Thus, similarity comparisons between orthologs with essential cellular functions will likely give a shorter distance between organisms than comparisons between other genes. In addition, the similarities measured for such well-conserved genes may not differ significantly from one organism pair to the next. As a result, a phylogenetic tree cannot be constructed with good consistency and confidence from single-gene similarity comparisons.

To address this problem of consistency, our group suggests comparing entire pathways between organisms to obtain the distances between them. A hierarchical clustering technique will then be used to construct the phylogenetic tree. Pathway analysis takes into account multiple genes, all of which are related in function. Because the variation in each individual gene is averaged out rather than counted directly, a more stable tree can be obtained than with single-gene analysis.
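The idea can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not our actual implementation: it assumes the pathway distance between two organisms is the plain mean of the per-gene distances, and it uses average-linkage agglomerative clustering as one possible hierarchical clustering technique. The organism names, the distance values, and the function names pathway_distance and average_linkage are all hypothetical.

```python
from itertools import combinations


def pathway_distance(gene_distances):
    """Mean of the per-gene distances for one pair of organisms."""
    return sum(gene_distances) / len(gene_distances)


def cluster_gap(cluster_a, cluster_b, dist):
    """Average distance between the members of two clusters."""
    _, members_a = cluster_a
    _, members_b = cluster_b
    total = sum(dist[frozenset((x, y))] for x in members_a for y in members_b)
    return total / (len(members_a) * len(members_b))


def average_linkage(names, dist):
    """Repeatedly merge the two closest clusters; return a nested tuple tree.

    dist maps frozenset({a, b}) to the distance between organisms a and b.
    """
    # Each cluster is (subtree, list of member organisms).
    clusters = [(name, [name]) for name in names]
    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2),
                          key=lambda pair: cluster_gap(pair[0], pair[1], dist))
        clusters = [c for c in clusters if c is not left and c is not right]
        clusters.append(((left[0], right[0]), left[1] + right[1]))
    return clusters[0][0]


# Made-up per-gene distances for a three-gene pathway in three organisms.
per_gene = {
    ("org_A", "org_B"): [0.10, 0.20, 0.15],
    ("org_A", "org_C"): [0.40, 0.50, 0.45],
    ("org_B", "org_C"): [0.35, 0.55, 0.45],
}
dist = {frozenset(pair): pathway_distance(values)
        for pair, values in per_gene.items()}

tree = average_linkage(["org_A", "org_B", "org_C"], dist)
print(tree)  # ('org_C', ('org_A', 'org_B'))
```

In this toy data the individual gene distances between org_A and org_C range from 0.40 to 0.50, but the tree depends only on their mean, which is the sense in which single-gene variation is averaged out.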


Team Members and Project Assignments:

Li Yan - HTML/CGI for user input, Perl program for running BLAST and parsing results

Christina Yau - graphical output tree construction

Harry Choi - matrix algorithm design, clustering

Gabe Kwong - distance metric calculations, matrix generation

Nicholas Lin - initial UI design with NetBeans (scrapped in favor of the more reliable HTML/CGI interface), project website design and compilation of the PowerPoint presentation, assisted with CGI UI design

**Everyone contributed to the PowerPoint presentation.

