Conferences in Research and Practice in Information Technology
  


 

 
 
    

Variable-length Intervals in Homology Search

Chattaraj, A. and Williams, H.E.

Fast, accurate, and scalable search techniques for homology searching of large genomic collections are becoming an increasingly important requirement as genomic sequence collections continue to double in size almost yearly. Almost all homology search techniques rely on extracting fixed-length overlapping subsequences from queries and database sequences, and comparing these as the first step in query evaluation; this is a feature of well-known tools such as FASTA, BLAST, and our own CAFE technique. In this paper we discuss a novel, variable-length approach to extracting subsequences that is based on homology scoring matrices. Our motivation is to achieve a balance between the speed and accuracy of fixed-length choices, that is, to combine the speed of longer subsequence lengths with the accuracy of shorter ones. We show that incorporating this approach into our CAFE technique leads to a good compromise between accuracy and retrieval efficiency when searching with BLOSUM matrices sensitive to distant evolutionary relationships. We expect the same results would be achieved with other homology search techniques.
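The contrast between fixed-length and variable-length extraction can be illustrated with a minimal sketch. The first function is the standard overlapping fixed-length extraction the abstract refers to. The second is only one possible reading of a "scoring-matrix-based" variable-length rule, and is an assumption for illustration, not the method of the paper: an interval starting at each position is extended until its cumulative self-match score reaches a threshold. The names `fixed_length_intervals`, `variable_length_intervals`, `TOY_SELF_SCORE`, and the score values are all hypothetical.

```python
def fixed_length_intervals(sequence, k):
    """Return (offset, substring) for every overlapping substring of length k."""
    return [(i, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


# Toy self-match scores for a few residues. These are illustrative values,
# not taken from the paper; a real system would read them from the diagonal
# of a scoring matrix such as a BLOSUM matrix.
TOY_SELF_SCORE = {"A": 4, "C": 9, "G": 6, "L": 4, "W": 11}


def variable_length_intervals(sequence, threshold,
                              self_score=TOY_SELF_SCORE, default=4):
    """Assumed rule (not the paper's): grow each interval until the sum of
    its residues' self-match scores reaches `threshold`. Start positions
    whose interval cannot reach the threshold before the sequence ends
    produce no interval."""
    intervals = []
    for start in range(len(sequence)):
        total = 0
        for end in range(start, len(sequence)):
            total += self_score.get(sequence[end], default)
            if total >= threshold:
                intervals.append((start, sequence[start:end + 1]))
                break
    return intervals


fixed = fixed_length_intervals("ACGT", 3)
variable = variable_length_intervals("WCAL", 12)
```

Under this assumed rule, intervals containing high-scoring (typically rarer) residues end sooner, while runs of low-scoring residues yield longer intervals, which is one way a scheme could trade the selectivity of long intervals against the sensitivity of short ones.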
Cite as: Chattaraj, A. and Williams, H.E. (2004). Variable-length Intervals in Homology Search. In Proc. Second Asia-Pacific Bioinformatics Conference (APBC2004), Dunedin, New Zealand. CRPIT, 29. Chen, Y.-P. P., Ed. ACS. 85-91.
 

 

© Copyright Australian Computer Society Inc. 2001-2014.