A Combined Model and a Varied Gibbs Sampling Algorithm Used for Motif Discovery

Wu, X., Cheng, J., Song, C. and Wang, B.

    Conserved sequences in gene regulatory regions play a dominant role in gene regulation. Discovering these sequences and their functions is important in the post-genome era. A novel model is constructed to represent conserved motifs in DNA sequences. This model combines the position weight matrix (PWM) and the weight array model (WAM). Its advantage is that it captures not only the individual base frequencies in the motifs but also the relationships between neighbouring bases. In addition, a varied Gibbs sampling algorithm is applied that takes into account the different numbers of motif occurrences in each sequence. This variation is more consistent with the actual mechanism of gene transcription control. By combining the model and the discovery algorithm, a program is constructed. After analysing a set of DNA sequences from the upstream regions of genes with this program, putative motifs are discovered and compared with experimentally verified regulatory sequences. Results showed that this combination is ideal for motif discovery and that the approach is meaningful for gene regulation research.
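    The abstract does not state how the PWM and WAM components are combined, so the following is only a minimal sketch of one possible reading. It assumes a PWM built from per-position base frequencies, a first-order WAM built from per-position conditional frequencies of a base given its left neighbour, and a combined score formed as a weighted average of the two log-likelihoods. The function names, the pseudocount, and the mixing weight `alpha` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_pwm(sites, pseudo=1.0):
    """Per-position base probabilities (PWM) with pseudocounts (assumed value)."""
    width = len(sites[0])
    total = len(sites) + 4 * pseudo
    pwm = []
    for i in range(width):
        counts = Counter(s[i] for s in sites)
        pwm.append({b: (counts[b] + pseudo) / total for b in BASES})
    return pwm

def build_wam(sites, pseudo=1.0):
    """Per-position P(base at i | base at i-1) for i >= 1 (first-order WAM)."""
    width = len(sites[0])
    wam = []
    for i in range(1, width):
        pairs = Counter((s[i - 1], s[i]) for s in sites)
        table = {}
        for prev in BASES:
            row_total = sum(pairs[(prev, b)] for b in BASES) + 4 * pseudo
            for b in BASES:
                table[(prev, b)] = (pairs[(prev, b)] + pseudo) / row_total
        wam.append(table)
    return wam

def pwm_log_score(site, pwm):
    """Log-likelihood of a site under independent positions."""
    return sum(math.log(pwm[i][b]) for i, b in enumerate(site))

def wam_log_score(site, pwm, wam):
    """Log-likelihood with neighbour dependence; position 0 uses the PWM column."""
    score = math.log(pwm[0][site[0]])
    for i in range(1, len(site)):
        score += math.log(wam[i - 1][(site[i - 1], site[i])])
    return score

def combined_log_score(site, pwm, wam, alpha=0.5):
    """Hypothetical combination: weighted average of the two log-likelihoods."""
    return alpha * pwm_log_score(site, pwm) + (1 - alpha) * wam_log_score(site, pwm, wam)

# Toy aligned sites, invented purely for illustration.
sites = ["ACGT", "ACGT", "ACGA", "TCGT"]
pwm = build_pwm(sites)
wam = build_wam(sites)
motif_like = combined_log_score("ACGT", pwm, wam)
background_like = combined_log_score("TTTT", pwm, wam)
```

    In this toy setup a site resembling the aligned examples scores higher than an unrelated one. In a Gibbs sampler such a score would typically be compared against a background model when sampling candidate site positions, and that step is not shown here.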
Cite as: Wu, X., Cheng, J., Song, C. and Wang, B. (2004). A Combined Model and a Varied Gibbs Sampling Algorithm Used for Motif Discovery. In Proc. Second Asia-Pacific Bioinformatics Conference (APBC2004), Dunedin, New Zealand. CRPIT, 29. Chen, Y.-P. P., Ed. ACS. 99-104.