Automated classification of DNA structure from sequence information

David Loewenstern; Helen M. Berman; Haym Hirsh

doi:10.7282/T35T3Q4J

Technical documentation

Open access

Rutgers University

1997

DOI: https://doi.org/10.7282/T35T3Q4J

Abstract

We introduce an algorithm, lllama, which combines simple pattern recognizers into a general method for estimating the entropy of a sequence. Each pattern recognizer exploits a partial match between subsequences to build a model of the sequence. Since the primary features of interest in biological sequence domains are subsequences with small variations in exact composition, lllama is particularly suited to such domains. We describe two methods, lllama-length and lllama-alone, which use this entropy estimate to perform maximum a posteriori classification. We apply these methods to several problems in three-dimensional structure classification of short DNA sequences. The results include a surprisingly low 3.6% error rate in predicting helical conformation of oligonucleotides. We compare our results to those obtained using more traditional methods for automated generation of classifiers.
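
The general idea of classifying a sequence by maximum a posteriori probability from an entropy estimate can be sketched as follows. This is not lllama itself: the abstract does not specify its pattern recognizers, so the sketch substitutes a plain order-1 Markov model with add-one smoothing as the entropy estimator. The class names, the helper `map_classify`, the labels `class_1` and `class_2`, and the toy sequences are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


class MarkovModel:
    """Order-k Markov model over DNA with add-one smoothing.

    Assumption: a stand-in entropy estimator, not the lllama algorithm.
    """

    def __init__(self, order=1, alphabet="ACGT"):
        self.order = order
        self.alphabet = alphabet
        # counts[context][symbol] = times symbol followed context in training
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, sequences):
        for seq in sequences:
            for i in range(len(seq)):
                context = seq[max(0, i - self.order):i]
                self.counts[context][seq[i]] += 1
        return self

    def code_length(self, seq):
        """Return -log2 P(seq | model) in bits (an entropy estimate times len(seq))."""
        bits = 0.0
        for i in range(len(seq)):
            context = seq[max(0, i - self.order):i]
            seen = self.counts.get(context, {})
            total = sum(seen.values())
            p = (seen.get(seq[i], 0) + 1) / (total + len(self.alphabet))
            bits -= math.log2(p)
        return bits


def map_classify(seq, models, priors):
    """Pick the class minimizing code length minus log2 prior.

    Minimizing -log2 P(seq | c) - log2 P(c) is the same as maximizing
    the posterior P(c | seq).
    """
    scores = {
        label: model.code_length(seq) - math.log2(priors[label])
        for label, model in models.items()
    }
    return min(scores, key=scores.get)


# Toy, invented training sequences (not data from the report).
training = {
    "class_1": ["GGGGCCCC", "GGGCCC", "CCGGGG"],
    "class_2": ["ATATATAT", "TATATA", "AATTAT"],
}
models = {label: MarkovModel(order=1).fit(seqs) for label, seqs in training.items()}
priors = {"class_1": 0.5, "class_2": 0.5}

print(map_classify("GGCCGG", models, priors))
print(map_classify("TATAAT", models, priors))
```

Dividing `code_length` by the sequence length gives a per-symbol entropy estimate in bits. A sequence that resembles a class's training data compresses well under that class's model, so its code length is short and its posterior is high.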

Details

Title: Automated classification of DNA structure from sequence information
Creators: David Loewenstern (Author)
Helen M. Berman (Author)
Haym Hirsh (Author)
Date published: 1997
Publisher: Rutgers University
Number of pages: 22 pages
Academic Unit: School of Arts and Sciences; Center for Quantitative Biology; Chemistry and Chemical Biology (SAS); Computer Science (SAS)
Language: English
Resource Type: Technical documentation
Comment: Technical report DCS-TR-331
Identifiers: 991031549892704646