Learning with small disjuncts

Gary M. Weiss

doi:10.7282/t3-vem5-z794

Technical documentation

Open access

Rutgers University

1995

DOI: https://doi.org/10.7282/t3-vem5-z794

Abstract

Systems that learn from examples often create a disjunctive concept definition. Disjuncts in the concept definition that cover only a few training examples are referred to as small disjuncts. The problem with small disjuncts is that they are more error prone than large disjuncts, yet they may be necessary to achieve a high level of predictive accuracy [Holte, Acker, and Porter, 1989]. This paper extends previous work on the problem of small disjuncts by investigating why small disjuncts are more error prone than large disjuncts and by evaluating the impact small disjuncts have on inductive learning. It shows that attribute noise, missing attributes, class noise, and training set size can each cause small disjuncts to be more error prone than large disjuncts. It also evaluates the impact that these factors have on learning with small disjuncts (i.e., on the error rate). For two artificial domains, it shows that when low levels of attribute noise are applied only to the training set (so that the ability to learn the correct noise-free concept is being evaluated), small disjuncts are primarily responsible for making learning difficult.
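As a rough illustration of the terms used in the abstract, the following sketch measures the size of each disjunct (the number of training examples it covers) and its error rate on separate test examples. Everything in it is an assumption made for illustration: the two disjuncts, the attribute names, the toy data, and the helper functions are hypothetical and are not taken from the report, and the numbers say nothing about the report's experiments.

```python
from collections import Counter

# Hypothetical learned concept (not from the report): each disjunct is a
# conjunction of attribute == value tests, and an example is predicted
# positive when some disjunct covers it.
DISJUNCTS = {
    "d1": {"shape": "round"},
    "d2": {"shape": "square", "color": "red"},
}


def covering_disjunct(example):
    """Return the name of the first disjunct covering the example, or None."""
    for name, tests in DISJUNCTS.items():
        if all(example.get(attr) == value for attr, value in tests.items()):
            return name
    return None


def disjunct_sizes(train):
    """Count how many training examples each disjunct covers."""
    counts = Counter(covering_disjunct(ex) for ex, _label in train)
    return {name: counts[name] for name in DISJUNCTS}


def disjunct_error_rates(test):
    """Fraction of covered test examples whose true label is negative."""
    covered, errors = Counter(), Counter()
    for ex, label in test:
        name = covering_disjunct(ex)
        if name is None:
            continue  # uncovered examples are not attributed to any disjunct
        covered[name] += 1
        if not label:
            errors[name] += 1
    return {name: errors[name] / covered[name] for name in covered}


# Toy data, invented for this sketch: (attributes, true label) pairs.
TRAIN = [
    ({"shape": "round", "color": "red"}, True),
    ({"shape": "round", "color": "blue"}, True),
    ({"shape": "round", "color": "green"}, True),
    ({"shape": "round", "color": "red"}, True),
    ({"shape": "round", "color": "blue"}, True),
    ({"shape": "square", "color": "red"}, True),
    ({"shape": "square", "color": "blue"}, False),
    ({"shape": "square", "color": "green"}, False),
]
TEST = [
    ({"shape": "round", "color": "blue"}, True),
    ({"shape": "round", "color": "red"}, True),
    ({"shape": "round", "color": "green"}, True),
    ({"shape": "round", "color": "red"}, False),
    ({"shape": "square", "color": "red"}, True),
    ({"shape": "square", "color": "red"}, False),
    ({"shape": "square", "color": "blue"}, False),
]

if __name__ == "__main__":
    print("sizes:", disjunct_sizes(TRAIN))
    print("error rates:", disjunct_error_rates(TEST))
```

In this toy setup the disjunct `d2` covers a single training example and so counts as a small disjunct, while `d1` covers five. Grouping test errors by the disjunct that made the prediction is one simple way to compare small and large disjuncts; the report itself may measure size and error differently.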

Files and links

pdf: ml-tr-39 (48.14 kB)

Version of Record (VoR) Technical Documentation Open Access

Details

Title: Learning with small disjuncts
Creators: Gary M. Weiss (Author) - Computer Science (New Brunswick)
Date published: 1995
Publisher: Rutgers University
Number of pages: 1 online resource (14 pages) : illustrations
Academic Unit: School of Arts and Sciences; Computer Science (SAS)
Language: English
Resource Type: Technical documentation
Comment: Technical report ml-tr-39
Identifiers: 991031549952404646