The problem with noise and small disjuncts

Gary M. Weiss

doi:10.7282/t3-ea9m-ev91

Technical documentation

Open access

Rutgers University

1994

DOI: https://doi.org/10.7282/t3-ea9m-ev91

Abstract

Systems that learn from examples often create a disjunctive concept definition. The disjuncts in the concept definition that cover only a few training examples are referred to as small disjuncts. The problem with small disjuncts is that they are more error-prone than large disjuncts, yet they may be necessary to achieve a high level of predictive accuracy [Holte, Acker, and Porter, 1989]. This paper extends previous work on the problem of small disjuncts by taking noise into account. It investigates the assertion that learning from noisy data is hard because it is difficult to distinguish between noise and true exceptions. In the process of evaluating this assertion, insights are gained into the mechanisms by which noise affects learning. Two domains are investigated. The experimental results suggest that the assertion is true for both Shapiro’s chess endgame domain [Shapiro, 1987] and the Wisconsin breast cancer domain [Wolberg, 1990], at least for low levels (5-10%) of class noise.
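
The two notions the abstract relies on, disjunct size and class noise, can be illustrated with a small sketch. This is not the procedure used in the report: the toy attributes `a`, `b`, `c`, the function names, the size threshold of 3, and the choice to flip each label independently are all assumptions made for illustration only.

```python
import random


def flip_class_labels(labels, classes, noise_rate, seed=0):
    """Return a copy of labels in which each label is, with probability
    noise_rate, replaced by a different class (an assumed noise model)."""
    rng = random.Random(seed)
    noisy = []
    for label in labels:
        if rng.random() < noise_rate:
            label = rng.choice([c for c in classes if c != label])
        noisy.append(label)
    return noisy


def disjunct_sizes(examples, disjuncts):
    """Count the training examples covered by each disjunct.
    An example is credited to the first disjunct that covers it."""
    sizes = {name: 0 for name, _ in disjuncts}
    for example in examples:
        for name, covers in disjuncts:
            if covers(example):
                sizes[name] += 1
                break
    return sizes


# Hypothetical toy data: 8 examples match one disjunct, 2 match the other.
examples = [{"a": 1, "b": 1, "c": 0}] * 8 + [{"a": 0, "b": 0, "c": 1}] * 2
labels = ["pos"] * 10

# A disjunctive concept definition: (a AND b) OR (NOT a AND c).
disjuncts = [
    ("a AND b", lambda x: x["a"] == 1 and x["b"] == 1),
    ("NOT a AND c", lambda x: x["a"] == 0 and x["c"] == 1),
]

SMALL_THRESHOLD = 3  # assumed cutoff for calling a disjunct "small"

sizes = disjunct_sizes(examples, disjuncts)
small = [name for name, n in sizes.items() if n <= SMALL_THRESHOLD]

# 10% class noise, the upper end of the range mentioned in the abstract.
noisy_labels = flip_class_labels(labels, ["pos", "neg"], noise_rate=0.10)
```

In this toy setting the second disjunct covers only two examples, so a single flipped label among them is hard to tell apart from a genuine exception, which is the difficulty the abstract describes.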

Files and links (1)

pdf

ml-tr-38 (280.21 kB)

Author's Original (AO) Open Access

Details

Title: The problem with noise and small disjuncts
Creators: Gary M. Weiss (Author) - Computer Science (New Brunswick)
Date published: 1994
Publisher: Rutgers University
Number of pages: 1 online resource (13 pages) : illustrations
Academic Unit: Computer Science (SAS); School of Arts and Sciences
Language: English
Resource Type: Technical documentation
Comment: Technical report ml-tr-38
Identifiers: 991031549967404646