Abstract
Systems that learn from examples often create a disjunctive concept definition. Disjuncts that cover only a few training examples are referred to as small disjuncts. Small disjuncts are more error-prone than large disjuncts, yet they may be necessary to achieve a high level of predictive accuracy [Holte, Acker, and Porter, 1989]. This paper extends previous work on the problem of small disjuncts by taking noise into account. It investigates the assertion that it is hard to learn from noisy data because it is difficult to distinguish between noise and true exceptions. In the process of evaluating this assertion, insights are gained into the mechanisms by which noise affects learning. Two domains are investigated. The experimental results suggest that, for both Shapiro’s chess endgame domain [Shapiro, 1987] and the Wisconsin breast cancer domain [Wolberg, 1990], the assertion is true, at least for low levels (5-10%) of class noise.