Abstract
Using data compression, we derive predictable properties of program reference behavior. The motivation behind this approach is that if a data source is highly predictable, then its output has very low entropy and is therefore highly compressible. An important property of this approach is that prediction can be carried out without assuming any rigid model of the data source. We find the sequence of gaps between successive accesses to a given memory location (called the Inter-Reference Gap, or IRG, stream) to be highly compressible, and hence highly predictable. We validate this predictability in two ways:

o First, we present memory replacement algorithms, under both a fixed-memory scenario and a dynamic allocation setting, which exploit the predictable nature of IRGs to improve upon known techniques. For a fixed buffer, we obtain miss-ratio improvements of up to 37.5% over LRU replacement. For dynamic memory management we obtain up to 20% improvement in the space-time product over Denning's Working Set algorithm. These improvements are obtained at the cache (both L1 and L2), virtual memory, disk buffer, and database buffer levels.

o Second, we present trace compaction techniques, both lossless and lossy, using IRGs, and show significant improvements over other known techniques for trace compaction.

Next, we use spatial locality, at both the memory-reference and page levels, to propose a new technique for lossless trace compaction which improves upon the best known method of Samples [Samples SIGMETRICS89] by up to 60%. We also discover the predictable nature of missed cache lines under a variety of workloads, and propose a hardware scheme for prefetching based on the history of misses. This technique is shown to improve the miss ratio by up to 32% over non-prefetching schemes.
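To make the IRG notion concrete, the following minimal sketch (illustrative only, not the dissertation's implementation) splits a reference trace into per-address IRG streams, where each entry is the number of references between successive accesses to the same address; a regular looping pattern yields constant, and hence highly compressible, streams:

```python
from collections import defaultdict

def irg_streams(trace):
    """Split a reference trace into per-address IRG streams.

    For each address, record the gap (measured in references)
    between successive accesses to that address.
    """
    last_seen = {}           # address -> time of its most recent access
    gaps = defaultdict(list) # address -> list of inter-reference gaps
    for t, addr in enumerate(trace):
        if addr in last_seen:
            gaps[addr].append(t - last_seen[addr])
        last_seen[addr] = t
    return dict(gaps)

# A simple loop over three addresses gives a constant IRG of 3 per address.
trace = ['a', 'b', 'c'] * 4
print(irg_streams(trace))  # {'a': [3, 3, 3], 'b': [3, 3, 3], 'c': [3, 3, 3]}
```

Each constant stream has near-zero entropy, which is the sense in which IRG streams are compressible and predictable.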
Finally, we propose a new space-time product measure for dynamic memory management, since the known measures are inadequate for modern multithreaded and shared-memory architectures. Under this measure we show that the optimal online algorithm is a policy which alternates between two windows, unlike the fixed-window scheme of Denning's Working Set algorithm. We also present empirical evidence supporting the need for these newer measures and algorithms.
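For reference, the fixed-window baseline against which the two-window result is stated is Denning's working set W(t, tau): the set of distinct pages referenced in the last tau references ending at time t. A minimal sketch (illustrative, with a hypothetical `working_set` helper):

```python
def working_set(trace, t, tau):
    """Denning's working set W(t, tau): the distinct pages
    referenced in the window of the last tau references ending at t."""
    start = max(0, t - tau + 1)      # clamp the window at the start of the trace
    return set(trace[start:t + 1])   # distinct pages in the window

trace = [1, 2, 1, 3, 2, 4]
print(working_set(trace, 5, 3))  # {2, 3, 4}: pages referenced at times 3..5
```

A working-set policy keeps exactly W(t, tau) resident with a single fixed tau; the result above says that, under the new measure, the optimal online policy instead alternates between two window sizes.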