Abstract
As the gap between processor and main memory speed continues to grow, higher cache hit rates are required for efficient processor use. Recent work on compile-time transformations to improve locality in scientific programs has focused on loop fusion, tiling, and distribution; previous work suggests that loop skewing is not useful in optimizing for locality. In this article, we show that the value of loop skewing may only be evident in a compiler that includes transformations that have not been applied in empirical studies of locality (such as the interchange of imperfectly nested loops). We also show how a new approach to data transformation can be used to further reduce memory traffic for these calculations.