Jozo Dujmovic San Francisco State University
Program space is defined as a space where each point represents a program. The distance (difference) between programs can be computed using various multidimensional non-Euclidean semimetrics. We propose and use two classes of difference models: (1) White-Box models, and (2) Black-Box models. In all cases we use normalized differences: the minimum difference is 0 (for identical programs) and the maximum difference is 1 (or 100%, for the most different programs). Redundant programs can be visualized as clusters of close points. The primary motivation for exploring program space comes from practical workload characterization problems related to design, evaluation, and comparison of benchmark suites. Our goals are to increase the reliability and decrease the cost of benchmarking of computer systems and networks. This is particularly important for standard industrial benchmarks (such as suites developed by SPEC, TPC, GPC, PERFECT Club and others). Our design and evaluation techniques are based on theoretical concepts of benchmark suite size, density, granularity, and coverage (utilization) of program space. These concepts are used to eliminate excessive redundancy and to achieve a desired distribution of benchmarks in the program space. In particular, we propose the design of universal benchmark suites which have the following properties: (1) uniform and low redundancy between component workloads, (2) maximum size, and (3) uniform distribution of workloads within the program space.