[6] | Günter Böckle, Hermann Hellwagner, Roland Lepold, Gerd Sandweg, Burghardt Schallenberger, Raimar Thudt, Stefan Wallstab, Structured Evaluation of Computer Systems, In IEEE Computer, vol. 29, no. 6, pp. 45-51, 1996.
[bib] [doi] [pdf] [abstract]
Abstract: Evaluating computers and other systems is difficult for a couple of reasons. First, the goal of evaluation is typically ill-defined: customers, sometimes even designers, either don't know or can't specify exactly what result they expect. Often, they don't specify the architectural variants to consider, and often the metrics and workload they expect you to use are ill-defined. Second, they rarely clarify which kind of model and evaluation method best suit the evaluation problem. These problems have consequences. For one thing, the decision-maker may not trust the evaluation. For another, poor planning means the evaluation cannot be reproduced if any of the parameters are changed slightly. Finally, the evaluation documentation is usually inadequate, and so some time after the evaluation you might ask yourself, how did I come to that conclusion? An approach developed at Siemens makes decisions explicit and the process reproducible.
|
[5] | Arndt Bode, Michael Gerndt, R. Hackenberg, Hermann Hellwagner, High-Level Programming Models and Supportive Environments (HIPS'96), In Proceedings of IPPS '96, The 10th International Parallel Processing Symposium, IEEE Computer Society, 1996.
[bib] |
[4] | Günter Böckle, Hermann Hellwagner, Systematic Assessment of Computer Systems Architectures, In Innovationen bei Rechen- und Kommunikationssystemen, Eine Herausforderung für die Informatik (Bernd E. Wolfinger, ed.), Springer Verlag, pp. 310-317, 1994.
[bib] |
[3] | Hermann Hellwagner, Randomized Shared Memory - Concept and Efficiency of a Scalable Shared Memory Scheme, In Parallel Computer Architectures: Theory, Hardware, Software, Applications (Arndt Bode, Mario Dal Cin, eds.), Springer Verlag, London, UK, pp. 102-117, 1993.
[bib] [abstract]
Abstract: Our work explores the practical relevance of Randomized Shared Memory (RSM), a theoretical concept that has been proven to enable an (asymptotically) optimally efficient implementation of scalable and universal shared memory in a distributed-memory parallel system. RSM (address hashing) pseudo-randomly distributes global memory addresses throughout the nodes' local memories. High memory access latencies are masked through massive parallelism. This paper introduces the basic principles and properties of RSM and analyzes its practical efficiency in terms of constant factors through simulation studies, assuming a state-of-the-art parallel architecture. Bottlenecks in the architecture are pointed out, and improvements are made and their effects assessed quantitatively. The results show that RSM efficiency is encouragingly high, even in a non-optimized architecture. We propose architectural features to support RSM and conclude that RSM may indeed be a feasible shared-memory implementation in future massively parallel computers.
|
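The address hashing at the core of Randomized Shared Memory, as summarized in [3] above, can be illustrated with a small sketch in C. The hash function, node count, and collision handling below are hypothetical simplifications for illustration, not the scheme analyzed in the paper:

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical machine parameters, chosen only for this sketch. */
    #define NUM_NODES   64u
    #define LOCAL_SLOTS 1024u

    /* A simple multiplicative hash; the actual RSM hash family is not
     * reproduced here. */
    static uint64_t hash_addr(uint64_t global_addr)
    {
        return global_addr * 0x9E3779B97F4A7C15ull;
    }

    /* Map a global shared-memory address to a pseudo-random home node and
     * an illustrative local slot. A real RSM implementation needs an
     * injective translation (or per-node tables) so that distinct global
     * addresses never collide in a local slot. */
    static void rsm_translate(uint64_t global_addr,
                              unsigned *node, unsigned *local_slot)
    {
        uint64_t h = hash_addr(global_addr);
        *node = (unsigned)(h % NUM_NODES);
        *local_slot = (unsigned)((h / NUM_NODES) % LOCAL_SLOTS);
    }

    int main(void)
    {
        /* Nearby global addresses scatter across different nodes, which is
         * what spreads the memory access load pseudo-randomly. */
        for (uint64_t a = 0; a < 8; a++) {
            unsigned node, slot;
            rsm_translate(a * 64, &node, &slot);   /* 64-byte granules */
            printf("addr %5llu -> node %2u, slot %4u\n",
                   (unsigned long long)(a * 64), node, slot);
        }
        return 0;
    }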
[2] | Hermann Hellwagner, Design Considerations for Scalable Parallel File Systems, In The Computer Journal - Parallel Processing, vol. 36, no. 8, pp. 741-755, 1993.
[bib] [pdf] [abstract]
Abstract: This paper addresses the problem of providing high-performance disk I/O in massively parallel computers. Resolving the fundamental I/O bottleneck in parallel architectures involves both hardware and software issues. We review previous work on disk arrays and I/O architectures aimed at providing highly parallel disk I/O subsystems. We then focus on the requirements and design of parallel file systems (PFSs), which are responsible for making the parallelism offered by the hardware and a declustered file organization available to application programs. We present the design strategy and key concepts of a general-purpose file system for a parallel computer with scalable distributed shared memory. The principal objectives of the PFS are to fully exploit the parallelism inherent among and within file accesses, and to provide scalable I/O performance. The machine model underlying the design is described, with an emphasis on the innovative architectural features supporting scalability of the shared memory. Starting from a classification of various scenarios of concurrent I/O requests, the features of the PFS design essential for achieving the goals are described and justified. It is argued that the inter- and intra-request parallelism of the I/O load can indeed be effectively exploited and supported by the parallel system resources. Scalability of I/O performance and of the PFS software can be ensured by avoiding serial bottlenecks through the use of the powerful architectural features.
|
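The declustered file organization that the parallel file system in [2] is designed to exploit can be sketched as round-robin striping of a file across disks; a large sequential request then decomposes into sub-requests that can be issued to all disks in parallel. The stripe unit and disk count below are assumed values, not those of the actual design:

    #include <stdio.h>

    /* Hypothetical declustering parameters for this sketch. */
    #define NUM_DISKS   8u
    #define STRIPE_UNIT 4096ul   /* bytes per stripe unit */

    /* Map a byte offset within a file to (disk, block-within-disk) under a
     * simple round-robin layout. */
    static void decluster(unsigned long file_offset,
                          unsigned *disk, unsigned long *disk_block)
    {
        unsigned long unit = file_offset / STRIPE_UNIT;
        *disk = (unsigned)(unit % NUM_DISKS);
        *disk_block = unit / NUM_DISKS;
    }

    int main(void)
    {
        /* A 12-unit sequential read touches every disk, so the sub-requests
         * can proceed concurrently (intra-request parallelism). */
        for (unsigned long off = 0; off < 12 * STRIPE_UNIT; off += STRIPE_UNIT) {
            unsigned disk;
            unsigned long block;
            decluster(off, &disk, &block);
            printf("offset %7lu -> disk %u, block %lu\n", off, disk, block);
        }
        return 0;
    }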
[1] | Hermann Hellwagner, On the Practical Efficiency of Randomized Shared Memory, In Parallel Processing: CONPAR 92 - VAPP V, Second Joint International Conference on Vector and Parallel Processing (Luc Bougé, Michel Cosnard, Yves Robert, Denis Trystram, eds.), Springer, Berlin-Heidelberg, pp. 429-440, 1992.
[bib] [abstract]
Abstract: This paper analyzes the efficiency of Randomized Shared Memory (RSM) in terms of constant factors. RSM or memory hashing, that is, pseudorandom distribution of global memory addresses throughout local memories in a distributed-memory parallel system, has been proven to enable an (asymptotically) optimally efficient implementation of scalable and universal shared memory. High memory access latencies are hidden through massive parallelism. Our work examines the practical relevance and feasibility of this potentially significant theoretical result. After an introduction of the background, principles, and desirable properties of RSM and an outline of the approach to determine RSM efficiency, the major results of our simulations are presented. The results show that RSM efficiency is encouragingly high (up to 20% efficiency of idealized shared memory), even in an architecture modelled on the basis of state-of-the-art technology. Performance-limiting factors are identified from the results and architectural features to increase efficiency are proposed, most notably extremely fast process switching and a combining network. Several novel machine designs document the increased interest in RSM and hardware support.
|
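The latency-masking argument common to [1] and [3], namely that enough parallel processes per node plus fast context switching can hide high remote-access latencies, can be made concrete with a back-of-envelope estimate. The cycle counts below are illustrative assumptions, not figures from the simulations:

    #include <stdio.h>

    /* Rough estimate of how many ready processes each node needs so the
     * processor stays busy while remote RSM accesses are in flight.
     * All cycle counts are hypothetical. */
    int main(void)
    {
        double remote_latency  = 200.0;  /* cycles per remote access round trip */
        double work_per_access = 20.0;   /* cycles of local work between accesses */

        /* One process computes while the others wait on outstanding remote
         * accesses; the latency is hidden once roughly latency/work extra
         * processes are available (ignoring switch cost and contention). */
        double processes_needed = 1.0 + remote_latency / work_per_access;

        printf("Roughly %.0f ready processes per node mask a %.0f-cycle "
               "remote latency with %.0f cycles of work between accesses.\n",
               processes_needed, remote_latency, work_per_access);
        return 0;
    }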