[17] | Robbie De Sutter, Sam Lerouge, Peter De Neve, Christian Timmerer, Hermann Hellwagner, Rik Van de Walle, Comparison of XML serializations: cost benefit vs. complexity, In ACM Multimedia Systems, Springer, vol. 12, no. 1, London, pp. 1-15, 2006.
[bib] [abstract]
Abstract: More and more data are structured, stored, and sent over a network using the Extensible Markup Language (XML). There are, however, concerns that the verbosity of XML may restrain further adoption of the language, especially when exchanging XML-based data over heterogeneous networks and when it is used within constrained (mobile) devices. Therefore, alternative (binary) serialization formats of the XML data become relevant in order to reduce this overhead. However, using binary-encoded XML should not introduce interoperability issues with existing applications nor add additional complexity to new applications. On top of that, it should have a clear cost reduction over the current plain-text serialization format. A first technology is developed within the ISO/IEC Moving Picture Experts Group, namely the Binary MPEG Format for XML. It provides good compression efficiency, the ability to (partially) update existing XML trees, and facilitates random access into, and manipulation of, the binary-encoded bit stream. Another technique is based on the Abstract Syntax Notation One specification with the Packed Encoding Rules created by the ITU-T. This paper evaluates both techniques as alternative XML serialization formats and introduces a solution for the interoperability concerns. This solution and the alternative serialization formats are validated against two real-life use cases in terms of processing speed and cost reduction. The efficiency of the alternative serialization formats is compared to a classic plain-text compression technique, in particular ZIP compression.
|
[16] | Anthony Vetro, Christian Timmerer, Digital Item Adaptation: Overview of Standardization and Research Activities, In IEEE Transactions on Multimedia, IEEE, vol. 7 (Special Issue on MPEG-21), no. 3, Los Alamitos, CA, USA, pp. 418-426, 2005.
[bib] [doi] [pdf] [abstract]
Abstract: MPEG-21 Digital Item Adaptation (DIA) has recently been finalized as part of the MPEG-21 Multimedia Framework. DIA specifies metadata for assisting the adaptation of Digital Items according to constraints on the storage, transmission and consumption, thereby enabling various types of quality of service management. This paper provides an overview of DIA, describes its use in multimedia applications, and reports on some of the ongoing activities in MPEG on extending DIA for use in rights governed environments.
|
[15] | Christian Timmerer, Hermann Hellwagner, Interoperable Adaptive Multimedia Communication, In IEEE Multimedia Magazine, IEEE Computer Society, vol. 12, no. 1, Los Alamitos, USA, pp. 74-79, 2005.
[bib] [pdf] [abstract]
Abstract: Digital Item Adaptation (DIA) has recently been standardized as Part 7 of the MPEG-21 Multimedia Framework. This standard specifies tools enabling interoperable communication and adaptation of so-called Digital Items. The adaptation process becomes ever more difficult due to the heterogeneity of terminals and networks utilizing different types of multimedia contents encoded in various coding formats. Other aspects are the users' preferences and accessibility characteristics as well as the natural environment in which the content is consumed. This article describes how to use the tools within DIA in order to build a device- and coding-format-independent adaptation module enabling interoperable multimedia communication.
|
[14] | Peter Schojer, Laszlo Böszörmenyi, Hermann Hellwagner, An Adaptive Standard Meta-data Aware Proxy Cache, In Scalable Computing: Practice and Experience, SCPE, vol. 6, no. 2, Timisoara, Romania, pp. 93-104, 2005.
[bib] [abstract]
Abstract: Multimedia is gaining ever more importance on the Internet. This increases the need for intelligent and efficient video caches. A promising approach to improve caching efficiency is to adapt videos. With the availability of MPEG-4 it is possible to develop a standard compliant proxy cache that allows fast and efficient adaptation. We propose a modular design for an adaptive MPEG-4 video proxy that supports efficient full and partial video caching in combination with filtering options that are driven by the terminal capabilities of the client. We use the native scalability operations provided by MPEG-4, the MPEG-7 standard to describe the scalability options for a video and the emerging MPEG-21 standard to describe the terminal capabilities. We restrict ourselves to full video caching. The combination of adaptation with MPEG-4, MPEG-7 and client terminal capabilities is to the best of our knowledge unique and will increase the quality of service for end users. Key words: Adaptation, MPEG-4, MPEG-7, MPEG-21, adaptive proxy, caching.
|
[13] | Sylvain Devillers, Christian Timmerer, Jörg Heuer, Hermann Hellwagner, Bitstream Syntax Description-Based Adaptation in Streaming and Constrained Environments, In IEEE Transactions on Multimedia, IEEE, vol. 7 (Special Issue on MPEG-21), no. 3, Piscataway, USA, pp. 463-470, 2005.
[bib] [pdf] [abstract]
Abstract: The seamless access to rich multimedia content on any device and over any network, usually known as Universal Multimedia Access, requires interoperable description tools and adaptation techniques to be developed. To address the latter issue, MPEG-21 Digital Item Adaptation (DIA) introduces the Bitstream Syntax Description (BSD) framework, which provides tools for adapting multimedia content in a generic (i.e., coding-format-independent) way. The basic idea is to use the eXtensible Markup Language (XML) to describe the high-level structure of a binary media bitstream, to transform its description (e.g., by means of eXtensible Stylesheet Language Transformations, XSLT), and to construct the adapted media bitstream from the transformed description. This paper presents how this basic BSD framework, initially developed for non-streamed content and suffering from the inherent limitations and high memory consumption of XML-related technologies such as XSLT, can be advanced and efficiently implemented in a streaming environment and on resource-constrained devices. Two different attempts to solve the inherent problems are described. The first approach proposes an architecture based on the streamed processing of SAX (Simple Application Programming Interface for XML) events and adopts STX (Streaming Transformations for XML) as an alternative to XSLT, whereas the second approach breaks a BSD up into well-formed fragments called Process Units (PUs) that can be processed individually by a standard XSLT processor. The current status of our work as well as directions for future research are given.
|
[12] | Roland Tusch, Laszlo Böszörmenyi, Balázs Goldschmidt, Hermann Hellwagner, Peter Schojer, Offensive and Defensive Adaptation in Distributed Multimedia Systems, In Computer Science and Information Systems, ComSIS, vol. 1, no. 1, Novi Sad, pp. 49-77, 2004.
[bib] [abstract]
Abstract: Adaptation in multimedia systems is usually restricted to defensive, reactive media adaptation (often called stream-level adaptation). We argue that offensive, proactive, system-level adaptation deserves not less attention. If a distributed multimedia system cares for overall, end-to-end quality of service then it should provide a meaningful combination of both. We introduce an adaptive multimedia server (ADMS) and a supporting middleware which implement offensive adaptation based on a lean, flexible architecture. The measured costs and benefits of the offensive adaptation process are presented. We introduce an intelligent video proxy (QBIX), which implements defensive adaptation. The cost/benefit measurements of QBIX are presented elsewhere. We show the benefits of the integration of QBIX in ADMS. Offensive adaptation is used to find an optimal, user-friendly configuration dynamically for ADMS, and defensive adaptation is added to take usage environment (network and terminal) constraints into account.
|
[11] | Gabriel Panis, Andreas Hutter, Jörg Heuer, Hermann Hellwagner, Harald Kosch, Christian Timmerer, Sylvain Devillers, Myriam Amielh, Bitstream Syntax Description: A Tool for Multimedia Resource Adaptation within MPEG-21, In Signal Processing: Image Communication, Elsevier B.V., vol. 18 (Special Issue on Multimedia Adaptation), no. 8, Amsterdam, Netherlands, pp. 721-747, 2003.
[bib] [pdf] [abstract]
Abstract: In this paper, a generic method is described to allow the adaptation of different multimedia resources by a single, media-resource-agnostic processor. This method is based on an XML description of the media resource's bitstream syntax, which can be transformed to reflect the desired adaptation and then be used to generate an adapted version of the bitstream. Based on this concept, two complementary technologies, BSDL and gBS Schema, are presented. The two technologies provide solutions for parsing a bitstream to generate its XML description, for the generic structuring of this description, and for the generation of an adapted bitstream using its transformed description. The two technologies can be used as stand-alone tools; however, a joint approach has been developed in order to harmonise the two solutions and exploit their strengths. Since BSDL has been presented in previous publications, this paper focuses more on the gBS Schema and the joint BSDL/gBS Schema approach.
|
[10] | Laszlo Böszörmenyi, Harald Kosch, Hermann Hellwagner, Best papers of EuroPar 2003, In Parallel Processing Letters, Springer, vol. 13, no. 4, Heidelberg, Germany, pp. 509-511, 2003.
[bib] |
[9] | Laszlo Böszörmenyi, Hermann Hellwagner, Harald Kosch, Mulugeta Libsie, Stefan Podlipnig, Metadata Driven Adaptation in the ADMITS Project, In Signal Processing: Image Communication (Special Issue on Multimedia Adaptation), Elsevier, vol. 18, no. 8, Oxford, United Kingdom, pp. 749-766, 2003.
[bib][url] [abstract]
Abstract: The ADMITS project (Adaptation in Distributed Multimedia IT Systems) is building an experimental distributed multimedia system for investigations into adaptation, which we consider an increasingly important tool for multimedia systems. A number of possible adaptation entities (server, proxy, clients, routers) are being explored, different algorithms for media, component and application-level adaptations are being implemented and evaluated, and experimental data are being derived to gain insight into when, where and how to adapt, and how individual, distributed adaptation steps interoperate and interact with each other. In this paper the "adaptation chain" of (MPEG-conforming) metadata-based adaptation is described: from the creation stage at the server side, through its usage in the network (actually in a proxy), up to the consumption at the client. The metadata are used to steer the adaptation processes. MPEG-conformant metadata, the so-called variation descriptions, are introduced; an example of a complete MPEG-7 document describing temporal scaling of an MPEG-4 video is given. The meta-database designed to store the metadata is briefly discussed. We describe how the metadata can be extracted from MPEG-4 visual elementary streams, and initial results from the temporal video scaling experiment are given. We further present how the metadata can be utilized by enhanced cache replacement algorithms in a proxy server in order to realize quality-based caching; experimental results using these algorithms are also given. Finally, an adaptive query and presentation interface to the meta-.
|
[8] | Hermann Hellwagner, Matthias Ohlenroth, VI Architecture Communication Features and Performance on the Giganet Cluster LAN, In Future Generation Computer Systems, Elsevier B.V., vol. 18, no. 3, Amsterdam, Netherlands, pp. 421-433, 2002.
[bib][url] [doi] [pdf] [abstract]
Abstract: The virtual interface (VI) architecture standard was developed to satisfy the need for a high throughput, low latency communication system required for cluster computing. VI architecture aims to close the performance gap between the bandwidths and latencies provided by the communication hardware and visible to the application, respectively, by minimizing the software overhead on the critical path of the communication. This paper presents the results of a performance study of one VI architecture hardware implementation, the Giganet cLAN (cluster LAN). The focus of the study is to assess and compare the performance of different VI architecture data transfer modes and specific features that are available to higher-level communication software like MPI in order to aid the implementor to decide which VI architecture options to employ for various communication scenarios. Examples of such options include the use of send/receive vs. RDMA data transfers, polling vs. blocking to check completion of communication operations, multiple VIs, completion queues and scatter capabilities of VI architecture.
|
[7] | Christian Weiß, Hermann Hellwagner, Linda Stals, Ulrich Rüde, Data Locality Optimizations to Improve The Efficiency of Multigrid Methods, In Concepts of Numerical Software, pp. 1-10, 2000.
[bib] [pdf] [abstract]
Abstract: Current superscalar microprocessors are able to operate at a peak performance of up to 1 GFlop/sec. However, current main memory technology does not provide the data needed fast enough to keep the CPU busy. To minimize idle times of the CPU, caches are used to speed up accesses to frequently used data. To exploit caches, the software must be aware of them and reuse data in the cache before it is being replaced. Unfortunately, all conventional multigrid codes are not cache-aware and hence exploit less than 10 percent of the peak performance of cache based machines. Our studies with linear PDEs with constant coefficients show that it is possible to speed up the execution of our multigrid method by a large factor and hence solve a Poisson’s equation with one million unknowns in less than 3 seconds. The optimized reuse of data in the cache allows us to exploit 30 percent of the peak performance of the CPU, in contrast to mgd9v for instance, which achieves less than 5 percent on the same machine. To achieve this, we used several techniques like loop unrolling and loop fusion to better exploit the memory hierarchy and the superscalar CPU. We study the effects of these techniques on the runtime performance in detail. We also study several tools which guide the optimizations and help to restructure the code.
|
[6] | Hermann Hellwagner, Ivan Zoraja, Vaidy Sunderam, SCIPVM: Parallel Distributed Computing on SCI Workstation Clusters, In Concurrency: Practice and Experience, vol. 11, no. 3, pp. 121-138, 1999.
[bib] [pdf] [abstract]
Abstract: Workstation and PC clusters interconnected by SCI (Scalable Coherent Interface) are very promising technologies for high performance cluster computing. Using commercial SBus to SCI interface cards and early system software and drivers, a two-workstation cluster has been constructed for initial testing and evaluation. The PVM system has been adapted to operate on this cluster using raw device access to the SCI interconnect, and preliminary communications performance tests have been carried out. Our preliminary results indicate that communications throughput in the range of 3.5 MBytes/s, and latencies of 620 µs, can be achieved on SCI clusters. These figures are significantly better (by a factor of 3 to 4) ... (Research supported by the Applied Mathematical Sciences program, Office of Basic Energy Sciences, U. S. Department of Energy, under Grant No. DE-FG05-91ER25105, the National Science Foundation, under Award Nos. ASC-9527186 and ASC-9214149, and the German Science Foundation SFB342.)
|
[5] | Wolfgang Mayerle, Hermann Hellwagner, Konzepte und funktionaler Vergleich von Thread-Systemen (2), In Praxis der Informationsverarbeitung und Kommunikation, Spaniol, Otto, vol. 20, no. 4, Mannheim, Germany, pp. 225-229, 1997.
[bib] |
[4] | Wolfgang Mayerle, Hermann Hellwagner, Konzepte und funktionaler Vergleich von Thread-Systemen (1), In Praxis der Informationsverarbeitung und Kommunikation, Spaniol, Otto, vol. 20, Mannheim, Germany, pp. 164-174, 1997.
[bib] [pdf] [abstract]
Abstract: This paper gives a general introduction to threads and compares several thread systems currently available for workstations. Building on a motivation and a basic explanation of the thread concept, important aspects and problems of thread libraries are presented. After some notes on programming with threads, several implementations are compared with one another.
|
[3] | Hermann Hellwagner, Wolfgang Karl, Markus Leberecht, Enabling a PC Cluster for High-Performance Computing, In Speedup Journal, Proceedings, 21st Workshop, March 13-14, 1997, Cadro-Lugano, vol. 11, no. 1, pp. 18-23, 1997.
[bib] [pdf] [abstract]
Abstract: Due to their excellent cost/performance ratio, clusters of PCs can be attractive high-performance computing (HPC) platforms. Yet, their limited communication performance over standard LANs is still prohibitive for parallel applications. The project "Shared Memory in a LAN-like Environment" (SMiLE) at LRR-TUM adopts Scalable Coherent Interface (SCI) interconnect technology to build, and provide software for, a PC cluster which, with hardware-based distributed shared memory (DSM) and high-performance communication characteristics, is regarded as well suited for HPC. The paper describes the key features of the enabling technology, SCI. It then discusses the developments and important results of the SMiLE project so far: the development and initial performance of a PCI/SCI interface card, and the design and initial performance results of low-latency communication layers, Active Messages and a sockets emulation library.
|
[2] | Günter Böckle, Hermann Hellwagner, Roland Lepold, Gerd Sandweg, Burghardt Schallenberger, Raimar Thudt, Stefan Wallstab, Structured Evaluation of Computer Systems, In Computer, IEEE Computer Society, vol. 29, no. 6, pp. 45-51, 1996.
[bib] [doi] [pdf] [abstract]
Abstract: Evaluating computers and other systems is difficult for a couple of reasons. First, the goal of evaluation is typically ill-defined: customers, sometimes even designers, either don't know or can't specify exactly what result they expect. Often, they don't specify the architectural variants to consider, and often the metrics and workload they expect you to use are ill-defined. Second, they rarely clarify which kind of model and evaluation method best suit the evaluation problem. These problems have consequences. For one thing, the decision-maker may not trust the evaluation. For another, poor planning means the evaluation cannot be reproduced if any of the parameters are changed slightly. Finally, the evaluation documentation is usually inadequate, and so some time after the evaluation you might ask yourself: how did I come to that conclusion? An approach developed at Siemens makes decisions explicit and the process reproducible.
|
[1] | Hermann Hellwagner, Design Considerations for Scalable Parallel File Systems, In The Computer Journal - Parallel Processing, vol. 36, no. 8, pp. 741-755, 1993.
[bib] [pdf] [abstract]
Abstract: This paper addresses the problem of providing high-performance disk I/O in massively parallel computers. Resolving the fundamental I/O bottleneck in parallel architectures involves both hardware and software issues. We review previous work on disk arrays and I/O architectures aimed at providing highly parallel disk I/O subsystems. We then focus on the requirements and design of parallel file systems (PFSs), which are responsible for making the parallelism offered by the hardware and a declustered file organization available to application programs. We present the design strategy and key concepts of a general-purpose file system for a parallel computer with scalable distributed shared memory. The principal objectives of the PFS are to fully exploit the parallelism inherent among and within file accesses, and to provide scalable I/O performance. The machine model underlying the design is described, with an emphasis on the innovative architectural features supporting scalability of the shared memory. Starting from a classification of various scenarios of concurrent I/O requests, the features of the PFS design essential for achieving the goals are described and justified. It is argued that the inter- and intra-request parallelism of the I/O load can indeed be effectively exploited and supported by the parallel system resources. Scalability of I/O performance and of the PFS software can be ensured by avoiding serial bottlenecks through the use of the powerful architectural features.
|