[5] | Michael Granitzer, T Neidhart, Mathias Lux, Learning Term Spaces based on Visual Feedback, In Proceedings of the 17th International Conference on Database and Expert Systems Applications (DEXA'06) (A Min Tjoa, R Wagner, eds.), IEEE Computer Society, Los Alamitos, CA, USA, pp. 176-180, 2006.
[bib] [abstract]
Abstract: Extracting and visualizing concepts and relationships between text documents depends strongly on the similarity measure used. To provide meaningful visualizations and extract useful knowledge from document collections, user needs must be captured by the internal representation of documents and by the similarity measure. Most applications use the Vector Space Model and cosine similarity for this purpose, and these serve as good approximations. Nevertheless, influencing similarities between documents is rather hard, since parameter tuning relies heavily on expert knowledge of the underlying algorithms, and the influence of different weighting schemes and similarity measures is not known beforehand. In this paper we present an approach for adapting the vector space representation of documents by giving visual feedback to the system. Our approach starts by clustering a corpus of text documents and visualizing the results using multidimensional scaling techniques. Afterwards, a 2D landscape visualization is shown, which the user can manipulate. Based on these manipulations, the high-dimensional representation of the documents is adapted to fit the user's needs more precisely. Our experiments show that iterating these steps results in an adapted representation of documents and similarities, generates layouts as intended by the user, and furthermore increases clustering accuracy. While this paper only investigates the influence on clustering and visualization, the method itself may also be used to increase classification and retrieval performance, since it adapts to the user's needs regarding similarity.
|
[4] | Michael Granitzer, Harald Kosch, Mathias Lux, 5th Multimedia Metadata Community Workshop - Introduction, In 6th International Conference on Knowledge Management (Klaus Tochtermann, Hermann Maurer, eds.), Eigenverlag in Kooperation mit Springer Verlag, Graz, pp. 568-569, 2006.
[bib] |
[3] | Robbie De Sutter, Sam Lerouge, Peter De Neve, Christian Timmerer, Hermann Hellwagner, Rik Van de Walle, Comparison of XML serializations: cost benefits versus complexity, In Multimedia Systems, Springer, vol. 12, no. 2, Berlin, Heidelberg, New York, pp. 101-115, 2006.
[bib] [doi] [pdf] [abstract]
Abstract: More and more data are structured, stored, and sent over a network using the Extensible Markup Language (XML). There are, however, concerns that the verbosity of XML may restrain further adoption of the language, especially when exchanging XML-based data over heterogeneous networks and when it is used within constrained (mobile) devices. Therefore, alternative (binary) serialization formats of the XML data become relevant in order to reduce this overhead. However, using binary-encoded XML should not introduce interoperability issues with existing applications nor add complexity to new applications. On top of that, it should have a clear cost reduction over the current plain-text serialization format. A first technology, developed within the ISO/IEC Moving Picture Experts Group, is the Binary MPEG Format for XML. It provides good compression efficiency and the ability to (partially) update existing XML trees, and it facilitates random access into, and manipulation of, the binary-encoded bit stream. Another technique is based on the Abstract Syntax Notation One specification with the Packed Encoding Rules created by the ITU-T. This paper evaluates both techniques as alternative XML serialization formats and introduces a solution for the interoperability concerns. This solution and the alternative serialization formats are validated against two real-life use cases in terms of processing speed and cost reduction. The efficiency of the alternative serialization formats is compared to a classic plain-text compression technique, in particular ZIP compression.
|
[2] | Robbie De Sutter, Sam Lerouge, Peter De Neve, Christian Timmerer, Hermann Hellwagner, Rik Van de Walle, Comparison of XML serializations: cost benefit vs. complexity, In ACM Multimedia Systems, Springer, vol. 12, no. 1, London, pp. 1-15, 2006.
[bib] [abstract]
Abstract: More and more data are structured, stored, and sent over a network using the Extensible Markup Language (XML). There are, however, concerns that the verbosity of XML may restrain further adoption of the language, especially when exchanging XML-based data over heterogeneous networks and when it is used within constrained (mobile) devices. Therefore, alternative (binary) serialization formats of the XML data become relevant in order to reduce this overhead. However, using binary-encoded XML should not introduce interoperability issues with existing applications nor add complexity to new applications. On top of that, it should have a clear cost reduction over the current plain-text serialization format. A first technology, developed within the ISO/IEC Moving Picture Experts Group, is the Binary MPEG Format for XML. It provides good compression efficiency and the ability to (partially) update existing XML trees, and it facilitates random access into, and manipulation of, the binary-encoded bit stream. Another technique is based on the Abstract Syntax Notation One specification with the Packed Encoding Rules created by the ITU-T. This paper evaluates both techniques as alternative XML serialization formats and introduces a solution for the interoperability concerns. This solution and the alternative serialization formats are validated against two real-life use cases in terms of processing speed and cost reduction. The efficiency of the alternative serialization formats is compared to a classic plain-text compression technique, in particular ZIP compression.
|
[1] | Laszlo Böszörmenyi, Istvan Simonics, Radoslav Pavlov, Methods and tools for development of semantic enabled systems and services for multimedia content, interoperability and reusability, Eigenverlag Universität Klagenfurt/Projekt Hubuska, Budapest, Hungary, pp. 126, 2006.
[bib] |