[63] | Liting Zhou, Luca Piras, Michael Riegler, Mathias Lux, Duc-Tien Dang-Nguyen, Cathal Gurrin, An Interactive Lifelog Retrieval System for Activities of Daily Living Understanding, In CLEF 2018 Working Notes, CEUR-Workshop Proceedings, 2018.
[bib][url] [abstract]
Abstract: This paper describes the participation of the Organizer Team in the ImageCLEFlifelog 2018 Daily Living Understanding and Lifelog Moment Retrieval tasks. We show how to exploit LIFER, an interactive lifelog search engine, to solve the two tasks: Lifelog Moment Retrieval and Activities of Daily Living Understanding. We propose approaches for both a baseline, which aims to provide a reference system for other approaches, and a human-in-the-loop setting, which advances the baseline results.
|
[62] | Anatoliy Zabrovskiy, Christian Feldmann, Christian Timmerer, A Practical Evaluation of Video Codecs for Large-Scale HTTP Adaptive Streaming Services, In 2018 25th IEEE International Conference on Image Processing (ICIP), IEEE, Piscataway (NJ), pp. 998-1002, 2018.
[bib][url] [doi] [abstract]
Abstract: The number of bandwidth-hungry applications and services is constantly growing. HTTP adaptive streaming of audio-visual content accounts for the majority of today's internet traffic. Although internet bandwidth also increases constantly, audio-visual compression technology remains indispensable, and we currently face the challenge of dealing with multiple video codecs. This paper provides a practical evaluation of state-of-the-art video codecs (i.e., AV1, AVC/libx264, HEVC/libx265, VP9/libvpx-vp9) for large-scale HTTP adaptive streaming services. In anticipation of the results, AV1 shows promising performance compared to established video codecs. Additionally, AV1 is intended to be royalty-free, making it worthwhile to consider for large-scale HTTP adaptive streaming services.
|
[61] | Anatoliy Zabrovskiy, Christian Feldmann, Christian Timmerer, Multi-codec DASH dataset, In MMSys '18 Proceedings of the 9th ACM Multimedia Systems Conference, ACM Press, New York (NY), pp. 438-443, 2018.
[bib][url] [doi] [abstract]
Abstract: The number of bandwidth-hungry applications and services is constantly growing. HTTP adaptive streaming of audio-visual content accounts for the majority of today's internet traffic. Although internet bandwidth also increases constantly, audio-visual compression technology remains indispensable, and we currently face the challenge of dealing with multiple video codecs. This paper proposes a multi-codec DASH dataset comprising AVC, HEVC, VP9, and AV1 in order to enable interoperability testing and streaming experiments for the efficient usage of these codecs under various conditions. We adopt state-of-the-art encoding and packaging options and also provide basic quality metrics along with the DASH segments. Additionally, we briefly introduce a multi-codec DASH scheme and possible usage scenarios. Finally, we provide a preliminary evaluation of the encoding efficiency in the context of HTTP adaptive streaming services and applications.
|
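The dataset entry above involves encoding the same source content with four codecs. As a hedged illustration of that idea, the following Python sketch builds a multi-codec bitrate ladder with ffmpeg; the encoder names are standard ffmpeg identifiers, but the ladder values, file names, and settings are illustrative assumptions, not the ones used for the published dataset.

```python
# A minimal sketch of multi-codec ladder encoding with ffmpeg.
# Codec keys map to standard ffmpeg encoder names; the ladder below
# is an illustrative assumption, not the dataset's actual ladder.
import subprocess

CODECS = {
    "avc":  "libx264",
    "hevc": "libx265",
    "vp9":  "libvpx-vp9",
    "av1":  "libaom-av1",
}
LADDER = [(426, 240, 400), (640, 360, 800), (1280, 720, 2400)]  # (w, h, kbps)

def encode(src: str, codec_key: str) -> None:
    """Encode `src` into one set of video-only renditions for a single codec."""
    encoder = CODECS[codec_key]
    for w, h, kbps in LADDER:
        out = f"{codec_key}_{h}p_{kbps}k.mp4"  # hypothetical naming scheme
        subprocess.run([
            "ffmpeg", "-y", "-i", src,
            "-c:v", encoder,
            "-b:v", f"{kbps}k",
            "-vf", f"scale={w}:{h}",
            "-an",  # drop audio; audio renditions would be encoded separately
            out,
        ], check=True)

if __name__ == "__main__":
    for key in CODECS:
        encode("source.y4m", key)
```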
[60] | Armin Trattnig, Christian Timmerer, Christopher Müller, Investigation of YouTube regarding Content Provisioning for HTTP Adaptive Streaming, In PV '18 Proceedings of the 23rd Packet Video Workshop, ACM Press, New York (NY), pp. 60-65, 2018.
[bib][url] [doi] [abstract]
Abstract: About 300 hours of video are uploaded to YouTube every minute. The main technology to deliver YouTube content to various clients is HTTP adaptive streaming, and the majority of today's internet traffic comprises streaming audio and video. In this paper, we investigate content provisioning for HTTP adaptive streaming under predefined aspects representing content features and upload characteristics, and apply this methodology to YouTube. Additionally, we compare YouTube's content upload and processing functions with a commercially available video encoding service. The results reveal insights into YouTube's content upload and processing functions, and the methodology can be applied to similar services. All experiments conducted within the paper are reproducible thanks to the usage of open-source tools, publicly available datasets, and scripts used to conduct the experiments on virtual machines.
|
[59] | Christian Timmerer, Ali Cengiz Begen, A Framework for Adaptive Delivery of Omnidirectional Video, In IS&T International Symposium on Electronic Imaging 2018, Human Vision and Electronic Imaging 2018 Conference, 2018.
[bib][url] [pdf] |
[58] | Christian Timmerer, Anatoliy Zabrovskiy, Ali C. Begen, Automated Objective and Subjective Evaluation of HTTP Adaptive Streaming Systems, In 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), IEEE, Piscataway (NJ), 2018.
[bib][url] [doi] [abstract]
Abstract: Streaming audio and video content currently accounts for the majority of the internet traffic and is typically deployed over the top of the existing infrastructure. We face the challenge of a plethora of media players and adaptation algorithms that show different behavior, yet we lack a common framework for both objective and subjective evaluation of such systems. This paper aims to close this gap by (i) proposing such a framework, (ii) describing its architecture, (iii) providing an example evaluation, and (iv) discussing open issues.
|
[57] | Christian Timmerer, MPEG column: 121st MPEG meeting in Gwangju, Korea, In SIGMultimedia Records, ACM, vol. 10, no. 1, New York, NY, USA, pp. 6:6-6:6, 2018.
[bib][url] [doi] |
[56] | Christian Timmerer, MPEG Column: 120th MPEG Meeting in Macau, China, In SIGMultimedia Records, ACM, vol. 9, no. 3, New York, NY, USA, pp. 4:4-4:4, 2018.
[bib][url] [doi] |
[55] | Christian Timmerer, Martin Smole, Christopher Mueller, Efficient Multi-Codec Support for OTT Services: HEVC/H.265 and/or AV1?, In 2018 NAB BEIT Proceedings, National Association of Broadcasters (NAB), Washington DC, USA, pp. 5, 2018.
[bib] [pdf] |
[53] | Christian Timmerer, MPEG column: 123rd MPEG meeting in Ljubljana, Slovenia, In ACM SIGMultimedia Records, ACM Press, vol. 10, New York (NY), 2018.
[bib][url] [doi] [abstract]
Abstract: The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.
|
[52] | Mario Taschwer, Manfred Jürgen Primus, Klaus Schoeffmann, Oge Marques, Early and Late Fusion of Classifiers for the MediaEval Medico Task, In Working Notes Proceedings of the MediaEval 2018 Workshop (M. Larson, P. Arora, C.H. Demarty, M. Riegler, B. Bischke, E. Dellandrea, M. Lux, A. Porter, G.J.F. Jones, eds.), vol. 2283, 2018.
[bib][url] |
[51] | Mario Taschwer, Oge Marques, Automatic separation of compound figures in scientific articles, In Multimedia Tools and Applications, vol. 77, pp. 519-548, 2018.
[bib][url] [doi] [abstract]
Abstract: Content-based analysis and retrieval of digital images found in scientific articles is often hindered by images consisting of multiple subfigures (compound figures). We address this problem by proposing a method (ComFig) to automatically classify and separate compound figures, which consists of two main steps: (i) a supervised compound figure classifier (ComFig classifier) discriminates between compound and non-compound figures using task-specific image features; and (ii) an image processing algorithm is applied to predicted compound images to perform compound figure separation (ComFig separation). The proposed ComFig classifier is shown to achieve state-of-the-art classification performance on a published dataset. Our ComFig separation algorithm shows superior separation accuracy on two different datasets compared to other known automatic approaches. Finally, we propose a method to evaluate the effectiveness of the ComFig chain combining classifier and separation algorithm, and use it to optimize the misclassification loss of the ComFig classifier for maximal effectiveness in the chain.
|
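The ComFig chain described in the entry above combines a compound-figure classifier with a separation step applied only to predicted compound images. The following Python sketch illustrates that two-step structure only; both functions are simplified stand-ins based on a whitespace-gutter heuristic, not the authors' published features or recursive separation algorithm.

```python
# Structural sketch of a classify-then-separate chain for compound figures.
# Both steps are crude hypothetical stand-ins for illustration only.
from typing import List
import numpy as np

def is_compound(image: np.ndarray) -> bool:
    """Stand-in for the compound-figure classifier: flag images whose column
    intensity profile contains a wide near-white band, a crude proxy for a
    whitespace gutter between subfigures."""
    gray = image.mean(axis=2) if image.ndim == 3 else image
    col_profile = gray.mean(axis=0)
    near_white = col_profile > 0.95 * col_profile.max()
    run = best = 0
    for flag in near_white:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best > gray.shape[1] // 20

def separate(image: np.ndarray) -> List[np.ndarray]:
    """Stand-in for the separation step: split at the brightest column.
    The published algorithm is recursive and handles both axes."""
    gray = image.mean(axis=2) if image.ndim == 3 else image
    cut = int(np.argmax(gray.mean(axis=0)))
    cut = min(max(cut, 1), gray.shape[1] - 1)  # keep both halves non-empty
    return [image[:, :cut], image[:, cut:]]

def process(image: np.ndarray) -> List[np.ndarray]:
    """The chain: classify first, separate only predicted compound figures."""
    return separate(image) if is_compound(image) else [image]
```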
[50] | Vlado Stankovski, Radu Prodan, Guest Editors’ Introduction: Special Issue on Storage for the Big Data Era, In Journal of Grid Computing, 2018.
[bib][url] [doi] |
[48] | Klaus Schöffmann, Bernd Münzer, Manfred Jürgen Primus, Sabrina Kletz, Andreas Leibetseder, How Experts Search Different Than Novices – An Evaluation of the diveXplore Video Retrieval System at Video Browser Showdown 2018, In 2018 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), IEEE, Piscataway (NJ), 2018.
[bib][url] [doi] [abstract]
Abstract: We present a modern interactive video retrieval tool, called diveXplore, that has been used for several iterations of the Video Browser Showdown (VBS) competition with great success – 2nd place for the last two years in a row. The tool provides novel video content search and interaction features (e.g., a semantic map-search & browsing feature with similarity arrangement and a highly efficient sketch-search, optimized for mobile touch-interaction) that make it perfectly suited for flexible video retrieval in large video collections. With the help of a user study, we show that the diveXplore system can be used very efficiently by both types of users: novices and experts. Our evaluation results also show that the interaction statistics of novices and experts differ in terms of the features used. The details of our insights can be used to further optimize interfaces of video retrieval tools for non-experts.
|
[47] | Klaus Schöffmann, Werner Bailer, Cathal Gurrin, George M. Awad, Jakub Lokoč, Interactive Video Search: Where is the User in the Age of Deep Learning?, In MM '18 Proceedings of the 26th ACM international conference on Multimedia, ACM Press, New York (NY), pp. 2101-2103, 2018.
[bib][url] [doi] [abstract]
Abstract: In this tutorial we discuss interactive video search tools and methods, review their need in the age of deep learning, and explore video and multimedia search challenges and their role as evaluation benchmarks in the field of multimedia information retrieval. We cover three different campaigns (TRECVID, Video Browser Showdown, and the Lifelog Search Challenge), discuss their goals and rules, and present their achieved findings over the last half-decade. Moreover, we talk about datasets, tasks, evaluation procedures, and examples of interactive video search tools, as well as how they evolved over the years. Participants of this tutorial will be able to gain collective insights from all three challenges and use them for focusing their research efforts on outstanding problems that still remain unsolved in this area.
|
[46] | Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Supavadee Aramvith, Noel E. O´Connor, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds., MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 2), Springer, vol. 10705, 2018.
[bib][url] [doi] |
[45] | Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Noel E. O´Connor, Supavadee Aramvith, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds., MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 1), Springer, vol. 10704, 2018.
[bib][url] [doi] |
[44] | Klaus Schöffmann, Mario Taschwer, Stephanie Sarny, Bernd Münzer, Manfred Jürgen Primus, Doris Putzgruber-Adamitsch, Cataract-101: video dataset of 101 cataract surgeries, In MMSys '18 Proceedings of the 9th ACM Multimedia Systems Conference, ACM Press, New York (NY), pp. 421-425, 2018.
[bib][url] [doi] [abstract]
Abstract: Cataract surgery is one of the most frequently performed microscopic surgeries in the field of ophthalmology. The goal behind this kind of surgery is to replace the human eye lens with an artificial one, an intervention that is often required due to aging. The entire surgery is performed under a microscope, but co-mounted cameras make it possible to record and archive the procedure. Currently, the recorded videos are used in a postoperative manner for documentation and training. An additional benefit of recording cataract videos is that they enable video analytics (i.e., manual and/or automatic video content analysis) to investigate medically relevant research questions (e.g., the cause of complications). This, however, necessitates a medical multimedia information system trained and evaluated on existing data, which is currently not publicly available. In this work we provide a public video dataset of 101 cataract surgeries that were performed by four different surgeons over a period of 9 months. These surgeons are grouped into moderately experienced and highly experienced surgeons (assistant vs. senior physicians), providing the basis for experience-based video analytics. All videos have been annotated with quasi-standardized operation phases by a senior ophthalmic surgeon.
|
[43] | Michael Riegler, Pal Halvorsen, Bernd Münzer, Klaus Schöffmann, The Importance of Medical Multimedia, In MM '18 Proceedings of the 26th ACM international conference on Multimedia, ACM Press, New York (NY), pp. 2106-2108, 2018.
[bib][url] [doi] [abstract]
Abstract: Multimedia research is becoming more and more important for the medical domain, where an increasing number of videos and images are integrated in the daily routine of surgical and diagnostic work. While the collection of medical multimedia data is not an issue, appropriate tools for efficient use of this data are missing. This includes management and inspection of the data, visual analytics, as well as learning relevant semantics and using recognition results for optimizing surgical and diagnostic processes. The characteristics and requirements in this interesting but challenging field are different from those in classic multimedia domains. Therefore, this tutorial gives a general introduction to the field, provides a broad overview of specific requirements and challenges, discusses existing work and open challenges, and elaborates in detail how machine learning approaches can help in multimedia-related fields to improve the performance of surgeons/clinicians.
|
[42] | Laura Ricci, Alexander Iosup, Radu Prodan, Large Scale Cooperative Virtual Environments, In Concurrency and Computation: Practice and Experience, 2018.
[bib][url] [doi] |
[41] | Benjamin Rainer, Stefan Petscharnig, Christian Timmerer, Merge and Forward: A Self-Organized Inter-Destination Media Synchronization Scheme for Adaptive Media Streaming over HTTP, In MediaSync, Springer, Berlin, pp. 593-627, 2018.
[bib][url] [doi] [abstract]
Abstract: In this chapter, we present Merge and Forward, an IDMS scheme for adaptive HTTP streaming designed as a distributed control scheme and adopting the MPEG-DASH standard as its representation format. We introduce so-called IDMS sessions and describe how an unstructured peer-to-peer overlay can be created from the session information carried in MPEG-DASH. We objectively assess the performance of Merge and Forward with respect to convergence time (the time needed until all clients hold the same reference timestamp) and scalability. After negotiating a reference timestamp, the clients have to synchronize their multimedia playback to the agreed reference timestamp. To achieve this, we propose a new adaptive media playout approach minimizing the impact of playback synchronization on the QoE. The proposed adaptive media playout is assessed subjectively using crowdsourcing. We further propose a crowdsourcing methodology for conducting subjective quality assessments in the field of IDMS by utilizing games with a purpose (GWAP). We validate the applicability of our methodology by investigating the lower asynchronism threshold for IDMS in scenarios like online quiz games.
|
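The adaptive media playout idea in the chapter above can be illustrated with a small controller that nudges the playback rate toward the agreed reference timestamp while staying within a narrow band so the adjustment remains unobtrusive. This is a minimal Python sketch under assumed parameters (the gain and the ±5% rate band are illustrative choices), not the scheme proposed in the chapter.

```python
# Sketch of adaptive media playout for IDMS: steer the local playback
# position toward the session's reference timestamp via small rate changes.
def playout_rate(local_pos_s: float, reference_pos_s: float,
                 max_deviation: float = 0.05) -> float:
    """Return a playback-rate multiplier in [1 - max_deviation, 1 + max_deviation].
    If we lag the reference, play slightly faster; if we lead, slightly slower.
    Gain and band are illustrative assumptions, not values from the chapter."""
    asynchronism = reference_pos_s - local_pos_s  # seconds behind the session
    rate = 1.0 + 0.1 * asynchronism  # proportional control: correct 10%/s
    return max(1.0 - max_deviation, min(1.0 + max_deviation, rate))

# Example: a client 2 s behind the reference plays at the capped 1.05x rate.
print(playout_rate(local_pos_s=100.0, reference_pos_s=102.0))  # -> 1.05
```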
[40] | Manfred Jürgen Primus, Bernd Münzer, Andreas Leibetseder, Klaus Schöffmann, The ITEC Collaborative Video Search System at the Video Browser Showdown 2018, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 2) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Supavadee Aramvith, Noel E. O´Connor, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10705, Berlin, pp. 438-443, 2018.
[bib][url] [doi] [abstract]
Abstract: We present our video search system for the Video Browser Showdown (VBS) 2018 competition. It is based on the collaborative system used in 2017, which already performed well but also revealed high potential for improvement. Hence, based on our experience we introduce several major improvements, particularly (1) a strong optimization of similarity search, (2) various improvements for concept-based search, (3) a new flexible video inspector view, and (4) extended collaboration features, as well as numerous minor adjustments and enhancements, mainly concerning the user interface and means of user interaction. Moreover, we present a spectator view that visualizes the current activity of the team members to the audience to make the competition more attractive.
|
[39] | Manfred Jürgen Primus, Doris Putzgruber-Adamitsch, Mario Taschwer, Bernd Münzer, Yosuf El-Shabrawi, Laszlo Böszörmenyi, Klaus Schöffmann, Frame-Based Classification of Operation Phases in Cataract Surgery Videos, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 1) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Noel E. O´Connor, Supavadee Aramvith, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10704, Berlin, pp. 241-253, 2018.
[bib][url] [doi] [abstract]
Abstract: Cataract surgeries are frequently performed to correct a lens opacification of the human eye, which usually appears in the course of aging. These surgeries are conducted with the help of a microscope and are typically recorded on video for later inspection and educational purposes. However, post-hoc visual analysis of video recordings is cumbersome and time-consuming for surgeons if there is no navigation support, such as bookmarks to specific operation phases. To prepare the way for an automatic detection of operation phases in cataract surgery videos, we investigate the effectiveness of a deep convolutional neural network (CNN) to automatically assign video frames to operation phases, which can be regarded as a single-label multi-class classification problem. In the absence of public datasets of cataract surgery videos, we provide a dataset of 21 videos of standardized cataract surgeries and use it to train and evaluate our CNN classifier. Experimental results show a mean F1-score of about 68% for frame-based operation phase classification, which can be further improved to 75% when considering temporal information of video frames in the CNN architecture.
|
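The frame-based phase recognition task in the entry above is a single-label multi-class classification problem: a CNN maps each video frame to exactly one operation phase. The Python sketch below shows that formulation with a deliberately tiny, assumed architecture and an illustrative number of phases; it is not the network evaluated in the paper.

```python
# Minimal sketch of frame-based operation-phase classification with a CNN.
# Architecture, input size, and NUM_PHASES are illustrative assumptions.
import torch
import torch.nn as nn

NUM_PHASES = 10  # hypothetical; the paper's set of annotated phases differs

class PhaseClassifier(nn.Module):
    def __init__(self, num_classes: int = NUM_PHASES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> (batch, 32, 1, 1)
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, 3, H, W) -> logits over operation phases
        return self.head(self.features(frames).flatten(1))

model = PhaseClassifier()
loss_fn = nn.CrossEntropyLoss()  # standard single-label multi-class objective
logits = model(torch.randn(4, 3, 112, 112))           # dummy batch of frames
loss = loss_fn(logits, torch.randint(0, NUM_PHASES, (4,)))  # dummy phase labels
```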