[501] | Deepak Chaudhary, Prateek Agrawal, Vishu Madaan, Bank Cheque Validation Using Image Processing, In Proceedings of the 3rd International Conference On Advanced Informatics For Computing Research (Ashish Kumar Luhach, Dharm Singh Jat, Kamarul Bin Ghazali Hawari, Xiao-Zhi Gao, Pawan Lingras, eds.), Springer Singapore, pp. 148-159, 2019.
[bib][url] [doi] |
[500] | Neha Bhadwal, Prateek Agrawal, Vishu Madaan, Bilingual Machine Translation System Between Hindi and Sanskrit Languages, In Proceedings of the 3rd International Conference On Advanced Informatics For Computing Research (Ashish Kumar Luhach, Dharm Singh Jat, Kamarul Bin Ghazali Hawari, Xiao-Zhi Gao, Pawan Lingras, eds.), Springer Singapore, pp. 312-321, 2019.
[bib][url] [doi] |
[499] | Liting Zhou, Luca Piras, Michael Riegler, Mathias Lux, Duc-Tien Dang-Nguyen, Cathal Gurrin, An Interactive Lifelog Retrieval System for Activities of Daily Living Understanding, In CLEF 2018 Working Notes, CEUR-Workshop Proceedings, 2018.
[bib][url] [abstract]
Abstract: This paper describes the participation of the Organizer Team in the ImageCLEFlifelog 2018 Daily Living Understanding and Lifelog Moment Retrieval. In this paper, we propose how to exploit LIFER, an interactive lifelog search engine, to solve the two tasks: Lifelog Moment Retrieval and Activities of Daily Living Understanding. We propose approaches for both the baseline, which aims to provide a reference system for other approaches, and human-in-the-loop, which advances the baseline results.
|
[498] | Anatoliy Zabrovskiy, Christian Feldmann, Christian Timmerer, A Practical Evaluation of Video Codecs for Large-Scale HTTP Adaptive Streaming Services, In 2018 25th IEEE International Conference on Image Processing (ICIP), IEEE, Piscataway (NJ), pp. 998-1002, 2018.
[bib][url] [doi] [abstract]
Abstract: The number of bandwidth-hungry applications and services is constantly growing. HTTP adaptive streaming of audiovisual content accounts for the majority of today's internet traffic. Although the internet bandwidth also increases constantly, audio-visual compression technology is inevitable and we are currently facing the challenge of being confronted with multiple video codecs. This paper provides a practical evaluation of state-of-the-art video codecs (i.e., AV1, AVC/libx264, HEVC/libx265, VP9/libvpx-vp9) for large-scale HTTP adaptive streaming services. In anticipation of the results, AV1 shows promising performance compared to established video codecs. Additionally, AV1 is intended to be royalty free, making it worthwhile to be considered for large-scale HTTP adaptive streaming services.
|
[497] | Anatoliy Zabrovskiy, Christian Feldmann, Christian Timmerer, Multi-codec DASH dataset, In MMSys '18 Proceedings of the 9th ACM Multimedia Systems Conference, ACM Press, New York (NY), pp. 438-443, 2018.
[bib][url] [doi] [abstract]
Abstract: The number of bandwidth-hungry applications and services is constantly growing. HTTP adaptive streaming of audio-visual content accounts for the majority of today's internet traffic. Although the internet bandwidth also increases constantly, audio-visual compression technology is inevitable and we are currently facing the challenge of being confronted with multiple video codecs. This paper proposes a multi-codec DASH dataset comprising AVC, HEVC, VP9, and AV1 in order to enable interoperability testing and streaming experiments for the efficient usage of these codecs under various conditions. We adopt state-of-the-art encoding and packaging options and also provide basic quality metrics along with the DASH segments. Additionally, we briefly introduce a multi-codec DASH scheme and possible usage scenarios. Finally, we provide a preliminary evaluation of the encoding efficiency in the context of HTTP adaptive streaming services and applications.
|
[496] | Armin Trattnig, Christian Timmerer, Christopher Müller, Investigation of YouTube regarding Content Provisioning for HTTP Adaptive Streaming, In PV '18 Proceedings of the 23rd Packet Video Workshop, ACM Press, New York (NY), pp. 60-65, 2018.
[bib][url] [doi] [abstract]
Abstract: About 300 hours of video are uploaded to YouTube every minute. The main technology to deliver YouTube content to various clients is HTTP adaptive streaming, and the majority of today's internet traffic comprises streaming audio and video. In this paper, we investigate content provisioning for HTTP adaptive streaming under predefined aspects representing content features as well as upload characteristics, and apply it to YouTube. Additionally, we compare YouTube's content upload and processing functions with a commercially available video encoding service. The results reveal insights into YouTube's content upload and processing functions, and the methodology can be applied to similar services. All experiments conducted within the paper allow for reproducibility thanks to the usage of open source tools, publicly available datasets, and scripts used to conduct the experiments on virtual machines.
|
[495] | Christian Timmerer, Ali Cengiz Begen, A Framework for Adaptive Delivery of Omnidirectional Video, In IS&T International Symposium on Electronic Imaging 2018, Human Vision and Electronic Imaging 2018 Conference, 2018.
[bib][url] [pdf] |
[494] | Christian Timmerer, Anatoliy Zabrovskiy, Ali C. Begen, Automated Objective and Subjective Evaluation of HTTP Adaptive Streaming Systems, In 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), IEEE, Piscataway (NJ), 2018.
[bib][url] [doi] [abstract]
Abstract: Streaming audio and video content currently accounts for the majority of the internet traffic and is typically deployed over the top of the existing infrastructure. We are facing the challenge of a plethora of media players and adaptation algorithms showing different behavior, but lack a common framework for both objective and subjective evaluation of such systems. This paper aims to close this gap by (i) proposing such a framework, (ii) describing its architecture, (iii) providing an example evaluation, and (iv) discussing open issues.
|
[493] | Christian Timmerer, Martin Smole, Christopher Mueller, Efficient Multi-Codec Support for OTT Services: HEVC/H.265 and/or AV1?, In 2018 NAB BEIT Proceedings, National Association of Broadcasters (NAB), Washington DC, USA, pp. 5, 2018.
[bib] [pdf] |
[492] | Christian Timmerer, Anatoliy Zabrovskiy, Ali Cengiz Begen, Automated Objective and Subjective Evaluation of HTTP Adaptive Streaming Systems, In Proceedings of the 1st IEEE International Conference on Multimedia Information Processing and Retrieval (MIPR), pp. 6, 2018.
[bib][url] [doi] [pdf] [abstract]
Abstract: Streaming audio and video content currently accounts for the majority of the internet traffic and is typically deployed over the top of the existing infrastructure. We are facing the challenge of a plethora of media players and adaptation algorithms showing different behavior, but lack a common framework for both objective and subjective evaluation of such systems. This paper aims to close this gap by (i) proposing such a framework, (ii) describing its architecture, (iii) providing an example evaluation, and (iv) discussing open issues.
|
[491] | Christian Timmerer, MPEG column: 123rd MPEG meeting in Ljubljana, Slovenia, In ACM SIGMultimedia Records, ACM Press, vol. 10, New York (NY), 2018.
[bib][url] [doi] [abstract]
Abstract: The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.
|
[490] | Mario Taschwer, Manfred Jürgen Primus, Klaus Schoeffmann, Oge Marques, Early and Late Fusion of Classifiers for the MediaEval Medico Task, In Working Notes Proceedings of the MediaEval 2018 Workshop (M. Larson, P. Arora, C.H. Demarty, M. Riegler, B. Bischke, E. Dellandrea, M. Lux, A. Porter, G.J.F. Jones, eds.), vol. 2283, 2018.
[bib][url] |
[489] | Klaus Schöffmann, Bernd Münzer, Manfred Jürgen Primus, Sabrina Kletz, Andreas Leibetseder, How Experts Search Different Than Novices – An Evaluation of the diveXplore Video Retrieval System at Video Browser Showdown 2018, In 2018 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), IEEE, Piscataway (NJ), 2018.
[bib][url] [doi] [abstract]
Abstract: We present a modern interactive video retrieval tool, called diveXplore, that has been used for several iterations of the Video Browser Showdown (VBS) competition with great success – 2nd place for the last two years in a row. The tool provides novel video content search and interaction features (e.g., a semantic map-search & browsing feature with similarity arrangement and a highly efficient sketch-search, optimized for mobile touch-interaction) that make it perfectly suited for flexible video retrieval in large video collections. With the help of a user study we show that the diveXplore system can be used very efficiently by both types of users: novices and experts. Our evaluation results also show that the interaction statistics of novices and experts differ in terms of used features. The details of our insights can be used to further optimize interfaces of video retrieval tools for non-experts.
|
[488] | Klaus Schöffmann, Werner Bailer, Cathal Gurrin, George M. Awad, Jakub Lokoč, Interactive Video Search: Where is the User in the Age of Deep Learning?, In MM '18 Proceedings of the 26th ACM international conference on Multimedia, ACM Press, New York (NY), pp. 2101-2103, 2018.
[bib][url] [doi] [abstract]
Abstract: In this tutorial we discuss interactive video search tools and methods, review their need in the age of deep learning, and explore video and multimedia search challenges and their role as evaluation benchmarks in the field of multimedia information retrieval. We cover three different campaigns (TRECVID, Video Browser Showdown, and the Lifelog Search Challenge), discuss their goals and rules, and present their achieved findings over the last half-decade. Moreover, we talk about datasets, tasks, evaluation procedures, and examples of interactive video search tools, as well as how they evolved over the years. Participants of this tutorial will be able to gain collective insights from all three challenges and use them for focusing their research efforts on outstanding problems that still remain unsolved in this area.
|
[487] | Klaus Schöffmann, Mario Taschwer, Stephanie Sarny, Bernd Münzer, Manfred Jürgen Primus, Doris Putzgruber-Adamitsch, Cataract-101: video dataset of 101 cataract surgeries, In MMSys '18 Proceedings of the 9th ACM Multimedia Systems Conference, ACM Press, New York (NY), pp. 421-425, 2018.
[bib][url] [doi] [abstract]
Abstract: Cataract surgery is one of the most frequently performed microscopic surgeries in the field of ophthalmology. The goal behind this kind of surgery is to replace the human eye lens with an artificial one, an intervention that is often required due to aging. The entire surgery is performed under microscopy, but co-mounted cameras allow recording and archiving of the procedure. Currently, the recorded videos are used in a postoperative manner for documentation and training. An additional benefit of recording cataract videos is that they enable video analytics (i.e., manual and/or automatic video content analysis) to investigate medically relevant research questions (e.g., the cause of complications). This, however, necessitates a medical multimedia information system trained and evaluated on existing data, which is currently not publicly available. In this work we provide a public video dataset of 101 cataract surgeries that were performed by four different surgeons over a period of 9 months. These surgeons are grouped into moderately experienced and highly experienced surgeons (assistant vs. senior physicians), providing the basis for experience-based video analytics. All videos have been annotated with quasi-standardized operation phases by a senior ophthalmic surgeon.
|
[486] | Michael Riegler, Pal Halvorsen, Bernd Münzer, Klaus Schöffmann, The Importance of Medical Multimedia, In MM '18 Proceedings of the 26th ACM international conference on Multimedia, ACM Press, New York (NY), pp. 2106-2108, 2018.
[bib][url] [doi] [abstract]
Abstract: Multimedia research is becoming more and more important for the medical domain, where an increasing number of videos and images are integrated in the daily routine of surgical and diagnostic work. While the collection of medical multimedia data is not an issue, appropriate tools for efficient use of this data are missing. This includes management and inspection of the data, visual analytics, as well as learning relevant semantics and using recognition results for optimizing surgical and diagnostic processes. The characteristics and requirements in this interesting but challenging field are different than the ones in classic multimedia domains. Therefore, this tutorial gives a general introduction to the field, provides a broad overview of specific requirements and challenges, discusses existing work and open challenges, and elaborates in detail how machine learning approaches can help in multimedia-related fields to improve the performance of surgeons/clinicians.
|
[485] | Benjamin Rainer, Stefan Petscharnig, Christian Timmerer, Merge and Forward: A Self-Organized Inter-Destination Media Synchronization Scheme for Adaptive Media Streaming over HTTP, In MediaSync, Springer, Berlin, pp. 593-627, 2018.
[bib][url] [doi] [abstract]
Abstract: In this chapter, we present Merge and Forward, an IDMS scheme for adaptive HTTP streaming as a distributed control scheme, adopting the MPEG-DASH standard as representation format. We introduce so-called IDMS sessions and describe how an unstructured peer-to-peer overlay can be created from the session information using MPEG-DASH. We objectively assess the performance of Merge and Forward with respect to convergence time (time needed until all clients hold the same reference time stamp) and scalability. After the negotiation on a reference time stamp, the clients have to synchronize their multimedia playback to the agreed reference time stamp. In order to achieve this, we propose a new adaptive media playout approach minimizing the impact of playback synchronization on the QoE. The proposed adaptive media playout is assessed subjectively using crowdsourcing. We further propose a crowdsourcing methodology for conducting subjective quality assessments in the field of IDMS by utilizing GWAP. We validate the applicability of our methodology by investigating the lower asynchronism threshold for IDMS in scenarios like online quiz games.
|
[484] | Manfred Jürgen Primus, Bernd Münzer, Andreas Leibetseder, Klaus Schöffmann, The ITEC Collaborative Video Search System at the Video Browser Showdown 2018, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 2) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Supavadee Aramvith, Noel E. O´Connor, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10705, Berlin, pp. 438-443, 2018.
[bib][url] [doi] [abstract]
Abstract: We present our video search system for the Video Browser Showdown (VBS) 2018 competition. It is based on the collaborative system used in 2017, which already performed well but also revealed high potential for improvement. Hence, based on our experience we introduce several major improvements, particularly (1) a strong optimization of similarity search, (2) various improvements for concept-based search, (3) a new flexible video inspector view, and (4) extended collaboration features, as well as numerous minor adjustments and enhancements, mainly concerning the user interface and means of user interaction. Moreover, we present a spectator view that visualizes the current activity of the team members to the audience to make the competition more attractive.
|
[483] | Manfred Jürgen Primus, Doris Putzgruber-Adamitsch, Mario Taschwer, Bernd Münzer, Yosuf El-Shabrawi, Laszlo Böszörmenyi, Klaus Schöffmann, Frame-Based Classification of Operation Phases in Cataract Surgery Videos, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 1) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Noel E. O´Connor, Supavadee Aramvith, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10704, Berlin, pp. 241-253, 2018.
[bib][url] [doi] [abstract]
Abstract: Cataract surgeries are frequently performed to correct a lens opacification of the human eye, which usually appears in the course of aging. These surgeries are conducted with the help of a microscope and are typically recorded on video for later inspection and educational purposes. However, post-hoc visual analysis of video recordings is cumbersome and time-consuming for surgeons if there is no navigation support, such as bookmarks to specific operation phases. To prepare the way for an automatic detection of operation phases in cataract surgery videos, we investigate the effectiveness of a deep convolutional neural network (CNN) to automatically assign video frames to operation phases, which can be regarded as a single-label multi-class classification problem. In absence of public datasets of cataract surgery videos, we provide a dataset of 21 videos of standardized cataract surgeries and use it to train and evaluate our CNN classifier. Experimental results display a mean F1-score of about 68% for frame-based operation phase classification, which can be further improved to 75% when considering temporal information of video frames in the CNN architecture.
|
[482] | Andrei Postoaca, Florin Pop, Radu Prodan, h-Fair: Asymptotic Scheduling of Heavy Workloads in Heterogeneous Data Centers, In 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), IEEE, Piscataway (NJ), 2018.
[bib][url] [doi] [abstract]
Abstract: Large scale computing solutions are increasingly used in the context of Big Data platforms, where efficient scheduling algorithms play an important role in providing optimized cluster resource utilization, throughput and fairness. This paper deals with the problem of scheduling a set of jobs across a cluster of machines handling the specific use case of fair scheduling for jobs and machines with heterogeneous characteristics. Although job and cluster diversity is unprecedented, most schedulers do not provide implementations that handle multiple resource type fairness in a heterogeneous system. We propose in this paper a new scheduler called h-Fair that selects jobs for scheduling based on a global dominant resource fairness heterogeneous policy, and dispatches them on machines with similar characteristics to the resource demands using the cosine similarity. We implemented h-Fair in Apache Hadoop YARN and we compare it with the existing Fair Scheduler that uses the dominant resource fairness policy based on the Google workload trace. We show that our implementation provides better cluster resource utilization and allocates more containers when jobs and machines have heterogeneous characteristics.
|
[481] | Konstantin Pogorelov, Michael Riegler, Pal Halvorsen, Steven Alexander Hicks, Kristin Ranheim Randel, Duc-Tien Dang-Nguyen, Mathias Lux, Olga Ostroukhova, Thomas de Lange, Medico Multimedia Task at MediaEval 2018, In Working Notes Proceedings of the MediaEval 2018 Workshop, CEUR Workshop Proceedings (CEUR-WS.org), Aachen, 2018.
[bib][url] [abstract]
Abstract: The Medico: Multimedia for Medicine Task, running for the second time as part of MediaEval 2018, focuses on detecting abnormalities, diseases, anatomical landmarks and other findings in images captured by medical devices in the gastrointestinal tract. The task is described, including the use case and its challenges, the dataset with ground truth, the required participant runs and the evaluation metrics.
|
[480] | Konstantin Pogorelov, Zeno Albisser, Olga Ostroukhova, Mathias Lux, Dag Johansen, Pal Halvorsen, Michael Riegler, Opensea: open search based classification tool, In MMSys '18 Proceedings of the 9th ACM Multimedia Systems Conference, ACM Press, New York (NY), pp. 363-368, 2018.
[bib][url] [doi] [abstract]
Abstract: This paper presents an open-source classification tool for image and video frame classification. The classification takes a search-based approach and relies on global and local image features. It has been shown to work with images as well as videos, and is able to perform the classification of video frames in real-time so that the output can be used while the video is recorded, playing, or streamed. OpenSea has been proven to perform comparably to state-of-the-art methods such as deep learning, while at the same time performing much faster in terms of processing speed, and can therefore be seen as an easy-to-obtain and hard-to-beat baseline. We present a detailed description of the software, its installation and use. As a use case, we demonstrate the classification of polyps in colonoscopy videos based on a publicly available dataset. We conduct leave-one-out cross-validation to show the potential of the software in terms of classification time and accuracy.
|
[479] | Stefan Petscharnig, Klaus Schöffmann, ActionVis: An Explorative Tool to Visualize Surgical Actions in Gynecologic Laparoscopy, In International Conference on Multimedia Modeling, Springer, Cham, Switzerland, pp. 1-5, 2018.
[bib][url] [doi] [abstract]
Abstract: Appropriate visualization of endoscopic surgery recordings has a huge potential to benefit surgical work life. For example, it enables surgeons to quickly browse medical interventions for purposes of documentation, medical research, discussion with colleagues, and training of young surgeons. Current literature on automatic action recognition for endoscopic surgery covers domains where surgeries follow a standardized pattern, such as cholecystectomy. However, there is a lack of support in domains where such standardization is not possible, such as gynecologic laparoscopy. We provide ActionVis, an interactive tool enabling surgeons to quickly browse endoscopic recordings. Our tool analyses the results of a post-processing of the recorded surgery. Information on individual frames is aggregated temporally into a set of scenes representing frequent surgical actions in gynecologic laparoscopy, which help surgeons to navigate within endoscopic recordings in this domain.
|
[478] | Bernd Münzer, Andreas Leibetseder, Sabrina Kletz, Manfred Jürgen Primus, Klaus Schöffmann, lifeXplore at the Lifelog Search Challenge 2018, In LSC '18 Proceedings of the 2018 ACM Workshop on The Lifelog Search Challenge, ACM Digital Library, New York, NY, 2018.
[bib][url] [doi] [abstract]
Abstract: With the growing hype for wearable devices recording biometric data comes the readiness to capture and combine even more personal information as a form of digital diary - lifelogging today is practiced ever more and can be categorized anywhere between an informative hobby and a life-changing experience. From an information processing point of view, analyzing the entirety of such multi-source data is immensely challenging, which is why the first Lifelog Search Challenge 2018 competition is brought into being, as to encourage the development of efficient interactive data retrieval systems. Answering this call, we present a retrieval system based on our video search system diveXplore, which has successfully been used in the Video Browser Showdown 2017 and 2018. Due to the different task definition and available data corpus, the base system was adapted and extended to this new challenge. The resulting lifeXplore system is a flexible retrieval and exploration tool that offers various easy-to-use, yet still powerful search and browsing features that have been optimized for lifelog data and for usage by novice users. Besides efficient presentation and summarization of lifelog data, it includes searchable feature maps, concept and metadata filters, similarity search and sketch search.
|
[477] | Bernd Münzer, Klaus Schöffmann, Video Browsing on a Circular Timeline, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 2) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Supavadee Aramvith, Noel E. O´Connor, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10705, Berlin, pp. 395-399, 2018.
[bib][url] [doi] [abstract]
Abstract: The emerging ubiquity of videos in all aspects of society demands innovative and efficient browsing and navigation mechanisms. We propose a novel visualization and interaction paradigm that replaces the traditional linear timeline with a circular timeline. The main advantages of this new concept are (1) significantly increased and dynamic navigation granularity, (2) minimized spatial distances between arbitrary points on the timeline, as well as (3) the possibility to efficiently utilize the screen space for bookmarks or other supplemental information associated with points of interest. The demonstrated prototype implementation proves the expedience of this new concept and includes additional navigation and visualization mechanisms, which altogether create a powerful video browser.
|