[8] Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Supavadee Aramvith, Noel E. O'Connor, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds., MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 2), Springer, vol. 10705, 2018.
[7] Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Noel E. O'Connor, Supavadee Aramvith, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds., MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 1), Springer, vol. 10704, 2018.
[6] Manfred Jürgen Primus, Bernd Münzer, Andreas Leibetseder, Klaus Schöffmann, The ITEC Collaborative Video Search System at the Video Browser Showdown 2018, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 2) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Supavadee Aramvith, Noel E. O'Connor, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10705, Berlin, pp. 438-443, 2018.
Abstract: We present our video search system for the Video Browser Showdown (VBS) 2018 competition. It is based on the collaborative system used in 2017, which already performed well but also revealed considerable potential for improvement. Based on this experience, we introduce several major improvements, in particular (1) a strongly optimized similarity search, (2) various improvements to concept-based search, (3) a new, flexible video inspector view, and (4) extended collaboration features, as well as numerous minor adjustments and enhancements, mainly concerning the user interface and means of user interaction. Moreover, we present a spectator view that visualizes the current activity of the team members to the audience, making the competition more attractive to watch.
[5] Manfred Jürgen Primus, Doris Putzgruber-Adamitsch, Mario Taschwer, Bernd Münzer, Yosuf El-Shabrawi, Laszlo Böszörmenyi, Klaus Schöffmann, Frame-Based Classification of Operation Phases in Cataract Surgery Videos, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 1) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Noel E. O'Connor, Supavadee Aramvith, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10704, Berlin, pp. 241-253, 2018.
Abstract: Cataract surgeries are frequently performed to correct a lens opacification of the human eye, which usually develops in the course of aging. These surgeries are conducted with the help of a microscope and are typically recorded on video for later inspection and educational purposes. However, post-hoc visual analysis of video recordings is cumbersome and time-consuming for surgeons if there is no navigation support, such as bookmarks to specific operation phases. To prepare the way for automatic detection of operation phases in cataract surgery videos, we investigate the effectiveness of a deep convolutional neural network (CNN) for automatically assigning video frames to operation phases, which can be regarded as a single-label multi-class classification problem. In the absence of public datasets of cataract surgery videos, we provide a dataset of 21 videos of standardized cataract surgeries and use it to train and evaluate our CNN classifier. Experimental results show a mean F1-score of about 68% for frame-based operation phase classification, which improves to 75% when temporal information of video frames is considered in the CNN architecture.
[4] Bernd Münzer, Klaus Schöffmann, Video Browsing on a Circular Timeline, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 2) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Supavadee Aramvith, Noel E. O'Connor, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10705, Berlin, pp. 395-399, 2018.
Abstract: The emerging ubiquity of video in all aspects of society demands innovative and efficient browsing and navigation mechanisms. We propose a novel visualization and interaction paradigm that replaces the traditional linear timeline with a circular timeline. The main advantages of this new concept are (1) significantly increased and dynamic navigation granularity, (2) minimized spatial distances between arbitrary points on the timeline, and (3) the possibility to efficiently use the screen space for bookmarks or other supplemental information associated with points of interest. The demonstrated prototype implementation shows the practicality of this new concept and includes additional navigation and visualization mechanisms, which together create a powerful video browser.
[3] Andreas Leibetseder, Sabrina Kletz, Klaus Schöffmann, Sketch-Based Similarity Search for Collaborative Feature Maps, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 2) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Supavadee Aramvith, Noel E. O'Connor, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10705, Berlin, pp. 425-430, 2018.
Abstract: Past editions of the annual Video Browser Showdown (VBS) have brought forward many tools employing a diverse range of techniques for interactive video search, among which sketch-based search has shown promising results. Aiming to explore this direction further, we present a custom approach to finding similar scenes in the TRECVID IACC.3 dataset from hand-drawn sketches, using color compositions together with contour matching. The proposed methodology is integrated into the established Collaborative Feature Maps (CFM) system, which was first used in the VBS 2017 challenge.
[2] Andreas Leibetseder, Manfred Jürgen Primus, Klaus Schöffmann, Automatic Smoke Classification in Endoscopic Video, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 2) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Supavadee Aramvith, Noel E. O'Connor, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10705, Berlin, pp. 362-366, 2018.
Abstract: Medical smoke evacuation systems enable proper, filtered removal of toxic fumes during surgery while stabilizing internal pressure during endoscopic interventions. However, since they are typically activated manually, they are prone to inefficient use: tardy activation allows smoke to interfere with the ongoing surgery, and late deactivation wastes precious resources. To address these issues, in this work we demonstrate a vision-based tool that indicates endoscopic smoke – a first step towards automatic activation of such systems and avoiding human error. In the back-end we employ a pre-trained convolutional neural network (CNN) model to distinguish images containing smoke from others.
[1] Sabrina Kletz, Andreas Leibetseder, Klaus Schöffmann, Evaluation of Visual Content Descriptors for Supporting Ad-Hoc Video Search Tasks at the Video Browser Showdown, In MultiMedia Modeling - 24th International Conference, MMM 2018 (Part 1) (Klaus Schöffmann, Thanarat H. Chalidabhongse, Chong-Wah Ngo, Noel E. O'Connor, Supavadee Aramvith, Yo-Sung Ho, Moncef Gabbouj, Ahmed Elgammal, eds.), Springer, vol. 10704, Berlin, pp. 203-215, 2018.
Abstract: Since 2017 the Video Browser Showdown (VBS) has collaborated with TRECVID to interactively evaluate Ad-Hoc Video Search (AVS) tasks in addition to Known-Item Search (KIS) tasks. In this video search competition the participants have to find scenes relevant to a given textual query within a specific time limit, in a large dataset consisting of 600 hours of video content. Since the number of relevant scenes for such an AVS query is usually rather high, the teams at the VBS 2017 could find only a small portion of them. One way to support them during interactive search would be to automatically retrieve other instances similar to an already found target scene. However, it is unclear which content descriptors should be used for such an automatic, query-by-example video content search. Therefore, in this paper we investigate several different visual content descriptors (CNN features, CEDD, COMO, HOG, Feature Signatures, and HOF) for the purpose of similarity search in the TRECVID IACC.3 dataset used for the VBS. Our evaluation shows that no single descriptor works best for every AVS query; however, when considering the total performance over all 30 AVS tasks of TRECVID 2016, CNN features perform best.