Abstract: HTTP adaptive video streaming is a widespread and sought-after technology on the Internet that allows clients to dynamically switch between different stream qualities presented in the bitrate ladder to optimize overall received video quality. Currently, there exist several approaches of different complexity for building such a ladder. The simplest method is to use a static bitrate ladder, and the more complex one is to compute a per-title encoding ladder. The main drawback of these approaches is that they do not provide bitrate ladders for scenes with different visual complexity within the video. Moreover, most modern methods require additional computationally-intensive test encodings of the entire video to construct the convex hull, used to calculate the bitrate ladder. This paper proposes a new fast per-scene encoding approach called FAUST based on 1) quick entropy-based scene detection and 2) prediction of optimized bitrate ladder for each scene using an artificial neural network. The results show that our model reduces the mean absolute error to 0.15, the mean square error to 0.08, and the bitrate to 13.5 % while increasing the difference in video multimethod assessment fusion to 5.6 points.
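
The entropy-based scene detection idea can be illustrated with a minimal sketch: compute the Shannon entropy of each frame's pixel-value histogram and flag a scene boundary where the entropy jumps between consecutive frames. This is not the FAUST implementation; the frame representation (flat lists of 8-bit values), the function names, and the threshold value are assumptions made for the example.

```python
import math
from collections import Counter


def frame_entropy(pixels):
    """Shannon entropy (in bits) of the pixel-value histogram of one frame."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def detect_scene_cuts(frames, threshold=0.5):
    """Return the indices i at which frame i starts a new scene.

    A cut is flagged when the entropy differs from that of the previous
    frame by more than `threshold` bits (a hypothetical value).
    """
    entropies = [frame_entropy(f) for f in frames]
    return [i for i in range(1, len(entropies))
            if abs(entropies[i] - entropies[i - 1]) > threshold]


# Toy input: two flat frames followed by two frames with four gray levels.
flat = [128] * 16
busy = [0, 85, 170, 255] * 4
cuts = detect_scene_cuts([flat, flat, busy, busy])
print(cuts)
```

In the toy input, the only entropy jump is between the second and third frame, so a single cut is reported at index 2. Entropy alone cannot distinguish scenes with similar histograms, which is one reason a real detector would need more than this sketch.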

Abstract: Social media is a popular medium for the dissemination of real-time news all over the world. Easy and quick information proliferation is one of the reasons for its popularity. A large number of users of different age groups, genders, and societal beliefs are engaged in social media websites. Despite these favorable aspects, a significant disadvantage comes in the form of fake news, as people usually read and share information without caring about its genuineness. Therefore, it is imperative to research methods for the authentication of news. To address this issue, this article proposes a two-phase benchmark model named WELFake based on word embedding (WE) over linguistic features for fake news detection using machine learning classification. The first phase preprocesses the data set and validates the veracity of news content by using linguistic features. The second phase merges the linguistic feature sets with WE and applies voting classification. To validate its approach, this article also carefully designs a novel WELFake data set with approximately 72,000 articles, which incorporates different data sets to generate an unbiased classification output. Experimental results show that the WELFake model classifies news as real or fake with an accuracy of 96.73%, which improves the overall accuracy by 1.31% compared to bidirectional encoder representations from transformers (BERT) and by 4.25% compared to convolutional neural network (CNN) models. Our frequency-based model, which focuses on analyzing writing patterns, outperforms predictive-based related works implemented using the Word2vec WE method by up to 1.73%.
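
The voting classification step can be pictured with a small sketch. The abstract does not say which voting scheme is used, so the example assumes hard (majority) voting; the three stand-in classifiers and their toy features are hypothetical and are not the WELFake feature sets.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


def vote_classify(classifiers, sample):
    """Ask every classifier for a label and combine them by majority vote."""
    return majority_vote([clf(sample) for clf in classifiers])


# Hypothetical stand-in classifiers, each looking at one toy feature.
classifiers = [
    lambda s: "fake" if s["exclamations"] > 3 else "real",
    lambda s: "fake" if s["caps_ratio"] > 0.3 else "real",
    lambda s: "fake" if s["word_count"] < 50 else "real",
]

sample = {"exclamations": 5, "caps_ratio": 0.1, "word_count": 30}
label = vote_classify(classifiers, sample)
print(label)
```

Two of the three stand-in classifiers vote "fake" for the sample, so the combined label is "fake". Using an odd number of voters avoids ties in the two-class case.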

Abstract: Universal access to and provisioning of multimedia content is now a reality. It is easy to generate, distribute, share, and consume any multimedia content, anywhere, anytime, on any device. Open media standards played a crucial role in enabling all these use cases, leading to a plethora of applications and services that have now become a commodity in our daily life. Interestingly, most of these services adopt a streaming paradigm, are typically deployed over the open, unmanaged Internet, and account for most of today’s Internet traffic. Currently, global video traffic is greater than 60% of all Internet traffic [1], and it is expected that this share will grow to more than 80% in the near future [2]. In addition, Nielsen’s law of Internet bandwidth states that the users’ bandwidth grows by 50% per year, which roughly fits data from 1983 to 2019 [3]. Thus, the users’ bandwidth can be expected to reach approximately 1 Gb/s by 2022. At the same time, network applications will grow and utilize the bandwidth provided, just like programs and their data expand to fill the memory available in a computer system. Most of the available bandwidth today is consumed by video applications, and the amount of data is further increasing due to already established and emerging applications, e.g., ultrahigh definition, high dynamic range, or virtual, augmented, and mixed realities, or immersive media applications in general.
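
The 50% per year growth stated by Nielsen's law is plain compound growth, which the following sketch works through. The function names and the 100 Mb/s starting value are assumptions chosen for illustration; they are not figures from the article.

```python
import math


def projected_bandwidth(base_bps, years, annual_growth=0.5):
    """Bandwidth after `years` of compound growth at `annual_growth` per year."""
    return base_bps * (1.0 + annual_growth) ** years


def years_to_multiply(factor, annual_growth=0.5):
    """Years needed for bandwidth to grow by `factor` at the given rate."""
    return math.log(factor) / math.log(1.0 + annual_growth)


doubling_years = years_to_multiply(2)    # roughly 1.71 years
tenfold_years = years_to_multiply(10)    # roughly 5.68 years

# Hypothetical starting point of 100 Mb/s, projected two years ahead.
after_two_years = projected_bandwidth(100e6, 2)
print(round(doubling_years, 2), round(tenfold_years, 2), after_two_years)
```

At this rate, bandwidth doubles roughly every 1.7 years and grows tenfold in a little under six years.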

Abstract: Live User Generated Content (UGC) has become very popular in today’s video streaming applications, in particular with gaming and e-sport. However, streaming UGC presents unique challenges for video delivery. When dealing with the technical complexity of managing hundreds or thousands of concurrent streams that are geographically distributed, UGC systems are forced to make difficult trade-offs between video quality and latency. To bridge this gap, this paper presents a fully distributed architecture for UGC delivery over the Internet, termed QuaLA (joint Quality-Latency Architecture). The proposed architecture aims to jointly optimize video quality and latency for a better user experience and fairness. By using the proximal Jacobi alternating direction method of multipliers (ProxJ-ADMM) technique, QuaLA proposes a fully distributed mechanism to achieve an appropriate solution. We demonstrate the effectiveness of the proposed architecture through real-world experiments using the CloudLAB testbed. Experimental results show that QuaLA improves video quality by more than 57% compared to conventional client-driven solutions, while preserving a good level of fairness and respecting a given target latency among all clients.

Abstract: Exponential growth in multimedia streaming traffic over the Internet motivates research and further investigation of the user's perceived quality of such services. Enhancing the quality experienced by users becomes more important when service providers compete to establish superiority by gaining more subscribers or customers. Quality of Experience (QoE) enhancement would not be possible without an authentic and accurate assessment of the streaming sessions. HTTP Adaptive Streaming (HAS) is today's prevailing technique to deliver the highest possible audio and video content quality to the users. An end-to-end evaluation of QoE in HAS covers the precise measurement of the metrics that affect the perceived quality, e.g., startup delay, stall events, and delivered media quality. Improving these metrics could limit the service's scalability, which is an important factor in real-world scenarios. In this study, we will investigate the stated metrics, best practices and evaluation methods, and available techniques with the aim to (i) design and develop practical and scalable measurement tools and prototypes, (ii) provide a better understanding of current technologies and techniques (e.g., Adaptive Bitrate algorithms), (iii) conduct in-depth research on the significant metrics such that QoE improvements remain feasible with scalability in mind, and finally (iv) provide a comprehensive QoE model which outperforms state-of-the-art models.

Abstract: With the recent growth of multimedia traffic over the Internet and emerging multimedia streaming service providers, improving Quality of Experience (QoE) for HTTP Adaptive Streaming (HAS) becomes more important. Alongside other factors, such as the media quality, HAS relies on the performance of the media player’s Adaptive Bitrate (ABR) algorithm to optimize QoE in multimedia streaming sessions. QoE in HAS suffers from weak or unstable internet connections and suboptimal ABR decisions. As a result of imperfect adaptation to the characteristics and conditions of the internet connection, stall events and quality level switches of different durations can occur and negatively affect the QoE. In this paper, we address various identified open issues related to the QoE for HAS, notably (i) the minimum noticeable duration for stall events in HAS; (ii) the correlation between the media quality and the impact of stall events on QoE; (iii) the end-user preference regarding multiple shorter stall events versus a single longer stall event; and (iv) the end-user preference of media quality switches over stall events. To this end, we have studied these open issues from both objective and subjective evaluation perspectives and presented the correlation between the two types of evaluations. The findings documented in this paper can be used as a baseline for improving ABR algorithms and policies in HAS.

Abstract: Adaptive bitrate (ABR) algorithms play a crucial role in delivering the highest possible Quality of Experience (QoE) to viewers in HTTP Adaptive Streaming (HAS). Online video streaming service providers use HAS - the dominant video streaming technique on the Internet - to deliver the best QoE for their users. A viewer's delight relies heavily on how well the ABR algorithm of a media player can adapt the stream's quality to the current network conditions. QoE for video streaming sessions has been assessed in many research projects to give better insight into the significant quality metrics such as startup delay and stall events. The ITU Telecommunication Standardization Sector (ITU-T) P.1203 quality evaluation model allows the algorithmic prediction of a subjective Mean Opinion Score (MOS) by considering various quality metrics. Subjective evaluation is the best assessment method for examining the end-user opinion of the quality experienced in a video streaming session. We have conducted subjective evaluations with crowdsourced participants and evaluated the MOS of the sessions using the ITU-T P.1203 quality model. This paper's main contribution is to investigate the correspondence of subjective and objective evaluations for well-known heuristic-based ABRs.

Abstract: Nowadays, modern ski resorts provide additional services to customers, such as recording videos of specific moments from their skiing experience. This and similar tasks can be achieved by using computer vision methods. In this work, we evaluate the detection performance of current object detection methods and the tracking performance of a detection-based tracking algorithm. The evaluation is based on videos of skiers and snowboarders from ski resorts. We collect videos of race tracks from different resorts and compile a public dataset of images and videos, where skiers and snowboarders are annotated with bounding boxes. Based on this data, we evaluate the performance of four state-of-the-art object detection methods. This evaluation is performed with general models trained on the MS COCO dataset as well as with custom models trained on our dataset. In addition, we review the performance of the detection-based, multi-object tracking algorithm Deep SORT, which we adapt for skier tracking. The results show promising performance and reveal that the MS COCO models already achieve high Precision, while training a custom model additionally improves the performance. Bigger models profit from custom training in terms of more accurate bounding box placement and higher Precision, while smaller models have an overall high training payoff. The modified Deep SORT tracker manages to follow a skier’s trajectory over an extended period and operates with high accuracy, which indicates that the tracker is overall well suited for tracking skiers and snowboarders on race tracks. Even when exposed to strong changes in camera and skier movement, the tracker stays latched onto the target.

Abstract: In light of the increased use of premium intraocular lenses (IOLs), such as EDOF IOLs, multifocal IOLs, or toric IOLs, even minor intraoperative complications, such as decentrations or an IOL tilt, will hamper the visual performance of these IOLs. Thus, the post-operative analysis of cataract surgeries to detect even minor intraoperative deviations that might explain a lack of post-operative success becomes more and more important. Up to now, surgical videos have been evaluated by looking at only a very limited number of intraoperative data sets or, as done in studies evaluating the pupil changes that occur during surgeries, at only a small number of intraoperative pictures. A continuous measurement of pupil changes over the whole surgery, which would yield clinically more relevant data, has not yet been described. Therefore, the automatic retrieval of such events may be a great support for post-operative analysis. This would be especially true if large data files could be evaluated automatically. In this work, we automatically detect pupil reactions in cataract surgery videos. We employ a Mask R-CNN architecture as a segmentation algorithm to segment the pupil and iris with pixel-based accuracy and then track their sizes across the entire video. We can detect pupil reactions with a harmonic mean (H) of Recall, Precision, and Ground Truth Coverage Rate (GTCR) of 60.9% and an average prediction length (PL) of 18.93 seconds. However, we consider the best configuration for practical use to be the one with an H value of 59.4% and a PL of 10.2 seconds, which is much shorter. We further investigate the generalization ability of this method on a slightly different dataset without retraining the model. In this evaluation, we achieve an H value of 49.3% with a PL of 18.15 seconds.
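
The harmonic mean H combines Recall, Precision, and GTCR into one score that is pulled toward the weakest of the three. The sketch below shows the standard harmonic-mean formula for three rates; the input values are made up for illustration and are not results from the paper, and the zero-handling convention is an assumption.

```python
def harmonic_mean(values):
    """Harmonic mean of positive rates; 0.0 if any rate is zero or negative."""
    if any(v <= 0 for v in values):
        return 0.0  # assumed convention: one failed metric zeroes the score
    return len(values) / sum(1.0 / v for v in values)


# Hypothetical Recall, Precision, and GTCR values.
recall, precision, gtcr = 0.8, 0.6, 0.5
h = harmonic_mean([recall, precision, gtcr])
print(round(h, 4))
```

For these inputs, H is about 0.61, below the arithmetic mean of about 0.63, which reflects how the harmonic mean penalizes an imbalance between the three rates.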

Abstract: Cognitive radio networks can efficiently manage the radio spectrum by utilizing the spectrum holes for secondary users in licensed frequency bands. The energy that is used to detect spectrum holes can be reduced considerably by predicting them. However, collisions can occur either between a primary user and secondary users or among the secondary users themselves. This paper introduces a centralized channel allocation algorithm (CCAA) in a scenario with multiple secondary users to control primary and secondary collisions. The proposed allocation algorithm, which uses a channel state predictor (CSP), provides good performance with fairness among the secondary users while they have minimal interference with the primary user. The simulation results show that the probability of a wrong prediction of an idle channel state in a multi-channel system is less than 0.9%. The channel state prediction saves the sensing energy by 73%, and the utilization of the spectrum can be improved by more than 77%.

Abstract: The Video Browser Showdown (VBS) has influenced the multimedia community for 10 years now. More than 30 unique teams from over 21 countries have participated in the VBS since 2012. In 2021, we are celebrating the 10th anniversary of VBS, where 17 international teams compete against each other in an unprecedented contest of fast and accurate multimedia retrieval. In this tutorial, we discuss the motivation and details of the VBS contest, including its history, rules, evaluation metrics, and achievements for multimedia retrieval. We talk about the properties of specific VBS retrieval systems and their unique characteristics, as well as existing open-source tools that can be used as a starting point for participating for the first time. Participants of this tutorial get a detailed understanding of the VBS and its search systems, and see the latest developments in interactive video retrieval.

Abstract: Social media applications are essential for next generation connectivity. Today, social media are centralized platforms with a single proprietary organization controlling the network and posing critical trust and governance issues over the created and propagated content. The ARTICONF project [1] funded by the European Union’s Horizon 2020 program researches a decentralized social media platform based on a novel set of trustworthy, resilient and globally sustainable tools that address privacy, robustness and autonomy-related promises that proprietary social media platforms have failed to deliver so far. This paper presents the ARTICONF approach to a car-sharing decentralized application (DApp) use case, as a new collaborative peer-to-peer model providing an alternative solution to private car ownership. We describe a prototype implementation of the car-sharing social media DApp and illustrate through real snapshots how the different ARTICONF tools support it in a simulated scenario.

Abstract: Despite the fact that automatic content analysis has made remarkable progress over the last decade - mainly due to significant advances in machine learning - interactive video retrieval is still a very challenging problem, with an increasing relevance in practical applications. The Video Browser Showdown (VBS) is an annual evaluation competition that pushes the limits of interactive video retrieval with state-of-the-art tools, tasks, data, and evaluation metrics. In this paper, we analyse the results and outcome of the 8th iteration of the VBS in detail. We first give an overview of the novel and considerably larger V3C1 dataset and the tasks that were performed during VBS 2019. We then describe the search systems of the six international teams in terms of features and performance. Finally, we perform an in-depth analysis of the per-team success ratio and relate this to the search strategies that were applied, the most popular features, and problems that were experienced. A large part of this analysis was conducted based on logs that were collected during the competition itself. This analysis gives further insights into the typical search behavior and differences between expert and novice users. Our evaluation shows that textual search and content browsing are the most important aspects in terms of logged user interactions. Furthermore, we observe a trend towards deep learning based features, especially in the form of labels generated by artificial neural networks. Nevertheless, for some tasks, very specific content-based search features are still being used. We expect these findings to contribute to future improvements of interactive video search systems.

Abstract: Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions. While numerous methods for detecting, segmenting and tracking of medical instruments based on endoscopic video images have been proposed in the literature, key limitations remain to be addressed: Firstly, robustness, that is, the reliable performance of state-of-the-art methods when run on challenging images (e.g. in the presence of blood, smoke or motion artifacts). Secondly, generalization; algorithms trained for a specific intervention in a specific hospital should generalize to other interventions or institutions. In an effort to promote solutions for these limitations, we organized the Robust Medical Instrument Segmentation (ROBUST-MIS) challenge as an international benchmarking competition with a specific focus on the robustness and generalization capabilities of algorithms. For the first time in the field of endoscopic image processing, our challenge included a task on binary segmentation and also addressed multi-instance detection and segmentation. The challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures from three different types of surgery. The validation of the competing methods for the three tasks (binary segmentation, multi-instance detection and multi-instance segmentation) was performed in three different stages with an increasing domain gap between the training and the test data. The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap. While the average detection and segmentation quality of the best-performing algorithms is high, future research should concentrate on detection and segmentation of small, crossing, moving and transparent instrument(s) (parts).

Abstract: Organisations possess and continuously generate huge amounts of static and stream data, especially with the proliferation of Internet of Things technologies. Collected but unused data, i.e., Dark Data, mean loss in value creation potential. In this respect, the concept of Computing Continuum extends the traditional more centralised Cloud Computing paradigm with Fog and Edge Computing in order to ensure low latency pre-processing and filtering close to the data sources. However, there are still major challenges to be addressed, in particular related to management of various phases of Big Data processing on the Computing Continuum. In this paper, we set forth an ecosystem for Big Data pipelines in the Computing Continuum and introduce five relevant real-life example use cases in the context of the proposed ecosystem.

Abstract: Cloud data centers exploit many memory page management techniques that reduce the total memory utilization and access time. These techniques are mainly applied to a hypervisor in a single host (intra-hypervisor) without the possibility to exploit the knowledge obtained by a group of hosts (clusters). We introduce a novel inter-hypervisor orchestration platform to provide intelligent memory page management for horizontal scaling. It will use the performance behavior of faster virtual machines to activate pre-fetching mechanisms that reduce the number of page faults. The overall platform consists of five modules: profiler, collector, classifier, predictor, and pre-fetcher. We developed and deployed a prototype of the platform, which comprises the first three modules. The evaluation shows that data collection is feasible in real time, which means that if our approach is used on top of the existing memory page management techniques, it can significantly lower the miss rate that initiates page faults.

Abstract: Now that drones have evolved from bulky platforms to agile devices, a challenge is to combine multiple drones into an integrated autonomous system, offering functionality that individual drones cannot achieve. Such multidrone systems require connectivity, communication, and coordination. We discuss these building blocks along with case studies and lessons learned.

Abstract: We present IVOS, an interactive video content search system that allows for object-based search and filtering in video archives. The main idea is to use the results of recent object detection models to index all keyframes with a manageable set of object classes, and to allow the user to filter by different characteristics, such as object name, object location, relative object size, object color, and combinations for different object classes – e.g., “large person in white on the left, with a red tie”. In addition, IVOS can find segments with a specific number of objects of a particular class (e.g., “many apples” or “two people”) and supports similarity search based on similar object occurrences.
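
The kind of object-based filtering described above can be sketched over a small in-memory index. This is not the IVOS implementation: the `Detection` record, its normalized fields, the left/right split at the frame center, and both query functions are hypothetical choices made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Detection:
    keyframe: str
    label: str       # object class name
    x_center: float  # horizontal box center, normalized to 0..1
    area: float      # box area relative to the frame, 0..1
    color: str       # dominant object color


def search(index, label, side=None, min_area=0.0, color=None):
    """Keyframes with a `label` object matching the optional filters."""
    hits = set()
    for d in index:
        if d.label != label or d.area < min_area:
            continue
        if side == "left" and d.x_center >= 0.5:
            continue
        if side == "right" and d.x_center < 0.5:
            continue
        if color is not None and d.color != color:
            continue
        hits.add(d.keyframe)
    return sorted(hits)


def keyframes_with_count(index, label, count):
    """Keyframes containing exactly `count` objects of class `label`."""
    per_frame = Counter(d.keyframe for d in index if d.label == label)
    return sorted(kf for kf, n in per_frame.items() if n == count)


index = [
    Detection("kf1", "person", 0.2, 0.40, "white"),
    Detection("kf1", "tie", 0.2, 0.02, "red"),
    Detection("kf2", "person", 0.8, 0.50, "white"),
    Detection("kf3", "person", 0.3, 0.05, "white"),
    Detection("kf3", "person", 0.7, 0.06, "black"),
]

large_white_left = search(index, "person", side="left", min_area=0.3, color="white")
two_people = keyframes_with_count(index, "person", 2)
print(large_white_left, two_people)
```

A combined query such as "large person in white on the left, with a red tie" could then be expressed as the intersection of two `search` results, one per object class.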

Abstract: The push for agile pandemic analytic solutions has produced development-stage software modules rather than full-fledged production-stage applications – i.e., performance, scalability, and energy-related concerns are not optimized for the underlying computing domains. While research continues to support the idea that reducing the energy consumption of algorithms extends the lifetime of battery-operated machines, an energy analysis report for R-based analytic programs is a valuable aid in almost any developer setting. This article proposes an energy analysis framework for R programs that enables data analytic developers, including pandemic-related application developers, to analyze their programs. It presents an energy analysis report for R programs written to predict the new cases of 215 countries using random forest variants. Experiments were carried out at the IoT cloud research lab, and the energy efficiency aspects are discussed in the article. In the experiments, the ranger-based prediction program consumed 95.8 J.

Abstract: MU-MIMO is a high-speed technique in IEEE 802.11ac and upcoming 802.11ax technologies that improves spectral efficiency by allowing concurrent communication between one Access Point and multiple users. In this paper, we present MuVIS, a novel framework that proposes MU-MIMO-aware optimization for multi-user multimedia applications over IEEE 802.11ac/ax. Taking a cross-layer approach, MuVIS first optimizes the MU-MIMO user group selection for the users with the same characteristics in the PHY/MAC layer. It then optimizes the video bitrate for each group accordingly. We present our design and its evaluation on smartphones and laptops over 802.11ac WiFi.

Abstract: MU-MIMO is a high-speed technique in IEEE 802.11ac and upcoming 802.11ax technologies that improves spectral efficiency by allowing concurrent communication between one Access Point and multiple users. In this paper, we present LATTE, a novel framework that proposes MU-MIMO-aware optimization for multi-user multimedia applications over IEEE 802.11ac/ax. Taking a cross-layer approach, LATTE first optimizes the MU-MIMO user group selection for the users with the same characteristics in the PHY/MAC layer. It then optimizes the video bitrate for each group accordingly. We present our design and its evaluation on smartphones and laptops over 802.11ac WiFi. Our experimental evaluations indicate that LATTE can outperform other video rate adaptation algorithms.

Abstract: Video streaming services account for the majority of today's traffic on the Internet. Although the data transmission rate has been increasing significantly, the growing number and variety of media and higher quality expectations of users have led networked media applications to fully or even over-utilize the available throughput. HTTP Adaptive Streaming (HAS) has become a predominant technique for multimedia delivery over the Internet today. However, there are critical challenges for multimedia systems, especially the tradeoff between the increasing content (complexity) and various requirements regarding time (latency) and quality (QoE). This thesis will cover the main aspects within the end user's environment, including video consumption and interactivity, collectively referred to as the player environment, which is probably the most crucial component in today's multimedia applications and services. We will investigate methods that enable the specification of various policies reflecting the user's needs in given use cases. In addition, we will work on schemes that allow efficient support for server-assisted and network-assisted HAS systems. Finally, we will consider combining those approaches into policies that fit the requirements of all use cases (e.g., live streaming, video on demand, etc.).

Abstract: Fog computing emerged as a crucial platform for the deployment of IoT applications. The complexity of such applications requires methods that handle the resource diversity and network structure of Fog devices, while maximizing the service placement and reducing the resource wastage. Prior studies in this domain primarily focused on optimizing application-specific requirements and failed to address the network topology combined with the different types of resources encountered in Fog devices. To overcome these problems, we propose a multilayer resource-aware partitioning method to minimize the resource wastage and maximize the service placement and deadline satisfaction rates in a Fog infrastructure with high multi-user application placement requests. Our method represents the heterogeneous Fog resources as a multilayered network graph and partitions them based on network topology and resource features. Afterwards, it identifies the appropriate device partitions for placing an application according to its requirements, which need to overlap in the same network topology partition. Simulation results show that our multilayer resource-aware partitioning method is able to place twice as many services, satisfy deadlines for three times as many application requests, and reduce the resource wastage by up to 15–32 times compared to two availability-aware and resource-aware state-of-the-art methods.

Abstract: Online games are a fundamental part of the entertainment industry, but the current IP infrastructure does not satisfactorily fulfill the needs of these services. The novel networking architecture Named Data Networking (NDN) inherently supports network-level multicast and packet-level security and thereby introduces promising features for online games. In this paper, we propose an NDN-based approach to synchronize game state in a server cluster, a task necessary to allow large numbers of players to play in the same game world. The proposed Quadtree Synchronization Protocol applies NDN’s data-centric nature to decouple the game world from the game servers hosting it. This means that requesting changes of a specific game world region becomes possible without knowing which game server is responsible for the requested region. We use a hierarchical game world structure when requesting data, which allows the network to forward requests to the responsible game server without directly addressing it. This region-based naming scheme decouples world regions from servers, which eases the management of the game server cluster and allows easier recovery after server failures. In addition, this decoupling allows exchanging information about a geographical region, such as a game world, without knowledge of the other participants changing the world. Such a region-based synchronization mode is not possible to implement with existing protocols. However, it allows building distributed systems that do not require a central server to work. Besides architectural benefits, network emulations show that our protocol increases the efficiency of data transport by utilizing network-level multicast. Our proposed approach can keep up with current protocols that can be used for inter-server game state synchronization.
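
A region-based hierarchical name can be derived from a quadtree subdivision of the game world, as the following sketch shows. It only illustrates the general idea: the `/game/world` prefix, the quadrant numbering (0 to 3), the world size, and the function name are assumptions, not the naming format of the Quadtree Synchronization Protocol.

```python
def region_name(x, y, depth, world_size=1024.0, prefix="/game/world"):
    """Hierarchical name of the quadtree cell at `depth` that contains (x, y).

    Each level halves the current cell and appends a quadrant number:
    0 = low x / low y, 1 = high x / low y, 2 = low x / high y, 3 = high x / high y.
    """
    x0, y0, size = 0.0, 0.0, world_size
    parts = [prefix]
    for _ in range(depth):
        size /= 2.0
        quadrant = 0
        if x >= x0 + size:
            quadrant += 1
            x0 += size
        if y >= y0 + size:
            quadrant += 2
            y0 += size
        parts.append(str(quadrant))
    return "/".join(parts)


coarse = region_name(100, 700, 1)
fine = region_name(100, 700, 3)
other = region_name(700, 100, 2)
print(coarse, fine, other)
```

Because the name of a coarse region is a prefix of the names of all regions inside it, a requester only needs the coordinates of interest to build a name, and name-based forwarding can deliver the request without the requester knowing which server hosts the region.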

Abstract: Video delivery over the Internet has become a commodity in recent years, owing to the widespread use of Dynamic Adaptive Streaming over HTTP (DASH). The DASH specification defines a hierarchical data model for Media Presentation Descriptions (MPDs) in terms of segments. This paper focuses on segmenting video into multiple shots for encoding in Video on Demand (VoD) HTTP Adaptive Streaming (HAS) applications. To this end, we propose a novel shot detection algorithm based on Discrete Cosine Transform (DCT) features and successive elimination, and compare it against the default shot detection algorithm of the x265 implementation of the High Efficiency Video Coding (HEVC) standard. Our experimental results demonstrate that our proposed feature-based pre-processor achieves a recall rate 25% higher and an F-measure 20% higher than the benchmark algorithm for shot detection.

2021
[952]	Anatoliy Zabrovskiy, Prateek Agrawal, Christian Timmerer, Radu Prodan, FAUST: Fast Per-Scene Encoding Using Entropy-Based Scene Detection and Machine Learning, In 2021 30th Conference of Open Innovations Association (FRUCT), IEEE, pp. 292-302, 2021. [bib][url] [doi] [abstract] Abstract: HTTP adaptive video streaming is a widespread and sought-after technology on the Internet that allows clients to dynamically switch between different stream qualities presented in the bitrate ladder to optimize overall received video quality. Currently, there exist several approaches of different complexity for building such a ladder. The simplest method is to use a static bitrate ladder, and the more complex one is to compute a per-title encoding ladder. The main drawback of these approaches is that they do not provide bitrate ladders for scenes with different visual complexity within the video. Moreover, most modern methods require additional computationally-intensive test encodings of the entire video to construct the convex hull, used to calculate the bitrate ladder. This paper proposes a new fast per-scene encoding approach called FAUST based on 1) quick entropy-based scene detection and 2) prediction of optimized bitrate ladder for each scene using an artificial neural network. The results show that our model reduces the mean absolute error to 0.15, the mean square error to 0.08, and the bitrate to 13.5 % while increasing the difference in video multimethod assessment fusion to 5.6 points.
[951]	Pawan Kumar Verma, Prateek Agrawal, Ivone Amorim, Radu Prodan, WELFake: Word Embedding Over Linguistic Features for Fake News Detection, In IEEE Transactions on Computational Social Systems, Institute of Electrical and Electronics Engineers (IEEE), vol. 8, no. 4, pp. 881-893, 2021. [bib][url] [doi] [abstract] Abstract: Social media is a popular medium for the dissemination of real-time news all over the world. Easy and quick information proliferation is one of the reasons for its popularity. An extensive number of users with different age groups, gender, and societal beliefs are engaged in social media websites. Despite these favorable aspects, a significant disadvantage comes in the form of fake news, as people usually read and share information without caring about its genuineness. Therefore, it is imperative to research methods for the authentication of news. To address this issue, this article proposes a two-phase benchmark model named WELFake based on word embedding (WE) over linguistic features for fake news detection using machine learning classification. The first phase preprocesses the data set and validates the veracity of news content by using linguistic features. The second phase merges the linguistic feature sets with WE and applies voting classification. To validate its approach, this article also carefully designs a novel WELFake data set with approximately 72,000 articles, which incorporates different data sets to generate an unbiased classification output. Experimental results show that the WELFake model categorizes the news into real and fake with 96.73% accuracy, which improves the overall accuracy by 1.31% compared to bidirectional encoder representations from transformer (BERT) and 4.25% compared to convolutional neural network (CNN) models. Our frequency-based and focused analyzing writing patterns model outperforms predictive-based related works implemented using the Word2vec WE method by up to 1.73%.
[950]	Christian Timmerer, Mathias Wien, Lu Yu, Amy Reibman, Special issue on Open Media Compression: Overview, Design Criteria, and Outlook on Emerging Standards, In Proceedings of the IEEE, Institute of Electrical and Electronics Engineers (IEEE), vol. 109, no. 9, pp. 1423-1434, 2021. [bib][url] [doi] [abstract] Abstract: Universal access to and provisioning of multimedia content is now a reality. It is easy to generate, distribute, share, and consume any multimedia content, anywhere, anytime, or any device. Open media standards took a crucial role toward enabling all these use cases leading to a plethora of applications and services that have now become a commodity in our daily life. Interestingly, most of these services adopt a streaming paradigm, are typically deployed over the open, unmanaged Internet, and account for most of today’s Internet traffic. Currently, the global video traffic is greater than 60% of all Internet traffic [1], and it is expected that this share will grow to more than 80% in the near future [2]. In addition, Nielsen’s law of Internet bandwidth states that the users’ bandwidth grows by 50% per year, which roughly fits data from 1983 to 2019 [3]. Thus, the users’ bandwidth can be expected to reach approximately 1 Gb/s by 2022. At the same time, network applications will grow and utilize the bandwidth provided, just like programs and their data expand to fill the memory available in a computer system. Most of the available bandwidth today is consumed by video applications, and the amount of data is further increasing due to already established and emerging applications, e.g., ultrahigh definition, high dynamic range, or virtual, augmented, mixed realities, or immersive media applications in general.
[949]	Farzad Tashtarian, Abdelhak Bentaleb, Reza Farahani, Minh Nguyen, Christian Timmerer, Hermann Hellwagner, Roger Zimmermann, A Distributed Delivery Architecture for User Generated Content Live Streaming over HTTP, In 2021 IEEE 46th Conference on Local Computer Networks (LCN), IEEE, pp. 162-169, 2021. [bib][url] [doi] [abstract] Abstract: Live User Generated Content (UGC) has become very popular in today’s video streaming applications, in particular with gaming and e-sport. However, streaming UGC presents unique challenges for video delivery. When dealing with the technical complexity of managing hundreds or thousands of concurrent streams that are geographically distributed, UGC systems are forced to make difficult trade-offs with video quality and latency. To bridge this gap, this paper presents a fully distributed architecture for UGC delivery over the Internet, termed QuaLA (joint Quality-Latency Architecture). The proposed architecture aims to jointly optimize video quality and latency for a better user experience and fairness. By using the proximal Jacobi alternating direction method of multipliers (ProxJ-ADMM) technique, QuaLA proposes a fully distributed mechanism to achieve an appropriate solution. We demonstrate the effectiveness of the proposed architecture through real-world experiments using the CloudLAB testbed. Experimental results show the outperformance of QuaLA in achieving high quality with more than 57% improvement while preserving a good level of fairness and respecting a given target latency among all clients compared to conventional client-driven solutions.
[948]	Babak Taraghi, End-to-end Quality of Experience Evaluation for HTTP Adaptive Streaming, In Proceedings of the 29th ACM International Conference on Multimedia, ACM, pp. 2936-2939, 2021. [bib][url] [doi] [abstract] Abstract: Exponential growth in multimedia streaming traffic over the Internet motivates the research and further investigation of the user's perceived quality of such services. Enhancement of experienced quality by the users becomes more substantial when service providers compete on establishing superiority by gaining more subscribers or customers. Quality of Experience (QoE) enhancement would not be possible without an authentic and accurate assessment of the streaming sessions. HTTP Adaptive Streaming (HAS) is today's prevailing technique to deliver the highest possible audio and video content quality to the users. An end-to-end evaluation of QoE in HAS covers the precise measurement of the metrics that affect the perceived quality, e.g., startup delay, stall events, and delivered media quality. Improvements in the mentioned metrics could limit the service's scalability, which is an important factor in real-world scenarios. In this study, we will investigate the stated metrics, best practices and evaluation methods, and available techniques with an aim to (i) design and develop practical and scalable measurement tools and prototypes, (ii) provide a better understanding of current technologies and techniques (e.g., Adaptive Bitrate algorithms), (iii) conduct in-depth research on the significant metrics in a way that improvements of QoE with scalability in mind would be feasible, and finally (iv) provide a comprehensive QoE model which outperforms state-of-the-art models.
[947]	Babak Taraghi, Minh Nguyen, Hadi Amirpour, Christian Timmerer, Intense: In-Depth Studies on Stall Events and Quality Switches and Their Impact on the Quality of Experience in HTTP Adaptive Streaming, In IEEE Access, Institute of Electrical and Electronics Engineers (IEEE), vol. 9, pp. 118087-118098, 2021. [bib][url] [doi] [abstract] Abstract: With the recent growth of multimedia traffic over the Internet and emerging multimedia streaming service providers, improving Quality of Experience (QoE) for HTTP Adaptive Streaming (HAS) becomes more important. Alongside other factors, such as the media quality, HAS relies on the performance of the media player’s Adaptive Bitrate (ABR) algorithm to optimize QoE in multimedia streaming sessions. QoE in HAS suffers from weak or unstable internet connections and suboptimal ABR decisions. As a result of imperfect adaptiveness to the characteristics and conditions of the internet connection, stall events and quality level switches could occur and with different durations that negatively affect the QoE. In this paper, we address various identified open issues related to the QoE for HAS, notably (i) the minimum noticeable duration for stall events in HAS; (ii) the correlation between the media quality and the impact of stall events on QoE; (iii) the end-user preference regarding multiple shorter stall events versus a single longer stall event; and (iv) the end-user preference of media quality switches over stall events. Therefore, we have studied these open issues from both objective and subjective evaluation perspectives and presented the correlation between the two types of evaluations. The findings documented in this paper can be used as a baseline for improving ABR algorithms and policies in HAS.
[946]	Babak Taraghi, Abdelhak Bentaleb, Christian Timmerer, Roger Zimmermann, Hermann Hellwagner, Understanding quality of experience of heuristic-based HTTP adaptive bitrate algorithms, In Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video, ACM, pp. 82-89, 2021. [bib][url] [doi] [abstract] Abstract: Adaptive bitrate (ABR) algorithms play a crucial role in delivering the highest possible viewer's Quality of Experience (QoE) in HTTP Adaptive Streaming (HAS). Online video streaming service providers use HAS - the dominant video streaming technique on the Internet - to deliver the best QoE for their users. A viewer's delight relies heavily on how the ABR of a media player can adapt the stream's quality to the current network conditions. QoE for video streaming sessions has been assessed in many research projects to give better insight into the significant quality metrics such as startup delay and stall events. The ITU Telecommunication Standardization Sector (ITU-T) P.1203 quality evaluation model allows to algorithmically predict a subjective Mean Opinion Score (MOS) by considering various quality metrics. Subjective evaluation is the best assessment method for examining the end-user opinion over a video streaming session's experienced quality. We have conducted subjective evaluations with crowdsourced participants and evaluated the MOS of the sessions using the ITU-T P.1203 quality model. This paper's main contribution is to investigate the correspondence of subjective and objective evaluations for well-known heuristic-based ABRs.
[945]	Philip Steinkellner, Klaus Schöffmann, Evaluation of Object Detection Systems and Video Tracking in Skiing Videos, In 2021 International Conference on Content-Based Multimedia Indexing (CBMI), IEEE, pp. 1-6, 2021. [bib][url] [doi] [abstract] Abstract: Nowadays, modern ski resorts provide additional services to customers, such as recording videos of specific moments from their skiing experience. This and similar tasks can be achieved by using computer vision methods. In this work, we evaluate the detection performance of current object detection methods and the tracking performance of a detection-based tracking algorithm. The evaluation is based on videos of skiers and snowboarders from ski resorts. We collect videos of race tracks from different resorts and compile a public dataset of images and videos, where skiers and snowboarders are annotated with bounding boxes. Based on this data, we evaluate the performance of four state-of-the-art object detection methods. This evaluation is performed with general models trained on the MS COCO dataset as well as with custom models trained on our dataset. In addition, we review the performance of the detection-based, multi-object tracking algorithm Deep SORT, which we adapt for skier tracking. The results show promising performance and reveal that the MS COCO models already achieve high Precision, while training a custom model additionally improves the performance. Bigger models profit from custom training in terms of more accurate bounding box placement and higher Precision, while smaller models have an overall high training payoff. The modified Deep SORT tracker manages to follow a skier’s trajectory over an extended period and operates with high accuracy, which indicates that the tracker is overall well suited for tracking of skiers and snowboarders on race tracks. Even when exposed to strong camera and skier movement changes, the tracker stays latched onto the target.
[944]	Natalia Sokolova, Klaus Schoeffmann, Mario Taschwer, Stephanie Sarny, Doris Putzgruber-Adamitsch, Yosuf El-Shabrawi, Automatic detection of pupil reactions in cataract surgery videos, In PLOS ONE (Andreas Wedrich, ed.), Public Library of Science (PLoS), vol. 16, no. 10, pp. e0258390, 2021. [bib][url] [doi] [abstract] Abstract: In the light of an increased use of premium intraocular lenses (IOL), such as EDOF IOLs, multifocal IOLs or toric IOLs, even minor intraoperative complications such as decentrations or an IOL tilt will hamper the visual performance of these IOLs. Thus, the post-operative analysis of cataract surgeries to detect even minor intraoperative deviations that might explain a lack of a post-operative success becomes more and more important. Up to now, surgical videos are evaluated by just looking at a very limited number of intraoperative data sets, or as done in studies evaluating the pupil changes that occur during surgeries, in a small number of intraoperative pictures only. A continuous measurement of pupil changes over the whole surgery, that would achieve clinically more relevant data, has not yet been described. Therefore, the automatic retrieval of such events may be a great support for a post-operative analysis. This would be especially true if large data files could be evaluated automatically. In this work, we automatically detect pupil reactions in cataract surgery videos. We employ a Mask R-CNN architecture as a segmentation algorithm to segment the pupil and iris with pixel-based accuracy and then track their sizes across the entire video. We can detect pupil reactions with a harmonic mean (H) of Recall, Precision, and Ground Truth Coverage Rate (GTCR) of 60.9% and average prediction length (PL) of 18.93 seconds. However, we consider the best configuration for practical use the one with the H value of 59.4% and PL of 10.2 seconds, which is much shorter.
We further investigate the generalization ability of this method on a slightly different dataset without retraining the model. In this evaluation, we achieve the H value of 49.3% with the PL of 18.15 seconds.
[943]	Nakisa Shams, Hadi Amirpour, Christian Timmerer, Mohammad Ghanbari, A Channel Allocation Algorithm for Cognitive Radio Users Based on Channel State Predictors, Chapter in Proceedings of Sixth International Congress on Information and Communication Technology, Springer Singapore, vol. 235, pp. 711-719, 2021. [bib][url] [doi] [abstract] Abstract: Cognitive radio networks can efficiently manage the radio spectrum by utilizing the spectrum holes for secondary users in licensed frequency bands. The energy that is used to detect spectrum holes can be reduced considerably by predicting them. However, collisions can occur either between a primary user and secondary users or among the secondary users themselves. This paper introduces a centralized channel allocation algorithm (CCAA) in a scenario with multiple secondary users to control primary and secondary collisions. The proposed allocation algorithm, which uses a channel state predictor (CSP), provides good performance with fairness among the secondary users while they have minimal interference with the primary user. The simulation results show that the probability of a wrong prediction of an idle channel state in a multi-channel system is less than 0.9%. The channel state prediction saves the sensing energy by 73%, and the utilization of the spectrum can be improved by more than 77%.
[942]	Klaus Schoeffmann, Jakub Lokoc, Werner Bailer, 10 years of video browser showdown, In Proceedings of the 2nd ACM International Conference on Multimedia in Asia, ACM, pp. 1-3, 2021. [bib][url] [doi] [abstract] Abstract: The Video Browser Showdown (VBS) has influenced the Multimedia community for 10 years now. More than 30 unique teams from over 21 countries have participated in the VBS since 2012. In 2021, we are celebrating the 10th anniversary of VBS, where 17 international teams compete against each other in an unprecedented contest of fast and accurate multimedia retrieval. In this tutorial we discuss the motivation and details of the VBS contest, including its history, rules, evaluation metrics, and achievements for multimedia retrieval. We talk about the properties of specific VBS retrieval systems and their unique characteristics, as well as existing open-source tools that can be used as a starting point for participating for the first time. Participants of this tutorial get a detailed understanding of the VBS and its search systems, and see the latest developments of interactive video retrieval.
[941]	Nishant Saurabh, Carlos Rubia, Anandakumar Palanisamy, Spiros Koulouzis, Mirsat Sefidanoski, Antorweep Chakravorty, Zhiming Zhao, Aleksandar Karadimce, Radu Prodan, The ARTICONF Approach to Decentralized Car-Sharing, In Blockchain: Research and Applications, Elsevier BV, pp. 1-37, 2021. [bib][url] [doi] [abstract] Abstract: Social media applications are essential for next generation connectivity. Today, social media are centralized platforms with a single proprietary organization controlling the network and posing critical trust and governance issues over the created and propagated content. The ARTICONF project [1] funded by the European Union’s Horizon 2020 program researches a decentralized social media platform based on a novel set of trustworthy, resilient and globally sustainable tools that address privacy, robustness and autonomy-related promises that proprietary social media platforms have failed to deliver so far. This paper presents the ARTICONF approach to a car-sharing decentralized application (DApp) use case, as a new collaborative peer-to-peer model providing an alternative solution to private car ownership. We describe a prototype implementation of the car-sharing social media DApp and illustrate through real snapshots how the different ARTICONF tools support it in a simulated scenario.
[940]	Luca Rossetto, Ralph Gasser, Jakub Lokoc, Werner Bailer, Klaus Schoeffmann, Bernd Muenzer, Tomas Soucek, Phuong Anh Nguyen, Paolo Bolettieri, Andreas Leibetseder, Stefanos Vrochidis, Interactive Video Retrieval in the Age of Deep Learning - Detailed Evaluation of VBS 2019, In IEEE Transactions on Multimedia, Institute of Electrical and Electronics Engineers (IEEE), vol. 23, pp. 243-256, 2021. [bib][url] [doi] [abstract] Abstract: Despite the fact that automatic content analysis has made remarkable progress over the last decade - mainly due to significant advances in machine learning - interactive video retrieval is still a very challenging problem, with an increasing relevance in practical applications. The Video Browser Showdown (VBS) is an annual evaluation competition that pushes the limits of interactive video retrieval with state-of-the-art tools, tasks, data, and evaluation metrics. In this paper, we analyse the results and outcome of the 8th iteration of the VBS in detail. We first give an overview of the novel and considerably larger V3C1 dataset and the tasks that were performed during VBS 2019. We then go on to describe the search systems of the six international teams in terms of features and performance. And finally, we perform an in-depth analysis of the per-team success ratio and relate this to the search strategies that were applied, the most popular features, and problems that were experienced. A large part of this analysis was conducted based on logs that were collected during the competition itself. This analysis gives further insights into the typical search behavior and differences between expert and novice users. Our evaluation shows that textual search and content browsing are the most important aspects in terms of logged user interactions. Furthermore, we observe a trend towards deep learning based features, especially in the form of labels generated by artificial neural networks. 
But nevertheless, for some tasks, very specific content-based search features are still being used. We expect these findings to contribute to future improvements of interactive video search systems.
[939]	Tobias Ross, Annika Reinke, Peter M. Full, Martin Wagner, Hannes Kenngott, Martin Apitz, Hellena Hempe, Diana Mindroc-Filimon, Patrick Scholz, Thuy Nuong Tran, Pierangela Bruno, Pablo Arbeláez, Gui-Bin Bian, Sebastian Bodenstedt, Jon Lindström Bolmgren, Laura Bravo-Sánchez, Hua-Bin Chen, Cristina González, Dong Guo, Paal Halvorsen, Pheng-Ann Heng, Enes Hosgor, Zeng-Guang Hou, Fabian Isensee, Debesh Jha, Tingting Jiang, Yueming Jin, Kadir Kirtac, Sabrina Kletz, Stefan Leger, Zhixuan Li, Klaus H. Maier-Hein, Zhen-Liang Ni, Michael A. Riegler, Klaus Schoeffmann, Ruohua Shi, Stefanie Speidel, Michael Stenzel, Isabell Twick, Gutai Wang, Jiacheng Wang, Liansheng Wang, Lu Wang, Yujie Zhang, Yan-Jie Zhou, Lei Zhu, Manuel Wiesenfarth, Annette Kopp-Schneider, Beat P. Müller-Stich, Lena Maier-Hein, Comparative validation of multi-instance instrument segmentation in endoscopy: Results of the ROBUST-MIS 2019 challenge, In Medical Image Analysis, Elsevier BV, vol. 70, no. 66, pp. 1-62, 2021. [bib][url] [doi] [abstract] Abstract: Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions. While numerous methods for detecting, segmenting and tracking of medical instruments based on endoscopic video images have been proposed in the literature, key limitations remain to be addressed: Firstly, robustness, that is, the reliable performance of state-of-the-art methods when run on challenging images (e.g. in the presence of blood, smoke or motion artifacts). Secondly, generalization; algorithms trained for a specific intervention in a specific hospital should generalize to other interventions or institutions. In an effort to promote solutions for these limitations, we organized the Robust Medical Instrument Segmentation (ROBUST-MIS) challenge as an international benchmarking competition with a specific focus on the robustness and generalization capabilities of algorithms. 
For the first time in the field of endoscopic image processing, our challenge included a task on binary segmentation and also addressed multi-instance detection and segmentation. The challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures from three different types of surgery. The validation of the competing methods for the three tasks (binary segmentation, multi-instance detection and multi-instance segmentation) was performed in three different stages with an increasing domain gap between the training and the test data. The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap. While the average detection and segmentation quality of the best-performing algorithms is high, future research should concentrate on detection and segmentation of small, crossing, moving and transparent instrument(s) (parts).
[938]	Dumitru Roman, Nikolay Nikolov, Ahmet Soylu, Brian Elvesaeter, Hui Song, Radu Prodan, Dragi Kimovski, Andrea Marrella, Francesco Leotta, Mihhail Matskin, Giannis Ledakis, Konstantinos Theodosiou, Anthony Simonet-Boulogne, Fernando Perales, Evgeny Kharlamov, Alexandre Ulisses, Arnor Solberg, Raffaele Ceccarelli, Big Data Pipelines on the Computing Continuum: Ecosystem and Use Cases Overview, In 2021 IEEE Symposium on Computers and Communications (ISCC), IEEE, pp. 1-4, 2021. [bib][url] [doi] [abstract] Abstract: Organisations possess and continuously generate huge amounts of static and stream data, especially with the proliferation of Internet of Things technologies. Collected but unused data, i.e., Dark Data, mean loss in value creation potential. In this respect, the concept of Computing Continuum extends the traditional more centralised Cloud Computing paradigm with Fog and Edge Computing in order to ensure low latency pre-processing and filtering close to the data sources. However, there are still major challenges to be addressed, in particular related to management of various phases of Big Data processing on the Computing Continuum. In this paper, we set forth an ecosystem for Big Data pipelines in the Computing Continuum and introduce five relevant real-life example use cases in the context of the proposed ecosystem.
[937]	Sasko Ristov, Thomas Fahringer, Radu Prodan, Magdalena Kostoska, Marjan Gusev, Schahram Dustdar, Inter-host Orchestration Platform Architecture for Ultra-scale Cloud Applications, In IEEE Internet Computing, Institute of Electrical and Electronics Engineers (IEEE), pp. 1-1, 2021. [bib][url] [doi] [abstract] Abstract: Cloud data centers exploit many memory page management techniques that reduce the total memory utilization and access time. Mainly these techniques are applied to a hypervisor in a single host (intra-hypervisor) without the possibility to exploit the knowledge obtained by a group of hosts (clusters). We introduce a novel inter-hypervisor orchestration platform to provide intelligent memory page management for horizontal scaling. It will use the performance behavior of faster virtual machines to activate pre-fetching mechanisms that reduce the number of page faults. The overall platform consists of five modules - profiler, collector, classifier, predictor, and pre-fetcher. We developed and deployed a prototype of the platform, which comprises the first three modules. The evaluation shows that data collection is feasible in real-time, which means that if our approach is used on top of the existing memory page management techniques, it can significantly lower the miss rate that initiates page faults.
[936]	Bernhard Rinner, Christian Bettstetter, Hermann Hellwagner, Stephan Weiss, Multidrone Systems: More Than the Sum of the Parts, In Computer, Institute of Electrical and Electronics Engineers (IEEE), vol. 54, no. 5, pp. 34-43, 2021. [bib][url] [doi] [abstract] Abstract: Now that drones have evolved from bulky platforms to agile devices, a challenge is to combine multiple drones into an integrated autonomous system, offering functionality that individual drones cannot achieve. Such multidrone systems require connectivity, communication, and coordination. We discuss these building blocks along with case studies and lessons learned.
[935]	Anja Ressmann, Klaus Schoeffmann, IVOS - The ITEC Interactive Video Object Search System at VBS 2021, Chapter in MultiMedia Modeling, Springer International Publishing, no. 12573, pp. 479-483, 2021. [bib][url] [doi] [abstract] Abstract: We present IVOS, an interactive video content search system that allows for object-based search and filtering in video archives. The main idea is to use the results of recent object detection models to index all keyframes with a manageable set of object classes, and allow the user to filter by different characteristics, such as object name, object location, relative object size, object color, and combinations for different object classes – e.g., “large person in white on the left, with a red tie”. In addition to that, IVOS can also find segments with a specific number of objects of a particular class (e.g., “many apples” or “two people”) and supports similarity search, based on similar object occurrences.
[934]	Shajulin Benedict, Prateek Agrawal, Radu Prodan, Energy Consumption Analysis of R-Based Machine Learning Algorithms for Pandemic Predictions, Chapter in Communications in Computer and Information Science, Springer Singapore, vol. 1393, pp. 192-204, 2021. [bib][url] [doi] [abstract] Abstract: The push for agile pandemic analytic solutions has attained development-stage software modules of applications instead of functioning as full-fledged production-stage applications – i.e., performance, scalability, and energy-related concerns are not optimized for the underlying computing domains. And while the research continues to support the idea that reducing the energy consumption of algorithms improves the lifetime of battery-operated machines, advisable tools in almost any developer setting, an energy analysis report for R-based analytic programs is indeed a valuable suggestion. This article proposes an energy analysis framework for R-programs that enables data analytic developers, including pandemic-related application developers, to analyze the programs. It reveals an energy analysis report for R programs written to predict the new cases of 215 countries using random forest variants. Experiments were carried out at the IoT cloud research lab and the energy efficiency aspects were discussed in the article. In the experiments, ranger-based prediction program consumed 95.8 J.
[933]	Hannaneh Barahouei Pasandi, Tamer Nadeem, Hadi Amirpour, Christian Timmerer, A cross-layer approach for supporting real-time multi-user video streaming over WLANs, In Proceedings of the 27th Annual International Conference on Mobile Computing and Networking, ACM, pp. 849-851, 2021. [bib][url] [doi] [abstract] Abstract: MU-MIMO is a high-speed technique in IEEE 802.11ac and upcoming 802.11ax technologies that improves spectral efficiency by allowing concurrent communication between one Access Point and multiple users. In this paper, we present MuVIS, a novel framework that proposes MU-MIMO-aware optimization for multi-user multimedia applications over IEEE 802.11ac/ax. Taking a cross-layer approach, MuVIS first optimizes the MU-MIMO user group selection for the users with the same characteristics in the PHY/MAC layer. It then optimizes the video bitrate for each group accordingly. We present our design and its evaluation on smartphones and laptops over 802.11ac WiFi.
[932]	Hannaneh Barahouei Pasandi, Hadi Amirpour, Tamer Nadeem, Christian Timmerer, Learning-driven MU-MIMO Grouping for Multi-User Multimedia Applications Over Commodity WiFi, In Proceedings of the Workshop on Design, Deployment, and Evaluation of Network-assisted Video Streaming, ACM, pp. 15-21, 2021. [bib][url] [doi] [abstract] Abstract: MU-MIMO is a high-speed technique in IEEE 802.11ac and upcoming 802.11ax technologies that improves spectral efficiency by allowing concurrent communication between one Access Point and multiple users. In this paper, we present LATTE, a novel framework that proposes MU-MIMO-aware optimization for multi-user multimedia applications over IEEE 802.11ac/ax. Taking a cross-layer approach, LATTE first optimizes the MU-MIMO user group selection for the users with the same characteristics in the PHY/MAC layer. It then optimizes the video bitrate for each group accordingly. We present our design and its evaluation on smartphones and laptops over 802.11ac WiFi. Our experimental evaluations indicate that LATTE can outperform other video rate adaptation algorithms.
[931]	Minh Nguyen, Policy-driven Dynamic HTTP Adaptive Streaming Player Environment, In Proceedings of the 12th ACM Multimedia Systems Conference, ACM, pp. 408-412, 2021. [bib][url] [doi] [abstract] Abstract: Video streaming services account for the majority of today's traffic on the Internet. Although the data transmission rate has been increasing significantly, the growing number and variety of media and higher quality expectations of users have led networked media applications to fully or even over-utilize the available throughput. HTTP Adaptive Streaming (HAS) has become a predominant technique for multimedia delivery over the Internet today. However, there are critical challenges for multimedia systems, especially the tradeoff between the increasing content (complexity) and various requirements regarding time (latency) and quality (QoE). This thesis will cover the main aspects within the end user's environment, including video consumption and interactivity, collectively referred to as player environment, which is probably the most crucial component in today's multimedia applications and services. We will investigate the methods that can enable the specification of various policies reflecting the user's needs in given use cases. Besides, we will also work on schemes that allow efficient support for server-assisted, and network-assisted HAS systems. Finally, those approaches will be considered to combine into policies that fit the requirements of all use cases (e.g., live streaming, video on demand, etc.).
[930]	Zahra Najafabadi Samani, Nishant Saurabh, Radu Prodan, Multilayer Resource-aware Partitioning for Fog Application Placement, In 2021 IEEE 5th International Conference on Fog and Edge Computing (ICFEC), IEEE, pp. 9-18, 2021. Abstract: Fog computing emerged as a crucial platform for the deployment of IoT applications. The complexity of such applications requires methods that handle the resource diversity and network structure of Fog devices, while maximizing the service placement and reducing the resource wastage. Prior studies in this domain primarily focused on optimizing application-specific requirements and failed to address the network topology combined with the different types of resources encountered in Fog devices. To overcome these problems, we propose a multilayer resource-aware partitioning method to minimize the resource wastage and maximize the service placement and deadline satisfaction rates in a Fog infrastructure with high multi-user application placement requests. Our method represents the heterogeneous Fog resources as a multilayered network graph and partitions them based on network topology and resource features. Afterwards, it identifies the appropriate device partitions for placing an application according to its requirements, which need to overlap in the same network topology partition. Simulation results show that our multilayer resource-aware partitioning method is able to place twice as many services, satisfy deadlines for three times as many application requests, and reduce the resource wastage by up to 15–32 times compared to two availability-aware and resource-aware state-of-the-art methods.
[929]	Philipp Moll, Selina Isak, Hermann Hellwagner, Jeff Burke, A Quadtree-based synchronization protocol for inter-server game state synchronization, In Computer Networks, Elsevier BV, vol. 185, pp. 107723, 2021. Abstract: Online games are a fundamental part of the entertainment industry, but the current IP infrastructure does not satisfactorily fulfill the needs of these services. The novel networking architecture Named Data Networking (NDN) inherently supports network-level multicast and packet-level security and thereby introduces promising features for online games. In this paper, we propose an NDN-based approach to synchronize game state in a server cluster, a task necessary to allow multiple players in large numbers to play in the same game world. The proposed Quadtree Synchronization Protocol applies NDN’s data-centric nature to decouple the game world from the game servers hosting it. This means that requesting changes of a specific game world region becomes possible without knowing which game server is responsible for the requested region. We use a hierarchical game world structure when requesting data that allows the network to forward requests to the responsible game server without directly addressing it. This region-based naming scheme decouples world regions from servers, which eases the management of the game server cluster and allows easier recovery after server failures. In addition, this decoupling allows exchanging information about a geographical region, such as a game world, without knowledge of the other participants changing the world. Such a region-based synchronization mode is not possible to implement with existing protocols. However, it allows building distributed systems that do not require a central server to work. Besides architectural benefits, network emulations show that our protocol increases the efficiency of data transport by utilizing network-level multicast. Our proposed approach can keep up with current protocols that can be used for inter-server game state synchronization.
[928]	Vignesh V Menon, Hadi Amirpour, Mohammad Ghanbari, Christian Timmerer, Efficient Content-Adaptive Feature-Based Shot Detection for HTTP Adaptive Streaming, In 2021 IEEE International Conference on Image Processing (ICIP), IEEE, pp. 2174-2178, 2021. Abstract: Video delivery over the Internet has become a commodity in recent years, owing to the widespread use of Dynamic Adaptive Streaming over HTTP (DASH). The DASH specification defines a hierarchical data model for Media Presentation Descriptions (MPDs) in terms of segments. This paper focuses on segmenting video into multiple shots for encoding in Video on Demand (VoD) HTTP Adaptive Streaming (HAS) applications. Therefore, we propose a novel Discrete Cosine Transform (DCT) feature-based shot detection and successive elimination algorithm for shot detection and compare it against the default shot detection algorithm of the x265 implementation of the High Efficiency Video Coding (HEVC) standard. Our experimental results demonstrate that our proposed feature-based pre-processor has a recall rate 25% greater and an F-measure 20% greater than the benchmark algorithm for shot detection.