[24] | Negin Ghamsarian, Mario Taschwer, Doris Putzgruber-Adamitsch, Stephanie Sarny, Yosuf El-Shabrawi, Klaus Schöffmann, ReCal-Net: Joint Region-Channel-Wise Calibrated Network for Semantic Segmentation in Cataract Surgery Videos, Chapter in Neural Information Processing, Springer International Publishing, no. 13110, pp. 391-402, 2021.
[bib][url] [doi] [abstract]
Abstract: Semantic segmentation in surgical videos is a prerequisite for a broad range of applications towards improving surgical outcomes and surgical video analysis. However, semantic segmentation in surgical videos involves many challenges. In particular, in cataract surgery, various features of the relevant objects such as blunt edges, color and context variation, reflection, transparency, and motion blur pose a challenge for semantic segmentation. In this paper, we propose a novel convolutional module termed the ReCal module, which can calibrate the feature maps by employing region intra- and inter-dependencies and channel-region cross-dependencies. This calibration strategy can effectively enhance semantic representation by correlating different representations of the same semantic label, considering a multi-angle local view centering around each pixel. Thus, the proposed module can deal with distant visual characteristics of unique objects as well as cross-similarities in the visual characteristics of different objects. Moreover, we propose a novel network architecture based on the proposed module, termed ReCal-Net. Experimental results confirm the superiority of ReCal-Net compared to rival state-of-the-art approaches for all relevant objects in cataract surgery. Moreover, ablation studies reveal the effectiveness of the ReCal module in boosting semantic segmentation accuracy.
|
[23] | Negin Ghamsarian, Mario Taschwer, Doris Putzgruber-Adamitsch, Stephanie Sarny, Yosuf El-Shabrawi, Klaus Schoeffmann, LensID: A CNN-RNN-Based Framework Towards Lens Irregularity Detection in Cataract Surgery Videos, Chapter in Medical Image Computing and Computer Assisted Intervention (MICCAI 2021), Springer International Publishing, no. 12908, pp. 76-86, 2021.
[bib][url] [doi] [abstract]
Abstract: A critical complication after cataract surgery is the dislocation of the lens implant, leading to vision deterioration and eye trauma. In order to reduce the risk of this complication, it is vital to discover the risk factors during the surgery. However, studying the relationship between lens dislocation and its suspected risk factors using numerous videos is a time-consuming procedure. Hence, surgeons demand an automatic approach to enable a larger-scale and, accordingly, more reliable study. In this paper, we propose a novel framework as the major step towards lens irregularity detection. In particular, we propose (I) an end-to-end recurrent neural network to recognize the lens-implantation phase and (II) a novel semantic segmentation network to segment the lens and pupil after the implantation phase. The phase recognition results reveal the effectiveness of the proposed surgical phase recognition approach. Moreover, the segmentation results confirm the proposed segmentation network’s effectiveness compared to state-of-the-art rival approaches.
|
[22] | Negin Ghamsarian, Mario Taschwer, Doris Putzgruber-Adamitsch, Stephanie Sarny, Klaus Schoeffmann, Relevance Detection in Cataract Surgery Videos by Spatio-Temporal Action Localization, In 2020 25th International Conference on Pattern Recognition (ICPR), IEEE, pp. 10720-10727, 2021.
[bib][url] [doi] [abstract]
Abstract: In cataract surgery, the operation is performed with the help of a microscope. Since the microscope enables watching real-time surgery by up to two people only, a major part of surgical training is conducted using the recorded videos. To optimize the training procedure with the video content, the surgeons require an automatic relevance detection approach. In addition to relevance-based retrieval, these results can be further used for skill assessment and irregularity detection in cataract surgery videos. In this paper, a three-module framework is proposed to detect and classify the relevant phase segments in cataract videos. Taking advantage of an idle frame recognition network, the video is divided into idle and action segments. To boost the performance in relevance detection, the cornea, where the relevant surgical actions are conducted, is detected in all frames using Mask R-CNN. The spatiotemporally localized segments, containing higher-resolution information about the pupil texture and actions, and complementary temporal information from the same phase, are fed into the relevance detection module. This module consists of four parallel recurrent CNNs, each responsible for detecting one of the four relevant phases defined with medical experts. The results are then integrated to classify the action phases as irrelevant or as one of the four relevant phases. Experimental results reveal that the proposed approach outperforms static CNNs and different configurations of feature-based and end-to-end recurrent networks.
|
[21] | Reza Farahani, CDN and SDN Support and Player Interaction for HTTP Adaptive Video Streaming, In Proceedings of the 12th ACM Multimedia Systems Conference, ACM, pp. 398-402, 2021.
[bib][url] [doi] [abstract]
Abstract: Video streaming has become one of the most prevailing, bandwidth-hungry, and latency-sensitive Internet applications. HTTP Adaptive Streaming (HAS) has become the dominant video delivery mechanism over the Internet. Lack of coordination among the clients and lack of awareness of the network in pure client-based adaptive video bitrate approaches have caused problems, such as sub-optimal data throughput from Content Delivery Network (CDN) or origin servers, high CDN costs, and non-satisfactory users' experience. Recent studies have shown that network-assisted HAS techniques utilizing modern networking paradigms, e.g., Software Defined Networking (SDN), Network Function Virtualization (NFV), and edge computing, can significantly improve HAS system performance. In this doctoral study, we leverage the aforementioned modern networking paradigms and design network assistance for/by HAS clients to improve HAS system performance and CDN/network utilization. We present four fundamental research questions to target different challenges in devising a network-assisted HAS system.
|
[20] | Reza Farahani, Farzad Tashtarian, Hadi Amirpour, Christian Timmerer, Mohammad Ghanbari, Hermann Hellwagner, CSDN: CDN-Aware QoE Optimization in SDN-Assisted HTTP Adaptive Video Streaming, In 2021 IEEE 46th Conference on Local Computer Networks (LCN), IEEE, pp. 525-532, 2021.
[bib][url] [doi] [abstract]
Abstract: Recent studies have revealed that network-assisted techniques, by providing a comprehensive view of the network, improve HTTP Adaptive Streaming (HAS) system performance significantly. This paper leverages the capability of Software-Defined Networking, Network Function Virtualization, and edge computing to introduce a CDN-Aware QoE Optimization in SDN-Assisted Adaptive Video Streaming (CSDN) framework. We employ virtualized edge entities to collect various information items and run an optimization model with a new server/segment selection approach in a time-slotted fashion to serve the clients’ requests by selecting optimal cache servers. In case of a cache miss, a client’s request is served by an optimal replacement quality from a cache server, by a quality transcoded from an optimal replacement quality at the edge, or by the originally requested quality from the origin server. Comprehensive experiments conducted on a large-scale testbed demonstrate that CSDN outperforms other approaches in terms of the users’ QoE and network utilization.
|
[19] | Reza Farahani, Farzad Tashtarian, Alireza Erfanian, Christian Timmerer, Mohammad Ghanbari, Hermann Hellwagner, ES-HAS: an edge- and SDN-assisted framework for HTTP adaptive video streaming, In Proceedings of the 31st ACM Workshop on Network and Operating Systems Support for Digital Audio and Video, ACM, pp. 50-57, 2021.
[bib][url] [doi] [abstract]
Abstract: Recently, HTTP Adaptive Streaming (HAS) has become the dominant video delivery technology over the Internet. In HAS, clients have full control over the media streaming and adaptation processes. Lack of coordination among the clients and lack of awareness of the network conditions may lead to sub-optimal user experience and resource utilization in a pure client-based HAS adaptation scheme. Software Defined Networking (SDN) has recently been considered to enhance the video streaming process. In this paper, we leverage the capability of SDN and Network Function Virtualization (NFV) to introduce an edge- and SDN-assisted video streaming framework called ES-HAS. We employ virtualized edge components to collect HAS clients' requests and retrieve networking information in a time-slotted manner. These components then run an optimization model to efficiently serve clients' requests by selecting an optimal cache server (with the shortest fetch time). In case of a cache miss, a client's request is served (i) by an optimal replacement quality (only better quality levels with minimum deviation) from a cache server, or (ii) by the originally requested quality level from the origin server. This approach is validated through experiments on a large-scale testbed, and the performance of our framework is compared to pure client-based strategies and the SABR system [12]. Although SABR and ES-HAS show (almost) identical performance in the number of quality switches, ES-HAS outperforms SABR in terms of playback bitrate and the number of stalls by at least 70% and 40%, respectively.
|
[18] | Alireza Erfanian, Optimizing QoE and Latency of Live Video Streaming Using Edge Computing and In-Network Intelligence, In Proceedings of the 12th ACM Multimedia Systems Conference, ACM, pp. 373-377, 2021.
[bib][url] [doi] [abstract]
Abstract: Live video streaming traffic and related applications have experienced significant growth in recent years. More users have started generating and delivering live streams with high quality (e.g., 4K resolution) through popular online streaming platforms such as YouTube, Twitch, and Facebook. Typically, the video contents are generated by streamers and watched by many audiences, which are geographically distributed in various locations far away from the streamers' locations. The resource limitation in the network (e.g., bandwidth) is a challenging issue for network and video providers to meet the users' requested quality. In this thesis, we will investigate optimizing QoE and end-to-end (E2E) latency of live video streaming by leveraging edge computing capabilities and in-network intelligence. We present four main research questions aiming to address the various challenges in optimizing live streaming QoE and E2E latency by employing edge computing and in-network intelligence.
|
[17] | Alireza Erfanian, Hadi Amirpour, Farzad Tashtarian, Christian Timmerer, Hermann Hellwagner, LwTE-Live: Light-weight Transcoding at the Edge for Live Streaming, In Proceedings of the Workshop on Design, Deployment, and Evaluation of Network-assisted Video Streaming, ACM, pp. 22-28, 2021.
[bib][url] [doi] [abstract]
Abstract: Live video streaming is widely embraced in video services, and its applications have attracted much attention in recent years. The increased number of users demanding high quality (e.g., 4K resolution) live videos increases the bandwidth utilization in the backhaul network. To decrease bandwidth utilization in HTTP Adaptive Streaming (HAS), on-the-fly transcoding approaches deliver only the highest bitrate representation to the edge and generate the other representations by transcoding at the edge. However, this approach is inefficient due to the high transcoding cost. In this paper, we propose a light-weight transcoding at the edge method for live applications, LwTE-Live, to decrease the bandwidth utilization and the overall live streaming cost. During the encoding processes at the origin server, the optimal encoding decisions are saved as metadata, and the metadata replaces the corresponding representation in the bitrate ladder. The significantly reduced size of the metadata compared to its corresponding representation decreases the bandwidth utilization. The extracted metadata is then utilized at the edge to decrease the transcoding time. We formulate the problem as a Mixed-Binary Linear Programming (MBLP) model to optimize the live streaming cost, including the bandwidth and computation costs. We compare the proposed model with state-of-the-art approaches, and the experimental results show that our proposed method reduces the streaming cost and backhaul bandwidth utilization by up to 34% and 45%, respectively.
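The bandwidth/computation trade-off behind this optimization can be illustrated with a toy sketch. This is not the paper's MBLP model; the per-representation greedy decision, the cost prices, and all sizes and timings below are made-up assumptions:

```python
# Toy sketch (NOT the paper's MBLP): per representation, either fetch it
# over the backhaul or transcode it at the edge from the highest bitrate,
# whichever is cheaper. All numbers are illustrative.

def cheapest_plan(representations, bw_cost_per_mb, cpu_cost_per_s):
    """Pick fetch vs. edge-transcode per representation to minimize cost."""
    plan, total = {}, 0.0
    for name, (size_mb, transcode_s) in representations.items():
        fetch = size_mb * bw_cost_per_mb
        transcode = transcode_s * cpu_cost_per_s  # metadata shortens this time
        if transcode < fetch:
            plan[name], total = "transcode", total + transcode
        else:
            plan[name], total = "fetch", total + fetch
    return plan, total

reps = {"1080p": (8.0, float("inf")),  # source representation: always fetched
        "720p":  (4.0, 1.5),
        "480p":  (2.0, 0.8)}
plan, cost = cheapest_plan(reps, bw_cost_per_mb=1.0, cpu_cost_per_s=2.0)
```

Under these made-up prices, the lower representations come out cheaper to transcode at the edge than to fetch over the backhaul, which is the kind of saving LwTE-Live formalizes.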
|
[16] | Alireza Erfanian, Hadi Amirpour, Farzad Tashtarian, Christian Timmerer, Hermann Hellwagner, LwTE: Light-Weight Transcoding at the Edge, In IEEE Access, Institute of Electrical and Electronics Engineers (IEEE), vol. 9, pp. 112276-112289, 2021.
[bib][url] [doi] [abstract]
Abstract: Due to the growing demand for video streaming services, providers have to deal with increasing resource requirements for increasingly heterogeneous environments. To mitigate this problem, many works have been proposed which aim to (i) improve cloud/edge caching efficiency, (ii) use computation power available in the cloud/edge for on-the-fly transcoding, and (iii) optimize the trade-off among various cost parameters, e.g., storage, computation, and bandwidth. In this paper, we propose LwTE, a novel Light-weight Transcoding approach at the Edge, in the context of HTTP Adaptive Streaming (HAS). During the encoding process of a video segment at the origin side, computationally intense search processes are going on. The main idea of LwTE is to store the optimal results of these search processes as metadata for each video bitrate and reuse them at the edge servers to reduce the required time and computational resources for on-the-fly transcoding. LwTE enables us to store only the highest bitrate plus the corresponding metadata (of very small size) for unpopular video segments/bitrates; popular video segments/bitrates remain stored as usual. In this way, in addition to the significant reduction in bandwidth and storage consumption, the required time for on-the-fly transcoding of a requested segment is remarkably decreased by utilizing its corresponding metadata, as unnecessary search processes are avoided. We investigate our approach for Video-on-Demand (VoD) streaming services by optimizing storage and computation (transcoding) costs at the edge servers and then compare it to conventional methods (store all bitrates, partial transcoding). The results indicate that our approach reduces the transcoding time by at least 80% and decreases the aforementioned costs by 12% to 70% compared to the state-of-the-art approaches.
|
[15] | Alireza Erfanian, Farzad Tashtarian, Anatoliy Zabrovskiy, Christian Timmerer, Hermann Hellwagner, OSCAR: On Optimizing Resource Utilization in Live Video Streaming, In IEEE Transactions on Network and Service Management, Institute of Electrical and Electronics Engineers (IEEE), vol. 18, no. 1, pp. 552-569, 2021.
[bib][url] [doi] [abstract]
Abstract: Live video streaming traffic and related applications have experienced significant growth in recent years. However, this has been accompanied by some challenging issues, especially in terms of resource utilization. Although IP multicasting can be recognized as an efficient mechanism to cope with these challenges, it suffers from many problems. Applying software-defined networking (SDN) and network function virtualization (NFV) technologies enable researchers to cope with IP multicasting issues in novel ways. In this article, by leveraging the SDN concept, we introduce OSCAR (Optimizing reSourCe utilizAtion in live video stReaming) as a new cost-aware video streaming approach to provide advanced video coding (AVC)-based live streaming services in the network. In this article, we use two types of virtualized network functions (VNFs): virtual reverse proxy (VRP) and virtual transcoder function (VTF). At the edge of the network, VRPs are responsible for collecting clients’ requests and sending them to an SDN controller. Then, by executing a mixed-integer linear program (MILP), the SDN controller determines a group of optimal multicast trees for streaming the requested videos from an appropriate origin server to the VRPs. Moreover, to elevate the efficiency of resource allocation and meet the given end-to-end latency threshold, OSCAR delivers only the highest requested quality from the origin server to an optimal group of VTFs over a multicast tree. The selected VTFs then transcode the received video segments and transmit them to the requesting VRPs in a multicast fashion. To mitigate the time complexity of the proposed MILP model, we present a simple and efficient heuristic algorithm that determines a near-optimal solution in polynomial time. Using the MiniNet emulator, we evaluate the performance of OSCAR in various scenarios. 
The results show that OSCAR surpasses other SVC- and AVC-based multicast and unicast approaches in terms of cost and resource utilization.
|
[14] | Wilfried Elmenreich, Mathias Lux, Analyzing Usage Patterns in Online Games, Chapter in A Ludic Society, Donau-Universität Krems, pp. 347-359, 2021.
[bib][url] [abstract]
Abstract: A typical life cycle of an online game is reflected in its usage patterns. A game first builds a user base, then reaches an absolute peak, and is eventually played by only a minimum number of dedicated fans at the end of its life. Apart from this development, extraordinary internal and external events can be observed as changes in game usage, especially in multiplayer and massively multiplayer games. The COVID-19 pandemic has impacted the usage of video games just as it has impacted the game business itself. However, research lacks data to investigate these relations further, as usage statistics of games are rarely accessible to researchers. In this paper, we relate usage statistics to the viewership and popularity of a game using available data sources such as online statistics or activity on Twitch.tv. In a first study, data from the massively multiplayer online role-playing game (MMORPG) Eternal Lands is analyzed. Eternal Lands is a free multiplayer online game created back in 2002. The usage patterns show day/night cycles of players in the prime time of the time zones where most players are located and increased playing activity on weekends. A general trend over time shows a slowly diminishing user base over the years since the game's introduction. In April 2020, a significant rise in user activity can be observed, attributed to lockdowns in many countries due to the COVID-19 pandemic. This rise can be attributed to regular players investing more time in the game during the lockdown and to new or recurring players who had not played the game intensively before and were looking for a distraction. In a second study, we focus on complementary viewer statistics on the popular game streaming platform Twitch.tv. We can observe that the COVID-19 pandemic impacted the playing time, as mentioned earlier. We relate usage data to viewership and streaming statistics of popular games. Using the example of Eternal Lands, a game that never went viral, we discuss the possibility of approximating a game's popularity through game streaming and viewership.
|
[13] | Ekrem Cetinkaya, Machine Learning Based Video Coding Enhancements for HTTP Adaptive Streaming, In Proceedings of the 12th ACM Multimedia Systems Conference, ACM, pp. 418-422, 2021.
[bib][url] [doi] [abstract]
Abstract: Video traffic comprises the majority of today's Internet traffic, and HTTP Adaptive Streaming (HAS) is the preferred method to deliver video content over the Internet. Increasing demand for video and the improvements in the video display conditions over the years caused an increase in the video coding complexity. This increased complexity brought the need for more efficient video streaming and coding solutions. The latest standard video codecs can reduce the size of the videos by using more efficient tools with higher time-complexities. The plans for integrating machine learning into upcoming video codecs raised the interest in applied machine learning for video coding. In this doctoral study, we aim to propose applied machine learning methods to video coding, focusing on HTTP adaptive streaming. We present four primary research questions to target different challenges in video coding for HTTP adaptive streaming.
|
[12] | Ekrem Cetinkaya, Hadi Amirpour, Christian Timmerer, Mohammad Ghanbari, Fast Multi-Resolution and Multi-Rate Encoding for HTTP Adaptive Streaming Using Machine Learning, In IEEE Open Journal of Signal Processing, Institute of Electrical and Electronics Engineers (IEEE), pp. 1-12, 2021.
[bib][url] [doi] [abstract]
Abstract: Video streaming applications keep getting more attention over the years, and HTTP Adaptive Streaming (HAS) became the de-facto solution for video delivery over the Internet. In HAS, each video is encoded at multiple quality levels and resolutions (i.e., representations) to enable adaptation of the streaming session to viewing and network conditions of the client. This requirement brings encoding challenges along with it, e.g., a video source should be encoded efficiently at multiple bitrates and resolutions. Fast multi-rate encoding approaches aim to address this challenge of encoding multiple representations from a single video by re-using information from already encoded representations. In this paper, a convolutional neural network is used to speed up both multi-rate and multi-resolution encoding for HAS. For multi-rate encoding, the lowest bitrate representation is chosen as the reference. For multi-resolution encoding, the highest bitrate from the lowest resolution representation is chosen as the reference. Pixel values from the target resolution and encoding information from the reference representation are used to predict Coding Tree Unit (CTU) split decisions in High-Efficiency Video Coding (HEVC) for dependent representations. Experimental results show that the proposed method for multi-rate encoding can reduce the overall encoding time by 15.08 % and parallel encoding time by 41.26 %, with a 0.89 % bitrate increase compared to the HEVC reference software. Simultaneously, the proposed method for multi-resolution encoding can reduce the encoding time by 46.27 % for the overall encoding and 27.71 % for the parallel encoding on average with a 2.05 % bitrate increase.
|
[11] | Ekrem Cetinkaya, Hadi Amirpour, Mohammad Ghanbari, Christian Timmerer, CTU depth decision algorithms for HEVC: A survey, In Signal Processing: Image Communication, Elsevier BV, vol. 99, pp. 116442, 2021.
[bib][url] [doi] [abstract]
Abstract: High Efficiency Video Coding (HEVC) surpasses its predecessors in encoding efficiency by introducing new coding tools at the cost of an increased encoding time-complexity. The Coding Tree Unit (CTU) is the main building block used in HEVC. In the HEVC standard, frames are divided into CTUs with the predetermined size of up to 64 × 64 pixels. Each CTU is then divided recursively into a number of equally sized square areas, known as Coding Units (CUs). Although this diversity of frame partitioning increases encoding efficiency, it also causes an increase in the time complexity due to the increased number of ways to find the optimal partitioning. To address this complexity, numerous algorithms have been proposed to eliminate unnecessary searches during partitioning CTUs by exploiting the correlation in the video. In this paper, existing CTU depth decision algorithms for HEVC are surveyed. These algorithms are categorized into two groups, namely statistics and machine learning approaches. Statistics approaches are further subdivided into neighboring and inherent approaches. Neighboring approaches exploit the similarity between adjacent CTUs to limit the depth range of the current CTU, while inherent approaches use only the available information within the current CTU. Machine learning approaches try to extract and exploit similarities implicitly. Traditional methods like support vector machines or random forests use manually selected features, while recently proposed deep learning methods extract features during training. Finally, this paper discusses extending these methods to more recent video coding formats such as Versatile Video Coding (VVC) and AOMedia Video 1 (AV1).
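As a rough illustration of the search that such depth decision algorithms prune, the following sketch models the recursive CTU split decision with a toy rate-distortion cost. The cost function, the depth cap, and all names are hypothetical, not taken from any surveyed method:

```python
# Illustrative sketch (not from the survey): how limiting the depth range
# prunes the recursive quad-tree search in an HEVC-style encoder.

def rd_cost(block_size, depth):
    """Toy stand-in for the rate-distortion cost of coding a CU whole."""
    return block_size * block_size / (depth + 1)

def best_partition(x, y, size, depth, max_depth=3, predicted_max=None):
    """Choose between coding a CU as-is or splitting it into four sub-CUs.

    `predicted_max` models a depth decision algorithm: when a statistics-
    or ML-based predictor caps the depth range, whole sub-trees of the
    search are skipped.
    """
    whole = rd_cost(size, depth)
    limit = max_depth if predicted_max is None else min(max_depth, predicted_max)
    if depth >= limit or size <= 8:
        return whole, [(x, y, size)]
    half = size // 2
    split_cost, split_cus = 0.0, []
    for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
        cost, cus = best_partition(x + dx, y + dy, half, depth + 1,
                                   max_depth, predicted_max)
        split_cost += cost
        split_cus += cus
    if split_cost < whole:
        return split_cost, split_cus
    return whole, [(x, y, size)]

# Full search of a 64x64 CTU vs. a predictor that caps the depth at 1:
full_cost, full_cus = best_partition(0, 0, 64, 0)
capped_cost, capped_cus = best_partition(0, 0, 64, 0, predicted_max=1)
```

With this toy cost, the full search ends at 64 CUs of size 8 × 8 while the capped search stops at four 32 × 32 CUs; the point is only that capping the depth range removes entire branches of the recursion, which is where the surveyed algorithms save encoding time.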
|
[10] | Michal Barcis, Hermann Hellwagner, Information Distribution in Multi-Robot Systems: Adapting to Varying Communication Conditions, In 2021 Wireless Days (WD), IEEE, pp. 1-8, 2021.
[bib][url] [doi] [abstract]
Abstract: This work addresses the problem of application-layer congestion control in multi-robot systems (MRS). It is motivated by the fact that many MRS constrain the amount of transmitted data in order to avoid congestion in the network and ensure that critical messages get delivered. However, such constraints often need to be manually tuned and assume constant network capabilities. We introduce the adaptive goodput constraint, which smoothly adapts to varying communication conditions. It is suitable for long-term communication planning, where rapid changes are undesirable. We analyze the introduced method in a simulation-based study and show its practical applicability using mobile robots.
|
[9] | Michal Barcis, Agata Barcis, Nikolaos Tsiogkas, Hermann Hellwagner, Information Distribution in Multi-Robot Systems: Generic, Utility-Aware Optimization Middleware, In Frontiers in Robotics and AI, Frontiers Media (SA), vol. 8, pp. 1-11, 2021.
[bib][url] [doi] [abstract]
Abstract: This work addresses the problem of what information is worth sending in a multi-robot system under generic constraints, e.g., limited throughput or energy. Our decision method is based on Monte Carlo Tree Search. It is designed as a transparent middleware that can be integrated into existing systems to optimize communication among robots. Furthermore, we introduce techniques to reduce the decision space of this problem to further improve the performance. We evaluate our approach using a simulation study and demonstrate its feasibility in a real-world environment by realizing a proof of concept in ROS 2 on mobile robots.
|
[8] | Hadi Amirpour, Ekrem Cetinkaya, Christian Timmerer, Mohammad Ghanbari, Towards Optimal Multirate Encoding for HTTP Adaptive Streaming, Chapter in Proceedings of the 27th International Conference on Multimedia Modeling (MMM 2021), Springer International Publishing, no. 12572, pp. 469-480, 2021.
[bib][url] [doi] [abstract]
Abstract: HTTP Adaptive Streaming (HAS) enables high quality streaming of video contents. In HAS, videos are divided into short intervals called segments, and each segment is encoded at various qualities/bitrates to adapt to the available bandwidth. Multiple encodings of the same content impose a high cost on video content providers. To reduce the time-complexity of encoding multiple representations, state-of-the-art methods typically encode the highest quality representation first and reuse the information gathered during its encoding to accelerate the encoding of the remaining representations. As encoding the highest quality representation requires the highest time-complexity compared to the lower quality representations, it would be a bottleneck in parallel encoding scenarios and the overall time-complexity will be limited to the time-complexity of the highest quality representation. In this paper, to address this problem, we consider all representations from the highest to the lowest quality representation as a potential, single reference to accelerate the encoding of the other, dependent representations. We formulate a set of encoding modes and assess their performance in terms of BD-Rate and time-complexity, using both VMAF and PSNR as objective metrics. Experimental results show that encoding a middle quality representation as a reference can significantly reduce the maximum encoding complexity and hence it is an efficient way of encoding multiple representations in parallel. Based on this fact, a fast multirate encoding method is proposed which utilizes the depth and prediction mode of a middle quality representation to accelerate the encoding of the dependent representations.
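The parallel-encoding bottleneck argued here can be illustrated with a toy calculation. This is a hypothetical sketch: the timings, the uniform speedup factor, and the function below are made-up, not the paper's model or measurements:

```python
# Hypothetical sketch of reference-based parallel multirate encoding:
# the reference representation is encoded first, then all dependent
# representations are encoded in parallel at a reduced (accelerated) cost.
# Timings and the uniform speedup factor are made-up numbers.

def parallel_time(ref, standalone, speedup=0.5):
    """Wall-clock time: encode `ref` first, then the rest in parallel,
    each dependent encode taking `speedup` of its standalone time."""
    dependents = [t * speedup for q, t in standalone.items() if q != ref]
    return standalone[ref] + max(dependents)

# Standalone (independent) encoding times per quality level, in seconds:
standalone = {"low": 10.0, "mid": 25.0, "high": 60.0}

t_high_ref = parallel_time("high", standalone)  # highest quality as reference
t_mid_ref = parallel_time("mid", standalone)    # middle quality as reference
```

With these made-up numbers, choosing the middle quality as the reference yields a shorter parallel wall-clock time than referencing the highest quality, which is the effect the paper exploits.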
|
[7] | Hadi Amirpourazarian, Christian Timmerer, Mohammad Ghanbari, SLFC: Scalable Light Field Coding, In 2021 Data Compression Conference (DCC), IEEE, pp. 43-52, 2021.
[bib][url] [doi] [abstract]
Abstract: Light field imaging enables some post-processing capabilities like refocusing, changing view perspective, and depth estimation. As light field images are represented by multiple views they contain a huge amount of data that makes compression inevitable. Although there are some proposals to efficiently compress light field images, their main focus is on encoding efficiency. However, some important functionalities such as viewpoint and quality scalabilities, random access, and uniform quality distribution have not been addressed adequately. In this paper, an efficient light field image compression method based on a deep neural network is proposed, which classifies multiple views into various layers. In each layer, the target view is synthesized from the available views of previously encoded/decoded layers using a deep neural network. This synthesized view is then used as a virtual reference for the target view inter-coding. In this way, random access to an arbitrary view is provided. Moreover, uniform quality distribution among multiple views is addressed. In higher bitrates where random access to an arbitrary view is more crucial, the required bitrate to access the requested view is minimized.
|
[6] | Hadi Amirpourazarian, Christian Timmerer, Mohammad Ghanbari, PSTR: Per-Title Encoding Using Spatio-Temporal Resolutions, In 2021 IEEE International Conference on Multimedia and Expo (ICME), IEEE, pp. 1-6, 2021.
[bib][url] [doi] [abstract]
Abstract: Current per-title encoding schemes encode the same video content (or snippets/subsets thereof) at various bitrates and spatial resolutions to find an optimal bitrate ladder for each video content. Compared to traditional approaches, in which a predefined, content-agnostic ("fit-to-all") encoding ladder is applied to all video contents, per-title encoding can result in (i) a significant decrease of storage and delivery costs and (ii) an increase in the Quality of Experience (QoE). In the current per-title encoding schemes, the bitrate ladder is optimized using only spatial resolutions, while we argue that with the emergence of high framerate videos, this principle can be extended to temporal resolutions as well. In this paper, we improve the per-title encoding for each content using spatio-temporal resolutions. Experimental results show that our proposed approach doubles the performance of bitrate saving by considering both temporal and spatial resolutions compared to considering only spatial resolutions.
|
[5] | Hadi Amirpour, Raimund Schatz, Christian Timmerer, Mohammad Ghanbari, On the Impact of Viewing Distance on Perceived Video Quality, In 2021 International Conference on Visual Communications and Image Processing (VCIP), IEEE, pp. 1-5, 2021.
[bib][url] [doi] [abstract]
Abstract: Due to the growing importance of optimizing the quality and efficiency of video streaming delivery, accurate assessment of user-perceived video quality becomes increasingly important. However, due to the wide range of viewing distances encountered in real-world viewing settings, the perceived video quality can vary significantly in everyday viewing situations. In this paper, we investigate and quantify the influence of viewing distance on perceived video quality. A subjective experiment was conducted with full HD sequences at three different fixed viewing distances, with each video sequence being encoded at three different quality levels. Our study results confirm that the viewing distance has a significant influence on the quality assessment. In particular, they show that an increased viewing distance generally leads to increased perceived video quality, especially at low media encoding quality levels. In this context, we also provide an estimation of potential bitrate savings that knowledge of actual viewing distance would enable in practice. Since current objective video quality metrics do not systematically take into account viewing distance, we also analyze and quantify the influence of viewing distance on the correlation between objective and subjective metrics. Our results confirm the need for distance-aware objective metrics when the accurate prediction of perceived video quality in real-world environments is required.
|
[4] | Hadi Amirpour, Hannaneh Barahouei Pasandi, Christian Timmerer, Mohammad Ghanbari, Improving Per-title Encoding for HTTP Adaptive Streaming by Utilizing Video Super-resolution, In 2021 International Conference on Visual Communications and Image Processing (VCIP), IEEE, pp. 1-5, 2021.
[bib][url] [doi] [abstract]
Abstract: In per-title encoding, to optimize a bitrate ladder over spatial resolution, each video segment is downscaled to a set of spatial resolutions and encoded at a given set of bitrates. To find the highest-quality resolution for each bitrate, the low-resolution encoded videos are upscaled to the original resolution, and a convex hull is formed based on the scaled qualities. Deep learning-based video super-resolution (VSR) approaches show a significant gain over traditional upscaling approaches, and they are becoming more and more efficient over time. This paper improves per-title encoding over traditional upscaling methods by using deep neural network-based VSR algorithms. Utilizing a VSR algorithm to improve the quality of low-resolution encodings can improve the convex hull and, as a result, lead to an improved bitrate ladder. To avoid bandwidth wastage at perceptually lossless bitrates, a maximum quality threshold is set, and encodings beyond it are eliminated from the bitrate ladder. Similarly, a minimum threshold is set to avoid low-quality video delivery. The encodings between the maximum and minimum thresholds are selected based on one Just Noticeable Difference. Our experimental results show that the proposed per-title encoding yields a 24% bitrate reduction and a 53% storage reduction compared to the state-of-the-art method.
|
[3] | Jesus Aguilar-Armijo, Multi-access Edge Computing for Adaptive Bitrate Video Streaming, In Proceedings of the 12th ACM Multimedia Systems Conference, ACM, pp. 378-382, 2021.
[bib][url] [doi] [abstract]
Abstract: Video streaming is the most used service in mobile networks, and its usage will continue to grow in the upcoming years. Due to this growth, content delivery, a key aspect of the video streaming service, must be improved to support higher bandwidth demand while assuring a high quality of experience (QoE) for all users. Multi-access edge computing (MEC) is an emerging paradigm that brings computational power and storage closer to the user. It is seen in the industry as a key technology for 5G mobile networks, with the goals of reducing latency, ensuring highly efficient network operation, improving service delivery, and offering an improved user experience, among others. In this doctoral study, we aim to leverage the possibilities of MEC to improve the content delivery of video streaming services. We present four main research questions that target the different challenges in content delivery for HTTP Adaptive Streaming.
|
[2] | Jesus Aguilar-Armijo, Christian Timmerer, Hermann Hellwagner, EADAS: Edge Assisted Adaptation Scheme for HTTP Adaptive Streaming, In 2021 IEEE 46th Conference on Local Computer Networks (LCN), IEEE, pp. 487-494, 2021.
[bib][url] [doi] [abstract]
Abstract: Mobile networks equipped with edge computing nodes enable access to information that can be leveraged to assist client-based adaptive bitrate (ABR) algorithms in making better adaptation decisions to improve both Quality of Experience (QoE) and fairness. For this purpose, we propose EADAS (Edge Assisted Adaptation Scheme for HTTP Adaptive Streaming), a novel mechanism located at the edge node that assists and improves ABR decisions on the fly. EADAS comprises (i) an edge ABR algorithm to improve QoE and fairness for clients and (ii) a segment prefetching scheme. The results show a QoE increase of 4.6%, 23.5%, and 24.4% and a fairness increase of 11%, 3.4%, and 5.8% when using a buffer-based, a throughput-based, and a hybrid ABR algorithm, respectively, at the client compared with client-based algorithms without EADAS. Moreover, QoE and fairness among clients can be prioritized using parameters of the EADAS algorithm according to service providers' requirements.
|
[1] | Fatima Abdullah, Dragi Kimovski, Radu Prodan, Kashif Munir, Handover authentication latency reduction using mobile edge computing and mobility patterns, In Computing, Springer Science and Business Media LLC, pp. 1-20, 2021.
[bib][url] [doi] [abstract]
Abstract: With the advancement in technology and the exponential growth of mobile devices, network traffic has increased manifold in cellular networks. For this reason, latency reduction has become a challenging issue for mobile devices. In order to achieve seamless connectivity and minimal disruption during movement, latency reduction is crucial in the handover authentication process. Handover authentication is a process in which the legitimacy of a mobile node is checked when it crosses the boundary of an access network. This paper proposes an efficient technique that utilizes the mobility patterns of the mobile node and a mobile edge computing framework to reduce handover authentication latency. The key idea of the proposed technique is to categorize mobile nodes on the basis of their mobility patterns. We perform simulations to measure the networking latency. Besides, we use a queuing model to measure the processing time of an authentication query at an edge server. The results show that the proposed approach reduces the handover authentication latency by up to 54% in comparison with the existing approach.
|