Viewport-Driven Adaptive 360° Live Streaming Optimization Framework

<p>Shuai Peng<sup>1</sup>, Jialu Hu<sup>1</sup>, Han Xiao<sup>1</sup>, Shujie Yang<sup>1,*</sup>, Changqiao Xu<sup>1</sup></p>

doi:10.33969/J-NaNA.2021.010401

Corresponding Author:

Shujie Yang

Affiliation(s):

¹ State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China

*Corresponding author

Abstract:

Virtual reality (VR) video streaming and 360° panoramic video have received extensive attention in recent years because they can bring users an immersive experience. However, the ultra-high bandwidth and ultra-low latency requirements of VR video and 360° panoramic video put tremendous pressure on the carrying capacity of current networks. Because the user’s field of view (a.k.a. the viewport) is limited when watching a panoramic video, and users see only about 20% to 30% of the video content, it is not necessary to transmit all content to the user at high resolution. Therefore, predicting the user’s future viewport can be crucial for selective streaming and subsequent bitrate decisions. Combined with a tile-based adaptive bitrate (ABR) algorithm for panoramic video, video content within the user’s viewport can be transmitted at a higher resolution, while areas outside the viewport can be transmitted at a lower resolution. This paper proposes a viewport-driven adaptive 360° live streaming optimization framework, which combines viewport prediction and an ABR algorithm to optimize the transmission of live 360° panoramic video. However, existing viewport prediction methods often suffer from low prediction accuracy and do not support real-time operation. Drawing on the strengths of convolutional neural networks (CNNs) in image processing and long short-term memory (LSTM) networks in time-series processing, we propose an online-updated viewport prediction model called LiveCL, which uses a CNN to extract the spatial characteristics of video frames and an LSTM to learn the temporal characteristics of the user’s viewport trajectories. With the help of viewport prediction and the ABR algorithm, unnecessary bandwidth consumption can be effectively reduced.
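The viewport-driven tile selection described above can be illustrated with a minimal sketch. It assumes an equirectangular frame split into a 4×8 tile grid, a rectangular approximation of the viewport (100° wide, 90° high) that ignores projection distortion near the poles, and only two bitrate levels. The function names (`viewport_tiles`, `allocate_bitrates`, `bandwidth_saving`), the grid size, and the bitrates are hypothetical choices for illustration and are not taken from the paper's ABR algorithm.

```python
from typing import List, Set, Tuple

Tile = Tuple[int, int]  # (row, col) in the tile grid


def viewport_tiles(yaw_deg: float, pitch_deg: float,
                   fov_h: float = 100.0, fov_v: float = 90.0,
                   rows: int = 4, cols: int = 8) -> Set[Tile]:
    """Return the tiles that overlap a rectangular viewport approximation.

    Yaw is in [-180, 180] and wraps around; pitch is in [-90, 90].
    All defaults are illustrative assumptions.
    """
    tile_w = 360.0 / cols
    tile_h = 180.0 / rows
    view_top = pitch_deg + fov_v / 2
    view_bottom = pitch_deg - fov_v / 2
    tiles: Set[Tile] = set()
    for r in range(rows):
        # Row 0 is the top of the frame (pitch +90).
        top = 90.0 - r * tile_h
        bottom = top - tile_h
        if bottom >= view_top or top <= view_bottom:
            continue  # no vertical overlap with the viewport
        for c in range(cols):
            center = -180.0 + (c + 0.5) * tile_w
            # Shortest angular distance, handling the wrap at +/-180 degrees.
            dist = abs((center - yaw_deg + 180.0) % 360.0 - 180.0)
            if dist < fov_h / 2 + tile_w / 2:
                tiles.add((r, c))
    return tiles


def allocate_bitrates(in_view: Set[Tile], rows: int, cols: int,
                      high_kbps: int, low_kbps: int) -> List[List[int]]:
    """Assign the high bitrate to viewport tiles and the low bitrate elsewhere."""
    return [[high_kbps if (r, c) in in_view else low_kbps
             for c in range(cols)]
            for r in range(rows)]


def bandwidth_saving(allocation: List[List[int]], high_kbps: int) -> float:
    """Fraction of bandwidth saved versus sending every tile at high_kbps."""
    used = sum(sum(row) for row in allocation)
    full = high_kbps * sum(len(row) for row in allocation)
    return 1.0 - used / full
```

With these illustrative settings, a viewport centered at yaw 0° and pitch 0° covers 8 of the 32 tiles (25%). The resulting saving depends entirely on the chosen bitrate levels and grid, so it should not be read as reproducing the paper's reported figure.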
The main contributions of this work are: (1) a framework for 360° video transmission; (2) an online real-time viewport prediction model called LiveCL, which is combined with a novel ABR algorithm to optimize 360° video transmission and outperforms the existing model. On a public 360° video dataset, the tile accuracy, recall, precision, and frame accuracy of LiveCL are better than those of the latest model. Combined with related adaptive bitrate algorithms, the proposed viewport prediction model can reduce the transmission bandwidth by about 50%.
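To make the tile-level metrics named above concrete, the following sketch computes accuracy, precision, and recall by treating each tile as a binary "in viewport" or "not in viewport" label. This uses the standard classification definitions, which is an assumption: the abstract does not state the paper's exact formulas, and frame accuracy is omitted because its definition is not given here. The function name `tile_metrics` is hypothetical.

```python
from typing import Dict, Set, Tuple

Tile = Tuple[int, int]  # (row, col) in the tile grid


def tile_metrics(predicted: Set[Tile], actual: Set[Tile],
                 rows: int, cols: int) -> Dict[str, float]:
    """Per-frame tile accuracy, precision, and recall.

    Assumes standard binary-classification definitions over all tiles;
    the paper's own definitions may differ.
    """
    total = rows * cols
    tp = len(predicted & actual)   # predicted and actually viewed
    fp = len(predicted - actual)   # predicted but not viewed
    fn = len(actual - predicted)   # viewed but not predicted
    tn = total - tp - fp - fn      # neither predicted nor viewed
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if predicted else 0.0,
        "recall": tp / (tp + fn) if actual else 0.0,
    }
```

In a streaming setting, recall is arguably the more critical of the three, since a missed viewport tile is delivered at low quality, whereas a false positive only costs extra bandwidth.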

Keywords:

360° video streaming, video streaming framework, viewport prediction, adaptive bitrate streaming, virtual reality

Cite This Paper:

Shuai Peng, Jialu Hu, Han Xiao, Shujie Yang, Changqiao Xu (2021). Viewport-Driven Adaptive 360° Live Streaming Optimization Framework. Journal of Networking and Network Applications, Volume 1, Issue 4, pp. 139–149. https://doi.org/10.33969/J-NaNA.2021.010401.

