Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

SVD: Spatial Video Dataset

Izadimehr, MohammadHossein, Ghanbari, Milad, Chen, Guodong, Zhou, Wei, Hao, Xiaoshuai, Dasari, Mallesham, Timmerer, Christian and Amirpour, Hadi 2025. SVD: Spatial Video Dataset. Presented at: MM '25: The 33rd ACM International Conference on Multimedia, Dublin, Ireland, 31 October 2025. IXR '25: Proceedings of the 3rd International Workshop on Interactive eXtended Reality. Dublin: ACM, pp. 12988-12994. 10.1145/3746027.3758246

PDF - Published Version
Available under License Creative Commons Attribution Non-commercial.

Abstract

Stereoscopic video has long been the subject of research due to its ability to deliver immersive three-dimensional content to a wide range of applications. The dual-view format inherently provides binocular disparity cues that enhance depth perception and realism, making it indispensable for fields such as telepresence, 3D mapping, and robotic vision. Until recently, however, end-to-end pipelines for capturing, encoding, and viewing high-quality stereoscopic video were neither widely accessible nor optimized for consumer-grade devices. Today's smartphones, such as the iPhone Pro, and modern Head-Mounted Displays (HMDs), such as the Apple Vision Pro, offer built-in support for stereoscopic video capture and hardware-accelerated encoding, along with seamless playback on devices like the Apple Vision Pro and Meta Quest 3, all with minimal user intervention. Apple refers to this streamlined workflow as spatial video. Making the full stereoscopic video pipeline available to everyone has enabled new applications. Despite these advances, there remains a notable absence of publicly available datasets that cover the complete spatial video pipeline on consumer platforms, which hinders reproducibility and comparative evaluation of emerging algorithms. In this paper, we introduce SVD, a spatial video dataset comprising 300 five-second video sequences: 150 captured with an iPhone Pro and 150 with an Apple Vision Pro. Additionally, 10 longer videos with durations ranging from 2 min 29 s to 5 min have been recorded. The SVD dataset is publicly released to facilitate research in codec performance evaluation, subjective and objective Quality of Experience assessment, depth-based computer vision, stereoscopic video streaming, and other emerging 3D applications such as neural rendering and volumetric capture. Link to the dataset: https://cd-athena.github.io/SVD/

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Schools > Computer Science & Informatics
Publisher: ACM
ISBN: 979-8-4007-2051-2
Date of First Compliant Deposit: 6 November 2025
Last Modified: 07 Nov 2025 07:15
URI: https://orca.cardiff.ac.uk/id/eprint/182179