Izadimehr, MohammadHossein, Ghanbari, Milad, Chen, Guodong, Zhou, Wei, Hao, Xiaoshuai, Dasari, Mallesham, Timmerer, Christian and Amirpour, Hadi
2025.
SVD: Spatial Video Dataset.
Presented at: MM '25: The 33rd ACM International Conference on Multimedia,
Dublin, Ireland,
31 October 2025.
IXR '25: Proceedings of the 3rd International Workshop on Interactive eXtended Reality.
Dublin:
ACM,
pp. 12988-12994.
10.1145/3746027.3758246
PDF - Published Version. Available under License Creative Commons Attribution Non-commercial.
Abstract
Stereoscopic video has long been the subject of research due to its ability to deliver immersive three-dimensional content to a wide range of applications. The dual-view format inherently provides binocular disparity cues that enhance depth perception and realism, making it indispensable for fields such as telepresence, 3D mapping, and robotic vision. Until recently, however, end-to-end pipelines for capturing, encoding, and viewing high-quality stereoscopic video were neither widely accessible nor optimized for consumer-grade devices. Today's smartphones, such as the iPhone Pro, and modern Head-Mounted Displays (HMDs) like the Apple Vision Pro, offer built-in support for stereoscopic video capture, hardware-accelerated encoding, and seamless playback on devices like the Apple Vision Pro and Meta Quest 3, which require minimal user intervention. Apple refers to this streamlined workflow as spatial video. Making the full stereoscopic video process available to everyone has made new applications possible. Despite these advances, there remains a notable absence of publicly available datasets that include the complete spatial video pipeline on consumer platforms, hindering reproducibility and comparative evaluation of emerging algorithms. In this paper, we introduce SVD, a spatial video dataset comprising 300 five-second video sequences, i.e., 150 captured using an iPhone Pro and 150 with an Apple Vision Pro. Additionally, 10 longer videos with durations ranging from 2 min 29 s to 5 min have been recorded. The SVD dataset is publicly released to facilitate research in codec performance evaluation, subjective and objective Quality of Experience assessment, depth-based computer vision, stereoscopic video streaming, and other emerging 3D applications such as neural rendering and volumetric capture. Link to the dataset: https://cd-athena.github.io/SVD/.
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Date Type: | Publication |
| Status: | Published |
| Schools: | Schools > Computer Science & Informatics |
| Publisher: | ACM |
| ISBN: | 979-8-4007-2051-2 |
| Date of First Compliant Deposit: | 6 November 2025 |
| Last Modified: | 07 Nov 2025 07:15 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/182179 |