---------------------------
LIVE Video Quality Database
---------------------------

Kalpana Seshadrinathan - Email: kalpsesh@gmail.com

REFERENCES
----------

1. K. Seshadrinathan, R. Soundararajan, A. C. Bovik and L. K. Cormack,
   "Study of Subjective and Objective Quality Assessment of Video",
   accepted for publication, IEEE Transactions on Image Processing, 2009.

2. K. Seshadrinathan, R. Soundararajan, A. C. Bovik and L. K. Cormack,
   "A Subjective Study to Evaluate Video Quality Assessment Algorithms",
   to appear, SPIE Proceedings Human Vision and Electronic Imaging,
   Jan. 2010.

3. URL: http://live.ece.utexas.edu/research/quality/live_video.html

VIDEOS
------

"videos" provides videos in the LIVE Video Quality Database as multiple
zipped files. MD5 checksums are provided for the zip files, as well as
individual yuv files contained within the zip files, to verify integrity
of your download.

NOTE: Each zip file in this directory is at least 700MB. Download times of
an hour or more per file are typical.

All video files have planar YUV 4:2:0 format and do not contain any
headers. The spatial resolution of all videos is 768x432 pixels. The first
331776 bytes of each file correspond to the 8-bit Y component of the first
frame, followed by 82944 bytes corresponding to the 8-bit U component of
the first frame, followed by 82944 bytes corresponding to the 8-bit V
component of the first frame. Frames are concatenated to form sequence
files.

COMPRESSED_VIDEOS
-----------------

"compressed_videos" provides the videos in the LIVE Video Quality Database
in compressed format as multiple zipped files. MD5 checksums are provided
for the zip files to verify integrity of your download.

All H.264 compressed bitstreams in the LIVE Video Quality Database are
provided as elementary streams in ".264" format. All MPEG-2 compressed
bitstreams are provided as elementary streams in ".m2v" format.
H.264 bitstreams suffixed with "2" through "8" (see filenaming convention
below) contain RTP packets, while H.264 bitstreams suffixed with "9"
through "12" are in Annex B bytestream format.

FILENAMING CONVENTION
---------------------

The video file naming convention is as follows. Each filename has the
following pattern - "xx#_$fps.yuv".

"xx" is a pattern that denotes the source/reference video sequence that
was used to create the test video. There are ten reference videos in the
LIVE VQA Database and the patterns used for each of them are as follows:

"bs" - Blue sky
"mc" - Mobile and Calendar
"pa" - Pedestrian Area
"pr" - Park run
"rb" - Riverbed
"rh" - Rushhour
"sf" - Sunflower
"sh" - Shields
"st" - Station
"tr" - Tractor

"#" is a number that denotes the distortion category that was used to
create the test video. There are four distortion categories in the LIVE
VQA database (plus the reference) and the numbers used for them are as
follows:

"1"                   - the original reference video
"2","3","4","5"       - Wireless distortions (four test videos per reference)
"6","7","8"           - IP distortions (three test videos per reference)
"9","10","11","12"    - H.264 compression (four test videos per reference)
"13","14","15","16"   - MPEG-2 compression (four test videos per reference)

"$" is a number that denotes the frame rate of the video sequence in
frames per second (fps). All videos in the LIVE VQA Database have frame
rates of 25 fps or 50 fps.

As an example, for the "blue sky" sequence which has a frame rate of
25 fps:

1. bs1_25fps.yuv is the reference video.
2. bs2_25fps.yuv, bs3_25fps.yuv, bs4_25fps.yuv and bs5_25fps.yuv are four
   test videos obtained from the reference using wireless distortions.
3. bs6_25fps.yuv, bs7_25fps.yuv and bs8_25fps.yuv are three test videos
   obtained from the reference using IP distortions.
4. bs9_25fps.yuv, bs10_25fps.yuv, bs11_25fps.yuv and bs12_25fps.yuv are
   four test videos obtained from the reference using H.264 compression.
5. bs13_25fps.yuv, bs14_25fps.yuv, bs15_25fps.yuv and bs16_25fps.yuv are
   four test videos obtained from the reference using MPEG-2 compression.

"pa", "rb", "rh", "sf", "st" and "tr" have 250 frames (frame rate of
25fps = 10 seconds of video).
"bs" has 217 frames (frame rate of 25fps = 8.68 seconds of video).
"mc", "pr" and "sh" have 500 frames (frame rate of 50fps = 10 seconds of
video).

SUBJECTIVE DATA FORMAT
----------------------

The subjective study was conducted using a single stimulus procedure and
the subjects indicated the quality of the video on a continuous scale.
Subjects also viewed each of the reference videos to facilitate
computation of difference scores using hidden reference removal.

Each video was viewed by 38 subjects. Unreliable subjects were discarded
using the procedure specified in ITU-R BT.500-11. 9 out of 38 subjects
were unreliable in our study and the subjective data provided here is from
29 valid subjects.

The subjective data is provided in two files - "live_video_quality_seqs.txt"
which contains names of the video sequences and
"live_video_quality_data.txt" which contains the corresponding subjective
scores. Each file has 150 lines corresponding to 150 distorted videos in
the LIVE Video Quality Database. Line i of "live_video_quality_data.txt"
contains the mean DMOS score followed by the standard deviation of the
DMOS scores for the video named in line i of "live_video_quality_seqs.txt".

As an example, the 50th line of live_video_quality_seqs.txt is
"tr6_25fps.yuv" and the 50th line of live_video_quality_data.txt is
"73.4730 11.2189". This means that the mean DMOS score (averaged across
subjects) for the video "tr6_25fps.yuv" is 73.4730 and the standard
deviation of the DMOS scores is 11.2189.

ERRATUM TO THE PAPER
--------------------

Since the publication of the paper, we found that the numbers we published
in the paper for PSNR are slightly incorrect.
We provide below accurate results of the performance of PSNR on the LIVE
Video Quality Database.

SROCC values for PSNR: Erratum to Table I(a) in the paper.

Wireless: SROCC = 0.657411
IP      : SROCC = 0.416685
H264    : SROCC = 0.458537
MPEG2   : SROCC = 0.386228
All     : SROCC = 0.539792

LCC values for PSNR: Erratum to Table I(b) in the paper.

Wireless: LCC = 0.668984
IP      : LCC = 0.464538
H264    : LCC = 0.549255
MPEG2   : LCC = 0.389127
All     : LCC = 0.562126

Variance of the residuals between individual subjective scores and PSNR:
Erratum to Table II(a) in the paper.

Wireless residual = 165.043361
IP residual       = 169.583086
H264 residual     = 180.840682
MPEG2 residual    = 185.377176
All Data residual = 182.618213

Variance of the residuals between PSNR and DMOS values: Erratum to
Table II(b) in the paper.

Wireless: Residual = 61.529318
IP      : Residual = 73.344315
H264    : Residual = 85.162100
MPEG2   : Residual = 88.272099
All Data: Residual = 82.976602

This causes slight changes to the statistical significance results
presented in Table III of the paper. In particular, PSNR is no longer
statistically inferior to VSNR on "All data" (entry for M1 vs. M5 in
Table III becomes -----) and PSNR is no longer statistically inferior to
Spatial MOVIE on "Wireless" (entry for M1 vs. M8 in Table III becomes
----0). Note that corresponding changes occur for entries M5 vs. M1
(-----) and M8 vs. M1 (----1) also. Remaining entries in Table III remain
unchanged.
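
The raw frame layout described under VIDEOS (headerless planar YUV 4:2:0 at 768x432, Y plane followed by U and then V) can be read with a few lines of Python. The following is only a minimal sketch and is not part of the database distribution; the names read_frame and frame_count are hypothetical helpers introduced for illustration.

```python
WIDTH, HEIGHT = 768, 432                    # spatial resolution of all videos
Y_BYTES = WIDTH * HEIGHT                    # 331776 bytes of 8-bit luma per frame
C_BYTES = (WIDTH // 2) * (HEIGHT // 2)      # 82944 bytes per chroma plane (4:2:0)
FRAME_BYTES = Y_BYTES + 2 * C_BYTES         # one full frame: Y, then U, then V


def read_frame(stream, index):
    """Return the (Y, U, V) planes of frame `index` as bytes objects.

    `stream` is any seekable binary file object (hypothetical helper).
    """
    stream.seek(index * FRAME_BYTES)
    data = stream.read(FRAME_BYTES)
    if len(data) != FRAME_BYTES:
        raise EOFError(f"frame {index} is missing or truncated")
    y = data[:Y_BYTES]
    u = data[Y_BYTES:Y_BYTES + C_BYTES]
    v = data[Y_BYTES + C_BYTES:]
    return y, u, v


def frame_count(size_in_bytes):
    """Number of whole frames in a headerless file of the given size."""
    if size_in_bytes % FRAME_BYTES != 0:
        raise ValueError("file size is not a whole number of frames")
    return size_in_bytes // FRAME_BYTES
```

For instance, opening a downloaded file such as bs1_25fps.yuv in binary mode and calling read_frame(f, 0) should return the three planes of the first frame, and frame_count(os.path.getsize(path)) can be compared against the frame counts listed under FILENAMING CONVENTION as a quick sanity check.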
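
The filenaming convention and the line-by-line pairing of "live_video_quality_seqs.txt" with "live_video_quality_data.txt" can likewise be sketched in Python. This is an illustrative sketch under the conventions stated in this README, not code shipped with the database; parse_name, distortion_category and pair_scores are hypothetical names.

```python
import re

# Two-letter prefixes of the ten reference videos, as listed in this README.
REFERENCES = {
    "bs": "Blue sky", "mc": "Mobile and Calendar", "pa": "Pedestrian Area",
    "pr": "Park run", "rb": "Riverbed", "rh": "Rushhour", "sf": "Sunflower",
    "sh": "Shields", "st": "Station", "tr": "Tractor",
}

NAME_PATTERN = re.compile(r"^([a-z]{2})(\d{1,2})_(\d+)fps\.yuv$")


def distortion_category(number):
    """Map the '#' field of a filename to its distortion category."""
    if number == 1:
        return "reference"
    if 2 <= number <= 5:
        return "wireless"
    if 6 <= number <= 8:
        return "IP"
    if 9 <= number <= 12:
        return "H.264"
    if 13 <= number <= 16:
        return "MPEG-2"
    raise ValueError(f"unknown distortion number: {number}")


def parse_name(filename):
    """Split a name of the form 'xx#_$fps.yuv' into its three fields."""
    match = NAME_PATTERN.match(filename)
    if match is None or match.group(1) not in REFERENCES:
        raise ValueError(f"unexpected filename: {filename}")
    return {
        "reference": REFERENCES[match.group(1)],
        "distortion": distortion_category(int(match.group(2))),
        "fps": int(match.group(3)),
    }


def pair_scores(seq_lines, data_lines):
    """Pair line i of the names file with line i of the scores file.

    Returns {filename: (mean DMOS, standard deviation of DMOS)}.
    """
    names = [line.strip() for line in seq_lines if line.strip()]
    rows = [line.split() for line in data_lines if line.strip()]
    if len(names) != len(rows):
        raise ValueError("the two files have different numbers of lines")
    return {name: (float(mean), float(std))
            for name, (mean, std) in zip(names, rows)}
```

Passing the two open text files to pair_scores should yield one entry per distorted video (150 in total), and parse_name can then be used to group the scores by reference video or distortion category.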