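The average subjective mark of a video sequence is called the MOS (Mean Opinion
Score). It is obtained by simple averaging of the subjective scores:

$$\mathrm{MOS} = \frac{1}{\mathit{experts\_num}} \sum_{i=1}^{\mathit{experts\_num}} \mathit{mark}_i$$

where mark_i is the score given by the i-th expert and experts_num is the total
number of experts.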
To illustrate how widely the individual marks are spread around each MOS, the
left and right bounds of the 95% confidence intervals were computed.
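As a rough illustration of the two steps above, the sketch below computes the MOS and a 95% confidence interval from a list of individual scores. The function name `mos_with_ci`, the sample scores, and the normal approximation (1.96 standard errors) are assumptions made for this example; the report does not state which interval formula was used.

```python
import math
import statistics

def mos_with_ci(scores, z=1.96):
    """Return (mos, left, right) for a list of individual expert scores.

    Assumes a normal approximation: MOS +/- z * s / sqrt(experts_num),
    with z = 1.96 for a 95% interval. This is an illustrative choice,
    not necessarily the formula used in the report.
    """
    experts_num = len(scores)
    mos = sum(scores) / experts_num
    s = statistics.stdev(scores)  # sample standard deviation (needs >= 2 scores)
    half_width = z * s / math.sqrt(experts_num)
    return mos, mos - half_width, mos + half_width

# Hypothetical scores from five experts
mos, left, right = mos_with_ci([4, 5, 3, 4, 4])
print(f"MOS = {mos:.2f}, 95% CI = [{left:.2f}, {right:.2f}]")
```

With few experts, a Student's t quantile instead of 1.96 would give a somewhat wider interval.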
To estimate the probability that the experts were able to distinguish two
codecs on a given sequence, we performed a z-test for each pair of codecs and
bitrates. The following formula was used to estimate this probability:
where experts_num is the total number of experts.
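The formula itself is not reproduced in this text, so the following is only a sketch of one common way to set up such a test. It assumes a two-sample z statistic built from the two MOS values and their sample variances, mapped through the standard normal CDF so that the result lies between 0.5 and 1. The function name `distinguish_probability` and the sample scores are hypothetical.

```python
import math
import statistics

def distinguish_probability(scores_a, scores_b):
    """Probability-like value that two sets of expert marks differ.

    Assumption: z = |MOS_a - MOS_b| / sqrt(var_a / n_a + var_b / n_b),
    mapped through the standard normal CDF. Equal means give 0.5,
    well-separated means approach 1.
    """
    n_a, n_b = len(scores_a), len(scores_b)
    mos_a = sum(scores_a) / n_a
    mos_b = sum(scores_b) / n_b
    se = math.sqrt(statistics.variance(scores_a) / n_a
                   + statistics.variance(scores_b) / n_b)
    if se == 0:
        # No spread in either set: the marks are either identical or trivially different
        return 0.5 if mos_a == mos_b else 1.0
    z = abs(mos_a - mos_b) / se
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical marks for two codec/bitrate combinations
print(distinguish_probability([4, 5, 3, 4, 4], [3, 4, 3, 4, 3]))
```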
Objective metrics
PSNR, VQM and SSIM were measured for all sequences with the MSU Video Quality
Measurement Tool [7].
PSNR is the most widely used metric. It carries the same information as the
mean squared error, but is more convenient to use because of its logarithmic
scale.
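The relation between PSNR and the mean squared error can be sketched with the standard definition, PSNR = 10 log10(MAX^2 / MSE). The example below treats a frame as a flat list of 8-bit sample values (so MAX = 255); this data layout and the function name `psnr` are assumptions for illustration and may differ from what the measurement tool does internally.

```python
import math

def psnr(reference, distorted, max_value=255):
    """PSNR in dB between two equally sized sequences of sample values."""
    if len(reference) != len(distorted):
        raise ValueError("sequences must have the same length")
    # Mean squared error over all samples
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical samples: every value is off by one, so MSE = 1
print(round(psnr([10, 20, 30], [11, 21, 31]), 2))
```

Because of the logarithm, halving the MSE always adds about 3 dB, regardless of the starting quality.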


There are many examples where PSNR does not reflect subjective quality.
VQM [3] and SSIM [4] are relatively new metrics that aim to reflect subjective
opinion.
To compare the predictions of the objective metrics, their results must be
mapped onto a common scale. Following the procedure described in [1], the
results of each metric were mapped to the subjective data scale using the
following fitting function:

where
O is the objective data;
O_fitted is the fitted objective data;
g and d are parameters.
The parameters g and d were selected to minimize the sum of squared differences
between O_fitted and the subjective data:

$$\sum_i \left(O_{\mathit{fitted},i} - S_i\right)^2 \to \min_{g,\,d}$$

where S is the subjective data.
The result of the fitting process can be regarded as the objective metric's
prediction of the subjective opinion.
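The least-squares fitting step can be sketched as follows. The exact fitting function is not reproduced in this text, so the example assumes a hypothetical two-parameter logistic curve, O_fitted = 1 / (1 + exp(-g (O - d))), with subjective scores normalized to [0, 1], and uses a plain grid search rather than any particular optimizer. All names and data values are illustrative.

```python
import math

def fit_function(o, g, d):
    """Hypothetical two-parameter logistic mapping (an assumption, see lead-in)."""
    return 1.0 / (1.0 + math.exp(-g * (o - d)))

def sum_of_squares(objective, subjective, g, d):
    """Sum of squared differences between fitted objective and subjective data."""
    return sum((fit_function(o, g, d) - s) ** 2
               for o, s in zip(objective, subjective))

def fit_parameters(objective, subjective, g_grid, d_grid):
    """Return the (g, d) pair from the grids that minimizes the sum of squares."""
    return min(((g, d) for g in g_grid for d in d_grid),
               key=lambda p: sum_of_squares(objective, subjective, *p))

# Hypothetical PSNR values (dB) and normalized subjective scores
psnr_values = [28.0, 31.0, 34.0, 37.0, 40.0]
subjective = [0.15, 0.35, 0.55, 0.75, 0.90]

g_grid = [i / 100 for i in range(1, 101)]   # 0.01 .. 1.00
d_grid = [25 + i / 10 for i in range(151)]  # 25.0 .. 40.0
g, d = fit_parameters(psnr_values, subjective, g_grid, d_grid)
predicted = [fit_function(o, g, d) for o in psnr_values]
print(g, d, [round(p, 2) for p in predicted])
```

The `predicted` list plays the role of the MOS values "predicted by" the metric; a gradient-based optimizer would normally replace the grid search for finer resolution.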
MOS+PSNR/bitrate graphs
The following graphs show the subjective data for each sequence, its 95%
confidence intervals, and the MOS values predicted by PSNR(3) (after fitting).
Battle

Picture 7. Battle
The "Battle" sequence is the most difficult one for the codecs. PSNR is wrong
at a number of points: for instance, for x264 at 690 kbps and XviD at 1024 kbps
the PSNR values contradict the subjective scores. x264 is the clear leader at
all bitrates, followed by DivX, WMV and XviD.
The z-test table is shown below (each cell is the probability that the experts
distinguished the two sequences).
| Battle | Ref. | DivX 1024 | DivX 690 | WMV 1024 | WMV 690 | x264 1024 | x264 690 | XviD 1024 | XviD 690 |
|---|---|---|---|---|---|---|---|---|---|
| Ref. | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| DivX 1024 | 1 | 1 | 1 | 1 | 1 | 0.87 | 1 | 1 | 1 |
| DivX 690 | 1 | 1 | 1 | 0.97 | 0.94 | 1 | 0.95 | 0.89 | 1 |
| WMV 1024 | 1 | 1 | 0.97 | 1 | 1 | 1 | 0.53 | 1 | 1 |
| WMV 690 | 1 | 1 | 0.94 | 1 | 1 | 1 | 1 | 0.65 | 1 |
| x264 1024 | 1 | 0.87 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| x264 690 | 1 | 1 | 0.95 | 0.53 | 1 | 1 | 1 | 1 | 1 |
| XviD 1024 | 1 | 1 | 0.89 | 1 | 0.65 | 1 | 1 | 1 | 1 |
| XviD 690 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Rancho

Picture 8. Rancho
All codecs performed equally well on the "Rancho" sequence; the differences
between the subjective ratings are small. x264 at 1024 kbps is still the best,
with a mark equal to that of the uncompressed sequence.
| Rancho | Ref. | DivX 1024 | DivX 690 | WMV 1024 | WMV 690 | x264 1024 | x264 690 | XviD 1024 | XviD 690 |
|---|---|---|---|---|---|---|---|---|---|
| Ref. | 1 | 0.94 | 1 | 0.96 | 1 | 0.51 | 0.91 | 0.83 | 1 |
| DivX 1024 | 0.94 | 1 | 0.94 | 0.59 | 0.97 | 0.96 | 0.59 | 0.76 | 0.95 |
| DivX 690 | 1 | 0.94 | 1 | 0.92 | 0.61 | 1 | 0.97 | 0.99 | 0.56 |
| WMV 1024 | 0.96 | 0.59 | 0.92 | 1 | 0.96 | 0.98 | 0.68 | 0.83 | 0.93 |
| WMV 690 | 1 | 0.97 | 0.61 | 0.96 | 1 | 1 | 0.98 | 1 | 0.54 |
| x264 1024 | 0.51 | 0.96 | 1 | 0.98 | 1 | 1 | 0.94 | 0.87 | 1 |
| x264 690 | 0.91 | 0.59 | 0.97 | 0.68 | 0.98 | 0.94 | 1 | 0.69 | 0.97 |
| XviD 1024 | 0.83 | 0.76 | 0.99 | 0.83 | 1 | 0.87 | 0.69 | 1 | 0.99 |
| XviD 690 | 1 | 0.95 | 0.56 | 0.93 | 0.54 | 1 | 0.97 | 0.99 | 1 |
Matrix sc.1

Picture 9. Matrix sc.1
XviD at 1024 kbps took the lead on this sequence, but its advantage is small.
PSNR was adequate for this sequence, except for x264 at 1024 kbps.
| Matrix sc.1 | Ref. | DivX 1024 | DivX 690 | WMV 1024 | WMV 690 | x264 1024 | x264 690 | XviD 1024 | XviD 690 |
|---|---|---|---|---|---|---|---|---|---|
| Ref. | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| DivX 1024 | 1 | 1 | 1 | 0.71 | 1 | 0.6 | 0.99 | 0.85 | 1 |
| DivX 690 | 1 | 1 | 1 | 1 | 0.74 | 1 | 0.88 | 1 | 0.7 |
| WMV 1024 | 1 | 0.71 | 1 | 1 | 1 | 0.79 | 0.97 | 0.95 | 1 |
| WMV 690 | 1 | 1 | 0.74 | 1 | 1 | 1 | 0.71 | 1 | 0.88 |
| x264 1024 | 1 | 0.6 | 1 | 0.79 | 1 | 1 | 0.99 | 0.78 | 1 |
| x264 690 | 1 | 0.99 | 0.88 | 0.97 | 0.71 | 0.99 | 1 | 1 | 0.95 |
| XviD 1024 | 1 | 0.85 | 1 | 0.95 | 1 | 0.78 | 1 | 1 | 1 |
| XviD 690 | 1 | 1 | 0.7 | 1 | 0.88 | 1 | 0.95 | 1 | 1 |
Matrix sc.2

Picture 10. Matrix sc.2
x264 is the best again. The PSNR values for DivX, WMV, x264 and XviD are close,
even though the subjective scores differ.
| Matrix sc.2 | Ref. | DivX 1024 | DivX 690 | WMV 1024 | WMV 690 | x264 1024 | x264 690 | XviD 1024 | XviD 690 |
|---|---|---|---|---|---|---|---|---|---|
| Ref. | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| DivX 1024 | 1 | 1 | 1 | 0.82 | 1 | 0.75 | 0.74 | 0.8 | 1 |
| DivX 690 | 1 | 1 | 1 | 0.98 | 0.98 | 1 | 0.99 | 0.98 | 0.94 |
| WMV 1024 | 1 | 0.82 | 0.98 | 1 | 1 | 0.94 | 0.6 | 0.52 | 1 |
| WMV 690 | 1 | 1 | 0.98 | 1 | 1 | 1 | 1 | 1 | 0.69 |
| x264 1024 | 1 | 0.75 | 1 | 0.94 | 1 | 1 | 0.9 | 0.93 | 1 |
| x264 690 | 1 | 0.74 | 0.99 | 0.6 | 1 | 0.9 | 1 | 0.58 | 1 |
|
XviD 1024
|
1
|
0.8
|
0.98
|
0.52
|
1
|
0.93
|
|