Digital Video Basics
By Amir Majidimehr
At the risk of stating the obvious, video in the
real world is analog. To convert it to digital, we need to decide on
how many samples to take per second and the resolution (bit depth) of
each. The latter is rather simple in this space. The most common
resolution is 8 bits per sample, with 10 bits reserved for the
broadcast/effects/professional space. To put this in perspective, we
use a minimum of 16 bits for CD audio and go up from there.
8 bits represents about 48 dB of signal-to-noise ratio, by the way.
That is not a very big number, and hence the reason even clean video
has some noise in it. But we digress.
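The 48 dB figure follows from the usual rule of thumb of roughly 6 dB
per bit. Here is a minimal sketch, assuming the simple 20*log10(2^n)
formula for the ratio between full scale and one quantization step
(the function name is ours, purely for illustration):

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

# Compare 8-bit and 10-bit video with 16-bit CD audio.
for bits in (8, 10, 16):
    print(f"{bits} bits: {dynamic_range_db(bits):.1f} dB")
```

Each extra bit doubles the number of levels, which is where the
roughly 6 dB per bit comes from.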
CCIR 601 (today ITU-R BT.601) is the ITU recommendation that
standardized how many samples are taken per second for standard
definition video. The convention is 13.5 MHz (13.5 million samples
per second), which translates into 720 active pixels horizontally.
And not by accident, this is the same number of horizontal pixels
used in the DVD format.
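As a rough check on where 720 comes from: at 13.5 MHz, a complete
scan line works out to 858 samples for 525-line video and 864 for
625-line video, and 720 of those are treated as picture. The line
rates in this sketch are the standard NTSC and PAL values, which the
text above does not state, so treat them as assumptions:

```python
from fractions import Fraction

SAMPLE_RATE_HZ = 13_500_000
ACTIVE_SAMPLES = 720  # samples per line that carry picture

# Assumed line rates (lines scanned per second) for each system.
LINE_RATE_HZ = {
    "NTSC (525-line)": Fraction(4_500_000, 286),  # about 15,734 Hz
    "PAL (625-line)": Fraction(15_625),
}

def samples_per_line(line_rate_hz: Fraction) -> Fraction:
    """Total samples taken during one full scan line."""
    return SAMPLE_RATE_HZ / line_rate_hz

for name, rate in LINE_RATE_HZ.items():
    total = samples_per_line(rate)
    print(f"{name}: {int(total)} samples per line, {ACTIVE_SAMPLES} active")
```

The samples that are not active fall in the horizontal blanking
interval, which carries no picture.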
Vertical resolution for standard definition video is fixed by the
broadcast standard (NTSC in the US is 525 lines, and PAL, used in
much of the rest of the world, is 625). The actual visible resolution
is lower due to some reserved areas, resulting in 720x486 and 720x576
for the NTSC and PAL standards respectively. Some people round the
vertical resolution a bit so you might see slightly different
numbers. Putting it all together, the DVD resolution is therefore
720x480 in the US for a total of roughly 346,000 pixels. Put in the
context of a typical digital camera, this is 0.3 megapixels per frame
of video. Imagine taking a picture at that resolution and blowing it
up bigger than poster size to the 40 to 60 inches of your flat panel
TV. No wonder then that the high definition video standard was
created to up the resolution (by up to 6 times).
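The arithmetic is easy to verify. This short sketch assumes 1920x1080
as the top HD resolution, a figure not given above:

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels in one frame."""
    return width * height

dvd = pixel_count(720, 480)
hd = pixel_count(1920, 1080)  # assumed top HD resolution

print(f"DVD frame: {dvd:,} pixels ({dvd / 1e6:.2f} megapixels)")
print(f"HD frame:  {hd:,} pixels ({hd / dvd:.0f}x the DVD frame)")
```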
In video, we do not operate in RGB (red/green/blue pixels) as PCs
do. Instead, we separate the color from black and white
information. The latter is called Luminance and the former
Chrominance. Luma and Chroma for short. Color is separated into two
“color difference” values giving us three values per pixel. In
other words, each video sample has one Luma value, and two Chroma
values. As you may know, in RGB we also have three samples, one for
each color indicated by the letters.
Depending on who you are talking to, and whether we are talking
about analog or digital signals, you will see notations such as
YCbCr, YPbPr, YUV, etc. For our purposes they all mean the same thing
and indicate color difference mode as opposed to RGB. As the naming
indicates, “Y” is always luminance and the other two are the color
difference signals. RGB is for computer use and has no notion of
separate luminance. Using simple math, you can go from YUV to RGB and
back. But not every color can be represented in both spaces, so the
conversion can be lossy.
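To make the "simple math" concrete, here is a minimal sketch using
the standard definition luma weights (0.299, 0.587, 0.114). It works
on normalized values, with R, G, B and Y in 0..1 and the two color
difference values in -0.5..+0.5. The function names are ours, and a
real pipeline would also deal with integer code values and rounding:

```python
# Luma weights for standard definition video.
KR, KB = 0.299, 0.114
KG = 1.0 - KR - KB

def rgb_to_ycbcr(r, g, b):
    """RGB (0..1) to luma plus two color difference values."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2 * (1 - KB))  # scaled blue difference
    cr = (r - y) / (2 * (1 - KR))  # scaled red difference
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr."""
    r = y + 2 * (1 - KR) * cr
    b = y + 2 * (1 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b

def in_rgb_gamut(rgb):
    """True if every component is a displayable 0..1 value."""
    return all(0.0 <= c <= 1.0 for c in rgb)

# Pure red survives the round trip (up to floating point error).
print(ycbcr_to_rgb(*rgb_to_ycbcr(1.0, 0.0, 0.0)))

# A legal-looking Y/Cb/Cr triple that has no RGB equivalent.
odd = ycbcr_to_rgb(0.5, 0.5, 0.5)
print(odd, "displayable:", in_rgb_gamut(odd))
```

The second triple decodes to a red component above 1.0, so it has to
be clipped, and that clipping is where colors get lost.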
In order to reduce the burden of storing and managing so much video
data, color is subsampled, meaning it is allowed to have lower
bandwidth. A notation is used to indicate this. Without
subsampling the video is said to be 4:4:4. Next step down is 4:2:2
which translates into half the bandwidth for color. If you halve
the color yet again you arrive at 4:2:0. Using 8 bits per sample,
4:4:4 translates into 24 bits per pixel, as it would in the RGB
world. The same bit depth translates into 12 bits per pixel when
dealing with 4:2:0.
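The bits-per-pixel figures can be worked out from the notation
itself. This sketch assumes the usual reading of J:a:b, where a block
J pixels wide and 2 rows tall holds "a" chroma samples in the first
row and "b" in the second, for each of the two color difference
channels (the function name is ours):

```python
def bits_per_pixel(scheme: str, bit_depth: int = 8) -> float:
    """Average bits per pixel for a J:a:b subsampling scheme."""
    j, a, b = (int(part) for part in scheme.split(":"))
    luma_samples = 2 * j          # one luma sample per pixel, two rows
    chroma_samples = 2 * (a + b)  # two color difference channels
    return bit_depth * (luma_samples + chroma_samples) / luma_samples

for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(f"{scheme}: {bits_per_pixel(scheme):.0f} bits per pixel")
```

With 8-bit samples this gives 24, 16, and 12 bits per pixel, so 4:2:0
carries half the data of 4:4:4.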
DVD, Blu-ray Disc, and all forms of broadcast television use 4:2:0
at 8 bits. This means that the color is substantially reduced in
bandwidth as compared to what is available in a professional
setting. Fortunately, your eye is not very sensitive to fine color
detail (due to the difference between rods and cones in your eye).
So in real life, you don’t see that softness. But put up a
color bar from a test disc and you can easily see that the edges are
quite soft as one color switches to another. Compare that to the
very sharp lines in the black and white boxes.
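A toy example shows why the color edges go soft. It halves one row of
chroma samples by averaging neighboring pairs, then stretches it back
out by repeating each value. Real encoders and players use better
filters than this, and the sample values are arbitrary, so take it
only as an illustration:

```python
def subsample_pairs(row):
    """Halve the sample count by averaging each pair of neighbors."""
    return [(row[i] + row[i + 1]) // 2 for i in range(0, len(row) - 1, 2)]

def upsample_repeat(row):
    """Restore the original length by repeating every sample."""
    return [value for value in row for _ in range(2)]

chroma = [0, 0, 0, 255, 255, 255]  # a hard edge between two colors
restored = upsample_repeat(subsample_pairs(chroma))

print("original:", chroma)
print("restored:", restored)
```

The hard step from 0 to 255 comes back with an in-between value on
either side of the edge. Luma is never subsampled, which is why the
black and white boxes stay sharp.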
For HD there is also a difference in color space. What is color
space? In a nutshell, the color space tells you what numbers to use
to represent a specific color. As you can imagine, there are a lot
of variations in the color red. But we must pick one to be the
absolute red color so that we can then reproduce it at the receiving
end.
CCIR 601 defines a certain color space used for SD. Recommendation
709 does the same for HD. It has a slightly expanded color gamut.
Your TV always expects 709 if it sees a signal higher in resolution
than SD. So when a DVD player upsamples the video, it must also
change the color space to 709 or the colors would look (slightly)
wrong.
As for levels, the TV world uses 16 to 235 in an 8-bit word even
though the range of values is 0 to 255. This is to allow signals to
go above and below the min and max and not get lost. In the
computer/RGB world, we use the full range. If your display is set up
for PC use (0-255), feeding it video signals results in washed out
video since it thinks “16” means something well above absolute black.
Likewise, if your display thinks you are in video mode and you feed
it computer graphics, everything below 16 is crushed into black and
everything above 235 into white. Something has to do the right
conversion between the two. If not done right, you get banding and
crushed video.
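The conversion between the two ranges is a simple scale and offset.
Here is a minimal sketch for 8-bit luma values, with function names
of our own choosing. Real converters may round or dither differently:

```python
def video_to_pc(v: int) -> int:
    """Expand video levels (16..235) to PC levels (0..255), clipping."""
    scaled = round((v - 16) * 255 / 219)
    return max(0, min(255, scaled))

def pc_to_video(p: int) -> int:
    """Compress PC levels (0..255) into video levels (16..235)."""
    return round(16 + p * 219 / 255)

print(video_to_pc(16), video_to_pc(235))  # black and white end points
print(pc_to_video(0), pc_to_video(255))

# Only 220 video levels exist to fill 256 PC levels, so some PC
# values are never produced. Those gaps are one source of banding.
used = {video_to_pc(v) for v in range(16, 236)}
print(len(used), "of 256 PC levels used")
```

Note that the expansion clips anything below 16 or above 235, so the
extra room the video range leaves for signal excursions is thrown
away at this step.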
Finally, if your source is 8 bits, as is the case with everything you
buy today, having higher resolution in the display does nothing for
you. Even though a display may advertise 10 bits, 16 bits, or
whatever, you are not able to experience that (and that assumes the
display can actually show those levels, as opposed to merely
accepting that resolution at its input, which is commonly the case).
So unless you are doing graphics/effects work on the display, you can
safely ignore these numbers.
This should get you started at least in learning more about this
subject. If you have a subscription to WideScreen Review magazine, I
also wrote an article in there with a lot more detail on this topic.