Video for Audiophiles
By Amir Majidimehr
This is a presentation I made to the Pacific
Northwest Audiophile Society. As the name implies, they are heavily
into audio. So I thought it would be good to present video in the
context of what audiophiles would want to learn. The presentation
is useful to all but if you know audio well, it will especially
resonate.
As you will see in the slides, the presentation covers every topic,
from how we digitize video to its data rate, encoding, compression
(MPEG-2, MPEG-4 AVC, VC-1), transmission (HDMI), projection
technology (LCD, DILA, DLP), calibration and, most importantly, high
fidelity audio for video.
The original presentation had 3-D animation
videos for the room simulations. This version does not. I will
upload the rendering together with a dedicated article at a future
time.
Audio and video differ considerably in their end-to-end production
flow, in the fidelity they achieve and in how they are used in the
home. For example, unlike audio, where we have no idea of "truth,"
in video we can determine with 100% confidence whether what we see
at home is what was produced in mastering. This is accomplished by
using strict standards that simply do not exist in audio. On the
other side of the coin, video achieves far lower performance than
audio, which fortuitously matches our poor eyesight relative to our
hearing.
Digital video is created by separating black and white and color
components (Luma and Chroma respectively). This is done because the
eye is much less sensitive to resolution in color versus black and
white. As you see from the computation below, the number of bits
that represent our video is quite low relative to audio.
Specifically, the black and white samples have only 8 to 10 bits,
compared to audio’s 16 to 24 bits. The resulting signal to noise
ratio is a paltry ~48 dB, which is a far cry from the 96 dB we get
for CD audio, for example.
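The ~48 dB and 96 dB figures follow from the common rule of thumb that every bit of resolution adds about 6 dB. Here is a minimal sketch of that arithmetic, assuming an ideal quantizer and the simple 20*log10(2^bits) relationship (the function name is mine, for illustration only):

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Approximate dynamic range of an ideal quantizer: 20*log10(2**bits)."""
    return 20 * math.log10(2 ** bits)

for label, bits in [("8-bit video", 8), ("10-bit video", 10),
                    ("16-bit CD audio", 16), ("24-bit audio", 24)]:
    print(f"{label}: about {dynamic_range_db(bits):.0f} dB")
```

Each extra bit doubles the number of levels, which is why going from 8 to 16 bits doubles the figure in dB.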
As you see in this slide, the total number of pixels (dots) in our
video, even in high definition, is extremely low relative to our
typical still image capture devices. At just 2 million pixels, HD
video is a far cry from even the cheapest digital cameras. Yet we
enlarge the video to a much larger frame. Imagine trying to print a
10 foot wide picture that is just 2 megapixels! Video takes so much
data that capturing and delivering it at a higher rate is quite
challenging, so we are stuck at this limit for some time.
Fortunately, as can be seen in our reference theater with its 17
foot wide screen, the image can still be breathtaking if done right.
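To put numbers on how far those 2 million pixels get stretched, here is a small sketch. It assumes a 1920 x 1080 HD frame (the article only says "2 million pixels") and uses the 10 foot print and 17 foot screen widths mentioned above:

```python
HD_WIDTH, HD_HEIGHT = 1920, 1080  # assumed 1080p HD frame

def megapixels(width: int, height: int) -> float:
    """Total pixel count in millions."""
    return width * height / 1_000_000

def pixels_per_inch(pixels_across: int, width_feet: float) -> float:
    """Horizontal pixel density when the frame is stretched to width_feet."""
    return pixels_across / (width_feet * 12)

print(f"HD frame: {megapixels(HD_WIDTH, HD_HEIGHT):.2f} megapixels")
print(f"10 ft wide print: {pixels_per_inch(HD_WIDTH, 10):.1f} pixels per inch")
print(f"17 ft wide screen: {pixels_per_inch(HD_WIDTH, 17):.1f} pixels per inch")
```

Prints are normally made at a few hundred pixels per inch, so these densities only work because we sit well back from the screen.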
Using the details already provided, we can compute how much data we
need to store to represent digital video. As you can see, without
some kind of compression the numbers are insanely high. At 389
gigabytes, a single movie would not even fit on a typical laptop
hard disk!
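The calculation itself is simple multiplication. The sketch below shows the idea, but note that the frame size, bit depth, color sampling, frame rate and running time are my assumptions, not the values on the slide, so the result will not necessarily match the 389 gigabyte figure exactly:

```python
def uncompressed_size_gb(width=1920, height=1080, bits_per_sample=8,
                         chroma_fraction=0.5, fps=24, minutes=120):
    """Size of uncompressed video in gigabytes (10**9 bytes).

    chroma_fraction is the number of color (chroma) samples stored per
    black and white (luma) sample, summed over both color channels:
    0.5 for 4:2:0, 1.0 for 4:2:2, 2.0 for 4:4:4.
    All defaults are illustrative assumptions.
    """
    samples_per_frame = width * height * (1 + chroma_fraction)
    bits_per_frame = samples_per_frame * bits_per_sample
    total_bits = bits_per_frame * fps * minutes * 60
    return total_bits / 8 / 1e9

print(f"{uncompressed_size_gb():.0f} GB for a two hour movie")
print(f"{uncompressed_size_gb(minutes=90):.0f} GB for a 90 minute movie")
```

Whatever reasonable values you plug in, the answer lands in the hundreds of gigabytes, which is the point of the slide.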
The basic concept of video compression is easy: compress all we can
in a still video frame and then transmit what changes from that
frame to the next. For example, if there is a solid white wall, we
can reduce the amount of data it takes to represent all of those
similar pixels. But importantly, if the wall stays the same from
frame to frame, we can simply tell the decoder to repeat them and
thereby, save a ton of data in not having to retransmit that
redundant data over and over again.
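Both ideas can be shown with a toy example. This is only a sketch of the principle: real codecs such as MPEG-2 work on blocks of pixels with transforms and motion compensation, not on the simple run-length and difference lists I use here, and all function names are made up for illustration.

```python
def run_length_encode(row):
    """Within a frame: collapse runs of identical pixels into [value, count]."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1
        else:
            runs.append([pixel, 1])
    return runs

def frame_delta(previous, current):
    """Between frames: list only the pixels that changed, as (index, value)."""
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current))
            if old != new]

def apply_delta(previous, delta):
    """Decoder side: rebuild the current frame from the previous one."""
    frame = list(previous)
    for i, value in delta:
        frame[i] = value
    return frame

white_wall = [255] * 16        # one row of a solid white wall
next_frame = list(white_wall)  # the wall is unchanged in the next frame...
next_frame[5] = 0              # ...except for a single pixel

print(run_length_encode(white_wall))        # one run instead of 16 values
print(frame_delta(white_wall, next_frame))  # only the changed pixel is sent
```

The decoder keeps the previous frame around, so `apply_delta(white_wall, ...)` is all it needs to reconstruct the new one.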
The image on the left is the original. Compare that to the heavily
compressed version on the right. Notice how the distortion is
highest on the edges, which we call the “high frequency” portion of
the image. The original image is 3 megabytes and the compressed one
just 0.03 megabytes, or about 100 times smaller.
Close up of the compressed image. Note the "blocking"
artifacts on the face and extreme distortion on all the edges.
We have come a long way since the early video compression standard,
MPEG-2. Newer codecs achieve roughly double the efficiency, enabling
us to deliver better quality in less space and with less bandwidth.
Compression standards such as MPEG-4 AVC and VC-1 (also called WMV)
enable Internet delivery and Blu-ray Disc to perform much better
than if they had remained with MPEG-2. Unfortunately, MPEG-2 remains
the standard for US digital television, hence the horrendous
artifacts present in it, especially in sports where the high motion
and detail become distorted. Standards are great at lowering the
cost of products but they also stifle innovation this way, as we
will have to live with this transmission system for a long, long
time.
As mentioned earlier, in video we have an end-to-end standard that
allows us to verify every piece of the delivery chain to make sure
it stays faithful to the original content. This is done by using a
special color pattern (pictured below) at the original capture
point. If we make sure that the captured video produces the preset
values in that chart, and use measurement equipment at the display,
we can be assured that what we see is what was captured.
As mentioned, at a high level our job in achieving full fidelity is
simple: we feed the display the color bars and measure whether each
color component is where it needs to be. You can see that in the
CIE chart on the left, where the dot is the measured value and the
square is what it should be. If they land on top of each other, we
are golden. If they do not, then the display needs to be adjusted.
The example below is from special software we use to simplify this
work (Calman). The input to the system is a sensitive color
spectrometer. There is of course more to this than this brief
introduction, including such concepts as “gamma,” which determines
how the display shows different gradations of brightness.
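The "dot versus square" comparison can be sketched in a few lines. The targets below are the standard HDTV (Rec. 709) primaries and D65 white point in CIE x,y coordinates. Everything else is an assumption for illustration: the readings are made up, the tolerance is arbitrary, and calibration software normally reports errors in perceptual units rather than the plain distance used here.

```python
import math

# Rec. 709 (HDTV) targets as CIE 1931 (x, y) chromaticity coordinates.
REC709_TARGETS = {
    "red":   (0.640, 0.330),
    "green": (0.300, 0.600),
    "blue":  (0.150, 0.060),
    "white": (0.3127, 0.3290),  # D65 white point
}

def xy_error(measured, target):
    """Straight-line distance between the dot and the square on the chart."""
    return math.dist(measured, target)

def needs_adjustment(measurements, tolerance=0.005):
    """Names of colors whose measured point misses its target.

    The tolerance is a hypothetical threshold, not an industry figure.
    """
    return [name for name, xy in measurements.items()
            if xy_error(xy, REC709_TARGETS[name]) > tolerance]

# Made-up spectrometer readings, for illustration only.
measured = {
    "red":   (0.642, 0.331),
    "green": (0.285, 0.610),
    "blue":  (0.150, 0.060),
    "white": (0.3127, 0.3290),
}

print("Colors to adjust:", needs_adjustment(measured))
```

In this made-up case only green is far enough from its square to call for adjusting the display.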
The concept of 3-D video is really simple: we need to capture two
video streams, one for each eye, each representing what that eye
would have seen. The brain then uses the same technique it normally
uses to determine depth. But at home, we only have one display. To
simulate two, we (usually) play video at twice the rate, with frames
for each eye alternating with the other. Then by the use of active
glasses, we can make sure that at any one instant, each eye sees
only the video intended for it. Passive systems used in theaters use
two projectors with the light polarized 90 degrees from each other.
Combine that with a set of polarized glasses with the same design
and you again make sure that each eye only sees one image. The
drawback here is that you need a specialized projection screen that
preserves this polarization. These screens do not work well for 2-D
viewing, which is most of what we watch. Expensive solutions exist,
such as using two screens, but for the most part it is best to use
active glasses and a single projector such as our Sim2 3-D Nero
and Solo.
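The alternating-frame scheme used with active glasses can be sketched as follows. The frame labels and the 24 frames per second source rate are assumptions for illustration, and the function names are mine:

```python
def frame_sequential(left_frames, right_frames):
    """Interleave per-eye frames into one stream: L0, R0, L1, R1, ..."""
    sequence = []
    for left, right in zip(left_frames, right_frames):
        sequence.append(("left", left))
        sequence.append(("right", right))
    return sequence

def open_shutter(display_frame_index):
    """Which lens of the active glasses is open for a displayed frame."""
    return "left" if display_frame_index % 2 == 0 else "right"

stream = frame_sequential(["L0", "L1"], ["R0", "R1"])
source_fps = 24  # assumed film rate per eye

print(stream)
print(f"Display runs at {2 * source_fps} frames per second")
```

The glasses only have to stay in step with the display so that the open lens always matches the eye the current frame was meant for.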
The best way to experience movies at home is with a large screen and
that is enabled using a projector. For example, the screen we have
in our showroom is a whopping 235 inches diagonally (how displays
are usually measured). Compare that to your 50 or 60 inch flat
panel. By completely covering your field of view, we achieve what
is called “suspension of disbelief,” making you feel like you are in
the movie, rather than sitting at home. Combine that with surround
sound and the illusion is complete. There are three competing
projection technologies, LCD, DILA and DLP, each with its own pros
and cons as listed in the slide. Key metrics are image sharpness,
contrast, black level (how dark the image can get), 3-D performance,
etc.
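Since displays are quoted by their diagonal, it helps to convert to width to get a feel for the difference. A quick sketch, assuming a 16:9 screen shape (the article does not state the aspect ratio):

```python
import math

def screen_width_feet(diagonal_inches, aspect_w=16, aspect_h=9):
    """Width of a screen, in feet, from its diagonal and aspect ratio."""
    width_inches = diagonal_inches * aspect_w / math.hypot(aspect_w, aspect_h)
    return width_inches / 12

for diagonal in (60, 235):
    print(f'{diagonal}" diagonal: {screen_width_feet(diagonal):.1f} ft wide')
```

Under that assumption a 235 inch diagonal works out to roughly 17 feet of width, in line with the screen width quoted earlier, while a 60 inch flat panel is a little over 4 feet wide.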
A projector can use either a single imaging element to synthesize
the image or three. In the case of the latter, each represents one
of the primary colors (red, green and blue). While this has some
advantages in the purity of the color and contrast, the drawback is
that the three panels need to align perfectly or the colors bleed.
In the picture below, you see how far apart the colors can land,
rather than falling on the same spot. This is on a $60,000
projector! Thankfully there are much higher quality projectors such
as our Sim2 which have near perfect “registration” (alignment of
pixels). All LCD, DILA and some DLP projectors use triple imager
technology.
DLP projectors can be designed with a single imager, where a color
wheel spinning in front of it, in sync with the image shown for each
color, converts the image from black and white to color. The
advantage here is that there is no misalignment of panels, so the
image can look exceptionally sharp. The other major advantage is
reduced cost. To wit, our Sim2 Nero 3-D projector is half the price
of the 3-chip 3-D Solo! The only drawback is that a small percentage
of the population can sometimes see a strobing effect where the
colors separate for a moment. Thankfully, by spinning the color
wheel very fast, this can be almost eliminated, as is the case with
the Sim2 Nero.
The standard interconnect between source and display is HDMI. It
is a twisted pair system designed originally for very short
distances. That is a severe limitation when it comes to sending
video to a projector and to other rooms in the house. HDMI provides
for automatic detection of remote devices and their capabilities,
although the latter is sometimes poorly done.
In many ways, audio is an afterthought in HDMI, being slaved to
video. This makes extraction of the audio clock difficult, with HDMI
often underperforming other digital audio standards such as S/PDIF
by a factor of 10 or more! So for best performance, you may want to
connect both HDMI and S/PDIF to your Blu-ray or DVD player.
HDMI reliability can be very poor. The problem is often blamed on
the cable but is usually the fault of improper implementation at the
source, processor/AVR or the display. Troubleshooting requires
having proper equipment which sadly almost none of the design and
installation companies, sans us, own. Without it, it is an
expensive and time consuming process of swapping out equipment until
the faulty unit is identified. Often the problem occurs down the
line, where a fully functional system stops working when, say, a
component like a DVD player is replaced and all of a sudden the
video flashes, becomes green, or no picture is shown.
While audio for video shares the same goal of high fidelity
reproduction as does 2-channel music, it differs in many respects.
Theaters are often bare, making them less acoustically suitable for
good sound reproduction. Most importantly, movie sound is bass
heavy making reproduction of low frequencies much more important.
The typical approach is to just throw some speakers in the room and,
sometimes, to follow incorrect advice on the Internet on how to
treat the room. These are all random approaches to a problem that
can be solved scientifically, to be correct from the start and to
produce the flattest and most faithful reproduction of bass
frequencies.
Textbook approaches to room acoustics abound. Unfortunately they
are for the most part wrong because they make assumptions such as
ideal rectangular rooms with completely symmetrical wall
construction and no furniture, conditions that rarely occur in real
life.
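For reference, this is the kind of calculation the textbook approach produces: the resonant ("axial mode") frequencies between each pair of parallel walls, f = n * c / (2 * L). The sketch assumes exactly the idealized rigid, empty, rectangular room criticized above, and the room dimensions are hypothetical:

```python
SPEED_OF_SOUND = 343.0  # meters per second, at roughly room temperature

def axial_modes(dimension_m, count=4):
    """First few axial mode frequencies (Hz) along one room dimension."""
    return [n * SPEED_OF_SOUND / (2 * dimension_m)
            for n in range(1, count + 1)]

# Hypothetical room dimensions in meters, for illustration only.
room = {"length": 7.0, "width": 5.0, "height": 2.8}

for name, dimension in room.items():
    modes = ", ".join(f"{f:.1f}" for f in axial_modes(dimension))
    print(f"{name} ({dimension} m): {modes} Hz")
```

The numbers are only as good as the assumptions behind them, which is why they tend to disagree with what is measured in a real, furnished room.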
The right approach uses two key techniques: use of multiple
subwoofers and fluid dynamics simulation of the room. Multiple subs
are critical to get around the fundamental physics of waves
reflecting from walls and adding and subtracting, which creates a
wildly varying frequency response that differs from seat to seat.
Using more than one subwoofer, as later slides show, can sharply
reduce these variations, but this requires knowing where to place
them and that is where the second method comes in. Using computer
simulation of speakers as pistons energizing the room and the air as
the “fluid,” we can model thousands or even millions of
configurations of subwoofer locations, arriving at the best location
and number of units to get the best response.
The proof is in the pudding as they say. Here are the simulations
of our theater with a single subwoofer in the model on the left, and
three on the right (one on each side and one in the ceiling). I will post
the full video later but for now, you can see in this frozen frame
at 17 Hz, how different the room response is between the two. The
colors are the “isosurfaces” of similar pressure (equal loudness).
While the isosurface simulation looks pretty, perhaps a more useful
view is the cutaway that shows the pressure at the ear height only.
After all, it is not important how the room sounds at other points
in the room. Here we see a more simplified but more dramatic view
of what is occurring in each subwoofer configuration. In the single
subwoofer simulation on the left, at this 30 Hz frequency point,
there is a whopping 25 dB of difference in loudness compared to just
two seats over! Imagine hearing normal bass while your wife, sitting
next to you, is going deaf from this much exaggerated bass. In
contrast, the three sub simulation shows almost zero difference in
sound pressure at this frequency between the seats.
This is the ideal that we want to strive for and there is no way to
achieve it with one subwoofer and without simulation. Yet, the most
common system sold for home theater is a single subwoofer given to
the homeowner or the installer to randomly place in the room.
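To get a feel for how large a 25 dB seat-to-seat difference is, the decibel value can be converted back to plain ratios. This is standard decibel arithmetic; the function names are mine:

```python
def db_to_pressure_ratio(db: float) -> float:
    """Sound pressure ratio for a level difference in dB (20*log10 rule)."""
    return 10 ** (db / 20)

def db_to_power_ratio(db: float) -> float:
    """Acoustic power ratio for a level difference in dB (10*log10 rule)."""
    return 10 ** (db / 10)

difference_db = 25
print(f"{difference_db} dB = {db_to_pressure_ratio(difference_db):.1f}x the sound pressure")
print(f"{difference_db} dB = {db_to_power_ratio(difference_db):.0f}x the acoustic power")
```

That is nearly 18 times the sound pressure, and over 300 times the power, at one seat compared to the other.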
Whether you want us to design your theater or plan to design it
yourself, we can provide this turnkey service for you. Using simple
measurements of the room, all the allowable positions and the
maximum number of subwoofers, we can run the simulations for you and
provide you with the optimal number and placement of subwoofers.
For the cost of one or two subwoofers, you will have far, far better
sound than any other solution, even if you do nothing more than
this.
There are two programs: Gold and Platinum. The differences are
listed in the slides. The cost difference is 2X.
Once you have the subwoofers optimized, the next step is to use
acoustic treatment to deal with the remaining variations, which
should be much smaller.
The final step is electronic equalization. There is no more powerful
system than the JBL Synthesis system using the SDEC processor. This
is a 20+ channel audio system that can concurrently correct that
many audio channels, allowing it to easily accommodate multiple
subwoofers and bi-amplification of speakers. The SFM system here
will attempt to deal with variations between subwoofer size and
model, and ARCOS provides optimizations of speakers and subwoofers.
Here you see an example of SFM/ARCOS smoothing the response of the
subwoofer (the thicker solid line) relative to the more distorted
version prior to correction (the more faded lines). As you see, the
new response is much smoother. Note how JBL shows you before and
after, whereas consumer room EQ systems never provide you with
measurements post correction. They only show you a pretend graph of
what they told the speakers to reproduce. Such graphs are
notoriously wrong in those systems, which is the reason they usually
do more harm than good.
I wish our audio/video systems worked perfectly out of the box but
unfortunately they do not. Worse yet, figuring out that they do not
and how to get them to perform is an expensive proposition. As an
example, to measure display fidelity we use our Minolta spectrometer
which originally cost us $20,000. No wonder then that companies
either don’t bother measuring the systems they sell or use cheap
consumer devices that simply do not have usable precision as the display
brightness goes down (critical to make sure the display does not
have a color shift in darker areas).
Likewise, as mentioned earlier, troubleshooting HDMI requires having
an instrument such as our Quantum Data HDMI portable analyzer. We
have solved customer problems in 30 minutes using this instrument
that we could not with weeks of swapping equipment and blaming the
wrong gear for the problem.
Neither instrument is easy to use, however. They are professional tools
requiring extensive knowledge of what they do and how the underlying
system operates. Thankfully we are here to get the job done so the
point is to make sure whoever you use for your audio/video needs is
similarly situated. Or else, you are headed for reliability and
performance issues.
This was the lead-in to our reference theater, which employs the
techniques mentioned earlier in its design. You can read more
about it in our article on the design of our
reference home theater.
And here is a picture of the front wall,
showing the speakers, one of the subwoofers and special acoustic
treatment in the front wall and on the sides.