|
What is A
"Color Space?" A
“color space” is a way of specifying a color numerically,
usually as a triplet of numbers representing positions in a
three-dimensional “space” of color. Color spaces are
three-dimensional because our eyes have three different
kinds of color-sensitive cells (called “cone cells” or
“cones”), and thus every color space in one way or another
must encode three different color intensities. Most people
are at least a bit familiar with the way images are formed
on a computer monitor or television by combining red, green,
and blue dots of varying brightness to form a wide range of
colors. That method uses the most common kind of color
space, the “RGB” space, named for the colors
Red, Green, and Blue.
As it turns out,
there is not just one RGB color space. There are an infinite
number of RGB color spaces, created by varying several
parameters, including the specific hue of red, green, and/or
blue to be used for the colored dots in the display, the hue
of white used, and the specific way the brightness of the
dots in the display varies as the numbers fed into the
display vary.
(It’s worth noting
at this point that the RGB spaces we use all the time in
video and computer displays are generally labeled “R’G’B’”
(pronounced “R-prime, G-prime, B-prime”) by color scientists
because they are “gamma-corrected” spaces. Color scientists
reserve “RGB” to refer to non-gamma-corrected (or “linear”)
spaces that use red, green, and blue primaries. This
distinction is not as commonly used in the video or computer
world, and is beyond the purview of this article. Just know
that in this article, when we talk about “RGB” we mean
R’G’B’ in the language of color science. We’ll talk about
this further in a future article about
gamma.)
You might well
ask, “If the way you specify colors is via the percentages
of R, G, and B, and different color spaces use different
primary colors of Red, Green, and/or Blue, how do you
specify the specific primary colors of Red, Green, or Blue
used to define the space?” The answer is that you use a
fundamental color space, called XYZ, or more
formally CIE XYZ. XYZ is a color space that is derived from
basic studies of how the eye and brain sense color. It is
notable for being an “absolute” color space (meaning that
colors are specified directly, not by reference to other
colors), and for being able to represent any possible real
visible color that a human being can sense. RGB, by
contrast, is a “relative” color space, where the colors are
specified relative to three “primaries,” which are the
colors of red, green, and blue used in that particular
space.
There are
additional color spaces that can represent any color, almost
all derived from XYZ, including “xyY,” “CIELUV”, and
“CIELAB”. But
displays always use some form RGB as their fundamental color
space, for the simple reason that real-world displays can’t
show all colors. They can only show colors that can be mixed
from their specific RGB primaries, so it’s not useful to
send them colors they can’t display.
In the world of
high definition video, there is one very common RGB space
specified in an ITU standard called BT.709, or
sometimes Rec. 709 (for “Recommendation number 709”). It
specifies (in an absolute space) the specific colors of red,
green, and blue that must be used in a conforming display,
and what color of white the display needs to produce when
all three primaries are at full brightness. There is no
current standard for display gamma, which is how the
brightness of each pixel varies as the input voltages or
digital values vary, but there is a common understanding
based on using the gamma of the CRTs used in video
mastering.
Video and
Y'CbCr Given that
video displays are fundamentally RGB devices and all share a
common RGB color space, specified in BT.709, you’d expect
that the primary color space used to transmit and store
video would be BT.709 RGB. But in fact, even though video
cameras physically measure RGB values, and displays are made
using RGB primaries, video is stored, transmitted, and
processed in a color space called Y’CbCr, or sometimes
informally “YUV.”
Y’CbCr is the latest
version of a set of color spaces that were developed in the
early days of color television. The broadcasters and the FCC
wanted to make color television backward compatible with
black-and-white television, so all the people who owned
black-and-white televisions wouldn’t find them obsolete when
color broadcasting started. Unfortunately, there wasn’t
enough room to broadcast both a full-color signal and a
black-and-white signal within the frequency band owned by a
single television station. It was necessary to find a way to
send both a compatible black-and-white signal and a color
add-on signal that could be combined with the
black-and-white signal to produce a full color
signal.
Since there was very
little room in the frequency band for even a color add-on
signal, it was necessary to make the color add-on very low
resolution. This worked out OK because your eyes are much
less sensitive to color resolution than to brightness
resolution. Another way of looking at it is that the
viewer’s perception of how sharp the picture is depends
mostly on the main black-and-white signal, with the extra
color signal adding almost no additional sharpness. Thus the
color signal can be, in effect, a somewhat rough and blurry
overlay.
The main black-and-white
signal is carried in a single channel called Y’ (pronounced
Y-prime), and the low-resolution color signal is carried in
two channels, labeled Cb and Cr, also called “color
difference” signals, because they are derived from B-Y’ and
R-Y’. Y’ itself is a weighted combination of R, G, and B,
using specific
weights that are designed to make Y’ approximate
perceived brightness.
Y’CbCr is a handy color
space for storing and broadcasting video, because the Y’
signal can be stored or sent at very high resolution, and Cb
and Cr can be stored or sent at low resolution without
causing the final image to look significantly worse. In
effect it’s a very simple lossy
compression scheme, throwing away portions of the
image that are less important for perception (the detailed
color information) in order to devote more resources to the
important stuff (the black-and-white details).
As with RGB, there are a
potentially infinite number of possible Y’CbCr color spaces,
varying primarily in the weights of R, G, and B that are
combined to form the Y’ signal. Luckily, there is a
standard, the aforementioned BT.709, which gives specific
mathematical functions for converting RGB to and from
Y’CbCr.
Subsampling In the old days of color TV, the Cb and Cr
channels (which just to be precise weren’t called “Cb” and
“Cr” at the time, but that’s not important to this
discussion) were reduced in resolution via an analog lowpass
filter, which stripped out detail and allowed the color
signal to fit in the tiny amount of broadcast bandwidth
available for the extra color information. But in the
digital era, the Cb and Cr signals are reduced in resolution
via the simple expedient of scaling them down to a smaller
number of pixels.
The process of
scaling the color portions of an image to a lower resolution
is called “subsampling,” and scaling the color back to the
original resolution is called “upsampling.” Either one can
also be called “resampling.” All of these operations are
identical in practice to scaling the color channels, just
like scaling an image from one pixel size to another. There
are a variety of different resolutions that can be stored or
sent, and we often think of these various color resolution
options as a different color space. This isn’t strictly
true, as technically speaking the color space remains the
same no matter how the color channels are scaled, but it’s
still relatively common to speak of changing color spaces
when one is actually changing color subsampling
modes.
 |
4:2:0 - Progressive*
|
 |
| 4:2:0 -
Interlaced* |
The subsampling
format that is used on modern consumer video delivery media
like Blu-ray Disc and DVD is called 4:2:0. This confusing
designation means that for every 4 Y’ pixels on the even
scan lines, there will be 2 Cb and 2 Cr pixels and for every
4 Y’ pixels on the odd scan lines there will be 0 Cb and Cr
pixels. If you work all that out, it really means that the
Cb and Cr portions of the image are scaled by ½ in both
dimensions. So if the resolution of the overall image is
1920x1080, for example, the Cb and Cr portions of the image
will be at 960x540 resolution. In order to display an HD
image that is stored in this format, the Cb and Cr channels
need to be scaled back up to 1920x1080 by interpolating
values for the missing pixels, and then each pixel can be
converted to RGB using the Y’CbCr->RGB algorithm
specified in BT.709.
 |
| 4:2:2* |
For professional
video, the most common format is 4:2:2, which
means for every 4 Y’ pixels, there are 2 Cb and Cr pixels on
the even lines and 2 Cb and Cr pixels on the odd lines.
Again, it really works out to scaling the Cb and Cr by ½ in
just the horizontal direction, but leaving the vertical
unchanged. So each 1920x1080 image in 4:2:2 has the Cb and
Cr stored at 960x1080. To display an image stored in this
format, the Cb and Cr channels only need to be scaled
horizontally.
 |
| 4:4:4* |
Finally there is
4:4:4. This
just means that there are an equal number of Y’, Cb, and Cr
pixels, with no subsampling at all. This format is expensive
to store, so is only used for storage of very high-end
professional master video. But it is often available as an
output format from a player or video
processor.
Given that all the
various shiny-disc and broadcast video formats use 4:2:0
natively, one might assume that players would just send the
video to the display in that format. But as it turns out,
video players are basically required to at minimum convert
the video to 4:2:2 in order to send it to the display,
because there are standards for storing 4:2:0, but no
standards for sending it to a display. While only 4:2:2 is
required, many players now also offer the ability to go
further and convert the video to 4:4:4, or even RGB. And
now we get to the meat of this guide. What format should you
set your player to output? If you have a video processor,
what format should you feed it, and what format should you
have it produce? Or does it even matter?
The answer, as
with so many other things in life, is, “It
depends.”
The
Conversion Chain Let’s
consider the process necessary to get video off a shiny disc
(or from a digital broadcast or cable channel). First the
video needs to be converted from 4:2:0 to 4:2:2, then to
4:4:4, then to RGB, and finally it can be fed to the display
controller. This is the same process no matter what display
technology is being used, whether LCD, DLP, plasma, or CRT.
It’s possible to shortcut the process slightly by going
directly from 4:2:0 to 4:4:4, but in practice this isn’t
used as often as you’d think.
If you choose to
output 4:2:2 from your player to the display, then the
display will need to do the scaling to 4:4:4 and then to
RGB. If you output 4:4:4 to the display, the display will
not need to do any scaling at all, but will need to do the
conversion to RGB. If you output RGB to the display, then
the display can avoid all conversion steps and send the
signal right to the controller. No matter which you choose,
the same conversion steps are still happening; all you are
choosing is which device is performing the
conversion.
There’s no
specific reason that a display or a player would be the
optimal place to do these conversion steps. In theory doing
the conversion in the display minimizes the amount of data
that has to flow across the HDMI link, but in practice HDMI
is more than adequate to handle any format all the way up to
4:4:4 or RGB.
So the key to
choosing the right color space to output is finding out
which device does a better job of converting color spaces.
This is not always easy to evaluate, and it’s quite possible
for one device to do a better job in one area, like 4:2:2 to
4:4:4, but do worse in another area, like 4:4:4 to
RGB.
You’d think that
if a display handles a 4:2:2 input signal well, then feeding
it an RGB signal would be no worse, but in fact some
displays do extra work when they are fed RGB, because they
convert the signal back to 4:4:4 or even 4:2:2! This happens
because one or more of their internal processing chips is
designed only for one color format. So for these displays,
sending in any format other than the one it will use for
internal processing will only add extra processing and
potentially degrade the image.
The same logic
applies to video processors. If the processor does all its
work in 4:2:2, there’s no advantage to sending it RGB or
4:4:4, and in fact there may be a
disadvantage.
Unfortunately
device makers tend not to reveal the exact processing steps
they use internally, or the algorithms they use to convert
various color spaces to RGB. Some use different algorithms
depending on which color space is fed in. The bottom line is
to assume nothing, and test every
combination.
Performing The
Evaluation Here’s how you can decide
which color space mode works best with your display and
player combination, using the Spears & Munsil High
Definition Benchmark. If you change any component in your
system, either player, display, or processor, you’ll want to
run the evaluation again. There’s no easy way to predict
what will produce the best results with a particular
combination of components.
Scoring Form We’ve
helpfully provided a PDF file you can download
and print out, so you can try all the output modes from your
player and evaluate which one works best.
Before you start, you’ll want to
check that the settings for brightness, contrast, color,
tint, and sharpness are all calibrated properly for each of
the color space modes. Some displays have separate memories
for every input mode, so you might find that even if the
display is adjusted properly when it is being fed 4:2:2, the
settings change when it gets a 4:4:4 or RGB input signal.
Start by setting the output on
the player to 4:2:2. Run through the basic calibration steps
for brightness, contrast, color, and tint (using the
articles on the Spears & Munsil web site as a guide).
Then switch the output on the player to 4:4:4 and run
through the calibration again. You may not need to adjust
anything. If your player has RGB mode, do the calibration
again for that mode. If you have even more modes, you may
need to print out more forms and write in the names of the
other modes you want to compare.
For the color temperature and/or
gamma adjustments, unless you have special test equipment
you won’t be able to calibrate these settings. Just make
sure that these settings are set the same for all the color
space modes. If your player only has one set of settings
that is correct for all of the color space modes (which is
the most common case), you don’t need to fill in this
section. Just do the calibration once and then make sure it
continues to work in the other picture modes.
Once you are sure that you have
the correct settings for each input picture mode, run
through the various tests putting a check in the box for
pass, and leaving the box unchecked for
fail. When you’re done, hopefully one mode will
have the most boxes checked, and most of the time that will
be the preferred mode to use. In some cases, you may find
that one specific issue is more distracting for you than the
others, and in that case you’ll want to choose among the
modes that doesn’t have that particular problem.
If you can select modes in both
your player and your video processor, our recommendation is
to start by trying the various modes in your processor,
leaving the player in factory default mode. Choose the
output mode that scores best and set the processor in that
mode, then move to the player and evaluate all the various
modes the player can produce. If you end up changing the
player’s output mode, you may want to return to the
processor to re-evaluate in case the input mode affects the
processor’s output. If you want to be completely
comprehensive, you may want to try every possible
combination of player and processor mode separately.
 |
Player -> Display
|
 |
Player -> Receiver ->
Display
|
Important note:
If you are running the HDMI signal through a receiver or
switcher and find problems, especially with clipping, you
should try taking the receiver or switcher out of the chain
and connecting the player directly to the display to see if
that fixes the problem. There are several receivers,
switchers, and video processors that will clip the signal
passing through them, even if they aren’t doing any
processing of the image. Also check the web sites of the
manufacturer of your receiver or switcher to see if there is
a new firmware, as this might correct some or all of the
errors.
Let’s take a look at the various
test patterns you’ll want to look at and what to look for in
each:
Chroma Alignment This
pattern contains shapes in various color combinations that
are designed to show any misalignment between the chroma
channels and the luma channel. These misalignments can be
caused by mistakes or shortcuts in the chroma upsampling,
and it’s not uncommon to find that changing the format sent
from the player to the display changes the amount of chroma
misalignment.
The primary things to look at are
the long thin diamond shapes on the left, right, top, and
bottom of the screen. Each of them has a single straight
line of chroma pixels laid on top of a long skinny diamond
in the luma channel. When the alignment is correct, the
chroma should be centered on the diamond, and the diamond
should look completely symmetrical. Most people find it
easiest to see the alignment clearly against the gray
background. The difference can be quite subtle, on the order
of a half-pixel shift.
You can also often see difference
(if any) in the chroma upsampling algorithm. Nearest
neighbor will have very sharp chroma transitions, but will
have a half-pixel shift to the right in the chroma channel.
Bilinear and bicubic will produce a softer, but more
accurate, chroma channel, with smoothly rolled-off edges.
Don’t be fooled by the sharp look of nearest neighbor; on
this pattern it often looks sharper, but it will make the
finished image look jagged. See the example image to see
what the various upsampling approaches tend to look like.
Put a check in the row labeled
“Alignment correct” for any mode where the chroma lines are
centered in their diamonds. If multiple modes have properly
aligned chroma, put a check for all of them. If none of them
are properly aligned, put a check for the mode that is the
closest to correct, or for none of them if none of them are
close to correct.
Also compare the image on screen
to our example images for the various chroma upsampling
approaches. If you're not sure what kind of upsampling is
being used, you may want to look at the bursts and zone
plate patterns and compare them to our samples as well. If
you don't see clear stairstepping in any of those patterns,
it's reasonable to assume that the upsampling is using
bilinear or better.
 |
Chroma
Alignment
|
Chroma
Multiburst This pattern has ten
horizontal bursts in two rows of five on top, and ten
vertical bursts in two rows of five on the bottom. The
horizontal bursts show how well the video playback chain is
reproducing horizontal chroma resolution, and the vertical
bursts show how well the video playback chain is reproducing
vertical chroma resolution.
For this pattern, look at the
highest-frequency bursts, which are on the lower right of
both the horizontal and vertical sections. They should have
clear, bright colors that look identical to the colors in
the other bursts. If the colors are muted, or the burst
looks solid gray or any other color, it shows that chroma
resolution is being lost during one of the upsampling
conversions. If the horizontal burst is muted, that shows a
problem in the 4:2:2->4:4:4 conversion. If the vertical
burst is muted, that shows a problem in the 4:2:0->4:2:2
conversion.
Another thing that’s fairly easy
to tell from this pattern is the quality of the chroma
upsampling being done. If the chroma upsampling is being
done using an algorithm called “nearest neighbor” then each
chroma pixel is just being copied four times to make the new
upscaled chroma image. This is fast and easy, but produces
blocky, jagged color contours in the final image. Bilinear
upsampling uses a linear interpolation algorithm to create
the replacement pixels when it scales up the chroma channel,
and looks much better. Bicubic upsampling uses two cubic
interpolation curves to produce a very smooth and clean
chroma channel, and is generally considered the best
commonly used algorithm. Take a look at our sample image to
get an idea of how this pattern will vary when upsampled
with different algorithms.
Put a check in the row labeled
“High-frequency detail” for all modes that have clean,
bright, colorful high-resolution chroma bursts. If no modes
have good bursts, put a check for the mode that has the
best-looking ones.
Put a check in the row labeled
“Upsampling bilinear or Better” if the upsampling is clearly
something better than Nearest Neighbor.
 |
Chroma
Multiburst - Low Frequency Burst
|
 |
Chroma
Multiburst - High Frequency Burst
|
 |
Chroma
Multiburst - Multiple
Conversions
|
Chroma Zone
Plate This pattern is a good “at a
glance” pattern to see problems and issues with the chroma
channels. It has diagonals and high-frequency details that
make it possible to see quickly if there are problems with
the chroma, once you know what the pattern should look like.
This pattern should have clean,
clear colors all the way to the edge of the image, with no
obvious loss of color intensity in the corners. If the
corners are not as colorful as the center, or look like a
solid color, that shows loss of the highest chroma
resolution.
The other thing to look at is the
amount and strength of the moiré in the pattern. Moiré is a
visual impression of false curves caused by aliasing of the
true curves in the image. It looks like an optical illusion
of sorts, showing concentric circles on the left and right
sides of the screen, similar to the actual concentric
circles radiating out from the center. There is always a bit
of moiré in this pattern even when everything is perfect.
After viewing this pattern with
all of the different output modes selected sequentially, put
a check in the row labeled “Minimal moiré” for the mode that
has the smallest amount of moiré. If all of them have the
same amount of moiré, put a check in all of them.
Put a check in the row labeled
“Corner detail” for all of the modes that show the corners
of the image at full intensity with no falloff in brightness
or colorfulness.
 |
Chroma Zone Plate - Minimal
Moiré
|
 |
Chroma Zone Plate - Corner
Detail
|
 |
| Chroma Zone
Plate | Chroma Upsampling
Error This pattern was designed to test
for the Chroma Upsampling Error in MPEG decoders, but it’s
also useful for checking the smoothness of chroma
upsampling. It has diagonal chroma bursts in the bottom two
rows that make it easy to see any jaggedness or
stairstepping in the chroma channel. As you choose different
output modes from the player, if there is a big difference
between the quality of the upsampling algorithms, you’ll see
the diagonals vary between smooth and jagged. The best
quality upsampling will generally produce the smoothest
diagonals on these lines.
After viewing this pattern with
all of the different output modes selected sequentially, put
a check in the row labeled “Diagonals smooth” for the mode
that has the smoothest-looking diagonal lines. If they all
look the same, put a check in all the boxes. You might also
want to look at the diagonals and curves in the Chroma
Zone Plate pattern as well. Sometimes it’s easier to
see the differences on one or the other depending on the
specific display.
 |
Chroma Upsampling
Error
|
Clipping This
pattern tells you if any of the primary color channels is
being clipped above the reference white level at any point
in the chain. There are some popular HDMI transmitter chips
that clip the Y’ channel when converting 4:2:2 to 4:4:4 or
RGB, so when using a player with one of these chips, setting
the player to output anything other than 4:2:2 produces a
hard clip in the Y’ channel. A telltale sign of this is that
the Y’ (white) channel is clipped, but the red, green, and
blue channels are not clipped, or at least not completely
clipped.
Put a check in the row labeled,
“White not clipped” if the white portion of the pattern
shows concentric squares rather than a large solid square.
Put a check in the row labeled,
“Red, green, and blue not clipped” if the red, green, and
blue portions of the pattern show concentric squares rather
than a large solid square.
 |
| Clipping |
*© Copyright Secrets of Home Theater and
High Fidelity and are used with permission.
|