The Digital Darkroom
Most of the work of photography is done here. When we shoot with film, unless we
have the equipment, time and desire to process our own film and develop
our own prints, we hand it over
to a processing lab. The lab then runs our film through its machines to
produce cookie-cutter prints.
Likewise, when we shoot digital, use JPEG and print
straight from the camera, we are automating the processing process in
much the same way the lab did with our film.
Although the instant feedback and convenience of
viewing and printing of photos right from our home computer is a plus
for digital photography, the true power of digital is our ability to control
and fine tune the output from the original capture to the printed result. This process is called post-processing. The steps we take to process our
images are our workflow.
File Formats
In order for us to use the data captured by the digicam,
it must be stored and be accessible. The process of storing this information
is called "writing" the data to the storage medium. The way
it is written is referred to as the "format" - not to be confused
with formatting a storage device, which erases all data on the medium.
File formats are analogous to the way we format written
text. The language used
in these lessons is English. We have chosen to store this string of English
language in a web page. We could have chosen a variety of other formats such as a magazine, a newspaper, a textbook, a leaflet, etc. Each of the formats offers various advantages and disadvantages. Some,
such as magazines, offer higher quality but cost more. Others, such as
newspapers, offer cheap, fast distribution methods but lack in quality.
Computers store information in binary language.
For computers, size is equivalent to cost. The larger the size, the
more it costs to store the data in both time and equipment.
JPEG vs. RAW
In the infancy of computer imaging, a number of different file formats
were developed to address the large data files that result from
high-color image maps. Storage was, and still is, at a premium.
An uncompressed straight binary stream wastes valuable space, adding to
the cost of storing, transmitting, loading, and editing large image maps.
JPEG:
The Joint Photographic Experts Group developed the JPEG format in response
to these issues. Their goal was to standardize an image file format while
providing compression that significantly reduced the end size of the image
map stored. The compression used in this method is twofold.
First, it combines redundant data. In an
area of pixels of the same value, say 255 in our blown out highlights,
each pixel has a value of 255. In an uncompressed image map, this is simply
stored as a string of 255s. With JPEG compression, it is stored as
a single value of (255,x), where x is the number of pixels with that value (it
is actually a little more complicated than that, but we will keep it simple
for the sake of discussion). Therefore, a simple image with very little detail can be significantly
compressed. Conversely, an image containing a wide range of colors and
fine details is not as compressible. This is referred to as
"lossless" compression; the data can be extracted and the
original image map reconstructed.
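The (255,x) idea above can be sketched in a few lines of Python. This is plain run-length encoding, which is the simplification used in this lesson and not the actual JPEG encoder; the function names are made up for illustration.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of identical values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (value, count) pairs back into the original pixel list."""
    return [value for value, count in pairs for _ in range(count)]

# Six blown-out highlight pixels followed by two darker ones.
row = [255, 255, 255, 255, 255, 255, 128, 64]
encoded = rle_encode(row)
decoded = rle_decode(encoded)

print(encoded)         # [(255, 6), (128, 1), (64, 1)]
print(decoded == row)  # True: nothing was lost
```

Notice that the run of highlights shrinks to a single pair, while the two detailed pixels gain nothing. That is why flat images compress well and busy ones do not.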
Secondly, the JPEG engine throws out information not normally
visible to the human eye. Our eyes are not perfect imaging devices. We
have many flaws our mind automatically overcomes in the visual process. The JPEG format capitalizes on our
ability to fill in missing detail and throws out visual information
that is largely imperceptible. The data is not included during the
compression stage, and when decompression occurs, the missing data can no
longer be retrieved. This type of compression is referred to as "lossy"
compression; the data extracted is different from the data
originally compressed.
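To see why discarded data cannot come back, here is a minimal sketch of a lossy step. It is an assumption-laden toy that rounds values into coarse buckets, not the real JPEG method, and the step size of 8 is an arbitrary choice for illustration.

```python
STEP = 8  # hypothetical coarseness; larger steps discard more detail

def compress(values, step=STEP):
    """Keep only which step-sized bucket each value falls into."""
    return [v // step for v in values]

def decompress(buckets, step=STEP):
    """Rebuild each value as the midpoint of its bucket."""
    return [b * step + step // 2 for b in buckets]

original = [200, 203, 207, 210]
restored = decompress(compress(original))

print(restored)              # [204, 204, 204, 212]
print(restored == original)  # False: the fine differences are gone
```

The first three pixels were slightly different shades before compression. Afterwards they are identical, and no decompressor can tell what they used to be.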
As a result of lossy compression techniques, a JPEG file, when decompressed,
is different from the original image map that was compressed. These differences
are referred to as "artifacts." These artifacts are cumulative. Each time an image map is changed and recompressed, more data is thrown
out. Repeated compression / decompression / recompression cycles result in
the addition of more artifacts to the point where they become visible.
Take any web image and enlarge it several times, as we have done
above, and these artifacts
will appear around sharp, high-contrast edges.
Furthermore, higher compression ratios throw out
greater amounts of data. The JPEG algorithm works in 8x8 pixel squares. At higher compression ratios, so much information is discarded from
these squares that they start to look different from each other when decompressed. The result is a blocky mosaic look to highly compressed JPEG images.
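The blocky look can be imitated with a rough sketch. Here we assume the most extreme case, where each 8x8 square keeps nothing but its average brightness. Real JPEG compression keeps more than that, so treat this only as an exaggeration of the effect; the function name is hypothetical.

```python
BLOCK = 8  # JPEG works on 8x8 pixel squares

def flatten_blocks(image, block=BLOCK):
    """Replace every block x block square with its average value."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, block):
        for left in range(0, width, block):
            ys = range(top, min(top + block, height))
            xs = range(left, min(left + block, width))
            tile = [image[y][x] for y in ys for x in xs]
            mean = sum(tile) // len(tile)
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out

# A smooth 16x16 horizontal gradient from dark (0) to bright (240).
gradient = [[x * 16 for x in range(16)] for _ in range(16)]
blocky = flatten_blocks(gradient)

print(gradient[0])  # sixteen smoothly rising values
print(blocky[0])    # eight 56s followed by eight 184s
```

The smooth ramp turns into two flat squares with a hard step between them, which is the mosaic pattern we see in heavily compressed images.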
RAW:
We have briefly discussed some of the advantages of using the RAW format,
and it seems I am plugging RAW as the format to use. But what is RAW?
A JPEG file must first be processed into an image
from sensor data (remember each pixel from the sensor is a single color
gradient, either red, green, or blue). The camera takes these individual
channels and combines them into a true color pixel. Settings such as image
sharpening, white balance, hue adjustment, contrast, etc., are all
applied to the resulting image, and it is compressed and stored.
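As a rough illustration of combining single-color sensor pixels into a true color pixel, the sketch below assumes a 2x2 cell laid out as red, green / green, blue. That layout and the averaging of the two greens are assumptions for illustration. Actual cameras use more elaborate interpolation and do not simply merge four sensor pixels into one.

```python
def combine_rggb_cell(cell):
    """Turn one 2x2 cell of single-color readings into an (R, G, B) pixel.

    Assumed layout:  [[red,   green],
                      [green, blue ]]
    """
    (red, green_a), (green_b, blue) = cell
    return (red, (green_a + green_b) // 2, blue)

# Hypothetical 12-bit readings from four neighboring sensor pixels.
cell = [[900, 2000],
        [2100, 400]]

print(combine_rggb_cell(cell))  # (900, 2050, 400)
```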
A RAW file is a straight binary data dump
from the imaging sensor without any processing applied to it. Image data
is stored directly as it is seen by the camera with settings such as
white balance stored separately. There are several advantages
to doing this.
Dynamic Range: The imaging sensor typically
reads 12 bits of data per pixel (bpp). JPEG files only store 8bpp. This
means that right off the bat, 4bpp of information is disregarded. This
information is clipped off the low and high ends of the dynamic range,
throwing out highlight and shadow detail.
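The difference between 12 and 8 bits is easier to appreciate as a count of distinct brightness levels per pixel. The short calculation below is simple arithmetic; the function name is ours.

```python
def tonal_levels(bits):
    """Number of distinct values a pixel can take at a given bit depth."""
    return 2 ** bits

sensor_levels = tonal_levels(12)
jpeg_levels = tonal_levels(8)

print(sensor_levels)                 # 4096
print(jpeg_levels)                   # 256
print(sensor_levels // jpeg_levels)  # 16
```

In other words, the sensor distinguishes 16 times as many levels as an 8bpp file can hold.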
With a RAW file, this information is retained and
a small amount of highlight detail recovery is possible in post processing. This also allows
us to adjust the exposure value electronically. It is often possible
to correct up to 1 to 2 stops of over or underexposure via this method,
but I recommend that we try to keep EV compensation in post minimal.
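Adjusting exposure electronically can be sketched as follows, assuming linear 12-bit sensor values and using the fact that each stop doubles or halves the light. Real RAW editors do considerably more than this; the function name and sample values are hypothetical.

```python
MAX_12_BIT = 4095  # largest value a 12-bit pixel can hold

def adjust_exposure(value, stops, max_value=MAX_12_BIT):
    """Scale a linear pixel value by 2**stops and clip to the valid range."""
    scaled = round(value * 2 ** stops)
    return max(0, min(max_value, scaled))

print(adjust_exposure(1000, 1))   # 2000: one stop brighter
print(adjust_exposure(1000, -1))  # 500: one stop darker
print(adjust_exposure(3000, 1))   # 4095: clipped, detail is lost
```

The last line hints at why heavy compensation in post is best avoided: values pushed past the top of the range are clipped.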
Another benefit of having more bits available
when working with images is that greater color accuracy is possible. More
information is recorded for each color and becomes available for us to
work with.
Since RAW images are unprocessed and the processing of
the data into an actual image is done in post processing with our RAW
editors, this allows us control over how the data from individual RGB
color channels are combined. In other words, we can correct white
balance in post
processing to a larger degree than with JPEG images.
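One simple way to picture white balance correction is as a separate multiplier for each color channel. The sketch below assumes that model; the gain values are invented for illustration and do not come from any particular camera or editor.

```python
MAX_12_BIT = 4095  # largest value a 12-bit channel can hold

def apply_white_balance(pixel, gains, max_value=MAX_12_BIT):
    """Multiply each of the R, G, B values by its own gain, then clip."""
    return tuple(min(max_value, round(c * g)) for c, g in zip(pixel, gains))

# A pixel with a green cast, and hypothetical gains that boost red and blue.
pixel = (1000, 2000, 1500)
gains = (2.0, 1.0, 1.5)

print(apply_white_balance(pixel, gains))  # (2000, 2000, 2250)
```

With a RAW file the gains are applied to the full 12-bit data, which leaves more room for the adjustment than an already processed 8bpp image.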
Additionally, as new RAW editors and third party RAW
editors become available, we can re-process a RAW file to yield better
image quality through new algorithms. Although we can change camera image
processing software via firmware, once the data has been processed and
written as a JPEG file, we can no longer go back and reprocess the RAW
data into a new JPEG file if we do not have the RAW file to begin with.
Furthermore, firmware releases are at the manufacturer's discretion, and
third-party solutions do not exist.
Some manufacturers offer compressed RAW files that
significantly reduce storage requirements to the point where they are only slightly
larger in file size than a quality JPEG. This improves the overall write
times and allows for storage of more RAW files per storage medium.
JPEG vs. TIFF
Another nearly ubiquitous image format is the Tagged Image File Format (TIFF). This is nothing more than a linear map of bits making up
the image (a bitmap). TIFF files are uncompressed and, as such, data is
read from the file in exactly the same manner it was written. No
compression algorithms
are involved.
This appears to be a boon over JPEG since it does
not introduce artifacts and, more importantly, image quality does not
degrade each time the image is edited and saved. But
to create the image file, the RAW data must be processed
in exactly the same way as for a JPEG file. The only difference is the
compression technique, or lack thereof, applied when writing the file. Although 16bpp TIFF files are available, manufacturers use 8bpp TIFF formats
to cut down on the amount of space the end file takes up.
In fact, since a TIFF file combines 3 channels of
color data into a single pixel, it tends to be larger than RAW files which
have only a single channel of color data per pixel. As a consequence,
the files take a long time to write to the storage medium (up to 40 seconds
in some digicams) and eat up huge amounts of space.
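The size difference is straightforward arithmetic. The sketch below assumes a hypothetical 6-megapixel sensor, uncompressed data, and no allowance for file headers, so the numbers are estimates only.

```python
def file_size_bytes(pixels, channels, bits_per_channel):
    """Uncompressed image data size, ignoring headers and metadata."""
    return pixels * channels * bits_per_channel // 8

PIXELS = 6_000_000  # hypothetical 6-megapixel sensor

# RAW: one 12-bit color value per pixel.
raw_size = file_size_bytes(PIXELS, channels=1, bits_per_channel=12)
# TIFF: three 8-bit color values (R, G, B) per pixel.
tiff_size = file_size_bytes(PIXELS, channels=3, bits_per_channel=8)

print(raw_size)   # 9000000 bytes, about 9 MB
print(tiff_size)  # 18000000 bytes, about 18 MB
```

Under these assumptions the 8bpp TIFF is twice the size of the RAW file, even though it holds fewer bits of information per color.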
Although lossless compression algorithms exist for TIFF
files (the decompressed file is exactly
the same as the data originally compressed), support for them
is not uniform. An algorithm such as LZW (Lempel-Ziv-Welch, named
after its developers) is available to the
general public. However, it is seldom employed in cameras. As
such, camera TIFF files are recorded uncompressed.
If our camera does not support RAW, JPEG at its
minimum compression setting produces files where artifacts are imperceptible
under all but the closest scrutiny. It is essentially the same quality as
a TIFF file, yet takes up roughly one-tenth the space, resulting in faster write times,
faster read times and greater image storage capacity.
TIFF is simply a space hog.
Do not use it.