The Digital Darkroom

Most of the work of photography is done here.  When we shoot with film, unless we have the equipment, time and desire to process our own film and develop our own prints, we hand it over to a processing lab.  The lab then runs our film through its machines to produce cookie-cutter prints.

Likewise, when we shoot digital, use JPEG and print straight from the camera, we are automating the processing in much the same way the lab did with our film.

Although the instant feedback and the convenience of viewing and printing photos right from our home computer are a plus for digital photography, the true power of digital is our ability to control and fine-tune the output from the original capture to the printed result.  This process is called post-processing.  The steps we take to process our images are our workflow.

File Formats

In order for us to use the data captured by the digicam, it must be stored and be accessible.  The process of storing this information is called "writing" the data to the storage medium.  The way it is written is referred to as the "format" - not to be confused with formatting a storage device, which erases all data on the medium.

File formats are analogous to the way we format written text.  The language used in these lessons is English.  We have chosen to store this string of English language in a web page.  We could have chosen a variety of other formats such as a magazine, a newspaper, a text book, a leaflet, etc.  Each of the formats offers various advantages and disadvantages.  Some, such as magazines, offer higher quality but cost more.  Others, such as newspapers, offer cheap, fast distribution methods but lack in quality.

Computers store information in binary language.  For computers, size is equivalent to cost.  The larger the size, the more it costs to store the data in both time and equipment.

In the infancy of computer imaging, a number of different file formats were developed to address the large data files that result from high-color image maps.  Storage was, and still is, at a premium.  An uncompressed straight binary stream wasted valuable space, adding to the cost of storing, transmitting, loading, and editing large image maps.

The Joint Photographic Experts Group developed the JPEG format in response to these issues.  Their goal was to standardize an image file format while providing compression that significantly reduced the end size of the image map stored.  The compression used in this method is twofold.

First, it combines redundant data.  In an area of pixels of the same value, say 255 in our blown-out highlights, each pixel has a value of 255.  In an uncompressed image map, this is simply stored as a string of 255s.  With JPEG compression, it is stored as a single pair (255,x), where x is the number of pixels with that value (it is actually a little more complicated than that, but we will keep it simple for the sake of discussion).  Therefore, a simple image with very little detail can be significantly compressed.  Conversely, an image containing a wide range of colors and fine details is not as compressible.  This is referred to as "lossless" compression: the data can be extracted and the original image map reconstructed exactly.
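The (255,x) idea above is known as run-length encoding.  The following is a minimal sketch of it applied directly to pixel values, as in the simplified description; the function names are made up for this illustration, and a real JPEG encoder applies run-length and Huffman coding to transformed data rather than to raw pixels.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of equal values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]

# A strip of blown-out highlights with two slightly darker pixels in it.
highlights = [255] * 12 + [250, 248] + [255] * 6

encoded = rle_encode(highlights)
print(encoded)  # [(255, 12), (250, 1), (248, 1), (255, 6)]

# Lossless: decoding gives back exactly what went in.
print(rle_decode(encoded) == highlights)  # True
```

Twenty values shrink to four pairs here.  A strip in which every pixel differs from its neighbor would not shrink at all, which is why detailed images compress poorly.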

Second, the JPEG engine throws out information not normally visible to the human eye.  Our eyes are not perfect imaging devices.  We have many flaws our mind automatically overcomes in the visual process.  The JPEG format capitalizes on our ability to fill in missing detail and throws out visual information that is largely imperceptible.  The data is not included during the compression stage, and when decompression occurs, the missing data can no longer be retrieved.  This type of compression is referred to as "lossy" compression; the data extracted is different from the data originally compressed.
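One way to picture lossy compression is quantization: values are rounded to a coarser scale before storage, and the rounding error cannot be undone.  The sketch below uses made-up pixel values and a made-up step size, and it quantizes the values directly; it illustrates the principle only, not the actual JPEG pipeline.

```python
def quantize(values, step):
    """Store each value as a small integer: the nearest multiple of step."""
    return [round(v / step) for v in values]

def dequantize(levels, step):
    """Rebuild approximate values from the stored integers."""
    return [level * step for level in levels]

original = [52, 56, 61, 66, 70, 61, 64, 73]
step = 10  # hypothetical step size; a larger step means more loss

restored = dequantize(quantize(original, step), step)
print(restored)  # [50, 60, 60, 70, 70, 60, 60, 70]

# The difference between original and restored is the data that was lost.
errors = [o - r for o, r in zip(original, restored)]
print(errors)  # [2, -4, 1, -4, 0, 1, 4, 3]
```

The restored values are close to the originals but no longer identical, and nothing in the stored data says what the originals were.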

As a result of lossy compression techniques, a JPEG file, when decompressed, is different from the original image map that was compressed.  These differences are referred to as "artifacts."  These artifacts are cumulative.  Each time an image map is changed and recompressed, more data is thrown out.  Repeated compression / decompression / recompression cycles add more artifacts, to the point where they become visible.  Take any web image and enlarge it several times, as we have done above, and these artifacts will appear around sharp, high-contrast edges.

Furthermore, higher compression ratios throw out greater amounts of data.  The JPEG algorithm works in 8x8 pixel squares.  At higher compression ratios, so much information is discarded from these squares that they start looking different from each other when decompressed.  The result is a blocky mosaic look to highly compressed JPEG images.
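To see where the mosaic look comes from, consider the extreme case in which so much is thrown away that only the average brightness of each 8x8 square survives.  The sketch below assumes that extreme and uses a made-up gradient image; real JPEG files keep more than the average, so the effect is usually subtler.

```python
BLOCK = 8  # JPEG works on 8x8 pixel squares

def block_average(image):
    """Replace every 8x8 square with its mean value."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for top in range(0, height, BLOCK):
        for left in range(0, width, BLOCK):
            rows = range(top, min(top + BLOCK, height))
            cols = range(left, min(left + BLOCK, width))
            total = sum(image[r][c] for r in rows for c in cols)
            mean = round(total / (len(rows) * len(cols)))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out

# A smooth horizontal gradient, 16 pixels wide and 8 pixels tall.
gradient = [[x * 16 for x in range(16)] for _ in range(8)]
blocky = block_average(gradient)

print(gradient[0][6:10])  # [96, 112, 128, 144]  smooth steps of 16
print(blocky[0][6:10])    # [56, 56, 184, 184]   a hard edge between squares
```

The smooth ramp becomes two flat tiles with a jump of 128 at the boundary between them, which is the blocky edge we see in heavily compressed images.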

We have briefly discussed some of the advantages of using the RAW format, and it may seem I am plugging RAW as the format to use.  But what is RAW?

A JPEG file must first be processed into an image from sensor data (remember each pixel from the sensor is a single color gradient, either red, green, or blue).  The camera takes these individual channels and combines them into a true color pixel.  Settings such as image sharpening, white balance, hue adjustment, contrast, etc., are all applied to the resulting image, and it is compressed and stored.
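As a rough picture of how single-color sensor readings become a true color pixel, the sketch below assumes the sensor is laid out in repeating 2x2 cells of red, green, green, and blue (an assumption made for this illustration, not something stated above) and merges each cell into one RGB value.  Real cameras interpolate far more carefully so that every sensor pixel ends up with its own full-color value.

```python
def demosaic_rggb_cell(cell):
    """Merge one 2x2 cell (R G / G B) into a single (R, G, B) pixel.

    The two green readings are averaged; red and blue are used as-is.
    """
    (red, green_1), (green_2, blue) = cell
    return (red, (green_1 + green_2) // 2, blue)

# Hypothetical 12-bit readings from one 2x2 cell of the sensor.
cell = [[1200, 900],
        [940, 400]]

print(demosaic_rggb_cell(cell))  # (1200, 920, 400)
```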

A RAW file is a straight binary data dump from the imaging sensor without any processing applied to it.  Image data is stored directly as it is seen by the camera with settings such as white balance stored separately.  There are several advantages to doing this.

Dynamic Range:  The imaging sensor typically reads 12 bits of data per pixel (bpp).  JPEG files store only 8 bits per color channel.  This means that right off the bat, 4 bits of information are discarded.  This information is clipped off the low and high ends of the dynamic range, throwing out highlight and shadow detail.
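The arithmetic behind those bit counts can be sketched as follows.  The conversion shown simply drops the 4 lowest bits, which is an assumption made to keep the example short; a camera applies a tone curve when it builds the 8-bit image, so the exact mapping differs.

```python
def to_8bit(value_12bit):
    """Reduce a 12-bit sample (0..4095) to 8 bits (0..255) by dropping 4 bits."""
    return value_12bit >> 4

levels_12 = 2 ** 12  # distinct tones a 12-bit sample can hold
levels_8 = 2 ** 8    # distinct tones an 8-bit sample can hold
print(levels_12, levels_8, levels_12 // levels_8)  # 4096 256 16

# Sixteen neighboring 12-bit shadow tones collapse into one 8-bit tone.
shadow_samples = list(range(32, 48))
print(sorted({to_8bit(v) for v in shadow_samples}))  # [2]
```

Every 8-bit tone stands in for sixteen tones the sensor could tell apart, and that finer gradation is what the RAW file keeps.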

With a RAW file, this information is retained and a small amount of highlight detail recovery is possible in post-processing.  This also allows us to adjust the exposure value electronically.  It is often possible to correct 1 to 2 stops of over- or underexposure via this method, but I recommend that we keep EV compensation in post minimal.
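Each stop is a doubling or halving of light, so an exposure adjustment on linear RAW values amounts to multiplying by a power of two.  The sketch below assumes 12-bit linear samples and made-up values; RAW editors do considerably more than this, but the arithmetic shows why correction has limits.

```python
def adjust_exposure(linear_values, stops, max_value=4095):
    """Scale linear sensor values by 2**stops, clipping at the 12-bit ceiling."""
    gain = 2.0 ** stops
    return [min(max_value, round(v * gain)) for v in linear_values]

raw_samples = [100, 800, 2000, 3900]  # hypothetical 12-bit readings

print(adjust_exposure(raw_samples, -1))  # [50, 400, 1000, 1950]
print(adjust_exposure(raw_samples, +1))  # [200, 1600, 4000, 4095]
```

Pulling the exposure down by a stop keeps every value distinct, while pushing it up a stop drives the brightest sample into the ceiling, where its detail is clipped.  This is one reason to keep EV compensation in post small.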

Another benefit of having more bits available when working with images is that greater color accuracy is possible.  More information is recorded for each color and becomes available for us to work with.

Since RAW images are unprocessed, and the processing of the data into an actual image is done in post-processing with our RAW editors, we have control over how the data from the individual RGB color channels is combined.  In other words, we can correct white balance in post-processing to a larger degree than with JPEG images.
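White balance can be thought of as a separate gain applied to each color channel.  The sketch below is a simplified illustration with made-up 12-bit values: it takes a patch that should be neutral gray, works out the gains that equalize its channels against green, and applies them.  The function name is hypothetical and not taken from any RAW editor.

```python
def apply_white_balance(pixel, gains, max_value=4095):
    """Multiply each channel of an (R, G, B) pixel by its own gain."""
    return tuple(min(max_value, round(channel * gain))
                 for channel, gain in zip(pixel, gains))

# A gray card shot under warm light: too much red, too little blue.
gray_patch = (2000, 1600, 1000)

# Choose gains that bring red and blue to the level of green.
gains = tuple(gray_patch[1] / channel for channel in gray_patch)

print(apply_white_balance(gray_patch, gains))  # (1600, 1600, 1600)
```

Because the RAW file holds the channel data before any such gains are baked in, we are free to choose them after the fact.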

Additionally, as new RAW editors and third party RAW editors become available, we can re-process a RAW file to yield better image quality through new algorithms.  Although the camera's image processing software can be updated via firmware, once the data has been processed and written as a JPEG file, we cannot go back and reprocess it into a new JPEG file unless we kept the RAW file.  Furthermore, firmware releases are at the manufacturer's discretion and third party solutions do not exist.

Some manufacturers offer compressed RAW files that significantly reduce storage requirements, to the point where they are only slightly larger in file size than a quality JPEG.  This improves the overall write times and allows more RAW files to be stored per storage medium.

Another nearly ubiquitous image format is the Tagged Image File Format (TIFF).  This is nothing more than a linear map of bits making up the image (a bitmap).  Camera TIFF files are uncompressed and, as such, data is read from the file exactly as it was written.  No compression algorithms are involved.

This appears to be a boon over JPEG since it does not introduce artifacts and, more importantly, image quality does not degrade each time the image is edited and saved.  But…

To create the image file, the RAW data must be processed in exactly the same way as for a JPEG file.  The only difference is the compression technique, or lack thereof, applied when writing the file.  Although 16bpp TIFF files are available, manufacturers use 8bpp TIFF formats to cut down on the amount of space the end file takes up.

In fact, since a TIFF file combines 3 channels of color data into a single pixel, it tends to be larger than a RAW file, which has only a single channel of color data per pixel.  As a consequence, the files take a long time to write to the storage medium (up to 40 seconds in some digicams) and eat up huge amounts of space.
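The size difference follows from simple arithmetic.  The sketch below assumes a hypothetical 6-megapixel sensor, 12-bit uncompressed RAW data, and an 8-bit-per-channel TIFF, and it ignores headers and metadata, so the figures are estimates only.

```python
def raw_size_bytes(pixels, bits_per_pixel=12):
    """One color sample per sensor pixel."""
    return pixels * bits_per_pixel // 8

def tiff_size_bytes(pixels, channels=3, bits_per_channel=8):
    """Three color samples (R, G, B) per image pixel."""
    return pixels * channels * bits_per_channel // 8

pixels = 6_000_000  # hypothetical 6-megapixel sensor

raw = raw_size_bytes(pixels)
tiff = tiff_size_bytes(pixels)
print(raw, tiff, tiff / raw)  # 9000000 18000000 2.0
```

Under these assumptions the TIFF is twice the size of the RAW file, even though it holds fewer bits of tonal information per color.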

Although lossless compression algorithms exist for TIFF files (the decompressed file is exactly the same as the data originally compressed), cameras rarely use them.  The best known is LZW, Lempel-Ziv-Welch, named after its developers; its patents have expired, so it is now royalty free and available to the general public.  However, it is seldom employed in cameras.  As such, camera TIFF files are recorded uncompressed.

If our camera does not support RAW, JPEG at its minimum compression setting produces files where artifacts are imperceptible under all but the closest scrutiny.  It is essentially the same quality as a TIFF file, yet takes up about one-tenth the space, resulting in faster write times, faster read times and greater image storage capacity.

TIFF is simply a space hog…  Do not use it.