the image histogram (iii) – useful information

Some people think that the histogram is some sort of panacea for digital photography, a means of deciding whether an image is “perfect” enough. Others completely disregard the statistical information it provides. This leads us to ask what useful information there is in a histogram, and how we should go about interpreting it.

A plethora of information

A histogram maps the brightness or intensity of every pixel in an image. But what does this information tell us? One of the main roles of a histogram is to provide information on the tonal distribution in an image. This is useful for determining whether there is something askew with the visual appearance of an image. Histograms can be viewed live/in-camera, for the purpose of determining whether or not an image has been correctly exposed, or used during post-processing to fix aesthetic deficiencies. These deficiencies can occur during the acquisition process, or can be intrinsic to the image itself, e.g. faded vintage photographs. Examples include blown highlights, or a lack of contrast.

A histogram can tell us many different things about how intensities are distributed throughout the image. Figure 1 shows an example of a colour image (a photograph taken in Bergen, Norway), its associated grayscale image, and their histograms. The histogram spans the entire range of intensity values. Midtones comprise 66% of the pixels in the image, with the majority skewed towards the lighter midtone values (the largest hump in the histogram). Shadow pixels comprise only 7% of the whole image, and are indeed associated with shaded regions in the image. Highlights relate to regions like the white building on the left, and some of the clouds. There are very few pure white pixels, the exception being the shopfront signs. Some of the major features in the histogram are indicated in the image.

Fig.1: A colour image and its histograms

There is no perfect histogram

Before we get into the nitty-gritty, there is one thing that should be made clear. Infographics on the internet sometimes tout the myth of a “perfect” or “ideal” histogram. The reality is that such infographics are very misleading: there is no such thing as a perfect histogram. The notion of the ideal histogram is one that is shaped like a “bell”, but there is no reason why the distribution of intensities should be that evenly spread. Here is the usual description of an ideal image: “An ideal image has a histogram which has a centred hill-type shape, with no obvious skew, and a form that is spread across the entire histogram (and without clipping)”.

Fig.2: A bell-shaped curve

But a scene may be naturally darker or lighter than the midtone-heavy distribution of a bell-shaped histogram implies. Photographs taken in the latter part of the day will be naturally darker, as will photographs of dark objects. Conversely, a photograph of a snowy scene will skew to the right. Consider the picture of the Toronto skyline taken at night shown in Figure 3. Obviously the histogram doesn’t come close to being “perfect”, but the majority of the scene is dark – exactly what you would expect of a night scene – and hence the histogram is representative of it. In this case the low-key histogram is ideal.

Fig.3: A dark image with a skewed histogram

Interpreting a histogram

Interpreting a histogram usually involves examining the size and uniformity of the distribution of intensities in the image. The first thing to do is to look at the overall curve of the histogram to get some idea of its shape characteristics. The curve visually communicates the number of pixels at any one particular intensity.

First, check for any noticeable peaks, dips, or plateaus. For example, peaks generally indicate a large number of pixels within a certain intensity range, while plateaus indicate a uniform distribution of intensities. Check to see if the histogram is skewed to the left or right. A left-skewed histogram might indicate underexposure, a scene that is itself dark (e.g. a night scene), or one containing dark objects. A right-skewed histogram may indicate overexposure, or a scene full of white objects. A centred histogram may indicate a well-exposed image, because it is full of midtones. A small, uniform hill may indicate a lack of contrast.

Next, look at the edges of the histogram. A histogram with peaks pressed against either edge may indicate some loss of information, a phenomenon known as clipping. For example, if clipping occurs on the right side – known as highlight clipping – the image may be overexposed in some areas. This is a common occurrence on semi-bright overcast days, where the clouds can become blown out. But of course this is relative to the scene content of the image. As well as shape, the histogram shows how pixels are grouped into tonal regions, i.e. the highlights, shadows, and midtones.

Consider the example shown below in Figure 4. Some might interpret this as somewhat of an “ideal” histogram. Most of the pixels appear in the midtones region of the histogram, with few blacks below 17 and few whites above 211. This is a well-formed image, except that it lacks some contrast. Stretching the histogram over the entire range of 0–255 could help improve the contrast (a sketch of this is given below).

Fig.4: An ideal image with a central “hump” (but lacking some contrast)
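Such a stretch is straightforward to do: linearly map the occupied range (roughly 17–211 here) onto the full 0–255 scale. A minimal sketch in Python using NumPy and Pillow (the filename is a placeholder):

```python
import numpy as np
from PIL import Image

# Load as 8-bit grayscale; the filename is illustrative.
img = np.asarray(Image.open("scene.jpg").convert("L")).astype(np.float64)

# Use the darkest and lightest values actually present (about 17 and 211
# in the example above) as the stretch limits.
lo, hi = img.min(), img.max()

# Linearly map [lo, hi] onto the full 0-255 range.
stretched = np.clip((img - lo) / (hi - lo) * 255.0, 0, 255).astype(np.uint8)

Image.fromarray(stretched).save("scene_stretched.jpg")
```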

Now consider a second example. This picture in Figure 5 is of a corner grocery store in Montreal and has a histogram with a multipeak shape. The three distinct features almost fit into the three tonal regions: the shadows (dark blue regions, and empty dark space to the right of the building), the midtones (e.g. the road), and the highlights (the light upper brick portion of the building). There is nothing intrinsically wrong with this histogram, as it accurately represents the scene in the image.

Fig.5: An image with multiple peaks in its histogram

Remember, if the image looks okay from a visual perspective, don’t second-guess minor disturbances in the histogram.

Next: More on interpretation – histogram shapes.

the image histogram (ii) – grayscale vs colour

In terms of image processing there are two basic types of histogram: (i) colour histograms, and (ii) intensity (or luminance/grayscale) histograms. Figure 1 shows a colour image (an aerial shot of Montreal), and its associated RGB and intensity histograms. Colour histograms are essentially RGB histograms, typically represented by three separate histograms, one for each of the components – Red, Green, and Blue. The three R, G, B histograms are sometimes shown as one mixed histogram, with all three components overlaid on one another (sometimes including an intensity histogram as well).

Fig.1: Colour and grayscale histograms

Both RGB and intensity histograms contain the same basic information – the distribution of values. The difference lies in what the values represent. In an intensity histogram, the values represent the intensities of the pixels in a grayscale image (typically 0 to 255). In an RGB histogram, divided into individual R, G, and B histograms, each channel’s histogram is simply a graph of the frequency of that component’s value across all pixels.

An example is shown in Figure 2. Here a single pixel is extracted from an image. The RGB triplet for the pixel is (230,154,182), i.e. it has a red value of 230, a green value of 154, and a blue value of 182. Each value is counted in its respective bin in the associated component histogram, so the red value 230 is counted in the bin marked “230” in the red histogram. The three R, G, B histograms are visually no different from an intensity histogram. Note that the individual R, G, and B histograms do not represent distributions of colours, but merely distributions of components – for that you need a 3D histogram (see the note at the end of this post).

Fig.2: How an RGB histogram works: From single RGB pixel to RGB component histograms
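A minimal sketch of how such component histograms can be computed, counting each pixel’s R, G, and B values into three separate 256-bin histograms (the filename is a placeholder):

```python
import numpy as np
from PIL import Image

rgb = np.asarray(Image.open("montreal.jpg").convert("RGB"))

# One 256-bin histogram per channel: bin i counts how many pixels have
# that component value equal to i.
hist_r = np.bincount(rgb[..., 0].ravel(), minlength=256)
hist_g = np.bincount(rgb[..., 1].ravel(), minlength=256)
hist_b = np.bincount(rgb[..., 2].ravel(), minlength=256)

# The single pixel (230, 154, 182) from Figure 2 adds one count to
# hist_r[230], one to hist_g[154], and one to hist_b[182].
```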

Applications portray colour histograms in many different forms. Figure 3 shows the RGB histograms from three different applications: Apple Photos, ImageJ, and ImageMagick. Apple Photos provides the user with the option of showing the luminance histogram, the mixed RGB histogram, or the individual R, G, B histograms. The combined histogram shows all three R, G, B histograms overlaid, with a gray region showing where all three overlap. ImageJ shows the three components in separate histograms, and ImageMagick provides the option of combined or separate histograms. Note that some histograms (e.g. ImageMagick’s) seem a little “compressed”, because of the chosen x-scale.

Fig.3: How RGB histograms are depicted in applications

One thing you may notice when comparing intensity and RGB histograms is that the intensity histogram is very similar to the green channel of the RGB image (see Figure 4). The human eye is more sensitive to green light than to red or blue light, so the green intensity levels within an image are typically the most representative of the brightness distribution of the colour image.

Fig.4: The RGB-green histogram versus the intensity histogram

An intensity image is normally created from an RGB image by converting each pixel to a value based on a weighted average of the three colour components at that pixel. This weighting assumes that green contributes 59% of the perceived intensity, while the red and blue channels account for just 30% and 11%, respectively. Here is the actual formula used:

gray = 0.299R + 0.587G + 0.114B

Once you have a grayscale image, it can be used to derive an intensity histogram. Figure 5 illustrates how a grayscale image is created from an RGB image using this formula.

Fig.5: Deriving a grayscale image from an RGB image
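A minimal sketch of this conversion, applying the weighted-average formula directly to each pixel (the filename is a placeholder; Pillow’s own convert("L") uses these same ITU-R BT.601 weights):

```python
import numpy as np
from PIL import Image

rgb = np.asarray(Image.open("montreal.jpg").convert("RGB")).astype(np.float64)

# Weighted average of the three components, using the weights quoted above.
gray = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
gray = np.clip(np.round(gray), 0, 255).astype(np.uint8)

Image.fromarray(gray).save("montreal_gray.png")
```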

Honestly, there isn’t really that much useful data in RGB histograms, although they seem to be very common in image manipulation applications and digital cameras. The problem lies with the nature of the RGB colour space: it is a space in which chrominance and luminance are coupled together, and as such it is difficult to manipulate any one of the channels without causing shifts in colour. Typically, applications that allow manipulation of the histogram do so by first converting the image to a decoupled colour space such as HSB (Hue-Saturation-Brightness), where the brightness can be manipulated independently of the colour information.
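A minimal sketch of this decoupled approach, using Pillow’s HSV mode as a stand-in for HSB: only the brightness (V) channel is adjusted, so hue and saturation – and hence the colours – are untouched (the filename and the simple stretch used here are illustrative):

```python
import numpy as np
from PIL import Image

img = Image.open("photo.jpg").convert("HSV")
h, s, v = img.split()

# Manipulate only the brightness channel, here with a simple linear stretch.
v = np.asarray(v).astype(np.float64)
v = (v - v.min()) / (v.max() - v.min()) * 255.0
v = Image.fromarray(v.astype(np.uint8))

# Recombine with the untouched hue and saturation: no colour shift.
result = Image.merge("HSV", (h, s, v)).convert("RGB")
result.save("photo_adjusted.jpg")
```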

A Note on 3D RGB: Although it would be somewhat useful, very few applications provide a true 3D histogram constructed from the combined R, G, and B information. One reason is that such a 3D structure would be very sparse: instead of three histograms of 256 bins each, there is now a single 3D histogram with 256³ = 16,777,216 bins. The other reason is that 3D histograms are hard to visualize.

the image histogram (i) – what is it?

An image is really just a collection of pixels of differing intensities, regardless of whether it is a grayscale (achromatic) or colour image. Exploring the pixels collectively helps provide an insight into the statistical attributes of an image. One way of doing this is by means of a histogram, which represents statistical information in a visual format. Using a histogram it is easy to determine whether there are issues with an image, such as over-exposure. In fact histograms are so useful that most digital cameras offer some form of real-time histogram in order to prevent poorly exposed photographs. Histograms can also be used in post-processing situations to improve the aesthetic appeal of an image.

Fig.1: A colour image with its intensity histogram overlaid.

A histogram is simply a frequency distribution, represented in the form of a graph. An image histogram, sometimes called an intensity histogram, describes the frequency of the intensity (brightness) values that occur in an image. Sometimes, as in Figure 1, the histogram is represented as a bar graph; other times it appears as a line graph. The graph typically has “brightness” on the horizontal axis, and “number of pixels” on the vertical axis. The brightness axis is a linear scale from 0, which represents black, to some maximum value (e.g. 255), which represents white.

Fig.2: A grayscale image and its histogram.

An image histogram, H, contains N bins, with each bin containing a value representing the number of times an intensity value occurs in the image. So a histogram for a typical 8-bit grayscale image with 256 gray levels would have N=256 bins. Each bin in the histogram, H[i], represents the number of pixels in the image with intensity i. Therefore H[0] is the number of pixels with intensity 0 (black), H[1] the number of pixels with intensity 1, and so forth, up to H[255], which is the number of pixels with the maximum intensity value, 255 (i.e. white).

A histogram can be used to explore the overall information in an image. It provides a visual characterization of the intensities, but does not convey any spatial information, i.e. how the pixels physically relate to one another in the image. This is normal, because the main function of a histogram is to represent statistical information in a compact form. The frequency data can be used to calculate the minimum and maximum intensity values, the mean, and even the median.
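A minimal sketch of building H and recovering simple statistics from it, assuming an 8-bit grayscale image loaded with Pillow (the filename is a placeholder):

```python
import numpy as np
from PIL import Image

gray = np.asarray(Image.open("scene.jpg").convert("L"))

# H[i] = number of pixels with intensity i, for i = 0..255.
H = np.bincount(gray.ravel(), minlength=256)

# Statistics recovered from the frequency data alone.
levels = np.arange(256)
total = H.sum()
minimum = levels[H > 0].min()            # darkest intensity present
maximum = levels[H > 0].max()            # lightest intensity present
mean = (levels * H).sum() / total        # mean intensity
median = int(np.searchsorted(np.cumsum(H), total / 2))  # median intensity

print(minimum, maximum, mean, median)
```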

This series will look at the various types of histograms, how they can be used to produce better pictures, and how they can be manipulated to improve the aesthetics of an image.

Image sharpening – image content and filter types

How well a sharpening filter works is really contingent upon the content of an image. Increasing the size of a filter may have some impact, but it may also have no perceptible impact whatsoever. Consider the following photograph of the front of a homewares store taken in Oslo.

A storefront in Oslo with a cool font

The image (which is 1500×2000 pixels – downsampled from a 12MP image) contains a lot of fine detail, from the store’s signage, to small objects in the window, text throughout the image, and even the lines on the pavement. So sharpening would have an impact on the visual acuity of this image. Here is the image sharpened using the “Unsharp Mask” filter in ImageJ (radius=10, mask-weight=0.3). You can see the image has been sharpened, as much by the increase in contrast as anything else.

Image sharpened with Unsharp masking radius=10, mask-weight=0.3

Here is a close-up of two regions, showing how increasing the sharpness has effectively increased the contrast.

Pre-filtering (left) vs. post-sharpening (right)

Now consider an image of a landscape (also from a trip to Norway). Landscape photographs tend to lack the same type of detail found in urban photographs, so sharpening will have a different effect on these types of image. The impact of sharpening will be reduced in most of the image, and will really only manifest itself in the very thin linear structures, such as the trees.

Sharpening tends to work best on features of interest that already have some contrast between the feature and its surrounding area. Features that are too thin can sometimes become distorted. Indeed, sometimes large photographs do not need any sharpening, because the human eye can interpret the details in the photograph as they are, and increasing sharpness may just distort them. Again, this is one of the reasons image processing relies heavily on aesthetic appeal. Here is the image sharpened using the same parameters as the previous example:

Image sharpened with Unsharp masking radius=10, mask-weight=0.3

There is a small change in contrast, most noticeable in the linear structures, such as the birch trees. Again, the filter uses contrast to improve acuity (note that if the filter were small, say with a radius of 3 pixels, the result would be minimal). Here is a close-up of two regions.

Pre-filtering (left) vs. post-sharpening (right)

Note that the type of filter also impacts the quality of the sharpening. Compare the above results with those of the ImageJ “Sharpen” filter, which uses a kernel of the form:

ImageJ “Sharpen” filter
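ImageJ’s documentation describes “Sharpen” as replacing each pixel with a weighted average of its 3×3 neighbourhood, with −1 weights around a centre weight of 12, normalised by the kernel sum of 4. A sketch of applying such a kernel with SciPy (not ImageJ’s actual code):

```python
import numpy as np
from PIL import Image
from scipy.ndimage import convolve

gray = np.asarray(Image.open("storefront.jpg").convert("L")).astype(np.float64)

# 3x3 sharpening kernel: -1 everywhere except a centre weight of 12,
# normalised by the kernel sum (4) so flat regions keep their value.
kernel = np.array([[-1, -1, -1],
                   [-1, 12, -1],
                   [-1, -1, -1]], dtype=np.float64) / 4.0

sharpened = np.clip(convolve(gray, kernel, mode="nearest"), 0, 255).astype(np.uint8)
Image.fromarray(sharpened).save("storefront_sharpen3x3.png")
```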

Notice that the “Sharpen” filter produces more detail, but at the expense of possibly overshooting some regions in the image, and making the image appear grainy. There is such a thing as too much sharpening.

Original vs. ImageJ “Unsharp Masking” filter vs. ImageJ “Sharpen” filter

So in conclusion, the aesthetic appeal of an image which has been sharpened is a combination of the type of filter used, the strength/size of the filter, and the content of the image.

Image sharpening in colour – how to avoid colour shifts

It is unavoidable – processing colour images with some types of algorithm may cause subtle changes in the colours of an image, which affect its aesthetic value. We have seen this with certain unsharp masking parameters in ImageJ. How do we avoid this? One way is to create a more complicated algorithm, but the reality is that without knowing exactly how a pixel contributes to an object, that is basically impossible. Another way, which is far more convenient, is to use a separable colour space. RGB is not separable – the red, green, and blue components must work together to form an image; modify one of these components, and it will have an effect on the others. However, if we use a colour space such as HSV (Hue-Saturation-Value), HSB (Hue-Saturation-Brightness), or CIELAB, we can avoid colour shifts altogether. This is because these colour spaces separate luminance from colour information, so image sharpening can be performed on the luminance layer only – something known as luminance sharpening.

Luminance, brightness, or intensity can be thought of as the “structural” information in the image. For example, first we convert an image from RGB to HSB, then process only the brightness layer of the HSB image, and then convert back to RGB (a code sketch of this pipeline is given at the end of this post). As an illustration, below are two original regions extracted from an image, both containing differing levels of blur.

Original “blurry” image

Here is the RGB processed image (UM, radius=10, mask weight=0.5):

Sharpened using RGB colour space

Note the subtle changes in colour in the region surrounding the letters – almost a halo-type effect. This sort of colour shift should be avoided. Now, below is the HSB-processed image, using the same parameters applied to only the brightness layer:

Sharpened using the Brightness layer of HSB colour space

Notice that there are acuity improvements in both images; however, the effect is more apparent in the right half, “rent K”. The black objects in the left half have had their contrast improved, i.e. the black got blacker against the yellow background, and hence their acuity has been marginally enhanced. Neither image suffers from colour shifts.
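As mentioned above, here is a minimal sketch of the convert–sharpen–convert-back pipeline, using Pillow’s HSV mode as a stand-in for HSB and a Gaussian-based unsharp mask; the sigma and weight values are illustrative, not exact equivalents of ImageJ’s parameters:

```python
import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

img = Image.open("molle.jpg").convert("HSV")   # filename is a placeholder
h, s, v = img.split()
v = np.asarray(v).astype(np.float64)

# Sharpen only the brightness channel: subtract a Gaussian-blurred copy to
# extract the high-frequency detail, then add a weighted amount of it back.
blurred = gaussian_filter(v, sigma=10)
amount = 0.5
sharp_v = np.clip(v + amount * (v - blurred), 0, 255).astype(np.uint8)

# Recombine with the untouched hue and saturation: no colour shift.
result = Image.merge("HSV", (h, s, Image.fromarray(sharp_v))).convert("RGB")
result.save("molle_luminance_sharpened.jpg")
```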

Unsharp masking in ImageJ – changing parameters

In a previous post we looked at whether image blur could be fixed, and concluded that some of it could be slightly reduced, but heavy blur likely could not. Here is the image we used, showing blur at two ends of the spectrum.

Blur at two ends of the spectrum: heavy (left) and light (right).

Now, the “Unsharp Masking” filter in ImageJ is not terribly different from that found in other applications. It allows the user to specify a “radius” for the Gaussian blur filter, and a mask weight (0.1–0.9). How does modifying these parameters affect the filtered image? Here are some examples using a radius of 10 pixels, and a variable mask weight.

Radius = 10; Mask weight = 0.25
Radius = 10; Mask weight = 0.5
Radius = 10; Mask weight = 0.75

We can see that as the mask weight increases, the contrast change begins to affect the colour in the image. Our eyes may perceive the “rent K” text to be sharper in the third image with MW=0.75, but the colour has been impacted in such a way that the image aesthetics have been compromised. There is little change to the acuity of the “Mölle” text (apart from the colour contrast). A change in contrast can certainly improve the visibility of detail in the image (i.e. details are easier to discern), but not necessarily their actual acuity. It is sometimes a trick of the eye.

What about if we changed the radius? Does a larger radius make a difference? Here is what happens when we use a radius of 40 pixels, and a MW=0.25.

Radius = 40; Mask weight = 0.25

Again, the contrast is slightly increased, and perceptual acuity may be marginally improved, but again this is likely due to the contrast element of the filter.

Note that using a small filter size, e.g. 3-5 pixels, on a large image (12-16MP) will have little effect, unless there are features in the image of that size. For example, in an image containing features 1-2 pixels in width (e.g. a macro image), this might be appropriate, but it will likely do very little in a landscape image.
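For reference, ImageJ’s documentation describes its unsharp mask as subtracting a weighted Gaussian-blurred copy of the image and then rescaling, so that large (low-frequency) structures keep their original contrast. A rough sketch of that formulation (treating the radius as the Gaussian sigma; this is an approximation, not ImageJ’s actual code):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask_ij_style(image, radius=10.0, mask_weight=0.25):
    """Subtract a weighted Gaussian-blurred copy, then rescale by 1/(1 - weight)."""
    img = image.astype(np.float64)
    blurred = gaussian_filter(img, sigma=radius)
    sharpened = (img - mask_weight * blurred) / (1.0 - mask_weight)
    return np.clip(sharpened, 0, 255).astype(np.uint8)

# As the mask weight approaches 1 the divisor shrinks, which is why larger
# weights exaggerate contrast (and shift colours if applied per RGB channel).
```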

Does image super-resolution work?

Everyone has some image that they wish had better resolution, i.e. finer detail. The problem with this concept is that it is almost impossible to create pixels from information that did not exist in the original image. For example, if you want to increase the size of an image 4 times, that means a 100×100 pixel image would be transformed into an image 400×400 pixels in size. There is a catch here though: increasing the dimensions of the image by four times actually increases the amount of data in the image by 16 times. The original image had 10,000 pixels, yet the new image will have 160,000 pixels. That means 150,000 pixels of information have to be interpolated from the original 10,000 pixels. That’s a lot of “padding” information that doesn’t exist.

There are a lot of algorithms out there that claim they can increase the resolution of an image anywhere from 2-16 times. It is easy to be skeptical about these claims, so do they work? I tested two of these platforms on two vastly different images – images where I was interested in seeing higher resolution. The first image is a segment of a B&W aerial photograph of my neighbourhood from 1959. I have always been interested in seeing the finer details, so will super-resolution fix this problem? The second image is a small image of a vintage art poster which I would print were it to have better resolution.

My experiments were performed on two online systems: (i) AI Image Enlarger, and (ii) Deep Image. Both seem to use AI in some manner to perform the super-resolution. I upscaled both images 4 times (the maximum of the free settings). These experiments are quick-and-dirty, offering inputs from the broadest ends of the spectrum. They are compared to the original image “upscaled” four times using a simple scaling algorithm, i.e. each pixel in the input image becomes a 4×4 block of identical pixels in the output image.
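For context, this baseline “simple scaling” is just nearest-neighbour resizing, which adds no new information; a small sketch with Pillow (filenames are placeholders):

```python
from PIL import Image

img = Image.open("aerial_1959.png")

# Nearest-neighbour upscaling: each input pixel is repeated to fill a
# 4x4 block in the output, so no new detail is created.
baseline = img.resize((img.width * 4, img.height * 4), resample=Image.NEAREST)
baseline.save("aerial_1959_nn4x.png")
```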

The first experiment, with the B&W aerial photograph (490×503), increased the size of the image to 1960×2092 pixels. Neither super-resolution algorithm produced results which are perceptually different from the original, i.e. there is no perceived enhancement in resolution. This accords with the theory of “garbage-in, garbage-out”, i.e. you cannot make information from nothing. Photographs are inherently harder to upsize than other forms of image.

The original aerial image (left) compared with the super-resolution image produced by AI Image Enlarger (right).

The original aerial image (left) compared with the super-resolution image produced by Deep Image (right).

The next experiment, with the coloured poster (318×509), increased the size of the image to 1272×2036 pixels. Here the results from both algorithms are quite good. Both algorithms enhance detail within the image, making things crisper, more aesthetically pleasing, and actually increasing the detail resolution. Why did the poster turn out better? Mainly because artwork contains a lot more distinct edges between objects, and the colour also likely contributes to the algorithms’ success.

The original poster image (left) compared with the super-resolution image produced by AI Image Enlarger (right).

The original poster image (left) compared with the super-resolution image produced by Deep Image (right).

To compare the algorithms, I have extracted two segments from the poster image, to show how the differing algorithms deal with the super-resolution. The AI Image Enlarger seems to retain more details, while producing a softer look, whereas Deep Image enhances some details (river flow) at the expense of others, some of which it almost erodes (bridge structure, locomotive windows).

It’s all in the details: AI Image Enlarger (left) vs. Deep Image (right)

The other big difference is that AI Image Enlarger was relatively fast, whereas Deep Image was as slow as molasses. The overall conclusion? I think super-resolution algorithms work fine for images that have a good amount of contrast in them, and possibly images with distinct transitions, such as artwork. However, trying to get details out of images containing indistinct objects is not going to work too well.

What is unsharp masking?

Many image post-processing applications use unsharp masking (UM) as their sharpening algorithm of choice. It is one of the most ubiquitous methods of image sharpening. Unsharp masking was introduced by Schreiber [1] in 1970 for the purpose of improving the quality of wirephoto pictures for newspapers. It is based on the principle of photographic masking, whereby a low-contrast positive transparency is made of the original negative. The mask is then “sandwiched” with the negative, and the combination used to produce the final print. The effect is an increase in sharpness.

The process of unsharp masking accentuates the high-frequency components of an image, i.e. the edge regions where there is a sharp transition in image intensity. It does this by extracting the high-frequency details from an image, and adding them to the original image. This process can be better understood by first considering a 1D signal shown in the figure below.

An example of unsharp masking using a 1D signal

This is the process the signal goes through:

  1. The original signal.
  2. The signal is “blurred” by a filter which retains only the “low-frequency” components of the signal.
  3. The blurred signal, ➁, is subtracted from ➀, to extract the “high-frequency” components of the signal, i.e. the “edge” signal.
  4. The “edge” signal is added to the original signal ➀ to produce the sharpened signal.

In the context of digital images, unsharp masking works by subtracting a blurred version of an image from the original image to create an “edge” image, which is then used to improve the acuity of the original image. There are many different approaches to unsharp masking, using differing forms of filters. Some use the more traditional approach following the process outlined above, with the blurring performed using a Gaussian blur, while others use specific filters which create “edge” images directly, which can then be either added to, or subtracted from, the original image.
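A minimal sketch of these four steps applied to a 2D image, using a Gaussian blur for step ➁ (the sigma, weight, and filename are illustrative):

```python
import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

# 1. The original image (8-bit grayscale here, for simplicity).
original = np.asarray(Image.open("photo.jpg").convert("L")).astype(np.float64)

# 2. Blur the image, keeping only the low-frequency components.
blurred = gaussian_filter(original, sigma=3)

# 3. Subtract the blurred copy from the original to extract the
#    high-frequency "edge" image.
edges = original - blurred

# 4. Add a weighted amount of the edge image back to the original.
weight = 0.7
sharpened = np.clip(original + weight * edges, 0, 255).astype(np.uint8)

Image.fromarray(sharpened).save("photo_unsharp.png")
```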

[1] Schreiber, W., “Wirephoto quality improvement by unsharp masking,” Pattern Recognition, Vol.2, pp.117-121 (1970). 

Fixing photographs (e.g. travel snaps) (ii)

3︎⃣ Fixing light with B&W

There are some images which contain shafts of light. Sometimes this light helps highlight certain objects in the photograph, be it as hard light or soft light. Consider the following photo of a Viking carving from the Viking Ship Museum in Oslo. There are some nice shadows caused by the light streaming in from the right side of the scene. One way to reduce the effects of the light is to convert the photograph to black-and-white.

By suppressing the role colour plays in the image, the eyes become more fixated on the fine details, and less on the light and shadows.

4︎⃣ Improving on sharpness

Sometimes it is impossible to take a photograph with enough sharpness. Tweaking the sharpness just slightly can help bring an extra crispness to an image. This is especially true of macro photographs, or photographs with fine detail. If the image is blurry, however, there is every likelihood that it cannot be salvaged – there is only so much magic that can be performed by image processing. Here is a close-up of some water droplets on a leaf.

If we apply some unsharp masking to sharpen the image, we get:

5︎⃣ Saturating colour

Photographs of scenes containing vivid colour may sometimes appear quite dull, or maybe you just want to boost the colour in the scene. By adjusting the colour balance, or manipulating the colour histogram, it is possible to boost the colours in a photograph, although this may result in “unrealistic” colours in the processed image. Here is a street scene of some colourful houses in Bergen, Norway.

Here the image has been processed with a simple contrast adjustment, although the blue parts of the sky have all but disappeared.
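One simple way to saturate colour without touching brightness is to scale the saturation channel in HSV space; a small sketch (the 1.3 factor and the filename are illustrative):

```python
import numpy as np
from PIL import Image

img = Image.open("bergen_houses.jpg").convert("HSV")
h, s, v = img.split()

# Scale the saturation channel; values above 255 are clipped, which is
# where "unrealistic" colours can start to creep in.
s = np.clip(np.asarray(s).astype(np.float64) * 1.3, 0, 255).astype(np.uint8)

result = Image.merge("HSV", (h, Image.fromarray(s), v)).convert("RGB")
result.save("bergen_houses_saturated.jpg")
```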

Why image processing is an art

There are lots of blogs that extol some piece of code that does some type of “image processing”. Classically this is some type of image enhancement – an attempt to improve the aesthetics of an image. But the problem with image processing is that there are aspects of it that are not really a science. Image processing is fundamentally an art because the quality of the outcome is often intrinsically linked to an individual’s visual preferences. Some will say the operations used in image processing are inherently scientific because they are derived using mathematical formulae. But so are paint colours: paint is made from chemical substances, and deriving a particular colour is nothing more than a formula for combining different pigments. We’re really talking about processing here, not analysis (operations like segmentation). So what forms of processing are artistic?

  1. Anything that is termed a “filter” – the Instagram-type filters that make an ordinary photo look like a Polaroid.
  2. Anything with the word enhancement in it. This is an extremely loose term – it literally means “an increase in quality” – but what does that mean to different people? It could involve improving the contrast in an image, removing blur through sharpening, or perhaps suppressing noise artifacts.

These processes are partially artistic because there is no tried-and-true method of determining whether the processing has resulted in an improvement in the quality of the image. Take an image, improve its contrast. Does it have a greater aesthetic appeal? Are the colours more vibrant? Do vibrant colours contribute to aesthetic appeal? Are the blues really blue?

Contrast enhancement: (a) original, (b) Retinex-processed, (c) MAXimum of (a) and (b)

Consider the photograph above. To some, the image on the left suffers from being somewhat underexposed, i.e. dark. The image in the middle is the same image processed using a filter called Retinex. Retinex helps remove unfavourable illumination conditions – the result is not perfect, but the filter can help recover detail from an image that is enveloped in darkness. Whilst a good portion of the image has been “lightened”, the overcast sky has darkened in the process. There is no exact science for “automagically” making an image have greater aesthetic appeal. The art of image processing often requires tweaking settings, and adjusting the image until it appears to have improved visually. In the final image of the sequence, the original and Retinex-processed images are used to create a composite by retaining only the maximum value at each pixel location. The result is a brighter, more contrasty, more visually appealing image.
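The maximum composite itself is straightforward: keep the larger of the two values at each pixel. A minimal sketch (filenames are placeholders; both images must be the same size):

```python
import numpy as np
from PIL import Image

a = np.asarray(Image.open("original.jpg").convert("RGB"))
b = np.asarray(Image.open("retinex.jpg").convert("RGB"))

# Per-pixel, per-channel maximum of the original and the Retinex result.
composite = np.maximum(a, b)

Image.fromarray(composite).save("composite_max.jpg")
```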