How colour changes our perspective of photographs

The first permanent photograph was produced in 1825 by the French inventor Joseph Nicéphore Niépce. Since then, photographs have become the backbone of our visual history. Until colour photography became widespread in the 1950s, colour images were a rarity, with monochrome, i.e. black-and-white, being the norm, partly because of its simpler processing requirements. As a result, a good portion of 19th- and 20th-century history is perceived in terms of monochrome images. This shapes how we perceive history, because humans perceive monochrome images very differently from colour ones.

The use of black-and-white in historical photographs implies certain ideas about history. There is a perception that such photos are authentic historical images. By the middle of the 19th century, photography had become an important means of creating a visual record of life. However, the process was inherently monochromatic, and the resulting photographs provided a representation of the structure of a subject but lacked the colour that would have provided a more realistic context. Some photographic processes did yield an overall colour, such as cyanotypes, but that colour was unrealistic. The first colourization of photographs occurred in the early 1840s, when the Swiss painter Johann Baptist Isenring used a mixture of gum arabic and pigments to make the first coloured daguerreotype. Such hand colouring continued in successive mediums, including albumen and gelatin silver prints. The purpose of this hand-colouring may have been to increase the realism of the photographic prints, in lieu of a colour photographic process.

The major failing of monochromatic images may be their lack of context. Removing the colour from an image gives us a different perception of the scene. Take, for example, the picture of the Russian peasant girls shown in Fig. 1. The image is from the US Library of Congress Prokudin-Gorskii Collection, and depicts three young women offering berries to visitors to their izba, a traditional wooden house, in a rural area along the Sheksna River, near the town of Kirillov. Shown in colour, we perceive a richness in the girls' garments, even though they are peasant girls in a small Russian town. When we think of peasant Russia in the early 20th century, we are unlikely to associate such vibrant colours with their place in society. Had we viewed only the monochrome image, our perception would be vastly different.

Russian peasant girls in colour and grayscale (Prokudin-Gorskii)

Humans are capable of perceiving approximately 32 shades of gray, but millions of colours. When we interpret an image to extract descriptors, some of those descriptors will be influenced by the perceived colour of objects within the image. A monochrome image relies on a spectrum of intensities ranging from black to white, so when we view one, we perceive the image based on tone, texture and contrast rather than colour. In the colour photograph of the peasant girls we are awed by the dazzling red and purple dresses; in the monochrome image we are drawn instead to the shape of the dresses, the girls' pose, and the content of the image.
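The collapse from colour to tone can be made concrete. Below is a minimal sketch of a standard luminance conversion using the ITU-R BT.601 weights common to many imaging libraries; the two example pixel values are hypothetical, chosen to illustrate the point:

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an RGB image (H x W x 3, values 0-255) to a single
    luminance channel using the ITU-R BT.601 weights. Three colour
    values per pixel become one intensity, so only tone, texture
    and contrast survive the conversion."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights

# A saturated red pixel and a mid-grey pixel can land on almost the
# same intensity, becoming indistinguishable in the monochrome image.
red = np.array([[[200.0, 0.0, 0.0]]])
grey = np.array([[[60.0, 60.0, 60.0]]])
```

Here 200 × 0.299 ≈ 59.8, so a vivid red garment and a drab grey one render as nearly the same shade of grey – exactly the loss of context the peasant-girl example illustrates.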

Here is a second example, a stack of sulphur shown in both colour and grayscale. The loss of meaning in the monochrome image is clear. The stack of sulphur is readily identifiable in the colour image by its distinctive yellow; in the monochrome image that identifying attribute has been removed, leaving only the structure of the image, with a loss of context.

Extracted sulfur stacked in a “vat” 60 feet tall at Freeport Sulphur Co. in Hoskins Mound, Texas, shown in colour and grayscale. Kodachrome transparency by John Vachon.

Digital photography: some things just aren’t possible

Despite the advances in digital photography, we have yet to see a camera which views a scene the same way our eyes do. True, we aren't able to capture and store scenes with our eyes, but they do have an inherently advanced ability to optically analyze our surroundings, thanks in part to millions of years of coevolution with our brains.

There are some things that just aren't possible when post-processing digital images. One is removing glare and reflections from glass. Consider the image below, which was taken directly in front of a shop window. The window reflects the scene from the opposite side of the street, and getting rid of this is challenging. One idea might be to use a polarizing filter, but that won't work directly in front of a window: a polarizing filter blocks light with a specific polarization, and it suppresses reflections best at an oblique angle, not head-on. Because the sensor doesn't record the polarization of the light it captures, the effect can't be recreated in post-processing. Another option is to take the shot at a different time of day, or at night. There is no fancy image processing algorithm that will remove the reflection, although someone has undoubtedly tried. This is a case where the photographic acquisition process is everything.


Glass reflection in a shop window.

Any filter that acts on a property of the light that isn't captured by the digital sensor (or film) is impossible to reproduce in post-processing. Sometimes the easiest approach to photographing something in a window is to wait for an overcast day, or even to shoot the scene at night. Here is a similar image taken of a butcher shop in Montreal.


Nighttime image, no reflection, and backlit.

This image works well because the contents of the window are back-lit from within the building. If we aren't that concerned about the lighting on the building itself, this works nicely – it just changes the aesthetics of the image, concentrating more on the meat in the window.

In image processing, have we forgotten about aesthetic appeal?

In the golden days of photography, the quality and aesthetic appeal of a photograph were unknown until after it was processed, and the craft of physically processing it played a role in how it turned out. These images were rarely enhanced, because it wasn't as simple as manipulating them in Photoshop. Enter the digital era. It is now easier to take photographs, from just about any device, anywhere. The internet would not be what it is today without digital media, and yet we have moved from a time when photography was a true art to one in which photography is a craft. Why a craft? Just as a woodworker crafts a piece of wood into a piece of furniture, so too do photographers craft their photographs in the likes of Lightroom or Photoshop. There is nothing wrong with that, although I feel that too much processing takes away from the artistic side of photography.

Ironically, the image processing community has spent years developing filters to make images more visually appealing – sharpening filters to improve acuity, contrast enhancement filters to bring out features. The problem is that many of these filters were designed to work in an “automated” manner (and many really don't work well), while in reality people prefer interactive filters. A sharpening filter may work best when the user can modify its strength and judge its aesthetic effect qualitatively. The only places “automatic” image enhancement algorithms persist are in-app and in-camera filters. The problem is that it is far too difficult to judge how a generic filter will affect a photograph, because each photograph is different. Consider the following photograph.
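As a sketch of what such an interactive filter looks like, here is classic unsharp masking with a user-tunable `amount` parameter. The box blur is a crude stand-in to keep the example self-contained; a Gaussian blur would be more typical in practice:

```python
import numpy as np

def box_blur(img, radius=1):
    """Crude box blur built from shifted copies of the image."""
    acc = np.zeros_like(img, dtype=float)
    n = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            acc += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            n += 1
    return acc / n

def unsharp_mask(img, amount=1.0, radius=1):
    """Classic unsharp masking: add back a scaled copy of the detail
    (original minus blurred). `amount` is the strength a user would
    adjust interactively while judging the result by eye."""
    img = img.astype(float)
    detail = img - box_blur(img, radius)
    return np.clip(img + amount * detail, 0, 255)
```

The point of exposing `amount` as a slider rather than a fixed constant is precisely the argument above: the right strength depends on the photograph, and only a viewer can judge it.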

Cherries in a wooden bowl, medieval.

A vacation pic.

The photograph was taken using the macro feature on my 12-40mm Olympus m4/3 lens. The focal area is the upper part of the bottom of the wooden bucket, so some of the cherries are in focus, others are not, and there is a distinct soft blur in the remainder of the picture. This is largely because of the shallow depth of field associated with close-up photographs… but in this case I don't consider it a limitation, and would not necessarily want to suppress it through sharpening, although I might selectively enhance the cherries, either through targeted sharpening or colour enhancement. The blur is intrinsic to the aesthetic appeal of the image.

Most filters that have been incredibly successful are proprietary, so the magic exists in a black box. The filters created by academics have never fared that well. Many times they are targeted at a particular application, poorly tested (on Lena, perhaps?), or not at all designed from the perspective of aesthetics. It is much easier to manipulate a photograph in Photoshop, because the aesthetics can be tailored to the user's needs. We in the image processing community have spent far too many years worrying about quantitative methods of determining the viability of algorithms to improve images, but the reality is that aesthetic appeal is all that really matters – and it is not something that is quantifiable. Generic algorithms to improve the quality of images don't exist; it's just not possible across the overall scope of the images available. Filters like Instagram's Lark work because they are not really changing the content of the image; they modify the colour palette, applying the same look-up table (derived from some curve transformation) to every image.
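A look-up-table filter of this kind is easy to sketch. The S-shaped curve below is hypothetical (Instagram's actual curves are proprietary), but the mechanism is the same: one 256-entry table, applied identically to every image:

```python
import numpy as np

def make_lut(curve):
    """Precompute a 256-entry look-up table from a tone curve defined
    on [0, 1]. The same table is then applied to every image."""
    x = np.arange(256) / 255.0
    return np.clip(curve(x) * 255.0, 0, 255).astype(np.uint8)

def apply_lut(img, lut):
    """Remap every pixel through the table: the content is untouched,
    only the palette shifts."""
    return lut[img]

# Hypothetical curve: lifts the midtones, pins the black and white points.
lark_like = make_lut(lambda x: x + 0.15 * np.sin(np.pi * x))
```

Because the transformation is a fixed remapping of pixel values, it never has to understand the image – which is exactly why this style of filter can be applied "generically" without falling apart.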

People doing image processing or computer vision research need to move beyond the processing and get out and take photographs – partly to learn first-hand the problems associated with taking photographs, and partly to gain an understanding of the intricacies of aesthetic appeal.

Why photographs need very little processing

I recently read an article on photographing a safari in Kenya, in which the author, Sarfaraz Niazi, made an interesting statement. While describing the process of taking 8000 photos on the trip, he remarked on post-processing, recalling a lesson his father taught him at age five: that “every picture is carved out in perpetuity as soon as you push the shutter”. There is so much truth in this statement. Photographs are snapshots of life, and the world around us is rarely perfect, so why should a photograph be any different? It is not necessary to heavily process images – there are of course ways to adjust the contrast, improve the sharpness, or tweak the exposure somewhat, but beyond that, what is necessary? Add a filter? Sure, that's fun on Instagram, but it shouldn't be necessary for camera-based photographs.

Many years of attempting to derive algorithms to improve images have taught me that there are no generic, one-size-fits-all algorithms. Each photograph must be modified in a manner that suits the ultimate aesthetic appeal of the image. An algorithm manipulates through quantitative evaluation, having no insight into the content or qualitative aspects of the photograph. No AI algorithm will ever be able to replicate the human eye's ability to determine aesthetic value – and every person's aesthetic interpretation will be different. Add too much computational photography to a digital camera and you end up with too machine-driven a photograph. Photography is a craft as much as an art, and should not be controlled solely by algorithms. Consider the following photograph, taken in Glasgow, Scotland. It suffers from being taken on quite a hot summer day, when the sky was somewhat hazy. The hazy sky is one factor that reduces the colour intensity in the photograph.


Original photograph

In all likelihood, this photograph represents the true scene quite accurately. An increase in saturation and a modification of exposure produce a more vivid photograph, shown below. One of the Instagram filters would likely also have done a nice job of “improving” the image. Was the enhancement necessary? Maybe, maybe not. It does improve the colours within the image, and the contrast between objects.


Post-processed photograph

Why aesthetic appeal in image processing matters

What makes us experience beauty?

I have spent over two decades writing algorithms for image processing, yet I have never really created anything truly fulfilling. Why? Because it is hard to create generic filters, especially for tasks such as image beautification. In many ways, improving the aesthetic appeal of photographs involves modifying the content of an image in unnatural ways. It doesn't matter how AI-driven an algorithm is; it cannot fathom the concept of aesthetic appeal. A photograph one person finds pleasing may be boring to others, just as a blank canvas is considered art by some but not by others. No amount of mathematical manipulation will lead to an algorithmic panacea of aesthetics. We can modify the white balance and play with curves – indeed, we can make 1001 changes to a photograph – but the final outcome will be perceived differently by different people.

After spending years researching image processing algorithms, and designing some of my own, it wasn't until I decided to take the art of acquiring images to a greater depth that I realized that algorithms are all well and good, but there is likely little need for the plethora of new ones created every year. Once you pick up a camera and start playing with different lenses and camera settings, you begin to realize that part of the nuance of any photograph is its natural aesthetic appeal. Sure, there are things that can be modified to improve aesthetic appeal, such as contrast enhancement or sharpening, but images also contain unfocused regions that contribute to their beauty.

If you approach image processing purely from a mathematical (or algorithmic) viewpoint, what you are trying to achieve is some sort of utopia of aesthetics. But this is almost impossible, largely because every photograph is unique. It is possible to improve the acuity of objects in an image using techniques such as unsharp masking, but it is impossible to resurrect a blurred image – and maybe that's the point. One could create a fantastic filter that sharpens an image beautifully, but with the sharpness of modern lenses, that may not be practical. Consider this example of a photograph taken in Montreal. The image has good definition of colour and a fairly uniform histogram. There isn't a lot that can be done to it, because it truly represents the scene as it exists in real life. If I had taken this photo on my iPhone, I would be tempted to post it on Instagram and add a filter… which might make it more interesting, but maybe only by boosting colour.


A corner hamburger joint in Montreal – original image.

Here is the same image with only the colour saturation boosted (by ×1.6). Have its visual aesthetics been improved? Probably. Our visual system would say so, but that is largely because our eyes are tailored to interpret colour.


A corner hamburger joint in Montreal – enhanced image.
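For reference, a ×1.6 saturation boost of the kind applied above can be sketched per pixel with Python's standard colorsys module (a real implementation would vectorise over the whole image rather than loop over pixels):

```python
import colorsys

def boost_saturation(rgb, factor=1.6):
    """Boost the saturation of one RGB pixel (0-255 per channel) by
    `factor`, clamping at full saturation. Hue and value are left
    alone, so only the vividness of the colour changes."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    s = min(1.0, s * factor)
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(c * 255) for c in (r2, g2, b2))
```

A washed-out red such as (200, 100, 100) becomes a much purer red, while neutral greys are untouched – which is why the boost reads as an "improvement" without altering the content of the image.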

If you take a step back from the abyss of algorithmically driven aesthetics, you begin to realize that too few individuals in the image processing community have taken the time to really understand the qualities of an image. Each photograph is unique, so the idea of generic image processing techniques is highly flawed. Generic techniques work sufficiently well in machine vision applications where the lighting is uniform and the task is also uniform, e.g. inspection of rice grains, or identification of burnt potato chips. No aesthetics are needed, just the ability to isolate an object and analyze it for whatever quality is required. It's one of the reasons unsharp masking has always been popular: alternative sharpening algorithms really don't work much better. And modern lenses are sharp; in fact, many people would be more likely to add blur than remove it.