Aesthetically motivated picture processing

For years I wrote scientific papers on various topics in image processing, but what I learnt from that process was that few of the papers written are actually meaningful. For instance, in trying to create new image sharpening algorithms, many people forgot the whole point of sharpening. A photographer either strives for sharpness across an entire image or uses blur as a means of focusing attention on something of interest in the image (which is in focus, and therefore sharp). Many sharpening algorithms have been developed on the premise that the whole image should be sharpened… but this premise is often false. Why does the photo need to be sharpened? What is the benefit? A simple sharpening with unsharp masking (which is an unfortunate name for a filter) works quite well at its task. But it was designed at a time when images were small, and filters were generally simple 3×3 constructs. Applying the original filter to a 24MP, 4000×6000 pixel image will make little, if any, difference. On the other hand, blurring an image does nothing for its aesthetics unless it is selective, in essence trying to mimic bokeh in some manner.
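The point about filter size can be made concrete with a small sketch. It is only an illustration under stated assumptions: a box blur stands in for the Gaussian blur usually used in unsharp masking, the function names (box_blur, unsharp_mask, max_change) are made up for this example, and a gentle 40-pixel brightness ramp stands in for an edge as it might appear in a high-resolution photograph. The idea is simply that sharpened = original + amount × (original − blurred), and that a 3×3 blur barely alters a soft edge, so there is almost nothing to add back.

```python
import numpy as np

def box_blur(img, radius):
    """Blur a 2-D float image with a (2*radius+1) x (2*radius+1) box filter.

    Assumption: a box filter is used here for simplicity; unsharp masking
    is more commonly built on a Gaussian blur.
    """
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")  # repeat border pixels
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    # Sum every shifted copy of the image that falls inside the window.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def unsharp_mask(img, radius=1, amount=1.0):
    """Add back the detail removed by the blur (original minus blurred)."""
    blurred = box_blur(img, radius)
    return np.clip(img + amount * (img - blurred), 0.0, 1.0)

def max_change(img, radius):
    """Largest per-pixel difference the sharpening makes."""
    return float(np.abs(unsharp_mask(img, radius) - img).max())

# Hypothetical test image: a soft edge rising from 0.25 to 0.75 over 40 pixels.
x = np.arange(100)
row = 0.25 + 0.5 * np.clip((x - 30) / 40.0, 0.0, 1.0)
soft_edge = np.tile(row, (20, 1))

small = max_change(soft_edge, radius=1)   # the classic 3x3 window
large = max_change(soft_edge, radius=10)  # a 21x21 window
print(f"3x3 window:   max change {small:.4f}")
print(f"21x21 window: max change {large:.4f}")
```

On this synthetic edge the 3×3 window changes pixel values far less than the larger one, which is the sense in which the original filter does little on a large image. How large the window should be for a real photograph is a judgment call that depends on the image and on the look being sought.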

Much of what happens in image processing (aside from machine vision) is aesthetically based. The true results of image processing cannot be provided in a quantitative manner, and that puts it at odds with scientific methodology. But who cares? Scientific thought in an academic realm is far too driven by pure science, with little in the way of pure inventing. But alas, few academics think this way; most take on the academic mantra and are hogtied to doing things in a specified way. I no longer subscribe to this train of thought, and I don’t really know if I ever did.

aesthetic appeal, picture of Montreal metro with motion blur

This picture shows motion blur which results from a moving subway car, whilst the rest of the picture remains in focus. The motion blur is a part of the intrinsic appeal of the photograph – yet there is no way of objectively quantifying the aesthetic value – it is something that can only be qualitatively and subjectively evaluated.

Aesthetically motivated image processing is a perfect fit for photographs because, while there are theoretical underpinnings to how lenses are designed and technical principles of how a camera works, the ultimate result, a photograph, is the culmination of the mechanical ability of the camera and the artistic ability of the photographer. Machine vision, the type used in manufacturing facilities to determine things like product defects, is different, because it is tasked with precise, automated photography in ideal, controlled conditions. Developing algorithms to remove haze from natural scenes, or to reduce glare, is extremely difficult, and such photographs may be best taken when there is no haze. Aesthetic-based picture processing is subjectively qualitative, and there is nothing wrong with that. It is one of the things that sets humans apart from machines: the inherent ability to visualize things differently. Some may find bokeh creamy while others may find it too distracting, but that’s okay. You can’t create an algorithm to describe bokeh because it is an aesthetic thing, in the same way that it’s impossible to quantify taste, or to pin down exactly what umami is.

Consider the following quote from Bernard Berenson (Aesthetics, Ethics, and History) –

‘The eyes without the mind would perceive in solids nothing but spots or pockets of shadow and blisters of light, chequering and criss-crossing a given area. The rest is a matter of mental organization and intellectual construction. What the operator will see in his camera will depend, therefore, on his gifts, and training, and skill, and even more on his general education; ultimately it will depend on his scheme of the universe.’

Photographs and the craft of chance

Photographs are the encapsulation of our lives. They are snapshots, brief interludes into slices of time. Times long past. Memories of fighting in the trenches in WW1, the landings at Normandy, life in small Italian mountain villages. The best and worst of our histories. Photographs capture moments so fleeting that in most cases they would be impossible to reproduce. Photography is, in its essence, the art of chance. Of being in the right place at the right time, of being able to capture just the right amount of photons entering the camera. Blink, and it could all be different. Before photographs, our history was handed down through generations in stories, or paintings upon the wall. But neither of these is fleeting; they are thought-out, prescribed renditions of history. Photographs are not; they are raw, evocative, and often need no explanation. And while they could be considered by some to be art, they are crafted using tools which allow light to be captured. The true result is in nature’s control.

Capturing natural life is truly the essence of the craft of chance. That one photograph that captures an insect holding still, almost posing for the shot – blink and it will move on to its next feast.

The camera does not lie

There is an old phrase, “the camera does not lie”, which can be interpreted as both true and false. In historic photos, where there was little done in the way of manipulation, the photograph often did hold the truth of what appeared in the scene. In modern photographs that are “enhanced”, this is often not the case. But there is another perspective. The phrase is true because the camera objectively captures everything in the scene within its field of view. But it is also false, because the human eye is not all-seeing; it perceives the world in a highly subjective manner, focusing on the object (or person) of interest. Most photographs tend to contain far too much information, visual “flotsam” that is selectively discarded by the human visual system. The rendition of colours can also appear “unnatural” in photographs because of issues with white balance, film types (in analog cameras), and sensors (in digital cameras).

What the human eye sees (left) versus the camera (right)

A good example of how the human eye and camera lens perceive things differently is shown in the two photos above. The photograph on the right contains photographic perspective distortion (keystoning), where the tall buildings tend to “fall” or “lean” within the picture. The human eye (simulated on the left), on the other hand, corrects for this issue, and so does not perceive it. To photograph a tall building, the camera is often tilted upward, and in this position the vertical lines of the building converge toward the top of the picture. The convergence of vertical lines is a natural manifestation of perspective which we find acceptable in the horizontal plane (e.g. the convergence of railway tracks in the distance), but which seems unnatural in the vertical plane.
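The convergence described above falls out of simple pinhole geometry, and a short sketch may help. Everything in it is an assumption made for illustration: an idealized pinhole camera at ground level, a flat facade 20 units wide and 60 units tall standing 30 units away, and the hypothetical helper names project and facade_widths. With the camera level, the top and bottom of the facade project to the same width. Once the camera is tilted upward, the top of the building is farther away along the viewing direction, so it projects narrower and the verticals lean inward.

```python
import math

def project(point, tilt_deg, focal=1.0):
    """Project a world point (x, y, z) through a pinhole camera at the origin.

    The camera looks along +z and is pitched upward by tilt_deg degrees.
    Assumption: an ideal pinhole model with no lens distortion.
    """
    x, y, z = point
    t = math.radians(tilt_deg)
    # Express the point in the tilted camera's frame (rotation about the x-axis).
    y_cam = y * math.cos(t) - z * math.sin(t)
    z_cam = y * math.sin(t) + z * math.cos(t)  # depth along the viewing direction
    return (focal * x / z_cam, focal * y_cam / z_cam)

def facade_widths(tilt_deg, half_width=10.0, height=60.0, distance=30.0):
    """Return the projected width of the facade at its bottom and at its top."""
    bottom_left = project((-half_width, 0.0, distance), tilt_deg)
    bottom_right = project((half_width, 0.0, distance), tilt_deg)
    top_left = project((-half_width, height, distance), tilt_deg)
    top_right = project((half_width, height, distance), tilt_deg)
    return (bottom_right[0] - bottom_left[0], top_right[0] - top_left[0])

for tilt in (0, 15, 30):
    bottom, top = facade_widths(tilt)
    print(f"tilt {tilt:2d} deg: bottom width {bottom:.3f}, top width {top:.3f}")
```

In this model the widths match at zero tilt and the top shrinks relative to the bottom as the tilt grows. Keeping the camera level and cropping, or using a shift lens, avoids the effect at capture; software correction applies the inverse warp afterwards.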

There are many other factors that influence the outcome of a picture. Some are associated with the physical abilities of a camera and its associated lenses, others with the environment. Examples include the colour of ambient light (e.g. a colour cast created by the sun setting), perspective (the wider a lens, the more distortion introduced), and contrast (e.g. B&W images becoming “flat”). While the camera does not lie, it rarely reproduces the world exactly as we see it. Or maybe we don’t perceive the world around us as it truly is.