The effect of crop sensors on lenses

Lenses used on crop-sensor cameras are a little different from those used on full-frame cameras. Mostly this has to do with size – because the sensor is smaller, the image circle doesn’t need to be as large, and therefore less glass is needed in their construction. This allows crop-sensor lenses to be more compact and lighter. The benefit is most obvious with telephoto lenses, where a physically smaller lens can be used – a 150mm lens on MFT covers the same angle of view as a 300mm lens on full frame. But what does focal-length equivalency mean?

Focal-Length Equivalency

The most visible effect of crop sensors on lenses is the angle of view (AOV), which is essentially where the term crop comes from – the smaller sensor’s AOV is a crop of the full frame. Take a photograph with two cameras, one with a full-frame sensor and one with an APS-C sensor, from the same position using lenses with the same focal length. The camera with the APS-C sensor will have a narrower AOV. For example, a 35mm lens has the same focal length whether it is mounted on a FF, APS-C, or MFT camera, however the AOV will be different on each. An example of this is shown in Fig.1 for a 35mm lens (showing horizontal AOV).

Fig.1: AOV for 35mm lenses on FF, APS-C, and MFT

Now it should be made clear that none of this affects the focal length of the lens. The focal length of a lens remains the same – regardless of the sensor on the camera. Therefore a 50mm lens in FF, APS-C or MFT will always have a focal length of 50mm. What changes is the AOV of each of the lenses, and consequently the FOV. In order to obtain the same AOV on a cropped-sensor camera, a new lens with the appropriate focal length must be chosen.

Manufacturers of crop-sensor cameras like to use the term “equivalent focal length“. This is the focal length that would produce the same AOV on a full-frame camera. So Olympus says that an MFT lens with a focal length of 17mm has a 34mm FF equivalency. It has an AOV of 65° (diagonal, as per the lens specs), and a horizontal AOV of 54°. Here’s how we calculate those (21.64mm is the diagonal of the MFT sensor, which is 17.3×13mm in size):

  • 17mm MFT lens → 2*arctan(21.64/(2*17)) = 65° (diag)
  • 17mm MFT lens → 2*arctan(17.3/(2*17)) = 54° (hor)
  • 34mm FF lens → 2*arctan(36/(2*34)) = 55.8° (hor)
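These AOV figures are easy to check in code. Here is a minimal Python sketch (the function name is just for illustration) that reproduces the three calculations above:

```python
import math

def aov_degrees(sensor_dimension_mm, focal_length_mm):
    """Angle of view for a given sensor dimension (width, height, or diagonal)."""
    return math.degrees(2 * math.atan(sensor_dimension_mm / (2 * focal_length_mm)))

print(aov_degrees(21.64, 17))  # ~65 degrees: diagonal AOV, 17mm lens on MFT
print(aov_degrees(17.3, 17))   # ~54 degrees: horizontal AOV, 17mm lens on MFT
print(aov_degrees(36, 34))     # ~56 degrees: horizontal AOV, 34mm lens on FF
```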

So a lens with a 17mm focal length on an MFT camera with a 2.0× crop factor gives an AOV equivalent to that of a 34mm lens on full frame. An APS-C sensor has a crop factor of 1.5×, so a lens of roughly 23mm (34 ÷ 1.5) would be required to give the same AOV as the 34mm FF lens. Figure 2 depicts the differences between 50mm FF and APS-C lenses, and the similarities between a 50mm FF lens and a 35mm APS-C lens (which give approximately the same AOV/FOV).

Fig.2: Example of lens equivalencies: FF vs. APS-C (×1.5)

Interchangeability of Lenses

On a side note, FF lenses can be used on crop-sensor cameras because the image circle of a FF lens is larger than the crop sensor. The reverse is, however, not possible, as a crop-sensor lens has an image circle that is smaller than a FF sensor. Figure 3 illustrates the various combinations of FF/MFT sensor cameras and FF/MFT lenses.

Fig.3: The effect of interchanging lenses between FF and crop sensor cameras.

Of course all this is pointless if you don’t care about comparing your crop-sensor camera to a full-frame camera.

NOTE: I tend to use horizontal AOV rather than the manufacturers’ more typical diagonal AOV. It makes more sense because I am generally viewing a scene in a horizontal context.

How many bits in an image?

When it comes to bits and images, things can become quite confusing. For example, are JPEGs 8-bit, or 24-bit? Well, they are both.

Basic bits

A bit is a binary digit, i.e. it can have a value of 0 or 1. When something is X-bit, it means that it has X binary digits, and 2^X possible values. Figure 1 illustrates various values for X as grayscale tones. For example a 2-bit image will have 2^2, or 4 values (0,1,2,3).

Fig.1: Various bits

An 8-bit image has 2^8 possible values – i.e. 256 values ranging from 0..255. In terms of binary values, 255 in binary is 11111111, 254 is 11111110, …, 1 is 00000001, and 0 is 00000000. Similarly, 16-bit means there are 2^16 possible values, from 0..65535. The number of bits is sometimes called the bit-depth.

Bits-per-pixel

Images typically describe bits in terms of bits-per-pixel (BPP). For example a grayscale image may have 8-BPP, meaning each pixel can have one of 256 values from 0 (black) to 255 (white). Colour images are a little different because they are typically composed of three component images, red (R), green (G), and blue (B). Each component image has its own bit-depth. So a typical 24-bit RGB image is composed of three 8-BPP component images, i.e. 24-BPP RGB = 8-BPP (R) + 8-BPP (G) + 8-BPP (B).

The colour depth of the image is then 256^3 or 16,777,216 colours (256 = 2^8 values for each of the three component images). A 48-bit RGB image contains three component images, R, G, and B, each having 16-BPP, for 2^48 or 281,474,976,710,656 colours.
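As a quick sanity check of these numbers, the following Python snippet computes the per-component levels and total colour count for a few common bit depths:

```python
# Levels per component and total colours for a given bit depth per component,
# e.g. 8 bits per channel (24-bit JPEG) vs 16 bits per channel (48-bit TIFF).
for bits_per_component in (8, 12, 14, 16):
    levels = 2 ** bits_per_component
    total_colours = levels ** 3          # three components: R, G, and B
    print(f"{bits_per_component}-bit: {levels} levels, {total_colours:,} colours")
```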

Bits and file formats

JPEG stores images with a precision of 8 bits per component image, for a total of 24-BPP. The TIFF format supports various bit depths. There are also RGB images stored as 32-bit images. Here 8 bits are used to represent each of the RGB component images, with individual values 0-255. The remaining 8 bits are reserved for the transparency, or alpha (α) component. The transparency component represents the ability to see through a colour pixel onto the background. However, only some image file formats support transparency – JPEG, for example, does not. Of the more common formats, typically only PNG and TIFF support transparency.

Bits and RAW

Then there are RAW images. Remember RAW images are not RGB images. They maintain the 2D array of pixel values extracted from the photosite array of the camera sensor (they only become RGB after post-processing using off-camera software). Therefore they maintain the bit-depth of the camera’s ADC. Common bit depths are 12, 14, and 16. For example a camera whose ADC outputs 12 bits will produce raw pixels that are 12 bits deep. A 12-bit image has 4096 levels of luminance per colour component. Once the RGB image is generated, that means 4096^3, or 68,719,476,736, possible colours for each pixel. That’s 4096 times the number of colours of an 8-bit-per-component RGB image. For example the Ricoh GR III stores its RAW images using 14 bits. This means that a RAW image has the potential of 16,384 values for each component (once processed), versus a JPEG produced by the same camera, which only has 256 values for each component.

Do more bits matter?

So theoretically it’s nice to have 68-odd billion colours, but is it practical? The HVS can distinguish somewhere between 7 and 10 million colours, so for visualization purposes 8 bits per colour component is fine. For editing an image, however, more colour depth is often better. Once an image has been processed it can then be stored as a 16-bit TIFF image, and JPEGs produced as needed (for applications such as the web).

From photosites to pixels (iii) – DIP

DIP is the Digital Image Processing system. Once the ADC has performed its conversion, each of the values from the photosites has been converted from a voltage to a binary number within the ADC’s bit depth. So basically you have a matrix of integers representing each of the original photosites. The problem is that this is essentially a matrix of grayscale values, with each element of the matrix representing a red, green or blue pixel (basically a RAW image). If a RAW image is required, then no further processing is performed; the RAW image and its associated metadata are saved in a RAW image file format. However, to obtain a colour RGB image and store it as a JPEG, further processing must be performed.

First it is necessary to perform a task called demosaicing (or demosaicking, or debayering). Demosaicing separates the red, green, and blue elements of the Bayer image into three distinct R, G, and B components. Note that a colour filter arrangement other than Bayer may be used. The problem is that each of these layers is sparse – the green layer contains only 50% green pixels, and the remainder are empty, while the red and blue layers contain only 25% red and blue pixels respectively. Values for the empty pixels are then determined using some form of interpolation algorithm. The result is an RGB image containing three layers representing the red, green and blue components for each pixel in the image.
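To make the idea concrete, here is a minimal Python sketch of bilinear demosaicing, assuming an RGGB Bayer layout (real camera pipelines use far more sophisticated interpolation, so this is only illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def bilinear_demosaic(raw):
    """Bilinear demosaic of a 2D RGGB Bayer mosaic; returns an HxWx3 RGB array."""
    h, w = raw.shape
    # Masks marking which photosites are red, green, or blue (RGGB layout).
    r_mask = np.zeros((h, w))
    r_mask[0::2, 0::2] = 1.0
    b_mask = np.zeros((h, w))
    b_mask[1::2, 1::2] = 1.0
    g_mask = 1.0 - r_mask - b_mask

    # Sparse colour planes: real samples where the mask is 1, zeros elsewhere.
    planes = raw * r_mask, raw * g_mask, raw * b_mask

    # Classic bilinear interpolation kernels for the green and red/blue planes.
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0

    r, g, b = (convolve2d(p, k, mode="same", boundary="symm")
               for p, k in zip(planes, (k_rb, k_g, k_rb)))
    return np.dstack([r, g, b])
```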

The DIP process

Next, any processing related to settings in the camera is performed. For example, the Ricoh GR III has two options for noise reduction: Slow Shutter Speed NR, and High-ISO Noise Reduction. In a typical digital camera there are image processing settings such as grain effect, sharpness, noise reduction, white balance etc. (which don’t affect RAW photos). Some manufacturers also add additional effects such as art effect filters and film simulations, which are all done within the DIP processor. Finally the RGB image is processed to allow it to be stored as a JPEG: some level of compression is applied, and metadata is associated with the image. The JPEG is then stored on the memory card.

Demystifying Colour (v) : colour gamuts

Terms used to describe colours are often confusing. If a colour space is a subset of a colour model, then what is a colour gamut? Is it the same as a colour space? How does it differ from a colour profile? In reality there is often very little difference between the terms. For example, depending on where you read it, sRGB can be used to describe a colour space, a colour gamut, or a colour profile. Confused? Probably.

Colour gamuts

A gamut is a range or spectrum of some entity, for example “the complete gamut of human emotions“. A colour gamut describes a subset of colours within the entire spectrum of colours that are identifiable by the human eye, i.e. the visible colour spectrum. More specifically a gamut is the range of colours a colour space can represent.

While the range of colour imaging devices is very broad, e.g. digital cameras, scanners, monitors, printers, the range of colours they produce can vary considerably. Colour gamuts are designed to reconcile the colours that can be used in common between devices. The term colour gamut is usually used in association with electronic devices, i.e. a device’s range of reproducible colours, or the range of different colours that can be interpreted by a colour model. A colour gamut can therefore be used to express the difference between various colour spaces, and to illustrate the extent of coverage of a colour space.

Fig.1: CIE XYZ 2D Chromaticity Diagram depicting various colour spaces as gamuts

The colour gamut of a device is sometimes visualized as a volume of colours, typically in the CIELab or CIELuv colour spaces, or as a projection in the CIEXYZ colour space, producing a 2D xy chromaticity diagram (CD). Typically a colour space specifies three (x,y) coordinates to define the three primary colours it uses. The triangle formed by the three coordinates encloses the gamut of colours that the device can reproduce. The table below shows the RGB coordinates for various colour spaces in the CIE chromaticity diagram, shown on the 2D diagram in Figure 1.

Name       | R(x)   | R(y)   | G(x)   | G(y)   | B(x)   | B(y)   | %CIE
sRGB       | 0.64   | 0.33   | 0.30   | 0.60   | 0.15   | 0.06   | 35
Adobe RGB  | 0.64   | 0.33   | 0.21   | 0.71   | 0.15   | 0.06   | 50
ProPhoto   | 0.7347 | 0.2653 | 0.1596 | 0.8404 | 0.0366 | 0.0001 | 91
Apple RGB  | 0.6250 | 0.34   | 0.28   | 0.5950 | 0.1550 | 0.07   | 33.5
NTSC RGB   | 0.67   | 0.33   | 0.21   | 0.71   | 0.14   | 0.08   | 54
CIE RGB    | 0.7346 | 0.2665 | 0.2811 | 0.7077 | 0.1706 | 0.0059 | –

Note that colour gamuts are really 3D, which is more informative than the 2D chromaticity diagram – the 3D view captures the nuances of the colour space, particularly the luminance of the primary colours. However, the problem with 3D is that it is not easy to plot, which is why a 2D representation is often used (the missing dimension is brightness).
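Using the (x,y) primaries from the table above, a rough comparison of gamut sizes can be made by computing the area of each triangle in the xy plane. This is only an approximation of gamut size, since it ignores luminance, but a short Python sketch illustrates the idea:

```python
def triangle_area(p1, p2, p3):
    """Shoelace formula for the area of a triangle given three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

srgb = [(0.64, 0.33), (0.30, 0.60), (0.15, 0.06)]   # R, G, B primaries
adobe = [(0.64, 0.33), (0.21, 0.71), (0.15, 0.06)]

print(triangle_area(*srgb))   # ~0.112
print(triangle_area(*adobe))  # ~0.151, a noticeably larger gamut than sRGB
```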

Two of the most common gamuts in the visual industry are sRGB and Adobe RGB (which are also colour spaces). Each of these gamuts references a different range of colours, suited to particular applications and devices. sRGB is perhaps the most common gamut used in modern electronic devices. It is a gamut that covers a good range of colours for average viewing needs, so much so that it is the default standard for the web, and for most images taken using digital cameras. The largest RGB working space, ProPhoto, is an RGB colour space developed by Kodak, and encompasses roughly 90% of the possible colours in the CIE xy chromaticity diagram.

Gamut mapping is the conversion of one device’s colour space to another, for example the case where an image stored as sRGB is to be reproduced on a print medium with a CMYK colour space. The objective of a gamut mapping algorithm is to translate colours in the input space to achievable colours in the output space. The gamut of an output device depends on its technology; for example, colour monitors are not always capable of displaying all colours associated with sRGB.

Colour profiles

On many systems the colour gamut is described as a colour profile, and more specifically is associated with an ICC colour profile, a standardized system put in place by the International Color Consortium. Such profiles help convert the colours in the colour space associated with an image to the device. For example, the standard profile on Apple laptops is “Color LCD”, and one of the most common RGB ICC profiles is sRGB (sRGB IEC61966-2.1).

From photosites to pixels (ii) – ADC

The inner workings of a camera are much more complex than most people care to know about, but everyone should have a basic understanding of how digital photographs are created.

The ADC is the Analog-to-Digital Converter. After the exposure of a picture ends, the electrons captured in each photosite are converted to a voltage. The ADC takes this analog signal as input, and classifies it into a brightness level represented by a binary number. The output from the ADC is sometimes called an ADU, or Analog-to-Digital Unit, which is a dimensionless unit of measure. The darker regions of a photographed scene will correspond to a low count of electrons, and consequently a low ADU value, while brighter regions correspond to higher ADU values.

Fig. 1: The ADC process

The value output by the ADC is limited by its resolution (or bit-depth). This is defined as the smallest incremental voltage that can be recognized by the ADC. It is usually expressed as the number of bits output by the ADC. For example a full-frame sensor with a resolution of 14 bits can convert a given analog signal to one of 2^14 distinct values. This means it has a tonal range of 16,384 values, from 0 to 16,383 (2^14 − 1). An output value is computed based on the following formula:

ADU = (AVM / SV) × 2^R

where AVM is the measured analog voltage from the photosite, SV is the system voltage, and R is the resolution of the ADC in bits. For example, for an ADC with a resolution of 8 bits (2^8 = 256 steps), if AVM = 2.7 and SV = 5.0, then ADU = 138.
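Expressed as code the conversion is tiny; this Python sketch (hypothetical function and argument names) reproduces the worked example:

```python
def adc_output(avm, sv, bits):
    """Quantize a measured photosite voltage avm against system voltage sv
    into an ADU, for an ADC with a resolution of `bits` bits."""
    adu = int((avm / sv) * 2 ** bits)
    return max(0, min(adu, 2 ** bits - 1))  # clamp to 0 .. 2^bits - 1

print(adc_output(2.7, 5.0, 8))  # 138, matching the example above
```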

Resolution (bits) | Digitizing steps | Digital values
8                 | 256              | 0..255
10                | 1024             | 0..1023
12                | 4096             | 0..4095
14                | 16384            | 0..16383
16                | 65536            | 0..65535
Dynamic ranges of ADC resolution

The process is roughly illustrated in Figure 1, using a simple 3-bit system with 2^3 = 8 values, 0 to 7. Note that because discrete numbers are being used to count and sample the analog signal, a stepped function is used instead of a continuous one. The deviations the stepped line makes from the linear line at each measurement are the quantization error. The process of converting from analog to digital is, of course, subject to some errors.
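The stepping, and the quantization error it introduces, can be seen by pushing a few sample voltages through a hypothetical 3-bit converter:

```python
# Quantize a few analog voltages with a 3-bit ADC (2^3 = 8 output codes) and
# show the error between each input and its reconstructed (stepped) value.
bits, system_voltage = 3, 5.0
for v in (0.0, 0.7, 1.4, 2.1, 2.8, 3.5, 4.2, 4.9):
    code = min(int(v / system_voltage * 2 ** bits), 2 ** bits - 1)
    reconstructed = (code + 0.5) * system_voltage / 2 ** bits
    print(f"{v:.1f} V -> code {code}, quantization error {v - reconstructed:+.3f} V")
```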

Now it’s starting to get more complicated. There are other things involved, like gain, which is the ratio applied while converting the analog voltage signal to bits. Then there is the least significant bit, which is the smallest change in signal that can be detected.

Those weird image sensor sizes

Some sensor sizes are listed as some form of inch, for example a sensor size of 1″ or 2/3″. The diagonal of a 2/3″ sensor is actually only 0.43″ (11mm). Camera sensors of the “inch” type do not signify the actual diagonal size of the sensor. These sizes are actually based on old video camera tubes, where the inch measurement referred to the outer diameter of the glass tube.

The world used to use vacuum tubes for a lot of things, far beyond just the early computers. Video cameras like those used on NASA’s unmanned deep space probes, such as Mariner, used vacuum tubes as their image sensors. These were known as vidicon tubes, basically a video camera tube design in which the target material is a photoconductor. There were a number of branded versions, e.g. Plumbicon (Philips), Trinicon (Sony).

A sample of the 1″ vidicon tube, and its active area.

These video tubes were described using the outside diameter of the overall glass tube, and always expressed in inches. This differed from the area of the actual imaging sensor, which was typically two-thirds of the size. For example, a 1″ sized tube typically had a picture area of about 2/3″ on the diagonal, or roughly 16mm. Toshiba, for instance, produced Vidicon tubes in sizes of 2/3″, 1″, 1.2″ and 1.5″.

These vacuum tube based sensors are long gone, yet some manufacturers still use this deception to make tiny sensors seem larger than they are. 

Image sensor | Image sensor size | Diagonal | Surface area
1″           | 13.2×8.8mm        | 15.86mm  | 116.16mm²
2/3″         | 8.8×6.6mm         | 11.00mm  | 58.08mm²
1/1.8″       | 7.11×5.33mm       | 8.89mm   | 37.90mm²
1/3″         | 4.8×3.6mm         | 6.00mm   | 17.28mm²
1/3.6″       | 4.0×3.0mm         | 5.00mm   | 12.00mm²
Various weird sensor sizes

For example, a smartphone may have a camera with a sensor size of 1/3.6″. How is this derived? The actual sensor is approximately 4×3mm in size, with a diagonal of 5mm. This 5mm is multiplied by 3/2, giving 7.5mm (0.295″). 1″ sensors are somewhere around 13.2×8.8mm in size with a diagonal of 15.86mm. So 15.86×3/2 = 23.79mm (0.94″), which is conveniently rounded up to 1″. The phrase “1 inch” makes it seem like the sensor is almost as big as a FF sensor, but in reality it is nowhere near that size.
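The arithmetic behind these “tube” sizes is simple enough to script. A small Python sketch (hypothetical function name), given the sensor’s width and height in millimetres:

```python
import math

def tube_size_inches(width_mm, height_mm):
    """Approximate 'video tube' size: the sensor diagonal scaled by 3/2,
    then converted from millimetres to inches (25.4 mm per inch)."""
    diagonal_mm = math.hypot(width_mm, height_mm)
    return (diagonal_mm * 3 / 2) / 25.4

print(tube_size_inches(13.2, 8.8))  # ~0.94, marketed as a "1 inch" sensor
print(tube_size_inches(4.0, 3.0))   # ~0.30, the smartphone example above
```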

Various sensors and their fractional “video tube” dimensions.

Supposedly this is also where MFT gets its 4/3 from. The MFT sensor is 17.3×13mm, with a diagonal of 21.64mm. So 21.64×3/2 = 32.46mm, or 1.28″, roughly equating to 4/3″. Other sources, however, say the 4/3 is all about the aspect ratio of the sensor, 4:3.

Photosites – Well capacity

When photons (light) enter the lens of a camera, some of them will pass all the way through to the sensor, and some of those photons will pass through various layers (e.g. filters) and end up being gathered in a photosite. Each photosite on a sensor has a capacity associated with it. This is normally known as the photosite well capacity (sometimes called the well depth, or saturation capacity). It is a measure of the amount of light that can be recorded before the photosite becomes saturated (no longer able to collect any more photons).

When photons hit the photo-receptive photosite, they are converted to electrons. The more photons that hit a photosite, the more the photosite cavity begins to fill up. After the exposure has ended, the number of electrons in each photosite is read, and the photosite is cleared to prepare for the next frame. The number of electrons counted determines the intensity value of that pixel in the resulting image. The gathered electrons create a voltage, which is an analog signal – the more photons that strike a photosite, the higher the voltage.

More light means a greater response from the photosite. At some point the photosite will not be able to register any more light because it is at capacity. Once a photosite is full, it cannot hold any more electrons, and any further incoming photons are discarded, and lost. This means the photosite has become saturated.

Fig.1: Well-depth illustrated with P representing photons, and e- representing electrons.

Different sensors can have photosites with different well-depths, which affects how many electrons the photosite can hold. For example consider two photosites from different sensors. One has a well-depth of 1000 electrons, and the other 500 electrons. If everything remains constant from the perspective of camera settings, noise etc., then over an exposure time the photosite with the smaller well-depth will fill to capacity sooner. If over the course of an exposure 750 photons are converted to electrons in each of the photosites, then the photosite with a well-depth of 1000 will be 75% capacity, and the photosite with a well-depth of 500 will become saturated, discarding 250 of the photons (see Figure 2).

Fig.2: Different well capacities exposed to 750 photons
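The saturation behaviour described above amounts to a simple clipping operation. A minimal sketch, assuming (hypothetically) that every photon is converted to an electron unless a different quantum efficiency is supplied:

```python
def electrons_collected(photons, well_capacity, quantum_efficiency=1.0):
    """Electrons accumulated in a photosite, clipped at its full-well capacity.
    quantum_efficiency is an assumed conversion factor, not from the text."""
    generated = photons * quantum_efficiency
    return min(generated, well_capacity)

print(electrons_collected(750, 1000))  # 750 electrons -> well at 75% capacity
print(electrons_collected(750, 500))   # 500 electrons -> saturated, 250 lost
```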

Two photosite cavities with the same well capacity but differing size (pitch, in μm) will also fill up with electrons at different rates – the larger photosite will fill up more quickly because it gathers more light. Figure 3 shows four differing sensors, each with a different photosite pitch and well capacity (the area of each box abstractly represents the well capacity of the photosite in relation to the photosite pitch).

Fig.3: Examples of well capacity in various sensors

Of course the reality is that electrons do not need a physical “bin” to be stored in; the photosites are just shown in this manner to illustrate a concept. In fact the term well-depth is somewhat of a misnomer, as it does not take into account the surface area of the photosite.

32 shades of gray

Humans don’t interpret gray tones very well – the human visual system can only distinguish approximately 32 shades of gray. So an 8-bit image with 256 tones already contains more information than humans can perceive. That’s why you don’t really see any more clarity in a 10-bit image with 1024 shades of gray than in a 5-bit image with 32 shades of gray. But why do we only see approximately 32 shades of gray?

It is the responsibility of the rod receptors to deal with black and white. The rods are far less precise than the cones, which deal with colour, but are more sensitive to the low levels of light typically associated with seeing in a dimly lit room, or at night. There are supposedly over 100 million rods in the retina, but this doesn’t help distinguish any more than 30-32 shades of gray. This may stem from evolutionary needs – in the natural world there are very few things that are actually gray (stones, some tree trunks, weathered wood), so there was very little need to distinguish between more than a few shades of gray. From an evolutionary perspective, humans needed night vision because they lived half their lives in darkness. This advantage remained crucial, apart perhaps from the past 150 years or so.

The rods work so well that dark-adapted humans can detect just a handful of photons hitting the retina. It is likely this is the reason there are so many rods in the retina – so that in exceedingly low levels of light as many as possible of the scarce photons are captured by rods. Figure 1 illustrates two grayscale optical illusions, which rely on our eyes’ insensitivity to shades of gray. In the image on the left, the horizontal strip of gray is actually the same shade throughout, although our eyes deceive us into thinking that it is light on the left and dark on the right. In the image on the right, the inner boxes are all the same shade of gray, even though they appear to be different.

Fig.1: Optical illusions

To illustrate this further, consider the series of images in the figure below. The first image is the original colour image. The middle image shows that image converted to grayscale with 256 shades of gray. The image on the right shows the colour image converted to 4-bit grayscale, i.e. 16 shades of gray. Is there any perceptual difference between Fig.2b and 2c? Hardly.

Fig.2a: Original colour
Fig.2b: 8-bit grayscale
Fig.2c: 4-bit grayscale
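For anyone wanting to reproduce something like Fig.2c, the requantization is straightforward. A short Python sketch using Pillow and NumPy (the filenames are placeholders):

```python
import numpy as np
from PIL import Image

def quantize_gray(path, bits):
    """Convert an image to grayscale and requantize it to 2**bits shades."""
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float32)
    step = 256 / (2 ** bits)
    quantized = np.floor(gray / step) * step + step / 2  # mid-point of each bin
    return Image.fromarray(quantized.astype(np.uint8))

quantize_gray("scene.jpg", 4).save("scene_16_shades.png")  # 16 shades of gray
```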

You will see articles that suggest humans can see anywhere from 500-750 shades of gray. They are usually articles related to radiology, where radiologists interpret images like x-rays. The machines that take these medical images are capable of producing 10-bit or 12-bit images, which are interpreted on systems capable of enhancing contrast. There may of course be people who can see more shades of gray, just like there are people with a condition called aphakia who possess ultraviolet vision (aphakia is the lack of a lens, which normally blocks UV light, so they are able to perceive wavelengths up to 300nm). There are also tetrachromats, who possess a fourth cone cell, allowing them to see up to 100 million colours.

Demystifying colour (iv) : RGB colour model

The basics of human perception underpin the colour theory used in devices like digital cameras. The RGB colour model is based partly on the Young-Helmholtz theory of trichromatic colour vision, developed by Thomas Young and Hermann von Helmholtz in the 19th century, which describes the manner in which the human visual system gives rise to colour. In 1802, Young postulated the existence of three types of photoreceptors in the eye, each sensitive to a particular range of visible light. Helmholtz further developed the theory in 1850, suggesting the three photoreceptors be classified into short, middle and long according to their response to the wavelengths of light striking the retina. In 1857 James Maxwell used linear algebra to support the Young-Helmholtz theory. Some of the first experiments in colour photography using the concept of RGB were made by Maxwell in 1861. He created colour images by combining three separate photographs, each taken through a red, green, or blue colour filter.

In the early 20th century the CIE set out to comprehensively quantify the human perception of colour. This was based on experimental work done by William David Wright and John Guild. The results of the experiments were summarized by the standardized CIE RGB colour matching functions for R, G, and B. The name RGB stems from the fact that the red, green, and blue primaries can be thought of as the basis for a vector representing a colour. Devices such as digital cameras have been designed to approximate the spectral response of the cones of the human eye. Before light photons are captured by a camera sensor’s photosites, they pass through red, green or blue optical filters which mimic the response of the cones. The image that is formed at the other end of the process is encoded using RGB colour space information.

The RGB colour model is one in which colours are represented as combinations of the three primary colours: red (R), green (G), and blue (B). RGB is an additive colour model, which means that a colour is formed by mixing various intensities of red, green and blue light. The collection of all the colours obtained by such a linear combination of red, green and blue forms a cube shaped colour space (see Fig.1). Each colour, as described by its RGB components, is represented by a point that can be found either on the surface or inside the cube.

Fig.1: The geometric representation of the RGB colour space

The cube, as shown in Fig.1, shows the primary (red, green, blue) and secondary colours (cyan, magenta, yellow), all of which lie on the vertices of the colour cube. The corner of the RGB colour cube that is at the origin of the coordinate system corresponds to black (R=G=B=0). Radiating out from black are the three primary coordinate axes, red, green, and blue. Each of these ranges from 0 to Cmax, where Cmax is typically 255 for a 24-bit colour space (8 bits each for R, G, and B). The corner of the cube that is diagonally opposite the origin represents white (R=G=B=255). Each of these 8-bit components contains 256 values, so the total number of colours which can be produced is 256^3, or 16,777,216 colours. Sometimes the values are normalized between 0 and 1, and the colour cube is called the unit cube. The diagonal (dashed) line connecting black and white corresponds to all the gray colours between black and white, which is also known as the gray axis. Grays are formed when all three components are equal, i.e. R=G=B. For example, 50% gray is (127,127,127).

Fig.2: An image and its RGB colour space.

Figure 2 illustrates the RGB cube for a colour image. Notice that while the pink colour of the sky looks somewhat uniform in the image, it is anything but, showing up as a swath of various shades of pink in the RGB cube. There are 275,491 unique colours in the Fig.2 image. Every possible colour corresponds to a point within the RGB colour cube, and is of the form Cxyz = (Rx, Gy, Bz). For example, Fig.3 illustrates three colours extracted from the image in Fig.2.

Fig.3: Examples of some RGB colours from Fig.2
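Counting the unique colours an image occupies in the RGB cube (the 275,491 figure quoted above) takes only a few lines of Python; the filename here is a placeholder:

```python
import numpy as np
from PIL import Image

# Flatten the image into a list of (R, G, B) triples and count the distinct
# ones, i.e. the distinct points the image occupies inside the RGB colour cube.
rgb = np.asarray(Image.open("photo.jpg").convert("RGB")).reshape(-1, 3)
print(len(np.unique(rgb, axis=0)))
```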

The RGB colour model has a number of benefits:

  • It is the simplest colour model.
  • No transformation is required to display data on a screen, e.g. images.
  • It is a computationally practical system.
  • The model is very easy to implement.

But equally it has a number of limitations:

  • It is not a perceptual model. In perceptual terms, colour and intensity are distinct from one another, but the R, G, and B components each contain both colour and intensity information. This makes it challenging to perform some image processing operations in RGB space.
  • It is psychologically non-intuitive, i.e. not able to determine what a particular RGB colour corresponds to in the real world, or what RGB means in a physical sense.
  • It is non-uniform, i.e. it is impossible to evaluate the perceived differences between colours on the basis of distance in RGB space (the cube).
  • For the purposes of image processing, the RGB space is often converted to another colour space by means of some non-linear transformation.

The RGB colour space is commonly used in imaging devices because of its affinity with the human visual system. Two of the most commonly used colour spaces derived from the RGB model are sRGB and Adobe RGB.

The simplicity of achromatic photographs

We live in a world where colour surrounds us, so why would anyone want to take an achromatic, black-and-white photograph? What draws us to a B&W photograph? Many modern colour images are brightened to add a sense of the exotic in the same way that B&W invokes an air of nostalgia. B&W does not exaggerate the truth in the same way that colour does. It does sometimes veil the truth, but in many ways it is an equalizer. Colours and the emotions they represent are stripped away, leaving nothing but raw structure. We are then less likely to draw emotions into the interpretation of achromatic photographs. There is a certain rawness to B&W photographs, which cannot be captured by colour.

Every colour image is of course built upon an achromatic image. The tonal attributes provide the structure; the chrominance provides the aesthetic elements that help us interpret what we see. Black and white photographs offer simplicity. When colour is removed from a photograph, it forces a different perspective of the world. To create a pure achromatic photograph means the photographer has to look beyond the story posed by the chromatic elements of the scene. It forces one to focus on the image. There is no hue, no saturation to distract. The composition of the scene suddenly becomes more important. Both light and the darkness of shadows become more pronounced. The photographic framework of a world without colour forces one to see things differently. Instead of highlighting colour, it helps highlight shape, texture, form and pattern.

Sometimes even converting a colour image to B&W using a filter can make the image content seem more meaningful. Colour casts or odd-ball lighting can often be vanquished if the image is converted. Noise that would appear distracting in a colour image adds to an image as “grain” in B&W. B&W images will always capture the truth of a subject’s structure, but colours are always open to interpretation due to the way individuals perceive colour.

Above is a colour photograph of a bronze sculpture taken at The Vigeland Park in Oslo, a sculpture park displaying the works of Gustav Vigeland. The colour image is interesting, but the viewer is somewhat distracted by the blue sky, and even the patina on the statue. A more interesting take is the achromatic image, obtained via the Instagram Inkwell filter. The loss of colour has helped improve the contrast between the sculpture and its background.