Image resolution and human perception

Sometimes we view a poster or picture from afar and are amazed at the level of detail, or the crispness of the features, yet viewed from up close this just isn’t the case. Is this a trick of the eye? Not exactly: it has to do with the resolving power of the eye.

Images, whether they are analog photographs, digital prints, or paintings, can contain many different things. There are geometric patterns, shapes, colours – everything needed in order to perceive the contents of the image (or in the case of some abstract art, not perceive it). Now as we have mentioned before, the sharpest resolution in the human eye occurs in the fovea, which represents about 1% of the eye’s visual field – not exactly a lot. Moving outward from the fovea towards the peripheral vision, the ability to discern sharpness progressively declines. Of course the human visual system does form a picture, because the brain is able to use visual memory to build a mental model of the world as you move around.

Fig.1: A photograph of a stitched-together photograph (photographed at The Rooms, St. John’s, NFLD).

Image resolution plays a role in our perception of images. The human eye can only resolve a certain amount of detail based on viewing distance. A normal human eye (i.e. 20-20 vision) can distinguish patterns of alternating black and white lines with a feature size as small as one minute of arc, i.e. 1/60 degree, or π/(60×180) = 0.000291 radians. This leads to a simple equation for the finest resolution the eye can discern: 2/(0.000291 × distance(inches)) pixels per inch.

So if a poster were viewed from a distance of 6 feet, the resolution capable of being resolved by the eye is 95 PPI. That’s why the poster in Fig.1, comprised of various separate photographs stitched together (digitally) to form a large image, appears crisp from that distance. It could be printed at 100 DPI and still look good from that distance. Up close though it is a different story, as many of the edge features are quite soft, and lack the sharpness expected from the “distant” viewing. The reality is that the poster could be printed at 300 DPI, but viewed from the same distance of 6 feet, it is unlikely the human eye could discern any more detail. The extra resolution would only be useful if the viewer came closer, but coming closer means the entire scene may no longer fit in view. Billboards offer another good example. Billboards are viewed from anywhere from 500-2500 feet away. At 573 ft, the human eye can discern 1.0 PPI; at 2500 ft it would be 0.23 PPI (a single pixel would cover roughly 19 in²). So the images used for billboards don’t need to have a very high resolution.
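The arithmetic above is easy to check with a small sketch in Python (the function name is my own; the factor of 2 reflects needing two pixels to sample each one arc-minute feature):

```python
import math

# One arc-minute expressed in radians: pi / (60 * 180) ≈ 0.000291
ONE_ARCMIN_RAD = math.pi / (60 * 180)

def max_resolvable_ppi(distance_inches):
    """Finest pixel density (pixels per inch) a 20-20 eye can resolve
    at a given viewing distance: 2 / (0.000291 x distance)."""
    return 2 / (ONE_ARCMIN_RAD * distance_inches)

print(round(max_resolvable_ppi(6 * 12)))        # poster at 6 feet -> 95 PPI
print(round(max_resolvable_ppi(573 * 12), 2))   # billboard at 573 ft -> 1.0 PPI
print(round(max_resolvable_ppi(2500 * 12), 2))  # billboard at 2500 ft -> 0.23 PPI
```

The same helper also shows why 300 DPI buys nothing at 6 feet: the eye tops out near 95 PPI at that distance.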

Fig.2: Blurry details up close

Human perception is then linked to the resolving power of the eye – the ability to distinguish between very small objects that are very close together. To illustrate this further, consider the images shown in Fig.3, extracted from a digital scan of a vintage brochure at various enlargement scales. When viewing the brochure itself it is impossible to see the dots associated with the printing process, because they are too small to discern (and that’s the point). The original, viewed on the screen, is shown in Fig.3D. Even in Fig.3C it is challenging to see the dot pattern that makes up the print. In both Fig.3A and 3B, the dot pattern can be identified. It is no different with any picture: viewed close up, the perception is of a blocky dot matrix, not the continuous image seen from afar.

Fig.3: Resolving detail

Note that this is an exaggerated example, as the human eye does not have the discerning power to view the dots of the printing process without assistance. If the image were blown up to poster size, however, a viewer would be able to discern the printing pattern. Many vintage photographs, such as the vacation pictures sold in 10-12 photo sets, work on the same principle. As a 9cm×6cm black-and-white print, they seem to show good detail when viewed from 16-24 inches away. However, when viewed through a magnifying glass, or enlarged post-digitization, they lack the sharpness seen from afar.

Note that 20-20 vision is based on the 20ft distance from the patient to the acuity chart when taking an eye exam. Outside of North America, the distance is normally 6 metres, and so 20-20 = 6-6.


Resolution of the human eye (i) pure pixel power

A lot of visual technology such as digital cameras, and even TVs are based on megapixels, or rather the millions of pixels in a sensor/screen. What is the resolution of the human eye? It’s not an easy question to answer, because there are a number of facets to the concept of resolution, and the human eye is not analogous to a camera sensor. It might be better to ask how many pixels would be needed to make an image on a “screen” large enough to fill our entire field of view, so that when we look at it we can’t detect pixelation.

Truthfully, we may never really be able to put an exact number on the resolution of the human visual system – the eyes are organic, not digital. Human vision is made possible by the presence of photoreceptors in the retina. These photoreceptors, of which there are over 120 million in each eye, convert electromagnetic radiation into neural signals. The photoreceptors consist of rods and cones. Rods (which are rod shaped) provide scotopic vision, are responsible for low-light vision, and are achromatic. Cones (which are cone shaped) provide photopic vision, are active at high levels of illumination, and are capable of colour vision. There are roughly 6-7 million cones, and nearly 120-125 million rods.

But how many [mega] pixels is this equivalent to? An easy guess of pixel resolution might be 125-130 megapixels. Maybe. But many rods share a single bipolar cell, providing low resolution, whereas cones (particularly in the fovea) each have their own bipolar cell. The bipolar cells transmit signals from the photoreceptors to the ganglion cells. So there may be far fewer than 120 million rods providing independent information (sort of like taking a bunch of grayscale pixels in an image and averaging their values to create an uber-pixel). So that’s not a fruitful number.

A few years ago Roger N. Clark of Clark Vision performed a calculation, assuming a field of view of 120° by 120° and an acuity of 0.3 arc minutes. The result? He calculated that the human eye has a resolution of 576 megapixels. The calculation is simple enough:

(120 × 120 × 60 × 60) / (0.3 × 0.3) = 576,000,000

The value 60 is the number of arc-minutes per degree, and 0.3 arcmin is essentially the width of a “pixel”. A square degree is then 60×60 arc-minutes, and at this acuity contains 40,000 “pixels”. Seems like a huge number. But, as Clark notes, the human eye is not a digital camera. We don’t take snapshots (more’s the pity), and our vision system is more like a video stream. We also have two eyes, providing stereoscopic, binocular vision with depth perception. So there are many more factors at play than in a simple sensor. For example, we typically move our eyes around, and our brain likely assembles a higher resolution image than the photoreceptors alone allow (similar, I would imagine, to how some digital cameras create a high-megapixel image by slightly shifting the sensor and combining the shifted exposures).

The issue here may actually be the pixel size. In optimal viewing conditions the human eye can resolve detail as small as 0.59 arc minutes per line pair, which equates to roughly 0.3 arc minutes per line. This number comes from an 1897 study by Arthur König, “Die Abhängigkeit der Sehschärfe von der Beleuchtungsintensität” (roughly, “The Dependence of Visual Acuity on the Illumination Intensity”). A more recent study from 1990 (Curcio90) suggests a value of 77 cycles per degree. To convert this to arc-minutes per cycle: 60/77 = 0.779. Two pixels define a cycle, so 0.779/2 = 0.39. Now if we use 0.39×0.39 arcmin as the pixel size, we get 6.57 pixels per arcmin², versus 11.11 pixels when the acuity is 0.3. This changes the calculated value substantially, to 341 megapixels (60% of the previous calculation).
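The unit conversion above is easy to get wrong, so here it is spelled out as a sketch (function name my own; the 77 cycles/degree figure is from Curcio90):

```python
def acuity_arcmin_from_cpd(cycles_per_degree):
    """Convert a spatial-frequency limit (cycles per degree) into an
    equivalent pixel size in arc-minutes: 60 arc-minutes per degree,
    and two pixels per cycle (one light, one dark)."""
    arcmin_per_cycle = 60 / cycles_per_degree
    return arcmin_per_cycle / 2

print(round(acuity_arcmin_from_cpd(77), 2))  # -> 0.39
```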

Clark’s calculation using 120° is also conservative, as the eye’s field of view is roughly 155° horizontally and 135° vertically. If we used these constraints we would get 837 megapixels (at 0.3 arcmin), or 495 megapixels (at 0.39). A pixel size of 0.3 arcmin also assumes optimal viewing – but only about 75% of the population have 20/20 vision, with or without corrective measures. 20/20 vision implies an acuity of 1 arc minute, which means a pixel size of 1×1 arcmin². This could mean a simple 75 megapixels. Three other factors complicate this: (i) these calculations assume uniform optimal acuity, which is very rarely the case, (ii) vision is binocular, not monocular, and (iii) the field of view is likely not a rectangle.

For binocular vision, assume each eye has a horizontal field of view of 155°, with an overlap of 120° (120° of vision from each eye is binocular; the remaining 35° in each eye is monocular). This results in an overall horizontal field of view of 190°. Using 190° and 1 arc minute acuity gives a combined total of 92 megapixels; changing the acuity to 0.3 gives over 1 gigapixel. Quite a range.
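All of these combinations can be tabulated with the same grid-of-pixels arithmetic (a sketch only; the function name is my own):

```python
def eye_megapixels(fov_h_deg, fov_v_deg, acuity_arcmin):
    """Field of view as a grid of square pixels, acuity arc-minutes on a side."""
    return (fov_h_deg * 60 / acuity_arcmin) * (fov_v_deg * 60 / acuity_arcmin) / 1e6

fov_h = 155 + 155 - 120  # two 155-degree eyes with 120 degrees of overlap -> 190
print(round(eye_megapixels(155, 135, 0.3)))    # single eye, optimal acuity -> 837
print(round(eye_megapixels(155, 135, 1.0)))    # single eye, 20-20 acuity -> 75
print(round(eye_megapixels(fov_h, 135, 1.0)))  # binocular, 20-20 -> 92
print(round(eye_megapixels(fov_h, 135, 0.3)))  # binocular, optimal -> 1026
```

The last line is the “over 1 gigapixel” figure; swapping in different acuities or fields of view shows just how sensitive the final number is to the starting assumptions.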

All these calculations are mere musings – there are far too many variables to consider in trying to calculate a generic number to represent the megapixel equivalent of the human visual system. The numbers I have calculated are approximations only to show the broad range of possibilities based solely on a few simple assumptions. In the next couple of posts we’ll look at some of the complicating factors, such as the concept of uniform acuity.

(Curcio90) Curcio, C.A., Sloan, K.R., Kalina, R.E., Hendrickson, A.E., “Human photoreceptor topography”, The Journal of Comparative Neurology, 292, pp.497-523 (1990)