What is a photosite?

When people talk about cameras, they invariably talk about pixels, or rather megapixels. The new Fujifilm X-S20 has 26 megapixels. This means that the image produced by the camera will contain roughly 26 million pixels. But the sensor itself does not have any pixels; it has photosites.

The job of photosites is to capture photons of light. After processing, the data captured by each photosite is converted into a digital signal and turned into a pixel. Together, the photosites on a sensor produce the resultant image. Two numbers are used to describe the number of photosites on a sensor. The first is the physical sensor resolution, which is the actual number of physical photosites found on the sensor. For example, on the Sony a7R V (shown below), there are 9566×6377 physical photosites (61MP). However, not all of these photosites are used to create an image. Those that are used define the maximum image resolution, i.e. the maximum number of pixels in an image. For the Sony a7R V, 9504×6336 photosites (60.2MP) are used to create an image. This is sometimes known as the effective number of photosites.
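The two counts can be checked with simple arithmetic. The following is a minimal Python sketch using the a7R V figures quoted above; the function name `megapixels` is only illustrative.

```python
def megapixels(width: int, height: int) -> float:
    """Return a photosite (or pixel) count in millions."""
    return width * height / 1_000_000


# Figures quoted in the text for the Sony a7R V.
physical = megapixels(9566, 6377)   # every photosite on the sensor
effective = megapixels(9504, 6336)  # photosites used to form the image

# Photosites that do not end up as image pixels.
unused = 9566 * 6377 - 9504 * 6336

print(f"physical:  {physical:.1f} MP")   # physical:  61.0 MP
print(f"effective: {effective:.1f} MP")  # effective: 60.2 MP
print(f"unused photosites: {unused:,}")  # unused photosites: 785,038
```

Roughly 1.3% of the photosites on this sensor do not contribute directly to image pixels.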

The Sony a7R V

There are two major differences between photosites and pixels. Firstly, photosites are physical entities, whereas pixels are digital entities. Secondly, photosites have a physical size, which varies with the sensor type and the number of photosites on the sensor, whereas pixels are dimensionless. For example, each photosite on the Sony a7R V has a pitch (width) of 3.73µm and an area of 13.91µm².
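The relationship between pitch and area can be sketched in a few lines of Python, assuming square photosites. The sensor width of 35.7 mm used in the second calculation is an assumption (a nominal full-frame width), not a figure from this article, and both function names are illustrative.

```python
def photosite_area_um2(pitch_um: float) -> float:
    """Area of a square photosite, in square micrometres."""
    return pitch_um ** 2


def photosite_pitch_um(sensor_width_mm: float, photosites_across: int) -> float:
    """Approximate pitch: sensor width divided by the photosites in one row."""
    return sensor_width_mm * 1000 / photosites_across


area = photosite_area_um2(3.73)
print(f"area: {area:.2f} µm²")  # area: 13.91 µm²

# Assumption: 35.7 mm is a nominal full-frame sensor width, not from the text.
pitch = photosite_pitch_um(35.7, 9566)
print(f"estimated pitch: {pitch:.2f} µm")
```

For a fixed sensor size, increasing the number of photosites across the sensor necessarily shrinks the pitch.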

What is a pixel?

So camera sensors don’t have pixels, but what is a pixel?

Pixel is short for picture element, and represents the essential building block of images. The term first appeared in 1965, in two different academic articles in SPIE Proceedings written by Fred C. Billingsley of Caltech’s Jet Propulsion Laboratory. An alternative, pel, was introduced by William F. Schreiber of MIT in the Proceedings of the IEEE in 1967 (but it never really caught on).

Pixels are square in shape. In the context of digital cameras, a pixel is derived from the digitization of a signal from a sensor photosite. Pixels come together in a rectangular grid to form an image, which is somewhat like a mosaic in structure. Each pixel contributes a small part of the data needed to represent the picture being digitized.

Picture showing relationship between an image and a pixel.

What are the characteristics of a pixel? Firstly, a pixel is dimensionless: it has no inherent physical size. Pixels are not visible unless an image is greatly enlarged, and their perceived “size” is directly related to the size of the pixels on the physical device displaying the image. An image shown on a mobile device will appear to have smaller pixels than the same image shown on a 4K television.
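To make the device dependence concrete, the physical size of a display pixel can be estimated from a screen's resolution and diagonal. This is a rough sketch only; the two devices below are hypothetical examples chosen for illustration, and the function name is an assumption.

```python
import math


def display_pixel_pitch_mm(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Physical size of one display pixel, in millimetres."""
    diagonal_px = math.hypot(width_px, height_px)  # diagonal length in pixels
    pixels_per_inch = diagonal_px / diagonal_in
    return 25.4 / pixels_per_inch                  # 25.4 mm per inch


# Hypothetical devices, not taken from the text.
phone = display_pixel_pitch_mm(1170, 2532, 6.1)  # a 6.1-inch phone screen
tv = display_pixel_pitch_mm(3840, 2160, 55)      # a 55-inch 4K television

print(f"phone: {phone:.3f} mm per pixel")
print(f"4K TV: {tv:.3f} mm per pixel")
```

The same image pixel is drawn several times larger on the television than on the phone, even though the pixel itself carries no size.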

Pixels have a value associated with them which represents their “colour”. In a grayscale image this value represents luminance, with a pixel typically taking a value between black (0) and white (255). In a colour image the value represents both luminance and chrominance. A pixel in a colour image typically has three components, one each for Red, Green, and Blue (RGB); when combined, these values produce a single colour.
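As an illustration, a colour pixel can be modelled as three 8-bit components, and a single grayscale value derived from them. The sketch below uses the Rec. 601 luma weights, which are one common convention and an assumption here, not something the article specifies; the `RGB` type and `luma` function are illustrative names.

```python
from typing import NamedTuple


class RGB(NamedTuple):
    """A colour pixel with three 8-bit components (0-255 each)."""
    red: int
    green: int
    blue: int


def luma(p: RGB) -> int:
    """Approximate grayscale value using Rec. 601 weights (an assumption)."""
    return round(0.299 * p.red + 0.587 * p.green + 0.114 * p.blue)


black = RGB(0, 0, 0)
white = RGB(255, 255, 255)
orange = RGB(255, 165, 0)

for p in (black, white, orange):
    print(p, "->", luma(p))
```

Black and white map to 0 and 255 respectively, matching the grayscale range described above.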

The precision to which a pixel can specify colour is called its bit depth or colour depth. For example, a typical grayscale image is 8-bit, meaning it can contain 2^8=256 shades of gray. A typical colour image is 24-bit, with 8 bits for each of Red, Green and Blue, providing 2^24 or 16,777,216 different colours.
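The arithmetic behind these figures is easy to verify. The short Python sketch below also shows one way three 8-bit components can be packed into a single 24-bit value; the function names are illustrative, and the packing order (red in the high byte) is just one common convention.

```python
def levels(bits: int) -> int:
    """Number of distinct values representable with the given bit depth."""
    return 2 ** bits


def pack_rgb(red: int, green: int, blue: int) -> int:
    """Pack three 8-bit components into one 24-bit integer (red in the high byte)."""
    return (red << 16) | (green << 8) | blue


print(levels(8))                    # 256
print(f"{levels(24):,}")            # 16,777,216
print(hex(pack_rgb(255, 165, 0)))   # 0xffa500
```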

A single pixel considered in isolation conveys information on the luminance and/or chrominance of a single location in an image. A group of pixels with similar characteristics, e.g. chrominance or luminance, can coalesce to form an object. A pixel is surrounded by eight neighbouring pixels, four of which are direct, or adjacent, neighbours, and four of which are indirect, or diagonal, neighbours.

Pixel neighbours: The red pixels are the direct neighbours, while the blue pixels are the indirect neighbours.
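The neighbourhood idea can be sketched in Python as a set of row and column offsets around a pixel. This is a minimal illustration with hypothetical names; it also handles the edges of the image, where a pixel has fewer than eight neighbours.

```python
# Offsets (row, column) relative to the centre pixel.
DIRECT = [(-1, 0), (1, 0), (0, -1), (0, 1)]       # adjacent neighbours
DIAGONAL = [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # indirect neighbours


def neighbours(row, col, height, width, offsets):
    """Return the in-bounds neighbour coordinates of pixel (row, col)."""
    return [
        (row + dr, col + dc)
        for dr, dc in offsets
        if 0 <= row + dr < height and 0 <= col + dc < width
    ]


# Centre pixel of a 3x3 image: four direct and four diagonal neighbours.
print(neighbours(1, 1, 3, 3, DIRECT))
print(neighbours(1, 1, 3, 3, DIAGONAL))

# Corner pixel: only three of the eight neighbours lie inside the image.
print(neighbours(0, 0, 3, 3, DIRECT + DIAGONAL))
```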

The more pixels an image contains, the more detail it is able to describe. This is known as image resolution. Consider the two pictures of the word “Leica” below. The high resolution version has 687×339 pixels, whereas the low resolution image is 25% of its width and height, at 171×84 pixels, which is only about 6% of the total pixel count. The high resolution image has more pixels, and hence more detail.
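The pixel counts of the two versions can be compared directly. The sketch below uses only the dimensions quoted above; the variable names are illustrative.

```python
high = (687, 339)  # width, height of the high resolution image
low = (171, 84)    # width, height of the low resolution image

high_count = high[0] * high[1]
low_count = low[0] * low[1]

print(f"high: {high_count:,} pixels")  # high: 232,893 pixels
print(f"low:  {low_count:,} pixels")   # low:  14,364 pixels

# Scaling each side to about 25% leaves roughly 6% of the pixels.
print(f"width ratio: {low[0] / high[0]:.1%}")
print(f"count ratio: {low_count / high_count:.1%}")
```

Because the pixel count depends on both width and height, scaling each dimension by a factor reduces the count by roughly the square of that factor.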