Facial recognition has baffled scientists for generations. How can the human brain commit so many individual faces to memory with such ease? A study published this week in the journal Cell finds that facial recognition may actually be much simpler than we thought.
When we look at a selection of faces, our brains can single out the familiar ones with no effort at all. This smooth process comes so naturally that most people never give it a second thought.
But someone who does give this phenomenon a second thought is Doris Tsao, a professor of biology and biological engineering at the California Institute of Technology in Pasadena.
Over recent years, Prof. Tsao has conducted a range of experiments that have attempted to get to the bottom of facial perception.
Specifically, she and her team found six regions that are responsible for identifying faces. These regions, referred to as face patches, are housed in the inferior temporal (IT) cortex, which is an area known to be involved in visual processing.
Each of the six patches is packed with neurons that fire particularly strongly when presented with faces, compared with other objects. Prof. Tsao and team call these neurons “face cells.” They also demonstrated that artificially stimulating these face cells in macaque monkeys disturbed their perception of faces much more than other objects.
Earlier theories had it that each of the cells within these brain areas represented a specific face. This, however, does not ring true. “You could potentially recognize 6 billion people, but you don’t have 6 billion face cells in the IT cortex,” Prof. Tsao explains. “There had to be some other solution.”
In the latest study, Prof. Tsao and postdoctoral fellow Steven Le Chang dug deeper into the function of face cells. They showed that each of the cells represents a particular axis in multidimensional space, which the researchers refer to as “face space.”
In a similar way to red, blue, and green combining to produce every color, these axes can be combined to produce every possible face.
The team started out “by designing a 50-dimensional space that could represent all faces.” Half of the dimensions were assigned to face shape, such as the distance between the eyes, and the other 25 were assigned to other features, including texture and skin tone.
They used the macaque monkey as a model. By inserting electrodes into the face patches, they could record the activity of single face cells. For each face shown to the macaque, a given face cell responded in proportion to where that face fell along a single axis of face space.
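To make the idea of an axis concrete, here is a minimal sketch of a linear face cell. It is an illustration only, not the researchers' model code: the function names, the random axis, and the zero baseline are all assumptions. It shows that moving a face along the cell's axis changes the response in proportion to the size of the step.

```python
import random

N_DIMS = 50  # 25 shape + 25 appearance dimensions, as described in the article


def random_vector(rng, n=N_DIMS):
    """Draw n independent standard-normal coordinates (made-up data)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def cell_response(axis, face, baseline=0.0):
    """Hypothetical linear face cell: baseline plus the projection of the face onto its axis."""
    return baseline + sum(a * f for a, f in zip(axis, face))


rng = random.Random(0)
axis = random_vector(rng)  # the cell's preferred axis (assumed, not measured)
face = random_vector(rng)  # one face as a point in face space

# Step the face along the cell's axis and compare the two responses.
step = 0.5
moved = [f + step * a for f, a in zip(face, axis)]
delta = cell_response(axis, moved) - cell_response(axis, face)

# For a linear cell, the change equals step * |axis|^2.
axis_norm_sq = sum(a * a for a in axis)
print(delta, step * axis_norm_sq)
```

Doubling `step` doubles `delta`, which is what "proportional response" means in this simplified picture.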
Following on from this, the team designed an algorithm that could decode faces from the neural responses alone. In other words, by simply measuring the activity of these face cells, the scientists could generate a representation of the face that the monkey was viewing. When the algorithm-generated images were compared with the actual images, they were almost identical.
Perhaps surprisingly, taking signals from a little more than 200 neurons within just two face patches was enough to reconstruct the faces. There were 106 cells in one face patch and 99 in the other.
“People always say a picture is worth a thousand words. But I like to say that a picture of a face is worth about 200 neurons.”
Prof. Doris Tsao
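The decoding step can be sketched in a few lines. The example below is a toy under stated assumptions, not the algorithm from the study: it assumes each of 205 simulated cells responds linearly along a known, randomly chosen axis with no noise, and it recovers the 50 face coordinates by ordinary least squares. The helper names `solve` and `decode` are hypothetical.

```python
import random

N_DIMS = 50    # dimensions of the face space described in the article
N_CELLS = 205  # 106 + 99 recorded cells


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def decode(axes, responses):
    """Least-squares estimate of face coordinates via the normal equations."""
    n = len(axes[0])
    gram = [[sum(ax[i] * ax[j] for ax in axes) for j in range(n)] for i in range(n)]
    proj = [sum(ax[i] * r for ax, r in zip(axes, responses)) for i in range(n)]
    return solve(gram, proj)


rng = random.Random(1)
# Simulated cell axes and one simulated face (all made-up data).
axes = [[rng.gauss(0.0, 1.0) for _ in range(N_DIMS)] for _ in range(N_CELLS)]
face = [rng.gauss(0.0, 1.0) for _ in range(N_DIMS)]

# Each simulated cell reports the projection of the face onto its axis.
responses = [sum(a * f for a, f in zip(ax, face)) for ax in axes]

decoded = decode(axes, responses)
max_error = max(abs(d - f) for d, f in zip(decoded, face))
print(max_error)
```

Because there are more cells (205) than unknown coordinates (50), the system is overdetermined, and the face is recovered almost exactly in this noise-free setting. Real recordings are noisy, so a decoder fitted to actual data would not be this precise.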
The final nail in the coffin of the one-neuron-one-face theory came in the last part of the study. Prof. Tsao and Chang found that a range of very different-looking faces could cause an individual face cell “to fire in exactly the same way.”
It was an unexpected finding. As Prof. Tsao says, “This was completely shocking to us, we had always thought face cells were more complex. But, it turns out each face cell is just measuring distance along a single axis of face space and is blind to other features.”
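This "blindness" follows directly from a cell that reads out only one axis. The sketch below is an illustration with made-up data and a hypothetical linear cell, not the study's analysis: it builds two faces that sit far apart in a 50-dimensional face space yet produce the same response, because they differ only in directions perpendicular to the cell's axis.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(2)
n_dims = 50  # size of the face space described in the article

axis = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]    # the cell's axis (assumed)
face_a = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]  # first face (made up)
offset = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]  # a random change to the face

# Remove the part of the change that lies along the cell's axis,
# leaving a change that is perpendicular to it.
scale = dot(offset, axis) / dot(axis, axis)
orthogonal = [o - scale * a for o, a in zip(offset, axis)]

face_b = [f + o for f, o in zip(face_a, orthogonal)]   # a very different face

response_a = dot(axis, face_a)
response_b = dot(axis, face_b)
distance = sum((x - y) ** 2 for x, y in zip(face_a, face_b)) ** 0.5

print(response_a, response_b, distance)
```

The two responses agree to within rounding error while `distance` is large, so this one cell cannot tell the faces apart. It takes a population of cells with different axes to do that.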
Although there are a number of steps that need to occur between seeing an image and the response of the face cells, the bare bones of facial recognition may be surprisingly simple. The results may also extend beyond facial recognition. “This work suggests that other objects could be encoded with similarly simple coordinate systems,” explains Prof. Tsao.
This knowledge might spur the creation of innovative applications for artificial intelligence. As Prof. Tsao adds, “This could inspire new machine learning algorithms for recognizing faces. In addition, our approach could be used to figure out how units in deep networks encode other things, such as objects and sentences.”