Face recognition is the problem of identifying and verifying people in a photograph by their face.
It is a task that is trivially performed by humans, even under varying light and when faces are changed by age or obstructed with accessories and facial hair. Nevertheless, it remained a challenging computer vision problem for decades, until recently.
Deep learning methods are able to leverage very large datasets of faces and learn rich and compact representations of faces, allowing modern models first to match and later to outperform the face recognition capabilities of humans.
In this post, you will discover the problem of face recognition and how deep learning methods can achieve superhuman performance.
After reading this post, you will know:
- Face recognition is a broad problem of identifying or verifying people in photographs and videos.
- Face recognition is a process comprising detection, alignment, feature extraction, and a recognition task.
- Deep learning models first approached then exceeded human performance for face recognition tasks.
Let’s get started.

A Gentle Introduction to Deep Learning for Face Recognition
Photo by Susanne Nilsson, some rights reserved.
Overview
This tutorial is divided into five parts; they are:
- Faces in Photographs
- Process of Automatic Face Recognition
- Face Detection Task
- Face Recognition Tasks
- Deep Learning for Face Recognition
Faces in Photographs
There is often a need to automatically recognize the people in a photograph, and there are many reasons why we might want to do so.
For example:
- We may want to restrict access to a resource to one person, called face authentication.
- We may want to confirm that the person matches their ID, called face verification.
- We may want to assign a name to a face, called face identification.
Generally, we refer to this as the problem of automatic “face recognition,” and it may apply to both still photographs and faces in streams of video.
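The difference between these tasks can be made concrete with a small sketch. It assumes each face has already been reduced to a numeric feature vector, and that two faces match when their Euclidean distance falls below a threshold. The function names and the threshold value of 0.6 are hypothetical choices for illustration, not taken from any particular library.

```python
from math import dist
from typing import Dict, Optional, Sequence

Vector = Sequence[float]


def verify(probe: Vector, claimed: Vector, threshold: float = 0.6) -> bool:
    """One-to-one check: does the probe face match the single claimed identity?"""
    return dist(probe, claimed) <= threshold


def authenticate(probe: Vector, owner: Vector, threshold: float = 0.6) -> bool:
    """Grant access only if the probe matches the one enrolled owner."""
    return verify(probe, owner, threshold)


def identify(probe: Vector, gallery: Dict[str, Vector],
             threshold: float = 0.6) -> Optional[str]:
    """One-to-many search: name of the closest known face, or None if none is close."""
    name = min(gallery, key=lambda n: dist(probe, gallery[n]))
    return name if dist(probe, gallery[name]) <= threshold else None


# Toy gallery of two known people (the vectors are made up).
gallery = {"alice": [0.0, 1.0], "bob": [1.0, 0.0]}
print(verify([0.1, 0.9], gallery["alice"]))
print(identify([0.9, 0.1], gallery))
```

In this framing, authentication and verification are both one-to-one comparisons, while identification searches over every known face and may return no match at all.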
Humans can perform this task very easily.
We can find the faces in an image and comment as to who the people are, if they are known. We can do this very well, even when the people have aged, are wearing sunglasses, have different colored hair, are looking in different directions, and so on. We can do this so well that we find faces where there aren’t any, such as in clouds.
Nevertheless, this remained a hard problem to perform automatically with software, even after 60 or more years of research, until perhaps very recently.
For example, recognition of face images acquired in an outdoor environment with changes in illumination and/or pose remains a largely unsolved problem. In other words, current systems are still far away from the capability of the human perception system.
— Face Recognition: A Literature Survey, 2003.
Process of Automatic Face Recognition
Face recognition is the problem of identifying or verifying faces in a photograph.
A general statement of the problem of machine recognition of faces can be formulated as follows: given still or video images of a scene, identify or verify one or more persons in the scene using a stored database of faces.
— Face Recognition: A Literature Survey, 2003.
Face recognition is often described as a process that involves four steps: face detection, face alignment, feature extraction, and finally face recognition.
- Face Detection. Locate one or more faces in the image and mark with a bounding box.
- Face Alignment. Normalize the face to be consistent with the database, for example in its geometry and photometrics.
- Feature Extraction. Extract features from the face that can be used for the recognition task.
- Face Recognition. Perform matching of the face against one or more known faces in a prepared database.
A given system may have a separate module or program for each step, which was traditionally the case, or may combine some or all of the steps into a single process.
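The four steps can be sketched as a chain of small functions, one per step. This is a toy illustration under stated assumptions, not a working recognizer: the image is a plain grid of grayscale values, and every stage body is a deliberately trivial placeholder (all function names are hypothetical) standing in for a real detector, aligner, and feature extractor.

```python
from math import dist
from typing import Dict, List, Tuple

Image = List[List[float]]           # toy grayscale image: rows of intensities
Box = Tuple[int, int, int, int]     # (x, y, width, height)


def detect_faces(image: Image) -> List[Box]:
    """Step 1 (placeholder): treat the whole image as a single face."""
    return [(0, 0, len(image[0]), len(image))]


def align_face(image: Image, box: Box) -> Image:
    """Step 2 (placeholder): crop the box and rescale intensities to [0, 1]."""
    x, y, w, h = box
    crop = [row[x:x + w] for row in image[y:y + h]]
    lo = min(min(row) for row in crop)
    hi = max(max(row) for row in crop)
    span = (hi - lo) or 1.0         # avoid dividing by zero on flat crops
    return [[(p - lo) / span for p in row] for row in crop]


def extract_features(face: Image) -> List[float]:
    """Step 3 (placeholder): mean intensity of each row as a tiny feature vector."""
    return [sum(row) / len(row) for row in face]


def recognize(features: List[float], database: Dict[str, List[float]]) -> str:
    """Step 4: name of the stored vector nearest to the extracted features."""
    return min(database, key=lambda name: dist(features, database[name]))


def run_pipeline(image: Image, database: Dict[str, List[float]]) -> List[str]:
    """Run detection, alignment, feature extraction, and recognition in order."""
    return [recognize(extract_features(align_face(image, box)), database)
            for box in detect_faces(image)]


database = {"alice": [0.0, 1.0], "bob": [1.0, 0.0]}
print(run_pipeline([[0.0, 0.0], [10.0, 10.0]], database))
```

Keeping each step behind its own function mirrors the traditional modular design; a system that merges steps would simply replace several of these functions with one.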
A helpful overview of this process is given in the book “Handbook of Face Recognition” and is reproduced below:

Overview of the Steps in a Face Recognition Process. Taken from “Handbook of Face Recognition,” 2011.
Face Detection Task
Face detection is the non-trivial first step in face recognition.
It is a problem of object recognition that requires both that the location of each face in a photograph is identified (i.e. its position) and that the extent of the face is localized (e.g. with a bounding box). Object recognition itself is a challenging problem, although in this case it is simpler, as there is only one type of object to be localized (faces), even though faces can vary wildly.
The human face is a dynamic object and has a high degree of variability in its appearance, which makes face detection a difficult problem in computer vision.
— Face Detection: A Survey, 2001.
Further, because it is the first step in a broader face recognition system, face detection must be robust. For example, a face cannot be recognized if it cannot first be detected. That means faces must be detected with all manner of orientations, angles, light levels, hairstyles, hats, glasses, facial hair, makeup, ages, and so on.
As a visual front-end processor, a face detection system should also be able to achieve the task regardless of illumination, orientation, and camera distance.
— Face Detection: A Survey, 2001.
The 2001 paper titled “Face Detection: A Survey” provides a taxonomy of face detection methods that can be broadly divided into two main groups:
- Feature-Based.
- Image-Based.
Feature-based face detection uses hand-crafted filters that search for and locate faces in photographs based on a deep knowledge of the domain. These filters can be very fast and very effective when they match, although they can fail dramatically when they don’t, which makes them somewhat fragile.
… make explicit use of face knowledge and follow the classical detection methodology in which low level features are derived prior to knowledge-based analysis. The apparent properties of the face such as skin color and face geometry are exploited at different system levels.
— Face Detection: A Survey, 2001.
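As a rough illustration of the feature-based idea, the sketch below uses skin color as the low-level feature. A hand-written rule marks skin-like pixels, and a bounding box is drawn around them. The RGB thresholds are illustrative assumptions chosen for this example, not values from the survey, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]        # (R, G, B), each 0-255
Box = Tuple[int, int, int, int]     # (x, y, width, height)


def looks_like_skin(pixel: Pixel) -> bool:
    """Hand-crafted color rule; the thresholds are illustrative assumptions."""
    r, g, b = pixel
    return (r > 95 and g > 40 and b > 20
            and max(pixel) - min(pixel) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_bounding_box(image: List[List[Pixel]]) -> Optional[Box]:
    """Box enclosing every skin-like pixel, or None if no pixel passes the rule."""
    hits = [(x, y)
            for y, row in enumerate(image)
            for x, pixel in enumerate(row)
            if looks_like_skin(pixel)]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


skin, background = (200, 140, 120), (30, 30, 30)
image = [[background, background, background],
         [background, skin, skin],
         [background, skin, skin]]
print(skin_bounding_box(image))

# The same scene under much dimmer light: every channel scaled down to 40%.
dim = [[tuple(int(c * 0.4) for c in p) for p in row] for row in image]
print(skin_bounding_box(dim))
```

The dimmed copy of the image no longer passes the rule at all, which is the kind of fragility described above: the filter works well only while its built-in assumptions hold.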
Alternatively, image-based face detection is holistic and learns how to automatically locate and extract faces from the entire image. Neural networks fit into this class of methods.
… address face detection as a general recognition problem. Image-based representations of faces, for example in 2D intensity arrays, are directly classified into a face group using training algorithms without feature derivation and analysis. […] these relatively new techniques incorporate face knowledge implicitly into the system through mapping and training schemes.
— Face Detection: A Survey, 2001.
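One common way to apply an image-based classifier is to slide a fixed-size window over the intensity array and classify each patch as face or not-face. The sketch below assumes this sliding-window scheme. The `classify` argument stands in for a trained model, and the brightness-based scorer used in the example is a hypothetical placeholder with no learned face knowledge.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]           # 2D intensity array
Box = Tuple[int, int, int, int]     # (x, y, width, height)


def sliding_window_detect(image: Image,
                          window: int,
                          stride: int,
                          classify: Callable[[Image], float],
                          threshold: float = 0.5) -> List[Box]:
    """Score every window-sized patch and keep those the classifier accepts."""
    height, width = len(image), len(image[0])
    detections = []
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            patch = [row[x:x + window] for row in image[y:y + window]]
            if classify(patch) >= threshold:
                detections.append((x, y, window, window))
    return detections


def mean_brightness(patch: Image) -> float:
    """Placeholder scorer; a real system would use a model trained on faces."""
    return sum(sum(row) for row in patch) / (len(patch) * len(patch[0]))


image = [[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
print(sliding_window_detect(image, window=2, stride=2, classify=mean_brightness))
```

Note that the detection loop contains no face-specific rules; everything it knows about faces would come from whatever model is passed in as `classify`.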
Perhaps the dominant method for face detection for many years, and one used in many cameras, was described in the 2004 paper titled “Robust Real-time Object Detection” and is called the detector cascade or simply “cascade.”