Face Recognition Accuracy Problems
Question: Face recognition works well with European individuals, but overall accuracy is lower with Asian individuals.
This is a real problem. The face recognition model is only as good as its training data. Because the model was trained on public datasets built from pictures scraped from websites (celebrities, etc.), it only works as well as that source data. Those public datasets are not evenly distributed among individuals from all countries.
I hope to improve this in the future, but it requires building a dataset of millions of pictures of millions of people from lots of different places. If you have access to this kind of data and are willing to share it for model training, please email me.
Question: Your application only uses one picture of each person to identify them. Can I use more than one picture of each person to make identification more accurate?
Sure! It just requires more work and the best way to do it depends on the kind of application you are building.
You can use the face_encodings() function to get a representation of a single face image. The representation is an array of 128 floating-point elements (i.e. a face vector). You can use these face vectors to build any kind of machine learning classification model you want.
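As a rough sketch of the first step, the snippet below collects several face vectors per person. It uses the library's `load_image_file()` and `face_encodings()` calls; the helper names `encode_image` and `build_gallery`, the `(name, path)` input format, and the rule of skipping images that do not contain exactly one face are assumptions made for illustration, not part of the library.

```python
def encode_image(image_path):
    """Return one 128-element face vector (as a plain list) per face in the image."""
    # Imported lazily so build_gallery() can be exercised with a stand-in encoder.
    import face_recognition

    image = face_recognition.load_image_file(image_path)
    return [encoding.tolist() for encoding in face_recognition.face_encodings(image)]


def build_gallery(labeled_images, encode=encode_image):
    """Group face vectors by person.

    labeled_images: iterable of (person_name, image_path) pairs (hypothetical format).
    encode: function mapping an image path to a list of face vectors.
    Returns a dict mapping person_name -> list of face vectors.
    """
    gallery = {}
    for name, image_path in labeled_images:
        encodings = encode(image_path)
        # Assumption: keep only images with exactly one face, because with
        # zero or several faces it is unclear which vector belongs to `name`.
        if len(encodings) == 1:
            gallery.setdefault(name, []).append(encodings[0])
    return gallery
```

The resulting dictionary can then be fed to whatever classifier suits your application. Skipping ambiguous images is only one possible policy; you might instead crop the training pictures so each contains a single face.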
Here's a working example using a KNN classifier to classify a new image based on multiple pictures of each known person: https://github.com/ageitgey/face_recognition/blob/master/examples/face_recognition_knn.py
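To illustrate the idea without any extra dependencies, here is a minimal k-nearest-neighbours vote written with the standard library only. It is not the code from the linked example, which should be treated as the reference; the function name `knn_predict`, the dictionary input format, and the default `k=3` are assumptions. The `max_distance=0.6` default mirrors the default tolerance of the library's `compare_faces()` function.

```python
import math
from collections import Counter


def knn_predict(gallery, query, k=3, max_distance=0.6):
    """Classify `query` by majority vote among its k nearest known face vectors.

    gallery: dict mapping person name -> list of face vectors (hypothetical format).
    query: face vector of the unknown face, same length as the known vectors.
    Returns the winning name, or None if no neighbour is within max_distance.
    """
    # Euclidean distance from the query to every known vector, closest first.
    neighbours = sorted(
        (math.dist(vector, query), name)
        for name, vectors in gallery.items()
        for vector in vectors
    )[:k]
    # Only neighbours that are close enough get a vote.
    votes = Counter(name for distance, name in neighbours if distance <= max_distance)
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

With several pictures per person, a single poor photo is less likely to decide the result, because the other vectors for that person can outvote it. Returning `None` when nothing is close enough lets the caller treat the face as unknown rather than forcing a match.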
The default face encoding model was trained by @davisking on millions of images of faces grouped by individual. Re-training the model is not possible unless you have that volume of data. Adding a few thousand of your own images won't really help. Instead, try building a classifier on top of the current face encodings model [like this](https://github.com/ageitgey/face_recognition/blob/master/examples/face_recognition_knn.p.
So if you don't have tens of millions of images grouped by individual, you can't really retrain the model yourself. And if you do, let me know so we can combine training data! :)