COVID-19 has spread across the whole galaxy, including the planet where R2D2 lives. To slow the spread of the virus, R2D2 has a new mission: recognize faces and find people who don't wear masks. R2D2 will take advantage of its new deep learning module to learn from the dataset gathered from other planets.
This assignment consists of 2 parts. In Part 1, you will learn how to use PCA to implement a face detector. In Part 2, you will use a Convolutional Neural Network (CNN) to perform mask classification.
A skeleton notebook r2d2_hw4.ipynb containing empty definitions for each question has been provided. Since portions of this assignment will be graded automatically, none of the names or function signatures in this file should be modified; however, you are free to introduce additional variables or functions as needed. You can use Google Colab to edit the notebook and run the training on a free GPU from Google.
You are strongly encouraged to follow the Python style guidelines set forth in PEP 8, which was written in part by the creator of Python. However, your code will not be graded for style.
Once you have completed the assignment, you should submit your file on Gradescope.
Principal Component Analysis (PCA) is a widely used approach for dimensionality reduction and building predictive models. For example, a person may have many personality traits, yet a few main ones usually suffice to describe them. PCA does a similar job: it picks a few features that describe a large dataset.
PCA consists of two main steps:
1. Compute the covariance matrix of the data.
2. Find the eigenvectors and eigenvalues of that covariance matrix.
These are basic linear algebra concepts; if you are unfamiliar with these terms, please follow the Wiki links for reference.
Now switch to our face detection task: just as PCA finds eigenvectors, we can use it to find eigenfaces, a.k.a. principal faces, to represent human faces.
To find effective eigenfaces, we need a large number of real human faces as our dataset. The Labeled Faces in the Wild (LFW) dataset provides over 13,000 faces; we selected the first 2,000, which is large enough for our task. You can download the dataset here or just run the cell in the notebook to fetch the data.
To use the dataset, run the following code to load it as a NumPy array:
>>> import pickle
>>> faces = pickle.load(open("faces.p", "rb"))
>>> print(faces.shape)
(2000, 64, 64)
This array contains 2000 rows; each row represents a face of 64 by 64 pixels.
TODO
Implement the flatten_faces(faces)
which flattens each 2D face into a 1D vector.
>>> face_vectors = flatten_faces(faces)
>>> print(face_vectors.shape)
(2000, 4096)
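A minimal sketch, assuming faces is the (2000, 64, 64) NumPy array loaded above:

def flatten_faces(faces):
    # Reshape each 64x64 face into a single 4096-dimensional row vector.
    return faces.reshape(faces.shape[0], -1)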
Implement the cal_avgface(face_vectors)
which takes the mean of all face vectors to get the average face of the dataset.
>>> average_face = cal_avgface(face_vectors)
>>> print(average_face.shape)
(4096,)
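One possible sketch, averaging over the first axis:

def cal_avgface(face_vectors):
    # Mean over all 2000 faces, leaving a single 4096-dimensional vector.
    return face_vectors.mean(axis=0)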
You can also take a look at what it looks like by running:
>>> import matplotlib.pyplot as plt
>>> plt.imshow(average_face.reshape(64, 64), cmap='gray')
>>> plt.show()
[Figure: the average face]
If your result looks like the figure above, you are on the right track, and you can save the average face by:
>>> pickle.dump(average_face, open('average_face.p', 'wb'))
Now we can start to compute the eigenfaces. As mentioned before, this takes two steps: compute the covariance matrix, then take the eigenvectors of that matrix. The mathematical theory is shown below; you can also read this instruction from JHU for details.
TODO
Implement the cal_covariance(A)
which takes in the matrix of face vectors with the average face subtracted, and returns the covariance matrix given by the equation \(C = AA^T\).
>>> A = face_vectors - average_face
>>> C = cal_covariance(A)
>>> print(C.shape)
(2000, 2000)
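A direct translation of the equation, as a sketch; note that \(AA^T\) is 2000 by 2000, much smaller than the 4096 by 4096 covariance over pixel dimensions, which is what makes this trick efficient:

def cal_covariance(A):
    # C = A A^T: covariance between faces rather than between pixels.
    return A @ A.T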
Implement the cal_eigenvectors(C)
which takes in the covariance matrix and returns the eigenvectors sorted in descending order of their eigenvalues. It may take around 30 seconds to compute.
>>> eigenvectors = cal_eigenvectors(C)
>>> print(eigenvectors.shape)
(2000, 2000)
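One possible sketch using np.linalg.eigh, which exploits the symmetry of C; it keeps the eigenvectors as columns, which is what the \((A^T \cdot eigenvectors)^T\) formula below expects:

import numpy as np

def cal_eigenvectors(C):
    # eigh returns eigenvalues in ascending order for symmetric matrices,
    # so reverse the column order to sort by descending eigenvalue.
    eigenvalues, eigenvectors = np.linalg.eigh(C)
    return eigenvectors[:, ::-1]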
Implement the cal_eigenfaces(eigenvectors)
which transforms the eigenvectors into eigenfaces using the formula \(eigenfaces = (A^T \cdot eigenvectors)^T\).
>>> eigenfaces = cal_eigenfaces(eigenvectors)
>>> print(eigenfaces.shape)
(2000, 4096)
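A direct sketch of the formula, assuming A is the mean-subtracted matrix computed earlier; some implementations also normalize each eigenface to unit length:

def cal_eigenfaces(eigenvectors):
    # Map each 2000-dim eigenvector of C to a 4096-dim eigenface.
    # A is the mean-subtracted face matrix from above.
    return (A.T @ eigenvectors).T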
You can take a look at what the first few eigenfaces look like, e.g.:
>>> plt.imshow(eigenfaces[0].reshape(64, 64), cmap='gray')
>>> plt.show()
If your result looks similar to the above, you can save the eigenfaces using:
>>> pickle.dump(eigenfaces, open('eigenfaces.p', 'wb'))
After we get the eigenfaces, we can project any image onto the face space. We can then calculate the distance between the original image and its projection to identify whether it is a human face, and thus the face detector is built. The figure below demonstrates how we can use this approach to detect human faces: CCB's distance is much smaller than the kitten's, which means his image is more likely to be a human face.
TODO
Implement the proj2face_space(target_vector, eigenfaces)
which takes in the target vector we want to analyze along with the eigenfaces, and returns the target's projection onto the face space. It includes two steps: compute the weight of each eigenface (its dot product with the target vector), then reconstruct the projection as the weighted sum of the eigenfaces (see the sketch after the example below).
>>> face_space_vector = proj2face_space(target_vector, eigenfaces)
>>> print(face_space_vector.shape)
(4096,)
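A possible sketch, assuming the target vector is already mean-subtracted and the eigenfaces have been normalized to unit length (the details depend on your earlier choices):

def proj2face_space(target_vector, eigenfaces):
    # Step 1: weight of each eigenface = its dot product with the target.
    weights = eigenfaces @ target_vector   # shape: (num_eigenfaces,)
    # Step 2: projection = weighted sum of the eigenfaces.
    return eigenfaces.T @ weights          # shape: (4096,)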
Implement the dist2face_space(target_vector, face_space_vector)
which returns the Euclidean distance between the two vectors. You may use the np.linalg API.
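A one-line sketch using np.linalg.norm:

import numpy as np

def dist2face_space(target_vector, face_space_vector):
    # Euclidean (L2) distance between the image and its projection.
    return np.linalg.norm(target_vector - face_space_vector)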
If everything has gone well so far, you can now use the Raspberry Pi camera system to build a face detector. We provide a function called pca_face_detection(camera, k, THRESHOLD) in detector.py in the skeleton code; please make sure you put eigenfaces.p and average_face.p in the same directory as the skeleton file. The argument k represents the number of eigenfaces you want to use, and THRESHOLD is the distance boundary for deciding whether an image is a human face. The TA used k=20 and THRESHOLD=50000, which works well. You may need to fine-tune these parameters to get better performance.
with RPiCamera('tcp://IP_ADDRESS:65433') as camera:
    pca_face_detection(camera, 20, 50000)
Record a short video to show the performance of your face detector and submit the link to your video through this form.
“We need, as a nation, to show a degree of consistency of everybody following public-health recommendations on wearing masks or other face coverings”, said Dr. Fauci.
Now it's time to apply your machine learning skills to the big challenge we are facing today! In this part, we will implement a real-time mask detector using the camera system. This part of the homework is mainly described in this notebook, which you can edit directly in Google Colab.
We use a dataset from Kaggle; here is the original dataset in XML format. We have preprocessed the original dataset into a dictionary with the format {'image': np.array, 'label': 'good'/'bad'}.
TODO
Implement the preprocess(image)
which transforms the image data according to the following steps:
For the baseline model, we simply flatten the image data and utilize a Perceptron as our classifier.
TODO
Implement the flatten(images)
which converts the 2D images to 1D vectors. For the Perceptron classifier, you may use the sklearn API.
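A hypothetical baseline sketch using sklearn's Perceptron; the variable names (train_images, train_labels, test_images, test_labels) are illustrative, not from the skeleton:

from sklearn.linear_model import Perceptron

# Illustrative names: substitute your own preprocessed data splits.
X_train, X_test = flatten(train_images), flatten(test_images)
clf = Perceptron()
clf.fit(X_train, train_labels)
print('test accuracy:', clf.score(X_test, test_labels))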
In this section, you are required to construct a neural network using the Keras platform.
TODO
Build a neural network (shown below) according to the following descriptions:
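Purely as an illustrative sketch of the Keras Sequential API (the layers, sizes, and input shape below are placeholders, not the required architecture; follow the descriptions in the notebook):

from tensorflow.keras import layers, models

# Placeholder architecture with an assumed 64x64 grayscale input shape.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(1, activation='sigmoid'),  # 'good' vs. 'bad' mask label
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()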
To get better performance, we can use more complicated networks. There are various architectures available in Keras, such as VGG, InceptionV3, etc. Feel free to experiment with different model architectures and find your best model; a transfer-learning sketch follows below. Evaluate your model on the test images and upload your results to the leaderboard to compete with your classmates.
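For example, here is a hedged transfer-learning sketch with a pretrained backbone from keras.applications; the input size and preprocessing are assumptions that must match your own pipeline:

from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

# Assumes 128x128 RGB inputs; adapt to your preprocessing.
base = InceptionV3(weights='imagenet', include_top=False,
                   input_shape=(128, 128, 3))
base.trainable = False  # freeze the pretrained weights initially

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])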
After finding the most appropriate face detector and fine-tuning your mask classifier, it's time to apply them on the R2D2. First, save your model with its weights. Then use the mask_detection(camera) function in detector.py, and you will have a real-time mask detector.
Record a short video to show the performance of your mask detector and submit the link to your video through this form.
Here is what you need to submit for this homework: your completed notebook with all required functions (including preprocess, proj2face_space, and dist2face_space).

For the leaderboard, we will give the top 1 submission 5 extra points on this assignment, top 2-3: 3 points, and top 5-10: 1 point.