Introduction

Our project uses a Raspberry Pi 3 to build a webcam server that lets users remotely monitor their homes over the internet. When a human is detected, captured images are uploaded to Dropbox and an email notification is sent to warn the user of potential intruders. One use of this project is as a home security system for homeowners while they are away.

The hardware of the system includes a Raspberry Pi 3, a Pi Cam, and a pan-tilt kit with two servos that provides 180 degrees of up/down and left/right rotation. Our final software grew out of experiments with existing OpenCV libraries, combining motion, face, and human body detection in various ways to find the best trade-off between performance and accuracy for a practical security system.

Objective

The objective of this project is to build a human detection security camera system that monitors a home and captures images of potential intruders. The camera mechanism rotates either automatically or manually to track a face when a human is within view of the camera. A notification is sent to the homeowner immediately upon detection of a human to warn the user of the security status of their home.

System Design

Figure 1. System Design

The Pi server streams video and runs human face and body detection to distinguish humans from non-human objects. Upon detection, users are notified by email and the captured images are uploaded to their Dropbox accounts.

Hardware Implementation

System Schematic and Flow Chart Design

Figure 2. System Block Diagram

Figure 3. System Design Flowchart

The figures above show the design of our system. The servos of the pan-tilt kit are connected to two GPIO output pins of the Raspberry Pi so that they can be controlled with PWM signals. They are powered by an external 4.5V supply, either a lab bench power supply or a battery pack with a voltage regulator. The Pi Cam is connected directly to the camera slot on the Raspberry Pi’s PCB.
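
As a rough illustration of the PWM control described above, the sketch below converts a target servo angle into a PWM duty cycle. It assumes a 50 Hz control signal and a 1.0 ms to 2.0 ms pulse range mapped across 0 to 180 degrees. These are common hobby-servo values, not figures taken from this report, and a real servo usually needs calibration. The function name `angle_to_duty_cycle` is hypothetical.

```python
def angle_to_duty_cycle(angle_deg, freq_hz=50.0, min_pulse_ms=1.0, max_pulse_ms=2.0):
    """Return the PWM duty cycle (percent) for a servo angle in [0, 180].

    All default values are assumptions for a generic hobby servo.
    """
    if not 0.0 <= angle_deg <= 180.0:
        raise ValueError("angle must be between 0 and 180 degrees")
    period_ms = 1000.0 / freq_hz  # 20 ms at 50 Hz
    # Linearly interpolate the pulse width between the two end stops.
    pulse_ms = min_pulse_ms + (max_pulse_ms - min_pulse_ms) * angle_deg / 180.0
    return 100.0 * pulse_ms / period_ms


center_duty = angle_to_duty_cycle(90)  # duty cycle for the mid position
```

On the Pi, a value like this would typically be handed to a PWM call on the GPIO pin driving the servo, for example `ChangeDutyCycle` in the RPi.GPIO library.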

Raspberry Pi 3

Figure 4. Raspberry Pi 3 with 1.2GHz 64-bit quad-core ARMv8 CPU, 1 GB RAM

PiCamera Module

Figure 5. Raspberry Pi Camera Module V2 - 8 Megapixel, 1080p, 3280 (H) x 2464 (V) Active Pixel Count

Pan Tilt Servos

Figure 6. Pan Tilt Kit with two servos for 180 degrees of up/down and left/right rotation

Figure 7. Final Security Camera System

Software Implementation

Human Detection Algorithm with OpenCV

For human body detection, we used existing OpenCV libraries with different combinations of motion, face, and body detection.

Click here for instructions on installing OpenCV3.

Motion Detection

Our motion detector uses OpenCV’s APIs to compute the weighted mean of previous frames along with the current frame to detect changes in the background. By subtracting the computed average of the previous frames from the current frame, we can obtain the difference in frames and compare the frame delta with the specified threshold value. If the delta exceeds the threshold, the motion detector is triggered. The lower the threshold value, the more sensitive the motion detector becomes. The video stream on the screen will have a bounding box drawn around the object and text reading "Occupied" when motion is detected.
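
The running-average and threshold steps described above can be sketched in a few lines. To stay runnable without OpenCV, this version uses plain Python lists as tiny grayscale frames. With OpenCV, the same steps would normally map to `cv2.accumulateWeighted`, `cv2.absdiff`, and `cv2.threshold`. The names `update_background`, `motion_detected`, `delta_thresh`, and `min_area`, and all numeric values, are assumptions chosen for illustration.

```python
def update_background(background, frame, alpha=0.5):
    """Blend the current frame into the running weighted average, in place."""
    for r, row in enumerate(frame):
        for c, pixel in enumerate(row):
            background[r][c] = (1.0 - alpha) * background[r][c] + alpha * pixel


def motion_detected(background, frame, delta_thresh=5, min_area=2):
    """Return True if at least min_area pixels differ from the background
    average by more than delta_thresh."""
    changed = sum(
        1
        for bg_row, row in zip(background, frame)
        for bg, pixel in zip(bg_row, row)
        if abs(pixel - bg) > delta_thresh
    )
    return changed >= min_area


# Tiny 3x3 grayscale "frames" (0-255) standing in for camera images.
empty = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
intruder = [[10, 10, 10], [10, 200, 200], [10, 200, 200]]

background = [[float(p) for p in row] for row in empty]
quiet = motion_detected(background, empty)     # no change against the average
update_background(background, empty)
alarm = motion_detected(background, intruder)  # four pixels changed sharply
```

Lowering `delta_thresh` or `min_area` makes the detector more sensitive, which matches the threshold behaviour described above.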

Figure 8. Motion detected in dark environment

Human Face Detection

For face detection, we used a Haar cascade classifier because it detects human faces accurately at a low computational cost. The load is low enough to achieve a reasonable framerate when multithreading across the four processing cores of the Raspberry Pi. OpenCV provides a Haar cascade classifier already trained on human faces, so we did not need to perform any training ourselves.

Figure 9. Face detection with Haar cascade classifier

Click here to download OpenCV Haar cascade classifiers.

Human Body Detection

We worked with two algorithms for human body detection. The first is a Haar cascade classifier trained to detect the upper and lower body separately. Its computational load is low, but at a significant cost in accuracy. We therefore chose the second option and integrated a pedestrian detector that uses a histogram of oriented gradients (HOG) classifier. This algorithm works by detecting different body parts and piecing them together to form a human body. It is computationally expensive but significantly more accurate than the cascade classifier. (The HOG detector is included in OpenCV, so no additional files need to be downloaded.)

Figure 10. Human body detection with HOG detector

Dropbox Image Upload of Detected Humans

To enable Dropbox uploading, users need a registered Dropbox account, must save their "dropbox_key" and "dropbox_secret" in the conf.json file (attached in the Appendix section), and must consent to Dropbox access before running the program. Once a human is detected, the images are uploaded automatically to the user's Dropbox folder and also sent by email if that feature is enabled.
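
As a small illustration of the configuration step, the sketch below parses a conf.json-style document and checks that the two Dropbox fields named above are present. Only "dropbox_key" and "dropbox_secret" come from this report. The function name `load_conf`, the placeholder values, and the validation behaviour are assumptions, and the real conf.json may contain further settings.

```python
import json

REQUIRED_KEYS = ("dropbox_key", "dropbox_secret")


def load_conf(text):
    """Parse conf.json text and fail early if a Dropbox field is missing or empty."""
    conf = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if not conf.get(key)]
    if missing:
        raise ValueError("conf.json is missing: " + ", ".join(missing))
    return conf


# Placeholder credentials; a real file would hold the user's own app key and secret.
sample = '{"dropbox_key": "YOUR_APP_KEY", "dropbox_secret": "YOUR_APP_SECRET"}'
conf = load_conf(sample)
```

Checking the fields at startup avoids discovering a missing credential only when the first upload is attempted.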

Figure 11. Images stored in the local folder that is linked with the DropBox account

Email Notification of Detected Humans

If a human is detected, an email notification is immediately sent to notify the user of potential intruders. The email notification also includes a captured image of the detected human. The email notification configuration file is attached in the Appendix section.

Install mailutils for email notification

$ sudo apt-get install mailutils
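
The project sends its alerts through mailutils and ssmtp. As an alternative illustration of what such an alert contains, the sketch below builds an email with an attached image using only Python's standard `email` package. It constructs the message without sending it. The function name `build_alert_email`, the addresses, the subject line, and the fake image bytes are all placeholders, not values from the project.

```python
from email.message import EmailMessage


def build_alert_email(sender, recipient, image_bytes, filename="intruder.jpg"):
    """Build (but do not send) an alert email with a JPEG attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Security alert: human detected"
    msg.set_content("A human was detected by the camera. See the attached image.")
    # Adding an attachment turns the message into a multipart message.
    msg.add_attachment(image_bytes, maintype="image", subtype="jpeg", filename=filename)
    return msg


# Placeholder bytes; a real run would read the captured frame from disk.
alert = build_alert_email("pi@example.com", "owner@example.com", b"\xff\xd8fake-jpeg")
```

A message built this way could then be passed to `smtplib.SMTP.send_message`, or written to a sendmail-compatible program such as the one ssmtp installs.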

Figure 12. Email sent to the user of the detected human

Demos

Parallelized Face Detector

face_parallel.py

Our first program was adapted from a basic face detection program we found online. It uses the Haar cascade classifier built into the OpenCV library. The baseline version was single-threaded and used only one processing core of the Raspberry Pi, so we modified it to support multithreading. Images from the PiCam are distributed across all four cores by a round-robin scheduler. Our version achieves roughly 8 to 10 FPS by using all available processors on the Raspberry Pi and accurately detects faces under appropriate lighting conditions at distances of up to approximately 20 feet.
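
The round-robin distribution described above can be sketched as follows. This is not the code from face_parallel.py: the detector is replaced by a stub, the frames are plain strings, and the names `worker`, `process_round_robin`, and `detect_faces_stub` are hypothetical. Each worker has its own queue, and frames are dealt out to the queues in rotating order.

```python
import queue
import threading
from itertools import cycle

NUM_WORKERS = 4  # one per core on the Raspberry Pi 3


def detect_faces_stub(frame):
    """Placeholder for the Haar cascade call; returns the frame unchanged."""
    return frame


def worker(inbox, results, worker_id):
    """Process frames from this worker's queue until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        index, frame = item
        results[index] = (worker_id, detect_faces_stub(frame))


def process_round_robin(frames):
    """Hand frames to workers in rotating order and return results in frame order."""
    inboxes = [queue.Queue() for _ in range(NUM_WORKERS)]
    results = {}
    threads = [
        threading.Thread(target=worker, args=(inbox, results, i))
        for i, inbox in enumerate(inboxes)
    ]
    for t in threads:
        t.start()
    # Frame 0 goes to worker 0, frame 1 to worker 1, ..., frame 4 to worker 0 again.
    for (index, frame), inbox in zip(enumerate(frames), cycle(inboxes)):
        inbox.put((index, frame))
    for inbox in inboxes:
        inbox.put(None)  # tell each worker to stop
    for t in threads:
        t.join()
    return [results[i] for i in range(len(frames))]


assignments = process_round_robin(["f0", "f1", "f2", "f3", "f4", "f5"])
```

Whether threads like these actually run on separate cores depends on the detector call releasing Python's interpreter lock; if it does not, worker processes would be needed instead.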

Pedestrian Detector with Face Prescreening

face_pedestrian.py

We next found a pedestrian detection library that uses a histogram of oriented gradients (HOG) classifier. It identifies an entire person by detecting body parts such as arms, torso, and legs. The HOG classifier requires significantly more computing power than the Haar cascade classifier and achieved a very low framerate on the Raspberry Pi, roughly 0.5 to 1 frame per second. We combined the face and pedestrian detectors into a single program by using the fast face detection as a pre-filter. If a face was detected, we triggered the slower pedestrian detector to try to identify the person’s body. This gave a relatively smooth video stream until the pedestrian detector was activated, at which point the video would stall for several seconds while the compute-intensive stage ran. We also introduced frame skipping in this version: every third video frame is skipped to speed up the program. In addition, the pedestrian detector is skipped for a number of frames as a form of “cooldown” after it detects a body. This avoids reprocessing the same individual too quickly and helps to improve the overall framerate.
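
The control flow described above (fast pre-filter, slow second stage, frame skipping, and cooldown) can be sketched independently of OpenCV. In the sketch below, both detectors are passed in as plain callables and replaced by stubs in the demo. The class name `PrescreenPipeline`, the parameter names, and the cooldown length are assumptions. Only the "every third frame" skipping comes from the report.

```python
class PrescreenPipeline:
    """Run a cheap detector on most frames and an expensive one only when needed."""

    def __init__(self, fast_detector, slow_detector, skip_every=3, cooldown_frames=2):
        self.fast_detector = fast_detector
        self.slow_detector = slow_detector
        self.skip_every = skip_every            # drop every Nth frame entirely
        self.cooldown_frames = cooldown_frames  # frames to rest the slow stage
        self.frame_count = 0
        self.cooldown = 0

    def process(self, frame):
        """Return 'skipped', 'clear', 'face', or 'human' for one frame."""
        self.frame_count += 1
        if self.frame_count % self.skip_every == 0:
            return "skipped"
        # While cooling down, the slow detector is not allowed to run.
        slow_allowed = self.cooldown == 0
        if self.cooldown > 0:
            self.cooldown -= 1
        if not self.fast_detector(frame):
            return "clear"
        if slow_allowed and self.slow_detector(frame):
            self.cooldown = self.cooldown_frames
            return "human"
        return "face"


# Stub detectors: any non-empty frame has a "face", only "person" has a body.
pipeline = PrescreenPipeline(
    fast_detector=lambda frame: frame != "empty",
    slow_detector=lambda frame: frame == "person",
)
outcomes = [pipeline.process("person") for _ in range(6)]
```

In the demo, the expensive stage runs on only two of the six frames, which is the saving the pre-filter and cooldown are meant to provide.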

Pedestrian Detector with Motion Prescreening

motion_pedestrian.py

A trivially simple method to defeat the above mechanisms is to cover one’s face, bypassing the face detector completely. We developed a fix for this by replacing the pre-filtering stage with a motion detector. It triggers whenever enough motion is captured by the camera and then runs the pedestrian detector. The number of false positives limited the framerate, as even slight amounts of motion in the background would freeze the video while the HOG classifier attempted to locate a body. Ultimately, we determined that the low framerate limited the effectiveness of this method, and we reverted to the face detector.

Body Detector with Face Prescreening

face_body.py

We next tried to accelerate body detection by using a different classifier to improve the overall framerate. We located a Haar cascade classifier that detects a figure’s upper and lower body. The Haar classifier runs significantly faster than the HOG classifier, eliminating the video freezes from pedestrian detection. However, we found the detection accuracy of the Haar body classifiers to be disappointingly low. Even though we achieved a high framerate, the program would only detect faces reliably and missed the test subjects’ bodies. This was deemed unacceptable, so we reverted to the HOG pedestrian detector.

Pedestrian Detector with Face Prescreening, Camera Tracking/Control, and Live Notifications

face_ped_db_track.py

Our final demo was built from the face pedestrian detector and involved integrating various features to build a complete security system. The PiCam was mounted on a gimbal that enabled panning and tilting. We introduced two methods to change the camera orientation. The first lets a remote user manually control the camera direction using the keyboard. The user can also toggle to an auto-tracking mode that centers the camera on a detected face. We used two methods to stabilize the camera and minimize jitter. First, the camera turns only if the center of the face is outside a bounding box, which prevents the camera from rapidly changing direction after achieving a lock. Second, we took inspiration from the saturating counters used in branch prediction and added an x-axis and a y-axis counter. Each counter is incremented or decremented on each frame depending on whether the face is above/below or left/right of the camera’s center. The camera turns only when a counter hits its saturation value, indicating that the uncentered face was detected in roughly the same region for multiple consecutive frames. This eliminates errors from background noise, such as a person briefly walking past the camera. Picking up a face triggers the pedestrian detector. If that detector determines there is a human body onscreen, the Raspberry Pi automatically uploads that image frame to a Dropbox account via a secure connection. It also emails the picture to a specified address to warn the owner immediately.
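
The two jitter-reduction ideas described above, a dead zone around the frame center and a saturating counter per axis, can be sketched for a single axis as follows. The class name `AxisTracker`, the pixel values, and the saturation limit are hypothetical. Resetting the counter whenever the face is inside the dead zone is also an assumption, since the report does not say how the counter behaves in that case.

```python
class AxisTracker:
    """Saturating counter that decides when to move the camera along one axis."""

    def __init__(self, dead_zone=40, saturation=3):
        self.dead_zone = dead_zone    # half-width of the no-move box, in pixels
        self.saturation = saturation  # counter magnitude required before moving
        self.counter = 0

    def update(self, face_center, frame_center):
        """Return -1, 0, or +1: the direction to step the servo for this frame."""
        offset = face_center - frame_center
        if abs(offset) <= self.dead_zone:
            self.counter = 0          # assumed behaviour: a centered face resets the count
            return 0
        step = 1 if offset > 0 else -1
        # Clamp the counter to [-saturation, +saturation].
        self.counter = max(-self.saturation, min(self.saturation, self.counter + step))
        if abs(self.counter) == self.saturation:
            return step               # same side for long enough: move the camera
        return 0


pan = AxisTracker()
# Face stays right of center for four frames, re-centers, then a one-frame blip on the left.
moves = [pan.update(x, 160) for x in [260, 260, 260, 260, 170, 20]]
```

The camera only starts to move on the third consecutive off-center frame, and the single stray detection at the end is ignored. A second `AxisTracker` instance would handle the tilt axis in the same way.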

Conclusion

There are several avenues for expansion from the current state of our project. First, our final design can be defeated by an intruder with a covered face. To rectify this, the motion detection needs to be improved to reduce the rate of false positives. Potential solutions include quantifying on-screen movement and triggering only when a human-sized object is moving. Our programs also suffer from relatively low framerates and high latency because of the large number of tasks being run concurrently. Currently, all of our features are blocking: the video is paused while a picture is uploaded to Dropbox, and the camera cannot track or be turned while the pedestrian detector is running. One major improvement would be to make all these features non-blocking. The video would then continue to stream while the pedestrian detector runs in the background, and interfacing with Dropbox would happen entirely in the background with no obvious effect on performance. This would be a major improvement to the user experience.
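
One possible shape for the non-blocking behaviour proposed above is a background worker fed by a queue, so that the video loop only enqueues a job and carries on. The sketch below is an illustration of that idea, not part of the project. The function name `start_uploader` is hypothetical, and the upload itself is replaced by a stand-in that records file names.

```python
import queue
import threading


def start_uploader(upload_fn):
    """Start a background thread that uploads queued images one at a time."""
    jobs = queue.Queue()

    def run():
        while True:
            image = jobs.get()
            try:
                if image is None:  # sentinel: shut the worker down
                    return
                upload_fn(image)
            finally:
                jobs.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return jobs, thread


uploaded = []
jobs, thread = start_uploader(uploaded.append)  # stand-in for a Dropbox upload call
for name in ["frame_001.jpg", "frame_002.jpg"]:
    jobs.put(name)  # returns immediately; a video loop would keep running here
jobs.join()         # only for this demo: wait until the queue has drained
jobs.put(None)
thread.join()
```

The same pattern could wrap the pedestrian detector, with the caveat that a CPU-heavy stage would still compete with the video loop for the Pi's cores.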

Finally, one obvious method to circumvent the processing power bottleneck is to migrate to a more powerful platform, such as a BeagleBone. Our design is clearly constrained by the resources available on the Raspberry Pi. Another option is to connect multiple Raspberry Pis in a distributed computing model and offload computation to other devices. For example, running the face detection and camera control on one Pi and the pedestrian detection and Dropbox/email interface on another should dramatically improve performance. The last resort would be to reduce the number of features of our system, as a tiny embedded device is not designed to handle so much computation while maintaining a smooth user interface.

We were able to implement a fairly complex security system on a resource-constrained hardware platform with reasonable performance. On the image processing side, our design exploration led to several different detection schemes with varying levels of performance and reliability. On the notifications side, we were able to provide live alerts and intruder identification via a Dropbox connection and an automated email mechanism. We ultimately achieved the objectives outlined in our project proposal and updates.

Appendix

face_body.py
$ python face_body.py

face_parallel.py
$ python face_parallel.py

face_pedestrian.py
$ python face_pedestrian.py

motion_pedestrian.py
$ python motion_pedestrian.py

face_ped_db_track.py
$ python face_ped_db_track.py --conf conf.json

ssmtp.conf: Configuration file for setting up email notification

conf.json: Configuration file for running the Python script

Acknowledgements

We would like to thank Prof. Joseph Skovira and our TAs Brendon, Steven, and Jacob for providing us support and guidance throughout the semester. Without them, this project would not have been nearly as successful. We'd also like to acknowledge the greater Raspberry Pi community for providing a platform for low cost hardware projects.

https://courses.cit.cornell.edu/ece5990/ECE5990_Fall15_FinalProjects/Andre_Heil/ece5990_final_report/avh34_jr986.html/

https://realpython.com/blog/python/face-recognition-with-python

http://www.pyimagesearch.com/2015/11/09/pedestrian-detection-opencv/

http://www.pyimagesearch.com/2015/06/01/home-surveillance-and-motion-detection-with-the-raspberry-pi-python-and-opencv/

http://answers.opencv.org/question/42049/body-detection-using-haarcascade/