In this article, I will walk through an application that uses AI to monitor incoming packages at your doorstep.
Since most of us are staying home due to the current pandemic, we shop online for many items from different e-commerce websites and have the packages delivered to our doorsteps. How cool would it be to get notified instantly when packages are delivered to, or removed from, your doorstep? That is exactly what I will be covering in this article.
My motivation for this project was two-fold. First, the idea came up when alwaysAI asked if I would like to speak and do a demo at one of their webinars on accelerated AI at the edge. Second, I had won second place in the OpenCV Spatial AI competition, where I worked on a parcel classification and dimensioning problem, so I was already well versed in applying computer vision to cardboard packages. More details about that are here.
I used the alwaysAI low-code platform to build the computer vision application and Balena for deployment. Balena provides a software platform that helps developers build, deploy, and manage the code that runs on connected devices.
The architecture is pretty simple. The setup is a Raspberry Pi 4 connected to a webcam and an Intel NCS2. The Pi connects to an MQTT broker and to Balena cloud over Wi-Fi. The Raspberry Pi runs a Balena OS image, which includes a component called balena-engine. This engine orchestrates the application containers as microservices, making them independent and self-contained.
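To make the MQTT side concrete, here is a minimal sketch of the kind of event payload the Pi could publish when the detector sees a change. The topic name and field names are my own for illustration, not taken from the project; a client library such as paho-mqtt would do the actual publishing.

```python
import json
import time

# Hypothetical topic name -- the real project may use a different one.
TOPIC = "doorstep/packages"

def make_event(event, count, ts=None):
    """Build a JSON payload for a package event ('delivered' or 'removed')."""
    return json.dumps({
        "event": event,
        "package_count": count,
        "timestamp": ts if ts is not None else time.time(),
    })

# With an MQTT client (e.g. paho-mqtt) the application would then call:
#   client.publish(TOPIC, make_event("delivered", 2))
print(make_event("delivered", 2, ts=0))
```

Keeping the payload as plain JSON means any subscriber, such as the mobile notification service, can consume it without extra dependencies.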
The overall flow for building this type of application is shown below.
For this project, I trained a custom object detection model to detect packages. The first step was creating my dataset. I gathered many images of packages as they would appear at my doorstep under different conditions: lighting, distance, time of day, angle, and combinations such as stacked packages. The more images you collect, the better your model's accuracy will be in identifying packages. Some examples are shown below. I trained the model using around 200 annotated images.
Once we have the images, we need to annotate them so that our model can use these annotations to learn to identify packages when the camera sees them.
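Annotations for object detection are typically stored per image as bounding boxes with class labels, for example in Pascal VOC-style XML. Assuming that format (I am not asserting it is exactly what the alwaysAI tooling emits), a small parser sketch using only the standard library looks like this:

```python
import xml.etree.ElementTree as ET

def parse_voc_boxes(xml_text):
    """Extract (label, xmin, ymin, xmax, ymax) tuples from a Pascal VOC annotation."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        b = obj.find("bndbox")
        boxes.append((
            label,
            int(b.findtext("xmin")), int(b.findtext("ymin")),
            int(b.findtext("xmax")), int(b.findtext("ymax")),
        ))
    return boxes

# A toy annotation with one labeled package box:
sample = """<annotation>
  <object><name>package</name>
    <bndbox><xmin>34</xmin><ymin>50</ymin><xmax>210</xmax><ymax>180</ymax></bndbox>
  </object>
</annotation>"""
print(parse_voc_boxes(sample))  # [('package', 34, 50, 210, 180)]
```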
After dataset creation, model training is the most time-consuming task. alwaysAI provides a model training toolkit for running the training on your laptop without needing the cloud. If you have a laptop or desktop with an Nvidia GPU that supports CUDA, you are all set to run the training job locally, with no cloud resources to provision. I trained for 100 epochs, which took around 1.5 hours on my laptop. I used the mobilenet-ssd architecture, which suits this use case well because of its speed and decent accuracy. The model training toolkit uses TensorFlow under the hood.
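Once the model is trained, deciding when to notify comes down to watching how the per-frame package count changes. A naive comparison fires false alerts whenever the detector misses a package for a frame, so some debouncing helps. The sketch below is my own illustration of that idea, not code from the project; class and parameter names are hypothetical.

```python
def detect_change(prev_count, curr_count):
    """Map a change in detected package count to a notification event, if any."""
    if curr_count > prev_count:
        return "delivered"
    if curr_count < prev_count:
        return "removed"
    return None

class PackageMonitor:
    """Debounce per-frame detection counts: a new count must hold steady for
    a few consecutive frames before an event fires, so a momentary missed
    detection does not trigger a false 'removed' notification."""

    def __init__(self, stable_frames=5):
        self.stable_frames = stable_frames
        self.confirmed = 0   # last count we notified about
        self.candidate = 0   # count we are waiting to confirm
        self.streak = 0      # consecutive frames at the candidate count

    def update(self, count):
        """Feed one frame's package count; return an event string or None."""
        if count == self.candidate:
            self.streak += 1
        else:
            self.candidate, self.streak = count, 1
        if self.streak >= self.stable_frames and self.candidate != self.confirmed:
            event = detect_change(self.confirmed, self.candidate)
            self.confirmed = self.candidate
            return event
        return None

monitor = PackageMonitor(stable_frames=3)
# One frame wrongly reports 0 packages mid-stream; no false event fires.
events = [monitor.update(c) for c in [0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1]]
print([e for e in events if e])  # ['delivered']
```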
I built this application so that anyone can run it, no matter what hardware they have. Broadly speaking, you can run the application in six different ways, depending on your hardware.
For instance, one setup is a Raspberry Pi 4 with a USB HD camera and an NCS2; another is a Raspberry Pi 4 with a USB EyeCloud camera; a third is just a Raspberry Pi 4 with a USB HD camera.
The code is on GitHub with detailed instructions if you want to try out this project. I also have a detailed video showing how I push the application to the Raspberry Pi 4 using the Balena CLI and how the mobile application notifications work.
Feel free to fork the repository or send a pull request if you have more ideas for this application. The application sends notifications to a mobile app, but you could integrate it with another application that sounds a buzzer, turns on a light, or takes a picture when a package is dropped at your doorstep. The possibilities are endless. If you build something with this project, please let me know; I would be very happy to hear about it.
Happy hacking!
Credits to Lila Mullany for helping me build this application.