[Demo Video](https://user-images.githubusercontent.com/47744559/235339233-676e6d52-94e4-428f-b068-b3072bb63795.mp4)
This project is a thesis submission for the degree of Bachelor of Science from Cairo University - Faculty of Engineering. The goal of this project was to develop a food recognition and detection system for visually impaired individuals.
The system is designed to assist visually impaired individuals in identifying food (including oriental food) in their surroundings through a camera embedded in smart glasses and machine learning algorithms. The system can recognize and classify a wide range of plates (54 dishes, covering both oriental and international cuisine).
- **Object recognition and classification:** The system uses deep learning algorithms to recognize and classify objects in real time.
- **Audio feedback:** The system provides audio feedback to the user, identifying the food plate(s) ahead and their respective locations.
- **User-friendly interface:** The simulation interface is built with Kivy running on a Raspberry Pi 4B (Raspbian). This setup simply simulates smart glasses with an embedded camera.
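The audio feedback step above can be sketched in a few lines: map each detected bounding box to a coarse position (left, center, right) based on which third of the frame its horizontal center falls in, then compose the sentence handed to a text-to-speech engine. The function names and the three-way split are illustrative assumptions, not taken from the project code.

```python
def localize(box, frame_width):
    """Map a bounding box to a coarse position: "left", "center", or "right".

    box is (x_min, y_min, x_max, y_max) in pixels; the horizontal center
    of the box decides which third of the frame it falls in.
    """
    x_center = (box[0] + box[2]) / 2
    if x_center < frame_width / 3:
        return "left"
    if x_center < 2 * frame_width / 3:
        return "center"
    return "right"


def announce(detections, frame_width):
    """Compose the announcement sentence for the text-to-speech engine.

    detections is a list of (label, box) pairs produced by the pipeline.
    """
    if not detections:
        return "No food plates detected."
    parts = [f"{label} on the {localize(box, frame_width)}"
             for label, box in detections]
    return "Detected " + ", ".join(parts) + "."
```

For example, `announce([("koshary", (10, 10, 100, 100))], 640)` yields "Detected koshary on the left." In the real system this string would be passed to whatever TTS backend runs on the Raspberry Pi.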
This project was developed by Mostafa Sherif, Youssef Sayed, Amir Salah, and myself under the supervision of Prof. Ibrahim Sobh and Prof. Ahmed Darwish. We would like to thank Valeo Egypt for selecting our project for the 2021 Valeo Mentorship Program.
If you have any questions or feedback, please contact karim-ibrahim or check the Thesis Book and Presentation section at the end of this file.
```
\_Datasets
   \_Custom Dataset
   \_food-101
\_master [REPO]
   \_classification weights
   \_dataset manipulation
   \_detection weights
   \_images
   \_unsuccessful trials
   \_visuals
   classification_inference.py
   classification_training.py
   classification_utils.py
   detection_utils.py
   pipeline.py
```
The dataset used was based on the Food-101 dataset, a balanced dataset with a total of 101,000 images (1,000 images per class across 101 classes). Dataset processing involved excluding unpopular dishes from Food-101 and collecting/adding oriental dishes alongside the remaining classes. A total of 54 classes was included in the final dataset:
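The class-exclusion step described above can be sketched as a small script that copies only the retained Food-101 class folders into the custom dataset; collected oriental-dish images would then be added as extra class folders. The `KEPT_CLASSES` names, paths, and function name here are hypothetical placeholders, not the project's actual selection.

```python
import shutil
from pathlib import Path

# Hypothetical subset: the real project kept 54 of the 101 Food-101 classes.
KEPT_CLASSES = {"pizza", "hamburger", "sushi"}  # ...plus the rest of the 54


def build_custom_dataset(food101_root, out_root, kept=KEPT_CLASSES):
    """Copy only the retained class folders from Food-101 into out_root.

    Food-101 ships as one directory per class under images/.  Returns the
    sorted list of class names that made it into the custom dataset.
    """
    src = Path(food101_root) / "images"
    dst = Path(out_root)
    dst.mkdir(parents=True, exist_ok=True)
    for class_dir in src.iterdir():
        if class_dir.is_dir() and class_dir.name in kept:
            shutil.copytree(class_dir, dst / class_dir.name, dirs_exist_ok=True)
    return sorted(p.name for p in dst.iterdir())
```

With 54 classes at roughly 1,000 images each, this yields the ~54k-image dataset the classifier is trained on.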
A FasterRCNN object detector is used to identify the bounding boxes of the plates of food (if any) and pass them to a MobileNetV2 classifier trained on the aforementioned custom dataset. The output is the location of each bounding box and the predicted label for that box.
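The two-stage flow described above (detect plates, crop each box, classify the crop) can be sketched as follows. The detector and classifier are passed in as callables standing in for the FasterRCNN and MobileNetV2 models; the function name, the score threshold, and the NumPy-style frame slicing are assumptions for illustration, not the project's `pipeline.py`.

```python
def run_pipeline(frame, detect, classify, score_threshold=0.5):
    """Two-stage inference: detect food plates, then classify each crop.

    detect(frame)  -> list of ((x_min, y_min, x_max, y_max), score) pairs;
                      stands in for the FasterRCNN detector.
    classify(crop) -> dish label; stands in for the MobileNetV2 classifier.
    Returns a list of (label, box) pairs for the audio-feedback stage.
    """
    results = []
    for box, score in detect(frame):
        if score < score_threshold:
            continue  # drop low-confidence detections
        x0, y0, x1, y1 = box
        crop = frame[y0:y1, x0:x1]  # cut the plate region out of the frame
        results.append((classify(crop), box))
    return results
```

Keeping the models behind plain callables like this also makes the pipeline easy to unit-test with stubs before wiring in the real (and slow) networks.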
The classifier was fine-tuned on the 54k-image dataset for 70 epochs, showing the following: