This is my first attempt at creating a basic Flask application. The application detects objects in an image along with their features, such as color. The inspiration behind it was a proof of concept (POC) to enable image-based search in an e-commerce website project I worked on earlier.
After cloning the repository, follow these steps to set up the project:
- Set up your development environment:
  - Download and install JetBrains PyCharm IDE or your preferred IDE.
  - The following instructions focus on PyCharm, but most IDEs provide similar features.
- Open the project:
  - In PyCharm, navigate to File -> Open and select the cloned repository folder.
- Set up a local virtual environment:
  - Go to Settings > Project: <project name> > Python Interpreter > Add Interpreter.
  - Choose Add Local Interpreter > Virtualenv Environment.
  - Select Environment -> New.
  - Set Base Interpreter to your installed Python version (e.g., Python 3.x).
  - Click OK.
- Install dependencies: Install the required dependencies by running the following command in the IDE terminal:
  pip install -r requirements.txt
- Run the application: Start the application by running the following command from the root folder:
  python -m cx_img.app
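For readers who are not using PyCharm, the virtual environment and dependency steps above can also be scripted with Python's standard `venv` module. The following is a minimal sketch, not part of the repository: the `.venv` directory name and the helper names (`env_python`, `create_env`, `install_requirements`) are assumptions made for illustration.

```python
import subprocess
import sys
import venv
from pathlib import Path


def env_python(env_dir: Path) -> Path:
    """Return the interpreter path inside the virtual environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment and return its interpreter path."""
    venv.create(env_dir, with_pip=with_pip)
    return env_python(env_dir)


def install_requirements(env_dir: Path, requirements: Path) -> None:
    """Run `pip install -r <requirements>` with the environment's interpreter."""
    subprocess.run(
        [str(env_python(env_dir)), "-m", "pip", "install", "-r", str(requirements)],
        check=True,  # raise if pip reports a failure
    )


if __name__ == "__main__":
    env = Path(".venv")  # assumed location, not taken from the repository
    create_env(env)
    install_requirements(env, Path("requirements.txt"))
```

Running the script from the repository root should leave an environment equivalent to the one PyCharm creates; the IDE can then be pointed at the interpreter inside `.venv`.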
The application does not include a user interface (UI) but provides two APIs that can be accessed using Postman or any other API client. These APIs are:
- Training API
  - This API generates a training data file from the color images in the data source.
  - It uses color classification to extract and store RGB values for each image in the data source.
  - Special thanks to color_recognition by ahmetozlu for inspiring this approach.
- Image Classification API
  - This API processes input images and returns data about the detected objects, including their confidence scores and features such as color.
  - The object detection model is based on ResNet40.
  - For color detection, the area of the main object in the image is first identified using the GrabCut algorithm. A KNN algorithm is then applied to this region, referencing the training data to determine the color that corresponds to the object's RGB values.
- The Training API must be executed once during setup to generate the necessary training data for accurate image classification results.
- To access the Swagger UI for API documentation, navigate to http://localhost:5001/ui (or the appropriate port if different).
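The relationship between the two APIs can be sketched in plain Python: the training step stores one RGB feature per labeled image, and the classification step runs KNN against those rows. This is an illustrative sketch under stated assumptions, not the project's actual code. It assumes the stored feature is the per-channel histogram peak, represents images as plain lists of `(r, g, b)` tuples, and leaves out the ResNet detection and GrabCut segmentation steps. The function names and the `training.data` file name are hypothetical.

```python
import csv
import math
from collections import Counter
from pathlib import Path

Pixel = tuple[int, int, int]


def dominant_rgb(pixels: list[Pixel]) -> Pixel:
    """Return the most frequent value of each channel (the histogram peak)."""
    if not pixels:
        raise ValueError("no pixels")
    channels = zip(*pixels)  # (all R values), (all G values), (all B values)
    return tuple(Counter(ch).most_common(1)[0][0] for ch in channels)


def write_training_file(samples: dict[str, list[list[Pixel]]], path: Path) -> None:
    """Training step: write one CSV row `r,g,b,label` per labeled source image."""
    with path.open("w", newline="") as fh:
        writer = csv.writer(fh)
        for label, images in samples.items():
            for pixels in images:
                writer.writerow([*dominant_rgb(pixels), label])


def load_training_file(path: Path) -> list[tuple[Pixel, str]]:
    """Read the rows written by `write_training_file`."""
    with path.open(newline="") as fh:
        return [((int(r), int(g), int(b)), label) for r, g, b, label in csv.reader(fh)]


def classify_color(rgb: Pixel, training: list[tuple[Pixel, str]], k: int = 3) -> str:
    """KNN: majority label among the k training rows closest to `rgb`."""
    nearest = sorted(training, key=lambda row: math.dist(rgb, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Tiny synthetic "images": each is a list of identical pixels.
    samples = {
        "red": [[(250, 10, 10)] * 4, [(200, 30, 20)] * 4],
        "blue": [[(10, 10, 240)] * 4, [(30, 40, 200)] * 4],
    }
    path = Path("training.data")  # hypothetical file name
    write_training_file(samples, path)
    print(classify_color((220, 25, 15), load_training_file(path), k=3))
```

In this sketch, classification cannot work until the training file exists, which mirrors the requirement to run the Training API once during setup. In the real pipeline, the pixels passed to the color step would come from the foreground region that GrabCut isolates, not from the whole image.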
Here are some screenshots showcasing working deployments of the application.
- Image Classification API and sample response:
Potential enhancements for future development include:
- Improving color detection accuracy by expanding the dataset with more diverse training images.
- Adding additional features such as:
  - Object type detection.
  - Features specific to object types (e.g., collar type, pattern, and color for shirts; brand and color for mobile phones).
- The current use of ResNet40, a pretrained model, may limit object detection accuracy. To achieve higher accuracy, a custom model can be created and trained on a dataset relevant to your use case.
- The current approach of using KNN for color detection does not perform well for images with multiple colors. A more advanced method is required to handle such scenarios effectively.
This project is licensed under the MIT License. See the LICENSE file for more details.