Welcome to HandDrawnDigitAI, an intuitive app for recognizing hand-drawn digits! It pairs a user-friendly CustomTkinter GUI with a pre-trained Convolutional Neural Network (CNN), bringing machine learning right to your fingertips. Draw a digit on the canvas and the model recognizes it instantly.
HandDrawnDigitAI/
│
├── app/
│   ├── __init__.py           # Makes `app` a package
│   ├── gui.py                # Contains the GUI logic
│   ├── digit_recognizer.py   # Logic for recognizing digits
│   ├── themes.py             # Themes dictionary
│   └── utils.py              # Helper utility functions
│
├── models/
│   └── cnn_model.h5          # Pre-trained model
│
├── logs/                     # Stores temporary image files
│
├── notebooks/
│   └── cnn_mnist.ipynb       # Notebook for training the model
│
├── run.py                    # Main entry point to the app
├── requirements.txt          # Dependencies for the project
└── README.md                 # Documentation
- Interactive Drawing Canvas: Use the mouse to draw digits directly on the canvas.
- AI-Powered Digit Recognition: Recognize digits (0-9) using a CNN pre-trained on the MNIST dataset.
- Dynamic Themes: Choose from multiple themes: Oceanic | Dark Mode | Vibrant | Corporate | Pink Black
- Modular Codebase: Clean and organized project structure for easy navigation.
- Draw a digit on the canvas.
- Press the "Predict" button, and the model will recognize the digit.
- If the prediction isn't clear, adjust and re-draw on the canvas!
- Use the "Clear Canvas" button to start over.
Make sure you have the following dependencies installed before running the app:
tensorflow==2.10.0
customtkinter==5.1.2
pillow==9.2.0
numpy==1.23.0
opencv-python==4.6.0
pip install -r requirements.txt
- app/gui.py: Handles the graphical user interface logic, including button interactions and canvas drawing.
- app/digit_recognizer.py: Preprocesses the canvas image and uses the CNN model for predictions.
- app/themes.py: Stores theme configurations for the UI.
- app/utils.py: Helper functions that support digit recognition and the GUI.
- models/cnn_model.h5: Pre-trained CNN model saved in Keras HDF5 format.
- logs/: Temporary directory that stores canvas snapshots during runtime.
- notebooks/cnn_mnist.ipynb: Jupyter Notebook used to train the CNN model on the MNIST dataset.
- run.py: The main entry point that launches the application.
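The preprocessing and prediction steps attributed to digit_recognizer.py might look roughly like the sketch below. It is not the project's actual code: the function names `preprocess` and `predicted_digit` are hypothetical, it assumes a white canvas with dark strokes (MNIST uses bright digits on a dark background, hence the inversion), and it uses plain Python lists instead of the NumPy, OpenCV, and Pillow dependencies the app lists.

```python
def preprocess(pixels):
    """Invert 0-255 grayscale values and scale them to floats in [0, 1].

    Assumption: the canvas is white (255) with dark strokes (0), so inverting
    makes the strokes bright on a dark background, as in MNIST.
    """
    return [[(255 - value) / 255.0 for value in row] for row in pixels]


def predicted_digit(probabilities):
    """Return the index of the largest probability, i.e. the predicted digit."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# Hypothetical 2x3 patch of a canvas: a vertical black stroke on white.
patch = [
    [255, 0, 255],
    [255, 0, 255],
]
scaled = preprocess(patch)
print(scaled)  # [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

# Hypothetical model output: one probability per digit class 0-9.
probs = [0.01, 0.02, 0.01, 0.01, 0.02, 0.01, 0.01, 0.88, 0.02, 0.01]
digit = predicted_digit(probs)
print(digit)  # 7
```

In the real app the scaled image would also be resized to the model's input shape before prediction; that step is omitted here.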
Follow these steps to get started:
- Clone the repository:
git clone https://github.com/asRot0/HandDrawnDigitAI.git
cd HandDrawnDigitAI
- Install the required dependencies:
pip install -r requirements.txt
- Run the app:
python run.py
- Draw digits and enjoy the magic!
Theme screenshots: Dark Mode | Oceanic | Pink Black
Here are some important mathematical formulas used in this project.
1. Convolution Operation
Convolutional layers apply filters to extract features from input images.
$$O(i, j) = \sum_{m} \sum_{n} I(i+m, j+n) \cdot K(m, n)$$
- $O(i, j)$: Output feature map at position $(i, j)$.
- $I(i+m, j+n)$: Input image pixels covered by the filter.
- $K(m, n)$: Kernel (filter) values applied to the input.
Why it matters:
- Represents how a filter (kernel) slides over an image to extract meaningful features.
- Core operation in Convolutional Neural Networks (CNNs).
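To make the sliding-window sum concrete, here is a minimal pure-Python sketch of the operation with no padding and a stride of 1. The function name `conv2d_valid`, the 4x4 image, and the 2x2 kernel are illustrative assumptions, not values taken from the project's model.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the output map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # O(i, j) = sum over m, n of I(i+m, j+n) * K(m, n)
            total = sum(
                image[i + m][j + n] * kernel[m][n]
                for m in range(kh)
                for n in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is non-zero only where the kernel straddles the edge, which is the sense in which a filter "extracts" a feature.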
2. ReLU Activation Function (Hidden Layers)
ReLU introduces non-linearity to the model by passing positive values through unchanged and mapping negative values to zero.
$$f(x) = \max(0, x)$$
- $x$: Input value to the activation function.
Why it matters:
- Helps mitigate vanishing gradients.
- Improves the CNN's ability to learn complex patterns.
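As a small illustration, ReLU can be written in one line of Python. The input values below are made up for the example.

```python
def relu(x):
    """Return x for positive inputs and 0.0 otherwise."""
    return max(0.0, x)


# Hypothetical pre-activation values from a hidden layer.
pre_activations = [-2.0, -0.5, 0.0, 1.5, 3.0]
activations = [relu(value) for value in pre_activations]
print(activations)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```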
3. Max Pooling
Max pooling reduces the spatial size of feature maps while preserving key information.
$$P(i, j) = \max_{(m, n) \in R} F(i+m, j+n)$$
- $P(i, j)$: Pooled output value at position $(i, j)$.
- $F(i+m, j+n)$: Input feature map values in the pooling region.
- $R$: Pooling region (e.g., 2×2 or 3×3 window).
Why it matters:
- Reduces computation and helps limit overfitting.
- Keeps dominant features while discarding unnecessary details.
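The following pure-Python sketch shows max pooling on a small feature map. It assumes non-overlapping windows (stride equal to the window size), which is a common default but is not confirmed by this README; the function name `max_pool2d` and the input values are illustrative.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window (stride = size)."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Take the largest value inside the window whose top-left corner
            # is at (i * size, j * size).
            row.append(max(
                feature_map[i * size + m][j * size + n]
                for m in range(size)
                for n in range(size)
            ))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 4, 6],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 5], [3, 7]]
```

A 4x4 map shrinks to 2x2, so later layers process a quarter of the values while the strongest response in each region survives.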
4. Softmax Function
The softmax function converts the model's raw outputs into a probability distribution.
$$\text{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}$$
- $z_i$: Raw score (logit) for class $i$.
- $e^{z_i}$: Exponential of the logit, ensuring positive values.
- $\sum_{j=1}^{n} e^{z_j}$: Sum of exponentials across all $n$ classes (normalization factor).
Why it matters:
- Softmax assigns probabilities to digit classes (0-9).
- Ensures outputs sum to 1, making them interpretable as probabilities.
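A minimal softmax in pure Python is sketched below. Subtracting the largest logit before exponentiating is a standard numerical-stability trick that leaves the result unchanged; the ten logits are invented for the example and do not come from the project's model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # shifting by the max avoids overflow in exp()
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for the ten digit classes 0-9.
logits = [0.1, 0.2, 0.0, 3.0, 0.3, 0.1, 0.0, 1.2, 0.4, 0.2]
probs = softmax(logits)
predicted = probs.index(max(probs))
print(predicted)  # 3
print(round(sum(probs), 6))  # 1.0
```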
5. Cross-Entropy Loss
Cross-entropy measures the difference between predicted probabilities and actual labels.
$$\mathcal{L} = -\sum_{i=1}^{n} y_i \log(\hat{y_i})$$
- $\mathcal{L}$: Cross-entropy loss value.
- $y_i$: Actual label (ground truth) for class $i$ (1 for the correct class, 0 otherwise).
- $\hat{y_i}$: Predicted probability for class $i$ from the softmax function.
Why it matters:
- Penalizes incorrect predictions by increasing the loss.
- Common loss function for multi-class classification.
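The sketch below computes the loss for a single example with a one-hot label, comparing a confident correct prediction against a poor one. The function name, the three-class setup, and the probabilities are illustrative assumptions; the small `eps` clamp is a common guard against `log(0)` and is not something the README specifies.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    # Clamp probabilities away from zero so log() is always defined.
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, y_pred))


y_true = [0, 0, 1]                 # the correct class is index 2
good_pred = [0.05, 0.05, 0.90]     # hypothetical confident, correct prediction
bad_pred = [0.70, 0.20, 0.10]      # hypothetical mostly wrong prediction

loss_good = cross_entropy(y_true, good_pred)
loss_bad = cross_entropy(y_true, bad_pred)
print(round(loss_good, 3))  # 0.105
print(round(loss_bad, 3))   # 2.303
```

Because only the true class has $y_i = 1$, the loss reduces to the negative log of the probability assigned to that class, so a lower probability produces a larger loss.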
6. Adam Optimizer
Adam adapts the step size for each parameter using running estimates of the gradient's first and second moments.
$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{v_t} + \epsilon} \, m_t$$
- $\theta_t$: Model parameters at step $t$.
- $m_t$: First moment estimate (running mean of gradients).
- $v_t$: Second moment estimate (running mean of squared gradients, i.e. the uncentered variance).
- $\eta$: Learning rate (step size).
- $\epsilon$: Small constant to avoid division by zero.
Why it matters:
- Used as the optimizer in this project (`optimizer='adam'`).
- Combines momentum and adaptive learning rates for faster convergence.
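For intuition, here is a single-parameter Adam update in pure Python, applied to a toy quadratic. It is a sketch, not the Keras implementation: it includes the bias-correction step from the original Adam algorithm (which the simplified update rule omits), uses the commonly cited default decay rates `beta1=0.9` and `beta2=0.999`, and the function names, toy objective, and learning rate of 0.05 are assumptions chosen for the demonstration.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a single scalar parameter."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def grad_fn(theta):
    """Gradient of the toy objective f(theta) = (theta - 3)^2."""
    return 2.0 * (theta - 3.0)


theta, m, v = 0.0, 0.0, 0.0

# The very first step moves by roughly the learning rate, whatever the gradient scale.
theta, m, v = adam_step(theta, grad_fn(theta), m, v, t=1, lr=0.05)
first_step = theta

for t in range(2, 2001):
    theta, m, v = adam_step(theta, grad_fn(theta), m, v, t, lr=0.05)

print(first_step)  # about 0.05
print(theta)       # close to the minimum at 3
```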
These mathematical concepts power the CNN architecture used in this project. They help in feature extraction, classification, optimization, and learning to recognize hand-drawn digits efficiently and accurately.
If you'd like to retrain the model:
- Open the Jupyter Notebook in the notebooks/ directory.
- Train the CNN on the MNIST dataset.
- Save the updated model as cnn_model.h5 in the models/ directory.
- Add support for multiple languages.
- Integrate visualization of model predictions (e.g., activation maps).
- Add real-time digit recognition with camera input.
- Asif Ahmed (GitHub): Creator and Maintainer
Want to contribute? Feel free to fork the repository and open a pull request!
If you find this project helpful, consider giving it a star on GitHub and sharing it with others!