This repository contains implementations of Deep Neural Networks (DNNs) for classifying handwritten digits from the MNIST dataset. The project explores three distinct approaches: a TensorFlow-based DNN with sigmoid activation, a custom NumPy-based DNN with sigmoid activation, and a custom NumPy-based DNN with ReLU activation. Each implementation aims to provide insights into neural network design, training, and performance on the MNIST dataset.
The MNIST dataset consists of 60,000 training images and 10,000 test images of handwritten digits (0-9), each represented as a 28x28 grid of grayscale pixels. This project demonstrates how DNNs can be trained to recognize these digits, comparing the effectiveness of different frameworks and activation functions.
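For a quick look at the data, a single digit can be loaded and displayed with the Keras dataset loader and Matplotlib (both listed in the requirements below); this snippet is for orientation only and is not part of the notebooks:

```python
# Quick look at one MNIST digit (28x28 grayscale) and its label.
import tensorflow as tf
import matplotlib.pyplot as plt

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
plt.imshow(x_train[0], cmap="gray")
plt.title(f"Label: {y_train[0]}")
plt.show()
```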
- TensorFlow Implementation: A DNN built using TensorFlow/Keras with sigmoid activations, achieving high accuracy with a modern deep learning framework.
- NumPy Sigmoid Implementation: A from-scratch DNN using NumPy with sigmoid activations, showcasing fundamental neural network concepts.
- NumPy ReLU Implementation: A from-scratch DNN using NumPy with ReLU activations, exploring an alternative activation function for improved performance.
- Preprocessing: Conversion of raw MNIST `.idx` files to `.csv` format for the NumPy implementations (a conversion sketch follows this list).
- Evaluation: Accuracy metrics and a confusion matrix (TensorFlow version) to assess model performance.
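The `.idx`-to-`.csv` conversion can be done with a short script along these lines; the file names and CSV layout used here (label first, then 784 pixel values per row) are assumptions and may differ from the repository's preprocessing code:

```python
# Sketch of converting raw MNIST .idx files to .csv.
# File names and row layout (label, then 784 pixels) are assumptions.
import struct
import numpy as np

def read_idx_images(path):
    with open(path, "rb") as f:
        _, n, rows, cols = struct.unpack(">IIII", f.read(16))  # big-endian header
        return np.frombuffer(f.read(), dtype=np.uint8).reshape(n, rows * cols)

def read_idx_labels(path):
    with open(path, "rb") as f:
        _, n = struct.unpack(">II", f.read(8))
        return np.frombuffer(f.read(), dtype=np.uint8)

images = read_idx_images("train-images.idx3-ubyte")
labels = read_idx_labels("train-labels.idx1-ubyte")
np.savetxt("mnist_train.csv", np.column_stack([labels, images]),
           fmt="%d", delimiter=",")
```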
- File: `tensorflow_dnn.ipynb` (TensorFlow/Keras implementation, sigmoid activations)
- Architecture:
  - Input Layer: Flattened 28x28 images (784 units)
  - Hidden Layer 1: 128 units, sigmoid activation
  - Hidden Layer 2: 64 units, sigmoid activation
  - Output Layer: 10 units, softmax activation
- Training:
  - Optimizer: Adam
  - Loss: Sparse Categorical Crossentropy
  - Epochs: 10
  - Dataset: Normalized MNIST (0-1 range)
- Results: Achieved 97.75% accuracy on the test set after 10 epochs.
- Extras: Includes confusion matrix generation for detailed performance analysis.
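The setup above corresponds to a small Keras `Sequential` model. Below is a minimal sketch using the listed hyperparameters; it is an approximation, not necessarily the notebook's exact code:

```python
# Minimal sketch of the TensorFlow/Keras setup described above;
# the notebook's exact code may differ in details.
import tensorflow as tf

# Load and normalize MNIST to the 0-1 range
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),                          # 28x28 -> 784
    tf.keras.layers.Dense(128, activation="sigmoid"),   # hidden layer 1
    tf.keras.layers.Dense(64, activation="sigmoid"),    # hidden layer 2
    tf.keras.layers.Dense(10, activation="softmax"),    # output layer
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=10)
model.evaluate(x_test, y_test)
```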
- File: `numpy_sigmoid_dnn.ipynb` (from-scratch NumPy implementation, sigmoid activations)
- Architecture:
  - Input Layer: 784 units
  - Hidden Layer 1: 128 units, sigmoid activation
  - Hidden Layer 2: 64 units, sigmoid activation
  - Output Layer: 10 units, softmax activation
- Training:
  - Learning Rate: 0.001
  - Epochs: 30
  - Dataset: Normalized MNIST (0.01-1 range)
- Results: Achieved 83.54% accuracy on the test set after 30 epochs.
- Extras: Custom forward and backward propagation with sigmoid and softmax functions (a minimal sketch follows below).
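A minimal sketch of such a from-scratch forward/backward pass is shown below; the weight initialization, variable names, and batching are illustrative assumptions, not the notebook's exact code:

```python
# Sketch of a 784-128-64-10 network with sigmoid hidden layers, softmax
# output, cross-entropy loss, and plain gradient descent (illustrative only).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # numerically stabilized
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.1, (784, 128)), np.zeros(128)
W2, b2 = rng.normal(0, 0.1, (128, 64)),  np.zeros(64)
W3, b3 = rng.normal(0, 0.1, (64, 10)),   np.zeros(10)
lr = 0.001

def train_step(X, Y_onehot):
    """One gradient-descent step on a batch X (n, 784), Y_onehot (n, 10)."""
    global W1, b1, W2, b2, W3, b3
    # Forward pass
    a1 = sigmoid(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)
    y_hat = softmax(a2 @ W3 + b3)
    # Backward pass: softmax + cross-entropy gives a simple output delta
    n = X.shape[0]
    d3 = (y_hat - Y_onehot) / n
    d2 = (d3 @ W3.T) * a2 * (1 - a2)   # sigmoid derivative
    d1 = (d2 @ W2.T) * a1 * (1 - a1)
    # Parameter updates
    W3 -= lr * a2.T @ d3; b3 -= lr * d3.sum(axis=0)
    W2 -= lr * a1.T @ d2; b2 -= lr * d2.sum(axis=0)
    W1 -= lr * X.T  @ d1; b1 -= lr * d1.sum(axis=0)
```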
- File: `numpy_relu_dnn.ipynb` (from-scratch NumPy implementation, ReLU activations)
- Architecture:
  - Input Layer: 784 units
  - Hidden Layer 1: 128 units, ReLU activation
  - Hidden Layer 2: 64 units, ReLU activation
  - Output Layer: 10 units, ReLU activation
- Training:
  - Learning Rate: 0.001
  - Epochs: 30
  - Dataset: Normalized MNIST (0.01-1 range)
- Results: Achieved 69.09% accuracy on the test set after 30 epochs.
- Extras: Custom implementation with ReLU activation on every layer, highlighting its impact on convergence (see the sketch below).
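This variant can be obtained by swapping the sigmoid activation in the earlier NumPy sketch for ReLU; the functions below show the activation and the derivative used during backpropagation (again an illustrative sketch, not the notebook's exact code):

```python
# ReLU and its derivative, used in place of sigmoid in the sketch above;
# note that this variant applies ReLU on the output layer as well.
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def relu_derivative(z):
    # 1 where the unit was active, 0 where it was clipped
    return (z > 0).astype(z.dtype)
```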
| Implementation | Framework | Activation | Epochs | Test Accuracy |
|---|---|---|---|---|
| TensorFlow DNN | TensorFlow | Sigmoid | 10 | 97.75% |
| NumPy Sigmoid DNN | NumPy | Sigmoid | 30 | 83.54% |
| NumPy ReLU DNN | NumPy | ReLU | 30 | 69.09% |
- TensorFlow DNN: Outperforms the others thanks to adaptive gradient updates (Adam) and the framework's optimized training routines.
- NumPy Sigmoid: Solid performance but slower convergence; limited by sigmoid’s vanishing gradient.
- NumPy ReLU: Lower accuracy, likely due to ReLU in the output layer causing instability in this context.
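To illustrate the last point, here is a small, purely illustrative comparison (not from the repository) of what ReLU and softmax produce for the same output scores:

```python
# Illustration only: ReLU on the output layer clips negative scores to zero
# and does not yield a probability distribution, unlike softmax.
import numpy as np

scores = np.array([2.0, -1.0, 0.5, -3.0])

relu_out = np.maximum(0.0, scores)
softmax_out = np.exp(scores) / np.exp(scores).sum()

print(relu_out, relu_out.sum())        # [2.  0.  0.5 0. ]  sums to 2.5
print(softmax_out, softmax_out.sum())  # probabilities summing to 1.0
```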
- Python 3.x
- Required libraries:
  - TensorFlow (`pip install tensorflow`)
  - NumPy (`pip install numpy`)
  - Matplotlib (`pip install matplotlib`)
- Clone the Repository:

```bash
git clone https://github.com/[YourUsername]/mnist-dnn.git
cd mnist-dnn
```