Computer-Vision-Object-Detection-and-Segmentation

Computer Vision Object Detection and Segmentation using PyTorch, TorchVision, and TensorBoard

Pedestrian Detection and Segmentation

This project demonstrates object detection and instance segmentation with a Mask R-CNN using PyTorch and TorchVision. It fine-tunes a pre-trained Mask R-CNN model on the Penn-Fudan dataset, which contains 170 images with 345 instances of pedestrians.

Overview

The training script, train.py, performs the following tasks:

Data Preparation: Loads and prepares the pedestrian dataset for training and evaluation.
Model Customization: Customizes the head of a pre-trained Mask R-CNN model for the specific number of classes in the dataset.
Training: Trains the customized Mask R-CNN model using the prepared dataset.

The evaluation script, eval.py, performs the following tasks:

Evaluation: Evaluates the trained model on a test set and visualizes the results, including bounding box detection and instance segmentation masks.

Requirements

Python 3.6+
PyTorch
TorchVision
PyCocoTools
Matplotlib
TensorBoard (for visualization)

Setup

Clone the repository
Install the required packages

Usage

Run the training script: python train.py
During training, TensorBoard logs will be saved to the runs/PennFudanPed directory. You can visualize the training progress using TensorBoard: tensorboard --logdir=runs
After training, the model weights will be saved as model.pth.
Run the evaluation script on a test image: python eval.py. The script displays the output with bounding boxes and segmentation masks, which you can also view in TensorBoard.
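To reuse the saved weights outside eval.py, something like the sketch below may work. It assumes model.pth holds a state_dict written with torch.save(model.state_dict(), "model.pth"); if the training script saves the whole model object instead, this will not apply. The helper name load_weights is hypothetical.

```python
import torch


def load_weights(model, path="model.pth", device="cpu"):
    """Load a saved state_dict into `model` and switch it to inference mode."""
    state_dict = torch.load(path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()  # disables training-only behavior such as loss computation
    return model
```

The model passed in must have the same architecture and number of classes as the one that was trained, otherwise load_state_dict raises an error about mismatched keys or shapes.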

Customization

You can customize the project to suit your needs by modifying the train.py or eval.py script. For example, you can:

Change the pre-trained model used for fine-tuning
Modify the hyperparameters (e.g., learning rate, batch size)
Experiment with different data augmentation techniques
Modify the confidence threshold
Detect partially occluded objects
Integrate additional evaluation metrics
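As an example of the data augmentation item above, here is a minimal sketch of a random horizontal flip that keeps boxes and masks aligned with the image. The function name and the target layout (a dictionary with boxes in (xmin, ymin, xmax, ymax) format and optional masks) are assumptions and may differ from what transforms.py in this repository does.

```python
import torch


def random_horizontal_flip(image, target, p=0.5):
    """Flip a (C, H, W) image and its target left-right with probability p."""
    if torch.rand(1).item() >= p:
        return image, target

    width = image.shape[-1]
    flipped = image.flip(-1)

    new_target = dict(target)  # shallow copy so the input dict is untouched
    boxes = target["boxes"].clone()
    # Mirror x coordinates: new xmin = W - old xmax, new xmax = W - old xmin
    boxes[:, [0, 2]] = width - target["boxes"][:, [2, 0]]
    new_target["boxes"] = boxes
    if "masks" in target:
        new_target["masks"] = target["masks"].flip(-1)
    return flipped, new_target
```

Flipping the image without updating the boxes and masks would silently corrupt the labels, which is why detection pipelines usually apply transforms to the image and target together.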

Acknowledgments

This project was inspired by the seminal work Mask R-CNN (https://arxiv.org/pdf/1703.06870.pdf) and uses code from the PyTorch and TorchVision repositories. Special thanks to the PyTorch team for providing an excellent deep learning framework and to the open-source community for their valuable resources and contributions.

evan-sctg/Computer-Vision-Object-Detection-and-Segmentation