Text Detection and OCR Application

A Python desktop application that combines OpenCV image processing with Tesseract OCR for text extraction, wrapped in a Tkinter interface. Users can load an image, detect text regions, extract the text, and save the result.

Features

- User-Friendly Interface: Clean and intuitive Tkinter-based GUI
- Image Processing: Grayscale conversion, binary thresholding, and dilation using OpenCV
- Text Detection: Text region detection using contour analysis
- OCR Integration: Text extraction using Google's Tesseract OCR engine
- File Management: Loads PNG, JPG, JPEG, BMP, and TIFF images and exports extracted text as UTF-8
- Preview: Displays the processed image with detected text regions highlighted in green
- Progress Tracking: Visual feedback during processing operations

Prerequisites

Before running the application, ensure you have the following installed:

Python Dependencies

pip install opencv-python
pip install pytesseract
pip install pillow

Tesseract OCR Installation

Windows

- Download the Tesseract installer from the official GitHub repository
- Run the installer and note the installation path
- Update the tesseract_cmd path in the code to match your installation:

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
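
A hard-coded path only works on machines that use that exact install location. As a minimal sketch (not the code in main.py), the lookup below first searches PATH and then falls back to the Windows default shown above. The names `find_tesseract`, `fallback`, and `search_path` are hypothetical and introduced only for illustration.

```python
import os
import shutil

# Default location from the Windows instructions above.
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback=WINDOWS_DEFAULT, search_path=None):
    """Return a path to the tesseract executable, or None if it is not found.

    Hypothetical helper: search_path overrides PATH (useful for testing).
    """
    # shutil.which looks for the executable on PATH (or on search_path if given).
    found = shutil.which("tesseract", path=search_path)
    if found:
        return found
    # Otherwise fall back to an explicit location, but only if the file exists.
    if os.path.isfile(fallback):
        return fallback
    return None
```

If the function returns a path, it could be assigned to `pytesseract.pytesseract.tesseract_cmd`; if it returns None, the application could show an error instead of failing later during OCR.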

Linux

sudo apt-get update
sudo apt-get install tesseract-ocr

macOS

brew install tesseract

Installation

Clone the repository:

git clone https://github.com/nguyensinhloc/TextRecognitionFromImages.git
cd TextRecognitionFromImages

Install required dependencies:

pip install -r requirements.txt

Update the Tesseract path in main.py if necessary (see Tesseract OCR Installation above)
Run the application:

python main.py

Usage

Loading Images
- Click the "Load Image" button
- Select an image file (supported formats: PNG, JPG, JPEG, BMP, TIFF)
- The image will be displayed in the left panel
Detecting Text
- Click the "Detect Text" button
- The application will:
  - Process the image using OpenCV
  - Detect text regions
  - Highlight detected regions in green
  - Extract text using OCR
  - Display extracted text in the right panel
Saving Results
- Click the "Save Text" button
- Choose a location and filename
- The extracted text will be saved as a UTF-8 encoded text file
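
For readers curious how the UTF-8 export might look in code, here is a small sketch. It assumes the extracted text is already available as a Python string; `save_text` is a hypothetical name, not a function confirmed to exist in main.py.

```python
from pathlib import Path


def save_text(text, destination):
    """Write text to destination as UTF-8 and return the resulting Path."""
    path = Path(destination)
    # An explicit encoding avoids relying on the platform default,
    # which matters when the OCR output contains non-ASCII characters.
    path.write_text(text, encoding="utf-8")
    return path
```

In a Tkinter application the destination would typically come from `tkinter.filedialog.asksaveasfilename`.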

Project Structure

TextRecognitionFromImages/
│
├── main.py             # Main application file
├── requirements.txt    # Project dependencies
└── README.md           # Project documentation

How It Works

Image Processing Pipeline
- Convert image to grayscale
- Apply binary thresholding
- Perform dilation for better text region detection
- Find contours in the processed image
Text Detection
- Filter contours based on size
- Draw rectangles around detected text regions
- Extract region of interest (ROI) for each detection
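
As a rough sketch of the size filter and ROI extraction, the helpers below work on `(x, y, w, h)` bounding boxes, the format returned by `cv2.boundingRect`. The function names and the minimum sizes of 20 and 10 pixels are assumptions for illustration; the thresholds used in main.py may differ.

```python
def filter_boxes(boxes, min_width=20, min_height=10):
    """Keep (x, y, w, h) boxes that are at least min_width by min_height pixels."""
    return [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        if w >= min_width and h >= min_height
    ]


def box_to_slices(box):
    """Convert an (x, y, w, h) box to (row_slice, col_slice) for array indexing."""
    x, y, w, h = box
    # Image arrays are indexed rows first (y), then columns (x).
    return slice(y, y + h), slice(x, x + w)
```

With a NumPy image, `image[box_to_slices(box)]` yields the ROI, and `cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)` draws the green outline (OpenCV uses BGR channel order).
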
OCR Processing
- Process each ROI using Tesseract OCR
- Combine extracted text
- Display results in the application
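
One way to combine per-region results is sketched below. The OCR function is passed in as a parameter so the logic can be tried without Tesseract installed; `combine_ocr_text` is a hypothetical name, and joining with newlines while skipping empty results is an assumption about how the text is combined.

```python
def combine_ocr_text(rois, ocr):
    """Run ocr on each ROI and join the non-empty results with newlines.

    ocr is any callable that maps an image to a string.
    """
    pieces = []
    for roi in rois:
        text = ocr(roi).strip()
        if text:  # skip regions where OCR returned only whitespace
            pieces.append(text)
    return "\n".join(pieces)
```

In the application, `pytesseract.image_to_string` could be passed as `ocr`.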

Troubleshooting

Common Issues

Tesseract Not Found
- Verify Tesseract is installed correctly
- Check the path in the code matches your installation
- Ensure the Tesseract installation directory is on your PATH environment variable
Poor Text Detection
- Ensure image is clear and well-lit
- Try adjusting the threshold values in the code
- Consider preprocessing images before loading
Performance Issues
- Reduce image size before processing
- Close other resource-intensive applications
- Check system meets minimum requirements
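
To illustrate the "reduce image size" suggestion, the helper below computes a target size that caps the longer side while preserving the aspect ratio. The name `scaled_size` and the 1600-pixel limit are assumptions chosen for the example, not values taken from main.py.

```python
def scaled_size(width, height, max_side=1600):
    """Return (new_width, new_height) with the longer side capped at max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; never upscale
    scale = max_side / longest
    # Round to whole pixels and keep each side at least 1 pixel.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The result can be passed to `cv2.resize(image, (new_width, new_height), interpolation=cv2.INTER_AREA)`. Shrinking too far can hurt OCR accuracy on small text, so the limit is worth tuning per image set.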

Contributing

- Fork the repository
- Create a new branch (git checkout -b feature/improvement)
- Make changes
- Commit your changes (git commit -am 'Add new feature')
- Push to the branch (git push origin feature/improvement)
- Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

- OpenCV for image processing capabilities
- Google's Tesseract OCR engine
- Python Tkinter for the GUI framework
- All contributors and testers

Version History

1.0.0 (2024-11-22)
- Initial release
- Basic text detection and OCR functionality
- Tkinter GUI implementation

Contact

Nguyễn Sinh Lộc

Project Link: https://github.com/nguyensinhloc/TextRecognitionFromImages
