A Python-based desktop application that combines OpenCV for image processing and Tesseract OCR for text extraction, wrapped in a user-friendly Tkinter interface. This application allows users to load images, detect text regions, extract text content, and save the results.
- User-Friendly Interface: Clean and intuitive Tkinter-based GUI
- Image Processing: Grayscale conversion, binary thresholding, and dilation using OpenCV
- Text Detection: Text region detection using contour analysis with size-based filtering
- OCR Integration: Text extraction using Google's Tesseract OCR engine
- File Management: Support for various image formats and text export
- Real-Time Preview: Live display of processed images with detected text regions
- Progress Tracking: Visual feedback during processing operations
Before running the application, ensure you have the following installed:
pip install opencv-python
pip install pytesseract
pip install pillow
- Download the Tesseract installer from the official GitHub repository
- Run the installer and note the installation path
- Update the tesseract_cmd path in the code to match your installation:
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
sudo apt-get update
sudo apt-get install tesseract-ocr
brew install tesseract
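The Tesseract path from the installation steps above can also be resolved at startup instead of being hard-coded. The following is a minimal sketch, not code from this repository: the helper name find_tesseract and the FALLBACK_PATHS list are assumptions made for illustration, and only the Windows default path is taken from the instructions above.

```python
import os
import shutil

# Fallback locations to try when "tesseract" is not on PATH.
# Only the Windows default from the instructions above is listed;
# add your own installation path if it differs (assumption).
FALLBACK_PATHS = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
]


def find_tesseract(fallbacks=FALLBACK_PATHS):
    """Return the path to the Tesseract binary, or None if it is not found."""
    # shutil.which searches the directories listed in the PATH variable.
    found = shutil.which("tesseract")
    if found:
        return found
    # Otherwise try the known installation locations one by one.
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract not found; install it or set tesseract_cmd manually.")
    else:
        print(f"Tesseract found at: {path}")
        # In main.py the value could then be assigned like this:
        # pytesseract.pytesseract.tesseract_cmd = path
```

On Linux and macOS the package managers normally place the binary on PATH, so the fallback list is mainly useful on Windows.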
- Clone the repository:
git clone https://github.com/nguyensinhloc/TextRecognitionFromImages.git
cd TextRecognitionFromImages
- Install required dependencies:
pip install -r requirements.txt
- Update Tesseract path in main.py if necessary
- Run the application:
python main.py
- Loading Images
  - Click the "Load Image" button
  - Select an image file (supported formats: PNG, JPG, JPEG, BMP, TIFF)
  - The image will be displayed in the left panel
- Detecting Text
  - Click the "Detect Text" button
  - The application will:
    - Process the image using OpenCV
    - Detect text regions
    - Highlight detected regions in green
    - Extract text using OCR
    - Display extracted text in the right panel
- Saving Results
  - Click the "Save Text" button
  - Choose a location and filename
  - The extracted text will be saved as a UTF-8 encoded text file
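The file handling behind the load and save steps above can be sketched with the standard library alone. This is an illustrative example under stated assumptions, not the code in main.py: the names SUPPORTED_EXTENSIONS, is_supported_image, and save_text are hypothetical, and the extension set simply mirrors the formats listed above.

```python
from pathlib import Path

# Extensions mirroring the supported formats listed above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tiff"}


def is_supported_image(path):
    """Return True if the file extension is one of the supported formats."""
    # Compare in lower case so "SCAN.PNG" and "scan.png" are treated alike.
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def save_text(path, text):
    """Write the extracted text to disk as a UTF-8 encoded file."""
    # An explicit encoding avoids depending on the platform default,
    # which matters for non-ASCII text such as Vietnamese.
    Path(path).write_text(text, encoding="utf-8")
```

Passing the encoding explicitly matters mostly on Windows, where the default text encoding is often not UTF-8.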
TextRecognitionFromImages/
│
├── main.py # Main application file
├── requirements.txt # Project dependencies
└── README.md # Project documentation
- Image Processing Pipeline
  - Convert image to grayscale
  - Apply binary thresholding
  - Perform dilation for better text region detection
  - Find contours in the processed image
- Text Detection
  - Filter contours based on size
  - Draw rectangles around detected text regions
  - Extract region of interest (ROI) for each detection
- OCR Processing
  - Process each ROI using Tesseract OCR
  - Combine extracted text
  - Display results in the application
- Tesseract Not Found
  - Verify that Tesseract is installed correctly
  - Check that the path in the code matches your installation
  - Ensure the Tesseract installation directory is on your PATH environment variable
- Poor Text Detection
  - Ensure the image is clear and well-lit
  - Try adjusting the threshold values in the code
  - Consider preprocessing images before loading
- Performance Issues
  - Reduce image size before processing
  - Close other resource-intensive applications
  - Check that your system meets the minimum requirements
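For the advice to reduce image size before processing, one possible approach is to cap the longer side of the image while keeping its aspect ratio. The sketch below is an assumption-laden example rather than part of the application: scaled_size, downscale, and the 1600-pixel default are hypothetical names and values.

```python
def scaled_size(width, height, max_side=1600):
    """Return (width, height) scaled so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        # Already small enough; never upscale.
        return width, height
    scale = max_side / longest
    # Keep at least one pixel on each side for extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))


def downscale(image, max_side=1600):
    """Shrink an OpenCV image (NumPy array) so its longer side fits max_side."""
    # Imported lazily so scaled_size can be used without OpenCV.
    import cv2

    height, width = image.shape[:2]
    new_w, new_h = scaled_size(width, height, max_side)
    if (new_w, new_h) == (width, height):
        return image
    # INTER_AREA is the interpolation OpenCV recommends for shrinking.
    return cv2.resize(image, (new_w, new_h), interpolation=cv2.INTER_AREA)
```

Shrinking too far can hurt OCR accuracy on small print, so the limit is worth tuning against your own images.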
- Fork the repository
- Create a new branch (git checkout -b feature/improvement)
- Make changes
- Commit your changes (git commit -am 'Add new feature')
- Push to the branch (git push origin feature/improvement)
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenCV for image processing capabilities
- Google's Tesseract OCR engine
- Python Tkinter for the GUI framework
- All contributors and testers
- 1.0.0 (2024-11-22)
  - Initial release
  - Basic text detection and OCR functionality
  - Tkinter GUI implementation
Nguyễn Sinh Lộc
Project Link: https://github.com/nguyensinhloc/TextRecognitionFromImages