This repository collects resources that are helpful for building a system that extracts data from fiscal receipts.
- Receipts2Go: The Big World of Small Documents
- Deep Learning for automatic sale receipt understanding
- From one group of authors:
  - OCR Engine to extract Food-items and Prices from Receipt Images via Pattern matching and heuristics approach
  - OCR Engine to Extract Food-Items, Prices, Quantity, Units from Receipt Images, Heuristics Rules Based Approach
  - Optical Character Recognition Engine to extract Food-items and Prices from Grocery Receipt Images via Templating and Dictionary-Traversal Technique
- Automated Receipt Image Identification, Cropping, and Parsing
- Separation and Extraction of Valuable Information From Digital Receipts Using Google Cloud Vision OCR
- OCR accuracy improvement on document images through a novel pre-processing approach
- Mobile Scanner and OCR (A first step towards receipt to spreadsheet)
- A Novel Integrated Framework for Learning both Text Detection and Recognition
- A Multitask Network for Localization and Recognition of Text in Images
- Towards Unconstrained End-to-End Text Spotting
- Utilize OCR text to extract receipt data and classify receipts with common Machine Learning algorithms
- Chargrid: Towards Understanding 2D Documents
- Graph Convolution for Multimodal Information Extraction from Visually Rich Documents
- Visual-Linguistic Methods for Receipt Field Recognition
- CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor
- Attend, Copy, Parse: End-to-end Information Extraction from Documents
- End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional U-Net
- EATEN: Entity-aware Attention for Single Shot Visual Text Extraction
- Post-OCR parsing: building simple and robust parser via BIO tagging
- LayoutLM: Pre-training of Text and Layout for Document Image Understanding