RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Updated Jul 2, 2025 - Python
A Repo For Document AI
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Parsing-free RAG supported by VLMs
Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Datasets and Evaluation Scripts for CompHRDoc
[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.
Run optical character recognition with PyTesseract from the FiftyOne App!
This small module connects Label Studio with Fonduer by creating a Fonduer labeling function for gold labels from a Label Studio export. Documentation: https://irgroup.github.io/labelstudio-to-fonduer/
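Connects a Yuque knowledge base to large language models to build an intelligent question-answering system based on RAG (Retrieval-Augmented Generation). Supports FastAPI and is compatible with the OpenAI API and local Ollama models.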
Multimodal benchmark for evaluating handwritten editorial correction in printed text.
Official implementation for "SlimDoc: Lightweight Distillation of Document Transformer Models," published in the International Journal on Document Analysis and Recognition (IJDAR), 2025
PDF Chatbot is an AI-driven application that lets users chat with their PDF documents. It extracts text from uploaded PDFs and uses a powerful language model to answer user queries in a context-aware manner. The chatbot is built with Python, Gradio for the web interface, PyPDF2 for PDF parsing, and Hugging Face Transformers + LangChain for natural
This repository includes the ReceiptVQA dataset and the PyTorch implementation of the LiGT method and other evaluated baselines.
The Document Q&A with Google Gemma project builds a system that answers questions about documents using the Google Gemma API. It applies natural language processing (NLP) techniques to provide context-aware responses.