URL to Markdown API

Convert web content into clean, LLM-ready Markdown with a single API call. This FastAPI service converts web pages, documents, and media files into structured Markdown, which suits AI/ML pipelines, content aggregation, and data processing workflows.

🚀 Key Features

Universal Content Support: Convert web articles, YouTube videos, PDFs, Office documents, and more
LLM-Optimized Output: Clean, structured Markdown perfect for AI/ML processing
Rich Media Handling: Extract metadata from images, audio files, and videos
Smart Processing: OCR for images, transcription for audio, and intelligent content extraction
Simple Integration: RESTful API with clear error handling and response codes

🌐 Live Demo

Try it now at markdown.nimk.ir - Transform any URL into clean Markdown instantly!

Example 1: Converting a Web Article

GET https://markdown.nimk.ir/https://ask.library.arizona.edu/faq/407985

This will convert the library FAQ article into clean, readable Markdown format.

Example 2: Converting a YouTube Video

GET https://markdown.nimk.ir/https://www.youtube.com/watch?v=dQw4w9WgXcQ

This will extract the video title, description, and other metadata in Markdown format.

Features

- Convert web pages to clean Markdown optimized for LLM processing
- Support for various content types, including:
  - Web articles and HTML content
  - YouTube videos
  - PDF documents
  - PowerPoint presentations
  - Word documents
  - Excel spreadsheets
  - Images (with EXIF metadata and OCR)
  - Audio files (with metadata and transcription)
  - Text-based formats (CSV, JSON, XML)
  - ZIP files (processes contents)
- Automatic URL protocol handling
- Clean error handling with appropriate HTTP status codes
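
"Automatic URL protocol handling" presumably means that a target given without a scheme, such as `www.youtube.com/...`, still resolves. A minimal sketch of that kind of normalization follows. The function name `ensure_scheme` and the `https` default are assumptions made for illustration, not the service's actual code.

```python
def ensure_scheme(target: str, default: str = "https") -> str:
    """Return target unchanged if it already has an http(s) scheme, else prefix one."""
    # Hypothetical helper: the real service may normalize URLs differently.
    if target.lower().startswith(("http://", "https://")):
        return target
    return f"{default}://{target.lstrip('/')}"


print(ensure_scheme("www.youtube.com/watch?v=dQw4w9WgXcQ"))
print(ensure_scheme("https://pdfobject.com/pdf/sample.pdf"))
```

A URL that already carries a scheme passes through untouched, so applying the helper twice is harmless.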

Why Markdown for LLMs?

Structured Format: Markdown provides a clean, hierarchical structure that LLMs can easily parse and understand
Consistent Representation: Different content types are normalized into a unified text format
Preserved Semantics: Headers, lists, and emphasis are maintained in a way that preserves document structure
Reduced Noise: Removes unnecessary formatting and styling, focusing on content
Enhanced Accessibility: Makes content more accessible for text analysis and natural language processing

Installation

Standard Installation

1. Clone the repository
2. Install dependencies:

pip install -r requirements.txt

Docker Deployment

Using Docker Compose (Recommended)

docker-compose up -d

This will build and start the service in detached mode. The API will be available at http://localhost:8000

Using Docker directly

# Build the image
docker build -t url-to-markdown .

# Run the container
docker run -d -p 8000:8000 url-to-markdown

Usage

Start the server:

uvicorn main:app --reload

The API will be available at http://localhost:8000

API Endpoint

GET /{url}

The URL should be URL-encoded if it contains special characters.
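
One way to build an encoded request from Python is to percent-encode the whole target with the standard library's `urllib.parse.quote` before appending it to the base address. This is a sketch rather than a required client: the helper name `build_request_url` is hypothetical, and encoding every reserved character (including `/` and `:`) is a conservative choice that the service may not strictly need.

```python
from urllib.parse import quote


def build_request_url(base: str, target: str) -> str:
    """Percent-encode target and append it as the path of base."""
    # safe="" also encodes "/", ":", "?", "&" and "=", so the target's
    # query string cannot be mistaken for a query string of the API request.
    return f"{base.rstrip('/')}/{quote(target, safe='')}"


print(build_request_url("http://localhost:8000",
                        "https://www.youtube.com/watch?v=dQw4w9WgXcQ"))
# http://localhost:8000/https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DdQw4w9WgXcQ
```

Encoding matters most for targets that contain `?`, `&`, or `#`, since an HTTP client would otherwise treat what follows as part of the outer request rather than the target.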

Example Use Cases

Converting YouTube Videos

```
GET http://localhost:8000/www.youtube.com/watch?v=dQw4w9WgXcQ
```

This will return the video title, description, and metadata in Markdown format.

Converting PDF Documents

```
GET http://localhost:8000/https://pdfobject.com/pdf/sample.pdf
```

This will convert the PDF content into readable Markdown text.

Converting Web Articles

GET http://localhost:8000/https://dev.to/iw4p/scraping-tweets-without-twitter-api-and-free-5g9c

This will convert the article content into clean Markdown format.

Response Format

Successful response: Plain text Markdown content

# Article Title

## Content

[Article content in Markdown format]

Error Responses

400: URL processing failed
415: Unsupported URL format
500: Internal server error
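
A client can turn these status codes into readable failures. The sketch below uses only the Python standard library. The names `explain_status` and `fetch_markdown`, the 30-second timeout, the UTF-8 decoding, and the default base address are all assumptions for illustration; only the three status codes and their meanings come from the list above.

```python
import urllib.error
import urllib.request
from urllib.parse import quote

# Status codes and meanings as listed in this README.
ERROR_MEANINGS = {
    400: "URL processing failed",
    415: "Unsupported URL format",
    500: "Internal server error",
}


def explain_status(code: int) -> str:
    """Map a status code to the README's description, with a generic fallback."""
    return ERROR_MEANINGS.get(code, f"Unexpected HTTP status {code}")


def fetch_markdown(target: str, base: str = "http://localhost:8000") -> str:
    """Request the Markdown for target; raise RuntimeError on an HTTP error status."""
    request_url = f"{base.rstrip('/')}/{quote(target, safe='')}"
    try:
        with urllib.request.urlopen(request_url, timeout=30) as response:
            # A successful response body is plain-text Markdown (UTF-8 assumed).
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        raise RuntimeError(f"{exc.code}: {explain_status(exc.code)}") from exc
```

With the server running locally, `fetch_markdown("https://pdfobject.com/pdf/sample.pdf")` would return the converted text, or raise an error whose message starts with one of the codes above.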

Development

This project uses:

FastAPI for the web framework
MarkItDown for content conversion
Python 3.12+

License

MIT License
