This benchmark covers reading native ("pure") PDF files: not scanned documents, and not documents whose text was produced by OCR.

Benchmark machine CPU: Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz

# | Name | File Size | Pages |
---|---|---|---|
1 | 2201.00214 | 2.4MiB | 22 |
2 | GeoTopo-book | 5.1MiB | 117 |
3 | 2201.00151 | 1.5MiB | 12 |
4 | 1707.09725 | 7.0MiB | 134 |
5 | 2201.00021 | 2.6MiB | 10 |
6 | 2201.00037 | 2.9MiB | 33 |
7 | 2201.00069 | 14.7MiB | 15 |
8 | 2201.00178 | 2.3MiB | 16 |
9 | 2201.00201 | 1.3MiB | 9 |
10 | 1602.06541 | 2.9MiB | 16 |
11 | 2201.00200 | 284.8KiB | 7 |
12 | 2201.00022 | 1.2MiB | 14 |
13 | 2201.00029 | 797.6KiB | 12 |
14 | 1601.03642 | 1004.9KiB | 8 |

Name | Last PyPI Release | License | Version | Dependencies |
---|---|---|---|---|
Borb | 2024-08-03 | AGPL/Commercial | 2.1.16 | |
pypdfium2 | 2024-12-19 | Apache-2.0 or BSD-3-Clause | 4.30.1 | PDFium (Foxit/Google) |
pdfminer.six | 2024-07-06 | MIT/X | 20231228 | |
pdfplumber | 2025-01-01 | MIT | 0.11.5 | pdfminer.six |
pdfrw | 2017-09-18 | MIT | 0.4 | |
pdftotext | - | GPL | 0.86.1 | build-essential libpoppler-cpp-dev pkg-config python3-dev |
playa | 2025-02-20 | MIT | 0.3.0 | |
PyMuPDF | 2025-02-06 | GNU Affero GPL 3.0 / Commercial | 1.25.3 | MuPDF |
pypdf | 2025-02-09 | BSD 3-Clause | 5.3.0 | |
Tika | 2023-01-01 | Apache v2 | 2.6.0 | Apache Tika |

# | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | pypdfium2 | 0.1s | 0.8s | 0.3s | 0.2s | 0.2s | 0.0s | 0.1s | 0.1s | 0.1s | 0.0s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s |
2 | PyMuPDF | 0.2s | 1.3s | 0.4s | 0.7s | 0.3s | 0.1s | 0.2s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.0s | 0.0s |
3 | pdftotext | 0.3s | 1.0s | 1.1s | 0.3s | 0.8s | 0.1s | 0.3s | 0.2s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.0s | 0.1s |
4 | playa | 2.5s | 17.2s | 5.3s | 4.4s | 2.2s | 0.7s | 1.1s | 0.6s | 0.6s | 0.4s | 0.7s | 0.5s | 0.6s | 0.4s | 0.2s |
5 | pypdf | 4.1s | 28.7s | 8.1s | 8.1s | 3.9s | 1.2s | 2.0s | 0.8s | 1.0s | 0.8s | 1.0s | 0.9s | 0.8s | 0.6s | 0.4s |
6 | pdfminer.six | 9.0s | 55.9s | 23.7s | 16.8s | 8.9s | 2.3s | 4.0s | 1.8s | 2.2s | 1.5s | 2.7s | 1.8s | 2.0s | 1.1s | 0.9s |
7 | pdfplumber | 13.0s | 86.4s | 22.7s | 23.4s | 14.2s | 4.2s | 7.1s | 3.3s | 3.2s | 2.9s | 4.4s | 3.3s | 3.5s | 1.9s | 1.7s |
8 | Tika | 24.4s | 17.8s | 100.1s | 0.6s | 23.4s | 47.3s | 48.3s | 31.5s | 34.5s | 0.1s | 13.2s | 0.1s | 24.2s | 0.1s | 0.1s |
9 | Borb | 50.5s | 188.4s | 149.1s | 2.3s | 113.6s | 28.4s | 11.7s | 112.3s | 23.7s | 27.1s | 8.4s | 5.7s | 27.7s | 4.9s | 2.9s |
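
The harness that produced these timings is not shown here. As a minimal sketch of how per-document times and an Average cell could be measured, assuming a hypothetical `extract` callable that wraps one library's extraction call, something like the following would do. The names `time_per_document` and `format_row` are illustrative and not part of any library listed above.

```python
import time
from statistics import mean
from typing import Callable, Dict, Iterable


def time_per_document(
    extract: Callable[[str], object], paths: Iterable[str]
) -> Dict[str, float]:
    """Run `extract` once per path and record the elapsed seconds."""
    timings: Dict[str, float] = {}
    for path in paths:
        start = time.perf_counter()
        extract(path)  # hypothetical wrapper around a library call
        timings[path] = time.perf_counter() - start
    return timings


def format_row(name: str, timings: Dict[str, float]) -> str:
    """Render one table row: library name, average, then each document."""
    # The average is taken over the unrounded values, so it can differ
    # from the mean of the rounded cells that end up in the table.
    cells = [f"{mean(timings.values()):.1f}s"]
    cells += [f"{seconds:.1f}s" for seconds in timings.values()]
    return " | ".join([name, *cells])
```

Averaging before rounding is one possible reason an Average cell would not exactly equal the mean of the one-decimal values displayed in its row.
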

# | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | PyMuPDF | 0.6s | 0.3s | 0.7s | 0.0s | 2.2s | 0.6s | 0.0s | 3.3s | 0.5s | 0.5s | 0.1s | 0.0s | 0.4s | 0.3s | 0.0s |
2 | pypdfium2 | 1.3s | 1.5s | 2.3s | 0.0s | 4.3s | 1.2s | 0.2s | 5.7s | 0.9s | 0.9s | 0.3s | 0.1s | 0.7s | 0.3s | 0.0s |
3 | pypdf | 5.2s | 24.6s | 7.0s | 6.6s | 18.9s | 1.7s | 0.7s | 7.6s | 1.5s | 1.5s | 0.9s | 0.2s | 1.3s | 0.3s | 0.2s |
4 | pdfminer.six | 12.3s | 69.2s | 24.6s | 20.6s | 36.6s | 2.6s | 4.1s | 2.4s | 2.3s | 1.5s | 2.7s | 2.0s | 2.1s | 1.1s | 0.9s |

# | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | pdfrw | 0.1s | 0.1s | 0.5s | 0.1s | 0.4s | 0.1s | 0.1s | 0.2s | 0.1s | 0.1s | 0.1s | 0.1s | 0.2s | 0.0s | 0.0s |
2 | PyMuPDF | 0.2s | 0.5s | 0.7s | 0.2s | 0.5s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s |
3 | pypdf | 0.6s | 0.7s | 2.3s | 0.5s | 1.7s | 0.3s | 0.4s | 0.5s | 0.4s | 0.2s | 0.5s | 0.2s | 0.6s | 0.1s | 0.1s |

# | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | pypdf | 3.4MB | 2.5MB | 5.6MB | 1.6MB | 7.2MB | 2.7MB | 3.1MB | 15.4MB | 2.4MB | 1.3MB | 3.0MB | 0.3MB | 1.2MB | 0.8MB | 1.0MB |
2 | pdfrw | 3.5MB | 2.5MB | 5.7MB | 1.6MB | 7.3MB | 2.7MB | 3.1MB | 15.4MB | 2.4MB | 1.3MB | 3.0MB | 0.3MB | 1.2MB | 0.8MB | 1.0MB |
3 | PyMuPDF | 3.7MB | 2.7MB | 6.9MB | 1.7MB | 8.5MB | 2.8MB | 3.4MB | 15.5MB | 2.5MB | 1.4MB | 3.2MB | 0.3MB | 1.3MB | 0.9MB | 1.1MB |
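
The input document table lists sizes in binary units (MiB, KiB), while the table above uses MB. Two things are assumed here and not stated explicitly: that MB means decimal megabytes (10^6 bytes), and that the numbered columns correspond to the numbered input documents. Under those assumptions the conversion can be sketched as follows; the helper name `to_megabytes` is illustrative.

```python
import re

# Bytes per binary unit as used in the input document table.
_UNIT_BYTES = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def to_megabytes(size: str) -> float:
    """Convert a size such as '14.7MiB' to decimal megabytes (10**6 bytes)."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*(KiB|MiB|GiB)\s*", size)
    if match is None:
        raise ValueError(f"unrecognised size: {size!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_BYTES[unit] / 1_000_000


if __name__ == "__main__":
    for label in ("2.4MiB", "14.7MiB", "284.8KiB"):
        print(f"{label} = {to_megabytes(label):.1f} MB")
```

With this conversion, 14.7MiB is about 15.4 MB, in line with the values reported in column 7.
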

# | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | pypdfium2 | 97% | 99% | 97% | 94% | 99% | 98% | 96% | 99% | 99% | 99% | 99% | 98% | 78% | 99% | 99% |
2 | pypdf | 96% | 99% | 95% | 93% | 98% | 99% | 96% | 97% | 99% | 99% | 99% | 99% | 78% | 100% | 99% |
3 | PyMuPDF | 96% | 98% | 96% | 93% | 97% | 98% | 95% | 99% | 98% | 98% | 98% | 97% | 77% | 98% | 99% |
4 | playa | 96% | 98% | 93% | 93% | 98% | 98% | 95% | 97% | 97% | 98% | 99% | 98% | 77% | 96% | 99% |
5 | pdfplumber | 93% | 96% | 89% | 89% | 98% | 92% | 94% | 93% | 95% | 93% | 97% | 94% | 76% | 99% | 98% |
6 | pdftotext | 92% | 96% | 94% | 91% | 95% | 92% | 96% | 96% | 96% | 97% | 83% | 94% | 77% | 96% | 79% |
7 | pdfminer.six | 89% | 95% | 79% | 86% | 92% | 86% | 93% | 95% | 93% | 92% | 92% | 93% | 71% | 98% | 86% |
8 | Tika | 83% | 99% | 0% | 92% | 95% | 77% | 86% | 81% | 82% | 98% | 88% | 98% | 67% | 98% | 96% |
9 | Borb | 45% | 70% | 79% | 0% | 40% | 48% | 92% | 0% | 64% | 51% | 41% | 55% | 41% | 0% | 53% |
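
How the percentages above are computed is not described here. Assuming they are similarity scores between each library's output and a reference text for the document, one way to produce a comparable number is a sequence-similarity ratio. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in; it is not necessarily the metric behind this table, and the function name `similarity_percent` is illustrative.

```python
from difflib import SequenceMatcher


def similarity_percent(extracted: str, reference: str) -> str:
    """Return the similarity of two texts as a whole-number percentage."""
    # ratio() is 2 * matches / (len(a) + len(b)), ranging from 0.0 to 1.0.
    ratio = SequenceMatcher(None, extracted, reference).ratio()
    return f"{ratio:.0%}"
```

Under this kind of metric, an empty extraction result scores 0% against any non-empty reference.
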