This benchmark is about reading pure PDF files, not scanned documents and not documents to which OCR has been applied.
Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz

| Name | Last PyPI Release | License | Version | Dependencies |
| --- | --- | --- | --- | --- |
| Borb | 2023-06-23 | AGPL/Commercial | 2.1.16 | |
| pypdfium2 | 2023-07-04 | Apache-2.0 or BSD-3-Clause | 4.18.0 | PDFium (Foxit/Google) |
| pdfminer.six | 2022-11-05 | MIT/X | 20221105 | |
| pdfplumber | 2023-07-29 | MIT | 0.10.2 | pdfminer.six |
| pdfrw | 2017-09-18 | MIT | 0.4 | |
| pdftotext | - | GPL | 0.86.1 | build-essential libpoppler-cpp-dev pkg-config python3-dev |
| PyMuPDF | 2023-08-24 | GNU AFFERO GPL 3.0 / Commercial | 1.23.1 | MuPDF |
| pypdf | 2023-08-26 | BSD 3-Clause | 3.15.4 | |
| Tika | 2023-01-01 | Apache v2 | 2.6.0 | Apache Tika |

Text Extraction Speed

| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | PyMuPDF | 0.1s | 0.4s | 0.2s | 0.2s | 0.2s | 0.0s | 0.1s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s |
| 2 | pypdfium2 | 0.2s | 1.9s | 0.2s | 0.2s | 0.2s | 0.0s | 0.1s | 0.1s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s | 0.0s | 0.0s |
| 3 | pdftotext | 0.3s | 0.8s | 1.0s | 0.3s | 0.8s | 0.1s | 0.2s | 0.2s | 0.1s | 0.0s | 0.1s | 0.1s | 0.1s | 0.0s | 0.0s |
| 4 | Tika | 1.1s | 12.9s | 0.9s | 0.6s | 0.4s | 0.1s | 0.3s | 0.2s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.0s | 0.0s |
| 5 | pypdf | 2.6s | 18.7s | 4.8s | 5.3s | 2.3s | 0.7s | 0.9s | 0.4s | 0.5s | 0.3s | 0.6s | 0.5s | 0.4s | 0.4s | 0.2s |
| 6 | pdfminer.six | 4.5s | 26.0s | 12.9s | 8.0s | 4.6s | 1.3s | 2.1s | 1.0s | 1.2s | 0.8s | 1.5s | 0.9s | 0.9s | 0.6s | 0.6s |
| 7 | pdfplumber | 6.7s | 41.7s | 10.9s | 11.5s | 8.4s | 2.4s | 4.3s | 2.0s | 1.9s | 1.9s | 2.7s | 1.8s | 1.7s | 1.0s | 1.2s |
| 8 | Borb | 34.7s | 111.2s | 105.0s | 1.4s | 87.2s | 21.1s | 7.4s | 83.5s | 16.4s | 20.3s | 5.4s | 3.4s | 18.8s | 3.2s | 2.1s |

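
The source does not say how the Average column is derived. As a sanity check, the sketch below assumes it is the arithmetic mean of the 14 per-document times, rounded to one decimal, and recomputes it for three rows copied from the text extraction speed figures above. The variable names are illustrative only.

```python
from statistics import mean

# Per-document text extraction times in seconds, copied from the table above.
times = {
    "PyMuPDF": [0.4, 0.2, 0.2, 0.2, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "pypdf": [18.7, 4.8, 5.3, 2.3, 0.7, 0.9, 0.4, 0.5, 0.3, 0.6, 0.5, 0.4, 0.4, 0.2],
    "Borb": [111.2, 105.0, 1.4, 87.2, 21.1, 7.4, 83.5, 16.4, 20.3, 5.4, 3.4, 18.8, 3.2, 2.1],
}

# Assumption: "Average" is the plain arithmetic mean, rounded to one decimal.
averages = {name: round(mean(values), 1) for name, values in times.items()}

# Rank libraries from fastest to slowest by that average.
ranking = sorted(averages, key=averages.get)

for rank, name in enumerate(ranking, start=1):
    print(f"{rank} {name} {averages[name]}s")
```

Under that assumption the recomputed values match the published averages for these rows (0.1s, 2.6s and 34.7s). Note that the inputs are themselves rounded, so an exact match is not guaranteed for every row.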
Image Extraction Speed

| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | PyMuPDF | 0.5s | 0.3s | 0.5s | 0.0s | 1.7s | 0.4s | 0.0s | 3.2s | 0.4s | 0.4s | 0.1s | 0.0s | 0.3s | 0.2s | 0.0s |
| 2 | pypdf | 2.8s | 16.4s | 2.1s | 0.8s | 9.2s | 1.1s | 0.0s | 6.7s | 0.9s | 0.9s | 0.4s | 0.0s | 0.7s | 0.2s | 0.1s |
| 3 | pdfminer.six | 6.5s | 31.8s | 13.7s | 9.2s | 24.0s | 1.5s | 2.3s | 1.5s | 1.4s | 0.9s | 1.5s | 0.9s | 1.0s | 0.6s | 0.5s |

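
Per-document timings like the ones above could be gathered with a small harness along the following lines. This is a minimal sketch, not the benchmark's actual code: `time_call`, `benchmark`, and `fake_extractor` are hypothetical names, and the stand-in extractor only decodes bytes so that the example runs without any PDF library installed.

```python
import time
from typing import Callable, Dict, List


def time_call(func: Callable[[bytes], object], data: bytes) -> float:
    """Return the wall-clock seconds taken by one call of func(data)."""
    start = time.perf_counter()
    func(data)
    return time.perf_counter() - start


def benchmark(
    extractors: Dict[str, Callable[[bytes], object]],
    documents: List[bytes],
) -> Dict[str, List[float]]:
    """Time every extractor on every document, preserving document order."""
    return {
        name: [time_call(func, doc) for doc in documents]
        for name, func in extractors.items()
    }


def fake_extractor(data: bytes) -> str:
    # Hypothetical stand-in; a real run would wrap a library call here.
    return data.decode("latin-1")


timings = benchmark({"fake": fake_extractor}, [b"%PDF-1.7 one", b"%PDF-1.7 two"])
for name, seconds in timings.items():
    print(name, " | ".join(f"{s:.1f}s" for s in seconds))
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations. A real comparison would also need to decide whether file loading and library import time count toward the measurement, which the source does not specify.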

| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | PyMuPDF | 0.0s | 0.0s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s |
| 2 | pdfrw | 0.1s | 0.0s | 0.4s | 0.0s | 0.3s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s |
| 3 | pypdf | 0.4s | 0.6s | 1.7s | 0.4s | 0.9s | 0.2s | 0.3s | 0.4s | 0.3s | 0.2s | 0.3s | 0.1s | 0.2s | 0.0s | 0.2s |


| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | pdfrw | 3.4MB | 2.5MB | 5.7MB | 1.6MB | 7.3MB | 2.7MB | 3.1MB | 15.4MB | 2.4MB | 1.3MB | 3.0MB | 0.3MB | 1.1MB | 0.8MB | 1.0MB |
| 2 | pypdf | 3.5MB | 2.5MB | 5.7MB | 1.6MB | 7.3MB | 2.7MB | 3.1MB | 15.4MB | 2.4MB | 1.3MB | 3.0MB | 0.3MB | 1.1MB | 0.8MB | 1.0MB |
| 3 | PyMuPDF | 3.7MB | 2.7MB | 6.8MB | 1.7MB | 8.5MB | 2.8MB | 3.4MB | 15.5MB | 2.5MB | 1.4MB | 3.2MB | 0.3MB | 1.2MB | 0.9MB | 1.1MB |

Text Extraction Quality

| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | pypdfium2 | 98% | 99% | 97% | 94% | 99% | 98% | 96% | 99% | 98% | 99% | 99% | 98% | 98% | 99% | 99% |
| 2 | pypdf | 97% | 98% | 93% | 94% | 98% | 98% | 96% | 97% | 98% | 99% | 99% | 98% | 98% | 98% | 99% |
| 3 | PyMuPDF | 97% | 98% | 96% | 93% | 97% | 98% | 96% | 98% | 98% | 98% | 98% | 97% | 97% | 98% | 99% |
| 4 | Tika | 96% | 99% | 98% | 92% | 97% | 98% | 96% | 93% | 97% | 98% | 93% | 98% | 93% | 98% | 96% |
| 5 | pdftotext | 93% | 96% | 93% | 91% | 94% | 92% | 96% | 96% | 96% | 97% | 83% | 94% | 96% | 96% | 79% |
| 6 | pdfminer.six | 90% | 95% | 79% | 86% | 92% | 86% | 93% | 95% | 93% | 92% | 92% | 93% | 86% | 98% | 86% |
| 7 | pdfplumber | 75% | 94% | 84% | 61% | 97% | 61% | 93% | 61% | 89% | 57% | 59% | 67% | 59% | 98% | 67% |
| 8 | Borb | 45% | 70% | 79% | 0% | 40% | 48% | 92% | 0% | 64% | 51% | 41% | 55% | 43% | 0% | 53% |

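
The source does not state which metric produces the quality percentages. To illustrate how a percentage of this kind can be obtained, the sketch below compares extracted text against a reference text using `difflib.SequenceMatcher` from the Python standard library. That metric choice is an assumption made for the example, and `similarity_percent` is a hypothetical helper, not part of the benchmark.

```python
from difflib import SequenceMatcher


def similarity_percent(extracted: str, reference: str) -> int:
    """Score extracted text against a reference as a whole-number percentage.

    Assumption: SequenceMatcher.ratio() stands in for whatever metric the
    benchmark really uses; it returns a float between 0.0 and 1.0.
    """
    ratio = SequenceMatcher(None, extracted, reference).ratio()
    return round(ratio * 100)


reference = "Hello PDF world"

# Identical output scores 100, an empty output scores 0.
print(similarity_percent(reference, reference))
print(similarity_percent("", reference))

# A single wrong character lowers the score without zeroing it.
print(similarity_percent("Hello PDF w0rld", reference))
```

A score computed this way depends heavily on how whitespace, line breaks, and ligatures are normalized before comparison, so percentages from different metrics are not directly comparable.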