BUG: fix sheared image #2801
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files: coverage diff

| Metric   | main   | #2801  |
|----------|--------|--------|
| Coverage | 95.86% | 95.86% |
| Files    | 51     | 51     |
| Lines    | 8528   | 8528   |
| Branches | 1691   | 1691   |
| Hits     | 8175   | 8175   |
| Misses   | 209    | 209    |
| Partials | 144    | 144    |
Is there an easy way to craft a test for this without using the original restricted file?

https://corpora.tika.apache.org/base/docs/govdocs1/938/938702.pdf-tika-938702.pdf apparently has been deleted and cannot be used directly anymore.

Nearly minimal test file for this issue (thanks to @pubpub-zz for providing support with minimizing the embedded image): tt.pdf

in replacement:
## Version 5.0.0, 2024-09-15

This version drops support for Python 3.7 (not maintained since July 2023), PdfMerger (use PdfWriter instead) and AnnotationBuilder (use annotations instead).

### Deprecations (DEP)
- Remove the deprecated PdfMerger and AnnotationBuilder classes and other deprecations cleanup (#2813)
- Drop Python 3.7 support (#2793)

### New Features (ENH)
- Add capability to remove /Info from PDF (#2820)
- Add incremental capability to PdfWriter (#2811)
- Add UniGB-UTF16 encodings (#2819)
- Accept utf strings for metadata (#2802)
- Report PdfReadError instead of RecursionError (#2800)
- Compress PDF files merging identical objects (#2795)

### Bug Fixes (BUG)
- Fix sheared image (#2801)

### Robustness (ROB)
- Robustify .set_data() (#2821)
- Raise PdfReadError when missing /Root in trailer (#2808)
- Fix extract_text() issues on damaged PDFs (#2760)
- Handle images with empty data when processing an image from bytes (#2786)

### Developer Experience (DEV)
- Fix coverage uploads (#2832)
- Test against Python 3.13 (#2776)

[Full Changelog](4.3.1...5.0.0)
closes #2411