ROB: Raise PdfReadError when missing /Root in trailer. #2808

Merged (2 commits) on Aug 23, 2024

@BertrandBordage (Contributor) commented Aug 22, 2024

Fixes #2806.

Running the code from #2806 against the same PDF now raises a PdfReadError with an explicit message:

>>> list(reader.pages)
Object 493 0 not defined.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/pypdf/pypdf/_page.py", line 2356, in __len__
    return self.length_function()
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/pypdf/pypdf/_doc_common.py", line 352, in get_num_pages
    self._flatten()
  File "/pypdf/pypdf/_doc_common.py", line 1100, in _flatten
    catalog = self.root_object
              ^^^^^^^^^^^^^^^^
  File "/pypdf/pypdf/_reader.py", line 195, in root_object
    raise PdfReadError('Cannot find "/Root" key in trailer')
pypdf.errors.PdfReadError: Cannot find "/Root" key in trailer
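
For readers who want the pattern in isolation, here is a minimal sketch of the idea behind this change. It is not pypdf's actual implementation. `TinyReader`, `count_pages`, and the local `PdfReadError` class are hypothetical stand-ins so the example runs without pypdf installed, and the trailer is assumed to be an already-parsed plain dictionary.

```python
class PdfReadError(Exception):
    """Stand-in for pypdf.errors.PdfReadError (assumption: defined locally for the sketch)."""


class TinyReader:
    """Hypothetical reader that holds an already-parsed trailer dictionary."""

    def __init__(self, trailer):
        self.trailer = trailer

    @property
    def root_object(self):
        # Fail early with a domain-specific error when "/Root" is absent,
        # instead of returning None and failing later with AttributeError.
        root = self.trailer.get("/Root")
        if root is None:
            raise PdfReadError('Cannot find "/Root" key in trailer')
        return root


def count_pages(reader):
    """Return the page count, or None when the document cannot be read."""
    try:
        return reader.root_object["/Pages"]["/Count"]
    except PdfReadError:
        # Callers can handle one well-defined exception type for damaged files.
        return None
```

With this shape, calling code only needs to catch one exception type for damaged files, for example `count_pages(TinyReader({}))` returns `None` rather than propagating an unrelated `AttributeError`.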

@BertrandBordage changed the title from "[Robustness] Raise PdfReadError when missing /Root in trailer." to "ROB: Raise PdfReadError when missing /Root in trailer." on Aug 22, 2024
@BertrandBordage marked this pull request as draft on August 22, 2024, 23:16

codecov bot commented Aug 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.86%. Comparing base (d2d520b) to head (9e73fb9).
Report is 78 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2808   +/-   ##
=======================================
  Coverage   95.85%   95.86%           
=======================================
  Files          51       51           
  Lines        8573     8576    +3     
  Branches     1695     1696    +1     
=======================================
+ Hits         8218     8221    +3     
  Misses        212      212           
  Partials      143      143           

@BertrandBordage marked this pull request as ready for review on August 22, 2024, 23:31
@stefan6419846 merged commit 9f08cd0 into py-pdf:main on Aug 23, 2024
16 checks passed
@pubpub-zz mentioned this pull request on Sep 15, 2024
@pubpub-zz added a commit that referenced this pull request on Sep 17, 2024:
## Version 5.0.0, 2024-09-15

This version drops support for Python 3.7 (not maintained since July 2023), PdfMerger (use PdfWriter instead) and AnnotationBuilder (use annotations instead).


### Deprecations (DEP)
- Remove the deprecated PdfMerger and AnnotationBuilder classes and other deprecations cleanup (#2813)
- Drop Python 3.7 support (#2793)

### New Features (ENH)
- Add capability to remove /Info from PDF (#2820)
- Add incremental capability to PdfWriter (#2811)
- Add UniGB-UTF16 encodings (#2819)
- Accept utf strings for metadata (#2802)
- Report PdfReadError instead of RecursionError (#2800)
- Compress PDF files merging identical objects (#2795)

### Bug Fixes (BUG)
- Fix sheared image (#2801)

### Robustness (ROB)
- Robustify .set_data() (#2821)
- Raise PdfReadError when missing /Root in trailer (#2808)
- Fix extract_text() issues on damaged PDFs (#2760)
- Handle images with empty data when processing an image from bytes (#2786)

### Developer Experience (DEV)
- Fix coverage uploads (#2832)
- Test against Python 3.13 (#2776)


[Full Changelog](4.3.1...5.0.0)

Successfully merging this pull request may close these issues.

Missing root object raising: 'NoneType' object has no attribute 'get_object' (different from #1295 & #1689)
2 participants