How to extract images from a PDF? #278

Open
liubaochuan opened this issue May 9, 2024 · 3 comments

Comments

@liubaochuan

How can I extract images from a PDF when get_page_images doesn't work?

@gamcoh

gamcoh commented Sep 26, 2024

Did you find a solution, @liubaochuan?

@Heinenen
Collaborator

The relevant part of the PDF spec is section 8.9, "Images". There seem to be two ways to embed an image: as an image XObject or as an inline image (8.9.7, "Inline Images").
Inline images are embedded directly in the content stream of the page. I'm pretty sure that lopdf does not find such images.

A sample PDF would really help to confirm my suspicion (or help fix a bug).
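
To make the distinction concrete: per the PDF spec, an image XObject is painted with the `Do` operator, while an inline image sits in the content stream itself between the `BI`, `ID` and `EI` operators. The following is a rough, standard-library-only sketch of how one might check whether an already decoded page content stream contains inline images. The names `InlineImageSpan`, `find_keyword` and `find_inline_images` are made up for this illustration and are not part of lopdf. The scan is a heuristic: it assumes the operators are separated by white space, and it can be fooled by binary image data or string literals that happen to contain `EI` or `BI`.

```rust
/// Byte range of one inline image in a decoded content stream.
/// `start` is the index of the `BI` operator, `end` is one past `EI`.
/// (Hypothetical helper type, not a lopdf type.)
#[derive(Debug, PartialEq)]
pub struct InlineImageSpan {
    pub start: usize,
    pub end: usize,
}

/// White-space characters as defined by the PDF syntax rules.
fn is_ws(b: u8) -> bool {
    matches!(b, 0x00 | 0x09 | 0x0A | 0x0C | 0x0D | 0x20)
}

/// Finds `kw` at or after `from` as a standalone token, i.e. with
/// white space (or the stream boundary) on both sides.
fn find_keyword(data: &[u8], kw: &[u8], from: usize) -> Option<usize> {
    let mut i = from;
    while i + kw.len() <= data.len() {
        let after = i + kw.len();
        let before_ok = i == 0 || is_ws(data[i - 1]);
        let after_ok = after == data.len() || is_ws(data[after]);
        if before_ok && after_ok && &data[i..after] == kw {
            return Some(i);
        }
        i += 1;
    }
    None
}

/// Heuristically locates `BI ... ID ... EI` sequences in a content stream.
pub fn find_inline_images(content: &[u8]) -> Vec<InlineImageSpan> {
    let mut spans = Vec::new();
    let mut pos = 0;
    while let Some(bi) = find_keyword(content, b"BI", pos) {
        // The image dictionary ends at `ID`; the image data ends at `EI`.
        let id = match find_keyword(content, b"ID", bi + 2) {
            Some(i) => i,
            None => break,
        };
        let ei = match find_keyword(content, b"EI", id + 2) {
            Some(i) => i,
            None => break,
        };
        spans.push(InlineImageSpan { start: bi, end: ei + 2 });
        pos = ei + 2;
    }
    spans
}
```

If this returns a non-empty list for a page where no image XObjects are found, the images are most likely inline, which would fit the suspicion above. A robust version would parse the content stream properly and use the image dictionary (width, height, bits per component, filter) to determine where the data ends instead of searching for `EI`.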

@Heinenen
Collaborator

Maybe related to #78.
