Find headings in pdfs #329

Closed
vekunz opened this issue Jan 23, 2020 · 1 comment

Comments

vekunz commented Jan 23, 2020

Hello,
I'm not sure whether this is even possible, so I'm asking. I'm using pdf-lib in a pipeline that first creates PDFs from Markdown and then merges some of them together. Currently, I have to create the table of contents manually and update it every time I change the source.
So my question is: would it be possible to add a feature that detects which page a specific heading is on?

vekunz changed the title from "Find headingin pdfs" to "Find headings in pdfs" on Jan 23, 2020
Hopding (Owner) commented Jan 26, 2020

Hello @vekunz!

I'm afraid this isn't really possible with pdf-lib today. It would require writing a lot of custom code to extract and parse text, which is not an easy or straightforward thing to do with PDFs (see #93 and #137).

To be clear, it is technically possible. Libraries like pdf.js that are designed for reading PDF documents (as opposed to creating or editing them) can do it. It's just that pdf-lib doesn't have the facilities to make this easy as of today.
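
For anyone landing here, below is a rough sketch of the reading-library route mentioned above. It is not part of pdf-lib. The interfaces are hand-written stand-ins for the small slice of a pdf.js document that the sketch relies on (`numPages`, `getPage`, `getTextContent`, and text items with a `str` field), and the function names `extractPageTexts` and `findHeadingPages` are made up for illustration. It assumes a heading can be found by a case-insensitive substring match on each page's text, which will not hold for every PDF, since extracted text can be split or reordered in unexpected ways.

```typescript
// Hand-written structural types covering only what this sketch uses.
// They are assumed to be compatible with a document loaded by pdf.js.
interface TextContentLike {
  items: Array<{ str?: string }>;
}
interface PageLike {
  getTextContent(): Promise<TextContentLike>;
}
interface DocumentLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>;
}

// Collapse runs of whitespace and ignore case so that line breaks
// inside the extracted text do not prevent a match.
const normalize = (s: string): string =>
  s.replace(/\s+/g, " ").trim().toLowerCase();

// Collect the text of every page (pages are numbered from 1).
async function extractPageTexts(doc: DocumentLike): Promise<string[]> {
  const texts: string[] = [];
  for (let n = 1; n <= doc.numPages; n++) {
    const page = await doc.getPage(n);
    const content = await page.getTextContent();
    // Items without a `str` field (e.g. marked-content markers) are skipped.
    texts.push(content.items.map((item) => item.str ?? "").join(" "));
  }
  return texts;
}

// Map each heading to the 1-based number of the first page whose text
// contains it, or null when no page matches.
function findHeadingPages(
  pageTexts: string[],
  headings: string[],
): Map<string, number | null> {
  const normalizedPages = pageTexts.map(normalize);
  const result = new Map<string, number | null>();
  for (const heading of headings) {
    const needle = normalize(heading);
    const index = normalizedPages.findIndex((text) => text.includes(needle));
    result.set(heading, index === -1 ? null : index + 1);
  }
  return result;
}
```

With pdf.js, the document would come from something like `await getDocument(bytes).promise`. In a merge pipeline, one option is to run this on each source PDF before merging and add the page offset of each document afterwards. A substring match can also hit body text that happens to repeat a heading, so treat the result as a starting point.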
