Find headings in pdfs #329

vekunz · 2020-01-23T12:00:08Z

Hello,
I'm not sure whether this is even possible, so I'm asking. I'm using pdf-lib in a pipeline where I first create PDFs from Markdown and then merge several of them together. Currently, I have to create the table of contents manually and update it every time I change the source.
So my question is: would it be possible to add a feature that detects which page a specific heading is on?

Hopding · 2020-01-26T21:58:11Z

Hello @vekunz!

I'm afraid this isn't really possible with pdf-lib today. It would require a lot of custom code to extract and parse text, which is not an easy or straightforward thing to do with PDFs (see #93 and #137).

To be clear, it is technically possible. Libraries like pdf.js that are designed for reading PDF documents (as opposed to creating or editing them) can do it. pdf-lib just doesn't have the facilities to make this easy as of today.
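
As a rough illustration of the reader-library route, here is a minimal sketch of the lookup step only. It assumes the text of each page has already been extracted into arrays of strings by a reader such as pdf.js; that extraction is not shown. The names `PageText` and `findHeadingPages` are hypothetical and are not part of pdf-lib or pdf.js.

```typescript
// Hypothetical shape: one entry per page, each holding the text runs
// that a reader library reported for that page, in reading order.
type PageText = string[];

// Collapse whitespace and case so "Getting  Started" matches "getting started".
function normalize(text: string): string {
  return text.replace(/\s+/g, " ").trim().toLowerCase();
}

// Returns a map from each heading to the 1-based number of the first page
// whose text contains it. Headings that never appear are left out of the map.
function findHeadingPages(
  pages: PageText[],
  headings: string[]
): Map<string, number> {
  const result = new Map<string, number>();
  // Join the runs of each page into one normalized string for searching.
  const pageStrings = pages.map((runs) => normalize(runs.join(" ")));
  for (const heading of headings) {
    const needle = normalize(heading);
    if (needle === "") continue; // skip empty headings
    const index = pageStrings.findIndex((page) => page.includes(needle));
    if (index !== -1) result.set(heading, index + 1);
  }
  return result;
}
```

This is plain substring matching, so it cannot tell a heading from body text that happens to contain the same words, and an existing table-of-contents page would match first. Telling real headings apart would need font size or position information from the reader library.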

vekunz changed the title ~~Find headingin pdfs~~ Find headings in pdfs Jan 23, 2020

Hopding closed this as completed Jan 26, 2020

bcholmes mentioned this issue Sep 4, 2021

can I edit text in pdf with PDF-lib? #950

Closed

Hopding mentioned this issue Sep 23, 2021

Is there any way to extract text from a PDF using pdf-lib library i.e. using x , y coordinates #892

Closed
