Full PDF doc loaded before single page could be rendered #9537

mustafa0x · 2018-03-06T18:57:59Z

I understood from the FAQ that pdf.js only downloads what it needs. However, using the sample code from the docs, I noticed in the Chrome DevTools network panel that the entire document was loaded first, even though only a single page was drawn.

Some things I tried, to no avail:

Ran it on my server.
Ran it on another PDF file.
Ran qpdf --linearize on that PDF file.
Tried HEAD (the current master), which gave the same result, except that the document was downloaded in 64 KB chunks.

My use case is displaying specific pages from very long PDF files (1000+ pages). If pdf.js is not the right tool please let me know.

Related issues: #1108, #2719, #1923, #1375, #2470, #3461, #6104, #8897.


timvandermeij · 2018-03-06T20:53:18Z

The default range chunk size is defined in pdf.js/src/display/api.js (line 37 in c33bf80):

var DEFAULT_RANGE_CHUNK_SIZE = 65536; // 2^16 = 65536

Only PDF files larger than that will use range (chunked) loading. The server must support range requests and the PDF file must be optimized for web (linearized). If that is the case, then range loading should work just fine for your use case (just try it out with some of your own PDF files to make sure).
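A rough pre-flight check for the two conditions above that can be read from response headers might look like the following sketch. The helper name `looksRangeCapable` and the plain-object header format are assumptions made for illustration. This is not PDF.js's own detection logic, which may apply additional checks, and it cannot tell whether the file is linearized.

```javascript
// Mirrors the constant quoted above from src/display/api.js.
const DEFAULT_RANGE_CHUNK_SIZE = 65536; // 2^16 bytes

/**
 * Hypothetical helper (not part of PDF.js): guesses from response headers
 * whether chunked (range) loading is plausible for a file.
 * `headers` is assumed to be a plain object keyed by lower-cased header name.
 */
function looksRangeCapable(headers, chunkSize = DEFAULT_RANGE_CHUNK_SIZE) {
  // The server has to advertise byte-range support.
  const acceptsRanges =
    (headers["accept-ranges"] || "").toLowerCase() === "bytes";
  // The file has to be larger than one chunk for range loading to be used.
  const length = Number.parseInt(headers["content-length"], 10);
  return acceptsRanges && Number.isInteger(length) && length > chunkSize;
}
```

The headers could be collected with an HTTP HEAD request to the PDF URL. A `false` result suggests the whole file will be fetched in one response, while a `true` result only means range loading is possible, not that it will be used.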

mustafa0x · 2018-03-07T05:35:18Z

Thanks @timvandermeij!

See: http://159.89.108.117/pdfjs-1.10.88/web/load-single.html. I'm using the PDF spec: http://159.89.108.117/PDF32000_2008.pdf.

Using HEAD, the first page was rendered after ~4.5 MB had been downloaded, but it then continued to download the entire file (22.5 MB). See: http://159.89.108.117/pdf.js-master/examples/helloworld/load-single.html

mustafa0x · 2018-03-07T19:39:28Z

I ran pdfinfo on PDF32000_2008.pdf, and it reported "Optimized: no", so I downloaded a PDF that is optimized (http://159.89.108.117/annual_report_2009.pdf), but the results didn't change.

mustafa0x · 2018-03-13T18:43:56Z

@timvandermeij Mind giving some input on this?

timvandermeij · 2018-03-13T21:24:06Z

Your example does use range requests (indicated by response codes 206 in the network tab of the console), so that looks fine to me. I think you may want to disable auto-fetching; see:

pdf.js/src/display/api.js, line 175 in e0fb18a, documents the disableAutoFetch parameter (boolean, optional):

Disable pre-fetching of PDF file data. When range requests are enabled PDF.js will automatically keep fetching more data even if it isn't needed to display the current page. The default value is false. NOTE: It is also necessary to disable streaming, see above, in order for disabling of pre-fetching to work correctly.
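Following the note above, a minimal sketch of the options would set both flags together. It assumes a PDF.js build in which `disableAutoFetch` and `disableStream` are `getDocument` parameters, as in the api.js documentation linked above. The helper name `buildOnDemandOptions` is hypothetical and only serves to keep the example self-contained.

```javascript
/**
 * Hypothetical helper: builds getDocument parameters that ask PDF.js to
 * fetch only the data needed for the pages that are actually requested.
 */
function buildOnDemandOptions(url) {
  return {
    url,
    // Do not keep fetching the rest of the file in the background.
    disableAutoFetch: true,
    // Per the note above, streaming must also be off for this to take effect.
    disableStream: true,
  };
}

// Usage, assuming pdfjsLib is already loaded on the page:
// pdfjsLib.getDocument(buildOnDemandOptions("/files/large.pdf"));
```

With these set, further 206 responses in the network panel should appear only as pages are requested, provided the server and file meet the range-loading requirements discussed earlier.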

pravid · 2019-01-16T10:44:11Z

Will this work if the PDF is stored on another domain?
My PDF is stored on a cloud server, and all the headers are set properly, but it loads the PDF completely before rendering. I'm calling getDocument in the usual way:
pdfjsLib.getDocument({ url: DEFAULT_URL, password: "abc", disableStream: false, disableAutoFetch: true, })

Am I missing any parameters here? How can I specify 'range'?

Hao-Wu · 2019-07-04T10:54:55Z

Hi @pravid, did you get it working with the PDF stored on another domain?

pravid · 2019-07-04T13:54:20Z

@Hao-Wu, yes, in a way. I used a CORS proxy to work around the issue.
So, when setting DEFAULT_URL, I had to prefix the proxy path to it.
See this link: https://github.com/Rob--W/cors-anywhere (Node.js proxy server)
For more details on CORS: https://humanwhocodes.com/blog/2010/05/25/cross-domain-ajax-with-cross-origin-resource-sharing/

Hope this helps.

shivamsharmabtp · 2019-12-06T09:35:13Z

Hi @pravid, I am loading a PDF from a different origin using the URL /pdfjs/web/viewer.html?file= . It works fine, but it loads the entire PDF before rendering it. Is it possible to start rendering before the download completes? Thank you.

pravid · 2019-12-06T11:13:58Z

Yes, check the answer above for details: #9537 (comment)
I used a CORS proxy to solve this loading issue.
You need to set up a Node.js proxy server (e.g. https://myproxyserver.com) and,
when requesting your PDF, set the file path as:
var pdfPath = "https://myproxyserver.com/" + "https://mypdffilepath.com/my.pdf";

Hope this helps.
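The prefixing step can be wrapped in a small function, sketched below. The name `withCorsProxy` is hypothetical, and both URLs are the placeholders used above, not real servers. It assumes a proxy that accepts the target URL appended to its own path.

```javascript
/**
 * Hypothetical helper: prefixes a PDF URL with a CORS proxy base URL,
 * inserting exactly one "/" between the two parts.
 */
function withCorsProxy(proxyBase, pdfUrl) {
  const base = proxyBase.endsWith("/") ? proxyBase : proxyBase + "/";
  return base + pdfUrl;
}

// Placeholder URLs from the comment above.
const pdfPath = withCorsProxy(
  "https://myproxyserver.com",
  "https://mypdffilepath.com/my.pdf"
);
```

The resulting `pdfPath` can then be used wherever the direct PDF URL was used before.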

shivamsharmabtp · 2019-12-06T13:08:03Z

Hi @pravid, sorry, I didn't quite follow. For example, you can look at this link to my website. It loads a PDF from a GCP bucket and renders it perfectly. The only issue is that it loads the complete PDF before rendering it; for large PDFs this takes some time, and the reader stays blank for minutes. As far as I can tell, you are describing how to render a PDF from a source on another domain, but that is not my issue: I have already resolved it by commenting out the line that throws the error. As I understand it, I should use the https://myproxyserver.com/https://storage.googleapis.com/... URL instead of https://storage.googleapis.com/..., which already works fine. I don't understand how this will help or how to implement it. I hope my question is clear. Thank you.

pravid · 2019-12-06T15:45:10Z

Have you set up your Node.js server?
Please go through the link to understand how CORS works; it also provides sample files to try: https://github.com/Rob--W/cors-anywhere (Node.js proxy server)
'https://myproxyserver.com' will be the URL of the server you set up.
Also note that your PDF should be optimized as well.

timvandermeij closed this as completed Mar 6, 2018
