Last line of PDF text on a page not selectable in the new PDF.js #1283

Closed
mkdir-washington-edu opened this issue Oct 2, 2021 · 9 comments · Fixed by hypothesis/browser-extension#799
Labels: bug, Frontend (Issues that include frontend work)

Comments

@mkdir-washington-edu

Describe the bug
From user ticket: https://app.hubspot.com/contacts/6291320/ticket/583626067/

A user reports two PDFs where the last line of certain pages cannot be selected in the LMS app, although the same lines can be selected in Firefox.

PDF 1 aka "Jess" had a problem at the last line of page 74.

PDF 2 aka "Tatum" had a problem with the last line of page 50.

In Firefox 92:
image

image

In the LMS app:

https://hypothesis.instructure.com/courses/258/assignments/2032
image

https://hypothesis.instructure.com/courses/258/assignments/2037
image

To Reproduce
Steps to reproduce the behavior:

  1. Compare PDFs linked above to assignments linked above
  2. Note the difference in selectable text in the indicated pages.

Expected behavior
If possible it would be good to have the LMS app reflect what users can currently select using their web browsers.

@mkdir-washington-edu
Author

Unsure, but maybe related to https://github.com/hypothesis/support/issues/226.

@mkdir-washington-edu
Author

From the backlog meeting: the PDF.js version can be found in the Firefox dev tools. Get the PDF.js versions used by Firefox and by the LMS app.

Firefox 94 uses PDF.js 2.11.298
The LMS app uses PDF.js 2.11.249
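
The two version strings above differ only in the last component. As a minimal sketch of that comparison, assuming plain dotted numeric versions, the snippet below parses them into tuples and compares them. `parse_version` and `is_older` are hypothetical helper names for illustration, not part of PDF.js or Hypothesis.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as "2.11.298" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_older(candidate: str, reference: str) -> bool:
    """Return True if `candidate` is a strictly older version than `reference`."""
    return parse_version(candidate) < parse_version(reference)


# Versions reported in this comment.
firefox_pdfjs = "2.11.298"
lms_pdfjs = "2.11.249"

print(is_older(lms_pdfjs, firefox_pdfjs))  # True: the LMS app build is older
```

Comparing tuples of integers avoids the pitfalls of comparing the raw strings character by character.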

mattdricker transferred this issue from hypothesis/support-legacy Jan 14, 2022
chrisshaw changed the title from "last line of PDF text on a page not selectable in the new PDF.js" to "Last line of PDF text on a page not selectable in the new PDF.js" Feb 4, 2022
robertknight self-assigned this Feb 17, 2022
@robertknight
Member

robertknight commented Feb 17, 2022

Testing the "Jess" PDF from the issue description with PDF.js v2.13.153 in Chrome:

Jess 5 1 - PDFjs 2 13 153

In the cases where the selection doesn't cover the whole line, it still includes the correct text. However, the position of the text in the hidden text layer doesn't line up with the underlying content. This can be seen by changing the styling of the hidden text layer (here it has red text and a slightly translucent white background):

Jess text layer

For contrast, here is the text layer from the version of PDF.js that we currently ship in Chrome:

Jess text layer - PDFjs 2 11 - Chrome

Testing in Firefox 97 with the same PDF.js version:

Jess 5 1 - PDFjs 2 13 153 - Firefox

Here is the text layer in Firefox:

Jess text layer - Firefox

To test in Firefox I ran a local HTTP server with python3 -m http.server 8006 in the directory containing the PDF and accessed the PDF via http://localhost:8006/jess.pdf. This was necessary because the Firefox extension doesn't support accessing file:// URLs yet, at least not in development mode.
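
For reference, the same setup can be expressed with the standard library directly. This is a sketch under assumptions rather than what was actually run: `make_pdf_server` and `serve_pdfs` are hypothetical names, and unlike the plain `python3 -m http.server 8006` command, this version binds to 127.0.0.1 only.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_pdf_server(directory: str, port: int = 8006) -> ThreadingHTTPServer:
    """Build a server that exposes the files in `directory` on localhost only."""
    # `directory` is a keyword argument of SimpleHTTPRequestHandler (Python 3.7+).
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


def serve_pdfs(directory: str, port: int = 8006) -> None:
    """Serve `directory` until interrupted with Ctrl+C."""
    server = make_pdf_server(directory, port)
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

Calling `serve_pdfs(".")` from the directory containing the PDF would make it reachable at http://localhost:8006/jess.pdf, matching the URL used above.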

@robertknight
Member

It occurs to me it would be quite useful to add a hidden "visualize PDF text layer" feature to the client for debugging purposes. What would be the best way to enable it? Keyboard shortcut?

@robertknight
Member

Here is the example from the "Tatum" PDF in 2.13.153. The main issue is fixed, although there are now some lines where the text layer doesn't align correctly with the content underneath. As with the other PDF, the text layer does contain all of the content, but its horizontal sizing doesn't match.

Tatum text layer - PDFjs 2 13 153 - Chrome

@robertknight
Member

I started looking at this in the browser extension. See hypothesis/browser-extension#799. In the process I filed a client issue which we'll need to resolve first - hypothesis/client#4331.

@robertknight
Member

I'm currently working on a PDF.js update here: hypothesis/browser-extension#799.

@robertknight
Member

As described at mozilla/pdf.js#14538, the new PDF.js version will raise the minimum supported browser versions for the PDF viewer to releases that are ~3 years old. This conveniently aligns with our typical cut-off point for older browsers:

  • Chrome 73
  • Firefox ESR
  • Safari 12.1
