Add the ability to copy a pdf #986
@Hopding
@mohamedsalem401 thanks for working on this! It's intriguing to me that building a copy seems to fix so many different issues. I wonder why that is? 🤔
This is, of course, a workaround for the problems rather than a fix for the root cause, because it requires users to know when they need to use .copy(). Also, .copy() won't always copy all information over to the new doc (acroforms, etc.).
Regardless, I'm happy to have a feature like this built into the lib. I've requested a slight refactor. Please also be sure to update the unit tests and integration tests to exercise this logic. 🙂
src/api/PDFDocument.ts
Outdated
if (returnCopy) {
  const pdfDoc = new PDFDocument(context, ignoreEncryption, updateMetadata);
  const pdfCopy = await PDFDocument.create();
  const contentPages = await pdfCopy.copyPages(
    pdfDoc,
    pdfDoc.getPageIndices(),
  );

  for (const page of contentPages) {
    pdfCopy.addPage(page);
  }
  if (pdfDoc.getAuthor() !== undefined) {
    pdfCopy.setAuthor(pdfDoc.getAuthor()!);
  }
  if (pdfDoc.getCreationDate() !== undefined) {
    pdfCopy.setCreationDate(pdfDoc.getCreationDate()!);
  }
  if (pdfDoc.getCreator() !== undefined) {
    pdfCopy.setCreator(pdfDoc.getCreator()!);
  }
  if (pdfDoc.getModificationDate() !== undefined) {
    pdfCopy.setModificationDate(pdfDoc.getModificationDate()!);
  }
  if (pdfDoc.getProducer() !== undefined) {
    pdfCopy.setProducer(pdfDoc.getProducer()!);
  }
  if (pdfDoc.getSubject() !== undefined) {
    pdfCopy.setSubject(pdfDoc.getSubject()!);
  }
  if (pdfDoc.getTitle() !== undefined) pdfCopy.setTitle(pdfDoc.getTitle()!);
  pdfCopy.defaultWordBreaks = pdfDoc.defaultWordBreaks;

  return pdfCopy;
}
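As a side note on the snippet above: each metadata field is read twice and needs a non-null assertion (!) on the second read. One possible way to tighten this, sketched here with hypothetical names (copyIfDefined, copyMetadata, and the FakeDoc stand-in are not part of pdf-lib or of this PR), is a small helper that reads once and writes only when a value is present:

```typescript
// Read a value once and forward it only when it is defined.
// Hypothetical helper, not part of pdf-lib.
function copyIfDefined<V>(
  read: () => V | undefined,
  write: (value: V) => void,
): void {
  const value = read();
  if (value !== undefined) write(value);
}

// Stand-in for a document exposing getter/setter pairs like the ones in the
// diff (getAuthor/setAuthor, getTitle/setTitle). Assumption for illustration.
class FakeDoc {
  private author?: string;
  private title?: string;
  getAuthor(): string | undefined { return this.author; }
  setAuthor(value: string): void { this.author = value; }
  getTitle(): string | undefined { return this.title; }
  setTitle(value: string): void { this.title = value; }
}

// Copy every defined metadata field from src to dst; undefined fields in src
// leave dst untouched.
function copyMetadata(src: FakeDoc, dst: FakeDoc): void {
  copyIfDefined(() => src.getAuthor(), (v) => dst.setAuthor(v));
  copyIfDefined(() => src.getTitle(), (v) => dst.setTitle(v));
}
```

The same pattern would extend to the remaining fields (creator, producer, subject, and the two dates), one line per field.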
These changes are related to, but still distinct from, loading a document. So I think this would fit better as a .copy() method on PDFDocument. Then, instead of:
PDFDocument.load(..., { returnCopy: true })
we could do:
PDFDocument.load(...).copy()
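Since PDFDocument.load returns a promise, the chained form needs the load to resolve before .copy() can be called. A minimal sketch of that call order, assuming .copy() is itself async (the diff above awaits inside the copy logic); Copyable and loadCopy are hypothetical names used only for illustration:

```typescript
// Structural stand-in for any document type with an async copy() method.
// With pdf-lib, T would be PDFDocument; that mapping is an assumption here.
interface Copyable<T> {
  copy(): Promise<T>;
}

// Load a document, then return a fresh copy of it.
// `load` is a caller-supplied loader, e.g. () => PDFDocument.load(bytes).
async function loadCopy<T extends Copyable<T>>(
  load: () => Promise<T>,
): Promise<T> {
  const original = await load(); // resolve the load promise first
  return original.copy(); // then ask the loaded document for its copy
}
```

Written inline, this is equivalent to (await PDFDocument.load(...)).copy().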
Once you've done this, be sure to add a proper doc comment to the .copy() method.
add .copy() method that returns a copy of the original pdfDocument
I have made the changes you asked for 😃
Looks great! Thanks again @mohamedsalem401! This feature will go out in the next release 🎉
Thanks for accepting my PR @Hopding I have some questions if you have the time:
Looking forward to making more contributions to this wonderful repository 😃
@mohamedsalem401 Thank you so much for working on this! I had originally set a bounty on #951 and I'd love to discuss it with you. Is it possible for you to email me at emil@rechat.com?
closes #951
Some badly-made PDF files can be automatically repaired by Adobe Reader, but can't be edited by pdf-lib.
Sometimes, working with a copy of these corrupted PDFs can fix those issues.
Examples:
1- #951
https://github.com/Hopding/pdf-lib/files/6912122/test.pdf
opening and saving the original file:
opening and saving with PDFDocument.load(......, {returnCopy: true})
2- #140
Having the ability to load a copy directly allows you to remove all of the deleted pages' objects from the document (thus reducing file size):
A file with deleted pages that weren't removed from the document
the exact same copy but after loading a copy and deleting all of the deleted pages' objects from the document.
3- Other undamaged PDFs can render better after being copied with pdf-lib first:
copy
original