File size gets increased or unchanged #262

Open
amitdixit opened this issue Feb 17, 2024 · 4 comments

@amitdixit

I had a PDF file with 58 pages, 4.5 MB in size. I removed 50 pages from the file, but the remaining 8 pages still took about 4.4 MB. I used compress and other things, but the size remained unchanged.

use lopdf::dictionary;
use lopdf::{Document, Object, Stream};

fn main() {
    // First attempt: delete pages, then compress and save.
    let mut doc = Document::load("PATH_TO_58_page.pdf").unwrap();
    doc.compress();
    let count = doc.get_pages().len();

    // Collect page numbers 8..count for deletion.
    let page_numbers: Vec<u32> = (8..count as u32).collect();

    doc.delete_pages(&page_numbers);

    doc.compress();
    doc.save("PATH_TO_58_page_new1.pdf").unwrap();
    // Called after save, so it does not affect the file already written.
    doc.compress();

    // Second attempt: reload the original and re-compress the content
    // of every page that is not in the deletion list.
    let mut doc2 = Document::load("PATH_TO_58_page.pdf").unwrap();
    let pages = doc2.get_pages();

    for (page_number, page_id) in pages {
        if !page_numbers.contains(&page_number) {
            println!("{}", page_number);
            let content = doc2.get_page_content(page_id).unwrap();
            let mut stream = Stream::new(dictionary! {}, content.clone());
            stream.compress().unwrap();
            let stream_obj = Object::Stream(stream);

            // Store the new stream under the page's object id.
            doc2.objects.insert(page_id, stream_obj);
        }
    }

    doc2.compress();
    doc2.save("output.pdf").unwrap();
}

@Heinenen
Collaborator

This sounds realistic if the pages that you removed were only text and the remaining pages contain images. Then, lopdf is not at fault.

If that is not the case, a PDF file to reproduce this issue would be very helpful.

@4F2E4A2E

lopdf is awesome, thank you for this crate.
Using this example [1], I can confirm that the split pages have the same size as the original file. Just take any Word document, convert it to PDF, and split it with that code.

1: https://gitlab.com/andrew_ryan/pdf_cli/-/blob/main/src/main.rs?ref_type=heads#L107

@amitdixit
Author

Hi. All the pages I removed contain images.

@Heinenen
Collaborator

Thanks, I'll have a look at it, but probably not very soon.
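
Until the cause is confirmed, one possible explanation (an assumption, not something established in this thread) is that deleting a page only drops the page object and its entry in the page tree, while the image objects that page used stay in the file as orphans. The sketch below is a toy model in plain Rust, not lopdf code: `Obj`, `build_toy_document`, `delete_page`, and `prune_unreachable` are hypothetical names, and the sizes are made up. It only illustrates why the total size might barely change until unreachable objects are swept.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy stand-in for a PDF object: a payload size and outgoing references.
struct Obj {
    size: usize,
    refs: Vec<u32>,
}

/// Total bytes of all stored objects, whether reachable or not.
fn total_size(objects: &BTreeMap<u32, Obj>) -> usize {
    objects.values().map(|o| o.size).sum()
}

/// Hypothetical document: 1 = page tree, 2 and 3 = pages, 4 and 5 = images.
fn build_toy_document() -> BTreeMap<u32, Obj> {
    let mut objects = BTreeMap::new();
    objects.insert(1, Obj { size: 100, refs: vec![2, 3] });
    objects.insert(2, Obj { size: 200, refs: vec![4] });
    objects.insert(3, Obj { size: 200, refs: vec![5] });
    objects.insert(4, Obj { size: 2_000_000, refs: vec![] });
    objects.insert(5, Obj { size: 2_000_000, refs: vec![] });
    objects
}

/// Remove the page object and its entry in the page tree.
/// Anything the page referenced (such as its image) is left in place.
fn delete_page(objects: &mut BTreeMap<u32, Obj>, tree: u32, page: u32) {
    objects.remove(&page);
    if let Some(t) = objects.get_mut(&tree) {
        t.refs.retain(|r| *r != page);
    }
}

/// Remove every object not reachable from `root`; returns the removed ids.
fn prune_unreachable(objects: &mut BTreeMap<u32, Obj>, root: u32) -> Vec<u32> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root];
    while let Some(id) = stack.pop() {
        if !seen.insert(id) {
            continue; // already visited
        }
        if let Some(obj) = objects.get(&id) {
            stack.extend(obj.refs.iter().copied());
        }
    }
    let dead: Vec<u32> = objects
        .keys()
        .copied()
        .filter(|id| !seen.contains(id))
        .collect();
    for id in &dead {
        objects.remove(id);
    }
    dead
}

fn main() {
    let mut objects = build_toy_document();
    println!("before delete: {} bytes", total_size(&objects));

    delete_page(&mut objects, 1, 3);
    println!("after delete:  {} bytes", total_size(&objects));

    let removed = prune_unreachable(&mut objects, 1);
    println!("after prune:   {} bytes (removed {:?})", total_size(&objects), removed);
}
```

In this model the size drops by only the 200-byte page object after `delete_page`, and the orphaned image goes away only in the sweep. lopdf does provide `Document::prune_objects`, so calling it after `delete_pages` and before `save` may be worth trying; whether it fixes the file reported here has not been verified.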
