This is a collection of bash/Linux commands useful for editing a scanned book. You start with a scanned PDF that has two book pages on each PDF page and you end up with a single-page, cropped PDF of reduced size.
This procedure assumes that you are familiar with the Linux command line. If you are not, there are many tutorials out there. You may have to install the following tools:
poppler-utils
imagemagick
ghostscript
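Before starting, it can help to check which of the commands used in this procedure are already on your PATH. The following is only a sketch: the function name missing_tools is made up for this example, and the mapping from commands to packages in the comment is the usual one on Debian-like systems and may differ on yours.

```shell
# missing_tools NAME...: print each name that is not found on PATH
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands used in this procedure and where they usually come from:
# pdfimages, pdfunite (poppler-utils), convert (imagemagick), gs (ghostscript)
missing_tools pdfimages pdfunite convert gs rename
```

If nothing is printed, all five commands were found.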
The procedure is the following. We extract the images from the PDF file, then we crop the right pages and the left pages, convert the images back to PDF files and join them together. Note: to get a good result you must be very careful to keep the book aligned in the same position while scanning, otherwise cropping may cut off part of the text on some pages.
mkdir img
pdfimages input.pdf img/page
mkdir -p right/img
mkdir -p left/img
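The crop loops in this procedure assume that pdfimages produced .pbm files. That is typically the case for black-and-white scans; grayscale and colour scans usually come out as .pgm or .ppm instead. A small sketch to see which extensions you actually got (list_extensions is a hypothetical helper name, not part of any tool):

```shell
# list_extensions DIR: print each file extension found in DIR with its count
list_extensions() {
    local dir=$1 f
    for f in "$dir"/*; do
        [ -e "$f" ] || continue          # the glob matched nothing
        printf '%s\n' "${f##*.}"         # text after the last dot
    done | sort | uniq -c
}

list_extensions img
```

If the output shows something other than pbm, replace .pbm with that extension in the commands that follow.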
Open one of the images in img/ with GIMP (or another image program) and use the rectangular selection tool to select the area you want to become the right page. Write down the position X0,Y0 and the size WIDTH,HEIGHT of the selection.
for a in img/*.pbm; do convert -crop WIDTHxHEIGHT+X0+Y0 "$a" "right/$a"; done
Remember to replace the values with the ones you wrote down.
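Now make a second selection for the left page, write down its position and size, and crop again with those values:
for a in img/*.pbm; do convert -crop WIDTHxHEIGHT+X0+Y0 "$a" "left/$a"; done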
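The -crop argument in the two loops is an ImageMagick geometry string: size first, then the offset of the top-left corner. As a worked example, the sketch below builds that string from the four numbers written down in GIMP. The helper name crop_geometry and the sample values are made up for illustration; use your own measurements.

```shell
# crop_geometry WIDTH HEIGHT X0 Y0: print an ImageMagick geometry string
crop_geometry() {
    printf '%dx%d+%d+%d\n' "$1" "$2" "$3" "$4"
}

# Hypothetical selection for a right-hand page: 1200x1700 pixels,
# with its top-left corner at X0=1260, Y0=30.
right=$(crop_geometry 1200 1700 1260 30)
echo "$right"    # prints 1200x1700+1260+30

# It could then be used in the crop loop like this:
#   for a in img/*.pbm; do convert -crop "$right" "$a" "right/$a"; done
```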
rename 's/\.pbm$/b.pbm/' right/img/*
rename 's/\.pbm$/a.pbm/' left/img/*
mkdir pages; cp left/img/* pages/; cp right/img/* pages/;
for a in pages/*; do convert "$a" "$a.pdf"; done
pdfunite pages/*.pdf output.pdf
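The rename commands above use the Perl-style rename syntax found on Debian and Ubuntu. On some other distributions rename is a different program with a different syntax. A possible alternative that needs only mv is sketched here; add_suffix is a hypothetical helper name, and the .pbm extension is the same assumption as in the rest of the procedure.

```shell
# add_suffix DIR SUFFIX: rename every DIR/NAME.pbm to DIR/NAMESUFFIX.pbm
add_suffix() {
    local dir=$1 suffix=$2 f
    for f in "$dir"/*.pbm; do
        [ -e "$f" ] || continue          # the glob matched nothing
        mv -- "$f" "${f%.pbm}${suffix}.pbm"
    done
}

# Used in place of the two rename commands (run each only once):
#   add_suffix left/img a
#   add_suffix right/img b
```

The a and b suffixes matter because pdfunite receives the files in the order the shell expands pages/*.pdf, which is sorted by name, so each left page (a) comes just before the right page (b) of the same scan.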
Check that your final PDF is OK; then you can remove all the temporary directories that were created.
rm -R right/ left/ img/ pages/
Now we have the final cropped PDF. It will probably be a high-quality, large file. We can reduce its size:
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=setting -sOutputFile=output-final.pdf output.pdf
Replace "output-final.pdf" with the name you want and "setting" with the desired quality level. The quality level settings are:
/screen: the lowest resolution and lowest file size, but fine for viewing on a screen;
/ebook: a mid-point in resolution and file size;
/printer and /prepress: high-quality settings used for printing PDFs.
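To avoid typing the long gs command each time, it could be wrapped in a small function like the sketch below. The name compress_pdf and its argument order are inventions for this example; the gs options are exactly the ones shown above. The function takes the setting without the leading slash and refuses anything that is not one of the four levels listed.

```shell
# compress_pdf SETTING INPUT OUTPUT
# SETTING is one of: screen, ebook, printer, prepress (no leading slash)
compress_pdf() {
    local setting=$1 input=$2 output=$3
    case $setting in
        screen|ebook|printer|prepress) ;;
        *) echo "unknown setting: $setting" >&2; return 1 ;;
    esac
    gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
       -dPDFSETTINGS="/$setting" -sOutputFile="$output" "$input"
}

# Example:
#   compress_pdf ebook output.pdf output-final.pdf
```

Trying /ebook first and comparing the result with the original is a reasonable starting point for a scanned book that will mostly be read on screen.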
Improvements to this procedure are welcome.