Thursday, December 2, 2010

PDF me, please!

A professor came into the lab this morning and asked if I could help her. She's going to a third-world country for a semester and doesn't want to drag a bunch of textbooks with her, so she asked if we could turn the one (or perhaps one of the ones...) she wants to teach with into a PDF. It's around 600 pages. Thanks to our high-speed, duplexing color scanner, the entire book was scanned into the computer in around 30 minutes.
Running optical character recognition (OCR) on the thing took quite a bit longer, and exporting it as a PDF took a while, too.
Then I heard from my boss that, according to their discussion, only a few chapters were to be converted and given to her. Unfortunately, Acrobat 9 didn't want to cooperate, so I had my student worker use OS X Preview to separate 'em (which seems to have removed the OCR. Hmph.), then had another worker batch-OCR the individual chapters, which seems to have worked. Interestingly enough, my OCR'ed PDF was around 250 MB, while the total size of the individual chapters was around 2.5 GB, roughly ten times larger. Talk about an explosion!
