Hey would you convert a book for me
Crowderkyle1111 2 days ago
I can appreciate that, if you have no problem sacrificing the book. There are some very old books at my local seminary library that I want to digitize, though, so I'm gonna have to build my own rig.
mjt1517 4 days ago
You don't have to slice and dice your books. Look up DIY book scanning. You just need to photograph the pages and OCR them.
mjt1517 4 days ago
@mjt1517 I've seen many of the DIY scanning systems that people have made. They are quite cool. But the amount of work required to get a completely automated system is intense. Orders of magnitude more work than what I've done here. Another alternative is to spend hundreds of thousands of dollars for a professional non-destructive scanning system like Google uses. Also way out of reach. All non-destructive scanners take WAY longer than this method too. That is why I chose this way.
cayblood 4 days ago
Dude! You're barefoot! Isn't that dangerous?
churchbigbang 2 months ago
Thanks for the great tutorial. I have Adobe Acrobat and never realized I could digitize my old genealogy books with it. Such a great tool for searching information quickly.
mivey64 2 months ago
Now this, is useful. Thank you!
silllyniecy 3 months ago
Okay, between 1:58 and 2:00 I see two orbs moving in a horizontal direction, independently from one another and the environment around. Look for them in the video, they're just above the book, but far in the distance, close to the wall.
ontopofthe7thhill 3 months ago
@ontopofthe7thhill Wow, never thought I would see conspiracy theories about a dry tutorial like this! :-) Just kidding. Anyhow, I think those are probably artifacts from the video compression. They occur very close to the edge of the book, so it's very possible they are just some side-effects from the algorithm being used to reduce how much information must be sent for each frame of video. Compression codecs attempt to remove unnecessary visual info but there is always a tradeoff involved.
cayblood 3 months ago
Don't need to re-film. Just add the warning words at the video introduction.
Florenceho 3 months ago
@Florenceho Or better yet just read my comment on the video. It's really not a big deal.
cayblood 3 months ago
man can i borrow your scanner
crazyboy4793 4 months ago
I use a radial arm saw. Of course that is not something that everyone has, but it works great. I place a board on either side of the book and tighten bolts to hold it tightly in place. I countersink the bolts that go against the table. I use a plywood blade which is quite fine. The resulting cut is very smooth and works well through the scanner.
Harv5591 5 months ago
I take my books to Home Depot ;o) Saws are scary. I hear ABBYY Fine Reader does a MUCH more accurate job of recognizing characters (OCR) than Adobe.
sdje348 5 months ago
@sdje348 Home Depot digitizes books?
Myfrogsnameisbob 1 month ago
@Myfrogsnameisbob hey, i meant that home depot reps can "bandsaw" the binding of the text book, and have the pages separated (as I do not have any wood working equipment).
sdje348 1 month ago
How can I organize the pages if my scanner only does one side at a time?
jonnyu25 5 months ago
@jonnyu25 The software to my scanner automatically compiles the pages together. A flat-bed scanner usually doesn't have these conveniences. You'll have to figure something out for yourself, but this is the very reason I bought the document scanner--because it scans both sides and does it quickly, taking care of all of this for you. Otherwise you are looking at a process that takes 100 times longer. For me, the hundreds of hours saved by this scanner are worth paying 200 extra bucks for.
cayblood 5 months ago
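For anyone stuck with a single-sided scanner, one common workaround is to scan all the front sides, flip the whole stack, scan the back sides, and then interleave the two sets. The sketch below is only an illustration of that reordering, not something shown in the video: the function name `interleave_sides` and the page labels are hypothetical, and it assumes the back sides come out in reverse order because the stack was flipped.

```python
def interleave_sides(fronts, backs_reversed):
    """Merge front-side scans with back-side scans that were captured in reverse order.

    Assumption: the stack was flipped as a whole before the second pass,
    so the first back side scanned belongs to the last sheet.
    """
    if len(fronts) != len(backs_reversed):
        raise ValueError("front and back passes must contain the same number of sheets")
    backs = list(reversed(backs_reversed))  # restore sheet order for the back sides
    ordered = []
    for front, back in zip(fronts, backs):
        ordered.extend([front, back])  # front of sheet n, then back of sheet n
    return ordered


# Hypothetical labels for a three-sheet (six-page) stack.
fronts = ["p1", "p3", "p5"]
backs_reversed = ["p6", "p4", "p2"]
print(interleave_sides(fronts, backs_reversed))
```

The same ordering logic applies whether the items are file names or page objects in a PDF tool; it still means feeding every stack twice, which is the time cost the reply above is talking about.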
Dude, for about fifty cents, you could have taken your book to the local printer and asked them to cut off the binding. The cut would have been considerably cleaner.
gr33nman 5 months ago
@gr33nman I doubt any printer would be interested in doing this for hundreds of books, which is what I needed it for. It worked well enough.
cayblood 5 months ago
@cayblood Hundreds? Then they would probably discount it to 25 cents per book. Your time is worth money too. You would save hours of having to babysit your scanner, doing something else between scanner feedings. You could even ask them to rebind your books after you were done because the amount of binding cut off would be minimal and likely still easily rebound. What did you do with all the unbound books?
gr33nman 5 months ago
@gr33nman Good points. That might actually be a great idea if I ever have to undertake something like this again. We just put them in our paper recycling bin. But I would still have to babysit the scanner. Even with crisp cuts, the pages don't always go through perfectly.
cayblood 5 months ago
Consider yourself fortunate you didn't lop off yer foot!
PigsCanFly99 5 months ago
Spineless!! ^_^ :-P
unlokia 5 months ago
MY FRIEND I AM FROM GREECE AND YOU HELPED ME VERY MUCH WITH YOUR IDEA!!!! CONGRATULATIONS!!!!
ElfasLoucio 6 months ago
Shame, but thanks! (Better late than never with my reply, eh?)
R.
PS - We have big scanner/copiers at work. Perhaps....
theonlyrick 6 months ago
is it just me that noticed that he was cutting without any socks or shoes on?
SambaBentzen 6 months ago
Thanks so much for producing such a great video. A quick question because I'm thinking of undertaking this for quite a lot of books. Approximately how much space (megabytes) does 1 book, say 100 pages/500 pages/whatever PDF file take up both BEFORE and AFTER the OCR conversion. If a converted file takes up a lot of space, maybe I can just scan into a raw PDF and then perform OCR when I wish to really read more deeply, convert into epub/mobi format for a reader, or whatever. Thanks again!!!
Mises500 7 months ago
@Mises500 A PDF scan at the default quality settings, which are pretty good, takes around 100MB for 1,000 pages of text. After OCR the size tends not to change very much. Acrobat also has PDF optimization tools that can usually reduce your file size by 50% or more, so you can often end up with a PDF that is less than 50MB per 1,000 pages.
cayblood 7 months ago
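The figures in the reply above (around 100MB per 1,000 pages at default quality, and roughly half that after Acrobat's optimization) can be turned into a rough storage estimate. This is a back-of-the-envelope sketch only: the function name `estimate_pdf_size_mb` is made up for illustration, and it assumes file size scales linearly with page count, which real scans with photos or color pages will not follow exactly.

```python
MB_PER_1000_PAGES = 100    # figure quoted in the reply above, default quality settings
OPTIMIZED_FRACTION = 0.5   # "50% or more" reduction; 0.5 is the conservative end


def estimate_pdf_size_mb(pages, optimized=False):
    """Rough size estimate in MB, assuming size grows linearly with page count."""
    size = pages * MB_PER_1000_PAGES / 1000
    if optimized:
        size *= OPTIMIZED_FRACTION
    return size


for pages in (100, 500, 1000):
    raw = estimate_pdf_size_mb(pages)
    small = estimate_pdf_size_mb(pages, optimized=True)
    print(f"{pages} pages: about {raw:.0f} MB raw, about {small:.0f} MB optimized")
```

Since OCR reportedly changes the size very little, the same estimate should serve for both the raw and the OCR'd file.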
ok, thanks!
drewdaniel12 9 months ago
Once you get it into adobe acrobat format, are you able to highlight and copy and paste sentences into say Word?
Thanks,
Andrew
drewdaniel12 9 months ago
@drewdaniel12 Yes, once OCR has been performed on the PDF you can annotate and/or copy/paste the text, depending on what you are using for your PDF viewer.
cayblood 9 months ago
To save DRM - STOP printing now? #piraten
Boomel 10 months ago
I always wondered if this would work. The OCR was a nice touch. I always figured the best case would only be a PDF of the scans.
hosalabad 10 months ago
Lots of folks have been commenting about my bare feet and lack of safety glasses. All I can say is I'm guilty as charged. I was thinking about whether to re-film or not after doing it and I just thought, what the heck, why not leave it that way? I've done lots of woodworking work and most of the time I do use goggles and wear shoes, but this was just a quickie. Despite my lack of shoes, I was handling the saw quite carefully.
cayblood 10 months ago
@cayblood As far as eyecare is concerned, after doing it for a while you realize which kinds of cuts run the risk of sending material in your direction and which ones don't. But I agree that it is better to be safe than sorry. So, moral of the story is, don't follow my example :-)
cayblood 10 months ago
@cayblood Awesome tutorial. I just have 1 question: what is the best blade to use?
mkcvenezia 10 months ago
@mkcvenezia I just used the blade that came with my circular saw. Perhaps if you use one with smaller teeth you would get a smoother cut, but basically anything will work. The key to clean cuts is to clamp tightly.
cayblood 10 months ago
Print Shop Worker Here - Take this project to your local shop for better results.
1) Use a ream cutter to cut your book, most shops charge around $1.00 per cut.
2) Have it scanned in on a digital copier - way faster, better quality and only a few cents per scanned page.
3) You then have the option to rebind your book, I would recommend spiral bound.
watman26 10 months ago
@watman26 Let's see, that would have cost me thousands of dollars to digitize my library. No thanks. The whole idea was to get rid of the paper books anyway.
cayblood 10 months ago
seriously... no glasses, no shoes ?? mate, that's a circular saw. i assume you don't have kids coz wtf!?
aussievoter 10 months ago
Cool! The clever bit was the novel thought of "Hey... I could use a circular saw on my expensive book!"
What DPI do you scan at?
B&W? Grayscale?
Any other relevant settings?
I've only got a one-page-at-a-time flatbed scanner. Does anyone have any ideas of how to achieve the same result without spending hours and hours on scanning?
theonlyrick 10 months ago
@theonlyrick The scanner defaults to 600 dpi but can go as high as 1200. It auto-detects color and switches when necessary to capture color photos etc. As far as your flatbed is concerned, you're out of luck. You'll need to upgrade.
cayblood 10 months ago
Awesome idea. BTW, love the safety boots.
tyasm 10 months ago
Hi Carl, your video was just posted to Lifehacker, and I thought I recognized your name. I know you from PHS, this is Sam Anderson. Great idea. I have done this to magazines a lot; I can take off the binding without a saw. I saw a video of a special paper cutter that removes book bindings. But this would be a much less expensive method.
syand3 10 months ago
@syand3 Hey Sam, great to hear from you! You still living in Utah? We recently moved to Norway. In fact, part of the reason I was digitizing all my books was because the overseas move was so expensive.
cayblood 10 months ago
whats the file size of the scanned book ?
max60003 10 months ago
@max60003 Depends on how many pages. As an example, a 1000 page paperback book ended up being 128MB. But you can significantly shrink this down using Acrobat's optimize feature, which has the added benefit of making the pages turn faster in your e-reader.
cayblood 10 months ago
This video is a year and a half old, and it just got featured on lifehacker :P Congrats!
JarrodJS 10 months ago
@JarrodJS Cool! Feels good to know that I was able to contribute in some way.
cayblood 10 months ago
Watch my video on how to cut a book.
It's called "how to cut a book at home for scanning".
215810 1 year ago
How do you handle alternating pages?
radionowandthen 1 year ago
@radionowandthen The scanner scans both sides of the page. As you saw in the video, the binding must be cut off, so effectively all sides of all pages are scanned. If you wish to see a spread image that covers two adjacent pages, you can adjust your PDF reader to show you both pages side by side when you read them.
cayblood 1 year ago
Video keeps stuttering and stopping.
nurktwin1960 1 year ago
@nurktwin1960 This has nothing to do with this video. It's a Youtube problem. Please don't post such comments here, if you want somebody to do something about it. Instead, post feedback to the help forum below.
cayblood 1 year ago
@cayblood If it's a Youtube problem, why are all the other videos I watch on this subject not having any problems?
nurktwin1960 1 year ago
@nurktwin1960 Youtube hosts these video files. Every file they host can potentially be on a different server. Each server experiences a different level of load and demand for its files. Apparently the server that this file is being hosted on is not able to serve out its data to you at a speed that prevents it from pausing. This is a problem that Youtube needs to figure out. The video itself has no pauses or jitters in it, you're just not receiving the data fast enough.
cayblood 1 year ago
@cayblood Thanks. That's a better explanation than I got from the help forum.
nurktwin1960 1 year ago
Excellent !!!
Thanks for sharing this great video. Very helpful.
Regards.
adeirton 1 year ago
what type of saw do you need so that the pages don't get all jagged
brandon9555 1 year ago
@brandon9555 I think any brand of circular saw will be fine. The most important thing to avoid jagged edges is to clamp the pages very tightly.
cayblood 1 year ago
Thank you for posting this, I've learned much and have spent the last hour learning about this fantastic scanner. Why can't you put a piece of 3 sided plastic stuck on the top of the case in front of the feeder to hold 500 pages? Wouldn't a band saw make a smoother cut?
ssssaaafff 1 year ago
@ssssaaafff The scanner has a slot that you feed the paper into that only fits about 100 pages at a time. The reason it can't hold more has nothing to do with the capacity in the outfeed tray, but the way the infeed area is designed. The smoothness of the cut is not really affected by the thickness of the blade, but by how well you have clamped the pages. Loose clamping gives you a jagged cut--tight clamping gives a very clean cut.
cayblood 1 year ago
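The 100-page infeed limit mentioned above determines how many times a book has to be split and fed. A small sketch of that arithmetic follows; it assumes "about 100 pages" means about 100 physical sheets and that each sheet carries two printed pages, neither of which is confirmed in the thread, and `feeder_loads` is a hypothetical helper name.

```python
import math

FEEDER_CAPACITY_SHEETS = 100  # "about 100 pages at a time"; read here as sheets (assumption)


def feeder_loads(printed_pages, capacity=FEEDER_CAPACITY_SHEETS):
    """Number of separate stacks needed to feed a book through the scanner."""
    sheets = math.ceil(printed_pages / 2)  # two printed pages per physical sheet
    return math.ceil(sheets / capacity)


print(feeder_loads(1000))  # 500 sheets at 100 per load
```

If the limit actually refers to printed pages rather than sheets, dropping the division by two gives the count instead.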
cool video carl =)
nickleus1977 1 year ago
Hi,
Thank you very much this is great.
Thanks. God Bless.
Aaron.
aaroncavanaugh2 1 year ago
your scanner cost more than my computer
brandon9555 1 year ago
Good men!!!
How about A3 Size Scanning this fast?
Well done for paperless!!
TY
Sergio
MRMANFE 1 year ago
It's a ScanSnap S1500M. It works fine with just about everything, as long as it's no larger than an 8.5x11 sheet of paper.
cayblood 1 year ago
What model ScanSnap is this? Does it work well with the pictures and diagrams?
joexner52 1 year ago
Sweet action!
timcharper1 2 years ago