HP Labs started developing an OCR engine back in 1985 and worked on it for about 10 years before abandoning it. It was eventually released as open source, with Google picking up its development.
It’s called Tesseract, and it turns out it’s really easy to install on a Linux server (if you have root privileges) and to put together a simple CGI script that serves it over the web.
Here’s how you do it:
1. Go to the tesseract-ocr readme and find Installation Notes – Tesseract 3.01. Follow the instructions for Linux. There are many dependencies and they take quite some time to install, so get yourself a beer or some nuts for company.
I had some issues with TESSDATA_PREFIX, and since I only needed English, I moved the eng.traineddata file from the tessdata folder to /usr/local/bin/tesseract/ (where Tesseract is installed).
2. If you don’t have Apache installed yet, proceed to Compiling and Installing – Apache HTTP Server.
3. Create a file in the cgi-bin directory of your Apache installation, name it ocr.cgi, make it executable (chmod +x ocr.cgi) and populate it with the following code:
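#!/bin/sh
# CGI script: downloads the image at the URL passed as the first argument
# and prints the text Tesseract recognizes in it.
TES=/usr/local/bin/tesseract
OCR=/home/econofy/ocr
FILE=$1
# Take the extension from the URL so the saved file keeps its image type.
EXT=${FILE##*.}
echo "Content-type: text/plain"
echo ""
if [ -x "$TES" ]
then
# Stop early if the image cannot be downloaded.
wget -O "$OCR/wget.$EXT" "$FILE" || { echo "Download failed"; exit 0; }
# Tesseract appends .txt to the output base name, giving $OCR/wget.txt.
"$TES" "$OCR/wget.$EXT" "$OCR/wget" -l eng
cat "$OCR/wget.txt"
# Remove both the recognized text and the downloaded image.
rm -f "$OCR/wget.txt" "$OCR/wget.$EXT"
else
echo "Tesseract not found"
fi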
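One thing to keep in mind: the script always writes to the same wget.* files, so two requests arriving at the same time could overwrite each other’s image or text. Below is a minimal sketch of one way around that, assuming mktemp is available on your server (it is not part of the original script). The helper names make_workdir and url_ext are made up for illustration, and /tmp is only a fallback for the OCR directory used above.

```shell
#!/bin/sh
# Sketch: give each request its own scratch directory so that
# concurrent requests cannot clobber each other's wget.* files.
# Assumption: mktemp is installed; the helper names are hypothetical.
OCR=${OCR:-/tmp}   # the post uses /home/econofy/ocr; /tmp is only a fallback

# Create a private directory under $OCR and print its path.
make_workdir() {
    mktemp -d "$OCR/ocr.XXXXXX"
}

# Extract the extension from a URL, the same way the CGI script does.
url_ext() {
    printf '%s\n' "${1##*.}"
}

WORK=$(make_workdir) || exit 1
# Remove the scratch directory when the script exits, even after a failure.
trap 'rm -rf "$WORK"' EXIT

echo "scratch dir: $WORK, extension: $(url_ext "http://example.com/label.png")"
# In the CGI script, wget and tesseract would then use "$WORK/wget.$EXT"
# and "$WORK/wget" instead of the shared paths under $OCR.
```

The trap means the cleanup happens in one place, so the explicit rm at the end of the CGI script would no longer be needed.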
Now you’re ready to test whether it works. Just pass the URL of a picture (PNG, JPEG or TIFF) to your CGI script. I used a picture of a refrigerator energy label.
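For the request itself, a sketch like the one below may help. The host name and image URL are placeholders, not the ones I used. As far as I understand the CGI convention, a query string that contains no "=" character is handed to the script as command-line arguments, which is how the image URL ends up in $1. That also suggests image URLs containing "=" would not reach the script this way.

```shell
#!/bin/sh
# Placeholder values: replace with your own server and image.
CGI_URL="http://localhost/cgi-bin/ocr.cgi"
IMAGE_URL="http://example.com/label.png"

# Build the request URL: the image URL is appended as a bare query string
# (no "key=" prefix) so that the server passes it to ocr.cgi as $1.
build_request() {
    printf '%s?%s\n' "$1" "$2"
}

build_request "$CGI_URL" "$IMAGE_URL"
# To actually send the request, paste the printed URL into a browser or run:
#   curl "$(build_request "$CGI_URL" "$IMAGE_URL")"
```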
Here’s what it looks like on my end (you might still find this on a Rackspace server generously provided by Cloudmine’s Ilya Braude):
Tesseract Open Source OCR Engine v3.02 with Leptonica
U S Government Federal law prohibits removal otthis label before consumer purchase.EHERCI GUIDE
Refrigerator-Freezer Modelts): HTJ17BBT, HTJ17CBT,
- Automatic Defrost HTH17CBT
- Top-Mounted Freezer Capacity:16.5 Cubic Feet
- Without Through-the-Door IceEstimated Yearly Operating Cost
! I I I I
$42 $52
Cost Range of Similar ModelsThe estimated yearly operating cost of this model was
not available at the time the range was published.324kWh
Estimated Yearly Electricity Use
Your cost will depend on your utility rates and use.
I Cost range based only on models of similar capacity with automatic defrost,
top-mounted freezer, and without through-the-door iceI Estimated operating cost based on a 2007 national average electricity cost
of 10.65 cents per kWh.F f t . ‘t ft. I I” . -
I or morein orma ion visi www cgovappiances 197Da23aP003 ENERGYS-I-AR£1
Far from perfect, heh? I’m going to explore the configuration settings and let you know how it goes.








