Tesseract OCR Library – Successfully compiled in Windows :)
Today I was given a project to build OCR software. After some googling, I decided to use the Tesseract library. It is open source, available for both Windows and Linux, and ships with a Visual Studio project. I have compiled it with both Visual Studio .NET and Visual Studio 6.0. The build produces tesseract.exe, which runs successfully on my PC and converts a TIFF image into text. The output is not perfect, but it is good enough to use in our project.
If anybody wants the instructions, please mail me or post your queries and questions here.
=============================
Updated on 20 Mar 2008
Here is the link to my compiled code:
http://cid-7ec550791692ecb9.skydrive.live.com/self.aspx/Tesseract/tesseract-1.03.zip
Google has also updated the Tesseract code and made it available at the following link:
http://code.google.com/p/tesseract-ocr/downloads/list
=============================
Updated on 5 Jun 2008
The following link may be useful for compiling under .NET. I haven't tried it yet.
http://www.pixel-technology.com/freeware/tessnet/
==============================
Updated on 23 Mar 2009
I believe the following URL will resolve all your issues. This OCR release was updated on 14th April 2008. I think you should give it a shot 🙂
http://code.google.com/p/tesseract-ocr/downloads/list
==============================
Updated on 05 Oct 2010
Tesseract 2.04 is now available, with source code that compiles in Visual C++ 2008 Express Edition, at http://code.google.com/p/tesseract-ocr/downloads/list.
Even 3.0 (Preview 1) has been released with a Windows executable, which you can call externally from .NET to convert images to text.
Hi,
I am working on an OCR application, so I tried to build the open-source Tesseract 1.03 in Microsoft Visual Studio 2005 (.NET Framework 2.0). But it shows the errors listed below. Please help me. If you have an exe file or compilable source code, please send it to me with a description.
Looking forward to your reply.
Thank you very much
Barathi
Errors list
Error 10 error C2666: ‘pow’ : 6 overloads have similar conversions e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\classify\clusttool.cpp 147
Error 30 error C2666: ‘pow’ : 6 overloads have similar conversions e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\classify\clusttool.cpp 147
Error 420 error C2440: ‘=’ : cannot convert from ‘const char *’ to ‘char *’ e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\display\pgedit.cpp 799
Error 629 error C2440: ‘=’ : cannot convert from ‘const char *’ to ‘char *’ e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\textord\tordmain.cpp 162
Error 630 error C2440: ‘=’ : cannot convert from ‘const char *’ to ‘char *’ e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\textord\tordmain.cpp 172
Barathi
March 12, 2007 at 7:14 am
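For anyone hitting the same two error families under Visual Studio 2005, here is a minimal C++ sketch of the usual fixes. The offending source lines from clusttool.cpp, pgedit.cpp and tordmain.cpp are not shown in this thread, so `powerOf` and `findExtension` below are hypothetical stand-ins, not the Tesseract code. C2666 on `pow` typically appears when an integer argument makes several overloads equally good, and C2440 typically appears when the result of searching a const string is stored in a non-const pointer.

```cpp
#include <cmath>
#include <cstring>

// C2666: a pow call with integer arguments is ambiguous among the
// float / double / long double overloads. Casting both arguments to
// double selects pow(double, double) explicitly.
double powerOf(int base, int exponent) {
    return std::pow(static_cast<double>(base), static_cast<double>(exponent));
}

// C2440: searching a const string yields a const char*, so the pointer
// that receives the result has to be const as well.
const char* findExtension(const char* filename) {
    const char* dot = std::strrchr(filename, '.');
    return dot;  // null when the name contains no '.'
}
```

If the real code needs to write through the pointer, copying the string into a local buffer first is safer than casting the const away.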
Hi,
how do you resolve those errors?..
I am actually experiencing those errors and I don’t know what to do..
francis
September 8, 2010 at 8:21 am
Will Send you tomorrow.
Manish Pansiniya
March 12, 2007 at 2:53 pm
can you send it to me also?..
francis
September 8, 2010 at 9:00 am
Barathi
Please download it from the link below. Please note that this code is for .NET.
http://www.sharebigfile.com/file/109699/tesseract-1-03-zip.html
I have changed the code a bit to remove the errors.
Manish Pansiniya
March 13, 2007 at 6:23 am
Great job on compiling. How do you actually invoke the exe? passing commandline parameters?
CP
March 16, 2007 at 5:21 am
Basically, I haven't tested the .NET executable, but the VC++ executable works for me. Yes, I invoke the exe by passing parameters to it:
tesseract [imagefile] [textfile]
But not every image can be processed by this exe, so you need to download ImageMagick and use its convert.exe to convert existing images to TIFF. Tesseract can then OCR the generated TIFF successfully.
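The two steps can be sketched in C++ roughly as follows. This is only an illustration: the function names are made up for this example, and it assumes that both ImageMagick's convert and tesseract can be found on the PATH. On Windows a system tool is also called convert.exe, so a full path to the ImageMagick one may be needed.

```cpp
#include <cstdlib>
#include <string>

// Wrap a path in double quotes so that spaces survive the command line.
std::string quoted(const std::string& path) {
    return "\"" + path + "\"";
}

// Step 1: convert <input image> <output tif> (ImageMagick).
std::string buildConvertCommand(const std::string& inputImage,
                                const std::string& tiffPath) {
    return "convert " + quoted(inputImage) + " " + quoted(tiffPath);
}

// Step 2: tesseract <imagefile> <textfile>.
std::string buildTesseractCommand(const std::string& tiffPath,
                                  const std::string& textFile) {
    return "tesseract " + quoted(tiffPath) + " " + quoted(textFile);
}

// Run both steps in order; true only if each command exits with status 0.
bool runOcr(const std::string& inputImage,
            const std::string& tiffPath,
            const std::string& textFile) {
    if (std::system(buildConvertCommand(inputImage, tiffPath).c_str()) != 0) {
        return false;
    }
    return std::system(buildTesseractCommand(tiffPath, textFile).c_str()) == 0;
}
```

A call such as runOcr("scan.png", "scan.tif", "result") would first produce scan.tif and then hand it to tesseract.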
Manish Pansiniya
March 17, 2007 at 9:56 am
I have just downloaded the compiled Tesseract Windows exe complete with a user interface from the website below in case this is of any use to you. This version also opens any Tiff compression images.
http://www.softi.co.uk/tess.htm
Regards
Jason
Jason Fullerton
March 25, 2007 at 10:09 pm
I have tried to download your modified Tesseract OCR library, but the link didn’t work. Can you post this again, please?
Anonymous
October 25, 2007 at 8:19 am
download from following folder
http://www.4shared.com/dir/4325553/c16db878/sharing.html
Manish Pansiniya
October 25, 2007 at 11:49 am
I was able to compile it using Studio 2005 but I have some problems running tesseract2.01.exe on Windows 2k3 Server
for example when I want to scan the phototest.tif file which goes with src I have the following in tesseract.log
Unable to load unicharset file tesseract2.01.exe/tessdata/eng.unicharset
I have run all the .exe files in training folder. All except cnTraining.exe which gives me unhandled exception.
I tried to find some data to put it into tessdata folder but then I have another list of errors in log file
Could you help me out with that? 🙂 Thanks!
p.s. I tried to compile the cnTraining to be able to Debug it but I got LINK errors..
Nick
October 25, 2007 at 10:06 pm
Uh.. I got /tessdata from project source zip file. I run tesseract and it works. Now I have to figure out how to get rid of LINK errors while compiling the cnTraining project 🙂
Error 1 error LNK2019: unresolved external symbol “int __cdecl getopt(int,char * * const,char const *)” (?getopt@@YAHHQAPADPBD@Z) referenced in function “void __cdecl ParseArguments(int,char * *)” (?ParseArguments@@YAXHPAPAD@Z) cnTraining.obj
Error 2 error LNK2019: unresolved external symbol “void __cdecl assert(int)” (?assert@@YAXH@Z) referenced in function “void __cdecl DoError(int,char const *)” (?DoError@@YAXHPBD@Z) danerror.obj
Error 3 fatal error LNK1120: 2 unresolved externals .\..\..\debug/cnTraining.exe
It would be great to get some help! Thanks!
Nick
October 25, 2007 at 10:36 pm
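Error 1 says the linker cannot find a definition of getopt, a POSIX function that the Visual C++ runtime does not provide. Error 2 similarly suggests that some file declares assert as an ordinary function instead of using the macro from assert.h. One possible way around the first error is to add a small replacement to the project. The sketch below is a hypothetical, simplified stand-in (named miniGetopt so it cannot clash with a real getopt), not the code Tesseract ships; it handles "-x" and "-x value" / "-xvalue" options but not grouped flags such as "-ab".

```cpp
#include <cstring>

char* miniOptarg = 0;  // argument of the last option, if it takes one
int   miniOptind = 1;  // index of the next argv element to examine

// Returns the option character, '?' for an unknown option or a missing
// argument, and -1 when there are no more options to read.
int miniGetopt(int argc, char* const argv[], const char* optstring) {
    miniOptarg = 0;
    if (miniOptind >= argc) return -1;
    const char* arg = argv[miniOptind];
    if (arg[0] != '-' || arg[1] == '\0') return -1;  // not an option
    if (std::strcmp(arg, "--") == 0) {               // explicit end marker
        ++miniOptind;
        return -1;
    }

    const char opt = arg[1];
    const char* spec = (opt == ':') ? 0 : std::strchr(optstring, opt);
    ++miniOptind;
    if (spec == 0) return '?';                       // unknown option

    if (spec[1] == ':') {                            // option takes an argument
        if (arg[2] != '\0') {
            miniOptarg = const_cast<char*>(arg + 2); // "-D5" form
        } else if (miniOptind < argc) {
            miniOptarg = argv[miniOptind++];         // "-D 5" form
        } else {
            return '?';                              // argument missing
        }
    }
    return opt;
}
```

To satisfy the linker with this approach, the function would have to be declared with the exact name and signature that cnTraining.cpp expects.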
Nick, must you use the .NET build of Tesseract? I have the .exe running and tested on my PC. If you need that source code, please let me know and I will upload it. I haven't tested the .NET code, and I don't think I can for the next two weeks because of a tight schedule. Please let me know your thoughts.
Manish Pansiniya
October 26, 2007 at 6:17 am
Nick, I have compiled cnTraining in .NET and it is working fine on my machine. Do you need that executable? Otherwise, try compiling it with Visual C++ 6.0; maybe that is the issue.
Manish Pansiniya
October 26, 2007 at 6:31 am
Hi, I have downloaded the Tesseract code, compiled it in .NET and produced the correct .exe.
However, I do not know how to produce a text file from my .tif. Please can you help?
I tried running from command line:
c:\tesserect pic.tif resultfile.txt
Baber
October 27, 2007 at 12:35 am
Here you need to follow some steps. The following link should be useful. You need to convert the TIFF image to another TIFF format using ImageMagick.
http://www.howtoforge.com/ocr_with_tesseract_on_ubuntu704
Manish Pansiniya
October 29, 2007 at 9:45 am
>>I have compiled cnTraining in .NET and its working fine with my machine
As I understand that program should give me default tessdata files…anyway I am still not able to compile it by my own. The one which I have gives me an exception when I run it 😦 It would be great if you provide me your source files. Thanks!
Nick
October 30, 2007 at 10:52 pm
Well, I have finally compiled my code, but the problem is still there. It gives me an exception when running. Could it be because I run it on a W2K3 Server machine? I am going to check it with XP though.
I am also planning to use it in my project... just wondering whether it is hard to teach it or not 🙂
Nick
October 30, 2007 at 11:40 pm
I believe it will run on 2k3. Just use ImageMagick before running it on any image.
Manish Pansiniya
October 31, 2007 at 1:11 pm
tesseract.exe runs well without any problem. I am experiencing problem with running cnTraining.exe 🙂
Nick
October 31, 2007 at 7:10 pm
Nick. I haven’t used that exe still :-P.
Manish Pansiniya
November 1, 2007 at 8:41 am
Manish Pansiniya,
can you please update your link:
http://www.sharebigfile.com/file/109699/tesseract-1-03-zip.html
the link is broken…
is it possible to compile it under .NET compact framework?
squid
December 11, 2007 at 5:31 pm
Hi Manish,
Can you please give us the link to download the .NET compiled version of the Tesseract OCR?
Thanks
Vellayan
Vellayan
December 18, 2007 at 7:44 am
I am on leave currently. Please check back after 2 days; I will surely upload it. Also, please tell me if you know any public server where I can put the file and it would be kept for a year or so. The URLs I have only keep files for about a month. Thanks.
Manish Pansiniya
December 19, 2007 at 11:51 am
I have a place where I can put it for over 1 year.
Pls send an email to second.softwares@gmail.com
Second softwares
December 20, 2007 at 10:41 am
hello expert
i have the same project but little bit of
difference, the image with some text written in it and i want to read only that text and generate text file
so please give me some help how i can read this
text from image file
varinder sharma
December 24, 2007 at 5:46 am
Hi. Is it possible to use the Tesseract library directly in C#, without invoking any executables? Thank you.
Felipe Dornelas
January 21, 2008 at 4:04 pm
Where I can download Tesseract OCR Library? I need to use in a VB.Net project, can I use it in VB.Net ? Thanks
luca
January 23, 2008 at 1:33 pm
Luca, you can use it from any programming language by calling the executable.
I haven't tested the .NET ported code, and I think it is hard to work on that code. I will upload the code to some site and let you know in this post.
Manish Pansiniya
January 29, 2008 at 4:24 pm
Please Help i am looking to use this engine in my project in JAVA Help
zeid
February 4, 2008 at 8:59 am
please help me i have the same project but on linux how to do that
zeid
February 14, 2008 at 11:58 am
Hi, i am aslo doing a ocr software.
Could you please send me the codes.
Or to upload it again.
Thank you very much.
John
February 22, 2008 at 10:59 am
2 John:
I think U’ll find all U need here: http://groups.google.com/group/tesseract-ocr/msg/c3d22084770d4727?
usarskyy
March 25, 2008 at 3:35 pm
Please, I would like to use tesseract with C#, can you help me
thanks(I Talk to you from Brazil)
Paulo
April 28, 2008 at 5:09 pm
I also am looking to use tesseract with c# without invoking an exe.
Any Suggestions?
Sutekh
May 13, 2008 at 11:49 am
Sutekh, I think the library is designed so that we generally use the executable to do the work. You would need to look into the code to use it directly. Regarding C#, I haven't done it, as I said before. But you can build a DLL in VC++ and use it from C# somehow.
Manish Pansiniya
May 13, 2008 at 2:07 pm
Hi Manish,
Today, I got the project to make OCR software by using Tesseract library. After googling, I
got ur blog. i did read that u already have compiled version of dotnet .So Can you please give me the link to download the .NET compiled version of the Tesseract OCR?
—
Thanks & Regards
Parmesh. A
Parmesh
May 15, 2008 at 6:49 am
As I said, Parmesh, I have compiled it but I could not run it. It gives me an error and I have not had time to debug it.
Manish Pansiniya
May 15, 2008 at 7:00 am
can u give me link to download so that i wil try myself and let u know
10x in adv
Parmesh
May 15, 2008 at 8:45 am
hi
When I compiled the same code in VS2005, it showed 32 errors and more than 2000 warnings. Can you please give me a solution?
Anonymous
June 11, 2008 at 5:46 am
Hi i haven´t tried your .NET compiled version, i will try it now, i downloaded a version and compiled it and i get this error when invoking:
Tessedit:Error:Usage:Tessedit imagename outputbase [configfile [[+|-]varfile]…]
Signal_exit 25 ABORT. LocCode: 3 AbortCode: 0
Anonymous
October 1, 2008 at 5:33 pm
Hi
I am looking for an OCR component (or Source)
for my .net Application.
Like most of you I am struggling to get
teseract to build under Visual Studio.
I have downloaded FreeOCR(uses Tesseract)from http://www.FreeOCR.co.uk)
This app works fine and uses .net.
I have been unable to contact the Author
Ralph Richardson.
He seems to be compiling and building
Tesseract into a .net Application.
Ralph are you out here we need your help!
JohnH
October 12, 2008 at 10:58 pm
I need help on how to use tesseract engine in JAVA,Please tell me how to download necessary stuff for JAVA application.
Tejas
May 28, 2009 at 9:50 pm
Hi
Tejas,and all experts
i have seen your question in blog ,am very new to java,
i need to work with tessaract ocr engine in my jsf project(no training),can any one help what are necessary things i need to install and any source code for reference.
leela
January 29, 2011 at 1:09 pm
Hi,
I am looking for good .NET OCR engine. I am student and I need to develop C# application, OCR application. Please instruct me how you used your OCR engine under .NET.
It would be kind if you mail me also how to do it in Java.
Regards,
Marusz
Mariusz
August 12, 2009 at 5:51 pm
Hi Mariusz,
See, actually I built it in VC++ and use the exe from the command prompt. As I have already mentioned, it does not give 100% accurate results, but it is a really good open-source option. From C#, I was calling the executable through a shell-command library and passing it the image to convert to text.
For Java, I don't have much of an idea. But you can go to Google Code and search there.
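Whatever language launches the executable, the calling program then has to read the recognized text back from disk. Here is a minimal C++ sketch of that last step. It assumes the Tesseract 2.x/3.x command-line behaviour of writing the result to the output base name with ".txt" appended; the function names are made up for this example.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Tesseract treats its second argument as a base name and appends ".txt".
std::string resultPath(const std::string& outputBase) {
    return outputBase + ".txt";
}

// Reads the whole result file into `text`.
// Returns false when the file cannot be opened (for example, OCR failed).
bool readOcrResult(const std::string& outputBase, std::string& text) {
    std::ifstream in(resultPath(outputBase).c_str(), std::ios::binary);
    if (!in) {
        return false;
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();
    text = buffer.str();
    return true;
}
```

This also means that passing "result.txt" as the second argument would produce a file named result.txt.txt.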
Manish
August 12, 2009 at 6:07 pm
Do you have the library for visual studio in vb or c#?
ffffff
September 9, 2009 at 10:47 am
Hello Manish ji,
I am using Tesseract, but the OCR engine is not working properly in terms of getting the word list and coordinates. For some documents with a clear font, it misses at least 40% of the word list. Please guide me on what I need to change in my Tesseract application. Thank you in advance.
pratap
December 21, 2009 at 10:58 am
Hi manish,
I downloaded Tesseract from your site and it is working properly. But I want Tesseract in DLL form to share with other applications, so I made some changes to build it as a Visual C++ 6 MFC AppWizard (dll) project. Then the following errors occur:
\ccutil\scanutils.cpp(25) : fatal error C1083: Cannot open include file: 'inttypes.h': No such file or directory
fatal error C1189: #error : include 'stdafx.h' before including this file for PCH Error executing cl.exe.
Can you help me resolve this?
Thanks and Regards
vidhu
vidhu
February 4, 2010 at 11:02 am
Hi Manish,
I am trying to use the Tesseract 2.04 exe directly from the command line, but when I run "tesseract.exe test.tif out.txt" it shows "unable to load unicharset file ./tessdata/eng.unicharset", and if I change the language to Spanish it shows "unable to load unicharset file ./tessdata/spa.unicharset".
Ernie
March 2, 2010 at 5:45 pm
Hi Ernie,
From your setup, it seems to be a path problem in Tesseract. Put the tessdata folder at the location the error message names. I cannot help you more because I did this around 2 years ago. I will look at how Tesseract has progressed and see if I can post something new that would be helpful.
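A quick way to confirm the path problem is to check for the exact file the error message names before launching tesseract. The C++ sketch below builds that path from a base directory and a language code and tests whether the file can be opened. The helper names are made up for this example, and the layout (a tessdata folder holding a .unicharset file per language) is taken from the error messages quoted above.

```cpp
#include <fstream>
#include <string>

// Builds <baseDir>/tessdata/<lang>.unicharset, the file named in the error.
std::string unicharsetPath(const std::string& baseDir, const std::string& lang) {
    std::string dir = baseDir;
    if (!dir.empty()) {
        const char last = dir[dir.size() - 1];
        if (last != '/' && last != '\\') {
            dir += '/';
        }
    }
    return dir + "tessdata/" + lang + ".unicharset";
}

// True when the language data file exists and can be opened for reading.
bool languageDataPresent(const std::string& baseDir, const std::string& lang) {
    std::ifstream probe(unicharsetPath(baseDir, lang).c_str(), std::ios::binary);
    return probe.good();
}
```

For the messages above, languageDataPresent(".", "eng") and languageDataPresent(".", "spa") would both need to return true when run from the directory where tesseract.exe is started.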
Manish
Manish
March 5, 2010 at 6:52 pm
Hi Manish,
Can you help me how to use tesseract ocr on vc++ 2008 mfc aplication? how to linking the dll?
Febri
March 26, 2010 at 3:45 pm
Hi Febri,
I have started company and currently very busy with it.
You can outsource it to me ;). Just kidding.
Manish
IntelliPro IT Solutions
Manish
March 26, 2010 at 11:46 pm
It is now available for download on Google Code at http://code.google.com/p/tesseract-ocr/. The 2.04 version can easily be built in VC++ 2008 out of the box.
Manish
October 5, 2010 at 11:21 pm
When I run tesseract from the command prompt like this:
tesseract image_name.tiff out_put_file_name.txt
this error occurs:
Tesseract Open Source OCR Engine
TIFFOpen: mypune0015.tif: Cannot open.
tesseract:Error:Read of file failed:mypune0015.tif
Segmentation fault
What can I do? Please help me.
amol
November 30, 2010 at 4:52 pm
I think you will need to save the TIFF file as an uncompressed file. Could you please check whether your file is uncompressed or not?
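One way to check is to read the Compression tag (tag 259) from the TIFF header; a value of 1 means uncompressed. The C++ sketch below does this for the first image directory of a classic TIFF already loaded into memory. It is only an illustration written for this thread, with made-up function names, and it does not handle BigTIFF or multi-page files.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Reads an unsigned value of `width` bytes at `pos`, honouring byte order.
static unsigned long readUInt(const Bytes& b, std::size_t pos, int width,
                              bool littleEndian) {
    unsigned long value = 0;
    for (int i = 0; i < width; ++i) {
        // Visit bytes from most significant to least significant.
        std::size_t idx = littleEndian ? pos + (width - 1 - i) : pos + i;
        value = (value << 8) | b[idx];
    }
    return value;
}

// Returns the Compression value of the first image directory,
// 1 if the tag is absent (the TIFF default, meaning uncompressed),
// or -1 if the data is not a readable classic TIFF.
long tiffCompression(const Bytes& b) {
    if (b.size() < 8) return -1;
    bool littleEndian;
    if (b[0] == 'I' && b[1] == 'I') littleEndian = true;
    else if (b[0] == 'M' && b[1] == 'M') littleEndian = false;
    else return -1;
    if (readUInt(b, 2, 2, littleEndian) != 42) return -1;  // TIFF magic number

    std::size_t ifd = readUInt(b, 4, 4, littleEndian);     // first directory
    if (ifd + 2 > b.size()) return -1;
    unsigned long entries = readUInt(b, ifd, 2, littleEndian);
    for (unsigned long i = 0; i < entries; ++i) {
        std::size_t entry = ifd + 2 + i * 12;              // 12 bytes per entry
        if (entry + 12 > b.size()) return -1;
        if (readUInt(b, entry, 2, littleEndian) == 259) {
            // A SHORT value sits in the first 2 bytes of the value field.
            return static_cast<long>(readUInt(b, entry + 8, 2, littleEndian));
        }
    }
    return 1;
}
```

If the function returns anything other than 1, re-saving the image without compression before running tesseract is worth trying.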
Manish
November 30, 2010 at 5:02 pm
hi manish, iam unable to load ocr3.0 from paperfile.net which is a free software to convert images to docs getting the error message c:\Documents and Settings\All Users\Application Data\Tarmainstaller\{108A39BF-4ED1-4293-B11A-06BD521FB8F7}\Cache\DROPPED_20100101190241.tiz and also the below error msg giving the msg as error 11
c:\Documents and Settings\All Users\Application Data\Tarma installer\{108A39BF-4ED1-4293-B11A-06BD521FB8F7}\Cache\FreeOCR_2.1.0.8_L075a6c69191ec1db_x86.exe
srinivastripati
February 3, 2011 at 8:44 am
Hi Manish, I hope you can help me load FreeOCR 3.0, which uses Tesseract and runs on the Tarma installer. The error message is as follows: c:\Documents and Settings\All Users\Application Data\Tarmainstaller\{108A39BF-4ED1-4293-B11A-06BD521FB8F7}\Cache\DROPPED_20100101190241.tiz
I have a project to convert GIF images to Word. Please reply ASAP.
srinivastripati
February 3, 2011 at 8:47 am
I do not have any knowledge about software and all this stuff. Can you suggest what I should do right now?
srinivastripati
February 3, 2011 at 8:49 am
Hi Manish,
I need to build tesseract-1.03 in MFC with VS 6.0. Can you please tell me what needs to be done to compile and build it?
Regards,
Anshuman
Anshuman
March 18, 2011 at 11:29 pm
hi expert
i would like to ask few questions, as i am now doing my school project on OCR
honestly i am lost on how to execute the source code given by the tesseract 3.0 which i have downloaded. when i unzip it, it gives me alot of C++ file which i dont know how to test it out.
besides that i am currently working on the VB.NET code and it is not tesseract
would you like to give me some suggestion on how to start?
thank you so much
regards
nana
March 25, 2011 at 7:26 am
Hi Twins and shobhit,
Actually, I am very busy nowadays with our company's work, so I cannot help you. But if you send me specific issues, I might be able to help with those.
Manish
April 7, 2011 at 10:34 pm
hi manish,
i have to make ocr for my college project but in c# and i am clueless about how to start and all other details. please help me with this.
its kind of urgent.
thanks in advance
regards
shobhit sarin
April 7, 2011 at 1:02 am
hi
i totally understand your situation but i am not getting from where to start else i wont have troubled you. if u have source code of a simple ocr in c# please mail me or upload it. or tell me what are the different modules of the project like reading image or converting to text module.
thanx
regards
shobhit
April 8, 2011 at 12:13 am
do you have instructions for using with eclipse on windows
android
April 8, 2011 at 10:47 pm
Hello,
I want an example in c++ of how to use tessearct
guess22
May 19, 2011 at 9:43 pm
Actually it is better to use the executable and pass the parameter using command line.
Manish
May 19, 2011 at 11:30 pm
Thx for response
But the purpose was to compile the Tesseract library on Windows XP, and that is fine, I did it. Now I want to use it: I must develop a C++ program that transforms an image into text.
guess22
May 20, 2011 at 6:35 pm
i downloaded your compiled code..when i’m running the application its saying that dll files are missing from your computers what should i do?
Abhisheck Badjatia
June 9, 2011 at 4:41 pm
Hı I have a project about Ocr on windows device.I used tessearct but it gave me error(İnit() function ).Can you help me ?
tuba
July 21, 2011 at 4:08 am
Hi Tuba,
Sorry about that, but I am a bit busy with work. Could you post your problem to the Google group and see if that helps?
Manish
July 21, 2011 at 10:42 pm
Could you please show how to use the Tesseract functions in a C++ program? I am not able to link it with the VS project.
I posted it on So but got no reply… http://stackoverflow.com/questions/6798278/build-error-with-c-api-of-tesseract-ocr
arunirc
July 24, 2011 at 5:42 pm
I am trying to download the free Tesseract OCR software using the Tarma installer. Instead of downloading to my hard drive, I was downloading to one of my thumb drives. During installation I received the following error: Error1006 while opening these files-The volume for a file has been externally altered so that the opening file is no longer valid.
Are you familiar with this error, and can you assist me with completing my installation or do I have to install on my hard drive for software to work?
del
July 24, 2011 at 7:18 pm
Hi,
Could some one please tell me how to use the Tesseract OCR on the windows machine. I want to know how exactly it functions. Help needed.
Misty
October 31, 2011 at 5:56 pm
Has anyone used this for OCR on PDF files?
Rohit Sharma
November 8, 2011 at 10:02 pm
Is it possible compile tesseract as a win32 dll ? If yes, can someone guide me please.
Debjit
December 15, 2011 at 2:09 am
can you please send me the source code on how to use tessaract in vb6 .0 and vb.net 2005. i will be gladly appreciate any help from you.thank you
Jason
December 15, 2011 at 8:45 am
I prefer to use gocr rather than tesseract for OCR. See the difference here: http://www.seeingwithsound.com/ocr.htm
jasa pembuatan web
January 2, 2012 at 8:55 pm
DROPPED_20100101190241.tiz what is this in ocr software
that create a problem
arvind
February 26, 2012 at 11:47 pm
hi all,
if anybody has implemented java with TesseractOCR please do share with us.
debrajmallick
April 28, 2012 at 4:42 pm
Hi Manish. I am working on an OCR project where I need to read numbers from an image. After a lot of searching on Google, I found your blog and downloaded the project http://cid-7ec550791692ecb9.skydrive.live.com/self.aspx/Tesseract/tesseract-1.03.zip. You said you successfully compiled this project in VS 2008, but when I compile it in Visual Studio 2008 I get errors. Can you help me solve these errors?
Guru
June 4, 2012 at 1:27 pm
could you also email me a latest version of you visual studio project. i am trying to use the teserat to develop an ocr software in vb 2010.
reyniel macero
June 21, 2012 at 6:56 am
Hi Reyniel, actually it was at the http://cid-7ec550791692ecb9.skydrive.live.com/self.aspx/Tesseract/tesseract-1.03.zip URL, but that is a really old version and I do not have time to try the new one. Could you please download the new one from Google and see if it works?
Manish
June 21, 2012 at 8:12 pm
Hi,
I have started working with Tesseract but I can't run it. I tried to run it from the command prompt on Windows, but it says "Can't create output file".
On cmd I am using: tesseract sample2.tif output
My whole term project is based on it so please help me ASAP.
Thanks in advance.
Ishant
August 22, 2012 at 12:12 am
I figured it out, if you are using visual studios 2010 and are using windows forms / designer you can add it easily this way with no issues
1) add the following projects to your project ( i am warning you once, do not add the tesseract solution, or change any setting in the projects you add, unless you love to hate yourself )
ccmain
ccstruct
ccutil
classify
cube
cutil
dict
image
libtesseract
neural_networks
textord
viewer
wordrec
you can add the others but you don’t really want all that built into your project do you? naaa, build those separately
2) go to your project properties and add libtesseract as a reference, you can now that it is visible as a project, this will make it so that your project builds fast without examining the millions of warnings within tesseract. [common properties]->[add reference]
3) right click your project in the solution explorer and click project dependencies, make sure it is dependant on libtesseract or even all of them, it just means they build before your project.
4) the tesseract 2010 visual studio projects contain a number of configuration settings aka release, release.dll, debug, debug.dll, it seems that the release.dll settings produce the right files. First, set the solution output to release.dll. Click your project properties. Then click configuration manager. If that is not available, do this, click the SOLUTION’s properties in the solution tree and click configuration tab, you will see a list of projects and the associated configuration settings. You will notice your project is not set to release.dll even though the output is. If you took the second route you still need to click configuration manager. Then you can edit the settings, click new on your projects settings and call it release.dll…exactly the same as the rest of them and copy the settings from release. Do the same thing for Debug, so that you have a debug.dll name copied from debug settings. wheew…almost done
5) Don’t try to change tesseracts settings to match yours….that wont work ….and when the new release comes out you wont be able to just “throw it in” and go. Accept the fact that in this state your new modes are Release.dll and Debug.dll. don’t stress out…you can go back when its is finished and remove the projects from your solution.
6) Guess where the libraries and DLLs come out? In your project. You may or may not need to add the library directories. Some people say to dump all the headers into a single folder so they only need to add one folder to the includes, but not me. I want to be able to delete the tesseract folder and reload it from the zips without extra work, and be fully ready to update in one move or restore it if I made a mess of the code. It's a bit of work, and you can do it with code instead of the settings, which is the way I do it, but you should include all the folders that contain header files within the 2010 tesseract project folder and leave them alone.
7) there is no need to add any files to your project. just these lines of code….. I have included some additional code that converts from one foreign data set to the tiff friendly version with no need to save / load file. aren’t I nice?
8) now you can fully debug in debug.dll and release.dll, once you have successfully built it into your project even once you can remove all the added projects and it will be peeerfect. no extra compiling or errors. fully debugable, all natural.
9) If I remember right, I could not get around the fact I had to copy the files in 2008/lib/ into my projects release folder….darn it.
In my project's "functions.h" I put
#pragma comment (lib, "liblept.lib" )
#define _USE_TESSERACT_
#ifdef _USE_TESSERACT_
#pragma comment (lib, "libtesseract.lib" )
#include <baseapi.h> // header name was lost from the post; Tesseract's API header is assumed
#endif
#include <allheaders.h> // header name was lost from the post; Leptonica's umbrella header is assumed
in my main project I put this in a class as a member:
tesseract::TessBaseAPI *readSomeNombers;
and of course I included "functions.h" somewhere
then I put this in my class's constructor:
readSomeNombers = new tesseract::TessBaseAPI();
readSomeNombers->Init(NULL, "eng" );
readSomeNombers->SetVariable( "tessedit_char_whitelist", "0123456789,." );
then I created this class member function, plus a class member to serve as the output. Don't hate, I don't like returning variables. Not my style. Note that the text returned by GetUTF8Text() has to be freed with delete [] and the Pix has to be released with pixDestroy(), otherwise every call leaks memory. But by all means, you can do whatever.
void Gaara::scanTheSpot()
{
    Pix *someNewPix;
    char* outText;
    ostringstream tempStream;
    RECT tempRect;
    someNewPix = pixCreate( 200 , 40 , 32 );
    convertEasyBmpToPix( &scanImage, someNewPix, 87, 42 );
    readSomeNombers->SetImage(someNewPix);
    outText = readSomeNombers->GetUTF8Text();
    tempStream.str("");
    tempStream << outText;
    classMemeberVariable = tempStream.str();
    //pixWrite( "test.bmp", someNewPix, IFF_BMP );
    delete [] outText;          // GetUTF8Text() allocates the string with new[]
    pixDestroy( &someNewPix );  // release the Leptonica image
}
The object that has the information I want to scan is in memory and is pointed to by &scanImage. It is from the "EasyBMP" library, but that is not important.
I deal with it in a function in "functions.h" / "functions.cpp". By the way, I am doing a little extra processing while I am in the loop, namely thinning the characters, making the image black and white, and reversing black and white, which is unnecessary. At this phase in my development I am still looking for ways to improve the recognition, though for my purposes this has not yielded bad data yet. My view is to use the default Tess data for simplicity. I am acting heuristically to solve a very complex problem.
void convertEasyBmpToPix( BMP *sourceImage, PIX *outputImage, unsigned startX, unsigned startY )
{
    int endX = startX + ( pixGetWidth( outputImage ) );
    int endY = startY + ( pixGetHeight( outputImage ) );
    unsigned destinationX;
    unsigned destinationY = 0;
    for( int yLoop = startY; yLoop < endY; yLoop++ )
    {
        destinationX = 0;
        for( int xLoop = startX; xLoop < endX; xLoop++ )
        {
            // copy the pixel first so that we pass the address of a named object
            RGBApixel sourcePixel = sourceImage->GetPixel( xLoop, yLoop );
            if( isWhite( &sourcePixel ) )
            {
                pixSetRGBPixel( outputImage, destinationX, destinationY, 0,0,0 );
            }
            else
            {
                pixSetRGBPixel( outputImage, destinationX, destinationY, 255,255,255 );
            }
            destinationX++;
        }
        destinationY++;
    }
}
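bool isWhite( RGBApixel *image )
{
    // The comparison operators and threshold values were lost from the original post.
    // Assumed reconstruction: a pixel is "not white" when any channel is dark.
    const int darkThreshold = 50; // assumed value, not from the original post
    if(
        //destination->SetPixel( x, y, source->GetPixel( xLoop, yLoop ) );
        ( image->Red < darkThreshold ) ||
        ( image->Blue < darkThreshold ) ||
        ( image->Green < darkThreshold )
        )
    {
        return false;
    }
    return true;
}
[The end of this function and the text that introduced the next snippet are missing from the post. The typedefs below appear to be the 64-bit integer block from Tesseract's host.h.]
#if (_MSC_VER >= 1200) //%%% vkr for VC 6.0
typedef _int64 inT64;
typedef unsigned _int64 uinT64;
#else
typedef long long int inT64;
typedef unsigned long long int uinT64;
#endif //%%% vkr for VC 6.0
typedef float FLOAT32;
typedef double FLOAT64;
typedef unsigned char BOOL8;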
Kage.Sabaku.No.Gaara
September 18, 2012 at 7:33 am
I am new to OCR. Can you tell me how to install it on Windows 7? I am developing a project in C++ which needs OCR. Please suggest an appropriate solution.
Thanks and Regards,
Vikky
Vicky Patil
March 6, 2013 at 9:52 pm
Mine is Windows 7 32-bit and I have Visual Studio 2012. Please tell me how to install Tesseract.
shrinidhi
September 5, 2013 at 10:29 pm
Shrinidhi, I think you should look at Google Code for this. I think they provide all the code needed to build it under Windows with Visual Studio. Hope this helps.
Manish
September 6, 2013 at 2:28 am