Tesseract OCR Library – Successfully compiled on Windows :)
Today I was given a project to build OCR software. After some googling, I decided to use the Tesseract library. It is open source and available for both Windows and Linux, and it ships with a Visual Studio project. I have compiled it with .NET and also with Visual Studio 6.0. The build produces tesseract.exe, which runs successfully on my PC and converts a TIFF image into text. The results are not perfect, but they are good enough to use in our project.
If anybody wants the instructions, please mail me or post your queries and questions here.
=============================
Updated on 20 Mar 2008
The following is the link to my compiled code:
http://cid-7ec550791692ecb9.skydrive.live.com/self.aspx/Tesseract/tesseract-1.03.zip
Google has also updated the Tesseract code and made it available at the following link:
http://code.google.com/p/tesseract-ocr/downloads/list
=============================
Updated on 5 Jun 2008
The following link may be useful for compiling under .NET. I haven’t tried it yet.
http://www.pixel-technology.com/freeware/tessnet/
==============================
Updated on 23 Mar 2009
I believe the following URL will resolve all your issues. This OCR was updated on 14th April 2008. I think you should give it a shot.
Hi,
I am working on an OCR application, so I tried to build the open-source Tesseract 1.03 in Microsoft Visual Studio 2005 (.NET Framework 2.0), but it shows the errors listed below. Please help me. If you have an exe file or compilable source code, please send it to me with a description.
Looking forward to your reply….
Thank you very much
Barathi
Errors list
Error 10 error C2666: 'pow' : 6 overloads have similar conversions e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\classify\clusttool.cpp 147
Error 30 error C2666: 'pow' : 6 overloads have similar conversions e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\classify\clusttool.cpp 147
Error 420 error C2440: '=' : cannot convert from 'const char *' to 'char *' e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\display\pgedit.cpp 799
Error 629 error C2440: '=' : cannot convert from 'const char *' to 'char *' e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\textord\tordmain.cpp 162
Error 630 error C2440: '=' : cannot convert from 'const char *' to 'char *' e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\textord\tordmain.cpp 172
Barathi
March 12, 2007 at 7:14 am
I will send it to you tomorrow.
Manish Pansiniya
March 12, 2007 at 2:53 pm
Barathi,
Please download it from the link below. Please note that this code is for .NET.
http://www.sharebigfile.com/file/109699/tesseract-1-03-zip.html
I have changed the code a bit to remove the errors.
Manish Pansiniya
March 13, 2007 at 6:23 am
Great job on compiling. How do you actually invoke the exe? By passing command-line parameters?
CP
March 16, 2007 at 5:21 am
Basically, I haven’t tested the .NET executable, but the VC++ executable works for me. Yes, I invoke the exe by passing parameters to it:
tesseract [imagefile] [textfile]
However, this exe cannot process every image. You need to download ImageMagick and use its convert.exe to convert existing images to TIFF, and then pass the generated TIFF to tesseract so that it can OCR the image successfully.
Manish Pansiniya
March 17, 2007 at 9:56 am
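The two-step workflow described in the comment above (convert the image with ImageMagick, then run tesseract on the result) could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, it assumes that `convert` and `tesseract` are both on the PATH, and the `-compress None` option and the `.txt` suffix on the output file are assumptions about the builds discussed here rather than something stated in the post.

```python
import subprocess
from pathlib import Path


def convert_command(image_path, tiff_path):
    """Build the ImageMagick command that rewrites an image as a TIFF."""
    # Assumption: the Tesseract build only reads uncompressed TIFF files,
    # so compression is switched off during the conversion.
    return ["convert", str(image_path), "-compress", "None", str(tiff_path)]


def tesseract_command(tiff_path, output_base):
    """Build the tesseract command line: tesseract <image> <output base>."""
    return ["tesseract", str(tiff_path), str(output_base)]


def ocr_image(image_path, work_dir="."):
    """Convert an image to TIFF, run tesseract on it, and return the text."""
    image_path = Path(image_path)
    tiff_path = Path(work_dir) / (image_path.stem + "_ocr.tif")
    output_base = Path(work_dir) / image_path.stem
    subprocess.run(convert_command(image_path, tiff_path), check=True)
    subprocess.run(tesseract_command(tiff_path, output_base), check=True)
    # Assumption: tesseract appends ".txt" to the output base it was given.
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

On Windows, the bare name `convert` may resolve to the unrelated system utility of the same name, so the full path to ImageMagick’s convert.exe may be needed in `convert_command`.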
I have just downloaded the compiled Tesseract Windows exe, complete with a user interface, from the website below, in case this is of any use to you. This version also opens TIFF images with any compression.
http://www.softi.co.uk/tess.htm
Regards
Jason
Jason Fullerton
March 25, 2007 at 10:09 pm
I have tried to download your modified Tesseract OCR library, but the link didn’t work. Can you post this again, please?
Anonymous
October 25, 2007 at 8:19 am
Download it from the following folder:
http://www.4shared.com/dir/4325553/c16db878/sharing.html
Manish Pansiniya
October 25, 2007 at 11:49 am
I was able to compile it using Visual Studio 2005, but I have some problems running tesseract2.01.exe on Windows Server 2003.
For example, when I try to scan the phototest.tif file that comes with the source, I get the following in tesseract.log:
Unable to load unicharset file tesseract2.01.exe/tessdata/eng.unicharset
I have run all the .exe files in the training folder, all except cnTraining.exe, which gives me an unhandled exception.
I tried to find some data to put into the tessdata folder, but then I got another list of errors in the log file.
Could you help me out with that?
Thanks!
P.S. I tried to compile cnTraining so that I could debug it, but I got LINK errors.
Nick
October 25, 2007 at 10:06 pm
Uh... I got /tessdata from the project source zip file. I ran tesseract and it works. Now I have to figure out how to get rid of the LINK errors when compiling the cnTraining project:
Error 1 error LNK2019: unresolved external symbol "int __cdecl getopt(int,char * * const,char const *)" (?getopt@@YAHHQAPADPBD@Z) referenced in function "void __cdecl ParseArguments(int,char * *)" (?ParseArguments@@YAXHPAPAD@Z) cnTraining.obj
Error 2 error LNK2019: unresolved external symbol "void __cdecl assert(int)" (?assert@@YAXH@Z) referenced in function "void __cdecl DoError(int,char const *)" (?DoError@@YAXHPBD@Z) danerror.obj
Error 3 fatal error LNK1120: 2 unresolved externals .\..\..\debug/cnTraining.exe
It would be great to get some help! Thanks!
Nick
October 25, 2007 at 10:36 pm
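The "Unable to load unicharset file" error above went away once the tessdata folder from the source zip was in place. A quick pre-flight check along those lines might look like the sketch below. The function name is hypothetical, and only `eng.unicharset` is checked by default because that is the one file named in the log; any other file suffixes would have to be supplied by the caller.

```python
from pathlib import Path


def missing_tessdata_files(install_dir, lang="eng", suffixes=("unicharset",)):
    """Return the expected language files absent from <install_dir>/tessdata."""
    tessdata = Path(install_dir) / "tessdata"
    # Each expected file is named <lang>.<suffix>, e.g. eng.unicharset.
    expected = [tessdata / f"{lang}.{suffix}" for suffix in suffixes]
    return [path for path in expected if not path.is_file()]
```

An empty list means every requested file was found; otherwise the returned paths show what still needs to be copied from the source archive.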
Nick, is it a must to use the .NET code of Tesseract? I have an .exe file running and tested on my PC. If you need that source code, please let me know and I will upload it. I haven’t tested the .NET code, and I don’t think I can do so for the next 2 weeks because of a tight schedule. Please let me know your thoughts.
Manish Pansiniya
October 26, 2007 at 6:17 am
Nick, I have compiled cnTraining in .NET and it’s working fine on my machine. Do you need that executable? Otherwise, try compiling it with Visual C++ 6.0; maybe that is the issue.
Manish Pansiniya
October 26, 2007 at 6:31 am
Hi, I have downloaded the Tesseract code, compiled it in .NET, and produced the correct .exe.
However, I do not know how to produce a text file from my .tif. Please can you help?
I tried running from command line:
c:\tesserect pic.tif resultfile.txt
Baber
October 27, 2007 at 12:35 am
You need to follow some steps here; the following link should be useful. You need to convert the TIFF image to another TIFF format using ImageMagick.
http://www.howtoforge.com/ocr_with_tesseract_on_ubuntu704
Manish Pansiniya
October 29, 2007 at 9:45 am
>>I have compiled cnTraining in .NET and its working fine with my machine
As I understand it, that program should give me the default tessdata files… Anyway, I am still not able to compile it on my own. The one I have gives me an exception when I run it.
It would be great if you could provide me with your source files. Thanks!
Nick
October 30, 2007 at 10:52 pm
Well, I have finally compiled my code, but the problem is still there: it gives me an exception when running. Could it be because I run it on a Windows Server 2003 machine? I am going to check it on XP, though.
I am also planning to use it in my project… just wondering whether it is hard to teach it or not.
Nick
October 30, 2007 at 11:40 pm
I believe it will run on Windows Server 2003. Just use ImageMagick before running it on any image.
Manish Pansiniya
October 31, 2007 at 1:11 pm
tesseract.exe runs well without any problems. I am having problems running cnTraining.exe.
Nick
October 31, 2007 at 7:10 pm
Nick, I haven’t used that exe yet.
Manish Pansiniya
November 1, 2007 at 8:41 am
Manish Pansiniya,
Can you please update your link:
http://www.sharebigfile.com/file/109699/tesseract-1-03-zip.html
The link is broken…
Is it possible to compile it under the .NET Compact Framework?
squid
December 11, 2007 at 5:31 pm
Hi Manish,
Can you please give us the link to download the .NET compiled version of the Tesseract OCR?
Thanks
Vellayan
Vellayan
December 18, 2007 at 7:44 am
I am on leave currently. Please check back after 2 days; I will surely upload it. Also, please let me know if you know of any public server where I can put the file and it would be kept for a year or so. The URLs I have only last about a month. Thanks.
Manish Pansiniya
December 19, 2007 at 11:51 am
I have a place where I can put it for over 1 year.
Please send an email to second.softwares@gmail.com
Second softwares
December 20, 2007 at 10:41 am
Hello expert,
I have the same project but with a small difference: I have an image with some text written in it, and I want to read only that text and generate a text file. Please give me some help on how I can read this text from the image file.
varinder sharma
December 24, 2007 at 5:46 am
Hi. Is it possible to use the Tesseract library directly in C#, without invoking any executables? Thank you.
Felipe Dornelas
January 21, 2008 at 4:04 pm
Where can I download the Tesseract OCR library? I need to use it in a VB.NET project. Can I use it in VB.NET? Thanks
luca
January 23, 2008 at 1:33 pm
Luca, you can use it from any programming language by invoking the executable.
I haven’t tested the .NET ported code, and I think the code is hard to work with. I will upload the code to some site and let you know in this post.
Manish Pansiniya
January 29, 2008 at 4:24 pm
Please help, I am looking to use this engine in my Java project.
zeid
February 4, 2008 at 8:59 am
Please help me, I have the same project but on Linux. How do I do that?
zeid
February 14, 2008 at 11:58 am
Hi, I am also building OCR software.
Could you please send me the code, or upload it again?
Thank you very much.
John
February 22, 2008 at 10:59 am
To John:
I think you’ll find all you need here: http://groups.google.com/group/tesseract-ocr/msg/c3d22084770d4727?
usarskyy
March 25, 2008 at 3:35 pm
Please, I would like to use Tesseract with C#. Can you help me?
Thanks (I am writing to you from Brazil)
Paulo
April 28, 2008 at 5:09 pm
I am also looking to use Tesseract with C# without invoking an exe.
Any suggestions?
Sutekh
May 13, 2008 at 11:49 am
Sutekh, I think the library is designed so that we generally use the executable to do the needed work; you would need to look into the code to make it work otherwise. Regarding C#, as I said before, I haven’t done it, but you could build a DLL in VC++ and use it from C# somehow.
Manish Pansiniya
May 13, 2008 at 2:07 pm
Hi Manish,
Today I was given a project to build OCR software using the Tesseract library. After googling, I found your blog. I read that you already have a compiled .NET version, so can you please give me the link to download the .NET compiled version of Tesseract OCR?
–
Thanks & Regards
Parmesh. A
Parmesh
May 15, 2008 at 6:49 am
As I said, Parmesh, I have compiled it but could not run it. It gives me an error, and I have not had time to debug it.
Manish Pansiniya
May 15, 2008 at 7:00 am
Can you give me the download link so that I can try it myself and let you know?
Thanks in advance.
Parmesh
May 15, 2008 at 8:45 am
Hi,
When I compiled the same code in VS2005, it showed 32 errors and more than 2000 warnings. Can you please give me a solution?
Anonymous
June 11, 2008 at 5:46 am
Hi, I haven’t tried your .NET compiled version; I will try it now. I downloaded a version and compiled it, and I get this error when invoking it:
Tessedit:Error:Usage:Tessedit imagename outputbase [configfile [[+|-]varfile]…]
Signal_exit 25 ABORT. LocCode: 3 AbortCode: 0
Anonymous
October 1, 2008 at 5:33 pm
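The usage line in the error above lists two required arguments, imagename and outputbase, followed by optional configuration arguments, and the message appears to be what the program prints when those required arguments are missing. The sketch below builds a matching argument list before the executable is launched. Both function names are hypothetical, and the `.txt` suffix on the result file is an assumption about how the build names its output.

```python
def build_tesseract_args(image_name, output_base, config_file=None, exe="tesseract"):
    """Build an argument list of the form: imagename outputbase [configfile]."""
    if not image_name or not output_base:
        raise ValueError("both imagename and outputbase are required")
    args = [exe, image_name, output_base]
    if config_file:
        # The optional config file follows the two required arguments.
        args.append(config_file)
    return args


def expected_text_file(output_base):
    """Assumption: the recognized text is written to <outputbase>.txt."""
    return output_base + ".txt"
```

Under that assumption, passing `resultfile.txt` as the output base would leave the text in `resultfile.txt.txt`, so it may be simpler to pass a base name without an extension.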
Hi
I am looking for an OCR component (or source) for my .NET application.
Like most of you, I am struggling to get Tesseract to build under Visual Studio.
I have downloaded FreeOCR (which uses Tesseract) from http://www.FreeOCR.co.uk.
This app works fine and uses .NET.
I have been unable to contact the author, Ralph Richardson.
He seems to be compiling and building Tesseract into a .NET application.
Ralph, are you out there? We need your help!
JohnH
October 12, 2008 at 10:58 pm
I need help on how to use the Tesseract engine in Java. Please tell me how to download the necessary files for a Java application.
Tejas
May 28, 2009 at 9:50 pm
Hi,
I am looking for a good .NET OCR engine. I am a student and I need to develop an OCR application in C#. Please tell me how you used your OCR engine under .NET.
It would be kind if you could also mail me how to do it in Java.
Regards,
Mariusz
Mariusz
August 12, 2009 at 5:51 pm
Hi Mariusz,
Actually, I built it in VC++ and use the exe from the command prompt. As I have already mentioned, it does not give 100% accurate results, but it is a really good open-source option. From C#, I was calling the executable through a shell-command library and passing it the image to convert to text.
For Java, I don’t have much idea, but you can go to Google Code and search there.
Manish
August 12, 2009 at 6:07 pm
Do you have the library for Visual Studio in VB or C#?
ffffff
September 9, 2009 at 10:47 am