Manish Pansiniya’s Blog

.NET, C#, Javascript, ASP.NET and lots more…:)

Tesseract OCR Library – Successfully compiled in Windows :)

with 45 comments

Today I was given a project to build OCR software. After some googling, I concluded that the Tesseract library is the one to use. The library is open source and available for both Windows and Linux, and it ships with a Visual Studio project. I have compiled it with Visual Studio .NET and also with Visual Studio 6.0. The build produces tesseract.exe, which runs successfully on my PC and converts a TIFF image into text. The output is not perfect, but we can use it in our project.

If anybody wants the instructions, please mail me or post your queries and questions here.
=============================
Updated on 20 Mar 2008
Following is the link to my compiled code:
http://cid-7ec550791692ecb9.skydrive.live.com/self.aspx/Tesseract/tesseract-1.03.zip

Also, Google has updated Tesseract and provides the code at the following link:

http://code.google.com/p/tesseract-ocr/downloads/list
=============================

Updated on 5 Jun 2008

Following are the links which may be useful for compiling under .NET. I haven’t tried them yet.

http://groups.google.com/group/tesseract-ocr/browse_thread/thread/dec2ca5ce4d5c325/b676b481590dc105?lnk=gst&q=.net#b676b481590dc105

http://www.pixel-technology.com/freeware/tessnet/

==============================

Updated on 23 Mar 2009

I believe that the following URL will resolve all your issues. This OCR was updated on 14th April 2008. I think you should give it a shot :)

http://code.google.com/p/tesseract-ocr/downloads/list

Written by Manish

March 3, 2007 at 3:08 pm

Posted in .NET, Uncategorized

45 Responses

  1. Hi,
    I am working on an OCR application, so I tried to build the open-source Tesseract 1.03 in Microsoft Visual Studio 2005 (.NET Framework 2.0). But it shows the list of errors below. Please help me. If you have an exe file or compilable source code, please send it to me with a description.
    Looking forward to your reply….

    Thank you very much
    Barathi

    Errors list

    Error 10 error C2666: ‘pow’ : 6 overloads have similar conversions e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\classify\clusttool.cpp 147
    Error 30 error C2666: ‘pow’ : 6 overloads have similar conversions e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\classify\clusttool.cpp 147
    Error 420 error C2440: ‘=’ : cannot convert from ‘const char *’ to ‘char *’ e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\display\pgedit.cpp 799
    Error 629 error C2440: ‘=’ : cannot convert from ‘const char *’ to ‘char *’ e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\textord\tordmain.cpp 162
    Error 630 error C2440: ‘=’ : cannot convert from ‘const char *’ to ‘char *’ e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\textord\tordmain.cpp 172

    Barathi

    March 12, 2007 at 7:14 am
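
For readers who hit the same two compiler errors, here is a minimal sketch of the usual cause of each and one way around it. It is not taken from the Tesseract sources: `power_of_two` and `find_extension` are hypothetical helpers made up for illustration, and the actual lines in clusttool.cpp, pgedit.cpp and tordmain.cpp may look different. C2666 typically appears when the arguments to `pow` do not match a single overload exactly (for example an integer mixed with a floating-point value), and C2440 when the result of a function returning `const char *` is assigned to a plain `char *`.

```cpp
#include <cmath>
#include <cstring>

// Hypothetical helper (not from Tesseract): casting both arguments to double
// leaves exactly one matching std::pow overload, which avoids C2666.
double power_of_two(int exponent) {
    return std::pow(static_cast<double>(2), static_cast<double>(exponent));
}

// Hypothetical helper (not from Tesseract): the input is read-only, so the
// pointer that receives the search result is declared const as well.
// Assigning it to a plain char* is what triggers C2440.
const char* find_extension(const char* name) {
    const char* dot = std::strrchr(name, '.');  // const in, const out
    return dot ? dot : "";                      // empty string if no dot
}
```

If the pointer really has to be written through, copying the string into a writable buffer is safer than casting the `const` away.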

  2. Will send it to you tomorrow.

    Manish Pansiniya

    March 12, 2007 at 2:53 pm

  3. Barathi

    Please download it from the link below. Please note that this code is for .NET.

    http://www.sharebigfile.com/file/109699/tesseract-1-03-zip.html

    I have changed the code a bit to remove the errors.

    Manish Pansiniya

    March 13, 2007 at 6:23 am

  4. Great job on compiling. How do you actually invoke the exe? passing commandline parameters?

    CP

    March 16, 2007 at 5:21 am

  5. Basically, I haven’t tested the .NET executable, but the VC++ executable worked for me. Yes, I am invoking the exe by passing parameters to it:
    tesseract [imagefile] [textfile]

    But not every image can be processed by this exe. So you need to download ImageMagick and use its convert.exe to convert existing images to TIFF. The generated TIFF is then passed to tesseract so that it can OCR the image successfully.

    Manish Pansiniya

    March 17, 2007 at 9:56 am
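
The two-step flow described above (ImageMagick first, then tesseract) could be driven from a program roughly like this. This is only a sketch under assumptions: the helper names are hypothetical, both `convert` and `tesseract` are assumed to be on the PATH, and no special ImageMagick options are passed because the thread does not say which ones are needed. The usage message quoted further down the thread names the second argument `outputbase`; as far as I know Tesseract appends `.txt` to it, so the base name is passed without an extension here.

```cpp
#include <cstdlib>
#include <string>

// Wrap a path in double quotes so that paths containing spaces survive.
std::string quote(const std::string& path) {
    return "\"" + path + "\"";
}

// Command line for ImageMagick: convert <input image> <output tif>
std::string convert_command(const std::string& input, const std::string& tif) {
    return "convert " + quote(input) + " " + quote(tif);
}

// Command line for Tesseract: tesseract <image file> <output base name>
std::string tesseract_command(const std::string& tif,
                              const std::string& output_base) {
    return "tesseract " + quote(tif) + " " + quote(output_base);
}

// Run both steps in order; returns true only if both commands exit with 0.
bool ocr_image(const std::string& input, const std::string& tif,
               const std::string& output_base) {
    if (std::system(convert_command(input, tif).c_str()) != 0) return false;
    return std::system(tesseract_command(tif, output_base).c_str()) == 0;
}
```

For example, `ocr_image("scan.png", "scan.tif", "scan")` would be expected to leave the recognised text in `scan.txt` if both tools succeed.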

  6. I have just downloaded the compiled Tesseract Windows exe complete with a user interface from the website below in case this is of any use to you. This version also opens any Tiff compression images.

    http://www.softi.co.uk/tess.htm

    Regards

    Jason

    Jason Fullerton

    March 25, 2007 at 10:09 pm

  7. I have tried to download your modified Tesseract OCR library, but the link didn’t work. Can you post this again, please?

    Anonymous

    October 25, 2007 at 8:19 am

  8. I was able to compile it using Studio 2005 but I have some problems running tesseract2.01.exe on Windows 2k3 Server

    for example when I want to scan the phototest.tif file which goes with src I have the following in tesseract.log

    Unable to load unicharset file tesseract2.01.exe/tessdata/eng.unicharset

    I have run all the .exe files in training folder. All except cnTraining.exe which gives me unhandled exception.

    I tried to find some data to put it into tessdata folder but then I have another list of errors in log file

    Could you help me out with that? :) Thanks!

    p.s. I tried to compile the cnTraining to be able to Debug it but I got LINK errors..

    Nick

    October 25, 2007 at 10:06 pm

  9. Uh.. I got /tessdata from project source zip file. I run tesseract and it works. Now I have to figure out how to get rid of LINK errors while compiling the cnTraining project :)

    Error 1 error LNK2019: unresolved external symbol “int __cdecl getopt(int,char * * const,char const *)” (?getopt@@YAHHQAPADPBD@Z) referenced in function “void __cdecl ParseArguments(int,char * *)” (?ParseArguments@@YAXHPAPAD@Z) cnTraining.obj
    Error 2 error LNK2019: unresolved external symbol “void __cdecl assert(int)” (?assert@@YAXH@Z) referenced in function “void __cdecl DoError(int,char const *)” (?DoError@@YAXHPBD@Z) danerror.obj
    Error 3 fatal error LNK1120: 2 unresolved externals .\..\..\debug/cnTraining.exe

    It would be great to get some help! Thanks!

    Nick

    October 25, 2007 at 10:36 pm
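
A note on the first linker error: `getopt` is a POSIX function and is not part of the Microsoft C runtime, so a Visual Studio project has to bring its own implementation. The sketch below is a deliberately small stand-in, not the code Tesseract ships: the `my_` prefix and the option letters in the usage note are assumptions for illustration, and it covers only short options. The unresolved `assert(int)` probably means the code declares `assert` as an ordinary function instead of using the macro from `<assert.h>`, but I have not checked the source.

```cpp
#include <cstring>

// Minimal stand-in for POSIX getopt (short options only). The names are
// prefixed with my_ so they cannot clash with a real getopt if one exists.
char* my_optarg = 0;  // value of the last option that takes an argument
int my_optind = 1;    // index of the next argv element to examine

int my_getopt(int argc, char* const argv[], const char* optstring) {
    static int pos = 1;  // position inside the current "-abc" cluster
    my_optarg = 0;
    if (my_optind >= argc) return -1;
    char* arg = argv[my_optind];
    if (arg[0] != '-' || arg[1] == '\0') return -1;  // first non-option
    if (std::strcmp(arg, "--") == 0) { ++my_optind; return -1; }

    char opt = arg[pos];
    const char* spec = std::strchr(optstring, opt);
    if (opt == ':' || spec == 0) {  // option letter not in optstring
        if (arg[++pos] == '\0') { ++my_optind; pos = 1; }
        return '?';
    }
    if (spec[1] == ':') {  // option requires a value
        if (arg[pos + 1] != '\0') {
            my_optarg = &arg[pos + 1];      // value attached: -ofile
        } else if (my_optind + 1 < argc) {
            my_optarg = argv[++my_optind];  // value in next element: -o file
        } else {
            ++my_optind; pos = 1;
            return '?';                     // value is missing
        }
        ++my_optind; pos = 1;
        return opt;
    }
    if (arg[++pos] == '\0') { ++my_optind; pos = 1; }  // end of cluster
    return opt;
}
```

Called in a loop such as `while ((c = my_getopt(argc, argv, "o:v")) != -1)`, it returns each option letter in turn and leaves `my_optind` pointing at the first non-option argument.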

  10. Nick, must you use the .NET code of Tesseract? I have an .exe file running and tested on my PC. If you need that source code, please let me know and I will upload it. I haven’t tested the .NET code, and I don’t think I can for the next two weeks because of a tight schedule. Please let me know your thoughts.

    Manish Pansiniya

    October 26, 2007 at 6:17 am

  11. Nick, I have compiled cnTraining in .NET and it is working fine on my machine. Do you need that executable? Otherwise, try to compile it with Visual C++ 6.0. Maybe that is the issue.

    Manish Pansiniya

    October 26, 2007 at 6:31 am

  12. Hi, I have downloaded the Tesseract code, compiled it in .NET and produced the correct .exe.

    However, I do not know how to produce a text file from my .tif. Please can you help?

    I tried running from command line:
    c:\tesserect pic.tif resultfile.txt

    Baber

    October 27, 2007 at 12:35 am

  13. You need to follow some steps here, and the following link should be useful. You need to convert the TIFF image to another TIFF format using ImageMagick.
    http://www.howtoforge.com/ocr_with_tesseract_on_ubuntu704

    Manish Pansiniya

    October 29, 2007 at 9:45 am

  14. >>I have compiled cnTraining in .NET and its working fine with my machine

    As I understand that program should give me default tessdata files…anyway I am still not able to compile it by my own. The one which I have gives me an exception when I run it :( It would be great if you provide me your source files. Thanks!

    Nick

    October 30, 2007 at 10:52 pm

  15. Well, I have finally compiled my code, but the problem is still there. It gives me an exception when running. Could it be because I run it on a w2k3 server machine? I am going to check it on XP, though.

    I am also planning to use it in my project… just wondering whether it is hard to teach it or not :)

    Nick

    October 30, 2007 at 11:40 pm

  16. I believe it will run on 2k3. Just use ImageMagick on the image before running Tesseract on it.

    Manish Pansiniya

    October 31, 2007 at 1:11 pm

  17. tesseract.exe runs well without any problem. I am experiencing problem with running cnTraining.exe :)

    Nick

    October 31, 2007 at 7:10 pm

  18. Nick, I haven’t used that exe yet :-P .

    Manish Pansiniya

    November 1, 2007 at 8:41 am

  19. Manish Pansiniya,
    Can you please update your link:
    http://www.sharebigfile.com/file/109699/tesseract-1-03-zip.html
    The link is broken…
    Is it possible to compile it under the .NET Compact Framework?

    squid

    December 11, 2007 at 5:31 pm

  20. Hi Manish,

    Can you please give us the link to download the .NET compiled version of the Tesseract OCR?

    Thanks

    Vellayan

    Vellayan

    December 18, 2007 at 7:44 am

  21. I am on leave currently. Please check back after two days; I will surely upload it. Also, do you know any public server where I can put the file and it would be kept for a year or so? The URLs I have only last about a month. Thanks.

    Manish Pansiniya

    December 19, 2007 at 11:51 am

  22. I am on leave currently. Please check back after two days; I will surely upload it. Also, do you know any public server where I can put the file and it would be kept for a year or so? The URLs I have only last about a month. Thanks.

    Manish Pansiniya

    December 19, 2007 at 11:52 am

  23. I have a place where I can put it for over 1 year.

    Please send an email to second.softwares@gmail.com

    Second softwares

    December 20, 2007 at 10:41 am

  24. Hello expert,
    I have the same project with a small difference: the image has some text written in it, and I want to read only that text and generate a text file. So please give me some help on how I can read this text from the image file.

    varinder sharma

    December 24, 2007 at 5:46 am

  25. Hi. Is it possible to use the Tesseract library directly in C#, without invoking any executables? Thank you.

    Felipe Dornelas

    January 21, 2008 at 4:04 pm

  26. Where can I download the Tesseract OCR library? I need to use it in a VB.Net project. Can I use it in VB.Net? Thanks

    luca

    January 23, 2008 at 1:33 pm

  27. Luca, you can use it from any programming language via the executable.

    I haven’t tested the .NET ported code, and I think it is hard to work on the code. I will upload the code to some site and let you know in this post.

    Manish Pansiniya

    January 29, 2008 at 4:24 pm

  28. Please help, I am looking to use this engine in my Java project.

    zeid

    February 4, 2008 at 8:59 am

  29. Please help me, I have the same project but on Linux. How do I do that?

    zeid

    February 14, 2008 at 11:58 am

  30. Hi, I am also building OCR software.
    Could you please send me the code, or upload it again?
    Thank you very much.

    John

    February 22, 2008 at 10:59 am

  31. To John:
    I think you’ll find all you need here: http://groups.google.com/group/tesseract-ocr/msg/c3d22084770d4727?

    usarskyy

    March 25, 2008 at 3:35 pm

  32. Please, I would like to use Tesseract with C#. Can you help me?
    Thanks (I am writing to you from Brazil)

    Paulo

    April 28, 2008 at 5:09 pm

  33. I am also looking to use Tesseract with C# without invoking an exe.
    Any suggestions?

    Sutekh

    May 13, 2008 at 11:49 am

  34. Sutekh, I think the library is built so that we generally use the executable to do the needed work. One would need to look into the code to make it work as a library. Regarding C#, I haven’t done it, as I said before. But you can build a DLL in VC++ and use it from C# somehow.

    Manish Pansiniya

    May 13, 2008 at 2:07 pm

  35. Hi Manish,

    Today I was given a project to build OCR software using the Tesseract library. After googling, I found your blog. I read that you already have a compiled .NET version, so can you please give me the link to download the .NET compiled version of the Tesseract OCR?

    Thanks & Regards

    Parmesh. A

    Parmesh

    May 15, 2008 at 6:49 am

  36. As I said, Parmesh, I have compiled it but I could not run it. It gives me an error and I have not had time to debug it.

    Manish Pansiniya

    May 15, 2008 at 7:00 am

  37. Can you give me the link to download it so that I can try it myself and let you know?

    Thanks in advance

    Parmesh

    May 15, 2008 at 8:45 am

  38. Hi,
    when I compiled the same code in VS2005, it showed 32 errors and more than 2000 warnings. Can you please give me a solution?

    Anonymous

    June 11, 2008 at 5:46 am

  39. Hi, I haven’t tried your .NET compiled version; I will try it now. I downloaded a version and compiled it, and I get this error when invoking it:
    Tessedit:Error:Usage:Tessedit imagename outputbase [configfile [[+|-]varfile]…]

    Signal_exit 25 ABORT. LocCode: 3 AbortCode: 0

    Anonymous

    October 1, 2008 at 5:33 pm

  40. Hi,
    I am looking for an OCR component (or source) for my .NET application. Like most of you, I am struggling to get Tesseract to build under Visual Studio. I have downloaded FreeOCR (which uses Tesseract) from http://www.FreeOCR.co.uk. This app works fine and uses .NET. I have been unable to contact the author, Ralph Richardson. He seems to be compiling and building Tesseract into a .NET application. Ralph, are you out here? We need your help!

    JohnH

    October 12, 2008 at 10:58 pm

  41. I need help on how to use the Tesseract engine in Java. Please tell me how to download the necessary stuff for a Java application.

    Tejas

    May 28, 2009 at 9:50 pm

  42. Hi,

    I am looking for a good .NET OCR engine. I am a student and I need to develop a C# OCR application. Please tell me how you used your OCR engine under .NET.

    It would be kind if you could also mail me how to do it in Java.
    Regards,
    Marusz

    Mariusz

    August 12, 2009 at 5:51 pm

    • Hi Mariusz,

      See, actually I built it in VC++ and use the exe from the command prompt. As I have already mentioned, it does not give 100% accurate results, but yes, it is a really good open-source option. From C#, I was calling the executable through a shellcommand library and passing it the image to convert to text.
      For Java, I don’t have much idea, but you can go to Google Code and search there.

      Manish

      August 12, 2009 at 6:07 pm

  43. Do you have the library for Visual Studio in VB or C#?

    ffffff

    September 9, 2009 at 10:47 am

