Manish Pansiniya's Blog

.NET, C#, Javascript, ASP.NET and lots more…:)

Tesseract OCR Library – Successfully compiled in Windows :)

with 91 comments

Today I was given a project to build OCR software. After some googling, I concluded that the Tesseract library is the one to use. It is open source and available on both Windows and Linux, and it ships with a Visual Studio project. I have compiled it with Visual Studio .NET and also with Visual Studio 6.0. The build produces tesseract.exe, which runs successfully on my PC and converts a TIFF image into text. The output is not perfect, but it is good enough to use in our project.

If anybody wants the instructions, please mail me or post your queries and questions here.
=============================
Updated on 20 Mar 2008
Following is the link to my compiled code:
http://cid-7ec550791692ecb9.skydrive.live.com/self.aspx/Tesseract/tesseract-1.03.zip

Google has also updated Tesseract and published the code at the following link:

http://code.google.com/p/tesseract-ocr/downloads/list
=============================

Updated on 5 Jun 2008

The following links may be useful for compiling under .NET. I haven’t tried them yet.

http://groups.google.com/group/tesseract-ocr/browse_thread/thread/dec2ca5ce4d5c325/b676b481590dc105?lnk=gst&q=.net#b676b481590dc105

http://www.pixel-technology.com/freeware/tessnet/

==============================

Updated on 23 Mar 2009

I believe the following URL will resolve all your issues. This OCR build was updated on 14th April 2008. I think you should give it a shot 🙂

http://code.google.com/p/tesseract-ocr/downloads/list
==============================

Updated on 05 Oct 2010

Tesseract 2.04 is now available, with source code that compiles in Visual C++ 2008 Express Edition, at http://code.google.com/p/tesseract-ocr/downloads/list.
Even 3.0 (Preview 1) has been released with a Windows executable, which you can call externally from .NET to convert images to text.

Written by Manish

March 3, 2007 at 3:08 pm

Posted in .NET, Uncategorized

91 Responses

  1. Hi,
    I am building an OCR application, so I tried to compile the open-source Tesseract 1.03 in Microsoft Visual Studio 2005 (.NET Framework 2.0). But it shows the list of errors below. Please help me. If you have an exe file or compilable source code, please send it to me with a description.
    Looking forward to your reply….

    Thank you very much
    Barathi

    Errors list

    Error 10 error C2666: ‘pow’ : 6 overloads have similar conversions e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\classify\clusttool.cpp 147
    Error 30 error C2666: ‘pow’ : 6 overloads have similar conversions e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\classify\clusttool.cpp 147
    Error 420 error C2440: ‘=’ : cannot convert from ‘const char *’ to ‘char *’ e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\display\pgedit.cpp 799
    Error 629 error C2440: ‘=’ : cannot convert from ‘const char *’ to ‘char *’ e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\textord\tordmain.cpp 162
    Error 630 error C2440: ‘=’ : cannot convert from ‘const char *’ to ‘char *’ e:\downloads\xtesseract-1.03\xtesseract-1\tesseract-1.03\textord\tordmain.cpp 172

    Barathi

    March 12, 2007 at 7:14 am

    • Hi,

      how do you resolve those errors?..
      I am actually experiencing those errors and I don’t know what to do..

      francis

      September 8, 2010 at 8:21 am

  2. Will send it to you tomorrow.

    Manish Pansiniya

    March 12, 2007 at 2:53 pm

    • can you send it to me also?..

      francis

      September 8, 2010 at 9:00 am

  3. Barathi

    Please download it from the link below. Please note that this code is for .NET.

    http://www.sharebigfile.com/file/109699/tesseract-1-03-zip.html

    I have changed the code a bit to remove the errors.

    Manish Pansiniya

    March 13, 2007 at 6:23 am
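
For reference, the two kinds of errors reported in comment 1 (C2666 on pow and C2440 on const char *) usually have small fixes. The sketch below is only an illustration under assumptions: the function names IntPower and FindExtension are made up for this example, and it is not the actual change that was made to clusttool.cpp, pgedit.cpp or tordmain.cpp.

```cpp
#include <cmath>
#include <cstring>

// C2666: a call such as pow(intValue, intValue) is ambiguous on older
// MSVC compilers because several overloads match equally well.
// Casting the arguments to double selects pow(double, double).
// (IntPower is a hypothetical helper, not a Tesseract function.)
double IntPower(int base, int exponent) {
    return std::pow(static_cast<double>(base), static_cast<double>(exponent));
}

// C2440: in C++, searching a const string returns a const char*,
// so the pointer that receives the result must be const as well.
// (FindExtension is a hypothetical helper, not a Tesseract function.)
const char* FindExtension(const char* file_name) {
    const char* dot = std::strrchr(file_name, '.');
    return dot;  // null when the name contains no '.'
}
```

The same idea applies wherever the compiler points: keep the arithmetic in double for pow, and declare the receiving pointer as const char * instead of casting the const away.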

  4. Great job on compiling. How do you actually invoke the exe? passing commandline parameters?

    CP

    March 16, 2007 at 5:21 am

  5. Basically, I haven’t tested the .NET executable, but the VC++ executable works for me. Yes, I invoke the exe by passing parameters to it:
    tesseract [imagefile] [textfile]

    But not every image can be processed by this exe. So you need to download ImageMagick and use its convert.exe to convert your existing images to TIFF. Tesseract can then OCR the generated TIFF successfully.

    Manish Pansiniya

    March 17, 2007 at 9:56 am
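
The two-step flow described in comment 5 (convert the image with ImageMagick, then run tesseract on the result) can be sketched roughly as below. This is a minimal sketch under assumptions: the helper names are made up, both tools are assumed to be on the PATH, and the `-compress None` option is used on the assumption that an uncompressed TIFF is what this Tesseract build needs.

```cpp
#include <string>

// Wrap a path in double quotes so that spaces survive the shell.
std::string Quote(const std::string& path) {
    return "\"" + path + "\"";
}

// Step 1: ask ImageMagick's convert to rewrite the input image
// as an uncompressed TIFF.
std::string BuildConvertCommand(const std::string& input_image,
                                const std::string& output_tif) {
    return "convert " + Quote(input_image) + " -compress None " + Quote(output_tif);
}

// Step 2: run tesseract on the TIFF. The second argument is the
// output name as in "tesseract [imagefile] [textfile]".
std::string BuildTesseractCommand(const std::string& tif,
                                  const std::string& output_name) {
    return "tesseract " + Quote(tif) + " " + Quote(output_name);
}
```

Each string can be passed to std::system(command.c_str()), and a non-zero return value treated as a failure. Some Tesseract builds treat the second argument as a base name and append .txt to it, so it is worth checking which file actually appears.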

  6. I have just downloaded the compiled Tesseract Windows exe complete with a user interface from the website below in case this is of any use to you. This version also opens any Tiff compression images.

    http://www.softi.co.uk/tess.htm

    Regards

    Jason

    Jason Fullerton

    March 25, 2007 at 10:09 pm

  7. I have tried to download your modified Tesseract OCR library, but the link didn’t work. Can you post this again, please?

    Anonymous

    October 25, 2007 at 8:19 am

  8. I was able to compile it using Studio 2005 but I have some problems running tesseract2.01.exe on Windows 2k3 Server

    for example when I want to scan the phototest.tif file which goes with src I have the following in tesseract.log

    Unable to load unicharset file tesseract2.01.exe/tessdata/eng.unicharset

    I have run all the .exe files in training folder. All except cnTraining.exe which gives me unhandled exception.

    I tried to find some data to put it into tessdata folder but then I have another list of errors in log file

    Could you help me out with that? 🙂 Thanks!

    p.s. I tried to compile the cnTraining to be able to Debug it but I got LINK errors..

    Nick

    October 25, 2007 at 10:06 pm

  9. Uh.. I got /tessdata from project source zip file. I run tesseract and it works. Now I have to figure out how to get rid of LINK errors while compiling the cnTraining project 🙂

    Error 1 error LNK2019: unresolved external symbol “int __cdecl getopt(int,char * * const,char const *)” (?getopt@@YAHHQAPADPBD@Z) referenced in function “void __cdecl ParseArguments(int,char * *)” (?ParseArguments@@YAXHPAPAD@Z) cnTraining.obj
    Error 2 error LNK2019: unresolved external symbol “void __cdecl assert(int)” (?assert@@YAXH@Z) referenced in function “void __cdecl DoError(int,char const *)” (?DoError@@YAXHPBD@Z) danerror.obj
    Error 3 fatal error LNK1120: 2 unresolved externals .\..\..\debug/cnTraining.exe

    It would be great to get some help! Thanks!

    Nick

    October 25, 2007 at 10:36 pm

  10. Nick, is it a must to use the .NET code of Tesseract? I have an .exe file running and tested on my PC. If you need that source code, please let me know and I will upload it. I haven’t tested the .NET code, and I believe I won’t be able to for the next 2 weeks because of a tight schedule. Please let me know your thoughts.

    Manish Pansiniya

    October 26, 2007 at 6:17 am

  11. Nick, I have compiled cnTraining in .NET and it is working fine on my machine. Do you need that executable? Otherwise, try compiling it in Visual C++ 6.0; maybe that is the issue.

    Manish Pansiniya

    October 26, 2007 at 6:31 am

  12. Hi, I have downloaded the Tesseract code, compiled it in .NET and produced the correct .exe.

    However, I do not know how to produce a text file from my .tif. Please can you help?

    I tried running from command line:
    c:\tesserect pic.tif resultfile.txt

    Baber

    October 27, 2007 at 12:35 am

  13. Here you need to follow some steps; the following link should be useful. You need to convert the TIFF image to another TIFF format using ImageMagick.
    http://www.howtoforge.com/ocr_with_tesseract_on_ubuntu704

    Manish Pansiniya

    October 29, 2007 at 9:45 am

  14. >>I have compiled cnTraining in .NET and its working fine with my machine

    As I understand that program should give me default tessdata files…anyway I am still not able to compile it by my own. The one which I have gives me an exception when I run it 😦 It would be great if you provide me your source files. Thanks!

    Nick

    October 30, 2007 at 10:52 pm

  15. Well, finally I have compiled my code, but the problem is still there. It gives me an exception when running. Could it be because I run it on a w2k3 server machine? I am going to check it with XP though.

    I am also planning to use it in my project… just wondering whether it is hard to teach it or not 🙂

    Nick

    October 30, 2007 at 11:40 pm

  16. I believe it will run on 2k3. Just use ImageMagick before running it on any image.

    Manish Pansiniya

    October 31, 2007 at 1:11 pm

  17. tesseract.exe runs well without any problem. I am experiencing problem with running cnTraining.exe 🙂

    Nick

    October 31, 2007 at 7:10 pm

  18. Nick, I haven’t used that exe yet :-P.

    Manish Pansiniya

    November 1, 2007 at 8:41 am

  19. Manish Pansiniya,
    can you please update your link:
    http://www.sharebigfile.com/file/109699/tesseract-1-03-zip.html
    the link is broken…
    is it possible to compile it under .NET compact framework?

    squid

    December 11, 2007 at 5:31 pm

  20. Hi Manish,

    Can you please give us the link to download the .NET compiled version of the Tesseract OCR?

    Thanks

    Vellayan

    Vellayan

    December 18, 2007 at 7:44 am

  21. I am on leave currently. Please check back after 2 days; I will surely upload it. Also, please tell me if you know of any public server where I can put the file and it will be kept for a year or so. The URLs I have only last a month or so. Thanks.

    Manish Pansiniya

    December 19, 2007 at 11:51 am

  23. I have a place where I can put it for over 1 year.

    Pls send an email to second.softwares@gmail.com

    Second softwares

    December 20, 2007 at 10:41 am

  24. Hello expert,
    I have the same project with a little bit of
    difference: the image has some text written in it, and I want to read only that text and generate a text file.
    So please give me some help on how I can read this
    text from the image file.

    varinder sharma

    December 24, 2007 at 5:46 am

  25. Hi. Is it possible to use the Tesseract library directly in C#, without invoking any executables? Thank you.

    Felipe Dornelas

    January 21, 2008 at 4:04 pm

  26. Where I can download Tesseract OCR Library? I need to use in a VB.Net project, can I use it in VB.Net ? Thanks

    luca

    January 23, 2008 at 1:33 pm

  27. Luca, you can use it from any programming language by calling the executable.

    I haven’t tested the .NET ported code and I think it is hard to work on the code. I will upload the code to some site and let you know in this post.

    Manish Pansiniya

    January 29, 2008 at 4:24 pm

  28. Please Help i am looking to use this engine in my project in JAVA Help

    zeid

    February 4, 2008 at 8:59 am

  29. please help me i have the same project but on linux how to do that

    zeid

    February 14, 2008 at 11:58 am

  30. Hi, i am aslo doing a ocr software.
    Could you please send me the codes.
    Or to upload it again.
    Thank you very much.

    John

    February 22, 2008 at 10:59 am

  31. 2 John:
    I think U’ll find all U need here: http://groups.google.com/group/tesseract-ocr/msg/c3d22084770d4727?

    usarskyy

    March 25, 2008 at 3:35 pm

  32. Please, I would like to use tesseract with C#, can you help me
    thanks(I Talk to you from Brazil)

    Paulo

    April 28, 2008 at 5:09 pm

  33. I also am looking to use tesseract with c# without invoking an exe.
    Any Suggestions?

    Sutekh

    May 13, 2008 at 11:49 am

  34. Sutekh, I think the library is designed such that we generally use the executable to do the work. One would need to look into the code to make it work otherwise. Regarding C#, I haven’t done it, as I said before. But you can build a DLL in VC++ and call it from C#.

    Manish Pansiniya

    May 13, 2008 at 2:07 pm

  35. Hi Manish,

    Today, I got a project to make OCR software using the Tesseract library. After googling, I
    found your blog. I read that you already have a compiled .NET version. So can you please give me the link to download the .NET compiled version of the Tesseract OCR?


    Thanks & Regards

    Parmesh. A

    Parmesh

    May 15, 2008 at 6:49 am

  36. As I said, Parmesh, I have compiled it but I could not run it. It gives me an error and I have not had time to debug it.

    Manish Pansiniya

    May 15, 2008 at 7:00 am

  37. can u give me link to download so that i wil try myself and let u know

    10x in adv

    Parmesh

    May 15, 2008 at 8:45 am

  38. hi
    when i compiled the same code in VS2005, it shows 32 errors and more than 2000 warnings can u please give me a sol.n?

    Anonymous

    June 11, 2008 at 5:46 am

  39. Hi i haven´t tried your .NET compiled version, i will try it now, i downloaded a version and compiled it and i get this error when invoking:
    Tessedit:Error:Usage:Tessedit imagename outputbase [configfile [[+|-]varfile]…]

    Signal_exit 25 ABORT. LocCode: 3 AbortCode: 0

    Anonymous

    October 1, 2008 at 5:33 pm

  40. Hi
    I am looking for an OCR component (or Source)
    for my .net Application.
    Like most of you I am struggling to get
    Tesseract to build under Visual Studio.
    I have downloaded FreeOCR (uses Tesseract) from http://www.FreeOCR.co.uk
    This app works fine and uses .net.
    I have been unable to contact the Author
    Ralph Richardson.
    He seems to be compiling and building
    Tesseract into a .net Application.
    Ralph are you out here we need your help!

    JohnH

    October 12, 2008 at 10:58 pm

  41. I need help on how to use tesseract engine in JAVA,Please tell me how to download necessary stuff for JAVA application.

    Tejas

    May 28, 2009 at 9:50 pm

    • Hi
      Tejas,and all experts

      i have seen your question in blog ,am very new to java,
      i need to work with tessaract ocr engine in my jsf project(no training),can any one help what are necessary things i need to install and any source code for reference.

      leela

      January 29, 2011 at 1:09 pm

  42. Hi,

    I am looking for good .NET OCR engine. I am student and I need to develop C# application, OCR application. Please instruct me how you used your OCR engine under .NET.

    It would be kind if you mail me also how to do it in Java.
    Regards,
    Marusz

    Mariusz

    August 12, 2009 at 5:51 pm

    • Hi Mariusz,

      Actually, I built it in VC++ and use the exe from the command prompt. As I have already mentioned, it does not give 100% accurate results, but it is a really good open-source option. From C#, I was calling the executable through a shell-command library and passing it the image to convert to text.
      For Java, I don’t have much idea. But you can go to Google Code and search there.

      Manish

      August 12, 2009 at 6:07 pm
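
The "call the executable, then pick up the text" approach from the reply above looks roughly like this. The sketch is in C++ to match the other code in this thread (the C# version would follow the same two steps), and it rests on assumptions: RunAndCollect and ReadTextFile are made-up helper names, and the caller is assumed to know which file the executable writes.

```cpp
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>

// Read a whole file into a string. Returns an empty string when the
// file cannot be opened.
std::string ReadTextFile(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Run an external command (for example "tesseract scan.tif out") and,
// if it reports success, return the contents of the file it wrote.
// Returns an empty string when the command fails.
std::string RunAndCollect(const std::string& command,
                          const std::string& result_file) {
    if (std::system(command.c_str()) != 0) {
        return std::string();
    }
    return ReadTextFile(result_file);
}
```

Going through the executable keeps the calling program independent of how Tesseract was compiled, at the cost of starting a process and writing a temporary file for every image.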

  43. Do you have the library for visual studio in vb or c#?

    ffffff

    September 9, 2009 at 10:47 am

  44. Hello Manish ji,
    i am using tesseract, but the OCR engine is not working properly in terms of getting wordlist and co-ordiantes. It is not getting wordlist atleast 40% for some documents which are clear in font. please guide me what i need to change in my tesseract application. thanq in advance.

    pratap

    December 21, 2009 at 10:58 am

  45. Hi manish,

    I am downloaded tesseract from your site. It is working properly . But I want tesseract in dll form to share with other application.So I made some changes to make it as visual C++ 6 MFC Appwizard(dll).Then following errors occuring

    \ccutil\scanutils.cpp(25) : fatal error C1083: Cannot open include file: ‘inttypes.h’: No such file or directory
    fatal error C1189: #error : include ‘stdafx.h’ before including this file for PCH Error executing cl.exe.

    can u help me to reslove this.

    Thanks and Regards
    vidhu

    vidhu

    February 4, 2010 at 11:02 am

  46. Hi Manish,

    I am trying to use the Tesseract 2.04 exe directly from the command line, but when I run “tesseract.exe test.tif out.txt” it shows “unable to load unicharset file ./tessdata/eng.unicharset”, and if I change the language to Spanish it shows “unable to load unicharset file ./tessdata/spa.unicharset”.

    Ernie

    March 2, 2010 at 5:45 pm

    • Hi Ernie,

      From your setup, it seems to be a path problem: Tesseract cannot find its tessdata folder. Put the tessdata folder in the location it expects. I cannot help you more, as I did this around 2 years ago. I will look at Tesseract’s progress and see if I can post something new on this that would be helpful.

      Manish

      Manish

      March 5, 2010 at 6:52 pm
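
One way to check the tessdata location before calling tesseract is sketched below. It is a rough check under assumptions: the helper names are made up, and it assumes the language file sits at [prefix]/tessdata/[lang].unicharset, which is the path shown in the error message above. The TESSDATA_PREFIX environment variable is read on the assumption that the installed build honours it as the parent folder of tessdata.

```cpp
#include <cstdlib>
#include <fstream>
#include <string>

// Build the path of the unicharset file for a language, given the
// folder that contains the tessdata directory.
std::string UnicharsetPath(std::string prefix, const std::string& lang) {
    if (!prefix.empty() && prefix.back() != '/' && prefix.back() != '\\') {
        prefix += '/';
    }
    return prefix + "tessdata/" + lang + ".unicharset";
}

// True when the language file can be opened for reading.
bool LanguageDataPresent(const std::string& prefix, const std::string& lang) {
    std::ifstream file(UnicharsetPath(prefix, lang).c_str(), std::ios::binary);
    return file.good();
}

// Folder taken from TESSDATA_PREFIX, or the current directory when the
// variable is not set.
std::string DataPrefixFromEnv() {
    const char* value = std::getenv("TESSDATA_PREFIX");
    return value ? std::string(value) : std::string("./");
}
```

Calling LanguageDataPresent(DataPrefixFromEnv(), "eng") before starting the OCR run gives a clearer message than the unicharset error when the folder is in the wrong place.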

  47. Hi Manish,

    Can you help me how to use tesseract ocr on vc++ 2008 mfc aplication? how to linking the dll?

    Febri

    March 26, 2010 at 3:45 pm

  48. When I run tesseract on the command prompt like
    tesseract image_name.tiff out_put_file_name.txt
    this error occurs:
    Tesseract Open Source OCR Engine
    TIFFOpen: mypune0015.tif: Cannot open.
    tesseract:Error:Read of file failed:mypune0015.tif
    Segmentation fault
    What can I do? Please help me.

    amol

    November 30, 2010 at 4:52 pm

    • I think you will need to save the TIFF file as an uncompressed file. Could you please check whether your file is uncompressed or not?

      Manish

      November 30, 2010 at 5:02 pm
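
Whether a TIFF is uncompressed can be read from the file itself: the Compression tag (number 259) in the first image directory is 1 for uncompressed data. The sketch below parses just enough of the header to find that tag. It is a minimal illustration with made-up function names, it works on bytes already loaded into memory, and it looks at the first image directory only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read an unsigned integer of `size` bytes starting at `pos`,
// honouring the byte order declared in the TIFF header.
static std::uint32_t ReadUInt(const std::vector<std::uint8_t>& data,
                              std::size_t pos, std::size_t size,
                              bool little_endian) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < size; ++i) {
        std::size_t index = little_endian ? pos + size - 1 - i : pos + i;
        value = (value << 8) | data[index];
    }
    return value;
}

// Returns the Compression tag of the first image directory,
// 1 when the tag is absent (the TIFF default, meaning uncompressed),
// or -1 when the bytes do not form a readable TIFF header.
int TiffCompression(const std::vector<std::uint8_t>& data) {
    if (data.size() < 8) return -1;
    bool little_endian;
    if (data[0] == 'I' && data[1] == 'I') little_endian = true;
    else if (data[0] == 'M' && data[1] == 'M') little_endian = false;
    else return -1;
    if (ReadUInt(data, 2, 2, little_endian) != 42) return -1;

    std::size_t ifd = ReadUInt(data, 4, 4, little_endian);
    if (ifd + 2 > data.size()) return -1;
    std::size_t entries = ReadUInt(data, ifd, 2, little_endian);
    for (std::size_t i = 0; i < entries; ++i) {
        std::size_t entry = ifd + 2 + 12 * i;  // each entry is 12 bytes
        if (entry + 12 > data.size()) return -1;
        if (ReadUInt(data, entry, 2, little_endian) == 259) {
            return static_cast<int>(ReadUInt(data, entry + 8, 2, little_endian));
        }
    }
    return 1;
}
```

A result other than 1 means the file is compressed and should be re-saved (or converted with ImageMagick) before being handed to tesseract. Note that the "Cannot open" line in the log above can also simply mean that the file name or folder is wrong.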

  49. hi manish, iam unable to load ocr3.0 from paperfile.net which is a free software to convert images to docs getting the error message c:\Documents and Settings\All Users\Application Data\Tarmainstaller\{108A39BF-4ED1-4293-B11A-06BD521FB8F7}\Cache\DROPPED_20100101190241.tiz and also the below error msg giving the msg as error 11

    c:\Documents and Settings\All Users\Application Data\Tarma installer\{108A39BF-4ED1-4293-B11A-06BD521FB8F7}\Cache\FreeOCR_2.1.0.8_L075a6c69191ec1db_x86.exe

    srinivastripati

    February 3, 2011 at 8:44 am

  50. hi manish, chope u can help me load free ocr 3.0 which runs on tarma installer using tesseract the error message is as follws:c:\Documents and Settings\All Users\Application Data\Tarmainstaller\{108A39BF-4ED1-4293-B11A-06BD521FB8F7}\Cache\DROPPED_20100101190241.tiz
    i have a project to convert gif images to word plz reply ASAP

    srinivastripati

    February 3, 2011 at 8:47 am

  51. I DONOT HAVE ANY KNOWLEDGE ABOUT SOFWARES AND ALL THESE STUFF CAN U SUGGEST ME WHAT TO RT NOW

    srinivastripati

    February 3, 2011 at 8:49 am

  52. Hi Manish,
    I need to build tesseract-1.03 in MFC with VS 6.0. Can you please provide me the info regarding what needs to be done to compile and build it?

    Regards,
    Anshuman

    Anshuman

    March 18, 2011 at 11:29 pm

  53. hi expert
    i would like to ask few questions, as i am now doing my school project on OCR
    honestly i am lost on how to execute the source code given by the tesseract 3.0 which i have downloaded. when i unzip it, it gives me alot of C++ file which i dont know how to test it out.

    besides that i am currently working on the VB.NET code and it is not tesseract

    would you like to give me some suggestion on how to start?
    thank you so much
    regards

    nana

    March 25, 2011 at 7:26 am

    • Hi Twins and shobhit,

      Actually, I am very busy now a days with our company works. So I could not help you. But if you can send me specific issues, might be I will help you into that.

      Manish

      April 7, 2011 at 10:34 pm

  54. hi manish,
    i have to make ocr for my college project but in c# and i am clueless about how to start and all other details. please help me with this.
    its kind of urgent.
    thanks in advance
    regards

    shobhit sarin

    April 7, 2011 at 1:02 am

  55. hi
    i totally understand your situation but i am not getting from where to start else i wont have troubled you. if u have source code of a simple ocr in c# please mail me or upload it. or tell me what are the different modules of the project like reading image or converting to text module.
    thanx
    regards

    shobhit

    April 8, 2011 at 12:13 am

  56. do you have instructions for using with eclipse on windows

    android

    April 8, 2011 at 10:47 pm

  57. Hello,

    I want an example in c++ of how to use tessearct

    guess22

    May 19, 2011 at 9:43 pm

    • Actually, it is better to use the executable and pass the parameters on the command line.

      Manish

      May 19, 2011 at 11:30 pm

  58. Thx for response
    but the purpose is to compile the library tesseract in windows xp and it ‘s fine I did it but now I want to use it I must develop in c++ a program that transform an image in text

    guess22

    May 20, 2011 at 6:35 pm

  59. i downloaded your compiled code..when i’m running the application its saying that dll files are missing from your computers what should i do?

    Abhisheck Badjatia

    June 9, 2011 at 4:41 pm

  60. Hı I have a project about Ocr on windows device.I used tessearct but it gave me error(İnit() function ).Can you help me ?

    tuba

    July 21, 2011 at 4:08 am

    • Hi Tuba,

      Sorry about that. But I am bit busy with the work. Could you post your problem to google groups if it helps.

      Manish

      July 21, 2011 at 10:42 pm

  61. could you please show how to use the Tesseract functions in a C++ program. I am not being able to link it with the VS project.
    I posted it on So but got no reply… http://stackoverflow.com/questions/6798278/build-error-with-c-api-of-tesseract-ocr

    arunirc

    July 24, 2011 at 5:42 pm

  62. am trying to download the tesseract free ocr siftware using the tarma installer. Instead of downloading to my harddrive, I was in the process of downloading to one of my thumb drives. During installation I received the following error: Error1006 while opening these files-The volume for a file has been externally altered so that the opening file is no longer valid.
    Are you familiar with this error, and can you assist me with completing my installation or do I have to install on my hard drive for software to work?

    del

    July 24, 2011 at 7:18 pm

  63. Hi,
    Could some one please tell me how to use the Tesseract OCR on the windows machine. I want to know how exactly it functions. Help needed.

    Misty

    October 31, 2011 at 5:56 pm

  64. Has anyone used this for OCR on PDF files?

    Rohit Sharma

    November 8, 2011 at 10:02 pm

  65. Is it possible compile tesseract as a win32 dll ? If yes, can someone guide me please.

    Debjit

    December 15, 2011 at 2:09 am

  66. can you please send me the source code on how to use tessaract in vb6 .0 and vb.net 2005. i will be gladly appreciate any help from you.thank you

    Jason

    December 15, 2011 at 8:45 am

  67. I prefer gocr to tesseract for OCR. See the difference here: http://www.seeingwithsound.com/ocr.htm

    jasa pembuatan web

    January 2, 2012 at 8:55 pm

  68. DROPPED_20100101190241.tiz what is this in ocr software
    that create a problem

    arvind

    February 26, 2012 at 11:47 pm

  69. hi all,
    if anybody has implemented java with TesseractOCR please do share with us.

    debrajmallick

    April 28, 2012 at 4:42 pm

  70. Hi Manish.. Am working on OCR project, Where i need to read the numbers from the image. After a lot search in google, have found ur blog & downloaded the project http://cid-7ec550791692ecb9.skydrive.live.com/self.aspx/Tesseract/tesseract-1.03.zip. As u said u successfully compilied this project in VS 2008. But when i compiled this in Visual studio 2008, am getting errors. Can u help me to solve these errors ??

    Guru

    June 4, 2012 at 1:27 pm

  71. could you also email me a latest version of you visual studio project. i am trying to use the teserat to develop an ocr software in vb 2010.

    reyniel macero

    June 21, 2012 at 6:56 am

  72. Hi,

    I have started working with tesseract but i can’t run it. I tried to work using command prompt on windows but it is saying “Can’t create output file”

    on cmd i am using: :tesseract sample2.tif output

    My whole term project is based on it so please help me ASAP.

    Thanks in advance.

    Ishant

    August 22, 2012 at 12:12 am

  73. I figured it out. If you are using Visual Studio 2010 with Windows Forms / the designer, you can add it easily this way with no issues

    1) add the following projects to your project ( i am warning you once, do not add the tesseract solution, or change any setting in the projects you add, unless you love to hate yourself )
    ccmain
    ccstruct
    ccutil
    classify
    cube
    cutil
    dict
    image
    libtesseract
    neural_networks
    textord
    viewer
    wordrec

    you can add the others but you don’t really want all that built into your project do you? naaa, build those separately

    2) go to your project properties and add libtesseract as a reference, you can now that it is visible as a project, this will make it so that your project builds fast without examining the millions of warnings within tesseract. [common properties]->[add reference]

    3) right click your project in the solution explorer and click project dependencies, make sure it is dependent on libtesseract or even all of them, it just means they build before your project.

    4) the tesseract 2010 visual studio projects contain a number of configuration settings aka release, release.dll, debug, debug.dll, it seems that the release.dll settings produce the right files. First, set the solution output to release.dll. Click your project properties. Then click configuration manager. If that is not available, do this, click the SOLUTION’s properties in the solution tree and click configuration tab, you will see a list of projects and the associated configuration settings. You will notice your project is not set to release.dll even though the output is. If you took the second route you still need to click configuration manager. Then you can edit the settings, click new on your projects settings and call it release.dll…exactly the same as the rest of them and copy the settings from release. Do the same thing for Debug, so that you have a debug.dll name copied from debug settings. wheew…almost done

    5) Don’t try to change tesseract’s settings to match yours….that won’t work ….and when the new release comes out you won’t be able to just “throw it in” and go. Accept the fact that in this state your new modes are Release.dll and Debug.dll. don’t stress out…you can go back when it is finished and remove the projects from your solution.

    6) Guess where the libraries and dll’s come out? in your project, you may or may not need to add the library directories. Some people say to dump all the headers into a single folder so they only need to add one folder to the includes but not me. I want to be able to delete the tesseract folder and reload it from the zips without extra work….and be fully ready to update in one move or restore it if I made a mess of the code. It’s a bit of work and you can do it with code instead of the settings which is the way i do it, but you should include all the folders that contain header files within the 2010 tesseract project folder and leave them alone.

    7) there is no need to add any files to your project. just these lines of code….. I have included some additional code that converts from one foreign data set to the tiff friendly version with no need to save / load file. aren’t I nice?

    8) now you can fully debug in debug.dll and release.dll, once you have successfully built it into your project even once you can remove all the added projects and it will be peeerfect. no extra compiling or errors. fully debugable, all natural.

    9) If I remember right, I could not get around the fact I had to copy the files in 2008/lib/ into my projects release folder….darn it.

    In my projects “functions.h” I put
    #pragma comment (lib, "liblept.lib" )
    #define _USE_TESSERACT_
    #ifdef _USE_TESSERACT_
    #pragma comment (lib, "libtesseract.lib" )
    #include
    #endif
    #include

    in my main project I put this in a class as a member:
    tesseract::TessBaseAPI *readSomeNombers;

    and of course I included “functions.h” somewhere

    then I put this in my classes constructor:
    readSomeNombers = new tesseract::TessBaseAPI();
    readSomeNombers ->Init(NULL, "eng" );
    readSomeNombers ->SetVariable( "tessedit_char_whitelist", "0123456789,." );

    then I created this class member function, plus a class member to serve as its output. Don’t hate, I don’t like returning variables. Not my style. Note that a Pix made with pixCreate() must be released with pixDestroy(), and the text returned by GetUTF8Text() must be freed with delete [], so the function does both before it returns.
    void Gaara::scanTheSpot()
    {
    Pix *someNewPix;
    char* outText;
    ostringstream tempStream;
    RECT tempRect;
    someNewPix = pixCreate( 200 , 40 , 32 );
    convertEasyBmpToPix( &scanImage, someNewPix, 87, 42 );

    readSomeNombers ->SetImage(someNewPix);
    outText = readSomeNombers ->GetUTF8Text();
    tempStream.str("");
    if( outText != NULL ) // GetUTF8Text() can return NULL on failure
    {
    tempStream << outText;
    }
    classMemeberVariable = tempStream.str();
    //pixWrite( "test.bmp", someNewPix, IFF_BMP );
    delete [] outText; // the text buffer is allocated with new[]
    pixDestroy( &someNewPix ); // release the Pix created above
    }

    The object that has the information that I want to scan is in memory and is pointed to by &scanImage. It is from the “EasyBMP” library but that is not important.

    I deal with that in a function in “functions.h” / “functions.cpp”. By the way, I am doing a little extra processing here while I am in the loop, namely thinning the characters, making the image black and white, and reversing black and white, which is unnecessary. At this phase in my development I am still looking for ways to improve the recognition, though for my purposes this has not yielded bad data yet. My view is to use the default Tess data for simplicity. I am acting heuristically to solve a very complex problem.
    void convertEasyBmpToPix( BMP *sourceImage, PIX *outputImage, unsigned startX, unsigned startY )
    {
    int endX = startX + ( pixGetWidth( outputImage ) );
    int endY = startY + ( pixGetHeight( outputImage ) );
    unsigned destinationX;
    unsigned destinationY = 0;
    for( int yLoop = startY; yLoop < endY; yLoop++ )
    {
    destinationX = 0;
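    // loop bound and isWhite() test reconstructed from the surrounding code
    for( int xLoop = startX; xLoop < endX; xLoop++ )
    {
    RGBApixel sourcePixel = sourceImage->GetPixel( xLoop, yLoop );
    if( isWhite( &sourcePixel ) )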
    {
    pixSetRGBPixel( outputImage, destinationX, destinationY, 0,0,0 );
    }
    else
    {
    pixSetRGBPixel( outputImage, destinationX, destinationY, 255,255,255 );
    }
    destinationX++;
    }
    destinationY++;
    }
    }
    bool isWhite( RGBApixel *image )
    {
    if(
    //destination->SetPixel( x, y, source->GetPixel( xLoop, yLoop ) );
    ( image->Red … Blue … Green …
    …
    #if (_MSC_VER >= 1200) //%%% vkr for VC 6.0
    typedef _int64 inT64;
    typedef unsigned _int64 uinT64;
    #else
    typedef long long int inT64;
    typedef unsigned long long int uinT64;
    #endif //%%% vkr for VC 6.0
    typedef float FLOAT32;
    typedef double FLOAT64;
    typedef unsigned char BOOL8;

    Kage.Sabaku.No.Gaara

    September 18, 2012 at 7:33 am

  74. I actually think about exactly why you called this specific posting, “Tesseract OCR Library – Successfully compiled in Window :
    ) Manish Pansiniyas Blog”. In any event . I personally
    appreciated the post!Many thanks-Margarita

  75. I am new to OCR. Can you tell me how to install it on Windows 7? I am developing a project in C++ which needs OCR. Please provide me an appropriate solution.
    Thanks and Regards,
    Vikky

    Vicky Patil

    March 6, 2013 at 9:52 pm

  76. “Tesseract OCR Library – Successfully compiled in
    Window 🙂 | Manish Pansiniya’s Blog” Roller Shade was indeed a very good blog post and I personally was truly satisfied to locate the blog. Thanks for the post,Roxana

    Kandi

    March 13, 2013 at 9:07 pm

  77. Mine is Windows 7 32-bit and I have Visual Studio 2012. Please tell me how to install Tesseract.

    shrinidhi

    September 5, 2013 at 10:29 pm

    • Shrinidhi, I think you should look at Google Code for this. I think they provide all the code needed to build it under Windows with VS. Hope this helps.

      Manish

      September 6, 2013 at 2:28 am

