How to use Tesseract in Text Grab

For many years Tesseract has consistently been the state of the art for OCR models. When I originally built Text Grab, my research turned up many tools that used Tesseract as their model. I chose to go with the APIs built into Windows because they were easier to target, made language management simpler, and offered extra features like bounding boxes and line segmentation.

Nevertheless, Tesseract remains a very high-quality OCR model, so I added the option to use it if it is already installed on your system. Once Tesseract is installed, the option to use it becomes available. You can then choose between the installed Tesseract models and the Windows models on the fly during a Fullscreen Grab or when scanning files from the Edit Text Window.

How to enable Tesseract in Text Grab:

  1. Go to the Text Grab Settings
  2. Click on the “Tess” tab on the right rail
  3. Install Tesseract using the provided winget command
    • The installer is provided by Mannheim University Library
  4. When the install completes, close and reopen the Text Grab settings window
  5. Click on “Tess” again
  6. Turn on “Enable Tesseract within Text Grab”
  7. Launch a Fullscreen Grab
  8. Select a language with Tesseract in the name
  9. Select a region of your screen to OCR
  10. Done!
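
If no Tesseract languages show up in step 8, it can help to confirm that Tesseract is really installed and to see which language models it has. Here is a minimal Python sketch of that check, not how Text Grab itself detects Tesseract. It assumes the tesseract executable is on your PATH, which the installer may not set up for you, and the function names are my own for illustration. The only Tesseract feature it relies on is the --list-langs command line option.

```python
import shutil
import subprocess
from typing import List, Optional


def parse_language_list(output: str) -> List[str]:
    """Extract language names from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The first line is normally a header such as:
    #   List of available languages in "...tessdata/" (2):
    if lines and lines[0].lower().startswith("list of"):
        lines = lines[1:]
    return lines


def installed_tesseract_languages() -> Optional[List[str]]:
    """Return the installed language names, or None if tesseract is not on PATH."""
    exe = shutil.which("tesseract")  # assumption: the install folder is on PATH
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--list-langs"], capture_output=True, text=True, check=False
    )
    # Older Tesseract builds print the list to stderr instead of stdout.
    return parse_language_list(result.stdout or result.stderr)


if __name__ == "__main__":
    languages = installed_tesseract_languages()
    if languages is None:
        print("Tesseract was not found on PATH.")
    else:
        print("Installed Tesseract languages:", ", ".join(languages))
```

If this reports that Tesseract was not found even though you installed it, the install folder is probably just missing from PATH. That does not necessarily mean Text Grab cannot find it.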

I hope this was useful. Let me know if there is anything else you want to see in Text Grab!

Joe
