Text recognition (OCR) in Android app using Tesseract example

In this example, we will recognize text in an Android app using Tesseract in Android Studio.
OCR in an Android app is straightforward with the Tesseract library. Tesseract for Android can be added as a dependency, and you can learn how to set up Tesseract in Android Studio in this tutorial.
This example continues from the previous part, where we detected text in an Android app using OpenCV. Now we will recognize that text, i.e. perform OCR, using Tesseract.
Using Tesseract for text recognition in Android is pretty simple. Here’s how you do it.

Using Tesseract for OCR in Android Studio:

  1. Initialize the TessBaseAPI with the data path (the directory containing the tessdata folder that holds the traineddata file) and set a suitable page segmentation mode.
  2. Pass the image you want to recognize text in to the TessBaseAPI instance as a Bitmap.
  3. Finally, call the getUTF8Text method on that instance; it returns the recognized text as a String.

How to use Tesseract with Android Studio will be explained properly in a separate post soon. This post only describes how to recognize text in an image using Tesseract.

Continuing from the previous post on Text Detection in Android using OpenCV, let’s make the following change.
In the text detection function, replace the following:

        for (int ind = 0; ind < contour2.size(); ind++) {
            rectan3 = Imgproc.boundingRect(contour2.get(ind));
            if (rectan3.area() > 0.5 * imgsize || rectan3.area() < 100
                    || rectan3.width / rectan3.height < 2) {
                Mat roi = new Mat(morbyte, rectan3);
                roi.setTo(zeos);

            } else
                Imgproc.rectangle(mRgba, rectan3.br(), rectan3.tl(),
                        CONTOUR_COLOR);
        }

With:

        for (int ind = 0; ind < contour2.size(); ind++) {
            rectan3 = Imgproc.boundingRect(contour2.get(ind));
            // Reset so a failed crop does not re-run OCR on the previous region's bitmap.
            bmp = null;
            try {
                // Crop the candidate text region out of the frame.
                Mat croppedPart = mIntermediateMat.submat(rectan3);
                // Tesseract takes a Bitmap, so convert the cropped Mat.
                bmp = Bitmap.createBitmap(croppedPart.width(), croppedPart.height(),
                        Bitmap.Config.ARGB_8888);
                Utils.matToBitmap(croppedPart, bmp);
            } catch (Exception e) {
                Log.d(TAG, "cropped part data error " + e.getMessage());
            }
            if (bmp != null) {
                doOCR(bmp);
            }
        }
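
Note that the replacement loop sends every bounding box to OCR, including the very large, very small, and tall boxes that the earlier loop blanked out. If you want to keep that filtering, one option is to move the checks into a small helper and call it before cropping. The sketch below is only an illustration: the class name `TextRegionFilter` and the method `isLikelyText` are hypothetical, and it assumes `imgsize` from the previous post is the pixel area of the frame.

```java
/** Hypothetical helper: decides whether a bounding box is worth sending to OCR. */
final class TextRegionFilter {
    private TextRegionFilter() {}

    /**
     * Mirrors the checks from the original loop: keep a box only if it covers
     * at most half of the image, has an area of at least 100 pixels, and is
     * at least twice as wide as it is tall.
     *
     * @param imageArea assumed to be the frame's width * height (imgsize)
     */
    static boolean isLikelyText(int width, int height, double imageArea) {
        if (width <= 0 || height <= 0) {
            return false; // degenerate box, nothing to recognize
        }
        double area = (double) width * height;
        boolean tooLarge = area > 0.5 * imageArea;
        boolean tooSmall = area < 100;
        // For positive ints this is equivalent to (width / height < 2).
        boolean tooNarrow = width < 2 * height;
        return !(tooLarge || tooSmall || tooNarrow);
    }
}
```

Inside the loop, the crop and the doOCR call could then be wrapped in `if (TextRegionFilter.isLikelyText(rectan3.width, rectan3.height, imgsize)) { ... }` so that Tesseract only sees boxes shaped like a line of text.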

And add the following method to our MainActivity:

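        private void doOCR(final Bitmap bitmap) {
            // getOCRResult belongs to our own TessOCR class (see the NOTE below).
            String text = mTessOCR.getOCRResult(bitmap);
            Log.d(TAG, "OCR result: " + text);
        }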
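
Recognition on each cropped region can take a while, so you may not want to run it on the thread that is processing camera frames. A minimal sketch of one way to do that in plain Java follows. The class `OcrWorker` is hypothetical, and the `recognizer` function stands in for whatever actually performs OCR (in this post, a call such as `mTessOCR.getOCRResult(bitmap)`). A single-thread executor is used so that calls to one OCR engine instance never overlap.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper: runs a recognizer function on one background thread. */
final class OcrWorker<T> {
    // One thread only, so recognition requests are processed one at a time.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Function<T, String> recognizer;

    OcrWorker(Function<T, String> recognizer) {
        this.recognizer = recognizer;
    }

    /** Queues an image for recognition and returns a handle to the result. */
    Future<String> submit(final T image) {
        Callable<String> task = new Callable<String>() {
            @Override
            public String call() {
                return recognizer.apply(image);
            }
        };
        return executor.submit(task);
    }

    /** Stops accepting new work; already queued images are still processed. */
    void shutdown() {
        executor.shutdown();
    }
}
```

In the activity, this might be created once as `new OcrWorker<Bitmap>(b -> mTessOCR.getOCRResult(b))` and shut down when the activity is destroyed. Anything that touches views with the recognized text still has to run on the UI thread, for example through `runOnUiThread`.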

We also need to declare a TessOCR field named mTessOCR and initialize it. Read the NOTE below.
NOTE:
mTessOCR is an instance of TessOCR, a Java class we created to initialize TessBaseAPI and wrap its calls. Check out how to use Tesseract with Android here: Creating android OCR app with Tesseract. Feel free to drop questions below. Happy Coding!


    2 thoughts on “Text recognition (OCR) in Android app using Tesseract example”

    1. Dear theCodeCity team,

      The link to the setup tutorial (where you said “you can learn how to setup Tesseract in Android Studio in this tutorial.”) seems to not be working anymore. Could you check for this issue and maybe update the link? Thank you very much in advance.

      Best regards,
      Tobias
