Overview

The OCR action group is intended for automatic image recognition and writing the recognized text to a Recorder variable of string type.

The OCR action is NOT recorded automatically and should be added to the recording manually.

Use case

In an automated task, you may need to copy text from an image, a desktop application, or an applet such as Java or Adobe Flash. Such text cannot be copied directly with the keyboard or mouse. Instead, you can add the image to the OCR action, or take a screenshot in it, and the OCR plugin recognizes and extracts the text.

Limitations

  • The OCR action recognizes only images that contain Latin and numeric characters.
  • The page limit is 1k pages per license.
  • Only image files are supported as input (TIFF, PNG, JPG, GIF).

Using OCR Action

You can add the OCR action manually when editing a recording by dragging it from the Actions Library to the Actions Flow.

OCR Action Properties

When the OCR action is added to the Actions Flow, you can set the action properties.

The OCR action has the following properties:

  • Capture new image – the image for the action is captured by the Recorder
  • Choose new image – the image for the action is chosen from the existing image files on your machine
  • Put OCR result into variable – the Recorder variable that receives the text recognized from the image by OCR
  • Advanced
    • Wait – a delay before the action starts (in milliseconds)
  • Comments – description of the action

Adding Images to Recording

There are two ways to add an image to your recording for recognition: you can capture an image or add an existing image to the recording.

Capturing New Image

  1. Click Capture new image in the Action Properties window.
  2. A 5-second countdown starts in the bottom-right corner of your screen.

    Note

    The countdown duration is adjustable in the RPA Recorder Preferences (see more information).

  3. During the countdown, switch to the application from which you want to capture the image.
  4. When the countdown stops, select the region to be captured.
  5. The image is captured and displayed in the Action Properties window.
  6. Now you can adjust two regions: the anchor region, which explicitly marks a place on the image with static information, and the capture region, which contains the text to be recognized.
    1. The anchor region excludes irrelevant or dynamic elements and gives the bot a reference point on the screen from which to calculate the distance to the capture region.
    2. The capture region is a placeholder where the text to be recognized appears.
  7. Click the image in the Action Properties.

    The Set the Anchor and Capture regions window opens with the Anchor region (1) and the Capture region (2) highlighted.
  8. Use the spinners to mark the static information as the Anchor region (1) and select the placeholder for text as the Capture region (2).

    Hint

    You can increase or decrease the value in a spinner by 1 with the Up/Down Arrow keys or by 10 with the Page Up/Page Down keys.

  9. Click OK to complete the procedure.

Choosing New Image

This option allows you to choose an existing image and add it to the OCR action.

  1. Go to the folder where your images are stored.
  2. Select an image (or images) to add to the recording.
  3. Copy and paste, or drag and drop, the image(s) into the folder with your recording in the Media Files window.
  4. Choose the Copy files option.

    The Copy files option is recommended because the files are then stored in your RPA Express workspace and can be managed with the Media Files browser. The Link to files option only creates links to the files in their external location, which can cause problems that the Copy files option avoids.

    The image(s) are copied to your recording.

  5. Click Choose new image, select the image in the Choose image window, and click OK. The image is added to the OCR action.
  6. Adjust the anchor and capture regions. Click the image to open the window where the regions are set. Use the spinners to mark the static information as the Anchor region and place the Capture region over the area where the text to be recognized can appear.
  7. Click OK to complete the procedure.

Choosing Recorder Variable

A Recorder variable is needed to receive the results of image recognition by OCR. The OCR output is text, so the variable should be declared as a string.

  1. Create a Recorder variable and set its type to String in the Recorder Variables.
  2. In the OCR Action Properties window, assign the Recorder variable to receive the text recognized by OCR.

Selecting Languages

Available from SPA 9.4.

Within the OCR action settings, you can select a specific language that is used in the text you are recognizing.

  1. Go to Languages > Choose language.
  2. Check the language, or several languages if the text to be recognized contains more than one.

Using OCR Plugin

Note that the OCR plugin, like the other bot task plugins, can be used in the Code Perspective mode only. To switch to Code Perspective, either click the Code button in the top-right corner of the WorkFusion Studio window or enter 'Code' in the Quick Access field and press Enter.

You can use the <ocr> plugin to recognize text in images with the help of the WorkFusion OCR service. For that, the <ocr> plugin must contain one or more <ocr-image> plugins, nested in <ocr> either directly or indirectly.

OCR plugin example
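<!-- Store the recognized text in the ocrResult variable -->
<var-def name="ocrResult">
  <ocr>
    <ocr-image>
      <!-- The image to recognize is read from the image variable -->
      <var name="image"/>
    </ocr-image>
  </ocr>
</var-def>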
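
Because <ocr> accepts one or more <ocr-image> plugins, a single call can in principle take several images. The fragment below is only a sketch of that case: the variable names firstPage and secondPage are hypothetical, and how the results of several images are combined in ocrResult is not covered here.

```xml
<var-def name="ocrResult">
  <ocr>
    <!-- firstPage and secondPage are hypothetical variables that hold image data -->
    <ocr-image>
      <var name="firstPage"/>
    </ocr-image>
    <ocr-image>
      <var name="secondPage"/>
    </ocr-image>
  </ocr>
</var-def>
```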

To learn more about the <ocr> plugin, go to WorkFusion Studio Help > Help contents > Machine Task Plugins > OCR Plugin.

With the OCR plugin, you can also convert a document into another format. The supported export formats are as follows:

  • txt – plain text (the default format)
  • html – an HTML page
  • pdfSearchable – a PDF file with searchable text
  • xml – a file that contains characters and words along with their location in the original document (coordinates/frames)
  • xmlForCorrectedImage – the same as xml, except that the location is taken from the processed/adjusted document

To convert a document, define it as an <ocr-image> in the code and specify the export format parameter for the output file in the <export> result string. For example, a PDF can be converted to HTML by using the html export format.
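
A minimal sketch of the PDF-to-HTML case, assuming the format is set through the export-format attribute of <ocr> and that the document is passed in the same way as an image. The variable names htmlResult and pdfDocument are hypothetical.

```xml
<!-- htmlResult is a hypothetical variable that receives the converted output -->
<var-def name="htmlResult">
  <!-- html is one of the supported export formats; txt is used when none is given -->
  <ocr export-format="html">
    <ocr-image>
      <!-- pdfDocument is a hypothetical variable that holds the source document -->
      <var name="pdfDocument"/>
    </ocr-image>
  </ocr>
</var-def>
```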

The export format parameter can contain up to three export formats, separated by commas, for example: <ocr export-format="xmlForCorrectedImage,pdfSearchable">.