The OCR action group is intended for automatically recognizing text in images and writing the recognized text to a Recorder variable of string type.
The OCR action is NOT recorded automatically and should be added to the recording manually.
In an automated task, you may need to copy textual information from an image, a desktop application, or an applet (Java, Adobe Flash, and the like). Such information cannot be copied directly with the keyboard or mouse, so you can add an image or take a screenshot in the OCR action, and the OCR plugin recognizes and extracts the text.
- The OCR action can recognize images that contain Latin letters and numeric characters only.
- The number of pages is limited to 1,000 per license.
- Only image files are supported as input (TIFF, PNG, JPG, GIF).
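The input-format restriction above can be checked before an image is handed to the OCR action. The following is a minimal sketch in Python, not part of the Recorder; the function name `is_supported_image` and the exact set of extensions (including the `.tif` and `.jpeg` spellings) are assumptions made for illustration.

```python
from pathlib import Path

# Extensions assumed to correspond to the formats listed above (TIFF, PNG, JPG, GIF).
# The alternative spellings .tif and .jpeg are an assumption, not taken from the product.
SUPPORTED_EXTENSIONS = {".tiff", ".tif", ".png", ".jpg", ".jpeg", ".gif"}


def is_supported_image(file_name: str) -> bool:
    """Return True if the file extension matches one of the listed image formats."""
    return Path(file_name).suffix.lower() in SUPPORTED_EXTENSIONS
```

A check like this only looks at the file name; it does not verify that the file content is a valid image.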
Using OCR Action
You can add the OCR action manually when editing a recording by dragging it from the Actions Library to the Actions Flow.
OCR Action Properties
When the OCR action is added to the Actions Flow, you can set its properties.
The OCR action has the following properties:
- Capture new image – the image for the action is captured by the Recorder
- Choose new image – the image for the action is chosen from the existing image files on your machine
- Put OCR result into variable – the Recorder variable that receives the text recognized from the image by OCR
- Wait – a delay before the action starts (in milliseconds)
- Comments – description of the action
Adding Images to Recording
Capturing New Image
- Click Capture new image in the Action Properties window.
A 5-second countdown starts in the bottom-right corner of your screen.
The countdown duration is adjustable in the RPA Recorder Preferences.
- During the countdown, switch to the application to capture the image from.
- When the countdown stops, select the region to be captured.
- The image is captured and displayed in the Action Properties window.
- Now you can adjust the anchor region to explicitly mark a place on the image with static information and the capture region containing the text to be recognized.
- The Anchor region excludes irrelevant or dynamic elements and gives the bot a reference point on the screen from which to calculate the distance to the Capture region.
- The Capture region is a placeholder where the text to be recognized appears.
- Click the image in the Action Properties.
The Set the Anchor and Capture regions window opens with the Anchor region (1) and the Capture region (2) highlighted.
Use the spinners to mark the static information as the Anchor region (1) and select the placeholder for text as the Capture region (2).
You can increase or decrease the value in a spinner by 1 with the Up/Down Arrow keys or by 10 with the Page Up/Page Down keys.
- Click OK to complete the procedure.
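The role of the two regions can be sketched as follows: the capture region is treated as an offset from the anchor, so when the anchor is found at a different position on the screen, the capture region moves with it. This is a minimal Python illustration of that idea under stated assumptions, not the Recorder's actual implementation; the `Region` type and the `locate_capture_region` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle in screen pixels (hypothetical helper type)."""
    x: int
    y: int
    width: int
    height: int


def locate_capture_region(anchor_at_design: Region,
                          capture_at_design: Region,
                          anchor_at_runtime: Region) -> Region:
    """Shift the capture region by the same distance the anchor has moved."""
    dx = anchor_at_runtime.x - anchor_at_design.x
    dy = anchor_at_runtime.y - anchor_at_design.y
    return Region(capture_at_design.x + dx, capture_at_design.y + dy,
                  capture_at_design.width, capture_at_design.height)


# Example: the anchor (a static label) was recorded at (100, 50) and the
# capture region at (180, 50). At playback the label is found at (130, 90).
anchor_design = Region(100, 50, 60, 20)
capture_design = Region(180, 50, 120, 20)
anchor_runtime = Region(130, 90, 60, 20)
capture_runtime = locate_capture_region(anchor_design, capture_design, anchor_runtime)
print(capture_runtime)
```

This is why the anchor should cover static content only: if the anchored content changes between runs, there is no stable reference point from which to derive the capture region.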
Choosing New Image
The option allows you to choose an existing image and add it to the OCR action.
- Go to the folder where your images are stored.
- Select an image (or images) to add to the recording.
- Copy and paste or drag and drop the image(s) to the folder with your recording in the Media Files window.
Choose the Copy files option.
The Copy files option is recommended because the files are then stored in your Express Edition workspace and can be managed with the Media Files browser. The Link to files option only creates links to files in an external location, which can cause problems that copying avoids.
The image(s) are copied to your recording.
- Click Choose new image, select the image in the Choose image window, and click OK. The image is added to the OCR action.
- Adjust the anchor and capture regions. Click the image to open the window for setting the anchor and capture regions. Use the spinners to mark the static information as the Anchor region and place the Capture region over the area where the text to be recognized can appear.
- Click OK to complete the procedure.
Choosing Recorder Variable
A Recorder variable is needed to receive the results of image recognition by OCR. The output from OCR is text, so the variable should be declared as a string.
- Create a Recorder variable and set its type to String in the Recorder Variables.
- In the OCR Action Properties window, assign the Recorder variable to receive the text recognized by OCR.
Available from WorkFusion Intelligent Automation Cloud 2.4.0.
Within the OCR action settings, you can select a specific language that is used in the text you are recognizing.
- Go to Languages > Choose language.
- Select the language, or several languages if the text to be recognized contains more than one.
OCR Use Case
This sample recording demonstrates how you can extract text from images using OCR.
To run it, you need to:
- download and unzip the recording folder
- copy the folder to your workspace (default location is C:\Users\%USERNAME%\workfusion-workspace\rpae_project)
- refresh the Media Files tab in RPA Recorder and open OCR-action.rpae
- Before playing the recording, go to Window > Preferences > WorkFusion Studio > RPA Recorder and make sure that Enable typing in any window without explicitly switching is enabled.
The ocr_result variable on the Recorder Variables tab is used to save the extracted text.
- Opening Notepad Group contains Enter Keystrokes actions used to open Notepad.
The current key combinations work with an English OS only. If your OS uses another language, you might need to change the language settings.
Win + R opens the Run window.
The bot types "notepad" into the window and presses ENTER to start the application.
Win + UP maximizes the window.
- OCR Group contains actions used to perform actual text recognition.
ALT + H opens the help menu.
The "a" key opens the About Notepad page.
The OCR action extracts text and saves it into the ocr_result variable. Recapture the image on your PC and adjust the Anchor and Capture regions before playing the script.
ALT + F4 closes the About Notepad page.
The bot types the OCR-ed result into the Notepad window.
Using OCR Plugin
Note that the OCR plugin, like other bot task plugins, is used in Code Perspective mode only. To switch to Code Perspective, either click the Code button in the top-right corner of the WorkFusion Studio window or enter 'Code' in the Quick Access field and press Enter.
You can use the <ocr> plugin to recognize text in images with the help of the WorkFusion OCR service. For that, the <ocr> plugin must contain one or more <ocr-image> plugins, directly or indirectly nested in it.
To learn more about the <ocr> plugin, go to WorkFusion Studio Help > Help contents > Machine Task Plugins > OCR Plugin.
With the OCR plugin, you can also convert a document into another format. The supported export formats are as follows:
- txt – plain text (the default format)
- html – HTML page
- pdfSearchable – PDF file with searchable text
- xml – file contains characters/words along with their location in the original document (coordinates/frames)
- xmlForCorrectedImage – the same as xml, except location is taken from a processed/adjusted document
To convert a document, define it as <ocr-image> in the code and specify the export format parameter for the output file in the <export> result string. For example, the code below shows how to convert PDF to HTML.
The export format parameter can contain up to three export formats, separated with commas.
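The rule for the export format parameter (at most three formats, separated with commas) can be checked before the value is passed to the plugin. The sketch below is plain Python, not plugin code; the function name `parse_export_formats` is hypothetical, and the choices to ignore whitespace around commas and to treat format names as case-sensitive are assumptions rather than documented behavior.

```python
from typing import List

# Format names as listed in this section.
SUPPORTED_FORMATS = ("txt", "html", "pdfSearchable", "xml", "xmlForCorrectedImage")
MAX_FORMATS = 3


def parse_export_formats(value: str) -> List[str]:
    """Split a comma-separated export format parameter and validate it."""
    formats = [part.strip() for part in value.split(",") if part.strip()]
    if not formats:
        return ["txt"]  # txt is described above as the default format
    if len(formats) > MAX_FORMATS:
        raise ValueError(
            f"at most {MAX_FORMATS} export formats are allowed, got {len(formats)}"
        )
    unknown = [name for name in formats if name not in SUPPORTED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported export format(s): {', '.join(unknown)}")
    return formats
```

Under these assumptions, `parse_export_formats("txt, html,xml")` returns the three names in order, while a value with four formats or an unlisted name raises `ValueError`.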