We have moved to a new documentation platform. This section is no longer supported. For the up-to-date information, see Web scraping workflow.

Script Overview

Environment

  • Windows OS that meets System Requirements
  • Any of these browsers: Google Chrome, Mozilla Firefox, or Internet Explorer
  • MS Excel (version doesn't matter)
  • MS Outlook (version doesn't matter) with the email account configured and logged in for the bot to be able to send emails

Data and Logic

Input Data

  • Period of dates: [7 days before current date] - [current date]
  • Excel template that should be copied to any convenient folder. The bot should have access to this folder.

Output Data

  • Excel file with the list of press releases

Common Logic

The bot opens the website in the default browser and scrapes the data in background mode, so during this operation you see only the open web page in the browser. After scraping, the bot keeps only the press releases published between seven days before today and the current date. It then prepares the data, pastes it into the Excel template, and saves the file. Finally, the bot opens Outlook, creates a new email, attaches the Excel file, and sends it to the address stored in the user_email variable.


Script Variables

| Variable name | Type | Description | Default value |
|---|---|---|---|
| current_date | DateTime | The current date, used to calculate the date period | |
| file_path | String | The path to the Excel template where the data will be saved | |
| final_table | Table | The data from the site, already filtered to the latest seven days | |
| press_release_date | DateTime | The press release date from the web page | |
| secgov_url | String | The website URL | https://www.sec.gov/news/pressreleases |
| start_date | DateTime | The start date required to calculate the period | |
| subject | String | The subject of the Outlook email with the attachment to be sent to the recipient | |
| temporary_table | Table | All the data scraped (Date, Title, Release No, Link) from the last web page of https://www.sec.gov/news/pressreleases | |
| user_email | String | The email address to send the Excel file to | |

Script Workflow

Calculate period for press releases

Description: The group of actions calculates the period of dates from seven days before today to the current date. The whole group is reusable in other projects.

How To Reuse

To reuse the script, change Period.ofWeeks(1) on the sixth line of the script to the period you would like to calculate. For example:

  • Period.ofWeeks(2) calculates the date two weeks before today
  • Period.ofDays(3) calculates the date three days before today

The first action saves the current date to the current_date variable.

The Groovy script in action 3 subtracts seven days from the date in the current_date variable and saves the result to the start_date variable.
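The date calculation described above can be sketched in plain Java, which Groovy also accepts. This is a minimal illustration rather than the script shipped with the bot: the class name PeriodSketch, the method name startOf, and the use of LocalDate for the DateTime variables are assumptions.

```java
import java.time.LocalDate;
import java.time.Period;

class PeriodSketch {
    // Hypothetical helper: returns the date that lies one look-back period
    // before the given date.
    static LocalDate startOf(LocalDate currentDate, Period lookBack) {
        return currentDate.minus(lookBack);
    }

    public static void main(String[] args) {
        // Stands in for the current_date variable.
        LocalDate currentDate = LocalDate.now();
        // Stands in for the start_date variable: seven days before today.
        LocalDate startDate = startOf(currentDate, Period.ofWeeks(1));
        System.out.println(startDate + " - " + currentDate);
    }
}
```

Swapping Period.ofWeeks(1) for Period.ofWeeks(2) or Period.ofDays(3) changes the look-back window in the same way as editing the script line mentioned above.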

Open website, scrape and filter data

Description: The group of actions is required to scrape information from the website https://www.sec.gov/ and then filter it, so only press releases for the latest seven days are saved to the final_table variable. 

Exception handling: Exception handling is added so that the script can open the website in whichever default browser is installed on your machine (Google Chrome, Mozilla Firefox, or Internet Explorer).

Open website: Action 7 opens the website from the secgov_url variable.

Web Element: Actions 8-11 save the Date, Title, Release Number, and press release URL from the website to List variables using XPath:

  • date_column
  • title_column
  • link_column
  • release_no_column

Expression Value: Actions 12-15 add the previously scraped data from the List variables to the Table variable temporary_table. All the press releases from the last page of the website are stored in this variable.
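Conceptually, this step turns four parallel column lists into one table of rows. The sketch below shows that idea in Java under stated assumptions: the class TableSketch, the method toRows, and the placeholder values are hypothetical, and the column order follows the temporary_table description (Date, Title, Release No, Link).

```java
import java.util.ArrayList;
import java.util.List;

class TableSketch {
    // Combines four column lists of equal length into rows of
    // [date, title, releaseNo, link].
    static List<List<String>> toRows(List<String> dates, List<String> titles,
                                     List<String> releaseNos, List<String> links) {
        int n = dates.size();
        if (titles.size() != n || releaseNos.size() != n || links.size() != n) {
            throw new IllegalArgumentException("column lists must have the same length");
        }
        List<List<String>> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add(List.of(dates.get(i), titles.get(i), releaseNos.get(i), links.get(i)));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Placeholder values, not real scraped data.
        List<List<String>> temporaryTable = toRows(
                List.of("Feb 8, 2019", "Feb 1, 2019"),
                List.of("Title A", "Title B"),
                List.of("No-1", "No-2"),
                List.of("https://example.com/a", "https://example.com/b"));
        temporaryTable.forEach(System.out::println);
    }
}
```

The length check matters because an XPath that misses one element would otherwise shift every following row out of alignment.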


For Each: All the actions nested in the For Each loop are performed for each row in the Table variable temporary_table.

Exception Handling: Exception handling is added because the dates on the website are written in a custom format. To compare these dates with the start_date variable (seven days before today), the bot converts them into DateTime variables. Because the dates are written in a different format, the conversion can throw an exception, which the exception handling catches.

IF-Else: Action 65 compares each press release date with the start date (seven days before today). If the press release date is later than or equal to the start date, the bot adds the row to the final_table variable.
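The loop, the guarded date conversion, and the comparison can be sketched together as follows. This is an illustration only: the class FilterSketch, the method filterRecent, and in particular the date pattern "MMM d, yyyy" are assumptions, so adjust the pattern to the format the website actually uses.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class FilterSketch {
    // Assumed site format such as "Feb 8, 2019"; not confirmed by the source.
    static final DateTimeFormatter SITE_FORMAT =
            DateTimeFormatter.ofPattern("MMM d, yyyy", Locale.ENGLISH);

    // Keeps rows whose first cell parses to a date on or after startDate.
    static List<List<String>> filterRecent(List<List<String>> rows, LocalDate startDate) {
        List<List<String>> finalTable = new ArrayList<>();
        for (List<String> row : rows) {
            try {
                LocalDate releaseDate = LocalDate.parse(row.get(0).trim(), SITE_FORMAT);
                if (!releaseDate.isBefore(startDate)) {
                    finalTable.add(row);
                }
            } catch (DateTimeParseException e) {
                // The date did not match the expected format, so the row is skipped.
            }
        }
        return finalTable;
    }

    public static void main(String[] args) {
        // Placeholder rows, not real scraped data.
        List<List<String>> temporaryTable = List.of(
                List.of("Feb 8, 2019", "Title A"),
                List.of("Jan 20, 2019", "Title B"),
                List.of("unknown", "Title C"));
        System.out.println(filterRecent(temporaryTable, LocalDate.of(2019, 2, 1)));
    }
}
```

Skipping rows that fail to parse is one possible policy; logging them instead would make format changes on the website easier to notice.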

Save data in Excel file

Description: The group of actions is required to paste the data from the final_table variable to an Excel spreadsheet and set appropriate values to necessary cells.

  1. Open Spreadsheet: The bot opens the Excel spreadsheet using the file_path variable.
  2. Set Range: The bot pastes the value of the final_table variable to the Excel spreadsheet. The range starts from cell A2, as cells A1-D1 contain the column titles (Date, Title, Link, Release No).
  3. Set Cell Value: The final actions take values from string variables (Date, Title, Link, Release No) and paste them into specific cells in Excel.
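Because row 1 holds the column titles, a table with n rows and four columns occupies the range A2:D(n+1). The arithmetic can be sketched as below; the class RangeSketch and its methods are hypothetical helpers for illustration and are not part of the bot.

```java
class RangeSketch {
    // Converts a 1-based column index to an Excel column name (1 -> A, 27 -> AA).
    static String columnName(int index) {
        StringBuilder name = new StringBuilder();
        while (index > 0) {
            int remainder = (index - 1) % 26;
            name.insert(0, (char) ('A' + remainder));
            index = (index - 1) / 26;
        }
        return name.toString();
    }

    // Returns the range that a table occupies when pasted below a header row.
    static String targetRange(int rowCount, int columnCount) {
        if (rowCount < 1 || columnCount < 1) {
            throw new IllegalArgumentException("table must have at least one row and one column");
        }
        return "A2:" + columnName(columnCount) + (rowCount + 1);
    }

    public static void main(String[] args) {
        // Five press releases with four columns each.
        System.out.println(targetRange(5, 4)); // prints A2:D6
    }
}
```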

Create specific subject for Outlook email

Description: The group of actions is required to create a specific subject for Outlook email to be sent.

Date Format: Actions 75-76 convert the DateTime variables start_date and current_date to string variables, so they can be used in the email subject.

Join Strings: Actions 77-78 join the variables to produce the email subject. The first action joins text_for_subject and start_date_string and saves the result to the subject_start variable, so it looks like 'Press releases for Feb 01, 2019'.

The second action joins subject_start with the string form of current_date, so the final result looks like 'Press releases for Feb 01, 2019 - Feb 08, 2019'.
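The formatting and joining steps can be sketched in Java as follows. The pattern "MMM dd, yyyy" is inferred from the sample subject above, while the class SubjectSketch, the method buildSubject, the " - " separator, and the use of LocalDate are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class SubjectSketch {
    // Pattern inferred from the sample subject, for example "Feb 01, 2019".
    static final DateTimeFormatter SUBJECT_FORMAT =
            DateTimeFormatter.ofPattern("MMM dd, yyyy", Locale.ENGLISH);

    static String buildSubject(String textForSubject, LocalDate startDate, LocalDate currentDate) {
        // First join: prefix text plus the formatted start date.
        String subjectStart = textForSubject + startDate.format(SUBJECT_FORMAT);
        // Second join: add the separator and the formatted current date.
        return subjectStart + " - " + currentDate.format(SUBJECT_FORMAT);
    }

    public static void main(String[] args) {
        String subject = buildSubject("Press releases for ",
                LocalDate.of(2019, 2, 1), LocalDate.of(2019, 2, 8));
        System.out.println(subject); // prints Press releases for Feb 01, 2019 - Feb 08, 2019
    }
}
```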

Send email with attachment via Outlook 

Description: The group of actions is required to open Outlook and to send the email with the press releases attached. The whole group is reusable in other projects. 

How to reuse

To reuse this group, change the values of these variables:

  • user_email: the email address to send to, for example, test@test.com
  • subject: the subject of the email
  • file_path: the path to the file you want to attach, for example, C:\RPAExpress\invoices.xlsx

Actions 80-84 open the Run window, type 'Outlook', and then open Outlook.

With the help of the Window action, the bot switches to the specified window and ignores any random popups.

Actions 85-110 send an email with the specific subject and the attached file to the address from the user_email variable. These actions use the Enter Keystrokes, Click Mouse (via Inspector), and Window actions.

Exception handling is added because some window titles and elements differ between Outlook versions. The script works for Outlook versions from 2007 to 365:

  1. Action 85 adds the exception handling due to the window title difference in Outlook versions.
  2. Action 92 adds the exception handling due to the difference in the Subject selector in Outlook versions.
  3. Action 96 adds another exception handling as the Attach file selector differs in Outlook versions.
  4. Action 101 adds the exception handling as the flow for attaching a file differs depending on the Outlook version.
  5. Action 106 adds the exception handling as the Send button selector differs depending on the Outlook version.
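The pattern behind these exception handlers is "try one variant, and on failure fall back to the next". A generic sketch of that pattern in Java is shown below; the class FallbackSketch, the method firstSuccessful, and the window title strings are hypothetical and do not correspond to the bot's actual actions or to real Outlook titles.

```java
import java.util.List;
import java.util.function.Supplier;

class FallbackSketch {
    // Runs each attempt in order and returns the first result that does not throw.
    static <T> T firstSuccessful(List<Supplier<T>> attempts) {
        RuntimeException last = null;
        for (Supplier<T> attempt : attempts) {
            try {
                return attempt.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next variant
            }
        }
        throw new IllegalStateException("no attempt succeeded", last);
    }

    // Hypothetical stand-in for "switch to the window with this title".
    static String switchToWindow(String title, String actualTitle) {
        if (!title.equals(actualTitle)) {
            throw new IllegalStateException("window not found: " + title);
        }
        return title;
    }

    public static void main(String[] args) {
        String actual = "title variant B"; // placeholder for the installed version's title
        List<Supplier<String>> attempts = List.of(
                () -> switchToWindow("title variant A", actual),
                () -> switchToWindow("title variant B", actual));
        System.out.println(firstSuccessful(attempts)); // prints title variant B
    }
}
```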

Close Outlook 

Description: The group of actions is required to close Outlook when the email is sent. The whole group is reusable in other projects. 

The bot enters Alt+F4 to close Outlook.

A Wait action gives the email time to be sent before Outlook closes.

Exception handling is added due to the window title difference in various Outlook versions.
