Information on websites is often stored in tables. Because this format is inconvenient for automation, we keep tabular data in Excel files instead, but manually copying it from a web page into an .xlsx file is time-consuming every time. What we need is a use case that transfers data from a web table to an Excel spreadsheet, and this example shows how.
To scrape a web table, you open a website with a table, select the table, and copy and paste it into an Excel file. It's that simple, but in your own project, when you need an updated table on every script run, you will quickly want to automate this action.
The example uses standard XPaths for the web table and a variable of the Table type. There are three simple steps: open the website, read the table data line by line, and go to the next page (if you need to scrape more than one page). After the third step, the process returns to the previous step. Once the table has been received, we create a new .xlsx file (using the standard Run application and Excel actions), or open an existing file, and save the table to a specific place in it.
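The loop described above can be sketched in plain Python. This is not the WorkFusion sample itself (which records these steps as bot actions); the function names are illustrative, and the stdlib `xml.etree.ElementTree` parser stands in for the browser, assuming well-formed markup:

```python
# A minimal sketch, assuming well-formed (XHTML-like) page markup.
# In the real sample, WorkFusion drives a browser and uses recorded XPaths.
import xml.etree.ElementTree as ET


def scrape_table(page_markup: str, table_xpath: str = ".//table") -> list[list[str]]:
    """Step 2: extract the table from one page, line by line."""
    root = ET.fromstring(page_markup)
    table = root.find(table_xpath)  # analogous to str_table_xpath
    rows = []
    for tr in table.findall(".//tr"):
        # Each child of <tr> is a <td>/<th> cell; collect its text.
        rows.append([(cell.text or "").strip() for cell in tr])
    return rows


def scrape_pages(pages: list[str], num_how_pages: int) -> list[list[str]]:
    """Steps 2-3 repeated: read a page, then move on to the next one."""
    all_rows = []
    for page in pages[:num_how_pages]:  # analogous to num_how_pages
        all_rows.extend(scrape_table(page))
    return all_rows
```

In the sample, "go to the next page" is a click on the element behind `str_btn_next_xpath`; here it is modeled simply as advancing through a list of pages.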
Installation and Getting Started
Extract the Table scraping folder and drop it into your WorkFusion Studio workspace (C:\Users\name\workfusion-workspace2\rpae_project).
Open the sample in WorkFusion Studio.
Set default values for the following variables:
str_table_xpath – the XPath to a table element on the page;
str_btn_next_xpath (optional) – the XPath to the next page button;
num_how_pages – the number of pages you want to scrape; if you want to scrape only one page, set this variable to "1", in which case str_btn_next_xpath can be left unset;
str_xlsx_file_path – the path to the directory where you want to create a new .xlsx file or open an existing one, for example: C:\TablesFolder (without a slash at the end);
str_xlsx_file_name – the file name; if a file with that name does not exist, it will be created, for example: testFile (without an extension at the end).
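The last two variables combine into one target file. A small stdlib sketch (illustrative only; the actual sample does this inside WorkFusion's Excel actions, and the function name here is hypothetical) shows how the directory without a trailing slash and the name without an extension resolve to a full .xlsx path, and how the create-or-open decision falls out of a simple existence check:

```python
from pathlib import Path


def resolve_xlsx_target(str_xlsx_file_path: str, str_xlsx_file_name: str) -> tuple[Path, bool]:
    """Join the directory (no trailing slash) and base name (no extension)
    into a full .xlsx path; report whether the file already exists, i.e.
    whether the bot should open it rather than create it."""
    target = Path(str_xlsx_file_path) / f"{str_xlsx_file_name}.xlsx"
    return target, target.exists()
```

For example, `resolve_xlsx_target(r"C:\TablesFolder", "testFile")` points at `C:\TablesFolder\testFile.xlsx`; writing the scraped table into that workbook is then handled by the sample's Excel actions.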