Extract Content from Excel

The Excel feature in the Basic Engine Robot is designed to extract data from an Excel document and use it for other automation tasks. Also, you can update an existing excel document with data. If you want to create a new document and you want some advanced styling and formatting in it, create a template document in a dedicated Excel application and then fill data into it using the robot. You can use built-in number formats in the spreadsheet.

Note To see the supported data formats, use the BuiltinFormats.getAll() POI method or navigate to https://poi.apache.org/ and search the site for "built in formats".

Design Studio has three steps for extracting content from a spreadsheet:

  • The Extract Cell step is used to extract text content from the found range.
  • The Extract Sheet name step is used to extract the sheet name of the sheet of the found range.
  • The Extract As HTML step is used to extract the found range of a spreadsheet as an HTML page containing a table with the cells of the range into a variable.

For the Extract Cell and Extract As HTML steps you can specify what to extract from the cells. This is controlled by the value of the Extract This option. The choice here is the same as the View Modes for the Spreadsheet View. The possible options are described in this topic.

Formatted Values

The extracted values are what you see in Excel and the values of dates and numbers are extracted formatted, which means that numbers may have fewer decimals than the actual values of the cells.

Plain Value

The extracted values are the actual values that Excel would show if the values of the cells were not formatted. For example, numbers would not have rounding of decimals.

Formulas

If a cell contains a formula, it is extracted or otherwise, it is the same value as for the Plain Values option is extracted.

If you create the steps by right-clicking the Spreadsheet View, the value of Extract This is set to the value of the selected View Mode. If you set the View Mode to Formulas and then right-click in the page view and select Extract > Text from the context menu (into a text variable), the Extract This option of the Extract Cell action step is set to Formulas.

You may need to reformat (or normalize) the extracted content, and the Extract Cell action allows you to do this by configuring a list of data converters.

To do so, in the Spreadsheet view, right-click to create a step. Select the desired extract step and specify necessary parameters.

Shared formulas in Excel files

The built-in Excel driver does not support documents with shared formulas. A shared formula is a cell with a formula that is automatically copied to other cells. Any operation that changes the structure of an Excel document containing shared formulas, such as adding or removing rows, may lead to errors in this document.

This limitation is only observed in Excel documents created outside Design Studio. An Excel file created with a robot cannot contain shared formulas.

Workaround: Ensure that your Excel document does not contain any shared formulas. When copying a formula cell to many cells, do not copy it to multiple cells at once; instead, copy the formula to one cell at a time.

Alternatively, you can use the convertSharedFormulas.snippet file included in the Snippets folder of your Kofax RPA installation to convert any shared formulas in your Excel document. The snippet performs these steps:

  1. Takes an Excel document containing shared cells.
  2. Loops over all formula cells.
  3. For each of these cells, extracts the formula and set it on the cell again.