Convert HTML to Text

This data converter converts HTML input to plain text and lays the text out much as it would appear in a browser.

Properties

The Convert HTML to Text data converter can be configured using the following properties:

Include Aligned Tables and Images

Specifies that tables and images aligned to the left or right of the text are included in the output text. Disabling this option can sometimes remove content that you want to keep.

Include URLs

Specifies that the actual URLs in link tags will be included in the output text.
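
The effect of this property can be illustrated with a minimal Python sketch. This is not the converter's implementation: the class name LinkTextExtractor, the function html_to_text, the include_urls parameter, and the bracketed URL format are all assumptions made for illustration, and the actual output format of the data converter may differ.

```python
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Hypothetical sketch: collects text and optionally appends link URLs."""

    def __init__(self, include_urls):
        super().__init__()
        self.include_urls = include_urls
        self.parts = []
        self._hrefs = []  # href values of the currently open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._hrefs.append(dict(attrs).get("href"))

    def handle_endtag(self, tag):
        if tag == "a" and self._hrefs:
            href = self._hrefs.pop()
            # Assumed format: the URL follows the link text in brackets.
            if self.include_urls and href:
                self.parts.append(f" [{href}]")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html, include_urls):
    parser = LinkTextExtractor(include_urls)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


sample = 'See the <a href="https://example.com/docs">documentation</a> for details.'
print(html_to_text(sample, include_urls=True))
print(html_to_text(sample, include_urls=False))
```

With include_urls enabled, the link target is kept next to the link text. With it disabled, only the link text remains.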

Include Image Text Alternatives

Specifies that the text representation of images will be included in the output text.
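
As a rough illustration, the following Python sketch treats the alt attribute of an img tag as the text representation of the image. That reading is an assumption, as are the names AltTextExtractor, image_text, and include_alt. The converter itself may draw on other attributes.

```python
from html.parser import HTMLParser


class AltTextExtractor(HTMLParser):
    """Hypothetical sketch: optionally replaces images with their alt text."""

    def __init__(self, include_alt):
        super().__init__()
        self.include_alt = include_alt
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Assumption: the alt attribute is the image's text alternative.
        if tag == "img" and self.include_alt:
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        self.parts.append(data)


def image_text(html, include_alt):
    parser = AltTextExtractor(include_alt)
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind when an image is dropped.
    return " ".join("".join(parser.parts).split())


sample = '<p>Status: <img src="ok.png" alt="OK"> all checks passed.</p>'
print(image_text(sample, include_alt=True))
print(image_text(sample, include_alt=False))
```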

Include Form Fields

Specifies that the text representation of form fields will be included in the output text.

Insert This Before a Heading

Specifies that this data converter should guess at the location of headings and insert the specified text before them.

Insert This After a Heading

Specifies that this data converter should guess at the location of headings and insert the specified text after them.
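
The two heading properties can be pictured with the Python sketch below. It is a simplification under stated assumptions: it recognizes only explicit h1 to h6 tags as headings, whereas the data converter guesses at heading locations by means not described here. The names HeadingMarker and mark_headings and the marker strings are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: only explicit heading tags count as headings in this sketch.
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingMarker(HTMLParser):
    """Hypothetical sketch: wraps heading text in caller-supplied markers."""

    def __init__(self, before, after):
        super().__init__()
        self.before = before
        self.after = after
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.parts.append(self.before)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self.parts.append(self.after)

    def handle_data(self, data):
        self.parts.append(data)


def mark_headings(html, before, after):
    parser = HeadingMarker(before, after)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


sample = "<h2>Pricing</h2><p>Plans start at ten dollars.</p>"
print(mark_headings(sample, before="== ", after=" ==\n"))
```

Inserting recognizable markers in this way can make headings easy to locate in the plain text in later processing steps.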

Keep Ampersand Encodings

Specifies that ampersand encodings, such as &amp;, will not be decoded. Text in scripts and style sheets will be respected.
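
The difference between keeping and decoding ampersand encodings can be sketched in Python as follows. The names EntityAwareExtractor, extract_text, and keep_encodings are assumptions for illustration. The sketch does not model the handling of script and style sheet text, and it is not the converter's actual implementation.

```python
import html
from html.parser import HTMLParser


class EntityAwareExtractor(HTMLParser):
    """Hypothetical sketch: keeps or decodes ampersand encodings in text."""

    def __init__(self, keep_encodings):
        # convert_charrefs=False makes the parser report each encoding
        # separately instead of decoding it before handle_data is called.
        super().__init__(convert_charrefs=False)
        self.keep_encodings = keep_encodings
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        # Named encodings such as &amp;
        raw = f"&{name};"
        self.parts.append(raw if self.keep_encodings else html.unescape(raw))

    def handle_charref(self, name):
        # Numeric encodings such as &#163;
        raw = f"&#{name};"
        self.parts.append(raw if self.keep_encodings else html.unescape(raw))


def extract_text(markup, keep_encodings):
    parser = EntityAwareExtractor(keep_encodings)
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


sample = "<p>Fish &amp; chips cost &#163;5</p>"
print(extract_text(sample, keep_encodings=True))
print(extract_text(sample, keep_encodings=False))
```

Keeping the encodings can be useful when the output text is later inserted back into HTML, since the text then needs no second round of escaping.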

Description

Type in a description to be shown in the list of data converters. If you do not type in a description, one will be generated.