strip_html

The strip_html filter removes HTML tags from a string. It is useful for extracting plain text from HTML-formatted data, and it can help reduce the risk of cross-site scripting (XSS) when displaying untrusted input.

Functionality

  • Input: Takes a string as input.
  • HTML Removal: Identifies and removes all HTML tags (opening and closing tags, together with their attributes) from the input string. The text between the tags is kept.
  • Output: Returns a new string containing only the plain text content, with all HTML tags removed.

Syntax

    {{ input_string | strip_html }}

Arguments

The strip_html filter does not require any arguments.

Code Samples

Example 1: Stripping HTML from a Paragraph

    {% assign html_text = "<p>This is a <strong>paragraph</strong> with some <em>formatting</em>.</p>" %}

    {{ html_text | strip_html }}

Output:

This is a paragraph with some formatting.

Example 2: Stripping HTML from a List

    {% assign html_list = "<ul><li>Item 1</li><li>Item 2</li></ul>" %}

    {{ html_list | strip_html }}

Output:

Item 1Item 2
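
Because the tags are removed without anything being inserted in their place, adjacent list items run together. One possible workaround is to add a space after each closing tag before stripping, then trim the result. This is only a sketch: it assumes the standard replace and strip filters are available in your Liquid environment.

    {% comment %} Assumes the standard replace and strip filters exist in this environment. {% endcomment %}
    {% assign html_list = "<ul><li>Item 1</li><li>Item 2</li></ul>" %}

    {{ html_list | replace: "</li>", "</li> " | strip_html | strip }}

Output:

Item 1 Item 2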

Example 3: Sanitizing User Input

    {% assign user_comment = "<script>alert('XSS Attack!');</script>This is a comment." %}

    {{ user_comment | strip_html }}

Output:

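alert('XSS Attack!');This is a comment.

Note that only the tags are removed. The script's source text remains in the output as plain text, so it is displayed rather than executed. Some Liquid implementations, including the reference Ruby implementation, also remove the contents of script and style elements; in those, the output is just "This is a comment."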
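
Since stripping tags leaves the remaining text untouched, a common precaution is to escape the result as well. The sketch below assumes the standard Liquid escape filter is available in your environment. How quotes and other characters are encoded can vary by implementation, so no output is shown.

    {% comment %} Assumes the standard escape filter; the exact entity encoding may differ between implementations. {% endcomment %}
    {% assign user_comment = "<script>alert('XSS Attack!');</script>This is a comment." %}

    {{ user_comment | strip_html | escape }}

With this chain, any characters that survive stripping and still have special meaning in HTML (such as <, >, and &) are rendered as entities instead of being interpreted as markup.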

Edge Cases and Special Cases

  • Empty Strings: If the input string is empty, the strip_html filter returns an empty string.
  • Strings Without HTML Tags: If the input string does not contain any HTML tags, the filter returns the original string unchanged.
  • Non-String Input: Behavior depends on the Liquid implementation. The filter may convert the value to a string before stripping, or it may raise an error.
  • Complex or Invalid HTML: The filter is meant for basic tag stripping, not full HTML parsing. Malformed markup (for example, an unclosed tag or a stray < character) may be stripped incorrectly or left partially intact.
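
The first two cases above can be checked with a short template. The variable names are hypothetical, and the square brackets are there only to make the empty result visible.

    {% comment %} Hypothetical variables; brackets only mark where each result starts and ends. {% endcomment %}
    {% assign empty_text = "" %}
    {% assign plain_text = "No markup here." %}

    [{{ empty_text | strip_html }}]
    [{{ plain_text | strip_html }}]

Output:

    []
    [No markup here.]

Text that contains a bare < followed later by a >, such as "5 < 10 and 20 > 15", is worth testing in your own environment. Implementations that match tags with a simple pattern typically treat everything between the two characters as a tag and remove it.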

Key Points

  • The strip_html filter is useful for extracting plain text from HTML content and as one step in cleaning up user input.
  • It reduces the risk of cross-site scripting (XSS) by removing potentially harmful HTML tags, but it is not a complete defense on its own; untrusted output should still be escaped.
  • For more advanced HTML processing, consider using a dedicated HTML parser library.