HTML sanitization
What is HTML sanitization?
HTML sanitization is the process of parsing an HTML string and preserving only the tags and attributes that are considered “safe”. It is typically used by server-side programs to remove potentially dangerous tags like <script> or attributes like onclick that can be used as part of XSS attacks.
For example, the strip_tags function in PHP lets you do something like this:
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
to produce:
Test paragraph. Other text
Note how tags like <p> and <a> were stripped so that only their text remains. The HTML comment was removed as well.
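Or you can pass a whitelist of allowed tags (the array form requires PHP 7.4 or later). Note that strip_tags leaves the attributes of allowed tags untouched, so an onclick on an allowed <a> would survive: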
echo strip_tags($text, ['p', 'a']);
to produce:
<p>Test paragraph.</p> <a href="#fragment">Other text</a>
Sanitizing on the client side
If you’re working on a JavaScript application that must render data from a third-party service, you can sanitize content on the client side using a library like DOMPurify.
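As a rough illustration, client-side sanitization with DOMPurify could be wired up as below. This is a minimal sketch, not a definitive implementation: the helper name renderUntrusted and its parameters are hypothetical, and the list of allowed tags and attributes is only an example. It assumes the DOMPurify object has already been loaded (for instance through a script tag or a bundler import) and is passed in as the third argument.

```javascript
// Hypothetical helper: sanitize untrusted HTML, then inject it into the page.
// `purifier` is assumed to be the DOMPurify object, loaded elsewhere
// (e.g. `import DOMPurify from 'dompurify'` in a browser bundle).
function renderUntrusted(container, dirtyHtml, purifier) {
  const clean = purifier.sanitize(dirtyHtml, {
    ALLOWED_TAGS: ['p', 'a'], // keep only these elements (example allowlist)
    ALLOWED_ATTR: ['href'],   // keep only these attributes (example allowlist)
  });
  // Only the sanitized string ever reaches innerHTML.
  container.innerHTML = clean;
  return clean;
}
```

In a browser this might be called as renderUntrusted(document.querySelector('#comment'), response.body, DOMPurify), where the #comment element and the response object are placeholders. Unlike the strip_tags example, the ALLOWED_ATTR option restricts attributes as well as tags, so an event handler such as onclick is not on the list of things to keep.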