What is HTML sanitization?
24 June 2022 (Updated 24 June 2022)
In a nutshell
HTML sanitization is the process of parsing an HTML string and preserving only the tags and attributes that are considered “safe”. It is typically used by server-side scripts to remove potentially dangerous tags like <script> or attributes like onclick that can be used as part of XSS attacks.
For example, the strip_tags function in PHP lets you do something like this:
$text = '<p>Test paragraph.</p><!-- Comment --> <a href="#fragment">Other text</a>';
echo strip_tags($text);
to produce:
Test paragraph. Other text
Or you can pass a whitelist of allowed tags (as an array since PHP 7.4, or as a string such as '<p><a>' in earlier versions):
echo strip_tags($text, ['p', 'a']);
to produce:
<p>Test paragraph.</p> <a href="#fragment">Other text</a>
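One caveat is worth a quick illustration: strip_tags filters tags, not attributes. The sketch below uses a made-up input string (an assumption for illustration, not taken from a real application) to show that an inline event handler on an allowed tag passes through unchanged.

```php
<?php
// Hypothetical untrusted input: an allowed <a> tag carrying an inline event handler.
$untrusted = '<p>Hello</p><a href="#" onclick="alert(1)">Click me</a>';

// Only <a> is allowed, so the <p> tags are stripped (their text is kept).
// The array form of the second argument requires PHP 7.4 or later.
$stripped = strip_tags($untrusted, ['a']);

// The onclick attribute is still present, because strip_tags() does not inspect attributes.
echo $stripped, PHP_EOL;
```

So when any tags are allowed, strip_tags on its own does not remove attribute-based XSS vectors; the attributes need to be filtered separately.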
Sanitizing on the client side
If you’re working on a JavaScript application that must render data from a third-party service, you can sanitize content on the client side using a library like DOMPurify.