diff --git a/source b/source
index bed7090aed1..77b6dafc58f 100644
--- a/source
+++ b/source
@@ -5,6 +5,7 @@
  ! - section for the element itself
  ! - descriptions of the element's categories
  ! - images/content-venn.svg
+ ! - sanitizer categories
  ! - syntax, if it's void or otherwise special
  ! - parser, if it's not phrasing-level
  ! - rendering
@@ -14,6 +15,7 @@
  ! Adding a new attribute involves editing the following sections:
  ! - The IDL and content attributes for the relevant elements
  ! - element and attribute indices
+ ! - sanitizer categories
  !-->

Web applications often need to process untrusted HTML strings, such as when rendering user-generated content or using client-side templates. Safely inserting these strings into the DOM requires careful sanitization to prevent DOM-based cross-site scripting (XSS) attacks.

HTML sanitization provides a native mechanism for safely parsing and sanitizing HTML strings. By using the user agent's own HTML parser, it ensures that the sanitized output accurately reflects how the browser will render the content, preventing script execution and mitigating advanced attacks such as script gadgets.

These APIs offer functionality to parse a string containing HTML into a DOM tree, and to filter the resulting tree according to a user-supplied configuration. The methods come in two main flavors: "safe" and "unsafe".

The "safe" methods will not generate any markup that executes script. That is, they are intended to be safe from XSS. The "unsafe" methods will parse and filter based on the provided configuration, but do not have the same safety guarantees by default.

Sanitizer interface

+[Exposed=Window]
+interface Sanitizer {
+ constructor(optional (SanitizerConfig or SanitizerPresets) configuration = "default");
+
+ // Query configuration:
+ SanitizerConfig get();
+
+ // Modify a Sanitizer's lists and fields:
+ boolean allowElement(SanitizerElementWithAttributes element);
+ boolean removeElement(SanitizerElement element);
+ boolean replaceElementWithChildren(SanitizerElement element);
+ boolean allowProcessingInstruction(SanitizerPI pi);
+ boolean removeProcessingInstruction(SanitizerPI pi);
+ boolean allowAttribute(SanitizerAttribute attribute);
+ boolean removeAttribute(SanitizerAttribute attribute);
+ boolean setComments(boolean allow);
+ boolean setDataAttributes(boolean allow);
+
+ // Remove markup that executes script.
+ boolean removeUnsafe();
+};
+
config = sanitizer.get()
    Returns a copy of the sanitizer's configuration.
sanitizer.allowElement(element)
    Ensures that the sanitizer configuration allows the specified element.
sanitizer.removeElement(element)
    Ensures that the sanitizer configuration blocks the specified element.
sanitizer.replaceElementWithChildren(element)
    Configures the sanitizer to remove the specified element but keep its child nodes.
sanitizer.allowAttribute(attribute)
    Configures the sanitizer to allow the specified attribute globally.
sanitizer.removeAttribute(attribute)
    Configures the sanitizer to block the specified attribute globally.
sanitizer.allowProcessingInstruction(pi)
    Configures the sanitizer to allow the specified processing instruction.
sanitizer.removeProcessingInstruction(pi)
    Configures the sanitizer to block the specified processing instruction.
sanitizer.setComments(allow)
    Sets whether the sanitizer preserves comments.
sanitizer.setDataAttributes(allow)
    Sets whether the sanitizer preserves custom data attributes (e.g., data-*).
sanitizer.removeUnsafe()
    Modifies the configuration to automatically remove elements and attributes that are considered unsafe.

A Sanitizer object has an associated configuration,
+ which is a SanitizerConfig.
The new
+ Sanitizer(configuration) constructor steps are:
If configuration is a SanitizerPresets string:
Assert: configuration is "default".
Set configuration to the built-in safe default configuration.
Configure this given configuration and true.
To configure a Sanitizer
+ sanitizer, given a dictionary configuration and a boolean
+ allowCommentsPIsAndDataAttributes:
Canonicalize the configuration configuration with + allowCommentsPIsAndDataAttributes.
If configuration is not valid,
+ then throw a TypeError.
Set sanitizer's configuration to + configuration.
To canonicalize the configuration SanitizerConfig + configuration with a boolean allowCommentsPIsAndDataAttributes:
+ +For each member of configuration + that is a list of strings:
+ +Replace each string in member with the result of canonicalizing it.
If neither configuration["elements"] nor configuration["removeElements"] exists, then set configuration["removeElements"] to an empty list.
If neither configuration["attributes"] nor configuration["removeAttributes"] exists, then set configuration["removeAttributes"] to an empty
+ list.
If neither configuration["processingInstructions"] nor
+ configuration["removeProcessingInstructions"]
+ exists:
If allowCommentsPIsAndDataAttributes is true, then set
+ configuration["removeProcessingInstructions"]
+ to an empty list.
Otherwise, set configuration["processingInstructions"] to an empty
+ list.
If configuration["elements"]
+ exists:
Let newElements be « ».
For each element of
+ configuration["elements"], append the result of canonicalizing element to newElements.
Set configuration["elements"] to newElements.
If configuration["removeElements"] exists, then set configuration["removeElements"] to the result of canonicalizing
+ configuration["removeElements"].
If configuration["attributes"] exists, then set configuration["attributes"] to the result of canonicalizing configuration["attributes"].
If configuration["removeAttributes"] exists, then set configuration["removeAttributes"] to the result of canonicalizing configuration["removeAttributes"].
If configuration["replaceWithChildrenElements"]
+ exists, then set configuration["replaceWithChildrenElements"] to
+ the result of canonicalizing
+ configuration["replaceWithChildrenElements"].
If configuration["processingInstructions"] exists, then set configuration["processingInstructions"] to the result
+ of canonicalizing
+ configuration["processingInstructions"].
If configuration["removeProcessingInstructions"]
+ exists, then set configuration["removeProcessingInstructions"]
+ to the result of canonicalizing
+ configuration["removeProcessingInstructions"].
If configuration["comments"]
+ does not exist, then set it to
+ allowCommentsPIsAndDataAttributes.
If configuration["attributes"] exists and configuration["dataAttributes"] does not exist, then set it to allowCommentsPIsAndDataAttributes.
To canonicalize a sanitizer list list:
+ +Let newList be « ».
For each item in list, append the result of canonicalizing item to newList.
Return newList.
To canonicalize a processing instruction list list:
+ +Let newList be « ».
For each item in list, append the result of canonicalizing item to newList.
Return newList.
To canonicalize a processing instruction given a SanitizerPI
+ pi:
If pi is a DOMString, then return «[ "target" → pi ]».
Assert: pi is a dictionary and pi["target"] exists.
Return «[ "target" →
+ pi["target"]
+ ]».
To canonicalize a sanitizer name given a DOMString or dictionary name, and a default namespace
+ defaultNamespace (default null):
If name is a DOMString, then return «[
+ "name" → name, "namespace" → defaultNamespace ]».
Assert: name is a dictionary and both name["name"] and + name["namespace"] exist.
If name["namespace"] is the empty string, then set it to null.
Return «[ "name" → name["name"], "namespace" → name["namespace"] + ]».
To canonicalize a sanitizer element given a SanitizerElement
+ element:
Return the result of canonicalizing + element with the HTML namespace as the default namespace.
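
The name canonicalization steps above can be sketched as two plain functions. This is a minimal illustration under stated assumptions, not the normative algorithm: the type names `NameInput` and `CanonicalName` and both function names are hypothetical, and dictionary inputs are assumed to already carry both members, as the Assert requires.

```typescript
// Hypothetical model types; these names are illustrative, not spec-defined.
type NameInput = string | { name: string; namespace: string | null };

interface CanonicalName {
  name: string;
  namespace: string | null;
}

const HTML_NAMESPACE = "http://www.w3.org/1999/xhtml";

// Models "canonicalize a sanitizer name": a string picks up the default
// namespace, and an empty-string namespace is normalized to null.
function canonicalizeSanitizerName(
  input: NameInput,
  defaultNamespace: string | null = null
): CanonicalName {
  if (typeof input === "string") {
    return { name: input, namespace: defaultNamespace };
  }
  const normalizedNamespace = input.namespace === "" ? null : input.namespace;
  return { name: input.name, namespace: normalizedNamespace };
}

// Models "canonicalize a sanitizer element": the HTML namespace is the default.
function canonicalizeSanitizerElement(input: NameInput): CanonicalName {
  return canonicalizeSanitizerName(input, HTML_NAMESPACE);
}
```

With this sketch, a bare string such as "div" becomes an entry in the HTML namespace, while a bare attribute string such as "href" stays in the null namespace.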
To canonicalize a sanitizer element list list:
+ +Let newList be « ».
For each item in list, append the result of canonicalizing item to newList.
Return newList.
To find the canonicalized intersection + of lists A and B:
+ +Let setA be « ».
+ +Let setB be « ».
+ +For each entry of A, append the result of canonicalizing entry to setA.
For each entry of B, append the result of canonicalizing entry to setB.
Return the intersection of setA and + setB.
The get() method
+ steps are:
Outside of the get() method, the order of
+ the Sanitizer's elements and attributes is unobservable. By explicitly sorting the
+ result of this method, we give implementations the opportunity to optimize by, for example, using
+ unordered sets internally.
Let config be this's configuration.
Assert: config is valid.
If config["elements"] exists:
For each element of config["elements"]:
If element["attributes"] exists, then set element["attributes"] to the
+ result of sorting element["attributes"], with
+ compare sanitizer items.
If element["removeAttributes"]
+ exists, then set element["removeAttributes"]
+ to the result of sorting element["removeAttributes"],
+ with compare sanitizer items.
Set config["elements"] to
+ the result of sorting config["elements"], with compare sanitizer
+ items.
Otherwise:
+ +Set config["removeElements"] to the result of sorting config["removeElements"], with compare
+ sanitizer items.
If config["replaceWithChildrenElements"]
+ exists, then set config["replaceWithChildrenElements"] to
+ the result of sorting config["replaceWithChildrenElements"],
+ with compare sanitizer items.
If config["processingInstructions"] exists, then set config["processingInstructions"] to the result
+ of sorting config["processingInstructions"], with
+ piA["target"] being
+ code unit less than piB["target"].
Otherwise:
+ +Set config["removeProcessingInstructions"]
+ to the result of sorting config["removeProcessingInstructions"],
+ with piA["target"]
+ being code unit less than piB["target"].
If config["attributes"]
+ exists, then set config["attributes"] to the result of sorting config["attributes"] given compare sanitizer
+ items.
Otherwise:
+ +Set config["removeAttributes"] to the result of sorting config["removeAttributes"] given compare
+ sanitizer items.
Return config.
The allowElement(element) method steps
+ are:
Let configuration be this's configuration.
Assert: configuration is valid.
Set element to the result of canonicalizing element.
If configuration["elements"]
+ exists:
Let modified be the result of removing
+ element from configuration["replaceWithChildrenElements"].
If configuration["attributes"] exists:
If element["attributes"] exists:
Set element["attributes"] to the
+ result of creating a set from element["attributes"].
Set element["attributes"] to the
+ difference of element["attributes"] and
+ configuration["attributes"].
If configuration["dataAttributes"] is true, then remove all items item from element["attributes"] where
+ item is a custom data attribute.
If element["removeAttributes"]
+ exists:
Set element["removeAttributes"]
+ to the result of creating a set from
+ element["removeAttributes"].
Set element["removeAttributes"]
+ to the intersection of
+ element["removeAttributes"]
+ and configuration["attributes"].
Otherwise:
+ +If element["attributes"] exists:
Set element["attributes"] to the
+ result of creating a set from element["attributes"].
Set element["attributes"] to the
+ difference of element["attributes"] and
+ element["removeAttributes"]
+ with default « ».
Remove element["removeAttributes"].
Set element["attributes"] to the
+ difference of element["attributes"] and
+ configuration["removeAttributes"].
If element["removeAttributes"]
+ exists:
Set element["removeAttributes"]
+ to the result of creating a set from
+ element["removeAttributes"].
Set element["removeAttributes"]
+ to the difference of element["removeAttributes"]
+ and configuration["removeAttributes"].
If configuration["elements"]
+ does not contain element:
Append element to
+ configuration["elements"].
Return true.
Let currentElement be the item in configuration["elements"] whose name member is element's name member and whose namespace member is
+ element's namespace
+ member.
If element is equal to currentElement, then return + modified.
Remove element from
+ configuration["elements"].
Append element to
+ configuration["elements"].
Return true.
Otherwise:
+ +If element["attributes"] exists or element["removeAttributes"]
+ with default « » is not empty, then return
+ false.
Let modified be the result of removing
+ element from configuration["replaceWithChildrenElements"].
If configuration["removeElements"] does not contain element, then return modified.
Remove element from
+ configuration["removeElements"].
Return true.
The removeElement(element) method steps
+ are to return the result of removing
+ element from this's configuration.
The replaceElementWithChildren(element)
+ method steps are:
Let configuration be this's configuration.
Assert: configuration is valid.
Set element to the result of canonicalizing element.
If the built-in non-replaceable elements list contains element, then return false.
Let modified be the result of removing
+ element from configuration["elements"].
If removing element from
+ configuration["removeElements"] is true, then set
+ modified to true.
If configuration["replaceWithChildrenElements"]
+ does not contain element:
Append element to
+ configuration["replaceWithChildrenElements"].
Return true.
Return modified.
The allowAttribute(attribute) method
+ steps are:
Let configuration be this's configuration.
Assert: configuration is valid.
Set attribute to the result of canonicalizing attribute.
If configuration["attributes"] exists:
If configuration["dataAttributes"] is true and
+ attribute is a custom data attribute, then return false.
If configuration["attributes"] contains attribute, then return false.
If configuration["elements"]
+ exists:
For each element in
+ configuration["elements"]:
If element["attributes"] with default « » contains attribute, then remove attribute from element["attributes"].
Append attribute to
+ configuration["attributes"].
Return true.
Otherwise:
+ +If configuration["removeAttributes"] does not contain attribute, then return false.
Remove attribute from
+ configuration["removeAttributes"].
Return true.
The removeAttribute(attribute) method
+ steps are to return the result of remove
+ an attribute with attribute and this's
+ configuration.
The setComments(allow) method steps
+ are:
Let configuration be this's configuration.
Assert: configuration is valid.
If configuration["comments"]
+ exists and is equal to allow, then return
+ false.
Set configuration["comments"] to allow.
Return true.
The setDataAttributes(allow) method
+ steps are:
Let configuration be this's configuration.
Assert: configuration is valid.
If configuration["attributes"] does not exist, then return false.
If configuration["dataAttributes"] exists and is equal to allow, then return false.
If allow is true:
+ +If configuration["elements"]
+ exists:
For each element of
+ configuration["elements"]:
If element["attributes"] exists, then remove all items
+ item from element["attributes"] where
+ item is a custom data attribute.
Remove all items item from
+ configuration["attributes"]
+ where item is a custom data attribute.
Set configuration["dataAttributes"] to allow.
Return true.
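
The setDataAttributes() steps above can be sketched against a simplified configuration model. The sketch assumes attribute entries are plain strings in the null namespace, so that "is a custom data attribute" reduces to a "data-" prefix test; the interface and function names are hypothetical.

```typescript
// Hypothetical, simplified model: attribute entries are plain strings with a
// null namespace, so the custom data attribute check is a "data-" prefix test.
interface DataAttrElementSketch {
  name: string;
  attributes?: string[];
}

interface DataAttrConfigSketch {
  attributes?: string[];
  dataAttributes?: boolean;
  elements?: DataAttrElementSketch[];
}

function isDataAttrName(attrName: string): boolean {
  return attrName.slice(0, 5) === "data-";
}

function setDataAttributesSketch(
  config: DataAttrConfigSketch,
  allow: boolean
): boolean {
  // The setting only applies alongside a global attributes allow-list.
  if (config.attributes === undefined) return false;
  // No change needed when the member already exists with the same value.
  if (config.dataAttributes === allow) return false;
  if (allow) {
    // Explicit data-* entries would become duplicates, so drop them.
    for (const element of config.elements || []) {
      if (element.attributes !== undefined) {
        element.attributes = element.attributes.filter((a) => !isDataAttrName(a));
      }
    }
    config.attributes = config.attributes.filter((a) => !isDataAttrName(a));
  }
  config.dataAttributes = allow;
  return true;
}
```

Dropping the explicit data-* entries when the flag is switched on keeps the configuration free of the duplicate entries that the validity rules forbid.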
The allowProcessingInstruction(pi)
+ method steps are:
Let configuration be this's configuration.
Assert: configuration is valid.
Set pi to the result of canonicalizing pi.
If configuration["processingInstructions"] exists:
If configuration["processingInstructions"] contains pi, then return false.
Append pi to
+ configuration["processingInstructions"].
Return true.
Otherwise:
+ +If configuration["removeProcessingInstructions"]
+ contains pi:
Remove pi from
+ configuration["removeProcessingInstructions"].
Return true.
Return false.
The removeProcessingInstruction(pi)
+ method steps are:
Let configuration be this's configuration.
Assert: configuration is valid.
Set pi to the result of canonicalizing pi.
If configuration["processingInstructions"] exists:
If configuration["processingInstructions"] contains pi:
Remove pi from
+ configuration["processingInstructions"].
Return true.
Return false.
Otherwise:
+ +If configuration["removeProcessingInstructions"]
+ contains pi, then return false.
Append pi to
+ configuration["removeProcessingInstructions"].
Return true.
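
The two processing instruction methods above mirror each other, which is easier to see side by side. The following is a rough sketch only: it tracks processing instructions by target string rather than by canonical dictionary, and the names `PiConfigSketch`, `allowPiSketch`, and `removePiSketch` are hypothetical.

```typescript
// Hypothetical, simplified model: processing instructions are tracked by
// target string only, and a valid configuration has exactly one of the lists.
interface PiConfigSketch {
  processingInstructions?: string[];
  removeProcessingInstructions?: string[];
}

// Removes the first occurrence of item; reports whether anything was removed.
function dropFromList(list: string[], item: string): boolean {
  const index = list.indexOf(item);
  if (index === -1) return false;
  list.splice(index, 1);
  return true;
}

function allowPiSketch(config: PiConfigSketch, target: string): boolean {
  if (config.processingInstructions !== undefined) {
    // Allow-list mode: add the target unless it is already listed.
    if (config.processingInstructions.indexOf(target) !== -1) return false;
    config.processingInstructions.push(target);
    return true;
  }
  // Remove-list mode: allowing means deleting the target from the remove-list.
  return dropFromList(config.removeProcessingInstructions || [], target);
}

function removePiSketch(config: PiConfigSketch, target: string): boolean {
  if (config.processingInstructions !== undefined) {
    // Allow-list mode: removing means deleting the target from the allow-list.
    return dropFromList(config.processingInstructions, target);
  }
  // Remove-list mode: add the target unless it is already listed.
  const removeList =
    config.removeProcessingInstructions || (config.removeProcessingInstructions = []);
  if (removeList.indexOf(target) !== -1) return false;
  removeList.push(target);
  return true;
}
```

In both functions the return value reports whether the configuration actually changed, matching the boolean results of the method steps.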
The removeUnsafe() method steps are to return the
+ result of removing unsafe from this's
+ configuration.
dictionary SanitizerElementNamespace {
+ required DOMString name;
+ DOMString? _namespace = "http://www.w3.org/1999/xhtml";
+};
+
+// Used by "elements"
+dictionary SanitizerElementNamespaceWithAttributes : SanitizerElementNamespace {
+ sequence<SanitizerAttribute> attributes;
+ sequence<SanitizerAttribute> removeAttributes;
+};
+
+dictionary SanitizerAttributeNamespace {
+ required DOMString name;
+ DOMString? _namespace = null;
+};
+
+dictionary SanitizerProcessingInstruction {
+ required DOMString target;
+};
+
+typedef (DOMString or SanitizerElementNamespace) SanitizerElement;
+typedef (DOMString or SanitizerElementNamespaceWithAttributes) SanitizerElementWithAttributes;
+typedef (DOMString or SanitizerProcessingInstruction) SanitizerPI;
+typedef (DOMString or SanitizerAttributeNamespace) SanitizerAttribute;
+
+dictionary SanitizerConfig {
+ sequence<SanitizerElementWithAttributes> elements;
+ sequence<SanitizerElement> removeElements;
+ sequence<SanitizerElement> replaceWithChildrenElements;
+
+ sequence<SanitizerPI> processingInstructions;
+ sequence<SanitizerPI> removeProcessingInstructions;
+
+ sequence<SanitizerAttribute> attributes;
+ sequence<SanitizerAttribute> removeAttributes;
+
+ boolean comments;
+ boolean dataAttributes;
+};
+
+ SanitizerElementNamespace, SanitizerAttributeNamespace,
+ SanitizerElementNamespaceWithAttributes, and
+ SanitizerProcessingInstruction dictionaries are considered equal when all of their
+ members are equal.
Equality should be defined in the infra spec instead. See issue #664.
Configurations can and ought to be modified by developers to suit their purposes. The options are to write a new SanitizerConfig dictionary from scratch, to modify an existing Sanitizer's configuration by using the modifier methods, or to get() an existing Sanitizer's configuration as a dictionary, modify that dictionary, and then create a new Sanitizer with it.
An empty configuration allows everything (when called with the "unsafe" methods like setHTMLUnsafe()). The "default" preset selects the built-in safe default configuration. Note that "safe" and "unsafe" sanitizer methods have different defaults.
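
As an illustration of writing a configuration from scratch, the following sketch shows one allow-list style dictionary and one remove-list style dictionary as plain object literals. The element and attribute choices, and the constant names, are hypothetical and only meant to show the shape of a SanitizerConfig.

```typescript
// Allow-list style: only the listed elements and attributes survive.
const allowListConfigSketch = {
  elements: [
    "p",
    "em",
    "strong",
    // Per-element attributes extend the global allow-list for this element.
    { name: "a", attributes: ["href"] },
  ],
  attributes: ["class"],
  comments: false,
  dataAttributes: false,
};

// Remove-list style: everything is kept except the listed entries.
const removeListConfigSketch = {
  removeElements: ["iframe", "object"],
  removeAttributes: ["style"],
  comments: true,
};

// In a user agent implementing this API, either dictionary could be passed to
// new Sanitizer(...), or used as the "sanitizer" option of a method such as
// setHTMLUnsafe().
```

Note that the first dictionary lists "href" only locally and "class" only globally, so there is no duplicate entry between the global and per-element allow-lists.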
Not all configuration dictionaries are valid. A valid configuration avoids redundancy (like specifying the same element to be allowed twice) and contradictions (like specifying an element to be both removed and allowed).

Several conditions need to hold for a configuration to be valid:

- Mixing global allow- and remove-lists:
  - elements or removeElements can exist, but not both. If both are missing, this is equivalent to removeElements being an empty list.
  - attributes or removeAttributes can exist, but not both. If both are missing, this is equivalent to removeAttributes being an empty list.
  - dataAttributes is conceptually an extension of the attributes allow-list. The dataAttributes member is only allowed when an attributes list is used.
- Duplicate entries between different global lists:
  - There are no duplicate entries (i.e., no same elements) between elements, removeElements, or replaceWithChildrenElements.
  - There are no duplicate entries (i.e., no same attributes) between attributes or removeAttributes.
- Mixing local allow- and remove-lists on the same element:
  - When a global attributes list exists, both, either, or neither of the local attributes and removeAttributes lists are allowed on the same element.
  - When a global removeAttributes list exists, either or neither of the local attributes and removeAttributes lists is allowed on the same element, but not both.
- Duplicate entries on the same element:
  - There are no duplicate entries between attributes and removeAttributes on the same element.
- No element from the built-in non-replaceable elements list appears in replaceWithChildrenElements, since replacing these elements with their children could lead to re-parsing issues or invalid node trees.

The elements allow-list can also specify allowed or removed attributes for a given element. This is meant to mirror this standard's structure, which has both global attributes and local attributes that apply to a specific element. Global and local attributes can be mixed, but note that ambiguous configurations, where a particular attribute would be allowed by one list and forbidden by another, are generally invalid.

| | global attributes | global removeAttributes |
|---|---|---|
| local attributes | An attribute is allowed if it matches either list. No duplicates are allowed. | An attribute is only allowed if it's in the local allow list. No duplicate entries between global remove and local allow lists are allowed. Note that the global remove list has no function for this particular element, but can apply to other elements that do not have a local allow list. |
| local removeAttributes | An attribute is allowed if it's in the global allow-list, but not in the local remove-list. Local remove has to be a subset of the global allow lists. | An attribute is allowed if it is in neither list. No duplicate entries between global remove and local remove lists are allowed. |

Please note the asymmetry where mostly no duplicates between global and per-element lists are permitted, but in the case of a global allow-list and a per-element remove-list the latter has to be a subset of the former. An excerpt of the table above, only focusing on duplicates, is as follows:

| | global attributes | global removeAttributes |
|---|---|---|
| local attributes | No duplicates are allowed. | No duplicates are allowed. |
| local removeAttributes | Local remove has to be a subset of the global allow lists. | No duplicates are allowed. |

The dataAttributes setting allows custom data attributes. The rules above easily extend to custom data attributes if one considers dataAttributes to be an allow-list:

| | global attributes and dataAttributes set |
|---|---|
| local attributes | All custom data attributes are allowed. No custom data attributes can be listed in any allow-list, as that would mean a duplicate entry. |
| local removeAttributes | A custom data attribute is allowed, unless it's listed in the local remove-list. No custom data attribute can be listed in the global allow-list, as that would mean a duplicate entry. |

Putting these rules in words:

- Duplicates and interactions between global and local lists:
  - If a global attributes allow list exists, then for each element's local lists:
    - If a local attributes allow list exists, there can be no duplicate entries between these lists.
    - If a local removeAttributes remove list exists, then all its entries also need to be listed in the global attributes allow list.
    - If dataAttributes is true, then no custom data attributes can be listed in any of the allow-lists.
  - If a global removeAttributes remove list exists:
    - If a local attributes allow list exists, there can be no duplicate entries between these lists.
    - If a local removeAttributes remove list exists, there can be no duplicate entries between these lists.
    - A local attributes allow list and a local removeAttributes remove list cannot both exist on the same element.
    - dataAttributes has to be false.
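
The rules listed above can be sketched as a single predicate that checks one element's local lists against the global lists. This is a rough model and not the normative validity algorithm: attribute names are assumed to be plain strings, custom data attributes are detected by a "data-" prefix, and all type and function names are hypothetical.

```typescript
// Hypothetical, simplified model: attribute names are plain strings.
interface GlobalAttrListsSketch {
  attributes?: string[];
  removeAttributes?: string[];
  dataAttributes?: boolean;
}

interface LocalAttrListsSketch {
  attributes?: string[];
  removeAttributes?: string[];
}

function overlaps(a: string[], b: string[]): boolean {
  return a.some((entry) => b.indexOf(entry) !== -1);
}

function hasDataAttr(list: string[]): boolean {
  return list.some((entry) => entry.slice(0, 5) === "data-");
}

// Returns true when one element's local lists are compatible with the
// global lists, following the rules listed above.
function localListsAreCompatible(
  globalLists: GlobalAttrListsSketch,
  local: LocalAttrListsSketch
): boolean {
  const localAllow = local.attributes;
  const localRemove = local.removeAttributes;
  // Local allow and local remove never share an entry.
  if (localAllow && localRemove && overlaps(localAllow, localRemove)) return false;

  const globalAllow = globalLists.attributes;
  if (globalAllow !== undefined) {
    // No duplicates between the global and local allow lists.
    if (localAllow && overlaps(localAllow, globalAllow)) return false;
    // Local remove has to be a subset of the global allow list.
    if (localRemove && !localRemove.every((e) => globalAllow.indexOf(e) !== -1)) {
      return false;
    }
    // With dataAttributes on, no data-* entry may appear in an allow list.
    if (globalLists.dataAttributes === true) {
      if (hasDataAttr(globalAllow)) return false;
      if (localAllow !== undefined && hasDataAttr(localAllow)) return false;
    }
    return true;
  }

  const globalRemove = globalLists.removeAttributes || [];
  // With a global remove list, local allow and local remove cannot coexist.
  if (localAllow !== undefined && localRemove !== undefined) return false;
  if (localAllow && overlaps(localAllow, globalRemove)) return false;
  if (localRemove && overlaps(localRemove, globalRemove)) return false;
  // dataAttributes cannot be switched on without a global allow list.
  return globalLists.dataAttributes !== true;
}
```

The asymmetry noted earlier shows up in the code as the single subset check: every other combination is a plain "no shared entries" test.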
To set and filter HTML, given an Element or
+ DocumentFragment target, an Element
+ contextElement, a string html, a dictionary options,
+ and a boolean safe:
If all of the following are true:
+ +safe;
contextElement's local name
+ is "script"; and
contextElement's namespace is + the HTML namespace or the SVG namespace,
then return.
Let sanitizer be the result of getting a sanitizer instance from options given options and safe.
Let newChildren be the result of parsing a fragment given contextElement, html, and + true.
Let fragment be a new DocumentFragment whose node
+ document is contextElement's node document.
For each node in newChildren, + append node to fragment.
Sanitize fragment given sanitizer and + safe.
Replace all with fragment within + target.
To get a sanitizer instance from options given a dictionary options and a boolean safe:
+ +Let sanitizerSpec be "default".
If options["sanitizer"]
+ exists, then set sanitizerSpec to
+ options["sanitizer"].
Assert: sanitizerSpec is either a Sanitizer instance,
+ a SanitizerPresets member, or a SanitizerConfig dictionary.
If sanitizerSpec is a string:
+ +Assert: sanitizerSpec is "default".
Set sanitizerSpec to the built-in safe default + configuration.
If sanitizerSpec is a dictionary:
+ +Let sanitizer be a new Sanitizer object.
Let allowCommentsAndDataAttributes be true if safe is false; false + otherwise.
Configure sanitizer given + sanitizerSpec and allowCommentsAndDataAttributes.
Set sanitizerSpec to sanitizer.
Return sanitizerSpec.
To sanitize a Node node with a Sanitizer
+ sanitizer and a boolean safe:
Let configuration be sanitizer's + configuration.
Assert: configuration is valid.
If safe is true, then remove unsafe from + configuration.
Run the inner sanitize steps given node, configuration, + and safe.
To perform the inner sanitize steps given a Node node, a
+ SanitizerConfig configuration, and a boolean
+ handleJavascriptNavigationUrls:
For each child of node's children:
+ +Assert: child is a Text, Comment,
+ Element, ProcessingInstruction, or DocumentType
+ node.
If child is a DocumentType or Text node, then
+ continue.
If child is a Comment node:
If configuration["comments"] is not true, then remove child.
Continue.
If child is a ProcessingInstruction node:
Let piTarget be child's target.
If configuration["processingInstructions"] exists:
If configuration["processingInstructions"] does
+ not contain piTarget, then remove child.
Otherwise:
+ +If configuration["removeProcessingInstructions"]
+ contains piTarget, then remove child.
Otherwise:
+ +Let elementName be a SanitizerElementNamespace with + child's local name and namespace.
If configuration["replaceWithChildrenElements"]
+ exists and configuration["replaceWithChildrenElements"]
+ contains elementName:
Assert: node is not a Document.
Run the inner sanitize steps given child, + configuration, and handleJavascriptNavigationUrls.
Let fragment be a new DocumentFragment whose node
+ document is node's node document.
For each innerChild of + child's children, append innerChild to + fragment.
Replace child with + fragment within node. Assert that this did not + throw.
Continue.
If configuration["elements"] exists:
If configuration["elements"] does not contain elementName, then remove child and
+ continue.
Otherwise:
+ +If configuration["removeElements"] contains elementName, then remove child and
+ continue.
If elementName["name"] is "template" and
+ elementName["namespace"] is the HTML
+ namespace, then run the inner sanitize steps given child's
+ template contents, configuration, and
+ handleJavascriptNavigationUrls.
If child is a shadow host, then run the inner sanitize + steps given child's shadow root, configuration and + handleJavascriptNavigationUrls.
Let elementWithLocalAttributes be «[ ]».
If configuration["elements"] exists and configuration["elements"] contains elementName, then set
+ elementWithLocalAttributes to configuration["elements"][elementName].
For each attribute in child's + attribute list:
+ +Let attrName be a SanitizerAttributeNamespace with + attribute's local name and + namespace.
If elementWithLocalAttributes["removeAttributes"]
+ exists and elementWithLocalAttributes["removeAttributes"]
+ contains attrName, then remove attribute.
Otherwise, if configuration["attributes"] exists:
If configuration["attributes"] does not contain attrName and
+ elementWithLocalAttributes["attributes"] with default « » does not contain attrName, and if "data-" is
+ not a code unit prefix of attribute's local name or attribute's namespace is not null or
+ configuration["dataAttributes"] is not true, then
+ remove
+ attribute.
Otherwise:
+ +If elementWithLocalAttributes["attributes"] exists and elementWithLocalAttributes["attributes"] does
+ not contain attrName, then remove attribute.
Otherwise, if configuration["removeAttributes"] contains attrName, then remove attribute.
If handleJavascriptNavigationUrls is true:
+ +If the pair (elementName, attrName) matches an entry in the
+ built-in navigating URL attributes list, and if attribute
+ contains a javascript: URL, then remove attribute.
If child's namespace is
+ the MathML Namespace, attribute's local name is "href",
+ and attribute's namespace is
+ null or the XLink namespace, and attribute contains a
+ javascript: URL, then remove attribute.
If the built-in animating URL attributes list contains the pair (elementName, attrName), and
+ attribute's value is "href" or "xlink:href", then remove attribute.
Run the inner sanitize steps given child, + configuration, and handleJavascriptNavigationUrls.
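
The allow-list and remove-list checks applied to each attribute in the steps above can be condensed into one decision function. The sketch below is an approximation under stated assumptions: attributes are plain strings in the null namespace, the javascript: URL and animating URL checks are left out, and the type and function names are hypothetical.

```typescript
// Hypothetical, simplified model: attribute names are plain strings with a
// null namespace. The javascript: URL handling is deliberately not modeled.
interface AttrFilterConfigSketch {
  attributes?: string[];
  removeAttributes?: string[];
  dataAttributes?: boolean;
}

interface AttrFilterLocalSketch {
  attributes?: string[];
  removeAttributes?: string[];
}

function listHas(list: string[] | undefined, item: string): boolean {
  return list !== undefined && list.indexOf(item) !== -1;
}

// Returns true when the attribute survives the allow/remove-list checks.
function keepsAttributeSketch(
  config: AttrFilterConfigSketch,
  local: AttrFilterLocalSketch,
  attrName: string
): boolean {
  // A local remove-list entry always wins.
  if (listHas(local.removeAttributes, attrName)) return false;

  if (config.attributes !== undefined) {
    // Global allow-list mode: keep if listed globally, listed locally, or
    // covered by the dataAttributes setting.
    const coveredAsData =
      attrName.slice(0, 5) === "data-" && config.dataAttributes === true;
    return (
      listHas(config.attributes, attrName) ||
      listHas(local.attributes, attrName) ||
      coveredAsData
    );
  }

  // Global remove-list mode: a local allow-list, when present, is exclusive.
  if (local.attributes !== undefined && !listHas(local.attributes, attrName)) {
    return false;
  }
  return !listHas(config.removeAttributes, attrName);
}
```

Expressing the steps as "keep" rather than "remove" makes the allow-list branch read as a simple disjunction of the three ways an attribute can be permitted.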
To determine whether an attribute attribute contains a javascript:
+ URL:
Let url be the result of running the basic URL parser on + attribute's value.
If url is failure, then return false.
Return true if url's scheme is "javascript", and false otherwise.
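
The check above can be approximated in script with the platform URL class. This is a sketch that assumes `new URL(value)` with no base URL behaves like running the basic URL parser on the attribute value; the function name is hypothetical.

```typescript
// Sketch of the "contains a javascript: URL" check, using the URL class as a
// stand-in for the basic URL parser (no base URL is supplied).
function containsJavascriptUrl(attributeValue: string): boolean {
  try {
    // The protocol getter returns the scheme followed by ":".
    return new URL(attributeValue).protocol === "javascript:";
  } catch {
    // A parse failure means the value is not treated as a javascript: URL.
    return false;
  }
}
```

Parsing the value rather than matching on a string prefix matters because the URL parser lowercases the scheme and strips leading whitespace, both of which a naive prefix test would miss.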
To remove an element + element from a SanitizerConfig configuration:
+ +Assert: configuration is valid.
Set element to the result of canonicalizing element.
Let modified be the result of removing
+ element from configuration["replaceWithChildrenElements"].
If configuration["elements"]
+ exists:
If configuration["elements"]
+ contains element:
Remove element from
+ configuration["elements"].
Return true.
Return modified.
Otherwise:
+ +If configuration["removeElements"] contains element, then return modified.
Append element to
+ configuration["removeElements"].
Return true.
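
The remove-an-element steps above can be sketched against a simplified configuration in which elements are identified by name only. The per-element attribute lists and namespaces are omitted, and the names `ElementListsSketch`, `dropEntry`, and `removeElementSketch` are hypothetical.

```typescript
// Hypothetical, simplified model: elements are plain name strings, and a
// valid configuration has either elements or removeElements, not both.
interface ElementListsSketch {
  elements?: string[];
  removeElements?: string[];
  replaceWithChildrenElements?: string[];
}

// Removes the first occurrence of entry; reports whether anything was removed.
function dropEntry(list: string[] | undefined, entry: string): boolean {
  if (list === undefined) return false;
  const index = list.indexOf(entry);
  if (index === -1) return false;
  list.splice(index, 1);
  return true;
}

function removeElementSketch(config: ElementListsSketch, element: string): boolean {
  // An element cannot be both removed and replaced with its children.
  const modified = dropEntry(config.replaceWithChildrenElements, element);
  if (config.elements !== undefined) {
    // Allow-list mode: removing means deleting the element from the allow-list.
    return dropEntry(config.elements, element) || modified;
  }
  // Remove-list mode: add the element unless it is already listed.
  const removeList = config.removeElements || (config.removeElements = []);
  if (removeList.indexOf(element) !== -1) return modified;
  removeList.push(element);
  return true;
}
```

The `modified` flag carries the effect of clearing the element from replaceWithChildrenElements, so the result is true whenever any list changed.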
To remove an attribute + attribute from a SanitizerConfig configuration:
+ +Assert: configuration is valid.
Set attribute to the result of canonicalizing attribute.
If configuration["attributes"] exists:
Let modified be the result of removing
+ attribute from configuration["attributes"].
If configuration["elements"]
+ exists:
For each element of
+ configuration["elements"]:
If element["attributes"] with default « » contains attribute:
Set modified to true.
Remove attribute from
+ element["attributes"].
If element["removeAttributes"]
+ with default « » contains attribute:
Assert: modified is true.
Remove attribute from
+ element["removeAttributes"].
Return modified.
Otherwise:
+ +If configuration["removeAttributes"] contains attribute, then return false.
If configuration["elements"]
+ exists:
For each element in
+ configuration["elements"]:
If element["attributes"] with default « » contains attribute, then remove attribute from element["attributes"].
If element["removeAttributes"]
+ with default « » contains attribute, then remove attribute from element["removeAttributes"].
Append attribute to
+ configuration["removeAttributes"].
Return true.
To remove unsafe from a SanitizerConfig configuration:
+ +Assert: configuration is valid.
Let result be false.
For each element in built-in safe
+ baseline configuration["removeElements"]:
If removing + element from configuration is true, then set result to + true.
For each attribute in built-in safe
+ baseline configuration["removeAttributes"]:
If removing + attribute from configuration is true, then set result to + true.
For each attribute that is an event handler content attribute:
+ +If removing + attribute from configuration is true, then set result to + true.
Return result.
To compare sanitizer items itemA and itemB:
Let namespaceA be itemA["namespace"].
Let namespaceB be itemB["namespace"].
If namespaceA is null:
If namespaceB is not null, then return true.
Otherwise:
If namespaceB is null, then return false.
If namespaceA is code unit less than namespaceB, then return true.
If namespaceA is not namespaceB, then return false.
If itemA["name"] is code unit less than itemB["name"], then return true.
Return false.
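The comparison can be sketched directly in JavaScript, whose < operator on strings compares UTF-16 code units and therefore matches "code unit less than". The function names below are hypothetical. The sort adapter is only one possible use of the comparison and is an assumption, not something the steps above prescribe.

```javascript
// Returns true when itemA orders before itemB: null namespaces come first,
// then namespaces in code unit order, then names in code unit order.
function compareSanitizerItems(itemA, itemB) {
  const namespaceA = itemA.namespace;
  const namespaceB = itemB.namespace;
  if (namespaceA === null) {
    if (namespaceB !== null) return true;
  } else {
    if (namespaceB === null) return false;
    if (namespaceA < namespaceB) return true;
    if (namespaceA !== namespaceB) return false;
  }
  return itemA.name < itemB.name;
}

// Adapter for Array.prototype.sort, which expects a negative, zero or
// positive number rather than a boolean "less than" answer.
function sortSanitizerItems(items) {
  return [...items].sort((a, b) => {
    if (compareSanitizerItems(a, b)) return -1;
    if (compareSanitizerItems(b, a)) return 1;
    return 0;
  });
}
```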
To canonicalize a SanitizerElementWithAttributes element:
Let result be the result of canonicalizing element.
If element is a dictionary:
If element["attributes"] exists, then set result["attributes"] to the result of canonicalizing element["attributes"].
If element["removeAttributes"] exists, then set result["removeAttributes"] to the result of canonicalizing element["removeAttributes"].
If neither result["attributes"] nor result["removeAttributes"] exists, then set result["removeAttributes"] to an empty list.
Return result.
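A minimal sketch of these steps follows. The underlying element and attribute canonicalization is not shown in this section, so the sketch makes two labelled assumptions: a plain string element (or a dictionary without a namespace) canonicalizes to the HTML namespace, and a plain string attribute canonicalizes to a null namespace. Canonicalizing an attribute list is modelled as a simple map. All function names are hypothetical.

```javascript
const HTML_NAMESPACE = "http://www.w3.org/1999/xhtml";

// Assumption: a string becomes { name, namespace: defaultNamespace }, and a
// dictionary without a namespace member receives the same default.
function canonicalizeName(item, defaultNamespace) {
  if (typeof item === "string") {
    return { name: item, namespace: defaultNamespace };
  }
  const namespace = item.namespace === undefined ? defaultNamespace : item.namespace;
  return { name: item.name, namespace };
}

// Assumption: attributes default to no namespace (null).
const canonicalizeAttribute = (attribute) => canonicalizeName(attribute, null);

function canonicalizeElementWithAttributes(element) {
  const result = canonicalizeName(element, HTML_NAMESPACE);
  if (typeof element !== "string") {
    if (element.attributes !== undefined) {
      result.attributes = element.attributes.map(canonicalizeAttribute);
    }
    if (element.removeAttributes !== undefined) {
      result.removeAttributes = element.removeAttributes.map(canonicalizeAttribute);
    }
  }
  // An element with neither list gets an empty per-element remove-list.
  if (result.attributes === undefined && result.removeAttributes === undefined) {
    result.removeAttributes = [];
  }
  return result;
}
```

The final step means every canonical element carries at least one of the two per-element lists, which later steps can rely on.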
To determine whether a canonical SanitizerConfig config is valid:
It's expected that the configuration being passed in has previously been run through the canonicalize the configuration steps. We will simply assert conditions that the algorithm guarantees to hold.
Assert: config["elements"] exists or config["removeElements"] exists.
If config["elements"] exists and config["removeElements"] exists, then return false.
Assert: Either config["processingInstructions"] exists or config["removeProcessingInstructions"] exists.
If config["processingInstructions"] exists and config["removeProcessingInstructions"] exists, then return false.
Assert: Either config["attributes"] exists or config["removeAttributes"] exists.
If config["attributes"] exists and config["removeAttributes"] exists, then return false.
Assert: All SanitizerElementNamespaceWithAttributes, SanitizerElementNamespace, SanitizerProcessingInstruction, and SanitizerAttributeNamespace items in config are canonical, meaning they have been canonicalized as appropriate.
If config["elements"] exists:
If config["elements"] has duplicates, then return false.
Otherwise:
If config["removeElements"] has duplicates, then return false.
If config["replaceWithChildrenElements"] exists and has duplicates, then return false.
If config["processingInstructions"] exists:
If config["processingInstructions"] has duplicates, then return false.
Otherwise:
If config["removeProcessingInstructions"] has duplicates, then return false.
If config["attributes"] exists:
If config["attributes"] has duplicates, then return false.
Otherwise:
If config["removeAttributes"] has duplicates, then return false.
If config["replaceWithChildrenElements"] exists:
For each element of config["replaceWithChildrenElements"]:
If the built-in non-replaceable elements list contains element, then return false.
If config["elements"] exists:
If the intersection of config["elements"] and config["replaceWithChildrenElements"] is not empty, then return false.
Otherwise:
If the intersection of config["removeElements"] and config["replaceWithChildrenElements"] is not empty, then return false.
If config["attributes"] exists:
Assert: config["dataAttributes"] exists.
If config["elements"] exists:
For each element of config["elements"]:
If element["attributes"] exists and element["attributes"] has duplicates, then return false.
If element["removeAttributes"] exists and element["removeAttributes"] has duplicates, then return false.
If the intersection of config["attributes"] and element["attributes"] with default « » is not empty, then return false.
If element["removeAttributes"] with default « » is not a subset of config["attributes"], then return false.
If config["dataAttributes"] is true and element["attributes"] contains a custom data attribute, then return false.
If config["dataAttributes"] is true and config["attributes"] contains a custom data attribute, then return false.
Otherwise:
If config["elements"] exists:
For each element of config["elements"]:
If element["attributes"] exists and element["removeAttributes"] exists, then return false.
If element["attributes"] exists and element["attributes"] has duplicates, then return false.
If element["removeAttributes"] exists and element["removeAttributes"] has duplicates, then return false.
If the intersection of config["removeAttributes"] and element["attributes"] with default « » is not empty, then return false.
If the intersection of config["removeAttributes"] and element["removeAttributes"] with default « » is not empty, then return false.
If config["dataAttributes"] exists, then return false.
Return true.
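A partial sketch of the validity check is shown below. It deliberately covers only the configuration-level rules: allow-list and remove-list members are mutually exclusive, element and attribute lists have no duplicates, replaceWithChildrenElements avoids the non-replaceable elements and does not overlap the element list, and dataAttributes is absent when a remove-list is used for attributes. The per-element attribute checks, the custom data attribute checks and the duplicate check for processing instructions are omitted. The function and helper names are hypothetical, and items are assumed to be canonical dictionaries with "name" and "namespace" members.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";

// Mirrors the built-in non-replaceable elements list given in this section.
const NON_REPLACEABLE_ELEMENTS = [
  { name: "html", namespace: HTML_NS },
  { name: "svg", namespace: "http://www.w3.org/2000/svg" },
  { name: "math", namespace: "http://www.w3.org/1998/Math/MathML" },
];

// Canonical items are identified by their (namespace, name) pair.
const keyOf = (item) => JSON.stringify([item.namespace, item.name]);
const hasDuplicates = (list) => new Set(list.map(keyOf)).size !== list.length;
const intersects = (listA, listB) => {
  const keys = new Set(listA.map(keyOf));
  return listB.some((item) => keys.has(keyOf(item)));
};

// Partial model: returns false on the first violated rule, true otherwise.
function passesTopLevelChecks(config) {
  const pairs = [
    ["elements", "removeElements"],
    ["processingInstructions", "removeProcessingInstructions"],
    ["attributes", "removeAttributes"],
  ];
  for (const [allow, remove] of pairs) {
    if (config[allow] !== undefined && config[remove] !== undefined) return false;
  }
  const checkedLists = ["elements", "removeElements", "replaceWithChildrenElements",
    "attributes", "removeAttributes"];
  for (const member of checkedLists) {
    if (config[member] !== undefined && hasDuplicates(config[member])) return false;
  }
  const replaced = config.replaceWithChildrenElements;
  if (replaced !== undefined) {
    if (intersects(NON_REPLACEABLE_ELEMENTS, replaced)) return false;
    const elementList = config.elements ?? config.removeElements ?? [];
    if (intersects(elementList, replaced)) return false;
  }
  // dataAttributes only has meaning alongside an attribute allow-list.
  if (config.attributes === undefined && config.dataAttributes !== undefined) return false;
  return true;
}
```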
An element's sanitization category can be one of the following:
[Category definitions and the table describing how they apply are not recoverable here. Surviving text: "setHTMLUnsafe() or parseHTMLUnsafe(), and removed by setHTML(), parseHTML(), and removeUnsafe()."]
The built-in safe baseline configuration is a SanitizerConfig. Its removeElements list consists of all HTML elements normatively marked as Unsafe within their individual definitions, along with the obsolete frame element and the SVG script and SVG use elements. Its removeAttributes list is empty.
Event handler content attributes are automatically removed by the remove unsafe algorithm during safe sanitization, so the effective baseline behaves as if they were included in the removeAttributes list.
The built-in safe default configuration is a SanitizerConfig whose members are initialized as follows:
processingInstructions
attributes: dir, lang, title, alignment-baseline, baseline-shift, clip-path, clip-rule, color, color-interpolation, cursor, direction, display, displaystyle, dominant-baseline, fill, fill-opacity, fill-rule, font-family, font-size, font-size-adjust, font-stretch, font-style, font-variant, font-weight, letter-spacing, marker-end, marker-mid, marker-start, mathbackground, mathcolor, mathsize, opacity, paint-order, pointer-events, scriptlevel, shape-rendering, stop-color, stop-opacity, stroke, stroke-dasharray, stroke-dashoffset, stroke-linecap, stroke-linejoin, stroke-miterlimit, stroke-opacity, stroke-width, text-anchor, text-decoration, text-overflow, text-rendering, transform, transform-origin, unicode-bidi, vector-effect, visibility, white-space, word-spacing, writing-mode
comments
dataAttributes
elements: [...] attributes list.
The following table lists the MathML and SVG elements that are categorized as Default in the built-in safe default configuration, represented as a list of SanitizerElementNamespaceWithAttributes dictionaries. For each row in the table, the "Element" column corresponds to the name member, the "Namespace" column corresponds to the namespace member, and the "Allowed attributes" column corresponds to the attributes member (where each listed attribute is represented as a SanitizerAttribute in the sequence):
| Element | Namespace | Allowed attributes |
|---|---|---|
| math | MathML | |
| merror | MathML | |
| mfrac | MathML | |
| mi | MathML | |
| mmultiscripts | MathML | |
| mn | MathML | |
| mo | MathML | fence, form, largeop, lspace, maxsize, minsize, movablelimits, rspace, separator, stretchy, symmetric |
| mover | MathML | accent |
| mpadded | MathML | depth, height, lspace, voffset, width |
| mphantom | MathML | |
| mprescripts | MathML | |
| mroot | MathML | |
| mrow | MathML | |
| ms | MathML | |
| mspace | MathML | depth, height, width |
| msqrt | MathML | |
| mstyle | MathML | |
| msub | MathML | |
| msubsup | MathML | |
| msup | MathML | |
| mtable | MathML | |
| mtd | MathML | columnspan, rowspan |
| mtext | MathML | |
| mtr | MathML | |
| munder | MathML | accentunder |
| munderover | MathML | accent, accentunder |
| semantics | MathML | |
| a | SVG | href, hreflang, type |
| circle | SVG | cx, cy, pathLength, r |
| defs | SVG | |
| desc | SVG | |
| ellipse | SVG | cx, cy, pathLength, rx, ry |
| foreignObject | SVG | height, width, x, y |
| g | SVG | |
| line | SVG | pathLength, x1, x2, y1, y2 |
| marker | SVG | markerHeight, markerUnits, markerWidth, orient, preserveAspectRatio, refX, refY, viewBox |
| metadata | SVG | |
| path | SVG | d, pathLength |
| polygon | SVG | pathLength, points |
| polyline | SVG | pathLength, points |
| rect | SVG | height, pathLength, rx, ry, width, x, y |
| svg | SVG | height, preserveAspectRatio, viewBox, width, x, y |
| text | SVG | dx, dy, lengthAdjust, rotate, textLength, x, y |
| textPath | SVG | lengthAdjust, method, path, side, spacing, startOffset, textLength |
| title | SVG | |
| tspan | SVG | dx, dy, lengthAdjust, rotate, textLength, x, y |
The built-in navigating URL attributes list corresponds to all HTML elements marked with navigating URL attributes in their normative definitions, as well as the element-attribute pairs represented in the following table. For each row in the table, the element corresponds to a SanitizerElementNamespace whose name member is given by the "Element" column and whose namespace member is given by the "Element namespace" column; and the attribute corresponds to a SanitizerAttributeNamespace whose name member is given by the "Attribute" column and whose namespace member is given by the "Attribute namespace" column:
| Element | Element namespace | Attribute | Attribute namespace |
|---|---|---|---|
| a | SVG | href | no namespace |
| a | SVG | href | XLink |
The built-in animating URL attributes list is the list of element-attribute pairs represented by the following table. For each row in the table, the element corresponds to a SanitizerElementNamespace whose name member is given by the "Element" column and whose namespace member is given by the "Element namespace" column; and the attribute corresponds to a SanitizerAttributeNamespace whose name member is given by the "Attribute" column and whose namespace member is null:
| Element | Element namespace | Attribute |
|---|---|---|
| animate | SVG | attributeName |
| animateTransform | SVG | attributeName |
| set | SVG | attributeName |
The built-in non-replaceable elements list contains elements that must not be replaced with their children, as doing so can lead to re-parsing issues or an invalid node tree. It is the list of SanitizerElementNamespace dictionaries represented by the table below. For each row in the table, the "Element" column corresponds to the name member, and the "Element namespace" column corresponds to the namespace member:
| Element | Element namespace |
|---|---|
| html | HTML |
| svg | SVG |
| math | MathML |
The Sanitizer API is intended to prevent DOM-based cross-site scripting by traversing supplied HTML content and removing elements and attributes according to a configuration. By design, the setHTML() and parseHTML() methods remove script-capable markup regardless of the configuration supplied; if any configuration could preserve such markup through these methods, that would be a bug.
However, there are security issues that the Sanitizer API cannot prevent. The following sections describe them.
The Sanitizer API operates solely in the DOM and adds a capability to traverse and filter an existing DocumentFragment. The Sanitizer API does not address server-side reflected or stored XSS.
DOM clobbering describes an attack in which malicious HTML confuses an application by using id or name attributes such that DOM properties, such as the children property of an HTML element, are shadowed by malicious content.
The Sanitizer API does not protect against DOM clobbering attacks by default, but can be configured to remove id and name attributes.
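As a rough sketch of such a configuration, the snippet below passes a remove-list naming id and name to setHTML(). The helper name setHTMLWithoutClobbering is hypothetical. The snippet assumes that plain strings are accepted as attribute entries and that the sanitizer option accepts a configuration dictionary. A configuration that lists only removeAttributes is a remove-list, so it is presumably more permissive than the built-in default for everything other than these two attributes; authors who want the default allow-list would need to start from it instead.

```javascript
// Remove-list configuration: strip the two attributes used for DOM clobbering.
const antiClobberingConfig = {
  removeAttributes: ["id", "name"],
};

// Hypothetical helper: target is any object exposing setHTML(), which in a
// browser would be an Element or ShadowRoot.
function setHTMLWithoutClobbering(target, html) {
  target.setHTML(html, { sanitizer: antiClobberingConfig });
}
```

In a browser that supports the API, a call such as setHTMLWithoutClobbering(container, untrustedString) would insert the parsed content without any id or name attributes, in addition to the script-capable markup that setHTML() always removes.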
Script gadgets are a technique in which an attacker uses existing application code from popular JavaScript libraries to cause their own code to execute. This is often done by injecting innocent-looking code or seemingly inert DOM nodes that are only parsed and interpreted by a framework, which then executes JavaScript based on that input.
The Sanitizer API cannot prevent these attacks. Instead, it relies on authors to explicitly allow unknown elements in general, and additionally to explicitly allow attributes, elements, and markup commonly used for templating and framework-specific code, such as data-* and slot attributes and elements like slot and template. These restrictions are not exhaustive, and authors are encouraged to examine their third-party libraries for this behavior.
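One way an author might review a custom allow-list for the markup named above is a small audit pass over the configuration before using it. The helper below is a hypothetical sketch and not part of the Sanitizer API. It ignores namespaces, only knows the examples mentioned in this section (data-* and slot attributes, slot and template elements), and is therefore no substitute for examining the libraries actually in use.

```javascript
// Hypothetical audit helper; it is not part of the Sanitizer API.
const GADGET_PRONE_ELEMENTS = new Set(["slot", "template"]);

// Entries may be plain strings or dictionaries with a "name" member.
const nameOf = (entry) => (typeof entry === "string" ? entry : entry.name);

const isGadgetProneAttribute = (name) => name === "slot" || name.startsWith("data-");

// Returns human-readable notes for allow-list entries worth a second look.
function findGadgetProneEntries(config) {
  const findings = [];
  if (config.dataAttributes === true) findings.push("dataAttributes");
  for (const element of config.elements ?? []) {
    const elementName = nameOf(element);
    if (GADGET_PRONE_ELEMENTS.has(elementName)) {
      findings.push(`element ${elementName}`);
    }
    // Per-element attribute allow-lists exist only on dictionary entries.
    const localAttributes = typeof element === "string" ? [] : (element.attributes ?? []);
    for (const attribute of localAttributes) {
      const attributeName = nameOf(attribute);
      if (isGadgetProneAttribute(attributeName)) {
        findings.push(`attribute ${attributeName} on ${elementName}`);
      }
    }
  }
  for (const attribute of config.attributes ?? []) {
    const attributeName = nameOf(attribute);
    if (isGadgetProneAttribute(attributeName)) {
      findings.push(`attribute ${attributeName}`);
    }
  }
  return findings;
}
```

An empty result only means that none of the listed examples were allowed; it does not show that the configuration is free of script gadgets.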
Mutation XSS or mXSS describes an attack that exploits cases where the parsed DOM structure is not the same after serializing and parsing again, to bypass sanitization that happens before serialization. One way to carry out such an attack is to rely on the change of parsing behavior for foreign content or mis-nested tags.
The Sanitizer API offers only functions that turn a string into a node tree. The context is supplied implicitly by all sanitizer functions: setHTML() uses the current element; Document.parseHTML() creates a new document. Therefore, the Sanitizer API is not directly affected by mutation XSS.
If a developer were to retrieve a sanitized node tree as a string, e.g. via innerHTML, and then parse it again, mutation XSS can occur. This practice is strongly discouraged. If processing or passing of HTML as a string is necessary after all, then any string is to be considered untrusted and re-sanitized when inserted into the DOM. In other words, a sanitized and then serialized HTML tree can no longer be considered sanitized. A more complete treatment of mXSS can be found in [MXSS].
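The advice above can be sketched as a small helper that routes every serialized string back through setHTML() rather than assigning it to innerHTML. The function name reinsertSerializedHTML is hypothetical, and the optional sanitizer argument is an illustration of passing a custom configuration, not a requirement.

```javascript
// Hypothetical helper: treats any HTML string as untrusted, even if it was
// produced by serializing a previously sanitized tree.
function reinsertSerializedHTML(target, serializedHTML, sanitizer) {
  // Deliberately not: target.innerHTML = serializedHTML;
  const options = sanitizer === undefined ? {} : { sanitizer };
  target.setHTML(serializedHTML, options);
}
```

In a browser, reinsertSerializedHTML(destination, source.innerHTML) re-parses the string in the context of destination and sanitizes it again, so any markup that changed meaning during the round trip is filtered at insertion time.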
The setTimeout() and [MULTIPLEBUFFERING]