Inherits from: nsISupports
Last changed in Gecko 14.0 (Firefox 14.0 / Thunderbird 14.0 / SeaMonkey 2.11)
C++ callers within Gecko should use nsContentUtils, nsTreeSanitizer, and so on directly instead.
Implemented by: @mozilla.org/parserutils;1 as a service:
var parserUtils = Components.classes["@mozilla.org/parserutils;1"]
                  .getService(Components.interfaces.nsIParserUtils);
Method overview
AString convertToPlainText(in AString src, in unsigned long flags, in unsigned long wrapCol); |
nsIDOMDocumentFragment parseFragment(in AString fragment, in unsigned long flags, in boolean isXML, in nsIURI baseURI, in nsIDOMElement element); |
AString sanitize(in AString src, in unsigned long flags); |
Constants
Constant | Value | Description |
SanitizerAllowComments | (1 << 0) | Flag for sanitizer: Allow comment nodes. |
SanitizerAllowStyle | (1 << 1) | Flag for sanitizer: Allow <style> elements and style attributes (-moz-binding is removed if present). Note: If -moz-binding is absent, properties that might be XSS risks in other Web engines are preserved! |
SanitizerCidEmbedsOnly | (1 << 2) | Flag for sanitizer: Only allow cid: URLs for embedded content. At present, sanitizing CSS backgrounds and so on is not supported, so setting this together with SanitizerAllowStyle doesn't make sense. |
SanitizerDropNonCSSPresentation | (1 << 3) | Flag for sanitizer: Drops non-CSS presentational HTML elements and attributes, such as <font>, <center>, and the bgcolor attribute. |
SanitizerDropForms | (1 << 4) | Flag for sanitizer: Drops forms and form controls (excluding <fieldset> and <legend>). |
SanitizerDropMedia | (1 << 5) | Flag for sanitizer: Drops <img>, <video>, <audio>, and <source>, and flattens out SVG. |
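Because each constant occupies its own bit, several options can be combined with a bitwise OR before being passed as the flags argument. The following is a minimal sketch of that arithmetic, not part of the interface: the numeric values are copied from the table above, and the names SanitizerFlags, combineSanitizerFlags, and hasSanitizerFlag are hypothetical helpers introduced only for illustration.

```javascript
// Flag values copied from the constants table above.
const SanitizerFlags = Object.freeze({
  SanitizerAllowComments: 1 << 0,
  SanitizerAllowStyle: 1 << 1,
  SanitizerCidEmbedsOnly: 1 << 2,
  SanitizerDropNonCSSPresentation: 1 << 3,
  SanitizerDropForms: 1 << 4,
  SanitizerDropMedia: 1 << 5,
});

// Hypothetical helper: OR together the named flags, rejecting unknown names.
function combineSanitizerFlags(names) {
  let flags = 0;
  for (const name of names) {
    if (!Object.prototype.hasOwnProperty.call(SanitizerFlags, name)) {
      throw new Error("Unknown sanitizer flag: " + name);
    }
    flags |= SanitizerFlags[name];
  }
  return flags;
}

// Hypothetical helper: report whether a combined value includes a given flag.
function hasSanitizerFlag(flags, name) {
  return (flags & SanitizerFlags[name]) !== 0;
}
```

For example, combineSanitizerFlags(["SanitizerAllowStyle", "SanitizerDropForms"]) yields 18, that is (1 << 1) | (1 << 4). In privileged code the same value would normally be built from the interface's own constants rather than from a copied table.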
Methods
convertToPlainText()
Converts HTML to plain text.
AString convertToPlainText( in AString src, in unsigned long flags, in unsigned long wrapCol );
Parameters
- src: The HTML source to parse. (C++ callers are allowed but not required to use the same string for the return value.)
- flags: Conversion option flags defined in nsIDocumentEncoder.
- wrapCol: Number of characters per line; 0 for no auto-wrapping.
Return value
The plain text conversion of the HTML specified in src.
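A call to convertToPlainText() might be wrapped as in the sketch below. This is an illustration under assumptions, not the canonical usage: htmlToPlainText is a hypothetical helper, parserUtils is assumed to be an nsIParserUtils service instance obtained as shown at the top of this page, and the default of 0 for the nsIDocumentEncoder flags is a choice made for this example only.

```javascript
// Hypothetical wrapper around nsIParserUtils.convertToPlainText().
// `parserUtils` is assumed to be the service instance; it is passed in
// so the helper does not depend on how the service was obtained.
function htmlToPlainText(parserUtils, html, encoderFlags, wrapCol) {
  // Assumption for this sketch: default to no nsIDocumentEncoder flags.
  const flags = encoderFlags === undefined ? 0 : encoderFlags;
  // A wrap column of 0 means no auto-wrapping, as documented above.
  const column = wrapCol === undefined ? 0 : wrapCol;
  if (!Number.isInteger(column) || column < 0) {
    throw new RangeError("wrapCol must be a non-negative integer");
  }
  return parserUtils.convertToPlainText(html, flags, column);
}
```

A caller that wants wrapped output would pass a column explicitly, for example htmlToPlainText(parserUtils, html, 0, 72).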
parseFragment()
Parses markup into a sanitized document fragment.
nsIDOMDocumentFragment parseFragment( in AString fragment, in unsigned long flags, in boolean isXML, in nsIURI baseURI, in nsIDOMElement element );
Parameters
- fragment: The input markup.
- flags: Sanitization option flags defined above.
- isXML: true if fragment is XML and false if it is HTML.
- baseURI: The base URL for this fragment.
- element: The context node for the fragment parsing algorithm.
Return value
An nsIDOMDocumentFragment object for the resulting sanitized document fragment.
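The argument order of parseFragment() is easy to get wrong, so a small wrapper can make call sites clearer. The sketch below is illustrative only: parseSanitizedFragment and its options object are hypothetical, parserUtils is assumed to be the nsIParserUtils service instance, and baseURI and contextElement are assumed to be an nsIURI and a DOM element supplied by the caller.

```javascript
// Hypothetical wrapper around nsIParserUtils.parseFragment().
// `baseURI` and `contextElement` are supplied by the caller; this sketch
// makes no claim about which values the service accepts for them.
function parseSanitizedFragment(parserUtils, markup, baseURI, contextElement, options) {
  const opts = options || {};
  // Assumption for this sketch: no sanitizer flags unless requested.
  const flags = opts.flags === undefined ? 0 : opts.flags;
  // Treat the input as HTML unless the caller explicitly asks for XML.
  const isXML = opts.isXML === true;
  return parserUtils.parseFragment(markup, flags, isXML, baseURI, contextElement);
}
```

The returned fragment can then be inserted with ordinary DOM calls, for example contextElement.appendChild(fragment).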
sanitize()
Parses a string into an HTML document, sanitizes the document, and returns the result serialized to a string.
The sanitizer is designed to protect against XSS when sanitized content is inserted into a different-origin context without an iframe-equivalent sandboxing mechanism.
By default, the sanitizer doesn't try to avoid leaking to third parties the information that the content was viewed. For example, by default an <img> with a source pointing to an HTTP server potentially controlled by a third party is not removed. To avoid ambient information leakage upon loading the sanitized content, use the SanitizerCidEmbedsOnly flag. Even in that case, <a> links (and similar) to other content are preserved, so an explicit user action (following a link) after the content has been loaded can still leak information.
By default, non-dangerous non-CSS presentational HTML elements and attributes and forms are not removed. To remove these, use SanitizerDropNonCSSPresentation and/or SanitizerDropForms.
By default, comments and CSS are removed. To preserve comments, use SanitizerAllowComments. To preserve <style> elements and style attributes on other elements, use SanitizerAllowStyle. -moz-binding is removed from <style> elements and style attributes if present. In this case, properties that Gecko doesn't recognize can get removed as a side effect.
Note: If -moz-binding is not present in <style> elements and style attributes, and SanitizerAllowStyle is specified, the sanitized content may still be XSS dangerous if loaded into a non-Gecko Web engine!
AString sanitize( in AString src, in unsigned long flags );
Parameters
- src: The HTML source to parse. (C++ callers are allowed but not required to use the same string for the return value.)
- flags: Sanitization option flags defined above.
Return value
The resulting text.
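As a rough illustration of choosing flags for sanitize(), the sketch below builds a flag value for displaying untrusted HTML. Everything in it beyond the sanitize() call is an assumption: sanitizeForMessageView is a hypothetical helper, parserUtils is assumed to be the nsIParserUtils service instance, and the numeric values are copied from the Constants table on this page. Since that table notes that CSS backgrounds are not sanitized under the cid:-only flag, this sketch applies SanitizerCidEmbedsOnly only when styles are dropped.

```javascript
// Values copied from the Constants table on this page.
const ALLOW_STYLE = 1 << 1;     // SanitizerAllowStyle
const CID_EMBEDS_ONLY = 1 << 2; // SanitizerCidEmbedsOnly
const DROP_FORMS = 1 << 4;      // SanitizerDropForms

// Hypothetical helper: sanitize untrusted HTML for display.
// Forms are always dropped. Either styles are kept, or embedded content
// is restricted to cid: URLs; the two are never combined here.
function sanitizeForMessageView(parserUtils, html, keepStyle) {
  let flags = DROP_FORMS;
  if (keepStyle) {
    flags |= ALLOW_STYLE;
  } else {
    flags |= CID_EMBEDS_ONLY;
  }
  return parserUtils.sanitize(html, flags);
}
```

Whether keeping styles is acceptable depends on where the result will be rendered, given the caveat above about non-Gecko engines.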