========= ScriptGard =========

  1. "For example, we find that two sanitizers in our test application are not commutative: the order of application matters, only one order is safe, yet both orders appear in our empirical study."
  2. "Instead, we use binary rewriting of server code to embed a browser model that determines the appropriate browser parsing context when HTML is output by the web application."
  3. "It is well-known that script-injection attack vectors are highly context-dependent; a string such as expression: alert(a) is innocuous when placed inside a HTML tag context, but can result in JavaScript execution when embedded in a CSS attribute value context."
  4. "In particular, two sanitizer functions, EcmaScriptStringEncode and HtmlAttribEncode, are applied for the JavaScript string context and the HTML attribute context, respectively."
  5. "For instance, EcmaScriptStringEncode simply transforms all characters that can break out of JavaScript string literals (like the " character) to Unicode encoding (\u0022 for "), and HtmlAttribEncode HTML-entity encodes characters (&quot; for ")."
  6. "The key observation is that applying EcmaScriptStringEncode first encodes the attacker-supplied " character as a Unicode representation \u0022. This Unicode representation is not subsequently transformed by the second HtmlAttribEncode sanitization, because \u0022 is a completely innocuous string in the URI attribute value context."
  7. "For our purposes, we model a web browser as a parser consisting of sub-parsers for several languages"
  8. "More precisely, we treat the browser as a collection of parsers for different HTML standard-supported languages"
  9. "Conceptually, parsers for various languages are invoked in stages. After each sub-parser invocation, if a portion of the input HTML document is recognized to belong to another sub-language, that portion of the input is sent to the appropriate sub-language parser in the next stage."
  10. "As a result, any portion of the input HTML document may be recognized by one or more sub-grammars. Transitions from one sub-grammar to another are restricted through productions involving special transition symbols defined above as T, which is key for our formalization of context"
  11. "For instance, data recognized as a JavaScript string is subject to Unicode decoding before being passed to the AST."
  12. "In addition, HTML 5-compliant browsers subject data recognized as a URI to percent-encoding of certain characters before it is sent to the URI parser"
  13. "The goal of sanitization is typically to remove special characters that would lead to a sub-grammar transition"
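
Sketch for quotes 1 and 4-6: the order dependence of the two sanitizers can be reproduced with a small Python model. Everything here is an assumption made for illustration. The two functions are toy stand-ins named after the sanitizers in the quotes, not the real library implementations, and the "browser" is reduced to a single hypothetical step: untrusted data sits in a JavaScript string literal, the JavaScript parser decodes \uXXXX escapes, and the decoded value is then written into a double-quoted HTML attribute.

```python
import re

def ecma_script_string_encode(s: str) -> str:
    """Toy stand-in: rewrite characters that can end a JS string literal as \\uXXXX."""
    dangerous = {'"', "'", "\\"}
    return "".join(f"\\u{ord(c):04x}" if c in dangerous else c for c in s)

def html_attrib_encode(s: str) -> str:
    """Toy stand-in: HTML-entity encode characters special in attribute values."""
    table = {'"': "&quot;", "'": "&#39;", "<": "&lt;", ">": "&gt;", "&": "&amp;"}
    return "".join(table.get(c, c) for c in s)

def js_string_decode(s: str) -> str:
    """Hypothetical browser step: the JS parser decodes \\uXXXX escapes."""
    return re.sub(r"\\u([0-9a-fA-F]{4})", lambda m: chr(int(m.group(1), 16)), s)

def breaks_out_of_attribute(value: str) -> bool:
    """A raw double quote would terminate a double-quoted attribute value."""
    return '"' in value

payload = '" onmouseover="alert(1)'

# Order described in quote 6: JS encoding first, then HTML attribute encoding.
# The \u0022 sequences pass through the HTML encoder untouched.
js_then_html = html_attrib_encode(ecma_script_string_encode(payload))

# Reverse order: HTML attribute encoding first, then JS encoding.
html_then_js = ecma_script_string_encode(html_attrib_encode(payload))

for label, encoded in [("JS then HTML", js_then_html), ("HTML then JS", html_then_js)]:
    written = js_string_decode(encoded)  # what ends up inside the attribute
    print(f"{label}: {encoded!r} -> breaks out: {breaks_out_of_attribute(written)}")
```

In this model only the second order keeps the quote character encoded after the JavaScript decoding step, which is one way to read "only one order is safe". Which order is correct in a real application depends on the actual nesting of parsing contexts.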