Solve XSS by signing SCRIPT tags (jgc.org)
19 points by jgrahamc on July 4, 2010 | hide | past | favorite | 15 comments


What about attribute-based XSS? How do you sign "onmouseover"?

What about dynamically-generated scripts that include user input? People do this all the time in the real world. Even if you sign it, you're still screwed if you get quoting wrong.

The crypto is also superfluous; you'd get the same protection by having the server set a long random nonce in a header, and then require every <script> element to bear the same nonce. No crypto required.
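A minimal sketch of that nonce idea, server-side (hypothetical helper and header name; this is an illustration of the argument, not any real browser mechanism):

```python
# Sketch of the per-render nonce idea: the server mints a fresh,
# unpredictable token per page render, announces it in a response header,
# and stamps it on every legitimate <script> element. Injected markup
# can't carry the right value because it's never known in advance.
# (render_page and the "X-Script-Nonce" header are made-up names.)
import secrets

def render_page(body_script: str) -> tuple[dict, str]:
    nonce = secrets.token_urlsafe(32)        # fresh per render, like a CSRF token
    headers = {"X-Script-Nonce": nonce}      # hypothetical header name
    page = f'<script nonce="{nonce}">{body_script}</script>'
    return headers, page

headers, page = render_page("console.log('hi')")
```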


Clearly you ban onmouseover etc. inline. These have to be set inside a <script>. Agree about the user input part.

How do you securely tell the browser what the nonce is to check?


It doesn't matter how secure the nonce is. All that matters is that an attacker can't predict what it'll be on each page render (just like a CSRF token), which means they can't craft an input that'll render as Javascript --- and, in particular, they can't ever expect to store such an input and have it render as JS.

I'm not endorsing "script tokens" either --- I think it's a bad idea to change all browsers for a half-measure --- but I think you can execute your core idea far more simply than with crypto.


Not quite. onmouseover can appear as an attribute on ordinary elements, e.g. on an <a> tag alongside its href.


XSS isn't only about SCRIPT tags. What about javascript URIs, IMG and IFRAME tags, flash elements that you can add in a page by XSS? Filtering what is shown to the user is still the best solution IMHO.


Could you explain what you mean by this?

Do you mean filtering what is sent to the browser - validating libraries so user input cannot generate URIs, IMGs, etc?


This is a hard problem. There are all sorts of places you can hide JS. My favourite is inside a data URI in an iframe, e.g.:

<iframe src="data:text/html;base64,PHNjcmlwdD5hbGVydCgnRk9PQkFSJyk8L3NjcmlwdD4="></iframe>
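Decoding the base64 payload above shows what the iframe will quietly render and execute:

```python
# Decode the data: URI payload from the iframe example above.
import base64

payload = "PHNjcmlwdD5hbGVydCgnRk9PQkFSJyk8L3NjcmlwdD4="
decoded = base64.b64decode(payload).decode("utf-8")
print(decoded)  # → <script>alert('FOOBAR')</script>
```

A filter that only inspects the literal markup never sees a `<script>` tag at all.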


Thanks John for posting this as a main thread of its own. I noticed that your comment was lost in a flood of other comments in the YouTube thread. I don't know if you saw my comment, but I mentioned that your solution, while technically very solid, requires a lot more work for everyone.

Why wouldn't this work: have a meta tag (or something in the <head>) that says: do not allow inline script tags on this HTML page; only run scripts from external .js files; no onclick/onmouseover code allowed in HTML. For additional security (I don't know if this is possible or not), external .js files cannot do document.write("<script>...") or $("#foo").html("<script>..."), i.e. if an external .js file makes an Ajax call and document.write()s the response, the response should be handled as non-executable. Moreover, only external .js files can attach events, and only via attachEvent etc., not via the on* attributes.


1. Browsers can only run JS inside application/javascript files.

2. Browsers will ignore any 'on' attributes.

3. HTML may link to JS with one or more LINK tags.

4. JS may assign 'on' attribute event handlers to individual HTML tags.

5. JS will have access only to the cookies of the domain hosting the JS file.

There's still a risk of bad LINK tags creeping in, but the code will only be able to see the cookies of the domain hosting the dodgy JS.
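Rules 1 and 2 can be illustrated with a toy server-side filter (a sketch only; the proposal above is for browsers to enforce this natively):

```python
# Toy illustration of rules 1-2: drop inline <script> bodies and strip
# any on* attributes before markup reaches the browser. Not a complete
# sanitizer -- just a demonstration of what the policy would enforce.
from html.parser import HTMLParser

class PolicyFilter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.out = []
        self.in_inline_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and not dict(attrs).get("src"):
            self.in_inline_script = True          # rule 1: no inline JS
            return
        kept = [(k, v) for k, v in attrs if not k.startswith("on")]  # rule 2
        rendered = "".join(f' {k}="{v}"' for k, v in kept)
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag == "script" and self.in_inline_script:
            self.in_inline_script = False
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_inline_script:
            self.out.append(data)

f = PolicyFilter()
f.feed('<p onclick="evil()">hi</p><script>evil()</script>')
result = "".join(f.out)
print(result)  # → <p>hi</p>
```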


So, I've been using this guide as the gold standard: http://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%2...

What it seems to boil down to is: sanitize user input.

In what circumstances is this not sufficient?


When your sanitizer fails against a previously-untested kind of input, such as the recent problems on YouTube. As audiodude on reddit[1] points out, all you needed was "<script><script>PAYLOAD" to break their sanitizer. I'm honestly surprised this wasn't discovered sooner; odds are someone just got lucky.
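A hypothetical reconstruction of that failure mode (not YouTube's actual code): a sanitizer that strips one `<script>` tag and assumes it's done, versus simply entity-encoding everything.

```python
# Hypothetical naive sanitizer: removes the first <script> tag it finds,
# assuming there can only be one. A doubled opening tag survives it.
import html

def naive_sanitize(s: str) -> str:
    return s.replace("<script>", "", 1)

survived = naive_sanitize("<script><script>PAYLOAD")
print(survived)  # → <script>PAYLOAD  (still live markup)

# Entity-encoding the whole string instead leaves nothing executable:
print(html.escape("<script><script>PAYLOAD"))
# → &lt;script&gt;&lt;script&gt;PAYLOAD
```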

But yes. Sanitizing user input (assuming a perfect sanitizer) can assure perfect XSS protection, but at a rather severe loss of features (no images / URLs / formatting in your comments). A separate, limited DSL is often the solution for this (BBCode, Markdown, etc.), but those sometimes make mistakes too, especially when they're young.

[1]: http://www.reddit.com/r/programming/comments/cluc5/html_inje...


If you're going with a DSL anyway, making that language happen to look like a strictly-defined subset of HTML doesn't necessarily reduce its security (plus it allows for a WYSIWYG editor option). But too often what happens next on the back end is an attempt to "sanitize" that input in place and output whatever's left, rather than to fully parse it. That just leads to an arms race between sanitizers and XSS exploiters.

Instead, parse the HTML just as you'd have to with BBCode or Markdown. Store it in an internal representation that's only capable of the minimum needed formatting features. (There's no actual HTML left at this stage. It's equivalent to any other DSL.) Then render HTML out of that parse tree, so that every bit of user input is HTML entity-encoded, and everything else (tags and attributes) comes from constants in the program.
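A minimal sketch of that parse-then-render approach, under an example whitelist of three tags (the `ALLOWED` set and the tuple-based tree are illustrative choices, not a real library):

```python
# Parse a tiny whitelisted subset of HTML into an internal tree -- no raw
# HTML survives parsing -- then render HTML back out, entity-encoding all
# user text so every emitted tag comes from program constants.
import html
from html.parser import HTMLParser

ALLOWED = {"b", "i", "p"}  # the DSL's whole vocabulary (example choice)

class DSLParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = ("root", [])
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED:                 # attributes are dropped entirely
            node = (tag, [])
            self.stack[-1][1].append(node)
            self.stack.append(node)
        # unknown tags are discarded (their text still comes through as data)

    def handle_endtag(self, tag):
        if tag in ALLOWED and len(self.stack) > 1 and self.stack[-1][0] == tag:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1][1].append(data)     # plain text node

def render(node):
    tag, children = node
    inner = "".join(
        render(c) if isinstance(c, tuple) else html.escape(c) for c in children
    )
    return inner if tag == "root" else f"<{tag}>{inner}</{tag}>"

p = DSLParser()
p.feed('<b>bold</b> & <script>alert(1)</script>')
out = render(p.root)
print(out)
```

Note that text inside a disallowed tag still appears in the output, but only as harmless entity-encoded text; there is no path by which user input can become a tag.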

This can be even more secure than regexp-based DSL translators that build up a result from input in multiple passes, since they tend to lack such a well-defined separation between input text and output HTML.


This makes perfect sense and it's a cogent explanation besides.

Do you know of good libraries for this type of work off the top of your head? Bonus points for native Java.


Well, this guide says that you must sanitize user input, but at input time you may not know where it will come out. So the solution proposed by this guide is to escape the output. It is another layer of defense: you don't know whether you have properly verified the input, but everything you send to the output will be clean.


To me it seems input is anything coming from the user to the database. I was thinking of just escaping, e.g. <> -> &lt; &gt;, but maybe this is naive, or sometimes you require proper HTML input.



