Wednesday, October 19, 2005

Security: Block worms from abusing your web apps

WARNING: This blog entry was imported from my old blog on (which used different blogging software), so formatting and links may not be correct.

I came across this fascinating piece about a worm that spread quickly through the MySpace web site.
What's remarkable is that the web site had pretty good security. The person who did it used some pretty clever techniques to make it work.
I highly recommend reading the write-up, and especially his detailed explanation for how he worked around each security measure.

Briefly, since the site lets you enter HTML in your profile, he wrote very clever HTML that would get executed in other people's browsers whenever they viewed his profile. The executed code would automatically make various background posts (using Ajax techniques) to send requests and confirmation responses on the viewer's behalf. It would add the author as a "friend" to the viewer's account, and then add the same HTML code to the viewer's profile, so that anyone viewing that profile would become "infected" in the same way. The worm spread to a million users in a matter of hours!

MySpace already had various security measures in place, such as blocking dangerous tags and completely removing strings such as "javascript" and "onreadystate". He got around that by taking advantage of browser bugs. For example, even though MySpace removed references to javascript, it would not remove java\nscript, and it turns out IE will still recognize this as a javascript URL, even with embedded newlines in the name! As another example, even though onreadystate is blocked, he could use his ability to execute JavaScript to reconstruct it with string concatenation:

eval('xmlhttp.onread' + 'ystatechange = callback');
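
To make the weakness concrete, here is a minimal sketch of a blacklist filter of the kind described above. It is not MySpace's actual code; the class and method names are made up for illustration, and the only banned strings are the two mentioned in the write-up.

```java
// Hypothetical demo class; an assumption for illustration, not MySpace's filter.
class NaiveBlacklistDemo {

    // Naive filter: delete every literal occurrence of a banned string.
    static String stripBanned(String html) {
        return html.replace("javascript", "").replace("onreadystate", "");
    }

    public static void main(String[] args) {
        // The plain keyword is removed as intended.
        System.out.println(stripBanned("<a href=\"javascript:run()\">x</a>"));

        // A newline inside the keyword defeats the literal match,
        // so the input comes back unchanged.
        String sneaky = "<div style=\"background:url('java\nscript:run()')\">";
        System.out.println(stripBanned(sneaky).equals(sneaky)); // prints true

        // A name assembled at runtime never appears literally in the text.
        String split = "eval('xmlhttp.onread' + 'ystatechange = callback');";
        System.out.println(stripBanned(split).equals(split)); // prints true
    }
}
```

Both bypasses work for the same reason: the filter looks for exact strings, while the browser is far more forgiving about what it accepts.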

Fascinating. This got me thinking. What if you wanted to have a Comments feature in your web app written with Creator?
You want to allow the user to enter some HTML, so that they can add emphasis, use paragraphs, etc. But how do you prevent
HTML that compromises your web site?

It seems like a really safe way would be to prevent any HTML attributes from being entered! As long as you don't allow attributes, and you don't allow the style or script tags, you should be safe. This may be overly restrictive, especially for a site like MySpace which tries to let users customize their pages visually. But at least for a comments feature it should be viable.

I took a quick stab at this with Creator. Here's the plan: use a TextArea component for the comment, and add a validate event handler on the text area that checks the string for suspicious text. If it sees a problem, it raises a validation error, and the message is displayed in the Message component associated with the text area. I drop the components, then right-click on the
text area and choose Add Event Handler | validate.

Now we'll just need to write the code. Once you add the event handler, you're placed in the source editor and get to edit your code.
By default you get a simple comment telling you roughly how to write a validate event handler:

public void textField2_validate(FacesContext context, UIComponent component, Object value) {
    // TODO: Check the value parameter here, and if not valid, do something like this:
    // throw new ValidatorException(new FacesMessage("Not a valid value!"));
}

Here's a simple implementation - pretty naive. It doesn't try to properly parse the HTML,
it just looks for occurrences of <, and when found makes sure that it's properly matched
with nothing other than an approved tag and an optional / at the front or end.
Am I forgetting to handle some other tricks worm developers can take advantage of?

(The event handler was shown as an image, since many news readers completely butcher formatted
text in attempts to make blog entries conform to their site style; click the link for a text version.)
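
As a rough text sketch of the approach described above (not the original handler), the check could look something like the following. The class name, the method name, and the list of approved tags are all assumptions made for illustration. The logic is kept free of JSF types so it can be run on its own, and it assumes the comment is a non-null String.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; the name and the tag list are illustrative assumptions.
class CommentHtmlChecker {

    // Tags allowed in a comment. None of them may carry attributes.
    private static final Set<String> ALLOWED = new HashSet<>(
            Arrays.asList("b", "i", "em", "strong", "p", "br", "ul", "ol", "li", "pre", "code"));

    // Returns null if the comment is acceptable, otherwise a description of the problem.
    static String findProblem(String comment) {
        int open = comment.indexOf('<');
        while (open != -1) {
            int close = comment.indexOf('>', open + 1);
            if (close == -1) {
                return "Unmatched '<' at position " + open;
            }
            String inner = comment.substring(open + 1, close);
            String name = inner;
            if (name.startsWith("/")) {
                name = name.substring(1);                     // closing tag, e.g. </b>
            } else if (name.endsWith("/")) {
                name = name.substring(0, name.length() - 1);  // empty tag, e.g. <br/>
            }
            // Anything left besides a bare approved name (such as an attribute) is rejected.
            name = name.trim().toLowerCase(Locale.ROOT);
            if (!ALLOWED.contains(name)) {
                return "Tag or attribute not allowed: <" + inner + ">";
            }
            open = comment.indexOf('<', close + 1);
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(findProblem("Some <b>bold</b> text<br/>"));   // prints null
        System.out.println(findProblem("<b onclick=\"evil()\">hi</b>"));
        System.out.println(findProblem("<script>evil()</script>"));
    }
}
```

In the generated validate handler, one would presumably cast value to a String, call findProblem, and throw new ValidatorException(new FacesMessage(problem)) whenever the result is not null. Like the description above, this is naive: it does not parse the HTML, and it rejects a stray < used as plain text.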

Here's how this looks at runtime:

(No, the error text didn't magically move from the top of the text area to the bottom;
I moved it between taking the first screenshot and taking the deployment screenshot.)

P.S. Don't forget to hook up the Message component to the text area, so that it displays
errors raised by the text area's validator. Do that by dropping it, then following the advice
in the message area's default on-screen text: Ctrl-Shift drag from the message area
to the text area (or vice versa). If you use a Message Group component, you don't have
to "bind" it to any components; it will display error messages from any and all components
on the page. It's a good habit to always have one during development.


  1. Things must be terribly wrong with XML APIs. Why do people always consider string manipulation easier than using a parser?

  2. Have you ever tried validating hex-encoded inputs? If every ASCII character is converted to its hex code, browsers will still interpret the hex-encoded input successfully, while some validators check only the plain ASCII text. An example for user@domain.test is in the next line, but the same is possible for any other string.


  3. My preferred approach would be to use HTML Tidy to convert to XML, and then extract the tags I'd permit. In addition, I might use an XML parser to re-encode the string, cleaning out entity equivalents. Stripping out attributes sounds like a fine idea, although it means that you can't style your DIV.
    The problem that MySpace had was that it used simple string checking, when HTML is so relaxed and complex (in the sense that so many variations are equivalent) that a lot of broken fragments could pass through. In addition, MySpace did not check for CSS validity. If the CSS had been parsed and output in a standard form, it would have been easy to pick up attempts to circumvent the javascript checking.
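
The encoding concern raised in comments 2 and 3 can be sketched as a normalization step that runs before any validation. This assumes the "hex-encoded inputs" in question are HTML numeric character references such as &#x6A; (named entities and URL percent-encoding are not handled), and the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; an assumption for illustration only.
class EntityNormalizer {

    // Matches hexadecimal (&#x6A;) and decimal (&#106;) references; the semicolon is optional.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));?");

    // Replaces each numeric character reference with the character it stands for.
    static String decodeNumericRefs(String text) {
        Matcher m = NUMERIC_REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();  // leave out-of-range references untouched
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeNumericRefs("&#x6A;ava&#115;cript:run()"));
    }
}
```

Running a validator on the decoded text rather than the raw input means it sees roughly what the browser would see, which is the point both commenters are making.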