ActionScript, XML, and Character Entities (solved)

 Feb 19, 2009

Nothing’s quite as fun as staying up until 2am hacking an entity encoding problem. We’re committed to keeping this excitement all to ourselves (so you don’t have to).

Technically, XML only predefines five entities: &amp; &lt; &gt; &quot; &apos;. All the other named entities come from XHTML. Flash also supports &nbsp;. Flash doesn’t natively understand anything else, but you can add the support yourself. You’re on the wrong side of the spec, though, unless you add a DTD or inline entity declarations to define these new-fangled entities. Flash can’t understand those either, but they’ll keep browsers (and other parsers that do) from choking, and hey, interoperability is supposed to be the whole point of XML, right?

Before we dive in, three thoughts:

  1. I hate CDATA. I can never remember the sequence, and that means I always screw it up. You have to go to all the trouble to stick it in there, and then parse for it on the other side. No thanks.
  2. Numeric entities are supported by XML, but who can remember those? I just want to enter em-dashes and curly quotes, and I want to be able to recognize that in my highly-readable-non-cdata-infested markup.
  3. If you’re doing transformations on your HTML (a future article will explore an example), it’d be nice to leverage the XML parser so you have access to E4X, XMLList, prettyPrinting, etc. Wrapping HTML in CDATA lets you load it and apply it as-is, but doesn’t get you around Flash’s re-encoding of “orphaned” ampersands.

Download XmlUtil, CharacterEntity, and StringUtils

It isn’t required for Flash, but you’ll want to add a DTD, or an inline definition of your entities like this:

 
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE site [
<!ENTITY mdash "&#8212;">
]>

 

“site” is the top-level node of your document—this one happens to be from a Gaia Framework project I’ve been working on. In this example, we’ve added “mdash,” so now we can use &mdash; in our content.

How do we use this entity-encoded content in Flash?

var copyContent = XmlUtil.getHTMLContent(myXML.description);

copy_tf.htmlText = "<body>" + copyContent + "</body>";

The body tag’s on there because I’m using a stylesheet (with body rules). I could have put it directly in my XML if I’d wanted to. This example assumes my document has a node <description> as a direct child of the document root.

As I encountered in this post, ActionScript will re-encode the ampersands of entities it doesn’t understand when you get XML content via toXMLString(). The getHTMLContent() function decodes every &amp; back to & before going on to replace the XHTML entity set. Not exactly elegant, but it’s better than the first thing I had, and I’m totally open to suggestions.
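To make the two steps concrete, here is a minimal sketch of the decode-then-replace idea. It is not the code from the XmlUtil download: the function name decodeEntities, the ENTITY_CODES table, and its handful of entries are assumptions made up for illustration, and the real classes presumably cover the full XHTML entity set.

```actionscript
// Hypothetical lookup table: a small slice of the XHTML entity set (name -> code point).
var ENTITY_CODES:Object = {
    mdash: 8212,   // em dash
    ndash: 8211,   // en dash
    lsquo: 8216,   // left single quote
    rsquo: 8217,   // right single quote
    ldquo: 8220,   // left double quote
    rdquo: 8221,   // right double quote
    hellip: 8230   // horizontal ellipsis
};

// Hypothetical stand-in for the decoding described above, not the downloadable XmlUtil.
// Expects markup that has already been serialized, e.g. via toXMLString().
function decodeEntities(markup:String):String {
    // Step 1: decode every "&amp;" back to "&", so "&amp;mdash;" becomes "&mdash;".
    var restored:String = markup.split("&amp;").join("&");

    // Step 2: replace each known named entity with its character.
    // The callback receives the full match first, then the captured entity name.
    return restored.replace(/&(\w+);/g, function(...args):String {
        var entityName:String = args[1];
        return ENTITY_CODES.hasOwnProperty(entityName)
            ? String.fromCharCode(ENTITY_CODES[entityName])
            : args[0]; // leave anything unrecognized untouched
    });
}
```

Assuming the same <description> node as above, usage would look something like decodeEntities(myXML.description.toXMLString()). Anything not in the table, such as &lt; or &nbsp;, passes through unchanged for htmlText to handle.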

 

Additional Reading:

HTML entities in XML

Entity Reference



 Comments

  • Maxx says:

    This is very great. I’m a big dumb dumb… are the .as files you prescribe available as AS 2?

  • jon jon williams says:

    AS3 only. Haven’t got the will to port it, given how little AS2 work we’ve been doing.

  • Alex says:

    How can I do the opposite, i.e. go from “John & Barry” to “John &amp; Barry”?

    I want to use that for some Spanish words.

  • Rob Ruchte rob says:

    @alex The CharacterEntity class provides methods for encoding character entities as well as decoding them. Use CharacterEntity.encode() to transform all characters with character entity equivalents to the character entity representation.

  • Reinarudo says:

    It’s amazing how many loops we have to go through for such a simple thing. I found this online, and it seems to detail some sort of built-in way of dealing with character entities: http://www.kirupa.com/web/xml/XMLmanageData4.htm
    I haven’t been able to use it though, so I’m settling for the huge CharacterEntity class.
