Rob Ruchte (Rob Ruchte)

Website: http://thirdpartylabs.com


I'm an interactive developer specializing in custom content management and application development. I'm currently the owner and CTO of Thirdparty Labs, a boutique interactive shop in Raleigh, NC.


 

Posts by Rob Ruchte (Rob Ruchte):


OK, so browsers are not supposed to send named anchors (or anything after them) to the server. However, I noticed today that SWFAddress links like this:

http://example.com/#/portfolio/myClient/myProject

...when sent as plain text via email to an iPhone, get URL encoded when displayed in the mail client, so Safari receives the URL from Mail like this:

http://example.com/%23/portfolio/myClient/myProject

...and happily sends the whole URI to the server, which looks for something to do with a path containing %23, and comes up with a 404.
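
To make the difference concrete, here is a minimal sketch using Python's standard urllib.parse module. It is an illustration only, not what Mail, Safari, or the server actually run. It splits both forms of the link and shows which part ends up in the path (sent to the server) and which part stays behind as the fragment.

```python
from urllib.parse import urlsplit

original = "http://example.com/#/portfolio/myClient/myProject"
mangled = "http://example.com/%23/portfolio/myClient/myProject"

for url in (original, mangled):
    parts = urlsplit(url)
    # Only the path (and query) is sent to the server;
    # the fragment is kept on the client.
    print(f"path={parts.path!r} fragment={parts.fragment!r}")

# path='/' fragment='/portfolio/myClient/myProject'
# path='/%23/portfolio/myClient/myProject' fragment=''
```

With the # intact, the server only ever sees a request for /. Once it has been encoded, the whole SWFAddress value becomes part of the path.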

This is a clear violation of URI RFC 3986, which states:

2.2. Reserved Characters

URIs include components and subcomponents that are delimited by characters in the "reserved" set. These characters are called "reserved" because they may (or may not) be defined as delimiters by the generic syntax, by each scheme-specific syntax, or by the implementation-specific syntax of a URI's dereferencing algorithm. If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed.

reserved = gen-delims / sub-delims

gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"

sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

The purpose of reserved characters is to provide a set of delimiting characters that are distinguishable from other data within a URI. URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent. Percent-encoding a reserved character, or decoding a percent-encoded octet that corresponds to a reserved character, will change how the URI is interpreted by most applications. Thus, characters in the reserved set are protected from normalization and are therefore safe to be used by scheme-specific and producer-specific algorithms for delimiting data subcomponents within a URI.

Furthermore:

2.4. When to Encode or Decode

Under normal circumstances, the only time when octets within a URI are percent-encoded is during the process of producing the URI from its component parts. This is when an implementation determines which of the reserved characters are to be used as subcomponent delimiters and which can be safely used as data. Once produced, a URI is always in its percent-encoded form.

In other words, keep your dirty mitts off of my URIs!

I spent some time poking around in Apple Mail (full OS X version, not on the iPhone), and noticed that the Edit/Link/Add dialog does not allow named anchors in URLs! As soon as you enter a # in the dialog, the "OK" button is disabled.

[Screenshot: OK button enabled]

[Screenshot: OK button disabled]

So clearly Apple is aware of the problem, but has yet to give us a good solution. This got me even more curious: I wanted to see how the iWork apps handle anchors. It turns out that Pages, Numbers, and Keynote '08 all encode anchors in URLs added via the hyperlink dialog! This means that we have a bigger problem than just the iPhone edge case: any link in a document produced using the iWork suite can potentially be malformed. For sites that make heavy use of SWFAddress, this is a huge problem.

I fired up MS Word 2004 for Mac and was pleasantly surprised: it has a fairly robust interface for working with links that contain named anchors, which results in properly formed URLs.

So what can we do about this? On the server side, we can use mod_rewrite to trap incoming URIs that contain %23 (or an actual # if it comes in, even though it shouldn't), and basically redirect them right back to the client unencoded, so the browser can call the proper URI and handle the named anchor appropriately.
RewriteCond %{REQUEST_URI} ^(.*)?#(.*)
RewriteRule .+ %1#%2 [NE,R=301,L]
The problem with that shotgun approach is that it will trigger the redirect for URIs that rightfully contain URL encoded hash marks. Consider this: your app displays news posts, and pulls the post data from the server using nice semantic URLs, like http://example.com/news/My+Post+Title. The first time you have a post with a title like "We are #1", you will be unable to access the data. The server will receive a request for http://example.com/news/We+are+%231 and send a redirect back to the browser to http://example.com/news/We+are+#1, at which point your browser will fire off a new request for http://example.com/news/We+are+, which will result in a 404. This will not do.
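
The failure can be traced step by step. The following is a rough Python simulation, not Apache itself: shotgun_redirect and path_browser_requests are hypothetical helper names, and the sketch assumes that mod_rewrite tests the percent-decoded path and that the browser drops the fragment before making its next request.

```python
import re
from urllib.parse import unquote, urlsplit

def shotgun_redirect(request_path):
    """Mimic the catch-all rule: redirect any decoded path containing '#'."""
    decoded = unquote(request_path)  # assumption: matching is done on the decoded path
    match = re.match(r"^(.*)?#(.*)", decoded)
    if match is None:
        return None  # no redirect issued
    return f"{match.group(1)}#{match.group(2)}"

def path_browser_requests(location):
    """The browser keeps the fragment to itself and requests only the path."""
    return urlsplit(location).path

# The SWFAddress link is repaired as intended...
print(shotgun_redirect("/%23/portfolio/myClient/myProject"))
# /#/portfolio/myClient/myProject

# ...but a legitimately encoded hash mark is broken.
location = shotgun_redirect("/news/We+are+%231")
print(location)                         # /news/We+are+#1
print(path_browser_requests(location))  # /news/We+are+
```

The last line is the request that produces the 404: the "#1" that belonged to the post title has been turned into a fragment and never reaches the server again.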

For browser based client apps that implement SWFAddress, we need a more surgical approach to detecting and redirecting URLs with bogus encoded anchor delimiters.

Here are some mod_rewrite rules for making this happen (see the Apache mod_rewrite documentation for details):
RewriteRule ^#(.*) /#$1 [NE,R=301,L]
If your client app loads at the site root, and is only accessible from /, here is a simple solution. This traps and redirects URLs like http://mydomain.com/%23/some/stuff to http://mydomain.com/#/some/stuff
RewriteRule ^path/to/my/app/loadpage.html#(.*) /path/to/my/app/loadpage.html#$1 [NE,R=301,L]

If your app is further down in the site structure, you can include the path to it, perhaps including an HTML page that loads it, if appropriate. This redirects http://mydomain.com/path/to/my/app/loadpage.html%23/some/stuff to http://mydomain.com/path/to/my/app/loadpage.html#/some/stuff
RewriteRule ^path/to/my/app/(index\.html)?#(.*) /path/to/my/app/index.html#$2 [NE,R=301,L]

If you're using an index page to load your client app, it may be accessed either by the path to the directory, or the full path including the file name. Putting an optional check for the file name cracks that nut.
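
Before deploying a rule like this, it can help to check which URLs it would and would not catch. Here is a small Python sketch that reuses the same regular expression. The surgical_redirect helper is hypothetical, and it assumes the rule lives in a per-directory context (such as .htaccess at the site root) where the pattern is matched against the decoded path without its leading slash.

```python
import re
from urllib.parse import unquote

# Same pattern as the index-page RewriteRule above.
APP_RULE = re.compile(r"^path/to/my/app/(index\.html)?#(.*)")

def surgical_redirect(request_path):
    """Return the redirect target, or None if the rule does not apply."""
    decoded = unquote(request_path).lstrip("/")  # assumption: leading slash is stripped
    match = APP_RULE.match(decoded)
    if match is None:
        return None
    # Group 2 is everything after the hash mark, as with $2 in the rule.
    return f"/path/to/my/app/index.html#{match.group(2)}"

print(surgical_redirect("/path/to/my/app/%23/some/stuff"))
# /path/to/my/app/index.html#/some/stuff
print(surgical_redirect("/path/to/my/app/index.html%23/some/stuff"))
# /path/to/my/app/index.html#/some/stuff
print(surgical_redirect("/news/We+are+%231"))
# None
```

Because the pattern is anchored to the app's own path, the "We are #1" news URL passes through untouched, which is the point of the surgical approach.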

Needless to say, this does nothing to help standard named anchors in HTML pages; it's just a band-aid for client apps that use SWFAddress. Apple really needs to address this issue, and I think it's safe to assume that there are other apps and services out there with the same problem.

UPDATE:
One of my partners just informed me that Microsoft's Windows Mail that currently ships with Vista suffers from these same URI encoding issues!

"I have just discovered that the MS Mail client on Vista has the same problem as Apple Mail, when it comes to handling urls that include a "hash" component.  The hash-sign gets url encoded before it is sent out to the browser, and so the browser thinks it's part of the url and sends it on to the server, rather than treating it as a hash.

I sent someone a link to the *** stuff I did, and it got busted by their mail -- When I finally figured out what was happening, I had to pause briefly and confirm that they weren't using a Mac.

It's so simple it kills me... and MS and Apple are both supposed to have the best minds in the world working on this stuff !?

If you ask me, this is pretty good proof that Vista is heavily based on OSX (conceptually, that is).  I mean... they've even copied the bugs!"


Installing PDO_MYSQL on CentOS (the easy way)

When deploying LAMP projects, quite often we find ourselves being given access to a newly installed server with distribution-specific default configurations, which are usually very stripped down. Our Carbon Content Management Toolkit, which runs on top of the Zend Framework, uses the PDO_MYSQL driver for data access, along with a few other modules that are typically present in commercial hosting environments, but not always included in an OS's default PHP build.

Getting PEAR, PECL, and a build environment set up can be a time-consuming process if you do it from scratch. Luckily, most OSs these days come with easy-to-use package management systems that take care of downloading, building, and installing the packages you need, along with their dependencies.

Check the package management system entry on Wikipedia if you're unsure about which package system your OS uses. Each one works a little bit differently, but the concepts are the same. You're going to need to set up your OS's version of the following packages in order to make your life easy:

php-devel
php-pear
mysql-devel
httpd-devel

The package names will differ a bit for each system. The following example uses the Yum package manager in CentOS, and gets PDO_MYSQL up and running in just a few minutes.

# yum install php-devel php-pear mysql-devel httpd-devel
# pecl install pdo
# PHP_PDO_SHARED=1 pecl install pdo_mysql

Add these lines to php.ini:

extension=pdo.so
extension=pdo_mysql.so

Now restart Apache, and you should see your PDO modules in phpinfo(). Robert's your father's brother.
# apachectl restart

If you need to actually make a custom PHP build, you should have everything you need to do that after installing the devel packages. You may still need to build any dependencies that your custom PHP build requires, but your package system should make that fairly easy.


ClipStation Clipboard Writer 1.0 Released

ClipStation is a free lightweight solution for writing to your user’s clipboard from an HTML page. Using a single tiny (as small as 1.9KB) invisible SWF that is embedded dynamically via JavaScript, you can pass an unlimited number of content clips onto the clipboard.

ClipStation is designed to be lightweight, flexible, and easy to implement. What makes ClipStation different from other clipboard SWF solutions is the ability to decode HTML character entities, allowing you to pass complex HTML markup to the clipboard from within form elements, divs, pre tags, etc. We developed ClipStation for use on a widget sharing page we've implemented for a client. After looking around for a good lightweight cross-browser solution and coming up empty handed, we decided to build our own. We're now happy to offer it to you at the low, low price of free. More information and the release package can be found at thirdpartylabs.com/clipstation/

»Download ClipStation 1.0
