Had an interesting little bug report today. Apparently someone has created an email address with an ampersand in the local-part. While this is perfectly valid, it means we enter a world of pain when someone decides they want that email address published on a web page.
Anyone who's ever done web or XML stuff will know that & is a special character: it introduces the entity references used to write other characters. For example, if you want a greater-than character (>) to show up in HTML, you have to type &gt; rather than typing > directly, as greater-than is one of the HTML (and XML) delimiters.
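To make the escaping rule concrete, here's a minimal sketch using Python's standard `html` module. The sample string is made up for illustration, and our application doesn't necessarily do it this way.

```python
import html

# Escaping turns markup-significant characters into entity references.
raw = "a > b & c"
escaped = html.escape(raw)
print(escaped)  # a &gt; b &amp; c

# Unescaping reverses the process and gives back the original text.
print(html.unescape(escaped))  # a > b & c
```

Note that the ampersand itself has to be escaped too, otherwise the parser can't tell a literal & from the start of an entity.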
The bug report was that when the user tried to use our drag-and-drool WYSIKCTWYG (What You See Is Kinda Close To What You Get) editor, adding the mailto: link resulted in &amp; being inserted, and there didn't seem to be any way around it, including editing the source.
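For what it's worth, here's a rough sketch of how a mailto: link for such an address could be built by hand, assuming the link is generated in code rather than by the editor. The `mailto_anchor` helper and the `tom&jerry@example.com` address are hypothetical. The idea is to percent-encode the ampersand inside the URI (so it becomes %26) and HTML-escape only the visible text.

```python
import html
from urllib.parse import quote


def mailto_anchor(address: str) -> str:
    """Build an <a href="mailto:..."> element (hypothetical helper)."""
    # Percent-encode the address for use in a URI, leaving "@" readable.
    href = "mailto:" + quote(address, safe="@")
    # Escape both the attribute value and the visible text for HTML.
    return '<a href="{}">{}</a>'.format(html.escape(href), html.escape(address))


print(mailto_anchor("tom&jerry@example.com"))
# <a href="mailto:tom%26jerry@example.com">tom&amp;jerry@example.com</a>
```

Whether a given mail client decodes %26 back into an ampersand correctly is a separate question, and one I wouldn't bet the farm on.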
So I started thinking about the many levels of abstraction and translation in our application. It's quite staggering when you think about it, and any modern, complex application is likely to have similar layers.
When the content is published, the whole thing happens in reverse, except that the HTML is generated by stitching together all the little snippets according to the templates. Finally, we see for sure whether anything went wrong.
My justification for insisting they change the email address is that this system isn't likely to be the only one that has problems with the ampersand. I know for a fact that there are hundreds of broken email address validation systems out there that don't allow a whole stack of perfectly valid characters in email addresses.
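To illustrate the kind of breakage I mean, here's a simplified sketch comparing an over-strict local-part pattern with one that accepts the unquoted characters RFC 5322 allows. Both patterns and the `local_part_ok` helper are made up for this example and aren't taken from any particular system. Quoted local-parts and domain checks are deliberately ignored.

```python
import re

# atext characters from RFC 5322: letters, digits, and these printable symbols.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# dot-atom: runs of atext separated by single dots.
LOCAL_PART = re.compile(r"{0}+(?:\.{0}+)*".format(ATEXT))

# A typical over-strict pattern (illustrative only).
TOO_STRICT = re.compile(r"[A-Za-z0-9._-]+")


def local_part_ok(address: str, pattern: re.Pattern) -> bool:
    """Check only the part before the last "@" against the given pattern."""
    local, sep, domain = address.rpartition("@")
    return bool(sep and domain) and pattern.fullmatch(local) is not None


print(local_part_ok("tom&jerry@example.com", LOCAL_PART))  # True
print(local_part_ok("tom&jerry@example.com", TOO_STRICT))  # False
```

The strict pattern is the sort of thing that gets copied from site to site, which is why an address like this one will keep running into trouble elsewhere.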