I’ve been playing around with the W3C HTML validator, and I’ve found, sadly, that there’s no easy way to get this page to validate. There were some problems that I fixed, but when I try to validate against HTML 4.01 Transitional, I still get about 50 errors related to the use of “&” in URLs.
Apparently you’re supposed to use the HTML entity for the ampersand (“&amp;”) even in URLs. But since this entity isn’t present in the URL in the browser’s address bar, and that’s where you generally copy the URL from, how are you supposed to convert these without manually picking through every URL you use? You could try to get funky with regular expressions, but I can’t imagine that would work perfectly in every case.
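For what it’s worth, here is roughly what the regular-expression approach might look like. This is only a sketch in Python, not something I run on this site, and the name escape_bare_ampersands is made up for illustration. It replaces any “&” that doesn’t already look like the start of a character reference, so running it twice doesn’t double-escape anything.

```python
import re

# Matches an "&" that does not already begin a character reference
# such as &amp;, &#38; or &#x26;. (Hypothetical helper, for illustration.)
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")

def escape_bare_ampersands(text):
    """Replace bare ampersands with &amp;, leaving existing references alone."""
    return BARE_AMP.sub("&amp;", text)

# One bare ampersand and one that is already escaped.
url = "http://example.com/page?id=1&cat=2&amp;x=3"
print(escape_bare_ampersands(url))
```

It also shows why this can’t be perfect: a query parameter that happens to look like an entity (say, a parameter named “lt” followed by a semicolon) would be left alone, because the pattern can’t tell it apart from a real reference.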
This brings up a larger point: you can’t really expect to validate a site where a large part of each page’s HTML is provided by people other than the original Web developer. Every entry on this page, which together make up the entire middle section, can be entered by someone else, and how can I make sure they’re entering valid HTML markup?
This is where HTML Tidy integration will work very well in PHP 5. Using this tool, you can check and clean up the HTML that people enter before you store it in the database, or before you output it. You can make sure all tags are closed, all tags match, and so on, so perhaps you can hope for some sort of valid markup.
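To give an idea of the “all tags are closed, all tags match” kind of check, here is a small sketch. To be clear, this is not Tidy and not PHP: it’s a stand-in written with Python’s standard html.parser module, and the names TagBalanceChecker and find_tag_errors are my own inventions. It is also stricter than HTML 4.01 itself, which lets you omit some end tags (like the one for a paragraph).

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML 4.01.
VOID_TAGS = {"area", "base", "br", "col", "hr", "img",
             "input", "link", "meta", "param"}

class TagBalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records mismatched end tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append("unexpected </%s>" % tag)

def find_tag_errors(fragment):
    """Return a list of problems found in a user-submitted HTML fragment."""
    checker = TagBalanceChecker()
    checker.feed(fragment)
    checker.close()
    # Anything still on the stack was opened but never closed.
    return checker.errors + ["unclosed <%s>" % t for t in checker.open_tags]

print(find_tag_errors("<p><b>hi</b>"))
```

An empty list means the fragment passed this particular check. A real tool like Tidy goes further and repairs the markup instead of just complaining about it.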
But, in an even larger sense, does validation matter much? I’ve never gotten a comment from anyone about whether this site validates. So what if I’m throwing 50 errors because of ampersands in URLs? Can someone give me a valid (excuse the pun) reason why this matters?
I understand problems can occur from gross misuse of the HTML spec, but are all validation errors created equal? My apparent misuse of ampersands has got to rank pretty low on the sin list.