Industry guru Dave Taylor offers free tech support on a wide variety of technical and business topics, including HTML, Apple iPhone, online advertising, Cascading Style Sheets, Web design, management, Unix, Linux, search engine optimization, online dating, Mac OS X, shell script programming and Microsoft Windows.

How do I validate my Web HTML pages?

How do I validate my Web pages and end up with one of those nifty "xhtml 1.0 verified" graphics for my page?

Dave's Answer:

In the beginning of the Web, web browsers were lazy and HTML coders were sloppy, trusting that if things 'just kinda worked alright' it was time to move on to the next page. As HTML and the technologies that grew up around it (e.g., XML, JavaScript, CSS) evolved into ever-more-powerful systems, the pervasiveness of sloppy HTML began to be a problem.

Worse, the World Wide Web Consortium published a formal but almost incomprehensible specification that spelled out exactly what counts as valid, legal HTML for each generation of the language.

Enter validators.

The idea of a validator is that you can feed it the address of a Web page, specify exactly what 'flavor' of HTML you've written it in, and it'll then parse and analyze the actual tag structure to see if it's all legal. There are a surprisingly large number of different flavors of HTML too, but the newest and most important is called XHTML, a slight twist on HTML 4 that enforces XML-based syntax.

In a nutshell, this means that you can't do something like this:

    Sample line one.
    <p>
    Sample line two.
    <P align=center>
    Sample line<br>three.

Why? Because the paragraph tag is a container tag: it needs a closing </p> each time it's used. Further, all XHTML tags MUST be in all lowercase, so the second <P> is wrong in that regard too. Finally, every attribute value must be quoted (e.g., align="center") and every non-paired tag must end with the rather odd-looking space+slash+angle sequence: <br /> not <br>. (The slash is what XML requires; the space before it is there to keep older browsers happy.)
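
If you'd like to see the difference for yourself before going anywhere near a validator, here's a rough sketch in Python. It wraps a fragment in a made-up <div> root and asks Python's standard XML parser whether it's well-formed. The helper name is_well_formed and the cleaned-up version of the sample are my own illustration, not anything the W3C provides. Keep in mind that well-formed isn't the same as valid: a plain XML parser won't complain about uppercase tags, for instance, so this is a quick sanity check rather than a substitute for the real validator.

```python
import xml.etree.ElementTree as ET

# The sloppy sample from above, unchanged.
SLOPPY = """Sample line one.
<p>
Sample line two.
<P align=center>
Sample line<br>three."""

# One possible cleanup (an assumption about the intended layout):
# closed lowercase <p> tags, a quoted attribute value, and <br />.
CLEAN = """<p>Sample line one.</p>
<p>Sample line two.</p>
<p align="center">Sample line<br />three.</p>"""


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML inside a single root element."""
    try:
        # XML needs exactly one root element, so wrap the fragment in a <div>.
        ET.fromstring("<div>" + fragment + "</div>")
        return True
    except ET.ParseError:
        return False


print(is_well_formed(SLOPPY))  # False
print(is_well_formed(CLEAN))   # True
```

The sloppy version fails on the unquoted align=center; even with that fixed, the unclosed <p> and <br> tags would still trip the parser.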

So why bother? Because as more sophisticated Web-based tools appear on the network, they rely more and more on properly formed pages. Ditto the latest generation of browsers. And in general, it's just good karma to have clean, properly written pages, just as it's best to communicate with proper spelling and grammar.

STRUCTURING A PAGE FOR VALIDATION

So that's the long answer as to why validation is useful. To actually validate a Web page you need to do two things:

  1. Add an XML declaration to the very top of the page. I use this:
    <?xml version="1.0" encoding="UTF-8"?>
    This indicates that I'm using UTF-8, an 8-bit encoding of Unicode (you'll probably be using the same on your page: you can safely copy this without knowing more about it than 'it's gotta be there').
  2. Add an HTML or XHTML document type declaration (DOCTYPE) line. I use this:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    This says that I'm writing "XHTML 1.0 Transitional" code. There are lots of different DOCTYPEs, as alluded to earlier. You can find a list of 'em at the w3.org Web site, if you're curious.
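
To show where those two header lines sit relative to everything else, here's a minimal sketch of a whole page, held in a Python string so it can be sanity-checked with the standard XML parser. Everything below the DOCTYPE is my own filler for illustration: the xmlns and lang attributes on <html> are what XHTML 1.0 expects, and the title and paragraph text are placeholders. The parser only confirms the page is well-formed; it doesn't fetch the DTD or check validity, so the W3C validator still has the final say.

```python
import xml.etree.ElementTree as ET

# A bare-bones XHTML 1.0 Transitional page. The first three lines are the
# two headers from the steps above; the rest is placeholder content.
PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Sample page</title>
</head>
<body>
  <p>Sample line one.</p>
</body>
</html>
"""

# Parse from bytes so the encoding named in the XML declaration is honored.
root = ET.fromstring(PAGE.encode("utf-8"))

# Elements live in the XHTML namespace, so lookups need the namespace prefix.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"
title = root.find(f"{XHTML_NS}head/{XHTML_NS}title")

print(root.tag)    # {http://www.w3.org/1999/xhtml}html
print(title.text)  # Sample page
```

Note that the XML declaration has to be the very first thing in the file, with no blank line or space ahead of it, or the parser will reject the page.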

TO VALIDATE YOUR PAGE

To validate your Web page, once you've added the ?xml and !DOCTYPE header lines as detailed, go to http://validator.w3.org/ and enter the full URL of the page. You can also upload files directly to their validator, but since I always have my projects online in a work directory, I let them find it directly.
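
If you check the same pages over and over, it can be handy to build the validator link rather than typing the address into the form each time. The sketch below assumes the validator accepts the page address as a uri parameter on its /check path, which is how I recall its results links looking; treat that path, the function name validator_link, and the example.com address as assumptions to confirm against the validator itself. It only builds the link and makes no network request.

```python
from urllib.parse import urlencode

# Assumed entry point for URL-based checks (verify on validator.w3.org).
VALIDATOR_CHECK = "http://validator.w3.org/check"


def validator_link(page_url: str) -> str:
    """Build a link that asks the validator to check the given page."""
    # urlencode percent-escapes the page address so it survives as one parameter.
    return VALIDATOR_CHECK + "?" + urlencode({"uri": page_url})


# Hypothetical page in an online work directory.
print(validator_link("http://www.example.com/work/page.html"))
# http://validator.w3.org/check?uri=http%3A%2F%2Fwww.example.com%2Fwork%2Fpage.html
```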

Run the validator and, if everything looks good (and assuming you're also writing to XHTML 1.0 Transitional), it'll come back with the cheery message:

This Page Is Valid XHTML 1.0 Transitional!

Otherwise you'll see:

This page is not Valid XHTML 1.0 Transitional!

and it'll give a detailed list of the problems it found and why they're a problem. Fix them, submit the URL again, and you should be able to get to the success message.

Finally, at the bottom of that results page you'll find the information on how to add the neato graphical buttons to your own validated pages.




Comments

Dave,

Is HTML 4.01 more useful than XHTML 1.0?

There is a bit of advocacy building up against using XHTML until there is a widely-used version of IE that supports it. I read this article and it concerns me:
http://hixie.ch/advocacy/xhtml

Since I rarely use IE, and XHTML support (with the proper MIME type) in IE seems to be a long time coming, I am concerned to choose the right standards for my first websites.

What have been your choices of standards, and why have you made them?

My plan right now is to ignore the naysayers, stick with the new XHTML standard (validated 1.0 Strict), test in IE6 to make sure it looks ok, and hope for the best regarding IE5 and IE5.5.

David

Posted by: David Corking at June 10, 2005 11:01 AM

I don't fully understand your comment, David, because the difference between well-formed HTML 4.01 and well-formed XHTML is so minuscule that I can't imagine anyone being for or against either of them. I mean, in HTML you'd write a tag like <hr> and in XHTML you'd write <hr />.

Is there really enough to quibble over?

Posted by: Dave Taylor at June 10, 2005 11:01 PM

Thanks, Dave - you have alleviated my concerns.
I didn't realise the differences between SGML and XML were that small.
I thought that I would upset IE by closing out every tag and using an XML DTD.
I see now that you have picked XHTML 1.0 Transitional for this site and it is clearly working well for you.

Posted by: David Corking at June 11, 2005 9:40 AM

Hi Dave,

How do I draw a line across the two cells/parts of the "Interests" panel of my MySpace profile (using XHTML code)?
(This is an obvious request, which most people would wish to know, but it is fiendishly difficult to get any type of clear and relevant explanation on.)

I am just learning HTML coding etc.

I was using to great effect until the browser suddenly decided to not recognise it.

After some research on Google I came to the conclusion that MySpace was now using XHTML.
Hence will not work.

If you check my site you will see that I have managed OK with stopping the text from wrapping by using 100 m as advised.
I have been able to enter links etc on one line so they make more sense to the uninitiated.

But stubbornly refuses to work using this method and wraps halfway. Grrr!!!!

(I wish to draw a line under each section containing a Band Logo/Photo & Link, so as to make it more obvious which part is related to which etc.)

Sorry if this is a bit of a long-winded explanation.

Cheers!

Ross Sakey

Posted by: Ross Sakey at March 29, 2008 5:58 PM

© 2002 - 2009 by Dave Taylor. All Rights Reserved.

Note: This web site is for the purpose of disseminating information for educational purposes, free of charge, for the benefit of all visitors. We take great care to provide quality information. However, we do not guarantee, and accept no legal liability whatsoever arising from or connected to, the accuracy, reliability, currency or completeness of any material contained on this web site or on any linked site.

"Ask Dave Taylor®" is a registered trademark of Intuitive Systems, LLC.