 * XHTML Validator
 *
 * Public domain application implementation by Mr. Tines <tines@ravnaandtines.com>
 *
 * Version 1.2 (21-Jun-03)
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ''AS IS'' AND ANY EXPRESS
 * OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
 * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
 * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
 * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
 * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


New at 1.2
==========

Added the ability to use the stated DOCTYPE (assuming it's an XHTML
non-frameset one), and to select a directory and check all the XHTML
files within (recursing sub-directories if desired).
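
The directory check can be pictured as a file-tree walk.  What follows is a
minimal sketch, not the program's own code: the class name XhtmlFileFinder is
hypothetical, and picking files by extension (.html, .htm, .xhtml) is an
assumption, since nothing here says how the program decides what counts as an
XHTML file.  It is also written for a current JDK rather than J2SDK 1.4.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class XhtmlFileFinder {
    /** Assumed rule: a candidate file has a name ending in .html, .htm or .xhtml. */
    static boolean looksLikeXhtml(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        return name.endsWith(".html") || name.endsWith(".htm") || name.endsWith(".xhtml");
    }

    /** Lists candidate files in dir, descending into sub-directories only if recurse is true. */
    static List<Path> find(Path dir, boolean recurse) throws IOException {
        int depth = recurse ? Integer.MAX_VALUE : 1;
        try (Stream<Path> paths = Files.walk(dir, depth)) {
            return paths.filter(Files::isRegularFile)
                        .filter(XhtmlFileFinder::looksLikeXhtml)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }
}
```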


New at 1.1
==========

There is general confusion between ASCII, which defines only 128
character values (a-z, A-Z, 0-9, English punctuation, with values 0-31
being control characters); ISO-Latin-1 (or ISO-8859-1), which defines
256 character values, including accented letters, some more symbols,
and a second control character range (values 128-159); and Microsoft's
cp1252, which is like ISO-Latin-1 but uses most of the 128-159 range
for other symbols rather than as control characters.

Character values in this range aren't valid SGML, so your page will fail
at the definitive validator.w3.org test.  The validator now detects
these characters, whether they are entered directly (for example, a
literal cp1252 bullet, byte value 0x95) or as the equivalent escapes
&#x95; or &#149;, and in this case suggests the correct form for the
glyph, which is &bull;.  Similarly, you might have a literal cp1252 euro
sign (byte value 0x80), &#x80; or &#128; where &euro; is needed.
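
Finding the escaped forms can be sketched with a regular expression over the
markup.  This is an illustration only, assuming the hypothetical class name
SuspectEscapes; the program itself reports escaped values during the parse,
and this sketch does not reproduce that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SuspectEscapes {
    // Matches decimal (&#149;) and hexadecimal (&#x95;) numeric character references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    /** Returns, in document order, the references whose value lies in 128-159. */
    static List<String> find(String markup) {
        List<String> hits = new ArrayList<String>();
        Matcher m = NUMERIC_REF.matcher(markup);
        while (m.find()) {
            int value = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)   // hex form
                    : Integer.parseInt(m.group(2), 10);  // decimal form
            if (value >= 0x80 && value <= 0x9F) {
                hits.add(m.group());
            }
        }
        return hits;
    }
}
```

Named entities such as &bull; and numeric references outside the range (for
example &#233;) are left alone.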

The accompanying file cp1252.html contains all these suspect characters
in rows of four - first as hex escapes (&#x... values), then the Unicode 
code point that the character is attempting to be, then the correct
escape sequence to use. 

Note that the Z-caron characters have no named version, unlike &Scaron; 
and &scaron;, and a hex escape needs to be used (&#x017d; and &#x017e;).
You can use the Unicode code point as an alternative to the name in 
the same way for all the others, but that will make the HTML more obscure.

Note that &#x81; &#x8d; &#x8f; &#x90; and &#x9d; don't have an 
associated character glyph so if any of them are encountered, the program
will suggest "?" as a placeholder.
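
The suggestions amount to a 32-entry lookup table.  The sketch below is my own
reading of the cp1252 code page against the XHTML 1.0 entity sets, not the
table shipped with the program, and the class name Cp1252Suggestions is
hypothetical.

```java
final class Cp1252Suggestions {
    // Index 0 is value 0x80 and index 31 is 0x9F.
    // "?" marks the five values with no glyph; the two Z-caron entries use hex escapes.
    private static final String[] REPLACEMENT = {
        "&euro;", "?", "&sbquo;", "&fnof;", "&bdquo;", "&hellip;", "&dagger;", "&Dagger;",
        "&circ;", "&permil;", "&Scaron;", "&lsaquo;", "&OElig;", "?", "&#x017d;", "?",
        "?", "&lsquo;", "&rsquo;", "&ldquo;", "&rdquo;", "&bull;", "&ndash;", "&mdash;",
        "&tilde;", "&trade;", "&scaron;", "&rsaquo;", "&oelig;", "?", "&#x017e;", "&Yuml;"
    };

    /** Returns the suggested replacement for a value in 128-159, or null outside that range. */
    static String suggest(int value) {
        if (value < 0x80 || value > 0x9F) {
            return null;
        }
        return REPLACEMENT[value - 0x80];
    }
}
```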

Unescaped values are detected as the file is being read in at a low level,
and some context is given.  Escaped values will appear in the order they
are found in the file, correctly interleaved with any XHTML errors against
the DTD, with positioning information.
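
A low-level scan for unescaped values might look like the following sketch.
It assumes the file is in a single-byte encoding (in UTF-8 the same byte
values occur legitimately inside multi-byte sequences), reports a line and
column rather than the surrounding context the program gives, and uses the
hypothetical class name RawByteScan.

```java
import java.util.ArrayList;
import java.util.List;

final class RawByteScan {
    /** Reports each byte in 0x80-0x9F as "line:column: 0xNN", counting both from 1. */
    static List<String> scan(byte[] data) {
        List<String> hits = new ArrayList<String>();
        int line = 1;
        int column = 0;
        for (int i = 0; i < data.length; i++) {
            int b = data[i] & 0xFF;   // Java bytes are signed, so mask to 0-255
            if (b == '\n') {
                line++;
                column = 0;
                continue;
            }
            column++;
            if (b >= 0x80 && b <= 0x9F) {
                hits.add(line + ":" + column + ": 0x" + Integer.toHexString(b));
            }
        }
        return hits;
    }
}
```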

The accompanying file cp1252-test.html gives an example of a file with
bad character values both as literals and escape sequences (and a 
deliberate XHTML error) for use as an example to see how the program
works and the sort of output you should expect.

1.1-01 : tweaked the output format for cp1252 characters slightly
1.1-02 : echoes the file's DOCTYPE value so you can check what it
         is claiming to be.


Installation
============
This program was built and tested using J2SDK 1.4.1_01, but might well
run with earlier versions of the J2SDK from http://www.java.com.  

I used Apache Xerces-J 2.4.0 for XML support; other Xerces-J 2.x versions
from http://xml.apache.org may well work.

Extract the .jar file and add it to your classpath, along with the Xerces
jars xmlParserAPIs.jar and xercesImpl.jar; the others are not needed.

Run it with 

java -cp JXHTML.jar com.ravnaandtines.jxhtml.JXHTML

(If you don't have a CLASSPATH variable set up, you can just put all the
jars and zip files in the same folder and, on Windows, run something like

start javaw -cp JXHTML.jar;skinlf.jar;xercesImpl.jar;xmlParserAPIs.jar;css2.zip com.ravnaandtines.jxhtml.JXHTML xplunathemepack.zip

where skinlf.jar, css2.zip in the classpath and the xplunathemepack.zip 
argument are optional)

Skinning
--------
Optionally, you may add the Skin Look & Feel jar (skinlf.jar) from 
http://www.L2FProd.com/ to the classpath, and run it with a theme-pack
argument

java -cp JXHTML.jar com.ravnaandtines.jxhtml.JXHTML [themepack.zip filename]

e.g.

java -cp JXHTML.jar com.ravnaandtines.jxhtml.JXHTML xplunathemepack.zip


I used the skinlf-1.2.3 version.  [You could also skin the application
with the usual Skin Look & Feel skinning override of an existing app.]

CSS Validation
--------------

If you have a copy of the W3C CSS validator on your classpath (I've only
tested with the old Level 1 validator, using the copy built by Laurent Caprani
and archived from his "Logiciels XML" page at http://www.espacecourbe.com/),
then after a clean XHTML validation the program will invoke the main method
of class

org.w3c.css.css.StyleSheetCom

with the page just processed as target, to check any style sheet for
conformance.
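
Calling an optional class only when it happens to be on the classpath is
usually done by reflection.  A minimal sketch of that technique follows; the
class name OptionalMain and the method runIfPresent are hypothetical, and this
is not necessarily how the program does it.

```java
import java.lang.reflect.Method;

final class OptionalMain {
    /** Runs className.main(args) if the class can be loaded; returns false if it is absent. */
    static boolean runIfPresent(String className, String[] args) throws Exception {
        Class<?> target;
        try {
            target = Class.forName(className);
        } catch (ClassNotFoundException e) {
            return false;   // optional jar not on the classpath: skip quietly
        }
        Method main = target.getMethod("main", String[].class);
        // Cast to Object so the array is passed as one argument, not spread as varargs.
        main.invoke(null, (Object) args);
        return true;
    }
}
```

With the CSS validator present, a call might look like
OptionalMain.runIfPresent("org.w3c.css.css.StyleSheetCom", new String[] { "page.html" });
the exact arguments that class expects are an assumption here.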

Purpose
=======
This program is intended to do a final checking pass over XHTML documents.  
You are probably best off starting by getting the document validating in a tool
such as Amaya (http://www.w3.org/Amaya/User/BinDist.html) before using this.

Use
===
Type a file name into the edit box or browse for one.  Select the XHTML version
to validate at, then hit validate (the button is disabled if the file can't
be read).

Unless you ask it to use the file's stated DOCTYPE (new at 1.2), the program
ignores any DOCTYPE declaration in your file and reads the DTD for the XHTML
version selected in the drop-down list from the jar file.
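
One common way to validate against a chosen DTD rather than the one a document
names is a SAX entity resolver.  The sketch below assumes that approach and
uses the JAXP parser bundled with the JDK; the class name SelectedDtdCheck is
hypothetical, and the DTD is passed as a string to keep the example
self-contained, where the program reads it from the jar.  The real XHTML DTDs
pull in further entity files, which a fuller resolver would have to map one by
one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class SelectedDtdCheck {
    /** Validates xml against dtdText, whatever external DTD the document itself names. */
    static List<String> validate(String xml, final String dtdText) throws Exception {
        final List<String> errors = new ArrayList<String>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Ignore the claimed identifiers and hand back the selected DTD.
                return new InputSource(new StringReader(dtdText));
            }

            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());   // validity errors are recoverable
            }
        };
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return errors;
    }
}
```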

The program will emit error messages where the XHTML contains things that
aren't in the DTD.  The messages are worded as though the XML were definitive
and the DTD incomplete.  For example, if you have XHTML containing a link of
the form

	<a href="http://www.ravnaandtines.com/" target="_top">

and validate against 1.0 Strict or 1.1, which disallow the "target" attribute,
you'll get a message like 

	[Error] :91:63: Attribute "target" must be declared for element type "a".

which means that the attribute ending at line 91 column 63 isn't declared in 
the DTD.  In this case, the DTDs are definitive, and the XHTML needs to be adjusted.
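
Messages in this shape can be produced by a SAX error handler on a validating
parser.  The following is a sketch under assumptions: it uses the JAXP parser
bundled with the JDK rather than Xerces-J 2.4.0, the class name ErrorReport is
hypothetical, and the message wording comes from whichever parser is in use.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class ErrorReport {
    /** Parses xml with DTD validation on and returns one formatted line per validity error. */
    static List<String> validate(String xml) throws Exception {
        final List<String> report = new ArrayList<String>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Same layout as the message above: [Error] :line:column: text
                report.add("[Error] :" + e.getLineNumber() + ":" + e.getColumnNumber()
                        + ": " + e.getMessage());
            }
        };
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return report;
    }
}
```

Feeding it a small document whose inline DTD declares only "href" for "a", but
whose link also carries "target", yields a single report line for the
undeclared attribute.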


What next?
==========
Since you can validate to different standards without changing your DOCTYPE, you
can first get the document validating at the standard you want, then claim the
DOCTYPE you want to deploy with.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
      "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

For example, if you have to include the "target" attribute for links because your
page will be shown in a <frame>, then you can check that everything else validates
at 1.1 or 1.0 Strict, but publish with the 1.0 Transitional DOCTYPE, which allows
the "target" attribute.

The following XHTML fragments signal valid XHTML 1.0 and 1.1 and link to
the W3C validator, so a visitor can check the claims.

<p>
 <a href="http://validator.w3.org/check/referer"><img
     src="http://www.w3.org/Icons/valid-xhtml10"
     alt="Valid XHTML 1.0!" height="31" width="88" /></a>
</p>

<p>
 <a href="http://validator.w3.org/check/referer"><img
     src="http://www.w3.org/Icons/valid-xhtml11"
     alt="Valid XHTML 1.1!" height="31" width="88" /></a>
</p>

To show your readers that you've taken the care to create correct CSS, you may display 
an icon on any page that validates. Here is the HTML you could use to add this icon to 
your Web page:
 
<p>
 <a href="http://jigsaw.w3.org/css-validator/">
  <img style="border:0;width:88px;height:31px"
       src="http://jigsaw.w3.org/css-validator/images/vcss" 
       alt="Valid CSS!" />
 </a>
</p>

If you would like to create a link to the W3C validator for CSS, to make it easier to 
re-validate this page in the future or to allow others to validate your page, the URI is: 

          http://jigsaw.w3.org/css-validator/validator?uri=<fill in CSS URI here>
       or 
          http://jigsaw.w3.org/css-validator/check/referer 

(for HTML/XML document only)

Alas, my web hosting appends invalid advertising junk to the page I put up,
so the page as seen by the public is not valid XHTML or even HTML.  *sigh*
