Tidying Up Web Pages
Most people who write HTML pages use software to do it. I am somewhat perverse, however, in that I write almost all of mine in a text editor. Of course I produce errors and need help.
OS X is based on BSD UNIX, which has been around for a while, and some 1100 commands are available to those brave enough to use them. Tidy is a UNIX utility that will "validate, correct, and pretty-print HTML files". With its switches, which fine-tune a command (such as "-errors" or "-e" to show only errors), you can do a fair job of cleaning up your HTML, and much more besides, such as converting HTML to XHTML.
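For those who would rather script Tidy than type the switches each time, a minimal sketch in Python follows. It assumes that tidy is installed and on the PATH; the function names tidy_command and run_tidy are made up for this illustration, while the switches used (-q, -e and -asxhtml) are standard Tidy options.

```python
import shutil
import subprocess


def tidy_command(path, errors_only=True, as_xhtml=False):
    """Build an argument list for the tidy command-line utility."""
    cmd = ["tidy", "-q"]          # -q: suppress non-essential output
    if errors_only:
        cmd.append("-e")          # -e (or -errors): show only errors and warnings
    if as_xhtml:
        cmd.append("-asxhtml")    # convert HTML to well-formed XHTML
    cmd.append(path)
    return cmd


def run_tidy(path, **options):
    """Run tidy if it is installed; return its diagnostics, or None if it is missing."""
    if shutil.which("tidy") is None:
        return None
    result = subprocess.run(tidy_command(path, **options),
                            capture_output=True, text=True)
    # tidy writes its diagnostics to standard error
    return result.stderr
```

Calling run_tidy("index.html") would then return the error report for that page as a string, where index.html stands in for any file of your own.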
People use Macs because the interface hides all that stuff underneath. To some, what lies underneath really means something. Others glaze over and exit as fast as they can. Fortunately there are huge numbers of software producers who are kind (and skilled) enough to put a graphic interface between us and the UNIX.
The World Wide Web Consortium offers a page validation service, but the readout may be somewhat complex. Input is by URL, by file upload or by direct entry. Several other Open Source validators are also available.
I recently found a utility to optimise HTML on Versiontracker, HTML-Optimizer Pro 4.9, and have been using it for a while. There are also Windows versions. A demo is available but this will optimise only 50% of the pages. The cost of the downloaded Pro version is $29.00 (1,000 baht) for a Single User licence. As the licence depends on the email address, it can be used on a second machine (a laptop and a desktop, for example).
One of its prime purposes is to save space on the web, which translates into reduced download times for users, or for customers. According to the Readme file it will also "check your web pages for dangling tags, missing attributes and broken links." For me the tag problem is the most common, so this part is a real help for my handiwork.
The software can check a single file, a selected folder, or an entire site. As I keep a mirror of my website on my hard disk, this is a quick method of improving my site efficiency. There are a couple of drawbacks.
As part of its process, HTML-Optimizer Pro creates a duplicate folder on the hard disk. The default location may be used, or any folder location specified by the user. The new version of the site is saved inside that folder: all files are carried across, including images. The site optimisation was carried out smoothly. My complete site was processed in under 30 seconds.
The optimised text files (html) are not a pretty sight. Every excess space or carriage return is ruthlessly excised, which makes the resulting code hard on the eyes and difficult to follow.
But that is not the point. As the Tips at startup and the Readme file make clear, this new folder is not an editing location: it is the optimised product that is uploaded to the website. Editing is carried out in the original source folder, which can then be optimised again, at which point the target folder is refreshed: source for editing, target for uploading.
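To give an idea of what this kind of squeezing involves, here is a rough sketch in Python. It is not the method HTML-Optimizer Pro uses, which is not documented here, and the function name squeeze_html is invented for the example. It is also naive: it would damage pre-formatted text and scripts, and removing the space between inline tags can change how a page displays.

```python
import re


def squeeze_html(text):
    """Collapse whitespace runs and drop whitespace between adjacent tags."""
    text = re.sub(r"\s+", " ", text)       # newlines, tabs and runs of spaces become one space
    text = re.sub(r">\s+<", "><", text)    # remove the space left between adjacent tags
    return text.strip()
```

Even this crude version shows why the output is hard to read: a whole page ends up on a single line.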
The initial setup is not a major task, particularly if defaults are selected. As I decided to relocate the target folder, I used a menu item in the Configure menu.
Optimising was a one-click operation, but if the process is repeated after the mirror folder has been updated, Optimizer Pro will balk and suggest that you first "Update the Duplicate Web Folder Directory" in the Extras menu. This takes only a couple of seconds. I did, however, find that it was wise to audit the contents of the folders to ensure that all files were there. Three mp3 files were not included in one folder update.
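An audit of this sort is easy to automate. The following Python sketch lists any files that exist in the source folder but not in the target; the function names and the folder paths in the usage note are assumptions for illustration and have nothing to do with the program itself.

```python
from pathlib import Path


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def missing_from_target(source, target):
    """List files present in the source folder but absent from the target."""
    return sorted(relative_files(source) - relative_files(target))
```

Running missing_from_target("site-source", "site-optimised"), with your own folder names substituted, returns a sorted list of the files that failed to make it across.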
With any utility like this, there is much fine tuning that can be done to the way that it carries out its tasks. Some 21 file types are listed, including jsp, asp, php, css and xml. Any of these can be removed from the list and other types may be added.
In the Extra preference (separate from the menu of that name), there is a way to convert "ASCII 128-255 characters in a WindowsANSI-encoded (sic) text file created on a PC with Windows". Smart Handling enables only changed files to be processed. There are three modes: a "Date + Time" setting, "Today" and a specified date, "Since".
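The character conversion can be illustrated with a short Python sketch. The Readme does not say what the characters are converted to, so this example assumes numeric HTML character references; the function name is invented, and "cp1252" is Python's name for the Windows ANSI (Western) encoding.

```python
def ansi_to_ascii_html(raw):
    """Turn Windows-1252 bytes into ASCII text with numeric character references."""
    # Bytes that are undefined in Windows-1252 become the Unicode replacement character
    text = raw.decode("cp1252", errors="replace")
    # Every non-ASCII character is rewritten as a reference such as &#233;
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

With this, an accented e (byte 233 in a Windows file) comes out as &#233;, which any browser will display correctly whatever the encoding of the page.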
Every file I processed showed savings. Some were relatively small while some files were reduced in size by 400 or 500 bytes. A couple of XML files written using a third party utility had substantial reductions. Overall, the target folder was some 4MB smaller than the source. The code produced may not be to everyone's taste.