Update HTML Tidy and TidyLib to the latest official version

Number:rdar://6376494 Date Originated:17-Nov-2008 01:31 PM
Status:Open Resolved:
Product:MacOSX Product Version:10.5
Classification: Reproducible:

The HTML Tidy version shipped with Mac OS X 10.5.x is rather outdated.
On Mac OS X 10.5.7, the shell command 'tidy --version' outputs "HTML Tidy for Mac OS X released on 31 October 2006 - Apple Inc. build 13".

Please update HTML Tidy and its core, TidyLib, to the latest official version (as of this writing, the version dated 25 March 2009), which offers better functionality and fixes several bugs.

According to the developers of Tidy, there are no official tagged release builds of Tidy/TidyLib, only the CVS HEAD source. Consequently, "the latest official release" always means the latest CVS trunk/HEAD snapshot. Tidy's version string is defined in tidy/src/version.h; as of this writing, it is "25 March 2009".
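
Because Tidy identifies itself by release date rather than by version number, comparing two builds comes down to comparing dates. The following is only a minimal sketch of that comparison, using the two strings quoted in this report. The function name tidy_release_date is a hypothetical helper and not part of Tidy, and the sketch assumes the date always appears as "day month-name year" with an English month name.

```python
import re
from datetime import datetime

# Assumption: the release date appears as e.g. "31 October 2006".
_DATE_RE = re.compile(r"(\d{1,2}) ([A-Z][a-z]+) (\d{4})")


def tidy_release_date(version_text):
    """Return the release date found in a date-based Tidy version string.

    Hypothetical helper for illustration; not part of Tidy or TidyLib.
    """
    match = _DATE_RE.search(version_text)
    if match is None:
        raise ValueError("no release date found in: %r" % version_text)
    # %B parses the full month name (English in the default C locale).
    return datetime.strptime(" ".join(match.groups()), "%d %B %Y").date()


# Output of 'tidy --version' on Mac OS X 10.5.7, as quoted in this report.
shipped = tidy_release_date(
    "HTML Tidy for Mac OS X released on 31 October 2006 - Apple Inc. build 13"
)
# Version string from tidy/src/version.h at the time of writing.
upstream = tidy_release_date("25 March 2009")

if shipped < upstream:
    print("Shipped Tidy is older than upstream by",
          (upstream - shipped).days, "days")
```

The regular expression is deliberately loose so that the same helper accepts both the full 'tidy --version' line and the bare string from version.h.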

Many web developers in particular rely on Tidy and TidyLib, and on both being reasonably up to date.
I would appreciate it if Apple's HTML Tidy kept pace with the latest official (meaning: CVS HEAD) version of the HTML Tidy project. It should therefore be synced with the official sources more often than it has been so far.

For details, see http://tidy.sourceforge.net/


New evolution and revival since January 2015

The tidy binary (/usr/bin/tidy) and its library, TidyLib (/usr/lib/libtidy.A.dylib), as shipped in OS X/iOS, are very outdated. They do not support HTML5.

The Tidy project was revived in January 2015, when it was officially handed over from the W3C to the newly founded HTML Tidy Advocacy Community Group (HTACG). The group has since published a new stable release, 5.0.0, which is suitable for distribution and supports HTML5. It is already distributed via the OS X package managers MacPorts, Homebrew and Fink, which have adjusted their packages/ports accordingly.

See https://github.com/htacg/tidy-html5 for the GitHub project and http://www.html-tidy.org/ for the web site. The HTML Tidy Advocacy Community Group, hosted by the W3C, can be found at https://www.w3.org/community/htacg/.

One known dependency: the PHP 5 tidy extension (/usr/libexec/apache2/libphp5.so) depends on the old libtidy-0.99.0-shlibs. It should be updated accordingly.

[quote=http://www.htacg.org/] HTACG

Founded on 2015-January-15, the HTML Tidy Advocacy Community Group (“HTACG”) is the development group now responsible for the continued support, development, and evolution of HTML Tidy, the venerable command line application and library that cleans, diagnoses, and pretty-prints your HTML.

Since its early development and release by Dave Raggett, HTML Tidy has matured and become the de facto, go-to tool for HTML diagnosis and pretty printing. HTACG is charged with ensuring its continued legacy. [/quote]

[quote=https://www.w3.org/community/htacg/] HTML Tidy Advocacy Community Group

The HTML Tidy Advocacy Community Group ("HTACG") is dedicated to the continued support, development, and evolution of the HTML Tidy command line application and library. The Community in cooperation with the W3C aims to become the canonical release group for HTML Tidy, which has been without a stable, public release since 2008. The Community aspires to achieve the agreement and support of the original and current developers to this end. The Community will continue to develop HTML Tidy to adapt it to modern standards; to implement testing systems; and to implement robust build systems. The Community will also promote the continued relevance of HTML Tidy in modern software systems. This group will not publish Specifications. [/quote]

The HTML Tidy project on SourceForge has been dead for years. After an interim period of direct ownership by the W3C, it is now in new hands: a new Community Group of engaged developers, backed and supported by the W3C. Under the lead of the newly founded HTACG, Tidy's development process, design and API have changed. It supports HTML5 (finally! That alone is a huge benefit and could justify an upgrade to a recent version), not to mention the dozens of bug fixes and feature enhancements that had been awaited for several years and have now been delivered.

I urge Apple to update the Tidy and TidyLib shipped in OS X/iOS to a recent version, namely the HTML5-capable version now officially provided by HTACG. It would make so much sense.

The W3C QA team has officially forked the Tidy project to http://w3c.github.io/tidy-html5/ and https://github.com/w3c/tidy-html5, because the old project on tidy.sourceforge.net has been dead for years and its main maintainer, Björn Höhrmann, has not even merged his own HTML5 patch into the main trunk. The official W3C fork, by contrast, has done so. The main difference between the two is therefore that the W3C fork adds HTML5 support along with several code improvements.

From the project's description:

HTML Tidy for HTML5 (experimental)

This repo is an experimental fork of the code from tidy.sourceforge.net. This source code in this version supports processing of HTML5 documents. The changes for HTML5 support started from a patch developed by Björn Höhrmann.

For more information, see w3c.github.com/tidy-html5

The team behind this W3C fork includes, and is backed by, several W3C members, among them the original author of Tidy, Dave Raggett (GitHub nick: draggett), and Dominique Hazael-Massieux (GitHub nick: dontcallmedom). It is led by W3C staff member Michael Smith (GitHub nick: sideshowbarker).

Another contributor to this official fork, named tidy-html5, is Andy Lester (GitHub nick: petdance), the initiator and maintainer of an earlier Tidy fork, tidyp: http://tidyp.com/. tidyp is still the basis for the fairly popular Perl module HTML::Tidy on CPAN: http://search.cpan.org/~petdance/HTML-Tidy-1.56/ (as Andy Lester told me, he plans to abandon his tidyp fork in favour of the official W3C tidy-html5 fork and to link his CPAN module against it instead of his own tidyp).

Two years ago, I convinced the MacPorts maintainer of the Tidy port to abandon the old, dead-end Tidy from tidy.sourceforge.net in favour of the new W3C fork, tidy-html5. Result: https://trac.macports.org/browser/trunk/dports/www/tidy/Portfile See also the notice attached to the port's switch in its changeset (https://trac.macports.org/changeset/96009):

Revision 96009: 07/28/12 09:41:55 tidy: switch to W3C tidy-html5 version 20120720 at the suggestion of Sierk Bornemann, since the original project at SourceForge hasn't released a new version in over three years; add openmaintainer

Taking all this together, I think Apple would do well to update its shipped Tidy (released on 31 October 2006 - Apple Inc. build 15.12) to this new W3C fork, with its HTML5 support and improvements, instead of continuing to ship a very outdated version of Tidy from SF.net (old code, no HTML5 support) that has not been maintained or updated for years.
