 


Saturday, September 19, 2009, 04:00 pm PT (07:00 pm ET)

Why Apple is betting on HTML 5: a web history


HTML 2: Who's in charge?

After the original HTML draft, along with HTML+, expired in 1994, the IETF set up the HTML Working Group to begin work on HTML 2. At the same time, Berners-Lee created the World Wide Web Consortium (W3C) with the overlapping intent to shepherd the development of web standards in general. The HTML 2.0 IETF specification was released in 1995 to codify a variety of new changes and serve as a foundation for future web development.

In 1996, the IETF closed its own working group and essentially delegated the task of managing HTML to the W3C. Much of the work of developing the HTML 2 standard had been based on simply recognizing the extensions various browser developers had originated on their own, rather than actually laying out an optimized, well-conceived way to achieve specific goals.

The lead developer at Mosaic, Marc Andreessen, had left the NCSA in 1993 to set up Netscape as a private enterprise to develop his personal concept of where the web browser should go. Netscape began creating its own extensions to HTML without any discussion with the larger community, a problem that risked derailing the open nature of the web itself.

Netscape was primarily interested in rapidly creating a way to deliver web pages that could catch the attention of consumers, so the additions it made to HTML included tags that specified things like a background color for the page, or specific font faces for text. To academics, this inappropriately mixed presentation into a standard that ideally should only present descriptive semantics of how the document was organized.

If this continued, HTML would stop being a flexible document format that could be interpreted for different purposes, and simply become a clumsy way to render a specific document view for a single purpose: a web browser running on a desktop PC. Defining a background color in HTML, for example, might result in a page design that is difficult to render for blind users, while specifying a particular font size or face could prevent the document from being properly scaled up or down to fit the client device being used.

HTML 3: so many standards to choose from

In 1995, the W3C floated a draft of HTML 3, which was intended to formalize a variety of emerging features, including support for the needs of math and scientific documents. Among the other new features included in HTML 3 was support for tables, based on a request by the US Navy to accommodate tables of data used in its complex documentation.

As HTML 3 branched out to serve the needs of virtually everyone, browser vendors with limited resources began to pick and choose what elements of the specification they could or would implement. This resulted in different browsers supporting different subsets of the "standard," while they also each added unique, non-standard features of their own.

Meanwhile, Netscape's leadership in the browser market was challenged by Microsoft, which in 1995 licensed the original Mosaic code and began forking it off in a new direction. Its aim was to prevent the web from being defined by a group of companies (primarily Netscape and Sun) that had a vested interest in breaking up Microsoft's grip on the PC operating system market.

To keep ahead of Microsoft, Netscape continued adding its own proprietary extensions to HTML. One example is the concept of HTML frames, which allowed a browser to display multiple independent web pages together within the same screen. After adding frames to its Navigator browser, Netscape submitted the idea for inclusion into the HTML specification, essentially precluding the community from being able to discuss the merits of the idea or its implementation.

By the end of 1996, Microsoft had scrambled out the third major release of Internet Explorer in just a year, a frantic pace that was clearly designed to tie the future of the web to Windows. IE 3.0 added support for ActiveX, a way to build complex interface controls within web pages that would only run on Windows. Netscape added its own implementation of ActiveX and added a scripting language named JavaScript, which IE matched with its own compatible JScript.

With Netscape and Microsoft racing to outdo each other in unique features, the glacial pace of the sausage-making deliberation on how to best implement HTML as an interoperable standard began to run aground. The bottom of the barrel was reached with Netscape's BLINK tag, which Microsoft matched in silliness with its own MARQUEE tag; both unplugged HTML from the goal of delivering serious documentation presentation and instead targeted the web at replicating the garish desperation of gaudy neon signs in a red light district.

Meanwhile, the IETF continued work on its official HTML 2 specification, adding support for features such as international characters, tables, and image maps. The challenge of defining a minimum official standard while also allowing for both rapid independent innovation and the potential for a well-conceived future roadmap everyone could agree on began to seem like a nearly impossible effort.

HTML 4: getting on the same page

However, the W3C managed to broker high-level consensus on important issues within a series of regular meetings held between representatives of Netscape, Microsoft, Sun and other involved parties invited to participate in a new HTML Editorial Review Board.

Among the common ground established was the decision to exclude BLINK and MARQUEE tags from the official HTML specification and plans to deprecate support for Netscape's presentational markup features in favor of using CSS (Cascading Style Sheets), which would separate web presentation from the purely descriptive elements of HTML.

Using CSS, a web author could create HTML documents that could be rendered on screen, targeted for printing, voiced by a screen reader, or otherwise interpreted in different ways. There were two problems: first, Netscape's tags had already mingled presentational and descriptive markup together for most existing web pages; second, browsers would need to add new support for CSS specifications in addition to HTML.

CSS 1 supplied a way for HTML documents to describe elements according to a given style (such as "heading"), rather than a specific presentation ("helvetica bold 16 pt"). A CSS file would supply the presentation details, which the user or browser could substitute to suit their needs, such as using a different typeface or reading the text with a different inflection. CSS 1 also enabled styling for text, graphics, and table alignment.

To enable HTML 4 to emerge cleanly from the existing HTML 3.2 specification released in early 1997, the new HTML 4 allowed both "transitional" pages, which allowed earlier conventions of deprecated tags, and new "strict" pages that adhered to the CSS approach of separating description from presentation details. HTML 4 was launched at the end of 1997, followed by the clarification of HTML 4.01 at the end of 1999.

In addition to the separation of description and presentation, W3C's HTML 4 also codified procedural aspects of web documents, standardizing both the JavaScript language for performing client-side interactivity and the DOM (Document Object Model) for representing and interacting with objects described in HTML. The "HTML 4.01 strict" specification was then published as an ISO standard in 2000.

Microsoft actually pioneered support for the new HTML, CSS, and JavaScript web standards, in large part because it had killed off Netscape by bundling IE with Windows. The vast amount of work required to develop a new browser from scratch erected significant barriers to entry for new competitors, and the ubiquity of IE seemed to offer no market-based incentive for anyone to try.

After a flurry of rapid development throughout the 90s, there would be no practical advancement of HTML for almost a decade.

HTML timeline


XHTML 1: specification perfection failure

The W3C decided to take HTML 4 in a new direction starting in 1998 with XHTML 1.0. Rather than being a form of SGML, the new specification forced HTML into XML (Extensible Markup Language), a stricter subset of SGML, requiring syntactic changes that were relatively minor.

Rendering existing HTML documents had become extremely complex due to HTML's flexible ambiguity. By imposing the strict rules of XML formatting on HTML documents, the W3C hoped to achieve two goals. The first was the ability to create, modify and render HTML documents using XML tools; existing HTML was too sloppy to be reliably parsed by machines without a very complex rendering engine aware of HTML's many exceptions.
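The gap between the two parsing models can be sketched in a few lines of Python using only the standard library. This is an illustration rather than a model of how any browser works: the sample markup and the helper names (`html_tags`, `is_well_formed_xml`) are invented for the example. A tolerant HTML parser reports the tags it sees in sloppy markup without complaint, while an XML parser rejects the same input outright.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical samples: the first is typical loose HTML (unclosed <p>, <br>
# and <b>); the second says the same thing with every element closed.
SLOPPY = "<p>First paragraph<br><p>Second <b>bold</p>"
STRICT = "<div><p>First paragraph<br/></p><p>Second <b>bold</b></p></div>"


class TagCollector(HTMLParser):
    """Records every start tag the tolerant HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(markup):
    """Return the start tags found by Python's forgiving HTML parser."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def is_well_formed_xml(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(html_tags(SLOPPY))           # tags are reported despite the errors
    print(is_well_formed_xml(SLOPPY))  # the XML parser refuses the same text
    print(is_well_formed_xml(STRICT))  # the cleaned-up version is accepted
```

Note that the tolerant parser here only reports tags; deciding what the unclosed elements were supposed to mean is the hard part that a real HTML rendering engine has to handle.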

The second benefit of moving to XML was the potential to modularize specialized markup, freeing the HTML spec from having to define how to mark up specialized math or scientific content, which could instead be defined independently using a specialized XML grammar, such as MathML. An XHTML document could simply reference an external definition for marking up a specific bit of such content.
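As a rough sketch of what that modularity looks like in practice, the Python snippet below parses a small XHTML document that embeds a MathML fragment and groups its elements by XML namespace. The two namespace URIs are the standard ones for XHTML and MathML; the sample document and the helper name `elements_by_namespace` are assumptions made up for this illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical XHTML page: the <math> element switches to the MathML
# vocabulary, which is defined entirely outside the HTML specification.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>The square of x is
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <msup><mi>x</mi><mn>2</mn></msup>
      </math>
    </p>
  </body>
</html>"""


def elements_by_namespace(markup):
    """Count the elements in a document, keyed by their namespace URI."""
    root = ET.fromstring(markup)
    counts = Counter()
    for element in root.iter():
        # ElementTree spells qualified names as "{namespace}localname".
        if element.tag.startswith("{"):
            namespace = element.tag[1:].split("}")[0]
        else:
            namespace = ""
        counts[namespace] += 1
    return counts


if __name__ == "__main__":
    counts = elements_by_namespace(DOC)
    print("XHTML elements:", counts[XHTML_NS])
    print("MathML elements:", counts[MATHML_NS])
```

A generic XML tool can tell the two vocabularies apart purely from the namespace declarations, without knowing anything about either grammar in advance.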

However, the move from HTML 4.01 to XHTML 1 introduced new complexity that few had any real reason to master. Additionally, Microsoft now had no effective competition left in the browser market, so it had little reason to invest in making IE XHTML-compliant as opposed to adding support for more practical new features such as CSS 2 or solving the serious security crisis that had resulted from rushing ActiveX to market.

Without support for XHTML in the browser most people were using, there was little reason for most web developers to upgrade their sites to support the new standard. Attempts made to transition to XHTML often resulted in complex new problems without delivering any real benefits, so most web developers returned to using HTML 4.01.

XHTML 2.0: standards for nobody

Faced with the reality that IE and its 90% share of the browser market would never support XHTML, the W3C decided in 2002 to begin publishing drafts of a new XHTML 2.0 specification that killed any pretense of backward compatibility with HTML 4 or XHTML 1. This greatly simplified the scope of the project, but made the new work almost completely irrelevant to anyone.

As the W3C continued along this trajectory, the competitive landscape began changing dramatically. Netscape had been reborn as Mozilla, and its new Firefox browser began to pick up new users dissatisfied with Microsoft's virtual abandonment of its own web browser after IE 6 was released in 2001.

By the time IE 7 was launched at the end of 2006, Firefox had become a viable, popular alternative on the PC, and Apple's new Safari browser had evicted IE entirely from the web-significant Mac platform. Opera was similarly staking out a specialty niche among mobile browsers, where Microsoft's terrible Pocket IE offered opportunity for alternatives.

A growing contingent of increasingly important players on the web, led by Apple, Mozilla and Opera, began looking at how web standards could be advanced to support open, vendor-independent alternatives to the then-existing landscape of the web. That landscape was increasingly dominated by proprietary Flash content that relied upon a separate plugin to render it, effectively turning the open web back into a proprietary system like AOL had been in the late 80s: closed, buggy and slow.

Convinced that the web would never effectively transition to XML, Mozilla and Opera proposed that the W3C drop its XHTML efforts in favor of extending HTML 4 in more practical new ways that focused on rich web apps, and limit the use of XML to new technologies such as RSS. "We consider Web Applications to be an important area that has not been adequately served by existing technologies," the companies wrote in a position paper. "There is a rising threat of single-vendor solutions addressing this problem before jointly-developed specifications." W3C members voted the idea down.
