Is the Web still the Web?

Given the growing number of data types and file formats being transmitted over HTTP and the increasing complexity of the applications that make use of them, is today's Web really still the Web?

For developers of RIAs (rich Internet applications), Adobe's announcement that Google and Yahoo will soon be able to index text within Flash movies should come as welcome news. Until now, Flash files have been black boxes: search indexers could no more extract text from these binary files than from JPEGs or PNGs.

This first stab at Flash search still sounds somewhat primitive, but it raises an issue of importance to all Internet application developers. Given the growing number of data types and file formats being transmitted over HTTP and the increasing complexity of the applications that make use of them, is today's Web really still the Web? Or is it morphing into something else? How can we ensure that today's Web apps offer enough capabilities and flexibility to make Web 2.0 worthy of its name?

When Tim Berners-Lee first envisioned the Web in the 1980s, he saw it as primarily an information storage and retrieval system, based on the concept of hypertext. Web documents were fundamentally text, embellished with a markup language (HTML) that described how the text should be formatted and how each document was linked to others elsewhere on the Web.

Those embellishments were really only recommendations, however. How a page actually looked depended on the browser, client, or device on which you viewed it. Each document had a unique URL that described where it could be found, and not much else. Once you'd retrieved it, what you did with it was up to you and your software. If you wanted to, you could even view the raw source to see how its author encoded it.
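To make that openness concrete, here is a minimal sketch of what any client, including a search indexer, can do with a plain HTML document: read the text and follow the links. It uses only Python's standard `html.parser` module. The sample page, the `TextAndLinkExtractor` class, and the `extract` helper are invented for illustration and are not how any particular search engine works.

```python
from html.parser import HTMLParser


class TextAndLinkExtractor(HTMLParser):
    """Collects readable text and link targets from an HTML document."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Anchor tags carry the hyperlinks that tie documents together.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Everything between tags is plain text that software can read.
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


# Hypothetical page, made up for this example.
SAMPLE_PAGE = """
<html>
  <body>
    <h1>Hypertext</h1>
    <p>Documents link to <a href="http://example.org/other.html">other documents</a>.</p>
  </body>
</html>
"""


def extract(html):
    """Return (text, links) found in the given HTML string."""
    parser = TextAndLinkExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


text, links = extract(SAMPLE_PAGE)
print(text)
print(links)
```

Nothing here depends on a particular browser or device; the markup itself is the interface, which is exactly what a compiled binary does not offer.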

Today, that early vision is gradually being replaced by a much more complex model. The static HTML document is largely a thing of the past. In its place is a diverse range of technologies, each of which falls somewhere along a continuum that runs from the flexibility and openness of Web 1.0 to a closed, binary-only paradigm that's more akin to traditional desktop software.

Plain HTML, CSS, and JavaScript are still there for those who want them, of course. In addition to preserving the traditional features of the Web, standards-compliant HTML allows pages to be viewed on the widest possible range of devices. For RIA developers, however, this is often an arduous road. Traditional methods are too limited to facilitate the kinds of rich application UIs that users have come to expect.

AJAX offers some relief, but already AJAX developers must make trade-offs. Few AJAX applications are device-neutral; they assume a GUI browser, a keyboard, a mouse. By comparison, even the presence of JavaScript is not a given in the "pure" Web 1.0 model. What's more, content delivered via AJAX applications is fragmentary and less structured than traditional Web pages. Web documents become amorphous constructs that change to reflect the whims of the end-user.
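A rough sketch of that trade-off, assuming a client that reads markup but does not execute JavaScript: a static page hands over its text directly, while an AJAX "shell" page hands over an empty container and a script. Both sample pages, the `/hours.json` URL, and the `visible_text` helper are hypothetical, written only to illustrate the point.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script/style bodies and whitespace-only runs.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_text(html):
    """Return the text a non-scripting client would see in the markup."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical static page: the content is in the document itself.
STATIC_PAGE = (
    "<html><body><h1>Store hours</h1>"
    "<p>Open daily from 9 to 5.</p></body></html>"
)

# Hypothetical AJAX shell: the content is fetched later by script.
AJAX_SHELL = """<html><body><div id="app"></div>
<script>
  var request = new XMLHttpRequest();
  request.open("GET", "/hours.json");
  request.send();
</script></body></html>"""

print(repr(visible_text(STATIC_PAGE)))
print(repr(visible_text(AJAX_SHELL)))
```

The second page may show the same information to a user with a scripting browser, but the document delivered over HTTP no longer contains it.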

The next step along the continuum is exemplified by such products as Google Web Toolkit (GWT), which compiles entire Java applications down to JavaScript code, to be executed in the browser. At this point, the concept of the HTML document as the atomic unit of the Web virtually disappears. The only "document" is an executable program, and though the source code may be available for viewing, it's likely to be inscrutable even to experienced developers.

This trend reaches its extreme with content delivered for plug-ins, such as Flash or Microsoft Silverlight. Applications written for these platforms don't resemble HTML in the slightest. They are binary blobs, little different from executable programs built for a desktop OS. You cannot view the underlying code in its raw form. Though Google and Yahoo may be able to extract text data from those binaries, deep linking to specific items is impossible.

These distinctions invite important questions. Is it still the Web if it's not really hypertext? Is it still the Web if you can't navigate directly to specific content? Is it still the Web if the content can't be indexed and searched? Is it still the Web if you can only view the application on certain clients or devices? Is it still the Web if you can't view source?

Equally important, if today's RIAs no longer resemble what we would call the Web, then is shoehorning those applications into the Web's infrastructure really the right way to go? If application developers feel limited by the constraints of standards-compliant browser technologies, should they really be targeting the browser at all? Or is the problem that the client platforms simply aren't evolving fast enough to meet our needs? The debate on these issues is only just beginning.
