No central organization controls the web, but there is a need for agreement on the basic protocols, formats, and practices, so that the independent computer systems can interoperate. In 1994, recognizing this need, the Massachusetts Institute of Technology created the World Wide Web Consortium (W3C) and hired Tim Berners-Lee, the creator of the web, as its director. Subsequently, MIT added international partners at the Institut National de Recherche en Informatique et en Automatique in France and the Keio University Shonan Fujisawa Campus in Japan. W3C is funded by member organizations, which include most of the larger companies that develop web browsers, servers, and related products.
W3C is a neutral forum in which organizations work together on common specifications for the web. It works through a series of conferences, workshops, and design processes. It provides a collection of information about the web for developers and users, especially specifications about the web with sample code that helps to promote standards. In some areas, W3C works closely with the Internet Engineering Task Force to promulgate standards for basic web technology, such as HTTP, HTML, and URLs.
As a neutral body in a field that is now dominated by fiercely competing companies, W3C has no power beyond its ability to influence. One of its greatest successes was the rapid development of an industry standard for rating content, known as PICS. This was a response to political worries in the United States about pornography and other undesirable content being accessible by minors. More recently it has been active in the development of the XML mark-up language.
Companies such as Microsoft and Netscape sometimes believe that they gain by supplying products that have non-standard features, but these features are a barrier to the homogeneity of the web. W3C deserves much of the credit for the reasonably coherent way that the web technology continues to be developed.
Conventions
The first web sites were created by individuals, using whatever arrangement of the information they considered most appropriate. Soon, conventions began to emerge about how to organize materials. Hyperlinks permit an indefinite variety of arrangements of information on web sites. Users, however, navigate most effectively through familiar structures. Therefore these conventions are of great importance in the design of digital libraries that are built upon the web. The conventions never went through any standardization body, but their widespread adoption adds a coherence to the web that is not inherent in the basic technology.
• Web sites. The term "web site" has already been used several times. A web site is a collection of information that the user perceives to be a single unit. Often, a web site corresponds to a single web server, but a large site may be physically held on several servers, and one server may be host to many web sites.
The convention rapidly emerged for organizations to give their web site a domain name that begins "www". Thus "www.ibm.com" is the IBM web site; "www.cornell.edu" is Cornell University; "www.elsevier.nl" is the Dutch publisher, Elsevier.
• Home page. A home page is the introductory page to a collection of web information. Almost every web site has a home page. If the address in a URL does not specify a file name, the server conventionally supplies a page called "index.html". Thus the URL:
http://www.loc.gov/
is interpreted as http://www.loc.gov/index.html. It is the home page of the Library of Congress. Every designer has slightly different ideas about how to arrange a home page but, just as the title page of a book follows standard conventions, home pages usually provide an overview of the web site. Typically, this combines an introduction to the site, a list of contents, and some help in finding information.
The term "home page" is also applied to small sets of information within a web site. Thus it is common for the information relevant to a specific department, project, or service to have its own home page. Some individuals have their own home pages.
• Buttons. Most web pages have buttons to help in navigation. The buttons provide hyperlinks to other parts of the web site. These have standard names, such as "home", "next", and "previous". Thus, users are able to navigate confidently through unfamiliar sites.
• Hierarchical organization. As seen by the user, many web sites are organized as hierarchies. From the home page, links lead to a few major sections. These lead to more specific information and so on. A common design is to provide buttons on each page that allow the user to go back to the next higher level of the hierarchy or to move horizontally to the next page at the same level. Users find a simple hierarchy an easy structure to navigate through without losing a sense of direction.
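Two of the conventions above, the default home page and the navigation buttons of a hierarchical site, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the "up" button label, and the small example site are hypothetical, and real web servers make the default file name configurable rather than fixing it as "index.html".

```python
from urllib.parse import urlsplit, urlunsplit

# Conventional default file name described in the text (an assumption here;
# actual servers let the administrator choose it).
DEFAULT_PAGE = "index.html"


def resolve_home_page(url):
    """Return the URL a server would conventionally serve for `url`.

    If the path names no file (it is empty or ends in "/"), the default
    page name is appended; otherwise the URL is returned unchanged.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += DEFAULT_PAGE
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# A hypothetical site, stored as a mapping from each page to its child pages.
SITE = {
    "home": ["collections", "services"],
    "collections": ["maps", "photographs"],
    "services": [],
    "maps": [],
    "photographs": [],
}


def navigation_buttons(site, root, page):
    """Compute the targets of the standard buttons for one page.

    "up" leads to the next higher level of the hierarchy; "previous" and
    "next" move horizontally among pages at the same level. A button with
    no target is reported as None.
    """
    parent = next((p for p, children in site.items() if page in children), None)
    siblings = site[parent] if parent is not None else [page]
    position = siblings.index(page)
    return {
        "home": root,
        "up": parent,
        "previous": siblings[position - 1] if position > 0 else None,
        "next": siblings[position + 1] if position + 1 < len(siblings) else None,
    }
```

For example, `resolve_home_page("http://www.loc.gov/")` yields the Library of Congress address with "index.html" appended, and `navigation_buttons(SITE, "home", "maps")` reports "collections" as the higher level and "photographs" as the next page at the same level.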
The web as a digital library
Some people talk about the web technology as though it were an inferior stop-gap until proper digital libraries are created. One reason for this attitude is that members of other professions have difficulty in accepting that definitive work in digital libraries was carried out by physicists at a laboratory in Switzerland, rather than by well-known librarians or computer scientists. But the web is not a detour to follow until the real digital libraries come along. It is a giant step to build on.
People who are unfamiliar with the online collections also make derogatory statements about the information on the web. The two most common complaints are that the information is of poor quality and that it is impossible to find it. Both complaints have some validity, but are far from the full truth. There is an enormous amount of material on the web; much of the content is indeed of little value, but many of the web servers are maintained conscientiously, with information of the highest quality. Finding information on the web can be difficult, but tools and services exist that enable a user, with a little ingenuity, to discover most of the information that is out there.
Today's web, however, is a beginning, not the end. The simplifying assumptions behind the technology are brilliant, but these same simplifications are also limitations.
The web of today provides a base to build the digital libraries of tomorrow. This requires better collections, better services, and better underlying technology. Much of the current research in digital libraries can be seen as extending the basic building blocks of the web. We can expect that, twenty-five years from now, digital libraries will be very different; it will be hard to recall the early days of the web. The names "Internet" and "web" may be history or may be applied to systems that are unrecognizable as descendants of the originals. Digital libraries will absorb materials and technology from many places. For the next few years, however, we can expect to see the Internet and the web as the basis on which the libraries of the future are being built. Just as the crude software on early personal computers has developed into modern operating systems, the web can become the foundation for many generations of digital libraries.