TrAC - Internet Column


To cite this article please refer to the printed edition of TrAC: Trends Anal. Chem. 14 (1995) 426

Distributing and Retrieving Chemical Information Using the World-Wide Web

Brian M. Tissue

Department of Chemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061-0212

Chemical Information on the Internet

More and more chemists are distributing chemical information and chemical-education software over the internet. The information ranges from research and preprint databases and conference proceedings to chemistry department information, company profiles, product catalogs, and educational materials [1-5]. The information exists in many different formats, each of which requires the appropriate retrieval tool. Common internet distribution and retrieval methods are electronic mailing lists, File Transfer Protocol (FTP), Gopher, Wide Area Information Servers (WAIS), Telnet, and Usenet News [6,7]. A relatively new transfer protocol is the Hypertext Transfer Protocol (HTTP) [7,8]. HTTP adds some powerful features to information transfer over the internet, including support for hyperlinks between different files, graphics display, launching of external programs, and rudimentary interactivity. Some HTTP-client software programs, which are known as browsers, also support the older information transfer protocols. These browsers give users a single integrated tool to access a wide variety of internet services [7,8].

This article describes HTTP client and server software and its use to distribute and access chemical information over the internet. A detailed description of the analytical chemistry hypermedia at Virginia Tech illustrates some of the advantages and limitations of developing and distributing hypermedia over the internet.

The World-Wide Web

The World-Wide Web (W3, WWW, or Web) is an internet-based hypermedia system that originated at CERN (European Laboratory for Particle Physics) [7-10]. The Web consists of all computer files (documents) that are archived on computers and accessible through HTTP server software. Each document on the Web has a Uniform Resource Locator (URL) that identifies it by transfer protocol, internet address, and file type, e.g., http://computer.domain.network/filename.html, gopher://computer.domain.network, or news:sci.chem. To access hypermedia documents, a user runs a browser program such as NCSA Mosaic [7,11,12] or Netscape [13] on a client computer with internet access. When a user opens a URL in a browser program, the browser retrieves the requested file from the server by the appropriate protocol, interprets the file type and formatting instructions, and displays the information on the client computer screen. Fig. 1 shows a screen capture of NCSA Mosaic displaying a simple hypermedia document that contains small graphic images and highlighted (underlined) hyperlinks to other documents. Clicking on one of the underlined headings with a mouse pointer retrieves and displays the selected document. Most browsers support in-line images in Graphics Interchange Format (GIF), and can support other graphics formats, audio clips, and video clips (MPEG or QuickTime) by launching external viewer or player programs.

Fig. 1. Screen capture of an HTML document using NCSA Mosaic for Windows. The hyperlinks are underlined.
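The URL examples given in this section can be taken apart programmatically to show the pieces a browser works with. The following is a minimal sketch in Python, using the standard urllib.parse module; the function name describe_url and the dictionary keys are illustrative choices, not part of any browser or of the software described in this article.

```python
from urllib.parse import urlsplit

# The three example URLs quoted in the text.
EXAMPLE_URLS = [
    "http://computer.domain.network/filename.html",
    "gopher://computer.domain.network",
    "news:sci.chem",
]


def describe_url(url):
    """Split a URL into transfer protocol, internet address, and path.

    describe_url and its dictionary keys are hypothetical names chosen
    for this illustration.
    """
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # e.g. "http", "gopher", "news"
        "address": parts.netloc,    # host name; empty for news: URLs
        "path": parts.path,         # file name or newsgroup
    }


if __name__ == "__main__":
    for url in EXAMPLE_URLS:
        print(url, "->", describe_url(url))
```

Note that the news: URL carries no internet address of its own; the newsgroup name appears in the path, and the client decides which news server to contact.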

HTML

The hypermedia documents archived on HTTP servers are written in Hypertext Markup Language (HTML) [14]. The individual HTML documents are ASCII text files that contain formatting commands, much like early word-processing or typesetting programs. To illustrate the HTML format, the source document for Fig. 1 is reproduced below. Subdirectories were left out of the filenames to simplify this example.


<HTML>
<HEAD>
<TITLE>Virginia Tech Chemistry Hypermedia</TITLE>
</HEAD>
<BODY>
<H1>Virginia Tech Chemical Education Hypermedia</H1>
<HR>
The main headings below lead to descriptions of the hypermedia tutorials that are listed in the menus. A separate <A HREF="vt-chem-course-material.html">Virginia Tech Chemistry Course Material</A> document indexes course-specific material for Va Tech chemistry students.
<HR>
<H3><IMG SRC="VT-Web.gif"><A HREF="Overview.html">Overview of the Chemistry Hypermedia Project</A></H3><HR>
<H3><IMG SRC="VT-ac-Flasks.gif"><A HREF="ac-home.html">Analytical Chemistry Hypermedia</A></H3>
<UL>
<LI><H7><A HREF="ac-intro.html">Introduction to Analytical Chemistry</A></H7>
<LI><H7><A HREF="ac-basics.html">Analytical Chemistry Basics</A></H7>
<LI><H7><A HREF="ac-methods.html">Encyclopedia of Instrumental Methods</A></H7>
<LI><H7><A HREF="ac-spectroscopy.html">Advanced Analytical Spectroscopy</A></H7> </UL>
<HR>
<H3><IMG SRC="OrgChem.gif"><A HREF="org-home.html">Organic Chemistry Hypermedia</A></H3>
. . .
<ADDRESS>Professor Brian Tissue, Department of Chemistry, Virginia Polytechnic Institute & State University, Blacksburg, VA 24061-0212 / (703)231-3786 / TISSUE@VT.EDU</ADDRESS>
</BODY>
</HTML>

The < > brackets contain the formatting instructions that the browser interprets for displaying the document on the screen. For example, <H#>...</H#> defines a heading, <HR> places a horizontal rule, the <UL><LI>...</UL> structure defines an unordered list, and <IMG SRC="filename"> gives the file source for an inline image. <A HREF="filename">...</A> is an anchor that defines a hyperlink. The browser highlights the text between anchor brackets (underlined in Fig. 1), and clicking on the highlighted text instructs the browser to retrieve the file listed in the anchor. The file could be another HTML document, an expanded graphic image, or an audio or video clip. The user controls the final display of an HTML document by selecting browser viewing options, such as font and font size.
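To make the role of the anchor and image tags concrete, the sketch below reads a fragment of the source document and lists the hyperlink targets and inline image files that a browser would find in it. It is an illustration only, written in Python with the standard html.parser module; the class name LinkCollector and the helper collect are hypothetical, and the sample omits the heading tags around the list items for brevity.

```python
from html.parser import HTMLParser

# A shortened fragment of the source document reproduced above.
SAMPLE = """<H3><IMG SRC="VT-ac-Flasks.gif"><A HREF="ac-home.html">Analytical Chemistry Hypermedia</A></H3>
<UL>
<LI><A HREF="ac-intro.html">Introduction to Analytical Chemistry</A>
<LI><A HREF="ac-basics.html">Analytical Chemistry Basics</A>
</UL>"""


class LinkCollector(HTMLParser):
    """Collect (target file, link text) pairs and inline image sources."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []
        self._href = None   # target of the anchor currently open, if any
        self._text = []     # text seen so far inside that anchor

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self._href = attrs["href"]
            self._text = []
        elif tag == "img" and "src" in attrs:
            self.images.append(attrs["src"])

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def collect(html_text):
    """Return (links, images) found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links, parser.images


if __name__ == "__main__":
    links, images = collect(SAMPLE)
    for target, text in links:
        print(f"{text!r} links to {target}")
    print("inline images:", images)
```

The text paired with each target is what the browser would highlight, and the target is the file it would request when the user clicks on that text.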

The current HTML 2.0 specification has some limitations for scientific writing. NCSA Mosaic currently supports the ASCII character set, accented characters, superscripts, and subscripts, but not advanced math symbols or Greek letters. HTML 3, which currently exists as an Internet draft, will include specifications for tables, figures, text layout control, and mathematical equations [14]. Specific browsers may currently support more or fewer features than the "official" HTML standard. HTML is easy to write with a text editor or with stand-alone hypertext editors, word-processor templates, or conversion programs, e.g., Rich-Text Format (RTF) to HTML. The Web is expanding so rapidly that it is difficult to stay current on the availability and status of these software tools. The most up-to-date information about HTML authoring and HTTP server and browser software is found on the Web itself [8] and in a WWW-FAQ (Frequently-Asked Questions) in the alt.hypertext and comp.infosystems.www.users newsgroups.

The Chemistry Hypermedia Project at Virginia Tech

Hypermedia and network delivery have several potential advantages in education and training [15]. Hypermedia provides links not only to related topics but also to remedial or advanced material. The hyperlinks are placed in-context and therefore provide help specifically where it is needed. The hyperlinks also illustrate the connections between advanced and basic topics, which continually reinforces the basic principles and helps students to see the "big picture." The incorporation of multimedia in instructional material can be more effective than textual descriptions alone [16,17]. However, multimedia is best used in a supportive role with appropriate text in order to convey the chemical concept and not just a flashy picture. Distributing educational material via an internet server allows authors to update and append material continuously, provides a record of student use, and allows integrated e-mail communication with authors or instructors.

The cost-effectiveness of server delivery versus CD-ROM distribution depends on the scale of the distribution and is difficult to predict. The personnel time required to develop specialized multimedia material will usually exceed eventual distribution costs. The disadvantage of internet delivery is that it requires network and computer infrastructure that is not available to many would-be users, especially K-12 students. Network bandwidth also places a practical limit on video use, and we use video only when it is necessary to convey a sequence of events such as the operation of an instrument.

The goal of the Chemistry Hypermedia Project is to develop and evaluate hypermedia for chemical education. Individual hypermedia documents consist of basic theoretical and experimental descriptions of a chemistry topic with links to related topics, basic concepts, and advanced examples [15,18]. The individual documents are specific and self-contained to allow flexibility in piecing them together for different applications. Keeping the documents short (typically 3-5 computer screens) also minimizes internet transfer time. Fig. 2 shows part of a hypermedia document on simple UV-vis absorption spectroscopy. The document is mostly text (out of view) describing typical UV-vis applications and instrumental components such as light sources, monochromators, and detectors. There are hyperlinks to the theory of absorption measurements and the Beer-Lambert law at the beginning of the document and a link to dual-beam absorption spectroscopy at the end of the document. The small window in the lower right of the figure shows a movie player playing a video clip that gives operating instructions for a single-beam spectrophotometer. The browser program launches the player and transfers the video file when a user clicks on a hyperlink (out of view in Fig. 2). The file size for this 30-second movie is 1.3 Mbyte, which takes approximately one minute to transfer to a client with an Ethernet connection.

Fig. 2. Screen capture of part of an HTML document on UV-vis absorption spectroscopy. A movie player is visible in the lower right part of the screen.
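The transfer figures quoted for the video clip (1.3 Mbyte in approximately one minute) imply an effective throughput that is easy to work out, and the same arithmetic shows why video is impractical over slower links. The sketch below is a rough calculation in Python under stated assumptions: 1 Mbyte is taken as 10^6 bytes, protocol overhead is ignored, and the 14.4 kbit/s dial-up rate is a hypothetical comparison that does not come from the measurements in this article.

```python
def effective_throughput_bps(size_bytes, seconds):
    """Average throughput in bits per second for a completed transfer."""
    return size_bytes * 8 / seconds


def transfer_time_s(size_bytes, rate_bps):
    """Time in seconds to move size_bytes at a sustained rate_bps."""
    return size_bytes * 8 / rate_bps


# Values from the text; 1 Mbyte = 10**6 bytes is an assumption.
MOVIE_BYTES = 1.3e6
OBSERVED_SECONDS = 60.0

# Hypothetical dial-up modem rate, used only for comparison.
HYPOTHETICAL_MODEM_BPS = 14_400

if __name__ == "__main__":
    rate = effective_throughput_bps(MOVIE_BYTES, OBSERVED_SECONDS)
    print(f"effective throughput: {rate / 1000:.0f} kbit/s")

    modem_time = transfer_time_s(MOVIE_BYTES, HYPOTHETICAL_MODEM_BPS)
    print(f"same file over the modem: {modem_time / 60:.0f} min")
```

Under these assumptions the observed transfer corresponds to roughly 170 kbit/s, and the same clip would need on the order of 12 minutes over the hypothetical modem link.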

The aim of integrating hypermedia into the chemistry curriculum is to improve the efficiency and effectiveness of students' study time outside of class and their time in laboratory [15,18]. We are developing hypermedia tutorials that supply remedial material for students with background deficiencies and pre-lab exercises to improve students' understanding of an experiment before coming to laboratory. The graphics and video clips in the hypermedia documents illustrate the actual laboratory equipment that students use in their experiments. Students access the individual hypermedia documents from entry-point documents that resemble menus of course topics. A tutorial for a non-laboratory graduate spectroscopy course provides lecture notes with links to remedial material on spectroscopy, optics, and quantum mechanics. It is designed for students who did not have an instrumental analysis course in their undergraduate curriculum. A similar tutorial is under development for senior-level instrumental analysis. It will contain pre-lab assignments that complement the experiment handouts with graphics and video. The pre-labs will also contain multiple-choice and short-answer questions that the server can correct.

The first full-scale implementation of hypermedia in instrumental analysis (24-36 students) is scheduled for Fall 1995. Student use and feedback in this pilot project will guide plans for further integration of hypermedia into the lower-level analytical and general chemistry curriculum.

There are currently three stand-alone analytical chemistry tutorials (under continuous development) that are designed for outside users [18]. An Analytical Chemistry Basics tutorial follows a typical course syllabus for sophomore-level analytical chemistry with links to introductory documents on gravimetry, titration, electrochemistry, spectroscopy, and separations. An Encyclopedia of Instrumental Methods is a menu of instrumental techniques, and an Advanced Analytical Spectroscopy tutorial is a subset of the encyclopedia with additional spectroscopy theory. In keeping with a modular design, these tutorials use the same documents as the course material but arrange them differently.

As of June 1995 the Virginia Tech server is averaging over 2400 connections per day, and the number of connections is increasing 20-30% per month [19]. One connection is one file transfer; the number of different clients accessing the server is more than 200 per day. Approximately 40% of users are from US educational institutions, 30% are international, and 20% are US commercial. The access patterns range from users who are just "looking around" to users who take an in-depth look at Departmental information or the hypermedia tutorials. An average access load of approximately 1500 connections per day began to saturate a PowerMac 6100/60AV running MacHTTP software. Our server is now a Silicon Graphics Indy workstation with WebForce server software, and our bottleneck for outside users at peak times appears to be the Virginia Tech link to the internet.
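The distinction between connections (file transfers) and different clients can be illustrated with a short log-counting sketch in Python. It assumes, purely for illustration, an access log with one line per file transfer and the client host name as the first whitespace-separated field, as in the Common Log Format; the article does not state which log format the Virginia Tech server uses, and the host names and entries in SAMPLE_LOG are invented.

```python
from collections import Counter

# Invented sample entries; one line per file transfer (connection).
SAMPLE_LOG = """\
host-a.example.edu - - [01/Jun/1995:09:00:01 -0400] "GET /index.html HTTP/1.0" 200 2048
host-a.example.edu - - [01/Jun/1995:09:00:02 -0400] "GET /VT-Web.gif HTTP/1.0" 200 5120
host-b.example.com - - [01/Jun/1995:09:05:10 -0400] "GET /ac-home.html HTTP/1.0" 200 3072
"""


def summarize(log_text):
    """Return (number of connections, number of different clients).

    Assumes the client host is the first field on each non-empty line.
    """
    per_client = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        per_client[fields[0]] += 1
    return sum(per_client.values()), len(per_client)


if __name__ == "__main__":
    connections, clients = summarize(SAMPLE_LOG)
    print(f"{connections} connections from {clients} different clients")
```

In the sample, one client retrieving a document and its inline image accounts for two connections, which is why the connection count is much larger than the number of different clients.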

Summary

HTML is a relatively easy multimedia authoring language, and the popularity and platform-independence of the Web will make it a stable system for distributing chemical information. The multimedia support of HTTP greatly enhances the capability of information distribution over the internet, and on-line publishing, conferences, and education will become more common [5]. The rapid increase in users currently puts a heavy load on network bandwidth and server hardware, so applications must be designed to minimize file-transfer time.

Acknowledgment

I gratefully acknowledge the other contributors to the Chemistry Hypermedia Project at Virginia Tech, C.-W. Yip, Y.-L. Wong, R. L. Earp, and M. R. Anderson, and financial support from the NSF Division of Undergraduate Education, DUE-9455382.

References

Brian M. Tissue received a B.A. in chemistry from the Johns Hopkins University in 1983 and a Ph.D. in analytical chemistry from the University of Wisconsin-Madison in 1988. He held postdoctoral positions at the University of Georgia and Los Alamos National Laboratory before beginning his current position of Assistant Professor in the Department of Chemistry at Virginia Polytechnic Institute and State University in 1993. His research interests include laser spectroscopy of nanocrystalline materials, laser-induced-plasma time-of-flight mass spectrometry, and applications of computer technology in chemical education.

©1995 Elsevier Science bv
