Information Technology Demystified

A Report from the UniForum Technical Steering Committee

Web Access to Corporate Data

Each month, the TSC examines a key emerging technology or its use. This time, we look at "DataWeb" technologies.

By Derek C. Kaufman and Lanny J. Lampl

The Internet and the World Wide Web continue to be the darlings of Wall Street. The media, entranced with a vision of point-and-click access to worldwide information, are publishing URLs in ads and newscasts. Meanwhile, the information systems departments of corporations are scrambling to demonstrate to their partners in the business units that they have mastered the Web and are ready to harness it to strategic advantage.

The first demonstration of mastery is easy: a Web site and home page sponsored by marketing and most likely created by a PR agency. Then the real work begins, as the IS department turns to the task of providing useful applications of Web technology within the enterprise on the corporate intranet. Because the majority of today's enterprise-wide applications revolve around accessing and updating corporate information stored in ever-growing databases, it is there that Web technology can make the most immediate impact, by bringing database management system (DBMS) data to the user's desktop through a browser.

The movies have already provided us with glimpses of the future of Web-based data-access applications ("DataWeb" applications, as coined by Sarah E. Varney of Datamation)--a stroll through a data vault, an "angel" who helps find files and a stomach-wrenching flyby past giant data towers guarding sinister secrets. Thanks to Web technologies, these applications are not entirely science fiction; data visualization applications can be built now but at considerable cost. The true promise of the Web, however, is as a ubiquitous user interface, enabled by tools embedded in every desktop.

It appears that the Web browser will become the universal user interface (UUI) that eluded us during the GUI wars, and universal resource locators (URLs) will live up to the promise of the U word in their name. The term information will then denote a broad range of data types covering all sorts of media. Browsers bring the additional benefit of enforcing the discipline of separating data presentation from business logic. This encourages improved modularization and distribution of responsibility, which ultimately may lead to the development of real business objects and simplified maintenance. Add to those benefits increased portability of applications, a trend to lower-cost products and a natural affinity for data warehouse applications, and Web data access technologies constitute a truly winning combination.

Technology Leverage

Currently, there are scores of tools for publishing DBMS data on the Web, operating on all platforms from PC to mainframe. Most current products rely on Common Gateway Interface (CGI) and Hypertext Markup Language (HTML) technology: HTML to build forms and reports, CGI to enable the data access and processing routines on the server. CGI is a set of interfaces that permit a "program" (shell script, Perl, Tcl, C, C++, Fortran, Cobol and others) to receive parameters from an HTML form submitted by a client, invoke processing routines and create a dynamic HTML document containing the results.
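As a rough illustration of that flow, here is a minimal sketch of a CGI-style program in Python. It assumes the Web server passes form data through the standard CGI environment variables (QUERY_STRING for GET requests; CONTENT_LENGTH plus standard input for POST). The form field "customer" and the page layout are hypothetical, chosen only for this example.

```python
#!/usr/bin/env python3
"""Minimal CGI-style program: read form parameters, emit a dynamic HTML page."""
import html
import os
import sys
from urllib.parse import parse_qs


def read_parameters(environ, stdin=sys.stdin):
    """Return form fields as a dict of lists, for GET or POST requests."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # POST data arrives on standard input; the server supplies its length.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET data arrives URL-encoded in the QUERY_STRING variable.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


def build_response(params):
    """Build the full CGI response: header, blank line, then the HTML body."""
    # "customer" is a hypothetical form field used only for illustration.
    customer = params.get("customer", ["(none given)"])[0]
    body = (
        "<html><body>"
        f"<h1>Orders for {html.escape(customer)}</h1>"
        "</body></html>"
    )
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    sys.stdout.write(build_response(read_parameters(os.environ)))
```

The value taken from the form is escaped before it is placed in the page, so that user input cannot inject markup into the generated document.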

DataWeb products mainly consist of HTML authoring tools for form building; DBMS access tools for use within a CGI program; software to format result sets into dynamic HTML documents; and a master CGI program that runs it all. Some products available now do not use CGI but instead use the proprietary Netscape Server application programming interface (NSAPI). NSAPI gives developers fine-grained control over the server's behavior and boosts performance, but it is not subject to the safety and security constraints of CGI. Because NSAPI code runs inside the server process itself, misuse can crash the server; it should be used only by professional software developers, and with caution.
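The DBMS access and result-formatting pieces described above might look something like the following sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a corporate DBMS; the "sales" table, its columns and the sample rows are all invented for the example.

```python
import html
import sqlite3


def rows_to_html_table(column_names, rows):
    """Format a result set as an HTML table, escaping every value."""
    header = "".join(f"<th>{html.escape(str(name))}</th>" for name in column_names)
    lines = [f"<tr>{header}</tr>"]
    for row in rows:
        cells = "".join(f"<td>{html.escape(str(value))}</td>" for value in row)
        lines.append(f"<tr>{cells}</tr>")
    return "<table>\n" + "\n".join(lines) + "\n</table>"


def report(connection, region):
    """Run a parameterized query and return the result as an HTML table."""
    cursor = connection.execute(
        "SELECT product, units FROM sales WHERE region = ? ORDER BY product",
        (region,),  # bound parameter: the form value never becomes SQL text
    )
    column_names = [description[0] for description in cursor.description]
    return rows_to_html_table(column_names, cursor.fetchall())


def demo_connection():
    """Create a throwaway in-memory database with hypothetical sample data."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE sales (region TEXT, product TEXT, units INTEGER)")
    connection.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("West", "Jeans <501>", 120), ("West", "Shirts", 45), ("East", "Jackets", 30)],
    )
    connection.commit()
    return connection


if __name__ == "__main__":
    print(report(demo_connection(), "West"))
```

Keeping the formatter separate from the query function means the same result set could later be rendered another way without touching the data access code.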

The next generation of DataWeb products most likely will be based on Java technology. Using Java, cooperating client and server database access applets will provide greater flexibility and efficiency. JavaScript will be available for writing simple applications to be embedded in HTML documents, and Java Database Connectivity (JDBC) will provide unified access from the Java environment to relational and object databases alike. The JDBC API will be the keystone of second-generation Web-based DBMS products. JDBC is based on the X/Open call-level interface (CLI), as enhanced by Microsoft in ODBC. The first version of JDBC should be finalized soon and available on the Web from JavaSoft.

Planning Considerations

If a DataWeb application is being written to serve up data from existing databases, the legacy DBMS involved will determine the approach to storing data types such as audio, video, images, geographic data, user-defined types, objects and methods. It is even possible (if painful) to store the HTML pages and structure of a Web site in a relational database. However, if the opportunity arises to use a new DBMS for a DataWeb application, the data types and structures it supports should be investigated. Many vendors are beginning to offer powerful hybrid products that marry object and relational DBMS features. In the future, SQL3 will harmonize the disparate vendor approaches to multimedia data types and queries. In the meantime, application developers would be wise to isolate and document vendor-specific usage.
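One way to isolate vendor-specific usage is to keep it in a single, documented table that the rest of the application consults. The sketch below is only illustrative: "vendor_a" and "vendor_b" are hypothetical products, and the column type names assigned to them are assumptions made for the example, not a statement about any real DBMS.

```python
# All vendor-specific type names live in this one table, so a change of DBMS
# (or the arrival of a standard) means editing one place.
# Vendors and their type names here are hypothetical, for illustration only.
VENDOR_TYPES = {
    "vendor_a": {"image": "BLOB", "long_text": "CLOB"},
    "vendor_b": {"image": "IMAGE", "long_text": "TEXT"},
}


def column_type(vendor, logical_type):
    """Translate an application-level type into the vendor's column type."""
    try:
        return VENDOR_TYPES[vendor][logical_type]
    except KeyError:
        raise ValueError(f"no mapping for {logical_type!r} on {vendor!r}") from None


def create_table_sql(vendor, table, columns):
    """Build a CREATE TABLE statement from (name, logical_type) pairs."""
    parts = [f"{name} {column_type(vendor, logical)}" for name, logical in columns]
    return f"CREATE TABLE {table} ({', '.join(parts)})"


if __name__ == "__main__":
    catalog = [("photo", "image"), ("notes", "long_text")]
    print(create_table_sql("vendor_a", "catalog", catalog))
    print(create_table_sql("vendor_b", "catalog", catalog))
```

Application code refers only to the logical types ("image", "long_text"), so the vendor dependency is confined to the mapping table and is documented by it.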

Currently, the majority of DataWeb applications read and update data outside the context of transactions. In typical Internet fashion, they are stateless and do not provide a transactional context. The current technologies for DataWeb applications are well-suited to the customarily brief transactions of the Internet, but long-lived transactions are a problem. To remedy this, products will become available soon that work with existing transaction monitors and messaging systems to provide the missing capabilities. The fact that standards such as X/Open's XA already exist should accelerate broad availability.
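To make the transactional issue concrete, the following sketch shows what a brief transaction buys within a single request: two related updates either both take effect or neither does. It again uses Python's sqlite3 module as a stand-in DBMS; the "stock" table, the site codes and the quantities are hypothetical. It does not address long-lived transactions that span several requests, which is where transaction monitors come in.

```python
import sqlite3


def move_stock(connection, item, source, target, quantity):
    """Move stock in one short transaction: both updates apply, or neither."""
    with connection:  # commits on success, rolls back if an exception escapes
        connection.execute(
            "UPDATE stock SET quantity = quantity - ? WHERE item = ? AND site = ?",
            (quantity, item, source),
        )
        (remaining,) = connection.execute(
            "SELECT quantity FROM stock WHERE item = ? AND site = ?",
            (item, source),
        ).fetchone()
        if remaining < 0:
            # Raising here undoes the first UPDATE as well.
            raise ValueError("insufficient stock at source site")
        connection.execute(
            "UPDATE stock SET quantity = quantity + ? WHERE item = ? AND site = ?",
            (quantity, item, target),
        )


def quantities(connection):
    """Return {site: quantity} for every row in the stock table."""
    return dict(connection.execute("SELECT site, quantity FROM stock ORDER BY site"))


def demo_stock():
    """Create a throwaway in-memory database with hypothetical sample data."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE stock (item TEXT, site TEXT, quantity INTEGER)")
    connection.executemany(
        "INSERT INTO stock VALUES (?, ?, ?)",
        [("jeans", "SF", 100), ("jeans", "NY", 0)],
    )
    connection.commit()
    return connection
```

A request handler would call move_stock once per form submission; because the whole unit of work fits inside one request, the stateless nature of the Web does not get in the way.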

In addition to technology considerations, there are management and procedural considerations in developing DataWeb applications. First and foremost, continue to use existing development methods, but adapt them as necessary to use the new technologies. Because Web technology is changing so rapidly, care is needed in developing a modular design that isolates the technologies for easy replacement and preservation of the business logic. DataWeb apps are particularly well suited to an iterative RAD approach, in which quickly seeing the result leads to quickly improving the application. DataWeb projects should aim for completion within 12 months and be scheduled in 90-day chunks to permit adapting to the changes in the technology, product and service landscape that are bound to occur in the rapidly evolving context of the Web.

Now that the corporate Web page is online, DataWeb applications are the next logical step in exploiting the power of the Internet. Tools and standards are available to simplify the job, and a few precautions can prevent missteps. Even the most conservative IS organization can demonstrate its Web prowess and at the same time contribute significantly to the business.

Derek C. Kaufman is middleware manager for Levi Strauss & Co. in San Francisco and a member of the UniForum Technical Steering Committee. He can be reached at DKaufman2@levi.com.

Lanny J. Lampl is a technical specialist in the global information resources organization of Levi Strauss & Co. He can be reached at LLampl@levi.com.