By Larry Stevens
Combining relational and object database technologies may be the way to deal with new types of data while protecting your investment in the conventional.
[Chart]: Object Relational Database
[Chart]: Robertson, Stephens & Co. Performance Monitoring System
[Chart]: Object and Relational Data
Database applications are becoming livelier. No longer confined to mere number and character strings, many are now storing and, more importantly, processing a much wider range of data types. The challenge for IS staff is to find a database that can handle these new kinds of data.
The ubiquitous technology for organizing data--the relational database management system (RDBMS)--is excellent for what it has been called upon to do. Over the years, RDBMSs have become efficient at processing huge volumes of transactions. But there's a catch. Traditional RDBMSs perform well only when working on numeric data and characters stored in tables, what are often called "simple data types." Try feeding your RDBMS things like geographic information, time series, scanned images, full-text documents, video, audio, film clips or even composite objects (such as stock transactions or bills of materials). Then ask it to perform, say, a Select function on that data type. Suddenly, the technology seems much less robust.
Where do you go from here? That was the question asked by the Wheel Trans Division of the Toronto Transit Commission. The division provides 6,000 door-to-door free rides a day to qualified Toronto residents, primarily the disabled. If the rides were consistent--if people traveled to the same locations on the same days and times each week--the data could easily be stored in an RDBMS table.
But the actual service is much more complex. Customers can call at any time during the day to request a ride for any time the next day to any location in Toronto. When a customer calls, his or her file is accessed on the customer service screen; so far this is normal RDBMS stuff. But then the system automatically maps out the ride and selects the best vehicle, given other scheduled rides, to transport the customer. To do that, the system must manipulate data about time and space. The geographic information is more specific than even a street map, since it includes data about one-way streets and legal and illegal turns. And that is not the kind of data an RDBMS is built to handle. "Manipulating geographic and time series information on a scale like ours just wouldn't work if we had to try to program an application using a traditional RDBMS," says George Procunier, superintendent of data administration at the division.
In implementing a solution, Procunier chose CA-Ingres from Computer Associates (CA) International of Islandia, NY. While CA-Ingres is primarily an RDBMS, it includes extensions that allow it to store and manipulate objects of the kind the Wheel Trans Division's application uses. Because CA-Ingres includes features of an RDBMS as well as some features of an object-oriented database management system (ODBMS), it is called an ORDBMS.
According to Richard Finkelstein, principal of Performance Computing, a consultancy in Chicago that specializes in database application development, the interest in object database technology arises from "the rich content problem." In relational systems, every piece of data in a field has to be atomic, indivisible. "But when the content becomes more complex and richer, you have to be able to manipulate data within the field. That's when you need object technology," Finkelstein says.
The need to manipulate objects is what led RWD Technologies of Columbia, MD, to turn to Illustra, an ORDBMS recently purchased by Informix Software of Menlo Park, CA. RWD, which does application programming for Fortune 100 corporations, had to find a way to manage its growing database of images. "Creating software these days involves using lots of images," says Wayne Elias, chief object technologist at RWD. For example, the firm uses images in screen illustrations, as icons and in technical manuals. In the past, developers at the company had to thumb through paper catalogs or flip through on-screen image data banks, trying to select those they wanted by title or keyword.
An RDBMS can store images as the binary large object (BLOB) data type. But the RDBMS can't interpret the internal structure of a BLOB, so someone at RWD would have had to tag the BLOB data with keywords stored in other fields in the database. By treating images as objects, the company can access them based on characteristics such as composition, shape, color and texture. As a result, the time it takes users to find images they need has dropped from hours to seconds, according to Elias.
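The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration, not RWD's or Illustra's actual code: the class names, the `dominant_color` method and the toy "pixels" (a tuple of color names standing in for real image data) are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class BlobRow:
    """Relational style: opaque bytes, searchable only by hand-entered keywords."""
    title: str
    data: bytes
    keywords: tuple


@dataclass
class ImageObject:
    """Object style: the type's own methods can look inside the image."""
    title: str
    pixels: tuple  # hypothetical stand-in: one color name per pixel

    def dominant_color(self) -> str:
        # Most frequent color in the image, computed from the content itself.
        return Counter(self.pixels).most_common(1)[0][0]


def find_by_keyword(rows, keyword):
    # Works only if someone remembered to tag the row.
    return [row.title for row in rows if keyword in row.keywords]


def find_by_dominant_color(images, color):
    # No tagging needed; the answer comes from the image data.
    return [img.title for img in images if img.dominant_color() == color]


rows = [
    BlobRow("sunset", b"\x00\x01", ("orange", "sky")),
    BlobRow("forest", b"\x02\x03", ()),  # nobody tagged this one
]
images = [
    ImageObject("sunset", ("orange", "orange", "blue")),
    ImageObject("forest", ("green", "green", "brown")),
]

keyword_hits = find_by_keyword(rows, "green")
content_hits = find_by_dominant_color(images, "green")
```

In this toy data the untagged forest image is invisible to the keyword search but turns up in the content-based one, which is the gap the object approach is meant to close.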
A growing number of companies are offering or promising to deliver products that can be classified as ORDBMSs. Besides Computer Associates and Informix, they include Oracle, Sybase and smaller firms like UniSQL of Austin, TX. Their products differ in what and how many object-oriented and relational features they provide, but in general an ORDBMS has two distinct sides. The relational side provides structured query language (SQL) data definition and manipulation statements. The object side includes object definitions (as well as the ability to create user-defined objects), along with methods of working with objects. Some, such as Illustra and UniSQL, offer advanced object-oriented features including polymorphism, in which the same function can operate differently depending on the type of object, and inheritance, in which objects created from other objects inherit their "parents'" attributes.
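For readers new to the two terms, here is a minimal Python sketch of inheritance and polymorphism. It does not reflect the syntax of Illustra, UniSQL or any other product; the type names and the `describe` function are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    """Hypothetical parent type; every child inherits the `name` attribute."""
    name: str

    def describe(self) -> str:
        return f"{self.name}: generic object"


@dataclass
class Image(StoredObject):
    # Inherits `name` from the parent and adds its own attributes.
    width: int = 0
    height: int = 0

    def describe(self) -> str:
        return f"{self.name}: image, {self.width}x{self.height} pixels"


@dataclass
class TimeSeries(StoredObject):
    values: tuple = ()

    def describe(self) -> str:
        return f"{self.name}: time series, {len(self.values)} points"


def describe_all(objects):
    # Polymorphism: the same call runs different code depending on the type.
    return [obj.describe() for obj in objects]


catalog = [
    StoredObject("misc"),
    Image("logo", 64, 32),
    TimeSeries("prices", (1.0, 2.0, 3.0)),
]
descriptions = describe_all(catalog)
```

The caller never checks which kind of object it holds; each type supplies its own version of `describe`, and both children carry the `name` attribute they inherited from the parent.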
If object technology is so capable, why not simply install an ODBMS? According to Joshua Duhl, principal of Stillpoint Consulting, a Cambridge, MA-based object technology consultant, the combination of relational and object technologies provides a more complete solution for many users. "There's a chasm between skills needed for object and relational technologies. Very few people will want to take the time to bridge it," says Duhl. He explains that for many database programmers, learning object technology represents a "paradigm shift which makes the learning curve pretty steep." He estimates that most people take six months to a year to become comfortable with the technology. In contrast, an ORDBMS provides some form of SQL access. "It allows them to address objects using a query language known to them."
Elias of RWD is a proponent of this option. His firm recently developed an application for a national package delivery company. The On-Road Performance Monitoring System, which automatically chooses optimal routes, includes much geographic information. The advantage of using Illustra instead of an ODBMS was that "we were leveraging our investment in SQL programming skills," he says. Like other ORDBMSs, Illustra supplements the standard SQL capabilities with support for some object types and functions. (In Illustra's case, they're in the form of plug-ins called Data Blades.)
Historically, the ORDBMS grew out of the failure of ODBMSs, which began to be released in the late 1980s, to win acceptance from corporate users. The ODBMS has limitations that can prevent it from taking on enterprise-wide tasks. For one thing, it doesn't share a standard query language like SQL. Secondly, it's not as scalable as the RDBMS. Won Kim, an object technology pioneer and founder and president of UniSQL, says that the ODBMS has uses. "They work well for individual users or very small groups. But at a certain point, they croak on you." He estimates that ODBMS-based applications normally perform well up to about 20 to 30 users or 5GB of data.
As well as being far more scalable, the RDBMS usually is superior in areas of performance, security, integrity and availability. "RDBMS vendors have spent years perfecting these features. We don't want to throw that research out," says Kim.
By storing objects in the object side of the ORDBMS but keeping the simpler data in the relational side, users may approach the best of both worlds. That's the reason the Wheel Trans Division opted for CA-Ingres. "We need the object management extension capability for geographic data, which are objects, but we have other data which work much better in a relational system," says Procunier. For example, the application stores customer profiles, which include address and phone number, eligibility, previous destinations, how often they've canceled rides without notice and so on, in relational tables.
Of course, an alternate solution to the problem is to have two DBMSs, one relational, one object-oriented. This may not be unreasonable, since many companies have more than one DBMS. Elisabeth Watson, director of strategic interactivity at Robertson, Stephens & Co. in San Francisco, considered that option but decided against it. "We wanted a one-button solution," she says. Maintaining two or more DBMSs would increase the firm's workload and present challenges in integrating data from the two systems into a single report.
The company, which serves the financial community with research reports, chose Illustra and is about to roll out an application in which customer service people can use an intranet to create custom reports for clients. While any Web application can make heavy use of objects--each button and page can be treated as an object--Robertson, Stephens keeps track of 420 stocks, which are best maintained in the relational part of Illustra. By combining the speed at which relational databases handle raw stock data housed in tables with the flexibility of treating trades as objects and using intranet objects as a delivery medium, the application achieves its best performance. "When an institutional client calls, our people have 60 to 90 seconds to create the report, or the person will hang up. We could only do that using combined relational and object technology," Watson says.
Object technology may play a more visible role in future applications at Robertson, Stephens. For example, Watson plans to include graphics, sound and animation in reports that the customer can access over the Internet. "My vision is we will store on one large database all information serving our clients, as well as multiple delivery mechanisms," she says. This vision assumes the presence of both relational and object technologies.
The Robertson, Stephens application illustrates an important point about object technology. While it is often seen as a way to handle multimedia, graphics and Internet objects (the "sexy" applications), it can also help with drier but no less problematic manipulation of data. "There's a world of difference between the way you'd program a relational or an object-oriented database when you're creating reports with composite data," says Mike DeSanti, president of Enterprise Management Consulting, an ORDBMS programming firm and systems integrator in New York City.
DeSanti, who works largely for financial organizations, says the most efficient way to treat an equity trade is as an object with multiple "legs," each of which represents a different side of the deal (buyer, seller, broker and so on). Each of those legs includes different cash flows and other attributes. In an object model, which looks like a tree structure, you have a trade object, which points to the various legs of that trade, each of which in turn points to its cash flow. "That's very direct. With a relational system you have a lot of decisions to make, none of them optimal," says DeSanti. For example, you can place the different cash flows in a single table, but that can be difficult. If you place them in separate tables, you have to turn those associations into foreign keys. "It could get tangled. Object technology is simpler," he says. Generally, Enterprise Management Consulting uses Persistence from Persistence Software of San Mateo, CA, which is not a DBMS but middleware that provides the ability to map objects within an RDBMS.
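The contrast DeSanti describes can be sketched side by side. The Python below is a rough illustration under stated assumptions, not his firm's design or the Persistence API: the class names, table names, column names and sample figures are all hypothetical, and SQLite (from Python's standard library) stands in for a generic RDBMS.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class CashFlow:
    amount: float


@dataclass
class Leg:
    party: str  # buyer, seller, broker and so on
    cash_flows: list = field(default_factory=list)


@dataclass
class Trade:
    symbol: str
    legs: list = field(default_factory=list)

    def total_by_party(self) -> dict:
        # Walk the tree directly: trade -> legs -> cash flows.
        return {leg.party: sum(cf.amount for cf in leg.cash_flows)
                for leg in self.legs}


# Object model: one trade that points to its legs and their cash flows.
trade = Trade("XYZ", [
    Leg("buyer", [CashFlow(-1000.0), CashFlow(-15.0)]),
    Leg("seller", [CashFlow(1000.0)]),
    Leg("broker", [CashFlow(15.0)]),
])
object_totals = trade.total_by_party()

# Relational model: the same associations become foreign keys across tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trade (trade_id INTEGER PRIMARY KEY, symbol TEXT);
    CREATE TABLE leg (leg_id INTEGER PRIMARY KEY,
                      trade_id INTEGER REFERENCES trade(trade_id),
                      party TEXT);
    CREATE TABLE cash_flow (flow_id INTEGER PRIMARY KEY,
                            leg_id INTEGER REFERENCES leg(leg_id),
                            amount REAL);
    INSERT INTO trade VALUES (1, 'XYZ');
    INSERT INTO leg VALUES (1, 1, 'buyer'), (2, 1, 'seller'), (3, 1, 'broker');
    INSERT INTO cash_flow VALUES
        (1, 1, -1000.0), (2, 1, -15.0), (3, 2, 1000.0), (4, 3, 15.0);
""")
rows = conn.execute("""
    SELECT leg.party, SUM(cash_flow.amount)
    FROM trade
    JOIN leg ON leg.trade_id = trade.trade_id
    JOIN cash_flow ON cash_flow.leg_id = leg.leg_id
    WHERE trade.symbol = 'XYZ'
    GROUP BY leg.party
""").fetchall()
relational_totals = dict(rows)
conn.close()
```

Both versions produce the same totals per party. The object version reads the structure as it is drawn, while the relational version has to reassemble it through two joins, and that is only one of the several table layouts a designer might choose.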
Jack Barnett, another Persistence user, agrees that object technology can make certain types of programming tasks more efficient. He's the integration requirements manager at Lucent Technologies, an AT&T spin-off based in Warren, NJ. Lucent developed a telecommunications operations support system to help local telephone companies manage broadband networks. "Each component of the network can be viewed as a specialized object," Barnett says. For example, there is the amplifier, which connects to the company's coaxial branch and serves the neighborhood; the conversion filter, which allows you to connect the coaxial cable to a fiber-optic network; alarms, automatic signals that something has gone out; and customer service tickets, which are produced when customers call in problems. In all, Barnett estimates there are about 14,000 such objects. By treating them as objects rather than trying to break them down into fields in a relational table, Barnett estimates his company saved 20 to 30 percent of development time.
Not only can development time be shortened, but the speed of the application may be improved when it is programmed using object technology. For example, one stock exchange (which declined to be named) is about to install a program developed using UniSQL, which should automatically identify insider trades. The system, which looks for unexpected peaks or valleys, combines a table of stocks in the relational side of UniSQL with trade and time-series objects. In the exchange's initial pilot, which used only relational technology, it took seven minutes to identify an insider trade. In the second go-round, which combined object and relational technology, the time was reduced to 45 seconds.
Choosing the right ORDBMS may not be easy, especially because so many of them are still evolving. Doug Barry, principal of Barry & Associates in Minneapolis, publishes DBMS Needs Assessment for Objects, a book that compares 18 ODBMSs and ORDBMSs on 100 measures. He points out that ORDBMSs vary greatly in the amounts of relational and object support they offer. While there are no short answers as to which package is best for a particular application, he offers this general rule of thumb: "If the data is relatively simple, you'd want a relational system with some object extensions. If the data is more complex--say, multimedia--lean toward a package with a strong object component or even an ODBMS."
Procunier of the Wheel Trans Division is taking that advice. His organization is in the process of completing a feasibility study on a project that would provide maintenance data to engineers in multimedia form. Because video, sound and animation are so complex, he decided they would be difficult to store in an ORDBMS like CA-Ingres. Additionally, because the number of users will be relatively small, the project won't need the scalability offered by the relational side of an ORDBMS. So Procunier plans to use Computer Associates' new ODBMS, Jasmine, which will be released later this year. "We were assured [by consultants] that for projects that make heavy use of very complex objects, an ODBMS would be our best bet," he says.
Duhl of Stillpoint Consulting agrees that choosing a technology means balancing the pros and cons of each and matching them with the needs of your application. "There is no black-and-white, but rather there are shades of gray," he says. "It's a continuum."
For example, UniSQL is strong on the object side, having been developed by an object expert. Oracle's Universal Server is at the other end of the spectrum, with only a few object features. The Sybase Adaptive Server Architecture will keep the relational components of the System 11 RDBMS intact while adding object capabilities, such as support for complex data types, through an increasing number of plug-ins. CA-Ingres offers features of both, and next year Informix's relational engine will be incorporated into Illustra.
The trick, say analysts, is not to assume that any of the three options--object, relational or object/relational--is better without careful consideration. Says Finkelstein of Performance Computing, "Many people use a relational system without considering the implications just because that's what they're used to. The problem is they may not hit a snag until the project is well under way." As ORDBMS products are becoming increasingly available and more functional, IS has a number of choices in all three categories.
Larry Stevens writes about business and technology from Monson, MA. He can be reached at email@example.com.