First we should say a few words about what object-oriented databases are and what their main advantages are. Most modern programming languages are object-oriented, while most mainstream databases are relational. So the programmer has to work with two data models at once: relational and object. This significantly complicates application design, because the system architect has to work with different notions representing the same entities, and the programmer has to implement a lot of code responsible for packing and unpacking data from one representation to the other. This is tedious, error-prone work and, what is even worse, such packing/unpacking routines significantly degrade system performance. Modern modeling tools help to partly solve this problem: they generate both application classes and database tables from a universal model description. But they are still far from ideal.
An object-oriented database (as is clear from its name) is able to store objects. So there is no longer a need to work with two data models: objects are everywhere. In addition to significant advantages from the system design point of view, an object-oriented database can handle complex relations between objects much more efficiently. Unlike an RDBMS, which has to perform many joins to follow such relations, an OODBMS represents relations between objects directly, so traversing a relation requires significantly less time. That is why object-oriented databases have become popular in areas requiring manipulation of complex sets of data, such as CAD/CAM and GIS. In these areas the traditional relational approach is too inefficient, and attempts to represent complex object graphs using two-dimensional tables cause significant trouble.
Ideally, an object-oriented database should completely hide database specifics from the application programmer, who should not have to worry about whether the program works with normal (transient) or persistent (stored in the database) objects. This paradigm is called transparent persistence. There are many different approaches to reaching this goal, but the ideal certainly cannot be reached completely. The main problem is that database applications have aspects which are not present in normal applications: transactions, locking, data distribution. These cannot be completely hidden from the application programmer, so full transparency is not possible. But in any case, the solutions provided by existing OODBMSes significantly simplify the programmer's life compared with relational databases.
It is also necessary to provide a way to add new objects to the database. Once again, this can be done explicitly using some kind of put method. An alternative approach is to require all persistent classes (classes whose instances are persistent) to be derived from some special base class, so that all instances of such classes are automatically added to the database when they are created. But if we want to provide a transparent API, neither of these two approaches is suitable. We should use another approach, called persistence by reachability.
The idea of the persistence-by-reachability approach is very simple: make persistent all objects referenced from other persistent objects. Usually there are one or more root objects which are made persistent in some special way (these are the only objects for which transparency is violated). All other objects are made persistent automatically when some persistent object contains a reference to them. So an object is created in the same way as any other transient object, and its lifetime is initially limited by the application's lifetime. But if we assign a reference to this object to some persistent object, then it too is made persistent and will be stored in the database for an arbitrarily long time, until it is explicitly or implicitly deleted from the database.
Usually an object is not made persistent immediately when a reference to it is assigned to a persistent object. Instead, the database API maintains a list of modified persistent objects. When the transaction is committed, this list is recursively traversed by following all references in each object. So if we assign a reference to some newly created object N to some persistent object P, then P is included in the list of modified objects. During transaction commit the OODBMS traverses P and reaches N. It detects that N is not yet persistent (has no assigned OID) and makes it persistent by allocating space for it in the storage and assigning an OID.
This is a short description of how the persistence-by-reachability principle works. Real implementations may differ. For example, not all systems maintain a list of modified objects: a modified persistent object may be evicted from the object cache before transaction commit. But the main idea is always the same: recursive traversal of all referenced objects, making persistent those which are not persistent yet. To avoid infinite recursion, we should somehow mark traversed objects or use some other mechanism to cut off the recursion.
The persistence-by-reachability approach can be implemented in any programming language supporting reflection (more precisely, structural reflection). Reflection is a mechanism allowing a program to inspect itself. Using a reflection API, a program can get the class of an object, the list of its fields and methods, get and set the values of these fields, and so on. So, using a reflection API, it is possible to implement recursive traversal of an object graph: we can find all reference fields in an object, get their values and traverse the referenced objects. Such a reflection API is supported by most modern languages, such as Java and C# (although the concept was originally introduced a long time ago, in Lisp). Unfortunately, such a popular language as C++ still has no such API. Run-time type information (RTTI) is only a very limited subset of a reflection API and does not allow traversing objects; and even RTTI is not supported by all C++ compilers. So for C++ we should find another solution, which is discussed later.
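A minimal sketch of such a reflection-based traversal in Java follows. The ReachabilityWalker class and its method names are invented for illustration; a real OODBMS would also allocate storage and an OID for each newly reached object, while here we only collect the set of reachable objects.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// A small example class with reference fields
class Node {
    Node left;
    Node right;
    String value;
}

// Hypothetical sketch of persistence by reachability via structural reflection
class ReachabilityWalker {
    private final Set<Object> persistent =
        Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    Set<Object> makePersistent(Object root) {
        traverse(root);
        return persistent;
    }

    private void traverse(Object obj) {
        if (obj == null || persistent.contains(obj)) {
            return; // already visited: this check cuts infinite recursion on cycles
        }
        persistent.add(obj); // a real system would assign an OID here
        if (obj.getClass().getName().startsWith("java.")) {
            return; // treat JDK classes (String, etc.) as opaque values
        }
        for (Class<?> c = obj.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (f.getType().isPrimitive() || Modifier.isStatic(f.getModifiers())) {
                    continue; // only instance reference fields are followed
                }
                f.setAccessible(true);
                try {
                    traverse(f.get(obj)); // follow the reference recursively
                } catch (IllegalAccessException e) {
                    throw new RuntimeException(e);
                }
            }
        }
    }
}
```

Note that the walker handles cyclic object graphs: the visited-set check is exactly the "mark traversed objects" mechanism mentioned above.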
Unfortunately, the presence of a structural reflection API does not solve the problem of transparently loading and storing objects in the database. For that we need behavioral reflection, which allows changing the behavior of a program: altering the way a method is executed or an object field is accessed. Using behavioral reflection it is possible to wrap normal access to object fields with extra code which loads the object from the database if it is not yet loaded and handles object modification. Unfortunately, behavioral reflection is present only in very few languages (some dialects of Lisp and Smalltalk). One general solution is to implement your own compiler or source code preprocessor which inserts the required code in all places where persistence-capable objects are accessed. Unfortunately, developing a compiler or even a preprocessor for a modern programming language is a non-trivial task, and using such a tool complicates development and support of the application (causing problems with debugging and modifying application code). There are some language-specific solutions to this problem, which will be discussed in the following chapters.
To conclude this chapter, let me enumerate the characteristics of a transparent API to a database:
But modern programming systems have become very complex. They have to deal with many different aspects: user interface, memory allocation, security, parallel execution, ... It is not always possible to separate all of these aspects. For example, if we implement a stack class intended to be used in a multiprocessor environment, we should use synchronization primitives to synchronize access to the stack. Below is an implementation of such a stack class in C++ using POSIX threads:
```cpp
class Stack {
    void**          buf;
    size_t          sp;
    size_t          size;
    pthread_mutex_t mutex;

  public:
    Stack(size_t size) {
        buf = new void*[size];
        this->size = size;
        sp = 0;
        pthread_mutex_init(&mutex, NULL);
    }
    ~Stack() {
        pthread_mutex_destroy(&mutex);
        delete[] buf;
    }
    void push(void* obj) {
        pthread_mutex_lock(&mutex);
        assert(sp < size);
        buf[sp++] = obj;
        pthread_mutex_unlock(&mutex);
    }
    void* pop() {
        pthread_mutex_lock(&mutex);
        assert(sp > 0);
        void* obj = buf[--sp];
        pthread_mutex_unlock(&mutex);
        return obj;
    }
    void* top() {
        pthread_mutex_lock(&mutex);
        assert(sp > 0);
        void* obj = buf[sp-1];
        pthread_mutex_unlock(&mutex);
        return obj;
    }
};
```

It is a very simple class. But even here it is possible to notice that the code responsible for synchronizing access takes almost as many lines as the implementation of the stack logic itself. If in one case we need a stack with synchronization (because, for example, data is pushed and popped by different threads), and in another case we do not need any synchronization because the stack is used only within one thread, then we must either implement two different stacks or resign ourselves to losing performance in the second case. And if we want to change the synchronization strategy used in the application (for example, use our own library instead of POSIX threads), then we have to change all places in the application where synchronization is used. So we would have to change almost the whole code of the application, and there is a big chance of forgetting to change something. Here, synchronization of access in a concurrent environment is a separate aspect which cannot be separated from the main application code using the traditional principles of modular programming, that is, by isolating it in a separate module. Certainly, instead of using the POSIX thread library directly, we could create our own mutex class hiding the details of the mutex implementation. But that will not help if we want one stack with mutual-exclusion synchronization, one stack without synchronization and one stack with multiple-readers-single-writer synchronization.
So synchronization in the example above is a separate aspect. To increase program readability and reliability, and to make maintenance easier, we need to separate this synchronization aspect from the main code. Unfortunately, this is not so easy to do with traditional programming tools, because the synchronization code has to be woven into the stack code at compilation time. Fortunately, such tools are available now, especially for Java: for example AspectJ and JAssist. They allow aspects to be implemented as separate classes and methods, and automatically merge them with the main application code and other aspects during the build.
As was already mentioned, there can be several implementations of synchronization, and we should be able to associate a particular instance of the stack with the required synchronization implementation. Here we should introduce yet another three-letter abbreviation: MOP, which stands for metaobject protocol. A metaobject is an object controlling another object. In our case the object is the stack, and the metaobject is responsible for performing proper synchronization when stack methods are invoked. We can define MutualExclusionMetaobject, MultipleReaderSingleWriterMetaobject and NotSynchronizedMetaobject metaobjects implementing different synchronization strategies. Now, when creating a particular instance of the stack, we can specify which metaobject will control it:
```java
Stack stack = new Stack(100);
stack.setMetaobject(new MutualExclusionMetaobject());
...
```

It is easy to notice that persistence is also yet another aspect of an application. Actually, we can never implement a completely transparent API to the database, because there are database specifics which cannot be ignored by the application: for example transactions, locks, ... The solution is to extract all this database-specific code into a separate module: the persistence aspect of the application. In this case, when writing the main application code we need not worry about transaction commits or rollbacks, object locks, ... As in the stack example, we can define several metaobjects implementing different locking schemes (optimistic locking, pessimistic locking) and isolation levels (dirty reads, repeatable reads, serializable transactions, ...). By associating one of these metaobjects with an application object, we specify the database-specific behavior for that particular object without touching its code.
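The metaobject idea can be sketched in Java as follows. This is a minimal illustration, not a real MOP: the Metaobject interface and the way it is wired into Stack are assumptions made for this example.

```java
import java.util.concurrent.locks.ReentrantLock;

// The metaobject is consulted around every controlled operation
interface Metaobject {
    void preMethod();   // called before every stack operation
    void postMethod();  // called after every stack operation
}

// Strategy: no synchronization at all
class NotSynchronizedMetaobject implements Metaobject {
    public void preMethod()  {}
    public void postMethod() {}
}

// Strategy: mutual exclusion via a lock
class MutualExclusionMetaobject implements Metaobject {
    private final ReentrantLock lock = new ReentrantLock();
    public void preMethod()  { lock.lock(); }
    public void postMethod() { lock.unlock(); }
}

// The stack itself contains no synchronization code: that aspect
// lives entirely in whichever metaobject controls the instance.
class Stack {
    private final Object[] buf;
    private int sp;
    private Metaobject mop = new NotSynchronizedMetaobject();

    Stack(int size) { buf = new Object[size]; }

    void setMetaobject(Metaobject m) { mop = m; }

    void push(Object obj) {
        mop.preMethod();
        try { buf[sp++] = obj; } finally { mop.postMethod(); }
    }

    Object pop() {
        mop.preMethod();
        try { return buf[--sp]; } finally { mop.postMethod(); }
    }
}
```

A MultipleReaderSingleWriterMetaobject could be added the same way (for example on top of java.util.concurrent.locks.ReadWriteLock) without touching the Stack class at all, which is exactly the point of separating the aspect.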
The presence of a simple and standardized byte code makes it possible to implement the following approaches to providing transparent persistence:
To reduce this overhead it is possible to violate one of the principles of transparent persistence and introduce the notion of a persistence-capable object and class. In this case not all objects can be made persistent, only those whose classes are derived from some special base class (Persistent, for example). Such classes are called persistence-capable classes, and instances of these classes are persistence-capable objects. Persistence-capable objects do not always become persistent: they can be deleted as normal transient objects without ever being stored in the database. The persistence-by-reachability approach can be used to make persistence-capable objects persistent. Another approach to selecting persistence-capable classes is to enumerate them in some configuration file (possibly using regular expressions). In both cases the byte code preprocessor can insert extra code only for operations accessing persistence-capable objects, which significantly reduces code size and increases speed if most application objects are transient.
It is not required to run the byte code preprocessor as a separate step of the build procedure, taking the original class file produced by the compiler and overwriting it with a patched class file. Byte code preprocessing can also be done at runtime using a specialized class loader. In this case the byte code of class methods is preprocessed when the class is first loaded by the application. The resulting code is not saved anywhere, so when the application is terminated and started again, the preprocessing has to be done again. Also, the requirement to use a special class loader to load all classes may be unacceptable for some applications and inconvenient for others.
The main disadvantage of this approach is that modern JVMs are quite complex: they provide JIT (just-in-time) translation of byte code into native machine code. With this optimization it is possible to reduce the gap between Java and optimized C++ to a factor of about 2 (instead of 10 for a pure interpreter). Implementing a comparable JVM is a very complex task, and extending an existing JVM is mostly not possible because of licensing issues. Another disadvantage is that it is always necessary to ship your own JVM together with the application, and once the language is extended (as, for example, in JDK 1.5), you will have to update your JVM to be able to deal with the new language features.
There are several aspect-oriented tools for Java; the most popular are AspectJ and JAssist. Using these tools we can implement a transparent API without implementing our own preprocessor: the tools do this work for us. AspectJ uses its own Java compiler, which understands the AOP extensions, while JAssist uses a byte code preprocessor which can be invoked either statically, during compilation, or at runtime using a specialized class loader. Both tools allow us to solve the problem of transparently loading and storing objects. Unfortunately, neither of them does it very efficiently: they make it impossible to efficiently distinguish access to a component or method of the object itself from access to a component or method of another object, i.e. x.foo() vs. foo(). Such a distinction is needed to reduce the number of operations which must be wrapped by aspect code. It is clear that once an object is loaded, all accesses to its components inside an instance method (access through this) require no extra checks and actions. Such an optimization would significantly reduce the size of the application code and increase its speed. Unfortunately it is absent, which suggests that the authors of these tools did not think about using their products for implementing a persistence aspect.
But C# provides some other alternatives for implementing transparent persistence. First of all, C# has something like behavioral reflection, but it is very specialized and not well documented. It is called the remoting API and was intended for implementing remote method invocation. But it can also be used for implementing transparent object loading and storing (actually these two tasks are very similar: in one case access to the object should be redirected to a remote host, and in the other a persistent object should first be loaded). The Microsoft .NET remoting API allows objects to be loaded and stored completely transparently. But it has two very serious limitations:
Below is an example of a "normal" class definition and a definition of the same class using this approach:
```csharp
public class Node : Persistent
{
    public Node   Left;
    public Node   Right;
    public string Value;
}

public abstract class Node : Persistent
{
    public abstract Node   Left  { get; set; }
    public abstract Node   Right { get; set; }
    public abstract string Value { get; set; }
}
```

The disadvantage of this approach is obvious: classes have to be declared in a special way (it is not more difficult than using "normal" fields, but at least it violates the principle of transparent persistence). Also, objects cannot be created using the new operator; a special method has to be used to construct persistence-capable objects. The advantages of this approach are good performance and simplicity.
The dynamic code generation package of the .NET framework can also be used to improve the performance of a database API. When an object is loaded from the database, it should be unpacked: on disk the object is stored as an array of bytes. This can be done using the reflection mechanism, but a more efficient approach is to generate a method responsible for unpacking the object. Similarly, when an object is stored to disk, it should first be packed (serialized) into an array of bytes; once again, a generated method can significantly minimize the overhead of this operation and so increase speed.
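To make the reflection-based baseline concrete, here is a sketch in Java (the same idea applies to .NET reflection). All names are invented for this illustration, and the assumed on-disk layout, int fields stored in alphabetical order, is an assumption of the sketch; a generated per-class unpack method would perform the same assignments directly, avoiding the per-field reflection overhead on every load.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Comparator;

// Example record with two int fields (illustrative)
class Point {
    int x;
    int y;
}

// Sketch of reflection-based unpacking of an object from its byte image
class ReflectionUnpacker {
    static <T> T unpack(Class<T> cls, byte[] data) {
        try {
            T obj = cls.getDeclaredConstructor().newInstance();
            ByteBuffer buf = ByteBuffer.wrap(data);
            Field[] fields = cls.getDeclaredFields();
            // sort by name so the assumed field layout is deterministic
            Arrays.sort(fields, Comparator.comparing(Field::getName));
            for (Field f : fields) {
                if (f.getType() == int.class) { // ints only, to keep the sketch short
                    f.setAccessible(true);
                    f.setInt(obj, buf.getInt()); // read the next 4 bytes into the field
                }
            }
            return obj;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

A code-generating implementation would emit the equivalent of `p.x = buf.getInt(); p.y = buf.getInt();` once per class, which is why it is so much faster than looking up and setting each field reflectively on every load.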
C++ has no reflection support (RTTI is not suitable for database needs). So the format of persistent objects can be obtained in one of the following ways:
```cpp
// this macro is used for description of fields of persistent objects
#define FIELD(x) \
    (*new field_descriptor(#x, sizeof(x), 1, (char*)&(x) - (char*)this, \
                           &describe_field(x)))

class Node : public Persistent {
  public:
    ref<Node> left;
    ref<Node> right;
    char*     value;

    // Add to the class definition methods and constructors used by the database API
    METACLASS_DECLARATIONS(Node, Persistent);
};

// describe class fields
field_descriptor& Node::describe_components() {
    return FIELD(left), FIELD(right), FIELD(value);
}

// register class descriptor
REGISTER(Node, Persistent, optimistic_repeatable_read_scheme);
```
Transparent loading and storing of objects in C++ is possible only when a specialized compiler is used. But developing such a compiler is a very challenging task, and it is not convenient for the user, who prefers to use his or her own favorite build tools. Another approach to providing "almost transparent" access to objects is based on so-called smart pointers. C++ allows language operators to be redefined; in particular, operator -> can be overloaded. A template class with an overloaded -> operator makes it possible to implement references between persistent objects which can be used in the same way as normal pointers:
```cpp
template<class T>
class ref {
  private:
    oid_t oid;
  public:
    T* operator->() const {
        return (T*)load_object(oid);
    }
};
...
ref<Node> nptr;
nptr->left = nptr->right;
nptr->right = NULL;
```
Some C++ databases use the operating system's virtual memory mapping mechanism to access the database file. In this case the database API can just override the object allocator. Unlike normal (transient) objects, persistent objects are allocated by this allocator not in the application heap but in a virtual memory segment mapped onto the database file. If the virtual address of the database segment is always the same, then it is possible to use normal C pointers to reference persistent objects. A page fault handled by the operating system causes the page with the requested object to be loaded; modification of a page can also be detected and the required actions performed. This approach provides the highest speed of access to persistent data if the database file fits in main memory, because once all data is loaded, dereferencing a pointer to a persistent object takes exactly the same time as dereferencing a pointer to a transient object (and so is very fast). For databases larger than the available physical memory this approach also works, but a large number of page faults can cause thrashing and so significant performance degradation.
Static type checking allows a large number of errors and mistypings to be caught at compile time, and so reduces the time needed for debugging a program. What is more important, static type checking makes it possible to specify APIs more precisely, which is very useful for complex systems created by large teams of developers. But compared with statically typed languages, dynamic languages provide better flexibility and code reuse.
Dynamic languages usually also have a notion of class, but a class does not specify the types of its components. So we can declare a class with fields x, y, z, and in one instance of this class assign a string value to field "x" while in another instance assigning it an integer. To be able to fetch an object from the storage and perform garbage collection, the DBMS needs to know the types of all object components. Certainly, it is possible to store information about the types inside the object itself: for example, we can store the value of field "x" together with a tag saying that the value has string type. But this is not efficient, since it almost doubles the space needed to represent the object.
The problem can be solved using class descriptors, as in languages with static type checking, except that now there can be many descriptors for each class. The idea of this approach is that an application usually assigns values of the same type to a given component in all instances of a class. So if there are N built-in (basic) types in the language and M fields in the class, the number of different class descriptors for this class will not be N^M, but usually one or two. So we save a lot of space by excluding type information from object instances.
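The descriptor-sharing idea can be sketched as follows (all names here are illustrative, not a real DBMS API). A descriptor records the concrete types that the fields of one instance happen to hold; instances with the same type signature share a single descriptor object, so no per-value type tags are needed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of descriptor sharing for objects of a dynamically typed language.
// A descriptor is modeled simply as the list of field type names.
class DescriptorCache {
    private final Map<List<String>, List<String>> cache = new HashMap<>();

    // Returns the canonical descriptor for the given field-type signature;
    // all instances with this signature share one descriptor object.
    List<String> descriptorFor(List<String> fieldTypes) {
        return cache.computeIfAbsent(fieldTypes, List::copyOf);
    }

    // Number of distinct descriptors actually materialized
    int size() { return cache.size(); }
}
```

Even though N^M signatures are possible in principle, the cache only ever holds the handful of signatures the application actually creates, which is the space saving described above.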
When comparing object-oriented and relational databases, people usually mean comparing the performance of a program written in one of the imperative object-oriented languages supported by the OODBMS with the corresponding query written in SQL and executed by a relational database system. For the simplest queries, like finding a record by key value, object-oriented databases certainly provide an easier and more efficient solution. Compare two fragments of Java code: one gets an object by key in an OODBMS, the other executes the same query using JDBC:
```java
boolean findPerson(Map<String,Person> persons, String name) {
    Person p = persons.get(name);
    if (p != null) {
        System.out.println("full name=" + p.getFullName()
                           + ", age=" + p.getAge()
                           + ", weight=" + p.getWeight());
        return true;
    }
    return false;
}
```
```java
boolean findPerson(Connection con, String name) throws SQLException {
    PreparedStatement stmt = con.prepareStatement("select * from Person where name=?");
    stmt.setString(1, name);
    ResultSet cursor = stmt.executeQuery();
    boolean found = false;
    while (cursor.next()) {
        System.out.println("full name=" + cursor.getString("full-name")
                           + ", age=" + cursor.getInt("age")
                           + ", weight=" + cursor.getFloat("weight"));
        found = true;
    }
    cursor.close();
    stmt.close();
    return found;
}
```

The first version is definitely faster (assuming that access by index is implemented in the same way in the object-oriented and the relational database): there is no overhead of parsing and optimizing an SQL statement. It is also easier to understand, more compact and less error prone. In the JDBC case there are many things which can be checked only at runtime and not by the compiler: for example, if due to a mistyping we write cursor.getFloat("weght") instead of cursor.getFloat("weight"), this error will be detected only when the statement is executed.
Certainly, if we need to execute a more complex query, writing it in the declarative SQL language will usually be easier, because we do not explain to the system how to execute the query: we just specify which result we want, letting the system find the best plan for reaching this goal.
But some kinds of queries are difficult to represent in the relational model. One of the reasons is the lack of record identity and of the notion of a current record in classical SQL. Consider the following example: we have a table containing information about students, and we want to find the best student in each group (the student with the highest mark in the group). One possible SQL version of this query is the following:
```sql
select S1.* from Student S1
 where S1.mark = (select max(S2.mark)
                    from Student S2
                   where S2.GroupId = S1.GroupId);
```

This query is complex and inefficient at the same time, and I am not sure that even the smartest optimizer can choose an execution plan for it which is as efficient as the algorithm expressed in the imperative language below. Usually such queries are implemented in SQL using a temporary table. But a temporary table is not a very standard approach and, more importantly, by introducing a temporary table and so splitting the query into two subqueries we lose the purely declarative semantics of the query: now we explicitly specify the steps the system should perform to execute it. The object-oriented version of the query above is not significantly more complex than the SQL statement, but is certainly more efficient:
```java
// get iterator of students sorted by their group
Iterator<Student> iterator = studentsGroupIndex.iterator();
int maxMark = 0;
Student maxStudent = null;
Group group = null;
while (iterator.hasNext()) {
    Student s = iterator.next();
    if (s.getGroup() != group) {
        if (maxStudent != null) {
            maxStudent.dump();
        }
        group = s.getGroup();
        maxMark = s.getMark();
        maxStudent = s;
    } else if (s.getMark() > maxMark) {
        maxMark = s.getMark();
        maxStudent = s;
    }
}
if (maxStudent != null) {
    maxStudent.dump();
}
```

It is important to notice that almost all programming languages are computationally complete: briefly speaking, any problem for which a deterministic algorithm exists can be expressed in such a language. This is not true for SQL. For example, if we want to find the students whose marks have increased monotonically over the last N semesters (assuming that we store students' marks for each semester), then the code for an object-oriented database will be no more complex than the code above: just a simple loop with a few induction variables. And although some SQL guru could formulate an SQL query solving this problem, it would be very non-trivial and certainly less clear to understand than the imperative approach.
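For illustration, the monotonic-marks query just mentioned could look like this. The MarksQuery class is hypothetical, and the marks list is assumed to be ordered from oldest to newest semester; the book's Student class would supply such a list.

```java
import java.util.List;

// Hypothetical sketch: check whether a student's marks are strictly
// increasing over the last n semesters.
class MarksQuery {
    static boolean monotonicallyIncreasing(List<Integer> marks, int n) {
        int from = Math.max(0, marks.size() - n); // start of the last n semesters
        for (int i = from + 1; i < marks.size(); i++) {
            if (marks.get(i) <= marks.get(i - 1)) {
                return false; // not strictly increasing at this semester
            }
        }
        return true;
    }
}
```

The whole "query" is one loop with two induction variables, exactly the kind of stateful, record-by-record computation that is awkward to express in pure declarative SQL.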
Another disadvantage of SQL is a consequence of its main advantage, declarative semantics. We do not need to specify how a query should be executed; but unfortunately this also means that we cannot specify it even if we want to. Query optimizers in modern relational systems are very smart and comprehensive: usually they choose a very efficient execution plan, often better than a non-experienced programmer could. But sometimes even such smart optimizers, despite their statistics about data distribution, lack some piece of information and so choose a non-optimal execution plan. Usually the SQL programmer is not able to specify the execution plan explicitly. The only thing he can do is provide hints to the DBMS to make it choose one alternative or another. If you see in an SQL expression something like "x+0=?", it means that the programmer wants to force the SQL engine not to use an index for evaluating this predicate, by inserting a dummy operation (adding zero).
Another aspect of the comparison is that in object-oriented databases queries are mostly executed at the client side, while in relational databases they are executed at the server side. It is clear that if a query has to traverse a large amount of data to select a few records matching the search criteria, then the SQL approach is much more efficient: data is processed at the place where it is located (the server) and only a small result is sent to the client. But if queries are mostly navigational (the application traverses objects by pointers), then the SQL approach becomes inefficient. An object-oriented database can load a complete cluster of tightly coupled objects into the client object cache, so that most traversal operations do not require access to the server at all. In a relational database, by contrast, each traversal requires sending a request to the server, which has to perform an index search to locate the referenced record, instead of the direct access by OID possible in an object-oriented database. This is why SQL is more efficient in systems requiring complex processing of large volumes of data with simple structure, while object-oriented databases are a better choice for applications performing navigation in complex data structures (CAD/CAM, GIS, ...).
It is obvious that the most powerful solution is a combination of the declarative and imperative styles of formulating queries. And that is what we can see in the real world: relational databases almost always provide some kind of procedural language and stored procedures, while object-oriented database systems provide a declarative query language which allows queries to be executed at the server. Object-oriented extensions of the SQL language increase its power and make it possible to simplify queries. For example, instead of a complex SQL statement with two joins selecting the details shipped by a particular producer:
```sql
select d.* from Detail d, Order o, Producer c
 where c.name = 'ABB' and o.producer = c.id and d.order = o.id;
```

we can write the following simple query in OQL (Object Query Language):
```sql
select * from Detail d where d.order.producer.name = 'ABB';
```