Consus user's guide


Introduction

Consus is a 100% pure Java embedded object-relational database management system. It provides a JDBC interface and an SQL-92 compatible query language, JSQL. In addition to the standard JDBC methods, Consus also provides a number of object-oriented extensions, which are described in this guide.

JSQL

Differences from ANSI SQL

JSQL is a non-procedural query language provided by Consus. It implements almost all of the features of the SQL-89 standard and is described in more depth in the following sections. The most convenient and efficient way of extracting data from a Consus database is to use objects instead of tuples; this leads to a more transparent interface with the Java programming language. The result of a query execution is a set of Java objects, and the programmer does not need to worry about packing/unpacking record fields (Consus still provides the standard JDBC methods for fetching/storing record field values without using the object-oriented interface). To select objects instead of tuples, just omit the column list in the SQL statement (or use the '*' character). In this case Consus will extract the data from the database as objects (it is still possible to access fields of the object using the standard JDBC ResultSet methods). The following example illustrates the difference between the standard JDBC and the object-oriented interface:

Standard JDBC interface
    Connection connection = DriverManager.getConnection("jdbc:consus:test.dbs");	
    Statement stmt = connection.createStatement();
    stmt.execute("create table Person(name varchar, salary bigint, address varchar)");
    PreparedStatement pstmt = connection.prepareStatement
        ("insert into Person (name, salary, address) values (?,?,?)");
    pstmt.setString(1, "John Smith");
    pstmt.setLong(2, 75000);
    pstmt.setString(3, "1 Guildhall St., Cambridge CB2 3NH, UK");
    pstmt.executeUpdate();
    pstmt.close();
    ResultSet cursor = stmt.executeQuery("select * from Person where salary > 100000");
    while (cursor.next()) { 
        System.out.println("name = " + cursor.getString("name"));
        System.out.println("salary = " + cursor.getLong("salary"));
        System.out.println("address = " + cursor.getString("address"));
    }
    stmt.close();

Consus object-oriented interface
    class Person { 
        String name;
        long   salary;
        String address;

        Person(String aName, long aSalary, String aAddress) { 
            name = aName;
            salary = aSalary;
            address = aAddress;
        }

        void print() { 
            System.out.println(name + "\t" + salary + "\t" + address);
        }
    }
    Connection connection = DriverManager.getConnection("jdbc:consus:test.dbs");
    ConsusStatement stmt = (ConsusStatement)connection.createStatement();
    Person p = new Person("John Smith", 75000, "1 Guildhall St., Cambridge CB2 3NH, UK");
    stmt.insert(p);
    ConsusResultSet cursor = (ConsusResultSet)stmt.executeQuery
         ("select from Person where salary > 100000");
    while (cursor.next()) { 
        p = (Person)cursor.getSelfObject();
        p.print();
    }
    stmt.close();

The differences between JSQL and ANSI SQL are enumerated below:

  1. Standard Java types are used for atomic table columns.
  2. There are no NULL values, except null references. We agree with C.J. Date, who is critical of three-valued logic and proposes using default values instead.
  3. Structures and arrays can be used as record components. A special exists quantor is provided for locating elements in an array.
  4. Each object has a unique object identifier (OID) and can be directly located by it.
  5. The query language is deeply integrated with the Java language, so identifiers are case sensitive (although JSQL keywords are case insensitive).
  6. No implicit conversion of integer and floating-point types to their string representations is done (except in compare operations). If conversion is needed, it should be done explicitly.
  7. All non-scalar data types have varying lengths, so there are no such types as CHAR(10); CHAR is equivalent to VARCHAR.

Types mapping

JSQL and Java type mapping
Java type           JSQL type              java.sql.Types constant
boolean             boolean, bit           Types.BIT
byte                tinyint                Types.TINYINT
short               smallint               Types.SMALLINT
int                 integer                Types.INTEGER
long                bigint                 Types.BIGINT
float               float                  Types.FLOAT
double              double, real           Types.DOUBLE
String              varchar, char, string  Types.VARCHAR
byte[]              binary, varbinary      Types.VARBINARY
java.sql.Ref        reference              Types.REF
java.sql.Date       date                   Types.DATE
java.sql.Time       time                   Types.TIME
java.sql.Timestamp  timestamp              Types.TIMESTAMP
java.sql.Blob       blob                   Types.BLOB
java.sql.Clob       clob                   Types.CLOB
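As an illustration of this mapping, consider the following hypothetical persistent class (the class and field names are invented for this example):

```java
import java.sql.Date;

// Hypothetical persistent class illustrating the type mapping;
// each field corresponds to a JSQL column type from the table above.
class Account {
    String  owner;     // varchar   -> Types.VARCHAR
    long    balance;   // bigint    -> Types.BIGINT
    double  rate;      // double    -> Types.DOUBLE
    boolean active;    // boolean   -> Types.BIT
    byte[]  signature; // varbinary -> Types.VARBINARY
    Date    opened;    // date      -> Types.DATE
}

// Equivalent JSQL table declaration:
//   create table Account (owner varchar, balance bigint, rate double,
//                         active bit, signature varbinary, opened date)
```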

JSQL formal grammar

The following rules, in BNF-like notation, specify the grammar of the JSQL statements:

Grammar conventions
Example      Meaning
expression   non-terminal
NOT          terminal
|            disjoint alternatives
[NOT]        optional part
{1..9}       repeat zero or more times

statement::= 
     query-statement | update-statement | delete-statement 
     | create-table-statement | alter-table-statement | drop-table-statement 
     | create-domain-statement
     | create-index-statement | drop-index-statement
     | commit-statement | rollback-statement

query-statement::= select-statement {UNION [ALL] select-statement } 
     [ORDER BY columns-ref-list] [FOR UPDATE]
select-statement::= SELECT [ALL | DISTINCT] [columns-list] FROM table-list
     [WHERE expression] [iterator-spec] [group-by] 
update-statement::= UPDATE table SET assignments [WHERE expression]
delete-statement::= DELETE FROM table [WHERE expression]
create-table-statement::= CREATE TABLE table (fields-declaration-list)
alter-table-statement::= ALTER TABLE table {ADD new-fields-decls 
     | MODIFY new-fields-decls | DROP (constraint | field | (fields-list) ) | RENAME rename-list}
drop-table-statement::= DROP TABLE table
create-index-statement::= CREATE INDEX [identifier] ON table ( field ) 
drop-index-statement::= DROP INDEX table ( field ) 
create-domain-statement::= CREATE DOMAIN identifier JSQL-type
commit-statement::= COMMIT
rollback-statement::= ROLLBACK

columns-list::= * | column-spec {, column-spec}
column-spec::= table.* | expression [[AS] identifier]
column::= [table .] identifier 
table-list::= table-spec {, table-spec}
table-spec::= table {[NATURAL] JOIN table [USING fields-list] } [[AS] identifier]
table::= {package .} identifier 
package::= identifier 
fields-list::= field {, field }
field::= identifier { . identifier }
fields-declaration-list::= field-declaration {, field-declaration}
field-declaration::= field JSQL-type {field-constraint} | [CONSTRAINT identifier] constraint
constraint::= PRIMARY KEY field | FOREIGN KEY field REFERENCES table [ (field) ] [ON DELETE CASCADE]
field-constraint::= PRIMARY KEY | USING INDEX | [NOT] NULL | UNIQUE
      | [FOREIGN KEY] REFERENCES table [ (field) ] [ON DELETE CASCADE]
rename-list::= rename-item {, rename-item}
rename-item::= old-field-name [AS] new-field-name
old-field-name::= field | string
new-field-name::= field | string
assignments::= assignment {, assignment}
assignment::= field = expression
columns-ref-list::= column-ref {,  column-ref}
column-ref::= column | column-index
column-index::= integer-constant
group-by::= GROUP BY columns-ref-list [HAVING expression]
iterator-spec::= START FROM start-position [FOLLOWING BY fields-list]
start-position::= FIRST | LAST | parameter 
new-fields-decls::= field-declaration | ( fields-declaration-list ) | constraint
expression::= disjunction
disjunction::= conjunction 
        | conjunction OR disjunction
conjunction::= comparison 
        | comparison AND conjunction
comparison::= operand = operand 
        | operand != operand 
        | operand <> operand 
        | operand < operand 
        | operand <= operand 
        | operand > operand 
        | operand = quantor subquery 
        | operand != quantor subquery 
        | operand <> quantor subquery 
        | operand < quantor subquery 
        | operand <= quantor subquery 
        | operand > quantor subquery  
        | operand >= quantor subquery 
        | operand [NOT] LIKE operand 
        | operand [NOT] LIKE operand ESCAPE string
        | operand [NOT] IN operand
        | operand [NOT] IN subquery
        | operand [NOT] IN expressions-list
        | operand [NOT] BETWEEN operand AND operand
	| operand IS [NOT] NULL
operand::= addition
addition::= multiplication 
        | addition +  multiplication
        | addition || multiplication
        | addition -  multiplication
multiplication::= power 
        | multiplication * power
        | multiplication / power
power::= term
        | term ^ power
term::= identifier | parameter | number | string 
        | TRUE | FALSE | NULL | OID | SYSDATE
	| ( expression ) 
        | NOT comparison
	| - term
        | subquery
	| term [ expression ] 
	| identifier . term 
	| function term
	| set-function
	| CAST ( expression [AS] JSQL-type )
        | EXISTS identifier : term
	| EXISTS subquery
quantor::= ANY | ALL | SOME
subquery::= select-statement
function::= ABS | LENGTH | LOWER | UPPER
        | INTEGER | REAL | STRING | SUBSTR
        | SIN | COS | TAN | ASIN | ACOS
        | ATAN | LOG | EXP | CEIL | FLOOR
set-function::= COUNT (*) 
        | distinct-set-function ( DISTINCT term ) 
	| all-set-function ([ALL] term )
distinct-set-function::= MIN | MAX | AVG | SUM | COUNT
all-set-function::= MIN | MAX | AVG | SUM
string::= ' { { any-character-except-quote } [''] } '
parameter::= ?
expressions-list::= ( expression { , expression } )
JSQL-type::= BIT | BOOLEAN | TINYINT | SMALLINT | INTEGER | BIGINT 
        | REAL | FLOAT | DOUBLE | DATE | TIME | TIMESTAMP | BLOB | CLOB 
	| STRING | VARCHAR [( integer-constant )] | CHAR [( integer-constant )]
        | BINARY [( integer-constant )] | VARBINARY [( integer-constant )]
        | NUMERIC [( integer-constant [, integer-constant])] 
	| DECIMAL [( integer-constant [, integer-constant])]
Identifiers are case sensitive. They must begin with a letter (a..z, A..Z), '_' or '$', may contain only letters (a..z, A..Z), digits (0..9), '_' or '$' characters, and must not duplicate SQL reserved words.

List of reserved words
abs           acos          add           all           alter
and           any           as            asc           asin
atan          avg           between       binary        bigint
bit           blob          boolean       by            cascade
cast          char          ceil          clob          commit
constraint    cos           count         current_date  create
date          decimal       delete        desc          distinct
domain        double        drop          escape        exists
exp           false         first         float         floor
following     for           foreign       from          group
having        in            index         into          insert
is            integer       join          key           last
length        like          log           lower         max
min           modify        natural       not           null
numeric       oid           on            or            order
primary       real          reference     references    rename
rollback      select        set           sin           smallint
some          sqrt          start         string        sum
substr        sysdate       table         tan           time
timestamp     tinyint       to            true          values
varbinary     varchar       union         unique        update
upper         using         where

JSQL extends the ANSI standard SQL operations by supporting bit manipulation operations. The operators and/or can be applied not only to boolean operands but also to operands of integer type. The result of applying and/or to integer operands is an integer value whose bits are set by the bitwise AND/OR operation. Bit operations can be used for an efficient implementation of small sets. JSQL also supports raising integer and floating-point values to a power with the ^ operator.

Strings

All strings in JSQL have varying lengths, and the programmer does not need to specify the maximum length of character fields. All operations applicable to arrays are also applicable to strings, and strings have their own set of operations as well. For example, strings can be compared with each other using the standard relational operators. JSQL uses the java.lang.String class to implement string operations.

The like construction can be used for matching a string against a pattern containing the special wildcard characters '%' and '_'. The character '_' matches any single character, while '%' matches any number of characters (including zero). The extended form of the like operator with an escape part makes it possible to treat '%' and '_' in the pattern as normal characters when they are preceded by a special escape character, which is specified after the escape keyword.

It is possible to search for a substring within a string using the in operator: the expression ('blue' in color) is true for all records whose color field contains the string 'blue'. Strings can be concatenated using the + or || operators; the latter was added only for compatibility with the ANSI SQL standard. Since JSQL does not perform implicit conversions to the string type in expressions, the semantics of the + operator can safely be redefined for strings.
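A short sketch of these string operations used from JDBC (the Person table and the sample patterns are assumptions for illustration only):

```java
// Assumes a Person(name varchar, address varchar) table already exists.
Statement stmt = connection.createStatement();

// Wildcard match: any name starting with "John"
ResultSet rs1 = stmt.executeQuery(
    "select * from Person where name like 'John%'");

// Escaped wildcard: addresses containing a literal '%' character
ResultSet rs2 = stmt.executeQuery(
    "select * from Person where address like '%!%%' escape '!'");

// Substring search with the in operator
ResultSet rs3 = stmt.executeQuery(
    "select * from Person where 'Cambridge' in address");
```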

References

References can be used in JSQL for fast, direct access to a record by its OID. Reference fields can also be indexed and used in the ORDER BY clause. Access to an object by reference can be done either with the special ConsusStatement.get(Ref ref) method or with the condition "where oid = ?" in a JSQL select statement ('?' is the placeholder for the parameter, which should be set using the PreparedStatement.setRef method).

JSQL accepts typed and untyped (abstract) references. The type of the accessed record is always checked at runtime; for example, the statement "select from Unit where oid = ?" will check that the fetched record belongs to the table "Unit" or to a table derived from it.
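A minimal sketch of fetching a record by OID through a bound reference parameter (the Unit class/table is an assumption; the ref variable would typically come from an earlier query or insert):

```java
// 'ref' is a java.sql.Ref obtained earlier, e.g. from ConsusStatement.insert
PreparedStatement pstmt = connection.prepareStatement(
    "select from Unit where oid = ?");
pstmt.setRef(1, ref);
ConsusResultSet cursor = (ConsusResultSet)pstmt.executeQuery();
if (cursor.next()) { 
    Unit unit = (Unit)cursor.getSelfObject();
}
```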

References can be dereferenced using the same dot notation as used in accessing structure components. For example the following query:

        company.address.city = 'Chicago'
will access the record referenced by the company component of the Contract record, then extract the city component of the address field of the referenced record from the Supplier table.

References can be checked for null by the is null or is not null predicates. They can also be compared for equality with each other, as well as with the special null keyword. When a null reference is dereferenced, an exception is raised by Consus.

There is a special keyword oid, that can be used to get the reference to the current record during the table search. Usually the oid keyword is used for comparison of the current record identifier with other references, or locating it within an array of references. For example, the following query will search in the Contract table for all active contracts:

        oid not in supplier.canceledContracts
Consus provides a special construction for recursive traverse of records by references:
     START FROM root-reference
     [ FOLLOWING BY list-of-reference-fields ]
The first part of this construction specifies the root objects. The nonterminal root-reference should be a variable of ResultSet, reference, or array of reference type. Two special keywords, first and last, can also be used here; they locate the first/last record in the table, respectively. If you want to check some condition in all of the records referenced by an array of references or by a single reference field, this construction can be used without the following by part.

If you specify the following by part, Consus will recursively traverse the table records, starting from the root references and using the list of reference fields list-of-reference-fields for transitions between records. list-of-reference-fields should consist of fields of reference or array of reference type. Traversal is depth first, in top left-to-right order (the parent node is visited first, then each of its children in left-to-right order). Recursion terminates when a null reference is reached. If you specify the DISTINCT qualifier, Consus will use a bitmap to detect loops and so will not visit the same record twice; otherwise loops in the reference graph can cause infinite recursion and stack overflow. For example, the following query will search the tree for records with weight larger than 1, going from the top left to the right:

select from Tree where weight > 1 start from first following by left, right;

For the following tree:

                              A:1.1
              B:2.0                             C:1.5
      D:1.3         E:1.8                F:1.2         G:0.8
the result of query execution will be:
('A', 1.1), ('B', 2.0), ('D', 1.3), ('E', 1.8), ('C', 1.5), ('F', 1.2)
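A sketch of how this tree might be declared and traversed through JDBC (the table and column names follow the example above; the create table statement uses the "reference to" form described later in this section):

```java
// Typed reference fields left/right point back to the Tree table itself.
stmt.execute("create table Tree (name varchar, weight double, "
             + "left reference to Tree, right reference to Tree)");

// Depth-first search starting from the first record of the table:
ConsusResultSet cursor = (ConsusResultSet)stmt.executeQuery(
    "select from Tree where weight > 1 start from first following by left, right");
while (cursor.next()) { 
    System.out.println(cursor.getString("name") + ": " + cursor.getDouble("weight"));
}
```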

Direct access to a record by reference can be performed by the query "select from Contract where oid = ?" (Consus provides special functions for direct record access, but it can also be done using the standard JDBC functions).

When a table is created by the create table statement, it is possible to explicitly specify the type of reference table for the reference field:

      create table A (x reference to B)
If the TO clause is not specified, the reference is considered an abstract reference and cannot be dereferenced in JSQL expressions. Abstract references can still be used for direct loading of a record from the database, using either the ConsusStatement.get(Ref ref) method or the "oid = ?" select predicate.

When a table descriptor is built from a Java application class definition, information about the reference field's target type can be extracted from the field name. If the name of a reference field (or array of references field) contains the '$' character, Consus treats the rest of the name after the dollar sign as the name of the referenced table, and uses it to locate the corresponding table class descriptor during query compilation.

It is not necessary to join tables in Consus. It is possible to select objects from a single table that has references to the related tables. This approach leads to simpler and more efficient queries. Consider the classical Supplier-Shipment-Detail example:
class Detail { 
    String name;
    double weight;
    
    final static String indices[] = {"name"};
}

class Supplier { 
    String company;
    String address;

    final static String indices[] = {"company"};
}

class Shipment { 
    Ref  $Detail;
    Ref  $Supplier;
    long price;
    long quantity;
    Date delivery;

    final static String indices[] = {"$Detail", "$Supplier"};
}
We want to get information about the delivery of some concrete details from some concrete suppliers. In a relational database this query would be written something like this:
     select * from Supplier,Shipment,Detail where 
                 Supplier.SID = Shipment.SID and Shipment.DID = Detail.DID 
                 and Supplier.company like ? and Supplier.address like ?
                 and Detail.name like ? 
In Consus this request can be written as:
    ConsusPreparedStatement stmt = (ConsusPreparedStatement)connection.prepareStatement
        ("select from Shipment where " 
         + "$Supplier.company like ? " 
         + "and $Supplier.address like ? " 
         + "and $Detail.name like ? order by delivery");
    stmt.setString(1, input("Company like: "));
    stmt.setString(2, input("Address like: "));
    stmt.setString(3, input("Detail like: "));
    ConsusResultSet cursor = (ConsusResultSet)stmt.executeQuery();
    System.out.println("Detail Company Address Price Quantity Delivery");
    while (cursor.next()) { 
        Shipment shipment = (Shipment)cursor.getSelfObject();
        Supplier supplier = (Supplier)stmt.get(shipment.$Supplier);
        Detail detail = (Detail)stmt.get(shipment.$Detail);
        System.out.println(detail.name + "\t" + supplier.company + "\t" + 
                           supplier.address + "\t" + shipment.price + "\t" + 
                           shipment.quantity + "\t" + shipment.delivery);
    }
Consus first performs an index search in the Detail table for details matching the search condition. It then performs another index search to locate the Shipment records referencing the selected details. Finally, a sequential search checks the rest of the select predicate.

Arrays

JSQL accepts arrays with dynamic length as components of records. Multidimensional arrays are not supported, but it is possible to define an array of arrays. JSQL provides a set of special constructions for dealing with arrays:

  1. It is possible to get the number of elements in an array using the length() function.
  2. Array elements can be retrieved using the [] operator. If the index expression is out of the array's range, an exception will be raised.
  3. The in operator can be used to check whether an array contains the value specified by the left operand. This operation can be used only for arrays of atomic types: with boolean, numeric, reference or string components.
  4. Iteration through array elements is performed by the exists operator. The variable specified after the exists keyword can be used as an index into arrays within the expression preceded by the exists quantor. This index variable iterates through all possible array index values until the value of the expression becomes true or the index runs out of the array's range. The condition:
            exists i: (contract[i].company.location = 'US')
    
    will select all of the details of shipments made by companies located in the US, while the query:
            not exists i: (contract[i].company.location = 'US')
    
    will select all the details of shipments from companies outside the US.

    Nested exists clauses are allowed. Using nested exists quantors is equivalent to nested loops over the corresponding index variables. For example, the query:

            exists column: (exists row: (matrix[column][row] = 0))
    
    will select all records containing 0 among the elements of the matrix field, which has the type array of array of integers. This construction is equivalent to the following two nested loops:

           boolean result = false;
           for (int column = 0; !result && column < matrix.length; column++) { 
                for (int row = 0; row < matrix[column].length; row++) { 
                     if (matrix[column][row] == 0) { 
                         result = true;
                         break;
                     }
                }
           }
    
    NB: The order in which the indices are used is significant! The result of executing the following query:

            exists row: (exists column: (matrix[column][row] = 0))
    
    will be completely different from the result of the previous query. In the last case the program can simply hang, due to an infinite loop caused by empty matrices.

Unfortunately the Java language provides few built-in operations on arrays. Apart from the get-element/set-element operations and the System.arraycopy method, there were no standard functions for manipulating arrays until JDK 1.2, which added the java.util.Arrays class with sort, fill and binary search operations. Arrays are the only typed collection of objects in Java (there are no template classes), so it is useful to support mutable collections on top of arrays. The Consus class AnyArray provides methods for inserting and deleting array elements. Since the size of an array cannot be changed in Java, all these methods create a new copy of the array, so they should be used like this:

     String[] arr = new String[100];
     ...
     arr = (String[])AnyArray.append(arr, "Hello world");
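Since AnyArray is Consus-specific, the effect of its append method can be approximated in plain Java with java.lang.reflect.Array and System.arraycopy (this helper is a hypothetical illustration, not the Consus implementation):

```java
import java.lang.reflect.Array;

public class ArrayAppend {
    // Returns a new array one element longer, preserving the runtime
    // component type of the original array.
    static Object append(Object arr, Object element) {
        int n = Array.getLength(arr);
        Object copy = Array.newInstance(arr.getClass().getComponentType(), n + 1);
        System.arraycopy(arr, 0, copy, 0, n);
        Array.set(copy, n, element);
        return copy;
    }

    public static void main(String[] args) {
        String[] arr = new String[] { "Hello" };
        arr = (String[])append(arr, "world");
        System.out.println(arr.length + " " + arr[1]); // prints "2 world"
    }
}
```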

Functions

JSQL built-in functions
Name             Argument type             Return type  Description
abs(i)           integer                   integer      absolute value of the argument
abs(r)           real                      real         absolute value of the argument
sin(r)           real                      real         sine (radians)
cos(r)           real                      real         cosine (radians)
tan(r)           real                      real         tangent (radians)
asin(r)          real                      real         arc sine
acos(r)          real                      real         arc cosine
atan(r)          real                      real         arc tangent
exp(r)           real                      real         exponent
log(r)           real                      real         natural logarithm
ceil(r)          real                      real         the smallest integer value that is not less than r
floor(r)         real                      real         the largest integer value that is not greater than r
integer(r)       real                      integer      conversion of real to integer
integer(s)       string                    integer      conversion of string to integer
length(a)        array                     integer      number of elements in the array
length(s)        string                    integer      length of the string
lower(s)         string                    string       lowercase string
real(i)          integer                   real         conversion of integer to real
real(s)          string                    real         conversion of string to real
string(i)        integer                   string       conversion of integer to string
string(r)        real                      string       conversion of real to string
substr(s,m[,n])  string,integer[,integer]  string       substring of s beginning at character m, n characters long (if n is omitted, to the end of the string)
upper(s)         string                    string       uppercase string

Constructing queries

Frequently an application has to construct a query dynamically, for example based on the fields entered by the user in an HTML form. To simplify the creation of dynamic queries, Consus provides a special Query class with the following methods:

Method                             Description
Query(Connection con)              Query constructor.
void reset()                       Reset the query to its initial state.
Query add(String cond)             Add a string to the query.
Query and(String cond)             Add a conjunct to the query. If it is the first conjunct in the statement, the string is prepended by " WHERE ", otherwise by " AND ".
Query param(T val)                 Add a parameter to the statement (T stands for any Java type). The string " ? " is appended to the statement and the value of the parameter is bound to the placeholder once the statement is prepared.
ConsusPreparedStatement prepare()  Prepare the statement.
ConsusResultSet executeQuery()     Prepare the statement and execute the query.
int executeUpdate()                Prepare the statement and execute the update statement.
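A sketch of building a dynamic query with this class (the Person table, its fields, and the name/minSalary variables are assumptions standing in for user-supplied form values):

```java
// Build a select over Person, adding conditions only for the
// fields the user actually filled in.
Query query = new Query(connection);
query.add("select from Person");
if (name != null) { 
    query.and("name like").param(name);
}
if (minSalary > 0) { 
    query.and("salary >=").param(minSalary);
}
ConsusResultSet cursor = query.executeQuery();
```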

Table iterators

Consus provides a more convenient way to iterate through all of a table's records (objects) than the standard JDBC ResultSet. The TableIterator class implements most of the ResultSet navigation methods (but not all of them, and therefore does not implement that interface). This class is oriented toward working with objects, so it cannot be used to extract specific columns of a record. The following two fragments produce the same result:
    Statement stmt = conxn.createStatement();
    ConsusResultSet cursor = (ConsusResultSet)stmt.executeQuery("select * from Person");
    while (cursor.next()) {
        ((Printable)cursor.getSelfObject()).print();
    }
    stmt.close();

    TableIterator iterator = new TableIterator(conxn, "Person");
    while (iterator.next()) { 
        ((Printable)iterator.get()).print();
    }
Using a TableIterator is more efficient than using a ResultSet with a select-all statement because:
  1. No SQL query needs to be compiled, and
  2. The iterator loads records one by one, while ConsusResultSet first extracts the OIDs of all the selected objects.

The following methods are implemented by the TableIterator class:

Method                          Description
TableIterator(ConsusConnection conxn, String tableName)
                                Construct a table iterator
boolean next()                  Moves the cursor to the next row
boolean previous()              Moves the cursor to the previous row
Object get()                    Fetches the current table row as a Java object
void close()                    Closes the iterator
boolean isBeforeFirst()         Indicates whether the cursor is before the first row
boolean isAfterLast()           Indicates whether the cursor is after the last row
boolean isFirst()               Indicates whether the cursor is on the first row
boolean isLast()                Indicates whether the cursor is on the last row
void beforeFirst()              Moves the cursor to the front of the result set
void afterLast()                Moves the cursor to the end of the result set
boolean first()                 Moves the cursor to the first row
boolean last()                  Moves the cursor to the last row
int getRow()                    Retrieves the current row number
boolean absolute(int row)       Moves the cursor to the given row number
boolean relative(int rows)      Moves the cursor a relative number of rows
void deleteRow()                Deletes the current row
int getNumberOfRows()           Gets the number of rows in the table
toArray(Object[] a)             Returns an array containing all selected elements. The runtime type of the returned array is that of the specified array. If the number of selected elements does not exceed the length of the specified array, the elements are returned in it; otherwise a new array is allocated with the runtime type of the specified array and length equal to the number of selected elements.
toArray()                       Returns an array of java.lang.Object containing all selected elements.
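For example, all Person records can be fetched into a typed array in one call (the Person class and the conxn connection are assumptions carried over from the previous example):

```java
// Fetch the whole table into a Person[]; a zero-length array makes
// TableIterator allocate an array of exactly the right size.
TableIterator iterator = new TableIterator(conxn, "Person");
Person[] persons = (Person[])iterator.toArray(new Person[0]);
iterator.close();
System.out.println(persons.length + " persons loaded");
```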

Consus object-oriented JDBC extensions

Consus extends the standard JDBC interface with extra methods implementing the Consus object-oriented extensions. To use these methods, the programmer should perform an explicit cast from the JDBC interface type to the corresponding Consus implementation class (for example, from the ResultSet returned by Statement.executeQuery to ConsusResultSet).

ConsusConnection

Consus is not a pure object-oriented database. It provides a convenient interface for extracting Java objects from, and storing them into, a database, but it is unable to provide fully transparent access to persistent objects. Moreover, if you select the same record (object) from the database twice, Consus will create two different instances of the object. This behaviour can be a problem for truly object-oriented applications, which need to retrieve collections of persistent objects from the database into memory.

To solve this problem, Consus provides an optional facility called the "instance manager". By default it is switched off (if you are using persistent classes it is switched on automatically). If you set the dbObjectCacheSize static variable of the ConsusConnection class to a non-zero value, Consus will allocate a weak object hashtable and guarantee that accesses to the same database record always refer to the same Java object. Consus uses JDK 1.2 weak references to implement this object cache: a reference to an object is removed from the hashtable by the garbage collector (GC) when no "normal" references to the object remain in the application. The value of the dbObjectCacheSize variable specifies the initial size of the hashtable. The hashtable implementation dynamically extends it when the load factor exceeds some predefined value, so the initial size can only reduce the number of table reallocations. Using the instance manager can significantly increase the memory demands of the application; it can also slow the application down because of the extra overhead of locating objects in the hashtable.

There is a static public component dbPagePoolSize in the ConsusConnection class, which can be set before opening the connection to specify the page pool size. The value of this variable specifies the number of pages in the page pool, which is used to optimize access to the disk. The default value is 1024, which means that up to 4Mb of operating memory will be used for caching the database file data. Increasing this value improves performance, unless the size of the used virtual memory exceeds the size of the available physical memory, at which point swapping starts.
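Both settings must be assigned before the connection is opened; for example (the particular values here are arbitrary):

```java
// Enable the instance manager with an initial hashtable size of ~10000
// entries and use a 2048-page (8Mb) page pool; both are static,
// connection-wide settings that take effect on the next connection.
ConsusConnection.dbObjectCacheSize = 10013;
ConsusConnection.dbPagePoolSize = 2048;
Connection connection = DriverManager.getConnection("jdbc:consus:test.dbs");
```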


void commitSubtransaction() throws SQLException;
Commit a subtransaction. This method can be used to implement co-operative transactions. It releases all the locks set by the current thread, allowing other threads to continue execution and see the changes made by the current thread. The transaction is not actually committed, and any rollback or fault will cause the loss of all changes made by the subtransactions. NB: Use this mode with care, since concurrent thread execution can cause unexpected effects and results.

void backup(java.io.OutputStream stream) throws SQLException, java.io.IOException;
Performs a backup of the database to the specified stream. The backup is performed in parallel with other database activities. If you want to minimize the influence of the backup on normal database functionality, set the priority of the backup thread to minimum before performing the backup. No transaction can be committed before the end of the backup, but it is possible to modify database objects and commit subtransactions during the backup.
Parameters
stream where to output the backup. The stream is not closed after the end of the backup.
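A sketch of running such a backup in a minimal-priority background thread (the file name and error handling are placeholders):

```java
// Run the backup at minimal priority so it interferes as little as
// possible with normal database activity.
Thread backupThread = new Thread(new Runnable() {
    public void run() {
        try { 
            FileOutputStream out = new FileOutputStream("test.bak");
            ((ConsusConnection)connection).backup(out);
            out.close(); // backup() leaves the stream open
        } catch (Exception e) { 
            e.printStackTrace();
        }
    }
});
backupThread.setPriority(Thread.MIN_PRIORITY);
backupThread.start();
```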


void scheduleBackup(String fileName, long periodMsec);
Automatically performs a backup with the specified period. The backup is performed by a separate thread running with minimal priority. This function is implicitly invoked when the consus.backup.name system property is set. The period value is taken from the consus.backup.period system property if specified; otherwise it is set to one day.
Parameters
fileName name of the backup file. If this name ends with the "?" character, then the date of the backup start will be appended to the file name, producing a unique file name. Otherwise, the backup will be performed to the file named fileName + ".new". When the backup is completed, the old fileName file will be removed and the new backup file will be renamed to fileName.
periodMsec period in milliseconds between backups (more precisely, between the completion of the last backup and the start of the next one).
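Both ways of scheduling a backup can be sketched as follows (the file name and the cast to ConsusConnection are examples):

```java
import java.sql.DriverManager;
import org.garret.consus.ConsusConnection;

ConsusConnection con = (ConsusConnection)
    DriverManager.getConnection("jdbc:consus:test.dbs");
// Daily backup; the "?" suffix appends the backup start date,
// producing a unique file name for each run.
con.scheduleBackup("test.backup?", 24L * 60 * 60 * 1000);
```

The same effect can be achieved without code by launching the JVM with -Dconsus.backup.name=test.backup? -Dconsus.backup.period=86400000.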

void setThreadAutoCommit(boolean autoCommit);
Sets the autocommit mode only for the current thread. The transaction commit modes of other threads are not changed. The transaction will be automatically committed after each statement, when either the global auto-commit mode is set or the current thread is in the auto-commit state.
Parameters
autoCommit true - enables auto-commit for the current thread; false - disables

boolean getThreadAutoCommit()
Gets the current auto-commit state of the current thread.
Returns
the auto-commit state of the current thread.

public Ref insert(Object obj) throws SQLException;
Inserts the specified object into the database. The table into which the object will be inserted is determined by the object class.
Parameters
obj Java object to be inserted.
Returns
OID of the created object.

public void update(Ref ref, Object obj) throws SQLException;
Update an object with the specified OID.
Parameters
ref reference specifying the OID of the updated object.
obj Java object with new values of the record fields.

public Object get(Ref ref) throws SQLException;
Get an object by the OID.
Parameters
ref reference specifying the OID of the object.
Returns
the retrieved object.

public void remove(Ref ref) throws SQLException;
Remove the object with the specified OID.
Parameters
ref reference of the object to be removed. Can be null.
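Taken together, these methods support a simple object lifecycle. The sketch below reuses the Person class from the introduction and assumes it is persistent-capable:

```java
import java.sql.DriverManager;
import org.garret.consus.ConsusConnection;
import org.garret.consus.Ref;

ConsusConnection db = (ConsusConnection)
    DriverManager.getConnection("jdbc:consus:test.dbs");
Person p = new Person();
p.name = "John Smith";
p.salary = 75000;
Ref ref = db.insert(p);           // table is chosen by the object class
Person loaded = (Person)db.get(ref);
loaded.salary = 80000;
db.update(ref, loaded);           // write the new field values back
db.remove(ref);
db.commit();
```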

public void createObjectCache() 
Allocate an object cache (instance manager) if not allocated yet.

public void clearObjectCache() 
Remove all objects from the object cache. This method should be called to prevent memory exhaustion caused by loading a large number of interconnected persistent objects (each reachable from some set of root objects) from the database. The GC will not be able to deallocate such objects because they reference each other, so eventually all objects from the database could be loaded into memory. The programmer should check the number of loaded objects (using the getObjectCacheSize or getUsedMemorySize method) and, if it exceeds some threshold, call the clearObjectCache method to remove all objects from the cache. Alternatively, the programmer can call the Persistent.unget method to replace an object with a stub, but this approach is less efficient because it usually throws away the most recently used object.
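A possible cache-management loop; the threshold value, the refs collection and the process() helper are chosen purely for illustration:

```java
final int CACHE_THRESHOLD = 100000;     // tune to your heap size
for (Ref ref : refs) {                  // refs: some collection of Refs
    process(db.get(ref));               // process() is hypothetical
    if (db.getObjectCacheSize() > CACHE_THRESHOLD) {
        db.clearObjectCache();          // let the Java GC reclaim objects
    }
}
```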

public int getObjectCacheSize();
Returns the number of objects in object cache.
Returns
the number of objects in the cache.

public static long getUsedMemorySize();
Returns the size of memory currently used by the application.
Returns
Size of memory, in bytes, used by application.

ConsusResultSet


public Object getObject(int columnIndex) throws SQLException;
JDBC: Gets the value of a column, in the current row, as a Java object.
This method from the JDBC ResultSet interface is extended in Consus to retrieve the whole record as a Java object.
Parameters
columnIndex the first column is 1, the second is 2, etc...
Consus specific: a zero value refers to the currently selected object itself
Returns
A Java object holding the column value of the record or the record itself.

public Object getObject(String columnName) throws SQLException;
JDBC: Gets the value of a column, in the current row, as a Java object.
This method from the JDBC ResultSet interface is extended in Consus to retrieve the whole record as Java object.
Parameters
columnName the SQL name of the column
Consus specific: using "this" value as a parameter will refer to the currently selected object itself
Returns
A Java object holding the column value of the record or the record itself.

public void updateObject(int columnIndex, Object x) throws SQLException;
JDBC: Updates a column with an Object value.
This method from the JDBC ResultSet interface is extended in Consus to update the whole record using values from the passed Java object. You should either use this method to update all record fields or update fields separately using the standard JDBC semantics. NB: Do not mix the two approaches.
Parameters
columnIndex the first column is 1, the second is 2, etc...
Consus specific: a zero value as the parameter means that all the record fields should be updated and new values should be taken from the passed Java object.
x the new column or record value

public void updateObject(String columnName, Object x) throws SQLException;
JDBC: Updates a column with an Object value.
This method from the JDBC ResultSet interface is extended in Consus to update the whole record using values from the passed Java object. You should either use this method to update all record fields or update fields separately using the standard JDBC semantics. NB: Do not mix the two approaches.
Parameters
columnName the SQL name of the column
Consus specific: using "this" as the value of the parameter means that all record fields should be updated and new values should be taken from the passed Java object.
x the new column or record value

public Ref getRef(int i) throws SQLException;
JDBC: Gets a REF column value from the current row.
This method from JDBC ResultSet interface is extended in Consus to get a reference to the currently selected record.
Parameters
i the first column is 1, the second is 2, etc...
Consus specific: a zero value passed as a parameter refers to the currently selected object itself
Returns
Ref object representing an SQL REF value

public Ref getRef(String colName) throws SQLException;
JDBC: Gets a REF column value from the current row.
This method from JDBC ResultSet interface is extended in Consus to get reference to the currently selected record.
Parameters
colName the SQL name of the column
Consus specific: using "this" value as the parameter refers to the currently selected object itself
Returns
Ref object representing an SQL REF value.

public Object getSelfObject() throws SQLException;
Consus specific method to get the current record as a Java object. This method is equivalent to getObject(0) or getObject("this").
Returns
The Java object representing the selected record.

public Ref insertObject(Object obj) throws SQLException; 
Inserts the specified object into the table.
Parameters
obj object to be inserted.
Returns
OID of the created record.

public void updateSelfObject(Object obj) throws SQLException;
Update the current record.
Parameters
obj object with the new values of the record fields.

public Ref getSelfRef();
Get OID of the current object. This method is equivalent to getRef(0) or getRef("this").
Returns
OID of the current record.

public int getSelectionSize(); 
Gets the number of selected records.
Returns
the number of selected records.
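The object-oriented extensions of ConsusResultSet can be combined as follows, again assuming the Person class from the introduction:

```java
import java.sql.Statement;
import org.garret.consus.ConsusResultSet;

Statement stmt = connection.createStatement();
ConsusResultSet cursor = (ConsusResultSet)
    stmt.executeQuery("select * from Person where salary > 100000");
System.out.println("selected: " + cursor.getSelectionSize());
while (cursor.next()) {
    Person p = (Person)cursor.getSelfObject(); // same as getObject(0)
    p.salary += 1000;
    cursor.updateSelfObject(p);                // update the whole record
}
stmt.close();
```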

ConsusStatement

public Ref insert(Object obj) throws SQLException;
Insert specified object into the database. The table in which the object will be inserted is determined by the object class.
Parameters
obj the Java object to be inserted.
Returns
OID of the created object.

public void update(Ref ref, Object obj) throws SQLException;
Update object with the specified OID.
Parameters
ref the reference specifying the OID of the updated object.
obj a Java object with the new values of the record fields.

public Object get(Ref ref) throws SQLException;
Get object by OID.
Parameters
ref the reference specifying OID of the object.
Returns
the retrieved object.

public void remove(Ref ref) throws SQLException;
Remove object with the specified OID.
Parameters
ref a reference of the object to be removed. Can be null.

public Ref getLastRef() throws SQLException;
Get the reference to the last selected, updated or inserted object.
Returns
the reference to the last manipulated object.

Consus as a Persistent Object Store

In addition to the JDBC interface, Consus provides a more convenient and transparent interface for accessing persistent objects. Unlike most available Java OODBMSs, Consus doesn't require the use of a special preprocessor or compiler; as a result, the interface is not fully transparent. The persistent object interface can normally be used together with JSQL queries. It is possible to locate the root object using a JSQL query and then load all the referenced objects, either implicitly (by recursively traversing all referenced objects) or by manually moving from one object to another using references.

Persistency Model

Each persistent-capable object (i.e. an object that can be stored in the database) should be derived from the org.garret.consus.Persistent class. This class provides methods for loading the object from the database, updating the object and storing the object in the database. Deriving a class from org.garret.consus.Persistent doesn't necessarily mean that all instances of the class will be stored in the database. An object is stored in the database only if the store method is explicitly called for it or for some other object that contains a reference to it.

When a persistent object is loaded from the database, all the object references in the loaded object are inspected. If a reference contains an OID of an object which is already loaded, then a direct pointer to this object is stored in the reference field of the Java object. Otherwise a new stub object is created. The stub object is of the same type as the real object, but it contains no data (except the OID of the object). To load the data of this object, the programmer should invoke the load method on it. It is possible to make Consus automatically load the closure of the objects (i.e. all the objects referenced from the root object, the objects referenced from those objects, etc...). Automatic loading of all referenced objects is performed by Consus if the loadClosure method declared in the Persistent class returns true for the initial object. The algorithm for loading objects is:

  1. Initial persistent object can be selected by any JSQL query or directly accessed by OID.
  2. When a persistent object is loaded from the database, the method loadClosure is invoked for the object.
  3. If the method loadClosure returns true, then Consus will recursively load all the referenced objects.
  4. If the method loadClosure returns false, then Consus creates object stubs - i.e. objects of the same class but without any data (except the OID).
  5. All loaded objects are placed in the object cache, where they can be located by the OID. This cache is used to guarantee reference integrity, i.e. two different references in the database to the same object will also point to the same object instance in main memory, when objects containing the reference are loaded.
All persistent-capable classes should have a default constructor (i.e. a constructor without parameters). If a class has no constructor at all, a default constructor is automatically generated by the compiler. If you define one or more constructors yourself, the compiler will not generate a default constructor; in this case you should explicitly provide a constructor with an empty argument list. This constructor is called each time an object is loaded from the database, so you should not initialize object fields in the default constructor, except for transient fields, which are not stored in the database.
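A persistent-capable class following these rules might look like this (the Employee class is a made-up example):

```java
import org.garret.consus.Persistent;

public class Employee extends Persistent {
    String   name;
    long     salary;
    Employee manager;          // reference to another persistent object

    transient int cachedHash;  // transient: not stored in the database

    // Default constructor, invoked on every load from the database;
    // initialize only transient fields here.
    public Employee() {
        cachedHash = -1;
    }

    public Employee(String name, long salary) {
        this.name = name;
        this.salary = salary;
    }
}
```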

When a persistent-capable class (i.e. a class derived from the org.garret.consus.Persistent base class) is registered in the database, the instance manager is automatically started. It manages the object cache and is used by the object loader to check whether an object with a given OID is already present in memory.

Persistent Object Interface

All persistent capable classes should be derived from the org.garret.consus.Persistent class. This class provides the following methods:
boolean loadClosure();
This method controls the implicit loading of the object closure. It is called by Consus for each loaded object to check whether the referenced objects should also be loaded. This method allows you to distinguish which objects should be loaded implicitly by Consus and which should be loaded explicitly by the programmer.
Returns
true if all the objects referenced from this object should be loaded, false otherwise. The implementation of this method in the org.garret.consus.Persistent class returns true. If you do not want to implicitly load all referenced objects, redefine this method in your class to return false. Although the implicit load of an object closure saves a lot of the programmer's effort and makes a program more transparent, its recursive nature can cause loading of all objects in the database, for example if they are linked in a doubly linked list.
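For example, a node of a linked structure would normally disable closure loading to avoid pulling the whole graph into memory (TreeNode is a made-up class):

```java
import org.garret.consus.Persistent;

public class TreeNode extends Persistent {
    TreeNode left, right;  // links that could drag in the whole tree
    int value;

    // Referenced nodes are created as stubs; fetch them explicitly
    // with get() or load() when they are actually needed.
    public boolean loadClosure() {
        return false;
    }
}
```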

boolean isNew() 
Checks if the object has not yet been placed in storage.
Returns
true if the object has no assigned OID.

boolean isLoaded() 
Checks if the object was loaded from the database.
Returns
true if it was loaded from the database or was just created (does not yet have an assigned OID), false if the object data is not yet loaded and we have a stub instead of the object.

boolean isDeleted();
Checks if the object was deleted. Any access to a deleted object, except through this method, is prohibited.
Returns
true if object was deleted, false otherwise.

Persistent load();
Load an object from the database. This method can be called to load the stub object data or to reload the object if it was changed by another application/thread.
Returns
this object reference.

Persistent get()
This method checks if the object has been already loaded and if not (i.e. the object is a stub) then loads the object.
Returns
reference to the loaded object (equal to this)

Persistent get(Persistent obj)
This method is equivalent to the get() method except that it is static and takes an object reference as a parameter. It checks the parameter for null, and if it is null, returns null.
Parameters
obj - reference to the object to be loaded (can be null).
Returns
reference to the loaded object (equal to the obj parameter)

void loadComponents()
Loads the components of the persistent object. If closure loading is disabled for this class (the loadClosure() method returns false), then Consus creates stubs instead of referenced objects. The programmer can use this method to explicitly load referenced objects. This method loads only stub objects (i.e. objects that are not yet loaded) and doesn't reload already loaded objects.

void unget()
Clears all object fields (to make it possible for the GC to collect unused persistent objects). If this method is not called and all persistent objects in the database reference each other (for example, linked in a doubly linked list), then after some time all the data can be loaded into memory from the database.

void lock();
Locks the database in exclusive mode. The lock is released after the current transaction commits or a rollback occurs.

void store(ConsusConnection db)
Stores the object in the specified database. If the object has no assigned OID, it will first be allocated in the storage. If the object contains references to other persistent-capable objects without assigned OIDs, then all these objects will also be saved in the storage (persistence by reachability).
Parameters
db - Consus Connection object.

void store()
Stores the object in the storage from which it was loaded. This method cannot be invoked for newly created objects that have no assigned OID; use store(ConsusConnection) for them instead.

boolean save()
Checks if the object is in the deferred update mode and, if not, saves it in the storage. The object is stored in the storage from which it was loaded. Like store(), this method cannot be invoked for newly created objects that have no assigned OID.
Returns
true if the method stores the object, false otherwise

void remove()
Remove the current object from the database.

The class org.garret.consus.Persistent has two public instance variables: db and deferredUpdate. db can be used to access the connection from which the object was loaded. deferredUpdate allows you to group several updates of the object into a single store operation; it is described below.

Consus doesn't use any preprocessor, so it cannot detect the moment when an object has changed and needs to be saved in the storage. It is the responsibility of the programmer not to forget to save changes to the database. So each method that modifies the object should save the object before it returns. Sometimes one method calls other methods to do part of the job; if each of these methods saves the object before returning, performance will decrease. For this case, Consus allows the programmer to put the object into deferred update mode. In deferred update mode the save method will not store the object immediately: the deferredUpdate field is used as a counter, and nothing is stored until it reaches zero. In this way the outermost method can save all the changes made by the invoked methods using a single store operation.
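A sketch of the pattern, assuming deferredUpdate is an int counter that may be incremented and decremented directly (the Account class is illustrative):

```java
import org.garret.consus.Persistent;

class Account extends Persistent {
    long balance;

    void deposit(long amount) {
        balance += amount;
        save();                // no-op while deferredUpdate > 0
    }

    void monthlyUpdate(long interest, long fee) {
        deferredUpdate += 1;   // enter deferred update mode
        try {
            deposit(interest);
            deposit(-fee);
        } finally {
            deferredUpdate -= 1;
            save();            // one store for all changes above
        }
    }
}
```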

Operations on persistent objects are not declared as throwing java.sql.SQLException, so there is no need to catch this exception or include it in method prototypes in code that works with persistent objects. Instead, operations on persistent objects can throw the POSError exception, which is derived from java.lang.RuntimeException and so doesn't need to be caught or included in the throws list of a method declaration.

Containers

To implement a one-to-many relationship between persistent objects, in addition to normal arrays, Consus provides two additional container types: list and ordered list. Both classes implement the java.util.Collection and java.util.List interfaces and provide the same methods as the java.util.ArrayList class.

These collections are implemented as dynamically growing arrays. When the number of elements exceeds the length of the array, it is reallocated with double the size. Only persistent-capable objects (i.e. objects of classes derived from org.garret.consus.Persistent) can be stored in these containers. Iteration through container elements can be done with iterators or by accessing elements by index. Consus provides four classes:

Consus class      Correspondent JDK class   Description
POCollection      java.util.Collection      Base class for all persistent collections
POComparator      java.util.Comparator      Base class for all comparable persistent classes
POList            java.util.ArrayList       Dynamically reallocatable array
POSortedList      java.util.ArrayList       Dynamically reallocatable sorted array

POSortedList uses two alternative approaches to keeping the order of elements. It is possible to associate a comparator object (POComparator) with the list; the comparator is responsible for comparing objects with each other. The other approach is for the objects to implement the java.lang.Comparable interface. The first method is more flexible, while the second is simpler for the programmer. Search and insertion of elements in a POSortedList is performed using binary search, unlike the sequential search in POList.
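A one-to-many relationship using POList might be modelled as follows (the package location of the container classes and the Department/Employee classes are assumptions):

```java
import org.garret.consus.POList;
import org.garret.consus.Persistent;

class Department extends Persistent {
    String name;
    POList employees = new POList(); // one-to-many relationship

    void hire(Employee e) {
        employees.add(e);            // only persistent-capable elements
        store();                     // save the modified container
    }
}
```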

Garbage Collector (GC)

Consus lets the programmer choose between explicit and implicit deallocation of persistent objects. Explicit deallocation is more efficient; implicit deallocation is more reliable and convenient. Currently Consus implements a simple blocking mark-and-sweep garbage collection algorithm.

To be able to run garbage collection, the programmer has to specify a set of root tables. All records from these tables are considered root objects (they are marked by the GC as accessible). In the case of table inheritance, it is necessary to specify only the root table. Consus then recursively scans all the accessible objects and marks all the referenced objects as accessible. When no more objects can be marked, the marking stage of garbage collection is complete and the collector performs the sweep stage, at which it deallocates all objects that were not marked during the first stage.

Garbage collection can be started explicitly by the programmer, or it is possible to specify a threshold of allocated space after which garbage collection is initiated. Allocated space is calculated as the difference between allocated and deallocated objects (if explicit deallocation methods are used) since the last GC. It is also possible to obtain the total size used by objects in the database, which can be useful for scheduling backups, synchronizing changes for replicated tables, etc...

The following table summarizes the garbage collection related methods provided by Consus:

Method                                Description
void setGCRoots(String tableNames[])  Specify the root tables for garbage collection. All tables should exist in the database.
void setGCThreshold(long threshold)   Specify the size of allocated objects after which GC should be initiated. By default it is set to Long.MAX_VALUE, so GC is disabled.
void doGarbageCollection()            Explicitly initiate garbage collection (if it has not already been started).
long getUsedSize()                    Get the size used by objects in the database.
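A typical configuration sequence; the table names are examples and the methods are assumed to be invoked on the ConsusConnection:

```java
import java.sql.DriverManager;
import org.garret.consus.ConsusConnection;

ConsusConnection db = (ConsusConnection)
    DriverManager.getConnection("jdbc:consus:test.dbs");
db.setGCRoots(new String[]{"Department", "Project"});
db.setGCThreshold(1024 * 1024); // start GC after ~1Mb of allocations
db.doGarbageCollection();       // or trigger a collection explicitly
System.out.println("used size: " + db.getUsedSize());
```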

JDBC Object-Relational Adapter

Consus makes it possible to use its object-oriented API not only with Consus itself but with any other DBMS that has a JDBC interface. All OO-specific methods of the storage are included in the PersistentObjectStorage interface. This interface is implemented by the ConsusConnection and JDBCObjectRelationalAdapter classes. The latter class is used to emulate the object framework on top of a relational database system.

The tables used by the application should be explicitly created by the programmer using the standard JDBC protocol. This is the main difference from the implementation of PersistentObjectStorage by ConsusConnection, where tables are automatically generated from class definitions. The following is the initialization sequence for a database to be used with JDBCObjectRelationalAdapter:

  1. Create TypeInfo table with TID, TableName, ClassName fields.
  2. Insert into this table the information about all the tables/classes used by the application. Each row assigns a unique type identifier (TID) and establishes the mapping between an application class and a database table.
  3. Create OidTable. This table contains a single column ID with a single row that is used to generate new object identifiers.
  4. Insert the initial value of identifier into the OidTable table.
  5. Create tables for all classes used by the application.

To check whether a database has been initialized, the programmer should use the PersistentObjectStorage.isInitialized() method. It should be called before opening the store, and the database should be initialized prior to the invocation of the PersistentObjectStorage.open() method.

These rules are illustrated by the following example:

    class Address { 
        String  country;
        String  city;
        String  street;
    };
    
    class Person extends Persistent { 
        String   name;
        Address  address;
        Person   mother;
        Person   father;
        Person[] childs; 
    };
    
    Connection con = DriverManager.getConnection(databaseURL);	
    PersistentObjectStorage storage = (con instanceof PersistentObjectStorage) 
        ? (PersistentObjectStorage)con : new JDBCObjectRelationalAdapter(con);
    if (!storage.isInitialized()) {
        Statement stmt = con.createStatement();
        stmt.executeUpdate("create table TypeInfo (TID int, TableName varchar, ClassName varchar)");
        stmt.executeUpdate("create table OidTable (ID int)");
        stmt.executeUpdate("insert into OidTable values (1)");
        stmt.executeUpdate("create table Person (name varchar primary key," + 
                                                "address$country varchar," + 
                                                "address$city varchar," + 
                                                "address$street varchar," + 
                                                "mother bigint," + 
                                                "father bigint," + 
                                                "childs varbinary)");
        stmt.executeUpdate("insert into TypeInfo values (1, 'Person', 'Person')");
    }
    storage.open();
You can use the java.sql.DatabaseMetaData interface to map Java types to the corresponding SQL types. See the GuessPO test or BugDB example for more details of using JDBCObjectRelationalAdapter.

Most of the PersistentObjectStorage interface is used by the Persistent class and should not be used by the application programmer directly, although the following methods are useful to the application programmer:


boolean isInitialized() throws Exception;
Returns true if the database has already been initialized. If the database was not initialized, the programmer has to create the TypeInfo, OidTable and application tables, insert information about the class-table mapping into the TypeInfo table, and initialize the ID counter by inserting a value into OidTable. Initialization can be done through the standard JDBC methods.
Returns
true if database has already been initialized, false otherwise

void open() throws Exception;  
Opens the persistent object store. This method should be called after checking that the store has been initialized; if it is not initialized, the programmer should initialize it first.

Object fetchObject(ResultSet cursor) throws Exception;
Retrieve the current row at the cursor as a Java object
Parameters
cursor - result set with the current position corresponding to the retrieved object
Returns
The object for the current cursor row

void updateObject(Object obj, ResultSet cursor) throws Exception; 
Update the current row of the cursor with the new values passed in the Java object
Parameters
obj - the object with the new values cursor - a result set with the current position pointing to the updated row

void lock() throws Exception;
Locks the database in exclusive mode. The lock is kept until the end of the transaction. This method should be called to prevent deadlocks caused by upgrading shared locks to exclusive ones. For example, if within the same transaction you first issue a read-only statement and then an update statement, a deadlock can occur if a concurrent thread also tries to upgrade its shared lock to exclusive.

TableIterator getTableIterator(String tableName) throws Exception; 
Gets a table iterator. This is a more efficient way to iterate through a table than using "select * from" and a ResultSet. It provides sequential access as well as direct access by key.
Parameters
tableName - name of the table through which iteration will be performed
Returns
iterator for the specified table

Object getObjectByOid(long oid) throws Exception;
Get object by the object identifier. This method is not frequently used. The Persistent class is responsible for the automatic dereferencing of references to persistent objects. This method should be used only when you save OID of the object and later want to get this object through its OID. For example, the BugDB application places the OID of the object into the generated HTML form, so when processing the POST HTML request, it can get the object by OID using this method.
Parameters
oid - identifier of the object to be retrieved
Returns
the object with the specified OID

Support for Replicated and Mobile Databases

Consus is optimized for use with mobile Java applications. Such applications have no persistent connection with the main database storage, so they should contain a local data storage and be able to exchange data with the server when the connection is re-established. Data replication has become a very popular approach. Unlike traditional distributed data management schemes based on a two-phase transaction commit protocol, data replication can offer better performance (data is stored locally), better reliability (the application can work autonomously if a connection with the server cannot be established) and better scalability. The main drawback of data replication is the possibility of conflicting updates and the lack of generic algorithms for resolving such conflicts.

Consus provides a solution both for applications working with replicated data and for mobile applications. It provides a very simple way to synchronize the contents of a database table on the client with the server (or a master table with a replica table). This mechanism is mostly transparent to the programmer, who only has to invoke the checkOutTable and checkInTable methods. All the work of locating changed records, transferring them between client and server, and detecting conflicts is done by Consus automatically. In the case of update conflicts, Consus throws a MergeConflictException, which should be caught and handled by the programmer.

To use this replication mechanism, it is necessary to declare a "recordTimestamp" field in the table (both on the client and the server side). The name of this field can be changed by assignment to the ConsusConnection.recordTimestamp component, but it should be the same for all classes stored in the database, both at the client and the server site. This field should be of the Java long type (or the SQL BIGINT type).

To be able to locate related records in the master and replica tables, Consus needs either information about the table's primary key or a special reference field containing the reference to the master record. The name of this reference field can also be specified in the ConsusConnection class by assignment to the ConsusConnection.masterOID component; the default value of this component is "masterOID". If the field is present in the table, Consus uses it for storing the reference to the master record. This approach is more efficient than using an existing primary key, but assumes that on the server we also have a Consus database. If the table contains no masterOID field, then Consus searches for the table's primary key (the first column marked as a primary key is taken). A column can be marked as a primary key in two ways:

  1. If you create the table through the SQL CREATE TABLE statement, you should use the PRIMARY KEY qualifier to mark the primary key fields.
  2. If the table was created implicitly from a Java class definition, you should add a static String primaryKeys[] component to the class and assign to it an array with the names of the primary key columns.
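A replica-capable class declared in the second way might look like this (Customer is a made-up example):

```java
import org.garret.consus.Persistent;

class Customer extends Persistent {
    long   recordTimestamp;   // required by the replication mechanism
    String email;
    String name;

    // "email" serves as the primary key matching master and replica
    // records (used when the table has no masterOID field).
    static String primaryKeys[] = {"email"};
}
```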

When the application establishes a connection with the server (master) for the first time, it should use checkOutTable for all the tables that need to be replicated. It is not necessary for the master and replica tables to be exactly the same: the replica table can contain a subset of the master table's columns. But both should have the recordTimestamp and, optionally, the masterOID fields. The application can then close the connection with the server and work with the replica table autonomously.

To synchronize changes with the server (master), it is necessary to re-establish the connection with the server and invoke checkInTable for all the changed tables. This method first transfers to the client all the records which were changed on the server since the moment of the check-out. If some of these records were also changed on the client, a merge conflict is detected and a MergeConflictException is thrown; the exception object is given the value of the key of the record causing the conflict. Consus then transfers all the records which were changed by the client to the master table, assigning them a timestamp value greater than that of any other record in the master table. It is possible to perform check-ins several times, synchronizing the contents of the master and replica tables.
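The overall flow can be sketched as below; the exact signatures of checkOutTable and checkInTable are not shown in this guide, so the argument lists here are assumptions:

```java
import java.sql.DriverManager;
import org.garret.consus.ConsusConnection;
import org.garret.consus.MergeConflictException;

ConsusConnection replica = (ConsusConnection)
    DriverManager.getConnection("jdbc:consus:local.dbs");
// First connection to the master: populate the local replica.
replica.checkOutTable("Customer");       // argument list assumed
// ... work offline with the local Customer table ...
try {
    replica.checkInTable("Customer");    // synchronize with the master
} catch (MergeConflictException e) {
    // The exception carries the key of the conflicting record.
    System.err.println("conflict on record: " + e.getMessage());
}
```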

Consus Specifics of the JDBC Implementation

One of the main goals of the Consus database design was the full and correct implementation of all JDBC methods. There are, however, a few differences from the JDBC specification:
  1. Autocommit mode is by default disabled in Consus.
  2. A select cursor is not closed automatically after retrieving all the objects. The programmer should explicitly close the cursor or statement. If autocommit mode is enabled, the transaction will be committed only after the explicit closing of the cursor or statement. INSERT, UPDATE, DELETE, CREATE and DROP statements in autocommit mode commit the transaction immediately after the completion of the operation.
  3. Setting the transaction mode to TRANSACTION_NONE will make the commit() method (or commit JSQL statement) release all locks set by the current transaction but does not update the data on the disk. In case of a fault, all changes will be lost (this is only with TRANSACTION_NONE mode). Closing the connection will implicitly commit all the opened transactions and flush all the changes to the disk in all transaction modes.
  4. Commit or rollback of a transaction causes the closing of all cursors opened by the thread that initiated the commit or rollback operation.

Consus will automatically perform database recovery if your application wasn't terminated normally (didn't close the database connection). All changes made by uncommitted transactions will be undone. Consus uses a shadow page scheme for transaction committing, so it needs no transaction log file and is able to perform recovery almost instantly. Recovery is done automatically and requires no interaction by the administrator. To recover from a backup file, just copy it over the corrupted database file.

Consus implements a multiple-readers single-writer scheme for synchronization of concurrent access to the database. This means that many read-only transactions can access the database concurrently, but only one active modification transaction is allowed (other threads will be blocked until the end of that transaction). Consus doesn't guarantee any specific order of granting lock requests. For example, if there are many read requests, a thread which tries to modify the database can be blocked indefinitely, because each started read transaction prevents the modification transaction from proceeding. This does not mean, however, that Consus implements a readers-first scheme; no particular order is guaranteed.

Consus isolates the different threads working with the database by starting a separate transaction for each thread. An application should open a single connection with the database, to be used by all application threads. You should not use one instance of a Statement, PreparedStatement or ResultSet object in two different threads; it can cause unpredictable results. It is possible to work with several result sets in a single thread, but Consus doesn't guarantee that the changes made in one result set (record insertion/update/delete) will be visible in other result sets. It is the responsibility of the programmer to avoid such conflicts. When a transaction is closed (explicitly or implicitly), all the result sets opened by the current thread are closed (no more records can be accessed using these result sets). Consus doesn't implicitly close transactions after retrieving the last record, even if autocommit mode is set, so you can perform as many passes through the selected records as you want (in backward direction, forward direction or using random access). Consus doesn't automatically close BLOB and CLOB objects when a transaction is closed. You should avoid further manipulations with these objects after the end of the transaction because they can be altered by other threads.

The only possible deadlock situation in Consus is caused by upgrading a lock from shared to exclusive. It happens when the application executes some read-only select statement and then tries to update the database within the same transaction. If two threads try to concurrently upgrade their locks, a deadlock can happen. Consus is able to automatically detect such situations and raise an exception in case of deadlock. To avoid deadlocks you can force the transaction to set an exclusive lock by adding the "for update" clause to the first select statement or by invoking the ConsusConnection.lock() method. You will also have to specify the "for update" clause if you are going to perform update/delete/insert operations with the produced result set.

It is possible to mix object-oriented and traditional access to the table records with Consus. The columns of a table created from a Java class descriptor contain all the fields of the corresponding class and of the classes referenced by its components. In the latter case, the field gets a compound name with the parts separated by dots. For example, the following Java classes:

    class A { 
        int a;
    }
    class B extends A { 
        int b;
        A   r1;
        A   r2;
        A[] arr;
        String s;
    }
will produce a table with the following columns:

Column name    a   b   r1.a   r2.a   arr   s
Column index   1   2   3      4      5     6
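
The flattening of nested classes into compound column names can be sketched with reflection. This is an illustration of the naming rule, not Consus source code, and the exact field order returned by reflection may vary by JVM:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Sketch: derive compound column names from a class definition.
// Fields of referenced classes get dot-separated names; inherited
// fields come first.
public class Flatten {
    public static class A { int a; }
    public static class B extends A { int b; A r1; A r2; A[] arr; String s; }

    public static void collect(Class<?> cls, String prefix, List<String> out) {
        if (cls.getSuperclass() != null && cls.getSuperclass() != Object.class) {
            collect(cls.getSuperclass(), prefix, out); // inherited columns first
        }
        for (Field f : cls.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers())) continue;
            Class<?> t = f.getType();
            // Descend into referenced user classes; scalars, strings
            // and arrays each become a single column
            if (!t.isPrimitive() && !t.isArray() && t != String.class) {
                collect(t, prefix + f.getName() + ".", out);
            } else {
                out.add(prefix + f.getName());
            }
        }
    }

    public static void main(String[] args) {
        List<String> cols = new ArrayList<>();
        collect(B.class, "", cols);
        // columns: a, b, r1.a, r2.a, arr, s (declaration order on most JVMs)
        System.out.println(cols);
    }
}
```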

To enable polymorphic queries, Consus doesn't allow restricting a selection to a single table: records are always also selected from derived tables. The delete statement, however, deletes records only from the specified table and doesn't touch records in the derived tables.

When Consus tables are created from Java class descriptors, Consus can automatically update the scheme when the Java class definitions change. When the database is opened, class descriptors are loaded and compared with the descriptors of the application's Java classes. If the descriptors differ, Consus tries to establish a mapping between the old and new table descriptors. Mapping is done by field names. Consus is able to perform automatic conversion between all scalar data types. This means that, for example, a smallint can be converted to a double, and a boolean to a timestamp. Such conversions may, however, cause loss of significant digits. Other types (strings, references, structures, arrays, BLOBs and CLOBs) can not be converted to another type automatically. Conversion is done recursively for the fields of all nested structures and arrays. Values of fields absent in the old table descriptor are set to zero. Indices are also automatically updated. Attention! Removing a field from a Java class will cause the loss of the data in the corresponding table column in the database as a result of the data conversion.

Consus is an embedded database engine which can be used only within one application (but that application can spawn several concurrent threads to work with the database). The Java library doesn't provide functions for locking files against concurrent access, so if another application tries to access the same database file, Consus will not be able to detect and prevent such a situation; the result of such "cooperative" work with the same database file is unpredictable and can cause the loss of all data in the database.

Consus methods are able to throw a number of exceptions. Most frequently you will deal with two types of exceptions: ConsusCompileError and ConsusRuntimeError. Both of these are derived from SQLException. ConsusCompileError is thrown when a passed JSQL statement contains errors, and ConsusRuntimeError is thrown when something goes wrong during the execution of a statement. The execution of a statement can also cause generation of ArithmeticException, IllegalArgumentException, IndexOutOfBoundsException or StringIndexOutOfBoundsException. A compile error exception reports the position in the statement where the error was detected (the first position is 1). The following table summarizes the information about exceptions thrown by Consus:

Exception class: Description

AssertionFailed: Some Consus internal invariant is violated. This signals a bug in Consus; please send a bug report if you get this one.
ConsusColumnNotFoundError: The column name specified in a ResultSet method is not present in the table.
ConsusCompileError: Error in the passed JSQL statement. The exception object contains information about the position in the statement string where the error was detected.
ConsusIOError: An operation with the database file has failed. Most likely caused by the lack of free space on the disk.
ConsusIncompatibleTypesError: Consus is not able to perform the requested data conversion between database types and the types specified by the application.
ConsusNotImplemented: This method is not currently implemented by Consus (see the following table for the list of unimplemented methods).
ConsusNotInUpdateMode: An attempt to perform an insert/update/delete operation with a result set created by a select statement without the for update clause.
ConsusRuntimeError: Error during statement execution.
ConsusLoginRefusedException: Exception thrown on the client side when a connection is refused by the server.
MergeConflictException: Exception thrown by the checkInTable method when records changed by the application have also been updated by some other application since the last checkOutTable was done by this application.
InvalidOIDError: Invalid OID.
POSError: Exception thrown during an operation with a persistent object.

Methods from JDBC 2.0 currently unimplemented by Consus
Connection:
    CallableStatement prepareCall(String sql)
    CallableStatement prepareCall(String sql, int resultSetType, int resultSetConcurrency)
    java.util.Map getTypeMap()
    void setTypeMap(java.util.Map map)

PreparedStatement:
    void setObject(int parameterIndex, Object x, int targetSqlType, int scale)
    void setObject(int parameterIndex, Object x, int targetSqlType)
    void addBatch()
    void setArray(int i, Array x)
    void setDate(int parameterIndex, java.sql.Date x, Calendar cal)
    void setTime(int parameterIndex, java.sql.Time x, Calendar cal)
    void setTimestamp(int parameterIndex, java.sql.Timestamp x, Calendar cal)

Ref:
    String getBaseTypeName()

ResultSet:
    void updateObject(int columnIndex, Object x, int scale)
    Object getObject(int i, java.util.Map map)
    Array getArray(int i)
    Object getObject(String colName, java.util.Map map)
    java.sql.Date getDate(int columnIndex, Calendar cal)
    java.sql.Time getTime(int columnIndex, Calendar cal)
    java.sql.Timestamp getTimestamp(int columnIndex, Calendar cal)

ResultSetMetaData:
    int getColumnDisplaySize(int column)
    int getPrecision(int column)
    int getScale(int column)

ConsusStatement:
    void setMaxFieldSize(int max)
    int getQueryTimeout()
    void setQueryTimeout(int seconds)
    void cancel()
    void setCursorName(String name)
    void addBatch(String sql)
    void clearBatch()
    int[] executeBatch()

ConsusDatabaseMetaData:
    ResultSet getProcedures(...)
    ResultSet getProcedureColumns(...)
    ResultSet getColumnPrivileges(...)
    ResultSet getTablePrivileges(...)
    ResultSet getBestRowIdentifier(...)
    ResultSet getVersionColumns(...)
    ResultSet getImportedKeys(...)
    ResultSet getExportedKeys(...)
    ResultSet getCrossReference(...)
    ResultSet getUDT(...)

Delayed transactions and online backup scheduler

Consus supports ACID transactions: once the database reports that a transaction is committed, it is guaranteed that the database will be able to recover the transaction in case of a system fault (except in case of corruption of the database image on the hard disk). The only way to provide this guarantee on standard equipment (without non-volatile RAM, for example) and under general-purpose operating systems (Windows, Unix, ...) is to perform synchronous writes to the disk. "Synchronous" in this context means that the operating system will not return control to the application until the data is really written to the disk. Unfortunately a synchronous write is a very expensive operation; the average disk access time is about 10ms, so it is hard to achieve a performance of more than 100 transactions per second.

In many cases, however, it is acceptable to lose the changes of the last few seconds (while still preserving the consistency of the database). With this assumption, database performance can be significantly increased. Consus provides a "delayed transaction commit model" for such applications. When the transaction commit delay is non-zero, the database doesn't perform the commit immediately but instead delays it for the specified timeout. After the expiration of this timeout, the transaction is committed normally, so only changes done within the specified timeout can be lost in case of a system crash.

If the thread which initiated a delayed transaction starts a new transaction before the delayed commit is performed, then the delayed commit operation is skipped. So Consus is able to group several subsequent transactions performed by one client into a single large transaction. This greatly increases performance, because it reduces the number of synchronous writes and the number of created shadow pages (see the section Transactions).

If some other client tries to start a transaction before the expiration of the delayed commit timeout, then Consus forces the delayed commit to proceed and releases the resources for the other thread. So concurrency doesn't suffer from delayed commits.

By default, delayed commits are disabled (the timeout is zero). To specify a non-zero transaction commit delay, you should set the "consus.commit.delay" property (in milliseconds). This property is inspected at the time the connection is created, so it should be set before that.
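
For example, a two-second commit delay could be configured like this (a minimal sketch; the property name is the one documented above, and the JDBC URL in the comment is illustrative):

```java
// Configure delayed commits before opening the connection.
public class CommitDelayConfig {
    public static void main(String[] args) {
        // Delay transaction commits by up to 2000 milliseconds
        System.setProperty("consus.commit.delay", "2000");
        System.out.println(System.getProperty("consus.commit.delay")); // 2000
        // A connection opened now, e.g. via
        // DriverManager.getConnection("jdbc:consus:test.dbs"),
        // would pick up the configured delay.
    }
}
```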

The transaction commit scheme used in Consus guarantees recovery after software and hardware faults if the image of the database on the disk was not corrupted (i.e. all information which was written to the disk can be correctly read). If for some reason the database file is corrupted, then the only way to recover the database is to use a backup (hoping that it was performed not too long ago).

A backup can be done by simply copying the database file when the database is offline. The class ConsusConnection provides a backup method which is able to perform an online backup that doesn't require stopping the database. It can be called by the programmer at any time. Going further, Consus also provides a backup scheduler, which is able to perform backups automatically. The only things needed are the name of the backup file and the interval of time between backups.

The method ConsusConnection.scheduleBackup(String fileName, long periodMsec) spawns a separate thread which performs backups to the specified location with the specified period (in milliseconds). If fileName ends with the "?" character, then the date of backup initiation is appended to the file name, producing a unique file name. In this case all backup files are kept on the disk (it is the responsibility of the administrator to remove obsolete backup files or transfer them to other media). Otherwise the backup is performed to a file named fileName + ".new", and after completion of the backup the old backup file is removed and the new file is renamed to fileName. Also, in the latter case, Consus will check the creation date of the old backup file (if it exists) and adjust the wait timeout in such a way that the time between backups equals the specified period (so if the database server is started for only 8 hours per day and the backup period is 24 hours, then a backup will be performed each day, unlike the scheme with uniquely generated backup file names).
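
The naming rule can be sketched as follows. This is an illustration only; the exact date format Consus appends is an assumption:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Sketch of the backup-naming rule: a name ending in '?' gets the
// backup initiation date appended to make it unique; otherwise the
// backup is first written to a temporary ".new" file.
public class BackupName {
    public static String backupFile(String pattern, Date when) {
        if (pattern.endsWith("?")) {
            // Hypothetical timestamp format, for illustration only
            String stamp = new SimpleDateFormat("yyyy-MM-dd-HH-mm-ss").format(when);
            return pattern.substring(0, pattern.length() - 1) + stamp;
        }
        return pattern + ".new"; // renamed to `pattern` after completion
    }

    public static void main(String[] args) {
        System.out.println(backupFile("backup?", new Date()));
        System.out.println(backupFile("backup.dbs", new Date()));
    }
}
```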

It is also possible to schedule backup processing by setting the consus.backup.name system property. The period value is taken from the consus.backup.period system property if specified; otherwise it is set to one day. To recover from a backup, it is enough to copy one of the backup files over the corrupted database file.

Using indices in queries

Consus provides two main facilities to reduce query execution time: direct access by OID and index access. Consus uses a B-tree for indexing numeric, date, string and reference fields. The elements of arrays can not be indexed. Access to a record by OID is very fast and takes the same time regardless of the size of the database and the number of records in the tables. Index access to a record requires reading one or more B-tree pages; the maximum number of read pages can be estimated as ceil(log(n, (page_size-4)/2/(4+sizeof_key))), where n is the number of records in the table and sizeof_key is the average size of the key (for string keys add 4 bytes of string header).
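
The estimate can be worked through numerically. The concrete page size and key size below are just example values:

```java
// Worked example of the page-read estimate above:
// ceil(log_b(n)) with branching factor b = (page_size - 4) / 2 / (4 + sizeof_key).
public class BtreeEstimate {
    public static int maxPagesRead(long n, int pageSize, int sizeofKey) {
        double branching = (pageSize - 4) / 2.0 / (4 + sizeofKey);
        return (int) Math.ceil(Math.log(n) / Math.log(branching));
    }

    public static void main(String[] args) {
        // 10 million records, 4Kb pages, 8-byte keys:
        // branching factor is about 170, so at most 4 page reads
        System.out.println(maxPagesRead(10_000_000L, 4096, 8)); // 4
    }
}
```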

Indices can be explicitly created for the table rows by the create index on statement or can be specified in the table creation statement. Consus will create indices for the fields marked with primary key or with the using index clause. It is also possible to specify which fields should be indexed when the database table is created using information from the Java class descriptor. If a Java class has a public static String indices[] component, then Consus assumes that the elements of this array are names of the fields which should be indexed and therefore creates indices for these fields. If a Java class has a public static String primaryKeys[] component, then Consus assumes that the elements of this array are names of the fields which should be marked as primary keys (but they are not automatically indexed unless they are specified in the indices array).

Consus uses simple rules for applying indices, allowing the programmer to predict when an index will be used. Checking for index applicability is done during each query execution, so the decision can be made depending on the values of operands. The following rules describe the algorithm of applying indices in Consus:

If an index is used to search the prefix of a like expression, and the suffix is not just the '%' character, then the index search operation can return more records than actually match the pattern. In this case the index search output is filtered by applying the pattern match operation.
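
The prefix-to-range translation behind this rule can be sketched as follows (an illustration of the technique, not Consus source code):

```java
// Turn a LIKE prefix into an index range scan: for "abc%def" the
// index is scanned over keys in ["abc", "abd"), and the rest of the
// pattern is applied as a filter afterwards.
public class LikePrefix {
    // Longest literal prefix of a LIKE pattern (up to the first wildcard)
    public static String prefix(String pattern) {
        int i = 0;
        while (i < pattern.length()
               && pattern.charAt(i) != '%' && pattern.charAt(i) != '_') {
            i++;
        }
        return pattern.substring(0, i);
    }

    // Exclusive upper bound of the range: increment the last character
    public static String upperBound(String p) {
        return p.substring(0, p.length() - 1)
             + (char) (p.charAt(p.length() - 1) + 1);
    }

    public static void main(String[] args) {
        String pat = "abc%def";
        String lo = prefix(pat);     // "abc"
        String hi = upperBound(lo);  // "abd"
        System.out.println(lo + " .. " + hi);
        // Keys in [lo, hi) may still fail the "%def" suffix, so the
        // pattern match must be applied to the index search output.
    }
}
```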

When the search condition is a disjunction of several sub-expressions (an expression containing several alternatives combined by the or operator), then several indices can be used for query execution. To avoid record duplicates, a bitmap is used in the cursor to mark records already included in the selection.

Even if the search condition requires a sequential table scan, a B-tree index can still be used if the order by clause contains a single record field for which a B-tree index is defined. Sorting is a very expensive operation, so using an index instead of sorting significantly reduces query execution time.

When an index search is performed in a referenced table (i.e. when the expression is something like "ref.x = 'foo'", where ref is a reference to some table A), then Consus actually performs several index searches: first it searches for records in table A with the value of field x equal to 'foo', and then, for each OID of a record selected from table A, Consus searches the current table for records with the value of field ref equal to the OID of the record from table A.

To merge the index search results of the operands of an OR expression, Consus uses a bitmap to avoid duplicates of objects in the result set. But if you specify the ALL qualifier in the select statement, then the check for duplicates is omitted and the query is executed a little faster (and no bitmap is allocated). If there is a large number of objects in the database (more than 10 million), the object bitmap can consume significant space. Merging the results of index searches is the only operation for which Consus implicitly uses object bitmaps.

Tuples vs. Objects

JSQL can execute queries in two modes: tuple mode and object mode. In object mode the result of a query execution is a set of objects. It is not necessary to specify the list of selected columns in this case (you can write "select from" or "select * from"). Selection of objects provides the best performance, but there are some limitations to this mode: not all SQL-89 constructions can be executed in it.

First of all, in object mode objects are always retrieved from one table; no joins are possible in this mode. Group functions (avg, max, min, sum, count) also can not be used, and no subqueries are possible. But it is possible to use the iterator construction (start from ... following by) for selecting objects. UPDATE and DELETE statements always use object mode in their conditions. Updating, deleting and inserting records in a cursor is only possible if the select was performed in object mode. In tuple mode, Consus can execute almost any valid SQL-92 query: subqueries, joins, natural joins, group functions, group by and having clauses all work in this mode. But retrieving tuples requires more space and CPU time than selecting objects: in object mode only object identifiers (OIDs) are stored in the cursor, while in tuple mode the value of each column is stored as a separate Java object. The iterator construction can not be used with tuples. Remote queries can only be executed in tuple mode.

Tuple mode can not be specified explicitly; the compiler itself chooses the mode based on the features used in the query. If, for example, the select statement contains several tables in the FROM list or the column list is not empty, then the selection is done in tuple mode. The following table summarizes the differences between tuple and object modes:

Feature                                      Object mode   Tuple mode
Select from more than one table              -             +
Table joins                                  -             +
Subqueries                                   -             +
GROUP BY and HAVING clauses                  -             +
Group functions                              -             +
Remote query execution                       -             +
START FROM .. FOLLOWING BY clause            +             -
Get OID of record                            +             -
Fetch record as Java object                  +             -
Cursor insert/update/delete operations       +             -
Condition in UPDATE and DELETE statements    +             -

Finding the optimal plan of execution of an SQL statement is a very complex and challenging task, requiring a lot of work from the query optimizer. It is not possible to find the optimal plan by performing only semantic analysis of the query: optimization greatly depends on the distribution of data in the tables involved. That is why most modern RDBMSes keep statistical information about the number of records in each table, the selectivity of keys, etc. So building optimal query execution plans requires a lot of extra CPU time for gathering statistics and for the optimization itself. There is one more problem: it is hard for the programmer to guess which approach will be used by the RDBMS for the execution of a query, and difficult to understand why the RDBMS considers one approach better than another. Consus takes a different approach to optimization, based on two principles:

  1. The database should do its best to provide the fastest execution of any query that can be executed fast, i.e. by applying indices where possible.
  2. The programmer should have full control over optimizations - optimization rules should be simple and clear to the programmer.
There are almost no problems with optimization in object mode. There is only one table and no joins, so the task of the optimizer comes down to locating indices which can be applied to replace a sequential search with more efficient access methods. The rules for optimization in object mode are described in the previous section.

In tuple mode the number of possible execution plans is very large. There are subqueries, joins, unions and group functions, so the search space for the optimizer is very large. Following the two principles mentioned above, Consus uses the following rules for the optimization of queries:

  1. Tables are taken in the order in which they are specified in the query. Each table is assigned an index representing its position in FROM list (starting from 1).
  2. The search predicate is split into a set of conjuncts, and then the conjuncts are sorted so that the expressions accessing the tables with smaller indices are checked first. Literals are considered to belong to the table with an index of 0.
  3. A Cartesian product of table records is built using nested loops. When we are retrieving records from the table with the index i, we execute all conjuncts which refer only to this table or to the tables with smaller indices. If it is possible to use the B-tree for selecting records matching search criteria, then the sequential search through the table is replaced with the index search.
When the DISTINCT qualifier is present in the query, all selected tuples are sorted by all columns and then duplicates are removed. If the ORDER BY clause is present in the statement together with the DISTINCT qualifier, then the fields in ORDER BY are compared first during sorting, so the sort operation is performed only once. In case of the GROUP BY construction, retrieved tuples are also sorted by the list of specified fields.
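
The single-sort strategy for DISTINCT with ORDER BY can be sketched as follows (an illustration of the technique with string tuples, not Consus source code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sort tuples with the ORDER BY column compared first and all
// remaining columns after it; because fully-equal tuples end up
// adjacent, duplicates can be removed in one pass.
public class DistinctOrderBy {
    public static List<String[]> distinctOrdered(List<String[]> tuples, int orderByCol) {
        tuples.sort((x, y) -> {
            int c = x[orderByCol].compareTo(y[orderByCol]); // ORDER BY field first
            if (c != 0) return c;
            for (int i = 0; i < x.length; i++) {            // then all columns
                c = x[i].compareTo(y[i]);
                if (c != 0) return c;
            }
            return 0;
        });
        List<String[]> result = new ArrayList<>();
        for (String[] t : tuples) {
            if (result.isEmpty() || !Arrays.equals(result.get(result.size() - 1), t)) {
                result.add(t); // drop adjacent duplicates
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>(List.of(
            new String[]{"b", "2"}, new String[]{"a", "1"},
            new String[]{"a", "1"}, new String[]{"a", "0"}));
        for (String[] r : distinctOrdered(rows, 0)) {
            System.out.println(String.join(",", r)); // a,0  a,1  b,2
        }
    }
}
```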

Consus uses the Quicksort algorithm for sorting records. To reduce the number of object loads while sorting in object mode, Consus first extracts the sort keys into a separate array (part of key in case of strings), then sorts this array, and finally refines the order by performing a comparison of all columns mentioned in the ORDER BY list. In tuple mode, Consus directly sorts the array of selected tuples using the java.lang.Comparable interface implemented by objects representing the value of each column.

Consus optimizes the execution of subqueries by checking the dependencies of the subquery expression. The result returned by the subquery execution is saved and only recalculated if the subquery expression refers to the fields from the enclosing scope.

Consus Implementation Issues

This section describes some aspects of Consus implementation. It is not necessary to read this section unless you want to know more about the Consus internals.

Memory Allocation

Memory allocation in Consus is performed using a bitmap. Memory is allocated in chunks called allocation quanta. In the current version of Consus the size of the allocation quantum is 64 bytes, which means that the size of every allocated object is aligned on a 64 byte boundary. Each 64 bytes of database memory is represented by one bit in the bitmap. To locate a hole of the requested size in the bitmap, Consus sequentially searches the bitmap pages for the corresponding number of successive cleared bits. Consus uses three arrays indexed by bitmap byte, which makes possible fast calculation of the hole offset and size within the byte.

Consus performs cyclic scanning of the bitmap pages. It keeps the identifier of the current bitmap page and the current position within the page. Each time an allocation request arrives, scanning of the bitmap starts from the current position. When the last allocated bitmap page is scanned, scanning continues from the beginning (from the first bitmap page) up to the current position. When no free space is found after a full pass through all bitmap pages, a new bulk of memory is allocated. The size of the extension is the maximum of the size of the allocated object and the extension quantum. The bitmap is extended to be able to map the additional space. If the virtual space is exhausted and no more bitmap pages can be allocated, then an OutOfMemory error is reported.
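
The cyclic scan can be sketched with a boolean array standing in for the bitmap. This is a simplified model, not the Consus allocator (which works on packed bits and byte-indexed lookup tables):

```java
// Simplified bitmap allocator: one bit per 64-byte quantum;
// allocation finds a run of cleared bits starting from the current
// scan position, wrapping around once.
public class BitmapAlloc {
    boolean[] bits;  // true = quantum in use
    int current = 0; // cyclic scan position

    public BitmapAlloc(int quanta) { bits = new boolean[quanta]; }

    // Returns the first quantum of the allocated run, or -1 if no hole
    // was found (the real allocator would then extend the file)
    public int allocate(int quanta) {
        int n = bits.length;
        for (int attempt = 0, start = current; attempt < n; attempt++) {
            int pos = (start + attempt) % n;
            boolean free = pos + quanta <= n;
            for (int i = 0; free && i < quanta; i++) free = !bits[pos + i];
            if (free) {
                for (int i = 0; i < quanta; i++) bits[pos + i] = true;
                current = pos + quanta; // next scan resumes here
                return pos;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        BitmapAlloc bm = new BitmapAlloc(8);
        System.out.println(bm.allocate(3)); // 0
        System.out.println(bm.allocate(2)); // 3
        System.out.println(bm.allocate(4)); // -1: only 3 quanta remain
    }
}
```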

Allocation of memory using bitmap provides high locality of references (objects are mostly allocated sequentially) and also minimizes the number of modified pages. Minimization of the number of modified pages is significant when the commit operation is performed and all dirty pages should be flushed on the disk. When all cloned objects are placed sequentially, the number of modified pages is minimal and so transaction commit time is also reduced. Using larger extension quantum also helps to preserve sequential allocation. Once the bitmap is extended, objects will be allocated sequentially until the extension quantum is completely used. Only after reaching the end of the bitmap does scanning restart from the beginning, searching for holes in the previously allocated memory.

To reduce the number of bitmap page scans, Consus associates a descriptor with each page, which is used to remember the maximal size of the hole on the page. The calculation of the maximal hole size is performed in the following way: if an object of size M can not be allocated from a bitmap page, then the maximal hole size on that page is less than M, so M is stored in the page descriptor if the previous value of the descriptor is larger than M. For subsequent allocations of objects of size greater than or equal to M, this bitmap page is skipped. The page descriptor is reset when some object is deallocated within this bitmap page.

Some database objects (like B-tree pages) should be aligned on a page boundary to provide more efficient access. The Consus memory allocator checks the requested size, and if it is aligned on a page boundary, then the address of the allocated memory segment is also aligned on a page boundary. The search for a free hole is done faster in this case, because Consus increases the step of the current position increment according to the value of the alignment.

To be able to deallocate the memory used by an object, Consus needs to keep information about the object's size somewhere. The Consus memory allocator deals with two types of objects: normal table records and page objects. All table records are prepended by a record header, which contains the record size, type and the pointers of an L2-list linking all records in the table. The size of a table record object can be extracted from this record header. Page objects always occupy a whole database page and are allocated at positions aligned on a page boundary. Page objects have no headers. Consus distinguishes page objects from normal objects by using a special marker in the object index.

By default the maximal database size supported by Consus is limited to one terabyte. It is possible to increase (or reduce) this value by specifying the value of the dbDatabaseOffsetBits parameter. The default value of this parameter is 37. Consus is not able to handle more than 1G objects because the OID is represented by an integer type.

Transactions

Each record (object) in Consus has a unique identifier (OID). Object identifiers are used to implement references between objects. To locate an object by reference, its OID is used as an index in the array of object offsets within the file. This array is called the object index, and an element of this array is called an object handle. There are two copies of the object index in Consus, one of which is current and the other its shadow. The header of the database contains pointers to both object indices and indicates which index is current at the moment.

When an object is modified for the first time, it is cloned (a copy of the object is created) and the object handle in the current index is changed to point to the newly created object copy. The shadow index still contains the handle which points to the original version of the object. All changes are done to the object copy, leaving the original object unchanged. Consus marks, in a special bitmap, the page of the object index which contains the modified object handle.

When the transaction is committed, Consus first checks if the size of the object index has increased during the current transaction. If so, it reallocates the shadow copy of the object index. Then Consus frees the memory for all the "old objects", i.e. objects which have been cloned within the transaction. Memory cannot be deallocated before commit, because we want to preserve the consistent state of the database by keeping the cloned objects unchanged. If we deallocated memory immediately after cloning, a new object could be allocated in place of the cloned object and we would lose consistency. Memory deallocation in Consus is done through the bitmap, using the same transaction mechanism as for normal database objects: deallocation of object space requires clearing some bits in a bitmap page, which also should be cloned before modification. Cloning the bitmap page requires new space for the allocation of the page copy. We could reuse the space of deallocated objects, but this is not acceptable for the reason explained above: we would lose database consistency. That is why deallocation of an object is done in two steps. When the object is cloned, all bitmap pages used for marking the object's space are also cloned (if they were not cloned before). So when the transaction is committed, we only clear the bits in the bitmap pages, and no more memory allocation requests can be generated at this moment.

After deallocation of old copies, Consus flushes all modified pages to disk to synchronize the content of the memory and the disk file. After that, Consus changes the current object index indicator in the database header to switch the roles of the object indices: the object index which was current becomes the shadow, and the shadow index becomes current. Then Consus again flushes the modified page (i.e. the page with the database header) to disk, transferring the database to the new consistent state. After this, Consus copies all the modified handles from the new current object index to the index which was previously the shadow. At this point, the contents of both indices are synchronized and Consus is ready to start a new transaction.

A bitmap of modified object index pages is used to minimize transaction commit time: only the modified pages of the object index, rather than the whole index, need to be copied. After the transaction is committed, the bitmap is cleared.

When a transaction is explicitly aborted by the Connection.rollback method, the shadow object index is copied back to the current index, eliminating all changes made by the aborted transaction. After the copying is finished, both indices are identical again and the database state corresponds to the moment before the start of the aborted transaction.

Allocation of object handles is done through a free handle list. The header of this list is also shadowed, and two instances of the list header are stored in the database header. Switching between them is done in the same way as switching between the object indices. When there are no more free elements in the list, Consus allocates handles from the unused part of the index. When there is no more space in the index, the index is reallocated. The object index is the only entity in the database that is not cloned on modification; instead, two copies of the object index are always maintained.

There are some predefined OID values in Consus. OID 0 is reserved as an invalid object identifier. OID 1 is used as the identifier of the metatable object - a table containing the descriptors of all other tables in the database. This table is automatically constructed on database initialization, and the descriptors of all registered application classes are stored in it. OIDs starting from 2 are reserved for bitmap pages. The number of bitmap pages depends on the database's maximum virtual space. For one terabyte of virtual space with a 4Kb page size and a 64-byte allocation quantum, 64K bitmap pages are required, so 128K handles are reserved in the object index for the bitmap. Bitmap pages are allocated on demand as the database grows. The OID of the first user object is therefore 0x10002.
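As an illustration of handle allocation, here is a minimal free-list allocator that hands out OIDs starting at 0x10002 once the free list is empty (a sketch with assumed names; the real allocator also shadows the list header as described above):

```java
import java.util.ArrayDeque;

// Minimal sketch of OID allocation: take a handle from the free list if
// possible, otherwise extend the index. 0x10002 is the first OID available
// to user objects (after the invalid OID, the metatable and bitmap pages).
class HandleAllocator {
    private final ArrayDeque<Integer> freeList = new ArrayDeque<>();
    private int indexTop = 0x10002;

    int allocate() {
        Integer oid = freeList.poll();          // reuse a freed handle if any
        return oid != null ? oid : indexTop++;  // else grow the index
    }

    void release(int oid) {
        freeList.push(oid);                     // return the handle to the list
    }
}
```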

Recovery

The recovery procedure in Consus is trivial. There are two instances of the object index, one of which is current while the other corresponds to the consistent database state. When opening a database, Consus checks the database header to detect whether the database was closed normally. If not (the dirty flag is set in the database header), Consus performs database recovery. Recovery is very similar to transaction rollback. The current index indicator in the database header is used to determine the index corresponding to the consistent database state, and object handles from this index are copied to the other index, eliminating all changes made by the uncommitted transaction. Since the only action performed by recovery is copying the object index (actually, only handles with different values in the current and shadow indices are copied, to reduce the number of modified pages), and the object index is usually small, recovery can be done very fast. The fast recovery procedure reduces the "out-of-service" time of the application.

There is one hack in Consus to increase database performance. All records in a table are linked in an L2-list (doubly-linked list), allowing efficient traversal of the list and insertion/removal of records. The header of the list is stored in the table object (which is a record of the Metatable table). The L2-list pointers are stored at the beginning of each object, together with the object size. New records are always appended to the end of the list. To make inclusion in the list consistent, we would have to clone the last record in the table as well as the table object itself. But since the record size can be large, cloning the last record for every inserted record could cause significant space and time overhead.

To eliminate this overhead, Consus does not clone the last record, allowing a temporary inconsistency in the list. What state will the list be in if a system fault happens before the transaction commits? The consistent version of the table object will point to the record that was last in the previous consistent state of the database. But since this record was not cloned, it can contain a pointer to a next record that does not exist in this consistent database state. To fix this inconsistency, Consus checks all tables in the database during the recovery procedure, and if the last record in a table contains a non-null next reference, the reference is set to null to restore consistency.
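The fix-up performed during recovery can be sketched as follows (illustrative class names; the real code walks the on-disk record list of every table):

```java
// Sketch of the post-crash list repair: the consistent table object points
// at the last record of the previous consistent state; if that record still
// references a record created by the uncommitted transaction, the dangling
// "next" pointer is cleared.
class Rec {
    Rec next;
}

class TableFixup {
    Rec last; // last record according to the consistent table object

    void fixAfterRecovery() {
        if (last != null && last.next != null) {
            last.next = null; // drop the reference to the lost record
        }
    }
}
```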

If the database file itself was corrupted on disk, the only way to recover the database is to use a backup file (if you did not forget to make one). Backup files can be made with the ConsusConnection.backup method (online backup) or by simply copying the database file when no database application is active. Since the database file is always in a consistent state, the only thing needed to recover from a backup file is to replace the original database file with the backup file. If the backup was stored on tape or some other external device, it should first be extracted to disk.

If an application starts a transaction, locks the database and then crashes, the database is left in a locked state and no other application can access it. To recover from this situation you should stop all applications working with the database. The first application to open the database afterwards will initialize the database monitor and perform crash recovery.

B-tree

The B-tree is the classical structure for implementing database indices. A B-tree minimizes the number of disk accesses needed to locate data by key. Since disk access can be the most expensive operation, minimizing disk reads reduces query execution time. The main idea of the B-tree is to produce a balanced tree with large width and small depth. All leaf pages in a B-tree are at the same depth (distance from the root page); this distance is called the tree height. The number of pages accessed during an index search is equal to the tree height, so reducing the tree height minimizes the number of disk operations.

All pages of the B-tree contain key values and references to objects. For scalar values, key values and object references are stored in two arrays that grow towards each other: the array of key values grows from the beginning of the page towards the end, and the object reference corresponding to the key value in the i-th position is stored at offset (Page.pageSize - (i-1)*4). For string keys, a B-tree page contains an array of elements, each holding an object reference, a string size and the offset of the string body within the page. String bodies are allocated starting from the end of the page.
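The two-arrays layout for scalar keys can be illustrated with a ByteBuffer (a sketch using zero-based indexing, assumed 8-byte keys and 4-byte references; the exact Consus on-page format may differ):

```java
import java.nio.ByteBuffer;

// Keys grow from the beginning of the page, object references from the end.
class PageLayoutDemo {
    static final int PAGE_SIZE = 4096;
    final ByteBuffer page = ByteBuffer.allocate(PAGE_SIZE);

    void put(int i, long key, int oid) {
        page.putLong(i * 8, key);                  // i-th key from the start
        page.putInt(PAGE_SIZE - (i + 1) * 4, oid); // i-th reference from the end
    }

    long key(int i) {
        return page.getLong(i * 8);
    }

    int oid(int i) {
        return page.getInt(PAGE_SIZE - (i + 1) * 4);
    }
}
```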

To keep the tree height minimal, the B-tree places a restriction on the minimal number of nodes per page. All internal pages except the root should be at least half full (this criterion can be changed). String keys have different lengths, so it is not possible to limit the number of nodes on a page; instead, a limit on the used page size is specified. If more than half of a page is free (underflow), reconstruction of the tree is needed.

All operations on the B-tree (insert/find/remove) have log(N) complexity, where N is the number of nodes in the tree (the size of the table). Elements are stored on B-tree pages in ascending order, so a binary search can be used to locate an element within a page.
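Locating an element within a sorted page can then be done with an ordinary binary search (a sketch over plain int keys):

```java
// Binary search over the first n sorted keys of a page: returns the
// position of the first key >= the search key, i.e. the slot an index
// lookup would descend into.
class PageSearch {
    static int find(int[] keys, int n, int key) {
        int lo = 0, hi = n;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1; // unsigned shift avoids overflow
            if (keys[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```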

When a new item is inserted into the tree, adding the new element to a page can cause page overflow. In this case a new page is created and half of the nodes from the overflowing page are moved to the newly created page. Creation of a new page causes insertion of a new element into the parent page; the propagation of inserts can continue up to the root of the tree. If the root page is split, a new root page is created and the tree height increases.
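A page split can be sketched as follows (keys modeled as a sorted int array; the first key of the new right page is what gets inserted into the parent):

```java
import java.util.Arrays;

// Split an overflowing sorted page in half; the caller inserts the first
// key of the returned right page into the parent page.
class PageSplit {
    static int[][] split(int[] page) {
        int mid = page.length / 2;
        int[] left = Arrays.copyOfRange(page, 0, mid);
        int[] right = Arrays.copyOfRange(page, mid, page.length);
        return new int[][] { left, right };
    }
}
```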

When an item is removed from a page, page underflow can happen - more than half of the page becomes unused. In this case a neighboring page is examined, and either the neighboring page is merged with the underflowing page, or nodes are redistributed between the underflowing page and its neighbor to restore the tree invariants. As with insert operations, the propagation of removals can reach the root page; when only a single element is left on the root page, the root page is deallocated and the tree height decreases.

Utilities

Consus provides several utilities for database administration. These utilities are included in the org.garret.consus package (stored in the consus.jar archive), so you should invoke them with the "org.garret.consus" prefix, for example:
     java org.garret.consus.JSQL mydatabase.dbs script.sql

Interactive JSQL utility

Consus has an interactive JSQL utility that can be used for browsing a database and performing some administrative actions on it. Do not run it while another application is actively working with the database.

To run the JSQL utility, type:

     java org.garret.consus.JSQL <database-login> {sql-file...}
For local connections, your database login should be the name of the database file. For remote connections, the login string has the following format:
    <database-login> ::= user:password@host:port
The user and password fields can be omitted; in this case an anonymous connection takes place. Host should be either the symbolic name of the host computer or an IP address. Port is the port number at which the Consus server can be found.
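Parsing of this login string can be sketched as follows (a hypothetical helper written for this example, not part of the Consus API):

```java
// Splits "user:password@host:port" (user/password optional) into its parts.
class LoginString {
    String user, password, host;
    int port;

    LoginString(String login) {
        int at = login.indexOf('@');
        if (at >= 0) {                       // credentials are present
            String[] cred = login.substring(0, at).split(":", 2);
            user = cred[0];
            password = cred.length > 1 ? cred[1] : "";
            login = login.substring(at + 1);
        }
        int colon = login.lastIndexOf(':');  // separates host from port
        host = login.substring(0, colon);
        port = Integer.parseInt(login.substring(colon + 1));
    }
}
```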

The JSQL utility reads SQL statements from the files specified in the command string. If an error occurs, JSQL reports it with the line number relative to the beginning of the file. Once all specified files have been processed, JSQL switches to interactive mode and starts reading commands from the standard input.

You can execute any JSQL statement and see the results in ASCII dump form. Each statement should be terminated with the ';' character; one statement can be split over several lines. A JSQL session is terminated by the exit command (without ';'). You can get information about the tables stored in the database by selecting from the predefined Metatable table. It is not possible to select all columns from the Metatable in remote mode, because information about record columns is stored in the fields array, and arrays (except arrays of byte) cannot be passed through the remote connection protocol.

Universal Import Data Utility

Consus provides a Loader utility for importing data from other databases and data sources. This utility can import data from:
  1. Another JDBC database (a database with a JDBC or ODBC driver). The Loader utility uses the JDBC metadata methods to determine the format of the imported data. You can specify patterns for the table name and schema, as well as the catalog from which the data should be imported. If these parameters are omitted, all tables from the source database are imported into the target database. If an imported table doesn't exist, it is created.
  2. Dbase (*.DBF) files. Loader is able to read data from Dbase-III+ compatible files. Fields of type C, D, N, L and I are supported. Memo fields and *.DBT files are not supported by this version. You can specify the name of the table in the command string. If no table name is specified, it is derived from the name of the DBF file (by truncating the .DBF suffix). If the imported table doesn't exist, it is created.
  3. ASCII text files. Loader uses the java.io.StreamTokenizer class for parsing ASCII text. The first line of the file should contain the list of columns. You can specify the name of the table in the command string. If no table name is specified, it is derived from the name of the TXT file (by truncating the .TXT suffix). If the imported table doesn't exist, it is created.

It is possible to invoke Loader in three different ways:

java org.garret.consus.Loader target-database-login source-database-login [table-pattern [scheme-pattern [catalog]]]
java org.garret.consus.Loader target-database-login DBF-file-name [tableName]
java org.garret.consus.Loader target-database-login TXT-file-name [tableName]
Where database-login is either the name of a database file (for a local connection to a Consus database) or a string in the following format:
    database-login ::= database-file-name | user:password@driver-class:URL
Here driver-class is the fully qualified name of the JDBC driver class used, and URL is the driver-specific string identifying the database. The user and password fields can be omitted; in this case an anonymous connection is established.

Examples of using the import utility:

java org.garret.consus.Loader file1.dbs file2.dbs
Import data from all tables from local Consus database file2.dbs to local Consus database file1.dbs
java org.garret.consus.Loader file1.dbs file2.dbs MyTable
Import table MyTable from local Consus database file2.dbs to local Consus database file1.dbs
java org.garret.consus.Loader file1.dbs john:welcome@org.garret.consus.ClientDriver:jdbc:consus-client:altair:5100
Import data from the remote Consus database located at host altair (with the server started at port 5100), connecting as user john with password welcome, into the local Consus database file1.dbs
java org.garret.consus.Loader file1.dbs @COM.cloudscape.core.JDBCDriver:jdbc:cloudscape:CloudscapeDB
Import data from Cloudscape database CloudscapeDB to the local Consus database file1.dbs
java org.garret.consus.Loader file1.dbs Orders.dbf
Import data from the Dbase file Orders.dbf to the Orders table of the local database file1.dbs
java org.garret.consus.Loader file1.dbs SP.dbf Delivery
Import data from the Dbase file SP.dbf to the Delivery table of the local database file1.dbs
java org.garret.consus.Loader file1.dbs Orders.txt
Import data from the ASCII text file Orders.txt to the Orders table of the local database file1.dbs

Remote Connections server

Although Consus was primarily designed for the role of an embedded database engine, it can also be used as a remote database server. Consus provides the org.garret.consus.ClientDriver class, which transfers JDBC calls to the server and receives results from it. Currently only JDBC 1.0 is supported for remote connections (no cursor update/insert/delete operations are supported), and none of the Consus object-oriented JDBC protocol extensions are available in remote mode. For remote connections Consus always works in tuple mode (not in object mode).

To be able to establish a connection with a remote server, the Consus application should load the org.garret.consus.ClientDriver class and specify the database URL in the following format:

      URL ::= jdbc:consus-client:host:port
Where host is the symbolic name or IP address of the server node, and port is an integer constant specifying the port on which the server is listening.

On the server side you should start the server process:

      java org.garret.consus.Server database-file-name port [authenticator-class]
The server creates a thread for each client connection. Each client runs its own transaction, so clients are isolated from each other, and changes made by one client cannot be seen by another client until the transaction is committed. Currently Consus supports the multiple-readers-single-writer synchronization scheme. This means that if one client starts a modifying transaction, all other clients will be blocked.

If an exception is thrown in a client thread (it can be caused by errors in the SQL statements passed or by a runtime error during query execution), it is caught by the server, transferred to the client side and rethrown on the client. When the connection with a client is closed normally, as a result of the Connection.close method invoked by the client, the current transaction is committed and the client's thread is terminated. If the connection with the client is broken due to a network problem or client process termination, the current transaction is aborted, a rollback is performed, and the locks set by the client process are released.

By default, the Consus server accepts connections from any client that knows the server host name and port. It is possible to use an authentication protocol to allow connections only from trusted users. Consus provides the Authenticator interface; by implementing it, you can create different authentication mechanisms. The methods of this interface are described below:


boolean authenticate(ConsusConnection conxn, Properties userInfo) throws SQLException;
Performs client authentication. Information about the user is passed through the Properties structure; usually it contains the "user" and "password" properties. The Consus server adds a "host" property in which it stores the remote Inet address of the connected socket. This structure is specified while establishing a connection with the server via the Driver.connect(String url, java.util.Properties info) method.
Parameters
conxn local connection with the database
userInfo information about the user
Returns
true if the client is authenticated (connection is accepted) or false if connection is refused.

boolean add(ConsusConnection conxn, Properties userInfo) throws SQLException; 
Adds a new user.
Parameters
conxn local connection with the database
userInfo information about new user
Returns
true if user is successfully added, false otherwise.

boolean remove(ConsusConnection conxn, Properties userInfo) throws SQLException; 
Removes user.
Parameters
conxn local connection with the database
userInfo information to identify the user.
Returns
true if the user is removed successfully, false otherwise.

boolean change(ConsusConnection conxn, Properties userInfo) throws SQLException;
Change user properties.
Parameters
conxn local connection with the database
userInfo information to identify the user and new values of the user properties. For example, if only user and password properties are used, the value of user property is used to locate the user and password specifies the new user's password.
Returns
true if the user properties are changed successfully, false otherwise.
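A minimal in-memory implementation of this protocol might look like the following (a simplified sketch: it keys users by the "user" and "password" properties and omits the ConsusConnection parameter and SQLException declared by the real interface):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// In-memory user registry following the Authenticator method contract.
class SimpleAuthenticator {
    private final Map<String, String> users = new HashMap<>();

    boolean authenticate(Properties info) {
        String expected = users.get(info.getProperty("user"));
        return expected != null && expected.equals(info.getProperty("password"));
    }

    boolean add(Properties info) {
        // refuse to overwrite an existing user
        return users.putIfAbsent(info.getProperty("user"),
                                 info.getProperty("password")) == null;
    }

    boolean remove(Properties info) {
        return users.remove(info.getProperty("user")) != null;
    }

    boolean change(Properties info) {
        String user = info.getProperty("user");
        if (!users.containsKey(user)) {
            return false;
        }
        users.put(user, info.getProperty("password"));
        return true;
    }
}
```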

If you specify an authenticator class name to the Server utility, it will invoke the authenticate method for each client connection request. The Server also makes it possible to register new users, change their passwords and remove users by means of the add, change and remove methods of the Authenticator interface; use the server commands add, change and remove for these purposes.

Consus provides a basic implementation of the Authenticator interface in the org.garret.consus.StdAuthenticator class. It uses only the user and password properties. Information about users is stored in the org.garret.consus.UserInfo table. The server command add asks for the new user's name and password, change asks for the user's name and new password, and remove asks only for the user's name. This implementation of the authentication protocol is not secure, because the user name and password are transferred over the network without any encryption.

Below is an example of the remote execution of JSQL with StdAuthenticator. On the server side, the scenario for starting the server and registering a new user looks like this:

$ java org.garret.consus.Server test.dbs 5100 org.garret.consus.StdAuthenticator
> add
Adding new user
User: guest
Password: welcome
User added
>
On the client side, start JSQL in the following way (assuming that the server is a computer with the name "altair"):
$ java org.garret.consus.JSQL guest:welcome@altair:5100
> 
To stop the server process, type exit.

By default, the connection between server and client is made through TCP/IP sockets. But if both client and server are running on the same computer, this is not an efficient way of communicating between two processes (especially on Windows). For fast communication between a client and a server on the same computer, Consus provides a native local socket implementation, jnilocalsock.dll, which is ten times faster than standard Windows sockets. The client will use this library if you specify "localhost" as the server address. The jnilocalsock.dll library is located in consus/lib, and this directory should be appended to the PATH environment variable. If the library is not in the PATH or fails to load, Consus will use standard sockets.

Database Dump Utility

The Dump utility allows you to get information about a database file: total and used size, number of used and allocated objects, level of fragmentation, and so on. You should specify the database file name as a parameter to this utility:
    java org.garret.consus.Dump guess.dbs
Below is an example of the dump utility output:

C:\Consus-1.1>java org.garret.consus.Dump guess.dbs
Database size: 1077248
Index size: 66048
Used index size: 65547
Bitmap pages: 1
Number of used objects: 0
Number of free objects: 0
Number of used pages: 0
Free size: 6336
Used size: 1066816
Number of holes: 5
Min. hole size: 64
Max. hole size: 4544
Holes with size 64 bytes: 2
Holes with size 256 bytes: 1
Holes with size 1408 bytes: 1
Holes with size 4544 bytes: 1

How To Use Consus

To be able to use the Consus classes, you should add the path to the Consus class archive to the CLASSPATH environment variable. For example, let's say you have extracted the Consus distribution on a Windows system in the directory C:\Consus. The CLASSPATH environment variable should be modified/set by:
      set CLASSPATH=c:\Consus\lib\consus.jar;.;%CLASSPATH%
To establish a connection with the Consus JDBC driver, you should first load the org.garret.consus.ConsusDriver class, and then invoke the DriverManager.getConnection(URL + FILE) method, where URL is "jdbc:consus:" and FILE specifies the name of the database file. For example:
    static public void main(String args[]) throws Exception { 
	// Parameter for Consus
	String DRIVER =   "org.garret.consus.ConsusDriver";
	String URL    =   "jdbc:consus:";
	String FILE   =   "TestPerf.dbs";

	Class.forName(DRIVER);
	Connection connection = DriverManager.getConnection(URL + FILE);
	...
     }
It is possible to split a Consus database between several physical files. Consus provides four types of files:
  1. java.io.RandomAccessFile - standard java file class for random access
  2. org.garret.internal.consus.RandomAccessMultiFile - a multifile consisting of several physical segments (the size of each segment except the last one should be specified). This class overcomes operating system limitations on maximum file size and allows you to place the database on several disk partitions.
    To use this type of file you should create a file with the ".mfile" extension. Each line of this file contains the path to a physical file and the size of the segment in kilobytes (except the last line, where only the path, without a size, should be specified).
  3. org.garret.internal.consus.Raid0File - RAID 0 implementation: file blocks are scattered between several physical files (the first block in the first file, the second block in the second file, ..., the K-th block in file K modulo N, where N is the number of RAID segments).
    To use this type of file you should create a file with the ".raid0" extension. The first line of this file contains the size of the RAID 0 block in kilobytes. The other lines describe the RAID segments: each line should contain the path to a physical file (partition) and an optional offset within this partition (in kilobytes).
  4. org.garret.internal.consus.Raid1File - RAID 1 implementation: data is duplicated in all specified physical files (so the content of all physical files is the same, and in case of corruption of one disk, the database image from the other disks can be used). This type of file also optimizes read access by splitting read requests between several files (which is certainly an advantage if each physical file is served by its own disk controller).
    To use this type of file you should create a file with the ".raid1" extension. Each line of this file should contain the path to a physical file (partition) and an optional offset within this partition (in kilobytes).
So common rules for multifile description files: