Final Report NewSQL

Table of Contents
1 Abstract
  Why another language?
  What about existing databases?
  What about Object-Relation tools and Object Databases?
  Project Goals
  What was realised
2 Requirements and Targets
  2.1 Current Situation and Motivation
  2.2 Project Targets and Scope
    Development Platform
    Development Tools
    Tested Operating System
    Hardware
    Language Features
    Survey
    NewSQL to SQL Translator
    Testing Requirements
    Benchmark Application
    Query Wizard
    Demo Application
    Popular Database
3 Project Organization
  3.1 Approach / Processes
  3.2 Schedule and Project Organization
    Milestones / Reviews
    Working Hours
    Meetings with the Tutors
4 Comparison
  4.1 The SQL Standard
    History of SQL
    Contents of the SQL Standards
    SQL92
    SQL99
    SQL2003
    Books about SQL99 and SQL2003
    SQL Grammar (Overview)
    Problems of SQL
  4.2 History of OQL / ODMG
    About ODMG
    ODMG History
    Books
    OQL Grammar (Overview)
5 NewSQL Features
  5.1 NewSQL Concepts and Terms
    Databases
    Tables
    Columns
    Indexes
    Users
    Roles

2003-12-10 Page 1
    Other terms
  5.2 NewSQL Language Features (Commented)
    Language Style (Look and Feel)
    General Grammar Style
    Using Semicolons
    Identifiers
    Creating a Table
    Autoincrement columns
    Data Types and Constants
    Dropping a Table
    Retrieving data
    Explicit versus Implicit Joins
    Search Conditions
    Null handling
    Expressions
    Adding data
    Changing data
    Deleting data
    Transactions
    Keywords
  5.3 NewSQL Grammar Reference
  5.4 The NewSQL JDBC Driver
    About JDBC
    The NewSQL JDBC Driver Architecture
    Using the NewSQL JDBC Driver
6 Applications
  6.1 Demo Application
    Purpose
    Functionality
    How to Run the Application
    Application Architecture
  6.2 Benchmark Application
    Purpose
    Functionality
    Benchmark Results
    How to Run the Application
    Application Architecture
  6.3 Database Tool
    Purpose
    Functionality
    How to Run the Application
  6.4 Build Process
    The Build Process
    Building NewSQL for JDK 1.3.x
    Building NewSQL for JDK 1.4.x and later
  6.5 The Regression Test
    Purpose
    Functionality
    How to Run the Application
  6.6 Coverage Test Results
    How to Run the Coverage Test
    Test Results
    Conclusions
7 Outlook
  7.1 Current Limitations
    Database Tool Wizard
  7.2 Design Alternatives
    Join Syntax
  7.3 Possible Future Work
    Transaction handling methods
    Null handling simplification
    Text Constants with Escape Sequences
    Implicit Joins
    Converter SQL to NewSQL
    ODBC driver
Appendix
A Survey
  A.1 Summary
  A.2 About the project NewSQL
  A.3 Questions on Database Usage
  A.4 Questions on NewSQL
  A.5 Miscellaneous Questions
  A.6 Importance
  A.7 Contact Information (optional)
B Survey Results (Commented)
  B.1 Summary
  B.2 Survey Process
    Paper Survey
    Online Survey
    Addressing the Public
    Lessons learned
  B.3 Questions on Database Usage
    SQL (Structured Query Language)
    Other Database Access Languages
    How I use (or plan to use) SQL
    Databases
    Programming Languages and APIs
  B.4 Questions on NewSQL
    Identifiers
    Quoted Names
    NULL handling
    Different syntax for different databases
    Autoincrement columns
    Strings (text data, varchar)
    Style
    Grammar Examples
    Joins
    Exception handling
  B.5 Miscellaneous Questions
  B.6 Importance
  B.7 Other Interesting Results
    Operating System
    Browser used
C Installation Guide
  C.1 Target Platforms
  C.2 Contents of the CD
  C.3 Installation instructions
  C.4 Uninstall
  C.5 Development Environment
  C.6 List of Ant Scripts
  C.7 Installation Directory Layout
D NewSQL Language Features
  D.1 NewSQL Grammar Reference
E Glossary and Links
  E.1 Glossary
  E.2 Books

1 Abstract
NewSQL is a new database access language. It is easier to learn than SQL, elegant,
consistent, and well defined. It is not an extension or subset of SQL, and not an
Object database language.
Why another language?
SQL is outdated, too complex, and has many flaws. Object databases are not the
future. The main issues are:
- There is no real standard. Every product has slightly different syntax rules and
  concepts, even for quite simple things such as autoincrement columns, data type
  names, size restrictions for data types, and name spaces. This is not obvious to
  beginners, but leads to vendor lock-in later in the development cycle.
- SQL is hard to learn. It contains too many syntax rules. In 'regular' programming
  languages, the syntax rules don't change that often when a feature is added (for
  example, when a method is added). For SQL, the keyword list is huge and differs
  among databases. Some things are overly complex or unnecessary, such as the
  rules for Null values, or quoted names. These features make the language hard to
  learn and use.
- SQL is ancient. It was developed at a time when COBOL was the main
  programming language, and inherited many if not most concepts from COBOL. SQL
  contains features that are not found in modern programming languages, such as
  size restrictions on text and other data. Modern concepts such as Unicode are not
  directly supported. Unlike most modern programming languages, SQL is
  case-insensitive.
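To make the "no real standard" point concrete, here is a minimal Java sketch that lists how an auto-generated integer key is commonly declared in three SQL dialects. The class and method names are hypothetical and exist only for this illustration; the column definitions reflect commonly documented syntax and may differ between product versions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: the same "autoincrement key" written in several SQL dialects. */
class AutoIncrementDialects {

    /** Maps a product name to a typical definition of an auto-generated "id" column. */
    static Map<String, String> idColumnByDialect() {
        Map<String, String> ddl = new LinkedHashMap<>();
        ddl.put("MySQL", "id INT AUTO_INCREMENT PRIMARY KEY");
        ddl.put("PostgreSQL", "id SERIAL PRIMARY KEY");
        ddl.put("MS SQL Server", "id INT IDENTITY(1,1) PRIMARY KEY");
        return ddl;
    }

    public static void main(String[] args) {
        // Print one CREATE TABLE statement per dialect to show the differences.
        for (Map.Entry<String, String> e : idColumnByDialect().entrySet()) {
            System.out.println(e.getKey() + ": CREATE TABLE customer (" + e.getValue() + ")");
        }
    }
}
```

An application that hard-codes any one of these definitions has to be changed before it can run on another product, which is the kind of lock-in described above.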
What about existing databases?
Existing databases are supported through a converter: an application using
NewSQL can be run against an existing database by means of a NewSQL to SQL
converter.
What about Object-Relation tools and Object Databases?
Object-relation mapping tools and standards are made to simplify or abstract the use
of databases. Examples are Enterprise Java Beans and JDO. Direct database
access is still necessary for various reasons (speed, simplicity), and even the
object-relation tools need to access the database in some way; a database access
language is required in any case.
Object databases do not have a big market share. There are several reasons for this:
relational databases are more advanced and cheaper than OODBs
(Object Oriented DataBases), and there is no standard that is convincing enough
(compared to SQL).

Project Goals
The main goal is to develop a new database access language, NewSQL, based on
feedback from developers. The new language should have the same functionality as
SQL, but should be more logical, consistent and easier to learn.
Due to the limited time, only the most commonly used subset of the functionality of
SQL should be implemented.
What was realised
A survey was conducted to find out what the major problems of developers are
when working with databases. The survey also gave the developers the
choice between different grammar variants of the new language. The results of the
survey were used as a guide when developing the new language.
A NewSQL to SQL converter has been implemented in the form of a JDBC driver. In
that way, existing applications can benefit from NewSQL with very little change, and
new applications can be written without having to learn a new API.
In addition, tools such as a demo application, a database tool with query
wizard, and a benchmark application have been implemented. Finally, the new
language is documented.

2 Requirements and Targets
2.1 Current Situation and Motivation
Currently SQL is the generally accepted, most commonly used database (query)
language. However, SQL has some disadvantages:
- There is no real standard (each database has its own SQL dialect).
- There are too many syntax rules.
- Some functionality seems outdated (e.g. length restrictions for text; this was
  probably inherited from COBOL, and modern languages like Java do not have such
  restrictions).
- Null is not equal to Null (the comparison NULL = NULL does not evaluate to
  true); this is not obvious to many developers.
- It is difficult to integrate SQL in Java.
- The SQL language is very old and was never revised, only extended with each
  new version.
For these reasons, we think that a replacement for SQL should be developed. The new
language should be easier to learn, more elegant, consistent, and well defined. The
new language should not be a subset or extension of SQL, and not an object database
language.
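The Null comparison issue listed among the disadvantages can be illustrated with a small Java model of SQL's three-valued logic. This is a sketch for explanation only, not part of the NewSQL software: the class, enum, and method names are hypothetical, and Java's `null` stands in for SQL NULL.

```java
/** Minimal model of how SQL evaluates an equality comparison involving NULL. */
class SqlNullDemo {

    /** SQL conditions have three possible results, not two. */
    enum Truth { TRUE, FALSE, UNKNOWN }

    /** SQL-style equality: if either operand is NULL, the result is UNKNOWN. */
    static Truth sqlEquals(Object a, Object b) {
        if (a == null || b == null) {
            return Truth.UNKNOWN;
        }
        return a.equals(b) ? Truth.TRUE : Truth.FALSE;
    }

    /** A WHERE clause keeps a row only if its condition evaluates to TRUE. */
    static boolean rowPassesWhere(Truth condition) {
        return condition == Truth.TRUE;
    }

    public static void main(String[] args) {
        System.out.println(sqlEquals(1, 1));                        // TRUE
        System.out.println(sqlEquals(null, null));                  // UNKNOWN
        System.out.println(rowPassesWhere(sqlEquals(null, null)));  // false
    }
}
```

Because UNKNOWN is treated like "not true" in a WHERE clause, a condition such as `column = NULL` selects no rows, which is why SQL needs the separate `IS NULL` predicate.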
The main targets of this project are to create a new database access language called
NewSQL, a converter to allow applications written in NewSQL to work with current
databases, as well as tools, examples and documentation promoting the usage of this
language. The language as well as the tools are to be licensed as open source.
An open source project called NewSQL was already started in November 2001,
but was stopped in July 2002 for various reasons. At that point,
no software at all had been developed and only a few web pages had been written. The
current project is a continuation of the old NewSQL open source project.
2.2 Project Targets and Scope
The detailed requirements are described in a separate document, which is
included on the product CD as docs/requirements.pdf. This chapter discusses
the targets of NewSQL (what should be achieved) and how they should be reached.
Development Platform
Platform independence should be achieved if possible. For this reason, Java is selected
as the programming language for all software components. The current Java
platforms are JDK 1.3.x and 1.4.x. See also http://java.sun.com [Java - Sun]

Development Tools
As this project is open source, other developers should be in a position to
seamlessly continue development after the end of the official project period.
To allow this, all components are created
with standard, open source tools. For a detailed list of development tools, see the
paragraph 'Development Environment' in the appendix 'Installation Guide'.
Tested Operating System
Because currently Microsoft Windows is the most common operating system (probably
even among open source developers), Windows XP is used as the main development
platform. However, as Java is used, the software also works with Linux.
Hardware
There are no special requirements for the hardware. The software works with any
Microsoft Windows compatible hardware, as well as any Linux compatible hardware (if
Linux is used as the operating system).
Language Features
The new language should have the following features. The list represents the
general direction of the solution. NewSQL should:
- Be simple to learn, if possible simpler than SQL.
- Be consistent.
- Have a low number of grammar rules.
- Be extensible in a way that cannot break existing applications. For example,
  new functionality can be added to the access right subsystem without adding a new
  keyword. This is similar to adding libraries to a general programming language.
- Solve all common data access requirements.
- Be easy to parse.
- Be modern.
- Not have any hidden, unexpected (or confusing) behavior (like the NULL behavior
  in SQL).
- Make it possible to write a query wizard that generates a statement step by
  step. See also the Query Wizard application. This requires that 'roots' come
  before the 'nodes' (example: the table name comes before the column names; this is
  not always the case in SQL, for example in the SELECT statement, where the table
  names come after the select list).
- Be functionally compatible with SQL so that the current database APIs can be
  reused. Reason: developers are less likely to accept a new language if they have
  to learn and use a new API as well.
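The 'roots before nodes' requirement can be sketched with a small step-by-step builder in Java. This is an illustration under stated assumptions, not the project's Query Wizard: the class name, method names, and the table and column names are hypothetical, and the builder emits plain SQL rather than NewSQL.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a wizard-style builder: the table (root) is chosen before the columns (nodes). */
class TableFirstQuery {
    private final String table;
    private final List<String> columns = new ArrayList<>();

    private TableFirstQuery(String table) {
        this.table = table;
    }

    /** Step 1: choose the table. A wizard can now offer only the columns of this table. */
    static TableFirstQuery from(String table) {
        return new TableFirstQuery(table);
    }

    /** Step 2 (repeatable): choose a column of the table selected in step 1. */
    TableFirstQuery column(String name) {
        columns.add(name);
        return this;
    }

    /** Emits SQL, where the order is reversed: the select list precedes the table name. */
    String toSql() {
        String selectList = columns.isEmpty() ? "*" : String.join(", ", columns);
        return "SELECT " + selectList + " FROM " + table;
    }

    public static void main(String[] args) {
        System.out.println(TableFirstQuery.from("customer").column("id").column("name").toSql());
    }
}
```

The builder has to hold back the column list until the table is known and then reorder the parts when producing SQL. A language that puts the table first removes that reordering step.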

Survey
To find out what the major problems of developers are when working with databases,
a survey was conducted as part of this project. The survey gave the developers the choice
between different grammar variants of the new language. The results of the survey
are used as a guide in many design decisions.
NewSQL to SQL Translator
A NewSQL to SQL translator is designed and implemented. This software tool converts
NewSQL statements to SQL.
Testing Requirements
All main functions of the software are tested, and a coverage test is used to ensure
that enough tests are written. The number of test cases and minimum test coverage
numbers depend on the module; the main module (the NewSQL to SQL converter)
needs to be tested more than the applications.
Benchmark Application
One of the major concerns for application developers is speed. If the new language is
slow, for example because the translation from NewSQL to SQL is slow, or because the
generated SQL code executes a lot slower than functionally equivalent SQL code, then
developers will probably not use the new language. To show the speed difference
between using SQL and using NewSQL, a benchmark application is written that
compares the two techniques side by side. The benchmark uses an algorithm that is
similar to an established benchmark in that area: TPC-A. The documentation on
these benchmarks is available at www.tpc.org [TPC]
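The side-by-side idea can be sketched as a tiny timing harness in Java. This is not the project's benchmark application and not an implementation of TPC-A: the class and method names are hypothetical, and the two workloads in `main` are placeholders standing in for "execute SQL directly" and "translate NewSQL, then execute".

```java
/** Sketch of a side-by-side timing harness; the workloads are placeholders. */
class SideBySideTimer {

    /** Runs the task the given number of times and returns the elapsed time in nanoseconds. */
    static long timeNanos(Runnable task, int repetitions) {
        long start = System.nanoTime();
        for (int i = 0; i < repetitions; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    /** Ratio of candidate time to baseline time; a value above 1.0 means the candidate is slower. */
    static double slowdown(long baselineNanos, long candidateNanos) {
        return (double) candidateNanos / baselineNanos;
    }

    public static void main(String[] args) {
        long[] sink = new long[1]; // keeps the placeholder work from being optimized away

        // Placeholder for running a SQL statement directly.
        Runnable direct = () -> sink[0] += "SELECT 1".length();
        // Placeholder for translating first (simulated by a string operation) and then running.
        Runnable translated = () -> sink[0] += "select 1".toUpperCase().length();

        long baseline = timeNanos(direct, 100_000);
        long candidate = timeNanos(translated, 100_000);
        System.out.println("slowdown factor: " + slowdown(baseline, candidate));
    }
}
```

A realistic measurement would replace the placeholders with actual database calls, add warm-up runs, and repeat the measurement several times, since single timings on the JVM vary considerably.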
Query Wizard
The query wizard is an application similar to the Oracle SQLPlus Worksheet. The
reason to create such an application is to show people in a simple and intuitive way
(learning by doing) what NewSQL means, and how it works. This application should
bridge the gap between learning NewSQL 'from the books' (by reading the
documentation) and using it. This application is simple to use, but not feature
complete.
The wizard is implemented in Java, using Swing.
Demo Application
As an additional example that proves it is easy to write or port applications to NewSQL,
a small example application is written that uses NewSQL as the data access layer. The
source code of the two applications may be compared side-by-side.

Popular Database
In order to be successful, NewSQL must work with as many popular databases as
possible. Here is a list of popular databases:
- Oracle 9i (www.oracle.com)
- MS SQL Server 2000 (www.microsoft.com)
- MySQL (www.mysql.com)
- IBM DB2 (www.ibm.com)
- PostgreSQL (www.postgresql.org)
- PointBase (www.pointbase.com)
- HSQLDB (formerly Hypersonic SQL) (hsqldb.sourceforge.net)

3 Project Organization
3.1 Approach / Processes
An iterative model is used to develop the components. The project uses three
iterations:
Prototype Phase
In the first iteration (prototype phase), a first version of the applications (JDBC driver,
query wizard, benchmark, demo application) is implemented, even though the
grammar of the language itself is not yet fully defined or may change. The result of
the prototype phase is a set of working applications that still
lack a lot of functionality. Only minimal testing and documentation is done at this
stage.
Alpha Phase
In the second iteration (alpha phase), the target is to achieve as many as possible of
the main priorities. In this phase, all applications are brought to a stable and
functional state, but may be buggy and lack testing and documentation. In a
commercial product, this phase is called alpha phase.
Release Phase
In the third iteration (release phase), some more functionality is implemented. The
testing is intensified, and the documentation is completed. This phase can be
compared to the release of a commercial product.

Drawing 1 - Approach (activities Concept & Design, Implementation, Documentation, Testing & Bugfixing and Project Management, shown across the Prototype Phase, Alpha Phase and Release Phase)

3.2 Schedule and Project Organization
The following schedule is used for this project:
Milestones / Reviews
Three milestones are defined for this project:
Prototype Review
Alpha Version Review
Final Review
Working Hours
The weekly working time is 12 hours per graduate.
Meetings with the Tutors
About once every three weeks a meeting with the tutors is scheduled. The purpose of
each meeting is to discuss the current state of the project.

Illustration 1 - Schedule (Gantt chart from April 2003 to January 2004, by week number: three iterations, each with Concepts & Design, Implementation, Documentation, Testing & Bugfixing and a Milestone Meeting; further rows: Vacations, Tests, Diploma work, Exhibit, Review)

4 Comparison
4.1 The SQL Standard
A short overview of the history of relational databases, the SQL standard, and the
contents of the SQL specification is given here. This section is based on the SQL
standard specification itself, and on the excellent description in the book 'Practical
PostgreSQL' [PPOST], chapter 'Understanding SQL'.
History of SQL
SQL stands for Structured Query Language, and is the most commonly used query
language for relational database access today. Here is a short history of SQL:
Year  Description
1970  The relational model was defined by Dr. E. F. Codd, a researcher for IBM.
1972  Relational algebra and relational calculus were introduced by E. F.
      Codd. This provided the basic concepts behind computing SQL syntax.
      Relational algebra defines set operations (union, intersection, cartesian
      product), selection (filtering rows), and projection (filtering columns).
1974  IBM developed the language SEQUEL as part of the System/R project. It was
      later renamed "SQL" for legal reasons. However, this was still a research
      and prototype project.
1979  Oracle released their first commercial RDBMS based on the SQL language.
1981  IBM released their first commercial SQL product: SQL/DS. DB2 was released
      two years later.
1986  ANSI (American National Standards Institute) standardizes SQL: X3.135.
1987  ISO (International Organization for Standardization) standardizes SQL as
      well.
1989  SQL89 (also named SQL1), a revised version of the ANSI/ISO standard, was
      published. Partly due to conflicting interests from commercial vendors, the
      SQL89 standard was left incomplete, and many features were labeled
      implementer-defined.
1992  SQL92 (also named SQL2) was published by the ANSI committee. SQL92 is
      approximately six times more complex than its predecessor, SQL89.
1999  SQL99 (also named SQL3) was released by ANSI/ISO. This standard
      addresses some of the more advanced and previously ignored areas of
      database systems, such as object-relational database concepts, call level
      interfaces, and integrity management.
2003  SQL2003. The standard is a conservative enhancement of SQL99. Many
      errors in the previous standard were addressed. Only minor changes and
      additions in the core language were made.

2003-12-10 Page 13

Contents of the SQL Standards
Here are the different parts of the standard. As the number of pages column shows,
the SQL standard is a huge document.
Part Title | Content | Pages
SQL/Framework | Common definitions and concepts. About conformance. | 85
SQL/Foundation | Embedded / dynamic SQL (except Java). Traditional SQL and object oriented SQL. | 1,300
SQL/CLI | CLI = Call-Level Interface. Best known implementation: ODBC. | 400
SQL/PSM | PSM = Persistent Stored Modules. Corresponding (but not conforming) products: PL/SQL, Transact-SQL. | 170
SQL/MED | MED = Management of External Data. Embedding of external data (other databases, files, sensors, ...). | 500
SQL/SCHEMATA | Catalog structures and contents; describes objects (tables, types, triggers, code pages). | 300
SQL/OLB | OLB = Object Language Bindings. Embedding of SQL in Java. Based on SQLJ Part 0. | 360
SQL/JRT | JRT = Java Routines and Types. Java routines callable from SQL. Java classes as data types. Based on SQLJ Part 1. | 200
SQL/XML | New data type: XML. Mapping between SQL and XML. Operations for creating XML documents. | 280
SQL92
SQL92 is also named SQL2 and was published in 1992 by the ANSI committee.
Levels of compliance
SQL92 is approximately six times more complex than its predecessor, SQL89. Three
levels of SQL92 compliance were defined:
- Entry-level conformance
- Intermediate-level conformance
- Full conformance
Most databases today still only support Entry-level conformance.

SQL99
This standard addresses some of the more advanced and previously ignored areas of
database systems, such as object-relational database concepts, call level interfaces,
and integrity management.
Levels of conformance
Two degrees of conformance are defined:
- Core SQL99
- Enhanced SQL99
None of the vendors currently supports the entire Core SQL99 standard.
New in SQL99
- Recursive queries
- Triggers
- New data types: Boolean, Array, Row, Ref, Structured Types
- Roles model in the authorization schema
- Java binding
- Usage of Java routines and classes in SQL
- Embedding of external data
SQL2003
The standard is a conservative enhancement of SQL99. Many errors in the previous
standard were addressed. Only minor changes and additions in the core language
were made.
Levels of conformance
- Core SQL2003
- Enhanced SQL2003
None of the vendors currently supports the entire Core SQL2003 standard.
New in SQL2003
- MERGE operation
- Sequences, identity columns
- Handling of XML data
Books about SQL99 and SQL2003
For more details about the SQL99 and SQL2003 standards, the following books are
available:
- SQL:1999 (Understanding Relational Language Components). J. Melton, A. R. Simon, Morgan Kaufmann, 2002.
- Advanced SQL:1999. J. Melton, Morgan Kaufmann, 2003.
- Einführung in den Sprachkern von SQL-99 (German). W. Panny, A. Taudes, Springer, 2000.
- Datenbanken & Java (JDBC, SQLJ und ODMG) (German). G. Saake, W. Sattler, dpunkt, 2000.
- SQL:1999 & SQL:2003 (Objektrelationales SQL, SQLJ & SQL/XML) (German). C. Türker, dpunkt, 2003.
- SQL objektorientiert (German). D. Petkovic, Addison-Wesley, 2003.
SQL Grammar (Overview)
Only an overview of SQL grammar will be given here. For the description of the full
SQL grammar, please consult the standard, and, as most database products do not
comply with the standard in many cases, consult the documentation of the database
product.
This is the SQL grammar as implemented by the LDBC project. LDBC (Liberty
DataBase Connectivity) is a JDBC driver that provides vendor-independent database
access. With LDBC, your application will just work on all major databases and you
don't have to change any source code. LDBC is based on ANSI-SQL and JDBC. See
also: http://ldbc.sf.net. [SourceForge]
INSERT
INSERT INTO tableName
[ ( columnName [,...] ) ]
VALUES ( value [,...] )
Inserts a single row into a table.
Inserting into a table with autoincrement column
If the table contains an autoincrement column, then the value for the autoincrement
column is set automatically. For example, if the table contains the columns ID and
NAME, and ID is the autoincrement column, then the following inserts will work:
INSERT INTO TEST VALUES('Hello')
INSERT INTO TEST(NAME) VALUES('Hello')
However it is also possible to specify the value for ID explicitly:
INSERT INTO TEST(ID, NAME) VALUES(10,'Hello')
Example:
INSERT INTO HELLO_WORLD VALUES(1)
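The autoincrement behaviour described above can be tried out with a small sketch. It uses SQLite (through Python's standard sqlite3 module) as a stand-in, not LDBC; the assumption is that SQLite's INTEGER PRIMARY KEY column behaves like the autoincrement column described here. Unlike LDBC, SQLite does not accept the form without a column list, so only the other two forms are shown.

```python
import sqlite3

# SQLite stands in for LDBC here (an assumption for illustration only).
# In SQLite, an INTEGER PRIMARY KEY column is assigned automatically
# when no value is given for it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST(ID INTEGER PRIMARY KEY, NAME VARCHAR(255))")

# ID is omitted, so the database picks the next value.
conn.execute("INSERT INTO TEST(NAME) VALUES('Hello')")
# ID is given explicitly.
conn.execute("INSERT INTO TEST(ID, NAME) VALUES(10, 'Hello')")
# ID is omitted again; SQLite continues after the largest existing ID.
conn.execute("INSERT INTO TEST(NAME) VALUES('World')")

rows = conn.execute("SELECT ID, NAME FROM TEST ORDER BY ID").fetchall()
print(rows)
```

How the next value is chosen after an explicit insert is product specific, which is one reason why autoincrement columns are hard to use in a portable way.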
UPDATE
UPDATE tableName
SET columnName=expression [,...]
[ WHERE condition ]
Updates rows in a table.
Example:
UPDATE HELLO_WORLD SET ID=ID+1
DELETE
DELETE FROM tableName
[ WHERE condition ]
Deletes rows from a table.
Example:
DELETE FROM HELLO_WORLD
SELECT
SELECT [DISTINCT] { * | selectList }
FROM tableList
[ WHERE condition ]
[ GROUP BY columnName [,...] ]
[ ORDER BY columnName [{ASC|DESC}] [,...] ]
Queries one or more tables.
tableList:
tableName [alias] [, | LEFT OUTER JOIN | RIGHT OUTER JOIN | INNER JOIN tableList]
selectList:
{ COUNT(*) | expression } [ AS alias ]
Example:
SELECT * FROM HELLO_WORLD
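A slightly larger query that uses most clauses of the SELECT grammar above might look as follows. This is only a sketch: SQLite (Python's sqlite3 module) is used instead of LDBC, and the tables PARENT and CHILD and their contents are made up for the example. As the grammar above shows no ON clause, the join condition is written in the WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data, not taken from the report.
conn.executescript("""
CREATE TABLE PARENT(ID INT PRIMARY KEY, NAME VARCHAR(255));
CREATE TABLE CHILD(P_ID INT, ID INT, PRIMARY KEY(P_ID, ID));
INSERT INTO PARENT VALUES(1, 'A');
INSERT INTO PARENT VALUES(2, 'B');
INSERT INTO CHILD VALUES(1, 1);
INSERT INTO CHILD VALUES(1, 2);
INSERT INTO CHILD VALUES(2, 1);
""")

# tableList with aliases, WHERE, GROUP BY and ORDER BY ... DESC.
query = """
SELECT P.NAME, COUNT(*) AS CHILDREN
FROM PARENT P, CHILD C
WHERE P.ID = C.P_ID
GROUP BY P.NAME
ORDER BY P.NAME DESC
"""
result = conn.execute(query).fetchall()
print(result)
```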

CREATE TABLE
CREATE TABLE tableName (
columnDefinition [,...]
[,PRIMARY KEY(column [,...])]
[,FOREIGN KEY(column [,...]) REFERENCES tableName ( column [,...])]
)
Creates a new table.
columnDefinition:
columnName dataType [[NOT] NULL] [PRIMARY KEY]
dataType:
{
INT
| INT AUTOINCREMENT
| VARCHAR(size)
| DECIMAL(precision,scale)
| DATETIME
| BLOB
| CLOB
}
Executing this statement automatically commits any open transaction.
See also:
Data Types
Examples:
CREATE TABLE HELLO_WORLD (ID INT)
CREATE TABLE TEST(ID INT PRIMARY KEY,NAME VARCHAR(255))
CREATE TABLE ORDERLINE(ORDER_ID INT,LINE INT,TEXT VARCHAR(255),AMOUNT DECIMAL(10,2),PRIMARY KEY(ORDER_ID,LINE))
CREATE TABLE PARENT(ID INT PRIMARY KEY)
CREATE TABLE CHILD(P_ID INT,ID INT,PRIMARY KEY(P_ID,ID),FOREIGN KEY(P_ID) REFERENCES PARENT(ID))
CREATE TABLE AUTOINC(ID INT AUTOINCREMENT PRIMARY KEY,VALUE VARCHAR(255))
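The effect of the composite primary key and the foreign key in the PARENT / CHILD example can be checked with a short sketch. It again assumes SQLite as a stand-in for LDBC; note that SQLite only enforces foreign keys after the PRAGMA shown below, which is SQLite specific and not part of the grammar above. The helper try_insert is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite specific: foreign keys are not enforced unless this is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE PARENT(ID INT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE CHILD(P_ID INT,ID INT,PRIMARY KEY(P_ID,ID),"
    "FOREIGN KEY(P_ID) REFERENCES PARENT(ID))"
)
conn.execute("INSERT INTO PARENT VALUES(1)")
conn.execute("INSERT INTO CHILD VALUES(1, 1)")

def try_insert(sql):
    """Run an insert and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert("INSERT INTO CHILD VALUES(1, 2)"),  # new key, parent exists
    try_insert("INSERT INTO CHILD VALUES(1, 1)"),  # duplicate primary key
    try_insert("INSERT INTO CHILD VALUES(2, 1)"),  # parent 2 does not exist
]
print(results)
```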
CREATE INDEX
CREATE INDEX indexName ON tableName ( columnName [,...] )
Creates a new index in a table. To drop the index, the table has to be dropped.
Executing this statement automatically commits any open transaction.
Example:
CREATE INDEX IDXID ON HELLO_WORLD (ID)
DROP TABLE
DROP TABLE tableName
Drops a table.
Executing this statement automatically commits any open transaction.
Example:
DROP TABLE HELLO_WORLD
DROP INDEX
DROP INDEX indexName ON tableName
Drops an index.
Executing this statement automatically commits any open transaction.
Example:
DROP INDEX IDXID ON HELLO_WORLD
COMMIT
COMMIT
Commits the current transaction.
ROLLBACK
ROLLBACK
The current transaction is rolled back.
Conditions, Expressions, Values
condition:
logicalTerm [ OR logicalTerm ...]
logicalTerm:
logicalFactor [ AND logicalFactor ...]
logicalFactor:
expression {
= expression
| > expression
| < expression
| >= expression
| <= expression
| <> expression
| [NOT] BETWEEN expression AND expression
| [NOT] LIKE expression [ESCAPE quotedString]
| IS [NOT] NULL
| IN (expression [, expression...])
}
expression:
term [ { + | - } term ...]
Remark: The operator + is for integer or decimal addition, not String concatenation.
term:
factor [ {* | / } factor ...]

Remark: the results of overflow or underflow are undefined and different across
databases.
factor:
factorPlusMinus [ || factorPlusMinus...]
Remark: The operator || is for String concatenation, not a logical OR.
factorPlusMinus:
[ NOT ] factorPlusMinus | value | columnName | ( expression ) | function
value:
[-] number
| 'string'
| NULL
| ?
| DATE 'yyyy-mm-dd'
| TIMESTAMP 'yyyy-mm-dd hh:mm:ss'
| X'001122'
In a String, the ' character needs to be written twice; for example, 'Joe''s Taxi'
results in: Joe's Taxi
Examples:
condition:
NAME LIKE 'T%' OR NAME IS NULL
expression:
( ID + 1 ) * 2
DATE '2002-01-01'
X'01ab'
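The rules for values and conditions can be tried out with the following sketch. It assumes SQLite (Python's sqlite3 module) as a stand-in for LDBC; the table TEST and its rows are made up for the example. The '?' value from the grammar is used as a parameter placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A doubled quote inside a string constant stands for one quote character.
taxi = conn.execute("SELECT 'Joe''s Taxi'").fetchone()[0]
# || concatenates strings; + is numeric addition.
joined = conn.execute("SELECT 'A' || 'B'").fetchone()[0]
total = conn.execute("SELECT ( 1 + 1 ) * 2").fetchone()[0]

# Hypothetical sample rows, inserted through '?' parameters.
conn.execute("CREATE TABLE TEST(ID INT, NAME VARCHAR(255))")
conn.executemany(
    "INSERT INTO TEST VALUES(?, ?)",
    [(1, "Tom"), (2, "Anna"), (3, None)],
)
# The condition from the examples above.
ids = [
    row[0]
    for row in conn.execute(
        "SELECT ID FROM TEST WHERE NAME LIKE 'T%' OR NAME IS NULL ORDER BY ID"
    )
]
print(taxi, joined, total, ids)
```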
Data Types
The list of data types is:
Data type name | Limitation | Searchable | Data type definition | Constant | java.sql.Types
INT | -(2^31) .. (2^31-1) | Yes | INT | 1 | INTEGER
VARCHAR | 255 characters | Yes | VARCHAR(255) | 'Hello' | VARCHAR
DECIMAL | 20 precision, 10 scale | Yes | DECIMAL(10,2) | 3.14 | DECIMAL
DATETIME | 1 sec precision, years 1800 - 9999 | Yes | DATETIME | DATE '2002-01-01' or TIMESTAMP '2002-01-01 20:00:00' | TIMESTAMP
BLOB | 400,000 bytes | No | BLOB | X'01020a0b0c' | BLOB
CLOB | 400,000 characters | No | CLOB | 'Text' | CLOB

Data Type Aliases
In order to support existing applications, the following data type aliases are
supported. Please don't use them in new applications:
Data Type | Aliases
INT | INTEGER, SMALLINT, BIT, TINYINT, BOOLEAN
VARCHAR | CHAR
DECIMAL | NUMERIC, DEC, REAL, FLOAT, DOUBLE, BIGINT
DATETIME | DATE, TIME, TIMESTAMP
BLOB | BINARY, VARBINARY, LONGVARBINARY, IMAGE
CLOB | TEXT, LONGVARCHAR
Keywords
Keywords are identifiers that cannot be used as table names or column names. The
list of words is:
ABS, ADD, ALL, ALTER, AND, AS, ASC, AVG, BEFORE, BETWEEN, BIGINT, BINARY,
BIT, BLOB, BOOLEAN, BOTH, BY, CACHED, CASCADE, CASE, CAST, CHAR,
CHARACTER, CHARACTER_LENGTH, CHAR_LENGTH, CLOB, COLUMN, COMMIT,
CONCAT, CONSTRAINT, COUNT, CREATE, CROSS, CURRENT_DATE, CURRENT_TIME,
CURRENT_TIMESTAMP, DATABASE, DATE, DATETIME, DEC, DECIMAL, DEFAULT,
DELETE, DESC, DISTINCT, DOUBLE, DROP, EXISTS, EXTRACT, FALSE, FLOAT, FOR,
FOREIGN, FROM, GRANT, GROUP, HAVING, IF, IMAGE, IN, INDEX, INFILE, INNER,
INSERT, INT, INTEGER, INTO, IS, JOIN, KEY, KILL, LEADING, LEFT, LENGTH, LIKE,
LIMIT, LINENO, LOAD, LOB, LOCAL, LOCATE, LOCK, LONG, LONGVARBINARY,
LONGVARCHAR, LOWER, MATCH, MAX, MEDIUMINT, MIN, MOD, NATURAL, NOT, NULL,
NUMERIC, OBJECT, OCTET_LENGTH, ON, OPTION, OR, ORDER, OTHER, OUTER,
OUTFILE, POSITION, PRECISION, PRIMARY, PRIVILEGES, PROCEDURE, READ, REAL,
REFERENCES, RENAME, REPLACE, RESTRICT, RETURNS, REVOKE, RIGHT, ROLLBACK,
SAVEPOINT, SELECT, SESSION_USER, SET, SMALLINT, SQRT, SUBSTRING,
SUM, SYSDATE, TABLE, TEMP, TEXT, TIME, TIMESTAMP, TINYINT, TO, TRAILING,
TRIGGER, TRIM, TRUE, UNION, UNIQUE, UNSIGNED, UPDATE, UPPER, USER, USING,
VALUES, VARBINARY, VARCHAR, VARCHAR_IGNORECASE, WHEN, WHERE, WITH,
WRITE, ZEROFILL

Functions
The built-in functions are:
Function | Data type | Description | Example
CAST(value AS type) | variable | Convert a value. value: original value; type: target data type | CAST('5' AS INT) = 5
LENGTH(text) | INT | text: VARCHAR | LENGTH('Hello') = 5
MOD(value,divisor) | INT | Modulo function: returns the remainder after value is divided by the divisor | MOD(10, 3) = 1
CONCAT(s1,s2) | VARCHAR | Concatenates two Strings | CONCAT('A', 'B') = 'AB'
LOWER(s) | VARCHAR | Converts a String to lowercase. | LOWER('Hello') = 'hello'
UPPER(s) | VARCHAR | Converts a String to uppercase. | UPPER('Hello') = 'HELLO'
NOW() | DATETIME | Gets the current timestamp | NOW() = '2002-08-16'
Aggregates are only supported for SELECT:
Function | Data type | Description | Example
COUNT(*) | INT | Count all rows | COUNT(*)
COUNT(column) | INT | Count the rows where the column is non-null | COUNT(VALUE)
MIN(column) | any * | The lowest value, NULL if no value | MIN(ID)
MAX(column) | any * | The highest value, NULL if no value | MAX(ID)+1
SUM(column) | numeric * | Sums a column, NULLs are counted as 0 | SUM(VALUE)
AVG(column) | numeric * | Sums a column, and then divides by number of non-null values; same as SUM(column)/COUNT(column) | AVG(VALUE)
* any: except BLOB or CLOB
* numeric: INT or DECIMAL
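How the aggregates treat NULL values can be seen in the following sketch. It assumes SQLite (Python's sqlite3 module) as a stand-in for LDBC; the table TEST, the column AMOUNT and the rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST(ID INT, AMOUNT INT)")
# Hypothetical sample rows; the second row has no AMOUNT.
conn.executemany(
    "INSERT INTO TEST VALUES(?, ?)",
    [(1, 10), (2, None), (3, 20)],
)

count_all, count_amount, lowest, next_id, total, average = conn.execute(
    "SELECT COUNT(*), COUNT(AMOUNT), MIN(ID), MAX(ID)+1, "
    "SUM(AMOUNT), AVG(AMOUNT) FROM TEST"
).fetchone()

# On an empty table, MIN and MAX return NULL (None in Python).
conn.execute("CREATE TABLE EMPTY(ID INT)")
lowest_of_nothing = conn.execute("SELECT MIN(ID) FROM EMPTY").fetchone()[0]

print(count_all, count_amount, lowest, next_id, total, average)
```

COUNT(*) counts three rows while COUNT(AMOUNT) counts two, and the average is computed over the two non-null values only.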

Function Aliases
In order to support existing applications, the following function aliases are
supported. Please don't use them in new applications:
Function | Aliases
UPPER(s) | UCASE(s)
LOWER(s) | LCASE(s)
CAST(value AS type) | CONVERT(value, type)
LENGTH(text) | CHAR_LENGTH(text)
NOW() | CURRENT_DATE(), CURRENT_TIME(), CURDATE(), CURTIME()
Problems of SQL
There are various problems with the SQL language. Here is a short list of them. The
list is not complete, but it gives a short overview of the most common problems a
developer faces when writing a database application using SQL.
- There is no real standard. Every product has slightly different syntax rules, even for quite simple things. This is not obvious for beginners, but leads to vendor lock-in later in the development cycle.
- It is hard to know whether a syntax is supported by a popular database or not.
- Keywords / reserved words are vendor dependent.
- Data types are not portable among database products.
- Data types have vendor specific size restrictions.
- There is no standard for autoincrement columns / sequences.
- Different products have different namespace mechanisms (schemas, catalogs, databases).
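The autoincrement problem from the list above can be made concrete with a small sketch. The column definitions below are written from memory of each vendor's documentation and should be checked against the product manuals before use; the table TEST and the function create_table_sql are made up for the example. Oracle is left out because, at the time of writing, it uses a separate sequence object instead of a column definition.

```python
# Column definitions for an automatically numbered primary key.
# These are assumptions written from memory of the vendor documentation.
AUTOINCREMENT_COLUMN = {
    "mysql": "ID INT AUTO_INCREMENT PRIMARY KEY",
    "postgresql": "ID SERIAL PRIMARY KEY",
    "sqlserver": "ID INT IDENTITY(1,1) PRIMARY KEY",
    "sqlite": "ID INTEGER PRIMARY KEY",
}

def create_table_sql(dialect):
    """Build a CREATE TABLE statement for the hypothetical table TEST."""
    column = AUTOINCREMENT_COLUMN[dialect]
    return f"CREATE TABLE TEST({column}, NAME VARCHAR(255))"

for dialect in AUTOINCREMENT_COLUMN:
    print(dialect, "->", create_table_sql(dialect))
```

Four products need four different statements for the same simple table, which is the kind of difference that leads to vendor lock-in.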

4.2 History of OQL / ODMG
OQL stands for Object Query Language and was defined by the ODMG.
The following description of the ODMG history and OQL is based on the information
available at http://www.odmg.org/ [ODMG]. To access older versions of the ODMG
web sites, the Wayback-Machine of Archive.org was used: http://www.archive.org/
[Archive].
About ODMG
The Object Data(base) Management Group was founded in 1991 by OODB (object
oriented database) vendors:
- O2 Technology (later Ardent Software, now Ascential Software)
- GemStone Systems
- Object Design
- Objectivity
- POET
- Versant Object Technology
- and others
It was later joined by software companies (Ericsson, Sun, ...), reviewer members (Baan,
CERN, CA, ...), and academic members (e.g., S. Zdonik, B. Liskov, D. Maier).
The objective of ODMG is to create a standard for OODBs that applies both to ODBMSs
(Object Database Management Systems), which store objects directly, and to
Object-to-Database Mappings, which convert and store objects in RDBs (relational
databases).
ODMG History
Year | Description
1993 | ODMG 1.0 (ODMG-93). The specification includes an object model chapter, which is an extension to the OMG object model; an object definition language (ODL); an object query language (OQL), which is an extension of SQL; bindings to C++ and Smalltalk; mappings to OMG; and an appendix suggesting enhancements to ANSI C++ which will allow better language integration and facilitate other, more general application needs in C++.
1997 | ODMG 2.0. A Java persistence standard was added in addition to the existing Smalltalk and C++ ones. A meta-object interface was added and an object interchange format was defined. There were changes throughout the specification.
2000 | ODMG 3.0 includes a number of enhancements to the Java binding. It incorporates improvements to the object model. It also includes various corrections and enhancements in all of the chapters, including changes necessary to broaden the standard for use by object-relational mapping systems as well as for the original target of the standard, object DBMSs.
2001 | The ODMG group is shut down.
As the ODMG as an organization is no longer operational, currently nobody is working
on future versions of the ODMG specification or OQL.
Books
The latest ('final') ODMG 3.0 specification is available as a book:
Books about ODMG and OQL
- Object Databases: An ODMG Approach. Cooper, Richard; International Thomson Computer Press; 1997
- The Object Data Standard: ODMG 3.0. Rick Cattell and others; Morgan Kaufmann Publishers; 2000

OQL Grammar (Overview)
This paragraph is based on the OQL Sample Grammar for Object Data Management
Group (ODMG) that is available from http://www.odmg.org/ [ODMG].
queryProgram:
( ( declaration | query ) ( ';' )? )+
EOF
;
declaration:
defineQuery
| importQuery
| undefineQuery
;
importQuery:
"import" Identifier ( '.' Identifier )*
( "as" Identifier )?
;
defineQuery:
"define" ( "query" )? Identifier
( '(' type Identifier ( ',' type Identifier )* ')' )?
"as" query
;
undefineQuery:
"undefine" ( "query" )? Identifier
;
query:
(
selectExpr
| expr
)
;
selectExpr:
"select" ( "distinct" )? projectionAttributes
fromClause ( whereClause )?
( groupClause )? ( orderClause )?
;
fromClause:
"from" iteratorDef ( ',' iteratorDef )*
;

iteratorDef:
(
Identifier "in" expr
| expr ( ( "as" )? Identifier )?
)
;
whereClause:
"where" expr
;
projectionAttributes:
(
projection ( ',' projection )*
| '*'
)
;
projection:
expr ( ( "as" )? Identifier )?
;
groupClause:
"group" "by" groupColumn ( ',' groupColumn )*
( "having" expr )?
;
groupColumn:
fieldList
;
orderClause:
"order" "by" sortCriterion ( ',' sortCriterion )*
;
sortCriterion:
expr ( "asc" | "desc" )?
;
expr:
( '(' type ')' )* orExpr
;
orExpr:
andExpr ( "or" andExpr )*
;
andExpr:
quantifierExpr ( "and" quantifierExpr )*
;

quantifierExpr:
(
equalityExpr
| "for" "all" Identifier "in" expr ':' equalityExpr
| "exists" Identifier "in" expr ':' equalityExpr
)
;
equalityExpr:
relationalExpr
(
( '=' | "<>" )
( relationalExpr
| ( "all" | "any" | "some" ) relationalExpr
)
| "like" relationalExpr
)*
;
relationalExpr:
additiveExpr
(
( '<' | '>' | "<=" | ">=" )
( additiveExpr
| ( "all" | "any" | "some" ) additiveExpr
)
)*
;
additiveExpr:
multiplicativeExpr
(
( '+' | '-' | "||" | "union" | "except" )
multiplicativeExpr
)*
;
multiplicativeExpr:
inExpr
(
( '*' | '/' | "mod" | "intersect" )
inExpr
)*
;
inExpr:
unaryExpr ( "in" unaryExpr )?
;

unaryExpr:
( '+' | '-' | "abs" | "not" )*
postfixExpr
;
postfixExpr:
primaryExpr
(
'[' index ']'
| ( '.' | "->" ) Identifier ( argList ) ?
)*
;
index:
expr
(
( ',' expr )+
| ':' expr
) ?
;
primaryExpr:
(
conversionExpr
| collectionExpr
| aggregateExpr
| undefinedExpr
| objectConstruction
| structConstruction
| collectionConstruction
| Identifier ( argList )?
| '$' ('0'..'9')
| literal
| '(' query ')'
)
;
argList:
'(' ( expr ( ',' expr )* )? ')'
;
conversionExpr:
( "listtoset" | "element" | "distinct" | "flatten" )
'(' query ')'
;
collectionExpr:
( "first" | "last" | "unique" | "exists" )
'(' query ')'
;

aggregateExpr:
(
(
"sum" | "min" | "max" | "avg" )
'(' query ')'
| "count" '(' ( query | '*' ) ')'
)
;
undefinedExpr:
( "is_undefined" | "is_defined" )
'(' query ')'
;
objectConstruction:
Identifier '(' fieldList ')'
;
structConstruction:
"struct" '(' fieldList ')'
;
fieldList:
Identifier ':' expr
( ',' Identifier ':' expr )*
;
collectionConstruction:
(
( "array" | "set" | "bag" ) '(' ( expr ( ',' expr )* )? ')'
| "list" '(' ( expr ( ".." expr | ( ',' expr )* ) )? ')'
)
;
type:
( "unsigned" )? ( "short" | "long" )
| "long" "long"
| "float"
| "double"
| "char"
| "string"
| "boolean"
| "octet"
| "enum" ( Identifier '.' )? Identifier
| "date"
| "time"
| "interval"
| "timestamp"
| "set" '<' type '>'
| "bag" '<' type '>'
| "list" '<' type '>'
| "array" '<' type '>'

| "dictionary" '<' type ',' type '>'
| Identifier
;
literal:
objectLiteral
| booleanLiteral
| longLiteral
| doubleLiteral
| CharLiteral
| StringLiteral
| dateLiteral
| timeLiteral
| timestampLiteral
;
objectLiteral:
"nil"
;
booleanLiteral:
( "true" | "false" )
;
longLiteral:
('0'..'9')
;
doubleLiteral:
( doubleApprox | doubleExact )
;
dateLiteral:
"date" StringLiteral
;
timeLiteral:
"time" StringLiteral
;
timestampLiteral:
"timestamp" StringLiteral
;
NameFirstCharacter:
( 'a'..'z' | '_' )
;
NameCharacter:
( 'a'..'z' | '_' | '0'..'9' )
;

Identifier:
NameFirstCharacter ( NameCharacter )*
;
doubleApprox:
'e' ('+'|'-')? ('0'..'9')+
;
doubleExact:
'.' ( '0'..'9' )+ ( doubleApprox )?
| ( '0'..'9' )+
(
'.' ( '0'..'9' )* (doubleApprox )?
| doubleApprox
)
;
CharLiteral:
'\''
(
'\'' '\''
| '\n'
| ~( '\'' | '\n' )
)*
'\''
;
StringLiteral:
'"'
(
'\\' '"'
| '\n'
| ~( '\'' | '\n' )
)*
'"'
;
WhiteSpace:
( ' ' | '\t' | '\r' )
;
NewLine:
'\n'
;
CommentLine:
'/' '/' ( ~'\n' )* '\n'
;

MultiLineComment:
"/*"
( ~('*'|'\n') )*
"*/"
;
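The lexical rules Identifier, doubleApprox and doubleExact from the grammar above can be written as regular expressions. This is only a sketch that takes the rules literally as printed here (for example, only lowercase letters in identifiers); the names IDENTIFIER, DOUBLE_APPROX and DOUBLE_EXACT are chosen for this example and are not part of the ODMG grammar.

```python
import re

# Identifier: NameFirstCharacter ( NameCharacter )*
IDENTIFIER = re.compile(r"[a-z_][a-z_0-9]*")

# doubleApprox: 'e' ('+'|'-')? ('0'..'9')+
DOUBLE_APPROX = r"e[+-]?[0-9]+"

# doubleExact: '.' digits+ (doubleApprox)?
#            | digits+ ( '.' digits* (doubleApprox)? | doubleApprox )
DOUBLE_EXACT = re.compile(
    rf"\.[0-9]+({DOUBLE_APPROX})?"
    rf"|[0-9]+(\.[0-9]*({DOUBLE_APPROX})?|{DOUBLE_APPROX})"
)

def is_identifier(text):
    return IDENTIFIER.fullmatch(text) is not None

def is_double_exact(text):
    return DOUBLE_EXACT.fullmatch(text) is not None

for sample in ["person_1", "1abc", ".5", "1.5e-3", "2e10", "12"]:
    print(sample, is_identifier(sample), is_double_exact(sample))
```

A plain integer such as 12 is not matched by doubleExact; in the grammar it is covered by longLiteral instead.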
Copyright notice
OQL Sample Grammar for Object Data Management Group (ODMG)
Copyright (c) 1999 Micro Data Base Systems, Inc. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted
provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions
and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of
conditions and the following disclaimer in the documentation and/or other materials provided
with the distribution.
3. All advertising materials mentioning features or use of this software must display the
following acknowledgment: "This product includes software developed by Micro Data Base
Systems, Inc. (http://www.mdbs.com) for the use of the Object Data Management Group
(http://www.odmg.org/)."
4. The names "mdbs" and "Micro Data Base Systems" must not be used to endorse or promote
products derived from this software without prior written permission. For written permission,
please contact info@mdbs.com.
5. Products derived from this software may not be called "mdbs" nor may "mdbs" appear in
their names without prior written permission of Micro Data Base Systems, Inc.
6. Redistributions of any form whatsoever must retain the following acknowledgment: "This
product includes software developed by Micro Data Base Systems, Inc. (http://www.mdbs.com)
for the use of the Object Data Management Group (http://www.odmg.org/)."
THIS SOFTWARE IS PROVIDED BY MICRO DATA BASE SYSTEMS, INC. "AS IS" AND ANY
EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL MICRO DATA BASE SYSTEMS, INC. OR ITS ASSOCIATES BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

5 NewSQL Features
5.1 NewSQL Concepts and Terms
NewSQL is a new language for database access. Functionally, it is almost identical to
SQL. The new language should be easier to learn, more elegant, consistent, and well
defined. The new language should not be a subset or extension of SQL, and not an
object database language.
To make the new language easier to learn, the same data organization is used as in
the SQL language. If possible, the same words are used for the same concepts;
unlike in SQL, however, the names in NewSQL are well-defined and not vendor
specific. In SQL, some database vendors mean different things when using names like
'database', 'schema', 'instance' and so on. This should be avoided in the new language
wherever possible.
Databases
In NewSQL, a database contains a set of tables. A user can connect to a database, and
can access the tables in this database.
A database is a logical entity, and not necessarily a physical one. In NewSQL, the
relation between database and processes, as well as the relation between databases
and the file system is not specified. It is possible that a database process manages
multiple databases at the same time, and that multiple databases reside in the same
file or that a database is spread out to different machines.
Tables
As in SQL, the data is organized in tables containing rows of data, and each row in a
table contains the same columns.
Columns
Each table contains a number of columns. Each column has a name and one specific
data type.
Indexes
Indexes are used for faster access and for quicker ordering of data in a table.
Originally, SQL did not define indexes as part of the language because indexes are an
optimization, and in some databases it may not be required to define the indexes
manually. However, most databases do support indexes.

In the SQL language, the indexes also have names, as in NewSQL. However, index
naming differs among database products: in some cases (for example Oracle),
the index names need to be unique across a database. For other products (for
example MS SQL Server), the index names need to be unique only within a specific
table. In the NewSQL language, the index names need to be unique within a specific
table.
Users
A user can connect to a database using a password. In this database, the user may or
may not have the access rights to read / update / delete data or create new objects.
Roles
In the current version of NewSQL, roles are not defined. However, roles are an
important concept for grouping access rights.
Other terms
In the scope of NewSQL, other terms such as instance, schema, catalog and so on are
not defined.
5.2 NewSQL Language Features (Commented)
A part of this document is repeated in the appendix, in the chapter 'NewSQL
Grammar', but with the comments removed. This duplication is done on purpose,
so that developers who are not interested in the history and / or reasons for a design
decision, but only in the final design, can read it without distraction.
Language Style (Look and Feel)
Certain features exist in all languages, but in a slightly different form. An
example is comparison syntax: in Java, the comparison operators are: ==, !=, <=,
>=, <, >. In SQL, they are: =, <>, <=, >=, <, >. There are also language details that
define the language style, like the usage of brackets (), [], {} and uppercase /
lowercase usage.
The Language Style was discussed in the Survey under the question 'Style':
Answer | Count
Java, C#, Perl, Python | 45
SQL, COBOL | 14
Other | 2
No Answer | 3
Modern programming languages like Java are the clear winners. For this reason, the
look and feel of Java is used wherever possible.
General Grammar Style
The grammar style was discussed in the Survey under the question 'Which grammar
do you like most?'. This was probably the most important question, but unfortunately
the result is not clear.
Most people seem to like SQL, but this may be only because it is familiar, or, as a
recent study in the USA showed, because SQL is the first answer. The results of the
survey were:
Answer | Count
SQL | 24
Java Style | 19
Simplified SQL | 14
No opinion or answer | 7
However, NewSQL should not look like SQL. There is already a project in progress that
unifies the different SQL dialects ([LDBC], see also http://ldbc.sf.net).
For this project, the second most wanted option, Java Style, was picked. The third
grammar style, simplified SQL, was not implemented for the following reasons:
- It is similar to SQL, and may be confusing for people already familiar with SQL, or as one person in the survey noted: "Too similar to SQL to avoid confusion - I think it'll just fall between the stools".
- Only one grammar style was implemented. As SQL was not really a choice for this project, the style with the most votes after SQL, Java Style, was used. The simplified SQL style can be implemented at a later time.
- The Java style fits well with the language style (look and feel), where most people voted for Java / C# / Perl / Python.
Advantages of the Java Style
- In most cases, the table (or 'object') is on the left of the statement. In this sense, it looks like an object oriented programming language.
- There is no need to explicitly state the command in the most common case: the 'select' case. All that is required is to specify the table name (and filter, if required).

2003-12-10 Page 36
Final Report NewSQL

isad*antage
Unfortunately, it is not always easy to keep the table on the left side of the statement:
for joins (inner joins, outer joins, unions), that means where multiple tables are
involved, it may be better to list the table name(s) after the keyword (join, union). For
more information, see also 'Design Alternatives'.
@sing Se%icolons
In the C, C++ and Java programming languages, every command must end with a
semicolon ';'. The semicolon is not required in many script languages like Javascript.
Given that NewSQL is more a script language (where single statement commands are
very common); we define the semicolon to be used for command separation only (to
execute a batch of commands). As currently, only single command statements are
supported, semicolons are not used.
Identifiers
Case Sensitivity
Identifiers are table names, column names and so on. In SQL, they are case-
insensitive. However, in most other programming languages (for example Java),
identifiers are case sensitive.
The survey contains a question about case sensitive identifiers. The results are:
Answer | Count
Case insensitive like in SQL, COBOL | 32
Case sensitive like Java, C, C++, C#,... | 24
No opinion or answer | 8
Most people prefer case insensitive identifiers; however, the result is not as clear
as for the question about Language Style, where the majority voted for Java style.
In this case, NewSQL does not implement what most people voted for, because case
insensitive identifiers would conflict with the Java style. Therefore, the identifiers in
NewSQL are case sensitive.
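The difference between the two approaches can be illustrated with a short sketch. The SQL part assumes SQLite (Python's sqlite3 module) as a stand-in for an SQL database; the NewSQL part is only a hypothetical catalog lookup and not the actual NewSQL implementation.

```python
import sqlite3

# SQL side: unquoted identifiers are case insensitive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test(Id INT)")
conn.execute("INSERT INTO TEST VALUES(1)")
sql_rows = conn.execute("select id from test").fetchall()

# NewSQL side: a hypothetical catalog that compares names exactly.
catalog = {"Test": ["Id"]}

def has_table(name):
    """Case sensitive lookup, as for identifiers in Java."""
    return name in catalog

print(sql_rows, has_table("Test"), has_table("TEST"))
```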
Quoted Identifiers
SQL supports 'quoted identifiers'. If double quotes are used around identifiers, they
become case sensitive and can include spaces and special characters. In the survey,
the question was asked whether NewSQL should support quoted identifiers. The results
are:
Answer | Count
There is no need to support quoted names | 35
Quoted names should be supported | 19
No opinion or answer | 10
Another reason not to support quoted identifiers is that they are easily confused with
text constants. In C, C++, C#, Java and most other programming languages, double
quotes are used for strings.
Therefore, quoted identifiers are not supported by NewSQL.
Creating a Table
<table> = new table ( <col>|<key>,... )
Creates a new table. <col> is:
nullable autoincrement key <type> <name>
The modifiers nullable, autoincrement and key are optional. <type> is:
int | string | decimal | date | binary
and <key> is:
key( <col>,... )
Examples:
Script | Meaning
test=new table(key int id, string name) | Create a new table called 'test' with two columns: id (this is the primary key) and name. The column 'id' is of type integer, and 'name' is of type string.
line=new table(int invoiceId, int lineId, string text, decimal value, key(invoiceId, lineId)) | Create a table 'line' with 4 columns. The combination of invoiceId and lineId is the primary key.
Changes in the Grammar
The grammar draft in the survey was changed for the final version of NewSQL:
Survey: test=new table(int id, string name, key(id))
Final: test=new table(key int id, string name)
The keyword 'key' is now a modifier, similar to 'final' in Java. This simplifies the
common case, where the primary key is just one column.
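How a 'new table' statement in the final grammar could be mapped to an SQL CREATE TABLE statement is sketched below. This is not the NewSQL to SQL translator of this project, only a minimal illustration. The function names, the type mapping (including the column sizes) and the rule that columns without 'nullable' become NOT NULL are assumptions made for this example.

```python
import re

# Assumed mapping from NewSQL types to SQL types (sizes are assumptions).
SQL_TYPES = {
    "int": "INT",
    "string": "VARCHAR(255)",
    "decimal": "DECIMAL(20,10)",
    "date": "DATETIME",
    "binary": "BLOB",
}

def split_top_level(text):
    """Split on commas that are not inside brackets."""
    parts, depth, current = [], 0, ""
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        parts.append(current.strip())
    return parts

def translate_new_table(statement):
    """Translate '<table> = new table(...)' into CREATE TABLE."""
    match = re.fullmatch(
        r"\s*(\w+)\s*=\s*new\s+table\s*\((.*)\)\s*", statement, re.S
    )
    if not match:
        raise ValueError("not a 'new table' statement")
    table, body = match.groups()
    definitions = []
    for part in split_top_level(body):
        key = re.fullmatch(r"key\s*\((.*)\)", part, re.S)
        if key:
            # Separate key definition, e.g. key(invoiceId, lineId).
            columns = ", ".join(c.strip() for c in key.group(1).split(","))
            definitions.append(f"PRIMARY KEY({columns})")
            continue
        # Column definition: optional modifiers, then type and name.
        *modifiers, type_name, name = part.split()
        sql_type = SQL_TYPES[type_name]
        if "autoincrement" in modifiers:
            sql_type += " AUTOINCREMENT"
        column = f"{name} {sql_type}"
        column += " NULL" if "nullable" in modifiers else " NOT NULL"
        if "key" in modifiers:
            column += " PRIMARY KEY"
        definitions.append(column)
    return f"CREATE TABLE {table}({', '.join(definitions)})"

print(translate_new_table("test=new table(key int id, string name)"))
```

With 'key' as a modifier, the common case of a single-column primary key needs no separate key(...) entry, so the column loop alone handles it.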
Autoincrement columns
In many cases, the primary key is just one column, and an incrementing number
should be used for each row inserted. Only the newest version of SQL (SQL2003)
addresses the topic of autoincrement columns, and it provides two ways to implement
it: sequences and identity columns. In this way, the SQL 'standard' only documents
the current situation: some databases support sequences (for example, Oracle) and
others support identity columns (MS SQL Server, MySQL).
In the survey, this topic is addressed as well. The result of the question
'Autoincrement Column' is:
Answer | Count
Autoincrement columns should be supported | 38
Oracle style sequences should be supported | 14
No opinion or answer | 12
The survey shows that developers prefer autoincrement (identity) columns to
sequences. Therefore, NewSQL supports this feature. Autoincrement columns are
defined when the table is created.
Data Types and Constants
The list of data types for NewSQL is:

Data type name  Limitation                          Searchable  java.sql.Types  Constant
int             -(2^31) .. (2^31-1)                 Yes         INTEGER         1
string          255 chars*                          Yes         VARCHAR         'Hello' or "Hello"
decimal         20 precision, 10 scale              Yes         DECIMAL         3.14
date            1 sec precision, years 1800 - 9999  Yes         TIMESTAMP       date('2002-01-01') or date('2002-01-01 20:00:00')
binary          400'000 bytes                       No          BLOB            0x01020a0b0c
There are no data type aliases defined.
Dropping a Table
<table> .drop()
Removes all data from a table and erases the table meta data as well.
Script: test.drop()
Meaning: Removes the table test including all data.
Why Keep the Empty Brackets
In SQL, actions never have brackets; however, NewSQL should follow the Java style.
In programming languages like Java, the empty brackets '()' are used to
distinguish between actions (methods) and variables. Therefore, the empty brackets
are kept if there is an action involved.
Retrieving Data
Where to Retrieve the Data From
It is possible to:
- Retrieve data from a single table
- Combine the data from multiple tables in one set

Filtering
Sometimes not all data needs to be retrieved. There are two ways to filter data:
- Filtering the rows based on a condition
- Retrieving only some of the columns that are available
Retrieving Data from a Single Table
<table> <alias> [<condition>] ( <expr>,... )
<table> is the name of the table. <alias> may be used as a local alias for the table
name. <condition> is similar to a condition in Java. If only some columns need to be
retrieved, they may be specified at the end.
Examples:
Script: test
Meaning: Retrieve all rows and all columns from the table named 'test'.
Script: test[id==1]
Meaning: Retrieve the row(s) where the column 'id' is equal to 1.
Script: test(id, name)
Meaning: Select all rows from the table 'test', but only the columns id and name.
Script: test[name=='World' || id<0](name)
Meaning: Only select rows where the name is 'World' or where the id is smaller
  than 0. Only retrieve the column 'name'.
Changes in the Grammar
The grammar draft in the survey was changed for the final version of NewSQL:
Original: test.get()
Changed: test
As the most common case, the select statement should be as simple as possible. It
seems it is not possible to simplify it more than this. It is still possible to specify a
filter or column list in the same way: test[id==1]; test(name).
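The single-table syntax can be illustrated with a small Java helper that assembles a NewSQL query string from its parts. This is only a sketch: the class and method names are hypothetical and are not part of the NewSQL driver; it merely follows the pattern <table> [<condition>] ( <expr>,... ) described above.

```java
/** Hypothetical helper (not part of NewSQL) that builds single-table queries. */
final class NewSqlQuerySketch {

    /**
     * Builds "table[condition](col1, col2)".
     * The condition and the column list are both optional.
     */
    static String select(String table, String condition, String... columns) {
        StringBuilder sb = new StringBuilder(table);
        if (condition != null && !condition.isEmpty()) {
            // The row filter goes into square brackets
            sb.append('[').append(condition).append(']');
        }
        if (columns.length > 0) {
            // The column list goes into round brackets
            sb.append('(').append(String.join(", ", columns)).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(select("test", null));
        System.out.println(select("test", "id==1"));
        System.out.println(select("test", null, "id", "name"));
        System.out.println(select("test", "name=='World' || id<0", "name"));
    }
}
```

The four calls in main reproduce the four example scripts from the table above.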

Joining Data from Multiple Sources
join ( <select>,... ) [ <condition> ] ( <expr>,... )
Join multiple tables together.
Examples:
Script: join(invoice i, line l)[i.id == l.invoiceId]
Meaning: Retrieve all rows from the table 'invoice' and link them with all columns
  from the table 'line'.
Script: join(invoice i, line l)[i.id == l.invoiceId](i.id, l.value)
Meaning: Same as above, but only retrieve the columns 'id' from invoice and 'value'
  from line.
Changes in the Grammar
The grammar draft in the survey was changed for the final version of NewSQL:
Original: t1=test; t2=test; t1.join(t2[t1.id==t2.id]).get(t1.id,t2.name)
Changed: join(test t1, test t2)[t1.id==t2.id](t1.id, t2.name)
The join syntax is the hardest part. After some discussion and testing, the keyword 'join'
was moved to the far left. Even if this makes the language a little bit inconsistent
(now the table name is not always on the far left), keeping the list of tables
on the same level seems to improve the readability of the statement.
Explicit versus Implicit Joins
In SQL, the joined tables and the type of join must be specified when the query is
executed; that means the links between the tables must be stated explicitly for every
query.
In OQL it is possible to navigate to the joined tables without having to specify the
links. That means joins are implicit (or can be implicit) for OQL.
For NewSQL, the idea of implicit joins was discussed. One of the questions is: how
important are implicit joins? Even if they are not supported, this feature can be added
to the language at a later time.
In the survey, this question was asked as well under the point 'joins'. The result is:

Answer                                                       Count
Explicit joins like in SQL are enough                           28
NewSQL should support automatic navigation similar to OQL       23
No opinion or answer                                            13
For this reason, implicit joins are not yet supported in the first version of NewSQL;
however, they seem to be a good addition for the future.
Search Conditions
As already discussed, most developers seem to like the Java style. The grammar for
search conditions is therefore very similar to Java. There are differences, however: in
Java, a condition is evaluated against one set of values at a time, whereas in databases
(at least for the languages SQL and OQL), a condition is always applied to a set of data.
Not all of the possibilities of C++ / Java are supported by NewSQL. Here is the list of
currently implemented functionality:
Condition         NewSQL            SQL
Equal             ID == 10          ID = 10
Bigger            ID > 10           ID > 10
Smaller           ID < 10           ID < 10
Bigger or Equal   ID >= 10          ID >= 10
Smaller or Equal  ID <= 10          ID <= 10
Not Equal         ID != 10          ID <> 10
And               ID>0 && ID<10     ID>0 AND ID<10
Or                ID<0 || ID>10     ID<0 OR ID>10
Not               ! (ID==1)         NOT (ID=1)
Unlike in Java, the comparison operators can also be used for strings, and for
objects that can be null (in Java, only == can be used for objects).
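The operator mapping in the table can be sketched as a small character scanner in Java. This is an illustration only, under the assumption that a plain operator-by-operator substitution is sufficient; it is not the actual NewSQL to SQL translator (which uses a parser), and the class name is hypothetical. Text inside quotes is copied unchanged.

```java
/** Hypothetical sketch of the operator mapping; not the real NewSQL translator. */
final class ConditionOperatorSketch {

    /** Replaces Java-style operators (==, !=, &&, ||, !) by their SQL counterparts. */
    static String toSql(String condition) {
        StringBuilder out = new StringBuilder();
        char quote = 0; // 0 means: currently not inside a text constant
        for (int i = 0; i < condition.length(); i++) {
            char c = condition.charAt(i);
            char next = i + 1 < condition.length() ? condition.charAt(i + 1) : 0;
            if (quote != 0) {
                // Inside a text constant: copy as is until the closing quote
                out.append(c);
                if (c == quote) {
                    quote = 0;
                }
            } else if (c == '\'' || c == '"') {
                quote = c;
                out.append(c);
            } else if (c == '=' && next == '=') {
                out.append('=');
                i++;
            } else if (c == '!' && next == '=') {
                out.append("<>");
                i++;
            } else if (c == '&' && next == '&') {
                out.append("AND");
                i++;
            } else if (c == '|' && next == '|') {
                out.append("OR");
                i++;
            } else if (c == '!') {
                out.append("NOT");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSql("ID>0 && ID<10"));
        System.out.println(toSql("! (ID==1)"));
    }
}
```

The operators >, <, >= and <= need no special handling because they are identical in both languages.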
Null handling
The Survey showed that the null handling of SQL is not very intuitive:
Answer                                                            Count
I think these NULL rules are confusing and should be simplified      38
I like these NULL rules                                              18
No opinion or answer                                                  8
If possible, the null handling should be simplified. However, some experiments have
shown that it is not easy to do that. It would take quite a lot of time to implement the
simplification in a consistent way. Due to time limitations, NewSQL currently uses the
null handling of the underlying database.
For more information about this topic, please see 'Possible Future Work' / 'Null
handling simplification'.
Expressions
Here too, the Java style is used where possible; however, the difference to the SQL
style is very small. At this time, only the minimum set of operations is implemented:
Expression      Data Types  Example   Remarks
Addition        Numeric     ID + 10
Subtraction     Numeric     ID - 10
Multiplication  Numeric     2 * ID
Division        Numeric     ID / 10   The result for division by 0 is vendor specific
Adding Data
<table> .add(<col>=<value>,...)
This adds one row to an existing table.
Script: test.add(id=1, name='Hello')
Meaning: Add a row with id 1 and name 'Hello'.
Changes in the Grammar
The grammar draft in the survey was changed for the final version of NewSQL:
Original: test.add(1,"Hello")
Changed: test.add(id=1, name="Hello")
Keeping the column names close to the values improves the readability of the insert
statement a lot. This idea is not new; however, it is not implemented in SQL.
Changing Data
<table> [<condition>] .set(<col>=<expr>,... )
Update rows in a table based on a condition.
Script: test.set(name='')
Meaning: Set the name to an empty string in all rows of the table test.
Script: test[id==2].set(name='World')
Meaning: Change the name to 'World' in the table 'test' where id is 2.
Script: employee[salary>10].set(salary=salary-10)
Meaning: Reduce the salary by ten for all rows with salary bigger than ten.
Deleting Data
<table> [<condition>] .remove()
Delete rows in a table based on a condition.
Script: test.remove()
Meaning: Remove all rows in the table 'test'.
Script: test[id==2].remove()
Meaning: Delete the row with the id 2. If no row is found, nothing is deleted.
Changes in the Grammar
The grammar draft in the survey was changed for the final version of NewSQL:
Original: test[id==1].delete()
Changed: test[id==1].remove()
The word 'delete' was replaced with 'remove' to more closely match the verb used in
the Java collection interface. Developers familiar with Java will like the new syntax;
but people familiar with SQL may not like this change.
Transactions
Transaction handling is currently not implemented in NewSQL; however, it is possible
to support transactions using the JDBC interface: the methods Connection.commit()
and Connection.rollback() can be used to commit a transaction or to undo the
changes. Please note that the connection must first be switched to the manual commit
mode by calling Connection.setAutoCommit(false).
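The JDBC calls named above can be combined as in the following sketch. The helper class and method names are hypothetical; only Connection.setAutoCommit, commit and rollback are taken from the text. The statements passed in could be NewSQL statements when the connection was opened through the NewSQL driver.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper: runs several statements as one JDBC transaction. */
final class TransactionSketch {

    static void runInTransaction(Connection conn, String... statements)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        // Switch to manual commit mode, as required for transactions
        conn.setAutoCommit(false);
        try (Statement stat = conn.createStatement()) {
            for (String s : statements) {
                stat.execute(s);
            }
            // All statements succeeded: make the changes permanent
            conn.commit();
        } catch (SQLException e) {
            // Something failed: undo all changes of this transaction
            conn.rollback();
            throw e;
        } finally {
            // Restore the previous commit mode of the connection
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

A call such as runInTransaction(conn, "test.add(id=1, name='Hello')", "test[id==2].remove()") would then either apply both changes or none of them, provided the underlying database supports transactions.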
Keywords
Here is the list of reserved words for NewSQL. Please note that the language definition is
not finished, so this list may still grow. Some of these keywords, such as 'drop', 'add',
'remove' and 'set', could still be used as object names because they are only method names.
add     autoincrement  binary  date   decimal
drop    group          int     join   key
new     nullable       null    order  remove
set     string         table   union
5.3 NewSQL Grammar Reference
The grammar of the NewSQL language is listed here as a reference and for those who
are already familiar with the concepts of SQL. The semantics of the language and
examples are discussed in the chapter 'Language Features'.
statement:
( assignment
| command
| union
| join
)
EOF
;
union:
"union" '(' object ( ',' object )* ')'
;
join:
"join"
'(' object ( ',' object )* ')'
'[' condition ']'
( '(' expression ( ',' expression )* ')' )?
( order ) ?
;


order:
'.' "order" '(' expression ( ',' expression )* ')'
;
assignment:
identifier '=' "new" "table" '(' table_def ')'
;
command:
object ( '.' operation ) ?
;
table_def:
column_def (',' column_def)*
;
column_def:
( ("nullable")? (("autoincrement")? "key")? datatype identifier )
|
( "key" '(' identifier ( ',' identifier )* ')' )
;
modifier:
"null" | "key"
;
datatype:
"int"
| "string"
| "decimal"
| "date"
| "binary"
;
object:
identifier ( alias ) ? ( '[' condition ']' )?
( '(' expression ( ',' expression )* ')' )?
( order ) ?
;
alias:
identifier
;
operation:
( "remove" '(' ')' )
| ( "set" '(' assign (',' assign)* ')' )
| ( "add" '(' assign (',' assign)* ')' )
| ( "drop" '(' ')' )
;


assign:
identifier '=' expression
;
condition:
cond_or
;
cond_or:
cond_and ("||" cond_and)*
;
cond_and:
cond_neg ("&&" cond_neg)*
;
cond_neg:
cond_rel | ( '!' cond_rel )
;
cond_rel:
cond_exp
( ( "==" cond_exp )
| ( '>' cond_exp )
| ( '<' cond_exp )
| ( "!=" cond_exp )
| ( ">=" cond_exp )
| ( "<=" cond_exp )
)*
;
cond_exp:
cond_factor (('+' | '-') cond_factor)*
;
cond_factor:
cond_term (('*' | '/') cond_term)*
;
cond_term:
cond_end
| ( '-' cond_end )
;
cond_end:
value_or_column
| ( '(' condition ')' )
;


expression:
expr_exp
;
expr_exp:
expr_factor (('+' | '-') expr_factor)*
;
expr_factor:
expr_term (('*' | '/') expr_term)*
;
expr_term:
expr_end
| ( '-' expr_end )
;
expr_end:
value_or_column
| ( '(' expression ')' )
;
value_or_column:
value
| column
;
column:
( identifier '.' )?
identifier
;
value:
decimal_value
| string
| hex
| datevalue
| '?'
| "null"
;
identifier:
IDENTIFIER
;
string:
STRING
;
hex:
HEX
;


datevalue:
"date" '(' STRING ')'
;
decimal_value:
( NUMBER
| '.' NUMBER
| NUMBER '.' (NUMBER)?
)
;
IDENTIFIER:
'a'..'z' ( 'a'..'z' | '0'..'9' | '_' )*
;
STRING:
( '\'' ( ~'\'' )* '\'' ( STRING )* )
|
( '\"' ( ~'\"' )* '\"' ( STRING )* )
;
HEX:
'0' 'x' ( '0'..'9' | 'a'..'f' )*
;
NUMBER:
'0'..'9' ( '0'..'9' )*
;
WS:
( ' ' | '\t' | '\n' | '\r' )
;


5.4 The NewSQL JDBC Driver
About JDBC
JDBC stands for Java Database Connectivity and is the main interface for Java to
relational databases. The API is already well documented:
http://java.sun.com/products/jdbc/ [JDBC]. Many books have been written on this
topic. One of the best books is the reference guide:
Books about JDBC
JDBC API Tutorial and Reference, Third Edition
Fisher, Ellis, and Bruce, Addison-Wesley, 2003.
JDBC Database Access with Java: A Tutorial and Annotated Reference
Graham Hamilton, Rick Cattell, Maydene Fisher, Addison-Wesley, 1997.
The NewSQL JDBC Driver Architecture
The NewSQL JDBC Driver is responsible for converting NewSQL statements to SQL. This
is done transparently to the user application, which means the user application can
use the NewSQL JDBC driver just like any other JDBC driver (except that NewSQL
statements can be used instead of SQL statements).
Behind the scenes, the NewSQL JDBC driver converts the statement to standard SQL,
and calls the LDBC JDBC driver in turn. This way, the NewSQL driver does not have to
deal with vendor specific SQL statements.
The architecture of this software stack is as follows:

User App --(execute NewSQL)--> NewSQL JDBC Driver --(standard SQL)--> LDBC JDBC Driver
  --(vendor SQL)--> Vendor JDBC Driver --(execute vendor SQL)--> Vendor Database

Please note that even though it looks like the two added layers (the NewSQL driver
and the LDBC driver) between the user application and the database will decrease the
performance of the application, this is actually not the case. As the Benchmark
Application (discussed in the chapter 'Applications', 'Benchmark Application') shows,
the performance difference between using NewSQL and SQL is negligible.
Using the NewSQL JDBC Driver
In order to use NewSQL in an application, the following things need to be changed:
- The JDBC driver class name
- The JDBC URL
Everything else is done by the NewSQL driver (or by the LDBC driver) automatically.
Therefore, it is not required to load the vendor JDBC driver in the user application.
Example:
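Using vendor specific SQL:

    import java.sql.*;

    public class Test {
        public static void main(String[] a) throws Exception {
            // Load the vendor JDBC driver (HSQLDB)
            Class.forName("org.hsqldb.jdbcDriver");
            // Connect directly to the HSQLDB database 'test'
            Connection conn = DriverManager.getConnection(
                    "jdbc:hsqldb:test", "sa", "");
            // Execute a plain SQL statement
            conn.createStatement().execute("UPDATE TEST SET ID=ID+1");
            conn.close();
        }
    }

Using NewSQL:

    import java.sql.*;

    public class Test {
        public static void main(String[] a) throws Exception {
            // Load the NewSQL JDBC driver instead of the vendor driver
            Class.forName("org.newsql.jdbc.jdbcDriver");
            // The URL contains the additional 'newsql' prefix
            Connection conn = DriverManager.getConnection(
                    "jdbc:newsql:hsqldb:test", "sa", "");
            // Execute the equivalent NewSQL statement
            conn.createStatement().execute("test.set(id=id+1)");
            conn.close();
        }
    }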
To run this 'Hello World' style example, a database named 'test' must exist and
contain the table TEST with the column ID.
In this example, the JDBC driver, URL, as well as the user name and password, are
hard-coded. In many applications, these parameters are set external to the
application (for example, in a properties file).
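As an illustration of keeping these parameters outside the code, the following sketch reads them with java.util.Properties. The property keys (driver, url, user, password) and the class name are assumptions made for this example; they are not defined by NewSQL.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

/** Hypothetical sketch: connection settings kept outside the source code. */
final class ConnectionSettingsSketch {

    /** Parses settings given in the java.util.Properties text format. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    /** Opens a connection; the key names are assumptions of this sketch. */
    static Connection open(Properties p) throws Exception {
        Class.forName(p.getProperty("driver"));
        return DriverManager.getConnection(
                p.getProperty("url"),
                p.getProperty("user"),
                p.getProperty("password", ""));
    }

    public static void main(String[] args) {
        // In a real application this text would come from a properties file
        String example = "driver=org.newsql.jdbc.jdbcDriver\n"
                + "url=jdbc:newsql:hsqldb:test\n"
                + "user=sa\n"
                + "password=\n";
        Properties p = parse(example);
        System.out.println(p.getProperty("driver"));
        System.out.println(p.getProperty("url"));
    }
}
```

With this approach, switching between the vendor driver and the NewSQL driver only requires changing the two entries driver and url.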

6 Applications
6.1 Demo Application
Purpose
The demo application shows how to use NewSQL inside an application using JDBC
calls. There are two versions of the demo application: one written using ANSI-SQL, and
the other written using NewSQL. By comparing the source code of the two
applications, one can see what needs to be changed in order to convert an application
from using SQL to using NewSQL.
Functionality
The application represents a very simple address book.
When the application is started, an address list is displayed. New addresses can be
added to the address book; existing records can be edited or deleted.
Each address contains a name, a phone number, an e-mail address and a picture.
How to Run the Application
There are two versions: the NewSQL version, and the SQL version. Both are
functionally equivalent, but the first one uses NewSQL statements, and the second
one SQL statements.
Both versions of the application can be started from the command line:
Open a command window and navigate to the directory:
newsql/lib
To run the NewSQL version, execute the command:
java -cp newsql.jar org.newsql.demo.newsql.AddressBook
To run the SQL version, execute the command:
java -cp newsql.jar org.newsql.demo.sql.AddressBook
For Windows, batch files are provided for your convenience.
To run the NewSQL version, double click the batch file:
newsql/bin/runDemoNewSQL.bat
To run the SQL version, double click the batch file:
newsql/bin/runDemoSQL.bat

Application Architecture
Only the most important attributes and methods are shown here. There is one
AddressBook object per application, which has a list of Address objects. Each Address
object has a link to a Picture.

[Class diagram]
AddressBook
  Attributes: String DRIVER, URL, USER; Connection conn; Hashtable cache;
    Vector addressVector; Address address
  Methods: main(String[]), run(), prepare(String), updateList(), close()
Address
  Attributes: AddressBook book; int id; String name, phone, email; Picture picture
  Methods: delete(AddressBook, int), readAll(AddressBook), Address(AddressBook),
    read(int), save()
Picture
  Attributes: byte[] data; Image image
  Methods: Picture(), Picture(byte[]), Picture(String), setData(byte[]),
    byte[] getData(), Image getImage()
Associations: AddressBook 1 to * Address; Address 1 to 1 Picture

AddressBook
The database driver, URL, user and password are hardcoded.
The PreparedStatements are cached using a Hashtable.
The application has all Address items loaded in a Vector.
There is always one Address object active.
Address
The Address contains the id, name, phone, email and picture data.
This object knows how to read and save itself from/to the database.
Picture
This helper class is responsible for reading an image from a file, converting it to a byte
array (to be stored in the database), and converting the byte array to an Image
object that can be displayed on the screen.
6.2 Benchmark Application
Purpose
The purpose of the Benchmark application is to show that an application using
NewSQL is not slower than an application using SQL.
In many cases, a new version of a piece of software is slower than the older version.
The same goes for new technology: for example, the programming language Java was
very slow in the beginning. Even though subsequent versions of Java got faster, there is still
the perception that Java is slow.
In the case of NewSQL, one might fear that it is slower than SQL, because it is an
additional layer between the application and the database.
However, there is no reason why an application using NewSQL should be measurably
slower than an application using SQL: the statement conversion is done only at
prepare time (when creating the prepared statement in the case of JDBC). In a
well-written application, statement compilation is done only once, and then prepared
statements are used. Compilation normally takes less than 1% of the time of the
actual work (retrieving and storing data).
It is even possible that using NewSQL is faster than using SQL directly, for example if
the application does not use prepared statements. In this case, the added layer (the
NewSQL JDBC driver) could cache the prepared statement and thus make the
application faster than when using native SQL.
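The caching idea can be sketched with a small generic cache keyed by the statement text. This is a hypothetical illustration, not the implementation of the NewSQL driver; the class name and the prepare function are assumptions. In a JDBC setting, the cached value would be a PreparedStatement, which is only valid for the connection that created it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a statement cache keyed by the statement text. */
final class StatementCacheSketch<V> {

    private final Map<String, V> cache = new HashMap<>();
    private final Function<String, V> prepare;
    private int prepareCount;

    /** @param prepare the expensive step, for example translating and preparing */
    StatementCacheSketch(Function<String, V> prepare) {
        this.prepare = prepare;
    }

    /** Returns the cached object, preparing it only on first use. */
    synchronized V get(String statement) {
        V prepared = cache.get(statement);
        if (prepared == null) {
            prepared = prepare.apply(statement);
            prepareCount++;
            cache.put(statement, prepared);
        }
        return prepared;
    }

    /** Number of times the expensive prepare step was actually executed. */
    synchronized int getPrepareCount() {
        return prepareCount;
    }
}
```

Executing the same statement text repeatedly then pays the conversion and compilation cost only once, which is the effect described in the paragraph above.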

Functionality
The benchmark application uses an algorithm similar to the TPC-A benchmark
algorithm defined by the Transaction Processing Performance Council
(http://www.tpc.org [TPC]).
The TPC-A benchmark is a very simple database performance test algorithm. It was
issued in 1989 and used to measure performance in update-intensive database
environments. Such use cases are usually called on-line transaction processing (OLTP)
applications. They are characterized by:
- Multiple concurrent connections
- Significant disk input and output
- Fast response time, simple operations
- Transaction integrity
The benchmark uses only one, very simple transaction. The benchmark
measures how many such transactions a system can process.
The TPC-A benchmark is now obsolete; it was replaced by more complex algorithms,
such as the TPC-C benchmark.
The TPC-A benchmark was selected because it is very simple to implement, because it
is a standard test that has been used by many companies, and because it represents a
realistic use case similar to modern 3-tier web applications.
Benchmark Results
The test results show that there is no measurable performance difference between
NewSQL and SQL.
The benchmark was executed using different databases, and using both SQL and
NewSQL statements, on the same system.
Because a Java application is compiled at runtime, and because the benchmark results
vary on each run, the application loops around the main benchmark a few times. The
variance is quite big, as one can see from the detailed results. In the end, it is not
possible to measure a statistically significant difference between using NewSQL and SQL.
The results are in milliseconds, so smaller is faster.

Oracle
Run        NewSQL     SQL
# 1         13810   14961
# 2         13870   13980
# 3         15643   14020
# 4         14861   14851
# 5         14200   14752
Average     14477   14513
[Chart: Oracle, NewSQL and SQL times for runs #1 to #5 and the average]

MySQL
Run        NewSQL     SQL
# 1          4556    4527
# 2          4436    4457
# 3          4677    4236
# 4          4466    4296
# 5          4356    4617
Average      4498    4427
[Chart: MySQL, NewSQL and SQL times for runs #1 to #5 and the average]

HSQLDB
Run        NewSQL     SQL
# 1          3815    3635
# 2          3636    3885
# 3          3646    3615
# 4          3635    3685
# 5          3625    3616
Average      3671    3687
[Chart: HSQLDB, NewSQL and SQL times for runs #1 to #5 and the average]

Please note that these benchmark tests were not made to compare the performance of
the databases. No database optimizations were made for any of the databases.
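The averages in the result tables can be recomputed from the individual runs. The following sketch (class and method names are hypothetical) calculates the mean per column and the relative difference between NewSQL and SQL, using the run times listed above.

```java
/** Hypothetical sketch: recomputes the averages of the benchmark tables. */
final class BenchmarkAverageSketch {

    // Run times in milliseconds, copied from the result tables
    static final long[] ORACLE_NEWSQL = {13810, 13870, 15643, 14861, 14200};
    static final long[] ORACLE_SQL = {14961, 13980, 14020, 14851, 14752};
    static final long[] MYSQL_NEWSQL = {4556, 4436, 4677, 4466, 4356};
    static final long[] MYSQL_SQL = {4527, 4457, 4236, 4296, 4617};
    static final long[] HSQLDB_NEWSQL = {3815, 3636, 3646, 3635, 3625};
    static final long[] HSQLDB_SQL = {3635, 3885, 3615, 3685, 3616};

    /** Arithmetic mean of the run times. */
    static double mean(long[] times) {
        long sum = 0;
        for (long t : times) {
            sum += t;
        }
        return (double) sum / times.length;
    }

    /** Difference of the NewSQL mean relative to the SQL mean, in percent. */
    static double relativeDifferencePercent(long[] newSql, long[] sql) {
        return (mean(newSql) - mean(sql)) / mean(sql) * 100.0;
    }

    public static void main(String[] args) {
        System.out.println("Oracle: "
                + relativeDifferencePercent(ORACLE_NEWSQL, ORACLE_SQL) + " %");
        System.out.println("MySQL:  "
                + relativeDifferencePercent(MYSQL_NEWSQL, MYSQL_SQL) + " %");
        System.out.println("HSQLDB: "
                + relativeDifferencePercent(HSQLDB_NEWSQL, HSQLDB_SQL) + " %");
    }
}
```

For all three databases the relative difference of the means stays below two percent in either direction, which is small compared to the spread between individual runs of the same variant.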
How to Run the Application
The benchmark can be started from the command line:
Open a command window and navigate to the directory:
newsql/lib
To run the benchmark, execute the command:
java -cp newsql.jar org.newsql.tpca.TPCA
For Windows, batch files are provided for your convenience.
To run the benchmark, double click the batch file:
newsql/bin/runTPCA.bat
In all cases, the benchmark will run against the database HSQLDB, which is included
in the installation. To run the application against another database, you will need to
install the database, include the JDBC driver of the database in the class path, and run
the benchmark with the following options:
-url <database url, without jdbc:>
-user <user name>
-password <password>
Example:
java -cp ../lib/newsql.jar;../lib/mysql.jar org.newsql.tpca.TPCA
-url mysql://localhost/test -user ldbc -password ldbc
Please note that the line break is for formatting only and must not be included in the
actual command line.

Application Architecture
The static part of the TPCA class initializes the database, and runs the main benchmark
in a loop; each time for SQL and for NewSQL.
This is a multithreaded application; for each Teller, one thread is created that
runs the transactions.
At the end of each run, the results are printed to the screen.
The first run is not counted because, during that time, the Java Virtual Machine compiles the
bytecode.
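The idea of discarding the first (warm-up) run can be sketched as follows. The class and method names are hypothetical and the code is not taken from the TPCA class; it only illustrates timing a task several times and averaging all runs except the first.

```java
/** Hypothetical sketch: time a task several times and ignore the warm-up run. */
final class WarmupSketch {

    /** Runs the task 'runs' times and returns the elapsed time of each run in ms. */
    static long[] timeRuns(Runnable task, int runs) {
        long[] times = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.currentTimeMillis();
            task.run();
            times[i] = System.currentTimeMillis() - start;
        }
        return times;
    }

    /** Average of all runs except the first one, which includes JVM compilation. */
    static double averageWithoutWarmup(long[] times) {
        if (times.length < 2) {
            throw new IllegalArgumentException("need at least two runs");
        }
        long sum = 0;
        for (int i = 1; i < times.length; i++) {
            sum += times[i];
        }
        return (double) sum / (times.length - 1);
    }

    public static void main(String[] args) {
        // Placeholder workload; a real benchmark would run the transactions here
        long[] times = timeRuns(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100000; i++) {
                sb.append(i % 10);
            }
        }, 6);
        System.out.println(averageWithoutWarmup(times) + " ms");
    }
}
```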

[Class diagram]
TPCA (static)
  Attributes: int BRANCHES, TELLERS, ACCOUNTS; int DELTA; String FILLER;
    String CREATE_SQL[][]; boolean log, trace
  Methods: main(String[]), run(String[]), run(boolean,String,String,String,int,int),
    showResult(String), trace(String)
TPCA (per Teller)
  Attributes: boolean newsql; Connection conn; int branch, teller, account;
    int transactions; long time
  Methods: TPCA(boolean,Connection,int,int,int,int), run(), runTestNewSQL(),
    runTestSQL(), getTime()
Association: TPCA (static) 1 to * TPCA (per Teller)

6.3 Database Tool
Purpose
The database tool was developed to let developers and users experience the look and
feel of the new language, NewSQL. For developers that are familiar with SQL, it is also
a way to see what SQL statements are actually executed in the database.
In addition to that, the tool can be used as a very generic, simple database query tool.
It is easy to use, provides the basic functionality needed to execute basic queries, and
works for many databases.
Functionality
After starting the tool, the query window appears. By default, the tool opens a
connection to a HSQLDB database.
How to Run the Application
The database tool can be started from the command line:
Open a command window and navigate to the directory:
newsql/lib
To run the database tool, execute the command:
java -cp newsql.jar org.newsql.tools.DatabaseTool
For Windows, batch files are provided for your convenience.
To run the database tool, double click the batch file:
newsql/bin/runDatabaseTool.bat
6.4 Build Process
The Build Process
Building NewSQL is fully automated by using an Ant script. For more information
about Ant, see also: http://ant.apache.org [Ant].
To compile the application, navigate to the source folder and execute the command:
ant
In this case, all source files are compiled and the newsql.jar file is created (the build
target 'all' is executed).
Building NewSQL for JDK 1.3.x
Some java.sql.* classes of JDK 1.4.x are not available in JDK 1.3.x and earlier. In order to be
JDK 1.4.x compatible, NewSQL must reference some of these classes in the JDBC API.
But, as these classes are not available in JDK 1.3.x, NewSQL cannot be compiled with
JDK 1.3.x.
To solve this problem, methods that are required in JDK 1.4.x, but not allowed in JDK
1.3.x, must be commented out before compiling the code.
This step is automated. The source code of NewSQL can be switched to be JDK 1.3.x
compatible.
Switching the code to JDK 1.3.x (by commenting out the new JDK 1.4 methods) can be done
by executing the command:
ant enable+jdk13
Afterwards, rebuild everything as described above:
ant
Building NewSQL for JDK 1.4.x and later
By default, the source code of NewSQL is switched to JDK 1.4.x and later. If the code
was switched to the JDK 1.3.x mode, it is not possible to compile it with JDK 1.4.x and
later. In this case, it is required to convert the source code to JDK 1.4.x first.
To do that, execute the command:
ant disable+jdk13
Afterwards, rebuild everything as described above:
ant
6.5 The Regression Test
Purpose
The regression test application was implemented to test the NewSQL to SQL translator
in an automated way. It executes NewSQL statements against the database
configured in the file test.properties.
To run the regression test, no manual intervention is necessary; that means each test
case runs fully automatically, including setup, input, and checking for correctness.
Functionality
Running the application does not produce any output if everything is ok. If an error
occurs, the application displays the stack trace and other error information.
How to Run the Application
The regression test can be started from the command line:
Open a command window and navigate to the directory:
newsql/source
To run the application, execute the command:
ant test
If no error was found, no output is generated. However, if an error occurs, the
stack trace is printed out.
To run the application with debug log enabled, execute the command:
ant testDebug

6.6 Coverage Test Results
The purpose of the coverage test is to make sure that the main areas of the code are
actually tested. While it is not possible to test every possible input, it is possible
to test at least the main areas of the code to make sure the number of bugs is as low
as possible. Testing is always a time trade-off: the more time is spent on testing, the
more likely it is that bugs are found and fixed. However, it is important to test all areas of the
code. That's why coverage tests are made.
How to Run the Coverage Test
To run the coverage test, navigate to the source folder and execute the command:
ant testCoverageTranslator
Test Results
The output of the test is:
...\newsql\source>ant testCoverageTranslator
Buildfile: build.xml
testCoverageTranslator:
coverageBefore:
[delete] Deleting 3?? files from C:\data\newsql\newsql\coverage
[copy] Copying 1?? files to C:\data\newsql\newsql\coverage
[java] 3 of 3 ??%
coverageAfter:
[javac] Compiling 1?? source files
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -deprecation for details.
[java] url: <jdbc:newsql:hsqldb:sample>
[java] Running org.newsql.test.TestGrammar
[java] ========================================================
[java] NOT COVERED
[java] --------------------------------------------------------
[java] Not covered: 11 % (?? of ??3)
[java] --------------------------------------------------------
[java] MOST CALLED
[java] Total: ????
[java] --------------------------------------------------------
The coverage statistics say that 11% of the code is not covered; that means about 89% of the
source code of the NewSQL to SQL converter is tested. The targets in the requirements
were 70% (must), 85% (should), and 95% (may).
Conclusions
Out of the remaining untested code, most code is actually assertion code (handling of
unexpected behavior or data; exception handling code). These lines of code are hard if
not impossible to reach; they represent a 'second line of protection' in the code.
Therefore, we do not currently plan to improve the test.

7 Outlook
7.1 Current Limitations
Database Tool Wizard
The current version of the database tool wizard is useful in many situations, but it is
far from being complete and stable. The reason is that the wizard does not use the
NewSQL parser tree, due to technical and time limitations. Instead, it uses a small set
of rules and parses the statement only partially (by scanning for some keywords, as
well as looking at the beginning and end of the statement). The implementation of the
wizard is a quick-and-dirty solution that works in most cases but is hard to extend.
If more time is available, the wizard should be re-implemented in a more extensible
way, so that it may become even more useful.
7.2 Design Alternatives
Join Syntax
For joins, it would be possible to keep the table name to the left. Example:
invoice.join(line)[invoice.id == line.invoiceId]
invoice i.join(line l)[i.id == l.invoiceId]
invoice[id>10](id).union(line[id>20](id))
However, it would look more complex, at least in the last example, where it seems
harder to 'see' that the two tables belong together.
7.3 Possible Future Work
Transaction Handling Methods
Transaction handling is currently not supported in the NewSQL language. However it
would be easy to add the grammar to deal with transactions. One way to do that is to
add a static object (for example, a 'database' object) that can execute the actions
commit and rollback:
db.commit()
db.rollback()
Similar to the JDBC interface, methods should be added to switch the autocommit
mode of the connection on and off, for example by using:
db.autoCommit(false)
db.autoCommit(true)

Null Handling Simplification
Currently, the null handling of the underlying database is used; however it should be
simplified.
It is not very simple to simplify the rules for NULL if NewSQL must be compatible to
current databases, because the new rules must be mapped to the old rules while
converting the NewSQL statement to the SQL statement.
However, it would be possible to simplify the rules for NULL.
Example: NULL shall be equal to NULL.
In this case, the following comparison:
(A == B)
Would need to be converted to:
( (A = B) OR (A IS NULL AND B IS NULL) )
...if both columns A and B allow null values.
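The conversion described above can be sketched as a small Java function that generates the SQL condition. This is only an illustration of the proposed rule 'NULL shall be equal to NULL'; the class and method names are hypothetical, and nothing like this is implemented in the current NewSQL translator.

```java
/** Hypothetical sketch of the proposed NULL-equals-NULL conversion. */
final class NullSafeEqualsSketch {

    /**
     * Converts the NewSQL comparison (a == b) to SQL.
     * The extended form is only needed if both columns allow null values.
     */
    static String equalsCondition(String a, boolean aNullable,
            String b, boolean bNullable) {
        String plain = "(" + a + " = " + b + ")";
        if (aNullable && bNullable) {
            return "( " + plain + " OR (" + a + " IS NULL AND " + b + " IS NULL) )";
        }
        return plain;
    }

    public static void main(String[] args) {
        System.out.println(equalsCondition("A", true, "B", true));
        System.out.println(equalsCondition("A", false, "B", true));
    }
}
```

The translator would need the nullability of each column (from the table meta data) at conversion time, which is one reason why the simplification is not trivial to implement consistently.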
Text Constants with Escape Sequences
Currently, text constants (Strings) in NewSQL are implemented in the same way as
in SQL. That means control characters such as newline or tab cannot be used
inside a constant (however, they can be used by using parameters). The original idea
was to support text constants in the same way as in Java. However, the underlying
JDBC driver LDBC (as well as various vendor JDBC drivers) does not support any
escape syntax at this time. For this reason, Java style escape sequences could not yet
be implemented.
Implicit Joins
It would be possible to add implicit joins. In this case, it would be required to define
the links between the tables at design time.
Converter SQL to NewSQL
Currently, a converter from NewSQL to SQL is implemented. One of the advantages of
NewSQL is that it works with all databases in the same way; one of the
disadvantages of SQL is that an application written for one database will most likely
not work with other databases. By using a converter from (vendor specific) SQL to
NewSQL, it would be possible to port an application to a different database without
changing the application code. All that would be required is to use a different
database driver.
ODBC Driver
Currently, a JDBC driver is implemented. However, for applications written in
languages other than Java, this driver cannot be used. The most common technology for
C++ is ODBC. By writing an ODBC driver for NewSQL, these platforms could be
supported as well.
