com
CONTENTS INCLUDE:
n
n
Oracle Berkeley DB Family
Berkeley DB Java Edition Features
Getting Started with
Oracle Berkeley DB
n
BDB JE Base API and Collections API
n
BDB JE Direct Persistent Layer
n
BDB JE Transaction Support and Performance Tuning
n
BDB JE Backup and Recovery
n
Hot Tips and more... By Masoud Kalali
entirely different set of features and characteristics (See Table
About Oracle Berkeley DB 4). The base feature sets are shown in Table 1.
The Oracle Berkeley DB (BDB) family consists of three open Table 1: Family Feature Sets
source data persistence products which provide developers Feature Set Description
with fast, reliable, high performance, enterprise ready local Data Store (DS) Single writer, multiple reader
databases implemented in the ANSI C and Java programming Concurrent Data Store (CDS) Multiple writers, multiple snapshot readers
languages. The BDB family typically stores key-value pairs but Transactional Data Store (TDS) Full ACID support on top of CDS
is also flexible enough to store complex data models. BDB and High Availability (HA) Replication for fault tolerance. Fail over recovery support
BDB Java Edition share the same base API, making it possible
to easily switch between the two. Table 2 shows how these features are distributed between the
We will review the most important aspects of Oracle BDB different BDB family members.
family briefly. Then we will dig deep into Oracle BDB Java
Table 2: Different Editions’ Feature Sets DS CDS TS HA
Edition and see what its exclusive features are. We discuss
Based API and Data Persistence Layer API. We will see how
BDB/BDB XML Edition
we can manage transactions DPL and Base API in addition BDB Java Edition
to persisting complex objects graph using DPL will form the
overall development subjects. Backup recovery, tuning, and
Additional Features
www.dzone.com
Feature Benefit
Oracle BDB Core Edition Locking High concurrency
Berkeley DB is written in ANSI C and can be used as a library Data stored in application-native format Performance, no translation required
to access the persisted information from within the parent Programmatic API, no SQL Performance, flexibility/control
application address space. Oracle BDB provides multiple
In process, not client-server Performance, no IPC required
interfaces for different programming languages including
Zero administration Low cost of ownership
ANSI C, the Java API through JNI in addition to Perl, PHP, and
ACID transactions and recovery Reliability, data integrity
Python.
Dual License Open/Closed source distributions
Oracle BDB XML Edition
Built on top of the BDB, the BDB XML edition allows us to
easily store and retrieve indexed XML documents and to use
XQuery to access stored XML documents. It also supports Get More Refcardz
accessing data through the same channels that BDB supports. (They’re free!)
Oracle Berkeley DB
retrieval API as well as an “access as collection” API. n Hot tips & examples
In memory or on disk operation Transacted caching/ persisted data store db.get(null, keyValue, searchEntry, LockMode.DEFAULT);//retrieving
record
Similar data access API Easy switch between JE and BDB String foundData = new String(searchEntry.getData(), “UTF-8”);
dataValue = new DatabaseEntry(“updated data content”.
Just a set of library Easy to deploy and use
getBytes(“UTF-8”));
Very large databases Virtually no limit on database size db.put(null, keyValue, dataValue);//updating an entry
db.delete(null, keyValue);//delete operation
db.close();
Features unique to BDB Java Edition are listed in Table 4. dbEnv.close();
Table 4: BDB Java Edition Exclusive Features There are multiple overrides for the Database.put method to
Feature Benefit prevent duplicate records from being inserted and to prevent
record overwrites.
Fast, indexed, BTree Ultra fast data retrieval
Java EE JTA and JCA support Integration with Java EE application servers DPL Sample
Efficient Direct Persistence Layer EJB 3.0 like annotation to store Java Objects graph DPL sample consists of two parts, the entity class and the
Easy Java Collections API Transactional manipulation of Base API through entity management class which handle CRUD over the entity
enhanced Java Collections class.
Low Level Base API Work with dynamic data schema
Entity Class
JMX Support Monitor able from within parent application
@Entity
public class Employee {
These features, along with a common set of features, make the @PrimaryKey
public String empID;
Java edition a potential candidate for use cases that require public String lastname;
caching, application data repositories, POJO persistence, @SecondaryKey(relate = Relationship.MANY_TO_MANY,
relatedEntity = Project.class,onRelatedEntityDelete =
queuing/buffering, Web services, SOA, and Integration. DeleteAction.NULLIFY)
public Set<Long> projects;
public Employee() { }
Introducing Berkeley DB java Edition public Employee(String empID, String lastname, Set<Long> projects)
{
this.empID = empID;
Installation this.lastname = lastname;
this.projects = projects;
You can download BDB JE from http//bit.ly/APfJ5. After }
}}
extracting the archive you’ll see several directories with self-
describing names. The only file which is required to be in the This is a simple POJO with few annotations to mark it as an
class path to compile and run the included code snippet is je- entity with a String primary key. For now ignore the
3.3.75.jar (the exact file name may vary) which is placed inside @Secondarykey annotation, we will discuss it later.
the lib directory. Notice that BDB JE requires J2SE JDK version
1.5.0_10 or later.
The data management Class
EnvironmentConfig envConfig = new EnvironmentConfig();
envConfig.setAllowCreate(true);
All editions of Berkeley DB are freely available for download Environment dbEnv = new Environment(new File(“/home/masoud/dben-
and can be used in open source products which are dpl”), envConfig);
Hot not distributed to third parties. A commercial license is StoreConfig stConf = new StoreConfig();
stConf.setAllowCreate(true);
Tip necessary for using any of the BDB editions in a closed
EntityStore store = new EntityStore(dbEnv, “DPLSample”, stConf);
source and packaged product. For more information about
licensing visit: http://bit.ly/17pMwZ PrimaryIndex<String, Employee> userIndex;
userIndex = store.getPrimaryIndex(String.class, Employee.class);
userIndex.putNoReturn(new Employee(“u180”, “Doe”, null));//insert
Access APIs Employee user = userIndex.get(“u180”);//retrieve
userIndex.putNoReturn(new Employee(“u180”, “Locke”, null));//
BDB JE provides three APIs for accessing persisted data. The Update
Base API provides a simple key-value model for storing and userIndex.delete(“u180”);//delete
retrieving data. The Direct Persistence Layer (DPL) API lets store.close();
dbEnv.close();
you persist any Java class with a default constructor into the
database and retrieve it using a rich set of data retrieval APIs. These two code snippets show the simplest from of performing
And finally the Collections API which extends the well known CRUD operation without using transaction or complex object
Java Collections API with data persistence and transaction relationships.
support over data access.
Sample code description
Base API sample An Environment provides a unit of encapsulation for one or
The Base API is the simplest way to access data. It stores a key more databases. Environments correspond to a directory on
and a value which can be any serializable Java object. disk. The Environment is also used to manage and configure
resources such as transactions. EnvironmentConfig is used to
EnvironmentConfig envConfig = new EnvironmentConfig();
envConfig.setAllowCreate(true); configure the Environment, with options such as transaction
Environment dbEnv = new Environment(new File(“/home/masoud/dben”),
envConfig);
configuration, locking, caching, getting different types of
DatabaseConfig dbconf = new DatabaseConfig(); statistics including database, locks and transaction statistics, etc.
dbconf.setAllowCreate(true);
dbconf.setSortedDuplicates(false);//allow update
Database db = dbEnv.openDatabase(null, “SampleDB “, dbconf);
One level closer to our application is DatabaseConfig and
DatabaseEntry searchEntry = new DatabaseEntry(); Database object when we use Base API. When we use DPL
DatabaseEntry dataValue = new DatabaseEntry(“ data content”.
getBytes(“UTF-8”)); these objects are replaced by StoreConfig and EntityStore.
DatabaseEntry keyValue = new DatabaseEntry(“key content”.
getBytes(“UTF-8”)); In Base API DatabaseConfig and Database objects provide
db.put(null, keyValue, dataValue);//inserting an entry
access to the database and how the database can be accessed.
Configurations like read-only access, record duplication Table 5: Base API Features
handling, creating in-memory databases, transaction support, Key value store retrieval, value can be anything
etc. are provided through DatabaseConfig.
Cursor API to traverse in a dataset forward and backward
In DPL StoreConfig and EntityStore objects provide access to JCA (Java Connectivity Architecture) support
object storage and how the object storage can be accessed. JMX (Java Management eXtension) support
Configurations such as read only access, data model mutation,
Table 6 shows most important DPL API characteristics and
creating in-memory databases, transaction support, etc. are
capabilities. Annotation and Object Mapping make rapid
provided through StoreConfig.
application development possible.
The PrimaryIndex class provides the primary storage and Table 6: DPL API Features
access methods for the instances of a particular entity class.
Type Safe access to persisted objects
There are multiple overrides for the PrimaryIndex.put method
Updating classes with adding new fields is supported
to prevent duplicate entity insertion and provide entity
Persistent class fields can be private, package-private, protected or public
overwrite prevention.
Automatic and extendable data binding between Objects and underlying storage
BDB Java edition environment anatomy Transaction support in Base API is a bit different, as in Base
API we directly deal with databases while in DPL we deal with
Table 5 shows Base API characteristics and benefits. The key- environment and EntityStore objects. The following changes
value access model provides the most flexibility. will allow transaction support in Base API.
@PrimaryKey Defines the class primary key and must be used one and only one time for The Environment and EntityStore definitions are omitted.
every entity class.
The SecondaryIndex provides primary methods for retrieving
@SecondaryKey Declares a specific data member in an entity class to be a secondary key for
that object. This annotation is optional, and can be used multiple times for objects related to the Secondary Key of a particular object.
an entity class. The SecondaryIndex can be used to retrieve the related objects
@Persistent Declares a persistent class which lives in relation to an entity class. through a traversable cursor. We can also use SecondaryIndex
@NotTransient Defines a field as being persistent even when it is declared with the transient to query for a specific range of objects in a given range for its
keyword.
primary key.
@NotPersistent Defines a field as being non-persistent even when it is not declared with the
transient keyword.
Table 9: Supported Object Relation
@KeyField Indicates the sorting position of a key field in a composite key class when
the Comparable interface is not implemented. The KeyField integer Relation Description
element specifies the sort order of this field within the set of fields in the
composite key. ONE_TO_ONE A single entity is related to a single secondary key value.
The @PrimaryKey annotation has a string element to define The only different between using BDB JE collections API and
the name of a sequence from which we can assign primary classic collections is the fact that when we use BDB JE Collections
key values automatically. The primary key field type must be API we are accessing persisted objects instead of in-memory
objects which we usually access in classic collection APIs. Now let’s see the driver code which put this class in action.
Collections API extends Java serialization to store class description separately to make data
The steps demonstrated in the CollectionSample are
records much more compact. self describing. The only new object in this snippet is
the TransactionRunner object which we used to run the
To get a real sense about BDB JE Collections API think of it as
TransWorker object. I omit many of the safe programming
we can persist and retrieve objects using a collection class like
portions to keep the code simple and conscious. we need
SortedMap’s methods like tailMap, subMap or put, putall, get,
exception handling and properly closure of all BDB JE objects
and so on.
to ensure data integrity
But before we use the SortedMap object to access the stored
data, we need to initialize the base objects like Database and
Environment; we should create the ClassCatalog object, and
BDB JE Backup/Recovery and Tuning
finally we should define bindings for our key and value types.
Backup and Recovery
Collections API sample We can simply backup the BDB databases by creating an
Now let’s see how we can store and retrieve our our objects operating system level copy of all jdb files. When required we
using Collections API. In this sample we are persisting a pair of can put the archived files back into the environment directory
Integer key and String value using SortedMap. to get a database back to the state it was at. The best option is
First lets analyze the TransactionWorker implementation. to make sure all transactions and the write process are finished
public class TransWorker implements TransactionWorker {
to have a consistent backup of the database.
private ClassCatalog catalog;
private Database db; The BDB JE provides a helper class located at com.sleepycat.
private SortedMap map;
je.util.DbBackup to perform the backup process from within
public TransWorker(Environment env) throws Exception {
DatabaseConfig dbConfig = new DatabaseConfig(); a Java application. This utility class can create an incremental
dbConfig.setTransactional(true);
dbConfig.setAllowCreate(true); backup of a database and later on can restore from that
Database catalogDb = env.openDatabase(null, “catalog”, dbConfig); backup. The helper class ideally freezes the BDB JE activities
catalog = new StoredClassCatalog(catalogDb);
// use Integer tuple binding for key entries during the backup to ensure that the created backup exactly
TupleBinding keyBinding =
TupleBinding.getPrimitiveBinding(Integer.class); represents the database state when the backup process
// use String serial binding for data entries started.
SerialBinding dataBinding = new SerialBinding(catalog,
String.class);
db = env.openDatabase(null, “dben-col”, dbConfig); Tuning
map = new StoredSortedMap(db, keyBinding, dataBinding, Berkeley DB JE has 3 daemon threads and configuring these
true);
} threads affects the overall application performance and
/** Performs work within a transaction. */
public void doWork() throws Exception { behavior. These 3 threads are as follow:
// check for existing data and writing
Integer key = new Integer(0); Cleaner Thread Responsible for cleaning and deleting unused log files. This thread is
String val = (String) map.get(key); run only if the environment is opened for write access.
if (val == null) {
map.put(new Integer(10), “Second”); Checkpointer Thread Basically keeps the BTree shape consistent. Checkpointer thread is
} triggered when environment opens, environment closes, and database
//Reading Data log file grows by some certain amount.
Iterator iter = map.entrySet().iterator();
while (iter.hasNext()) { Compressor Thread For cleaning the BTree structure from unused nodes.
Map.Entry entry = (Map.Entry) iter.next();
//Process the entry These threads can be configured through a properties file
}
} named je.properties or by using the EnvironmentConfig and
}
EnvironmentMutableConfig objects. The je.properties file,
TransWorker implements TransactionWorker which makes which is a simple key-value file, should be placed inside the
it necessary to implement the doWork method. This method environment directory and override any further configuration
is called by TransactionRunner when we pass an object of which we may make using the EnvironmentConfig and
TransWorker to its run method. The TransWorker constructor EnvironmentMutableConfig in the Java code.
simply receive an Environment object and construct other The other performance effective factor is cache size. For on-
required objects . Then it opens the database in Collection disk instances cache size determines how often the application
mode, creates the required binding for the key and values needs to refer to permanent storage in order to retrieve some
we want to store in the database and finally it creates the data bucket. When we use in-memory instances cache size
SortedMap object which we can use to put and retrieve objects determines whether our database information will be paged
using it. into swap space or it will stay in the main memory.
je.cleaner. Ensures that a minimum amount of space is occupied by live records by structure, and loading the dump into another environment.
minUtilization removing obsolete records. Default occupied percentage is 50%.
DbDump Dumps a database to a user-readable format.
je.cleaner. Determines the cleaner behavior in the event that it is able to remove
expunge an entire log file. If “true” the log file will be deleted, otherwise it will be DbLoad Loads a database from the DbDump output.
renamed to nnnnnnnn.del
DbVerify Verifies the structure of a database.
je.checkpointer. Determines how often the Checkpointer should check the BTree structure. If
bytesInterval it performs the checks little by little it will ensure a faster application startup
but will consume more resources specially IO.
To run each of these utilities, switch to BDB JE directory,
je.maxMemory Determines what percentage of JVM maximum memory size can be used switch to lib directory and execute as shown in the following
Percent for BDB JE cache. To determine the ideal cache size we should put the
application in the production environment and monitor its behavior. command:
A complete list of all configurable properties, with java -cp je-3.3.75.jar com.sleepycat.je.util.DbVerify
explanations, is available in EnvironmentConfig Javadoc. The
list is comprehensive and allows us to configure the BDB JE at The JAR file name may differ depending on your version of
granular level. BDB JE. These commands can also be used to port a BDB JE
database to BDB Core Edition.
All of these parameters can be set from Java code using the
EnvironmentConfig object. The properties file overrides the A very good set of tutorials for different set of BDB JE APIs
values set by using EnvironmentConfig object. are available inside the docs folder of BDB JE package.
Hot Several examples for different set of functionalities are
Helper Utilities Tip provided inside the examples directory of the BDB JE
Three command line utilities are provided to facilitate dumping
package.
the databases from one environment, verifying the database
A BOUT t h e A u t h o r RECOMME N D E D B o o k
Masoud Kalali holds a software engineering degree and has The Berkeley DB Book is a practical guide to the intricacies
been working on software development projects since 1998. He of the Berkeley DB. This book covers in-depth the complex
has experience with a variety of technologies (.Net, J2EE, CORBA, design issues that are mostly only touched on in terse
and COM+) on diverse platforms (Solaris, Linux, and Windows). footnotes within the dense Berkeley DB reference manual. It
His experience is in software architecture, design and server side explains the technology at a higher level and also covers the
development. Masoud has several articles in Java.net. He is one of internals, providing generous code and design examples.
founder members of NetBeans Dream Team. Masoud’s main area of
research and interest includes Web Services and Service Oriented
Architecture along with large scale and high throughput systems’ development and
deployment. BUY NOW
Blog: http://weblogs.java.net/blog/kalali/ books.dzone.com/books/berkeley-db
Contact: Kalali@gmail.com
Bro
ugh
t to
you
by...
Professional Cheat Sheets You Can Trust
tt erns “Exactly what busy developers need:
n Pa
ld
ona
sig son
McD
simple, short, and to the point.”
De
Ja
By
#8
ired
Insp e
by th
GoF ller con
tinu
ed
ndle
r doe
sn’t
have
to
in tab
Cha d ultip ecific o should cep pat
this if the m up the
man
n M
an ac
z.co
n
sp s ents
be a ject ime. d is see passed
Download Now
Com reter of ob at runt handle plem ks to de to
...
n
uest som tion proces e no m
Itera tor ore req
n A g in metho
d
cep
dm ndlin
e e ar
tho e to ptio th
Exce ion is sm to ha up th tered or
e ca
n
Ob
se Me S renc ral
Refcardz.com
plate RN refe listed in mp
le ce pt ni
echa n pa ssed co un avio
Beh
TTE
n
ex
is en
ick
Adobe Live Cycle Windows Powershell
am
Tem Exa he .
A s has ack. W tion uest to ject
NP a qu s, a ct- cep Ob
it
n
st q
IG e s e rn b je call e ex e re
vid patt able O le th nd th
DES
! V is
pro s, hand s to ha
UT ard ign us ram .
des diag
Adobe Flash Builder 4 Dependency Injection with EJB 3
ct
ABO refc oF) f Re le obje
erns ts o lass exa
mp oke
r
Patt ur (G lemen es c Inv
arz
n F o d rl d h
Des
ig go
f s: E inclu al w
o suc
Th is G a n
l 23 sign Pa
tt e rn e rn
patt , and a
re je ts
t ob mentin
c g AN
D
Java Performance Tuning Netbeans IDE JavaEditor
c
a h c M M
ef
in c u
orig kD
e
re.
Ea tion ons
tr ple CO ma
nd
boo Softwa rma to c their im om
WPF Getting Started with Eclipse
re R
( low
e x p o n la rg cute is al ch
ati an b rm ts. +exe . Th
bject nship
s su
Cre y c fo je c an o
t th
e
ed
to ob ms, d as ed rela
tio
Get
nsh
ips
that
can psu
Enca quest
the
re
late
sa
and
e ha
to b llbacks
ca
.
nalit
y.
riant
times
or in
varia
ling
the
invo
catio
ing
n.
DZone, Inc. ISBN-13: 978-1-934238-62-2
ra o pose uing nctio led at va hand
io n ti ct cess be
av ,a la P ur
as q
ue
ack
fu je pro
Beh nships t re
1251 NW Maynard
ob
e
ISBN-10: 1-934238-62-7
nd the to
je c a n b callb b e ha eded. m ro nous nality ed
tio ob at c
ed
You ne s need
to ne
sts is couple
d fro
ynch nctio y ne
rela with s th st que
n
e th
e as the fu
itho
ut an is
eals
it
ship Reque y of re de ar
ld be litat pattern ssing w tation articul
e: D tion faci
50795
or
e. Use
n
A hist shou d ce en p
ed to mman for pro implem ents its ting.
cop runtim rela type
ker
Cary, NC 27513
n
S s Whe invo y us
n
s co al m pec
jec t at cla Pro
to Th e
w id el th e ue ue
ac tu
d im
p le ex
are utilizing a job q of the ue is
Ob ged with
n
C ues ueue e que
han eals me.
y dge enq
que s. B en to th
e c : D P roxy Job orithm be giv knowle that is terface
b e ti
cop
g n ct
pile r S in
rato er mp
le of al ed ca to have d obje of the
ss S at com serv
888.678.0399
Exa ut
Deco nes
an
.com
919.678.0300
one
Abst tory
C C
Fac
B
State
t
pte
r eigh y
teg
Ada Flyw Stra od
z
Meth
more than 3.3 million software developers, architects and decision
S S r B
w. d
e rp rete plate
ridg
Refcardz Feedback Welcome
B B
Inte Tem
S B
er tor or
ww
C B r B
NSIB
ILIT
Y B
S
Com
po si te
P O succ
ess
or Sponsorship Opportunities 9 781934 238622
RES
“DZone is a developer’s dream,” says PC Magazine.
F
CH
AIN
O
ace
>> sales@dzone.com
terf r
<<in andle
H st ( )
re
ndle
+ha
ern
1 ki ng
ler by lin .c o
m
and uest one
teH req z
cre () le a w.d
Con uest hand | ww