Salim Ansari
Oracle OpenWorld 2013
What if...
We created a powerful analytical engine
to apply to the data?
We revolutionized the way in which we
searched for data in the archives?
Applied marketing methods to scientific
data?
Created the domain of Science
Intelligence...!??
Select an item
Multiwavelength Astronomy
Chronic Disease Management
Jerven Bolleman
Developer
SIB Swiss Institute of Bioinformatics
46 groups
650 collaborators
Strategic Goal
To provide key competencies
& research support to the medical &
life science community
volume
velocity
3PB storage
24 TB per week
Value
variety
100s of sources
veracity
experimental uncertainty
150GB of information per person
CycliX data
Ensembl data
2013 SIB Swiss Institute of Bioinformatics
Oracle Open World 2013
current
CPU bandwidth
limiting factor
no IO wait
IO wait
VALUE
Secure
access
backups
Upgradable
Maintainable
Fast data ingestion
Works within the week
SQL/SPARQL
low developer costs
set based analytics
Query language
good for questions
Adaptable
New question
New query
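The "new question, new query" point can be illustrated with a small sketch. Everything here is an assumption made for illustration: the table name `expression`, its columns, and the toy rows are hypothetical, and SQLite (from the Python standard library) stands in for the Oracle database discussed in the talk.

```python
import sqlite3

# In-memory SQLite database standing in for the real data store (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table and toy rows, invented purely for illustration.
conn.execute("CREATE TABLE expression (gene TEXT, tissue TEXT, level REAL)")
conn.executemany(
    "INSERT INTO expression VALUES (?, ?, ?)",
    [
        ("geneA", "liver", 2.0),
        ("geneA", "brain", 4.0),
        ("geneB", "liver", 6.0),
        ("geneB", "brain", 1.0),
    ],
)

# Question 1: what is the mean level per gene?
mean_per_gene = conn.execute(
    "SELECT gene, AVG(level) FROM expression GROUP BY gene ORDER BY gene"
).fetchall()

# Question 2 (a new question): which tissue has the highest mean level?
# Only the query changes; no new program logic is needed.
top_tissue = conn.execute(
    "SELECT tissue, AVG(level) AS mean_level FROM expression "
    "GROUP BY tissue ORDER BY mean_level DESC LIMIT 1"
).fetchone()
```

Both questions are answered by set-based aggregates over the same table, which is the sense in which a query language keeps developer cost low.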
jervenbolleman
jerven.bolleman@isb-sib.ch
dmitry.kuznetsov@isb-sib.ch
Supporting Researchers
and the Large Hadron Collider
with Oracle
Tony Cass
Head of Database Services
4/12/16
scan electron-table
scan jet-table
Slide courtesy of Maaike Limper
Maybe another Exadata test in future:
With more complex analysis
In-memory columnar beta ??
Hadoop vs Oracle
Hadoop version of Z+H benchmark analysis:
Physics-data stored as comma-delimited text-files in hadoop filesystem (hdfs)
Reproduce Z+H benchmark analysis with MapReduce-code (java!)
Mappers: one mapper per object to select muon, electron etc.
Reduce: select events with 2 good leptons and 2 b-jets, calculate invariant mass
Hadoop: 179 seconds (limited by CPU)
Oracle (parallel 40): 150 seconds (limited by iowait)
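The map and reduce steps described on this slide can be sketched as follows. This is not the Java MapReduce code used for the benchmark: it is a plain-Python illustration of the data flow, and it makes several assumptions that the slides do not confirm. The record layout (`event_id, kind, pt, eta, phi, mass, good`), the `good` flag standing in for the real selection cuts, the requirement of exactly two leptons and two b-jets, and the choice to report the di-lepton and di-b-jet invariant masses are all hypothetical. A single mapper function also handles every object type here, whereas the slide describes one mapper per object.

```python
import math
from collections import defaultdict

# Hypothetical comma-delimited record: event_id,kind,pt,eta,phi,mass,good
# The real column layout and selection cuts are not given in the slides.

def map_object(line):
    """Mapper: parse one line and emit (event_id, object) if it is flagged good."""
    event_id, kind, pt, eta, phi, mass, good = line.strip().split(",")
    if good != "1":
        return None
    return event_id, (kind, float(pt), float(eta), float(phi), float(mass))

def four_vector(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz

def invariant_mass(objects):
    """Invariant mass of the summed four-vectors of the given objects."""
    e = px = py = pz = 0.0
    for _, pt, eta, phi, mass in objects:
        oe, ox, oy, oz = four_vector(pt, eta, phi, mass)
        e, px, py, pz = e + oe, px + ox, py + oy, pz + oz
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # clamp tiny negative rounding errors

def reduce_event(objects):
    """Reducer: keep events with 2 leptons and 2 b-jets (exactly 2 is an assumption)."""
    leptons = [o for o in objects if o[0] in ("muon", "electron")]
    bjets = [o for o in objects if o[0] == "bjet"]
    if len(leptons) != 2 or len(bjets) != 2:
        return None
    return invariant_mass(leptons), invariant_mass(bjets)

def run(lines):
    """Group mapper output by event id (the shuffle step), then reduce each event."""
    grouped = defaultdict(list)
    for line in lines:
        emitted = map_object(line)
        if emitted is not None:
            grouped[emitted[0]].append(emitted[1])
    results = {}
    for event_id, objects in grouped.items():
        reduced = reduce_event(objects)
        if reduced is not None:
            results[event_id] = reduced
    return results
```

Calling `run` on a list of text lines returns a dictionary from event id to a pair of masses for the selected events; events that lack two leptons or two b-jets are dropped in the reduce step.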
Hadoop vs Oracle
I/O reads speed comparison for the Z+H benchmark
Hadoop: up to 1600 MB/s
Oracle DB: up to 2100 MB/s
root-ntuple IO: up to 500 MB/s
Note that root uses column-based storage! Amount of data read is less (benchmark uses 45 out of 4000 stored variables)
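To put the "45 out of 4000 stored variables" note in numbers, here is a rough back-of-the-envelope sketch. It assumes every stored variable takes the same amount of space, which the slides do not state, and the helper name `bytes_needed_columnar` is hypothetical.

```python
# Figures taken from the slide.
variables_used = 45
variables_stored = 4000

# Fraction of the stored variables that the benchmark actually reads.
fraction = variables_used / variables_stored

def bytes_needed_columnar(total_bytes, used=variables_used, stored=variables_stored):
    """Bytes a column store would have to read, ASSUMING equally sized variables."""
    return total_bytes * used / stored
```

Under that assumption, a column-based reader touches only about 1.1% of the stored data, so a lower MB/s figure for root does not by itself mean the analysis runs slower.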