Database Workshop Report
Submitted to:
Saiful Islam
Department of Computer Science and Engineering
Daffodil International University
Submitted by:
Farzana Afrin Badhon
ID: 172-15-9802
Workshop Aims
• Database Futures Workshop
– 29th-30th May, 2nd edition
– https://indico.cern.ch/event/615499/overview
– 74 participants registered
• Aims
– Discuss future requirements in the database area
– Identify common needs between user communities
– Evaluate new trends & technologies
– Understand how services should evolve/improve to fulfil new requirements
[Figure: overview of database technologies. Labels: relational databases (DBMS), object-oriented databases (OODBMS), object-relational databases (ORDBMS), distributed databases, time series databases (TSDB), NoSQL, big data analytics, relational model, PostgreSQL, ElasticSearch]
Requirements for Run3&4
• Traditional applications will still be key elements in the HL-LHC era
• More database applications for indexing events and for analysis
– Based on relational databases and NoSQL technologies
• Overall the relational model is valid
• Oracle is the preferred solution for critical applications
– Exceptions: ALICE (“zoo of solutions”), LHCb (mainly MySQL)
– Cost-effective platform in terms of functionality and performance
– Expertise
– Support is a key factor
Requirements for Run3&4
• Run3&4: larger insert and update rates mean a heavier database workload
– Advanced Oracle 12c features (in-memory, new partitioning, …)
• Migration to Oracle 12cR2 in LS2
– Powerful hardware to improve response times
• Interventions are becoming harder to schedule
– Move towards zero downtime
– Fast-switching standby services
• Alternatives to Oracle for smaller projects, to facilitate:
– Collaboration with other institutes
– Open sourcing
Implementations & Technologies
• New systems/evolution – move towards NoSQL solutions
– New accelerator logging service (NXCALS)
– Next generation archiver
– Next generation for Post Mortem event storage and analysis
– Conditions data management system for HEP experiments
– CMS Big Data project
Implementations & Technologies
• Motivation
– Scale out
– Enable data analytics
– Newer technologies more appropriate to solve specific use cases
• SQL and NoSQL are no longer seen as antagonistic
• Risks in the medium term
– Newer technologies may lose interest or disappear
– They may become difficult to maintain
Implementations & Technologies
• Provenance
– LHCb bookkeeping, CMS analysis, …
– Integrating origin information and metadata is important for further analysis
• Database on Demand
– Supports MySQL, PostgreSQL (relational) and InfluxDB (time series)
– Backup & Recovery, HA, Monitoring updates
– Working to offer instances in TN
– Help to use different DBMS
• Open source tools available to facilitate migration
• DBoD team can be contacted
Going beyond relational
• Data Analytics
– Hadoop, Spark, Sqoop, Impala, HBase, Hive and Pig
• Centralised Elasticsearch service
– Distributed, RESTful search and analytics engine
• Hadoop and Elasticsearch are becoming critical to ATLAS
• Growing interest in time series databases
– Easier analysis
– Improved storage and ingestion rates
– InfluxDB use cases:
• DBoD monitoring
• IT monitoring
• Stream processing
– Kafka pilot service use cases:
• Accelerator logging service
• Computing infrastructure monitoring
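To make the ingestion point for time series databases more concrete, here is a minimal sketch of how a monitoring sample could be encoded in the InfluxDB line protocol (measurement, optional tags, fields, optional timestamp) before being written. The helper `to_line` and the names `cpu_load`, `host`, `service` and `db01` are hypothetical and chosen only for illustration; they do not come from the workshop material, and the sketch covers only the basic escaping rules.

```python
def _escape(text, chars):
    """Backslash-escape each character in `chars` that occurs in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    """Render a field value using line-protocol type conventions."""
    if isinstance(value, bool):          # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"               # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    # Strings are double-quoted; backslashes and quotes are escaped.
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def to_line(measurement, tags, fields, timestamp_ns=None):
    """Build one line-protocol record (hypothetical helper, illustration only)."""
    if not fields:
        raise ValueError("at least one field is required")
    head = _escape(measurement, ", ")
    for key in sorted(tags):             # sorted tag keys give a stable output
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(val) for key, val in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"       # nanoseconds since the Unix epoch
    return line


if __name__ == "__main__":
    # Hypothetical monitoring sample; the names are assumptions, not real service data.
    print(to_line("cpu_load", {"host": "db01", "service": "dbod"},
                  {"value": 0.64, "cores": 8}, 1496052000000000000))
```

Each record is a single self-describing text line, so many samples can be batched into one write request, which is one reason this kind of store can sustain high ingestion rates.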
In general…
• Positive feedback on database services by IT
• Fruitful discussions
• Synergies (even overlaps)
– Collaboration
– Scope to optimise resources for all
• Next similar workshop in 2019 (LS2)
– Given the dynamic nature of the technologies
– Many projects in development