
Does Big Data Spell Big Costs?

March 30, 10:00 am PT / 1:00 pm ET

@impetuscalling

Recorded version available at http://www.impetus.com/webinar_registration?event=archived&eid=56

Outline

- Big Data: current scenario
- Cost components in a Big Data warehouse
- Best practices: reducing the cost of Big Data solutions
  - Cost of storage
  - Technologies: what and where?
  - Big Data strategies
- Our recommendations to reduce TCO

Impetus Proprietary


Big Data - Current Scenario

- 2.5 quintillion: bytes produced every day
- $6 trillion: Big Data cost (IDC/EMC)
- $650 billion: cost of wasted productivity because of information overload
- 1 ZB: estimated Internet traffic by 2015
- 1,800 EB: size of the digital universe in 2011
- 90%: share of the data in the world today that has been created in the last two years alone
- 18 months: estimated time for the digital universe to double
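To make the doubling figure concrete, here is a minimal sketch that projects the size of the digital universe from the two numbers on this slide (1,800 EB in 2011, doubling every 18 months). The function name and the assumption of smooth exponential growth are illustrative only; the slide does not state a growth model.

```python
def projected_size_eb(months_after_2011: float,
                      base_eb: float = 1800.0,
                      doubling_months: float = 18.0) -> float:
    """Project the digital universe size in exabytes.

    Assumes steady exponential growth: the size doubles once every
    `doubling_months`. Both defaults are the figures quoted on the slide.
    """
    return base_eb * 2 ** (months_after_2011 / doubling_months)


if __name__ == "__main__":
    for months in (0, 18, 36, 54):
        size = projected_size_eb(months)
        print(f"{months:>2} months after 2011: {size:,.0f} EB")
```

Under these assumptions the 2011 figure quadruples to 7,200 EB (7.2 ZB) after 36 months.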


Age of Data

Age of Software

Age of Data


Existing State of Big Data Solutions


Using Commodity Hardware for Big Data

Commodity Hardware

Pros
- Build your own
- The promise of innovation

Cons
- Building reliable storage: $1 per GB
- Add the cost of managing / monitoring / hosting


Using Open Source & Cloud Computing


Open Source

Pros
- Software is free! Glory to the Elephant

Cons
- Cost of training: thinking parallel is not intuitive
- Cost of support: support is not free

Cloud Computing

Pros
- Rent what you need

Cons
- $14,000 a month for 100 TB (data storage only)
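The two storage figures in this deck ($1 per GB to build reliable commodity storage, $14,000 a month for 100 TB in the cloud) can be put on the same footing with a little arithmetic. The sketch below is illustrative: the function names are made up, 1 TB is assumed to be 1,000 GB, and the comparison ignores the managing, monitoring, and hosting costs that the commodity slide says must be added.

```python
def cost_per_gb_month(monthly_usd: float, terabytes: float,
                      gb_per_tb: int = 1000) -> float:
    """Monthly rental price expressed in USD per GB."""
    return monthly_usd / (terabytes * gb_per_tb)


def months_to_match(upfront_usd_per_gb: float,
                    monthly_usd_per_gb: float) -> float:
    """Months of rent after which cumulative rent equals the upfront cost."""
    return upfront_usd_per_gb / monthly_usd_per_gb


if __name__ == "__main__":
    cloud = cost_per_gb_month(14_000, 100)   # figures from the cloud slide
    breakeven = months_to_match(1.0, cloud)  # $1 per GB from the commodity slide
    print(f"Cloud storage: ${cloud:.2f} per GB per month")
    print(f"Rent equals the $1/GB build cost after {breakeven:.1f} months")
```

With these inputs, cloud storage works out to roughly $0.14 per GB per month, so rent alone overtakes the quoted build cost in well under a year. Operating costs on the commodity side would push that point later.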


Big Data Warehouse Cost Components


- Initial entry costs: cost of experimentation
- Cost of integration and moving data: cost of ETL
- Query and analytics capability
- Manageability
- Ongoing maintenance: monitoring and tuning
- Changing capacity: additional hardware
- Cost of compliance


Lowering TCO of Big Data

Hardware
- Lower cost of storage
- Lower cost of computation

Software
- Make things faster
- Do more with less


How to reduce the cost of storage?

- Compress
  - RainStor and similar solutions
  - Just make sure your read throughput is high
- Retain all vs. load & process
  - Set up data pipelines or use ILM principles:
    Creation and Receipt, Distribution, Use, Maintenance, Disposition
- Focus on Big Data, but don't forget the small data
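The caveat about read throughput can be checked before committing to a compression scheme. The sketch below measures compression ratio and decompression speed with Python's standard `zlib` module, used here purely as a stand-in: it is not RainStor, and the `measure` helper and the sample record are made up for illustration.

```python
import time
import zlib


def measure(data: bytes, level: int = 6) -> dict:
    """Return the compression ratio and decompression (read) throughput."""
    compressed = zlib.compress(data, level)

    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    elapsed = time.perf_counter() - start

    if restored != data:
        raise ValueError("round trip did not reproduce the input")

    megabytes = len(data) / 1e6
    return {
        "ratio": len(data) / len(compressed),
        # Guard against a zero reading on very small inputs.
        "read_mb_per_s": megabytes / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Synthetic, highly repetitive log-style data (not real measurements).
    sample = b"store-17,sku-4411,qty=2,price=19.99\n" * 200_000
    for level in (1, 6, 9):
        result = measure(sample, level)
        print(f"level {level}: ratio {result['ratio']:.1f}x, "
              f"read {result['read_mb_per_s']:.0f} MB/s")
```

Real data compresses far less well than this repetitive sample, so the numbers are only meaningful when the test is run against a representative slice of the actual warehouse data.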


Technologies: What and Where?


What?
- Open source vs. commercial software?
- Specialized hardware/appliances vs. commodity hardware?
- Vendor lock-in vs. vendor independence?
- Cost of latencies?
- Cloud?

Where?
- OLTP (NoSQL) vs. OLAP (DW: MapReduce & MPP)


OLAP: Big Data Scenarios


Data Tapping Point, Cost & Latency


Indirect Analytics over Hadoop


Direct Analytics over Hadoop


Analytics over Hadoop with MPP DW


Selecting the Right Technology


Key considerations

- $ per TB
- Business continuity / cost / vendor lock-in
- Latency needs


Choosing MPP
$ per TB Driven

- EMC Greenplum
- Teradata, Aster
- HP Vertica
- Oracle Exadata
- Netezza
- ParAccel
- Others


Faster Map Reduce & Hadoop


Business Continuity/ Cost/ Vendor Lock-in

- MapR
- HPCC
- Hadapt
- Pervasive DataRush, HStreaming
- Cloud MapReduce
- DataStax
- Platform Computing
- MARS, GPMR
- ParStream


OLTP: NoSQL Solutions


Latency Needs

- Column stores: HBase, Cassandra
- Document stores: MongoDB, CouchDB
- Key stores: Redis, Riak, etc.; Kyoto Cabinet/Tokyo Tyrant, Berkeley DB
- Graph DB: Neo4j
- Cloud stores: SimpleDB


OLTP: New Era RDBMS Versions


- Postgres, InfiniDB, Infobright
- MySQL Cluster
- GridSQL, EnterpriseDB
- MS SQL
- Sybase IQ
- Specialized stores: VoltDB, MarkLogic, Clustrix
- Xeround
- ParStream
- Oracle NoSQL


Recommendations: Cost Components of a Big Data Warehouse

- Initial entry costs: cost of experimentation
  We recommend: follow best practices; learn or hire
- Cost of integration and moving data: cost of ETL
  We recommend: remove costly licensed tools; switch to MapReduce for ETL or ELT
- Manageability: provisioning, management tools
  We recommend: opt for multi-vendor management toolsets, e.g. Impetus Ankush
- Ongoing maintenance: monitoring and tuning
  We recommend: Automate! Automate! Automate!
- Changing capacity: additional hardware
  Do you know the GPU?
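The recommendation to move ETL onto MapReduce is easier to picture with a toy job. The sketch below imitates the map, shuffle, and reduce phases in plain Python on a single machine. It is not Hadoop code, and the `region,amount` record layout and all function names are hypothetical, chosen only to show how extract, transform, and aggregate steps line up with the phases.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_record(line: str) -> Iterator[Tuple[str, float]]:
    """Map phase: extract and clean one raw 'region,amount' line.

    Malformed rows are dropped instead of failing the whole job.
    """
    parts = line.strip().split(",")
    if len(parts) != 2:
        return
    region, amount = parts
    try:
        value = float(amount)
    except ValueError:
        return
    yield region.strip().lower(), value


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Shuffle phase: group mapped values by key."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key: str, values: List[float]) -> Tuple[str, float]:
    """Reduce phase: aggregate all values for one key."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, float]:
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_record(line))
    return dict(reduce_group(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    raw = ["east,10.5", "West,2", "bad row", "east,1.5", "north,abc"]
    print(run_job(raw))
```

On a real cluster the framework distributes the map and reduce functions across nodes and performs the shuffle itself. The point of the sketch is only that the transformation logic usually bought as a licensed ETL tool reduces to a pair of small functions.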


Recommendations: Hardware & Software

- Cost of storage: compress data
  We recommend: opt for RainStor or similar solutions
- Do more with less: faster MR
  We recommend: MapR or similar solutions; Acunu and related solutions for NoSQL


About Us

- Strategic partners for software product engineering and R&D
- Thought leaders in cutting-edge technologies
- Mature processes and practices that are methodical, yet flexible
- Diverse domain expertise

Our services in Big Data and Analytics


- Expert consulting
- Proof-of-concept & implementation
- Support services


Big Data Quick Start Program


Three modules
- Gear Up (1-day session)
- Base Camp (4-day session)
- Summit (5-day session)


Questions

Please send in your questions using the chat panel


Thank you
For more information, write to us at inquiry@impetus.com
@impetuscalling
