By Frank Ohlhorst
Copyright 2013 by John Wiley & Sons, Inc.
Index
Abstraction tools, 54
Access to data, 32, 50, 63, 69
Accuracy of data, 103–104, 117–118
Activity logs, 64
Algorithms
  accuracy, 18, 103–104
  anomalies, 103
  data mining, 119
  evolution of, 90–91
  real-time results, 104
  scenarios, 5
  statistical applications, 4
  text analytics, 59
Amazon, 11–12, 27
Amazon S3, 44
Analysis of data. See Data analysis
Anomalies, value of, 101–103
Apple, 102
Applications, 4, 52–57, 70
Archives, 51
Artificial intelligence, 86–87, 116, 120
Astronomy, 114
Auto-categorization, 57
Automated metadata acquisition systems, 116–117
Availability of data, 63, 71–72

BA. See Business analytics (BA)
BackType, 18
Backup systems, 66–67
Batch processing, 53, 54
Behavioral analytics, 17–18
Benefits analysis, 23–24
Best practices, 93–109
  anomalies, 101–103
  expediency-accuracy tradeoff, 103–104
  high-value opportunities focus, 94–95
  in-memory processing, 104–109
  project management processes, 98–101
  project prerequisites, 93–94
  thinking big, 95–96
  worst practice avoidance, 96–98
BI. See Business intelligence (BI)
Big Data and Big Data analytics
  analysis categories, 4–5
  application platforms, 52–57
  best practices, 93–109
  business case development, 21–28
  challenges, 6–7, 112
  classifications, 5–6
  components, 47
  defined, 1–2, 21–22, 78
  evolution of, 77–91
  examples of, 113–115
  4Vs of, 3–4
  goal setting, 39–40
  introduction, ix–xi

Encryption, 70, 72
Entertainment industry, 41
Entity extraction, 59
Entity relation extraction, 59
Errors, 119
Event-driven data distribution, 56
Evidence-based medicine, 88
Evolution of Big Data, 77–91
  algorithms, 90–91
  current issues, 84–85
  future developments, 85–90
  modern era, 80–83
  origins of, 77–80
Expectations, 98
Expediency-accuracy tradeoff, 103–104
External data, 38, 40
Extract, transform, and load (ETL), 9
Extractiv, 43

Facebook, 12, 27
Filters, 116
Financial controllers, 109
Financial sector, 16–17, 91, 109
Financial transactions, 42
Flexibility of storage systems, 50
4Vs of Big Data, 3–4

Hadoop
  advantages and disadvantages of, 7–10, 26, 28, 46, 60, 69, 85
  design and function of, 7–8, 84–85, 103
  event-processing framework, 53
  future, 85
  origins of, 26
  vendor support, 46
  Yahoo's use, 26–27
HANA, 85
HBase, 9–10
HDFS, 9–10
Health care
  Big Data analytics opportunities, 114
  Big Data trends, 41
  compliance, 68
  evolution of Big Data, 87–90
  See also Electronic medical records
Hibernate, 54
High-value opportunities, 94–95
History. See Evolution of Big Data
Hive, 9, 54
Hollerith Tabulating System, 78
Hortonworks, 28

Object-based storage systems, 49
OLAP systems, 120
OOZIE, 9
OpenHeatMap, 44
Open source technologies
  availability, 28
  options, 43–44, 45
  pilot projects, 61
  See also Hadoop
Organizational structure, 30–31
Outsourcing, 61

Parallel processing, 55
Patents, 72–75
Pentaho, 9
Performance measurement, 35
Performance-security tradeoff, 63–64, 71–72
Perlowitz, Bill, 83
Pharmaceutical companies, 2
Pig, 9
Pilot projects, 9–10, 61
Planning, 44–46, 93–94, 99
Point-of-sale (POS) data, 16
Predictive analysis, 4–5, 40
Privacy, 122–123
Problem identification, 45
Processing, 59–61, 104–109
Project management processes, 98–101
Project planning, 44–46, 93–94, 99
Public information sources, 43–44
Purging of data, 64–65

RAM-based devices, 55–56
Real-time analytics, 53, 104–109, 116
Recruitment of data analytics personnel, 32–33
Red Hat, 54
Relational database management system (RDBMS), 58, 67
Research and development (R&D), 82
Resource description framework (RDF), 58–59
Results, 121
Retailers
  anomalies, 102
  Big Data use, 84, 109
  click-stream data, 16, 17
  data sources, 41
  goal setting, 39–40
  in-memory processing technology, 109
  organizational culture, 34
Retention of data, 64–65
Return on investment (ROI), 25
Risk analysis, 24–25

SANS, 7
SAP, 85
Scale-out storage solutions, 48–49
Scaling, 95, 120
Scenarios, 5, 121–122
Schmidt, Erik, 80
Science, 16, 78–83, 114
Scope of project, 24
Scrubbing programs, 101