
contributed articles

DOI:10.1145/3210752

Framework for Implementing a Big Data Ecosystem in Organizations

Featuring the various dimensions of data management, it guides organizations through implementation fundamentals.

BY SERGIO ORENGA-ROGLÁ AND RICARDO CHALMETA

key insights
• This fresh approach to the problem of creating frameworks helps project managers and system developers implement big data ecosystems in business organizations.
• The related literature review of big data for business management covers some of the existing frameworks used for this purpose.
• The methodology dimension of the proposed framework covers the big data project life cycle and defines when and how to use the framework’s other six dimensions.

ENORMOUS AMOUNTS OF data have been generated and stored over the past few years. The McKinsey Global Institute reports this huge volume of data, which is generated, stored, and mined to support both strategic and operational decisions, is increasingly relevant to businesses, government, and consumers alike,7 as they extract useful knowledge from it.11

There is no globally accepted definition of “big data,” although the Vs concept introduced by Gartner analyst Doug Laney in 2001 has emerged as a common structure to describe it. Initially, 3Vs were used, and another 3Vs were added later.13 The 6Vs that characterize big data today are volume, or very large amounts of data; velocity, or data generated and processed quickly; variety, or a large number of structured and unstructured data types processed; value, or aiming to generate significant value for the organization; veracity, or reliability of the processed data; and variability, or the flexibility to adapt to new data formats through collecting, storing, and processing.

Big data sources can include the company itself (such as log files, email messages, sensor data, internal Web 2.0 tools, transaction records, and machine-generated data), as well as external applications (such as data published on websites, GPS signals, open data, and messages posted in public social networks).

This data cannot be managed efficiently through traditional methods17 (such as relational databases) since big data requires balancing data integrity and access efficiency, building indices for unstructured data, and storing data with flexible and variable structures. Aiming to address these challenges, the NoSQL and NewSQL database systems provide solutions for different scenarios.

Big data analytics can be used to extract useful knowledge and analyze large-scale, complex data from applications to acquire intelligence and extract unknown, hidden, valid, and useful relationships, patterns, and information.1 Various methods are used to deal with such data, including text analytics, audio analytics, video analytics, social media analytics, and predictive analytics; see the online appendix “Main Methods for Big Data Analytics,” dl.acm.org/citation.cfm?doid=3210752&picked=formats.
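
To make the point about storing data with flexible and variable structures more concrete, the following minimal Python sketch contrasts a fixed, relational-style schema with a document-style representation. It is only an illustration under stated assumptions: the sample records, the field names, and the `FIXED_COLUMNS` schema are hypothetical and do not come from the article. The sketch shows that projecting heterogeneous sources (a log line, a sensor reading, a social post) onto one predefined set of columns leaves empty cells and drops fields, whereas a self-describing document keeps each record's own shape.

```python
import json

# Hypothetical records from three of the source types named in the text.
records = [
    {"source": "log", "timestamp": "2019-01-07T10:00:00",
     "level": "ERROR", "message": "disk full"},
    {"source": "sensor", "timestamp": "2019-01-07T10:00:01",
     "device_id": 17, "temperature_c": 81.5},
    {"source": "social", "timestamp": "2019-01-07T10:00:02",
     "user": "ana", "text": "Great service!", "tags": ["service"]},
]

# A relational-style schema fixed up front (an assumption for illustration).
FIXED_COLUMNS = ["source", "timestamp", "level", "message"]


def to_fixed_row(record):
    """Project a record onto the fixed schema; absent fields become None."""
    return tuple(record.get(col) for col in FIXED_COLUMNS)


def dropped_fields(record):
    """Return the record's fields that the fixed schema cannot hold."""
    return sorted(set(record) - set(FIXED_COLUMNS))


def to_document(record):
    """Serialize the record as a JSON document that keeps its own structure."""
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    for rec in records:
        print("row:     ", to_fixed_row(rec))
        print("dropped: ", dropped_fields(rec))
        print("document:", to_document(rec))
```

In this sketch the sensor reading loses `device_id` and `temperature_c` under the fixed schema, while the JSON document round-trips unchanged; document-oriented NoSQL stores follow broadly the same idea, although real systems add indexing, querying, and distribution on top.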

COMMUNICATIONS OF THE ACM | JANUARY 2019 | VOL. 62 | NO. 1


Big data reflects a complex, interconnected, multilayered ecosystem of high-capacity networks, users, and the applications and services needed to store, process, visualize, and deliver results to destination applications from multiple data sources.26 The main components in that ecosystem include properties, infrastructure, life cycle, models and structures, and security infrastructure.10

Big data and business management. In order to succeed in today’s complex business world, organizations have to find ways to differentiate themselves from their competitors. With the rise of cloud computing, social media, and mobile devices, the quantity and quality of data generated every moment of every day are constantly being enhanced, and organizations need to take advantage of it. If they use data properly, they can become more collaborative, accurate, virtual, agile, adaptive, and synchronous. Data and information are thus primary assets for organizations, with most trying to collect, process, and manage the potential offered by big data.5 To take advantage, organizations need to generate or obtain a large amount of data, use advanced analytical tools, and employ staff with appropriate skills to manage the tools and the data.3

Big data is a key factor for organizations looking to gain a competitive advantage,4 as it can help develop new products and services, make automated strategic and operational decisions more quickly, identify what has happened and predict what will happen in the immediate future, identify customer behavior, guide targeted marketing, produce greater return on investments, recognize sales and market opportunities, plan and forecast, increase production performance, guide customer-based segmentation, calculate risk and market trends, generate business insight more directly, identify consumer behavior from click sequences, understand business disruption, implement product changes that prevent future problems, obtain feedback from customers, calculate price comparisons proactively, recommend future purchases or discounts, and refine internal processes.25

Big data analytics can be seen as a more advanced form of business intelligence, which works with structured company databases and focuses on obtaining reports and indicators to measure and assess business performance. Conversely, big data works with semi-structured and unstructured data from multiple sources, focusing on extracting value related to exploration, discovery, and prediction.9

Big data frameworks. Developing and implementing a big data ecosystem in an organization involves not only technology but management of the organization’s policies and people.28 A number of frameworks have thus been proposed in the literature.8,10,12,14,18,27,28 A framework might describe concepts, features, processes, data flows, and relationships among components (such as software development), with the aim of creating a better understanding (such as descriptions of components or design) or guidance toward achieving a specific objective.23 Frameworks consist of (usually interrelated) dimensions or their component parts.

Big data frameworks focus on assisting organizations to take advantage of big data technology for decision making. Each has its good points, although
each also has weaknesses that must be addressed, including that none includes all dimensions (such as data architecture, organization, data sources, data quality, support tools, and privacy/security). Moreover, they lack a methodology to guide the steps to be followed in developing and implementing a big data ecosystem, which would make the process easier. They fail to provide strong case studies in which they are evaluated, so their validity has not been proved. They do not consider the impact of the implementation of big data on human resources or organizational and business processes. They do not consider previous feasibility studies of big data ecosystem projects. They lack systems monitoring and a definition of indicators. They fail to study or identify the type of knowledge they need to manage. Moreover, they fail to define the type of data analysis required to address organizational goals; see the online appendix for more on the frameworks and their features and weaknesses.

In addition to big data frameworks, system developers should also consider big data maturity models that define the states, or levels, where an enterprise or system can be situated, a set of good practices, goals, and quantifiable parameters that make it possible to determine on which of the levels the enterprise stands, and a series of proposals with which to evolve from one level of maturity to a higher level.2 Several such models have been proposed,15,16,24 all focused on assessing big data maturity (the “as is”) and building a vision for what the organization’s future big data state should be and why (the “to be”). There is thus a need for a new framework for managing big data ecosystems that can be applied effectively and simply, accounting for the main features of big data technology and avoiding the weaknesses so identified.

Proposed Framework

In this context, the IRIS (the Spanish acronym for Systems Integration and Re-Engineering) research group at the Universitat Jaume I of Castellón, Spain, has proposed the Big Data IRIS (BD-IRIS) framework to deal with big data ecosystems, reflecting the literature dealing with this line of research. The BD-IRIS framework focuses on data and the tasks of collecting, storing, processing, analyzing, and visualizing necessary to make use of it. However, unlike other frameworks, it focuses not only on operations affecting data but also on other aspects of management like human and material resources, economic feasibility, profit estimation, type of data analysis, business process re-engineering, definition of indicators, and system monitoring.

The BD-IRIS framework includes seven interrelated dimensions (see Figure 1): methodology, data architecture, organization, data sources, data quality, support tools, and privacy/security. The core is the methodology dimension, which serves as a guide for the steps involved in implementing an ecosystem with big data technology and includes phases, activities, and tasks supported by the six other dimensions. These other dimensions include various techniques, tools, and good practices that support each phase, activity, and task of the methodology. Additionally, they include properties and characteristics that must be fulfilled in certain stages of such development. With the exception of a methodology, the other six dimensions are included in some of the seven frameworks outlined earlier, though none includes all dimensions.

Methodology dimension. This is the main axis of the framework; the other dimensions are techniques, tools, and good practices that support each phase, and the activities and tasks within it. The methodology provides practical guidance for managing an entire project life cycle by indicating the steps needed to execute development and implementation of big data ecosystems. The methodology consists of phases that in turn consist of activities that in turn consist of tasks, whereby each one must be completed before the next one can begin. Table 2 (see the online appendix) lists the phases and activities that constitute the methodology, along with the main dimensions that support execution of the activities and tasks. The support-tools dimension is not included in Table 2 because it is present or can be present in all tasks of the methodology, as different information technology tools are available to support each of them.

The methodology can be applied in waterfall mode, or sequentially, for each phase, activity, and task. It can also be applied iteratively, whereby the project is divided into subprojects executed in
waterfall mode, with each subproject begun when the previous one has finished; for example, each subproject can cover an individual knowledge block or a tool.

Figure 1. BD-IRIS framework dimensions. [Diagram: Methodology at the core, together with Data Architecture, Organizational, Support Tools, Data Sources, Data Quality, and Privacy and Security.]

Data architecture dimension. This dimension identifies the steps the software engineer performs during data analysis. The order in which each task is executed in each of the steps and its relationship with the other dimensions of the framework are specified in the methodology dimension. The data architecture dimension is divided into levels ranging from identifying the location and structure of the data to the display of the results requested by the organization. Figure 2 outlines the levels that make up the data architecture, including:

Content. Here, the location and characteristics of the data are identified (such as format and source of required data, both structured and unstructured). In addition, the software engineer performs a verification process to check that data location and characteristics are valid for the next level. Data can be generated offline, through the traditional ways of entering data (such as open data sources and relational databases in enterprise resource planning, customer relationship management systems, and other management information systems). In addition, data can also be obtained online through social media (such as LinkedIn, Facebook, Google+, and Twitter).

Acquisition. Here, filters and patterns are applied by software engineers to ensure only valuable data is collected. Traditional data sources are easier to link to because they consist of structured data. But social software poses a greater technological challenge, as it contains human information that is complex, unstructured, ubiquitous, multi-format, and multi-channel.

Enhancement. The main objectives here are to endow the collected data with value, identify and extract information, and discover otherwise unknown relationships and patterns. To add such value, various advanced data-analysis techniques are applied, perhaps divided into two main groups: research and modeling. Valuable information is obtained as a result of applying these techniques to the collected data. Metadata is also generated, reducing the complexity and processing of queries or operations that must be performed while endowing the data with meaning. Data and metadata are stored in a database for future queries, processing, generation of new metadata, and/or training and validation of the models.

Inquiry. Here, the system can access the data and metadata stored in the system database generated at the enhancement level. The main mode of access is through queries, usually based on the Structured Query Language, that extract the required information as needed.

Visualization. This level addresses presentation and visualization of the results, as well as interpretation of the meaning of the discovered information. Due to the nature of big data and the large amount of data to be processed, clarity and precision are important in the presentation and visualization of the results.

Figure 2. Proposed data architecture levels. [Diagram: the Content, Acquisition, Enhancement, Inquiry, and Visualization levels. Flow labels: data sources and their characteristics identified; collected data using filters and patterns; valuable information stored in the database required by queries; result of the requested queries; access to the database for making queries; request for the necessary information.]

Organizational dimension. This dimension is related to the characteristics and needs of the organization in providing data, processing it, and making use of it. It is also related to all the decisions the organization has to make to adapt the system to its needs.

On the one hand, the organization’s strategy must be analyzed, since big data projects must align with the organization’s business strategy. If not aligned, the results obtained may not be as valuable as they could be for the organization’s decision making. To achieve such alignment, the organization must determine the objectives the project is intended to achieve, as well as the organizational challenges involved and the project’s target users, including customers, suppliers, and employees. It is also necessary to define the overall corporate transformation it is willing to make and the new business roles required to exploit big data technology. For example, a big data project could aim to use the knowledge extracted from customer data, products, and operations through the organization’s processes to change its business model and create value, optimize business management, and identify new business opportunities. These projects are thus potentially able to increase customer acquisition and satisfaction, as well as increase loyalty and reduce the rate of customer abandonment. They can also improve business efficiency by, say, eliminating overproduction and reducing the launch time of new products or services. In addition, they can help negotiate better prices with suppliers and improve customer service. The project will thus be defined by the organization’s business strategy.

On the other hand, the resources offered and the knowledge acquired through big data technology allow optimization of existing business processes by improving them as much as possible.

To integrate enterprise strategy, business process, and human resources, the BD-IRIS framework uses the ARDIN (the Spanish acronym for Reference Architecture for INtegrated Development) enterprise reference architecture, allowing project managers to redefine the conceptual aspects of the enterprise (such as mission, vision, strategy, policies, and enterprise values), redesign and implement the new business process map, and reorganize and manage human resources in light of the new information and communication technologies (big data in this case) to improve them.6

In addition, models of the business processes must be developed so that weak points and areas in need of improvement can be detected. BD-IRIS uses several modeling languages:

I*. I* makes it possible for project engineers to gain a better understanding of organizational environments and business processes, understand the motivations, intentions, goals, and rationales of organizational management, and illustrate the various characteristics seen in the early phases of requirement specification.30

Criteria for selecting appropriate tools.
What is the price?
Is it a new product and/or company or well established?
Is it an open source or commercial tool?
If commercial, is a trial version available?
If commercial, is licensing per seat or per core?
Is it platform independent?
What is the implementation time?
What is the implementation cost?
Does it work in the cloud and use MapReduce and NoSQL features?
Can real-time features be used or integrated into a real-time system?
How easy is it to upgrade?
How scalable is it?
Can it work in batch and/or programmable mode?
How easy is it to use? Is a GUI available?
What learning curve should be expected?
How compatible is it with other products?
Does it work with big data?
Does it offer an API?
Can it integrate with geospatial data (such as GIS)?
Does it provide modern techniques for data analysis?
Can it handle missing data and data cleaning?
Will it be possible to incorporate new techniques (such as add-ons or modules) different from those already implemented, as user needs evolve?
What is the speed of computations? Does it use memory efficiently?
Does it support programming languages (such as C++, Python, Java, and R) rather than just some internal ad hoc language?
Is it able to fetch data from the Internet or from databases (such as SQL-supported)?
Does it require connectors for databases? If yes, what do they cost?
Does it support the SQL language?
Are visualization capabilities available?
Does it offer a Web or mobile client?
Is good technical support, training, and documentation available?
Is benchmarking available?
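
One possible way to put the tool-selection criteria above to work is a simple weighted decision matrix. The Python sketch below is only an illustration under stated assumptions: the article lists the questions but does not prescribe any scoring method, so the criterion labels, the weights, the candidate names ("ToolA", "ToolB"), and the 0-5 scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str      # short label for one question from the criteria list
    weight: float  # relative importance for this project (hypothetical)


# Hypothetical subset of criteria and weights; each project would pick its own.
CRITERIA = [
    Criterion("price", 3.0),
    Criterion("scalability", 2.0),
    Criterion("sql_support", 1.0),
    Criterion("learning_curve", 1.0),
]


def weighted_score(scores, criteria=CRITERIA):
    """Return the weighted mean of per-criterion scores (each on a 0-5 scale).

    Raises ValueError if a criterion has not been scored, so that gaps are
    not silently treated as zero.
    """
    missing = [c.name for c in criteria if c.name not in scores]
    if missing:
        raise ValueError(f"unscored criteria: {missing}")
    total_weight = sum(c.weight for c in criteria)
    return sum(c.weight * scores[c.name] for c in criteria) / total_weight


def rank_tools(candidates, criteria=CRITERIA):
    """Sort candidate tools from highest to lowest weighted score."""
    scored = [(name, weighted_score(s, criteria)) for name, s in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical answers recorded by the project team (5 = most favorable).
    candidates = {
        "ToolA": {"price": 5, "scalability": 2, "sql_support": 4, "learning_curve": 3},
        "ToolB": {"price": 2, "scalability": 5, "sql_support": 5, "learning_curve": 4},
    }
    for name, score in rank_tools(candidates):
        print(f"{name}: {score:.2f}")
```

With these made-up weights the cheaper tool ranks first even though it scales less well; changing the weights changes the ranking, which is the point of making the trade-offs explicit rather than a recommendation of any particular weighting.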

Business Process Model and Notation (BPMN). BPMN,20 designed to model an overall map of an enterprise’s business processes, includes 11 graphical, or modeling, elements classified into four categories: core elements (the BPD core element set), flow objects, connecting objects, and “swimlanes” and artifacts. BPMN 2.0 extends BPMN.

Unified Modeling Language. UML 2.0 (reference 19) is also used to model interactions among users and the technological platform in greater detail without ambiguity.

In selecting these modeling languages, we took into account that they are intuitive, well-known by academics and practitioners alike, useful for process modeling and information-system modeling, and proven in real-world enterprise-scale settings.

Support-tools dimension. This dimension consists of information-technology tools that support all dimensions in the framework, facilitating execution of the tasks to be performed in each dimension. Each such task can be supported by tools with certain characteristics; for example, some tools support only certain tasks, and some tasks can be carried out with or without the help of tools.

The tools that can be used in each dimension, except for data architecture, are standard tools that can be used in any software-engineering project. Types of tools include business management, office, case, project management, indicator management, software testing, and quality management. The data architecture dimension requires specific tools for each of its levels; see Table 3 in the online appendix for examples of tools that can be used at each level in the data architecture dimension.

Several tools are able to perform the same tasks, and the choice of appropriate tool for each project depends on the scenario in which it is used. The table “Criteria for selecting appropriate tools” lists criteria to help prompt the questions that project engineers must address when choosing the appropriate tools for the particular needs of each project.

Data sources dimension. Considering that the foundation of big data ecosystems is data, it is essential that such data is reliable and provides value. This dimension refers to the sources of the data processed in big data ecosystems. Big data technology is able to process both structured data (such as from relational databases, ERPs, CRMs, and open data) and semi-structured and unstructured data (such as from log files, machine-generated data, social media, transaction records, sensor data, and GPS signals). Objectives depend on the data that is available to the organization. To ensure optimal performance, the organization must define what data is of interest, identify its sources and formats, and perform, as needed, the preprocessing of raw data. Data is transformed into a format that is more readily “processable” by the system. Methods for preprocessing raw data include feature extraction (selecting the most significant specific data for certain contexts), transformation (modifying it to fit a particular type of input), sampling (selecting a representative subset from a large dataset), normalization (organizing it with the aim of allowing more efficient access to it), and “de-noising” (eliminating existing noise in it). Once such operations are performed, data is available to the system for processing.

Data-quality dimension. The aim here is to ensure quality in the acquisition, transformation, manipulation, and analysis of data, as well as in the validity of the results. Quality is the consequence of multiple factors, including complexity (lack of simplicity and uniformity in the data), usability (how readily data can be processed and integrated with existing standards and systems), time (timeliness and frequency of data), accuracy (degree of accuracy describing the measured phenomenon), coherence (how the data meets standard conventions and is internally consistent, over time, with other data sources), linkability (how readily the data can be linked or joined with other data), validity (the data reflects what it is supposed to measure), accessibility (ease of access to information), clarity (availability of clear and unambiguous descriptions, together with the data), and relevance (the degree of fidelity of the results with regard to user needs, in terms of measured concepts and represented populations).29

The United Nations Economic Commission for Europe29 has identified the actions software engineers should perform to ensure quality in data input and
output results, thereby minimizing the risk in each of the various factors; see Table 4 in the online appendix.

Privacy/security dimension. Big data ecosystems usually deal with sensitive data and with knowledge obtained from data that may be scattered and lacking in value by itself. Due to such scattering, the customers and users who generate the data are often unaware of its value, disclosing it without reflection or compensation. Meanwhile, lack of awareness can lead to unexpected situations where the generated information is personally identifiable and metadata is more important than the data itself. Moreover, big data involves the real-time collection, storage, processing, and analysis of large amounts of data in different formats. Organizations that want to use big data must consider the risks, as well as their legal and ethical obligations, when processing and circulating it.

This dimension considers the privacy and security aspects of data management and communications, included in the same dimension because they are strongly related to each other, as explained in the online appendix.

BD-IRIS Framework Validation

Once the framework is developed, the next task is to validate and improve it, a process consisting of two phases: expert assessment and case studies. The aims are to validate the framework by verifying and confirming its usefulness, accuracy, and quality and to improve the framework with the feedback obtained from the organizations involved and the conclusions drawn from the case studies. In such a case study, the framework is applied to a single organization. For example, we applied it to a Spanish “small and medium-size enterprise” from the metal fabrication market with 250 employees, using it to guide development and implementation of a social CRM system supported by a big data ecosystem.21 In another case study, we applied it to the Spanish division of a large oil and gas company, using it to guide development and implementation of a knowledge management system 2.0 as supported by a big data ecosystem;22 see the online appendix for results.

Discussion

Big data helps companies increase their competitiveness by improving their business models or their business processes. Big data has emerged over the past five years in companies, forcing them to deal with multiple business, management, technological, processing, and human resources challenges. Seven big data frameworks have been proposed in the IT literature, as outlined here, to deal with them in a satisfactory way. A framework can be defined as a structure consisting of several dimensions that are fitted and joined together to support or enclose something, in this case development and implementation of a big data ecosystem.

Big data frameworks also have weaknesses. First, none includes a methodology, understood as a documented approach for performing all activities in a big data project life cycle in a coherent, consistent, accountable, repeatable manner. This lack of a methodology is a big handicap because big data is still a novel area, and only a limited supply of well-trained professionals know what steps to take, in what order to take them, and how they should be performed.13 It is thus difficult for IT professionals, even those well trained in big data projects, to successfully take on a project employing the existing frameworks. In addition, in large-scale big data projects employing multiple teams of people, decisions regarding procedures, technologies, methods, and techniques can produce a lack of consistency and poor monitoring procedures. Second, each of the six dimensions of the big data framework (data architecture, organization, sources, quality, support tools, and privacy/security) addresses a different aspect of a project. However, although existing frameworks consider several dimensions, none of the seven frameworks proposed in the IT literature considers all six dimensions. Using only one of these frameworks means some important questions are ignored. Third, the approaches in each dimension are not fitted and joined together and are sometimes too vague and general or do not cover all the activities of the whole project life cycle. For example, although proper integration of big data in a company is recognized as a key success factor in all big data projects,3 only two existing frameworks provide any guidance about the need to consider corporate management implications. Neither do they explain when and
how to improve business strategy or when and how to carry out reengineering of a business process using big data. As a result, opportunities for improving business performance can be lost.

For this reason, the BD-IRIS framework needs to be structured in all seven dimensions. The main innovation is the BD-IRIS methodology dimension, along with the fact that it takes into account all the dimensions a big data framework should have within a single framework. The BD-IRIS methodology represents a guide to producing a big data ecosystem according to a process, covering the big data project life cycle and identifying when and how to use the approaches proposed in the other six dimensions.

The utility of the framework and its completeness, level of detail, and accuracy of the relations among the methodology tasks and the approaches to other dimensions were validated in 2016 by five expert professionals from a Spanish consulting company with experience in big data projects, and by managers of the two organizations (not experts in big data projects)

Although the framework has been validated through two different methods (expert evaluation and case studies), it also involves some notable limitations. For example, the methods we used for the analysis and validation in the two case studies are qualitative, not as precise as quantitative ones, and based on the perceptions of the people involved in the application of the framework in the case studies and the consultants who evaluated it. Moreover, the evaluation experts were chosen from the same consulting company to avoid potential bias. Finally, we applied the framework in two companies in two different industrial sectors but have not yet tested its validity in other types of organization.

Regarding the scope of future work, we are exploring four areas: apply and assess the framework in companies from different industrial sectors; evaluate the ethical implications of big data systems; refine techniques for converting different input data formats into a common format to optimize the processing and analysis of data in big data systems; and finally, refine the automatic identification of people in different

July 16–20). Lecture Notes in Computer Science, 8557. Springer International Publishing, Switzerland, 2014, 214–227.
12. Ferguson, M. Architecting a Big Data Platform for Analytics. IBM White Paper, Oct. 2012; http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=IML14333USEN
13. Flouris, I., Giatrakos, N., Deligiannakis, A., Garofalakis, M., Kamp, M., and Mock, M. Issues in complex event processing: Status and prospects in the big data era. Journal of Systems and Software 127 (May 2017), 217–236.
14. Gèczy, P. Big data management: Relational framework. Review of Business & Finance Studies 6, 3 (2015), 21–30.
15. Halper, F. and Krishnan, K. TDWI Big Data Maturity Model Guide. TDWI Research, Renton, WA, 2013; https://tdwi.org/whitepapers/2013/10/tdwi-big-data-maturity-model-guide.aspx
16. Hortonworks. Hortonworks Big Data Maturity Model, 2016; http://hortonworks.com/wp-content/uploads/2016/04/Hortonworks-Big-Data-Maturity-Assessment.pdf
17. Jagadish, H.V., Gehrke, J., Labrinidis, A., Papakonstantinou, Y., Patel, J.M., Ramakrishnan, R., and Shahabi, C. Big data and its technical challenges. Commun. ACM 57, 7 (July 2014), 86–94.
18. Miller, H.G. and Mork, P. From data to decisions: A value chain for big data. IT Professional 15, 1 (Jan.-Feb. 2013), 57–59.
19. Object Management Group. Unified Modeling Language. OMG, 2000; http://www.uml.org/
20. Object Management Group. Business Process Model and Notation. OMG, 2011; http://www.omg.org/spec/BPMN/2.0
21. Orenga-Roglá, S. and Chalmeta, R. Social customer relationship management: Taking advantage of Web 2.0 and big data technologies. SpringerPlus 5, 1462 (Aug. 2016), 1–17.
22. Orenga-Roglá, S. and Chalmeta, R. Methodology for the implementation of knowledge management systems 2.0: A case study in an oil and gas company. Business & Information Systems Engineering (Dec. 2017), 1–19; https://doi.org/10.1007/s12599-017-0513-1
23. Pawlowski, J. and Bick, M. The global knowledge management framework: Towards a theory for
knowledge management in globally distributed
participating in our case studies. Lack social networks, allowing companies to settings. Electronic Journal of Knowledge
of validation is a notable weakness of gather information entered by the same Management 10, 1 (Jan. 2012), 92–108.
24. Radcliffe, J. Leverage a Big Data Maturity Model to
the existing frameworks. person in a given social network. Build Your Big Data Roadmap. Radcliffe Advisory
Services, Ltd., Guildfor, U.K., 2014.
25. Sagiroglu, S. and Sinanc, D. Big data: A review. In
Conclusion References
Proceedings of the International Conference on
1. Adams, M.N. Perspectives on data mining. International
This article has explored a framework Journal of Market Research 52, 1 (Jan. 2010), 11–19.
Collaboration Technologies and Systems (San Diego,
CA, May 20–24). IEEE Press, 2013, 42–47.
for guiding development and imple- 2. Ahern, M., Clouse, A., and Turner, R. CMMI Distilled:
26. Shin, D.H. and Choi, M.J. Ecological views of big data:
A Practical Introduction to Integrated Process
mentation of big data ecosystems. We Perspectives and issues. Telematics and Informatics
Improvement, Second Edition. Addison-Wesley
32, 2 (May 2015), 311–320.
developed its initial design from the Longman Publishing Co., Inc., Boston, MA, 2003.
27. Sun, H. and Heller, P. Oracle Information Architecture:
3. Alfouzan, H.I. Big data in business. International
existing literature while providing ad- An Architect’s Guide to Big Data. Oracle White Paper,
Journal of Scientific & Engineering Research 6, 5 (May
Aug. 2012; https://d2jt48ltdp5cjc.cloudfront.net/
ditional knowledge. We then debugged, 2015), 1351–1352.
uploads/test1_3021.pdf
4. Bharadwaj, A., El Sawy, O.A., Pavlou, P.A., and
28. Tekiner, F. and Keane, J.A. Big data framework. In
refined, improved, and validated this Venkatraman, N. Digital business strategy: Toward a
Proceedings of the IEEE International Conference on
next generation of insights. MIS Quarterly 37, 2 (June
initial design through two methods— 2013), 471–482.
Systems, Man, and Cybernetics (Manchester, U.K., Oct.
13–16). IEEE Press, 2013, 1494–1499.
expert assessment and case studies—in 5. Brown, B., Chui, M., and Manyika, J. Are you ready for
29. United Nations Economic Commission for Europe.
the era of ‘big data’? McKinsey Quarterly 4 (Oct. 2011),
a Spanish metal fabrication company 24–35.
A Suggested Framework for the Quality of Big Data.
Deliverables of the UNECE Big Data Quality Task
and the Spanish division of an interna- 6. Chalmeta, R., Campos, C., and Grangel, R. Reference
Team. UNECE, Dec. 2014; http://www.unece.org/
architectures for enterprise integration. Journal of
tional oil and gas company. The results Systems and Software 57, 3 (July 2001), 175–191.
unece/search?q=A+Suggested+Framework+for+the+
Quality+of+Big+Data.+&op=Search
show the framework is considered valu- 7. Chui, M., Manyika, J., and Bughin, J. Big data’s
30. Yu, E. Why agent-oriented requirements engineering.
potential for businesses. Financial Times (May 13,
able by corporate management where 2011); https://www.ft.com/content/64095dba-7cd5-
In Proceedings of the Third International Workshop
on Requirements Engineering: Foundation of Software
the case studies were applied. 11e0-994d-00144feabdc0
Quality (Barcelona, Spain, June 16–17). Presses
8. Das, T.K. and Kumar, P.M. Big data analytics:
The framework is useful for guiding A framework for unstructured data analysis.
Universitaires de Namur, Namur, Belgium, 1997, 171–183.

organizations that wish to implement International Journal of Engineering and Technology


5, 1 (Feb.-Mar. 2013), 153–156. Sergio Orenga-Roglá (sergio.orenga@uji.es) is a
a big data ecosystem, as it includes a 9. Debortoli, S., Müller, O., and Vom Brocke, J. Comparing researcher in the Systems Integration and Re-Engineering
methodology that indicates in a clear business intelligence and big data skills: A text (IRIS) research group at the Universitat Jaume I,
mining study using job advertisements. Business & Castellón, Spain.
and detailed way each activity and Information Systems Engineering 6, 5 (Oct. 2014),
task that should be carried out in each 289–300. Ricardo Chalmeta (rchalmet@uji.es) is an assistant
10. Demchenko, Y., de Laat, C., and Membrey, P. Defining professor in the Department of Computer Languages and
of its phases. It also offers a compre- architecture components of the big data ecosystem. Systems and Director of the Systems Integration and
In Proceedings of the International Conference on Re-Engineering (IRIS) research group at the Universitat
hensive understanding of the system. Collaboration Technologies and Systems (Minneapolis, Jaume 1, Castellón, Spain.
Moreover, it provides control over a MN, May 19–23). IEEE Press, 2014, 104–112.
11. Elgendy, N. and Elragal, A. Big data analytics: A
project and its scope, consequences, literature review. In Proceedings of the 14th Industrial
opportunities, and needs. Conference on Data Mining (St. Petersburg, Russia, © 2019 ACM 0001-0782/19/01 $15.00

JANUARY 2019 | VOL. 62 | NO. 1 | COMMUNICATIONS OF THE ACM
