
GKB UNISSULA Semarang

#20 30 October 2018

Interoperability & Data Aggregation


Towards Big Data
Imam Much Ibnu Subroto, ST, M.Sc, Ph.D
Developer of the Science and Technology Index (SINTA), KEMENRISTEKDIKTI
Head of the Informatics Engineering Department, UNISSULA
BIODATA
Short CV: Imam Much Ibnu Subroto, ST, M.Sc, Ph.D
Affiliation: Lecturer, Universitas Islam Sultan Agung
Position: Head of the Informatics Engineering Department; SINTA Expert Team (RISTEKDIKTI); IAES Founder; IEEE member
Areas of expertise: Artificial Intelligence, Data Mining, e-Learning
Born: Semarang, 13 March 1973

EDUCATION
• ST (bachelor's degree), Electrical Engineering, Universitas Gadjah Mada (UGM), 1998
• M.Sc (master's degree), Computer Science, Universiti Teknologi Malaysia (UTM), 2007
• PhD (doctorate), Computer Science, Universiti Teknologi Malaysia (UTM), 2015

ORGANIZATIONAL EXPERIENCE
• 2002-2005: Head of the Computer and Tele-education Unit (UPT Komputer dan Teledukasi), UNISSULA
• 2015-2017: Head of the Master of Electrical Engineering Program (MTE), UNISSULA
• 2017-present: Head of the Informatics Engineering Department, UNISSULA

RESEARCH
• Plagiarism Detection Using Multiple Classifiers
• Adaptive e-Learning Systems using Machine Learning
• Publication Performance Measurement based on Science and Technology Index (SINTA)

DATABASE WORKS
• Founder, SINTA Science and Technology Index (sinta.ristekdikti.go.id)
• Founder, Indonesian Publication Index (IPI) (portalgaruda.org)
• Founder, Garba Rujukan Digital (GARUDA) (garuda.ristekdikti.go.id)
• Founder, JARLITBANGNOV Jateng, Jaringan Peneliti Jawa Tengah (Central Java Researcher Network) (jarlitbangnov.bappeda.jatengprov.go.id)
• Founder, e-SINAU Adaptive e-Learning Systems

INTERNATIONAL JOURNAL MANAGEMENT
• Managing Editor, Journal of Telematics and Informatics (JTI)
• Managing Editor, International Journal of Artificial Intelligence (IJ-AI)
• Editorial Board, Journal of Information Technology and Communication (IJ-ICT)
• Editorial Board, Computer Engineering and Applications Journal (ComengApp)

INTELLECTUAL PROPERTY RIGHTS (HKI)
• Science and Technology Index (SINTA), copyright, RISTEKDIKTI, 2018 (granted)
• GARUDA Garba Rujukan Digital, copyright, RISTEKDIKTI, 2018 (registered)
• E-SINAU Adaptive e-Learning System, copyright, 2018 (granted)

AWARDS
• Third place, Outstanding Lecturer in Science, LLDIKTI (KOPERTIS) Region IV, Central Java, 2018
• GARUDA Developer Award, Director of Intellectual Property, RISTEKDIKTI, 2018
• First place, Outstanding Lecturer, Universitas Islam Sultan Agung, Quality Day, November 2017
• Intellectual Property Award, "Developer of the SINTA Science and Technology Index", Directorate General of Research and Development, RISTEKDIKTI, HAKTEKNAS, 2017
• Best IT Innovation Award for Practical Used, International Conference on Research Innovation in Information System (ICRIIS 2009), Malaysia, 2009
• Bronze Medal, "MyCopyDetect: Plagiarism Detection Tool", INATEX Industrial Art and Technology Exhibition, Malaysia, 2009

INDEXED INTERNATIONAL PUBLICATIONS (SCOPUS)
• H-Index: 5
• Documents: 12
• Citations: 86
Outline
• Project and Research
• About Big Data
• SINTA: Towards Big Data
• Data Collection
• Interoperability and Aggregation
• Data Analysis
• 4 V’s of Big Data in SINTA
• SINTA: Next Work
• AI, Data Mining, Machine Learning
Project and Research
1. SINTA Science and Technology Index
2. GARUDA Garba Rujukan Digital
3. Plagiarism Detection Tools
1) SINTA: Science and Technology Index
http://sinta2.ristekdikti.go.id
2) GARUDA: Garba Rujukan Digital, RISTEKDIKTI
http://garuda.ristekdikti.go.id
3) Plagiarism Detection Tool
• Based on GARUDA (Garba Rujukan Digital)
• Database:
  • Journals
  • Seminars
  • Undergraduate theses, final projects, master's theses, dissertations
  • Research reports
• Plagiarism detection on Indonesian-language documents
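
As a rough illustration of how text overlap between a submitted document and a reference collection can be scored, the sketch below compares word 3-gram sets using Jaccard similarity. This is a generic technique chosen for illustration, not the method used by the tool described here. The function names, the shingle size, and the sample sentences are all assumptions.

```python
import re


def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of word n-grams (shingles) found in a text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical sample sentences in Indonesian (not taken from GARUDA).
reference = "data mining adalah proses pencarian pengetahuan dari data"
suspect = "data mining adalah proses pencarian pola dari data"
unrelated = "sistem e-learning adaptif menggunakan machine learning"

score_suspect = jaccard(shingles(suspect), shingles(reference))
score_unrelated = jaccard(shingles(unrelated), shingles(reference))
```

A higher score means more shared phrasing. A real detector would also need a decision threshold and a way to search a large collection efficiently, neither of which is shown here.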
Research Trends in ML
Based on publications indexed in Scopus
Research Trends in Big Data
Based on publications indexed in Scopus
About Big Data
Introduction
• Indonesia: one of the largest data-producing countries in the world
• Data is like fuel oil: whoever controls the data has the potential to become very wealthy
• SINTA belongs to the Republic of Indonesia and is not business-oriented!
• SINTA towards Big Data
Data Level: DIKW Pyramid

Wisdom: answers "Why?"

Knowledge: answers "How?"

Information: data that carries meaning (the 5 Ws)

Data: a collection of facts
Data Mining
• Data mining (knowledge discovery from data)
• Definition of data mining:
Automatically extracting interesting patterns or knowledge (non-trivial, hidden, previously unknown, potentially useful) from very large amounts of data.
The Information Continuum

Cartoon by David Somerville, based on a two-pane version by Hugh MacLeod


Big Data
Data that is too large or too complex to be managed using traditional data processing, analysis, and storage techniques.
4 V’s of Big Data
Volume: scale of data
Velocity: analysis of streaming data
Variety: different forms of data
Veracity: trustworthiness of data

• Origin
• Authenticity
• Trustworthiness
• Completeness
• Integrity
Big Data Empowering AI and Machine Learning
Industry 4.0
Six Design Principles
• Interoperability: the ability of cyber-physical systems (i.e. work piece carriers,
assembly stations and products), humans and Smart Factories to connect and
communicate with each other via the Internet of Things and the Internet of Services
• Virtualization: a virtual copy of the Smart Factory which is created by linking sensor
data (from monitoring physical processes) with virtual plant models and simulation
models
• Decentralization: the ability of cyber-physical systems within Smart Factories to make
decisions on their own
• Real-Time Capability: the capability to collect and analyze data and provide the
insights immediately
• Service Orientation: offering of services (of cyber-physical systems, humans and Smart
Factories) via the Internet of Services
• Modularity: flexible adaptation of Smart Factories for changing requirements of
individual modules
SINTA: Towards Big Data
Current Progress
Data Integration: A Higher-level Abstraction

Query → Mediated Schema → Semantic Mappings → Sources S1, S2, S3

Independence of:
• source & location
• data model, syntax
• semantic variations
• …

S1
SSN          Name     Category
123-45-6789  Charles  undergrad
234-56-7890  Dan      grad
…

S2
SSN          CID
123-45-6789  CSE444
123-45-6789  CSE444
234-56-7890  CSE142
…

CID     Name               Quarter
CSE444  Databases          fall
CSE541  Operating systems  winter

S3
<cd>
  <title> The best of … </title>
  <artist> Carreras </artist>
  <artist> Pavarotti </artist>
  <artist> Domingo </artist>
  <price> 19.95 </price>
</cd>
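
The idea of querying a mediated schema instead of the individual sources can be sketched in a few lines. The example below is a minimal illustration, assuming the relational sources S1 and S2 are available as in-memory lists holding a few sample rows from the slide. The mediated relation, the function names, and the mapping logic are hypothetical and stand in for whatever semantic mappings a real integration system would define.

```python
# Source S1: student records (SSN, Name, Category), sample rows from the slide.
S1_STUDENTS = [
    {"SSN": "123-45-6789", "Name": "Charles", "Category": "undergrad"},
    {"SSN": "234-56-7890", "Name": "Dan", "Category": "grad"},
]

# Source S2: enrollments (SSN, CID) and courses (CID, Name, Quarter).
S2_ENROLLMENTS = [
    {"SSN": "123-45-6789", "CID": "CSE444"},
    {"SSN": "234-56-7890", "CID": "CSE142"},
]
S2_COURSES = [
    {"CID": "CSE444", "Name": "Databases", "Quarter": "fall"},
    {"CID": "CSE541", "Name": "Operating systems", "Quarter": "winter"},
]


def mediated_enrollment():
    """Hypothetical semantic mapping: expose one mediated relation
    (student, category, course_id, course_name) by joining S1 and S2."""
    students = {s["SSN"]: s for s in S1_STUDENTS}
    courses = {c["CID"]: c for c in S2_COURSES}
    for e in S2_ENROLLMENTS:
        student = students.get(e["SSN"])
        course = courses.get(e["CID"])
        yield {
            "student": student["Name"] if student else None,
            "category": student["Category"] if student else None,
            "course_id": e["CID"],
            # CSE142 has no row in the course table, so its name stays unknown.
            "course_name": course["Name"] if course else None,
        }


def courses_taken_by(student_name):
    """A query written only against the mediated relation, not the sources."""
    return [r["course_id"] for r in mediated_enrollment()
            if r["student"] == student_name]


result = courses_taken_by("Charles")
```

The caller of `courses_taken_by` never sees SSNs or which source holds which table. If a source changes its layout, only the mapping function needs to change.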
Interoperability
• The exchange of information that preserves the meaning and
relationships of the data exchanged.
• Interoperability is the property that allows for the unrestricted
sharing of resources between different systems. This can refer to the
ability to share data between different components or machines,
both via software and hardware, or it can be defined as the exchange
of information and resources between different computers.
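
A small sketch can make "preserving meaning and relationships" concrete. Below, one system exports authors and articles as a single JSON document with explicit field names and id references, and a second system rebuilds the article-to-author relationship from that document alone. The record layout, the field names, and the sample values are assumptions made for illustration, not a format used by SINTA.

```python
import json

# System A: internal records (hypothetical field names and values).
authors = {1: "Author A", 2: "Author B"}
articles = [
    {"id": 10, "title": "Paper X", "author_ids": [1, 2]},
    {"id": 11, "title": "Paper Y", "author_ids": [2]},
]


def export_payload() -> str:
    """Serialize both entities and the links between them into one JSON document."""
    return json.dumps({
        "authors": [{"id": i, "name": n} for i, n in authors.items()],
        "articles": articles,
    })


def import_payload(payload: str) -> dict[str, list[str]]:
    """System B: rebuild the article -> author names relationship from the payload."""
    data = json.loads(payload)
    names = {a["id"]: a["name"] for a in data["authors"]}
    return {
        art["title"]: [names[i] for i in art["author_ids"]]
        for art in data["articles"]
    }


received = import_payload(export_payload())
```

Because the payload carries the author ids alongside the articles, the receiving side can reconstruct who wrote what without access to System A's internal structures.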
Data aggregation
• Data aggregation is any process in which information is gathered and
expressed in a summary form, for purposes such as statistical
analysis. A common aggregation purpose is to get more information
about particular groups based on specific variables such as age,
profession, or income.
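
The definition above can be illustrated with a short group-and-summarize sketch. The records, the grouping variable, and the function name are hypothetical. The point is only to show raw rows being reduced to one summary per group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (profession, age, income).
people = [
    ("lecturer", 45, 900),
    ("lecturer", 38, 700),
    ("engineer", 29, 1100),
    ("engineer", 33, 1300),
    ("engineer", 41, 1500),
]


def aggregate_by_profession(rows):
    """Group rows by profession and summarize each group."""
    groups = defaultdict(list)
    for profession, age, income in rows:
        groups[profession].append((age, income))
    return {
        profession: {
            "count": len(members),
            "mean_age": mean(age for age, _ in members),
            "mean_income": mean(income for _, income in members),
        }
        for profession, members in groups.items()
    }


summary = aggregate_by_profession(people)
```

Here five individual records become two summary rows, one per profession, each with a count and two averages.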
Interoperability Source

PD-DIKTI
Author Ranking: Overall vs. 3 Years
Top Affiliations: 3 Years and Overall
Affiliation Profile
Author Profile
Collaboration Network (Author)
Author (2-level network)
Affiliation Network (Next)


V-1: Data Volume
• Gigabyte-scale database
• MySQL
• Millions of articles
• Half a million authors
• All universities in Indonesia
• Authors from many Asian countries
V-2: Velocity
• 5 million visitors per month
• Daily updates
• Daily synchronization with many other sources
V-3: Variety of Data Sources

PD-DIKTI
V-4: Veracity
Data trustworthiness depends on:
• Profile validity, as verified by PD-DIKTI
• Trust level of the indexer: Scopus, WoS, Crossref, GARUDA, GS
SINTA: Next Work
• Applied machine learning
• Field area categorization
• Prediction of next
• Author performance normalized by field area
• Personality behavior detection
• Publication patterns
• Plagiarism detection
Thank you
imam@unissula.ac.id
