Anda di halaman 1dari 271

rd

IICMA 2015, the 3 IndoMS International


Conference on Mathematics and Its Applications
Proceedings

ISBN: 978-602-50020-0-7

Editors:
Bevina D. Handari
Dipo Aldila
Gianina Ardaneswari

Reviewers :
Titin Siswantining, Mila Novita, Denny Riama Silaban, Maman Faturahman,
Budi Nurani Ruchjana, Herni Utami, Al Jupri, Zulkardi, Yusep Suparman, Yudi Satria,
Yudhie Andriyana,Yekti Widyaningsih, Yaya S. Kusumah, I Wayan Mangku,
Utriweni Mukhaiyar, Turmudi, Syafrizal, Suwanda, Sutarto Hadi, Supama,
Sugi Guritman, Suarsih Utama, Sri Haryatmi, Rianti Setiadi, Pasrun Adam,
Setiawan, Ratu Ilma, Nurtiti Sunusi, Nurdin, Nur Iriawan, Nora Hariadi,
Nelson Nainggolan, Mukhsar, Lienda Noviyanti, Kiki Ariyanti Sugeng, Ipung Yuwono,
Indah Emilla Wijayanti, Hengki Tasman, Hendri Murfi, Ema Carnia, Edi Cahyono,
Djati Kerami, Dipo Aldila, Dedi Rosadi, Dafik, Dadan Kusnandar, Ch.Rini,
Bevina D. Handari, Atiek Iriany, Asrul Sani, Asari, Aryadi Wijaya, Anang Kurnia,
Al Haji Akbar Bachtiar, Al Sutjijana, Ahmad Fauzan, Anak Agung Gede Ngurah,
Agung Lukito

Penerbit:
The Indonesian Mathematical Society

Redaksi:
Gedung CAS Lantai IV, Institut Teknologi Bandung,
Jalan Ganesha No. 10, Bandung 40132, Indonesia.

Hak cipta dilindungi undang-undang


Dilarang memperbanyak buku ini dalam bentuk dan
cara apapun tanpa ijin tertulis dari penerbit.
IICMA 2015
rd
the 3 IndoMS International Conference on
Mathematics and Its Applications
Proceedings
rd th
November 3 and 4 2015
Depok, Indonesia
CONTENTS
FOREWORDS …………………………………………………………………………..........i
President of the Indonesian Mathematical Society (IndoMS) …………….……….…..ii
Chair of the Committee IICMA 2015 ……………………………………………...….....…iii

Acknowledgement …………………………………………………………………..…...…iv

INVITED SPEAKER
ADVANCES IN STATISTICAL GENOMICS OF HUMAN COMPLEX TRAITS AND
DISEASES
Beben Benyamin ……………………………………………………………....………x
MODELLING, PREDICTING AND UNDERSTANDING KIDNEY DEVELOPMENT
Nicholas A.Hamilton ………………………………………………........……………xi
A SIMPLE INTERVENTION FOR DENGUE TRANSMISSION; HOW FAR
MODELING CAN GO
E. Soewono ……………………………………………………………..……………xii
ERROR-CORRECTING PAIRS FOR PUBLIC-KEY CRYPTOSYSTEMS
Ruud Pellikaan and Irene Marques-Corbella ……………………………………..…xiii
ON CLASS OF 𝝀-MODULES
Indah E. Wijayanti, M. Ardiansyah, Puguh W.Prasetya ……………………………..xiv
STATISTICS IN THE ERA OF BIG DATA: A CASE IN ROBUST STATISTICS
Maman A. Djauhari …………………………………………………………………..xv
ACTUARIAL CAPACITY DEVELOPMENT IN INDONESIA
Budi TA Tampubolon ………………………………………….……………...…….xvi
PARALLEL SESSIONS
BOUNDARY VALUE PROBLEMS FOR A CLASS OF HAMILTON-JACOBI-
BELLMAN EQUATIONS
Muhammad Kabil Djafar, Yudi Soeharyadi, Hendra Gunawan …………...……….….1
BOUNDED SETS IN FINITE DIMENSIONAL N-NORMED SPACES
Esih Sukaesih and Hendra Gunawan ………………….………………………….……8
THE STOCHASTIC SI MODEL IN A SINGLE AREA: COMPARISON WITH THE
DETERMINISTIC SI MODEL AND REAL DATA
Benny Yong, Livia Owen, Elvina Octora …………………..….……………………..13

v
INVESTIGATING THE ACCURACY OF BPPT FLYING WING UAV’S
LINEARIZED EQUATION OF MOTION COMPARED TO BPPT-04C SRITI
FLIGHT TEST DATA
Jemie Muliadi ………………………..……………………………….……..……….24
DETERMINATION OF FISHERMEN POVERTY ALLEVIATION PROGRAM
USING ANALYTIC HIERARCHY PROCESS IN PARIGI MOUTONG, CENTRAL
SULAWESI PROVINCE, INDONESIA
Sutikno, Soedarso, Sukardi, Syamsuddin HM, Yusman Alharis, Kartika Nur ‘Anisa ...33
CLASSIFICATION DATA OF CANCER USING FUZZY C-MEANS WITH
FEATURE SELECTION
Bona Revano, Zuherman Rustam …………………………………………..………...40
ON THE RATE OF CONSTRAINED ARRAYS
Putranto Utomo and Ruud Pellikaan ………………………….……..……….………49
ERROR-CORRECTING PAIRS FOR A PUBLIC-KEY CRYPTOSYSTEM
Ruud Pellikan and Irene Marquez-Corbella ……………………...………………..…57
RAINBOW CONNECTION NUMBER AND STRONG RAINBOW CONNECTION
NUMBER OF THE GRAPH (𝒅𝟐(𝑷𝒏) + 𝑲𝟏)
Diah Prastiwi and Kiki A. Sugeng ……………………………………………...…….66
ENCRYPTION ALGORITHM USING NEW MODIFIED MAP FOR DIGITAL
IMAGE
Suryadi M.T., Maria Yus Trinity Irsan, Yudi Satria …………………………..…..….71
FACTORS THAT INFLUENCE INFIDELITY TENDENCY FOR WORKERS AND
SCHOLARS IN JAKARTA
Rianti Setiadi,Dini Riyani, Yanny Arumsari ……………………...……….…………79
FACTORS AFFECTING THE AGREEMENT TOWARDS LEGALIZATION OF
LGBT MARRIAGE LEVEL
Rianti Setiadi1, Rosi Melati ……...…………………………..………..……………. 86
RELATIONSHIP BETWEEN JOB-DISTRESS AND DEMOGRAPHIC FACTORS
AMONG EMPLOYEES IN JAKARTA
Rianti Setiadi, Titin Siswantining, Baizura Fahma, Astari Karamina ……..……….…92

GEOGRAPHICALLY WEIGHTED BIVARIATE NEGATIVE BINOMIAL


REGRESSION (GWBNBNR)
Ahmad Fatih Basitul Ulum, Purhadi and Wahyu Wibowo …………..……………. 100

vi
THE POISSON INVERSE GAUSSIAN (PIG) REGRESSION IN MODELING THE
NUMBER OF HIV NEW CASES (CASE STUDY IN EAST JAVA PROVINCE IN
2013)
Sayu Made Widiari, Purhadi, I Nyoman Latra ………………………………..….…106
SPATIAL ANALYSIS AND MODELLING OF VILLAGE LEVEL POVERTY (A
CASE STUDY: POVERTY MODELLING IN PATI REGENCY)
Duto Sulistiyono, Ismaini Zain, and Sutikno …...….…………………….………… 115
MIXED ESTIMATOR OF KERNEL AND MULTIVARIATE REGRESSION
LINEAR SPLINE TRUCATED IN NONPARAMETRIC REGRESSION
Ali Akbar Sanjaya, I Nyoman Budiantara, Bambang Widjanarko Otok................….124
NONPARAMETRIC MIXED REGRESSION MODEL: TRUNCATED
LINEAR SPLINE AND KERNEL FUNCTION
Rory, I Nyoman Budiantara, Wahyu Wibowo ……………...…………...………..…133
SMALL AREA ESTIMATION WITH BAYESIAN APPROACH (CASE STUDY:
PROPORTIONS OF DROPOUT CHILDREN IN POVERTY)
Amalia Noviamti1, and Kartika Fithriasari, Irhamah ………...………………..……143
AN APPLICATION OF BAYESIAN ADAPTIVE LASSO QUANTILE REGRESSION
TO ESTIMATE THE EFFECT OF RETURN TO EDUCATION ON EARNING
Zablin, Irhammah, and Dedy Dwi Prastyo ……………...……………………..……151
HIERARCHICAL BAYES MODELING IN SMALL AREA FOR ESTIMATING
UNEMPLOYMENT PROPORTION UNDER COMPLEX SURVEY
Arip Juliyanto, Heri Kuswanto, Ismaini Zain …………………...………...…...……160
FOOD INSECURITY STRUCTURE IN PAPUA AND WEST PAPUA
Agustin Riyanti1, Vita Ratnasari, Santi Puteri Rahayu …………….…………….…169
FORECASTING ON INDONESIAN’S FISHERY EXPORT VALUE USING ARIMA
AND NEURAL NETWORK
Eunike W. Parameswari, Brodjol S.S Ulama, Suhartono ………………...…………176
SPATIAL AUTOREGRESSIVE POISSON (SAR POISSON) MODEL TO DETECT
INFLUENTIAL FACTORS TO THE NUMBER OF DENGUE HAEMORRHAGIC
FEVER PATIENTS FOR EACH DISTRICT IN THE PROVINCE OF DKI JAKARTA
Siti Rohmah Rohimah, Sudarwanto, Ria Arafiyah ……………...…………….…….183
LOSS SEVERITY DISTRIBUTION ESTIMATION OF OPERATIONAL RISK
USING GAUSSIAN MIXTURE MODEL FOR LOSS DISTRIBUTION APPROACH
Seli Siti Sholihat, Hendri Murfi ……………………………………..………………191

vii
THE CALCULATION OF AGGREGATION ECONOMIC CAPITAL FOR
OPERATIONAL RISK USING A CLAYTON COPULA
Nurfidah Dwitiyanti, Zuletane Maska, Hendri Murfi, Siti Nurrohmah ……....……..199
INCREASING THE STUDENTS’ MATHEMATICAL COMMUNICATION SKILL
FOR JUNIOR HIGH SCHOOL BY APPLYING REACT (RELATING,
EXPERIENCING, APPLYING, COOPERATING, AND TRANSFERRING)
LEARNING STRATEGY
Della Afrilionita ……………………...………………………………………..……208
INVESTIGATING STUDENTS’ SPATIAL VISUALIZATION IN THE
PROPERTIES OF SOLID FIGURE BY USING DAILY LIFE CONTEXT
Lestariningsih …………………..…….…………………………………………….215
USING MATEMATIKA GASING IN LEARNING MATHEMATICS FOR THREE
DIGIT ADDITION
Aloysius Ajowembun, Falenthino Sampuow,Wiwik Wiyanti, Johannes H. Siregar
.............................................................................................................................................220
IMPLEMENTATION OF MATEMATIKA GASING ADDITION OF MANY
NUMBERS FOR MATRICULATION STUDENTS AT STKIP SURYA,
TANGERANG
Jenerson Otniel Duwit, Delson Albert Gebze,Wiwik Wiyanti, Johannes H. Siregar...227
THE IMPLEMENTATION OF MATEMATIKA GASING ON MULTIPLICATION
CONCEPT TOWARD INTEREST, STUDY MOTIVATION, AND STUDENT
LEARNING OUTCOME
Asri Gita, Nia Yuniarti, Nerru Pranuta M. …………………...……………..……….234
THE DEVELOPMENT OF PALOPO LOCAL CONTEXT LEARNING MODEL
(MODIFIED COOPERATIVE LEARNING TECHNIQUES; THINK PAIR SHARE
AND PROBLEM-BASED LEARNING TECHNIQUES)
Patmaniar, and Darma Ekawati ………………………...……………...……………243

viii
INVITED SPEAKERS

ix
ADVANCES IN STATISTICAL GENOMICS OF HUMAN COMPLEX TRAITS AND
DISEASES
BEBEN BENYAMIN
CENTRE FOR NEUROGENETICS AND STATISTICAL GENOMICS,
QUEENSLAND BRAIN INSTITUTE, THE UNIVERSITY OF QUEENSLAND,
ST LUCIA, BRISBANE, AUSTRALIA, 4072|
b.benyamin@uq.edu.au

Abstract. Since the completion of the Human Genome Project (2003), we have seen dramatic advances
in efforts to dissect the genetic causes of human complex traits and diseases. More than 2000 genetic
variants have been implicated in complex traits and diseases, which some are leading to clinical
applications. These advances were accelerated by the combination of large-scale genomics projects (e.g.
the International HapMap Project and 1000Genome Project), advances in high throughput genomic
technologies (e.g. sequencing and microarray) and the availability of large-scale sample collections
enriched with phenotypic and clinical information. These big data created huge challenges associated with
its statistical analyses. In this lecture, I will discuss some of the results from our research on the statistical
analyses of these genomics data. I will start with the estimation of phenotypic variance due to genetic factors
using pedigree data and genomic data. I will then discuss about the identification of genetic variants
affecting complex traits/diseases and their use in predicting disease. I will also briefly discuss about the
current progress and challenges in the analyses of additional omics data, such as epigenomics.
Keywords: Statistic, Genetic, Genomic, Gene, Complex trait, Complex disease.

x
MODELLING, PREDICTING AND UNDERSTANDING KIDNEY DEVELOPMENT
NICHOLAS A HAMILTON
DIVISION OF GENOMICS OF DEVELOPMENT AND DISEASE AND DIVISION OF CELL BIOLOGY
AND MOLECULAR MEDICINE
INSTITUTE FOR MOLECULAR BIOSCIENCE, THE UNIVERSITY OF QUEENSLAND,
BRISBANE 4064, QUEENSLAND, AUSTRALIA|
n.hamilton@imb.uq.edu.au

Abstract. Kidneys of equal size can vary 10-fold in the number of nephrons, the fundamental filtering
unit of the kidney, at birth. Discovering what regulates such variation has been hampered by a lack of
quantitative analysis to define kidney development. Factors leading to the formation of the ureteric tree, the
branching structure from which nephrons hang, are also poorly understood. However, recent advances in
microscopy such as Optical Projection Tomography and confocal imaging have enabled us to image mouse
kidneys at high resolution at varying developmental stages. In recent work, my group in collaboration with
the developmental biology groups of Prof. Melissa Little and Assoc. Prof. Ian Smyth created a high-
throughput large-scale pipeline for the multi-dimensional analysis of kidney imaging [1]. This has created
a massively dense dataset that describes the multiple stages of kidney development. For example, in one
data set we have some 32 ureteric trees imaged of normal mouse kidneys at 7 distinct stages of development
together with 28 Tgf β+/- mutant trees at the same stages. Using software developed by our collaborators,
the imaging for these trees was segmented and connectivity graph created for each tree. These graphs are
rooted trees in the graph theory sense. We developed new graph algorithms to “overlay” and compare these
trees to give a measure of discordance between trees as well as measures of inclusion [2].
These methods were used to answer questions such as: is there a structural difference between mutant trees
and controls [2]; and is the branch formation stereotypic [3], that is follows essentially the same patterning
in all kidneys across or is there a “random” element? The later question is particularly exciting as it would
suggest that branch formation is tightly regulated and hence potentially might be manipulated to increase
branch/nephron formation. In this presentation I will describe our analysis pipeline and algorithms as well
as recent results we have obtained on defining the rules for branch patterning.
Time permitting, I will also describe a rate-based proliferation model of compartments in the kidney that
enabled us to model both growth and movements of cells between compartments [4].
Key words and Phrases : Kidney, Development, Modelling, Graph alignment, Bio-imaging, Ureteric tree.

xi
A SIMPLE INTERVENTION FOR DENGUE TRANSMISSION;
HOW FAR MODELING CAN GO
E. SOEWONO
The Department Of Mathematics, Institut Teknologi Bandung, Indonesia
esoewono@math.itb.ac.id

Abstract. Problem solving of real world problems is in reality a delicate process, compromising
between complication of the real field conditions and sophistication of the mathematical model. It
is natural that the more real situations to fit, the more complications to add in the model, which may
end up even with no mathematical tools to solve. In the case of Dengue transmission, basic model
in the form of Host-Vector (SIR-SI) Esteva-Vargas type has been known with an endemic threshold
which is defined as the basic reproductive ratio of the disease transmission. This basic model is
necessary to be understood for understanding biological phenomena in the “ideal” situation, such
as which critical biological parameters are crucial in the appearance or disappearance of t he endemic
state. From the basic and simple model, gradual steps of model improvement, accommodating more
proper assumptions, can be constructed naturally, provided enough information is given from the
field. Here a real situation which is easily found in the intervention program for Dengue
transmission is presented. Consider an intervention program which was implemented in a region
where pre-school and elementary school children were instructed to wear “full outfit” (long-sleeved
cloth and trousers for boys, and long skirts for girls) during the school period. It was found that in
the year where the program was implemented, the Dengue outbreak was lower than the previous
year before the start of the intervention. This problem looks simple and simple model c an be
constructed, but then the model is only based on very “ideal” situations which are not representing
the real situation. A step-by-step model constructions and improvement is discussed in this
presentation, along with related mathematical tools to investigate the complication of the models
Key words and Phrases : Problem Solving, Mathematical Model, Dengue.

xii
ERROR-CORRECTING PAIRS FOR PUBLIC-KEY CRYPTOSYSTEMS
RUUD PELLIKAAN1 AND IRENE MARQUES-CORBELLA2
1
Discrete Mathematics, Techn. Univ. Eindhoven P.O. Box 513, 5600 MB Eindhoven, The Netherlands
g.r.pellikaan@tue.nl
2
SECRET Project-Team - INRIA, Paris-Rocquencourt
B.P. 105, 78153 Le Chesnay Cedex France
irene.marquez-corbella@inria.fr

Abstract. Code-based cryptography is an interesting alternative to classic number -theory public


key cryptosystems since it is conjectured to be secure against quantum computer attacks. Many
families of codes have been proposed for these cryptosystems. One of the mai n requirements is
having a high performance t-bounded decoding algorithm which is achieved in the case of a t-error-
correcting pair (ECP). In this article the class of codes with a t-ECP is proposed for the McEliece
cryptosystem. The hardness of retrieving the t-ECP for a given code, in particular for algebraic
geometry codes, is considered. Recent results will be surveyed. See references [1 –8] below
Key words and Phrases : Code-Based Cryptography, Error-Correcting Pairs, Distinguisher
REFERENCES
1. Couvreur, A., M´arquezCorbella, I., Pellikaan, R.: A polynomial time attack against algebraic
geometry code based public key cryptosystems. In: Proceedings IEEE-ISIT 2014, p. 1446 (2014)
2. Couvreur, A., MrquezCorbella, I., Pellikaan, R.: Cryptanalysis of public-key cryptosystems that use
subcodes of algebraic geometry codes. In: Coding Theory and Applications: 4th International Castle
Meeting, Palmela, pp. 133–140 (2014)
3. M´arquez-Corbella, I., Mart´ınez-Moro, E., Pellikaan, R.: The non-gap sequence of a subcode of a
generalized Reed-Solomon code. Designs, Codes and Cryptography 66(1-3), 317–333 (2013)
4. M´arquez-Corbella, I., Mart´ınez-Moro, E., Pellikaan, R.: On the unique representation of very strong
algebraic geometry codes. Designs, Codes and Cryptography 70(1-2), 215–230 (2014)
5. M´arquez-Corbella, I., Mart´ınez-Moro, E., Pellikaan, R., Ruano, D.: Computational aspects of
retrieving a representation of an algebraic geometry code. Journal of Symbolic Computation vol. 64,
pp. 67–87, 2014. 64 (2014)
6. M´arquez-Corbella, I., Pellikaan, R.: Error-correcting pairs for a public-key cryptosystem. preprint
arXiv:1205.3647 (2012)
7. M´arquez-Corbella, I., Pellikaan, R.: Error-correcting pairs and arrays from algebraic geometry codes.
Proc. Appl. Computer Algebra, M´alaga, pp. 129–132 (2013)
8. M´arquez-Corbella, I., Pellikaan, R.: Error-correcting pairs: a new approach to codebased
cryptography. Proc. Appl. Computer algebra, New York, pp. 1–5 (2014).

xiii
ON CLASS OF 𝝀-MODULES
INDAH E. WIJAYANTI1, M. ARDIANSYAH, PUGUH W. PRASETYO
Mathematics Department, Universitas Gajah Masa, Yogyakara, Indonesia
ind_wijayanti@ugm.ac.id

Abstract. By the ring ℛwe mean any commutative ring with unit and the module 𝑀 means a left ℛ-
module, except we state otherwise. An ℛ-module 𝑀is called a multiplication module if for any submodule
𝑁in 𝑀, there is an ideal 𝐼 in ℛ such that 𝑁 = 𝐼𝑀. Let ℒ(𝑀)be the lattice of submodules of ℛ-module𝑀,
where for any submodules 𝑁and 𝐾 in 𝑀 the ‘join’ and ‘meet’ are defined as
𝑁 ∨ 𝐾 = 𝑁 + 𝐾, 𝑁 ∧ 𝐾 = 𝑁 ∩ 𝐾,
And 𝑁 ≤ 𝐾means 𝑁 ⊆ 𝐾, Especially, for 𝑀 = ℛ we have the lattice of ideals in ℛ and it is denoted by
ℒ(ℛ).
Smith (2014) introduced maps between the lattice of ideals of commutative rings and the lattice of
submodules of anℛ-module 𝑀, i.e. 𝜇 and 𝜆 mappings.
𝜇: ℒ(𝑀) → ℒ(ℛ), 𝑁 → 𝐴𝑚𝑛𝑅 (𝑀/𝑁) (0.1)
𝜆: ℒ(ℛ) → ℒ(𝑀), 𝐼 → 𝐼𝑀 (0.2)
The mappings (0.1) and (0.2) are motivated by the relationship of submodules and ideals in a multiplication
module. Some necessary and sufficient conditions for the maps to be a lattice homomorphisms were studied
by Smith (2014). In this work we define a class of modules as following:
𝜆 = {𝑀|(𝐵 ∩ 𝐶)𝑀 = 𝐵𝑀 ∩ 𝐶𝑀, ∀𝐵, 𝐶 finitely generate indeals ofℛ},
And observe the properties of the class. We give a sufficient conditions for the module and the ring such
that the class 𝜆 is a hereditary pretorsion class. Some main results are given below.
Proposition 1. If 𝑀 is a Dedekind divisible ℛ-module, then:
𝑀is a 𝜆-module.
For any 𝑁 ∈ 𝜎[𝑀], 𝑁 ∈ 𝜆.
The class of 𝜆 is closed under submodlues and homomorphic images.
Proposition 2. Let 𝑀 be a faithful multiplication module and Pr𝑢̈ fer.Then 𝜎[𝑀] ⊆ 𝜆.

Corollary 3. Let 𝑅 be a semisimple ring. 𝑀a chain ℛ-module, faithful and a subgenerator for all
semisimpleℛ-modules. Then [𝑀] = 𝜆.
Keywords and Phrases: lattice of ideals; lattice of submodules, multiplication modules, class of modules.

xiv
STATISTICS IN THE ERA OF BIG DATA: A CASE IN ROBUST STATISTICS
MAMAN A. DJAUHARI
RESEARCH FELLOW
INSTITUTE FOR MATHEMATICAL RESEARCH (INSPEM)
UNIVERSITI PUTRA MALAYSIA
maman_abd@upm.edu.my

Abstract. 4-V (Variety, Velocity, Veracity and Volume) is a new paradigm in dealing with data-based
information industry. This is an immediate and direct consequence of all business activities in Big Data.
The term “big” is not only big in volume but most importantly big in terms of complexity; complexity in
its variety of the problems, in computational velocity, and in methodological veracity. In this talk, an
implementation of 4-V paradigm in recent development of statistics will be illustrated for the case of robust
statistics. Discussion will begin with the evolution of FMCD since it was introduced in 1985. It is the most
popular and widely used robust method for estimating location and scale. Mathematically speaking, the
theory of FMCD consists of (i) how to order observation vectors, and (ii) how to determine the most
concentrated data subset. To increase the computational velocity of that method, by keeping its
methodological veracity, most of the researchers have been focusing their works on the second step.
However, since the first step involves matrix inversion, FMCD is still not apt for high dimensional data. To
overcome that obstacle, here we introduce a new method of observation vectors ordering based on the
concept of data depth. We show that its computational velocity is very promising.

xv
ACTUARIAL CAPACITY DEVELOPMENT IN INDONESIA
BUDI TA TAMPUBOLON
Chief of PAI School Division and President Director Of BNI Life, Indonesia
budi.tampubolon@bni-life.co.id

Abstract. Gap between demand for and supply of actuaries in the Indonesian market is described.
Cooperation between the Society of Actuaries of Indonesia and local universities in university's
mathematics and statistics course based actuarial certification and need for creation of actuarial
science study programs in universities for better production system of actuarie s in Indonesia are
also shared

Key words and Phrases : Actuaries, Mathematics, Statistics.

xvi
CONTRIBUTED SPEAKERS

xvii
Proceedings of IICMA 2015
Analysis

Boundary Value Problems for A Class Of


Hamilton-Jacobi-Bellman Equations
Muhammad Kabil Djafar1,a), Yudi Soeharyadi2,b), Hendra
Gunawan3,c)
1,2,3Institut Teknologi Bandung

a)
kabildjafar@gmail.com
b)yudish@math.itb.ac.id

c)hgunawan@math.itb.ac.id

Abstract. The Dirichlet and Neumann boundary value problems for a class of
n
Hamilton-Jacobi-Bellman equations in the first quadrant of R are considered in this
article. Suitably defined, it is shown that the minimum operator coming from a finitely many
hamiltonians of corresponding Hamilton-Jacobi equations is m accretive. The
Crandall-Liggett theorem for nonlinear semigroup generation implies existence of a mild
solution for the terminal value problems of Hamilton-Jacobi-Bellman equation. Using
symmetries and invariance, Neumann problem and Dirichlet problem of
n
Hamilton-Jacobi-Bellman equation in the first quadrant of R is governed by a strongly
continuous quasi contractive semigroups, and therefore proving well-posedness.

Keywords and Phrases: Hamilton-Jacobi-Bellman equation, Abstract Cauchy Problem,


m -accretive operator, boundary value problem.

1. Introduction
Hamilton-Jacobi-Bellman equation is a nonlinear partial differential equation
which frequently appears in optimal control. See for example Evans [6], Lions [15],
and Oksendal [16]. This equation is often formulated as a terminal value problem.
For a family of hamiltonian {H } , the Hamilton-Jacobi-Bellman equation is
expressed as follow

ut min{H ( Du)} = 0 in Rn (0, T )


(1)
u = g on Rn {t = T }

Here, the hamiltonian H : R n R and the terminal function g : Rn R are


given. The unknown function is
n
u :R (0, T ), u = u( x, t ), Du = Dxu = (u x , u x ,...,u x ) .
1 2 n

There are a comprehensive body of knowledge for the


Hamilton-Jacobi-Bellman theory, for example Bardi and Dolcetta [3], Evans [6],
Liberzon [13], Lions [15], and Oksendal [16]. Our concern is the
Hamilton-Jacobi-Bellman equation with finitely many Hamiltonians

1
{Hi : i = 1,2,..., n} for which the coresponding Hamilton-Jacobi equation
ut Hi (Du) = 0 (i = 1,2,..., n) is well-posed.
Wellposedness of Hamilton-Jacobi equation is considered in many
literatures. See for example Aizawa [1], Burch [4], and Goldstein and Soeharyadi
[7]. They show that the infinitesimal generator Ai coresponding to the hamiltonian
H i is m -accretive in the Banach space of uniformly continuous functions on R n .
We will use this results to show that the generator for the semigroup of the
Hamilton-Jacobi-Bellman equation, is also m accretive. For this purpose, we
observe m -accretiveness of the operator A as the minimum of collection of
finitely many operators, Au = min { Ai u}in=1 . We do this in Section 2.
Boundary value problem of Hamilton-Jacobi equation is often discussed in
nonlinear partial differential equation, We can see for instance Aizawa [2], Iishi
[12], Lions [14], and Tataru [18]. Odd solutions, even solutions,and how they
correspond to the Dirichlet BVP, Neumann BVP, and mixed BVP for
Hamilton-Jacobi equations was explored in Burch and Goldstein [5]. Symmetries
and invariance of semigroups, and their relation to the boundary value problem is
discussed in Goldstein, Goldstein, and Soeharyadi [8]. Their results are used in
Section 4 to obtain well-posedness of the boundary value problems of
Hamilton-Jacobi-Bellman equation in the first quadrant of R n .

2. Main Results
2.1. The Minimum of M-Accretive Operators
Let X be a Banach spaces of bounded and uniformly continuous of
functions (BUC) and be any open domain in R n . Let A1 , A2 be m -
accretive operators with domain D( A1 ) = D( A2 ) = D . The minimum A , of the
collection of operators { A1 , A2 } is defined as follows

A1u( x) if A1u(x) A2 u(x)


Au ( x) = (2)
A2u( x) if A2 u(x) A1u(x)
for u D(A) , x .

Since each of Ai ,i = 1,2 is acretive, then for any > 0 we have

u v (I A1 )u ( I A1 )v (3)

u v (I A2 )u ( I A2 )v (4)

with u, v D . From equation (3) we have


u v sup{| ( I A1 )u ( x) ( I A1 )v( x) |}
x
(5)
= sup{| (u ( x) v( x)) ( A1u ( x) A1v( x)) |}
x

2
On the other hand, from equation (4) we have

u v sup{| ( I A2 )u ( x) ( I A2 )v( x) |}
x
(6)
= sup{| (u ( x) v( x)) ( A2u ( x) A2v( x)) |}
x

Based on equation (5) and (6) , we have four cases for accretiveness of the
operator A
1. A1u( x) A2u( x), A1v( x) A2v( x) ,
2. A2u( x) A1u( x), A2v( x) A1v( x) ,
3. A1u( x) A2u( x), A2v( x) A1v( x) ,
4. A2u( x) A1u( x), A1v( x) A2v( x) .
For the first case, accretiveness of the operator A follows from the
accretiveness of the operator A1 . Likewise, for the second case, accretiveness
follows from the one of the operator A2 .

For the third case, Au1 = A1u1 , Au 2 = A2u2 , consider

Au ( x) = A1u( x), Av( x) = A2v( x). (7)

Since A2v( x) A1v( x) , then we can write


A1v( x) = A2v( x) 0 (8)

with some 0 > 0 . Using equation (5) and (8) we have


u v sup{| (u ( x) v( x)) ( A1u ( x) A1v( x)) |}
x
= sup{| (u ( x) v( x)) ( A1u ( x) A1v( x)) |}
x (9)
= sup{| (u ( x) v( x)) ( A1u ( x) ( A2v( x) 0 )) |}
x
= sup{| (u ( x) v( x)) ( A1u ( x) A2 v( x)) 0 |}
x

Since 0 always positive, the equation (9) become

u v sup{| (u( x) v( x)) ( A1u( x) A2v( x)) |} (10)


x

Using (7) and (10) we have

u v sup{| (u ( x) v( x)) ( A1u ( x) A2v( x)) |}


x
= sup{| (u ( x) v( x)) ( Au ( x) Av ( x)) |}
x
= sup{| (u ( x) v( x)) Au ( x) Av ( x) |}
x (11)
= sup{| (u ( x) Au ( x)) (v( x) Av ( x)) |}
x
= sup{| ( I A)u ( x) ( I A)v( x) |}
x
= (I A)u ( I A)v

3
thus, the operator A is accretive.
For the fourth case is similar to the third one. Thus, we can conclude that the
operator A is accretive.
Furthermore, we will show that the operator I A , for > 0 , is
surjective. Lets h X . From m -accretiveness of operator of A1 , A2 , we have
u, v D such that
(I A1 )u = h, ( I A2 )v = h.
Let us define 1 := {x : A1u( x) A2v( x)} , and 2 = \ 1 . Furthermore
we define
u ( x) if x 1
w( x) :=
v( x) if x 2 .

Let x . Then we have ( I A1 )u( x) = h( x) and ( I A2 )v( x) = h( x) . If


x 1 , then w( x) = u ( x) and Aw( x) = A1u ( x) A2v( x) . Then we have
(I A)w( x) = ( I A1 )u( x) = h( x).
On the other hand, if x \ 1 , then w( x) = v( x) and
Aw( x) = A2v( x) A1u( x) . Then we have
(I A)w( x) = ( I A2 )v( x) = h( x).
Then for any h X , we always have w D such that ( I A) w = h . Thus, the
operator I A is surjective. Since the operator A is accretive and I A is
surjective, then we can conclude that the operator A is m -accretive. By
mathematical induction, this result can be generalize for the minimum of collection
of finitely many m -accretive operators.
Proposition 2.1 If A = {A1 , A2 ,..., An } is a finite collection of m -accretive
operators on Banach space X = ( BUC) , then the operator A = min A is also
m - accretive .

2.2. Semigroup Approach for Hamilton-Jacobi-Bellman Equation


Hamilton-Jacobi-Bellman equation is often formulated as a terminal value
problem. To analyze this equation using semigroup approach, let us transform it to
an initial value problem. Introducing new variable s = T t , we have

ut min{ H i ( Du)}in=1 = 0 in Rn (0, T )


(12)
u = g on Rn {s = 0}
We then can express the initial value problem of Hamilton-Jacobi-Bellman equation
as an abstract Cauchy problem in Banach space X = BUC( )

ut Au = 0, u(0) = g (13)

with Au = min { Ai u}in=1 . In this case, Aiu = Hi ( Du) . If the operator Ai is m

4
-dissipative, then, by definition, the operator Ai is m -accretive.
Nonlinear semigroups are generated by m -accretive operators. Aizawa [1]
and Burch [4] showed that the operator A = H D defined by Au = H ( Du) on
a suitable domain is densely defined and m -accretive on X . By Crandall-Liggett
theorem, the initial value problem of Hamilton-Jacobi equation is governed by a
strongly continuous contractive nonlinear semigroup T0 = {T (t ) : t 0} . In this
case, u(t ) = T0 (t )u0 , for t > 0 , is the unique mild solution of the Cauchy problem
with the initial data u0 X.
The initial value problem (12) is actually a variant of Hamilton-Jacobi
equation. In this case, the Hamiltonian H is defined as
H ( Du) := min { H i ( Du)}in=1. (14)

If each operator Ai that corresponding to the hamiltonian H i in (14) is m


-dissipative, then the operator Ai that corresponds to the hamiltonian H i is
m -accretive. Using Proposition 2.1 , and the Crandall-Liggett theorem, the abstract
Cauchy problem (13) that coresponding to the initial value problem of Hamilton-
Jacobi equation (12) , is governed by a semigroup of nonlinear operator T (t ) , and
the solution can be expressed as an evolution of the initial data
u(t ) = T (t )u (0), t > 0.
u (0) = g

2.3. Symetries, Invariance, and Boundary Value Problems


We now consider some boundary value problem of the
Hamilton-Jacobi-Bellman equation on the first quadrant
Rn = {( x1 , , xn } : xi > 0, i = 1, , n}

of R n . Consider

ut min{ H i ( Du)}in=1 = 0 in Rn (0, T )


(15)
u = g on Rn {s = 0}
with Dirichlet boundary condition

u( x, t ) = 0, x Rn (0, T ) (16)
or Neuman boundary condition
u
( x, t ) = 0, x R n (0, T ) (17)
n

For boundary value problem, we refer to Goldstein, Goldstein, and


n
Soeharyadi [8]. They defined subsets of the Banach space X = BUC(R ) as
follow

5
X e := {u X : u is even}
X so := {u X : u is skew - odd}
They showed that if the hamiltonian H is even, the Hamilton-Jacobi equation with
Neumann condition is governed by a strongly continuous quasi contractive
semigroup Se (t ) : X e X e on BUC(Rn ) . Furthermore, they also showed the
generation case for skew odd hamiltonian H , in an even dimensional n of the
spatial domain. They showed that the Hamilton-Jacobi equation with Dirichlet
condition is governed by a strongly continuous quasi contractive semigroup
Sso (t ) : X so X so on BUC(Rn ) . Direct application of these results to the
minimum operator, hence the Hamilton-Jacobi-Bellman equation, yields:
1. If each hamiltonian H i is even, for i = 1, , k , then the generalized Neuman
problem (15), (17), is governed by a strongly continuous quasi contractive
semigroup Se (t ) : X e X e on BUC(Rn ) .
2. If each hamiltonian H i is skew odd and the spatial dimension n of R n is
even, the Dirichlet problem (15), (16), is governed by a strongly continuous
quasi contractive semigroup Sso (t ) : X so X so on BUC(Rn ) .
These shows that the spaces X e , X so are invariant under the action of the
restricted semigroups. However, we know that these symmetries (even, skew-odd)
encode boundary condition, namely an even function satisfies generalized Neumann
condition, and skew-odd function satisfies Dirichlet condition on the boundary
Rn . Therefore the above result can be interpreted as well-posedness of BVPs.

Acknowledgement.
This research is supported by Hibah Riset dan Inovasi ITB 2015.

References
[1] S. Aizawa, A semigroups treatment of the Hamilton-Jacobi equation in several
space variables, Hiroshima Math. J 6, 15-30, 1976.
[2] S. Aizawa, A mixed initial and boundary value problem for the
Hamilton-Jacobi equation in several space variables, Funkcial. Ekvac. 9,
139-150, 1966.
[3] M. Bardi and I.C. Dolcetta, Optimal Control and Viscosity Solution of
Hamilton-Jacobi-Bellman Equation, Birkhauser, 2008.
[4] B.C. Burch, A semigroups treatment of the Hamilton-Jacobi equation in one
space variable, J. Differential Equations 23, 107-124, 1977.
[5] B.C. Burch and J.A Goldstein, Some boundary value problem for the
Hamilton-Jacobi equation, Hiroshima Math J. 8, 223-233, 1978.
[6] L.C. Evans, Partial Differential Equations, American mathematical Society,
USA, 1998.
[7] J. A. Goldstein and Y. Soeharyadi, Regularity of perturbed Hamilton-Jacobi
equations, Nonlinear Analysis 51, 239-248, 2002.

6
[8] J.A. Goldstein, G.A. Goldstein, and Y. Soeharyadi, On symetries, invariances,
and boundary value problems for the Hamilton-Jacobi equation , J. Comp.
Anall.and Appl. 8, 205-222, 2006.
[9] G. R. Goldstein and J. A. Goldstein, Invariant sets for nonlinear operator, In
Stochastics Processes and Functional Analysis (A. Krinik and R. Swift eds ),
Marceel Decker, 141-147, 2004.
[10] G. R. Goldstein and J. A. Goldstein eds, Semigroups of Linear and Nonlinear
Operations and Applications, Springer, 1993.
[11] K.S. Ha, Nonlinear Functional Evolutions in Banach Spaces, Springer,
2003.
[12] Iishi. H, A boundary value problem of the Dirichlet type for Hamilton-Jacobi
equation, Ann. Scuola Norm. Sup. Pisa Cl. Sci, 16, 105-135, 1989.
[13] D. Liberzon, Calculus of Variation and Optimal Control Theory, Princeton
University Press, 2012.
[14] P. L. Lions, Neumann type boundary condition for Hamilton-Jacobi
equations, Duke Math J. 52, 793-820, 1985.
[15] P. L. Lions, Hamilton-Jacobi-Bellman equation and the Optimal Control of
Stochastic Systems, Proceedings of the International Congress of
Mathematicians, Warszawa, 1983.
[16] B. Oksendal, Stochastic Differential Equations, Springer, 2005.
[17] R. E. Showalter, Monotone Operators in Banach Space and Nonlinear Partial
Differential Equations, American mathematical Society, USA, 1997.
[18] D. Tataru, Boundary value problems for first order Hamilton-Jacobi
equations, Nonlin. Anal. 19, 1091-1110, 1992.

7
Proceedings of IICMA 2015
Analysis

Bounded Sets in Finite Dimensional N-Normed


Spaces
Esih Sukaesih1,a) and Hendra Gunawan2,b)
1
Institut Teknologi Bandung, Jl. Ganesha no. 10 Bandung
2
Institut Teknologi Bandung, Jl. Ganesha no. 10 Bandung

a)
esih_s@students.itb.ac.id
b)
hgunawan@math.itb.ac.id

Abstract. Gunawan et al. [5] introduced the notion of boundedness with respect to a linearly
independent set. In this paper we show that in finite dimensional 𝑛-normed spaces, a set 𝐾 ⊂
𝑋 is bounded with respect to a linearly independent set if and only if set 𝐾 is bounded with
respect to another linear independent set. Consequently, a set 𝐾 is bounded if it is so with
respect to any linearly independent set.
Keywords and Phrases: 𝑛-norm, 𝑛-normed spaces, bounded set.

1. Introduction

In 1963, Gähler ([9], [10], [11]) introduced the theory of 2-normed spaces
and extended it to 𝑛-normed spaces. For 𝑛 ∈ ℕ and 𝑋 be a real vector space
(dim(𝑋)≥ 𝑛). A real function ‖∙, ⋯ ,∙‖: 𝑋 𝑛 → [0, ∞) which satisfies the following
conditions for all 𝑥, 𝑥1 , ⋯ , 𝑥𝑛 ∈ 𝑋 and for any 𝛼 ∈ ℝ is called an 𝑛-norm,

(1) ‖𝑥1 , ⋯ , 𝑥𝑛 ‖ = 0 if and only if 𝑥1 , ⋯ , 𝑥𝑛 linearly dependent,


(2) ‖𝑥1 , ⋯ , 𝑥𝑛 ‖ invariant to permutation,
(3) ‖𝛼𝑥1 , ⋯ , 𝑥𝑛 ‖ = |𝛼|‖𝑥1 , ⋯ , 𝑥𝑛 ‖ for every 𝛼 ∈ ℝ,
(4) ‖𝑥1 + 𝑥, 𝑥2 , ⋯ , 𝑥𝑛 ‖ ≤ ‖𝑥1 , 𝑥2 , ⋯ , 𝑥𝑛 ‖ + ‖𝑥, 𝑥2 , ⋯ , 𝑥𝑛 ‖.

and the pair (𝑋, ‖∙, ⋯ ,∙‖) is called an 𝑛-normed spaces.

Any inner product space (𝑋, 〈∙,∙〉) can be equipped by standard 𝑛-norm ([4],
[6])

‖𝑥1 , ⋯ , 𝑥𝑛 ‖ = √𝑑𝑒𝑡(〈𝑥𝑖 , 𝑥𝑗 〉),

8
where geometrically, ‖𝑥1 , ⋯ , 𝑥𝑛 ‖ can be interpreted as the volume of 𝑛-
dimensional parallelepiped spanned by 𝑥1 , ⋯ , 𝑥𝑛 ∈ 𝑋.

Various aspect in 𝑛-normed spaces were developed ([2], [3]). Recently,


Harikrishnan and Ravindran [7] introduced the notion of boundedness in 2-normed
spaces. Then Kir and Kiziltunc [8] extended it to the boundedness in 𝑛-normed
spaces.
Definition 1.1. Let (𝑋, ‖∙, ⋯ ,∙‖) be a linear 𝑛-normed space, 𝐾 be a nonempty
subset of 𝑋 and 𝑥 ∈ 𝐾 then 𝐾 is said to be 𝑥-bounded if there exist some 𝑀 > 0
such that ‖𝑥, 𝑥2 , ⋯ , 𝑥𝑛 ‖ ≤ 𝑀 for all 𝑥2 , ⋯ , 𝑥𝑛 ∈ 𝐾. If for all 𝑥 ∈ 𝐾, 𝐾is 𝑥-bounded
then 𝐾is called a bounded set [8]..

Subsequently, Kir and Kiziltunc's definition apply only for 𝐾 ⊂ 𝑋 with 𝑛 ≤


rank(𝐾) ≤ dim(𝑋) and rectified it by the boundedness with respect to a linearly
independent set [5].
Definition 1.2. Let (𝑋, ‖∙, ⋯ ,∙‖) be an 𝑛-normed space, 𝐾 be a nonempty subset of
𝑋 and 𝐴 = {𝑎1 , ⋯ , 𝑎𝑚 } be a linearly independent set (𝑚 ≥ 𝑛). Then 𝐾 is called
bounded with respect to 𝐴 if there is 𝑀 > 0 such that
‖𝑥, 𝑎𝑖2 , ⋯ , 𝑎𝑖𝑛 ‖ ≤ 𝑀
for every 𝑥 ∈ 𝐾 and for every {𝑖2 , ⋯ , 𝑖𝑛 } ⊂ {1, ⋯ , 𝑚} [5].
By means of norms that are introduced in [1], we show the connection of
boundedness with respect to any linearly independent set to that with respect to its
induced norm.

2. Main Results

Hereafter, let (𝑋, ‖ ∙, ⋯ ,∙ ‖) be a finite dimensional 𝑛-normed space (dim(𝑋)=


𝑑), 𝐾 be a nonempty set of 𝑋, and 𝐴 = {𝑎1 , ⋯ , 𝑎𝑚 } be a set of 𝑚 linearly
independent vectors in 𝑋 (rank(𝐴) = 𝑚), where 𝑛 ≤ 𝑚 ≤ 𝑑.

By Definition 1.2 , 𝔅𝐴 (𝑋, ‖∙, ⋯ ,∙‖) is collection of bounded set with respect
to 𝐴. If a set 𝐾 is bounded with respect to 𝐴, then 𝐾 ∈ 𝔅𝐴 (𝑋, ‖∙, ⋯ ,∙‖).

In 2011, Burhan [1] introduced the following norm.


Proposition 2.3. [1]Let 𝐴 = {𝑎1 , ⋯ , 𝑎𝑛 } is a set of n linearly independent vectors in
𝑋, then
1
2
2
‖𝑥‖𝐴 = [ ∑ ‖𝑥, 𝑎𝑖2 , ⋯ , 𝑎𝑖𝑛 ‖ ]
{𝑖2 ,⋯,𝑖𝑛 }⊂{1,⋯,𝑛}

is a norm in 𝑋. Norm ‖∙‖𝐴 derived from an 𝑛-norm and 𝐴.

In the following corollary, we show the correlation between the boundedness


with respect to 𝐴 and the boundedness in (𝑋, ‖∙‖𝐴 ). Recall that a set 𝐾 is bounded
in a normed space (𝑋, ‖∙‖) if and only if there is 𝑀 > 0 such that ‖𝑥‖ ≤ 𝑀 for
every 𝑥 ∈ 𝐾.

9
Corollary 2.4. Let (𝑋, ‖ ∙, ⋯ ,∙ ‖) be a finite dimensional 𝑛-normed space (dim(𝑋)=
𝑑) which also equipped with a norm ‖∙‖𝐴, 𝐾 be a nonempty set of 𝑋, and 𝐴 =
{𝑎1 , ⋯ , 𝑎𝑛 } be a set of 𝑛 linearly independent vectors in 𝑋. A set 𝐾 is bounded with
respect to 𝐴 if and only if 𝐾 is bounded in (𝑋, ‖∙‖𝐴 ).

P ROOF . Because of the boundeness with respect to 𝐴 then we have


2
‖𝑥, 𝑎𝑖2 , ⋯ , 𝑎𝑖𝑛 ‖ ≤ 𝑀2 for every 𝑥 ∈ 𝐾 and for every {𝑖2 , ⋯ , 𝑖𝑛 } ⊂ {1, ⋯ , 𝑛},
1
2 2
such that ‖𝑥‖𝐴 = [∑ {𝑖2 ,⋯,𝑖𝑛 }⊂{1,⋯,𝑛}‖𝑥, 𝑎𝑖2 , ⋯ , 𝑎𝑖𝑛 ‖ ] ≤ √𝑛𝑀.

Conversely, the boundedness of 𝐾 in (𝑋, ‖∙‖𝐴 ) ensure that ‖𝑥, 𝑎𝑖2 , ⋯ , 𝑎𝑖𝑛 ‖ ≤ 𝑀
for every 𝑥 ∈ 𝐾 and for every {𝑖2 , ⋯ , 𝑖𝑛 } ⊂ {1, ⋯ , 𝑛}.□
Here, we have 𝔅(𝑋, ‖∙‖𝐴 ) is collection of bounded set in (𝑋, ‖∙‖𝐴 ). If a set
𝐾 is bounded in (𝑋, ‖∙‖𝐴 ), then 𝐾 ∈ 𝔅(𝑋, ‖∙‖𝐴 ).
For dim(𝑋)= 𝑛, 𝐴 is a basis of 𝑋. For dim(𝑋)= 𝑑 > 𝑛, let 𝐵 = {𝑏1 , ⋯ , 𝑏𝑑 }
is a basis of 𝑋, that is, a set of 𝑑 linearly independent vectors in 𝑋. The same as
Proposition 2.3, Burhan [1] also introduced a norm derived from an 𝑛-norm and a
basis of 𝑋.

Proposition 2.5. [1]Let 𝐵 be a basis of 𝑋, then


1
2
2
‖𝑥‖𝐵 = [ ∑ ‖𝑥, 𝑎𝑖2 , ⋯ , 𝑎𝑖𝑛 ‖ ]
{𝑖2 ,⋯,𝑖𝑛 }⊂{1,⋯,𝑑}

is a norm in X. Norm ‖∙‖𝐵 derived from an 𝑛-norm and 𝐵.

We also have correlation of the boundedness in the following corollary.


Corollary 2.6. Let (𝑋, ‖ ∙, ⋯ ,∙ ‖) be a finite dimensional 𝑛-normed space (dim(𝑋)=
𝑑) which also equipped with a norm ‖∙‖𝐵 , 𝐾 be a nonempty set of 𝑋, and 𝐵 =
{𝑏1 , ⋯ , 𝑏𝑑 } be a basis of 𝑋. A set 𝐾 is bounded with respect to 𝐵 if and only if 𝐾 is
bounded in (𝑋, ‖∙‖𝐵 ).

P ROOF . Because of the boundeness with respect to 𝐵 then we have


2
‖𝑥, 𝑏𝑖2 , ⋯ , 𝑏𝑖𝑛 ‖ ≤ 𝑀2 for every 𝑥 ∈ 𝐾 and for every {𝑖2 , ⋯ , 𝑖𝑛 } ⊂ {1, ⋯ , 𝑑}, such
1
22 𝑑
that ‖𝑥‖𝐵 = [∑ {𝑖2 ,⋯,𝑖𝑛 }⊂{1,⋯,𝑑}‖𝑥, 𝑏𝑖2 , ⋯ , 𝑏𝑖𝑛 ‖ ] ≤ √( ) 𝑀.
𝑛−1
Conversely, the boundedness of 𝐾 in (𝑋, ‖∙‖𝐵 ) ensure that ‖𝑥, 𝑏𝑖2 , ⋯ , 𝑏𝑖𝑛 ‖ ≤ 𝑀
for every 𝑥 ∈ 𝐾 and for every {𝑖2 , ⋯ , 𝑖𝑛 } ⊂ {1, ⋯ , 𝑑}.□
Let 𝔅(𝑋, ‖∙‖𝐵 ) is collection of bounded set in (𝑋, ‖∙‖𝐵 ). If a set 𝐾 is bounded in
(𝑋, ‖∙‖𝐵 ), then 𝐾 ∈ 𝔅(𝑋, ‖∙‖𝐵 ).

10
In finite dimensional spaces, we have equivalence between any two norms.
Consequently for any set of 𝑚 linearly independent vectors 𝐴 = {𝑎1 , ⋯ , 𝑎𝑚 } in 𝑋
(𝑛 ≤ 𝑚 ≤ dim(𝑋)), we have the following condition.
Lemma 2.7. Let (𝑋, ‖ ∙, ⋯ ,∙ ‖) be a finite dimensional 𝑛-normed space (dim(𝑋)=
𝑑) which also equipped with a norm ‖∙‖, and 𝐾 be a nonempty set of 𝑋. A set 𝐾 is
bounded with respect to 𝐴 if and only if 𝐾 is bounded in (𝑋, ‖∙‖).

P ROOF . Because of the boundeness with respect to 𝐴 then we have


2
‖𝑥, 𝑎𝑖2 , ⋯ , 𝑎𝑖𝑛 ‖ ≤ 𝑀2 for every 𝑥 ∈ 𝐾 and for every {𝑖2 , ⋯ , 𝑖𝑛 } ⊂ {1, ⋯ , 𝑚} (by
Corrolary 2.4 for 𝑚 = 𝑛 and by Corrolary 2.6 for 𝑚 = 𝑑), such that ‖𝑥‖𝐴 =
1
2 2 𝑚
[∑ {𝑖2 ,⋯,𝑖𝑛 }⊂{1,⋯,𝑚}‖𝑥, 𝑏𝑖2 , ⋯ , 𝑏𝑖𝑛 ‖ ] ≤ √( ) 𝑀. We have that any two norms
𝑛−1
in finite dimensional space are equivalent. So we have the relation between the
boundedness in (𝑋, ‖∙‖𝐴 ) and the boundedness in (𝑋, ‖∙‖), as desired.□
Let 𝔅(𝑋, ‖∙‖) is collection of bounded set in (𝑋, ‖∙‖). If a set 𝐾 is bounded in
(𝑋, ‖∙‖) then 𝐾 ∈ 𝔅(𝑋, ‖∙‖).

FIGURE 2.1 The relation between the boundedness with respect to 𝐴 and the
boundedness in (𝑋, ‖∙‖).

In an 𝑛-normed space (𝑋, ‖ ∙, ⋯ ,∙ ‖), we have many sets of 𝑚 linearly


independent vectors in 𝑋 (𝑛 ≤ 𝑚 ≤ 𝑑). If 𝐴1 , 𝐴2 are sets of 𝑚 linearly
independent vectors in 𝑋, we show that the boundedness in (𝑋, ‖∙‖) ties up the
boundedness with respect to 𝐴1 and the boundedness with respect to 𝐴2 .
Lemma 2.8. Let (𝑋, ‖ ∙, ⋯ ,∙ ‖) be a finite dimensional 𝑛-normed space, 𝐾 be a
nonempty set of 𝑋, 𝐴1 = {𝑎11 , ⋯ , 𝑎1𝑚1 } be a set of 𝑚1 linearly independent
vectors in 𝑋 (𝑛 ≤ 𝑚1 ≤ 𝑑), and 𝐴2 = {𝑎21 , ⋯ , 𝑎2𝑚2 } be a set of 𝑚2 linearly
independent vectors in 𝑋 (𝑛 ≤ 𝑚2 ≤ 𝑑, 𝐴1 ≠ 𝐴2 ). A set 𝐾 is bounded with respect
to 𝐴1 if and only if 𝐾 is bounded with respect to 𝐴2 .
P ROOF . Use Lemma 2.7 and Corollary 2.4 (for 𝑚2 = 𝑛) or use Corollary 2.6
(for 𝑚2 = 𝑑).□
By Lemma 2.8, we show that the boundedness of a set 𝐾 does not depend to
the choice of a linearly independent set.

11
References
[1] M. J. I. Burhan, Teorema Titik Tetap di Ruang Norm-n Berdimensi Hingga,
Master thesis, Institut Teknologi Bandung, (2011)
[2] H. Gunawan, On n-inner product, n-norms, and the Cauchy-Schwarz
inequality, Sci. Math. Jap. Online Vol. 5, 47-54, (2001)
[3] H. Gunawan and M. Mashadi, On n-normed spaces, IJMMS, 27:10, 631-
639, (2001)
[4] H. Gunawan, The space of p-summable sequences and its natural n-norm,
Bull. Ausral. Math. Soc., Vol.64, 137-147, (2001)
[5] H. Gunawan, O. Neswan, and E. Sukaesih, Fixed point theorems on
bounded sets in an n-normed spaces, JMA, Vol. 6-3, 51-58, (2015)
[6] H. Gunawan, On convergence in n-inner product spaces, Bull. Malaysian
Math. Sc. Soc., Second series 25, 11-16, (2002)
[7] P. K. Harikrishnan and K. T. Ravindran, Some properties of Accretive
operators in linear 2-normed spaces, Int. Math. Forum, 6 no 59, 2941-2947,
(2011)
[8] M. Kir and H. Kiziltunc, On fixed point theorems for contraction mappings
in n-normed spaces, Applied Math. Information Sci. Letters, 2 (2014), 59-
64, (2014)
[9] S. Gähler, Lineare 2-normierte Räume, Math. Nachr., 28, 1-43, (1964)
[10] S. Gähler, Untersuchungen Uber Verallgemeinerte m-Metrische Räume I,
Math. Nachr. 40, 165-189., (1969)
[11] S. Gähler, Untersuchungen Uber Verallgemeinerte m-Metrische Räume II,
Math. Nachr. 40, 229-264., (1969)
[12] E. Kreyzig, Introductiory Functional Analysis with Applications, John
Wiley & Sons, New York., (1978)

12
Proceedings of IICMA 2015
Applied Math

The Stochastic SI Model in A Single Area:


Comparison with The Deterministic SI Model
and Real Data
Benny Yong1,a), Livia Owen2,b), Elvina Octora3,c)
1,2,3
Department of Mathematics, Faculty of Information Technology and Science
Parahyangan Catholic University, Jalan Ciumbuleuit 94 Bandung 40141
a)
benny_y@unpar.ac.id
b)
livia.owen@unpar.ac.id
c)
octorasoem@gmail.com

Abstract. Deterministic and stochastic models often used in mathematical modeling. It is


important to know the relationship between these models and the results obtained from the
models. Here we discuss the stochastic SI (Susceptible-Infected) model in a single area. Solution
of the stochastic SI model will be solved numerically using the Euler-Maruyama method. Data
SARS from WHO in 2003 in Singapore and Hong Kong will be used to compare solution of the
deterministic and stochastic SI models with the real data.
Keywords: deterministic, stochastic, stochastic differential equation, Euler-Maruyama method

1. Introduction
Mathematical models are used in many real problems, for example in
predicting the weather, the predator and prey problem, and the dynamics of the
population in the spread of infectious diseases. Many infectious diseases that can
be modeled by mathematical models. This paper will discuss about a mathematical
model for the spread of SARS (Severe Acute Respiratory Syndrome) using data of
infected cases in Singapore and Hong Kong in 2003.
Mathematical models can be formed in deterministic or stochastic.
Deterministic models are fully determined by the parameter values and initial
conditions while stochastic models possess some inherent randomness. Some
mathematical models for infectious diseases are solved in deterministic, for
example on a mathematical model for dengue fever, tuberculosis, and HIV ([3],
[2], [7]). In deterministic models, the solution is found with solving system of
differential equations. Usually, we found the equilibrium points and analyze its
stabilities. There are many methods to solve stochastic models numerically. In this
paper, the solution of stochastic SI model in a single area will be solved
numerically by using Euler-Maruyama method. Numerical simulations using
MATLAB will compared the solution of deterministic and stochastic model with
the real data.

13
2. Main Results
2.1. Deterministic SI Model in A Single Area
Based on the assumptions and transmission diagram given in paper Octora et
al. [6], the SI (Susceptible-Infected) model in a single area is given by:
𝑑𝑆(𝑡) 𝛽𝑆(𝑡)𝐼(𝑡)
𝑑𝑡
= 𝑎 − 𝑆(𝑡)+𝐼(𝑡) − 𝑏𝑆(𝑡) + 𝑑𝐼(𝑡)
{ 𝑑𝐼(𝑡) 𝛽𝑆(𝑡)𝐼(𝑡)
(1)
𝑑𝑡
= 𝑆(𝑡)+𝐼(𝑡) − (𝑐 + 𝑑)𝐼(𝑡)

where 𝑆 and 𝐼 represent the number of susceptible and infected population at time 𝑡
(in days) respectively. The susceptible population is increased by recruitment of
individuals 𝑎 and recovered individuals with rate 𝑑, whereas this population is
decreased by natural death with rate 𝑏 and new infected individuals with
transmission rate 𝛽. We assumed that the recovered individuals gain no immunity
and can be re-infected. The infected population is decreased by death caused by the
disease with rate 𝑐 (𝑐 > 𝑏) and recovered individuals with rate 𝑑, whereas this
population is increased by new infected individuals with transmission rate 𝛽.
The model (1) has two equilibrium points, they are disease free equilibrium
𝑎
point (𝑆1 ∗ , 𝐼1 ∗ ) = ( , 0) that stable iff ℜ0 < 1 and endemic equilibrium point
𝑏
𝑎 𝑎(ℜ0 −1)
(𝑆2 ∗ , 𝐼2 ∗ ) = ( , ) that stable iff ℜ0 > 1, where the basic
𝑏+𝑐(ℜ0 −1) 𝑏+𝑐(ℜ0 −1)
𝛽
reproductive number ℜ0 is and it satisfies the condition 𝑐1 < 𝑐 < 𝑐2 , 𝑐1,2 =
𝑐+𝑑
1 1
(𝑏 + 𝛽 − 𝑑) ± √𝛽 2 + (2(𝑏 − 𝑑))𝛽 + (𝑏 + 𝑑)2 .
2 2

2.2. The Stochastic SI Model in A Single Area


Because of the uncertainty factor on the number of SARS infected
individuals each day, the more precise mathematical model to describe the
dynamics of the population on the spread of infectious diseases is using a
stochastic model. In this section, we derive the stochastic version from the
deterministic SI model in a single area. To solve the equation numerically, we use
Euler-Maruyama method in [4].
We assume the total population in a single area is constant, 𝑆(𝑡) + 𝐼(𝑡) = 𝑛,
𝛽
for any 𝑡 and set 𝜆 = 𝑛. For describe the dynamics of infected population, the
𝑑𝐼(𝑡)
differential equation 𝑑𝑡
in the deterministic model (1) is modified to the
following stochastic differential equation:

𝑑𝐼(𝑡) = (𝜆(𝑛 − 𝐼(𝑡)) − (𝑐 + 𝑑)) 𝐼(𝑡)𝑑𝑡 + 𝜇𝑔(𝐼(𝑡))𝑑𝑊(𝑡), (2)

𝐼(0) = 𝐼0 , 0 ≤ 𝑡 ≤ 𝑇
where 𝜇 and 𝜆 are a real constants and 𝑊(𝑡) is a random variable following a
Brownian motion. The first term in the right hand side of equation (2) is the
deterministic part of the model, while the second term is the stochastic part.

14
First, we construct a Brownian motion where 𝑊(𝑡) is sampled at discrete
𝑇
time-steps 𝑡. Set 𝛿𝑡 = with 𝑁 ∈ ℤ+ , then we have 𝑑𝑊𝑗 = 𝑊𝑗 – 𝑊𝑗−1 , 𝑗 =
𝑁
1,2, … , 𝑁 where 𝑑𝑊𝑗 ~√𝛿𝑡 𝑁(0,1).
We apply Euler-Maruyama method in equation (2) in interval [0, 𝑇] with
𝑇
discrete time-steps. Define the increment Δ𝑡 = 𝐿
with 𝐿 ∈ ℤ+ and set 𝜏𝑗 = 𝑗Δ𝑡, we
have
𝐼𝑗 = 𝐼𝑗−1 + (𝜆(𝑛 − 𝐼𝑗−1 )– (𝑐 + 𝑑))𝐼𝑗−1 Δ𝑡 + 𝜇𝑔(𝐼𝑗−1 )(𝑊(𝜏𝑗 ) – 𝑊(𝜏𝑗−1 )),
𝑗 = 1,2, … , 𝐿 (3)
where 𝐼𝑗 is the numerical approximation to 𝐼(𝜏𝑗 ). If 𝜇 = 0, equation (3) is the Euler
approximation to the deterministic model given by equation (1).
Next, increment Δ𝑡 is calculated from 𝑅 × δ𝑡 with 𝑅 ∈ ℤ. Those establish the
set of points {𝑡𝑗 }, where the Brownian path contains {𝜏𝑗 }. Moreover, increment
𝑊(𝜏𝑗 ) – 𝑊(𝜏𝑗−1 ) is calculated from Euler-Maruyama method,

𝑊(𝜏𝑗 ) – 𝑊(𝜏𝑗−1 ) = 𝑊(𝑗𝑅δ𝑡) – 𝑊((𝑗 − 1)𝑅δ𝑡)


𝑗𝑅

= ∑ 𝑑𝑊𝑘
𝑘=𝑗𝑅−𝑅+1

To calculate the average error of the simulation, we use the following formula:

2
√∑𝑘𝑖=1(𝐼̂𝑖 − 𝐼𝑖 )
𝐸=
𝑘
where 𝐼𝑖 is the cumulative number of infected individuals from the real data at time
𝑖, 𝐼̂𝑖 is the cumulative number of infected individuals from model at time 𝑖, and 𝑘 is
time interval.

2.3. Numerical Simulation


In this section, we present our numerical simulation for stochastic SI model in
a single area. We compare the deterministic and stochastic solutions with the real
data of SARS infected individuals that occurred in Singapore and Hong Kong in
2003. The data taken from WHO [5] during 96 days, from March 17, 2003 to July
11, 2003.

Model 1: 𝒈(𝑰) = 𝑰
In this numerical simulation, we apply 𝑔(𝐼) = 𝐼. Parameter values for
Singapore are 𝑐 = 0.05, 𝑑 = 0.01, and 𝑛 = 206, while we use parameter values
𝑐 = 0.02, 𝑑 = 0.1, dan 𝑛 = 1755 for Hong Kong. If 𝜇 = 0, the stochastic model
in equation (2) will be the deterministic model because there is no stochastic part in
this equation. Furthermore, we will look for the value of 𝜆 so that the average error
with μ = 0 is smallest.

15
Number of infected population

Number of infected population


t (days) t (days)

FIGURE 1. The number of SARS FIGURE 2. The number of SARS


infected population in Singapore infected population in Hong Kong with
with 𝜇 = 0 and the variation of 𝜆 𝜇 = 0 and the variation of 𝜆

In Table 1, we see that the smallest average error in Singapore and Hong
Kong occurred when 𝜆 = 0.068 and 𝜆 = 0.0085 respectively.
TABLE 1. The average error for SARS infected data in Singapore (SG) and Hong
Kong (HK) with 𝜇 = 0 and the variation of 𝜆

𝜆 𝑆𝐺 𝐸𝑆𝐺 𝜆𝐻𝐾 𝐸𝐻𝐾


0.05 2.012947 0.0080 6.574687
0.068 0.738320 0.0085 5.402897
0.0728 0.857625 0.0091 6.483367

In next figure, we will see the number of SARS infected population in


Singapore and Hong Kong with 𝜆 that give the smallest average error and the
variation of 𝜇. The number of path is 100.
Number of infected population

Number of infected population

t (days) t (days)

FIGURE 3. The number of SARS FIGURE 4. The number of SARS


infected population in Singapore with infected population in Hong Kong with
𝑔(𝐼) = 𝐼, 𝜆 = 0.068 and the 𝑔(𝐼) = 𝐼, 𝜆 = 0.0085 and the variation
variation of 𝜇 of 𝜇

16
In Table 2, we see that the average error in Singapore and Hong Kong for
stochastic model is slightly larger than deterministic model. The results in Table 2
to Table 9 are example of samples, because if the simulation is repeated with the
same parameter values, we will obtain different results with different average
errors, but the results does not differ much.
TABLE 2. The average error for SARS infected data in Singapore (SG) and Hong
Kong (HK) with 𝜇 ≠ 0 and 𝑔(𝐼) = 𝐼

𝜇 Number of path 𝐸𝑆𝐺 𝐸𝐻𝐾


0.1 100 0.766682 5.663882
0.5 100 1.519922 9.213685
0.9 100 1.758386 18.616213

In paper Ang [1], 𝜆 = 0.05. Next simulation will compare the number of infected
population for SARS outbreak in Singapore from model Octora et al. [6] (we call
Model A) and model Ang [1] (we call Model B) with 𝜇 = 0.25 and the number of
path is 500.
Number of infected population

Number of infected population

t (days) t (days)

FIGURE 5. The number of SARS FIGURE 6. The number of SARS


infected population in Singapore for infected population in Singapore for
Model A with 𝑔(𝐼) = 𝐼, 𝜆 = Model B with 𝑔(𝐼) = 𝐼, 𝜆 = 0.068 and
0.068 and 𝜆 = 0.05 𝜆 = 0.05

Table 3 shows the simulation results of number of SARS infected population


in Singapore for Model A and Model B with the smallest average error occured
when 𝜆 = 0.068.

17
TABLE 3. The average error for SARS infected data in Singapore for Model A
and Model B with 𝑔(𝐼) = 𝐼

𝜆 𝜇 Number of path 𝐸𝐴 𝐸𝐵
0.068 0.25 500 0.899946 0.824351
0.05 0.25 500 2.464749 2.122516

Figure 7 and 8 show the simulation results of number of SARS infected


population in Hong Kong for Model A and Model B with 𝜇 = 0.25 and the
number of path is 500.
Table 4 shows the simulation results of number of SARS infected population
in Hong Kong for Model A and Model B with the smallest average error occurred
when 𝜆 = 0.0085.
TABLE 4. The average error for SARS infected data in Hong Kong for Model A
and Model B with 𝑔(𝐼) = 𝐼

𝜆 𝜇 Number of path 𝐸𝐴 𝐸𝐵
0.0085 0.25 500 8.602190 8.142499
0.05 0.25 500 52.301216 52.725848
Number of infected population

Number of infected population

t (days) t (days)

FIGURE 7. The number of SARS FIGURE 8. The number of SARS


infected population in Hong Kong for infected population in Hong Kong for
Model A and 𝑔(𝐼) = 𝐼 with 𝜆 = Model B and 𝑔(𝐼) = 𝐼 with 𝜆 =
0.0085 and 𝜆 = 0.05 0.0085 and 𝜆 = 0.05

Model 2: 𝒈(𝑰) = √𝑰
In previous subsection, we show the numerical simulations with 𝑔(𝐼) = 𝐼. In
this subsection, we apply 𝑔(𝐼) = √𝐼. We use same parameters with previous
subsection. For 𝜇 = 0, the results have been shown in Figure 1 and 2, because
model (2) will be deterministic model.

18
Now we want to find the smallest average error for SARS infected
population in Singapore and Hong Kong with 𝜆 = 0.068 and 𝜆 = 0.0085
respectively. In the following figure, we describe the number of SARS infected
population in Singapore and Hong Kong with varying 𝜇 and the number of path is
100.Number of infected population

Number of infected population


t (days) t (days)

FIGURE 9. The number of SARS FIGURE 10. The number of SARS


infected population in Singapore infected population in Hong Kong
with 𝑔(𝐼) = √𝐼 , 𝜆 = 0.068, and with 𝑔(𝐼) = √𝐼 , 𝜆 = 0.0085, and
the variation of 𝜇 the variation of 𝜇

In Table 5, we see that the smallest average error in Singapore occurred at


𝜇 = 0.1 and the smallest average error in Hong Kong occurred at 𝜇 = 0.5. The
average error for stochastic model with 𝑔(𝐼) = √𝐼 for both Singapore and Hong
Kong are slightly smaller than its deterministic model.
TABLE 5. The average error for SARS infected data in Singapore (SG) and Hong
Kong (HK) with 𝜇 ≠ 0 and 𝑔(𝐼) = √𝐼

𝜇 Number of path 𝐸𝑆𝐺 𝐸𝐻𝐾


0.1 100 0.735629 5.404616
0.5 100 0.755446 5.392932
0.9 100 0.766904 5.397073

In next simulation, we will compare the number of infected population for SARS
outbreak in Singapore between Model A and Model B with 𝜇 = 0,25 and the
number of path is 500.

19
Number of infected population

Number of infected population


t (days) t (days)

FIGURE 11. The number of SARS FIGURE 12. The number of SARS
infected population in Singapore for infected population in Singapore for
Model A with 𝑔(𝐼) = √𝐼 , 𝜆 = Model B with 𝑔(𝐼) = √𝐼 , 𝜆 =
0.068, and 𝜆 = 0.05 0.068, and 𝜆 = 0.05

Table 6 shows the simulation result of the number of SARS infected


population in Singapore for Model A and Model B, with the smallest average error
occured when 𝜆 = 0.068.
TABLE 6. The average error for SARS infected data in Singapore for Model A
and Model B with 𝑔(𝐼) = √𝐼

𝜆 𝜇 Number of path 𝐸𝐴 𝐸𝐵
0.068 0.25 500 0.741732 0.753567
0.05 0.25 500 1.956201 1.939687

Figure 13 and 14 show the simulation results of the number of SARS infected
population in Hong Kong for Model A and Model B with 𝜇 = 0.25 and the
number of path is 500.
Number of infected population

Number of infected population

t (days) t (days)

FIGURE 13. The number of SARS FIGURE 14. The number of SARS
infected population in Hong Kong for infected population in Hong Kong for
Model A with 𝑔(𝐼) = √𝐼 , 𝜆 = Model B with 𝑔(𝐼) = √𝐼 , 𝜆 =
0.0085, and 𝜆 = 0.05 0.0085, and 𝜆 = 0.05

20
Table 7 shows the simulation result of the number of SARS infected
population in Hong Kong for Model A and Model B with the smallest average
error occurred when 𝜆 = 0,0085.
TABLE 7. The average error for SARS infected data in Hong Kong for Model A
and Model B with 𝑔(𝐼) = √𝐼

𝜆 𝜇 Number of paths 𝐸𝐴 𝐸𝐵

0.0085 0.25 500 5.362697 5.845456

0.05 0.25 500 52.507509 52.660696

In the following simulations, we vary the number of paths at the value of μ that yields the smallest average error.
FIGURE 15. The number of SARS infected population in Singapore with μ = 0.1 and a varying number of paths
FIGURE 16. The number of SARS infected population in Hong Kong with μ = 0.5 and a varying number of paths
Table 8 shows the average error for the SARS data in Singapore and Hong Kong when we vary the number of paths. For the SARS data in Singapore, the average error grows as the number of paths increases. For the SARS data in Hong Kong, the smallest average error occurred when the number of paths is 100.
TABLE 8. The average error for SARS infected data in Singapore and Hong Kong
for Model A with 𝜇 that implies the smallest average error and 𝑔(𝐼) = √𝐼
Number of paths  1  100  500  1000
E_SG  0.725367  0.736501  0.737072  0.742474
E_HK  5.587203  5.348280  5.428575  5.420294
Figures 17 and 18 show the simulation results of the deterministic model and the stochastic model, for both g(I) = I and g(I) = √I, against the real data.
FIGURE 17. The number of SARS infected population in Singapore for the deterministic model and the stochastic model
FIGURE 18. The number of SARS infected population in Hong Kong for the deterministic model and the stochastic model
Table 9 shows that for Model 1 (g(I) = I), the solution of the deterministic model is better than that of its stochastic counterpart. For Model 2 (g(I) = √I), however, the stochastic model is better than the deterministic one; for the SARS data in Singapore this occurred when μ = 0.1 and for the SARS data in Hong Kong when μ = 0.5.
For the SARS data in Singapore and Hong Kong, with either Model A or Model B, the model with g(I) = √I estimates the real data better than the model with g(I) = I. This means that the selection of g(I) in the stochastic model is essential to obtain an accurate result (a solution with smaller average error).
TABLE 9. The average error for SARS infected data in Singapore (SG) and Hong
Kong (HK) in Model A with 100 paths.
μ  0  0.1  0.5
E_SG, g(I)=I  0.738320  0.766682  1.519922
E_HK, g(I)=I  5.402897  5.663882  9.213685
E_SG, g(I)=√I  0.738320  0.735629  0.755446
E_HK, g(I)=√I  5.402897  5.404616  5.392932
3. Concluding Remarks
Mathematical modeling of epidemics mostly uses deterministic models that involve a system of differential equations. In this paper, we presented a stochastic SI model in a single area and compared its solution with the solution of the deterministic SI model and with real data. The numerical simulations using the Euler-Maruyama method show that the solution of the stochastic SI model in a single area is good enough when compared with the deterministic solution and with the real WHO data of SARS infections in Singapore and Hong Kong in 2003. For the stochastic model, the model with g(I) = √I has a smaller average error than the model with g(I) = I, so the model with g(I) = √I is better. The proper selection of the parameters gives a model solution that is accurate with respect to the real data. For further research, one can look for other functions g and other parameter values that give better results with smaller average error, and numerical methods other than Euler-Maruyama can also be used to find the solution numerically.
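To make the scheme concrete, the following is a minimal Euler-Maruyama sketch in Python. The drift f(I) and all parameter values below are placeholders (the actual drift is the one of model (2) earlier in the paper); only the noise structure μ g(I) dW follows the text.

import numpy as np

# Minimal Euler-Maruyama sketch for a stochastic SI-type model
# dI = f(I) dt + mu * g(I) dW; the drift f stands in for model (2).
def euler_maruyama(f, g, I0, mu, T, n_steps, n_paths, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    I = np.full(n_paths, float(I0))
    mean_path = [I.mean()]
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)  # Brownian increments
        I = I + f(I) * dt + mu * g(I) * dW
        I = np.maximum(I, 0.0)  # keep the infected count non-negative
        mean_path.append(I.mean())
    return np.array(mean_path)

# Illustration only: a logistic-type drift with lambda = 0.068,
# g(I) = sqrt(I), mu = 0.1, averaged over 100 paths.
lam, N = 0.068, 4600.0  # hypothetical values, not fitted to the data
mean_I = euler_maruyama(lambda I: lam * I * (1.0 - I / N), np.sqrt,
                        I0=1.0, mu=0.1, T=100.0, n_steps=1000, n_paths=100)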
Acknowledgment. The authors would like to thank the anonymous reviewers for their valuable comments and suggestions.
References
[1] K.C. Ang, A simple stochastic model for an epidemic: numerical experiments with MATLAB, The Electronic Journal of Mathematics and Technology, 1(2): 117-128, (2007)
[2] C. Castillo-Chavez and Z. Feng, Mathematical models for the disease dynamics of tuberculosis, Biometrics Unit, Cornell University, New York, (1996)
[3] Z. Feng and J.X. Velasco-Hernandez, Competitive exclusion in a vector-host model for the dengue fever, Journal of Mathematical Biology, 35: 523-544, (1997)
[4] D.J. Higham, An algorithmic introduction to numerical simulation of stochastic differential equations, SIAM Review, 43(3): 525-546, (2001)
[5] Laboratory Biosafety Manual, Global Alert and Response (GAR), www.who.int/csr/sars/country/en, [5 November 2014], (2004)
[6] E. Octora, B. Yong, and L. Owen, Analisis model S-I untuk satu dan dua wilayah, Prosiding Seminar Nasional Matematika UNPAR, 9: 100-110, (2014)
[7] B. Yong, Model penyebaran HIV dalam sistem penjara, Jurnal Matematika, Ilmu Pengetahuan Alam, dan Pengajarannya, 36(1): 31-47, (2007)
Proceedings of IICMA 2015
Applied Math
Investigating the Accuracy of BPPT Flying Wing UAV's Linearized Equation of Motion Compared to BPPT-04C Sriti Flight Test Data
Jemie Muliadi1,a)
1 BPPT, The Agency for the Assessment and the Application of Technology
a) Jemie.muliadi@bppt.go.id
Abstract. Aircraft movements are governed by the Newtonian equations of mechanics and by Euler's moment equations. These equations are highly nonlinear and, in their complete form, impose a high computational load. For practical reasons, linearization methods were developed to reduce the complexity of computing the dynamic solution of the aircraft model. The linearization is quite accurate for small angles in equilibrium flight, because the multiplied terms, i.e. the nonlinear terms, vanish after the multiplication of small values, so the linear terms dominate the solution of the equation of motion. So far, the method works well for conventional aircraft consisting of wings, fuselage (body), vertical tail and horizontal tail. Recently, the Agency for the Assessment and the Application of Technology (BPPT, Indonesia) has developed the unconventional "Flying Wing" configuration UAV. The configuration is unconventional due to the omission of both tails in order to reduce the drag force. Therefore, an assessment needs to be performed to investigate the suitability of the linearization method for the Flying Wing UAV compared to its exact solution. The simulation results show that the omission of the tail does not much affect the accuracy of the linearized model compared to its exact solution. We conclude that, although the Flying Wing is an unconventional configuration, it is suitable to be modeled using the linearized equation of motion.
Keywords and Phrases: Linearized Equation of Motion, Flying Wing, BPPT UAV.
1. Introduction
BPPT-04C "Sriti", the Flying Wing UAV
The Indonesian Agency for the Assessment and the Application of Technology (BPPT) has conducted Unmanned Aerial Vehicle (UAV) research resulting in several types of prototypes. These UAVs each serve a specific mission, from military purposes such as supporting border patrol to agricultural purposes such as providing low-altitude hyperspectral imaging [1]. The configurations vary from the conventional type named BPPT-01A "Wulung", which uses a rectangular wing and an ordinary tail, to the BPPT-01B "Gagak" and BPPT-02A "Pelatuk", which use tapered-rectangular wings with unconventional tails (i.e. the V tail and the inverted-V tail), and the BPPT-03A "Laron", which eliminates the wings and uses a rotor instead [2]. The variety of designs then continued to the tailless form of UAV, named BPPT-04A "Sriti", which is a flying wing type Unmanned Aerial Vehicle (UAV). This UAV consists only of fuselage and wings, without tail planes, neither a horizontal nor a vertical one. Conventionally, an aircraft is recognized not only by its wing and fuselage, but also by the horizontal and vertical tail designed for flight stability during operation. However, the overall configuration and the omission of the tail of the BPPT-04C "Sriti" were designed for mobility, especially so that it can be carried in a backpack [3].
FIGURE 1. The Flying Wing UAV “Sriti”, Developed by BPPT, Indonesia
Flight Mechanics, a branch of Flight Physics, has evolved into hundreds of methods to analyze the performance, stability, control and dynamics of aircraft in its conventional configuration. Restricted by the safety requirements of passenger transportation, it is understandable that the development of new aircraft has been bound to the conventional configuration, and that most aircraft designers have clearly avoided exploring unconventional aircraft configurations.
In the era of Unmanned Aerial Systems technologies, aircraft configuration design has experienced a paradigm shift. Instead of being trapped in the conventional configuration, UAV designers have shown rapid breakthroughs in developing new configurations. The Predator, a combat UAV, has an inverted-V tail, which blurs the definition of the "horizontal" and "vertical" tail planes. Like the Predator, Medium Altitude Long Endurance UAVs mostly adopt the V-tail configuration, which places the two tail planes like butterfly wings. At the micro scale, the multi-rotor concept has now resulted in many new UAVs, such as tricopters (with 3 propellers), quadcopters (with 4 propellers), hexacopters (with 6 propellers) and many others.
Under the unmanned concept, a UAV is designed and built to best fulfill its mission or the features it offers. The trade-off between advantages and disadvantages becomes the main consideration in choosing the optimum UAV configuration. To answer the need for a mobile UAV, portable enough to be broken down and carried in a backpack, the tailless UAV becomes a suitable idea. This unconventional configuration, also known as "the flying wing configuration", omits the conventional tail; the function of the omitted tail is compensated by a specially designed contour of the wing.
Conventional Flight Mechanics analyzes aircraft flight with the tail parameters as the most important factor for stability and maneuverability. Although the complete aircraft equation of motion is nonlinear, it is commonly linearized to simplify the analysis. It is known that for conventional aircraft, conventional Flight Mechanics performs an accurate analysis. But what about the analysis of unconventional aircraft flight? Does the linearization method still achieve the same accuracy? Intuitively, we may hypothesize that the omission of the tail will reduce the accuracy of the linearization method. But does this loss of accuracy truly occur when the linearization method is applied to the flying wing? If the accuracy decreases, how large does the error become?
Since the linearization method is widely used for its simplicity and low computational cost, it should undergo an analysis measuring its accuracy when applied to a Flying Wing aircraft. This work simulates the solution of the linearized equation of motion using the flight test data of BPPT's Flying Wing "Sriti". The approximated result is then compared with the true flight data measurements to analyze the error.
2. Main Results
2.1 The Linearized Aircraft Equation of Motion
2.1.1. The Complete Equation for Rotation Mode
McRuer et al. [10] assume the aircraft to be rigid and the earth to be fixed in space in order to apply Newton's laws of mechanics and the Euler moment equation in deriving the equation of motion. Consider an aircraft with linear momentum vector p and angular momentum H with respect to an inertial coordinate reference system. By Newton's second law, the time rate of change of the linear momentum equals the sum of all externally applied forces:
F = dp/dt (1)
Analogous to the linear case, in rotational movement the rate of change of the angular momentum equals the sum of all applied torques M:
M = dH/dt (2)
These vector differential equations are the starting point for describing the body motions of the aircraft.
FIGURE 2. Definition of Aircraft Body Axis xyz, Velocity Vector UVW, Angular Rate Vector PQR,
Forces XYZ and Moments LMN (Source:[6])
To simplify the notation, from now on the derivative with respect to time is represented by a dot above the variable. In mathematical expression:
ẋ = dx/dt (3)
To analyze the aircraft movement along each body axis, the equation is expanded into the complete form of the aircraft equation of motion (Allerton [6], McRuer et al. [10]):

L = Ix Ṗ + QR(Iz − Iy) − Iyz(Q² − R²) − Ixz(Ṙ + PQ) − Ixy(Q̇ − PR)
M = Iy Q̇ + PR(Ix − Iz) + Ixz(P² − R²) − Iyz(Ṙ − PQ) − Ixy(Ṗ + QR)
N = Iz Ṙ + PQ(Iy − Ix) − Ixy(P² − Q²) − Ixz(Ṗ − QR) − Iyz(Q̇ + PR)   (4)

The aircraft equation of motion relates the external torques (the control moments L, M, N about the body axes x, y, z, respectively) to the angular velocities P, Q, R and their rates Ṗ, Q̇, Ṙ, together with their coefficients: the moments of inertia Ix, Iy, Iz about the body axes and the products of inertia Ixy, Iyz, Ixz in their respective axes.
The angular velocities P, Q and R rotate the aircraft about the x axis ("rolling"), the y axis ("pitching") and the z axis ("yawing"). These rotations are illustrated in Figure 3. With the common notation of the roll angle φ, the pitch angle θ and the yaw angle ψ, the kinematic relation between P, Q, R and φ̇, θ̇, ψ̇ is expressed as:

[φ̇]   [1   sin φ tan θ   cos φ tan θ] [P]
[θ̇] = [0   cos φ         −sin φ     ] [Q]
[ψ̇]   [0   sin φ sec θ   cos φ sec θ] [R]   (5)
FIGURE 3. Definition of Aircraft Roll, Pitch and Yaw Angles (Source: [6])

2.1.2. Linearization of the Equation
Most aircraft missions are planned to be flown straight and level. In that steady-state flight, the nonlinear equation can be linearized around its trimmed condition. If the aircraft has undergone a good design process and is constructed into a statically stable configuration, then any small deviation from its steady (trimmed) flight condition will produce an opposing force and moment that recover its attitude for a stable flight.
Linearizing inter-variable multiplied terms. To linearize PQ, assume that in the steady-state condition the aircraft flies at P0 and Q0 while experiencing slight disturbances δP and δQ. Since steady flight means P0, Q0 and R0 are zero, and the product of δP and δQ is so small that it can safely be neglected,
PQ = (P0 + δP)(Q0 + δQ) = P0Q0 + P0 δQ + Q0 δP + δP δQ ≈ 0 (6)
Using the same approximation,
PQ ≈ QR ≈ PR ≈ 0 (7)
Linearizing squared terms. To linearize P², the same assumption holds: in the steady-state condition the aircraft flies at P0 while experiencing a slight disturbance δP. Since steady flight means P0, Q0 and R0 are zero, and the product of δP with itself is so small that it can safely be neglected,
P² = (P0 + δP)² = P0² + 2P0 δP + (δP)² ≈ 0 (8)
Using the same approximation,
P² ≈ Q² ≈ R² ≈ 0 (9)
Linearizing trigonometric terms. For the trigonometric terms the same assumption holds, and the approximations become:
sin θ = sin(θ0 + δθ) = sin θ0 cos δθ + cos θ0 sin δθ ≈ 0 · 1 + 1 · δθ = δθ
cos θ = cos(θ0 + δθ) = cos θ0 cos δθ − sin θ0 sin δθ ≈ 1 · 1 − 0 · δθ = 1
tan θ = (tan θ0 + tan δθ)/(1 − tan θ0 tan δθ) ≈ (0 + δθ)/(1 − 0 · δθ) = δθ
sec θ = 1/cos θ ≈ 1/1 = 1
(10)
The results:
sin θ ≈ δθ ; cos θ ≈ 1 ; tan θ ≈ δθ ; sec θ ≈ 1 (11)
2.1.3. Linearized Rotational Equation of Motion
Substituting Eqs. (7) and (9) into Eq. (4), and noting that Ixy = Iyz = 0 for an aircraft symmetric about the x-z plane, produces the linearized rotational equation of motion:
L = Ix Ṗ − Ixz Ṙ
M = Iy Q̇
N = Iz Ṙ − Ixz Ṗ (12)
A further substitution of Eq. (11) into Eq. (5) results in the linearized kinematic equation (after omitting the δ notation):
φ̇ = P
θ̇ = Q
ψ̇ = R (13)
2.2. Comparing the Linearized Model to the Flight Test Result
To compare the performance of the linearized equation with the actual measured values, the following sequence is used as the simulation steps (a code sketch follows the list):
1. Compute the control moments L, M, N from the flight data, to be fed to the linearized equation: compute L, M, N using Eq. (4), with the flight data from [11] as input.
2. Apply the computed L, M, N to the linearized equation and extract Ṗ, Q̇, Ṙ from it: compute Ṗ, Q̇, Ṙ using Eq. (12) (with the 5-point numerical differentiation of [12]), with the L, M, N from Step 1 as input.
3. Numerically integrate Ṗ, Q̇, Ṙ to obtain P, Q, R, then transform them into φ̇, θ̇, ψ̇ using Eq. (13).
4. Numerically integrate φ̇, θ̇, ψ̇ to obtain φ, θ, ψ, then compare them with the φ, θ, ψ from the flight data.
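The following Python sketch illustrates steps 2-4 under stated assumptions: a symmetric-aircraft inertia set (Ixy = Iyz = 0, as in Eq. (12)), illustrative inertia values, and synthetic moment histories standing in for the L, M, N computed from the flight data of [11]; trapezoidal integration stands in for the quadrature used in the study.

import numpy as np

# Sketch of the comparison pipeline of Sec. 2.2; inertia values and
# moment histories are hypothetical, not the Sriti's actual data [11].
Ix, Iy, Iz, Ixz = 1.2, 1.8, 2.5, 0.1   # kg*m^2, illustrative only
dt = 0.1                               # assumed sampling interval

def linearized_rates(L, M, N):
    # Invert Eq. (12): recover Pdot, Qdot, Rdot from the moments.
    A = np.array([[Ix, -Ixz], [-Ixz, Iz]])
    Pdot, Rdot = np.linalg.solve(A, np.array([L, N]))
    return Pdot, M / Iy, Rdot

def integrate(rates, x0=0.0):
    # Trapezoidal integration of a sampled rate signal.
    return np.concatenate(([x0], x0 + np.cumsum((rates[1:] + rates[:-1]) * dt / 2)))

t = np.arange(0.0, 60.0, dt)
L_h = 0.02 * np.sin(0.2 * t)           # synthetic moment histories; in the
M_h = 0.01 * np.cos(0.1 * t)           # study these come from Eq. (4)
N_h = np.zeros_like(t)                 # applied to the measured flight data

Pd, Qd, Rd = np.array([linearized_rates(l, m, n)
                       for l, m, n in zip(L_h, M_h, N_h)]).T
P, Q, R = integrate(Pd), integrate(Qd), integrate(Rd)
# Eq. (13): for small angles phi_dot = P, theta_dot = Q, psi_dot = R
phi, theta, psi = integrate(P), integrate(Q), integrate(R)
# |phi - phi_flight|, etc., would then give the error plots of Figs. 4-6.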
2.3. Analysis of the Comparison
2.3.1. Analyzing the Accuracy of the Roll Approximation
FIGURE 4. Plot Absolute Value of Roll Error

The solid purple line in Figure 4 shows the plot of the absolute value of the roll-angle error of the linearized solution compared to the flight-data value. The errors of the roll-angle approximation are below 25 degrees.
The increase of the error from data number 260 onward occurred during the turning maneuver of the Sriti. Thus, in the cruise maneuver, i.e. when the aircraft is in equilibrium flight, the linearization method shows adequate accuracy in modeling the flight.
FIGURE 5. Plot Absolute Value of Pitch Error

2.3.2. Analyzing the Accuracy of the Pitch Approximation
The solid green line in Figure 5 shows the plot of the absolute value of the pitch error of the linearized solution compared to the flight-data value. The errors of the pitch approximation are below 40 degrees.
The pure cruise segment, which occurred between data numbers 160 and 210, corresponds well with the small errors there, i.e. below 3 degrees. Thus the comparison result supports the hypothesis that the linearization method models the cruise segment with a good approximation.
The larger values between data numbers 220 and 260 in the pitch error did not occur in the roll error. This phenomenon shows that the roll and pitch maneuvers are decoupled from each other. This decoupled response gives a proper basis for a further linearization that separates the equation of motion into a roll part and a pitch part.
2.3.3. Analyzing the Accuracy of the Yaw Approximation

FIGURE 6. Plot Absolute Value of Yaw Error

The solid light brown line in Figure 6 shows the plot of the absolute value of the yaw error of the linearized solution compared to the flight-data value. The errors of the yaw approximation are below 25 degrees. The spike around data number 280 is due to an instrumentation glitch in the data recording.
Apart from the spikes, which can be neglected in the analysis, the yaw-simulation error shows a good approximation for cruise flight, and the yaw movement is also decoupled from the pitch movement. This may direct a further linearization into at least two sets of equations that can be solved separately: first the pitch linearized equation of motion, and second the roll-yaw linearized equation of motion.
Although the error of each flight-angle approximation is bounded by an upper limit, careful treatment must be considered when applying the linearization method for more than about one minute of flight simulation. The reason for this skeptical judgment is error propagation, which can rapidly drift the flight-angle approximation away from its true value. The small angle deviations of each approximation are not guaranteed to cancel each other or to reduce the previous error exhibited by the numerical integration process. Since the sign of the present error may be similar to or opposite to the sign of the previous error, the accumulated error may build up to be excessive or even lead to divergent simulation results.
3. Concluding Remarks
The linearization method gives a good approximation when applied to cruise flight. In the cruise condition, the aircraft flies in an equilibrium state. Because the pilot continually applies corrections in this state, any deviations in the roll, pitch and yaw movements remain small. This small interaction of the flight angles reduces the contribution of the nonlinear terms in the Newton-Euler equations, and the reduction of the nonlinear terms in this flight regime gives adequate accuracy to the linearized equation of motion.
From the comparison of the errors and the interaction between the flight angles, we conclude that the error of the linearization method is independent of the Flying Wing maneuver. The Roll-Yaw and Pitch-Yaw error characteristics are decoupled in the Flying Wing configuration.
References

[1] S.D. Jenie, Design and Engineering UAV (in Indonesian), Keynote Speech
of the Proceeding of National Seminar of UAV Design and Engineering,
BPPT, Jakarta: Indonesia, (2007)

[2] J. Muliadi, The Development of BPPTs UAV Case Study: T-Tail, V-Tail,
Inverted V-Tail Configurations, Proceeding of Chinese National Seminar of
UAV, Beijing: P.R. China, (2008)
[3] F. Hasim and Madhapi, The BPPT UAV Aerodynamic Configuration
Development Program for Sriti Prototype to Extend its Flight Range and
Endurance (in Indonesian), The Final Report for PKPP Insentive Research
(Program Document No. 021.2.F1.133), Intern. Doc. LAGG-BPPT, Banten:
Indonesia, (2012)
[4] J.R. Brannan and W.E. Boyce, Differential Equations: An Introduction to
Modern Methods and Applications (2nd Ed.), John Wiley & Sons, Ltd,
(2011)
[5] D.J. Diston, Computational Modelling and Simulation of Aircraft and the
Environment, John Wiley & Sons, Ltd, (2009)

[6] D. Allerton, Principles of Flight Simulation, John Wiley & Sons, Ltd, (2009)
[7] AIAA, AIAA Aerospace Design Engineers Guide (5th Ed.), AIAA Inc,
(2003)
[8] A. Lennon, R/C Model Aircraft Design, Air AGE Media, (1996)
[9] M.V. Cook, Flight Dynamics Principles, Butterworth-Heinemann (Elsevier),
(2007)
[10] D. McRuer, I. Ashkenas, and D. Graham, Aircraft Dynamics and Automatic
Control, Princeton University Press, New Jersey, (1973)
[11] D.H. Budiarti, Djatmiko, and M. Dahsyat, Technical Report No.
012/FT/6.1/PUNA/XI/2013 BPPT-04C Sriti Experimental Flight No. 02 Test
Report, Test Subject: Parameter Identification, BPPT Internal Technical
Document, Jakarta: Indonesia, (2013)
[12] R.L. Burden and J.D. Faires, Numerical Analysis (9th ed.), Brooks/Cole, Boston, MA (United States), (2011)
Proceedings of IICMA 2015
Applied Math
Determination of Fishermen Poverty Alleviation Program Using Analytic Hierarchy Process in Parigi Moutong, Central Sulawesi Province, Indonesia
Sutikno1,a), Soedarso2,b), Sukardi3,c), Syamsuddin HM4,d), Yusman Alharis5,e), and Kartika Nur 'Anisa'6,f)
1,5,6 Department of Statistics, Institut Teknologi Sepuluh Nopember, Jl. Arief Rahman Hakim, Surabaya 60111, Indonesia
2 Department of Social Humaniora
3,4 STMIK Adhi Guna Palu
a) sutikno@statistika.its.ac.id
b) darso@mku.its.ac.id
c) sukardi_palu@yahoo.co.id
e) yusmanalharis@gmail.com
f) anisa.kartika99@gmail.com
Abstract. Parigi Moutong is one of four districts in which the percentage of poor people is high, more than 20%. The poverty alleviation programs undertaken so far are still not optimal in accelerating poverty reduction, because the average decrease is relatively small, below 0.5%. Therefore, top-down and bottom-up approaches need to be combined in determining the poverty program for Parigi Moutong. The community is involved in the process of determining the program; at the same time, when deciding on a program, the community should be assisted by experts who are familiar with the existing problems. These models are expected to increase the success of poverty reduction. A method that can be used to assist the decision-making process is AHP (Analytic Hierarchy Process). This research uses stakeholders as the expert respondents. The results show that there are several priorities that must be developed, led by tourism development in Parigi Moutong. Preparing the fishermen's economic improvement program is the main priority, with a priority value of 0.530. The availability of bins for environmental conservation (0.553), local cultural traditions for attracting tourists (0.485) and community involvement in economic activities (0.461) should take precedence in tourism development. The second priority is fish cultivation, with a priority value of 0.188 and the availability of fishponds for developing the fishing capture result (0.492). Post-fishing management is the third priority (0.178), with the packaging/management of the fishing capture result. Supporting facilities for the fishing capture result at sea (0.102) are the last priority aspect, and the provision of boats as fishing vessels has to be developed in this aspect, with a priority value of 0.392.
Keywords and Phrases: Analytic Hierarchy Process, Fishermen Community, Parigi Moutong, Poverty.
1. Introduction
By 2014, the percentage of poor people in Central Sulawesi had decreased from 18.07% in 2010 to 13.93% in 2014, BPS Central Sulawesi province [2]. Among the 11 districts/cities in Sulawesi, there are four districts where the percentage of poor people is high, more than 20%: Tojo Una-Una, Poso, Morowali and Parigi Moutong, BPS Parigi Moutong [1]. Most poor people live in rural areas and work in agriculture, including fishing.
The poverty alleviation programs undertaken are still not optimal in accelerating poverty reduction, because the average decrease is relatively small, below 0.5%. During the 2011-2014 period, the number of poor people fluctuated and tended to decrease from year to year, BPS Central Sulawesi province [2]. Poverty alleviation programs in Central Sulawesi include PKPS BBM (2003; 2005) and PNPM Mandiri (2008; 2011), BPS Central Sulawesi province [2]. The programs are largely top-down. The PNPM Mandiri program places more emphasis on the community, which takes a more active role from program selection through monitoring and evaluation. Purely bottom-up programs are still not optimal either, although their failure rate is not as large as that of top-down programs. The program targets are the poor community in all aspects, so that program determination and monitoring are not all decided by the program beneficiaries.
Therefore it is necessary to combine top-down and bottom-up approaches in determining the poverty program for Parigi Moutong. The community is involved in the process of determining the program; at the same time, when deciding on a program, the community should be assisted by an expert who is familiar with the problems that exist at the site. These models are expected to increase the success of poverty reduction. This research examines the appropriate needs of the fishing community in Parigi Moutong, so that poverty alleviation programs can be arranged. The aspects considered in this research are supporting facilities for the fishing capture result at sea, post-fishing management, fish cultivation and tourism development. The method used to assist the decision-making process is AHP (Analytic Hierarchy Process).
2. Main Results
2.1 Definition of the Analytic Hierarchy Process
The Analytic Hierarchy Process (AHP) is a methodological approach which structures the criteria of multiple options into a hierarchy, includes the relative values of all criteria, compares the alternatives for each particular criterion and defines the average importance of the alternatives, Saaty [6]. The AHP method offers a meaningful and rational framework for structuring problems and for presenting and quantifying the elements that make up a problem. The method has been widely used to select the best alternative among many alternatives based on multiple criteria. Applications include Palsic and Lalic [5], who used the method for selecting and evaluating projects; Kholil [4], who used it for the selection of local commodities; and the application of the AHP methodology in making a proposal for a public work contract, Bertolini, Braglia and Carmignani [3].
A weakness of the AHP method is that assessments are often inconsistent between one criterion and another, Saaty [6]. In addition, the measurement cannot be given absolutely to the compared criteria; if there is a reduction or increase in one criterion, the resulting rank may become irrelevant. The principle of the AHP method is to decompose a complex problem into structured parts: (a) what is the goal, (b) what are the criteria, and (c) who or what meets these criteria. The most important things in carrying out an AHP analysis, Saaty [6], are to arrange the parts or variables into a hierarchy, to give a numerical value to each variable, and to synthesize in order to select the variables with the highest priority. The following 4 steps are taken:
(1) Decomposition: break a complex problem down into simpler elements, and then create a hierarchy of goal, criteria, and alternatives.
(2) Comparative judgement: assess the relative importance of pairs of elements. The pairwise comparison must be a quantitative assessment in terms of the numbers in Table 1.
TABLE 1. Scale Assessment of the Relative Importance, Saaty [7]

Importance  Definition
1  equal importance
3  slightly more important
5  materially more important
7  significantly more important
9  absolutely more important
2, 4, 6, 8  compromise values
If C1, C2, C3, …, Cn is a collection of n activities, an n×n pairwise judgment matrix A = (aij), (i, j = 1, 2, 3, …, n) can be formed. This is a reciprocal matrix whose diagonal values are all 1, with the following conditions:
a. If aij = α, then aji = 1/α, for α ≠ 0.
b. For aij where i = j, aij = 1.
Hence A = (aij) is the matrix:

    [ 1       a12     …  a1n ]
A = [ 1/a12   1       …  a2n ]
    [ …       …       …  …   ]
    [ 1/a1n   1/a2n   …  1   ]
In assessing the pairwise comparison, there are two important things:
a. which element is more important, and
b. how many times more important one element is than another.
If weights w1, w2, …, wn are assessed with pairwise comparison, the value between wi and wj is written wi/wj = aij, where i, j = 1, 2, 3, …, n; thus the matrix A = (aij) is written as:

    [ w1/w1   w1/w2   …  w1/wn ]
A = [ w2/w1   w2/w2   …  w2/wn ]
    [ …       …       …  …     ]
    [ wn/w1   wn/w2   …  wn/wn ]
If the matrix A = (wi/wj) is multiplied by the vector W = (w1, w2, w3, …, wn), the result is:
AW = nW (1)
If the matrix A is known, then the value of W can be obtained from the equation:
(A − nI)W = 0 (2)
Equation (2) has a non-zero solution if n is an eigenvalue of A and W is the corresponding eigenvector. Once all eigenvalues α1, α2, …, αn of the matrix A are obtained, then, since aii = 1, it holds that:
Σ (i=1 to n) αi = n (3)
The value of W can be obtained by substituting the maximum eigenvalue, as follows:
AW = αmax W (4)
(A − αmax I)W = 0 (5)
To obtain a non-zero solution, we require:
det(A − αmax I) = 0 (6)
From equation (6) the value of αmax is obtained. By entering this value into equation (5), we get the values wi (i = 1, 2, 3, …, n), which form the eigenvector corresponding to the maximum eigenvalue; a code sketch of this computation is given after the list below.
(3) Synthesis of priority: the selection of priorities based on the pairwise comparisons.
(4) Logical consistency: testing the consistency of each pairwise comparison matrix. The consistency assessment of pairwise comparison matrices is based on two aspects:
a. multiplicative preference: for example, when A is twice as heavy as B and B is twice as heavy as C, then A should be 4 times as heavy as C;
b. transitive preference: for example, if A is less than B and B is less than C, then A must be smaller than C.
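As a concrete sketch of Eqs. (1)-(6), the following Python fragment computes the priority vector and Saaty's consistency ratio from a pairwise comparison matrix. The entries reproduce the upper triangle of Table 2 with the reciprocals filled in; the random index RI = 0.90 used for n = 4 is Saaty's standard value, an assumption not stated in this paper.

import numpy as np

# Priority vector from a pairwise comparison matrix, Eqs. (1)-(6).
# Entries follow Table 2 (two experts merged by the geometric mean);
# the lower triangle holds the reciprocals a_ji = 1/a_ij.
A = np.array([[1.0,   1/2.449, 1/1.414, 1/4.242],
              [2.449, 1.0,     1/2.0,   1/2.645],
              [1.414, 2.0,     1.0,     1/4.582],
              [4.242, 2.645,   4.582,   1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)        # index of the maximum eigenvalue
alpha_max = eigvals.real[k]
w = np.abs(eigvecs[:, k].real)
w = w / w.sum()                    # normalized priorities for C1..C4

n = A.shape[0]
CI = (alpha_max - n) / (n - 1)     # consistency index
CR = CI / 0.90                     # consistency ratio (RI = 0.90 for n = 4)
print(np.round(w, 3), round(alpha_max, 3), round(CR, 3))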
2.2 Methodology
Two stages are used in this research: the first is to set up the hierarchy (goal, criteria, and alternatives), and the second is data collection and data analysis. The determination of the hierarchy and the data collection are done by expert interview, while the data analysis uses the software Expert Choice 2000. The steps of determining the priority strategies using AHP are as follows:
1. Literature review
2. Discussion with experts to determine criteria and alternatives
3. Pairwise comparison (expert based)
4. Synthesis of priority
5. Test of consistency
6. Determination of priority strategy
2.3 Results and Discussion
2.3.1 Aspect Priorities
Based on the expert discussion, there are 4 criteria as the basis for the selection of strategies: C1 (supporting facilities for the fishing capture result at sea), C2 (post-fishing management), C3 (fish cultivation) and C4 (tourism development). This analysis uses data from two experts on tourism and maritime affairs in Parigi Moutong. Each expert fills in the AHP comparison matrix questionnaire described in Section 2.1. The results from the two experts are merged into one by the geometric mean rule with n = 2; the merged result gives the full comparison matrix in Table 2.

TABLE 2.Value of Pairwise Comparison (Expert Based)


Criteria C1 C2 C3 C4
C1 1 1/2.449 1/1.414 1/4.242
C2 1 1/2 1/2.645
C3 1 1/4.582
C4 1
The synthesis result of the priorities at level 2 (criteria) with respect to level 1 (goal) is shown in Figure 1.
FIGURE 1. Synthesis of Priority for the Criteria (Refer to Goal)

Based on Figure 1, the aspect of tourism development in Parigi Moutong is the first priority that must be developed to define the poverty alleviation programs (0.530). Supporting facilities for the fishing capture result at sea are the last priority (0.102) in the determination of the poverty alleviation program. This means that the tourism aspect is considered able to increase the community's economy: tourism in Parigi Moutong has potential, but it still needs proper management, so that policy making and regulatory efforts to raise national and international tourism can bring prosperity to the surrounding community.
In the same way, the synthesis of priorities for the alternatives is shown in Figure 2.
FIGURE 2. Synthesis of Priority for All Alternatives

For supporting facilities for the fishing capture result at sea, the availability of adequate fishing boats is the first priority (0.392) and the renewal of technology is the last priority (0.153) considered able to help the community with fishing facilities at sea. In post-fishing management, packaging is the first priority (0.341) and the provision of funds is the last priority (0.044) considered able to help the community manage the sea capture result. In the fish cultivation aspect, the availability of fishponds is the first priority (0.492) and the improvement of human resources is the last priority (0.089) considered able to help the people of Parigi Moutong increase their economy through cultivation.
In the economic sector, community involvement in the activities of Small and Medium Enterprises is the first priority (0.460) and infrastructure improvement is the last priority (0.066) assessed as able to increase the community's economy in the tourism aspect. In the socio-cultural sector, the local cultural tradition is the first priority that must be developed (0.484), while safety and comfort are the last priority (0.049). In the environmental sector, the garbage dump is the first priority that must be considered in environmental management (0.553), while public environmental awareness is the last priority (0.048) in the tourism development of the environmental sector.
3. Concluding Remarks
The most appropriate strategy for poverty reduction in the Parigi Moutong fishing communities is to develop tourism in a sustainable manner; this is supported by the high tourism potential of Parigi Moutong. In the economic sector, government support for small and medium enterprises is the most appropriate strategy to improve the community's economy independently. In the socio-cultural sector, the preservation of local cultural traditions will be an attraction for tourists to visit Parigi Moutong. In the environmental sector, the construction of a garbage dump is an important priority that must be carried out by the government to support the development of sustainable tourism in Parigi Moutong.
Acknowledgement. The authors thank the Directorate General of Higher Education (DIKTI), which supported this research through BOPTN funds.

References
[1] Badan Pusat Statistik Kabupaten Parigi Moutong, Kabupaten Parigi Moutong
Dalam Angka 2014, (2014)

[2] Badan Pusat Statistik Provinsi Sulawesi Tengah, Profil Kemiskinan Di


Sulawesi Tengah Maret 2014, Berita Resmi Statistik. No.
38/07/72/Th.XVII., (2014)

[3] Bertolini, Braglia, and Carmignani, Application of the AHP methodology in making a proposal for a public work contract, International Journal of Project Management, Volume 24, Pages 422-430, (2006)

[4] Kholil, Selecting of region excellence commodities, Case study in West


Aceh Regency, Ministry of Industry of the Republic of Indonesia, Jakarta,
(2010)

[5] I. Palsic and B. Lalic, Analytical hierarchy process as a tool for selecting and evaluating projects, Int. J. Simulation Model., ISSN 1726-4529, p. 16-26, (2009)

[6] T.L. Saaty, Rank generation, preservation, and reversal in the analytic hierarchy decision process, Decision Sciences, Volume 18, No. 2, p. 27-45, (1987)

[7] T. L. Saaty, Decision Making for Leaders, Analytic Hierarchy Process for
Decision Making in Complex Situations, Jakarta: PT. Pustaka Binaman
Pressindo., (1993)
Proceedings of IICMA 2015
Applied Math
Classification of Cancer Data Using Fuzzy C-Means with Feature Selection
Bona Revano1,a), Zuherman Rustam2,b)
1,2
Department of Mathematics, Universitas Indonesia
a) bona.revano@sci.ui.ac.id
b) zuhermanrustam@gmail.com
Abstract. For many years, cancer classification has improved to detect cancer at an early stage of treatment. Cancer classification for treatment now faces the challenge of targeting specific therapy to each type of cancer pathogen, in an effort to maximize efficiency and minimize toxicity. This research presents cancer classification with feature selection using microarray data. The clustering method is based on K-Means and the classification uses Fuzzy C-Means. Feature selection is crucial for cancer classification, because for each cancer only a small number of genes are informative. In this paper, the filter method is used: a feature relevance score is calculated, and low-scoring features are removed. Class I (healthy) and class II (cancerous) are already known. The features are ranked according to the symmetric divergence between the positive and negative class distributions, and K-Means is then used for clustering. The experiments show that optimization-based clustering with feature selection increases the classification accuracy, sensitivity, and specificity. The results show the difference between using the full dataset and using the dataset after feature selection.
Keywords: Cancer classification, feature selection, clustering, microarray data, feature score
1. Introduction
Cancer is a group of diseases characterized by the uncontrolled growth and spread of abnormal cells. If the spread is not controlled, it can result in death. In 2012, there were 14.1 million new cancer cases worldwide, and the corresponding estimate of total cancer deaths was 8.2 million. By 2030, the global burden is expected to grow to 21.7 million new cancer cases and 13 million cancer deaths simply due to the growth and aging of the population. However, the future cancer burden will probably be considerably larger due to the adoption of lifestyles that are known to increase cancer risk, such as smoking, poor diet, physical inactivity, and fewer pregnancies, in economically developing countries [1].
Cancer classification has been based primarily on the morphological appearance of the tumor, but this has serious limitations. Tumors with similar appearance can follow significantly different clinical courses and show different responses to therapy. In a few cases, such clinical heterogeneity has been explained by dividing morphologically similar tumors into subtypes with distinct pathogeneses. Cancer classification has thus been difficult, in part because it has historically relied on specific biological insights rather than on systematic and unbiased approaches for recognizing tumor subtypes [2]. Cancer classification can help maximize the efficiency and minimize the toxicity of cancer treatment with a specific target.
A microarray is a collection of small DNA spots attached to a solid surface. A microarray contains thousands of DNA spots, covering almost every gene in a genome. In experiments, the signal collected from each spot is used to estimate the expression level of a gene, and each gene has a specific purpose. Microarray data for colon cancer are considered in this investigation.
Feature selection is a method for reducing the dataset, because not all of the data are informative. This selection is crucial, because uninformative features can reduce the accuracy, specificity, and sensitivity. Our feature selection uses clustering to reduce the data, shuffling the data to extract the truly informative features from each clustering. The clustering in this method is based on K-Means, one of the simplest unsupervised learning algorithms for the well-known clustering problem. For classification we use Fuzzy C-Means, the most stable clustering method in the sense that the cluster centers and the grouping results do not change when new extreme data come in.

2. Main Results
2.1 Materials and Methods
FIGURE 1. Classification System

As shown in Fig. 1, the classification system uses clustering to partition the data so that the informative features can be identified within each partition. In accordance with the discussion in the previous chapter, the clustering method used is K-Means. Each clustering process is carried out for 100 iterations. The clustering process takes place 6 times, with a different number of clusters per run. In the first run a single cluster is used, so the data are not partitioned, and the 20 features with the largest feature scores are selected. In the subsequent runs, 2 to 6, the data are partitioned; the second run, for example, uses 2 clusters, meaning the data are partitioned into two parts. In each run, the number of features taken from each partition is obtained from the calculation in equation (3).
For the clustering runs with 3 to 6 partitions, the decision is also based on equation (3). The total number of features selected in each run is 20. Because the process is performed six times, 120 features are selected in total. However, some of the 120 features are selected more than once. Consequently, the 120 features are reduced to 25, taken according to the frequency of occurrence of each feature across the clustering runs. These 25 features are thus the result of feature selection on the microarray data. A new microarray dataset is reconstructed from the 25 features, and each sample in the data is then classified using Fuzzy C-Means.
2.1.1 Microarray data
We use microarray data, a gene expression dataset for colon cancer obtained from http://genomics-pubs.princeton.edu/oncology/, which is publicly available. The data consist of 7457 genes, and each gene has 36 samples corresponding to 18 healthy cases and 18 cancerous ones [3].
2.1.2. Clustering Genes
K-Means clustering is one of the simplest unsupervised learning algorithms for the well-known clustering problem. The K-Means method allocates data into clusters based on a comparison of the distances between the data and the centroids; a datum is allocated to the cluster whose centroid is closest to it. In general, the objective function for K-Means clustering is

J(U, V) = Σ (k=1 to N) Σ (i=1 to c) a_ik d²(X_k, V_i)   (1)

where:
N = number of data
c = number of clusters
a_ik = membership of datum k in cluster i, with
a_ik = 1 if d = min{d(x_k, v_i)}, and 0 elsewhere
V_i = centroid of cluster i
X_k = [x_1, x_2, …, x_N], column k of X
X_k^i = [x_1, x_2, …, x_N], column k of X in cluster i
Algorithm 1. K-Means Clustering
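The listing of Algorithm 1 is not reproduced here, so the following is a standard K-Means sketch consistent with the objective in Eq. (1); it is a generic reconstruction, not the authors' own code.

import numpy as np

# Standard K-Means, consistent with Eq. (1): assign each datum to the
# nearest centroid (a_ik = 1 for the closest V_i), then recompute V.
def k_means(X, c, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), size=c, replace=False)].astype(float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for i in range(c):
            if np.any(labels == i):
                V[i] = X[labels == i].mean(axis=0)  # centroid update
    return labels, V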
2.1.3. Feature Selection
Feature selection is crucial for cancer classification, as for each cancer type only a small number of genes are informative, and the presence of other genes reduces the classification accuracy [4]. The scoring examines the level of contribution of each gene to the discrimination between healthy (class I) and cancerous (class II) tissues. In this study, because the distributions of class I and class II are already known, the features are ranked according to the symmetric divergence between the positive and negative class distributions, as shown below [5]:

Fscore(Xi) = (1/2) ( (σi⁺)²/(σi⁻)² + (σi⁻)²/(σi⁺)² ) − 1 + (1/2) ( (μi⁺ − μi⁻)² / ((σi⁺)² + (σi⁻)²) )   (2)

where class I is denoted (+) and class II (−), and σi and μi are the standard deviation and the mean of gene i.
After the Fscore of each gene is known, the total number of best genes over all clusters is set to Ng = 20. The number of best genes selected from cluster k is calculated by:

Ngk = round( (Ng − q) · ( Σ (i=1 to mk) Fscore(Xi) / Σ (i=1 to m) Fscore(Xi) ) ) + 1   (3)

where:
Ngk = number of genes selected in cluster k
mk = number of genes in cluster k
m = total number of genes
q = number of clusters
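The following Python sketch implements the scoring of Eq. (2) and the per-cluster gene budget of Eq. (3); X_pos and X_neg are hypothetical genes-by-samples expression matrices for class I and class II, not the actual colon cancer data.

import numpy as np

# Filter scoring, Eq. (2): symmetric divergence per gene.
def f_score(X_pos, X_neg):
    mu_p, mu_n = X_pos.mean(axis=1), X_neg.mean(axis=1)
    var_p, var_n = X_pos.var(axis=1, ddof=1), X_neg.var(axis=1, ddof=1)
    return (0.5 * (var_p / var_n + var_n / var_p) - 1.0
            + 0.5 * (mu_p - mu_n) ** 2 / (var_p + var_n))

# Per-cluster budget, Eq. (3): how many of the Ng = 20 genes to take
# from each of the q clusters, weighted by each cluster's total score.
def genes_per_cluster(scores, labels, Ng=20):
    q = labels.max() + 1
    total = scores.sum()
    return [int(round((Ng - q) * scores[labels == k].sum() / total)) + 1
            for k in range(q)]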
2.1.4. Classification
Fuzzy C-Means is a development of the K-Means method. It was proposed by Dunn (1973) and developed by Bezdek (1981), and it is a non-hierarchical clustering method. Fuzzy C-Means is the most stable clustering method in the sense that the cluster centers and the grouping results do not change when new extreme data come in. In general, the objective function of Fuzzy C-Means is

J_FCM(V, U, X, c, w) = Σ (l=1 to c) Σ (k=1 to N) (u_lk)^w d²_lk(X_k, V_l)   (4)

with the constraint

Σ (i=1 to c) u_ik = 1, for all k ∈ {1, 2, …, N}   (5)

where:
V and U = the two variables sought at the optimal condition; the optimal membership matrix signifies the convergence of the group memberships in Fuzzy C-Means
X = the data matrix to be clustered
X_k = [x_1, x_2, …, x_N], column or row k of X
c = number of clusters that partition X
w = fuzziness degree of the grouping
d²_ik = squared distance of the datum to the centroid, calculated by
‖x_k − v_i‖² = (x_k − v_i)^T (x_k − v_i)   (6)
V = the centroid matrix
Algorithm 2. Fuzzy C-Means
Step 1. Determine:
a. X of size n × m, where n is the number of training data and m is the number of parameters;
b. the number of clusters, c ≥ 2;
c. w > 1;
d. the maximum number of iterations N;
e. the stopping threshold ε, a small positive number;
f. the initial centroids, as the mean of each cluster.
The iterations run for t = 1, 2, …, N.
Step 2. Update the fuzzy membership degree of each datum in each cluster (correct the membership matrix) U = [u_ik], k = 1, 2, …, n:

u_ik = [ Σ (j=1 to c) ( d²_ik / d²_jk )^(1/(w−1)) ]^(−1)   (7)

where d²_ik = ‖x_k − v_i‖² = (x_k − v_i)^T (x_k − v_i).
Step 3. Calculate the centroid V_i of each cluster i:

V_i = Σ (k=1 to n) (u_ik)^w X_k / Σ (k=1 to n) (u_ik)^w   (8)

Step 4. Determine the stopping criterion from the centroids of the current iteration (t) and the previous iteration (t − 1):

Δ = ‖V^t − V^(t−1)‖   (9)

If Δ < ε, the iteration stops.
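A minimal Python sketch of Algorithm 2 follows; the initialization used here (random samples as initial centroids) is an assumption made for illustration.

import numpy as np

# Fuzzy C-Means following Algorithm 2, Eqs. (7)-(9).
def fuzzy_c_means(X, c=2, w=2.0, n_iter=100, eps=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), size=c, replace=False)].astype(float)
    for _ in range(n_iter):
        # d2[k, i]: squared distance of sample k to centroid i, Eq. (6)
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # membership update, Eq. (7): u_ik = 1 / sum_j (d2_ik/d2_jk)^(1/(w-1))
        ratio = (d2[:, :, None] / d2[:, None, :]) ** (1.0 / (w - 1.0))
        U = 1.0 / ratio.sum(axis=2)
        # centroid update, Eq. (8)
        Uw = U.T ** w
        V_new = (Uw @ X) / Uw.sum(axis=1, keepdims=True)
        if np.linalg.norm(V_new - V) < eps:  # stopping rule, Eq. (9)
            return U, V_new
        V = V_new
    return U, V

# Hard labels for evaluation can then be taken as U.argmax(axis=1).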
2.1.5. Performance Evaluation
At this stage the performance of the classifier is evaluated. The evaluation is carried out in terms of sensitivity, accuracy and specificity. There are 4 possible outcomes from the classifier. The first possibility is a true positive (TP), the case in which a diseased sample is correctly diagnosed. The second is a false positive (FP), in which a healthy sample is incorrectly identified as a diseased case. The third is a true negative (TN), the case in which a healthy sample is correctly spotted. The final possibility is a false negative (FN), the case in which a diseased sample is incorrectly identified as healthy [6]. The percentage values of the evaluation criteria (sensitivity, specificity and accuracy) can be calculated using equations (10)-(12) [3].
2.1.5.1. Sensitivity
Sensitivity = n_TP / (n_TP + n_FP) × 100   (10)
2.1.5.2. Specificity
Specificity = n_TN / (n_TN + n_FN) × 100   (11)
2.1.5.3. Accuracy
Accuracy = (n_TP + n_TN) / (n_TP + n_TN + n_FP + n_FN) × 100   (12)
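A small helper implementing the evaluation criteria exactly as printed in Eqs. (10)-(12) is sketched below; the counts in the example call are hypothetical.

# Evaluation criteria as printed in Eqs. (10)-(12) of this paper.
def evaluate(n_tp, n_fp, n_tn, n_fn):
    sensitivity = n_tp / (n_tp + n_fp) * 100.0  # Eq. (10)
    specificity = n_tn / (n_tn + n_fn) * 100.0  # Eq. (11)
    accuracy = (n_tp + n_tn) / (n_tp + n_tn + n_fp + n_fn) * 100.0  # Eq. (12)
    return sensitivity, specificity, accuracy

print(evaluate(n_tp=15, n_fp=0, n_tn=18, n_fn=3))  # hypothetical counts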
2.2. Main Result
2.2.1 Microarray data
TABLE 1. Data Example
row = expression of a gene across all experiments
column = gene expression levels from a single experiment
T = Tumor (Class II)
N = Normal (Class I)

2.2.2 Clustering and Feature Selection

TABLE 2. Feature selection with Clustering result
The data are clustered 6 times, setting a different number of clusters, ranging from 1 to 6, in each case. In the first run the number of clusters is set to one, implying that no clustering is used; the algorithm therefore goes straight to the gene selection step and selects the top 20 genes. In the second run, the data are partitioned into two clusters, while the clustering algorithm iterates 100 times to minimize the cost function and achieve more accurate clusters.
After the 100 iterations of K-Means clustering, feature selection is carried out according to the number of clusters and the population of each cluster. Therefore, depending on the number of clusters, different sets of genes are selected. A similar procedure, as in run 2, carries on until the final run, in which the data are clustered into six partitions. With 20 genes selected in each run, a total of 120 genes are finally selected from the 6 runs, and they are scored according to how many times they were repeated in each case. Finally, the 25 genes with the highest scores are extracted [3], as shown in Table 3.
TABLE 3. 25 genes selected with highest F score
2.2.3. Classification
A new dataset is then built from the 25 selected genes and fed into Fuzzy C-Means; after that, classification is also carried out without feature selection. The results are shown below.
TABLE 4. Classification with feature selection
As shown in Table 4, in the classification process with feature selection, all the normal cells fall precisely in Data Class 1 and none of them in Data Class 2, while for the tumor cells there are three cells placed in Data Class 1 that are supposed to be in Data Class 2. The calculations of the sensitivity (10), specificity (11), and accuracy (12) levels are contained in Table 6.
TABLE 5. Classification without feature selection
As shown in Table 5, in the classification process without feature selection, there are 6 (six) normal cells placed in Data Class 2 that should be in Data Class 1, while for the tumor cells there are 3 (three) cells placed in Data Class 1 that are supposed to be in Data Class 2. The calculations of the sensitivity (10), specificity (11), and accuracy (12) levels are contained in Table 6.
2.2.4. Performance Evaluation
We calculated the accuracy, sensitivity, and specificity, and compared the classification results with and without feature selection. The results are shown below.
TABLE 6. Accuracy, Sensitivity and Specificity
CHART 1. Accuracy, Sensitivity, and Specificity

3. Concluding Remarks
A cancer classification approach using the Fuzzy C-Means method with K-Means-based feature selection was developed; it improves the accuracy of the cancer classification performance compared to the case in which no feature selection is used. To this end, the most informative genes (25 genes) were selected using feature selection and then fed into the classifier (the Fuzzy C-Means method). Finally, the sensitivity, accuracy and specificity of identifying a cancerous sample were calculated and improved significantly.
Acknowledgment. The authors are grateful to the referees for their valuable comments and suggestions. The authors also acknowledge that this research is the final project in Universitas Indonesia.

References
[1] American Cancer Society, Global Cancer Facts & Figures, 3rd Edition, Atlanta: American Cancer Society, (2015).
[2] T.R. Golub, D.K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, et al.,
Molecular classification of cancer: class discovery and class prediction by gene
expression monitoring, Science 286 (5439) (1999) 531–538.
[3] V. Elyasigomari, M.S. Mirjafari, H.R.C. Screen, M.H. Shaheed, Cancer classification using a novel gene selection approach by means of shuffling based on data clustering with optimization, ScienceDirect (2015).
[4] D.B. Allison, X.Q. Cui, C.P. Page, M. Sabripour, Microarray data analysis: from disarray to consolidation and consensus, Nat. Rev. Genet. 7 (1) (2006) 55-65.
[5] Y. Zhang, J.C. Rajapakse (Eds.), Machine Learning in Bioinformatics, John
Wiley& Sons, Inc., New Jersey, 2009.
[6] A. Rahideh, M.H. Shaheed, Cancer classification using clustering-based gene selection and artificial neural networks, in: International Conference on Control, Instrumentation and Automation (ICCIA), Shiraz, 2011, pp. 1175-1180.
Proceedings of IICMA 2015
Graph and Combinatorics
On the Rate of Constrained Arrays

Putranto Utomo1,a) and Ruud Pellikaan2,b)
1,2 Eindhoven University of Technology, Department of Mathematics and Computer Science, Coding and Crypto, PO Box 513, NL-5600 MB Eindhoven, The Netherlands
a) p.h.utomo@tue.nl
b) g.r.pellikaan@tue.nl
Abstract. Sudokus are nowadays very popular puzzles, and they are studied for their mathematical structure. Binary Puzzles are also interesting puzzles with certain rules. A solved Binary Puzzle is an n by n binary array satisfying: (1) there are no three consecutive ones and no three consecutive zeros in each row and each column, (2) the number of ones and zeros is equal in each row and in each column, and (3) every two rows and every two columns are distinct. Binary Puzzles can be seen as constrained arrays; they can be used for modulation purposes, 2D recording and barcodes, and they are studied in statistical mechanics. In our previous paper, we outlined some problems related to Binary Puzzles, such as the rate of these codes, the erasure decoding probability, the decoding algorithms and their complexity. In this paper, we focus on the first problem, that is, finding the rate of a code based on the Binary Puzzle.
Keywords and Phrases: Binary Puzzle, rate of a code, constrained arrays
1. Introduction
The Binary Puzzle is an interesting puzzle with certain rules; we look at the mathematical theory behind it. A solved Binary Puzzle is an n × n binary array that satisfies:
(1) no three consecutive ones and no three consecutive zeros occur in any row or column,
(2) the array is balanced, that is, the number of ones and zeros is equal in each row and in each column,
(3) every two rows and every two columns are distinct.
Figure 1 is an example of a Binary Puzzle. There is only one solution satisfying all three conditions, but there are three solutions satisfying (1) and (2). The solution satisfying all conditions is given in Figure 2.
In our previous paper [15], we outlined some problems related to Binary Puzzles, such as the rate of these codes, the erasure decoding probability, the decoding algorithms and their complexity. In this paper, we focus on the first problem.
The computation of the number of n × n Binary Puzzles is a very difficult problem, and so far we have only been able to obtain the values for small n, by brute force (a brute-force sketch is given below).
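The following Python sketch illustrates such a brute-force count under the three conditions; it is an illustration of the approach, not the authors' program, and it is feasible only for very small even n.

from itertools import product

# Brute-force count of solved n x n Binary Puzzles: enumerate all
# 2^(n*n) arrays and keep those satisfying conditions (1)-(3).
def valid_line(v):
    balanced = 2 * sum(v) == len(v)                      # condition (2)
    no_triple = all(not (v[i] == v[i + 1] == v[i + 2])   # condition (1)
                    for i in range(len(v) - 2))
    return balanced and no_triple

def count_puzzles(n):
    count = 0
    for bits in product((0, 1), repeat=n * n):
        rows = [bits[i * n:(i + 1) * n] for i in range(n)]
        cols = list(zip(*rows))
        if (all(valid_line(r) for r in rows) and
                all(valid_line(c) for c in cols) and
                len(set(rows)) == n and len(set(cols)) == n):  # condition (3)
            count += 1
    return count

print(count_puzzles(4))   # enumerates 2^16 arrays; only tiny n are feasible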
FIGURE 1. Unsolved        FIGURE 2. Solved
Since a Binary Puzzle has to satisfy the conditions (1), (2) and (3), we consider these conditions separately and split the computation into several different parts, where each part corresponds to a particular condition.
That means we consider the following collections of m × n binary arrays that are constrained:
Am×n = { X : X satisfies (1) },
Bm×n = { X : X satisfies (2) },
Cm×n = { X : X satisfies (3) },
Dm×n = { X : X satisfies (1), (2) and (3) },
Em×n = { X : X satisfies (1) and (2) },
where X ranges over the set of all m × n binary arrays.
The first three collections correspond to conditions (1), (2), and (3), respectively. The main goal is to find the rate of the arrays satisfying all three conditions, but we also consider the collection of arrays satisfying the first and second conditions, in order to have a better approximation of the size of the collection of arrays satisfying all conditions.
Although the exact sizes of Am×n, Bm×n, Dm×n and Em×n are still an open problem, we provide lower and upper bounds on their sizes, and also on the asymptotic rates. The exact size of Cm×n is acquired by means of a recursive formula.
2. Main Results
Definition 2.1. Let C be a code in Qⁿ, where the alphabet Q has q elements. Recall that the (information) rate of C is defined by
R(C) = logq |C| / n.
Let C be a collection of codes over a fixed alphabet Q with q elements. The length of C ⊆ Qⁿ is n and is denoted by n(C). Suppose that the length n(C) goes to infinity for C ∈ C. The upper and lower asymptotic rate, or capacity, of C are defined by
R̄(C) = lim sup R(C) and R_(C) = lim inf R(C), taken over C ∈ C as n(C) → ∞,
and in case the limit exists, that is, R̄(C) = R_(C), it is denoted by R(C).
Let Am be the collection of all codes Am×n for fixed m and all positive integers
n and let A be the collection of all codes Am×n for all positive integers m, n.
Similarly the collections Bm, Cm, Dm, Em and B, C, D, E are defined.
2.1. First Constraint.
An array that satisfies the first constraint of the Binary Puzzle is often called a constrained array. Finding the capacity of certain constrained arrays has been studied recently [3, 9, 10, 16, 17] for information storage in 2D and holographic recording, for 2D barcodes, and in statistical mechanics.
The theory of constrained sequences, that is for m = 1, is well established and uses the theory of graphs and the eigenvalues of the incidence matrix of the graph to give a linear recurrence; see [5, 6, 8]. An explicit formula for the number of such sequences of a given length n can be expressed in terms of the eigenvalues. The asymptotic rate is equal to logq(λmax), where λmax is the largest eigenvalue of the incidence matrix. Shannon [12] showed already that the following Fibonacci relation holds for n ≥ 1:
|A1×(n+2)| = |A1×(n+1)| + |A1×n|.
Asymptotically this gives
R(A1) = log2((1 + √5)/2) ≈ 0.69424.
Furthermore |A2×n| = |A1×n|², since there is no constraint on the columns in case m = 2. Therefore also R(A2) = R(A1).
Using the same idea, we can find the capacity R(𝒜_m), where the code A_{m×n} of
𝒜_m is viewed as a code of constrained sequences of length n over the alphabet
F_2^m of binary columns of length m. For each m we get in this way the graph Γ_m with A_{m×2} as set of vertices. A
directed edge in Γ_m goes from vertex X to vertex Y whenever (X|Y) is in A_{m×4}. So the
degree of the characteristic polynomial is |A_{m×1}|², where |A_{m×1}| is given by the
Fibonacci-like sequence 2, 4, 6, 10, ..., 110 for m = 1, 2, 3, 4, ..., 9. So in particular, the
degree of the characteristic polynomial p_m(λ) in case m = 9 is equal to 110² =
12,100. The largest eigenvalue λ_m of the incidence matrix of Γ_m is not computed
as the largest root of p_m(λ), but approximated numerically by Rayleigh quotients
[9], that is, by the power method with entry scaling. This straightforward idea is
feasible for m ≤ 9, with capacities given in Table 1.
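The following short sketch (ours, not the authors' implementation) builds the transfer graph Γ_m for constraint (1) and estimates λ_m by the power method with entry scaling; for m = 2 it should reproduce the value of Table 1:

```python
# A sketch (ours, not the authors' code) of the transfer-graph construction for
# constraint (1) and the power method with entry scaling used to estimate lambda_m.
from itertools import product
from math import log

def ok(seq):  # no three consecutive equal symbols
    return all(not (seq[i] == seq[i + 1] == seq[i + 2]) for i in range(len(seq) - 2))

def capacity_estimate(m, iterations=300):
    cols = [c for c in product((0, 1), repeat=m) if ok(c)]   # valid single columns
    verts = list(product(cols, repeat=2))                    # vertices: column pairs
    def edge(u, v):
        # (c1|c2|c3|c4) must satisfy (1) in every row; the columns are valid already
        (c1, c2), (c3, c4) = u, v
        return all(ok((c1[i], c2[i], c3[i], c4[i])) for i in range(m))
    adj = [[j for j, v in enumerate(verts) if edge(u, v)] for u in verts]
    x = [1.0] * len(verts)
    lam = 1.0
    for _ in range(iterations):
        y = [sum(x[j] for j in adj[i]) for i in range(len(verts))]
        lam = max(y)              # the scaling factor converges to lambda_max
        x = [yi / lam for yi in y]
    return lam, log(lam, 2) / (2 * m)   # one step appends two columns (2m bits)

print(capacity_estimate(2))   # should approach lambda_2 = 6.8541, R = 0.69424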
It is clear that |A_{(l+m)×n}| ≤ |A_{l×n}|·|A_{m×n}|, since we can split an array Z in A_{(l+m)×n} into
two arrays X and Y in A_{l×n} and A_{m×n}, respectively, where X consists of the first l
rows of Z and Y of the last m rows of Z. Hence, taking l = m, R(A_{2m×n}) ≤ R(A_{m×n}).
On the other hand, suppose X ∈ A_{l×n} with last row x and Y ∈ A_{m×n} with first
row y. Let Z be the (l + m + 2) × n array with X in the first l rows of Z, Y in
the last m rows of Z, and x̄ in row l + 1 and ȳ in row l + 2, where x̄ denotes the
complementary row, obtained by changing every zero into a one and every one into a zero.
Then Z is an element of A_{(l+m+2)×n}. Hence |A_{l×n}|·|A_{m×n}| ≤ |A_{(l+m+2)×n}|.
Therefore (m/(m + 1))·R(A_{m×n}) ≤ R(A_{(2m+2)×n}). Similarly as in [17], we get for the
capacities

(m/(m + 1))·R(𝒜_m) ≤ R̲(𝒜) ≤ R̄(𝒜) ≤ R(𝒜_m).

Now R(𝒜_m) is decreasing in m and (m/(m + 1))·R(𝒜_m) is increasing in m. So in this way
we get lower and upper bounds on the lower capacity R̲(𝒜):

0.46107 ≤ R̲(𝒜) ≤ 0.51230.
But in contrast to the case for constrained sequences, there is so far no theory
available that gives a closed formula for the capacity of constrained arrays.

TABLE 1. Capacity of 𝒜_m

m   λ_max      R(𝒜_m)    (m/(m+1))·R(𝒜_m)
1   1.6180     0.69424   0.34712
2   6.8541     0.69424   0.46283
3   11.7793    0.59303   0.44477
4   23.5755    0.56990   0.45592
5   44.3167    0.54697   0.45581
6   85.3928    0.53467   0.45829
7   162.9352   0.52486   0.45925
8   312.1198   0.51787   0.46033
9   596.9673   0.51230   0.46107

TABLE 2. Capacity of ℰ_{2m}

m   λ_max      R(ℰ_{2m})
1   2.6180     0.347120
2   10.0125    0.415465
3   29.3321    0.406200
4   89.7965    0.405536
5   284.9148   0.407719

Using the same idea as before we can also find R(E2m). The corresponding
graph Λm has E2m×2 as set of vertices, which is the subset of A2m×2 of arrays that
have balanced columns. So Λm is a subgraph of Γ2m. The corresponding result is
shown in Table 2. The number of vertices of Λ5 is 7,056.

2.2. Second Constraint.


A sequence of even length is called balanced if the number of zeros is equal
to the number of ones. The number of balanced sequences of length 2m is (2m choose m).
The two rows of a 2 × 2m array in B_{2×2m} are complementary to each other. Hence
|B_{2×2m}| = (2m choose m) and R(ℬ_2) = 1/2.
The number of all l × 2m binary matrices such that all the rows are balanced
is equal to (2m choose m)^l. If X is an l × 2m binary matrix such that all its rows are
balanced, then the 2l × 2m binary matrix that is obtained by putting X̄ under X has
all its rows and columns balanced, where X̄ is the complement of X. Hence

(2m choose m)^l ≤ |B_{2l×2m}| ≤ (2m choose m)^{2l}.

From these inequalities it follows that, asymptotically, 1/2 ≤ R(ℬ_{2l}) ≤ 1 for all l.

TABLE 3. Capacity of ℬ_{2l}      TABLE 4. Rate of B_{2m×2m}

l   R(ℬ_{2l})                    m   R(B_{2m×2m})
1   0.5                          1   0.25000
2   0.6462                       2   0.40574
3   0.7203                       3   0.50502
4   0.7259                       4   0.57448
5   0.7977                       5   0.62546
6   0.8210                       6   0.66453
7   0.8389                       7   0.69550

Four arbitrary elements of B_{2m×2m} give an element of B_{4m×4m}. So |B_{4m×4m}| ≥
|B_{2m×2m}|⁴. Therefore R(B_{2m×2m}) is increasing in m.
Referring to [1], we have a good asymptotic approximation of |B_{2l×2m}| for
l, m → ∞. From this it follows that the rate tends to 1, and therefore R(ℬ) = 1,
which was already shown in [11].


Let b_{l×m} = |B_{l×m}|. The exact value of b_{2m×2m} can be obtained recursively. Let
i = (i_1,...,i_l) and j = (j_1,...,j_m). Denote by b_{l×m}(i; j) the number of all l × m arrays
with 0/1 entries such that there are i_s ones in row s, for all s = 1,...,l, and there are j_t ones
in column t, for all t = 1,...,m.
Suppose we know the numbers b_{l×m}(x; j) and b_{l×n}(y; k) for all x and y. It is
clear that b_{l×(m+n)}(i; [j k]) is equal to the sum of b_{l×m}(x; j)·b_{l×n}(y; k) over all possible
x and y such that x + y = i. In other words,

b_{l×(m+n)}(i; [j k]) = Σ_{x+y=i} b_{l×m}(x; j) · b_{l×n}(y; k),

where k = (k_1,...,k_n), x = (x_1,...,x_l) and y = (y_1,...,y_l).
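This kind of recursion can be organized as a column-by-column dynamic program. The following is a small sketch (ours, not the authors' code) that computes |B_{n×n}| for even n by carrying the vector of remaining row sums, compressed by sorting, which is allowed because the count is symmetric in the rows; for n = 2, 4, 6 the rates log_2|B_{n×n}|/n² agree with Table 4:

```python
# A small sketch (ours, not the authors' code): count n x n binary arrays with
# balanced rows and columns by a column-by-column dynamic program.
from functools import lru_cache
from itertools import combinations

def count_balanced(n):
    half = n // 2   # each row and each column must contain exactly n/2 ones

    @lru_cache(maxsize=None)
    def walk(cols_left, rem):
        # rem: multiset (sorted tuple) of ones still needed in each row;
        # sorting compresses states, valid since the count is row-symmetric
        if cols_left == 0:
            return 1 if all(r == 0 for r in rem) else 0
        total = 0
        for ones in combinations(range(n), half):   # rows receiving a 1 here
            new = list(rem)
            for r in ones:
                new[r] -= 1
            if min(new) >= 0:
                total += walk(cols_left - 1, tuple(sorted(new)))
        return total

    return walk(n, tuple([half] * n))

for n in (2, 4, 6):
    print(n, count_balanced(n))   # the counts reproduce the rates of Table 4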

2.3. Third Constraint.


Consider C̃_{m×n}, the set of all m × n binary arrays that have mutually distinct
columns. Let Q = F_2^m. Then C̃_{m×n} can be identified with the set of all words in Q^n with
mutually distinct entries. Hence |C̃_{m×n}| = 2^m(2^m − 1)···(2^m − n + 1). Now C_{m×n} is a
subcode of C̃_{m×n}. Therefore
|C_{m×n}| ≤ 2^m(2^m − 1)···(2^m − n + 1).
In particular |C_{m×n}| = 0 for all n > 2^m. So R̲(𝒞_m) = 0. Therefore R̲(𝒞) = 0.
Let X be in C_{m×m}. Then all the rows of X are distinct and all its columns are
distinct. We can extend X to an (m + 1) × (m + 1) array in C_{(m+1)×(m+1)} by appending
a column y and a row z to X, such that still all rows are distinct and all columns are
distinct. Then y is an m × 1 array and z is a 1 × (m + 1) array. There are 2^m − m
possibilities for y and at least 2^{m+1} − m possibilities for z. Hence we have
|C_{(m+1)×(m+1)}| ≥ |C_{m×m}| · (2^{2m+1} − 3m·2^m + m²).
Suppose we have an arbitrary m × m binary array, say X. Let l = ⌈log_2(m)⌉ +
1. Let Y be an l × m binary array such that all its columns are mutually distinct and
have weight not equal to one. Such an array exists, since m ≤ 2^l − l. Then (Y|I_l) is
an l × (m + l) array such that all its columns are mutually distinct. Let Z be the (m
+ l) × (m + l) array such that X is the upper m × m subarray of Z in the first m rows
and columns, with (Y|I_l) in its last l rows and (Y|I_l)^T in its last l columns. Then Z
is an array with mutually distinct columns and mutually distinct rows. Hence
|C_{(m+l)×(m+l)}| ≥ 2^{m²}. So

R(C_{(m+l)×(m+l)}) ≥ m²/(m + l)².

Therefore lim_{m→∞} R(C_{m×m}) = 1, since lim_{m→∞} l/m = 0. Therefore R̄(𝒞) =
1 ≠ 0 = R̲(𝒞), and R(𝒞) is not defined.
A partition of a set of m elements is a collection of non-empty subsets that
are mutually disjoint and whose union is the whole set. The collection of all
partitions of the set {1,...,m} into t non-empty subsets is denoted by 𝒮(m, t). The
number |𝒮(m, t)| is called the Stirling number of the second kind and is denoted
by S(m, t). Then S(m, 0) = 0, S(m, 1) = S(m, m) = 1 and the following recurrence
relation holds [14]:

S(m + 1, t) = t·S(m, t) + S(m, t − 1).

The following explicit formula holds [13]:

S(m, t) = (1/t!) Σ_{i=0}^{t} (−1)^i (t choose i) (t − i)^m.
Proposition 2.1. The numbers |C_{m×n}| satisfy the following recurrence:

Σ_{t=1}^{m} |C_{t×n}| · S(m, t) = 2^m(2^m − 1)···(2^m − n + 1).

Proof. Let C^t_{m×n} be the collection of all m × n binary arrays that have mutually
distinct columns and exactly t mutually distinct rows. Then in particular

|C^1_{m×n}| = |C_{1×n}| = |C̃_{1×n}|.

Since the C^t_{m×n} are mutually disjoint for t = 1,...,m, and give a partitioning of
C̃_{m×n}, we have that Σ_{t=1}^{m} |C^t_{m×n}| = |C̃_{m×n}|.
We have seen before that |C̃_{m×n}| = 2^m(2^m − 1)···(2^m − n + 1).
Let X ∈ C^t_{m×n} and let x_i be the i-th row of X. Define the sequence y_1,...,y_t by
induction as follows: y_1 = x_1. Suppose that y_1,...,y_i are defined. Then y_{i+1} is the first
row in X that is distinct from y_1,...,y_i. Then y_1,...,y_t are mutually distinct and give a
t × n array Y ∈ C_{t×n} such that y_i is the i-th row of Y. Let I_j = {i : x_i = y_j} for j = 1,...,t.
Then ℐ = {I_1, …, I_t} is a partitioning of {1,...,m} with t non-empty subsets.
Conversely, let Y be in C_{t×n} with rows y_1,...,y_t. Let ℐ = {I_1, …, I_t} be a
partitioning of {1,...,m} with t non-empty subsets. Without loss of generality we
may reorder I_1,...,I_t such that 1 ∈ I_1 and the minimal j ∈ {1,...,m} that is not in I_1 ∪
··· ∪ I_i is in I_{i+1} for all i < t. Let X be the m × n matrix such that the i-th row of X is
equal to y_j if i ∈ I_j. Then X ∈ C^t_{m×n}.
In this way we have obtained a bijection between C^t_{m×n} and C_{t×n} × 𝒮(m, t). □
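Solving the recurrence of Proposition 2.1 for |C_{m×n}| is straightforward, since S(m, m) = 1. The sketch below (ours, not the authors' code) computes |C_{m×m}| this way; for m = 3 it yields 264, consistent with R(C_{3×3}) = log_2(264)/9 ≈ 0.89382 in Table 5:

```python
# A sketch (ours) evaluating Proposition 2.1:
#   sum_{t=1}^{m} |C_{t x n}| S(m, t) = 2^m (2^m - 1) ... (2^m - n + 1).
from functools import lru_cache
from math import prod

@lru_cache(maxsize=None)
def stirling2(m, t):   # Stirling numbers of the second kind
    if t == 0:
        return 1 if m == 0 else 0
    if m == 0:
        return 0
    return t * stirling2(m - 1, t) + stirling2(m - 1, t - 1)

def distinct_cols(m, n):   # |C~_{m x n}|
    return prod(2 ** m - i for i in range(n))

@lru_cache(maxsize=None)
def C(m, n):   # |C_{m x n}|: mutually distinct columns and mutually distinct rows
    if n > 2 ** m:
        return 0
    return distinct_cols(m, n) - sum(C(t, n) * stirling2(m, t) for t in range(1, m))

print([C(m, m) for m in range(1, 6)])   # C(3, 3) = 264 matches Table 5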

TABLE 5. Rate of C_{m×m}        TABLE 6. Rate of D_{2m×2m}

m    R(C_{m×m})                 m   R(D_{2m×2m})
1    1.00000                    1   0.25
3    0.89382                    2   0.39
5    0.96765                    3   0.34
8    0.99527
11   0.99936
12   0.99967
15   0.99995
17   0.99998
25   0.99999

2.4. All Constraints.


The size of D_{2m×2m} can be approximated from below by tiling with smaller building blocks such that
the conditions are still satisfied. There are exactly two building blocks of size
2 × 2. Hence

R(D_{2m×2m}) ≥ (1/(2m)²)·log_2(2^{m²}) = 1/4, for m ≥ 1.

By brute force computations, we could give only a few values in Table 6.


It is shown by De Biasi [2] that asymptotically the rate of D_{2m×2m} is the
same as that of A_{2m×2m} ∩ B_{2m×2m} as m goes to infinity, by an argument similar to our proof
showing that lim_{m→∞} R(C_{m×m}) = 1.
For constrained sequences it is known that the capacity does not change if
one adds the condition that the sequence is balanced [4, 7]. It seems that a similar
balancing argument as in [16] does not hold for our constrained arrays, since

A_{2m×2m} ∩ B_{2m×2m} ⊆ E_{2m×2m} ⊆ A_{2m×2m}

and Tables 1 and 2 indicate that R(ℰ_{2m}) < R(𝒜_{2m}). Therefore more research is
needed to determine R(𝒟), that is, the capacity of Binary Puzzles.

References
[1] E.R. Canfield and B.D. Mckay, Asymptotic enumeration of dense 0-1
matrices with equal row sums and equal column sums. The Electronic
Journal of Combinatorics , vol. 12, # R29, Jun 2005.
[2] M. De Biasi, Binary puzzle is NP-complete, http://nearly42.org
[3] T. Etzion and K.G. Paterson, Zero/positive capacities of Two-Dimensional
Runlength-Constrained Arrays. IEEE Trans. on Information Theory, vol. 51,
no. 9, pp. 3186–3199, Sept 2005.
[4] P.S. Henry, Zero disparity coding systems. US Patent 4 309 694, 1982.
[5] H.D.L. Hollmann, Modulation Codes. Philips Electronics N.V., PhD thesis
Techn. Univ. Eindhoven, 1996.
[6] K.A.S. Immink, P.H. Siegel, and J.K. Wolf, Codes for digital recorders.
IEEE Trans. on Information Theory, vol. 44 no. 6, pp. 2260–2299, Oct
1998.
[7] D.E. Knuth, Efficient balanced codes. IEEE Trans. on Information Theory,
vol. 32, pp. 51–53, 1986.
[8] D. Lind and B. Marcus, An introduction to symbolic dynamics and coding.
Cambridge University Press, Cambridge 1995.
[9] E. Louidor and B.H. Marcus, Improved lower bounds on the capacities of
symmetric 2D constraints using Rayleigh quotients. IEEE Trans. on
Information Theory, vol. 56, no. 4, pp. 1624-1639, April 2010.
[10] B. Marcus, Capacity of higher-dimensional constrained systems. Coding
Theory and Applications, CIM Series in Mathematical Sciences vol. 3, pp.
3–21, 2015.
[11] E. Ordentlich and R.M. Roth, Two-dimensional weight-constrained codes
through enumeration bounds. IEEE Trans. on Information Theory, vol. 46,
no.4, pp. 1292-1301, Jul 2000.
[12] C.E. Shannon, A mathematical theory of communication. Bell System
Technical Journal, vol. 27, pp. 379–423, 1948.
[13] H. Sharp, Cardinality of finite topologies. J. Combinatorial Theory, vol. 5,
pp. 82–86, 1968.
[14] R.P. Stanley, Enumerative Combinatorics, Volumes 1 and 2. Cambridge
University Press, Cambridge 1997, 1999.
[15] P. Utomo and R. Pellikaan, Binary Puzzles as an Erasure Decoding
Problem. Proc. 36th WIC Symp. on Information Theory in the Benelux , pp.
129–134 May 2015.
http://www.w-i-c.org/proceedings/proceedings_SITB2015.pdf.
[16] A. Vardy, M. Blaum, P.H. Siegel and G.T. Sincerbox, Conservative arrays:
Multidimensional modulation codes for holographic recording. IEEE Trans.
on Information Theory, vol. 42, no. 1, pp. 227–230, Jan 1996.
[17] W. Weeks and R.E. Blahut, The capacity and coding gain of certain
checkerboard codes. IEEE Trans. on Information Theory, vol. 44, no. 3, pp.
1193–1203, May 1998.

Proceedings of IICMA 2015
Invited Speakers

Error-Correcting Pairs for a Public-Key Cryptosystem
Ruud Pellikaan^{1,a)} and Irene Márquez-Corbella^{2,b)}
^1 Discrete Mathematics, Techn. Univ. Eindhoven, P.O. Box 513, 5600 MB Eindhoven, The Netherlands.
^2 SECRET Project-Team, INRIA Paris-Rocquencourt, B.P. 105, 78153 Le Chesnay Cedex, France.
^{a)} g.r.pellikaan@tue.nl
^{b)} irene.marquez-corbella@inria.fr

Abstract. Code-based Cryptography (CBC) is a powerful and promising alternative for
quantum-resistant cryptography. Indeed, together with lattice-based cryptography,
multivariate cryptography and hash-based cryptography, it is one of the principal available
techniques for post-quantum cryptography. Many families of codes have been proposed for
these cryptosystems. One of the main requirements is having high-performance t-bounded
decoding algorithms, which is achieved in the case of having an error-correcting pair. In this
article the class of codes with a t-ECP is proposed for the McEliece cryptosystem. The
hardness of retrieving the t-ECP for a given code is considered. As a first step we give a
survey of results about distinguishers of several subclasses. Recent results will be surveyed.

Keywords: Code-based Cryptography, Error-Correcting Pairs, Distinguisher.

INTRODUCTION
The notion of public key cryptography (PKC) was first published in the
public domain in 1976 by Diffie and Hellman though Merkle and Hellman had
developed some of the key concepts during the same time. In fact, Ellis published
the same idea already in 1970 and called it non-secret encryption but it was not
made public until 1997, see [32, p. 279–292]. The advantage with respect to
symmetric key cryptography is that it does not require an initial exchange of
secrets between sender and receiver. A one-way function is the crucial concept in
their papers. After forty years one can still state with [16] that:
“At the heart of any public-key cryptosystem is a one-way function - a
function y = f(x) that is easy to evaluate but for which it is computationally
infeasible (one hopes) to find the inverse x = f−1(y)”.
Well known (supposedly) one-way functions are:
1. Discrete logarithm, for which a group G (written multiplicatively) and an
element a ∈ G are required; then x is an integer and y = a^x. The security of the
key exchange proposed by Williamson and Diffie-Hellman in 1974 and 1976,
respectively, depends on the difficulty of finding discrete logarithms in a finite
field.
2. Integer factorization, where x = (p, q) is a pair of distinct prime numbers and
y = pq is its product. The security of the cryptosystems proposed by Cocks and
Rivest-Shamir-Adleman (RSA) from 1973 and 1978, respectively, is based on
the hardness of factorizing integers.
3. Integer knapsack, where an n-tuple of integers a_1,...,a_n is given; then x is an
n-tuple of zeros and ones and y = a_1x_1 + ··· + a_nx_n. The Merkle-Hellman public key
system from 1978 is based on the difficulty of the integer knapsack problem.
4. Decoding error-correcting codes, where an n-tuple of vectors h_1,...,h_n in F_q^r is
given; then x is an n-tuple of elements in F_q and y = x_1h_1 + ··· + x_nh_n. In
1978 McEliece [24] presented the first PKC system based on the difficulty of
decoding error-correcting codes.
5. Elliptic curve discrete logarithm, where G is an elliptic curve group (written
additively) over a finite field, x = P is a point on the curve and y = kP is
another point on the curve, obtained by the multiplication of P with a positive
integer k. Elliptic Curve Cryptography (ECC), proposed independently by
Koblitz and Miller in 1985, is based on the difficulty of inverting this function
in the group of points on an elliptic curve over a finite field.
All known public key cryptosystems depend for their security on an
unproven proposition. In the cases (3) and (4) it depends on the assumption that
P ≠ NP. Even if this is indeed the case, it has not been shown that most instances are
difficult to solve, since the theory of NP-completeness deals with the worst-case
situation. Finally, it may be that on average the problem is difficult, but that
the particular instances used in the system can be broken by a structural attack.
That in fact was the fate of the Merkle-Hellman public-key system.
With the discovery of Shor’s algorithm anyone with a quantum computer
can break in polynomial time all cryptosystems whose security depends on the
difficulty of the problems (1), (2) and (5). Post-quantum cryptography gave birth
to the next generation of cryptography algorithms, which are designed to run on
conventional computers but no attacks by classical or quantum computers are
known against them. See [4] for an overview of the state of the art in this area.
It may be the fate of all proposed one-way functions that they will be
broken in the future. It may also be the case that some party has already broken
some widely used one-way function without revealing it in the public domain for
their own benefit. The key difficulty lies in the fact that present day knowledge on
lower bounds on the complexity of these functions is still out of reach. This is a
sobering and humbling conclusion after so many years of research. So we
continue with the practice of proposing and refuting PKC systems.
In 1978, McEliece [24] presented the first PKC system based on the theory
of error-correcting codes. Its main advantages are its fast encryption and
decryption schemes. However, the main drawback of the original McEliece system was
its large key size. But this does not mean that code-based cryptography is
inherently inefficient. Indeed, there have been impressive achievements in this
area (reducing the key size while keeping the same level of security): now [26]
there are constructions with compact keys of around 5000 bits for 80 bits of
security, which is comparable to RSA's public key size. Code-based
cryptosystems such as those proposed by McEliece [24] and Niederreiter [27] are
interesting candidates for post-quantum cryptography. See the survey [5].

CODE BASED CRYPTOGRAPHY


A code C is a linear subspace of F_q^n. The parameters of the code are denoted
by [n, k, d], where n is its length, k its dimension and d its minimum distance. The
problem of minimum distance decoding has as input (G, y), where G is a generator
matrix of a code C over F_q with parameters [n, k, d] and y is a received word.
The output is a codeword c ∈ C of minimal distance to y. One can phrase the
problem equivalently in terms of a parity check matrix H of the code.
Then the input is (H, s), where s ∈ F_q^{n−k}. The output is an e of minimal
weight such that eH^T = s. The relation between the two versions is given by s = yH^T, the
syndrome, and e = y − c, the error vector of the received word y. The bounded
distance decoding problem depends on a function t(n, k, d). The input is again
(H, s), but the output is a word e (if any) such that wt(e) ≤ t(n, k, d). Moreover,
decoding up to half the minimum distance is the bounded distance decoding
problem such that t(n, k, d) = ⌊(d − 1)/2⌋ for all n, k and d. The solution of the
decoding problems posed above has two parts [13]. First there is the preprocessing part,
done at a laboratory or a factory, where for an appropriate code C a decoder A_C is
built; this is allowed to be time consuming. Second there is the actual operation of the
many copies of the decoder by consumers, which should work very fast. So we
can consider the problem of minimum distance decoding with preprocessing.
From the error-correction point of view it seems pointless to decode a bad code,
but for breaking the McEliece cryptosystem by a general or generic attack one
must be able to decode efficiently all, or almost all, codes.
decoding up to half the minimum distance. The minimum distance decoding
problem was shown by Berlekamp-McEliece-Van Tilborg [3] to be NP-hard.
However it is not known whether this problem is almost always or in the average
difficult. The status of the hardness of decoding up to half the minimum distance
is an open problem. McEliece proposed to use binary Goppa codes for his PKC
system.
All known minimum distance decoding algorithms for general codes have
exponential complexity in the length of the code. The complexity exponent of
decoding general binary codes up to half the minimum distance has been lowered
from above 1/3, that is 0.3869, for brute force decoding, to below 1/20, that is
0.04934; see [1, 31]. However there are several classes of codes such as the
generalized Reed-Solomon (GRS), BCH, Goppa or algebraic geometry codes
which have polynomial decoding algorithms that correct up to a certain bound
which is at most half the minimum distance.
In the McEliece PKC system the public key space K is the collection of all
generator matrices of a chosen class of codes that have an efficient decoding
algorithm correcting all patterns of t errors, the plaintext space is
P = F_q^k × W_{n,q,t}, where W_{n,q,t} is the collection of all e ∈ F_q^n of weight t, and the
sample space is given by Ω = P × K. The encryption map
E_G : P → F_q^n for a given key G ∈ K is defined by E_G(m, e) = mG + e. An adversary
A is a map from F_q^n × K to P. This adversary is successful for (x, G) ∈ Ω if
A(E_G(x), G) = x.
Let 𝒞 be a class of codes such that every code C in 𝒞 has an efficient
decoding algorithm correcting all patterns of t errors. Let G be a generator
matrix of C. In order to mask the origin of G, take a k × k invertible matrix S over
F_q and an n × n permutation or monomial matrix P. Then for the McEliece PKC
the matrices G, S and P are kept secret while G′ = SGP is made public.
Furthermore the (trapdoor) one-way function of this cryptosystem is usually
presented as follows:

(m, e) ↦ mG′ + e,

where m ∈ F_q^k is the plaintext and e ∈ F_q^n is a random error vector with Hamming
weight at most t.
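As an illustration of this trapdoor, here is a toy sketch of ours, NOT a secure construction (real proposals use large Goppa codes); it uses the binary [7,4,3] Hamming code with t = 1 in the role of the secret efficiently decodable code:

```python
# A toy sketch (ours, NOT secure) of the McEliece trapdoor (m, e) -> mG' + e
# with G' = SGP, using the binary [7,4,3] Hamming code, which corrects t = 1 error.
import numpy as np

G = np.array([[1,0,0,0,0,1,1],     # systematic generator matrix of the
              [0,1,0,0,1,0,1],     # [7,4,3] Hamming code
              [0,0,1,0,1,1,0],
              [0,0,0,1,1,1,1]])
H = np.array([[0,0,0,1,1,1,1],     # parity check matrix: column i is the
              [0,1,1,0,0,1,1],     # binary representation of i+1
              [1,0,1,0,1,0,1]])

def gf2_inv(A):
    """Invert a 0/1 matrix over GF(2) by Gaussian elimination."""
    n = len(A)
    M = np.concatenate([A % 2, np.eye(n, dtype=int)], axis=1)
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r, c])   # raises if singular
        M[[c, piv]] = M[[piv, c]]
        for r in range(n):
            if r != c and M[r, c]:
                M[r] = (M[r] + M[c]) % 2
    return M[:, n:]

def hamming_decode(y):
    s = (H @ y) % 2
    pos = 4 * s[0] + 2 * s[1] + s[2]    # syndrome encodes the error position
    y = y.copy()
    if pos:
        y[pos - 1] ^= 1
    return y

rng = np.random.default_rng(1)
while True:                              # secret key: invertible S, permutation P
    S = rng.integers(0, 2, (4, 4))
    try:
        S_inv = gf2_inv(S)
        break
    except StopIteration:
        pass
perm = rng.permutation(7)
G_pub = ((S @ G) % 2)[:, perm]           # public key G' = SGP

m = np.array([1, 0, 1, 1])
e = np.zeros(7, dtype=int); e[rng.integers(7)] = 1
c = (m @ G_pub + e) % 2                  # encryption: c = mG' + e

c_unperm = np.empty(7, dtype=int); c_unperm[perm] = c    # undo P
codeword = hamming_decode(c_unperm)                      # fast secret decoder
m_rec = (codeword[:4] @ S_inv) % 2                       # G is systematic
assert np.array_equal(m_rec, m)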

McEliece proposed to use the family of Goppa codes. The problem of
bounded distance decoding for the class of codes that have the same parameters
as the Goppa codes is difficult in the worst case. However, it is still an open
problem whether decoding up to half the minimum distance is NP-hard, which is
the security basis of the McEliece cryptosystem.

In 1986, Niederreiter [27] presented a dual version of the McEliece cryptosystem
which is equivalent in terms of security. Niederreiter's system differs from
McEliece's system in the public-key structure (it uses a parity check matrix
instead of a generator matrix of the code), in the encryption mechanism (we
compute the syndrome of a message by the public key) and in the decryption
mechanism. Niederreiter proposed several classes of codes such as alternant codes,
which contain the Goppa codes as a subclass, algebraic geometry codes introduced
by Goppa [14], and GRS codes.
Let H be a parity check matrix of a code C in 𝒞. H is masked
by H′ = SHP, where S is an invertible matrix over F_q of size n − k and P is an n ×
n permutation or monomial matrix. The (trapdoor) one-way function in case of
the Niederreiter PKC is presented by

m ↦ H′m^T,

where m ∈ F_q^n has weight t.


The security of a general attack on the PKC systems of McEliece and
Niederreiter is based on two assumptions [5, 15]:
A.1 On average it is difficult to decode t errors for all codes that have the same
parameters as the codes used as key;
A.2 It is difficult to distinguish arbitrary codes from those coming from K.
Concerning the second assumption, recent progress has been made [8, 10, 11], where it is
shown that one can distinguish between high-rate Goppa, alternant and random
codes.
Assuming Kerckhoffs' principle, the attacker knows the class K. A key
recovery or structural attack uses the special structure of codes in the class K.

For instance, Sidelnikov-Shestakov gave an adversary that is always successful if
one takes for public key space the generator matrices of GRS codes.
It was shown in [9, 17, 28, 29] that the known efficient bounded distance
decoding algorithms of GRS, BCH, Goppa and algebraic geometry codes can be
described by a basic algorithm using an error-correcting pair. That means that the
proposed McEliece cryptosystems that use these classes of codes are in fact using
the error-correcting pair as a secret key. Hence the security of these PKC systems
is based not only on the inherent intractability of bounded distance decoding but
also on the assumption that it is difficult to retrieve efficiently an error-correcting
pair.

Error-Correcting Pairs
From now on the dimension of a linear code C will be denoted by k(C) and
its minimum distance by d(C). Given two elements a and b in F_q^n, the star
multiplication is defined coordinatewise, that is, a ∗ b =
(a_1b_1,...,a_nb_n), while the standard inner product is defined by
a · b = a_1b_1 + ··· + a_nb_n.
Let A, B and C be subspaces of F_q^n. Then A ∗ B is the subspace generated by
{a ∗ b : a ∈ A and b ∈ B}, and C^⊥ = {x : x · c = 0 for all c ∈ C} is the dual code
of C. Furthermore A ⊥ B if and only if a · b = 0 for all a ∈ A and b ∈ B.
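For concreteness, here is a tiny sketch of ours of the star product and the orthogonality check behind condition E.1 below; by bilinearity it suffices to test generators of A, B and C:

```python
# A tiny sketch (ours) of the coordinatewise star product and inner product
# over F_q (here represented as integers modulo a prime q).
q = 7

def star(a, b):
    return [(ai * bi) % q for ai, bi in zip(a, b)]

def inner(a, b):
    return sum(ai * bi for ai, bi in zip(a, b)) % q

def orthogonal_star(gens_A, gens_B, gens_C):
    """Check (A * B) orthogonal to C on generators; enough by bilinearity."""
    return all(inner(star(a, b), c) == 0
               for a in gens_A for b in gens_B for c in gens_C)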
Consider the following properties:
E.1 (A ∗ B) ⊥ C;
E.2 k(A) > t;
E.3 d(B^⊥) > t;
E.4 d(A) + d(C) > n;
E.5 d(A^⊥) > 1;
E.6 d(A) + t < n.
Let C be a linear code in F_q^n. The pair (A, B) of linear codes over F_{q^m} of
length n is called a t-error-correcting pair (ECP) for C if the conditions E.1, E.2,
E.3 and E.4 hold.
If (A, B) is a pair of codes that satisfies conditions E.1, E.2, E.3, E.5 and E.6,
then d(C) ≥ 2t + 1 and (A, B) is a t-ECP for C by [29, Corollary 3.4].
The notion of an error-correcting pair for a linear code was introduced in 1988 by
Pellikaan [28] and independently by Kötter [17, 18] in 1992. It is shown that a
linear code in F_q^n with a t-error-correcting pair has a decoding algorithm which
corrects up to t errors with complexity O(n³).
Generalized Reed-Solomon (GRS) codes are the prime examples of codes
that have a t-error-correcting pair. Moreover, if C is an [n, n − 2t, 2t + 1] code
which has a t-error-correcting pair, then C is a GRS code. This is trivial for t = 1,
proved for t = 2 in [29, Theorem 6.5] and for arbitrary t in [23].
The existence of ECPs for GRS and algebraic geometry codes was shown
in [28]. For many cyclic codes Duursma and Kötter [9, 17, 18] have found
ECPs which correct beyond the designed BCH capacity.
The class of GRS codes was mentioned by Niederreiter [27] in his proposal of a
code-based PKC. However, GRS codes are not suited for a code-based PKC because of
the attack of Sidelnikov-Shestakov.


A binary Goppa code with parameters [1024, 524, 101], as proposed by McEliece, is
no longer secure with today's computing power, as shown by Peters et al. [31,
30] by improving decoding algorithms for general codes.
The class of subcodes of GRS codes was proposed by Berger-Loidreau [2]
for code-based PKC precisely to resist the Sidelnikov-Shestakov attack. But for
certain parameter choices this proposal is also not secure, as shown by
Wieschebrink [33, 34] and Márquez et al. [20].
Algebraic geometry codes were proposed by Niederreiter [27] and
Janwa-Moreno [14] for code-based PKC systems. This system was broken for low
genus, at most two, [12, 25] and for arbitrary genus by Márquez et al. [7, 19, 21]
for certain choices of the parameters.
Subfield subcodes of algebraic geometry codes were proposed by
Janwa-Moreno [14] and broken by Couvreur et al. [6] for certain parameters. The
class of Goppa codes remains so far unbroken.

THE ECP ONE-WAY FUNCTION


Let P(n, t, q) be the collection of pairs (A, B) such that there exist a positive
integer m and a pair (A, B) of F_{q^m}-linear codes of length n that satisfy the
conditions E.2, E.3, E.5 and E.6. Let C be the F_q-linear code of length n that is the
subfield subcode that has the elements of A ∗ B as parity checks. So

C = { x ∈ F_q^n : (a ∗ b) · x = 0 for all a ∈ A, b ∈ B }.

Then the minimum distance of C is at least 2t + 1 and (A, B) is a t-ECP for C, as
was noted before. Let F(n, t, q) be the collection of F_q-linear codes of length n and
minimum distance d ≥ 2t + 1. Consider the following map

ϕ(n, t, q) : P(n, t, q) → F(n, t, q), (A, B) ↦ C.

The question is whether this map is a one-way function.
If the map ϕ(n, t, q) is indeed difficult to invert, then we will call it the ECP
one-way function, and the code C with parity check matrix W might be used as a
public key in a coding-based PKC. Otherwise it would mean that the PKC based
on codes that can be decoded by error-correcting pairs is not secure.
Let K be a collection of generator matrices of codes that have a t-error-correcting
pair and that is used for a code-based PKC system. We address the
assumption A.2: whether we can distinguish arbitrary codes from those coming from
K.
ACKNOWLEDGEMENT
An earlier version of this work was presented for the first time by the second
author at the Code-Based Cryptography Workshop, May 2012, at the Technical
University of Denmark, Lyngby, and posted at arXiv [22], and furthermore presented at the
conferences Applications of Computer Algebra 2013 and 2014 at Malaga and
Fordham, respectively.


REFERENCES
[1] A. Becker, A. Joux, A. May, and A. Meurer, Decoding random binary
linear codes in 2n/20: How 1 + 1 = 0 improves information set decoding. In:
Advances in cryptology—EUROCRYPT 2012, Lecture Notes in Comput.
Sci., vol. 7237, pp. 520–536. Springer, Heidelberg (2012)
[2] T. Berger and P. Loidreau, How to mask the structure of codes for a
cryptographic use. Designs, Codes and Cryptography 35, 63–79 (2005)
[3] E. Berlekamp, R. McEliece, and H. van Tilborg, On the inherent
intractability of certain coding problems. IEEE Transactions on
Information Theory 24, 384–386 (1978)
[4] D. Bernstein, Introduction to post-quantum cryptography. In: D.J.
Bernstein, J.B., Dahmen, E. (eds.) Post-quantum cryptography, pp. 1–14.
Springer-Verlag, Berlin (2009)
[5] B. Biswas and N. Sendrier, McEliece cryptosystem implementation :
Theory and practice. In: Post-Quantum Cryptography, Lecture Notes in
Computer Science. vol. 5299, pp. 47–62. Springer, Berlin (2008)
[6] A. Couvreur, I. Márquez-Corbella, and R. Pellikaan, Cryptanalysis of
public-key cryptosystems that use subcodes of algebraic geometry codes.
In: 4ICMCTA, Coding Theory and Applications, CIM Series in
Mathematical Sciences 3, pp. 133–140 (2014)
[7] A. Couvreur, I. Márquez-Corbella, and R. Pellikaan, A polynomial time
attack against algebraic geometry code based public key cryptosystems.
In: IEEE International Symposium on Information Theory ISIT 2014, p.
1446 (2014)
[8] A. Couvreur, P. Gaborit, V. Gauthier-Umaña, A. Otmani, and J.P. Tillich,
Distinguisher-based attacks on public-key cryptosystems using Reed-
Solomon codes. Des. Codes Cryptogr. 73(2), 641–666 (2014)
[9] I. Duursma and R. Kötter, Error-locating pairs for cyclic codes. IEEE
Trans. Inform. Theory 40, 1108–1121 (1994)
[10] J.C. Faugère, A. Otmani, L. Perret, and J.P. Tillich, Algebraic
cryptanalysis of McEliece variants with compact keys. In: Gilbert, H.
(ed.) EUROCRYPT 2010, Lecture Notes in Computer Science, vol. 6110,
pp. 279–298. Springer, Berlin (2010)
[11] J.C. Faugère, V. Gauthier-Umaña, A. Otmani, L. Perret, and J.P. Tillich,
A distinguisher for high-rate McEliece cryptosystems. IEEE Trans.
Inform. Theory 59(10), 6830–6844 (2013)
[12] C. Faure and L. Minder, Cryptanalysis of the McEliece cryptosystem over
hyperelliptic codes. In: Proceedings 11th Int. Workshop on Algebraic and
Combinatorial Coding Theory, ACCT 2008. pp. 99–107 (2008)
[13] T. Høholdt and R. Pellikaan, On decoding algebraic-geometric codes.
IEEE Transactions on Information Theory 41, 1589–1614 (1995)

[14] H. Janwa and O. Moreno, McEliece public key cryptosystems using
algebraic-geometric codes. Designs, Codes and Cryptography 8, 293–307 (1996)
[15] K. Kobara and H. Imai, Semantically secure McEliece public-key
cryptosystems conversions for McEliece PKC. In: PKC 2001, Lecture
Notes in Computer Science. vol. 1992, pp. 19–35. Springer, Berlin (2001)
[16] N. Koblitz and A. Menezes, The brave new world of bodacious
assumptions in cryptography. Notices Amer. Math. Soc. 57(3), 357–365
(2010)
[17] R. Kötter, A unified description of an error locating procedure for linear
codes. In: Proceedings of Algebraic and Combinatorial Coding Theory,
pp. 113–117. Voneshta Voda (1992)
[18] R. Kötter, On algebraic decoding of algebraic-geometric and cyclic codes.
Ph.D. thesis, Linköping University of Technology, Linköping Studies in
Science and Technology, Dissertation no. 419 (1996)
[19] I. Márquez-Corbella, E. Martínez-Moro, and R. Pellikaan, On the unique
representation of very strong algebraic geometry codes. Designs, Codes
and Cryptography, pp. 215–230 (2012)
[20] I. Márquez-Corbella, E. Martínez-Moro, and R. Pellikaan, The non-gap
sequence of a subcode of a generalized Reed-Solomon code. Designs,
Codes and Cryptography 66(1-3) (2013)
[21] I. Márquez-Corbella, E. Martínez-Moro, R. Pellikaan, and D. Ruano,
Computational aspects of retrieving a representation of an algebraic
geometry code. J. Symbolic Computation 64, 67–87 (2014)
[22] I. Márquez-Corbella and R. Pellikaan, Error-correcting pairs for a public-
key cryptosystem. Preprint arXiv:1205.3647 (2012)
[23] I. Márquez-Corbella and R. Pellikaan, A characterization of MDS codes
that have an error-correcting pair. Preprint arXiv:1508.02187 (2015)
[24] R. McEliece, A public-key cryptosystem based on algebraic coding
theory. DSN Progress Report 42–44, 114–116 (1978)
[25] L. Minder, Cryptography based on error correcting codes. Ph.D. thesis
3846, EPFL (2007)
[26] R. Misoczki, J.P. Tillich, N. Sendrier, and P. Barreto, MDPC-McEliece:
New McEliece variants from moderate density parity-check codes. In:
Information Theory Proceedings (ISIT), 2013 IEEE International
Symposium on, pp. 2069–2073 (2013)
[27] H. Niederreiter, Knapsack-type cryptosystems and algebraic coding
theory. Problems of Control and Information Theory 15(2), 159–166
(1986)
[28] R. Pellikaan, On decoding by error location and dependent sets of error
positions. Discrete Math. 106–107, 369–381 (1992)
[29] R. Pellikaan, On the existence of error-correcting pairs. Statistical
Planning and Inference 51, 229–242 (1996)

64
Proceedings of IICMA 2015
Invited Speakers

[30] C. Peters, Information-set decoding for linear codes over Fq. In: Post-
Quantum Cryptography, Lecture Notes in Computer Science. vol. 6061,
pp. 81–94. Springer, Berlin (2010)
[31] C. Peters, Curves, codes and cryptography. Ph.D. thesis, Technical
University Eindhoven (2011)
[32] S. Singh, The code book: Science of secrecy from ancient Egypt to
quantum cryptography. Anchor Books, New York (1999)
[33] C. Wieschebrink, An attack on the modified Niederreiter encryption
scheme. In: PKC 2006, Lecture Notes in Computer Science. vol. 3958, pp.
14–26. Springer, Berlin (2006)
[34] C. Wieschebrink, Cryptanalysis of the Niederreiter public key scheme
based on GRS subcodes. In: Post-Quantum Cryptography, Lecture Notes
in Computer Science. vol. 6061, pp. 61–72. Springer, Berlin (2010).

Proceedings of IICMA 2015
Graph and Combinatorics

Rainbow Connection Number and Strong Rainbow Connection Number of The Graph (𝒅𝟐(𝑷𝒏) + 𝑲𝟏)

Diah Prastiwi^{1,a)} and Kiki A. Sugeng^{2,b)}
^{1,2} Department of Mathematics, Faculty of Mathematics and Natural Science, Universitas Indonesia
^{a)} prastiwi10diah@gmail.com
^{b)} kiki@sci.ui.ac.id

Abstract. Let 𝐺 be a nontrivial connected graph with an edge-coloring 𝑐: 𝐸(𝐺) → {1,2,…,𝑘}, 𝑘 ∈
ℕ. A path is called rainbow if no two edges in the path have the same color. The graph 𝐺 is
rainbow connected if any two vertices of 𝐺 are connected by a rainbow path in 𝐺. The rainbow
connection number of 𝐺, denoted 𝑟𝑐(𝐺), is the smallest number of colors needed to make 𝐺
rainbow connected. The strong rainbow connection number of 𝐺, denoted by 𝑠𝑟𝑐(𝐺), is the
smallest number of colors needed so that for any two vertices 𝑥, 𝑦 ∈ 𝑉(𝐺) there is a rainbow
𝑥−𝑦 path which is also a geodesic. In an earlier work, the first author found the 𝑟𝑐 and 𝑠𝑟𝑐 of
𝑑₂(𝑃ₙ), the shadow graph of a path. In this paper we compute the 𝑟𝑐 and 𝑠𝑟𝑐 of the sum of
this graph with a 𝐾₁. For 𝑛 ≤ 6, 𝑟𝑐(𝑑₂(𝑃ₙ) + 𝐾₁) = 𝑠𝑟𝑐(𝑑₂(𝑃ₙ) + 𝐾₁) = 2. For 𝑛 ≥ 7,
𝑟𝑐(𝑑₂(𝑃ₙ) + 𝐾₁) = 3 and 𝑠𝑟𝑐(𝑑₂(𝑃ₙ) + 𝐾₁) = ⌈𝑛/3⌉, with 𝑑𝑖𝑎𝑚(𝑑₂(𝑃ₙ) + 𝐾₁) = 2.

Keywords and Phrases: rainbow connection, shadow graph, path, sum

1. Introduction
Chartrand et al. [1] introduced rainbow connection. Let 𝐺 be a nontrivial
connected graph with an edge-coloring 𝑐: 𝐸(𝐺) → {1, … , 𝑘}. A path is called
rainbow if no two edges in the path have the same color. The graph is rainbow
connected if any two vertices of 𝐺 are connected by a rainbow path in 𝐺. The
rainbow connection number of 𝐺, denoted 𝑟𝑐(𝐺), is the smallest number of colors
needed to make 𝐺 rainbow connected. The strong rainbow connection number of
𝐺, denoted 𝑠𝑟𝑐(𝐺), is the smallest number of colors needed so that any two
vertices of 𝐺 are connected by a rainbow geodesic in 𝐺. Chartrand et al. [1] proved
the following basic results, which we collect as a lemma.
Lemma 1. Let 𝐺 be a nontrivial connected graph. Then,
1) 𝑑𝑖𝑎𝑚(𝐺) ≤ 𝑟𝑐(𝐺) ≤ 𝑠𝑟𝑐(𝐺) ≤ |𝐸(𝐺)|
2) 𝑟𝑐(𝐺) = 2 if and only if 𝑠𝑟𝑐(𝐺) = 2.
3) 𝑟𝑐(𝐺) = 1 if and only if 𝑠𝑟𝑐(𝐺) = 1 if and only if 𝐺 is a complete graph.
We study a class of graphs arising from two constructions: shadow and sum.
As described by Vaidya and Shah [2], the shadow of a graph 𝐻 is a new graph
that is obtained by cloning 𝐻 into a new disjoint but isomorphic graph 𝐻′, and then

connecting each vertex 𝑥 of 𝐻 with all neighbors of 𝑥 ′ , the clone of 𝑥 in 𝐻 ′ . Note
that there is no edge between any vertex and its clone. The resulting graph is
denoted 𝐷2 (𝐻). The sum of two vertex-disjoint graphs 𝐽 and 𝐾 is obtained by
joining each vertex of 𝐽 to all vertices of 𝐾. The resulting graph is denoted by 𝐽 +
𝐾.
Summing any graph 𝐽 with 𝐾1 diminishes the diameter to 2, because from
any vertex of 𝐽 we can go to 𝐾1 , and then return to any other vertex of 𝐽.
Intuitively, this would also decrease the rainbow connection number of 𝐽 + 𝐾1 .
This is relatively easy to see when 𝐽 is the shadow of some graph.
Lemma 2. For any graph 𝐻 with no isolated vertices, we have 𝑟𝑐(𝐷₂(𝐻) + 𝐾₁) ≤ 3.
PROOF. Let 𝐺 = 𝐷₂(𝐻) + 𝐾₁. We prove this lemma by giving a rainbow coloring
on 𝐺 with 3 colors. Define 𝑐: 𝐸(𝐺) → {1,2,3} as follows. Let 𝑉(𝐻) = {𝑣₁,…,𝑣ₙ} and
𝑉(𝐾₁) = {𝑢}.
On the edges of 𝐷₂(𝐻) between two original or between two cloned vertices, define 𝑐 to have value 1.
𝑐(𝑣ᵢ𝑢) = 1 and 𝑐(𝑣ᵢ′𝑢) = 2 for each 𝑖 ∈ {1,…,𝑛}.
𝑐(𝑣ᵢ𝑣ⱼ′) = 3 if 𝑣ᵢ and 𝑣ⱼ are adjacent in 𝐻.

Now we show that 𝑐 is rainbow. Take any 𝑥, 𝑦 ∈ 𝑉(𝐺) which are non-adjacent. If
𝑥, 𝑦 are both original vertices of 𝐻, choose a neighbor of 𝑥, say 𝑤. Then the path
𝑥 − 𝑤 ′ − 𝑢 − 𝑦 has color sequence 3,2,1. Similarly for the case when 𝑥, 𝑦 are both
clones. If 𝑥 is an original vertex and 𝑦 is a clone, then the path 𝑥 − 𝑢 − 𝑦 has color
sequence 1,2.
So, in any case we can find a rainbow path from 𝑥 to 𝑦. The proof of Lemma 2 is
complete. □
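The coloring in the proof can be checked mechanically for small cases. The sketch below (ours, not the authors' code) builds D₂(Pₙ) + K₁ with the 3-coloring above and verifies by exhaustive search that every pair of vertices is joined by a rainbow path:

```python
# A sketch (ours) that builds D2(Pn) + K1 with the 3-coloring of Lemma 2 and
# verifies by exhaustive search that every vertex pair has a rainbow path.
def build(n):
    col = {}
    def add(a, b, c):
        col[frozenset((a, b))] = c
    for i in range(n - 1):
        add(('v', i), ('v', i + 1), 1)      # edges among originals: color 1
        add(('w', i), ('w', i + 1), 1)      # edges among clones: color 1
        add(('v', i), ('w', i + 1), 3)      # cross edges x -- clone(neighbor): color 3
        add(('v', i + 1), ('w', i), 3)
    for i in range(n):
        add(('v', i), 'u', 1)               # originals to u: color 1
        add(('w', i), 'u', 2)               # clones to u: color 2
    V = [('v', i) for i in range(n)] + [('w', i) for i in range(n)] + ['u']
    return V, col

def rainbow_connected(V, col):
    def adjacent(x, y):
        return frozenset((x, y)) in col
    def colr(x, y):
        return col[frozenset((x, y))]
    nbrs = {x: [y for y in V if adjacent(x, y)] for x in V}
    for i, x in enumerate(V):
        for y in V[i + 1:]:
            if adjacent(x, y):
                continue
            found = any(adjacent(a, y) and colr(x, a) != colr(a, y)
                        for a in nbrs[x])
            found = found or any(
                adjacent(b, y) and len({colr(x, a), colr(a, b), colr(b, y)}) == 3
                for a in nbrs[x] for b in nbrs[a] if b not in (x, y))
            if not found:
                return False
    return True

for n in range(2, 9):
    assert rainbow_connected(*build(n))   # confirms rc(D2(Pn) + K1) <= 3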

In this paper we focus on the 𝑟𝑐 and 𝑠𝑟𝑐 of 𝐷₂(𝑃ₙ) + 𝐾₁, where 𝑃ₙ is the
path graph on 𝑛 vertices.

FIGURE 1. Graph 𝑑2 (𝑃𝑛 ) + 𝐾1

In Seminar Nasional Matematika UI-UNPAD 2015, the first author described
the rc and src of 𝐷₂(𝑃ₙ), which are both equal to the diameter. For 𝐷₂(𝑃ₙ) + 𝐾₁, it
turns out that the rc and src are not so trivial.

2. Main Results
Our goal is to prove the following theorem.
Theorem. Let 𝐺 = 𝐷₂(𝑃ₙ) + 𝐾₁. Then

rc(𝐺) = 2 if 𝑛 ≤ 6 and rc(𝐺) = 3 if 𝑛 ≥ 7;   src(𝐺) = 2 if 𝑛 ≤ 6 and src(𝐺) = ⌈𝑛/3⌉ if 𝑛 ≥ 7.
We organize the proof into a sequence of lemmas. For convenience, we fix some
notation. Let 𝐺 = 𝐷2 (𝑃𝑛 ) + 𝐾1 with 𝑉(𝑃𝑛 ) = {𝑣1 , … , 𝑣𝑛 } and 𝑉(𝐾1 ) = {𝑢}.
First we prove an upper bound for src.
Lemma 3. For any 𝑛, we have 𝑠𝑟𝑐(𝐺) ≤ max{⌈𝑛/3⌉, 2}.
PROOF. Let 𝑘 = max{⌈𝑛/3⌉, 2}. We prove this by constructing a strong rainbow
coloring on 𝐺 using 𝑘 colors. Define 𝑐: 𝐸(𝐺) → {1,…,𝑘} as follows.
𝑐(v_i v_{i+1}) = 𝑐(v'_i v'_{i+1}) := 1 for 𝑖 odd,
𝑐(v_i v_{i+1}) = 𝑐(v'_i v'_{i+1}) := 2 for 𝑖 even,
𝑐(v_i v'_{i+1}) = 2 and 𝑐(v'_i v_{i+1}) = 1 for all 𝑖 ∈ {1,…,𝑛−1}, and
𝑐(v_i u) = 𝑐(v'_i u) = ⌈𝑖/3⌉ for each 𝑖 ∈ {1,…,𝑛}.

FIGURE 2. Graph Coloring of 𝑑2 (𝑃𝑛 ) + 𝐾1

Now take any non-adjacent vertices 𝑥, 𝑦 of 𝐺. Note that 𝑢 is adjacent to
every vertex, so 𝑥, 𝑦 ≠ 𝑢. So we only need to consider three cases.
Case 1. 𝑥, 𝑦 are both original vertices of 𝑃ₙ.
Say 𝑥 = v_i and 𝑦 = v_j with 𝑖 < 𝑗. Since they are non-adjacent, 𝑗 ≥ 𝑖 + 2. If 𝑗 =
𝑖 + 2, then the geodesic v_i − v_{i+1} − v_{i+2} is rainbow, because it has color sequence
1,2 or 2,1.
If 𝑗 ≥ 𝑖 + 3, then the path v_i − u − v_j is a geodesic with color sequence ⌈𝑖/3⌉, ⌈𝑗/3⌉.
Because 𝑗 ≥ 𝑖 + 3, we have ⌈𝑗/3⌉ ≥ ⌈𝑖/3⌉ + 1. So the path is a rainbow geodesic.

Case 2. 𝑥, 𝑦 are both cloned vertices of 𝑃ₙ. This case is perfectly analogous to Case 1.
Case 3. 𝑥 is original, and 𝑦 is cloned.
Say 𝑥 = v_i and 𝑦 = v'_j. If 𝑖 = 𝑗 is an odd number less than 𝑛, the geodesic
v_i − v'_{i+1} − v'_i has color sequence 2,1. If 𝑖 = 𝑗 is an even number less than 𝑛, the
geodesic v_i − v_{i+1} − v'_i has color sequence 2,1. If 𝑖 = 𝑗 = 𝑛, one of the geodesics
v_n − v_{n−1} − v'_n and v_n − v'_{n−1} − v'_n has color sequence 1,2, depending on the parity of 𝑛.
If |𝑖 − 𝑗| = 1, then 𝑥, 𝑦 are adjacent.
If |𝑖 − 𝑗| = 2, then either 𝑗 = 𝑖 + 2 or 𝑗 = 𝑖 − 2. If 𝑗 = 𝑖 + 2, the geodesic v_i −
v_{i+1} − v'_{i+2} has color sequence 1,2 when 𝑖 is odd, while the geodesic v_i − v'_{i+1} −
v'_{i+2} has color sequence 2,1 when 𝑖 is even. If 𝑗 = 𝑖 − 2, the geodesic v'_{i−2} −
v_{i−1} − v_i has color sequence 1,2 when 𝑖 is odd, while the geodesic v'_{i−2} − v'_{i−1} −
v_i has color sequence 2,1 when 𝑖 is even.
If |𝑖 − 𝑗| ≥ 3, the geodesic v_i − u − v'_j has color sequence ⌈𝑖/3⌉, ⌈𝑗/3⌉. Because |𝑖 − 𝑗| ≥
3, we have ⌈𝑖/3⌉ ≠ ⌈𝑗/3⌉. So this geodesic is rainbow. Lemma 3 has been proved. □

Now we prove a lower bound for 𝑠𝑟𝑐.
Lemma 4. For any 𝑛, we have 𝑠𝑟𝑐(𝐺) ≥ ⌈𝑛/3⌉.
PROOF. Let 𝑙 = ⌈𝑛/3⌉. Consider all vertices v_i with 𝑖 ≡ 1 (mod 3). There are 𝑙 of
them. Note that the only geodesic in 𝐺 from v_1 to v_4 is v_1 − u − v_4. Thus
𝑐(v_1 u) ≠ 𝑐(v_4 u).
Assume that the 𝑟 edges v_1 u, v_4 u, …, v_{1+3(r−1)} u use 𝑟 different colors, where 𝑟 ≤
𝑙 − 1. For each 𝑖 ∈ {0,…,𝑟−1}, the only geodesic in 𝐺 from v_{1+3i} to v_{1+3r} is
v_{1+3i} − u − v_{1+3r}. Thus 𝑐(v_{1+3i} u) ≠ 𝑐(v_{1+3r} u). This means that v_{1+3r} u must
use a new color. Inductively, this shows that at least 𝑙 different colors must be used.
The proof of Lemma 4 is complete. □
Now we can proceed to prove the main Theorem.
Suppose that 𝑛 ≤ 6. Then ⌈𝑛/3⌉ ≤ 2. Lemma 3 shows that 𝑠𝑟𝑐(𝐺) ≤ 2. But, since 𝐺
is not a complete graph, 𝑟𝑐(𝐺) ≥ 2. Together with 𝑟𝑐(𝐺) ≤ 𝑠𝑟𝑐(𝐺), this allows us
to conclude that the rc and src are both equal to 2.
Now suppose that 𝑛 ≥ 7. Then ⌈𝑛/3⌉ ≥ 3. Lemmas 3 and 4 show that 𝑠𝑟𝑐(𝐺) = ⌈𝑛/3⌉.
Thus 𝑠𝑟𝑐(𝐺) ≥ 3. This implies that 𝑟𝑐(𝐺) ≠ 2. But by Lemma 2 we have 𝑟𝑐(𝐺) ≤
3. So 𝑟𝑐(𝐺) = 3.
The proof is complete. □

3. Concluding Remarks
In this paper we have described completely the rc and src of 𝐷₂(𝑃ₙ) + 𝐾₁.
The values differ significantly depending on whether 𝑛 ≤ 6 or 𝑛 ≥ 7.
It may be interesting to investigate whether Lemma 2 still holds when 𝐽 is
not the shadow of some graph.
Another direction of research that may be taken is to study the rc and src of
𝐷₂(𝐻) + 𝐾₁ for a more general graph 𝐻, or 𝐷₂(𝑃ₙ) + K̄ₚ for 𝑝 ≥ 2, or even
𝐷ₘ(𝑃ₙ) + K̄ₚ for 𝑚 ≥ 3.

References
[1] G. Chartrand, G. L. Johns, K. A. McKeon, P. Zhang, Rainbow
Connection in Graphs, Mathematica Bohemica, Vol 133 no 1 (2008), 85–98.
[2] S.K. Vaidya, N.H. Shah, Some New Odd Harmonious Graphs,
International Journal of Mathematics and Soft Computing, Vol 1 no 1 (2011),
9–16.

Proceedings of IICMA 2015
Graph and Combinatorics

Encryption Algorithm Using New Modified Map for Digital Image
Suryadi M.T.1,a), Maria Yus Trinity Irsan2,b),
Yudi Satria3,c)
1,2,3
Department of Mathematics, Faculty of Mathematics and Natural
Sciences, Universitas Indonesia, Depok 16424, Indonesia

a)
yadi.mt@sci.ui.ac.id
b)
maria.irsan@sci.ui.ac.id
c)
ysatria@sci.ui.ac.id

Abstract. At this time when information technology is developing rapidly, the public's
attention to personal data and information is very low. This is clearly seen in the behavior
of people who so easily spread their data and personal information on social media.
Moreover, a person's awareness of and attention to digital data that can be accessed off-line is
even lower. In fact, even in an off-line network, digital data is very easy to move from one person
to another, which may cause the manipulation or alteration of data. Therefore, it is
essential to provide protection to the data, especially if the data is highly confidential. The
focus of this paper is to protect digital data, specifically digital images (colored digital images).
Protection is provided in the form of an encrypted digital image. The encryption process uses
the new map

x_{n+1} = rλx_n / (1 + λ(1 − x_n)²)  (mod 1),

called the MS map. As a result, we obtain the encrypted digital image using the MS map, and
study its performance in terms of: the randomness of the key stream; the average time of the
encryption/decryption process; histogram analysis; the quality of decrypted images; and the
sensitivity to the initial value. The results show that the generated key stream is random; the
average time of the encryption process is about the same as that of the decryption process; the
histogram of the encrypted image is flat (uniform distribution); the decrypted images have the
same quality as the original images; and the sensitivity to the initial value reaches 10⁻¹⁷.
Therefore, the encryption algorithm developed with the MS map is more resistant to brute-force attack.
Keywords and Phrases: MS map, Logistic map, encryption and decryption
algorithm, digital image.

1. Introduction
Encryption algorithms for digital images based on chaotic systems have been
widely introduced, as in [1] – [3]. The study of chaotic systems and their
possible applications to cryptography has received considerable attention during
the last years in part of the scientific community due to the properties of chaotic
systems [4]. The properties of a chaotic system were described in [4], namely:
ergodicity, sensitivity to initial conditions/control parameters, the mixing property,
deterministic dynamics, and structural complexity.
In [1] an encryption algorithm for digital images and its performance using the
logistic map as a chaotic map was introduced. In this paper, we introduce
a new map, called the MS map, which will be applied to digital image
encryption. The MS map is shown in equation (1):

x_{n+1} = rλx_n / (1 + λ(1 − x_n)²)  (mod 1),    (1)

where r, λ ∈ ℝ and x_n ∈ (0,1).
The first 1000 iterates of the MS map are shown in Figure 1 with r = 4, λ = 5, x₀ = 0.1.

FIGURE 1. First 1000 iterations of the MS map

2. Main Results
2.1 Encryption Algorithm
In this paper, the encryption process is shown in Figure 2, and the
decryption process is shown in Figure 3.

FIGURE 2. Encryption process

FIGURE 3. Decryption process

Let the original image have pixel size 𝑝 × 𝑞. The key stream (𝐾ᵢ) is obtained by:

𝐾ᵢ = (1000 × 𝑥ᵢ) mod 256    (2)

where 𝑥ᵢ is generated by the MS map and 𝑖 = 0, 1, 2, …, 3𝑝𝑞 − 1 (three bytes per
pixel, one for each RGB channel value).
For the simulation, we use Python version 2.7.6.1.
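A minimal sketch of the key-stream generator follows (our reconstruction from equations (1) and (2); the paper's own code is not shown, and the exact step combining key bytes with pixel bytes appears only in Figures 2–3, so XOR is our assumption):

```python
# A minimal sketch (ours, reconstructed from equations (1) and (2)) of the
# MS map key-stream generator.
def ms_keystream(x0, r=4.0, lam=5.0, nbytes=1000):
    x = x0
    stream = []
    for _ in range(nbytes):
        x = (r * lam * x / (1.0 + lam * (1.0 - x) ** 2)) % 1.0   # MS map, eq. (1)
        stream.append(int(1000 * x) % 256)                       # key byte, eq. (2)
    return stream

keys = ms_keystream(0.1)
# For a p x q color image, 3*p*q key bytes are generated, one per channel value;
# combining them with the pixel bytes (e.g. by XOR) is an assumption on our part.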

2.2 Randomness of Key Stream Analysis
The randomness of the key stream can be tested using the National Institute of
Standards and Technology (NIST) test suite [5]. The NIST suite consists of 15 tests, and one of
them is the frequency (monobit) test. We will use this test to
analyze the randomness of the key stream sequence.
The procedure of this test is described in [5]. For 𝑟 = 4, 𝜆 = 5, 𝑥₀ = 0.1,
and a generated key stream of 1000 terms, we obtained:
1. 𝑛 = 1000 × 8 = 8000 bits.
2. Using the computer we obtained |𝑆ₙ| = 162.
3. 𝑠_obs = |𝑆ₙ|/√𝑛 = 162/√8000 = 1.81121506177.
4. 𝑃_value = erfc(𝑠_obs/√2) = erfc(1.81121506177/√2) = 0.070107567605.
Since 𝑃_value ≥ 0.01, we accept the key stream sequence as random.
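The frequency test itself is a few lines; the sketch below (ours) follows the procedure of [5] for a byte key stream:

```python
# A compact sketch (ours) of the NIST frequency (monobit) test of [5].
from math import erfc, sqrt

def monobit_pvalue(key_bytes):
    bits = [(b >> k) & 1 for b in key_bytes for k in range(8)]
    s_n = sum(2 * bit - 1 for bit in bits)        # +1 per one, -1 per zero
    s_obs = abs(s_n) / sqrt(len(bits))
    return erfc(s_obs / sqrt(2))

p = monobit_pvalue(ms_keystream(0.1))             # key stream from the sketch above
print(p, "random" if p >= 0.01 else "non-random")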
2.3 Encryption Process
In this paper we use cat&dog.jpg at 5 different pixel sizes as test data. The test
data are shown below:

TABLE 1. Testing Data (all derived from cat&dog.jpg)

Data Testing   Size (Pixel)
Data 1         95 × 60
Data 2         285 × 178
Data 3         855 × 534
Data 4         1538 × 962
Data 5         2563 × 1602

The encrypted images are shown in Figure 4, and the average times of the
encryption and decryption process for each data set are shown in Figure 5.

FIGURE 4. Encrypted images: (a) data 1; (b) data 2; (c) data 3; (d) data 4; (e) data 5

FIGURE 5. Average time (seconds) of the encryption and decryption process for each test data set

2.4 Histogram Analysis
We test the distribution of the encrypted image pixel values using the
goodness-of-fit test [6]. The test results are shown in Table 2.

TABLE 2. Test Statistic Value

Data     For Red          For Green        For Blue
Data 1   239.020350877    260.578245614    255.458245614
Data 2   239.020350877    260.578245614    255.458245614
Data 3   252.799430536    241.17045798     213.568188887
Data 4   239.020350877    260.578245614    255.458245614
Data 5   264.588116785    209.65349205     280.176465918
74
With 256 − 1 = 255 degrees of freedom and a 1% significance level, the critical
value obtained from the Chi-squared table is 310.4573882199. Since all
the test statistic values are less than the critical value, we conclude that the
distribution of the encrypted image pixels is uniform. Figure 6 shows the histograms
of the encrypted image of data 3.
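The statistic in Table 2 can be computed per channel as follows (a sketch of ours; the expected count per bin is N/256 under the uniform hypothesis):

```python
# A sketch (ours) of the chi-square goodness-of-fit statistic of Table 2:
# the 256-bin histogram of one color channel is tested against uniformity.
def chi_square_uniform(channel_bytes):
    n = len(channel_bytes)
    expected = n / 256.0
    counts = [0] * 256
    for b in channel_bytes:
        counts[b] += 1
    return sum((c - expected) ** 2 / expected for c in counts)

# Compare the result with the critical value 310.4573882199
# (255 degrees of freedom, 1% significance level).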

FIGURE 6. Histograms of the encrypted image of data 3: (a) red; (b) green; (c) blue

2.5 Quality of Image Test
We compare the quality of the original image and the decrypted image by the Peak
Signal-to-Noise Ratio (PSNR), which is described in [7], with the formula

PSNR = 10 log₁₀ (L²/MSE),

where the Mean Squared Error is

MSE = (1/N) Σ_{i=0}^{p−1} Σ_{j=0}^{q−1} (x_{(i,j)} − y_{(i,j)})²,

N = pq, and x_{(i,j)} and y_{(i,j)} are the (i, j) entries of the original image matrix and the
decrypted image matrix, respectively (assuming the image size is p pixels × q pixels).
If PSNR = ∞ then we conclude that the original image and the decrypted
image have the same quality. Using Python we obtained:

TABLE 3. PSNR result

Data     MSE   PSNR
Data 1   0     ∞
Data 2   0     ∞
Data 3   0     ∞
Data 4   0     ∞
Data 5   0     ∞

So, we conclude that the quality of the original image and the decrypted
image is the same.
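The PSNR computation amounts to a few lines (a sketch of ours, with L = 255 assumed for 8-bit channels):

```python
# A tiny sketch (ours) of the MSE/PSNR comparison of Table 3; L = 255 is
# assumed for 8-bit channels.
from math import log10

def psnr(original, decrypted, L=255.0):
    diffs = [(a - b) ** 2 for a, b in zip(original, decrypted)]
    mse = sum(diffs) / len(diffs)
    return float('inf') if mse == 0 else 10 * log10(L * L / mse)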

2.6 Key Sensitivity Analysis
In this part, we decrypt data 3 with different values of 𝑥₀. In the encryption process
we used 𝑥₀ = 0.1; here we decrypt the encrypted image with three different values of
𝑥₀, as shown in Table 4 below:

TABLE 4. Key sensitivity test result (original image vs. decrypted image, shown in the figure column of the original table)

𝑥₀
0.1 + 10⁻¹⁰
0.1 + 10⁻¹⁷
0.1 + 10⁻¹⁸

From Table 4 we see that the initial value 𝑥₀ = 0.1 gives the same decryption result as
𝑥₀ = 0.1 + 10⁻¹⁸, which means that the sensitivity to the initial value of this algorithm
reaches 10⁻¹⁷.
The next table shows the comparison between the MS map algorithm and the logistic
map algorithm in terms of exhaustive key search times.

76
TABLE 5. Comparison between the MS map algorithm and the logistic map
algorithm in terms of exhaustive key search times

With MS map algorithm (size of key space 3.24 × 10^634):
Experiments/sec   Seconds         Hours          Days            Months          Years
10^6              3.24 × 10^628   9 × 10^624     3.75 × 10^623   1.25 × 10^622   1.04 × 10^621
10^12             3.24 × 10^622   9 × 10^618     3.75 × 10^617   1.25 × 10^616   1.04 × 10^615
10^24             3.24 × 10^610   9 × 10^606     3.75 × 10^605   1.25 × 10^604   1.04 × 10^603
10^48             3.24 × 10^586   9 × 10^582     3.75 × 10^581   1.25 × 10^580   1.04 × 10^579
10^56             3.24 × 10^578   9 × 10^574     3.75 × 10^573   1.25 × 10^572   1.04 × 10^571

With logistic map algorithm (size of key space 10^30 [1]):
Experiments/sec   Seconds         Hours          Days            Months          Years
10^6              10^24           2.78 × 10^20   1.16 × 10^19    3.87 × 10^17    3.23 × 10^16
10^12             10^18           2.78 × 10^14   1.16 × 10^13    3.87 × 10^11    3.23 × 10^10
10^24             10^6            278            11.6            0.39            0.0325
10^48             10^−18
10^56             10^−26

3. Concluding Remarks

From the results above, we conclude that:
a) The MS map can be used in digital image encryption as a generator of the key stream.
b) The performance of the algorithm is as follows:
The encryption algorithm is very difficult to crack by a known-plaintext
attack, since we have shown that the distribution of the encrypted
image pixels is uniform. Likewise, the key streams that were generated are proved
random by the frequency (monobit) test with 𝑃_value =
0.620617946438 > 0.01 and by the runs test with 𝑃_value =
0.395958750062 > 0.01.
The key sensitivity of the algorithm reaches 10⁻¹⁷ and the key space reaches
3.24 × 10^634, so the encryption algorithm developed with the MS
map is resistant to brute-force attack.
The original image and the decrypted image have the same quality by PSNR.
The encryption time and decryption time are roughly the same.
c) The MS map algorithm is more resistant than the logistic map algorithm in terms of
the size of the key space.

References

[1] M.T. Suryadi, E. Nurpeti, and Widya, Performance of Chaos-Based


Encryption Algorithm for Digital Image, TELKOMNIKA
Telecommunication, computing, electronics, and control Vol.12 No.3,
(2014).
[2] E. Nurpeti and M.T. Suryadi (2013), Chaos-Based Encryption Algorithm
for Digital Image, Proceedings IndoMS International Conference on
Mathematics and Its Application 2013, Indonesia Mathematical Society.
[3] E. Sukirman, M.T. Suryadi, and M.A. Mubarak, The Implementation of
Henon Map Algorithm for Digital Image Encryption, TELKOMNIKA
Telecommunication, computing, electronics, and control Vol.12 No.3,
(2013).
[4] L. Kocarev and S. Lian, Chaos-Based Cryptography, Berlin Heidelberg:
Springer-Verlag, (2011).
[5] A. Rukhin, et al., A Statistical Test Suite for Random and Pseudorandom
Number Generators for Cryptographic Applications. Technology
Administration, U.S. Department of Commerce, Special Publication 800-22,
(2010).
[6] R.E. Walpole, R. H. Myers, S. L. Myers, K. Ye, Probability and Statistics
for Engineers & Scientists (9th edition), Boston: Prentice Hall, (2012).
[7] A. Verma and I. Kaur, Techniques and Algorithm Design for the Detection
of PSNR in Digital Image. International Journal of Computer Applications
(0975-8887): National Conference on Structuring Innovation Through
Quality 2013, (2013).

Proceedings of IICMA 2015
Statistics and Probability

Factors that Influence Infidelity Tendency for Workers and Scholars in Jakarta

Rianti Setiadi1,a), Dini Riyani2,b), Yanny Arumsari3,c)


1,2,3
Department of Mathematics, Faculty of Mathematics and Natural Sciences,
Universitas Indonesia, Depok 16424, Indonesia
a)
rianti@sci.ui.ac.id
b)
riyani.dini@gmail.com
c)
yanny.arumsari@gmail.com

Abstract: In this modern era, infidelity has been regarded as a common phenomenon in society,
including among scholars and workers. Nevertheless, according to the law as well as religion,
infidelity is still prohibited. Infidelity causes several negative impacts, such as disease, divorce,
rebellious children, and even death. Looking at this fact, infidelity ought to be prevented right at
the beginning. This research investigates the factors which may significantly influence the
tendency of infidelity. The factors included in this research are gender, occupation,
education, position at work, age, financial problem, life satisfaction, self-image, need of
acceptance, spiritual level, boredom level, broken-heart experience, income, freedom in making
friends, lifestyle, libido level, marital status, and parents' marital status. The measurement instrument
is self-made and has been proven to be reliable and valid. This research was conducted among 687
workers and scholars in Jakarta. A purposive sampling method was adopted to gather the sample. The data
are analyzed using a Logistic Regression Model. The result shows that gender, financial problem,
spiritual level, libido level and freedom in making friends are the significant factors influencing
the level of infidelity tendency. It is hoped that the result of this research will help religious leaders,
counselors, and other related parties to prevent and handle the issue appropriately based on the
factors influencing the tendency of infidelity.
Keywords: Infidelity tendency, Logistic Regression Model

1. Introduction
In the last several years, infidelity has come to be regarded as a common and
acceptable conduct. There are many married couples who have other persons with whom they date
and have sexual relations outside their legal partners. Infidelity, which this paper defines
as having sexual relations outside the legal partner, is something against the religious view and
has bad effects, such as divorce, sexually transmitted diseases, and others. Looking at the
effects of this conduct, religious leaders, psychologists, counselors, teachers, and other
morale-related parties should raise their awareness of the magnification of this
phenomenon. Nevertheless, preventive actions should also accompany the raised awareness
in order to reduce the occurrence of infidelity in society. Therefore, knowing what
factors influence the tendency of infidelity is important for taking preventive actions.
From the reports and observations gathered, it is known that infidelity can be committed
by both males and females, by people with high-degree and low-degree education, by both single and
married persons, by both the wealthy and the less wealthy, and by people with both high and low levels of
spirituality. Besides, it is also known that infidelity is triggered by financial problems, life
satisfaction, self-image, need of acceptance, boredom level, broken-heart experience,
income, freedom in making friends, lifestyle, libido level, and own and parents' marital status.
This paper investigates what factors significantly influence the infidelity tendency level.
Throughout this research, the following question will be answered: what factors
influence the infidelity tendency level among workers and scholars in Jakarta? From this
research, readers may be able to find the factors that influence the infidelity tendency level
among workers and scholars in Jakarta. By knowing the factors that influence the infidelity
tendency level, accurate and appropriate policy can be considered to reduce the practice of
infidelity, divorce and other bad impacts. Students are excluded from this research's
respondents.

2. Main Results
Variables that are used in this research are:
1. Infidelity Tendency Level: degree to which someone is attracted to do sexual relation
outside his/her legal partner. This variable is a latent variable which is measured
using self-tailored measuring instrument (questionnaire) and has been tested reliable
and valid. This measuring instrument is on the process of granting the copyright from
HAKI. The components of infidelity tendency measured in this measuring instrument
include the degree of norm being followed, the degree of willingness to do infidelity,
and the degree of awareness about the effect of infidelity. Those components are
reflected in the particular questions in the questionnaire.
2. Gender
3. Occupation: respondent’s job field when the survey is conducted
4. Position at work: respondent’s job band when the survey is conducted
5. Education: respondent’s last education degree
6. Age
7. Financial Problem: whether the respondent has financial problem or not
8. Life Satisfaction: degree of how the repondent is satisfied with his/her life. This
variable is measured using Likert Scale and then categorized into three categories
which are low, average, and high.
9. Self Image: degree of how the respondent accepts his/her self. This variable is
measured using Likert Scale and then categorized into three categories which are
low, average and high.
10. Need of acceptance: degree of how the respondent perceives the need to be accepted
by the the society. This variable is measured using Likert Scale and then categorized
into three categories which are low, average and high.
11. Spiritual Level: degree to measure respondent’s relation with God, faith, and
fellowship. This variable is measured using Likert Scale and then categorized into
three categories which are low, average and high.
12. Boredom Level: degree of how the respondent is bored with his/her life. This variable
is measured using Likert Scale and then categorized into three categories which are
low, average and high.
13. Broken Heart Experience: indicator whether or not the respondent ever felt betrayed.
14. Income: respondent’s average monthly income

80
15. Freedom in Making Friend: degree of how free the respondent in making friend. This
variable is measured using Likert Scale and then categorized into three categories
which are low, average and high.
16. Lifestyle: type of respondent’s lifestyle; whether it is conservative or modern.
17. Libido Level: degree of how the respondent is influenced by sexual matters. This
variable is measured using Likert Scale and then categorized into three categories
which are low, average and high.
18. Marital Status: respondent’s marital status when the survey is conducted. This
variable is categorized into four categories, which are single, married, spouse dies,
divorce.
19. Parents Marital Status: marital status of respondent’s parents when the survey is
conducted. This variable is categorized into, divorce, spouse dies

Methodology
Population: workers and scholars in Jakarta
Sample: 687 workers and scholars in Jakarta that are chosen by purposive sampling. Data
will be analyzed by logistic regression method. Since the predictors variables are categorical
variables, coding variables are needed. When all variables are entered to the model, it is
found that only five variables influence the predictor variables significantly, that are freedom
in making friends, spiritual level, libido level, financial problem and gender. The coding of
these variables are shown in Table 1.
TABLE 1. Categorical Variables Codings
Frequenc Parameter coding
y (1) (2) (3)
Very low 180 1,000 ,000 ,000
Freedom making in Low 246 ,000 1,000 ,000
Friend Average 197 ,000 ,000 1,000
High 64 ,000 ,000 ,000
Low 104 1,000 ,000
Spiritual level Average 231 ,000 1,000
High 352 ,000 ,000
Low 236 1,000 ,000
Libido level Average 309 ,000 1,000
High 142 ,000 ,000
Yes 221 1,000
Financial problem
No 466 ,000
Male 1,000
Gender
Female 337 ,000

TABLE 2. Dependent Variable Encoding


Original Value Internal Value
Low 0
High 1

81
The Logistic Regression model uses the five significant predictor variables is:
𝑃𝑟(𝐿𝑜𝑤 𝐼𝑛𝑓𝑖𝑑𝑒𝑙𝑖𝑡𝑦 𝑇𝑒𝑛𝑑𝑒𝑛𝑐𝑦)
𝑔 = ln ( )
𝑃𝑟(𝐻𝑖𝑔ℎ 𝐼𝑛𝑓𝑖𝑑𝑒𝑙𝑖𝑡𝑦 𝑇𝑒𝑛𝑑𝑒𝑛𝑐𝑦)
= 0.87 + 0.941 𝐺𝐸𝑁 − 0.303 𝐹𝐼𝑁𝐴𝑁𝐶 − 1.924 𝐹𝑅𝐸𝐸 (1)
− 1. −0.705 𝐹𝑅𝐸𝐸(3) − 1.368 𝐿𝐼𝐵𝐼𝐷𝑂 (1) − 0.731 𝐿𝐼𝐵𝐼𝐷𝑂 (2)
+ 1.939 𝑆𝑃𝐼𝑅𝐼𝑇𝑈𝐴𝐿 (1) + 0.810 𝑆 (2)

Goodness of Fit Models


The goodness of fit test of the model with five significant predictor variables is written below:
TABLE 3. Hosmer and Lemeshow Test
Step Chi-square df Sig.
1 7,503 8 ,483
Hypothesis:
H0: The model fits
H1: The model not fits
From table sig = 0.483 is greater than 0.15 means that the model fits or appropriate to use the
model.
TABLE 4. Classification Tablea

Predicted
Infidelity Tendency Percentage
Observed
Level Correct
low high
Low 314 65 82,8
Infidelity Tendency Level
Step 1 High 100 208 67,5
Overall Percentage 76,0

The accuracy of clasification is 76%. Model is good enough.

Interpretation of each coefficient included in this model is:


1. Coefficient GEN (0.941)
The risk of high infidelity tendency in male is equal to exp(0.941) = 2.563
times compared to female. The risk of male to have high infidelity tendency
is greater than female.
2. Coefficient FINANC (-0.303)
The risk of high infidelity tendency in the person who have financial problem
is exp(0.303) = 0.738 times compared to person who doesn’t have
financial problem.
The risk of person who have financial problem to have high infidelity
tendency is smaller than person who not have financial problem.
3. Coefficient FREE (1) (-1.924)
The risk of high infidelity tendency in the person who have very low degree

82
of freedom in making friends is exp(−1.924) = 0.146 times compared to
person who have high degree of freedom in making friend.
The risk of person who have very low degree of freedom in making friend to
have high infidelity tendency is smaller than person who have high degree of
feedom in making friends.
4. Coefficient FREE (2) (-1.494)
The risk of high infidelity tendency in the person who have low degree of
freedom in making friends is exp(−1.494) = 0.224 times compared to
person who have high degree freedom in making friends.
The risk of person who have low degree of freedom in making friends to have
high infidelity tendency is smaller than person who have high degree of
feedom making in friend.
5. Coefficient FREE(3) (-0.705)
The risk of high infidelity tendency in the person who have average degree
freedom in making friend is exp(−0.705) = 0.494 times compared to
person who have high degree freedom making in friend.
The risk of person who have average degree of freedom in making friends to
have high infidelity tendency is smaller than person who have high degree
feedom in making friends.
6. Coefficient LIBIDO (1) (-1.368)
The risk of high infidelity tendency in person who have low libido level is
exp(−1.368) = 0.255 times compared to person who have high libido level .
The risk of person who have low libido level to have high infidelity tendency
is smaller than person who have high libido level.
7. Coefficient LIBIDO(2) (-0.731)
The risk of high infidelity tendency in person who have average libido level
is exp(−0.731) = 0.482 times compared to person who have high libido
level.
The risk of person who have average libido level to have high infidelity
tendency is smaller than person who have high libido level.
8. Coefficient SPIRITUAL(1) (1.939)
The risk of high infidelity tendency in person who have low spiritual level is
exp(1.939) = 6.953 times compared to person who have high spiritual level.
The risk of person who has low spiritual level to have high infidelity
tendency is smaller than person who have high spiritual level.
9. Coefficient SPIRITUAL (2) (0.810)
The risk of high infidelity tendency in person who have average spiritual
level is equal to exp(0.810) = 2.249 times compared to person who have
high spiritual level.
The risk of person who have average spiritual level to have high infidelity
tendency is smaller than person who have high spiritual level.

Discussion
Significantly factors that influence infidelity tendency is gender, financial
problem, freedom in making friend, libido level, and spriritual level. Therefore

83
those factors have to be concerned when infidelity is happened. From this
research’s result, the risk of male to have a high infidelity tendency is higher than
female but it is found now that many females do infidelity too. Maybe in the last
era, females often stayed at home and men work outside. So the opportunities for
males to make infidelity is higher compare to females. But now almost males and
females work and meet many other people at their workplace. This fact causes the
tendency of infidelity level is increased for males and females. Otherwise infidelity
can be watched easily in social media. This phenomenon is an important thing.
Possibly the different tendency of infidelity level between males and females will
be greater in the next era. Therefore the similar research must be repeated.
The risk of someone who has financial problems to have a high infidelity
tendency is smaller than people who do not have financial problems. Usually,
people who have financial problems will do anything to fulfill their needs including
sell their self. According to Eko Haryanto, criminologist University of Indonesia in
an article entitled 'When the Prostitution "Online" writhing in Jakarta' stated that
women introduced to prostitution by her boyfriend while the women are in need of
money. Maybe people who has financial problems doesn’t have high tendency to
infidelity. They do it just because they need money. Otherwise, people who doesn’t
financial problems do infidelity because they are willing to do that. Therefore it is
makes sense that the risk of someone who has financial problems to have a high
infidelity tendency is smaller than people who do not have financial problems.
The risk of person who have very low degree of freedom in making friends to
have high infidelity tendency is smaller than person who have high degree of
feedom in making friends. The risk of person who have low degree of freedom in
making friends to have high infidelity tendency is smaller than person who have
high degree of feedom in making friends. The risk of person who have average
degree of freedom in making friends to have high infidelity tendency is smaller
than person who have high degree of feedom in making friends. In other words,
higher level of freedom in making friends will cause higher infidelity tendency
level. Friends are an important factor that influence character. Good friends will
almost always guide to good character but bad friends can break the good
character. When people have high freedom level in making friends, they will have
great probabilities to have friends who practice infidelity. So it is make-sense that
people who have higher freedom in making friends will have higher tendency level
of infidelity.
The risk of person who has low libido level to have high infidelity tendency
is smaller than person who have high libido level. The risk of person who has
average libido level to have high infidelity tendency is smaller than person who
have high libido level. In other words, higher libido level causes higher infidelity
tendency level. This result is very clear, trivial and need not more explanation.
The risk of person who has low spiritual level to have high infidelity
tendency is smaller than person who have high spiritual level. The risk of person
who has average spiritual level to have high infidelity tendency is smaller than
person who have high spiritual level. In other words, lower spiritual level will
result higher infidelity tendency level. People who have high spiritual level have a
fear of God and will have willingness to obey. So they will have lower infidelity
level.

84
3. Concluding Remarks
From this research, it is the conclusion that can made:
1. To counsel client that experience infidelity, gender, financial problem,
freedom in making friends, libido level, and spriritual level must be
considered.
2. Males need to be more careful with their tendency to do infidelity but
females are not immune to infidelity too. The result of this research
shows that the infidelity tendency level of males is higher than infidelity
tendency level of females. It must be remember that the result can be
changed in the next time.
3. People who have much money will have a bigger chance to fall into
infidelity.
4. People who have high libido level must be aware with their friendships,
relationships, movies that they watch, magazines that they read and etc.
They are suggested to have many activities and often spend time to make
personal relationship with God and take a part in religious services.
Remember that high spiritual level will relate to low infidelity tendency
level.

References
[1] L.K. Anna (Ed.), 12 Penyebab Libido Rendah. (2012) Retrieved from:
http://health.kompas.com/read/2012/03/05/11283136/12.
[2] K.D. Cahya. Ketika Prostitusi Online Menggeliat di Jakarta. (2015). Retrieved
from: http://megapolitan.kompas.com/read/2015/04/17/08000011
[3] CHAID and Exhaustive CHAID. Available on:
Algorithmsftp://ftp.software.ibm.com/software/analytics/spss/support/Stats/D
ocs/Statistics\/Algorithms/13.0/TREE-CHAID.pdf
[4] Duniafitnes.com. 2013. Apa yang Diinginkan Wanita dari Seks.
http://duniafitnes.com/news/apa-yang-diinginkan-wanita-dari-seks.html
(accessed on 27 June 2015).
[5] A.A. Ghifari, Gelombang Kejahatan Seks Remaja Modern, Bandung, Mujahid
Press, (2003).
[6] E.B. Hurlock, Psikologi Perkembangan, Jakarta : Erlangga, (2009).
[7] G. V. Kass, An exploratory technique for investigating large quantities of
categorical data, Applied Statistics 29 (2), p. 119-127, (1980).
[8] P.V. Larsen, Master of Applied Statistics ST111: Regression and analysis of
variance. http://statmaster.sdu.dk/maskel/docs/sample/ST111 (accessed on 10
April 2015)
[9] W. Mendenhall and T. Sincich, A Second Course In Statistics: Regression
Analysis (7th ed.), Pearson Education, Inc, (2012).
[10] G. Ritschard, CHAID and Earlier Supervised Tree Methods, (2010). Available
on: www.unige.ch/ses/metri/cahiers/2010_02.pdf
[11] M. Seligman, Authentic Happines, Mizan Pustaka, Bandung , (2005).
[12] R. E. Walpole, Introduction to Statistics (3rd ed), Prentice Hall Professional
Technical Reference, (1982).

85
Proceedings of IICMA 2015
Statistics and Probability

Factors Affecting The Agreement Towards


Legalization of LGBT Marriage Level
Rianti Setiadi1,a), Rosi Melati2,b)
1,2
Department of Mathematics, Faculty of Mathematics and Natural Sciences,
Universitas Indonesia, Depok 16424, Indonesia

a)
rianti@sci.ui.ac.id
b)
rosi.melati17@gmail.com

Abstract: The legalization of LGBT marriage in the United States and the approval from several
countries have more or less raised mixed reactions in other countries that have not legalized
LGBT marriage, including Indonesia. Now, in Indonesia, some people express their support for
the legalization of LGBT marriages, while some others show adverse reaction to the idea. There
are many factors affecting the different perceptions. Some of them which will be investigated in
this research are gender, social lifestyle, lifestyle, spiritual level, organization activities, reading
activities, sympathetic level and agreement of sexual orientation, knowing about sexual
orientation, knowing about legalization LGBT marriage and agreement level of legalization
LGBT marriage. This research is conducted in order to analyze which factors have significant
influence towards the different perception on the legalization of LGBT marriage. Because, the
future of Indonesia lies in the hand of the young generation, this research is conducted among the
University of Indonesia scholars as the sample for the pilot study. The sampling method adopted
Multi Stage Cluster Sampling. The primary method of analyzing is Logistic Regression Model.
Measurement instrument is a self-made and the reliability and validity have been checked already.
The result of this research is aimed to help Indonesian government to make future regulations and
decisions regarding the issue of legalization of LGBT marriage in Indonesia.
Keywords: LGBT orientation, LGBT Marriage Legalization, Multi Stages Sampling, CHAID,
Logistic Regression Model

1. Introduction

In these last several years, homosexual population seem increasing. They have more
dare to show them self as homosexual. Not only in the society, but also in University they
are not ashamed to show their identity as homosexual. According to a survey conducted by
CIA (2013) said that the number of homosexual in Indonesia are 16,625,865 people, which
means about six percent of the population of Indonesia. Some of them show themselves as
homosexual and partly still covered.

On June 26 2015, the US Supreme Court ruled that the US Constitution guarantees
the right for same-sex couples to marry in all 50 US states. Barack Obama decision to
legalized same-sex marriage made horrendous the world. Now, there are 22 countries where
same-sex marriage is legal. The Countries are Netherlands, Belgium, Canada, Spain, South
Africa, Norway, Sweden, Argentina, Iceland, Portugal, Denmark, Brazil, England and
Wales, France, New Zealand, Uruguay, Luxembourg, Scotland, Finland, Ireland and US.

Suryadharma Ali has prediction that someday Indonesia will get a challenge from the
citizens who want to fight for the legalization of same-sex marriage. Maybe, it will happen

86
in the next 10 years, where the leaders at that time are the young people now. The future of
this Country is determined by the young people now, to know their tendency attitude toward
legalization of LGBT marriage and the factors that affecting agreement level of legalization
LGBT marriage.

Throughout this research, these following questions will be answered: What factors
affecting the agreement of legalization LGBT marriage, among scholars in University of
Indonesia. From this research, readers may be able to find the factors that influence
agreement level of legalization LGBT marriage from scholars in University of Indonesia.
By knowing the factors that influence agreement level of legalization LGBT marriage, we
expect that young people can be guided, so that later when they govern the country, they can
solve the challenge of the legalization LGBT marriage marriage issues wisely. The
respondents are scholars of University of Indonesia.

2. Main Results

Research Variables and It’s Definition


1. Gender: Man and Woman.
2. Social lifestyle is a social style to make relation with others. This variable contains
three categories which are free, neutral and bounded by the rules.
3. Faculty
4. Lifestyle is a way of living of individual. This variable is divided into two
categories, which are conventional and modern.
5. Spiritual level is a degree to measure respondent’s relation with God that include
faith, worship, obedience, fellowship and religion’s service activities. This variable
is measured by Likert Scale and then categorized into 2 categories which are low
and high.
6. Organization activities is a variable that measure organization’s activities of the
respondent. This variable is categorized into active and passive.
7. Reading activities is variable that measure intensity of reading the respondent. The
variable is categorize into like and dislike.
8. Sympathetic level is a degree to measure respondent’s concern about someone who
is in need. This variable is measured using Likert Scale and then categorized into 2
categories which are low and high.
9. Agreement level of sexual orientation is a degree to measure respondent’s
agreement level about sexual orientation. This variable is measured using Likert
Scale and then categorized into 2 categories which are low and high.
10. Knowing about LGBT orientation is a variable that measure whether the
respondent know about LGBT orientation or not. This variable categorized into 2
categories which are “Yes” and “No”
11. Knowing about legalization LGBT marriage is a variable that measure whether
respondent know about LGBT marriage legalization or not. This variable is
categorized into 2 categories which are”Yes” and “No”.
12. Agreement level of legalization LGBT marriage is a variable that measure
respondent’s agreement level about legalization LGBT marriage. This variable is
measured using Likert Scale and then categorized into 2 categories which are agree
and disagree.

87
Methodology

Population: Scholars of University of Indonesia.


Sample: 200 scholars from University of Indonesia that are chosen by multistage cluster.
Data will be analyzed by logistic regression method because the dependent variables is
categorical data. Since the predictors variables are categorical variables too, coding variables
are needed. When all variables are entered into the model, it is found that only five variables
influence the dependent variables (Agreement level of legalization LGBT marriage)
significantly. Those variables are Knowing about Sexual Orientation, Agreement Level of
Sexual Orientation, Organization Activities, Knowing about Legalization LGBT Marriage
and Spiritual Level

TABLE 1. Categorical Variables Codes


Parameter
Frequen coding
cy (1)
Knowing about sexual orientation No 30 1.000
Yes 158 .000
Agreement Level of Sexual Orientation Low 122 1.000
High 66 .000
Organization activities Active 152 1.000
Passive 36 .000
Knowing about legalization LGBT NO 108 1.000
marriage YES 80 .000
Spiritual Level LOW 47 1.000
HIGH 141 .000

TABLE 2. Dependent Variable


Encoding
Original Value Internal Value
DISAGREE 0
AGREE 1

The Logistic Regression model uses five significant predictor variables is:
g(x) = 3.452 – 0.5 knowing orientation - 3.83 agree orientation + 0.165
organization activities −1.018 knowing legalization + 1.646 spiritual level

The goodness of fit of the model

TABLE 3. Hosmer and Lemeshow Test


Step Chi-square df Sig.
1 1.534 6 .957

Hypothesis
H0: The model fits
H1: The model not fits
From table, it is shown that sig = 0.957 is greater than 0.10 means that the model fits.

88
TABLE 4. Classification Table

The accuracy of clasification is 81.9%. Model is good.

Interpretation of each coefficient included in this model is:

1. The risk Coefficient knowing about sexual orientation (-0.5)


The risk of a person who agree with legalization LGBT marriage if the person
don’t know about Sexual Orientation is exp(−0.5) = 0.60653 times the risk of a
person who disagree with legalization LGBT marriage if the person do not know
about sexual orientation. Therefore the risk that a person who do not know about
Sexual Orientation to agree to legalization LGBT marriage is smaller than the risk
that a person who don’t know about Sexual Orientation to disagree to legalization
LGBT marriage.
2. Coefficient Agreement of sexual orientation (- 3.83)
The risk of a person agree with legalization LGBT marriage if the person has low
agreement level of sexual orientation is exp(−3.83) = 0.02171 times the risk person
who disagree with legalization LGBT marriage if the person has low agreement
level of sexual orientation. Therefore the risk that a person who has low level of
agreement of sexual orientation to agree legalization LGBT marriage is smaller
than the risk of person who has low agreement level of sexual orientation to
disagree legalization LGBT marriage.
3. Coefficient Organization activities (0.165)
The risk of a person agree with legalization LGBT marriage if the person active in
Organization is exp(0.165) = 1.17939 times the risk of a person who disagree
with legalization LGBT marriage if the person active in Organization. Therefore
the risk of a person who active in organization to agree legalization LGBT
marriage is greater than the risk of a person who active in Organization to disagree
legalization LGBT marriage
4. Coefficient knowing about LGBT legalization marriage (−1.01)
The risk of a person agree with legalization LGBT marriage if the person don’t
know about legalization LGBT married is exp(−1.01) = 0.36422 the risk of a
person who disagree with legalization LGBT marriage if the person don’t know
about legalization LGBT married. Therefore the risk of a person who don’t know
about legalization LGBT marriage to agree legalization LGBT marriage is smaller
than a person who don’t know about legalization LGBT married to disagree
legalization LGBT marriage.

89
5. Coefficient Spiritual Level (1.646)
The risk of a person who agree legalization LGBT marriage if the person has low
spiritual level is exp(1.646) = 5.1862 times the risk of a person who disagree
with legalization LGBT marriage if the person has low spiritual level. Therefore
the risk of a person who has low spiritual level to agree legalization LGBT
marriage is greater than the risk of a person who has low spiritual level to disagree
legalization LGBT marriage.

Discussion
From the research output above, it can be concluded that scholars who tend to agree
legalization LGBT marriage are scholars who active in organization and have low spiritual
level. People who tend to disagree of legalization LGBT marriage are people who don’t know
about orientation LGBT, have low level of orientation LGBT and don’t know about legalization
LGBT marriage issues.
Based on the results above, scholars who active in organization tend to agree to
legalization LGBT marriage. Maybe it is caused by contents of organization activities that are
conducted by scholars. Scholar who active in organization usually tend to make friendship with
many people and have openness mind to every phenomenon that happened around them. If they
have low spiritual level, it is easy for them to agree legalization LGBT marriage without
searching the purpose and the meaning of LGBT definition. They tend to agree because they
don’t have enough knowledge about their faith and easily influenced by the organization
activities like seminar, discussion and etc.
It is interesting that people who disagree legalization LGBT marriage are people who
don’t know about orientation LGBT and legalization LGBT marriage. They disagree without
knowing much about LGBT. Maybe it is caused by their spiritual level. Indonesia is still known
as religious country. That cause the scholars of Universitas Indonesia still held religious norm
which is taught by their parents or influenced by their friends and their society. So, many of
them don’t care about LGBT issues and don’t agree with legalization LGBT marriage.

3. Concluding Remarks
From this research, it is concluded that:
1. To know about agreement level of legalization LGBT marriage, the factors must be
considered are Knowing about Sexual Orientation, Agreement Level of Sexual
Orientation, Organization Activities, Knowing about Legalization LGBT Marriage
and Spiritual Level.
2. Based on the research’s result, University leader must be concerned with
organization’s activities that are conducted in the Campus. The research results that
organization activities influence significantly to agreement of legalization of LGBT
marriage. Good organization will influence the scholars to have good scholar’s
overview but otherwise bad organization will influence to bad scholar’s overview.
3. The research results that knowing about legalization LGBT marriage causes high
agreement level of legalization LGBT marriage. University leaders and scholars
should be wise to face this great homework and think a lot what they have to do.
4. The research also results that spiritual level influence agreement level of legalization
LGBT marriage. Spiritual life must be concerned build a healthy next generation who
is afraid of God.

90
References
[1] CHAID and Exhaustive CHAID Algorithmsftp:
ftp.software.ibm.com/software/analytics/spss/support/Stats/Docs/Statistics/Algo
rithms/13.0/TREE-CHAID.pdf
[2] Changing Attitude on gay marriage:
http://www.pewforum.org/2015/07/29/graphics-slideshow-changing-attitudes-
on-gay-marriage/
[3] T. Clinton and M. Laaser, Sex and Relationship, Penerbit Andi, Yogyakarta,
(2010).
[4] O.G.Encarnación, Gay Rights:Why Democracy Matters, Journal of
Democracy, Vol 25, Number 3, (2014).
[5] Gay Technical Professionals, LGBT Scientist, (2014).
http://www.algbtical.org/2A%20SCIENCE.htm
[6] E. Haney, C. Heilbrun and K. Heilbrun, Lesbian and Gay Parents and
Determination of Child. Custody: The Changing Legal Landscape and
Implications for Policy and Practice, Psychology of Sexual Orientation and
Gender Diversity, Vol.1, No.1, 19 –29, (2014).
[7] T. Jacobs, Gay Men on Campus: Smart, Studious, Involved. (2009). Retrieved
from: http://www.psmag.com/books-and-culture/gay-men-on-campus-smart-
studious-involved-3662
[8] LGBT Right. https://lgbt-rights-hrw.silk.co/
[9] R. Manullang, Apa itu homoseksual.
www.academia.edu/1919853/Apa_itu_Homoseksual
[10] J. B. Mathis, Gay and Lesbian Literature in the Classroom: Can Gay Themes
Overcome Heteronormativity?, Journal of Praxis in Multicultural
Education, Volume 7, No 1, (2013).
[11] R.S. Michael, Counseling the Homosexual, Bethany House Publishers,
Minnesota, (1998).
[12] D. C. Montgomery, E. A. Peck, G. G. Vining, Introduction to Linear
Regression Analysis, 5th Edition, New York, John Wiley & Sons, (2012).
[13] G. A. Seber, Linear Regression analysis, 2nd edition, New York, John Wiley
& Sons, (2003).
[14] J. Simanjuntak, Alasan Remaja Nyoba-nyoba Hubungan Sejenis. (2011).
Retrieved from:
http://kesehatan.kompasiana.com/seksologi/2011/07/09/alasan-remaja-
nyoba-nyoba-hubungan-sejenis-379402.html
[15] Sinyo, Anakku bertanya tentang LGBT, PT Elex Media Komputindo, Jakarta
(2014). Available on:
http://library.binus.ac.id/eColls/eThesisdoc/Bab2/2012-1-00565-
PS%20bab%202.pdf

91
Proceedings of IICMA 2015
Statistics and Probability

Relationship Between Job-Distress and


Demographic Factors Among Employees in
Jakarta
Rianti Setiadi1,a), Titin Siswantining2,b), Baizura Fahma3,c), and
Astari Karamina4,d)
1,2,3,4
Department of Mathematics, Universitas Indoesia
Depok, West Java, Indonesia

a)
rianti@sci.ui.ac.id
b)
titin@sci.ui.ac.id
c)
baizura.fahma@sci.ui.ac.id
d)
astari.karamina@sci.ui.ac.id

Abstract: “Job Stress” is a pressure experienced by workers according in their job. In


psychology job stress is classified into Job Distress which brings a negative effect, such as
discouragement, illness, etc. and Job Eustress which encourage worker to strive, work more
creatively, etc. This research emphasizes more on Job Distress. Job Distress has four sources,
called components, that are Role of Clarity, Influence over Decision, Peer Support and Leader
Support. This difference is presumed to happen because of the demographic factors of the
employees, such as gender, age, length of work, position in job, and industry where the
employee works. This research will investigate the demographic factors that influence each
job distress components. The method used to attain this research purpose is Classification
Tree Method. Classification Tree is a part of CRT Method. CRT is a Statistical Method that
can predicts the value of a target (or dependent variable) based on the values of several input
(or independent variables). Classification Tree is used where the target variable is categorical
variables. Regression Tree is used where the target variable is continuous. The result is
written in the end of this paper. This research is conducted using purposive sample among
workers in Jakarta.
Keywords: Job Distress, Demographic Factors, Purposive Sample, Classification Tree

1. Introduction
In era of AEC (ASEAN Economic Community) the employees have to face
high competition with the other employees. By the opened greater opportunities for
employees from all countries in Southeast Asia, Indonesian employees need to
increase their performance at work. In an effort to increase their performance, they
must overcome the job stress. “Job stress” is a pressure experienced by worker in
carrying out her/his job. According to psychology, there are two types of job stress.
Firstly, job stress that give the positive effect like encourage her/his spirit to work,
work more creatively, etc. This type is called as job eustress. Secondly there is job
stress that give a negative effect like hopeless, illness, etc. This type of job stress is
called as job-distress.
Job-distress needs more attention because it will give a negative effect to the
employees, so it needs more effort to reduce it. There are four sources of job-

92
distress that experienced by employees that are called components of job-distress.
In this research, components of job-distress that will be considered are job distress
that are caused by influence of decision factor, role of clarity factor, peer’s support
factor and leader’s support factor. It is considered that the difference may be
caused by demographic factors of the employees such as gender, age, working
field, job position, and length of work. In this research, the significant demographic
factors that influence each of the component of job-distress for employees will be
determined.
The issue that can be raised in this research is: What are the significant
demographic factors that influence each component of job-distress for employees
in Jakarta? The objective of this research is to determine significant demographic
factors that influence component of job-distress for employees in Jakarta. By
knowing the significant demographic factors that influence each component of job-
distress for employees, the head of the companies and the employees themselves
can know which demographic factors that are most influence their source of their
job distress and make appropriate policies to overcome their job distress.

2. Main Results
2.1. Research Variable and Its Definition
1. Role of Clarity is the certainty of his/her job likes the purpose of the job,
explanation of responsibility, time of work, and etc. This variable is
measured by Likert Scale and then categorized into two categories which
are low and high.
2. Influence of Decision is the empowerment given to the employees in
decision making process, such us the participations of employees to decide
for new development in the company. This variable is measured by Likert
Scale and then categorized into two categories which are low and high.
3. Peer’s Support is how much support of employees’s partner in work when
he/she doing her/his job. This variable is measured by Likert Scale and
then categorized into two categories which are low and high.
4. Leader Support is how much support is given by leader in the company to
employees, such as offering help that is necessary, being a role model for
employees, awareness of leaders for employees’s need, leader’s desire of
listening to the employees and etc. This variable is measured by Likert
Scale and then categorized into two categories which are low high.

2.2. Methodology
Populations: Employees in Jakarta
Sample: 229 Employees in Jakarta that are chosen by purposive sampling
Data will be analyzed by Cluster Classification and Regression Tree (CRT)
Method. Cluster Classification and Regression Tree (CRT) or CART is
classification method which uses historical data to construct decision trees. Based
on available information about the dataset, classification tree or regression tree can
be constructed. Constructed tree can be then used for classification of new
observation [5].

93
Research variables that will be involved in this research are:
1. Gender
Gender of Employees. This variable is classified into male and female
2. Age
Age of Employees when survey was conducted. This variable is classified
into six categorics which are:
22 – 25 years old
26 – 29 years old
30 – 33 years old
34 – 37 years old
More than 37 years old
3. Working Field
Working of field of employees when survey was conducted. This variable
classified into five categories, which are:
Manufacturing
Banking
IT and Telecommunication
Market Research
Others
4. Job Position
Position of Employees in job when survey was conducted. This variable
classified into three categories, which are :
≥ Supervisor
Staff
Others
5. Length of Work
Length of work since an employees starts to work in last company until
survey was conducted. This variable classified into four categories, which
are:
Less than 3 years
3 – 5 years
5 – 8 years
More than 8 years

94
2.3. Research Output

Using CRT, this tree diagrams below are found:


1. CRT for Job-Distress that caused by Role of Clarity

FIGURE 1. Tree diagram of Role of Clarity


Factor
that most influences Job-distress that is caused by Role of Clarity is Age.
Employees who has age 22-25 years 74.4% tends to have low Job-distress
otherwise, employees who has age older than 25 years is divided by Industry factor
where he/she works. Employees who works at banking or manufacturing, 82.6%
tends to have high job-distress, employees who works at IT, telecommunication,
market research and others, 50% tends to have high job-distress.

FIGURE 2. Tree diagram of Influence of Decisions

95
2. CRT for Job-Distress that caused by Influence of Decision

Factor that most influences Job-distress that is caused by Influence of


Decisions is Job Positions. Employees who has Position above Supervisor,70.2%
tends to have high Job-distress. Otherwise, Level of Job-distress of employees who
has position as staff and others is divided by Length of Work. Employees who has
position as staff and others and has Length of Work between 6-8 years, 65.2%
tends to have high job-distress. Employees who has position as staff and others and
has Length of Work between less than five years or more than eight years, 58.4%
tends to have low job-distress.

2. CRT for Job-Distress that caused by Peer’s Support

FIGURE 3. Tree diagram of Peer’s Support

Factor that most influences Job-distress that is caused by Peer’s Support is


Length of Work. Employees who has Length of Work between 6-8 years, 65.6%
tends to have high job-distress. Otherwise, level of job-distress of employees who
has length of work between less than five years or more than eight years is divided
by Age. Employees who has length of work between less than five years or more
than eight years and has age between 22-25 years old or 30-33 years old, 53.3%
tends to have high job-distress. Employees who has Length of Work between less
than five years or more than eight years and has age between 26-29 years old or
more than 34 years old, 62.3% tends to have low job-distress.

Level of job-distress of employees who has length of work between less than
five years or more than eight years and has age between 22-25 years old or 30-33
years old is differentiated by Industry. Employees who has length of work between
less than five years or more than eight years and has age between 22-25 years old
or 30-33 years old and has industry Banking, IT & Telecommunication, or others,
58.1% tends to have high job-distress. employees who has length of work between
less than five years or more than eight years and has age between 22-25 years old
or 30-33 years old and has industry Manufacturing or Market Research, 58.8%
tends to have low job-distress.

96
Level of job-distress of employees who has length of work between less than
five years or more than eight years and has age between 26-29 years old or more
than 34 years old is differentiated by Gender. Female employees who has length of
work between less than five years or more than eight years and has age between
26-29 years old or more than 34 years old, 75.7% tends to have low job-distress.
Male employees who has length of work between less than five years or more than
eight years and has age between 26-29 years old or more than 34 years old, 50%
tends to have high job-distress.

3. CRT for Job-Distress that caused by Leader’s Support

FIGURE 4. Tree diagram of Leader’s Support

Factor that most influences Job-distress that is caused by Leader’s support is


length of work. Employees who has Length of Work between 6-8 years, 68.8%
tends to have high Job-distress. Otherwise, Level of Job-distress of employees who
has length of work less than five years or more than 8 years is divided by Age.
Employees who has length of work less than five years or more than eight years
and has age between 22-33 years old, 52.9% tends to have high job-distress.
Employees who has length of work less than three five years or more than eight
years and has age more than 34 years old, 65.9% tends to have low job-distress.

Discussion
The research results that employees who have high Role of Clarity distress
are employees who are more than 25 years old and work at banking and
manufacturing. Otherwise employees who are less than 25 years are not influenced
with Role of Clarity Distress. They are still young enough and can enjoy do their
work without knowing the clarity of what they have to do. Employees who are
more than 25 years old, are mature enough to do their work. They want to know
what they have to do and what policies behind their work. Especially employees
who work at banking and manufacturing need the clarity of their work. Without the
clarity they will be confused of what they should do daily.

Employees who have high Influence of Decision distress are employees who
have position as supervisor and higher level. It can be understood that they will be

97
distress if they, as supervisor or higher level, are not invited to make policies or
decisions. Otherwise staff employees will not feel distress if they are not invited to
make decisions or policies except they have work at their company for 6-8 years. If
they work less than 6 years, they don’t care whether they are invited to make
decisions or not. That is the same condition with employees who work more than 8
years. They don’t care because they have been accustomed with this condition.

Employees who have high Peer’s Support distress are employees who have
already worked at their company for 6-8 years. Employees who have worked less
than 6 years have low Peer’s Support distress. Maybe employees who have worked
less than 6 years are still supervised by supervisors and still be guided. So, they
don’t feel that they need peer’s support yet. But employees who have worked for
6-8 years have to decide everything by themselves. They need peer’s support to do
that. Employees who have worked more than 8 years should do their tasks
independently. So, they feel that they are not really needed peer’s support. Though
they feel that they are not really needed peer’s support, this research shows that if
they are below 33 years old and work at banking or IT & Telecommunication or
they are males and above 34 years old, they suspect to have peer’s support distress.

Employees who have high Leader’s Support distress are employees who
have worked for 6-8 years at their company. Employees that also have high
leader’s support distress are employees who have worked less than 6 years or more
than 8 years but they are still less than 33 years old. The reason that explain this
issues is the same with the reason for employees who need peer’s support. This
research also notes that females need more leader’s support compare with males.

3. Concluding Remarks

1. The demographic factors that influence Job Distress caused by Role of Clarity
are Age and Industry where the employees work
2. The demographic factors that influence Job Distress caused by Influence of
Decision are Employee Position and Length of Work
3. The Demographic factors that influence Job Distress caused by Peers Support
are Length of Work, Industry where the employee work and Gender
4. The demographic factors that influence Job Distress caused by Leaders
Support are Length of Work, Age and Gender
5. In general, the demographic factors that influence Jod Distress are Length of
Work, Age, Industry where the employees work and Gender
6. When face job-distress, employees and company leaders have to consider
which is the most component distress that influences the employees. They must
know whether the employees distress is caused by the role of clarity, influence
of decision, peer’s support or leader’s support. By knowing the sources of the
Job Distress, best policies can be made to reduce the job distress
7. To help employees who have job distress, company’s leader have consider
especially to employees’ age and length of work. By knowing that, company’s
leader can give appropriate suggestions and policies to help the distress-
employees.

98
References

[1] P. K. Kurnia, The Impact of Stress at Work on Employee’s Psychological


Well-Being in Jakarta, Petra Christian University, (2015)
[2] T. A. Wright and R. Cropanzano, Psychological well-being and job
satisfaction as predictors of job performance, Journal of Occupational Health
Psychology, (2000)
[3] R. Pasca and S. L. Wagner, Occupational Stress in the Multicultural
Workplace, J Immigrant Minority Health, (2011)
[4] M. Bickford, Stress in the Workplace: A General Overview of the Causes, the
Effects, and the Solutions, Canadian Mental Health Association, (2005)
[5] R. Timofeev, Classification and Regression Trees (CART) Theory and
Applications, Humboldt University, (2004)
[6] R. Harrington, Stress, Healthy, and Well-Being Thriving in the 21st Century,
WADSWORTH Cengage Learning, (2013)

99
Proceedings of IICMA 2015
Statistics and Probability

Geographically Weighted Bivariate Negative


Binomial Regression (GWBNBNR)
Ahmad Fatih Basitul Ulum1,a), Purhadi2,b),and Wahyu Wibowo3,c)
1,2,3
Department of Mathematics and Sciences,
Sepuluh Nopember Institute of Technology, Surabaya

a)
ahmadfatih@bps.go.id
b)
purhadi@statistika.its.ac.id
c)
wahyu_w@statistika.its.ac.id

Abstract. Poisson regression is customary model for count data response. However,
the negative binomial regression is more appropriate instead of Poisson regression
when the data presence an overdispersion. Both of which are global regression which
occasionally produce a misleading result due to the spatial heterogeneity. Local
regression that so-called Geographically Weighted Negative Binomial Regression
(GWNBR) is alternatively used for univariate count data with overdisper sion. This
paper provides the parameter estimation and testing hypothesis for parameters of
bivariate count data model namely Geographically Weighted Bivariate Negative
Binomial Regression (GWBNBR).
Keywords and phrases: bivariate count data, overdispersion, spatial heterogeneity, local
regression model.

1. Introduction
In sampling theory, if the population is homogeneous, the small sample is
enough. Whatever the methods and wherever the sample is taken from, it would
guarantee represents the population. On the contrary, if the population is
heterogeneous, we need more advance method and more number of sample in
several locations to describe the true population. In the term of regression, it is
general that it would produces the regression parameters β0,β1,...,βp regardless the
location of observations (global regression). We assume that the parameters is valid
for all locations in the population area. Consequently, if there is non-homogeneous
characteristics based on locations which so-called spatial heterogeneity, the global
regression would not appropriate to represent the population, as in the sampling
theory.
Geographically Weighted Poisson Regression (GWPR) that was defined by
Nakaya et al., allows the spatial modeling of count data. This model produces local
parameters which depend on location of observations, unlike the Poisson
regression which produces the same parameters for all locations. This model have a
set of parameters, β0,β1,...,βp, for each location. As the observation came from
different locations, this model has [(p + 1) × n)] amount of parameters, where p is
the number of predictors and n is the number of observations. This model,
however, have a strictly assumption as in Poisson regression, that is equity of mean

100
and variance (equidispersion). However, overdispersion is more frequent to occure.
If there is presence an overdispersion (variance larger than mean), the standard
error of parameter estimators tend to be underestimate. So, the inference would
produce a misleading results.
Negative binomial distribution as a Poisson-Gamma mixture distribution has
an ability in count data modeling with overdispersion, because it has an additional
parameter, α, known as overdispersion parameter. Poisson regression is the special
case of negative binomial regression when the dispersion parameter (α) is very
close to zero. Developing the Nakaya, et. al. works, Ricardo and Carvalho [2]
define the Geographically Weighted Negative Binomial Regression (GWNBR) in
order to overcome the overdispersion problem in GWPR. In bivariate count data
response, Thola [3] have developed the Geographically Weighted Bivariate
Poisson Regression (GWBPR) as an extended model of GWPR. This model,
however, has a limitation as in GWPR, that is equidispersion. If there is presence
an overdispersion, based on Ricardo and Carvalho and Famoye [4] works, this
paper intend to develop bivariate count data model as an extension of GWNBR
which handle overdispersion on GWBPR. The extension model is namely
Geographically Weighted Bivariate Negative Binomial Regression (GWBNBR).
GWBNBR has 3 groups of parameter which vary over the space, β, α, and λ.
Next, this model would applied in case study of leprosy in East of Java 2012.

2. Main Results
The estimation of parameters in this model is obtained using Maximum
Likelihood Estimation (MLE) method and using Newton-Raphson (NR) algorithm
to solve the derivation of ln-likelihood function.
2.1. Bivariate Negative Binomial Regression (BNBR)
This model is based on bivariate negative binomial (BNB) distribution. The
form of BNB based on Famoye [4] described as
1
1 yk 1 k
2
( kyk ) k k
1 1 1
k 1 ( k ) ( y k 1) k k k k

f ( y1 , y 2 , , , , , λ)
1 2 1 2
[1 λ(e y1 c1 )(e y2
c2 )] ; y1 , y 2 0,1,2,...
0; otherwise
(1)

where ck [1 k ( k 1 1
k ) ] [1 e
1
k( k
1 1
k ) ] , k is dispersion parameter of

Yk, and λ is multiplicative parameter. When λ = 0, the pair of Y1 and Y2 is


independent. When λ > 0 the model allows positive correlation, and when λ < 0 the
model allows negative correlation. The mean and variance of the model is
E(Yk ) k , Var (Yk ) k (1 k 1 k ) .
1 y 1
( y) 1
As Gamma function, ( t ), and ( y 1) y! , equation (1) can
( 1) t 0

rewrite as

101
yk 1
2
( 1
t )( k ) y k
1 1
1 ( yk 1
)
k
( k ) ( k
k k ) k

k 1 t 0 yk !
f() y1i y2 i
[1 λ(e c1 )(e c2 )] ; y1 , y2 0,1,2,... ( 2)
0; otherwise.
Let Y1 and Y2 is random variables from bivariate negative binomial
distribution, and X1,X2,...,Xp is predictors. The bivariate negative binomial
regression (BNBR) relative Y to X is given by
(Y1i , Y2i ) ~ BNBR 1i , 2i , 1, 2 ,λ

yki exp(xTi β k ), i 1,2,..., n; k 1,2 ,


(3)

where xi [1, xi1 , xi 2 ,..., xip ]T and β k [ k0 , k1 ,..., kp ]T .

The Maximum Likelihood Estimation (MLE) method is used to find the parameter
estimators. The likelihood function based on (2) is
n 2 yk 1
( k
1
t )( ki ) yki 1 1
L(β1 , β 2 , 1 , 2 , λ) ( k ) k

i 1 k 1 t 0 yki !
1
1 ( yki ) y1i y2 i
( k ki ) k
[1 λ(e c1 )(e c2 )] .
( 4)

with ki exp( xTi βk ) , ck (1 d k ki ) 1/ k


, and d 1 e 1.
To maximize the likelihood function, we first solve the first derivation of
equation (3) with respect to each parameters and equalize to zero. In this case, it is
difficult to be solved analytically. So, we use the popular method to solve it with
numerical iteration, that is Newton-Raphson (NR) algorithm.

2.2. Geographically Weighted Bivariate Negative Binomial regression


(GWBNBR)
GWBNBR is an extension model of BNBR. In GWBNBR, the parameters,
βk , k and λ is vary over the space. Let (ui,vi) is longitude and latitude coordinate

of data point-i. We denote β k (ui , vi ) β*k , k (ui , vi ) *


k , λ(ui , vi ) λ * and
*
ki (ui , vi ) ki . The GWBNBR relative Y to X is given by
* * * *
(Y1i , Y2i ) ~ BNB 1i , 2i , 1 , 2 , λ*

yki exp(xTi β*k ), i 1,2,..., n; k 1,2 ,


(5)

where xi [1, xi1 , xi 2 ,..., xip ]T , β*k [ k0 (ui , vi ), k1 (ui , vi ),..., kp (ui , vi )]T . The
likelihood function for GWBNBR is given by

102
yk 1 * 1 * yki
* * * * *
n 2
( ki t )( ki ) * 1 * 1
L(β , β ,
1 2 1 , 2 ,λ ) ( ki ) ki

i 1 k 1 t 0 yki !
* 1
* 1 * ( yki ki ) y1i y2 i
( ki ki ) [1 λ * (e c1* )(e c2* )] wi ,
( 6)
*
*
with ki exp( xTi β*k ) , ck* (1 d *
k
*
ki ) 1/ k
, and wij is weighting matrix. The
weighting matrix is calculated based on location by kernel function. In this paper
we use adaptive bi-square kernel function given by
[1 (di bi ) 2 ]2 ; for di bi ; i, 1,2,...,n.
wi , (7 )
0; for di bi
where dij is euclidean distance between location-i and ℓ, and bi is a bandwidth in
location-i, the radius which one location can influence to another location.
The logarithmic form of equation (6) is

2.3. Parameter estimation of GWBNBR


*
We have 3 parameters in GWBBRg, λ * , k , and β*k . Using MLE, we do the
first derivation of (9) with respect to β*k and equalize to zero. The first derivation is

ln L(β1* , β*2 , *
1 , *
2 , λ * ) Q*

n 2 yki 1
* 1 * 1
Q* ln( k t ) yki ln( *
ki ) ln yki ! k ln( *
k )
i 1 k 1 t 0 (8)
* 1 * 1 * y1i y2 i
- ( yki k ) ln( k ki ) ln[1 λ * (e c1* )(e c2* )] wi ,

*
with ki exp( xTi β*k ) .

*
Q* n ( y1i 1i ji )x λ * (e y2i c2* ) c1*
wi 0
β1*j i 1 (1 * *
1 1i ) [1 λ * (e y1i c1* )(e y2i c2* )] β1*j
*
Q* n ( y 2i 2i ) x ji λ * (e y1i c1* ) c2*
wi 0
β2* j i 1 (1 *
2
*
2i ) [1 λ * (e y1i c1* )(e y2i c2* )] β2* j
y1i 1 * 2
Q* n
1 * 2 * * 2 * 1 *
* 1 ln 1 1 [ln( 1 1i ) 1]
* 1
1 i 1 t 0 ( 1 t)
* 2 * 1
1 ( y1i 1 ) λ * (e y2 i c2* ) c1*
wi 0
* 1 * [1 λ * (e y1i c1* )(e y2 i c2* )] *
( 1 1i ) 1

103
y2 i 1 * 2
Q* n
2 * 2 * * 2 * 1 *
* 2 ln 2 2 [ln( 2 2i ) 1]
* 1
2 i 1 t 0 ( 2 t)
* 2 * 1
2 ( y 2i 2 ) λ * (e y1i c1* ) c2*
wi 0
* 1 * [1 λ * (e y1i c1* )(e y2 i c2* )] *
( 2 2i ) 2

Q* n
(e y1i c1* )(e y2 i c2* )
wi 0
λ* i
*
1 [1 λ (e
y1i
c1* )(e y2 i
c2* )]

ck* dck* x ji *
ck* * 1 * 1 * * * * *
with ki
, * k [ k ln(1 d k ki ) d ki (1 d k ki )]ck*
βkj* (1 d k* *
ki ) k

Equations in (10) can’t be solved analytically, so it needs numerical iterative


method, such as Newton-Raphson (NR) method. The NR method can be described
as
θˆ ( r 1)
θˆ ( r ) H 1 (θˆ ( r ) )g(θˆ ( r ) ) , where
( 0) T
θˆ *
T T
β1* *
1 β*2 *
2 λ*
T
Q* Q* Q* Q* Q*
g(θˆ * ) T * T *
β1* 1 β*2 2 λ*
2
Q*
T
symmetric
β1* β1*
2 * 2
Q Q*
* *
1 β1 ( 1* ) 2
2
2 *
Q 2 *
Q Q*
H(θˆ * ) T T
β*2 β1* β 2 1*
*
β*2 β*2
2 * 2 * 2 * 2
Q Q Q Q*
* * * * * *
2 β1 2 1 2 β2 ( 2* ) 2
2 * 2
Q Q* 2 *
Q 2 *
Q 2
Q*
λ β1*
*
λ* *
1 λ β*2
*
λ * *
2 ( λ* )2
The Hessian matrix, H, contains the second partial derivative of Q* with
respect to each parameter. We not show the second derivation in here for
simplicity. Iteration would be stopped when θˆ ( r 1) θˆ ( r ) , where is very small
positive real number.

2.4. Testing of Hypothesis


First, we perform the model significance test.
H0 : k1 (u , v ) k2 (u , v ) kj (u , v ) 0 ; j 1,2,..., p; k 1,2; 1,2,...,n.
H1 : at least one of kj (u , v ) 0.

104
We calculate the deviance statistic which denoted D 2 ln [ L( ˆ ) L( ˆ )]. L( ˆ )
calculated from equation (6) with *
exp( k*0 ) , and L( ˆ ) is equation (6) with
ki
* * *
ki exp( x
k1 i kj xi ) . D follows chi-square distribution with degrees of

D0 df0
freedom (p -1), p is number of parameter. If F ;df0 ,df1 , the model is
D1 df1
significance, where 0 and 1 symbol is describe the condition under H 0 and H 1
respectively.
Second, we do the partial test of parameters.
H0 : kj (u , v ) 0 ; j 1,2,..., p; k 1,2

H1 : at least one of kj (u , v i ) 0 .

Using the z-test, we calculate the ˆ / se( ˆ) . The standard error, se( ˆ) , obtained
from the square root of main diagonal of minus inverse of Hessian matrix at the
last iteration of NR, H (θˆ ) . If ˆ / se( ˆ) > Z , the parameter is significant.
1 (r )
2

2.5. Goodness of Fit


We apply the value of deviance to compare goodness of fit of GWBNBR
with other models. Dev 2[Q* ( ˆ ) Q* ( ˆ )] , where Q* ( ˆ ) is the ln-likelihood
function of GWBNBR under the model and Q* ( ˆ ) is the ln-likelihood function of
GWBNBR where y-value replacing the in the calculation. Model with minimum
deviance indicate the better model.

References
[1] T. Nakaya, A. S. Fotheringham, C. Brunsdon, and M. Charlton,
Geographically Weighted Poisson Regression for Disease Association
Mapping, Statistics in Medicine, 24, 2695-2717, (2005)
[2] A. S. Ricardo and T. V. R. Carvalho, Geographically Weighted Negative
Binomial Regression-incorporating overdispersion, Stat. Comput., 1-15,
(2014)
[3] M. I. Thola, Penaksiran Parameter dan Pengujian Hipotesis Model
Geographically Weighted Bivariate Poisson Regression (GWBPR), Tesis,
Insitut Tekonologi Sepuluh Nopember, Surabaya, (2015)
[4] F. Famoye, On The Bivariate Negative Binomial Regeression Model,
Journal of Applied Statistics, 37(6), 968-981, (2010)

105
Proceedings of IICMA 2015
Statistics and Probability

The Poisson Inverse Gaussian (PIG) Regression


in Modeling The Number of HIV New Cases
(Case Study in East Java Province in 2013)
Sayu Made Widiari1,a), Purhadi2,b), I Nyoman Latra3,c)
1,2,3
Department of Statistics,
Sepuluh Nopember Institute of Technology, Surabaya

a)
smade84@gmail.com
b)
purhadi@statistika.its.ac.id
c)
i_nyoman_latra@statistika.its.ac.id

Abstract. The number of HIV new cases is one of the count data which is always modeled with a
simple Poisson regression. But in many real application, the simple Poisson regression fails to
describe the data since, usually, the sample variance is larger than the sample means, known as
overdispersion, as required by the simple Poisson regression. Poisson Inverse Gaussian (PIG)
regression was introduced as one of the mixture Poisson regression allowing for overdispersion.
PIG has a nearly closed form likelihood function that its parameter estimation with maximum
likelihood estimation (MLE) and statistical hypothesis test with maximum likelihood ratio test
(MLRT) are easily to be obtained. Since then, PIG regression has been widely used in many
researches. Therefore, this paper will focus on modeling the number of HIV new cases with the
Poisson Inverse Gaussian (PIG) regression. By using backward elimination procedure, the result
of the modeling shows that percentage of population with high school education level and higher,
percentage of couple with childbearing age who use condom and ratio of health facilities per
100,000 population significantly influence the number of HIV new cases in East Java in 2013
Keywords and Phrases: Poisson Inverse Gaussian, MLE, Parameter Estimation, Statistical
Hypothesis Testing, MLRT, HIV.

1. Introduction
The development of research with count data has led to the use of
Generalized Linear Modeling (GLM) to model the count data. GLM is the
generalization of the classical normal linear model, by relaxing some of its
restrictive assumptions, and provides methods for the analysis of non-normal data
[4]. Poisson regression is one of the members of GLM family which comes from
Poisson distribution. The Poisson distribution is a discrete distribution that takes on
a probability value only for nonnegative integers; this characteristic of the Poisson
distribution makes it an excellent choice for modeling count data. Poisson
distribution is specified by only one parameter μ which defines both the mean and
the variance equal μ. That the mean and variance are equal will be useful as the
assumption in Poisson regression, known as equidispersion.
But there are some situations in which the count data do not meet the
assumption of Poisson regression, where the variance is smaller than the mean,
known as underdispersion or the variance is greater than the mean, known as

106
overdispersion. But most of the count data found to be overdispersion [2]. In
practice, count data frequently depart from Poisson distribution due to the existence
of many zeros data or larger frequency of extreme observations, or both, resulting
in spread (variance) greater than the mean in the observed distribution [7]. Data
with many zeros can be caused by structural zeros or sampling zeros. Structural
zeros happen when the zeros come from unit observation which never exhibits the
event but the sampling zeros happen when those zeros happen by chance.
Overdispersion cases can cause many effect if it is ignored. It will mislead
the interpretation of the model caused by underestimation of standard error, for
example a predictor variable may appear to be significant when in fact it is not
significant [6]. There are some models which build from the mixture of Poisson
distribution with discrete or continuous distribution, known as mixture Poisson
distribution. Some of the mixed Poisson distribution have been found to be
alternative solution for overdispersion cases, but only few of them are often applied
due to its simple computation. One of the mixed Poisson distribution is Poisson
Inverse Gaussian (PIG) distribution, as the mixture of Poisson distribution and the
Inverse Gaussian distribution.
This distribution was first introduced by Holla in 1966. It is the form of
Siche distribution with two parameters. Siche distribution (SI) itself is known as
the better model than negative binomial, especially for the highly skewed to the
right count data, but it is more complicated to be applied since the probability
density function contains three parameters so the estimation is more difficult [3].
As the type of SI, PIG distribution may also flexible to handle the count data with
longer tails but only characterized by two parameters. Therefore, the likelihood
function can be easily obtained and nearly close form, which means the parameter
estimation is quite simple. Despite of this simplicity, many researches had applied
this model.
Wilmot [10] had shown the potency of modeling with Poisson Inverse
Gaussian (PIG) regression as an alternative to the negative binomial regression in
automobile insurance claim data. Six set of data were presented with many zeros
and concluded that PIG regression performed better than negative binomial
regression model. The other research which indicated PIG model was better than
negative binomial is research from Shoukri, Asyali dan Vandorp [8] with mastitis
data collected from 57 farms in Ontario, Canada.
PIG regression model can also be applied in modeling motor insurance
data and modeling in highway safety. Recently, Zha, Lord and Zou [11] had
modeled the crash data occur in Texas and Washington with PIG regression model.
Those data are right skewed with lots of zeros which indicates over dispersion. The
research lead to the conclusion that the PIG model showed better model than
negative binomial model in analyzing the crash data.
Decreasing the number HIV cases has become one of the goal of many
provinces in Indonesia. The number of HIV cases found in a year can be used as a
benchmark for the success of their government on inhibiting the spread of the HIV
virus. Increasing the quality of the health facilities and the number of health
workers is not a sufficient effort when education and others social economy factors
also influence the HIV spread. East Java Province is one of the province which is
striving to reduce the spread of HIV cases.
The number of HIV new cases in East Java Province in 2013 is likely to be
overdispersion that Poisson regression is not a fit model for it. The number of HIV
new cases found in an area can be zero when there is no HIV cases found in a year.

107
On the contrary, it can be a large value when there are many cases found in a year.
Reciprocally, the number of HIV new cases consists of many zeros and some of
them are large values in 38 regencies in East Java. Hence, this study aims to model
the number of HIV new cases in East Java in 2013 with PIG regression. The
parameter estimation is studied with the maximum likelihood estimation (MLE)
and the statistical hypothesis test is applied with the maximum likelihood ratio test
(MLRT).

2. Main Results
This section is divided into five parts. Firstly, general derivation of PIG
distribution is briefly discussed. Then, the PIG regression with its parameter
estimation and its statistical hypothesis test is described in the next part. Lastly, an
application of PIG regression in the number of HIV new cases is discussed.
2.1. Poisson Inverse Gaussian Distribution
Let 𝑌 be a response variable and has a Poisson distribution with mean 𝜇.
We consider mixed Poisson regression models which had been used by Dean,
Lawless and Willmot [3] with
∞ [𝑣𝜇] 𝑦
𝑃(𝑌 = 𝑦|𝑥) = ∫ 𝑒 −𝑣𝜇 𝑔(𝑣)𝑑𝑣 , 𝑦 = 0,1, …, (2.1)
0 𝑦!

We assume a random effect 𝑣 has a probability density function 𝑔(𝑣) which


follows an Inverse Gaussian distribution,
1 −(𝑣−1)2
𝑔(𝑣) = (2𝜋𝜏𝑣 3 )−2 𝑒 2𝜏𝑣 ,𝑣 > 0 (2.2)

where 𝐸(𝑣) = 1 and the parameter 𝜏 is unknown and equal to 𝑉𝑎𝑟(𝑣). The
distribution of 𝑌 given 𝑥 resulting from (2.1) is then a Poisson Inverse Gaussian
(PIG) distribution with mean and variance 𝜇 and 𝜇 + 𝜏𝜇2 , respectively or it can be
written as 𝑌~𝑃𝐼𝐺(𝜇, 𝜏).
From (2.1) and (2.2) the probability density function of 𝑃𝐼𝐺(𝜇, 𝜏) can be seen as
1
𝑝(0) = 𝑒𝑥𝑝 (𝜏 −1 {1 − (1 + 2𝜏𝜇)2 })
1
𝑝(1) = 𝜇(1 + 2𝜏𝜇)−2 𝑝(0)
2𝜏𝜇 3 𝜇2 1
𝑝(𝑦) = (1 − ) 𝑝(𝑦 − 1) + − 𝑝(𝑦 − 2),
1 + 2𝜏𝜇 2𝑦 1 + 2𝜏𝜇 𝑦(𝑦 − 1)

𝑦 = 2, 3, … (2.3)

2.2. Poisson Inverse Gaussian (PIG) Regression


Following the generalized linear model approach, we relate the parameters
𝜇𝑖 to the covariates 𝑥𝑖 through the log-link function so that
𝑇𝜷
log 𝜇𝑖 = 𝒙𝑖 𝑇 𝜷 or 𝜇𝑖 = 𝑒 𝒙𝒊 (2.4)

108
where
𝑥 are the predictor variables and can be denoted as
𝑥𝑖 = [1 𝑥1𝑖 𝑥2𝑖 ⋯ 𝑥𝑘𝑖 ]𝑇
𝛽 are regression parameters and can be denoted as
𝛽 = [𝛽0 𝛽1 𝛽2 ⋯ 𝛽𝑘 ]𝑇

2.3. Parameter Estimation of PIG Regression


There are some ways to do the parameter estimation. Regression
parameters of PIG can be obtained through the Maximum Likelihood Estimation
(MLE) method. The following are the steps to estimate the regression parameters:
1. Determine the likelihood function based on the PIG distribution
𝑛

𝐿(𝛽, 𝜏) = ∏ 𝑃𝑟(𝑌 = 𝑦𝑖 |𝑥𝑖 ; 𝛽, 𝜏) (2.5)


𝑖=1
2. Determine the log-likelihood function from the equation (2.5)
𝑛

𝑙(𝛽, 𝜏) = ∑ log 𝑃𝑟(𝑌 = 𝑦𝑖 |𝑥𝑖 ; 𝛽, 𝜏)


𝑖=1

3. Consider 𝜇𝑖 = 𝜇(𝑥𝑖 ; 𝛽) and for 𝑖 = 1, … , 𝑛,


𝑝 (𝑦 +1)
𝑡𝑖 (𝑦𝑖 ) = (𝑦𝑖 + 1) 𝑖𝑝 (𝑦𝑖 ) , 𝑦 = 0,1,2, … (2.6)
𝑖 𝑖
The manipulation from (2.3) and (2.6) can be expressed as
𝑛 𝑦𝑖 −1
1
𝑙(𝛽, 𝜏) = ∑ {log ( ) + log 𝑝𝑖 (0) + 𝐼(𝑦𝑖 > 0) ∑ log 𝑡𝑖 (𝑗)} , (2.7)
𝑦𝑖 !
𝑖=1 𝑗=0
4. Then the parameters estimation can be obtained from its first derivatives
which is made equal to zero and maximize its second derivatives (infinite
negative)
𝑛
𝜕𝑙 1 𝜕𝜇𝑖
𝑢𝑟 = = ∑{𝑦𝑖 − 𝑡𝑖 (𝑦𝑖 )} ( ), 𝑟 = 1, … , 𝑘,
𝜕𝛽𝑟 𝜇𝑖 𝜕𝛽𝑟
𝑖=1
𝑛
𝜕𝑙 1 + 𝜏𝜇𝑖 1
𝑢𝑘+1 = = ∑( ) 𝑡𝑖 (𝑦𝑖 ) − (1 + 𝜏𝑦𝑖 ) 2 ,
𝜕𝜏 𝜇𝑖 𝜏
𝑖=𝑙
𝑛
−𝜕 2 𝑙 1 𝜕𝜇𝑖 𝜕𝜇𝑖
𝐼𝑟𝑠 = = ∑ {{𝑦𝑖 − 𝑡𝑖 (𝑦𝑖 )𝑡𝑖 (𝑦𝑖 + 1) + 𝑡𝑖 (𝑦𝑖 )2 } 2 ( )( )
𝜕𝛽𝑟 𝜕𝛽𝑠 𝜇𝑖 𝜕𝛽𝑟 𝜕𝛽𝑠
𝑖=1
1 𝜕 2 𝜇𝑖
+ {𝑦𝑖 − 𝑡𝑖 (𝑦𝑖 )} ( )} , 𝑟, 𝑠 = 1, … , 𝑘,
𝜇𝑖 𝜕𝛽𝑟 𝜕𝛽𝑠
𝑛
−𝜕 2 𝑙 1 + 𝜏𝜇𝑖 𝑡𝑖 (𝑦𝑖 ) 𝜕𝜇𝑖
𝐼𝑟,𝑘+1 = = ∑ ({𝑡𝑖 (𝑦𝑖 + 1) − 𝑡𝑖 (𝑦𝑖 )} − 1) ( ),
𝜕𝛽𝑟 𝜕𝜏 𝜏𝜇𝑖 𝜏𝜇𝑖 𝜕𝛽𝑟
𝑖=1

𝑟 = 1, … , 𝑘,

109
−𝜕 2 𝑙
𝐼𝑘+1,𝑘+1 =
𝜕𝜏 2
𝑛
3 + 2𝜏𝜇𝑖 2 + 𝜏𝑦𝑖
= ∑ (𝑡𝑖 (𝑦𝑖 ) −
𝜏 3 𝜇𝑖 𝜏3
𝑖=1
(1 + 𝜏𝜇𝑖 )2
− 𝑡𝑖 (𝑦𝑖 ){𝑡𝑖 (𝑦𝑖 + 1) − 𝑡𝑖 (𝑦𝑖 )})
𝜏 4 𝜇𝑖 2
5. The equations above are implicit and nonlinear equation in 𝜷, therefore the
function should be maximized by some iteration to obtain the parameter
estimation of 𝜷. The iteration model that can be used is Newton-Raphson
iteration [3]. The equation can be expressed as
̂ (𝑚+1) = 𝜽
𝜽 ̂ (𝑚) − 𝑯−1 (𝜽
̂ (𝑚) )𝒈(𝜽
̂ (𝑚) )
̂ (𝑚) is parameter estimation from 𝑚𝑡ℎ iteration and 𝒈(𝜽
𝜽 ̂ (𝑚) is the gradien
̂ (𝑚) . The initial parameter estimation use Ordinary
vector with parameters 𝜽
Least Square (OLS) method. The iteration will stop if ‖𝜽 ̂ (𝑚+1) − 𝜽
̂ (𝑚) ‖ <
𝜀, 𝜀 > 0

2.4. Statistical Hypothesis Test of PIG Regression


The statistical hypothesis about parameters in PIG regression can be
obtained by using Maximum Likelihood Ratio Test (MLRT). The statistical
hypothesis in PIG regression involves the statistical test for parameters 𝛽
simultaneously and the statistical test for parameters 𝛽 and 𝜏 partially.
Simultaneous statistical test
The hypothesis is that
𝐻0 : 𝛽1 = 𝛽2 = ⋯ = 𝛽𝑘 = 0
𝐻𝑎 : at least one 𝛽𝑙 ≠ 0, with 𝑙 = 1, 2, … 𝑘
The statistical test can be obtained with the likelihood ratio statistics formed by
determining the set parameters under population (Ω), i.e. Ω = {𝛽, 𝜏} and the set
parameters under true 𝐻0 (𝜔), i.e. 𝜔 = {𝛽0 , 𝜏}. The likelihood function for
saturated model involves all predictor variables (𝐿(Ω)) is formed, while on the set
parameters under true 𝐻0 we form the likelihood function that exclude all the
predictor variables (𝐿(ω)). It can be written as follows
𝑛 𝑛

𝐿(Ω) = ∏ 𝑃𝑟(𝑌 = 𝑦𝑖 |x𝑖 ; β, τ) 𝐿(ω) = ∏ 𝑃𝑟(𝑌 = 𝑦𝑖 |x𝑖 ; 𝛽0 , 𝜏)


𝑖=1 𝑖=1

Both of the likelihood functions are compared in the form of devians as follows
𝐿(𝜔
̂)
𝐺 = −2 ln ( )
̂)
𝐿(Ω
The values of 𝐿(𝜔 ̂) and 𝐿(Ω̂ ) are the maximum likelihood values from
each model where 𝛽̂ and 𝜏̂ are the result from estimation in subsection 2.3. G’s
statistic is the approximation from chi-square distribution 𝜒 2 with degrees freedom
2
of 𝑣, that the testing criteria is reject 𝐻0 if 𝐺 > 𝜒(𝛼,𝑣) where 𝑣 is the degrees

110
freedom obtained by substracting the number of parameters under population by
the number of parameters under true 𝐻0 .
Partial statistical test
The hypothesis for parameter 𝛽 is that
𝐻0 : 𝛽𝑙 = 0
𝐻𝑎 : 𝛽𝑙 ≠ 0, 𝑤𝑖𝑡ℎ 𝑙 = 1, 2, … , 𝑘
The statistical test for testing significance of parameter 𝛽 is
𝛽̂𝑖
𝑇=
𝑆𝐸(𝛽̂𝑖 )
The testing criteria is reject 𝐻0 if |𝑇| is greater than 𝑡(𝛼) where 𝛼 is the
2
significance level. 𝑆𝐸(𝛽̂𝑖 ) is (𝑖 + 1)𝑡ℎ diagonal element in matrix 𝑣𝑎𝑟(𝛽̂ ) and
𝑣𝑎𝑟(𝛽̂ ) = −𝐸(𝐻 −1 (𝛽̂ )). The same hypothesis and statistical test are also applied
in partial hypothesis test for 𝜏.

2.5. Example
This subsection describes dataset used to analyze the performance of PIG
regression. Dataset was collected from the East Java Province Public Health
Service Office and East Java Statistics Office about the number of new HIV cases
found in East Java province in 2013. The dataset is right skewed with lots of zeros.
The description of the data are shown in table 1 below.
TABLE 1. The description of the data
Variable Description
Y The number of new HIV cases
X1 Percentage of poor people
X2 Percentage of population with high school education level and
higher
X3 Percentage of couple with childbearing age who use condom
X4 Ratio of health workers per 100,000 population
X5 Ratio of health facilities per 100,000 population
X6 Percentage of urban areas
X7 Percentage of population in age group 25-34 years old

The dataset included 38 unit observations with skewness 3.675 and among
the observations 44.7 percent of them reported zero events. As a rule of thumb, the
distribution is considered highly skewed when the absolute value of skewness is
larger than one (Zha, Lord and Zou, 2014). The overdispersion test which tests the
null hypothesis of equidispersion in Poisson GLMs against the alternative of
overdispersion also showed that there is evidence of overdispersion in the data
since the p-value is smaller than 𝛼 = 0.1 meaning reject the null hypothesis. The
summary statistics for the dataset are displayed in table 2 below.

111
TABLE 2. Summary Statistics of the Variables of the Dataset
Variables Minimum Maximum Mean Standard Deviation
Y 0 1,278 122.66 231.642
X1 4.75 26.97 12.7208 5.20279
X2 7.22 56.01 25.7379 12.71099
X3 0.34 7.06 1.8374 1.38053
X4 4.66 150.26 29.5366 34.32929
X5 0.52 15.37 3.9487 3.04548
X6 8.99 100.00 43.2424 31.53698
X7 13.69 19.71 15.9105 1.27722

TABLE 3. Variance Inflation Factor (VIF) of Predictor Variables


Variables VIF
X1 2.711
X2 8.453
X3 3.665
X4 5.810
X5 1.655
X6 6.682
X7 1.971

Before modeling the variables, checking the multicollinearity is a must to make


sure that there is no two or more of the predictors in a regression model are
moderately or highly correlated. As shown in the table 3, the VIF of each predictor
variable is smaller than 10 which indicate that there is no multicollinearity between
the predictor variables and modeling with PIG can be continued.
The model establishment in this study is implemented with the backward
elimination procedure which aims to obtain the fit model based on its Akaike
Information Criterion (AIC).
TABLE 4. Backward Elimination in PIG Regression
Model Akaike Information Criterion (AIC)
Y~X1, X2, X3, X4, X5, X6 , X7 373.76
Y~X1, X2, X3, X5, X6,, X7 371.76
Y~X1, X2, X3, X5, X6 369.76
Y~X2, X3, X5, X6 368.61
It is shown clearly in table 4 that deleting some predictors in the model has
decreased the AIC. Dropping X1, X4, and X7 has led to a new model with the
lowest AIC. The last model is the best model because dropping the variables in the
last model would not give the lower AIC.

112
TABLE 5. Parameter Estimation PIG Regression in the Number of HIV New
Cases in East Java in 2013
Parameter Estimate Standard Error T-value
𝛽0 0.45489 1.67569 0.271
𝛽2 0.18876 0.06724 2.807*
𝛽3 -1.82232 0.66707 -2.732*
𝛽5 0.35167 0.18790 1.872*
𝛽6 0.02792 0.01872 1.491
*significant in 𝛼=0.1
Based on the T-value in table 5, there are three predictor variables which
significantly affect the number of HIV new cases. Those are percentage of
population with high school education level and higher, percentage of couple with
childbearing age who use condom and ratio of health facilities per 100,000
population. The model based on parameter estimation in table 5 is
𝜇̂ = exp(0.45489 + 0.18876𝑋2 − 1.82232𝑋3 + 0.35167𝑋5 + 0.02792𝑋6
The percentage of population with high school education level and higher
positively associated with the number of new HIV cases. More spesifically, one
percent increase in percentage of population with high school education level and
higher for instance can result in exp(0.18876) = 1.20751 times increasing of prior
average of the HIV new cases frequency. This is reasonable since the HIV cases
mostly happen in the high level lifestyle in which most people on it are people with
high level education. Similar analysis also occur in variable ratio of health facilities
per 100,000 population and percentage of urban areas. One percent increase in
ratio of health facilities per 100,000 population for instance can result in
exp(0.35167) = 1.421439 times increasing of prior average of the HIV new cases
frequency. One percent increase in percentage of urban areas for instance can
result in exp(0.02792) = 1.028313 times increasing of prior average of the HIV
new cases frequency. On the contrary, one percent increase in percentage of couple
with childbearing age who use condom for instance can result in exp(-1.82232) =
0.16165 times decreasing of prior average of the HIV new cases frequency.

3. Concluding Remarks
Considering its easiness of computation of parameter estimation with MLE
and statistical hypothesis test with MLRT, PIG regression turns out to be a
potential alternative in modeling the count data with overdispersion. Using the
backward elimination procedure, it is concluded that the best model is the model
which involves percentage of population with high school education level and
higher, percentage of couple with childbearing age who use condom, ratio of health
facilities per 100,000 population and percentage of urban areas as the predictor
variables.

113
References
[1] Statistics of East Java, The 2013 National Social and Economy Survey
Result , Surabaya: Statistics of East Java, (2013)
[2] P.C. Consul and F. Famoye, Generalized Poisson Regression Model,
Communications in Statistics - Theory and Methods. Vol. 21, No. 1, pp 89-
109, (1992)
[3] C. Dean, J. F. Lawless, and G.E. Willmot, A Mixed Poisson-Inverse-
Gaussian Regression Model, The Canadian Journal of Statistics. Vol. 17,
No. 2, pp 171-181, (1989)
[4] P. De Jong and G.Z. Heller, Generalized Linear Models for Insurance Data,
1st edition, Cambridge University, Press., New York, (2008)
[5] East Java Province Public Health Service Office, Health Profile of East Java
Province in 2013, Surabaya : East Java Province Public Health Service
Office, (2014)
[6] J.M Hilbe, Negative Binomial Regression, 1st edition, Cambridge
University, Press., New York, (2007)
[7] M.C. Hu, M. Pavlicova, and E.V. Nunes, Zero-Inflated and Hurdle Models
of Count Data with Extra Zeros: Examples from an HIV-Risk Reduction
Intervention Trial, The American Journal of Drug and Alcohol Abuse. Vol.
37, pp 367-375, (2011)
[8] M.M. Shoukri, M.H. Asyali, R. Vandorp, and R. Kelton, The Poisson
Inverse Gaussian Regression Model in the Analysis of Clustered Counts
Data, Journal of Data Science. Vol. 2, No. 2, pp 17-32, (2004)
[9] M. Stasinopoulus and B. Rigby, Generalized Additive Models for Location
Scale and Shape, Journal of Statistical Software. Vol. 23, pp 1-46, (2007)
[10] G.E. Willmot, The Poisson-Inverse Gaussian Distribution as An
Alternative to the Negative Binomial, Scandinavian Actuarial Journal.
Vol. 3, No. 4, pp 113-127, (1987)
[11] L. Zha, D. Lord, and Y. Zou, The Poisson Inverse Gaussian (PIG)
Generalized Linear Regression Model for Analyzing Motor Vehicle Crash
Data, Journal of Transportation Safety and Security.
DOI:10.1080/19439962.2014.977502, (2014)

114
Proceedings of IICMA 2015
Statistics and Probability

Spatial Analysis and Modelling of Village Level


Poverty (A Case Study: Poverty Modelling in
Pati Regency)
Duto Sulistiyono1,a), Ismaini Zain2,b), and Sutikno3,c)
1,2,3
Department of Statistics, FMIPA, Institut Teknologi Sepuluh Nopember
Jl. Arief Rahman Hakim, Surabaya 60111, Indonesia

a) duto.sulistiyono@bps.go.id
b) ismaini_z@statistika.its.ac.id
c) sutikno@statistika.its.ac.id

Abstract. Poverty issues require special attention from government, especially local
government. Since decentralization, local governments were given authority to decide policy
including the poverty alleviation in their regions. It is necessary for local government to have
poverty data in small area (village level). However, poverty data which were required by local
governments could not be presented to small area. Therefore, an accurate model is needed for
modelling poverty to small area. This study uses a spatial regression model that considers a spatial
aspect to solve the problem. Spatial weighting used in this study was customized contiguity with
main business field approach in every village. The aim of this study is to get the best model for
village level poverty in Pati Regency, Central Java, Indonesia. The result showed that there was a
spatial lag autocorrelation ( ) on poverty. Thus, the Spatial Autoregressive (SAR) model was
fitted in modelling. The result of SAR model obtained 7 predictor variables that significantly
affected to village level poverty. Three variables are characteristic of individuals and households
while four variables are characteristic of regions. Thus, SAR model fits for modelling village
level poverty in Pati Regency.
Keywords: poverty, spatial analysis, spatial autoregressive model, spatial regression.

1. Introduction
Poverty issues require special attention from government, especially local
governments. Since decentralization, local governments were given authority to
decide policy for their regions including poverty alleviation [14]. Poverty
alleviation policies taken should focus on village level [6]. As a basis for making
poverty elevation policy at village level, local governments need poverty data
down to village level.
SEDAC [11] states that study on poverty in a small area has been
implemented in 35 countries in the world during the period of 1991-2002. Several
studies of poverty in a small area has been done by [10] and [13]. Furthermore,
study that includes spatial effects in model has been done by [9].
This study was done by using a spatial regression analysis. A spatial
regression would use customized contiguity as weight matrix. The aim of study,
i.e., to get the best spatial model for village level poverty in Pati Regency and to
determine variables that affect it.

115
Spatial Effect Identification. Moran’s I coefficient (I) and Lagrange Multiplier
(LM) is used to test spatial dependence or autocorrelation between observations or
location. Spatial effect identification for spatial heterogenity use Breusch-Pagan
(BP) test [1], [8], and [7]. Null hypothesis of BP test, i.e., H0 : σ12 = σ22 = ... = σn2 =
σ2 (homoscedasticity). Formula of hypothesis test is Eq. (1).

BP ( 12 ) f T Z(ZT Z) 1 ZT f χ 2 ( p) (1)
ei represents least square residual for i-th observations. Z represents vector matrix
which normal standardized for each observation (n × (p+1)). If BP > 2, reject H0.
Moran’s I test based on least square residual. Null hypothesis of Moran’s I test, i.e.,
H0 : I = 0 (no spatial autocorrelation).
Formula of hypothesis test is Eq. (2).
I I0
Z
var I
(2)
n n
wij X i X Xk X
n 1 n2 S1 nS2 3S02
I i 1 k 1
n
;E I I0 ; var I [E ( I )]2
S0
Xi X
2 n 1 n2 1 S02
i 1

Var (I) is the variance and E (I) is expected value of Moran’s I. Reject H0 and there
is a spatial autocorrelation if |Z|>Zα/2. Value of Moran’s I is between -1 and 1.
Value I>I0 shows positive autocorrelation and I<I0 shows negative autocorrelation.
LM test obtained based on model assumption under H0. There are 3 null hypothesis
used, i.e., H0 : ρ = 0 for Spatial Autoregressive model (SAR), H0 : λ = 0 for
Spatial Error model (SEM), and H0 : ρ, λ = 0 for Spatial Autoregressive
Moving Average model (SARMA). Formula of hypothesis test is shown in Eq.
(3).

LM E 1{( Ry )2 T22 2Ry R e T12 ( Re )2 ( D T11 )} ~ 2


(m)
(3)
where: m = total spatial parameter (SAR = 1, SEM = 1, SARMA = 2). Ry =
eTW1y/σ2; Re = eTW2e / σ2; M = I – X(XTX)-1XT; Tf,g = tr{WfWg + WfTWg}, D = σ-
2
(W1Xβ)TM(W1Xβ); E = (D + T11)T22 – (T12)2, and f,g = 1,2.
e represents least square residual for observations. If spatial weight matrix W2 = W
so T11=T12=T22=T=tr{(WT + W)W}. If LM > 2, we reject H0.
Spatial Regression Models. Spatial regression models were developed by Anselin
[1] uses spatial cross sectional data. General model of spatial regression is shown
in Eq. (4) and (5).

Y W1Y X u (4)

2
where: u = W2 u ~ N 0, I (5)

116
Y represents vector of response variable (n×1). X represents matrix of predictor
variable (n×(p+1)). represents vector of regression coefficient parameter
((p+1)×1). represents spatial lag coefficient parameter on response variable. W1
and W2 represents weight matrix (n×n). represents spatial lag coefficient
parameter on error u. u represents vector of error (n×1) in Eq. (4). represents
vector of error (n×1) in Eq. (5). I represents identity matrix (n×n). n represents
number of observations or locations (i = 1, 2, 3, …, n) and p represents number of
predictor variable (p = 1, 2, 3, .., l).
If ρ = 0 and λ = 0, Eq. (4) would be Ordinary Least Square model (OLS),
Y = X + . This model without spatial spatial effect. If X = 0 and W2 = 0, Eq. (4)
would be first order spatial autoregressive model, Y = W1Y + . This model
represents variance on Y as linear combination of variance among neighboring
locations without predictor variable. If W2 = 0 or λ = 0, Eq. (4) would be Spatial
1 1
Autoregressive model (SAR), Y =(I - W) X +(I - W) + . This model
assumed that autoregressive process on response variable. If W1 = 0 or ρ = 0,
Eq.(4) would be Spatial Error model (SEM), Y = X + u, u = W2u + . It
represents structure spatial λW2 on spatially dependent error (ε). Furthermore, if
W1, W2 ≠ 0, λ ≠ 0, or ρ ≠ 0, Eq. (4) would be Y = W1Y + X + u, u = W2u +
that called as Spatial Autoregressive Moving Average (SARMA).
Estimator for spatial lag coefficient ( ˆ ), i.e., 𝜌̂ = (YTWTWY)-1YTWTY. For
significance test of spatial lag coefficient ( ) is used Likelihood Ratio Test (LRT)
[2]. Null hypothesis of LRT, i.e., H0 : ρ = 0 (no spatial lag dependency).
1 T 1 T
LRT { 2ln I W 2
I W y-X I+ W y-X 2
y-X y-X }

If LRT > 2, we reject H0. vector could be estimated by reducing model (I -


W1)Y = X + . By using square spatial weight matrix, i.e., = (I – W1)T(I –
W1). Thus, estimator, i.e., = (XT X)-1XT (I – W2)Y.

Spatial Weight Matrix. Spatial weight matrix (W) was calculated by distance or
contiguity among region to another [8]. Based on geographic contiguity, the types
of contiguity, e.g., Linear Contiguity, Rook Contiguity, Bishop Contiguity, Double
Linear Contiguity, Double Rook Contiguity, and Queen Contiguity [7]. In a spatial
regression modelling, if a region has an asymmetrical shape, e.g., an administrative
area, hence fit method that used are rook and queen contiguity and would generate
same weight matrix [15]. This weight matrix could not be used for this study
because there are villages that do not contiguous with the others. Thus, the
parameter estimators could not be obtained because inverse matrix of spatial
weight matrix could not be calculated.
Spatial weight matrix (W) is not only caused by geographic contiguity but
also nongeographic contiguity. This contiguity should be a direct relation to a

117
theoretical conceptualization of the structure of dependency. The contiguity also
should relate to a spatial interaction theory. In addition, the contiguity should be
based on the trade relationships maintained between regions or socioeconomic
characteristics [1]. The contiguity is called customized contiguity. Thus, this study
would use customized contiguity for a spatial weight. The used approach is
similarity of main business field in each village. This weight matrix defines w ij=1
to entities that have similarity business field with concern location, wij=0 for
others.

Poverty. Poverty is a condition of a person whose expenditure per capita per


month does not enough to require minimum standard living. Minimum standard
living represented by poverty lines [3]. There are several criteria for poverty
measure that have been widely accepted by development economists, i.e.,
anonymity, independence of population, monotonicity, and distributional
sensitivity [12]. The indicator that used to measure poverty is Head Count Index
(HCI) [5]. The indicator shows percentage of people living below poverty line. The
characteristics of individuals and households that affected poverty were classified
into three aspect, i.e., demographic, economic, and social [4].

Materials. This study used data from BPS that are SUSENAS 2013, PODES 2011,
and SP 2010. SUSENAS 2013 is used to obtain direct estimate of percentage of
poverty as response variable. PODES 2011 and SP 2010 are used to obtain
characteristics of individuals, households, and, regions as a predictor variables. The
variables used are presented in Table 1. The study area is Pati Regency, Central
Java which consists of 21 subdistricts and 406 villages.
Modelling analysis stages, as follow.
a. Exploring data to see characteristics of poverty generally and identification of
correlation variables that affect poverty.
b. Determining spatial weight matrix (W).
c. Identifying of spatial effects that would be used to see significant parameters
on results of Moran's I and BP test.
d. Testing significance of spatial lag coefficient ( ) and spatial error coefficient
( ) by LM test.
e. Using Spatial Autoregressive (SAR) modelling and test of residual assumption.

118
TABLE 1. Predictor Variables
Code Variables Unit

Characteristics of Individuals and Households


X1 Average number of household members Peoples
X2 Dependency ratio -
X3 Percentage of male household head Percent
X4 Percentage of working people Percent
X5 Percentage of education household head >= SMP Percent
X6 Average of floor area per capita M2
X7 Percentage of household with largest floor instead of land Percent
X8 Percentage of household that cook with electricity/ gas Percent

Characteristics of Regions
Z1 Distance village to central regency KM
Z2 Ratio of educational facilities per 100 people -
Z3 Ratio of health facilities per 100 people -
Z4 Ratio of health workers per 100 people -
Z5 Percentage of farm families Percent
Z6 Percentage of Jamkesmas recipient -
Z7 Ratio of establishments per 100 people -

2. Main Results
Descriptive. SUSENAS 2013 notes that average percentage of village level
poverty in Pati Regency are 29.03%. Average of household member are 3.42
people that mean average household in Pati Regency contain three people. Kudur is
a village with smallest percentage of male household head (63.62%). The smallest
percentage of education household head>=SMP in Tompegunung (6.30%). Village
with smallest percentage of households cook with electricity/ gas is Kletek
(2.54%).
Sirahan is a village with the longest distance to central regency to 44
kilometers. Kajen is a village with the largest ratio of educational facilities per 100
peoples to 0.99. Furthermore, village with largest ratio of health facilities per 100
peoples is Tayu Wetan to 0.49. For average percentage of farm families to 53.11%
mean more than half of families in Pati Regency are rely on agricultural sector as a
main source of family economy. Furthermore, average ratio of establishments per
100 peoples to 2.12 mean that there are two places of establishments among 100
peoples in Pati Regency.

119
Correlation. Pearson correlation value with = 5% can be seen that the largest
correlation to percentage of poverty is percentage of households with the largest
floor instead of land to 46.2% and negatively correlated. Smallest correlation is
ratio of establishments per 100 peoples to 10.5% and positively correlated.

Spatial Effect Identification. P-value BP is 0.5622, with = 10% indicates


homogeneous residual variance. Furthermore, based on the result there is a spatial
dependency on village level poverty. P-value Moran’s I of 0.0405, with = 10%
could be concluded that there was a spatial pattern and similar characteristics.
P-value of LM lag is 0.2338. By using α = 25%, it could be concluded that
there is lag spatial dependencies so it could continue to SAR modelling. P-value of
LM (error) is 0.3732. By using α = 25% could be concluded that there was no
spatial dependencies of error. According to analysis carried out that there are no
dependencies spatial error, it could be ascertained that there is not combine spatial
dependencies.

TABLE 2. Results of Spatial Effect Identification


Statistic Test Value P-Value
Breusch-Pagan 5.8089 0.5622
Moran’s I 2.0487 0.0405*
LM (lag) 1.4173 0.2338**
LM (error) 0.7930 0.3732
LM (SARMA) 1.4261 0.4902

(*) sign. at = 10% and (**) sign. at = 25%.


Spatial Autoregressive Model. There are seven predictor variables and spatial lag
( ) that significantly affect to village level poverty variables in Pati Regency.
Three variables are characteristic of individuals and households while four
variables are characteristic of regions, i.e., percentage of male household head (X3),
percentage of education household head >= SMP (X5), percentage of household
that cook with electricity/ gas (X8), distance village to central Regency (Z1), ratio
of educational facilities per 100 peoples (Z2), ratio of health facilities per 100
peoples (Z3), percentage of Jamkesmas recipient (Z6). The conclusion is obtained
by looking at p-value of parameters model.
Rho ( ) coefficient value to 0.2254 indicates that in event of an increase in
percentage of poverty 1% in village that similar characteristics would increase
percentage of poverty in observation village to 0.2254%. Coefficient of
determination (R2) describes variation in variable of village level poverty could be
explained by model. SAR model has R2 of 39.23% indicating that the model is able
to explain variation of village level poverty of 39.23% and 60.77% is explained by
other variables outside model.

120
TABLE 3. Parameter Estimation of SAR Model
Variable Estimation SE z-value p-value
Constant 134.5194 32.08991 4.1919 0.0000*
X3 -0.9904 0.3688 -2.6855 0.0072*
X5 -0.1896 0.1839 -1.0311 0.3025***
X8 -0.1316 0.0904 -1.4543 0.1459**
Z1 0.4159 0.2173 1.9139 0.0556*
Z2 -32.7653 15.2866 -2.1434 0.0321*
Z3 -51.67 25.8803 -1.9965 0.0459*
Z6 -0.2524 0.1629 -1.5496 0.1212**
Rho ( ) 0.2254 0.2128 1.0594 0.2894***
R2 (%) 39.23

(*) sign. at = 10%, (**) sign. at = 15%, and (***) sign. at = 30%.

SAR model for village level poverty as follow.


n
Yˆi 0, 2254 Wij Y j 134,5194 0.9904 X 3i 0,1896 X 5i 0,1316 X 8i
j 1, i j

0, 4159Z1i 32,7653Z 2i 51,67 Z3i 0, 2524Z6i


Model shows that if percentage of male household head increased 1% would
reduce percentage of village level poverty to 0.9904%. If Percentage of education
household head >= SMP increase 1%, percentage of village level poverty would be
reduced to 0.1896%. If percentage of household that cook with electricity/ gas
increased 1% would reduce percentage village level poverty to 0.1316%.
Interpretation by assuming other variables constant. Similarly, interpretation for
other predictor variables.
Identical assuming test for SAR model shown there is no predictor variables
that significantly affect absolute residuals. ACF plot for residual shows that there is
no residual lag significant. Normality test shows AD value at 0.289 and p-value
0.605. Therefore, it could be concluded that residual SAR model require
assumption of residual IIDN (1,0).
Based on these results, we can conclude that SAR model is suitable for
village level poverty modelling in Pati Regency and to determine variables that
affect it.

3. Concluding Remarks
Percentage of village level poverty in Pati Regency has spatial effect. It is
shown from Moran’s I and SAR modelling for village level poverty and variables
that affect it. SAR model could be seen that lag in response variables have
significant affect.

121
There are seven predictor variables that significantly influence percentage
of village level poverty. Only variable distance from village to central Regency has
positive affect. Meanwhile, six predictor variables have negative affect. Thus, SAR
model fit for modelling of village level poverty in Pati Regency.
Ratio of educational and health facilities have a negative effect on village
level poverty. Therefore, local government could take relevant policies related to
education and health facilities. R2 for SAR model is not relative large. That might
be caused by the value of response variable obtained by direct estimation from
SUSENAS. As we know, SUSENAS is applied for regency estimation level.
Finally, to estimate in small area, it is suggested to use Small Area Estimation
(SAE) methods.

References
[1] L. Anselin, Spatial Econometrics: Methods and Models, Kluwer Academis
Publishers, Boston, (1992)
[2] G. Arbia, Spatial Econometrics: Statistical Foundations and Applications to
Regional Convergence, Springer, Berlin, (2006)
[3] BPS & World Bank Institute, Dasar-Dasar Analisis Kemiskinan, BPS,
Jakarta, (2002)
[4] I.S. Chaudhry, S. Malik, and A. Hassan, The Impact of Socioeconomic and
Demographic Variables on Poverty: A Village Study Explaning, The
Lahore Journal of Economics, 14, 39-68, (2009)
[5] J. Foster, J. Greer, and E. Thorbecke, A Class of Decomposable Poverty
Measures, Econometrica, 52, 761-66, (1984)
[6] IFAD, Rural Poverty Report, The International Fund for Agricultural
Development (IFAD), Rome, (2011)
[7] J.P. Lesage, The Theory and Practice of Spatial Econometrics, University of
Toledo, Toledo, (1998)
[8] J.P. LeSage and R.K. Pace, Introduction to Spatial Econometrics, CRC Press,
New York, (2009)
[9] D. Matualage, Metode Prediksi Tak Bias Linier Terbaik Empiris Spasial
pada Area Terkecil untuk Pendugaan Pengeluaran Per Kapita, Tesis, IPB,
Bogor, (2012)
[10] A. Nuraeni, Feed-Forward Neural Network untuk Small Area Estimation
pada Kasus Kemiskinan, Tesis, ITS, Surabaya, (2009)
[11] SEDAC, Poverty Mapping Project: Small Area Estimation of Poverty and
Inequality, Socioeconomic Data and Applications Center (SEDAC),
Columbia University. Retrieved May 13, 2015, from
http://sedac.ciesin.columbia.edu/ data/set/povmap-small-area-estimates-
poverty-inequality/data-download, (2005)
[12] M.P. Todaro and S.C. Smith, Economic Development Eighth Edition,
Pearson Education, Ltd., United Kingdom, (2003)

122
[13] A. Ubaidillah, Small Area Estimation dengan Pendekatan Hierarchical
Bayesian Neural Network, Tesis, ITS, Surabaya, (2014)
[14] Government of Indonesia, Undang-Undang Nomor 32 Tahun 2004 tentang
Pemerintahan Daerah, State Secretariat, Jakarta, (2004)
[15] D. Winarno, Analisis Kematian Bayi di Jawa Timur dengan Pendekatan
Model Regresi Spasial, Tesis, ITS, Surabaya, (2009)

123
Proceedings of IICMA 2015
Statistics and Probability

Mixed Estimator of Kernel and Multivariate


Regression Linear Spline Trucated in
Nonparametric Regression
Ali Akbar Sanjaya1,a), I Nyoman Budiantara2,b), and Bambang
Widjanarko Otok 3,c)
1
Statistics Department
Institut Teknologi Sepuluh Nopember, Surabaya
a)
dimasganteng2011@gmail.com
b)
i_nyoman_b@statistika.its.ac.id
c)
dr.otok.bw@gmail.com

Abstract. The relationship of respond variable with some predictor variables is not always using
only single approach such as spline, kernel or Fourier series. In the regression model allows the
response variable to follow different nonparametric regression curve between the predictor
variables and other predictor variables. Data given in pairs ( x1i , .. xqi , ti , yi ) the relationship
between the predictor variables x1i , .. xqi , ti , and the response variable yi follow additive
q
nonparametric regression model: yi x1i ,..., xqi , ti i
f x pi g ti i .
p 1

Predictor component, x1i .. xqi, approached using Spline Functions linear predictor component
while ti approached by the kernel function. This research was conducted with the purpose of
seeking estimator truncated form of linear spline and kernel to estimate the nonparametric
regression curve. Estimator models obtained using Ordinary Least Square method (OLS). The

Estimator regression curves fˆ , x, t A , y obtained are : ĝ , t D y


and ˆ ( x, t ) ( A( , ) D( )) y . Kernel Estimator mixture and multivariable
,

regression Spline rely heavily on points knots and bandwidth. Kernel mixed Estimator and
multivariable regression Spline is the finest in determining point’s knots and optimal bandwidth.
Keywords and Phrases: Nonparametric regression, Spline, Mixed Estimator, Kernel

1. Introduction
Regression analysis is a statistical technique to model and investigate the
relationship between two or more variables. If the regression curve shape is known,
it is called a parametric regression model assumed and if this is true, then the
parametric regression approach is very efficient, but if not, lead to a misleading
interpretation of the data [1]. If there is no any information about the shape of the
curve f ( xi ) , the approach used is nonparametric regression. Because this
approach does not depend on the assumption of a certain curve shape, thus

124
providing greater flexibility. In this case f ( xi ) assumed to be contained within the
specific function [2].
Some research on kernel estimators have been made by many researchers as
[3], [4]. Spline [5], [2], [6], [7]. Local polynomial [8], [9]. Wavelet [10]. Fourier
series [11]. Nonparametric regression approach that is widely used is spline
truncated. Spline truncated polynomials are pieces that have segmented and
continuous nature. One of the advantages of this model is truncated spline to seek
its own estimates of the data pattern of data wherever it moves. This excess occurs
due to the truncated spline points are knots, which is the point of fusion joint that
showed the change of behavior patterns of data [2], [12].
Kernel estimator has the advantage that a flexible, easy mathematical form
and can achieve a relatively quick convergence level. Kernel approach depends on
bandwidth parameter, which serves to control the smoothness of the curve
estimation. Selection of appropriate bandwidth is very important in kernel
regression [13]. Bandwidth that is too large will produce a very smooth curve
estimation and headed to the average of the response variable, otherwise if the
bandwidth is too small will produce estimates that are less smooth curve that is
estimated to be headed to the data.
According Budiantara, et al. [14], the nonparametric regression models and
the semiparametric models developed by the researchers before, if explored more
deeply, basically there is a very heavy and underlying assumptions the model. Each
predictor in nonparametric regression multi predictor considered to have the same
pattern that researchers are forced to use only one form estimator model for each
predictor variable. Therefore, use only one form of the estimator only in different
patterns of relationship of different data will certainly lead estimator produced less
suited to the pattern data. As a result, the estimated regression model becomes less
precise and produces a large error. Therefore, to overcome these problems some
researchers have developed an estimator mixture in which each data pattern in the
mixture estimator approached with the appropriate estimator. Estimator is expected
to be a mix of the right estimator capable of estimating data patterns well. Some
researchers have ever developed such a mixture estimator is [14] who developed
The combination spline estimator truncated and kernel. In addition Sudiarsa, et al.
[15] combines estimator spline truncated and Fourier series in multivariable
nonparametric regression curve.
This study will carry out a study on a mixture of kernel estimators and
truncated linear spline nonparametric regression multivariable model. At the
mixture estimator assumed some predictor variable number q follow the spline
truncated linear function and a predictor variable follow the kernel function.

2. Main Results
Data given in pairs in ( x1i , , xqi , ti , yi ) which the relationship of these
variables are assumed to follow the nonparametric regression model:
yi xi , ti i
(1)

where 𝑖 = 1,2, . . . , 𝑛 dan 𝑥̃𝑖 = (𝑥1𝑖 , 𝑥2𝑖 , … , 𝑥𝑞𝑖 ) . Shape of the curve 𝜇(𝑥̃𝑖 , 𝑡𝑖 ) in
equation (1) assumed to be unknown, and known only to the smooth curve in the
sense of continuous and differentiable. Random error 𝜀𝑖 identical, independent and
follow a normal distribution with 𝐸(𝜀𝑖 ) = 0 dan 𝑉𝑎𝑟(𝜀𝑖2 ) = 𝜎 2 . Besides the
regression curve 𝜇(𝑥̃𝑖 , 𝑡𝑖 ) assumed to be additive that can be written in the form:

125
q

( xi , ti ) f p x pi g ti (2)
p 1

With 𝑓𝑝 (𝑥𝑝𝑖 ) and g(𝑡𝑖 ) are smooth functions. The main objective in
nonparametric regression curve estimator mixture is getting shape estimation
regression curve 𝜇(𝑥̃𝑖 , t 𝑖 ), namely :
q

ˆ ,
( xi , ti ) fˆp ( , )
xi , ti gˆ ti . (3)
p 1

Regression curve 𝑓𝑝 (𝑥𝑖 ) is assumed to follow a linear truncated spline function


with knots points 𝜆 = (𝜆1 , 𝜆2 , … , 𝜆𝑟 )′ and the components of the curve g(𝑡𝑖 ) is
assumed to follow the kernel function. To obtain a mixture estimator of 𝜇(𝑥̃𝑖 , t 𝑖 ),
assumed truncated spline curve is :
m r
m
f ( xi ) j
xi
j
j
xi j
(4)
j 0 j 1

With:
m
m xi , xi r
xi r
r

0 , xi r

Parameters 𝜃0 , 𝜃1 , … , 𝜃𝑞 dan 𝜙1 , 𝜙2 , … , 𝜙𝑟 being unknown parameters. Then,


estimation of regression curve kernel presented as:

1
n
K t ti 1
n
gˆ t n n
yi n W i t yi (5)
1
i 1
n K t tj i 1

j 1

where :
K t ti 1 t ti
Wi t n
;K t ti K
1
n K t tj
j 1

Estimator regression curve (5) is dependent variable upon kernel function 𝐾α and
bandwidth parameter α. Kernel function 𝐾𝛼 can be either a uniform kernel, kernel
triangle, epanechnikov kernel, kernel squares, triweight kernel, kernel cosine or
Gaussian kernel.
Mixed estimator nonparametric regression curve (3) will be obtained by kernel
estimator (5) and spline function truncated (4) with a linear polynomial
components (m =1).
ˆ ( x, t ) ˆ
,
f , ( x, t ) gˆ (t )

Lemma 1. If spline truncated curve 𝑓(𝑥𝑖 ) multivariable give by equation (4) where
in polynomial component is linier (𝑝 = 1), than :
q

f p ( x pi ) X0 X1 1 1
X2 2 2
Xq q q
(6)
p 1

126
which vector 𝑓̃𝑝 (𝑥𝑝𝑖 ), 𝜃̃, 𝜙̃1 , 𝜙̃2 ,…, 𝜙̃𝑝 and matrix X(λ̃1 ) , X(𝜆̃2 ),…, X(𝜆̃𝑃 ) given
by :
f p1 1 x11 xq1 0
11 12 1q
f p2 1 x12 xq 2 11
f p ( x pi ) , X0 , , 1
, 2
, q

11 12 1q
f pn 1 x1n xqn 1q

x11 11
x11 r1
x11 12
x11 r2

X1 1
, X2 2

x1n 11
x1n r1
x1n 12
x1n r2

xq1 1q
xq1 rq

Xq q

xqn 1q
xqn rq

Vector 𝑓̃𝑝 (𝑥𝑝𝑖 ) of size 𝑛 × 1, vector 𝜃̃ of size (𝑞 + 1) × 1, vector 𝜙̃1 of size 𝑟 × 1,


vector 𝜙̃2 of size 𝑟 × 1 , vector 𝜙̃𝑃 of size 𝑟 × 1, Matrix 𝑋1 (𝜆1 ) of size (𝑛 𝑥 𝑟),
Matrix 𝑋2 (𝜆2 ) of size (𝑛 𝑥 𝑟), dan Matrix 𝑋𝑃 (𝜆𝑃 ) of size (𝑛 𝑥 𝑟).

Proof. If 𝑚 = 1 given by equation (4), then:


1 r
j 1
f ( xi ) x
j i j
xi j
j 0 j 1

So if translated would be:


q
*
f p ( x pi ) 0 11 1i
x 12
x2i 1p
x pi 11
x1i 11
x1i r1
p 1

11 1q
xqi 1q 1q
xqi rq
(7)
*
where 0 01 02 0q

Equation (7) applies to each 𝑖 = 1 until 𝑖 = 𝑛, then :

f p ( x p1 ) 1 x11 xP1 0 11
x11 11
x11 r1
f p ( xp2 ) 1 x12 xP 2 11 11
x12 11
x12 r1

f p ( x pn ) 1 x1n xPn 1P 11
x1n 11
x1n r1

x11 12
x11 r2 12
xq 1 1q
x P1 rq 1q

x1n 12
x1n r2 12
xqn 1q
xqn rq 1q

127
So that equation (7) can be simplified in the form of matrix notation in accordance
with equation (6):
q
f p ( x pi ) X0 X1 1 1
X2 2 2
Xq q q
(8)
p 1

Lemma 2. If given the regression model (2) and estimator for Kernel regression
given by equation (5), then squared error given by:
|| ||2 || ([ I D( )] y Z ( ) ) ||2

with || || as a vector length ,y ( y1, y2 ,..., yn )' ,


Z( ) [X0 X1 ( 1 ) X q ( q )] , ( 1 , 2 ,..., n ) , ' [ 1 q
] and
1
D( ) {n W i (t j )}, i 1, 2,..., n; j 1, 2,..., n

Proof. Equation (5) gives:


n
1
gˆ t n W i t yi
i 1

Because it applies to each t = t1, t =t2,…, t = tn


gˆ t1 1
n W 1 t1 n W
1
2
t1
1
n W n t1 y1
gˆ t2 1
n W 1 t2 n W
1
2
t2
1
n W n t2 y2
D y (9)

gˆ tn 1
n W 1 tn n W
1
2
tn
1
n W n tn yn

From Regression Model (1), Equation (9) and Lemma 1, give :


q

y f p ( x pi ) g t
p 1

X0 X1 1 1
X2 2 2
XP P P
D( ) y

1
X0 X1 1
Xq q
D( )

Z D y (10)
Equation (10) gave Total squared error:
2 2
|| || || y Z ( ) D( ) y ||
2
|| y D( ) y Z ( ) ||
2
|| ( I D( )) y Z ( ) || (11)
Theorem 1. If from regression model (1), Total squared error from model (1) given
by Lemma 2, error model with multivariate normal distribution with zero mean and
𝐸(𝜀̃𝜀̃ , ) = 𝜎 2 𝐈 and, then estimator OLS for parameter 𝛽̃ found from optimization

128
lemma 2 and estimator mixed regression curve ( x, t ) is given by:
q
ˆ ˆ
,
x, t f p( )
x, t gˆ t
p 1

q
ˆ ˆ
with f p( )
x, t Z , , ĝ ,
t D y and
p 1

ˆ
[( Z ( ) ' Z ( )] 1 ( Z ( )) '( I D( )) y

with matrix Z ( ) and D( ) given by Lemma 2.


Proof. To get OLS estimator for the vector parameter 𝜃̃, Theorem 2.1 gives
optimization such as equation (11), with
2
|| ( I D( )) y Z ( ) ||
2
|| ||

(( I
'
(( I D( )) y Z( ) D( )) y Z( ) )

|| ( I D( )) y ||
2
2 '
Z( ) (I
'
D( )) y
'
Z( )'
'
Z( ) (9)

To get estimator of the parameter , carried out partial derivative of equation (9)
Q / , of the following :

Q / , (|| ( I D( )) y ||
2
2 '
Z(
'
) (I D( )) y
'
Z(
'
) ' Z( ) )

ˆ
0 2( Z ( )) '( I D( )) y 2Z ( ) Z ( )
'
(10)

If equation (10) is equated to zero, it will obtain the equation:


ˆ
Z ( )' Z ( ) ( Z ( ))' ( I D( )) y 0
Furthermore, equation can also be written in the form:
ˆ
Z ( )' Z ( ) ( Z ( )) ' ( I D( )) y
Thus obtained:
ˆ
( , ) [ Z ( ) ' Z ( )] 1 ( Z ( )) ' ( I D( )) y C( , ) y (11)

With C( , ) [Z ( )' Z ( )] 1 (Z ( ))' ( I D( ))


Considering equation (11), then the estimator for spline truncated linier
q

multivariable regression curve f p ( x pi ) Z( ) is given by:


p 1

q
ˆ ˆ
fp( )
x, t Z( ) ( , ) .
p 1

Then from equation (5) shows the kernel estimator equation:


gˆ ,
(t ) D( ) y.

129
Lemma 1 gives mixed regression kernel and spline truncated linier multivariable
curve:
q
( xi , ti ) f p x pi g ti
p 1

As a result, a mixture of estimators kernel and spline truncated linier multivariable


i ( xi , ti ) is given by:

q
ˆ ˆ
,
x, t fp( )
x, t gˆ t
p 1

ˆ
Corollary 1. If estimators ( , ) , fˆ , ( x, t ) , ˆ ,
x, t , ĝ t given by
Theorem 2, then:
q

A( , ) y , dan ˆ
ˆ
fp( )
x, t ,
( x, t ) B( , ) y
p 1

with A Z ( )[(Z ( )) ' Z ( )] 1 (Z ( )) '( I D( )) and B( , ) A( , ) D( )

Proof. Theorem 1 give equation:


q
ˆ ˆ
fp( )
x, t Z , (12)
p 1

equation (11) and equation (12) provide:


q
ˆ
fp( )
x, t Z [Z ( )' Z ( )] 1 ( Z ( ))' ( I D( )) y A( , ) y (13)
p 1

With Matrix:

A( , ) Z [Z ( )' Z ( )] 1 ( Z ( ))' ( I D( )) y
Furthermore, Theorem 1 and equation (9) and equation (13) gives:
q
ˆ ˆ
,
x, t fp( )
x, t gˆ t A( , ) y D( ) y
p 1

( A( , ) D( )) y B( , ) y

with B( , ) A( , ) D( )
Mixed estimator regression Spline linier truncated multivariable and
Kernel ˆ , x, t highly depend on many location of knot points
( 1 , 2 ,..., r ) ' and bandwidth parameter . To get the best estimator of
mixture regression Spline and Kernel, knot points and bandwith parameter should
be selected optimally. Famous Methods usuly used is Generalized Cross
Validation (GCV). GCV function is given by (Wahba,[5]) :
n 1 || y ˆ ( x, t ) ||2
,
GCV , 2
n 1trace I A , D( )

130
Optimal knot points opt
( 1( opt )
, 2( opt )
,..., r ( opt )
)' and optimal bandwidth
parameter opt is found from optimization:

G( opt
, opt
) Min{G( , )}
,

Knot points and optimal bandwidth parameter found from the smallest GCV value.

3. Concluding Remarks
If given additive regression nonparametric model:
q

y x, t f p xp g t
p 1

1. Mixed estimator of Kernel and Spline Truncated Linier Multivariable is


given by :
q
ˆ ˆ
,
x, t fp( )
x, t gˆ t where
p 1

q
ˆ
fp( )
x, t A( , ) y , gˆ t D( ) y and ˆ ,
x, t B( , ) y
p 1

2. Mixed estimator of Kernel and Spline Truncated Linier Multivariable


ˆ x, t highly depend on knot location, many knot point and optimal
,
bandwidth parameter. Knot point and optimal bandwidth parameter found
from the smallest GCV value.

References
[1] W. Hardle, Applied Non-parametrik Regression, Cambridge University Press,
Cambridge, (1990)
[2] R.L. Eubank, Nonparametrik Regression and Spline Smoothing,
MarcelDeker: New York, (1999)
[3] E.A. Nadaraya, On Estimating Regression, Theory of Probability and its
Applications 9(1): 141+2. 141-142, (1964)
[4] G.S. Watson, Smooth Regression Analysis, Sankhya: The Indian Journal of
Statistic, series A 26 (4), 359-372, (1964)

[5] G. Wahba, Spline Models for Observational Data, SIAM Pensylvania, (1990)
[6] B.W. Otok, Optimize Knot and Basic Function At Truncated Spline and
Mltivariate Adapative Regression Spline, Proceedings of the first
international conference on mathematics and statistics ICOMS 1 june 19-21
2006 Bandung West Java. Indonesia., (2006)
[7] I.N. Budiantara, B. Lestari, and Islamiyati, Estimator Spline Terbobot dalam
Regresi Nonparametrik dan Semiparametrik Heterokesdastik untuk Data
Longitudinal, Laporan Penelitian Hibah Kompetensi, DP2M-DIKTI,
Jakarta., (2010)

131
[8] A. H. Welsh and T. W. Yee, Lokal Regression for Vector Responses, Journal
of Statistical Planning and Inference, 136, 3007-3031, (2006)
[9] F. Yao, Asymptotic Distribution of Nonparametrik Regression Estimators for
Longitudinal or Fuctional Data, Journal of Multivariate Analysis, 98, 40-56,
(2007)
[10] A. Antoniadis, J. Bigot, and Y. Sapatinas, Wavelet Estimators in
Nonparametric Regression: A Comparative Simulation Study, Journal of
Statistical Software, 6, 1-83, (2001)
[11] V. Ratnasari, I. N. Budiantara, I. Zain, M. Ratna, and N. P. A. M. Mariati ,
Comparison Truncated Spline and Fourier Series in Multivariable
Nonparametric Regresssion Model (Application: Data of Poverty in Papua,
Indonesia), International Journal of Basic & Applied Sciences IJBAS-IJENS
Vol : 15 N0:04., (2015)
[12] I.N. Budiantara, Spline Dalam Regresi Nonparametrik dan Semiparametrik:
Sebuah Pemodelan Statistika Masa Kini dan Masa Mendatang, ITS Press,
Surabaya, (2009)
[13] I.N. Budiantara and Mulianah, Pemilihan Banwidth Optimal Dalam Regresi
Semiparametrik Kernel dan Aplikasinya, Journal Sains dan Teknologi
SIGMA, 10 : 159-166, (2007)
[14] I.W. Sudiarsa, I. N. Budiantara, Suhartono, and S. W. Purnami, Combined
Estimator Fourier Series and Spline Truncated in Multivariabel
Nonparametrik Regression, Applied Mathematical Science, Vol.9, 2015, no.
100, 4997-5010, HIKARI Ltd., (2015)

[15] I. N. Budiantara, V. Ratnasari, M. Ratna and I. Zain , The Combination of


Spline and Kernel estimator for Nonparametrik Regression and Its
Properties, Applied Mathematical Science, 9, No 122 hal 6083-6094, (2015)

132
Proceedings of IICMA 2015
Statistics and Probability

Nonparametric Mixed Regression Model:


Truncated Linear Spline and Kernel Function
Rory1,a), I Nyoman Budiantara2,b), and Wahyu Wibowo 3,c)
1,2,3
Department of Statistics
Institut Teknologi Sepuluh Nopember, Surabaya

a)
rory@bps.go.id
b)
i_nyoman_b@statistika.its.ac.id
c)
wahyu_w@statistika.its.ac.id

Abstract. Given paired data $(u_i, v_{1i}, v_{2i}, \dots, v_{mi}, y_i)$, $i = 1, 2, \dots, n$, following the additive nonparametric regression model $y_i = f(u_i, \tilde{v}_i) + \varepsilon_i$, where $f(u_i, \tilde{v}_i) = g(u_i) + \sum_{j=1}^{m} h_j(v_{ji})$ and $\tilde{v}_i = (v_{1i}, v_{2i}, \dots, v_{mi})^T$. The random errors $\varepsilon_i$ follow a normal distribution with mean 0 and variance $\sigma^2$. The aim of this study is to obtain a mixed estimator of $f(u_i, \tilde{v}_i)$. To accomplish this, the regression curve $g(u_i)$ is approximated by a linear truncated spline function with knot points $\tilde{\xi} = (\xi_1, \xi_2, \dots, \xi_q)^T$, while each regression curve component $h_j(v_{ji})$ is approximated by a kernel function with bandwidths $\tilde{\phi} = (\phi_1, \phi_2, \dots, \phi_m)^T$. Based on the MLE method, the estimator of $\tilde{g}(u)$ is $\hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \mathbf{S}(\tilde{\xi},\tilde{\phi})\,\tilde{y}$ and the estimator of $\sum_{j=1}^{m}\tilde{h}_j(v_j)$ is $\sum_{j=1}^{m}\hat{\tilde{h}}_{j\phi_j}(v_j) = \mathbf{V}(\tilde{\phi})\,\tilde{y}$. The mixed estimator of $\tilde{f}(u,\tilde{v})$ is therefore $\hat{\tilde{f}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \mathbf{Z}(\tilde{\xi},\tilde{\phi})\,\tilde{y}$, where $\mathbf{Z}(\tilde{\xi},\tilde{\phi}) = \mathbf{S}(\tilde{\xi},\tilde{\phi}) + \mathbf{V}(\tilde{\phi})$. The matrices $\mathbf{S}(\tilde{\xi},\tilde{\phi})$ and $\mathbf{V}(\tilde{\phi})$ depend on the knot points $\tilde{\xi}$ and the bandwidth parameters $\tilde{\phi}$, respectively.

Keywords and Phrases: kernel function, nonparametric regression, truncated spline

1. Introduction
Regression analysis is a statistical method that is often used by researchers in various fields. It is used to determine the relationship pattern between two or more variables in a functional form. In this functional relationship, each variable is grouped into response variables and predictor variables. An early identification of the relationship pattern can be done by drawing on past experience or by using a scatter plot. If the functional relationship pattern is known, a parametric regression model should be used. Conversely, if the functional relationship pattern is unknown, a nonparametric regression model should be used [1].
In general practice, the functional relationship patterns between response variables and predictor variables are unknown. In this situation, a parametric regression model is not suitable. Therefore, along with the development of computational technology, nonparametric regression models are increasingly used. A nonparametric regression model is the best choice because of its high flexibility: the data are allowed to determine the shape of the estimated regression curve without being influenced by the researcher's subjective factors [1]. Many estimators of the nonparametric regression curve have been developed by researchers, including the spline [2, 3, 4, 1, 5, 6, 7, 8] and the kernel [9, 10, 11, 12, 1, 13, 14, 15].
The spline estimator is one of the estimators with a very special and very good statistical and visual interpretation [1]. One frequently used spline approach is the truncated spline, a function whose curve behavior can change across different sub-intervals. With the support of knot points, the spline curve can handle data patterns that rise or fall sharply while still producing a relatively smooth curve [1]. The shape of the spline regression curve is strongly influenced by the location and number of knot points.
The kernel estimator is a development of the histogram estimator. It is a linear estimator, similar to other nonparametric regression estimators; the difference lies in its specific use of a bandwidth. The advantage of the kernel estimator is its good ability to model data that have no particular pattern [11]. In addition, the kernel estimator is more flexible, its mathematical form is simple, and it can attain a relatively fast rate of convergence [16]. Kernel methods are easy to implement and computationally feasible, and their definition is intuitive [15].
According to Budiantara, Ratnasari, Ratna, and Zain [17], when the nonparametric and semiparametric regression models developed by the researchers above are examined, there are essentially two very strong basic assumptions: first, every predictor in the multivariable nonparametric regression model is considered to have the same pattern; second, researchers insist on using only one form of estimator for every predictor variable. These two assumptions are only theoretical. In applications, the data pattern often differs across predictor variables. Moreover, when only one estimator is used to estimate the multivariable nonparametric regression curve, the resulting estimator will not fit the data pattern. As a result, the estimated regression model is inaccurate and tends to produce larger errors. To overcome these problems, some researchers have developed mixed estimators in which each data pattern is approached with an appropriate estimator. A mixed estimator is expected to be accurate and capable of following the data patterns well. Among those who have developed mixed estimators, Budiantara et al. [17] developed a nonparametric mixed estimator of truncated spline and kernel, and Sudiarsa, Budiantara, Suhartono and Purnami [18] combined the Fourier series estimator and the truncated spline in a multivariable nonparametric regression curve.
This study reviews the mixed nonparametric estimator of the truncated linear spline and the kernel function in a nonparametric regression model. The mixed estimator is assumed to have one data pattern following a spline function and two or more data patterns following kernel functions.

2. Main Results
Given paired data $(u_i, v_{1i}, v_{2i}, \dots, v_{mi}, y_i)$, $i = 1, 2, \dots, n$, following the additive nonparametric regression model
$$y_i = f(u_i, \tilde{v}_i) + \varepsilon_i, \qquad (1)$$

where $\tilde{v}_i = (v_{1i}, v_{2i}, \dots, v_{mi})^T$. The shape of the regression curve $f(u_i, \tilde{v}_i)$ in equation (1) is assumed to be unknown and smooth, i.e., continuous and differentiable. The random errors $\varepsilon_i$ have a normal distribution with $E(\varepsilon_i) = 0$ and $Var(\varepsilon_i) = \sigma^2$. In addition, the regression curve $f(u_i, \tilde{v}_i)$ is assumed to be additive, meaning it can be written as
$$f(u_i, \tilde{v}_i) = g(u_i) + h_1(v_{1i}) + h_2(v_{2i}) + \cdots + h_m(v_{mi}) = g(u_i) + \sum_{j=1}^{m} h_j(v_{ji}), \qquad (2)$$
with $g(u_i)$ and $h_j(v_{ji})$ being smooth functions.

In equation (2), the regression curve $g(u_i)$ is approximated by a spline function and each regression curve component $h_j(v_{ji})$ is approximated by a kernel function.
The form of the spline curve is
$$g(u_i) = \sum_{k=0}^{p} \beta_k u_i^k + \sum_{l=1}^{q} \lambda_l (u_i - \xi_l)_+^p, \qquad (3)$$
where
$$(u_i - \xi_l)_+^p = \begin{cases} (u_i - \xi_l)^p, & u_i \ge \xi_l \\ 0, & u_i < \xi_l \end{cases}$$
Parameters 𝛽0 , 𝛽1 , … , 𝛽𝑝 and 𝜆1 , 𝜆2 , … , 𝜆𝑞 are unknown, while 𝜉1 , 𝜉2 , … , 𝜉𝑞 are knot
points. The estimator of each kernel curve component is
$$\hat{h}_{j\phi_j}(v_{ji}) = n^{-1} \sum_{i=1}^{n} W_{\phi_j i}(v_j)\, y_i, \qquad (4)$$
where
$$W_{\phi_j i}(v_j) = \frac{K_{\phi_j}(v_j - v_{ji})}{n^{-1} \sum_{i=1}^{n} K_{\phi_j}(v_j - v_{ji})}, \qquad K_{\phi_j}(v_j - v_{ji}) = \frac{1}{\phi_j} K\!\left(\frac{v_j - v_{ji}}{\phi_j}\right).$$
The regression curve estimator (4) depends on the kernel function $K$ and the bandwidth parameter $\phi_j$. The kernel function $K$ can be a uniform, triangle, Epanechnikov, squares, tri-weight, cosine or Gaussian kernel [11].
The mixed estimator for the nonparametric regression curve (2) is obtained from the kernel estimator (4) and the truncated spline function (3) whose polynomial component is linear ($p = 1$).
Lemma 2.1. If the spline curve $g(u_i)$ given by equation (3) has a linear polynomial component ($p = 1$), then
$$\tilde{g}(u) = \mathbf{G}(\tilde{\xi})\,\tilde{\theta}, \qquad (5)$$

where the vector $\tilde{g}(u)$, the vector $\tilde{\theta}$ and the matrix $\mathbf{G}(\tilde{\xi})$ are given by
$$\tilde{g}(u) = \begin{bmatrix} g(u_1) \\ g(u_2) \\ \vdots \\ g(u_n) \end{bmatrix}, \qquad \tilde{\theta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_q \end{bmatrix}, \qquad \mathbf{G}(\tilde{\xi}) = \begin{bmatrix} 1 & u_1 & (u_1-\xi_1)_+ & \cdots & (u_1-\xi_q)_+ \\ 1 & u_2 & (u_2-\xi_1)_+ & \cdots & (u_2-\xi_q)_+ \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & u_n & (u_n-\xi_1)_+ & \cdots & (u_n-\xi_q)_+ \end{bmatrix}.$$
The vector $\tilde{g}(u)$ has dimension $n \times 1$, the vector $\tilde{\theta}$ has dimension $(q+2) \times 1$, and the matrix $\mathbf{G}(\tilde{\xi})$ has dimension $n \times (q+2)$.

Proof. Setting $p = 1$ in equation (3) gives
$$g(u_i) = \beta_0 + \beta_1 u_i + \lambda_1 (u_i - \xi_1)_+ + \cdots + \lambda_q (u_i - \xi_q)_+. \qquad (6)$$
Applying equation (6) for $i = 1$ to $i = n$ and stacking the results gives
$$\begin{bmatrix} g(u_1) \\ g(u_2) \\ \vdots \\ g(u_n) \end{bmatrix} = \begin{bmatrix} 1 & u_1 & (u_1-\xi_1)_+ & \cdots & (u_1-\xi_q)_+ \\ 1 & u_2 & (u_2-\xi_1)_+ & \cdots & (u_2-\xi_q)_+ \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & u_n & (u_n-\xi_1)_+ & \cdots & (u_n-\xi_q)_+ \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_q \end{bmatrix}. \qquad (7)$$
Equation (7), written in matrix notation, is exactly equation (5).
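As an illustration of Lemma 2.1, the following minimal sketch (ours, not part of the original paper; the helper name spline_design_matrix is hypothetical) assembles the design matrix $\mathbf{G}(\tilde{\xi})$ of equation (7) in Python:

    import numpy as np

    def spline_design_matrix(u, knots):
        # Columns of G(xi): 1, u, (u - xi_1)_+, ..., (u - xi_q)_+  (equation (7))
        u = np.asarray(u, dtype=float)
        cols = [np.ones_like(u), u]
        for xi in knots:
            cols.append(np.maximum(u - xi, 0.0))  # truncated linear basis
        return np.column_stack(cols)              # shape n x (q + 2)

    # Example with n = 5 observations and q = 2 knots
    G = spline_design_matrix([0.1, 0.3, 0.5, 0.7, 0.9], knots=[0.4, 0.8])
    print(G.shape)  # (5, 4)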

Lemma 2.2. If each kernel curve component in equation (2) is estimated by the estimator component in equation (4), then
$$\sum_{j=1}^{m} \hat{\tilde{h}}_{j\phi_j}(v_j) = \mathbf{V}(\tilde{\phi})\,\tilde{y}, \qquad (8)$$
where
$$\hat{\tilde{h}}_{j\phi_j}(v_j) = \begin{bmatrix} \hat{h}_{j\phi_j}(v_{j1}) \\ \hat{h}_{j\phi_j}(v_{j2}) \\ \vdots \\ \hat{h}_{j\phi_j}(v_{jn}) \end{bmatrix}, \qquad \tilde{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},$$
$$\mathbf{V}(\tilde{\phi}) = \sum_{j=1}^{m} \begin{bmatrix} n^{-1}W_{\phi_j 1}(v_{j1}) & n^{-1}W_{\phi_j 2}(v_{j1}) & \cdots & n^{-1}W_{\phi_j n}(v_{j1}) \\ n^{-1}W_{\phi_j 1}(v_{j2}) & n^{-1}W_{\phi_j 2}(v_{j2}) & \cdots & n^{-1}W_{\phi_j n}(v_{j2}) \\ \vdots & \vdots & \ddots & \vdots \\ n^{-1}W_{\phi_j 1}(v_{jn}) & n^{-1}W_{\phi_j 2}(v_{jn}) & \cdots & n^{-1}W_{\phi_j n}(v_{jn}) \end{bmatrix}. \qquad (9)$$
The vector $\hat{\tilde{h}}_{j\phi_j}(v_j)$ has dimension $n \times 1$, the vector $\tilde{y}$ has dimension $n \times 1$, and the matrix $\mathbf{V}(\tilde{\phi})$ has dimension $n \times n$.

Proof. For each component $\hat{h}_{j\phi_j}(v_{ji})$ we have
$$\hat{h}_{j\phi_j}(v_{ji}) = n^{-1}W_{\phi_j 1}(v_{ji})\,y_1 + n^{-1}W_{\phi_j 2}(v_{ji})\,y_2 + \cdots + n^{-1}W_{\phi_j n}(v_{ji})\,y_n.$$
Applying this for each $i = 1$ to $i = n$ gives
$$\begin{bmatrix} \hat{h}_{j\phi_j}(v_{j1}) \\ \hat{h}_{j\phi_j}(v_{j2}) \\ \vdots \\ \hat{h}_{j\phi_j}(v_{jn}) \end{bmatrix} = \begin{bmatrix} n^{-1}W_{\phi_j 1}(v_{j1}) & \cdots & n^{-1}W_{\phi_j n}(v_{j1}) \\ n^{-1}W_{\phi_j 1}(v_{j2}) & \cdots & n^{-1}W_{\phi_j n}(v_{j2}) \\ \vdots & \ddots & \vdots \\ n^{-1}W_{\phi_j 1}(v_{jn}) & \cdots & n^{-1}W_{\phi_j n}(v_{jn}) \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},$$
which in matrix notation can be written as
$$\hat{\tilde{h}}_{j\phi_j}(v_j) = \mathbf{V}_j(\phi_j)\,\tilde{y}, \qquad (10)$$
where $\mathbf{V}_j(\phi_j)$ is the $n \times n$ matrix above. Summing the components (10) from $j = 1$ to $j = m$ gives
$$\sum_{j=1}^{m} \hat{\tilde{h}}_{j\phi_j}(v_j) = \sum_{j=1}^{m} \mathbf{V}_j(\phi_j)\,\tilde{y} = \mathbf{V}(\tilde{\phi})\,\tilde{y},$$
where the matrix $\mathbf{V}(\tilde{\phi}) = \sum_{j=1}^{m} \mathbf{V}_j(\phi_j)$ is given by equation (9).
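To make equation (10) concrete, here is a short sketch (ours; the Gaussian kernel is one of the choices listed after equation (4), and the helper name is hypothetical) of a single weight matrix $\mathbf{V}_j(\phi_j)$; for $m$ predictors, $\mathbf{V}(\tilde{\phi})$ is the sum of such matrices:

    import numpy as np

    def kernel_weight_matrix(v, bandwidth):
        # Entry (r, i) is n^{-1} W_{phi_j, i}(v_{jr}) with a Gaussian kernel K
        v = np.asarray(v, dtype=float)
        t = (v[:, None] - v[None, :]) / bandwidth
        K = np.exp(-0.5 * t**2) / (np.sqrt(2.0 * np.pi) * bandwidth)
        # n^{-1} K / (n^{-1} sum_i K) simplifies to K divided by its row sums
        return K / K.sum(axis=1, keepdims=True)

    rng = np.random.default_rng(0)
    V = kernel_weight_matrix(rng.uniform(size=50), bandwidth=0.2)
    print(np.allclose(V.sum(axis=1), 1.0))  # True: each row of weights sums to one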
Lemma 2.3. If the spline curve is given by Lemma 2.1 and the kernel estimator is given by Lemma 2.2, then the sum of squared errors for the nonparametric regression model in equation (1) is
$$\|\tilde{\varepsilon}\|^2 = \left\|\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y} - \mathbf{G}(\tilde{\xi})\tilde{\theta}\right\|^2, \qquad (11)$$
where $\tilde{\varepsilon} = (\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n)^T$ has dimension $n \times 1$ and $\mathbf{I}$ is the $n \times n$ identity matrix.

Proof. Equations (1), (2), (5) and (8) give
$$\tilde{y} = \tilde{g}(u) + \sum_{j=1}^{m} \hat{\tilde{h}}_{j\phi_j}(v_j) + \tilde{\varepsilon} = \mathbf{G}(\tilde{\xi})\tilde{\theta} + \mathbf{V}(\tilde{\phi})\tilde{y} + \tilde{\varepsilon},$$
so that
$$\tilde{\varepsilon} = \left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y} - \mathbf{G}(\tilde{\xi})\tilde{\theta}. \qquad (12)$$
Taking the squared norm of the vector $\tilde{\varepsilon}$ in equation (12) yields equation (11).

Theorem 2.1. If the sum of squared errors of the nonparametric regression model (1) is given by Lemma 2.3, the model errors are multivariate normal with zero mean and $E(\tilde{\varepsilon}\tilde{\varepsilon}^T) = \sigma^2\mathbf{I}$, and the likelihood function is $L(\tilde{\theta}, \sigma^2 \mid \tilde{\phi}, \tilde{\xi})$, then the MLE of the parameter $\tilde{\theta}$ is obtained from the optimization
$$\max_{\tilde{\theta} \in \mathbb{R}^{q+2}} \left\{L(\tilde{\theta}, \sigma^2 \mid \tilde{\phi}, \tilde{\xi})\right\} = \min_{\tilde{\theta} \in \mathbb{R}^{q+2}} \left\{\left\|\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y} - \mathbf{G}(\tilde{\xi})\tilde{\theta}\right\|^2\right\}. \qquad (13)$$

Proof. The nonparametric regression model is given by equation (1). The random error $\tilde{\varepsilon}$ is multivariate normal with $E(\tilde{\varepsilon}) = 0$ and $E(\tilde{\varepsilon}\tilde{\varepsilon}^T) = \sigma^2\mathbf{I}$, so the likelihood function $L(\tilde{\theta}, \sigma^2 \mid \tilde{\phi}, \tilde{\xi})$ is
$$L(\tilde{\theta}, \sigma^2 \mid \tilde{\phi}, \tilde{\xi}) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{\varepsilon_i^2}{2\sigma^2}\right) = (2\pi\sigma^2)^{-n/2} \exp\!\left(-\frac{\|\tilde{\varepsilon}\|^2}{2\sigma^2}\right).$$
In accordance with Lemma 2.3,
$$L(\tilde{\theta}, \sigma^2 \mid \tilde{\phi}, \tilde{\xi}) = (2\pi\sigma^2)^{-n/2} \exp\!\left(-\frac{1}{2\sigma^2}\left\|\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y} - \mathbf{G}(\tilde{\xi})\tilde{\theta}\right\|^2\right).$$
Based on the MLE method, the estimator of the parameter $\tilde{\theta}$ is obtained by maximizing this likelihood over $\tilde{\theta} \in \mathbb{R}^{q+2}$. Transforming the likelihood to its natural logarithm, the optimization becomes
$$\max_{\tilde{\theta} \in \mathbb{R}^{q+2}} \left\{-\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\left\|\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y} - \mathbf{G}(\tilde{\xi})\tilde{\theta}\right\|^2\right\},$$
which is maximized when the component $\left\|\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y} - \mathbf{G}(\tilde{\xi})\tilde{\theta}\right\|^2$ attains its minimum value, so that equation (13) holds.
Theorem 2.2. If the regression model is given by equation (1), the sum of squared errors is given by Lemma 2.3, the model errors are multivariate normal with zero mean and $E(\tilde{\varepsilon}\tilde{\varepsilon}^T) = \sigma^2\mathbf{I}$, and the MLE of the parameter $\tilde{\theta}$ is obtained from the optimization of Theorem 2.1, then the MLE of the mixed regression curve $\tilde{f}(u, \tilde{v})$ is given by
$$\hat{\tilde{f}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) + \sum_{j=1}^{m} \hat{\tilde{h}}_{j\phi_j}(v_j),$$
where
$$\hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u) = \mathbf{G}(\tilde{\xi})\,\hat{\tilde{\theta}}(\tilde{\xi},\tilde{\phi}), \qquad \sum_{j=1}^{m} \hat{\tilde{h}}_{j\phi_j}(v_j) = \mathbf{V}(\tilde{\phi})\,\tilde{y},$$
$$\hat{\tilde{\theta}}(\tilde{\xi},\tilde{\phi}) = \left[\mathbf{G}(\tilde{\xi})^T\mathbf{G}(\tilde{\xi})\right]^{-1}\mathbf{G}(\tilde{\xi})^T\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y}.$$
The matrix $\mathbf{G}(\tilde{\xi})$ is given by Lemma 2.1 and the matrix $\mathbf{V}(\tilde{\phi})$ by Lemma 2.2, while $\mathbf{I}$ is an identity matrix.

Proof. To obtain the MLE of the parameter $\tilde{\theta}$, Theorem 2.1 gives the optimization in equation (13). Consider
$$Q(\tilde{\theta}, \sigma^2 \mid \tilde{\phi}, \tilde{\xi}) = \left\|\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y} - \mathbf{G}(\tilde{\xi})\tilde{\theta}\right\|^2,$$
so that
$$Q(\tilde{\theta}, \sigma^2 \mid \tilde{\phi}, \tilde{\xi}) = \left\|\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y}\right\|^2 - 2\tilde{\theta}^T\mathbf{G}(\tilde{\xi})^T\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y} + \tilde{\theta}^T\mathbf{G}(\tilde{\xi})^T\mathbf{G}(\tilde{\xi})\tilde{\theta}.$$
Setting the partial derivative $\partial Q(\tilde{\theta}, \sigma^2 \mid \tilde{\phi}, \tilde{\xi})/\partial\tilde{\theta} = 0$ gives the normal equation
$$\mathbf{G}(\tilde{\xi})^T\mathbf{G}(\tilde{\xi})\tilde{\theta} = \mathbf{G}(\tilde{\xi})^T\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y},$$
so the estimator of $\tilde{\theta}$ is
$$\hat{\tilde{\theta}}(\tilde{\xi},\tilde{\phi}) = \left[\mathbf{G}(\tilde{\xi})^T\mathbf{G}(\tilde{\xi})\right]^{-1}\mathbf{G}(\tilde{\xi})^T\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y}. \qquad (14)$$
By equation (14) and the invariance property of the MLE, the estimator of the spline curve is
$$\hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \mathbf{G}(\tilde{\xi})\,\hat{\tilde{\theta}}(\tilde{\xi},\tilde{\phi}). \qquad (15)$$
Based on equation (15) and Lemma 2.2, the mixed estimator for the nonparametric regression curve (2) is
$$\hat{\tilde{f}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) + \sum_{j=1}^{m} \hat{\tilde{h}}_{j\phi_j}(v_j).$$
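The closed forms of Theorem 2.2 translate directly into a few lines of linear algebra. A minimal sketch (ours, assuming $G$ and $V$ come from the hypothetical helpers above):

    import numpy as np

    def mixed_estimator(y, G, V):
        # theta_hat of (14), g_hat of (15), and the mixed fit f_hat = g_hat + V y
        I = np.eye(len(y))
        theta_hat = np.linalg.solve(G.T @ G, G.T @ ((I - V) @ y))  # normal equations
        g_hat = G @ theta_hat
        return theta_hat, g_hat, g_hat + V @ y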

Corollary 2.1. If the estimators $\hat{\tilde{f}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v})$, $\hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v})$, $\sum_{j=1}^{m}\hat{\tilde{h}}_{j\phi_j}(v_j)$ and $\hat{\tilde{\theta}}(\tilde{\xi},\tilde{\phi})$ are given by Theorem 2.2, then
$$\hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \mathbf{S}(\tilde{\xi},\tilde{\phi})\,\tilde{y} \qquad \text{and} \qquad \hat{\tilde{f}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \mathbf{Z}(\tilde{\xi},\tilde{\phi})\,\tilde{y},$$
where
$$\mathbf{S}(\tilde{\xi},\tilde{\phi}) = \mathbf{G}(\tilde{\xi})\left[\mathbf{G}(\tilde{\xi})^T\mathbf{G}(\tilde{\xi})\right]^{-1}\mathbf{G}(\tilde{\xi})^T\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right), \qquad \mathbf{Z}(\tilde{\xi},\tilde{\phi}) = \mathbf{S}(\tilde{\xi},\tilde{\phi}) + \mathbf{V}(\tilde{\phi}).$$

Proof. Equations (14) and (15) give
$$\hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \mathbf{G}(\tilde{\xi})\left[\mathbf{G}(\tilde{\xi})^T\mathbf{G}(\tilde{\xi})\right]^{-1}\mathbf{G}(\tilde{\xi})^T\left(\mathbf{I} - \mathbf{V}(\tilde{\phi})\right)\tilde{y},$$
which can be written as
$$\hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \mathbf{S}(\tilde{\xi},\tilde{\phi})\,\tilde{y}. \qquad (16)$$
Theorem 2.2 and equation (16) then produce
$$\hat{\tilde{f}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \mathbf{S}(\tilde{\xi},\tilde{\phi})\,\tilde{y} + \mathbf{V}(\tilde{\phi})\,\tilde{y} = \left(\mathbf{S}(\tilde{\xi},\tilde{\phi}) + \mathbf{V}(\tilde{\phi})\right)\tilde{y} = \mathbf{Z}(\tilde{\xi},\tilde{\phi})\,\tilde{y},$$
where $\mathbf{Z}(\tilde{\xi},\tilde{\phi}) = \mathbf{S}(\tilde{\xi},\tilde{\phi}) + \mathbf{V}(\tilde{\phi})$.

The mixed spline-kernel estimator $\hat{\tilde{f}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v})$ depends strongly on the knot points $\tilde{\xi} = (\xi_1, \xi_2, \dots, \xi_q)^T$ and the bandwidth parameters $\tilde{\phi} = (\phi_1, \phi_2, \dots, \phi_m)^T$. Obtaining the best mixed spline-kernel nonparametric estimator requires selecting optimal knot points and optimal bandwidths. One method for doing so is Generalized Cross Validation (GCV):

$$\mathrm{GCV}(\tilde{\xi},\tilde{\phi}) = \frac{\mathrm{MSE}(\tilde{\xi},\tilde{\phi})}{\left(n^{-1}\,\mathrm{trace}\left(\mathbf{I} - \mathbf{Z}(\tilde{\xi},\tilde{\phi})\right)\right)^2},$$
where
$$\mathrm{MSE}(\tilde{\xi},\tilde{\phi}) = n^{-1}\sum_{i=1}^{n}\left(y_i - \hat{f}_{\tilde{\phi},\tilde{\xi}}(u_i,\tilde{v}_i)\right)^2.$$
The optimal knot points $\tilde{\xi}_{(\mathrm{opt})} = (\xi_{1(\mathrm{opt})}, \xi_{2(\mathrm{opt})}, \dots, \xi_{q(\mathrm{opt})})^T$ and optimal bandwidths $\tilde{\phi}_{(\mathrm{opt})} = (\phi_{1(\mathrm{opt})}, \phi_{2(\mathrm{opt})}, \dots, \phi_{m(\mathrm{opt})})^T$ are obtained from the optimization
$$\mathrm{GCV}(\tilde{\xi}_{(\mathrm{opt})},\tilde{\phi}_{(\mathrm{opt})}) = \min_{\tilde{\xi},\tilde{\phi}}\left\{\mathrm{GCV}(\tilde{\xi},\tilde{\phi})\right\}.$$
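In practice this optimization is typically a grid search. The sketch below (ours; it reuses the hypothetical helpers from the earlier sketches) evaluates the GCV criterion for one candidate (knot, bandwidth) pair:

    import numpy as np

    def gcv(y, G, V):
        # GCV(xi, phi) = MSE / (n^{-1} trace(I - Z))^2 with Z = S + V
        n = len(y)
        I = np.eye(n)
        S = G @ np.linalg.solve(G.T @ G, G.T @ (I - V))  # S(xi, phi)
        Z = S + V                                        # Z(xi, phi)
        mse = np.mean((y - Z @ y) ** 2)
        return mse / (np.trace(I - Z) / n) ** 2

    # A grid search calls gcv() for every candidate pair built with
    # spline_design_matrix() and kernel_weight_matrix(), keeping the minimizer.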

3. Concluding Remarks
Consider the mixed nonparametric regression model $\tilde{y} = \tilde{f}(u,\tilde{v}) + \tilde{\varepsilon}$, where $\tilde{f}(u,\tilde{v}) = \tilde{g}(u) + \sum_{j=1}^{m}\tilde{h}_j(v_j)$. The nonparametric mixed regression estimator is $\hat{\tilde{f}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) + \sum_{j=1}^{m}\hat{\tilde{h}}_{j\phi_j}(v_j)$, where $\hat{\tilde{f}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \mathbf{Z}(\tilde{\xi},\tilde{\phi})\tilde{y}$, $\hat{\tilde{g}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v}) = \mathbf{S}(\tilde{\xi},\tilde{\phi})\tilde{y}$ and $\sum_{j=1}^{m}\hat{\tilde{h}}_{j\phi_j}(v_j) = \mathbf{V}(\tilde{\phi})\tilde{y}$. The mixed estimator $\hat{\tilde{f}}_{\tilde{\phi},\tilde{\xi}}(u,\tilde{v})$ depends on the location of the knot points, the number of knot points, and the bandwidth parameters. The best mixed estimator is associated with the optimal knot points and optimal bandwidth parameters, which are obtained from the smallest GCV value.

References
[1] R.L. Eubank, Nonparametric Regression and Spline Smoothing, Marcel Dekker, New York, (1999)
[2] C.H. Reinsch, Smoothing by Spline Functions, Numerische Mathematik, 10, 177-183, (1967)
[3] B.W. Silverman, Some Aspects of the Spline Smoothing Approach to Non-parametric Regression Curve Fitting, Journal of the Royal Statistical Society, Series B (Methodological), 47(1), 1-52, (1985)
[4] G. Wahba, Spline Models for Observational Data, Society for Industrial and Applied Mathematics, Philadelphia, (1990)
[5] H. Liang, Estimation in Partially Linear Models and Numerical Comparisons, Computational Statistics & Data Analysis, 50(3), 675-687, (2006)
[6] Y. Lin and H.H. Zhang, Component Selection and Smoothing in Multivariate Nonparametric Regression, The Annals of Statistics, 34(5), 2272-2297, (2006)
[7] A. Islamiyati and I.N. Budiantara, Model Spline dengan Titik-titik Knots dalam Regresi Nonparametrik, Jurnal INFERENSI, 3, 11-21, (2007)
[8] I.N. Budiantara, Spline Dalam Regresi Nonparametrik dan Semiparametrik: Sebuah Pemodelan Statistika Masa Kini dan Masa Mendatang, ITS Press, Surabaya, (2009)
[9] E.A. Nadaraya, Nonparametric Estimation of Probability Densities and Regression Curves, Kluwer Academic Publishers, (1989)
[10] T. Gasser and H.-G. Muller, Kernel Estimation of Regression Functions, Springer, Berlin-Heidelberg, (1979)
[11] W. Hardle, Applied Nonparametric Regression, Humboldt-Universität zu Berlin, Berlin, (1994)
[12] M.P. Wand and M.C. Jones, Kernel Smoothing, Chapman & Hall, London, (1995)
[13] J. You and G. Chen, Semiparametric Generalized Least Squares Estimation in Partially Linear Regression Models with Correlated Errors, Journal of Statistical Planning and Inference, 137(1), 117-132, (2007)
[14] M. Kayri and Zirhhoglu, Kernel Smoothing Function and Choosing Bandwidth for Nonparametric Regression Methods, Ozean Journal of Applied Sciences, 2, 49-54, (2009)
[15] J. Klemela, Multivariate Nonparametric Regression and Visualization: With R and Applications to Finance, John Wiley & Sons, New Jersey, (2014)
[16] I.N. Budiantara and Mulianah, Pemilihan Bandwidth Optimal dalam Regresi Semiparametrik Kernel dan Aplikasinya, SIGMA: Jurnal Sains dan Teknologi, 10(2), 159-166, (2007)
[17] I.N. Budiantara, V. Ratnasari, M. Ratna and I. Zain, The Combination of Spline and Kernel Estimator for Nonparametric Regression and Its Properties, Applied Mathematical Sciences, 9(122), 6083-6094, (2015)
[18] I.W. Sudiarsa, I.N. Budiantara, Suhartono and S.W. Purnami, Combined Estimator Fourier Series and Spline Truncated in Multivariable Nonparametric Regression, Applied Mathematical Sciences, 9(100), 4997-5010, (2015)

Proceedings of IICMA 2015
Statistics and Probability

Small Area Estimation with Bayesian Approach (Case Study: Proportions of Dropout Children in Poverty)
Amalia Noviani1,a), Kartika Fithriasari2,b), and Irhamah3,c)
1,2,3
Department of Statistics, Institute of Technology Sepuluh Nopember (ITS), Surabaya

a) amalia.noviani@gmail.com
b) kartika_f@statistika.its.ac.id
c) irhamahn@gmail.com

Abstract. Due to limited budgets, survey data originally designed to provide statistically reliable, design-based estimates of characteristics of interest at a high level of aggregation are also used to generate estimates at lower levels: a geographic subdivision, a demographic group, a demographic group within a geographic region, etc. Using one of the regular socio-economic surveys conducted by Statistics Indonesia, the proportions of dropout children in poverty are estimated. By borrowing strength from related areas, the estimation is done using small area estimation techniques. Since the variable of interest examined in this study is binary, the HB approach is used to attain the goal of this study. The results show that five cities have proportions of dropout children 7-15 years old in poverty of less than one percent, fourteen districts/cities are above two percent, and the other nineteen districts/cities have proportions of dropout children 7-15 years old in poverty around one percent.
Keywords and Phrases: Small Area Estimation, Bayesian, Dropout Children, Poverty.

1. Introduction
Sample surveys are commonly used for data collection in many areas,
particularly the social sciences. They have taken the place of complete enumeration
or census as a more cost-effective means of obtaining information on wide-ranging
topics of interest at frequent intervals over time. Nowadays, the demand for
statistical estimates for small areas using large-scale survey data has increased in
many different application areas including income and poverty, education, and
health. The same survey data, originally designed to provide statistically reliable, design-based estimates of characteristics of interest at a high level of aggregation (e.g., the national level, or large geographic domains such as provinces), are also used to generate estimates at a lower level (e.g., county, city, etc.), for a demographic group (e.g., age x sex), for a demographic group within a geographic region, etc. The usual direct survey estimators for a small area, based on data only from the sample units in the area, are likely to yield unacceptably large standard errors due to the unduly small sample size in the area.
Due to these difficulties, it is often necessary to employ indirect estimation
that borrows information from related areas through explicit (or implicit) linking
models, using census and administrative data associated with the small areas.
Indirect estimators based on explicit linking models or called small area models
have received a lot of attention in recent years because of the following advantages

over the traditional indirect estimators based on implicit models: (i) Explicit
model-based methods make specific allowance for local variation through complex
error structures in the model that link the small areas. (ii) Models can be validated
from the sample data. (iii) Methods can handle complex cases such as cross-
sectional and time series data, binary or count data, spatially-correlated data and
multivariate data. (iv) Area-specific measures of variability associated with the
estimates may be obtained, unlike overall measures commonly used with the
traditional indirect estimators [1].
Small area models can be classified into two broad types: area level models
that relate the small area means to area-specific auxiliary variables and unit level
models that relate the unit values of the study variable to unit-specific auxiliary
variables. Empirical Best Linear Unbiased Prediction (EBLUP) estimators,
parametric Empirical Bayes (EB) estimators, and parametric Hierarchical Bayes
(HB) estimators are the three best-known indirect estimators based on explicit linking models. There are some differences among the three approaches: EBLUP is applicable to linear mixed models that cover the basic area level and unit level models, whereas EB and HB are more generally applicable, covering generalized linear mixed models that are used to handle categorical (e.g., binary) and count data. MSE is used as a measure of variability under the EBLUP and EB approaches, while the HB approach uses the posterior variance as a measure of variability, assuming a prior distribution on the model parameters. In general, the HB approach has more advantages because it is straightforward, its inferences are "exact" unlike those of the EB (or EBLUP) approach, and it can handle complex small area models using MCMC methods, even though it requires the specification of a prior on the model parameters. Some applications of the EBLUP, EB, and HB approaches can be seen in [1], [2], [3], [4], [5], and [6].
This study aims to estimate the proportions of dropout children in poverty, so that the results can be taken into consideration in policy decisions related to education for school-age children, especially those from poor households. Even though East Java Province is considered to have successfully run the 9-year compulsory education program, according to the Education Department of East Java Province, for the academic year 2012/2013 the number of dropout children at the primary level was 4,848 (0.11 percent) and at the junior secondary level 6,858 (0.38 percent) [7]. One of the factors that influence the decision to drop out of school is the economic condition of the family. Research on dropout in elementary and junior high schools conducted by Statistics Indonesia [8], as well as research conducted by Santoso in [8] in rural areas of East Java, showed that cost limitations were the reason given by around 50 percent of respondents. School-age children from poor families are four times more likely to drop out of school than those from affluent families [9]. Research by the National Center for Education Statistics (NCES) also mentions that in the United States, school-age children from low-income families are five times more likely to drop out of school at the high school level than those from middle-income families, and six times more likely than school-age children from high-income families [10].
Every year, Statistics Indonesia issues a poverty rate calculated from the results of the National Socio-Economic Survey, or Survei Sosial Ekonomi Nasional (Susenas), down to the district/city level. However, the proportions of dropout children 7-15 years old in poverty cannot be calculated directly from the results of Susenas due to its insufficient sample size, even at the district/city level. Therefore, we need to use small area estimation techniques. Since the response variable is categorical (binary) data, the HB approach will be used to attain the goal of this study. The predictor variables are taken from East Java Province in Figures 2014 and the Education Statistical Executive Summary of East Java Province 2014, as follows: percentage of poor people (Z1), illiteracy rate (Z2), and student-class ratio (Z3) for each district/city in East Java Province in 2013.

2. Main Results
2.1. Small Area Model
Small area models can be classified into two broad types: basic area level
models and basic unit level models [1].
A. Basic Area Level Models
Basic area level models are essential if unit (element) level data are not available. In this model, only area-specific auxiliary data $z_i = (z_{1i}, z_{2i}, \dots, z_{pi})^T$, related to some suitable function $\theta_i = g(Y_i)$ of the small area total $Y_i$ $(i = 1, \dots, m)$, are used to develop a linking model of the form $\theta_i = z_i^T\beta + v_i$ with $v_i \sim N(0, \sigma_v^2)$, where $\sigma_v^2$ is the model variance. The linking model is combined with the matching sampling model $\hat{\theta}_i = \theta_i + e_i$, where $\hat{\theta}_i = g(\hat{Y}_i)$ is a direct estimator of $\theta_i$ and $e_i \mid \theta_i \sim N(0, \psi_i)$ with known sampling variance $\psi_i$. The combined model $\hat{\theta}_i = z_i^T\beta + v_i + e_i$ is a special case of the linear mixed model.
The basic area level model has at least two limitations. First, the assumption of known sampling variances $\psi_i$ is restrictive, although methods based on generalized variance functions (GVF) have been proposed to produce smoothed estimates of the $\psi_i$'s. Secondly, the assumption $E(e_i \mid \theta_i) = 0$ may not be tenable if the small area sample size $n_i$ is small and $\theta_i$ is a nonlinear function of the total $Y_i$, even if the direct estimator $\hat{Y}_i$ is design-unbiased for $Y_i$. It is more realistic to use the sampling model $\hat{Y}_i = Y_i + f_i$ with $E(f_i \mid Y_i) = 0$, which simply says that $\hat{Y}_i$ is design-unbiased for $Y_i$. Further, assume that $V(f_i \mid Y_i) = \sigma_i^2$, where the sampling variance may depend on $Y_i$; for example, $\sigma_i^2 = Y_i^2 c_i^2$, where $c_i$ is the known coefficient of variation of $\hat{Y}_i$ ascertained from fitting GVFs. The sampling model is now unmatched with the linking model in the sense that they cannot be combined directly to produce a linear mixed model. Various extensions of the basic area level model (also called the Fay-Herriot model) have been proposed to handle correlated sampling errors, spatial dependence of the model errors $v_i$, and time-series and cross-sectional data [1].
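To make the matched sampling and linking models concrete, the short simulation below (our illustration with made-up parameter values, not data from any survey) draws from the combined model and applies the familiar shrinkage weight $\gamma_i = \sigma_v^2/(\sigma_v^2 + \psi_i)$, assuming $\beta$ and $\sigma_v^2$ known:

    import numpy as np

    rng = np.random.default_rng(1)
    m, beta, sigma_v2 = 30, np.array([1.0, 0.5]), 0.25
    Z = np.column_stack([np.ones(m), rng.normal(size=m)])  # area covariates z_i
    psi = rng.uniform(0.1, 0.5, size=m)                    # known sampling variances

    theta = Z @ beta + rng.normal(0.0, np.sqrt(sigma_v2), m)  # linking model
    theta_hat = theta + rng.normal(0.0, np.sqrt(psi))          # sampling model

    gamma = sigma_v2 / (sigma_v2 + psi)                        # shrinkage weights
    theta_comp = gamma * theta_hat + (1 - gamma) * (Z @ beta)  # composite estimator
    # Borrowing strength typically reduces the error of the direct estimates:
    print(np.mean((theta_comp - theta) ** 2), np.mean((theta_hat - theta) ** 2))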
B. Basic Unit Level Models
In the basic unit level model, unit level auxiliary variables $x_{ij} = (x_{ij1}, x_{ij2}, \dots, x_{ijp})^T$ are related to the unit $y$-values $y_{ij}$ through a nested error linear regression model $y_{ij} = x_{ij}^T\beta + v_i + e_{ij}$, where $v_i \sim N(0, \sigma_v^2)$ independent of $e_{ij} \sim N(0, \sigma_e^2)$. Various extensions of the basic unit-level model have been proposed to handle binary responses, two-stage sampling within areas, multivariate responses and others [1].

2.2. Hierarchical Bayesian (HB) in Small Area Estimation (SAE)
Recently, the HB approach has been proposed for small area estimation problems because it has many advantages: (i) its specification is straightforward and allows one to take into account the different sources of variation, and (ii) inferences are clear-cut and computationally feasible in most cases by using standard MCMC techniques (Trevisani and Torelli [11]). This study uses an HB version of the logit-normal model with area level covariates $z_i$ that can be expressed as
(i) 𝑦𝑖 |𝑝𝑖 ~𝑏𝑖𝑛𝑜𝑚𝑖𝑎𝑙(𝑛𝑖 , 𝑝𝑖 ), 𝑝𝑖 is proportions of dropout children in poverty for
district/city 𝑖, 𝑖 = 1, … , 𝑚.
(ii) Model A : 𝑙𝑜𝑔𝑖𝑡(𝑝𝑖 ) = 𝛽0 + 𝛽1 𝑍1 + 𝑣𝑖 ; 𝑣𝑖 ~𝑁(0, 𝜎𝑣2 )
Model B : 𝑙𝑜𝑔𝑖𝑡(𝑝𝑖 ) = 𝛽0 + 𝛽1 𝑍2 + 𝑣𝑖 ; 𝑣𝑖 ~𝑁(0, 𝜎𝑣2 )
Model C : 𝑙𝑜𝑔𝑖𝑡(𝑝𝑖 ) = 𝛽0 + 𝛽1 𝑍3 + 𝑣𝑖 ; 𝑣𝑖 ~𝑁(0, 𝜎𝑣2 )
Model D : 𝑙𝑜𝑔𝑖𝑡(𝑝𝑖 ) = 𝛽0 + 𝛽1 𝑍1 + 𝛽2 𝑍2 + 𝑣𝑖 ; 𝑣𝑖 ~𝑁(0, 𝜎𝑣2 )
Model E : 𝑙𝑜𝑔𝑖𝑡(𝑝𝑖 ) = 𝛽0 + 𝛽1 𝑍1 + 𝛽2 𝑍3 + 𝑣𝑖 ; 𝑣𝑖 ~𝑁(0, 𝜎𝑣2 )
Model F : 𝑙𝑜𝑔𝑖𝑡(𝑝𝑖 ) = 𝛽0 + 𝛽1 𝑍2 + 𝛽2 𝑍3 + 𝑣𝑖 ; 𝑣𝑖 ~𝑁(0, 𝜎𝑣2 )
Model G : 𝑙𝑜𝑔𝑖𝑡(𝑝𝑖 ) = 𝛽0 + 𝛽1 𝑍1 + 𝛽2 𝑍2 + 𝛽3 𝑍3 + 𝑣𝑖 ; 𝑣𝑖 ~𝑁(0, 𝜎𝑣2 )
(iii) 𝛽 and 𝜎𝑣2 are mutually independent with 𝜎𝑣−2 ~𝐺(𝑎, 𝑏), 𝑎 ≥ 0, 𝑏 > 0 and
𝛽~𝑁(0, 𝑝𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛)
This study uses a proper, noninformative conjugate Gamma(0.0001, 0.0001) prior on the precision parameter of the area random effect (hyperprior). The Markov chain used is a single run with a burn-in period of 100 and different iteration and thinning periods for each model, based on the convergence diagnostics of the Markov chain. Markov chain convergence diagnostics can be performed through trace plots, density plots, and autocorrelation plots.
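For illustration, a minimal sketch of model C in PyMC follows (ours; the data arrays are synthetic placeholders, and the software and tuning choices here need not match those used for the results below):

    import numpy as np
    import pymc as pm

    rng = np.random.default_rng(2)
    m = 38                               # districts/cities in East Java
    z3 = rng.uniform(20, 40, size=m)     # placeholder student-class ratios
    n_i = rng.integers(50, 200, size=m)  # placeholder poor children sampled
    y_i = rng.binomial(n_i, 0.02)        # placeholder dropout counts

    with pm.Model():
        tau_v = pm.Gamma("tau_v", alpha=1e-4, beta=1e-4)   # hyperprior on precision
        sigma_v = pm.Deterministic("sigma_v", 1.0 / pm.math.sqrt(tau_v))
        beta0 = pm.Normal("beta0", mu=0.0, sigma=100.0)
        beta1 = pm.Normal("beta1", mu=0.0, sigma=100.0)
        v = pm.Normal("v", mu=0.0, sigma=sigma_v, shape=m)  # area random effects
        p = pm.Deterministic("p", pm.math.invlogit(beta0 + beta1 * z3 + v))
        pm.Binomial("y", n=n_i, p=p, observed=y_i)
        trace = pm.sample(2000, tune=1000, random_seed=2)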

2.3. An Illustration with Proportions of Dropout Children in Poverty Data
Table 1 shows the burn-in, iteration, and thinning periods, together with the estimated parameter values for each model once the Markov chain has converged. All parameters in model C are significant at the 90 percent credible interval. Table 1 also shows the DIC value for each model, which indicates how well the model fits the data. The smallest DIC value is also produced by model C. Thus, the best model for estimating the proportions of dropout children in poverty is model C, which can be formulated as follows:
$$\mathrm{logit}(p_i) = -2.803 - 0.047\,Z_3 \qquad (4.1)$$
The negative sign of the estimated value of $\beta$ indicates that an increase in the value of $Z$ causes a decrease in the log odds of the proportion, and vice versa. The estimated value $\hat{\beta}_1 = -0.047$ indicates that for every one-student increase in the student-class ratio in a district/city, the odds of the proportion of dropout children 7-15 years old in poverty decrease by a factor of about 1.048. An increase in the number of students in a class may encourage children from poor households to stay in school because they can gain knowledge, play, socialize, and perhaps forget for a moment the economic burden and problems faced by their families.

TABLE 1. Burn-in Period, Iteration, Thin, and Results of Parameter Estimation

Model  Burn-in  Iteration  Thin  Parameter  Mean    Credible Interval (5%, 95%)  DIC
A      100      4,000,000  400   β0         -4.171  (-4.964, -3.416)             93.682
                                 β1          0.007  (-0.040, 0.055)
B      100      3,500,000  350   β0         -4.089  (-4.655, -3.566)             93.755
                                 β1          0.002  (-0.042, 0.047)
C      100      4,000,000  400   β0         -2.803  (-4.018, -1.496)             91.065
                                 β1         -0.047  (-0.097, -0.002)
D      100      3,250,000  320   β0         -4.244  (-5.133, -3.435)             95.592
                                 β1          0.020  (-0.068, 0.110)
                                 β2         -0.013  (-0.097, 0.070)
E      100      3,500,000  375   β0         -2.679  (-4.350, -0.980)             92.992
                                 β1         -0.006  (-0.059, 0.046)
                                 β2         -0.048  (-0.100, -0.002)
F      100      2,600,400  301   β0         -2.388  (-4.052, -0.718)             92.502
                                 β1         -0.021  (-0.073, 0.029)
                                 β2         -0.055  (-0.108, -0.005)
G      100      2,750,000  300   β0         -2.515  (-4.132, -0.848)             93.817
                                 β1          0.049  (-0.050, 0.152)
                                 β2         -0.062  (-0.161, 0.034)
                                 β3         -0.062  (-0.117, -0.012)

With equation (4.1), the proportions of dropout children 7-15 years old in poverty at the district/city level in East Java Province are estimated. As shown in Table 2, the proportions of dropout children 7-15 years old in poverty in Kediri City, Probolinggo City, Mojokerto City, Madiun City, and Batu City are each less than one percent. Pacitan, Trenggalek, Blitar, Jember, Situbondo, Probolinggo, Pasuruan, Sidoarjo, Magetan, Ngawi, Bojonegoro, Lamongan, Pamekasan, and Sumenep each have proportions above two percent, while the other nineteen districts/cities have proportions around one percent. The distribution of the proportions of dropout children 7-15 years old in poverty in East Java Province for the year 2013 is shown in Figure 1.

TABLE 2. Proportions of Dropout Children 7-15 Years Old in Poverty by District/City, 2013

District/City      Mean   Std. Deviation  Credible Interval (5%, 95%)
Pacitan            2.444  0.766           (1.416, 3.716)
Ponorogo           1.855  0.487           (1.106, 2.585)
Trenggalek         2.003  0.579           (1.318, 2.929)
Tulungagung        1.909  0.485           (1.100, 2.678)
Blitar             2.417  0.804           (1.460, 3.732)
Kediri             1.841  0.482           (1.169, 2.613)
Malang             1.625  0.438           (0.947, 2.280)
Lumajang           1.507  0.537           (0.886, 2.339)
Jember             2.130  0.573           (1.354, 3.075)
Banyuwangi         1.872  0.495           (1.130, 2.640)
Bondowoso          1.968  0.502           (1.312, 2.815)
Situbondo          2.019  0.534           (1.181, 2.896)
Probolinggo        2.068  0.578           (1.397, 3.102)
Pasuruan           2.008  0.538           (1.338, 2.934)
Sidoarjo           2.501  0.859           (1.454, 3.859)
Mojokerto          1.727  0.481           (1.050, 2.434)
Jombang            1.826  0.485           (1.153, 2.557)
Nganjuk            1.882  0.479           (1.257, 2.679)
Madiun             1.867  0.486           (1.133, 2.605)
Magetan            2.592  0.881           (1.476, 4.087)
Ngawi              2.131  0.589           (1.240, 3.088)
Bojonegoro         2.000  0.527           (1.163, 2.840)
Tuban              1.777  0.499           (1.135, 2.539)
Lamongan           2.544  0.880           (1.400, 3.997)
Gresik             1.877  0.506           (1.218, 2.679)
Bangkalan          1.494  0.450           (0.903, 2.255)
Sampang            1.787  0.440           (1.030, 2.462)
Pamekasan          2.103  0.551           (1.240, 3.005)
Sumenep            2.407  0.729           (1.329, 3.686)
Kediri City        0.845  0.479           (0.247, 1.751)
Blitar City        1.085  0.449           (0.453, 1.886)
Malang City        1.433  0.427           (0.785, 2.087)
Probolinggo City   0.939  0.446           (0.342, 1.783)
Pasuruan City      1.050  0.454           (0.424, 1.882)
Mojokerto City     0.662  0.554           (0.108, 1.732)
Madiun City        0.946  0.474           (0.324, 1.834)
Surabaya City      1.474  0.482           (0.863, 2.221)
Batu City          0.855  0.487           (0.250, 1.805)

FIGURE 1. Distribution of proportions of dropout children 7 – 15 years old in
poverty, 2013

3. Concluding Remarks
This study presents an HB estimation procedure for an ensemble of parameters related to the proportions of dropout children 7-15 years old in poverty. The study uses a proper, noninformative conjugate Gamma(0.0001, 0.0001) prior on the precision parameter of the area random effect (hyperprior). The HB method accounts for the uncertainty involved in estimating the mean and variance of the prior parameters by assigning distributions to them. Moreover, the HB approach provides standard errors along with the point estimates. Of all the model combinations produced, the best model is obtained by using the student-class ratio as the predictor variable, and the results are described as follows: five cities have proportions of dropout children 7-15 years old in poverty of less than one percent, fourteen districts/cities are above two percent, and the other nineteen districts/cities have proportions around one percent. Suggestions for future research include using other variables that may influence the proportions of dropout children 7-15 years old in poverty, and incorporating spatial elements into the Bayesian model.

References
[1] J.N.K. Rao, Some New Developments in Small Area Estimation, JIRSS, 2, 145-169, (2003)
[2] G.S. Datta, P. Lahiri, and T. Maiti, Empirical Bayes Estimation of Median Income of Four-Person Families by State Using Time Series and Cross-Sectional Data, Journal of Statistical Planning and Inference, 102, 83-97, (2002)
[3] B. Liu, Adaptive Hierarchical Bayes Estimation of Small-Area Proportions, JSM, 3785-3799, (2009)
[4] S. Song, Small Area Estimation of Unemployment: From Feasibility to Implementation, Paper presented at the New Zealand Association of Economists Conference, Wellington, (2011)
[5] A. Ubaidillah, N. Iriawan, B.S. Ulama, and K. Fithriasari, Pemodelan Kemiskinan dengan Menggunakan Metode Hierarchical Bayesian Neural Network (Studi Kasus Pada Sampel Rumah Tangga Survei Sosial Ekonomi Nasional Tahun 2011 di Kota Jambi), Proceedings of the Jenderal Soedirman University National Conference, Jenderal Soedirman University, Purwokerto, (2013)
[6] A. Arrosid, Penerapan Metode Spatial Empirical Best Linear Unbiased Prediction pada Small Area Estimation untuk Estimasi Angka Pengangguran Tingkat Kecamatan di Propinsi Sulawesi Utara, Thesis, Institut Teknologi Sepuluh Nopember (ITS), (2014)
[7] Department of Education, Statistik Pendidikan Formal Tahun Pelajaran 2012/2013, Department of Education East Java Province, (2014)
[8] Statistics Indonesia, Analisa Pendidikan Putus Sekolah di SD dan SMTP, (1982)
[9] Y. Sulistyoningrum, UNICEF: 2,5 Juta Anak Indonesia Putus Sekolah, (2015). Available at http://kabar24.bisnis.com/read/20150623/255/446327/unicef-25-juta-anak-indonesia-putus-sekolah-.
[10] S. Sikhan, Low-Income Students Six Times More Likely to Drop Out of High School, (2013). Available at https://www.wsws.org/en/articles/2013/04/10/hsdo-a10.html
[11] M. Trevisani and N. Torelli, Hierarchical Bayesian Models for Small Area Estimation with Count Data, Working Paper, Universita Degli Studi di Trieste, (2007)

Proceedings of IICMA 2015
Statistics and Probability

An Application of Bayesian Adaptive Lasso Quantile Regression to Estimate the Effect of Return to Education on Earning
Zablin1,a), Irhamah2,b), and Dedy Dwi Prastyo3,c)
1,2,3
Department of Statistics, Institut Teknologi Sepuluh Nopember, Jl. Arief Rahman
Hakim, Surabaya 60111, Indonesia

a) zablin@bps.go.id
b) irhamah@statistika.its.ac.id
c) dedy-dp@statistika.its.ac.id

Abstract. Education plays an important role in transferring skill and knowledge toward increasing productivity and earnings. Using the so-called Mincer earning function, we investigated the effect of years of schooling, commonly known as the return to education, on earnings across quantiles. By specifying the effect of a covariate at different quantile levels, we allow the covariate to affect the response variable not only at the center of its distribution but also in its spread. We employed two methods to estimate the parameters of the Mincer equation: (i) Bayesian quantile regression (BQR) and (ii) Bayesian quantile regression with an adaptive least absolute shrinkage and selection operator (lasso) penalty (BALQR). The latter method extends the Bayesian lasso penalty term by employing a different penalty function with an adaptive tuning parameter accommodated in the inverse gamma prior distribution. The data used in this paper are samples of workers in the agricultural sector in South Sulawesi. Empirical results showed that BALQR outperformed BQR because it resulted in smaller mean squared error (MSE). The two methods showed that, in general, the return to education is higher at the top of the earnings distribution than at the bottom. Workers in the agricultural sector with higher earnings received a higher return to education than lower-earning workers.

1. Introduction
Indonesian government (central and regional), started from 2009, allocated
around twenty percent of spending to education as a mandatory according to
constitutions 1945 and constitutional court decision number 13/PUU-VI.I.2008.
However, many provincial governments did not allocate their bugdet up to twenty
percent for education. Instead, they had various program for increasing schools
enrolments. For example, South Sulawesi provincial government provides free
education. The government believes that investment in education is a key to
success in economics. The education is believed to plays an important role to
transfers skill and knowledge in order to increase the productivity and earnings of
society.

Mincer [12] published his study on schooling, experience and earnings after estimation on thousands of data sets for a large number of countries and time periods. Departing from theoretical and empirical arguments, Mincer modeled the natural logarithm of earnings as a function of years of schooling and years of potential experience (age minus years of schooling). The model is widely used and became popular as the Mincer equation. This equation is now a formal model of investment in human capital. Moreover, another reason for the wide use of the Mincer equation is that it provides a parsimonious specification that fits the data remarkably well in most contexts. The Mincer equation was employed by Purnastuti, Miller and Salim [13] to estimate the rate of return to education in Indonesia using data from the Indonesian Family Life Survey. The research found a declining rate of return to education between 1993 and 2007. Comola and de Mello [3] employed ordinary least squares (OLS), Heckman's approach, and multinomial selection to estimate the Mincer equation in Indonesia using data from the National Labor Force Survey (Sakernas) 2004. The study reported that the return to education coefficient obtained from OLS is about 9.5%-10.3%. Heckman's approach resulted in estimates of about 10.8%-11.6%, whereas the multinomial selection method estimated the coefficient to range from 10.2%-11.2%.
OLS regression only captures the effect of a predictor on the change in the response variable's mean. In certain cases this is not informative enough: scientists need more information about how a predictor affects the response variable along its distribution. To overcome this problem, Koenker and Basset [6] extended median regression to quantile regression (QR). As an alternative to conditional mean regression, QR can be readily extended to tail locations, where social science research is often of interest. QR is robust to outliers. It has been applied to the Mincer equation; see (among others) Buchinsky [2]. At the beginning, the Least Absolute Deviation (LAD) method was employed to estimate the parameters of QR. Koenker and Machado [7] showed that the loss function in QR corresponds to the density function of the Asymmetric Laplace Distribution (ALD). Yu and Moyeed [15], Yu, Kerm and Zhang [16], as well as Li, Xi and Lin [11], employed the ALD to estimate the parameters of QR from a Bayesian perspective.
Tibshirani [14] proposed a regularization technique, the so-called least absolute shrinkage and selection operator (lasso), for simultaneous estimation and variable selection. The elastic-net was introduced by Zou and Hastie [17] to overcome a weakness of the lasso. Haerdle and Prastyo [5] applied the lasso and elastic-net in a logit model for default risk calculation. Zou [18] extended the lasso by applying different penalties to different coefficients. This new methodology is called the adaptive lasso. Alhamzawi, Yu, and Benoit [1] applied the adaptive lasso penalty to Bayesian quantile regression.
A number of studies using QR have investigated how schooling affects earnings at different points of the earnings distribution, not only at the mean. Such studies have important policy implications. For instance, do poor individuals earn lower returns on the same level of education than the rich? If so, does it happen because the poor have access only to poor-quality schooling? In that case, the quality of the schools the poor attend needs to be improved. Yet poor individuals might also earn a lower return on the same level of education because individuals with characteristics other than schooling, such as ability and motivation, tend to benefit more from education [4]. Following previous studies, this paper employs quantile regression to estimate the Mincer equation. The aim of this study is to provide a complete view of the relationship between education and earnings in South Sulawesi in 2014. The parameter estimation method is based on a Bayesian perspective with an adaptive lasso penalty.

2. Main Results
2.1. Quantile Regression and Asymmetric Laplace Distribution
The relationship between a response variable $Y$ and predictors $X$ can be modelled by the linear regression
$$y = X\beta + \varepsilon, \qquad (1)$$
where $\beta = (\beta_0\ \beta_1\ \beta_2 \cdots \beta_k)'$ is the vector of parameters and $\varepsilon$ is the vector of errors. The estimate of $\beta$ can be obtained by least squares, minimizing the residual sum of squares $\sum_{i=1}^{n}(y_i - x_i'\beta)^2$. The estimator of $\beta$ in quantile regression is instead obtained by minimizing the sum of errors weighted by the check function $\rho_\tau(u)$, where $\tau \in (0,1)$ is the quantile:
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau(y_i - x_i'\beta), \qquad (2)$$
where
$$\rho_\tau(\varepsilon) = \varepsilon\{\tau - I(\varepsilon < 0)\}. \qquad (3)$$
The loss function in (3) is equivalent to the probability density function of the ALD,
$$f_\tau(\varepsilon) = \tau(1-\tau)\exp\{-\rho_\tau(\varepsilon)\}, \qquad (4)$$
with $\rho_\tau(\varepsilon)$ as defined in (3). Thus, minimizing the loss function in QR is exactly equivalent to maximizing the likelihood function of independent observations that follow the ALD.
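For reference, the check loss (3) and the ALD density (4) take only a few lines; this is a generic sketch (ours), not the authors' implementation:

    import numpy as np

    def check_loss(eps, tau):
        # rho_tau(eps) = eps * (tau - I(eps < 0)), equation (3)
        eps = np.asarray(eps, dtype=float)
        return eps * (tau - (eps < 0))

    def ald_density(eps, tau):
        # f_tau(eps) = tau (1 - tau) exp(-rho_tau(eps)), equation (4)
        return tau * (1.0 - tau) * np.exp(-check_loss(eps, tau))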

2.2. Bayesian Quantile Regression (BQR)
The QR is a linear model as in (1) with $\varepsilon$ having the ALD density in (4). Kotz, Kozubowski and Podgorski [10] showed that the ALD has various mixture representations. One of them is a mixture of normal and exponential distributions, as described in Lemma 4.1.
Lemma 4.1. Let $u \sim \mathrm{Exp}(1)$, $z \sim N(0,1)$, $\tau \in (0,1)$,
$$p = \frac{1 - 2\tau}{\tau(1-\tau)} \qquad \text{and} \qquad q = \sqrt{\frac{2}{\tau(1-\tau)}};$$
then the random variable $\varepsilon = pu + q\sqrt{u}\,z$ follows the asymmetric Laplace distribution with the density in (4).
distribution with density in (4).
Based on Lemma 4.1, the linear model in (1) is equivalent with:
𝑦𝑖 = 𝒙′𝑖 𝜷𝜏 + 𝑝𝑢𝑖 + 𝑞 √𝑢𝑖 𝑧𝑖 , 𝑖 = 1,2, … , 𝑛. (5)
𝑦𝑖 −𝒙′𝑖 𝜷𝜏 − 𝑝𝑢𝑖 1
If 𝑧𝑖 = 𝑞 √𝑢𝑖
and 𝑑𝑧𝑖 = 𝑞 𝑑𝑦𝑖 , then the density of 𝑦𝑖 conditional
√𝑢𝑖
on 𝑢𝑖 :
2
∞ 1 (𝑦 −𝒙′𝑖 𝜷𝜏 −𝑝𝑢𝑖 ) 1
𝑓(𝑦𝑖 |𝑢𝑖 ) = ∫−∞ √2𝜋 exp(− 𝑖 2𝑞 2𝑢
)𝑞 𝑑𝑦𝑖 (6)
𝑖 √𝑢𝑖

153
is normal distribution with mean 𝜇 = 𝒙′𝑖 𝜷𝜏 + 𝑝𝑢𝑖 and variance 𝜎 2 = 𝑞 2 𝑢𝑖 .
Adding scale parameter 𝜎 to model (5) we have (7):
𝑦𝑖 = 𝒙′𝑖 𝜷𝜏 + 𝜎 −1 𝑝𝑢𝑖 + 𝜎 −1 𝑞 √𝑢𝑖 𝑧𝑖 . (7)
Dealing with gibbs sampling process, we need to transform (7) on to (8):
𝑦𝑖 = 𝒙′𝑖 𝜷𝜏 + 𝑝𝑣𝑖 + 𝜎 −1⁄2 𝑞 √𝑣𝑖 𝑧𝑖 , (8)
where 𝑣𝑖 = 𝜎−1 𝑢𝑖 . The joint conditional density of 𝑦 given 𝜷𝜏 , 𝜎, 𝑣 is:
2
1 (𝑦𝑖 −𝒙′𝑖 𝜷𝜏 −𝑝𝑣𝑖 )
𝑓(𝒚|𝜷𝜏 , 𝜎, 𝑣) = ∏𝑛𝑖=1 exp (− ). (9)
√2𝜋 𝜎−1/2 𝑞√𝑣𝑖 2𝜎 −1 𝑞 2 𝑣𝑖

In the Bayesian framework, the model parameters are considered random variables and therefore have prior distributions. The priors for the Bayesian quantile regression in (8) are:
a) The prior for $\beta_\tau$ is $N(b_0, B_0)$.
b) The prior for $\sigma$ is $IG(a, b)$, where IG stands for the Inverse Gamma distribution.
c) The prior for $v_i$ is $\mathrm{Exp}(\sigma)$.
Using the Box-Tiao method described in Li et al. [11], the posterior distribution is defined as
$$\pi(\beta_\tau, v, \sigma \mid y) \propto f(y \mid \beta_\tau, v, \sigma)\,\pi(\beta_\tau \mid \sigma, v)\,\pi(v \mid \sigma)\,\pi(\sigma).$$

2.3. BQR with Adaptive Lasso (BALQR)
Zou [18] proposed an alternative shrinkage method with different weights for different coefficients, the so-called adaptive lasso, formulated as
$$\hat{\beta}(\text{alasso}) = \arg\min_{\beta}\left\|y - \sum_{j=1}^{k}X_j\beta_j\right\|^2 + \lambda\sum_{j=1}^{k}\hat{w}_j|\beta_j|, \qquad (10)$$
with weight vector $\hat{w}_j = 1/|\hat{\beta}_j|^{\gamma}$, $\gamma > 0$, and $\lambda > 0$.
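As a non-Bayesian aside, the adaptive lasso (10) can be computed by reweighting the columns of an ordinary lasso and rescaling the result. A sketch (ours, using scikit-learn; note that sklearn's alpha is scaled differently from the $\lambda$ in (10)):

    import numpy as np
    from sklearn.linear_model import Lasso, LinearRegression

    def adaptive_lasso(X, y, lam=0.1, gamma=1.0):
        # Weights w_j = 1 / |beta_ols_j|^gamma as below equation (10)
        beta_ols = LinearRegression().fit(X, y).coef_
        w = 1.0 / (np.abs(beta_ols) ** gamma + 1e-8)
        fit = Lasso(alpha=lam).fit(X / w, y)  # lasso on reweighted columns
        return fit.coef_ / w                  # map back: beta_j = beta_tilde_j / w_j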
Considering the adaptive lasso in (10), Li et al. [11] imposed the Laplace prior distribution on the parameter $\beta$:
$$\pi(\beta \mid \tau, \lambda) = \left(\frac{\tau\lambda}{2}\right)^{k}\exp\left\{-\tau\lambda\sum_{j=1}^{k}|\beta_j|\right\}. \qquad (11)$$
The Laplace prior (11) can be represented as a scale mixture of normal and exponential densities:
$$\frac{w}{2}\exp\{-w|t|\} = \int_{0}^{\infty}\frac{1}{\sqrt{2\pi s}}\exp\left\{-\frac{t^2}{2s}\right\}\frac{w^2}{2}\exp\left\{-\frac{w^2 s}{2}\right\}ds.$$
Letting $w_j = \sigma^{1/2}/\lambda_j$, equation (11) can be written as
$$\pi(\beta_j \mid \sigma, \lambda_j^2) = \int_{0}^{\infty}\frac{1}{\sqrt{2\pi s_j}}\exp\left\{-\frac{\beta_j^2}{2s_j}\right\}\frac{\sigma}{2\lambda_j^2}\exp\left\{-\frac{\sigma s_j}{2\lambda_j^2}\right\}ds_j. \qquad (12)$$
Alhamzawi et al. [1] then employed the IG distribution as a prior on $\lambda_j^2$:
$$\pi(\lambda_j^2 \mid \delta, \theta) = \frac{\theta^{\delta}}{\Gamma(\delta)}(\lambda_j^2)^{-1-\delta}\exp\left\{-\frac{\theta}{\lambda_j^2}\right\}, \qquad (13)$$
where $\delta > 0$ and $\theta > 0$ are hyper-parameters. The posterior density of $\lambda_j^2$ is obtained by combining (12) and (13); the shape and scale parameters of the posterior inverse gamma are $(1 + \delta)$ and $(s_j/2\sigma) + \theta$, respectively. The amount of shrinkage depends on the hyper-parameters $\delta$ and $\theta$.
The Bayesian hierarchical model for BALQR proposed by Alhamzawi et al. [1] is:
$$y_i = \beta_0 + x_i'\beta_\tau + pv_i + q\sqrt{\sigma^{-1}v_i}\,z_i,$$
$$\pi(\beta_0) \propto 1, \qquad \pi(z_i) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{z_i^2}{2}\right), \qquad \pi(v_i \mid \sigma) = \sigma\exp(-\sigma v_i),$$
$$\pi(\beta_j, s_j \mid \sigma, \lambda_j^2) = \frac{1}{\sqrt{2\pi s_j}}\exp\left(-\frac{\beta_j^2}{2s_j}\right)\frac{\sigma}{2\lambda_j^2}\exp\left(-\frac{\sigma s_j}{2\lambda_j^2}\right),$$
$$\pi(s_j \mid \lambda_j^2) = \frac{\lambda_j^2}{2}\exp\left(-\frac{\lambda_j^2}{2}s_j\right), \qquad \pi(\lambda_j^2 \mid \delta, \theta) = \frac{\theta^{\delta}}{\Gamma(\delta)}(\lambda_j^2)^{-1-\delta}\exp\left(-\frac{\theta}{\lambda_j^2}\right),$$
$$\pi(\theta, \delta) \propto \theta^{-1}, \qquad \pi(\sigma) = \sigma^{-(a+1)}\exp\left(-\frac{b}{\sigma}\right).$$

The posterior distribution of all parameters is
$$\pi(\beta_0, \beta, v, s, \sigma, \lambda \mid y, x) \propto \pi(y \mid \beta_0, \beta, v, s, \sigma, x)\prod_{i=1}^{n}\pi(v_i \mid \sigma)\prod_{j=1}^{k}\pi(\beta_j, s_j \mid \sigma, \lambda_j^2)\,\pi(\lambda_j^2 \mid \delta, \theta)\,\pi(\sigma)\,\pi(\theta, \delta).$$

2.4. Data

The method previously described was applied to data collected from Sakernas 2014 in South Sulawesi province. The observations are workers in the agricultural sector, above 15 years old, who are self-employed, regular employees, or casual employees (freelance). The quantile regression model for the Mincer equation is
$$\ln(Y) = \beta_{\tau 0} + \beta_{\tau 1}X_1 + \beta_{\tau 2}X_2 + \beta_{\tau 3}X_2^2 + \varepsilon,$$
where $\varepsilon$ is assumed to follow the ALD. The descriptions of the response variable $Y$ and the predictors $X$ are given in Table 1. The parameter estimates $\beta_\tau$ from the BALQR method were compared with the ones obtained using BQR.

TABLE 1. Variables

Variable                   Description
Y   Hourly earnings        Hourly earnings from main employment
X1  Years of schooling     Number of years of education completed
X2  Potential experience   Age minus years of schooling
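Before turning to the Bayesian fits, a classical quantile regression offers a quick sanity check of this specification. The sketch below (ours, on synthetic data shaped like Table 1, not the authors' Sakernas data) uses statsmodels' QuantReg:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 500
    school = rng.integers(0, 16, n)      # X1: years of schooling
    exper = rng.integers(0, 40, n)       # X2: potential experience
    ln_y = (8.5 + 0.04 * school + 0.02 * exper - 1e-4 * exper**2
            + rng.normal(0.0, 0.8, n))   # synthetic log hourly earnings

    X = sm.add_constant(np.column_stack([school, exper, exper**2]))
    for tau in (0.1, 0.5, 0.9):
        fit = sm.QuantReg(ln_y, X).fit(q=tau)
        print(tau, fit.params[1])        # beta_tau1: return to education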

2.5. Empirical Results

Human capital theory states that workers with more education receive higher earnings than less educated workers as a result of increased productivity. Figure 1 shows the average of ln(hourly earnings) by education level. From the primary school level upward, the average earnings are monotonically increasing: the higher the education level, the higher the average hourly earnings.

FIGURE 1. Mean of ln(hourly earnings) (Rp) by education level (NCS: not complete school, PS: primary school, YS: junior school, SS: senior school, HE: higher education)

TABLE 2. Mean Square Error (MSE) of BQR and BALQR


Quantile BQR BALQR
0.05 2.25899 2.26278
0.1 1.60409 1.59584
0.2 1.05954 1.05913
0.3 0.79104 0.79072
0.4 0.66990 0.67033
0.5 0.62832 0.62826
0.6 0.65764 0.65733
0.7 0.75808 0.75789
0.8 1.01032 1.00998
0.9 1.57775 1.57561
0.95 3.14460 2.62837

The mean square errors (MSE) of the BQR and BALQR models for each quantile are presented in Table 2. BALQR resulted in a smaller MSE than BQR at all quantile levels except the 0.05 and 0.4 quantiles; thus, the BALQR method outperformed BQR. The coefficient $\beta_{\tau 1}$ corresponds to the years of schooling variable, whereas $\beta_{\tau 2}$ and $\beta_{\tau 3}$ correspond to the experience variable and experience squared, respectively. The estimated parameters of the Mincer equation obtained by the BALQR method are reported in Table 3.
Figure 2 shows the estimated quantile lines $\ln(Y) = \beta_{\tau 0} + \beta_{\tau 1}X_1$ for $\tau = \{0.05, 0.1, \dots, 0.9, 0.95\}$, which are parallel. The empirical results indicate that the variance of the error is constant over quantiles (and thus for the dependent variable). All quantile regression lines have the same slope, even for the extreme quantiles (0.05 and 0.95). The slope of the 0.05 quantile regression line shows how earnings change with years of schooling along the 5th percentile of the earnings distribution at each value of years of schooling.
TABLE 3. Estimator for 𝛽𝜏 Obtained from BALQR Method
Quantile (𝜏) 𝛽𝜏1 𝛽𝜏2 𝛽𝜏3
0.05 0.035000 0.018200 -0.000037
0.10 0.032600 0.011400 0.000003
0.20 0.037300 0.013000 0.000039
0.30 0.033119 0.018551 -0.000113
0.40 0.034742 0.019255 -0.000124
0.50 0.041561 0.020516 -0.000129
0.60 0.043976 0.024809 -0.000158
0.70 0.044035 0.027832 -0.000194
0.80 0.046499 0.032658 -0.000243
0.90 0.042814 0.024264 -0.000183
0.95 0.043100 0.002080 -0.000094

FIGURE 2. Fitted quantile regression lines (parallel across quantiles). FIGURE 3. Plot of the estimated $\beta_{\tau 1}$ over quantile $\tau$.

In human capital theory, the coefficient of years of schooling is known as the return to education. The return to education varies along the earnings distribution. As illustrated in Figure 3, there is no clear pattern in the estimated return-to-education coefficient for quantiles below 0.3; the average return captures a similar profitability of education there. There is a monotone increasing pattern in the estimated coefficient from the 0.3 up to the 0.8 quantile level, after which it declines. This change in pattern suggests that different policies are needed to overcome the inequality in the return to education. Figure 3 also makes clear that the coefficients of years of schooling are higher at the top quantiles (close to one) than at the bottom quantiles (close to zero). This indicates that people with the same level of education receive different returns conditional on their earnings. In general, workers in the agricultural sector with higher earnings received a higher return to education than lower-earning workers.

3. Concluding Remarks

The effect of years of schooling, also known as the return to education, on earnings was studied in this paper. The Mincer equation was used as a model representing the relationship between these variables. The parameters of the Mincer equation were estimated using the BQR and BALQR methods. Applying these two estimation methods to workers in the agricultural sector in South Sulawesi, BALQR produced better results than BQR in terms of smaller MSE at most quantile levels. The returns to education do not increase monotonically along the earnings distribution, but they are clearly higher at the top quantiles of earnings than at the bottom. This indicates higher returns for workers with higher earnings than for workers with lower earnings.

References
[1] R. Alhamzawi, K. Yu, and D.F. Benoit, Bayesian Adaptive Lasso Quantile Regression, Statistical Modelling, 12(3), 279-297, (2012)
[2] M. Buchinsky, Changes in the US Wage Structure 1963-1987: Application of Quantile Regression, Econometrica, 62, 405-459, (1994)
[3] M. Comola and L. de Mello, Educational Attainment and Selection into the Labor Market: The Determinants of Employment and Earnings in Indonesia, Paris-Jourdan Sciences Economiques Working Paper, 6, (2010)
[4] T. Fasih, Linking Education Policy to Labor Market Outcomes, World Bank, Washington, (2008)
[5] W. Haerdle and D.D. Prastyo, Embedded Predictor Selection for Default Risk Calculation: A Southeast Asian Industry Study, in Chuen, D.L.K. and Gregoriou, G.N. (Eds.), Handbook of Asian Finance, Vol. 1: Financial Market and Sovereign Wealth Fund, Academic Press, San Diego, (2014)
[6] R. Koenker and G. Basset Jr., Regression Quantiles, Econometrica, 46, 33-50, (1978)
[7] R. Koenker and J.A.F. Machado, Goodness of Fit and Related Inference Processes for Quantile Regression, Journal of the American Statistical Association, 94, 1296-1310, (1999)
[8] R. Koenker and K.F. Hallock, Quantile Regression: An Introduction, Journal of Economic Perspectives, 15, 143-156, (2001)
[9] R. Koenker, Quantile Regression, Cambridge University Press, Cambridge, (2005)
[10] S. Kotz, T.J. Kozubowski, and K. Podgorski, The Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance, Springer Science and Business Media, New York, (2001)
[11] Q. Li, R. Xi, and N. Lin, Bayesian Regularized Quantile Regression, Bayesian Analysis, 5, 1-24, (2010)
[12] J. Mincer, Schooling, Experience and Earnings, The National Bureau of Economic Research, New York, (1974)
[13] L. Purnastuti, P.W. Miller, and R. Salim, Declining Rates of Return to Education: Evidence for Indonesia, Bulletin of Indonesian Economic Studies, 49(2), 213-236, (2013)
[14] R. Tibshirani, Regression Shrinkage and Selection via the Lasso, Journal of the Royal Statistical Society, Series B, 58, 267-288, (1996)
[15] K. Yu and R.A. Moyeed, Bayesian Quantile Regression, Statistics & Probability Letters, 54, 437-447, (2001)
[16] K. Yu, P.V. Kerm, and J. Zhang, Bayesian Quantile Regression: An Application to the Wage Distribution in 1990s Britain, Sankhya: The Indian Journal of Statistics, 67(2), 359-377, (2005)
[17] H. Zou and T. Hastie, Regularization and Variable Selection via the Elastic-Net, Journal of the Royal Statistical Society, Series B, 67(2), 301-320, (2005)
[18] H. Zou, The Adaptive Lasso and Its Oracle Properties, Journal of the American Statistical Association, 101, 1418-1429, (2006)

Proceedings of IICMA 2015
Statistics and Probability

Hierarchical Bayes Modeling in Small Area for Estimating Unemployment Proportion Under Complex Survey
Arip Juliyanto1,a), Heri Kuswanto2,b), Ismaini Zain3,c)
1 ,2,3
Department of Statistics, Institut Teknologi Sepuluh Nopember,
Jl. Arief Rahman Hakim, Surabaya 60111, Indonesia
a) arip_jy@bps.go.id
b) heri_k@statistika.its.ac.id
c) ismaini_z@statistika.its.ac.id

Abstract. In general, surveys to estimate the unemployment proportion are designed for large areas.
An area is regarded as large if the sample size is large enough for direct, design-based methods to
give estimates with adequate precision. However, direct design-based methods are unreliable when
applied to small areas because they are based on small sample sizes. To obtain estimates for small
areas with adequate precision, we can use indirect estimation such as hierarchical Bayes models. In
this paper, we compare the performance of four hierarchical Bayes small area models for estimating
the proportion of unemployment based on data generated from a complex survey with a finite
population. Two of the models adopt the common Fay-Herriot model assuming known sampling
variance, while the other two models assume that the sampling variance is unknown. From the
study we found that the models assuming unknown variance outperform the others and can be
considered for area level small area estimation of the unemployment proportion.
Keywords and Phrases: small area estimation, hierarchical Bayes, linear model, proportion,
complex survey, unemployment.

1. Introduction
The proportion of unemployment is one of the main indicators of a region
besides economic growth. In Indonesia, the proportion of unemployment obtained
from the National Labor Force Survey is designed to provide area-specific (region)
estimates (unemployment rate or proportion) with reliable precision. Therefore, if
we want to provide this indicator for a subregion, the direct estimator leads to
unacceptably large standard errors due to the small sample in that area, and the
estimator of the proportion will be heavily biased. To overcome this problem, it is
necessary to find indirect methods that can give reliable estimators, known as Small
Area Estimation (SAE) methods.
Several SAE methods are often used in line with the growing demand for
small area statistics, e.g., Best Linear Unbiased Prediction (BLUP), Empirical
Bayes (EB), and Hierarchical Bayes (HB) estimation [3]. In general, the HB method
is superior to the other methods. Ghosh and Rao [3], Arora and Lahiri [1], and You
and Chapman [14] have shown that the estimators (posterior means) of the HB
method have smaller mean square error than the BLUP method. On the other hand,
the HB method can handle more complex models such as unmatched sampling and
linking models, whereas these models are difficult to handle with other methods.

SAE has two types of basic small area models that have been studied in most
of the literature, i.e., the basic area level model and the unit level model. Area level
models are used more often by researchers than unit level models; one of the reasons
is that area level auxiliary information is more readily available than unit level
information. Many studies have used area level models to obtain estimators of
proportions. Xia, Carlin, and Waller [15] used an HB area model for lung cancer
mapping in Ohio in 1988. Xie, Raghunathan, and Lepkowski [16] used an HB area
model based on the Fay-Herriot model to estimate the proportion of overweight
individuals in America. Meanwhile, Mohadjer, Rao, Liu, Krenzke, and Van de
Kerckhove [12] used an HB area model extended with unmatched sampling and
linking models to estimate the proportion of adult literacy in America.
SAE modeling in most cases assumes that the data are obtained by Simple
Random Sampling (SRS). However, public data are very often collected with
complex survey designs, such as stratified and two-stage sampling surveys. The HB
area level model is able to adjust the model to the survey design from which the
data have been drawn. Liu et al. [11] and Liu and Diallo [10] applied a variance
adjustment based on stratified simple random sampling in the HB model in order
to estimate the uncertain sampling variance component. Ha et al. [5] estimated
smoking rates by adjusting the HB model to be consistent with the complex survey
design.
Section 2 briefly describes the small area estimation method: we introduce
notation, discuss models often used in SAE such as the Fay-Herriot model and
alternative models, present the criteria for comparing models, and describe the
simulation study and its results. Concluding remarks are given in Section 3.

2. Main Results
2.1 Model Description
Suppose $y_{ij}$ denotes the binary response variable of interest (activity
status) for unit $j$ in small area $i$. Let $N_i$ be the finite population size in area $i$
($i=1,2,\dots,m$; $j=1,2,\dots,N_i$). The parameters to be estimated are the small area
proportions $\theta_i = \sum_{j=1}^{N_i} y_{ij}/N_i$. Under the assumed complex sampling design, the
direct estimator for $\theta_i$ is
$$\hat\theta_{iw} = \sum_{j=1}^{n_i} w_{ij}\, y_{ij} \Big/ \sum_{j=1}^{n_i} w_{ij},$$
where $n_i$ is the sample size ($n_i > 0$) and $w_{ij}$ is the survey weight associated with
unit $j$ in area $i$. For a survey designed by EPSEM (Equal Probability of Selection
Method), the direct estimate of the proportion is given by $\hat\theta_{iw} = \sum_{j=1}^{n_i} y_{ij}/n_i$.

According to Kish [8], let $\mathrm{Var}_{dsgn}(\hat\theta_i)$ and $\mathrm{Var}_{srs}(\hat\theta_i)$ be the variance of the
proportion under the complex survey design and under simple random sampling,
respectively; then the design effect can be expressed as
$$\mathrm{Deff}_i = \mathrm{Var}_{dsgn}(\hat\theta_i)\big/\mathrm{Var}_{srs}(\hat\theta_i).$$
The design effect reflects the sampling efficiency of the complex sampling design.
The parameter $\mathrm{Deff}_i$ is often unknown because the population variance of the
proportion is unknown, so $\mathrm{Deff}_i$ is estimated by $\mathrm{deff}_{iw}$.
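As an illustration (not part of the original paper), the weighted direct estimator and the estimated design effect can be computed as in the following minimal Python sketch; the function names and the way the design-based variance is supplied are assumptions for illustration only.

```python
import numpy as np

def direct_estimate(y, w):
    """Survey-weighted direct estimator of a small area proportion:
    sum(w * y) / sum(w), as defined above."""
    y, w = np.asarray(y), np.asarray(w)
    return np.sum(w * y) / np.sum(w)

def design_effect(var_dsgn, theta_hat, n):
    """Estimated design effect deff = var_dsgn / var_srs, where var_srs is
    the binomial variance theta(1 - theta)/n under simple random sampling."""
    var_srs = theta_hat * (1.0 - theta_hat) / n
    return var_dsgn / var_srs
```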

Due to the small sample size, the direct estimator of a small area proportion
is very imprecise and has a large variance that cannot be ignored. A modeling
approach can address this problem; its objective is to increase the effective sample
size and thus increase the precision by reducing the variance. Liu [9] stated that a
commonly used area level model is the Fay-Herriot model, a two-level mixed model
of the following form.

Model 1 (Fay-Herriot: F-H Normal-Normal model):

Level 1 (sampling model): $\hat\theta_{iw} \mid \theta_i \overset{ind}{\sim} N(\theta_i, \psi_i)$
Level 2 (linking model): $\theta_i \mid \boldsymbol{\beta}, \sigma_v^2 \overset{ind}{\sim} N(\mathbf{x}_i'\boldsymbol{\beta}, \sigma_v^2)$   (1)

where $\psi_i$ is the sampling variance, assumed known, and $\mathbf{x}_i$ is a vector of covariates for
area $i$. The FH model is an example of a matched model [13], because both parts of the
model (the sampling and linking models) can be combined into a single model (a linear
mixed model):
$$\hat\theta_{iw} = \mathbf{x}_i'\boldsymbol{\beta} + v_i + e_i.$$

Ha et al. [5] noted that for estimating small area proportions, the assumption
of normality in the sampling and linking models can be unacceptable, particularly
when the true small area proportions are near 0 or 1. Moreover, the normality
assumption does not guarantee that the posterior distribution of $\theta_i$ is supported on
(0,1). Liu et al. [11] proposed using a logit linking model to ensure that the posterior
of $\theta_i$ always falls within the range (0,1).

Model 2 (Normal-Logistic model: NL):

Level 1 (sampling model): $\hat\theta_{iw} \mid \theta_i \overset{ind}{\sim} N(\theta_i, \psi_i)$
Level 2 (linking model): $\mathrm{logit}(\theta_i) \mid \boldsymbol{\beta}, \sigma_v^2 \overset{ind}{\sim} N(\mathbf{x}_i'\boldsymbol{\beta}, \sigma_v^2)$   (2)

According to Liu et al. [11], both the FH model (1) and the NL model (2)
share one main problem: the assumption of known sampling variance. In practice,
the sampling variance is unknown and has to be estimated. If we use the direct
approach to estimate the variance, the estimate is very imprecise and unstable
when the sample size is very small. Some alternatives for approximating the
variance estimate are the synthetic variance estimator, the Generalized Variance
Function (GVF), and the census domain variance. In this study we use the census
domain (region) variance proposed by Ha et al. [5].
To implement the sampling variance estimate described above, we apply the
variance function
$$\psi_i = \theta_i(1-\theta_i)\,\mathrm{Deff}_i/n_i,$$
with the estimator of the variance given by
$$\hat\psi_i = \hat\theta_{rgn(i)}\bigl(1-\hat\theta_{rgn(i)}\bigr)\,\mathrm{deff}_{iw}/n_i, \qquad
\mathrm{deff}_{iw} = \mathrm{var}_{dsgn}\bigl(\hat\theta_{rgn(i)}\bigr)\big/\,\mathrm{var}_{srs}\bigl(\hat\theta_{rgn(i)}\bigr),$$
where $\mathrm{deff}_{iw}$ is an estimator of $\mathrm{Deff}_i$. Using $\mathrm{deff}_{iw}$ as the estimator of $\mathrm{Deff}_i$
implies that the true design effects for all subregions within a region are assumed
to be similar.
The assumption of known variance in the FH model (1) and the NL model (2)
requires additional procedures to address the problem mentioned previously. Liu
et al. [11] and Ha et al. [5] consider the NLrs (3) and BLrs (4) models to handle this
problem; the NLrs and BLrs models treat $\psi_i$ as an unknown parameter in the HB model.
Model 3 (Normal-Logistic random sampling variance: NLrs):

Level 1 (sampling model): $\hat\theta_{iw} \mid \theta_i, \psi_i \overset{ind}{\sim} N(\theta_i, \psi_i)$
Level 2 (linking model): $\mathrm{logit}(\theta_i) \mid \boldsymbol{\beta}, \sigma_v^2 \overset{ind}{\sim} N(\mathbf{x}_i'\boldsymbol{\beta}, \sigma_v^2)$   (3)

To accommodate an asymmetric (non-normal) sampling distribution, a beta
distribution can be assumed for the sampling model in place of the normal. This
assumption was initially considered by Jiang and Lahiri [7]. The beta distribution
is chosen because it covers a rich class of distributions, including asymmetric ones;
furthermore, it has the desirable property of being restricted to the range (0,1).
Hawala and Lahiri [6] used Model 4 to estimate poverty rates in comparison with
the FH model (1).

Model 4 (Beta-Logistic random sampling variance: BLrs):

Level 1 (sampling model): $\hat\theta_{iw} \mid \theta_i \overset{ind}{\sim} \mathrm{beta}(a_i, b_i)$
Level 2 (linking model): $\mathrm{logit}(\theta_i) \mid \boldsymbol{\beta}, \sigma_v^2 \overset{ind}{\sim} N(\mathbf{x}_i'\boldsymbol{\beta}, \sigma_v^2)$   (4)

For both Model 3 and Model 4, Liu et al. [11] considered approximating the
variance function by $\psi_i = \theta_i(1-\theta_i)\,\mathrm{deff}_{iw}/n_i$. The parameters $a_i$ and $b_i$ in Model 4
are given by
$$a_i = \theta_i\bigl(n_i/\mathrm{deff}_{iw} - 1\bigr), \qquad b_i = (1-\theta_i)\bigl(n_i/\mathrm{deff}_{iw} - 1\bigr).$$

To obtain the estimators from the four models above, we can use the
Metropolis-Hastings (MH) algorithm within the Gibbs sampler in the MCMC
method. The MH algorithm is needed because three of the models (Models 2, 3,
and 4) have full conditional distributions without closed form, especially for the
proportion parameters. MCMC draws random samples from the full conditional
distributions of the unknown parameters starting from certain initial values.
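To make the sampler concrete, the following is a minimal sketch (not the authors' OpenBUGS code) of one random-walk Metropolis-Hastings update for $\eta_i = \mathrm{logit}(\theta_i)$ in the NL model, holding the linking-model parameters fixed; the proposal scale and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_full_conditional(eta, theta_hat_iw, psi_i, mu_i, sigma2_v):
    """Log full conditional (up to a constant) of eta_i = logit(theta_i):
    normal sampling likelihood for the direct estimate plus the normal
    linking prior on eta_i with mean mu_i = x_i' beta."""
    theta = 1.0 / (1.0 + np.exp(-eta))          # inverse logit
    log_lik = -0.5 * (theta_hat_iw - theta) ** 2 / psi_i
    log_prior = -0.5 * (eta - mu_i) ** 2 / sigma2_v
    return log_lik + log_prior

def mh_update(eta, theta_hat_iw, psi_i, mu_i, sigma2_v, step=0.5):
    """One random-walk MH update; the symmetric normal proposal
    cancels in the acceptance ratio."""
    proposal = eta + step * rng.standard_normal()
    log_ratio = (log_full_conditional(proposal, theta_hat_iw, psi_i, mu_i, sigma2_v)
                 - log_full_conditional(eta, theta_hat_iw, psi_i, mu_i, sigma2_v))
    return proposal if np.log(rng.uniform()) < log_ratio else eta
```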

2.2. Model Evaluation and Comparison

One procedure for comparing Bayesian models is the DIC (Deviance
Information Criterion) measure; the model with the smaller DIC value is suggested
as the better-fitting model. This measure uses the deviance, defined as
$$D(\mathbf{y}, \boldsymbol{\theta}) = -2\log p(\mathbf{y} \mid \boldsymbol{\theta}),$$
where $p(\mathbf{y} \mid \boldsymbol{\theta})$ is the likelihood of the data and $\boldsymbol{\theta}$ are the parameters of
the model; thus the DIC measure depends on both the data and the parameters. A point
estimate, $D\bigl(\hat{\boldsymbol{\theta}}(\mathbf{y})\bigr) = D\bigl(\mathbf{y}, \hat{\boldsymbol{\theta}}(\mathbf{y})\bigr)$, is derived from the simulation output. The
posterior mean deviance is
$$\bar D(\mathbf{y}) = E_{\boldsymbol{\theta} \mid \mathbf{y}}\bigl[D(\mathbf{y}, \boldsymbol{\theta})\bigr], \quad \text{estimated by} \quad
\hat{\bar D}(\mathbf{y}) = \frac{1}{n}\sum_{i=1}^{n} D\bigl(\mathbf{y}, \boldsymbol{\theta}^{(i)}\bigr).$$
Another important quantity in the DIC procedure is the effective number
of parameters, defined as $p_D = \bar D(\mathbf{y}) - D\bigl(\hat{\boldsymbol{\theta}}(\mathbf{y})\bigr)$, so that finally the DIC measure is
formed as
$$\mathrm{DIC} = 2\bar D(\mathbf{y}) - D\bigl(\hat{\boldsymbol{\theta}}(\mathbf{y})\bigr) = \bar D(\mathbf{y}) + p_D.$$
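A minimal sketch of the DIC computation from MCMC output, assuming the deviance has already been evaluated at each posterior draw and at the posterior point estimate (both argument names are illustrative):

```python
import numpy as np

def dic(deviance_draws, deviance_at_estimate):
    """DIC from MCMC output: deviance_draws holds D(y, theta^(i)) over the
    posterior draws; deviance_at_estimate is D(y, theta_hat(y))."""
    d_bar = np.mean(deviance_draws)          # posterior mean deviance
    p_d = d_bar - deviance_at_estimate       # effective number of parameters
    return d_bar + p_d                       # equivalently 2*d_bar - D(theta_hat)
```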
Congdon [2] noted that even though a model has a smaller DIC value, this
does not automatically mean that the model is better than the others, especially if
the model fails to estimate the true parameters, in other words if the model cannot
yield precise estimates of the true parameters. Liu [9] considered methods for
comparing models with bias statistics. The following bias statistics can be used to
compare models: Overall Average Bias (OAB), Overall Average Absolute Deviation
(OAAD), and Overall Average Absolute Relative Deviation (OAARD). The three
bias statistics are defined as
$$\mathrm{OAB} = \frac{1}{mR}\sum_{i=1}^{m}\sum_{r=1}^{R}\bigl(\hat\theta_i^r - \theta_i\bigr),$$
$$\mathrm{OAAD} = \frac{1}{mR}\sum_{i=1}^{m}\sum_{r=1}^{R}\bigl|\hat\theta_i^r - \theta_i\bigr|,$$
$$\mathrm{OAARD} = \frac{1}{mR}\sum_{i=1}^{m}\sum_{r=1}^{R}\frac{\bigl|\hat\theta_i^r - \theta_i\bigr|}{\theta_i},$$
where $\hat\theta_i^r$ is the estimate of $\theta_i$ for each subregion in the $r$th sample, $R$ is the
number of replications (sets of data), and $m$ is the number of subregions. Models
with smaller values of these statistics (OAB, OAAD, and OAARD) are considered
better. These criteria can only be applied with simulated data, because when
survey data are used to estimate the parameters, the true values of the parameters
are unavailable.
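For a simulation with $R$ replications and $m$ subregions, the three statistics can be computed directly, for example as in the following sketch (the array shapes are assumptions for illustration):

```python
import numpy as np

def bias_statistics(theta_hat, theta_true):
    """theta_hat: array of shape (R, m) of estimates over R replications;
    theta_true: array of shape (m,) of true subregion proportions."""
    diff = theta_hat - theta_true                 # broadcasts over replications
    oab = diff.mean()                             # Overall Average Bias
    oaad = np.abs(diff).mean()                    # Overall Average Absolute Deviation
    oaard = (np.abs(diff) / theta_true).mean()    # ... Absolute Relative Deviation
    return oab, oaad, oaard
```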

2.3. Data and Simulation Study

This section describes a simple simulation study, based on real data, to
compare the performance of the four HB models (FH, NL, NLrs, BLrs) in
estimating the unemployment proportion. First, we describe the sampling method
by which the data were drawn; the samples are then used as input in the four HB
modeling processes.
The sampling frame used to draw samples is obtained from the 2010
population census of Riau Province, specifically the coastal region (Bengkalis,
Meranti, and Indragiri Hilir regencies). The finite population of the three regions is
restricted to labor force status and contains 595,836 records. The parameter of
interest to be estimated is the proportion of unemployment in each subregion.
In each region, treated as a census domain, we drew samples using a two-stage
sample design. The first stage selects census blocks without replacement
using probability proportional to size (PPS), with size measured by the number of
households. In the second stage, 10 households were selected from each chosen
census block. This sampling process was repeated ten times, giving 10 sets of
independent data. For simplicity, this simulation only uses the response variable,
without any covariates.
Each set of data was used as input in the simulation to obtain estimates of
the parameter of interest. Simulations were done using the OpenBUGS and R
software; the BRugs package was used to run OpenBUGS from R. We used the
Metropolis-Hastings (MH) algorithm within the Gibbs sampler; this algorithm
draws samples from the full conditional distributions starting from certain initial
values. For the first model (FH) we set a run length of 20,000 iterations with a
burn-in of 10,000. For the other models we set the run length to 50,000 iterations
with a burn-in of 10,000 and a thinning interval of 15 to reduce the autocorrelation
of the samples. The run lengths differ because the number of iterations needed to
reach convergence depends on the complexity of the model.

2.4. Results
This section discusses the results of the simulation designed above. Figure 1
plots the averages of the four HB estimates (FH, NL, NLrs, and BLrs) of the
unemployment proportion against subregion, ordered by the average number of
samples. The true proportion appears to fluctuate more than the estimates from the
four HB models, especially for small sample sizes. For larger sample sizes the
fluctuation decreases and the estimates appear similar for all models. The same
pattern also appears in Figure 2.

FIGURE 1. Comparison of the average point estimates of the HB models (FH, NL,
NLrs, BLrs) and the true proportion (true P) against subregion, ordered by number
of samples.
Figure 2 plots the residuals against subregion, ordered by number of samples.
The four HB estimates fluctuate around zero. As the number of samples increases,
the errors also become smaller for all HB models. This happens because no
covariate (predictor) variables were included in the HB models, so the posterior
estimator depends only on the value of the direct estimate (the direct proportion
from the chosen samples). By including covariate variables, the HB models could
estimate the proportion more precisely.

FIGURE 2. Comparison of the average residuals of the HB models against
subregion, ordered by number of samples (the residual is defined as $P_i - \hat p_i^{HB}$).
Furthermore, we can compare the HB models using standard techniques such
as DIC and the bias statistics. Based on the DIC measure, Table 1 shows that the
FH and NL models have similar values, while the NLrs model is marginally better
than both with a smaller DIC value. The table also shows that the BLrs model has
the smallest DIC; according to this measurement, the BLrs model is suggested as
the best model.
With the bias criteria, however, the differences between the four models are
hard to establish because the OAAD and OAARD values are very similar. On the
other hand, the OAB values are in line with the DIC measurement in confirming
BLrs as the best model. Nevertheless, the OAARD values of all models are around
thirty percent. This indicates that the four models, which exclude covariate
variables, are not adequate for estimating the subregion proportions, so the
addition of covariate variables is needed.

TABLE 1. The averages of the Deviance Information Criterion (DIC), Overall
Average Bias (OAB), Overall Average Absolute Deviation (OAAD), and Overall
Average Absolute Relative Deviation (OAARD)

Model   DIC        OAB      OAAD    OAARD
FH      -70.2460   -0.0036  0.0322  0.3274
NL      -71.1820   -0.0035  0.0331  0.3420
NLrs    -76.5740   -0.0057  0.0343  0.3414
BLrs    -95.1580   -0.0023  0.0327  0.3316

3. Concluding Remarks
From the results described above, it can be concluded that HB models can be
used to estimate the unemployment proportion in small areas (subregions). The
BLrs model showed better performance than the others on the DIC criterion and
the overall average bias (OAB). However, based on the plots of the average HB
estimates of the proportion and of the residuals, the four models showed similar
performance. From this study we found that HB models without covariate variables
are not adequate for small area estimation. For future work, we strongly recommend
adding covariate variables to the linking model in order to increase the precision
of the estimation. Furthermore, different priors can be tried to analyze the
sensitivity of the model. Adopting non-normal distributions for the area random
effects is also recommended to enrich the small area modeling.

References
[1] V. Arora and P. Lahiri, On the superiority of the Bayesian method over
BLUP in small area estimation problems, Statistica Sinica, vol. 7, pp.
1053-1063, (1997)
[2] P.D. Congdon, Applied Bayesian Hierarchical Methods, CRC Press, Boca
Raton, (2010)
[3] M. Ghosh and J.N.K. Rao, Small area estimation: an appraisal, Statistical
Science, Institute of Mathematical Statistics, vol. 9, pp. 55-76, (1994)
[4] N.S. Ha, Hierarchical Bayes Estimation of Small Area Means Using Complex
Survey Data, Ph.D Thesis, University of Maryland, (2013)
[5] N.S. Ha, P. Lahiri, and V. Parsons, Methods and Results for Small Area
Estimation using Smoking Data from The 2008 National Health Interview
Survey, Statistics in Medicine, vol. 33, pp. 3932-3945, (2014)
[6] S. Hawala and P. Lahiri, A Hierarchical Bayes Estimation of Poverty Rates,
Proceedings of the American Statistical Association, Section on Survey Research
Methods, Alexandria, VA: American Statistical Association, Forthcoming,
(2012)
[7] J. Jiang and P. Lahiri, Mixed Model Prediction and Small Area Estimation,
Test, vol. 15, 1, pp. 1-96, (2006)
[8] L. Kish, Survey Sampling, John Wiley, New York, (1965)
[9] B. Liu, Hierarchical Bayes Estimation and Empirical Best Prediction of
Small Area Proportions, Ph.D Thesis, University of Maryland, (2009)

[10] B. Liu and M. Diallo, Parametric Bootstrap Confidence Intervals for Survey-
Weighted Small Area Proportions, Proceedings of the American Statistical
Association, Section on Survey Research Methods, VA: American Statistical
Association, Forthcoming, pp. 109-121, (2013)

[11] B. Liu, P. Lahiri, and G. Kalton, Hierarchical Bayes Modeling of Survey-
Weighted Small Area Proportions, Proceedings of the American Statistical
Association, Section on Survey Research Methods, Alexandria, VA:
American Statistical Association, Forthcoming, pp. 3181-3186, (2007)
[12] L. Mohadjer, J.N.K. Rao, B. Liu, T. Krenzke, and W. Van de Kerckhove,
Hierarchical Bayes Small Area Estimates of Adult Literacy using Unmatched
Sampling and Linking Models, Journal of the Indian Society of Agricultural
Statistics, vol. 66, pp. 55-63, (2012)
[13] Y. You and J.N.K. Rao, Small Area Estimation Using Unmatched Sampling
and Linking Models, Canadian Journal of Statistics, vol. 30, pp. 3-15,
(2002)
[14] Y. You and B. Chapman, Small Area Estimation Using Area Level Models
and Estimated Sampling Variances, Survey Methodology, vol. 32, pp. 97-
103, (2006)

[15] H. Xia, B.P. Carlin, and L.A. Waller, Hierarchical Models for Mapping
Ohio Lung Cancer Rate, Research Report 95-011, Division of Biostatistics,
University of Minnesota, (1995)
[16] D. Xie, T.E. Raghunathan, and J.M. Lepkowski, Estimation of The
Proportion of Overweight Individuals in Small Areas - a robust extension of
The Fay-Herriot Model, Statistics in Medicine, vol. 26, pp. 2699-2715,
(2006)


Food Insecurity Structure in Papua and West Papua

Agustin Riyanti1,a), Vita Ratnasari2,b), Santi Puteri Rahayu3,c)

1,2,3 Department of Statistics, Institute of Technology Sepuluh Nopember (ITS) Surabaya

a) justafity@gmail.com
b) vitaratna70@gmail.com
c) santi_pr@statistika.its.ac.id

Abstract. The problem of food insecurity is a multidimensional issue that requires a complex
analysis and should receive serious attention. A method used to analyze multiple dimensions with
many variables is Structural Equation Modeling (SEM). SEM has the ability to resolve complicated
problems by estimating the relationships between multiple related variables, with output from both
a measurement model and a structural model. The Partial Least Squares (PLS) approach to SEM
offers an alternative to covariance-based SEM, especially when the data are not normally
distributed. PLS is called a soft modeling technique, with minimal demands regarding measurement
scales, sample sizes, and residual distributions. In this paper, PLS is used to determine the variables
that significantly affect food insecurity in Papua and Papua Barat. There are four latent variables,
namely food insecurity, food access, food availability, and food utilization. The results show that
food access significantly affects food availability and food utilization, and food availability has a
significant effect on food utilization. However, food access, food utilization, and food availability
do not significantly affect food insecurity in Papua and Papua Barat.
Keywords and Phrases: Food insecurity, Structural Equation Modeling, Partial Least Squares

1. Introduction
Food is a basic human necessity. Its fulfillment is an economic as well as a
social investment toward a better generation in the future. Food security is one of
the government's priorities, and its central problem is food insecurity. Food
insecurity is a multidimensional issue that needs an analysis more complex than
food production and food availability alone. The complexity of food security can be
simplified by focusing on three dimensions, namely food availability, food access,
and food utilization. In Indonesia, Priyanto [1] used partial least squares logistic
regression to analyze food security in the districts of Borneo Island, and Kastanja [2]
modeled the status of the risk of food insecurity in the provinces of Papua and
Papua Barat with spatial SEM-PLS.
The Food Security Council (DKP) [3] conducted an analysis of food security
covering three aspects of food security using principal component analysis. Abroad,
Migotto, Davis, Carletto, and Beegle [4] used multivariate regression analysis to
measure food security using respondents' perceptions of the adequacy of food
consumption. Khasnobis and Hazakira [5] state that the status of women affects
child food security in Pakistan. Dutta and Gundersen [6] observed that the variables
that influence food security are income and the presence of children under the age of
6 years. Rahim, Saeed, and Rasool [7], in research on factors that affect household
food insecurity in parts of Northwestern Iran, stated that the factors affecting food
insecurity are the distance from the city, the food supply, the number of household
members, and the average income per capita. Lovendal and Knowles [8] conducted
a study on a framework for analyzing vulnerability to food security using the three
dimensions of food security, namely food availability, food access, and food
utilization. The objective of this paper is to determine the variables that
significantly affect food insecurity in Papua and Papua Barat.

1.1 SEM-Partial Least Squares

Hair, Black, Babin, and Anderson [9] explain that Structural Equation
Modeling is a statistical model for explaining the relationships among multiple
variables. SEM has the ability to represent unobserved variables in these
relationships and to account for measurement error in the estimation process. The
most common SEM estimation procedure is maximum likelihood estimation
(MLE). MLE provides valid and stable results under ideal conditions; however,
model complexity leads to the need for large samples, and MLE is efficient and
unbiased only when the assumption of multivariate normality is fulfilled. One of the
developments of SEM is Partial Least Squares (PLS). Wold [10] introduced PLS
using the NIPALS (Nonlinear Iterative Partial Least Squares) algorithm. Whereas
covariance-based SEM estimates the model so that the discrepancy between the
estimated and sample covariance matrices is minimized, PLS maximizes the
explained variance of the endogenous variables by estimating the partial model
relationships in an iterative sequence of ordinary least squares (OLS) regressions.
Thus, PLS scores are estimated as exact linear combinations of their associated
manifest variables and are treated as error-free substitutes for the manifest
variables. Whereas covariance-based SEM requires distributional assumptions, PLS
imposes less rigid distributional assumptions on the data. PLS is referred to as a
soft modeling technique with minimal demands regarding measurement scales,
sample sizes, and residual distributions. This study uses three latent variables to
predict food insecurity: food availability, food access, and food utilization.
Partial Least Squares includes the measurement model (outer model) and
the structural model (inner model). The inner model describes the relationships
between latent variables. The general equation of the structural model is:
𝜂 = 𝐵𝜂 + 𝛤𝜉 + 𝜁
Where:
𝜂 :Vector of endogenous variables in the inner model
𝐵 : Path coefficient matrix of endogenous variables in the inner model
𝛤 : Path coefficient matrix of exogenous variables in the inner model
𝜉 : Vector of exogenous variables in the inner model
𝜁 : Random vector of residuals in the inner model
The outer model defines the relation between the latent variables and the
observed indicator or manifest variables. The general equations of the outer model are:
𝑥 = Ʌ𝑥 𝜉 + 𝛿
𝑦 = Ʌ𝑦 𝜂 + 𝜀

Where:
𝑥 : Vector of indicators of the exogenous latent variables
𝑦 : Vector of indicators of the endogenous latent variables
Ʌ𝑥 : Loading factor matrix of the exogenous latent variables
Ʌ𝑦 : Loading factor matrix of the endogenous latent variables
𝛿 : Error in the measurement model of the exogenous variables
𝜀 : Error in the measurement model of the endogenous variables

1.2 Data and Variables

The data used in this study are secondary data, collected from the 2013
National Socio-Economic Survey, the Village Potential Data Collection, the 2013
Health Research, and Papua and Papua Barat in Figures 2014.
The indicators used for food availability are the normative consumption to
net per-capita production ratio and the percentage of food crop production in gross
domestic product. Food access uses the indicators percentage of people living below
the poverty line, percentage of villages with inadequate transport connections, and
percentage of households without access to electricity. The indicators for food
utilization are percentage of female illiteracy, percentage of households without
access to clean and safe drinking water, child stunting, life expectancy at birth, and
percentage of food expenditure per capita. Food insecurity uses the indicators
percentage of damaged area, percentage of forest area, and percentage of flooded
area.

FIGURE 1. Conceptual models in this study

The manifest variables used in this study are: (X1) normative consumption to net
per-capita production ratio, (X2) percentage of food crop production in gross
domestic product, (X3) percentage of people living below the poverty line, (X4)
percentage of villages with inadequate transport connections, (X5) percentage of
households without access to electricity, (X6) percentage of female illiteracy, (X7)
percentage of households without access to clean and safe drinking water, (X8)
child stunting, (X9) life expectancy at birth, (X10) percentage of food expenditure
per capita, (Y1) percentage of forest area, (Y2) percentage of flooded area, and (Y3)
percentage of damaged area.

2. Main Results
The results of Partial Least Squares include the measurement model and the
structural model. The measurement model defines the relation between the latent
variables and the observed indicator or manifest variables. Fig. 2 shows that X8 and
X9 are not valid indicators for food utilization, and Y3 and Y1 are not valid indicators
for food insecurity.

FIGURE 2. Path diagram of food insecurity

In the next step, X8 and X9 are excluded from food utilization, and Y1 and Y3 are
excluded from food insecurity. The results after excluding X8, X9, Y1 and Y3 are
shown in Fig. 3.

FIGURE 3. Path diagram of food insecurity after X8, X9, Y1 and Y3 are excluded

2.1 Measurement Model

The measurement model defines the relation between the latent variables
and the observed indicator or manifest variables. This analysis is carried out with
reference to reliability and validity attributes.
To check whether the indicators of each construct measure what they are
supposed to measure, tests of convergent and discriminant validity are required. In
terms of convergent validity, both indicator reliability and construct reliability were
assessed. Indicator reliability was examined by looking at the construct loadings.
Construct reliability and validity were tested using two indices: the composite
reliability (CR) and the Average Variance Extracted (AVE). All the estimates were
above the thresholds of 0.6 for CR and 0.5 for AVE; Table 1 shows that AVE > 0.5
and CR > 0.6 for every construct.
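For reference, under the standard congeneric measurement model both indices follow directly from the standardized indicator loadings. The sketch below is illustrative only; the function name and the assumption of uncorrelated measurement errors are ours, and it does not reproduce the software output reported here:

```python
import numpy as np

def ave_and_cr(loadings):
    """AVE and composite reliability for one construct from its standardized
    indicator loadings (assuming uncorrelated measurement errors)."""
    loadings = np.asarray(loadings)
    error_var = 1.0 - loadings ** 2                   # indicator error variances
    ave = np.mean(loadings ** 2)                      # Average Variance Extracted
    cr = loadings.sum() ** 2 / (loadings.sum() ** 2 + error_var.sum())
    return ave, cr
```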

TABLE 1. AVE and Composite Reliability

Construct          AVE       Composite Reliability
Food insecurity    1.000000  1.000000
Food access        0.741057  0.894997
Food availability  0.721600  0.837688
Food utilization   0.816105  0.930061

Discriminant validity of the construct items was assessed by looking at the
cross-loadings. These are obtained by correlating the component scores of each
latent variable with both their respective block of indicators and all other items
included in the model. Table 2 presents the cross-loadings.

TABLE 2. Cross-Loadings

Indicator  Food insecurity  Food access  Food availability  Food utilization
X1         -0.240592        0.560539     0.791086           0.467260
X10        -0.181215        0.663985     0.511424           0.865599
X2         -0.256778        0.719910     0.904092           0.784855
X3         -0.175926        0.784613     0.612091           0.662238
X4         -0.350970        0.830885     0.591740           0.636992
X5         -0.273017        0.957697     0.753254           0.892197
X6         -0.277819        0.835641     0.851063           0.901872
X7         -0.190177        0.806628     0.657498           0.941106
Y2         1.000000         -0.308724    -0.291647          -0.244238

2.2 Structural Model

The structural model includes the relations between the latent (unobserved)
variables. The structural model is assessed according to the meaningfulness and
significance of the hypothesized relationships between the latent variables. The
tests of the structural model are: the amount of variance explained, the significance
of the relationships, and the model's predictive relevance.
Since the primary objective of PLS is prediction, the goodness of a theoretical
model is established by the strength of each structural path and the combined
predictive power (R²) of its exogenous constructs. Falk and Miller [12] suggest that
the variance explained for endogenous variables should be greater than 0.1. The
variance explained for each dependent construct is shown in Table 3.
TABLE 3. Variance Explained

Dependent construct  R Square
Food insecurity      0.108218
Food availability    0.582176
Food utilization     0.766080

In this study, food insecurity has an R² value of 0.108, which can be
considered satisfactory, taking into account the complexity of the model. The path
coefficient indicates the level of significance in hypothesis testing. The significance
of the path coefficients is indicated by the value of the T-statistic (t > 1.96).

TABLE 4. Path Coefficients

Path                                   Original Sample (O)  Sample Mean (M)  Standard Deviation (STDEV)  Standard Error (STERR)  T Statistics (|O/STERR|)
food access -> food insecurity         -0.306953            -0.308411        0.320472                    0.320472                0.957815
food access -> food availability        0.763004             0.770126        0.049572                    0.049572                15.391915
food access -> food utilization         0.663852             0.663002        0.090165                    0.090165                7.362601
food availability -> food insecurity   -0.173107            -0.177929        0.262116                    0.262116                0.660423
food availability -> food utilization   0.256331             0.260955        0.095510                    0.095510                2.683807
food utilization -> food insecurity     0.151624             0.158930        0.256314                    0.256314                0.591554

Table 4 shows the significant paths. Food access has a significant effect on food
availability and food utilization. Food availability has a significant effect on food
utilization. In this study, food availability, food access, and food utilization do not
significantly affect food insecurity.

3. Concluding Remarks
The results of this study indicate that food access has a significant effect on
food availability and food utilization, and food availability has a significant effect on
food utilization. However, food availability, food access, and food utilization do not
significantly affect food insecurity. The valid manifest indicators for food availability
are the normative consumption to net per-capita production ratio and the percentage
of food crop production in gross domestic product. The valid manifest indicators for
food access are the percentage of people living below the poverty line, the percentage
of villages with inadequate transport connections, and the percentage of households
without access to electricity. The valid manifest indicators for food utilization are
the percentage of female illiteracy, the percentage of households without access to
clean and safe drinking water, and the percentage of food expenditure per capita.
Child stunting and life expectancy at birth are not valid indicators for food
utilization. The only valid manifest indicator for food insecurity is the percentage of
flooded area.

References
[1] E. Priyanto, Partial Least Squares Logistic Regression Case Study-
Food Security Data of Districts in The Island of Borneo, Thesis,
Institute of Technology Sepuluh Nopember, Surabaya, (2011)
[2] L.I. Kastanja, Variance Based Spatial Structural Equation Modeling
(Spatial SEM-PLS) of Food Risk Vulnerability in Papua and West
Papua Province, Thesis, Institute of Technology Sepuluh Nopember,
Surabaya., (2014)
[3] Food Security Council and World Food Programme, A Food Security
and Vulnerability Atlas of Indonesia 2009, Food Security Council,
Jakarta, (2009)
[4] M. Migotto, B. Davis, C. Carletto, and K. Beegle, Measuring Food
Security Using Respondents' Perception of Food Consumption
Adequacy, in Food Security: Indicator, Measurement, and The
Impact of Trade Openness, eds. Khasnobis, B.G., Acharya, S.S., and
Davis, B., New York, page. 13-41, (2007)
[5] B.G. Kasnobis, Women’s Status and Children’s Food Security in
Pakistan, in Food Security : Indicator, Measurement, and The Impact
of Trade Openness, eds. Khasnobis, B.G., Acharya, S.S., and Davis,
B., New York, page. 95-108., (2007)
[6] I. Dutta and C. Gundersen, Measures of Food Insecurity at the
Household Level, in Food Security: Indicator, Measurement, and The
Impact of Trade Openness, eds. Khasnobis, B.G., Acharya, S.S., and
Davis, B., New York, page. 42-61, (2007)
[7] S. Rahim, D. Saeed, G.A. Rasool, and G. Saeed, Factors Influencing
Household Food Security Status, Food and Nutrition Sciences, Vol.
2, No. 1, page. 31-34, (2011)
[8] C.M. Lovendal and M. Knowles, Tomorrow’s Hunger : A frame for
Analysing Vulnerability to Food Security, in Food Security :
Indicator, Measurement, and The Impact of Trade Openness, eds.
Khasnobis, B.G., Acharya, S.S., and Davis, B., New York, page. 62-
94, (2007)
[9] J.F. Hair, W.C. Black, B.J. Babin, and R.E. Anderson, Multivariate
Data Analysis (6th ed), Upper Saddle River: Pearson, (2006)
[10] H. Wold, Partial Least Squares, in Encyclopedia of Statistical Sciences, eds.
Kotz, S., and Johnson, N.L., New York, page. 581-591, (1985)
[11] W.W. Chin and P.R. Newsted, Structural equation modeling:
analysis with small samples using partial least squares, in Statistical
strategies for small sample research, eds. Hoyle, R., Thousand Oaks,
page. 307-341, (1999)
[12] R. F. Falk and N.B. Miller, A primer for soft modeling, Ohio: The
University of Akron Press, (1992)


Forecasting on Indonesia's Fishery Export Value Using ARIMA and Neural Network

Eunike W. Parameswari1,a), Brodjol S.S. Ulama2,b), Suhartono3,c)

1,2,3 Department of Statistics, Institute of Technology Sepuluh Nopember (ITS)

a) eunike@bps.go.id
b) brodjol.su@gmail.com
c) suhartono@statistika.its.ac.id

Abstract. An increase in export performance has a direct impact on economic growth, meaning
that exports play an important role in economic growth. The availability of current export data
presents a challenge for applying an effective forecasting method. Neural networks have shown
great ability in modeling and forecasting nonlinear and non-stationary time series. This paper
compares linear stochastic models (ARIMA) and the Feed Forward Neural Network (FFNN) for
Indonesia's fishery exports. FFNN is the most commonly used Neural Network (NN) architecture
in many fields of application. Empirical results indicate that the Feed Forward Neural Network
(FFNN) outperforms ARIMA models in terms of forecasting accuracy. This paper also reports
empirical evidence that a neural network model is applicable to the prediction of export values.
Keywords and Phrases: Export, ARIMA, Neural Network, Feed Forward Neural Network

1. Introduction
In the era of globalization, international trade allows people to produce the
best products and to consume a variety of goods and services produced worldwide.
For a country, international trade plays an important role in improving living
standards and enabling each country to specialize in producing the goods and
services in which it has a comparative advantage. One of the indicators used to
monitor the increase in international trade is export growth. The export value at a
given time tends to depend on earlier times. Forecasting methods for export data
have been developed and are still an interesting issue. In Indonesia, Triyanto [1]
predicted Indonesia's export value for the period July 2011-June 2012 using a
hybrid ARIMA-Neural Network method. Previously, Wienarti [2] forecast the
export value of Central Java with the exponential smoothing method. Barus [3] and
Ruslan, Harahap, and Sembiring [4] forecast North Sumatra's export value for the
period November 2012 to October 2014. Abroad, forecasting of export values has
also been developed, including Thomson [5] in New Zealand, Kargbo [6] in South
Africa, Stoevsky [7] in Bulgaria, Mehmood [8] in Pakistan, Arumugam and
Anithakumari [9] in Taiwan, Zhang and Zhao [10] in Ningbo, Tahir [11] in
Pakistan, and Sen, Sabur, Islam, and Nature [12] in Bangladesh. The aims of this
paper are twofold: (1) to examine the network architecture of NN in forecasting
fishery exports from Indonesia and (2) to evaluate how ARIMA and FFNN
compare in predicting the unseen future.

1.1 Forecasting
At first, forecasting methods were dominated by linear methods, which are
relatively easy to develop and implement. However, linear methods cannot capture
the nonlinear relationships that are often found in real conditions. In general, time
series modeling is divided into two classes, univariate and multivariate. Univariate
time series analysis is more appropriate for modeling data in which the causes of
the fluctuations are difficult to identify, such as macroeconomic data, export values
in particular. This analysis has the advantage that it only uses the variable whose
behavior is to be investigated, without the need to look for other variables that
influence the data (Makridakis et al. [13]). Yule [14] introduced the autoregressive
(AR) model and started the development of statistical modeling of linear time
series, especially the linear Autoregressive Integrated Moving Average (ARIMA)
model. Nonlinear time series analysis describes the nonlinear relationships between
variables and provides testing procedures to detect nonlinear relationships.

1.2 ARIMA
The ARIMA time series model was proposed by Box and Jenkins in 1970;
the model examines each variable using autoregressive, AR(p), and moving
average, MA(q), terms to investigate the historical data and economic fluctuations.
The algorithm is as follows.
1) Data interpretation: The first step in developing a Box-Jenkins model is to
decide whether the series is stationary and whether there is any significant
seasonality that needs to be modeled. The autocorrelation function (ACF) is used
to describe the distribution of the sample data.
2) Model identification: Identify the orders of the series by using the
autocorrelation function (ACF) and the partial autocorrelation function (PACF).
3) Inference: The conditional likelihood and exact likelihood are used to estimate
the parameters.
4) Diagnostic checking: Diagnostic checking involves testing the assumptions
of the model to identify any areas where the model is inadequate. The statistical
identification process includes checking whether the parameters achieve statistical
significance or exhibit multicollinearity and whether the residual term is white noise.
If the model is found to be insufficient, it is necessary to remedy it and repeat the
steps until a better model is identified.
ARIMA models in backshift notation, according to Makridakis et al. [13],
can be expressed mathematically as follows:
$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p)(1 - B)^d Y_t = (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q)e_t \qquad (1)$$

The difference order, together with p and q, characterizes an ARIMA process
of dimension (p, d, q), so ARIMA can be interpreted as a time series that follows
AR(p) and MA(q) processes and becomes stationary after d differences. Data
transformation can be applied to overcome non-stationarity in the variance of a
time series. The transformation that is often used is the Box-Cox transformation
(Wei [16]):
$$T(Y_t) = Y_t^{(\lambda)} = \frac{Y_t^{\lambda} - 1}{\lambda} \qquad (2)$$
where $\lambda$ is the transformation parameter. If $\lambda = 0$, the transformation is obtained
as the limit
$$\lim_{\lambda \to 0} T(Y_t) = \lim_{\lambda \to 0} Y_t^{(\lambda)} = \lim_{\lambda \to 0} \frac{Y_t^{\lambda} - 1}{\lambda} = \log Y_t \qquad (3)$$

1.3 Neural Networks

The Neural Network (NN) is a method that is currently progressing rapidly.
In recent decades, NNs have shown great ability in modeling and forecasting
nonlinear and non-stationary economic time series due to their innate nonlinearity
and modeling flexibility. The NN is often regarded as a universal approximator
that does not require statistical assumptions. Some of the advantages of NNs are:
(1) they are able to recognize the relation between the input and output variables
without explicit physical considerations; (2) they work well even when the training
sets contain noise and measurement errors; (3) they are able to adapt solutions over
time to compensate for changing circumstances; and (4) they possess other inherent
information-processing characteristics and, once trained, are easy to use.
The NN working mechanism mimics the workings of biological neural
networks: the NN is composed of nerve cells (neurons) that are interconnected and
operate in parallel. The Feed Forward Neural Network (FFNN) is a very flexible
form of NN model used in various applications. To obtain an appropriate FFNN
model (optimal architecture), it is necessary to determine the right combination of
the number of input variables and the number of units in the hidden layer. The feed
forward network is one of the basic neural network architecture types; its structure
is given in Fig. 1. The architecture consists of three parts: input, hidden, and output
layers. Each layer consists of neurons, and these neurons are linked to each other
by weights.

FIGURE 1. Multilayer feed forward neural network with one output neuron (input
layer, hidden layer, and output layer)

2. Main Results
The data set consists of observations of the real export value of fishery
products. The data, obtained from the Statistics Indonesia database, are monthly
and cover the period January 1999-July 2015. The data set is divided into two
subsets: a training subset from January 1999 to December 2013 and a validation
subset from January 2014 to July 2015.
In the first stage, an ARIMA model was developed using the training subset.
Fig. 2 shows clearly that the data have a trend pattern; hence the assumption of
stationarity in the mean is not satisfied. We also examined stationarity in the
variance using the Box-Cox transformation. Next, the time series properties of the
data were examined using a unit root test, the augmented Dickey-Fuller (ADF) test,
with the null hypothesis of non-stationarity. The unit root test results (see Table 1)
suggest that the series is integrated of order one. Fig. 3 shows that the data have
been de-trended and de-seasonalized after transformation and differencing for both
the non-seasonal and seasonal orders.

FIGURE 2. Time series plot of fishery export value (US$)

FIGURE 3. Time series plot of fishery export value (US$) after data
transformation and differencing

TABLE 1. ADF test for a unit root

Series          Significance level  Critical value  t-statistic  Decision
Levels          1%                  -3.467851       -0.096520
                5%                  -2.877919
                10%                 -2.575581
1st difference  1%                  -3.467851       -11.19859    Reject null hypothesis
                5%                  -2.877919
                10%                 -2.575581

Fig. 4 shows the autocorrelation function (a) and the partial autocorrelation
function (b) used to identify the model and select a few candidate models. The best
model was chosen among these competing models based on the smallest RMSE
value. The model parameters were estimated by maximum likelihood. Models with
insignificant parameters (excluding the constant) were eliminated. The remaining
models then proceeded to the diagnostic checking step to see whether the models
were adequate, using the Ljung-Box statistic. The p-value for the Ljung-Box
statistic was 0.9564; therefore, it can be stated that the residuals are independently
distributed or, in other words, that the residuals from the ARIMA model have no
correlation.

FIGURE 4. (a) Autocorrelation function and (b) partial autocorrelation function of
the variable (lags 1-45)
According to the RMSE value, the best ARIMA model is ARIMA(1, 1, 0).
Therefore, after parameter estimation, the model used to forecast the fishery export
value is the following equation:
$$\hat Z_t = 0.6042\,Z_{t-1} + 0.3958\,Z_{t-2} + a_t$$
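One way to reproduce this step is with the statsmodels library, as in the sketch below; the stand-in series and forecast horizon are illustrative assumptions, not the paper's data.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# Stand-in for the (transformed) training series.
y_bc = np.cumsum(np.random.default_rng(1).normal(size=180))

model = ARIMA(y_bc, order=(1, 1, 0))   # AR(1) on the first-differenced series
fit = model.fit()
print(fit.summary())

# Ljung-Box check on the residuals: large p-values indicate white noise.
print(acorr_ljungbox(fit.resid, lags=[12]))

# Out-of-sample forecasts for the validation period (19 months here).
forecast = fit.forecast(steps=19)
```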

A difficult task with FFNN involves choosing parameters such as the number
of hidden nodes, the learning rate, and the initial weights. Input selection in the
FFNN method for forecasting fishery exports uses the significant lags in the PACF
pattern. Forecasting of fishery exports used input variables at lags (t-1), (t-2), (t-3),
and (t-9).

As discussed previously, there is as yet no theory to tell how many hidden
units are needed to approximate a given function; the network geometry is problem
dependent. Here, we use the three-layer FFNN with one hidden layer (Fig. 1) and
the common trial-and-error method to select the number of hidden nodes. The
model structure can be represented by the notation FFNN(N, M, m), where N is
the number of input nodes, M the number of hidden nodes, and m the number of
nodes in the output layer. In this study, the four input variables used in the ARIMA
model constitute the network inputs. To identify an appropriate FFNN model, M is
varied over the range 1-20 and, for each model, the fit to the calibration and
validation data sets is evaluated using the root mean square error (RMSE). The
optimum number of hidden nodes for the FFNN with LMBP is M = 19; therefore,
FFNN(4, 19, 1) is chosen. We found that for a short-term (monthly) forecast, there
is no significant advantage in selecting a precise network structure for each forecast
lead time, although it may be appropriate to use a specific network structure for
each lead time. The criterion used in this back-propagation FFNN to measure the
proximity of the neural network prediction to its target was least mean squares,
while the activation function of the output layer is the pure linear function (Haykin
[15]). The training method employed was adaptive gradient descent. The selected
model was chosen as the one with the lowest RMSE value of all those trained and
validated using a design-of-experiments methodology. All models were trained
using a maximum of 10,000 epochs in case the convergence criteria were not
achieved earlier. Table 2 shows the forecast values for the ARIMA and FFNN
models.
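A rough FFNN(4, 19, 1) analogue can be sketched with scikit-learn as below. This is an assumption-laden illustration: the paper trained with adaptive gradient descent (LMBP), whereas this sketch uses scikit-learn's built-in solver, and the series and split sizes are placeholders.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged(y, lags=(1, 2, 3, 9)):
    """Build a design matrix from the lags suggested by the PACF."""
    p = max(lags)
    X = np.column_stack([y[p - k : len(y) - k] for k in lags])
    return X, y[p:]

y = np.cumsum(np.random.default_rng(2).normal(size=199))  # stand-in series
X, target = make_lagged(y)

# FFNN(4, 19, 1): 4 lagged inputs, 19 hidden nodes, 1 linear output.
net = MLPRegressor(hidden_layer_sizes=(19,), activation='logistic',
                   solver='adam', max_iter=10000, random_state=0)
net.fit(X[:180], target[:180])
rmse = np.sqrt(np.mean((net.predict(X[180:]) - target[180:]) ** 2))
```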

TABLE 2. Forecast Results of Fishery Export Value using ARIMA and FFNN
Period Actual ARIMA FFNN
2014 JAN 237,010,432 252,233,786 228,381,190.47
FEB 235,411,214 251,836,124 228,587,306.07
MAR 247,547,514 251,993,401 228,628,074.80
APR 245,221,787 251,931,135 228,702,806.18
MAY 240,988,844 251,955,777 229,044,100.63
JUN 251,513,759 251,946,023 228,622,084.15
JUL 251,291,564 251,949,884 228,498,529.76
AUG 249,186,857 251,948,356 228,756,105.23
SEP 291,023,894 251,948,960 228,676,831.61
OCT 294,267,111 251,948,721 228,834,016.02
NOV 275,559,372 251,948,816 229,089,831.43
DEC 274,738,452 251,948,778 228,680,232.24
2015 JAN 227,095,608 251,948,793 228,561,609.23
FEB 203,291,348 251,948,787 228,596,728.03
MAR 236,593,502 251,948,790 228,243,074.43
APR 236,626,141 251,948,789 228,578,378.22
MAY 215,590,608 251,948,789 228,952,802.11
JUN 215,575,527 251,948,789 228,467,562.44
JUL 179,710,060 251,948,789 228,341,426.09

3. Concluding Remarks
According to the empirical results of the present research on Indonesia's
fishery export value, it can be stated that the performance of ARIMA is good
enough in terms of forecasting. Nevertheless, in cases where computational
capability is not a problem and the data analysis will be performed as a black box
for the final user of the forecasting system, the authors recommend the use of neural
network models, especially hybrid models that are able to take full advantage of
ARIMA models, and creating models that combine artificial intelligence
techniques.

References
[1] A.G. Triyanto, Model Peramalan Hibrida ARIMA-NN pada Data Ekspor
Indonesia, ITS, Surabaya, (2012)
[2] I. Wienarti, Peramalan (Forecasting) Nilai Ekspor Jawa Tengah menurut
Komoditi dengan Metode Exponential Smoothing Bulan Desember 2009
sampai Juni 2010, UNS, Semarang, (2011)
[3] J.H. Barus, Analisis Peramalan Ekspor Indonesia Pasca Krisis Keuangan
Eropa dan Global Tahun 2008 dengan Metode Dekomposisi, USU, Medan,
(2013)
[4] R. Ruslan, et al., Peramalan Nilai Ekspor di Propinsi Sumatera Utara
dengan Metode ARIMA Box-Jenkins, Saintia Matematika, Vol. 1, Issue 6,
pp. 579-589, (2013)
[5] G.F. Thomson, A Forecasting Model of New Zealand’s Lamb Exports,
Research Report no.223, Lincoln University, Canterbury, (1994)
[6] J.M. Kargbo, Forecasting Agricultural Exports and Imports in South Africa,
Applied Economics, Vol. 39, Issue 16, pp. 2069-2084, (2007)
[7] G. Stoevsky, Economic Forecasting of Bulgaria’s Export and Import Flows,
Discussion Papers, Bulgarian National Bank, (2009)
[8] S. Mehmood, Forecasting Pakistan’s Exports to SAARC: An Application of
Univariate ARIMA Model, Journal of Contemporary Issues in Bussiness
Research, Vol. 1, Issue 3, pp. 96-110, (2012)
[9] P. Arumugam and V. Anithakumari, Fuzzy Time Series Method for
Forecasting Taiwan Export Data, International Journal of Engineering
Trends and Technology (IJETT), Vol. 4, Issue 8, pp. 3342-3347, (2013)
[10] W. Zhang and S. Zhao, Forecasting Research on The Total Volume of
Import and Export Trade of Ningbo Port by Gray Forecasting Model,
Journal of Software, Vol. 8, Issue 2, pp. 466-471, (2013)
[11] A. Tahir, Forecasting Citrus Exports in Pakistan, Pakistan Journal of
Agricultural Research, Vol.27, Issue 1, pp. 64-68, (2014)
[12] B.B. Sen, et al, Forecasting The Quantity of Shrimp and Dry Fish Export
from Bangladesh, Journal of Economics and Sustainable Development,
Vol.6, Issue 7, pp. 52-58, (2015)
[13] S. Makridakis, et al , Forecasting: Methods and Applications (3rd ed.),
Wiley: New York, (1998)
[14] G.U. Yule, On a method of investigating periodicities in distributed series
with special reference to Wolfer’s sunspot numbers, Transactions of the
Royal Society of London Series A, Containing Papers of a Mathematical
or Physical Character, Vol. 226, pp. 267-298, (1927)
[15] S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall,
(1998)

[16] W. W. S. Wei, Time Series Analysis Univariate and Multivariate Methods,


2nd Edition, Addison Wesley Publishing Company Inc., New York, (2006)


Spatial Autoregressive Poisson (SAR Poisson) Model to Detect Influential Factors
for the Number of Dengue Haemorrhagic Fever Patients in Each District of the
Province of DKI Jakarta

Siti Rohmah Rohimah1,a), Sudarwanto2,b), Ria Arafiyah3,c)

1,2,3 Mathematics Department, Faculty of Mathematics and Natural Sciences,
State University of Jakarta

a) srohmahrohimah@yahoo.com
b) sudarwanto@gmail.com
c) ria_lamrat@yahoo.com

Abstract. Dengue Haemorrhagic Fever (DHF) is a rare occurrence, but DHF is a serious problem
that requires special handling in each district. This research uses spatial autoregressive Poisson
models to detect factors that influence the number of DHF patients. Based on the results of this
study, the factors that influence the number of DHF patients are both spatial and non-spatial. The
spatial factor affecting a particular location is its neighboring locations, while the non-spatial
factors that influence the number of DHF patients are the number of public health centers and the
volume of waste. In this research, the parameters were estimated by maximum likelihood
estimation. Furthermore, the spatial autoregressive Poisson model achieved a smaller standard
error than the non-spatial Poisson regression model.
Keywords: Spatial Autoregressive Poisson, Dengue Haemorrhagic Fever (DHF), maximum likelihood, Poisson
regression

1. Introduction
The government of DKI Jakarta has continually improved health services through the
years. The improvement of the health sector has always been an important aspect that receives
special treatment from the government; hence, the development of the health sector is considered
an effort to improve society's welfare. This effort has shown positive achievements. The number
of health facilities and the infrastructure have been improved continuously, especially hospitals,
health centers, clinics, laboratories, pharmacies, and Posyandu. Besides improving health facilities
and infrastructure, the government has also held health programs through the clinics of each
district, such as Posyandu, health education, eradication of mosquito breeding activity,
epidemiological studies, fogging, public health training, and hospitalization of DHF patients.
DHF remains a major health problem in big cities, and Jakarta is one of the provinces in Indonesia
with the highest number of citizens suffering from dengue fever.
DHF cases are among the major health sector problems often found in Jakarta. A DHF
case requires precise handling, since otherwise it can cause death. Research on the factors that
influence the number of patients with dengue fever in Jakarta has been carried out before; for
example, Afira and Mansour [1] discussed the incidence of DHF in the Gambir and Sawah Besar
Districts in 2005-2009.
Everything is related to everything else, but near things are more related than distant
things [2]. Hence, there may be spatial effects between one region and another. A model that can
explain the relationship between one area and another is called a spatial model. The number of
DHF patients in one region is hypothesized to be influenced by the surrounding regions and to
follow a Poisson distribution. Therefore, this study uses the spatial autoregressive Poisson (SAR
Poisson) model. The SAR Poisson model is expected to determine the factors that influence the
number of DHF patients both spatially and non-spatially, so the results of this research can be
used as a reference for the DHF eradication program in each district in Jakarta. The purpose of
this study is to determine the spatial and non-spatial factors that influence the number of DHF
patients in DKI Jakarta. This research is expected to provide input to the local government of
Jakarta in resolving the cases of dengue fever spatially and non-spatially.
Dengue fever is also called Haemorrhagic Fever because it is accompanied by fever and bleeding [13]. The disease is spread by a mosquito vector called Aedes aegypti and is classified as an unusual event. DHF can be diagnosed through symptoms such as high fever and the appearance of a rash; to obtain a more accurate diagnosis, various laboratory tests are usually conducted.
The laboratory tests include: counting the number of antibodies against the dengue virus and measuring the complete leukocyte blood count, hemoglobin, hematocrit, and platelets. Leukocytes, or white blood cells, are cells that contain a nucleus, with a normal range of 4300-11300/mm3 for women and 4300-10300/mm3 for men. Hemoglobin is a complex iron-containing red protein present in erythrocytes. Normal hemoglobin levels are 11.4-15.1 g/dl for women and 13.4-17.7 g/dl for men. Hematocrit is a number that indicates the percentage of solids in the blood relative to the blood fluid; its normal range in the human body is 38-42% for women and 40-47% for men. Platelets, the smallest cellular elements formed in the bone marrow, are important in the clotting and hemostatic process. Under normal conditions the number of platelets for women and men is between 150000-400000/mm3.
The factors assumed to affect the number of DHF patients are spatial and non-spatial. The spatial factor that affects the number of dengue fever patients at a certain location is the number of dengue fever patients at the surrounding locations, while the non-spatial factors suspected to affect the number of dengue fever patients are the population density, the number of health centers, and the volume of waste.

Poisson Regression Model


Poisson regression is a regression model in which the response variable (Y) has a Poisson probability distribution, e.g. a random variable Y representing the number of events that occur over a period of time or in a particular region. The Poisson distribution is determined by the probability mass function (Fleiss et al. 2003):

P(Y = y \mid \mu) = \frac{e^{-\mu}\mu^{y}}{y!}, \quad y = 0, 1, 2, \dots \qquad (1)

Let Y_1, ..., Y_n be a random sample from a Poisson distribution with mean μ_i. The probability mass function of Y_i is expressed as follows:

f(y_i \mid \mu_i) = \frac{\mu_i^{y_i} e^{-\mu_i}}{y_i!} \qquad (2)

Let η = X'β be a systematic component, a linear function of the independent variables X and the unknown parameters β. η is linked with μ through the link function h(μ) = η with h(μ) = log μ. The multiple Poisson regression model can therefore be written as:

\log \mu_i = x_{i1}\beta_1 + \dots + x_{ik}\beta_k + \varepsilon_i \qquad (3)

where x_ik is the k-th independent variable for the i-th response, i = 1, 2, ..., n [4].
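To make the model concrete, the following is a minimal sketch of fitting model (3) by maximum likelihood in Python with the statsmodels library. The data here are simulated and hypothetical, and the paper's own analysis was carried out in R, so this is only an illustration:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: 42 districts, 3 covariates
# (population density, number of health centers, waste volume).
rng = np.random.default_rng(0)
X = rng.uniform(size=(42, 3))
y = rng.poisson(lam=np.exp(2 + X @ np.array([0.0, 0.1, 0.3])))

X = sm.add_constant(X)                               # adds the intercept beta_0
model = sm.GLM(y, X, family=sm.families.Poisson())   # log link by default
result = model.fit()                                 # maximum likelihood (IRLS)
print(result.params, result.bse)                     # estimates and standard errors
```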
Model SAR (Spatial Autoregressive Model)
A spatial contiguity matrix is a matrix that depicts the relation between one region and another. The i-th row of the weight matrix shows the relation between the i-th response and the other responses, so the weight matrix is of size (n × n), with n the total number of responses. The weighting used is the nearest neighborhood [8], defined as follows:

W_{ij} = \begin{cases} 1, & \text{if } j \text{ is the nearest neighbor of } i \\ 0, & \text{otherwise} \end{cases}

A row of the spatial contiguity matrix shows the spatial relation between one region and the others; therefore the sum of the i-th row is the number of neighbors owned by region i, denoted c_{i.} = \sum_{j=1}^{n} c_{ij}, with c_{i.} the sum of all weights in the i-th row and c_{ij} the weight in the i-th row and j-th column. Then w*_{ij} = c_{ij}/c_{i.} is the standardized matrix element, so that each row sums to 1.
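As a concrete illustration of this standardization, a short Python sketch with a small, hypothetical contiguity matrix:

```python
import numpy as np

# Hypothetical binary contiguity matrix C: C[i, j] = 1 if regions i and j
# are neighbors, 0 otherwise (4 regions in this toy example).
C = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)

row_sums = C.sum(axis=1, keepdims=True)  # c_i. = number of neighbors of region i
W_star = C / row_sums                    # w*_ij = c_ij / c_i., each row sums to 1
print(W_star.sum(axis=1))                # [1. 1. 1. 1.]
```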
The SAR model equation in [8] can be written as follows:

y_i = \rho \sum_{j=1}^{n} w^{*}_{ij} y_j + x_i\beta + \varepsilon_i, \qquad (4)

where ρ is the spatial autoregressive coefficient, w*_{ij} is the standardized spatial weight of the i-th region and j-th neighbor, and ε_i is a random error, independently and identically distributed.
In matrix form, the SAR model can be written as:

\mathbf{y} = \rho\mathbf{W}^{*}\mathbf{y} + \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \qquad (5)

The reduced form of the SAR model is:

\mathbf{y} = \mathbf{A}^{-1}\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}^{*} \qquad (6)

with A = I − ρW*, A^{-1} the inverse of matrix A, and ε* = A^{-1}ε.
The spatial autoregressive model for count data [10] specifies the mean as:

\mu_i^{SAR} = \exp(\mathbf{a}_i\mathbf{X}\boldsymbol{\beta}) \qquad (7)

where a_i is the (1 × n) row vector for the i-th region. In the Poisson SAR model, the expected value at one region or location i is a function of the neighboring regions or locations j. The Poisson SAR model is also used for response variables consisting of count data. The probability mass function of the Poisson SAR model is:

f(y_i \mid \mathbf{X}, \mathbf{W}^{*}; \boldsymbol{\beta}, \rho) = \frac{(\mu_i^{SAR})^{y_i}\exp(-\mu_i^{SAR})}{y_i!} \qquad (8)

with μ_i^{SAR} = exp(a_i Xβ). The likelihood function is:

L(\boldsymbol{\beta}, \rho \mid \mathbf{X}, \mathbf{W}^{*}; y_1, y_2, \dots, y_n) = \prod_{i=1}^{n}\left\{\frac{(\mu_i^{SAR})^{y_i}\exp(-\mu_i^{SAR})}{y_i!}\right\}

The parameters ρ and β are estimated using the maximum likelihood method. The probability mass function of the Poisson SAR distribution is:

f(y_i \mid \mathbf{X}, \mathbf{W}^{*}; \boldsymbol{\beta}, \rho) = \frac{(\mu_i^{SAR})^{y_i}\exp(-\mu_i^{SAR})}{y_i!} \qquad (9)
With μ_i^{SAR} = exp(a_i Xβ), the log-likelihood function is:

\ln L(\boldsymbol{\beta}, \rho \mid \mathbf{X}, \mathbf{W}^{*}; y_1, y_2, \dots, y_n) = \mathbf{y}'\mathbf{A}^{-1}\mathbf{X}\boldsymbol{\beta} - \sum_{i=1}^{n}\exp(\mathbf{a}_i\mathbf{X}\boldsymbol{\beta}) - \sum_{i=1}^{n}\ln(y_i!) \qquad (10)
The parameters ρ and β of the Poisson SAR model are estimated iteratively with the Newton-Raphson method (a sketch of this iteration in code is given after this list). The steps of this method are:
1. Determine β̂*(0), with β*(0) = [ρ_0 β_00 β_10 ... β_k0], at iteration t = 0.
2. Form the gradient vector g'_{t+1} = [∂ ln L(β*)/∂ρ, ∂ ln L(β*)/∂β], with t the iteration number.
3. Form the Hessian matrix H:

\mathbf{H}_{(k+1)\times(k+1)} = \begin{bmatrix} \frac{\partial^2 \ln L(\beta^*)}{\partial \rho^2} & \frac{\partial^2 \ln L(\beta^*)}{\partial \beta_0 \partial \rho} & \cdots & \frac{\partial^2 \ln L(\beta^*)}{\partial \beta_k \partial \rho} \\ & \frac{\partial^2 \ln L(\beta^*)}{\partial \beta_0^2} & \cdots & \frac{\partial^2 \ln L(\beta^*)}{\partial \beta_0 \partial \beta_k} \\ & & \ddots & \vdots \\ & & & \frac{\partial^2 \ln L(\beta^*)}{\partial \beta_k^2} \end{bmatrix}

4. Substitute β̂*(0) into the vector g and the matrix H to obtain g^(0) and H^(0).
5. Iterate from t = 0 using the equation β*(t+1) = β*_t − H_t^{-1} g'_t; β*_t is the set of parameter estimates that converges at the t-th iteration.
6. If the parameter estimates have not yet converged, return to step 2 until convergence is obtained. The convergence criterion is met when the eigenvalues of the Fisher information matrix are positive.
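Below is a minimal sketch of this procedure for the log-likelihood (10). It uses finite-difference approximations of the gradient and Hessian instead of analytic derivatives, and simulated toy data; the paper's own computation was done in R, so the data, starting values, and step sizes here are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def loglik(theta, y, X, W):
    """Log-likelihood (10) of the Poisson SAR model; theta = [rho, beta_0..beta_k]."""
    rho, beta = theta[0], theta[1:]
    A_inv = np.linalg.inv(np.eye(len(y)) - rho * W)   # A^{-1}, with A = I - rho W*
    eta = A_inv @ (X @ beta)                          # stacks a_i X beta for all i
    return y @ eta - np.exp(eta).sum()                # ln(y_i!) is constant in theta

def newton_raphson(f, theta0, args, tol=1e-8, h=1e-5, max_iter=50):
    """Newton-Raphson ascent with finite-difference gradient and Hessian."""
    theta = np.asarray(theta0, dtype=float)
    p = len(theta)
    for _ in range(max_iter):
        E = h * np.eye(p)
        g = np.array([(f(theta + E[i], *args) - f(theta - E[i], *args)) / (2 * h)
                      for i in range(p)])
        H = np.array([[(f(theta + E[i] + E[j], *args)
                        - f(theta + E[i] - E[j], *args)
                        - f(theta - E[i] + E[j], *args)
                        + f(theta - E[i] - E[j], *args)) / (4 * h * h)
                       for j in range(p)] for i in range(p)])
        step = np.linalg.solve(H, g)                  # H^{-1} g
        theta = theta - step                          # beta*(t+1) = beta*_t - H^{-1} g
        if np.linalg.norm(step) < tol:
            break
    return theta

# Hypothetical toy data: 30 regions, one nearest neighbor per row (on a ring),
# so W is already row-standardized.
n = 30
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = 1.0
X = np.column_stack([np.ones(n), rng.uniform(size=n)])
true = np.array([0.1, 1.0, 0.5])                      # [rho, beta_0, beta_1]
mu = np.exp(np.linalg.inv(np.eye(n) - true[0] * W) @ (X @ true[1:]))
y = rng.poisson(mu).astype(float)

print(newton_raphson(loglik, theta0=np.array([0.0, 0.5, 0.0]), args=(y, X, W)))
```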
The Wald test is used to test the significance of the spatial correlation coefficient ρ̂ and of β̂ [10]. The hypothesis test for ρ is:
H_0: ρ = 0 (there is no spatial correlation)
H_1: ρ ≠ 0 (there is a spatial correlation)
G_\rho = \left\{\frac{\hat{\rho}_0}{\widehat{se}(\hat{\rho}_0)}\right\}^2

The G_ρ statistic follows a χ² distribution with 1 degree of freedom. The decision criterion is to reject H_0 if G_ρ > χ²_(α/2;1).
The hypothesis for the coefficient parameter β_k (Fleiss et al. 2003) is:
H_0: β_k = 0
H_1: β_k ≠ 0
with Wald test statistic:

G_\beta = \left\{\frac{\hat{\beta}_k}{\widehat{se}(\hat{\beta}_k)}\right\}^2
The G_β statistic follows a χ² distribution with 1 degree of freedom; the decision criterion is to reject H_0 if G_β > χ²_(α/2;1). Standard errors are obtained from the Fisher information matrix I(θ) (McCulloch and Searle 2001), given by:

\mathbf{I}(\boldsymbol{\theta}) = -\begin{bmatrix} \frac{\partial^2 \ln L(\beta^*)}{\partial \rho^2} & \frac{\partial^2 \ln L(\beta^*)}{\partial \beta_0 \partial \rho} & \cdots & \frac{\partial^2 \ln L(\beta^*)}{\partial \beta_k \partial \rho} \\ & \frac{\partial^2 \ln L(\beta^*)}{\partial \beta_0^2} & \cdots & \frac{\partial^2 \ln L(\beta^*)}{\partial \beta_0 \partial \beta_k} \\ & & \ddots & \vdots \\ & & & \frac{\partial^2 \ln L(\beta^*)}{\partial \beta_k^2} \end{bmatrix}

The variance of θ̂ ≈ [I(θ)]^{-1}, so the standard error = √([I(θ)]^{-1}).
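A small sketch of this decision rule follows; the estimate and standard error below are hypothetical numbers, and note that the critical value χ²_(α/2;1) = chi2.ppf(1 − α/2, 1) ≈ 5.024 for α = 5%, the value used in Tables 1 and 2:

```python
from scipy.stats import chi2

est, se = 9.43e-02, 7.85e-03            # hypothetical estimate and standard error
G = (est / se) ** 2                     # Wald statistic
crit = chi2.ppf(1 - 0.05 / 2, df=1)     # chi-square critical value = 5.024
print(G, crit, G > crit)                # reject H0 if True
```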
After estimating the parameters and testing the significance of each parameter estimator, a determination coefficient is needed to depict the relation between the response variable and the independent variables. The determination coefficient, R², is the percentage of variation that can be explained by the regression equation. One R² measure developed by Cameron and Windmeijer [5] is based on the deviance residual (R²_DEV):

R^2_{DEV} = 1 - \frac{\ln L(y) - \ln L(\hat{\mu})}{\ln L(y) - \ln L(\bar{y})}

with ln L(y) = \sum_{i}^{n}[y_i \ln(y_i) - y_i - \ln(y_i!)] the natural logarithm (ln) of the likelihood function when no parameters β_j (j = 0, 1, 2, ..., k) are included in the model and y_i itself is used as the response; ln L(μ̂) = \sum_{i}^{n}[y_i \ln(\hat{\mu}_i) - \hat{\mu}_i - \ln(y_i!)] the log-likelihood when all parameters β_j are included in the model, with μ̂_i the estimated value for the i-th response; ln L(ȳ) = \sum_{i}^{n}[y_i \ln(\bar{y}) - \bar{y} - \ln(y_i!)] the log-likelihood when only the parameter β_0 is included in the model; and ȳ the mean of the response y.
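A short sketch of computing R²_DEV from observed counts and fitted means, assuming the Poisson log-likelihood forms above; the arrays y and mu_hat are hypothetical:

```python
import numpy as np
from scipy.special import gammaln  # ln(y!) = gammaln(y + 1)

def poisson_loglik(y, mu):
    # sum_i [ y_i ln(mu_i) - mu_i - ln(y_i!) ], with the convention 0*ln(0) = 0
    return np.sum(np.where(y > 0, y * np.log(np.where(mu > 0, mu, 1.0)), 0.0)
                  - mu - gammaln(y + 1))

y = np.array([4, 0, 7, 2, 5], dtype=float)        # hypothetical counts
mu_hat = np.array([3.5, 0.8, 6.1, 2.4, 4.9])      # hypothetical fitted means

r2_dev = 1 - (poisson_loglik(y, y) - poisson_loglik(y, mu_hat)) \
           / (poisson_loglik(y, y) - poisson_loglik(y, np.full_like(y, y.mean())))
print(round(r2_dev, 3))
```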

The Analysis Steps


The data were analyzed using R 2.10.1, and the analysis steps are as follows:
1. Determine the independent variables.
2. Determine the spatial weight matrix W*.
3. Estimate the parameters β and ρ with the Newton-Raphson method.
4. Test parameter significance with the Wald test.
5. Test the goodness of fit of the model with R²_DEV.
6. Draw conclusions.

The Data
This research uses 2013 data from BPS DKI Jakarta covering 42 districts in DKI Jakarta. In addition, data were obtained from the surveillance data of the Public Health and Public Cleanness offices of DKI Jakarta. The response variable is the number of dengue fever patients in Jakarta province excluding the Seribu Islands, while the independent variables in this research are: population density (X_1), the number of health centers (X_2), and the volume of waste (X_3).
2. Main Results
Poisson Regression Model Analysis
The Poisson regression model is formed using three independent variables simultaneously: population density (X_1), the number of health centers (X_2), and the volume of waste (X_3). The parameter estimates of this model are shown in Table 1. To interpret the Poisson regression model, the odds ratio of each coefficient is used. The model shows that an increase in population density (X_1) of one person per km² increases the mean number of DHF patients by 0.000392%; however, based on Table 1, population density does not have a significant influence. Every additional health center (X_2) increases the mean number of DHF patients by 12.29%, while every additional ton of waste (X_3) increases it by 0.29%. The parameter estimates of the Poisson regression are then used as initial values to obtain the parameter estimates of the Poisson SAR model. The determination coefficient (R² deviance) of the Poisson regression model is 22%. The parameter estimates of the Poisson regression model are as follows.

TABLE 1. Parameter estimates of the Poisson regression model

Parameter | Estimated value | Standard error | G value
beta 0    | 2.03E+00        | 1.17E-01       | 2.99E+02*
beta 1    | 3.92E-06        | 2.50E-06       | 2.46E+00
beta 2    | 1.16E-01        | 1.06E-02       | 1.19E+02*
beta 3    | 2.90E-03        | 4.36E-04       | 4.41E+01*

Note: * significant at alpha = 5% (critical value χ²(α/2;1) = 5.024)

The Analysis of Poisson SAR Model


The coefficients of the SAR model were estimated using the maximum likelihood method. The Poisson SAR model is a nonlinear model with no closed form, so the parameter estimation uses iteration with the Newton-Raphson method.
TABLE 2. Parameter estimates of the SAR model

Parameter | Estimated value | Standard error | G value
rho       | 9.75E-02        | 9.46E-04       | 10630.89*
beta 0    | 2.03E+00        | 4.67E-04       | 18849862*
beta 1    | 1.90E-06        | 2.18E-06       | 0.761109
beta 2    | 9.43E-02        | 7.85E-03       | 144.2874*
beta 3    | 2.32E-03        | 3.91E-04       | 35.01243*

Note: * significant at alpha = 5% (critical value χ²(α/2;1) = 5.024)

The SAR Poisson model analysis involving all districts in Jakarta shows that the number of dengue fever patients is influenced by the proximity of areas and by some independent variables. Table 2 shows the significance test of each parameter estimate using the Wald test. The Wald test results show that the spatial correlation is significant: the spatial correlation value is ρ = 0.1 with G_ρ = 10630.89 and χ²_1 critical value 5.024, so the spatial correlation of the model is significant at the level α = 5%. It can therefore be concluded that the number of DHF patients in a certain region influences the number of DHF patients at the nearby locations. The significance tests for the parameter estimates β_2 and β_3 give G_β > χ²_1, showing that including X_2 and X_3 in the model is significant, while the parameter estimate β_1 is not significant since G_β < χ²_1.
This model suggests that every additional person per km² of population density (X_1) increases the mean number of DHF patients by 0.00019%; however, as shown in Table 2, X_1 does not have a significant effect on the model. Moreover, every additional health center (X_2) increases the mean number of DHF patients by 9.889%, while every additional ton of waste (X_3) increases it by 0.23%. The goodness of fit of the model can be seen from R²: based on the R² deviance, the variance of the number of DHF patients explained by the independent variables is 21%. Compared with the Poisson regression estimates, the standard errors obtained for the Poisson SAR model are smaller. The obtained Poisson SAR model can be written as follows:

\mu_i^{SAR} = \exp(\mathbf{a}_i\mathbf{X}\boldsymbol{\beta}) \quad \text{with } \rho = 0.1 \text{ and } \boldsymbol{\beta} = (2.03\text{E+00},\ 1.90\text{E-06},\ 9.43\text{E-02},\ 2.32\text{E-03})'

3. Concluding Remarks
The factors that influence the number of DHF patients in DKI Jakarta province based on the SAR Poisson model are spatial and non-spatial. The spatial factor that influences a certain district is its neighbors; the Poisson SAR model gives a significant spatial correlation of ρ = 0.1. The non-spatial factors that significantly influence the number of DHF patients, based on the Poisson SAR model, are the number of health centers and the volume of waste. In both the Poisson regression model and the Poisson SAR model, population density is not significant. Modelling the number of DHF patients using the Poisson regression model gives R² = 22%, while the Poisson SAR model gives R² = 21%. The standard errors of the Poisson SAR model are smaller than those of the Poisson regression model, so it can be concluded that the SAR Poisson model used here is better than the Poisson regression model.

Acknowledgement
The authors would like to thank the State University of Jakarta, which financed this research. This research was funded by DIPA UNJ in the fiscal year 2014.

References
[1] F. Afira and M. Mansyur, Gambaran Kejadian Demam Berdarah Dengue di Kecamatan
Gambir dan Kecamatan Sawah Besar Jakarta Pusat Tahun 2005-2009, Jurnal Gambaran
Kejadian Demam Berdarah Dengue vol. 1 No.1 April 2013, [FKUI].
[2] L. Anselin, Spatial Econometrics: Methods and Models, Dordrecht, Kluwer Academic Publishers, (1988).
[3] [BPS] Badan Pusat Statistika, 2013, Provinsi DKI Jakarta dalam Angka, DKI Jakarta, BPS.
[4] A.C. Cameron and P.K. Trivedi, Regression Analysis of Count Data, New York, Cambridge
University, (1998).
[5] A.C. Cameron and F.A.G. Windmeijer, R-squared Measures for Count Data Regression
Models with Applications to Health Care Utilization, Journal of Business and Economics
Statistics (1995).
[6] R.Y. Fa’rifah and Purhadi, Analisis Survival Faktor-Faktor yang Mempengaruhi Laju
Kesembuhan Pasien Penderita Demam Berdarah Dengue (DBD) di RSU Haji Surabaya
dengan regresi Cox, Jurnal Sains dan Seni ITS Vol. 1 No.1 September 2012. [FMIPA ITS].
[7] J.L. Fleiss, B. Levin, and M.C. Paik, Statistical Methods for Rates and Proportions Third
Edition, USA, Columbia University, (2003).
[8] A.S. Fotheringham and P.A. Rogerson, Handbook of Spatial Analysis, London, Sage
Publications Ltd, (2009).
[9] D.G. Kleinbaum, L.L. Kupper, and K.E. Muller, Applied Regression Analysis and Other
Multivariable Methods, Boston, PWS-KENT Publishing Company, (1988).
[10] D.M. Lambert, J.P. Brown, and RJGM Florax, A Two-Step Estimator for a Spatial Lag
Model of Counts: Theory, Small Sample Performance and application, USA, Dept. of
Agricultural Economics Purdue University, (2010).
[11] J. Lee and DWS Wong, Statistics for Spatial Data, New York, John Wiley & Sons, Inc,
(2001).
[12] C.E. McCulloch and S.R. Searle, Generalized Linear and Mixed Models, Canada: John
Wiley & Sons, Inc, (2001).
[13] R.M. Mulia, Kesehatan Lingkungan, Yogyakarta, Graha Ilmu, (2005).
Proceedings of IICMA 2015
Mathematics of Finance

Loss Severity Distribution Estimation of Operational Risk using Gaussian Mixture Model for Loss Distribution Approach

Seli Siti Sholihat1,a), Hendri Murfi2,b)

1,2Department of Mathematics, Faculty of Mathematics and Natural Sciences,
Universitas Indonesia

a)seli.siti.sholihat@sci.ui.ac.id
b)hendri@ui.ac.id

Abstract. Banks must be able to manage all banking risks, one of which is operational risk. Banks manage operational risk by estimating it through what is known as the economic capital (EC). The Loss Distribution Approach (LDA) is a popular method to estimate economic capital. This paper proposes the Gaussian Mixture Model (GMM) for the severity distribution estimation of the loss distribution approach. The result of this research is that the EC value of the LDA method using GMM is 2%-2.8% smaller than the EC value of LDA using an existing distribution model.

Keywords: Loss Distribution Approach, Gaussian Mixture Model, Bayesian Information Criterion, LDA, GMM, BIC.

1. Introduction
Banks must be able to manage all banking risks, one of which is operational risk. A common industry definition of operational risk is "the risk of direct or indirect loss resulting from inadequate or failed internal processes, people or systems, or from external events", Frachot [4]. Banks manage operational risk by calculating an estimate of operational risk known as the economic capital (EC).
Economic capital (EC) is the amount of capital that an organization must set aside to offset potential losses. There are three approaches to calculating economic capital based on the Basel II Accord: the Basic Indicator Approach (BIA), the Standardized Approach (SA), and the Advanced Measurement Approach (AMA), Frachot [4]. The capital charge under BIA and SA is calculated by a fixed percentage. Under AMA, a bank can calculate EC based on its internal loss data: internal data are used as input to compute the probability distribution of losses. The popular approach within AMA is the Loss Distribution Approach (LDA).
Mathematically, the total annual operational loss is:
Z(t) = \sum_{i=1}^{N(t)} X^{(i)}(t) \qquad (1.1)
where:
N(t): random variable of the number of loss events in one year; the distribution of N(t) is called the frequency distribution.
X^(i)(t): random variable of the amount of loss for the i-th event; the distribution of X^(i)(t) is called the severity distribution.
Z(t): annual loss, the sum of the losses X^(i)(t) in one year; the distribution of Z(t) is called the aggregate distribution.

In the LDA method, the loss severity distribution and the loss frequency distribution must be estimated, and the aggregate distribution is then formed from both of them. Through the LDA method, the value of EC is obtained from the Value at Risk (VaR) of the aggregate distribution at a confidence level of 99.9%. The aggregate distribution of the random variable Z cannot be expressed analytically, so a numerical approach is needed to determine it. Several well-known numerical methods that can be used are the Monte Carlo method, the Fast Fourier Transform, and Panjer recursion. This study uses the most easily implemented one, namely the Monte Carlo method, Shevchenko [9]. One of the problems in LDA is that a severity distribution estimated with a particular parametric distribution model may not describe the data well. Estimating the severity distribution from the data is therefore used to solve this problem.
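As a rough sketch of this Monte Carlo step, assuming a Poisson frequency parameter lam and some severity sampler (here a placeholder log-normal; in this paper's setting it would draw from the fitted GMM, e.g. via scikit-learn's GaussianMixture.sample):

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_severity(n):
    # Placeholder severity sampler with hypothetical parameters.
    return rng.lognormal(mean=10.0, sigma=1.0, size=n)

def monte_carlo_ec(lam, n_sims=100_000, level=0.999):
    """Simulate annual losses Z = sum of N severities, N ~ Poisson(lam)."""
    counts = rng.poisson(lam, size=n_sims)            # frequencies N(t)
    annual = np.array([sample_severity(n).sum() for n in counts])
    return np.quantile(annual, level)                 # EC as the 99.9% VaR

print(monte_carlo_ec(lam=54))
```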
One of the methods that estimates a probability distribution function based on data is the Gaussian Mixture Model (GMM). GMM is a parametric method that estimates the probability density of a random variable. The probability density of a GMM is a linear combination of several Gaussian distributions, that is:

p(x) = \sum_{k=1}^{K} \pi_k \,\mathcal{N}(x \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \qquad (1.2)

where:
p(x): probability density of x;
K: the number of Gaussian distributions used;
π_k: the k-th mixing coefficient, with ∑_k π_k = 1 and 0 ≤ π_k ≤ 1;
N(x | μ_k, Σ_k): the k-th normal/Gaussian distribution, k = 1, 2, ...,

\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \Sigma_k) = \frac{1}{\sqrt{2\pi|\boldsymbol{\Sigma}_k|}} \exp\left\{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_k)^{T}\Sigma_k^{-1}(\mathbf{x} - \boldsymbol{\mu}_k)\right\}

Each Gaussian distribution N(x | μ_k, Σ_k) is called a component of the mixture, and each component has a different mean μ_k and covariance Σ_k. The GMM is formed by the parameters π, μ, and Σ, where π = (π_1, π_2, ..., π_K), μ = (μ_1, μ_2, ..., μ_K) and Σ = (Σ_1, Σ_2, ..., Σ_K). The parameter π_k is called the mixing coefficient. An illustration of GMM is shown in Bishop [1].
The question is which K is better for the GMM. The number of components in a GMM can be selected using model selection. Two popular model selection criteria are the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). As a selection criterion, BIC has proven consistent in estimating the density function of a mixture model, Dempster [3]. BIC has also proved consistent in choosing the number of components in a mixture model, Claeskens and Hjort [2]. These are the reasons for choosing BIC in this study.
The best model according to BIC is obtained by giving a score to each model and then choosing the model with the smallest score. The BIC score of a model is computed as, Claeskens and Hjort [2]:

BIC = -2 ln(L(θ)) + dim(θ) ln(n)

where:
L(θ): the value of the likelihood function of the model with estimated parameters θ;
n: the number of data points.
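A minimal sketch of this selection with scikit-learn, whose GaussianMixture.bic implements the same score; the one-dimensional loss data below are hypothetical:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical one-dimensional loss severities.
x = np.concatenate([rng.normal(32_000, 11_000, 300),
                    rng.normal(66_000, 18_000, 200)]).reshape(-1, 1)

# Fit GMMs for K = 1..10 and keep the one with the smallest BIC score.
fits = {k: GaussianMixture(n_components=k, random_state=0).fit(x)
        for k in range(1, 11)}
best_k = min(fits, key=lambda k: fits[k].bic(x))
print("optimal K by BIC:", best_k)
print("mixing coefficients:", fits[best_k].weights_)
print("means:", fits[best_k].means_.ravel())
```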

2. Main Results
The simulation software uses the Python programming language. The simulation in this paper calculates the value of EC using LDA, in which the severity distribution is estimated by GMM. As a first step, we generated toy data for operational risk (assumed to play the role of real operational risk data); with these data we then estimated the frequency distribution with a Poisson distribution and the severity distribution with a K-component GMM (with K selected using BIC). Next, simulations were run to generate further data for the LDA consistent with the operational risk data. The result of the LDA simulation is the EC value. To see how GMM works within LDA, the EC value obtained with GMM is compared with the EC value obtained with another distribution model.
The data are generated in 3 groups: 3 years, 5 years, and 10 years. Histograms of the data are shown in Figure 1.

FIGURE 1. Histograms of the generated loss data for 3 years, 5 years, and 10 years (x-axis: value of loss)
First, the frequency distribution is estimated. The frequencies of losses per year in operational risk are values of the random variable N, the number of loss events incurred within one year. The distribution of this random variable N can be estimated with a Poisson distribution, because the number of loss events incurred in a particular year does not depend on the number of events in other years. The parameter of the Poisson distribution is the mean λ: for the 3-year data λ = 165, for the 5-year data λ = 60, and for the 10-year data λ = 54. The frequency distributions formed can be seen in Figure 2.
FIGURE 2. Frequency distributions for the 3-year, 5-year, and 10-year data (x-axis: frequency of loss events in one year)

Next, the severity distribution is estimated using GMM. GMM is a parametric model, so the thing to do is to determine the parameters of the GMM. Simulations were performed on the three groups of data: 3 years, 5 years and 10 years. The number of components K in the GMM for each data set is determined in advance by BIC model selection. BIC is an iterative method that determines the optimal model by scoring each model with a different number of components k; the optimal model is the one with the smallest score and the smallest number of components. For the 3-year data, the BIC method produces an optimal k = 2; for the 5-year data, an optimal k = 4; and for the 10-year data, an optimal K = 3.
TABLE 1. Parameters of GMM for the 3-year, 5-year and 10-year data

Data 3 years
Component | Mixing coefficient | Mean       | Variance
1         | 0.4055             | 66436.9129 | 3.23732875e+08
2         | 0.5945             | 32211.7052 | 1.34890412e+08

Data 5 years
Component | Mixing coefficient | Mean       | Variance
1         | 0.1511             | 67914.4881 | 1.13370445e+08
2         | 0.5348             | 36794.2922 | 1.02323533e+08
3         | 0.2511             | 14826.8864 | 1.00002549e+07
4         | 0.0630             | 92860.0602 | 5.95976389e+06

Data 10 years
Component | Mixing coefficient | Mean       | Variance
1         | 0.4041             | 54793.2608 | 2.07430867e+08
2         | 0.2056             | 16073.4845 | 1.20531853e+07
3         | 0.3903             | 32175.9726 | 8.76020500e+07

The following is an illustration of the GMM for the 10-year data with 3 Gaussian components, with the parameters from Table 1:

p(x) = \sum_{k=1}^{3} \pi_k \mathcal{N}(x \mid \mu_k, \Sigma_k) = \pi_1 \mathcal{N}(x \mid \mu_1, \Sigma_1) + \pi_2 \mathcal{N}(x \mid \mu_2, \Sigma_2) + \pi_3 \mathcal{N}(x \mid \mu_3, \Sigma_3)

p(x) = 0.4041\,\mathcal{N}(x \mid 54793.2608,\ 2.07430867\mathrm{e}{+}08) + 0.2056\,\mathcal{N}(x \mid 16073.4845,\ 1.20531853\mathrm{e}{+}07) + 0.3903\,\mathcal{N}(x \mid 32175.9726,\ 8.76020500\mathrm{e}{+}07)
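Evaluating this mixture density at a point is straightforward; a small sketch using the Table 1 parameters for the 10-year data (note that the scale of each component is the square root of its variance):

```python
import numpy as np
from scipy.stats import norm

weights = np.array([0.4041, 0.2056, 0.3903])            # mixing coefficients pi_k
means = np.array([54793.2608, 16073.4845, 32175.9726])
variances = np.array([2.07430867e+08, 1.20531853e+07, 8.76020500e+07])

def gmm_pdf(x):
    # p(x) = sum_k pi_k N(x | mu_k, sigma_k^2)
    return np.sum(weights * norm.pdf(x, loc=means, scale=np.sqrt(variances)))

print(gmm_pdf(30_000.0))
```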

The red curves in Figure 3 are the GMM curves for each data set. Figure 3 also shows that the curves estimate the data very well: the ridges in the histograms are followed properly by the red curves.
To see how the number of components k in a GMM affects the severity distribution estimate, estimation was also performed for k = 1, 2, 3, 4, and k = 10 on the three groups of data (3 years, 5 years, and 10 years). This was done to check visually whether the best model selected with BIC approximates the data well, and to compare it with other GMM models. Figure 4 shows the GMM probability density functions for k = 1, 2, 3, 4 and 10. For k = 10, the pdf curve lacks smoothness and becomes increasingly peaked. Moreover, the GMM estimate with a large k (k = 10) appears not too different from the optimal GMM estimate with the K obtained by BIC.
The severity distributions estimated by GMM are shown in Figure 3.

FIGURE 3. Severity distributions estimated by GMM with the number of components K chosen by BIC (panels: pdf of the amount of losses for the 3-year, 5-year, and 10-year data)

FIGURE 4. Severity distributions estimated by GMM for k = 1, 2, 3, 4, and 10 (panels: pdf of the amount of losses for the 3-year, 5-year, and 10-year data)
Visually, for the 3-year data, the best model appears to be the GMM with 3 components (yellow curve), while the GMM with 2 components is still not a good approximation. This contrasts with the best GMM model produced by the BIC method, for which 2 components is optimal. For the 5-year data, the best model appears to be the GMM with 4 components (blue curve), and for the 10-year data, the GMM with 3 components (yellow curve); for these two data sets the visual selection of the best model matches the selection by BIC. However, choosing the best GMM visually is only feasible for one-dimensional data as above. If the data had a higher dimension, it would be difficult to portray them graphically, making visual component selection difficult. In addition, visual selection of the optimal number of components cannot be justified because it is subjective.
FIGURE 5. Comparison of severity distributions using GMM (red curve) and Log-Normal (blue curve) for the 3-year, 5-year, and 10-year data
The red curves in Figure 5 show that, for the 3 groups of data used in this study, the pdf using GMM describes the research data better because it can estimate local areas of the density, while the log-normal pdf model cannot.
The final simulation calculates the EC values. As noted above, EC is obtained from the VaR of the aggregate distribution (formed from the severity distribution and the frequency distribution) at a confidence level of 99.9%. The aggregate distribution is calculated numerically using the Monte Carlo method. The purpose of this simulation is to determine how much the EC value produced by LDA using GMM differs from the EC produced by LDA using the Log-Normal. The numbers of samples used were 1, 10, 10², 10³, 10⁴, 10⁵, and 10⁶, and the simulation was performed 10 times for each sample size. The results of the simulation calculations are presented in Table 2 below:
TABLE 2. EC using GMM and Log-Normal for 10⁶ samples

Method     | EC, data 3 years | EC, data 5 years | EC, data 10 years
GMM        | 9,729,364.21     | 3,557,837.80     | 3,089,042.94
Log-Normal | 9,901,079.80     | 3,632,659.70     | 3,178,200.00
Table 2 shows that using GMM for the severity distribution in LDA gives a lower EC value than the Log-Normal: across the 3 groups of data, the EC value with GMM is about 2%-2.8% lower than the EC value with the Log-Normal.

3. Concluding Remarks
The result of this research is that the severity distribution estimated with GMM describes the data better than the known distribution model. The EC value of the LDA method using GMM is 2%-2.8% smaller than the EC value of LDA using the existing distribution model. Hence, if a bank uses this method, it can achieve capital efficiency.

References
[1] C. M. Bishop, Pattern Recognition and Machine Learning. Springer: New York,
(2006).
[2] G. Claeskens and N.L. Hjort, Model Selection and Model Averaging, Cambridge
University Press, (2011).
[3] A.P. Dempster, N.M. Laird, and D.B. Rubin, Maximum Likelihood from Incomplete
Data Via The EM Algorithm. Journal of The Royal Statistical Society, 39(1), (1977).
[4] A. Frachot, P. Georges, and T. Roncalli, Loss Distribution Approach for Operational
Risk, Working Paper, Groupe de Recherche Operationnelle: France, (2001).
[5] E.J. Jimene, M.J. Feria, and J.L. Martin, Economic Capital for Operational Risk:
Applying Loss Distribution Approach(LDA), Working Paper, Consejeria de
Innovacions: Andalucia, (2007).
[6] G.J. McLachlan and T. Krishnan, The EM Algorithm and its Extensions, Wiley,
(1997).
[7] G.J. McLachlan and D. Peel, Finite Mixture Model, Wiley, (2000).
[8] H.Z. Ming and S.C. Qian, An Approach VaR for capital markets with Gaussian
Mixture. Journal of Applied Mathematics and Computation, 168, pp. 1079-1085,
(2005).
[9] P.V. Shevchenko, Implementing Loss Distribution Approach for Operational Risk.
Applied Stochastic Models in Business and Industry, vol. 26(3), pp. 277-307, (2009).
Proceedings of IICMA 2015
Mathematics of Finance

The Calculation of Aggregation Economic Capital for Operational Risk Using A Clayton Copula

Nurfidah Dwitiyanti1,a), Zuletane Maska2,b), Hendri Murfi3,c), Siti Nurrohmah4,d)

1,2,3,4Department of Mathematics, Faculty of Mathematics and Natural Sciences,
Universitas Indonesia

a)nurfidah.dwitiyanti@sci.ui.ac.id
b)zuletane.maska@sci.ui.ac.id
c)snurrohmah@sci.ui.ac.id
d)hendri@ui.ac.id

Abstract. A bank is required to provide capital adequacy based on its risks. One of the bank's risks for which the capital requirement should be calculated is operational risk. Operational risks must be managed well because they may have a significant negative effect on a bank's reputation. The operational risk capital requirement is known as economic capital (EC). The value of EC is obtained from the Value at Risk (VaR) of the compound distribution at a confidence level of 99.9%. According to Basel II, a bank must calculate EC over all 56 risks, so aggregation is needed. In the calculation of the aggregate EC, the assumptions between risks usually used by banks are complete dependence or independence. However, the EC values from these assumptions are unrealistic, so a bank needs to consider the dependence structure that exists between risks when aggregating EC. In this paper, we use a Clayton copula for this aggregation. The Clayton copula is one type of the Archimedean family that has lower tail dependence; it is used to capture the dependencies of small losses. The purpose of this paper is to conduct a simulation of the calculation of the aggregate EC using a Clayton copula and then compare it to the EC value using a Frank copula. The results of the simulation show that the EC values using the Clayton copula are smaller than the EC values using the Frank copula.
Keywords and phrases: operational risk, economic capital, value at risk, Clayton copula.

1. Introduction
Bank Indonesia as the central bank plays an important role in creating a healthy banking system. Related to this role, Bank Indonesia implemented a policy of supervision and control of bank capital. The capitalization policy regulated by Bank Indonesia refers to international standards recommended by the Basel Committee on Banking Supervision in documents known as Basel I and II.
In the Basel documents, a bank is required to provide capital adequacy based on its risks. This was arranged by Bank Indonesia through Bank Indonesia Regulation No. 15/12/PBI/2013, in which Bank Indonesia requires each bank to provide a minimum capital requirement according to its risk profile. In this
regulation, the risk profile consists of credit risk, market risk and operational risk.
One of the bank's risks that must be managed well, because it may have a significant negative effect on a bank's reputation, is operational risk. The Basel II Accord defines operational risk as the risk of loss resulting from inadequate or failed internal processes, people or systems, or from external events. The supervision and control of operational risk has been regulated by the central bank through Bank Indonesia Regulation No. 11/25/PBI/2009 concerning the implementation of risk management for commercial banks. In implementing this risk management, a bank must calculate the allocation of the minimum capital requirement to cover losses due to the bank's risks, so that banks are expected to avoid losses or bankruptcy.
In the context of operational risk, the capital requirement is known as economic capital. Economic capital (EC) is the amount of capital a bank needs to maintain for the risk of unexpected losses over a certain period and at a certain confidence level [1]. EC can be viewed as Value at Risk (VaR) at a given confidence level [14]. VaR is defined as a risk measurement method used to determine the worst possible loss that may occur at a given confidence level and for a given time frame [2].
The Basel II Accord allows banks to choose one of three approaches adopted by the Basel Committee for calculating EC: the Basic Indicator Approach (BIA), the Standardized Approach (SA), and the Advanced Measurement Approach (AMA). Among the three approaches, banks generally use the AMA, in which banks are allowed to use their own internal models according to the eight business lines and seven event types defined by Basel II and relevant to the bank [2].
The most widely used AMA approach is the Loss Distribution Approach (LDA). LDA is used to calculate the EC value resulting from the aggregate loss for individual risks. The aggregate loss for an individual risk is formed by a combination of two distributions, namely the frequency distribution and the severity distribution, which are assumed to be independent [3].
According to Basel II, there are 56 risks overall, so EC needs to be aggregated over all of them. In this calculation, the assumptions between risks usually used by banks are complete dependence or independence. Calculating the aggregate EC assuming complete dependence is considered unrealistic and may overestimate the EC, because it assumes that the losses are dependent on each other and always occur at the same time [1].
For example, if a natural disaster caused a loss for the institution, then under this assumption the institution would at the same time also suffer losses because of internal fraud. This is considered unrealistic because such incidents do not always occur simultaneously. Meanwhile, the assumption that risks are independent cannot be accepted by the regulator or for a fair quantification of EC [12]. So the calculation of the aggregate EC needs to consider the dependence structure between risks, because this has an impact on the EC values.
In this paper, we use copulas for dependence modeling between risks. In Latin, the word copula means a link or binder. Mathematically, a copula is a function that combines univariate marginal distributions to obtain a joint distribution with a specific dependency structure [16].
There are two commonly used copula families, namely the Elliptical family and the Archimedean family. According to Momen, Kimiagari, and Noorbakhsh [10], Lu [7], and Li et al. [6], Elliptical copulas, such as the Gaussian copula and the Student t copula, are considered good enough for determining the influence of the dependence structure in the calculation of EC. However, Elliptical copulas have limitations for heavy-tailed risk data distributions because they have symmetric tail dependence [15], while the Archimedean family can capture different dependence structures, as its members are distinguished by their tail behavior. Applications of the Archimedean family to financial risk data and climatology data can be found in Mahfoud [8] and Pradier [12].
This paper discusses the application of the Archimedean family for calculating EC on operational risk data. The type of Archimedean copula used in this paper is the Clayton copula, first introduced by Clayton [18]. The Clayton copula is widely used to capture the dependence structure of the lower tail of correlated risks [8]. According to Venter [17], the Clayton copula gives priority to capturing the dependency of small losses. This paper also compares the EC using another copula from the same Archimedean family, namely the Frank copula, which does not show dependence structure at either the upper or the lower tail.

2. Main Results
2.1. Clayton Copula
The Clayton copula is one type of the Archimedean family. It is used to capture the dependence structure of the lower tail of correlated risks [8] and was introduced by Clayton [18]. The Clayton copula for the bivariate case is given by:

C(u, v) = (u^{-\theta} + v^{-\theta} - 1)^{-1/\theta}, \quad \theta > 0 \qquad (1)

with generator function \phi(t) = \frac{t^{-\theta} - 1}{\theta}.

Genest and Rivest [5] showed that the parameter of a bivariate Archimedean copula can be built from the Kendall's τ correlation and the generator function \phi(t) = \frac{t^{-\theta} - 1}{\theta}. The estimate of the Clayton copula parameter θ is therefore:

\theta = \frac{2\tau}{1 - \tau}. \qquad (2)
The relation between the copula parameter and the lower tail dependence coefficient is given by:

\lambda_L = 2^{-1/\theta}. \qquad (3)

Because the Clayton copula captures lower tail dependence, the coefficient of upper tail dependence is λ_U = 0.
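A small sketch of equations (2) and (3) in Python, estimating θ from an empirical Kendall's τ; the paired loss samples here are hypothetical:

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(1)
# Hypothetical paired annual losses for two risk types.
l1 = rng.lognormal(10, 1, 500)
l2 = 0.5 * l1 + rng.lognormal(10, 1, 500)

tau, _ = kendalltau(l1, l2)       # Kendall's tau rank correlation
theta = 2 * tau / (1 - tau)       # equation (2): Clayton parameter
lambda_L = 2 ** (-1 / theta)      # equation (3): lower tail dependence
print(theta, lambda_L)
```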

An illustration of the Clayton copula can be seen in Figure 1, which shows points simulated from a Clayton copula. In the figure, there is a strong correlation in the lower left corner, indicating lower tail dependence.

[Source: Panjer [11], Operational risk modeling analytics, pp. 243]

FIGURE 1. Clayton copula (θ = 3)

2.2. Value at Risk

Intuitively, VaR is defined as a risk measurement method used to determine the worst possible loss that may occur at a given confidence level and for a given time frame [2]. Mathematically, given some confidence level α ∈ (0, 1), the Value at Risk (VaR) at confidence level α is given by the smallest number l such that the probability that the loss L exceeds l is not larger than (1 − α) [9]:

VaR_\alpha(L) = \inf\{l \in \mathbb{R} : P(L > l) \le 1 - \alpha\}. \qquad (4)

VaR is also known as a quantile of the loss distribution.
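For a Monte Carlo sample of losses, this definition reduces to an empirical quantile; a short sketch with a hypothetical loss sample:

```python
import numpy as np

# Hypothetical simulated annual losses L.
losses = np.random.default_rng(3).lognormal(16, 0.5, 100_000)
var_999 = np.quantile(losses, 0.999)   # VaR at the 99.9% confidence level
print(var_999)
```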

2.3. Loss Distribution Approach (LDA)

The Basel Committee has defined LDA as an estimate of the distribution of operational risk losses for each business line and event type, based on assumptions about the frequency and severity of events [1].
In modeling losses with LDA, the first step a bank performs is to estimate two separate distributions for the frequency and severity of operational losses for each business line and event type combination. The frequency distribution describes the occurrence of the operational losses of a bank; it is discrete and generally follows a Poisson or Negative Binomial distribution [10]. The severity distribution describes the amount of losses caused by loss events; there are many distributions that can represent it, such as the Lognormal, Weibull, and Gamma distributions. Furthermore, both distributions are compounded to obtain the aggregate loss distribution through Monte Carlo simulation. An illustration of the LDA approach is shown in Figure 2.

FIGURE 2. The Aggregate Loss Distribution

Mathematically, the annual loss distribution for the individual risk r is given by:

L_r = \sum_{i=1}^{N_r} X_{r,i} \qquad (5)

where:
N_r: the number of loss events caused by risk r over a year;
X_r: the amount of loss per event of risk r.
The assumptions for equation (5) are:
1. N_r and (X_{r,1}, X_{r,2}, ...) are independent random variables.
2. X_{r,1}, X_{r,2}, ... is a set of independent and identically distributed random variables.

2.4. Algorithm of the Calculation of Economic Capital using Copula

This section explains the algorithm for calculating economic capital using a copula. The algorithm consists of two steps: the analysis begins with the aggregation of each risk using LDA, and the next step uses the copula for calculating EC. The procedures are the following:
a. The aggregation of each risk with LDA
1) Determine the frequency distribution of N.
2) Determine the severity distribution of X.
3) Take N random numbers from the frequency distribution, e.g. the numbers taken are f_i, i = 1, ..., N.
4) Randomly draw f_i losses, for i = 1, 2, ..., N, from the severity distribution, for example X_1, X_2, ..., X_{f_i}, with X_1, X_2, ..., X_{f_i} independent.
5) The compound distribution can be constructed from the total losses L_r, namely L_r = \sum_{j=1}^{f_i} X_{r,j}, i = 1, 2, ..., N.

b. The calculation of economic capital using a copula (a sketch of steps 4 to 6 in code is given after this list)
1) Take a random sample of b numbers from each compound distribution, namely (L_1, L_2)_i, i = 1, ..., b.
2) Transform (L_1, L_2)_i to (u_1, u_2)_i, i = 1, ..., b, using the cdf (cumulative distribution function), so that u_r, r = 1, 2, ranges between 0 and 1.
3) Determine the parameter θ of the chosen copula from (u_1, u_2)_i, i = 1, ..., b.
Steps 4 to 6 use the Monte Carlo method.
4) Build c new samples from the copula, (v_1, v_2)_i, i = 1, ..., c.
5) Transform (v_1, v_2)_i to (L_1, L_2)_i, i = 1, ..., c, using the inverse cdf in accordance with the original loss distribution.
6) Then, to get the estimate of the EC value, sort (L_1 + L_2)_i, i = 1, ..., c, from the smallest to the largest value and determine the 99.9% percentile.
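The following is a minimal sketch of steps 4 to 6 under stated assumptions: the Clayton samples are drawn with the Marshall-Olkin method (one standard way to simulate Archimedean copulas, not necessarily the implementation used here), and the inverse cdfs of step 5 are approximated by empirical quantiles of hypothetical compound-loss samples L1 and L2:

```python
import numpy as np

rng = np.random.default_rng(7)

def sample_clayton(theta, n):
    """Marshall-Olkin: V ~ Gamma(1/theta, 1), U_i = (1 + E_i/V)^(-1/theta)."""
    V = rng.gamma(shape=1.0 / theta, scale=1.0, size=n)
    E = rng.exponential(size=(n, 2))
    return (1.0 + E / V[:, None]) ** (-1.0 / theta)

# Hypothetical compound-loss samples for two risks (output of part a).
L1 = rng.lognormal(16, 0.4, 200)
L2 = rng.lognormal(16, 0.5, 200)

theta = 2.0                                  # copula parameter from step 3
U = sample_clayton(theta, 100_000)           # step 4: samples from the copula
# Step 5: inverse cdf via empirical quantiles of the original samples.
sims = np.quantile(L1, U[:, 0]) + np.quantile(L2, U[:, 1])
EC = np.quantile(sims, 0.999)                # step 6: 99.9% percentile
print(EC)
```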

2.5. Results
The EC value for the two risk data sets using the Clayton copula is obtained by the method specified in the algorithm above. In this simulation, samples of 200 numbers were taken from each compound distribution for calculating the aggregate EC; these 200 samples were then used to build the copula through the Monte Carlo method.
The samples constructed from the copula and transformed back into the original distribution were sorted from the smallest to the largest value, and the 99.9% percentile was determined as the estimated EC value for the following year. This simulation was executed 10 times, with sample sizes of 10⁰, 10¹, 10², 10³, 10⁴, 10⁵, and 10⁶. The result of the simulation is presented in Figure 3.
FIGURE 3. The values of EC using a Clayton copula
Figure 3 illustrates that when the number of samples is 10⁵, the EC values using the Clayton copula begin to converge to one value. This is consistent with the statement of Frachot, Georges, and Roncalli [3] that the Monte Carlo simulation method will converge to one value for sample sizes above 100,000. The same method is used to aggregate EC using the Frank copula; the results of that simulation are presented in Figure 4.

FIGURE 4. The values of EC using a Frank copula


Based on Figure 4, with 200 copula-forming samples and varying simulation sample sizes, the EC values using the Frank copula start to converge at a sample size of 10⁵. The details of the EC values for 10⁵ and 10⁶ samples, using either the Clayton or the Frank copula, can be seen in Table 1.
TABLE 1. The results of EC values using Clayton and Frank copula

Type of Copula | Value of EC (000 Rp) | Number of Samples | Standard Deviation (000 Rp)
Clayton        | 14,875,493.169       | 10⁵               | 32,036.944
Clayton        | 14,882,810.634       | 10⁶               | 10,893.068
Frank          | 14,904,997.101       | 10⁵               | 23,095.367
Frank          | 14,914,849.292       | 10⁶               | 10,337.822

Table 1 shows that the EC values generated from each copula converge to a single value when the number of samples is 10⁶; this is clearly seen from the standard deviation for 10⁶ being smaller than that for 10⁵. In addition, Table 1 also shows that the EC value using the Clayton copula is 0.21% smaller than the EC value using the Frank copula.

3. Concluding Remarks
Based on the simulation conducted on the two types of risk of the toy data, using 200 copula-forming samples, it can be concluded that the EC values using either a Clayton copula or a Frank copula converge to one value when the number of samples is 10⁶, as clearly shown by the standard deviation for 10⁶ being smaller than that for 10⁵. The EC value using the Clayton copula is 0.21% smaller than the EC value using the Frank copula.

References
[1] C. Alexander, Operational risk : Regulation, analysis and management,
London, Prentice Hall Financial Times, (2003).
[2] A. S. Chernobai, S. T. Rachev, and F. J. Fabozzi, Operational risk: A guide to
basel II capital requirements, models, and analysis, New Jersey, John Willey
& Son, (2007).
[3] A. Frachot, P. Georges, and T. Roncalli, Loss distribution approach for
operational risk, France, Groupe de Recherche Operationnelle, Credit
Lyonnais, (2001).
[4] C. Genest and J. Mackay, The joy of copulas: Bivariate distribution with
uniform marginals, The American Statistician, 40(4), 280-283, (1986).
[5] C. Genest and L. P. Rivest, Statistical inference procedures for Bivariate
Archimedean copulas, Journal of the American Statistical Association, 88,
423, 1034-1043, (1993).
[6] J. Li, X. Zhu, J. Chen, L. Gao, J. Feng, D. Wu, and X. Sun, Operational risk
aggregation across business lines based on frequency dependence and loss
dependence, Hindawi Publishing Corporation, 8, (2014).
[7] Z. Lu, Modeling the yearly Value-at-Risk for operational risk in Chinese
commercial banks, Mathematics And Computers In Simulation, 82, 604-616,
(2011).
[8] M. Mahfoud, Bivariate archimedean copulas: an application to two stock
market indices, Amsterdam, Vrije Universiteit Amsterdam, (2012).
[9] A. McNeil, R. Frey, and P. Embrechts, Quantitative risk management:
Concepts, techniques, and tools, New Jersey, Princeton University Press,
(2005).
[10] O. Momen, A. Kimiagari, and E. Noorbakhsh, Modelling the operational risk
in Iranian commercial banks: Case study of a private bank, Journal of
Industrial Engineering International (8:15), (2012).
[11] H. H. Panjer, Operational risk modeling analytics, New Jersey, John Wiley
and Sons, (2006).
[12] E. Pradier, Copula theory: An application to risk modeling, Ensimag:
Research Project Report, Grenoble INP, (2011).
[13] T. Schmidt, Coping with copulas, In Copulas-from theory to applications in
finance, Germany, Department of Mathematics, University of Leipzig, (2006).
[14] J.Shim, S. H. Lee, and R. MacMinn, Measuring economic capital : Value at
risk, expected tail loss and copula approach, Illinois Wesleyan University,
(2009).
[15] Q. J. Tang and G. H. Sun, Measuring dependence risks of funds with copula in
China, Applied Mathematics,5, 1863-1869, (2014).
[16] Y. K. Tse, Nonlife actuarial models: Theory, methods, and evaluation,
Cambridge, Cambridge University Press, (2009).
[17] G. G. Venter, Tails of copulas, In Proceedings ASTIN Washington, (pp. 68-
113), USA, (2001).
[18] D. Clayton, A model for association in bivariate life tables and its application
in epidemiological studies of family tendency in chronic disease incidence,
Biometrika, 65, 141-151, (1978).
Proceedings of IICMA 2015
Mathematics Education

Increasing The Students' Mathematical Communication Skill for Junior High School by Applying REACT (Relating, Experiencing, Applying, Cooperating, and Transferring) Learning Strategy

Della Afrilionita1,a)

1Senior High School Islam Al-Azhar

a)dellaafrilionita@gmail.com

Abstract. According to the observation and the pre-test result in Grade 7 of 99 Jakarta Junior High School, the students' mathematical communication skill is relatively low. The REACT learning strategy can be one alternative for mathematics learning in the classroom to increase that skill. The REACT learning strategy consists of five steps, namely relating, experiencing, applying, cooperating, and transferring, each of which can increase mathematical communication skill. The objective of this research is to increase the students' mathematical communication skill in grade 7 at 99 Jakarta Junior High School. This is a classroom action research conducted in three cycles. Every cycle has four steps: planning, practicing, observation, and reflection. There is a post-test at the end of every cycle to measure the students' mathematical communication skill. This research was conducted from September until October 2014. The result shows that the mathematics learning process through the REACT learning strategy increases the students' mathematical communication skill, as shown by the gain in the average test score.
Keywords and Phrases: Mathematical Communication, REACT Learning Strategy, Classroom Action Research.

1. Introduction
Education is an important aspect in improving the quality of a nation. The quality of education will certainly be able to transform students into individuals who have the required competence. To produce a quality nation, it is of course necessary to increase the quality of education in various fields, including mathematics. Mathematics is a science that has a big role in everyday life. Mathematics as a branch of educational science can be used as a barometer to measure students' abilities, for example logic skills, optimization skills, prediction skills, analytical skills, and many others. Mathematics is used in the branches of science as a medium for proofs, calculations, estimations, and others. More or less, the level of mastery of mathematics will determine the quality of educational outcomes for learners. Students' (learners') mastery of mathematics in
Indonesia can be seen in the report of the Trends in International Mathematics and Science Study (TIMSS), which aims to measure the achievements in mathematics and science of students in the participating countries, including Indonesia. The TIMSS study results show that Indonesia's position is far below the surrounding countries: in 2011, Singapore was ranked second, Malaysia 26th, and Thailand 28th, while Indonesia was 42nd. The low level of students' mathematical communication skills is demonstrated in Rohaeti's study, in which the average score of students' mathematical communication skills is in the low qualification [11]; meanwhile, Purniati states that students' responses to mathematical communication problems are generally low [10].
A further test was conducted in the preliminary research in August 2014 with grade 7 students at 99 Jakarta Junior High School. The material was tested in five essay questions, which required students to write the steps of solving each problem completely, in order to measure the students' mathematical communication skills accurately. From the 36 students who took the test, the average score obtained was 39.44. The result of the preliminary research can be seen in the following table:
TABLE 1. Preliminary Test Score of Students' Mathematical Communication

Score of Mathematical Communication | Criteria  | Number of Students | Percentage of the Number of Students (%)
90 < Score ≤ 100                    | Very Good | 0                  | 0
80 < Score ≤ 90                     | Very Good | 0                  | 0
70 < Score ≤ 80                     | Good      | 1                  | 2.78
60 < Score ≤ 70                     | Good      | 3                  | 8.33
50 < Score ≤ 60                     | Fair      | 3                  | 8.33
40 < Score ≤ 50                     | Fair      | 7                  | 19.44
30 < Score ≤ 40                     | Low       | 11                 | 30.56
20 ≤ Score ≤ 30                     | Low       | 6                  | 16.67
10 ≤ Score ≤ 20                     | Very Low  | 3                  | 8.33
0 ≤ Score ≤ 10                      | Very Low  | 2                  | 5.56
Sum                                 |           | 36                 | 100

The average score of the test of mathematical communication skills acquired by students of grade 7 at 99 Jakarta Junior High School during the preliminary study is 39.44, with a highest score of 75 and a lowest score of 10. Based on the table above, no student has mathematical communication skills in the very good category, while almost half of the students in the class have mathematical communication skills in the low category and 5 students are in the very low category. From these data it can be seen that the level of students' mathematical communication skills is relatively low.
Based on observations and interviews in the preliminary research, it is known that several factors contribute to the students' difficulty in solving the test questions, including difficulties in translating or interpreting the language of the problems; difficulties in using common variables to answer the questions; and difficulties in mastering basic algebra, shown by mistakes in changing the form of equations and algebraic errors in calculation. Another factor found in this preliminary research is that the mathematics learning strategies used motivated students too little to interact with each other and to use their ability to communicate a problem and its solution.
This fact certainly makes teachers feel the need to improve the factors that support the achievement of learning goals, one of which is to improve the learning strategy. Selecting a strategy that is appropriate and suitable for the students' characteristics will help them to receive the lessons and apply them in daily life situations. The REACT learning strategy (Relating, Experiencing, Applying, Cooperating, and Transferring) is a learning strategy that involves five aspects: linking the concept to be learned with something the student already knows (relating); hands-on activities and teacher explanation that allow students to discover new knowledge (experiencing); application of the knowledge to real-world situations (applying); providing the opportunity for students to learn through collaboration (cooperating); and providing the opportunity for students to transfer knowledge into new contexts (transferring). In this research, the REACT learning strategy is expected to be an appropriate alternative as an effort to improve the mathematical communication skills of grade 7 students at 99 Jakarta Junior High School.

Literature Review
The REACT learning strategy was first developed in the United States. The REACT strategy is a learning strategy based on contextual learning (a learning system based on the idea that the brain acts to create patterns that have meaning). The REACT strategy contains five essential forms of learning: Relating, Experiencing, Applying, Cooperating, and Transferring.
Relating. Relating is learning in the context of life experiences. In this step, the teacher uses relating as a tool by presenting situations completely familiar to the student and extracting new concepts or developing deeper understanding of concepts from those situations. Experiencing. Experiential learning takes events and learning out of abstract thought and brings them into concrete exploration, discovery, and invention. In this step, teachers can conduct hands-on learning experiences. Applying. Applying is learning concepts and information in a useful situation. Students apply a concept when they can bring their real-world experience to their problem-solving activities. Teachers can motivate students by making problems realistic and relevant to students' lives. Cooperating. Cooperative learning is based on the belief that learning is inherently social. Cooperating is learning in the context of sharing, responding, and communicating with other learners. Cooperative learning has a positive effect on students' achievement, interpersonal relationships, and communication skills. It also improves students' attitudes toward the opposite gender and toward other racial and ethnic groups. In the cooperating step, students can improve their oral communication skills.

Transferring. Transferring is using knowledge, existing or newly acquired, in a new context or situation. This step is learning in the context of existing knowledge; transferring uses and builds upon what the student already knows. Learning mathematics using the REACT strategy is expected to change the way students learn into meaningful learning. Students learn to find the concepts of the material being studied on their own and are expected to understand the mathematical concepts and be able to communicate the solution of a mathematical problem [8].
The Ministry of National Education states that mathematical communication skill is the students' ability to express and interpret mathematical ideas, whether orally, in writing, by drawing, or by demonstrating mathematical problems. According to Utari Sumarmo, as cited by Gusni Satriawati, mathematical communication skill is an ability that encompasses a variety of opportunities to communicate in a variety of forms, as follows:
a. Reflecting real objects, drawings, and diagrams into mathematical ideas.
b. Creating a model of a situation or problem using oral, written, concrete, and graphical forms.
c. Stating a daily occurrence in mathematical language or symbols.
d. Listening, discussing, and writing about mathematics.
e. Reading a written mathematical presentation with understanding.
f. Making a conjecture, an argument, a definition, and a generalization.
g. Explaining and raising questions about the mathematics that has been studied.
Meanwhile, according to Sullivan & Mosley as cited in Rofiah, mathematical communication is not merely expressing ideas through writing but, more broadly, the students' ability in terms of talking, explaining, describing, listening, inquiring, clarifying, collaborating (sharing), writing, and finally reporting on what has been learned.
Based on the definitions of mathematical communication skill described above, it can be said that mathematical communication skill is the ability of students to express their understanding of mathematics, including the ability to state, demonstrate, and interpret mathematical ideas from a contextual problem given as a description into mathematical models (pictures, graphs, charts, tables, and equations), or vice versa, either orally or in writing.

Methodology
This research on students' mathematical communication skill was conducted to explore how the implementation of the REACT learning strategy can encourage grade 7 students at 99 Jakarta Junior High School and to find out the effectiveness of REACT (Relating, Experiencing, Applying, Cooperating, and Transferring) in increasing their mathematical communication skill. The method used to conduct the research is classroom action research.
Classroom Action Research (CAR) is a form of reflective activity intended to deepen the understanding of the actions taken during the learning process, to improve the weaknesses that still occur in the learning process, and to realize the objectives of the learning process. The procedure of classroom action research is carried out systematically and involves repeated reflection in each cycle. Each cycle consists of four activities: planning, implementation, observation, and reflection. If a cycle does not produce the expected change, the cycle is repeated with improvements until the data obtained from the various sources related to the focus of the research are relatively similar/repetitive. In other words, the research is terminated once the data obtained are saturated/stable.

2. Main Result
The data obtained in this research show an increase in the mathematical communication skill scores from the preliminary research to the 3rd cycle. The increase can be seen in the score distribution below (Figure 1).

Percentage of students in each score range:

Score range           0-10   11-20  21-30  31-40  41-50  51-60  61-70  71-80  81-90  91-100
Pre-test              5.56   8.33   16.67  30.56  19.44  8.33   8.33   2.78   0      0
Post-test 1st Cycle   0      0      0      2.78   11.11  13.89  36.11  33.33  2.78   0
Post-test 2nd Cycle   0      0      0      2.78   8.33   5.56   33.33  22.22  19.44  8.33
Post-test 3rd Cycle   0      0      0      2.78   0      8.33   11.11  25     38.89  13.89

FIGURE 1. The Increase of Students' Mathematical Communication Skill from the Preliminary Research to the 3rd Cycle
Figure 2 below groups the results of the mathematical communication skill test into five categories (very good, good, fair, low, and very low) from the preliminary research to the 3rd cycle.

FIGURE 2. The Qualification of The Score of Mathematical Communication Test

Based on the findings in this research, it is known that the students' mathematical communication skills increased through the use of the REACT learning strategy. By enhancing their mathematical communication ability, students are also expected to improve their ability in other mathematical aspects. In the REACT learning strategy, students are required to become the center of learning through the five steps: relating, experiencing, applying, cooperating, and transferring.
In Relating, students observe and link the material being studied with the everyday life that is close to them. Here, the teacher can help students construct their knowledge by asking questions related to the material, or by explaining phenomena that occur in everyday life. Students were enthusiastic to know the relation of the material being studied to their own lives, as students assume that by knowing the benefits of the material for everyday life, the lessons become easier and can be imagined. Students began to ask questions more frequently or simply express their opinions, which stimulates students' oral mathematical communication skills.
In Experiencing, students learn by experience. Students work on an activity to acquire new knowledge. Through their own experience, students grasp an understanding or a formula quickly and do not easily forget it. In this phase, students practiced absorbing the mathematical ideas given. Students were enthusiastic because at this step they did not work individually but together with their team. Through the discussion, students shared ideas and taught each other in completing the activity, so these group discussions helped students communicate with other learners (Cooperating).
The Applying step helped students to write their answers communicatively. Students practiced solving some contextual problems; at this step, the students' mathematical communication skills were honed. Furthermore, after solving the problems, students transferred their knowledge into new situations. In this case, the students gave a presentation so that they could deliver their knowledge to other students. While students were presenting, teachers instructed the students from other groups to ask questions or simply respond to the presentation.
The application of the REACT strategy stimulated students to communicate continually, both orally and in writing, so students had the opportunity to improve their mathematical communication skills through this learning strategy. The description above shows that the use of the REACT learning strategy is a factor that increases students' mathematical communication skills in mathematics. Therefore, it can be concluded that the REACT learning strategy can increase students' mathematical communication ability.

References
[1] R. Aisyah, Peningkatan Kemampuan Berpikir Kreatif Siswa SMP Melalui
Pembelajaran Matematika dengan Strategi Relating, Experiencing,
Applying, Cooperating, Transferring (REACT) (The Increament of Creative
Thinking Ability of Junior High School Students by Mathematics Learning
Using REACT Learning Strategy), Bachelor Thesis, Bandung: Universitas
Pendidikan Indonesia, (2013)

[2] Armiati, Peningkatan Kemampuan Penalaran Matematis, Komunikasi
Matematis, dan Kecerdasan Emosional Mahasiswa melalui Pembelajaran
Berbasis Masalah (The Increament of Mathematical Reasoning Skill,
Mathematical Communication Skill, and Emotional Intelligence of Students
by Applying The Problem Based Learning Method), PhD Thesis. Bandung:
Universitas Pendidikan Indonesia, (2011)
[3] CORD, REACT (learning strategy), Texas: CORD Communication Inc,
(1999)
[4] CORD, Teaching Mathematics Contextually: The Cornerstone of Tech Prep,
Waco, Texas: CORD Communication, Inc, (1999)

[5] Cotton, K., Mathematical Communication, Conceptual Understanding, and Students' Attitudes Toward Mathematics, Journal, Nebraska: University of Nebraska-Lincoln, (2008)
[6] J. R. Fraenkel, How to Design and Evaluate Research in Education 7th
Edition, New York: The McGraw-Hill Companies, Inc, (2009)
[7] Huggins, Communication in Mathematics, Canada: St. Xavier University,
(1999)
[8] M.L. Crawford, Teaching Contextually: Research, Rationale, and Techniques for Improving Student Motivation and Achievement in Mathematics and Science, [ONLINE] Available at: http://www.cord.org, (2001)

[9] OECD, Mathematics Teaching and Learning Strategies in PISA: Programme for International Student Assessment, USA: OECD Publications, (2010)
[10] Purniati, Pembelajaran Matematika Geometri Berdasarkan Tahap-tahap Awal Van Hiele dalam Upaya Meningkatkan Kemampuan Komunikasi siswa SLTP (Mathematics Learning of Geometry Based on Van Hiele Early Phases in order to Increase the Mathematical Communication Skill of Junior High School Students), Master Thesis, Bandung: Universitas Pendidikan Indonesia, (2003)

[11] Rohaeti, Pembelajaran dengan Metode Improve untuk Meningkatkan Pemahaman dan Kemampuan Komunikasi Matematik Siswa SLTP (Improve Learning Method for Increasing the Mathematical Understanding and Mathematical Communication Skill of Junior High School Students), Master Thesis, Bandung: Universitas Pendidikan Indonesia, (2003)

Proceedings of IICMA 2015
Mathematics Education

Investigating Students' Spatial Visualization in The Properties of Solid Figure by Using Daily Life Context
Lestariningsih1,a)
1
Doctoral Student of Surabaya State University (Unesa), Gedung K9 Kampus Unesa
Ketintang Surabaya

a) lestari.med@gmail.com

Abstract. The aim of this research is to investigate students' spatial visualization in the properties of solid figures by using daily life contexts. Pendidikan Matematika Realistik Indonesia (PMRI) is chosen as the approach because the researcher used context in the learning process, which is in line with the PMRI characteristics [1]. This research uses the descriptive research method. Data are collected using students' worksheets, video recordings, and interviews with some students to obtain deeper information about their spatial visualization. Based on the findings and the data obtained in this research, it can be concluded that students can reason about solid figure properties if they often see these objects in daily life activities, such as objects that have a cube, cuboid, or prism form. However, they find it difficult to sketch objects that combine more than one kind of solid figure; this difficulty can be resolved through classroom discussion.

1. Introduction
Geometry is one of the oldest mathematical sciences. Geometry is the visual study of shapes, sizes, patterns, and positions. It has occurred in all cultures, through at least one of five strands of human activity, namely building/structures, machines/motion, navigating/star-gazing, art/patterns, and measurement [2]. Considering the importance of geometry in human activities, geometry has been taught in Indonesia from elementary school through junior and senior high school [3]. Furthermore, students can also study geometry in college or university.
In Indonesian junior high schools, one of the important student abilities is the capability to use spatial visualization. Students in junior high school are expected to be able to imagine abstract objects. To support this, the Indonesian mathematics curriculum for junior high school devotes 62% of its material to geometry [4]. In the second semester of grade VIII, there is a geometry topic, namely solid figures. The competence that must be achieved by students is understanding the properties of the cube, cuboid, prism, and pyramid, their elements and their sizes, which consists of three categories: identifying the characteristics of the cube, cuboid, prism, and pyramid; making the nets of the cube, cuboid, prism, and pyramid; and calculating the surface area and volume of the cube, cuboid, prism, and pyramid.

However, based on my observation, the topic of solid figures, especially the properties of the cube, cuboid, prism, and pyramid, their elements and their sizes, is given directly in the classroom. Teachers teach this material by directly giving students the table of properties of the cube, cuboid, prism, and pyramid, their elements and their sizes. Meanwhile, students simply read and pay attention to the explanation from their teacher without asking any questions or solving challenging problems. This is in line with the result of Safrina et al.'s research [5], which stated that students find difficulty in understanding geometry because the learning strategy used is not appropriate to the material. This kind of learning makes students feel bored and does not develop their spatial visualization.
Spatial visualization of solid figures is an extremely important skill in many fields involving science, technology, engineering, and mathematics, including the geosciences [6]. Spatial visualization is a complex process that involves both visual abilities and the formation of mental images [7]. Because of the importance of spatial visualization across many disciplines, it has been studied by a wide variety of workers in science, education, and cognitive psychology.
Considering the aforementioned issues, the aim of this research is to
investigate students’ spatial visualization in the properties of solid figure by using
daily life context.
To support this research, the PMRI (Pendidikan Matematika Realistik Indonesia) approach is chosen. PMRI is the Indonesian version of RME (Realistic Mathematics Education), which has been developed by the Freudenthal Institute since the 1970s. Treffers, in Wijaya (2008), states that there are five characteristics of RME, which in Indonesia become the PMRI characteristics, namely: 1) phenomenological exploration or the use of contexts, 2) using models and symbols for progressive mathematization, 3) using students' own constructions and productions, 4) interactivity, and 5) intertwining.
In PMRI, students can learn mathematics using contexts that can be found easily in their environment. Moreover, anything that can be imagined by students can also be used as a context for learning mathematics. This research uses many things that students usually encounter in their daily life.
This research uses the descriptive research method. The descriptive research method describes the status of a group of people, an object, a condition, a system of thought, or a class of events in the present [8]. The purpose of this method is to produce a systematic, factual, and accurate description of the facts, the characteristics, and the relationships among the phenomena investigated.

2. Main Results
This research was conducted in the eighth grade of SMPN (Sekolah Menengah Pertama Negeri) I Palembang, a junior high school located at Pangeran Aria Kesuma Abdurrahim, Bukit Kecil, Talang Semut, Palembang. As an apperception, students were asked about the nets of the cube and cuboid that had been studied before. One student mentioned the requirement for a net to be able to form a cube or cuboid.

Furthermore, students were given three problems: drawing solid figures and giving their names from a given object; determining the number of faces, edges, and vertices of the object in the first problem; and determining the relationship between the number of faces, edges, and vertices, which refers to Euler's formula. Students were asked to solve these problems related to solid figures, as shown in Figure 1.

FIGURE 1. Students Solve Problems

After solving their problems, students presented their work in front of the classroom, as shown in Figure 2. Some students gave their responses, and other students who had different results could also present their work. One student explained the solution of the third problem because many students could not solve it. Her explanation referred to Euler's formula. Moreover, students drew a conclusion from their learning process: the total number of vertices and faces is always two more than the number of edges.
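In standard notation (added here for clarity; the symbolic form was not part of the students' worksheet), this observation is Euler's formula for polyhedra: if V, F, and E denote the numbers of vertices, faces, and edges, then

    V + F = E + 2.

For example, a cube has V = 8, F = 6, and E = 12, and indeed 8 + 6 = 12 + 2.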

FIGURE 2. Students Present Their Work

Analysis
Students easily classified and sketched some solid figures from objects. They could sketch an object that has a cube shape and name it well. Also, for an object that has a pyramid shape, they could sketch it easily. Furthermore, for the question that shows a picture of some glasses, students clearly sketched the picture as a cylinder. The same happened for the question that shows an object with a cuboid shape.

However, some students found it difficult to make a sketch from the picture of a ring case that has a prism shape. Besides, they were also unsure whether the name they had written for this object was correct, because they seldom meet such objects in mathematics learning material, even though this kind of problem is part of the learning material on solid figures.
Moreover, many students could not solve the first problem, especially numbers 6 and 8. For problem number 6, the picture of a pencil, some students thought that this object has a cylinder shape, but they doubted their answer because the end point is sharp, as shown in Figure 3. Some students argued that it has a cone shape, yet the solid also has a cylinder part. This conflict led to a classroom discussion, and the students concluded that the pencil combines two solid figures, namely a cylinder and a cone.

FIGURE 3. Picture of first problem number 6


Students found difficulty in the first problem number 8, which shows a picture of a slice of watermelon (Figure 4). There were many kinds of student answers because they had different ideas about this object. Some students thought that it was half of a watermelon and sketched half of a sphere. Another student had the idea that he must sketch a quarter of a sphere, since it is a quarter of a watermelon. Other students concluded that the object is one eighth of a watermelon, so they sketched one eighth of a sphere. Of course, these different ideas among the students led to a discussion, and as a result they could come to the right answer.

FIGURE 4. Picture of first problem number 8

Some students were also confused about counting the number of faces, edges, and vertices of the pencil and the slice of watermelon (Figure 4). One of them stated that the pencil has one vertex, three faces, and two edges, and that the slice of watermelon consists of two vertices, three faces, and three edges. Moreover, most students found it difficult to determine the relationship between the numbers of vertices, faces, and edges. However, there was a student who could find it and explained it in front of the classroom. She said that we can determine the relationship between them using Euler's Formula.
3. Concluding Remarks
Based on the findings and the data obtained in this research, we can conclude that students can reason about solid figures easily if they often see these objects in daily life activities, such as objects that have a cube, cuboid, or prism form. However, they found it difficult to sketch objects that combine more than one kind of solid figure, for instance a pencil, or a slice of watermelon that does not have a full sphere form. Moreover, they could come to know the relationship between the number of faces, vertices, and edges using Euler's Formula.

References
[1] Lestariningsih, R.I.I. Putri, and Darmawijoyo, The Legend of Kemaro Island
for Supporting Students in Learning Average, IndoMS Journal on
Mathematics Education, Volume 3(2): 165-174 , (2012)
[2] Cornell University, What Is Geometry?, Retrieved from
http://www.math.cornell.edu/~mec/What_is_Geometry.pdf , (2015)
[3] Kemdikbud, Kompetensi Dasar Sekolah Menengah Pertama (SMP)/Madrasah
Tsanawiyah (MTs), Retrieved from http://www.pendidikan-
diy.go.id/file/mendiknas/kurikulum-2013-kompetensi-dasar-smp-ver-3-3-
2013.pdf, (2013)
[4] F.N. Ngaini, B.P. Darminto, and W. Ika, Studi Komparasi Model Pembelajaran
Matematika Tipe Jigsaw Dan Tipe STAD Pada Siswa Kelas VIII, Retrieved from
http://download.portalgaruda.org/article.php?article=129026&val=612, (2013)
[5] K. Safrina, M. Ikhsan, and A. Ahmad, Peningkatan Kemampuan Pemecahan
Masalah Geometri melalui Pembelajaran Kooperatif Berbasis Teori Van
Hiele, Jurnal Didaktik Matematika, 1(1): 9- 20, (2014)

[6] S. Titus and E. Horsman, Characterizing and Improving Spatial Visualization Skills, Journal of Geoscience Education, 57(4): 242-254, (2009)
[7] J.H. Mathewson, Visual-spatial thinking: an aspect of science overlooked by
educators, Science and Education, 83: 33-54, (1999)
[8] Sudjana and Ibrahim, Penelitian dan Penilaian Pendidikan, Bandung : Sinar
Baru Algesindo, (1989)

Proceedings of IICMA 2015
Mathematics Education

Using Matematika Gasing in Learning Mathematics for Three Digit Addition
Aloysius Ajowembun1,a), Falenthino Sampuow2,b),Wiwik
Wiyanti3,c), and Johannes H. Siregar4,d)
1,2,3,4
STKIP Surya
a)
aloysius.natalis@students.stkipsurya.ac.id
b)
falenthino.sampouw@students.stkipsurya.ac.id
c)
wiwik.wiyanti@stkipsurya.ac.id
d)
Johannes.siregar@stkipsurya.ac.id

Abstract. Surya College of Education (STKIP Surya) is a college whose students are representatives from various local governments. The majority of students come from the mountainous areas of Papua. First-year students at STKIP Surya do not directly follow lectures as at other campuses; they first have to follow matriculation activities. Based on observations during this year's matriculation activities, many students still use their fingers or tally marks to answer addition questions. Given this condition, we intended to investigate by giving a treatment using Matematika GASING. Matematika GASING is a way of learning mathematics in an easy, fun, and enjoyable fashion. GASING is short for (Bahasa Indonesia) GAmpang, aSyIk, and menyenaNGkan. It was originally developed by Prof. Yohanes Surya to improve the mathematics ability of Indonesia's students. The focus of our research is how to add three digit numbers by using Matematika GASING. The subjects are 28 students in the matriculation class at STKIP Surya. The research method used in this study is qualitative research. To collect the data, the researchers conducted classroom observation, interviews, and documentation. The data were then analyzed to show that Matematika GASING can improve matriculation students' ability in learning mathematics for three digit addition.
Keywords and Phrases: Matematika GASING, three digit addition, qualitative research.

1. Introduction
According to Law of the Republic of Indonesia No. 20/2003, article 1 verse 8, about the National Education System in Indonesia [5], the education system in Indonesia is divided into several levels. As students need to continue to the next education level, it is compulsory for them to meet some requirements. For example, every student who wants to enter the junior high school level first has to graduate from the elementary school level. The same process applies at STKIP Surya. Every student in this college of education needs to complete the matriculation class in their first year of college. In this matriculation class, all mathematics topics from the lower levels of education are reviewed before the students face harder topics at the college level.
Based on the observation done in the matriculation class of 2015, there are still some students who use their fingers when doing calculations. Moreover, a tutor has stated that for some matriculation students, calculating is difficult to do without using fingers. Yet all matriculation students graduated from senior high school and started learning to calculate when they were in primary school.
It is necessary to know that STKIP Surya is a college of education which coordinates with some provincial governments in Indonesia to provide scholarships for selected students. Most students in this college of education come from the eastern part of Indonesia, such as Papua, Kupang, and Maluku.
In this study, we applied Matematika GASING to increase the students' skill in doing 3 digit addition. Matematika GASING was developed at STKIP Surya (founded by Prof. Yohanes Surya) to help students in learning mathematics [3]. In addition, studies done by Surya and Moss [4], Wiyanti and Wakhyuningsih [6], and Siregar [1] have shown that Matematika GASING helps increase students' grades in mathematics. For this study, the researchers use the qualitative descriptive method for the data analysis. We expect that Matematika GASING can increase students' skill for three digit addition. The purpose of this research is to show that there is an increase in the matriculation participants' skill in Matematika GASING for three digit addition, and the research is focused on Matematika GASING for three digit addition. The research was done on 17th-24th September 2015. The participants of the research were students of the matriculation class at STKIP Surya who came from Papua. STKIP Surya is a teacher's college located in Tangerang, Indonesia. The research method used is qualitative. In this method the researchers are the main instrument [2], so data are obtained from the observation, interviews, and documentation done by the researchers. To see an increase in the students' ability for three digit addition, the researchers used a pretest and a posttest, looking not only at the students' abilities but also at whether students can do three digit addition by Matematika GASING. There are 3 steps in the qualitative method according to Miles and Huberman (rewritten by Sugiyono) [2]: 1) data reduction (the researchers focus on the important data); 2) data display (the researchers display the data); 3) conclusion drawing/verification (the researchers make conclusions and verify them against the data). This research is considered successful when participants gain a score of 90 on their posttest.

2. Main Results
2.1. Theoretical Framework
Matematika Gasing
Surya and Moss [4] said, "Math GASING Method shows how to change concrete sample into an abstract symbol so the students will be able to read a mathematical pattern, thus gain the conclusion by themselves". Matematika GASING has a critical point; it is believed that if students can pass through and really understand this point, they can solve various problems related to it. This is supported by the research results of several researchers, one of them Siregar. Siregar [1] concluded that students who passed the critical point of Matematika GASING did well on their posttest.
To understand the addition of three digit with three digit numbers, students must pass the addition critical point of Matematika GASING. Besides, students must know the numbers 100-999. It is also important for them to know the place value concept for each number [6].
1. Knowing the numbers 100-999
Introducing the numbers 100-900 to students can be done by using green rectangular sticks, each consisting of ten cubes. One rectangular stick represents ten; when teachers take 9 of these sticks, they say that it is ninety. Teachers need to explain that there will be a blue rectangular stick to represent hundreds. Students have to recognize that 10 green rectangular sticks are equal to one blue rectangular stick representing a hundred. Figure 2.1 below shows how 10 green rectangular sticks are equal to one blue rectangular stick.

FIGURE 2.1 Knowing the number of 100.

2. The number 101
Teachers show a blue rectangular stick as a hundred and a red square as a one. Then, the teacher can say "well, this is one hundred and one, written 101". Apply the same way to show the numbers 102-109. Figure 2.2 below illustrates knowing the number 101.

FIGURE 2.2 Knowing the number of 101

3. The number 110
Teachers show a blue rectangular stick as a hundred and a green rectangular stick to represent ten. Then teachers can say "well, this is one hundred and ten, written 110". The teacher can apply the same way to show the numbers 120-190. Figure 2.3 below illustrates knowing the number 110.

FIGURE 2.3 Knowing the number of 110
4. The number 111
The teacher shows a blue rectangular stick as a hundred, a green rectangular stick to represent ten, and a red square as a one, and says this is one hundred and eleven. Do the same to introduce the number 112 and so on.
5. Adding Three Digit and Three Digit Numbers
Once students are proficient in knowing the numbers 100-999, they can start doing the addition of three digit with three digit numbers. By doing this kind of addition, students have also learned to add a three digit number with a single digit number and a three digit number with a double digit number. Example 2.1 shows the process of adding a three digit number and a three digit number [6].
Example 2.1. Addition of a three digit number and a three digit number.

FIGURE 2.4 Three digits addition


As shown above, the addition of a three digit number and a three digit number must be done based on the place values of the numbers. Consider the addition 253 + 169 in Example 2.1. First, students add the digits in the hundreds place: 2 + 1 = 3. Then the tens: 5 + 6 = 11, written with a small 1 in front; the small 1 denotes a hundred, because what is actually added is the tens value, 50 + 60, with result 110. Next, the ones: 3 + 9 = 12, written with a small 1 in front; here the small 1 denotes a ten, because in 3 + 9 = 12 the 1 occupies the tens place. Then we combine 3 | 11 | 12 into 422, so 253 + 169 = 3 11 12 = 422.
To do additions like the one in Example 2.1, we apply the same procedure, focusing on the place values of the numbers we add. The addition process starts from the leftmost digits, the hundreds, and continues to the last digits, the ones. Moreover, we need to pay attention to the results we get: adding ones to ones may produce tens, and adding tens to tens may produce hundreds. Partial sums that have the same place value are then combined.
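To make the place value procedure of Example 2.1 concrete, here is a minimal Python sketch (an editorial illustration, not part of the original paper; the function name gasing_add_three_digits is chosen for this example) that mirrors the GASING steps: add place by place from the left, then combine the partial sums by carrying tens upward.

def gasing_add_three_digits(a, b):
    # Split each number into its hundreds, tens, and ones digits.
    digits_a = [a // 100, (a // 10) % 10, a % 10]
    digits_b = [b // 100, (b // 10) % 10, b % 10]

    # Add place by place, left to right: 253 + 169 gives [3, 11, 12].
    partial = [da + db for da, db in zip(digits_a, digits_b)]

    # Combine the partial sums from right to left, carrying tens upward:
    # ones 12 -> write 2, carry 1; tens 11 + 1 = 12 -> write 2, carry 1;
    # hundreds 3 + 1 = 4; the combined result is 422.
    result, carry = 0, 0
    for place, value in enumerate(reversed(partial)):
        value += carry
        result += (value % 10) * 10 ** place
        carry = value // 10
    return result + carry * 10 ** len(partial)

print(gasing_add_three_digits(253, 169))  # prints 422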

2.2. Discussion
Based on the learning process, students have shown significant improvement, proven by the increase in scores between the pretest and the posttest. One of the students scored eight points higher on the posttest compared to the pretest, and another student, named Piter, finally reached the maximum score on the posttest after getting 40 as his pretest score. The lessons as well as the pretest and posttest scores show that the students improved. Piter's low score on the pretest was because he did not answer the questions about three digit addition, as shown in Figure 2.5.

(a) Pretest (b) Posttest


FIGURE 2.5. Piter's sheets before and after learning three digit addition with Matematika GASING
Based on the analysis of Piter's work, it can be said that before the treatment, Piter did not understand three digit addition. However, after being taught the addition of three digits with three digits using Matematika GASING, Piter was able to answer the questions on the posttest. This is further strengthened by the results of the interviews with Piter; he confirmed that before learning with Matematika GASING he did not yet understand the addition of three digits with three digits. We can also see that for numbers 22 and 23 on the posttest, Piter used Matematika GASING (as in Example 2.1) to solve the problems. Besides that, according to the researchers' observation, Piter finished all the posttest problems earlier than on his pretest. So the researchers believe that Matematika GASING had an impact in enhancing his ability in three digit addition.
The other students also obtained higher results on the posttest than on the pretest. Some of them, who did not answer or gave wrong answers on the pretest, gave correct answers on the posttest for problems of the same type as on the pretest. Figure 2.6 shows an example of the work of a matriculation student named Erikson. Erikson is one of the students who scored 100 on the posttest (solving all posttest problems properly). Scores of more than 92 on the posttest were also obtained by other students, named Sumani, Robeka, and Timothy.

(a) Pretest (b) Posttest
FIGURE 2.6 Pretest and posttest sheets by Erikson
It is obvious that for questions of the same type (with different numbers), Erikson gave wrong answers on the pretest and correct answers on the posttest. Moreover, there were questions on the pretest that Erikson did not answer, but on the posttest Erikson was able to get the right answers. In another case, the pretest and posttest contain a trap question, number 26 on the pretest and number 18 on the posttest. On the pretest, Erikson gave the answer 906. This shows that Erikson answered it without the knowledge that the addition should also be based on place value. After the teaching, Erikson could answer the same question on the posttest properly (see number 18 in Figure 2.6). What Erikson did was also done by his friends named Robeka and Sumani.
Besides the students mentioned above, who got a significant increase in score, there is also one student, namely Simion, who got an insignificant increase. He scored 40 on the pretest and 48 on the posttest. This definitely does not achieve the success indicator that the researchers set. After analyzing what Simion did during the teaching process, the researchers concluded that Simion is one of the weaker students in receiving the learning treatment as the teacher delivered it. This was strengthened after the researchers interviewed his tutor before the treatment (learning with Matematika GASING) was given, and strengthened even further by Simion's statements when the researchers interviewed him. Based on the interview with Simion, the researchers found that it was Simion's first time learning the material, from the introduction of the hundreds up to three digit addition.
Overall, the matriculation students' scores on three digit addition increased after they were taught using Matematika GASING, with one student who only improved by 8 points, which is still less than the indicator the researchers set. The other students passed and scored more than 90.
Based on the conclusion above, the following are recommended:
1) Matematika GASING is strongly recommended for teaching three digit addition to students.
2) Future research should consider learners' capability in absorbing the subjects given, so that researchers can focus on helping students who really need help (those weak in absorbing the material).

Acknowledgement
The authors would like to thank the Department of Mathematics at STKIP
Surya for facilitating this research, especially the students who participated. Special thanks go to Novi Purnama Sari and Edi Ramawijaya, who gave translation support.

References
[1] Siregar, et al, Learning the Critical Points for Addition in MATEMATIKA
GASING, Journal On Mathematics Education (IndoMS-JME) Vol. 5, 160-
169, (2014)
[2] Sugiyono, Metode Penelitian Kuantitatif, Kualitatif dan R&D, Bandung: Alfabeta, (2011)
[3] Y. Surya, Modul Pelatihan Matematika GASING SD Vol. 1, PT Kandel,
(2013)
[4] Y. Surya and M. Moss, Mathematics Education in Rural Indonesia, Proc.
12th International Congress on Mathematics Education: Topic Study Group
30, pp.6223-6229, Seoul, (2012)
[5] Departemen Pendidikan Nasional, Undang-Undang Nomor 20 Tahun 2003,
Tentang Sistem Pendidikan Nasional, Jakarta: Depdiknas, (2003)
[6] W. Wiyanti and N.S. Wakhyuningsih, Penerapan Matematika GASING
(Gampang, ASyIk, menyenaNGkan) pada Materi Penjumlahan Dua Digit
dengan Dua Digit untuk Siswa Kelas 1 Sekolah Dasar Negeri Cihuni II
Kelapa Dua Tangerang, Proc. National Congress on Mathematics and
Mathematics Education: Topic Study Group C, pp. C11-C18, Malang, (2013)

Proceedings of IICMA 2015
Mathematics Education

Implementation of Matematika Gasing Addition of Many Numbers for Matriculation Students at STKIP Surya, Tangerang
Jenerson Otniel Duwit1,a), Delson Albert Gebze2,b),Wiwik
Wiyanti3,c), and Johannes H. Siregar4,d)
1,2,3,4
STKIP Surya

a)
jenerson.otniel@student.stkipsurya.ac.id
b)
delson.albert@student.stkipsurya.ac.id
c)
wiwik.wiyanti@stkipsurya.ac.id
d)
Johannes.siregar@stkipsurya.ac.id

Abstract. STKIP Surya is attended by students from the region of Papua every new academic year. Even though they graduated from the high school level, they still find it difficult to do basic mathematics. Based on this condition, a matriculation class must be completed before the program study classes start, in which students are required to study mathematics again from elementary school to high school level. At the beginning of the matriculation class, preliminary test results show that many students have difficulty answering the questions correctly, for example the addition of many numbers, such as 2 + 7 + 9 + 5 + 6 + 8 = ⋯ , 5 + 9 + 6 + 1 + 8 + 7 + 4 + 3 + 5 = ⋯. To answer these questions, students still use their fingers and make their calculations using tally marks, which leads them to wrong answers. Hence, we present Matematika GASING (Gampang, ASyIk, menyenaNGkan) for the addition of many numbers in the matriculation class. In Matematika GASING, the learning process goes from concrete to abstract and is evaluated with mental calculation when students can calculate without using paper and pencil. To conduct this study, 28 matriculation students were selected as the sample. The method used in this research is qualitative research. The analysis in this study was conducted per question, comparing the pre-test and post-test in the same categories. From the analysis results, we conclude that the implementation of Matematika GASING for the addition of many numbers can help improve the numeracy skills of students from Papua.
Keywords and Phrases: Matematika GASING, addition of many numbers, qualitative research, matriculation class, Papuan students, streak system.

1. Introduction
Mathematics knowledge, interest, and skills are basic to students' success in higher education and in life. Indonesia, geographically spread out among so many islands, certainly faces a challenge in the education achievement gap between urban, rural, and remote areas. Students in rural and remote areas have less opportunity to develop the quality of their knowledge and to move forward in achieving particular goals [7]. To improve the rural education environment, those students have to take part in the challenge of improving education in their home regions as teachers.

STKIP Surya, a teacher's college, is facing this challenge by providing a matriculation program for rural area students. Matriculation is pre-coursework that is often applied in colleges in general. In the early stages of the pre-coursework, students are introduced to the program to help them, in terms of knowledge, attitudes, and behaviour, adapt to the real atmosphere of lectures in college. Because students have diverse capabilities, in the matriculation class students are required to study mathematics again from elementary school to high school level by using Matematika GASING.
Wakhyuningsih, a tutor (teacher of the matriculation class), said that there are still many students of academic year 2015 who have not mastered elementary school mathematics well. During the process of learning mathematics, many students were found who do not understand and cannot answer questions correctly, whether oral or written. Some students still find it difficult to work on a given problem, and they work on it using fingers or tally marks to calculate. An example of student work can be seen in Fig. 1.1.

FIGURE 1.1 Example of work from the student in matriculation class


As seen from the Fig. 1.1, this student has not mastered addition properly.
From this student’s work shows that the ability to deal with addition question by
counting with tally marks. If this condition is continued to be left without
improvement, that students will be hard to follow the program study lectures. As an
example, lecture for Number Theory, Basic Physics, Basic Statistics, Logic and Set
Theory in STKIP Surya requires that students have known basic introduction to
integers number well and its operations.
Beyond this, students from Papua still have some other problems. The students find it hard to do the basic mathematics operations given by their teachers. This fact was found through interviews with some students, who said that their past mathematics teachers rarely came to the classroom, sometimes only once a month, which made it difficult for them to master mathematics. Learning mathematics was not meaningful for them. This was also discovered by Jensen in his research, namely that the meaningfulness of learning will not be achieved if learning has no deep meaning and does not touch the hearts and private lives of students [1]. Similarly, a study reported in [7] claims that there are no children who cannot learn math, only children who have not had the opportunity to learn mathematics in a way that is fun and meaningful.
There have been several studies of using Matematika GASING for learning mathematics. Surya and Moss found that mathematics is an interesting subject, but that to grow the meaningfulness of learning mathematics it must be done in a fun and exciting way [7]. Therefore, to make learning easy, fun, and enjoyable, Prof. Yohanes Surya uses Matematika GASING for learning mathematics. Learning with Matematika GASING has been applied in the Papua region, and the result was a new breakthrough: several world-level Olympiad champions came from kids in rural regions, one of them a student from Papua. Wiyanti and Wakhyuningsih, on the topic of addition for elementary school students, found that students who have mastered the critical point for addition are better at solving addition problems than those who have not mastered it. Prahmana and Suwasti, on the topic of Local Instruction Theory on Division in Mathematics GASING, describe how Matematika GASING makes a real contribution to students' understanding of the concept of the division operation [3]. Kusuma and Sulistiawati, on the topic of Teaching Multiplication of Numbers using Matematika GASING, showed that Matematika GASING helped students to understand and be able to teach multiplication of numbers from 1 to 10 better [2]. Siregar et al., on the topic of learning the critical points for addition in Matematika GASING, showed that there is an increase in students' skill for addition and that most students can teach addition after learning with Matematika GASING [4]. Based on these previous studies, this study implemented Matematika GASING for matriculation students learning the addition of many numbers.
The purpose of the research is to find out whether the implementation of Matematika GASING can help improve the arithmetic skill in the addition of many numbers of matriculation students from the Papua region. This research is limited to using Matematika GASING for the topic of the addition of many numbers. The research method is qualitative research. The participants are the 28 matriculation class students who come from the region of Papua. The research is said to be successful if there is an increase in score from the results of the pre test to the results of the post test.

2. Main Results
2.1 Matematika GASING
Learning mathematics using Matematika GASING is a way of learning math that can be carried out in an easy, fun, and enjoyable manner. GASING stands for Gampang, ASyIk dan menyenaNGkan, which translates as easy, fun, and enjoyable. Surya and Moss introduce Matematika GASING as follows: "Math GASING method shows how to change a concrete sample into an abstract symbol so the students will be able to read a mathematical pattern, thus gain the conclusion by themselves" [7]. Teaching mathematics in Matematika GASING starts from the concrete and moves to the abstract, and is then evaluated with mental calculation when students can calculate without any aid such as paper and pencil. During the learning process using Matematika GASING, students must pass through a phase such that, once students master it, they will be able to work on given problems easily. This phase is called the "GASING critical point", as shown in Figure 2.1 [6]. The study of the GASING critical point for addition done in [4] shows that students who reach the passing grade for mastery of the GASING critical point also improve their written test results and increase their mental arithmetic results.
For addition, the GASING critical point is mastery of the addition of two numbers whose sum is less than 20. The steps to get to the critical point of addition can be seen in Figure 2.1, depicted as a ladder. The description of each rung is: (1) students should recognize the numbers 1-10, (2) students must master the addition of the numbers 1-10, (3) students should recognize the numbers 11-19, (4) students must master the addition of the numbers 11-19. After finishing the fourth rung, students reach the GASING critical point of addition.


FIGURE 2.1. The process to achieve the GASING critical point


After mastering the GASING critical point, students are introduced to the addition of many numbers with the "streak system".

2.2 Addition of many numbers (more than 2 numbers) with the streak system
According to [6], adding many one digit numbers with the "streak system" can be done as in Example 2.1. Note that to learn the "streak system", students should know the place value of numbers in advance. The place value is the value of the position of a digit in a number; for example, in the number 13, the place value of the 1 is "tens" and that of the 3 is "ones".

Example 2.1. Addition of many numbers in one digit

The procedure of the streak system in Example 2.1 is as follows: 4 + 9 = 13, so streak the number 9 (13 is greater than 10) and remember the ones value, 3. Next, add 3 to the next number, 8: 3 + 8 = 11, so streak the number 8 (11 is greater than 10) and remember the ones value, 1. Next, add 1 to the next number, 1: 1 + 1 = 2; there is no streak here (2 is not greater than 10). Next, add 2 to the next number, 3: 2 + 3 = 5; there is no streak here (5 is not greater than 10). Next, add 5 to the next number, 3: 5 + 3 = 8; there is no streak here (8 is not greater than 10). Last, add 8 to the next number, 7: 8 + 7 = 15, so streak the number 7 (15 is greater than 10). The ones value 5 is the ones digit of the result, and the 3 streaks count as the tens digit of the result. So the result of the addition of these many numbers is 35.
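As a minimal sketch (an editorial illustration, not part of the original paper; the function name streak_add is chosen for this example), the streak system can be written in Python: keep a running ones value, record a streak whenever the running sum reaches ten or more, and read the number of streaks as the tens digit of the answer.

def streak_add(numbers):
    # ones is the running ones value; streaks counts the streak marks.
    ones, streaks = 0, 0
    for n in numbers:
        ones += n
        if ones >= 10:    # the running sum reached ten or more:
            streaks += 1  # streak this number ...
            ones -= 10    # ... and keep only its ones value
    return streaks * 10 + ones

# The numbers of Example 2.1: three streaks and a final ones value of 5.
print(streak_add([4, 9, 8, 1, 3, 3, 7]))  # prints 35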
This research uses the qualitative research method, considering the learning process of the mathematics study. Sugiyono explains that in qualitative research the researcher must go through the phases of description, focus, and selection [5]. In the description phase, the researchers describe what is seen, heard, felt, and asked; this was accomplished by interviewing students and tutors (matriculation teachers). The conclusion from the description phase is that students have different levels of understanding and different ways of solving problems in mathematics. The next phase is the focus phase, in which the researchers focus on the information obtained from the description phase; information about how students solve problems is the focus discussed in this research. The last phase is the selection phase, in which the researchers describe in more detail, with deeper analysis, the information obtained from the conclusion of the focus phase. The selection phase in this research concerns the way students solved problems by using Matematika GASING with the streak system. Figure 3.1 shows the three phases of the qualitative research method in this study: phase 1 (description) yielded the information that students have different ways of solving problems in mathematics; phase 2 (focus) yielded the information about how students solve problems; phase 3 (selection) yielded the information about the way students solved problems by using Matematika GASING with the streak system.

FIGURE 3.1. Three phases of the qualitative research method in this study

The description phase began with collecting data by interviewing students and the tutor, and from their record of daily test scores. From these data, some students said that they still find it difficult to answer addition problems, because their previous study in their home regions was not optimal. This fact is supported by interviews with their tutor, Mrs. Wakhyuningsih, who said that most students have not been able to answer problems correctly because of the language barrier and because they have not yet understood the problems well.
In addition, in this description phase, the researchers found some interesting things: some students have not been able to do addition problems with "many numbers". For example, for addition of many numbers problems like Example 2.1, students still use their fingers and tally marks to do the addition, as shown in Figure 1.1. Beyond wrong calculation results, they also write the names of numbers in letter form incorrectly. For example, the number 7 in the Indonesian language is TUJUH, but they write "TUJU" without the H. For the number 8, DELAPAN, they write "DERAPAN", where L is replaced with R, and also "DELAFAN", where P is replaced with F; 18 is written "DERAPAN BELAS", where the L of DELAPAN is replaced with R. There is one student who, if told to write down a tens number, for example 13, can write the correct "TIGA BELAS" on paper or a whiteboard; however, if told to spell the number 13, he spelled not "TIGA BELAS" but "TIGA TRAS". This is an indication of the influence of the habit of speaking in local languages. Some students, while using visual aids in the learning process, quickly master the material provided; however, the next day, when questioned without visual aids, they become confused and get wrong answers.
The conclusion of the description phase is that students are weak in calculation; even for problems at the high school level they still use tally marks, as shown in Figure 1.1.
The focus phase in this research considered how students solved the problems and why they still answered the problems incorrectly. They solved the problems in the way they had learned in their previous study.

FIGURE 3.2. Student's work before and after having Matematika GASING
In the selection phase, students learned to use Matematika GASING for the addition of many numbers with the streak system. After the learning process using Matematika GASING, students were able to answer the problems correctly, and their answers stayed correct for given problems of the same type. The work of one student can be seen in Figure 3.2.
The results of this study can be explained as follows. The data were collected in the form of pre test and post test scores. Analyzing the test scores of the 28 students of the matriculation class, the average pre test score is 75.83 and the average post test score is 95.85; their test scores increased. In addition, in interviews, some matriculation students said they were very pleased to learn math by using Matematika GASING. According to them, the learning is very easy to remember, and the process of study was very fun because the matriculation students played an active role as executors (demonstrating how to work on a given problem) and deeply enjoyable since visual aids were used in the learning process.
Based on the above description, Matematika GASING may help to improve the skills of matriculation students from Papua, particularly in the addition of many numbers. This can be seen from the results obtained: there is an increase in score from pre test to post test for the matriculation students from Papua. Besides the learning outcome achievement, the interview results show that students find it enjoyable and fun to learn mathematics with Matematika GASING.

References
[1] E. Jensen, Teaching with the brain in mind, Association for Supervision and
Curriculum Development 1703 N, Beauregard St. Alexandria., (1998)
[2] J. Kusuma and Sulistiawati, Teaching Multiplication of Numbers Using
Matematika GASING, Journal On Mathematics Education (INdoMS-JME), Vol
5, No.2, Page 66 - 84, Palembang, (2014)
[3] R.C.I. Prahmana and P. Suwasti, Local Instruction Theory on Division in
Mathematics Gasing, Journal On Mathematics Education (INdoMS-JME), Vol
5, No.2, Page 17 - 26, Palembang, (2014)
[4] Siregar, et al, Learning The Critical Points for Addition in Matematika
GASING, Journal On Mathematics Education (INdoMS-JME), Vol 5, No.2,
Page 160-169, Palembang, (2014)
[5] Sugiono, Metode penelitian pendidikan, Bandung: Alfabeta, (2013)
[6] Y. Surya, Modul Pelatihan Matematika GASING SD, PT KANDEL, (2013)
[7] Y. Surya and M. Moss, Mathematics Education in Rural Indonesia, Proc. 12th
International Congress on Mathematics Education: Topic Study Group 30,
pp.6223-6229. Seoul: Korea National University of Education, (2012)
[8] W. Wiyanti and N.S. Wakhyuningsih, Penerapan Matematika GASING
(Gampang, ASyIk, menyenaNGkan) pada Materi Penjumlahan Dua Digit
dengan Dua Digit untuk Siswa Kelas 1 Sekolah Dasar Negeri Cihuni II Kelapa
Dua Tangerang, Proc. National Congress on Mathematics and Mathematics
Education: Topic Study Group C, pp. C11-C18, Malang, (2013)

Proceedings of IICMA 2015
Mathematics Education

The Implementation of Matematika Gasing on Multiplication Concept Toward Interest, Study Motivation, and Student Learning Outcome
Asri Gita1,a), Nia Yuniarti2,b), Nerru Pranuta M3,c)
1,2,3
STKIP Surya

a)
asri.gita@students.stkipsurya.ac.id
b)
nia.yuniarti@students.stkipsurya.ac.id
c)
nerru.pranuta@stkipsurya.ac.id

Abstract. This research was conducted based on the findings that students have difficulties in learning the multiplication of two numbers, especially multiplication of two digits by two digits and multiplication of two digits by three digits. This research is a pre-experimental design; the design used in this study is the One Shot Case Study. The purpose of this study is: (1) to implement Matematika GASING for the concept of multiplication, (2) to find the influence of Matematika GASING on students' learning outcomes, and (3) to find the positive influence of students' interest and motivation simultaneously on mathematics learning outcomes with Matematika GASING. The population in this study is the 5th grade students of Cihuni I elementary school in Kelapa Dua district, Tangerang. The sampling technique used purposive sampling, and the data were collected by test and questionnaire. The findings were analyzed using ANOVA at the 5% significance level. The expected results of this research are: (1) Matematika GASING can be implemented for the concept of multiplication, (2) there is a positive influence of Matematika GASING on students' learning outcomes, and (3) there is a positive influence of students' interest and motivation simultaneously on mathematics learning outcomes with Matematika GASING.
Keywords and Phrases: study interest, study motivation, learning outcome, and Matematika GASING.

1. Introduction
Mathematics is a subject that is present at every level of education, from elementary school to college. Basic math skills are a very important part of children's education in elementary school and are among the skills necessary for success in the 21st century [1]. According to the National Association for the Education of Young Children (NAEYC), students learning math skills at a young age build a great base for future learning efforts, and this can be a good indicator of whether someone will be able to meet and resolve the challenges that will be faced in the future [2].
The mathematics content given to elementary school students is essentially elementary and contains the basic concepts needed to understand higher concepts [3]. One of the basic mathematics contents that should be mastered by children is the arithmetic operations on numbers, which include addition, subtraction, multiplication, and division. These operations are very closely related: understanding the concepts and skills of one operation affects understanding the concepts and skills of the other operations [4]. Mastering the operations on numbers in elementary school is very important for learning other subjects [5].
According to Piaget's theory [6], a person's progression of knowledge is
closely related to biological development and to interaction with the environment.
Cognitive development runs from the sensorimotor level to formal or abstract
thinking, classified by age as follows: 1) sensorimotor (0-2 years), 2)
preoperational (2-7 years), 3) concrete operational (7-11 years), and 4) formal or
abstract thinking (12-16 years) [6]. Children at the elementary school level are
still in the stage of concrete thinking; therefore, when they learn an abstract
concept, they have difficulties. As a result, many students consider mathematics a
tough lesson [7, 8, 9]. Exam results of 4th grade students at elementary school 2
Muara Panas in the first semester of the 2014-2015 school year on long
multiplication show that only 20% of students mastered the content completely,
35% did not master it, and 45% mastered it poorly [10].
Students' low scores on multiplication content are partly due to teachers who
still teach the multiplication operation by rote memorization [11]. As a result,
students quickly forget and do not understand the concept well. Even for
multiplication of natural numbers in the tens and hundreds, teachers only apply
the standard long multiplication algorithm, so students feel bored because there is
no variation with other methods [11]. The appeal of a subject is determined by
two things: first, by the subject or learning material itself, and second, by the
teacher's way of teaching [12]. Therefore, a teacher must prepare special methods
to make the subject more interesting than before and easier to learn. Several
methods and learning media are available; one of them is Matematika GASING
(Gampang, ASyIk, and menyenaNGkan). Matematika GASING is a way of
learning mathematics in an easy, fun, and enjoyable manner. A pleasant learning
atmosphere encourages students' motivation and interest in learning. Motivation
is an important aspect of teaching and learning activity [13, 14]. If a student is
motivated to learn mathematics, he or she will learn it earnestly and can easily
achieve the learning objectives. When students understand the mathematics
content studied, a positive attitude towards mathematics grows in them, and with
it their interest [15]. Thus, when understanding of the content is achieved, it will
have an effect on student learning outcomes.
This research examines the following problems: 1) How can Matematika
GASING be implemented in teaching the concept of multiplication? 2) Is there an
influence of Matematika GASING on student learning outcomes? 3) Do students'
interest and motivation simultaneously have a positive influence on mathematics
learning outcomes in Matematika GASING?
The purposes of this research are to implement Matematika GASING in
teaching the concept of multiplication, to determine the influence of Matematika
GASING on student learning outcomes, and to determine whether students'
interest and motivation simultaneously have a positive influence on mathematics
learning outcomes in Matematika GASING.
This research is expected to be beneficial for teachers, as it offers another
means of teaching two-digit multiplication content to students. For the
researchers, the benefit is to develop knowledge useful for the further
development of research on mathematics learning.
2. Main Results
Matematika GASING
Matematika GASING is a way of learning mathematics aimed at achieving
learning outcomes, originally developed by Prof. Yohanes Surya. GASING stands
for Gampang, ASyIk, dan menyenaNGkan, which translates as easy, fun, and
enjoyable. The content sequence of Matematika GASING differs from the usual
school sequence of teaching mathematics because Matematika GASING starts
from addition, then multiplication, subtraction, and division. Matematika
GASING is taught in a unique way that starts from concrete forms to build
understanding of the concept and then continues to abstract forms. Matematika
GASING shows how to turn a concrete example into an abstract symbol so that
students are able to read a mathematical pattern and thus reach the conclusion by
themselves [16]. The introduction of concrete forms can encourage students
through exploration activities using props. Matematika GASING has a critical
point in every content area; the GASING critical point is a level that must be
passed by students in order to understand the subsequent contents [17].
In multiplication content, the GASING critical points are: 1) students must
understand the concept of multiplication, 2) students can quickly compute
multiples of 1, 10, 9, 2, and 5, 3) students master the multiplication of two equal
numbers, 4) multiplication by 3 and 4, and 5) multiplication by 8, 7, and 6.
FIGURE 2.1 The GASING critical point: a staircase of steps 1-5, with the critical point at the top.
In order to understand the multiplication concept, learning starts with
concrete means (the concrete stage in GASING). An illustration of learning 2 × 5
is given in Figure 2.2.
Written 2 □ 5 → 2 × 5
FIGURE 2.2 Concrete multiplication of 2 × 5.
Take one box containing five pineapples, then take one more. There are now
two boxes each containing five pineapples, read as "two boxes of five", which is
written 2 × 5. The result is 5 + 5 = 10.
When the fifth step has been mastered properly, students can be said to have
passed the GASING critical point. This means they can continue to other
multiplication content, including the multiplication of two-digit numbers. The
following is an example of how to calculate the multiplication of two-digit
numbers, taking 12 × 46 and multiplying from the front:
1. Tens times tens gives hundreds, so put 3 places (ones, tens, hundreds).
2. 1 ten × 4 tens = 4 hundreds; write 4 at the front.
3. 1 ten × 6 ones = 6 tens, and 4 tens × 2 ones = 8 tens; adding them gives
14 tens, that is, 1 hundred and 4 tens. The hundreds become 4 + 1 = 5, and 4 is
written in the tens place.
4. Next, 2 ones × 6 ones = 12 ones, that is, 1 ten and 2 ones. The total of
tens becomes 4 + 1 = 5, and 2 is written in the ones place.
5. So the final result of 12 × 46 is 552.
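To make the front-first procedure concrete, the following is a minimal sketch in
Python (our illustration, not part of the paper; the function name gasing_multiply
is ours) that mirrors the steps above for any pair of two-digit numbers:

    # Sketch of GASING front-first multiplication of two two-digit numbers,
    # following the 12 x 46 example: front product first, then the cross
    # products, then the ones product, carrying to the left as needed.
    def gasing_multiply(a, b):
        a_tens, a_ones = divmod(a, 10)
        b_tens, b_ones = divmod(b, 10)
        hundreds = a_tens * b_tens                  # tens x tens -> hundreds
        tens = a_tens * b_ones + a_ones * b_tens    # cross products -> tens
        hundreds += tens // 10                      # carry into hundreds
        tens %= 10
        carry, ones = divmod(a_ones * b_ones, 10)   # ones x ones -> ones
        tens += carry
        hundreds += tens // 10                      # carry again if needed
        tens %= 10
        return 100 * hundreds + 10 * tens + ones

    print(gasing_multiply(12, 46))  # prints 552, as in the worked example

Running the sketch on 12 and 46 reproduces the 552 obtained above.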
Motivation and Study Interest
Motivation is one of the factors that determine children's success in study,
because motivation is very effective in helping students learn [18]; this makes
motivation one of the more important prerequisites for learning [19]. Student
motivation plays an important role in the process of conceptual change [20], in
critical thinking and learning strategies [21, 22], and in the achievement of
learning outcomes [23]. According to Winkel [24], learning motivation is the
overall psychic driving force within students that gives rise to learning activities,
ensures their continuity, and directs them toward achieving a goal. Motivation can
arise due to factors from inside and outside that affect students' interest in a
subject [15].
One way to motivate students during the learning process is to connect their
learning experience with their interest. Interest is a basic characteristic that
expresses the relationship between a person and a specific object or activity [25]
when the person is not under pressure from outside himself [26]. Study interest is
encouragement from within someone to engage consciously, with pleasure,
voluntarily, and even repeatedly, in order to understand a content. Interest is often
associated with a person's behaviour in achieving specific goals in order to gain
an impression of a condition and an interaction with the environment [15, 18].
Interest has a strong influence on the cognitive (knowledge) and affective
(attitude) domains of the individual [27, 28, 29]. This influence does not only
build the cognitive and affective domains separately, but also blends the two [30].
Research Design
This study uses a quantitative method with a One Shot Case Study design. In
a One Shot Case Study, a treatment is given and the result is subsequently
observed [12]. The sampling technique used was purposive sampling, and data
were collected by test and questionnaire. The form of this research design,
according to Sugiyono [12], is as follows:

X O

Note:
X : learning multiplication of two-digit numbers with Matematika GASING
O : post-test score
The stages of this research were: 1) identifying the problems and goals, 2)
determining a research design appropriate to the problems and purposes, 3)
arranging the test instruments, 4) teaching two-digit multiplication, 5) giving a
post-test on multiplication of two-digit by two-digit and two-digit by three-digit
numbers, 6) giving questionnaires to measure students' interest in and motivation
for learning, 7) analyzing the test results, 8) drawing conclusions from the results
of the study, and 9) writing the research report.
Research Instrument
Data in this study were collected through a written test and questionnaires.
The written test was given at the end of the lessons, and its questions were
constructed from the revised Bloom's taxonomy at levels C1 (remember), C2
(understand), C3 (apply), and C4 (analyze) [30]. The post-test questions given
after the lessons were to be answered using the Matematika GASING method.
The written test on two-digit multiplication consists of 30 questions: 25
short-answer questions and 5 word problems. The problems were given to obtain
the students' average learning outcome on two-digit multiplication content.
Besides the post-test, students were also given two questionnaires to find out their
interest in and motivation for learning mathematics after learning with
Matematika GASING.
Result
Based on the post-test results of the 5th grade students of Cihuni I
elementary school in Kelapa Dua district, Tangerang, it was found that the ability
to multiply two-digit by two-digit and three-digit by two-digit numbers is
influenced by students' interest and motivation. The average post-test score of
students who answered using the Matematika GASING method is 82.47. A
one-sided test was then applied to the average of the students' learning outcome
variable, with the following hypotheses:
H0 : μ ≤ 70
H1 : μ > 70
Note:
μ : average of students' learning outcomes.
H0 : the average of students' learning outcomes is less than or equal to 70.
H1 : the average of students' learning outcomes is more than 70.
The test statistic used is

t = (x̄ − μ0) / (s / √n)

Note:
t : the calculated value of the test statistic,
x̄ : average of students' learning outcomes,
s : standard deviation of students' learning outcomes,
μ0 : the benchmark value for student performance, namely 70,
n : the number of students.

Substituting the observed values:

t_count = (82.47 − 70) / (15.885 / √20) = 3.510
This result is compared with the t-table value with df = n − 1 at the 5%
significance level. The testing criterion is: if t_count > t_(1−α), H0 is rejected and
consequently H1 is accepted, meaning the average score is more than 70. The
calculation gives t_count = 3.510, whereas t_table = 1.729. Clearly
t_count > t_(1−α), so H0 is rejected. Therefore the average of students' learning
outcomes is more than 70, that is, it exceeds the performance benchmark.
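As a quick cross-check of the arithmetic (our illustration, not part of the paper),
the reported one-sided, one-sample t-test can be reproduced from the summary
statistics given in the text; scipy is assumed to be available:

    # Recompute the reported t statistic and critical value from the
    # summary statistics in the text: mean 82.47, s 15.885, n 20, mu0 70.
    import math
    from scipy import stats

    mean, mu0, s, n = 82.47, 70, 15.885, 20
    t_count = (mean - mu0) / (s / math.sqrt(n))
    t_table = stats.t.ppf(0.95, df=n - 1)   # one-sided 5% level, df = 19

    print(round(t_count, 3), round(t_table, 3))   # approx. 3.511 and 1.729
    # t_count > t_table, so H0: mu <= 70 is rejected.

This agrees, up to rounding, with the t_count = 3.510 and t_table = 1.729
reported above.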
Student Interest in Matematika GASING
FIGURE 2.3 Diagram of student interest in Matematika GASING (five interest indicators).
Based on the questionnaires distributed to students, the results shown in
Figure 2.3 indicate that student interest in Matematika GASING is high.
Student Motivation in Matematika GASING
FIGURE 2.4 Diagram of student motivation in Matematika GASING (six motivation indicators).
Based on the questionnaires distributed to students, the results shown in
Figure 2.4 indicate that student motivation towards Matematika GASING is high.
Findings
Several factors affected the research:
1. The time available for the study was too short: only 3-4 meetings over 3 weeks.
2. Students were still not accustomed to learning with Matematika GASING.
3. The post-test was carried out after a sports lesson, so the children were still
fatigued.
3. Concluding Remarks
Based on the research that has been done and the results presented above,
some conclusions can be drawn with regard to interest, motivation, and student
learning outcomes:
1. Matematika GASING can be implemented in teaching the concept of
multiplication.
2. Matematika GASING has a positive influence on student learning outcomes.
3. Students' interest and motivation simultaneously have a positive influence on
mathematics learning outcomes in Matematika GASING.
References
[1] A. A. Partnership, 21st Century Knowledge And Skills In Educator Preparation,
American: National Education Association, (2010)
[2] National Association for the Education of Young Children (NAEYC), NAEYC
Standards for Early Childhood Professional Preparation Programs,
Washington: NAEYC, (2009)
[3] Komariah, Model Pemecahan Masalah Melalui Pendekatan Realistik pada
Pembelajaran Matematika SD, Jurnal Pendidikan Dasar, Vol. V No. 7, (2007)
[4] A. Karim, Muchtar, et al., Buku Pendidikan Matematika I, Malang: Depdikbud,
(1996)
[5] National Council of Teachers of Mathematics (NCTM), Principles and
standards for school mathematics, Reston, VA: Author, (2000)
[6] J. Piaget, Intellectual Evolution from Adolescence to Adulthood, Human
Development, (5), pp. 1-12, (1972)
[7] S. S. Stodolsky, Student Views about Learning Math and Social Studies,
American Educational Research Journal, 28(1), pp. 89-116, (1991)
[8] M. Abdurrahman, Pendidikan bagi Anak Berkesulitan Belajar, Jakarta: Rineka
Cipta, (1999)
[9] W. H. Cockcroft, Mathematics Counts, London, HMSO: Report of the
Committee of Inquiry into the Teaching of Mathematics in Schools, (1982)
[10] Y. Maiyulita, Peningkatan Kemampuan Menghitung Perkalian Menggunakan
Teknik Jari Tangan Pada Pelajaran Matematika Siswa Sekolah Dasar, Jurnal
ilmiah ilmu Pendidikan Dasar, Vol. XV No. 1, (2015)
[11] Mujib and Suparingga, Upaya Mengatasi Kesulitan Siswa Dalam Operasi
Perkalian Dengan Metode Latis, Prosiding, (2013)
[12] Sugiyanto, Model-model Pembelajaran Inovatif, Surakarta: Panitia
Sertifikasi, (2008)
[13] A. Krapp, Interest, Motivation, and Learning: an Educational-Psychological
Perspective, European Journal of Psychology of Education, Vol. XIV pp. 23-
40, (1999)
[14] F. Y. Odera, Motivation: The Most Ignored Factor in Classroom Instruction in
Kenyan Secondary Schools, International Journal of Science and Technology,
Vol. 1 No.6, December 2011, pp. 283 - 288, (2011)
[15] H. Hudojo, Mengajar Belajar Matematika, Jakarta: Depdikbud Dirjen Dikti,
(1988)
[16] Y. Surya and M. Moss, Mathematics Education in Rural Indonesia,
Proceeding in the 12th International Congress on Mathematics Education:
Topic Study Group 30, pp. 6223-6229, Seoul: Korea National University of
Education, (2012)
[17] Y. Surya, Modul Pelatihan Matematika GASING SD, Tangerang: Kandel,
(2013)
[18] A. Rehman and K. Haider, The Impact of Motivation on Learning Of
Secondary School Students in Karachi: an Analytical Study, Educational
Research International, Vol 2 No. 2, (2013)
[19] S. E. W. Djiwandono, Psikologi Pendidikan, Jakarta: PT Grasindo, (2008)
[20] O. Lee and J. Brophy, Motivational Patterns Observed in Sixth-Grade Science
Classrooms, Journal of Research in Science Teaching, 33(3), pp. 585–610,
(1996)
[21] H. Kuyper, M.P.C. Van der Werf, and M.J. Lubbers, Motivation, Meta-
cognition and Self-regulation as Predictors of Long Term Educational
Attainment, Educational Research and Evaluation, 6(3), pp. 181–201, (2000)
[22] C.A. Wolters, The Relation between High School Students' Motivational
Regulation and Their Use of Learning Strategies, Effort, and Classroom
Performance, Learning and Individual Differences, 11(3), pp. 281-300,
(1999)
[23] J.D. Napier and J.P. Riley, Relationship between Affective Determinants and
Achievement in Science for Seventeen-Year-Olds, Journal of Research in
Science Teaching, 22 (4), pp. 365-383, (1985)
[24] Winkel, Psikologi Belajar, Jakarta: PT. Gramedia Pustaka Utama, (2005)
[25] S. N. Elliot, et al., Educational Psychology: Effective Teaching, Effective
Learning, Boston: The McGraw-Hill Companies, Inc., (2000)
[26] A. J. Nitko and S. M. Brookhart, Educational Assessment of Students, New
Jersey: Pearson Education, (2007)
[27] M. D. Ainley, Interest in Learning in the Disposition of Curiosity in
Secondary Students: Investigating Process and Context, In L. Hoffman, A.
Krapp, K. Renninger, & J. Baumert (Eds.), Interest and learning: Proceedings
of the Seeon Conference on Interest and Gender (pp.257–266). Kiel,
Germany: IPN, (1998)
[28] K. A. Renninger, Individual Interest and Its Implications for Understanding
Intrinsic Motivation, In C. Sansone & J. M. Harackiewicz (Eds.), Intrinsic and
extrinsic motivation: The search for optimum motivation and performance (pp.
373–404), New York: Academic Press, (2000)
[29] U. Schiefele, Topic Interest, Text Representation, and Quality of Experience,
Contemporary Educational Psychology, 21, pp. 3-18, (1996)
[30] L. Anderson and D. Krathwohl, Kerangka Landasan Untuk Pembelajaran,
Pengajaran dan Asesmen Revisi Taksonomi Pendidikan Bloom, Yogyakarta:
Pustaka Pelajar, (2010)
Proceedings of IICMA 2015
Mathematics Education
The Development of Palopo Local Context Learning Model (Modified
Cooperative Learning Techniques: Think Pair Share and Problem-Based
Learning Techniques)
Patmaniar1,a) and Darma Ekawati2,b)
1,2) Cokroaminoto Palopo University
a) niar.niezt@yahoo.com
b) darma.ekaa@gmail.com
Abstract. Teaching and learning activities in Palopo were still teacher-centered. In
addition, local contexts, local wisdom, and local problems were rarely used as sources of
learning materials. These problems became the basis of this research, which develops local
context learning media for math instruction in Palopo by modifying two cooperative
learning techniques, Think Pair Share and Problem-Based Learning, into techniques of
good quality (valid, practical, and reliable). Considering the researcher's time limitations,
the development of those three qualities was done simultaneously. While developing the
techniques, the researcher also developed learning media and learning instruments suited
to the developed model. In the longer term, this development was intended to (1) develop a
Palopo local context learning model that could be used as a reference by university
lecturers in Palopo city; (2) develop a course book with Palopo local context
characteristics that could be used by university students in Palopo city; and (3) suggest to
the Local Education Authority that the modified learning model be adopted by teachers
and lecturers in Palopo city and its surrounding areas. In other words, this research makes
theoretical and practical contributions, especially in the field of education, in the form of a
new learning model: the modified Think Pair Share and Problem-Based Learning
techniques. This study employed 4 critical phases: (1) initial investigation, (2) planning,
(3) realization, and (4) evaluation and revision, applied to the Data Analysis course. The
results of this study showed that the developed media, which drew on local contexts and
problems in Palopo city, could stimulate and develop students' skill in solving
mathematical problems arising in Palopo city. The results also contribute to learning
development, especially for teachers' and lecturers' development in Palopo city.
Keywords and Phrases: media development, local context, cooperative learning
technique, Think Pair Share, Problem-Based Learning
1. Introduction
Indonesia is now facing two major challenges: decentralization or regional
autonomy, which has already begun, and the globalization era expected in 2020.
Both are major challenges that all Indonesian people have to prepare for and face.
The key to facing these challenges lies in strong human resources who understand
their own culture and local context.
Local wisdom can be understood as the ideas, values, and beliefs of certain
places that are wise and good and are followed by the society. The elements of a
local culture constitute local wisdom that has proven to be good and to survive
until now. It has several characteristics: (1) it can survive the pressure of other or
foreign cultures; (2) it is able to accommodate elements of other or foreign
cultures; (3) it is able to integrate elements of other or foreign cultures into the
local culture; and (4) it is able to give direction to cultural developments.
Based on the results of observation of and interviews with university
students, it could be concluded that university students kept thinking that math
was a difficult lesson; they tended to ignore the lesson, did their tasks
half-heartedly, or even copied from their friends without trying to understand the
problem first. This is far from what local wisdom teaches us. Based on an
interview on "pendidikan karakter" (character education) integrated into the local
context, teachers should develop a local context learning model, which was rarely
found, to achieve this purpose. A further finding from interviews with a teacher in
a school and with a lecturer in a private university in Palopo city was that
developing local context learning media integrated with mathematics teaching
materials needed to be realized in a learning model. It was admitted, however,
that the learning model was still limited to the development of lesson plans,
course books, and media used in the teaching and learning process. The local
context of Palopo had started to be forgotten by the students. Developing this
kind of learning media was considered a better solution since it would involve the
problems occurring in Palopo city.
Furthermore, based on interviews with some students of Cokroaminoto
Palopo University, it was found that university students were not passionate about
their college courses, especially the math course. This was because the lessons
were so hard that they had difficulty understanding them, and the learning style
was always "textbook oriented." They were also reluctant to ask about things they
did not understand. Given that condition, it can be inferred that a teacher needs to
use innovative learning activities adapted to the problems happening around the
students. In this way the students are expected to understand the lesson more
easily, so that they can be smart, creative, innovative, and helpful for future
students, and it helps teachers build students' characters in a way suitable to the
local context of Palopo city. One solution to reduce the students' problems was to
modify the cooperative learning techniques Think Pair Share and Problem-Based
Learning, as well as to develop learning media and problems that trigger the
students to develop their skills. This can be implemented in the Data Analysis
course: besides developing their analyzing skills through software, students can
also think about the suitable problems given to them. Based on the authentic
problems presented above, the researcher decided to develop Palopo local context
learning media by modifying the cooperative learning techniques Think Pair
Share and Problem-Based Learning.
2. Main Results
In this section, the researcher discusses the development process of the local
context learning media, referring to Plomp's development procedures (1997).
These procedures consist of several development steps: initial investigation,
planning, and realization of the media.
The initial investigation phase investigated the learning models and media
used by teachers and lecturers, while the planning phase covered planning the
modification of the learning model and designing the learning media. The
modifications made can be seen in Table 1.
TABLE 1. Modifications in the Learning Model (phase, indicator, teacher's activity)

Phase 1 (Student orientation to the problems): The teacher gives illustrations of
real conditions related to the material to be discussed.
Phase 2 (Telling the objectives and motivating students): The teacher states the
objectives of the lesson to be achieved and motivates the students.
Phase 3 (Introducing the main topic of the materials): The teacher introduces the
main topic of the materials to the students using a scaffolding technique.
Phase 4 (Assisting students in the Think-Pair phase): The teacher presents an
authentic problem and asks the students to think individually; afterwards the
students discuss the problem in pairs.
Phase 5 (Assisting students in the Share phase): The teacher asks the pairs to
share their thoughts with the whole class by voicing opinions and responding to
them.
Phase 6 (Initial evaluation): The teacher evaluates the learning outcomes on the
materials by giving another problem.
Phase 7 (Giving rewards): The teacher gives rewards for the results of group or
individual work.
Phase 8 (Evaluating the students' understanding): The teacher gives feedback
about the lesson learned.
In the planning phase, the prototype learning media, still limited to the Data
Analysis course, were designed. The course unit that was successfully designed
was based on the syntax of the modified learning model and also considered the
relationships among other components, such as reaction principles, social
systems, instructional effects and nurturant effects, and the modified learning
media. The presentation of materials in the student's course book was designed by
combining direct presentation with a knowledge-construction process by the
students suited to the Palopo local context. The characteristics of the book were
that it drew on the local context of Palopo and the problems occurring in Palopo
city, together with ways to analyze them. Worksheets were designed in the form
of individual tasks followed by some projects, to be given in each meeting. Those
activities were expected to train the students to get used to analyzing the
educational problems in Palopo city. The worksheets were integrated with the
developed media. To gain data about the process and results of the media
development, instruments of validity, practicality, and reliability were prepared.
The validation instruments in this planning, which served to standardize the
assessment aspects and indicators, were (1) the assessment sheet of the modified
learning model, (2) the observation form for the modified learning model
implementation, (3) the student observation form, (4) the observation form of
learning management, (5) the student response sheet, and (6) the evaluation sheet
for students. All of the above instruments consisted of instructions and contents,
the contents being based on theories that supported the objects covered by the
instruments.
The realization phase was about realizing the modified learning model,
suitable learning media, and the needed learning instruments. The products of this
process were (1) the Palopo local context learning media that suited the results of
the modified learning model, and (2) the instruments of validity, practicality, and
reliability. The resulting product was named Prototype-1 (the modified learning
model, media, and instruments). The description of Prototype-1, as the result of
the development in the realization phase, is shown in Table 2.
TABLE 2. The Modification Illustrations in the Learning Model
(Each item lists the teacher's activity; the students' activity; the time; and the mode
of interaction.)

Initial Activity

Phase 1 (Student orientation to the problem):
1. Presenting pictures about the materials to be discussed; students give answers
based on their understanding; 5 minutes; asking and answering, discussion.
2. Presenting contents related to the materials; students heed the information
presented about the related issues in the Data Analysis course; 5 minutes; asking
and answering, discussion.

Phase 2 (Presenting the objectives and motivating the students):
1. Presenting the learning objectives, motivating students, and managing the class
to resemble a real-life situation; students heed the learning objectives, especially
the application of each Data Analysis material; 5 minutes; lecturing.

Phase 3 (Presenting the main topic of the materials):
1. Providing scaffolds about the concepts to be discussed; students give answers
related to the materials; 10 minutes; asking and answering, discussion.
2. Inviting students to discuss the terms from the given materials; students heed
the information about the terms demand function, supply function, and
equilibrium price; 10 minutes; asking and answering, discussion.

Phase 4 (Assisting students in the Think-Pair phase):
1. Providing the problems to be discussed, namely authentic questions related to
per-unit taxes; students discuss the given problems in pairs; 10 minutes; think,
pair.

Phase 5 (Assisting students in the Share phase):
1. Directing students to share the results of the pair discussion; students share the
results of the pair discussion; 2 minutes; share.

Phase 6 (Initial evaluation):
1. Directing the students to present the results of the pair discussion; students
present the results of the discussion; 10 minutes; asking and answering,
discussion.
2. Assessing the presentation of the discussion results; students heed the
comments of other students; 2 minutes; asking and answering, discussion.

Phase 7 (Giving rewards):
1. Giving rewards for the students' learning outcomes; students listen; 2 minutes.

Phase 8 (Evaluating students' understanding):
1. Distributing worksheets containing questions related to the materials; students
finish the worksheets; 10 minutes.
2. Directing the students so that they can apply the problem in a real situation;
students listen to the given information.
3. Reminding the students to study the materials to be discussed in the next
meeting; students listen to the information about the next meeting; 5 minutes for
items 2 and 3 together.
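To illustrate the kind of authentic problem referenced in Phases 3 and 4 (demand
function, supply function, equilibrium price, and per-unit taxes), the following is a
minimal sketch in Python; the linear demand and supply forms and all coefficient
values are our own illustrative assumptions, not taken from the paper:

    # Illustrative only: equilibrium of linear demand P = a - b*Q and
    # supply P = c + d*Q, before and after a per-unit tax t that shifts
    # the supply curve up to P = c + t + d*Q.
    def equilibrium(a, b, c, d, tax=0.0):
        q = (a - c - tax) / (b + d)   # solve a - b*q = c + tax + d*q
        p = a - b * q
        return q, p

    print(equilibrium(100, 2, 10, 1))          # (30.0, 40.0) before the tax
    print(equilibrium(100, 2, 10, 1, tax=9))   # (27.0, 46.0) after the tax

A Think-Pair task in Phase 4 could, for example, ask students to compute and
interpret the change in equilibrium price and quantity caused by the tax.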
The realization phase was followed by validity testing of the developed
learning media and of the related instruments.
Discussion of Results
In this part, the researcher discusses the achieved objectives and particular
findings.
a. The Achieved Objectives
1) Validity
Related to the quality of the developed learning media, the researcher found
that the average validity of the media met the validity indicator that had been set
beforehand. The developed media were based on the Palopo local context.
Regarding the result of the validity test, it could be concluded that Prototype-1
(modified model, media, and instruments) had met all criteria of validity. In the
initial testing process (test of validity) the modified model was judged valid in all
aspects of the model; the supporting theories, however, were still considered
insufficient to support the modified model. Some validators suggested that the
modified model use constructivism theories as the theoretical basis of this study,
so that the model would not seem to focus on lecturers' thoughts only. The
addition of constructivism theories as a theoretical basis had substantial impacts
on the modified model, the supporting media, and the research instruments. After
some revisions, the modified model was finally applicable to math teaching and
learning activities that focus on both the lecturer and the students. The learning
media, previously dominated by direct presentation, were revised to present a
balanced combination of direct presentation and student knowledge construction.
2) Practicality
Theoretically, this modified model was verified by the experts and judged
worthy of application. Empirically, based on observation of the implementation
of the model in the first trial, the model had met the criteria of practicality, and its
practicality increased in the second and third trials. On closer observation,
however, there were some aspects of each component that needed to be improved
in the second trial:
1) For the syntax component, the learning phase that still needed more of the
lecturer's attention was Phase 1: Orientation to the problems;
2) For the social system component, the aspects that had not yet been
implemented well were (1) the students' autonomy in learning, especially when
the students constructed their knowledge and did the worksheet, and (2) the
chances given to the students to be actively involved in teaching and learning
activities, and the rewards given;
3) For the reaction principle aspect, the not-yet-well-implemented aspect was
given more emphasis in the second trial, and this provided positive reinforcement.
A factor assumed to cause the unimplemented aspects of the modified
learning model in the first trial was that the lecturers had difficulty managing the
class, since students needed more time to follow the syntax of the modified
model.
3) Reliability
The reliability of the modified learning model was determined by four
components: the achieved learning outcomes, students' activities, the lecturer's
ability to manage the lesson, and students' responses to the modified learning
model and the local context learning media. Of those components, three had been
fulfilled: the lecturer's ability to manage the class using the modified model, the
achieved learning outcomes, and students' responses to the modified model and
learning media. The characteristics of the modified model were that, besides the
knowledge-construction activity, there were student activities in thinking through
the orientation to problems, in the evaluation of students' understanding, and in
other problem-solving activities. If a student was less active in the teaching and
learning activities, the student's mastery of the materials would not be optimal.
To improve the results of the second trial, it was suggested that (1) the
lecturer frequently support the students so that they would be active in thinking,
sharing with friends, completing the course book, and finishing the worksheets,
and (2) the learning emphasis be placed on the achieved learning outcomes in the
form of mastery of the math teaching materials.
b. Specific Findings
There were some specific findings considered important in this research. In
the trials, the majority of the students successfully identified educational
problems occurring in Palopo city; they were able to present the results of their
investigations, analyze independently, and share with other groups. Those
findings illustrate how a learning process in class can change the students'
teaching and learning paradigm: in the beginning the students only received
information, but in this modified model the students' characters were formed and
their ways of thinking when solving problems were also developed. Thus, in the
modified learning model the students took part in collecting the data, interpreting
the results of their investigations, and presenting them.
These facts show the importance of integrating local context education that
can shape the students' characters and positively affect the development of their
abilities. Even though Data Analysis is an exact course, it proved possible to
integrate the Palopo local context into the learning process. The local context,
integrated with a learning model whose syntax had been modified, was expected
to develop high achievers' cognitive and affective domains.
The learning process presented local problems; the students then looked for
solutions by investigating, collecting information, and rechecking the solutions
through analysis. Through these activities, the students could understand the
concept, apply it in their daily lives, and improve their abilities. The results of this
study showed that teaching with the modified learning model integrated with the
local context significantly affected learning outcomes. The learning process
emphasized the students' ability to construct their own ideas when solving
problems. From that problem-solving process, the students could develop quality
characters, such as being critical, careful, responsible, logical, persistent,
respectful, honest, and cooperative. In addition, this learning model supported the
implementation of integrated local values in several courses in the math education
department. According to the researcher, this part of the study constitutes specific
findings, since this aspect differentiates the local context learning media from
previously existing learning models.
3. Concluding Remarks
Based on the discussion of the trials with students in the Data Analysis
course, conducted by developing Palopo local context learning media through
valid, practical, and reliable development processes, it can be concluded that:
1. Based on the results of the validity test and the trials, the Palopo local context
learning media obtained by modifying the cooperative learning techniques Think
Pair Share and Problem-Based Learning (course unit, course book, students'
worksheets, and learning outcome evaluation) met the criteria of validity, so they
can be implemented by math lecturers in Palopo city.
2. The developed media met the criteria of practicality; the aspects of learning
management activities were fulfilled and classified in the "good" category.
3. The product (learning media) met the reliability criteria with respect to the
learning outcomes, students' activities, and students' responses to teaching and
learning activities.
References
[1] R. I. Arends, Learning to Teach, 9th Ed. (McGraw-Hill Companies, Inc., 2008)
[2] E. E. Holmes, New Directions in Elementary School Mathematics: Interactive
Teaching and Learning (New Jersey, Prentice Hall, Inc., 1995)
[3] B. Joyce, M. Weil, and B. Showers, Models of Teaching Fourth Edition
(Boston, Allyn & Bacon, 1992)
[4] N. Nieveen, Prototyping to Reach Product Quality, in Jan van den Akker,
R.M. Branch, K. Gustafson, N. Nieveen & Tj. Plomp (Eds), Design
Approaches and Tools in Education and Training (pp. 125-135), Kluwer
Academic Publishers, Dordrecht, the Netherlands, (1999)
[5] Nurdin, “Model Pembelajaran Matematika Untuk Menumbuhkan Kemampuan
Metakognitif”, Disertasi, Universitas Negeri Surabaya. (unpublished, 2007)
[6] N. Syahrul, "Peranan Pendidikan Budaya Karakter Bangsa Sebagai Upaya
Pembentukan Watak Kepribadian Peserta Didik", (presented at Seminar
Pendidikan Internasional, 2011)
[7] Trianto, Model-Model Pembelajaran Inovatif Berorientasi Konstruktivistik
(Jakarta, Prestasi Pustaka, 2007)