
Statistics in the Business

Statistics TEPE Group 1


5910751485 Warapong
5910755627 Techin
5910756062 Chaiyatorn
Company Profile
Batex International Co., Ltd
บริษัท บาเท็กซ อินเตอรเนชั่นแนล จํากัด
Founded: 2001
Industry: B2B Chemical Trade, Import, Export
Plastics industry, Industrial detergents

Revenue (2018): 75 Million Baht


Employees: 130 (Office) 720 (Factory)
Profile
Name: Ms. Nutkamol Rerkkasemsan
Education: BSc. Chemistry @Thammasat (2002)
MSc. Chemical Engineering @KMUTNB (2004)
MSc. Business Statistics @STOU (2005)
Position: Head of Sales at Batex International Co., Ltd
Job Description: - Manage the sales workforce
- Perform business analytics, market research
to predict and optimize sales
The use of Statistics in Batex Int’l
Business Statistics is extensively used to create business models both internally and externally.
-Customer Segmentation
-Sales Forecasting
-Enterprise Optimization
-Marketing Analytics
-Pricing Analytics
-Risk and Credit
-Supply Chain
-HR Analytics
-Transportation Analytics
Customer Segmentation

The company wants to divide its customer base of 3,000+ into five clusters to better suit the customers' needs.

● Factories and suppliers are surveyed every few years
● Customers are divided into clusters with different characteristics
● Sales techniques are customized for each cluster
Customer Segmentation 1. Obtaining the data

The company used its connections to survey over 3,000 companies that buy chemicals or might be interested in buying them.

Questions include:
1) Attitude towards chemical trade and use of chemicals (29 Q’s)
2) Demographics (5 Q’s)
3) Demand price of chemicals (8 Q’s)
Customer Segmentation 2. Factor Analysis
Factor analysis is used to simplify the 29 attitude questions into a smaller number of meaningful factors.

A correlation matrix is used to see whether the answers to two questions are correlated or unrelated. [close to 1 = strongly correlated, close to 0 = uncorrelated]

[Figure: correlation matrix of the survey questions (Q1, Q2, Q3, ...)]
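As a minimal sketch of how such a correlation matrix could be computed, the Python below calculates the Pearson correlation for every pair of questions. The answers are hypothetical and are not taken from the Batex survey; the function names are illustrative only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """columns: dict mapping question name -> list of numeric answers."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical answers from six respondents (assumption, not Batex data).
answers = {
    "Q1": [1, 2, 3, 4, 5, 5],
    "Q2": [1, 2, 2, 4, 5, 4],   # tends to move with Q1
    "Q3": [5, 4, 3, 2, 1, 1],   # exactly 6 - Q1, so perfectly negatively correlated
}
matrix = correlation_matrix(answers)
```

Note that a correlation can also be negative: a value close to -1 means the two questions are strongly related, but in opposite directions.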
Customer Segmentation 2. Factor Analysis
The eigenvalues and the share of variance explained are calculated for each factor. A scree plot is drawn to see how many factors are needed to explain the behavior of customers.

5 factors would explain 50% of the customers' behavior, but the team decided to use 10 factors, as they explain 66.61% of the behavior.
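The arithmetic behind this choice can be sketched as follows: each eigenvalue is divided by the sum of all eigenvalues, and the shares are accumulated until a target is reached. The eigenvalues below are hypothetical (the slides do not list the actual ones), and the function names are illustrative.

```python
def cumulative_variance(eigenvalues):
    """Cumulative share of total variance, largest eigenvalue first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    shares, running = [], 0.0
    for value in ordered:
        running += value
        shares.append(running / total)
    return shares

def factors_needed(eigenvalues, target):
    """Smallest number of factors whose cumulative share reaches target."""
    for count, share in enumerate(cumulative_variance(eigenvalues), start=1):
        if share >= target:
            return count
    return len(eigenvalues)

# Hypothetical eigenvalues for a 6-question survey (assumption, not Batex
# data). They sum to 6, as the eigenvalues of a 6x6 correlation matrix do.
eigenvalues = [2.4, 1.5, 0.9, 0.6, 0.4, 0.2]
shares = cumulative_variance(eigenvalues)
half = factors_needed(eigenvalues, 0.50)
three_quarters = factors_needed(eigenvalues, 0.75)
```

Raising the target from 50% to 75% requires more factors, which is the same trade-off the team faced when choosing 10 factors over 5.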
Customer Segmentation 3. Extracting Factors
The company then combines the 29 questions into the 10 factors by grouping the most closely related questions together. An extraction and rotation analysis based on the factor analysis table is used. Each question within a factor is weighted by the company.

Factor   Questions (weight)
1        1 (30%), 2 (25%), 17 (25%), 26 (20%)
2        3 (40%), 7 (40%), 16 (20%)
3        4, 18, 20 (33.3% each)
4        5, 19 (50% each)
5        6, 21, 22 (33.3% each)
6        8, 14, 23, 24 (25% each)
7        10 (50%), 15 (25%), 25 (25%)
8        9, 27, 28, 29 (25% each)
9        11
10       12, 13 (50% each)
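A minimal sketch of how the weights could turn survey answers into factor scores, assuming each score is a weighted sum of the answers. The weights for factors 1 and 2 come from the table above; the answers and the 1 to 5 answer scale are assumptions made for illustration.

```python
# Weights for factors 1 and 2, taken from the table above
# (question number -> weight). The other factors follow the same pattern.
FACTOR_WEIGHTS = {
    1: {1: 0.30, 2: 0.25, 17: 0.25, 26: 0.20},
    2: {3: 0.40, 7: 0.40, 16: 0.20},
}

def factor_scores(answers, factor_weights=FACTOR_WEIGHTS):
    """Weighted sum of one respondent's answers for each factor.

    answers: dict mapping question number -> numeric answer.
    """
    return {
        factor: sum(weight * answers[q] for q, weight in weights.items())
        for factor, weights in factor_weights.items()
    }

# Hypothetical answers from one company on an assumed 1-5 scale.
company = {1: 4, 2: 2, 17: 4, 26: 5, 3: 3, 7: 5, 16: 1}
scores = factor_scores(company)
```

With these answers, factor 1 works out to 0.30*4 + 0.25*2 + 0.25*4 + 0.20*5 = 3.7 and factor 2 to 0.40*3 + 0.40*5 + 0.20*1 = 3.4.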
Customer Segmentation 3. Extracting Factors
The survey data is then rewritten in terms of the 10 factors.
Example data of 5 companies:
Customer Segmentation 4. Clustering
A dendrogram is drawn to decide on the number of clusters. It shows how differently customers answer the questions. The R command hclust(*, "ward") is used.

[Figure: dendrogram of customers]

Based on the dendrogram, the company decides to divide the customers into five clusters based on how similar their answers are.
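The company used R for this step. Purely as an illustration of the underlying idea, the Python sketch below performs agglomerative clustering with Ward's minimum-variance criterion: at each step it merges the two clusters whose merger increases the within-cluster sum of squares the least. It is not a reproduction of the R output, and the data points are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clusters(points, k):
    """Merge the cheapest pair of clusters until only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: ward_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Hypothetical scores on two factors for six customers (not Batex data).
customers = [(1.0, 1.2), (1.1, 0.9), (4.0, 4.1), (4.2, 3.9), (8.0, 1.0), (7.9, 1.3)]
groups = ward_clusters(customers, k=3)
```

Recording the cost of each merge, in order, gives the heights that a dendrogram displays; a large jump in cost suggests a natural place to stop merging.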
Customer Segmentation 5. Summarizing Data
The company summarizes the survey results for each cluster. The company can now train its salespeople according to the results.

[Figure: survey results by cluster (Q1 to Q6)]
Sales Forecasting Sales Statistics

The marketing manager wants to forecast total sales and to understand which factors influence them, in order to:
(1) plan the sales tactics
(2) estimate the staffing required
(3) allocate money to advertising and promotions
(4) make better policy decisions concerning price, advertising, and product development spending
Sales Forecasting Sales Statistics
Customer data is measured against the sales volume of chemicals.
Sales is the dependent variable, and the other variables are the independent variables.
Sales Forecasting 1. Correlation
The company measures which variables are most strongly correlated with sales.
R-squared value: [1 = most correlated, 0 = uncorrelated]
Sales Forecasting 1. Correlation
Graphs are also used alongside the R-squared values to see the relationship.
This graph plots the dependent variable against the first independent variable (SALES vs. PDI).

[Figure: graph of SALES against PDI]

r² = 0.7301
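A minimal sketch of how an r² value for one independent variable can be obtained: fit a least-squares line and compare the unexplained variation with the total variation in sales. The PDI and SALES figures below are hypothetical, so the result will not match the 0.7301 reported for the company's data.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return slope, mean_y - slope * mean_x

def r_squared(x, y):
    """Share of the variation in y explained by the fitted line."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical PDI and SALES figures (assumption, not the Batex data set).
pdi = [10, 20, 30, 40, 50]
sales = [31, 48, 75, 88, 112]
slope, intercept = fit_line(pdi, sales)
r2 = r_squared(pdi, sales)
```

Repeating the calculation for each independent variable gives a simple ranking of which variables are most closely related to sales.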
Sales Forecasting 2. Regression Model

All the independent variables are combined to form a regression model so that the company can predict sales from the variable inputs. Model:
SALES =
b0 + b1*PDI + b2*DEALS + b3*PRICE + b4*RD + b5*INVEST + b6*ADVERTIS + b7*EXPENSE + b8*TOTINDAD
Sales Forecasting 2. Regression Model
SALES = 3027.6336 + 3.3723*PDI + 4.6953*DEALS - 18.1112*PRICE - 9.9033*RD + 1.6895*INVEST + 8.2907*ADVERTIS
+ 4.4434*EXPENSE - 0.4427*TOTINDAD
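The fitted equation can be evaluated directly. The sketch below copies the intercept and coefficients from the model above (using the spelling ADVERTIS from the general model); the input values are hypothetical, since the slides do not give the units or typical ranges of the variables.

```python
# Intercept and coefficients copied from the fitted model above.
INTERCEPT = 3027.6336
COEFFICIENTS = {
    "PDI": 3.3723,
    "DEALS": 4.6953,
    "PRICE": -18.1112,
    "RD": -9.9033,
    "INVEST": 1.6895,
    "ADVERTIS": 8.2907,
    "EXPENSE": 4.4434,
    "TOTINDAD": -0.4427,
}

def predict_sales(inputs):
    """Evaluate the regression equation for one set of variable inputs."""
    return INTERCEPT + sum(COEFFICIENTS[name] * inputs[name] for name in COEFFICIENTS)

# Hypothetical inputs, chosen only to show the arithmetic.
baseline = {name: 0.0 for name in COEFFICIENTS}
more_deals = dict(baseline, DEALS=10.0)
effect_of_deals = predict_sales(more_deals) - predict_sales(baseline)
```

Each coefficient is the predicted change in SALES for a one-unit change in that variable with the others held fixed, so ten extra units of DEALS add 10 * 4.6953 = 46.953 to the prediction.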
Sales Forecasting 3. Checking Model

A residual plot is drawn to check that the model is valid across the entire data set.
Randomly scattered residuals = the model is probably okay.
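To illustrate why the residuals are inspected, the sketch below fits a straight line to hypothetical data that actually follows a curve. The residuals (actual minus predicted) then show a clear pattern instead of random scatter, which signals that the model is a poor fit. The data and function names are assumptions made for illustration.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return slope, mean_y - slope * mean_x

def residuals(x, y):
    """Actual value minus the value predicted by the fitted line."""
    slope, intercept = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical data: y grows with the square of x, so a straight line
# is the wrong model.
x = [1, 2, 3, 4, 5, 6, 7]
curved = [a ** 2 for a in x]
res = residuals(x, curved)
```

Here the residuals are positive at both ends and negative in the middle, a U shape. A valid model would leave residuals with no such structure.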
HR Analytics

The company has salespeople who travel around the country.

The biggest problem is attrition (loss of motivation and resignation), which is damaging for the company.

The company produces a model that predicts whether any given employee is at risk of resigning, in order to better understand why employees want to leave.



HR Analytics 1. Collecting Data

The following data is collected for every employee:

AGE, ATTRITION (1 = resigning, 0 = not resigning), BUSINESSTRAVEL (1 = none, 2 = frequently, 3 = rarely), DAILYRATE, DEPARTMENT,
DISTANCEFROMHOME, EDUCATION, EDUCATIONFIELD, SATISFACTION,
GENDER, HOURLYRATE, JOBROLE, MARITAL,
NUMCOMPANYWORKED,
OVERTIME, PERCENTSALARYINCREASE, PERFORMANCE, TOTAL WORKING YRS,
TRAININGTIME, WORK-LIFE-BALANCE, YRS-AT-BATEX,
YRS-CURRENT-ROLE,
YEAR-WITH-CURRENT-MANAGER, YEARS-SINCE-PROMOTION, STD-HOURS
HR Analytics 1. Collecting Data

The data is divided into attrition (employees who have resigned or have considered resigning)
and non-attrition (employees who do not want to resign).

The attrition data is then examined.


HR Analytics 2. Logistic Regression

The logistic regression takes the variables and outputs a probability between 0 and 1
(close to 1 = resigning, close to 0 = not resigning).
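A minimal sketch of how a logistic regression turns employee variables into a probability: a weighted sum of the inputs is passed through the logistic function. The coefficients below are invented for illustration and are not the company's fitted model; only the variable names come from the data list.

```python
from math import exp

def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))

# Hypothetical coefficients (assumption). They are NOT the fitted Batex model.
INTERCEPT = -2.0
COEFFICIENTS = {"OVERTIME": 1.2, "DISTANCEFROMHOME": 0.05, "SATISFACTION": -0.6}

def attrition_probability(employee):
    """Probability that ATTRITION = 1 for one employee record."""
    z = INTERCEPT + sum(COEFFICIENTS[k] * employee[k] for k in COEFFICIENTS)
    return sigmoid(z)

# Hypothetical employee: works overtime, lives 20 units from the office,
# and reports a satisfaction score of 2.
employee = {"OVERTIME": 1, "DISTANCEFROMHOME": 20, "SATISFACTION": 2}
p = attrition_probability(employee)
p_without_overtime = attrition_probability(dict(employee, OVERTIME=0))
```

A cut-off (for example 0.5) is then needed to turn the probability into a resigning or not-resigning prediction; the slides do not say which cut-off the company uses.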
HR Analytics 3. CART Analysis

This is similar to a probability tree diagram: it takes in each of the variables and outputs
the confidence level that the employee will leave. [orange = stay, blue = leave]
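As a rough sketch of how a classification tree chooses one of its branches, the code below searches for the threshold on a single variable that best separates leavers from stayers, using the Gini impurity that CART commonly uses. The records are hypothetical, and a full tree would repeat this search over every variable and within every branch.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 = pure, 0.5 = evenly mixed)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Threshold on one variable that minimizes the weighted Gini impurity."""
    best = (None, float("inf"))
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical records: years at the company and ATTRITION (1 = resigning).
years = [1, 1, 2, 3, 5, 7, 8, 10]
attrition = [1, 1, 1, 0, 0, 0, 0, 0]
threshold, impurity = best_split(years, attrition)
```

In this made-up data every employee with 2 years or fewer resigns and every other employee stays, so the split at 2 years leaves both branches pure.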
HR Analytics 4. Checking Model

The HIT Analysis is used to estimate the probability of correctly identifying whether an
employee will leave or not.
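The slides do not define the HIT Analysis in detail. Assuming it refers to a hit rate, meaning the share of employees whose outcome the model predicted correctly, it can be sketched from a confusion matrix as below. The outcomes and predictions are hypothetical.

```python
def confusion_matrix(actual, predicted):
    """Counts of true/false positives and negatives for 0/1 labels (1 = resigning)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            counts["tp" if a == 1 else "fp"] += 1
        else:
            counts["tn" if a == 0 else "fn"] += 1
    return counts

def hit_rate(actual, predicted):
    """Share of employees whose outcome was predicted correctly."""
    c = confusion_matrix(actual, predicted)
    return (c["tp"] + c["tn"]) / len(actual)

# Hypothetical outcomes and model predictions for eight employees.
actual = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 0, 1, 1, 0, 0, 0, 0]
matrix = confusion_matrix(actual, predicted)
rate = hit_rate(actual, predicted)
```

Here six of the eight predictions are correct, a hit rate of 0.75. The confusion matrix also shows which kind of error was made: one employee was wrongly flagged as leaving and one leaver was missed.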

These are only three of the methods the company uses. There are around 15 in total,
including ROC, logistic regression, confusion matrix, and gains analysis.

The company is, however, moving towards a machine learning model.


The Importance of Statistics
● Needed in every part of the business
From HR to manufacturing to sales to quality control
● Allows the company to simplify big data into simple analysis
● Aids in quick decision making with a high confidence level
Thank You
