TYPES OF DATA
- Experimental and Observational Data
- Cross-sectional and Time-Series Data
- Panel (Longitudinal) Data
Experimental Data: data obtained from experiments designed to evaluate a treatment or policy, or to investigate a causal effect.
Observational Data: data obtained outside an experimental setting.
Cross-sectional Data: data on different entities for a single time period.
Time-Series Data: data for a single entity collected at multiple time periods.
Panel (Longitudinal) Data: data for multiple entities, each observed at multiple time periods.
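As a rough illustration of how the three layouts differ, the sketch below stores made-up household incomes in each form. The household names, years and amounts are hypothetical and are not real statistics. A panel is taken here to mean several entities, each observed in several periods.

```python
# Hypothetical numbers, for illustration only (not real statistics).

# Cross-sectional: different entities, one time period.
cross_section = {"Household A": 3800, "Household B": 4500, "Household C": 5200}

# Time series: one entity, several time periods.
time_series = {2009: 4100, 2010: 4300, 2011: 4500}

# Panel: several entities, each observed in several time periods.
panel = {
    ("Household A", 2010): 3700, ("Household A", 2011): 3800,
    ("Household B", 2010): 4400, ("Household B", 2011): 4500,
}

# A panel has both a cross-sectional and a time dimension.
entities = {entity for entity, _ in panel}
periods = {year for _, year in panel}
print(len(entities), "entities observed over", len(periods), "periods")
```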
CONCEPT I: DATA GENERATION PROCESS AND PROBABILITY DISTRIBUTION
Stochastic process
- Classic Example: Throwing a die
- Economic Example: GDP
A probability distribution:
- Specifies all possible values, summarized by parameters: the mean (expected value) and the variance (dispersion)
- Assigns each value a certain probability of being observed
- The realized value is the observed value, i.e. the data point we collect
Classic Example: Throwing a die
- Possible values: 1, 2, 3, 4, 5 and 6
- Each has a probability of 1/6 of being observed
- After the die is thrown, we observe only one value. That is the data point: the realized (observed) value.
- The outcome of the throw, before it is observed, is a RANDOM variable; the observed number is one realization of it.

Economic Example: GDP
- For a given year (say 2011), there are numerous possible levels of national production. Mean = E(GDP)
- Each range or level of production has a certain probability of occurrence.
- After the process is completed (after 2011), the level of production is measured and reported. That is the realized value.
- Unlike the classic example, GDP is continuous.
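The die example can be sketched in a few lines of Python. This is only an illustration: the seed and the number of simulated throws are arbitrary choices, not part of the original example. It computes the mean and variance implied by the distribution, then draws one realized value.

```python
import random
from fractions import Fraction

# The distribution: six possible values, each with probability 1/6.
values = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)

# Parameters of the distribution.
mean = sum(v * prob for v in values)                     # expected value
variance = sum((v - mean) ** 2 * prob for v in values)   # dispersion

# One throw gives one realized value (the data point).
rng = random.Random(2011)  # arbitrary seed so the run is repeatable
realized = rng.choice(values)

# Many throws: the average of realized values approaches the mean.
throws = [rng.choice(values) for _ in range(10_000)]
average_of_throws = sum(throws) / len(throws)

print("mean:", mean, "variance:", variance)
print("one realized value:", realized)
print("average of 10,000 throws:", round(average_of_throws, 3))
```

The mean (7/2) and variance (35/12) belong to the distribution; the realized value is just one draw from it.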
Characteristics of the Normal Distribution
- Bell-shaped
- Symmetric: mean = mode = median
- Area (probability): roughly 68% within one standard deviation of the mean, 95% within two standard deviations
- Fully described by its mean and variance
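The 68% and 95% figures are properties of the normal distribution and can be checked with Python's standard library. The sketch uses the standard normal (mean 0, standard deviation 1); the same proportions hold for any mean and variance.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0.0, sigma=1.0)

# Probability of falling within one and two standard deviations of the mean.
within_1 = z.cdf(1) - z.cdf(-1)
within_2 = z.cdf(2) - z.cdf(-2)

print("within 1 standard deviation:", round(within_1, 4))
print("within 2 standard deviations:", round(within_2, 4))
```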
Hypothesis: a supposition made as a basis for reasoning, or as a starting point for further investigation from known facts.
Hypothesis: a statement made concerning a particular aspect or aspects of the population under study.
Hypothesis testing is an essential part of inferential statistics: generalizing from the statistical values computed from the data to the population values. An example of a population value is the mean income of households in Putrajaya.
Why hypothesis testing?
- the value may be of interest for decision making
- the population value is unknown
- estimating the value using the sample is the only way
- however, in making inferences, or in linking the sample mean to the population mean, a test is needed; no direct inferences can be made
Hypotheses are formulated as a NULL HYPOTHESIS and an ALTERNATIVE (RESEARCH) HYPOTHESIS.
Null Hypothesis: an assertion that we hold as true unless we have sufficient statistical evidence to conclude otherwise.
Alternative Hypothesis:
- the negation of the null hypothesis
- a research hypothesis for which we seek supporting statistical evidence
- it can be DIRECTIONAL or EXPLORATORY
Directional: the hypothesized population value is LESS (or MORE) than the value stated under the null. Example: Equity ownership of the Malays is less than 30%.
Exploratory: the hypothesized population value is DIFFERENT from the null value. It is not known from theory whether it is LESS or MORE than the null value. Example: Equity ownership of the Malays is different from 30%.
In testing, the null hypothesis is given the benefit of the doubt. That is, the statement is held true until there is sufficient evidence to reject it.
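In symbols, the two equity ownership examples might be written as follows. The symbol p for the population share of equity ownership is an assumption introduced here for illustration; the null in the directional case is written as the complement of the alternative.

```latex
\begin{align*}
\text{Directional:} \quad & H_0: p \ge 0.30 \quad \text{vs.} \quad H_1: p < 0.30 \\
\text{Exploratory:} \quad & H_0: p = 0.30 \quad \text{vs.} \quad H_1: p \ne 0.30
\end{align*}
```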
Problem: suppose a company plans to build a superstore in Putrajaya. For the store to be viable, the mean monthly income of households in Putrajaya must be more than RM4,000.
Question: is the Putrajaya household mean income more than RM4,000?
The statement that the Putrajaya household mean income is more than RM4,000 CANNOT be the null hypothesis. If it were, there would be no need for testing, since the null is assumed to be TRUE. In other words, it must be the alternative hypothesis.
The null hypothesis would be: the mean income is less than or equal to RM4,000.
Note that the alternative hypothesis is directional.
The burden of proof is on the researcher to provide evidence for the alternative hypothesis.
ESTIMATION: An estimator is a formula used to estimate the population parameter Q. An example is the mean income of the households in the sample, i.e. the sample mean, X.
TESTING:
- How to use the sample mean to make inferences regarding Q.
- This requires the distribution of X, i.e. the distribution of the sample mean.
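To see why the sample mean has a distribution of its own, the sketch below draws many samples from a known population and records each sample's mean. The population (a fair die), the sample size of 30, the number of replications and the seed are all assumptions chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)   # arbitrary seed so the run is repeatable
n = 30                   # observations per sample (assumed)
replications = 5_000     # number of repeated samples (assumed)

# Each replication draws a new sample and yields a different sample mean.
sample_means = [
    statistics.fmean(rng.randint(1, 6) for _ in range(n))
    for _ in range(replications)
]

# Population parameters of a fair die: mean 3.5, variance 35/12.
population_mean = 3.5
theoretical_se = math.sqrt(35 / 12) / math.sqrt(n)

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print("mean of sample means:", round(mean_of_means, 3))
print("std. dev. of sample means:", round(sd_of_means, 3))
print("sigma / sqrt(n):", round(theoretical_se, 3))
```

The sample means cluster around the population mean, with a spread close to the population standard deviation divided by the square root of n. A test relies on that spread.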
Problem: inflation is one of the main variables that policymakers and economists focus on. In past years, the role of money supply growth in affecting inflation has received much attention. However, there seems to be increasing concern that inflation in Malaysia is imported because of its pegged exchange rate system. Perhaps changes in US inflation and in the value of the Ringgit exert a significant effect on inflation.
Null: US inflation has no impact on Malaysia's inflation.
Alternative: US inflation has a significant impact on Malaysia's inflation.
The same can be stated for the effect of exchange rate changes.
(FLOWCHART in Session II)
Elements
- Null and Alternative Hypotheses
- Test Statistics
- Critical Values of the Test and Significance Level
- One-Tailed vs. Two-Tailed Tests
- Type I Error and Type II Error
HYPOTHESIS TESTING
STEP 1: STATE HYPOTHESES
Null Hypothesis: a statement (assertion) about the value of the population parameter. We hold it as TRUE unless we have sufficient statistical evidence to conclude otherwise.
Alternative Hypothesis: a statement that is accepted if the sample data provide enough evidence that the null hypothesis is false. It is the complement of the null.
HYPOTHESIS TESTING
STEP 2: SELECT A LEVEL OF SIGNIFICANCE
STEP 3: SELECT THE TEST STATISTIC
Level of Significance: the probability of rejecting the null hypothesis when it is true.
Errors in testing:
(a) Type I Error: rejecting the null hypothesis when it is TRUE
(b) Type II Error: failing to reject (accepting) the null hypothesis when it is FALSE
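A small simulation can make the two error types concrete. This is a sketch under stated assumptions: incomes are normally distributed with a known standard deviation of RM1,000, each sample has 50 households, the test is one-tailed at the 5% level against a null value of RM4,000, and the alternative scenario uses a true mean of RM4,200. None of these numbers comes from real data.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05     # level of significance (assumed)
MU_0 = 4000.0    # value of the mean under the null hypothesis
SIGMA = 1000.0   # population standard deviation, assumed known
N = 50           # households per sample (assumed)
TRIALS = 4_000   # number of simulated samples per scenario

# One-tailed test: reject when z exceeds this critical value.
critical_z = NormalDist().inv_cdf(1 - ALPHA)

def rejects_null(true_mean, rng):
    """Draw one sample and report whether the test rejects the null."""
    sample = [rng.gauss(true_mean, SIGMA) for _ in range(N)]
    z = (fmean(sample) - MU_0) / (SIGMA / N ** 0.5)
    return z > critical_z

rng = random.Random(42)  # arbitrary seed so the run is repeatable

# Null is TRUE (true mean = 4000): every rejection is a Type I error.
type_i_rate = sum(rejects_null(MU_0, rng) for _ in range(TRIALS)) / TRIALS

# Null is FALSE (true mean = 4200): every non-rejection is a Type II error.
type_ii_rate = sum(not rejects_null(4200.0, rng) for _ in range(TRIALS)) / TRIALS

print("Type I error rate:", round(type_i_rate, 3))
print("Type II error rate:", round(type_ii_rate, 3))
```

The Type I error rate should come out close to the chosen significance level. The Type II error rate depends on how far the true mean is from the null value and on the sample size.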
Test Statistic: a value, determined from sample information, used to decide whether to reject the null hypothesis.
HYPOTHESIS TESTING
STEP 4: FORMULATE THE DECISION RULE
STEP 5: MAKE A DECISION
Critical Value: the dividing point between the region where the null hypothesis is rejected and the region where it is not rejected.
P-value: the probability of observing a sample value as extreme as, or more extreme than, the value observed, given that the null hypothesis is true.
Reject the null hypothesis if the test statistic exceeds the critical value (in ABSOLUTE VALUE for a two-tailed test),
OR
reject the null hypothesis if the p-value is less than the significance level.
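The five steps can be traced through the Putrajaya income problem with a short sketch. The sample size, sample mean and sample standard deviation below are hypothetical, and the test uses a large-sample normal (z) approximation rather than a t distribution; both are assumptions made for illustration only.

```python
from statistics import NormalDist

# Hypothetical sample summary (illustrative values, not real survey data).
n = 100               # households sampled
sample_mean = 4200.0  # sample mean monthly income, RM
sample_sd = 900.0     # sample standard deviation, RM

# Step 1: state hypotheses.
# Null: mean income <= 4000. Alternative: mean income > 4000 (directional).
mu_0 = 4000.0

# Step 2: select a level of significance.
alpha = 0.05

# Step 3: compute the test statistic (large-sample z approximation).
standard_error = sample_sd / n ** 0.5
z = (sample_mean - mu_0) / standard_error

# Step 4: formulate the decision rule (one-tailed, upper tail).
critical_value = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)

# Step 5: make the decision. Both rules lead to the same conclusion.
reject_by_critical_value = z > critical_value
reject_by_p_value = p_value < alpha

print("z =", round(z, 3), "critical value =", round(critical_value, 3))
print("p-value =", round(p_value, 4))
print("reject the null:", reject_by_critical_value)
```

With these made-up numbers the test statistic exceeds the critical value and the p-value falls below 5%, so the null would be rejected. Because the test is one-tailed, no absolute value is taken.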
END OF SESSION I