
Econ 120A – ECONOMETRICS A LECTURE NOTES

Foster, UCSD January 11, 2018

TOPIC 1 – PRELIMINARIES

A. Introduction to Statistics

1. What Does "Statistics" Mean?

a) Data base -- collection of observations or measurements of variables tabulated and
classified to present useful info about a given subject.

b) Analytical approach -- scientific analysis of data using mathematics of probability and
techniques of statistical inference to better understand phenomena, reduce uncertainty in
decision making, and make predictions.

c) Summary numerical measures characterizing the data (aka “sample statistics”).

2. Population and Sample:

a) We want to describe/explain/predict the behavior and characteristics of some population
of people, institutions, or natural phenomena. We gather observations or measurements to
obtain data for analysis.

b) Population or census data -- measure every element in population. More accurate and
expensive, but sometimes impossible.

c) Sample data -- measure a subset of population, usually consisting of randomly-selected
representative elements.
1) Less expensive when population is large.
2) Necessary with destructive testing. [Light bulb endurance]
3) More accurate than sloppy census due to smaller but well-trained staff.

3. Probability and Deductive Reasoning:

a) Deduction -- from general to specific (population to sample). If something is true in the
population, it is probably true in a random population subset.

b) Of 10 chips marked 0 - 9 in bowl, exactly 1/2 are even numbers. Deduce that if 1 chip is
drawn blindly, probability that it is even = 0.5.

4. Statistical Inference and Inductive Reasoning:

a) Induction -- from specific to general (from sample to population). If something is true in
representative sample, infer that it is also approx. true for population as a whole.

b) If 12 of 25 randomly selected students are male, infer that about 48% of student
population is male.
Ec 120A PRELIMS p. 2 of 8
c) Statistical inference is what statistics is all about:
1) Estimating population characteristics from sample data.
2) Hypothesis testing for decision processes.
3) Prediction and forecasting.
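
The two directions of reasoning in items 3 and 4 can be sketched in a few lines of Python. This is only an illustration using the chip and student examples from the notes; the variable names are assumptions and not part of the course material.

```python
from fractions import Fraction

# Deduction: the population (a bowl of chips marked 0-9) is fully known,
# so the probability of an even chip follows directly from it.
chips = list(range(10))
even_chips = [c for c in chips if c % 2 == 0]
prob_even = Fraction(len(even_chips), len(chips))

# Induction: only a random sample is observed, so the sample proportion
# serves as an estimate of the unknown population proportion.
sample_size = 25
males_in_sample = 12
est_prop_male = males_in_sample / sample_size

print(float(prob_even))  # 0.5
print(est_prop_male)     # 0.48
```

The first result is exact because the whole population is known; the second is only an estimate and would change with a different sample.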

Notation for this course

Symbol                 Meaning
X, Y                   Random variables
xj, yk                 Possible values of X and Y
Z, z                   Standardized r. var. and value
N ≥ n                  Population and sample size
xi, i = 1…n            Cross-section sample observations on X
yt, t = 1…n            Time-series observations on Y
Σ, Π                   Summation and multiplication operators
∪, ∩                   Union and intersection of sets
A ⊂ B                  Set A contained in set B
QPq                    Permutations
QCq or $\binom{Q}{q}$  Combinations
S                      Sample space or universe
Pr(ej)                 Probability of outcome ej
f, f/n                 Frequency and relative frequency

Population     Sample     Statistic
μx or E(X)     X̄          Mean or expected value
σ²             s²         Variance
σx             sx         Standard deviation
νx             X̃          Median
π              p̄          Proportion
σxy            sxy        Covariance
ρxy            rxy        Correlation

Symbol                 Meaning
f(x)                   Marginal probability distribution of X
F(x)                   Cumulative probability distribution of X
f(x, y)                Joint probability distribution of X and Y
f(x|y)                 Conditional probability of X, given Y
B(n, π)                Binomial distribution
N(μ, σ²)               Normal distribution
N(0, 1)                Standard normal distribution
t(v)                   Student t-distribution (v d.f.)
χ²(v)                  Chi-square distribution (v d.f.)
F(v1, v2)              Fisher’s F-distribution (num, denom)
H0, H1                 Null and alternative hypotheses
ẑ, t̂, F̂                Hypothesis test statistics
zα, tα/2(v), χ²α(v)    Critical values (z test, t test, etc.)
B. Types of Statistical Data

1. Data Classification by Measurement Scale:

a) Qualitative or “categorical” data and nominal scale.
1) Categorical data are measured on nominal (“name”) scale. Observation values are the
arbitrary names of characteristic qualities or categories.
2) Only valid computation is to count the number of observations in each category.
3) Example -- Y = blonde; B = brunette; R = redhead.
4) Qualitative data are put in numerical terms with binary (dummy, categorical) variables.
Let Bi = 1 if person i is brunette; Bi = 0 if not. Similarly define Yi and Ri.

Qualitative and Ordinal Data
Person   1  2  3  4  5  6  7  8  9
Color    B  B  R  Y  Y  B  B  Y
B        1  1  0  0  0  1  1  0
Y        0  0  0  1  1  0  0  1
R        0  0  1  0  0  0  0  0

Player   A  B  C  D  E  F  G  H  I
Rank     4  7  3  1  8  9  6  2  5
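
As a minimal sketch of item 4, the following Python snippet counts the categories and builds the dummy variables Bi, Yi, and Ri from the eight color values listed in the table. The names `colors`, `counts`, and `dummies` are illustrative assumptions.

```python
from collections import Counter

# Hair colors from the notes' table: B = brunette, R = redhead, Y = blonde.
colors = ["B", "B", "R", "Y", "Y", "B", "B", "Y"]

# The only valid computation on nominal data: count observations per category.
counts = Counter(colors)

# Binary (dummy) variables: 1 if the person is in the category, 0 if not.
dummies = {cat: [1 if c == cat else 0 for c in colors] for cat in ("B", "Y", "R")}

print(counts["B"], counts["Y"], counts["R"])  # 4 3 1
print(dummies["B"])  # [1, 1, 0, 0, 0, 1, 1, 0]
```

Summing a dummy variable gives back the category count, which is why dummies are a convenient numerical encoding of categorical data.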

b) Rank or “ordinal” data and ordinal scale.


1) Rank data are measured on ordinal (“order”) scale. Observation values are the rank
order positions of underlying data points.
2) Only valid type of computation involves sorting (ranking) data points from largest to
smallest or vice versa.
3) On an ordinal scale, only rank order counts. If observation values are 2 and 3, then 3 is
ranked higher than 2, but we don’t know how much higher because the difference 3-2
has no meaning.
4) When only rank data are available, we use “nonparametric” statistics.
5) Example -- 9 baseball players were ranked according to batting average (1 = lowest).

c) Quantitative data and interval and ratio scales.


1) Quantitative data are numerical and measured on interval or ratio scales, where all
arithmetic computations are valid. Observation values are real numbers xi, i = 1...n.
2) Interval scales have no meaningful 0; differences between numbers are meaningful, but
not ratios. Consider temperature: 0° F ≠ 0° C; a 50° – 40° = 10° difference in temperature
is meaningful, but 50° is not “twice as hot” as 25°.
3) Ratio scales do have a meaningful 0; both differences and ratios are meaningful.
Consider price: if P = $0, the good is free; P = $4 is twice as expensive as P = $2.
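
The difference between interval and ratio scales can be checked numerically. The sketch below assumes the temperatures in the example are Fahrenheit readings and converts them to Celsius; the helper `f_to_c` and the cents conversion are illustrative assumptions, not part of the notes.

```python
def f_to_c(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9

# Interval scale: the ratio depends on where the arbitrary zero sits.
ratio_f = 50 / 25                  # 2.0 on the Fahrenheit scale
ratio_c = f_to_c(50) / f_to_c(25)  # roughly -2.57 on the Celsius scale

# Differences remain meaningful: they only rescale by the factor 5/9.
diff_f = 50 - 40
diff_c = f_to_c(50) - f_to_c(40)

# Ratio scale: a price ratio is unchanged by a change of units.
price_ratio_dollars = 4 / 2
price_ratio_cents = 400 / 200

print(ratio_f, round(ratio_c, 2))
print(price_ratio_dollars == price_ratio_cents)  # True
```

Because the ratio of the two temperatures changes with the unit of measurement, "twice as hot" has no meaning, while "twice as expensive" does.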

2. Cross-Section and Time-Series Data:

a) Cross-section -- measure different population elements at given point in time.


1) Measure H = height of n students on Jan 31, 2009. Refer to these data as hi, i = 1...n.
2) Cross-section data have no natural order.

b) Time-series data -- measure one population element at different points in time.


1) Measure Y = GDP from 1980 to the present. Refer to these data as yt, t = 1...T.
2) Time-series data have a natural (chronological) order.
C. Rules of Summation¹

1. Summation Notation:

a) Σ indicates summation or addition. For data set xi, i = 1...n:

$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$

read as “the sum of x as i goes from 1 to n.”

b) When it is clear that all observations are to be summed, the notation may be shortened:

$\sum_i x_i$ or $\sum x_i$ or just $\sum x$

2. Rules of Summation:

a) n constant terms -- for constant c: $\sum_{i=1}^{n} c = c + c + \dots + c = nc$

b) Constant times a variable -- for constant c and xi, i = 1...n:

$\sum c x_i = c x_1 + c x_2 + \dots + c x_n = c \sum x_i$

c) Sum of sums -- for data pairs xi, yi, i = 1...n:

$\sum (x_i + y_i) = (x_1 + y_1) + (x_2 + y_2) + \dots + (x_n + y_n) = \sum x_i + \sum y_i$

d) Sum of squares and sums of products.

$\sum x_i^2 = x_1^2 + x_2^2 + \cdots + x_n^2 \neq \left(\sum x_i\right)^2$

$\sum x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n \neq \left(\sum x_i\right)\left(\sum y_i\right)$

3. Illustration: for i = 1…7:

Sample Data
i     1    2    3    4    5    6    7
xi    4   –2    7    5    9   12   –3
yi   15   14   15   18    9   22   13

• $\sum 4 = 28$
• $\sum x_i = 32$
• $\sum y_i = 106$
• $\sum x_i^2 = 328$
• $\sum x_i y_i = 533$
• $\sum (x_i + y_i) = 32 + 106 = 138$

1 This material will be covered in the first discussion section.
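
The summation rules can be checked against the sample data with a short Python sketch. The list names `x` and `y` mirror the data table; everything else is an illustrative assumption.

```python
x = [4, -2, 7, 5, 9, 12, -3]
y = [15, 14, 15, 18, 9, 22, 13]
n = len(x)

sum_const = sum(4 for _ in range(n))           # n constant terms: n * c
sum_x = sum(x)
sum_y = sum(y)
sum_5x = sum(5 * xi for xi in x)               # constant times a variable
sum_x_sq = sum(xi ** 2 for xi in x)            # sum of squares
sum_xy = sum(xi * yi for xi, yi in zip(x, y))  # sum of products
sum_x_plus_y = sum(xi + yi for xi, yi in zip(x, y))

print(sum_const, sum_x, sum_y, sum_x_sq, sum_x_plus_y)  # 28 32 106 328 138
print(sum_5x == 5 * sum_x)             # True: the constant factors out
print(sum_x_plus_y == sum_x + sum_y)   # True: sum of sums
print(sum_x_sq != sum_x ** 2)          # True: 328 is not 1024
print(sum_xy != sum_x * sum_y)         # True
```

The last two lines show the common mistakes the rules warn about: squaring or multiplying after summing is not the same as summing squares or products.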


D. Sets²

1. Set Notation:

a) Set -- a collection of elements in no particular order.


1) Universe -- S = set of all relevant elements.
2) Null set -- ∅ = set with no elements (the empty set).
3) For illustration, consider a universe of 7 numbered blocks and some other sets in this
universe:
• S = {1 2 3 4 5 6 7}
• A = {1 6 7}   B = {4 5 2 3}   C = {1 6 7 5}   D = {4 1}

b) Subset -- a set of some of the elements in a given set, denoted by ⊂ ("contained within").
• A ⊂ C

c) Complement -- set of elements in universe but not in a given set, denoted by ~ ("not").
• ~C = {2 3 4}

d) Element belonging to a set, denoted by ∈:
• 6 ∈ A

2. Combinations of Sets and Venn Diagrams:

a) Union of 2 sets -- set of elements in either or both sets, denoted by ∪.
• C ∪ D = {1 4 5 6 7}; A ∪ B = S

b) Intersection of 2 sets -- set of elements common to both sets, denoted by ∩.
• B ∩ C = {5}; A ∩ C = A; A ∩ B = ∅

c) Venn diagrams -- visual aid representing sets and combinations, widely used in logic.

[Figure: Venn Diagrams. Four panels with set labels A and B, C and D, E and F, and ~H in universe S.]
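
The set operations in this section map directly onto Python's built-in `set` type. The sketch below uses the universe of 7 numbered blocks from the notes; treating the complement as a set difference from S is an implementation choice for this illustration.

```python
S = {1, 2, 3, 4, 5, 6, 7}  # universe
A = {1, 6, 7}
B = {4, 5, 2, 3}
C = {1, 6, 7, 5}
D = {4, 1}

not_C = S - C  # complement of C within the universe

print(A <= C)            # True: A is contained in C
print(not_C == {2, 3, 4})  # True
print(6 in A)            # True: element membership
print(C | D == {1, 4, 5, 6, 7})  # True: union
print(A | B == S)        # True
print(B & C)             # {5}: intersection
print(A & C == A)        # True
print(A & B == set())    # True: A and B share no elements
```

Element order inside braces does not matter, which matches the definition of a set as a collection in no particular order.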

2 This material will be covered in the first discussion section.


E. Counting Rules³

1. Introduction to Combinatorics:

a) Counting rules are used to calculate the number of different ways that elements of sets can
be arranged in certain ways.
1) Assume a set S of Q = 7 distinct elements. We will be working with subsets of q of these
elements.
2) It matters whether the q elements are selected from the Q elements with or without
replacement.
3) It matters whether the subset of q elements is selected as an ordered sequence or an
unordered set or collection.

b) Factorial notation.
1) For a positive integer n, n! = n × (n–1) × (n–2) × ... × 2 × 1; 5! = 5 × 4 × 3 × 2 × 1 = 120.
2) 0! ≡ 1

2. Permutations -- Sequences Drawn Without Replacement:

a) From Q items, select an ordered sequence of q items without replacement. How many such
sequences (permutations) are possible?

b) Permutation counting rule -- number of possible permutations is $_QP_q = \frac{Q!}{(Q-q)!}$

c) Illustration.
• 7P3 = 7!/4! = 5040/24 = 7 × 6 × 5 = 210
• That is, there are 7 ways to draw the first item, 6 ways to draw the second, and 5 ways
to draw the third.
• Note that 3-6-4 is not the same permutation as 4-3-6 or 3-4-6, etc.

3. Combinations -- Collections Drawn Without Replacement:

a) From Q items, select a subset of q items. How many such subsets (combinations) are
possible?

b) Combination counting rule -- number of possible combinations is $_QC_q = \binom{Q}{q} = \frac{Q!}{q!\,(Q-q)!}$

c) Illustration.
• $\binom{7}{3} = \frac{7!}{3!\,4!} = \frac{5040}{6 \times 24} = 35$
• Note that combination {3 6 4} is the same as {3 4 6}, etc.

d) Properties of combinations.
1) $_QC_q = {}_QC_{Q-q}$ by the symmetry of the formula: 7C3 = 7C4 = 35.
2) $_QP_q = {}_QC_q \times q!$ because for each combination of q items there are q! permutations
of that combination: 7P3 = 35 × 3! = 35 × 6 = 210
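
The permutation and combination rules can be verified both by formula and by brute-force enumeration. This sketch assumes Python 3.8 or later, where `math.perm` and `math.comb` are available; the variable names are illustrative.

```python
import math
from itertools import combinations, permutations

Q, q = 7, 3
items = range(1, Q + 1)

n_perm = math.perm(Q, q)  # Q! / (Q - q)!
n_comb = math.comb(Q, q)  # Q! / (q! (Q - q)!)

# Brute force: list every ordered sequence and every unordered subset.
listed_perms = list(permutations(items, q))
listed_combs = list(combinations(items, q))

print(n_perm, len(listed_perms))  # 210 210
print(n_comb, len(listed_combs))  # 35 35
print(math.comb(Q, q) == math.comb(Q, Q - q))  # True: symmetry
print(n_perm == n_comb * math.factorial(q))    # True: 210 = 35 * 6
```

Enumeration is only practical for small Q, but it makes the distinction concrete: (3, 6, 4) and (4, 3, 6) are separate entries in `listed_perms`, while `listed_combs` contains the subset only once.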

3 This material is covered in the second discussion section.


4. Distinguishable Permutations with Groups of Identical Items:

a) Given Q items of k kinds, with q1 of the first kind, q2 of the second kind, ... qk of the kth
kind, so that q1 + q2 + … + qk = Q. How many distinguishable permutations can we form from
all Q items?

b) Counting rule -- number of distinct permutations is $\frac{Q!}{q_1!\, q_2! \cdots q_k!}$.
c) How many 4-letter words in ADAM? Answer: 4!/(2! 1! 1!) = 12 [Find them]

d) Note special case where k = 2 -- q of one kind and Q-q of the other.
1) Number of distinguishable permutations = $\frac{Q!}{q!\,(Q-q)!} = {}_QC_q$.
2) We use this result with the binomial distribution later.
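
A minimal sketch of the counting rule for distinguishable permutations follows. The helper `distinguishable_permutations` is a hypothetical name introduced for this illustration; it divides Q! by the factorial of each group size and is checked against brute-force enumeration.

```python
import math
from collections import Counter
from itertools import permutations

def distinguishable_permutations(letters):
    """Q! divided by the factorial of the size of each group of identical items."""
    total = math.factorial(len(letters))
    for group_size in Counter(letters).values():
        total //= math.factorial(group_size)
    return total

word = "ADAM"
by_formula = distinguishable_permutations(word)

# Brute force: a set keeps only the orderings that look different.
by_listing = len({"".join(p) for p in permutations(word)})

print(by_formula, by_listing)  # 12 12

# Special case k = 2: q items of one kind and Q - q of the other.
print(distinguishable_permutations("HHT") == math.comb(3, 2))  # True
```

The same function handles any number of groups, so it can also be used to check longer words with several repeated letters.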

5. Other Counting Rules:

a) Sequences selected with replacement.


1) From Q distinct items, we draw an ordered sequence of q items, but we replace each
item after it is drawn, so that any given item might be drawn more than once, so q can be
> Q. How many possible sequences are there?
2) Answer: $Q^q$
3) Example -- when a coin is flipped, the Q = 2 possible results are {H T}. If we flip q = 3
times, we could get $2^3 = 8$ sequences. [What are they?]
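
Sequences drawn with replacement correspond to a Cartesian product of the item set with itself. The sketch below uses `itertools.product` for the coin example; the variable names are illustrative assumptions.

```python
from itertools import product

outcomes = ["H", "T"]  # Q = 2 possible results of one flip
q = 3                  # number of flips

# Every ordered sequence of q draws with replacement.
sequences = ["".join(seq) for seq in product(outcomes, repeat=q)]

print(len(sequences), len(outcomes) ** q)  # 8 8
```

Printing `sequences` lists all eight outcomes, which answers the bracketed question in the example.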

b) Slot filling.
1) We have an ordered sequence of m slots to fill. There are n1 ways to fill the first slot, n2
ways to fill the second, etc. How many distinct ways are there of filling the m slots?
2) Answer: $\prod_{j=1}^{m} n_j = n_1 \times n_2 \times n_3 \times \cdots \times n_m$
3) Example -- From n1 = 4 shirts, n2 = 3 pants, and n3 = 2 ties, how many outfits can you
put together?
• 4 shirts × 3 pants × 2 ties = 24 distinct outfits
• Coin flipping was a special case where n1 = n2 = n3 = 2
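
The slot-filling rule is a product over the number of choices per slot. This sketch assumes Python 3.8 or later for `math.prod`; the item labels such as `shirt1` are hypothetical and exist only so the outfits can be enumerated.

```python
import math
from itertools import product

ways_per_slot = [4, 3, 2]  # shirts, pants, ties
n_outfits = math.prod(ways_per_slot)

# Hypothetical item labels, used only to list the outfits explicitly.
shirts = [f"shirt{i}" for i in range(1, 5)]
pants = [f"pants{i}" for i in range(1, 4)]
ties = [f"tie{i}" for i in range(1, 3)]
outfits = list(product(shirts, pants, ties))

print(n_outfits, len(outfits))  # 24 24
print(math.prod([2, 2, 2]))     # 8: the coin-flip special case
```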

PRACTICE PROBLEMS

Problem 1. Summation Notation


Use the two data sets in Table A to calculate the following sums, for i = 1…3:

Table A
xi   2   4   1
yi   3   1  –2

A) $\sum x_i$     C) $\sum x_i^2$   E) $\sum x_i y_i$
B) $\sum 5y_i$    D) $\sum 6$       F) $\sum (x_i + 2y_i)$

Problem 2. Thinking with Sets

A) True or False? Sets


1) A  C =  4) C  B
2) C  B = B 5) D  B D
3) D ~B = S 6) A ~D
B
B) Shade the following areas:
C
1) D  B 2) A  ~C
A
Problem 3. Counting Rules

A) A student awards committee of 4 persons is to be chosen from 7 faculty members. How many
committees are possible?

B) Three persons will be selected from 6 honor students to perform at a graduation ceremony, one as
speaker, one as alternate, and one to carry the "Eat Fascist Death You Flaming Pig!" poster. How
many ways can these positions be filled?

C) The CIA gives agents an 8-letter code that is a scrambled version of "SURPRISE." How many code
names are possible?

Answers

Problem 1
A) 7 B) 10 C) 21 D) 18 E) 8 F) 11

Problem 2
A) 1) F 2) F 3) F 4) T 5) F 6) T

Problem 3
A) 35 B) 120 C) 10,080
