
MBA SEMESTER 1

MB0040 – STATISTICS FOR MANAGEMENT


Assignment Set - 1

Q1. Elucidate the functions of statistics.


A1.:
(1) Statistics helps in providing a better understanding and exact description of a phenomenon of nature.
(2) Statistics helps in the proper and efficient planning of a statistical inquiry in any field of study.
(3) Statistics helps in collecting appropriate quantitative data.
(4) Statistics helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for easy and clear comprehension of the data.
(5) Statistics helps in understanding the nature and pattern of variability of a phenomenon through quantitative observations.
(6) Statistics helps in drawing valid inferences, along with a measure of their reliability, about population parameters from sample data.

Q2. What are the methods of statistical survey? Explain briefly.


A2.: Statistical surveys are used to collect quantitative information about items in a
population. Surveys of human populations and institutions are common in political polling
and government, health, social science and marketing research. A survey may focus on
opinions or factual information depending on its purpose, and many surveys involve
administering questions to individuals. When the questions are administered by a researcher,
the survey is called a structured interview or a researcher-administered survey. When the
questions are administered by the respondent, the survey is referred to as a questionnaire or
a self-administered survey.

Structure and standardization

The questions are usually structured and standardized. The structure is intended to reduce bias. For
example, questions should be ordered in such a way that a question does not influence the
response to subsequent questions. Surveys are standardized to ensure reliability, generalizability, and
validity. Every respondent should be presented with the same questions and in the same order as
other respondents.

In organizational development (OD), carefully constructed survey instruments are often used as the
basis for data gathering, organizational diagnosis, and subsequent action planning. Some OD
practitioners (e.g. Fred Nickols) even consider survey-guided development as the sine qua non of
OD.

Serial surveys

Serial surveys are those which repeat the same questions at different points in time, producing
repeated measures data. There are three basic designs for a study with more than one
measurement occasion: cross-sectional design, longitudinal design, and time-series design.

• Cross-sectional surveys use different units (respondents) at each of the measurement
occasions, by drawing a new sample each time. The time intervals may be different
between measurement occasions, but they are the same for all units (respondents). A study
in which a survey is administered once is also considered to be cross-sectional.
• Longitudinal surveys use the same units (respondents) at each of the measurement
occasions, by recontacting the same sample from the initial survey for the following
measurement occasion(s), and asking the same questions at every occasion. The time
intervals may be different between measurement occasions, but they are the same for all
units (respondents).
• Time-series surveys also use the same units (respondents) at each of the measurement
occasions, but the difference with longitudinal study designs is that in time-series designs both

MBA-1 | Subject Code: Statistics For Management Page 1 of 15


the number of measurement occasions and the time intervals between occasions may be
different between units (respondents).

Modes of Data Collection


There are several ways of administering a survey, including:

Telephone

• use of interviewers encourages sample persons to respond, leading to higher response rates.
• interviewers can increase comprehension of questions by answering respondents' questions.
• fairly cost efficient, depending on local call charge structure
• good for large national (or international) sampling frames
• some potential for interviewer bias (e.g. some people may be more willing to discuss a
sensitive issue with a female interviewer than with a male one)
• cannot be used for non-audio information (graphics, demonstrations, taste/smell samples)
• unreliable for consumer surveys in rural areas where telephone penetration is low
• three types:
o traditional telephone interviews
o computer-assisted telephone dialing
o computer-assisted telephone interviewing (CATI)

Mail

• the questionnaire may be handed to the respondents or mailed to them, but in all cases they
are returned to the researcher via mail.
• cost is very low, since bulk postage is cheap in most countries
• long time delays, often several months, before the surveys are returned and statistical
analysis can begin
• not suitable for issues that may require clarification
• respondents can answer at their own convenience (allowing them to break up long surveys;
also useful if they need to check records to answer a question)
• no interviewer bias introduced
• large amount of information can be obtained: some mail surveys are as long as 50 pages
• response rates can be improved by using mail panels
o members of the panel have agreed to participate
o panels can be used in longitudinal designs where the same respondents are surveyed
several times

Online surveys

• can use web or e-mail


• web is preferred over e-mail because interactive HTML forms can be used
• often inexpensive to administer
• very fast results
• easy to modify
• response rates can be improved by using online panels - members of the panel have
agreed to participate
• if not password-protected, easy to manipulate by completing multiple times to skew results
• data creation, manipulation and reporting can be automated and/or easily exported into a
format which can be read by PSPP, DAP or other statistical analysis software
• data sets created in real time
• some are incentive based (such as Survey Vault or YouGov)
• may skew sample towards a younger demographic compared with CATI
• often difficult to determine/control selection probabilities, hindering quantitative analysis of
data
• widely used in large-scale industries

Personal in-home survey



• respondents are interviewed in person, in their homes (or at the front door)
• very high cost
• suitable when graphic representations, smells, or demonstrations are involved
• often suitable for long surveys (but some respondents object to allowing strangers into their
home for extended periods)
• suitable for locations where telephone or mail are not developed
• skilled interviewers can persuade respondents to cooperate, improving response rates
• potential for interviewer bias

Personal mall intercept survey

• shoppers at malls are intercepted - they are either interviewed on the spot, taken to a room
and interviewed, or taken to a room and given a self-administered questionnaire
• socially acceptable - people feel that a mall is a more appropriate place to do research
than their home
• potential for interviewer bias
• fast
• easy to manipulate by completing multiple times to skew results

Sampling

Sample selection is critical to the validity of the information that represents the populations being
studied. The sampling approach helps to determine the focus of the study and allows better
acceptance of the generalizations being made. Deliberately biased sampling can be used if it is
justified, as long as it is noted that the resulting sample may not be a true representation of the
population under study. There are two different approaches to sampling in survey research:

• The nonprobability sampling approach, in which the researcher does not know each
element's probability of selection in the sample. The most commonly used nonprobability
sampling method is convenience sampling, which samples only those who are available and
willing to participate in the survey. This approach is convenient for the researcher but may
lose data validity due to the lack of representation.
• The probability sampling approach for research methods gives each element a known
chance of being included in the sample. This method is closer to a true representation of the
population. It can be difficult to use due to cost of a rigorous sampling method, and difficulty
in obtaining full coverage of the target population, but the generalizations that come from it
are more likely to be closer to a true representation of the population. Different forms of
probability sampling are designed to achieve various benefits - e.g. theoretical simplicity,
operational simplicity, detailed information on subpopulations, or minimal cost. Some
common forms:

o Equal probability of selection designs (EPS), in which each element of the population
has an equal chance of being included in the sample. This uniformity makes EPS
surveys relatively simple to interpret. Forms of EPS include simple random sampling
(SRS) and systematic sampling.

o Probability-proportional-to-size designs (PPS), in which 'larger' elements (according to
some known measure of size) have a higher chance of selection. This approach is
common in business surveys where the object is to determine sector totals (e.g. "total
employment in manufacturing sectors"); compared to EPS, concentrating on larger
elements may produce better accuracy for the same cost/sample size.

o Stratified random sampling, in which the population is divided into
subpopulations (called strata) and random samples are then drawn separately from
each of these strata, using any probability sampling method (sometimes including
further sub-stratification). This may be done to provide better control over the sample
size (and hence, accuracy) within each subpopulation; when the variable(s) of



interest are correlated with subpopulation, it can also improve overall accuracy.
Another use for stratification is when different subpopulations require different
sampling methods - for instance, a business survey might use EPS for businesses whose
'size' is not known and PPS elsewhere.
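The contrast between an EPS design and stratified sampling can be sketched with the standard library alone; the population, the three strata, and the proportional allocation below are invented purely for illustration:

```python
import random

random.seed(42)

# Hypothetical population of 1000 units: (unit_id, stratum) pairs in three strata.
population = [(i, "small") for i in range(600)] \
           + [(i, "medium") for i in range(600, 900)] \
           + [(i, "large") for i in range(900, 1000)]

# Simple random sampling (an EPS design): every unit has the same chance.
srs = random.sample(population, 100)

# Stratified random sampling: draw separately from each stratum,
# here allocated proportionally to stratum size (60/30/10 for n = 100).
def stratified_sample(pop, sizes):
    out = []
    for stratum, n in sizes.items():
        units = [u for u in pop if u[1] == stratum]
        out.extend(random.sample(units, n))
    return out

strat = stratified_sample(population, {"small": 60, "medium": 30, "large": 10})
print(len(srs), len(strat))  # 100 100
```

The stratified draw guarantees each subpopulation its planned share of the sample, whereas under SRS the per-stratum counts vary from draw to draw.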

Q3. Tabulate the following data:


Age: 20-40; 40-60;60-above

Departments: English, Hindi, Political science, History, sociology

Degree level: Graduates, Post graduates; PhD, Total students in age group and in degree level.
A3.:

                                  DEPARTMENTS
Age Group     Degree Level      Eng   Hin   Pol. Sci   History   Sociology   Total
20-40         Graduates
              Post graduates
              PhD
40-60         Graduates
              Post graduates
              PhD
60 & above    Graduates
              Post graduates
              PhD
Total

Q4. The data given below is the distribution of employees of a business according to their efficiency.
Find the mean deviation and coefficient of mean deviation from Mean and Median:

Efficiency Index   22-26   26-30   30-34   34-38   38-42
Employees            25      35      15       5       2

A4.: Calculation of mean deviation from the mean:

EI        f      x      fx     |D| = |x - 28.29|    f|D|
22-26     25     24     600          4.29          107.25
26-30     35     28     980          0.29           10.15
30-34     15     32     480          3.71           55.65
34-38      5     36     180          7.71           38.55
38-42      2     40      80         11.71           23.42
        N=82         Σfx=2320                 Σf|D|=235.02

X̄ = Σfx/N = 2320/82 = 28.29

|D| = |x - X̄|

MD = Σf|D|/N = 235.02/82 = 2.866

Coefficient of MD from mean = MD/X̄ = 2.866/28.29 = 0.1013



Calculation of MD from the median:

EI        f      Cf     x     |D| = |x - Me|    f|D|
22-26     25     25     24         3.83         95.75
26-30     35     60     28         0.17          5.95
30-34     15     75     32         4.17         62.55
34-38      5     80     36         8.17         40.85
38-42      2     82     40        12.17         24.34
        N=82                             Σf|D| = 229.44

Median class = class containing the (N/2)th item = (82/2)th = 41st item

Median class = 26-30

Me = l + [(N/2 - Cf)/f] × i = 26 + [(41 - 25)/35] × 4 = 27.83

MD = Σf|D|/N = 229.44/82 = 2.798

Coefficient of MD = MD/Me = 2.798/27.83 = 0.1005
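The grouped-data calculations in both tables can be verified with a short script; class midpoints and frequencies are taken from the tables above, and the results agree with the hand computation up to rounding:

```python
mids  = [24, 28, 32, 36, 40]   # class midpoints of 22-26 ... 38-42
freqs = [25, 35, 15, 5, 2]     # employee frequencies
N = sum(freqs)                 # 82

# Mean and mean deviation about the mean
mean = sum(f * x for f, x in zip(freqs, mids)) / N
md_mean = sum(f * abs(x - mean) for f, x in zip(freqs, mids)) / N

# Grouped-data median: l + ((N/2 - Cf)/f) * i for the median class 26-30
median = 26 + ((N / 2 - 25) / 35) * 4
md_median = sum(f * abs(x - median) for f, x in zip(freqs, mids)) / N

print(round(mean, 2), round(md_mean, 3))      # 28.29 2.867
print(round(median, 2), round(md_median, 3))  # 27.83 2.799
```

The small differences from 2.866 and 2.798 in the tables come from rounding the mean and median to two decimals before computing the deviations.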

Q5. What is conditional probability? Explain with an example.

A5.: Conditional probability is the probability of some event A, given the occurrence of some other
event B. Conditional probability is written P(A|B), and is read "the (conditional) probability of A,
given B" or "the probability of A under the condition B". When in a random experiment the event B is
known to have occurred, the possible outcomes of the experiment are reduced to B, and hence the
probability of the occurrence of A is changed from the unconditional probability into the conditional
probability given B.

Joint probability is the probability of two events in conjunction. That is, it is the probability of both
events together. The joint probability of A and B is written P(A ∩ B) or P(A, B).

Marginal probability is then the unconditional probability P(A) of the event A; that is, the probability
of A, regardless of whether event B did or did not occur. If B can be thought of as the event of a
random variable X having a given outcome, the marginal probability of A can be obtained by
summing (or integrating, more generally) the joint probabilities over all outcomes for X. For example,
if there are two possible outcomes for X with corresponding events B and B', this means that
P(A) = P(A ∩ B) + P(A ∩ B'). This is called marginalization.

In these definitions, note that there need not be a causal or temporal relation between A and B. A
may precede B or vice versa or they may happen at the same time. A may cause B or vice versa or
they may have no causal relation at all. Notice, however, that causal and temporal relations are
informal notions, not belonging to the probabilistic framework. They may apply in some examples,
depending on the interpretation given to events.

Conditioning of probabilities, i.e. updating them to take account of (possibly new) information, may
be achieved through Bayes' theorem. In such conditioning, the probability of A given only initial
information I, P(A|I), is known as the prior probability. The updated conditional probability of A,
given I and the outcome of the event B, is known as the posterior probability, P(A|B,I).

Introduction

Consider the simple scenario of rolling two fair six-sided dice, labelled die 1 and die 2. Define the
following three events (not assumed to occur simultaneously):

A: Die 1 lands on 3.



B: Die 2 lands on 1.
C: The dice sum to 8.

The prior probability of each event describes how likely the outcome is before the dice are rolled,
without any knowledge of the roll's outcome. For example, die 1 is equally likely to fall on each of its
6 sides, so P(A) = 1/6. Similarly P(B) = 1/6. Likewise, of the 6 × 6 = 36 possible ways that a pair of dice
can land, just 5 result in a sum of 8 (namely 2 and 6, 3 and 5, 4 and 4, 5 and 3, and 6 and 2), so P(C)
= 5/36.

Some of these events can both occur at the same time; for example events A and C can happen at
the same time, in the case where die 1 lands on 3 and die 2 lands on 5. This is the only one of the 36
outcomes where both A and C occur, so its probability is 1/36. The probability of both A and C
occurring is called the joint probability of A and C and is written P(A ∩ C), so P(A ∩ C) = 1/36. On the
other hand, if die 2 lands on 1, the dice cannot sum to 8, so P(B ∩ C) = 0.

Now suppose we roll the dice and cover up die 2, so we can only see die 1, and observe that die 1
landed on 3. Given this partial information, the probability that the dice sum to 8 is no longer 5/36;
instead it is 1/6, since die 2 must land on 5 to achieve this result. This is called the conditional
probability, because it is the probability of C under the condition that A is observed, and is written
P(C | A), which is read "the probability of C given A." Similarly, P(C | B) = 0, since if we observe die 2
landed on 1, we already know the dice can't sum to 8, regardless of what the other die landed on.

On the other hand, if we roll the dice and cover up die 2, and observe die 1, this has no impact on
the probability of event B, which only depends on die 2. We say events A and B are statistically
independent or just independent, and in this case

P(B | A) = P(B)

In other words, the probability of B occurring after observing that die 1 landed on 3 is the same as
before we observed die 1.

Intersection events and conditional events are related by the formula:

P(B | A) = P(A ∩ B) / P(A)

In this example, we have:

P(C | A) = P(A ∩ C) / P(A) = (1/36) / (1/6) = 1/6

As noted above, P(B | A) = P(B), so by this formula:

P(B) = P(A ∩ B) / P(A)

On multiplying across by P(A),

P(A ∩ B) = P(A) P(B)

In other words, if two events are independent, their joint probability is the product of the prior
probabilities of each event occurring by itself.
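These dice probabilities can be confirmed by enumerating all 36 equally likely outcomes; a minimal Python sketch, with event names following the text:

```python
from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

A = [o for o in outcomes if o[0] == 3]    # die 1 lands on 3
B = [o for o in outcomes if o[1] == 1]    # die 2 lands on 1
C = [o for o in outcomes if sum(o) == 8]  # the dice sum to 8

def p(event):
    return len(event) / len(outcomes)

A_and_C = [o for o in A if o in C]  # only (3, 5)
A_and_B = [o for o in A if o in B]  # only (3, 1)

print(p(A), p(C))         # 1/6 and 5/36
print(p(A_and_C))         # joint probability P(A and C) = 1/36
print(p(A_and_C) / p(A))  # conditional probability P(C | A) = 1/6
# A and B are independent: the joint probability equals the product.
print(abs(p(A_and_B) - p(A) * p(B)) < 1e-12)  # True
```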

Definition

Given a probability space (Ω, F, P) and two events A, B ∈ F with P(B) > 0, the conditional probability
of A given B is defined by

P(A | B) = P(A ∩ B) / P(B)


If P(B) = 0 then P(A | B) is undefined (see Borel–Kolmogorov paradox for an explanation). However, it
is possible to define a conditional probability with respect to a σ-algebra of such events (such as
those arising from a continuous random variable).

For example, if X and Y are non-degenerate and jointly continuous random variables with density
ƒX,Y(x, y) then, if B has positive measure,

P(X ∈ A | Y ∈ B) = ∫B ∫A ƒX,Y(x, y) dx dy / ∫B ∫ ƒX,Y(x, y) dx dy

The case where B has zero measure can only be dealt with directly in the case that B = {y0},
representing a single point, in which case

P(X ∈ A | Y = y0) = ∫A ƒX,Y(x, y0) dx / ∫ ƒX,Y(x, y0) dx

If A has measure zero then the conditional probability is zero. An indication of why the more general
case of zero measure cannot be dealt with in a similar way can be seen by noting that the limit, as
all Δyi approach zero, of the corresponding ratio depends on their relationship as they approach
zero. See conditional expectation for more information.

Derivation

The following derivation is taken from Grinstead and Snell's Introduction to Probability.

Let Ω be a sample space with probability measure P. Suppose the event E ⊆ Ω has occurred and an
altered probability P({ω} | E) is to be assigned to the elementary events {ω} to reflect the fact that E
has occurred. (In the following we will omit the curly brackets.)

For all ω ∉ E we want to make sure that the intuitive result P(ω | E) = 0 is true.

Also, without further information provided, we can be certain that the relative magnitude of
probabilities is conserved:

P(ω1 | E) / P(ω2 | E) = P(ω1) / P(ω2) for all ω1, ω2 ∈ E

This requirement leads us to state:

P(ω | E) = α P(ω) for all ω ∈ E

where α is a positive real constant or scaling factor to reflect the above requirement.

Since we know E has occurred, we can state P(E) > 0 and:

1 = Σω∈E P(ω | E) = α Σω∈E P(ω) = α P(E), so α = 1/P(E)

Hence

P(ω | E) = P(ω) / P(E) for all ω ∈ E

For another event F this leads to:

P(F | E) = Σω∈F∩E P(ω | E) = P(F ∩ E) / P(E)

Statistical Independence

Two random events A and B are statistically independent if and only if

P(A ∩ B) = P(A) P(B)

Thus, if A and B are independent, then their joint probability can be expressed as a simple product of
their individual probabilities.

Equivalently, for two independent events A and B with non-zero probabilities,

P(A | B) = P(A) and P(B | A) = P(B)

In other words, if A and B are independent, then the conditional probability of A, given B is simply the
individual probability of A alone; likewise, the probability of B given A is simply the probability of B
alone.

Mutual Exclusivity

Two events A and B are mutually exclusive if and only if A ∩ B = ∅, so that P(A ∩ B) = 0.

Therefore, if P(B) > 0 then P(A | B) is defined and equal to 0.

The Conditional Probability Fallacy

The conditional probability fallacy is the assumption that P(A|B) is approximately equal to P(B|A).
The mathematician John Allen Paulos discusses this in his book Innumeracy, where he points out that
it is a mistake often made even by doctors, lawyers, and other highly educated non-statisticians. It
can be overcome by describing the data in actual numbers rather than probabilities.

The relation between P(A|B) and P(B|A) is given by Bayes' theorem:

P(A | B) = P(B | A) P(A) / P(B)

In other words, one can only assume that P(A|B) is approximately equal to P(B|A) if the prior
probabilities P(A) and P(B) are also approximately equal.

An Example

In the following constructed but realistic situation, the difference between P(A|B) and P(B|A) may
be surprising, but is at the same time obvious.

In order to identify individuals having a serious disease in an early curable form, one may consider
screening a large group of people. While the benefits are obvious, an argument against such
screenings is the disturbance caused by false positive screening results: If a person not having the
disease is incorrectly found to have it by the initial test, they will most likely be quite distressed until a
more careful test shows that they do not have the disease. Even after being told they are well, their
lives may be affected negatively.

The magnitude of this problem is best understood in terms of conditional probabilities.

Suppose 1% of the group suffer from the disease, and the rest are well. Choosing an individual at
random,

P(ill) = 1% = 0.01 and P(well) = 99% = 0.99.

Suppose that when the screening test is applied to a person not having the disease, there is a 1%
chance of getting a false positive result and 99% chance of getting a true negative result, i.e.

P(positive | well) = 1%, and P(negative | well) = 99%.

Finally, suppose that when the test is applied to a person having the disease, there is a 1% chance of
a false negative result and 99% chance of getting a true positive result, i.e.

P(negative | ill) = 1% and P(positive | ill) = 99%.

Now, one may calculate the following:

The fraction of individuals in the whole group who are well and test negative (true negative):
P(well ∩ negative) = P(well) P(negative | well) = 0.99 × 0.99 = 0.9801.

The fraction of individuals in the whole group who are ill and test positive (true positive):
P(ill ∩ positive) = P(ill) P(positive | ill) = 0.01 × 0.99 = 0.0099.

The fraction of individuals in the whole group who have false positive results:
P(well ∩ positive) = P(well) P(positive | well) = 0.99 × 0.01 = 0.0099.

The fraction of individuals in the whole group who have false negative results:
P(ill ∩ negative) = P(ill) P(negative | ill) = 0.01 × 0.01 = 0.0001.

Furthermore, the fraction of individuals in the whole group who test positive:
P(positive) = 0.0099 + 0.0099 = 0.0198.

Finally, the probability that an individual actually has the disease, given that the test result is positive:
P(ill | positive) = P(ill ∩ positive) / P(positive) = 0.0099 / 0.0198 = 0.5 = 50%.

In this example, it should be easy to relate to the difference between the conditional probabilities
P(positive | ill) (which is 99%) and P(ill | positive) (which is 50%): the first is the probability that an



individual who has the disease tests positive; the second is the probability that an individual who
tests positive actually has the disease. With the numbers chosen here, the last result is likely to be
deemed unacceptable: half the people testing positive are actually false positives.
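The screening arithmetic can be reproduced in a few lines; the four rates are the ones assumed in the example above:

```python
# Bayes' rule check of the screening example.
p_ill = 0.01
p_well = 0.99
p_pos_given_ill = 0.99    # true positive rate (sensitivity)
p_pos_given_well = 0.01   # false positive rate

true_positive  = p_ill  * p_pos_given_ill     # 0.0099
false_positive = p_well * p_pos_given_well    # 0.0099
p_positive = true_positive + false_positive   # 0.0198

p_ill_given_positive = true_positive / p_positive
print(p_ill_given_positive)  # 0.5
```

Half of all positive results are false positives, exactly as the text concludes, because the disease is so rare that the well-but-positive group is as large as the ill-and-positive group.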

Second Type of Conditional Probability Fallacy

Another type of fallacy is interpreting conditional probabilities of events (or a series of events) as
(unconditional) probabilities, or seeing them as being of the same order of magnitude. A conditional
probability of an event and its (total) probability are linked with each other through the formula of
total probability, but without additional information one of them says little about the other. The
fallacy of viewing P(A|B) as P(A), or as being close to P(A), is often related to some forms of statistical
bias, but it can be subtle.

Here is an example: One of the conditions for the legendary wild-west hero Wyatt Earp to have
become a legend was having survived all the duels he fought. Indeed, it is reported that he was
never wounded, not even scratched by a bullet. The probability of this happening is very small,
contributing to his fame, because events of very small probability attract attention. However, the
point is that the degree of attention depends very much on the observer. Somebody impressed by a
specific event (here, seeing a "hero") is prone to view effects of randomness differently from others
who are less impressed.

In general it does not make much sense to ask after observation of a remarkable series of events
"What is the probability of this?"; this is a conditional probability based upon observation. The
distinction between conditional and unconditional probabilities can be intricate if the observer who
asks "What is the probability?" is himself/herself an outcome of a random selection. The name "Wyatt
Earp effect" was coined in an article "Der Wyatt Earp Effekt" (in German) showing through several
examples its subtlety and impact in various scientific domains.

Q6. The probability that a football player will play at Eden Gardens is 0.6 and at Ambedkar Stadium is
0.4. The probability that he will get a knee injury when playing at Eden Gardens is 0.07 and at
Ambedkar Stadium is 0.04. What is the probability that he would get a knee injury if he played at
Eden Gardens?
A6.:
Let A = plays at Eden Gardens, B = plays at Ambedkar Stadium, C = knee injury at Eden Gardens,
D = knee injury at Ambedkar Stadium.
P(A) = 0.6, P(B) = 0.4, P(C | A) = 0.07, P(D | B) = 0.04
P(A ∩ C) = P(A) × P(C | A)
= 0.6 × 0.07
= 0.042
MBA SEMESTER 1
MB0040 – STATISTICS FOR MANAGEMENT
Assignment Set - 2

Q1. A random sample of 6 sachets of mustard oil was examined and two were found to be leaking.
A wholesaler receives seven hundred twenty six packs, each containing 6 sachets. Find the
expected number of packets to contain exactly one sachet leaking?
A1.:
n = 6 sachets per pack, N = 726 packs
From the sample, the probability that a sachet leaks is p = 2/6 = 1/3, so q = 1 - p = 2/3.
Using the binomial distribution, the probability that a pack contains exactly one leaking sachet is
P(x = 1) = 6C1 × (1/3)^1 × (2/3)^5 = 6 × (1/3) × (32/243) = 64/243 = 0.2634
Expected number of packs containing exactly one leaking sachet:
E = N × P(x = 1) = 726 × 0.2634 ≈ 191 packs
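Assuming each sachet leaks independently with probability p = 2/6 estimated from the sample, the binomial calculation can be checked as follows:

```python
from math import comb

n, N = 6, 726      # sachets per pack, number of packs received
p = 2 / 6          # probability a sachet leaks, estimated from the sample
q = 1 - p

# Binomial probability of exactly one leaking sachet in a pack of 6
p_one = comb(n, 1) * p**1 * q**5   # 64/243

expected_packs = N * p_one
print(round(p_one, 4), round(expected_packs))  # 0.2634 191
```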

Q2. What do you mean by errors in statistics? Mention the measures of such errors.
A2.: In statistics and optimization, statistical errors and residuals are two closely related and easily
confused measures of the deviation of a sample from its “theoretical value”. The error of a sample is
the deviation of the sample from the (unobservable) true function value; while the residual of a
sample is the difference between the sample and the estimated function value.

The distinction is most important in regression analysis, where it leads to the concept of studentized
residuals.

Suppose there is a series of observations from a univariate distribution and we want to estimate the
mean of that distribution (the so-called location model). In this case the errors are the deviations of
the observations from the population mean, while the residuals are the deviations of the
observations from the sample mean.

A statistical error is the amount by which an observation differs from its expected value; the latter
being based on the whole population from which the statistical unit was chosen randomly. For
example, if the mean height in a population of 21-year-old men is 1.75 meters, and one randomly
chosen man is 1.80 meters tall, then the “error” is 0.05 meters; if the randomly chosen man is 1.70
meters tall, then the “error” is −0.05 meters. The expected value, being the mean of the entire
population, is typically unobservable, and hence the statistical error cannot be observed either.

The nomenclature arose from random measurement errors in astronomy. It is as if the measurement
of the man’s height were an attempt to measure the population mean, so that any difference
between the man’s height and the mean would be a measurement error.

A residual (or fitting error), on the other hand, is an observable estimate of the unobservable
statistical error. Consider the previous example with men’s heights and suppose we have a random
sample of n people. The sample mean could serve as a good estimator of the population mean.
Then we have:

• The difference between the height of each man in the sample and the unobservable
population mean is a statistical error, whereas
• The difference between the height of each man in the sample and the observable sample
mean is a residual.

Note that the sum of the residuals within a random sample is necessarily zero, and thus the residuals
are necessarily not independent. The statistical errors, on the other hand, are independent, and their
sum within the random sample is almost surely not zero.

One can standardize statistical errors (especially of a normal distribution) in a z-score (or “standard
score”), and standardize residuals in a t-statistic, or more generally studentized residuals.
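The distinction can be illustrated numerically: with a made-up "true" population mean, the residuals sum to exactly zero while the errors almost surely do not. The sample below is synthetic, for illustration only:

```python
import random

random.seed(0)
mu = 1.75  # hypothetical true population mean height (metres), normally unobservable

# Synthetic sample of 8 measured heights
sample = [round(random.gauss(mu, 0.07), 2) for _ in range(8)]
sample_mean = sum(sample) / len(sample)

errors    = [x - mu for x in sample]           # deviations from the true mean
residuals = [x - sample_mean for x in sample]  # deviations from the sample mean

# Residuals always sum to zero (up to float rounding); errors almost surely do not.
print(abs(sum(residuals)) < 1e-9)  # True
print(sum(errors))
```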

Standard error of the mean

The standard error of the mean (SEM) is the standard deviation of the sample mean estimate of a
population mean. (It can also be viewed as the standard deviation of the error in the sample mean
relative to the true mean, since the sample mean is an unbiased estimator.) SEM is usually estimated
by the sample estimate of the population standard deviation (sample standard deviation) divided
by the square root of the sample size (assuming statistical independence of the values in the
sample):

SE(x̄) = s / √n



where

s is the sample standard deviation (i.e., the sample-based estimate of the standard deviation
of the population), and

n is the size (number of observations) of the sample.

This estimate may be compared with the formula for the true standard deviation of the mean:

SD(x̄) = σ / √n

where

σ is the standard deviation of the population.

Note 1: Standard error may also be defined as the standard deviation of the residual error term.

Note 2: Both the standard error and the standard deviation of small samples tend to systematically
underestimate the population standard error and deviations: the standard error of the mean is a
biased estimator of the population standard error. With n = 2 the underestimate is about 25%, but for
n = 6 the underestimate is only 5%. Gurland and Tripathi (1971) provide a correction and equation for
this effect. Sokal and Rohlf (1981) give an equation of the correction factor for small samples of n <
20. See unbiased estimation of standard deviation for further discussion.

A practical result: Decreasing the uncertainty in your mean value estimate by a factor of two
requires that you acquire four times as many observations in your sample. Worse, decreasing
standard error by a factor of ten requires a hundred times as many observations.

Assumptions and usage

If the data are assumed to be normally distributed, quantiles of the normal distribution and the
sample mean and standard error can be used to calculate approximate confidence intervals for
the mean. The following expressions can be used to calculate the upper and lower 95% confidence
limits, where x̄ is the sample mean, SE is the standard error for the sample mean, and 1.96 is the
0.975 quantile of the normal distribution:

Upper 95% limit = x̄ + 1.96 × SE

Lower 95% limit = x̄ − 1.96 × SE
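A sketch of these 95% limits, using made-up summary figures (mean 100, sample standard deviation 15, n = 36):

```python
import math

# Hypothetical sample summary, for illustration only
sample_mean = 100.0
s = 15.0   # sample standard deviation
n = 36

se = s / math.sqrt(n)            # standard error of the mean = 15/6 = 2.5
lower = sample_mean - 1.96 * se  # lower 95% confidence limit
upper = sample_mean + 1.96 * se  # upper 95% confidence limit
print(round(se, 2), round(lower, 1), round(upper, 1))  # 2.5 95.1 104.9
```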

In particular, the standard error of a sample statistic (such as sample mean) is the estimated
standard deviation of the error in the process by which it was generated. In other words, it is the
standard deviation of the sampling distribution of the sample statistic. The notation for standard error
can be any one of SE, SEM (for standard error of measurement or mean), or SE(x̄).

Standard errors provide simple measures of uncertainty in a value and are often used because:

• If the standard error of several individual quantities is known then the standard error of some
function of the quantities can be easily calculated in many cases;
• Where the probability distribution of the value is known, it can be used to calculate a good
approximation to an exact confidence interval;
• Where the probability distribution is unknown, relationships like Chebyshev’s or the
Vysochanskiï-Petunin inequality can be used to calculate a conservative confidence interval; and
• As the sample size tends to infinity the central limit theorem guarantees that the sampling
distribution of the mean is asymptotically normal.
Correction for finite population

The formula given above for the standard error assumes that the sample size is much smaller than
the population size, so that the population can be considered to be effectively infinite in size. When
the sampling fraction is large (approximately at 5% or more), the estimate of the error must be
corrected using a “finite population correction”

FPC = √((N − n) / (N − 1))

to account for the added precision gained by sampling close to a larger percentage of the
population. The effect of the FPC is that the error becomes zero when the sample size n is equal to
the population size N.

Correction for correlation in the sample

If values of the measured quantity A are not statistically independent but have been obtained from
known locations in parameter space x, an unbiased estimate of the error in the mean may be obtained
by multiplying the standard error above by the factor f:

where the sample bias coefficient ρ is the average of the autocorrelation coefficients ρij (a
quantity between −1 and 1) for all sample-point pairs. See unbiased estimation of standard deviation
for more discussion.

[Figure: expected error in the mean of A for a sample of n data points with sample bias coefficient ρ;
the unbiased standard error plots as the ρ = 0 line with log-log slope −½.]
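The factor f itself did not survive extraction in the text above. Assuming the commonly quoted form f = √((1 + ρ)/(1 − ρ)) for the sample bias coefficient ρ, a hedged sketch of the correction would look like this:

```python
import math

def correlation_corrected_se(sigma, n, rho):
    """SE of the mean scaled by f = sqrt((1 + rho) / (1 - rho)).

    NOTE: this form of f is an assumption; the original factor
    did not survive extraction in the surrounding text.
    """
    f = math.sqrt((1 + rho) / (1 - rho))
    return (sigma / math.sqrt(n)) * f

# Positive autocorrelation (rho > 0) inflates the standard error;
# rho = 0 recovers the uncorrelated formula.
print(correlation_corrected_se(2.0, 100, 0.0))  # 0.2
```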

Q3. From a population known to have a standard deviation of 1.4, a sample of 70 individuals is
taken. The mean of this sample is found to be 6.2. Find the standard error of the mean. Also establish
an interval estimate around the sample mean using one standard deviation of the mean.
A3.:
σ = 1.4; x̄ = 6.2; n = 70

Standard error SE(x̄) = σ/√n = 1.4/√70 = 0.1673

Interval estimation: the question asks for an interval of one standard error (one standard
deviation of the mean) around the sample mean:

x̄ − SE(x̄) = 6.2 − 0.1673 = 6.0327
x̄ + SE(x̄) = 6.2 + 0.1673 = 6.3673

The interval estimate is therefore (6.0327, 6.3673).

MBA-1 | Subject Code: Statistics For Management Page 13 of 15
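A quick numerical check of the standard error and of the one-standard-error interval the question asks for (plain Python):

```python
import math

sigma, n, x_bar = 1.4, 70, 6.2
se = sigma / math.sqrt(n)              # standard error of the mean
lower, upper = x_bar - se, x_bar + se  # one-SE interval around the mean

print(round(se, 4))                      # 0.1673
print(round(lower, 4), round(upper, 4))  # 6.0327 6.3673
```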

Q4. A machine is designed so as to pack 300ml of a solution with a standard deviation of 5ml. A
sample of 150 bottles when measured had a mean content of 201.3ml. Test whether the machine is
functioning properly. (Use 5% level of significance.)
A4.:
A4.:
µ0 = 300 ml; n = 150 bottles; α = 0.05; σ = 5 ml; x̄ = 201.3 ml

H0: µ = 300 ml (the machine is functioning properly)

H1: µ ≠ 300 ml (the machine is not functioning properly)

z = (x̄ − µ0)/(σ/√n) = (201.3 − 300)/(5/√150) = −98.7/0.4082 = −241.76

Since |z| = 241.76 lies far outside the critical interval (−1.96, 1.96), H0 is rejected.

Conclusion: The mean is not 300 ml; that is, the machine is not functioning properly.
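The test statistic can be verified with a short computation (plain Python):

```python
import math

mu0, sigma, n, x_bar = 300.0, 5.0, 150, 201.3
z = (x_bar - mu0) / (sigma / math.sqrt(n))
print(round(z, 2))  # -241.76

# Two-tailed test at alpha = 0.05: reject H0 when |z| > 1.96
print(abs(z) > 1.96)  # True
```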

Q5. Out of 2000 people surveyed, 1200 belong to urban areas and rest to semi urban areas. Among
1000 who visited other regions, 800 belonged to urban areas. Test at 5% level of significance whether
area and visiting other states are dependent.
A5.:
N = 2000; n = 1000 (those who visited other regions)

P̂ = 800/1000 = 0.8 (proportion of urban people among those who visited other regions)

α = 0.05

P0 = 1200/2000 = 0.6 (proportion of urban people in the whole sample)

Q0 = 1 − P0 = 0.4

H0: P = 0.6 (area and visiting other states are independent)

H1: P ≠ 0.6 (area and visiting other states are dependent)

z = (P̂ − P0)/√(P0Q0/n) ~ N(0, 1)

= (0.8 − 0.6)/√(0.6 × 0.4/1000)

= 12.91

Since z = 12.91 lies outside the interval (−1.96, 1.96), H0 is rejected.

Conclusion: Area and visiting other states are dependent.
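The arithmetic can be checked directly (plain Python):

```python
import math

p_hat = 800 / 1000   # urban share among the 1000 who visited other regions
p0 = 1200 / 2000     # urban share in the full sample of 2000
q0 = 1 - p0
n = 1000

z = (p_hat - p0) / math.sqrt(p0 * q0 / n)
print(round(z, 2))  # 12.91

# Two-tailed test at alpha = 0.05: reject H0 when |z| > 1.96
print(abs(z) > 1.96)  # True
```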

Q6. How is statistics useful for modern managers? Give examples and explain.
A6.: Modern managers often join agencies because they seek to serve and help their communities
and country. Not surprisingly, some managers are puzzled by the suggestion of engaging in research
and statistics: research appears boring in comparison with developing and implementing new
programs, and statistics seems, well, impossibly challenging with little payoff in sight.
In fact, analytical techniques involving research and statistics are increasingly in demand. Many
decisions that modern managers make involve data and analysis, one way or another. Consider the
following common uses of analysis and data:



First, data and objective analysis often are used to describe and analyze problems, such as the
magnitude of environmental disasters (for example, oil spills), the extent of social and public health
problems (such as homelessness or the AIDS epidemic), the extent of lawlessness, the level of
economic prosperity or stagnation, or the impact of weather-related problems such as those brought on
by hurricanes and snowstorms. For example, it matters whether the illiteracy rate among 12-year-olds
is 3 percent or 30 percent, or somewhere in between. By describing the extent of these
problems and their underlying causes accurately, managers are able to better formulate effective
strategies for dealing with them. Policy analysis often begins by describing the extent and
characteristics of problems, and the factors associated with them.

Second, data are used to describe policies and programs. What are programs and policies
expected to achieve? How many services are programs expected to provide? What are some
milestones of achievement? How much will a program cost? These questions involve quantifiable
answers, such as the number of national guardsmen brought in to assist with search and
rescue efforts after a major hurricane, or the number of evacuees for whom officials expect to
provide refuge. Policies and programs can be described in quite detailed ways, involving distinct
program activities, the duration and geographic scope of activities, staffing levels, and
program budget data.

Third, programs produce much routine, administrative data that are used to monitor progress and
prevent fraud. For example, hospitals produce a large amount of data about patient visits, who
attended them, their diagnosis, billing codes, and so on. Schools produce vast amounts of data
about student achievement, student conduct, extracurricular activities, support and administrative
services, and so on. Regulatory programs produce data about inspections and compliance. In
many states, gaming devices (such as slot machines) are monitored electronically to ensure that
taxes are collected and that they are not tampered with. Managers are expected to be familiar
with the administrative data in their lines of business.

Fourth, analysis is used to guide and improve program operations.


Data can be brought to bear on problems that help managers choose among competing strategies.
For example, what-if analysis might be used to determine the cost-effectiveness of alternative
courses of action. Such analysis often is tailored to unique situations and problems. In addition, client
and citizen surveys might be used to inform program priorities by assessing population needs and
service satisfaction. Systematic surveys provide valid and objective assessments of citizen and client
needs, priorities, and perceptions of programs and services. Systematic surveys of citizens and clients
are used increasingly and are considered a valuable tool of modern management.

Fifth, data are used to evaluate outcomes. Legislatures and citizens want to know what return they
are getting from their tax dollars. Did programs and policies achieve their aims? Did they produce
any unexpected results?
Most grant applications require modern managers to be accountable for program outcomes.
Modern managers must demonstrate that their programs are producing effective outcomes and
that they are doing so in cost-effective ways. This demand for outcome evaluation and monitoring
far exceeds any requirement of proper funds management. Analysis can also be used to determine
the impact of different conditions on program effectiveness, leading to suggestions for improving
programs.

Data and analysis are omnipresent in programs and policies. They are there at every stage, from the
inception of programs and policies, to their very end. Of course, decisions are also based on
personal observation, political consensus, anecdotal and impressionistic descriptions, and the
ideologies of leaders. Yet data and analysis often are present, too, one way or another. This is
because analysis is useful. Specifically, quantitative analysis aids in providing an objective, factual
underpinning of situations and responses. Analysis, along with data, helps quantify the extent of
problems and solutions in ways that other information seldom can. Analysis can help quantify the
actual or likely impact of proposed strategies, for example, helping to determine their adequacy. At
the very least, a focus on facts and objective analysis might reduce judgment errors stemming from
overly impressionistic or subjective perceptions that are factually incorrect. So managers are
expected to bring data and analysis to the decision-making table.

