Engineering Methods
Diagram labels: Problem description; Factors identification; Scientific model proposal; Data collection; Conclusion; Validation; Manipulation
Statistics
The science of collecting, presenting, analyzing, and using data to make decisions, solve problems,
and design products and processes.
Data collection is planned through the design of surveys and experiments.
The purpose is to extract information from data.
Statistical techniques are useful for describing and understanding variability and its potential sources.
Statistics relate to a population or to a sample.
A sample is a chosen subset of the population. It is used instead of compiling data about the entire
group, because the information available is usually only partial information from the population.
Conclusions about the population are drawn from the information obtained in the sample.
This relates to concepts of uncertainty (probability theory, probability distributions).
Statistical analysis: description and inference.
Data Collection
How data will be mathematically analyzed depends on how those data were collected
Experimental design and statistics go hand in hand!
Statistical data collection: experimental, observational, census, sample survey.
Experimental:
1. Planning the research, including finding the number of replicates of the study, using
information such as
preliminary estimates of the size of treatment effects, alternative hypotheses,
and the estimated experimental variability.
To allow an unbiased estimate of the difference in treatment effects, experiments should
compare (at least) one new treatment with a standard treatment or control.
2. Design of experiments (DOE), using blocking to reduce the influence of confounding
variables, and randomized assignment of treatments to subjects to allow unbiased
estimates of treatment effects and experimental error.
3. Performing the experiment and analyzing the data.
4. Examining the data set in secondary analyses, to suggest new hypotheses for future study.
5. Documenting and presenting the results.
An experiment is a controlled study in which the researcher attempts to understand cause-and-effect relationships.
The researcher controls
how subjects are assigned to groups
which treatments each group receives.
In analysis, the researcher compares group scores on some dependent variable.
Based on the analysis, the researcher draws a conclusion about whether the treatment (independent
variable) had a causal effect on the dependent variable.
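A minimal sketch of how group scores might be compared after random assignment. It uses a permutation (randomization) test on the difference in group means, which is one common choice and is not named in these notes. The function name, the scores, and the number of shuffles are illustrative assumptions.

```python
import random
from statistics import mean

def permutation_p_value(treated, control, n_perm=2000, seed=0):
    """Estimate how often a random relabeling of subjects gives a
    difference in group means at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign scores to groups at random
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical scores on the dependent variable (not data from the notes).
treated_scores = [10, 11, 12, 13, 14]
control_scores = [1, 2, 3, 4, 5]
p_value = permutation_p_value(treated_scores, control_scores)
```

A small value of `p_value` suggests that the observed difference would be unusual if the treatment had no effect.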
Observational:
An observational study typically uses a survey or case-control study to collect observations about
the area of interest and then applies statistical analysis.
Census:
Obtains data from every member of a population. In most studies, a census is not practical,
because of the cost and/or time required.
Sample survey:
Obtains data from a subset of a population, in order to estimate population attributes.
This subset of the population will be used to represent the whole population.
Numerical measures such as the variance and standard deviation are called parameters when they
describe a population and statistics when they are computed from a sample.
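The difference between a population measure and its sample counterpart can be sketched as follows. The population here is simulated, so every number is hypothetical; the variable names `mu`, `sigma`, `x_bar`, and `s` follow common textbook usage rather than anything stated in the notes.

```python
import random
from statistics import mean, pstdev, stdev

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1000 heights in cm (made up for illustration).
population = [rng.gauss(140.0, 7.0) for _ in range(1000)]

# Census: measure every member, giving the population values.
mu = mean(population)
sigma = pstdev(population)  # population standard deviation

# Sample survey: measure a random subset, giving estimates of those values.
sample = rng.sample(population, 50)
x_bar = mean(sample)
s = stdev(sample)  # sample standard deviation (n - 1 divisor)
```

Here `x_bar` and `s` will generally be close to, but not equal to, `mu` and `sigma`.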
For a sample to be used as a guide to an entire population, it is important that it truly
represents that overall population. Representative sampling assures that the inferences and
conclusions can be safely extended from the sample to the population as a whole.
Statistics offers methods to estimate and correct for any random trending within the sample and
data collection procedures.
There are various ways to sample a population; random sampling is the most common. Randomness is
studied using the mathematical discipline of probability theory.
Example: A researcher carries out a study to determine the average height of fifth graders in a particular
school district. If only boys were measured, the results would apply only to boys, not to all fifth
graders, and would thus be biased, not random. To collect unbiased data, one would randomly
choose the same number of boys and girls from each fifth grade class to measure.
Example: An experimental design calls for observing what food items red ants bring back to their colony as
compared to black ants. You have too many ant colonies to observe all of them, so you pick a
random sample of 5 colonies of each ant type to observe. An easy way to choose randomly is by
giving each colony a number or letter on a slip of paper. Put these in a basket and pull 5 slips for
each ant colony type. This way there is no bias toward any particular colonies.
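The slips-in-a-basket procedure can be sketched in code as a draw without replacement. The colony labels and the count of 20 colonies per type are hypothetical, since the example does not say how many colonies exist.

```python
import random

def draw_slips(colony_ids, n_draw, seed=None):
    """Pick n_draw colonies without replacement, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(colony_ids), n_draw)

# Hypothetical colony labels (one slip per colony).
red_colonies = [f"R{i}" for i in range(1, 21)]
black_colonies = [f"B{i}" for i in range(1, 21)]

red_sample = draw_slips(red_colonies, 5, seed=1)
black_sample = draw_slips(black_colonies, 5, seed=2)
```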
Example: In drug trials, fifty out of one hundred people are randomly chosen to receive the drug, while the
other fifty receive a placebo.
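A minimal sketch of this kind of random assignment, assuming subjects are identified by the numbers 1 to 100. The function name and group labels are illustrative.

```python
import random

def assign_treatments(subject_ids, seed=None):
    """Shuffle the subjects, then give the first half the drug
    and the second half the placebo."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"drug": sorted(shuffled[:half]), "placebo": sorted(shuffled[half:])}

groups = assign_treatments(range(1, 101), seed=42)
```

Because the shuffle decides group membership, neither the researcher nor the subject can influence who receives the drug.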
How many study subjects are needed?
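The notes leave this question open. One common textbook approximation, assumed here rather than taken from the notes, applies when two treatment means are compared with a two-sided test and the groups share a known standard deviation: n per group is about 2(z_(1-alpha/2) + z_power)^2 (sigma/delta)^2. The sketch below evaluates it; `delta` (the smallest difference worth detecting), `sigma`, `alpha`, and `power` are all inputs the planner must supply.

```python
import math
from statistics import NormalDist

def replicates_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate subjects needed in each of two groups (normal approximation)."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = z.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)  # round up to a whole subject

n_needed = replicates_per_group(delta=1.0, sigma=1.0)
```

Larger experimental variability or a smaller effect to detect both increase the required number of replicates.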
Example: Three plants receive 0.1 L/day, three receive 0.5 L/day, and three receive 1 L/day. With three plants
in each treatment group, data analysis such as determining the averages is carried out. Determine
the number of replications in this experiment and state whether data analysis can be done.
Pseudoreplication
Taking multiple measurements on the same experimental unit and treating each measurement
as an independent data point is pseudoreplication, not true replication.
Pseudoreplication should always be avoided because the results are not scientifically valid.
Example: Using one plant for an experiment measuring the effect of nitrogen on growth and counting each
branch as a separate experimental unit or replicate would be an example of pseudoreplication.
You need to use multiple separate plants for each treatment.
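One way to keep branch measurements from being counted as replicates is to reduce them to a single value per plant before any analysis. The sketch below does this with made-up branch lengths; the treatment names, plant labels, and numbers are hypothetical.

```python
from statistics import mean

# Hypothetical branch lengths in cm, keyed by (treatment, plant).
branch_lengths = {
    ("0 g N", "plant 1"): [4.0, 5.0, 6.0],
    ("0 g N", "plant 2"): [5.0, 5.0],
    ("10 g N", "plant 3"): [7.0, 8.0, 9.0],
    ("10 g N", "plant 4"): [8.0, 10.0],
}

def per_plant_means(measurements):
    """Collapse repeated measurements to one value per plant, grouped by treatment."""
    by_treatment = {}
    for (treatment, _plant), values in measurements.items():
        by_treatment.setdefault(treatment, []).append(mean(values))
    return by_treatment

replicates = per_plant_means(branch_lengths)
```

Each treatment then has two replicates (the two plants), not five (the branches).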
Controlled experiment
An experiment where only one variable or factor is manipulated and all other variables are held
constant. An experiment is controlled if the only factor that is allowed to vary is the independent
variable (treatment). All other factors are kept as constant as possible.
Control
An experimental unit that is subjected to all the same conditions as the treated units, except
that the control does not receive an actual treatment or receives only a placebo.
Blind study
The people collecting and analyzing the data do not know which experimental units received which
treatments. Only after the data are analyzed are the treatments revealed, or decoded. The purpose
is to reduce any human bias toward an expected outcome.
Example: If the pots have coded stickers on the bottom that only the treatment students understand, then the
data takers will not know which plants are getting which treatment and that will reduce their bias
(preconceived expectations), and the data will be more objective and reliable. Labels can be as
simple as T1-1, T1-2, T1-3, T2-1...T2-3, and T3-1...T3-3. T1, T2 and T3 stand for the treatment
(5 g N, 10 g N or 0 g N). The numerals after the dash number each pot within the treatment group.
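The labelling scheme can be sketched as follows. The mapping from T1, T2, and T3 to nitrogen amounts follows the order given in the example; the function names are illustrative, and the key would be held only by the treatment students.

```python
# Key held by the treatment students only.
treatment_key = {"T1": "5 g N", "T2": "10 g N", "T3": "0 g N"}

def pot_labels(codes, pots_per_treatment=3):
    """Build coded stickers such as T1-1, T1-2, ... for every pot."""
    return [f"{code}-{i}" for code in codes
            for i in range(1, pots_per_treatment + 1)]

def decode(label, key):
    """Reveal the treatment behind a coded label after the data are analyzed."""
    code = label.split("-")[0]
    return key[code]

labels = pot_labels(treatment_key)
```

The data takers see only `labels`; `decode` is applied once the analysis is complete.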
Data Collection: Pros and Cons
Resources
When the population is large, a sample survey has a big resource advantage over a census. A well-designed sample survey can provide very precise estimates of population parameters - quicker,
cheaper, and with less manpower than a census.
Generalizability
Generalizability refers to the appropriateness of applying findings from a study to a larger
population. Generalizability requires random selection. If participants in a study are randomly
selected from a larger population, it is appropriate to generalize study results to the larger
population; if not, it is not appropriate to generalize.
Observational studies do not feature random selection; so generalizing from the results of an
observational study to a larger population can be a problem.
Causal inference
Cause-and-effect relationships can be teased out when subjects are randomly assigned to groups.
Therefore, experiments, which allow the researcher to control assignment of subjects to treatment
groups, are the best method for investigating causal relationships.
Data Recording
Counting (raw numbers)
Collecting numerical data begins with counts, called raw numbers, such as the number of flowers on
the plants. Write the numbers on a data sheet or in a science journal, then graph them or put them
in a table.
Pictures, drawings
Sometimes the data collected is in the form of a drawing when recording variables such as shape
and color. Drawings are usually necessary for presentations to help explain to an audience what
the experiment was, how it was conducted, and the results.
Non-numerical data
In some experiments the data to be collected is not numerical in nature. It might be color change,
intensity of color, or some other qualitative measure such as high, low, or medium light.
GROUP ACTIVITY
The dot diagram is a very useful plot for displaying a small body of data, say up to
about 20 observations.
This plot allows us to see easily two features of the data: the location, or the middle, and
the scatter, or variability.
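A dot diagram can be sketched in plain text by stacking one mark per observation above a number line. The function below is an illustrative assumption about layout, and the eight values passed to it are made up for the example, not taken from the notes.

```python
from collections import Counter

def dot_diagram(values, step=0.1):
    """Return a text dot diagram: one 'o' per observation, stacked
    above an axis whose columns are spaced `step` apart."""
    counts = Counter(round(v / step) for v in values)  # column index per value
    lo, hi = min(counts), max(counts)
    height = max(counts.values())
    rows = []
    for level in range(height, 0, -1):  # draw the top row first
        rows.append("".join("o" if counts.get(k, 0) >= level else " "
                            for k in range(lo, hi + 1)))
    rows.append("-" * (hi - lo + 1))  # the axis
    return "\n".join(rows)

# Illustrative measurements only.
example_values = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]
print(dot_diagram(example_values))
```

The position of the cluster of marks shows the location, and the width of the cluster shows the scatter.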
The engineer considers an alternate design, and eight prototypes are built and their pull-off
force is measured.
$X = \mu + \epsilon$
where
$\mu$ = constant
$\epsilon$ = random disturbance.
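A minimal simulation of this kind of model, in which each observation is a constant plus a random disturbance. The notes do not specify a distribution for the disturbance, so a normal distribution with mean zero is assumed here; the function name and the numbers are illustrative.

```python
import random
from statistics import mean

def simulate_observations(constant, disturbance_sd, n, seed=None):
    """Generate n observations as constant + random disturbance.
    Assumption: the disturbance is normal with mean 0."""
    rng = random.Random(seed)
    return [constant + rng.gauss(0.0, disturbance_sd) for _ in range(n)]

# Hypothetical constant and disturbance size.
observations = simulate_observations(13.0, 0.5, 1000, seed=1)
average = mean(observations)
```

With many observations the disturbances tend to cancel, so `average` lands near the constant.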
Practice
1. Which of the following statements are true?
I. A sample survey is an example of an experimental study.
II. An observational study requires fewer resources than an experiment.
III. The best method for investigating causal relationships is an observational study.
(A) I only
(B) II only
(C) III only
(D) All of the above.
(E) None of the above.
2. Which of the following statements are true?
I. The mean of a population is denoted by x̄.
II. Sample size is never bigger than population size.
III. The population mean is a statistic.
(A) I only.
(B) II only.
(C) III only.
(D) All of the above.
(E) None of the above.
3. Hypothesis testing and estimation are both types of descriptive statistics.
(A) True
(B) False
4. A set of data organized in a participants(rows)-by-variables(columns) format is known as a
data set.
(A) True
(B) False
5. A graph that uses vertical bars to represent data is called a ____.
(A) Line graph
(B) Bar graph
(C) Scatterplot
(D) Vertical graph
6. The goal of ___________ is to focus on summarizing and explaining a specific set of data.
(A) Inferential statistics
(B) Descriptive statistics
(C) None of the above
(D) All of the above
7. A _______ is a numerical characteristic of a sample and a ______ is a numerical characteristic
of a population.
(A) Sample, population
(B) Population, sample
(C) Statistic, parameter
(D) Parameter, statistic
8. A sampling distribution might be based on which of the following?
(A) Sample means
(B) Sample correlations
(C) Sample proportions
(D) All of the above
9. The car will probably cost about 16,000 dollars; this number sounds more like a(n):
(A) Point estimate
(B) Interval estimate
10. The use of the laws of probability to make inferences and draw statistical conclusions about
populations based on sample data is referred to as ___________.
(A) Descriptive statistics
(B) Inferential statistics
(C) Sample statistics
(D) Population statistics
11. Which of the following are principles of questionnaire construction?
(A) Consider using multiple methods when measuring abstract constructs
(B) Use multiple items to measure abstract constructs
(C) Avoid double-barreled questions
(D) All of the above
(E) Only B and C
12. Which of these is not a method of data collection?
(A) Questionnaires
(B) Interviews
(C) Experiments
(D) Observations
13. Secondary/existing data may include which of the following?
(A) Official documents
(B) Personal documents
(C) Archived research data
(D) All of the above
14. Which of the following terms best describes data that were originally collected at an earlier
time by a different person for a different purpose?
(A) Primary data
(B) Secondary data
(C) Experimental data
(D) Field notes
15. Researchers use both open-ended and closed-ended questions to collect data. Which of the
following statements is true?
(A) Open-ended questions directly provide quantitative data based on the researcher's
predetermined response categories
(B) Closed-ended questions provide quantitative data in the participants' own words
(C) Open-ended questions provide qualitative data in the participants' own words
(D) Closed-ended questions directly provide qualitative data in the participants' own words
16. Open-ended questions provide primarily ______ data.
(A) Confirmatory data
(B) Qualitative data
(C) Predictive data
(D) None of the above
17. Which of the following is true concerning observation?
(A) It takes less time than self-report approaches
(B) It costs less money than self-report approaches
(C) It is often not possible to determine exactly why the people behave as they do
(D) All of the above
18. Qualitative observation is usually done for exploratory purposes; it is also called
___________ observation.
(A) Structured
(B) Naturalistic
(C) Complete
(D) Probed
19. Another name for a Likert Scale is a(n):
(A) Interview protocol
(B) Event sampling
(C) Summated rating scale
(D) Ranking
20. Which of the following is not one of the six major methods of data collection that are used by
educational researchers?
(A) Observation
(B) Interviews
(C) Questionnaires
(D) Checklists
21. The type of interview in which the specific topics are decided in advance but the sequence
and wording can be modified during the interview is called:
(A) The interview guide approach
(B) The informal conversational interview
(C) A closed quantitative interview
(D) The standardized open-ended interview
22. Which one of the following is not a major method of data collection?
(A) Questionnaires
(B) Interviews
(C) Secondary data
(D) Focus groups
(E) All of the above are methods of data collection
23. A census taker often collects data through which of the following?
(A) Standardized tests
(B) Interviews
(C) Secondary data
(D) Observations
24. The researcher has secretly placed him or herself (as a member) in the group that is being
studied. This researcher may be which of the following?
(A) A complete participant
(B) An observer-as-participant
(C) A participant-as-observer
(D) None of the above
25. Which of the following is not a major method of data collection?
(A) Questionnaires
(B) Focus groups
(C) Correlational method
(D) Secondary data