AHE 501
Research Proposal
experiment. The sampling method and size will be appropriate for this exploratory pilot study.
However, to increase internal and external validity, a larger sample size will be required.
The Treatment: The selected students (the treatment group) spend a period of time with the
instructor, separated from the control group, playing the game. The instructor is to teach and
run the game, but not offer unsolicited instruction regarding subject-matter related to the
section. However, if players of the game specifically solicit the instructor for topical
information, she may discuss it. This is to ensure that content is taught only as it
arises in the context of gameplay, rather than through any planned instruction.
The Control: The selected control group will have access to typical instruction in the course.
The course is, by standard, a flipped model with independent study, activities, and
workshopping during classroom hours. The control group will be using the time allotted for the
experiment to exit class early (20 minutes) and study independently.
The Pre-Assessment and Post-Assessment:
1. What is an hypothesis?
2. What does it mean to test an hypothesis?
3. List the steps in the hypothesis testing process:
4. On a scale from 1 to 5 (with 5 being the most) how confident are you in your conceptual
understanding of statistical hypothesis testing?
5. On a scale from 1 to 5 (with 5 being the most) how confident are you in your ability to apply
the statistical hypothesis testing procedure?
Treatment Group Post-Assessment Extra Questions:
(Since this is an exploratory study, this additional information will only be used to
inform future testing instruments and treatment design.)
6. Did the experience of playing the game make you feel more confident or less confident
regarding hypothesis testing?
7. What are a few words or phrases that best reflect your experience in the experiment?
Sectional Summative Assessment from Instructor: This assessment will be a scored
evaluation by the instructor as per a typical instruction schedule. It will specifically assess
content knowledge related to hypothesis testing in all students within the course.
Data Collection: All of the assessments, pre and post, and the scores from the summative
content evaluation will be gathered. Also, beyond the exploratory study, the test must show
equivalence and stability by being applied by multiple instructors over multiple instructional
quarters, enhancing reliability.
Data analysis: The qualitative and quantitative data will be analyzed in an exploratory fashion
to observe any possible correlations, or lack of correlations, between playing the game and
learning outcomes in students. The research design contains more than one experimental
design. One is a test-retest design of content confidence; it will be analyzed via a scatter plot
of both groups' results as well as a two-tailed t-test (and box-and-whisker plots) to minimize
the effects of Lord's Paradox. The other is a simple treatment/control post-test design that can
be analyzed with its own two-tailed t-test (and box-and-whisker plot); this is in regard to the
actual content knowledge evaluation. Triangulation of the analyses will provide the most valid
results.
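The post-test comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the proposal's analysis script: it assumes Welch's unequal-variance form of the two-tailed t-test (the proposal does not say which form will be used), the function names are hypothetical, and the scores are invented placeholders rather than study data.

```python
import math
from statistics import mean, variance

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # area under the density between 0 and |t|
    return max(0.0, 1 - 2 * central)

def welch_t_test(x, y):
    """Return (t, df, p) for a two-tailed Welch t-test of mean(x) vs mean(y)."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df, two_tailed_p(t, df)

# Hypothetical summative scores, invented for illustration only.
treatment_scores = [78, 85, 90, 72, 88, 81]
control_scores = [70, 75, 80, 68, 77, 74]
t_stat, dof, p_value = welch_t_test(treatment_scores, control_scores)
```

With groups this small, the p-value is best read alongside the box-and-whisker plots rather than on its own.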
Rationale
Research Design
The research design was shaped by the need to work with a small sample size. It
was determined that it would be a pretest-posttest/posttest-only hybrid as part of a larger
causal-comparative design, selecting two groups that differ on some variable of interest and
comparing them on some dependent variable (GMA, p.231). Confidence levels in the instructed
content would be examined in a pretest-posttest format. Actual content knowledge would be
assessed by a typical learning evaluation scheduled for the end of the section. Comparison
groups were established with one acting as the experimental group, or treatment group, and the
other acting as a control (GMA, p.231). It was expected that the presence of a control would
control for all threats to internal validity except mortality (GMA, p.269). Mortality was not a
concern in this experiment as the entire study was to last only one week, the actual treatment
period for the experimental group was 20 minutes, and otherwise, attendance of the course
would be typical.
Sampling Method
Simple Random Sampling was utilized to ensure that every individual has the same probability
of being selected, and selection of one individual in no way affects selection of another
individual (GMA, p.131). Stratified Sampling may have primed attitudes due to the small
population size and also could not account for all of the variables possible, so it was not used for
this iteration of the experiment. It was determined that random sampling would obtain the most
representative sample (GMA, p.131). This leads to stronger validity but lower reliability, as the
experiment would still need to be repeated.
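As an illustration of the selection procedure, the sketch below draws a treatment group by simple random sampling without replacement. It rests on assumptions not stated in the proposal: that every student not drawn joins the control group, that the roster holds anonymized IDs, and that a fixed seed is acceptable for auditing the draw. The function name, roster, and group size are hypothetical.

```python
import random

def draw_treatment_group(roster, k, seed=None):
    """Draw k students by simple random sampling without replacement.

    Returns (treatment, control). Assumption: everyone not drawn
    is placed in the control group.
    """
    rng = random.Random(seed)          # seeded only so the draw can be reproduced
    treatment = rng.sample(roster, k)  # every subset of size k is equally likely
    chosen = set(treatment)
    control = [s for s in roster if s not in chosen]
    return treatment, control

# Hypothetical roster of anonymized student IDs (not data from the study).
roster = [f"S{i:02d}" for i in range(1, 13)]
treatment, control = draw_treatment_group(roster, k=6, seed=501)
```

Because `random.sample` treats every subset of size k as equally likely, each student has the same chance of selection and one student's selection does not favor or disfavor any particular other student.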
Measurement Instruments
The pretest-posttest assessment was identical aside from two extra questions issued the
experimental group to glean qualitative information about the treatment itself, relevant only to
future experiment design. The assessment's primary questions were a combination of
qualitative achievement measurements of proficiency and numeric rating scales of confidence
levels (GMA, pp.155-157). The content-knowledge posttest evaluation was part of the normal
course curriculum and was outside of the researcher's control. It was administered to all
students identically.
Data Analysis
The examination of data will most likely occur through exploratory data analysis using Fathom
statistical software. However, it will follow a general path of correlational study leading to
causal-comparative study where variables that are highly related may be examined (GMA, p.204).
This will involve scatter plots of the pretest-posttest confidence-level data and observations of
correlation strengths. Then, box-and-whisker plots will be created examining possible
significance at multiple significance levels. Two-tailed t-tests also will be conducted. If no
statistical significance is found, then the research hypothesis will be treated as unsupported
in this iteration of the experiment.
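The numbers behind the scatter plots and box-and-whisker plots can be sketched as follows. This is a rough Python stand-in for what Fathom would compute, not a Fathom script; the function names are hypothetical and the ratings are invented placeholders on the 1-to-5 confidence scale.

```python
import math
from statistics import mean, quantiles

def pearson_r(x, y):
    """Pearson correlation between paired pretest and posttest ratings.

    Raises ZeroDivisionError if either list has no variation.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def five_number_summary(values):
    """Minimum, quartiles, and maximum: the values a box-and-whisker plot draws."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

# Hypothetical confidence ratings (1 to 5), invented for illustration only.
pre_confidence = [2, 3, 1, 4, 2, 3]
post_confidence = [3, 4, 2, 4, 3, 5]

r = pearson_r(pre_confidence, post_confidence)
pre_box = five_number_summary(pre_confidence)
post_box = five_number_summary(post_confidence)
```

Quartile conventions differ between software packages, so the "inclusive" method used here is an assumption and may not match Fathom's output exactly.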
A fuller exploratory analysis will be completed in order to avoid Lord's Paradox, in
which Frederic Lord saw differing results from different analysts using different but equally
sound methods, leading him to claim "there simply is no logical or statistical procedure that can
be counted on to make proper allowances for uncontrolled preexisting differences between
groups" (Lord, 1967). To improve validity, the data must be analyzed through many different
techniques to see if it retains internal consistency. Because our sample size is small and
diverse, it must be observed in this light.
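A small worked example may make the paradox concrete. The sketch below contrasts the two analyses in Lord's framing, a comparison of mean gain scores and a covariance-adjusted comparison, on a tiny dataset constructed so that the two disagree. The scores are hypothetical and chosen purely for illustration, and the function names are not from any library.

```python
from statistics import mean

def gain_score_difference(pre_a, post_a, pre_b, post_b):
    """Difference between the groups' mean gains (post minus pre)."""
    gain_a = mean([q - p for p, q in zip(pre_a, post_a)])
    gain_b = mean([q - p for p, q in zip(pre_b, post_b)])
    return gain_b - gain_a

def adjusted_difference(pre_a, post_a, pre_b, post_b):
    """Group difference in post scores after covariance adjustment for pre scores."""
    sxy = sxx = 0.0
    for pre, post in ((pre_a, post_a), (pre_b, post_b)):
        mp, mq = mean(pre), mean(post)
        sxy += sum((p - mp) * (q - mq) for p, q in zip(pre, post))
        sxx += sum((p - mp) ** 2 for p in pre)
    slope = sxy / sxx  # pooled within-group slope of post on pre
    return (mean(post_b) - mean(post_a)) - slope * (mean(pre_b) - mean(pre_a))

# Hypothetical scores: the groups start at different levels and
# neither group's mean changes between pretest and posttest.
pre_a, post_a = [1, 3, 5], [2, 3, 4]
pre_b, post_b = [3, 5, 7], [4, 5, 6]

by_gain = gain_score_difference(pre_a, post_a, pre_b, post_b)      # 0
by_adjustment = adjusted_difference(pre_a, post_a, pre_b, post_b)  # 1.0
```

The gain-score analysis reports no group difference, while the adjusted analysis reports a difference of one point in favor of group B. Both calculations are correct for the question each one asks, which is why several techniques need to be compared before drawing a conclusion from groups that differ at pretest.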
Future Considerations
- A large sample size is very important to this study. In the future, the same instructor will be
able to teach two face-to-face classes in the same subject in the same quarter and full
classes will act as comparison groups.
- The treatment should be applied as a whole-group experience, so that the instructor can
make callbacks to it openly in class, for best treatment outcome.
- There will need to be an examination of whether or not actual content-knowledge
improvement within the single instructional section is the goal of the treatment. If increasing
student confidence or enjoyment of classroom participation is the goal, then the measurement
must be altered to observe those areas more specifically.
References
Gay, L. R., Mills, G. E., & Airasian, P. W. (2011). Educational Research: Competencies for
Analysis and Applications (10th ed.). Boston: Pearson.
Lord, F. M. (1967). A Paradox in the Interpretation of Group Comparisons. Psychological
Bulletin, 68(5), 304-305. http://doi.org/10.1037/h0025105