With the NBME releasing a large batch of scores on July 9th, 2014, I thought it would be a good idea to gather
some information about study habits and resources used, and how they might correlate with score. I received 291
responses, 272 of which were valid for use. This group self-selected for high performers, so the average was
a fair bit higher than the mean (228) reported by the NBME. In my opinion, however, this perhaps makes any
variations more important than they would otherwise be, as scoring higher on average than other high
performers can be looked at with favor.
As far as conclusions that can be drawn from this dataset, no method or resource resulted in statistically
significantly higher scores. Many people used the same resources and scored wildly differently, most likely
depending on the specific Step 1 questions they received during the actual test and on individual ability.
However, several interesting observations can be drawn from the analysis.
If you are not interested in looking at the analysis for each section, my conclusions from this project
are as follows: use Uworld, FA, and Pathoma, and take at least UWSA 1 and 2 along with NBME 15 and 16.
Other than that, using a resource is in general better than not using it, with the exception of DIT.
Firecracker and Kaplan also seem to be a bit above average if you are looking for more resources. Practice
tests, especially the one taken closest to your test date, are a decent approximation of your score, usually
within 5-10 points, with the actual score tending to be higher or lower depending on how high the practice
test score was (a practice score >250 tends to overpredict; <240 tends to underpredict).
[Chart labels: UWSA 1, NBME 7, NBME 11, NBME 12, NBME 13, NBME 15, NBME 16]
Which practice test that you took did you feel was closest to the real thing?
[Chart labels: UWSA 1, UWSA 1, NBME 7, NBME 11, NBME 12, NBME 13, NBME 15, NBME 16]
Analysis: (please note that the second UWSA1 should be UWSA2) The Uworld Self Assessments and the
newer NBMEs were by far the most popular, and together those four tests represent what 75% of responders
thought was the most like the real test. The average number of practice tests taken was 3.75. As discussed later,
people who took more practice tests tended to do better than those who took fewer, so if you had to choose
which ones to take, those four are probably good options.
[Chart labels: DIT, Becker, USMLE Rx, Other]
Analysis: Especially given the popularity of the so-called UFAP method on this site, it is not surprising that
Uworld, First Aid, and Pathoma account for the overwhelming majority of resources used. Under the "Other"
option, significant responses were: BRS = 11, Goljan = 27, Kaplan Qbank = 9, Kaplan Videos = 5, Picmonic = 11,
Firecracker = 13. These resources were included in the analysis below. DIT and USMLE Rx both had a high
number of users.
Did you take Step 1 before or after having done some clinical rotations?
[Chart labels: before, after]
Analysis: Only 6% of responders took Step 1 after having some clinical rotations. As seen below, this does not
overly affect their score.
[Chart: AVG and SD by group, y-axis 50 to 300]

Group            AVG      SD       Number   t score to overall   t score to next row
Overall          243.1    19.055   272      -                    -
Practice         241.74   18.86    244      0.813819045          -
Pathoma          244.19   17.60    236      -0.66981054          1.835595545
NOT Pathoma      235.97   25.60    35       1.591265886          -
Becker           244      6        2        -0.204678175         -
UsmleRx          248.93   14.72    80       -2.899320532         3.773475422
NOT Usmle Rx     240.65   20.11    192      1.320718476          -
DIT              239.95   17.75    80       1.371747477          -1.847790329
NOT DIT          244.44   19.43    192      -0.737514284         -
Firecracker      253.25   14.74    11       -2.210363923         -
Goljan           246.69   15.87    22       -1.004104802         -
Kaplan           249.12   19.49    18       -1.270872843         -
Picmonic         242.88   5.35     -        0.094464726          -
NBME 16          245.16   18.68    127      -1.019542308         1.679820103
NOT NBME16       241.29   19.2     144      0.917128887          -
NBME 15          244.75   19.23    152      -0.850047063         1.980929385
NOT nbme15       238.71   28.58    119      1.533156945          -
>4 practice      253.81   11.96    73       -5.900692285         7.436889513
</=4 practice    239.12   19.66    199      2.198529227          -
0 practice       227      35.26    10       1.436231602          -3.235272961
10+ practice     264.33   7.27     6        -6.665786838         -
after clinicals  241.93   21.8     15       0.203616667          -
Analysis: There is no statistically significant difference between any specific study methods, either compared to
the overall mean or between using a resource and not using it. This is most likely due to the very wide
standard deviations present. For example, the average score for people taking 10 or more practice tests was
far higher than that of people who took zero; however, the sample size at both of those extremes was small (6
and 10 respectively), and the standard deviation was particularly large for the zero-test group. As a specific
example, one person scored a 262 without having taken any practice tests.
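
For readers who want to check the comparisons themselves, here is a minimal sketch of an unpooled (Welch-style) two-sample t statistic computed from summary figures alone. The function name `welch_t` is hypothetical, and whether this is the exact formula behind the reported t scores is an assumption; it is offered only because it appears to come out close to the listed values when fed the AVG, SD, and Number figures from the data (overall: 243.1, 19.055, 272; USMLE Rx: 248.93, 14.72, 80; zero practice tests: 227, 35.26, 10; 10+ practice tests: 264.33, 7.27, 6).

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Unpooled two-sample t statistic for group A versus group B.

    Hypothetical helper: it works from summary statistics only
    (mean, standard deviation, group size), not from raw scores.
    """
    # Each group's variance is divided by its own size, so a small
    # group with a wide spread inflates the standard error the most.
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b) / standard_error


# (AVG, SD, Number) triples taken from the survey figures.
overall = (243.1, 19.055, 272)
usmle_rx = (248.93, 14.72, 80)
zero_tests = (227.0, 35.26, 10)
ten_plus_tests = (264.33, 7.27, 6)

# Overall versus USMLE Rx users, and zero versus 10+ practice tests.
t_rx = welch_t(*overall, *usmle_rx)
t_extremes = welch_t(*zero_tests, *ten_plus_tests)
print(round(t_rx, 3), round(t_extremes, 3))
```

Note that comparing a subgroup against the overall sample means the two groups overlap, so the statistic is only a rough guide. How a given t score translates into significance also depends on the test and threshold chosen, which are not stated here.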
Things to note:
- Picmonic was the closest to having a statistically significant difference in averages, though the average
  score of its users was not very different from the overall average (still better than the NBME-reported
  average, though).
- People who took large numbers of practice tests tended to do better than those who did not, and those who
  took more than the average (3.7 tests) tended to do better than those who took fewer.
- DIT was the only resource whose users scored LOWER on average than its non-users.
- Taking Step 1 after having clinical experience does not appear to be a large benefit, and is perhaps a
  slight detriment.
- First Aid might not be necessary if you have a good plan otherwise, as the average for those who did not
  use it was higher and had a smaller standard deviation than the overall.
- Only 3 people reported not using Uworld, and their average was 210, so Uworld is probably a good idea.