
Exercise 2

Download and open the SPSS file entitled Seminar1_SU.sav


Here, you will find part of the data from a PhD project (i4Culture)
on substance use among non-western immigrant youth in the
Netherlands. Please treat the data as confidential: you may only use
them in this course to practice some skills.
First, browse through the variables (variable view) included in the file
to get an impression of the kind of data you have available.
Questions:
a) What is the 12-month prevalence of cannabis use in this
sample? (tip: use the variable labeled CAU_year). Describe in
your own words what this means.
b) For those who used cannabis in the last year, how often did they
use cannabis on average and what is the standard deviation?
c) What is the average age of cannabis use onset?
d) Make a bar chart of age of cannabis use onset. Can you tell from
the graph what the median age is? What kind of information
does the bar chart give you? Please explain in your own words
what you see.
e) What is the last month prevalence of cannabis use in this
sample? Describe in your own words what this means.
f) What is the 12-month and last month prevalence of cannabis
use for boys and for girls, separately?
g) Is the 12-month prevalence of cannabis use different for children
with parents with a low level of education as compared to
children with parents with a high level of education?
h) You will notice a lot of missing values. Can you find a reason for
this large number of missing values? Could you think of a better
way of dealing with them?

ANSWERS
a)
FREQUENCIES VARIABLES=CAU_year
/ORDER=ANALYSIS.

last 12 months use
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     never                  48      10,6            23,4                 23,4
          once or more          157      34,6            76,6                100,0
          Total                 205      45,2           100,0
Missing   System                249      54,8
Total                           454     100,0

Look at the Percent column, whose denominator also includes the cases
that answered "never used" on the earlier question (ca0: at which age
did you use cannabis for the first time): 34.6% of the total sample
(N = 454) used cannabis at least once in the last year. Of those who
reported an age of onset, and thus any cannabis use in their lives,
76.6% reported using cannabis at least once in the last year.
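The difference between the two columns comes down to the denominator. As a minimal sketch (plain Python, with the counts copied from the frequency table above; the variable names are only illustrative), the two percentages can be recomputed like this:

```python
# Counts taken from the "last 12 months use" frequency table.
never = 48
once_or_more = 157
system_missing = 249

valid_n = never + once_or_more        # respondents with a CAU_year score
total_n = valid_n + system_missing    # all respondents in the file

# "Percent" divides by the whole sample, missing cases included.
percent = 100 * once_or_more / total_n
# "Valid Percent" divides by the non-missing cases only.
valid_percent = 100 * once_or_more / valid_n

print(f"Percent: {percent:.1f}")
print(f"Valid Percent: {valid_percent:.1f}")
```

Which of the two you report depends on the population you want to describe: the whole sample, or only those who have ever used cannabis.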
b) First, select only the cases that have a score of 1 on the CAU_year
variable (either with the select cases option or with split file). Next,
run descriptives for the variable ca2. The table shows a mean of 6.10
for this group (SD = 3.46). Note that this is the mean SCORE of 6.10
(see variable coding: 6 = 5-6 times), with a band of one SD around it
running from 2.64 (between 1-2 times) to 9.56 (between 11-19 times
and 20 times or more). This is perhaps not how you would like to report
your findings. What would be a better option?
Note the lower total number (n=151) than in the table above (n=157).
This suggests that data were missing on ca2 for 6 cases. Indeed, 6
cases have the missing-value code 999 for ca2.
DATASET ACTIVATE DataSet1.
USE ALL.
COMPUTE filter_$=(CAU_year = 1).
VARIABLE LABELS filter_$ 'CAU_year = 1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).

FILTER BY filter_$.
EXECUTE.

DESCRIPTIVES VARIABLES=ca2
/STATISTICS=MEAN STDDEV MIN MAX.

Descriptive Statistics
                                        N   Minimum   Maximum    Mean   Std. Deviation
How often have you used cannabis
in the last 12 months                 151                  11    6,10            3,464
Valid N (listwise)                    151
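The band of one standard deviation around the mean score quoted in b) is simple arithmetic on the two reported statistics. A small sketch in plain Python, using the rounded values from the text (the variable names are illustrative):

```python
# Mean and standard deviation of the ca2 score, as reported above.
mean_score = 6.10
sd_score = 3.46

# One standard deviation below and above the mean score.
lower = round(mean_score - sd_score, 2)
upper = round(mean_score + sd_score, 2)

print(lower, upper)
```

Keep in mind that these are category codes, not numbers of occasions, so the band runs across answer categories of unequal width.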

c) Again using the filtered cases (those who used cannabis at least once in the last year):

Descriptive Statistics
                                        N   Minimum   Maximum    Mean   Std. Deviation
At which age did you use cannabis
for the first time?                   157                  17   10,57
Valid N (listwise)                    157

This yields a mean SCORE of 10.57, which represents an age of 15-16
years. Here you would have preferred age as a continuous score
rather than a categorical one.
d) Here, use all cases again; this gives you more information on all
your data.
FILTER OFF.
USE ALL.
EXECUTE.
FREQUENCIES VARIABLES=ca0
/BARCHART FREQ
/ORDER=ANALYSIS.

2,234

If you leave out the never-users (clearly the largest group), the
median is probably 15 or 16 years. Very few respondents used cannabis
before age 12, onset increases at ages 14-16, and fewer respondents
have an onset later than age 16.
e)
FREQUENCIES VARIABLES=CAU_month
/ORDER=ANALYSIS.

The table shows that 19.8% of the total study sample (n=454) used
cannabis in the last month.
last month use
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     never                 115      25,3            56,1                 56,1
          once or more           90      19,8            43,9                100,0
          Total                 205      45,2           100,0
Missing   System                249      54,8
Total                           454     100,0

f) Here, I used split file according to gender, which is a good option if you have few
categories for a variable that you want to describe separately. See the highlighted
prevalence rates. Note that this contains purely descriptive analyses and not a test of a
gender difference.
SORT CASES BY gender.
SPLIT FILE SEPARATE BY gender.
FREQUENCIES VARIABLES=CAU_year CAU_month
/ORDER=ANALYSIS.

last 12 months use (Gender = Male)
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     never                  19       8,6            19,2                 19,2
          once or more           80      36,0            80,8                100,0
          Total                  99      44,6           100,0
Missing   System                123      55,4
Total                           222     100,0

last month use (Gender = Male)
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     never                  48      21,6            48,5                 48,5
          once or more           51      23,0            51,5                100,0
          Total                  99      44,6           100,0
Missing   System                123      55,4
Total                           222     100,0

last 12 months use (Gender = Female)
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     never                  29      12,5            27,4                 27,4
          once or more           77      33,2            72,6                100,0
          Total                 106      45,7           100,0
Missing   System                126      54,3
Total                           232     100,0

last month use (Gender = Female)
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     never                  67      28,9            63,2                 63,2
          once or more           39      16,8            36,8                100,0
          Total                 106      45,7           100,0
Missing   System                126      54,3
Total                           232     100,0
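To put the gender-specific rates side by side, the counts from the split-file tables can be collected in one place. The sketch below is plain Python, not SPSS output; the dictionary layout and key names are illustrative assumptions, while the counts are copied from the tables for boys and girls.

```python
# Counts per gender: total cases, cases with valid (non-missing) scores,
# and users in the last 12 months and in the last month.
counts = {
    "Male":   {"n_total": 222, "n_valid": 99,  "year": 80, "month": 51},
    "Female": {"n_total": 232, "n_valid": 106, "year": 77, "month": 39},
}

def prevalence(users, denominator):
    """Prevalence as a percentage, rounded to one decimal like the SPSS tables."""
    return round(100 * users / denominator, 1)

summary = {}
for gender, c in counts.items():
    summary[gender] = {
        # Share of all boys or girls in the sample ("Percent").
        "year_total": prevalence(c["year"], c["n_total"]),
        "month_total": prevalence(c["month"], c["n_total"]),
        # Share of those with a valid score ("Valid Percent").
        "year_valid": prevalence(c["year"], c["n_valid"]),
        "month_valid": prevalence(c["month"], c["n_valid"]),
    }

for gender, rates in summary.items():
    print(gender, rates)
```

As in the tables, this only describes the two groups; it does not test whether they differ.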

g) Conduct a simple independent-samples t-test comparing Edulevel 1
(low) with level 3 (high). When comparing all three levels, use a
univariate ANOVA and pick the contrasts you wish to test.
T-TEST GROUPS=Edulevel(1 3)
/MISSING=ANALYSIS
/VARIABLES=ca2
/CRITERIA=CI(.95).

Group Statistics
                                     Educational
                                     level parents     N    Mean   Std. Deviation   Std. Error Mean
How often have you used cannabis     low              42    4,57            3,590              ,554
in the last 12 months                high             90    5,11            3,746              ,395

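The T-test table (not included here) shows no significant difference
between the two groups in the mean last-year cannabis use score (ca2)
(t = -0.78, p = 0.44). Note that ca2 measures how often children used
cannabis in the last year, not the last-year prevalence itself.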
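Because the T-test table itself is not reproduced, the reported t value can be checked against the Group Statistics. The sketch below is plain Python and assumes the equal-variances (pooled) form of the independent-samples t-test; the source does not say which row of the SPSS output the reported t comes from. The p-value is not recomputed, since that would need a t distribution from an extra library.

```python
import math

# Group statistics for ca2, copied from the table above.
n_low, mean_low, sd_low = 42, 4.57, 3.590      # parents with low education
n_high, mean_high, sd_high = 90, 5.11, 3.746   # parents with high education

# Sanity check: the standard errors should match the table (,554 and ,395).
se_low = sd_low / math.sqrt(n_low)
se_high = sd_high / math.sqrt(n_high)

# Pooled-variance t statistic (assumes equal variances in both groups).
df = n_low + n_high - 2
pooled_var = ((n_low - 1) * sd_low**2 + (n_high - 1) * sd_high**2) / df
se_diff = math.sqrt(pooled_var * (1 / n_low + 1 / n_high))
t_stat = (mean_low - mean_high) / se_diff

print(f"SE low = {se_low:.3f}, SE high = {se_high:.3f}")
print(f"t({df}) = {t_stat:.2f}")
```

Under this assumption the result agrees with the t value reported in the text.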
h) When the answer to the first question on cannabis use (ca0: at
which age did you use cannabis for the first time) was "never used",
the researcher did not enter a defined value for the questions that
follow. Rather, the researcher left these variables blank for those
cases. This can be confusing. Always check your data so that you know
how the information was coded and what you are reporting!
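One possible way of dealing with these blanks is to give never-users an explicit "never" score on the follow-up variables. The sketch below is plain Python with reconstructed records, not the actual data file. It assumes that all 249 system-missing cases on CAU_year are never-users, as the answer to a) suggests. The function name and the record layout are hypothetical.

```python
def fill_skipped(ever_used, cau_year):
    """Give never-users an explicit 0 ('never') instead of a blank."""
    if cau_year is None and not ever_used:
        return 0
    return cau_year

# Records rebuilt from the frequency table in a), as (ever_used, CAU_year):
# 48 ever-users without last-year use, 157 last-year users, and
# 249 blanks that are assumed here to be never-users.
records = [(True, 0)] * 48 + [(True, 1)] * 157 + [(False, None)] * 249

filled = [fill_skipped(ever_used, score) for ever_used, score in records]
valid = [score for score in filled if score is not None]

# With the blanks filled in, the valid percentage is the sample prevalence.
prevalence = 100 * sum(valid) / len(valid)
print(f"n valid = {len(valid)}, 12-month prevalence = {prevalence:.1f}%")
```

With explicit codes, Percent and Valid Percent coincide, so there is less room for picking the wrong column when reporting prevalence.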
