Last Modified January 26, 2007

Basic Descriptive Statistics Using R


In the following handout, words and symbols in bold are R functions, and words and
symbols in italics are entries supplied by the user; underlined words and symbols are
optional entries (all current as of version R-2.4.1). Sample text from an R session is
highlighted with gray shading.

Measures of Central Tendency

• mean(object) – provides the mean of the object’s elements

> quarters = c(5.683, 5.620, 5.551, 5.549, 5.536,
+ 5.552, 5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

> mean(quarters)

[1] 5.583333

• median(object) – provides the median of the object’s elements

> median(quarters)

[1] 5.552

• mode – there is no built-in function for finding an object’s mode; however, the
command table(object) creates a frequency table for the object’s elements, and the
mode is the element in this table with the greatest frequency

> table(quarters)

5.536 5.539 5.548 5.549 5.551 5.552 5.554  5.62 5.632 5.683 5.684
    1     1     1     1     1     2     1     1     1     1     1
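The lookup described above can also be done programmatically. The following is a minimal sketch, not a built-in facility; the names freq and modes are arbitrary choices made for this example. It keeps every element tied for the greatest frequency, so a multimodal object returns more than one value.

```r
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536,
              5.552, 5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Frequency table: the names are the distinct values, the entries are counts
freq <- table(quarters)

# Keep the value(s) whose count equals the largest count in the table
modes <- as.numeric(names(freq)[freq == max(freq)])
print(modes)
```

For the quarters data, the only element that appears more than once is 5.552, so that is the single value returned.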

• midrange – there is no built-in function for reporting the midrange; the command
shown below uses the functions for an object’s maximum (max) and minimum (min)
to calculate and print the object’s midrange

> midrange = (max(quarters) + min(quarters))/2; midrange

[1] 5.61


Measures of Spread

• var(object) – provides the sample variance of the object’s elements

> var(quarters)

[1] 0.003116606

• sd(object) – provides the sample standard deviation of the object’s elements

> sd(quarters)

[1] 0.05582657

• standard error of the mean – there is no built-in function for reporting the standard
error of the mean; the command shown below uses the functions for the object’s
standard deviation (sd) and number of elements (length), as well as the mathematical
function for finding a square root (sqrt), to calculate and print the object’s standard
error of the mean

> sem = sd(quarters)/sqrt(length(quarters)); sem

[1] 0.01611574

• range – R’s built-in range function returns an object’s minimum and maximum rather
than their difference; the command shown below uses the functions for an object’s
maximum (max) and minimum (min) elements to calculate and print the object’s range

> range = (max(quarters) - min(quarters)); range

[1] 0.148
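As an alternative sketch, the same value can be obtained from R's built-in range function, which returns the minimum and maximum as a two-element vector; diff then takes their difference. The name spread is an arbitrary choice, used here so that the result does not reuse the name of the built-in function.

```r
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536,
              5.552, 5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# range() returns c(minimum, maximum), not a single number
limits <- range(quarters)
print(limits)

# diff() on that two-element vector gives maximum - minimum
spread <- diff(limits)
print(spread)
```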

• IQR(object) – provides the object’s interquartile range; note – this value may differ
slightly from that provided by other programs because there is no single accepted
definition for the upper and lower fourths (FU and FL) on which it is based

> IQR(quarters)

[1] 0.07425
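To illustrate how the choice of definition changes the result, the sketch below compares two spreads for the same data: the difference between the quartiles returned by quantile (type = 7 is R's default definition) and the difference between the hinges returned by fivenum. The names q7, iqr_default, hinges, and hinge_spread are arbitrary choices for this example.

```r
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536,
              5.552, 5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Quartiles under R's default quantile definition (type 7)
q7 <- quantile(quarters, c(0.25, 0.75), type = 7)
iqr_default <- unname(diff(q7))

# Tukey's hinges: elements 2 and 4 of the five-number summary
hinges <- fivenum(quarters)[c(2, 4)]
hinge_spread <- diff(hinges)

print(iqr_default)   # agrees with IQR(quarters)
print(hinge_spread)  # slightly larger for these data
```

For the quarters data the two definitions differ in the third decimal place, which is the kind of discrepancy to expect when comparing R's output with that of another program.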


Quantitative and Visual Representations of a Distribution’s Shape

• skew(object) – provides the skewness for an object; this function is not included in R,
but is available from the file “skew&kurt.RData,” which is available on the course’s
I-drive account.

> skew(quarters)

[1] 0.8508155

• kurt(object) – provides the kurtosis for an object relative to that of a normal
distribution; this function is not included in R, but is available from the file
“skew&kurt.RData,” which is available on the course’s I-drive account.

> kurt(quarters)

[1] -1.075001
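The contents of “skew&kurt.RData” are not shown in this handout. For readers without access to that file, the sketch below gives one common moment-based definition: the mean cubed (or fourth-power) deviation divided by the corresponding power of the sample standard deviation, with 3 subtracted from the kurtosis so that a normal distribution scores zero. The function names skew_sketch and kurt_sketch are hypothetical, and the assumption that the course's functions use this definition is unconfirmed; it appears to give values close to those shown above for the quarters data, but other definitions in use elsewhere will give different numbers.

```r
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536,
              5.552, 5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Hypothetical stand-in for skew(): third central moment over sd cubed
skew_sketch <- function(x) {
  mean((x - mean(x))^3) / sd(x)^3
}

# Hypothetical stand-in for kurt(): fourth central moment over sd^4,
# minus 3 so the result is relative to a normal distribution
kurt_sketch <- function(x) {
  mean((x - mean(x))^4) / sd(x)^4 - 3
}

print(skew_sketch(quarters))
print(kurt_sketch(quarters))
```

A positive skewness indicates a longer right tail, and a negative relative kurtosis indicates a flatter distribution with lighter tails than a normal distribution.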

• hist(object) – creates a histogram of the object’s elements with the number of
compartments chosen by R.

> hist(quarters)


• boxplot(object 1, object 2…, names, horizontal = TRUE) – creates a boxplot of the
object’s elements (for multiple objects, a boxplot is drawn for each); names is a
vector containing the names of the objects, which adds labels on the x-axis (the
y-axis for a horizontal boxplot) when plotting more than one boxplot. Setting
horizontal to TRUE (the default value is FALSE) creates a horizontal boxplot.

> boxplot(quarters)
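The optional names and horizontal entries can be sketched as follows. The handout supplies only one object, so this example splits quarters into its first six and last six elements purely for illustration; the objects first_six and last_six and their labels are hypothetical and carry no meaning beyond showing the syntax.

```r
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536,
              5.552, 5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Hypothetical grouping, used only to have two objects to plot
first_six <- quarters[1:6]
last_six  <- quarters[7:12]

# One boxplot per object, labeled by names and drawn horizontally;
# boxplot() also returns the plotted statistics invisibly
bp <- boxplot(first_six, last_six,
              names = c("first six", "last six"),
              horizontal = TRUE)
```

Because horizontal is TRUE, the labels given in names appear along the vertical axis, one beside each box.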
