
Reporting with R Markdown DataCamp

DATACAMP COURSES - NOTES


Reporting with R Markdown
2018

Contents
1. Authoring R Markdown Reports
1.1. The R Markdown Interface
1.2. Explore R Markdown
1.3. Prepare your workspace for preliminary analysis
1.4. Styling narrative sections
1.5. Lists
1.6. LaTeX Equations
2. Embedding R Code
2.1. R Code Chunks
2.2. Customize R code chunks
2.3. Inline R Code
2.4. Labeling and Reusing code chunks
3. Compiling Reports
3.1. Alternative output formats
3.2. Creating Slideshows
3.3. Specify knitr and pandoc options
3.4. Recurring to a CSS File to manage Style
3.5. Interactive Reports with Shiny
3.6. Interactive ggvis graphics


1. Authoring R Markdown Reports


1.1. The R Markdown Interface
For this course, DataCamp has developed a new kind of interface that looks
like the R Markdown pane in RStudio. You have a space (my_document.Rmd)
to write R Markdown documents, as well as the buttons to compile the R
Markdown document. To keep things simple, we'll stick with making html and
pdf documents, although it is also possible to create Microsoft Word
documents with R Markdown.

When you click "Knit HTML", DataCamp will compile your R Markdown
document and display the finished, formatted results in a new pane.

To give you a taste of the things you'll learn in this course, we've prepared two
documents in the editor on the right:

• my_document.Rmd, containing the actual R Markdown code;
• faded.css, a supplementary file that brands your report.

INSTRUCTIONS

• Change the title of the Markdown document from "Ozone" to "Hello R Markdown".
• Click the "Knit HTML" button to see the compiled version of your sample code.

MY_DOCUMENT.RMD
---
title: "Hello R Markdown"
output:
  html_document:
    css: faded.css
---

## Data

The `atmos` data set resides in the `nasaweather` package of the *R* programming language.
It contains a collection of atmospheric variables measured between 1995 and 2000 on
a grid of 576 coordinates in the western hemisphere. The data set comes from the
[2006 ASA Data Expo](http://stat-computing.org/dataexpo/2006/).

Some of the variables in the `atmos` data set are:

* **temp** - The mean monthly air temperature near the surface of the Earth (measured in
kelvins (*K*))

* **pressure** - The mean monthly air pressure at the surface of the Earth (measured in
millibars (*mb*))

* **ozone** - The mean monthly abundance of atmospheric ozone (measured in Dobson units
(*DU*))

You can convert the temperature unit from Kelvin to Celsius with the formula

$$ celsius = kelvin - 273.15 $$

And you can convert the result to Fahrenheit with the formula

$$ fahrenheit = celsius \times \frac{9}{5} + 32 $$

```{r, echo = FALSE, results = 'hide'}
example_kelvin <- 282.15
```

For example, `r example_kelvin` degrees Kelvin corresponds to `r example_kelvin - 273.15` degrees Celsius.

FADED.CSS

h1 {
  color: white;
  padding: 10px;
  background-color: #3399ff;
}

ul {
  list-style-type: square;
}

.MathJax_Display {
  padding: 0.5em;
  background-color: #eaeff3;
}

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/hojbkhifbyqxpxolah.html

1.2. Explore R Markdown


The document to the right is a template R Markdown document. It includes the
most familiar parts of an R Markdown document:

• A YAML header that contains some metadata
• Narrative text written in Markdown
• R code chunks surrounded by ```{r} and ```, a syntax that comes from
  the knitr package

Click the 'Knit HTML' button and compare the document to its compiled form.
Then:

• Change the title of the document to "Hello World".
• Change the author of the document to your own name.
• Rewrite the first sentence of the document to say "This is my first R
  Markdown document.".
• Replace the cars data set with the mtcars data set in both code blocks.
• Recompile the document; can you see your changes?

---
title: "Hello World"
author: "Miguel Prada"
date: "January 1, 2015"
output: html_document
---

This is my first R Markdown document. Markdown is a simple formatting syntax for authoring
HTML, PDF, and MS Word documents. For more details on using R Markdown see
<http://rmarkdown.rstudio.com>.

When you click the **Knit** button a document will be generated that includes both content
as well as the output of any embedded R code chunks within the document. You can
embed an R code chunk like this:

```{r}
summary(mtcars)
```

You can also embed plots, for example:

```{r, echo=FALSE}
plot(mtcars)
```

Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of
the R code that generated the plot.

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/rjjplyctzjsvomnwhgfa.html

1.3. Prepare your workspace for preliminary analysis


During this course, we will examine a data set that comes in
the nasaweather package. The data set is called atmos, and it contains
meteorological data about the western hemisphere.

We'll also use the dplyr package to manipulate our data and
the ggvis package to visualize it.

For the next set of exercises, you will use the traditional DataCamp interface:
you have an editor where you can write and submit R code, as well as a
console where you can experiment with R code without doing a formal
submission.

• Load the nasaweather, dplyr, and ggvis packages. These packages
  have already been installed in the DataCamp R session.
• After submitting the correct code, open the help page for the atmos data
  set by executing ?atmos in the console. Before proceeding to the next
  exercise, read the help page to familiarize yourself with the data.

# Load the nasaweather package
library(nasaweather)

# Load the dplyr package
library(dplyr)

# Load the ggvis package
library(ggvis)

We will use some of the data in atmos to explore the relationship between
ozone and temperature. But before we do, let's transform the data into a more
useful form.

The sample code uses dplyr functions to aggregate the data. It computes the
mean value of temp, pressure, ozone, cloudlow, cloudmid, and cloudhigh for
each latitude/longitude grid point.

You can learn more about dplyr in DataCamp's dplyr course.

Don't be thrown by the pipe operator (%>%) from the magrittr package, which is
often used together with dplyr verbs. It passes the result of one function call
on to the next, so you can chain several operations without saving
intermediate results.
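
To make the chaining concrete, here is a minimal sketch that writes the same computation twice, once with nested calls and once with the pipe. It uses the built-in mtcars data set rather than atmos so that it runs without nasaweather; the object names nested and piped are only for illustration.

```r
library(dplyr)

# Nested form: has to be read from the inside out
nested <- summarize(group_by(filter(mtcars, mpg > 20), cyl),
                    mean_wt = mean(wt))

# Piped form: reads from top to bottom, no intermediate objects needed
piped <- mtcars %>%
  filter(mpg > 20) %>%
  group_by(cyl) %>%
  summarize(mean_wt = mean(wt))

# Check that both forms give the same result
all.equal(nested, piped)
```

The piped form is usually easier to extend: adding another step means adding another line rather than wrapping another call around the whole expression.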

• Set the year variable to 1995. This will cause the code to retain just
  observations from the year 1995.
• At the end of the sample code, add a command to print the
  resulting means data frame and examine its output.

# The nasaweather and dplyr packages are available in the workspace

# Set the year variable to 1995
year <- 1995

means <- atmos %>%
  # !! forces the year variable set above; a bare `year == year` would
  # compare the year column with itself and keep every row
  filter(year == !!year) %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()

# Inspect the means variable
means
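
One subtlety is worth checking whenever a variable shares its name with a column, as year does here: inside filter(), a bare name is looked up among the data frame's columns first. The sketch below uses a small made-up data frame (df and its values are invented for illustration, not taken from atmos) to show the difference, with the !! operator forcing the outside value.

```r
library(dplyr)

# Toy data, invented for this illustration
df <- data.frame(year = c(1995, 1995, 1996, 2000),
                 temp = c(290, 295, 300, 305))
year <- 1995

# Both sides refer to the column, so the condition is always TRUE
nrow(filter(df, year == year))    # 4 rows: nothing was filtered out

# !! inserts the value of the outside variable before filtering
nrow(filter(df, year == !!year))  # 2 rows: only the 1995 observations
```

Giving the variable a name that is not a column name (for example target_year, a hypothetical name) avoids the ambiguity altogether.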

Page 5
Reporting with R Markdown DataCamp

The sample code on the right uses ggvis functions to visualize the data. It
displays a plot of pressure vs. ozone.

We'll use ggvis to create several graphs for our R Markdown reports.

You can learn more about ggvis in DataCamp's ggvis course.

• Run the code and take a look at the graph that it makes. See how
  straightforward it is to plot the data from the previous exercise?
• Change the code to plot the temp variable vs the ozone variable, both in
  the means data set. We will write an R Markdown report that analyzes
  the relationship between temp and ozone.

# The nasaweather, dplyr and ggvis packages are loaded in the workspace.

# Code for the previous exercise - do not change this
means <- atmos %>%
  filter(year == 1995) %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()

# Change the code to plot the temp variable vs the ozone variable
means %>%
  ggvis(x = ~temp, y = ~ozone) %>%
  layer_points()

We've now loaded data, cleaned it, and visualized it. Our analysis will have
one more component: a model.

The code on the right creates a linear model that predicts ozone based
on pressure and cloudlow; all three are variables of the means data frame you
created earlier.

You can learn more about building models with R in DataCamp's Introduction
to Statistics course.

• Change the model so that it predicts ozone based on temp and nothing
  else.
• Generate a summary of the model using the summary() function. Can you
  interpret the results? Test yourself by looking for the model's estimates
  for the intercept and temp coefficients, as well as the p-value associated
  with each coefficient and the model's overall Adjusted R-squared.

# The nasaweather and dplyr packages are already at your disposal
means <- atmos %>%
  filter(year == 1995) %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()

# Change the model: base prediction only on temp
mod <- lm(ozone ~ temp, data = means)

# Generate a model summary and interpret the results
summary(mod)
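
If you would rather pull the individual numbers out of the summary than read them off the printout, a sketch along these lines may help. It fits a model on the built-in mtcars data as a stand-in for means, so the object names fit, fit_summary and coefs are illustrative only.

```r
# Fit a simple model on built-in data (a stand-in for the means data frame)
fit <- lm(mpg ~ wt, data = mtcars)
fit_summary <- summary(fit)

# Coefficient table: one row per term, with estimates and p-values as columns
coefs <- coef(fit_summary)
coefs[, "Estimate"]   # estimates for the intercept and the slope
coefs[, "Pr(>|t|)"]   # p-value associated with each coefficient

# Overall adjusted R-squared of the model
fit_summary$adj.r.squared
```

The same calls should work on mod from the exercise, with the temp row taking the place of wt.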

1.4. Styling narrative sections


You can use Markdown to embed formatting instructions into your text. For
example, you can make a word italicized by surrounding it in
asterisks, bold by surrounding it in two asterisks, and monospaced (like code)
by surrounding it in backticks:

*italics*
**bold**
`code`

You can turn a word into a link by surrounding it in square brackets and then
placing the URL behind it in parentheses. Include the scheme (https://) so the
link is not treated as a relative path:

[RStudio](https://www.rstudio.com)

To create titles and headers, use leading hashtags. The number of hashtags
determines the header's level:

# First level header
## Second level header
### Third level header

The paragraph to the right describes the data that we'll use in our report.

• Turn the line that begins with "Data" into a second level header.
• Change the words atmos and nasaweather into a monospaced font
  suitable for code snippets.
• Make the letter R italicized.
• Change "2006 ASA Data Expo" to a link that points to
  http://stat-computing.org/dataexpo/2006/

The paragraph to the right describes the data that you'll use in your report. Try
rendering it both before and after you make the changes below.

## Data

The `atmos` data set resides in the `nasaweather` package of the *R* programming language.
It contains a collection of atmospheric variables measured between 1995 and 2000 on
a grid of 576 coordinates in the western hemisphere. The data set comes from the
[2006 ASA Data Expo](http://stat-computing.org/dataexpo/2006/).

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/iefsfcpdhhukkstivyv.html

1.5. Lists
To make a bulleted list in Markdown, place each item on a new line after an
asterisk and a space, like this:

* item 1
* item 2
* item 3

You can make an ordered list by placing each item on a new line after a
number followed by a period followed by a space, like this

1. item 1
2. item 2
3. item 3

In each case, you need to place a blank line between the list and any
paragraphs that come before it.

We've added some text to your description on the right.

• Turn the text into a bulleted list with three bullets. Temp, pressure, and ozone
  should each get their own entry.
• Make temp, pressure, and ozone bold at the start of each entry.
• Make K, mb, and DU italicized at the end of each entry.

Then render your results to see the final format.

Some of the variables in the `atmos` data set are:

* **temp** - The mean monthly air temperature near the surface of the Earth (measured in
kelvins (*K*))
* **pressure** - The mean monthly air pressure at the surface of the Earth (measured in
millibars (*mb*))
* **ozone** - The mean monthly abundance of atmospheric ozone (measured in Dobson units
(*DU*))

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/vcrcykctrbswsxrnteyv.html

1.6. LaTeX Equations


You can also use the Markdown syntax to embed LaTeX math equations into
your reports. To embed an equation in its own centered equation block,
surround the equation with two pairs of dollar signs, like this:

$$1 + 1 = 2$$

To embed an equation inline, surround it with a single pair of dollar signs, like
this: $1 + 1 = 2$.

You can use all of the standard LaTeX math symbols to create attractive
equations.

The text on the right contains a formula that converts degrees Celsius to degrees
Fahrenheit. Where the comment is, write another formula that converts degrees Kelvin to
degrees Celsius. You can convert any temperature in degrees Kelvin to a temperature in
degrees Celsius by subtracting 273.15 from it. Do not capitalize Kelvin or Celsius when
writing the formula. Then render your results to see the final format.

You can convert the temperature unit from Kelvin to Celsius with the formula

$$ celsius = kelvin - 273.15 $$

And you can convert the result to Fahrenheit with the formula

$$ fahrenheit = celsius \times \frac{9}{5} + 32 $$

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/tohurdhnfpwlaquefcr.html
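
As a quick numerical check of the two formulas, the following sketch applies them in R to 282.15, the example temperature used earlier in these notes. The function names kelvin_to_celsius() and celsius_to_fahrenheit() are made up for this illustration.

```r
# celsius = kelvin - 273.15
kelvin_to_celsius <- function(kelvin) kelvin - 273.15

# fahrenheit = celsius * 9 / 5 + 32
celsius_to_fahrenheit <- function(celsius) celsius * 9 / 5 + 32

celsius <- kelvin_to_celsius(282.15)
fahrenheit <- celsius_to_fahrenheit(celsius)

celsius     # about 9
fahrenheit  # about 48.2
```

Because of floating point arithmetic the printed values may differ from 9 and 48.2 in the last digits, which is why comparisons are better made with all.equal() than with ==.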

2. Embedding R Code
2.1. R Code Chunks
You can embed R code into your R Markdown report with the knitr syntax. To
do this, surround your code with two lines: one that contains ```{r} and one
that contains ``` . The result is a code chunk that looks like this:

```{r}
# some code
```

When you render the report, R will execute the code. If the code returns any
results, R will add them to your report.

The first file in the editor pane on the right contains the next section of your R
Markdown report. This section will explain how you cleaned your data. The
second file (my_code.R) on the right is an R Script that contains the actual
code that we used to clean the data. Use the knitr syntax to embed this code
into the .Rmd file.

Then render the file to see the results.

## Cleaning

For the remainder of the report, we will look only at data from the year 1995. We aggregate
our data by location, using the *R* code below.

```{r}
library(nasaweather)
library(dplyr)

year <- 1995

means <- atmos %>%
  # !! forces the year variable set above; a bare `year == year` would
  # compare the year column with itself and keep every row
  filter(year == !!year) %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()
```

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/shdjftrnujkaikxqreym.html

Good job! Notice that this code does not display any results. It simply
saves means so we can use it later. Did you notice that the chunk calls
library(nasaweather) and library(dplyr) again, even though you loaded both
packages in an earlier exercise? Each R Markdown document is given a fresh,
empty R session to run its code chunks in. This means that you will need to
define any R objects that this document uses - and load any packages that it
uses - inside the same R Markdown document. The document won't have access to
the objects that exist in your current R session.

2.2. Customize R code chunks


You can customize each R code chunk in your report by providing optional
arguments after the r in ```{r} , which appears at the start of the code chunk.
Let's look at one set of options.

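R functions sometimes return messages, warnings, and even error messages.
By default, R Markdown will include messages and warnings in your report. You
can use the message and warning options to prevent R Markdown from
displaying these. If either option is set to FALSE, R Markdown will not
include the corresponding type of message in the output.

The error option works differently. With error = FALSE, which is R Markdown's
default, an error stops the rendering instead of being hidden. With
error = TRUE, the error message is printed in the report and rendering
continues with the next chunk.

For example, R Markdown would hide any warnings generated by the chunk below,
print its error message, and carry on with the rest of the document.

```{r warning = FALSE, error = TRUE}
"four" + "five"
```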
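
The message, warning and error options map onto three distinct kinds of conditions that R code can signal. The plain R sketch below shows which call produces which kind; noisy() and classify() are hypothetical helpers written only for this illustration.

```r
# Signal one of the three kinds of condition on request
noisy <- function(kind) {
  switch(kind,
         message = message("just so you know"),
         warning = warning("something looks off"),
         error   = stop("cannot continue"))
}

# Report which kind of condition a call to noisy() signals
classify <- function(kind) {
  tryCatch(
    {
      noisy(kind)
      "none"
    },
    message = function(m) "message",
    warning = function(w) "warning",
    error   = function(e) "error"
  )
}

kinds <- sapply(c("message", "warning", "error"), classify)
kinds
```

Package start-up notes are usually signalled as messages, which is why message = FALSE is the option of interest for chunks that call library().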

• Packages often generate messages when you first load them
  with library(). To make sure that these messages do not appear in
  your report, separate library(nasaweather), library(dplyr),
  and library(ggvis) into their own code chunk in the document to the
  right. Be sure to make this the first code chunk in your document (so
  other code chunks will have access to the data sets and functions that
  come in those libraries).
• Arrange for the new code chunk to ignore any messages that are
  generated when loading the packages.

Then render the file to see the results.

## Cleaning

For the remainder of the report, we will look only at data from the year 1995. We
aggregate our data by location, using the *R* code below.

```{r message=FALSE}
library(nasaweather)
library(dplyr)
library(ggvis)
```

```{r}
year <- 1995

means <- atmos %>%
  # !! refers to the year variable set above, not the year column
  filter(year == !!year) %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()
```

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/yogkqbvrjlzvupxssdfg.html

Nice job! Notice that splitting your code in different chunks does not change
anything about the availability of the results. Although the library() functions
have been executed in another chunk, the packages are still available in the next
chunk.

Three of the most popular chunk options are echo, eval and results.

If echo = FALSE, R Markdown will not display the code in the final document
(but it will still run the code and display its results unless told otherwise).

If eval = FALSE, R Markdown will not run the code or include its results (but it
will still display the code unless told otherwise).

If results = 'hide', R Markdown will not display the results of the code (but it
will still run the code and display the code itself unless told otherwise).
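
One way to see the three options side by side is to knit a tiny chunk from the R console and inspect the Markdown that comes back. The sketch below assumes the knitr package is installed; chunk() and shows() are helpers made up for this illustration, and the fence is built with strrep() only to avoid writing literal backticks inside the example.

```r
library(knitr)

# Build a one-chunk R Markdown document as a character vector
fence <- strrep("`", 3)
chunk <- function(options, code) {
  c(paste0(fence, "{r, ", options, "}"), code, fence)
}

# Knit the same code under each option and capture the Markdown output
no_echo    <- knit(text = chunk("echo = FALSE", "1 + 1"), quiet = TRUE)
no_eval    <- knit(text = chunk("eval = FALSE", "1 + 1"), quiet = TRUE)
no_results <- knit(text = chunk("results = 'hide'", "1 + 1"), quiet = TRUE)

# Does the output contain the code? Does it contain the result?
shows <- function(out) {
  c(code   = grepl("1 + 1", out, fixed = TRUE),
    result = grepl("[1] 2", out, fixed = TRUE))
}

rbind(no_echo    = shows(no_echo),
      no_eval    = shows(no_eval),
      no_results = shows(no_results))
```

Each row of the resulting table should have exactly one TRUE for the first two options, matching the descriptions above: echo = FALSE keeps only the result, while eval = FALSE and results = 'hide' keep only the code.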

The R Markdown file to the right contains a complete report with two figures. It
is common to display figures without the code that generates them (the code is
a distraction). Modify each code chunk that generates a graph so that it does
not display the code that makes the graph. Notice how the document controls
the size of the figures with the fig.height and fig.width arguments.

## Data

The `atmos` data set resides in the `nasaweather` package of the *R* programming language.
It contains a collection of atmospheric variables measured between 1995 and 2000 on
a grid of 576 coordinates in the western hemisphere. The data set comes from the
[2006 ASA Data Expo](http://stat-computing.org/dataexpo/2006/).

Some of the variables in the `atmos` data set are:

* **temp** - The mean monthly air temperature near the surface of the Earth (measured in
kelvins (*K*))

* **pressure** - The mean monthly air pressure at the surface of the Earth (measured in
millibars (*mb*))

* **ozone** - The mean monthly abundance of atmospheric ozone (measured in Dobson units
(*DU*))

You can convert the temperature unit from Kelvin to Celsius with the formula

$$ celsius = kelvin - 273.15 $$

And you can convert the result to Fahrenheit with the formula

$$ fahrenheit = celsius \times \frac{9}{5} + 32 $$

## Cleaning

For the remainder of the report, we will look only at data from the year 1995. We aggregate
our data by location, using the *R* code below.

```{r message = FALSE}
# Working with a subset of the atmos data
load(url("http://assets.datacamp.com/course/rmarkdown/atmos.RData"))
library(dplyr)
library(ggvis)
```

```{r}
year <- 1995
means <- atmos %>%
  # !! refers to the year variable set above, not the year column
  filter(year == !!year) %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()
```

## Ozone and temperature

Is the relationship between ozone and temperature useful for understanding fluctuations in
ozone? A scatterplot of the variables shows a strong, but unusual relationship.

```{r fig.height = 4, fig.width = 5, echo=FALSE}
means %>%
  ggvis(~temp, ~ozone) %>%
  layer_points()
```

We suspect that group level effects are caused by environmental conditions that vary by
locale. To test this idea, we sort each data point into one of four geographic
regions:

```{r}
means$locale <- "north america"
means$locale[means$lat < 10] <- "south pacific"
means$locale[means$long > -80 & means$lat < 10] <- "south america"
means$locale[means$long > -80 & means$lat > 10] <- "north atlantic"
```

### Model

We suggest that ozone is highly correlated with temperature, but that a different
relationship exists for each geographic region. We capture this relationship with
a second order linear model of the form

$$ ozone = \alpha + \beta_{1} temperature + \sum_{locales} \beta_{i} locale_{i} + \sum_{locales} \beta_{j} interaction_{j} + \epsilon $$

This yields the following coefficients and model lines.

```{r}
lm(ozone ~ temp + locale + temp:locale, data = means)
```

```{r fig.height = 4, fig.width = 5, echo=FALSE}
means %>%
  group_by(locale) %>%
  ggvis(~temp, ~ozone) %>%
  layer_points(fill = ~locale) %>%
  layer_model_predictions(model = "lm", stroke = ~locale) %>%
  hide_legend("stroke") %>%
  scale_nominal("stroke", range = c("darkorange", "darkred", "darkgreen", "darkblue"))
```

### Diagnostics

An anova test suggests that both locale and the interaction effect of locale and temperature
are useful for predicting ozone (i.e., the p-value that compares the full model to
the reduced models is statistically significant).

```{r}
mod <- lm(ozone ~ temp, data = means)
mod2 <- lm(ozone ~ temp + locale, data = means)
mod3 <- lm(ozone ~ temp + locale + temp:locale, data = means)

anova(mod, mod2, mod3)
```

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/tycrwfrbnisrrawpapfw.html

2.3. Inline R Code


You can embed R code into the text of your document with the `r ` syntax. Be
sure to include the lower case r in order for this to work properly. R Markdown
will run the code and replace it with its result, which should be a piece of text,
such as a character string or a number.

For example, the line below uses embedded R code to create a complete
sentence:

The factorial of four is `r factorial(4)`.

When you render the document the result will appear as:

The factorial of four is 24.
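
You can try the same sentence from the R console without creating a file. This is a minimal sketch that assumes the knitr package is installed and relies on knit() recognising the inline syntax in a text string; the object names line and rendered are illustrative only.

```r
library(knitr)

# A one-line document that contains inline R code
line <- "The factorial of four is `r factorial(4)`."

# Knit the text directly and capture the resulting Markdown
rendered <- knit(text = line, quiet = TRUE)
rendered
```

The returned string should read like the rendered sentence shown above, with the inline code replaced by its value.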

Inline code provides a useful way to make your reports completely
automatable.

The report to the right has been reorganized to make it more automatable.

• Change the value of year to 2000 in line 28.
• Complete lines 31 and 52 so that the blank space, ___, shows the value
  of the year object when the report is rendered.
• Render the document and notice how everything updates to use the
  new year's worth of data. Even the sentences in lines 31 and 52 update
  to reflect the new year.

---
output: html_document
---

## Data

The `atmos` data set resides in the `nasaweather` package of the *R* programming language.
It contains a collection of atmospheric variables measured between 1995 and 2000 on
a grid of 576 coordinates in the western hemisphere. The data set comes from the
[2006 ASA Data Expo](http://stat-computing.org/dataexpo/2006/).

Some of the variables in the `atmos` data set are:

* **temp** - The mean monthly air temperature near the surface of the Earth (measured in
kelvins (*K*))

* **pressure** - The mean monthly air pressure at the surface of the Earth (measured in
millibars (*mb*))

* **ozone** - The mean monthly abundance of atmospheric ozone (measured in Dobson units
(*DU*))

You can convert the temperature unit from Kelvin to Celsius with the formula

$$ celsius = kelvin - 273.15 $$

And you can convert the result to Fahrenheit with the formula

$$ fahrenheit = celsius \times \frac{9}{5} + 32 $$

## Cleaning

```{r echo = FALSE}
year <- 2000
```

For the remainder of the report, we will look only at data from the year `r year`. We
aggregate our data by location, using the *R* code below.

```{r message = FALSE}
library(nasaweather)
library(dplyr)
library(ggvis)
```

```{r}
means <- atmos %>%
  # !! refers to the year object set above, not the year column
  filter(year == !!year) %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()
```

where the `year` object equals `r year`.

## Ozone and temperature

Is the relationship between ozone and temperature useful for understanding fluctuations in
ozone? A scatterplot of the variables shows a strong, but unusual relationship.

```{r echo = FALSE, fig.height = 4, fig.width = 5}
means %>%
  ggvis(~temp, ~ozone) %>%
  layer_points()
```

We suspect that group level effects are caused by environmental conditions that vary by
locale. To test this idea, we sort each data point into one of four geographic
regions:

```{r}
means$locale <- "north america"
means$locale[means$lat < 10] <- "south pacific"
means$locale[means$long > -80 & means$lat < 10] <- "south america"
means$locale[means$long > -80 & means$lat > 10] <- "north atlantic"
```

### Model

We suggest that ozone is highly correlated with temperature, but that a different
relationship exists for each geographic region. We capture this relationship with
a second order linear model of the form

$$ ozone = \alpha + \beta_{1} temperature + \sum_{locales} \beta_{i} locale_{i} + \sum_{locales} \beta_{j} interaction_{j} + \epsilon $$

This yields the following coefficients and model lines.

```{r}
lm(ozone ~ temp + locale + temp:locale, data = means)
```

```{r echo = FALSE, fig.height = 4, fig.width = 5}
means %>%
  group_by(locale) %>%
  ggvis(~temp, ~ozone) %>%
  layer_points(fill = ~locale) %>%
  layer_model_predictions(model = "lm", stroke = ~locale) %>%
  hide_legend("stroke") %>%
  scale_nominal("stroke", range = c("darkorange", "darkred", "darkgreen", "darkblue"))
```

### Diagnostics

An anova test suggests that both locale and the interaction effect of locale and temperature
are useful for predicting ozone (i.e., the p-value that compares the full model to
the reduced models is statistically significant).

```{r}
mod <- lm(ozone ~ temp, data = means)
mod2 <- lm(ozone ~ temp + locale, data = means)
mod3 <- lm(ozone ~ temp + locale + temp:locale, data = means)

anova(mod, mod2, mod3)
```

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/sgqkxraptogifuolvrna.html

2.4. Labeling and Reusing code chunks


Apart from the popular code chunk options you have learned by now, you can
define even more things in the curly braces that follow the triple backticks.

An interesting feature available in knitr is the labeling of code snippets. The
code chunk below would be assigned the label simple_sum:

```{r simple_sum, results = 'hide'}
2 + 2
```

However, because the results option is set to 'hide', no output is shown.
This is what appears in the output document:

2 + 2

What purpose do these labels serve? knitr provides the option ref.label to
refer to previously defined and labeled code chunks. If used
correctly, knitr will copy the code of the chunk you referred to and repeat it in
the current code chunk. This feature enables you to separate R code and R
output in the output document, without code duplication.

Let's continue the example; the following code chunk:

```{r ref.label='simple_sum', echo = FALSE}
```

produces the output you would expect:

## [1] 4

Notice that the echo option was explicitly set to FALSE, suppressing the R code
that generated the output.
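
The two chunks can also be knitted together from the R console to confirm that the code and its output end up in separate places. This is a sketch that assumes the knitr package is installed; the fence is built with strrep() only to avoid writing literal backticks inside the example, and doc and out are illustrative names.

```r
library(knitr)

fence <- strrep("`", 3)

# First chunk: labeled, shows the code but hides the result.
# Second chunk: empty, reuses the labeled code and shows only the result.
doc <- c(
  paste0(fence, "{r simple_sum, results = 'hide'}"),
  "2 + 2",
  fence,
  "",
  paste0(fence, "{r ref.label = 'simple_sum', echo = FALSE}"),
  fence
)

out <- knit(text = doc, quiet = TRUE)
cat(out)
```

In the printed Markdown the code 2 + 2 should appear once, from the first chunk, and the result once, from the second.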

In the sample code on the right, you see a rather large code chunk that
contains R code to load the dplyr and ggvis packages and functions to create a
ggvis graph.

• Separate the code chunk into two: one for the library() calls, one for
  the ggvis and dplyr functions.
• Edit the first chunk's header so no messages are shown in the output
  document.
• Edit the second chunk's header so output is hidden; give this code chunk
  the label chained.
• Move the sentence "The ggvis plot gives us a nice visualization of
  the mtcars data set:" after the second chunk.
• Add a third chunk at the end containing no code, showing the output of
  the second chunk, without the echoing code that generated it. Use
  the ref.label option.

## Exploring the mtcars data set

Have you ever wondered whether there is a clear correlation between the gas consumption of
a car and its weight?
To answer this question, we first have to load the `dplyr` and `ggvis` packages.

```{r message = FALSE}
library(dplyr)
library(ggvis)
```

```{r chained, results = 'hide'}
mtcars %>%
  group_by(factor(cyl)) %>%
  ggvis(~mpg, ~wt, fill = ~cyl) %>%
  layer_points()
```

The `ggvis` plot gives us a nice visualization of the `mtcars` data set:

```{r ref.label='chained', echo = FALSE}
```

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/jlmbrjplrrqdotohdrke.html

3. Compiling Reports
3.1. Alternative output formats
You can render the same R Markdown file into several different formats. There
are two ways to change a file's output format.

First, you can click the triangle icon next to "Knit HTML" at the bottom of the
pane that displays your R Markdown file. This will open a drop down menu that
gives you the choice of rendering as an HTML document or a pdf document.

Second, you can change the output field in the YAML block at the top of your
document. For example, this YAML block will create an HTML file:

---
output: html_document
---

This one will create a pdf file:

---
output: pdf_document
---

This one will create a MS Word file:

---
output: word_document
---

And this one will create a Markdown file:

---
output: md_document
---

• The R Markdown file to the right describes the cloud data in
  the atmos data set. Change the output field to make the document
  render as a pdf, and re-render the document.
• Notice that to visualize data in a pdf document, you will have to use
  the ggplot2 package as an alternative to the ggvis package. This is for
  a reason: the ggvis package creates graphs that are HTML objects.
  These graphs are useful for HTML documents, but cannot be included in
  a pdf document without intermediary steps.

---
title: "Cloud Cover"
author: "Anonymous"
date: "December 2, 2014"
output: pdf_document
---

Good work! Rendering R Markdown files in the RStudio IDE works the same way.
But what if you are using R in a terminal window? You can run the
command rmarkdown::render(<file path>) to render any .Rmd file with R.
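
As a rough sketch of that workflow, the code below writes a small throwaway
report and renders it from plain R. The file name cloud-demo.Rmd and its
contents are invented for illustration, and rendering is skipped unless the
rmarkdown package and pandoc are both available.

```r
# Write a tiny throwaway report to a temporary folder (hypothetical content).
rmd_path <- file.path(tempdir(), "cloud-demo.Rmd")
writeLines(c(
  "---",
  "title: \"Cloud Cover\"",
  "output: html_document",
  "---",
  "",
  "The mean of 1 to 10 is `r mean(1:10)`."
), rmd_path)

# Rendering needs the rmarkdown package and pandoc, so check for both first.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  # With no output_format, render() uses the format named in the YAML header.
  html_file <- rmarkdown::render(rmd_path, quiet = TRUE)

  # output_format overrides the YAML header for this one call.
  md_file <- rmarkdown::render(rmd_path, output_format = "md_document",
                               quiet = TRUE)

  # render() returns the path of the file it created.
  print(c(html_file, md_file))
}
```

From a shell, the same call can be wrapped as
Rscript -e 'rmarkdown::render("cloud-demo.Rmd")', assuming Rscript is on your path.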

3.2. Creating Slideshows


You can also export your file as a slideshow by changing the output field to:

---
output: beamer_presentation
---

which creates a beamer pdf slideshow,

---
output: ioslides_presentation
---

which creates an ioslides HTML slideshow or

---
output: slidy_presentation
---

which creates a slidy HTML slideshow.

R Markdown will start a new slide at each first- or second-level header in your
document. You can insert additional slide breaks with Markdown's horizontal
rule syntax:

***

Everywhere you add these three asterisks in your text, pandoc will create a
new slide.

• Change the output type of the cloud report to a slidy HTML slideshow.
• Insert a horizontal rule before "Some of the variables" and "You can
  convert" to create additional slide breaks.
• Then render the file as a slideshow.

---
title: "Cloud Cover"
author: "Anonymous"
date: "December 2, 2014"
output: slidy_presentation
---

## Data

The `atmos` data set resides in the `nasaweather` package of the *R* programming language.
It contains a collection of atmospheric variables measured between 1995 and 2000 on
a grid of 576 coordinates in the western hemisphere. The data set comes from the
[2006 ASA Data Expo](http://stat-computing.org/dataexpo/2006/).

***

Some of the variables in the `atmos` data set are:

* **cloudlow** - The mean percent of the sky covered by clouds at low altitudes.

* **cloudmid** - The mean percent of the sky covered by clouds at mid-range altitudes.

* **cloudhigh** - The mean percent of the sky covered by clouds at high altitudes.

***

You can convert the temperature unit from Kelvin to Celsius with the formula

$$ celsius = kelvin - 273.15 $$

And you can convert the result to Fahrenheit with the formula

$$ fahrenheit = celsius \times \frac{9}{5} + 32 $$

## Cleaning

```{r echo = FALSE}
year <- 2000
```

For the remainder of the report, we will look only at data from the year `r year`. We
aggregate our data by location, using the *R* code below.

```{r echo = FALSE, message = FALSE}
library(nasaweather)
library(dplyr)
library(tidyr)
```

```{r}
means <- atmos %>%
  # !!year is the variable set earlier; a bare `year` here would be the column,
  # and year == year would keep every row
  filter(year == !!year) %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()

# Drop temp, pressure and ozone, then stack the three cloud columns
clouds <- means %>%
  select(-(temp:ozone)) %>%
  gather("altitude", "coverage", 3:5)
```
RESULT: https://s3.amazonaws.com/markdown-loads.datacamp.com/llzeqtordymjglnysnj.html#(1)
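
A side note on the chunk above: year is both a column of atmos and a variable
set earlier in the report. Inside filter(), a bare year refers to the column
first, so it is worth checking which one a comparison actually uses. A minimal
sketch with made-up readings (the readings data frame is hypothetical, and the
!! operator assumes dplyr 0.7 or later):

```r
library(dplyr)

# Hypothetical stand-in for atmos: three rows, two of them from 2000.
readings <- data.frame(year = c(1999, 2000, 2000),
                       temp = c(290, 295, 300))
year <- 2000

# Both sides refer to the column, so every row passes the test.
n_naive <- nrow(filter(readings, year == year))

# !!year inserts the value of the variable, so only the 2000 rows remain.
n_fixed <- nrow(filter(readings, year == !!year))

c(naive = n_naive, fixed = n_fixed)
```

Giving the variable a name that is not also a column (for example
report_year) avoids the ambiguity altogether.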

Great job! When you render R Markdown documents on your own computer, R
Markdown will save a copy of the file (in the output file type) on your machine. It
will appear in the same folder that the .Rmd file lives in. Feel free to experiment
with the other slideshow formats in this exercise!

3.3. Specify knitr and pandoc options


Each R Markdown output template is a collection of knitr and pandoc options.
You can customize your output by overwriting the default options that come
with the template.


For example, the YAML header below overwrites the default code highlight
style of the pdf_document template to create a document that uses the zenburn
style:

---
title: "Demo"
output:
  pdf_document:
    highlight: zenburn
---

The YAML header below overwrites the default bootstrap CSS theme of
the html_document template.

---
title: "Demo"
output:
  html_document:
    theme: spacelab
---

Pay close attention to the indentation of the options inside the YAML header; if
you do not do this correctly, pandoc will not correctly understand your
specifications. As an example, notice the difference between only specifying
the output document to be HTML:

---
output: html_document
---

and specifying an HTML output document with a different theme:

---
output:
  html_document:
    theme: spacelab
---

You can learn more about popular options to overwrite in the R Markdown
Reference Guide.
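
To make the role of indentation concrete, you can parse small YAML snippets in
R and look at the structure that comes back. This is only a sketch: it assumes
the yaml package is installed, and the variable names flat, nested and
unindented are made up for the example.

```r
library(yaml)

# Output format given as a plain value.
flat <- yaml.load("output: html_document")

# Indented: theme belongs to html_document, which belongs to output.
nested <- yaml.load(paste(
  "output:",
  "  html_document:",
  "    theme: spacelab",
  sep = "\n"
))

# Same lines without indentation: three unrelated top-level fields.
unindented <- yaml.load(paste(
  "output:",
  "html_document:",
  "theme: spacelab",
  sep = "\n"
))

str(flat)
str(nested)
str(unindented)
```

In the nested version the theme is reachable as
nested$output$html_document$theme; in the unindented version output is empty
and theme sits at the top level, so the option is no longer attached to the
output format.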

• Move the html_document header element to its own line, and indent it to
  be a subelement of output.
• Add a colon, :, after html_document.
• Add toc (for a table of contents) and number_sections subelements
  to html_document, setting both to true (lower case).
• Re-render the document.

---
title: "Ozone"
author: "Anonymous"
date: "January 1, 2015"
output:
  html_document:
    toc: true
    number_sections: true
---
RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/fvxngrgrhplxakfnwtbx.html#ozone-and-temperature

You are getting the hang of it! Notice that the numbering of the table of contents
contains a zero since no first level headers were defined. Each document template
has its own set of options to overwrite. Check out the R Markdown Reference
Guide for a full, but concise description of the possibilities.

3.4. Recurring to a CSS File to manage Style


In the last exercise, we showed a way to change the CSS style of your HTML
output: you can set the theme option of html_document to one
of default, cerulean, journal, flatly, readable, spacelab, united, or cosmo.
(Try it out.)

But what if you want to customize your CSS in more specific ways? You can
do this by writing a .css file for your report and saving it in the same directory
as the .Rmd file. To have your report use the CSS, set the css option
of html_document to the file name, like this:

---
title: "Demo"
output:
  html_document:
    css: styles.css
---

Custom CSS is an easy way to add branding to your reports.

The faded.css file to the right contains some example CSS that will change
the appearance of your report.

• Move html_document to be a subelement of output, and append a colon.
• Add a css subelement of html_document, setting it to the CSS file.
• Render the report.

R MARKDOWN FILE:
---
title: "Ozone"
author: "Anonymous"
date: "January 1, 2015"
output:
  html_document:
    css: faded.css
---
.
.
.

FADED.CSS:
h1 {
  color: white;
  padding: 10px;
  background-color: #3399ff
}

ul {
  list-style-type: square;
}

.MathJax_Display {
  padding: 0.5em;
  background-color: #eaeff3
}

RESULT: https://s3.amazonaws.com/markdown-uploads.datacamp.com/raxcakycbxxuuwiatiz.html

Nice work! R Markdown might be a new way of reporting, but branding R Markdown
reports in a familiar way is straightforward!

3.5. Interactive Reports with Shiny


Shiny is an R package that uses R to build interactive web apps such as data
explorers and dashboards. You can add shiny components to an R Markdown
file to make an interactive document.

When you do this, you must ensure that

• You use an HTML output format
  (like html_document, ioslides_presentation, or slidy_presentation).
• You add runtime: shiny to the top level of the file's YAML header.

To learn more about interactivity with Shiny and R, visit shiny.rstudio.com.
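
On your own computer, one way to view such a document is rmarkdown::run(),
which serves the file through a local Shiny session. The sketch below writes
a small throwaway document first. The file name shiny-demo.Rmd and the slider
are invented for illustration, and the document is only launched in an
interactive R session with rmarkdown and shiny installed.

```r
# Write a minimal interactive document to a temporary folder (hypothetical).
shiny_rmd <- file.path(tempdir(), "shiny-demo.Rmd")
writeLines(c(
  "---",
  "title: \"Shiny Demo\"",
  "output: html_document",
  "runtime: shiny",
  "---",
  "",
  "```{r, echo = FALSE}",
  "sliderInput(\"n\", \"Sample size:\", min = 10, max = 500, value = 100)",
  "renderPlot(hist(rnorm(input$n)))",
  "```"
), shiny_rmd)

# run() blocks the console while the document is being served, so only
# start it when someone is there to stop it again.
if (interactive() &&
    requireNamespace("rmarkdown", quietly = TRUE) &&
    requireNamespace("shiny", quietly = TRUE)) {
  rmarkdown::run(shiny_rmd)
}
```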

Read the raw R Markdown file to the right.

• Change the output type in the YAML header to html_document.
• Add a runtime element, set to shiny.
• Knit the result.
• Visit the report hosted at DataCamp's Shiny Server to see the
  interactive report that it would make when you render it.

---
title: "Shiny Demo"
author: "DataCamp"
output: html_document
runtime: shiny
---

This R Markdown document is made interactive using Shiny. Unlike the more traditional
workflow of creating static reports, you can now create documents that allow your
readers to change the assumptions underlying your analysis and see the results
immediately.

To learn more, see [Interactive Documents](http://rmarkdown.rstudio.com/authoring_shiny.html).

## Inputs and Outputs

You can embed Shiny inputs and outputs in your document. Outputs are automatically updated
whenever inputs change. This demonstrates how a standard R plot can be made
interactive by wrapping it in the Shiny `renderPlot` function. The `selectInput`
and `sliderInput` functions create the input widgets used to drive the plot.

```{r, echo=FALSE}
inputPanel(
  selectInput("n_breaks", label = "Number of bins:",
              choices = c(10, 20, 35, 50), selected = 20),

  sliderInput("bw_adjust", label = "Bandwidth adjustment:",
              min = 0.2, max = 2, value = 1, step = 0.2)
)

renderPlot({
  hist(faithful$eruptions, probability = TRUE, breaks = as.numeric(input$n_breaks),
       xlab = "Duration (minutes)", main = "Geyser eruption duration")

  dens <- density(faithful$eruptions, adjust = input$bw_adjust)
  lines(dens, col = "blue")
})
```

## Embedded Application

It is also possible to embed an entire Shiny application within an R Markdown document
using the `shinyAppDir` function. This example embeds a Shiny application located
in another directory:

```{r, echo=FALSE}
shinyAppDir(
  system.file("examples/06_tabsets", package = "shiny"),
  options = list(
    width = "100%", height = 550
  )
)
```

Note the use of the `height` parameter to determine how much vertical space the embedded
application should occupy.

You can also use the `shinyApp` function to define an application inline rather than in an
external directory.

In all of the R code chunks above, the `echo = FALSE` attribute is used. This prevents the
R code within the chunk from rendering in the document alongside the Shiny
components.
RESULT: https://multiplexer-paid.datacamp.com/proxy/relative/2d74f23cf1628855b13a6e30a3472450/ctzoicyjxdjgonyefdsc/
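
The demo document mentions `shinyApp` for inline applications but does not show
one. A minimal sketch of what that could look like is below. The input name
bins, the output name hist and the height of 500 are made up for this example.
Inside an R Markdown file with runtime: shiny, the same call would go in a
chunk with echo = FALSE.

```r
library(shiny)

# Build the app object inline instead of pointing at an app directory.
app <- shinyApp(
  ui = fluidPage(
    sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
    plotOutput("hist")
  ),
  server = function(input, output) {
    output$hist <- renderPlot({
      hist(faithful$eruptions, breaks = input$bins,
           xlab = "Duration (minutes)", main = "Geyser eruption duration")
    })
  },
  # Same role as the height option passed to shinyAppDir.
  options = list(height = 500)
)

# Creating the object does not start a server; that only happens when the
# app is printed at the console or returned from a chunk in a Shiny document.
class(app)
```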

3.6. Interactive ggvis graphics


You can also use R Markdown to create reports that use
interactive ggvis graphics. ggvis relies on the shiny framework to create
interactivity, so you will need to prepare your interactive document in the same
ways:

• You need to add runtime: shiny to the YAML header.
• You need to ensure that your output is an HTML format
  (like html_document, ioslides_presentation, or slidy_presentation).

You do not need to wrap your interactive ggvis plots in a render function. They
are ready to use as is in an R Markdown document.

The .Rmd file to the right contains a ggvis plot that updates as a user moves a
slider.

• Change the document type to be an HTML document.
• Add a Shiny runtime.
• Render the document; you will see that the rendered document will be
  static. The interactive report is hosted at DataCamp's Shiny server.

---
title: "ggvis"
author: "DataCamp"
output: html_document
runtime: shiny
---

ggvis provides a number of ways to enhance plots with interactivity. For example, the
density plot below allows users to set the kernel and bandwidth of the plot.

```{r echo = FALSE, message = FALSE}
library(ggvis)

mtcars %>% ggvis(x = ~wt) %>%
  layer_densities(
    adjust = input_slider(.1, 2, value = 1, step = .1, label = "Bandwidth adjustment"),
    kernel = input_select(
      c("Gaussian" = "gaussian",
        "Epanechnikov" = "epanechnikov",
        "Rectangular" = "rectangular",
        "Triangular" = "triangular",
        "Biweight" = "biweight",
        "Cosine" = "cosine",
        "Optcosine" = "optcosine"),
      label = "Kernel")
  )
```
RESULT: https://multiplexer-paid.datacamp.com/proxy/relative/d4a18a5e61ec00760c06a16fef78d502/aahyupxdjcepwwrhukoc/
