Introduction to R

Learning Outcomes
● What is R?
● R vs. RStudio
● Getting familiar with R
● Basic Data Types
● Vectors
● Matrix
● List
● Dataframe
● Working with the Data: An Application

What is R? Why R?
● Language for Statistical Computing and Graphics

○ Command line based

○ Developed in Auckland, NZ

○ Similar to S language and environment

○ It's free and open source!

● For more information and downloads, visit https://www.r-project.org

● Today, almost every industry relies on data

R vs. RStudio

                  R                              RStudio
Price             Free and open source           Free and open source
Interactivity     Less interactive               More advanced and interactive
Download it at    https://www.r-project.org/     https://www.rstudio.com/


Getting familiar with R
Console: input/command and output
Example: 6 %% 8 (6 mod 8) gives the remainder when 6 is divided by 8

Variable
● Assign almost anything to a variable using either <- or =
● Use it later on
● A name must start with a letter (or a dot that is not followed by a digit)
● A name can contain only letters, numbers, the underscore character and the dot
● Reserved words can't be used as variable names
● R is case sensitive!

Workspace
● Saved with the extension .RData
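
A minimal sketch of the assignment rules above. The variable names and values are made up for illustration.

```r
# Assign with <- or with =
profit <- 120
cost = 80

# Use the variables later on
margin <- profit - cost

# Names may mix letters, numbers, dots and underscores
net.margin_2 <- margin / 2

# R is case sensitive: Margin and margin are two different variables
Margin <- 0
print(margin)
print(Margin)
```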

Getting familiar with R
R script: a plain-text file containing R commands, saved with the extension .R

Inserting comments: Console


● To annotate a command, start the comment with #
● R ignores everything from the # to the end of the line
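
As a small illustration (the variable name x is arbitrary), a comment can take a whole line or follow a command on the same line:

```r
# This whole line is a comment and is not executed
x <- 6 %% 8   # everything after the # is ignored; x holds the remainder of 6 / 8
print(x)
```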

Inserting comments: R script



Basic Data Types


● Logicals:
○ TRUE (1)
○ FALSE (0)
○ NA (missing value)

● Numeric
○ Integer

● Character/String

Use the class() function to find out the data type of a value
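
A short sketch of class() applied to each basic type. The sample values are illustrative; the L suffix is how R marks an integer literal.

```r
class(TRUE)      # "logical"
class(NA)        # "logical"
class(3.14)      # "numeric"
class(3L)        # "integer"
class("Suits")   # "character"
```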



Vectors

● Sequence of data elements (numerics, logicals, characters etc.)

○ (3, 4), (TRUE, FALSE, T, NA), ("Thrones", "Suits") etc.

● Accepts only data of the same type

● Automatic coercion otherwise

● A single value is also a vector!

● c() is used to create a vector



Creating a Vector
● A single value is a vector
● Even a missing value NA is a vector!
● Every vector in R is printed as a row by default (it has no dimensions of its own)
● Rows: data points/observations
● Columns: variables of the data
● 1:9 is the same as c(1,2,3,...,9)
● seq(x,y,z) generates a sequence
○ lower limit x
○ upper limit y
○ with increments z (default = 1)
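
The ways of creating a vector listed above can be sketched as follows. The variable names are arbitrary.

```r
v1 <- c(3, 4)                 # numeric vector
v2 <- c(TRUE, FALSE, T, NA)   # logical vector
v3 <- c("Thrones", "Suits")   # character vector

s1 <- 1:9                     # same as c(1, 2, ..., 9)
s2 <- seq(1, 9, 2)            # from 1 to 9 in steps of 2: 1 3 5 7 9
s3 <- seq(1, 9)               # the increment defaults to 1

is.vector(5)                  # TRUE: a single value is a vector
length(NA)                    # 1: NA is a vector of length one
dim(s1)                       # NULL: a plain vector has no dimensions
```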

Coercion
● R performs automatic conversion of data types within a vector
■ c("A", 8) => c("A", "8") Numerics get converted into characters
■ c(T, 0, 1) => c(1, 0, 1) Logicals get converted into numerics
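
The two coercion cases above can be checked directly. The names a and b are illustrative.

```r
a <- c("A", 8)     # the number 8 becomes the string "8"
b <- c(T, 0, 1)    # the logical TRUE becomes the number 1

class(a)           # "character"
class(b)           # "numeric"
as.numeric(a[2])   # explicit conversion back to a number: 8
```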

Logical Operations on Vectors


● Logical operations yield a logical vector of TRUE, FALSE and NA values
● By default, logical operations are performed element-wise
● Logical operators:
○ & : and
○ | : or
○ ! : not
● Comparison operators (these also return logicals)
○ <
○ >
○ ==
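
A small sketch of element-wise logical operations, using a made-up vector that includes a missing value to show how NA propagates:

```r
x <- c(3, 8, NA, 5)

x > 4                # FALSE  TRUE    NA  TRUE
x == 5               # FALSE FALSE    NA  TRUE
(x > 4) & (x < 8)    # FALSE FALSE    NA  TRUE
(x < 4) | (x > 6)    #  TRUE  TRUE    NA FALSE
!(x > 4)             #  TRUE FALSE    NA FALSE
```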

Logical Test
● is.vector(x): checks whether x is a vector
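
A brief sketch of is.vector(). The other is.*() calls are related base R tests that the slide does not list; they are included here only as an illustration of the same pattern.

```r
x <- c(1, 2, 3)

is.vector(x)                       # TRUE
is.vector(matrix(1:4, nrow = 2))   # FALSE: a matrix carries a dim attribute
is.numeric(x)                      # TRUE
is.character(x)                    # FALSE
is.na(c(1, NA))                    # FALSE TRUE
```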

Naming a Vector
● Two ways to name the elements:
○ names() function
○ Supply a name for each element when creating the vector
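
Both naming approaches give the same result. The profit figures below are invented for the example.

```r
# 1. Create the vector first, then attach names with names()
profit <- c(10, -5, 7)
names(profit) <- c("Mon", "Tue", "Wed")

# 2. Name each element while creating the vector
profit2 <- c(Mon = 10, Tue = -5, Wed = 7)

identical(profit, profit2)   # TRUE
names(profit)                # "Mon" "Tue" "Wed"
profit["Tue"]                # the element named Tue
```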

Vector Calculus
● Vector operations are done element-wise
■ +
■ -
■ *
■ /
■ ^ : exponentiation
■ log() : natural logarithm (base e)
■ exp() : natural exponential function
■ %% : modulus operator, returns the remainder

● For the usual (inner) product of two vectors, use %*%
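
A minimal sketch of these operators on two made-up vectors of equal length:

```r
x <- c(1, 2, 3)
y <- c(4, 5, 6)

x + y          # 5 7 9
x * y          # 4 10 18 (element-wise)
y / x          # 4.0 2.5 2.0
x ^ 2          # 1 4 9
y %% 4         # 0 1 2
log(exp(x))    # 1 2 3, since log() undoes exp()

inner <- x %*% y   # inner product, returned as a 1 x 1 matrix
inner[1, 1]        # 32
```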



Vector Calculus
sum(x) : Computes sum of all elements of x

Subsetting Vector
Square brackets []

● Index number, or a vector of index numbers
● Minus sign (-) to exclude values that aren't required
● Logicals: T, F
● Conditioning
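
Each subsetting style above, sketched on an illustrative vector:

```r
x <- c(10, 20, 30, 40)

x[2]                 # 20: a single index
x[c(1, 3)]           # 10 30: a vector of indices
x[-1]                # 20 30 40: everything except the first element
x[c(T, F, T, F)]     # 10 30: keep the positions marked TRUE
x[x > 25]            # 30 40: keep the elements that satisfy a condition
```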

Listing and Removing Variables


● ls() : lists the variables loaded in the workspace
● rm(x,y,z) : removes the variables x, y and z from the workspace
○ rm(list = ls()) : removes all the variables from the workspace in one go!
● Ctrl + L : clears the console without deleting any variable (Win OS only)
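
A sketch of ls() and rm(). To avoid wiping a real session, it works on a separate environment, ws, which is a hypothetical stand-in for the workspace; at the console you would normally drop the envir argument.

```r
ws <- new.env()               # stand-in for the workspace (assumption for this demo)
assign("x", 1, envir = ws)
assign("y", 2, envir = ws)
assign("z", 3, envir = ws)

ls(envir = ws)                # "x" "y" "z"
rm(x, y, envir = ws)          # remove x and y
ls(envir = ws)                # "z"

rm(list = ls(envir = ws), envir = ws)   # remove everything that is left
length(ls(envir = ws))                  # 0
```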

Matrix
● Vector: 1D array of data elements
● Matrix: 2D array of data elements
○ Rows and Columns
● Accepts only one data type; coercion occurs otherwise
● dim(x): Retrieves and sets dimensions of x

Creating a Matrix
● matrix(data = , nrow = , ncol = , byrow = )
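
A minimal sketch of matrix() showing the effect of byrow. The values 1:6 are arbitrary.

```r
m1 <- matrix(data = 1:6, nrow = 2, ncol = 3)                 # filled column by column
m2 <- matrix(data = 1:6, nrow = 2, ncol = 3, byrow = TRUE)   # filled row by row

m1
m2
dim(m1)      # 2 3
m1[1, 2]     # 3
m2[1, 2]     # 2
```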

Naming a Matrix
● rownames(m) : retrieve or set the names of the rows of matrix m
● colnames(m): retrieve or set the names of the columns of matrix m
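
A short sketch of setting and retrieving the names. The matrix m and its profit figures are invented for illustration.

```r
m <- matrix(c(10, -5, 7, 4, 6, -2), nrow = 3, ncol = 2)

rownames(m) <- c("Mon", "Tue", "Wed")   # set the row names
colnames(m) <- c("X", "Y")              # set the column names

rownames(m)   # "Mon" "Tue" "Wed"
colnames(m)   # "X" "Y"
m
```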

Matrix Calculus, Algebra


● As with vectors, matrix operations are done element-wise
■ +
■ -
■ *
■ /
■ ^ : exponentiation
■ log() : natural logarithm (base e)
■ exp() : natural exponential function

● To multiply two matrices in the usual way (matrix product), use %*%


● t(x) : computes the transpose of matrix x
● solve(x) : computes the inverse of a square (invertible) matrix x
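
The difference between the element-wise product and the matrix product, together with t() and solve(), can be sketched with two small made-up matrices:

```r
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # rows: (2, 1) and (1, 3)
B <- matrix(c(1, 0, 2, 1), nrow = 2)   # rows: (1, 2) and (0, 1)

A * B        # element-wise: rows (2, 2) and (0, 3)
A %*% B      # matrix product: rows (2, 5) and (1, 5)
t(B)         # transpose of B

A_inv <- solve(A)   # inverse of A
A %*% A_inv         # identity matrix, up to rounding error
```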

Matrix Subsetting
● Subsetting by row or column
○ PXY[2, ] : entire row 2 of matrix PXY
○ PXY[ , 1] : entire column 1 of matrix PXY
● Subsetting by names
○ PXY["Mon", "Y"] : profit made on asset Y on Monday
○ PXY["Tue", ] : profit made on asset X and asset Y on Tuesday
● Subsetting multiple entries
○ PXY[c(2, 3), -1] : profit made on asset Y on Tuesday and Wednesday
● Subsetting by logicals
○ PXY[c(TRUE, FALSE), ] : profit matrix for alternate days
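
A sketch of these subsetting calls. The slides do not give the contents of PXY, so the four days and the profit figures below are assumptions made purely for illustration.

```r
# Hypothetical profit matrix: rows are days, columns are assets X and Y
PXY <- matrix(c(10, -5, 7, 3,     # profits on asset X
                 4,  6, -2, 8),   # profits on asset Y
              nrow = 4,
              dimnames = list(c("Mon", "Tue", "Wed", "Thu"), c("X", "Y")))

PXY[2, ]                 # row 2 (Tue): X = -5, Y = 6
PXY[, 1]                 # column 1 (asset X)
PXY["Mon", "Y"]          # 4
PXY[c(2, 3), -1]         # asset Y on Tue and Wed: 6, -2
PXY[c(TRUE, FALSE), ]    # the pattern is recycled: rows Mon and Wed
```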

Matrix Subsetting: Conditional


● PXY[ , "X"] > 0 & PXY[ , "Y"] > 0 : vector of logicals, TRUE iff profit on X > 0 and profit on Y > 0

● PXY[PXY[ , "X"] > 0 & PXY[ , "Y"] > 0, ] : profit matrix with strictly positive profit on both the assets
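
A sketch of conditional subsetting. As the slides do not show the data, the matrix below is a hypothetical PXY with invented profits.

```r
# Hypothetical profit matrix: rows are days, columns are assets X and Y
PXY <- matrix(c(10, -5, 7, 3,
                 4,  6, -2, 8),
              nrow = 4,
              dimnames = list(c("Mon", "Tue", "Wed", "Thu"), c("X", "Y")))

# TRUE for the days on which both assets made a strictly positive profit
both_positive <- PXY[, "X"] > 0 & PXY[, "Y"] > 0
both_positive              # Mon TRUE, Tue FALSE, Wed FALSE, Thu TRUE

# Use the logical vector to keep only those rows
PXY[both_positive, ]
```

If only one row satisfies the condition, the result drops to a plain vector; adding drop = FALSE inside the brackets keeps it a matrix.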

Data Frame
● Specifically for data sets
● Observations : Rows
● Variables : Columns
● Elements of the same column : same type (different columns may have different types)
● data.frame(): Creates data frame
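
A minimal sketch of data.frame() with columns of different types. The column names and values are made up.

```r
df <- data.frame(
  day    = c("Mon", "Tue", "Wed"),   # character column
  X      = c(10, -5, 7),             # numeric column
  traded = c(TRUE, TRUE, FALSE),     # logical column
  stringsAsFactors = FALSE           # keep strings as characters
)

df
dim(df)             # 3 3: three observations, three variables
sapply(df, class)   # the type of each column
```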

Subsetting Data Frame


● Using []

● Using $

● Using subset()

○ subset(dataframe, subset = , select = )
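
The three approaches side by side, on a small invented data frame:

```r
df <- data.frame(day = c("Mon", "Tue", "Wed"),
                 X   = c(10, -5, 7),
                 Y   = c(4, 6, -2),
                 stringsAsFactors = FALSE)

df[2, ]        # using []: second observation
df[, "X"]      # using []: column X as a vector
df$Y           # using $: column Y as a vector

# using subset(): rows where X is positive, keeping only day and X
pos <- subset(df, subset = X > 0, select = c(day, X))
pos
```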



Working Directory
● By default, files are read from and saved to the working directory

● getwd() : returns the full path of your working directory

● setwd(path) : sets the working directory to path, e.g. "e:/Work"
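
A sketch of getwd() and setwd(). It switches to R's temporary directory, chosen here only because that folder exists on every machine, and then restores the original location.

```r
old_wd <- getwd()     # remember the current working directory
setwd(tempdir())      # switch to another folder
getwd()               # now reports the temporary folder
setwd(old_wd)         # switch back
```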



Reading Data into R


● Locally saved data

● .csv : read.csv("name of the file", stringsAsFactors = FALSE)

● .txt : read.table("name of the file", header = TRUE)

● Or use RStudio’s interface
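
A self-contained sketch of both readers. Since no data file accompanies the slides, it first writes two small temporary files with invented contents and then reads them back.

```r
sample_data <- data.frame(day = c("Mon", "Tue"), X = c(10, -5))

# Write and read a .csv file
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_data, csv_path, row.names = FALSE)
dat_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

# Write and read a whitespace-separated .txt file with a header row
txt_path <- tempfile(fileext = ".txt")
write.table(sample_data, txt_path, row.names = FALSE, quote = FALSE)
dat_txt <- read.table(txt_path, header = TRUE)

dat_csv
dat_txt
```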



Application
● Statistics & Econometrics

● Mathematics

● Finance

Application : Statistics
● Descriptive Statistics

● Plotting data

● Generating random numbers/variates

● Calculating probabilities for common distributions - Z, t, F, χ²

● Hypothesis testing and CI estimation
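
A brief sketch touching several of these tasks with base R functions. The simulated sample, its parameters and the hypothesised mean are all assumptions chosen for the example.

```r
set.seed(1)                          # make the random numbers reproducible
x <- rnorm(100, mean = 5, sd = 2)    # 100 normal random variates

# Descriptive statistics
summary(x)
mean(x)
sd(x)

# Cumulative probabilities P(X <= q) for Z, t, F and chi-squared
pnorm(1.96)                  # about 0.975
pt(2, df = 10)
pf(3, df1 = 2, df2 = 20)
pchisq(3.84, df = 1)         # about 0.95

# One-sample t test of H0: mean = 5, with a 95% confidence interval
tt <- t.test(x, mu = 5)
tt$p.value
tt$conf.int
```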



Plotting Symbols

Source: Quick-R

Plotting: Line Types

Source: Quick-R
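
The original charts are not reproduced here. As a rough substitute, the sketch below draws the base R plotting symbols (the pch argument) and line types (the lty argument). The pdf(NULL) call is an assumption that keeps the example from opening a window or writing a file; remove it and the dev.off() call to see the plots on screen.

```r
pdf(NULL)   # null graphics device: nothing is displayed or saved

# Plotting symbols: pch = 1 to 25, one symbol per point
plot(1:25, rep(1, 25), pch = 1:25, xlab = "pch value", ylab = "",
     main = "Plotting symbols")

# Line types: lty = 1 to 6, one horizontal line each
plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, 7),
     xlab = "", ylab = "lty value", main = "Line types")
for (i in 1:6) {
  abline(h = i, lty = i)
}

closed <- dev.off()   # close the device again
```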

Application : Econometrics
● Linear Regression

● Multiple Linear Regression

● ANOVA table

● Creating Dummy Variable
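
A sketch of these four tasks using mtcars, a dataset that ships with R. The choice of dataset, the variables and the cut-off used for the dummy are assumptions made for illustration, not taken from the slides.

```r
# Simple linear regression: fuel economy on weight
fit1 <- lm(mpg ~ wt, data = mtcars)

# Multiple linear regression: add horsepower as a second regressor
fit2 <- lm(mpg ~ wt + hp, data = mtcars)
summary(fit2)

# ANOVA table for the fitted model
anova(fit2)

# Dummy variable: 1 if the car's wt value is above 3, else 0
cars <- mtcars
cars$heavy <- ifelse(cars$wt > 3, 1, 0)
fit3 <- lm(mpg ~ hp + heavy, data = cars)
coef(fit3)
```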



Application : Mathematics
● Determinant of a Matrix

● Solving System of Linear Equations - solve()

● Eigenvalues and Eigenvectors - eigen()
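
A minimal sketch of the three operations on a made-up 2 x 2 system. det() is the base R determinant function; the matrix A and the right-hand side b are illustrative.

```r
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # rows: (2, 1) and (1, 3)
b <- c(3, 5)

det(A)               # 5

# Solve the linear system A x = b
x <- solve(A, b)
x
A %*% x              # reproduces b

# Eigenvalues and eigenvectors
e <- eigen(A)
e$values
e$vectors
```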



Dates in R
● Date: Another type of data

● Represented as number of days since 1970-01-01

● Useful when the data contain a time series



Dates in R
as.Date(x, format = )
○ 24/09/2015: as.Date("24/09/2015", "%d/%m/%Y")
■ "2015-09-24"
○ Sep 24, 2015: as.Date("Sep 24, 2015", "%b %d, %Y")
■ "2015-09-24"
○ 2015-09-24: as.Date("2015-09-24")
○ 2015-09-24: as.Date("2015-09-24", format = "%Y-%m-%d")
■ "2015-09-24"
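
A short sketch of parsing dates and of the underlying day count. The %b month-name format is left out because it depends on the locale of the machine.

```r
d1 <- as.Date("24/09/2015", format = "%d/%m/%Y")
d2 <- as.Date("2015-09-24")            # the default format is %Y-%m-%d
d1 == d2                               # TRUE

# Dates are stored as the number of days since 1970-01-01
as.numeric(as.Date("1970-01-01"))      # 0
as.numeric(as.Date("1970-01-10"))      # 9

# Which makes date arithmetic and reformatting straightforward
d2 + 7                                 # "2015-10-01"
format(d2, "%d/%m/%Y")                 # "24/09/2015"
```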

Learn More at
● edX
● Coursera
● DataCamp