1 tayangan

Diunggah oleh Dawsonn-andoh Steve

- 04 Regression
- Methods
- multiple regression
- Datascience Online Training in Hyderabad
- Quantitative Methods Summary 2012 - J Stegemann
- Lecture 8 Regression Analysis
- Minitab Commands
- Lab_5_Nerlove(a)
- Regression Analysis One 18-05-2011
- Vs30 From Shallow Models
- Published-PDF-1023-6-The Study of the Determinants of Systemic Risk on Earnings Response Coefficient
- Buildings-203578- Revised- With Track Changes
- THE EFFECT OF COMPENSATION, COMPETENCE, AND TALENT MANAGEMENT ON EMPLOYEES PERFORMANCE OF THE IKATAN MOTOR INDONESIA.
- Annexure a Thesis Baljit Singh
- Model Selection
- Slides Session 4
- Algorithmic Mathematics
- Analisis Regresi Sederhana Jawaban
- Assignment 3 Sem 1 2015
- Jurnal 2

Anda di halaman 1dari 4

Marist College

Michael J. Crawley

John Wiley & Sons, Chichester, 2015.

ISBN 978-1-118-94109-6. 339 pp. USD 45.00 (P).

http://www.imperial.ac.uk/bio/research/crawley/statistics/

Introduction

Michael J. Crawleys text Statistics: An Introduction with R is indeed an introduction to

statistical analysis as well as excellent introduction to working in and with R. However, it is

not for the faint of heart. The first half covers standard introductory material (descriptive

statistics, simple regression and inferential statistics) in a somewhat idiosyncratic way. The

second half of the text is a good introduction to modeling (multiple regression, ANOVA and

ANCOVA, various general linear models, survival analysis). The treatments are thorough

and yet short only 300 odd pages. As you can imagine he is extraordinarily succinct. There

are many useful and thoughtful insights; it is well written and clear. It contains no exercises;

there is an excellent appendix that serves well as a basic R tutorial. If you teach this material

without a text or think you might like to try, you might find that it (paradoxically) will work

as the structural backbone for a course that you fill out as you see fit. The text will serve

well for an introductory course or for a second course in modeling.

Discussion

The overall structure and approach is very nice. His use of R is quite gentle, intertwined

with the narrative in a way that is natural, informative, and quickly builds basic facility. In

early chapters Crawley introduces a statistic, then calculates it directly by hand in R. He

writes a function to automate the process, and only having done so points out the built-in R

version. He simultaneously discusses the meaning and use of the statistic, code to calculate it,

and higher level R structures. For instance, within the first few pages using R he introduces

functions, modular arithmetic, nested looping and ways to plot the results. This sounds like

a lot and it is, but it works well here. The magical black-box feel of many routines in varied

platforms should be dispelled by the approach. Once in a while he does not follow through

2 Statistics: An Introduction Using R (2nd Edition)

e.g., deriving the least squares coefficients yet going directly to lm but for the most part he

uses this method to good effect.

The material is kept quite applied. No real discussion of the mathematics or theory is provided.

Nevertheless he is careful with algebraic arguments concerning partitioning sums of squares or

calculation of degrees of freedom; he derives the (two-variate) normal equations and solutions.

These types of calculations are the only instances of boxed, independent asides outside of the

general narrative.

The book starts out with a chapter of fundamentals, more of an annotated glossary for

later reference than an introduction. Here we find general guidelines for types of analyses;

discussion of hypotheses and error types; typical assumptions; experimental design; maximum

likelihood estimation; power; replication; randomization; weak and strong inference; multiple

comparisons; as well as a laundry list of some of the basic model types available in R. In

twenty-two pages. Whew!

The presentation is rather terse the subsection on controls reads in its entirety:

No controls, no conclusions.

Pseudoreplication occurs when you analyze the data as if you had more degrees

of freedom than you really have.

This is the first mention of degrees of freedom I can find in the text; these sorts of out of

place sections lard the text, and can cause some confusion. While frustrating to students

new to the material, it is in service of brevity, elegance of presentation, and perhaps piquing

the interest of the reader for what is to come. Though at times annoying, I find it works

reasonably well overall. Clearly, it is intended that some of the material will become clearer

as the reader progresses through the book.

The second chapter begins the material on R, introducing dataframes and their manipulation,

as well as initial data inspection summary statistics and basic graphical analysis (in that

order: Id reverse it, but then thats me). Datasets from the books website are used, with R

commands highlighted in red, the output in blue.

The third chapter begins where most texts start: measures of central tendency. Here, as well

as in later chapters, the treatment is not completely standard. A discussion of quartiles and

the five number summary is omitted. After discussing the mean and median, the geometric

and harmonic means are presented (with good motivating examples) as well.

Another instance of non-linearity: the next chapter on variance ends with a residual scatter-

plot showing heteroscedasticity long before chapters on regression or bivariate data. I found

myself frequently wishing for a more thorough index. On the other hand, Crawley introduces

confidence intervals very naturally and intuitively, both t and bootstrapped versions. He

demonstrates sample size effects on estimation using nested loops for sampling and graphics.

It is an odd, or at least unorthodox, presentation that works nicely.

In the next chapters he introduces analysis of single samples and two samples, including nor-

mality tests and both parametric and nonparametric inference techniques. The single-sample

chapter finishes up with a discussion of statistics to measure skew and kurtosis. Two-sample

nonparametric alternatives (rank sum and binomial test) feature prominently. He includes

Journal of Statistical Software Book Reviews 3

graphical techniques as well: to with notched box plots. The 2 and Fishers exact tests

for independence of two samples of counts is covered. The two-sample discussion concludes

with correlation including Pearsons product-moments correlation method of testing for

differences from zero.

From here, Crawley begins simple regression by reminding the reader of the definition of slope

and other rather rudimentary topics. Curiously, he spends a rather verbose three pages at a

mathematical level dramatically lower than the rest of the texts narrative. That aside, the

presentation is very good, and quickly goes well beyond normal material to include partition-

ing sums of squares; calculating standard errors for the slope and intercept, careful model

checking and transformations to correct problems, polynomial regression and even non-linear

regression as well as generalized additive models.

Again, whew! This recalls to mind a colleagues quip that assigning a difficult text is a

desirable thing in that it allows you to be the good guy when you elaborate and clarify in

the classroom. Too many modern texts are written in such a way that the roles are reversed.

But I overstate: the worked examples are clear, interesting and easily replicated with the

commands shown. Color is used well but sparingly to differentiate text, code, and output, and

for graphics. There are no colored boxes, marginal notes, distracting highlights or gratuitous

photos in the text.

The narrative continues with chapters on ANOVA, ANCOVA, multiple regression, contrasts,

and a series of chapters on general linear models. In all of these the focus is on the nature of

the data. Poisson regression is discussed in the chapter on count data, logistic in the chapters

on proportion data and binary response data. A brief chapter on survival analysis rounds out

the text.

These regression chapters are my favorite part of the book. With minor variations each

chapter in the latter half of the book takes a data set, sets out the situation, does a basic

analysis, finds a maximal model and then either by hand or using step prunes to a minimal

adequate model. The implications of the model and need for transformations are assessed,

and if necessary the process is repeated. He uses graphics well to assess models and composes

nice multilayer plots of the data and fitted curves.

The entire set of chapters begins with an overview of types of data, error structures, link

functions, deviance and AIC. This type of overview is one of the great strengths of the

exposition. Crawley provides excellent characterizations at various points outlining the big

picture or the general steps an analysis should take.

I have a few quibbles. The exposition is, again, at times non-linear. We frequently meet terms

that are only defined later in the text. There is very little theory, which gives many of the

topics a rabbit-out-of-the-hat feel to them though when possible nice intuitive comments

(e.g., for maximum likelihood techniques) are presented in lieu of the mathematics.

There are some inconsistencies. He makes a strong case for graphical analysis of regression

results, providing an example of a model with a strong numeric summary that nevertheless

graphical analysis shows to be flawed. The point is a good one if not consistently followed.

Before this example there were several occasions where he proceeds, willy-nilly, with a numer-

ical analysis without even a precaution to the student that they consider graphing the data

first. Indeed, in Chapter 4 variance is calculated repeatedly while ignoring any plots.

4 Statistics: An Introduction Using R (2nd Edition)

Conclusion

Crawley has a real knack for outlining the general steps and processes of an analysis that is

clear, concise, informative and useful. He frequently makes pithy, insightful points without

overelaboration. He steps back regularly to give a broad overview of the topic. The use of R is

integrated seamlessly into the exposition, and will provide the novice with a straightforward

introduction to R code. The approach is not quite what I expected from an introductory text,

but I am left with the strong sense that it should be.

I could easily see this used as the backbone of both a first-year course for STEM students

with the mathematical maturity to take Calculus, as well as a course in applied statistics for

students who have taken at least a semester of mathematical probability and statistics. It

does omit exercises, and given the authors brevity will need elaboration in the classroom.

However the text provides strong overall structure and would be a good student resource.

Having taught the material from the second half of the text recently without a text, I find

myself wishing I had known of and made this text available to my class.

Reviewer:

James E. Helmreich

Marist College

Department of Mathematics

3399 North Road

Poughkeepsie, NY 12514, United States of America

E-mail: James.Helmreich@Marist.edu

URL: http://foxweb.marist.edu/users/james.helmreich/

published by the Foundation for Open Access Statistics http://www.foastat.org/

October 2015, Volume 67, Book Review 5 Published: 2015-01-06

doi:10.18637/jss.v067.b05

- 04 RegressionDiunggah olehSyed Noman Hussain
- MethodsDiunggah olehdurgasainath
- multiple regressionDiunggah olehWinny Shiru Machira
- Datascience Online Training in HyderabadDiunggah olehMohan R
- Quantitative Methods Summary 2012 - J StegemannDiunggah olehismuz
- Lecture 8 Regression AnalysisDiunggah olehadilsajjad2005
- Minitab CommandsDiunggah olehHenry Tom
- Lab_5_Nerlove(a)Diunggah olehZahirlian
- Regression Analysis One 18-05-2011Diunggah olehSyed Ovais
- Vs30 From Shallow ModelsDiunggah olehNarayan Roy
- Published-PDF-1023-6-The Study of the Determinants of Systemic Risk on Earnings Response CoefficientDiunggah olehPay Sohilauw
- Buildings-203578- Revised- With Track ChangesDiunggah olehYezan Arahman
- THE EFFECT OF COMPENSATION, COMPETENCE, AND TALENT MANAGEMENT ON EMPLOYEES PERFORMANCE OF THE IKATAN MOTOR INDONESIA.Diunggah olehIJAR Journal
- Annexure a Thesis Baljit SinghDiunggah olehKaustav Manna
- Model SelectionDiunggah olehvamsi54
- Slides Session 4Diunggah olehAdarsh Agrawal
- Algorithmic MathematicsDiunggah olehmadhurocksktm
- Analisis Regresi Sederhana JawabanDiunggah olehAiy Nita Fitriani
- Assignment 3 Sem 1 2015Diunggah olehRyan
- Jurnal 2Diunggah olehframesti
- EL5Diunggah olehGustavTillman
- Chapter 7 and 8Diunggah olehAdam Glassner
- Effect of Cash Flow Operating, Earning Per Share and Economic Value Added Activities on Stock Prices in Manufacturing Companies Listed In Indonesia Stock Exchange (2011-2013 Period)Diunggah olehInternational Journal of Innovative Science and Research Technology
- Forecasting 2nd III 17Diunggah olehNIKHIL SINGH
- THUC HANHDiunggah olehHuu Nguyen
- 046Diunggah olehPedro Roman
- 322-1090-1-PB.pdfDiunggah olehgpal.india2802
- Lecture 07Diunggah olehAshwani Kumar
- Adaptive Estimation and Control for Systems with Parametric and Nonparametric UncertaintiesDiunggah olehsjhsjh
- 75389103 Equal Employment Opportunity EEO in BangladeshDiunggah olehsajid bhatti

- High Performance Django (Sample Chapter)Diunggah olehMarco Santonocito
- AppleDiunggah olehDawsonn-andoh Steve
- Stand Dev Asa Mea RiskDiunggah olehDawsonn-andoh Steve
- General Considerations for the Analysis of Case-control StudiesDiunggah olehDawsonn-andoh Steve
- Programming support fileDiunggah olehRichard Ding
- Internet of Things for Smart CitiesDiunggah olehloory21
- s2rDiunggah olehDawsonn-andoh Steve
- 555kma Fee Fixing Resolution for 2017 GazetteDiunggah olehDawsonn-andoh Steve
- 09.Project-Hospital Management SystemDiunggah olehRajeev Ranjan
- BosDiunggah olehProfRomeo Moreno
- Management of Accra Meteropolitan Assembly MarketsDiunggah olehDawsonn-andoh Steve
- Local Government BulletinDiunggah olehDawsonn-andoh Steve
- Carnegie Mellon Recruiting in GhanaDiunggah olehDawsonn-andoh Steve
- 0904.2731Diunggah olehDawsonn-andoh Steve
- Data AnalyticsDiunggah olehdeepa
- 2. Command Line Manipulating Text Files on the Command LineDiunggah olehdimoni_et
- Upgrading & Improving Existing GHG Database v1Diunggah olehDawsonn-andoh Steve
- Example Service Level Agreement_editedDiunggah olehketanshekhar
- V2N1-005Diunggah olehDawsonn-andoh Steve
- ZR-03-38Diunggah olehDawsonn-andoh Steve
- JavaScript AllongeDiunggah olehMurali Nunna
- 011.5 Relational Algebra Union Diff SelectDiunggah olehDawsonn-andoh Steve
- Leanpub.retrofit.love.Working.with.APIs.on.AndroidDiunggah olehDawsonn-andoh Steve
- Python Lectures 5Diunggah olehDawsonn-andoh Steve
- 3-Network Simulation Tools SurveyDiunggah olehRakhmadhanyPrimananda
- GrimmsFairyStories.pdfDiunggah olehnimfaea
- 1-s2.0-S0304397597000674-main.pdfDiunggah olehDawsonn-andoh Steve
- UsingMicrosoftAccess7 InterfacesDiunggah olehDawsonn-andoh Steve

- Lecturer Notes of Unit -3Diunggah olehGraba Cada
- Singular Value Decomposition - Lecture NotesDiunggah olehRoyi
- Lecture 1Diunggah olehDayah Jaafar
- probstatmarkov.pdfDiunggah olehAmit
- Prony method for exponentialDiunggah olehsoumya
- Numerical Method for engineers-chapter 17Diunggah olehMrbudakbaek
- Probabilistic Forecasting and B - Sebastian ReichDiunggah olehVictor Hugo Garay Saez
- Guidelines for calibrationDiunggah olehfurious143
- Notes on Mod Al AnalysisDiunggah olehAfham Ahmad
- Linear Least Squares (Mathematics) - Wikipedia, The Free EncyclopediaDiunggah olehWanly Pereira
- introduction.pdfDiunggah olehJawad Munir
- SchoukensSystemIdentification2013.pdfDiunggah olehDev Ganatra
- Matrix Methods in Data Mining and Pattern Recognition, EldenDiunggah olehJoelBarnett
- Elements of the Theory of Generalized Inverses of MatricesDiunggah olehBen Ali Fathi
- 1.0 Regression Problems for Magnitudes - Castellaro 2006Diunggah olehGovind Gaurav
- [Nicholas J. Higham] Accuracy and Stability of NumDiunggah olehyassirajalil
- PWJohnDiunggah olehmaleticj
- NA-M-89-03Diunggah olehTapan Pradhan
- Mathematics of Dispersion ModelingDiunggah olehsscoston
- EconometricsDiunggah olehMarjo Kaci
- Applied Parameter Estimation for Chemical Engineers (Chemical Industries) by Peter Englezos {anexo1]Diunggah olehcegarcia
- (Lecture Notes) James v. Burke-Linear Least-Squares Problems (2013)Diunggah olehqwertry
- System IdentificationDiunggah olehdunerto
- The methods of geophysical inversion.pdfDiunggah olehAnonymous 781tGbsut
- Datasheet Go Cons Gcf 2 Dir r4Diunggah olehNinin Irmania
- SM 56Numerical AnalysisDiunggah olehpavan kumar kvs
- linear modelsDiunggah olehMario Zamora
- APPLIED REGRESSION ANALYSIS AND GENERALIZED LINEAR MODELS fox 2008Diunggah olehcaro
- Applied Parameter Estimation for Chemical EngineersDiunggah olehRogério Do Nascimento
- ex1Diunggah olehapi-322416213