
BU.510.650.XX – Data Analytics – Instructor Name – Page 1 of 4

Course Outline
MATH6200 - Data Analysis (Data Analytics)
Texts & Learning Materials
There is no required textbook. All required class materials will be available on our Blackboard website.
However, the books below are very useful if you want to learn more about data analytics and its applications. The
best way to learn is by doing (especially for R programming).

Optional Textbook 1 (highly recommended, easy to follow, with many examples and data sets):
Data Mining and Business Analytics with R, by Johannes Ledolter;
Publisher: Wiley (2013), ISBN-13: 978-1118447147;
Available in Johns Hopkins online library: https://catalyst.library.jhu.edu/catalog/bib_4637122

Optional Textbook 2 (solid primer, with theory and explanation):
An Introduction to Statistical Learning with Applications in R, by Gareth James, Daniela Witten, Trevor
Hastie, Robert Tibshirani;
Publisher: Springer (2013); ISBN-13: 978-1461471370;
Available in Johns Hopkins online library: https://catalyst.library.jhu.edu/catalog/bib_6591386

Optional Textbook 3 (a great advanced text):
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Trevor Hastie, Robert
Tibshirani, and Jerome Friedman. It requires some mathematical sophistication and goes beyond the
material we will be covering. The book is free at https://web.stanford.edu/~hastie/Papers/ESLII.pdf

Software:
• We require the R statistical software, which is powerful and free. R can be downloaded at the link
below: http://www.cran.r-project.org/
• RStudio is a free platform for both writing and running R, available at www.rstudio.org. Some students
find it friendlier than basic R (especially on Windows).
• The learning curve is steep, but students can become proficient in a few weeks. Some manuals are
very helpful for learning R, e.g., http://cran.r-project.org/manuals.html
• I provide limited software instruction, in-class demonstrations, and code to accompany lectures and
assignments. We do not assume that you have used R in a previous class. However, this is not a
class on R. Like any language, R is only learned by doing. You should install R as soon as possible
and familiarize yourself with basic operations.
• Additional resources: (a) Tutorials at data.princeton.edu/R are fantastic (and there are many others out
there). (b) YouTube intros to R, e.g., the series from Google Developers.

Course Description
This course prepares students to gather, describe, and analyze data, and to use advanced statistical tools to
make decisions in operations, risk management, finance, marketing, and related areas. The analysis targets
economic and financial decisions in complex systems that involve multiple partners. Topics include probability,
statistics, hypothesis testing, regression, clustering, decision trees, and forecasting.

Learning Objectives
By the end of this course, students will be able to:
1. Gather sufficient relevant data, conduct data analytics using scientific methods, and make appropriate
and powerful connections between quantitative analysis and real-world problems.
2. Demonstrate a sophisticated understanding of the concepts and methods; know the exact scopes and
possible limitations of each method; and show capability of using data analytics skills to provide
constructive guidance in decision making.
3. Use advanced techniques to conduct thorough and insightful analysis, and interpret the results
correctly with detailed and useful information.
4. Show substantial understanding of the real problems; conduct deep data analytics using correct
methods; and draw reasonable conclusions with sufficient explanation and elaboration.
5. Write an insightful and well-organized report for a real-world case study, including thoughtful and
convincing details.
6. Make better business decisions by using advanced techniques in data analytics.

Attendance
Attendance and class participation are part of each student’s course grade. Students are expected to attend all
scheduled class sessions. Failure to attend class will result in an inability to achieve the objectives of the
course. Excessive absence will result in loss of points for participation. Regular attendance and active
participation are required for students to successfully complete the course.

Class participation is an important part of learning. If you have a question, it’s likely that others do as well. I
encourage active participation, and course grades will reflect particularly strong contributions.

Study Groups (not required, but highly recommended)


Many students learn better and faster when working in a group, so I encourage collaborative learning. You can
work together in a study group of 2–4 students to discuss class materials, homework assignments, and
projects on a weekly basis. However, you must write each homework assignment individually, in your own
words; your text should reflect your own understanding of the materials. Study groups can be
different from project groups.

Tentative Course Calendar


The instructors reserve the right to alter course content and/or adjust the pace to accommodate class
progress. Students are responsible for keeping up with all adjustments to the course calendar.

Week | Date   | Weekly Objectives/Topics                           | Recommended Reading (book by Ledolter) | Assignments
-----|--------|----------------------------------------------------|----------------------------------------|------------
1    | [date] | Introduction, Data Summarization and Visualization | Text, Ch 1, 2                          |
2    | [date] | Linear and Nonlinear Regression                    | Text, Ch 3, 4, 5, 6                    | HW 1 is due
3    | [date] | Model Selection                                    | Text, Ch 7, 8, 9, 11                   | HW 2 is due
4    | [date] | Classification, Logistic Regression                | Text, Ch 13, 14, 15, 16                | HW 3 is due
5    | [date] | Clustering                                         | Text, Ch 19, 20                        | HW 4 is due
6    | [date] | Decision Trees                                     | Text, Ch 17, 18                        | HW 5 is due
7    | [date] | Project Presentation                               |                                        | HW 6 is due
8    | [date] | Final Exam                                         |                                        |



Appendix. Homework Rubric for Data Analytics Course: Part 1

Assessment Criteria | Not Good Enough (0 ≤ score < 6) | Good (6 ≤ score < 9) | Very Good (9 ≤ score ≤ 10) | Score
--------------------|---------------------------------|----------------------|----------------------------|------
Deep understanding of theory and its applications, using qualitative methods to answer business questions | Demonstrate inadequate understanding of some important concepts, methods, or their applications, e.g., choose wrong methods, conduct analysis inappropriately, or interpret results incorrectly. | Understand concepts and methods relatively well; analyze data using acceptable, although not perfect, methods; be able to derive useful information for decision making. | Demonstrate sophisticated understanding of the concepts and methods; know the exact scope and possible limitations of each method; show capability of using data analytics skills to make the right business decisions. |
Implementation and interpretation of data analysis techniques | Use wrong techniques to analyze data; present inappropriate interpretations or conclusions. | Choose acceptable methods to analyze data; interpretations are sensible; derive useful results. | Use advanced techniques to conduct thorough and insightful analysis; interpret the results correctly; draw the right conclusions based on data analysis. |
Ability to solve real-world problems using quantitative methods | Data is inadequate or unstructured. Use inappropriate methods to analyze data; fail to retrieve useful information. Suggestions are not persuasive. | Collect and document just enough data; employ appropriate techniques to retrieve insightful information from data; make reasonable recommendations. | Gather sufficient relevant data; conduct data analytics using scientific methods; make appropriate and powerful connections between analysis and real-world problems; provide constructive guidance in decision making. |
Writing and presenting, especially organization and communication | Report is inadequately written and poorly organized. Analysis is insufficient. Conclusions are unconvincing. | Report is concise and clearly written. Analyze problems following scientific strategies; provide useful suggestions with detailed explanation. | Report is well organized and insightfully written, and includes thorough and thoughtful details. Conclusions are convincing. |

Total Score:
Comments:

Appendix. Homework Rubric for Data Analytics Course: Part 2

Assessment Criteria | Not Good Enough (0 ≤ score < 6) | Good (6 ≤ score < 9) | Very Good (9 ≤ score ≤ 10) | Score
--------------------|---------------------------------|----------------------|----------------------------|------
Interpretation of Data (qualitative) | Little or no attempt to interpret data; or there are significant errors; or some data are over- or under-interpreted. | Interpret most data correctly; some conclusions may be suspect; suggestions on future implementation are sound. | Data are completely and appropriately interpreted; there is no over- or under-interpretation; draw convincing conclusions. |
Statistical Analysis (quantitative) | Statistical methods are completely misapplied, or applied with significant errors or omissions. Choose inappropriate methods and make wrong predictions. | Most statistical methods are correctly applied, but more could have been done with the data. Predictions are sensible but may deviate widely from the true results. | Statistical methods are fully and correctly applied; demonstrate superior data analysis skills; deeply mine the data and obtain useful insights for decision making. |
Critical evaluation of findings | Blindly accept defective results; or recognize defective results but do not know how to fix them. | Recognize defective results and figure out the causes; understand the main sources of errors. | Show deep understanding of the sources of errors; recognize defective results and eliminate the causes. |
Ability to draw proper conclusions and make effective suggestions | Do not draw conclusions; draw incorrect conclusions; suggestions are not acceptable. | Draw correct conclusions; suggestions may have potential impact on the future business. | Demonstrate substantial understanding of the problem; conduct deep data analytics using correct methods; draw correct conclusions with sufficient explanation and elaboration. |

Total Score:
Comments: