Syllabus
Data for the course (from the book)
https://studysites.sagepub.com/dsur/study/articles.htm
Another Book
An Introduction to Statistical Learning, with Applications in R (ISLR)
Data From ISLR
Advertising.csv
Auto.data
Auto.csv
College.csv
Ch10Ex11.csv
Credit.csv
Income1.csv (Figure 2.2)
Income2.csv (Figure 2.3)
Heart.csv
Smarket.csv
Caravan.csv
Data for the ISLR Chapter 4 Example
PizzaChoiceGenerator.csv
LDAMulti.csv
Data for Homework 5
Pres1.csv
Pres2.csv
Pres3.csv
Pres4.csv
Pres5.csv
Course Assignments
- Homework 1
- Linreg.csv
- Run a multiple linear regression with y = f(x1,x2)
- What are the F-statistics and their corresponding p-values? Interpret them. Is your model good?
- Write out your model explicitly
- abc.csv
- Run ANOVA on the data.
- Separate the data into columns with A, B and C as Boolean variables and run a linear regression. What is your model?
- donates.csv
- Run a logistic regression on this data where donate = f(x1,x2)
- Evaluate your model
- Write out your model explicitly.
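As a hedged illustration of what "writing the model explicitly" might look like for the three data sets above, the general forms are sketched here. The coefficients \beta_i and the error term \varepsilon are generic placeholders, not fitted values. The response name y and the choice of level A as the baseline for abc.csv are assumptions made only for this sketch; other codings are equally valid.

```latex
\begin{align*}
% Multiple linear regression (Linreg.csv)
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon \\[4pt]
% Regression on Boolean indicator columns (abc.csv), level A assumed as baseline
y &= \beta_0 + \beta_B \,\mathbf{1}[\text{group} = B]
              + \beta_C \,\mathbf{1}[\text{group} = C] + \varepsilon \\[4pt]
% Logistic regression (donates.csv)
\log\frac{P(\text{donate} = 1)}{1 - P(\text{donate} = 1)}
  &= \beta_0 + \beta_1 x_1 + \beta_2 x_2
\end{align*}
```

Once a model is fitted, each \beta is replaced by its estimate from the regression output.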
- Homework 2
- For houseprice.csv
- Perform ANOVA using any R command you wish.
- Perform a Helmert planned contrast from scratch.
- Use the R commands learned in class to compute the same Helmert planned contrasts as in question 2.
- Perform pairwise t-tests on the data set, using your choice of the "holm" or "bonferroni" adjustment.
For each step, explain what you did and interpret your result. Write in complete English sentences so that you can understand your work a year from now. The other data sets from Jan 28th are age.csv and hhi.csv.
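The two ideas behind this homework, building Helmert contrast weights by hand and adjusting pairwise p-values, can be sketched in a few lines. This is only an illustration in Python using the standard library; the course work itself is done in R. The function names are hypothetical, and the contrast layout follows the convention of R's contr.helmert (each level compared with the levels before it). Some textbooks define Helmert contrasts in the opposite direction, so check which convention the class uses.

```python
def helmert_contrasts(k):
    """Return a k x (k-1) contrast matrix as a list of rows.

    Column j (0-based) compares level j+1 with all earlier levels,
    the layout used by R's contr.helmert(k).
    """
    rows = []
    for level in range(k):
        row = []
        for col in range(k - 1):
            if level <= col:
                row.append(-1)        # earlier levels share weight -1
            elif level == col + 1:
                row.append(col + 1)   # the level being compared
            else:
                row.append(0)         # later levels are left out
        rows.append(row)
    return rows


def bonferroni(pvalues):
    """Multiply every p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjustment, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is scaled by m, the next by m - 1, and so on.
        running_max = max(running_max, (m - rank) * pvalues[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted


if __name__ == "__main__":
    for row in helmert_contrasts(3):
        print(row)
    made_up_pvalues = [0.01, 0.04, 0.03]  # illustrative values, not course data
    print(bonferroni(made_up_pvalues))
    print(holm(made_up_pvalues))
```

Each contrast column sums to zero, and Holm never gives a larger adjusted p-value than Bonferroni, which is why it is usually the less conservative choice.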
- Homework 3
- Problems 8, 9, 10, 11, 12 and 13 in ISLR Chapter 3
- Homework 4
- For Three.csv
- Make an LDA Model (label model 1).
- Find accuracy, specificity and sensitivity.
- Make a QDA model and compare accuracy, specificity and sensitivity.
- For Quad.csv
- Make a QDA model (label model 2).
- Find accuracy, specificity and sensitivity.
- Make a logistic regression model and compare accuracy, specificity and sensitivity.
- For KNNData.csv
- Make a KNN model for various k.
- Evaluate the models and select the k with the highest accuracy. Label the model with the highest accuracy model 3.
- Find specificity and sensitivity.
- Make an LDA model and compare accuracy.
- Extra credit: for each of the three models above (model 1, model 2 and model 3), make a 2D graph displaying
- the data points, color coded by class (a and b, or a, b and c), and
- the decision boundary.
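The three measures requested throughout this homework all come from the same four counts of a two-class confusion matrix. A minimal sketch is given here in Python with the standard library only; the function name and the toy labels are hypothetical, and which class counts as "positive" is a choice you must state in your write-up.

```python
import math


def binary_metrics(actual, predicted, positive):
    """Accuracy, sensitivity and specificity for a two-class problem.

    `positive` names the class treated as the positive outcome.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)

    def ratio(numerator, denominator):
        # Undefined when a class never occurs, so report NaN instead of failing.
        return numerator / denominator if denominator else math.nan

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Made-up labels for illustration only.
    actual = ["a", "a", "a", "b", "b", "b", "b", "b"]
    predicted = ["a", "a", "b", "b", "b", "b", "a", "b"]
    print(binary_metrics(actual, predicted, positive="a"))
```

Swapping the positive class swaps sensitivity and specificity, so comparisons between LDA, QDA, logistic regression and KNN only make sense when all models use the same positive class.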
- Homework 5: Presentation - Regression
- Load and Clean Data
- Handle NAs
- Find and handle outliers
- Find and handle high-leverage points
- Look for some correlation between variables
- Compute correlation matrix
- Compute graphs
- Is the data normal?
- Model exploration - Regression
- Check for multicollinearity
- Run several models, including quadratic ones:
Y~X_1, Y~X_1+X_2, Y~X_1+X_2+X_3,
Y~X_1^2+X_2^2+X_1X_2+X_1+X_2, ...
- Perform Model selection via cross validation
- Select Your Model
- Why did you select your model?
- Describe the formula of your model.
- What do the parameters tell us about the model (if you have parameters).
- What is the adjusted R^2 statistic? Interpret.
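Two of the steps above, splitting the data for cross-validation and computing the adjusted R^2, can be sketched without any modelling library. This is an illustrative Python sketch using only the standard library; the function names, the default seed and the example numbers are assumptions, and in the course itself these steps would be carried out with R commands.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices 0..n_rows-1 and deal them into k near-equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_rows, k, seed=0):
    """Yield (train, test) index lists, using each fold once as the test set."""
    folds = k_fold_indices(n_rows, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def adjusted_r2(r2, n_rows, n_predictors):
    """Penalise R^2 for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_rows - 1) / (n_rows - n_predictors - 1)


if __name__ == "__main__":
    for train, test in train_test_splits(n_rows=10, k=3):
        print("train:", sorted(train), "test:", sorted(test))
    # Illustrative numbers only: R^2 = 0.80 from 21 rows and 2 predictors.
    print(adjusted_r2(0.80, n_rows=21, n_predictors=2))
```

For model selection, each candidate formula is fitted on every training split and scored on the matching test split; the model with the lowest average test error is the usual choice.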
- Homework 5: Presentation - Categorical
- Load and Clean Data
- Handle NAs
- Find and handle outliers
- Find and handle high-leverage points
- Look for some correlation between variables
- Compute correlation matrix
- Compute graphs
- Is the data normal?
- Model exploration - Categorical
- Check for multicollinearity
- Run several models: logistic regression, LDA, QDA and KNN
- Perform Model selection via cross validation
- Select Your Model
- Why did you select your model?
- Describe the formula of your model. If you chose LDA or QDA, what is the decision boundary?
- What do the parameters tell us about the model (if you have parameters).
- Print a confusion matrix, interpret it, and mention the different measures of accuracy.
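For a response with more than two classes, the confusion matrix is a square table of actual classes against predicted classes. One way to build and print it is sketched here in Python with the standard library; the function names and the toy labels are hypothetical, and rows are assumed to be the actual classes and columns the predicted ones.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return (labels, matrix) with rows = actual class, columns = predicted."""
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix


def overall_accuracy(matrix):
    """Share of observations on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else float("nan")


def print_matrix(labels, matrix):
    """Print the table with a header row of predicted classes."""
    print("actual\\predicted", *labels, sep="\t")
    for label, row in zip(labels, matrix):
        print(label, *row, sep="\t")


if __name__ == "__main__":
    # Made-up labels for illustration only.
    actual = list("aabbcc")
    predicted = list("abbbca")
    labels, matrix = confusion_matrix(actual, predicted)
    print_matrix(labels, matrix)
    print("accuracy:", overall_accuracy(matrix))
```

Reading along a row shows how one actual class was distributed over the predictions, which is where per-class sensitivity comes from.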
Quizzes
Tests
Final Exam Schedule
The final exam will be held on Tuesday, May 12, 2020, at the regular class time in our regular classroom.