Principles of Machine Learning: R Edition

Get hands-on experience building and deriving insights from machine learning models using R and Azure Notebooks.

Principles of Machine Learning: R Edition Highlights

Course Enrollment

Starts on: 15 April 2019
Enrollment closes on: 30 September 2019
Duration: 36 to 48 hours in total
Fee: Free

Enrollment is Closed

About this course

Machine learning uses computers to run predictive models that learn from existing data in order to forecast future behaviors, outcomes, and trends.
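
As a rough illustration of "learning from existing data to forecast an outcome", the sketch below fits a straight line to a handful of past observations and uses it to predict a new value. It is not taken from the course materials: the data, the variable names, and the helper functions `fit_line` and `predict` are all hypothetical, and Python is used here only for brevity (the course itself works in R).

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (hypothetical helper)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums of the centered data
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up "existing data": hours studied vs. exam score
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

model = fit_line(hours, scores)   # learn from existing data
forecast = predict(model, 6)      # forecast an unseen case
print(forecast)
```

The same pattern of fitting on known data and then predicting on new inputs underlies the regression and classification topics listed in the syllabus, although real models are validated on held-out data before being trusted.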

In this data science course, you will be given clear explanations of machine learning theory combined with practical scenarios and hands-on experience building, validating, and deploying machine learning models. You will learn how to build and derive insights from these models using R and Azure Notebooks.

edX offers financial assistance for learners who want to earn Verified Certificates but who may not be able to pay the fee. To apply for financial assistance, enroll in the course and then complete an application for assistance.

What you'll learn

After completing this course, you will be familiar with the following concepts and techniques:

  • Data exploration, preparation and cleaning
  • Supervised machine learning techniques
  • Unsupervised machine learning techniques
  • Model performance improvement

Prerequisites

To complete this course successfully, you should have:

  • A basic knowledge of math
  • Some programming experience – R is preferred
  • A willingness to learn through self-paced study

Course Syllabus

  • Introduction to Machine Learning
  • Exploring Data
  • Data Preparation and Cleaning
  • Getting Started with Supervised Learning
  • Improving Model Performance
  • Machine Learning Algorithms
  • Unsupervised Learning

Note: This syllabus is preliminary and subject to change.

Meet the instructors

Graeme Malcolm

Senior Content Developer
Microsoft Learning Experiences

Graeme has been a trainer, consultant, and author for longer than he cares to remember, specializing in SQL Server and the Microsoft data platform. He is a Microsoft Certified Solutions Expert for the SQL Server Data Platform and Business Intelligence. After years of working with Microsoft as a partner and vendor, he now works in the Microsoft Learning Experiences team as a senior content developer, where he plans and creates content for developers and data professionals who want to get the best out of Microsoft technologies.

Steve Elston

Managing Director
Quantia Analytics, LLC

Steve is a big data geek and data scientist with over two decades of experience using R and S/SPLUS for predictive analytics and machine learning. He holds a PhD in Geophysics from Princeton University and has led multinational data science teams across various companies.

Cynthia Rudin

Associate Professor
MIT and Duke

Cynthia leads the Prediction Analysis Lab at MIT, and is associated with the Computer Science and Artificial Intelligence Laboratory and the Sloan School of Management. She holds a PhD in applied and computational mathematics from Princeton University, and was previously an associate research scientist at the Center for Computational Learning Systems at Columbia University.

Jonathan Sanito

Senior Content Developer
Microsoft

Jonathan works as a content developer and project manager for Microsoft, focusing on data and analytics online training. He has worked on training for developer and IT pro audiences, from Microsoft Dynamics NAV to Windows Active Directory. Before coming to Microsoft, Jonathan worked as a consultant for a Microsoft partner, implementing Microsoft Dynamics NAV solutions.

Course Outline

Before You Start
Welcome to DAT276x: Principles of Machine Learning: R Edition
Meet Your Instructor
Pre-Course Survey
Course Overview
Lab Overview
Overview of Machine Learning with K-Means Classifiers
Demo of K-Means Classification
Lab 1 Introduction to Machine Learning
Data Exploration
Data Visualization
Visualizing Distributions
Visualizing Data Relationships
Visualizing Categorical Relationships
Using Aesthetics to Visualize High Dimensions
Visualizing High Dimensional Relationships
Visualizing for Classification
Frequency Tables
Visualizing Data for Regression
Visualizing Data for Classification
Data Preparation
Duplicates
Missing Values
Errors and Outliers
Scaling
Splitting
Summary
Finding and Treating Missing Data
Finding and Treating Duplicates
Scaling Data
Overview of Feature Engineering
Transforming Features
Interaction Terms
Summary
Feature Engineering
Data Preparation
Introduction to Linear Regression
Multiple Linear Regression
Basics of ML with R
Evaluating Regression Models
Demo of Evaluating Regression
Models with Categorical Variables
Introduction to Classification
Loss Function for Classification
Statistical Learning Theory for Supervised Learning
Logistic Regression
Maximum Likelihood Perspective
Evaluating Classifiers
ROC Curves
ROC Curve Algorithms
Classification Models
Demo of Classifier Evaluation
Imbalanced Data
Approaches for Addressing Imbalanced Data
Introduction to Regression
Applying Linear Regression
Classification
Improving Models
Regularization
Performing Regularization
Interpreting Features
Feature Selection
Sweeping Parameters
Cross Validation
Nested Cross Validation
Model Selection and Cross Validation
Overview of Dimensionality Reduction
Principal Component Analysis
Bias-Variance Trade-Off
Cross Validation
Feature Selection
Dimensionality Reduction
Decision Trees
What is Information
Boosting
AdaBoost
Coordinate Descent
Decision Forests
Introduction to Neural Networks
Backpropagation
Introduction to SVMs
Kernels for SVMs
Theory of Naive Bayes Models
Bagging
Boosting
Neural Networks
SVM
Naive Bayes
Introduction to Clustering
K-Means Clustering
Hierarchical Clustering
Evaluating Clustering Models
Introduction to Unsupervised Learning
Application of Clustering
Introduction
About the Data
Data Exploration
Challenge 2: Classification
Classification - Grading
Challenge 3: Regression
Regression Grading
Post-Course Survey
Course Certificate

Earn your certificate

Once you have completed this course, you will earn your certificate.
