Data Science Sprint 2: Statistical Tests and Experiments

Welcome to Sprint 2

Welcome to Sprint 2! An important application of statistics is designing and evaluating experiments. In the context of web applications, this usually means an A/B test, where users experience different versions of a site and we compare their behavior and outcomes.

How do you design a good and valid experiment? When have you run your experiment long enough? How do you evaluate the outcome of an experiment? Finally, how do you balance all of this math and science with the practical business/product concerns you're working with? These are the sorts of questions we'll discuss in this sprint.

Statistics is, in some ways, the most tolerant branch of mathematics. Unlike pure math, statistics accepts situations where the exact and complete are unattainable. We've learned about the basic summary metrics provided by descriptive statistics - mean, median, mode, standard deviation - and how you view these numerically and visually to help tell a story about data.

In this sprint, we'll still use these metrics but go deeper into their meaning and interpretation to perform hypothesis tests that allow us to state conclusions about our data with "confidence". We will also have a chance to build our first predictive models using linear regression.

Modules

This sprint is structured to provide you with a comprehensive understanding of statistical tests and experiments:

Module 1

Hypothesis Testing (t-tests) and Confidence Intervals

Build on descriptive statistics concepts to explore hypothesis testing with t-tests and t-distributions. Learn about p-values, confidence intervals, and the Central Limit Theorem.
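
As a rough preview of the kind of calculation this module covers, the sketch below computes a two-sample t statistic using only the Python standard library. It assumes Welch's unequal-variance form of the test, which is one common choice and not necessarily the one used in the module. The function name welch_t and the sample numbers are made up for illustration.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two samples.

    Uses Welch's form of the t-test, which does not assume equal variances.
    """
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)

    # Squared standard error contributed by each group
    se2_a, se2_b = var_a / n_a, var_b / n_b

    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical page-load times (seconds) for two versions of a site
version_a = [12.1, 11.8, 12.5, 12.0, 11.9]
version_b = [12.9, 13.1, 12.7, 13.0, 12.8]
t_stat, df = welch_t(version_a, version_b)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

Turning the t statistic into a p-value requires the t-distribution with the computed degrees of freedom, which the standard library does not provide; a statistics package is normally used for that step.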

Module 2

Hypothesis Testing (chi-square tests)

Continue with hypothesis testing but introduce chi-square tests for analyzing categorical data. Learn when to use chi-square tests and how to interpret the results.
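
To make the idea concrete, here is a minimal sketch of the chi-square statistic for a table of observed counts, written with plain Python. The function name chi_square_statistic and the counts are hypothetical, and no continuity correction is applied, so results may differ slightly from library routines that apply one by default on 2x2 tables.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    `observed` is a list of rows, each a list of counts of equal length.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if row and column variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (count - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical counts: rows are site versions, columns are (converted, did not convert)
table = [[10, 20],
         [20, 10]]
chi2, df = chi_square_statistic(table)
print(f"chi-square = {chi2:.3f}, df = {df}")
```

The statistic is then compared against the chi-square distribution with the given degrees of freedom to obtain a p-value.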

Module 3

Bayesian Statistics

Contrast frequentist statistics with Bayesian statistics - an approach that models how we form and update beliefs based on evidence.
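
A single application of Bayes' rule illustrates what "updating beliefs based on evidence" means in practice. The sketch below is an assumed toy example, not taken from the module: the function name posterior_probability and all of the probabilities are invented for illustration.

```python
def posterior_probability(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Apply Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)."""
    # Total probability of seeing the evidence, whether or not H is true
    p_evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / p_evidence


# Hypothetical numbers: 1% of users are bots (the prior), a detector flags
# 95% of bots and also wrongly flags 5% of real users.
prior = 0.01
posterior = posterior_probability(prior, 0.95, 0.05)
print(f"P(bot | flagged) = {posterior:.3f}")

# The posterior can serve as the prior for the next piece of evidence,
# here a second, independent flag from the same detector.
posterior_after_two = posterior_probability(posterior, 0.95, 0.05)
print(f"P(bot | flagged twice) = {posterior_after_two:.3f}")
```

Feeding the posterior back in as the next prior is what distinguishes this workflow from a one-off frequentist test: the belief is revised each time new evidence arrives.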

Module 4

Simple Linear Correlation and Regression

Learn to measure linear relationships between quantitative variables, calculate correlation, and model relationships with simple linear regression.
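
As a small preview, the sketch below computes the Pearson correlation coefficient and an ordinary least-squares line by hand, using only the standard library. The function name pearson_and_fit and the data are hypothetical; in practice a statistics or modeling library would usually do this work.

```python
import math


def pearson_and_fit(xs, ys):
    """Return (r, slope, intercept) for paired observations xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squares and cross-products around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    r = sxy / math.sqrt(sxx * syy)        # Pearson correlation coefficient
    slope = sxy / sxx                     # least-squares slope
    intercept = mean_y - slope * mean_x   # line passes through the point of means
    return r, slope, intercept


# Hypothetical data: hours of study vs. quiz score
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]
r, slope, intercept = pearson_and_fit(hours, scores)
print(f"r = {r:.3f}, score = {slope:.2f} * hours + {intercept:.2f}")
```

The correlation r measures the strength and direction of the linear relationship, while the slope and intercept define the fitted line used for prediction.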

Sprint Resources