Data Science Sprint 2: Statistical Tests and Experiments
Welcome to Sprint 2
Welcome to Sprint 2! An important application of statistics is designing and evaluating experiments. In the context of web applications, this usually means an A/B test, in which users experience different versions of a site and we compare their behavior and outcomes.
How do you design a good and valid experiment? When have you run your experiment long enough? How do you evaluate the outcome of an experiment? Finally, how do you balance all of this math and science with the practical business/product concerns you're working with? These are the sorts of questions we'll discuss in this sprint.
Statistics is, in some ways, the most tolerant branch of mathematics. Unlike pure math, statistics accepts situations where the exact and complete are unattainable. We've learned about the basic summary metrics provided by descriptive statistics - mean, median, mode, standard deviation - and how to view them numerically and visually to help tell a story about data.
In this sprint, we'll still use these metrics but go deeper into their meaning and interpretation to perform hypothesis tests that allow us to state conclusions about our data with "confidence". We will also have a chance to build our first predictive models using linear regression.
Modules
This sprint is structured to provide you with a comprehensive understanding of statistical tests and experiments:
Module 1
Hypothesis Testing (t-tests) and Confidence Intervals
Build on descriptive statistics concepts to explore hypothesis testing with t-tests and t-distributions. Learn about p-values, confidence intervals, and the Central Limit Theorem.
Module 2
Hypothesis Testing (chi-square tests)
Continue with hypothesis testing but introduce chi-square tests for analyzing categorical data. Learn when to use chi-square tests and how to interpret the results.
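As a rough preview of the kind of question this module addresses, the sketch below runs a Pearson chi-square test on a 2x2 table of A/B test results, using only the Python standard library. The function name `chi_square_2x2` and the conversion counts are invented for illustration and are not taken from the course materials. The p-value shortcut applies only to one degree of freedom, which is the 2x2 case, and no continuity correction is applied.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). No continuity correction is applied.
    """
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: variant A had 20 conversions and 80 non-conversions,
# variant B had 30 conversions and 70 non-conversions.
stat, p = chi_square_2x2(20, 80, 30, 70)
print(f"chi-square = {stat:.2f}, p-value = {p:.2f}")
```

With these made-up counts the statistic is about 2.67 and the p-value is roughly 0.10, so at the conventional 0.05 threshold the difference between variants would not be called statistically significant.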
Module 3
Bayesian Statistics
Contrast frequentist statistics with Bayesian statistics - an approach that models how we form and update beliefs based on evidence.
Module 4
Simple Linear Correlation and Regression
Learn to measure linear relationships between quantitative variables, calculate correlation, and model relationships with simple linear regression.
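As a small illustration of these ideas, the sketch below computes Pearson's correlation coefficient and a least-squares line by hand, using only the Python standard library. The function name `fit_line` and the data are hypothetical and chosen for illustration. In practice a library routine would normally be used instead.

```python
import math


def fit_line(xs, ys):
    """Return (slope, intercept, r) for a simple least-squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squared deviations and cross-deviations from the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Hypothetical data: hours studied versus test score
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

slope, intercept, r = fit_line(hours, scores)
print(f"score = {intercept:.1f} + {slope:.1f} * hours (r = {r:.3f})")
```

The slope estimates how much the outcome changes per unit of the predictor, while `r` measures how tightly the points follow a straight line, on a scale from -1 to 1.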