← Back to Home

Module 3 - Introduction to Relational Databases

Module Overview

Get started with relational databases, including normalization and associations, to build robust data storage solutions.

Introduction to Relational Databases

A relational database stores an item's data across multiple related tables, unlike key-value databases like DynamoDB that store all information for an item in a single table.

Key RDB Concepts

Common Data Types in Relational Databases

Here's an example of an Employees table in a relational database:

id leave level officeCode officeName dateJoined state
dac041be true 5 SEA24 Fiona 2016-07-23 WA
dac04434 false 6 SEA41 Day 1 2019-12-03 WA
dac045a6 true 4 SEA32 Nessie 2015-09-15 WA

Unlike DynamoDB, adding a new column to an RDB table requires modifying the table schema and backfilling data, making careful upfront design important.

Database Normalization

Normalization is the process of organizing data to minimize redundancy by dividing it into tables with each containing information about only one type of entity.

Why Normalize Data?

Redundant data not only takes up unnecessary space but causes three specific anomalies:

  1. Insertion Anomaly: Can't insert new information until you have all the data for the record
  2. Update Anomaly: Need to update the same information in several records to maintain consistency
  3. Deletion Anomaly: Deleting a record removes more facts than intended, potentially losing important information

Normalization Example

To normalize the employee/office example from the previous section, we would isolate employee data in one table and office data in another:

Employees (Normalized):

id leave level dateJoined
dac041be true 5 2016-07-23
dac04434 false 6 2019-12-03
dac045a6 true 4 2015-09-15

Offices (Normalized):

id name state
SEA24 Fiona WA
SEA41 Day 1 WA
SEA32 Nessie WA

EmployeeOffice (Relationship Table):

employee office
dac041be SEA24
dac04434 SEA41
dac045a6 SEA32

This approach allows adding new employees or offices independently and updates to office information only need to happen in one place.

RDBMS Associations

Associations define relationships between tables in a relational database. The primary mechanism for creating associations is through foreign keys.

Foreign Keys and Referential Integrity

A foreign key is a column (or set of columns) in one table that refers to the primary key in another table. It enforces referential integrity, ensuring that relationships between tables remain consistent.

For example, in the EmployeeOffice table above:

Referential integrity enforces rules like:

Types of Associations

  1. One-to-One: One record in a table is associated with exactly one record in another table
  2. One-to-Many: One record in a table can be associated with multiple records in another table (e.g., one office can have many employees)
  3. Many-to-Many: Multiple records in a table can be associated with multiple records in another table (requires a junction/relationship table)

Basic SQL for Querying Data

-- Select all employees
SELECT * FROM Employees;

-- Select specific columns
SELECT id, level FROM Employees;

-- Filter with conditions
SELECT * FROM Employees WHERE level > 5;

-- Count employees
SELECT COUNT(*) FROM Employees;

-- Join tables
SELECT e.id, e.level, o.name, o.state
FROM Employees e
INNER JOIN EmployeeOffice eo ON e.id = eo.employee
INNER JOIN Offices o ON eo.office = o.id;

Learning Objectives

Resources