Module 1: Intro to DynamoDB

Module Overview

Learn the fundamentals of DynamoDB, including database concepts, partition and sort keys, and data types.

Introduction to Databases

Key Concepts

A database is an organized collection of data, stored and accessed electronically from a computer system. Databases are essential for storing and retrieving data reliably and efficiently. They serve critical roles in applications by:

  • Providing a structured way to store and organize large amounts of data
  • Enabling efficient retrieval of information based on specific attributes
  • Ensuring data consistency and integrity across applications
  • Supporting concurrent access by multiple users

In DynamoDB, data is organized into tables, items (similar to rows in relational databases), and attributes (similar to columns). Each item in a DynamoDB table can be uniquely identified by its primary key.

Consider a practical example: if you organized 30 pairs of shoes in your closet, you would create a system where each shoe has attributes like color, style, and occasion. Similarly, databases organize data with attributes that allow for easy retrieval based on specific criteria.

// Example of data representation in a DynamoDB table
Table: ShoeOrganizer
{
  "shoe_id": "SN01", // Partition key
  "cubby_location": 1,
  "color": "grey",
  "style": "sneaker",
  "occasion": "athletic"
}
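
The table, item, and attribute vocabulary above can be sketched in plain Java. This is a minimal in-memory stand-in, not the AWS SDK: the class name ShoeOrganizerSketch and its helper methods are hypothetical, and a Map keyed by shoe_id plays the role of the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model of the ShoeOrganizer table (not the AWS SDK).
class ShoeOrganizerSketch {

    // Build one item: a set of attribute names mapped to values.
    static Map<String, Object> shoe(String shoeId, int cubbyLocation,
                                    String color, String style, String occasion) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("shoe_id", shoeId);              // primary key attribute
        item.put("cubby_location", cubbyLocation);
        item.put("color", color);
        item.put("style", style);
        item.put("occasion", occasion);
        return item;
    }

    // Store the item under its primary key; an item with the same key is replaced.
    static void put(Map<String, Map<String, Object>> table, Map<String, Object> item) {
        table.put((String) item.get("shoe_id"), item);
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> table = new LinkedHashMap<>();
        put(table, shoe("SN01", 1, "grey", "sneaker", "athletic"));
        // Look the item up by its primary key, then read one attribute.
        System.out.println(table.get("SN01").get("color"));
    }
}
```

Because the map is keyed by shoe_id, storing a second item with the same shoe_id replaces the first, which mirrors the idea that a primary key identifies exactly one item.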

Distributed database systems like DynamoDB offer additional benefits:

  • Ability to store much larger datasets across multiple machines
  • Higher availability with geographically distributed data storage
  • Increased fault tolerance - if one server fails, others can handle requests
  • Better support for concurrent requests through distributing load

Partition and Sort Keys

Primary Keys in DynamoDB

Primary keys in DynamoDB uniquely identify each item in a table. DynamoDB supports two types of primary keys:

  1. Partition Key Only - A simple primary key with a single attribute called the partition key
  2. Composite Primary Key - A primary key made up of two attributes: a partition key and a sort key

The partition key determines the partition where an item is stored. DynamoDB passes the partition key value to an internal hash function, and the result selects the partition, which spreads data across partitions for scalability. For example, in a shoe table, "shoe_id" is a reasonable partition key because every shoe has a distinct value.
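
To make the hashing idea concrete, here is a rough sketch of mapping a partition key to a partition number. DynamoDB's real hash function is internal to the service; this example assumes Java's String.hashCode() as a stand-in, and the class name, method name, and partition count of 4 are all hypothetical.

```java
// Hypothetical illustration of hash-based partitioning (not DynamoDB's actual algorithm).
class PartitionSketch {

    // Map a partition key to a partition number in the range [0, partitionCount).
    // String.hashCode() is an assumed stand-in for the service's internal hash.
    static int partitionFor(String partitionKey, int partitionCount) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(partitionKey.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        for (String shoeId : new String[] {"SN01", "SN02", "SN03"}) {
            System.out.println(shoeId + " -> partition " + partitionFor(shoeId, 4));
        }
    }
}
```

The same key always lands on the same partition, which is what lets a lookup by partition key go straight to the right place instead of searching every partition.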

The sort key orders items that share the same partition key. Items with the same partition key are stored together, sorted by sort key value, which is especially useful for related items that need to be retrieved together.

// Example of a table with composite primary key
Table: MusicLibrary
{
  "artist": "Black Eyed Peas", // Partition key
  "song_title": "I Gotta Feeling", // Sort key
  "genre": "pop",
  "year": 2009
}

{
  "artist": "Black Eyed Peas", // Same partition key
  "song_title": "Pump It", // Different sort key
  "genre": "pop",
  "year": 2005
}

With a composite key, multiple items can share the same partition key (artist), but each must have a different sort key (song_title) from the others with that partition key, so the combination of the two values is unique. This design is ideal when:

  • Items naturally group together (like songs by the same artist)
  • You need to query for ranges of related items
  • You want to organize items within their natural group
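
The grouping and range behavior described above can be approximated in plain Java. This is a sketch under stated assumptions, not the AWS SDK: the class MusicLibrarySketch and its methods are hypothetical, a HashMap stands in for partitions, and a TreeMap keeps each artist's songs ordered by song_title.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of a table with a composite primary key.
class MusicLibrarySketch {

    // partition key (artist) -> items ordered by sort key (song_title)
    private final Map<String, TreeMap<String, Map<String, Object>>> partitions = new HashMap<>();

    // Store an item; the same artist + song_title pair replaces the earlier item.
    void put(String artist, String songTitle, String genre, int year) {
        Map<String, Object> item = new HashMap<>();
        item.put("artist", artist);
        item.put("song_title", songTitle);
        item.put("genre", genre);
        item.put("year", year);
        partitions.computeIfAbsent(artist, k -> new TreeMap<>()).put(songTitle, item);
    }

    // All song titles for one artist, in sort-key order.
    List<String> titlesBy(String artist) {
        return new ArrayList<>(partitions.getOrDefault(artist, new TreeMap<>()).keySet());
    }

    // Song titles for one artist with from <= song_title < to.
    List<String> titlesInRange(String artist, String from, String to) {
        return new ArrayList<>(
                partitions.getOrDefault(artist, new TreeMap<>()).subMap(from, to).keySet());
    }

    public static void main(String[] args) {
        MusicLibrarySketch library = new MusicLibrarySketch();
        library.put("Black Eyed Peas", "Pump It", "pop", 2005);
        library.put("Black Eyed Peas", "I Gotta Feeling", "pop", 2009);
        System.out.println(library.titlesBy("Black Eyed Peas"));
        System.out.println(library.titlesInRange("Black Eyed Peas", "P", "Q"));
    }
}
```

The nested structure is the point of the design: the outer lookup narrows to one artist, and the ordered inner map makes "all songs" and "songs in a title range" cheap to answer.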

DynamoDB Scalar and Set Types

DynamoDB Data Types

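DynamoDB supports several data types for attributes. This module focuses on the most common scalar and set types; DynamoDB also provides Binary and Null scalars, a Binary Set type, and the List and Map document types.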

Scalar Types (Single Values)

  • String (S) - Text or alphanumeric data. Maps to Java's String type.
  • Number (N) - Numeric data. Maps to Java's numeric types like Integer, Long, Double, etc.
  • Boolean (BOOL) - True/false values. Maps to Java's boolean or Boolean types.

Set Types (Multiple Unique Values)

  • String Set (SS) - A collection of unique String values. Maps to Java's Set<String>.
  • Number Set (NS) - A collection of unique Number values. Maps to Java's Set<Number> types.

// Example of different DynamoDB data types
{
  "MemberId": "M123", // String type (S)
  "Active": true, // Boolean type (BOOL)
  "Age": 28, // Number type (N)
  "LastName": "Smith", // String type (S)
  "Committees": ["Finance", "Marketing"], // String Set type (SS)
  "YearsActive": [2018, 2019, 2020] // Number Set type (NS)
}
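
The same member item can be written with ordinary Java values. This is a minimal sketch that uses only the Java standard library, not the AWS SDK; the class name MemberItemSketch is hypothetical. It also shows why the set types hold unique values: adding "Finance" twice leaves a single entry.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical plain-Java representation of the member item above.
class MemberItemSketch {

    static Map<String, Object> memberItem() {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("MemberId", "M123");   // String  -> S
        item.put("Active", true);       // Boolean -> BOOL
        item.put("Age", 28);            // Integer -> N
        item.put("LastName", "Smith");  // String  -> S
        // Set<String> -> SS; the duplicate "Finance" is dropped by the set.
        item.put("Committees", new TreeSet<>(Arrays.asList("Finance", "Marketing", "Finance")));
        // Set<Integer> -> NS
        item.put("YearsActive", new TreeSet<>(Arrays.asList(2018, 2019, 2020)));
        return item;
    }

    public static void main(String[] args) {
        System.out.println(memberItem());
    }
}
```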

When working with Java applications and DynamoDB:

  • Java String should be stored as DynamoDB String (S)
  • Java numeric types (int, Integer, long, double, etc.) should be stored as DynamoDB Number (N)
  • Java boolean/Boolean should be stored as DynamoDB Boolean (BOOL)
  • Java Set<String> should be stored as DynamoDB String Set (SS)
  • Java Set<Integer>, Set<Double>, etc. should be stored as DynamoDB Number Set (NS)

Understanding these mappings is crucial when designing your data model and implementing Java classes that interact with DynamoDB.
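
As a quick way to check the mappings listed above, the following sketch returns the DynamoDB type code for a Java value. The helper typeCode and the class TypeCodeSketch are hypothetical and written only for illustration; they are not part of the AWS SDK, and they cover only the five types discussed in this module.

```java
import java.util.Set;

// Hypothetical helper that applies the Java-to-DynamoDB mappings from this module.
class TypeCodeSketch {

    static String typeCode(Object value) {
        if (value == null) {
            throw new IllegalArgumentException("null is not covered by this sketch");
        }
        if (value instanceof String) return "S";
        if (value instanceof Number) return "N";     // Integer, Long, Double, ...
        if (value instanceof Boolean) return "BOOL";
        if (value instanceof Set) {
            Set<?> set = (Set<?>) value;
            if (set.isEmpty()) {
                // With no elements there is nothing to inspect to choose SS or NS.
                throw new IllegalArgumentException("cannot infer the type of an empty set");
            }
            if (set.stream().allMatch(e -> e instanceof String)) return "SS";
            if (set.stream().allMatch(e -> e instanceof Number)) return "NS";
        }
        throw new IllegalArgumentException("unsupported type: " + value.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(typeCode("Smith"));
        System.out.println(typeCode(28));
        System.out.println(typeCode(true));
        System.out.println(typeCode(Set.of("Finance", "Marketing")));
        System.out.println(typeCode(Set.of(2018, 2019, 2020)));
    }
}
```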

Sprint 1 Intro to DynamoDB

DynamoDB Operations and Best Practices

DynamoDB supports the standard database CRUD operations (Create, Read, Update, Delete):

  • Create - Adding new items to a table
  • Read - Retrieving items from a table by primary key, or by using queries or scans
  • Update - Modifying existing items in a table
  • Delete - Removing items from a table
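
The four operations can be tried out against a simple in-memory table. This is only a sketch: the class CrudSketch and its method names are hypothetical, a HashMap stands in for the table, and the behavior is deliberately simplified rather than a mirror of the real service's semantics.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory table supporting the four CRUD operations.
class CrudSketch {

    // primary key value -> item attributes
    private final Map<String, Map<String, Object>> items = new HashMap<>();

    // Create: add an item under its key (replaces any item with the same key).
    void create(String key, Map<String, Object> attributes) {
        items.put(key, new HashMap<>(attributes));
    }

    // Read: return the item for this key, or null if there is none.
    Map<String, Object> read(String key) {
        return items.get(key);
    }

    // Update: change one attribute of an existing item; false if the item is missing.
    boolean update(String key, String attribute, Object value) {
        Map<String, Object> item = items.get(key);
        if (item == null) {
            return false;
        }
        item.put(attribute, value);
        return true;
    }

    // Delete: remove the item; true if something was removed.
    boolean delete(String key) {
        return items.remove(key) != null;
    }

    public static void main(String[] args) {
        CrudSketch table = new CrudSketch();
        Map<String, Object> shoe = new HashMap<>();
        shoe.put("color", "grey");
        shoe.put("cubby_location", 1);
        table.create("SN01", shoe);
        table.update("SN01", "cubby_location", 2);
        System.out.println(table.read("SN01"));
        System.out.println(table.delete("SN01"));
    }
}
```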

When designing DynamoDB tables, consider these best practices:

  • Choose partition keys with high cardinality (many distinct values) for better distribution
  • Keep items small for better performance and lower costs
  • Use composite primary keys when items naturally group together
  • Design your access patterns first, then build tables to support those patterns
  • Use global secondary indexes (GSIs) for flexible querying beyond the primary key
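
The advice about high-cardinality partition keys can be illustrated with a small experiment. This sketch assumes String.hashCode() as a stand-in for DynamoDB's internal hash function and a hypothetical count of 8 partitions; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical comparison of low- and high-cardinality partition keys.
class CardinalitySketch {

    // Assumed stand-in for the service's internal hash function.
    static int partitionFor(String key, int partitionCount) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Count how many distinct partitions receive at least one of the given keys.
    static int partitionsUsed(List<String> keys, int partitionCount) {
        Set<Integer> used = new HashSet<>();
        for (String key : keys) {
            used.add(partitionFor(key, partitionCount));
        }
        return used.size();
    }

    public static void main(String[] args) {
        // Low cardinality: only two distinct values, however many items there are.
        List<String> byOccasion = Arrays.asList("athletic", "formal", "athletic", "formal");

        // High cardinality: every item has its own key value.
        List<String> byShoeId = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            byShoeId.add("SN0" + i);
        }

        System.out.println("occasion keys use " + partitionsUsed(byOccasion, 8) + " partitions");
        System.out.println("shoe_id keys use " + partitionsUsed(byShoeId, 8) + " partitions");
    }
}
```

A key with only two distinct values can never occupy more than two partitions, no matter how many items are stored, while a key with many distinct values has the chance to spread across all of them.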

Remember that DynamoDB is a NoSQL database designed for high-scale, performance-intensive applications. Its design principles differ from traditional relational databases like MySQL or PostgreSQL.
