Sprint 10 Challenge

About the Sprint Challenge

In this Sprint Challenge you will write code to interact with SQLite databases. You may use any tools and references you wish, but your final code should reflect your work and be saved in .py files (not notebooks).

For all your code, you may only import/use the following:

  • other modules (python files) that you write
  • sqlite3 (from the standard library)
Github Repo

Challenge Details

Part 1 - Making and populating a Database

Consider the following table of data (column headers are in bold):

s x y
'g' 3 9
'v' 5 7
'f' 8 7

Using the standard sqlite3 module:

  • Open a connection to a new (blank) database file called demo_data.sqlite3
  • Make a cursor, and execute an appropriate CREATE TABLE statement to accept the above data (name the table demo)
  • Write and execute appropriate INSERT INTO statements to add the data (as shown above) to the database

Make sure to commit() so your data is saved! The file size should be non-zero.

Then write the following queries (also with sqlite3) to test the demo database and save the results of these queries under the following variables names:

  • row_count: How many rows are in the table?
  • xy_at_least_5: How many rows are there where both x and y are at least 5?
  • unique_y: How many unique values of y are there (hint - COUNT() can accept a keyword DISTINCT)?

Your code (to reproduce all above steps) should be saved in demo_data.py and turned in along with the generated SQLite database (demo_data.sqlite3).

Part 2 - The Northwind Database - Basic Queries

Using sqlite3, connect to the given northwind_small.sqlite3 database.

Northwind ERD

Above is an entity-relationship diagram - a picture summarizing the schema and relationships in the database. Note that it was generated using Microsoft Access, and some of the specific table/field names are different in the provided data.

Warning: unlike the diagram, the tables in SQLite are singular and not plural (do not end in s).

Answer the following questions (each is from a single table) and then save each query under the following variable name:

  • expensive_items: What are the ten most expensive items (per unit price) in the database? Please return all columns in the table, not just the price and name but all columns.
  • avg_hire_age: What is the average age of an employee at the time of their hiring? (Hint: a lot of arithmetic works with dates.)

Your code (to load and query the data) should be saved in northwind.py.

Part 3 - Sailing the Northwind Seas (JOIN statements)

You've answered some basic questions from the Northwind database, looking at individual tables - now it's time to put things together with JOIN!

Using sqlite3 in northwind.py, answer the following:

  • ten_most_expensive: What are the ten most expensive items (per unit price) in the database and their suppliers? Please return only the ProductName, UnitPrice, and CompanyName columns.
  • largest_category: What is the largest category (by number of unique products in it)?

Submission

Submit the following files:

  • demo_data.py
  • northwind.py
  • demo_data.sqlite3
  • northwind_small.sqlite3

You're also encouraged to include the output from your queries as docstring comments, to facilitate grading and feedback.

To submit your Sprint Challenge:

  1. Create a new GitHub repository with all your solution files
  2. Push your changes to GitHub
  3. Submit the URL to your GitHub repository

Deadline: Your Sprint Challenge is due by the end of the sprint.

Grading Rubric

Your Sprint Challenge will be evaluated based on the following criteria:

Database Creation (30%)

  • Correct implementation of the database creation
  • Proper insertion of data
  • Accurate queries on the demo database

Basic SQL Queries (35%)

  • Correctness of queries for the Northwind database
  • Proper variable assignments
  • Efficient query writing

JOIN Statements (35%)

  • Correct implementation of JOIN queries
  • Proper handling of table relationships
  • Correct results for complex queries

Challenge Resources

Remember to review your notes from all four modules and the code-alongs. The challenge is designed to test your comprehensive understanding of SQL databases.