CS133 Lab 5: Designing a Relational Database

Assigned: Thursday, November 21
Due: Wednesday, December 11 11:59 PM PT

Lab Description

In this lab, you will be designing a relational database schema for modeling a real-world application: a carpool (ride-sharing) application. You will be responsible for designing the schema to capture the constraints of the application, discussed in more detail below. You will also be tasked with writing several SQL queries to support required functionality for this carpool application. As part of this lab, you will need to be sure that your schema and queries operate correctly using the RDBMS SQLite.

1. Getting started

1.1 Description of the Carpool Application

You will be modeling a carpooling application: a group of people participate in a carpool program, sharing rides in their cars driving to and from work. By carpooling, they can take advantage of being able to drive in the carpool lane on the highway (at least two passengers are required for a car to drive in that lane). Each day, a subset of the people drive, while the other people are passengers in their cars. In particular, the application has the following semantics:

The driving schedule for one day might look something like this:

			November 20, 2019:
			6:45am -- Driver: Alice (in Toyota Prius), Passengers: Bob, Carl, Denise
			8:00am -- Driver: Ellen (in Chevy Volt), Passengers: Francis, Greg
			
			5:40pm -- Driver: Ellen (in Chevy Volt), Passengers: Bob, Francis
			6:30pm -- Driver: Alice (in Toyota Prius), Passengers: Carl, Denise, Greg
		

1.2 Using SQLite

In this lab, you will use SQLite 3 to run queries. SQLlite is a lightweight database system that supports a reasonably large subset of SQL; a SQLite database is maintained in a single file. SQLite is available on knuth, but can also be installed on your own computer if desired (and might already be installed, see resources). After you log into knuth, you can issue the following command to use SQLite:

		sqlite3 myLab5.db
	
This will create a new database with the filename myLab5.db if the database does not already exist, and you will now see the sqlite> prompt. If you have previously created a database with this filename, this command will load that database.

We discussed getting started with SQLite in class, you may wish to refer back to the notes from Lectures 1 and 21. Feel free to also consult the example files for the Sailors and College databases.

SQLite.org has a plethora of information about syntax, data types, etc. You may find the following pages particularly useful:

2. Designing the Carpool Application

2.1 Creating a Relational Schema

Design a relational schema to model the semantics of the carpool application. You should consider which relations, attributes, and primary and foreign keys are appropriate. You may start by modeling the application with an Entity/Relationship diagram, but that is not required. You do not need to create indexes. You may wish to skim Exercises 4-6 to see what SQL queries you will eventually have to write, and note that Exercise 7 asks you to create sample data.

Exercise 1: Create a file called schema.sql that contains the CREATE TABLE statements for each of your relations. Be sure to include primary/foreign keys and foreign key constraints (as well as other constraints as appropriate). You can decide on data types for each attribute, but be sure to check the SQLite documentation to see which types it supports.

Tip: You can use the sqlite command .read schema.sql at the sqlite prompt to run the SQL commands in schema.sql, which will create your relations in your current sqlite database.

Now it is time to reflect on your relational schema, possibly refining it if necessary. You should think about what functional dependencies (FDs) make sense for this application, how those FDs may or not indicate data redundancy in your schema, and what integrity constraints may exist but are not currently captured by the constraints in your schema.

Exercise 2: Answer the following questions in a document to be included with your lab writeup:
  1. What functional dependencies make sense for this application?
  2. For each of your relations, indicate whether they conform to Boyce-Codd-Normal-Form (BCNF)
  3. Are there real-life constraints in the carpool application that are not captured in your schema? If so, describe them.
Exercise 3: Refine your relational schema from Exercise 1, if necessary. If you make changes, be sure to include in your writeup which are the old and new versions. If you decide not to normalize a relation that is not in BCNF, explain your decision.

2.2 Writing SQL Queries

2.2.1 Queries for updating and validating a driving schedule

Every weekend, one of the carpool participants (called the "carpool czar") sets a driving schedule for each day of the upcoming week. This is a complex process that includes balancing each person's preferences regarding the times they arrive at work and leave each day. While you will not need to devise an algorithm to construct driving schedules, you will write queries to help the czar validate that they did not make an error when creating the week's schedules.

For Exercises 4-6, write an SQL query, or sequence of queries, that will accomplish the following tasks. Where appropriate, denote if a sequence of queries should be run as a single transaction, using BEGIN TRANSACTION; and END TRANSACTION;.

Exercise 4: Write an SQL query or queries to support each of the following features. Write your queries in a file called queries.sql. For each, describe briefly in your writeup how each query would be used to make the checks described below.
  1. Check that the set of drivers driving to work on November 20th, 2019 is the same set of drivers driving home
  2. Check that all passengers riding to work on November 20th are also going home.
  3. For the assignment of passengers to cars on November 20th, check that the number of passengers assigned to a car does not exceed the car's maximum capacity, but has at least two people in it (including the driver)

Note: for these queries, it is not sufficient to give the user two sets of query results and ask him/her to visually compare the two sets.

Of course, people sometimes have issues that arise, and they need to amend the established schedule for the week. They may realize they need to stay at work later in the day than originally planned, or need access to their own car during the day.

Exercise 5: Write an SQL query or queries to support each of the following features. Include them in your queries.sql file. For both queries, you can assume you already know Bob's unique id.

  1. Assume Bob is not a driver on November 20th, and wants to go to work later than he is currently scheduled. Show which cars Bob could switch into for his trip to work, keeping in mind vehicle capacity limits. You do not need to check the capacity of the car he is switching out of.
  2. Assume Bob is not a driver on November 20th. Remove Bob as a passenger from whichever car(s) he is riding in for both trips that day, and make him a driver for both trips, using his own car (at the same times as he was originally scheduled). Note that this means he will be driving alone for both trips.

Possibly helpful: Deferring foreign key checking (see comment in Section 1.2) and creating temporary tables in SQlite (see example here).

2.2.2 Queries for reading and analyzing driving schedules

Of course, people often want to read their schedule for the day so that they know when they are going to work. Also, it is sometimes necessary to investigate the fairness of the driving schedules over time. The following two features support these functionalities.

Exercise 6: Write an SQL query or queries to support each of the following features. Include them in your queries.sql file. As before, you can assume you already know Bob's unique id.
  1. Show Bob's schedule for November 20th: display the driver and/or car and time for both trips of the day.
  2. Calculate a count of the number of days each person has been a driver.

2.3 Creating Sample Data

You should create some sample data to sanity-check your SQL queries. You can import data into your database in SQLite using a series of INSERT statements, or you may prefer to use the SQLite import command (see https://www.sqlite.org/cli.html#csv for more details).

You should include this sample data, as well as the output of your queries from Ex. 4-6, in your submission. Some helpful commands for generating output files:

You can specify in the SQLite console that output should be written to a specified file using the command .output. For example, to write the result of a query to the file query1output.sql, you could do the following:

	sqlite> .output query1output.sql
	sqlite> SELECT * FROM MyCoolRelation;
Note: This will overwrite any contents of query1output.sql if the file exists already.

You can also "dump" all the data from your SQLite database into a file that you can submit using the following commands, where myDbContents.sql is your desired output file:
	sqlite> .output myDbContents.sql
	sqlite> .dump

You can reset the output to print to the screen again:
	sqlite> .output stdout

Exercise 7: Create sample data to ensure your queries run correctly using SQLite. Be sure to include this data in your submission.

3. Logistics

You must submit your schema and queries (see below) as well as a short writeup. This writeup should include:

3.1. Collaboration

This lab can be completed alone or with a partner. Please indicate clearly who you worked with, if anyone, on your individual writeup.

3.2. Submitting your assignment

The files you need to submit are:

Submit your files under the "Lab 5" assignment on Gradescope. Please do not zip or tar the files.

3.3 Evaluation

We will be running your SQL queries using your relational schema in SQLite, so be sure that they are syntactically correct. You will be evaluated on how well your design captures the semantics and constraints of the real-world application, as well as the correctness of your queries.

.