{ "cells": [ { "cell_type": "markdown", "id": "29331406-bdbf-4ac8-b1a1-e71de069fb05", "metadata": { "tags": [] }, "source": [ "# SQL exercise – Compound and nested SQL queries\n", "\n", "The questions in this exercise are about the syntax of SQL queries.\n", "\n", "We will use:\n", "* the QSpatiaLite stork database that we used earlier;\n", "* SQL to write queries on this database;\n", "* the programming language Python to query this database (execute the queries).\n", "\n", "
\n", "
\n", " How to use this notebook?
\n", " This notebook consists of text cells and code cells (Python). The code cells must be executed in order to see the output of the program.
To execute a cell, select it and then simply click on the \"play\" button () in the toolbar at the top of the notebook, or type shift + enter.
It is important to run the code cells in the same order as they appear in the notebook.\n", "
\n" ] }, { "cell_type": "markdown", "id": "cce1c593-9fe7-4b64-9ce0-b3e589c29488", "metadata": {}, "source": [ "---\n", "---" ] }, { "cell_type": "markdown", "id": "ad6d3b11-2eb8-41c7-a8f9-32aad5d552a5", "metadata": { "tags": [] }, "source": [ "# Accessing the database with Python\n", "\n", "In this first part, we show you with an example how to write and execute your queries on the stork database with Python." ] }, { "cell_type": "markdown", "id": "f40617a7-951c-4ff6-b3d4-f19e710e8dee", "metadata": { "tags": [] }, "source": [ "## Importing Python libraries\n", "To manipulate a database with Python we need the `sqlite3` and `pandas` libraries:" ] }, { "cell_type": "code", "execution_count": null, "id": "1728d534-3f64-4cb2-860e-7fd6247041e8", "metadata": {}, "outputs": [], "source": [ "# Importing libraries\n", "import sqlite3\n", "import pandas" ] }, { "cell_type": "markdown", "id": "39ffe37e-403e-4a19-b228-669d245a9c0f", "metadata": {}, "source": [ "## Connection to the database \n", "First, a connection to the database is established:" ] }, { "cell_type": "code", "execution_count": null, "id": "fdfc5079-cc8d-4f4c-b3a1-6c8d1ac85ad6", "metadata": {}, "outputs": [], "source": [ "# Establish the connection to the database:\n", "con = sqlite3.connect(\"stork-mooc.sqlite\")" ] }, { "cell_type": "markdown", "id": "d2ed37e5-7ee7-4799-bcdb-2234589257e1", "metadata": {}, "source": [ "## Executing SQL queries\n", "\n", "To execute a SQL query on our database with Python, we use the `read_sql_query` function from the `pandas` library which :\n", "* executes the query\n", "* returns the result table\n", "To display this table, you just have to do a `display`.\n", "\n", "Here is an example, which executes the query `select * from migration` and displays the result:" ] }, { "cell_type": "code", "execution_count": null, "id": "e256f9b1-2eee-47a5-b506-0c4ef035520c", "metadata": {}, "outputs": [], "source": [ "# Give the query to execute\n", "query = \"select * from migration\"\n", "\n", "# Execute the query and get the result table\n", "resulttable = pandas.read_sql_query(query, con)\n", "\n", "# Display the result table\n", "display(resulttable)" ] }, { "cell_type": "markdown", "id": "dc751f1e-ef59-40dc-9a6f-eec4665c9c5e", "metadata": {}, "source": [ "
\n", "\n", "**Question** : modify the above query to retrieve the contents of the `stork` table." ] }, { "cell_type": "markdown", "id": "a48f4694-1c58-43cb-9245-99c0af063f29", "metadata": { "tags": [] }, "source": [ "**Solution** : Click on the three small dots \"...\" below." ] }, { "cell_type": "markdown", "id": "b663bfc9-e3cb-4195-95c0-b34a4a8b1079", "metadata": { "jupyter": { "source_hidden": true }, "tags": [] }, "source": [ "
\n", "\n", "Here is the correct query:\n", " \n", "`select * from stork`\n", " \n", "
\n" ] }, { "cell_type": "markdown", "id": "c0ebf2e7-09e9-4290-9b6c-83a6f8c1502b", "metadata": {}, "source": [ "---\n", "---" ] }, { "cell_type": "markdown", "id": "cbd82ad7-2d85-4287-a5b1-413e2bf1acde", "metadata": { "tags": [] }, "source": [ "# Exercises with SQL\n", "\n", "For each of the following questions, **give the SQL syntax for the requested queries** and **test your queries.** \n", "\n", "As a reminder, here is the UML diagram of our stork database. \n", "\n", "\"UML" ] }, { "cell_type": "markdown", "id": "3811c85a-dc52-4259-b538-16d198ad120b", "metadata": { "tags": [] }, "source": [ "---\n", "\n", "## Question 1\n", "**Question** : Return the names of all storks who migrated in both 2001 and 2003 (use the UNION, EXCEPT or INTERSECT operators).\n", "\n", "Here is how to test your query: \n", "1. Edit the query in the cell below and then execute the cell to see the result. \n", "You can edit and re-execute the cell as many times as necessary. \n", "2. Once you are satisfied with your query, you can execute the validation cell below to check if your query is correct or not. \n", "3. Finally, you can compare your query with the solution." ] }, { "cell_type": "code", "execution_count": null, "id": "da5ceda4-2489-4d20-b128-55343cb6dcf3", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Give the query to execute\n", "query = \"select * from migration\" # TODO: MODIFY THE QUERY HERE\n", "\n", "# Execute the query and get the result table\n", "resulttable = pandas.read_sql_query(query, con)\n", "\n", "# Display the result table\n", "display(resulttable)" ] }, { "cell_type": "markdown", "id": "caffe1a6-0617-4c3d-838c-e48a4342642d", "metadata": {}, "source": [ "**Validation** : Execute the following cell to see if your query is correct or not. \n", "Read carefully the error messages to correct your query." ] }, { "cell_type": "code", "execution_count": null, "id": "839e19f7-bdc8-494d-9fae-b41e3035e4a1", "metadata": { "tags": [] }, "outputs": [], "source": [ "from SQL_fusion import *\n", "question1_validation(query)" ] }, { "cell_type": "markdown", "id": "b4efcf80-e101-47b5-9584-f2c8af299a8e", "metadata": { "tags": [] }, "source": [ "**Solution** : Click on the three small dots \"...\" below." ] }, { "cell_type": "markdown", "id": "3248b00b-42fc-4043-849d-ce206ef3d94e", "metadata": { "tags": [] }, "source": [ "
\n", "\n", "Here is the correct query:\n", " \n", "`select name from migration where departureyear=2001 intersect select name from migration where departureyear=2003`\n", " \n", "
\n" ] }, { "cell_type": "markdown", "id": "5ded8691-5cc7-4ffd-8e7a-8b9e68ec9b28", "metadata": {}, "source": [ "---\n", "\n", "## Question 2\n", "\n", "**Question** : Return the names of all storks who migrated in 2001, but not in 2003 (use the UNION, EXCEPT or INSTERSECT operators).\n" ] }, { "cell_type": "code", "execution_count": null, "id": "8e7fa944-0dd6-410c-8b98-3ce7c2296028", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Give the query to execute\n", "query = \"select * from migration\" # TODO: MODIFY THE QUERY HERE\n", "\n", "# Execute the query and get the result table\n", "resulttable = pandas.read_sql_query(query, con)\n", "\n", "# Display the result table\n", "display(resulttable)" ] }, { "cell_type": "markdown", "id": "a51feebe-bbfb-49ec-8217-61f08395d75f", "metadata": {}, "source": [ "**Validation** : Execute the following cell to see if your query is correct or not. \n", "Read carefully the error messages to correct your query." ] }, { "cell_type": "code", "execution_count": null, "id": "b60cc09c-cbe5-4055-bdbf-dff6ef205faa", "metadata": { "tags": [] }, "outputs": [], "source": [ "from SQL_fusion import *\n", "question2_validation(query)" ] }, { "cell_type": "markdown", "id": "b005725d-094e-4c88-a066-c6bb561240b0", "metadata": { "tags": [] }, "source": [ "**Solution** : Click on the three small dots \"...\" below." ] }, { "cell_type": "markdown", "id": "74758991-7c8a-49be-8c23-d99f3505b03b", "metadata": { "jupyter": { "source_hidden": true }, "tags": [] }, "source": [ "
\n", "\n", "Here is the correct query:\n", " \n", "`select name from migration where departureyear=2001 except select name from migration where departureyear=2003`\n", " \n", "
\n" ] }, { "cell_type": "markdown", "id": "baf8ea9d-52c7-4f09-b2da-f7a2350edb84", "metadata": {}, "source": [ "---\n", "\n", "## Question 3\n", "\n", "**Question** : Return the names of all storks who migrated in both 2001 and 2003 (use the IN clause to create a nested query). " ] }, { "cell_type": "code", "execution_count": null, "id": "effa9adc-db64-40e0-9fb7-d77c068f5b3a", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Give the query to execute\n", "query = \"select * from migration\" # TODO: MODIFY THE QUERY HERE\n", "\n", "# Execute the query and get the result table\n", "resulttable = pandas.read_sql_query(query, con)\n", "\n", "# Display the result table\n", "display(resulttable)" ] }, { "cell_type": "markdown", "id": "f5fbdd80-6dc0-4748-8b80-5aa97367948e", "metadata": {}, "source": [ "**Validation** : Execute the following cell to see if your query is correct or not. \n", "Read carefully the error messages to correct your query." ] }, { "cell_type": "code", "execution_count": null, "id": "edd89993-4524-4260-ae45-c03dcef23817", "metadata": { "tags": [] }, "outputs": [], "source": [ "from SQL_fusion import *\n", "question3_validation(query)" ] }, { "cell_type": "markdown", "id": "9780abdf-f469-4682-b97d-dfae186fc85c", "metadata": { "tags": [] }, "source": [ "**Solution** : Click on the three small dots \"...\" below." ] }, { "cell_type": "markdown", "id": "e4b270be-9464-4b53-8dc9-ae0b6d45f30f", "metadata": { "jupyter": { "source_hidden": true }, "tags": [] }, "source": [ "
\n", "\n", "Here is the correct query:\n", " \n", "`select name from migration where departureyear=2001 and name in (select name from migration where departureyear=2003)`\n", " \n", "
\n" ] }, { "cell_type": "markdown", "id": "2181501d-9609-45db-add3-11fef5627712", "metadata": { "tags": [] }, "source": [ "---\n", "\n", "## Question 4\n", "\n", "**Question** : Select the names of all storks who migrated prior to the birth of the youngest stork (use the Migration table in the primary query and the Stork table in a subquery in the WHERE clause)." ] }, { "cell_type": "code", "execution_count": null, "id": "3356bb0c-b1a7-43cd-9b3d-b67864694c2d", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Give the query to execute\n", "query = \"select * from migration\" # TODO: MODIFY THE QUERY HERE\n", "\n", "# Execute the query and get the result table\n", "resulttable = pandas.read_sql_query(query, con)\n", "\n", "# Display the result table\n", "display(resulttable)" ] }, { "cell_type": "markdown", "id": "cc4f7793-0cef-42c5-9b23-154ee16fa467", "metadata": {}, "source": [ "**Validation** : Execute the following cell to see if your query is correct or not. \n", "Read carefully the error messages to correct your query." ] }, { "cell_type": "code", "execution_count": null, "id": "6f6fa35d-2308-4d9f-91ad-3b849f122479", "metadata": { "tags": [] }, "outputs": [], "source": [ "from SQL_fusion import *\n", "question4_validation(query)" ] }, { "cell_type": "markdown", "id": "09eb69f0-908e-4e40-bcad-9853fca49826", "metadata": { "tags": [] }, "source": [ "**Solution** : Click on the three small dots \"...\" below." ] }, { "cell_type": "markdown", "id": "4b0e5608-1b34-47ca-9e4e-eac6f9fb944b", "metadata": { "jupyter": { "source_hidden": true }, "tags": [] }, "source": [ "
\n", "\n", "Here is the correct query:\n", " \n", "`select distinct name from Migration where departureyear<(select max(yearofbirth) from Stork)`\n", " \n", "
\n" ] }, { "cell_type": "markdown", "id": "43863949-1e0d-4dee-9f08-f560700677a5", "metadata": { "tags": [] }, "source": [ "---\n", "\n", "## Question 5\n", "\n", "**Question** :Select the names of all storks who migrated prior to the birth of the youngest stork (use the Migration table in the primary query and the Stork table in a subquery in the FROM clause)." ] }, { "cell_type": "code", "execution_count": null, "id": "d6bb47c7-72ca-4264-a797-aab5e2f0b053", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Give the query to execute\n", "query = \"select * from migration\" # TODO: MODIFY THE QUERY HERE\n", "\n", "# Execute the query and get the result table\n", "resulttable = pandas.read_sql_query(query, con)\n", "\n", "# Display the result table\n", "display(resulttable)" ] }, { "cell_type": "markdown", "id": "b5307431-89a3-4010-96ea-6dab15923bf1", "metadata": {}, "source": [ "**Validation** : Execute the following cell to see if your query is correct or not. \n", "Read carefully the error messages to correct your query." ] }, { "cell_type": "code", "execution_count": null, "id": "4aaf1326-f94a-4527-b334-266e5989238f", "metadata": { "tags": [] }, "outputs": [], "source": [ "from SQL_fusion import *\n", "question5_validation(query)" ] }, { "cell_type": "markdown", "id": "6f147f9e-647b-40f7-842a-b54ac6866ea7", "metadata": { "tags": [] }, "source": [ "**Solution** : Click on the three small dots \"...\" below." ] }, { "cell_type": "markdown", "id": "e3030f35-7b38-4be0-852b-b1dff7635117", "metadata": { "jupyter": { "source_hidden": true }, "tags": [] }, "source": [ "
\n", "\n", "Here is the correct query:\n", " \n", "`select distinct name from Migration, (select max(yearofbirth) as max from Stork) where departureyear\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 5 }