283 changes: 283 additions & 0 deletions Solved _lab.ipynb
@@ -0,0 +1,283 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "ed411b38",
"metadata": {},
"source": [
"## LAB Connecting Python to SQL\n",
"### Name: Bryan Calderon"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "dbb2bb91",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import pymysql\n",
"from sqlalchemy import create_engine\n",
"import getpass # To get the password without showing the input\n",
"password = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "3a18cbf9",
"metadata": {},
"source": [
"1. Establish a connection between Python and the Sakila database."
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "cfcb3c2f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Engine(mysql+pymysql://root:***@localhost/sakila)"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
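],
"source": [
"from sqlalchemy.engine import URL\n",
"\n",
"bd = \"sakila\"\n",
"# URL.create escapes special characters (such as '@' or '/') in the password\n",
"connection_string = URL.create(\"mysql+pymysql\", username=\"root\", password=password, host=\"localhost\", database=bd)\n",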
"engine = create_engine(connection_string)\n",
"engine"
]
},
{
"cell_type": "markdown",
"id": "9c26b19c",
"metadata": {},
"source": [
"2. Write a Python function called rentals_month that retrieves rental data for a given month and year (passed as parameters) from the Sakila database as a Pandas DataFrame."
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "c4101041",
"metadata": {},
"outputs": [],
"source": [
"from sqlalchemy import text\n",
"\n",
"def rentals_month(engine, month, year):\n",
"    # Bind month and year as parameters instead of formatting them into the SQL string\n",
"    query = text(\"\"\"\n",
"        SELECT * FROM rental\n",
"        WHERE MONTH(rental_date) = :month\n",
"          AND YEAR(rental_date) = :year\n",
"    \"\"\")\n",
"\n",
"    df = pd.read_sql(query, engine, params={\"month\": month, \"year\": year})\n",
"    return df"
]
},
{
"cell_type": "markdown",
"id": "5cd94a97",
"metadata": {},
"source": [
"3. Develop a Python function called rental_count_month that takes the DataFrame provided by rentals_month as input along with the month and year and returns a new DataFrame containing the number of rentals made by each customer_id during the selected month and year."
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "a6ed86cf",
"metadata": {},
"outputs": [],
"source": [
"def rental_count_month(df, month, year):\n",
"    column_name = f\"rentals_{month:02d}_{year}\"\n",
"\n",
"    result = (\n",
"        df.groupby(\"customer_id\")\n",
"        .size()\n",
"        .reset_index(name=column_name)\n",
"    )\n",
"\n",
"    return result"
]
},
{
"cell_type": "markdown",
"id": "584ddf80",
"metadata": {},
"source": [
"4. Create a Python function called compare_rentals that takes two DataFrames as input containing the number of rentals made by each customer in different months and years. The function should return a combined DataFrame with a new 'difference' column, which is the difference between the number of rentals in the two months."
]
},
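{
"cell_type": "code",
"execution_count": 51,
"id": "ffad14b1",
"metadata": {},
"outputs": [],
"source": [
"def compare_rentals(df1, df2):\n",
"    # Outer merge keeps customers who rented in only one period; their missing count becomes 0\n",
"    merged = pd.merge(df1, df2, on=\"customer_id\", how=\"outer\").fillna(0)\n",
"\n",
"    # The second column of each input holds that period's rental count\n",
"    col1 = df1.columns[1]\n",
"    col2 = df2.columns[1]\n",
"\n",
"    # difference = rentals in the second period minus rentals in the first\n",
"    merged[\"difference\"] = merged[col2] - merged[col1]\n",
"\n",
"    return merged"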
]
},
{
"cell_type": "markdown",
"id": "9cda3071",
"metadata": {},
"source": [
"### EXAMPLE"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "ed39f866",
"metadata": {},
"outputs": [],
"source": [
"# Rentals\n",
"may_rentals = rentals_month(engine, 5, 2005)\n",
"june_rentals = rentals_month(engine, 6, 2005)\n",
"# Counts\n",
"may_counts = rental_count_month(may_rentals, 5, 2005)\n",
"june_counts = rental_count_month(june_rentals, 6, 2005)"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "213c59c8",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>customer_id</th>\n",
" <th>rentals_05_2005</th>\n",
" <th>rentals_06_2005</th>\n",
" <th>difference</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>2.0</td>\n",
" <td>7.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>1.0</td>\n",
" <td>1.0</td>\n",
" <td>0.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>2.0</td>\n",
" <td>4.0</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>0.0</td>\n",
" <td>6.0</td>\n",
" <td>6.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>3.0</td>\n",
" <td>5.0</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"   customer_id  rentals_05_2005  rentals_06_2005  difference\n",
"0            1              2.0              7.0         5.0\n",
"1            2              1.0              1.0         0.0\n",
"2            3              2.0              4.0         2.0\n",
"3            4              0.0              6.0         6.0\n",
"4            5              3.0              5.0         2.0"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Comparison\n",
"compar = compare_rentals(may_counts, june_counts)\n",
"compar.head()"
]
},
{
"cell_type": "markdown",
"id": "0ab886e3",
"metadata": {},
"source": [
"### That is all :)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}