A dive into using Pandas to analyze data, this project helps a fictional school district make strategic decisions regarding future school budgets and priorities by analyzing a district-wide data set.
The first aspect of the project analyzes district-wide standardized test results, aggregating data on student math and reading scores and school demographics to highlight obvious trends in school performance.
The final report includes:
Using Pandas, a high level snapshot (in table form) of the district's key metrics is created, including:
- Total Schools
- Total Students
- Total Budget
- Average Math Score
- Average Reading Score
- % Passing Math (The percentage of students that passed math.)
- % Passing Reading (The percentage of students that passed reading.)
- % Overall Passing (The percentage of students that passed math and reading.)
Using Pandas, an overview table is created that summarizes key metrics about each school, including:
- School Name
- School Type
- Total Students
- Total School Budget
- Per Student Budget
- Average Math Score
- Average Reading Score
- % Passing Math (The percentage of students that passed math.)
- % Passing Reading (The percentage of students that passed reading.)
- % Overall Passing (The percentage of students that passed math and reading.)
Pandas is used to create a table that highlights the top 5 performing schools based on % Overall Passing. The table includes:
- School Name
- School Type
- Total Students
- Total School Budget
- Per Student Budget
- Average Math Score
- Average Reading Score
- % Passing Math (The percentage of students that passed math.)
- % Passing Reading (The percentage of students that passed reading.)
- % Overall Passing (The percentage of students that passed math and reading.)
Pandas is used to create a table that highlights the bottom 5 performing schools based on % Overall Passing. All of the same metrics as listed above are included.
Pandas is used to create a table that lists the average Math Score for students of each grade level (9th, 10th, 11th, 12th) at each school.
Pandas is used to create a table that lists the average Reading Score for students of each grade level (9th, 10th, 11th, 12th) at each school.
Pandas is used to create a table that breaks down school performances based on average Spending Ranges (Per Student). Bins are created to group school spending. The following is included in the table:
- Average Math Score
- Average Reading Score
- % Passing Math (The percentage of students that passed math.)
- % Passing Reading (The percentage of students that passed reading.)
- % Overall Passing (The percentage of students that passed math and reading.)
The above breakdown is repeated, but this time schooles are grouped based on a reasonable approximation of school size (Small, Medium, Large).
The above breakdown is repeated, but this time schools are grouped based on school type (Charter vs. District).
This assignment is from the University of Denver's Data Analytics Boot Camp.
Trilogy Education Services © 2020. All Rights Reserved.