This project explores the top paying jobs and skills for data-related roles, with a particular focus on Data Analyst positions. The analysis includes remote roles and jobs based in London, UK. The goal is to identify trends, such as high-demand skills, optimal skill sets for maximising salaries, and the companies offering the highest-paying roles in the industry.
Data-related roles are among the most in-demand and well-compensated positions across various industries. This project leverages SQL queries to analyse data on salaries, job demand, and required skills to uncover valuable insights for job seekers and industry professionals. The analysis includes:
- Top Paying Jobs: Exploring roles offering the highest average salaries.
- Top Paying Jobs, Skills: Highlighting what skills are required for the top paying roles.
- Most In-Demand Skills: Identifying skills frequently required in job postings.
- Top Paying Skills: Identifying the average salary per skill.
- Optimal Skills: Analysing the best combination of demand and salary potential.
The dataset includes detailed job postings and skill information, allowing for an in-depth exploration of trends.
- What are the top paying data analyst jobs?
- What skills are required for the top paying data analyst jobs in London, UK or Remote?
- What are the most in-demand skills for data analysts?
- What is the average salary per skill and what are the top skills based on salary?
- What are the most optimal skills to learn?
- SQL: To query and analyze the dataset, including aggregating salaries, counting skill occurrences, and performing advanced filtering for specific job types.
- pgAdmin: For database management.
- Visual Studio Code: For database management and running SQL queries
1_top_paying_jobs.sql: SQL script for identifying the highest-paying jobs in the dataset, in London or remote.
SELECT
job_id,
job_title,
job_location,
job_schedule_type,
salary_year_avg,
job_posted_date,
name AS company_name
FROM job_postings_fact
LEFT JOIN company_dim ON job_postings_fact.company_id = company_dim.company_id
WHERE job_title_short = 'Data Analyst'
AND (job_location = 'London, UK' OR job_work_from_home IS TRUE)
AND salary_year_avg IS NOT NULL
ORDER BY salary_year_avg DESC
LIMIT 10The breakdown of the data
- Wide Salary Range: The top 10 positions range from $184,000 to $650,000.
- Remote and global positions fill the top 10 spots: Remote positions from global companies. -- Job titles vary: Job titles range from Director of Analytics to Data Analyst.
2_top_paying_skills.sql: SQL script analysing the skills associated with the highest salaries.
WITH top_paying_jobs AS (
SELECT
job_id,
job_title,
salary_year_avg,
job_work_from_home,
job_location,
name AS company_name
FROM job_postings_fact
LEFT JOIN company_dim ON job_postings_fact.company_id = company_dim.company_id
WHERE job_title_short = 'Data Analyst'
AND (job_location IN ('London, UK')
OR job_work_from_home IS TRUE )
AND salary_year_avg IS NOT NULL
ORDER BY salary_year_avg DESC
LIMIT 10
)
SELECT top_paying_jobs.*,
skills
FROM top_paying_jobs
-- Inner join two tables, skills_job_dim and skills_dim to find the skill_id and
-- subsequent skills
INNER JOIN skills_job_dim ON top_paying_jobs.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
ORDER BY salary_year_avg DESC Breakdown of the data:
Breakdown of the data:
Top Skills Identified: SQL, Python, and R are the most lucrative skills. Top Companies: AT&T, Pinterest, UCLA Healthcare Careers, and SmartAsset are associated with these salaries. Wide Salary Range: Salaries span from $184,000 to $255,829.50, indicating a significant gap between lower and higher-paying roles.
3_most_indemand_skills.sql: SQL script identifying the most frequently required skills across job postings.
SELECT
skills,
COUNT (skills_job_dim.job_id) AS demand_count
FROM job_postings_fact
INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
WHERE job_title_short = 'Data Analyst'
AND job_work_from_home IS TRUE
-- AND job_location = 'London, UK'
GROUP BY skills
ORDER BY demand_count DESC
LIMIT 5Breakdown of the data:
Top Skills (Both London and Remote): SQL Python Excel Tableau Power BI Skills for Remote Jobs: Identical to those in London, suggesting a strong overlap between local and global demand for these capabilities.
Bar graph showing the top 10 skills in demand for Data Analysts in 2023; Graph created from SQL query results
4_top_paying_skills.sql: SQL script providing a deeper analysis of the highest-paying skills.
SELECT
skills,
ROUND (AVG (salary_year_avg), 0) AS avg_salary
FROM job_postings_fact
INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
WHERE job_title_short = 'Data Analyst'
AND salary_year_avg IS NOT NULL
GROUP BY skills
ORDER BY avg_salary DESC
LIMIT 25Breakdown of the data:
Skills Ranked by Salary: SVN, Solidity, Couchbase, DataRobot, GoLand, MXNet, dplyr, VMware, Terraform.
Observations: These skills represent niche tools and frameworks, many associated with specialized domains such as blockchain (Solidity), cloud infrastructure (Terraform, VMware), and machine learning (MXNet).
5_optimal_skills.sql: SQL script combining skill demand and salary analysis to highlight the most optimal skills for a Data Analyst role.
SELECT
skills_dim.skill_id,
skills_dim.skills,
COUNT (skills_job_dim.job_id) AS demand_count,
ROUND (AVG(job_postings_fact.salary_year_avg),0) AS avg_salary
FROM job_postings_fact
INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
WHERE
job_title_short = 'Data Analyst'
AND salary_year_avg IS NOT NULL
AND job_work_from_home IS TRUE
GROUP BY
skills_dim.skill_id
HAVING
COUNT (skills_job_dim.job_id) > 10
ORDER BY
avg_salary DESC,
demand_count DESC
LIMIT 25;Breakdown of the data:
Top Optimal Skills (Ordered by Salary & Demand): Go Confluence Hadoop Snowflake Azure BigQuery AWS Java Observations on Tools: Common Ground: These tools are all critical in handling large-scale data processing and cloud-based solutions.
Use Cases: Tools like Hadoop and BigQuery are essential for managing big data, while cloud platforms like Azure and AWS are integral to modern data pipelines. Java and Go are favored for scalable backend solutions.
Most tools cater to big data management, cloud computing, or software development for scalable systems. They emphasize automation, data handling, and cross-platform integration.
Skills in both high demand and associated with high salaries often belong to specialized domains. Professionals with these skills can position themselves for highly compensated roles, especially in remote or global markets.
It highlights the intersection of demand and compensation, guiding professionals on skill acquisition to maximise career potential. Employers can identify competitive salaries to attract top talent with niche expertise. Who is it useful for?
Job Seekers: To strategically develop high-demand, high-paying skills. Employers: To structure job postings and competitive compensation. Educators & Career Advisors: To align curriculum or training programs with industry needs. Recruiters: To focus on in-demand skills while sourcing talent.
-
Complex Queries: Delving into advanced SQL queries to extract insights from vast datasets. 💻🔍 Data Aggregation: Mastering the art of summarizing large data volumes to reveal key trends and metrics. 📊
-
Common Table Expressions (CTEs): Leveraging CTEs to simplify complex queries and improve code readability. 📝💡
-
UNIONS: Combining results from multiple queries into one powerful dataset for broader analysis. ➕🔗
-
JOINS: Using JOINS to merge data from different tables, uncovering hidden relationships and insights. 🔄🔎 Window Functions: Gaining a deeper understanding of how to analyze data across multiple rows without collapsing it into a single summary. 🔲✨
-
Subqueries: Nesting queries within other queries to solve intricate problems and enhance data analysis. 🔄🔮
-
Optimising Performance: Learning how to write more efficient queries to handle large datasets and improve processing time. ⚡💼
-
Data Filtering & Sorting: Developing advanced filtering and sorting techniques to refine results and extract actionable insights. 🔎📈
-
Group By & Having Clauses: Mastering the use of GROUP BY and HAVING for more specific data aggregation and analysis. 📊
- Clone the repository:
git clone https://github.com/your_username/SQL_Project_Top_Jobs_Skills.git
