Skip to content

safina12/SQL_Project_Data_Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SQL_Project_Data_Analysis

Top Paying Jobs and Skills for Data-Related Roles

Introduction

This project explores the top paying jobs and skills for data-related roles, with a particular focus on Data Analyst positions. The analysis includes remote roles and jobs based in London, UK. The goal is to identify trends, such as high-demand skills, optimal skill sets for maximising salaries, and the companies offering the highest-paying roles in the industry.

Background

Data-related roles are among the most in-demand and well-compensated positions across various industries. This project leverages SQL queries to analyse data on salaries, job demand, and required skills to uncover valuable insights for job seekers and industry professionals. The analysis includes:

The dataset includes detailed job postings and skill information, allowing for an in-depth exploration of trends.

Questions I answered using SQL queries

  1. What are the top paying data analyst jobs?
  2. What skills are required for the top paying data analyst jobs in London, UK or Remote?
  3. What are the most in-demand skills for data analysts?
  4. What is the average salary per skill and what are the top skills based on salary?
  5. What are the most optimal skills to learn?

Tools I Used

  • SQL: To query and analyze the dataset, including aggregating salaries, counting skill occurrences, and performing advanced filtering for specific job types.
  • pgAdmin: For database management.
  • Visual Studio Code: For database management and running SQL queries

The Analysis

  1. 1_top_paying_jobs.sql: SQL script for identifying the highest-paying jobs in the dataset, in London or remote.
SELECT
    job_id,
    job_title,
    job_location,
    job_schedule_type,
    salary_year_avg,
    job_posted_date,
    name AS company_name

FROM job_postings_fact

LEFT JOIN company_dim ON job_postings_fact.company_id = company_dim.company_id

WHERE job_title_short = 'Data Analyst'
AND (job_location = 'London, UK' OR job_work_from_home IS TRUE)
AND salary_year_avg IS NOT NULL 

ORDER BY salary_year_avg DESC 

LIMIT 10

The breakdown of the data

  • Wide Salary Range: The top 10 positions range from $184,000 to $650,000.
  • Remote and global positions fill the top 10 spots: Remote positions from global companies. -- Job titles vary: Job titles range from Director of Analytics to Data Analyst.
  1. 2_top_paying_skills.sql: SQL script analysing the skills associated with the highest salaries.
WITH top_paying_jobs AS (

SELECT
    job_id,
    job_title,
    salary_year_avg,
    job_work_from_home,
    job_location,
    name AS company_name

FROM job_postings_fact

LEFT JOIN company_dim ON job_postings_fact.company_id = company_dim.company_id

WHERE job_title_short = 'Data Analyst'
AND (job_location IN ('London, UK')
OR job_work_from_home IS TRUE )

AND salary_year_avg IS NOT NULL 

ORDER BY salary_year_avg DESC 
LIMIT 10
)
SELECT top_paying_jobs.*,
skills

FROM top_paying_jobs

-- Inner join two tables, skills_job_dim and skills_dim to find the skill_id and 
--  subsequent skills

INNER JOIN skills_job_dim ON top_paying_jobs.job_id = skills_job_dim.job_id 
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id

ORDER BY salary_year_avg DESC 

Breakdown of the data:

Breakdown of the data:

Top Skills Identified: SQL, Python, and R are the most lucrative skills. Top Companies: AT&T, Pinterest, UCLA Healthcare Careers, and SmartAsset are associated with these salaries. Wide Salary Range: Salaries span from $184,000 to $255,829.50, indicating a significant gap between lower and higher-paying roles.

  1. 3_most_indemand_skills.sql: SQL script identifying the most frequently required skills across job postings.
SELECT 
    skills,
    COUNT (skills_job_dim.job_id) AS demand_count

FROM job_postings_fact
INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id 
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
WHERE job_title_short = 'Data Analyst'
AND job_work_from_home IS TRUE 
-- AND job_location = 'London, UK' 

GROUP BY skills 
ORDER BY demand_count DESC 
LIMIT 5

Breakdown of the data:

Top Skills (Both London and Remote): SQL Python Excel Tableau Power BI Skills for Remote Jobs: Identical to those in London, suggesting a strong overlap between local and global demand for these capabilities.

In Demand Jobs

Bar graph showing the top 10 skills in demand for Data Analysts in 2023; Graph created from SQL query results

  1. 4_top_paying_skills.sql: SQL script providing a deeper analysis of the highest-paying skills.
SELECT 
    skills,
    ROUND (AVG (salary_year_avg), 0) AS avg_salary

FROM job_postings_fact
INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id 
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id
WHERE job_title_short = 'Data Analyst'
    AND salary_year_avg IS NOT NULL 

GROUP BY skills 
ORDER BY avg_salary DESC
LIMIT 25

Breakdown of the data:

Skills Ranked by Salary: SVN, Solidity, Couchbase, DataRobot, GoLand, MXNet, dplyr, VMware, Terraform.

Observations: These skills represent niche tools and frameworks, many associated with specialized domains such as blockchain (Solidity), cloud infrastructure (Terraform, VMware), and machine learning (MXNet).

  1. 5_optimal_skills.sql: SQL script combining skill demand and salary analysis to highlight the most optimal skills for a Data Analyst role.
SELECT 
    skills_dim.skill_id,
    skills_dim.skills,
    COUNT (skills_job_dim.job_id) AS demand_count,
    ROUND (AVG(job_postings_fact.salary_year_avg),0) AS avg_salary

FROM job_postings_fact
INNER JOIN skills_job_dim ON job_postings_fact.job_id = skills_job_dim.job_id
INNER JOIN skills_dim ON skills_job_dim.skill_id = skills_dim.skill_id

WHERE
    job_title_short = 'Data Analyst'
    AND salary_year_avg IS NOT NULL 
    AND job_work_from_home IS TRUE 

GROUP BY 
    skills_dim.skill_id  

HAVING
    COUNT (skills_job_dim.job_id) > 10

ORDER BY
    avg_salary DESC,
    demand_count DESC

LIMIT 25;

Breakdown of the data:

Top Optimal Skills (Ordered by Salary & Demand): Go Confluence Hadoop Snowflake Azure BigQuery AWS Java Observations on Tools: Common Ground: These tools are all critical in handling large-scale data processing and cloud-based solutions.

Use Cases: Tools like Hadoop and BigQuery are essential for managing big data, while cloud platforms like Azure and AWS are integral to modern data pipelines. Java and Go are favored for scalable backend solutions.

Insights from the data

What is similar about these tools?

Most tools cater to big data management, cloud computing, or software development for scalable systems. They emphasize automation, data handling, and cross-platform integration.

What can we gather about this information?

Skills in both high demand and associated with high salaries often belong to specialized domains. Professionals with these skills can position themselves for highly compensated roles, especially in remote or global markets.

Why is this data useful?

It highlights the intersection of demand and compensation, guiding professionals on skill acquisition to maximise career potential. Employers can identify competitive salaries to attract top talent with niche expertise. Who is it useful for?

Job Seekers: To strategically develop high-demand, high-paying skills. Employers: To structure job postings and competitive compensation. Educators & Career Advisors: To align curriculum or training programs with industry needs. Recruiters: To focus on in-demand skills while sourcing talent.

What I Learned 🌟

  • Complex Queries: Delving into advanced SQL queries to extract insights from vast datasets. 💻🔍 Data Aggregation: Mastering the art of summarizing large data volumes to reveal key trends and metrics. 📊

  • Common Table Expressions (CTEs): Leveraging CTEs to simplify complex queries and improve code readability. 📝💡

  • UNIONS: Combining results from multiple queries into one powerful dataset for broader analysis. ➕🔗

  • JOINS: Using JOINS to merge data from different tables, uncovering hidden relationships and insights. 🔄🔎 Window Functions: Gaining a deeper understanding of how to analyze data across multiple rows without collapsing it into a single summary. 🔲✨

  • Subqueries: Nesting queries within other queries to solve intricate problems and enhance data analysis. 🔄🔮

  • Optimising Performance: Learning how to write more efficient queries to handle large datasets and improve processing time. ⚡💼

  • Data Filtering & Sorting: Developing advanced filtering and sorting techniques to refine results and extract actionable insights. 🔎📈

  • Group By & Having Clauses: Mastering the use of GROUP BY and HAVING for more specific data aggregation and analysis. 📊

Getting Started 📄...

  1. Clone the repository:
    git clone https://github.com/your_username/SQL_Project_Top_Jobs_Skills.git 
    
    

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors