Introduction

I built this short project to demonstrate my SQL proficiency, focusing on data cleaning, transformation, and exploratory data analysis of a dataset on layoffs at tech companies.

The project is divided into two main sections:

  1. SQL Data Cleaning - Layoffs data: Handling duplicates, standardizing data, and managing NULL values
  2. Exploratory Data Analysis (EDA): Extracting insights on layoff trends, company rankings, and time-based patterns
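As a minimal sketch of the duplicate-handling step, assuming MySQL 8.0 or later (needed for CTEs and window functions) and assuming that the six columns queried elsewhere in this write-up are enough to identify a row, a check for repeated rows could look like the following. The names duplicate_cte and row_num are illustrative, not taken from the project.

```sql
# Sketch only: number the rows within each group of identical values.
# Any row with row_num > 1 is a repeat of an earlier row in its group.
WITH duplicate_cte AS (
  SELECT *,
         ROW_NUMBER() OVER (
           PARTITION BY company, industry, total_laid_off,
                        percentage_laid_off, `date`, funds_raised_millions
         ) AS row_num
  FROM layoffs_staging
)
SELECT *
FROM duplicate_cte
WHERE row_num > 1;
```

If the table has more columns than the ones listed here, they would need to be added to the PARTITION BY clause so that rows differing only in those columns are not flagged as duplicates.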
# Preview the staging table after cleaning
SELECT *
FROM layoffs_staging;

Layoffs dataset after cleaning

2. Exploratory Data Analysis (EDA)

2.1 Basic EDA

# Date Range
SELECT MIN(`date`), MAX(`date`)
FROM layoffs_staging;

Date range for this dataset

# Largest single layoff and highest percentage laid off
SELECT MAX(total_laid_off), MAX(percentage_laid_off)
FROM layoffs_staging;


# Companies that laid off 100% of their staff, ordered by funds raised
SELECT *
FROM layoffs_staging
WHERE percentage_laid_off = 1
ORDER BY funds_raised_millions DESC;

Top 5 companies by funds raised among those with 100% laid off

# Highest Volume Laid off by company
SELECT company, SUM(total_laid_off)
FROM layoffs_staging
GROUP BY company
ORDER BY 2 DESC;

Top 5 total Laid off by Company
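The query above returns every company, while the screenshot shows only the first five rows. A hedged variant that returns just those five, assuming MySQL's LIMIT clause and using a hypothetical alias total_laid_off_sum instead of the positional ORDER BY 2, might look like this:

```sql
# Sketch only: name the aggregate and keep the five largest totals.
# total_laid_off_sum is an illustrative alias, not part of the dataset.
SELECT company, SUM(total_laid_off) AS total_laid_off_sum
FROM layoffs_staging
GROUP BY company
ORDER BY total_laid_off_sum DESC
LIMIT 5;
```

Ordering by the alias rather than the column position keeps the sort correct if columns are later added to or reordered in the SELECT list. The same pattern applies to the industry ranking by swapping company for industry.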


# Highest Volume Laid off by industry
SELECT industry, SUM(total_laid_off)
FROM layoffs_staging
GROUP BY industry
ORDER BY 2 DESC;

Top 5 total Laid off by Industry