Table of Contents
ToggleThe SQL CASE statement
The SQL CASE
statement is a powerful conditional expression used to create or modify new columns based on specified conditions. It allows you to add logic to your SQL queries, enabling the transformation and categorization of data within the query itself.
Understanding SQL CASE
The CASE
statement goes through conditions and returns a value when the first condition is met (like an if-else statement). Once a condition is true, it stops reading and returns the result. If no conditions are true, it returns the value in the ELSE
clause. If there is no ELSE
part and no conditions are true, it returns NULL
.
Syntax:
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
…
ELSE resultN
END
Example 1:
SELECT [invoice_no],
[price],
CASE
WHEN [price] > 1000.00 THEN ‘High’
WHEN [price] BETWEEN 500.00 AND 1000.00 THEN ‘Medium’
ELSE ‘Low’
END AS sale_category
FROM [brazecommerce].[dbo].[SalesData]
Example 2:
SELECT [EmployeeKey],
[MaritalStatus],
[Gender],
[SickLeaveHours],
CASE
WHEN [SickLeaveHours] <= 10 THEN ‘excellent’
WHEN [SickLeaveHours] <= 20 THEN ‘good’
WHEN [SickLeaveHours] <= 40 THEN ‘fair’
WHEN [SickLeaveHours] <= 60 THEN ‘poor’
ELSE ‘very poor’
END AS health
FROM [AdventureWorksDW2022].[dbo].[DimEmployee]
Example 3:
Example 4:
[FirstName],
[DepartmentName],
CASE
WHEN [HireDate] < ‘2004-01-01’ THEN ‘Legacy’
WHEN [HireDate] < ‘2014-01-01’ THEN ‘Veteran’
WHEN [HireDate] < ‘2020-01-01’ THEN ‘Seasoned’
WHEN [HireDate] < ‘2022-01-01’ THEN ‘Established’
WHEN [HireDate] < ‘2024-03-01’ THEN ‘Newcorner’
ELSE ‘Fresh Hire’
END AS employee_status
FROM [AdventureWorksDW2022].[dbo].[DimEmployee];
The SQL UNION statement
The SQL UNION
operator is a powerful tool used in data analysis to combine the results of two or more SELECT
statements into a single result set. This is particularly useful when dealing with data from multiple tables or queries that need to be aggregated into a comprehensive view.
Understanding SQL UNION
The UNION
operator merges the results of two or more SELECT
queries. Each SELECT
statement within the UNION
must have the same number of columns in the result sets with similar data types.
Key Characteristics:
- Combines Result Sets:
UNION
combines rows from different queries into a single result set. - Eliminates Duplicates: By default,
UNION
removes duplicate rows. If you want to include duplicates, you can useUNION ALL
. - Column Compatibility: The corresponding columns in each
SELECT
statement must have compatible data types. - Ordering: The
ORDER BY
clause can be used at the end of theUNION
statement to sort the final result set.
Syntax:
SELECT column1, column2, …
FROM table1
UNION
SELECT column1, column2, …
FROM table2;
Example 1:
SELECT ‘Country’ AS Country_language, [id], [first_name], [country]
FROM [AW_SSIS_DEV_DB].[dbo].[Sport_club_members]
UNION
SELECT ‘Language’, [id], [first_name], [language]
FROM [AW_SSIS_DEV_DB].[dbo].[Members];
UNION vs. UNION ALL
The choice between UNION
and UNION ALL
depends on whether you want to include duplicate rows.
- UNION: Removes duplicate rows from the combined result set.
- UNION ALL: Includes all rows from the combined result set, including duplicates. This is faster because it doesn’t require the database to check for and remove duplicates.
Understanding SQL UNION ALL
Syntax:
SELECT column1, column2, …
FROM table1
UNION ALL
SELECT column1, column2, …
FROM table2;
Example 1:
SELECT ‘Customer’ AS Type, [customer_id], [customer_city]
FROM [brazecommerce].[dbo].[customers]
WHERE [customer_city] = ‘sao paulo’
UNION ALL
SELECT ‘Seller’, [seller_id], [seller_city]
FROM [brazecommerce].[dbo].[sellers]
WHERE [seller_city] = ‘sao paulo’
ORDER BY [customer_city];
Performance Considerations
Using UNION
and UNION ALL
can impact performance, especially with large datasets. Here are some tips to optimize performance:
- Indexes: Ensure appropriate indexes on columns used in the
SELECT
statements to speed up query execution. - Filter Early: Apply
WHERE
conditions in individualSELECT
statements to reduce the number of rows before applyingUNION
. - UNION ALL: Use
UNION ALL
when duplicates are not a concern, as it is faster thanUNION
.
Practical Applications in Data Analysis
- Data Consolidation: Combine similar data from different sources or tables into a single result set for comprehensive analysis.
- Historical Data Analysis: Merge data from different periods (e.g., monthly, yearly) to perform trend analysis.
- Reporting: Generate unified reports from various subsets of data, such as merging sales data from different regions.
Trademark Disclaimer:
All trademarks, logos, and brand names are the property of their respective owners. All company, product, and service names used in this website are for identification purposes only. Use of these names trademarks, and brands do not imply endorsement.