Within
two weeks
, we had already realised
enough savings
to pay
for a year’s worth of license.
It was that good — that
intuitive
Pivoting and unpivoting are
essential techniques in Snowflake
—they stand as crucial feature in data transformation. These techniques help you to transpose your data, converting rows into columns and vice versa. Such transformations are not just cosmetic; they pave the way for smarter data analysis and reporting.
In this article, we will explore what pivoting and unpivoting operations do, their use cases, and how to leverage built-in
Snowflake PIVOT
and
Snowflake UNPIVOT
functions to put these concepts into practice.
What Is PIVOT and UNPIVOT?
1) What Do You Mean by Pivoting?
Pivoting is the process of rotating data from row format to column format. Imagine having a table where each row represents a month, and you want to turn those months into individual columns. That's where pivoting comes into play.
Visually, this transformation can be represented as:
Diagram showing transformation from a vertical to a pivoted table format
2) What Do You Mean by Unpivoting?
Unpivoting does the exact reverse operation—transforming columns into rows format. If you have a wide table with many columns that you'd rather represent as rows, unpivoting is your go-to operation.
Visually, this transformation can be represented as:
Diagram showing the unpivoting process
In general,
Diagram showing Snowflake PIVOT and UNPIVOT operations
Common Use Case of PIVOT in Snowflake
Data Summarization
: Pivoting can help summarize data, turning detailed rows into summarized columns.
Cross-Tabulation
: Creating matrix formats for data representation.
Data Reporting
: Making data more readable and presentable for reports.
Common Use Case of Snowflake UNPIVOT
Data Normalization
: Turning wide tables with redundant columns into a more normalized format.
Data Preparation
: Preparing data for tools or applications that require a specific row-based format.
Data Cleaning
: Simplifying complex datasets by reducing the number of columns.
Note
:
Snowflake UNPIVOT is
NOT
exactly the reverse of Snowflake PIVOT as it cannot undo aggregations made by PIVOT.
Now let's look at how Snowflake PIVOT and Snowflake UNPIVOT commands can be used to accomplish pivoting and unpivoting data.
Save up to 30%
on your Snowflake spend in a few minutes!
What Does PIVOT Do in Snowflake?
Snowflake's PIVOT is a
SQL
operation used to transform rows into columns. It's particularly useful when you want to convert unique row values from one column into multiple columns in the output, aggregating data in the process. This operation is common in data analysis and reporting tasks.
Using PIVOT, the unique values from a specific column that were previously organized into separate rows can be rotated to align related data points into columns instead. This pivoting of the data structure condenses the information, providing a more consolidated analytical view. Here's a basic syntax for using Snowflake PIVOT command:
Syntax for Snowflake PIVOT
Snowflake PIVOT must be used within the
FROM clause
exclusively. When dealing with a subquery, it should be written within the subquery's
From clause
.
The syntax is:
SELECT ...
FROM ...
PIVOT (
<aggregate_function>(<pivot_column>)
FOR <value_column>
IN (<pivot_values>)
)
The key parameters are:
aggregate_function:
Aggregation you want to apply, like AVG, COUNT, MAX, MIN, and SUM.
pivot_column:
Column you want to turn into new columns.
value_column:
Values you want to populate in the new columns.
pivot_values:
List of values in the pivot_column that you want to turn into new columns.
Aggregations Supported by PIVOT Command
Snowflake PIVOT command supports several built-in aggregate functions that can be used to combine and transform the data during the pivoting operation. These aggregate functions are:
1)
AVG
: Calculates the average value of the specified column for each group of rows. It can be used with numeric data types.
Example,
SELECT *
FROM table_name
PIVOT(AVG(pivot_column) FOR value_column IN (pivot_values))
2)
COUNT
: Counts the number of non-null values in the specified column for each group of rows. It can be used with any data type.
Example,
SELECT *
FROM table_name
PIVOT(COUNT(pivot_column) FOR value_column IN (pivot_values))
3)
MAX
: Returns the maximum value in the specified column for each group of rows. It can be used with numeric, string, date, and timestamp data types.
Example,
SELECT *
FROM table_name
PIVOT(MAX(pivot_column) FOR value_column IN (pivot_values))
4)
MIN
: Returns the minimum value in the specified column for each group of rows. It can be used with numeric, string, date, and timestamp data types.
Example,
SELECT *
FROM table_name
PIVOT(MIN(pivot_column) FOR value_column IN (pivot_values))
5)
SUM
: Calculates the sum of all values in the specified column for each group of rows. It can be used with numeric data types.
Example,
SELECT *
FROM table_name
PIVOT(SUM(pivot_column) FOR value_column IN (pivot_values))
Note that the aggregate function you choose should be compatible with the data type of the pivot column. For example, you cannot use the AVG function on a string column, or the SUM function on a date column.
Snowflake Pivot Examples
For example, consider the following example data table containing the school's student grades for different subjects:
First, let's create on table called
StudentGrades
and insert some values into that table.
Creating and inserting values into the 'StudentGrades' table in Snowflake
Now, lets pivot the table so that each subject becomes a separate column.
SELECT * FROM StudentGrades
PIVOT (MAX(Grade) FOR Subject IN ('Math', 'History')) AS p;
This would output:
Pivoting table so that each subject becomes a separate column StudentGrades table
The pivot has aggregated each of the student's grades for Math and History as separate columns.
Getting column name without quotes(“ “) using PIVOT
Remember that you can also include the column names in the
AS
clause
if you prefer them without quotes, or if you wish to display different column names, like this:
SELECT * FROM StudentGrades
PIVOT (MAX(Grade) FOR Subject IN ('Math', 'History')) AS p (student_id, math, history);
Getting column name without quotes using Snowflake PIVOT
How to Pivot Multiple Columns?
Snowflake PIVOT function is really powerful, it's designed to pivot on a single aggregate function. But what if you want to pivot on multiple aggregates, like both SUM and AVERAGE? There's no direct way, but with the clever use of UNION, you can achieve this.
Let's say you have a table
BookSales
that tracks the number of books sold and the total revenue for different genres.
Step-by-Step Guide to Pivot Multiple Columns in Snowflake
To pivot on both the SUM and AVERAGE of BooksSold and Revenue for each genre, you can use the following query:
-- Total Revenue
SELECT 'Total Revenue' AS Metric, *
FROM (
SELECT Genre, Revenue
FROM BookSales
) AS SourceTable
PIVOT (
SUM(Revenue)
FOR Genre IN ('Fiction', 'Non-Fiction')
) AS PivotTable
UNION ALL
-- Average Revenue
SELECT 'Average Revenue' AS Metric, *
FROM (
SELECT Genre, Revenue
FROM BookSales
) AS SourceTable
PIVOT (
AVG(Revenue)
FOR Genre IN ('Fiction', 'Non-Fiction')
) AS PivotTable;
Pivoting on both the SUM and AVERAGE of BooksSold and Revenue for each books genre
Check out this comprehensive video to learn more in-depth about using the Snowflake PIVOT function with a dynamic list of columns.