How to pivot tables in Pandas?

One of the most useful features of Pandas is the ability to pivot tables, which allows you to transform data by grouping, aggregating, and reshaping it.

First, you need to have a dataset in a Pandas DataFrame. Let’s assume we have a dataset that contains information about sales made by a company in different regions:

import pandas as pd

data = {
    'Region': ['North', 'North', 'South', 'South', 'East', 'East', 'West', 'West'],
    'Product': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B'],
    'Sales': [100, 200, 150, 50, 75, 125, 225, 175]
}

df = pd.DataFrame(data)

This DataFrame has three columns: Region, Product, and Sales. Region and Product are categorical variables, while Sales is a numerical variable.

To pivot this DataFrame, you can use the pivot_table() function. The pivot_table() function takes several arguments:

data: The DataFrame to pivot.
index: The column(s) to use as the index (i.e., the rows).
columns: The column(s) to use as the columns.
values: The column(s) to use as the values (i.e., the cells).
aggfunc: The aggregation function applied to all the values that fall into the same cell (defaults to 'mean').
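To see why aggfunc matters, here is a minimal sketch. It uses a small hypothetical DataFrame named dup (not the sales data above) in which the same Region and Product pair appears twice, so that one cell receives more than one value. The data and variable names are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical data: the ('North', 'A') pair appears twice,
# so its cell has two values that must be combined.
dup = pd.DataFrame({
    'Region': ['North', 'North', 'South'],
    'Product': ['A', 'A', 'A'],
    'Sales': [100, 300, 150],
})

# With no aggfunc given, pivot_table() uses 'mean': the two values are averaged.
mean_table = pd.pivot_table(dup, values='Sales', index='Region', columns='Product')

# With aggfunc='sum', the two values are added instead.
sum_table = pd.pivot_table(dup, values='Sales', index='Region', columns='Product', aggfunc='sum')

print(mean_table.loc['North', 'A'])  # 200.0
print(sum_table.loc['North', 'A'])   # 400
```

Cells that receive only one value, such as South and A here, come out the same under either function.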

Here’s an example of how to use the pivot_table() function to pivot the sales data:

table = pd.pivot_table(df, values='Sales', index='Region', columns='Product', aggfunc='sum')

In this example, we are pivoting the DataFrame df by using the Region column as the index and the Product column as the columns. We are using the Sales column as the values and the sum function as the aggregation function.

The resulting table DataFrame looks like this:

Product    A    B
Region
East      75  125
North    100  200
South    150   50
West     225  175

This pivot table shows the total sales made by the company in each region and for each product.
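It can help to see the same result built step by step. The sketch below rebuilds the sales DataFrame and reproduces the pivot table with groupby() followed by unstack(), which makes the grouping, aggregating, and reshaping stages explicit. The variable name by_hand is an assumption for illustration, and this is one possible equivalent, not a description of how pivot_table() works internally.

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South', 'East', 'East', 'West', 'West'],
    'Product': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B'],
    'Sales': [100, 200, 150, 50, 75, 125, 225, 175],
})

table = pd.pivot_table(df, values='Sales', index='Region', columns='Product', aggfunc='sum')

# 1. Group by both keys, 2. sum Sales within each group,
# 3. move Product from the row index into the columns.
by_hand = df.groupby(['Region', 'Product'])['Sales'].sum().unstack('Product')

print(by_hand)
```

For this dataset both approaches should give the same numbers, with one row per region and one column per product.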

You can also use the pivot_table() function to perform more complex transformations. For example, you can group by multiple columns and calculate multiple aggregation functions:

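df['Year'] = 2019  # the sample data has no Year column; the output below shows a single year, 2019

table = pd.pivot_table(df, values='Sales', index=['Region', 'Product'], columns='Year', aggfunc=['sum', 'count'])

In this example, we first add a Year column, because the original DataFrame does not have one and pivot_table() raises a KeyError if a named column is missing. We then pivot df by using the Region and Product columns as the index and the Year column as the columns. We are using the Sales column as the values and passing the sum and count functions as a list of aggregation functions.

The resulting table DataFrame has a two-level column header, with the aggregation function on top and the year below it:

                sum  count
Year           2019   2019
Region Product
East   A         75      1
       B        125      1
North  A        100      1
       B        200      1
South  A        150      1
       B         50      1
West   A        225      1
       B        175      1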
