How to reshape a pandas DataFrame using melt() and wide_to_long()

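The pandas melt function reshapes a DataFrame from wide format to long format: selected columns are stacked into rows, so each row of the result holds one identifier combination, one variable name, and one value. It takes the following parameters:

df: The DataFrame you want to reshape (the object you call melt on, or the first argument when calling pd.melt).
id_vars: Column(s) to use as identifier variables. These are kept as columns and repeated for every melted row.
value_vars: Column(s) to unpivot into rows. If omitted, all columns not listed in id_vars are used.
var_name: The name to use for the new column that holds the original column names (defaults to "variable" when the columns have no name).
value_name: The name to use for the new column that holds the values (defaults to "value").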

Melt example

For example, consider the following wide format DataFrame:

  country  year  population  GDP
0   India  2000      1000.0  100
1   India  2001      1100.0  110
2   India  2002      1200.0  120
3  Brazil  2000      1500.0  150
4  Brazil  2001      1600.0  160
5  Brazil  2002      1700.0  170

You can reshape it to long format using melt:

df_melted = df.melt(id_vars=["country", "year"],
                    value_vars=["population", "GDP"],
                    var_name="variable",
                    value_name="value")

The resulting DataFrame would look like this:

    country  year    variable   value
0     India  2000  population  1000.0
1     India  2001  population  1100.0
2     India  2002  population  1200.0
3    Brazil  2000  population  1500.0
4    Brazil  2001  population  1600.0
5    Brazil  2002  population  1700.0
6     India  2000         GDP   100.0
7     India  2001         GDP   110.0
8     India  2002         GDP   120.0
9    Brazil  2000         GDP   150.0
10   Brazil  2001         GDP   160.0
11   Brazil  2002         GDP   170.0
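
To try the melt call end to end, the snippet below is a minimal sketch that rebuilds the example DataFrame by hand and applies the same arguments. Building the frame from a dictionary is an assumption made for illustration; any DataFrame with these four columns should behave the same way.

```python
import pandas as pd

# Rebuild the wide-format example table by hand (illustrative data).
df = pd.DataFrame({
    "country": ["India", "India", "India", "Brazil", "Brazil", "Brazil"],
    "year": [2000, 2001, 2002, 2000, 2001, 2002],
    "population": [1000.0, 1100.0, 1200.0, 1500.0, 1600.0, 1700.0],
    "GDP": [100, 110, 120, 150, 160, 170],
})

# Keep country and year as identifiers; stack population and GDP into rows.
df_melted = df.melt(
    id_vars=["country", "year"],
    value_vars=["population", "GDP"],
    var_name="variable",
    value_name="value",
)

print(df_melted)
```

Each of the 6 input rows contributes one output row per melted column, so the result has 12 rows. Because population holds floats and GDP holds integers, the shared value column is upcast to float, which is why GDP shows up as 100.0 rather than 100.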

Wide_to_long example

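The wide_to_long function is another way to reshape a DataFrame from wide to long format. It is a top-level pandas function, called as pd.wide_to_long(df, ...), and it expects the wide columns to be named as a shared prefix (the stub) followed by a suffix, for example population_2000. It takes the following parameters:

df: The DataFrame you want to reshape.
stubnames: The stub name(s), that is, the shared prefixes of the wide format columns.
i: Column(s) to use as identifier variables. Together they must uniquely identify each row.
j: The name of the new column that holds the suffixes taken from the wide column names.
sep: The separator between the stub and the suffix in the wide column names (defaults to an empty string).
suffix: A regular expression that matches the suffixes to capture (defaults to "\d+", one or more digits).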

For example, consider the following wide format DataFrame:

  country  year  population_2000  population_2001  population_2002  GDP_2000  GDP_2001  GDP_2002
0   India  2000           1000.0           1100.0           1200.0     100.0     110.0     120.0
1  Brazil  2000           1500.0           1600.0           1700.0     150.0     160.0     170.0

You can reshape it to long format using wide_to_long:

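import pandas as pd

# wide_to_long is a top-level pandas function, not a DataFrame method.
df_melted = pd.wide_to_long(df, stubnames=["population", "GDP"],
                            i=["country", "year"],
                            j="year_variable",
                            sep="_",        # stub and suffix are joined by "_"
                            suffix=r"\d+")  # regex matching the year suffix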
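
To see what this reshape produces, here is a minimal, self-contained sketch that rebuilds the two-row example by hand and reshapes it with the top-level pd.wide_to_long function. It assumes the stub and the year suffix are joined by an underscore (hence sep="_"). The name df_long and the final reset_index and sort_values step are illustrative choices, not part of the original example.

```python
import pandas as pd

# Rebuild the wide-format example table by hand (illustrative data).
df = pd.DataFrame({
    "country": ["India", "Brazil"],
    "year": [2000, 2000],
    "population_2000": [1000.0, 1500.0],
    "population_2001": [1100.0, 1600.0],
    "population_2002": [1200.0, 1700.0],
    "GDP_2000": [100.0, 150.0],
    "GDP_2001": [110.0, 160.0],
    "GDP_2002": [120.0, 170.0],
})

# Split each wide column into its stub (population / GDP) and numeric suffix.
df_long = pd.wide_to_long(
    df,
    stubnames=["population", "GDP"],
    i=["country", "year"],
    j="year_variable",
    sep="_",
    suffix=r"\d+",
)

# The result is indexed by (country, year, year_variable); flatten the index
# and sort the rows so the table is easier to inspect.
df_long = df_long.reset_index().sort_values(["country", "year_variable"])
print(df_long)
```

The result has one row per country and suffix (6 rows here), one column per stub (population and GDP), and the suffix stored in the year_variable column.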
