If your DataFrame includes columns containing lists or arrays, explode() is your go-to method to normalize this data into separate rows, ideal for analysis, filtering, and merging.
What Does explode() Do?
explode() transforms a nested list column into multiple rows, one per list element, while duplicating values in other columns accordingly.
Example:
import pandas as pd
df = pd.DataFrame({
'id': [1, 2, 3],
'tags': [['python', 'pandas'], [], ['data', 'analysis', 'ml']]
})
df_exploded = df.explode('tags')
Result:
   id      tags
0   1    python
0   1    pandas
1   2       NaN
2   3      data
2   3  analysis
2   3        ml
Use this when nested list-like data (e.g. tags, labels, keywords) must be flattened for further processing.
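The same call also accepts other list-likes such as tuples, leaves scalar entries unchanged, and returns the exploded column with object dtype. A minimal sketch of that behavior, where the `scores` column and its values are made up for illustration:

```python
import pandas as pd

# Hypothetical data: a list, a tuple and a plain scalar in one column.
mixed = pd.DataFrame({
    'id': [1, 2, 3],
    'scores': [[10, 20], (30,), 40],
})

flat = mixed.explode('scores')

# The exploded column comes back as object dtype ...
raw_dtype = flat['scores'].dtype

# ... so cast it explicitly before doing numeric work.
flat['scores'] = flat['scores'].astype('int64')
total = flat['scores'].sum()
```

Casting right after the explode keeps later aggregations from silently running on Python objects.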
Handling Edge Cases
- Empty lists produce rows with NaN; drop missing with dropna().
- Reset index: Flattened data retains the original row index:
df_exploded.reset_index(drop=True, inplace=True)
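- Multiple list columns: consecutive calls explode each column independently and yield every combination of their elements. To expand the columns in lockstep, pass a list (pandas 1.3 or later; the lists in each row must have equal lengths):
df.explode(['tags', 'values'])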
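When two list columns are involved, it helps to compare what chained calls and a single multi-column call return. The sketch below uses a hypothetical frame named `pairs`. It assumes pandas 1.3 or later for the list argument, and it uses `ignore_index=True` as a one-step alternative to reset_index():

```python
import pandas as pd

# Hypothetical frame with two list columns of matching per-row lengths.
pairs = pd.DataFrame({
    'id': [1, 2],
    'tags': [['a', 'b'], ['c']],
    'values': [[10, 20], [30]],
})

# Chained calls explode each column independently, so id 1 yields every
# tag/value combination (2 x 2 = 4 rows), plus 1 row for id 2.
chained = pairs.explode('tags').explode('values')

# Passing a list explodes the columns together, element by element.
# ignore_index=True renumbers the result 0..n-1 instead of repeating labels.
paired = pairs.explode(['tags', 'values'], ignore_index=True)
```

Here `chained` has five rows and `paired` has three, so the right choice depends on whether the lists are parallel (element i of one belongs with element i of the other) or unrelated.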
Alternatives to explode()
- apply(pd.Series): Breaks lists into separate columns, not rows.
- Loop + concat: Manual expansion, slower and more verbose.
- json_normalize: Works for dicts/nested JSON, not simple lists.
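To make the difference concrete, here is a small side-by-side sketch. The frame and the `records` list are illustrative only and are not taken from a real dataset:

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2],
    'tags': [['python', 'pandas'], ['data']],
})

# apply(pd.Series) spreads each list across columns 0, 1, ...;
# shorter lists are padded with NaN.
wide = df['tags'].apply(pd.Series)

# explode() keeps a single column and adds rows instead.
long = df.explode('tags')

# json_normalize targets dicts / nested JSON records rather than bare lists.
records = [{'id': 1, 'meta': {'lang': 'python'}}]
flat = pd.json_normalize(records)
```

The wide form suits fixed-length lists with positional meaning, while the long form produced by explode() is usually easier to group, filter, and merge.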
Practical Use Cases
Use Case 1: You have user-generated tags per row—explode tags into multiple rows to count tag frequency.
Use Case 2: Nested ingredients lists in recipes—flatten lists to analyze ingredient usage.
Use Case 3: Survey responses with multiple choices—explode selections for pivoting or merging.
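As a rough sketch of Use Case 3, the snippet below explodes multiple-choice answers and builds a respondent-by-option table. The `survey` frame, its column names, and the use of pd.crosstab for the pivot step are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical survey data: each respondent may pick several options.
survey = pd.DataFrame({
    'respondent': ['r1', 'r2', 'r3'],
    'choices': [['email', 'sms'], ['email'], []],
})

# One row per (respondent, choice); respondents with no selection are dropped.
long = survey.explode('choices').dropna(subset=['choices'])

# How many respondents picked each option.
choice_counts = long['choices'].value_counts()

# Respondent-by-option count table, convenient for merging back later.
indicator = pd.crosstab(long['respondent'], long['choices'])
```

The same pattern carries over to tags and ingredients: explode first, then count or cross-tabulate the long form.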
Example Workflow
# Flatten tags and drop missing
df_exploded = df.explode('tags').dropna(subset=['tags']).reset_index(drop=True)
# Count tag frequency:
tag_counts = df_exploded['tags'].value_counts()
Performance Tips
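- explode() is evaluated eagerly and returns a new, larger DataFrame on every call, so filter rows and drop unneeded columns before exploding.
- Combine with `dropna()`, `query()`, and `pivot_table()` for powerful transformations.
- Avoid repeating explode on large DataFrames; chain operations when possible.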
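One way these tips might fit together is a single chain that filters first, explodes once, and then summarizes. This is a sketch only: the `posts` frame and its `year` column are hypothetical and exist just to give query() something to filter on:

```python
import pandas as pd

# Hypothetical data for illustration.
posts = pd.DataFrame({
    'id': [1, 2, 3],
    'year': [2022, 2023, 2023],
    'tags': [['python', 'pandas'], [], ['pandas', 'ml']],
})

summary = (
    posts
    .query('year == 2023')        # filter before exploding to keep the frame small
    .explode('tags')              # a single explode call
    .dropna(subset=['tags'])      # drop rows that came from empty lists
    .pivot_table(index='tags', values='id', aggfunc='count')
)
```

Filtering before the explode means the row multiplication only applies to the rows that are actually needed.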
explode() is an essential Pandas method for normalizing list-like data structures, transforming nested lists into row structures that are easy to analyze, filter, and summarize. Once you master it, you’ll tackle many real-world data-wrangling tasks with ease.