Working with reshaped data in Pandas often boils down to two functions: pivot()
and pivot_table()
. Although they sound similar, they serve different purposes. This guide explains when and why to use each, with code examples and best-use scenarios.
What pivot()
Does
pivot()
reshapes long-format data into a wide format — each unique index-column pair must map to a single value.
Example:
data = [
{'Date': '2025-01-01', 'City': 'NYC', 'Temp': 5},
{'Date': '2025-01-01', 'City': 'LA', 'Temp': 15},
{'Date': '2025-01-02', 'City': 'NYC', 'Temp': 6},
{'Date': '2025-01-02', 'City': 'LA', 'Temp': 16},
]
df = pd.DataFrame(data)
pivoted = df.pivot(index='Date', columns='City', values='Temp')
Output:
City LA NYC
Date
2025-01-01 15 5
2025-01-02 16 6
Use pivot()
when:
- Your index and column combination is unique
- You’re reshaping a tidy dataset
- No aggregation is needed
What pivot_table()
Adds
pivot_table()
builds on pivot()
by allowing aggregation when duplicates occur. It’s more flexible and handles real-world messy data better.
Example:
data = [
{'Date': '2025-01-01', 'City': 'NYC', 'Temp': 5},
{'Date': '2025-01-01', 'City': 'NYC', 'Temp': 7}, # duplicate
{'Date': '2025-01-01', 'City': 'LA', 'Temp': 15},
]
df = pd.DataFrame(data)
pivot_tbl = df.pivot_table(index='Date', columns='City', values='Temp', aggfunc='mean')
Output:
City LA NYC
Date
2025-01-01 15 6
Use pivot_table()
when:
- Duplicates exist in your data
- You need aggregation (mean, sum, min, max, etc.)
- You want custom aggregations per column
Common Differences at a Glance
Feature | pivot() |
pivot_table() |
---|---|---|
Duplicate Handling | Errors on duplicates | Aggregates duplicate values |
Aggregation | No | Yes (via aggfunc) |
Default aggfunc | N/A | mean |
Margins (totals) | No | Yes (with margins=True ) |
Flexibility | Less flexible | Highly flexible |
Best Practices & Tips
- Prefer
pivot()
when data is clean and one-to-one. - Use
pivot_table()
for grouped summaries and aggregation. - Add totals using `
pivot_table(..., margins=True)
`, useful for dashboards. - Combine with `.reset_index()` to flatten the DataFrame if needed.
Use pivot()
for clean one-to-one reshaping and pivot_table()
for aggregated summaries and handling duplicates. Understanding both gives you powerful tools to reshape your Pandas DataFrames accurately and efficiently.