Working with reshaped data in Pandas often boils down to two functions: pivot() and pivot_table(). Although they sound similar, they serve different purposes. This guide explains when and why to use each, with code examples and best-use scenarios.
What pivot() Does
pivot() reshapes long-format data into a wide format — each unique index-column pair must map to a single value.
Example:
data = [
{'Date': '2025-01-01', 'City': 'NYC', 'Temp': 5},
{'Date': '2025-01-01', 'City': 'LA', 'Temp': 15},
{'Date': '2025-01-02', 'City': 'NYC', 'Temp': 6},
{'Date': '2025-01-02', 'City': 'LA', 'Temp': 16},
]
df = pd.DataFrame(data)
pivoted = df.pivot(index='Date', columns='City', values='Temp')
Output:
City LA NYC
Date
2025-01-01 15 5
2025-01-02 16 6
Use pivot() when:
- Your index and column combination is unique
- You’re reshaping a tidy dataset
- No aggregation is needed
What pivot_table() Adds
pivot_table() builds on pivot() by allowing aggregation when duplicates occur. It’s more flexible and handles real-world messy data better.
Example:
data = [
{'Date': '2025-01-01', 'City': 'NYC', 'Temp': 5},
{'Date': '2025-01-01', 'City': 'NYC', 'Temp': 7}, # duplicate
{'Date': '2025-01-01', 'City': 'LA', 'Temp': 15},
]
df = pd.DataFrame(data)
pivot_tbl = df.pivot_table(index='Date', columns='City', values='Temp', aggfunc='mean')
Output:
City LA NYC
Date
2025-01-01 15 6
Use pivot_table() when:
- Duplicates exist in your data
- You need aggregation (mean, sum, min, max, etc.)
- You want custom aggregations per column
Common Differences at a Glance
| Feature | pivot() |
pivot_table() |
|---|---|---|
| Duplicate Handling | Errors on duplicates | Aggregates duplicate values |
| Aggregation | No | Yes (via aggfunc) |
| Default aggfunc | N/A | mean |
| Margins (totals) | No | Yes (with margins=True) |
| Flexibility | Less flexible | Highly flexible |
Best Practices & Tips
- Prefer
pivot()when data is clean and one-to-one. - Use
pivot_table()for grouped summaries and aggregation. - Add totals using `
pivot_table(..., margins=True)`, useful for dashboards. - Combine with `.reset_index()` to flatten the DataFrame if needed.
Use pivot() for clean one-to-one reshaping and pivot_table() for aggregated summaries and handling duplicates. Understanding both gives you powerful tools to reshape your Pandas DataFrames accurately and efficiently.
