How to aggregate data in Pandas

Pandas provides a variety of methods to aggregate data, including groupby(), pivot_table(), and resample(). Here’s an example of how to use groupby() to aggregate data:

import pandas as pd

# create a sample dataframe
data = {'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Charlie'],
        'Gender': ['F', 'M', 'M', 'F', 'M'],
        'Age': [25, 30, 35, 27, 32],
        'Salary': [50000, 60000, 70000, 55000, 75000]}
df = pd.DataFrame(data)

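# group the dataframe by the 'Gender' column and calculate the mean of each numeric column
# ('Name' holds text, so averaging it raises a TypeError in pandas 2.0 and later)
grouped = df.groupby('Gender')[['Age', 'Salary']].mean()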

# print the result
print(grouped)

In this example, we first create a sample dataframe with the pd.DataFrame() constructor. We then call groupby() to group the rows by the ‘Gender’ column and apply mean() to each group. The result has one row for each unique value in the ‘Gender’ column, which becomes the index, and each aggregated numeric column holds the mean of that column within the group.
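As a quick check of the behaviour described above, the following sketch rebuilds the sample data and compares the group means with values worked out by hand. It selects the numeric columns explicitly, an assumption made so that the snippet also runs on pandas 2.x; the variable name means is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Charlie'],
    'Gender': ['F', 'M', 'M', 'F', 'M'],
    'Age': [25, 30, 35, 27, 32],
    'Salary': [50000, 60000, 70000, 55000, 75000],
})

# Mean of the numeric columns within each gender group
means = df.groupby('Gender')[['Age', 'Salary']].mean()

# One row per unique 'Gender' value, used as the index
print(list(means.index))         # ['F', 'M']
print(means.loc['F', 'Age'])     # (25 + 27) / 2 = 26.0
print(means.loc['F', 'Salary'])  # (50000 + 55000) / 2 = 52500.0
```

If the group key should stay an ordinary column rather than become the index, passing as_index=False to groupby() is one option.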

You can also perform more complex aggregations by applying multiple aggregation functions to each group. For example:

# group the dataframe by the 'Gender' column and calculate the mean and
# standard deviation of each numeric column (the text column 'Name' is left out)
grouped = df.groupby('Gender')[['Age', 'Salary']].agg(['mean', 'std'])

# print the result
print(grouped)

In this example, we use the agg() method to apply both the ‘mean’ and ‘std’ aggregations to each group. The resulting dataframe has one row for each unique value in the ‘Gender’ column, and its column labels have two levels: the original column name and the aggregation applied to it. Each original column therefore appears twice, once with its mean and once with its standard deviation.
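Because the column labels of such a result are (column, function) pairs, selecting a single statistic takes a tuple. The sketch below shows that, and also shows named aggregation as one way to get flat column names instead. The output names mean_age and max_salary are made up for illustration, and the numeric columns are selected explicitly as an assumption so that the code runs on pandas 2.x.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Charlie'],
    'Gender': ['F', 'M', 'M', 'F', 'M'],
    'Age': [25, 30, 35, 27, 32],
    'Salary': [50000, 60000, 70000, 55000, 75000],
})

stats = df.groupby('Gender')[['Age', 'Salary']].agg(['mean', 'std'])

# Columns are (column, function) pairs, so a tuple selects one of them
age_mean = stats[('Age', 'mean')]
print(age_mean['F'])  # 26.0

# Named aggregation: one flat, explicitly named output column per statistic
summary = df.groupby('Gender').agg(
    mean_age=('Age', 'mean'),
    max_salary=('Salary', 'max'),
)
print(summary)
```

Named aggregation also allows a different function for each column, which a plain list of function names does not.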

Note that groupby() can also be used to group by multiple columns or by a function applied to the index. Check out the pandas documentation for more details on how to use groupby() to aggregate data.
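Both variations mentioned here can be sketched on the same sample data. The choice of grouping keys and the even/odd split of the index are arbitrary and only meant to illustrate the mechanics.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Charlie'],
    'Gender': ['F', 'M', 'M', 'F', 'M'],
    'Age': [25, 30, 35, 27, 32],
    'Salary': [50000, 60000, 70000, 55000, 75000],
})

# Group by two columns: the result is indexed by (Gender, Name) pairs
by_gender_name = df.groupby(['Gender', 'Name'])['Salary'].mean()
print(by_gender_name.loc[('F', 'Alice')])  # (50000 + 55000) / 2 = 52500.0

# Group by a function of the index: the function receives each index label
# (here the default integers 0..4) and returns the group key for that row
by_parity = df.groupby(lambda i: 'even' if i % 2 == 0 else 'odd')['Salary'].sum()
print(by_parity['even'])  # 50000 + 70000 + 75000 = 195000
print(by_parity['odd'])   # 60000 + 55000 = 115000
```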