The .agg() method

The .agg() method in pandas aggregates, or summarizes, the data in a DataFrame. It is versatile: a single call can apply one or several aggregation functions to one or more columns.

Here's an example of how to use the .agg() method to summarize data:

import pandas as pd

# Create a DataFrame
data = {'Name': ['John', 'Mary', 'Peter', 'Anna', 'Mike'],
        'Age': [25, 32, 18, 47, 23],
        'Salary': [50000, 80000, 35000, 65000, 45000]}
df = pd.DataFrame(data)

# Use the .agg() method to calculate multiple summary statistics
summary = df[['Age', 'Salary']].agg(['mean', 'median', 'min', 'max'])

# Print the summary
print(summary)

In this example, we create a DataFrame with columns for Name, Age, and Salary. We then use the .agg() method to calculate the mean, median, minimum, and maximum values for the Age and Salary columns. The resulting summary DataFrame contains four rows (one for each summary statistic) and two columns (one for Age and one for Salary).
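The same method also accepts a dictionary that maps each column to its own list of functions, which is useful when different columns need different statistics. The following is a minimal sketch that rebuilds the DataFrame from the example above; the variable name per_column is only illustrative.

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame({'Name': ['John', 'Mary', 'Peter', 'Anna', 'Mike'],
                   'Age': [25, 32, 18, 47, 23],
                   'Salary': [50000, 80000, 35000, 65000, 45000]})

# Map each column to the list of statistics wanted for it
per_column = df.agg({'Age': ['min', 'max'],
                     'Salary': ['mean', 'median']})

# Rows are the statistic names; a cell is NaN where that statistic
# was not requested for that column
print(per_column)
```

Because each statistic is requested for only one column, the other column shows NaN in that row.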

You can also use the .agg() method to apply custom aggregation functions. For example, if you have a custom function called my_func that you want to apply to a column of a DataFrame, you can do it like this:

def my_func(x):
    # Custom aggregation function: sum divided by the count of non-missing values
    return x.sum() / x.count()

summary = df[['Age', 'Salary']].agg(my_func)

In this example, we define a custom function my_func that calculates the average of a column by dividing the sum of its values by the count of non-missing values. We then use the .agg() method to apply this function to the Age and Salary columns of the DataFrame. Because only a single function is passed (not a list), the result is a Series rather than a DataFrame: it holds one value per column, indexed by the column names Age and Salary.
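A quick way to see what .agg() returns for a single custom function is to inspect the type of the result and compare it with the built-in 'mean'. This is a small sketch using the same data; the names builtin and max_diff are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Mary', 'Peter', 'Anna', 'Mike'],
                   'Age': [25, 32, 18, 47, 23],
                   'Salary': [50000, 80000, 35000, 65000, 45000]})

def my_func(x):
    # Sum divided by the count of non-missing values
    return x.sum() / x.count()

summary = df[['Age', 'Salary']].agg(my_func)
builtin = df[['Age', 'Salary']].agg('mean')

# Inspect the kind of object returned and its values
print(type(summary).__name__)
print(summary)

# Largest absolute difference between the custom and built-in results
max_diff = (summary - builtin).abs().max()
print(max_diff)
```

For this data the custom function and 'mean' should agree, since both ignore missing values and none are present here.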
