To perform multiple summaries on a single column of a Pandas DataFrame, you can use the .agg() method with a list of summary statistics as its argument.
Here's an example of how to use the .agg() method to perform multiple summaries on a single column:
    import pandas as pd

    # Create a DataFrame
    data = {'Name': ['John', 'Mary', 'Peter', 'Anna', 'Mike'],
            'Age': [25, 32, 18, 47, 23],
            'Salary': [50000, 80000, 35000, 65000, 45000]}
    df = pd.DataFrame(data)

    # Use the .agg() method to perform multiple summaries on the Age column
    summary = df['Age'].agg(['mean', 'median', 'min', 'max'])

    # Print the summary
    print(summary)
In this example, we create a DataFrame with columns for Name, Age, and Salary. We then use the .agg() method to perform multiple summaries on the Age column: the mean, median, minimum, and maximum. Because .agg() is called on a single column with a list of statistics, the result is a Series (not a DataFrame) with four entries, one per summary statistic, indexed by the statistic name.
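Since the call returns a pandas Series indexed by statistic name, individual values can be read by label. The following is a minimal sketch of that, using only the Age values from the example; the variable names mean_age, age_range, and summary_frame are illustrative and not part of the original example.

```python
import pandas as pd

# Age values taken from the example DataFrame
df = pd.DataFrame({'Age': [25, 32, 18, 47, 23]})

summary = df['Age'].agg(['mean', 'median', 'min', 'max'])

# The result is a Series whose index holds the statistic names
mean_age = summary['mean']
age_range = summary['max'] - summary['min']

# Convert to a one-column DataFrame if a tabular layout is preferred
summary_frame = summary.to_frame(name='Age')

print(type(summary).__name__)
print(mean_age, age_range)
print(summary_frame.shape)
```

Calling .to_frame() is optional; it is only useful when downstream code expects a DataFrame with the statistics as rows.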
You can also use the .agg() method to perform custom summaries on a single column. For example, if you have a custom function called my_func that you want to apply to the Age column of the DataFrame, you can do it like this:
    def my_func(x):
        # Custom aggregation function
        return x.sum() / x.count()

    summary = df['Age'].agg(my_func)
In this example, we define a custom function my_func that calculates the average of a column by dividing the sum of the values by the count of non-null values. We then use the .agg() method to apply this function to the Age column of the DataFrame. Because a single function is passed rather than a list, the result is a single scalar value (the output of the custom function), not a DataFrame.
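A custom function can also be placed in the same list as the built-in statistic names. The sketch below assumes the same Age values and the same my_func as above; the variable name combined is illustrative. In the resulting Series, a callable is labelled by its function name.

```python
import pandas as pd

# Age values taken from the example DataFrame
df = pd.DataFrame({'Age': [25, 32, 18, 47, 23]})

def my_func(x):
    # Same custom aggregation as above: sum divided by the non-null count
    return x.sum() / x.count()

# Strings select built-in statistics; the callable appears under its own name
combined = df['Age'].agg(['min', 'max', my_func])

print(combined)
```

Passing a list, even one containing a single function, yields a Series rather than a scalar, which keeps the output shape consistent when statistics are added or removed later.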