Pandas is a popular Python library for data analysis and manipulation. One of its main features is the DataFrame object, which represents a 2-dimensional labeled data structure with columns of potentially different types. Iterating over a Pandas DataFrame can be done in several ways, depending on what you want to achieve.
iterrows() method. This method returns an iterator that yields pairs of index and row data as Pandas Series objects. Here's an example:
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

for index, row in df.iterrows():
    print(f"Index: {index}, Name: {row['Name']}, Age: {row['Age']}")
Output:
Index: 0, Name: Alice, Age: 25
Index: 1, Name: Bob, Age: 30
Index: 2, Name: Charlie, Age: 35
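One consequence of each row being a Series is that the row carries a single dtype, so values from columns of different types may be converted on the way out. The following is a minimal sketch of that behavior; the column names `count` and `price` are made up for illustration and do not come from the example above.

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column.
df = pd.DataFrame({"count": [1, 2], "price": [1.5, 2.5]})

# Take the first (index, row) pair from the iterator.
index, row = next(df.iterrows())

print(type(row).__name__)  # the row is a pandas Series
print(df["count"].dtype)   # integer dtype inside the DataFrame
print(row.dtype)           # floating dtype: the integer was upcast in the row
```

If the original column types matter, it is worth checking them on the DataFrame itself rather than on the rows produced by iterrows().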
items() method (named iteritems() in older pandas; iteritems() was deprecated in pandas 1.5 and removed in 2.0). This method returns an iterator that yields pairs of column name and column data, with each column as a Pandas Series object. Here's an example:
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

for column, series in df.items():
    print(f"Column: {column}, Data: {series.tolist()}")
Output:
Column: Name, Data: ['Alice', 'Bob', 'Charlie']
Column: Age, Data: [25, 30, 35]
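Because each column arrives as a Series, Series methods can be called on it directly inside the loop. As a rough sketch, the snippet below counts the distinct values in every column; it uses `items()`, the spelling available in current pandas releases, and the name `unique_counts` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]})

# Map each column name to the number of distinct values in that column.
unique_counts = {column: series.nunique() for column, series in df.items()}

print(unique_counts)
```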
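Vectorized operations. Instead of looping, you can often apply an operation to a whole column at once, which is usually faster than iterating row by row. Here's an example:
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Multiply column A by 2
df['A'] = df['A'] * 2

# Add 1 to column B
df['B'] = np.add(df['B'], 1)

print(df)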
Output:
   A  B
0  2  5
1  4  6
2  6  7
In summary, Pandas provides several ways to iterate over a DataFrame, but in most cases, vectorized operations should be used for better performance.
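To make the comparison concrete, here is a small sketch that computes the same doubled column two ways, once with iterrows() and once with a single column expression. It only checks that the results agree; it does not measure timing, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Row by row: build the doubled values one row at a time.
looped = [row["A"] * 2 for _, row in df.iterrows()]

# Vectorized: one expression applied to the whole column.
vectorized = (df["A"] * 2).tolist()

print(looped == vectorized)  # True
```

Both approaches give the same values; the vectorized form is shorter and avoids creating a Series object for every row.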