Return Values from Functions in Python – Best Practices for Data Science 2026
How a function returns its values affects how clear, reusable, and maintainable the code around it is. This matters most in data science code that passes pandas DataFrames, models, and metrics between pipeline steps, where a few conventions keep return values predictable.
TL;DR — Modern Return Value Best Practices
- Return clean, consistent data types
- Use type hints for return values
- Prefer returning tuples or dataclasses for multiple values
- Avoid returning None unless it explicitly signals failure or a missing result, and declare it in the type hint when you do
1. Basic Return Patterns
from typing import Optional, Tuple

import pandas as pd


def process_sales_data(df: pd.DataFrame) -> pd.DataFrame:
    """Clean and enrich sales data."""
    # Drop rows missing key fields; .copy() makes the result an independent frame.
    cleaned = df.dropna(subset=["amount", "customer_id"]).copy()
    cleaned = cleaned.assign(
        year=cleaned["order_date"].dt.year,  # requires a datetime column
        profit=cleaned["amount"] * 0.25,  # assumes a flat 25% margin
    )
    return cleaned


def calculate_metrics(df: pd.DataFrame) -> Tuple[float, float, int]:
    """Return (total_sales, avg_sale, transaction_count) as a tuple."""
    # Cast NumPy scalars to built-in floats so the result matches the type hint.
    total_sales = float(df["amount"].sum())
    avg_sale = float(df["amount"].mean())
    transaction_count = len(df)
    return total_sales, avg_sale, transaction_count
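A pure-Python sketch of the same tuple-returning pattern shows how the caller unpacks the result and why the return type should stay stable on edge cases. The name `summarize_amounts` is hypothetical, and returning `nan` for the average of an empty input is an assumption of this sketch (it mirrors what pandas reports for the mean of an empty numeric column).

```python
import math
from typing import List, Tuple


def summarize_amounts(amounts: List[float]) -> Tuple[float, float, int]:
    """Return (total, average, count) for a list of sale amounts.

    Hypothetical stand-in for a DataFrame-based metrics function.
    """
    count = len(amounts)
    total = float(sum(amounts))
    # Keep the return type stable on empty input: the average is nan, not None.
    average = total / count if count else math.nan
    return total, average, count


# The caller unpacks by position, so the order stated in the docstring matters.
total, average, count = summarize_amounts([120.0, 80.0, 100.0])
print(total, average, count)

empty_total, empty_average, empty_count = summarize_amounts([])
print(empty_total, math.isnan(empty_average), empty_count)
```

Because every path returns the same three-element shape, callers never need a special case for the empty input beyond checking for `nan`.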
2. Recommended Patterns for Data Science
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ModelResults:
    """Structured container for a trained model and its evaluation metrics."""
    model: Any
    accuracy: float
    precision: float
    recall: float
    feature_importance: Dict[str, float]


def train_and_evaluate(
    X_train, y_train, X_test, y_test
) -> ModelResults:
    """Train model and return structured results."""
    # ... model training logic ...
    # (the elided logic is expected to define `model` and `feature_importance_dict`)
    return ModelResults(
        model=model,
        # Placeholder metrics; real code would compute these on the test set.
        accuracy=0.87,
        precision=0.85,
        recall=0.89,
        feature_importance=feature_importance_dict,
    )
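The metric values in a results object like this would normally be computed rather than typed in. As a minimal sketch of where such numbers might come from, the block below computes accuracy, precision, and recall for binary labels in plain Python and returns them in a dataclass. `EvalMetrics` and `evaluate_predictions` are hypothetical names, and treating 1 as the positive class and reporting 0.0 when a denominator is zero are assumptions of this sketch; a real project would more likely rely on a library's metric functions.

```python
from dataclasses import asdict, dataclass
from typing import Sequence


@dataclass(frozen=True)
class EvalMetrics:
    """Hypothetical container for binary-classification metrics."""
    accuracy: float
    precision: float
    recall: float


def evaluate_predictions(y_true: Sequence[int], y_pred: Sequence[int]) -> EvalMetrics:
    """Compute metrics for binary labels, treating 1 as the positive class."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)
    return EvalMetrics(
        accuracy=correct / len(pairs),
        # Assumption: report 0.0 when there are no predicted or actual positives.
        precision=tp / (tp + fp) if (tp + fp) else 0.0,
        recall=tp / (tp + fn) if (tp + fn) else 0.0,
    )


metrics = evaluate_predictions([1, 0, 1, 1], [1, 0, 0, 1])
print(metrics.accuracy)  # fields are read by name, not by position
print(asdict(metrics))   # plain dict, convenient for logging or JSON
```

Marking the dataclass `frozen=True` is optional; it is used here so that results cannot be modified by accident after the function returns.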
def load_and_clean_data(file_path: str) -> Optional[pd.DataFrame]:
    """Return cleaned DataFrame or None if loading fails."""
    try:
        df = pd.read_csv(file_path, parse_dates=["order_date"])
        # cleaning logic...
        return df
    except Exception as e:
        # Broad catch kept for brevity; callers must check for None.
        print(f"Error loading data: {e}")
        return None
3. Best Practices for Return Values in Data Science 2026
- Use **type hints** for all return values (especially with complex returns)
- Return **tuples** for a small number of closely related values, and state their order in the docstring
- Use **dataclasses** or **namedtuples** when the result has several fields that callers should access by name
- Be consistent: if a function can return None, say so in both the docstring and the type hint
- Prefer returning rich objects (DataFrames, dataclasses) over raw primitives when it improves clarity
- Document what is returned, especially when returning multiple values
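The list above mentions namedtuples without showing one. As a sketch, `typing.NamedTuple` gives a tuple return value named fields while keeping positional unpacking intact. `SalesMetrics` and `compute_sales_metrics` are hypothetical names, a plain list of amounts stands in for a DataFrame column, and raising `ValueError` on empty input is an assumption of this example.

```python
from typing import List, NamedTuple


class SalesMetrics(NamedTuple):
    """Hypothetical named return type for sales metrics."""
    total_sales: float
    avg_sale: float
    transaction_count: int


def compute_sales_metrics(amounts: List[float]) -> SalesMetrics:
    """Return total, average, and count; raises ValueError on empty input."""
    if not amounts:
        raise ValueError("amounts must not be empty")
    total = float(sum(amounts))
    return SalesMetrics(
        total_sales=total,
        avg_sale=total / len(amounts),
        transaction_count=len(amounts),
    )


result = compute_sales_metrics([50.0, 150.0])
print(result.avg_sale)       # access by field name
total, avg, count = result   # still unpacks like a regular tuple
print(result._asdict())      # dict view, handy for logging
```

Existing callers that unpack a plain tuple keep working after such a change, while new code can use the field names.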
Conclusion
Return values are a critical part of function design in data science. In 2026, the best practice is to be explicit with type hints, use structured return types (tuples or dataclasses) when returning multiple values, and keep return behavior consistent and predictable. Well-designed return values make your functions easier to use, test, and integrate into larger data science pipelines.
Next steps:
- Review your current functions and improve their return values by adding type hints and using structured return types where appropriate