Multiple Parameters and Return Values in Python Functions – Data Science Best Practices 2026
Handling multiple parameters and returning multiple values is very common in data science workflows. In 2026, following modern Python conventions helps you write clean, readable, and maintainable functions for data processing, modeling, and analysis.
TL;DR — Recommended Patterns
- Use type hints for all parameters and return values
- Use keyword-only parameters for important options
- Return multiple values using tuples or dataclasses
- Prefer structured return types over raw tuples when possible
1. Function with Multiple Parameters
import pandas as pd
from sklearn.model_selection import train_test_split


def prepare_modeling_data(
    df: pd.DataFrame,
    target_column: str,
    feature_columns: list[str] | None = None,
    *,
    handle_missing: bool = True,
    test_size: float = 0.2,
    random_state: int = 42,
) -> tuple[pd.DataFrame, pd.DataFrame, pd.Series, pd.Series]:
    """
    Prepare data for machine learning modeling.

    Args:
        df: Input DataFrame
        target_column: Name of the target variable
        feature_columns: List of feature columns (if None, uses all except target)
        handle_missing: Whether to drop rows with missing values
        test_size: Proportion of data to use for testing
        random_state: Random seed for reproducibility

    Returns:
        Tuple containing (X_train, X_test, y_train, y_test)
    """
    # Default to every column except the target.
    if feature_columns is None:
        feature_columns = [col for col in df.columns if col != target_column]

    # Drop rows that are missing a value in any feature or in the target.
    if handle_missing:
        df = df.dropna(subset=feature_columns + [target_column])

    X = df[feature_columns]
    y = df[target_column]

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    return X_train, X_test, y_train, y_test
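The bare `*` in the signature above makes `handle_missing`, `test_size`, and `random_state` keyword-only. The following is a minimal, standard-library-only sketch of what that means at the call site. `split_rows` is a hypothetical helper invented for this illustration; it is a simplified stand-in, not part of pandas or scikit-learn.

```python
import random


def split_rows(
    rows: list[dict],
    *,
    test_size: float = 0.2,
    random_state: int = 42,
) -> tuple[list[dict], list[dict]]:
    """Shuffle rows and split them into (train, test) lists.

    Hypothetical helper: a simplified stand-in for a real splitter.
    """
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be between 0 and 1 (exclusive)")
    shuffled = rows[:]  # copy so the caller's list is not mutated
    random.Random(random_state).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


rows = [{"id": i} for i in range(10)]

# Options must be named at the call site, which keeps calls self-documenting.
train, test = split_rows(rows, test_size=0.3, random_state=0)
print(len(train), len(test))  # 7 3

# Passing an option positionally is rejected by Python itself.
try:
    split_rows(rows, 0.3)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

The design trade-off is a slightly longer call in exchange for calls that cannot silently swap `test_size` and `random_state`.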
2. Better Approach: Using a Dataclass for Multiple Return Values
from dataclasses import dataclass
from typing import Any

import pandas as pd


@dataclass
class ModelingResults:
    model: Any
    metrics: dict[str, float]
    feature_importance: dict[str, float]
    predictions: pd.Series


def train_classification_model(
    X_train: pd.DataFrame,
    y_train: pd.Series,
    X_test: pd.DataFrame,
    y_test: pd.Series,
    model_type: str = "random_forest",
) -> ModelingResults:
    """Train a model and return structured results."""
    # Training logic here...
    # (it must define `model`, `feature_importance`, and `predictions`)
    return ModelingResults(
        model=model,
        # Placeholder values for illustration; compute real metrics from y_test.
        metrics={"accuracy": 0.89, "f1": 0.87, "auc": 0.92},
        feature_importance=feature_importance,
        predictions=predictions,
    )
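Because the example above leaves the training step out, here is a smaller, fully runnable sketch of the same idea using only the standard library. `EvaluationResult` and `evaluate_predictions` are hypothetical names chosen for this illustration, and the labels are made-up toy data.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EvaluationResult:
    """Hypothetical result container used only for this illustration."""

    accuracy: float
    n_samples: int
    n_correct: int


def evaluate_predictions(y_true: list[int], y_pred: list[int]) -> EvaluationResult:
    """Compare predicted labels with true labels and return named results."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("y_true must not be empty")
    n_correct = sum(t == p for t, p in zip(y_true, y_pred))
    return EvaluationResult(
        accuracy=n_correct / len(y_true),
        n_samples=len(y_true),
        n_correct=n_correct,
    )


result = evaluate_predictions([1, 0, 1, 1], [1, 0, 0, 1])

# Fields are read by name, so adding a field later does not break callers
# the way inserting an element into a tuple would break positional unpacking.
print(result.accuracy)  # 0.75
print(asdict(result))   # {'accuracy': 0.75, 'n_samples': 4, 'n_correct': 3}
```

`frozen=True` is optional; it is used here so a result cannot be modified by accident after it is returned.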
3. Best Practices in 2026
- Use **type hints** for all parameters and return values
- Use **keyword-only parameters** (everything after a bare `*` in the signature) for configuration options
- Return multiple values using a **dataclass** instead of a plain tuple when the return value is complex
- Keep the number of parameters manageable (ideally no more than six or seven)
- Provide sensible default values for optional parameters
- Document complex parameters and return values clearly in the docstring
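One possible way to follow the parameter-count guideline above is to group related options into a small configuration object with defaults. The sketch below is an assumption about how that might look, not a prescribed pattern; `SplitConfig` and `describe_split` are hypothetical names that do not come from any library.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SplitConfig:
    """Hypothetical bundle of split options with sensible defaults."""

    test_size: float = 0.2
    random_state: int = 42
    handle_missing: bool = True

    def __post_init__(self) -> None:
        # Validate once, at construction time, instead of in every function.
        if not 0.0 < self.test_size < 1.0:
            raise ValueError("test_size must be between 0 and 1 (exclusive)")


def describe_split(
    n_rows: int, *, config: SplitConfig = SplitConfig()
) -> dict[str, int]:
    """Return how many rows would land in the train and test sets."""
    n_test = round(n_rows * config.test_size)
    return {"train": n_rows - n_test, "test": n_test}


default_config = SplitConfig()
print(describe_split(1000, config=default_config))  # {'train': 800, 'test': 200}

# replace() derives a modified copy, leaving the frozen original untouched.
quick_config = replace(default_config, test_size=0.5)
print(describe_split(1000, config=quick_config))  # {'train': 500, 'test': 500}
```

Using a frozen instance as the default argument is safe here because it cannot be mutated between calls.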
Conclusion
Handling multiple parameters and return values properly is a hallmark of professional data science code. In 2026, the best practice is to use type hints, keyword-only parameters for options, and structured return types (dataclasses) instead of raw tuples. These patterns make your functions more readable, maintainable, and easier for other data scientists to use.
Next steps:
- Review your current data science functions and improve them by adding type hints and using dataclasses for complex return values