Custom Metrics & Rewards
Flexible configuration for time series forecasting and beyond
📊 Custom Metrics
MAT-HPO provides a flexible parameter-based interface that allows you to customize metrics tracking and reward computation without modifying the library code. This feature is especially useful for:
- Time Series Forecasting: Track MASE, SMAPE, MAE, RMSE instead of F1/AUC/G-mean
- Regression Tasks: Use MSE, R², RMSE, MAE as metrics
- Custom Domains: Define any metrics specific to your problem
- Custom Rewards: Implement complex reward functions based on multiple objectives
✅ Key Benefits
- No library code modification needed
- Support for arbitrary number of metrics (not limited to 3)
- Preserves original metric values for proper evaluation
- Backward compatible with existing F1/AUC/G-mean interface
- Domain-agnostic design
Three Core Components
1️⃣ BaseEnvironment Parameters
Configure metrics and rewards in your environment:
- custom_metrics: List of metrics to track
- metric_names_mapping: Display name mapping
- reward_function: Custom reward logic
2️⃣ HPOLogger Configuration
Enhanced logging with custom metrics:
- metrics_extractor: Extract metrics from hyperparams
- metric_names: Custom display names
- Automatic separation of original vs. transformed values
3️⃣ Automatic Integration
Everything works seamlessly:
- Optimizer auto-detects environment config
- Logger inherits metric settings
- Flexible storage in best_hyperparams.json
Complete Time Series Example
Step 1: Define Custom Functions
```python
import numpy as np

# Define custom reward function
def timeseries_reward(metrics: dict) -> float:
    """Reward based on training loss (avoid data leakage)"""
    train_loss = metrics.get('train_loss', 1.0)
    if train_loss < 300.0:
        return 0.9
    elif train_loss < 400.0:
        return 0.7
    return 0.3

def extract_timeseries_metrics(hyperparams: dict) -> dict:
    """Extract all time series metrics from hyperparams"""
    return {
        'train_loss': float(hyperparams.get('train_loss', 0.0)),
        'val_loss': float(hyperparams.get('val_loss', 0.0)),
        'mase': float(hyperparams.get('mase', 1.0)),
        'smape': float(hyperparams.get('original_smape', 0.0)),
        'mae': float(hyperparams.get('original_mae', 0.0)),
        'rmse': float(hyperparams.get('original_rmse', 0.0)),
    }
```
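As a quick sanity check, the banded reward above can be exercised standalone (the loss values below are illustrative):

```python
# Redefined here so the snippet runs standalone (same banding as Step 1)
def timeseries_reward(metrics: dict) -> float:
    train_loss = metrics.get('train_loss', 1.0)
    if train_loss < 300.0:
        return 0.9
    elif train_loss < 400.0:
        return 0.7
    return 0.3

print(timeseries_reward({'train_loss': 250.0}))   # 0.9 (top band)
print(timeseries_reward({'train_loss': 331.72}))  # 0.7 (middle band)
print(timeseries_reward({'train_loss': 512.0}))   # 0.3 (everything else)
```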
Step 2: Configure Environment
```python
from MAT_HPO_LIB import BaseEnvironment

class TimeSeriesEnvironment(BaseEnvironment):
    def __init__(self, model_name, dataset_name):
        super().__init__(
            name=f"TS-{model_name}-{dataset_name}",
            # Custom metrics list
            custom_metrics=['train_loss', 'val_loss', 'mase', 'smape', 'mae', 'rmse'],
            # Metric name mapping (for display)
            metric_names_mapping={
                'f1': 'SMAPE',
                'auc': 'MAE',
                'gmean': 'RMSE'
            },
            # Custom reward function
            reward_function=timeseries_reward
        )

    def train_evaluate(self, model, hyperparams):
        # Train your model...
        train_loss = 331.72
        val_loss = 346.51
        # Evaluate on test set...
        mase = 2.304
        mae = 632.53
        rmse = 809.25
        smape = 0.0618
        # Return all metrics
        return {
            # Original training metrics
            'train_loss': train_loss,
            'val_loss': val_loss,
            'overfitting_ratio': val_loss / train_loss,
            # Original test metrics
            'mase': mase,
            'smape': smape,
            'mae': mae,
            'rmse': rmse,
            # Transformed values for MAT-HPO (higher is better)
            'f1': 0.8 - min(0.8, smape / 2.0),
            'auc': 0.8 - min(0.8, mae / 1000.0),
            'gmean': 0.8 - min(0.8, rmse / 1000.0),
            # Save original values
            'original_smape': smape,
            'original_mae': mae,
            'original_rmse': rmse
        }

    def compute_reward(self, metrics):
        # Use custom reward function
        if self.custom_reward_function:
            return self.custom_reward_function(metrics)
        return 0.5
```
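The f1/auc/gmean transforms in train_evaluate map lower-is-better errors onto a higher-is-better 0.0-0.8 scale. A standalone check of that pattern (the helper name to_score is just for illustration):

```python
def to_score(error: float, scale: float) -> float:
    """Map a lower-is-better error onto a higher-is-better 0.0-0.8 score."""
    return 0.8 - min(0.8, error / scale)

# Same transforms as in train_evaluate above
print(round(to_score(0.0618, 2.0), 4))     # 0.7691 (SMAPE -> f1)
print(round(to_score(632.53, 1000.0), 4))  # 0.1675 (MAE -> auc)
print(to_score(809.25, 1000.0))            # 0.0 (RMSE/1000 exceeds 0.8, so it clips)
```

Errors at or beyond 0.8 × scale all clip to 0.0, so pick scales large enough that typical errors stay inside the range and remain distinguishable.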
Step 3: Create Logger and Optimizer
```python
from MAT_HPO_LIB import MAT_HPO_Optimizer, HyperparameterSpace
from MAT_HPO_LIB.utils import DefaultConfigs
from MAT_HPO_LIB.utils.logger import HPOLogger

# Create environment
env = TimeSeriesEnvironment("dlinear", "us_births")

# Create hyperparameter space
space = HyperparameterSpace()
space.add_continuous('learning_rate', 1e-5, 1e-2, agent=0)
space.add_discrete('batch_size', [8, 16, 32, 64], agent=0)

# Create config
config = DefaultConfigs.standard()
config.max_steps = 100

# Create optimizer (automatically inherits the environment's custom metrics)
optimizer = MAT_HPO_Optimizer(env, space, config)

# Run optimization
results = optimizer.optimize()
print(f"Best reward: {results['best_performance']['reward']:.4f}")
```
Pro Tip
The optimizer automatically detects and uses the custom metrics configuration from your environment. No manual logger setup needed!
Output Format
best_hyperparams.json
```json
{
  "hyperparameters": {
    "learning_rate": 0.001,
    "batch_size": 32
  },
  "performance": {
    "smape": 0.0618,
    "mae": 632.53,
    "rmse": 809.25,
    "mase": 2.304,
    "train_loss": 331.72,
    "val_loss": 346.51,
    "overfitting_ratio": 1.045,
    "reward": 0.7512
  },
  "step": 42
}
```
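Reading the result back is plain JSON. A minimal sketch, parsing an abridged inline copy so the snippet runs on its own (in practice you would json.load the best_hyperparams.json file from your optimizer's output directory):

```python
import json

# Abridged copy of best_hyperparams.json (see the example above)
raw = '''{
  "hyperparameters": {"learning_rate": 0.001, "batch_size": 32},
  "performance": {"smape": 0.0618, "mae": 632.53, "reward": 0.7512},
  "step": 42
}'''
best = json.loads(raw)

# Original metric values live directly under "performance"
print(best['performance']['smape'])           # 0.0618
print(best['hyperparameters']['batch_size'])  # 32
```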
step_log.jsonl (each line)
```json
{
  "step": 0,
  "timestamp": "2025-10-03T08:00:00",
  "metrics": {
    "train_loss": 331.72,
    "val_loss": 346.51,
    "overfitting_ratio": 1.045,
    "mase": 2.304,
    "smape": 0.0618,
    "mae": 632.53,
    "rmse": 809.25,
    "f1_transformed": 0.7691,
    "auc_transformed": 0.1675,
    "gmean_transformed": 0.1000
  },
  "timing": {...},
  "hyperparameters": {...}
}
```
Metric Separation
The logger automatically separates:
- Original values (smape, mae, rmse): the true metric values for evaluation
- Transformed values (*_transformed): used internally by MAT-HPO for optimization
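Because each step_log.jsonl line is standalone JSON, the same separation can be applied when post-processing the log. A sketch that splits on the *_transformed suffix (one abridged record is inlined so the snippet runs on its own):

```python
import json

# One abridged line from step_log.jsonl (see the example above)
line = ('{"step": 0, "metrics": {"mase": 2.304, "smape": 0.0618, '
        '"f1_transformed": 0.7691, "auc_transformed": 0.1675}}')
metrics = json.loads(line)['metrics']

# Split on the *_transformed suffix used by the logger
original = {k: v for k, v in metrics.items() if not k.endswith('_transformed')}
transformed = {k: v for k, v in metrics.items() if k.endswith('_transformed')}

print(original)     # {'mase': 2.304, 'smape': 0.0618}
print(transformed)  # {'f1_transformed': 0.7691, 'auc_transformed': 0.1675}
```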
Advanced Usage
Manual Logger Configuration
For full control, you can manually configure the logger:
```python
from MAT_HPO_LIB.utils.logger import HPOLogger

# Create custom logger
logger = HPOLogger(
    output_dir='./results',
    metric_names={'f1': 'SMAPE', 'auc': 'MAE', 'gmean': 'RMSE'},
    custom_metrics=['train_loss', 'val_loss', 'mase', 'smape', 'mae', 'rmse'],
    metrics_extractor=extract_timeseries_metrics
)

# Create optimizer
optimizer = MAT_HPO_Optimizer(env, space, config)
optimizer.logger = logger  # Override the default logger

# Run optimization
results = optimizer.optimize()
```
Flexible Metric Count
Track as many metrics as you need:
```python
super().__init__(
    name="MyEnv",
    custom_metrics=[
        'train_loss', 'val_loss', 'test_loss',
        'mase', 'smape', 'mae', 'rmse', 'mape', 'mse',
        'overfitting_ratio', 'training_time', 'inference_time'
    ],
    # ... other parameters
)
```
Complex Reward Functions
```python
def sophisticated_reward(metrics: dict) -> float:
    """Multi-objective reward combining accuracy and efficiency"""
    # Accuracy component (70%)
    mase = metrics.get('mase', 10.0)
    accuracy_reward = 0.7 * (1.0 / max(mase, 0.1))
    # Efficiency component (20%)
    train_time = metrics.get('training_time', 1000)
    efficiency_reward = 0.2 * (1.0 / max(train_time / 100, 1.0))
    # Stability component (10%)
    overfitting = metrics.get('overfitting_ratio', 2.0)
    stability_reward = 0.1 * (1.0 if overfitting <= 1.2 else 0.5)
    total_reward = accuracy_reward + efficiency_reward + stability_reward
    return max(0.0, min(1.0, total_reward))
```
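Plugging representative values through that weighting shows the clipping at both ends (redefined compactly here so the check runs standalone; the metric values are illustrative):

```python
def sophisticated_reward(metrics: dict) -> float:
    # Same weighting as above: 70% accuracy, 20% efficiency, 10% stability
    accuracy = 0.7 * (1.0 / max(metrics.get('mase', 10.0), 0.1))
    efficiency = 0.2 * (1.0 / max(metrics.get('training_time', 1000) / 100, 1.0))
    stability = 0.1 * (1.0 if metrics.get('overfitting_ratio', 2.0) <= 1.2 else 0.5)
    return max(0.0, min(1.0, accuracy + efficiency + stability))

# A strong, fast, stable run saturates at the 1.0 cap
print(sophisticated_reward(
    {'mase': 0.5, 'training_time': 50, 'overfitting_ratio': 1.0}))  # 1.0

# A mediocre run lands mid-range
print(round(sophisticated_reward(
    {'mase': 2.304, 'training_time': 300, 'overfitting_ratio': 1.045}), 4))  # 0.4705
```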
Use Cases
Time Series Forecasting
- Metrics: MASE, SMAPE, MAE, RMSE
- Reward: Based on validation loss
- Track overfitting ratio
Regression
- Metrics: MSE, RMSE, MAE, R²
- Reward: Inverse of validation MSE
- Track prediction intervals
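The "inverse of validation MSE" reward for the regression case can be sketched as a bounded function (a minimal example; val_mse is an assumed metric key, and the 1/(1 + mse) shape is one reasonable choice, not a library convention):

```python
def regression_reward(metrics: dict) -> float:
    """Map validation MSE (lower is better) into a (0, 1] reward."""
    # Assumed key: default to infinity so a missing metric scores 0.0
    val_mse = metrics.get('val_mse', float('inf'))
    # 1.0 at zero error, decaying smoothly toward 0.0 as error grows
    return 1.0 / (1.0 + val_mse)

print(regression_reward({'val_mse': 0.0}))  # 1.0
print(regression_reward({'val_mse': 3.0}))  # 0.25
```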
Classification (Default)
- Metrics: F1, AUC, G-mean, Precision, Recall
- Reward: Weighted combination
- Track per-class metrics
Multi-Objective
- Metrics: Accuracy + Speed + Memory
- Reward: Pareto optimization
- Track resource usage
Best Practices
⚠️ Important Considerations
- Avoid Data Leakage: Use training/validation metrics for rewards, not test metrics
- Normalize Rewards: Keep rewards in a reasonable range (e.g., 0.0-1.0)
- Save Original Values: Always preserve original metrics with an original_* prefix
- Transform Consistently: Ensure "higher is better" for the f1/auc/gmean values used in optimization
✅ Recommended Patterns
```python
# ✅ Good: Save both original and transformed
return {
    'mase': 2.304,                    # Original value
    'f1': 0.8 - min(0.8, mase / 5),   # Transformed (higher is better)
    'original_mase': 2.304            # Explicit original backup
}

# ❌ Bad: Only transformed values
return {
    'f1': 0.8 - min(0.8, mase / 5)    # Lost original information!
}
```
API Reference
BaseEnvironment Parameters
| Parameter | Type | Description |
|---|---|---|
| `custom_metrics` | `List[str]` | List of custom metric names to track (e.g., `['mase', 'smape', 'mae']`) |
| `metric_names_mapping` | `Dict[str, str]` | Map internal names to display names (e.g., `{'f1': 'SMAPE', 'auc': 'MAE'}`) |
| `reward_function` | `Callable[[Dict], float]` | Custom reward function that takes a metrics dict and returns a float |
HPOLogger Parameters
| Parameter | Type | Description |
|---|---|---|
| `metrics_extractor` | `Callable[[Dict], Dict]` | Function to extract metrics from the hyperparams dictionary |
| `metric_names` | `Dict[str, str]` | Custom metric display names for console output |
| `custom_metrics` | `List[str]` | List of metrics to track in logs |
More Examples
For complete runnable examples, see:
- CUSTOM_METRICS_GUIDE.md - Detailed Chinese guide
- examples/timeseries_custom_metrics_example.py - Full working example