autoemulate.core.model_selection
- evaluate(y_pred, y_true, metric=MetricConfig(name=r2, maximize=True))
- Evaluate Emulator prediction performance using a torchmetrics.Metric.
- Parameters:
- y_true (TensorLike) – Ground truth target values. 
- y_pred (TensorLike) – Predicted target values, as returned by an Emulator. 
- metric (Metric) – Metric to use for evaluation. Defaults to R2. 
 
- Return type:
 
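To make the default metric concrete, here is a minimal pure-Python sketch of what an R2 score measures. It is not the library's implementation (the documented function evaluates a torchmetrics.Metric on tensors); `r2_score` and `is_better` are hypothetical helpers written for this illustration, with the argument order mirroring `evaluate(y_pred, y_true)`.

```python
def r2_score(y_pred, y_true):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def is_better(new_score, old_score, maximize=True):
    """Hypothetical helper: compare two scores given the metric's direction."""
    return new_score > old_score if maximize else new_score < old_score


y_true = [1.0, 2.0, 3.0, 4.0]
y_perfect = [1.0, 2.0, 3.0, 4.0]      # exact predictions
y_mean_only = [2.5, 2.5, 2.5, 2.5]    # always predicts the mean of y_true

score_perfect = r2_score(y_perfect, y_true)
score_baseline = r2_score(y_mean_only, y_true)
```

A perfect prediction scores 1 and a constant mean prediction scores 0, which is why R2 is treated as a metric to maximize (`maximize=True` in the default shown in the signature).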
- cross_validate(cv, dataset, model, model_params, transformed_emulator_params=None, x_transforms=None, y_transforms=None, device='cpu', random_seed=None, metrics=None)
- Cross validate model performance using the given cv strategy.
- Parameters:
- cv (BaseCrossValidator) – Provides split method that returns train/val Dataset indices using the specified cross-validation strategy (e.g., KFold, LeaveOneOut). 
- dataset (Dataset) – The data to use for model training and validation. 
- model (Emulator) – An instance of an Emulator subclass. 
- model_params (ModelParams) – Parameters used to construct the model on initialization. Passing an empty dictionary {} uses the default parameters. 
- transformed_emulator_params (None | TransformedEmulatorParams) – Parameters for the transformed emulator. Defaults to None. 
- device (DeviceLike) – The device to use for model training and evaluation. 
- random_seed (int | None) – Optional random seed for reproducibility. 
- metrics (list[TorchMetrics] | None) – List of metrics to compute. If None, uses r2 and rmse. 
 
- Returns:
- Scores for each metric, computed for each cross-validation fold. 
- Return type:
 
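The general cross-validation loop described above can be sketched in pure Python. This is an illustration under stated assumptions, not the library's code: `kfold_indices`, `fit_line`, and `cross_validate_sketch` are hypothetical names, a least-squares line stands in for an Emulator, and the folds are contiguous and unshuffled. The result holds one score per fold for each metric.

```python
import math


def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, val_idx) pairs for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (stand-in for an emulator)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_pred, y_true):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true))


def cross_validate_sketch(xs, ys, n_splits):
    """Fit on each training split, score on the held-out split."""
    scores = {"rmse": []}
    for train_idx, val_idx in kfold_indices(len(xs), n_splits):
        slope, intercept = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        preds = [slope * xs[i] + intercept for i in val_idx]
        scores["rmse"].append(rmse(preds, [ys[i] for i in val_idx]))
    return scores


xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]  # noise-free linear data
cv_scores = cross_validate_sketch(xs, ys, n_splits=5)
```

Because the data here are exactly linear, every fold's RMSE is zero up to floating-point error; with real data the per-fold spread indicates how sensitive the model is to the choice of training split.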
- bootstrap(model, x, y, n_bootstraps=100, n_samples=100, device='cpu', metrics=None)
- Get bootstrap estimates of metrics.
- Parameters:
- model (Emulator) – An instance of an Emulator subclass. 
- x (TensorLike) – Input features for the model. 
- y (TensorLike) – Target values corresponding to the input features. 
- n_bootstraps (int | None) – Number of bootstrap samples to generate. When None, the evaluation uses all the given data and returns a single value with no measure of uncertainty. Defaults to 100. 
- n_samples (int) – Number of samples to draw to estimate the predictive mean when the emulator does not provide a mean directly. Defaults to 100. 
- device (str | torch.device) – The device to use for computations. Defaults to 'cpu'. 
- metrics (list[MetricConfig] | None) – List of metrics to compute. If None, uses r2 and rmse. 
 
- Returns:
- Dictionary mapping metric names to (mean, std) tuples. 
- Return type:
 
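The idea of a bootstrap estimate can be sketched in pure Python as follows. This is a simplified illustration, not the library's implementation: `bootstrap_sketch` is a hypothetical function that resamples precomputed predictions rather than calling a model, it computes only RMSE, and it uses the population standard deviation across resamples. Reporting a spread of 0.0 when `n_bootstraps` is None is a choice made for this sketch only.

```python
import math
import random
import statistics


def rmse(y_pred, y_true):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true))


def bootstrap_sketch(y_pred, y_true, n_bootstraps=100, seed=0):
    """Return {metric_name: (mean, std)} over bootstrap resamples."""
    if n_bootstraps is None:
        # Single evaluation on all data; no uncertainty estimate is available,
        # so this sketch reports the spread as 0.0 (an assumption made here).
        return {"rmse": (rmse(y_pred, y_true), 0.0)}
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_bootstraps):
        # Draw n indices with replacement and score that resample.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(rmse([y_pred[i] for i in idx], [y_true[i] for i in idx]))
    return {"rmse": (statistics.mean(scores), statistics.pstdev(scores))}


y_true = [float(i) for i in range(20)]
# Prediction errors cycle through 0.0, 0.1, 0.2, 0.3, 0.4.
y_pred = [t + 0.1 * (i % 5) for i, t in enumerate(y_true)]

result = bootstrap_sketch(y_pred, y_true, n_bootstraps=50, seed=0)
point_estimate = bootstrap_sketch(y_pred, y_true, n_bootstraps=None)
```

The standard deviation across resamples gives a rough sense of how much the metric would vary on a different sample of the same size, which a single evaluation on all the data cannot provide.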
