Minimising computation time

AutoEmulate can be slow if the input data has many observations (rows) or many output variables. By default, AutoEmulate cross-validates each model with 5 folds, so each model is fitted 5 times. The computation time will be relatively short for datasets of up to a few thousand data points, but some models (e.g. Gaussian Processes) don’t scale well with the number of observations, so computation time can quickly become an issue.
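As a rough way to reason about cost, the total number of fits is the number of candidate models times the number of cross validation folds. The sketch below is illustrative arithmetic only: `total_fits` is a hypothetical helper, and the model count of 10 is an assumed figure, not the actual size of AutoEmulate's model registry.

```python
def total_fits(n_models: int, n_folds: int = 5) -> int:
    """Number of model fits in one cross-validated comparison (hypothetical helper)."""
    return n_models * n_folds


n_models = 10  # assumed number of candidate models, for illustration only

print(total_fits(n_models))             # 10 models x 5 folds -> 50
print(total_fits(n_models, n_folds=3))  # 10 models x 3 folds -> 30
print(total_fits(3))                    # 3 models x 5 folds -> 15
```

Both fewer models and fewer folds shrink this product, while parallelisation leaves it unchanged and spreads the fits across cores.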

In this tutorial we walk through four strategies to speed up AutoEmulate:

  1. parallelise model fits using n_jobs

  2. restrict the range of models using the models argument

  3. run fewer cross validation folds using cross_validator

  4. for hyperparameter search:

    • all of the above

    • run fewer iterations using param_search_iters

from sklearn.datasets import make_regression
from autoemulate.compare import AutoEmulate

Let’s make a dataset.

X, y = make_regression(n_samples=500, n_features=10, n_targets=5)
X.shape, y.shape
((500, 10), (500, 5))

And see how long AutoEmulate takes to run (without hyperparameter search).

import time

start = time.time()

em = AutoEmulate()
em.setup(X, y)
em.compare()

end = time.time()
print(f"Time taken: {end - start} seconds")

AutoEmulate is set up with the following settings:

Setting                                          Values
Simulation input shape (X)                       (500, 10)
Simulation output shape (y)                      (500, 5)
Proportion of data for testing (test_set_size)   0.2
Scale input data (scale)                         True
Scaler (scaler)                                  StandardScaler
Do hyperparameter search (param_search)          False
Reduce dimensionality (reduce_dim)               False
Cross validator (cross_validator)                KFold
Parallel jobs (n_jobs)                           1
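The start/end timing pattern used in this tutorial can also be wrapped in a small reusable helper. This is a minimal sketch using only the standard library; `timed` is a hypothetical name and not part of AutoEmulate, and the short sleep merely stands in for a call such as em.compare().

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str = "Time taken"):
    """Print the wall-clock time spent inside the with-block (hypothetical helper)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.2f} seconds")


# Usage: wrap the code to be timed; the sleep is a stand-in for real work
with timed():
    time.sleep(0.1)
```

The sketch uses time.perf_counter() rather than time.time() because it is a monotonic, high-resolution clock intended for measuring intervals.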

1) parallelise model fits using n_jobs

The n_jobs parameter allows you to specify the number of CPU cores to use for parallel processing. Setting n_jobs = -1 uses all available cores, speeding up computations when working with large datasets.

Note: Maxing out all available cores might not always lead to faster computation times. Due to overhead from parallelization, memory bandwidth limitations, and potential load imbalances, using more cores can sometimes result in diminishing returns or even slower performance.
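Before choosing a value for n_jobs, it can help to check how many cores the machine reports. The sketch below uses only the standard library; leaving one core free is an assumed heuristic for illustration, not an AutoEmulate recommendation, and `suggested_n_jobs` is a hypothetical variable name.

```python
import os

# os.cpu_count() returns the number of logical CPUs, or None if it cannot be determined
n_cores = os.cpu_count() or 1

# Assumed heuristic: leave one core free for the rest of the system
suggested_n_jobs = max(1, n_cores - 1)

print(f"Logical cores: {n_cores}, suggested n_jobs: {suggested_n_jobs}")
```

The resulting value could then be passed as n_jobs when calling em.setup. It is worth timing a couple of settings, since the best value depends on the dataset and the machine.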

Here we accomplish a speed-up by setting n_jobs to 5.

start = time.time()

em = AutoEmulate()
em.setup(X, y, n_jobs=5, print_setup=False)
em.compare()

end = time.time()
print(f"Time taken: {end - start} seconds")
Time taken: 19.856790781021118 seconds

2) restrict the range of models

Another approach is to limit the range of models by selecting a subset of relevant types based on your domain and problem expertise. This selection process typically considers factors such as the nature of the problem, data characteristics or the need for interpretability. By narrowing down the types of models, you can reduce the computational burden and focus on the most promising architectures for your specific task.

em = AutoEmulate()
em.setup(X, y, print_setup=False)

# let's see all models, which we can refer to by short or full name
em.model_registry.get_model_names()

# setup with fewer models
start = time.time()

em.setup(X, y, models=["sop", "rbf", "gb"], print_setup=False)
em.compare()

end = time.time()
print(f"Time taken: {end - start} seconds")
Time taken: 2.7409257888793945 seconds

3) reduce the number of folds in cross validation using cross_validator

With larger datasets, you might initially want to set the number of cross validation folds to 3 instead of the default 5, so that each model is fitted fewer times. AutoEmulate takes a cross_validator argument, which accepts a scikit-learn cross validator or splitter. Let’s use KFold with 3 splits, which saves 2 fits per model.

from sklearn.model_selection import KFold

start = time.time()

em = AutoEmulate()
em.setup(X, y, cross_validator=KFold(n_splits=3), print_setup=False)
em.compare()

end = time.time()
print(f"Time taken: {end - start} seconds")
Time taken: 15.135623931884766 seconds
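To see what the cross validator does with the data, the sketch below uses scikit-learn's KFold directly on an array with the same shape as the dataset above. It is independent of AutoEmulate and only illustrates that each split corresponds to one fit per model; the placeholder array `X_demo` is an assumption for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold

X_demo = np.zeros((500, 10))  # placeholder with the same shape as X above

for n_splits in (5, 3):
    cv = KFold(n_splits=n_splits)
    # each (train, test) index pair corresponds to one fit per model
    sizes = [(len(train), len(test)) for train, test in cv.split(X_demo)]
    print(n_splits, "splits ->", sizes)
```

With fewer splits, each fit trains on a smaller share of the data (about two thirds instead of four fifths), so the time saved comes with somewhat noisier performance estimates.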