First-Time Users’ Frequently Asked Questions
General Questions
What is AutoEmulate?
A Python package that makes it easy to build emulators for complex simulations. It takes a set of simulation inputs X and outputs y, and automatically fits, optimises and evaluates various machine learning models to find the best emulator model. The emulator model can then be used as a drop-in replacement for the simulation, but will be much faster and computationally cheaper to evaluate.
How do I install AutoEmulate?
See the installation guide for detailed instructions.
What are the prerequisites for using AutoEmulate?
AutoEmulate is designed to be easy to use. The user first has to generate a dataset of simulation inputs X and outputs y, and should ideally have a basic understanding of Python and machine learning concepts.
Usage Questions
How do I start using AutoEmulate with my simulation?
See the getting started guide or a more in-depth tutorial.
What kind of data does AutoEmulate need to build an emulator?
AutoEmulate takes simulation inputs X and simulation outputs y to build an emulator. X is an ndarray of shape (n_samples, n_parameters) and y is an ndarray of shape (n_samples, n_outputs). Each sample here is a simulation run, so each row of X holds one set of input parameters and the same row of y holds the simulation output for that run. Currently, all inputs and outputs should be numeric, and we don’t support missing data.
All models work with multi-output data. We have optimised AutoEmulate to work with smaller datasets (on the order of hundreds to thousands of samples). Training emulators with large datasets (hundreds of thousands of samples) may currently require a long time and is not recommended.
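As a rough illustration of the data layout described above, the following sketch checks that X and y have matching rows, are numeric, and contain no missing values. The helper `check_emulator_data` is hypothetical and not part of AutoEmulate; reshaping a 1-D y into a single column is our assumption for the single-output case.

```python
import numpy as np


def check_emulator_data(X, y):
    """Check that X and y look like the arrays described above.

    Hypothetical helper for illustration; it is not part of AutoEmulate.
    """
    X = np.asarray(X, dtype=float)  # raises if entries are not numeric
    y = np.asarray(y, dtype=float)
    if y.ndim == 1:
        # Assumption: treat a single output as one column
        y = y.reshape(-1, 1)
    if X.ndim != 2 or y.ndim != 2:
        raise ValueError("X and y must be 2-D: (n_samples, n_columns)")
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y must have one row per simulation run")
    if np.isnan(X).any() or np.isnan(y).any():
        raise ValueError("missing values (NaN) are not supported")
    return X, y


# Toy data: 200 simulation runs, 3 input parameters, 2 outputs
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = np.column_stack([X.sum(axis=1), X.prod(axis=1)])

X_checked, y_checked = check_emulator_data(X, y)
print(X_checked.shape, y_checked.shape)  # (200, 3) (200, 2)
```

Running a check like this before building an emulator makes shape mismatches and stray NaN values visible early, rather than partway through model fitting.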
How do I interpret the results from AutoEmulate?
See the tutorial for an example of how to interpret the results from AutoEmulate. Briefly, X and y are first split into training and test sets. Cross-validation and/or hyperparameter optimisation are performed on the training data. After comparing the results from different emulators, the user can evaluate the chosen emulator on the test set with AutoEmulate.evaluate_model(), and plot test set predictions with AutoEmulate.plot_model(); see the autoemulate.compare module for details.
An important thing to note is that the emulator can only be as good as the data it was trained on. Therefore, the experimental design (the points at which the simulation was evaluated) is key to obtaining a good emulator.
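To make the train/test idea concrete, here is a minimal NumPy-only sketch of holding out a test set and scoring a model on it. It does not use AutoEmulate and is not how the package is implemented: the toy data, the 80/20 split, and the least-squares "emulator" are all stand-ins chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "simulation": 300 runs, 2 input parameters, 1 output
X = rng.uniform(-1.0, 1.0, size=(300, 2))
y = 3.0 * X[:, [0]] - 2.0 * X[:, [1]] + 0.05 * rng.normal(size=(300, 1))

# Hold out 20% of the runs as a test set
idx = rng.permutation(len(X))
n_test = len(X) // 5
test_idx, train_idx = idx[:n_test], idx[n_test:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]


def fit_linear(X, y):
    """Least-squares fit with an intercept; stand-in for a real emulator."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict outputs for new inputs with the fitted coefficients."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 is perfect, 0 matches the mean."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot


coef = fit_linear(X_train, y_train)  # fit on training data only
test_r2 = r2_score(y_test, predict_linear(coef, X_test))  # score on unseen runs
print(f"test R^2: {test_r2:.3f}")
```

The point of the split is that the test score is computed on runs the model never saw during fitting, so it gives a fairer picture of how the emulator behaves on new inputs.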
Can I use AutoEmulate for commercial purposes?
Yes. It’s licensed under the MIT license, which allows for commercial use. See the license for more information.
Advanced Usage
Does AutoEmulate support parallel processing or high-performance computing (HPC) environments?
Yes, AutoEmulate.setup() has an n_jobs parameter, which allows cross-validation and hyperparameter optimisation to be run in parallel.
Can AutoEmulate be integrated with other data analysis or simulation tools?
AutoEmulate takes simple X and y ndarrays as input, and returns emulator models that can be saved and loaded with joblib. All emulators are written as scikit-learn estimators, so they can be used like any other scikit-learn model in a pipeline.
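The sketch below shows what this looks like for a generic scikit-learn estimator: it is placed in a pipeline, fitted, saved with joblib and loaded again. The Gaussian process regressor, the toy data and the file name are stand-ins we chose for illustration; an emulator returned by AutoEmulate would take the place of `emulator`.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy data: 50 simulation runs, 2 input parameters, 1 output
rng = np.random.default_rng(1)
X = rng.uniform(size=(50, 2))
y = np.sin(X[:, 0]) + X[:, 1] ** 2

# Stand-in emulator: any scikit-learn regressor could be used here
emulator = make_pipeline(
    StandardScaler(),
    GaussianProcessRegressor(alpha=1e-6, random_state=0),
)
emulator.fit(X, y)

# Save the fitted model to disk and load it back
path = os.path.join(tempfile.mkdtemp(), "emulator.joblib")
joblib.dump(emulator, path)
restored = joblib.load(path)

# The restored model should reproduce the original predictions
same = np.allclose(emulator.predict(X), restored.predict(X))
print(same)
```

As with any pickled Python object, a file written by joblib is best loaded with the same library versions it was saved with.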
Data Handling
What are the best practices for data preprocessing before using AutoEmulate?
The user will typically run their simulation on a selected set of input parameters (the experimental design), chosen with a Latin hypercube or another sampling method. AutoEmulate currently needs all inputs to be numeric and we don’t support missing data. By default, AutoEmulate will scale the input data to zero mean and unit variance, and there’s the option to do dimensionality reduction in setup().
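For readers unfamiliar with Latin hypercube sampling, here is a small NumPy sketch of the idea, followed by the zero-mean, unit-variance scaling mentioned above. The function `latin_hypercube` and the parameter ranges are hypothetical and written for illustration; they are not AutoEmulate's own sampling or scaling code.

```python
import numpy as np


def latin_hypercube(n_samples, bounds, rng):
    """Draw a Latin hypercube design.

    bounds is a list of (low, high) pairs, one per input parameter.
    Each parameter's range is cut into n_samples equal strata and
    every stratum is sampled exactly once.
    """
    bounds = np.asarray(bounds, dtype=float)
    n_params = len(bounds)
    # One random point inside each stratum, expressed in [0, 1)
    jitter = rng.uniform(size=(n_samples, n_params))
    unit = (np.arange(n_samples)[:, None] + jitter) / n_samples
    # Shuffle the strata independently for each parameter
    for j in range(n_params):
        unit[:, j] = rng.permutation(unit[:, j])
    low, high = bounds[:, 0], bounds[:, 1]
    return low + unit * (high - low)


rng = np.random.default_rng(0)
# Hypothetical parameter ranges for a two-parameter simulation
X = latin_hypercube(100, [(0.0, 1.0), (10.0, 50.0)], rng)

# Standardise each column to zero mean and unit variance
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
print(X.shape, X_scaled.mean(axis=0).round(6), X_scaled.std(axis=0).round(6))
```

Each row of X is then one set of inputs to run through the simulation. SciPy also ships a Latin hypercube sampler in scipy.stats.qmc, which may be preferable to a hand-written one.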
How does AutoEmulate handle large datasets?
AutoEmulate is optimised to work with smaller datasets (on the order of hundreds to thousands of samples). Training emulators with large datasets (hundreds of thousands of samples) may currently require a long time and is not recommended. Emulators are created because it’s expensive to evaluate the simulation, so we expect most users to have a relatively small dataset.
Troubleshooting
What common issues might I encounter when using AutoEmulate, and how can I solve them?
AutoEmulate.setup() has a log_to_file option to log all warnings/errors to a file. It also has a verbose option to print more information to the console. If you encounter an error, please open an issue (see below).
How can I report a bug or request a feature in AutoEmulate?
You can report a bug or request a new feature through the issue templates in our GitHub repository. Choose the template that fits your purpose and fill it in.
Community and Learning Resources
Are there any community projects or collaborations using AutoEmulate I can join or learn from?
Where can I find tutorials or case studies on using AutoEmulate?
See the tutorial for a comprehensive guide on using the package.
How can I stay updated on new releases or updates to AutoEmulate?
Watch the AutoEmulate repository.
What support options are available if I need help with AutoEmulate?
Please open an issue or contact the maintainer (email) directly.