Running experiments with prompto
When running the pipeline, there are a few key classes that we will look at in this notebook:

- `Settings`: this defines the settings of the experiment pipeline, which stores the paths to the relevant data folders and the parameters for the pipeline.
- `Experiment`: this defines all the variables related to a single experiment. An 'experiment' here is defined by a particular JSONL file which contains the data/prompts for the experiment. Each line in this file is a particular input to the LLM which we will obtain a response for.
- `ExperimentPipeline`: this is the main class for running the full pipeline. The pipeline can be run using the `ExperimentPipeline.run()` method, which will continually check the input folder for new experiments to process.
    - This takes in a `Settings` object and, for each JSONL file in the input folder, it will create an `Experiment` object and run the experiments sequentially in the order they were created in the input folder.
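Before looking at each class in detail, here is a minimal sketch of how they fit together in a plain Python script (as we will see later, `ExperimentPipeline.run()` cannot be called from within a notebook):

from prompto import Settings, ExperimentPipeline

# minimal sketch: configure the pipeline and let it watch the input folder for experiments
settings = Settings(data_folder="data", max_queries=50, max_attempts=5)
pipeline = ExperimentPipeline(settings)
pipeline.run()  # continually checks the input folder for new experiments to process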
import os
from prompto import Settings, Experiment, ExperimentPipeline
Settings
The `Settings` class stores all the relevant information for the pipeline such as:
- the paths to the data folders
- the (default) maximum number of queries per minute
- the number of max retries for failed requests
- whether or not to use parallel processing of the prompts
- the maximum number of queries per minute for each group of prompts (if parallel processing is enabled) - see the Grouping prompts and specifying rate limits notebook for more information on this.
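If parallel processing is enabled, a configuration with per-group rate limits might look roughly like the sketch below. This assumes the constructor accepts `parallel` and `max_queries_dict` arguments matching the attributes printed further down, and the group names used here are purely illustrative (see the Grouping prompts and specifying rate limits notebook for the details):

# hedged sketch: enable parallel processing with illustrative per-group rate limits
parallel_settings = Settings(
    data_folder="data",
    max_queries=50,  # default maximum number of queries per minute
    max_attempts=5,
    parallel=True,
    max_queries_dict={"gemini": 30, "openai": 100},  # illustrative group names and limits
)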
if "data" not in os.listdir("."):
os.mkdir("data")
settings = Settings(data_folder="data", max_queries=50, max_attempts=5)
We can print the settings object to see the current settings easily.
print(settings)
Settings: data_folder=data, max_queries=50, max_attempts=5, parallel=False
Subfolders: input_folder=data/input, output_folder=data/output, media_folder=data/media
Here, we will just print out the attributes of the settings object to see what is stored in it (although we have just printed these above as well).
print(f"settings.data_folder: {settings.data_folder}")
print(f"settings.input_folder: {settings.input_folder}")
print(f"settings.output_folder: {settings.output_folder}")
print(f"settings.media_folder: {settings.media_folder}")
print(f"settings.max_queries: {settings.max_queries}")
print(f"settings.max_attempts: {settings.max_attempts}")
print(f"settings.parallel: {settings.parallel}")
print(f"settings.max_queries_dict: {settings.max_queries_dict}")
settings.data_folder: data
settings.input_folder: data/input
settings.output_folder: data/output
settings.media_folder: data/media
settings.max_queries: 50
settings.max_attempts: 5
settings.parallel: False
settings.max_queries_dict: {}
Note that the `input_folder`, `output_folder` and `media_folder` attributes are read-only (by using the `@property` decorator) and so we cannot change these directly. This is because we want to keep them consistent with the `data_folder` attribute.

So if we try to set the `input_folder`, `output_folder` or `media_folder` attributes, it will raise an error:
settings.input_folder = "unknown_folder/input"
---------------------------------------------------------------------------
WriteFolderError                          Traceback (most recent call last)
Cell In[6], line 1
----> 1 settings.input_folder = "unknown_folder/input"

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/settings.py:176, in Settings.input_folder(self, value)
    174 @input_folder.setter
    175 def input_folder(self, value: str):
--> 176     raise WriteFolderError(
    177         "Cannot set input folder on it's own. Set the 'data_folder' instead"
    178     )

WriteFolderError: Cannot set input folder on it's own. Set the 'data_folder' instead
settings.output_folder = "unknown_folder/output"
---------------------------------------------------------------------------
WriteFolderError                          Traceback (most recent call last)
Cell In[7], line 1
----> 1 settings.output_folder = "unknown_folder/output"

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/settings.py:188, in Settings.output_folder(self, value)
    186 @output_folder.setter
    187 def output_folder(self, value: str):
--> 188     raise WriteFolderError(
    189         "Cannot set output folder on it's own. Set the 'data_folder' instead"
    190     )

WriteFolderError: Cannot set output folder on it's own. Set the 'data_folder' instead
settings.media_folder = "unknown_folder/media"
---------------------------------------------------------------------------
WriteFolderError                          Traceback (most recent call last)
Cell In[8], line 1
----> 1 settings.media_folder = "unknown_folder/media"

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/settings.py:200, in Settings.media_folder(self, value)
    198 @media_folder.setter
    199 def media_folder(self, value: str):
--> 200     raise WriteFolderError(
    201         "Cannot set media folder on it's own. Set the 'data_folder' instead"
    202     )

WriteFolderError: Cannot set media folder on it's own. Set the 'data_folder' instead
What is really happening under the hood is that we're using the `@property` decorator, and the setters for these attributes simply raise a `WriteFolderError`, which means we cannot change these attributes directly. Of course, we can change the underlying `_input_folder`, `_output_folder` and `_media_folder` attributes directly if we really want to, but this is not recommended.
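To illustrate the pattern, here is a simplified sketch (not the actual prompto source) of a read-only subfolder property whose setter raises an error:

# simplified sketch of the read-only property pattern (not the actual prompto source)
class SettingsSketch:
    def __init__(self, data_folder: str):
        self._data_folder = data_folder
        self._input_folder = os.path.join(data_folder, "input")

    @property
    def input_folder(self) -> str:
        # read-only view, kept consistent with the data folder
        return self._input_folder

    @input_folder.setter
    def input_folder(self, value: str):
        # mirrors prompto's WriteFolderError: direct assignment is not allowed
        raise AttributeError(
            "Cannot set input folder on its own. Set the 'data_folder' instead"
        )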
settings._input_folder = "unknown_folder/input"
settings.input_folder
'unknown_folder/input'
We can set the `data_folder` attribute to a new path if we want to change the data folder. When doing so, it will check that the folder exists; if not, we get an error:
settings.data_folder = "unknown_folder"
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[11], line 1
----> 1 settings.data_folder = "unknown_folder"

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/settings.py:162, in Settings.data_folder(self, value)
    159 @data_folder.setter
    160 def data_folder(self, value: str):
    161     # check the data folder exists
--> 162     self.check_folder_exists(value)
    163     # set the data folder
    164     self._data_folder = value

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/settings.py:114, in Settings.check_folder_exists(data_folder)
    112 # check if data folder exists
    113 if not os.path.isdir(data_folder):
--> 114     raise ValueError(
    115         f"Data folder '{data_folder}' must be a valid path to a folder"
    116     )
    118 return True

ValueError: Data folder 'unknown_folder' must be a valid path to a folder
However, if the folder does exist, it will store the new path and, importantly, this will also update the `input_folder`, `output_folder` and `media_folder` attributes accordingly.
settings.data_folder = "data2"
Notice how the `input_folder`, `output_folder` and `media_folder` attributes have been updated to the new corresponding paths.
We also create these subfolders if they do not exist.
print(settings)
Settings: data_folder=data2, max_queries=50, max_attempts=5, parallel=False
Subfolders: input_folder=data2/input, output_folder=data2/output, media_folder=data2/media
print(f"settings.data_folder: {settings.data_folder}")
print(f"settings.input_folder: {settings.input_folder}")
print(f"settings.output_folder: {settings.output_folder}")
print(f"settings.media_folder: {settings.media_folder}")
print(f"settings.max_queries: {settings.max_queries}")
print(f"settings.max_attempts: {settings.max_attempts}")
settings.data_folder: data2
settings.input_folder: data2/input
settings.output_folder: data2/output
settings.media_folder: data2/media
settings.max_queries: 50
settings.max_attempts: 5
Experiment
The `Experiment` class stores all the relevant information for a single experiment. To initialise one, we need to pass in the name of the JSONL file which contains the data for the experiment, along with a `Settings` object.
The `Experiment` class stores several attributes:

- `file_name`: the name of the JSONL file
- `experiment_name`: the file name without the `.jsonl` extension
- `settings`: the `Settings` object described above
- `output_folder`: the path to the output folder for the experiment, e.g. `data_folder/output_folder/experiment_name`
- `creation_time`: the time the experiment file was created
- `log_file`: the path to the log file for the experiment, e.g. `data_folder/output_folder/experiment_name/{creation_time}_experiment_name.log`
- `input_file_path`: the path to the input JSONL file, e.g. `data_folder/input_folder/experiment_name.jsonl`
- `output_completed_jsonl_file_path`: the path to the completed output JSONL file, e.g. `data_folder/output_folder/experiment_name/completed-experiment_name.jsonl`
- `output_input_jsonl_file_out_path`: the path to which the input JSONL file is written in the output folder, e.g. `data_folder/output_folder/experiment_name/input-experiment_name.jsonl` (this is just for logging, so we know what the input to the experiment was)
Essentially, when initialising an `Experiment` object, we construct all the paths that are relevant to that particular experiment, such as the log file, the input file path, and the file paths for storing the final output of the experiment.

We construct these paths using the `Settings` object, which tells us where all the relevant folders are.
Finally, `Experiment` also stores:

- `experiment_prompts`: a list of dictionaries (we just read in the JSONL to get these)
- `number_queries`: the number of queries in the experiment (i.e. the length of `experiment_prompts`)
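The `test.jsonl` file used below already exists in the input folder, but as a rough sketch, you could create such a file yourself by writing one JSON object per line (the file name and prompts here are purely illustrative; the fields mirror the prompt dictionaries printed further down):

import json

# illustrative sketch: write a small experiment file to the input folder
prompts = [
    {"id": 0, "api": "test", "model_name": "test", "prompt": "What is the capital of France?"},
    {"id": 1, "api": "test", "model_name": "test", "prompt": "Name three primary colours."},
]
with open("data2/input/my-experiment.jsonl", "w") as f:
    for prompt_dict in prompts:
        f.write(json.dumps(prompt_dict) + "\n")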
experiment = Experiment("test.jsonl", settings=settings)
experiment.__str__()
'test.jsonl'
experiment.creation_time
'09-07-2024-11-59-54'
experiment.experiment_name
'test'
experiment.experiment_prompts
[{'id': 9, 'prompt': ['Hello', "My name is Bob and I'm 6 years old", 'How old am I next year?'], 'api': 'unknown-api', 'model_name': 'unknown-model-name', 'parameters': {'candidate_count': 1, 'max_output_tokens': 64, 'temperature': 1, 'top_k': 40}}, {'id': 10, 'prompt': ['Can you give me a random number between 1-10?', 'What is +5 of that number?', 'What is half of that number?'], 'api': 'unknown-api', 'model_name': 'unknown-model-name', 'parameters': {'candidate_count': 1, 'max_output_tokens': 128, 'temperature': 0.5, 'top_k': 40}}, {'id': 11, 'prompt': "How many theaters are there in London's South End?", 'api': 'unknown-api', 'model_name': 'unknown-model-name'}]
experiment.number_queries
3
We can print out all the relevant information for the experiment:
print(f"experiment.file_name: {experiment.file_name}")
print(f"experiment.input_file_path: {experiment.input_file_path}")
print(f"experiment.output_folder: {experiment.output_folder}")
print(
f"experiment.output_input_jsonl_file_out_path: {experiment.output_input_jsonl_file_out_path}"
)
print(
f"experiment.output_completed_jsonl_file_path: {experiment.output_completed_jsonl_file_path}"
)
print(f"experiment.log_file: {experiment.log_file}")
experiment.file_name: test.jsonl
experiment.input_file_path: data2/input/test.jsonl
experiment.output_folder: data2/output/test
experiment.output_input_jsonl_file_out_path: data2/output/test/24-09-2024-09-13-56-input-test.jsonl
experiment.output_completed_jsonl_file_path: data2/output/test/24-09-2024-09-13-56-completed-test.jsonl
experiment.log_file: data2/output/test/24-09-2024-09-13-56-log-test.txt
Printing the object just prints out the file name.
print(experiment)
test.jsonl
f"{experiment}"
'test.jsonl'
We can process a single experiment by awaiting the async `process` method (you can also use `asyncio.run`). This will process all the prompts in the experiment and write the output to the output folder.

The method returns the list of completed prompt dictionaries (with the response from the LLM in the "response" key) and a float which is the average time taken to process and wait for the response for each prompt.
completed_responses, avg_query_processing_time = await experiment.process()
Sending 3 queries at 50 QPM with RI of 1.2s (attempt 1/5): 100%|██████████| 3/3 [00:03<00:00, 1.20s/query]
Waiting for responses (attempt 1/5): 100%|██████████| 3/3 [00:00<00:00, 352.44query/s]
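In a plain Python script (outside a notebook), the same call could instead be wrapped with `asyncio.run`, roughly like this:

import asyncio

# only outside a notebook (where no event loop is already running)
completed_responses, avg_query_processing_time = asyncio.run(experiment.process())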
Note that the completed responses are also saved in the `completed_responses` attribute of the object:
experiment.completed_responses == completed_responses
True
experiment.completed_responses
[{'id': 9, 'prompt': ['Hello', "My name is Bob and I'm 6 years old", 'How old am I next year?'], 'api': 'unknown-api', 'model_name': 'unknown-model-name', 'parameters': {'candidate_count': 1, 'max_output_tokens': 64, 'temperature': 1, 'top_k': 40}, 'response': 'NotImplementedError - API unknown-api not recognised or implemented'}, {'id': 10, 'prompt': ['Can you give me a random number between 1-10?', 'What is +5 of that number?', 'What is half of that number?'], 'api': 'unknown-api', 'model_name': 'unknown-model-name', 'parameters': {'candidate_count': 1, 'max_output_tokens': 128, 'temperature': 0.5, 'top_k': 40}, 'response': 'NotImplementedError - API unknown-api not recognised or implemented'}, {'id': 11, 'prompt': "How many theaters are there in London's South End?", 'api': 'unknown-api', 'model_name': 'unknown-model-name', 'response': 'NotImplementedError - API unknown-api not recognised or implemented'}]
After running the experiment, you can also view the output as a dataframe:
experiment.completed_responses_dataframe
|   | id | prompt | api | model_name | parameters | response |
|---|---|---|---|---|---|---|
| 0 | 9 | [Hello, My name is Bob and I'm 6 years old, Ho... | unknown-api | unknown-model-name | {'candidate_count': 1, 'max_output_tokens': 64... | NotImplementedError - API unknown-api not reco... |
| 1 | 10 | [Can you give me a random number between 1-10?... | unknown-api | unknown-model-name | {'candidate_count': 1, 'max_output_tokens': 12... | NotImplementedError - API unknown-api not reco... |
| 2 | 11 | How many theaters are there in London's South ... | unknown-api | unknown-model-name | NaN | NotImplementedError - API unknown-api not reco... |
If we look at the output, we can see that we got `NotImplementedError` responses because the API ('unknown-api') is not recognised or implemented. To see which APIs are implemented, there is a dictionary in the `apis` module called `ASYNC_APIS` where the keys are the API names and the values are the corresponding classes.
from prompto.apis import ASYNC_APIS
ASYNC_APIS
{'test': prompto.apis.testing.testing_api.TestAPI, 'azure-openai': prompto.apis.azure_openai.azure_openai.AzureOpenAIAPI, 'openai': prompto.apis.openai.openai.OpenAIAPI, 'anthropic': prompto.apis.anthropic.anthropic.AnthropicAPI, 'gemini': prompto.apis.gemini.gemini.GeminiAPI, 'vertexai': prompto.apis.vertexai.vertexai.VertexAIAPI, 'ollama': prompto.apis.ollama.ollama.OllamaAPI, 'huggingface-tgi': prompto.apis.huggingface_tgi.huggingface_tgi.HuggingfaceTGIAPI, 'quart': prompto.apis.quart.quart.QuartAPI}
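For example, a quick sketch of a sanity check is to compare each prompt's `api` value against the keys of this dictionary (for the experiment above, all three prompts use the unrecognised 'unknown-api'):

# sketch: flag any prompts whose "api" value is not an implemented async API
unrecognised = [
    prompt_dict["id"]
    for prompt_dict in experiment.experiment_prompts
    if prompt_dict.get("api") not in ASYNC_APIS
]
print(f"Prompts with unrecognised APIs: {unrecognised}")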
Experiment Pipeline
The `ExperimentPipeline` class is the main class for running the full pipeline, which will continually check the input folder for new experiments to process. To initialise, it simply takes in a `Settings` object:
pipeline = ExperimentPipeline(settings)
It stores several things such as:

- `settings`: the `Settings` object
- `average_per_query_processing_times`: a list of the average query processing times for each experiment
- `overall_avg_proc_times`: a float which is the average of the values in `average_per_query_processing_times`

These last two attributes are just for logging purposes, so we can see how long each experiment takes on average and give a very rough estimate of how long we can expect queries to take.
The object also stores `experiment_files`, which is a list of all the JSONL files in the input folder. When the pipeline is running, it will check this folder for new experiments to process and order them by creation time so that the oldest experiments are processed first.
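As an illustration (a sketch, not necessarily the pipeline's own implementation), ordering the input files by creation time could look like this:

# sketch: list JSONL files in the input folder, oldest first by creation time
jsonl_files = [
    file_name
    for file_name in os.listdir(settings.input_folder)
    if file_name.endswith(".jsonl")
]
jsonl_files.sort(key=lambda f: os.path.getctime(os.path.join(settings.input_folder, f)))
print(jsonl_files)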
print(f"pipeline.settings: {pipeline.settings}")
print(
f"pipeline.average_per_query_processing_times: {pipeline.average_per_query_processing_times}"
)
print(f"pipeline.overall_avg_proc_times: {pipeline.overall_avg_proc_times}")
print(f"pipeline.experiment_files: {pipeline.experiment_files}")
pipeline.settings: Settings: data_folder=data2, max_queries=50, max_attempts=5, parallel=False
Subfolders: input_folder=data2/input, output_folder=data2/output, media_folder=data2/media
pipeline.average_per_query_processing_times: []
pipeline.overall_avg_proc_times: 0.0
pipeline.experiment_files: []
The key method in the `ExperimentPipeline` class is the `run()` method, which will continually check the input folder for new experiments to process. When processing experiments, we create an `Experiment` object as described above and process the experiment.

You can start one from the CLI using the `run_pipeline.py` script, or just use the `prompto_run_pipeline` CLI command (also see the documentation for prompto commands). This takes in several arguments:
- `--data-folder`: the path to the data folder
- `--max-queries`: the maximum number of queries per minute
- `--max-attempts`: the maximum number of attempts for each query
- `--parallel`: whether or not to use parallel processing
- `--max-queries-json`: whether or not to group prompts
See the Grouping prompts and specifying rate limits notebook and the Specifying rate limits documentation for more information on the last two arguments.
Here, we will process the remaining experiment manually to see how the pipeline would run and process it. Currently, we only have one more experiment in the input folder to process, "test2.jsonl":
os.listdir("data2/input")
['test2.jsonl']
experiment2 = Experiment("test2.jsonl", settings=settings)
experiment2.experiment_prompts
[{'id': 9, 'prompt': ['Hello', "My name is Bob and I'm 6 years old", 'How old am I next year?'], 'api': 'test', 'model_name': 'test', 'parameters': {'candidate_count': 1, 'max_output_tokens': 64, 'temperature': 1, 'top_k': 40}}, {'id': 10, 'prompt': ['Can you give me a random number between 1-10?', 'What is +5 of that number?', 'What is half of that number?'], 'api': 'test', 'model_name': 'test', 'parameters': {'candidate_count': 1, 'max_output_tokens': 128, 'temperature': 0.5, 'top_k': 40}}, {'id': 11, 'prompt': "How many theaters are there in London's South End?", 'api': 'test', 'model_name': 'test'}]
completed_responses_2, avg_query_processing_time_2 = await experiment2.process()
Sending 3 queries at 50 QPM with RI of 1.2s (attempt 1/5): 100%|██████████| 3/3 [00:03<00:00, 1.20s/query]
Waiting for responses (attempt 1/5): 100%|██████████| 3/3 [00:00<00:00, 830.39query/s]
completed_responses_2
[{'id': 9, 'prompt': ['Hello', "My name is Bob and I'm 6 years old", 'How old am I next year?'], 'api': 'test', 'model_name': 'test', 'parameters': {'candidate_count': 1, 'max_output_tokens': 64, 'temperature': 1, 'top_k': 40}, 'timestamp_sent': '24-09-2024-09-14-39', 'response': 'This is a test response'}, {'id': 10, 'prompt': ['Can you give me a random number between 1-10?', 'What is +5 of that number?', 'What is half of that number?'], 'api': 'test', 'model_name': 'test', 'parameters': {'candidate_count': 1, 'max_output_tokens': 128, 'temperature': 0.5, 'top_k': 40}, 'timestamp_sent': '24-09-2024-09-14-40', 'response': 'This is a test response'}, {'id': 11, 'prompt': "How many theaters are there in London's South End?", 'api': 'test', 'model_name': 'test', 'timestamp_sent': '24-09-2024-09-14-41', 'response': 'ValueError - This is a test error which we should handle and return'}]
Note that we can use `ExperimentPipeline.run()` to run the pipeline and process the experiments, but we are unable to run this within a notebook as it uses `asyncio.run`, which cannot be called within a notebook where an event loop is already running. However, as mentioned above, we would typically run this from the CLI using the `prompto_run_pipeline` command.
In the terminal, move to this current directory (`prompto/examples/notebooks`) and run the following command:
prompto_run_pipeline --data-folder data2 --max-queries 50 --max-attempts 5
After the experiment has finished, check the output folder for the output file.
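For example, back in Python you could list the output folder (which follows the `data_folder/output/experiment_name` layout described earlier) to see which files were written:

# each experiment gets its own subfolder under the output folder
print(os.listdir(settings.output_folder))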