Running experiments with prompto
When running the pipeline, there are a few key classes that we will look at in this notebook:

- `Settings`: this defines the settings of the experiment pipeline, which stores the paths to the relevant data folders and the parameters for the pipeline.
- `Experiment`: this defines all the variables related to a single experiment. An 'experiment' here is defined by a particular JSONL file which contains the data/prompts for the experiment. Each line in this file is a particular input to the LLM which we will obtain a response for.
- `ExperimentPipeline`: this is the main class for running the full pipeline. The pipeline can be run using the `ExperimentPipeline.run()` method, which will continually check the input folder for new experiments to process.
    - This takes in a `Settings` object and, for each JSONL file in the input folder, it will create an `Experiment` object and run the experiments sequentially in the order they were created in the input folder.
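Before looking at each class in detail, here is a minimal sketch of how they fit together in a plain Python script (as we will see later, `ExperimentPipeline.run()` cannot be called from within a notebook):

from prompto import Settings, ExperimentPipeline

# minimal sketch: configure the pipeline and let it watch the input folder for experiments
settings = Settings(data_folder="data", max_queries=50, max_attempts=5)
pipeline = ExperimentPipeline(settings)
pipeline.run()  # continually checks the input folder for new experiments to process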
import os
from prompto import Settings, Experiment, ExperimentPipeline
Settings
The `Settings` class stores all the relevant information for the pipeline such as:
- the paths to the data folders
- the (default) maximum number of queries per minute
- the number of max retries for failed requests
- whether or not to use parallel processing of the prompts
- the maximum number of queries per minute for each group of prompts (if parallel processing is enabled) - see the Grouping prompts and specifying rate limits notebook for more information on this.
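If parallel processing is enabled, a configuration with per-group rate limits might look roughly like the sketch below. This assumes the constructor accepts `parallel` and `max_queries_dict` arguments matching the attributes printed further down, and the group names used here are purely illustrative (see the Grouping prompts and specifying rate limits notebook for the details):

# hedged sketch: enable parallel processing with illustrative per-group rate limits
parallel_settings = Settings(
    data_folder="data",
    max_queries=50,  # default maximum number of queries per minute
    max_attempts=5,
    parallel=True,
    max_queries_dict={"gemini": 30, "openai": 100},  # illustrative group names and limits
)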
if "data" not in os.listdir("."):
os.mkdir("data")
settings = Settings(data_folder="data", max_queries=50, max_attempts=5)
We can print the settings object to see the current settings easily.
print(settings)
Settings: data_folder=data, max_queries=50, max_attempts=5, parallel=False
Subfolders: input_folder=data/input, output_folder=data/output, media_folder=data/media
Here, we will just print out the attributes of the settings object to see what is stored in it (although we have just printed these above as well).
print(f"settings.data_folder: {settings.data_folder}")
print(f"settings.input_folder: {settings.input_folder}")
print(f"settings.output_folder: {settings.output_folder}")
print(f"settings.media_folder: {settings.media_folder}")
print(f"settings.max_queries: {settings.max_queries}")
print(f"settings.max_attempts: {settings.max_attempts}")
print(f"settings.parallel: {settings.parallel}")
print(f"settings.max_queries_dict: {settings.max_queries_dict}")
settings.data_folder: data
settings.input_folder: data/input
settings.output_folder: data/output
settings.media_folder: data/media
settings.max_queries: 50
settings.max_attempts: 5
settings.parallel: False
settings.max_queries_dict: {}
Note that the `input_folder`, `output_folder` and `media_folder` attributes are read-only (by using the `@property` decorator) and so we cannot change these directly. This is because we want to keep them consistent with the `data_folder` attribute.

So if we try to set the `input_folder`, `output_folder` or `media_folder` attributes, it will raise an error:
settings.input_folder = "unknown_folder/input"
---------------------------------------------------------------------------
WriteFolderError                          Traceback (most recent call last)
Cell In[6], line 1
----> 1 settings.input_folder = "unknown_folder/input"

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/settings.py:176, in Settings.input_folder(self, value)
    174 @input_folder.setter
    175 def input_folder(self, value: str):
--> 176     raise WriteFolderError(
    177         "Cannot set input folder on it's own. Set the 'data_folder' instead"
    178     )

WriteFolderError: Cannot set input folder on it's own. Set the 'data_folder' instead
settings.output_folder = "unknown_folder/output"
---------------------------------------------------------------------------
WriteFolderError                          Traceback (most recent call last)
Cell In[7], line 1
----> 1 settings.output_folder = "unknown_folder/output"

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/settings.py:188, in Settings.output_folder(self, value)
    186 @output_folder.setter
    187 def output_folder(self, value: str):
--> 188     raise WriteFolderError(
    189         "Cannot set output folder on it's own. Set the 'data_folder' instead"
    190     )

WriteFolderError: Cannot set output folder on it's own. Set the 'data_folder' instead
settings.media_folder = "unknown_folder/media"
---------------------------------------------------------------------------
WriteFolderError                          Traceback (most recent call last)
Cell In[8], line 1
----> 1 settings.media_folder = "unknown_folder/media"

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/settings.py:200, in Settings.media_folder(self, value)
    198 @media_folder.setter
    199 def media_folder(self, value: str):
--> 200     raise WriteFolderError(
    201         "Cannot set media folder on it's own. Set the 'data_folder' instead"
    202     )

WriteFolderError: Cannot set media folder on it's own. Set the 'data_folder' instead
What is really happening under the hood is that we're using the `@property` decorator, and the setters for these attributes simply raise a `WriteFolderError`, which means we cannot change these attributes directly. Of course, we can change the underlying `_input_folder`, `_output_folder` and `_media_folder` attributes directly if we really want to, but this is not recommended.
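To illustrate the pattern, here is a simplified sketch (not the actual prompto source) of a read-only subfolder property whose setter raises an error:

# simplified sketch of the read-only property pattern (not the actual prompto source)
class SettingsSketch:
    def __init__(self, data_folder: str):
        self._data_folder = data_folder
        self._input_folder = os.path.join(data_folder, "input")

    @property
    def input_folder(self) -> str:
        # read-only view, kept consistent with the data folder
        return self._input_folder

    @input_folder.setter
    def input_folder(self, value: str):
        # mirrors prompto's WriteFolderError: direct assignment is not allowed
        raise AttributeError(
            "Cannot set input folder on its own. Set the 'data_folder' instead"
        )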
settings._input_folder = "unknown_folder/input"
settings.input_folder
'unknown_folder/input'
We can set the `data_folder` attribute to a new path if we want to change the data folder. When doing so, it will check that the folder exists; if not, we get an error:
settings.data_folder = "unknown_folder"
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[11], line 1
----> 1 settings.data_folder = "unknown_folder"

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/settings.py:162, in Settings.data_folder(self, value)
    159 @data_folder.setter
    160 def data_folder(self, value: str):
    161     # check the data folder exists
--> 162     self.check_folder_exists(value)
    163     # set the data folder
    164     self._data_folder = value

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/settings.py:114, in Settings.check_folder_exists(data_folder)
    112 # check if data folder exists
    113 if not os.path.isdir(data_folder):
--> 114     raise ValueError(
    115         f"Data folder '{data_folder}' must be a valid path to a folder"
    116     )
    118 return True

ValueError: Data folder 'unknown_folder' must be a valid path to a folder
However, if the folder does exist, it will store the new path and, importantly, this will also update the `input_folder`, `output_folder` and `media_folder` attributes accordingly.
settings.data_folder = "data2"
Notice how the `input_folder`, `output_folder` and `media_folder` attributes have been updated to the new corresponding paths.
We also create these subfolders if they do not exist.
print(settings)
Settings: data_folder=data2, max_queries=50, max_attempts=5, parallel=False
Subfolders: input_folder=data2/input, output_folder=data2/output, media_folder=data2/media
print(f"settings.data_folder: {settings.data_folder}")
print(f"settings.input_folder: {settings.input_folder}")
print(f"settings.output_folder: {settings.output_folder}")
print(f"settings.media_folder: {settings.media_folder}")
print(f"settings.max_queries: {settings.max_queries}")
print(f"settings.max_attempts: {settings.max_attempts}")
settings.data_folder: data2
settings.input_folder: data2/input
settings.output_folder: data2/output
settings.media_folder: data2/media
settings.max_queries: 50
settings.max_attempts: 5
Experiment
The `Experiment` class stores all the relevant information for a single experiment. To initialise one, we need to pass in the name of the JSONL file which contains the data for the experiment, along with a `Settings` object.
The `Experiment` class stores several attributes:

- `file_name`: the name of the JSONL file
- `experiment_name`: the file name without the `.jsonl` extension
- `settings`: the `Settings` object described above
- `output_folder`: the path to the output folder for the experiment, e.g. `data_folder/output_folder/experiment_name`
- `creation_time`: the time the experiment file was created
- `log_file`: the path to the log file for the experiment, e.g. `data_folder/output_folder/experiment_name/{creation_time}_experiment_name.log`
- `input_file_path`: the path to the input JSONL file, e.g. `data_folder/input_folder/experiment_name.jsonl`
- `output_completed_jsonl_file_path`: the path to the completed output JSONL file, e.g. `data_folder/output_folder/experiment_name/completed-experiment_name.jsonl`
- `output_input_jsonl_file_out_path`: the path to which the input JSONL file is written in the output folder, e.g. `data_folder/output_folder/experiment_name/input-experiment_name.jsonl` (this is just for logging, so we know what the input to the experiment was)
Essentially, when initialising an `Experiment` object, we construct all the paths that are relevant to that particular experiment, such as the log file, the input file path, and the file paths for storing the final output of the experiment.

We construct these paths using the `Settings` object, which tells us where all the relevant folders are.
Finally, `Experiment` also stores:

- `experiment_prompts`: a list of dictionaries (we just read in the JSONL to get these)
- `number_queries`: the number of queries in the experiment (i.e. the length of `experiment_prompts`)
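The `test.jsonl` file used below already exists in the input folder, but as a rough sketch, you could create such a file yourself by writing one JSON object per line (the file name and prompts here are purely illustrative; the fields mirror the prompt dictionaries printed further down):

import json

# illustrative sketch: write a small experiment file to the input folder
prompts = [
    {"id": 0, "api": "test", "model_name": "test", "prompt": "What is the capital of France?"},
    {"id": 1, "api": "test", "model_name": "test", "prompt": "Name three primary colours."},
]
with open("data2/input/my-experiment.jsonl", "w") as f:
    for prompt_dict in prompts:
        f.write(json.dumps(prompt_dict) + "\n")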
experiment = Experiment("test.jsonl", settings=settings)
experiment.__str__()
'test.jsonl'
experiment.creation_time
'09-07-2024-11-59-54'
experiment.experiment_name
'test'
experiment.experiment_prompts
[{'id': 9, 'prompt': ['Hello', "My name is Bob and I'm 6 years old", 'How old am I next year?'], 'api': 'unknown-api', 'model_name': 'unknown-model-name', 'parameters': {'candidate_count': 1, 'max_output_tokens': 64, 'temperature': 1, 'top_k': 40}}, {'id': 10, 'prompt': ['Can you give me a random number between 1-10?', 'What is +5 of that number?', 'What is half of that number?'], 'api': 'unknown-api', 'model_name': 'unknown-model-name', 'parameters': {'candidate_count': 1, 'max_output_tokens': 128, 'temperature': 0.5, 'top_k': 40}}, {'id': 11, 'prompt': "How many theaters are there in London's South End?", 'api': 'unknown-api', 'model_name': 'unknown-model-name'}]
experiment.number_queries
3
We can print out all the relevant information for the experiment:
print(f"experiment.file_name: {experiment.file_name}")
print(f"experiment.input_file_path: {experiment.input_file_path}")
print(f"experiment.output_folder: {experiment.output_folder}")
print(
f"experiment.output_input_jsonl_file_out_path: {experiment.output_input_jsonl_file_out_path}"
)
print(
f"experiment.output_completed_jsonl_file_path: {experiment.output_completed_jsonl_file_path}"
)
print(f"experiment.log_file: {experiment.log_file}")
experiment.file_name: test.jsonl
experiment.input_file_path: data2/input/test.jsonl
experiment.output_folder: data2/output/test
experiment.output_input_jsonl_file_out_path: data2/output/test/24-09-2024-09-13-56-input-test.jsonl
experiment.output_completed_jsonl_file_path: data2/output/test/24-09-2024-09-13-56-completed-test.jsonl
experiment.log_file: data2/output/test/24-09-2024-09-13-56-log-test.txt
Printing the object just prints out the file name.
print(experiment)
test.jsonl
f"{experiment}"
'test.jsonl'
We can process a single experiment by awaiting the async `process` method (you can also use `asyncio.run`). This will process all the prompts in the experiment and write the output to the output folder.

The method returns the list of completed prompt dictionaries (with the response from the LLM in the "response" key) and a float which is the average time taken to process and wait for the response for each prompt.
completed_responses, avg_query_processing_time = await experiment.process()
Sending 3 queries at 50 QPM with RI of 1.2s (attempt 1/5): 100%|██████████| 3/3 [00:03<00:00, 1.20s/query]
Waiting for responses (attempt 1/5): 100%|██████████| 3/3 [00:00<00:00, 352.44query/s]
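In a plain Python script (outside a notebook), the same call could instead be wrapped with `asyncio.run`, roughly like this:

import asyncio

# only outside a notebook (where no event loop is already running)
completed_responses, avg_query_processing_time = asyncio.run(experiment.process())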
Note that the completed responses are also saved in the `completed_responses` attribute of the object:
experiment.completed_responses == completed_responses
True
experiment.completed_responses
[{'id': 9, 'prompt': ['Hello', "My name is Bob and I'm 6 years old", 'How old am I next year?'], 'api': 'unknown-api', 'model_name': 'unknown-model-name', 'parameters': {'candidate_count': 1, 'max_output_tokens': 64, 'temperature': 1, 'top_k': 40}, 'response': 'NotImplementedError - API unknown-api not recognised or implemented'}, {'id': 10, 'prompt': ['Can you give me a random number between 1-10?', 'What is +5 of that number?', 'What is half of that number?'], 'api': 'unknown-api', 'model_name': 'unknown-model-name', 'parameters': {'candidate_count': 1, 'max_output_tokens': 128, 'temperature': 0.5, 'top_k': 40}, 'response': 'NotImplementedError - API unknown-api not recognised or implemented'}, {'id': 11, 'prompt': "How many theaters are there in London's South End?", 'api': 'unknown-api', 'model_name': 'unknown-model-name', 'response': 'NotImplementedError - API unknown-api not recognised or implemented'}]
After running the experiment, you can also view the output as a dataframe:
experiment.completed_responses_dataframe
|   | id | prompt | api | model_name | parameters | response |
|---|---|---|---|---|---|---|
| 0 | 9 | [Hello, My name is Bob and I'm 6 years old, Ho... | unknown-api | unknown-model-name | {'candidate_count': 1, 'max_output_tokens': 64... | NotImplementedError - API unknown-api not reco... |
| 1 | 10 | [Can you give me a random number between 1-10?... | unknown-api | unknown-model-name | {'candidate_count': 1, 'max_output_tokens': 12... | NotImplementedError - API unknown-api not reco... |
| 2 | 11 | How many theaters are there in London's South ... | unknown-api | unknown-model-name | NaN | NotImplementedError - API unknown-api not reco... |
If we look at the output, we can see that we got `NotImplementedError` responses because the API ('unknown-api') is not recognised or implemented. To see which APIs are implemented, there is a dictionary in the `apis` module called `ASYNC_APIS` where the keys are the API names and the values are the corresponding classes.
from prompto.apis import ASYNC_APIS
ASYNC_APIS
{'test': prompto.apis.testing.testing_api.TestAPI, 'azure-openai': prompto.apis.azure_openai.azure_openai.AzureOpenAIAPI, 'openai': prompto.apis.openai.openai.OpenAIAPI, 'anthropic': prompto.apis.anthropic.anthropic.AnthropicAPI, 'gemini': prompto.apis.gemini.gemini.GeminiAPI, 'vertexai': prompto.apis.vertexai.vertexai.VertexAIAPI, 'ollama': prompto.apis.ollama.ollama.OllamaAPI, 'huggingface-tgi': prompto.apis.huggingface_tgi.huggingface_tgi.HuggingfaceTGIAPI, 'quart': prompto.apis.quart.quart.QuartAPI}
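For example, a quick sketch of a sanity check is to compare each prompt's `api` value against the keys of this dictionary (for the experiment above, all three prompts use the unrecognised 'unknown-api'):

# sketch: flag any prompts whose "api" value is not an implemented async API
unrecognised = [
    prompt_dict["id"]
    for prompt_dict in experiment.experiment_prompts
    if prompt_dict.get("api") not in ASYNC_APIS
]
print(f"Prompts with unrecognised APIs: {unrecognised}")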
Experiment Pipeline
The `ExperimentPipeline` class is the main class for running the full pipeline, which will continually check the input folder for new experiments to process. To initialise, it simply takes in a `Settings` object:
pipeline = ExperimentPipeline(settings)
It stores several things such as:

- `settings`: the `Settings` object
- `average_per_query_processing_times`: a list of the average query processing times for each experiment
- `overall_avg_proc_times`: a float which is the average of the values in `average_per_query_processing_times`

These last two attributes are just for logging purposes, so we can see how long each experiment takes on average and give a very rough estimate of how long we can expect queries to take.
The object also stores `experiment_files`, which is a list of all the JSONL files in the input folder. When the pipeline is running, it will check this folder for new experiments to process and order them by creation time so that the oldest experiments are processed first.
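As an illustration (a sketch, not necessarily the pipeline's own implementation), ordering the input files by creation time could look like this:

# sketch: list JSONL files in the input folder, oldest first by creation time
jsonl_files = [
    file_name
    for file_name in os.listdir(settings.input_folder)
    if file_name.endswith(".jsonl")
]
jsonl_files.sort(key=lambda f: os.path.getctime(os.path.join(settings.input_folder, f)))
print(jsonl_files)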
print(f"pipeline.settings: {pipeline.settings}")
print(
f"pipeline.average_per_query_processing_times: {pipeline.average_per_query_processing_times}"
)
print(f"pipeline.overall_avg_proc_times: {pipeline.overall_avg_proc_times}")
print(f"pipeline.experiment_files: {pipeline.experiment_files}")
pipeline.settings: Settings: data_folder=data2, max_queries=50, max_attempts=5, parallel=False
Subfolders: input_folder=data2/input, output_folder=data2/output, media_folder=data2/media
pipeline.average_per_query_processing_times: []
pipeline.overall_avg_proc_times: 0.0
pipeline.experiment_files: []
The key method in the `ExperimentPipeline` class is the `run()` method, which will continually check the input folder for new experiments to process. When processing experiments, we create an `Experiment` object as described above and process the experiment.

You can start one from the CLI using the `run_pipeline.py` script, or just use the `prompto_run_pipeline` CLI command (also see the documentation for prompto commands). This takes in several arguments:
- `--data-folder`: the path to the data folder
- `--max-queries`: the maximum number of queries per minute
- `--max-attempts`: the maximum number of attempts for each query
- `--parallel`: whether or not to use parallel processing
- `--max-queries-json`: whether or not to group prompts
See the Grouping prompts and specifying rate limits notebook and the Specifying rate limits documentation for more information on the last two arguments.
Here, we will process the remaining experiment manually to see how the pipeline would run and process it. Currently, we only have one more experiment in the input folder to process, "test2.jsonl":
os.listdir("data2/input")
['test2.jsonl']
experiment2 = Experiment("test2.jsonl", settings=settings)
experiment2.experiment_prompts
[{'id': 9, 'prompt': ['Hello', "My name is Bob and I'm 6 years old", 'How old am I next year?'], 'api': 'test', 'model_name': 'test', 'parameters': {'candidate_count': 1, 'max_output_tokens': 64, 'temperature': 1, 'top_k': 40}}, {'id': 10, 'prompt': ['Can you give me a random number between 1-10?', 'What is +5 of that number?', 'What is half of that number?'], 'api': 'test', 'model_name': 'test', 'parameters': {'candidate_count': 1, 'max_output_tokens': 128, 'temperature': 0.5, 'top_k': 40}}, {'id': 11, 'prompt': "How many theaters are there in London's South End?", 'api': 'test', 'model_name': 'test'}]
completed_responses_2, avg_query_processing_time_2 = await experiment2.process()
Sending 3 queries at 50 QPM with RI of 1.2s (attempt 1/5): 100%|██████████| 3/3 [00:03<00:00, 1.20s/query]
Waiting for responses (attempt 1/5): 100%|██████████| 3/3 [00:00<00:00, 830.39query/s]
completed_responses_2
[{'id': 9, 'prompt': ['Hello', "My name is Bob and I'm 6 years old", 'How old am I next year?'], 'api': 'test', 'model_name': 'test', 'parameters': {'candidate_count': 1, 'max_output_tokens': 64, 'temperature': 1, 'top_k': 40}, 'timestamp_sent': '24-09-2024-09-14-39', 'response': 'This is a test response'}, {'id': 10, 'prompt': ['Can you give me a random number between 1-10?', 'What is +5 of that number?', 'What is half of that number?'], 'api': 'test', 'model_name': 'test', 'parameters': {'candidate_count': 1, 'max_output_tokens': 128, 'temperature': 0.5, 'top_k': 40}, 'timestamp_sent': '24-09-2024-09-14-40', 'response': 'This is a test response'}, {'id': 11, 'prompt': "How many theaters are there in London's South End?", 'api': 'test', 'model_name': 'test', 'timestamp_sent': '24-09-2024-09-14-41', 'response': 'ValueError - This is a test error which we should handle and return'}]
Note that we can use `ExperimentPipeline.run()` to run the pipeline and process the experiments, but we are unable to run this within a notebook as it uses `asyncio.run`, which cannot be called within a notebook where an event loop is already running. However, as mentioned above, we would typically run this from the CLI using the `prompto_run_pipeline` command.
In the terminal, move to this current directory (`prompto/examples/notebooks`) and run the following command:
prompto_run_pipeline --data-folder data2 --max-queries 50 --max-attempts 5
After the experiment has finished, check the output folder for the output file.
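For example, back in Python you could list the output folder (which follows the `data_folder/output/experiment_name` layout described earlier) to see which files were written:

# each experiment gets its own subfolder under the output folder
print(os.listdir(settings.output_folder))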