Grouping prompts and specifying rate limits¶
When running the pipeline or an experiment, there are several settings that control how experiments are run; these are described in the pipeline documentation. In the Specifying rate limits documentation, we saw how to specify rate limits in the command-line interfaces, both for running the full pipeline with `prompto_run_pipeline` and for running a particular experiment file with `prompto_run_experiment`. In this notebook, we walk through the examples from the documentation to see how rate limits can be specified for the pipeline and for experiments.
We will consider three example experiment files, found in the input folder of the `parallel_data_example` directory:

- `documentation_example.jsonl`
- `documentation_example_groups_1.jsonl`
- `documentation_example_groups_2.jsonl`
from prompto import Settings, Experiment
data_folder = "parallel_data_example"
Using parallel processing¶
As noted in the Specifying rate limits documentation, parallel processing of the prompts (where different groups of prompts are processed and sent to APIs in parallel) should be enabled in most settings where the experiment file includes more than one model. This is typically true even when querying the same API type with different models, since rate limits are usually per model.
To use parallel processing, we first need to split the prompts into different queues (or groups). For most settings, splitting by API and certain models is sufficient. However, in some cases we may want to split the prompts into groups manually, which can be done using the `"group"` key in the experiment file.
When we obtain the groups of prompts for parallel processing, what really happens in the code (see the source for the `prompto.experiment_processing.Experiment.group_prompts` method) is that we loop over the prompts in the experiment file and assign each one to a queue/group based on:

- the `"group"` key, if it is present in the prompt dictionary
- otherwise, the `"api"` key
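As a rough illustration, this two-step rule can be sketched as follows (a simplified sketch, not the actual `prompto` implementation):

```python
# Assign each prompt to a queue: the "group" key wins if present,
# otherwise fall back to the "api" key.
def basic_queue_key(prompt_dict: dict) -> str:
    return prompt_dict.get("group", prompt_dict["api"])


prompts = [
    {"id": 0, "api": "gemini", "prompt": "What is the capital of France?"},
    {"id": 8, "api": "ollama", "prompt": "What is the capital of France?", "group": "group1"},
]

# Build the queues by looping over the prompts, as group_prompts does conceptually
queues: dict[str, list[dict]] = {}
for prompt in prompts:
    queues.setdefault(basic_queue_key(prompt), []).append(prompt)
```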
Since we also allow splitting by the `"model_name"` key, we additionally check whether a rate limit has been specified for a particular model within the group or API. Rate limits are specified via the `--max-queries-json` or `-mqj` flag in the command-line interfaces, or via the `max_queries_dict` argument to the `Settings` object in the `prompto` library.
In the examples below, we will see how to specify rate limits for different APIs and models in the experiment files, and how to use the `"group"` key in the experiment file to group prompts manually and specify a rate limit for each group.
Examples¶
First, we look at the `documentation_example.jsonl` experiment file, which has prompts for three different APIs (`gemini`, `openai` and `ollama`) across six different models (`gemini-1.0-pro`, `gemini-1.5-pro`, `gpt3.5-turbo`, `gpt4`, `llama3` and `mistral`):
with open(f"{data_folder}/input/documentation_example.jsonl", "r") as f:
print(f.read())
{"id": 0, "api": "gemini", "model_name": "gemini-1.0-pro", "prompt": "What is the capital of France?"}
{"id": 1, "api": "gemini", "model_name": "gemini-1.0-pro", "prompt": "What is the capital of Germany?"}
{"id": 2, "api": "gemini", "model_name": "gemini-1.5-pro", "prompt": "What is the capital of France?"}
{"id": 3, "api": "gemini", "model_name": "gemini-1.5-pro", "prompt": "What is the capital of Germany?"}
{"id": 4, "api": "openai", "model_name": "gpt3.5-turbo", "prompt": "What is the capital of France?"}
{"id": 5, "api": "openai", "model_name": "gpt3.5-turbo", "prompt": "What is the capital of Germany?"}
{"id": 6, "api": "openai", "model_name": "gpt4", "prompt": "What is the capital of France?"}
{"id": 7, "api": "openai", "model_name": "gpt4", "prompt": "What is the capital of Germany?"}
{"id": 8, "api": "ollama", "model_name": "llama3", "prompt": "What is the capital of France?"}
{"id": 9, "api": "ollama", "model_name": "llama3", "prompt": "What is the capital of Germany?"}
{"id": 10, "api": "ollama", "model_name": "mistral", "prompt": "What is the capital of France?"}
{"id": 11, "api": "ollama", "model_name": "mistral", "prompt": "What is the capital of Germany?"}
Same rate limit for all APIs¶
Recall that for each `Experiment`, we need to pass in the path to a jsonl file and a `Settings` object, which stores paths to the relevant data folders as well as some parameter settings for how to run the particular experiment. For an overview of the `Settings` and `Experiment` classes, see the Running experiments notebook.
By default, the `Settings` object has the `parallel` attribute set to `False`. Recall we can simply print the settings object to see the current settings:
settings = Settings(data_folder=data_folder, max_queries=5)
print(settings)
Settings: data_folder=parallel_data_example, max_queries=5, max_attempts=3, parallel=False Subfolders: input_folder=parallel_data_example/input, output_folder=parallel_data_example/output, media_folder=parallel_data_example/media
We can initialise an `Experiment` object for this experiment; the prompts in that experiment are stored in the `experiment_prompts` attribute:
experiment = Experiment(file_name="documentation_example.jsonl", settings=settings)
experiment.experiment_prompts
[{'id': 0, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of France?'}, {'id': 1, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of Germany?'}, {'id': 2, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of France?'}, {'id': 3, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of Germany?'}, {'id': 4, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of France?'}, {'id': 5, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of Germany?'}, {'id': 6, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of France?'}, {'id': 7, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of Germany?'}, {'id': 8, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of France?'}, {'id': 9, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of Germany?'}, {'id': 10, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of France?'}, {'id': 11, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of Germany?'}]
Note that the `experiment_prompts` attribute is read-only (implemented using a `@property` decorator), so we cannot change the prompts directly:
experiment.experiment_prompts = "something else"
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[6], line 1
----> 1 experiment.experiment_prompts = "something else"

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/experiment_processing.py:115, in Experiment.experiment_prompts(self, value)
    113 @experiment_prompts.setter
    114 def experiment_prompts(self, value: list[dict]) -> None:
--> 115     raise AttributeError("Cannot set the experiment_prompts attribute")

AttributeError: Cannot set the experiment_prompts attribute
For this experiment file, we notice that three `"api"` keys are present (`gemini`, `openai` and `ollama`). If no `max_queries_dict` is passed into the `Settings` object, then as noted above, the prompts will be grouped first by their `"group"` key and then by their `"api"` key. In this example, no prompts have a `"group"` key, so the prompts will be grouped by their `"api"` key.
The groups of prompts are stored in the `grouped_experiment_prompts` attribute, which again is read-only and is only initialised when we first access it. We can see that by default the underlying attribute `_grouped_experiment_prompts` is an empty dictionary:
experiment._grouped_experiment_prompts
{}
Now, when we access the `grouped_experiment_prompts` attribute, the prompts are grouped by their `"api"` key:
experiment.grouped_experiment_prompts
WARNING:root:The 'parallel' attribute in the Settings object is set to False, so grouping will not be used when processing the experiment prompts. Set 'parallel' to True to use grouping and parallel processing of prompts.
{'gemini': {'prompt_dicts': [{'id': 0, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of France?'}, {'id': 1, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of Germany?'}, {'id': 2, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of France?'}, {'id': 3, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 5}, 'openai': {'prompt_dicts': [{'id': 4, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of France?'}, {'id': 5, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of Germany?'}, {'id': 6, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of France?'}, {'id': 7, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 5}, 'ollama': {'prompt_dicts': [{'id': 8, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of France?'}, {'id': 9, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of Germany?'}, {'id': 10, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of France?'}, {'id': 11, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 5}}
Notice the warning message that is logged when accessing this attribute. We get this message because the `parallel` attribute in the `Settings` object is set to `False`. We still get the groups of prompts, but if we run the experiment, the prompts will not be processed in these different groups/queues in parallel.
experiment.grouped_experiment_prompts == experiment._grouped_experiment_prompts
WARNING:root:The 'parallel' attribute in the Settings object is set to False, so grouping will not be used when processing the experiment prompts. Set 'parallel' to True to use grouping and parallel processing of prompts.
True
We can see that `_grouped_experiment_prompts` is now a dictionary whose keys are the different APIs and whose values are dictionaries with keys `"prompt_dicts"` (a list of the prompts for that API) and `"rate_limit"` (the rate limit for that API). The rate limit for each API here is the default of `5`, which is given by the `Settings` object for the experiment and which we set above:
experiment.grouped_experiment_prompts.keys()
WARNING:root:The 'parallel' attribute in the Settings object is set to False, so grouping will not be used when processing the experiment prompts. Set 'parallel' to True to use grouping and parallel processing of prompts.
dict_keys(['gemini', 'openai', 'ollama'])
experiment.settings.max_queries
5
As mentioned, this attribute is read-only, so we cannot change the groups directly:
experiment.grouped_experiment_prompts = "something else"
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[12], line 1
----> 1 experiment.grouped_experiment_prompts = "something else"

File ~/Library/CloudStorage/OneDrive-TheAlanTuringInstitute/prompto/src/prompto/experiment_processing.py:136, in Experiment.grouped_experiment_prompts(self, value)
    134 @grouped_experiment_prompts.setter
    135 def grouped_experiment_prompts(self, value: dict[str, list[dict]]) -> None:
--> 136     raise AttributeError("Cannot set the grouped_experiment_prompts attribute")

AttributeError: Cannot set the grouped_experiment_prompts attribute
A useful method in the `Experiment` class is `grouped_experiment_prompts_summary`, which returns a dictionary where the keys are the API names and the values are strings summarising the number of prompts and the rate limit for that API. This is useful for a quick overview of the groups of prompts:
experiment.grouped_experiment_prompts_summary()
WARNING:root:The 'parallel' attribute in the Settings object is set to False, so grouping will not be used when processing the experiment prompts. Set 'parallel' to True to use grouping and parallel processing of prompts.
{'gemini': '4 queries at 5 queries per minute', 'openai': '4 queries at 5 queries per minute', 'ollama': '4 queries at 5 queries per minute'}
Different rate limits for each API type¶
Building on the above example, we can set different rate limits for each API type by passing in a dictionary that specifies the rate limit for each API. We do this via the `max_queries_dict` argument to the `Settings` object (or by passing a JSON file to the `--max-queries-json` or `-mqj` flag in the command-line interfaces), where the keys are the API names and the values are the rate limits for those APIs. We can see how this is done below:
max_queries_dict = {"openai": 20, "gemini": 10}
settings = Settings(
data_folder=data_folder,
max_queries=5,
max_queries_dict=max_queries_dict,
)
print(settings)
WARNING:root:max_queries_dict is provided and not empty, but parallel is set to False, so max_queries_dict will not be used. Set parallel to True to use max_queries_dict
Settings: data_folder=parallel_data_example, max_queries=5, max_attempts=3, parallel=False Subfolders: input_folder=parallel_data_example/input, output_folder=parallel_data_example/output, media_folder=parallel_data_example/media
Notice the warning provided here! We passed a `max_queries_dict` to the `Settings` object, but the `parallel` attribute is still set to `False`. We can remove this warning by setting the `parallel` attribute to `True`, as the warning suggests:
max_queries_dict = {"openai": 20, "gemini": 10}
settings = Settings(
data_folder=data_folder,
max_queries=5,
max_queries_dict=max_queries_dict,
parallel=True,
)
print(settings)
Settings: data_folder=parallel_data_example, max_queries=5, max_attempts=3, parallel=True, max_queries_dict={'openai': 20, 'gemini': 10} Subfolders: input_folder=parallel_data_example/input, output_folder=parallel_data_example/output, media_folder=parallel_data_example/media
Also notice above how the `max_queries_dict` is only printed when the `parallel` attribute is set to `True`.
Let's now see how the prompts are grouped and what the `grouped_experiment_prompts` attribute looks like:
experiment = Experiment(file_name="documentation_example.jsonl", settings=settings)
experiment.grouped_experiment_prompts
{'openai': {'prompt_dicts': [{'id': 4, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of France?'}, {'id': 5, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of Germany?'}, {'id': 6, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of France?'}, {'id': 7, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 20}, 'gemini': {'prompt_dicts': [{'id': 0, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of France?'}, {'id': 1, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of Germany?'}, {'id': 2, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of France?'}, {'id': 3, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 10}, 'ollama': {'prompt_dicts': [{'id': 8, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of France?'}, {'id': 9, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of Germany?'}, {'id': 10, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of France?'}, {'id': 11, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 5}}
experiment.grouped_experiment_prompts.keys()
dict_keys(['openai', 'gemini', 'ollama'])
experiment.grouped_experiment_prompts_summary()
{'openai': '4 queries at 20 queries per minute', 'gemini': '4 queries at 10 queries per minute', 'ollama': '4 queries at 5 queries per minute'}
We can see that we have the same grouping (by API type) as above, but the rate limits for `gemini` and `openai` have been set to `10` and `20` respectively. When processing the experiment, we will send the `gemini` prompts at a rate of 10 queries per minute and the `openai` prompts at a rate of 20 queries per minute. We have not specified an `ollama` rate limit, so it is set to the default rate limit of `5`, which was passed into the `Settings` object via the `max_queries` argument.
Different rate limits for each API type and model¶
For this example, we have different models within each API: for `gemini`, we have `"gemini-1.0-pro"` and `"gemini-1.5-pro"`; for `openai`, we have `"gpt3.5-turbo"` and `"gpt4"`; and for `ollama`, we have `"llama3"` and `"mistral"`.
To specify model-specific rate limits, instead of passing an integer value for an API type as above, we can pass another dictionary whose keys are model names and whose values are the rate limits for those models; that is, `max_queries_dict` can be a nested dictionary. Note that we do not need to specify rates for every model, only for the models we want to single out; everything else falls back to the default rate limit.
We can see how this is done below:
max_queries_dict = {
"gemini": {"gemini-1.5-pro": 20},
"openai": {"gpt4": 10, "gpt3.5-turbo": 20},
}
settings = Settings(
data_folder=data_folder,
max_queries=5,
max_queries_dict=max_queries_dict,
parallel=True,
)
print(settings)
Settings: data_folder=parallel_data_example, max_queries=5, max_attempts=3, parallel=True, max_queries_dict={'gemini': {'gemini-1.5-pro': 20}, 'openai': {'gpt4': 10, 'gpt3.5-turbo': 20}} Subfolders: input_folder=parallel_data_example/input, output_folder=parallel_data_example/output, media_folder=parallel_data_example/media
In this example, we are specifying that `"gemini-1.5-pro"` from the `gemini` API should have a rate limit of `20`, and that the `"gpt4"` and `"gpt3.5-turbo"` models from the `openai` API should have rate limits of `10` and `20` respectively. Everything else will be set to the default rate limit of `5`.
Let's now see how the prompts are grouped and what the `grouped_experiment_prompts` attribute looks like:
experiment = Experiment(file_name="documentation_example.jsonl", settings=settings)
experiment.grouped_experiment_prompts
{'gemini-gemini-1.5-pro': {'prompt_dicts': [{'id': 2, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of France?'}, {'id': 3, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 20}, 'openai-gpt4': {'prompt_dicts': [{'id': 6, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of France?'}, {'id': 7, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 10}, 'openai-gpt3.5-turbo': {'prompt_dicts': [{'id': 4, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of France?'}, {'id': 5, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 20}, 'gemini': {'prompt_dicts': [{'id': 0, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of France?'}, {'id': 1, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 5}, 'openai': {'prompt_dicts': [], 'rate_limit': 5}, 'ollama': {'prompt_dicts': [{'id': 8, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of France?'}, {'id': 9, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of Germany?'}, {'id': 10, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of France?'}, {'id': 11, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 5}}
experiment.grouped_experiment_prompts_summary()
{'gemini-gemini-1.5-pro': '2 queries at 20 queries per minute', 'openai-gpt4': '2 queries at 10 queries per minute', 'openai-gpt3.5-turbo': '2 queries at 20 queries per minute', 'gemini': '2 queries at 5 queries per minute', 'openai': '0 queries at 5 queries per minute', 'ollama': '4 queries at 5 queries per minute'}
As noted above, when we group the prompts, we loop over the prompts in the experiment and look at the `"api"` key (if the `"group"` key is not present). We then look at the `"model_name"` key: if a rate limit has been specified for that model, we add the prompt to a model-specific group; otherwise, we add it to the default group for that API.
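Combining this with the `"group"` rule gives the queue names we see above, where model-specific queues are keyed as `"api-model"`. A hedged sketch of the naming logic (not the actual `group_prompts` implementation):

```python
def queue_name(prompt: dict, max_queries_dict: dict) -> str:
    # a manual "group" key always takes priority
    if "group" in prompt:
        return prompt["group"]
    api = prompt["api"]
    model = prompt.get("model_name")
    entry = max_queries_dict.get(api)
    # a model with its own rate limit gets its own "api-model" queue
    if isinstance(entry, dict) and model is not None and model in entry:
        return f"{api}-{model}"
    # everything else lands in the API's default queue
    return api


mqd = {"gemini": {"gemini-1.5-pro": 20}}
```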
For `gemini`, we have a model-specific group for `"gemini-1.5-pro"` called `"gemini-gemini-1.5-pro"` with two queries, which has a rate limit of 20 queries per minute as specified by the `max_queries_dict` above. We also have a default group called `"gemini"` which catches all other `gemini` prompts. We did not specify a default rate limit for `gemini`, so it is set to the default rate limit of `5`.
For `openai`, we have two model-specific groups for `"gpt4"` and `"gpt3.5-turbo"`, called `"openai-gpt4"` and `"openai-gpt3.5-turbo"` respectively, with the rate limits specified by the `max_queries_dict` above. We also have a default group called `"openai"` which catches all other `openai` prompts; for this experiment file, there are no other `openai` prompts, so this group is empty.
Finally, we still have the group of `ollama` prompts, called `"ollama"`, which has the default rate limit of `5`.
Default rate limits for APIs¶
If we want to specify the default rate limit for a given API type, we can do so by specifying a rate limit for `"default"` in the `max_queries_dict`. This sets the default rate limit for the API, which applies to all prompts that do not have a model-specific rate limit. We can see how this is done below.
Note that for the `ollama` API, writing `"ollama": 4` is equivalent to writing `"ollama": {"default": 4}`.
max_queries_dict = {
"gemini": {"default": 30, "gemini-1.5-pro": 20},
"openai": {"gpt4": 10, "gpt3.5-turbo": 20},
"ollama": 4,
}
settings = Settings(
data_folder=data_folder,
max_queries=5,
max_queries_dict=max_queries_dict,
parallel=True,
)
print(settings)
Settings: data_folder=parallel_data_example, max_queries=5, max_attempts=3, parallel=True, max_queries_dict={'gemini': {'default': 30, 'gemini-1.5-pro': 20}, 'openai': {'gpt4': 10, 'gpt3.5-turbo': 20}, 'ollama': 4} Subfolders: input_folder=parallel_data_example/input, output_folder=parallel_data_example/output, media_folder=parallel_data_example/media
We can see now that the default rate limits for `gemini` and `ollama` have been set to `30` and `4` respectively:
experiment = Experiment(file_name="documentation_example.jsonl", settings=settings)
experiment.grouped_experiment_prompts_summary()
{'gemini': '2 queries at 30 queries per minute', 'gemini-gemini-1.5-pro': '2 queries at 20 queries per minute', 'openai-gpt4': '2 queries at 10 queries per minute', 'openai-gpt3.5-turbo': '2 queries at 20 queries per minute', 'ollama': '4 queries at 4 queries per minute', 'openai': '0 queries at 5 queries per minute'}
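The integer shorthand mentioned in the note above can be thought of as the following normalisation step (an illustrative sketch, not the `prompto` source):

```python
def normalise_entry(entry):
    # "ollama": 4 is treated the same as "ollama": {"default": 4}
    if isinstance(entry, int):
        return {"default": entry}
    return entry
```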
Specifying models or APIs that don't exist in the experiment file¶
Note that if you specify an API/group or model that does not exist in the experiment file, a group/queue will still be created for that API/group or model, but it will be empty:
max_queries_dict = {
"gemini": {"default": 30, "gemini-1.5-pro": 20},
"openai": {"gpt4": 10, "gpt3.5-turbo": 20},
"ollama": {
"llama3": 3,
"mistral": 3,
"unknown-model": 4,
},
"unknown-group-or-api": 25,
}
settings = Settings(
data_folder=data_folder,
max_queries=5,
max_queries_dict=max_queries_dict,
parallel=True,
)
print(settings)
Settings: data_folder=parallel_data_example, max_queries=5, max_attempts=3, parallel=True, max_queries_dict={'gemini': {'default': 30, 'gemini-1.5-pro': 20}, 'openai': {'gpt4': 10, 'gpt3.5-turbo': 20}, 'ollama': {'llama3': 3, 'mistral': 3, 'unknown-model': 4}, 'unknown-group-or-api': 25} Subfolders: input_folder=parallel_data_example/input, output_folder=parallel_data_example/output, media_folder=parallel_data_example/media
experiment = Experiment(file_name="documentation_example.jsonl", settings=settings)
experiment.grouped_experiment_prompts_summary()
{'gemini': '2 queries at 30 queries per minute', 'gemini-gemini-1.5-pro': '2 queries at 20 queries per minute', 'openai-gpt4': '2 queries at 10 queries per minute', 'openai-gpt3.5-turbo': '2 queries at 20 queries per minute', 'ollama-llama3': '2 queries at 3 queries per minute', 'ollama-mistral': '2 queries at 3 queries per minute', 'ollama-unknown-model': '0 queries at 4 queries per minute', 'unknown-group-or-api': '0 queries at 25 queries per minute', 'openai': '0 queries at 5 queries per minute', 'ollama': '0 queries at 5 queries per minute'}
Full control: Using the "group" key to define user-specified groups of prompts¶
In some cases, we may want to group prompts manually. This can be done using the `"group"` key in the experiment file. We now look at the `documentation_example_groups_1.jsonl` experiment file, which has prompts for three different APIs (`gemini`, `openai` and `ollama`) across six different models (`gemini-1.0-pro`, `gemini-1.5-pro`, `gpt3.5-turbo`, `gpt4`, `llama3` and `mistral`). We have manually grouped the prompts into three groups: `"group1"`, `"group2"` and `"group3"`.
Note that when the `"group"` key is specified, the prompts are grouped by this key and not by the `"api"` key. We can see how the prompts are grouped below:
with open(f"{data_folder}/input/documentation_example_groups_1.jsonl", "r") as f:
print(f.read())
{"id": 0, "api": "gemini", "model_name": "gemini-1.0-pro", "prompt": "What is the capital of France?", "group": "group1"}
{"id": 1, "api": "gemini", "model_name": "gemini-1.0-pro", "prompt": "What is the capital of Germany?", "group": "group2"}
{"id": 2, "api": "gemini", "model_name": "gemini-1.5-pro", "prompt": "What is the capital of France?", "group": "group1"}
{"id": 3, "api": "gemini", "model_name": "gemini-1.5-pro", "prompt": "What is the capital of Germany?", "group": "group2"}
{"id": 4, "api": "openai", "model_name": "gpt3.5-turbo", "prompt": "What is the capital of France?", "group": "group1"}
{"id": 5, "api": "openai", "model_name": "gpt3.5-turbo", "prompt": "What is the capital of Germany?", "group": "group2"}
{"id": 6, "api": "openai", "model_name": "gpt4", "prompt": "What is the capital of France?", "group": "group1"}
{"id": 7, "api": "openai", "model_name": "gpt4", "prompt": "What is the capital of Germany?", "group": "group2"}
{"id": 8, "api": "ollama", "model_name": "llama3", "prompt": "What is the capital of France?", "group": "group3"}
{"id": 9, "api": "ollama", "model_name": "llama3", "prompt": "What is the capital of Germany?", "group": "group3"}
{"id": 10, "api": "ollama", "model_name": "mistral", "prompt": "What is the capital of France?", "group": "group3"}
{"id": 11, "api": "ollama", "model_name": "mistral", "prompt": "What is the capital of Germany?", "group": "group3"}
Setting rate limits for groups works in the exact same way as setting rate limits for APIs. We simply pass in a dictionary where the keys are the group names and the values are the rate limits for that group. We can see how this is done below:
max_queries_dict = {"group1": 5, "group2": 10, "group3": 15}
settings = Settings(
    data_folder=data_folder,
    max_queries=5,
    max_queries_dict=max_queries_dict,
    parallel=True,
)
print(settings)
Settings: data_folder=parallel_data_example, max_queries=5, max_attempts=3, parallel=True, max_queries_dict={'group1': 5, 'group2': 10, 'group3': 15} Subfolders: input_folder=parallel_data_example/input, output_folder=parallel_data_example/output, media_folder=parallel_data_example/media
experiment = Experiment(
file_name="documentation_example_groups_1.jsonl", settings=settings
)
experiment.grouped_experiment_prompts
{'group1': {'prompt_dicts': [{'id': 0, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of France?', 'group': 'group1'}, {'id': 2, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of France?', 'group': 'group1'}, {'id': 4, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of France?', 'group': 'group1'}, {'id': 6, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of France?', 'group': 'group1'}], 'rate_limit': 5}, 'group2': {'prompt_dicts': [{'id': 1, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of Germany?', 'group': 'group2'}, {'id': 3, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of Germany?', 'group': 'group2'}, {'id': 5, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of Germany?', 'group': 'group2'}, {'id': 7, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of Germany?', 'group': 'group2'}], 'rate_limit': 10}, 'group3': {'prompt_dicts': [{'id': 8, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of France?', 'group': 'group3'}, {'id': 9, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of Germany?', 'group': 'group3'}, {'id': 10, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of France?', 'group': 'group3'}, {'id': 11, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of Germany?', 'group': 'group3'}], 'rate_limit': 15}}
experiment.grouped_experiment_prompts_summary()
{'group1': '4 queries at 5 queries per minute', 'group2': '4 queries at 10 queries per minute', 'group3': '4 queries at 15 queries per minute'}
Mixing using the "api" and "group" keys to define groups¶
It is possible to have an experiment file where only some of the prompts have a `"group"` key. We consider one here in the `documentation_example_groups_2.jsonl` experiment file:
with open(f"{data_folder}/input/documentation_example_groups_2.jsonl", "r") as f:
print(f.read())
{"id": 0, "api": "gemini", "model_name": "gemini-1.0-pro", "prompt": "What is the capital of France?"}
{"id": 1, "api": "gemini", "model_name": "gemini-1.0-pro", "prompt": "What is the capital of Germany?"}
{"id": 2, "api": "gemini", "model_name": "gemini-1.5-pro", "prompt": "What is the capital of France?"}
{"id": 3, "api": "gemini", "model_name": "gemini-1.5-pro", "prompt": "What is the capital of Germany?"}
{"id": 4, "api": "openai", "model_name": "gpt3.5-turbo", "prompt": "What is the capital of France?"}
{"id": 5, "api": "openai", "model_name": "gpt3.5-turbo", "prompt": "What is the capital of Germany?"}
{"id": 6, "api": "openai", "model_name": "gpt4", "prompt": "What is the capital of France?"}
{"id": 7, "api": "openai", "model_name": "gpt4", "prompt": "What is the capital of Germany?"}
{"id": 8, "api": "ollama", "model_name": "llama3", "prompt": "What is the capital of France?", "group": "group1"}
{"id": 9, "api": "ollama", "model_name": "llama3", "prompt": "What is the capital of Germany?", "group": "group1"}
{"id": 10, "api": "ollama", "model_name": "mistral", "prompt": "What is the capital of France?", "group": "group1"}
{"id": 11, "api": "ollama", "model_name": "mistral", "prompt": "What is the capital of Germany?", "group": "group1"}
{"id": 12, "api": "ollama", "model_name": "gemma", "prompt": "What is the capital of France?", "group": "group2"}
{"id": 13, "api": "ollama", "model_name": "gemma", "prompt": "What is the capital of Germany?", "group": "group2"}
{"id": 14, "api": "ollama", "model_name": "phi3", "prompt": "What is the capital of France?", "group": "group2"}
{"id": 15, "api": "ollama", "model_name": "phi3", "prompt": "What is the capital of Germany?", "group": "group2"}
As noted above, we first try to place prompts into groups based on the `"group"` key, and then based on the `"api"` key. We will specify rate limits for two groups here:
max_queries_dict = {
"group1": 5,
"group2": 10,
}
settings = Settings(
data_folder=data_folder,
max_queries=5,
max_queries_dict=max_queries_dict,
parallel=True,
)
print(settings)
Settings: data_folder=parallel_data_example, max_queries=5, max_attempts=3, parallel=True, max_queries_dict={'group1': 5, 'group2': 10}
Subfolders: input_folder=parallel_data_example/input, output_folder=parallel_data_example/output, media_folder=parallel_data_example/media
We can see that the prompts with "group" keys are placed within their respective groups, while the remaining prompts are grouped by their "api" key (either gemini or openai in this case):
experiment = Experiment(
file_name="documentation_example_groups_2.jsonl", settings=settings
)
experiment.grouped_experiment_prompts
{'group1': {'prompt_dicts': [{'id': 8, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of France?', 'group': 'group1'}, {'id': 9, 'api': 'ollama', 'model_name': 'llama3', 'prompt': 'What is the capital of Germany?', 'group': 'group1'}, {'id': 10, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of France?', 'group': 'group1'}, {'id': 11, 'api': 'ollama', 'model_name': 'mistral', 'prompt': 'What is the capital of Germany?', 'group': 'group1'}], 'rate_limit': 5}, 'group2': {'prompt_dicts': [{'id': 12, 'api': 'ollama', 'model_name': 'gemma', 'prompt': 'What is the capital of France?', 'group': 'group2'}, {'id': 13, 'api': 'ollama', 'model_name': 'gemma', 'prompt': 'What is the capital of Germany?', 'group': 'group2'}, {'id': 14, 'api': 'ollama', 'model_name': 'phi3', 'prompt': 'What is the capital of France?', 'group': 'group2'}, {'id': 15, 'api': 'ollama', 'model_name': 'phi3', 'prompt': 'What is the capital of Germany?', 'group': 'group2'}], 'rate_limit': 10}, 'gemini': {'prompt_dicts': [{'id': 0, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of France?'}, {'id': 1, 'api': 'gemini', 'model_name': 'gemini-1.0-pro', 'prompt': 'What is the capital of Germany?'}, {'id': 2, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of France?'}, {'id': 3, 'api': 'gemini', 'model_name': 'gemini-1.5-pro', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 5}, 'openai': {'prompt_dicts': [{'id': 4, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of France?'}, {'id': 5, 'api': 'openai', 'model_name': 'gpt3.5-turbo', 'prompt': 'What is the capital of Germany?'}, {'id': 6, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of France?'}, {'id': 7, 'api': 'openai', 'model_name': 'gpt4', 'prompt': 'What is the capital of Germany?'}], 'rate_limit': 5}}
experiment.grouped_experiment_prompts_summary()
{'group1': '4 queries at 5 queries per minute', 'group2': '4 queries at 10 queries per minute', 'gemini': '4 queries at 5 queries per minute', 'openai': '4 queries at 5 queries per minute'}
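The grouping behaviour above can be sketched as a simple loop over the prompt dictionaries. Note this is an illustrative sketch of the logic, not prompto's actual `group_prompts` implementation:

```python
from collections import defaultdict

def group_prompts(prompt_dicts):
    """Assign each prompt to a queue: by its "group" key if
    present, otherwise by its "api" key (a sketch of the
    grouping behaviour, not prompto's actual implementation)."""
    groups = defaultdict(list)
    for prompt in prompt_dicts:
        groups[prompt.get("group", prompt["api"])].append(prompt)
    return dict(groups)

prompts = [
    {"id": 0, "api": "gemini", "prompt": "What is the capital of France?"},
    {"id": 4, "api": "openai", "prompt": "What is the capital of France?"},
    {"id": 8, "api": "ollama", "prompt": "What is the capital of France?", "group": "group1"},
    {"id": 12, "api": "ollama", "prompt": "What is the capital of France?", "group": "group2"},
]
grouped = group_prompts(prompts)
print(sorted(grouped))  # ['gemini', 'group1', 'group2', 'openai']
```

Here the "ollama" prompts never form an "ollama" queue because each of them carries a "group" key, which takes priority.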
Model-specific rates within groups¶
Specifying model-specific rates within groups works in exactly the same way as specifying model-specific rates for APIs. We can see how this is done below:
max_queries_dict = {
"group1": {"llama3": 10},
"group2": 10,
}
settings = Settings(
data_folder=data_folder,
max_queries=5,
max_queries_dict=max_queries_dict,
parallel=True,
)
print(settings)
Settings: data_folder=parallel_data_example, max_queries=5, max_attempts=3, parallel=True, max_queries_dict={'group1': {'llama3': 10}, 'group2': 10}
Subfolders: input_folder=parallel_data_example/input, output_folder=parallel_data_example/output, media_folder=parallel_data_example/media
We now see that group1 has been split up further and we have a group1-llama3 grouping:
experiment = Experiment(
file_name="documentation_example_groups_2.jsonl", settings=settings
)
experiment.grouped_experiment_prompts_summary()
{'group1-llama3': '2 queries at 10 queries per minute', 'group2': '4 queries at 10 queries per minute', 'gemini': '4 queries at 5 queries per minute', 'openai': '4 queries at 5 queries per minute', 'group1': '2 queries at 5 queries per minute'}
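The model-specific split within a group can be sketched as follows. This is an assumption-laden sketch of the behaviour shown above (not prompto's source code): when a group's entry in `max_queries_dict` is itself a dict, prompts for the listed models are moved into their own `"<group>-<model>"` queue with the model-specific limit, while the group's remaining prompts fall back to the default rate:

```python
def rate_limited_groups(prompt_dicts, max_queries_dict, default=5):
    """Sketch of splitting a group further when its entry in
    max_queries_dict maps model names to model-specific limits."""
    groups = {}
    for prompt in prompt_dicts:
        key = prompt.get("group", prompt["api"])
        limit = max_queries_dict.get(key, default)
        if isinstance(limit, dict):
            model = prompt.get("model_name")
            if model in limit:
                # listed models get their own "<group>-<model>" queue
                key, limit = f"{key}-{model}", limit[model]
            else:
                limit = default
        entry = groups.setdefault(key, {"prompt_dicts": [], "rate_limit": limit})
        entry["prompt_dicts"].append(prompt)
    return groups

prompts = [
    {"id": 8, "api": "ollama", "model_name": "llama3",
     "prompt": "What is the capital of France?", "group": "group1"},
    {"id": 10, "api": "ollama", "model_name": "mistral",
     "prompt": "What is the capital of France?", "group": "group1"},
]
groups = rate_limited_groups(prompts, {"group1": {"llama3": 10}, "group2": 10})
print({k: v["rate_limit"] for k, v in groups.items()})
# {'group1-llama3': 10, 'group1': 5}
```

This matches the summary above, where the llama3 prompts run at 10 queries per minute in their own group1-llama3 queue and the rest of group1 runs at the default rate of 5.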