Using prompto with OpenAI¶
from prompto.settings import Settings
from prompto.experiment import Experiment
from dotenv import load_dotenv
import os
When using prompto
to query models from the OpenAI API, lines in our experiment .jsonl
files must have "api": "openai"
in the prompt dict.
Environment variables¶
For the OpenAI API, there are two environment variables that could be set:
OPENAI_API_KEY
: the API key for the OpenAI API
As mentioned in the environment variables docs, there are also model-specific environment variables too which can be utilised. In particular, when you specify a model_name
key in a prompt dict, one could also specify a OPENAI_API_KEY_model_name
environment variable to indicate the API key used for that particular model (where "model_name" is replaced to whatever the corresponding value of the model_name
key is). We will see a concrete example of this later.
To set environment variables, one can simply have these in a .env
file which specifies these environment variables as key-value pairs:
OPENAI_API_KEY=<YOUR-OPENAI-KEY>
If you make this file, you can run the following which should return True
if it's found one, or False
otherwise:
load_dotenv(dotenv_path=".env")
True
Now, we obtain those values. We raise an error if the OPENAI_API_KEY
environment variable hasn't been set:
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
if OPENAI_API_KEY is None:
raise ValueError("OPENAI_API_KEY is not set")
If you get any errors or warnings in the above two cells, try to fix your .env
file like the example we have above to get these variables set.
Types of prompts¶
With the OpenAI API, the prompt (given via the "prompt"
key in the prompt dict) can take several forms:
- a string: a single prompt to obtain a response for
- a list of strings: a sequence of prompts to send to the model
- this is useful in the use case of simulating a conversation with the model by defining the user prompts sequentially
- a list of dictionaries with keys "role" and "content", where "role" is one of "user", "assistant", or "system" and "content" is the message
- this is useful in the case of passing in some conversation history or to pass in a system prompt to the model
We have created an input file in data/input/openai-example.jsonl with an example of each of these cases as an illustration.
settings = Settings(data_folder="./data", max_queries=30)
experiment = Experiment(file_name="openai-example.jsonl", settings=settings)
We set max_queries
to 30 so we send 30 queries a minute (every 2 seconds).
print(settings)
Settings: data_folder=./data, max_queries=30, max_attempts=3, parallel=False Subfolders: input_folder=./data/input, output_folder=./data/output, media_folder=./data/media
len(experiment.experiment_prompts)
5
We can see the prompts that we have in the experiment_prompts
attribute:
experiment.experiment_prompts
[{'id': 0, 'api': 'openai', 'model_name': 'gpt-4o', 'prompt': 'How does technology impact us?', 'parameters': {'n': 1, 'temperature': 1, 'max_tokens': 100}}, {'id': 1, 'api': 'openai', 'model_name': 'gpt-3.5-turbo', 'prompt': 'How does technology impact us?', 'parameters': {'n': 1, 'temperature': 1, 'max_tokens': 100}}, {'id': 2, 'api': 'openai', 'model_name': 'gpt-4o', 'prompt': ['How does international trade create jobs?', 'I want a joke about that'], 'parameters': {'n': 1, 'temperature': 1, 'max_tokens': 100}}, {'id': 3, 'api': 'openai', 'model_name': 'gpt-4o', 'prompt': [{'role': 'system', 'content': 'You are a helpful assistant designed to answer questions briefly.'}, {'role': 'user', 'content': 'What efforts are being made to keep the hakka language alive?'}], 'parameters': {'n': 1, 'temperature': 1, 'max_tokens': 100}}, {'id': 4, 'api': 'openai', 'model_name': 'gpt-4o', 'prompt': [{'role': 'system', 'content': 'You are a helpful assistant designed to answer questions briefly.'}, {'role': 'user', 'content': "Hello, I'm Bob and I'm 6 years old"}, {'role': 'assistant', 'content': 'Hi Bob, how may I assist you?'}, {'role': 'user', 'content': 'How old will I be next year?'}], 'parameters': {'n': 1, 'temperature': 1, 'max_tokens': 100}}]
- In the first prompt (
"id": 0
), we have a"prompt"
key which is a string and specify a"model_name"
key to be "gpt-4o" - In the second prompt (
"id": 1
), we have a"prompt"
key is also a string but we specify a"model_name"
key to be "gpt-3.5-turbo". - In the third prompt (
"id": 2
), we have a"prompt"
key which is a list of strings. - In the fourth prompt (
"id": 3
), we have a"prompt"
key which is a list of dictionaries. These dictionaries have a "role" and "content" key. This acts as passing in a system prompt. Here, we just have a system prompt before a user prompt. - In the fifth prompt (
"id": 4
), we have a"prompt"
key which is a list of dictionaries. These dictionaries have a "role" and "content" key. Here, we have a system prompt and a series of user/assistant interactions before finally having a user prompt. This acts as passing in a system prompt and conversation history.
Note that for each of these prompt dicts, we have "model_name": "gpt-4o"
, besides "id": 1
where we have "model_name": "gpt-3.5-turbo"
.
Running the experiment¶
We now can run the experiment using the async method process
which will process the prompts in the input file asynchronously. Note that a new folder named timestamp-openai-example
(where "timestamp" is replaced with the actual date and time of processing) will be created in the output directory and we will move the input file to the output directory. As the responses come in, they will be written to the output file and there are logs that will be printed to the console as well as being written to a log file in the output directory.
responses, avg_query_processing_time = await experiment.process()
Sending 5 queries (attempt 1/3): 100%|██████████| 5/5 [00:10<00:00, 2.00s/query] Waiting for responses (attempt 1/3): 100%|██████████| 5/5 [00:00<00:00, 6.47query/s]
We can see that the responses are written to the output file, and we can also see them as the returned object. From running the experiment, we obtain prompt dicts where there is now a "response"
key which contains the response(s) from the model.
For the case where the prompt is a list of strings, we see that the response is a list of strings where each string is the response to the corresponding prompt.
responses
[{'id': 0, 'api': 'openai', 'model_name': 'gpt-4o', 'prompt': 'How does technology impact us?', 'parameters': {'n': 1, 'temperature': 1, 'max_tokens': 100}, 'response': 'The impact of technology on society is profound and multifaceted, touching virtually every aspect of our lives. Here are some key ways in which technology affects us:\n\n### 1. Communication\n- **Enhanced Connectivity:** Instant messaging, video calls, and social media platforms have revolutionized how we communicate, shrinking the world and making it easier to stay in touch with friends, family, and colleagues.\n- **Digital Divide:** While technology enhances connectivity, it also highlights the gap between those with access to digital tools'}, {'id': 1, 'api': 'openai', 'model_name': 'gpt-3.5-turbo', 'prompt': 'How does technology impact us?', 'parameters': {'n': 1, 'temperature': 1, 'max_tokens': 100}, 'response': 'Technology impacts us in many ways. It has changed the way we communicate, work, learn, and access information. It has made tasks easier and more efficient, but it has also brought about challenges such as information overload, privacy concerns, and digital addiction.\n\nTechnology has enabled us to connect with people from around the world instantly through social media, video calls, and messaging apps. It has also streamlined communication in the workplace through email and project management tools. Additionally, technology has revolutionized how we consume media'}, {'id': 3, 'api': 'openai', 'model_name': 'gpt-4o', 'prompt': [{'role': 'system', 'content': 'You are a helpful assistant designed to answer questions briefly.'}, {'role': 'user', 'content': 'What efforts are being made to keep the hakka language alive?'}], 'parameters': {'n': 1, 'temperature': 1, 'max_tokens': 100}, 'response': 'Efforts to keep the Hakka language alive include educational programs, inclusion in school curricula, cultural festivals, and community activities. Digital initiatives such as apps, social media, and online courses also play a role. Additionally, preservation projects like documentation, dictionaries, and media broadcasts in Hakka contribute to its revitalization.'}, {'id': 4, 'api': 'openai', 'model_name': 'gpt-4o', 'prompt': [{'role': 'system', 'content': 'You are a helpful assistant designed to answer questions briefly.'}, {'role': 'user', 'content': "Hello, I'm Bob and I'm 6 years old"}, {'role': 'assistant', 'content': 'Hi Bob, how may I assist you?'}, {'role': 'user', 'content': 'How old will I be next year?'}], 'parameters': {'n': 1, 'temperature': 1, 'max_tokens': 100}, 'response': 'Next year, you will be 7 years old.'}, {'id': 2, 'api': 'openai', 'model_name': 'gpt-4o', 'prompt': ['How does international trade create jobs?', 'I want a joke about that'], 'parameters': {'n': 1, 'temperature': 1, 'max_tokens': 100}, 'response': ['International trade can create jobs through various mechanisms:\n\n1. **Market Expansion**: By opening up to international markets, businesses can sell their goods and services to a much larger audience than the domestic market alone. This increased demand often leads to higher production levels, which in turn requires more workers, thus creating jobs.\n\n2. **Specialization and Efficiency**: International trade encourages countries to specialize in the production of goods and services that they can produce most efficiently. This specialization often leads to the growth of industries', "Sure, here's a light-hearted take on it:\n\nWhy did the factory worker love international trade?\n\nBecause it meant they were getting shipped all over the world!"]}]
Running the experiment via the command line¶
We can also run the experiment via the command line. The command is as follows (assuming that your working directory is the current directory of this notebook, i.e. examples/openai
):
prompto_run_experiment --file data/input/openai-example.jsonl --max-queries 30