Simple quart API
Quart API
To query models from Huggingface that are not available via the text-generation-inference
API, we have written a simple script to start up a Quart API that can be used to query a text-generation model from the Huggingface model hub using the Huggingface transformers
library. This can be started using the prompto_quart_server
command, e.g.
prompto_quart_server \
--model-name vicgalle/gpt2-open-instruct-v1 \
--host localhost \
--port 5000 \
--max-length 200
Once the server is running, you can query the model by sending a POST request to the endpoint with the prompt in the request body, e.g.
curl \
-X POST http://localhost:5000/generate \
-H "Content-Type: application/json" \
-d '{"text": "This is a test prompt"}'
In Python, you can use the requests
library to send a POST request to the endpoint, e.g.
import requests
import json
req = requests.post(
"http://localhost:5000/generate",
data=json.dumps({"text": "This is a test prompt"}),
headers={"Content-Type": "application/json"},
)
Environment variables:
QUART_API_ENDPOINT
: the endpoint for the Quart API
Model-specific environment variables:
As described in the model-specific environment variables of the environment variables document section, you can set model-specific environment variables for different models in the Quart API by appending the model name to the environment variable name. For example, if "model_name": "vicgalle/gpt2-open-instruct-v1"
is specified in the prompt_dict
, the following model-specific environment variables can be used:
QUART_API_ENDPOINT_vicgalle_gpt2_open_instruct_v1
Similarly to the Huggingface text-generation-inference
API, the model name is only used as an identifier for the pipeline. The model that the endpoint is querying is returned in the response from the API and saved in the output prompt_dict
in the "model"
key.
In this case, the completed prompt_dict
should include the "model_name": "vicgalle/gpt2-open-instruct-v1"
key-value pair to confirm that the endpoint is indeed querying the correct model.
Required environment variables:
For any given prompt_dict
, the following environment variables are required:
- One of
QUART_API_ENDPOINT
orQUART_API_ENDPOINT_model_name