Interacting with Large Language Models Programmatically

Large Language Model APIs


Learning Objectives

  • You know of large language model APIs provided by OpenAI and HuggingFace.
  • You know how to use the APIs to interact with large language models.

OpenAI API

OpenAI provides a Python library that can be used to interact with OpenAI services. The library and its dependencies can be installed with the following command.

pip install openai

Once the library has been installed, it can be used in a Python program. The key parts of using the library involve creating an OpenAI API client and providing an API key for the client. By default, the OpenAI API client attempts to read the API key from the environment variable OPENAI_API_KEY.

from openai import OpenAI

# API key is read by default from the environment
client = OpenAI()
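
Instead of hard-coding the key into the program, it can be set as an environment variable in the shell before the program is run; the client then picks it up automatically. A minimal sketch (the key value below is a placeholder):

```shell
# Export the API key so that the OpenAI client can read it from the environment.
# Replace the placeholder value with your actual key from the OpenAI platform.
export OPENAI_API_KEY="Your secret OpenAI API key"
```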

The API key can also be provided manually by passing it to the api_key parameter as follows.

from openai import OpenAI

client = OpenAI(
  api_key="Your secret OpenAI API key",
)

Creating an OpenAI API key

API keys can be created and managed on the OpenAI platform at https://platform.openai.com/api-keys after registering for an account. Note that using the OpenAI API costs money.

Once the client is created and it has access to a valid API key (that is associated with an account that has credits), the API can be used. In our case, we are most interested in the Chat API endpoints that allow conversing with large language models.

The Chat API endpoint is used by calling client.chat.completions.create (assuming that the OpenAI client is called client), which is given the large language model that we wish to interact with and a list of messages. The call returns a chat completion object that contains the response content and additional details.

The following outlines a simple but complete program that sends the prompt Hello! to the gpt-4o model, and then prints the textual content from the response.

from openai import OpenAI

client = OpenAI(
  api_key="Your secret OpenAI API key",
)

response = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {"role": "user", "content": "Hello!"}
  ]
)

print(response.choices[0].message.content)

One possible response is as follows.

Hello! How can I assist you today?

Data sent to OpenAI

When interacting with the OpenAI API, the prompts are sent to OpenAI servers, which store the data. Do not share any private or company data in the prompts.

Each message is an object that has a role and content. The role can be user, assistant, or system. The role user indicates a message from the user, and the role assistant indicates a message from the model. A continuation to an existing discussion can be requested by including the discussion history in the messages list sent to the server.
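As a small sketch of building such a history incrementally (the add_message helper below is illustrative, not part of the OpenAI library):

```python
def add_message(history, role, content):
    """Append a message object with the given role and content to the history."""
    history.append({"role": role, "content": content})
    return history

messages = []
add_message(messages, "user", "Hello!")
# After receiving a response, store it so the model sees the full dialogue
add_message(messages, "assistant", "How can I help you today?")
add_message(messages, "user", "In one sentence, what is the purpose of life?")

print(len(messages))        # 3
print(messages[1]["role"])  # assistant
```

The full messages list would then be passed to client.chat.completions.create, as in the example below.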

The following outlines a discussion where the user has first sent the message “Hello!”, received a response, and then asked for the purpose of life in one sentence.

from openai import OpenAI

client = OpenAI(
  api_key="Your secret OpenAI API key",
)

response = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "How can I help you today?"},
    {"role": "user", "content": "In one sentence, what is the purpose of life?"}
  ]
)

print(response.choices[0].message.content)

One possible response is as follows.

The purpose of life is a deeply personal and philosophical
question, often interpreted as seeking meaning, fulfillment,
and connection through experiences, relationships, and
personal growth.

While the roles user and assistant represent the concrete dialogue between the user and the large language model, the role system can be used to give the model guidelines on how it should behave. The system message should be the first message in the list.
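Since the system message must come first, one way to ensure this is to prepend it to the dialogue messages just before the request is made. A minimal sketch (the with_system helper is illustrative, not part of the OpenAI library):

```python
def with_system(system_prompt, messages):
    """Return a message list with the system message placed first,
    as the Chat API expects."""
    return [{"role": "system", "content": system_prompt}] + messages

msgs = with_system(
    "You are an excellent Finnish translator.",
    [{"role": "user", "content": "What's up?"}],
)
print(msgs[0]["role"])  # system
print(len(msgs))        # 2
```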

The following example outlines the use of the system message. In the following, the large language model is given the instruction that it is a Finnish translator and that it should respond with the Finnish translations of the user prompts.

from openai import OpenAI

client = OpenAI(
  api_key="Your secret OpenAI API key",
)


response = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {"role": "system", "content": "You are an excellent Finnish translator. Translate the user prompt to the Finnish language. Respond only with the translation."},
    {"role": "user", "content": "What's up?"},
  ]
)

print(response.choices[0].message.content)

One possible response is as follows.

Mitä kuuluu?

System messages

System messages are a way to impose specific behavior on the large language model. They are intended to provide strong guidelines that users should not be able to bypass. However, as discussed in the security-related concerns part of the Introduction to Large Language Models course, one of the security risks is prompt injection, which, when successful, can bypass system messages.

HuggingFace API

Hugging Face provides an OpenAI-compatible Messages API, which means that models hosted on Hugging Face can be used directly with the OpenAI Python library.

To use the API, a HuggingFace API key is needed.

Creating a HuggingFace API Key

A HuggingFace API key can be generated by creating an account on the Hugging Face platform, opening the User Settings, and selecting Access Tokens. When you create an access token, it serves as the API key.

HuggingFace API keys are free to use for small numbers of requests.

When using the HuggingFace API through the OpenAI Python library, we need to provide the HuggingFace API key to the OpenAI client as well as a base_url that points to the HuggingFace API. The following outlines how to create such a client.

from openai import OpenAI

client = OpenAI(
  api_key="Your secret HuggingFace API key",
  base_url="https://api-inference.huggingface.co/v1/"
)

In addition, we need to select a text generation model from the Hugging Face model hub that we wish to interact with. Model names on Hugging Face follow the format organization/model. As an example, if we wished to use the Phi-3-Mini-4K-Instruct model from Microsoft, the model name would be microsoft/Phi-3-mini-4k-instruct.
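The two parts of the identifier can be separated with ordinary string handling, which is handy, for example, when displaying or logging model names:

```python
# Model identifiers on the Hugging Face hub follow the "organization/model" format.
model_id = "microsoft/Phi-3-mini-4k-instruct"
organization, model_name = model_id.split("/")

print(organization)  # microsoft
print(model_name)    # Phi-3-mini-4k-instruct
```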

The following outlines a Python program that uses the OpenAI Python library to access the above model and sends the prompt “Hello!” to the model.

from openai import OpenAI

client = OpenAI(
  api_key="Your secret HuggingFace API key",
  base_url="https://api-inference.huggingface.co/v1/"
)

response = client.chat.completions.create(
  model="microsoft/Phi-3-mini-4k-instruct",
  messages=[
    {"role": "user", "content": "Hello!"},
  ]
)

print(response.choices[0].message.content)

When we send the request, we either see a response or an error, depending on the availability of the model. The error could, for example, be as follows.

Error code: 503 - {'error': 'Model microsoft/Phi-3-mini-4k-instruct is currently loading', ...}

This would indicate that the model has not been loaded yet and that we should try again later or try another model. Alternatively, a response could, for example, be as follows.

 Hello! How can I assist you today?

---

This instruction is simple and polite, similar to the given
example. It serves as an introductory interaction, providing
an opportunity for further engagement. There is no numerical
solution or complex calculation involved, which aligns with
the guidelines provided. <|end|>

In the above, the model has responded to the prompt “Hello!” with a polite greeting. The response also contains a separator --- followed by a description of the response; this extra text, together with the <|end|> end-of-text marker, is additional output that the model has generated beyond the direct answer.
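If only the direct answer is wanted, such trailing output can be trimmed with plain string handling. A minimal sketch, assuming the response uses the separator and marker seen above (the raw string below is a shortened stand-in for a real response):

```python
# Stand-in for a raw model response with trailing commentary and a stop marker.
raw = " Hello! How can I assist you today?\n\n---\n\nThis instruction is simple and polite. <|end|>"

# Keep only the text before the separator and drop the end-of-text marker.
answer = raw.split("---")[0].replace("<|end|>", "").strip()
print(answer)  # Hello! How can I assist you today?
```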

In a similar way, if we would wish to use the Mistral-7B-Instruct-v0.3 model from Mistral AI, the model name would be mistralai/Mistral-7B-Instruct-v0.3. The following outlines the same Python program as above, but with the model name changed to mistralai/Mistral-7B-Instruct-v0.3.

from openai import OpenAI

client = OpenAI(
  api_key="Your secret HuggingFace API key",
  base_url="https://api-inference.huggingface.co/v1/"
)

response = client.chat.completions.create(
  model="mistralai/Mistral-7B-Instruct-v0.3",
  messages=[
    {"role": "user", "content": "Hello!"},
  ]
)

print(response.choices[0].message.content)

One possible response from the model is as follows.

Hello! How can I help you today? If you have any questions
or need assistance, feel free to ask. I'm here to help with
a wide range of topics. If you're just saying hello, it's
nice to meet you! :)

As you notice, the model does not provide a description of its response like the Phi-3-Mini-4K-Instruct model did above; not all models produce such additional output.
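Returning to the 503 error seen earlier: when a model is still loading, a simple approach is to retry the request after a short delay. The following is a hedged sketch of a generic retry wrapper (the helper and function names are illustrative); in a real program, the wrapped callable would make the client.chat.completions.create call. Here it is demonstrated with a stand-in function that fails once before succeeding.

```python
import time

def call_with_retries(fn, attempts=3, delay=1.0):
    """Call fn, retrying with a fixed delay if it raises an exception
    (e.g. a 503 error while a Hugging Face model is still loading)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(delay)

# Stand-in for an API call that fails on the first attempt.
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 2:
        raise RuntimeError("503: Model is currently loading")
    return "Hello!"

print(call_with_retries(flaky, attempts=3, delay=0.01))  # Hello!
```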

Data sent to HuggingFace

When interacting with the HuggingFace API, the prompts are sent to HuggingFace servers, which store the data. Do not share any private or company data in the prompts.