ai-chat-completions


Given the AI model specified in an application's configuration resources, this agent uses the model's completion API to submit message prompts and return the result. The agent discovers the provider type (i.e. OpenAI, Hugging Face, VertexAI) from the configuration and uses the corresponding library to generate the completion. It is up to the developer to match the correct model name with the configured AI model.
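
Each "set up the LLM configuration" step below refers to declaring a matching resource in the application's configuration.yaml. As a minimal sketch, assuming an access key stored in the application's secrets, an OpenAI resource might look like this:

configuration:
  resources:
    - type: "open-ai"
      name: "OpenAI configuration"
      configuration:
        # illustrative secret reference; adjust to your secrets.yaml
        access-key: "${ secrets.open-ai.access-key }"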

Using OpenAI chat models

The ai-chat-completions agent for OpenAI uses the /v1/chat/completions endpoint. Refer to the OpenAI documentation to know which models are compatible.

Set up the OpenAI LLM configuration. Add the ai-chat-completions agent:

pipeline:
  - name: "ai-chat-completions"
    type: "ai-chat-completions"
    output: "history-topic"
    configuration:
      model: "gpt-3.5-turbo"
      # on the output topic (history-topic) we add a field with the answer
      completion-field: "value.answer"
      # we are also logging the prompt we sent to the LLM
      log-field: "value.prompt"
      # here we configure the streaming behavior
      # as soon as the LLM answers with a chunk we send it to the answers-topic
      stream-to-topic: "output-topic"
      # for the streamed answer, each chunk is written as the whole message
      # the 'value' syntax is used to refer to the whole value of the message
      stream-response-completion-field: "value"
      # we want to stream the answer as soon as we have 10 chunks
      # in order to reduce latency for the first message the agent sends the first message
      # with 1 chunk, then with 2 chunks....up to the min-chunks-per-message value
      # eventually we want to send bigger messages to reduce the overhead of each message on the topic
      min-chunks-per-message: 10
      messages:
        - role: user
          content: "You are a helpful assistant. Below you can find a question from the user. Please try to help them the best way you can.\n\n{{ value.question }}"

Using VertexAI chat models

Refer to the VertexAI documentation to know which models are compatible.

Set up the Vertex LLM configuration. Add the ai-chat-completions agent:

pipeline:
  - name: "ai-chat-completions"
    type: "ai-chat-completions"
    configuration:
      model: "chat-bison"
      max-tokens: 100
      completion-field: "value.chatresult"
      log-field: "value.request"
      messages:
        - role: user
          content: "You are a helpful assistant. Below you can find a question from the user. Please try to help them the best way you can.\n\n{{ value.question }}"

Using Ollama models

Refer to the Ollama documentation to find a list of models.

Set up the Ollama configuration. Add the ai-chat-completions agent:

pipeline:
  - name: "ai-chat-completions"
    type: "ai-chat-completions"
    configuration:
      model: "llama2"
      max-tokens: 100
      completion-field: "value.chatresult"
      log-field: "value.request"
      messages:
        - role: user
          content: "You are a helpful assistant. Below you can find my question. Please try to help them the best way you can.\n\n This is my question: {{ value.question }}"

Using Amazon Bedrock AI21 Jurassic-2 models

Set up the Amazon Bedrock LLM configuration. Refer to the Amazon documentation for other parameters and options.

Add the ai-chat-completions agent:

pipeline:
  - name: "ai-chat-completions"
    type: "ai-chat-completions"
    configuration:
      model: "ai21.j2-mid-v1"
      completion-field: "value.answer"
      options:
        request-parameters:
          # here you can add all the supported parameters
          temperature: 0.5
          maxTokens: 300
        # expression to retrieve the completion from the response JSON. It varies depending on the model 
        response-completions-expression: "completions[0].data.text"
      messages:
        - content: "{{ value.question }}"
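
A matching Amazon Bedrock resource in configuration.yaml might be sketched as follows; the region and secret names are illustrative assumptions to verify against the Amazon Bedrock integration page:

configuration:
  resources:
    - type: "bedrock-configuration"
      name: "Amazon Bedrock configuration"
      configuration:
        # placeholder credentials and region
        access-key: "${ secrets.bedrock.access-key }"
        secret-key: "${ secrets.bedrock.secret-key }"
        region: "us-east-1"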

Using Amazon Bedrock Anthropic Claude models

Set up the Amazon Bedrock LLM configuration. Refer to the Amazon documentation to learn other parameters and options.

Add the ai-chat-completions agent:

pipeline:
  - name: "ai-chat-completions"
    type: "ai-chat-completions"
    configuration:
      model: "anthropic.claude-v2"
      completion-field: "value.answer"
      options:
        request-parameters:
          # here you can add all the supported parameters
          temperature: 0.5
          max_tokens_to_sample: 300
          top_p: 0.9
          top_k: 250
        # expression to retrieve the completion from the response JSON. It varies depending on the model 
        response-completions-expression: "completion"
      messages:
        - content: "{{ value.question }}"
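
Both Bedrock examples can share the same Bedrock configuration resource sketched earlier; between model families only the model name, the supported request parameters, and the response-completions-expression change.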

Prompt limitations

Most public LLMs have character limits on message content. It is up to the application developer to ensure the combination of preset prompt text and an input message stays under that limit.
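
One way to stay under such limits is to cap the user input before it reaches the prompt template. The sketch below uses the compute agent with a hypothetical fn:substring function; verify the available functions against the Expression Language reference:

pipeline:
  - name: "truncate-question"
    type: "compute"
    configuration:
      fields:
        # keep the question under 4000 characters before templating
        # (fn:substring here is an assumption - check the Expression Language docs)
        - name: "value.question"
          expression: "fn:substring(value.question, 0, 4000)"
          type: STRING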

Costs incurred

Some public LLMs offer a free tier and then automatically begin charging per prompt (or by prompt chunk). It is up to the application developer to be aware of these possible charges and manage them appropriately.

Topics

Input

  • Structured and unstructured text
  • Implicit topic
  • Templating

Output

  • Structured text
  • Implicit topic
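
Like other agents, the input and output topics referenced above are declared in the topics section of the pipeline file, for example using the topic names from the OpenAI example:

topics:
  - name: "input-topic"
    creation-mode: create-if-not-exists
  - name: "history-topic"
    creation-mode: create-if-not-exists
  - name: "output-topic"
    creation-mode: create-if-not-exists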

Configuration

Check out the full configuration properties in the API Reference page.