compute-ai-embeddings

This agent uses the configured AI model's embedding feature to transform a string of text into a vector embedding. At present, it is assumed that only one AI model is set in configuration.yaml. The agent discovers the model's type (i.e. OpenAI, Hugging Face, Vertex AI) and uses the corresponding library to generate the embedding. It is up to the developer to match the correct embedding model with the configured AI model.

JSON and String inputs

This agent currently only accepts JSON-formatted inputs.

Either ensure the input is JSON, or put the document-to-json agent before the compute-ai-embeddings agent in your pipeline:

pipeline:
  - name: "convert-to-json"
    type: "document-to-json"
    input: "input-topic"
    configuration:
      text-field: "question"
  - name: "compute-embeddings"
    id: "step1"
    type: "compute-ai-embeddings"
    input: "input-topic"
    output: "output-topic"
    configuration:
      model: "${secrets.open-ai.embeddings-model}" # This needs to match the name of the model deployment, not the base model
      embeddings-field: "value.embeddings"
      text: "{{ value.name }} {{ value.description }}"
      batch-size: 10
      # flush-interval is in milliseconds. Take it into account when using this agent in a chat response pipeline,
      # since it directly impacts the latency of the response.
      # For latency-sensitive applications, consider setting batch-size to 1 or flush-interval to 0.
      flush-interval: 500
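
For illustration, assuming the incoming value is a JSON object with name and description fields (any other fields are kept as-is), the agent writes the computed vector into the configured embeddings-field. The numbers below are placeholders; real embeddings have hundreds of dimensions:

{
  "name": "LangStream",
  "description": "a framework for event-driven LLM applications",
  "embeddings": [0.0123, -0.0456, 0.0789]
}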

Using Open AI

Set up the OpenAI LLM configuration. Add the compute-ai-embeddings agent:

- name: "compute-embeddings"
  type: "compute-ai-embeddings"
  input: "input-topic" # optional
  output: "output-topic" # optional
  configuration:
    model: "text-embedding-ada-002"
    embeddings-field: "value.embeddings"
    text: "{{ value }}"

Using Google Vertex AI

Set up the Vertex LLM configuration. Add the compute-ai-embeddings agent:

- name: "compute-embeddings"
  type: "compute-ai-embeddings"
  input: "input-topic" # optional
  output: "output-topic" # optional
  configuration:
    model: "textembedding-gecko"
    embeddings-field: "value.embeddings"
    text: "{{ value }}"

Using Ollama

Set up the Ollama configuration. Refer to the ollama documentation to find a list of models.

Add the compute-ai-embeddings agent:

- name: "compute-embeddings"
  type: "compute-ai-embeddings"
  input: "input-topic" # optional
  output: "output-topic" # optional
  configuration:
    model: "llama2"
    embeddings-field: "value.embeddings"
    text: "{{ value }}"

Ollama models can compute embeddings, but they are currently not as good as the models provided by OpenAI or Hugging Face. Ollama plans to provide models specifically for embeddings in the future.
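
If the Ollama resource is not already defined, a minimal sketch of an ollama-configuration resource in configuration.yaml could look like this; the url is a placeholder pointing at a local Ollama server:

configuration:
  resources:
    - type: "ollama-configuration"
      name: "Ollama configuration"
      configuration:
        url: "${secrets.ollama.url}" # for example http://localhost:11434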

Using Amazon Bedrock

Set up the Amazon Bedrock LLM configuration. Add the compute-ai-embeddings agent:

- name: "compute-embeddings"
  type: "compute-ai-embeddings"
  input: "input-topic" # optional
  output: "output-topic" # optional
  configuration:
    model: "amazon.titan-embed-text-v1"
    embeddings-field: "value.embeddings"
    text: "{{ value }}"

Using Huggingface

Set up the Huggingface resource configuration. Add the compute-ai-embeddings agent:

  - name: "compute-embeddings"
    id: "step1"
    type: "compute-ai-embeddings"
    input: "input-topic"
    output: "output-topic"
    configuration:
      model: "${secrets.open-ai.embeddings-model}" # This needs to match the name of the model deployment, not the base model
      embeddings-field: "value.embeddings"
      text: "{{ value.name }} {{ value.description }}"
      batch-size: 10
      # flush-interval is in milliseconds. Take it into account when using this agent in a chat response pipeline,
      # since it directly impacts the latency of the response.
      # For latency-sensitive applications, consider setting batch-size to 1 or flush-interval to 0.
      flush-interval: 500

Set HUGGING_FACE_PROVIDER=api and provide your Huggingface key and embeddings model to use the HF inference API:

export HUGGING_FACE_PROVIDER=api
export HUGGING_FACE_ACCESS_KEY=your_access_key
export HUGGING_FACE_EMBEDDINGS_MODEL=multilingual-e5-small

To compute text embeddings with a local model instead of calling the Huggingface API, set HUGGING_FACE_PROVIDER=local and set your embeddings model.

HUGGING_FACE_PROVIDER=local
HUGGING_FACE_EMBEDDINGS_MODEL=multilingual-e5-small
HUGGING_FACE_EMBEDDINGS_MODEL_URL=djl://ai.djl.huggingface.pytorch/intfloat/multilingual-e5-small

The above example will use the multilingual-e5-small Huggingface model locally via the Deep Java Library.
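
These environment variables are typically resolved through secrets.yaml so that the agent and resource configuration can reference them. A minimal sketch, where the secret id and key names are placeholders to adapt to whatever your configuration references:

secrets:
  - id: hugging-face
    data:
      provider: "${HUGGING_FACE_PROVIDER:-local}"
      access-key: "${HUGGING_FACE_ACCESS_KEY:-}"
      embeddings-model: "${HUGGING_FACE_EMBEDDINGS_MODEL:-multilingual-e5-small}"
      embeddings-model-url: "${HUGGING_FACE_EMBEDDINGS_MODEL_URL:-}"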

Automatically computing the embeddings over a list of inputs

It is possible to perform the same computation over a list of inputs, for example a list of questions; you can take the Flare pattern as an example. In the example below we use the 'loop-over' capability to compute the embeddings for each document in the list of documents to retrieve.

  - name: "compute-embeddings"
    type: "compute-ai-embeddings"
    configuration:
      loop-over: "value.documents_to_retrieve"
      model: "${secrets.open-ai.embeddings-model}"
      embeddings-field: "record.embeddings"
      text: "{{ record.text }}"

When you use "loop-over", the agent executes for each element in a list instead of operating on the whole message. Use "record.xxx" to refer to the current element in the list.

The snippet above computes the embeddings for each element in the list "documents_to_retrieve". The list is expected to be a struct like this:

{
  "documents_to_retrieve": [
      {
        "text": "the text of the first document"
      },
      {
        "text": "the text of the second document"
      }
    ]
}

After running the agent, the contents of the list are:

{
  "documents_to_retrieve": [
      {
        "text": "the text of the first document",
        "embeddings": [1,2,3,4,5]
       },
       {
        "text": "the text of the second document",
        "embeddings": [6,7,8,9,10]
       }
    ]
}

Topics

Input
- Structured and unstructured text
- Implicit topic
- Templating

Output
- Structured text
- Implicit topic

Configuration

Check out the full configuration properties in the API Reference page.