LangStream Documentation
Langstream.aiLangStream GitHub RepoChangelog
  • LangStream Documentation
  • ❤️Langstream.ai
  • ⭐LangStream GitHub Repo
  • 📜Changelog
  • about
    • What is LangStream?
    • License
  • Get Started
  • installation
    • LangStream CLI
    • Docker
    • Minikube (mini-langstream)
    • Kubernetes
    • Build and install from source
  • Building Applications
    • Vector Databases
    • Application structure
      • Pipelines
      • Instances
      • Configuration
      • Topics
      • Assets
      • Secrets
      • YAML templating
      • Error Handling
      • Stateful agents
      • .langstreamignore
    • Sample App
    • Develop, test and deploy
    • Application Lifecycle
    • Expression Language
    • API Gateways
      • Websocket
      • HTTP
      • Message filtering
      • Gateway authentication
    • API Reference
      • Agents
      • Resources
      • Assets
  • LangStream CLI
    • CLI Commands
    • CLI Configuration
    • Web interface
  • Integrations
    • Large Language Models (LLMs)
      • OpenAI
      • Hugging Face
      • Google Vertex AI
      • Amazon Bedrock
      • Ollama
    • Data storage
      • Astra Vector DB
      • Astra
      • Cassandra
      • Pinecone
      • Milvus
      • Solr
      • JDBC
      • OpenSearch
    • Integrations
      • Apache Kafka Connect
      • Apache Camel
    • LangServe
  • Pipeline Agents
    • Agent Messaging
    • Builtin agents
      • Input & Output
        • webcrawler-source
        • s3-source
        • azure-blob-storage-source
        • sink
        • vector-db-sink
        • camel-source
      • AI Agents
        • ai-chat-completions
        • ai-text-completions
        • compute-ai-embeddings
        • flare-controller
      • Text Processors
        • document-to-json
        • language-detector
        • query
        • query-vector-db
        • re-rank
        • text-normaliser
        • text-extractor
        • text-splitter
        • http-request
      • Data Transform
        • cast
        • compute
        • drop
        • drop-fields
        • merge-key-value
        • unwrap-key-value
      • Flow control
        • dispatch
        • timer-source
        • trigger-event
    • Custom Agents
      • Python sink
      • Python source
      • Python processor
      • Python service
    • Agent Developer Guide
      • Agent Types
      • Agent Creation
      • Configuration and Testing
      • Environment variables
  • Messaging
    • Messaging
      • Apache Pulsar
      • Apache Kafka
      • Pravega.io
  • Patterns
    • RAG pattern
    • FLARE pattern
  • Examples
    • LangServe chatbot
    • LlamaIndex Cassandra sink
Powered by GitBook
On this page
  • Astra DB example
  • AstraDB Topics
  • AstraDB Configuration
  • Pinecone Example
  • Pinecone Topics
  • Pinecone Configuration
Edit on GitHub
  1. Pipeline Agents
  2. Builtin agents
  3. Input & Output

vector-db-sink

PrevioussinkNextcamel-source

Last updated 1 year ago

This agent writes vector data to vector databases. LangStream currently supports and .

Astra DB and Pinecone are both of the type "vector-db-sink" in a LangStream pipeline, but the databases require different configuration values to map the vector data from the sink into the database.

Astra DB example

The Astra DB vector database connection is defined in configuration.yaml:

configuration:
  resources:
    - type: "vector-database"
      name: "AstraDatasource"
      configuration:
        service: "astra"
        username: "${ secrets.astra.username }"
        password: "${ secrets.astra.password }"
        secureBundle: "${ secrets.astra.secureBundle }"

The "Write to Astra DB" pipeline step takes embeddings as input from "input-topic" and writes them to the configured datasource "AstraDatasource":

name: "Write to Astra DB"
topics:
  - name: "input-topic"
    creation-mode: create-if-not-exists
pipeline:
  - name: "Write to Cassandra"
    type: "vector-db-sink"
    input: "input-topic"
    configuration:
      datasource: "AstraDatasource"
      table: "vsearch.products"
      mapping: "id=value.id,description=value.description,name=value.name"

AstraDB Topics

Input

Output

  • None, it’s a sink.

AstraDB Configuration

Label
Type
Description

datasource

String

The datasource is defined in the Resources section of configuration.yaml.

table

String

The `keyspace.table-name` the vector data will be written to

mapping

String

How the data from the input records will be mapped to the corresponding columns in the database table. "id=value.id" maps the "id" value in the input record to the "id" value of the database.

Pinecone Example

The "Write to Pinecone" pipeline step takes embeddings as input from "vectors-topic" and writes them to a Pinecone datasource.

The Pinecone vector database connection is defined in configuration.yaml:

    - type: "vector-database"
      name: "PineconeDatasource"
      configuration:
        service: "pinecone"
        api-key: "${secrets.pinecone.api-key}"
        environment: "${secrets.pinecone.environment}"
        index-name: "${secrets.pinecone.index-name}"
        project-name: "${secrets.pinecone.project-name}"
        server-side-timeout-sec: 10

The "Write to Pinecone" pipeline step takes embeddings as input from "input-topic" and writes them to the configured datasource "PineconeDatasource":

name: "Write to Pinecone DB"
topics:
  - name: "vectors-topic"
    creation-mode: create-if-not-exists
pipeline:
  - name: "Write to Pinecone"
    type: "vector-db-sink"
    configuration:
      datasource: "PineconeDatasource"
      vector.id: "value.id"
      vector.vector: "value.embeddings"
      vector.namespace: "value.namespace"
      vector.metadata.genre: "value.genre"

Pinecone Topics

Input

Output

  • None, it’s a sink.

Pinecone Configuration

Label
Type
Description

datasource

String

The datasource is defined in the Resources section of configuration.yaml.

vector.id

String

Maps id to vector.id

vector.vector

String

Maps the input value "vector" to "vector.vector" in the database.

vector.namespace

String

Maps the input value "namespace" to "vector.namespace" in the database.

vector.metadata.{metadataField}

String

Maps the input value "metadata.{metadataField}" to "vector.metadata.{metadataField}"

Structured and unstructured text

Implicit topic

Templating

Structured and unstructured text

Implicit topic

Templating

AstraDB
Pinecone
?
?
?
?
?
?