Astra


Last updated 1 year ago

Connecting to DataStax Astra DB

To use DataStax Astra DB as a vector database, create a "vector-database" resource in your configuration.yaml file.

If you want to use the JSON-based API instead, create an "astra-vector-db" resource. See the Astra Vector DB documentation.

resources:
  - type: "vector-database"
    name: "AstraDatasource"
    configuration:
      service: "astra"
      clientId: "${ secrets.astra.clientId }"
      secret: "${ secrets.astra.secret }"
      secureBundle: "${ secrets.astra.secureBundle }"
      database: "${ secrets.astra.database }"
      token: "${ secrets.astra.token }"
      environment: "${ secrets.astra.environment }"

Required parameters:

  • clientId: this is the clientId provided by the Astra DB service

  • secret: this is the secret provided by the Astra DB service

  • token: this is the token provided by the Astra DB service

  • database: this is the database name provided by the Astra DB service

Optional parameters:

  • secureBundle: this is the secure bundle provided by the Astra DB service

  • environment: this is the Astra DB environment you are using. It can be PROD, STAGING or DEV. The default is PROD; the other values are useful only for Astra developers.
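
As a minimal sketch, the resource can be declared with only the required parameters listed above. This assumes the secrets are defined under the "astra" secret id, as in the full example; with secureBundle and environment omitted, the environment falls back to PROD and the secure bundle is downloaded as described in the next section.

```yaml
# configuration.yaml: sketch using only the required parameters
resources:
  - type: "vector-database"
    name: "AstraDatasource"
    configuration:
      service: "astra"
      clientId: "${ secrets.astra.clientId }"
      secret: "${ secrets.astra.secret }"
      token: "${ secrets.astra.token }"
      database: "${ secrets.astra.database }"
      # secureBundle omitted: LangStream downloads it using token and database
      # environment omitted: defaults to PROD
```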

Handling the secure bundle zip file

The secure bundle is a file that contains some TLS certificates and endpoint information to connect to the Astra DB service.

This is an optional parameter, as LangStream is able to download it for you when you provide the token and database parameters.

If you do provide the secure bundle, it must be a base64-encoded string like this:

   secureBundle: "base64:AAAAA2131232133123122313...."

You can also provide the secure bundle as a file. In this case, use the following syntax:

   secureBundle: "<file:secure-bundle.zip>"

With this syntax, the LangStream CLI reads the file and encodes it in base64 for you. The path is relative to the file that contains this value. This syntax works only in a secrets.yaml file or an instance.yaml file. It does not work directly in a configuration.yaml file, because a configuration file should not store secrets, only references to secrets (${secrets.xxx}) and to global variables (${globals.xxx}).
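
The following sketch shows how the file syntax might look inside a secrets.yaml file. The overall layout (a "secrets" list with "id" and "data") and the environment variable names such as ASTRA_CLIENT_ID are assumptions for illustration, not taken from this page; the secret id "astra" is chosen so that it matches the ${ secrets.astra.* } references used in configuration.yaml.

```yaml
# secrets.yaml: illustrative sketch, layout and variable names are assumptions
secrets:
  - id: astra
    data:
      clientId: "${ASTRA_CLIENT_ID:-}"      # hypothetical environment variable
      secret: "${ASTRA_SECRET:-}"           # hypothetical environment variable
      token: "${ASTRA_TOKEN:-}"             # hypothetical environment variable
      database: "${ASTRA_DATABASE:-}"       # hypothetical environment variable
      # Path is relative to this secrets.yaml file;
      # the CLI reads the zip and base64-encodes it.
      secureBundle: "<file:secure-bundle.zip>"
```

The configuration.yaml file then keeps only the reference secureBundle: "${ secrets.astra.secureBundle }", so the bundle content never appears in the configuration file itself.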

Special assets for Astra

For "Vector Database" resources based on Astra, you can use special assets in your pipeline file: "astra-keyspace" and "cassandra-table".

assets:
  - name: "langstream-keyspace"
    asset-type: "astra-keyspace"
    creation-mode: create-if-not-exists    
    config:
      keyspace: "langstream"
      datasource: "AstraDatasource"
  - name: "products-table"
    asset-type: "cassandra-table"
    creation-mode: create-if-not-exists
    deletion-mode: delete
    config:
      table-name: "products"
      keyspace: "langstream"
      datasource: "AstraDatasource"
      create-statements:
        - "CREATE TABLE IF NOT EXISTS langstream.products (id int PRIMARY KEY, name TEXT, description TEXT, embeddings VECTOR<FLOAT,1536>);"
        - "CREATE CUSTOM INDEX IF NOT EXISTS products_ann_index ON langstream.products(embeddings) USING 'StorageAttachedIndex';"
      delete-statements:
        - "TRUNCATE TABLE langstream.products;"

With the "astra-keyspace" asset, you can create a keyspace in your Astra DB instance. The keyspace is a logical container for tables. It is similar to a database in a relational database.

With the "cassandra-table" asset you can create a table in your Astra DB instance. The table is a collection of rows that share a schema of columns. It is similar to a table in a relational database.

Reading and writing to Astra

Configuration

Astra is compatible with Cassandra, so you can use the same agents you use for Cassandra to read and write to Astra. See the Cassandra documentation.

Check out the full configuration properties in the API Reference page.
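
As a rough sketch of what writing to Astra could look like, the fragment below uses the vector-db-sink agent against the "products" table created by the asset example. The property names (datasource, table-name, keyspace, mapping) follow the Cassandra sink configuration as an assumption and should be checked against the API Reference; the topic name "products-topic" and the value.* field names are hypothetical.

```yaml
# pipeline.yaml fragment: illustrative sketch, verify properties in the API Reference
pipeline:
  - name: "Write products to Astra"
    type: "vector-db-sink"
    input: "products-topic"            # hypothetical topic, must be declared under topics
    configuration:
      datasource: "AstraDatasource"    # name of the vector-database resource
      table-name: "products"
      keyspace: "langstream"
      # column=record-field pairs; the record field names are assumptions
      mapping: "id=value.id, name=value.name, description=value.description, embeddings=value.embeddings"
```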