Cassandra
Connecting to Apache Cassandra
To use Apache Cassandra as a vector database, create a "vector-database" (or "datasource") resource in your configuration.yaml file.
Vector Search has been supported since Cassandra 5.0, so you need Cassandra 5.0 or later, or an equivalent distribution.
Required parameters:
contact-points: the address of the Cassandra node to connect to
loadBalancing-localDc: the name of the local datacenter to connect to
Optional parameters:
port: the port to connect to Cassandra (default is 9042)
username: the username
password: the password
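As a rough sketch, a resource definition built from the parameters above might look like the following. The resource name "CassandraDatasource", the "service" entry, the surrounding "configuration/resources" layout, and all values are assumptions made for illustration. They are not specified in this section, so check them against the API Reference.

```yaml
# configuration.yaml (sketch only; layout and values are assumptions)
configuration:
  resources:
    - type: "vector-database"
      name: "CassandraDatasource"       # hypothetical name, chosen for this example
      configuration:
        service: "cassandra"            # assumption: selects the Cassandra implementation
        # Required parameters
        contact-points: "cassandra.example.internal"   # placeholder address
        loadBalancing-localDc: "datacenter1"           # placeholder datacenter name
        # Optional parameters
        port: 9042                      # default port
        username: "my-user"             # placeholder
        password: "my-password"         # placeholder; avoid committing real credentials
```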
Special assets for Cassandra
For "Vector Database" resources based on Cassandra, you can use special assets in your pipeline: "cassandra-keyspace" and "cassandra-table".
With the "cassandra-keyspace" asset you can create a keyspace in your Cassandra cluster. The keyspace is a logical container for tables. It is similar to a database in a relational database.
With the "cassandra-table" asset you can create a table in your Cassandra cluster. The table is a collection of rows that share a schema of columns. It is similar to a table in a relational database.
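The two asset types could be declared along these lines. This is a hedged sketch: only the asset types "cassandra-keyspace" and "cassandra-table" come from the text above. The field names ("creation-mode", "config", "datasource", "create-statements"), the keyspace and table names, the datasource name, and the vector dimension of 1536 are assumptions for illustration and should be verified against the API Reference.

```yaml
# pipeline.yaml (sketch only; field names other than the asset types are assumptions)
assets:
  - name: "vsearch-keyspace"                 # hypothetical asset name
    asset-type: "cassandra-keyspace"
    creation-mode: create-if-not-exists      # assumption
    config:
      keyspace: "vsearch"                    # hypothetical keyspace
      datasource: "CassandraDatasource"      # hypothetical resource name
      create-statements:
        - "CREATE KEYSPACE IF NOT EXISTS vsearch WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1};"
  - name: "documents-table"                  # hypothetical asset name
    asset-type: "cassandra-table"
    creation-mode: create-if-not-exists      # assumption
    config:
      table-name: "documents"                # hypothetical table
      keyspace: "vsearch"
      datasource: "CassandraDatasource"
      create-statements:
        # A table with a vector column, plus the index needed for vector search
        - "CREATE TABLE IF NOT EXISTS vsearch.documents (filename TEXT, chunk_id INT, text TEXT, embeddings_vector VECTOR<FLOAT, 1536>, PRIMARY KEY (filename, chunk_id));"
        - "CREATE CUSTOM INDEX IF NOT EXISTS documents_ann_index ON vsearch.documents(embeddings_vector) USING 'StorageAttachedIndex';"
```

Declaring the keyspace before the table mirrors the dependency between them: the table's CQL statement refers to the keyspace by name.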
Writing to Cassandra
Use the "vector-db-sink" agent to write to a Cassandra database, with the following parameters:
table-name: the name of the table to write to
keyspace: the name of the keyspace to write to
mapping: a comma-separated list of field mappings, in the form "field-name=expression"
Each expression can reference the value of the current message, for instance "value.filename".
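A minimal sketch of such an agent is shown below. The "table-name", "keyspace", and "mapping" parameters and the "value.filename" expression come from the description above. The agent name, the input topic, the "datasource" entry, and the table, keyspace, and column names are hypothetical and only serve the example.

```yaml
# pipeline.yaml (sketch only; names and the datasource entry are assumptions)
pipeline:
  - name: "Write to Cassandra"             # hypothetical agent name
    type: "vector-db-sink"
    input: "chunks-topic"                  # hypothetical input topic
    configuration:
      datasource: "CassandraDatasource"    # assumption: name of the vector-database resource
      table-name: "documents"              # hypothetical table
      keyspace: "vsearch"                  # hypothetical keyspace
      # Each entry maps a table column to an expression over the current message
      mapping: "filename=value.filename, chunk_id=value.chunk_id, text=value.text, embeddings_vector=value.embeddings_vector"
```

In this sketch, each name on the left of "=" is a column of the target table, and each expression on the right reads a field from the message value.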
Internally, LangStream uses the DataStax Connector for Apache Kafka and Pulsar to write to Cassandra. You can find more information about the mapping parameters in the connector documentation.
Configuration
Check out the full configuration properties in the API Reference page.