
Working with Vector Search¶

This page provides a technical overview of how to work with Vector Search in ScyllaDB. It covers the vector data type, vector indexes, similarity functions, index tuning, ANN queries, and driver integration.

Workflow¶

Create a keyspace.
Create a table with a vector-typed column to store your embedding vectors.
Insert vector data (embeddings) into the table.
Create a vector index on the vector column to enable efficient similarity search.
Perform similarity searches using the ANN OF query to find vectors most similar to your input.

Vector Data Type¶

The VECTOR data type allows you to store fixed-length numeric vectors as a native column type in ScyllaDB tables. These vectors can represent embedding vectors or other high-dimensional numeric data used for similarity search.

Syntax: vector<element_type, dimension> (e.g. vector<float, 768>)
Element types: Typically floating-point types (e.g., float).
Dimensions: Supports vectors with dimensionality ranging from 1 up to 16,000.

This vector data type integrates with ScyllaDB’s native protocol (v5) and is fully supported by the CQL interface.

See Data Types - Vectors in the ScyllaDB documentation for details.

Table with Vector Column¶

You can store and query vectors in a table that contains a vector-typed column (e.g., vector<float, 768>).

In the following example, a comments table is created in the myapp keyspace. In addition to columns for storing and identifying comments (commenter name, comment text, comment ID, etc.), it has a comment_vector vector-typed column for storing vectors of float type.

Note

This example uses 5-dimensional vectors for clarity. In production, you will typically use higher dimensions (384-1536) to match your embedding model’s output.

CREATE TABLE IF NOT EXISTS myapp.comments (
  record_id timeuuid,
  id uuid,
  commenter text,
  comment text,
  comment_vector vector<float, 5>,
  created_at timestamp,
  PRIMARY KEY (id, created_at)
);

Tablets Requirement¶

Caution

Tables that include a vector-typed column must reside in a keyspace with tablets enabled. If you attempt to create a table with a vector column in a keyspace where tablets are disabled, ScyllaDB will return an error.

All ScyllaDB versions currently used by ScyllaDB Cloud enable tablets by default. This means every new keyspace automatically uses tablets-based data distribution, and no additional configuration is required:

CREATE KEYSPACE myapp;

Embeddings¶

Embeddings are fixed-length numeric vectors that represent data — such as text, images, or audio — in a high-dimensional space, capturing their semantic or structural meaning. They are typically generated by external machine learning or deep learning models trained for tasks like semantic search, recommendation, or classification.

The embedding pipeline works as follows:

Your application sends raw data (text, image, etc.) to an embedding model (e.g., OpenAI, Cohere, or an open-source sentence-transformer).
The model returns a fixed-length vector of floating-point numbers (e.g., 384 floats for all-MiniLM-L6-v2).
Your application inserts the vector into a ScyllaDB table.
At query time, your application embeds the query text using the same model and runs an ANN OF query against the stored vectors.

Caution

You must use the same embedding model for both indexing and querying. Vectors from different models live in incompatible vector spaces and cannot be meaningfully compared. If you change your embedding model, you must re-embed and re-index all your data.
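The write path and query path of the pipeline above can be sketched in Python. This is a minimal illustration under stated assumptions, not a real integration: `toy_embed` is a hypothetical stand-in that hashes text into a fixed-length unit vector (it captures no semantic meaning), and `MODEL` is a made-up model name. In practice you would replace `toy_embed` with a call to your embedding model and bind the resulting parameters with a driver.

```python
import hashlib
import math


def toy_embed(text: str, model: str, dimension: int = 5) -> list[float]:
    """Hypothetical stand-in for an embedding model call (deterministic, not semantic)."""
    if not 1 <= dimension <= 32:
        raise ValueError("this toy function supports 1 to 32 dimensions")
    digest = hashlib.sha256(f"{model}\x00{text}".encode("utf-8")).digest()
    # Map the first `dimension` bytes to values in [-1.0, 1.0], then unit-normalize.
    raw = [(byte - 127.5) / 127.5 for byte in digest[:dimension]]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]


MODEL = "toy-model-v1"  # hypothetical model name

# Write path: embed the raw text, then bind the vector as an INSERT parameter.
comment = "I like vector search in ScyllaDB."
insert_params = ("Alice", comment, toy_embed(comment, MODEL))

# Query path: embed the query text with the SAME model, then bind it to ANN OF.
query_params = (toy_embed("vector search", MODEL),)

# The same text embedded with a different model gives an unrelated vector,
# which is why stored vectors and query vectors must share one model.
other = toy_embed(comment, "toy-model-v2")
print(insert_params[2] == toy_embed(comment, MODEL))
print(insert_params[2] == other)
```

The tuples are shaped to match the placeholders used in the driver examples later on this page.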

ScyllaDB does not generate embeddings — it stores and indexes the vectors produced by your embedding pipeline. For background on how embeddings encode meaning, see How Embeddings Work in the Concepts page.

To insert an embedding vector into a ScyllaDB table, use a standard INSERT statement with a list of numeric values matching the vector column’s defined dimension and data type.

Example:

INSERT INTO myapp.comments (
    record_id, id, commenter, comment, comment_vector, created_at
) VALUES (
    now(), uuid(), 'Alice', 'I like vector search in ScyllaDB.',
    [0.12, 0.34, 0.56, 0.78, 0.91], toTimestamp(now())
);

The vector must match the dimension and element type declared in the table schema, e.g., vector<float, 5>.
All vector values must be numeric (e.g., float), and enclosed in square brackets.
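A client-side check can catch these mismatches before the statement reaches the database. The following is a minimal sketch; `validate_vector` is a hypothetical helper, and rejecting NaN and infinite values is an extra precaution assumed here rather than a rule stated on this page.

```python
import math
from numbers import Real

MAX_DIMENSION = 16_000  # upper bound on dimensions stated on this page


def validate_vector(vector, dimension: int) -> list[float]:
    """Return the vector as a list of floats, or raise if it cannot match the column."""
    if not 1 <= dimension <= MAX_DIMENSION:
        raise ValueError(f"dimension must be between 1 and {MAX_DIMENSION}")
    if len(vector) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(vector)}")
    checked = []
    for index, value in enumerate(vector):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise TypeError(f"value at position {index} is not numeric: {value!r}")
        if not math.isfinite(value):
            raise ValueError(f"value at position {index} is not finite: {value!r}")
        checked.append(float(value))
    return checked


print(validate_vector([0.12, 0.34, 0.56, 0.78, 0.91], dimension=5))
```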

Choosing an Embedding Model¶

When selecting an embedding model, consider the following trade-offs:

Dimensions — higher dimensions capture more nuance but use more memory and increase query latency. Common choices: 384 (lightweight), 768 (general-purpose), 1536 (high accuracy).
Model family — popular options include OpenAI embeddings (text-embedding-3-small, text-embedding-3-large), Cohere Embed, and open-source sentence-transformers (e.g., all-MiniLM-L6-v2).
Normalization — some models output unit-normalized vectors (suitable for cosine similarity), while others do not (use dot product instead).

ScyllaDB supports dimensions from 1 to 16,000, so it is compatible with all major embedding models.
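To find out whether a model's output is unit-normalized, you can measure the L2 norm of a few sample vectors. The sketch below uses hypothetical helper names, and the tolerance of 1e-3 is an assumption you may want to adjust.

```python
import math


def l2_norm(vector) -> float:
    """Euclidean length of the vector."""
    return math.sqrt(sum(x * x for x in vector))


def is_unit_normalized(vector, tolerance: float = 1e-3) -> bool:
    """True if the vector's length is within `tolerance` of 1.0."""
    return abs(l2_norm(vector) - 1.0) <= tolerance


def normalize(vector) -> list[float]:
    """Scale the vector to unit length; the zero vector has no direction."""
    norm = l2_norm(vector)
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vector]


sample = [0.12, 0.34, 0.56, 0.78, 0.91]
print(is_unit_normalized(sample))
print(is_unit_normalized(normalize(sample)))
```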

Vector Index Type¶

Before you query the data, you need to create a vector index to enable fast similarity search over vector columns. Without an index, a similarity query would need to compare the query vector against every stored vector — a brute-force scan that does not scale. The vector index pre-organizes vectors into a navigable graph for \(O(\log N)\) search. See Why You Need an Index for details.

This index type is based on the HNSW (Hierarchical Navigable Small World) algorithm and supports Approximate Nearest Neighbor (ANN) search with configurable similarity functions.

Creation: Use a custom index on a vector column.
Similarity functions supported: DOT_PRODUCT, COSINE (default), and EUCLIDEAN.
Index parameters: Tunable HNSW parameters such as m (maximum node connections), ef_construct (construction beam width), and ef_search (search beam width).

Example:

CREATE CUSTOM INDEX IF NOT EXISTS ann_idx
ON myapp.comments(comment_vector)
USING 'vector_index'
WITH OPTIONS = { 'similarity_function': 'DOT_PRODUCT' };

See Global Secondary Indexes - Vector Index in the ScyllaDB documentation for details.
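For contrast, the brute-force scan mentioned at the start of this section can be sketched as follows. It computes one distance per stored vector, so its cost grows linearly with the number of rows. This is an illustrative in-memory version that uses Euclidean distance; `brute_force_top_k` and the sample rows are hypothetical, and it says nothing about how ScyllaDB executes queries internally.

```python
import heapq
import math


def brute_force_top_k(query, rows, k):
    """Exact nearest neighbors: one distance computation per stored row."""
    scored = ((row_id, math.dist(query, vector)) for row_id, vector in rows)
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


rows = [
    ("a", [1.0, 0.0]),
    ("b", [0.0, 1.0]),
    ("c", [0.9, 0.1]),
]
for row_id, distance in brute_force_top_k([1.0, 0.0], rows, k=2):
    print(row_id, round(distance, 4))
```

An exact scan like this is still useful on a small sample as a ground truth for checking the results an approximate index returns.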

Choosing a Similarity Function¶

The similarity function determines how distances between vectors are measured during search. Choose based on your embedding model and use case:

Function	When to use	Notes
`COSINE` (default)	Normalized embeddings (unit vectors). Most text embedding models (OpenAI, sentence-transformers) output normalized vectors.	Measures the angle between vectors. Ideal when magnitude is irrelevant and only direction matters.
`DOT_PRODUCT`	Non-normalized embeddings, or when magnitude carries meaning (e.g., popularity-weighted vectors).	May be slightly faster than cosine because it avoids the normalization division. Requires careful handling if vectors have varying magnitudes.
`EUCLIDEAN`	Spatial data, geographic coordinates, or when absolute distance matters.	Measures straight-line distance in vector space. Less common for text embeddings but useful for spatial applications.
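The difference between the three functions is easiest to see on two vectors that point in the same direction but differ in magnitude. The sketch below uses the textbook definitions; how ScyllaDB converts these measures into ranking scores is not specified on this page, so treat it as an illustration of the math only.

```python
import math


def _check_same_length(a, b) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")


def dot_product(a, b) -> float:
    _check_same_length(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b) -> float:
    """Dot product divided by both lengths, so magnitude cancels out."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for the zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b) -> float:
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the magnitude
print(cosine_similarity(a, b))   # direction only
print(dot_product(a, b))         # grows with magnitude
print(euclidean_distance(a, b))  # straight-line distance
```

Cosine treats `a` and `b` as identical in direction, while dot product and Euclidean distance both react to the difference in magnitude.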

Tuning the Vector Index¶

The HNSW index has tunable parameters that affect the trade-off between recall (search accuracy), build speed, and query latency:

Parameter	Default	Description
`maximum_node_connections` (m)	`16`	Maximum number of connections per node in the HNSW graph. Higher values improve recall but increase memory usage and index build time.
`construction_beam_width` (ef_construct)	`128`	Size of the dynamic candidate list during index construction. Higher values yield a higher-quality graph at the cost of slower builds.
`search_beam_width` (ef_search)	`128`	Size of the dynamic candidate list during query time. Higher values improve recall at the cost of higher query latency.

Example with tuned parameters:

CREATE CUSTOM INDEX IF NOT EXISTS tuned_ann_idx
ON myapp.comments(comment_vector)
USING 'vector_index'
WITH OPTIONS = {
  'similarity_function': 'COSINE',
  'maximum_node_connections': '32',
  'construction_beam_width': '200',
  'search_beam_width': '200'
};

General guidance:

Start with defaults. Increase search_beam_width if recall is too low.
Increase maximum_node_connections for high-dimensional vectors (>512 dimensions).
Remember: you cannot alter index options after creation. Drop and recreate the index to change parameters.
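To judge whether recall is too low, you need a number to compare across index configurations. One common measure is recall@k: the fraction of the true top-k neighbors that the ANN query actually returned. The sketch below assumes you already have the row IDs from an ANN query and the exact top-k IDs from a brute-force comparison on a sample; `recall_at_k` is a hypothetical helper.

```python
def recall_at_k(approx_ids, exact_ids, k: int) -> float:
    """Fraction of the exact top-k IDs that also appear in the approximate top-k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    exact = set(exact_ids[:k])
    if not exact:
        raise ValueError("exact_ids must not be empty")
    found = set(approx_ids[:k]) & exact
    return len(found) / len(exact)


# Two of the three true neighbors were returned by the approximate search.
print(recall_at_k(["a", "b", "x"], ["a", "b", "c"], k=3))
```

Averaging this value over a set of representative queries gives a baseline to re-measure after changing `search_beam_width` or `maximum_node_connections`.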

ANN OF Queries¶

Approximate Nearest Neighbor (ANN) is a search technique used to find data points in large, high-dimensional datasets that are most similar to a given query vector. Rather than computing exact distances for all entries, ANN algorithms trade off a small amount of accuracy for significant speed improvements, returning results that are sufficiently similar. This makes ANN especially effective for applications like semantic search, recommendations, image and audio retrieval, and generative AI, where real-time response and scalability are critical.

Once a vector index is created on a VECTOR-typed column, you can use the ANN OF query to perform ANN searches. This query allows you to efficiently retrieve the top-k rows with vectors most similar to a given input vector, using the similarity function defined while creating the vector index.

Syntax:

SELECT column1, column2, ...
FROM keyspace.table
ORDER BY vector_column ANN OF [v1, v2, ..., vn]
LIMIT k;

vector_column: The name of the indexed vector column used for similarity search.
[v1, …, vn]: The input query vector. It must match the dimensionality of the indexed column.
k: The number of nearest neighbors to return (required).

The query returns up to k most similar vectors, ranked according to the similarity function defined in the index (COSINE, DOT_PRODUCT, or EUCLIDEAN).

Example:

SELECT id, commenter, comment, created_at
FROM myapp.comments
ORDER BY comment_vector ANN OF [0.12, 0.34, 0.56, 0.78, 0.91]
LIMIT 5;

See Data Manipulation - SELECT - Vector Queries in the ScyllaDB documentation for details.
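When queries are assembled in application code, the constraints above (matching dimensionality and a required LIMIT) can be checked before the statement is sent. The following sketch builds the query text for inspection or logging; `build_ann_query` is a hypothetical helper, and its identifier check is deliberately conservative. For actual execution, prefer bound parameters as shown in the driver examples.

```python
import re

# Conservative check for unquoted identifiers: a letter, then letters, digits, or underscores.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def build_ann_query(keyspace, table, columns, vector_column, query_vector, k, dimension):
    """Return an ANN OF SELECT statement after validating its inputs."""
    for name in (keyspace, table, vector_column, *columns):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    if isinstance(k, bool) or not isinstance(k, int) or k < 1:
        raise ValueError("k (LIMIT) must be a positive integer")
    if len(query_vector) != dimension:
        raise ValueError(f"query vector has {len(query_vector)} values, expected {dimension}")
    literal = "[" + ", ".join(repr(float(x)) for x in query_vector) + "]"
    return (
        f"SELECT {', '.join(columns)} FROM {keyspace}.{table} "
        f"ORDER BY {vector_column} ANN OF {literal} LIMIT {k};"
    )


print(build_ann_query(
    "myapp", "comments", ["id", "commenter", "comment", "created_at"],
    "comment_vector", [0.12, 0.34, 0.56, 0.78, 0.91], k=5, dimension=5,
))
```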

Write-to-Query Latency¶

After inserting or updating a vector, there is a short delay before the new data becomes available in similarity search results. ScyllaDB uses a dual CDC (Change Data Capture) reader system to propagate changes to the vector index:

A fine-grained reader with sub-second intervals provides low-latency updates (typical p50 latency under 1 second).
A wide-framed reader with a 30-second safety interval ensures consistency and catches any data missed by the fast reader.

For most workloads, newly inserted vectors are queryable within approximately 1 second.
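Code that must read its own writes (integration tests, for example) can poll until the new row shows up in search results. The sketch below is driver-agnostic: `search` is a hypothetical zero-argument callable that runs your ANN OF query and returns the matching row IDs. The default timeout of 35 seconds is an assumption chosen to exceed the 30-second safety interval described above.

```python
import time


def wait_until_searchable(search, row_id, timeout_s=35.0, interval_s=0.25,
                          clock=time.monotonic, sleep=time.sleep) -> bool:
    """Poll `search()` until `row_id` appears in its results or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if row_id in set(search()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Usage with a stand-in search function that finds the row on the third attempt.
attempts = {"count": 0}


def fake_search():
    attempts["count"] += 1
    return ["row-1"] if attempts["count"] >= 3 else []


print(wait_until_searchable(fake_search, "row-1", timeout_s=5.0, interval_s=0.0))
```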

Vector Search in ScyllaDB Drivers¶

If you use a ScyllaDB driver for application development and want to use the Vector Search feature, note that:

Your driver version must support the vector data type. See ScyllaDB Drivers - Support for Vector Search to check from which version the vector type is supported in each driver.
Vector search requires the driver to be configured with a DC-aware load balancing policy.

Driver Examples¶

The following examples demonstrate connecting to a ScyllaDB Cloud cluster, inserting a vector, and running a similarity query in different programming languages.

Python: uses the scylla-driver package.

import ssl
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra.policies import DCAwareRoundRobinPolicy

auth = PlainTextAuthProvider(username='scylla', password='YOUR_PASSWORD')
ssl_context = ssl.create_default_context()
cluster = Cluster(
    contact_points=['node-0.your-cluster.cloud.scylladb.com'],
    port=9042,
    auth_provider=auth,
    load_balancing_policy=DCAwareRoundRobinPolicy(local_dc='AWS_US_EAST_1'),
    ssl_context=ssl_context,
)
session = cluster.connect('myapp')

# Insert a vector
session.execute(
    """INSERT INTO comments (record_id, id, commenter, comment, comment_vector, created_at)
       VALUES (now(), uuid(), %s, %s, %s, toTimestamp(now()))""",
    ('Alice', 'I like vector search in ScyllaDB.', [0.12, 0.34, 0.56, 0.78, 0.91])
)

# Run a similarity search
rows = session.execute(
    """SELECT commenter, comment FROM comments
       ORDER BY comment_vector ANN OF %s LIMIT 3""",
    ([0.12, 0.34, 0.56, 0.78, 0.91],)
)
for row in rows:
    print(f"{row.commenter}: {row.comment}")

Node.js: uses the cassandra-driver package (compatible with ScyllaDB).

const cassandra = require('cassandra-driver');

const client = new cassandra.Client({
  contactPoints: ['node-0.your-cluster.cloud.scylladb.com'],
  localDataCenter: 'AWS_US_EAST_1',
  keyspace: 'myapp',
  authProvider: new cassandra.auth.PlainTextAuthProvider(
    'scylla', 'YOUR_PASSWORD'
  ),
  sslOptions: { rejectUnauthorized: true },
});

async function main() {
  await client.connect();

  // Insert a vector
  await client.execute(
    `INSERT INTO comments (record_id, id, commenter, comment, comment_vector, created_at)
     VALUES (now(), uuid(), ?, ?, ?, toTimestamp(now()))`,
    ['Alice', 'I like vector search in ScyllaDB.', [0.12, 0.34, 0.56, 0.78, 0.91]],
    { prepare: true }
  );

  // Run a similarity search
  const result = await client.execute(
    `SELECT commenter, comment FROM comments
     ORDER BY comment_vector ANN OF ? LIMIT 3`,
    [[0.12, 0.34, 0.56, 0.78, 0.91]],
    { prepare: true }
  );
  for (const row of result.rows) {
    console.log(`${row.commenter}: ${row.comment}`);
  }

  await client.shutdown();
}

main().catch(console.error);

Java: uses the scylla-java-driver.

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;
import com.datastax.oss.driver.api.core.data.CqlVector;
import java.net.InetSocketAddress;

public class VectorSearchExample {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress(
                    "node-0.your-cluster.cloud.scylladb.com", 9042))
                .withLocalDatacenter("AWS_US_EAST_1")
                .withKeyspace("myapp")
                .withAuthCredentials("scylla", "YOUR_PASSWORD")
                .build()) {

            CqlVector<Float> vector = CqlVector.newInstance(
                0.12f, 0.34f, 0.56f, 0.78f, 0.91f);

            // Insert a vector
            session.execute(session.prepare(
                "INSERT INTO comments (record_id, id, commenter, comment, "
                + "comment_vector, created_at) "
                + "VALUES (now(), uuid(), ?, ?, ?, toTimestamp(now()))")
                .bind("Alice", "I like vector search in ScyllaDB.", vector));

            // Run a similarity search
            ResultSet rs = session.execute(session.prepare(
                "SELECT commenter, comment FROM comments "
                + "ORDER BY comment_vector ANN OF ? LIMIT 3")
                .bind(vector));
            for (Row row : rs) {
                System.out.printf("%s: %s%n",
                    row.getString("commenter"), row.getString("comment"));
            }
        }
    }
}

Go: uses gocql.

package main

import (
    "fmt"
    "github.com/gocql/gocql"
)

func main() {
    cluster := gocql.NewCluster("node-0.your-cluster.cloud.scylladb.com")
    cluster.Keyspace = "myapp"
    cluster.Authenticator = gocql.PasswordAuthenticator{
        Username: "scylla",
        Password: "YOUR_PASSWORD",
    }
    cluster.SslOpts = &gocql.SslOptions{
        EnableHostVerification: true,
    }
    cluster.PoolConfig.HostSelectionPolicy = gocql.DCAwareRoundRobinPolicy("AWS_US_EAST_1")

    session, err := cluster.CreateSession()
    if err != nil {
        panic(err)
    }
    defer session.Close()

    vector := []float32{0.12, 0.34, 0.56, 0.78, 0.91}

    // Insert a vector
    err = session.Query(
        `INSERT INTO comments (record_id, id, commenter, comment, comment_vector, created_at)
         VALUES (now(), uuid(), ?, ?, ?, toTimestamp(now()))`,
        "Alice", "I like vector search in ScyllaDB.", vector,
    ).Exec()
    if err != nil {
        panic(err)
    }

    // Run a similarity search
    iter := session.Query(
        `SELECT commenter, comment FROM comments
         ORDER BY comment_vector ANN OF ? LIMIT 3`, vector,
    ).Iter()

    var commenter, comment string
    for iter.Scan(&commenter, &comment) {
        fmt.Printf("%s: %s\n", commenter, comment)
    }
    if err := iter.Close(); err != nil {
        panic(err)
    }
}

Framework Integration¶

Beyond the native drivers, ScyllaDB Vector Search is compatible with AI frameworks that integrate through the Cassandra connector (CassIO), such as LangChain and LlamaIndex. ScyllaDB recognizes the Cassandra Storage Attached Index (SAI) statements these libraries generate, so they can run against a ScyllaDB cluster with little or no code changes. See LangChain and CassIO Compatibility for requirements, limitations, and a complete RAG example.

Altering a Vector Index¶

ScyllaDB does not support ALTER INDEX for vector indexes — you cannot change the similarity function, HNSW parameters, or any other index option after creation.

To work around this constraint (for example, to change the similarity function from COSINE to DOT_PRODUCT, or to introduce quantization), use the following procedure to migrate the index with zero downtime.

Procedure¶

Starting with ScyllaDB 2026.2, you can create multiple vector indexes on the same column. This allows you to rebuild an index in place — the old index continues serving queries while the new one is being built, and once the new index is ready, queries are automatically routed to it.

1. Verify that the Vector Search instances have enough memory to hold both the existing index and the new one simultaneously. Use the memory estimation formula to calculate the required memory for both indexes. If the available memory is insufficient, resize the Vector Search deployment before proceeding.
2. Create a new vector index on the same column with the desired configuration. You must give the new index a different name:
```
CREATE CUSTOM INDEX IF NOT EXISTS ann_idx_v2
ON myapp.comments(comment_vector)
USING 'vector_index'
WITH OPTIONS = { 'similarity_function': 'DOT_PRODUCT' };
```
While the new index is being built, ANN queries continue to be served by the old index.
3. Wait for the new index to finish building. Once the new index is ready, Vector Search automatically routes queries to the newest serving index. No application changes are required.
4. Drop the old index once you have confirmed that the new index is serving queries correctly:
```
DROP INDEX IF EXISTS ann_idx;
```
5. Resize the Vector Search instances down if the removal of the old index freed enough memory. Check the current memory usage and reduce the instance size if there is sufficient free RAM.

Caution

During steps 2-3, both indexes exist simultaneously on the same column. Ensure that your Vector Search instances have enough memory to accommodate both indexes; otherwise the index build in step 2 may fail or degrade performance.

On ScyllaDB versions prior to 2026.2, a single vector column can only have one vector index at a time. To change the index configuration you must create a temporary duplicate vector column and migrate the index to it.

1. Add a duplicate vector column to the table. The new column must have the same vector type and dimensions as the original.
```
ALTER TABLE myapp.comments
ADD comment_vector_v2 vector<float, 5>;
```

2. Update your application to write embeddings to both columns. Every new INSERT or UPDATE must populate both the original (comment_vector) and the new (comment_vector_v2) column.

INSERT INTO myapp.comments (
    record_id, id, commenter, comment,
    comment_vector, comment_vector_v2, created_at
) VALUES (
    now(), uuid(), 'Bob', 'Dual-write example.',
    [0.10, 0.20, 0.30, 0.40, 0.50],
    [0.10, 0.20, 0.30, 0.40, 0.50],
    toTimestamp(now())
);

3. Backfill the new column for all existing rows that were written before the dual-write was enabled. Perform a full table scan and copy the vector values:
```
-- For each row returned by the scan:
SELECT id, created_at, comment_vector FROM myapp.comments;

-- For each row, replace the <...> placeholders with actual column values:
UPDATE myapp.comments
SET comment_vector_v2 = <comment_vector>
WHERE id = <row_id> AND created_at = <row_created_at>;
```
Note

For large tables, use your application or a script to iterate over all rows and copy vectors in batches. To scan efficiently, implement a tablet-aware full table scan as described in Efficient Full Table Scans with ScyllaDB Tablets. To avoid impacting production queries, run the backfill under a dedicated database role with workload_type = 'batch' and lower SHARES using ScyllaDB’s Workload Prioritization feature.

4. Verify that the Vector Search instances have enough memory to hold both the existing index and the new one simultaneously. Use the memory estimation formula to calculate the required memory for both indexes. If the available memory is insufficient, resize the Vector Search deployment before proceeding.

5. Create the new index on the duplicate column with the desired configuration:

CREATE CUSTOM INDEX IF NOT EXISTS ann_idx_v2
ON myapp.comments(comment_vector_v2)
USING 'vector_index'
WITH OPTIONS = { 'similarity_function': 'DOT_PRODUCT' };

6. Wait for the index to finish building over the existing data.

7. Switch your application to query the new column. Update all ANN OF queries to use comment_vector_v2 instead of comment_vector:

SELECT id, commenter, comment
FROM myapp.comments
ORDER BY comment_vector_v2 ANN OF [0.12, 0.34, 0.56, 0.78, 0.91]
LIMIT 5;

8. Drop the old index and remove the original column once you have confirmed that the new index is serving queries correctly:

First, stop writing to the old column in your application code. Then drop the index and the column:
```
DROP INDEX IF EXISTS ann_idx;
ALTER TABLE myapp.comments DROP comment_vector;
```
9. Resize the Vector Search instances down if the removal of the old index freed enough memory. Check the current memory usage and reduce the instance size if there is sufficient free RAM.

Caution

During steps 2–6, both columns and both indexes exist simultaneously. Ensure that your Vector Search instances have enough memory to accommodate the additional index; otherwise the index build in step 5 may fail or degrade performance.
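The backfill step of the pre-2026.2 procedure can be scripted with the Python driver. The sketch below is a plain single-threaded scan, not the tablet-aware scan recommended for large tables, and it does not guard against a concurrent update to the same row between the read and the write. `backfill_vector_column` is a hypothetical helper; it assumes `session` is a connected driver session (as in the Python driver example) and that your driver version can bind a Python list to a vector column in a prepared statement.

```python
def backfill_vector_column(session) -> int:
    """Copy comment_vector into comment_vector_v2 for every row; return rows copied."""
    select_cql = "SELECT id, created_at, comment_vector FROM myapp.comments"
    update = session.prepare(
        "UPDATE myapp.comments SET comment_vector_v2 = ? "
        "WHERE id = ? AND created_at = ?"
    )
    copied = 0
    for row in session.execute(select_cql):
        # Rows without a vector have nothing to copy.
        if row.comment_vector is None:
            continue
        # Each update fully specifies the primary key (id, created_at).
        session.execute(update, (row.comment_vector, row.id, row.created_at))
        copied += 1
    return copied
```

Calling `backfill_vector_column(session)` once dual-writes are in place copies the vectors of rows written before the dual-write was enabled.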

CQL Features Not Supported with Vector Search¶

ANN OF is only supported in ORDER BY clauses.
The DISTINCT keyword in ANN OF queries is not supported.
Filtering on columns not in the primary key is not supported. See Filtering for supported filtering options.
The TOKEN function is not supported in vector queries.
The CONTAINS operator is not supported in vector queries.
The ALTER INDEX statement is not supported for vector indexes. You cannot modify index options after the index has been created. To change these settings, you must drop the existing index and recreate it with the updated configuration. See Altering a Vector Index for a zero-downtime migration procedure.
Time to Live (TTL) is not supported. This means that:
- Creating a vector index on a table with TTL set by default_time_to_live will be rejected.
- Changing TTL for a table with a vector index is ignored.
- Writes with TTL on a column with a vector index are ignored (TTL on other columns is accepted).
- Rows that already exist when the index build is scheduled are indexed, even if TTL is set on the column selected for indexing.
TRUNCATE TABLE is not supported. TRUNCATE does not generate CDC events, so the vector index is not updated — the table is emptied but the HNSW graph still contains all previous vectors, leading to stale or incorrect query results. Instead of truncating, drop and recreate both the table and the custom index.
Partition-level and range deletes on tables with clustering keys are not propagated to the vector index. Specifically:
- DELETE FROM t WHERE pk = ? (partition delete with no clustering key specified) is not reflected in the vector index.
- DELETE FROM t WHERE pk = ? AND ck > ? (range delete using an inequality operator on the clustering key) is not reflected in the vector index.
Only single-row deletes that fully specify the primary key (all partition key and clustering key columns) are propagated. Rows deleted with partition or range deletes are filtered out by ScyllaDB at query time, so they will not appear in ANN query results. However, because the vector index still contains entries for the deleted rows, they occupy candidate slots during the index search, which can cause ANN queries to return fewer results than the requested LIMIT. Additionally, the stale entries continue to consume memory on the vector search nodes.

To avoid this, always delete rows from tables with vector indexes using a fully specified primary key.
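Applications that build deletes dynamically can enforce this rule with a small guard. The sketch below does not parse CQL; it assumes you pass in the set of columns that the WHERE clause restricts by equality, together with the table's key columns. `is_propagated_delete` is a hypothetical helper that mirrors the rule stated above.

```python
def is_propagated_delete(equality_columns, partition_key, clustering_key) -> bool:
    """True if every primary key column is restricted by equality in the WHERE clause."""
    required = set(partition_key) | set(clustering_key)
    return required <= set(equality_columns)


# Key layout of the example table: PRIMARY KEY (id, created_at)
partition_key = ["id"]
clustering_key = ["created_at"]

# DELETE ... WHERE id = ? AND created_at = ?  (single-row delete)
print(is_propagated_delete({"id", "created_at"}, partition_key, clustering_key))

# DELETE ... WHERE id = ?  (partition delete)
print(is_propagated_delete({"id"}, partition_key, clustering_key))

# DELETE ... WHERE id = ? AND created_at > ?  (range delete: created_at is not an equality)
print(is_propagated_delete({"id"}, partition_key, clustering_key))
```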

What’s Next¶

Filtering Vector Search Results — combine similarity search with metadata constraints using global and local indexes.
Quantization and Rescoring — reduce index memory usage while maintaining search quality.
Vector Search Concepts — architecture overview and data flow.
Reference — CQL syntax reference and API endpoints.
