
Working with Vector Search¶

This page provides a technical overview of how to work with Vector Search in ScyllaDB. It covers the vector data type, vector indexes, similarity functions, index tuning, ANN queries, and driver integration.

Workflow¶

  1. Create a keyspace.

  2. Create a table with a vector-typed column to store your embedding vectors.

  3. Insert vector data (embeddings) into the table.

  4. Create a vector index on the vector column to enable efficient similarity search.

  5. Perform similarity searches using the ANN OF query to find vectors most similar to your input.

Vector Data Type¶

The VECTOR data type allows you to store fixed-length numeric vectors as a native column type in ScyllaDB tables. These vectors can represent embedding vectors or other high-dimensional numeric data used for similarity search.

  • Syntax: vector<element_type, dimension> (e.g. vector<float, 768>)

  • Element types: Typically floating-point types (e.g., float).

  • Dimensions: Supports vectors with dimensionality ranging from 1 up to 16,000.

This vector data type integrates with ScyllaDB’s native protocol (v5) and is fully supported by the CQL interface.

See Data Types - Vectors in the ScyllaDB documentation for details.
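
As a rough illustration of the syntax and limits above, the following Python sketch builds a vector type string and rejects dimensions outside the documented range of 1 to 16,000. The helper name `vector_type` and the constant `MAX_VECTOR_DIMENSION` are hypothetical and are not part of any ScyllaDB driver API.

```python
MAX_VECTOR_DIMENSION = 16_000  # upper bound stated on this page


def vector_type(element_type: str, dimension: int) -> str:
    """Return a CQL type string such as 'vector<float, 768>'."""
    if not 1 <= dimension <= MAX_VECTOR_DIMENSION:
        raise ValueError(
            f"dimension must be between 1 and {MAX_VECTOR_DIMENSION}, got {dimension}"
        )
    return f"vector<{element_type}, {dimension}>"


# Type string for a 768-dimensional float embedding column.
column_type = vector_type("float", 768)
```

A check like this can catch a mismatched model dimension in application code before a schema statement is sent to the cluster.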

Table with Vector Column¶

You can store and query vectors in a table that contains a vector-typed column (e.g., vector<float, 768>).

In the following example, a comments table is created in the myapp keyspace. In addition to columns for storing and identifying comments (commenter name, comment text, comment ID, etc.), it has a comment_vector vector-typed column for storing vectors of float type.

Note

This example uses 5-dimensional vectors for clarity. In production, you will typically use higher dimensions (384-1536) to match your embedding model’s output.

CREATE TABLE IF NOT EXISTS myapp.comments (
  record_id timeuuid,
  id uuid,
  commenter text,
  comment text,
  comment_vector vector<float, 5>,
  created_at timestamp,
  PRIMARY KEY (id, created_at)
);

Tablets Requirement¶

Caution

Tables that include a vector-typed column must reside in a keyspace with tablets enabled. If you attempt to create a table with a vector column in a keyspace where tablets are disabled, ScyllaDB will return an error.

All ScyllaDB versions currently used by ScyllaDB Cloud enable tablets by default. This means every new keyspace automatically uses tablets-based data distribution, and no additional configuration is required:

CREATE KEYSPACE myapp
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'replication_factor': 3
};

However, if your cluster was created with an older ScyllaDB version where tablets are not enabled by default, you must explicitly enable tablets when creating any keyspace intended to store vector data:

CREATE KEYSPACE myapp
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'replication_factor': 3
}
AND tablets = {
   'enabled': true
};

Embeddings¶

Embeddings are fixed-length numeric vectors that represent data — such as text, images, or audio — in a high-dimensional space, capturing their semantic or structural meaning. They are typically generated by external machine learning or deep learning models trained for tasks like semantic search, recommendation, or classification.

The embedding pipeline works as follows:

  1. Your application sends raw data (text, image, etc.) to an embedding model (e.g., OpenAI, Cohere, or an open-source sentence-transformer).

  2. The model returns a fixed-length vector of floating-point numbers (e.g., 384 floats for all-MiniLM-L6-v2).

  3. Your application inserts the vector into a ScyllaDB table.

  4. At query time, your application embeds the query text using the same model and runs an ANN OF query against the stored vectors.

Caution

You must use the same embedding model for both indexing and querying. Vectors from different models live in incompatible vector spaces and cannot be meaningfully compared. If you change your embedding model, you must re-embed and re-index all your data.

ScyllaDB does not generate embeddings — it stores and indexes the vectors produced by your embedding pipeline. For background on how embeddings encode meaning, see How Embeddings Work in the Concepts page.
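
One way to enforce the "same model for indexing and querying" rule in application code is to route every embedding call through a single object that records the model name and dimension. The sketch below is only an illustration: the `Embedder` class and the `toy_embed` function are hypothetical, and `toy_embed` is a stand-in that counts vowels rather than a real embedding model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Embedder:
    """Pairs an embedding function with the model name and dimension it produces."""
    model_name: str
    dimension: int
    embed_fn: Callable[[str], List[float]]

    def embed(self, text: str) -> List[float]:
        vector = self.embed_fn(text)
        # Fail fast if the model output does not fit the table's vector column.
        if len(vector) != self.dimension:
            raise ValueError(
                f"{self.model_name} returned {len(vector)} values, "
                f"expected {self.dimension}"
            )
        return vector


def toy_embed(text: str) -> List[float]:
    """Stand-in for a real model: five vowel counts, NOT a semantic embedding."""
    lowered = text.lower()
    return [float(lowered.count(ch)) for ch in "aeiou"]


# The same Embedder instance is used at write time and at query time.
embedder = Embedder("toy-vowel-counter", 5, toy_embed)
stored_vector = embedder.embed("I like vector search in ScyllaDB.")
query_vector = embedder.embed("vector search")
```

In a real application, `embed_fn` would call your embedding provider, and `model_name` could be stored alongside the data so that a model change is detected before incompatible vectors are mixed.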

To insert an embedding vector into a ScyllaDB table, use a standard INSERT statement with a list of numeric values matching the vector column’s defined dimension and data type.

Example:

INSERT INTO myapp.comments (
    record_id, id, commenter, comment, comment_vector, created_at
) VALUES (
    now(), uuid(), 'Alice', 'I like vector search in ScyllaDB.',
    [0.12, 0.34, 0.56, 0.78, 0.91], toTimestamp(now())
);

  • The vector must match the dimension and element type declared in the table schema, e.g., vector<float, 5>.

  • All vector values must be numeric (e.g., float) and enclosed in square brackets.
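
The two rules above can be checked on the client before a write is attempted. The following Python sketch assumes hypothetical helper names (`validate_vector`, `to_cql_literal`); it is a client-side convenience and does not replace the validation ScyllaDB performs on the server.

```python
from typing import List, Sequence


def validate_vector(values: Sequence[float], dimension: int) -> List[float]:
    """Return the values as a list of floats, or raise if they do not fit the column."""
    if len(values) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(values)}")
    result = []
    for value in values:
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"non-numeric vector element: {value!r}")
        result.append(float(value))
    return result


def to_cql_literal(values: Sequence[float], dimension: int) -> str:
    """Render a validated vector in square brackets, e.g. [0.12, 0.34]."""
    return "[" + ", ".join(repr(v) for v in validate_vector(values, dimension)) + "]"


literal = to_cql_literal([0.12, 0.34, 0.56, 0.78, 0.91], 5)
```

When a driver is used, passing the validated list as a bound parameter is generally preferable to embedding the literal in the statement text; the literal form is mainly useful for cqlsh sessions and logging.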

Choosing an Embedding Model¶

When selecting an embedding model, consider the following trade-offs:

  • Dimensions — higher dimensions capture more nuance but use more memory and increase query latency. Common choices: 384 (lightweight), 768 (general-purpose), 1536 (high accuracy).

  • Model family — popular options include OpenAI embeddings (text-embedding-3-small, text-embedding-3-large), Cohere Embed, and open-source sentence-transformers (e.g., all-MiniLM-L6-v2).

  • Normalization — some models output unit-normalized vectors (suitable for cosine similarity), while others do not (use dot product instead).

ScyllaDB supports dimensions from 1 to 16,000, so it is compatible with all major embedding models.
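
If you are unsure whether your model emits unit-normalized vectors, you can check a sample in application code. The sketch below uses hypothetical helper names and an arbitrary tolerance of 1e-6; adjust the tolerance to your model's numeric precision.

```python
import math
from typing import List, Sequence


def l2_norm(vector: Sequence[float]) -> float:
    """Euclidean length of the vector."""
    return math.sqrt(sum(x * x for x in vector))


def is_unit_normalized(vector: Sequence[float], tolerance: float = 1e-6) -> bool:
    """True if the vector length is within `tolerance` of 1."""
    return abs(l2_norm(vector) - 1.0) <= tolerance


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale the vector to unit length."""
    norm = l2_norm(vector)
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


raw = [3.0, 4.0]           # length 5, so not normalized
unit = normalize(raw)      # approximately [0.6, 0.8]
```

Running `is_unit_normalized` over a sample of model outputs is usually enough to decide whether an explicit normalization step is needed in the pipeline.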

Vector Index Type¶

Before you query the data, you need to create a vector index to enable fast similarity search over vector columns. Without an index, a similarity query would need to compare the query vector against every stored vector — a brute-force scan that does not scale. The vector index pre-organizes vectors into a navigable graph for \(O(\log N)\) search. See Why You Need an Index for details.

This index type is based on the HNSW (Hierarchical Navigable Small World) algorithm and supports Approximate Nearest Neighbor (ANN) search with configurable similarity functions.

  • Creation: Use a custom index on a vector column.

  • Similarity functions supported: DOT_PRODUCT, COSINE (default), and EUCLIDEAN.

  • Index parameters: Tunable HNSW parameters such as m (maximum node connections), ef_construct (construction beam width), and ef_search (search beam width).

Example:

CREATE CUSTOM INDEX IF NOT EXISTS ann_idx
ON myapp.comments(comment_vector)
USING 'vector_index'
WITH OPTIONS = { 'similarity_function': 'DOT_PRODUCT' };

See Global Secondary Indexes - Vector Index in the ScyllaDB documentation for details.
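
To make the cost of searching without an index concrete, the following Python sketch performs the brute-force scan described at the start of this section: it scores every stored vector against the query and keeps the best k. It is an in-memory illustration only, with hypothetical names, and says nothing about how ScyllaDB implements its index. The work grows linearly with the number of rows, which is what the HNSW graph avoids.

```python
import heapq
from typing import Callable, Iterable, List, Sequence, Tuple

Vector = Sequence[float]


def dot_product(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def brute_force_top_k(
    query: Vector,
    rows: Iterable[Tuple[str, Vector]],
    k: int,
    similarity: Callable[[Vector, Vector], float] = dot_product,
) -> List[str]:
    """Score every row against the query and return the k best row ids."""
    scored = ((similarity(query, vector), row_id) for row_id, vector in rows)
    # nlargest keeps only k candidates in memory but still visits every row.
    return [row_id for _, row_id in heapq.nlargest(k, scored)]


rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.5, 0.5])]
nearest = brute_force_top_k([1.0, 0.0], rows, k=2)
```

An exact scan like this remains useful on a small sample of your data as a ground truth when evaluating the recall of ANN results.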

Choosing a Similarity Function¶

The similarity function determines how distances between vectors are measured during search. Choose based on your embedding model and use case:

| Function | When to use | Notes |
| --- | --- | --- |
| COSINE (default) | Normalized embeddings (unit vectors). Most text embedding models (OpenAI, sentence-transformers) output normalized vectors. | Measures the angle between vectors. Ideal when magnitude is irrelevant and only direction matters. |
| DOT_PRODUCT | Non-normalized embeddings, or when magnitude carries meaning (e.g., popularity-weighted vectors). | May be slightly faster than cosine because it avoids the normalization division. Requires careful handling if vectors have varying magnitudes. |
| EUCLIDEAN | Spatial data, geographic coordinates, or when absolute distance matters. | Measures straight-line distance in vector space. Less common for text embeddings but useful for spatial applications. |
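
The three measures can be written out in a few lines of Python to show how they differ. This is a sketch of the standard mathematical definitions with hypothetical function names; it is not a description of how ScyllaDB computes or scales scores internally.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dot_product(a: Vector, b: Vector) -> float:
    """Sum of element-wise products; sensitive to vector magnitude."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Dot product divided by both lengths; depends only on direction."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [0.6, 0.8]          # unit length
b = [0.8, 0.6]          # unit length
scaled = [6.0, 8.0]     # same direction as a, ten times longer
```

With these inputs, `cosine_similarity(a, scaled)` is approximately 1 because the direction is identical, while `dot_product(a, scaled)` is approximately 10 because magnitude counts. For the two unit vectors `a` and `b`, cosine similarity and dot product agree to within rounding error.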

Tuning the Vector Index¶

The HNSW index has tunable parameters that affect the trade-off between recall (search accuracy), build speed, and query latency:

| Parameter | Default | Description |
| --- | --- | --- |
| maximum_node_connections (m) | 16 | Maximum number of connections per node in the HNSW graph. Higher values improve recall but increase memory usage and index build time. |
| construction_beam_width (ef_construct) | 128 | Size of the dynamic candidate list during index construction. Higher values yield a higher-quality graph at the cost of slower builds. |
| search_beam_width (ef_search) | 128 | Size of the dynamic candidate list during query time. Higher values improve recall at the cost of higher query latency. |

Example with tuned parameters:

CREATE CUSTOM INDEX IF NOT EXISTS tuned_ann_idx
ON myapp.comments(comment_vector)
USING 'vector_index'
WITH OPTIONS = {
  'similarity_function': 'COSINE',
  'maximum_node_connections': '32',
  'construction_beam_width': '200',
  'search_beam_width': '200'
};

General guidance:

  • Start with defaults. Increase search_beam_width if recall is too low.

  • Increase maximum_node_connections for high-dimensional vectors (>512 dimensions).

  • Remember: you cannot alter index options after creation. Drop and recreate the index to change parameters.
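
Deciding whether recall is "too low" requires measuring it. A common approach, sketched below in Python, is to compare the ids returned by an ANN query with the ids from an exact search over the same data (for example, a brute-force scan of a sample). The function name `recall_at_k` is hypothetical, and how you obtain the exact results is left to your test harness.

```python
from typing import Hashable, Sequence


def recall_at_k(
    ann_ids: Sequence[Hashable],
    exact_ids: Sequence[Hashable],
    k: int,
) -> float:
    """Fraction of the exact top-k results that also appear in the ANN top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    expected = set(exact_ids[:k])
    if not expected:
        raise ValueError("exact_ids must contain at least one result")
    found = set(ann_ids[:k]) & expected
    return len(found) / len(expected)


# The ANN query missed one of the three true nearest neighbors.
recall = recall_at_k(["a", "b", "x"], ["a", "b", "c"], k=3)
```

Averaging this value over a representative set of query vectors gives a baseline to compare against after an index is dropped and recreated with different parameters.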

ANN OF Queries¶

Approximate Nearest Neighbor (ANN) is a search technique used to find data points in large, high-dimensional datasets that are most similar to a given query vector. Rather than computing exact distances for all entries, ANN algorithms trade off a small amount of accuracy for significant speed improvements, returning results that are sufficiently similar. This makes ANN especially effective for applications like semantic search, recommendations, image and audio retrieval, and generative AI, where real-time response and scalability are critical.

Once a vector index is created on a VECTOR-typed column, you can use the ANN OF query to perform ANN searches. This query allows you to efficiently retrieve the top-k rows with vectors most similar to a given input vector, using the similarity function defined while creating the vector index.

Syntax:

SELECT column1, column2, ...
FROM keyspace.table
ORDER BY vector_column ANN OF [v1, v2, ..., vn]
LIMIT k;

  • vector_column: The name of the indexed vector column used for similarity search.

  • [v1, …, vn]: The input query vector. It must match the dimensionality of the indexed column.

  • k: The number of nearest neighbors to return (required).

The query returns up to k most similar vectors, ranked according to the similarity function defined in the index (COSINE, DOT_PRODUCT, or EUCLIDEAN).

Example:

SELECT id, commenter, comment, created_at
FROM myapp.comments
ORDER BY comment_vector ANN OF [0.12, 0.34, 0.56, 0.78, 0.91]
LIMIT 5;

See Data Manipulation - SELECT - Vector Queries in the ScyllaDB documentation for details.
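
When ANN queries are assembled in application code, a small builder can enforce the rules from the syntax description: a required positive LIMIT and a single indexed vector column in ORDER BY. The Python sketch below is an illustration with a hypothetical function name. It emits a `?` bind marker for use with a prepared statement, and its identifier check is a deliberately conservative assumption of this sketch rather than the full CQL identifier grammar.

```python
import re
from typing import Sequence

# Conservative check for unquoted identifiers (an assumption of this sketch).
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def build_ann_query(
    keyspace: str,
    table: str,
    columns: Sequence[str],
    vector_column: str,
    k: int,
) -> str:
    """Return an ANN OF statement with a bind marker for the query vector."""
    if not columns:
        raise ValueError("at least one column must be selected")
    for name in (keyspace, table, vector_column, *columns):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    if isinstance(k, bool) or not isinstance(k, int) or k < 1:
        raise ValueError("k must be a positive integer")
    return (
        f"SELECT {', '.join(columns)} "
        f"FROM {keyspace}.{table} "
        f"ORDER BY {vector_column} ANN OF ? "
        f"LIMIT {k}"
    )


query = build_ann_query(
    "myapp", "comments", ["id", "commenter", "comment"], "comment_vector", 5
)
```

The resulting string can be prepared once and executed repeatedly with different query vectors bound to the marker.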

Write-to-Query Latency¶

After inserting or updating a vector, there is a short delay before the new data becomes available in similarity search results. ScyllaDB uses a dual CDC (Change Data Capture) reader system to propagate changes to the vector index:

  • A fine-grained reader with sub-second intervals provides low-latency updates (typical p50 latency under 1 second).

  • A wide-framed reader with a 30-second safety interval ensures consistency and catches any data missed by the fast reader.

For most workloads, newly inserted vectors are queryable within approximately 1 second.
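
Tests and ingestion scripts that read back a vector immediately after writing it may need to tolerate this delay. The following Python sketch polls a caller-supplied search function until the expected row id appears or a timeout elapses. All names are hypothetical, and the default timeout of 35 seconds is an assumption chosen to exceed the 30-second safety interval described above.

```python
import time
from typing import Callable, Hashable, Iterable


def wait_until_visible(
    search: Callable[[], Iterable[Hashable]],
    expected_id: Hashable,
    timeout_s: float = 35.0,
    interval_s: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `search` until `expected_id` is among its results or time runs out."""
    deadline = clock() + timeout_s
    while True:
        if expected_id in search():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Simulated search: the row becomes visible on the third attempt.
attempts = {"count": 0}


def fake_search():
    attempts["count"] += 1
    return ["row-1"] if attempts["count"] >= 3 else []


visible = wait_until_visible(fake_search, "row-1", sleep=lambda seconds: None)
```

In practice, `search` would run an ANN OF query using the vector that was just written and return the ids of the rows found. Because ANN results are approximate, querying with the written vector itself and a small LIMIT is the most direct way to check for its presence.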

Vector Search in ScyllaDB Drivers¶

If you use a ScyllaDB driver for application development and want to use the Vector Search feature, note that:

  • Your driver version must support the vector data type. See ScyllaDB Drivers - Support for Vector Search to check from which version the vector type is supported in each driver.

  • Vector search requires the driver to be configured with a DC-aware load balancing policy.

Driver Examples¶

The following examples demonstrate connecting to a ScyllaDB Cloud cluster, inserting a vector, and running a similarity query in different programming languages.

Python: uses the scylla-driver package.

import ssl
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra.policies import DCAwareRoundRobinPolicy

auth = PlainTextAuthProvider(username='scylla', password='YOUR_PASSWORD')
ssl_context = ssl.create_default_context()
cluster = Cluster(
    contact_points=['node-0.your-cluster.cloud.scylladb.com'],
    port=9042,
    auth_provider=auth,
    load_balancing_policy=DCAwareRoundRobinPolicy(local_dc='AWS_US_EAST_1'),
    ssl_context=ssl_context,
)
session = cluster.connect('myapp')

# Insert a vector
session.execute(
    """INSERT INTO comments (record_id, id, commenter, comment, comment_vector, created_at)
       VALUES (now(), uuid(), %s, %s, %s, toTimestamp(now()))""",
    ('Alice', 'I like vector search in ScyllaDB.', [0.12, 0.34, 0.56, 0.78, 0.91])
)

# Run a similarity search
rows = session.execute(
    """SELECT commenter, comment FROM comments
       ORDER BY comment_vector ANN OF %s LIMIT 3""",
    ([0.12, 0.34, 0.56, 0.78, 0.91],)
)
for row in rows:
    print(f"{row.commenter}: {row.comment}")

Node.js: uses the cassandra-driver package (compatible with ScyllaDB).

const cassandra = require('cassandra-driver');

const client = new cassandra.Client({
  contactPoints: ['node-0.your-cluster.cloud.scylladb.com'],
  localDataCenter: 'AWS_US_EAST_1',
  keyspace: 'myapp',
  authProvider: new cassandra.auth.PlainTextAuthProvider(
    'scylla', 'YOUR_PASSWORD'
  ),
  sslOptions: { rejectUnauthorized: true },
});

async function main() {
  await client.connect();

  // Insert a vector
  await client.execute(
    `INSERT INTO comments (record_id, id, commenter, comment, comment_vector, created_at)
     VALUES (now(), uuid(), ?, ?, ?, toTimestamp(now()))`,
    ['Alice', 'I like vector search in ScyllaDB.', [0.12, 0.34, 0.56, 0.78, 0.91]],
    { prepare: true }
  );

  // Run a similarity search
  const result = await client.execute(
    `SELECT commenter, comment FROM comments
     ORDER BY comment_vector ANN OF ? LIMIT 3`,
    [[0.12, 0.34, 0.56, 0.78, 0.91]],
    { prepare: true }
  );
  for (const row of result.rows) {
    console.log(`${row.commenter}: ${row.comment}`);
  }

  await client.shutdown();
}

main().catch(console.error);

Java: uses the scylla-java-driver.

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;
import com.datastax.oss.driver.api.core.data.CqlVector;
import java.net.InetSocketAddress;

public class VectorSearchExample {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress(
                    "node-0.your-cluster.cloud.scylladb.com", 9042))
                .withLocalDatacenter("AWS_US_EAST_1")
                .withKeyspace("myapp")
                .withAuthCredentials("scylla", "YOUR_PASSWORD")
                .build()) {

            CqlVector<Float> vector = CqlVector.newInstance(
                0.12f, 0.34f, 0.56f, 0.78f, 0.91f);

            // Insert a vector
            session.execute(session.prepare(
                "INSERT INTO comments (record_id, id, commenter, comment, "
                + "comment_vector, created_at) "
                + "VALUES (now(), uuid(), ?, ?, ?, toTimestamp(now()))")
                .bind("Alice", "I like vector search in ScyllaDB.", vector));

            // Run a similarity search
            ResultSet rs = session.execute(session.prepare(
                "SELECT commenter, comment FROM comments "
                + "ORDER BY comment_vector ANN OF ? LIMIT 3")
                .bind(vector));
            for (Row row : rs) {
                System.out.printf("%s: %s%n",
                    row.getString("commenter"), row.getString("comment"));
            }
        }
    }
}

Go: uses gocql.

package main

import (
    "fmt"
    "github.com/gocql/gocql"
)

func main() {
    cluster := gocql.NewCluster("node-0.your-cluster.cloud.scylladb.com")
    cluster.Keyspace = "myapp"
    cluster.Authenticator = gocql.PasswordAuthenticator{
        Username: "scylla",
        Password: "YOUR_PASSWORD",
    }
    cluster.SslOpts = &gocql.SslOptions{
        EnableHostVerification: true,
    }
    cluster.PoolConfig.HostSelectionPolicy = gocql.DCAwareRoundRobinPolicy("AWS_US_EAST_1")

    session, err := cluster.CreateSession()
    if err != nil {
        panic(err)
    }
    defer session.Close()

    vector := []float32{0.12, 0.34, 0.56, 0.78, 0.91}

    // Insert a vector
    err = session.Query(
        `INSERT INTO comments (record_id, id, commenter, comment, comment_vector, created_at)
         VALUES (now(), uuid(), ?, ?, ?, toTimestamp(now()))`,
        "Alice", "I like vector search in ScyllaDB.", vector,
    ).Exec()
    if err != nil {
        panic(err)
    }

    // Run a similarity search
    iter := session.Query(
        `SELECT commenter, comment FROM comments
         ORDER BY comment_vector ANN OF ? LIMIT 3`, vector,
    ).Iter()

    var commenter, comment string
    for iter.Scan(&commenter, &comment) {
        fmt.Printf("%s: %s\n", commenter, comment)
    }
    if err := iter.Close(); err != nil {
        panic(err)
    }
}

CQL Features Not Supported with Vector Search¶

  • ANN OF is only supported in ORDER BY clauses.

  • The DISTINCT keyword in ANN OF queries is not supported.

  • Filtering on columns not in the primary key is not supported. See Filtering for supported filtering options.

  • The TOKEN function is not supported in vector queries.

  • The CONTAINS operator is not supported in vector queries.

  • The ALTER INDEX statement is not supported for vector indexes. You cannot modify index options after the index has been created. To change these settings, you must drop the existing index and recreate it with the updated configuration.

  • Time to Live (TTL) is not supported. This means that:

    • Creating a vector index on a table with TTL set by default_time_to_live will be rejected.

    • Changing TTL for a table with a vector index is ignored.

    • Writes with TTL on a column with a vector index are ignored (TTL on other columns is accepted).

    • Rows that already exist when the index build is scheduled are indexed, even if TTL is set on the column selected for indexing.

What’s Next¶

  • Filtering Vector Search Results — combine similarity search with metadata constraints using global and local indexes.

  • Quantization and Rescoring — reduce index memory usage while maintaining search quality.

  • Vector Search Concepts — architecture overview and data flow.

  • Reference — CQL syntax reference and API endpoints.

© 2026, ScyllaDB. All rights reserved. | Terms of Service | Privacy Policy | ScyllaDB, and ScyllaDB Cloud, are registered trademarks of ScyllaDB, Inc.
Last updated on 26 Mar 2026.