
Quick Start Guide to Vector Search

This quickstart will help you get familiar with Vector Search in ScyllaDB. It provides a step-by-step example of setting up a new cluster with vector search enabled, creating a vector index, and running a basic similarity query.

  • See Vector Search Deployments for information on enabling Vector Search in existing clusters and for a list of deployment limitations.

  • See Working with Vector Search for details of Vector Search-related CQL syntax.

Prerequisites

  • A ScyllaDB Cloud account. Sign up at cloud.scylladb.com if you don’t have one. You can use a free trial cluster to try Vector Search at no cost. Free trial clusters are limited to the smallest instance size (t4g.medium on AWS, e2-medium on GCP).

  • cqlsh installed on your machine (or use the web-based CQL console available in the ScyllaDB Cloud UI).

  • For real workloads, an embedding model (e.g., OpenAI, Cohere, or an open-source sentence-transformer) to generate vectors from your data. This quickstart uses hand-crafted vectors for simplicity.

Create a Cluster with Vector Search

Create a new cluster with Vector Search enabled by following the steps in Creating a New Cluster with Vector Search Enabled. When your cluster is deployed, go to the Connect tab, choose Cqlsh from the left menu, and follow the instructions to connect.

Create a Vector Index

  1. Create a new keyspace.

    CREATE KEYSPACE myapp;
    
  2. Create a table with a vector column.

    Note

    This example uses 5-dimensional vectors for clarity. In production, you will typically use higher dimensions (384-1536) to match your embedding model’s output.

    CREATE TABLE IF NOT EXISTS myapp.comments (
      record_id timeuuid,
      id uuid,
      commenter text,
      comment text,
      comment_vector vector<float, 5>,
      created_at timestamp,
      PRIMARY KEY (id, created_at)
    );
    
  3. Insert example rows.

    INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at)
      VALUES (now(), uuid(), 'Alice', 'I like vector search in ScyllaDB.',
              [0.12, 0.34, 0.56, 0.78, 0.91], toTimestamp(now()));
    INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at)
      VALUES (now(), uuid(), 'Bob', 'I like ScyllaDB!',
              [0.11, 0.35, 0.55, 0.77, 0.92], toTimestamp(now()));
    INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at)
      VALUES (now(), uuid(), 'Charlie', 'Can somebody recommend a good restaurant in Paris?',
              [0.55, 0.08, 0.44, 0.19, 0.77], toTimestamp(now()));
    INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at)
      VALUES (now(), uuid(), 'Diana', 'Vector databases are the future',
              [0.12, 0.33, 0.57, 0.79, 0.90], toTimestamp(now()));
    INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at)
      VALUES (now(), uuid(), 'Eve', 'Testing similarity search queries in ScyllaDB',
              [0.13, 0.36, 0.59, 0.76, 0.88], toTimestamp(now()));
    
  4. To enable approximate nearest neighbor (ANN) queries, create a vector index.

    CREATE CUSTOM INDEX IF NOT EXISTS comment_ann_index
    ON myapp.comments(comment_vector)
    USING 'vector_index'
    WITH OPTIONS = {
      'similarity_function': 'COSINE'
    };
    

See Global Secondary Indexes - Vector Index in the ScyllaDB documentation for details.
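
The vector column above is declared with a fixed dimension of 5, so every inserted vector must have exactly five components. When vectors come from an embedding model instead of being typed by hand, a small client-side check can catch a mismatch before the statement is sent. The following is a minimal sketch, not part of ScyllaDB or any driver: the helper name `to_cql_vector` and its `dim` parameter are hypothetical and chosen only for illustration.

```python
import math


def to_cql_vector(values, dim=5):
    """Format numbers as a CQL vector literal, e.g. [0.1, 0.2, ...].

    Hypothetical helper: `dim` must match the dimension declared in the
    table definition (5 in this quickstart).
    """
    if len(values) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(values)}")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("vector components must be finite numbers")
    return "[" + ", ".join(repr(float(v)) for v in values) + "]"


alice = [0.12, 0.34, 0.56, 0.78, 0.91]
print(to_cql_vector(alice))
```

The printed literal can be pasted into an INSERT statement in cqlsh. In application code, binding the vector as a statement parameter through your driver is usually preferable to building CQL text by hand.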

Run a Vector Search Query

Now you can run similarity queries.

In the following example, the query vector is identical to Alice’s comment vector: [0.12, 0.34, 0.56, 0.78, 0.91].

SELECT commenter, comment
FROM myapp.comments
ORDER BY comment_vector ANN OF [0.12, 0.34, 0.56, 0.78, 0.91]
LIMIT 3;

With LIMIT 3, the query returns up to three comments whose vectors are most similar to the query vector:

Alice  | I like vector search in ScyllaDB.
Diana  | Vector databases are the future
Bob    | I like ScyllaDB!

Because the query vector is identical to Alice’s, her comment appears first. Diana’s and Bob’s comments rank next because their vectors are numerically closest (highest cosine similarity) to the query vector.
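
To see why the results come back in this order, you can recompute the ranking outside the database. The sketch below calculates exact cosine similarity in plain Python for the five example vectors and sorts them. It is only a reference calculation under the assumption that the data set is as small as this one: the ANN index is approximate, so on larger data sets its results may differ from an exhaustive comparison.

```python
import math


def cosine_similarity(a, b):
    """Exact cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# The vectors inserted earlier in this quickstart, keyed by commenter.
comments = {
    "Alice": [0.12, 0.34, 0.56, 0.78, 0.91],
    "Bob": [0.11, 0.35, 0.55, 0.77, 0.92],
    "Charlie": [0.55, 0.08, 0.44, 0.19, 0.77],
    "Diana": [0.12, 0.33, 0.57, 0.79, 0.90],
    "Eve": [0.13, 0.36, 0.59, 0.76, 0.88],
}
query = [0.12, 0.34, 0.56, 0.78, 0.91]

# Sort commenters from most to least similar and keep the top three.
ranked = sorted(
    comments,
    key=lambda name: cosine_similarity(query, comments[name]),
    reverse=True,
)
top3 = ranked[:3]
print(top3)  # ['Alice', 'Diana', 'Bob']
```

Charlie's vector points in a noticeably different direction from the query, which is why the restaurant comment ranks last and is not returned.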

Retrieve Similarity Scores

To include similarity scores in your results, call the similarity function that matches your index’s distance metric. Since the index above uses COSINE, use similarity_cosine:

SELECT commenter, comment,
       similarity_cosine(comment_vector, [0.12, 0.34, 0.56, 0.78, 0.91])
       AS similarity
FROM myapp.comments
ORDER BY comment_vector ANN OF [0.12, 0.34, 0.56, 0.78, 0.91]
LIMIT 3;

The three available functions are similarity_cosine, similarity_dot_product, and similarity_euclidean. Each returns a float in [0, 1], where values closer to 1 indicate greater similarity.

See Vector Similarity Functions for details.
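
As a rough illustration of what the three metrics measure, the sketch below computes raw cosine similarity, dot product, and Euclidean distance in plain Python for two of the example vectors. These are the textbook formulas, not the CQL functions themselves: how ScyllaDB rescales each metric into a similarity score is not reproduced here, so the printed numbers are not expected to match the `similarity` column returned by the query.

```python
import math


def dot_product(a, b):
    """Sum of component-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Dot product normalized by both vector lengths (angle only)."""
    return dot_product(a, b) / math.sqrt(dot_product(a, a) * dot_product(b, b))


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [0.12, 0.34, 0.56, 0.78, 0.91]
bob = [0.11, 0.35, 0.55, 0.77, 0.92]
charlie = [0.55, 0.08, 0.44, 0.19, 0.77]

for name, vec in (("Bob", bob), ("Charlie", charlie)):
    print(
        f"{name}: cosine={cosine(query, vec):.4f} "
        f"dot={dot_product(query, vec):.4f} "
        f"euclidean_distance={euclidean_distance(query, vec):.4f}"
    )
```

All three metrics agree that Bob's vector is closer to the query than Charlie's: its cosine and dot product are higher and its Euclidean distance is lower.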

What’s Next

  • Working with Vector Search — learn about the vector data type, index options, and ANN query syntax.

  • Filtering Vector Search Results — combine similarity search with metadata constraints.

  • Quantization and Rescoring — reduce index memory usage.

  • Vector Search Deployments — enable, resize, or disable Vector Search on your cluster.

© 2026, ScyllaDB. All rights reserved. | Terms of Service | Privacy Policy | ScyllaDB, and ScyllaDB Cloud, are registered trademarks of ScyllaDB, Inc.
Last updated on 29 Jun 2026.