Quick Start Guide to Vector Search
This quickstart will help you get familiar with vector search in ScyllaDB. It provides a step-by-step example of setting up a new cluster with vector search enabled, creating a vector index, and running a basic similarity query.
See Vector Search Deployments for information on enabling vector search in existing clusters and for a list of deployment limitations in the beta release.
See Working with Vector Search for details of vector search–related CQL syntax and for a list of CQL limitations in the beta release.
Before You Start
Log in at https://cloud.scylladb.com/ or sign up if you don’t have an account.
Go to Settings > Personal Tokens, click Generate Token, and create your Personal Access Token.
Ensure you copy and save your token!
The token is required to use the ScyllaDB Cloud API. In this quickstart, we refer to it as YOUR_API_TOKEN.
Get your account ID (ACCOUNT_ID):
curl -X GET "https://api.cloud.scylladb.com/account/default" \
  -H "Authorization: Bearer YOUR_API_TOKEN"
Response example:
{
  "error": "",
  "data": {
    "accountId": 12345,
    "name": "my-account",
    "userId": "12345"
  }
}
Here 12345 is your ACCOUNT_ID.
Make sure to replace ACCOUNT_ID and YOUR_API_TOKEN with actual values in the examples below.
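As a minimal sketch, the account ID can also be read from the response programmatically. The Python snippet below assumes the response body has the shape shown in the response example; the helper name extract_account_id is hypothetical and is not part of the ScyllaDB Cloud API.

```python
import json


def extract_account_id(response_text: str) -> int:
    """Return data.accountId from an /account/default response body."""
    payload = json.loads(response_text)
    # Assumption: a non-empty "error" string signals a failed request,
    # mirroring the empty "error" field in the successful example.
    if payload.get("error"):
        raise RuntimeError(f"API returned an error: {payload['error']}")
    return int(payload["data"]["accountId"])


# Sample body copied from the response example in this guide.
sample = (
    '{ "error": "", "data": { "accountId": 12345, '
    '"name": "my-account", "userId": "12345" } }'
)
print(extract_account_id(sample))
```

The returned value is what the later examples call ACCOUNT_ID.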
Create a Cluster with Vector Search
To create a new cluster with vector search enabled:
Create a new cluster that includes the vectorSearch field in the API request body.
curl -X POST "https://api.cloud.scylladb.com/account/{ACCOUNT_ID}/cluster" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "clusterName": "my-vector-cluster",
    "cloudProviderId": 1,
    "regionId": 1,
    "scyllaVersion": "2025.4.0~rc0-0.20251001.6969918d3151",
    "numberOfNodes": 3,
    "instanceId": 62,
    "freeTier": true,
    "replicationFactor": 3,
    "vectorSearch": {
      "defaultNodes": 1,
      "defaultInstanceTypeId": 175
    }
  }'
This will deploy dedicated vector search nodes in the cluster.
Note: The example above uses ScyllaDB instance type 62, which is the Free Trial instance type i4i.large. The Free Trial instance types are:
62 - AWS i4i.large
63 - AWS i4i.xlarge
64 - AWS i4i.2xlarge
40 - GCP n2-highmem-2
41 - GCP n2-highmem-4
42 - GCP n2-highmem-8
To use these clusters as part of a Free Trial, also include "freeTier": true in the request body.
During the beta period, the supported instance types for vector search are:
175 - AWS t4g.small
176 - AWS t4g.medium
177 - AWS r7g.medium
178 - GCP e2-small
179 - GCP e2-medium
180 - GCP n4-highmem-2
Connect to the cluster with cqlsh.
Go to https://cloud.scylladb.com/, choose your cluster and go to the Connect tab.
Choose Cqlsh from the left menu and follow the instructions.
Your cluster is ready to work with vector search!
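The request body from the create-cluster step can also be assembled in code before it is sent. The sketch below is illustrative only: the function name build_cluster_request and its parameters are hypothetical, the field names and default values are copied from the curl example in this guide, and the set of allowed vector search instance type IDs reflects the beta list above and may change.

```python
import json

# Instance type IDs listed in this guide as supported for vector search
# during the beta period (an assumption that may not hold in later releases).
BETA_VECTOR_INSTANCE_TYPE_IDS = {175, 176, 177, 178, 179, 180}


def build_cluster_request(
    cluster_name: str,
    scylla_version: str,
    instance_id: int = 62,  # Free Trial i4i.large, per the note in this guide
    vector_instance_type_id: int = 175,  # t4g.small
    vector_nodes: int = 1,
    free_tier: bool = True,
) -> dict:
    """Build the JSON body for the create-cluster request (hypothetical helper)."""
    if vector_instance_type_id not in BETA_VECTOR_INSTANCE_TYPE_IDS:
        raise ValueError(
            f"unsupported vector search instance type: {vector_instance_type_id}"
        )
    return {
        "clusterName": cluster_name,
        "cloudProviderId": 1,
        "regionId": 1,
        "scyllaVersion": scylla_version,
        "numberOfNodes": 3,
        "instanceId": instance_id,
        "freeTier": free_tier,
        "replicationFactor": 3,
        # This field is what requests dedicated vector search nodes.
        "vectorSearch": {
            "defaultNodes": vector_nodes,
            "defaultInstanceTypeId": vector_instance_type_id,
        },
    }


body = build_cluster_request(
    "my-vector-cluster", "2025.4.0~rc0-0.20251001.6969918d3151"
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d, in place of the inline body shown earlier.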
Create a Vector Index
Create a new keyspace. The keyspace must have tablets disabled. Support for tablets will be added in future releases.
CREATE KEYSPACE myapp
WITH replication = { 'class': 'NetworkTopologyStrategy', 'replication_factor': 3 }
AND tablets = { 'enabled': false };
Create a table with a vector column.
CREATE TABLE IF NOT EXISTS myapp.comments (
    record_id timeuuid,
    id uuid,
    commenter text,
    comment text,
    comment_vector vector<float, 64>,
    created_at timestamp,
    PRIMARY KEY (id, created_at)
);
Insert example rows.
INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Alice', 'I like vector search in ScyllaDB.', [0.12,0.34,0.56,0.78,0.91,0.15,0.62,0.48,0.22,0.31,0.40,0.67,0.53,0.84,0.19,0.72,0.63,0.54,0.26,0.33,0.11,0.09,0.27,0.41,0.69,0.82,0.57,0.38,0.71,0.46,0.55,0.64,0.17,0.81,0.23,0.95,0.66,0.35,0.44,0.59,0.02,0.75,0.28,0.16,0.92,0.88,0.47,0.13,0.99,0.21,0.32,0.83,0.45,0.04,0.86,0.25,0.36,0.73,0.07,0.61,0.52,0.14,0.68,0.05], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Bob', 'I like ScyllaDB!', [0.11,0.35,0.55,0.77,0.92,0.14,0.61,0.47,0.23,0.32,0.41,0.66,0.52,0.83,0.18,0.73,0.64,0.53,0.27,0.34,0.12,0.10,0.26,0.42,0.70,0.81,0.56,0.39,0.70,0.47,0.54,0.65,0.16,0.80,0.22,0.94,0.67,0.34,0.43,0.58,0.03,0.74,0.29,0.17,0.91,0.87,0.46,0.12,0.98,0.20,0.31,0.84,0.44,0.05,0.85,0.24,0.35,0.72,0.06,0.60,0.51,0.13,0.67,0.06], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Charlie', 'Can somebody recommend a good restaurant in Paris?', [0.55,0.08,0.44,0.19,0.77,0.25,0.39,0.10,0.50,0.62,0.07,0.14,0.97,0.23,0.36,0.92,0.31,0.81,0.06,0.42,0.70,0.28,0.59,0.21,0.85,0.63,0.15,0.30,0.38,0.27,0.11,0.79,0.52,0.99,0.33,0.40,0.12,0.73,0.24,0.47,0.65,0.20,0.57,0.87,0.13,0.48,0.74,0.04,0.60,0.29,0.18,0.64,0.71,0.16,0.53,0.45,0.95,0.02,0.37,0.26,0.05,0.82,0.35,0.32], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Diana', 'Vector databases are the future', [0.12,0.33,0.57,0.79,0.90,0.16,0.63,0.49,0.21,0.30,0.39,0.68,0.54,0.85,0.20,0.71,0.62,0.55,0.25,0.32,0.10,0.08,0.28,0.40,0.68,0.83,0.58,0.37,0.72,0.45,0.56,0.63,0.18,0.82,0.24,0.96,0.65,0.36,0.45,0.60,0.01,0.76,0.27,0.15,0.93,0.89,0.48,0.14,1.00,0.22,0.33,0.82,0.46,0.03,0.87,0.26,0.37,0.74,0.08,0.62,0.53,0.13,0.69,0.04], 
toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Eve', 'Testing similarity search queries in ScyllaDB', [0.13,0.36,0.59,0.76,0.88,0.17,0.60,0.50,0.25,0.34,0.38,0.65,0.50,0.82,0.23,0.70,0.66,0.51,0.28,0.31,0.09,0.07,0.30,0.43,0.71,0.80,0.60,0.36,0.74,0.48,0.53,0.62,0.19,0.83,0.26,0.93,0.64,0.33,0.46,0.61,0.00,0.73,0.31,0.13,0.90,0.85,0.49,0.11,0.97,0.19,0.35,0.81,0.42,0.06,0.89,0.29,0.34,0.75,0.10,0.63,0.52,0.12,0.67,0.02], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Frank', 'Deep learning meets databases', [0.10,0.31,0.53,0.81,0.93,0.13,0.59,0.45,0.26,0.33,0.42,0.64,0.49,0.80,0.22,0.74,0.61,0.57,0.24,0.35,0.15,0.11,0.29,0.39,0.66,0.84,0.55,0.40,0.73,0.50,0.51,0.60,0.14,0.79,0.20,0.92,0.68,0.37,0.41,0.56,0.04,0.77,0.30,0.18,0.91,0.86,0.47,0.10,0.96,0.23,0.36,0.80,0.43,0.02,0.88,0.27,0.38,0.70,0.06,0.65,0.54,0.08,0.71,0.07], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Grace', 'ScyllaDB is highly performant', [0.11,0.84,0.29,0.71,0.17,0.94,0.62,0.53,0.43,0.25,0.96,0.38,0.18,0.82,0.45,0.01,0.75,0.19,0.30,0.58,0.12,0.68,0.92,0.15,0.26,0.20,0.44,0.32,0.89,0.16,0.64,0.54,0.79,0.27,0.36,0.21,0.09,0.50,0.23,0.88,0.39,0.33,0.06,0.70,0.31,0.07,0.80,0.13,0.24,0.52,0.46,0.85,0.60,0.08,0.48,0.22,0.14,0.42,0.10,0.34,0.28,0.02,0.41,0.63], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Heidi', 'Anyone here using ScyllaDB for AI applications?', 
[0.08,0.37,0.52,0.82,0.94,0.19,0.57,0.43,0.27,0.36,0.44,0.62,0.48,0.79,0.26,0.76,0.59,0.50,0.30,0.29,0.14,0.05,0.32,0.37,0.65,0.86,0.61,0.42,0.75,0.51,0.49,0.59,0.12,0.77,0.25,0.90,0.70,0.39,0.40,0.54,0.06,0.71,0.33,0.21,0.89,0.84,0.45,0.09,0.93,0.16,0.37,0.78,0.41,0.00,0.86,0.22,0.40,0.69,0.11,0.64,0.56,0.10,0.73,0.09], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Ivan', 'Looking forward to using vector capabilities.', [0.14,0.38,0.60,0.74,0.87,0.20,0.55,0.42,0.28,0.37,0.37,0.61,0.47,0.78,0.24,0.77,0.58,0.52,0.31,0.28,0.08,0.04,0.33,0.36,0.64,0.87,0.62,0.43,0.77,0.52,0.48,0.58,0.11,0.76,0.23,0.89,0.71,0.40,0.39,0.53,0.07,0.70,0.34,0.22,0.88,0.83,0.44,0.08,0.95,0.15,0.38,0.77,0.40,0.01,0.90,0.21,0.41,0.68,0.12,0.66,0.57,0.11,0.72,0.03], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Judy', 'I''m looking for a new job.', [0.24,0.85,0.03,0.64,0.91,0.12,0.48,0.26,0.39,0.77,0.58,0.43,0.15,0.08,0.72,0.05,0.68,0.36,0.95,0.22,0.31,0.14,0.66,0.11,0.19,0.29,0.93,0.47,0.30,0.80,0.25,0.84,0.54,0.62,0.37,0.28,0.56,0.46,0.33,0.99,0.02,0.18,0.40,0.63,0.21,0.50,0.59,0.35,0.32,0.09,0.06,0.27,0.75,0.44,0.81,0.42,0.17,0.20,0.73,0.07,0.55,0.60,0.16,0.13], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Karl', 'Indexes for high dimensional data are tricky', [0.65,0.24,0.19,0.77,0.05,0.31,0.49,0.09,0.36,0.58,0.20,0.86,0.27,0.40,0.73,0.04,0.80,0.12,0.93,0.25,0.46,0.38,0.70,0.13,0.60,0.52,0.16,0.81,0.29,0.17,0.41,0.88,0.07,0.63,0.50,0.28,0.96,0.21,0.11,0.83,0.03,0.44,0.35,0.15,0.68,0.22,0.95,0.54,0.08,0.72,0.47,0.26,0.33,0.32,0.85,0.10,0.42,0.06,0.59,0.84,0.18,0.48,0.30,0.14], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Laura', 
'Approximate nearest neighbor queries are useful', [0.07,0.33,0.46,0.82,0.25,0.94,0.10,0.38,0.59,0.11,0.84,0.41,0.19,0.69,0.05,0.30,0.17,0.74,0.23,0.45,0.09,0.36,0.62,0.14,0.28,0.49,0.01,0.93,0.20,0.12,0.72,0.54,0.40,0.80,0.08,0.29,0.99,0.43,0.32,0.86,0.02,0.67,0.18,0.26,0.55,0.21,0.63,0.47,0.06,0.71,0.42,0.15,0.50,0.27,0.95,0.04,0.60,0.39,0.31,0.57,0.16,0.22,0.53,0.35], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Mallory', 'Hello!', [0.84,0.11,0.57,0.34,0.22,0.75,0.63,0.41,0.15,0.20,0.92,0.36,0.07,0.60,0.48,0.23,0.71,0.27,0.39,0.29,0.51,0.08,0.77,0.17,0.42,0.68,0.10,0.31,0.40,0.95,0.28,0.56,0.32,0.66,0.04,0.30,0.13,0.45,0.89,0.38,0.19,0.54,0.14,0.79,0.35,0.47,0.25,0.09,0.61,0.44,0.12,0.81,0.33,0.50,0.21,0.18,0.65,0.26,0.05,0.87,0.24,0.37,0.46,0.02], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Niaj', 'Query optimization is important for large datasets', [0.23,0.47,0.15,0.60,0.31,0.02,0.53,0.79,0.41,0.25,0.14,0.89,0.09,0.50,0.07,0.33,0.94,0.12,0.65,0.46,0.19,0.35,0.08,0.42,0.22,0.37,0.05,0.83,0.20,0.49,0.11,0.68,0.24,0.18,0.77,0.55,0.04,0.30,0.16,0.61,0.40,0.71,0.26,0.39,0.13,0.98,0.32,0.09,0.58,0.27,0.91,0.36,0.21,0.06,0.75,0.44,0.10,0.63,0.28,0.38,0.17,0.56,0.03,0.52], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Olivia', 'Search quality improves with better vectors', [0.34,0.63,0.27,0.49,0.14,0.72,0.40,0.12,0.53,0.81,0.33,0.07,0.19,0.44,0.29,0.95,0.09,0.70,0.31,0.05,0.64,0.20,0.37,0.16,0.60,0.86,0.02,0.26,0.47,0.17,0.30,0.79,0.13,0.55,0.04,0.35,0.92,0.24,0.18,0.67,0.21,0.51,0.36,0.08,0.74,0.28,0.10,0.42,0.25,0.57,0.15,0.39,0.11,0.48,0.22,0.66,0.50,0.03,0.45,0.19,0.78,0.06,0.32,0.82], toTimestamp(now())); INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) 
VALUES (now(), uuid(), 'Peggy', 'Large vectors consume more space', [0.72,0.15,0.38,0.20,0.56,0.44,0.13,0.30,0.92,0.05,0.49,0.33,0.10,0.61,0.22,0.27,0.07,0.48,0.02,0.65,0.14,0.53,0.36,0.12,0.73,0.19,0.08,0.79,0.26,0.39,0.18,0.54,0.04,0.35,0.83,0.24,0.11,0.90,0.47,0.29,0.09,0.71,0.31,0.45,0.01,0.58,0.37,0.84,0.28,0.16,0.06,0.80,0.25,0.50,0.41,0.17,0.55,0.46,0.21,0.67,0.40,0.03,0.23,0.85], toTimestamp(now()));
INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Rupert', 'Dimensionality reduction is sometimes required', [0.61,0.08,0.26,0.75,0.13,0.32,0.42,0.05,0.97,0.28,0.46,0.36,0.09,0.20,0.84,0.11,0.63,0.14,0.38,0.22,0.10,0.59,0.17,0.41,0.06,0.69,0.18,0.35,0.07,0.44,0.25,0.80,0.16,0.53,0.21,0.82,0.02,0.57,0.23,0.30,0.29,0.93,0.12,0.66,0.03,0.48,0.27,0.19,0.15,0.71,0.34,0.40,0.24,0.98,0.37,0.31,0.55,0.45,0.50,0.74,0.33,0.04,0.39,0.99], toTimestamp(now()));
INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Sybil', 'My cat is aggressive. Can somebody help?', [0.49,0.12,0.66,0.25,0.05,0.59,0.21,0.18,0.74,0.04,0.38,0.45,0.15,0.50,0.19,0.27,0.93,0.09,0.56,0.14,0.44,0.17,0.31,0.08,0.29,0.84,0.20,0.40,0.07,0.33,0.96,0.23,0.11,0.73,0.32,0.54,0.13,0.46,0.30,0.22,0.10,0.42,0.26,0.39,0.02,0.63,0.36,0.28,0.16,0.68,0.01,0.48,0.24,0.35,0.52,0.03,0.06,0.65,0.09,0.41,0.53,0.37,0.60,0.88], toTimestamp(now()));
INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Trent', 'Similarity search powers recommendation engines', [0.28,0.64,0.09,0.46,0.12,0.38,0.57,0.07,0.33,0.71,0.18,0.49,0.24,0.62,0.04,0.26,0.95,0.15,0.35,0.11,0.29,0.40,0.08,0.55,0.20,0.36,0.05,0.83,0.19,0.50,0.23,0.72,0.30,0.16,0.45,0.39,0.10,0.58,0.27,0.31,0.17,0.93,0.25,0.54,0.03,0.41,0.22,0.13,0.14,0.79,0.42,0.34,0.21,0.88,0.47,0.32,0.44,0.06,0.63,0.48,0.02,0.85,0.37,0.99], toTimestamp(now()));
INSERT INTO myapp.comments (record_id, id, commenter, comment, comment_vector, created_at) VALUES (now(), uuid(), 'Victor', 'I''m hungry.', [0.06,0.43,0.22,0.71,0.19,0.09,0.33,0.12,0.74,0.29,0.17,0.47,0.20,0.60,0.25,0.30,0.98,0.15,0.32,0.14,0.40,0.27,0.13,0.35,0.23,0.46,0.11,0.84,0.21,0.50,0.28,0.62,0.31,0.16,0.44,0.36,0.10,0.57,0.26,0.39,0.18,0.95,0.24,0.53,0.03,0.42,0.08,0.34,0.09,0.78,0.41,0.38,0.05,0.80,0.48,0.33,0.45,0.07,0.65,0.52,0.02,0.82,0.37,0.99], toTimestamp(now()));
To enable approximate nearest neighbor (ANN) queries, create a vector index.
CREATE CUSTOM INDEX IF NOT EXISTS comment_ann_index
ON myapp.comments(comment_vector)
USING 'vector_index'
WITH OPTIONS = { 'similarity_function': 'COSINE' };
See Global Secondary Indexes - Vector Index in the ScyllaDB documentation for details.
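Every value written to comment_vector must have exactly 64 components, matching the vector<float, 64> column type in the table definition. The following sketch shows one way to check the dimension and render a CQL list literal like the ones in the INSERT statements above. The helper to_cql_vector_literal is hypothetical and only illustrates the check; it is not part of any ScyllaDB tool or driver.

```python
import math
from typing import Sequence

VECTOR_DIM = 64  # matches vector<float, 64> in myapp.comments


def to_cql_vector_literal(values: Sequence[float], dim: int = VECTOR_DIM) -> str:
    """Render values as a CQL list literal after validating them."""
    if len(values) != dim:
        raise ValueError(f"expected {dim} components, got {len(values)}")
    for v in values:
        # Reject NaN and infinity, which make no sense as embedding components.
        if not math.isfinite(v):
            raise ValueError(f"non-finite component: {v!r}")
    return "[" + ",".join(str(float(v)) for v in values) + "]"


# A placeholder embedding; real vectors would come from an embedding model.
literal = to_cql_vector_literal([0.5] * VECTOR_DIM)
print(literal[:20] + "...")
```

In application code, binding the vector as a query parameter through a driver is usually preferable to building CQL strings by hand.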
Run a Vector Search Query
Now you can run similarity queries.
In the following example, the query vector is identical to the one stored with Alice's comment, "I like vector search in ScyllaDB."
SELECT id, commenter, comment
FROM myapp.comments
ORDER BY comment_vector ANN OF [
0.12,0.34,0.56,0.78,0.91,0.15,0.62,0.48,0.22,0.31,
0.40,0.67,0.53,0.84,0.19,0.72,0.63,0.54,0.26,0.33,
0.11,0.09,0.27,0.41,0.69,0.82,0.57,0.38,0.71,0.46,
0.55,0.64,0.17,0.81,0.23,0.95,0.66,0.35,0.44,0.59,
0.02,0.75,0.28,0.16,0.92,0.88,0.47,0.13,0.99,0.21,
0.32,0.83,0.45,0.04,0.86,0.25,0.36,0.73,0.07,0.61,
0.52,0.14,0.68,0.05
]
LIMIT 3;
With LIMIT 3, the query returns up to three comments whose vectors are most similar to the query vector:
id (of Alice), commenter = Alice, comment = "I like vector search in ScyllaDB."
id (of Diana), commenter = Diana, comment = "Vector databases are the future"
id (of Bob), commenter = Bob, comment = "I like ScyllaDB!"
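To build intuition for this ranking, the sketch below computes exact cosine similarity and sorts rows by it. This is the brute-force ordering that an ANN query with the COSINE similarity function approximates, not how ScyllaDB executes the query. The three-dimensional vectors and the row names a, b, and c are made up for illustration, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], rows: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Exact nearest neighbors: score every row, keep the k best."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in rows.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy data, not the 64-dimensional vectors from this guide.
rows = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
}
for name, score in top_k([1.0, 0.0, 0.0], rows, k=2):
    print(name, round(score, 4))
```

A row whose vector equals the query vector scores 1.0, which is why Alice's comment comes first in the result above.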