VectorStore

The VectorStore API stores, searches, and manages high-dimensional vector embeddings together with their associated metadata. It provides semantic search and similarity matching for AI applications.

Operations Overview

Endpoint   | Description
vdb.insert | Insert vectors and metadata into collections
vdb.search | Perform semantic similarity search across vectors
vdb.delete | Delete vector entries by keys

1. Insert Vectors

POST vdb.insert

Insert content into the vector database with automatic embedding generation and intelligent chunking.

Request Body

{
"op": "vdb.insert",
"collection": "knowledge_base",
"data": [
{
"content": "Artificial intelligence is transforming how businesses operate by automating complex processes and providing intelligent insights from data.",
"content_type": "text",
"filename": "ai-overview.txt",
"metadata": {
"document_id": "doc_001",
"source": "company_wiki",
"tags": ["AI", "automation", "business"]
}
},
{
"content": "# Machine Learning Best Practices\n\n## Data Preparation\nClean and preprocess your data before training models...",
"content_type": "markdown",
"filename": "ml-guide.md",
"metadata": {
"document_id": "doc_002",
"source": "documentation",
"tags": ["ML", "guide", "best-practices"]
}
}
],
"options": {
"chunk_size": 500,
"chunk_overlap": 50
}
}

Parameters

  • op (string, required): Must be "vdb.insert"
  • collection (string, required): Target collection name for vector storage
  • data (array, required): Array of content objects to embed and store
  • options (object, optional): Processing configuration

Data Object Structure

  • content (string, required): Text content to embed and store
  • content_type (string, optional): Content format - "text", "json", "markdown", or "html" (default: "text")
  • filename (string, optional): Associated filename for filtering and reference
  • metadata (object, optional): Custom metadata for filtering and organization
    • document_id (string): Custom document identifier
    • source (string): Source system or origin
    • tags (array): Array of tags for categorization

Options

  • chunk_size (integer, optional): Maximum characters per chunk (default: 500)
  • chunk_overlap (integer, optional): Character overlap between chunks (default: 40)

Response

{
"data": [
{
"_key": "vec_abc123xyz789",
"content": "Artificial intelligence is transforming how businesses operate...",
"is_chunked": false,
"metadata": {
"document_id": "doc_001",
"source": "company_wiki",
"tags": ["AI", "automation", "business"]
}
},
{
"_key": "vec_def456uvw012",
"content": "# Machine Learning Best Practices\n\n## Data Preparation...",
"is_chunked": true,
"metadata": {
"document_id": "doc_002",
"source": "documentation",
"tags": ["ML", "guide", "best-practices"]
}
}
]
}
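
Because deletion works on `_key` values, it can help to record the keys returned by an insert. The sketch below groups them by `metadata.document_id`; `keysByDocument` is a hypothetical helper, and it assumes the response shape shown above. Whether a chunked document returns one entry per chunk is not specified here, so the helper simply collects every entry it sees.

```javascript
// Groups the _key values of an insert response by metadata.document_id.
function keysByDocument(insertResponse) {
  const groups = new Map();
  for (const entry of insertResponse.data ?? []) {
    // Entries without a document_id are collected under a placeholder label.
    const docId = entry.metadata?.document_id ?? "(no document_id)";
    if (!groups.has(docId)) groups.set(docId, []);
    groups.get(docId).push(entry._key);
  }
  return groups;
}
```

The resulting map can be stored alongside your own document records and later passed to vdb.delete as `_keys`.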

Use Cases

Knowledge Base Creation:

{
"op": "vdb.insert",
"collection": "company_docs",
"data": [
{
"content": "Employee handbook section on remote work policies...",
"content_type": "text",
"metadata": {
"document_id": "handbook_remote",
"source": "hr_system",
"tags": ["HR", "remote-work", "policies"]
}
}
]
}

Product Documentation:

{
"op": "vdb.insert",
"collection": "api_docs",
"data": [
{
"content": "## Authentication\n\nAll API requests require...",
"content_type": "markdown",
"filename": "auth-guide.md",
"metadata": {
"document_id": "auth_docs",
"source": "documentation_site",
"tags": ["API", "authentication", "security"]
}
}
],
"options": {
"chunk_size": 300,
"chunk_overlap": 30
}
}

Customer Support Articles:

{
"op": "vdb.insert",
"collection": "support_kb",
"data": [
{
"content": "How to reset your password: Navigate to login page...",
"content_type": "html",
"metadata": {
"document_id": "password_reset",
"source": "support_portal",
"tags": ["troubleshooting", "account", "password"]
}
}
]
}

2. Search Vectors

POST vdb.search

Perform semantic similarity search to find relevant content based on natural language queries.

Request Body

{
"op": "vdb.search",
"query": "How does machine learning improve business processes?",
"collection": "knowledge_base",
"filename": "ml-guide.md"
}

Parameters

  • op (string, required): Must be "vdb.search"
  • query (string, required): Natural language search query
  • collection (string, optional): Specific collection to search within
  • filename (string, optional): Filter results by specific filename

Response

{
"data": [
{
"_key": "vec_abc123xyz789",
"content": "Machine learning algorithms can automate complex business processes by analyzing patterns in data and making intelligent predictions about outcomes.",
"is_chunked": false,
"similarity_score": 0.92,
"metadata": {
"document_id": "doc_002",
"source": "documentation",
"tags": ["ML", "automation", "business"],
"filename": "ml-guide.md"
}
},
{
"_key": "vec_def456uvw012",
"content": "Artificial intelligence transforms business operations through intelligent automation, reducing manual effort while improving accuracy and speed.",
"is_chunked": true,
"similarity_score": 0.87,
"metadata": {
"document_id": "doc_001",
"source": "company_wiki",
"tags": ["AI", "automation", "business"],
"filename": "ai-overview.txt"
}
}
],
"meta": {
"query": "How does machine learning improve business processes?",
"total_results": 2,
"search_time_ms": 45
}
}

Response Fields

  • _key: Unique vector identifier
  • content: Original content that matched the query
  • is_chunked: Whether content was split into chunks during insertion
  • similarity_score: Cosine similarity score (0.0 to 1.0, higher is more similar)
  • metadata: Associated metadata including tags, source, and document information
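
Since `similarity_score` is higher for closer matches, a client can discard weak results before using them. This is a minimal sketch; `topMatches` is a hypothetical helper, and the threshold of 0.8 and limit of 3 are illustrative values, not recommendations from the API.

```javascript
// Keeps results at or above a minimum similarity_score, best match first.
function topMatches(searchResponse, minScore = 0.8, limit = 3) {
  return (searchResponse.data ?? [])
    .filter((result) => result.similarity_score >= minScore) // filter() returns a new array
    .sort((a, b) => b.similarity_score - a.similarity_score) // so sorting does not touch the input
    .slice(0, limit);
}
```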

Search Examples

Product Support Search:

{
"op": "vdb.search",
"query": "password reset not working",
"collection": "support_kb"
}

Technical Documentation Search:

{
"op": "vdb.search",
"query": "API rate limiting implementation",
"collection": "api_docs",
"filename": "rate-limits.md"
}

Company Policy Search:

{
"op": "vdb.search",
"query": "vacation request approval process",
"collection": "company_docs"
}

3. Delete Vectors

POST vdb.delete

Remove vector entries from the database using their unique keys.

Request Body

{
"op": "vdb.delete",
"collection": "knowledge_base",
"_keys": [
"vec_abc123xyz789",
"vec_def456uvw012"
]
}

Parameters

  • op (string, required): Must be "vdb.delete"
  • collection (string, required): Collection containing vectors to delete
  • _keys (array, required): Array of vector keys to remove

Response

{
"data": {
"deleted_count": 2,
"deleted_keys": [
"vec_abc123xyz789",
"vec_def456uvw012"
]
},
"meta": {
"collection": "knowledge_base",
"operation": "delete"
}
}
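
The `deleted_keys` array makes it possible to confirm that every requested key was actually removed. A minimal sketch, assuming the response shape shown above; `findUndeletedKeys` is a hypothetical helper name.

```javascript
// Returns the requested keys that the delete response does not report as deleted.
function findUndeletedKeys(requestedKeys, deleteResponse) {
  const deleted = new Set(deleteResponse.data?.deleted_keys ?? []);
  return requestedKeys.filter((key) => !deleted.has(key));
}
```

An empty result means the response accounts for every key in the request.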

Use Cases

Content Cleanup:

{
"op": "vdb.delete",
"collection": "support_kb",
"_keys": ["vec_old_article_123"]
}

Bulk Removal:

{
"op": "vdb.delete",
"collection": "temp_docs",
"_keys": [
"vec_temp_001",
"vec_temp_002",
"vec_temp_003"
]
}

Content Processing and Chunking

Automatic Chunking

The VectorStore API automatically splits large content into manageable chunks for optimal embedding generation:

  • Default chunk size: 500 characters
  • Default overlap: 40 characters
  • Preserves context across chunk boundaries
  • Maintains metadata for all chunks
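
To make the interaction of chunk size and overlap concrete, the sketch below splits text with a fixed character window. It is an illustration only and not the service's actual algorithm, which may also respect sentence or structure boundaries; `chunkText` is a hypothetical name, and the defaults reuse the documented 500 and 40.

```javascript
// Illustrative character-window chunker; NOT the service's actual algorithm.
function chunkText(text, chunkSize = 500, chunkOverlap = 40) {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be >= 0 and smaller than chunkSize");
  }
  const step = chunkSize - chunkOverlap; // how far each window advances
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}
```

With the defaults, a 1,000-character input produces three chunks, and the last 40 characters of each chunk repeat as the first 40 characters of the next.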

Content Types

Text Processing:

  • Plain text content processed as-is
  • Automatic sentence boundary detection
  • Preserves formatting when possible

Markdown Processing:

  • Headers and structure preserved
  • Code blocks handled appropriately
  • Links and formatting maintained

HTML Processing:

  • Tags stripped for embedding generation
  • Text content extracted intelligently
  • Structure hints preserved in metadata

JSON Processing:

  • Structured data flattened for embedding
  • Key-value relationships maintained
  • Nested objects handled recursively
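
As a rough picture of what recursive flattening of JSON can look like, the sketch below turns nested data into "path: value" lines. The output format is an assumption made for illustration; the service's own flattened representation is not documented here, and `flattenJson` is a hypothetical name.

```javascript
// Flattens nested JSON into "path: value" lines; an illustration, not the service's implementation.
function flattenJson(value, prefix = "") {
  // Primitives and null end the recursion and become one line.
  if (value === null || typeof value !== "object") {
    return [`${prefix}: ${String(value)}`];
  }
  // Arrays use index paths (tags[0]); objects use dotted paths (a.b).
  const entries = Array.isArray(value)
    ? value.map((item, index) => [`${prefix}[${index}]`, item])
    : Object.entries(value).map(([key, item]) => [prefix ? `${prefix}.${key}` : key, item]);
  return entries.flatMap(([path, item]) => flattenJson(item, path));
}
```

For example, `flattenJson({ a: { b: 1 }, tags: ["x", "y"] })` returns `["a.b: 1", "tags[0]: x", "tags[1]: y"]`, which keeps each value next to the keys that lead to it.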

Chunking Strategy

Smaller chunks give more precise matching, and a higher overlap preserves more context across chunk boundaries:

{
"options": {
"chunk_size": 300,
"chunk_overlap": 50
}
}

Small chunks (200-400 characters):

  • Better for precise fact retrieval
  • Higher granularity in search results
  • More chunks per document

Large chunks (800-1200 characters):

  • Better for contextual understanding
  • Fewer chunks with more complete information
  • Better for narrative content
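
The trade-off between small and large chunks can be estimated before inserting anything. The sketch below counts chunks under the assumption of a fixed sliding window; the service may split differently (for example on sentence boundaries), so treat the result as an approximation. `estimateChunkCount` is a hypothetical helper.

```javascript
// Estimates how many chunks a document yields under a fixed sliding window.
function estimateChunkCount(contentLength, chunkSize, chunkOverlap) {
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be >= 0 and smaller than chunkSize");
  }
  if (contentLength <= 0) return 0;
  if (contentLength <= chunkSize) return 1;
  // After the first full chunk, each additional chunk advances by (size - overlap).
  return 1 + Math.ceil((contentLength - chunkSize) / (chunkSize - chunkOverlap));
}
```

For a 10,000-character document, a chunk size of 300 with overlap 50 gives an estimate of 40 chunks, while a chunk size of 1000 with the same overlap gives 11.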

Search Capabilities and Use Cases

Semantic Search Features

Natural Language Queries:

  • Ask questions in plain English
  • Contextual understanding of intent
  • Fuzzy matching for typos and variations

Multi-language Support:

  • Cross-language semantic search
  • Automatic language detection
  • Unified search across multilingual content

Domain-Specific Search:

  • Technical documentation search
  • Legal document analysis
  • Customer support automation
  • Research paper discovery

Advanced Search Patterns

Question Answering:

{
"op": "vdb.search",
"query": "What are the security requirements for API keys?",
"collection": "security_docs"
}

Concept Discovery:

{
"op": "vdb.search",
"query": "machine learning model deployment best practices",
"collection": "technical_guides"
}

Troubleshooting:

{
"op": "vdb.search",
"query": "application crashes on startup with memory error",
"collection": "bug_reports"
}

Integration Patterns

Chatbot Integration:

// Assumes vectorSearch() wraps a vdb.search call and generateResponse() calls your
// language model; neither function is provided by this API.
async function answerQuestion(userQuestion) {
  const searchResults = await vectorSearch({
    query: userQuestion,
    collection: 'knowledge_base'
  });

  // Use the top three results as context for the generated response
  const context = searchResults.data.slice(0, 3)
    .map(result => result.content)
    .join('\n\n');

  return generateResponse(userQuestion, context);
}

Recommendation Engine:

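async function findSimilarContent(contentId) {
  // Get the original content (getContent() is an application-side lookup, not part of this API)
  const original = await getContent(contentId);

  // Search for content similar to the original
  const similar = await vectorSearch({
    query: original.content,
    collection: 'articles'
  });

  // Exclude the original itself; this assumes contentId is the vector's _key
  return similar.data.filter(item => item._key !== contentId);
}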
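
Both patterns above call a `vectorSearch` function that is left undefined. One possible shape for it is sketched below, built on an injected transport so it can be tested without a network. The names `createVectorSearch` and `createFetchTransport`, the endpoint URL argument, and the `X-API-KEY` header are all assumptions for illustration; use whatever endpoint and authentication your deployment actually requires.

```javascript
// Hypothetical factory: returns a vectorSearch(params) function built on an injected transport.
// `send` takes a request body object and resolves to the parsed JSON response.
function createVectorSearch(send) {
  return async function vectorSearch({ query, collection, filename }) {
    if (typeof query !== "string" || query.trim() === "") {
      throw new Error("query is required");
    }
    const body = { op: "vdb.search", query };
    if (collection !== undefined) body.collection = collection;
    if (filename !== undefined) body.filename = filename;
    const response = await send(body);
    // Surface documented error payloads as exceptions.
    if (response.error) {
      throw new Error(`${response.error.code}: ${response.error.message}`);
    }
    return response;
  };
}

// Example transport using fetch; the URL and header name are placeholders, not documented values.
function createFetchTransport(endpointUrl, apiKey) {
  return async (body) => {
    const res = await fetch(endpointUrl, {
      method: "POST",
      headers: { "Content-Type": "application/json", "X-API-KEY": apiKey },
      body: JSON.stringify(body),
    });
    return res.json();
  };
}
```

Injecting the transport keeps request building separate from HTTP details, so the same `vectorSearch` can be exercised with a stub in tests and with `createFetchTransport(...)` in production.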

Performance and Optimization

Query Optimization

Specific Collections:

  • Search within specific collections for faster results
  • Organize content by domain or topic
  • Use metadata for additional filtering

Efficient Chunking:

  • Balance chunk size based on content type
  • Consider query patterns when setting overlap
  • Monitor search performance and adjust accordingly

Metadata Best Practices

Structured Tagging:

{
"metadata": {
"category": "technical",
"subcategory": "api",
"difficulty": "intermediate",
"last_updated": "2024-01-15",
"author": "tech_team"
}
}

Search-Friendly Organization:

{
"metadata": {
"product": "SingleBase",
"feature": "authentication",
"audience": ["developers", "integrators"],
"content_stage": "production"
}
}
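
The documented vdb.search parameters cover only `collection` and `filename`, so one way to use metadata like the examples above is to filter results on the client after the search returns. This is a sketch of that approach, not a server-side feature; `filterByMetadata` is a hypothetical helper.

```javascript
// Client-side post-filter: keeps results whose metadata matches every given criterion.
function filterByMetadata(results, criteria) {
  return results.filter((result) =>
    Object.entries(criteria).every(([field, expected]) => {
      const actual = result.metadata?.[field];
      // Array-valued fields (e.g. tags, audience) match if they contain the expected value.
      return Array.isArray(actual) ? actual.includes(expected) : actual === expected;
    })
  );
}
```

For example, `filterByMetadata(response.data, { product: "SingleBase", audience: "developers" })` keeps only results tagged for that product and audience.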

Error Responses

Common error scenarios and responses:

Invalid Content

{
"error": {
"code": "INVALID_CONTENT",
"message": "Content cannot be empty or exceed maximum length",
"details": {
"max_length": 50000,
"provided_length": 52000
}
}
}

Collection Not Found

{
"error": {
"code": "COLLECTION_NOT_FOUND",
"message": "Specified collection does not exist",
"details": {
"collection": "nonexistent_collection"
}
}
}

Vector Not Found

{
"error": {
"code": "VECTOR_NOT_FOUND",
"message": "One or more vector keys do not exist",
"details": {
"missing_keys": ["vec_invalid_123"]
}
}
}

Common Error Codes

Code                 | Description                        | Resolution
INVALID_CONTENT      | Content is empty or too large      | Ensure content is within size limits
COLLECTION_NOT_FOUND | Collection doesn't exist           | Verify collection name or create collection
VECTOR_NOT_FOUND     | Vector key doesn't exist           | Check vector keys before deletion
EMBEDDING_FAILED     | Vector embedding generation failed | Retry with different content or contact support
QUOTA_EXCEEDED       | Vector storage quota reached       | Upgrade plan or delete unused vectors
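
Client code can turn the error codes above into actionable messages. The sketch below assumes the `error` object shape shown in the examples; `describeError` is a hypothetical helper, and the fallback text for unknown codes is an assumption.

```javascript
// Resolution hints taken from the error code table.
const ERROR_RESOLUTIONS = new Map([
  ["INVALID_CONTENT", "Ensure content is within size limits"],
  ["COLLECTION_NOT_FOUND", "Verify collection name or create collection"],
  ["VECTOR_NOT_FOUND", "Check vector keys before deletion"],
  ["EMBEDDING_FAILED", "Retry with different content or contact support"],
  ["QUOTA_EXCEEDED", "Upgrade plan or delete unused vectors"],
]);

// Returns null for successful responses, otherwise the code, message, and a suggested resolution.
function describeError(response) {
  if (!response || !response.error) return null;
  const { code, message } = response.error;
  return {
    code,
    message,
    // Assumption: codes not listed in the table fall back to a generic hint.
    resolution: ERROR_RESOLUTIONS.get(code) ?? "Unrecognized error code; contact support",
  };
}
```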

Best Practices

Content Organization

  • Use descriptive collection names that reflect content domains
  • Implement consistent tagging strategies across documents
  • Include relevant metadata for enhanced filtering and discovery
  • Regularly clean up outdated or irrelevant vectors

Search Optimization

  • Frame queries as natural questions for better results
  • Use specific terminology when searching technical content
  • Combine vector search with metadata filtering for precision
  • Monitor search patterns to optimize chunking strategies

Performance

  • Batch insert operations when adding multiple documents
  • Use appropriate chunk sizes based on content characteristics
  • Implement caching for frequently searched queries
  • Monitor vector storage usage and optimize retention policies
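
Caching frequently repeated queries can be done with a thin wrapper around whatever search function the application uses. The sketch below is one simple approach under stated assumptions: `withSearchCache` is a hypothetical name, the 60-second default lifetime is illustrative, and the cache is an unbounded in-memory Map with no eviction beyond expiry on read.

```javascript
// Wraps an async search function with an in-memory cache keyed by the request parameters.
// `now` is injectable so expiry can be tested without waiting.
function withSearchCache(searchFn, ttlMs = 60000, now = Date.now) {
  const cache = new Map();
  return async function cachedSearch(params) {
    const key = JSON.stringify([params.query, params.collection ?? null, params.filename ?? null]);
    const hit = cache.get(key);
    // Serve the stored response while it is younger than ttlMs.
    if (hit && now() - hit.storedAt < ttlMs) return hit.value;
    const value = await searchFn(params);
    cache.set(key, { value, storedAt: now() });
    return value;
  };
}
```

Inserts and deletes do not invalidate this cache, so a short lifetime is safer for collections that change often.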

Security

  • Ensure proper authentication for all vector operations
  • Implement access controls at the collection level
  • Sanitize content before insertion to prevent data leakage
  • Regularly audit stored vectors for sensitive information

The VectorStore API enables powerful semantic search capabilities that can transform how users discover and interact with your content, providing the foundation for intelligent applications, chatbots, and recommendation systems.