API Reference

The Co-mind.ai API follows RESTful conventions and is fully documented via our OpenAPI 3.1 specification. All endpoint documentation below is auto-generated from openapi.yaml.

Base URL

https://your-comind-instance.example.com
All endpoints are prefixed with /v1/ unless otherwise noted.

Authentication

All API requests (except /health and public auth endpoints) require a Bearer token:
Authorization: Bearer <token>

Request Format

  • Content-Type: application/json for all JSON request bodies
  • File uploads: multipart/form-data
  • Character encoding: UTF-8
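As a sketch, the headers above can be assembled in Python like this (build_headers is an illustrative helper name, not part of any SDK):

```python
import json

def build_headers(token: str) -> dict:
    # Bearer token plus JSON content type, as required above.
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }

# Serialize a chat payload as UTF-8 JSON.
payload = {
    "model": "tiiuae/Falcon3-7B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}],
}
body = json.dumps(payload).encode("utf-8")
```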

Response Format

All responses return JSON with consistent structure:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1730822400,
  "model": "tiiuae/Falcon3-7B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Response content here."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 8,
    "total_tokens": 28
  }
}
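A minimal sketch of reading this shape in Python (extract_reply is an illustrative helper, not part of any SDK):

```python
def extract_reply(resp: dict) -> tuple:
    # Pull the assistant message and total token usage
    # from a chat completion response body.
    content = resp["choices"][0]["message"]["content"]
    total_tokens = resp["usage"]["total_tokens"]
    return content, total_tokens

example = {
    "choices": [{"index": 0,
                 "message": {"role": "assistant",
                             "content": "Response content here."},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 20, "completion_tokens": 8, "total_tokens": 28},
}
text, tokens = extract_reply(example)  # → ("Response content here.", 28)
```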

Status Codes

| Code | Description |
|------|-------------|
| 200 | Success |
| 201 | Created |
| 204 | No Content (successful deletion) |
| 400 | Bad Request — invalid parameters |
| 401 | Unauthorized — invalid or missing token |
| 403 | Forbidden — insufficient permissions |
| 404 | Not Found — resource doesn't exist |
| 409 | Conflict — capability not supported or duplicate |
| 415 | Unsupported Media Type |
| 429 | Rate Limited — retry after delay |
| 500 | Server Error — retry with backoff |
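Only the last two codes warrant a retry; 4xx client errors indicate a request that must be fixed, not resent. A small helper can encode that distinction (should_retry is an illustrative name):

```python
# Per the table above, only rate limits and server errors are retryable.
RETRYABLE = {429, 500}

def should_retry(status: int) -> bool:
    # 4xx client errors (other than 429) will fail again unchanged.
    return status in RETRYABLE
```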

Streaming

Chat and completion endpoints support Server-Sent Events (SSE) streaming by setting "stream": true:
curl -X POST https://your-instance/v1/chat/completions \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tiiuae/Falcon3-7B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
Streamed responses arrive as SSE data: lines, and the stream ends with a data: [DONE] sentinel:
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"The"},"index":0}]}

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":" capital"},"index":0}]}

data: [DONE]
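A sketch of parsing these lines in Python (parse_sse_line is an illustrative helper; with a live connection you would feed it each line from a stream=True request):

```python
import json

def parse_sse_line(line: str):
    # Return the delta text for a data: line,
    # or None for blank lines and the [DONE] sentinel.
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    return chunk["choices"][0]["delta"].get("content", "")
```

With the requests library, you would iterate response.iter_lines(decode_unicode=True) on a streaming POST and concatenate the non-None deltas to rebuild the full reply.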

Rate Limiting

When you exceed rate limits, the API returns 429 Too Many Requests. Implement exponential backoff in your client:
import time
import requests

def api_call_with_retry(url, headers, json_data, max_retries=3):
    """POST with exponential backoff on 429 responses."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=json_data)
        if response.status_code == 429:
            # Honor Retry-After if the server provides it;
            # otherwise back off exponentially (1s, 2s, 4s, ...).
            wait = float(response.headers.get("Retry-After", 2 ** attempt))
            time.sleep(wait)
            continue
        return response
    raise RuntimeError("Max retries exceeded")

Endpoint Categories

The API is organized into these categories — browse them in the sidebar:

Authentication

Login, SSO, token refresh, registration, password management.

Chat & Completions

OpenAI-compatible chat, text completion, and streaming.

Knowledge Bases

Create KBs, upload files, query for context, RAG chat.

Echo Engine

Audio transcription (STT), text-to-speech (TTS), recordings.

Researcher

Web search, content scraping, deep research, report synthesis.

Document Analyzer

Upload documents, extract data, human-in-the-loop review.

Admin

Tenant management, directory sync, sanitizer policies, audit logs.

Discovery

Models, backends, capabilities, health checks.