AI Crucible API: Complete Integration Guide

AI Crucible provides a unified, OpenAI-compatible API that allows you to integrate powerful ensemble AI strategies directly into your applications. This guide covers everything from authentication to advanced orchestration controls.

Sample Application

Want to see the API in action? Check out our sample chat application on GitHub. This complete, open-source example demonstrates the concepts covered in this guide, includes full setup instructions for beginners, and serves as a production-ready starting point for your own applications.


1. Getting Started

The API is built to be a drop-in replacement for standard LLM calls. If you use the OpenAI SDK, you can switch to AI Crucible by simply changing the baseURL and apiKey.

Authentication

All API requests require a valid API Key passed in the Authorization header.

Authorization: Bearer sk-aicrucible-...

You can manage your API keys, including their permissions, in the Settings > API Keys section of the dashboard.
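As a minimal sketch, here is how you might attach the Authorization header when calling the API directly with fetch (the key shown is a placeholder; this assumes a Node 18+ or browser environment with a global fetch):

```typescript
// Build the headers every AI Crucible request needs.
function buildAuthHeaders(apiKey: string): Record<string, string> {
  return {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${apiKey}`,
  };
}

// Example: fetch usage stats with the authenticated headers.
async function listUsage(apiKey: string) {
  const res = await fetch("https://api.ai-crucible.com/v1/usage", {
    headers: buildAuthHeaders(apiKey),
  });
  if (res.status === 401) throw new Error("Unauthorized: check your API key");
  return res.json();
}
```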


2. Chat Completions

POST /v1/chat/completions

This is the primary endpoint for running ensemble strategies. It accepts a standard chat history and returns a generated response, fully compatible with the OpenAI format.

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The ID of the model to use. With ai_crucible orchestration, this model serves as the arbiter/judge; without it, this is the model that generates the response. |
| messages | array | Yes | List of message objects (role, content). Content can be a string or an array of content parts for multimodal messages. |
| stream | boolean | No | If true, returns a stream of Server-Sent Events (SSE). |
| temperature | number | No | Sampling temperature, between 0 and 2. Higher values mean the model will take more risks. |
| top_p | number | No | Nucleus sampling, an alternative to temperature: the model considers only the tokens comprising the top_p probability mass. |
| max_tokens | number | No | Maximum number of tokens to generate in the completion. |
| stream_options | object | No | Streaming configuration. Use { "include_usage": true } to receive token usage data in the final chunk when streaming. |
| ai_crucible | object | No | Advanced configuration for ensemble strategies. |

Advanced Configuration (ai_crucible)

To orchestrate multiple models or use advanced strategies, pass the ai_crucible object in your request. This nested object pattern is the standard way to extend OpenAI-compatible APIs without parameter collisions.

{
  "model": "gpt-4o",
  "messages": [...],
  "ai_crucible": {
    "strategy": "competitive_refinement",
    "iterations": 1,
    "models": ["gemini-2.5-flash", "gpt-4o-mini"]
  }
}

Supported Fields

| Field | Type | Description |
|---|---|---|
| strategy | string | The ensemble strategy ID (e.g., competitive_refinement, expert_panel). |
| iterations | number | Number of refinement rounds (default: 1). |
| rounds | number | Alias for iterations; both parameters are supported for backward compatibility. |
| models | array | List of model IDs to participate in the ensemble. |
| includeFollowUpPrompts | boolean | Optional. Whether to generate follow-up prompt suggestions (default: false). |
| includeCandidates | boolean | Optional. If true, includes individual model responses in the response's candidates field (default: false). |
| includeReasoning | boolean | Optional. If true, includes the Arbiter's explanation and winner selection (default: false). An alternative to the root-level reasoning parameter. |
| systemPrompt | string | Optional. Specific instructions for the Arbiter model (e.g., "You are a Pirate Arbiter"). Takes precedence over message history. |
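As an illustrative sketch (not an official SDK), the request body with its ai_crucible extension can be modeled with TypeScript types based on the fields documented above:

```typescript
// Illustrative types for the ai_crucible extension object.
interface AiCrucibleConfig {
  strategy?: string;              // e.g. "competitive_refinement", "expert_panel"
  iterations?: number;            // refinement rounds (default: 1); "rounds" is an alias
  models?: string[];              // participant models that generate competing responses
  includeFollowUpPrompts?: boolean;
  includeCandidates?: boolean;
  includeReasoning?: boolean;
  systemPrompt?: string;          // instructions for the Arbiter model
}

interface ChatRequest {
  model: string;                  // acts as the Arbiter/judge when ai_crucible is present
  messages: { role: string; content: string }[];
  ai_crucible?: AiCrucibleConfig;
}

// Build a request with the documented default of one iteration.
function buildEnsembleRequest(
  arbiter: string,
  userPrompt: string,
  config: AiCrucibleConfig
): ChatRequest {
  return {
    model: arbiter,
    messages: [{ role: "user", content: userPrompt }],
    ai_crucible: { iterations: 1, ...config },
  };
}
```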

[!TIP] System Prompt Fallback: If ai_crucible.systemPrompt is not provided, the API will attempt to extract the first system message from the messages array to use as the Arbiter's instructions. Explicitly setting systemPrompt in the extension object is recommended for clarity.

[!IMPORTANT] Arbiter Model Selection: When using ensemble strategies via ai_crucible, the root model parameter specifies which model acts as the arbiter/judge to select the best response. The models array in ai_crucible specifies the participant models that generate competing responses.

Example:

{
  "model": "gemini-1.5-pro",  // ← This model judges the responses
  "messages": [...],
  "ai_crucible": {
    "models": ["gpt-4o", "claude-3-opus"]  // ← These models compete
  }
}

System Prompt Support

You can provide a system prompt via ai_crucible.systemPrompt or by including a message with role: "system" in the messages array. If both are provided, ai_crucible.systemPrompt takes precedence for the Arbiter model.

| Provider | Support Status | Notes |
|---|---|---|
| OpenAI | ✅ Supported | Native support |
| Anthropic | ✅ Supported | Mapped to top-level system parameter |
| Google Gemini | ✅ Supported | Native support |
| DeepSeek | ✅ Supported | Injected as system message |
| Mistral | ✅ Supported | Injected as system message |
| Kimi (Moonshot) | ✅ Supported | Injected as system message |
| Qwen (Aliyun) | ✅ Supported | Injected as system message |
| xAI (Grok) | ✅ Supported | Injected as system message |
| Together | ✅ Supported | Injected as system message |

Reasoning Parameter (OpenAI Compatible)

Crucible supports the standard reasoning parameter for controlling explanation effort:

{
  "model": "gemini-2.5-flash",
  "reasoning": {
    "effort": "high"
  }
}

Setting reasoning.effort to any value enables the explanation field in the response; the effort level itself is not yet used to adjust the level of detail.

Controlling Creativity

You can control the randomness of the output using temperature and top_p. These parameters are applied to the final synthesis step (the Arbiter's response).

{
  "model": "gpt-4o",
  "messages": [...],
  "temperature": 0.7,
  "top_p": 0.9,
  "ai_crucible": { ... }
}

Advanced OpenAI-Compatible Parameters

For full OpenAI API compatibility, the following additional parameters are supported:

| Parameter | Type | Description |
|---|---|---|
| n | number | Number of completions to generate (default: 1). |
| stop | string \| string[] | Stop sequences at which the API will stop generating further tokens. |
| presence_penalty | number | Penalizes new tokens based on whether they appear in the text so far (-2.0 to 2.0). |
| frequency_penalty | number | Penalizes new tokens based on their frequency in the text so far (-2.0 to 2.0). |
| logit_bias | object | Modifies the likelihood of specified tokens appearing in the completion. |
| user | string | A unique identifier representing your end user, for monitoring and abuse detection. |

[!NOTE] These parameters are primarily useful for advanced use cases and fine-tuning model behavior. Most users will find temperature and top_p sufficient for controlling output randomness.

Example Request

curl https://api.ai-crucible.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AI_CRUCIBLE_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Write a poem about rust."}],
    "ai_crucible": {
      "strategy": "expert_panel",
      "models": ["claude-3-opus", "gpt-4o"],
      "iterations": 2
    }
  }'

3. File Attachments

AI Crucible supports multimodal attachments for both images and text-based files (including Markdown, JSON, CSV).

Supported File Types

  1. Images: sent as base64-encoded image_url content parts.
  2. Text-based files (including Markdown, JSON, and CSV): inlined into the message content as formatted text blocks.

Multimodal Message Format

Messages support two content formats:

  1. Simple text (string): "content": "Your message here"
  2. Multimodal (array): "content": [{ type: "text", text: "..." }, { type: "image_url", ... }]

When including attachments, use the array format with content parts.

Usage in API

Attachments are handled by the client and sent as standard OpenAI-compatible content parts.

Image Attachment

Images should be passed as image_url content parts with base64-encoded data.

{
  "role": "user",
  "content": [
    { "type": "text", "text": "What is in this image?" },
    {
      "type": "image_url",
      "image_url": {
        "url": "data:image/jpeg;base64,..."
      }
    }
  ]
}
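Building the multimodal content array shown above can be sketched as a small helper. The imagePart and multimodalMessage names are illustrative, not part of any SDK:

```typescript
// OpenAI-compatible content parts for multimodal messages.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Wrap base64 image data in an image_url content part with a data URL.
function imagePart(base64Data: string, mimeType = "image/jpeg"): ContentPart {
  return {
    type: "image_url",
    image_url: { url: `data:${mimeType};base64,${base64Data}` },
  };
}

// Combine a text prompt with any number of additional content parts.
function multimodalMessage(text: string, ...parts: ContentPart[]) {
  return {
    role: "user",
    content: [{ type: "text", text } as ContentPart, ...parts],
  };
}
```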

Text/Markdown Attachment

Text-based attachments are appended to the message content as formatted text blocks.

{
  "role": "user",
  "content": "Analyze this file:\n\n--- Attachment: readme.md ---\n# Project Title\n...\n--- End Attachment ---"
}
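A sketch of producing the delimiter format shown above on the client side (the helper name is hypothetical):

```typescript
// Inline a text attachment into the message content using the
// "--- Attachment: ... ---" delimiter format.
function withTextAttachment(
  prompt: string,
  filename: string,
  fileContent: string
): string {
  return (
    `${prompt}\n\n--- Attachment: ${filename} ---\n` +
    `${fileContent}\n--- End Attachment ---`
  );
}
```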

4. Specialized Endpoints

POST /v1/evaluations

The /evaluations endpoint allows you to programmatically evaluate model responses using AI judge models. Submit two or more model responses and receive detailed scoring across criteria like accuracy, creativity, clarity, completeness, and usefulness.

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | The original prompt that the model responses were generated from. |
| responses | array | Yes | Array of model responses to evaluate. Each must include modelId, modelName, and response. |
| judge_models | array | Yes | Array of judge models to use. Each must include id and name. Multiple judges produce consensus scoring. |
| strategy | string | No | The ensemble strategy context (e.g., competitive_refinement). |
| evaluation_mode | string | No | Evaluation mode: standard (pairwise comparison) or pointwise (independent scoring). Default: standard. |
| weighted | boolean | No | Whether to use weighted scoring based on judge model weights. Default: false. |

Example Request

curl https://api.ai-crucible.com/v1/evaluations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AI_CRUCIBLE_KEY" \
  -d '{
    "prompt": "What is the capital of France?",
    "responses": [
      {
        "modelId": "gemini-3-flash",
        "modelName": "Gemini 3 Flash",
        "response": "The capital of France is Paris, renowned for the Eiffel Tower."
      },
      {
        "modelId": "gpt-5-nano",
        "modelName": "GPT-5 Nano",
        "response": "Paris is the capital of France."
      }
    ],
    "judge_models": [
      { "id": "gemini-3-flash", "name": "Gemini 3 Flash" }
    ]
  }'

Example Response

{
  "evaluations": [
    {
      "modelId": "gemini-3-flash",
      "modelName": "Gemini 3 Flash",
      "overallScore": 8.8,
      "criteria": {
        "accuracy": 10,
        "creativity": 5,
        "clarity": 10,
        "completeness": 10,
        "usefulness": 9
      },
      "reasoning": "Provides the correct answer with useful context about landmarks.",
      "individualScores": [
        {
          "judgeId": "gemini-3-flash",
          "judgeName": "Gemini 3 Flash",
          "overallScore": 8.8,
          "criteria": { "accuracy": 10, "creativity": 5, "clarity": 10 }
        }
      ]
    }
  ],
  "usage": {
    "prompt_tokens": 487,
    "completion_tokens": 308,
    "total_tokens": 1655,
    "total_cost": 0.0012
  }
}

[!TIP] Multi-Judge Consensus: When providing multiple judge models, scores are averaged across judges. Use weighted: true to apply model-specific quality weights for higher-fidelity scoring.
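The consensus math described in the tip above can be sketched as follows. This is an illustrative client-side model of the behavior, not the server's implementation, and the weight values are hypothetical:

```typescript
interface JudgeScore {
  judgeId: string;
  overallScore: number;
}

// Plain average across judges by default; weighted average when
// per-judge weights are supplied (missing judges default to weight 1).
function consensusScore(
  scores: JudgeScore[],
  weights?: Record<string, number>
): number {
  if (scores.length === 0) throw new Error("no judge scores");
  if (!weights) {
    return scores.reduce((sum, s) => sum + s.overallScore, 0) / scores.length;
  }
  let total = 0;
  let weightSum = 0;
  for (const s of scores) {
    const w = weights[s.judgeId] ?? 1;
    total += s.overallScore * w;
    weightSum += w;
  }
  return total / weightSum;
}
```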

POST /v1/responses

The /responses endpoint is optimized for batch processing and complex multi-turn workflows that don't fit the strict chat format.

Request Body

{
  "model": "gemini-3-flash",
  "input": "Analyze this dataset for anomalies...",
  "stream": false
}

GET /v1/usage

Retrieve your current token usage and cost statistics. This helps you monitor your budget in real-time.

Response:

{
  "object": "list",
  "data": [
    {
      "object": "usage_record",
      "period": "2024-03",
      "total_tokens": 150000,
      "total_cost": 0.45
    }
  ]
}
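For budget monitoring, you might aggregate the usage records returned above into a single total. A minimal sketch, assuming the response shape shown:

```typescript
// Shape of one entry in the /v1/usage "data" array shown above.
interface UsageRecord {
  object: string;
  period: string;
  total_tokens: number;
  total_cost: number;
}

// Sum tokens and cost across all returned periods.
function totalSpend(records: UsageRecord[]): { tokens: number; cost: number } {
  return records.reduce(
    (acc, r) => ({
      tokens: acc.tokens + r.total_tokens,
      cost: acc.cost + r.total_cost,
    }),
    { tokens: 0, cost: 0 }
  );
}
```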

5. Error Handling

The API uses standard HTTP status codes to indicate success or failure.

| Code | Meaning | Solution |
|---|---|---|
| 200 | OK | Request succeeded. |
| 401 | Unauthorized | Check your API Key. |
| 402 | Payment Required | You have exceeded your monthly budget or balance. |
| 429 | Too Many Requests | Rate limit exceeded. Slow down or request a quota increase. |
| 500 | Internal Error | Something went wrong on our side. Please retry. |
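A common client pattern is to retry only the transient failures (429 and 5xx) with exponential backoff. This is a sketch of one such policy; the base delay and growth factor are illustrative choices, not an official recommendation:

```typescript
// 429 and server errors are worth retrying; 401/402 are not.
function isRetryable(status: number): boolean {
  return status === 429 || status >= 500;
}

// Exponential backoff: attempt 0 → 500ms, 1 → 1000ms, 2 → 2000ms, ...
function backoffMs(attempt: number, baseMs = 500): number {
  return baseMs * 2 ** attempt;
}
```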

Frequently Asked Questions

Can I use the OpenAI Node.js SDK?

Yes! Simply initialize the client with your AI Crucible base URL and API key.

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-aicrucible-...',
  baseURL: 'https://api.ai-crucible.com/v1',
});

How does streaming work with ensembles?

Streaming works identically to standard LLMs. For ensemble strategies like Collaborative Synthesis, you will receive the final synthesized chunk-by-chunk as it is generated by the synthesizer model. Intermediate steps (like panel discussions) happen server-side before the stream begins or are summarized in the final output.
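If you consume the stream without the SDK, each SSE line carries either a JSON chunk or the "[DONE]" sentinel. A minimal parser sketch, assuming the standard chat.completion.chunk shape:

```typescript
// Extract the content delta from one SSE line, or null for
// comments, keepalives, and the terminating "[DONE]" sentinel.
function parseSseLine(line: string): string | null {
  if (!line.startsWith("data: ")) return null;
  const payload = line.slice("data: ".length).trim();
  if (payload === "[DONE]") return null;
  const chunk = JSON.parse(payload);
  return chunk.choices?.[0]?.delta?.content ?? null;
}
```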

What happens if a model in my ensemble fails?

The orchestrator automatically handles partial failures. If one model in a panel fails (e.g., due to rate limits), the strategy continues with the remaining healthy models, ensuring you still get a robust result.
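Conceptually, this partial-failure behavior resembles a fan-out that keeps only the successful results. A client-side sketch of the idea (callModel is a hypothetical per-model request function, not part of the API):

```typescript
// Fan a prompt out to several models; drop the ones that fail.
async function fanOut(
  models: string[],
  callModel: (model: string) => Promise<string>
): Promise<{ model: string; response: string }[]> {
  const results = await Promise.allSettled(models.map((m) => callModel(m)));
  return results.flatMap((r, i) =>
    r.status === "fulfilled" ? [{ model: models[i], response: r.value }] : []
  );
}
```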

