<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>AI Crucible Articles</title>
    <link>https://ai-crucible.com/articles/</link>
    <atom:link href="https://ai-crucible.com/feed.xml" rel="self" type="application/rss+xml" />
    <description>Articles and guides on ensemble AI strategies, model comparisons, and LLM orchestration.</description>
    <language>en-us</language>
    <lastBuildDate>Tue, 09 Jun 2026 21:49:36 GMT</lastBuildDate>
    <item>
      <title>Claude Fable 5 Debut: vs Opus 4.8, Sonnet 4.6, GPT-5.5</title>
      <link>https://ai-crucible.com/articles/claude-fable-5-vs-opus-4-8-vs-gpt-5-5-rate-limiter/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/claude-fable-5-vs-opus-4-8-vs-gpt-5-5-rate-limiter/</guid>
      <pubDate>Wed, 10 Jun 2026 00:00:00 GMT</pubDate>
      <description>Claude Fable 5's first ensemble benchmark: fastest flagship answer and top accuracy, but GPT-5.5 takes the judged crown at 9.3/10. Full data inside.</description>
    </item>
    <item>
      <title>Qwen3.7-Max vs Kimi K2.6 vs DeepSeek V4: China's Best</title>
      <link>https://ai-crucible.com/articles/qwen-3-7-max-vs-kimi-k2-6-vs-deepseek-v4/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/qwen-3-7-max-vs-kimi-k2-6-vs-deepseek-v4/</guid>
      <pubDate>Tue, 09 Jun 2026 00:00:00 GMT</pubDate>
      <description>Alibaba's new Qwen3.7-Max takes on Kimi K2.6 and DeepSeek-V4-Pro on a hard fraud-detection design task, judged by Gemini 3.1 Pro and Claude Opus 4.8.</description>
    </item>
    <item>
      <title>Analyze Large PDFs: Page-Cited Search and a Caught Hallucination</title>
      <link>https://ai-crucible.com/articles/analyze-large-pdfs-rag-pdf-search/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/analyze-large-pdfs-rag-pdf-search/</guid>
      <pubDate>Fri, 05 Jun 2026 00:00:00 GMT</pubDate>
      <description>Drop a book-length PDF into AI Crucible and models search and cite exact pages. In our run, one model fabricated figures, and the ensemble caught it.</description>
    </item>
    <item>
      <title>Bring Your Own Key: Run Any OpenRouter Model in an Ensemble</title>
      <link>https://ai-crucible.com/articles/byok-connect-openrouter-ensembles/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/byok-connect-openrouter-ensembles/</guid>
      <pubDate>Fri, 05 Jun 2026 00:00:00 GMT</pubDate>
      <description>AI Crucible's new Connect tier lets you bring an OpenRouter key and run any model in an ensemble, unmetered. We ran two OpenRouter-only models head to head.</description>
    </item>
    <item>
      <title>What's New in AI Crucible: June 2026 Feature Roundup</title>
      <link>https://ai-crucible.com/articles/whats-new-june-2026/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/whats-new-june-2026/</guid>
      <pubDate>Thu, 04 Jun 2026 00:00:00 GMT</pubDate>
      <description>Five new AI Crucible features: bring-your-own-key models, large-PDF search, web grounding on every tier, per-run reasoning control, and agreement scoring.</description>
    </item>
    <item>
      <title>The Fastest AI Models of 2026: Speed and Cost Compared</title>
      <link>https://ai-crucible.com/articles/fastest-models-2026-speed-and-cost/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/fastest-models-2026-speed-and-cost/</guid>
      <pubDate>Wed, 03 Jun 2026 00:00:00 GMT</pubDate>
      <description>Compare the fastest, cheapest 2026 AI models — Gemini 3.5 Flash, Qwen3.5-Flash, DeepSeek-V4-Flash, GLM-5 and Mistral Medium 3.5 — on speed and price.</description>
    </item>
    <item>
      <title>The New Thinking Models of 2026: Deep Reasoning Compared</title>
      <link>https://ai-crucible.com/articles/thinking-models-2026-deep-reasoning/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/thinking-models-2026-deep-reasoning/</guid>
      <pubDate>Wed, 03 Jun 2026 00:00:00 GMT</pubDate>
      <description>Compare the new 2026 thinking models — GPT-5.5 Pro, Claude Opus 4.8, GLM-5.1, DeepSeek-V4-Pro, Kimi K2.6 and Grok 4.3 — on reasoning, context, and cost.</description>
    </item>
    <item>
      <title>GPT-5.4 vs Gemini 3.1 Pro vs Grok 4.20 vs Mistral Medium 3.1</title>
      <link>https://ai-crucible.com/articles/new-models-march-2026/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/new-models-march-2026/</guid>
      <pubDate>Wed, 18 Mar 2026 00:00:00 GMT</pubDate>
      <description>GPT-5.4, Gemini 3.1 Pro, Grok 4.20, and Mistral Medium 3.1 go head-to-head on a complex SaaS architecture challenge, scored by dual AI judges.</description>
    </item>
    <item>
      <title>GPT-5.4 vs Claude Opus 4.6 vs Gemini 3.1 Pro: Flagship Showdown</title>
      <link>https://ai-crucible.com/articles/gpt-5-4-vs-claude-opus-4-6-vs-gemini-3-1-pro-flagship-showdown/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/gpt-5-4-vs-claude-opus-4-6-vs-gemini-3-1-pro-flagship-showdown/</guid>
      <pubDate>Sun, 08 Mar 2026 00:00:00 GMT</pubDate>
      <description>We pitted the three flagship models of March 2026 against a real entrepreneurship challenge. Claude Opus 4.6 edged out GPT-5.4 — but the judges disagreed on why.</description>
    </item>
    <item>
      <title>Multi-Agent Orchestration: Ensemble AI for Enterprise Workflows</title>
      <link>https://ai-crucible.com/articles/multi-agent-orchestration/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/multi-agent-orchestration/</guid>
      <pubDate>Sat, 07 Mar 2026 00:00:00 GMT</pubDate>
      <description>Discover how AI Crucible's seven ensemble strategies mirror the agentic patterns enterprises are adopting—and why orchestrating multiple AI models beats single-agent solutions.</description>
    </item>
    <item>
      <title>AI Crucible Is Now Open: Our Journey from Closed Beta to Public Launch</title>
      <link>https://ai-crucible.com/articles/ai-crucible-official-launch/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/ai-crucible-official-launch/</guid>
      <pubDate>Fri, 06 Mar 2026 00:00:00 GMT</pubDate>
      <description>AI Crucible drops the invitation code. After months of closed testing, 52 articles, and 7 ensemble strategies, the multi-model AI platform is open to everyone.</description>
    </item>
    <item>
      <title>How Prompt Classification Powers Smarter AI Ensembles</title>
      <link>https://ai-crucible.com/articles/prompt-classification-ensemble-strategies/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/prompt-classification-ensemble-strategies/</guid>
      <pubDate>Tue, 03 Mar 2026 00:00:00 GMT</pubDate>
      <description>Discover how AI Crucible classifies your prompt into 14 categories and automatically recommends the best strategy, models, and rounds for optimal results.</description>
    </item>
    <item>
      <title>AI Debate Methods: 322 Benchmarks Expose the Truth</title>
      <link>https://ai-crucible.com/articles/ai-debate-strategies/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/ai-debate-strategies/</guid>
      <pubDate>Sun, 01 Mar 2026 00:00:00 GMT</pubDate>
      <description>Compare AI debate methods with real benchmarks, code examples, and performance data. See which AI model wins for your use case.</description>
    </item>
    <item>
      <title>State of Chinese AI Models February 2026: GLM-4.7, Qwen 3.5, Kimi K2.5</title>
      <link>https://ai-crucible.com/articles/chinese-ai-models-feb-2026-glm-4-7-vs-qwen-3-5-vs-kimi-k2-5/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/chinese-ai-models-feb-2026-glm-4-7-vs-qwen-3-5-vs-kimi-k2-5/</guid>
      <pubDate>Wed, 25 Feb 2026 00:00:00 GMT</pubDate>
      <description>Chinese AI has matured beyond recognition by February 2026. GLM-4.7, Qwen 3.5 Plus, and Kimi K2.5 now challenge Western frontier models. We benchmarked all three with dual-judge scoring.</description>
    </item>
    <item>
      <title>Red Team Blue Team Walkthrough: Stress-Testing a Launch Plan</title>
      <link>https://ai-crucible.com/articles/red-team-blue-team-launch-strategy-walkthrough/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/red-team-blue-team-launch-strategy-walkthrough/</guid>
      <pubDate>Tue, 24 Feb 2026 00:00:00 GMT</pubDate>
      <description>See how AI models attack and defend a go-to-market plan for AI Crucible. This step-by-step walkthrough shows Red Team / Blue Team hardening a launch strategy across three adversarial rounds.</description>
    </item>
    <item>
      <title>Gemini 3.1 Pro vs Qwen 3.5 Plus vs Claude Sonnet 4.6 on Portfolio Management</title>
      <link>https://ai-crucible.com/articles/gemini-3-1-pro-vs-qwen3-5-plus-vs-claude-sonnet-4-6-portfolio-management/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/gemini-3-1-pro-vs-qwen3-5-plus-vs-claude-sonnet-4-6-portfolio-management/</guid>
      <pubDate>Sat, 21 Feb 2026 00:00:00 GMT</pubDate>
      <description>Claude Sonnet 4.6 wins the portfolio management showdown with a 9.1 consensus score, but Qwen 3.5 Plus delivers 89% of the quality at 6% of the cost. Here is what happened.</description>
    </item>
    <item>
      <title>Red Team Blue Team Walkthrough: Stress-Testing an Investor Pitch Deck</title>
      <link>https://ai-crucible.com/articles/red-team-blue-team-pitch-deck-walkthrough/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/red-team-blue-team-pitch-deck-walkthrough/</guid>
      <pubDate>Fri, 20 Feb 2026 00:00:00 GMT</pubDate>
      <description>Watch AI models attack and defend an investor pitch deck for AI Crucible. See how Red Team / Blue Team adversarial testing hardens business arguments across three rounds.</description>
    </item>
    <item>
      <title>Sonnet 4.6 vs Qwen 3.5 vs Kimi K2.5: Benchmark Results (2026)</title>
      <link>https://ai-crucible.com/articles/claude-sonnet-4-6-vs-qwen-3-5-plus-vs-kimi-k2-5-project-management/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/claude-sonnet-4-6-vs-qwen-3-5-plus-vs-kimi-k2-5-project-management/</guid>
      <pubDate>Wed, 18 Feb 2026 00:00:00 GMT</pubDate>
      <description>Compare Claude Sonnet 4.6, Qwen 3.5 Plus, and Kimi K2.5 head-to-head with real benchmarks, code examples, and performance data. See which AI model wins for your use case.</description>
    </item>
    <item>
      <title>AI Crucible Benchmarks: 322 Evaluations Reveal Ensemble Advantage</title>
      <link>https://ai-crucible.com/articles/benchmark-results-analysis/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/benchmark-results-analysis/</guid>
      <pubDate>Fri, 13 Feb 2026 00:00:00 GMT</pubDate>
      <description>Analysis of 322 benchmark evaluations across 20 AI models, 6 ensemble strategies, and 14 task categories. Ensemble synthesis outperforms individual models 64% of the time.</description>
    </item>
    <item>
      <title>Getting Started with AI Crucible: A Step-by-Step Guide</title>
      <link>https://ai-crucible.com/articles/getting-started-guide/</link>
      <guid isPermaLink="true">https://ai-crucible.com/articles/getting-started-guide/</guid>
      <pubDate>Wed, 11 Feb 2026 00:00:00 GMT</pubDate>
      <description>Learn how to use AI Crucible with a complete walkthrough. Follow along as we solve a real business problem using ensemble AI, from selecting models to reviewing results.</description>
    </item>
  </channel>
</rss>
