
The video demonstrates how to use the n8n AI benchmarks page and API to evaluate 60+ large language models for n8n workflows. It walks through the web interface, search and filtering, cost and category metrics, copying models via OpenRouter, and practical limits such as the 65,000-character JSON paste restriction. The presenter also covers API endpoints, rate limits, common errors, and cURL import examples for integrating results into workflows.
– Benchmarks UI: Describe your use case or paste workflow JSON to get ranked model suggestions; copy model names or OpenRouter entries directly for quick integration. Note the 65,000-character paste limit for full workflows.
– Scoring and filters: Scores combine categories such as tool use, hallucination, logic, structured output, speed, and cost; toggling categories on or off changes the rankings. There is no per-category weighting or US/overseas model filter yet.
– Pricing and run cost: The page shows cost per thousand and per million tokens, plus an average run estimate (based on 5,000 prompt tokens and 500 completion tokens); the split between prompt and completion tokens affects total cost.
– API details & caveats: Endpoints return top models, single-model details, and recommendations. The demo covers cURL import, rate limits (5 requests/min, 15/hour), a temporary include_results boolean bug, and advice to submit workflow sections rather than the entire JSON.
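The 65,000-character paste limit pairs with the advice to submit workflow sections rather than the entire JSON. A minimal sketch of splitting a workflow's nodes into paste-sized chunks — the workflow shape here is an assumption for illustration, not the exact n8n export format:

```python
import json

# The benchmarks page rejects pastes over 65,000 characters, so large
# workflows should be submitted in sections. This splits a workflow's
# node list into chunks that each serialize under the limit.
PASTE_LIMIT = 65_000

def split_workflow(workflow: dict, limit: int = PASTE_LIMIT) -> list[str]:
    """Return JSON strings, each under `limit` characters, covering all nodes."""
    chunks, current = [], []
    for node in workflow.get("nodes", []):
        candidate = current + [node]
        if len(json.dumps({"nodes": candidate})) > limit and current:
            # Flush the chunk that still fit, start a new one with this node
            chunks.append(json.dumps({"nodes": current}))
            current = [node]
        else:
            current = candidate
    if current:
        chunks.append(json.dumps({"nodes": current}))
    return chunks

# Hypothetical workflow: 300 nodes, roughly 160k characters serialized
workflow = {"nodes": [{"name": f"node-{i}", "parameters": {"x": "y" * 500}}
                      for i in range(300)]}
parts = split_workflow(workflow)
print(len(parts), all(len(p) <= PASTE_LIMIT for p in parts))
```

Each flushed chunk is guaranteed under the limit because nodes are only accepted while the serialized candidate still fits; a single node larger than the limit would still be emitted oversized, which a real client should reject separately.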
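The average run estimate above is a straightforward weighted sum of prompt and completion token prices. A quick sketch of that arithmetic — the per-million prices used in the example are placeholders, not real model pricing:

```python
# Reproduces the page's average-run estimate: 5,000 prompt tokens and
# 500 completion tokens, priced separately per million tokens.
PROMPT_TOKENS = 5_000
COMPLETION_TOKENS = 500

def average_run_cost(prompt_price_per_m: float, completion_price_per_m: float) -> float:
    """Estimated USD cost of one average run."""
    return (PROMPT_TOKENS * prompt_price_per_m
            + COMPLETION_TOKENS * completion_price_per_m) / 1_000_000

# Hypothetical pricing: $3.00/M prompt tokens, $15.00/M completion tokens
cost = average_run_cost(3.00, 15.00)
print(f"${cost:.4f}")  # → $0.0225
```

Because completion tokens are usually priced several times higher than prompt tokens, the prompt/completion split matters even when total token count is fixed.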
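With limits of 5 requests/minute and 15 requests/hour, a client should throttle itself rather than rely on error responses. A minimal sliding-window limiter sketch, assuming the stated limits; it tracks timestamps locally and reports how long to wait before the next call:

```python
import time
from collections import deque

# Client-side throttle for the stated limits: 5 requests per minute
# and 15 requests per hour, enforced as sliding windows.
class RateLimiter:
    def __init__(self, limits=((5, 60.0), (15, 3600.0))):
        self.limits = limits      # (max_requests, window_seconds) pairs
        self.history = deque()    # timestamps of sent requests

    def wait_time(self, now=None):
        """Seconds to wait before the next request; 0.0 if allowed now."""
        now = time.monotonic() if now is None else now
        wait = 0.0
        for max_requests, window in self.limits:
            recent = [t for t in self.history if now - t < window]
            if len(recent) >= max_requests:
                # The oldest request in the window must age out first
                wait = max(wait, recent[0] + window - now)
        return wait

    def record(self, now=None):
        self.history.append(time.monotonic() if now is None else now)

limiter = RateLimiter()
for i in range(5):
    limiter.record(now=float(i))      # five requests in five seconds
print(limiter.wait_time(now=5.0))     # → 55.0 (per-minute limit hit)
```

In practice you would call `wait_time()` before each API request, sleep for the returned interval, then `record()` once the request is sent; the per-hour window kicks in the same way after 15 recorded calls.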
Quotes:
Paste your workflow JSON — but watch out: there’s a 65,000-character limit.
Benchmarks rank models by what matters in n8n: tool use, hallucinations, logic, structured output, speed, and cost.
The recommendation API is a little broken right now — it expects a boolean that hates booleans.
Statistics
| Upload date | 2026-02-19 |
|---|---|
| Likes | 13 |
| Comments | 4 |
| Statistics updated | 2026-03-02 |
Specification: What is the Best LLM for n8n in 2026 (Real Benchmark Data)