The Pragmatic Engineer · 4 min read · 3 hours ago

The Pulse: a new trend, smart model routing

Intelligent AI model routing is an emerging trend, driven by the 10-20x cost difference between cheap and frontier models. Several vendors now offer smart routing products: Factory Router (20-25% savings), Not Diamond (~30% savings), Augment Code's Prism, Morph's Model Router, and Weave Router. AI gateways with built-in routing include OpenRouter, Kilo Gateway, Requestly.ai, LiteLLM, and Envoy AI Gateway. Cursor and GitHub Copilot also offer automatic model selection, though with limitations. Enterprise demand is high, and open models are reportedly sufficient for ~60% of coding-related token spend. Intelligent routing is expected to become standard across AI vendors.
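To make the cost argument concrete, here is a rough back-of-the-envelope sketch in Python. It takes the two figures quoted in the summary (a 10-20x price gap and ~60% of coding-related token spend being routable to cheaper models) and computes the resulting saving under idealized assumptions. The function name `blended_savings` and the premise that every routable request is actually sent to the cheaper model are illustrative assumptions, not something the article states.

```python
def blended_savings(routable_share: float, price_ratio: float) -> float:
    """Fraction of the original spend saved when `routable_share` of that
    spend moves to a model that is `price_ratio` times cheaper.

    Hypothetical helper for illustration; it assumes perfect routing and
    no quality-related retries.
    """
    if not 0.0 <= routable_share <= 1.0:
        raise ValueError("routable_share must be between 0 and 1")
    if price_ratio < 1.0:
        raise ValueError("price_ratio must be at least 1")
    # The routed share now costs 1/price_ratio of what it used to cost.
    return routable_share * (1.0 - 1.0 / price_ratio)


if __name__ == "__main__":
    # Figures from the summary: ~60% routable spend, 10-20x price gap.
    for ratio in (10, 20):
        saved = blended_savings(0.60, ratio)
        print(f"{ratio}x cheaper: {saved:.0%} of spend saved")
```

Under these assumptions the ceiling works out to roughly 54-57% of spend. Treat that as an idealized upper bound rather than a prediction; the vendor figures quoted above (20-25% and ~30%) are lower.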


#llm #finops

Source: https://blog.pragmaticengineer.com/the-pulse-a-new-trend-smart-model-routing. 8sync News only summarizes and links to the article; copyright in the content belongs to the author and the original source.
