The New Stack · 10 min read · 3 hours ago

Why traditional CI/CD fails for LLMs (and the release gates we built to fix it)

Traditional CI/CD pipelines fail for LLM systems because they rely on binary pass/fail logic designed for deterministic software, while LLMs degrade gradually through eval drift, distribution shift, and context poisoning. The article proposes a four-gate release framework: (1) a baseline eval suite that scores relevance, faithfulness, and safety; (2) drift detection that compares scores against rolling baselines rather than fixed thresholds; (3) shadow traffic validation using canary deployment patterns; and (4) cost and latency guardrails. Its Python code examples show how to implement each gate and integrate the gates into existing GitHub Actions or GitLab CI pipelines. Lessons learned include avoiding overly strict gates that teams end up bypassing, testing on real, messy user queries rather than synthetic ones, and versioning eval datasets like infrastructure state.
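To make gate 2 concrete, here is a minimal sketch of a drift check against a rolling baseline. It is not taken from the original article: the function names `drift_gate` and `failing_metrics`, the two-sigma tolerance, the `min_spread` floor, and all sample scores are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def drift_gate(candidate, recent_scores, max_sigma=2.0, min_spread=0.005):
    """Return (passed, floor) for one metric.

    The floor is the rolling mean of recent scores minus max_sigma
    standard deviations. min_spread (an assumed value) keeps the gate
    from becoming hypersensitive when recent scores are almost identical.
    """
    if len(recent_scores) < 2:
        raise ValueError("need at least two past scores to estimate spread")
    baseline = mean(recent_scores)
    spread = max(stdev(recent_scores), min_spread)
    floor = baseline - max_sigma * spread
    return candidate >= floor, floor


def failing_metrics(candidate_scores, history):
    """Return the names of metrics whose candidate score fell below its floor."""
    return [
        name
        for name, score in candidate_scores.items()
        if not drift_gate(score, history[name])[0]
    ]


# Hypothetical eval scores from the last five releases (made up for this sketch).
HISTORY = {
    "relevance": [0.91, 0.90, 0.92, 0.89, 0.91],
    "faithfulness": [0.88, 0.87, 0.89, 0.88, 0.86],
    "safety": [0.99, 0.99, 0.98, 0.99, 0.99],
}

# Hypothetical scores for the release candidate.
CANDIDATE = {"relevance": 0.84, "faithfulness": 0.87, "safety": 0.99}
```

With these sample numbers, a relevance score of 0.84 would clear a fixed threshold of 0.80, yet it falls below the rolling floor derived from recent releases, so `failing_metrics(CANDIDATE, HISTORY)` flags it. In a CI job, a wrapper script could exit with a nonzero status when that list is non-empty, which fails the step in both GitHub Actions and GitLab CI.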


#python #llm #cicd #rag #mlops

Source: https://thenewstack.io/why-cicd-fails-llms. 8sync News only summarizes and links to articles; copyright in the content belongs to the author and the original source.

