Datadog · 0 comments · 6 min read · 2 hours ago

How we saved over $3 million in idle compute costs with Datadog Kubernetes Autoscaling

Datadog's platform team, Rapid, adopted Datadog Kubernetes Autoscaling (DKA) to replace fragmented, manual autoscaling across 1,800+ services. DKA's multidimensional scaling mode handles both horizontal replica scaling and vertical resource rightsizing through a single declarative resource, resolving the incompatibility between the Watermark Pod Autoscaler (WPA) and the Vertical Pod Autoscaler (VPA) that had previously blocked automated vertical scaling. In an initial data center rollout, DKA cut costs by over 50% by surfacing overprovisioned workloads and rightsizing them automatically. It also identified underprovisioned pods running at 100% CPU and corrected their allocations. Rapid configured 3,000 deployments in a single day, and the approach has since spread to ~30,000 deployments across Datadog, eliminating more than $3 million in annualized idle compute costs.
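To make the rightsizing idea concrete, here is a minimal Python sketch of how overprovisioned and underprovisioned workloads might be flagged from CPU requests and observed usage. It is not DKA's actual algorithm: the `Workload` type, the function names, and the 70% target utilization are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical per-deployment snapshot (not a Datadog or Kubernetes type)."""
    name: str
    replicas: int
    cpu_request_cores: float  # CPU requested per pod
    cpu_usage_cores: float    # observed per-pod CPU usage, e.g. a high percentile


def recommend_cpu_request(w: Workload, target_utilization: float = 0.7) -> float:
    """Vertical step: size the request so observed usage sits at the target."""
    return round(w.cpu_usage_cores / target_utilization, 3)


def idle_cores(w: Workload) -> float:
    """Cores requested but unused across all replicas (never negative)."""
    return round(max(w.cpu_request_cores - w.cpu_usage_cores, 0.0) * w.replicas, 3)


def classify(w: Workload, target_utilization: float = 0.7) -> str:
    """Label a workload by how its usage compares with its request."""
    utilization = w.cpu_usage_cores / w.cpu_request_cores
    if utilization >= 1.0:
        return "underprovisioned"  # pods pinned at 100% of their request
    if utilization < target_utilization / 2:
        return "overprovisioned"   # far below target: paying for idle CPU
    return "ok"


if __name__ == "__main__":
    for w in (Workload("api", 10, 4.0, 1.0), Workload("worker", 5, 2.0, 2.0)):
        print(w.name, classify(w), recommend_cpu_request(w), idle_cores(w))
```

In this sketch the `api` workload requests four cores per pod but uses one, so it is flagged as overprovisioned with 30 idle cores across its replicas, while `worker` runs at 100% of its request and is flagged for a larger allocation. A real autoscaler would also account for memory, usage history, and safety margins.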


#devops #kubernetes #finops

Source: https://www.datadoghq.com/blog/how-we-saved-with-kubernetes-autoscaling. 8sync News only summarizes and links to articles; copyright in the content belongs to the author and the original source.
