Redpanda · 10 min read

How Redpanda Cloud Topics rethinks Kafka compaction

Redpanda's Cloud Topics architecture fundamentally redesigns Kafka log compaction by decoupling storage from brokers. Instead of each replica independently compacting its local copy of the log (wasting CPU and risking tombstone race conditions), Cloud Topics stores committed data as immutable objects in shared object storage. Compaction runs once against this canonical copy, eliminating redundant work across replicas. A pull-based scheduler with a priority queue (using dirty ratio and max.compaction.lag.ms heuristics) distributes compaction work across any shard on any node. Multi-part uploads to object storage avoid disk spills, and optimistic concurrency via a compaction_epoch integer in the metastore prevents stale data from being re-added. The result is lower CPU overhead, reduced cloud storage costs, and correct Kafka compaction semantics without the coordination problems of traditional disk-based implementations.
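The scheduling and concurrency ideas in the summary can be illustrated with a small sketch. This is not Redpanda's implementation: the `Metastore` class, the `priority` function, the tuple ranking, and the partition names are all hypothetical, chosen only to show how a priority queue keyed on dirty ratio and `max.compaction.lag.ms` might order work, and how a `compaction_epoch` compare-and-swap could reject a commit built from a stale view.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Metastore:
    """Hypothetical in-memory stand-in for the metastore (not a Redpanda API)."""
    epochs: dict = field(default_factory=dict)   # partition -> compaction_epoch
    objects: dict = field(default_factory=dict)  # partition -> committed object names

    def read_epoch(self, partition):
        # Partitions that have never been compacted start at epoch 0.
        return self.epochs.get(partition, 0)

    def commit(self, partition, expected_epoch, new_objects):
        """Accept the commit only if the epoch is unchanged since it was read."""
        if self.read_epoch(partition) != expected_epoch:
            return False  # another compaction won the race; this result is stale
        self.objects[partition] = list(new_objects)
        self.epochs[partition] = expected_epoch + 1
        return True


def priority(dirty_ratio, ms_since_dirty, max_compaction_lag_ms):
    """Assumed ranking: partitions past the lag bound first, then highest dirty ratio."""
    overdue = ms_since_dirty >= max_compaction_lag_ms
    # heapq is a min-heap, so smaller tuples are popped first.
    return (0 if overdue else 1, -dirty_ratio)


def build_queue(candidates, max_compaction_lag_ms):
    """candidates: iterable of (partition, dirty_ratio, ms_since_dirty)."""
    heap = [(priority(d, age, max_compaction_lag_ms), name)
            for name, d, age in candidates]
    heapq.heapify(heap)
    return heap


if __name__ == "__main__":
    queue = build_queue(
        [("orders-0", 0.2, 5_000), ("users-3", 0.7, 1_000), ("audit-1", 0.1, 90_000)],
        max_compaction_lag_ms=60_000,
    )
    while queue:
        print(heapq.heappop(queue)[1])

    store = Metastore()
    seen_by_a = store.read_epoch("orders-0")
    seen_by_b = store.read_epoch("orders-0")
    print(store.commit("orders-0", seen_by_a, ["compacted-a"]))  # first writer
    print(store.commit("orders-0", seen_by_b, ["compacted-b"]))  # stale writer
```

In this sketch any worker can pop the next partition from the queue, which mirrors the pull-based model described above, and the second commit is rejected because its epoch no longer matches. A real metastore would need the check and the update to be atomic across nodes; the in-memory dictionary here only shows the shape of the check.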


#distributed-systems #kafka #redpanda

Source: https://www.redpanda.com/blog/how-redpanda-cloud-topics-rethinks-kafka-compaction. 8sync News only summarizes and links to the article; copyright in the content belongs to the author and the original source.
