InfoQ · 0 Hot · 0 comments · 3 min read · 3 hours ago

Cloudflare Details Unified Data Platform Where Billing Workloads Account for 53% of Queries

Cloudflare has published details on Town Lake, its internal unified data platform, and Skipper, an AI-powered analytics agent built on top of it. Town Lake uses a lakehouse architecture combining Apache Trino, Apache Iceberg, Cloudflare R2 object storage, and DataHub for metadata management. This enables cross-system SQL queries that span Postgres, ClickHouse, Kafka, BigQuery, and object storage without moving data. Under a default-closed governance model, new datasets undergo automated PII scanning and human review before access is granted. Skipper translates natural language questions into validated SQL queries using schema metadata, transformation lineage, and documentation. Billing workloads dominate usage at 53% of all queries, with 91,760 billing-related queries issued by 324 employees in a measured period. Future plans include deeper integration with internal chat and ticketing systems, expanded Transformer pipeline capabilities, and migration of additional workloads to R2 SQL.
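To make the default-closed governance idea concrete, here is a minimal Python sketch. It is not Cloudflare's implementation: the `Dataset` record, its field names, and the `grant_access` and `can_query` helpers are all hypothetical and exist only to illustrate the rule that access is refused until both the automated PII scan and the human review have passed.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical record for a dataset registered on the platform."""
    name: str
    pii_scan_passed: bool = False   # assumed to be set by an automated scanner
    human_approved: bool = False    # assumed to be set by a human reviewer
    allowed_users: set = field(default_factory=set)


def grant_access(dataset: Dataset, user: str) -> bool:
    """Grant access only when both gates have passed (default-closed)."""
    if not (dataset.pii_scan_passed and dataset.human_approved):
        return False  # new datasets start closed, so the request is refused
    dataset.allowed_users.add(user)
    return True


def can_query(dataset: Dataset, user: str) -> bool:
    """A user may query only datasets they were explicitly granted."""
    return user in dataset.allowed_users
```

A newly registered dataset therefore rejects every access request until both flags are set. The opposite design, default-open, would expose data first and rely on review happening afterwards.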


#big-data #cloudflare #apache-iceberg

Source: https://www.infoq.com/news/2026/07/cloudflare-unified-data-platform. 8sync News only summarizes and links; copyright of the content belongs to the author and the original source.
