NVIDIA Developer · 6 min read · 2 hours ago

Hardware-Rooted AI Security That Won’t Slow You Down

NVIDIA Confidential Computing (CC) secures AI inference workloads at the hardware level. It relies on Blackwell GPUs with a silicon-embedded private key, remote attestation via the NVIDIA Remote Attestation Service (NRAS), with support for AMD SEV-SNP and Intel TDX, and encrypted NVLink interconnects. Benchmarks on HGX B300 running Qwen 3.5 397B at FP8 precision show that CC adds only 2–8% overhead to throughput and time per output token across the concurrency levels tested. Performance optimizations include CC-safe autotuner timing in FlashInfer, async device-to-host copy workers in SGLang, and piecewise CUDA graph support, which reduces the kernel launch overhead that CC mode amplifies.
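To make the 2–8% overhead figure concrete, here is a minimal sketch of how such a number could be derived from paired benchmark runs with CC off and on. The function names and every measurement below are hypothetical placeholders chosen for illustration; they are not NVIDIA's data or tooling. Note that the sign convention differs between the two metrics: throughput overhead is a relative loss, while time-per-output-token (TPOT) overhead is a relative increase.

```python
def throughput_overhead_pct(baseline_tps: float, cc_tps: float) -> float:
    """Percent of throughput lost with CC on (higher throughput is better)."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (baseline_tps - cc_tps) / baseline_tps * 100.0


def tpot_overhead_pct(baseline_ms: float, cc_ms: float) -> float:
    """Percent increase in time per output token with CC on (lower is better)."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (cc_ms - baseline_ms) / baseline_ms * 100.0


# Hypothetical measurements at three concurrency levels (NOT real benchmark data).
runs = [
    {"concurrency": 8, "base_tps": 1000.0, "cc_tps": 960.0,
     "base_tpot_ms": 10.0, "cc_tpot_ms": 10.5},
    {"concurrency": 32, "base_tps": 3200.0, "cc_tps": 3040.0,
     "base_tpot_ms": 12.0, "cc_tpot_ms": 12.6},
    {"concurrency": 128, "base_tps": 9000.0, "cc_tps": 8460.0,
     "base_tpot_ms": 16.0, "cc_tpot_ms": 17.0},
]

for run in runs:
    tput = throughput_overhead_pct(run["base_tps"], run["cc_tps"])
    tpot = tpot_overhead_pct(run["base_tpot_ms"], run["cc_tpot_ms"])
    print(f"concurrency={run['concurrency']:>4}  "
          f"throughput overhead={tput:.1f}%  TPOT overhead={tpot:.1f}%")
```

Reporting both metrics per concurrency level matters because a fixed per-launch or per-copy cost can weigh differently on latency than on aggregate throughput as load changes.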

#nvidia #ai-inference

Source: https://developer.nvidia.com/blog/hardware-rooted-ai-security-that-wont-slow-you-down. 8sync News only summarizes and links to the article; copyright in the content belongs to the author and the original source.

Recommended for you

DigitalOcean · 7 min · 3 hours ago · AI

Built for Mass Scale: Hard-Won Lessons from Teams Running High Volume Inference Workloads in Production

Leaders from Workato, Hippocratic AI, and ISMG share their experience running high-volume AI inference in production. They emphasize that performance degrades quickly once an AI uses more than 50 tools; that P99 latency endangers patients in clinical voice applications; that AI should not hold admin rights but should instead operate under time-limited delegation for each action; and that postponing the structuring of data and processes until after adopting AI leaves a business two years behind on its operating model. The group agreed that scaling AI inference is an infrastructure and governance problem, not a model problem.

Practical lessons from teams building AI at large scale can help you avoid mistakes that are costly in time and money when designing an inference system, so that you can optimize performance and safety from the build stage.

AI inference is obviously profitable

NVIDIA BioNeMo Agent Toolkit Brings Accelerated AI to Life Sciences Researchers in Claude Science

OpenAI and Broadcom build a chip to rival Nvidia’s Blackwell

Claude Meets Blackwell Ultra: Anthropic’s Models Now Run on NVIDIA GB300 in Azure

“Bring it to our shop”: Workday’s pitch for keeping AI agents close to your most valuable data

How Businesses Are Building Specialized AI They Can Trust

Nvidia offers AI startups compute now, payment later