
Amazon SageMaker AI cuts generative AI inference scale-out time by up to half with automatic container image caching

Amazon SageMaker Inference now automatically caches container images on instances during scale-out events, cutting scale-out time for generative AI workloads by up to half. Previously, every new instance had to pull large container images (10 GB or more) from Amazon ECR, which added several minutes of delay. With automatic caching, images are pre-pulled, so new instances can start serving traffic without waiting on the image pull. No configuration changes are required: the service caches whatever image URI is specified in the endpoint or inference component configuration. The feature supports accelerator instance types, single-model endpoints, and inference component-based endpoints, and is available in all AWS commercial Regions where SageMaker Inference is supported.
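For readers unsure where that image URI lives, the following is a minimal sketch, not taken from the announcement, of how a container image is typically named when registering a model through the boto3 `create_model` call. The model name, account ID, repository, and role ARN are hypothetical placeholders, and the helper function names are illustrative only. Nothing in the request opts in to caching, which matches the summary's statement that no configuration changes are needed.

```python
def build_create_model_request(model_name: str, image_uri: str, role_arn: str) -> dict:
    """Build keyword arguments for the SageMaker CreateModel API.

    The container image is referenced only by its URI; per the summary above,
    this is the URI the service would cache, with no extra caching flag.
    """
    return {
        "ModelName": model_name,
        "PrimaryContainer": {"Image": image_uri},
        "ExecutionRoleArn": role_arn,
    }


def create_model(request: dict) -> dict:
    """Send the request to SageMaker (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    client = boto3.client("sagemaker")
    return client.create_model(**request)


# Hypothetical placeholder values, for illustration only.
request = build_create_model_request(
    model_name="example-llm",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-inference:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
)
```

Calling `create_model(request)` would register the model. An endpoint configuration or inference component would then reference it in the usual way.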


#aws #genai

Source: https://aws.amazon.com/about-aws/whats-new/2026/06/sagemakerai-inf-scale-out-time. 8sync News only summarizes and links to the original; copyright in the content belongs to the author and the original source.
