The Next Web · 0 comments · 4 min read · 2 hours ago

AWS GPU prices jump 20% as memory crunch bites

AWS has raised EC2 Capacity Blocks for ML prices by roughly 20% starting July, marking the second hike in six months after a 15% increase in January. The increases target reserved Nvidia GPU blocks used by AI teams for large model training and fine-tuning, while other purchasing options and Amazon's own Trainium chip remain unaffected. The root cause is a high-bandwidth memory shortage that is constraining GPU production and data center capacity globally. The scarcity gives hyperscalers like AWS, Microsoft, and Google pricing power since customers have few alternatives. The same memory crunch is driving up prices across Apple hardware and Xbox, while benefiting memory makers like Micron and SK Hynix. AI teams relying on reserved compute now face rising and unpredictable reservation costs.
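As a rough illustration of how the two hikes stack, the sketch below compounds the 15% January increase with the roughly 20% July increase. It assumes both hikes apply to the same Capacity Block price and that the second is taken on top of the post-January price; the summary does not confirm either point, and `cumulative_increase` is a hypothetical helper name. Under those assumptions the combined rise is about 38%, not 35%.

```python
def cumulative_increase(hikes):
    """Return the combined fractional increase from successive price hikes.

    Each hike is a fraction (0.15 means +15%) applied on top of the
    already-raised price, so the factors multiply rather than add.
    """
    factor = 1.0
    for hike in hikes:
        factor *= 1.0 + hike
    return factor - 1.0


# Figures from the summary: 15% in January, roughly 20% in July.
# Assumption: both hikes hit the same price, one after the other.
combined = cumulative_increase([0.15, 0.20])
print(f"Combined increase since before January: {combined:.0%}")
```

A team budgeting against pre-January prices would therefore be off by more than a third under these assumptions, which is one concrete sense in which reservation costs have become harder to predict.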


#cloud #aws #nvidia #gpu

Source: https://thenextweb.com/news/aws-gpu-prices-increase-memory-shortage. 8sync News only summarizes and links; copyright in the content belongs to the author and the original source.

Recommended for you

sean goedecke · 2 days ago · AI

AI inference is obviously profitable

A rough cost analysis suggests that AI inference is genuinely profitable: the estimated cost is about $1 per million output tokens, well below the $4.50 or more charged by providers such as OpenAI, which implies gross margins of 70–80%. Inference itself makes money, but AI labs such as OpenAI and Anthropic use those profits to offset the high cost of training models.
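The margin claim can be checked with simple arithmetic. The sketch below takes the two figures quoted in the blurb (about $1 cost and $4.50 price per million output tokens) and computes gross margin as (price − cost) / price. `gross_margin` is a hypothetical helper, and the inputs are the blurb's rough estimates, not measured provider costs.

```python
def gross_margin(price, cost):
    """Gross margin as a fraction of price: (price - cost) / price."""
    if price <= 0:
        raise ValueError("price must be positive")
    return (price - cost) / price


# Figures from the blurb, in USD per million output tokens.
# Assumption: both numbers are rough estimates, not audited costs.
margin = gross_margin(price=4.50, cost=1.00)
print(f"Gross margin at $4.50 per million tokens: {margin:.1%}")
```

At these inputs the margin comes out just under 78%, inside the quoted 70–80% range. Because $4.50 is given as a floor on price, a higher price at the same cost would push the margin up, so the lower end of the range presumably reflects a higher cost estimate.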

For developers looking to optimize the cost of their AI applications, this article clarifies how profitable AI inference actually is. That understanding can help you build a sound business model and spot cost savings without depending on support from large companies.

#llm

OpenAI and Broadcom build a chip to rival Nvidia’s Blackwell

Oracle is slashing its workforce as it automates with AI

AI Coding Agent Horror Stories: The 13-Hour AWS Outage

The AI memory crisis just hit DDR2, a standard from 2003, with 60% price hikes

Run Codex in the cloud – DigitalOcean for Codex is now available

Improving the speed and energy-efficiency of AI agents

Qt Canvas Painter: Accelerated performance using paths