NVIDIA, 4 min read, 1 day ago

NVIDIA and AWS Collaborate to Bring AI to Production at Scale

NVIDIA and AWS have announced several joint infrastructure advancements for running enterprise AI at scale. New Amazon EC2 G7 instances, powered by NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs, deliver up to 4.6x the AI inference performance and 2.1x the graphics performance of G6 instances, with support for up to 8 GPUs and 700 Gbps networking. Amazon OpenSearch Serverless now uses NVIDIA cuVS for GPU-accelerated vector indexing by default, enabling up to 10x faster vector indexing at a quarter of the CPU-only cost, so that billion-scale vector databases can be built in under an hour. AWS has also achieved NVIDIA Exemplar Cloud status for GB300 training workloads, certifying that it meets NVIDIA's rigorous performance benchmarks for large-scale AI training.

#aws #nvidia #gpu #vector-search #ai-inference

Source: https://blogs.nvidia.com/blog/nvidia-aws-ai-production-scale. 8sync News only summarizes and links to articles; copyright in the content belongs to the author and the original source.
