TensorX, an Irish startup, has raised €8M to purchase Nvidia Blackwell B300 GPUs and expand its GDPR-compliant AI inference platform across Europe. The platform targets regulated industries (banks, hospitals, and law firms) that cannot send data outside European jurisdiction. It supports over 33 open-weight models and offers an OpenAI-compatible API. The funding comes primarily from Darius Cubed Ventures, with most of the capital going toward hardware rather than headcount. TensorX operates from Dublin and Helsinki, with expansion planned across Ireland, the UK, Germany, France, and the Nordics. The article also notes the inherent tension in 'sovereign AI' claims when the underlying silicon stack remains American (Nvidia) and Asian.
Source: https://thenextweb.com/news/tensorx-8-million-sovereign-ai-infrastructure. 8sync News only summarizes and links to the article; copyright in the content belongs to the author and the original source.
OpenAI and Broadcom are jointly developing a custom AI chip, Jalapeño, to compete with Nvidia Blackwell and Google TPU, targeting inference workloads. The chip has been tested with the GPT-5.3-Codex-Spark model and is expected to be deployed in late 2025, while the HBM shortage is weighing on Broadcom's profit margins.
Developers should read this article to understand how large companies such as OpenAI and Broadcom collaborate on dedicated AI chips that optimize performance for large models such as GPT-5.3, which directly affects the performance and cost of future AI applications.
NVIDIA has launched NVIDIA Agent Toolkit, an open-source, modular platform that helps enterprises build trustworthy, specialized AI agents. The toolkit integrates the Nemotron models (customizable reasoning), NemoClaw (safe-behavior guarantees), and OpenShell (secure execution), and is being deployed in fields such as healthcare, cybersecurity, and chip design.
AI developers should read this article to understand how to build specialized, safe, and controllable agent systems, so they can apply knowledge of open-source models, security, and integration to real-world enterprise projects.
TensorX and Solstice have announced a $1bn financing facility to fund AI hardware and data-centre capacity across the EU, targeting the growing demand for sovereign compute that stays on European soil. Alongside this, Solstice is launching aiUSX, a yield-bearing asset that lets companies put idle AI-earmarked capital to work as infrastructure lending. The product is capped at $5m at launch and is designed to generate yield that can later offset inference costs. Both companies operate within the Deus X Capital ecosystem, which positions itself as the connective tissue enabling the partnership.
NVIDIA's GeForce NOW is running summer membership discounts alongside the Steam Summer Sale, offering $70 off a 12-month Ultimate membership and $35 off a Performance membership. The Ultimate tier delivers RTX 4080/5080-class cloud performance at up to 4K/120fps with DLSS and ray tracing. Six new games join the GeForce NOW library this week, headlined by Devolver Digital's Dark Scrolls and Square Enix's The Adventures of Elliot: The Millennium Tales.
Running three different LLMs simultaneously on a single 8GB GPU fails because llama.cpp pre-allocates the full KV cache upfront, causing OOM errors for the second and third processes. The solution is a C++ daemon called lmxd that implements Connection Admission Control (borrowed from 5G/telecom) as a VRAM ledger: it tracks allocated bytes, enforces a 90% cap, and refuses new agent registrations before any GPU allocation is attempted. The daemon also handles KV-cache swapping to host RAM between agent switches, enabling multiple agents to share one GPU context slot. Additionally, a layer streaming technique using two CUDA streams overlaps compute and weight transfer, achieving ~22-32% wall-clock savings on a GTX 1080. The repo ships the admission control daemon and the streaming primitive as separate, composable components.
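The admission-control idea described above can be sketched as a small ledger. This is an illustrative Python sketch, not lmxd's actual C++ code: the class name `VramLedger`, its methods, and the 3 GiB per-agent reservation are hypothetical. Only the 90% cap and the "refuse before any GPU allocation" behavior come from the summary.

```python
from dataclasses import dataclass, field

GIB = 1024 ** 3


@dataclass
class VramLedger:
    """Hypothetical bookkeeping-only ledger: it never touches the GPU."""
    total_bytes: int
    cap_percent: int = 90  # the 90% cap mentioned in the summary
    reservations: dict[str, int] = field(default_factory=dict)

    @property
    def budget_bytes(self) -> int:
        # Integer math avoids float rounding on large byte counts.
        return self.total_bytes * self.cap_percent // 100

    @property
    def allocated_bytes(self) -> int:
        return sum(self.reservations.values())

    def admit(self, agent_id: str, requested_bytes: int) -> bool:
        """Reserve VRAM for an agent, or refuse without allocating anything."""
        if agent_id in self.reservations or requested_bytes <= 0:
            return False
        if self.allocated_bytes + requested_bytes > self.budget_bytes:
            return False  # rejected up front, so no OOM can happen later
        self.reservations[agent_id] = requested_bytes
        return True

    def release(self, agent_id: str) -> None:
        """Return an agent's reservation to the pool."""
        self.reservations.pop(agent_id, None)


# Assumed scenario: an 8 GiB card and three agents asking for 3 GiB each.
ledger = VramLedger(total_bytes=8 * GIB)
results = [ledger.admit(name, 3 * GIB) for name in ("agent-a", "agent-b", "agent-c")]
print(results)
```

With these assumed numbers the budget is 7.2 GiB, so the first two agents are admitted and the third is refused at registration time rather than failing inside the GPU allocator.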
Unconventional AI, led by former Databricks AI chief Naveen Rao, has released Un0, an image-generation model built on a software simulation of a novel oscillator-based computing architecture. The company claims this architecture could reduce AI inference power consumption by up to 1,000x compared to conventional chips. Un0 performs comparably to state-of-the-art diffusion models like Stable Diffusion, serving as a proof-of-concept for the new architecture. The company plans to release actual chip schematics soon and eventually build a full inference stack, positioning itself as a compute provider running at a fraction of current energy costs.
The RTX 50 series launched with headline features like Multi Frame Generation, Ray Reconstruction, and Neural Texture Compression that were either unfinished or lacked broad software adoption. Months after launch, major fixes and updates are still arriving, and the most compelling exclusive features primarily benefit 4K gaming — a niche most PC gamers don't occupy. RTX 40-series owners already receive the biggest DLSS 4.5 image quality improvements, leaving the 50 series in an awkward middle ground. The author argues the generation feels like a transitional stepping stone, with the upcoming RTX 60 series (Rubin) positioned to be the hardware that fully realizes Nvidia's long-term rendering ambitions.
A reproducible benchmark compares gradient-boosted decision trees (GBDTs) with LLM-based scoring for payment fraud detection across three dimensions: latency, cost, and determinism. On a single CPU core, GBDTs hit a p99 latency of 0.15ms, against roughly 1,200ms for LLMs, which is well outside the 100ms ISO 8583 authorization budget. Cost-wise, GBDTs run ~$54/hour at 50K TPS vs. $16,200–$351,000 for LLM tiers. Determinism is the most critical issue for regulated environments: GBDTs return identical scores on identical inputs, while LLMs produce hundreds of distinct outputs even at temperature=0. The recommended architecture keeps deterministic tree ensembles on the synchronous hot path and deploys LLM agents on the asynchronous cold path for SAR drafting, evidence gathering, and agent-as-a-judge validation before human review. All benchmark code is open-source and reproducible on a laptop.
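The latency and cost figures above can be put on a per-transaction basis with a little arithmetic. The Python sketch below is not taken from the benchmark's code. It assumes the LLM cost range is quoted per hour at the same 50K TPS as the GBDT figure, and that the ~1,200ms latency applies to both LLM tiers; the function names are made up for illustration.

```python
TPS = 50_000              # throughput quoted in the summary
SECONDS_PER_HOUR = 3_600
BUDGET_MS = 100.0         # authorization budget quoted in the summary

TX_PER_HOUR = TPS * SECONDS_PER_HOUR  # 180,000,000 transactions per hour


def cost_per_million(hourly_usd: float) -> float:
    """USD per one million scored transactions at the sustained TPS above."""
    return hourly_usd * 1_000_000 / TX_PER_HOUR


def fits_budget(latency_ms: float) -> bool:
    """True if a single scoring call fits inside the authorization budget."""
    return latency_ms <= BUDGET_MS


# (hourly cost in USD, latency in ms); the LLM rows rely on the assumptions above.
options = {
    "GBDT": (54, 0.15),
    "LLM, cheapest tier": (16_200, 1_200),
    "LLM, most expensive tier": (351_000, 1_200),
}

for name, (hourly_usd, latency_ms) in options.items():
    print(
        f"{name}: ${cost_per_million(hourly_usd):,.2f} per 1M transactions, "
        f"fits {BUDGET_MS:.0f} ms budget: {fits_budget(latency_ms)}"
    )
```

Under those assumptions the GBDT comes to roughly $0.30 per million transactions, against roughly $90 to $1,950 per million for the LLM tiers, and only the GBDT fits the synchronous budget. That is consistent with the summary's recommendation to keep LLMs on the asynchronous path.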