Hugging Face · 5 min read · 3 hours ago

DiScoFormer: One transformer for density and score, across distributions

DiScoFormer (Density and Score Transformer) is a new model from AI2 that estimates both the probability density and the score (the gradient of the log-density) of a distribution from a finite sample in a single forward pass, without retraining per distribution. Built on a transformer architecture with cross-attention, it analytically generalizes kernel density estimation (KDE) and significantly outperforms it in high dimensions, cutting score error by ~6.5x and density error by ~37x in 100 dimensions. It is trained on Gaussian Mixture Models (GMMs), which have closed-form densities, and generalizes to unseen distributions including Laplace and Student-t. A consistency loss between the density and score heads enables test-time adaptation to out-of-distribution inputs. The model is intended as shared infrastructure for generative modeling, Bayesian inference, and scientific computing.
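To make the KDE baseline and the closed-form GMM targets mentioned above concrete, here is a minimal sketch in plain Python. It is not DiScoFormer's code and says nothing about how the model is implemented. It only shows the two classical quantities the summary refers to: the exact log-density and score of a Gaussian mixture, and a Gaussian KDE estimate of the same quantities from a finite sample. The function names, the isotropic-covariance restriction, the example mixture, and the bandwidth value are all assumptions made for illustration.

```python
import math
import random


def gmm_log_density_and_score(x, weights, means, sigmas):
    """Closed-form log p(x) and score (gradient of log p) for an
    isotropic Gaussian mixture. Isotropy is an illustrative assumption."""
    d = len(x)
    log_terms = []
    for w, mu, s in zip(weights, means, sigmas):
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        log_norm = -d * math.log(s) - 0.5 * d * math.log(2.0 * math.pi)
        log_terms.append(math.log(w) + log_norm - 0.5 * sq_dist / s**2)
    # Log-sum-exp keeps the mixture sum numerically stable.
    top = max(log_terms)
    log_p = top + math.log(sum(math.exp(t - top) for t in log_terms))
    # Posterior weight of each component at x.
    resp = [math.exp(t - log_p) for t in log_terms]
    # Score: responsibility-weighted pull toward each component mean.
    score = [
        sum(r * (mu[j] - x[j]) / s**2 for r, mu, s in zip(resp, means, sigmas))
        for j in range(d)
    ]
    return log_p, score


def kde_log_density_and_score(x, samples, bandwidth):
    """Gaussian KDE, written as an equal-weight mixture with one
    component of width `bandwidth` centred on every sample."""
    n = len(samples)
    return gmm_log_density_and_score(x, [1.0 / n] * n, samples, [bandwidth] * n)


def sample_gmm(n, weights, means, sigmas, seed=0):
    """Draw n points from the isotropic mixture (hypothetical helper)."""
    rng = random.Random(seed)
    ks = rng.choices(range(len(weights)), weights=weights, k=n)
    return [[rng.gauss(m, sigmas[k]) for m in means[k]] for k in ks]


if __name__ == "__main__":
    # Hypothetical 2-D mixture; the bandwidth is hand-picked, not tuned.
    weights, means, sigmas = [0.4, 0.6], [[-2.0, 0.0], [1.5, 1.0]], [0.7, 1.0]
    samples = sample_gmm(500, weights, means, sigmas)
    query = [0.5, 0.5]
    true_lp, true_score = gmm_log_density_and_score(query, weights, means, sigmas)
    kde_lp, kde_score = kde_log_density_and_score(query, samples, bandwidth=0.4)
    print("log-density  true:", true_lp, " KDE:", kde_lp)
    print("score        true:", true_score, " KDE:", kde_score)
```

Writing the KDE as a mixture centred on the samples makes the connection between the two explicit: the same closed form yields both the training target (a known GMM) and the classical estimator it is compared against. The KDE estimate depends heavily on the bandwidth and on the sample size, and that dependence is what becomes hard to manage as the dimension grows.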

#machine-learning #diffusion-models

Source: https://huggingface.co/blog/allenai/discoformer. 8sync News only summarizes and links to the article; copyright in the content belongs to the author and the original source.
