37.2° Blog
ArXiv Domain 2025-07-21
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes. Humour, as a complex language form, is derived from myriad aspects of life, whilst existing work on computational humour has focussed almost exclusively on short pun-based jokes. In this work, we investigate whether the ability of Large Language Models (LLMs) to explain humour depends on the particular humour form. We compare models on si ...
ArXiv Domain 2025-07-22
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning. Large language models (LLMs) excel across various tasks, but standard first-order (FO) fine-tuning demands considerable memory, significantly limiting real-world deployment. Recently, zeroth-order (ZO) optimization has stood out as a promising memory-efficient training paradigm, avoiding backward passes and relying solely on forward passes for gradient estimation, m ...
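The zeroth-order idea in the entry above is easy to see in miniature: perturb the parameters in a random direction, compare the loss from two forward passes, and use the difference as a directional gradient estimate, so no backward pass is ever needed. Below is a minimal SPSA-style sketch in numpy; it illustrates the generic ZO-SGD recipe, not this paper's specific method, and `loss_fn` and the toy quadratic are placeholders.

```python
import numpy as np

def zo_sgd_step(params, loss_fn, lr=1e-2, eps=1e-3, seed=0):
    """One SPSA-style zeroth-order step: estimate the gradient from two
    forward passes (no backprop), then apply a plain SGD update."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(params.shape)        # random perturbation direction
    loss_plus = loss_fn(params + eps * z)        # forward pass 1
    loss_minus = loss_fn(params - eps * z)       # forward pass 2
    g = (loss_plus - loss_minus) / (2 * eps)     # directional derivative along z
    return params - lr * g * z                   # SGD update along z

# Toy usage: minimise ||w - 3||^2 using forward passes only.
w = np.zeros(4)
for step in range(500):
    w = zo_sgd_step(w, lambda p: float(np.sum((p - 3.0) ** 2)), seed=step)
print(w)  # close to [3, 3, 3, 3]
```

Because only the scalar loss difference and the RNG seed are needed to reconstruct each update, memory stays near inference level, which is the appeal the excerpt points to.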
ArXiv Domain 2025-07-23
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. The Impact of Language Mixing on Bilingual LLM Reasoning. Proficient multilingual speakers often intentionally switch languages in the middle of a conversation. Similarly, recent reasoning-focused bilingual large language models (LLMs) with strong capabilities in both languages exhibit language mixing—alternating languages within their chain of thought. Discouraging this behavior in DeepSeek-R1 was found to degrade accuracy, suggesting that language mixing m ...
ArXiv Domain 2025-07-24
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. LingBench++: A Linguistically-Informed Benchmark and Reasoning Framework for Multi-Step and Cross-Cultural Inference with LLMs. We propose LingBench++, a linguistically-informed benchmark and reasoning framework designed to evaluate large language models (LLMs) on complex linguistic tasks inspired by the International Linguistics Olympiad (IOL). Unlike prior benchmarks that focus solely on final answer accuracy, LingBench++ provides structured reasoning trac ...
ArXiv Domain 2025-07-25
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning. Large Language Models (LLMs) have become indispensable in real-world applications. However, their widespread adoption raises significant safety concerns, particularly in responding to socially harmful questions. Despite substantial efforts to improve model safety through alignment, aligned models can still have their safety protections undermined by subsequent fine-tuning - even when the ...
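The excerpt does not spell out the mechanism, but the title suggests extrapolating the safety-alignment weight update along its dominant low-rank directions. A heavily hedged numpy sketch of that general idea follows; `k` and `alpha` are illustrative knobs, and the paper's actual procedure may differ.

```python
import numpy as np

def low_rank_extrapolate(w_base, w_aligned, k=8, alpha=0.5):
    """Illustrative low-rank extrapolation: amplify the top-k singular
    directions of the alignment update dW = w_aligned - w_base.
    (Sketch of the general idea only; LoX's exact recipe may differ.)"""
    dw = w_aligned - w_base
    u, s, vt = np.linalg.svd(dw, full_matrices=False)
    dw_low = (u[:, :k] * s[:k]) @ vt[:k, :]   # rank-k part of the update
    return w_aligned + alpha * dw_low          # extrapolate past the aligned point

# Toy usage on a random "layer".
rng = np.random.default_rng(0)
w_base = rng.standard_normal((64, 64))
w_aligned = w_base + 0.1 * rng.standard_normal((64, 64))
w_robust = low_rank_extrapolate(w_base, w_aligned)
```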
ArXiv Domain 2025-07-26
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs. Knowledge distillation can be a cost-effective technique to distill knowledge in Large Language Models, if the teacher output logits can be pre-computed and cached. However, successfully applying this to pre-training remains largely unexplored. In this work, we prove that naive approaches for sparse knowledge distillation such as caching Top-K probabilities, while intuitive, provide biased e ...
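The naive approach this entry warns about is simple to reproduce: cache only the teacher's Top-K probabilities and renormalize, which silently reassigns the tail's probability mass to the kept tokens and distorts the distillation target. A small numpy sketch with a toy vocabulary (names are illustrative):

```python
import numpy as np

def topk_cached_target(teacher_probs, k=4):
    """Naive sparse caching: keep the teacher's Top-K probabilities and
    renormalize. The discarded tail mass is pushed onto the top tokens,
    which is the kind of bias the abstract refers to."""
    idx = np.argsort(teacher_probs)[::-1][:k]  # indices of the top-k tokens
    sparse = np.zeros_like(teacher_probs)
    sparse[idx] = teacher_probs[idx]
    return sparse / sparse.sum()               # renormalize over the kept tokens

# Toy vocabulary of 8 tokens.
teacher = np.array([0.30, 0.25, 0.15, 0.10, 0.08, 0.05, 0.04, 0.03])
target = topk_cached_target(teacher, k=4)
print(target[:4])  # top-4 masses inflated from a total of 0.80 up to 1.0
```

A student trained against `target` systematically over-weights head tokens relative to the true teacher distribution.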
ArXiv Domain 2025-07-27
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs. Knowledge distillation can be a cost-effective technique to distill knowledge in Large Language Models, if the teacher output logits can be pre-computed and cached. However, successfully applying this to pre-training remains largely unexplored. In this work, we prove that naive approaches for sparse knowledge distillation such as caching Top-K probabilities, while intuitive, provide biased e ...
ArXiv Domain 2025-07-28
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts. Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performan ...
ArXiv Domain 2025-07-29
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts. Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performan ...
ArXiv Domain 2025-07-30
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Multi-Agent-as-Judge: Aligning LLM-Agent-Based Automated Evaluation with Multi-Dimensional Human Evaluation. Nearly all human work is collaborative; thus, the evaluation of real-world NLP applications often requires multiple dimensions that align with diverse human perspectives. As real human evaluator resources are often scarce and costly, the emerging "LLM-as-a-judge" paradigm sheds light on a promising approach to leverage LLM agents to believably simulat ...
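As a rough illustration of the multi-dimensional LLM-as-a-judge pattern the entry above describes, the sketch below runs one judge prompt per evaluation dimension and averages the scores. `call_llm`, the dimensions, and the prompt wording are all hypothetical placeholders, not the paper's protocol.

```python
# Minimal multi-dimensional LLM-as-a-judge loop. `call_llm` is a
# hypothetical stand-in for any chat-completion API.
DIMENSIONS = ["helpfulness", "factuality", "coherence"]

def judge(call_llm, task, answer):
    scores = {}
    for dim in DIMENSIONS:
        prompt = (
            f"Rate the following answer for {dim} on a 1-5 scale. "
            f"Reply with a single integer.\n\nTask: {task}\n\nAnswer: {answer}"
        )
        reply = call_llm(prompt)               # one judge agent per dimension
        scores[dim] = int(reply.strip())
    scores["overall"] = sum(scores.values()) / len(DIMENSIONS)
    return scores
```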
ArXiv Domain 2025-07-31
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router. Large Language Models (LLMs) excel at many reasoning tasks but struggle with knowledge-intensive queries due to their inability to dynamically access up-to-date or domain-specific information. Retrieval-Augmented Generation (RAG) has emerged as a promising solution, enabling LLMs to ground their responses in external sources. However, existing RAG methods lack fine-grained control over both the qu ...
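For readers who want the baseline that DeepSieve refines, here is a minimal generic RAG skeleton: embed the query, rank documents by cosine similarity, and put the top hits into the prompt. `embed` and `call_llm` are hypothetical stand-ins for an embedding model and a chat model; DeepSieve's routing is more fine-grained than this.

```python
import numpy as np

def rag_answer(embed, call_llm, question, docs, k=3):
    """Generic retrieval-augmented generation: retrieve top-k documents
    by cosine similarity, then answer conditioned on them."""
    q = embed(question)
    doc_vecs = np.stack([embed(d) for d in docs])
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    top = [docs[i] for i in np.argsort(sims)[::-1][:k]]   # top-k by cosine
    context = "\n\n".join(top)
    return call_llm(f"Answer using only this context:\n{context}\n\nQ: {question}")
```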
ArXiv Domain 2025-08-01
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Past Meets Present: Creating Historical Analogy with Large Language Models. Historical analogies, which compare known past events with contemporary but unfamiliar events, are important abilities that help people make decisions and understand the world. However, research in applied history suggests that people have difficulty finding appropriate analogies, and previous studies in the AI community have also overlooked historical analogies. To fill this gap, in ...
ArXiv Domain 2025-08-02
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model. AI agents built on large language models (LLMs) hold enormous promise, but current practice focuses on a one-task-one-agent approach, which not only falls short of scalability and generality, but also suffers from the fundamental limitations of autoregressive LLMs. On the other hand, humans are general agents who reason by mentally simulating the out ...
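The simulative reasoning loop this excerpt gestures at: propose candidate actions, use a world model to imagine each outcome, and act on the best one. The skeleton below is a generic version of that pattern; `propose_actions`, `simulate`, and `score` are hypothetical stand-ins (all three could be prompts to one LLM), and this is not SimuRA's actual architecture.

```python
# Generic "simulate before acting" loop in the spirit of world-model-based
# agents. All callables are hypothetical placeholders.
def plan_step(state, propose_actions, simulate, score, n_candidates=4):
    best_action, best_value = None, float("-inf")
    for action in propose_actions(state, n_candidates):
        imagined = simulate(state, action)     # world model predicts the outcome
        value = score(imagined)                # evaluate the imagined outcome
        if value > best_value:
            best_action, best_value = action, value
    return best_action                          # act only after mental simulation
```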
ArXiv Domain 2025-08-03
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model. AI agents built on large language models (LLMs) hold enormous promise, but current practice focuses on a one-task-one-agent approach, which not only falls short of scalability and generality, but also suffers from the fundamental limitations of autoregressive LLMs. On the other hand, humans are general agents who reason by mentally simulating the out ...
ArXiv Domain 2025-08-04
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model. AI agents built on large language models (LLMs) hold enormous promise, but current practice focuses on a one-task-one-agent approach, which not only falls short of scalability and generality, but also suffers from the fundamental limitations of autoregressive LLMs. On the other hand, humans are general agents who reason by mentally simulating the out ...