37.2° Blog

ArXiv Domain 2025-12-04

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. PPTArena: A Benchmark for Agentic PowerPoint Editing. We introduce PPTArena, a benchmark for PowerPoint editing that measures reliable modifications to real slides under natural-language instructions. In contrast to image-PDF renderings or text-to-slide generation, PPTArena focuses on in-place editing across 100 decks, 2125 slides, and over 800 targeted edits covering text, charts, tables, animations, and master-level styles. Each case includes a ground-trut ...

ArXiv Domain 2025-12-05

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. SkillFactory: Self-Distillation For Learning Cognitive Behaviors. Reasoning models leveraging long chains of thought employ various cognitive skills, such as verification of their answers, backtracking, retrying by an alternate method, and more. Previous work has shown that when a base language model exhibits these skills, training that model further with reinforcement learning (RL) can teach it to leverage them. How can we get models to leverage skills that ar ...

ArXiv Domain 2025-12-06

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. The Universal Weight Subspace Hypothesis. We show that deep neural networks trained across diverse tasks exhibit remarkably similar low-dimensional parametric subspaces. We provide the first large-scale empirical evidence that demonstrates that neural networks systematically converge to shared spectral subspaces regardless of initialization, task, or domain. Through mode-wise spectral analysis of over 1100 models - including 500 Mistral-7B LoRAs, 500 Vision ...

ArXiv Domain 2025-12-07

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. The Universal Weight Subspace Hypothesis. We show that deep neural networks trained across diverse tasks exhibit remarkably similar low-dimensional parametric subspaces. We provide the first large-scale empirical evidence that demonstrates that neural networks systematically converge to shared spectral subspaces regardless of initialization, task, or domain. Through mode-wise spectral analysis of over 1100 models - including 500 Mistral-7B LoRAs, 500 Vision ...

ArXiv Domain 2025-12-08

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. The Universal Weight Subspace Hypothesis. We show that deep neural networks trained across diverse tasks exhibit remarkably similar low-dimensional parametric subspaces. We provide the first large-scale empirical evidence that demonstrates that neural networks systematically converge to shared spectral subspaces regardless of initialization, task, or domain. Through mode-wise spectral analysis of over 1100 models - including 500 Mistral-7B LoRAs, 500 Vision ...

ArXiv Domain 2025-12-09

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. Enhancing Retrieval-Augmented Generation with Entity Linking for Educational Platforms. In the era of Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) architectures are gaining significant attention for their ability to ground language generation in reliable knowledge sources. Despite their impressive effectiveness in many areas, RAG systems based solely on semantic similarity often fail to ensure factual accuracy in specialized domains, wh ...

ArXiv Domain 2025-12-10

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. Relational Visual Similarity. Humans do not just see attribute similarity; we also see relational similarity. An apple is like a peach because both are reddish fruit, but the Earth is also like a peach: its crust, mantle, and core correspond to the peach’s skin, flesh, and pit. This ability to perceive and recognize relational similarity is argued by cognitive scientists to be what distinguishes humans from other species. Yet, all widely used visual simil ...

ArXiv Domain 2025-09-19

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. Scaling Environments for Organoid Intelligence with LLM-Automated Design and Plasticity-Based Evaluation. As the complexity of artificial agents increases, the design of environments that can effectively shape their behavior and capabilities has become a critical research frontier. We propose a framework that extends this principle to a novel class of agents: biological neural networks in the form of neural organoids. This paper introduces three scalable, cl ...

ArXiv Domain 2025-12-13

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model. We propose a decoupled 3D scene generation framework called SceneMaker in this work. Due to the lack of sufficient open-set de-occlusion and pose estimation priors, existing methods struggle to simultaneously produce high-quality geometry and accurate poses under severe occlusion and open-set settings. To address these issues, we first decouple the de-occlusion mo ...

ArXiv Domain 2025-12-14

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model. We propose a decoupled 3D scene generation framework called SceneMaker in this work. Due to the lack of sufficient open-set de-occlusion and pose estimation priors, existing methods struggle to simultaneously produce high-quality geometry and accurate poses under severe occlusion and open-set settings. To address these issues, we first decouple the de-occlusion mo ...

ArXiv Domain 2025-12-15

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model. We propose a decoupled 3D scene generation framework called SceneMaker in this work. Due to the lack of sufficient open-set de-occlusion and pose estimation priors, existing methods struggle to simultaneously produce high-quality geometry and accurate poses under severe occlusion and open-set settings. To address these issues, we first decouple the de-occlusion mo ...

ArXiv Domain 2025-12-16

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. Particulate: Feed-Forward 3D Object Articulation. We present Particulate, a feed-forward approach that, given a single static 3D mesh of an everyday object, directly infers all attributes of the underlying articulated structure, including its 3D parts, kinematic structure, and motion constraints. At its core is a transformer network, Part Articulation Transformer, which processes a point cloud of the input mesh using a flexible and scalable architecture to p ...

ArXiv Domain 2025-12-17

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. Quantum oracles give an advantage for identifying classical counterfactuals. We show that quantum oracles provide an advantage over classical oracles for answering classical counterfactual questions in causal models, or equivalently, for identifying unknown causal parameters such as distributions over functional dependences. In structural causal models with discrete classical variables, observational data and even ideal interventions generally fail to answer ...

ArXiv Domain 2025-12-18

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs. This paper does not introduce a novel method but instead establishes a straightforward, incremental, yet essential baseline for video temporal grounding (VTG), a core capability in video understanding. While multimodal large language models (MLLMs) excel at various video understanding tasks, the recipes for optimizing them for VTG remain under-explored. In this paper, we present TimeLens, a ...

ArXiv Domain 2025-12-19

Created 2019-06-18 | AI

Data source: ArXiv Domain. LLM Domain Papers. 1. Spatia: Video Generation with Updatable Spatial Memory. Existing video generation models struggle to maintain long-term spatial and temporal consistency due to the dense, high-dimensional nature of video signals. To overcome this limitation, we propose Spatia, a spatial memory-aware video generation framework that explicitly preserves a 3D scene point cloud as persistent spatial memory. Spatia iteratively generates video clips conditioned on this spatial mem ...