ArXiv Domain 2026-02-21
Data source: ArXiv Domain
LLM Domain Papers
1. Sink-Aware Pruning for Diffusion Language Models
Diffusion Language Models (DLMs) incur high inference cost due to iterative denoising, motivating efficient pruning. Existing pruning heuristics, largely inherited from autoregressive (AR) LLMs, typically preserve attention-sink tokens because AR sinks serve as stable global anchors. We show that this assumption does not hold for DLMs: the attention-sink position exhibits substantially higher variance over the ...
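The claim hinges on how stable the sink position is across denoising steps. Below is a minimal sketch of that measurement, assuming per-step attention maps are available and defining the sink as the key position receiving the most total attention mass; the interface and names are hypothetical, not the paper's code:

```python
import torch

def sink_position_per_step(attn_maps):
    """For each denoising step, locate the attention sink: the key
    position receiving the largest total attention mass, summed over
    heads and query positions.

    attn_maps: list of [num_heads, seq_len, seq_len] attention tensors,
               one per denoising step (hypothetical interface).
    """
    positions = []
    for attn in attn_maps:
        mass = attn.sum(dim=(0, 1))          # attention received per key position
        positions.append(mass.argmax().item())
    return positions

def sink_position_variance(attn_maps):
    """Variance of the sink position over denoising steps. Near-zero
    variance would match the stable AR picture; the abstract reports
    substantially higher variance for DLMs."""
    pos = torch.tensor(sink_position_per_step(attn_maps), dtype=torch.float)
    return pos.var().item()
```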
ArXiv Domain 2026-02-24
Data source: ArXiv Domain
LLM Domain Papers
1. VIRAASAT: Traversing Novel Paths for Indian Cultural Reasoning
Large Language Models (LLMs) have made significant progress in reasoning tasks across various domains such as mathematics and coding. However, their performance deteriorates on tasks requiring rich socio-cultural knowledge and diverse local contexts, particularly those involving Indian culture. Existing cultural benchmarks are (i) manually crafted, (ii) contain single-hop questions testing factu ...
ArXiv Domain 2026-02-25
Data source: ArXiv Domain
LLM Domain Papers
1. A Very Big Video Reasoning Suite
Rapid progress in video models has largely focused on visual quality, leaving their reasoning capabilities underexplored. Video reasoning grounds intelligence in spatiotemporally consistent visual environments that go beyond what text can naturally capture, enabling intuitive reasoning over spatiotemporal structure such as continuity, interaction, and causality. However, systematically studying video reasoning and its scalin ...
ArXiv Domain 2026-02-26
Data source: ArXiv Domain
LLM Domain Papers
1. Language Models use Lookbacks to Track Beliefs
How do language models (LMs) represent characters’ beliefs, especially when those beliefs may differ from reality? This question lies at the heart of understanding the Theory of Mind (ToM) capabilities of LMs. We analyze LMs’ ability to reason about characters’ beliefs using causal mediation and abstraction. We construct a dataset, CausalToM, consisting of simple stories where two characters independently chang ...
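The causal-mediation analysis the abstract mentions can be pictured as activation patching: cache a hidden state from a run on a counterfactual story, splice it into the run on the original story, and measure how the output shifts. A minimal sketch assuming a GPT-2-style module layout and token-aligned stories; the paper's exact interventions and model are not specified here:

```python
import torch

def patch_activation(model, base_ids, counter_ids, layer, position):
    """Activation-patching probe: cache one hidden state from the
    counterfactual run, swap it into the base run at the same layer
    and position, and compare next-token logits. Assumes a
    HuggingFace-style causal LM with a model.transformer.h block list
    (GPT-2 layout; adjust for other architectures). base_ids and
    counter_ids must be token-aligned."""
    cached = {}

    def save_hook(_, __, output):
        cached["h"] = output[0][:, position].detach()

    def swap_hook(_, __, output):
        output[0][:, position] = cached["h"]
        return output

    block = model.transformer.h[layer]
    handle = block.register_forward_hook(save_hook)
    with torch.no_grad():
        model(counter_ids)                       # cache counterfactual state
    handle.remove()

    handle = block.register_forward_hook(swap_hook)
    with torch.no_grad():
        patched = model(base_ids).logits[:, -1]  # patched run
    handle.remove()

    with torch.no_grad():
        base = model(base_ids).logits[:, -1]     # clean baseline
    return (patched - base).abs().max().item()   # effect size of the patch
```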
ArXiv Domain 2026-02-28
Data source: ArXiv Domain
LLM Domain Papers
1. Model Agreement via Anchoring
Numerous lines of work aim to control model disagreement: the extent to which two machine learning models disagree in their predictions. We adopt a simple and standard notion of model disagreement in real-valued prediction problems, namely the expected squared difference in predictions between two models trained on independent samples, without any coordination of the training processes. We would like to be able to drive ...
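The quantity being controlled is concrete enough to estimate directly: with f1 and f2 trained on independent samples, disagreement is E[(f1(x) - f2(x))^2]. A Monte Carlo estimator sketch, where `train_fn` and `sample_data` are hypothetical user-supplied callables, not anything from the paper:

```python
import numpy as np

def disagreement(train_fn, sample_data, xs, trials=20, seed=0):
    """Monte Carlo estimate of E[(f1(x) - f2(x))^2], where f1 and f2
    are trained on independent samples with no coordination.
    train_fn(data) -> predict_fn; sample_data(rng) -> training data;
    xs: evaluation inputs."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(trials):
        f1 = train_fn(sample_data(rng))   # model on first independent sample
        f2 = train_fn(sample_data(rng))   # model on second independent sample
        total += np.mean((f1(xs) - f2(xs)) ** 2)
    return total / trials
```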
ArXiv Domain 2026-03-04
Data source: ArXiv Domain
LLM Domain Papers
1. Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training
Training on verifiable symbolic data is a promising way to expand the reasoning frontier of language models beyond what standard pre-training corpora provide. Yet existing procedural generators often rely on fixed puzzles or templates and do not deliver the distributional breadth needed at scale. We introduce Reasoning Core, a scalable suite that procedur ...
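To make "procedural generation of verifiable symbolic data" concrete, here is an illustrative generator that emits random arithmetic expressions paired with ground-truth answers; the construction is a stand-in, not one of Reasoning Core's actual task generators:

```python
import random
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def gen_expression(rng, depth=3):
    """Recursively generate a random arithmetic expression along with
    its exact answer, so every sample is verifiable by construction."""
    if depth == 0:
        n = rng.randint(1, 9)
        return str(n), n
    op = rng.choice(list(OPS))
    left_s, left_v = gen_expression(rng, depth - 1)
    right_s, right_v = gen_expression(rng, depth - 1)
    return f"({left_s} {op} {right_s})", OPS[op](left_v, right_v)

def make_dataset(n, seed=0):
    rng = random.Random(seed)
    return [{"question": q, "answer": a}
            for q, a in (gen_expression(rng) for _ in range(n))]
```

Because every example carries an exactly checkable answer, such data can be verified automatically, which is what makes it usable for both pre-training and reward-checked post-training.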
ArXiv Domain 2026-03-06
Data source: ArXiv Domain
LLM Domain Papers
1. A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development
WebGIS development requires rigor, yet agentic AI frequently fails due to five large language model (LLM) limitations: context constraints, cross-session forgetting, stochasticity, instruction failure, and adaptation rigidity. We propose a dual-helix governance framework reframing these challenges as structural governance problems that model capacity alone cannot resolve. We ...
ArXiv Domain 2026-03-07
Data source: ArXiv Domain
LLM Domain Papers
1. RoboPocket: Improve Robot Policies Instantly with Your Phone
Scaling imitation learning is fundamentally constrained by the efficiency of data collection. While handheld interfaces have emerged as a scalable solution for in-the-wild data acquisition, they predominantly operate in an open-loop manner: operators blindly collect demonstrations without knowing the underlying policy’s weaknesses, leading to inefficient coverage of critical state distributions. C ...
ArXiv Domain 2026-03-10
Data source: ArXiv Domain
LLM Domain Papers
1. BEVLM: Distilling Semantic Knowledge from LLMs into Bird’s-Eye View Representations
The integration of Large Language Models (LLMs) into autonomous driving has attracted growing interest for their strong reasoning and semantic understanding abilities, which are essential for handling complex decision-making and long-tail scenarios. However, existing methods typically feed LLMs with tokens from multi-view and multi-frame images independently, leading to redu ...
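One common way to read "distilling semantic knowledge from LLMs into BEV representations" is a feature-alignment objective between the two spaces. The sketch below is an assumption-laden illustration; the pooling, projection, and cosine loss are choices made here, not BEVLM's stated design:

```python
import torch
import torch.nn.functional as F

def bev_distill_loss(bev_feats, llm_embeds, proj):
    """Illustrative feature-distillation loss: project pooled BEV
    features into the LLM embedding space and pull them toward the
    matching LLM semantic embeddings via cosine similarity.

    bev_feats:  [B, C, H, W] bird's-eye-view feature map
    llm_embeds: [B, D] target semantic embeddings from the LLM
    proj:       nn.Module mapping C -> D
    """
    pooled = bev_feats.mean(dim=(2, 3))          # [B, C] global average pool
    student = proj(pooled)                       # [B, D] projected BEV features
    return 1.0 - F.cosine_similarity(student, llm_embeds, dim=-1).mean()
```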
ArXiv Domain 2026-03-11
Data source: ArXiv Domain
LLM Domain Papers
1. Scale Space Diffusion
Diffusion models degrade images through noise, and reversing this process reveals an information hierarchy across timesteps. Scale-space theory exhibits a similar hierarchy via low-pass filtering. We formalize this connection and show that highly noisy diffusion states contain no more information than small, downsampled images, raising the question of why they must be processed at full resolution. To address this, we fuse scale spaces ...
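If highly noisy states carry no more information than small images, a natural consequence is a timestep-dependent working resolution: denoise coarsely when noise is high and upsample as the state cleans up. A sketch under that reading, with a linear resolution schedule as an assumption (the paper's fusion of scale spaces is likely more involved):

```python
import torch
import torch.nn.functional as F

def resolution_for_timestep(t, num_steps, full_res, min_res=32):
    """Map a diffusion timestep to a working resolution: the noisier
    the state, the smaller the image it is processed at. The linear
    schedule is illustrative, not the paper's exact rule."""
    frac = 1.0 - t / (num_steps - 1)   # 0 at maximum noise, 1 when clean
    res = int(min_res + frac * (full_res - min_res))
    return max(min_res, min(full_res, res))

def denoise_multiscale(x, denoiser, num_steps, full_res):
    """Run reverse diffusion with the state held at low resolution
    during noisy steps and progressively upsampled toward full_res."""
    for t in reversed(range(num_steps)):
        res = resolution_for_timestep(t, num_steps, full_res)
        if x.shape[-1] != res:
            x = F.interpolate(x, size=(res, res), mode="bilinear",
                              align_corners=False)
        x = denoiser(x, t)   # hypothetical denoiser call signature
    return x
```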
ArXiv Domain 2026-03-15
Data source: ArXiv Domain
LLM Domain Papers
1. The Latent Color Subspace: Emergent Order in High-Dimensional Chaos
Text-to-image generation models have advanced rapidly, yet achieving fine-grained control over generated images remains difficult, largely due to limited understanding of how semantic information is encoded. We develop an interpretation of the color representation in the Variational Autoencoder latent space of FLUX.1 [Dev], revealing a structure reflecting Hue, Saturation, and Lightness. We ...
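A hue structure in the latent space suggests a simple probe: regress per-image latent statistics onto hue, encoded on the circle so that 0.0 and 1.0 map to the same color. The probe below is illustrative and not the paper's analysis; `latents` and `hues` are assumed to be precomputed from encoded images:

```python
import numpy as np

def color_direction(latents, hues):
    """Estimate a 'hue plane' in a VAE latent space by least-squares
    regression of per-image latent means onto circular hue coordinates.

    latents: [N, C] per-image mean over spatial dims of the latent
    hues:    [N] hue values in [0, 1]
    Returns [C, 2]: two unit latent directions spanning the hue plane.
    """
    X = latents - latents.mean(axis=0)
    # Encode hue on the circle so 0.0 and 1.0 are the same color.
    y = np.stack([np.cos(2 * np.pi * hues), np.sin(2 * np.pi * hues)], axis=1)
    y = y - y.mean(axis=0)
    dirs, *_ = np.linalg.lstsq(X, y, rcond=None)   # solve X @ dirs ~= y
    return dirs / np.linalg.norm(dirs, axis=0, keepdims=True)
```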