37.2° Blog

ArXiv Domain 2026-01-07

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. EmoNet-Voice: A Fine-Grained, Expert-Verified Benchmark for Speech Emotion Detection
Speech emotion recognition (SER) systems are constrained by existing datasets that typically cover only 6-10 basic emotions, lack scale and diversity, and face ethical challenges when collecting sensitive emotional states. We introduce EMONET-VOICE, a comprehensive resource addressing these limitations through two components: (1) EmoNet-Voice Big, a 5,000-hour multilingual ...

ArXiv Domain 2026-01-11

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. Optimal Lower Bounds for Online Multicalibration
We prove tight lower bounds for online multicalibration, establishing an information-theoretic separation from marginal calibration. In the general setting where group functions can depend on both context and the learner’s predictions, we prove an $\Omega(T^{2/3})$ lower bound on expected multicalibration error using just three disjoint binary groups. This matches the upper bounds of Noarov et al. (2025) up to log ...

ArXiv Domain 2026-01-12

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. Optimal Lower Bounds for Online Multicalibration
We prove tight lower bounds for online multicalibration, establishing an information-theoretic separation from marginal calibration. In the general setting where group functions can depend on both context and the learner’s predictions, we prove an $\Omega(T^{2/3})$ lower bound on expected multicalibration error using just three disjoint binary groups. This matches the upper bounds of Noarov et al. (2025) up to log ...

ArXiv Domain 2026-01-13

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. Manifold limit for the training of shallow graph convolutional neural networks
We study the discrete-to-continuum consistency of the training of shallow graph convolutional neural networks (GCNNs) on proximity graphs of sampled point clouds under a manifold assumption. Graph convolution is defined spectrally via the graph Laplacian, whose low-frequency spectrum approximates that of the Laplace-Beltrami operator of the underlying smooth manifold, and shallow ...

ArXiv Domain 2026-01-14

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head
While the Transformer architecture dominates many fields, its quadratic self-attention complexity hinders its use in large-scale applications. Linear attention offers an efficient alternative, but its direct application often degrades performance, with existing fixes typically re-introducing computational overhead through extra modules (e.g., depthwise separable convolution) that de ...

ArXiv Domain 2026-01-15

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. Modeling LLM Agent Reviewer Dynamics in Elo-Ranked Review System
In this work, we explore the dynamics of Large Language Model (LLM) agent reviewers in an Elo-ranked review system using real-world conference paper submissions. Multiple LLM agent reviewers with different personas engage in multi-round review interactions moderated by an Area Chair. We compare a baseline setting with conditions that incorporate Elo ratings and reviewer memory. Our simulation ...

ArXiv Domain 2026-01-16

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning
Vision-Language-Action (VLA) tasks require reasoning over complex visual scenes and executing adaptive actions in dynamic environments. While recent studies on reasoning VLAs show that explicit chain-of-thought (CoT) can improve generalization, they suffer from high inference latency due to lengthy reasoning traces. We propose Fast-ThinkAct, an efficient reasoning fra ...

ArXiv Domain 2026-01-17

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching
Tool-Integrated Reasoning (TIR) empowers large language models (LLMs) to tackle complex tasks by interleaving reasoning steps with external tool interactions. However, existing reinforcement learning methods typically rely on outcome- or trajectory-level rewards, assigning uniform advantages to all steps within a trajectory. This coarse-grained credit assignment fails to ...

ArXiv Domain 2026-01-18

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching
Tool-Integrated Reasoning (TIR) empowers large language models (LLMs) to tackle complex tasks by interleaving reasoning steps with external tool interactions. However, existing reinforcement learning methods typically rely on outcome- or trajectory-level rewards, assigning uniform advantages to all steps within a trajectory. This coarse-grained credit assignment fails to ...

ArXiv Domain 2026-01-19

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching
Tool-Integrated Reasoning (TIR) empowers large language models (LLMs) to tackle complex tasks by interleaving reasoning steps with external tool interactions. However, existing reinforcement learning methods typically rely on outcome- or trajectory-level rewards, assigning uniform advantages to all steps within a trajectory. This coarse-grained credit assignment fails to ...

ArXiv Domain 2026-01-21

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. How Long Is a Piece of String? A Brief Empirical Analysis of Tokenizers
Frontier LLMs are increasingly utilised across academia, society and industry. A commonly used unit for comparing models, their inputs and outputs, and estimating inference pricing is the token. In general, tokens are used as a stable currency, assumed to be broadly consistent across tokenizers and contexts, enabling direct comparisons. However, tokenization varies significantly across ...

ArXiv Domain 2026-01-08

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. Automated Semantic Rules Detection (ASRD) for Emergent Communication Interpretation
The field of emergent communication within multi-agent systems examines how autonomous agents can independently develop communication strategies, without explicit programming, and adapt them to varied environments. However, few studies have focused on the interpretability of emergent languages. The research presented in this paper proposes an Automated Semantic Rules Detection ...

ArXiv Domain 2026-01-23

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. Iterative Refinement Improves Compositional Image Generation
Text-to-image (T2I) models have achieved remarkable progress, yet they continue to struggle with complex prompts that require simultaneously handling multiple objects, relations, and attributes. Existing inference-time strategies, such as parallel sampling with verifiers or simply increasing denoising steps, can improve prompt alignment but remain inadequate for richly compositional settings where ...

ArXiv Domain 2026-01-24

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition
We study Compositional Video Understanding (CVU), where models must recognize verbs and objects and compose them to generalize to unseen combinations. We find that existing Zero-Shot Compositional Action Recognition (ZS-CAR) models fail primarily due to an overlooked failure mode: object-driven verb shortcuts. Through systematic analysis, we show tha ...

ArXiv Domain 2026-01-25

Created 2019-06-18 | AI

Data source: ArXiv Domain
LLM Domain Papers
1. Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition
We study Compositional Video Understanding (CVU), where models must recognize verbs and objects and compose them to generalize to unseen combinations. We find that existing Zero-Shot Compositional Action Recognition (ZS-CAR) models fail primarily due to an overlooked failure mode: object-driven verb shortcuts. Through systematic analysis, we show tha ...