Articles
927
Tags
25
Categories
16

Home
Content
  • Paper
  • LLMs
  • Jupyter
  • Algorithm
  • PLs
Daily
  • Github
  • HotNews
  • HF
  • Arxiv
Archives
Categories
About
37.2° Blog
ArXiv Domain 2026-01-07
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. EmoNet-Voice: A Fine-Grained, Expert-Verified Benchmark for Speech Emotion Detection. Speech emotion recognition (SER) systems are constrained by existing datasets that typically cover only 6-10 basic emotions, lack scale and diversity, and face ethical challenges when collecting sensitive emotional states. We introduce EMONET-VOICE, a comprehensive resource addressing these limitations through two components: (1) EmoNet-Voice Big, a 5,000-hour multilingual ...
ArXiv Domain 2026-01-11
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Optimal Lower Bounds for Online Multicalibration. We prove tight lower bounds for online multicalibration, establishing an information-theoretic separation from marginal calibration. In the general setting where group functions can depend on both context and the learner’s predictions, we prove an $\Omega(T^{2/3})$ lower bound on expected multicalibration error using just three disjoint binary groups. This matches the upper bounds of Noarov et al. (2025) up to log ...
ArXiv Domain 2026-01-12
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Optimal Lower Bounds for Online Multicalibration. We prove tight lower bounds for online multicalibration, establishing an information-theoretic separation from marginal calibration. In the general setting where group functions can depend on both context and the learner’s predictions, we prove an $\Omega(T^{2/3})$ lower bound on expected multicalibration error using just three disjoint binary groups. This matches the upper bounds of Noarov et al. (2025) up to log ...
ArXiv Domain 2026-01-13
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Manifold limit for the training of shallow graph convolutional neural networks. We study the discrete-to-continuum consistency of the training of shallow graph convolutional neural networks (GCNNs) on proximity graphs of sampled point clouds under a manifold assumption. Graph convolution is defined spectrally via the graph Laplacian, whose low-frequency spectrum approximates that of the Laplace-Beltrami operator of the underlying smooth manifold, and shallow ...
ArXiv Domain 2026-01-14
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head. While the Transformer architecture dominates many fields, its quadratic self-attention complexity hinders its use in large-scale applications. Linear attention offers an efficient alternative, but its direct application often degrades performance, with existing fixes typically re-introducing computational overhead through extra modules (e.g., depthwise separable convolution) that de ...
ArXiv Domain 2026-01-15
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Modeling LLM Agent Reviewer Dynamics in Elo-Ranked Review System. In this work, we explore the Large Language Model (LLM) agent reviewer dynamics in an Elo-ranked review system using real-world conference paper submissions. Multiple LLM agent reviewers with different personas are engaged in multi-round review interactions moderated by an Area Chair. We compare a baseline setting with conditions that incorporate Elo ratings and reviewer memory. Our simulation ...
ArXiv Domain 2026-01-16
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning. Vision-Language-Action (VLA) tasks require reasoning over complex visual scenes and executing adaptive actions in dynamic environments. While recent studies on reasoning VLAs show that explicit chain-of-thought (CoT) can improve generalization, they suffer from high inference latency due to lengthy reasoning traces. We propose Fast-ThinkAct, an efficient reasoning fra ...
ArXiv Domain 2026-01-17
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching. Tool-Integrated Reasoning (TIR) empowers large language models (LLMs) to tackle complex tasks by interleaving reasoning steps with external tool interactions. However, existing reinforcement learning methods typically rely on outcome- or trajectory-level rewards, assigning uniform advantages to all steps within a trajectory. This coarse-grained credit assignment fails to ...
ArXiv Domain 2026-01-18
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching. Tool-Integrated Reasoning (TIR) empowers large language models (LLMs) to tackle complex tasks by interleaving reasoning steps with external tool interactions. However, existing reinforcement learning methods typically rely on outcome- or trajectory-level rewards, assigning uniform advantages to all steps within a trajectory. This coarse-grained credit assignment fails to ...
ArXiv Domain 2026-01-19
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching. Tool-Integrated Reasoning (TIR) empowers large language models (LLMs) to tackle complex tasks by interleaving reasoning steps with external tool interactions. However, existing reinforcement learning methods typically rely on outcome- or trajectory-level rewards, assigning uniform advantages to all steps within a trajectory. This coarse-grained credit assignment fails to ...
ArXiv Domain 2026-01-21
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. How Long Is a Piece of String? A Brief Empirical Analysis of Tokenizers. Frontier LLMs are increasingly utilised across academia, society and industry. A commonly used unit for comparing models, their inputs and outputs, and estimating inference pricing is the token. In general, tokens are used as a stable currency, assumed to be broadly consistent across tokenizers and contexts, enabling direct comparisons. However, tokenization varies significantly across ...
ArXiv Domain 2026-01-08
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Automated Semantic Rules Detection (ASRD) for Emergent Communication Interpretation. The field of emergent communication within multi-agent systems examines how autonomous agents can independently develop communication strategies, without explicit programming, and adapt them to varied environments. However, few studies have focused on the interpretability of emergent languages. The research exposed in this paper proposes an Automated Semantic Rules Detection ...
ArXiv Domain 2026-01-23
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Iterative Refinement Improves Compositional Image Generation. Text-to-image (T2I) models have achieved remarkable progress, yet they continue to struggle with complex prompts that require simultaneously handling multiple objects, relations, and attributes. Existing inference-time strategies, such as parallel sampling with verifiers or simply increasing denoising steps, can improve prompt alignment but remain inadequate for richly compositional settings where ...
ArXiv Domain 2026-01-24
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition. We study Compositional Video Understanding (CVU), where models must recognize verbs and objects and compose them to generalize to unseen combinations. We find that existing Zero-Shot Compositional Action Recognition (ZS-CAR) models fail primarily due to an overlooked failure mode: object-driven verb shortcuts. Through systematic analysis, we show tha ...
ArXiv Domain 2026-01-25
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition. We study Compositional Video Understanding (CVU), where models must recognize verbs and objects and compose them to generalize to unseen combinations. We find that existing Zero-Shot Compositional Action Recognition (ZS-CAR) models fail primarily due to an overlooked failure mode: object-driven verb shortcuts. Through systematic analysis, we show tha ...
Firefly
A firefly flying freely in the AI domain.
Follow Me
Announcement
Welcome to My Personal Blog!
If Not, Please Visit Gitee Mirror.
Recent Post
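Retrieval-Augmented LLM 2024-01-13
LLMs Open Course - 6. Large Models for Text Understanding and Generation 2024-01-10
LLMs Open Course - 5. Efficient Training & Model Compression 2024-01-07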
Categories
  • AI 407
  • Cython 1
  • DSA 24
  • GitHub 220
  • HotNews 69
Tags
DSA, RL, Transformer, LLMs, PaperReading, DeepLearning, CV, GPT, PL, domain, hf, github, hot_news, ArXiv, Domain, AI, GitHub, Trending, HuggingFace, Papers, Weibo Hot Search, HotNews, leetcode, algo
Archives
  • January 2024 5
  • December 2023 14
  • November 2023 26
  • October 2023 1
  • September 2023 4
Info
Total Count :
46187.4k
©2023 - 2026 By Firefly