Articles
305
Tags
24
Categories
15

Home
Content
  • Paper
  • LLMs
  • Jupyter
  • Note
  • Algorithm
  • PLs
Daily
  • Github
  • Weibo
  • HF
  • Arxiv
Archives
Categories
About
37.2° Blog
HuggingFace Papers 2025-07-22
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models. Russian speech synthesis presents distinctive challenges, including vowel reduction, consonant devoicing, variable stress patterns, homograph ambiguity, and unnatural intonation. This paper introduces Balalaika, a novel dataset comprising more than 2,000 hours of studio-quality Russian speech with comprehensive textual annotations, including punctu ...
HuggingFace Papers 2025-07-23
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. HOComp: Interaction-Aware Human-Object Composition. While existing image-guided composition methods may help insert a foreground object onto a user-specified region of a background image, achieving natural blending inside the region with the rest of the image unchanged, we observe that these existing methods often struggle in synthesizing seamless interaction-aware compositions when the task involves human-object interactions. In this paper, we first propo ...
HuggingFace Papers 2025-07-24
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning. To break the context limits of large language models (LLMs) that bottleneck reasoning accuracy and efficiency, we propose the Thread Inference Model (TIM), a family of LLMs trained for recursive and decompositional problem solving, and TIMRUN, an inference runtime enabling long-horizon structured reasoning beyond context limits. Together, TIM hosted on TIMRUN supports virtually unlimit ...
HuggingFace Papers 2025-07-25
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Pixels, Patterns, but No Poetry: To See The World like Humans. Achieving human-like perception and reasoning in Multimodal Large Language Models (MLLMs) remains a central challenge in artificial intelligence. While recent research has primarily focused on enhancing reasoning capabilities in MLLMs, a fundamental question persists: Can Multimodal Large Language Models truly perceive the world as humans do? This paper shifts focus from reasoning to perception ...
HuggingFace Papers 2025-07-26
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. ∇NABLA: Neighborhood Adaptive Block-Level Attention. Recent progress in transformer-based architectures has demonstrated remarkable success in video generation tasks. However, the quadratic complexity of full attention mechanisms remains a critical bottleneck, particularly for high-resolution and long-duration video sequences. In this paper, we propose NABLA, a novel Neighborhood Adaptive Block-Level Attention mechanism that dynamically adapts to spars ...
HuggingFace Papers 2025-07-27
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. ∇NABLA: Neighborhood Adaptive Block-Level Attention. Recent progress in transformer-based architectures has demonstrated remarkable success in video generation tasks. However, the quadratic complexity of full attention mechanisms remains a critical bottleneck, particularly for high-resolution and long-duration video sequences. In this paper, we propose NABLA, a novel Neighborhood Adaptive Block-Level Attention mechanism that dynamically adapts to spars ...
HuggingFace Papers 2025-07-28
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. The Geometry of LLM Quantization: GPTQ as Babai’s Nearest Plane Algorithm. Quantizing the weights of large language models (LLMs) from 16-bit to lower bitwidth is the de facto approach to deploy massive transformers onto more affordable accelerators. GPTQ emerged as one of the standard methods for one-shot post-training quantization at LLM scale. Yet, its inner workings are described as a sequence of ad-hoc algebraic updates that obscure any geometric mean ...
HuggingFace Papers 2025-07-29
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Deep Researcher with Test-Time Diffusion. Deep research agents, powered by Large Language Models (LLMs), are rapidly advancing; yet, their performance often plateaus when generating complex, long-form research reports using generic test-time scaling algorithms. Drawing inspiration from the iterative nature of human research, which involves cycles of searching, reasoning, and revision, we propose the Test-Time Diffusion Deep Researcher (TTD-DR). This novel ...
HuggingFace Papers 2025-07-30
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. MOVE: Motion-Guided Few-Shot Video Object Segmentation. This work addresses motion-guided few-shot video object segmentation (FSVOS), which aims to segment dynamic objects in videos based on a few annotated examples with the same motion patterns. Existing FSVOS datasets and methods typically focus on object categories, which are static attributes that ignore the rich temporal dynamics in videos, limiting their application in scenarios requiring motion unde ...
HuggingFace Papers 2025-07-31
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation. Referring audio-visual segmentation (RAVS) has recently seen significant advancements, yet challenges remain in integrating multimodal information and deeply understanding and reasoning about audiovisual content. To extend the boundaries of RAVS and facilitate future research in this field, we propose Omnimodal Referring Audio-Visual Segmentation (OmniAVS), a new dataset co ...
HuggingFace Papers 2025-08-01
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving. LLMs have demonstrated strong mathematical reasoning abilities by leveraging reinforcement learning with long chain-of-thought, yet they continue to struggle with theorem proving due to the lack of clear supervision signals when solely using natural language. Dedicated domain-specific languages like Lean provide clear supervision via formal verification of proofs, enabling effective train ...
HuggingFace Papers 2025-08-02
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving. LLMs have demonstrated strong mathematical reasoning abilities by leveraging reinforcement learning with long chain-of-thought, yet they continue to struggle with theorem proving due to the lack of clear supervision signals when solely using natural language. Dedicated domain-specific languages like Lean provide clear supervision via formal verification of proofs, enabling effective train ...
HuggingFace Papers 2025-08-03
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving. LLMs have demonstrated strong mathematical reasoning abilities by leveraging reinforcement learning with long chain-of-thought, yet they continue to struggle with theorem proving due to the lack of clear supervision signals when solely using natural language. Dedicated domain-specific languages like Lean provide clear supervision via formal verification of proofs, enabling effective train ...
HuggingFace Papers 2025-08-04
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving. LLMs have demonstrated strong mathematical reasoning abilities by leveraging reinforcement learning with long chain-of-thought, yet they continue to struggle with theorem proving due to the lack of clear supervision signals when solely using natural language. Dedicated domain-specific languages like Lean provide clear supervision via formal verification of proofs, enabling effective train ...
HuggingFace Papers 2025-08-05
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models. Diffusion Large Language Models (DLLMs) are emerging as a powerful alternative to the dominant Autoregressive Large Language Models, offering efficient parallel generation and capable global context modeling. However, the practical application of DLLMs is hindered by a critical architectural constraint: the need for a statically predefined generation length. This static length all ...
Firefly
A firefly flying freely in the AI domain.
Announcement
Welcome to my personal blog!
If the page does not load, please visit the Gitee mirror.
Recent Post
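Retrieval-Augmented LLMs (2024-01-13)
LLMs Open Course - 6. Large Models for Text Understanding and Generation (2024-01-10)
LLMs Open Course - 5. Efficient Training & Model Compression (2024-01-07)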
Categories
  • AI (93)
  • Cython (1)
  • DSA (24)
  • GitHub (62)
  • LLMs (16)
Tags
DSARLTransformerLLMsPaperReadingDeepLearningCVGPTPLdomaingithubhfweiboArXivDomainAIGitHubTrendingHuggingFacePapers微博热搜leetcodealgo
Archives
  • January 2024 (5)
  • December 2023 (14)
  • November 2023 (26)
  • October 2023 (1)
  • September 2023 (4)
Info
Articles: 305
Total Count: 11109.8k
©2023 - 2025 By Firefly