Articles
497
Tags
24
Categories
15

Home
Content
  • Paper
  • LLMs
  • Jupyter
  • Algorithm
  • PLs
Daily
  • Github
  • Weibo
  • HF
  • Arxiv
Archives
Categories
About
37.2° Blog
HuggingFace Papers 2025-07-29
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Deep Researcher with Test-Time Diffusion. Deep research agents, powered by Large Language Models (LLMs), are rapidly advancing; yet, their performance often plateaus when generating complex, long-form research reports using generic test-time scaling algorithms. Drawing inspiration from the iterative nature of human research, which involves cycles of searching, reasoning, and revision, we propose the Test-Time Diffusion Deep Researcher (TTD-DR). This novel ...
HuggingFace Papers 2025-08-01
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving. LLMs have demonstrated strong mathematical reasoning abilities by leveraging reinforcement learning with long chain-of-thought, yet they continue to struggle with theorem proving due to the lack of clear supervision signals when solely using natural language. Dedicated domain-specific languages like Lean provide clear supervision via formal verification of proofs, enabling effective train ...
HuggingFace Papers 2025-08-02
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving. LLMs have demonstrated strong mathematical reasoning abilities by leveraging reinforcement learning with long chain-of-thought, yet they continue to struggle with theorem proving due to the lack of clear supervision signals when solely using natural language. Dedicated domain-specific languages like Lean provide clear supervision via formal verification of proofs, enabling effective train ...
HuggingFace Papers 2025-08-03
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving. LLMs have demonstrated strong mathematical reasoning abilities by leveraging reinforcement learning with long chain-of-thought, yet they continue to struggle with theorem proving due to the lack of clear supervision signals when solely using natural language. Dedicated domain-specific languages like Lean provide clear supervision via formal verification of proofs, enabling effective train ...
HuggingFace Papers 2025-08-04
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving. LLMs have demonstrated strong mathematical reasoning abilities by leveraging reinforcement learning with long chain-of-thought, yet they continue to struggle with theorem proving due to the lack of clear supervision signals when solely using natural language. Dedicated domain-specific languages like Lean provide clear supervision via formal verification of proofs, enabling effective train ...
HuggingFace Papers 2025-08-05
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models. Diffusion Large Language Models (DLLMs) are emerging as a powerful alternative to the dominant Autoregressive Large Language Models, offering efficient parallel generation and capable global context modeling. However, the practical application of DLLMs is hindered by a critical architectural constraint: the need for a statically predefined generation length. This static length all ...
HuggingFace Papers 2025-08-06
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Qwen-Image Technical Report. We present Qwen-Image, an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing. To address the challenges of complex text rendering, we design a comprehensive data pipeline that includes large-scale data collection, filtering, annotation, synthesis, and balancing. Moreover, we adopt a progressive training strategy that starts with non-text-to ...
HuggingFace Papers 2025-08-08
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification. We present a simple yet theoretically motivated improvement to Supervised Fine-Tuning (SFT) for the Large Language Model (LLM), addressing its limited generalization compared to reinforcement learning (RL). Through mathematical analysis, we reveal that standard SFT gradients implicitly encode a problematic reward structure that may severely restrict the generaliza ...
HuggingFace Papers 2025-08-07
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference. We present Seed Diffusion Preview, a large-scale language model based on discrete-state diffusion, offering remarkably fast inference speed. Thanks to non-sequential, parallel generation, discrete diffusion models provide a notable speedup to mitigate the inherent latency of token-by-token decoding, as demonstrated recently (e.g., Mercury Coder, Gemini Diffusion). Seed Diffus ...
HuggingFace Papers 2025-08-11
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification. We present a simple yet theoretically motivated improvement to Supervised Fine-Tuning (SFT) for the Large Language Model (LLM), addressing its limited generalization compared to reinforcement learning (RL). Through mathematical analysis, we reveal that standard SFT gradients implicitly encode a problematic reward structure that may severely restrict the generaliza ...
HuggingFace Papers 2025-08-12
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models. We present GLM-4.5, an open-source Mixture-of-Experts (MoE) large language model with 355B total parameters and 32B activated parameters, featuring a hybrid reasoning method that supports both thinking and direct response modes. Through multi-stage training on 23T tokens and comprehensive post-training with expert model iteration and reinforcement learning, GLM-4.5 achieves strong performance ...
HuggingFace Papers 2025-08-13
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability. Large Language Model (LLM) based listwise ranking has shown superior performance in many passage ranking tasks. With the development of Large Reasoning Models, many studies have demonstrated that step-by-step reasoning during test-time helps improve listwise ranking performance. However, due to the scarcity of reasoning-intensive training data, existing rerankers perform poorly in many c ...
HuggingFace Papers 2025-08-14
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent. Web agents such as Deep Research have demonstrated superhuman cognitive abilities, capable of solving highly challenging information-seeking problems. However, most research remains primarily text-centric, overlooking visual information in the real world. This makes multimodal Deep Research highly challenging, as such agents require much stronger reasoning abilities in perception, lo ...
HuggingFace Papers 2025-08-16
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning. Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities across various tasks, but still struggle with complex mathematical reasoning. Existing research primarily focuses on dataset construction and method optimization, often overlooking two critical aspects: comprehensive knowledge-driven design and model-centric data space modeling. In this ...
HuggingFace Papers 2025-08-17
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers: 1. We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning. Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities across various tasks, but still struggle with complex mathematical reasoning. Existing research primarily focuses on dataset construction and method optimization, often overlooking two critical aspects: comprehensive knowledge-driven design and model-centric data space modeling. In this ...
Firefly
A firefly flying freely in the AI domain.
©2023 - 2025 By Firefly