Articles
724
Tags
25
Categories
16

Home
Content
  • Paper
  • LLMs
  • Jupyter
  • Algorithm
  • PLs
Daily
  • Github
  • HotNews
  • HF
  • Arxiv
Archives
Categories
About
37.2° Blog
ArXiv Domain 2025-12-15
Created 2019-06-18 | AI
Data source: ArXiv Domain
LLM Domain Papers
1. SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model
We propose a decoupled 3D scene generation framework called SceneMaker in this work. Due to the lack of sufficient open-set de-occlusion and pose estimation priors, existing methods struggle to simultaneously produce high-quality geometry and accurate poses under severe occlusion and open-set settings. To address these issues, we first decouple the de-occlusion mo ...
HuggingFace Papers 2025-08-10
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
We present a simple yet theoretically motivated improvement to Supervised Fine-Tuning (SFT) for the Large Language Model (LLM), addressing its limited generalization compared to reinforcement learning (RL). Through mathematical analysis, we reveal that standard SFT gradients implicitly encode a problematic reward structure that may severely restrict the generaliza ...
HuggingFace Papers 2025-08-21
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL
Recent advances in large language models (LLMs) and multi-agent systems have demonstrated remarkable capabilities in complex problem-solving tasks such as deep research, vibe coding, and mathematical reasoning. However, most existing multi-agent systems are built upon manual prompt/workflow engineering with sophisticated agent frameworks, making them computatio ...
HuggingFace Papers 2025-08-09
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
We present a simple yet theoretically motivated improvement to Supervised Fine-Tuning (SFT) for the Large Language Model (LLM), addressing its limited generalization compared to reinforcement learning (RL). Through mathematical analysis, we reveal that standard SFT gradients implicitly encode a problematic reward structure that may severely restrict the generaliza ...
Firefly
A firefly flying freely in the AI domain.
Announcement
Welcome to my personal blog!
If the page does not load, please visit the Gitee mirror.
Recent Post
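Retrieval-Augmented LLMs (2024-01-13)
LLMs Open Course - 6. Large Models for Text Understanding and Generation (2024-01-10)
LLMs Open Course - 5. Efficient Training & Model Compression (2024-01-07)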
Categories
  • AI (306)
  • Cython (1)
  • DSA (24)
  • GitHub (169)
  • HotNews (18)
Tags
DSA, RL, PaperReading, DeepLearning, Transformer, CV, GPT, LLMs, PL, domain, github, hf, hot_news, ArXivDomain, AI, GitHubTrending, HuggingFacePapers, 微博热搜 (Weibo Hot Search), HotNews, leetcode, algo
Archives
  • January 2024 (5)
  • December 2023 (14)
  • November 2023 (26)
  • October 2023 (1)
  • September 2023 (4)
©2023 - 2025 By Firefly