Articles
935
Tags
25
Categories
16

Home
Content
  • Paper
  • LLMs
  • Jupyter
  • Algorithm
  • PLs
Daily
  • Github
  • HotNews
  • HF
  • Arxiv
Archives
Categories
About
37.2° Blog
HuggingFace Papers 2025-11-26
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. General Agentic Memory Via Deep Research
Memory is critical for AI agents, yet the widely-adopted static memory, aiming to create readily available memory in advance, is inevitably subject to severe information loss. To address this limitation, we propose a novel framework called general agentic memory (GAM). GAM follows the principle of “just-in time (JIT) compilation” where it focuses on creating optimized contexts for its client at ru ...
HuggingFace Papers 2025-11-28
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Video Generation Models Are Good Latent Reward Models
Reward feedback learning (ReFL) has proven effective for aligning image generation with human preferences. However, its extension to video generation faces significant challenges. Existing video reward models rely on vision-language models designed for pixel-space inputs, confining ReFL optimization to near-complete denoising steps after computationally expensive VAE decoding. This pixel-space approach ...
HuggingFace Papers 2025-11-29
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Video Generation Models Are Good Latent Reward Models
Reward feedback learning (ReFL) has proven effective for aligning image generation with human preferences. However, its extension to video generation faces significant challenges. Existing video reward models rely on vision-language models designed for pixel-space inputs, confining ReFL optimization to near-complete denoising steps after computationally expensive VAE decoding. This pixel-space approach ...
HuggingFace Papers 2025-11-30
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Video Generation Models Are Good Latent Reward Models
Reward feedback learning (ReFL) has proven effective for aligning image generation with human preferences. However, its extension to video generation faces significant challenges. Existing video reward models rely on vision-language models designed for pixel-space inputs, confining ReFL optimization to near-complete denoising steps after computationally expensive VAE decoding. This pixel-space approach ...
HuggingFace Papers 2025-12-01
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Video Generation Models Are Good Latent Reward Models
Reward feedback learning (ReFL) has proven effective for aligning image generation with human preferences. However, its extension to video generation faces significant challenges. Existing video reward models rely on vision-language models designed for pixel-space inputs, confining ReFL optimization to near-complete denoising steps after computationally expensive VAE decoding. This pixel-space approach ...
HuggingFace Papers 2025-12-02
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
The landscape of high-performance image generation models is currently dominated by proprietary systems, such as Nano Banana Pro and Seedream 4.0. Leading open-source alternatives, including Qwen-Image, Hunyuan-Image-3.0 and FLUX.2, are characterized by massive parameter counts (20B to 80B), making them impractical for inference, and fine-tuning on consumer-gr ...
HuggingFace Papers 2025-12-03
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence
Large language models (LLMs) have fundamentally transformed automated software development by enabling direct translation of natural language descriptions into functional code, driving commercial adoption through tools like GitHub Copilot (Microsoft), Cursor (Anysphere), Trae (ByteDance), and Claude Code (Anthropic). While the field has evolved dramatically from ...
HuggingFace Papers 2025-12-04
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
We introduce DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance. The key technical breakthroughs of DeepSeek-V3.2 are as follows: (1) DeepSeek Sparse Attention (DSA): We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance in long-context scenario ...
HuggingFace Papers 2025-12-05
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Qwen3-VL Technical Report
We introduce Qwen3-VL, the most capable vision-language model in the Qwen series to date, achieving superior performance across a broad range of multimodal benchmarks. It natively supports interleaved contexts of up to 256K tokens, seamlessly integrating text, images, and video. The model family includes both dense (2B/4B/8B/32B) and mixture-of-experts (30B-A3B/235B-A22B) variants to accommodate diverse latency-quality trade-offs ...
HuggingFace Papers 2025-12-07
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
Existing diffusion-based video generation methods are fundamentally constrained by sequential computation and long-horizon inconsistency, limiting their practical adoption in real-time, streaming audio-driven avatar synthesis. We present Live Avatar, an algorithm-system co-designed framework that enables efficient, high-fidelity, and infinite-length avatar generation usin ...
HuggingFace Papers 2025-12-08
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
Existing diffusion-based video generation methods are fundamentally constrained by sequential computation and long-horizon inconsistency, limiting their practical adoption in real-time, streaming audio-driven avatar synthesis. We present Live Avatar, an algorithm-system co-designed framework that enables efficient, high-fidelity, and infinite-length avatar generation usin ...
HuggingFace Papers 2025-12-09
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows
Recent advances in large multi-modal generative models have demonstrated impressive capabilities in multi-modal generation, including image and video generation. These models are typically built upon multi-step frameworks like diffusion and flow matching, which inherently limits their inference efficiency (requiring 40-100 Number of Function Evaluations (NFEs)). While vari ...
HuggingFace Papers 2025-12-10
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning
We introduce Native Parallel Reasoner (NPR), a teacher-free framework that enables Large Language Models (LLMs) to self-evolve genuine parallel reasoning capabilities. NPR transforms the model from sequential emulation to native parallel cognition through three key innovations: 1) a self-distilled progressive training paradigm that transitions from “cold-start” ...
HuggingFace Papers 2025-12-11
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
We present Wan-Move, a simple and scalable framework that brings motion control to video generative models. Existing motion-controllable methods typically suffer from coarse control granularity and limited scalability, leaving their outputs insufficient for practical use. We narrow this gap by achieving precise and high-quality motion control. Our core idea is to directly make t ...
HuggingFace Papers 2025-12-12
Created 2019-06-18 | AI
Data source: HuggingFace Papers
Latest Papers
1. StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
The growing adoption of XR devices has fueled strong demand for high-quality stereo video, yet its production remains costly and artifact-prone. To address this challenge, we present StereoWorld, an end-to-end framework that repurposes a pretrained video generator for high-fidelity monocular-to-stereo video generation. Our framework jointly conditions the model on the monocular video input w ...
Firefly
A firefly flying freely in the AI domain.
Announcement
Welcome to My Personal Blog!
If Not, Please Visit Gitee Mirror.
Recent Post
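Retrieval-Augmented LLM (2024-01-13)
LLMs Open Course - 6. Large Models for Text Understanding and Generation (2024-01-10)
LLMs Open Course - 5. Efficient Training & Model Compression (2024-01-07)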
Categories
  • AI (411)
  • Cython (1)
  • DSA (24)
  • GitHub (222)
  • HotNews (71)
Tags
DSA, RL, Transformer, LLMs, PaperReading, DeepLearning, CV, GPT, PL, domain, github, hf, hot_news, ArXivDomain, AI, GitHubTrending, HuggingFacePapers, 微博热搜 (Weibo Hot Search), HotNews, leetcode, algo
Archives
  • January 2024 (5)
  • December 2023 (14)
  • November 2023 (26)
  • October 2023 (1)
  • September 2023 (4)
Info
Article: 935
Total Count: 46916.8k
©2023 - 2026 By Firefly