Articles: 927
Tags: 25
Categories: 16

Home
Content
  • Paper
  • LLMs
  • Jupyter
  • Algorithm
  • PLs
Daily
  • Github
  • HotNews
  • HF
  • Arxiv
Archives
Categories
About
37.2° Blog
ArXiv Domain 2026-01-26
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition. We study Compositional Video Understanding (CVU), where models must recognize verbs and objects and compose them to generalize to unseen combinations. We find that existing Zero-Shot Compositional Action Recognition (ZS-CAR) models fail primarily due to an overlooked failure mode: object-driven verb shortcuts. Through systematic analysis, we show tha ...
ArXiv Domain 2026-01-27
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. A Scalable Measure of Loss Landscape Curvature for Analyzing the Training Dynamics of LLMs. Understanding the curvature evolution of the loss landscape is fundamental to analyzing the training dynamics of neural networks. The most commonly studied measure, Hessian sharpness ($\lambda_{\max}^H$), the largest eigenvalue of the loss Hessian, determines local training stability and interacts with the learning rate throughout training. Despite its significance in ana ...
ArXiv Domain 2026-01-20
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. How Long Is a Piece of String? A Brief Empirical Analysis of Tokenizers. Frontier LLMs are increasingly utilised across academia, society and industry. A commonly used unit for comparing models, their inputs and outputs, and estimating inference pricing is the token. In general, tokens are used as a stable currency, assumed to be broadly consistent across tokenizers and contexts, enabling direct comparisons. However, tokenization varies significantly across ...
ArXiv Domain 2026-01-10
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Optimal Lower Bounds for Online Multicalibration. We prove tight lower bounds for online multicalibration, establishing an information-theoretic separation from marginal calibration. In the general setting where group functions can depend on both context and the learner’s predictions, we prove an $\Omega(T^{2/3})$ lower bound on expected multicalibration error using just three disjoint binary groups. This matches the upper bounds of Noarov et al. (2025) up to log ...
ArXiv Domain 2026-01-29
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Evaluation of Oncotimia: An LLM based system for supporting tumour boards. Multidisciplinary tumour boards (MDTBs) play a central role in oncology decision-making but require manual processes and structuring large volumes of heterogeneous clinical information, resulting in a substantial documentation burden. In this work, we present ONCOTIMIA, a modular and secure clinical tool designed to integrate generative artificial intelligence (GenAI) into oncology wo ...
ArXiv Domain 2026-01-31
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. RedSage: A Cybersecurity Generalist LLM. Cybersecurity operations demand assistant LLMs that support diverse workflows without exposing sensitive data. Existing solutions either rely on proprietary APIs with privacy risks or on open models lacking domain adaptation. To bridge this gap, we curate 11.8B tokens of cybersecurity-focused continual pretraining data via large-scale web filtering and manual collection of high-quality resources, spanning 28.6K docume ...
ArXiv Domain 2026-01-28
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. ctELM: Decoding and Manipulating Embeddings of Clinical Trials with Embedding Language Models. Text embeddings have become an essential part of a variety of language applications. However, methods for interpreting, exploring and reversing embedding spaces are limited, reducing transparency and precluding potentially valuable generative use cases. In this work, we align Large Language Models to embeddings of clinical trials using the recently reported Embeddi ...
ArXiv Domain 2026-01-22
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. VideoMaMa: Mask-Guided Video Matting via Generative Prior. Generalizing video matting models to real-world videos remains a significant challenge due to the scarcity of labeled data. To address this, we present Video Mask-to-Matte Model (VideoMaMa) that converts coarse segmentation masks into pixel accurate alpha mattes, by leveraging pretrained video diffusion models. VideoMaMa demonstrates strong zero-shot generalization to real-world footage, even though ...
HuggingFace Papers 2025-08-09
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers. 1. On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification. We present a simple yet theoretically motivated improvement to Supervised Fine-Tuning (SFT) for the Large Language Model (LLM), addressing its limited generalization compared to reinforcement learning (RL). Through mathematical analysis, we reveal that standard SFT gradients implicitly encode a problematic reward structure that may severely restrict the generaliza ...
HuggingFace Papers 2025-08-10
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers. 1. On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification. We present a simple yet theoretically motivated improvement to Supervised Fine-Tuning (SFT) for the Large Language Model (LLM), addressing its limited generalization compared to reinforcement learning (RL). Through mathematical analysis, we reveal that standard SFT gradients implicitly encode a problematic reward structure that may severely restrict the generaliza ...
HuggingFace Papers 2025-08-21
Created 2019-06-18 | AI
Data source: HuggingFace Papers. Latest Papers. 1. Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL. Recent advances in large language models (LLMs) and multi-agent systems have demonstrated remarkable capabilities in complex problem-solving tasks such as deep research, vibe coding, and mathematical reasoning. However, most existing multi-agent systems are built upon manual prompt/workflow engineering with sophisticated agent frameworks, making them computatio ...
ArXiv Domain 2026-01-30
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Recursive Language Models. We study allowing large language models (LLMs) to process arbitrarily long prompts through the lens of inference-time scaling. We propose Recursive Language Models (RLMs), a general inference paradigm that treats long prompts as part of an external environment and allows the LLM to programmatically examine, decompose, and recursively call itself over snippets of the prompt. We find that RLMs can successfully process inputs up to tw ...
Firefly
A firefly flying freely in the AI domain.
Announcement
Welcome to My Personal Blog!
If Not, Please Visit Gitee Mirror.
Recent Post
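Retrieval-Augmented LLM (2024-01-13)
LLMs Open Course - 6. Large Models for Text Understanding and Generation (2024-01-10)
LLMs Open Course - 5. Efficient Training & Model Compression (2024-01-07)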
Categories
  • AI (407)
  • Cython (1)
  • DSA (24)
  • GitHub (220)
  • HotNews (69)
Tags
DSA, RL, Transformer, LLMs, PaperReading, DeepLearning, CV, GPT, PL, domain, hf, github, hot_news, ArXivDomain, AI, GitHubTrending, HuggingFacePapers, 微博热搜 (Weibo Hot Search), HotNews, leetcode, algo
Archives
  • January 2024 (5)
  • December 2023 (14)
  • November 2023 (26)
  • October 2023 (1)
  • September 2023 (4)
Info
Total Count: 46187.4k
©2023 - 2026 By Firefly