avatar
Articles
405
Tags
23
Categories
15

Home
Content
  • Paper
  • LLMs
  • Jupyter
  • Algorithm
  • PLs
Daily
  • Github
  • HotNews
  • HF
  • Arxiv
Archives
Categories
About
37.2° Blog
Search
Home
Content
  • Paper
  • LLMs
  • Jupyter
  • Algorithm
  • PLs
Daily
  • Github
  • HotNews
  • HF
  • Arxiv
Archives
Categories
About
ArXiv Domain 2026-01-18
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite MatchingTool-Integrated Reasoning (TIR) empowers large language models (LLMs) to tackle complex tasks by interleaving reasoning steps with external tool interactions. However, existing reinforcement learning methods typically rely on outcome- or trajectory-level rewards, assigning uniform advantages to all steps within a trajectory. This coarse-grained credit assignment fails to ...
ArXiv Domain 2026-01-19
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite MatchingTool-Integrated Reasoning (TIR) empowers large language models (LLMs) to tackle complex tasks by interleaving reasoning steps with external tool interactions. However, existing reinforcement learning methods typically rely on outcome- or trajectory-level rewards, assigning uniform advantages to all steps within a trajectory. This coarse-grained credit assignment fails to ...
ArXiv Domain 2026-01-20
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. How Long Is a Piece of String? A Brief Empirical Analysis of TokenizersFrontier LLMs are increasingly utilised across academia, society and industry. A commonly used unit for comparing models, their inputs and outputs, and estimating inference pricing is the token. In general, tokens are used as a stable currency, assumed to be broadly consistent across tokenizers and contexts, enabling direct comparisons. However, tokenization varies significantly across ...
ArXiv Domain 2026-01-21
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. How Long Is a Piece of String? A Brief Empirical Analysis of TokenizersFrontier LLMs are increasingly utilised across academia, society and industry. A commonly used unit for comparing models, their inputs and outputs, and estimating inference pricing is the token. In general, tokens are used as a stable currency, assumed to be broadly consistent across tokenizers and contexts, enabling direct comparisons. However, tokenization varies significantly across ...
ArXiv Domain 2026-01-22
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. VideoMaMa: Mask-Guided Video Matting via Generative PriorGeneralizing video matting models to real-world videos remains a significant challenge due to the scarcity of labeled data. To address this, we present Video Mask-to-Matte Model (VideoMaMa) that converts coarse segmentation masks into pixel accurate alpha mattes, by leveraging pretrained video diffusion models. VideoMaMa demonstrates strong zero-shot generalization to real-world footage, even though ...
ArXiv Domain 2026-01-23
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. Iterative Refinement Improves Compositional Image GenerationText-to-image (T2I) models have achieved remarkable progress, yet they continue to struggle with complex prompts that require simultaneously handling multiple objects, relations, and attributes. Existing inference-time strategies, such as parallel sampling with verifiers or simply increasing denoising steps, can improve prompt alignment but remain inadequate for richly compositional settings where ...
ArXiv Domain 2026-01-24
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action RecognitionWe study Compositional Video Understanding (CVU), where models must recognize verbs and objects and compose them to generalize to unseen combinations. We find that existing Zero-Shot Compositional Action Recognition (ZS-CAR) models fail primarily due to an overlooked failure mode: object-driven verb shortcuts. Through systematic analysis, we show tha ...
ArXiv Domain 2026-01-25
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action RecognitionWe study Compositional Video Understanding (CVU), where models must recognize verbs and objects and compose them to generalize to unseen combinations. We find that existing Zero-Shot Compositional Action Recognition (ZS-CAR) models fail primarily due to an overlooked failure mode: object-driven verb shortcuts. Through systematic analysis, we show tha ...
ArXiv Domain 2026-01-26
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action RecognitionWe study Compositional Video Understanding (CVU), where models must recognize verbs and objects and compose them to generalize to unseen combinations. We find that existing Zero-Shot Compositional Action Recognition (ZS-CAR) models fail primarily due to an overlooked failure mode: object-driven verb shortcuts. Through systematic analysis, we show tha ...
ArXiv Domain 2026-01-27
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. A Scalable Measure of Loss Landscape Curvature for Analyzing the Training Dynamics of LLMsUnderstanding the curvature evolution of the loss landscape is fundamental to analyzing the training dynamics of neural networks. The most commonly studied measure, Hessian sharpness ($λ_{\max}^H$) — the largest eigenvalue of the loss Hessian — determines local training stability and interacts with the learning rate throughout training. Despite its significance in ana ...
ArXiv Domain 2026-01-28
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. ctELM: Decoding and Manipulating Embeddings of Clinical Trials with Embedding Language ModelsText embeddings have become an essential part of a variety of language applications. However, methods for interpreting, exploring and reversing embedding spaces are limited, reducing transparency and precluding potentially valuable generative use cases. In this work, we align Large Language Models to embeddings of clinical trials using the recently reported Embeddi ...
ArXiv Domain 2026-01-29
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. Evaluation of Oncotimia: An LLM based system for supporting tumour boardsMultidisciplinary tumour boards (MDTBs) play a central role in oncology decision-making but require manual processes and structuring large volumes of heterogeneous clinical information, resulting in a substantial documentation burden. In this work, we present ONCOTIMIA, a modular and secure clinical tool designed to integrate generative artificial intelligence (GenAI) into oncology wo ...
ArXiv Domain 2026-01-30
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. Recursive Language ModelsWe study allowing large language models (LLMs) to process arbitrarily long prompts through the lens of inference-time scaling. We propose Recursive Language Models (RLMs), a general inference paradigm that treats long prompts as part of an external environment and allows the LLM to programmatically examine, decompose, and recursively call itself over snippets of the prompt. We find that RLMs can successfully process inputs up to tw ...
ArXiv Domain 2026-01-31
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. RedSage: A Cybersecurity Generalist LLMCybersecurity operations demand assistant LLMs that support diverse workflows without exposing sensitive data. Existing solutions either rely on proprietary APIs with privacy risks or on open models lacking domain adaptation. To bridge this gap, we curate 11.8B tokens of cybersecurity-focused continual pretraining data via large-scale web filtering and manual collection of high-quality resources, spanning 28.6K docume ...
ArXiv Domain 2026-02-01
Created2019-06-18|AI
数据来源:ArXiv Domain LLM Domain Papers1. RedSage: A Cybersecurity Generalist LLMCybersecurity operations demand assistant LLMs that support diverse workflows without exposing sensitive data. Existing solutions either rely on proprietary APIs with privacy risks or on open models lacking domain adaptation. To bridge this gap, we curate 11.8B tokens of cybersecurity-focused continual pretraining data via large-scale web filtering and manual collection of high-quality resources, spanning 28.6K docume ...
1…222324…27
avatar
Firefly
A firefly flying freely in the AI domain.
Articles
405
Tags
23
Categories
15
Follow Me
Announcement
Welcome to My Personal Blog!
If Not, Please Visit Gitee Mirror.
Recent Post
检索增强LLM2024-01-13
LLMs公开课 - 6.文本理解和生成大模型2024-01-10
LLMs公开课 - 5.高效训练&模型压缩2024-01-07
Categories
  • AI152
  • Cython1
  • DSA24
  • GitHub81
  • HotNews81
Tags
DSARLTransformerLLMsPaperReadingDeepLearningCVGPTPLdomaingithubhfhot_newsGitHubTrendingHuggingFacePapersAIHotNewsleetcodealgoArXivDomain
Archives
  • January 20245
  • December 202314
  • November 202326
  • October 20231
  • September 20234
Info
Article :
405
Run time :
Total Count :
22299.6k
UV :
PV :
Last Push :
©2023 - 2026 By Firefly
Search
Loading the Database