37.2° Blog
ArXiv Domain 2025-07-21
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes. Humour, as a complex language form, is derived from myriad aspects of life, whilst existing work on computational humour has focussed almost exclusively on short pun-based jokes. In this work, we investigate whether the ability of Large Language Models (LLMs) to explain humour depends on the particular humour form. We compare models on si ...
ArXiv Domain 2025-07-22
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning. Large language models (LLMs) excel across various tasks, but standard first-order (FO) fine-tuning demands considerable memory, significantly limiting real-world deployment. Recently, zeroth-order (ZO) optimization has stood out as a promising memory-efficient training paradigm, avoiding backward passes and relying solely on forward passes for gradient estimation, m ...
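The zeroth-order idea in the entry above is easy to see in miniature: perturb the parameters in a random direction, compare the loss from two forward passes, and use the difference as a directional gradient estimate, so no backward pass is ever needed. Below is a minimal SPSA-style sketch in numpy; it illustrates the generic ZO-SGD recipe, not this paper's specific method, and `loss_fn` and the toy quadratic are placeholders.

```python
import numpy as np

def zo_sgd_step(params, loss_fn, lr=1e-2, eps=1e-3, seed=0):
    """One SPSA-style zeroth-order step: estimate the gradient from two
    forward passes (no backprop), then apply a plain SGD update."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(params.shape)        # random perturbation direction
    loss_plus = loss_fn(params + eps * z)        # forward pass 1
    loss_minus = loss_fn(params - eps * z)       # forward pass 2
    g = (loss_plus - loss_minus) / (2 * eps)     # directional derivative along z
    return params - lr * g * z                   # SGD update along z

# Toy usage: minimise ||w - 3||^2 using forward passes only.
w = np.zeros(4)
for step in range(500):
    w = zo_sgd_step(w, lambda p: float(np.sum((p - 3.0) ** 2)), seed=step)
print(w)  # close to [3, 3, 3, 3]
```

Because only the scalar loss difference and the RNG seed are needed to reconstruct each update, memory stays near inference level, which is the appeal the excerpt points to.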
ArXiv Domain 2025-07-23
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. The Impact of Language Mixing on Bilingual LLM Reasoning. Proficient multilingual speakers often intentionally switch languages in the middle of a conversation. Similarly, recent reasoning-focused bilingual large language models (LLMs) with strong capabilities in both languages exhibit language mixing—alternating languages within their chain of thought. Discouraging this behavior in DeepSeek-R1 was found to degrade accuracy, suggesting that language mixing m ...
ArXiv Domain 2025-07-24
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. LingBench++: A Linguistically-Informed Benchmark and Reasoning Framework for Multi-Step and Cross-Cultural Inference with LLMs. We propose LingBench++, a linguistically-informed benchmark and reasoning framework designed to evaluate large language models (LLMs) on complex linguistic tasks inspired by the International Linguistics Olympiad (IOL). Unlike prior benchmarks that focus solely on final answer accuracy, LingBench++ provides structured reasoning trac ...
ArXiv Domain 2025-07-25
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning. Large Language Models (LLMs) have become indispensable in real-world applications. However, their widespread adoption raises significant safety concerns, particularly in responding to socially harmful questions. Despite substantial efforts to improve model safety through alignment, aligned models can still have their safety protections undermined by subsequent fine-tuning - even when the ...
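The excerpt does not spell out the mechanism, but the title suggests extrapolating the safety-alignment weight update along its dominant low-rank directions. A heavily hedged numpy sketch of that general idea follows; `k` and `alpha` are illustrative knobs, and the paper's actual procedure may differ.

```python
import numpy as np

def low_rank_extrapolate(w_base, w_aligned, k=8, alpha=0.5):
    """Illustrative low-rank extrapolation: amplify the top-k singular
    directions of the alignment update dW = w_aligned - w_base.
    (Sketch of the general idea only; LoX's exact recipe may differ.)"""
    dw = w_aligned - w_base
    u, s, vt = np.linalg.svd(dw, full_matrices=False)
    dw_low = (u[:, :k] * s[:k]) @ vt[:k, :]   # rank-k part of the update
    return w_aligned + alpha * dw_low          # extrapolate past the aligned point

# Toy usage on a random "layer".
rng = np.random.default_rng(0)
w_base = rng.standard_normal((64, 64))
w_aligned = w_base + 0.1 * rng.standard_normal((64, 64))
w_robust = low_rank_extrapolate(w_base, w_aligned)
```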
ArXiv Domain 2025-07-26
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs. Knowledge distillation can be a cost-effective technique to distill knowledge in Large Language Models, if the teacher output logits can be pre-computed and cached. However, successfully applying this to pre-training remains largely unexplored. In this work, we prove that naive approaches for sparse knowledge distillation such as caching Top-K probabilities, while intuitive, provide biased e ...
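The naive approach this entry warns about is simple to reproduce: cache only the teacher's Top-K probabilities and renormalize, which silently reassigns the tail's probability mass to the kept tokens and distorts the distillation target. A small numpy sketch with a toy vocabulary (names are illustrative):

```python
import numpy as np

def topk_cached_target(teacher_probs, k=4):
    """Naive sparse caching: keep the teacher's Top-K probabilities and
    renormalize. The discarded tail mass is pushed onto the top tokens,
    which is the kind of bias the abstract refers to."""
    idx = np.argsort(teacher_probs)[::-1][:k]  # indices of the top-k tokens
    sparse = np.zeros_like(teacher_probs)
    sparse[idx] = teacher_probs[idx]
    return sparse / sparse.sum()               # renormalize over the kept tokens

# Toy vocabulary of 8 tokens.
teacher = np.array([0.30, 0.25, 0.15, 0.10, 0.08, 0.05, 0.04, 0.03])
target = topk_cached_target(teacher, k=4)
print(target[:4])  # top-4 masses inflated from a total of 0.80 up to 1.0
```

A student trained against `target` systematically over-weights head tokens relative to the true teacher distribution.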
ArXiv Domain 2025-07-27
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs. Knowledge distillation can be a cost-effective technique to distill knowledge in Large Language Models, if the teacher output logits can be pre-computed and cached. However, successfully applying this to pre-training remains largely unexplored. In this work, we prove that naive approaches for sparse knowledge distillation such as caching Top-K probabilities, while intuitive, provide biased e ...
ArXiv Domain 2025-07-28
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts. Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performan ...
ArXiv Domain 2025-07-29
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts. Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performan ...
ArXiv Domain 2025-07-30
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Multi-Agent-as-Judge: Aligning LLM-Agent-Based Automated Evaluation with Multi-Dimensional Human Evaluation. Nearly all human work is collaborative; thus, the evaluation of real-world NLP applications often requires multiple dimensions that align with diverse human perspectives. As real human evaluator resources are often scarce and costly, the emerging "LLM-as-a-judge" paradigm sheds light on a promising approach to leverage LLM agents to believably simulat ...
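As a rough illustration of the multi-dimensional LLM-as-a-judge pattern the entry above describes, the sketch below runs one judge prompt per evaluation dimension and averages the scores. `call_llm`, the dimensions, and the prompt wording are all hypothetical placeholders, not the paper's protocol.

```python
# Minimal multi-dimensional LLM-as-a-judge loop. `call_llm` is a
# hypothetical stand-in for any chat-completion API.
DIMENSIONS = ["helpfulness", "factuality", "coherence"]

def judge(call_llm, task, answer):
    scores = {}
    for dim in DIMENSIONS:
        prompt = (
            f"Rate the following answer for {dim} on a 1-5 scale. "
            f"Reply with a single integer.\n\nTask: {task}\n\nAnswer: {answer}"
        )
        reply = call_llm(prompt)               # one judge agent per dimension
        scores[dim] = int(reply.strip())
    scores["overall"] = sum(scores.values()) / len(DIMENSIONS)
    return scores
```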
ArXiv Domain 2025-07-31
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router. Large Language Models (LLMs) excel at many reasoning tasks but struggle with knowledge-intensive queries due to their inability to dynamically access up-to-date or domain-specific information. Retrieval-Augmented Generation (RAG) has emerged as a promising solution, enabling LLMs to ground their responses in external sources. However, existing RAG methods lack fine-grained control over both the qu ...
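For readers who want the baseline that DeepSieve refines, here is a minimal generic RAG skeleton: embed the query, rank documents by cosine similarity, and put the top hits into the prompt. `embed` and `call_llm` are hypothetical stand-ins for an embedding model and a chat model; DeepSieve's routing is more fine-grained than this.

```python
import numpy as np

def rag_answer(embed, call_llm, question, docs, k=3):
    """Generic retrieval-augmented generation: retrieve top-k documents
    by cosine similarity, then answer conditioned on them."""
    q = embed(question)
    doc_vecs = np.stack([embed(d) for d in docs])
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    top = [docs[i] for i in np.argsort(sims)[::-1][:k]]   # top-k by cosine
    context = "\n\n".join(top)
    return call_llm(f"Answer using only this context:\n{context}\n\nQ: {question}")
```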
ArXiv Domain 2025-08-01
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. Past Meets Present: Creating Historical Analogy with Large Language Models. Historical analogies, which compare known past events with contemporary but unfamiliar events, are important abilities that help people make decisions and understand the world. However, research in applied history suggests that people have difficulty finding appropriate analogies, and previous studies in the AI community have also overlooked historical analogies. To fill this gap, in ...
ArXiv Domain 2025-08-02
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model. AI agents built on large language models (LLMs) hold enormous promise, but current practice focuses on a one-task-one-agent approach, which not only falls short of scalability and generality, but also suffers from the fundamental limitations of autoregressive LLMs. On the other hand, humans are general agents who reason by mentally simulating the out ...
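The simulative reasoning loop this excerpt gestures at: propose candidate actions, use a world model to imagine each outcome, and act on the best one. The skeleton below is a generic version of that pattern; `propose_actions`, `simulate`, and `score` are hypothetical stand-ins (all three could be prompts to one LLM), and this is not SimuRA's actual architecture.

```python
# Generic "simulate before acting" loop in the spirit of world-model-based
# agents. All callables are hypothetical placeholders.
def plan_step(state, propose_actions, simulate, score, n_candidates=4):
    best_action, best_value = None, float("-inf")
    for action in propose_actions(state, n_candidates):
        imagined = simulate(state, action)     # world model predicts the outcome
        value = score(imagined)                # evaluate the imagined outcome
        if value > best_value:
            best_action, best_value = action, value
    return best_action                          # act only after mental simulation
```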
ArXiv Domain 2025-08-03
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model. AI agents built on large language models (LLMs) hold enormous promise, but current practice focuses on a one-task-one-agent approach, which not only falls short of scalability and generality, but also suffers from the fundamental limitations of autoregressive LLMs. On the other hand, humans are general agents who reason by mentally simulating the out ...
ArXiv Domain 2025-08-04
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers:
1. SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model. AI agents built on large language models (LLMs) hold enormous promise, but current practice focuses on a one-task-one-agent approach, which not only falls short of scalability and generality, but also suffers from the fundamental limitations of autoregressive LLMs. On the other hand, humans are general agents who reason by mentally simulating the out ...