Weibo Hot 2025-08-28
Data source: Weibo Hot Search
Rank | Topic | Heat | Category
1 | Persistently advancing China-Russia relations to a higher level | - | -
2 | Japan has spent 56 billion on public relations since China's War of Resistance victory parade | 945204 | -
3 | Man drowns while swimming in Sanya; his wife breaks down in tears | 283131 | -
4 | China's blue-helmet peacekeepers appear at the September 3 military parade | 271183 | -
5 | 郑佩佩's son crowdfunds medical fees for his wife again | 269909 | -
6 | Post-2000s short-drama actress diagnosed with stomach cancer | 269212 | -
7 | Hotel bathtub / bloodworms | 266764 | -
8 | 檀健次 named Discovery's global brand ambassador | - | -
9 | Livestreaming companies prey on rural girls who dropped out of middle school | 264001 | -
10 | 王源 says at worst he just won't speak for a week | 261974 | Performance
11 | 陈乔恩 on the bravest thing she has ever done | 257834 | Variety show
12 | Latest developments in the case of the man given a suspended death sentence for killing his wife during a depressive episode | 252132 | -
13 | 金子涵 gets a buzz cut | 247762 | -
14 | Netizens report a huge meteor falling over Wuhan | 247243 | -
15 | 张馨予: "Why restock? Sis has been here all along" | 242036 | -
16 | Semaglutide | 229466 | -
17 | Drug trafficker who had plastic surgery while on the run caught because of his ears | 194895 | -
18 | Meituan's profit plunges 89% | 185516 | -
19 | 北京 ... | - | -
Weibo Hot 2025-07-14
Data source: Weibo Hot Search
Rank | Topic | Heat | Category
1 | Cities need height, but they need warmth even more | - | -
2 | 关晓彤 and 李昀锐 kiss by the sea | 1955081 | Drama series
3 | Man accused of groping while rescuing a woman says the accusation left him cold | 940098 | -
4 | Doctors advise what to eat to beat the heat on scorching days | 936562 | -
5 | Of the conjoined twins, the elder sister wants to marry while the younger wants to stay single | 934768 | -
6 | Phone number ending in 77777777 auctioned for 3.2 million yuan | 662753 | -
7 | 王鹤棣 / LV | 564614 | -
8 | Sanitation worker who dug through garbage to find a child's smartwatch to be rewarded | 540782 | -
9 | Customer pays 9.9 yuan for rice noodles, but the shop takes home less than 1 yuan | 465255 | -
10 | Idol dramas are not allowed to use folding umbrellas | 389921 | -
11 | In the food-delivery price war, one person covered a week's meals for 65 yuan | 389514 | -
12 | Back-view photo of Angelababy and 小海绵 in Paris | 377058 | -
13 | VOGUE's golden-September issue is not a group cover of post-85 actresses | 374464 | -
14 | 田栩宁 spoke his sixth line | 361729 | -
15 | 蒋欣 eats the same bell-pepper bowl as 关晓彤 | 359968 | -
16 | Parents reveal moldy toothbrushes at a kindergarten charging over 10,000 in tuition | 312175 | -
17 | Did 唐悠悠 go to find 林宛瑜? | 304036 | -
18 | What is 小海绵 up to? | 282339 | -
...
ArXiv Domain 2025-07-19
Data source: ArXiv Domain
LLM Domain Papers
1. Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
Humour, as a complex language form, is derived from myriad aspects of life, whilst existing work on computational humour has focussed almost exclusively on short pun-based jokes. In this work, we investigate whether the ability of Large Language Models (LLMs) to explain humour depends on the particular humour form. We compare models on si ...
ArXiv Domain 2025-07-20
Data source: ArXiv Domain
LLM Domain Papers
1. Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
Humour, as a complex language form, is derived from myriad aspects of life, whilst existing work on computational humour has focussed almost exclusively on short pun-based jokes. In this work, we investigate whether the ability of Large Language Models (LLMs) to explain humour depends on the particular humour form. We compare models on si ...
ArXiv Domain 2025-07-21
Data source: ArXiv Domain
LLM Domain Papers
1. Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
Humour, as a complex language form, is derived from myriad aspects of life, whilst existing work on computational humour has focussed almost exclusively on short pun-based jokes. In this work, we investigate whether the ability of Large Language Models (LLMs) to explain humour depends on the particular humour form. We compare models on si ...
ArXiv Domain 2025-07-22
Data source: ArXiv Domain
LLM Domain Papers
1. Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning
Large language models (LLMs) excel across various tasks, but standard first-order (FO) fine-tuning demands considerable memory, significantly limiting real-world deployment. Recently, zeroth-order (ZO) optimization stood out as a promising memory-efficient training paradigm, avoiding backward passes and relying solely on forward passes for gradient estimation, m ...
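The zeroth-order idea this abstract refers to, estimating a gradient from forward passes only, can be illustrated with a two-point perturbation estimate. The sketch below is a generic SPSA-style update under assumed names (`loss_fn`, `lr`, `eps` are placeholders), not the paper's method.

```python
import torch

def zo_sgd_step(params, loss_fn, lr=1e-4, eps=1e-3):
    """One zeroth-order update: estimate the gradient from two forward passes
    along a shared random direction, with no backward pass."""
    with torch.no_grad():
        # Random perturbation direction, one tensor per parameter.
        z = [torch.randn_like(p) for p in params]

        # Evaluate the loss at theta + eps*z and theta - eps*z.
        for p, zi in zip(params, z):
            p.add_(eps * zi)
        loss_plus = loss_fn()
        for p, zi in zip(params, z):
            p.sub_(2 * eps * zi)
        loss_minus = loss_fn()

        # Restore the original parameters.
        for p, zi in zip(params, z):
            p.add_(eps * zi)

        # Finite-difference slope along z; scale the shared direction by it
        # and take an SGD-style step.
        g = (loss_plus - loss_minus) / (2 * eps)
        for p, zi in zip(params, z):
            p.sub_(lr * g * zi)
    return float(loss_plus)
```

Because only forward passes are needed, no activations are stored for backpropagation, which is where the memory savings mentioned in the abstract come from.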
ArXiv Domain 2025-07-23
Data source: ArXiv Domain
LLM Domain Papers
1. The Impact of Language Mixing on Bilingual LLM Reasoning
Proficient multilingual speakers often intentionally switch languages in the middle of a conversation. Similarly, recent reasoning-focused bilingual large language models (LLMs) with strong capabilities in both languages exhibit language mixing—alternating languages within their chain of thought. Discouraging this behavior in DeepSeek-R1 was found to degrade accuracy, suggesting that language mixing m ...
ArXiv Domain 2025-07-24
Data source: ArXiv Domain
LLM Domain Papers
1. LingBench++: A Linguistically-Informed Benchmark and Reasoning Framework for Multi-Step and Cross-Cultural Inference with LLMs
We propose LingBench++, a linguistically-informed benchmark and reasoning framework designed to evaluate large language models (LLMs) on complex linguistic tasks inspired by the International Linguistics Olympiad (IOL). Unlike prior benchmarks that focus solely on final answer accuracy, LingBench++ provides structured reasoning trac ...
ArXiv Domain 2025-07-25
Data source: ArXiv Domain
LLM Domain Papers
1. LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning
Large Language Models (LLMs) have become indispensable in real-world applications. However, their widespread adoption raises significant safety concerns, particularly in responding to socially harmful questions. Despite substantial efforts to improve model safety through alignment, aligned models can still have their safety protections undermined by subsequent fine-tuning - even when the ...
ArXiv Domain 2025-07-27
Data source: ArXiv Domain
LLM Domain Papers
1. Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs
Knowledge distillation can be a cost-effective technique to distill knowledge in Large Language Models, if the teacher output logits can be pre-computed and cached. However, successfully applying this to pre-training remains largely unexplored. In this work, we prove that naive approaches for sparse knowledge distillation such as caching Top-K probabilities, while intuitive, provide biased e ...
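To make the "caching Top-K probabilities" idea concrete, here is a minimal sketch of the naive approach the abstract calls biased: keep only the teacher's Top-K probabilities per position, renormalize them, and train the student against that sparse target. Function names and the choice of K are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def cache_topk_teacher_probs(teacher_logits, k=32):
    """Naive sparse caching: keep only the teacher's Top-K probabilities and
    renormalize. The renormalization is the source of bias, because all
    probability mass outside the Top-K set is silently redistributed."""
    probs = F.softmax(teacher_logits, dim=-1)        # [batch, vocab]
    topk_probs, topk_ids = probs.topk(k, dim=-1)     # [batch, k]
    topk_probs = topk_probs / topk_probs.sum(-1, keepdim=True)
    return topk_ids, topk_probs

def sparse_kd_loss(student_logits, topk_ids, topk_probs):
    """Cross-entropy of the student against the cached sparse teacher target."""
    log_q = F.log_softmax(student_logits, dim=-1)    # [batch, vocab]
    log_q_topk = log_q.gather(-1, topk_ids)          # [batch, k]
    return -(topk_probs * log_q_topk).sum(-1).mean()
```

Precomputing `topk_ids` and `topk_probs` once lets the student train without re-running the teacher, which is the caching benefit the abstract describes.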
ArXiv Domain 2025-07-28
Data source: ArXiv Domain
LLM Domain Papers
1. Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts
Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performan ...
ArXiv Domain 2025-07-29
Data source: ArXiv Domain
LLM Domain Papers
1. Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts
Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performan ...
ArXiv Domain 2025-07-30
Data source: ArXiv Domain
LLM Domain Papers
1. Multi-Agent-as-Judge: Aligning LLM-Agent-Based Automated Evaluation with Multi-Dimensional Human Evaluation
Nearly all human work is collaborative; thus, the evaluation of real-world NLP applications often requires multiple dimensions that align with diverse human perspectives. As real human evaluator resources are often scarce and costly, the emerging “LLM-as-a-judge” paradigm sheds light on a promising approach to leverage LLM agents to believably simulat ...
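The underlying "LLM-as-a-judge" pattern this abstract builds on can be sketched as follows: prompt a judge model to rate a candidate answer along several human-aligned dimensions, then aggregate the ratings. This is a generic single-judge sketch under assumed names (`call_llm` and the dimension list are placeholders), not the paper's multi-agent framework.

```python
from statistics import mean

DIMENSIONS = ["factuality", "coherence", "helpfulness"]  # assumed example dimensions

def judge(candidate, task, call_llm):
    """Score one candidate along several dimensions by asking a judge model for
    a 1-5 rating per dimension, then average the ratings.
    `call_llm` is a placeholder for any chat-completion call returning text."""
    scores = {}
    for dim in DIMENSIONS:
        prompt = (
            f"Task: {task}\nCandidate answer: {candidate}\n"
            f"Rate the answer's {dim} from 1 (poor) to 5 (excellent). "
            "Reply with a single integer."
        )
        reply = call_llm(prompt)
        scores[dim] = int(reply.strip().split()[0])
    scores["overall"] = mean(scores[d] for d in DIMENSIONS)
    return scores
```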
ArXiv Domain 2025-07-31
Data source: ArXiv Domain
LLM Domain Papers
1. DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router
Large Language Models (LLMs) excel at many reasoning tasks but struggle with knowledge-intensive queries due to their inability to dynamically access up-to-date or domain-specific information. Retrieval-Augmented Generation (RAG) has emerged as a promising solution, enabling LLMs to ground their responses in external sources. However, existing RAG methods lack fine-grained control over both the qu ...
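As background for the RAG setting this abstract builds on, the following is a minimal retrieve-then-generate sketch: embed the query, rank documents by cosine similarity, and condition the generator on the retrieved text. It is a generic baseline, not DeepSieve's knowledge-routing mechanism; `embed` and `generate` are placeholder callables.

```python
import numpy as np

def retrieve(query_vec, doc_vecs, docs, top_k=3):
    """Rank documents by cosine similarity to the query embedding."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8
    )
    best = np.argsort(-sims)[:top_k]
    return [docs[i] for i in best]

def rag_answer(question, embed, generate, docs, doc_vecs):
    """Generic retrieve-then-generate: ground the answer in retrieved text.
    `embed` and `generate` stand in for an embedding model and an LLM call."""
    context = "\n\n".join(retrieve(embed(question), doc_vecs, docs))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)
```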
ArXiv Domain 2025-08-01
Data source: ArXiv Domain
LLM Domain Papers
1. Past Meets Present: Creating Historical Analogy with Large Language Models
Historical analogies, which compare known past events with contemporary but unfamiliar events, are important abilities that help people make decisions and understand the world. However, research in applied history suggests that people have difficulty finding appropriate analogies. And previous studies in the AI community have also overlooked historical analogies. To fill this gap, in ...