Weibo Hot 2025-08-28
Data source: Weibo Hot Search
Rank | Topic | Heat | Category
1 | Persistently advancing China-Russia relations to a higher level | - | -
2 | Japan has spent 56 billion on public relations since China's War of Resistance victory parade | 945204 | -
3 | Man drowns while swimming in Sanya; his wife breaks down in tears | 283131 | -
4 | China's blue-helmet peacekeepers appear at the September 3 military parade | 271183 | -
5 | 郑佩佩's son crowdfunds medical fees for his wife again | 269909 | -
6 | Post-2000s short-drama actress diagnosed with stomach cancer | 269212 | -
7 | Hotel bathtub / bloodworms | 266764 | -
8 | 檀健次 named Discovery's global brand ambassador | - | -
9 | Livestreaming companies prey on rural girls who dropped out of middle school | 264001 | -
10 | 王源 says at worst he just won't speak for a week | 261974 | Performance
11 | 陈乔恩 on the bravest thing she has ever done | 257834 | Variety show
12 | Latest developments in the case of the man given a suspended death sentence for killing his wife during a depressive episode | 252132 | -
13 | 金子涵 gets a buzz cut | 247762 | -
14 | Netizens report a huge meteor falling over Wuhan | 247243 | -
15 | 张馨予: "Why restock? Sis has been here all along" | 242036 | -
16 | Semaglutide | 229466 | -
17 | Drug trafficker who had plastic surgery while on the run caught because of his ears | 194895 | -
18 | Meituan's profit plunges 89% | 185516 | -
19 | 北京 ... | - | -
Weibo Hot 2025-07-14
Data source: Weibo Hot Search
Rank | Topic | Heat | Category
1 | Cities need height, but they need warmth even more | - | -
2 | 关晓彤 and 李昀锐 kiss by the sea | 1955081 | Drama series
3 | Man accused of groping while rescuing a woman says the accusation left him cold | 940098 | -
4 | Doctors advise what to eat to beat the heat on scorching days | 936562 | -
5 | Of the conjoined twins, the elder sister wants to marry while the younger wants to stay single | 934768 | -
6 | Phone number ending in 77777777 auctioned for 3.2 million yuan | 662753 | -
7 | 王鹤棣 / LV | 564614 | -
8 | Sanitation worker who dug through garbage to find a child's smartwatch to be rewarded | 540782 | -
9 | Customer pays 9.9 yuan for rice noodles, but the shop takes home less than 1 yuan | 465255 | -
10 | Idol dramas are not allowed to use folding umbrellas | 389921 | -
11 | In the food-delivery price war, one person covered a week's meals for 65 yuan | 389514 | -
12 | Back-view photo of Angelababy and 小海绵 in Paris | 377058 | -
13 | VOGUE's golden-September issue is not a group cover of post-85 actresses | 374464 | -
14 | 田栩宁 spoke his sixth line | 361729 | -
15 | 蒋欣 eats the same bell-pepper bowl as 关晓彤 | 359968 | -
16 | Parents reveal moldy toothbrushes at a kindergarten charging over 10,000 in tuition | 312175 | -
17 | Did 唐悠悠 go to find 林宛瑜? | 304036 | -
18 | What is 小海绵 up to? | 282339 | -
...
ArXiv Domain 2025-07-19
Data source: ArXiv Domain
LLM Domain Papers
1. Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
Humour, as a complex language form, is derived from myriad aspects of life, whilst existing work on computational humour has focussed almost exclusively on short pun-based jokes. In this work, we investigate whether the ability of Large Language Models (LLMs) to explain humour depends on the particular humour form. We compare models on si ...
ArXiv Domain 2025-07-20
Data source: ArXiv Domain
LLM Domain Papers
1. Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
Humour, as a complex language form, is derived from myriad aspects of life, whilst existing work on computational humour has focussed almost exclusively on short pun-based jokes. In this work, we investigate whether the ability of Large Language Models (LLMs) to explain humour depends on the particular humour form. We compare models on si ...
ArXiv Domain 2025-07-21
Data source: ArXiv Domain
LLM Domain Papers
1. Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes
Humour, as a complex language form, is derived from myriad aspects of life, whilst existing work on computational humour has focussed almost exclusively on short pun-based jokes. In this work, we investigate whether the ability of Large Language Models (LLMs) to explain humour depends on the particular humour form. We compare models on si ...
ArXiv Domain 2025-07-22
Data source: ArXiv Domain
LLM Domain Papers
1. Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning
Large language models (LLMs) excel across various tasks, but standard first-order (FO) fine-tuning demands considerable memory, significantly limiting real-world deployment. Recently, zeroth-order (ZO) optimization stood out as a promising memory-efficient training paradigm, avoiding backward passes and relying solely on forward passes for gradient estimation, m ...
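The zeroth-order idea this abstract refers to, estimating a gradient from forward passes only, can be illustrated with a two-point perturbation estimate. The sketch below is a generic SPSA-style update under assumed names (`loss_fn`, `lr`, `eps` are placeholders), not the paper's method.

```python
import torch

def zo_sgd_step(params, loss_fn, lr=1e-4, eps=1e-3):
    """One zeroth-order update: estimate the gradient from two forward passes
    along a shared random direction, with no backward pass."""
    with torch.no_grad():
        # Random perturbation direction, one tensor per parameter.
        z = [torch.randn_like(p) for p in params]

        # Evaluate the loss at theta + eps*z and theta - eps*z.
        for p, zi in zip(params, z):
            p.add_(eps * zi)
        loss_plus = loss_fn()
        for p, zi in zip(params, z):
            p.sub_(2 * eps * zi)
        loss_minus = loss_fn()

        # Restore the original parameters.
        for p, zi in zip(params, z):
            p.add_(eps * zi)

        # Finite-difference slope along z; scale the shared direction by it
        # and take an SGD-style step.
        g = (loss_plus - loss_minus) / (2 * eps)
        for p, zi in zip(params, z):
            p.sub_(lr * g * zi)
    return float(loss_plus)
```

Because only forward passes are needed, no activations are stored for backpropagation, which is where the memory savings mentioned in the abstract come from.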
ArXiv Domain 2025-07-23
Data source: ArXiv Domain
LLM Domain Papers
1. The Impact of Language Mixing on Bilingual LLM Reasoning
Proficient multilingual speakers often intentionally switch languages in the middle of a conversation. Similarly, recent reasoning-focused bilingual large language models (LLMs) with strong capabilities in both languages exhibit language mixing—alternating languages within their chain of thought. Discouraging this behavior in DeepSeek-R1 was found to degrade accuracy, suggesting that language mixing m ...
ArXiv Domain 2025-07-24
Data source: ArXiv Domain
LLM Domain Papers
1. LingBench++: A Linguistically-Informed Benchmark and Reasoning Framework for Multi-Step and Cross-Cultural Inference with LLMs
We propose LingBench++, a linguistically-informed benchmark and reasoning framework designed to evaluate large language models (LLMs) on complex linguistic tasks inspired by the International Linguistics Olympiad (IOL). Unlike prior benchmarks that focus solely on final answer accuracy, LingBench++ provides structured reasoning trac ...
ArXiv Domain 2025-07-25
Data source: ArXiv Domain
LLM Domain Papers
1. LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning
Large Language Models (LLMs) have become indispensable in real-world applications. However, their widespread adoption raises significant safety concerns, particularly in responding to socially harmful questions. Despite substantial efforts to improve model safety through alignment, aligned models can still have their safety protections undermined by subsequent fine-tuning - even when the ...
ArXiv Domain 2025-07-27
Data source: ArXiv Domain
LLM Domain Papers
1. Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs
Knowledge distillation can be a cost-effective technique to distill knowledge in Large Language Models, if the teacher output logits can be pre-computed and cached. However, successfully applying this to pre-training remains largely unexplored. In this work, we prove that naive approaches for sparse knowledge distillation such as caching Top-K probabilities, while intuitive, provide biased e ...
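To make the "caching Top-K probabilities" idea concrete, here is a minimal sketch of the naive approach the abstract calls biased: keep only the teacher's Top-K probabilities per position, renormalize them, and train the student against that sparse target. Function names and the choice of K are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def cache_topk_teacher_probs(teacher_logits, k=32):
    """Naive sparse caching: keep only the teacher's Top-K probabilities and
    renormalize. The renormalization is the source of bias, because all
    probability mass outside the Top-K set is silently redistributed."""
    probs = F.softmax(teacher_logits, dim=-1)        # [batch, vocab]
    topk_probs, topk_ids = probs.topk(k, dim=-1)     # [batch, k]
    topk_probs = topk_probs / topk_probs.sum(-1, keepdim=True)
    return topk_ids, topk_probs

def sparse_kd_loss(student_logits, topk_ids, topk_probs):
    """Cross-entropy of the student against the cached sparse teacher target."""
    log_q = F.log_softmax(student_logits, dim=-1)    # [batch, vocab]
    log_q_topk = log_q.gather(-1, topk_ids)          # [batch, k]
    return -(topk_probs * log_q_topk).sum(-1).mean()
```

Precomputing `topk_ids` and `topk_probs` once lets the student train without re-running the teacher, which is the caching benefit the abstract describes.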
ArXiv Domain 2025-07-28
Data source: ArXiv Domain
LLM Domain Papers
1. Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts
Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performan ...
ArXiv Domain 2025-07-29
Data source: ArXiv Domain
LLM Domain Papers
1. Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts
Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performan ...
ArXiv Domain 2025-07-30
Data source: ArXiv Domain
LLM Domain Papers
1. Multi-Agent-as-Judge: Aligning LLM-Agent-Based Automated Evaluation with Multi-Dimensional Human Evaluation
Nearly all human work is collaborative; thus, the evaluation of real-world NLP applications often requires multiple dimensions that align with diverse human perspectives. As real human evaluator resources are often scarce and costly, the emerging “LLM-as-a-judge” paradigm sheds light on a promising approach to leverage LLM agents to believably simulat ...
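The underlying "LLM-as-a-judge" pattern this abstract builds on can be sketched as follows: prompt a judge model to rate a candidate answer along several human-aligned dimensions, then aggregate the ratings. This is a generic single-judge sketch under assumed names (`call_llm` and the dimension list are placeholders), not the paper's multi-agent framework.

```python
from statistics import mean

DIMENSIONS = ["factuality", "coherence", "helpfulness"]  # assumed example dimensions

def judge(candidate, task, call_llm):
    """Score one candidate along several dimensions by asking a judge model for
    a 1-5 rating per dimension, then average the ratings.
    `call_llm` is a placeholder for any chat-completion call returning text."""
    scores = {}
    for dim in DIMENSIONS:
        prompt = (
            f"Task: {task}\nCandidate answer: {candidate}\n"
            f"Rate the answer's {dim} from 1 (poor) to 5 (excellent). "
            "Reply with a single integer."
        )
        reply = call_llm(prompt)
        scores[dim] = int(reply.strip().split()[0])
    scores["overall"] = mean(scores[d] for d in DIMENSIONS)
    return scores
```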
ArXiv Domain 2025-07-31
Data source: ArXiv Domain
LLM Domain Papers
1. DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router
Large Language Models (LLMs) excel at many reasoning tasks but struggle with knowledge-intensive queries due to their inability to dynamically access up-to-date or domain-specific information. Retrieval-Augmented Generation (RAG) has emerged as a promising solution, enabling LLMs to ground their responses in external sources. However, existing RAG methods lack fine-grained control over both the qu ...
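As background for the RAG setting this abstract builds on, the following is a minimal retrieve-then-generate sketch: embed the query, rank documents by cosine similarity, and condition the generator on the retrieved text. It is a generic baseline, not DeepSieve's knowledge-routing mechanism; `embed` and `generate` are placeholder callables.

```python
import numpy as np

def retrieve(query_vec, doc_vecs, docs, top_k=3):
    """Rank documents by cosine similarity to the query embedding."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8
    )
    best = np.argsort(-sims)[:top_k]
    return [docs[i] for i in best]

def rag_answer(question, embed, generate, docs, doc_vecs):
    """Generic retrieve-then-generate: ground the answer in retrieved text.
    `embed` and `generate` stand in for an embedding model and an LLM call."""
    context = "\n\n".join(retrieve(embed(question), doc_vecs, docs))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)
```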
ArXiv Domain 2025-08-01
Data source: ArXiv Domain
LLM Domain Papers
1. Past Meets Present: Creating Historical Analogy with Large Language Models
Historical analogies, which compare known past events with contemporary but unfamiliar events, are important abilities that help people make decisions and understand the world. However, research in applied history suggests that people have difficulty finding appropriate analogies. And previous studies in the AI community have also overlooked historical analogies. To fill this gap, in ...