ArXiv Domain 2025-07-26
Data source: ArXiv Domain
LLM Domain Papers

1. Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs
Knowledge distillation can be a cost-effective technique for distilling knowledge into Large Language Models, if the teacher output logits can be pre-computed and cached. However, successfully applying this to pre-training remains largely unexplored. In this work, we prove that naive approaches to sparse knowledge distillation, such as caching Top-K probabilities, while intuitive, provide biased e ...
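The abstract cuts off before the paper's proposed fix, but the naive baseline it critiques is easy to make concrete. A minimal PyTorch sketch of Top-K logit caching under a standard soft-target distillation setup; the function names and the choice of k are mine, not the paper's, and the renormalization step is where the bias the authors point to comes from:

```python
import torch

def cache_topk(teacher_logits: torch.Tensor, k: int = 64):
    """Naive sparse caching: keep only each position's Top-K teacher
    probabilities and renormalize them to sum to 1. Renormalizing
    silently reassigns all tail mass to the head tokens, which is the
    biased estimate the abstract refers to."""
    probs = teacher_logits.softmax(dim=-1)              # (batch, seq, vocab)
    topk_p, topk_idx = probs.topk(k, dim=-1)            # K largest per position
    topk_p = topk_p / topk_p.sum(dim=-1, keepdim=True)  # <- source of the bias
    return topk_p, topk_idx

def sparse_kd_loss(student_logits, topk_p, topk_idx):
    """Cross-entropy of the student against the cached sparse teacher
    distribution, evaluated only at the cached vocabulary indices."""
    log_q = student_logits.log_softmax(dim=-1)
    return -(topk_p * log_q.gather(-1, topk_idx)).sum(dim=-1).mean()
```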
ArXiv Domain 2025-07-28
Data source: ArXiv Domain
LLM Domain Papers

1. Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts
Many recent papers have studied the development of superforecaster-level event forecasting LLMs. While methodological problems with early studies cast doubt on the use of LLMs for event forecasting, recent studies with improved evaluation methods have shown that state-of-the-art LLMs are gradually reaching superforecaster-level performan ...
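The truncated abstract does not say how forecasting skill is scored; the standard scoring rule in this literature (including the superforecasting work it alludes to) is the Brier score, so a purely illustrative sketch:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between predicted probabilities and binary
    outcomes: 0.0 is perfect, 0.25 matches a constant 50% forecast."""
    assert len(forecasts) == len(outcomes)
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)

# An LLM that assigned 0.9, 0.2, 0.7 to three events, of which the
# first and third occurred:
print(brier_score([0.9, 0.2, 0.7], [1, 0, 1]))  # ~0.047
```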
ArXiv Domain 2025-07-30
Data source: ArXiv Domain
LLM Domain Papers

1. Multi-Agent-as-Judge: Aligning LLM-Agent-Based Automated Evaluation with Multi-Dimensional Human Evaluation
Nearly all human work is collaborative; thus, the evaluation of real-world NLP applications often requires multiple dimensions that align with diverse human perspectives. As real human evaluator resources are often scarce and costly, the emerging “LLM-as-a-judge” paradigm sheds light on a promising approach: leveraging LLM agents to believably simulat ...
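The mechanics behind the multi-agent-as-judge idea can be sketched even from the truncated abstract: one judge agent per human evaluation dimension, with per-dimension scores kept separate rather than collapsed into a single scalar. Everything below (the rubric dimensions, the `llm` callable, the 1-to-5 scale) is a hypothetical interface of mine, not the paper's:

```python
from dataclasses import dataclass
from typing import Callable, Dict

DIMENSIONS = ["helpfulness", "factuality", "coherence"]  # hypothetical rubric

@dataclass
class JudgeAgent:
    dimension: str
    llm: Callable[[str], str]  # any text-in/text-out completion function

    def score(self, task: str, response: str) -> float:
        prompt = (
            f"You are an evaluator who judges only {self.dimension}.\n"
            f"Task: {task}\nResponse: {response}\n"
            "Reply with a single number from 1 to 5."
        )
        return float(self.llm(prompt).strip())

def multi_agent_judge(task: str, response: str,
                      llm: Callable[[str], str]) -> Dict[str, float]:
    """One specialised agent per dimension, mirroring how multi-dimensional
    human evaluation keeps the axes distinct."""
    return {d: JudgeAgent(d, llm).score(task, response) for d in DIMENSIONS}
```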
ArXiv Domain 2025-07-31
Data source: ArXiv Domain
LLM Domain Papers

1. DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router
Large Language Models (LLMs) excel at many reasoning tasks but struggle with knowledge-intensive queries due to their inability to dynamically access up-to-date or domain-specific information. Retrieval-Augmented Generation (RAG) has emerged as a promising solution, enabling LLMs to ground their responses in external sources. However, existing RAG methods lack fine-grained control over both the qu ...
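The abstract is cut off before DeepSieve's specifics, so the sketch below shows only the generic LLM-as-a-knowledge-router pattern its title names: the LLM first picks a source for the query, then answers grounded in that source's retrieval. The source registry, the `llm` callable, and the fallback rule are assumptions of mine:

```python
from typing import Callable, Dict

def route_and_answer(
    question: str,
    sources: Dict[str, Callable[[str], str]],  # source name -> retriever
    llm: Callable[[str], str],                 # any text-in/text-out LLM call
) -> str:
    """Generic route-then-RAG: ask the LLM which source fits the question,
    retrieve from that source, then answer from the evidence alone."""
    choice = llm(
        f"Question: {question}\n"
        f"Available sources: {', '.join(sources)}\n"
        "Name the single best source, verbatim."
    ).strip()
    retriever = sources.get(choice) or next(iter(sources.values()))  # fallback
    evidence = retriever(question)
    return llm(f"Answer using only this evidence:\n{evidence}\n\nQuestion: {question}")
```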