ArXiv Domain 2025-11-10
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding. Electroencephalography (EEG) is a non-invasive technique for measuring and recording the brain's electrical activity, widely used in BCI and healthcare applications. Early EEG decoding methods rely on supervised learning and are limited to specific tasks and datasets, which hinders model performance and generalizability. With the success of large language models, a growing body of studies focuses on ...
ArXiv Domain 2025-11-12
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. On the Shape of Brainscores for Large Language Models (LLMs). With the rise of Large Language Models (LLMs), the novel metric "Brainscore" emerged as a means to evaluate the functional similarity between LLMs and human brain/neural systems. Our efforts were dedicated to mining the meaning of this novel score by constructing topological features derived from human fMRI data involving 190 subjects and from 39 LLMs plus their untrained counterparts. Subsequentl ...
ArXiv Domain 2025-11-14
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. On the Shape of Brainscores for Large Language Models (LLMs). With the rise of Large Language Models (LLMs), the novel metric "Brainscore" emerged as a means to evaluate the functional similarity between LLMs and human brain/neural systems. Our efforts were dedicated to mining the meaning of this novel score by constructing topological features derived from human fMRI data involving 190 subjects and from 39 LLMs plus their untrained counterparts. Subsequentl ...
ArXiv Domain 2025-11-13
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. On the Shape of Brainscores for Large Language Models (LLMs). With the rise of Large Language Models (LLMs), the novel metric "Brainscore" emerged as a means to evaluate the functional similarity between LLMs and human brain/neural systems. Our efforts were dedicated to mining the meaning of this novel score by constructing topological features derived from human fMRI data involving 190 subjects and from 39 LLMs plus their untrained counterparts. Subsequentl ...
ArXiv Domain 2025-11-16
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference. Weight-only post-training quantization (PTQ) compresses the weights of Large Language Models (LLMs) into low-precision representations to reduce memory footprint and accelerate inference. However, the presence of outliers in weights and activations often leads to large quantization errors and severe accuracy degradation, especially in recent reasoning LLMs, where errors accumulat ...
ArXiv Domain 2025-11-15
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference. Weight-only post-training quantization (PTQ) compresses the weights of Large Language Models (LLMs) into low-precision representations to reduce memory footprint and accelerate inference. However, the presence of outliers in weights and activations often leads to large quantization errors and severe accuracy degradation, especially in recent reasoning LLMs, where errors accumulat ...
ArXiv Domain 2025-11-17
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference. Weight-only post-training quantization (PTQ) compresses the weights of Large Language Models (LLMs) into low-precision representations to reduce memory footprint and accelerate inference. However, the presence of outliers in weights and activations often leads to large quantization errors and severe accuracy degradation, especially in recent reasoning LLMs, where errors accumulat ...
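The ParoQuant entries above revolve around weight-only PTQ. As a point of reference, here is a minimal NumPy sketch of generic per-channel symmetric quantization, not ParoQuant's pairwise rotations; it only illustrates the outlier failure mode that the abstract says rotation-based schemes are designed to tame:

```python
import numpy as np

def quantize_per_channel(w: np.ndarray, bits: int = 4):
    """Symmetric per-output-channel weight quantization.

    w: (out_features, in_features) float weight matrix.
    Returns integer codes and per-channel scales such that
    w is approximately codes * scales[:, None].
    """
    qmax = 2 ** (bits - 1) - 1                       # e.g. 7 for int4
    scales = np.abs(w).max(axis=1) / qmax            # one scale per row
    scales = np.maximum(scales, 1e-8)                # avoid div-by-zero
    codes = np.clip(np.round(w / scales[:, None]), -qmax - 1, qmax)
    return codes.astype(np.int8), scales

def dequantize(codes, scales):
    return codes.astype(np.float32) * scales[:, None]

# A single outlier weight inflates the channel's scale and hence the
# rounding error of every other weight in that channel -- the failure
# mode the abstract attributes to outliers in weights and activations.
w = np.random.randn(4, 8).astype(np.float32)
w[0, 0] = 20.0                                       # injected outlier
codes, scales = quantize_per_channel(w, bits=4)
err = np.abs(w - dequantize(codes, scales)).mean()
print(f"mean abs quantization error: {err:.4f}")
```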
ArXiv Domain 2025-11-19
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. Scaling Spatial Intelligence with Multimodal Foundation Models. Despite remarkable progress, multimodal foundation models still exhibit surprising deficiencies in spatial intelligence. In this work, we explore scaling up multimodal foundation models to cultivate spatial intelligence within the SenseNova-SI family, built upon established multimodal foundations including visual understanding models (i.e., Qwen3-VL and InternVL3) and unified understanding and g ...
ArXiv Domain 2025-11-20
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. ARC Is a Vision Problem! The Abstraction and Reasoning Corpus (ARC) is designed to promote research on abstract reasoning, a fundamental aspect of human intelligence. Common approaches treat ARC as a language-oriented problem, addressed by large language models (LLMs) or recurrent reasoning models. However, although the puzzle-like tasks in ARC are inherently visual, existing research has rarely approached the problem from a vision-centric perspective ...
ArXiv Domain 2025-11-22
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. Dataset Distillation for Pre-Trained Self-Supervised Vision Models. The task of dataset distillation aims to find a small set of synthetic images such that training a model on them reproduces the performance of the same model trained on a much larger dataset of real samples. Existing distillation methods focus on synthesizing datasets that enable training randomly initialized models. In contrast, state-of-the-art vision approaches are increasingly building o ...
ArXiv Domain 2025-11-23
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. Dataset Distillation for Pre-Trained Self-Supervised Vision Models. The task of dataset distillation aims to find a small set of synthetic images such that training a model on them reproduces the performance of the same model trained on a much larger dataset of real samples. Existing distillation methods focus on synthesizing datasets that enable training randomly initialized models. In contrast, state-of-the-art vision approaches are increasingly building o ...
ArXiv Domain 2025-11-24
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. Dataset Distillation for Pre-Trained Self-Supervised Vision Models. The task of dataset distillation aims to find a small set of synthetic images such that training a model on them reproduces the performance of the same model trained on a much larger dataset of real samples. Existing distillation methods focus on synthesizing datasets that enable training randomly initialized models. In contrast, state-of-the-art vision approaches are increasingly building o ...
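The dataset distillation entries above describe the task objective. A toy gradient-matching sketch, in the style of earlier distillation work on randomly initialized models (which the paper contrasts with pre-trained encoders), makes the idea concrete; the tiny linear model and labels are illustrative placeholders, not the paper's setup:

```python
import torch
import torch.nn.functional as F

# Toy gradient-matching dataset distillation: learn synthetic points
# whose training gradient mimics the gradient on the real dataset.
torch.manual_seed(0)
n_real, n_syn, dim, classes = 256, 10, 32, 2

x_real = torch.randn(n_real, dim)
y_real = (x_real[:, 0] > 0).long()                    # toy labels
x_syn = torch.randn(n_syn, dim, requires_grad=True)   # learnable "images"
y_syn = torch.arange(n_syn) % classes

w = torch.randn(dim, classes, requires_grad=True)     # fixed toy model
opt = torch.optim.Adam([x_syn], lr=0.1)

for step in range(200):
    # Gradient of the loss on real data w.r.t. the model weights...
    g_real = torch.autograd.grad(
        F.cross_entropy(x_real @ w, y_real), w)[0]
    # ...and on synthetic data; distillation pushes the two to match.
    g_syn = torch.autograd.grad(
        F.cross_entropy(x_syn @ w, y_syn), w, create_graph=True)[0]
    loss = F.mse_loss(g_syn, g_real.detach())
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final gradient-matching loss: {loss.item():.4f}")
```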
ArXiv Domain 2025-11-25
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. The Loss of Control Playbook: Degrees, Dynamics, and Preparedness. This research report addresses the absence of an actionable definition of Loss of Control (LoC) in AI systems by developing a novel taxonomy and preparedness framework. Despite increasing policy and research attention, existing LoC definitions vary significantly in scope and timeline, hindering effective LoC assessment and mitigation. To address this issue, we draw on an extensive literatu ...
ArXiv Domain 2025-11-26
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. VDC-Agent: When Video Detailed Captioners Evolve Themselves via Agentic Self-Reflection. We present VDC-Agent, a self-evolving framework for Video Detailed Captioning that requires neither human annotations nor larger teacher models. The agent forms a closed loop of caption generation, principle-guided scoring (a score plus textual suggestions), and prompt refinement. When caption quality regresses, a self-reflection path leverages the previous chain-of-thought ...
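The VDC-Agent excerpt above describes a closed loop of generation, scoring, and refinement with a self-reflection branch on regression. A schematic Python skeleton of such a loop follows; `generate`, `score`, and `refine` are hypothetical stubs standing in for model calls, not the VDC-Agent API:

```python
from dataclasses import dataclass, field

@dataclass
class LoopState:
    prompt: str
    best_score: float = float("-inf")
    history: list = field(default_factory=list)    # (prompt, caption, score)

def generate(prompt: str) -> str:                  # hypothetical captioner
    return f"caption for: {prompt}"

def score(caption: str) -> tuple[float, str]:      # hypothetical judge
    return float(len(caption) % 7), "add more temporal detail"

def refine(prompt: str, suggestion: str) -> str:   # hypothetical refiner
    return prompt + " | " + suggestion

def caption_loop(video_id: str, rounds: int = 4) -> str:
    state = LoopState(prompt=f"Describe video {video_id} in detail.")
    best_caption = ""
    for _ in range(rounds):
        caption = generate(state.prompt)
        s, suggestion = score(caption)
        if s < state.best_score:
            # Regression: reflect on the history of attempts and restart
            # from the best prompt instead of refining blindly.
            state.prompt = max(state.history, key=lambda h: h[2])[0]
        else:
            state.best_score, best_caption = s, caption
            state.prompt = refine(state.prompt, suggestion)
        state.history.append((state.prompt, caption, s))
    return best_caption

print(caption_loop("demo_clip"))
```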
ArXiv Domain 2025-11-18
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers: 1. Optimizing Mixture of Block Attention. Mixture of Block Attention (MoBA) (Lu et al., 2025) is a promising building block for efficiently processing long contexts in LLMs: it lets queries sparsely attend to a small subset of key-value blocks, drastically reducing computational cost. However, the design principles governing MoBA's performance are poorly understood, and it lacks an efficient GPU implementation, hindering its practical adoption. In this pa ...
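The MoBA excerpt above describes queries sparsely attending to a small subset of key-value blocks. Below is a single-head NumPy toy that routes each query to its top-k blocks via block-mean key summaries; this is a sketch of the general block-sparse idea, not the paper's GPU kernel, and causality and routing details are omitted:

```python
import numpy as np

def moba_like_attention(q, k, v, block_size=4, top_k=2):
    """Toy block-sparse attention in the spirit of MoBA: each query
    scores key-value *blocks* by the block-mean key, then attends
    only within its top-k blocks."""
    n, d = k.shape
    n_blocks = n // block_size
    k_blocks = k.reshape(n_blocks, block_size, d)
    v_blocks = v.reshape(n_blocks, block_size, d)
    block_keys = k_blocks.mean(axis=1)          # (n_blocks, d) summaries

    out = np.zeros_like(q)
    for i, qi in enumerate(q):
        gate = block_keys @ qi                  # score each block
        chosen = np.argsort(gate)[-top_k:]      # top-k block indices
        ks = k_blocks[chosen].reshape(-1, d)    # gather selected keys
        vs = v_blocks[chosen].reshape(-1, d)
        logits = ks @ qi / np.sqrt(d)
        w = np.exp(logits - logits.max())       # stable softmax
        out[i] = (w / w.sum()) @ vs             # attend inside blocks
    return out

q = np.random.randn(3, 8)
k = np.random.randn(16, 8)
v = np.random.randn(16, 8)
print(moba_like_attention(q, k, v).shape)       # (3, 8)
```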