Articles
618
Tags
23
Categories
15

Home
Content
  • Paper
  • LLMs
  • Jupyter
  • Algorithm
  • PLs
Daily
  • Github
  • HotNews
  • HF
  • Arxiv
Archives
Categories
About
37.2° Blog
ArXiv Domain 2026-05-24
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. CR4T: Rewrite-Based Guardrails for Adolescent LLM Safety. Abstract: Large language models (LLMs) are increasingly embedded in adolescent digital environments, mediating information seeking, advice, and emotionally sensitive interactions. Yet existing safety mechanisms remain largely grounded in adult-centric norms and operationalize safety through refusal-oriented suppression. While such approaches may reduce immediate policy violations, they can also create ...
ArXiv Domain 2026-05-26
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Evaluating Large Language Models in a Complex Hidden Role Game. Abstract: Quantifying the deceptive potential of Large Language Models (LLMs) is critical for AI safety, yet difficult to achieve in uncontrolled environments. This work investigates the reasoning, persuasion, and deceptive capabilities of LLMs within the social deduction game Secret Hitler. I introduce an open-source framework and novel metrics to measure performance: Role Identification Accurac ...
ArXiv Domain 2026-05-18
Created 2019-06-18 | AI
Data source: ArXiv Domain. LLM Domain Papers. 1. Merging Methods for Multilingual Knowledge Editing for Large Language Models: An Empirical Odyssey. Abstract: Multilingual knowledge editing (MKE) remains challenging because language-specific edits interfere with one another, even when locate-then-edit methods work well in monolingual settings. This paper focuses on three issues: the effectiveness of vector merging methods for MKE, the extent to which Task Singular Vectors for Merging (TSVM) can reduce multi ...
Firefly
A firefly flying freely in the AI domain.
Announcement
Welcome to My Personal Blog!
If Not, Please Visit Gitee Mirror.
Recent Post
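Retrieval-Augmented LLM (2024-01-13)
LLMs Open Course - 6. Large Models for Text Understanding and Generation (2024-01-10)
LLMs Open Course - 5. Efficient Training & Model Compression (2024-01-07)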
Categories
  • AI (243)
  • Cython (1)
  • DSA (24)
  • GitHub (142)
  • HotNews (142)
Tags
DSA, RL, Transformer, LLMs, PL, PaperReading, DeepLearning, CV, GPT, domain, github, hf, hot_news, ArXiv, Domain, AI, GitHub, Trending, HuggingFace, Papers, HotNews, leetcode, algo
Archives
  • January 2024 (5)
  • December 2023 (14)
  • November 2023 (26)
  • October 2023 (1)
  • September 2023 (4)
Info
Article: 618
Total Count: 35644.3k
©2023 - 2026 By Firefly