37.2° Blog

ArXiv Domain 2026-05-26

Created 2019-06-18 | AI

Data source: ArXiv Domain

LLM Domain Papers

1. Evaluating Large Language Models in a Complex Hidden Role Game

Abstract: Quantifying the deceptive potential of Large Language Models (LLMs) is critical for AI safety, yet difficult to achieve in uncontrolled environments. This work investigates the reasoning, persuasion, and deceptive capabilities of LLMs within the social deduction game Secret Hitler. I introduce an open-source framework and novel metrics to measure performance: Role Identification Accurac ...

ArXiv Domain 2026-05-31

Created 2019-06-18 | AI

Data source: ArXiv Domain

LLM Domain Papers

1. Lightweight Multimodal LLM-Enabled Cost-Effective Defect Grading of Power Transmission Equipment

Abstract: Defect grading of power transmission equipment (DGPTE) is crucial to the stability of electric energy transmission. Although existing machine learning methods exhibit strong capabilities in defect detection, they struggle to integrate expert experience and to handle class imbalance in the more refined field of defect grading. To address this ...