GitHub Trending 2026-03-16
Data source: github.com/trending
lightpanda-io/browser
Lightpanda: the headless browser designed for AI and automation
⭐ Stars: 18471
🍴 Forks: 0
📝 Language: Zig
Crosstalk-Solutions/project-nomad
Project N.O.M.A.D. is a self-contained, offline survival computer packed with critical tools, knowledge, and AI to keep you informed and empowered, anytime, anywhere.
⭐ Stars: 1068
🍴 Forks: 0
📝 Language: TypeScript
volcengine/OpenViking
OpenViking is an open-source context database design ...
GitHub Trending 2026-03-17
Data source: github.com/trending
666ghj/MiroFish
A Simple and Universal Swarm Intelligence Engine, Predicting Anything.
⭐ Stars: 29906
🍴 Forks: 0
📝 Language: Python
thedotmack/claude-mem
A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude’s agent-sdk), and injects relevant context back into future sessions.
⭐ Stars: 36750
🍴 Forks: 0
📝 Language: TypeScript
Crosstalk-Solutions/pr ...
GitHub Trending 2026-03-19
Data source: github.com/trending
jarrodwatts/claude-hud
A Claude Code plugin that shows what’s happening - context usage, active tools, running agents, and todo progress
⭐ Stars: 6890
🍴 Forks: 0
📝 Language: JavaScript
obra/superpowers
An agentic skills framework & software development methodology that works.
⭐ Stars: 96095
🍴 Forks: 0
📝 Language: Shell
unslothai/unsloth
Unified web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally.
⭐ Star ...
GitHub Trending 2026-03-20
Data source: github.com/trending
opendataloader-project/opendataloader-pdf
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
⭐ Stars: 5537
🍴 Forks: 0
📝 Language: Java
langchain-ai/open-swe
An Open-Source Asynchronous Coding Agent
⭐ Stars: 7009
🍴 Forks: 0
📝 Language: Python
obra/superpowers
An agentic skills framework & software development methodology that works.
⭐ Stars: 99110
🍴 Forks: 0
📝 Language: Shell
jarrodwatts/claude-hud
A Claude Code plugin ...
GitHub Trending 2026-03-21
Data source: github.com/trending
jarrodwatts/claude-hud
A Claude Code plugin that shows what’s happening - context usage, active tools, running agents, and todo progress
⭐ Stars: 9513
🍴 Forks: 0
📝 Language: JavaScript
langchain-ai/open-swe
An Open-Source Asynchronous Coding Agent
⭐ Stars: 7623
🍴 Forks: 0
📝 Language: Python
obra/superpowers
An agentic skills framework & software development methodology that works.
⭐ Stars: 101485
🍴 Forks: 0
📝 Language: Shell
opendataload ...
GitHub Trending 2026-03-22
Data source: github.com/trending
FujiwaraChoki/MoneyPrinterV2
Automate the process of making money online.
⭐ Stars: 17704
🍴 Forks: 0
📝 Language: Python
systemd/systemd
The systemd System and Service Manager
⭐ Stars: 15720
🍴 Forks: 0
📝 Language: C
aquasecurity/trivy
Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
⭐ Stars: 33366
🍴 Forks: 0
📝 Language: Go
Crosstalk-Solutions/project-nomad
Project N.O.M.A.D. is ...
HuggingFace Papers 2026-02-01
Data source: HuggingFace Papers
Latest Papers
1. Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives
Autonomous scientific discovery with large language model (LLM)-based agents has recently made substantial progress, demonstrating the ability to automate end-to-end research workflows. However, existing systems largely rely on runtime-centric execution paradigms, repeatedly reading, summarizing, and reasoning over large volumes of scientific literatur ...
HuggingFace Papers 2026-02-02
Data source: HuggingFace Papers
Latest Papers
1. Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives
Autonomous scientific discovery with large language model (LLM)-based agents has recently made substantial progress, demonstrating the ability to automate end-to-end research workflows. However, existing systems largely rely on runtime-centric execution paradigms, repeatedly reading, summarizing, and reasoning over large volumes of scientific literatur ...
HuggingFace Papers 2026-02-03
Data source: HuggingFace Papers
Latest Papers
1. ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas
Large language models (LLMs) are increasingly used as tool-augmented agents for multi-step decision making, yet training robust tool-using agents remains challenging. Existing methods still require manual intervention, depend on non-verifiable simulated environments, rely exclusively on either supervised fine-tuning (SFT) or reinforcement learning (RL), and struggle with stable lo ...
HuggingFace Papers 2026-02-04
Data source: HuggingFace Papers
Latest Papers
1. Green-VLA: Staged Vision-Language-Action Model for Generalist Robots
We introduce Green-VLA, a staged Vision-Language-Action (VLA) framework for real-world deployment on the Green humanoid robot while maintaining generalization across diverse embodiments. Green-VLA follows a five-stage curriculum: (L0) foundational VLMs, (L1) multimodal grounding, (R0) multi-embodiment pretraining, (R1) embodiment-specific adaptation, and (R2) reinforcement-learning (RL) ...
HuggingFace Papers 2026-02-05
Data source: HuggingFace Papers
Latest Papers
1. CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding
Large Language Models (LLMs) have achieved remarkable success in source code understanding, yet as software systems grow in scale, computational efficiency has become a critical bottleneck. Currently, these models rely on a text-based paradigm that treats source code as a linear sequence of tokens, which leads to a linear increase in context length and associated computational c ...
HuggingFace Papers 2026-02-07
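Data source: HuggingFace Papers
Latest Papers
1. CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty
LLM Analysis
Q: What problem does this paper try to solve?
The paper targets two major blind spots in existing evaluation benchmarks for large language model (LLM) agents, and it raises and addresses the following core problems:
Missing reliability under real-world uncertainty: Existing benchmarks mostly evaluate task completion rate under idealized conditions where information is complete and every tool is available. They overlook that in real scenarios (such as in-car voice assistants) user requests often run into:
Missing tools or insufficient parameter granularity
Environment queries that return incomplete data
Either case makes the request inherently unsatisfiable or highly ambiguous. The agent then needs to recognize what it cannot do and proactively disambiguate, rather than continue generating plausible-looking hallucinated results.
Missing consistency evaluation: Existing metrics only measure "at least one success" (Pass@k) and cannot reveal whether an agent consistently follows policies and consistently recognizes its own capability limits across multiple turns and repeated runs. The paper proposes Pass^k (all k attempts succeed) as the primary metric to quantify deployment-level consistency.
Gap in systematic evaluation of new task types: The paper introduces two classes of real-world failure modes and ...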
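The difference between Pass@k (at least one of k attempts succeeds) and Pass^k (all k attempts succeed) can be illustrated with a small sketch. This is not code from the CAR-bench paper: the function names, the `results` layout, and the task outcomes below are made up for illustration, and the paper may estimate these quantities differently.

```python
from typing import Dict, List


def pass_at_k(outcomes: List[bool]) -> bool:
    """Pass@k for one task: at least one of the k attempts succeeded."""
    return any(outcomes)


def pass_hat_k(outcomes: List[bool]) -> bool:
    """Pass^k for one task: every one of the k attempts succeeded."""
    return all(outcomes)


def summarize(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Average both per-task metrics over all tasks."""
    n = len(results)
    return {
        "pass@k": sum(pass_at_k(o) for o in results.values()) / n,
        "pass^k": sum(pass_hat_k(o) for o in results.values()) / n,
    }


# Hypothetical outcomes: k = 3 attempts for each of four tasks.
results = {
    "task_a": [True, True, True],     # consistent success
    "task_b": [True, False, True],    # succeeds sometimes
    "task_c": [False, False, False],  # never succeeds
    "task_d": [True, True, True],     # consistent success
}

scores = summarize(results)
print(scores)
```

With these made-up outcomes, Pass@k is 0.75 while Pass^k is 0.5: `task_b` counts as a success under Pass@k but not under Pass^k, which is the inconsistency the stricter metric is meant to expose.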
HuggingFace Papers 2026-02-08
Data source: HuggingFace Papers
Latest Papers
1. CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty
Existing benchmarks for Large Language Model (LLM) agents focus on task completion under idealistic settings but overlook reliability in real-world, user-facing applications. In domains, such as in-car voice assistants, users often issue incomplete or ambiguous requests, creating intrinsic uncertainty that agents must manage through dialogue, tool use, and ...
HuggingFace Papers 2026-02-09
Data source: HuggingFace Papers
Latest Papers
1. CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty
Existing benchmarks for Large Language Model (LLM) agents focus on task completion under idealistic settings but overlook reliability in real-world, user-facing applications. In domains, such as in-car voice assistants, users often issue incomplete or ambiguous requests, creating intrinsic uncertainty that agents must manage through dialogue, tool use, and ...
HuggingFace Papers 2026-02-10
Data source: HuggingFace Papers
Latest Papers
1. F-GRPO: Don’t Let Your Policy Learn the Obvious and Forget the Rare
Reinforcement Learning with Verifiable Rewards (RLVR) is commonly based on group sampling to estimate advantages and stabilize policy updates. In practice, large group sizes are not feasible due to computational limits, which biases learning toward trajectories that are already likely. Smaller groups often miss rare-correct trajectories while still containing mixed rewards, concentrating ...
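As a rough illustration of the group-sampling idea in this abstract, the sketch below computes commonly used group-relative advantages (reward minus group mean, divided by group standard deviation) and the chance that a group contains no correct sample at all. It is not the F-GRPO method, whose details are cut off above. The function names, the use of the population standard deviation, the `eps` term, and the assumption that samples are independent with a fixed success probability are all illustrative choices.

```python
import statistics
from typing import List


def group_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against its own sampling group.

    Assumption: population standard deviation plus a small eps to avoid
    division by zero when all rewards in the group are equal.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def prob_group_misses(p_correct: float, group_size: int) -> float:
    """Probability that none of group_size independent samples is correct."""
    return (1.0 - p_correct) ** group_size


# One correct trajectory (reward 1.0) in a hypothetical group of four.
adv = group_advantages([1.0, 0.0, 0.0, 0.0])
print(adv)

# Hypothetical rare-correct case: each sample is correct with probability 0.05.
miss_small = prob_group_misses(0.05, 8)
miss_large = prob_group_misses(0.05, 64)
print(miss_small, miss_large)
```

Under these assumptions a group of 8 misses the rare-correct trajectory far more often than a group of 64, which matches the abstract's point that small groups tend not to see rare successes.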