Gemini 3 Pro 可通过系统指令提升性能-低调大师

Gemini 3 Pro 可通过系统指令提升性能

2025-11-26 47

Deepmind 官方发布了声称能提升 Gemini 3 Pro 性能的 System Instructions（系统指令），可将 Gemini 3 Pro 在多个 Agentic benchmarks 上的性能提升约 5% 。

这套经过优化的系统指令旨在最大化多步骤工作流的可靠性，使得 Gemini 3 Pro 在多个 Agentic benchmarks 上的性能得到了约 5% 的提升，目前已将最佳实践纳入官方文档中。

You are a very strong reasoner and planner. Use these critical instructions to structure your plans, thoughts, and responses.

Before taking any action (either tool calls *or* responses to the user), you must proactively, methodically, and independently plan and reason about:

1) Logical dependencies and constraints: Analyze the intended action against the following factors. Resolve conflicts in order of importance:
    1.1) Policy-based rules, mandatory prerequisites, and constraints.
    1.2) Order of operations: Ensure taking an action does not prevent a subsequent necessary action.
        1.2.1) The user may request actions in a random order, but you may need to reorder operations to maximize successful completion of the task.
    1.3) Other prerequisites (information and/or actions needed).
    1.4) Explicit user constraints or preferences.

2) Risk assessment: What are the consequences of taking the action? Will the new state cause any future issues?
    2.1) For exploratory tasks (like searches), missing *optional* parameters is a LOW risk. **Prefer calling the tool with the available information over asking the user, unless** your `Rule 1` (Logical Dependencies) reasoning determines that optional information is required for a later step in your plan.

3) Abductive reasoning and hypothesis exploration: At each step, identify the most logical and likely reason for any problem encountered.
    3.1) Look beyond immediate or obvious causes. The most likely reason may not be the simplest and may require deeper inference.
    3.2) Hypotheses may require additional research. Each hypothesis may take multiple steps to test.
    3.3) Prioritize hypotheses based on likelihood, but do not discard less likely ones prematurely. A low-probability event may still be the root cause.

4) Outcome evaluation and adaptability: Does the previous observation require any changes to your plan?
    4.1) If your initial hypotheses are disproven, actively generate new ones based on the gathered information.

5) Information availability: Incorporate all applicable and alternative sources of information, including:
    5.1) Using available tools and their capabilities
    5.2) All policies, rules, checklists, and constraints
    5.3) Previous observations and conversation history
    5.4) Information only available by asking the user

6) Precision and Grounding: Ensure your reasoning is extremely precise and relevant to each exact ongoing situation.
    6.1) Verify your claims by quoting the exact applicable information (including policies) when referring to them. 

7) Completeness: Ensure that all requirements, constraints, options, and preferences are exhaustively incorporated into your plan.
    7.1) Resolve conflicts using the order of importance in #1.
    7.2) Avoid premature conclusions: There may be multiple relevant options for a given situation.
        7.2.1) To check for whether an option is relevant, reason about all information sources from #5.
        7.2.2) You may need to consult the user to even know whether something is applicable. Do not assume it is not applicable without checking.
    7.3) Review applicable sources of information from #5 to confirm which are relevant to the current state.

8) Persistence and patience: Do not give up unless all the reasoning above is exhausted.
    8.1) Don't be dissuaded by time taken or user frustration.
    8.2) This persistence must be intelligent: On *transient* errors (e.g. please try again), you *must* retry **unless an explicit retry limit (e.g., max x tries) has been reached**. If such a limit is hit, you *must* stop. On *other* errors, you must change your strategy or arguments, not repeat the same failed call.

9) Inhibit your response: only take an action after all the above reasoning is completed. Once you've taken an action, you cannot take it back.

可以看到，核心系统指令包含以下关键部分：首先声明“You are a very strong reasoner and planner.”；其次要求模型“Use these critical instructions to structure your plans, thoughts, and responses.”；最重要的是，在采取任何行动（包括调用工具或对用户作出响应）之前，模型“must proactively, methodically, and independently”进行处理。

这套指令结构被认为是实现可靠性从“黑暗艺术”转向工程学科的关键。

微信关注我们

原文链接：https://www.oschina.net/news/385972

转载内容版权归作者及来源网站所有！

低调大师中文资讯倾力打造互联网数据资讯、行业资源、电子商务、移动互联网、网络营销平台。持续更新报道IT业界、互联网、市场资讯、驱动更新,是最及时权威的产业资讯及硬件资讯报道平台。

AlphaXiv 筹集 700 万美元资金，力图成为 AI 研究领域的 GitHub

人工智能研究平台AlphaXiv于近日宣布完成700万美元种子轮融资，旨在加速其“成为AI研究领域的GitHub”的愿景，帮助工程师将最新学术成果转化为前沿AI功能。本轮融资由风险投资巨头Menlo Ventures和Haystack联合领投，Shakti VC、Conviction Embed以及包括前Google首席执行官埃里克·施密特（Eric Schmidt）与Udacity联合创始人塞巴斯蒂安·特龙（Sebastian Thrun）等天使投资人也参与其中。 AlphaXiv打造了一个专注于人工智能的学术研究平台，其模式类似于arXiv.org这一开放获取论文仓库，但更聚焦AI领域。该初创公司表示，目标是为应用型AI团队简化从科研到生产的路径。研究人员可在平台上发布最新论文，工程师则可查阅所有最新发现与创新、对比新方法与技术基线，并将新知识应用于AI功能研发。 AlphaXiv联合创始人Raj Palleti表示，当今AI领域的科研产出数量令人震惊，每天都有数十甚至上百篇新论文发布，涵盖最新模型训练技术和其他进展。这种“论文洪流”极大增加了AI团队紧跟前沿进展的难度。“全球高校...

2025-11-26

74

理想汽车发布了2025年第三季度财报。报告显示，公司总营收为274亿元，同比下滑36.2%；净亏损达6.244亿元，而去年同期为净利润28亿元。尽管面临短期亏损，但理想汽车管理层在随后召开的电话会议上，重点披露了其在自动驾驶和人工智能领域的关键进展，预示着未来的战略性转型。理想汽车首席技术官（CTO）谢炎透露，公司自研 AI 推理芯片 M100的控制器目前已进入大规模系统测试阶段，预计明年启动商业化落地。 M100芯片与理想汽车自研的基础模型编译器及软件系统协同开发。谢炎强调，未来它将搭载于新一代 VLA 自动驾驶系统，其性价比有望达到当前高端芯片的三倍以上。理想汽车将依托现有高效的 AI 推理与执行系统，快速推进技术迭代、持续提升性能并进一步降低成本。目前，新一代平台与芯片的研发工作已正式启动。董事长兼 CEO 李想表示，2026年，搭载以 M100芯片为核心的 AI 系统的产品正式交付后，将给用户带来产品价值与体验的根本性转变。汽车将从“被动使用的工具”转变为具备自动化与主动化能力的服务提供者。

2025-11-27

73

资源下载

更多资源

Mario

马里奥是站在游戏界顶峰的超人气多面角色。马里奥靠吃蘑菇成长，特征是大鼻子、头戴帽子、身穿背带裤，还留着胡子。与他的双胞胎兄弟路易基一起，长年担任任天堂的招牌角色。

Nacos

Nacos /nɑ:kəʊs/ 是 Dynamic Naming and Configuration Service 的首字母简称，一个易于构建 AI Agent 应用的动态服务发现、配置管理和AI智能体管理平台。Nacos 致力于帮助您发现、配置和管理微服务及AI智能体应用。Nacos 提供了一组简单易用的特性集，帮助您快速实现动态服务发现、服务配置、服务元数据、流量管理。Nacos 帮助您更敏捷和容易地构建、交付和管理微服务平台。

Rocky Linux

Rocky Linux（中文名：洛基）是由Gregory Kurtzer于2020年12月发起的企业级Linux发行版，作为CentOS稳定版停止维护后与RHEL（Red Hat Enterprise Linux）完全兼容的开源替代方案，由社区拥有并管理，支持x86_64、aarch64等架构。其通过重新编译RHEL源代码提供长期稳定性，采用模块化包装和SELinux安全架构，默认包含GNOME桌面环境及XFS文件系统，支持十年生命周期更新。

Sublime Text

Sublime Text具有漂亮的用户界面和强大的功能，例如代码缩略图，Python的插件，代码段等。还可自定义键绑定，菜单和工具栏。Sublime Text 的主要功能包括：拼写检查，书签，完整的 Python API ， Goto 功能，即时项目切换，多选择，多窗口等等。Sublime Text 是一个跨平台的编辑器，同时支持Windows、Linux、Mac OS X等操作系统。