Monotonicity of MAT #42

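Hello, and thank you very much for your open-source work. I have recently been reading the MAT paper and have some questions about MAT's monotonicity.

The paper says relatively little about how monotonic improvement is guaranteed, and I could not fully follow the argument. I have two questions.

First, regarding Equation 5, the paper refers to the new decisions of the preceding agents 1 through m-1. Can this be understood as being conditioned on the preceding agents' already-updated policies? Also, the notation for the action a is the same as that used for the decoder output; do the two mean the same thing?

Second, the paper states that MAT does not need to wait for the preceding agents' policy updates, so the optimization objectives can be computed in parallel. According to the algorithm, however, the actions fed in during this parallel computation are those of the current policy, and each agent's policy update also uses the other agents' pre-update policy parameters. How does this reflect the idea of sequential updates?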
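To make the second question concrete, here is a minimal sketch of the parallel computation as I understand it. It is not the repository's code: `toy_logits` is a hypothetical stand-in for the decoder, every function name is made up for illustration, and the use of one shared advantage value for all agents is an assumption. It only shows that, once the sampled actions are fed in under a causal mask, each agent's log-probability can be computed in one pass and matches the agent-by-agent result, with the old parameters appearing only in the ratio's denominator.

```python
import math

def log_softmax_at(logits, index):
    # Numerically stable log-probability of one chosen action.
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(x - top) for x in logits))
    return logits[index] - log_norm

def toy_logits(theta, obs_m, context):
    # Hypothetical stand-in for one decoder position: logits depend on the
    # agent's observation feature and a summary of preceding agents' actions.
    return [w * (obs_m + context) for w in theta]

def sequential_log_probs(theta, obs, actions):
    # Agent by agent: position m only sees actions[0:m].
    log_probs = []
    for m in range(len(actions)):
        context = sum(a + 1 for a in actions[:m])
        logits = toy_logits(theta, obs[m], context)
        log_probs.append(log_softmax_at(logits, actions[m]))
    return log_probs

def parallel_log_probs(theta, obs, actions):
    # All positions at once: a strictly lower-triangular (causal) mask picks
    # out each agent's prefix, so no position waits for another.
    n = len(actions)
    causal_mask = [[1 if j < m else 0 for j in range(n)] for m in range(n)]
    contexts = [
        sum(causal_mask[m][j] * (actions[j] + 1) for j in range(n))
        for m in range(n)
    ]
    return [
        log_softmax_at(toy_logits(theta, obs[m], contexts[m]), actions[m])
        for m in range(n)
    ]

def clipped_surrogate(theta_new, theta_old, obs, actions, advantage, eps=0.2):
    # PPO-style clipped objective averaged over agents. Both numerator and
    # denominator are conditioned on the same sampled (buffer) actions.
    new_lp = parallel_log_probs(theta_new, obs, actions)
    old_lp = parallel_log_probs(theta_old, obs, actions)
    total = 0.0
    for lp_new, lp_old in zip(new_lp, old_lp):
        ratio = math.exp(lp_new - lp_old)
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * advantage, clipped * advantage)
    return total / len(actions)

# Illustrative values only.
obs = [0.5, -1.0, 2.0]
actions = [1, 0, 2]
theta_old = [0.1, -0.2, 0.3]
theta_new = [0.15, -0.1, 0.25]

print(sequential_log_probs(theta_old, obs, actions))
print(parallel_log_probs(theta_old, obs, actions))
print(clipped_surrogate(theta_new, theta_old, obs, actions, advantage=1.0))
```

In this sketch the prefix that agent m is conditioned on consists of sampled actions, not of the preceding agents' updated policies, which is exactly the point I would like clarified.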

![Image](https://github.com/user-attachments/assets/d0f41876-cd37-4d88-a566-7ca6ba14ffe5)

![Image](https://github.com/user-attachments/assets/15db88eb-b314-4da9-9d80-6cf91d78790e)

![Image](https://github.com/user-attachments/assets/67b1ed8d-a0de-4d6a-b068-ec183d8850df)

![Image](https://github.com/user-attachments/assets/ed3b8b42-7e20-4832-9ce5-1f7b15e70040)
