Generating Diverse Cooperative Agents by Learning Incompatible Policies
ICLR 2023 (Spotlight, 5.65% accept rate, tied for 24th highest rated paper)
ICML 2022 AI4ABM Workshop (Spotlight)
Rujikorn Charakorn, Poramate Manoonpong, and Nat Dilokthanakul.
Paper | Project Site | Code
Summary
In this work, we propose learning diverse behaviors via policy compatibility. Conceptually, policy compatibility measures whether the policies of interest can coordinate effectively. We theoretically show that incompatible policies are not similar, so policy compatibility, previously used only as a measure of robustness, can also serve as a proxy for learning diverse behaviors. We then incorporate the proposed objective into a population-based training scheme that allows concurrent training of multiple agents. Additionally, we use state-action information to induce local variations of each policy. Empirically, the proposed method consistently discovers more solutions than baseline methods across various multi-goal cooperative environments. Finally, in multi-recipe Overcooked, we show that our method produces populations of behaviorally diverse agents, and that generalist agents trained with such populations are more robust.
This work was also presented at the AI4ABM Workshop @ ICML 2022 (spotlight). [pdf] [poster]
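To make the objective concrete, here is a minimal PyTorch sketch of the incompatibility idea described above: each policy maximizes its self-play return (compatibility with its own partner) while minimizing its cross-play return against the rest of the population. The function name `lipo_objective`, the trade-off coefficient `lam`, and summarizing the population penalty by the maximum cross-play return are illustrative assumptions, not the paper's exact implementation; see the released code for the actual training procedure.

```python
import torch

def lipo_objective(sp_return: torch.Tensor,
                   xp_returns: list[torch.Tensor],
                   lam: float = 1.0) -> torch.Tensor:
    """Illustrative diversity objective based on policy (in)compatibility.

    A policy is rewarded for coordinating well with its own partner
    (self-play return) and penalized for coordinating well with any other
    policy in the population (cross-play returns). Here the penalty is
    the best-performing (maximum) cross-play pairing, and `lam` is a
    hypothetical trade-off coefficient.
    """
    xp_penalty = torch.stack(xp_returns).max()
    # The agent ascends this objective, e.g., via policy-gradient
    # estimates of each return term.
    return sp_return - lam * xp_penalty

# Hypothetical usage with returns estimated from rollouts:
sp = torch.tensor(0.9)                       # paired with its own partner
xp = [torch.tensor(0.4), torch.tensor(0.7)]  # paired with other policies
loss = -lipo_objective(sp, xp, lam=0.5)      # negate to minimize with an optimizer
```

The state-action information mentioned above, which induces local variations within each policy, is a separate term in the paper and is omitted from this sketch for brevity.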
Citation
@inproceedings{charakorn2023generating,
  title={Generating Diverse Cooperative Agents by Learning Incompatible Policies},
  author={Rujikorn Charakorn and Poramate Manoonpong and Nat Dilokthanakul},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023},
  url={https://openreview.net/forum?id=UkU05GOH7_6}
}