To improve semiconductor performance, manufacturers have dramatically reduced the circuit width on wafers. This increases the importance of quality control during the wafer fabrication process. Consequently, fabs now tend to clean each chamber at a predetermined period to remove chemical residues and heat from the chamber. Such chamber cleaning improves wafer quality but lowers productivity, so quality and productivity are in a trade-off that depends on the cleaning period. In this paper, we propose a new class of cleaning process, condition-based cleaning, which aims to maximize productivity while maintaining wafer quality. We then propose a method for scheduling cluster tools based on multi-agent reinforcement learning. Finally, we experimentally verify that our algorithm can achieve higher performance than existing sequences under condition-based cleaning.