ADAPT: Benchmarking Commonsense Planning under Unspecified Affordance Constraints

1 National Taiwan University
2 National Yang Ming Chiao Tung University
3 National Tsing Hua University
ACL 2026

Abstract

Intelligent embodied agents should not simply follow instructions verbatim, as real-world environments often involve unexpected conditions and exceptions. However, existing methods typically focus on executing instructions directly, without checking whether the target objects can actually be manipulated; that is, they lack the ability to assess available affordances. To address this limitation, we introduce ADAPT, a benchmark that evaluates embodied agents in dynamic environments where object affordances may change over time and are not specified in the instruction. ADAPT requires agents to perceive object states, infer implicit preconditions, and adapt their actions accordingly. To enable this capability, we further propose Affordance-Aware Action Selection (AAS), a plug-and-play module that augments existing planners with explicit affordance reasoning. Experiments demonstrate that incorporating AAS substantially improves robustness and task success in both seen and unseen environments. We also show that a domain-adapted, LoRA-finetuned vision-language model used as the affordance-inference backend outperforms a commercial LLM (GPT-4o), highlighting the importance of task-aligned affordance grounding.

Poster

BibTeX


@misc{chen2026adaptbenchmarkingcommonsenseplanning,
      title={ADAPT: Benchmarking Commonsense Planning under Unspecified Affordance Constraints},
      author={Pei-An Chen and Yong-Ching Liang and Jia-Fong Yeh and Hung-Ting Su and Yi-Ting Chen and Min Sun and Winston Hsu},
      year={2026},
      eprint={2604.14902},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2604.14902},
}