Living review · synced from the source repository

A (LIVE) Comprehensive Review on Leveraging Machine Learning for Multi-Agent Path Finding

A living survey tracking Machine Learning approaches to solving the MAPF Problem.

🕐 Last updated: 2026-06-30

📄 Original Survey (Alkazzi & Okumura - IEEE Access 2024)

📚 All BibTeX

📋 Changelog — latest: 2026.06 (+79 papers)


Representation

Representation for Planning - OQ 1

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| CTRMs: Learning to Construct Cooperative Timed Roadmaps for Multi-Agent Path Planning in Continuous Spaces | AAMAS | 2022 | paper, code, project |
| Avoidance Critical Probabilistic Roadmaps for Motion Planning in Dynamic Environments | ICRA | 2021 | paper |

Environment Optimization - OQ 2

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Scaling Multi-Agent Environment Co-Design with Diffusion Models | ICML | 2026 | paper, code |
| Optimization of Edge Directions and Weights for Mixed Guidance Graphs in Lifelong Multi-Agent Path Finding | arXiv | 2026 | paper |
| Differentiable Environment-Trajectory Co-Optimization for Safe Multi-Agent Navigation | arXiv | 2026 | paper |
| Online Guidance Graph Optimization for Lifelong Multi-Agent Path Finding | AAAI | 2025 | paper, code |
| Generative Curricula for Multi-Agent Path Finding via Unsupervised and Reinforcement Learning | JAIR | 2025 | paper, code |
| Co-Optimizing Reconfigurable Environments and Policies for Decentralized Multi-Agent Navigation | TRO | 2025 | paper, project, video |
| Guidance Graph Optimization for Lifelong Multi-Agent Path Finding | IJCAI | 2024 | paper, code, project |
| Learning Neural Traffic Rules | RA-L | 2024 | paper |
| Arbitrarily Scalable Environment Generators via Neural Cellular Automata | NeurIPS | 2023 | paper, code |
| Multi-Robot Coordination and Layout Design for Automated Warehousing | IJCAI | 2023 | paper, code |
| Constrained Environment Optimization for Prioritized Multi-Agent Navigation | IEEE Open Journal of Control Systems | 2023 | paper |
| Environment Optimization for Multi-Agent Navigation | ICRA | 2023 | paper |

Environment Generation for MAPF Algorithm Evaluation

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| QD-MAPPER: A Quality Diversity Framework to Automatically Evaluate Multi-Agent Path Finding Algorithms in Diverse Maps | AAMAS | 2026 | paper, code, project |

Representation for Selection - OQ 3,4

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Anytime Automatic Algorithm Selection for the Multi-Agent Path Finding Problem | IEEE Access | 2024 | paper |
| No Panacea in Planning: Algorithm Selection for Suboptimal Multi-Agent Path Finding | arXiv | 2024 | paper |
| Algorithm Selection for Optimal Multi-Agent Path Finding via Graph Embedding | arXiv | 2024 | paper |
| MAPFASTER: A Faster and Simpler take on Multi-Agent Path Finding Algorithm Selection | IROS | 2022 | paper, code |
| MAPFAST: A Deep Algorithm Selector for Multi Agent Path Finding using Shortest Path Embeddings | AAMAS | 2021 | paper, code |
| Algorithm Selection for Optimal Multi-Agent Pathfinding | ICAPS | 2020 | paper, code |
| Automatic algorithm selection in multi-agent pathfinding | arXiv | 2019 | paper |

Planning

Augmenting Existing Solvers - OQ 5,6

Enhancing Conflict-Based Search

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Multi-Agent Path Finding Among Dynamic Uncontrollable Agents with Statistical Safety Guarantees | arXiv | 2025 | paper |
| Proactive Conflict Area Prediction for Boosting Search-Based Multi-Agent Pathfinding | IROS | 2025 | paper |
| Conflict Area Prediction for Boosting Search-Based Multi-Agent Pathfinding Algorithms | ICRA | 2024 | paper |
| Accelerating Multi-Agent Planning Using Graph Transformers with Bounded Suboptimality | ICRA | 2023 | paper |
| Learning Node-Selection Strategies in Bounded-Suboptimal Conflict-Based Search for Multi-Agent Path Finding | AAMAS | 2021 | paper |
| Learning to Resolve Conflicts for Multi-Agent Path Finding with Conflict-Based Search | AAAI | 2021 | paper |

Enhancing Prioritized Planning

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Learning-guided Prioritized Planning for Lifelong Multi-Agent Path Finding in Warehouse Automation | JAIR | 2026 | paper, code |
| Attention-based Priority Learning for Limited Time Multi-Agent Path Finding | AAMAS | 2024 | paper, code |
| Synthesizing priority planning formulae for multi-agent pathfinding | AIIDE | 2023 | paper |
| Learning a Priority Ordering for Prioritized Planning in Multi-Agent Path Finding | SoCS | 2022 | paper |

Enhancing other MAPF solvers

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Graph Attention-Guided Search for Dense Multi-Agent Pathfinding | AAAI | 2026 | paper, code |
| GRAND: Guidance, Rebalancing, and Assignment for Networked Dispatch in Multi-Agent Path Finding | RAL | 2026 | paper |
| Truncated Counterfactual Learning for Anytime Multi-Agent Path Finding | AAAI | 2026 | paper, code, video |
| Discrete Diffusion for Complex and Congested Multi-Agent Path Finding with Sparse Social Attention | arXiv | 2026 | paper |
| LNS2+RL: Combining Multi-Agent Reinforcement Learning with Large Neighborhood Search in Multi-Agent Path Finding | AAAI | 2025 | paper, code |
| Anytime Multi-Agent Path Finding with an Adaptive Delay-Based Heuristic | AAAI | 2025 | paper, code |
| Enhancing PIBT for Multi-Agent Path Finding via MLP-Based Candidate Selection and Priority Perturbation | IEEE Access | 2025 | paper |
| Learn to Refine: Synergistic Multi-Agent Path Optimization for Lifelong Conflict-Free Navigation of Autonomous Vehicles | KDD | 2025 | paper, code |
| Neural Neighborhood Search for Multi-agent Path Finding | ICLR | 2024 | paper, code |
| Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search | AAAI | 2024 | paper |
| NLNS-MASPF for solving Multi-Agent scheduling and Path-Finding | IROS | 2024 | paper |
| Anytime Multi-Agent Path Finding via Machine Learning-Guided Large Neighborhood Search | AAAI | 2022 | paper |
| Subdimensional Expansion Using Attention-Based Learning For Multi-Agent Path Finding | arXiv | 2021 | paper, code |

Learning-based Policies - OQ 7,8,9,10,11,12

Decentralized

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Confidence-Based Curricula for Multi-Agent Path Finding via Reinforcement Learning | JAAMAS | 2026 | paper, code |
| Multi-Agent Reinforcement Learning With Spatial Structure Awareness for Topological Map-Based Path-Finding | RAL | 2026 | paper |
| ORION: Option-Regularized Deep Reinforcement Learning for Cooperative Multi-Agent Online Navigation | RAL | 2026 | paper, code |
| Pairwise is Not Enough: Hypergraph Neural Networks for Multi-Agent Pathfinding | ICLR | 2026 | paper, code, project |
| Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding | arXiv | 2026 | paper |
| MARL-GPT: Foundation Model for Multi-Agent Reinforcement Learning | AAMAS | 2026 | paper, code, weights, dataset |
| From One to Many: Adaptive Multi-Agent Pathfinding in Heterogeneous Environments | Optical Memory and Neural Networks | 2026 | paper |
| SPARC: Spatial-Aware Path Planning via Attentive Agent Communication | arXiv | 2026 | paper |
| Mean-Field Deep Reinforcement Learning for Multi-Agent Path Finding | RAL | 2026 | paper |
| Spatially Grouped Curriculum Learning for Multi-Agent Path Finding | AAAI | 2026 | paper, code |
| Simulation-Informed Diffusion for Decentralized Multi-robot Motion Planning | arXiv | 2026 | paper |
| Social Behavior as a Key to Learning-based Multi-Agent Pathfinding Dilemmas | AIJ | 2025 | paper, code, project |
| MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale | AAAI | 2025 | paper, code, project, weights, dataset, notebooks |
| Work Smarter Not Harder: Simple Imitation Learning with CS-PIBT Outperforms Large Scale Imitation Learning for MAPF | ICRA | 2025 | paper, code, project |
| Deploying Ten Thousand Robots: Scalable Imitation Learning for Lifelong Multi-Agent Path Finding | ICRA | 2025 | paper, code, project |
| SRMT: Shared Memory for Multi-agent Lifelong Pathfinding | arXiv | 2025 | paper, code |
| SIGMA: Sheaf-Informed Geometric Multi-Agent Pathfinding | ICRA | 2025 | paper, code |
| Learning Verified Safe Neural Network Controllers for Multi-Agent Path Finding | AAAI | 2025 | paper, video |
| MARF: Cooperative Multi-Agent Path Finding with Reinforcement Learning and Frenet Lattice in Dynamic Environments | ICRA | 2025 | paper |
| Towards Transparent Multi-Agent Autonomous Systems Through Principled Multi-Source Knowledge Distillation | ICRA | 2025 | paper |
| Advancing Learnable Multi-Agent Pathfinding Solvers with Active Fine-Tuning | arXiv | 2025 | paper, code, project |
| MAPF-World: Action World Model for Multi-Agent Path Finding | arXiv | 2025 | paper |
| Towards Information-Optimized Multi-Agent Path Finding: A Hybrid Framework with Reduced Inter-Agent Information Sharing | arXiv | 2025 | paper |
| PC2P: Multi-Agent Path Finding via Personalized-Enhanced Communication and Crowd Perception | IROS | 2025 | paper |
| STF: Spatio-Temporal Fusion-Based Multi-Agent Path-Finding | RAL | 2025 | paper |
| Improving Learnt Local MAPF Policies with Heuristic Search | ICAPS | 2024 | paper, extras |
| Decentralized Monte Carlo Tree Search for Partially Observable Multi-agent Pathfinding | AAAI | 2024 | paper, code |
| Learn to Follow: Decentralized Lifelong Multi-agent Pathfinding via Planning and Learning | AAAI | 2024 | paper, code |
| When to Switch: Planning and Learning for Partially Observable Multi-Agent Pathfinding | IEEE TNNLS | 2024 | paper, code |
| Optimizing Crowd-Aware Multi-Agent Path Finding through Local Communication with Graph Neural Networks | IROS | 2024 | paper, project |
| POAQL: A Partially Observable Altruistic Q-Learning Method for Cooperative Multi-Agent Reinforcement Learning | ICRA | 2024 | paper |
| Crowd Perception Communication-Based Multi-Agent Path Finding With Imitation Learning | RAL | 2024 | paper |
| MFC-EQ: Mean-Field Control with Envelope Q-Learning for Moving Decentralized Agents in Formation | IROS | 2024 | paper, code |
| ALPHA: Attention-based Long-horizon Pathfinding in Highly-structured Areas | ICRA | 2024 | paper, code |
| SCRIMP: Scalable Communication for Reinforcement- and Imitation-Learning-Based Multi-Agent Pathfinding | AAMAS | 2023 | paper, code |
| SACHA: Soft Actor-Critic with Heuristic-Based Attention for Partially Observable Multi-Agent Path Finding | RAL | 2023 | paper, code |
| Learning Selective Communication for Multi-Agent Path Finding | RAL | 2022 | paper, code |
| Multi-agent path finding with prioritized communication learning | ICRA | 2022 | paper, code |
| Distributed Heuristic Multi-Agent Path Finding with Communication | ICRA | 2021 | paper, code |
| Message-Aware Graph Attention Networks for Large-Scale Multi-Robot Path Planning | RAL | 2021 | paper, code |
| PRIMAL2: Pathfinding Via Reinforcement and Imitation Multi-Agent Learning - Lifelong | RAL | 2021 | paper, code |
| Mobile robot path planning in dynamic environments through globally guided reinforcement learning | RAL | 2020 | paper |
| Mapper: Multi-agent path planning with evolutionary reinforcement learning in mixed dynamic environments | IROS | 2020 | paper |
| Graph Neural Networks for Decentralized Multi-Robot Path Planning | IROS | 2019 | paper, code |
| PRIMAL: Pathfinding via Reinforcement and Imitation Multi-Agent Learning | RAL | 2019 | paper, code |

Centralized

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Discrete-Guided Diffusion for Scalable and Safe Multi-Robot Motion Planning | AAAI | 2026 | paper |
| Multi-Robot Motion Planning from Vision and Language using Heat-Inspired Diffusion | RAL | 2026 | paper, code, project |
| Train-Small Deploy-Large: Leveraging Diffusion-Based Multi-Robot Planning | arXiv | 2026 | paper |
| DeepFleet: Multi-Agent Foundation Models for Mobile Robots | arXiv | 2025 | paper |
| RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding Across Different Environments and Tasks | IROS | 2025 | paper, code |
| Multi-Robot Motion Planning with Diffusion Models | ICLR | 2025 | paper, code, project |
| Simultaneous Multi-Robot Motion Planning with Projected Diffusion Models | ICML | 2025 | paper, code, project |
| Why Solving Multi-agent Path Finding with Large Language Model has not Succeeded Yet | arXiv | 2024 | paper |
| Multi-Agent Path Finding in Continuous Spaces with Projected Diffusion Models | arXiv | 2024 | paper |

Execution

Travel and Action Time Modeling - OQ 13

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Conflict Mitigation in Shared Environments using Flow-Aware Multi-Agent Path Finding | ICRA | 2026 | paper |
| From Discrete Plans to Real-World Execution: A World-Model-Driven Framework for Execution-Aware Multi-Agent Path Finding | arXiv | 2025 | paper |
| Traffic Flow Learning Enhanced Large-Scale Multi-Robot Cooperative Path Planning Under Uncertainties | ICRA | 2024 | paper |
| Online Re-Planning and Adaptive Parameter Update for Multi-Agent Path Finding with Stochastic Travel Times | AAMAS | 2023 | paper |
| Congestion Prediction for Large Fleets of Mobile Robots | ICRA | 2023 | paper, project |
| Reinforcement Learning for Zone Based Multiagent Pathfinding under Uncertainty | ICAPS | 2020 | paper |

Failure Prediction - OQ 14

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Should I Replan? Learning to Spot the Right Time in Robust MAPF Execution | arXiv | 2026 | paper |

Simulation Environments and Testbeds

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| CAMAR: Continuous Actions Multi-Agent Routing | AAAI | 2026 | paper, code, poster |
| Advancing MAPF Toward the Real World: A Scalable Multi-Agent Realistic Testbed (SMART) | RA-L | 2026 | paper, code, project |
| POGEMA: A Benchmark Platform for Cooperative Multi-Agent Pathfinding | ICLR | 2025 | paper, code |
| SkyRover: A Modular Simulator for Cross-Domain Pathfinding | IJCAI | 2025 | paper, project |
| 100-Mouse System: Scalable Multi-Robot Testbed with State Management User Interface | Journal of Robotics and Mechatronics | 2025 | paper |
| Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks | NeurIPS | 2023 | paper, MiniGrid, MiniWorld |
| VMAS: A Vectorized Multi-Agent Simulator for Collective Robot Learning | DARS | 2022 | paper, code |
| RWARE | NeurIPS | 2020 | paper, code |
| Flatland-rl: Multi-agent reinforcement learning on trains | arXiv | 2020 | paper, code, project |

Interpretable ML for MAPF

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Interpretable Multi-Agent Path Finding via Decision Tree Extraction from Neural Policies | AAAI Workshop | 2026 | paper, video |

Surveys & Benchmarks

| Paper | Venue | Year | Links |
| --- | --- | --- | --- |
| Reevaluation of Large Neighborhood Search for MAPF: Findings and Opportunities | SoCS | 2025 | paper, code |
| An empirical evaluation of learning-based multi-agent path finding algorithms in warehouse environments | Robotics and Autonomous Systems | 2025 | paper |
| Where Paths Collide: A Comprehensive Survey of Classic and Learning-Based Multi-Agent Pathfinding | arXiv | 2025 | paper |

Note

In 2024, we (Alkazzi & Okumura) published "A Comprehensive Review on Leveraging Machine Learning for Multi-Agent Path Finding", which is available open access in IEEE Access.

Like any published paper, that review is now frozen in time. It does not cover the newer papers and approaches that tackle MAPF with Machine Learning, so sooner or later someone would have to rewrite the whole review to bring it up to date.

I feel that an incrementally updated review would benefit the community more than a complete rewrite every few years.

(Dream) Ideally, I would love for this to become a fully written paper that is updated monthly with new references or even new sections (think of a mini-booklet). As a first step, I am maintaining it as a list of references organized under the same structure as the original paper. Once it is being updated at a steady rate, I hope to move on to the full paper.

Open Questions

Each section includes open questions worthy of future investigation as originally proposed in our review. For simplicity, we keep them in this table and reference them as OQ X.

| OQ | Question |
| --- | --- |
| 1 | What can be beneficial criteria and reliable benchmarks for assessing the quality of environment representation? |
| 2 | What are efficient transition mechanisms between offline and online environment optimization? |
| 3 | What is the appropriate input instance representation for algorithm selection? |
| 4 | What is the appropriate representation for MAPF on non-grid worlds? |
| 5 | How can learning from experience be transferred from smaller to larger instances? |
| 6 | How can we identify and extract effective features to maximize the performance of ML-assisted MAPF algorithms? |
| 7 | Which benchmarking suite of environments and evaluation metrics would best reflect the performance of different techniques? |
| 8 | Which communication strategy is most effective in real-world environments with their inherent challenges? |
| 9 | How can effectively learned implicit communication minimize the need and overhead of explicit communication while achieving comparable outcomes? |
| 10 | How could more advanced IL methods improve the performance of agent-based approaches beyond naive behavior cloning (BC)? |
| 11 | How can we avoid reward shaping to eliminate human bias in the learning process? |
| 12 | How can one construct a dataset of MAPF instances that progressively increases in difficulty? |
| 13 | To what extent should the real-world agent dynamics captured by ML be reflected in MAPF? |
| 14 | How can ML enhance the fault tolerance of MAPF systems? |

Citation

@article{Alkazzi2024mlmapf,
  author={Alkazzi, Jean-Marc and Okumura, Keisuke},
  journal={IEEE Access}, 
  title={A Comprehensive Review on Leveraging Machine Learning for Multi-Agent Path Finding}, 
  year={2024},
  doi={10.1109/ACCESS.2024.3392305}
}