Hi, I am Marcel

Marcel Müller

Research Fellow at Otto von Guericke University Magdeburg

I am a passionate researcher and engineer with several years of experience in modeling, simulation, and logistics. I have built simulation models to address real-world problems and developed custom environments for training multi-agent reinforcement learning policies using RLlib. My research focuses on resolving coordination and deadlock issues among mobile robots in logistics systems. On the side, I also enjoy fun RL projects, such as connecting PySC2 with RLlib to train agents on StarCraft II or applying RL to browser games.

Projects

Reference Models for deadlock-capable multi-agent pathfinding
Developer Aug 2024 - Present

A project focused on creating reference models for deadlock-capable multi-agent pathfinding. The goal is to provide a benchmark for evaluating MAPF and RL algorithms in this domain.

Reinforcement learning with Plant Simulation
Developer Feb 2023 - Present

A project focused on applying reinforcement learning with RLlib to the simulation software Plant Simulation. The context is a simple AGV system serving two processing stations.

Development of a Bologna-based Master Curriculum in Resource Efficient Production Logistics (ProdLog)
Project manager Mar 2020 - Oct 2021

I was responsible for managing the project and ensuring that its goals were met. The goal was to develop a new Master curriculum in Resource Efficient Production Logistics (ProdLog) at six partner universities. The project was funded by the European Union’s Erasmus+ programme.

Real-time combination of material flow simulation, digital twins of manufacturing cells, an AGV and a mixed-reality application
Developer Mar 2020 - Jul 2020

I developed the simulation model containing the digital twin of a manufacturing cell and the AGV. The model communicated with a mixed-reality application via an MQTT broker and was used to visualize the real-time state of the system.

Laundry Order Consolidation System (LOCsys)
Developer Jan 2018 - Feb 2020

My task was to develop a simulation model of a newly designed laundry order consolidation system. The model was used to evaluate the performance of the new system and to identify potential bottlenecks. The “LOCsys” project was funded as a joint project by the BMWi as part of the Central Innovation Program for SMEs (ZIM).

Publications

Multi-Agent Proximal Policy Optimization for a Deadlock Capable Transport System in a Simulation-Based Learning Environment

In this paper, we explore the potential of multi-agent reinforcement learning (MARL) for managing the driving behavior of autonomous guided vehicles (AGVs) in production logistics environments with single-lane tracks, where deadlocks pose a significant challenge. We build upon previous work and adopt a MARL approach using the Proximal Policy Optimization (PPO) algorithm. We conduct a thorough hyperparameter search and investigate the impact of varying numbers of agents on the performance of the AGVs. Our results demonstrate the effectiveness of the MARL approach in addressing deadlocks and coordinating AGV behavior, as well as the scalability of the learned policy to different numbers of agents. The Bayesian optimization process and increased iteration count contribute to improved performance and more stable learning curves.

A review on reinforcement learning algorithms and applications in supply chain management

Decision-making in supply chains is challenged by high complexity, a combination of continuous and discrete processes, integrated and interdependent operations, dynamics, and adaptability. The rapidly increasing data availability, computing power and intelligent algorithms unveil new potentials in adaptive data-driven decision-making. Reinforcement Learning, a class of machine learning algorithms, is one of the data-driven methods. This semi-systematic literature review explores the current state of the art of reinforcement learning in supply chain management (SCM) and proposes a classification framework. The framework classifies academic papers based on supply chain drivers, algorithms, data sources, and industrial sectors. The conducted review revealed a few critical insights. First, the classic Q-learning algorithm is still the most popular one. Second, inventory management is the most common application of reinforcement learning in supply chains, as it is a pivotal element of supply chain synchronisation. Last, most reviewed papers address toy-like SCM problems driven by artificial data. Therefore, shifting to industry-scale problems will be a crucial challenge in the next years. If this shift is successful, the vision of data-driven decision-making in real-time could become a reality.

Comparison of Deadlock Handling Strategies for Different Warehouse Layouts with an AGVS

Automated guided vehicles (AGVs) form a large and important part of logistic systems to improve productivity and reduce costs. When multiple AGVs are running in limited and uncertain environments, lots of issues can occur, such as collisions and deadlocks, which need to be addressed. This paper presents a flexible simulation model for a warehouse with various AGVs. We implemented all three typical strategies to handle deadlocks (prevention, avoidance and detection and resolution). The results show that there is no dominant strategy and that the results strongly depend on the individual case and the input parameters.

Experiences

Magdeburg, Germany

Fraunhofer Institute for Factory Operation and Automation IFF is a research institution located in Magdeburg, Germany. It focuses on applied research in the fields of production and logistics.

Research Fellow (secondary employment)

Jul 2025 - Present

Responsibilities:
  • Develop a simulation model for a hydrogen factory.

Magdeburg, Germany

Otto von Guericke University Magdeburg is a research university located in Magdeburg, Germany. It is known for its engineering and natural sciences programs.

Research Fellow

Feb 2018 - Present

Responsibilities:
  • Develop simulation models to address real-world problems.
  • Give lectures on simulation modeling.
  • Manage research projects.


Magdeburg, Germany

FASA is an association for the promotion of mechanical and plant engineering in Saxony-Anhalt.

Technology Consultant

Aug 2017 - Dec 2017

Responsibilities:
  • Connect companies with technology partners.
  • Advise companies on possible Industry 4.0 technologies.

Müller Marketing GmbH

Jan 2012 - Jun 2017

Magdeburg, Germany

Müller Marketing GmbH is a B2B-focused marketing and advertising agency, offering services in brand communication, corporate design, and strategic marketing for medium-sized businesses.

Online Editor & Process Analyst

Jan 2012 - Jun 2017

Responsibilities:
  • Determine and analyze costs of marketing processes.
  • Edit online content.
  • Manage Google Ads.

Brömse GmbH & Co. KG

Oct 2013 - Dec 2013

Haldensleben, Germany

Brömse GmbH & Co. KG is a company specializing in the production of energy-efficient windows, doors, and roller shutter systems made from plastic and aluminum.

Student Assistant

Oct 2013 - Dec 2013

Responsibilities:
  • Analyze processes in goods receipt.
  • Plan a small-parts warehouse.
  • Develop a concept for the shipping area.

Education

Master of Science in Industrial Engineering for Logistics
Thesis:
Extension of a key performance indicator system for comparing freight transport system scenarios to include a dynamic representation of forecast values.
Supervisor:
Univ.-Prof. Dr.-Ing. habil. Prof. E. h. Dr. h. c. mult. Michael Schenk

Bachelor of Science in Industrial Engineering for Logistics
Thesis:
Development of a simulation model for planning the washing order sequence in an industrial laundry.
Supervisor:
Univ.-Prof. Dr.-Ing. habil. Prof. E. h. Dr. h. c. mult. Michael Schenk

Achievements

Best PhD Student Paper Award

Melting Pot Challenge of NeurIPS 2023

2nd place in the Sustainable Supply Chain Deephack