Multi-Agent Reinforcement Learning for Network Load Balancing in Data Center
Conference Papers Year : 2022

Zhiyuan Yao, Zihan Ding, Thomas Clausen

Abstract

This paper presents the network load balancing problem, a challenging real-world task for multi-agent reinforcement learning (MARL) methods. Conventional heuristic solutions such as Weighted-Cost Multi-Path (WCMP) and Local Shortest Queue (LSQ) adapt poorly to changing workload distributions and arrival rates, and they balance load poorly across multiple load balancers. The cooperative network load balancing task is formulated as a decentralized partially observable Markov decision process (Dec-POMDP), which naturally lends itself to MARL methods. To bridge the reality gap in applying learning-based methods, all models are trained and evaluated directly on a real-world system, in moderate- to large-scale setups. Experimental evaluations show that independent, "selfish" load balancing strategies are not necessarily globally optimal, while the proposed MARL solution performs better across different realistic settings. Additionally, the potential difficulties of applying and deploying MARL methods for network load balancing are analysed, which helps draw the attention of the learning and networking communities to these challenges.
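
For readers unfamiliar with the two heuristics named in the abstract, the following is a minimal illustrative sketch, not the authors' implementation. It assumes WCMP is modelled as random server selection in proportion to static weights, and LSQ as picking the server with the fewest jobs in one load balancer's local view. The function names `wcmp_pick` and `lsq_pick` and the sample queue lengths are hypothetical.

```python
import random
from typing import Sequence


def wcmp_pick(weights: Sequence[float], rng: random.Random) -> int:
    """WCMP-style choice (assumed model): sample a server index with
    probability proportional to its static weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def lsq_pick(local_queue_lengths: Sequence[int]) -> int:
    """LSQ-style choice (assumed model): pick the server with the fewest
    jobs as counted by this load balancer alone; ties go to the lowest index."""
    return min(range(len(local_queue_lengths)),
               key=local_queue_lengths.__getitem__)


# Hypothetical local views of the same three servers held by two
# independent load balancers. Neither sees the other's assignments.
view_a = [3, 0, 2]
view_b = [1, 0, 4]

# Both balancers independently choose the same server, which is one way
# uncoordinated "selfish" decisions can pile load onto a single server.
choice_a = lsq_pick(view_a)
choice_b = lsq_pick(view_b)
```

In this toy setup both load balancers select the same server, which illustrates why the abstract treats coordination among load balancers as a cooperative multi-agent problem rather than a set of independent ones.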
Main file
cikm22.pdf (1.97 MB)
Origin : Files produced by the author(s)

Dates and versions

hal-03753203, version 1 (18-08-2022)

Identifiers

HAL Id : hal-03753203
DOI : 10.1145/3511808.3557133

Cite

Zhiyuan Yao, Zihan Ding, Thomas Clausen. Multi-Agent Reinforcement Learning for Network Load Balancing in Data Center. 31st ACM International Conference on Information and Knowledge Management (CIKM '22), Oct 2022, Atlanta, GA, United States. ⟨10.1145/3511808.3557133⟩. ⟨hal-03753203⟩