Adaptive proportional fair parameterization based LTE scheduling using continuous actor-critic reinforcement learning

Comsa, Ioan Sorin; Zhang, Sijing; Aydin, Mehmet; Chen, Jianping; Kuonen, Pierre; Wagen, Jean Frederic


Authors

Ioan Sorin Comsa

Sijing Zhang

Dr Mehmet Aydin Mehmet.Aydin@uwe.ac.uk
Senior Lecturer in Networks and Mobile Computing

Jianping Chen

Pierre Kuonen

Jean Frederic Wagen



Abstract

© 2014 IEEE. Maintaining a desired trade-off between system throughput maximization and user fairness satisfaction remains a problem that is far from solved. In LTE systems, different trade-off levels can be obtained through a proper parameterization of the Generalized Proportional Fair (GPF) scheduling rule. Our approach finds the parameterization policy that maximizes the system throughput under the different fairness constraints imposed by the scheduler state. The proposed method adapts and refines the policy at each Transmission Time Interval (TTI) by using a Multi-Layer Perceptron Neural Network (MLPNN) as a non-linear function approximator between the continuous scheduler state and the optimal GPF parameter(s). The MLPNN is trained with Continuous Actor-Critic Learning Automata Reinforcement Learning (CACLA RL). The double GPF parameterization optimization problem is addressed by using CACLA RL with two continuous actions (CACLA-2). Five reinforcement learning algorithms used as simple parameterization techniques are compared against the proposed approach. Simulation results indicate that CACLA-2 performs much better than any of the other candidates that adjust only one scheduling parameter, such as CACLA-1. CACLA-2 outperforms CACLA-1 by reducing the percentage of TTIs in which the system is considered unfair. By attenuating the fluctuations of the obtained policy, CACLA-2 achieves an enhanced throughput gain when severe changes in the scheduling environment occur, while at the same time maintaining the fairness optimality condition.
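
The abstract does not state the GPF rule itself. As a minimal sketch, assuming the commonly cited form in which a user's priority on a resource block is its instantaneous rate raised to a power alpha divided by its average throughput raised to a power beta, the two tunable parameters could act as follows. The function names, the exponential averaging window, and the example rates are illustrative assumptions and are not taken from the paper.

```python
def gpf_metric(inst_rate, avg_rate, alpha, beta):
    """Assumed GPF priority: inst_rate**alpha / avg_rate**beta."""
    return (inst_rate ** alpha) / (avg_rate ** beta)


def schedule_rb(inst_rates, avg_rates, alpha=1.0, beta=1.0):
    """Return the index of the user with the highest metric on one resource block."""
    metrics = [
        gpf_metric(r, t, alpha, beta) for r, t in zip(inst_rates, avg_rates)
    ]
    return max(range(len(metrics)), key=metrics.__getitem__)


def update_avg(avg_rate, served_rate, window=100.0):
    """Exponential moving average of throughput (window length is an assumption)."""
    return (1.0 - 1.0 / window) * avg_rate + served_rate / window


# Hypothetical per-user rates: user 0 has the better channel,
# user 1 has received less throughput so far.
inst = [10.0, 4.0]
avg = [8.0, 2.0]

pf_choice = schedule_rb(inst, avg, alpha=1.0, beta=1.0)        # plain proportional fair
max_rate_choice = schedule_rb(inst, avg, alpha=1.0, beta=0.0)  # ignores past throughput
```

With alpha = beta = 1 the rule reduces to plain proportional fair; setting beta to 0 ignores past throughput and favors the best channel, while larger beta pushes the scheduler toward fairness. In the approach described above, a learned policy would choose such parameter values at each TTI instead of fixing them by hand.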

Presentation Conference Type Conference Paper (published)
Conference Name 2014 IEEE Global Communications Conference, GLOBECOM 2014
Start Date Dec 8, 2014
End Date Dec 12, 2014
Publication Date Feb 9, 2014
Deposit Date Jun 8, 2015
Publicly Available Date Feb 10, 2016
Peer Reviewed Peer Reviewed
Pages 4387-4393
Book Title 2014 IEEE Global Communications Conference
DOI https://doi.org/10.1109/GLOCOM.2014.7037498
Keywords long term evolution, approximation theory, learning (artificial intelligence), learning automata, multilayer perceptrons, optimisation, telecommunication computing, telecommunication scheduling
Public URL https://uwe-repository.worktribe.com/output/807022
Publisher URL http://dx.doi.org/10.1109/GLOCOM.2014.7037498
Additional Information © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Title of Conference or Conference Proceedings : 2014 IEEE Global Communications Conference (GLOBECOM)
Contract Date Feb 10, 2016
