Keywords
Reinforcement Learning, Q-Learning Algorithm, Robot Path Planning, Learning Rate (α), Discount Factor (γ)
Document Type
Research Paper
Abstract
Path planning in a dynamic environment is one of the challenging aspects of robot navigation, and it is a vital capability for any intelligent mobile robot. The Q-learning algorithm is a reinforcement learning technique that can be applied to mobile robot path planning. However, traditional Q-learning examines every conceivable state of the robot to choose the optimal path, so it is computationally intensive, especially when a large environment must be computed. This study proposes a modified version of the technique for planning robot paths. By using the complement of the learning rate (1-α) in place of the discount factor (γ), the algorithm comes to depend on a single parameter; this reduces the number of parameters and increases the algorithm's execution efficiency. The modified version of Q-learning was investigated to determine the optimal path plan in several dynamic obstacle environments. Learning efficiency was further enhanced by using prioritized experience replay in the improved Inclined Eight-Connection Q-learning Algorithm (I8QA). The suggested method was evaluated in a simulated environment and shown to successfully plan optimal paths in dynamic obstacle environments. Overall, Q-learning is a strong and adaptable reinforcement learning method that can be applied to a wide range of problems. The path-length improvement ratio in the experimental environment is 40.812%, indicating that the I8QA algorithm is better suited to dynamic environments.
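The single-parameter modification described above can be sketched as a standard temporal-difference Q-update in which the discount factor γ is tied to the learning rate as (1-α). This is a minimal illustration, not the authors' implementation; the function and variable names are assumptions.

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1):
    """One step of the single-parameter Q-learning update sketched in the
    abstract: the discount factor gamma is replaced by (1 - alpha), so the
    update rule depends only on the learning rate alpha."""
    gamma = 1.0 - alpha  # single-parameter coupling described in the paper
    td_target = r + gamma * np.max(Q[s_next])  # bootstrapped target
    Q[s, a] += alpha * (td_target - Q[s, a])   # move Q toward the target
    return Q
```

With only α to tune, a grid search over (α, γ) pairs collapses to a one-dimensional sweep, which is the efficiency gain the abstract claims.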
Highlights
The Q-learning algorithm's output was optimized to reduce time consumption
Parameters were selected to enable the best possible path planning through a modified approach
The modified Q-learning algorithm relies on the (1-α) parameter instead of the (γ) parameter
Learning efficiency was enhanced by using prioritized experience replay to reach the target position
The shortest distance between two points was represented by movement in an inclined direction
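The inclined-direction highlight can be illustrated with an eight-connected move set: four axis-aligned moves of cost 1 and four inclined (diagonal) moves of cost √2, which let the robot approximate the straight line between grid cells. This is a hypothetical sketch of the move model implied by the I8QA name; the identifiers are assumptions.

```python
import math

# Eight-connected move set: axis-aligned plus inclined (diagonal) steps.
MOVES = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]

def step_cost(dx, dy):
    """Cost of one step: sqrt(2) for an inclined move, 1 otherwise."""
    return math.sqrt(2) if dx != 0 and dy != 0 else 1.0

def neighbors(cell, grid_w, grid_h):
    """Yield the valid 8-connected successors of a grid cell with their costs."""
    x, y = cell
    for dx, dy in MOVES:
        nx, ny = x + dx, y + dy
        if 0 <= nx < grid_w and 0 <= ny < grid_h:
            yield (nx, ny), step_cost(dx, dy)
```

Under this model a diagonal traversal of an n×n grid costs n·√2 rather than 2n, which is the kind of path-length saving the highlights attribute to inclined movement.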
Recommended Citation
Fallooh, Noor; Sadiq, Ahmed; Abbas, Eyad; and Hashim, Ivan (2025) "Robot path planning using enhanced Q-learning algorithm based on single parameter," Engineering and Technology Journal: Vol. 43: Iss. 2, Article 4.
DOI: https://doi.org/10.30684/etj.2024.154230.1831
First Page
159
Last Page
173