Hu Xiaomei, Xu Jun, Chai Jianfei
Path planning is one of the most basic and pivotal problems in robotics research: it determines how a robot moves through its environment. Q-learning, a widely used reinforcement learning method, is applied when the robot has no prior knowledge of how its actions affect its environment. In robot path planning, however, Q-learning faces the exploration-exploitation dilemma. To address this, a robot path planning method based on improved Q-learning is proposed. A Markov decision model is established according to the robot's environment and is used to design the reward-punishment mechanism and action strategy for path planning. During training, a heuristic search function is defined and added to the value iteration algorithm to reduce invalid path exploration. Experimental results show that the proposed method shortens the planned path, improves path planning efficiency, and accelerates robot learning, which indicates its effectiveness.
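The idea of heuristic-guided Q-learning can be illustrated with a minimal sketch. The abstract does not give the form of the heuristic function, the reward values, or the environment, so everything below is an assumption made for illustration: the grid layout, the reward-punishment values, the hyperparameters, the Manhattan-distance heuristic, and the choice to apply the heuristic as a bonus during action selection. The names `step`, `heuristic`, `train`, and `greedy_path` are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical 6x6 grid world; layout and rewards are illustrative assumptions.
SIZE = 6
OBSTACLES = frozenset({(1, 1), (2, 3), (3, 3), (4, 1)})
START, GOAL = (0, 0), (5, 5)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply one move; return (next_state, reward, done)."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < SIZE and 0 <= c < SIZE) or (r, c) in OBSTACLES:
        return state, -5.0, False      # punishment: collision, robot stays put
    if (r, c) == GOAL:
        return (r, c), 100.0, True     # reward: goal reached
    return (r, c), -1.0, False         # small cost for every ordinary move


def heuristic(state, action):
    """Assumed heuristic: reduction in Manhattan distance to the goal."""
    r, c = state[0] + action[0], state[1] + action[1]
    before = abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])
    after = abs(r - GOAL[0]) + abs(c - GOAL[1])
    return float(before - after)


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, xi=1.0, seed=0):
    """Tabular Q-learning whose greedy choice is biased by xi * heuristic."""
    rng = random.Random(seed)
    q = {}  # maps (state, action_index) to a value; missing entries count as 0
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # random exploration
            else:
                a = max(
                    range(len(ACTIONS)),
                    key=lambda i: q.get((state, i), 0.0)
                    + xi * heuristic(state, ACTIONS[i]),
                )
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else max(
                q.get((nxt, i), 0.0) for i in range(len(ACTIONS))
            )
            old = q.get((state, a), 0.0)
            # Standard Q-learning update; the heuristic only steers exploration.
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=100):
    """Follow the learned Q-values greedily from START for up to max_steps."""
    path, state = [START], START
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of cells visited under the learned values. Setting `xi=0.0` removes the heuristic bonus and reduces the sketch to plain epsilon-greedy Q-learning, which gives a simple baseline for comparing how many steps each variant spends exploring.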
Reinforcement learning, Q-learning, path planning, heuristic search
Hu Xiaomei, Xu Jun, Chai Jianfei. Robot Path Planning based on an Improved Q-learning Method. 2019 International Computer Science and Applications Conference (ICSAC 2019). 2019: 99-102.