Diving into Reinforcement Learning
Introduction and motivation
I have finally decided to start posting the progress of my research on the topic of socially aware navigation using reinforcement learning!
Initially, most of my experience was related to the development and integration of motion planning systems and algorithms using classical approaches such as dynamic and kinematic modelling. At the same time, I have been studying machine learning algorithms, with a main focus on deep learning for image recognition. I really liked the fact that deep neural networks can approximate any non-linear function, and I wanted to apply them to navigation problems in robotics.
Reinforcement learning
Reinforcement learning is one of the three basic machine learning paradigms, alongside supervised learning and unsupervised learning. One of the key features of RL is that the agent constantly interacts with the surrounding environment through actions and perceives it through observations, which is exactly what robots do on a daily basis.
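To make that loop concrete, here is a minimal sketch of the observation-action cycle using the Gymnasium API. The CartPole environment and the random policy are just placeholders for illustration, not part of my actual setup.

```python
# Minimal agent-environment loop: observe, act, receive reward, repeat.
# CartPole and the random policy are only placeholders for illustration.
import gymnasium as gym

env = gym.make("CartPole-v1")
observation, info = env.reset(seed=0)

for _ in range(200):
    action = env.action_space.sample()  # a trained agent would choose this from the observation
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:         # episode ended, start a new one
        observation, info = env.reset()

env.close()
```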
Main challenges
Currently, there is a lot of RL research being done by companies like DeepMind, OpenAI, and Facebook. Undoubtedly, their contribution to the field is invaluable and helps attract more engineers and researchers to RL. However, their work is still mostly theoretical and usually bound to simulated or game environments, so much of the robotics-oriented RL research is done by individual research groups.
Moreover, most of the existing work is bound to the discrete action spaces in which the most famous benchmark environments operate, whereas robotics, by its nature, requires continuous action spaces. There are some on-policy algorithms that help with such tasks, but much work is yet to be done. The sketch below illustrates the difference between the two kinds of action spaces.
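This quick illustration is not tied to any particular environment of mine; it just shows how the two kinds of action spaces look when expressed with Gymnasium's space types, with made-up velocity bounds for a hypothetical mobile robot.

```python
# Discrete vs. continuous action spaces, expressed with Gymnasium's space types.
# The bounds below are made-up values for a hypothetical differential-drive robot.
import numpy as np
from gymnasium import spaces

# Discrete: the agent picks one of N predefined actions (e.g. push left / push right).
discrete_actions = spaces.Discrete(2)

# Continuous: the agent outputs real-valued commands, e.g. linear and angular
# velocity, each bounded to a physically meaningful range.
continuous_actions = spaces.Box(
    low=np.array([0.0, -1.0], dtype=np.float32),   # [min linear vel, min angular vel]
    high=np.array([0.5, 1.0], dtype=np.float32),   # [max linear vel, max angular vel]
)

print(discrete_actions.sample())    # e.g. 1
print(continuous_actions.sample())  # e.g. [0.23 -0.71]
```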
Work done
So far, I have mainly been reading textbooks to get a better understanding of the underlying principles, along with lots of papers to better understand the problems and the proposed solutions.
I have almost finished reading a great book by Maxim Lapan called Deep Reinforcement Learning Hands-On, 2nd edition. First, it explains the basic concepts underlying most state-of-the-art algorithms, such as the cross-entropy method, Markov decision processes, and Q-networks. Additionally, it contains a lot of code explanations for PyTorch implementations, which makes it much easier to understand how the algorithms are implemented and the issues engineers usually face. Furthermore, the book dives deeper by explaining modern approaches for solving RL problems like A2C, A3C, DDPG, PPO, and TRPO, which is pretty cool.
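To give an idea of the kind of code the book walks through, here is a rough sketch of a Q-network in PyTorch. The layer sizes, names, and the greedy action selection below are my own illustration rather than the book's actual code.

```python
# A rough sketch of a Q-network: a small fully connected net that maps an
# observation vector to one Q-value per discrete action.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, obs_size: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_size, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

# Greedy action selection: pick the action with the highest predicted Q-value.
q_net = QNetwork(obs_size=4, n_actions=2)
obs = torch.rand(1, 4)          # a dummy observation batch of size 1
action = q_net(obs).argmax(dim=1).item()
```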
So far, I have been able to apply the Proximal Policy Optimization (PPO) algorithm to a modified CartPole problem in the Webots robotics simulator, and the results can be seen here: