Reinforcement Learning Model Comparison in MuJoCo Humanoid-v4 Environment

Project Overview

This project was quite different from the others I've worked on. For it, I trained four different RL algorithms and compared their performance in the MuJoCo Humanoid-v4 environment. The data in the notebook was gathered while the algorithms were training.
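The exact training script isn't reproduced here, but a minimal sketch of the setup could look like the following. It assumes Gymnasium and Stable Baselines3; only PPO, SAC, and TD3 are named in this write-up, so A2C is used purely as a placeholder for the fourth algorithm.

```python
# Hedged sketch of the training loop (assumed stack: Gymnasium + Stable Baselines3).
import gymnasium as gym
from stable_baselines3 import PPO, SAC, TD3, A2C  # A2C is a placeholder for the fourth algorithm

ALGOS = {"PPO": PPO, "SAC": SAC, "TD3": TD3, "A2C": A2C}

for name, Algo in ALGOS.items():
    env = gym.make("Humanoid-v4")
    # tensorboard_log writes per-run metrics that can be watched while training runs
    model = Algo("MlpPolicy", env, verbose=1, tensorboard_log="./humanoid_tb/")
    model.learn(total_timesteps=1_000_000, tb_log_name=name)
    model.save(f"humanoid_{name.lower()}")
    env.close()
```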

Features

Challenges & Solutions

The main challenge of this project was figuring out where to start. I knew I wanted to try out reinforcement learning, and I wanted it to be something more complex (or at least, not boring). I had to read a lot about what reinforcement learning is, how training actually happens, and how a model's performance is tracked. OpenAI's Spinning Up helped me a ton in understanding how to start this project.

Another big challenge was deciding what data to track from the RL models. I chose to track the episode mean reward and episode mean length, and, for SAC and TD3, the actor loss. To collect this data, I found a way to download it after training each model for a set amount of time: TensorBoard can be opened during training and shows the performance metrics of every model currently being trained.
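The export step itself isn't shown in this section, but one way to pull the logged scalars out of a TensorBoard run directory after training is sketched below. The run-directory names and the tag names (e.g. "rollout/ep_rew_mean") are assumptions based on Stable Baselines3's default logging.

```python
# Hedged sketch: read scalars back out of TensorBoard event files after training.
# Run directories and tag names are assumed (Stable Baselines3 defaults).
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

def load_scalar(run_dir: str, tag: str):
    """Return (steps, values) for one scalar tag from a TensorBoard run directory."""
    acc = EventAccumulator(run_dir)
    acc.Reload()  # parse the event files on disk
    events = acc.Scalars(tag)
    return [e.step for e in events], [e.value for e in events]

steps, ep_rew = load_scalar("./humanoid_tb/PPO_1", "rollout/ep_rew_mean")
_, ep_len = load_scalar("./humanoid_tb/PPO_1", "rollout/ep_len_mean")
# Actor loss is only logged for the off-policy actor-critic methods (SAC, TD3)
_, actor_loss = load_scalar("./humanoid_tb/SAC_1", "train/actor_loss")
```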

Live Demo & Source Code

View Live Project | Source Code

The GIF below shows the model that performed the best (PPO).
(GIF: the trained PPO agent in Humanoid-v4)