The Hands-on RL course intermezzo
Tired of training deep learning models on your laptop, at the speed of… a turtle? 🐢
Not enthusiastic about buying an expensive GPU or optimizing cloud services bills? 💸
Wish there was a better way to do it?
Luckily for you, the answer to the last question is yes. You can get a GPU that trains your deep learning models faster. Much faster. And completely free.
This article is for the Hands-on RL course students, and for any deep learning developer out there looking for faster training loops.
Let’s get to business.
GPUs in deep learning
The GPU (Graphics Processing Unit) is the key hardware component behind Deep Learning’s tremendous success. GPUs accelerate neural network training loops so they fit into reasonable human time spans. Without them, training today’s large neural networks would not be practical.
If you want to train large deep neural networks you NEED to use a GPU. However, GPUs are also expensive 💰. Which means most of us do not have one at home.
Sure, you can rent a GPU using cloud services like GCP or AWS, but at the risk of getting a large monthly bill.
The question is then…
Is there a way to get this precious GPU… for free?
Yes, there is. Let me show you how.
The solution: Google Colab for deep learning
Google Colab is a service that
👉🏽 offers GPUs free of charge for intervals up to 12 hours, and
👉🏽 integrates seamlessly with GitHub repositories.
With these 2 features in mind, we can easily supercharge any Jupyter notebook in GitHub with a free GPU.
As an example, we will calibrate the hyperparameters of a deep Q agent, which is a pretty expensive computation. If you wanna know more about the specifics of this example, check my Hands-on Reinforcement Learning course. The notebook we will use is based on part 6 of the course, where we train a deep Q agent to solve the Cart Pole environment from OpenAI Gym.
Let’s get started!
Go ahead, open this notebook:
and follow along with these 3 easy steps.
Step 1. Load Jupyter notebook in Google Colab
You can load any Jupyter notebook from a public GitHub repository into Google Colab with one simple click.
Once you commit and push the notebook to the remote GitHub branch, you can find it under a URL like this:
Now, if you type this other URL in your browser
you will magically load this same Jupyter notebook in Google Colab.
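The mapping between the two URLs can be sketched in a few lines of Python. Colab opens public GitHub notebooks when the GitHub path is appended to `https://colab.research.google.com/github`. The helper name `github_to_colab_url` and the repository in the example are hypothetical, used only for illustration.

```python
from urllib.parse import urlparse

COLAB_PREFIX = "https://colab.research.google.com/github"


def github_to_colab_url(github_url: str) -> str:
    """Turn a GitHub notebook URL into the matching Google Colab URL."""
    parsed = urlparse(github_url)
    if parsed.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {github_url}")
    if not parsed.path.endswith(".ipynb"):
        raise ValueError(f"not a notebook URL: {github_url}")
    # parsed.path looks like /<user>/<repo>/blob/<branch>/<path>.ipynb
    return COLAB_PREFIX + parsed.path


# Hypothetical repository and notebook path, for illustration only.
example = "https://github.com/some-user/some-repo/blob/main/notebooks/example.ipynb"
print(github_to_colab_url(example))
```

The same string is what you would put behind an "Open in Colab" link inside the notebook.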
Typing URLs manually is not the best user experience. But we can easily fix that by adding a link to that URL on our original Jupyter notebook.
If you scroll down the notebook a bit you will see I added a button like this
Click on it. The URL linked to it has the format I just told you ☝🏽.
A browser tab will open, and you will see an interface very similar to your local Jupyter server. This is Google Colab. The frontend app, actually. On the backend, you are connected to a runtime on Google servers that lets you execute Python code.
Now, by default the Google Colab runtime uses only a CPU, not a GPU. But you can change that easily. Let’s do it.
Step 2. Enable free GPU acceleration
Go to Runtime >> Change runtime type >> GPU. And voila!
Your runtime now uses a GPU. Sweet 🍭.
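If you want to double-check that the GPU is really there, a quick sanity check like the one below can help. It is a minimal sketch that assumes your training code uses PyTorch. The helper name `pick_device` is made up for this example, and it falls back to the CPU when PyTorch is not installed or no GPU is visible.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # preinstalled on Colab, may be missing elsewhere
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Training will run on: {device}")
```

If this prints `cpu` inside Colab, the runtime type was probably not switched to GPU.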
Step 3. Install Python dependencies for your deep learning code
The Google Colab runtime comes with a few popular Python packages preinstalled, like PyTorch and TensorFlow. However, to run your code you need to make sure all the right dependencies are installed.
Add a cell at the top of the notebook, to install all the necessary packages inside Colab’s environment.
This is what this setup looks like.
We fetch the code from GitHub and we pip install the dependencies from our requirements.txt file. Google Colab will probably ask us to restart the runtime after this step. We simply go to the top menu >> Runtime >> Restart Runtime.
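A setup cell along these lines could look like the sketch below. It is not the exact cell from the course notebook: the repository URL and folder name are placeholders, and the function names are invented for this example. In a Colab cell you would more commonly write `!git clone ...` and `!pip install -r ...` directly. The plain Python version here only prints the commands unless you pass `dry_run=False`.

```python
import subprocess
import sys

# Hypothetical repository URL and folder name; replace them with your own.
REPO_URL = "https://github.com/some-user/some-repo.git"
REPO_DIR = "some-repo"


def setup_commands(repo_url, repo_dir):
    """Build the two setup steps: clone the repo, then install its dependencies."""
    return [
        ["git", "clone", repo_url, repo_dir],
        [sys.executable, "-m", "pip", "install", "-r", f"{repo_dir}/requirements.txt"],
    ]


def run_setup(commands, dry_run=True):
    """Print each command, and execute it only when dry_run is False."""
    for cmd in commands:
        print("$", " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


commands = setup_commands(REPO_URL, REPO_DIR)
run_setup(commands, dry_run=True)
```

Calling pip through `sys.executable -m pip` installs the packages into the same Python interpreter that runs the notebook.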
Then execute this second cell, to add our local folder src as a local Python package.
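One common way to do this is to put the right folder on `sys.path`. The sketch below is an assumption about how such a cell could work, not the course's actual cell. It assumes the notebook imports modules as `from src... import ...`, so the folder added to the path is the one that contains `src` (here the hypothetical `some-repo` folder).

```python
import sys
from pathlib import Path


def add_to_python_path(folder) -> str:
    """Put a folder at the front of sys.path, once, and return its absolute path."""
    resolved = str(Path(folder).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# "some-repo" is a hypothetical folder name: the cloned repo that contains src/.
repo_root = add_to_python_path("some-repo")
print(f"Added to the Python path: {repo_root}")
```

If your notebook instead imports the modules that live inside `src` by their bare names, pass the `src` folder itself to the helper.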
That’s it. You have a free GPU at your disposal!
Feel free to run the entire notebook and (probably) solve your first deep reinforcement learning problem.
Bonus hack to integrate MLflow with Colab 🥷
This one is mainly for my students, as we use MLflow in the Hands-on RL course to track all experiment results, but feel free to read along. You never know when it can come in handy.
When I train deep learning models I like to use open-source MLflow to track all experiment results. However, if I run the notebook on Colab, how can I log to my local MLflow server?
There is a trick. Quite hacky, but I love it ❤️.
Its name is ngrok, a tool that lets you expose a port on your local computer to the public internet through a secure tunnel.
First, spin up the MLflow tracking server locally from the command line (adjust the port number if 5000 is already taken)
$ (venv) mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./artifacts --host 0.0.0.0 --port 5000
Then, you install pyngrok, a convenient Python wrapper for ngrok.
$ (venv) pip install pyngrok
And expose the MLflow server through a secure tunnel, using the same port the MLflow server is running on
$ (venv) ngrok http 5000
A screen will show up, with the URL that ngrok generated for your tunnel. Copy that URL address.
Now, if you go to the notebook in Colab you can paste this URL there:
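A cell that takes the pasted URL could look like the sketch below. It relies on the `MLFLOW_TRACKING_URI` environment variable, which MLflow reads when no tracking URI has been set in code. Calling `mlflow.set_tracking_uri(url)` is the equivalent in-code option. The helper name `use_tracking_server` and the URL are placeholders, so paste the one ngrok printed for you.

```python
import os
from urllib.parse import urlparse


def use_tracking_server(url: str) -> str:
    """Validate the tunnel URL and expose it to MLflow via an environment variable."""
    url = url.strip().rstrip("/")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a valid http(s) URL: {url!r}")
    # MLflow picks this variable up when no tracking URI is set in code.
    os.environ["MLFLOW_TRACKING_URI"] = url
    return url


# Placeholder URL, for illustration only: use the one from your ngrok screen.
tracking_uri = use_tracking_server("https://abc123.ngrok.io/")
print(f"MLflow will log to: {tracking_uri}")
```

Run this cell before the first MLflow call in the notebook, so every experiment run goes to your local server.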
And run the rest of the notebook.
You will be amazed to see the MLflow dashboard receiving experiment data again.
What’s next? ❤️
In the next lecture, we will introduce a new family of deep RL algorithms, and we will solve a new environment.