- What is colab-ssh?
- Code we need to run in Colab
- Setting up Cloudflared
- Setup in VSCode
- Additional Tips to Get Started Quickly
- That's It!
We'll be using the package called colab-ssh. It's a package that uses either Cloudflare or Ngrok to connect to a Colab instance.
This is NOT the same as running VSCode in your browser (via code-server), like the approach taken with colabcode. For now, I much prefer using colab-ssh because it allows me to use a local VSCode rather than one in the browser.
I'll be using colab-ssh for my own projects to see how it goes. It's a cheap way to do deep learning, but I'm still not certain whether errors and timeouts will bug me enough to stop using it. I think it'll be fine, though! I'll likely use it just to run hyperparameter sweeps and other experiments. I think that's the ideal use for it.
Now, let's get started. We begin by running some code in Colab.
First, we can mount our Google Drive so that we have access to any files or data that we need:
from google.colab import drive
drive.mount("/content/drive")
This part is optional, but you can load a .env file stored in your Google Drive to read your password and GitHub access token:
!pip install python-dotenv --quiet

import dotenv
import os

dotenv.load_dotenv(
    os.path.join('/content/drive/MyDrive/vscode-ssh', '.env')
)
password = os.getenv('PASSWORD')
github_access_token = os.getenv('GITHUB_ACCESS_TOKEN')
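If the .env file is missing or a key is misspelled, os.getenv quietly returns None and the problem only shows up later in the notebook. Here is a minimal sketch of a fail-fast check, assuming the same two variable names as above. Note that require_env is a hypothetical helper written for this post, not part of python-dotenv or colab-ssh:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; check the .env file in your Google Drive")
    return value


# In the notebook, these would replace the bare os.getenv calls:
# password = require_env("PASSWORD")
# github_access_token = require_env("GITHUB_ACCESS_TOKEN")
```

With this in place, a missing value stops the notebook at this cell instead of at the SSH or git step.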
Here we will add the URL of the GitHub repo we would like to work on:
git_repo = '<link_to_git_repo>'
Now we can install colab-ssh and import it:
!pip install colab_ssh --upgrade --quiet

from colab_ssh import launch_ssh_cloudflared, init_git_cloudflared
Finally, we create the ssh connection and also add our github repo:
launch_ssh_cloudflared(password)
init_git_cloudflared(
    repository_url=git_repo + ".git",
    personal_token=github_access_token,
    branch="main",
    email="<email_for_github>",
    username="<github_username>",
)
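The call above appends ".git" to git_repo, so a URL that already ends in ".git" would end up with the suffix twice. As a small sketch, a hypothetical helper (with_git_suffix is my own name, not something colab-ssh provides) can normalize the URL first:

```python
def with_git_suffix(repo_url: str) -> str:
    """Return repo_url ending in exactly one '.git', ignoring trailing slashes."""
    url = repo_url.rstrip("/")
    return url if url.endswith(".git") else url + ".git"
```

You would then pass repository_url=with_git_suffix(git_repo) instead of git_repo + ".git".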
After that, you will get the following output:
As it says in "Client machine configuration", you will need to download "cloudflared (Argo Tunnel)" for your OS. I use Mac so that's the one I downloaded. I downloaded the latest version instead of using brew install since that was faster.
Anyways, go here and download the binary. Then, untar the file (or execute the .exe?) and then place the cloudflared file in whatever local path you prefer.
Download Remote - SSH: go into VSCode, open the Command Palette (CTRL+SHIFT+P), and search for and click on "Extensions: Install Extensions". Then, in Extensions, search for and install "Remote - SSH".
Now that we have Remote - SSH, go into Command Palette (CTRL+SHIFT+P), and search and click on "Remote - SSH: Open SSH Configuration File". This file is located at ~/.ssh/config. Go to that file and paste the following:
Host *.trycloudflare.com
    HostName %h
    User root
    Port 22
    ProxyCommand <PUT_THE_ABSOLUTE_CLOUDFLARE_PATH_HERE> access ssh --hostname %h
I'm assuming the port is 22 for everyone. If you have a different port, you can change it based on the output you received.
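If you'd rather not edit the placeholder by hand, here is a rough sketch that renders the Host block shown above for a given cloudflared path and port. The function name render_ssh_config and the example path ~/bin/cloudflared are my own assumptions; point it at wherever you placed the binary:

```python
import os


def render_ssh_config(cloudflared_path: str, port: int = 22) -> str:
    """Build the SSH config Host block for trycloudflare.com hostnames."""
    # Expand "~" and make the path absolute, since ProxyCommand needs the full path.
    binary = os.path.abspath(os.path.expanduser(cloudflared_path))
    lines = [
        "Host *.trycloudflare.com",
        "    HostName %h",
        "    User root",
        f"    Port {port}",
        f"    ProxyCommand {binary} access ssh --hostname %h",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical location of the downloaded binary; adjust to your own path.
print(render_ssh_config("~/bin/cloudflared"))
```

The script only prints the block, so you can review it before pasting it into ~/.ssh/config yourself.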
Now, save the config file, copy the "VSCode Remote SSH" hostname from the Colab output, and paste it into the text box after clicking on "Remote - SSH: Connect to Host...".
There should be a new window that opens up.
Click continue when a pop-up about a fingerprint appears and then type in the password you passed in to launch_ssh_cloudflared. You are now fully connected via ssh!
You can now access your GitHub repository via "Open Folder" in Explorer. I have not figured out how to change the repository location yet, but for now, you will need to click on .. to exit /root/ and then click on content and your repository should be there.
Some cloudflared files will be added to the root of your repository. You can add them to your .gitignore file.
Once you've set things up, you just need to click Run All in Colab and it goes pretty fast. However, you will still need to reinstall all packages every time you create a new connection since Colab instances are ephemeral.
I suggest you either create an environment.yml file or use a package like poetry to get up and running quickly.
Note for Conda: you need to run some extra code in Colab in order to get access to Conda in Colab. Follow the tutorial here if you really want to use Conda. Personally, I would recommend against it since it takes longer to install. Try using pip, pip-tools or poetry instead.
In my case, I create a Makefile for every project and then I simply need to enter make poetry in the terminal. To create a Makefile, simply create a file called Makefile in your project directory. Then, in the Makefile, you can add the following (or whatever installation commands you want for your specific dependency manager):
poetry:
	pip install poetry
	poetry install
Of course, you can use whatever package manager you prefer.
And that's it! You are now ready to start coding!
To avoid having to create a notebook for every project, do the following two things:
Do your package installations in VSCode rather than Colab. Then you only need to install the packages for a specific project.
Create a cell in your Colab notebook with strings to your github repositories using git_repo = "git_repo_url". Just comment out the ones you don't want and uncomment the one you do.
This might sound obvious, but when I started out I tried to install everything via Colab!
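As an alternative to commenting lines in and out, the repository cell can be sketched as a small lookup table. The project names and URLs below are placeholders I made up for illustration, so swap in your own:

```python
# Hypothetical project names and repository URLs; replace with your own.
REPOS = {
    "project-a": "https://github.com/<github_username>/project-a",
    "project-b": "https://github.com/<github_username>/project-b",
}

# Change this one value to switch projects before clicking Run All.
ACTIVE_PROJECT = "project-a"

git_repo = REPOS[ACTIVE_PROJECT]
print(f"Using repository: {git_repo}")
```

A typo in ACTIVE_PROJECT raises a KeyError right away, which is easier to spot than a wrong URL failing later during the clone.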
If you are asked for a username and password after launching the SSH connection, that means you are not passing in your GitHub personal access token into init_git_cloudflared. Make sure to do that.
You can set up your GitHub personal access token by clicking on your icon in the top right on GitHub, clicking on "Settings", scrolling down and clicking on "Developer settings", and then clicking on "Personal Access Tokens". Generate a new token and pass it to init_git_cloudflared.
This could mean a few things, so I'll go over the ones I encountered:
1: Your Remote - SSH config file is not correct.
Go to "Remote - SSH: Settings" and make sure that you are using the correct config file like the one below:
2: Colab is still running init_git_cloudflared because you did not pass it a valid personal access token.
Don't forget to go to Runtime > Change Runtime Type and select "GPU" in Colab!
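To confirm from the notebook that the runtime actually has a GPU, here is a rough sketch that looks for the nvidia-smi tool. This assumes GPU runtimes expose nvidia-smi on the PATH, and gpu_visible is a hypothetical helper name of mine:

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi is on the PATH and exits successfully."""
    if shutil.which("nvidia-smi") is None:
        return False
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    return result.returncode == 0


print("GPU visible:", gpu_visible())
```

If this prints False, change the runtime type before starting the SSH connection, since switching runtimes resets the instance.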
If you ran the code on a different repository and then rerun it on a new repository, this may happen. To resolve this, just do a factory reset of your Colab instance, and then rerun the code.
If you have any questions, let me know! Or better yet, go to the colab-ssh repo and ask there!
If you liked this post, follow me on Twitter for more content like this! And make sure to let me know what kind of content you'd like to see more of!