Keep Jupyter Notebooks Running
As a graduate student working on physics analysis with LHCb data and various machine learning projects, I’ve learned the hard way that running long computational tasks can be both a blessing and a curse. There’s nothing quite like the sinking feeling of realizing your laptop went to sleep in the middle of a 6-hour training run, or that your SSH connection dropped while processing terabytes of particle collision data. Yes, Apple silicon machines have excellent battery life, but have you ever dealt with UC IT and their “very good” internet connection? If not, you’re lucky. I’d say roughly two out of three of my SSH connections drop at random. It’s almost funny: this happens even when I connect to our group’s cluster, which sits in the same building on the same network. Not to mention that, for some reason, eduroam refuses to connect to the group cluster at all.
After one too many lost connections and interrupted training sessions, I’ve finally figured out a robust way to keep my Jupyter notebooks running smoothly, even when I’m not actively connected to the server. Trust me, this knowledge has saved me countless hours of frustration and repeated computation. To be honest, long-running jobs like these really belong in Python or bash scripts. But sometimes you just want to see the results in real time.
So, if you’re like me, you probably spend a good chunk of your time running analysis notebooks that take hours to complete. Whether it’s training deep learning models on LHCb simulation data or fine-tuning something (a small LLM, if you’re a rich academic), these aren’t the kind of tasks you want to babysit. You need something that keeps running even when:
- Your laptop decides it’s bedtime and goes to sleep
- Your coffee shop WiFi becomes unstable (or god forbid, your university’s WiFi)
- You need to catch the last bus home (or the first bus back to the physics building)
- You want to actually have a life outside research (there’s a chance that this exists)
After trying various approaches, I’ve found that tmux is the holy grail for running long-duration Jupyter sessions. Think of it as a persistent terminal that keeps your processes running even when you’re not connected.
The basic workflow I’ve settled on is elegantly simple: I start a tmux session on our research cluster, launch Jupyter there, and then I can connect to it from anywhere – my laptop, the department computers, even my iPad when I’m feeling adventurous.
The beauty of this approach is its flexibility. When I’m working on training a new machine learning model for particle identification, I can:
- Start the training run in the evening
- Detach from tmux and head home
- Check on the progress from my couch
- Wake up to completed results
The SSH tunneling aspect means I can securely access my notebooks from anywhere. Plus, the VSCode integration means I can keep using my favorite development environment without sacrificing the stability of a tmux-managed session.
The real game-changer for me has been the ability to quickly check if my notebooks are still running without having to fully reconnect. A simple jupyter notebook list command tells me everything I need to know about my running sessions. No more anxiously wondering if that crucial training run is still going while I’m grabbing lunch.
But how exactly does this work? Let’s dive into the details.
The Setup: A Step-by-Step Guide
Let me walk you through how I set this up. It’s pretty straightforward once you know the steps, and trust me, it’s worth the initial setup time.
Step 1: Getting Started with tmux
First things first, you need to connect to your remote server. For me, that’s our physics cluster:
ssh username@physics-cluster.university.edu
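Quick side note: if you connect to the same cluster all the time, an entry in your ~/.ssh/config saves a lot of typing. The alias, hostname, and username below are just placeholders for whatever your setup looks like:
# ~/.ssh/config entry (names here are placeholders)
Host physics-cluster
    HostName physics-cluster.university.edu
    User username
    ServerAliveInterval 60
    ServerAliveCountMax 3
The ServerAliveInterval and ServerAliveCountMax options send periodic keep-alives, which at least gives flaky WiFi a fighting chance. After this, ssh physics-cluster is all you need to type.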
Once you’re connected, start a tmux session. I usually give it a meaningful name so I can find it later:
tmux new -s physics_analysis
Names like lhcb_training or ml_project work just as well. Makes it easier to keep track when you have multiple sessions running.
Step 2: Launch Your Jupyter Server
Now that you’re in your tmux session, navigate to where your notebooks live and start Jupyter:
jupyter notebook --no-browser --port=8888
You’ll see a URL with a token appear - copy this somewhere safe, you’ll need it in a minute. It’ll look something like:
http://localhost:8888/?token=abcd1234...
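If you’re more of a JupyterLab person, the same trick works with the command swapped (I’m assuming port 8888 is free on the cluster; pick another one if it isn’t):
jupyter lab --no-browser --port=8888
Everything else in this guide stays exactly the same.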
Step 3: The Magic of Detaching
Here’s where the magic happens. You can now detach from your tmux session while keeping everything running. Just press:
Ctrl+b, then d
Think of it as putting your session in a safe bubble - it keeps running even when you’re not watching it.
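While we’re at it, a few other default tmux keybindings I lean on (assuming you haven’t remapped the Ctrl+b prefix):
- Ctrl+b, then c - open a new window inside the session
- Ctrl+b, then [ - scroll back through old output (press q to leave)
- Ctrl+b, then s - switch between your sessions interactively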
Step 4: Connecting from Your Local Machine
Now comes the part where you can work from anywhere. Open a new terminal on your local machine and set up an SSH tunnel:
ssh -L 8888:localhost:8888 username@physics-cluster.university.edu
Then just open your browser and go to http://localhost:8888. Use that token you saved earlier, and voilà - you’re connected to your running notebook!
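One more tweak that helps when the tunnel itself is the flaky part: you can run it in the background and add keep-alives. This is just a sketch using the same host and port as above; adjust to taste:
ssh -f -N -o ServerAliveInterval=60 -L 8888:localhost:8888 username@physics-cluster.university.edu
Here -f sends the tunnel to the background and -N tells SSH not to open a remote shell, since all we want is the port forwarding. And even if the tunnel does die, the notebook keeps running inside tmux on the cluster - you just re-run the tunnel command and reconnect.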
Bonus: VSCode Integration
If you’re a VSCode user like me (and let’s be honest, who isn’t these days?), here’s how to make it even better:
- Set up your SSH tunnel as above
- Open VSCode
- Press Cmd+Shift+P (or Ctrl+Shift+P on non-Mac)
- Type “Jupyter: Specify Jupyter Server for Connections”
- Enter the URL with your token
Now you can work with your notebooks right in VSCode while your computations keep running safely on the cluster.
Checking on Your Running Sessions
Sometimes you just want to make sure everything’s still running. You can do this without reconnecting to tmux:
jupyter notebook list
Or if you want to see all your Jupyter processes:
ps aux | grep jupyter | grep $(whoami)
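Or, a slightly more compact version of the same check (pgrep has been available on every Linux cluster I’ve touched):
pgrep -f -u "$(whoami)" jupyter
This prints the PIDs of your Jupyter processes; on most Linux systems you can add -a to also see the full command line for each one.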
Need to Get Back to Your Session?
If you need to check on something in the terminal, just reattach to your tmux session:
tmux attach -t physics_analysis
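And if you’ve forgotten what you named the session (it happens more often than I’d like to admit), list them first:
tmux ls
which is just shorthand for tmux list-sessions.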
This should give you a good starting point for keeping your Jupyter notebooks running smoothly, without having to worry about SSH connections dropping, missing your bus, or your laptop going to sleep.