Keep Jupyter Notebooks Running
As a graduate student working on physics analysis with LHCb data and various machine learning projects, I’ve learned the hard way that running long computational tasks can be both a blessing and a curse. There’s nothing quite like the sinking feeling of realizing your laptop went to sleep in the middle of a 6-hour training run, or that your SSH connection dropped while processing terabytes of particle collision data. Yes, Apple silicon machines have excellent battery life, but have you ever dealt with UC IT and their “very good” internet connection? If not, you’re lucky. I’d say roughly two out of three of my SSH connections drop at random. It’s almost funny: this happens even when I connect to our group’s cluster, which sits in the same building on the same network. Not to mention that, for some reason, eduroam refuses to connect to the group cluster at all.
After one too many lost connections and interrupted training sessions, I’ve finally figured out a robust way to keep my Jupyter notebooks running smoothly, even when I’m not actively connected to the server. Trust me, this knowledge has saved me countless hours of frustration and repeated computation. To be honest, long-running jobs like these really belong in Python or bash scripts. But sometimes you just want to see the results in real time.
So, if you’re like me, you probably spend a good chunk of your time running analysis notebooks that take hours to complete. Whether it’s training deep learning models on LHCb simulation data or fine-tuning something (a small LLM, if you’re a rich academic), these aren’t the kind of tasks you want to babysit. You need something that keeps running even when:
- Your laptop decides it’s bedtime and goes to sleep
- Your coffee shop WiFi becomes unstable (or god forbid, your university’s WiFi)
- You need to catch the last bus home (or the first bus back to the physics building)
- You want to actually have a life outside research (there’s a chance that this exists)
After trying various approaches, I’ve found that tmux is the holy grail for running long-duration Jupyter sessions. Think of it as a persistent terminal that keeps your processes running even when you’re not connected.
The basic workflow I’ve settled on is elegantly simple: I start a tmux session on our research cluster, launch Jupyter there, and then I can connect to it from anywhere – my laptop, the department computers, even my iPad when I’m feeling adventurous.
The beauty of this approach is its flexibility. When I’m working on training a new machine learning model for particle identification, I can:
- Start the training run in the evening
- Detach from tmux and head home
- Check on the progress from my couch
- Wake up to completed results
The SSH tunneling aspect means I can securely access my notebooks from anywhere. Plus, the VSCode integration means I can keep using my favorite development environment without sacrificing the stability of a tmux-managed session.
The real game-changer for me has been the ability to quickly check if my notebooks are still running without having to fully reconnect. A simple jupyter notebook list command tells me everything I need to know about my running sessions. No more anxiously wondering if that crucial training run is still going while I’m grabbing lunch.
But how exactly does this work? Let’s dive into the details.
The Setup: A Step-by-Step Guide
Let me walk you through how I set this up. It’s pretty straightforward once you know the steps, and trust me, it’s worth the initial setup time.
Step 1: Getting Started with tmux
First things first, you need to connect to your remote server. For me, that’s our physics cluster:
ssh username@physics-cluster.university.edu
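Quick side note: if you connect to the same cluster all the time, an entry in your ~/.ssh/config saves a lot of typing. The alias, hostname, and username below are just placeholders for whatever your setup looks like:
# ~/.ssh/config entry (names here are placeholders)
Host physics-cluster
    HostName physics-cluster.university.edu
    User username
    ServerAliveInterval 60
    ServerAliveCountMax 3
The ServerAliveInterval and ServerAliveCountMax options send periodic keep-alives, which at least gives flaky WiFi a fighting chance. After this, ssh physics-cluster is all you need to type.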
Once you’re connected, start a tmux session. I usually give it a meaningful name so I can find it later:
tmux new -s physics_analysis
Names like lhcb_training or ml_project work just as well. Makes it easier to keep track when you have multiple sessions running.
Step 2: Launch Your Jupyter Server
Now that you’re in your tmux session, navigate to where your notebooks live and start Jupyter:
jupyter notebook --no-browser --port=8888
You’ll see a URL with a token appear - copy this somewhere safe, you’ll need it in a minute. It’ll look something like:
http://localhost:8888/?token=abcd1234...
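If you’re more of a JupyterLab person, the same trick works with the command swapped (I’m assuming port 8888 is free on the cluster; pick another one if it isn’t):
jupyter lab --no-browser --port=8888
Everything else in this guide stays exactly the same.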
Step 3: The Magic of Detaching
Here’s where the magic happens. You can now detach from your tmux session while keeping everything running. Just press:
Ctrl+b, then d
Think of it as putting your session in a safe bubble - it keeps running even when you’re not watching it.
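While we’re at it, a few other default tmux keybindings I lean on (assuming you haven’t remapped the Ctrl+b prefix):
- Ctrl+b, then c - open a new window inside the session
- Ctrl+b, then [ - scroll back through old output (press q to leave)
- Ctrl+b, then s - switch between your sessions interactively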
Step 4: Connecting from Your Local Machine
Now comes the part where you can work from anywhere. Open a new terminal on your local machine and set up an SSH tunnel:
ssh -L 8888:localhost:8888 username@physics-cluster.university.edu
Then just open your browser and go to http://localhost:8888. Use that token you saved earlier, and voilà - you’re connected to your running notebook!
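One more tweak that helps when the tunnel itself is the flaky part: you can run it in the background and add keep-alives. This is just a sketch using the same host and port as above; adjust to taste:
ssh -f -N -o ServerAliveInterval=60 -L 8888:localhost:8888 username@physics-cluster.university.edu
Here -f sends the tunnel to the background and -N tells SSH not to open a remote shell, since all we want is the port forwarding. And even if the tunnel does die, the notebook keeps running inside tmux on the cluster - you just re-run the tunnel command and reconnect.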
Bonus: VSCode Integration
If you’re a VSCode user like me (and let’s be honest, who isn’t these days?), here’s how to make it even better:
- Set up your SSH tunnel as above
- Open VSCode
- Press Cmd+Shift+P (or Ctrl+Shift+P on non-Mac)
- Type “Jupyter: Specify Jupyter Server for Connections”
- Enter the URL with your token
Now you can work with your notebooks right in VSCode while your computations keep running safely on the cluster.
Checking on Your Running Sessions
Sometimes you just want to make sure everything’s still running. You can do this without reconnecting to tmux:
jupyter notebook list
Or if you want to see all your Jupyter processes:
ps aux | grep jupyter | grep $(whoami)
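Or, a slightly more compact version of the same check (pgrep has been available on every Linux cluster I’ve touched):
pgrep -f -u "$(whoami)" jupyter
This prints the PIDs of your Jupyter processes; on most Linux systems you can add -a to also see the full command line for each one.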
Need to Get Back to Your Session?
If you need to check on something in the terminal, just reattach to your tmux session:
tmux attach -t physics_analysis
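And if you’ve forgotten what you named the session (it happens more often than I’d like to admit), list them first:
tmux ls
which is just shorthand for tmux list-sessions.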
This should give you a good starting point for keeping your Jupyter notebooks running smoothly, without having to worry about SSH connections dropping, missing your bus, or your laptop going to sleep.