Running Jupyter Notebook in the Background

Running long Jupyter Notebook jobs in the background

If, like me, you constantly work with Jupyter notebooks for data analysis or machine learning, you know how important it is for them to run without interruption. In particular, if you’re running long computations, you can’t afford to have your notebook stop when you disconnect from your server or shut your laptop. This is where tmux comes in. tmux, short for terminal multiplexer, lets you run sessions in the background, detach from them, and re-attach when needed. In this tutorial, we will learn how to combine tmux with Jupyter Notebook so that your notebooks keep running even when you’re not around.

However, you can’t just open the notebook, run it, and then close the ssh connection: even with tmux, this seems to shut down the execution of the notebook. So we need to do a little more work to make it work.

I will assume that tmux is already installed on your system. If not, you can install it on Debian or Ubuntu with the following command (this requires sudo privileges):

sudo apt-get install tmux

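Before diving in, note that we only need a handful of tmux commands: creating a named session, detaching from it, re-attaching to it, listing sessions, and killing a session. Each one appears in the steps that follow.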

Let’s now assume that the notebook you want to run is stored on a remote server. First, connect to the server using ssh:

ssh username@remote_address

Now you can start a new tmux session:

tmux new-session -s my_jupyter_session

Inside the session, we will use the nbconvert tool to run the notebook. nbconvert is a command-line tool that converts a Jupyter notebook to a number of other formats. It can also execute the notebook and save the result, including all cell outputs, as a new notebook. To do this, run the following commands:

cd path/to/notebook
jupyter nbconvert --to notebook --execute my_notebook.ipynb --output output.ipynb 

This command will execute the notebook and save the result to output.ipynb. While it runs, you can detach from the tmux session by pressing Ctrl + B, then D. The session keeps running in the background, so you can safely close the ssh connection. When you want to check on the progress of the notebook, log back in and re-attach to the session using the following command:

tmux attach -t my_jupyter_session

This re-attaches you to the session so you can check the progress of the notebook. When you no longer need the session, you can kill it with the following command (this also stops a notebook that is still running):

tmux kill-session -t my_jupyter_session

To avoid confusion, it is useful to list the available sessions with the following command:

tmux ls
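
The steps above can also be scripted. The sketch below is one possible way to do it in Python and is not part of the workflow described so far: it builds a `tmux new-session -d` command that starts nbconvert in an already-detached session and copies its messages to a log file. The helper name `tmux_nbconvert_command`, the log file name `run.log`, and the use of `tee` are assumptions made for illustration.

```python
import shlex


def tmux_nbconvert_command(session, notebook, output, log_file="run.log"):
    """Build the argv for a detached tmux session that executes a notebook.

    Hypothetical helper: the name and the default log file are illustrative.
    """
    # The same nbconvert call as in the tutorial, quoted safely for the shell.
    nbconvert = shlex.join([
        "jupyter", "nbconvert", "--to", "notebook", "--execute",
        notebook, "--output", output,
    ])
    # Copy nbconvert's messages (stdout and stderr) to a log file as well.
    shell_command = f"{nbconvert} 2>&1 | tee {shlex.quote(log_file)}"
    # -d creates the session without attaching to it; -s gives it a name.
    return ["tmux", "new-session", "-d", "-s", session, shell_command]


cmd = tmux_nbconvert_command("my_jupyter_session", "my_notebook.ipynb", "output.ipynb")
# Show the command as it could be typed in a terminal.
print(shlex.join(cmd))
```

To actually launch it on the server, the list could be passed to `subprocess.run(cmd, check=True)`. Keep in mind that a tmux session started with a command closes when that command exits, so once the run has finished the session no longer appears in `tmux ls` and the log file is where the messages remain.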

It sounds easy, doesn’t it?

Actually, some notebooks are more complicated and have a hard-coded kernel. Open the notebook in a text editor and you will see something like this:

"metadata": {
  "kernelspec": {
    "name": "conda-env-conda-kernel-py",

Here, the name field under kernelspec is the kernel name. You might need to run the notebook with a different kernel, for example because you don’t have access to the original conda env. In that case, you can manually edit this part of the notebook so that it points to a different kernel:

"metadata": {
  "kernelspec": {
    "name": "conda-env-conda-My_kernel-py",
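
The same edit can be made programmatically. The following is a minimal sketch that uses only Python’s standard `json` module and assumes the notebook is a regular `.ipynb` JSON file. The function name `set_kernel_name` is hypothetical, and the kernel name you pass must match one that exists in your environment.

```python
import json


def set_kernel_name(notebook_path, kernel_name):
    """Rewrite metadata.kernelspec.name in a notebook file, in place.

    Returns the previous kernel name, or None if there was none.
    """
    with open(notebook_path, encoding="utf-8") as f:
        notebook = json.load(f)

    # Create the metadata/kernelspec entries if the notebook lacks them.
    kernelspec = notebook.setdefault("metadata", {}).setdefault("kernelspec", {})
    previous = kernelspec.get("name")
    kernelspec["name"] = kernel_name

    with open(notebook_path, "w", encoding="utf-8") as f:
        # One-space indentation matches how notebooks are usually saved.
        json.dump(notebook, f, indent=1, ensure_ascii=False)
        f.write("\n")
    return previous
```

For example, `set_kernel_name("my_notebook.ipynb", "conda-env-conda-My_kernel-py")` would return the old kernel name. Since the file is overwritten, it may be worth keeping a copy of the original notebook first.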

Alternatively, you can leave the file untouched and pass the kernel name on the command line. First, list the available kernels using the following command:

jupyter kernelspec list

This prints the names of the available kernels and where they are installed; the list will differ depending on your environment.
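
If you want to pick the kernel from a script, the list can also be requested as JSON with `jupyter kernelspec list --json`. The sketch below only parses that output. The sample text is made up for illustration, and the helper name `kernel_names` is hypothetical.

```python
import json


def kernel_names(kernelspec_json):
    """Return the sorted kernel names from `jupyter kernelspec list --json` output."""
    data = json.loads(kernelspec_json)
    # The kernel names are the keys of the top-level "kernelspecs" object.
    return sorted(data.get("kernelspecs", {}))


# Made-up sample in the shape of the command's JSON output.
sample = """
{"kernelspecs": {
  "python3": {"resource_dir": "/usr/share/jupyter/kernels/python3",
              "spec": {"display_name": "Python 3"}},
  "conda-env-conda-My_kernel-py": {"resource_dir": "/opt/kernels/my_kernel",
                                   "spec": {"display_name": "My kernel"}}
}}
"""
print(kernel_names(sample))
```

On the server, the real text could be obtained with `subprocess.run(["jupyter", "kernelspec", "list", "--json"], capture_output=True, text=True, check=True).stdout`.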

Then you can run the notebook using the following command:

jupyter nbconvert --to notebook --execute my_notebook.ipynb --output output.ipynb --ExecutePreprocessor.kernel_name='conda-env-conda-My_kernel-py'

This will run the notebook with the specified kernel. Enjoy your long-running notebooks!

#jupyter   #Python   #background jobs