Jupyter Server: A whirlwind tour

jupyter
tutorial
A step-by-step guide to interacting with Jupyter Server and creating your own Jupyter frontend.
Author

Wasim Lorgat

Published

January 12, 2023

Photo by Planet Volumes on Unsplash

This blog post (and the source notebook) is an executable playground for understanding how to communicate with Jupyter Servers. You can think of it as a barebones Jupyter frontend, since we’ll be implementing the full lifecycle including creating a new notebook, writing and executing code cells, and shutting down the server.

I’m building my own native macOS Jupyter frontend and writing about my experience and learnings along the way. In order to do that, I need to be familiar with how Jupyter Server works.

My approach to learning this was a combination of using Chrome dev tools to inspect network requests in Jupyter Lab, and reading the wonderful Jupyter Server docs (particularly the REST API reference). I’ll include links to the relevant docs in each section below.

Let’s get started!

Starting the server

To start, ensure that you’re running a Jupyter Server in another process (e.g. in a terminal) by entering the following command:

jupyter server

Once the server is running, update the url_with_token variable below to match what’s displayed in the terminal output. The server should print something like this:

[C 2023-01-07 12:03:57.482 ServerApp]

    To access the server, open this file in a browser:
        file:///Users/seem/Library/Jupyter/runtime/jpserver-80287-open.html
    Or copy and paste one of these URLs:
        http://localhost:8889/?token=72b22f0cee26baaa6aed492b6fed5a010d57bd6c0e1adcce
     or http://127.0.0.1:8889/?token=72b22f0cee26baaa6aed492b6fed5a010d57bd6c0e1adcce
# NB: Update this based on your terminal output
url_with_token = 'http://localhost:8889/?token=e78ceb3114cb10d50f64485b18e3052c66861616166e0bab'

Authenticating

First, we’ll do a quick check that there is a server at the defined url. We need to get the URL without the token query parameter:

from urllib.parse import urlparse
url = urlparse(url_with_token)._replace(query='').geturl()
url
'http://localhost:8889/'

Now we can make the request:

import requests
requests.get(url)
<Response [200]>

A 200 response means that the server processed the request successfully.

Next we need to authenticate. What happens if we try to make a request to an endpoint that requires authentication, for example GET /api/contents?

requests.get(url + 'api/contents')
<Response [403]>

It fails with 403 Forbidden.

If we include our token in the Authorization header:

token = urlparse(url_with_token).query.split('=')[-1]
headers = {'Authorization': f'token {token}'}
requests.get(url + 'api/contents', headers=headers)
<Response [200]>

… it works!
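
Splitting the query string on '=' works for the URL above, but it would pick up the wrong value if the URL ever carried a second query parameter. Here is a slightly more defensive sketch using only the standard library. The helper name split_server_url and the example URL are made up for illustration; they are not part of Jupyter Server.

```python
from urllib.parse import parse_qs, urlparse

def split_server_url(url_with_token):
    """Return (base_url, token) for a URL like the one Jupyter Server prints."""
    parsed = urlparse(url_with_token)
    # parse_qs maps each query parameter to a list of its values
    tokens = parse_qs(parsed.query).get('token', [])
    if not tokens:
        raise ValueError('no token parameter found in URL')
    # Drop the query string to get the base URL
    base_url = parsed._replace(query='').geturl()
    return base_url, tokens[0]

# Hypothetical URL with an extra parameter after the token
base_url, token = split_server_url('http://localhost:8889/?token=abc123&foo=bar')
headers = {'Authorization': f'token {token}'}
print(base_url, headers)
```

The resulting headers dict has the same shape as the one built above, so it can be passed to requests in the same way.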

Let’s create a requests.Session so we don’t have to keep specifying the header:

session = requests.Session()
session.headers.update(headers)

Managing files

Jupyter Server lets you manage files via the Contents API. Browser frontends access this via the /api/contents REST API.

Let’s use the Contents API to create a file, rename it, and write some contents to it.

List the contents of a directory

GET /api/contents/<path> returns the contents of the file or directory at path. You can think of it as ls for Jupyter Server:

session.get(url + 'api/contents').json()
{'name': '',
 'path': '',
 'last_modified': '2023-01-19T05:58:38.693411Z',
 'created': '2023-01-19T05:58:38.693411Z',
 'content': [],
 'format': 'json',
 'mimetype': None,
 'size': None,
 'writable': True,
 'type': 'directory'}

Since the directory is currently empty, content is an empty list.
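
To make the ls analogy concrete, here is a minimal sketch of a helper that turns a directory model into a list of (name, type) pairs. The function name list_entries and the example entries are hypothetical; only the 'type', 'path', 'name' and 'content' keys are taken from the response above.

```python
def list_entries(model):
    """Return sorted (name, type) pairs for a directory model from the Contents API."""
    if model['type'] != 'directory':
        raise ValueError(f"{model['path']!r} is not a directory")
    # For a directory, 'content' is a list of child models
    return sorted((child['name'], child['type']) for child in model['content'])

# A trimmed-down, made-up model shaped like the response above
example = {
    'name': '', 'path': '', 'type': 'directory',
    'content': [
        {'name': 'notes.txt', 'path': 'notes.txt', 'type': 'file'},
        {'name': 'data', 'path': 'data', 'type': 'directory'},
    ],
}
entries = list_entries(example)
print(entries)
```

With a live server you could pass session.get(url + 'api/contents').json() instead of the made-up example.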

Create an empty notebook

POST /api/contents/<path> creates an empty file in the directory at path. You can specify the type of the file in the request body:

session.post(url + 'api/contents', json={'type': 'notebook'})
<Response [201]>

The 201 status code means that the request succeeded and a resource was created.

Let’s confirm that the file exists with GET /api/contents:

session.get(url + 'api/contents').json()
{'name': '',
 'path': '',
 'last_modified': '2023-01-19T06:01:01.089699Z',
 'created': '2023-01-19T06:01:01.089699Z',
 'content': [{'name': 'Untitled.ipynb',
   'path': 'Untitled.ipynb',
   'last_modified': '2023-01-19T06:01:01.090600Z',
   'created': '2023-01-19T06:01:01.090600Z',
   'content': None,
   'format': None,
   'mimetype': None,
   'size': 72,
   'writable': True,
   'type': 'notebook'}],
 'format': 'json',
 'mimetype': None,
 'size': None,
 'writable': True,
 'type': 'directory'}

The response is a nested dict. The root dict refers to the root directory as before, but content now contains the newly created notebook named Untitled.ipynb.

We can get the contents of this file using the same method but referring to the file’s path, i.e. GET /api/contents/<path>:

data = session.get(url + 'api/contents/Untitled.ipynb').json()
data
{'name': 'Untitled.ipynb',
 'path': 'Untitled.ipynb',
 'last_modified': '2023-01-19T06:01:01.090600Z',
 'created': '2023-01-19T06:01:01.090600Z',
 'content': {'cells': [], 'metadata': {}, 'nbformat': 4, 'nbformat_minor': 5},
 'format': 'json',
 'mimetype': None,
 'size': 72,
 'writable': True,
 'type': 'notebook'}

We’re probably most interested in content, which contains the JSON content of the notebook:

data['content']
{'cells': [], 'metadata': {}, 'nbformat': 4, 'nbformat_minor': 5}

For now, the notebook only has some metadata, and cells is empty.

Rename a notebook

Our newly created file is still named Untitled.ipynb. Let’s rename it to sum.ipynb with PATCH /api/contents/<path>:

session.patch(url + 'api/contents/Untitled.ipynb', json={'path': 'sum.ipynb'}).json()
{'name': 'sum.ipynb',
 'path': 'sum.ipynb',
 'last_modified': '2023-01-19T06:01:01.090600Z',
 'created': '2023-01-19T06:01:01.210202Z',
 'content': None,
 'format': None,
 'mimetype': None,
 'size': 72,
 'writable': True,
 'type': 'notebook'}

Confirm that it’s been renamed. Untitled.ipynb no longer exists:

session.get(url + 'api/contents/Untitled.ipynb').json()
{'message': 'No such file or directory: Untitled.ipynb', 'reason': None}

… but sum.ipynb does:

session.get(url + 'api/contents/sum.ipynb').json()
{'name': 'sum.ipynb',
 'path': 'sum.ipynb',
 'last_modified': '2023-01-19T06:01:01.090600Z',
 'created': '2023-01-19T06:01:01.210202Z',
 'content': {'cells': [], 'metadata': {}, 'nbformat': 4, 'nbformat_minor': 5},
 'format': 'json',
 'mimetype': None,
 'size': 72,
 'writable': True,
 'type': 'notebook'}
Note

You can also create a file with a specified name using PUT /api/contents/<path>, instead of letting the server find a unique name prefixed with Untitled.
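
As a rough sketch of what the body of such a PUT request might look like, the snippet below builds a model for an empty notebook. It assumes the server accepts a full model (type, format and content) when creating a file, the same way it does when saving one. The helper name empty_notebook_model and the file name named.ipynb are hypothetical.

```python
def empty_notebook_model():
    """Build a request body describing an empty notebook."""
    return {
        'type': 'notebook',
        'format': 'json',
        # Same empty-notebook content the server returned for Untitled.ipynb
        'content': {'cells': [], 'metadata': {}, 'nbformat': 4, 'nbformat_minor': 5},
    }

model = empty_notebook_model()
# With the session from earlier, this could be sent as (not run here):
# session.put(url + 'api/contents/named.ipynb', json=model)
print(model['type'], len(model['content']['cells']))
```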

Update a notebook’s contents

Create a cell and append it to existing contents:

cell = {
    'cell_type': 'code',
    'id': '0',
    'metadata': {},
    'source': [
        '1 + 1\n',
    ],
    'outputs': [],
    'execution_count': 0,
}
data = session.get(url + 'api/contents/sum.ipynb').json()
data['content']['cells'].append(cell)
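
Hand-writing cell dictionaries gets repetitive once a notebook has more than one cell. Here is a minimal sketch of a helper that builds them. The name make_code_cell is hypothetical, the short random id is just one way to get a unique cell id, and execution_count is set to None on the assumption that the cell has not been run yet (the notebook format allows an integer or null there).

```python
import uuid

def make_code_cell(source):
    """Build an unexecuted code cell in the same shape as the dict above."""
    return {
        'cell_type': 'code',
        # Cell ids must be unique within a notebook; 8 hex chars is an arbitrary choice
        'id': uuid.uuid4().hex[:8],
        'metadata': {},
        'source': source,
        'outputs': [],
        # None (null in JSON) marks a cell that has not been executed
        'execution_count': None,
    }

new_cell = make_code_cell('1 + 1\n')
print(new_cell['cell_type'], new_cell['source'])
```

A cell built this way could be appended with data['content']['cells'].append(new_cell), exactly like the hand-written one.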

Update the notebook’s contents using PUT /api/contents/<path>:

session.put(url + 'api/contents/sum.ipynb', json={'content': data['content'], 'type': 'notebook'})
<Response [200]>

Confirm that the notebook has been updated. Note that last_modified and content have both changed:

session.get(url + 'api/contents/sum.ipynb').json()
{'name': 'sum.ipynb',
 'path': 'sum.ipynb',
 'last_modified': '2023-01-19T06:01:01.348274Z',
 'created': '2023-01-19T06:01:01.348274Z',
 'content': {'cells': [{'cell_type': 'code',
    'execution_count': 0,
    'id': '0',
    'metadata': {'trusted': True},
    'outputs': [],
    'source': '1 + 1\n'}],
  'metadata': {},
  'nbformat': 4,
  'nbformat_minor': 5},
 'format': 'json',
 'mimetype': None,
 'size': 216,
 'writable': True,
 'type': 'notebook'}

Executing code

Most of the functionality available inside a Jupyter Notebook in your browser is achieved by communicating with the server via websockets. This includes executing code as well as code completion.

Let’s execute a very simple bit of code on the server.

Start a session

List open sessions with GET /api/sessions:

session.get(url + 'api/sessions').json()
[]

First we need to choose a kernel specification. Here are the available options on my computer – yours will likely differ:

session.get(url + 'api/kernelspecs').json()
{'default': 'python3',
 'kernelspecs': {'dyalog-kernel': {'name': 'dyalog-kernel',
   'spec': {'argv': ['python3',
     '-m',
     'dyalog_kernel',
     '-f',
     '{connection_file}'],
    'env': {},
    'display_name': 'Dyalog APL',
    'language': 'apl',
    'interrupt_mode': 'signal',
    'metadata': {}},
   'resources': {'kernel.js': '/kernelspecs/dyalog-kernel/kernel.js'}},
  'python3': {'name': 'python3',
   'spec': {'argv': ['python',
     '-m',
     'ipykernel_launcher',
     '-f',
     '{connection_file}'],
    'env': {},
    'display_name': 'Python 3 (ipykernel)',
    'language': 'python',
    'interrupt_mode': 'signal',
    'metadata': {'debugger': True}},
   'resources': {'logo-64x64': '/kernelspecs/python3/logo-64x64.png',
    'logo-32x32': '/kernelspecs/python3/logo-32x32.png',
    'logo-svg': '/kernelspecs/python3/logo-svg.svg'}}}}
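
Rather than hard-coding a kernel name, a frontend could pick one from this response. The sketch below prefers the server's default kernel and falls back to any kernelspec with a matching language. The function name pick_kernel is hypothetical, and the example dict is a trimmed copy of the response above.

```python
def pick_kernel(specs, language='python'):
    """Return a kernelspec name for the given language, preferring the default."""
    kernelspecs = specs['kernelspecs']
    # Try the default first, then every other spec in alphabetical order
    candidates = [specs.get('default')] + sorted(kernelspecs)
    for name in candidates:
        info = kernelspecs.get(name)
        if info and info['spec'].get('language') == language:
            return name
    raise LookupError(f'no kernelspec found for language {language!r}')

# Trimmed copy of the GET /api/kernelspecs response shown above
specs = {
    'default': 'python3',
    'kernelspecs': {
        'dyalog-kernel': {'name': 'dyalog-kernel', 'spec': {'language': 'apl'}},
        'python3': {'name': 'python3', 'spec': {'language': 'python'}},
    },
}
print(pick_kernel(specs), pick_kernel(specs, 'apl'))
```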

Create a new session with POST /api/sessions with the python3 kernelspec:

data = session.post(url + 'api/sessions', json={'kernel': {'name': 'python3'}, 'name': 'sum.ipynb', 'path': 'sum.ipynb', 'type': 'notebook'}).json()
data
{'id': '5730d780-fa1f-446e-b8ad-f3e66be9d063',
 'path': 'sum.ipynb',
 'name': 'sum.ipynb',
 'type': 'notebook',
 'kernel': {'id': '760db402-af7f-4559-aa39-5518d2107b14',
  'name': 'python3',
  'last_activity': '2023-01-19T06:01:01.734770Z',
  'execution_state': 'starting',
  'connections': 0},
 'notebook': {'path': 'sum.ipynb', 'name': 'sum.ipynb'}}

Now that a session exists, we can connect to a websocket. We’ll need the kernel_id and session_id to do that, so let’s store them for the next step:

kernel_id = data['kernel']['id']
session_id = data['id']

Communicate over WebSockets

First, let’s craft a message to request an execution – you can try changing the value of the code variable below to execute something else:

import uuid

code = '1 + 1'
code_msg_id = str(uuid.uuid1())
code_msg = {'channel': 'shell',
            'content': {'silent': False, 'code': code},
            'header': {'msg_id': code_msg_id, 'msg_type':'execute_request'},
            'metadata': {},
            'parent_header':{}}
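
The message above carries only the fields that were needed for this run. Jupyter's messaging specification describes a fuller header (session, username, date and version) plus a few more execute_request content fields. Here is a sketch of a builder that fills them in. The function name make_execute_request, the 'username' placeholder and the '5.3' protocol version are assumptions for illustration, not values taken from this post.

```python
import uuid
from datetime import datetime, timezone

def make_execute_request(code, session_id, username='username'):
    """Build an execute_request message with a fuller header."""
    return {
        'channel': 'shell',
        'header': {
            'msg_id': uuid.uuid4().hex,
            'session': session_id,
            'username': username,  # placeholder value
            'date': datetime.now(timezone.utc).isoformat(),
            'msg_type': 'execute_request',
            'version': '5.3',  # assumed protocol version
        },
        'parent_header': {},
        'metadata': {},
        'content': {
            'code': code,
            'silent': False,
            'store_history': True,
            'user_expressions': {},
            'allow_stdin': False,
            'stop_on_error': True,
        },
    }

full_msg = make_execute_request('1 + 1', session_id='example-session')
print(full_msg['header']['msg_type'], full_msg['content']['code'])
```

The returned dict can be serialised with json.dumps and sent over the websocket in the same way as code_msg.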

Now we can send the message to the server and receive all responses.

We’ll use the websocket-client library. You might also want to consider the websockets library, which is asynchronous.

import json
from contextlib import closing
from websocket import create_connection, WebSocketTimeoutException

def recv_all(conn):
    while True:
        try: msg = json.loads(conn.recv())
        except WebSocketTimeoutException: break
        print(f"  type: {msg['msg_type']:16} content: {msg['content']}")

ws_base_url = urlparse(url)._replace(scheme='ws').geturl()
ws_url = ws_base_url + f'api/kernels/{kernel_id}/channels?session_id={session_id}'

with closing(create_connection(ws_url, header=headers, timeout=1)) as conn:
    print('Receiving initial messages\n')
    recv_all(conn)
    print('\nSending execute_request\n')
    conn.send(json.dumps(code_msg))
    print('Receiving execute_reply\n')
    recv_all(conn)
Receiving initial messages

  type: status           content: {'execution_state': 'busy'}
  type: status           content: {'execution_state': 'idle'}
  type: status           content: {'execution_state': 'idle'}

Sending execute_request

Receiving execute_reply

  type: status           content: {'execution_state': 'busy'}
  type: execute_input    content: {'code': '1 + 1', 'execution_count': 1}
  type: execute_result   content: {'data': {'text/plain': '2'}, 'metadata': {}, 'execution_count': 1}
  type: status           content: {'execution_state': 'idle'}
  type: execute_reply    content: {'status': 'ok', 'execution_count': 1, 'user_expressions': {}, 'payload': []}

Yay! We successfully executed code on the server via websockets.
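
The websocket URL above was built by swapping the scheme to ws, which suits a local http server. If the server were behind https, the scheme would need to be wss instead. Here is a small sketch that handles both cases. The helper name kernel_channels_url and the placeholder ids are hypothetical.

```python
from urllib.parse import urlencode, urlparse

def kernel_channels_url(base_url, kernel_id, session_id):
    """Build the websocket URL for a kernel's channels endpoint."""
    parsed = urlparse(base_url)
    # http maps to ws, https maps to wss
    ws_scheme = 'wss' if parsed.scheme == 'https' else 'ws'
    # Keep any base path the server is mounted under
    path = parsed.path.rstrip('/') + f'/api/kernels/{kernel_id}/channels'
    query = urlencode({'session_id': session_id})
    return parsed._replace(scheme=ws_scheme, path=path, query=query).geturl()

local_url = kernel_channels_url('http://localhost:8889/', 'KERNEL', 'SESSION')
secure_url = kernel_channels_url('https://example.com/jupyter/', 'KERNEL', 'SESSION')
print(local_url)
print(secure_url)
```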

You can learn more about Jupyter’s messaging specification in the Jupyter Client docs.
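
Relying on a timeout to decide when the replies are finished is simple but a little fragile. Replies refer back to the request through parent_header, so one alternative is to collect only the messages whose parent matches our msg_id and stop once we have seen both the idle status and the execute_reply. The sketch below works on any iterable of already-decoded messages. The names collect_reply and fake are hypothetical, and the fake stream is made up to mirror the output above.

```python
def collect_reply(messages, msg_id):
    """Collect messages answering msg_id until both idle and execute_reply arrive."""
    collected, seen_idle, seen_reply = [], False, False
    for msg in messages:
        # Skip messages that belong to other requests
        if msg.get('parent_header', {}).get('msg_id') != msg_id:
            continue
        collected.append(msg)
        if msg['msg_type'] == 'status' and msg['content'].get('execution_state') == 'idle':
            seen_idle = True
        elif msg['msg_type'] == 'execute_reply':
            seen_reply = True
        if seen_idle and seen_reply:
            break
    return collected

def fake(msg_type, parent, **content):
    """Build a made-up message with just the fields collect_reply reads."""
    return {'msg_type': msg_type, 'parent_header': {'msg_id': parent}, 'content': content}

stream = [
    fake('status', 'other', execution_state='idle'),
    fake('status', 'abc', execution_state='busy'),
    fake('execute_input', 'abc', code='1 + 1', execution_count=1),
    fake('execute_result', 'abc', data={'text/plain': '2'}, execution_count=1),
    fake('status', 'abc', execution_state='idle'),
    fake('execute_reply', 'abc', status='ok', execution_count=1),
    fake('status', 'abc', execution_state='busy'),
]
replies = collect_reply(stream, 'abc')
print([m['msg_type'] for m in replies])
```

Against a real connection, the messages argument could be a generator that yields json.loads(conn.recv()) in a loop, with code_msg_id as the msg_id.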

Cleanup

It’s always good practice to clean up after ourselves, particularly if we share the server with other users.

Let’s close our session and shut down the server (although we probably wouldn’t shut it down if we shared it with others!).

Close the session

Since we’re done with the session, we can close it via DELETE /api/sessions/<session_id>:

session.delete(url + f'api/sessions/{session_id}')
<Response [204]>

Shut down the server

Finally, shut down the server via POST /api/shutdown.

session.post(url + 'api/shutdown')
<Response [200]>

… and confirm that it’s been shut down correctly:

try: session.get(url)
except requests.exceptions.ConnectionError: print('Server has been successfully shutdown!')
Server has been successfully shutdown!

All done!

Next steps

Congrats! If you followed all the way to the end, you’ve now created a barebones Jupyter frontend. Here are some directions you might consider to take this further:

  • How would you implement other notebook features like code completion?
  • How does Jupyter’s trust system work?
  • How would you implement Jupyter’s checkpointing system?
  • Can you redo this in another language?
  • How would you design and build your own UI on top of this?

As for me, my next step is to start translating these into Swift as part of the native macOS Jupyter frontend I’m building.

Let me know on Twitter or via email if you enjoyed this or if you have any questions!