In this post you'll learn how to use Pelican (a Python static site generator) programmatically rather than through its command-line interface. This will give you a better understanding of how Pelican works internally and enable you to customise it for your needs.
Pelican's highest level of abstraction is its command-line interface, which you would typically use as follows:
$ pelican content output -s pelicanconf.py
This would read all articles and pages in the content directory, convert them to HTML, render web pages with the relevant Jinja templates, and write the resulting static website to the output directory.
The rough flow to achieve this is as follows:
1. Instantiate the Generators (which house all of the relevant Readers and a Jinja Environment) and a Writer.
2. For each Generator:
   - Call its generate_context method, which reads the input files, converts them to HTML, and adds the outputs to a context dictionary.
   - Call its generate_output method, passing the Writer and context. This gets the relevant Jinja Template from the Environment, renders it with the provided context, and writes the result to the final output directory.

As you can see, Generators are responsible for glueing together the lower-level components: Reader, Jinja Template, and Writer. In order to understand each of these components, we'll reimplement the core logic of a Generator from scratch!
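Before touching Pelican itself, the flow above can be sketched with the standard library alone. This is a toy under stated assumptions, not Pelican's implementation: ToyReader, ToyWriter, ToyGenerator, and build_site are hypothetical names, string.Template stands in for Jinja, the inputs are plain .txt files, and the context is stored on the generator rather than passed alongside the Writer.

```python
from pathlib import Path
from string import Template
import tempfile


class ToyReader:
    """Stand-in for a Reader: returns (content, metadata) for a file."""

    def read(self, path):
        text = Path(path).read_text(encoding="utf-8")
        # A real Reader converts markup to HTML; here we only wrap the text.
        return f"<p>{text.strip()}</p>", {"title": Path(path).stem}


class ToyWriter:
    """Stand-in for a Writer: renders a template and saves the result."""

    def __init__(self, output_dir):
        self.output_dir = Path(output_dir)

    def write_file(self, name, template, context):
        self.output_dir.mkdir(parents=True, exist_ok=True)
        target = self.output_dir / name
        target.write_text(template.substitute(context), encoding="utf-8")
        return target


class ToyGenerator:
    """Glues a reader and a template together in two steps."""

    def __init__(self, content_dir, reader, template):
        self.content_dir = Path(content_dir)
        self.reader = reader
        self.template = template
        self.context = {"articles": []}

    def generate_context(self):
        # Step 1: read every input file and collect the results in the context.
        for path in sorted(self.content_dir.glob("*.txt")):
            content, metadata = self.reader.read(path)
            self.context["articles"].append({"content": content, **metadata})

    def generate_output(self, writer):
        # Step 2: render each article with the template and write it out.
        return [
            writer.write_file(f"{a['title']}.html", self.template, a)
            for a in self.context["articles"]
        ]


def build_site(content_dir, output_dir):
    """Run the whole toy pipeline and return the paths that were written."""
    template = Template("<h1>$title</h1>\n$content")
    generator = ToyGenerator(content_dir, ToyReader(), template)
    generator.generate_context()
    return generator.generate_output(ToyWriter(output_dir))
```

Calling build_site(content_dir, output_dir) on a directory of .txt files writes one .html file per input. The rest of this post replaces each toy piece with its real Pelican counterpart.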
Start by setting root to the directory of your Pelican website. If you don't yet have a website, follow Pelican's informative documentation to get started:
from pathlib import Path
root = Path('..')
Now we can load our pelicanconf.py settings file. Pelican provides a function for this which handles details like applying defaults:
from pelican.settings import read_settings
settings = read_settings(root/'pelicanconf.py')
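To give a feel for what "applying defaults" means, here is a minimal standard-library sketch. It is not Pelican's code: toy_read_settings and TOY_DEFAULTS are hypothetical, the default values are made up for illustration, and the rule of keeping only uppercase names is an assumption about the general approach rather than a description of read_settings itself.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical defaults for illustration; the real set of defaults is far larger.
TOY_DEFAULTS = {"SITENAME": "A Toy Site", "THEME": "simple", "DEFAULT_LANG": "en"}


def toy_read_settings(path):
    """Overlay the uppercase names defined in a Python file onto the defaults."""
    # Execute the settings file and capture its module-level names.
    module_globals = runpy.run_path(str(path))
    # Keep only UPPERCASE names, so helpers and imports are ignored.
    overrides = {k: v for k, v in module_globals.items() if k.isupper()}
    # Later keys win, so values from the file replace the defaults.
    return {**TOY_DEFAULTS, **overrides}
```

With a settings file that only sets SITENAME, the returned dictionary holds that value and keeps the toy defaults for THEME and DEFAULT_LANG.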
Let's create a quick blog post for testing. I prefer to write more technical blog posts in Jupyter notebooks, but we'll use Markdown here since Pelican supports it natively.
post_filepath = root/'content/2022-06-20-hello-pelican.md'
%%writefile {post_filepath}
Title: Hello Pelican
Slug: hello-pelican
Author: Wasim Lorgat
Date: 2022-06-20
Tags: python, pelican
Category: python
## Welcome
Hello and welcome to our markdown blog post!
Writing ../content/2022-06-20-hello-pelican.md
Reader
We'll start by instantiating a MarkdownReader to read our blog post. We're using a MarkdownReader because we wrote the post in markdown, but Pelican also provides HTMLReader and RstReader if you prefer those formats.
from pelican.readers import MarkdownReader
reader = MarkdownReader(settings)
The most important part of a Reader is its read method, which accepts a file path and returns the contents of the file in HTML format along with metadata about the file:
content, metadata = reader.read(post_filepath)
... content is a string containing the blog post content converted to HTML. Since this was written in a notebook, we can use an IPython function to render it directly!
from IPython.display import HTML
HTML(content)
Hello and welcome to our markdown blog post!
... and metadata is a dictionary that describes the file:
metadata
{'title': 'Hello Pelican', 'slug': 'hello-pelican', 'author': <Author 'Wasim Lorgat'>, 'date': SafeDatetime(2022, 6, 20, 0, 0), 'tags': [<Tag 'python'>, <Tag 'pelican'>], 'category': <Category 'python'>}
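To demystify the (content, metadata) pair, here is a simplified stand-in for the header parsing. It is only a sketch: toy_read is a hypothetical function, it does not convert the body to HTML, and it leaves values as plain strings rather than the typed objects (Author, SafeDatetime, Tag) shown above.

```python
def toy_read(text):
    """Split leading 'Key: value' lines from the body; returns (body, metadata)."""
    metadata = {}
    lines = text.splitlines()
    for index, line in enumerate(lines):
        key, sep, value = line.partition(":")
        if not sep or not key.strip().isalpha():
            # The first line that is not a 'Key: value' pair starts the body.
            body = "\n".join(lines[index:]).strip()
            break
        metadata[key.strip().lower()] = value.strip()
    else:
        # Every line was a header (or the text was empty), so there is no body.
        body = ""
    if "tags" in metadata:
        # Tags are comma-separated in the header.
        metadata["tags"] = [t.strip() for t in metadata["tags"].split(",")]
    return body, metadata
```

Feeding it the test post from earlier yields the raw markdown body plus a dictionary with lower-cased keys such as title, slug, and tags.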
Writer
Now that we have the contents of the post in HTML format, we'll render it into a static web page using a Writer. However, we first need to create an appropriate Jinja Template. Jinja provides the Environment class for reusing functionality across templates so we'll use that here.
Pelican searches for templates in the following order:
1. settings['THEME_TEMPLATES_OVERRIDES']
2. settings['THEME']

We can implement this search order using a FileSystemLoader, housed in an Environment for convenience:
import pelican
from jinja2 import Environment, FileSystemLoader
from pathlib import Path
# Search order: user overrides first, then the configured theme,
# then Pelican's bundled 'simple' theme as a fallback.
template_paths = [
    *(Path(o) for o in settings['THEME_TEMPLATES_OVERRIDES']),
    Path(settings['THEME'])/'templates',
    Path(pelican.__file__).parent/'themes/simple/templates',
]
env = Environment(loader=FileSystemLoader(template_paths),
                  **settings['JINJA_ENVIRONMENT'])
Now we can get the article template:
template = env.get_template('article.html')
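To make the search order concrete, here is a small standard-library sketch of a first-match lookup across an ordered list of directories. find_template is a hypothetical helper written for illustration, not part of Jinja or Pelican; it only mimics the idea of trying each search path in turn.

```python
from pathlib import Path
import tempfile


def find_template(name, search_paths):
    """Return the first existing file called `name`, trying directories in order."""
    for directory in search_paths:
        candidate = Path(directory) / name
        if candidate.is_file():
            # Earlier directories shadow later ones.
            return candidate
    searched = [str(p) for p in search_paths]
    raise FileNotFoundError(f"{name!r} not found in {searched}")
```

If both an overrides directory and a theme directory contain article.html, the overrides copy is returned because it comes first in the list; a template that exists only in the theme is still found there.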
The last step of preparation is to create the context dictionary that's passed through to the Template to render the article:
from pelican.contents import Article
context = settings.copy()
article = Article(content, metadata, settings, post_filepath, context)
article.readtime = {'minutes': 1} # NOTE: this is a workaround to support the readtime plugin that I use
context['article'] = article
And now we can write the final result!
from pelican.writers import Writer
output_dir = root/'test'
writer = Writer(output_dir, settings)
writer.write_file(Path(post_filepath.name).with_suffix('.html'), template, context)
Let's read it back in and see what it looks like. We'll extract only the body using a simple regex - I'd usually recommend considering Beautiful Soup for parsing HTML but regex works fine for our case:
import re
with open(output_dir/'2022-06-20-hello-pelican.html') as f: html = f.read()
body = re.findall('<body>(.*?)</body>', html, re.DOTALL)[0].strip()
HTML(body)
The provided templates have added a navigation bar at the top, a title below that, as well as the publication date and estimated reading time. And that's it, we've successfully rendered a blog post web page using Pelican's low-level components!
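As an aside on the regex used above: it only matches a bare `<body>` tag, so it would find nothing if a theme added attributes such as a class. If that matters for your theme, a parser-based approach is more forgiving. The sketch below uses the standard library's html.parser rather than Beautiful Soup; BodyExtractor and extract_body are hypothetical names, and the extractor drops comments for brevity.

```python
from html.parser import HTMLParser


class BodyExtractor(HTMLParser):
    """Collects the markup found between <body ...> and </body>."""

    def __init__(self):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.in_body = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through once.
        if self.in_body:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif self.in_body:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.in_body:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.in_body:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.in_body:
            self.parts.append(f"&#{name};")


def extract_body(html):
    """Return the inner markup of the <body> element, stripped of whitespace."""
    parser = BodyExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

In the notebook, extract_body(html) could be passed to HTML(...) in place of the regex result.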
Before we end off, clean up the files we made along the way:
import shutil
shutil.rmtree(output_dir, ignore_errors=True)
post_filepath.unlink(missing_ok=True)