Building the Blog

A rundown of how this blog is written, stored, and published

Garrett McGrath Sun 11 April 2021

It’s been a while since I’ve spent any serious time looking at blogging, static site generation, and the github pages system, so it was no surprise that my knowledge was more than a little out of date. What did catch me off guard was how much of the information online is equally out of date, which was a little disappointing but, on reflection, not overly surprising. It’s not like github shouts from the rooftops when they quietly change things in the background, especially when the old approaches still work.

So this post will serve as somewhat modern documentation for how github user pages can be set up and automated to use Pelican to generate a blog (like this one).

How things used to work

Originally what you would do is create a public repository named <githubuser>.github.io. The main or master branch in the repository would hold your raw content, so in pelican a folder full of markdown files. From there you’d generate an output folder, which you typically would not track in git, and merge that into a separate branch named gh-pages. This felt like some sort of hellish alternate-file-stream approach and was generally kinda difficult to wrap your head around. The alternative was brute forcing it by simply checking out two different copies of the same repo in separate folders, then just copy/pasting from one to the other and pushing. This works, but at that point why have them be the same repo? I suspect this is something github realized somewhere along the way, because that’s how you do it now!

Structure

Github has simplified things a bit: now you can just put your website’s static folder in the main branch (main or master depending on repo settings and age). This allows us some flexibility. The core of this blog is published from a private repository, which lets me avoid exposing my pelican configs, comment cruft, in-development theme files, and other details of how pelican sites are published in the public repository that serves out what you are reading.

To get started I made a conda environment to use for local test builds. It’s overkill, but miniconda envs are my default approach. To this we need to add the following packages (this is the contents of my requirements.txt file, but using the latest versions should work; you might not even need all of this):

blinker==1.4
docutils==0.16
feedgenerator==1.9.1
Jinja2==2.11.3
Markdown==3.2.2
MarkupSafe==1.1.1
pelican==4.5.4
pelican-render-math==1.0.3
Pygments==2.6.1
python-dateutil==2.8.1
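
If a local build misbehaves, it can help to confirm that the active environment really matches those pins. The following is only a rough sketch: parse_pins and find_mismatches are made-up helper names, and it only understands plain name==version lines like the ones above.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package_name: pinned_version} for every 'name==version' line."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Run inside the activated environment, find_mismatches(parse_pins(text)), where text is the contents of requirements.txt, should come back as an empty dict when everything lines up.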

After that environment is created and activated, the private repo needs to be cloned into a local folder and made your working directory. At this point the basic pelican environment can be created with the pelican-quickstart command, which will create the default folder structure and config files needed to get everything running. Before adding and committing these changes, add a quick .gitignore file with the following ignores:

__pycache__
.vscode/*

output/*

Once the above is added to git it can be committed and pushed up to the private repository. This sets up the base you need to build on going forward.

Custom URL

If using a custom url (like say blog.shadowgears.com) you’ll need to do a few things:

1) Make a folder inside your content folder named extra and add a file named CNAME with the custom domain name in it. That will handle the CNAME id file required by github pages.
2) Add the following to your pelicanconf.py file:

STATIC_PATHS = ['extra', 'extra/CNAME']  # flag these paths as 'static' so they are not run through any parsers
EXTRA_PATH_METADATA = {'extra/CNAME': {'path': 'CNAME'}}  # copy the file to the root of the published directory, named CNAME per the github pages docs

3) Add the following to your publishconf.py:

SITEURL = 'https://<your site url here>'

Adding the above will ensure that the CNAME file is placed in the root of your output folder every time the site is generated, so even if it’s removed by accident it’ll be reinstated on the next publish. Additionally, setting SITEURL will make sure all internal links on the site point to the correct location when the site is published.
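
For anyone who would rather script step 1 and double check the result after a build, here is a small sketch. The helper names write_cname and cname_is_published are invented for this example, and the content and output folder names are assumed to be the pelican-quickstart defaults.

```python
from pathlib import Path


def write_cname(content_dir, domain):
    """Create <content_dir>/extra/CNAME containing only the custom domain."""
    cname_path = Path(content_dir) / "extra" / "CNAME"
    cname_path.parent.mkdir(parents=True, exist_ok=True)
    cname_path.write_text(domain.strip() + "\n", encoding="utf-8")
    return cname_path


def cname_is_published(output_dir, domain):
    """Return True if <output_dir>/CNAME exists and names the expected domain."""
    published = Path(output_dir) / "CNAME"
    if not published.is_file():
        return False
    return published.read_text(encoding="utf-8").strip() == domain.strip()
```

Something like write_cname("content", "blog.shadowgears.com") before the first build, then cname_is_published("output", "blog.shadowgears.com") after make publish, should confirm the settings above are doing their job.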

Auto Publishing Posts

At this point you can publish the content: the make publish command will write a fully processed copy of the content folder to output, and for some that’s probably anywhere between good and fine. But that’s not something I want to have to remember every time I write something new, so the next step is to solve automatic publishing! We’ll be pushing from our private repository to the public one, so we are going to need a personal access token with the repo->public_repo privilege; in my testing no additional privileges were required. Make sure you note the token value down somewhere you can copy / paste it from.

The next step is adding this as a secret attached to your private repository. This is done in that repo’s settings->secrets: simply add a new secret named APITOKENPUBLISH and paste the token generated above in as the value. This will be used by the github action we’ll be setting up next.

Now that we’ve got the basics needed to make this work all that’s required is adding the automatic workflow to the private repository. This will be broken into a few steps:

1) Checkout a copy of the repository.
2) Run make publish on the contents of the repository.
3) Clone the public repository into a separate location, copy the output of make publish into it, and push it back out.

To get this started we’ll need a new set of folders at the root of the repository named .github/workflows; inside here is where we’ll place the rest of the magic. Once that folder is in place we are going to need a yaml file inside it. This can be named whatever, but I’m going with pelican-publish.yml. This is going to be a github actions file and is the reason we made that secret entry further up the post. A full copy of the action file is included at the end of this post; the following sections break down the sub-parts.

Setup the Environment

This initial segment sets up some rules and the python environment. We’ll be naming the action Pelican-Publisher, restricting it to only run on pushes to our source repository, and instructing it to run on ubuntu. For that last part we could probably pick something lighter weight so it runs a little faster (this process is slower than I’d like), but for simplicity and compatibility with most published documentation and actions I stuck with ubuntu here. The last bits of this step check out the private repository, create a python environment, and install the dependencies.

name: Pelican-Publisher

# Run this workflow every time a new commit is pushed to the repo
on: push

jobs:
  publish:
    name: Publish 

    runs-on: ubuntu-latest

    steps:
      ## pull the committed code into an environment.
      - name: Checkout Code
        uses: actions/checkout@v2
      - name: Set up Python 3.9
        uses: actions/setup-python@v2
        with:
          python-version: '3.9'  # quoted so YAML keeps it a string rather than a float
      - name: Install Dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

Build the Website

This sounds like it should be long or complicated; it’s neither. If your test make publish commands worked on the local system, then adding this to the above will build the website fresh:

      - name: Build Site
        run: |
          make publish

If this doesn’t work, there’s probably an issue with your local make publish or with your requirements file. Verify those are correct, then revisit this.
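
One cheap thing to rule out first is a file that never got committed. The sketch below is a hypothetical pre-flight check, not part of Pelican: the file list assumes the defaults that pelican-quickstart generates plus the requirements.txt the workflow installs from, and missing_build_files is a name made up for this example.

```python
from pathlib import Path

# Assumed file names: the pelican-quickstart defaults plus requirements.txt.
REQUIRED_FILES = ("Makefile", "pelicanconf.py", "publishconf.py", "requirements.txt")


def missing_build_files(repo_root, required=REQUIRED_FILES):
    """Return the required file names that are absent from repo_root."""
    root = Path(repo_root)
    return [name for name in required if not (root / name).is_file()]
```

Calling missing_build_files(".") from the root of the private repo should return an empty list when all of those files are present.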

Publish the Website

The actually complicated part is here. We’ll be making use of user cpina’s github-action-push-to-another-repository published action to do the heavy lifting. It’ll require a few quick inputs, mostly involving your github ID or the aforementioned API token secret.

      # push generated site to public repo
      - name: Pushes to another repository
        id: push_directory
        uses: cpina/github-action-push-to-another-repository@v1.2
        env:
          API_TOKEN_GITHUB: ${{ secrets.APITOKENPUBLISH }}
        with:
          source-directory: 'output'
          destination-github-username: '<your github id here>'
          destination-repository-name: '<your github id here>.github.io'
          user-email: <your email address here so the commit is attributed correctly>
          commit-message: Automated Update
          target-branch: main

This action does a few things. First it checks out the target repository, in this case our public <username>.github.io repository. Then it makes a new folder and copies just the .git subfolder from the checked out repository into it. Over the top of this it copies the contents of the output folder, and finally it commits and pushes this newly minted git repository.

Doing it this way is important because it ensures your new website version is a completely clean copy of the generated output, rather than one that might miss an article deletion or restructuring that doesn’t get tracked properly.
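
To make the "clean copy" idea concrete, here is a Python sketch of the staging sequence as described above. It is not the action’s own code, the function name stage_clean_copy is invented for illustration, and the commit and push at the end are deliberately left out.

```python
import shutil
from pathlib import Path


def stage_clean_copy(target_checkout, output_dir, staging_dir):
    """Build staging_dir from the target's .git folder plus the fresh output only."""
    target_checkout = Path(target_checkout)
    output_dir = Path(output_dir)
    staging_dir = Path(staging_dir)
    # Refuse to reuse an existing folder so no stale files can sneak in.
    staging_dir.mkdir(parents=True)
    # Keep the repository history and nothing else from the old checkout.
    shutil.copytree(target_checkout / ".git", staging_dir / ".git")
    # Lay the freshly generated site over the top.
    shutil.copytree(output_dir, staging_dir, dirs_exist_ok=True)
    return staging_dir
```

Because nothing but .git is carried over from the old checkout, a post that was deleted from the source simply never appears in the staging folder, and git records it as removed on the next commit.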

Putting it all together

This is the full action file ready to be copy/pasted and updated to your own settings.

name: Pelican-Publisher

# Run this workflow every time a new commit is pushed to the repo
on: push

jobs:
  publish:
    name: Publish 

    runs-on: ubuntu-latest

    steps:
      ## pull the committed code into an environment.
      - name: Checkout Code
        uses: actions/checkout@v2
      - name: Set up Python 3.9
        uses: actions/setup-python@v2
        with:
          python-version: '3.9'  # quoted so YAML keeps it a string rather than a float
      - name: Install Dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Build Site
        run: |
          make publish
      # push generated site to public repo
      - name: Pushes to another repository
        id: push_directory
        uses: cpina/github-action-push-to-another-repository@v1.2
        env:
          API_TOKEN_GITHUB: ${{ secrets.APITOKENPUBLISH }}
        with:
          source-directory: 'output'
          destination-github-username: '<your github id here>'
          destination-repository-name: '<your github id here>.github.io'
          user-email: <your email address here so the commit is attributed correctly>
          commit-message: Automated Update
          target-branch: main

Adding Content

In the process of writing this article I’ve gone from using VSCode to write it, to moving to ZETTLR, and then switching back to vscode for a few reasons. Initially, when I was developing this article in vscode, I’d attempted to install a spell checker to make sure my writing is at least spelled right if not, you know, comprehensible. Unfortunately the recommended spell checker just seemed to freak out and highlight the entire markdown file as spelled incorrectly.

So I removed that addon and migrated over to editing in a zettlr window, which has some upsides I really liked. The spell checking is sane, and rendering markdown ‘live’ as you edit it works incredibly well. Unfortunately the folder handling is rather clunky; zettlr has opinions about how things are organized, and they don’t mesh with how I work as well as I thought. Additionally, the linux version seems to have some serious keyboard shortcut collisions that I didn’t have the willpower to fix. When you are trying to paste with ctrl+v and it instead keeps warping you to the bottom of the page for some reason, it’s time to stop fighting a tool that isn’t set up to do what you want.

So I’m back to using vscode as my editor, and in the meantime I’ve found a different spellchecker for vscode that acts more sanely. The Code Spell Checker by Street Side Software works more or less how I’d expect. The only nagging point currently is having to hit ctrl+. instead of right clicking on a misspelled word in order to fix it. All in all, it works, and that’s what’s important here.

Notes

Below are notes and references used to work out the details above.

This blog is built and published using the Pelican static site generator system in a two stage setup.

reference material on the current state of github pages:

actions reference materials:

  • https://github.com/marketplace/actions/push-directory-to-another-repository
  • https://github.com/cpina/push-to-another-repository-example
  • https://nolanbconaway.github.io/pelican-deploy-gh-actions/pages/deployment-on-github-pages.html
  • https://docs.github.com/en/actions/quickstart

currently using: github actions based on ubuntu-latest, python setup commands, and the push directory to another repository action.

