
Docker Hub Stats & Google Apps Script

Tags: docker, docker hub, google, apps script, gas, stats

Introduction

I've put together a few public Docker images containing the latest Chrome Headless binaries. My main motivation at the time was to have an easy way to try out the latest features and build a few prototypes using headless mode on a cheap VPS.

I sent a short announcement to the dev-headless mailing list and kept an eye on the stats by manually checking the Docker Hub page. After a while, I thought it would be nice to just have a daily snapshot of the numbers instead.

More Data

I had this thought in the back of my mind for a while, but didn't feel it was worth setting up a proper cron job in a production environment and building a web-based visualization. I wanted something that required less maintenance and was easier for me to share.

As it happened, I was doing a deep dive into Google Sheets and was interested in learning its more advanced features. This led me to read up on Google Apps Script, and what I found was essentially a very restricted but fully managed server-side JavaScript VM.

The way I think of it is more like a Google App Engine JS VM with a very limited, but essential set of APIs.

DockerHub: Alpeware

How it works

1. Crawling and Parsing Docker Hub

If you look at the source of Alpeware's Docker Hub profile page, you will notice a bunch of JSON boilerplate powering the ReactJS UI.

https://hub.docker.com/r/alpeware/

{
    "UserProfileReposStore": {
        "repos": [
            {
                "build_on_cloud": null,
                "can_edit": false,
                "description": "Always up-to-date Headless Chrome right off the trunk in a Docker image",
                "is_automated": true,
                "is_private": false,
                "last_updated": "2017-07-05T16:40:05.877737Z",
                "name": "chrome-headless-trunk",
                "namespace": "alpeware",
                "pull_count": 3926,
                "repository_type": "image",
                "star_count": 2,
                "status": 1,
                "user": "alpeware"
            }
        ]
    }
}

Google Apps Script provides a function, UrlFetchApp.fetch, that fetches a URL of our choosing and returns its content. I considered using a library like Cheerio or GAS's XML parsing, but ended up just using a regular expression.

Once I get the values, I simply parse the JSON string as a JS object.

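function parsePage(page) {
  // Capture everything between "UserProfileReposStore": and ,"plugins"
  var matches = page.match(/"UserProfileReposStore":(.*),"plugins"/);
  if (!matches) {
    // Fail loudly so the trigger's failure notification fires
    throw new Error('UserProfileReposStore not found in page');
  }
  var s = matches[1];
  // Wrap the captured text so it parses as a single JSON object
  var d = JSON.parse('{"data":' + s.replace('},"plugins":{}}', ''));
  return d.data.repos;
}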
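
To show how the fetch and the parse might fit together, here is a minimal sketch. UrlFetchApp.fetch and getContentText are part of GAS; the names fetchRepos and extractRepos, the use of the profile URL linked above, and the sample string are assumptions for illustration. The sample is a hand-made stand-in, not real Docker Hub markup: it assumes the store is the last entry of its enclosing object, so the captured text ends with one extra closing brace that balances the {"data": wrapper.

```javascript
// Assumption: the profile URL linked above is the page being crawled.
var PROFILE_URL = 'https://hub.docker.com/r/alpeware/';

// Same regular-expression approach as parsePage, repeated here so the
// sketch runs on its own. Throws if the store is not found.
function extractRepos(page) {
  var matches = page.match(/"UserProfileReposStore":(.*),"plugins"/);
  if (!matches) {
    throw new Error('UserProfileReposStore not found');
  }
  return JSON.parse('{"data":' + matches[1]).data.repos;
}

// GAS only: UrlFetchApp is not available outside Apps Script.
function fetchRepos() {
  var page = UrlFetchApp.fetch(PROFILE_URL).getContentText();
  return extractRepos(page);
}

// Hand-made, shortened stand-in for the page source (not real markup).
var samplePage = '{"stores":{"UserProfileReposStore":{"repos":[' +
  '{"name":"chrome-headless-trunk","pull_count":3926,"star_count":2}' +
  ']}},"plugins":{}}';

var sampleRepos = extractRepos(samplePage);
// sampleRepos[0].pull_count is 3926
```

If Docker Hub changes the markup, the regular expression stops matching and the function throws, which is what lets a failed run surface as a notification.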

2. Triggering the script

GAS has the ability to set up a cron job. It's called a Trigger, but serves the same purpose.

I simply tell it to run a function once a day and send me notifications in case the execution fails (so far, it never has).
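
Triggers can be set up in the script editor or created from code. Below is a minimal sketch of the latter, assuming the daily job lives in a function called snapshotDockerHubStats (a hypothetical name) and that 6 am in the script's time zone is an acceptable hour. ScriptApp.newTrigger and the time-based builder are part of GAS; which of the two routes was actually used here isn't stated.

```javascript
// Hypothetical name of the function that takes the daily snapshot.
var HANDLER = 'snapshotDockerHubStats';

// GAS only: ScriptApp is not available outside Apps Script.
// Run once by hand to install the trigger.
function createDailyTrigger() {
  // Remove earlier triggers for the same handler so it doesn't run twice.
  ScriptApp.getProjectTriggers().forEach(function (trigger) {
    if (trigger.getHandlerFunction() === HANDLER) {
      ScriptApp.deleteTrigger(trigger);
    }
  });

  // Fire once a day, around 6 am in the script's time zone (arbitrary hour).
  return ScriptApp.newTrigger(HANDLER)
    .timeBased()
    .everyDays(1)
    .atHour(6)
    .create();
}
```

This sketch only covers scheduling; it doesn't touch the failure notifications mentioned above.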

3. Visualizing the data

I dump the data into a Google Sheet. In a separate sheet, I filter it to show just the repo I'm interested in. I then create a chart from that data in its own sheet and publish it, which lets me embed it in a Google Doc. Since Alpeware's main website is powered by Google Docs, the page is easily published to the web and kept up to date using some additional GAS magic (unfortunately, an embedded chart doesn't update automatically when the underlying data changes).
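
The "dump into a sheet" step could look roughly like the sketch below. SpreadsheetApp and appendRow are part of GAS; the sheet name data, the column order and the function names are assumptions for illustration, not taken from the actual spreadsheet.

```javascript
// Pure helper: turn the parsed repos into one row per repo.
// Column order (date, name, pulls, stars) is an assumption.
function toRows(repos, date) {
  return repos.map(function (repo) {
    return [date, repo.name, repo.pull_count, repo.star_count];
  });
}

// GAS only: SpreadsheetApp is not available outside Apps Script.
// Assumes the script is bound to the spreadsheet and that a sheet
// named "data" exists.
function appendSnapshot(repos) {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('data');
  toRows(repos, new Date()).forEach(function (row) {
    sheet.appendRow(row);
  });
}

// Example with the values from the JSON excerpt above.
var exampleRows = toRows(
  [{ name: 'chrome-headless-trunk', pull_count: 3926, star_count: 2 }],
  '2017-07-05'
);
// exampleRows is [['2017-07-05', 'chrome-headless-trunk', 3926, 2]]
```

With that layout, the filtering in the separate sheet could be done with a formula along the lines of =FILTER(data!A:D, data!B:B = "chrome-headless-trunk"), though that is only one possible way to do it.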

https://developers.google.com/apps-script/

Also, Amit's blog posts have been informative, and I've been amazed by the use cases he has found for GAS:

https://ctrlq.org/

If you have any questions about GAS or Docker Hub stats, feel free to shoot me an email or leave a comment.

Observations

I started running the script in early June, when downloads were at around 2k. Since then, downloads have almost doubled to about 4k. Personally, I've downloaded the image maybe 10 times in total, so most of the usage must come from somewhere else. My best guess is that it's pulled regularly as part of a CI environment, most likely by a single large user. The steady, consistent growth is my main reason for reaching this conclusion.
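
One way to check how steady the growth is would be to look at the day-over-day increase between snapshots. The sketch below does that; the function names are hypothetical and the counts are made up for illustration, not the recorded data.

```javascript
// Day-over-day increase in pull_count for a list of daily snapshots,
// ordered oldest first.
function dailyDeltas(counts) {
  var deltas = [];
  for (var i = 1; i < counts.length; i++) {
    deltas.push(counts[i] - counts[i - 1]);
  }
  return deltas;
}

// Arithmetic mean; returns 0 for an empty list.
function mean(values) {
  var sum = values.reduce(function (a, b) { return a + b; }, 0);
  return values.length ? sum / values.length : 0;
}

// Made-up counts for illustration only, not the recorded data.
var sampleCounts = [2000, 2065, 2130, 2190, 2255];
var sampleDeltas = dailyDeltas(sampleCounts); // [65, 65, 60, 65]
var samplePerDay = mean(sampleDeltas);        // 63.75
```

Deltas that stay in a narrow band day after day would fit an automated, scheduled pull; bursty deltas would point more toward individual users.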

Whoever you are, I'm glad it's useful and hopefully it helps make the web a better and less buggy place.

More Docker Images

I've also taken the opportunity to add images for the beta, unstable and stable binaries. They've gotten some usage so far, but the numbers are too small in comparison to the trunk image. Therefore, I've left them out of the chart. I do hope they get more usage in the future!

Donations

As a reminder, I'm still intrigued by the idea of testing a donation-based business model for public images, so I'll link to it here again:

https://donate.alpeware.com/
