The GitLab Outage Learning Opportunity

As many of you know, Microsoft announced on Monday, June 4, 2018 that it was acquiring GitHub for $7.5 billion.

This caused many open-source developers to panic and jump ship to other platforms, chief among them GitLab, the platform we use for source-control hosting, code reviews, pull requests, and continuous integration. The exodus brought GitLab's cloud services to their knees.

Monday also happens to be our release day.

GitLab was receiving a flood of traffic: thousands of repositories were being imported by new users fleeing the corporate takeover, and the influx slowed the entire system down. Our continuous integration jobs sat in a "pending" state all day, and unfortunately we had to hold off on releasing because our test suite couldn't run.

GitLab Import Traffic on June 4

The following morning I discovered that a few of our jobs had been picked up, but most had not run, and I was more than a little irritated. We are currently on GitLab's free tier, and after looking through the pricing options I couldn't find one that would move our jobs up the processing queue. In fact, GitLab's continuous integration and continuous delivery tools come standard, for free, at every pricing level. Everyone gets the same tools regardless of how much they pay; but I wanted our jobs to run now.

This setback prompted a discovery opportunity, as setbacks usually do. Until then I hadn't known how these jobs were actually executed: I told GitLab which commands to run in which Docker containers, the tests ran, and they either passed or failed. Now I dug into exactly how these tasks worked.
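For context, a pipeline like this is declared in a `.gitlab-ci.yml` file at the repository root. A minimal sketch (the job name, image, and script lines here are illustrative placeholders, not our actual configuration):

```yaml
# .gitlab-ci.yml -- illustrative sketch; job name, image, and commands are hypothetical
image: node:10        # Docker image the job runs inside

stages:
  - test

unit-tests:
  stage: test
  script:
    - npm ci          # install dependencies
    - npm test        # run the test suite; a non-zero exit code fails the job
```

GitLab reads this file on every push and queues one job per entry for a runner to execute.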

The test pipelines run on a technology called GitLab Runners; a runner picks up a job and runs it. GitLab provides, for free, a set of shared runners for all GitLab projects to use, limited to 2,000 minutes per month per group for private projects (and unlimited for open-source projects, ergo the massive queue).

gitlab-runner, the tool that processes CI pipelines, is itself open source, and we already ran it on our local machines to debug and test CI jobs before pushing them to GitLab. Reading its excellent documentation, I quickly realized that the tool lets you "register" a runner running on your local machine, or anywhere else, and tie it to your GitLab cloud account.
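Registration is a one-time handshake, roughly like this (the token comes from the project's CI/CD settings page; the description and default image below are placeholders):

```shell
# One-time registration of a local machine as a runner for a GitLab project.
# The registration token is found under the project's Settings > CI/CD > Runners;
# the description and default Docker image here are hypothetical.
gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com/" \
  --registration-token "PROJECT_REGISTRATION_TOKEN" \
  --executor "docker" \
  --docker-image "alpine:latest" \
  --description "office-workstation-runner"
```

For the local debugging mentioned above, `gitlab-runner exec docker <job-name>` runs a single job from `.gitlab-ci.yml` directly on your machine, without going through GitLab at all.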

Essentially, they had provided us with the equivalent of a self-checkout kiosk at a supermarket: when all the normal cashier lanes have lineups, we can check ourselves out. Except this kiosk is operated by us, on our own hardware, available whenever we want, at zero cost.

Now, in addition to the shared runners, our jobs can be picked up by any of our local machines, which speeds up our development and deployment process because our tasks no longer wait solely in a public queue.
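Once registered, the runner just needs to stay up and poll for work. A sketch of the remaining setup, assuming gitlab-runner is installed as above (exact service behavior varies by platform):

```shell
# Run the registered runner as a long-lived service so it continuously
# polls GitLab for pending jobs (may require sudo on some systems).
gitlab-runner install   # install as a system service
gitlab-runner start     # start the service

# Sanity checks:
gitlab-runner list      # show runners configured on this machine
gitlab-runner verify    # confirm each runner can still authenticate with GitLab
```

From that point on, GitLab dispatches queued jobs to whichever registered runner, shared or ours, grabs them first.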
