16 Sep 2020
Your app is slow. It does not spark joy. This post will show you how to use memory allocation profiling tools to discover performance hotspots, even when they’re coming from inside a library. We will use this technique with a real-world application to identify a piece of optimizable code in Active Record that ultimately leads to a patch with a substantial impact on page speed.
Keep Reading
08 Jul 2020
When API requests are made one after the other, they’ll quickly hit rate limits, and when that happens:
Keep Reading
25 Jun 2020
In the beginning, there were API requests, and they were good. But then some jerk went and made too many requests too fast and brought the server crashing to its knees. Enter: Rate limiting.
Keep Reading
17 Mar 2020
I got a customer ticket the other day that said they weren’t worried about response time because “New Relic is showing our average response time to be sub 200ms”. Sounds good, right? Well, when it comes to performance, you can’t rely on the average if you don’t know the distribution. It’s usually best to use the median, which is the 50th percentile (perc50), though you’ll also want to look at your long tail of responses. If you’re not following, then this post is for you.
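To make the average-versus-median point concrete, here is a minimal Ruby sketch. The response times are invented for illustration, and the `mean` and `percentile` helpers are hypothetical names written for this example (a simple nearest-rank percentile), not New Relic's calculation or anything taken from the post.

```ruby
# Hypothetical data: 18 fast responses and 2 very slow ones (milliseconds).
response_times_ms = Array.new(18, 50) + [1500, 1500]

# Arithmetic mean of a list of numbers.
def mean(values)
  values.sum.to_f / values.size
end

# Nearest-rank percentile: the smallest sample such that at least
# pct percent of all samples are less than or equal to it.
def percentile(values, pct)
  sorted = values.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

puts "average: #{mean(response_times_ms)}ms"           # average: 195.0ms
puts "median:  #{percentile(response_times_ms, 50)}ms" # median:  50ms
puts "perc95:  #{percentile(response_times_ms, 95)}ms" # perc95:  1500ms
```

With this made-up distribution the average is "sub 200ms", yet it describes no actual request: the typical response takes 50ms while one in ten takes 1.5 seconds. Only the median together with a tail percentile shows both halves of that picture.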
Keep Reading
18 Dec 2019
I maintain an internal-facing service at Heroku that does metadata processing. It’s not real-time, so there’s plenty of slack for when things go wrong. Recently I discovered that the system was getting bogged down to the point where no jobs were being executed at all. After hours of debugging, I found that an UPDATE on a single row of a single table was causing the entire table to lock, which created a lock queue and ground the whole process to a halt. This post is a story about how the problem was debugged and fixed and why such a seemingly simple query caused so much harm.
Keep Reading
06 Nov 2019
Why on earth does my memory consumption chart look like that? It’s a question I hear every week. To help answer that question, I wrote a Web server request simulator to model how Ruby uses memory over time, though it applies to other languages as well. We will use the output of that project to dissect why a web app’s memory would be expected to look like this:
Keep Reading
12 Jul 2019
For quite some time we’ve received reports from our larger customers about a mysterious H13 - Connection closed error showing up for Ruby applications. Curiously, it only ever happened around the time they were deploying or scaling their dynos. Even more peculiar, it only happened to relatively high-scale applications. We couldn’t reproduce the behavior on an example app. This is a story about distributed coordination, the TCP API, and how we debugged and fixed a bug in Puma that only shows up at scale.
Keep Reading
26 Jun 2019
Here’s the setup: You are a web server named Puma. You need to accept incoming connections and give them to your thread pool, but before you can get that far, you’ll have to make sure all of the request’s packets have been received so that it’s ready to be passed to a Rack app. This sounds like a job for a Reactor!
Keep Reading
26 Oct 2018
Today I have an unusual proposition for you. I’m spending a bunch of time trying to get Beto elected to the U.S. Senate from Texas, so I’ve not been able to write as much technical content. Rather than slow down on my door knocking, I’m looking to pick up the pace, and I want you to do it with me. Starting today, I’m offering anyone who phone banks or “block walks” (knocks on doors) the opportunity to win some of my technical time. Here’s how it’s going to work.
Keep Reading
17 Oct 2018
Rails applications that use ActiveRecord objects in their cache may experience an issue where the entries cannot be invalidated if all of these conditions are true:
Keep Reading