A Fast Car Needs Good Brakes: How We Added Client Rate Throttling to the Platform API Gem
08 Jul 2020When API requests are made one-after-the-other they’ll quickly hit rate limits and when that happens:
When API requests are made one-after-the-other they’ll quickly hit rate limits and when that happens:
In the beginning, there were API requests, and they were good. But then some jerk went and made too many requests too fast and brought the server crashing to its knees. Enter: Rate limiting.
I got a customer ticket the other day that said they weren’t worried about response time because “New Relic is showing our average response time to be sub 200ms”. Sounds good, right? Well, when it comes to performance - you can’t use the average if you don’t know the distribution. It’s usually best to use the median, which is also perc50, though you’ll also want to look at your long tail of responses. If you’re not following, then this post is for you.
I maintain an internal-facing service at Heroku that does metadata processing. It’s not real-time, so there’s plenty of slack for when things go wrong. Recently I discovered that the system was getting bogged down to the point where no jobs were being executed at all. After hours of debugging, I found the problem was an UPDATE
on a single row on a single table was causing the entire table to lock, which caused a lock queue and ground the whole process to a halt. This post is a story about how the problem was debugged and fixed and why such a seemingly simple query caused so much harm.
Why on earth does my memory consumption chart look like that? It’s a question I hear every week. To help answer that question, I wrote a Web server request simulator to model how Ruby uses memory over time, though it applies to other languages as well. We will use the output of that project to dissect why a web app’s memory would be expected to look like this:
For quite some time we’ve received reports from our larger customers about a mysterious H13 - Connection closed error showing up for Ruby applications. Curiously it only ever happened around the time they were deploying or scaling their dynos. Even more peculiar, it only happened to relatively high scale applications. We couldn’t reproduce the behavior on an example app. This is a story about distributed coordination, the TCP API, and how we debugged and fixed a bug in Puma that only shows up at scale.
Here’s the setup: You are a web server named Puma. You need to accept incoming connections and give them to your thread pool, but before we can get that far, you’ll have to make sure all of the request’s packets have been received so that it’s ready to be passed to a Rack app. This sounds like the job for a Reactor!
Today I have an unusual proposition for you. I’m spending a bunch of time to try to get Beto elected to Texas Senate, so I’ve not been able to write as much technical content. Rather than slow down on my door knocking, I’m looking to pick up the pace, and I want you to do it with me. Starting today, I’m offering anyone who phone banks or “block walks” (knocks on doors) the opportunity to win some of my technical time. Here’s how it’s going to work.
Rails applications that use ActiveRecord objects in their cache may experience an issue where the entries cannot be invalidated if all of these conditions are true:
You might know rubocop as the linter that helps enforce your code styles, but did you know you can use it to make your code faster? In this post, we’ll look at static performance analysis and then at the end there’s a video of me live coding a PR that introduces a new performance cop to rubocop.