Troubleshooting (Web) Application Performance Issues with Strace

In Apache, by Aurimas Mikalauskas

What do you do when your Linux web app starts acting up? You troubleshoot it, of course! Me too. And today I would like to talk to you about one tool that works really well with all sorts of applications and application servers – ranging from the Apache HTTP Server running modules (mod_php, mod_perl, etc.), to applications running in a FastCGI setting, or even simple single-threaded programs.

The tool I’m talking about is strace, and if you’re not familiar with it, it will be my pleasure to introduce the two of you. I guess it’s worth noting that strace is not a universal troubleshooting tool that will help you nail down just about any sort of performance issue (yes, you still need to learn and use the other 27 performance troubleshooting tools), but when it helps, it helps big time and with little effort.

And the good news – it’s really easy to tell upfront whether it is going to be helpful or not.

When you should not be using strace

Let’s start by looking at anti-patterns:

  • Applications that are (or can easily be) profiled properly. I mean, if you have New Relic or some other type of application analytics, it should be a no-brainer – just look at the dashboard and you know what to focus on. Strace, on the other hand, can be used with applications that were built with no profiling in mind and that you don’t know well enough (or at all).
  • Strace as a general best-practice tool. While strace is a very handy quick-n’-dirty tool that helps me a lot in my day-to-day life as a performance consultant (troubleshooting random systems I see for the first time), if you’re consistently working on the same project, do get proper application analytics set up. It will pay off in the long run, I promise.
  • Finally, strace is not for profiling where user CPU time is spent. You can usually tell whether that is the case pretty easily – just look at vmstat 1 or mpstat -P ALL 1 for a few seconds, and if you see that CPUs are maxed out and most of the activity is in %usr, then you have to profile your application (or simply add more CPU power if your application got there due to increased load).
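That vmstat/mpstat check takes only seconds to run (vmstat ships with procps; mpstat comes from the sysstat package):

```
# Overall CPU usage, once per second; a consistently high "us" column
# means you should be profiling user code rather than running strace
vmstat 1

# Per-CPU breakdown; watch the %usr column
mpstat -P ALL 1
```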

When does strace really help

Most applications these days are built around microservices. Meaning that besides the logic inside the application, there are lots of calls to other (micro)services. Sometimes these calls go to remote servers you don’t even have access to. Other times, they could be calls to application modules on different cloud instances, database requests, full-text searches, cache checks and so on.

This is where strace becomes really handy.

So how do you know you have a problem strace can help you with? Well, say your application normally runs 20-30 processes, and all of a sudden the number of processes goes through the roof, often reaching the hard limits you’ve set for it (e.g. MaxClients) and, as a consequence, consuming large amounts of memory.
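A quick way to keep an eye on that count (a sketch assuming prefork Apache with processes named apache2; use httpd on RHEL-style systems):

```shell
# Count apache2 processes currently running (prints 0 if there are none)
pgrep -x apache2 | wc -l
```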

You look at the CPU stats, and the CPU is clearly nowhere near saturated. That is a good sign that some remote service is blocking your application.

(It could also be blocked by a number of internal resources, such as I/O calls, mutexes, semaphores and whatnot, and strace is good at identifying those too.)

So what is that strace anyway?

In a nutshell, strace is a system call and signal tracer. You attach it to a running program and all of a sudden you can see all the connect(), poll(), read(), write() and many other system calls the program is making. And most importantly, strace can tell you how long it took any given call to return (as well as the arguments that were used).

I’ve used strace successfully to troubleshoot all sorts of weirdness, from applications running hundreds of database queries instead of one, to API calls nobody knew existed.

And I want to show you one such case I was troubleshooting just a couple of weeks ago.

Before we go on though, I’d like to mention two important things about strace:

  1. Strace can (and often will) slow down your application, which is why it’s all the more important to check that the CPU is not already saturated before you run strace.
  2. Sometimes (although very rarely) strace can leave your program in a stopped state (T state in ps), so you may need a small program to unfreeze it.

I haven’t had this happen for the last 2 or 3 years though, so whatever was causing it is probably fixed on most systems by now.
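If you ever do hit it, sending the stopped process a SIGCONT usually brings it back. A minimal demonstration, using sleep as a stand-in for the stuck process:

```shell
sleep 60 &                  # stand-in for a process stuck in T state
pid=$!
kill -STOP "$pid"           # put it into the stopped (T) state
ps -o stat= -p "$pid"       # typically shows T
kill -CONT "$pid"           # unfreeze it
ps -o stat= -p "$pid"       # back to S (interruptible sleep)
kill "$pid"                 # clean up
```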

Where do we start

We start by identifying the processes that we need to trace. This is the easy part – it’s what alerted us in the first place, after all.

Say it’s Apache (as was the case in my last session), so we start by attaching to the PID of the main Apache process (the one started by the “root” user):
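In its simplest form, attaching is just this (with <PID> standing in for the main Apache process ID):

```
# Attach to the running process; press Ctrl-C to detach
strace -p <PID>
```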

Hm, not very helpful, is it?

The thing is, in the case of the prefork MPM (often used with non-thread-safe modules such as mod_php), tracing the main process is useless, as the main process only forks child processes to do the actual work.

Luckily, instead of tracing the main process only (or only the child processes), we can ask strace to trace child processes as they are created:
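With strace that’s the -f option (a sketch, again with <PID> being the main process):

```
# -f tells strace to follow child processes as they are fork()ed
strace -f -p <PID>
```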

That’s better. Still, data from different processes is now mixed in a single stream and there’s no timing information.

Therefore, let’s create a folder and ask strace to store data for each process in a separate file (-ff) named tr1.PID (-o tr1). Additionally, we want Unix timestamps with microsecond resolution (-ttt) and the time spent in each syscall (-T). Finally, to get a bit more visibility into the strings used, let’s increase their printed length (-s 100).

Here’s what our final command looks like:

# strace -tttT -s 100 -ff -o tr1 -p <PID>

For the customer I was working with, this generated a couple of gigabytes of data within 10 minutes. If the issue you are seeing is not as intermittent, you may only need to run strace for a few minutes to capture enough data. I had to wait a bit until the issue occurred.

Now, let’s look at a small sample of data we’re now getting:
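The raw capture is too noisy to reproduce in full, so here are a few lines in the same format, with illustrative (made-up) values, to show what to expect:

```
1430240949.308005 connect(4, {sa_family=AF_INET, sin_port=htons(9312), sin_addr=inet_addr("127.0.0.1")}, 16) = 0 <0.000211>
1430240949.308101 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000015>
1430240949.308220 write(4, "\0\1\0\10...", 16) = 16 <0.000032>
1430240949.308305 read(4, "\0\1\0\10...", 8192) = 4100 <0.018509>
```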

So every line has a wall-clock timestamp with microsecond resolution, followed by a system call (syscall) with all its arguments, the result, and the actual time spent in the syscall in angle brackets.

In this example you can see the program connect() to a Sphinx server using file descriptor #4, setsockopt() sets various socket options, and then it starts read()‘ing and write()‘ing.

It’s worth mentioning that the time between the timestamps won’t necessarily match the time spent in syscalls. In fact, it will almost never match, as your program will be executing user code between the system calls. Plus, it will take some time before the call is scheduled to run, especially if the CPU is rather busy.

Analysing the data

Now it’s time to make sense of all the data strace has captured. What exactly are we looking for? Essentially – syscalls that took a long time to return. Here’s a handy awk one-liner that helps us find the offenders (in this case, calls taking more than 0.5s):
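A sketch of such a one-liner (assuming the -T duration is the trailing angle-bracketed field; a tiny sample trace file is created inline so the example runs on its own):

```shell
# A tiny sample trace file in the -ttt/-T format (illustrative data)
cat > tr1.12345 <<'EOF'
1430240949.308005 connect(4, {sa_family=AF_INET, ...}, 16) = 0 <0.000211>
1430240949.309005 read(4, "result", 8192) = 6 <0.734112>
EOF

# Print syscalls that took more than 0.5s: the duration is the last
# angle-bracketed field, so split on '<' and strip the closing '>'
awk -F'<' '/>$/ { t = $NF; sub(/>/, "", t); if (t+0 > 0.5) print }' tr1.12345
```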

So in order to find the files you are interested in, you can run this loop:
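That loop can look like this (same per-line duration test; it prints the name of every file containing at least one slow call):

```shell
# Two tiny sample trace files (illustrative data)
printf '1.0 read(4, "x", 10) = 2 <0.734112>\n' > tr1.100
printf '1.0 write(5, "y", 1) = 1 <0.000021>\n' > tr1.200

# Print the name of every trace file with at least one syscall over 0.5s
for f in tr1.*; do
  awk -F'<' '/>$/ { t = $NF; sub(/>/, "", t)
                    if (t+0 > 0.5) { found = 1; exit } }
             END   { exit !found }' "$f" && echo "$f"
done
```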

Alternatively, we can get the offending calls straight away:
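Or, to pull the offending calls from all files in one pass, awk’s FILENAME variable can tag each match (same assumed format):

```shell
# A sample trace file with one slow call (illustrative data)
printf '2.0 select(25, [24], NULL, NULL, NULL) = 1 <4.331739>\n' > tr1.300

# Print "file: line" for every syscall that took more than 0.5s
awk -F'<' '/>$/ { t = $NF; sub(/>/, "", t)
                  if (t+0 > 0.5) print FILENAME ": " $0 }' tr1.*
```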

So now we finally have something that can help us understand what is going on. The issue you see above is slow reads from the Sphinx search daemon which, as I discovered later, were caused by completely saturated disks on the Sphinx server. This was a well-known issue for the customer already.

This, on the other hand, was a completely different story:

The issue I discovered here was a remote call to a server nobody on the team even knew existed (and it took a while to locate it!). Apparently, everyone who worked on (or knew about) this part of the application left the company. And so did the know-how (and know-what).

Let’s review this piece of strace sample together.

When you’re investigating, naturally you look at it the other way around – you start from the bottom (the syscall that’s taking too long for your taste) and work your way back up to understand what it was all about.

In line 17, you can see a select() call that took 4.33s to return. Right before it, in line 16, there’s an HTTP request on file descriptor #24 (the first argument to the write() call). So the last task for us is to find out what that fd #24 is all about.

And it doesn’t take very long to get to line 9, where we find the connection attempt and the server the connection is made to. (The call “fails” with EINPROGRESS as it’s a non-blocking call, so it is followed by select() and getsockopt() calls to determine whether the connect() completed successfully.)

A few quick ab checks on the URL http://REMOTE_IP/remotecall.json?q=query-string confirmed that the service was flaky. And it only took a few minutes to get it fixed now that we knew what to fix. If not for strace, nobody would have guessed it, since nobody even knew about this server or this API call.
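For completeness, such an ab check is a one-liner (ab is ApacheBench, shipped with Apache; the URL is the one from the trace):

```
# 100 requests, 10 at a time; check the failed-request count and latency percentiles
ab -n 100 -c 10 "http://REMOTE_IP/remotecall.json?q=query-string"
```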

Final notes

Before we wrap up, let me give you a few quick tips that would have saved me hours when I first started using strace back in 2004 (and had no clue about Linux system calls, the kernel, and many other internals):

  1. Some calls are bound to take a long time, so before you dig too deep, be sure you’re checking the right thing. Good examples are epoll_wait() and poll() calls that could simply be waiting for HTTP requests to come in.
  2. Some slow syscalls are symptoms rather than the root cause; e.g. semop() taking a long time could mean a lot of contention between many processes rather than issues with memory or semaphores.
  3. Documentation of calls is super easy to find. You don’t even need Google. In fact, it’s best if you do not use Google if you want to learn how a specific call works on your system. For example, to learn more about semop(), just run “man 2 semop” on your system.

Well, have fun playing with strace. In the meantime, I need to figure out how to get the same results with tcpdump rather than strace. Sure, strace is nice in that it shows you local issues too (e.g. slow disk reads), but for microservice-based applications, tcpdump plus something like pt-tcp-model could be a good, less intrusive way to reach very similar conclusions.

If you know how to do it with tcpdump, I’m all ears!


About the Author

Aurimas Mikalauskas


Linux hacker since the age of 0xE, deeply passionate about high performance and scalability. Former architect of scalable systems at Percona, the largest independent MySQL and MongoDB consultancy (for 9 years). Audiobook addict, piano & guitar virtuoso, scuba diver, windsurfer and wannabe glider pilot.