How To Speed Up MySQL Restart (hint: just like before, but this time for real)

In MySQL by Aurimas Mikalauskas

Restarting MySQL can be really annoying. You just want to disable the goddamn query cache, and it takes forever (read: 5-10 minutes) to shut down, not to mention the warm-up time afterwards. Yes, with MySQL 5.7 you can change many settings online, so you won’t necessarily be restarting that often, but you still need to do upgrades, occasionally increase the redo log size and, admit it, enable skip-grant-tables. Here’s how you can make this process way less painful.

Why is MySQL so slow to restart?

Before we go any further, let me tell you right away that when I’m speaking about MySQL here, I’m actually speaking about InnoDB, or rather, a MySQL server that’s running InnoDB as the main storage engine. And if that’s not your case, do not read any further. You’ve been warned!

Now.. ah yes. Restart. So, restarting MySQL involves two slow stages. I have already mentioned them, but repetition is the mother of skill, so let me say that again:

  1. With large redo logs (and a big InnoDB buffer pool), InnoDB takes a long time to shut down – often 10-15 minutes or more,
  2. Right after a restart, query performance is crippled for hours until the server warms up.

And that comes as an especially unpleasant surprise to people who were used to fast restarts (and virtually no warm-up) with MyISAM, which relies on the operating system’s page cache for data, so its data stays warm across a MySQL restart. But all is not lost even with InnoDB; here’s what you can do.

1. Shutting down faster

Let’s start with shutdown. During a normal (read “fast”) shutdown, InnoDB is basically doing one thing – flushing dirty data to disk.

MySQL allows up to 75% of the buffer pool to be dirty by default, but with most OLTP workloads, the relative number of dirty pages will often be less than 10%. Take this system for example, which has a buffer pool of 90GB (5898232 pages):
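(The original screenshots of the server output didn’t survive here, but the counters referenced throughout this post are standard InnoDB status variables – a sketch of how to pull them yourself; the values will of course differ on your system:)

```sql
-- Buffer pool size and dirty-page counts, in 16KB pages
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
```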

The relative number of dirty pages is just over 2%. Still, that’s ~2GB of dirty data and ~127216 random write operations to flush it. Some of these writes will be merged, of course, and a non-volatile cache on the controller definitely helps. Despite all that, with 15k rpm disks it takes 5 minutes to shut down this MySQL instance.

Let’s now try the set global innodb_max_dirty_pages_pct = 0 technique once described by Baron on the Percona blog and see how it helps:
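(For reference, the technique itself is a single dynamic variable change, after which you poll the dirty-page counter – a sketch:)

```sql
-- Ask InnoDB to flush all dirty pages ahead of the shutdown
SET GLOBAL innodb_max_dirty_pages_pct = 0;

-- Then re-run this until the count stops dropping
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
```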

You would expect the number of dirty pages to eventually reach zero; unfortunately, on MySQL 5.6 (and earlier) it doesn’t work that way. For a full description of why, see this bug report by Domas Mituzas; the bottom line is that even if you set innodb_max_dirty_pages_pct to 0, InnoDB will keep up to 1% of the pages dirty! Indeed, as you can see in the example above, it approaches 1% but doesn’t go any lower.

And with that, shutdown is still quite slow:

Imagine how much slower it would be on a 512GB machine. MySQL folks say they fixed it in 5.6.18, but I’m running 5.6.28 here and it’s definitely not working as it should. Let’s see if MySQL 5.7.10 addressed this.

For the test, I started a slightly smaller 30GB instance – a replica of the previous server – and let it fall behind by about 24 hours so I could play around. This instance also has very different IO properties, but for our purposes that’s not important, since we only care about the number of dirty pages.

Our starting position is this:

Interestingly, when I set global innodb_max_dirty_pages_pct = 0, a strange thing happens:

Yep, rather than decreasing, the number of dirty pages actually increases. After checking IO, however, I realize it’s not actually pushing as hard as it could:

So it looks like MySQL is still unable to keep the number of dirty pages low enough. In this case, however, flushing activity is limited by innodb_io_capacity (which is set to the default of 200), so let’s shift gears:
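Shifting gears here means raising the flushing ceiling. The exact numbers below are illustrative – size them to what your storage can actually sustain:

```sql
-- Check the current ceiling first
SHOW GLOBAL VARIABLES LIKE 'innodb_io_capacity%';

-- Raise it for the duration of the flush (illustrative values);
-- set the max first, since innodb_io_capacity may not exceed it
SET GLOBAL innodb_io_capacity_max = 8000;
SET GLOBAL innodb_io_capacity     = 4000;
```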

After this change, IO activity increases and we finally get the desired effect:

It levels off at around 250 dirty pages, roughly 0.015% of our 1703728-page buffer pool, and the shutdown is now super fast:

The takeaway? Use MySQL 5.7 if you want faster shutdowns. Only then can you actually use the set global innodb_max_dirty_pages_pct = 0 technique, and you can accelerate it with an additional innodb_io_capacity bump if the number of dirty pages doesn’t go down fast enough.
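Putting it all together, a pre-restart sequence on 5.7 might look like this (a sketch; the IO capacity values are illustrative, and remember to revert them or persist sensible values in my.cnf after the restart):

```sql
SET GLOBAL innodb_max_dirty_pages_pct = 0;
SET GLOBAL innodb_io_capacity_max = 8000;  -- illustrative
SET GLOBAL innodb_io_capacity     = 4000;  -- illustrative
-- wait for Innodb_buffer_pool_pages_dirty to bottom out, then restart MySQL
```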

2. Warming up faster

If you’re already using InnoDB buffer pool restore feature, I won’t tell you anything new here.

If you don’t, you should. Let me tell you why.

Unlike MyISAM, which caches data in the operating system’s memory (and therefore stays warm until the whole server is restarted), InnoDB caches data in the InnoDB buffer pool. There are some exceptions (for example, when you’re not using O_DIRECT and the buffer pool is at its default size), but if your server is configured properly, InnoDB will start off after a restart with a blank buffer pool.

As queries come in, they will populate the InnoDB buffer pool, but a lot of random read IO needs to happen before the buffer pool returns to its previous state.

Many years ago someone smart took time to think about it and figured out a way to warm up the buffer pool by keeping track of loaded pages and then preloading them sequentially on MySQL startup so that most queries will already find the needed data in memory.

In 2010, Percona implemented this inside their XtraDB storage engine, and starting with MySQL 5.6 the feature is available in MySQL proper. Beware, though, that only MySQL 5.7 has it enabled by default, and even then it only restores the 25% most recently used buffer pool pages, so you may want to fine-tune it to your tastes. Here’s how you can do that.
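A my.cnf sketch of the relevant settings (the load-at-startup variable is not dynamic, so it has to be set in the config file; the dump percentage only exists on recent 5.6/5.7 releases, and the value below is illustrative):

```ini
# my.cnf sketch -- tune the percentage to taste
[mysqld]
innodb_buffer_pool_dump_at_shutdown = ON   # save the list of cached pages on shutdown
innodb_buffer_pool_load_at_startup  = ON   # preload them on startup (not dynamic)
innodb_buffer_pool_dump_pct         = 75   # 5.7 dumps only 25% by default
```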

Oh, and let me borrow one graph from an earlier post by Vadim to illustrate how amazing this feature really is:

MySQL's InnoDB buffer pool restore performance impact

This graph shows query performance over time right after a MySQL restart. The blue line is what happens when you let InnoDB warm up organically. The red line shows that warm-up with buffer pool restore is at least 3 times faster, and even before the server is fully warmed up, overall performance is much better from the start.

I enable this feature on every server I get to touch. (Unless I’m asked not to (Which never happened (Yet)))


About the Author

Aurimas Mikalauskas


Linux hacker since the age of 0xE, deeply passionate about high performance and scalability. Former architect of scalable systems at Percona - largest independent MySQL and MongoDB consultancy company (for 9 years). Audiobook addict, piano & guitar virtuoso, scuba diver, wind surfer and a wannabe glider pilot. Learn more about the author.