Why would you wanna do that, you ask? Well, because then you can parse it with tools like pt-query-digest
and get a nice look at the queries hitting Sphinx sorted by resource consumption.
And if you’re not using SphinxQL query log format in Sphinx, I highly recommend it as it helps you see more vividly what queries is your Sphinx server actually running. You can even change the format of query log online – by running the following SphinxQL command:
sphinxql> set global query_log_format = sphinxql;
After you do that, it may be a good idea to rotate the log so the new log begins in the SphinxQL format. For that, you should simply rename the current query log and then run kill -SIGUSR1 $(pidof searchd)
.
Finally, in order to convert the sphinx query log, run the following awk program:
$ awk '
{ m["Jan"]=1; m["Feb"]=2; m["Mar"]=3; m["Apr"]=4; m["May"]=5; m["Jun"]=6; m["Jul"]=7; m["Aug"]=8; m["Sep"]=9; m["Oct"]=10; m["Nov"]=11; m["Dec"]=12; }
/\/\*/{
qpos=index($0, "*/");
printf("# Time: %02d%02d%02d %s\n# Query_time: %.6f Lock_time: 0.000000 Rows_sent: %d Rows_examined: 0 Rows_affected: 0\n%s\n", substr($6,3,2), m[$3], $4, $5, $10, $14, substr($0, qpos+3));
}
!/\/\*/{ print $0; }' query.log > parsed-query.log
So this tiny awk one-liner extracts date, query time and number of records returned and writes that in a proper format as a slow query log. At this point, processing the entire parsed-query.log
is as simple as:
$ pt-query-digest parsed-query.log > parsed-query-log-report.txt
And now you know what to focus on whether you want to improve user experience or reduce server resource consumption. Here’s how my report looks:
# 430.7s user time, 3.4s system time, 66.67M rss, 242.43M vsz
# Current date: Wed Apr 6 16:51:39 2016
# Hostname: sphinx5
# Files: parsed-query.log
# Overall: 1.42M total, 343 unique, 12.98 QPS, 0.35x concurrency _________
# Time range: 2016-04-05 13:08:55 to 2016-04-06 19:35:24
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 38536s 0 152s 27ms 78ms 314ms 12ms
# Lock time 0 0 0 0 0 0 0
# Rows sent 1.46G 0 13.50M 1.08k 563.87 39.71k 0.99
# Rows examine 0 0 0 0 0 0 0
# Rows affecte 0 0 0 0 0 0 0
# Query size 847.14M 93 2.28k 624.51 1.14k 324.33 463.90
# Profile
# Rank Query ID Response time Calls R/Call V/M Item
# ==== ================== ================ ====== ======= ===== ==========
# 1 0x47C2083EEA45DEAC 11604.8510 30.1% 286741 0.0405 2.70 SELECT catalog
# 2 0xF164798873344DD8 7282.0320 18.9% 194201 0.0375 0.07 SELECT catalog
# 3 0x2AC8F58AE0C0BBE2 7169.4320 18.6% 417430 0.0172 0.05 SELECT catalog
# 4 0xD4EC9B2970A069A1 6318.3600 16.4% 280790 0.0225 0.06 SELECT catalog
# 5 0x9F6F1D5469DC2D96 955.1560 2.5% 10 95.5156 19.18 SELECT catalog
# 6 0x52A07B10AF155247 741.2580 1.9% 16632 0.0446 0.06 SELECT catalog
# 7 0xE29BBF5A942E0287 574.7860 1.5% 15821 0.0363 0.05 SELECT catalog
# 8 0x0B8EF479086D5AB0 557.7800 1.4% 15945 0.0350 0.05 SELECT catalog
# 9 0x33DB0632CB09C79F 531.3840 1.4% 41380 0.0128 0.04 SELECT catalog
# 10 0x841F87BC120D1EBC 502.9380 1.3% 37997 0.0132 0.04 SELECT catalog
# 11 0xC3183721C638CC46 420.4840 1.1% 37997 0.0111 0.06 SELECT catalog_pp
# 17 0x7B17ADE5CE4B3049 96.6300 0.3% 334 0.2893 2.94 SELECT catalog
# 39 0xF28A07BFDDD7AB73 13.4550 0.0% 55 0.2446 2.09 SELECT catalog
# MISC 0xMISC 1767.4290 4.6% 77041 0.0229 0.0 <329 ITEMS>
# Query 1: 2.62 QPS, 0.11x concurrency, ID 0x47C2083EEA45DEAC at byte 895941331
...
So now I know my query #1 consumes almost 1/3 of resources, so if I focust just on that, I can easily get back up to 30% of resources used. Also, it’s pretty clear I need to work on query #5 as it’s way slow and of course other queries from the top in that order.
That’s pretty cool isn’t it? It sure is!
Share this Post