Hi all,
some more updates.
I wasn't quite able to get understandable call graphs as @vitalyd suggested; anyhow, I used strace
to get a pulse of where I was wasting time, at least in kernel mode, and it turns out, quite unsurprisingly, that most of the time was spent in futex calls:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.93  170.146626         262    648425     76214 futex
  0.05    0.088029           0    305498           write
  0.02    0.027358           0    169171       764 read
  0.00    0.001294           1      1053           mprotect
  0.00    0.000293           0       764           epoll_wait
  0.00    0.000167           0       381           open
  0.00    0.000029           0       381           close
  0.00    0.000000           0         4           mmap
  0.00    0.000000           0         4           sched_yield
------ ----------- ----------- --------- --------- ----------------
100.00  170.263796               1125681     76978 total
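(For reference, a summary like the one above comes from something along these lines; the pid lookup is just a placeholder for however you find your server process:)

```shell
# -c aggregates time/counts per syscall instead of printing each call,
# -f follows all threads of the process, -p attaches to a running pid.
strace -c -f -p "$(pidof redis-server)"
```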
If my understanding of strace is correct, it is telling us that virtually all (99.93%) of the time spent in kernel mode is in futex-related calls.
The next step was to understand where those calls happen.
I tried standard perf, which should sample CPU cycles, but I wasn't really able to understand what was going on.
Then, instead of the classical cycles event, I sampled context switches, under the assumption that a context switch is more likely when a thread is waiting (correct me if I am wrong!).
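(The context-switch run was along these lines; again, the pid lookup and the duration are placeholders:)

```shell
# 'cs' is perf's alias for the context-switches software event;
# -g records call graphs, -p attaches to the running server,
# and the trailing sleep just bounds the sampling window.
perf record -e cs -g -p "$(pidof redis-server)" -- sleep 30
perf report --children
```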
The results are as follows:
Samples: 538K of event 'cs', Event count (approx.): 1817778
  Children      Self  Samples  Shared Object       Symbol
-   99,98%    99,98%   128842  [kernel]            [k] schedule
   - schedule
      - 414,41% futex_wait_queue_me
           futex_wait
           do_futex
           sys_futex
         - entry_SYSCALL_64_fastpath
            - 233,79% __lll_lock_wait
               - pthread_mutex_lock
                  + 93,61% sqlite3Malloc.part.6
                  + 79,27% sqlite3_free
                  + 58,12% RM_UnblockClient
                  + 2,65% moduleHandleBlockedClients
              173,92% pthread_cond_wait@@GLIBC_2.3.2
      + 2,70% exit_to_usermode_loop
+   11,69%     0,00%        0  [kernel]            [k] entry_SYSCALL_64_fastpath
+   11,69%     0,00%        0  [kernel]            [k] sys_futex
+   11,69%     0,00%        0  [kernel]            [k] do_futex
+   11,69%     0,00%        0  [kernel]            [k] futex_wait
+   11,69%     0,00%        0  [kernel]            [k] futex_wait_queue_me
+    6,70%     0,00%        0  libpthread-2.23.so  [.] __lll_lock_wait
+    6,69%     0,00%        0  libpthread-2.23.so  [.] pthread_mutex_lock
+    4,97%     0,00%        0  libredis_sql.so     [.] _$LT$redis_sql..community_statement..Statement$LT$$u27$a$GT$$u20$as$u20$redis_sql..sqlite
+    4,97%     0,00%        0  libredis_sql.so     [.] sqlite3_step
+    4,97%     0,00%        0  libredis_sql.so     [.] sqlite3VdbeExec
+    4,81%     0,00%        0  libpthread-2.23.so  [.] pthread_cond_wait@@GLIBC_2.3.2
+    2,67%     0,00%        0  libredis_sql.so     [.] sqlite3BtreeInsert
+    2,66%     0,00%        0  libredis_sql.so     [.] pager_write
+    2,66%     0,00%        0  libredis_sql.so     [.] sqlite3Malloc.part.6
+    2,63%     0,00%        0  libredis_sql.so     [.] insertCell
+    2,29%     0,00%        0  libredis_sql.so     [.] sqlite3VdbeHalt
+    2,29%     0,00%        0  libredis_sql.so     [.] sqlite3BtreeCommitPhaseTwo.part.456
+    2,29%     0,00%        0  libredis_sql.so     [.] pager_end_transaction
+    2,29%     0,00%        0  libredis_sql.so     [.] sqlite3_free
+    2,02%     0,00%        0  libredis_sql.so     [.] memjrnlWrite
+    1,94%     0,00%        0  libredis_sql.so     [.] memjrnlClose
+    1,69%     0,00%        0  libredis_sql.so     [.] redis_sql::redis::listen_and_execute::h1ab708507bc22d52
It seems to me (but again, please correct me if you see anything wrong) that most of the time is spent inside sqlite3Malloc, sqlite3_free and RM_UnblockClient (RM -> RedisModule).
If this is really the case, I don't know what to do.
If malloc is the serialization point, I don't believe I can do much more than this.
(Maybe using a different process? But that seems quite extreme...)
Thoughts?