1. Analyzing the database restart log

terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check
04:10:09 UTC - mysqld got signal 6 ;
......
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/dbdata/mysql3306/bin/mysqld(my_print_stacktrace+0x35)[0xf3e175]
/dbdata/mysql3306/bin/mysqld(handle_fatal_signal+0x4b4)[0x7c3b94]
/lib64/libpthread.so.0(+0xf7e0)[0x7f79f28e87e0]
/lib64/libc.so.6(gsignal+0x35)[0x7f79f137d495]
/lib64/libc.so.6(abort+0x175)[0x7f79f137ec75]
/usr/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x12d)[0x7f79f1c37a8d]
/usr/lib64/libstdc++.so.6(+0xbcbe6)[0x7f79f1c35be6]
/usr/lib64/libstdc++.so.6(+0xbcc13)[0x7f79f1c35c13]
/usr/lib64/libstdc++.so.6(+0xbcd32)[0x7f79f1c35d32]
/usr/lib64/libstdc++.so.6(_ZSt20__throw_out_of_rangePKc+0x67)[0x7f79f1bdadb7]
/dbdata/mysql3306/bin/mysqld[0x11d8f15]
/dbdata/mysql3306/bin/mysqld[0x11d99d5]
/dbdata/mysql3306/bin/mysqld(_Z17dict_stats_updateP12dict_table_t23dict_stats_upd_option_t+0x9dc)[0x11de0cc]
/dbdata/mysql3306/bin/mysqld(dict_stats_thread+0x4f2)[0x11e0512]
/lib64/libpthread.so.0(+0x7aa1)[0x7f79f28e0aa1]
/lib64/libc.so.6(clone+0x6d)[0x7f79f1433bcd]
You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.

This part is the stack at the moment the database crashed. The process received signal 6 (SIGABRT); a program only needs to catch that signal and change its behavior to print output like this. The official MySQL documentation calls this a Stack Trace; see:
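28.5.1.5 Using a Stack Trace

In fact, the trace already suggests that the problem lies in statistics collection, because it contains dict_stats_thread, which is the callback function of the statistics collection thread.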
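Several frames in the trace are printed as mangled C++ symbols, which makes them harder to read than dict_stats_thread. The following is a minimal sketch, assuming a GCC or Clang toolchain (Itanium C++ ABI), of how such a name can be turned back into a readable signature with abi::__cxa_demangle. The helper name demangle is made up for this illustration and is not part of MySQL.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of a symbol such as the
// ones printed in the backtrace, or the input unchanged when it does not
// look like a mangled name or cannot be demangled.
std::string demangle(const std::string& symbol) {
    // Mangled C++ function names start with "_Z"; plain C names
    // (for example dict_stats_thread) are returned as they are.
    if (symbol.compare(0, 2, "_Z") != 0) {
        return symbol;
    }
    int status = 0;
    char* out = abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return symbol;
    }
    std::string result(out);
    std::free(out);  // the buffer was allocated with malloc
    return result;
}
```

Applied to the frame above, demangle("_Z17dict_stats_updateP12dict_table_t23dict_stats_upd_option_t") gives dict_stats_update(dict_table_t*, dict_stats_upd_option_t), so the crash happened while dict_stats_thread was inside dict_stats_update. The c++filt command-line tool does the same job.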
[10 Feb 2017 8:12] Shane Bester
Oli, Umesh, this would be same as internal: Bug 24585978 - INNODB: ASSERTION TOTAL_RECS > 0 FAILURE IN FILE DICT0STATS.CC

4. Finding out exactly what the bug fix changed

I only learned how to look this up after asking Yinfeng (印風) at Alibaba. First take the internal bug number, 24585978, then search for it in the git commit log:

git --no-pager log >/root/commitlog
vi /root/commitlog

This turns up the commit:

29acdaaaeef9afe32b42785f1da3d79d56ed7e59

The fix for the bug is shown below:
Analysis:
========
There was missing bracket for IF conditon in dict_stats_analyze_index_level() and it leads to wrong result.

Fix:
====
Fix the IF condition in dict_stats_analyze_index_level() so that it satisfied the if condtion only if level is zero.
Reviewed-by : Jimmy Yang
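To make the commit message concrete, here is a small sketch of how a missing bracket can change what an IF condition tests. It is only an illustration of this class of mistake: it assumes the condition mixes && with a ?: expression, and the function and parameter names are hypothetical stand-ins, not the identifiers touched by the commit.

```cpp
// Hypothetical stand-ins for illustration only; this is not the MySQL patch.

// Without the bracket, `a && b ? x : y` parses as `(a && b) ? x : y`,
// because && binds tighter than ?:. The level check no longer gates the
// whole condition.
bool cond_without_bracket(int level, bool include_marked, bool is_marked) {
    return level == 0 && include_marked ? false : is_marked;
}

// With the bracket, the ?: expression is evaluated on its own and the
// condition can only be true when level is zero.
bool cond_with_bracket(int level, bool include_marked, bool is_marked) {
    return level == 0 && (include_marked ? false : is_marked);
}
```

For level = 1, include_marked = false, is_marked = true the first form is true and the second is false, so the unbracketed version also fires on non-leaf levels. For level = 0 the two forms agree.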
diff --git a/storage/innobase/dict/dict0stats.cc b/storage/innobase/dict/dict0stats.cc
index 3494070..55a2626 100644
--- a/storage/innobase/dict/dict0stats.cc
+++ b/storage/innobase/dict/dict0stats.cc
@@ -1099,10 +1099,10 @@ dict_stats_analyze_index_level(
 leaf-level delete marks because delete marks on non-leaf level do not make sense. */
/* if any of these is 0 then there is exactly one page in the
B-tree and it is empty and we should have done full scan and
should not be here */
ut_ad(total_recs > 0);
ut_ad(n_diff_on_level[n_prefix - 1] > 0);

6. How to work around it

According to the official site, the bug can be worked around as follows:
if (dict_stats_is_persistent_enabled(table)) { // where the innodb-stats-persistent parameter takes effect
	if (counter > n_rows / 10 /* 10% */
	    && dict_stats_auto_recalc_is_enabled(table)) { // where the innodb-stats-auto-recalc parameter takes effect
		dict_stats_recalc_pool_add(table);
		table->stat_modified_counter = 0;
	}
	return;
}

/* Calculate new statistics if 1 / 16 of table has been modified
since the last time a statistics batch was run.
We calculate statistics at most every 16th round, since we may have
a counter table which is very small and updated very often. */
if (counter > 16 + n_rows / 16 /* 6.25% */) {

	ut_ad(!mutex_own(&dict_sys->mutex));
	/* this will reset table->stat_modified_counter to 0 */
	dict_stats_update(table, DICT_STATS_RECALC_TRANSIENT);
}

Handled this way, through the non-persistent branch, the function that triggers the bug is certainly never called. If you are interested, look at the logic of dict_stats_update(table, DICT_STATS_RECALC_TRANSIENT). It actually uses the old statistics method, and a breakpoint can be set on btr_estimate_number_of_different_key_vals.
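The two thresholds in the snippet above are easier to compare once they are isolated from the InnoDB types. The sketch below restates only the branching arithmetic. StatsAction and decide_stats_action are hypothetical names introduced for this illustration, and the two boolean parameters stand in for dict_stats_is_persistent_enabled(table) and dict_stats_auto_recalc_is_enabled(table).

```cpp
#include <cstdint>

// Hypothetical enum naming the three outcomes of the snippet above.
enum class StatsAction { None, QueuePersistentRecalc, RecalcTransient };

// Hypothetical helper mirroring the branch structure and thresholds only.
StatsAction decide_stats_action(std::uint64_t counter, std::uint64_t n_rows,
                                bool persistent_enabled,
                                bool auto_recalc_enabled) {
    if (persistent_enabled) {
        // Persistent statistics: hand the table to the background thread
        // once more than 10% of the rows have been modified.
        if (counter > n_rows / 10 && auto_recalc_enabled) {
            return StatsAction::QueuePersistentRecalc;
        }
        return StatsAction::None;
    }
    // Transient statistics: recalculate once more than 16 + 1/16 (6.25%)
    // of the rows have been modified.
    if (counter > 16 + n_rows / 16) {
        return StatsAction::RecalcTransient;
    }
    return StatsAction::None;
}
```

For a table with 1000 rows, the persistent path queues a recalculation at 101 modified rows, while the transient path recalculates at 79 (16 + 1000 / 16 = 78 with integer division). With persistent_enabled set to false the background queue is never reached, whatever the counter value.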