簡單分析MySQL中的primary key功能

2024-07-24 13:07:09

字體：大中小

來源：轉載

供稿：網友

這篇文章主要介紹了MySQL中的primary key功能,包括講到了其對InnoDB使用的影響,需要的朋友可以參考下

在5.1.46中優化器在對primary key的選擇上做了一點改動：

Performance: While looking for the shortest index for a covering index scan, the optimizer did not consider the full row length for a clustered primary key, as in InnoDB. Secondary covering indexes will now be preferred, making full table scans less likely。

該版本中增加了find_shortest_key函數，該函數的作用可以認為是選擇最小key length的

索引來滿足我們的查詢。

該函數是怎么工作的：

What find_shortest_key should do is the following. If the primary key is a covering index

and is clustered, like in MyISAM, then the behavior today should remain the same. If the

primary key is clustered, like in InnoDB, then it should not consider using the primary

key because then the storage engine will have to scan through much more data.

調用Primary_key_is_clustered()，當返回值為true，執行find_shortest_key：選擇key length最小的覆蓋索引(Secondary covering indexes)，然后來滿足查詢。

首先在5.1.45中測試：

$mysql -V

mysql Ver 14.14 Distrib 5.1.45, for unknown-linux-gnu (x86_64) using EditLine wrapper

root@test 03:49:45>create table test(id int,name varchar(20),name2 varchar(20),d datetime,primary key(id)) engine=innodb;

Query OK, 0 rows affected (0.16 sec)

root@test 03:49:47>insert into test values(1,'xc','sds',now()),(2,'xcx','dd',now()),(3,'sdds','ddd',now()),(4,'sdsdf','dsd',now()),(5,'sdsdaa','sds',now());

Query OK, 5 rows affected (0.00 sec)

Records: 5 Duplicates: 0 Warnings: 0

root@test 03:49:51>

root@test 03:49:51>insert into test values(6,'xce','sdsd',now()),(7,'xcx','sdsd',now()),(8,'sdds','sds',now()),(9,'sdsdsdf','sdsdsd',now()),(10,'sdssdfdaa','sdsdsd',now());

Query OK, 5 rows affected (0.00 sec)

Records: 5 Duplicates: 0 Warnings: 0

創建索引ind_1：

root@test 03:49:53>alter table test add index ind_1(name,d);

Query OK, 0 rows affected (0.09 sec)

Records: 0 Duplicates: 0 Warnings: 0

root@test 03:50:08>explain select count(*) from test;

+—-+————-+——-+——-+—————+———+———+——+——+————-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+——-+——-+—————+———+———+——+——+————-+

| 1 | SIMPLE | test | index | NULL | PRIMARY | 4 | NULL | 10 | Using index |

+—-+————-+——-+——-+—————+———+———+——+——+————-+

1 row in set (0.00 sec)

添加ind_2：

root@test 08:04:35>alter table test add index ind_2(d);

Query OK, 0 rows affected (0.07 sec)

Records: 0 Duplicates: 0 Warnings: 0

root@test 08:04:45>explain select count(*) from test;

+—-+————-+——-+——-+—————+———+———+——+——+————-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+——-+——-+—————+———+———+——+——+————-+

| 1 | SIMPLE | test | index | NULL | PRIMARY | 4 | NULL | 10 | Using index |

+—-+————-+——-+——-+—————+———+———+——+——+————-+

1 row in set (0.00 sec)

上面的版本【5.1.45】中，可以看到優化器選擇使用主鍵來完成掃描，并沒有使用ind_1，ind_2來完成查詢;

接下來是：5.1.48

$mysql -V

mysql Ver 14.14 Distrib 5.1.48, for unknown-linux-gnu (x86_64) using EditLine wrapper

root@test 03:13:15> create table test(id int,name varchar(20),name2 varchar(20),d datetime,primary key(id)) engine=innodb;

Query OK, 0 rows affected (0.00 sec)

root@test 03:48:04>insert into test values(1,'xc','sds',now()),(2,'xcx','dd',now()),(3,'sdds','ddd',now()),(4,'sdsdf','dsd',now()),(5,'sdsdaa','sds',now());

Query OK, 5 rows affected (0.00 sec)

Records: 5 Duplicates: 0 Warnings: 0

root@test 03:48:05>insert into test values(6,'xce','sdsd',now()),(7,'xcx','sdsd',now()),(8,'sdds','sds',now()),(9,'sdsdsdf','sdsdsd',now()),(10,'sdssdfdaa','sdsdsd',now());

Query OK, 5 rows affected (0.01 sec)

Records: 5 Duplicates: 0 Warnings: 0

創建索引ind_1：

root@test 03:13:57>alter table test add index ind_1(name,d);

Query OK, 0 rows affected (0.01 sec)

Records: 0 Duplicates: 0 Warnings: 0

root@test 03:15:55>explain select count(*) from test;

+—-+————-+——-+——-+—————+——-+———+——+——+————-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+——-+——-+—————+——-+———+——+——+————-+

| 1 | SIMPLE | test | index | NULL | ind_1 | 52 | NULL | 10 | Using index |

+—-+————-+——-+——-+—————+——-+———+——+——+————-+

root@test 08:01:56>alter table test add index ind_2(d);

Query OK, 0 rows affected (0.03 sec)

Records: 0 Duplicates: 0 Warnings: 0

添加ind_2：

root@test 08:02:09>explain select count(*) from test;

+—-+————-+——-+——-+—————+——-+———+——+——+————-+

| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |

+—-+————-+——-+——-+—————+——-+———+——+——+————-+

| 1 | SIMPLE | test | index | NULL | ind_2 | 9 | NULL | 10 | Using index |

+—-+————-+——-+——-+—————+——-+———+——+——+————-+

1 row in set (0.00 sec)

版本【5.1.48】中首先明智的選擇ind_1來完成掃描，并沒有考慮到使用主鍵(全索引掃描)來完成查詢，隨后添加ind_2，由于 ind_1的key長度是大于ind_2 key長度，所以mysql選擇更優的ind_2來完成查詢，可以看到mysql在選擇方式上也在慢慢智能了。

觀察性能：

5.1.48

root@test 08:49:32>set profiling =1;

Query OK, 0 rows affected (0.00 sec)

root@test 08:49:41>select count(*) from test;

+———-+

| count(*) |

+———-+

| 5242880 |

+———-+

1 row in set (1.18 sec)

root@test 08:56:30>show profile cpu,block io for query 1;

+——————————–+———-+———-+————+————–+—————+

| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |

+——————————–+———-+———-+————+————–+—————+

| starting | 0.000035 | 0.000000 | 0.000000 | 0 | 0 |

| checking query cache for query | 0.000051 | 0.000000 | 0.000000 | 0 | 0 |

| Opening tables | 0.000014 | 0.000000 | 0.000000 | 0 | 0 |

| System lock | 0.000005 | 0.000000 | 0.000000 | 0 | 0 |

| Table lock | 0.000010 | 0.000000 | 0.000000 | 0 | 0 |

| init | 0.000015 | 0.000000 | 0.000000 | 0 | 0 |

| optimizing | 0.000007 | 0.000000 | 0.000000 | 0 | 0 |

| statistics | 0.000015 | 0.000000 | 0.000000 | 0 | 0 |

| preparing | 0.000012 | 0.000000 | 0.000000 | 0 | 0 |

| executing | 0.000007 | 0.000000 | 0.000000 | 0 | 0 |

| Sending data | 1.178452 | 1.177821 | 0.000000 | 0 | 0 |

| end | 0.000016 | 0.000000 | 0.000000 | 0 | 0 |

| query end | 0.000005 | 0.000000 | 0.000000 | 0 | 0 |

| freeing items | 0.000040 | 0.000000 | 0.000000 | 0 | 0 |

| logging slow query | 0.000002 | 0.000000 | 0.000000 | 0 | 0 |

| logging slow query | 0.000086 | 0.000000 | 0.000000 | 0 | 0 |

| cleaning up | 0.000006 | 0.000000 | 0.000000 | 0 | 0 |

+——————————–+———-+———-+————+————–+—————+

對比性能：

5.1.45

root@test 08:57:18>set profiling =1;

Query OK, 0 rows affected (0.00 sec)

root@test 08:57:21>select count(*) from test;

+———-+

| count(*) |

+———-+

| 5242880 |

+———-+

1 row in set (1.30 sec)

root@test 08:57:27>show profile cpu,block io for query 1;

+——————————–+———-+———-+————+————–+—————+

| Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |

+——————————–+———-+———-+————+————–+—————+

| starting | 0.000026 | 0.000000 | 0.000000 | 0 | 0 |

| checking query cache for query | 0.000041 | 0.000000 | 0.000000 | 0 | 0 |

| Opening tables | 0.000014 | 0.000000 | 0.000000 | 0 | 0 |

| System lock | 0.000005 | 0.000000 | 0.000000 | 0 | 0 |

| Table lock | 0.000008 | 0.000000 | 0.000000 | 0 | 0 |

| init | 0.000015 | 0.000000 | 0.000000 | 0 | 0 |

| optimizing | 0.000006 | 0.000000 | 0.000000 | 0 | 0 |

| statistics | 0.000014 | 0.000000 | 0.000000 | 0 | 0 |

| preparing | 0.000012 | 0.000000 | 0.000000 | 0 | 0 |

| executing | 0.000007 | 0.000000 | 0.000000 | 0 | 0 |

| Sending data | 1.294178 | 1.293803 | 0.000000 | 0 | 0 |

| end | 0.000016 | 0.000000 | 0.000000 | 0 | 0 |

| query end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 |

| freeing items | 0.000040 | 0.000000 | 0.001000 | 0 | 0 |

| logging slow query | 0.000002 | 0.000000 | 0.000000 | 0 | 0 |

| logging slow query | 0.000080 | 0.000000 | 0.000000 | 0 | 0 |

| cleaning up | 0.000006 | 0.000000 | 0.000000 | 0 | 0 |

+——————————–+———-+———-+————+————–+—————+

從上面的profile中可以看到在Sending data上，差異還是比較明顯的，mysql不需要掃描整個表的頁塊，而是掃描表中索引key最短的索引頁塊來完成查詢，這樣就減少了很多不必要的數據。

PS:innodb是事務引擎，所以在葉子節點中除了存儲本行記錄外，還會多記錄一些關于事務的信息(DB_TRX_ID ,DB_ROLL_PTR 等)，因此單行長度額外開銷20個字節左右，最直觀的方法是將myisam轉為innodb，存儲空間會明顯上升。那么在主表為t(id,name,pk(id)),二級索引ind_name(name,id),這個時候很容易混淆，即使只有兩個字段，第一索引還是比第二索引要大(可以通過innodb_table_monitor觀察表的的內部結構)在查詢所有id的時候,優化器還是會選擇第二索引ind_name。

上一篇：Suse Linux 10中MySql安裝與配置步驟

下一篇：在大數據情況下MySQL的一種簡單分頁優化方法