Table of contents
History of MySQL performance improvements
" #
"
$%
& #
'
$%
%(%
、
)' *+,
! -
')
#
.
!
&
&
/
!
-
!
%
Per-server performance is important
(
0
1!
22
0333 1!
4 0333
03333 1!
4 033
%
533
03 6
& 0 6
!
!
! &
!
!
$
.
) /
"
033*733
-
-
- !
&
8
%
!
. 2 2 93:;<
/
-)
-
!
-
! -
!
!
;
!
%
!
-
!
=
32bit Linux
HDD RAID
(20GB)
2GB RAM
HDD RAID
(20GB)
2GB RAM
HDD RAID
(20GB)
2GB RAM
Updates
64bit Linux + large RAM + BBWC
&
>?-
!,
!
-
0>:;
$%
!,
&
!
!
-(
! -
-
.
/
)
.
! -
!
/
@(
A
,
!
-
-!
-
&
; -
%
.
&
/
-HDD RAID
(120GB)
16GB RAM
HDD RAID
%
0>*97:; $%
-
&
!
!
!
#
?*B $%
C 03
-
&
!
> "
$%
03&
)
!
733& 033
)
!
0333*7333
:
Side effect caused by fast server
!
.03<
!
/
HDD RAID
&
.
/2
> "
$%
03
"
-HDD RAID
.
) 4 033< *D 9333</
!
.
) 4 0333< *D
9333</
#
(
! -
!
-
!
)
-
.0333</
.033</
!
!
!
)
E
4
:
!
"
$%
-
1!
4
; ) ! !
$
)
%(%
9333<&
!
-
!
.0C
-
"
/
F!
"
!
! -
!
-SATA SSD
Concurrency improvements around MySQL
$
&
-;4 $
&
% 4 (
-"
:
.: -
! ,/
'
.
-
- /
!
!
; ;!
)
&
&
;
')
.C20 ) !
& C2C/
)
!
-G * . 2 + +'(/ ! !
!
!
)
-
!
.
/
!
.
*
/
Avoiding sudden performance drops
!
-
-
!
!
-
-
!
=
!
=
(
!
!
%
-
&
!
;
!
-=
! -
2 H
= -
--!
=
!
-Perf
Avoiding stalls
%
-
.
G
/
(
! -
.03*033
& -!
Avoiding stalls(2)
!
-
-
-! 2
-= !
.?23 ?20& C232?C&
22/
(
!
-
.
'JK
! ,/
;!
.
&
-
!
/
Handling Real-World workloads..
'
!
4
%
-
)
F
G ;
G
!-
4/
F
8
&
)
&
)'
7*9
-7C<
!
RDBMS or NoSQL?
E
=
L
% !
&
!
-M
" #
!
$ ;
.
') &
$% & )' *+,
/
'
, 1!
&
!
!
&
&
,&
-
! &
& !
!
$ ;
.'
=
0333 !
/ E
!
03&333< ) %(+
"
,
%
-
= -
-
- !
Social Game workloads
+
!
-
8
G +
)J
@!
K
< ,,,K A
G
:;
-
!
%
-!
-
=
!
&
INSERT-mostly tables
"
-
!
&
&
-+$(
+ +'(
( -
8 -
!
.
,
0(;/
.
+ +'(
/
+$(
; .(
;!
2
!
% /
(
,
-
&
-
-!
#
, 8 -
-!
&
*
*D
*- !
G
!
!
!
G ) %(+
+ +(+ + +'(
!
InnoDB Feature: Insert Buffering
*! 1! &
, -
&
;
-!
.@
-!
A/
-! - - I (+
-)
&
-!
,
-
.@
A/
)
4 $
!
$ ! ! - - 1!
-- 1!
'
4
%
, ! - !
- 2
! !
Insert buffer
INSERT gets slower
[Figure: Time to insert 1 million records (InnoDB, HDD). X-axis: existing records (millions, 1 to 145); Y-axis: seconds (0 to 600); series: sequential order, random order.]
, 8
,
- -!
8
N9
:
!
-
!
-!
.
/
1!
&
2
INSERT performance difference
*
+$(
!
!
0C333<
" #
+,
-!
&
7333*?333
"
&
>333*B333
! *
!
%
C333
!
!
G
;
!
!
')
!
1!
- !
. 2 2
!
! &
/
Approach to complete INSERT in memory
$
-C20
, 8
-
K
,K 8
! - K K
(
(
+ (% )
-
!
-G (
+ (% )
!
!
-
- % (+$ (%; + 22
$ ) )%$( (
Single big physical table(index)
Partition 1 Partition 2
Approaches to complete INSERT in memory
-
-
.!
C23&
!
-
-
=
/
)!
,
!
UPDATE-mostly tables
(
!
!
=
!
G '!
")& ,
&
& ! -
&
=
! &
-! &
G ) %(+
+ +'(
1!
, !
!
!
) %(+
-
,
-G
-!
&
!
G
8
&
+$(
+ +(+
"!
-G
*
) %(+4 07&333
G "
) %(+4 933
G
%(%
) %(+4 0&B33
G )' *+
) %(+4 ?&333
G O
(
What do you need to consider? (H/W layer)
"
E
%(% %
)' *+,
E& "
E
$%
" # $%
&
# $%
F;
E
033 -
0:-
!
E
7:; $%
< )' *+
>?:; $%
<
B"
E
')
What do you need to consider?
$
!
$%
$;
.
/
*
$
%
$
, 9& , &
E
& $
&
!
#
Why SSD? IOPS!
) 4
! -
.
/
%
-
1!
-
,
,
$
!
% "
4 733
.
Q
/
Table of contents
;
)
"
$
$
$
#
1!
$
1!
#
./
)
!
8
'
'
R
Random Read benchmark
"
4 05>
0
& ??9
033
4 9C3B
0
& 0?C9B
033
!
4 03C7>
0
& ?09N5
033
!
!
0>, -
"
& !
7C,
-’
!
.?,/
!
-
"
’
.727,/
H
!
[Figure: Direct Random Read IOPS (Single Drive, 16KB, xfs). X-axis: # of I/O threads (1 to 200); Y-axis: IOPS (0 to 45,000); recovered series label: HDD.]
High Concurrency
!
%
. 2 2 ?3 , ?:;
S 0>3:;/
"
%
PCI-Express SSD
%
)' *+,
!
%
%(%
. !
/
-
R
)' *+
North Bridge
PCI-Express Controller
SSD I/O Controller
Flash CPU
South Bridge SAS/SATA Controller
2GB/s (PCI-Express x 8) 300MB/s
SSD I/O Controller
Write performance on SSD
H
!
;!
22 '
!
! O
OE
22
’
[Figure: Random Write IOPS (16KB Blocks). Bars: HDD (4 RAID10, xfs), Intel (xfs), Fusion (xfs); axis range: 0 to 20,000 IOPS.]
Understanding how data is written to SSD (1)
. 2 2 7:;/
%
-
. 2 2 C07J;/
% -
. 2 2 ?J;/
O
O
-
*
-$
-#
-
-%
-*
-
O
O
-Flash memory chips
Block (empty)
Page Page Block (empty)
Block (empty) Block
Understanding how data is written to SSD (2)
*
-
-#
-#
*
-
.*733
/
+
!
. 2 2
;
/&
-
-!
.#
*
/
Block (empty)
Page Block (empty)
Block (empty)
!.
Page
New data
×
×
×
×
Understanding how data is written to SSD (3)
1. Reading all pages
2. Erasing the block
P
P
3. Writing all data
Reserved Space
1. Reading pages
Background jobs ERASE unused blocks
2. Writing data
P
P
Reserved Space
Write performance deterioration
%
-
&
)
“
”
#
&
)
!
“
”
.-
!
+$% +
/
*
!
!
#
)
“
”
.
-
+$% + - -
!
M -/
"
.($
!
&
+$% +
!
&
/
[Figure: Write IOPS deterioration (16KB random write) under continuous write-intensive workloads. Devices: Intel, Fusion (150G), Fusion (120G), Fusion (96G), Fusion (80G); series: Fastest, Slowest; axis range: 0 to 30,000 IOPS.]
Mitigating write performance deterioration
%
)
2 (
-8
$%
3
#
-!
!
2 I !
Sequential I/O
(
4 !
-
.
/&
M !
.
/
!
"
1!
& -!
"
.? $%
03/
!
1!
8 -
1!
-
!
&
!
- !
!
1!
[Figure: Sequential Read/Write throughput (1MB consecutive reads/writes). Devices: 4 HDD (RAID10, xfs), Intel (xfs), Fusion (xfs); Y-axis: MB/s (0 to 600); recovered series label: Seq read.]
fsync() speed
03&333<
!
') - !
.T
/&
- !
.T
/2
[Figure: fsync speed. Devices: HDD (xfs), Intel (xfs), Fusion I/O (xfs); Y-axis: fsync/sec (0 to 20,000).]
HDD is fast for sequential writes / fsync
;
)
4 #
- -
- !
;;#' ;#'
.;
;
! #
'
/&
$+
.-
!
’
1!
/
!
disk
Write cache
disk
seek & rotation time
Filesystem matters
, &
!
K
$+'(& -!
, O
:
!
, &
, 9
8
!
[Figure: Random write IOPS (16KB Blocks) by filesystem. Bars: Fusion (ext3), Fusion (xfs), Fusion (raw); Y-axis: IOPS (0 to 20,000); recovered series label: 1 thread.]
Changing I/O unit size
"
&
,
!
77T
!
-
0J;
0>J;
-
!
U 03
[Figure: Read IOPS and I/O unit size (4 HDD RAID10). X-axis: concurrency (1 to 200); Y-axis: IOPS (0 to 2,500); recovered series label: 1KB.]
Changing I/O unit size on FusionIO
"!
&
) & -!
8
’
+
!
“
! -
-
8
”
!
[Figure: Random Read IOPS (FusionIO, SLC). X-axis: concurrency (1 to 200); Y-axis: reads/s (0 to 160,000); recovered series label: 4KB.]
SLC vs MLC (16KB)
B*?3T -
!
!
'
[Figure: Random Read IOPS, FusionIO (16KB). X-axis: concurrency (1 to 200); Y-axis: reads/s (0 to 45,000).]
SLC vs MLC (8KB)
7C*NCT -
!
!
'
$
-
!
-
!
.
'
'&
!
8 &
22/
[Figure: Random Read IOPS, FusionIO (8KB). X-axis: concurrency (1 to 200); Y-axis: reads/s (0 to 80,000).]
tachIOn vs FusionIO (SLC)
&
!
!
[Figure: Random Read IOPS (16KB). X-axis: concurrency (1 to 200); Y-axis: reads/s (0 to 90,000).]
PCI-Express interface and CPU util
T 1 ."
!
/
-$
)' *+
; !
!
! -
$
# cat /proc/interrupts | grep PCI
83: … PCI-MSI vgcinit
202: … PCI-MSI-X eth2-0
210: … PCI-MSI-X eth2-1
218: … PCI-MSI-X eth2-2
226: … PCI-MSI-X eth2-3
234: … PCI-MSI-X eth2-4
# mpstat -P ALL 1
CPU %user %nice %sys %iowait %irq %soft %idle intr/s
all 0.45 0.00 7.75 26.69 1.65 0.00 63.45 40046.40
0 1.00 0.00 12.60 86.40 0.00 0.00 0.00 1000.20
1 1.00 0.00 13.63 85.37 0.00 0.00 0.00 0.00
2 0.40 0.00 4.80 26.80 0.00 0.00 68.00 0.00
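The mpstat sample above shows the all-CPU row at 63.45% idle while CPU 0 and CPU 1 have no idle time at all. A minimal Python sketch of spotting such cores is below. The function name `busy_cpus` and the 5% idle threshold are illustrative assumptions, not part of the original slides; the sample rows are copied from the output above.

```python
# Rows copied from the mpstat -P ALL output shown on this slide.
MPSTAT_SAMPLE = """\
CPU %user %nice %sys %iowait %irq %soft %idle intr/s
all 0.45 0.00 7.75 26.69 1.65 0.00 63.45 40046.40
0 1.00 0.00 12.60 86.40 0.00 0.00 0.00 1000.20
1 1.00 0.00 13.63 85.37 0.00 0.00 0.00 0.00
2 0.40 0.00 4.80 26.80 0.00 0.00 68.00 0.00
"""


def busy_cpus(mpstat_text, idle_below=5.0):
    """Return the ids of CPUs whose %idle is under idle_below.

    idle_below is a hypothetical threshold chosen for illustration.
    The 'all' summary row is skipped because an average can hide a
    single saturated core.
    """
    lines = [ln.split() for ln in mpstat_text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    idle_col = header.index("%idle")
    busy = []
    for row in rows:
        if row[0] == "all":
            continue
        if float(row[idle_col]) < idle_below:
            busy.append(row[0])
    return busy


print(busy_cpus(MPSTAT_SAMPLE))
```

Checking per-CPU rows rather than the average is the point of running mpstat with -P ALL.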
# of interfaces (tachIOn SLC)
(
!
-
!
! &
!
#
! -
.
!
1!
/&
T 1
&
!
!
!
(
! -
[Figure: Random Read IOPS (16KB, tachIOn). X-axis: concurrency (1 to 200); Y-axis: reads/s (0 to 350,000); recovered series label: Single Drive.]
# of interfaces (FusionIO MLC)
!
!
)' *+,
(
$
- !
&
!
!
%
!
!
!
.
&
-/2
-!
!
.
/
G ! - ") 9>3 . )' *+ /
G & ! - & , !
[Figure: Random Read IOPS (16KB). X-axis: concurrency (1 to 200); Y-axis: reads/s (0 to 70,000).]
tachIOn(SLC) vs FusionIO Duo(MLC)
!
!
!
&
=
'
[Figure: Random Read IOPS (16KB). X-axis: concurrency (1 to 200); Y-axis: reads/s (0 to 90,000); recovered series label: tachIOn.]
Opteron vs Nehalem(tachIOn)
7
)
--
)' *+
')
1!
[Figure: tachIOn Random Read IOPS (16KB). X-axis: # of threads (1 to 200); Y-axis: reads/s (0 to 90,000).]
Opteron vs Nehalem(tachIOn)
7*?
-
T 1
033T !
"
"
!
1!
')
[Figure: tachIOn Random Read IOPS (4KB). X-axis: # of threads (1 to 200); Y-axis: reads/s (0 to 300,000).]
How about using SSD as L2 Cache?
7'
& ($
!
V 7%$'
G ) V
- '
G # !, J !
! '
G # - ! 2 !
'
G 7 - 2
-!
) . ' /
G !, J !
G + ! 033T & ?3T& NCT
! . ' C2C/
G P7C*+
= ) !
G = M! 2 # , * 2 = - & 2
Virtualization?
'!
! .
/
:
0 93
!
!
-! ! 032? < JH
<
!
#
"
&
!
!
- ! !
!
!
8
Virtualization benchmarks (HDD)
-
"
[Figure: Random Read IOPS (Dell cloud HDD, 16KB unit). X-axis: # of threads (1 to 200); Y-axis: reads/sec (0 to 5,000).]
Virtualization benchmarks (SATA SSD)
-
!
!
[Figure: Random Read IOPS (Dell cloud SSD, 16KB unit). X-axis: # of threads (1 to 200); Y-axis: reads/sec (0 to 25,000).]
MySQL Deployment Practices on SSD
#
9
$
"
. %(%/
$
"
!
.)' *+/
-
"
!
C2C
C20
;(*7 -
.
/
733 #
!
.73:;
–
7C:;
/
HDD vs Intel SSD vs Fusion I/O
!
)' *+
-)' *+
!
%
%(%
0?,
?"
Buffer pool   HDD        Intel      Fusion I/O
1G            1125.44    5709.06    15122.75
"
Which should we spend money on, RAM or SSD?
$%
.-!
8 /
!
;
!
-!
&
.-1!
/
!
&
1!
-!
!
$%
< "
!
$%
<
Buffer pool   HDD        Intel      Fusion I/O
1G            1125.44    5709.06    15122.75
2G            1863.19
5G            4385.18
30G           36784.76
.'
/
Which should we spend money on, RAM or SSD?
-!
<
!
.C:;/
;
!
- 4
!
&
!
$%
< "
!
’
&
!
,
!
! &
-$%
Buffer pool   HDD        Intel      Fusion I/O
1G            1125.44    5709.06    15122.75
2G            1863.19    7536.55    20096.33
5G            4385.18    12892.56   30846.34
30G           36784.76   -          57441.64
.'
/
MySQL file location
,
"
!
1!
!
1!
$
4
.O2 - /
–
1!
!
-&
;!
. -
/
–
-
.
& ,
!
* !
-
/
–
*
-!
.
& ,
;
!
!
, /
1!
#
4
!-
;!
. -
3/
–
#
!
1!
O -
2 "!
;
.
1 *- 2PPPPPP/
$
. -K
/
Moving sequentially written files into HDD
.! SNNT&
S0T/
95?9C27C
.! S?5T&
S>T/
7C>7N2?5
.! S9>T&
SBT/
0575C25?
.! S97T&
S03T/
!
< "
.! S7CT&
S0CT/
;!
0:
7335>299
.! S93T&
S072CT/
;!
7:
93B?>29?
.! S95T&
S03T/
;!
C:
CN??02>?
.! SN3T&
S92CT/
;!
93:
Does CPU matter?
-02
') 4
*
72
-
')
;
72C,
&
') U*D
!
)' *+,
Nehalem
Older Xeon
CPUs
Memory
North Bridge
(IOH)
PCI-Express
QPI: 25.6GB/s
CPUs
Memory
North Bridge
(MCH)
PCI-Express
Harpertown X5470 (older Xeon) vs Nehalem X5570 (HDD)
')
') - !
Buffer pool   Harpertown (X5470, 3.33GHz)   Nehalem (X5570, 2.93GHz)   Change
1G            1135.37 (us=1%)               1125.44 (us=1%)            -1%
2G            1922.23 (us=2%)               1863.19 (us=2%)            -3%
5G            4176.51 (us=7%)               4385.18 (us=7%)            +5%
30G           30903.4 (us=40%)              36784.76 (us=40%)          +19%
"
Harpertown X5470 vs Nehalem X5570 (Fusion)
Buffer pool   Harpertown (X5470, 3.33GHz)   Nehalem (X5570, 2.93GHz)   Change
1G            13534.06 (us=35%)             19295.94 (us=32%)          +43%
2G            19026.64 (us=40%)             25627.49 (us=37%)          +35%
5G            30058.48 (us=50%)             39435.25 (us=50%)          +31%
30G           52582.71 (us=76%)             66053.68 (us=76%)          +26%
Intel SSDs with a traditional H/W raid controller
$
!
-
!
" #
+
$%
03
" # $
!
-!
.
P7C*
H
+
’
/
F;
<
<
973
!
Buffer pool   Intel        Intel, RAID 5   Change
1G            5709.06      2975.04         -48%
2G            7536.55      4763.60         -37%
5G            12892.56     11739.27        -9%
Enable HyperThreading
; ) !
C20
0>*7? ')
"(
-
!
') - !
Buffer pool   HT off       HT on (16)   Change
1G            19295.94     20785.42     +7.7%
2G            25627.49     28438.00     +11%
5G            39435.25     45785.12     +16%
30G           66053.68     81412.23     +23%
Where can PCI-Express SSD be used?
E
)' *+
!
&
G 03&333 ) & ?3&333< ) 033
03&333
)
033
-
%(%
)
!
-E
!
"
& %(%
!
-
!
G )' *+, ! , %(%
"
- ! !
!
E
G H ! 8
G $! !
-)' *+
!
!
!
E
"
&
033*733:;
-
--
!
)
"
!
'4 973:;
! < 0>3:; S ?B3:;
!
'4 07B3:;
! < >?3:; S 0573:; .
!
/
Our environment
")
9>3:N .0 /&
$>03
)' *+
!
' .>?3*07B3:;
! < 973*>?3:; /
' .B33:; , 7/ *D
'
')
(
&
>*
& "(
-G 7? ')
-G ! ,
$%
>3:;
;
;' CN35&
!
;
, 7
(
)
& >*07 .S
/
!
)
"
?*B % $%
0<3
Statistics
' N ! .>?3:; ' ! < 973:; '/
+ +'( 1!
>:; -K-! K K 8
) ) . N /
>0>B92N 1! 9N59520 NB>020 ! 003C 0B?9 90?92C
-') 8
T! 7N29T& T 00T.T ?T/& T ?T '2 2 %(% :T! ?T& T 0T& T 0T
;!
552?T
%(% 4 552BT
CPU loads
') ! 8 & -! ! -
-T! 7N29T& T 00T.T ?T/& T ?T
$ .5>3:;/2 0573:; ' !
--
-$ 4 0?2> ; & 4 7B2N ;
' C < - 1! . ') 1! /
! - & -! - !
22:10:57 CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
22:11:57 all 27.13 0.00 6.58 4.06 0.14 3.70 0.00 58.40 56589.95
…
Things to consider
%
)
!
-%
#
)
-
!
,
@
S993>A
-
*
S@
!
A
2
'
&
&
;
& -
!
-"
&
taskset
!
7?
')
#
!
><
7?
&
! -
!
8
')
!
-'
!
! !
.
/
#
& !
.
/
Application Design
;
')
, !
!
-"
-"
-
.
/
"
-
!
$
-
-
"
Making MySQL better
)
)
BJ; ?J;
; ;
8
-
! ,
+%$ "% "
Future improvements from DBA perspective
&
-
!
$
!
-
!
.
!
/
$
B33:;
!
!
-!
!
G
*-
S0&
-K ! K
K K ,K
S0
G $
*
.:%
C2C/
'
*
-
.-
! /
$
, !
!
!
-
;
-
&
.
*
2
/ .
C2>E/
Conclusion for choosing H/W
)' *+
. 2 2 !
&
/
%
%(%
. 2 2
973/
'
!
$%
2
J
!
& !
&
!
$%
3
!
"
1!
')
Conclusion for database deployments
)!
1!
"
-
& -K
&
-"
!
1!
#
-,
-)!
O -
&
,
. I /&
. I /
03, *033,
"
%
-
"
!
,
"
C20
; ) !
C2C
What will happen in the real database world?
(
M!
!
22
’
!
M!
"
!
E
!
-H !
8
!
!
1
$
2 $!
!
!
!
M!
)
'
! -
-
8 &
-
!-
&
'
!
B*0> "
Random Access Memory
$%
!
"
$% 4 *>3
–
033&333 1!
-"
4 *C
4 033*C33!
0>*033<:; $%
O
O
!
-8
8
.
%
(
H%$'"%$ ; :
(&
(
+ (% )
%(+(
+&
/
!
,
Use Direct I/O
!
!
8
-K ! K
S K
$+'(
%
4
!
! -
C07
-'
’
!
K
$+'(
;
& ;
&
% & )
&
Filesystem Cache
InnoDB Buffer Pool
RAM
InnoDB Data File
InnoDB Buffer Pool
RAM
InnoDB Data File
Do not allocate too much memory
#
- E
$
!
. /
-)
.
! /
*
.
/
user$ top
Mem: 32967008k total, 32808696k used, 158312k free, 10240k buffers
Swap: 35650896k total, 4749460k used, 30901436k free, 819840k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
;
8
8
&
’
2 ;! 22
H
!
#
$%
- &
2
J
(
*
!
.
1 /
-’
-
!
2 '
)
-
$ +$ ;I
U)
D
K
+ '
1
–
H
8 & ')
& !
&
.
!
! /
J
#
’
!
!
Do not set swap=zero
Cpu(s): 0.0%us, 24.9%sy, 0.0%ni, 75.0%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
What if stopping OOM Killer?
U)
D
K
M
*0N&
J
’
*0N
!
R
*0N D
U
D
K
M
;!
’
*0N
1
*
*
!
&
!,
’
Swap space management
- 8
;!
’
1
!
#
!
E
$ ;
–
!
.
-K-!
K
&
K-!
&
K-!
&
/
–
!
.
;
&
-
&
%
&
/
%
.-
! &
/
–
!
Be careful about backup operations
'
!
Mem: 32967008k total, 28947472k used, 4019536k free, 152520k buffers
Swap: 35650896k total, 0k used, 35650896k free, 197824k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5231 mysql 25 0 27.0g 27g 288 S 0.0 92.6 7:40.88 mysqld
Mem: 32967008k total, 32808696k used, 158312k free, 10240k buffers
Swap: 35650896k total, 4749460k used, 30901436k free, 8819840k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5231 mysql 25 0 27.0g 22g 324 S 0.0 71.8 7:46.50 mysqld
vm.swappiness = 0
2
S3
2
!
>3
#
$%
!
!
&
!,
!
.
/
%
- &
’
-
!
2
’
Mem: 32967008k total, 28947472k used, 4019536k free, 152520k buffers
Swap: 35650896k total, 0k used, 35650896k free, 197824k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5231 mysql 25 0 27.0g 27g 288 S 0.0 91.3 7:55.88 mysqld
Mem: 32967008k total, 32783668k used, 183340k free, 3940k buffers
Swap: 35650896k total, 216k used, 35650680k free, 4117432k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5231 mysql 25 0 27.0g 27g 288 S 0.0 80.6 8:01.44 mysqld
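The two top snapshots on this slide differ mainly in swap used (0k, then 216k) and in the cached figure. A small Python sketch for pulling those numbers out of the summary lines is below, assuming the older top layout shown here, where the trailing "cached" value on the Swap line is the page cache. The helper name `parse_top_summary` is hypothetical.

```python
import re

# Second snapshot from this slide (summary lines only).
TOP_AFTER = """\
Mem: 32967008k total, 32783668k used, 183340k free, 3940k buffers
Swap: 35650896k total, 216k used, 35650680k free, 4117432k cached
"""


def parse_top_summary(text):
    """Extract the kB figures from the 'Mem:' and 'Swap:' lines of top.

    Keys are prefixed with the section name, except 'buffers' and
    'cached', which describe kernel buffers and page cache rather
    than swap or process memory.
    """
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*(Mem|Swap):", line)
        if not match:
            continue
        section = match.group(1).lower()
        for value, label in re.findall(r"(\d+)k (\w+)", line):
            key = label if label in ("buffers", "cached") else f"{section}_{label}"
            stats[key] = int(value)
    return stats


summary = parse_top_summary(TOP_AFTER)
print(summary["swap_used"], summary["cached"])
```

Comparing `swap_used` across snapshots is a quick way to see whether mysqld memory was pushed out while the filesystem cache grew.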
But swap happens even though swappiness==0
97:; $%
- ,& '
C .72>20B207B/
-K-!
K
K 8 S 7>:;
1
-K-! K
K 8 4
, 9:;
J
4
, 0:;
top - 11:54:51 up 7 days, 15:17, 1 user, load average: 0.21, 0.14, 0.10
Tasks: 251 total, 1 running, 250 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.5%us, 0.2%sy, 0.0%ni, 98.9%id, 0.3%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 32818368k total, 31154696k used, 1663672k free, 125048k buffers
Swap: 4184924k total, 1292756k used, 2892168k free, 2716380k cached
Swap caused by log files
!
!
;
8 4 9:;
’
; ) !
-
!
-
!
;
.
/
Filesystem Cache
InnoDB Buffer Pool
Swap bug on Linux
%
!
!, J
, 72>27B4 $"+ >
4 -! 8 2 2 K-! 2 E S0>3399
# ' > - E
;
-!
,
4
K S K 7 < < *D W
K D 033&
G ( ! & ! SS 3
#
-
E
# ; & - &
! - ! - ! -K K K 8
-G -! - , ; ) ! & ! 9*?:; -
-#
!
-
-
$"+ CE
-K-! K K 8 -K K K 8 E
G = !
Tool: unmap_mysql_logs
4
!-2
!
K
1 K
)
-G
44' ')%
!
-;
4 %
;
& $
4
G '!
.
/ -
4 %
,
.
/ 03T
G ;
!
!
- ;
!
4
-G
4
Filesystem Cache
InnoDB Buffer Pool
Performance effect
!
&
$
!
+
03
!
Mem: 32825116k total, 32728924k used, 96192k free, 246360k buffers
Swap: 4186072k total, 118764k used, 4067308k free, 2803088k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30814 mysql 15 0 28.8g 27g 5644 S 81.1 89.4 158198:37 mysqld
Mem: 32825116k total, 30061632k used, 2763484k free, 247420k buffers
Swap: 4186072k total, 118764k used, 4067308k free, 143372k cached
Memory allocator
1
!
./
./
!
!
- !
:
)
.
!
/
–
R !
-!
–
R
*
*,2 W 2
!
**
- *
*
W
W
,
K)$+
% S !
-
K
2 W
1 K
Q
;
!
'
-
; ) !
–
-K! K
K
S 0.
! 0/&
; !
Be careful about per-session memory
!
.
*
/
%
7 ;
!
07BJ;
!,
./
- ./
8 US C07J;&
./
*
!
+ +'( O $
!
K
K -
( 0W
+(
K-!
K 8 S 7C>O037?W .7C>J;/
–
*D 32>B
! 03&333
+(
K-!
K 8 S 73?BO037?W .7 ;/
–
*D 0B2B0
! 03&333
*
2
;! -
!
- !
,
.
-
4
File I/O and synchronous writes
$ ;
./
.
&
&
/
!
!
;
;
! #
'
.;;#'/
03&333<
./
&
! ;;#'
733
"
-“
-
”
.
-
-
!
/
#
*
!
!
;;#'
-
.
/
, 94
!
* -
S3 .
!
!
!,/
, 4
!
*
-- 4
*
*-
- 2
disk
Write cache with battery
disk
seek & rotation time
BBWC and “Auto Learn” Issues
;
-@%!
A
-
.
!
/&
-G (
0 " !
!
G
!
%!
!
G
!
-G
!
!
!
.
!
/
G
!
*
G
!
, !
. 2 2 B
/& -
-+,
-
.
-! /
$%
'
;;#'
%
,
53
FBWC
;
#
'
$
-@%!
A
!
Overwriting or Appending?
. ,
8 /&
.
8 /
4
;
%
4 ;
%
<
./
!
<
./
–
%
-
Q
-
!
./
03&333<
& 9&333
–
%
–
'
*
*
!
V
!
.N&333</
;
!
!
*-
S0
-–
'
!
V
–
'
“
-
”
4 # R?57C
,
1!
Quick file i/o health check
'
;#' ;;#'
-
&
-<
./
–
$!
1
.
;&
*
&
-K ! K
K K ,K
S0/&
1
0&333
6
1
**
!
S0 **
S0 **
S
- X
** ! *
* 1 ** ! *
* 1 *
*
S
X
Buffered and asynchronous writes
&
!
&
% &
1 !
&
-K ! K
K K ,K
S7&
-
!
!
&
,
!
B
#
E *D
2
K-
!
K
2
K
!
-
!
K-
!
K
O
$%
.
!
03T& 03T
>?:;
>2?:;/
!
K
O $%
.
!
?3T/
&
-!
!
-% -!
-
! &
!
!
+, !
W
$
!
2
K-
!
K
72>297
–
!
2
!
’
;
!
;!
-.
"
/
&
$ )
- &
- -
-
.-
'JK
! ,/
;
!
!
% &
;
-K
K
K - & );P(&
#
8
8
-
“
* ! ,
”
&
*
!
8
.
)' *+,
/
“
K
,
”
!
-
S3
-
*
!
Tip: fast large file remove
'
1
-0 !
K - 2 -
!
K - 2 - 2
$ ) (%; + !
K - W
G
!
K - 2 - 2
)
=
-
!
!
8
#
&
-
!
!
8
)
!
& -!
=
! ,
=
Filesystem
xfs/ext2
,
'
!
-
!
K
$+'(
!
!
$"+ . !
! +/
-
-
-
“
-
”
, 7
-
!
, 7
’
!
M !
*
!
. 2 2
$
/&
, 7
!
;
.!
/
'
*
*
!
.
*-
!
/
Concurrent write matters on fast storage
-
"
.? % $%
0/
02B
!
[Figure: Random Write IOPS (16KB Blocks). Bars: HDD (ext3), HDD (xfs), Intel (ext3), Intel (xfs), Fusion (ext3), Fusion (xfs); axis range: 0 to 20,000 IOPS.]
I/O scheduler
4 $ ;
.
;/
!
1!
!,
!
!,
!
1!
“
!
”
“
! !
8
”
(
!
.
!
72>2034 $"+ C/
4
1!
-
-
&
’
4 )
8
.
/
1!
1!
.
/
,
.
“
*
*
”
-
/
1.
! /4
!
1!
4 $
72>299 .-
!
2
’
!
/
!
1& -!
cfq madness
Running two benchmark programs concurrently
Changing I/O scheduler (InnoDB)
[Figure: DBT-2 (MySQL 5.1) by I/O scheduler. Schedulers: noop, cfq, deadline, as; series: RAID1+0, RAID5; Y-axis: NOTPM (0 to 15,000).]
- Sun Fire X4150 (4 HDDs, H/W RAID controller + BBWC)
- RHEL5.3 (2.6.18-128)
Changing I/O scheduler queue size (MyISAM)
! !
8 S
!
1!
8
%
8
1!
"
#
, &
1! !
8
!
R
033333 D
-
P 1! !
K 1!
;
$ ;
!
;
1!
[Figure: Time to insert 1 million records (HDD). X-axis: existing records (millions, 1 to 77); Y-axis: seconds (0 to 5,000).]
Fighting against network bottlenecks
033 - +
4 9333 !
G % !
9C&3331
033
G
! !
G +
033 -
-0:- +
4 ?33!
G % !
7C3&3331
033
G 033&3331
-8
03:- +
G '
!
&
')
-!,
.
!
'
C/
')
1!
Tx/Rx multiqueue
1!
-
-!
-
')
'
! 1! !
& -!
=
-
-'
!
'
T 1
T
')
!
033T T
.$"+ >
/
-
!
Be careful about bandwidth
;
&
&
;
!
!
)P+
-
!
$
8
!
033
-&
'
& '
-#
!
&
-&
!
033
-(%$( ( )
%H+
K("$+%
W
!
-
-
!
&
Remote datacenter
$(( . !
/
(
#
,
733
$
$
!
$(( 7
G (
-G
-
.
$((/
G
-
-!
*
!
G +
$((
2
Waiting three seconds for SYN
+
-
,
'
4
(') I
4
(') I <%'J
'
4
(') %'J
I <%'J
&
&
I
(
1!
-
& -!
-
,
G
B057 D
?
K
,K
K-G -
K
2
Flows to establish N/W connections
-
0
>&
*
;!
&
Client
Server
1. Sending SYN,
Changing state to SYN_SENT
4. Generating SYN+ACK,
Changing state to SYN_RECV
2. Receiving SYN
5. Sending SYN+ACK
3. Checking conditions (i.e. back_log)
If it doesn’t meet criteria, dropping it
6. Receiving SYN+ACK
7. Generating ACK
8. Sending ACK
9. Receiving ACK,
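If the server drops the SYN at step 3, the client sees nothing until it retransmits. A hedged Python sketch of timing a connect from the client side is below, so that connects stalled by a dropped SYN stand out. The three-second figure is taken from the earlier slide title "Waiting three seconds for SYN"; the function names and the host/port arguments are illustrative assumptions.

```python
import socket
import time

# Assumed retransmit delay, taken from the slide title; the real value
# depends on the client's TCP stack.
SYN_RETRY_SECONDS = 3.0


def timed_connect(host, port, timeout=10.0):
    """Open and close a TCP connection, returning the connect time in seconds."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return time.monotonic() - start


def looks_like_dropped_syn(elapsed, retry=SYN_RETRY_SECONDS):
    """True when a connect took at least one SYN retransmit interval."""
    return elapsed >= retry
```

For example, `looks_like_dropped_syn(timed_connect("db.example", 3306))` would flag a slow connect; `db.example` is a placeholder host, not one from the slides.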
Database connections
*)
M
$
=
!
!
G +
-G #
9
(') I
1!
--
-
-')
G +,
. I *D I <%'J *D %'J&
8
. +( ,,,/&
/
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
J
-(
! -
1!
@(
! -
# -
A , @'
A
+
03333
,*
2
-=
! -
-
Y
=
! -
-
-)
=
!
!
;! $%
!
1
.32C*0
/& !
0333<
, !
1!
H
!
-
!
Persistent Connections
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Use persistent connections properly
! -
-
&
;
!
-
1! !
&
!
-!
&
!
.0379
&
&
/
Web
Servers
Queue
Servers
(Q4M)
MySQL
Servers
(InnoDB)
Worker
Program
Limited Number of
Persistent Connections
Uncontrollable number of
Proxy-based connection pooling
# -
-
,
&
) ,
-
!
-F
++
) ,
=
!
!
! -
03&333
(
! -
&
!
& -! 22
) ,
1!
$
"
%
-
!
) ,
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Proxy Server
Thread pooling within MySQL
-
&
! -
1
!
-%
,
&
--
1
!
-!
!
! -
4 )
!
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Web Server
Thread Pool
Useful commands and tools
iostat
H
-
!
$ ;
-- !
*,
'
&
&
& T!
)
!
8
%
T!
S .
<
/ O
.
!
/
# iostat -xm 10
avg-cpu: %user %nice %system %iowait %steal %idle
         21.16 0.00 6.14 29.77 0.00 42.93
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdb 2.60 389.01 283.12 47.35 4.86 2.19 43.67 4.89 14.76 3.02 99.83
iostat example (DBT-2)
avg-cpu: %user %nice %system %iowait %steal %idle
         21.16 0.00 6.14 29.77 0.00 42.93
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdb 2.60 389.01 283.12 47.35 4.86 2.19 43.67 4.89 14.76 3.02 99.83
(283.12+47.35) * 3.02(ms)/1000 = 0.9980 = 100% util
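The calculation above multiplies requests per second by the service time per request. A minimal Python sketch of the same arithmetic is below; the function name `device_utilization` is an illustrative assumption, and the inputs are the r/s, w/s and svctm columns from the iostat sample.

```python
def device_utilization(reads_per_s, writes_per_s, svctm_ms):
    """Approximate device utilization as a fraction of one second.

    (r/s + w/s) requests, each taking svctm milliseconds, occupy
    (r/s + w/s) * svctm / 1000 seconds of every second.
    """
    return (reads_per_s + writes_per_s) * svctm_ms / 1000.0


# Values from the sdb row shown on this slide.
hdd_util = device_utilization(283.12, 47.35, 3.02)
print(round(hdd_util, 4))
```

The second sdb sample on this slide (543.06 r/s, 490.41 w/s, svctm 0.90) works out to about 0.93 with the same formula, close to the 92.66 %util that iostat reports.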
#
avg-cpu: %user %nice %system %iowait %steal %idle
         40.03 0.00 16.51 16.52 0.00 26.94
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdb 6.39 368.53 543.06 490.41 6.71 3.90 21.02 3.29 3.20 0.90 92.66
mpstat
)
')
’
')
!
033T ')
!
(
')
+
-
M
-!
!
-I !
-
.T 1&
T
/
vmstat and mpstat
12% * 8 (...) = 100% * 1 + 0% * 7
/ S 033T O 0 < 3T O N
# # #