[Now available in the Tokyo region] Results of benchmarking Amazon EFS speed [Hooray]

I measured read performance on the assumption that EFS would hold the document root of a LAMP server for redundancy.

Prerequisites

OS

The standard Amazon Linux 2 AMI (HVM) shown at the top of the EC2 instance launch screen.

# /etc/os-release

NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"

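EC2 instances

All instances run in the Tokyo region, and the instance type is m5.large (EBS optimization enabled).

* All resources were newly launched for this measurement, so the EBS volumes and the EFS file system had plenty of burst credits available.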

Measurement method

I measured with fio, installed via yum on Amazon Linux 2 in the usual way.

# fio --version
fio-2.14

The fio job file is shown below. It is a quick definition of two jobs, one measuring random read and one measuring sequential read.

[global]
# For EBS
directory=/tmp

# For EFS
# directory=/mnt/efs

[rand-read]
rw=randread
size=1m
numjobs=1

[seq-read]
rw=read
size=1g
numjobs=1

* To be safe, I ran the job three times in a row and used the results of the third run.
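For reference, the procedure above could be scripted roughly as follows. This is only a sketch and not the script used for this article: the file name read-test.fio and the helper names are hypothetical, and it assumes fio is on the PATH and that a plain `fio <jobfile>` invocation is sufficient.

```python
import subprocess
from pathlib import Path
from typing import List

# Same two jobs as the job file above; only the target directory varies.
JOB_TEMPLATE = """\
[global]
directory={directory}

[rand-read]
rw=randread
size=1m
numjobs=1

[seq-read]
rw=read
size=1g
numjobs=1
"""


def render_job(directory: str) -> str:
    """Return the job file text for the given target directory."""
    return JOB_TEMPLATE.format(directory=directory)


def build_commands(job_path: str, runs: int = 3) -> List[List[str]]:
    """Return one fio command line per run."""
    return [["fio", job_path] for _ in range(runs)]


def run_and_keep_last(directory: str, job_path: str = "read-test.fio") -> str:
    """Run fio three times and return the output of the last run only."""
    Path(job_path).write_text(render_job(directory))
    output = ""
    for command in build_commands(job_path):
        completed = subprocess.run(
            command, check=True, stdout=subprocess.PIPE, universal_newlines=True
        )
        output = completed.stdout  # earlier runs are discarded
    return output
```

Calling `run_and_keep_last("/tmp")` would correspond to the EBS case and `run_and_keep_last("/mnt/efs")` to the EFS case.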

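Results

Results for Img (m5.large + EBS magnetic)

Sequential read roughly matches the published specs.

  • Random read: 3618.4 KB/s (approx. 3.6 MB/s)
    • Average latency: approx. 1.1 ms
  • Sequential read: 53450 KB/s (approx. 53 MB/s)
    • Average latency: approx. 74 µs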
rand-read: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
seq-read: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
fio-2.14
Starting 2 processes
Jobs: 1 (f=1): [_(1),R(1)] [100.0% done] [56064KB/0KB/0KB /s] [14.2K/0/0 iops] [eta 00m:00s]
rand-read: (groupid=0, jobs=1): err= 0: pid=3034: Wed Jul 18 06:39:24 2018
  read : io=1024.0KB, bw=3618.4KB/s, iops=904, runt=   283msec
    clat (usec): min=175, max=19035, avg=1103.97, stdev=2952.10
     lat (usec): min=175, max=19035, avg=1104.00, stdev=2952.09
    clat percentiles (usec):
     |  1.00th=[  179],  5.00th=[  185], 10.00th=[  189], 20.00th=[  195],
     | 30.00th=[  199], 40.00th=[  205], 50.00th=[  209], 60.00th=[  217],
     | 70.00th=[  231], 80.00th=[  860], 90.00th=[ 3184], 95.00th=[ 3216],
     | 99.00th=[18816], 99.50th=[18816], 99.90th=[19072], 99.95th=[19072],
     | 99.99th=[19072]
    lat (usec) : 250=73.05%, 500=2.34%, 750=1.95%, 1000=5.86%
    lat (msec) : 2=6.64%, 4=5.47%, 10=1.56%, 20=3.12%
  cpu          : usr=0.35%, sys=0.00%, ctx=257, majf=0, minf=8
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=256/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
seq-read: (groupid=0, jobs=1): err= 0: pid=3035: Wed Jul 18 06:39:24 2018
  read : io=1024.0MB, bw=53450KB/s, iops=13362, runt= 19618msec
    clat (usec): min=0, max=147689, avg=74.39, stdev=1890.70
     lat (usec): min=0, max=147689, avg=74.42, stdev=1890.70
    clat percentiles (usec):
     |  1.00th=[    0],  5.00th=[    0], 10.00th=[    1], 20.00th=[    1],
     | 30.00th=[    1], 40.00th=[    1], 50.00th=[    1], 60.00th=[    1],
     | 70.00th=[    1], 80.00th=[    1], 90.00th=[    1], 95.00th=[    1],
     | 99.00th=[  604], 99.50th=[ 2704], 99.90th=[17536], 99.95th=[19328],
     | 99.99th=[140288]
    lat (usec) : 2=95.99%, 4=2.42%, 10=0.03%, 20=0.01%, 50=0.45%
    lat (usec) : 100=0.01%, 500=0.01%, 750=0.16%, 1000=0.05%
    lat (msec) : 2=0.30%, 4=0.27%, 10=0.13%, 20=0.17%, 50=0.01%
    lat (msec) : 250=0.01%
  cpu          : usr=0.40%, sys=1.95%, ctx=2915, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=262144/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=1025.0MB, aggrb=53501KB/s, minb=3618KB/s, maxb=53449KB/s, mint=283msec, maxt=19618msec

Disk stats (read/write):
  nvme0n1: ios=4283/2, merge=0/0, ticks=26176/0, in_queue=7040, util=35.61%
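The figures in this output are tied together by simple arithmetic. With iodepth=1 and a single job, only one I/O is in flight at a time, so IOPS is roughly the inverse of the average latency, and bandwidth is IOPS times the 4 KiB block size. The sketch below checks this against the rand-read job above. The function names are made up for illustration, and it assumes fio 2.x reports KB as 1024 bytes, which is consistent with io=1024.0KB for size=1m.

```python
def iops_from_latency(avg_latency_usec: float) -> float:
    """Approximate IOPS when exactly one I/O is outstanding at a time."""
    return 1_000_000 / avg_latency_usec


def bandwidth_kb_per_s(iops: float, block_size_bytes: int = 4096) -> float:
    """Bandwidth in fio-style KB/s (1 KB = 1024 bytes)."""
    return iops * block_size_bytes / 1024


# avg "lat" of the rand-read job: 1104.00 usec
print(round(iops_from_latency(1104.00)))   # 906, close to the reported iops=904

# reported iops=904 with bs=4K
print(round(bandwidth_kb_per_s(904)))      # 3616, close to the reported bw=3618.4KB/s
```

The small differences come from rounding in fio's own report. The point is that at queue depth 1, random read throughput is determined almost entirely by latency.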

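Results for Igp (m5.large + EBS General Purpose SSD (gp2))

Here too, sequential read roughly matches the published specs.

  • Random read: 18618 KB/s (approx. 18 MB/s)
    • Average latency: approx. 212 µs
  • Sequential read: 148566 KB/s (approx. 149 MB/s)
    • Average latency: approx. 27 µs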
rand-read: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
seq-read: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
fio-2.14
Starting 2 processes
Jobs: 1 (f=1): [_(1),R(1)] [100.0% done] [128.0MB/0KB/0KB /s] [32.8K/0/0 iops] [eta 00m:00s]
rand-read: (groupid=0, jobs=1): err= 0: pid=32506: Wed Jul 18 06:42:15 2018
  read : io=1024.0KB, bw=18618KB/s, iops=4654, runt=    55msec
    clat (usec): min=182, max=326, avg=212.00, stdev=21.52
     lat (usec): min=182, max=326, avg=212.06, stdev=21.50
    clat percentiles (usec):
     |  1.00th=[  183],  5.00th=[  189], 10.00th=[  193], 20.00th=[  197],
     | 30.00th=[  201], 40.00th=[  205], 50.00th=[  209], 60.00th=[  211],
     | 70.00th=[  215], 80.00th=[  223], 90.00th=[  237], 95.00th=[  255],
     | 99.00th=[  306], 99.50th=[  318], 99.90th=[  326], 99.95th=[  326],
     | 99.99th=[  326]
    lat (usec) : 250=93.36%, 500=6.64%
  cpu          : usr=3.70%, sys=0.00%, ctx=257, majf=0, minf=10
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=256/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
seq-read: (groupid=0, jobs=1): err= 0: pid=32507: Wed Jul 18 06:42:15 2018
  read : io=1024.0MB, bw=148566KB/s, iops=37141, runt=  7058msec
    clat (usec): min=0, max=2845, avg=26.48, stdev=213.00
     lat (usec): min=0, max=2845, avg=26.51, stdev=213.00
    clat percentiles (usec):
     |  1.00th=[    0],  5.00th=[    0], 10.00th=[    1], 20.00th=[    1],
     | 30.00th=[    1], 40.00th=[    1], 50.00th=[    1], 60.00th=[    1],
     | 70.00th=[    1], 80.00th=[    1], 90.00th=[    1], 95.00th=[    1],
     | 99.00th=[ 1800], 99.50th=[ 1912], 99.90th=[ 2024], 99.95th=[ 2064],
     | 99.99th=[ 2160]
    lat (usec) : 2=96.10%, 4=2.30%, 10=0.03%, 20=0.01%, 250=0.01%
    lat (usec) : 500=0.04%, 750=0.25%, 1000=0.01%
    lat (msec) : 2=1.13%, 4=0.14%
  cpu          : usr=1.39%, sys=5.19%, ctx=4110, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=262144/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=1025.0MB, aggrb=148710KB/s, minb=18618KB/s, maxb=148565KB/s, mint=55msec, maxt=7058msec

Disk stats (read/write):
  nvme0n1: ios=4274/0, merge=0/0, ticks=13300/0, in_queue=6800, util=97.14%

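Results for Ief (m5.large + EFS (General Purpose, Bursting))

Once again, sequential read roughly matches the published specs. The missing K in the random read unit is not a typo...

  • Random read: 107745 B/s (approx. 108 KB/s)
    • Average latency: approx. 38 ms
  • Sequential read: 113926 KB/s (approx. 114 MB/s)
    • Average latency: approx. 35 µs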
rand-read: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
seq-read: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
fio-2.14
Starting 2 processes
Jobs: 2 (f=2): [r(1),R(1)] [29.4% done] [100.2MB/0KB/0KB /s] [25.7K/0/0 iops] [eta 00m:24s]
rand-read: (groupid=0, jobs=1): err= 0: pid=32472: Wed Jul 18 06:40:32 2018
  read : io=1024.0KB, bw=107745B/s, iops=26, runt=  9732msec
    clat (msec): min=1, max=428, avg=37.99, stdev=95.41
     lat (msec): min=1, max=428, avg=37.99, stdev=95.41
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    3], 20.00th=[    3],
     | 30.00th=[    3], 40.00th=[    3], 50.00th=[    4], 60.00th=[    4],
     | 70.00th=[    4], 80.00th=[    4], 90.00th=[  281], 95.00th=[  302],
     | 99.00th=[  330], 99.50th=[  330], 99.90th=[  429], 99.95th=[  429],
     | 99.99th=[  429]
    lat (msec) : 2=0.39%, 4=81.64%, 10=3.52%, 20=0.39%, 50=1.95%
    lat (msec) : 100=0.78%, 500=11.33%
  cpu          : usr=0.00%, sys=0.03%, ctx=259, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=256/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
seq-read: (groupid=0, jobs=1): err= 0: pid=32473: Wed Jul 18 06:40:32 2018
  read : io=1024.0MB, bw=113926KB/s, iops=28481, runt=  9204msec
    clat (usec): min=0, max=105569, avg=34.57, stdev=661.10
     lat (usec): min=0, max=105569, avg=34.60, stdev=661.10
    clat percentiles (usec):
     |  1.00th=[    0],  5.00th=[    0], 10.00th=[    0], 20.00th=[    1],
     | 30.00th=[    1], 40.00th=[    1], 50.00th=[    1], 60.00th=[    1],
     | 70.00th=[    1], 80.00th=[    1], 90.00th=[    1], 95.00th=[    2],
     | 99.00th=[    3], 99.50th=[    4], 99.90th=[10176], 99.95th=[10560],
     | 99.99th=[19840]
    lat (usec) : 2=93.73%, 4=5.55%, 10=0.33%, 20=0.04%, 50=0.03%
    lat (usec) : 100=0.01%, 250=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.19%, 20=0.12%, 50=0.01%
    lat (msec) : 100=0.01%, 250=0.01%
  cpu          : usr=1.24%, sys=3.63%, ctx=1104, majf=0, minf=9
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=262144/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=1025.0MB, aggrb=107850KB/s, minb=105KB/s, maxb=113926KB/s, mint=9204msec, maxt=9732msec
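Because fio switches the bandwidth unit from KB/s to B/s when the value is small, comparing the three outputs by eye is error-prone. A small parser along the following lines could normalize the `read :` lines. It is a sketch written against the fio-2.14 text output shown in this article only, and the regular expression and function name are my own assumptions, not part of fio.

```python
import re

# Matches e.g. "bw=3618.4KB/s, iops=904" or "bw=107745B/s, iops=26".
READ_LINE = re.compile(r"bw=(?P<bw>[\d.]+)(?P<unit>B|KB|MB)/s,\s*iops=(?P<iops>\d+)")

# fio 2.x uses 1024-based units by default.
UNIT_BYTES = {"B": 1, "KB": 1024, "MB": 1024 ** 2}


def parse_read_line(line: str):
    """Return (bandwidth in KB/s, IOPS) from a fio 'read :' summary line."""
    match = READ_LINE.search(line)
    if match is None:
        raise ValueError("not a fio read summary line: " + line)
    bw_bytes = float(match.group("bw")) * UNIT_BYTES[match.group("unit")]
    return bw_bytes / 1024, int(match.group("iops"))


efs_rand = "  read : io=1024.0KB, bw=107745B/s, iops=26, runt=  9732msec"
efs_seq = "  read : io=1024.0MB, bw=113926KB/s, iops=28481, runt=  9204msec"

print(parse_read_line(efs_rand))  # bandwidth is about 105.2 KB/s, IOPS is 26
print(parse_read_line(efs_seq))   # (113926.0, 28481)
```

The roughly 105 KB/s figure for the random read job agrees with minb=105KB/s in the run status line above.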

Summary

| Target | Random read | Sequential read |
| --- | --- | --- |
| m5.large + EBS magnetic | 3.6 MB/s | 53 MB/s |
| m5.large + EBS General Purpose SSD (gp2) | 18 MB/s | 149 MB/s |
| m5.large + EFS (General Purpose, Bursting) | 108 KB/s | 114 MB/s |
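To put the random read gap in numbers, the following sketch divides each EBS result by the EFS result, using the raw fio bandwidth values from the outputs above rather than the rounded ones. The dictionary keys are just labels chosen for this example.

```python
# Random read bandwidth in fio-style KB/s (1 KB = 1024 bytes).
random_read_kb_s = {
    "EBS magnetic": 3618.4,
    "EBS gp2": 18618.0,
    "EFS": 107745 / 1024,  # fio reported this one in B/s
}


def speedup_over_efs(target: str) -> float:
    """How many times faster the target's random read is than EFS."""
    return random_read_kb_s[target] / random_read_kb_s["EFS"]


for target in ("EBS magnetic", "EBS gp2"):
    print(f"{target}: {speedup_over_efs(target):.1f}x faster than EFS")
```

In this single measurement, random read on EBS magnetic comes out around 34 times faster than on EFS, and gp2 around 177 times faster.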

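Impressions

  • Isn't EFS random read far too slow...?
    • A web server that reads lots of small files, such as images and PHP scripts, one at a time looks likely to become very slow
  • When using EFS as the volume that holds a web server's public directory, it seems necessary to reduce disk I/O as much as possible by properly configuring CloudFront, NGINX proxy caching, OPcache (for PHP), and so on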
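As a back-of-the-envelope illustration of why small files are a concern, the sketch below estimates how long it would take to read a number of small files one after another, using the average random read latencies measured above. The figure of 100 files per request is purely hypothetical, and the model assumes one uncached 4 KiB read per file with no parallelism, ignoring open and metadata costs.

```python
# Average random read latency per 4 KiB read, in milliseconds (from the fio outputs).
AVG_RANDOM_READ_LATENCY_MS = {
    "EBS magnetic": 1.104,
    "EBS gp2": 0.212,
    "EFS": 37.99,
}


def serial_read_time_ms(file_count: int, latency_ms: float) -> float:
    """Total time to read file_count small files strictly one at a time."""
    return file_count * latency_ms


HYPOTHETICAL_FILES_PER_REQUEST = 100  # assumption for illustration only

for target, latency in AVG_RANDOM_READ_LATENCY_MS.items():
    total = serial_read_time_ms(HYPOTHETICAL_FILES_PER_REQUEST, latency)
    print(f"{target}: about {total:.0f} ms")
```

Under these assumptions the EFS case comes to roughly 3.8 seconds per request, against about 0.1 seconds on EBS magnetic and about 0.02 seconds on gp2, which is why caching layers that avoid touching the file system matter so much here.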