Architecting Data Center Networks in the era of Big Data and Cloud
Brad Hedlund, Global Marketing
Spring Interop, May 2012
VIDEO of this session: http://bradhedlund.com/?p=3912
Two approaches to DC Networking
THE SAME OLD
• Centralized, Scale-up Layer 2 networks
• Monstrous chassis switches
[Labels on slide: TRILL, OpenFlow, VEPA, SPB]
Or a Different Approach
• Distributed, Scale-out Layer 3 fabrics
• Efficient fixed switches
• Open, industry standard protocols
Networks that suck for Cloud & Big Data
[Figure: Network Topology (Core, Dist, Access, VM) beside Capacity Topology (PARTITIONED CAPACITY)]
“Data center networks are in my way” -James Hamilton, AWS
Networks that Don’t suck for Cloud & Big Data
[Figure: Network Topology (Spine, Leaf, VM) beside Capacity Topology (UNIFORM CAPACITY)]
• All points equidistant
Big Data
[Figure: Hadoop cluster spread over Rack 1 to Rack N, each rack behind a switch; Name Node, Job Tracker and Secondary NN alongside the data Nodes; Clients and the World connect over TCP]
• Inverse Virtualization
• Workloads orchestrated like cattle
• L2 or L3 network. Does it matter?
Basic requirements of Cloud (IaaS)
[Figure: Physical Network (switches, FW, LB, link to the World) and Virtual Network (VMs)]
• Secure, Scalable Multi Tenancy
• Location independence
• On Demand virtual networks
Blend the Virtual and Physical Networks
[Figure: two Hosts, each with a vSwitch and VMs, attached to physical switches carrying VLAN 10 and VLAN 20]
• Tenant subnet = Network VLAN
Abstract the Virtual Network from Physical
[Figure: two Hosts, each with a vSwitch and VMs, attached to physical switches; tenant traffic labeled Segment ID 10 and Segment ID 20]
• Network Virtualization Overlay
• Tenant subnet = Software VLAN
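The idea of a tenant subnet as a "Software VLAN" can be illustrated with a toy allocator that hands out segment IDs in software, with no physical VLAN involved. This is only a sketch: the class name, the method name and the 24-bit ID space (borrowed from VXLAN) are assumptions, not something the slide specifies.

```python
import itertools


class OverlayNetworks:
    """Toy allocator: one segment ID per tenant subnet, held purely in software."""

    MAX_SEGMENT_ID = 2**24 - 1  # assumption: 24-bit ID space, as in VXLAN

    def __init__(self):
        self._next_id = itertools.count(1)
        self._segments = {}  # (tenant, subnet) -> segment ID

    def create(self, tenant, subnet):
        """Return the segment ID for a tenant subnet, allocating one if needed."""
        key = (tenant, subnet)
        if key not in self._segments:
            segment_id = next(self._next_id)
            if segment_id > self.MAX_SEGMENT_ID:
                raise RuntimeError("segment ID space exhausted")
            self._segments[key] = segment_id
        return self._segments[key]


overlay = OverlayNetworks()
green = overlay.create("Green Co.", "10.0.0.0/24")
orange = overlay.create("Orange Co.", "10.0.0.0/24")  # same subnet, different tenant
```

Two tenants can reuse the same subnet because the segment ID, not the address range or a switch VLAN, keeps them apart.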
Scale-up centralized Layer 2
[Figure: 2-post rooted topology with L3 and L2 tiers labeled; vSwitches and VMs below]
• 2-post Rooted Architecture
• Centralized L2/L3
• L2/L3/ARP table scale?
• Scale w/ Bigger Boxes
• Precious Pets
• VLAN Provisioning?
• Broadcasts
Scale-out Layer 3 Leaf/Spine Fabric
[Figure: leaf/spine fabric with tier labels (16), (8), (2) and (128), (64), (16); L3/L2 boundary marked; server port counts labeled 768, 3072 and 6144; one further label reads 1980]
• Mesh from Leaf to Spine
• Non-blocking Spine
• OSPF, ISIS, BGP, TRILL
• 3:1 @ ToR
• ToR w/ 16 uplinks (ECMP)
• 128 port 2RU Spine
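The server port counts on this slide follow from simple arithmetic. A minimal sketch, assuming every ToR spreads its 16 uplinks evenly over all spines, that uplinks and server ports run at the same speed (so 3:1 means three server ports per uplink), and that a ToR occupies 1RU. The 1RU figure and the function name `leaf_spine` are assumptions for illustration.

```python
def leaf_spine(spine_ports, spines, uplinks_per_tor, oversub, spine_ru=2, tor_ru=1):
    """Size a two-tier leaf/spine fabric.

    Assumes each ToR splits its uplinks evenly across all spines and that
    uplinks and server ports run at the same speed.
    """
    if uplinks_per_tor % spines:
        raise ValueError("uplinks must divide evenly across spines")
    links_per_spine = uplinks_per_tor // spines        # parallel links from one ToR to one spine
    tors = spine_ports // links_per_spine              # ToRs that one spine can terminate
    server_ports_per_tor = uplinks_per_tor * oversub   # 3:1 gives 3 server ports per uplink
    return {
        "tors": tors,
        "server_ports": tors * server_ports_per_tor,
        "switches": tors + spines,
        "rack_units": tors * tor_ru + spines * spine_ru,
    }


small = leaf_spine(spine_ports=128, spines=2, uplinks_per_tor=16, oversub=3)
medium = leaf_spine(spine_ports=128, spines=8, uplinks_per_tor=16, oversub=3)
large = leaf_spine(spine_ports=128, spines=16, uplinks_per_tor=16, oversub=3)
```

With 2, 8 and 16 spines this yields 768, 3072 and 6144 server ports, matching the counts labeled in the figure.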
Uniform fabric for Cloud & Big Data
[Figure: the leaf/spine fabric (tier labels (16), (8), (2) and (128), (64), (16); L3/L2 boundary; 6144 Server ports) with Rack 1 to Rack N hosting Storage Access (Block I/O, NAS, Object), Hadoop (Name Node, Job Tracker, Secondary NN, Nodes, Clients), Database, and vSwitch hosts with VMs]
Attaching Services & North/South
[Figure: the leaf/spine fabric (tier labels (16), (8), (2) and (128), (64), (16); L3/L2 boundary) with Rack 1 to Rack N hosting Hadoop nodes (Name Node, Job Tracker, Secondary NN, Nodes), Clients, and vswitch hosts with VMs; Firewall and LB pairs and x86 Gateways attach to the fabric and connect to the World]
Generic Logical Architecture 1
• Overlay based L2
• Physical/Static FW
[Figure: World, DC router and Fabric; FW with L3 NAT, LB, and L2 segments of VMs for tenants Green Co. and Orange Co.; Big Data attached at L2/L3]
Generic Logical Architecture 2
• Overlay based L2
• Virtual/Mobile FW
• Overlay Gateway
[Figure: World, DC router, Pub DMZ and Fabric; FW with L3 NAT, LB, and L2 segments of VMs for tenants Green Co. and Orange Co.; Big Data attached at L2/L3]
Generic Logical Architecture 3
• No Overlays
• TRILL based L2
• Virtual/Mobile FW
[Figure: World, DC router, Pub DMZ and Fabric (TRILL); FW with L3 NAT, LB, and L2 segments of VMs for tenants Green Co. and Orange Co.; Big Data attached at L2/L3]
Density: Fixed vs. Chassis
[Chart: 10G per RU @ Line Rate (L3), Chassis vs. Fixed; y-axis 0 to 140; x-axis 2008, 2010, 2012, 2014]
Power: Fixed vs. Chassis
[Chart: Max Watts / Line Rate 10G (L3), Chassis vs. Fixed; y-axis 0 to 18; x-axis 2010, 2012, 2014]
What are the Challenges?
• Deployment & Cabling
• Layer 2 (TRILL?)
• Configuration & Policy
• Design Best Practices
• Monitor & Troubleshoot
Dell Fabric Manager
• Design Templates
• Validate deployment
• Automate fabric configuration
• Monitoring & Operations
[Figure: leaf/spine fabric with tier labels (16), (8), (2), (128), (16); L3/L2 boundary]
Webinar: CLOS Fabrics Explained
http://closfabric.eventbrite.com/
Wednesday, June 20, 2012 from 10:00 AM to 1:00 PM (ET)
HOST: Brad Hedlund
CO-HOST: Ivan Pepelnjak
Three Stage Layer 3 Leaf/Spine Fabric
[Figure: three-tier fabric with tier labels (64), (8), (2), (128), (512); /26 and 0/0 route annotations; L3/L2 boundary; 24,576 Server ports]
• Leaf+ToR mesh groups
• Non-blocking @ top tiers
• Default route @ ToR & Leaf
• ~8usec worst case
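The 24,576-port total on this slide, together with the switch and link counts listed for the same fabric size in the supplemental section (704 switches, 10,240 ISL), can be reproduced with a short calculation. This is a sketch under stated assumptions, not the author's sizing method: the per-switch port splits used as defaults below (48 server ports and 16 uplinks per ToR, 64 downlinks and 16 uplinks per leaf, 32 ports per spine) are inferred so that the totals match, and the function names are hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def three_stage(tors, tor_server_ports=48, tor_uplinks=16,
                leaf_downlinks=64, leaf_uplinks=16, spine_ports=32):
    """Count switches and inter-switch links in a ToR/leaf/spine fabric.

    Assumed layout: ToRs and leaves form mesh groups in which every ToR has
    one uplink to every leaf of its group. All defaults are assumptions.
    """
    tors_per_group = leaf_downlinks    # one leaf downlink per ToR in the group
    leaves_per_group = tor_uplinks     # one ToR uplink per leaf in the group
    groups = ceil_div(tors, tors_per_group)
    leaves = groups * leaves_per_group
    spines = ceil_div(leaves * leaf_uplinks, spine_ports)
    return {
        "server_ports": tors * tor_server_ports,
        "mesh_groups": groups,
        "leaves": leaves,
        "spines": spines,
        "switches": tors + leaves + spines,
        "isl": tors * tor_uplinks + leaves * leaf_uplinks,
    }


fabric = three_stage(512)
```

Under these assumptions, 512 ToRs give 8 mesh groups, 128 leaves and 64 spines, which lines up with the (512), (128) and (64) labels in the figure.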
The case for 40G QSFP switch ports
[Figure: four 10G SFP+ links at $1K each VS one QSFP at $1,800; for 32 ToR, totals of $230K and $512K]
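The two totals on this slide can be reproduced with a few lines of arithmetic. A rough sketch, assuming each of the 32 ToRs needs 16 x 10G of uplink capacity (the 16-uplink ToR from the leaf/spine slide), that the $1K and $1,800 prices apply per uplink, and that one QSFP link carries 4 x 10G. The function name is illustrative only.

```python
def uplink_optics_cost(tors, uplink_10g_per_tor, price_per_link, lanes_per_link):
    """Total optics cost when each link carries `lanes_per_link` x 10G."""
    links_per_tor = uplink_10g_per_tor // lanes_per_link
    return tors * links_per_tor * price_per_link


sfp_total = uplink_optics_cost(32, 16, 1_000, lanes_per_link=1)   # 512 SFP+ links
qsfp_total = uplink_optics_cost(32, 16, 1_800, lanes_per_link=4)  # 128 QSFP links
```

Under these assumptions the SFP+ option comes to $512,000 and the QSFP option to $230,400, which rounds to the $230K shown.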
Supplemental Section
Comparing Fabric efficiencies of Fixed vs. Chassis designs
Total Fabric Power: Chassis vs. Fixed
[Chart: KW (0 to 250) by Fabric size (384, 2048, 4096, 8192), non-blocking; series: Chassis, Fixed]
Total Fabric RU: Chassis vs. Fixed
[Chart: RU (0 to 700) by Fabric size (384, 2048, 4096, 8192), non-blocking; series: Chassis, Fixed]
8192 non-blocking Fabric [figure labels: (64) (8) (2) (128)]
• 384RU • 153.6KW • 8192 ISL • 192 switches
8192 non-blocking Fabric [figure labels: (32) (256)]
• 608RU • 216.3KW • 8192 ISL • 288 switches
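The two 8192-port designs above list total power and rack units; dividing by the port count makes them easier to compare. A small sketch using only the figures on the two slides (the function name `per_port` is made up for illustration):

```python
def per_port(total_kw, total_ru, ports):
    """Return (watts per port, rack units per 1000 ports)."""
    return total_kw * 1000 / ports, total_ru * 1000 / ports


design_a = per_port(153.6, 384, 8192)   # the 192-switch design
design_b = per_port(216.3, 608, 8192)   # the 288-switch design
```

The 192-switch design works out to roughly 18.75 W per port, against roughly 26.4 W per port for the 288-switch design.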
4096 non-blocking Fabric [figure labels: (32) (2) (64)]
• 192RU • 76.8KW • 4096 ISL • 96 switches
4096 non-blocking Fabric [figure labels: (16) (128)]
• 304RU • 108.1KW • 4096 ISL • 144 switches
2048 non-blocking Fabric [figure labels: (16) (2) (32)]
• 96RU • 38.4KW • 512 ISL • 48 switches
2048 non-blocking Fabric [figure labels: (8) (64)]
• 152RU • 54KW • 2048 ISL • 72 switches
384 non-blocking Fabric [figure labels: (4) (2) (6)]
• 20RU • 8KW • 96 ISL • 10 switches
384 non-blocking Fabric [figure labels: (2) (12)]
• 26RU • 18KW • 384 ISL • 14 switches
256 non-blocking Fabric [figure labels: (2) (4)]
• 12RU • 4.8KW • 64 ISL • 6 switches
256 non-blocking Fabric [figure labels: (2) (8)]
• 22RU • 6.7KW • 256 ISL • 10 switches
Total Fabric Power: Chassis vs. Fixed
[Chart: KW (0 to 80) by Fabric size (768, 1536, 3072, 6144), 3:1 oversubscribed; series: Chassis, Fixed]
Total Fabric RU: Chassis vs. Fixed
[Chart: RU (0 to 250) by Fabric size (768, 1536, 3072, 6144), 3:1 oversubscribed; series: Chassis, Fixed]
768 @ 3:1 oversubscribed Fabric [figure labels: (2) (16)]
• 20RU • 7.2KW • 64 ISL • 18 switches
768 @ 3:1 oversubscribed Fabric [figure labels: (2) (16)]
• 30RU • 7.5KW • 256 ISL • 18 switches
1536 @ 3:1 oversubscribed Fabric [figure labels: (4) (2) (32)]
• 40RU • 14.4KW • 128 ISL • 36 switches
1536 @ 3:1 oversubscribed Fabric [figure labels: (2) (32)]
• 54RU • 17KW • 512 ISL • 34 switches
3072 @ 3:1 oversubscribed Fabric [figure labels: (8) (2) (64)]
• 80RU • 28.8KW • 1024 ISL • 72 switches
3072 @ 3:1 oversubscribed Fabric [figure labels: (4) (64)]
• 108RU • 34KW • 1024 ISL • 68 switches
6144 @ 3:1 oversubscribed Fabric [figure labels: (16) (8) (2) (128)]
• 160RU • 57.6KW • 2048 ISL • 144 switches
6144 @ 3:1 oversubscribed Fabric [figure labels: (8) (128)]
• 216RU • 68KW • 2048 ISL • 136 switches
Total Fabric Power: Chassis vs. Fixed
[Chart: KW (0 to 1400) by Fabric size (12288, 24576, 49152, 98304), 3:1 oversubscribed - Massive; series: Chassis, Fixed]
Total Fabric RU: Chassis vs. Fixed
[Chart: RU (0 to 5000) by Fabric size (12288, 24576, 49152, 98304), 3:1 oversubscribed - Massive; series: Chassis, Fixed]
12,288 @ 3:1 oversubscribed Fabric [figure labels: (32) (64) (8) (2) (256)]
• 448RU • 166.4KW • 5,120 ISL • 352 switches
12,288 @ 3:1 oversubscribed Fabric [figure labels: (16) (256)]
• 432RU • 136.3KW • 4,096 ISL • 272 switches
24,576 @ 3:1 oversubscribed Fabric [figure labels: (64) (128) (8) (2) (512)]
• 896RU • 332.8KW • 10,240 ISL • 704 switches
24,576 @ 3:1 oversubscribed Fabric [figure labels: (32) (256) (512)]
• 1120RU • 328.9KW • 16,384 ISL • 800 switches
49,152 @ 3:1 oversubscribed Fabric [figure labels: (128) (256) (1024)]
• 1792RU • 665.6KW • 20,480 ISL • 1408 switches
49,152 @ 3:1 oversubscribed Fabric [figure labels: (64) (512) (1024)]
• 2240RU • 657.9KW • 32,768 ISL • 1600 switches
98,304 @ 3:1 oversubscribed Fabric [figure labels: (256) (512) (2048)]
• 3584RU • 1331.2KW • 40,960 ISL • 2816 switches
98,304 @ 3:1 oversubscribed Fabric [figure labels: (128) (1024) (2048)]
• 4480RU • 1315.8KW • 65,536 ISL • 3200 switches
The power to do more