Bixby - Hot Chips conference


Copyright © 2013 Oracle and/or its affiliates. All rights reserved.

Bixby: the Scalability and Coherence Directory ASIC in Oracle's Highly Scalable Enterprise Systems
Thomas Wicki and Jürgen Schulz
Senior Principal Hardware Engineers, Microelectronics
Hot Chips 25 – August 25-27, 2013

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Outline
• Motivation and Design Objectives
• M5 System and Beyond
• System RAS Features
• Implementation Details
• Debug and DFT Features
• Summary


Motivation
• M5 and M6’s direct interconnects scale up to 8 processors using Coherence Links (CL) (Glueless system)
• To enable systems to scale beyond 8 processors:
 – Scalability Links (SL) were added to M5 and M6
 – Bixby ASICs are needed (Glued system)
[Figure: M6 processors with their CL and SL connections labeled]

Bixby Design Objectives
Large System Scaling
• Scalable up to 96 processors
• Communication switch between 8-processor SMPs
Coherence Directory & Processing
• Directory for L3 caches of all processors
• Multi-generation support
• Enabling mixed processor systems
Bixby Enterprise System Focus
• Enterprise-Class RAS feature set
• High bandwidth, low latency

Challenges and Trade-Offs
• Directory Size
 – Challenge: Large directory size requirement
 – Solution: Scale up number of Bixbys with system size
• Directory Width
 – Challenge: Massive number of L3 cache ways x number of processors per look-up
 – Solution: Pipeline look-ups
• Switch Size
 – Challenge: 24 x 24 crossbar efficiency
 – Solution: Overprovision switching bandwidth
• Shared Resources
 – Challenge: Some resources shared by multiple hardware domains
 – Solution: Associate errors with single domain and clean up shared resources after error


Oracle’s M5-32 System
• 32 M5 SPARC processors
• 12 Bixbys
• 4 physical (hardware) domains
• 3.1TB/s payload coherence bandwidth
• 1.5TB/s payload scalability bandwidth
[Figure: M5-32 system with the Bixbys labeled]

M5-32 System Coherence Interconnect
• Coherence Links (CL) – 12 lanes per direction
• Scalability Links (SL) – 4 lanes per direction
• 12Gbps per lane
• 7 CLs + 6 SLs per processor
• 16 SLs per Bixby
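The link counts on the two M5-32 slides can be cross-checked with a little arithmetic. The sketch below is only an illustration: the constant names are made up here, and the values are the figures quoted on the slides. It confirms that the SL ends at the processors match the SL ends at the Bixbys, and it computes the raw per-direction signaling rate of one link.

```python
# Figures quoted on the M5-32 slides; the names are illustrative only.
PROCESSORS = 32
BIXBYS = 12
SLS_PER_PROCESSOR = 6
SLS_PER_BIXBY = 16
CL_LANES = 12          # lanes per direction
SL_LANES = 4           # lanes per direction
GBPS_PER_LANE = 12


def raw_gbps(lanes: int) -> int:
    """Raw signaling rate of one link in one direction, in Gb/s."""
    return lanes * GBPS_PER_LANE


# Every SL has one end at a processor and one end at a Bixby,
# so the two totals must agree.
sl_ends_at_processors = PROCESSORS * SLS_PER_PROCESSOR
sl_ends_at_bixbys = BIXBYS * SLS_PER_BIXBY

print(sl_ends_at_processors, sl_ends_at_bixbys)   # 192 192
print(raw_gbps(CL_LANES), raw_gbps(SL_LANES))     # 144 48
```

The payload bandwidth figures on the system slide depend on encoding and protocol overhead that the slides do not state, so the sketch stops at raw rates.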

Scalability Link Bandwidth



Physical (Hardware) Domain Support
• Up to 12 physical domains
• Dynamically configurable by Service Processor
• Packet filtering and Physical Address fencing
• Errors resolved to physical domain
• Per-domain Cease Operation support
[Figure: 8-processor SMPs (S) assigned to Domains A, B, and C, connected through Bixbys (BX)]
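One way to picture packet filtering and physical address fencing is a per-port domain table plus a per-domain address window. The sketch below is a hypothetical model, not Bixby's register layout or checking logic: `Packet`, `DomainFilter`, the port numbers, and the address ranges are all invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src_port: int   # link the packet arrived on
    dst_port: int   # link the packet is routed to
    address: int    # physical address carried by the request


class DomainFilter:
    """Hypothetical domain filter, configured by a service processor."""

    def __init__(self) -> None:
        self.port_domain: dict[int, str] = {}
        self.fence: dict[str, range] = {}

    def configure(self, domain: str, ports: list[int], addresses: range) -> None:
        """Assign ports and an address window to a domain."""
        for port in ports:
            self.port_domain[port] = domain
        self.fence[domain] = addresses

    def check(self, pkt: Packet) -> tuple[bool, str | None]:
        """Return (accepted, domain to attribute the error to if rejected)."""
        src = self.port_domain.get(pkt.src_port)
        dst = self.port_domain.get(pkt.dst_port)
        if src is None or src != dst:
            return False, src          # packet filtering: crosses a domain boundary
        if pkt.address not in self.fence[src]:
            return False, src          # address fencing: outside the domain's window
        return True, None


f = DomainFilter()
f.configure("A", ports=[0, 1], addresses=range(0x0000, 0x1000))
f.configure("B", ports=[2, 3], addresses=range(0x1000, 0x2000))
print(f.check(Packet(0, 1, 0x0010)))   # (True, None)
print(f.check(Packet(0, 2, 0x0010)))   # (False, 'A')
print(f.check(Packet(2, 3, 0x0010)))   # (False, 'B')
```

Returning the offending domain alongside the verdict mirrors the slide's point that errors are resolved to a single physical domain.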

5-of-6 Redundancy Mode
• System can boot with any 5 out of each group of 6 Bixbys
• Increases availability since system can be used until service is performed
[Figure: normal configuration and failover configuration of the Bixby (BX) groups; the failover configuration is annotated "Still Boots!"]
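The boot rule can be stated as a simple predicate. This is a minimal sketch under one assumption the slides do not confirm: that groups are formed from consecutive Bixbys in the list. The function and constant names are illustrative.

```python
GROUP_SIZE = 6
MIN_HEALTHY = 5


def can_boot(healthy: list[bool]) -> bool:
    """True if every group of six Bixbys has at least five healthy members.

    Grouping consecutive entries is an assumption made for this sketch.
    """
    if len(healthy) % GROUP_SIZE != 0:
        raise ValueError("expected a whole number of 6-Bixby groups")
    groups = [healthy[i:i + GROUP_SIZE] for i in range(0, len(healthy), GROUP_SIZE)]
    return all(sum(group) >= MIN_HEALTHY for group in groups)


all_good = [True] * 12
one_down_per_group = [False] + [True] * 5 + [True] * 5 + [False]
two_down_in_one_group = [False, False] + [True] * 10

print(can_boot(all_good))               # True
print(can_boot(one_down_per_group))     # True
print(can_boot(two_down_in_one_group))  # False
```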

Hot Maintenance Support
• In a running system, failing Bixby or SMP can be:
 – Replaced
 – Tested
 – Re-integrated
[Figure: Domain A and Domain B SMPs (S) connected through Bixbys (BX), with BIST and IBIST annotations and a new SMP]

Link Protection
• CRC check and auto retry
 – Replay, if CRC error detected
 – Guaranteed lane failure detection
• Built in PRBS testing during link training
• Auto link re-initialization
 – Re-training link, if Replay unsuccessful
 – No Service Processor intervention required
• Auto single lane failover (per direction)
 – Based on PRBS testing
 – No Service Processor intervention required
[Figure: LFU in Chip A connected over an SL to LFU in Chip B]
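The escalation described above (CRC check, replay, then link re-training) can be sketched in software. This is an illustration only: CRC-32 from `zlib` stands in for the link CRC, whose polynomial the slides do not give, and `send_with_recovery`, `FlakyChannel`, and the replay limit of 3 are invented for the example.

```python
import zlib

CRC_BYTES = 4


def frame(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload (stand-in for the real link CRC)."""
    return payload + zlib.crc32(payload).to_bytes(CRC_BYTES, "big")


def unframe(framed: bytes):
    """Return the payload if the CRC matches, else None."""
    payload, crc = framed[:-CRC_BYTES], framed[-CRC_BYTES:]
    if zlib.crc32(payload).to_bytes(CRC_BYTES, "big") == crc:
        return payload
    return None


def send_with_recovery(payload, channel, retrain, max_replays=3):
    """Send once, replay on CRC error, and retrain if replays keep failing."""
    for replays in range(max_replays + 1):
        received = unframe(channel(frame(payload)))
        if received is not None:
            return received, f"delivered after {replays} replay(s)"
    retrain()  # replay unsuccessful: re-initialize the link, then try once more
    received = unframe(channel(frame(payload)))
    if received is not None:
        return received, "delivered after retrain"
    raise RuntimeError("link failed after replay and retrain")


class FlakyChannel:
    """Test channel that flips one bit in the next N transmissions."""

    def __init__(self, corrupt_next: int) -> None:
        self.corrupt_next = corrupt_next

    def __call__(self, data: bytes) -> bytes:
        if self.corrupt_next > 0:
            self.corrupt_next -= 1
            return bytes([data[0] ^ 0x01]) + data[1:]
        return data


events = []
print(send_with_recovery(b"req", FlakyChannel(2), lambda: events.append("retrain")))
# (b'req', 'delivered after 2 replay(s)')
```

Everything here runs without an external controller in the loop, which is the software analogue of "No Service Processor intervention required".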


Implementation Details
• 96 Tx + 96 Rx 16Gb/s Long-Reach AC coupled SerDes
• Package: 45mm x 45mm 1677-pin FPBGA (~500 signal IO)
• Process: 28nm 10 layer metal 0.85V ASIC
• ~160 Mbits SRAM (~20MB Tags)
• ~70M Gates (nand2 equivalent)
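A few of these figures are related by simple arithmetic. The sketch below uses only numbers from the slides (the "24 x4" link count comes from the functional block diagram); the variable names are illustrative, and the bandwidth is the raw SerDes rate, not payload.

```python
TX_LANES = 96          # Tx SerDes lanes (the chip has the same number of Rx lanes)
LANE_GBPS = 16         # maximum SerDes rate per lane
LINKS = 24             # x4 links shown in the functional block diagram
LANES_PER_LINK = 4
SRAM_MBITS = 160       # approximate, per the slide

# 24 links of 4 lanes each account for all 96 lanes per direction.
assert LINKS * LANES_PER_LINK == TX_LANES

raw_tx_gbps = TX_LANES * LANE_GBPS   # raw transmit rate at the SerDes maximum
sram_mbytes = SRAM_MBITS / 8         # consistent with the "~20MB Tags" figure

print(raw_tx_gbps)   # 1536
print(sram_mbytes)   # 20.0
```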

Functional Blocks: Forward Packet
[Figure: block diagram of the chip, listed top to bottom, with the forward-packet path highlighted]
• SerDes Links (24 x4)
• Link Framing Units (LFU)
• LQU Input Queues (IQU)
• ASU Crossbar In (AXI) and Forwarding Crossbar (FXU)
• Address Serialization Units (8 ASUs)
• ASU Crossbar Out (AXO)
• LQU Output Queues (OQU)
• Link Framing Units (LFU)
• SerDes Links (24 x4)

Functional Blocks: Directory Lookup
[Figure: block diagram of the chip, listed top to bottom, with the directory-lookup path highlighted]
• SerDes Links (24 x4)
• Link Framing Units (LFU)
• LQU Input Queues (IQU)
• ASU Crossbar In (AXI) and Forwarding Crossbar (FXU)
• Address Serialization Units (8 ASUs)
• ASU Crossbar Out (AXO)
• LQU Output Queues (OQU)
• Link Framing Units (LFU)
• SerDes Links (24 x4)

Floorplan
• SEC-DED on all major datapaths
• Parity on control signals
• Custom top level wires on top two routing layers
 – Critical nets implemented by Buffer on route
 – Faster ps/mm
• PVT invariant clock distribution
[Figure: floorplan with link blocks labeled "CRC/Retry" and the legend "ECC for Datapath, Parity for Control"]

Link Queuing Unit (LQU)
• Each manages an x4 Scalability Link
• Provides queuing support for multiple Virtual Channels
• Each LQU is part of a single physical domain
• SEC-DED on Link FIFOs (RAM based)
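SEC-DED protection, as used on the link FIFOs and datapaths, can be illustrated with the smallest textbook code of that kind: an extended Hamming (8,4) code. The slides do not state the code width or construction used in Bixby, so this is a toy instance of the technique, not the chip's ECC.

```python
DATA_POSITIONS = (3, 5, 6, 7)   # Hamming positions that carry data bits
PARITY_POSITIONS = (1, 2, 4)    # Hamming parity positions; index 0 is overall parity


def encode(nibble: int) -> list[int]:
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    code = [0] * 8
    for k, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> k) & 1
    for p in PARITY_POSITIONS:
        code[p] = sum(code[i] for i in range(1, 8) if i & p) % 2
    code[0] = sum(code) % 2     # make the parity of all 8 bits even
    return code


def decode(code: list[int]):
    """Return (nibble, status); nibble is None on a detected double-bit error."""
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i
    overall = sum(code) % 2
    fixed = list(code)
    if syndrome == 0 and overall == 0:
        status = "clean"
    elif overall == 1:
        fixed[syndrome] ^= 1    # syndrome 0 means the overall parity bit itself flipped
        status = "corrected"
    else:
        return None, "uncorrectable"
    nibble = sum(fixed[pos] << k for k, pos in enumerate(DATA_POSITIONS))
    return nibble, status


word = encode(0b1011)
word[5] ^= 1                    # inject a single-bit error
print(decode(word))             # (11, 'corrected')
word[2] ^= 1                    # inject a second error
print(decode(word))             # (None, 'uncorrectable')
```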

Cross Bar Units (XBU)
• A separate data path forwards traffic and is sized to account for any Head-of-Line blocking inefficiencies
 – FXU: 24in x 24out (2-cycle packet)
 – AXI: 24in x 8out
 – AXO: 16in x 24out
 – Switch fabrics implemented as custom layout hard macros
• Bixby fully sustains mixed request and data traffic at full line rate
• FXU is single domain, AXI/AXO are multi-domain logic
• Flow through SEC-DED, parity on routing control
[Figure: "Switch Efficiencies" chart; axis "% Efficiency" with ticks from 0 to 1; "Output" and "Input" series for FXU, AXI, and AXO]
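Head-of-Line blocking is the reason a crossbar with one FIFO per input cannot reach full throughput, and hence why the switching bandwidth is overprovisioned. The sketch below simulates the classic saturated model: every input always has a packet at the head of its queue with a uniformly random destination, and each output serves one contender per cycle. It is a generic textbook model, not a model of Bixby's queues or arbitration, and the function name and parameters are invented here.

```python
import random


def hol_saturation_throughput(ports: int, cycles: int, seed: int = 0) -> float:
    """Fraction of output slots used by a saturated FIFO input-queued crossbar."""
    rng = random.Random(seed)
    # Destination of the packet at the head of each input queue.
    head = [rng.randrange(ports) for _ in range(ports)]
    delivered = 0
    for _ in range(cycles):
        contenders: dict[int, list[int]] = {}
        for inp, out in enumerate(head):
            contenders.setdefault(out, []).append(inp)
        for inputs in contenders.values():
            winner = rng.choice(inputs)          # one packet per output per cycle
            head[winner] = rng.randrange(ports)  # next packet moves to the head
            delivered += 1
        # Losing inputs keep their head packet and block everything behind it.
    return delivered / (ports * cycles)


print(f"{hol_saturation_throughput(ports=24, cycles=5000):.2f}")
```

For large switches this model is known to saturate near 2 − √2 ≈ 0.586 of line rate, so a fabric built this way needs spare bandwidth to sustain full-rate traffic.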

Address Serialization Unit (ASU)
• Partitioned into eight parallel units
• Each directory unit can compare and process up to 22,656 bits per cycle (total 181,248 bits per chip per cycle)
• 0.5 request lookups per cycle (total 4 per chip)
• Flow-through correction on incoming packets and tag directory contents
• Retry on directory tag staging flops
• Supports up to 12 hardware domains with error steering
• Per domain Built-In Self Initialization (BISI)
• Tag RAM scrubber
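The per-chip totals follow directly from the per-unit figures. A short check, using only the numbers on the slide (the names are illustrative):

```python
from fractions import Fraction

ASU_UNITS = 8
BITS_PER_UNIT_PER_CYCLE = 22_656
LOOKUPS_PER_UNIT_PER_CYCLE = Fraction(1, 2)

bits_per_chip_per_cycle = ASU_UNITS * BITS_PER_UNIT_PER_CYCLE
lookups_per_chip_per_cycle = ASU_UNITS * LOOKUPS_PER_UNIT_PER_CYCLE

print(bits_per_chip_per_cycle)      # 181248
print(lookups_per_chip_per_cycle)   # 4
```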


Debug and DFT Features
• Monitoring link at full signaling speed is challenging
 – Two internal rings to allow capturing packet flow in ingress or egress direction
 – Internal triggering logic and RAM to store packet flow
 – External DDR interface to allow capturing packet flow on Logic Analyzer
• In-system test features:
 – MemBIST
 – InterconnectBIST
 – ASU tag RAM supports read, write, and read-modify-write access
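Trigger logic combined with a capture RAM behaves much like a ring buffer that freezes shortly after a condition is seen. The sketch below is a software analogy under that assumption; `PacketCapture`, its parameters, and the integer "packets" are invented for illustration and say nothing about the actual trigger hardware.

```python
from collections import deque


class PacketCapture:
    """Ring buffer that keeps recording until shortly after a trigger fires."""

    def __init__(self, depth, trigger, post_trigger):
        self.buffer = deque(maxlen=depth)   # oldest entries fall out when full
        self.trigger = trigger              # predicate evaluated on every packet
        self.post_trigger = post_trigger    # packets to record after the trigger
        self.remaining = None               # None until the trigger has fired

    def observe(self, packet) -> bool:
        """Record a packet; return False once the capture is frozen."""
        if self.remaining == 0:
            return False
        self.buffer.append(packet)
        if self.remaining is None:
            if self.trigger(packet):
                self.remaining = self.post_trigger
        else:
            self.remaining -= 1
        return True


cap = PacketCapture(depth=4, trigger=lambda p: p == 5, post_trigger=1)
for packet in range(10):
    cap.observe(packet)
print(list(cap.buffer))   # [3, 4, 5, 6]
```

The frozen buffer holds context from before and after the trigger, which is what makes this style of capture useful when a link cannot be probed externally at full speed.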


Bixby Design Objectives Accomplished
Large System Scaling
• Scales to 96 processors
• Provides communication switching between 8-processor SMPs
Coherence Directory & Processing
• Directory for L3 caches of all processors
• Multi-generation support
• Enabling mixed processor systems
Bixby Enterprise System Focus
• Enterprise-Class RAS feature set
• High bandwidth, low latency

Bixby Scalability ASIC
• Hosts L3 cache directory
• Processes coherence requests
• Includes comprehensive Enterprise-Class RAS
• Provides extensive debug and DFT features
• Pushes ASIC boundaries: technology, complexity, die size, SerDes count, power, …
• Flexible scaling up to 12x
[Figure: system sizes of 2, 4, 8, 16, 24, 32, 48, 64, and 96 processors, grouped under "Glueless systems" and "Glued systems"]

Q&A



Glossary
• BISI – Built-In Self Initialization
• BIST – Built-In Self Test
• BX – Bixby ASIC
• CL – Coherence Link
• CRC – Cyclic Redundancy Check
• IBIST – Interconnect Built-In Self Test
• MemBIST – Memory Built-In Self Test
• PRBS – Pseudo-Random Binary Sequence
• PVT – Process Voltage Temperature
• RAS – Reliability Availability Serviceability
• SEC-DED – Single-bit Error Correction - Double-bit Error Detection
• SL – Scalability Link
• SMP – Shared Memory Processor
• SPARC – Scalable Processor ARChitecture
