



# News about OpenCL and FPGA

Massimo Coppola 28/05/2018







### **Xeon with FPGA**



- First hinted in 2015
- Announced as an actual product in May 2018
  - Intel Xeon Gold 6138P
  - Integrates an Altera Arria 10 GX 1150 FPGA (no ARM)

#### Sources

- https://www.anandtech.com/show/12773/intel-shows-xeon-scalable-gold-6138p-with-integrated-fpga-shipping-to-vendors
- https://www.top500.org/news/intel-ships-xeon-skylake-processor-with-integrated-fpga/
- https://www.altera.com/solutions/acceleration-hub/overview.html
- SW toolchain based on OpenCL and Quartus Prime





# **Expected features**



| AnandTech          | Xeon Gold 6138            | Xeon Gold 6138P<br>with Arria 10 FPGA        |
|--------------------|---------------------------|----------------------------------------------|
| Socket             | Socket P<br>LGA 3647      | Socket P<br>LGA 3647                         |
| Cores / Threads    | 20 / 40                   | 20 / 40 ?                                    |
| Base Frequency     | 2000 MHz                  | 2000 MHz ?                                   |
| Turbo Frequency    | 3700 MHz                  | 3700 MHz ?                                   |
| PCIe Lanes         | 48                        | 32                                           |
| DRAM               | Six Channels<br>DDR4-2666 | Six Channels<br>DDR4-2666                    |
| On-Package FPGA    | -                         | Arria 10 GX 1150                             |
| Logic Elements     | -                         | 1150K (1.15M)                                |
| Embedded<br>Memory | -                         | 53 Mb                                        |
| UPI Links          | Three                     | Two                                          |
| TDP                | 125 W                     | 125 W CPU<br>60 - 70 W FPGA<br>195 W Total ? |
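The "195 W Total ?" entry and the lane and link differences in the table can be checked with simple arithmetic. The sketch below only restates the table's figures. Reading the missing PCIe lanes and UPI link as the ones wired to the on-package FPGA is an assumption (it is consistent with the "16 lanes per FPGA" and "1 channel" entries of the platform table later in this section), and the dictionary and variable names are illustrative.

```python
# Figures copied from the comparison table; entries marked "?" there are unconfirmed.
xeon_6138 = {"pcie_lanes": 48, "upi_links": 3, "tdp_w": 125}
xeon_6138p = {
    "pcie_lanes": 32,
    "upi_links": 2,
    "cpu_tdp_w": 125,
    "fpga_tdp_w": (60, 70),  # estimated range for the Arria 10
}

# Resources no longer exposed by the socket (assumed to serve the FPGA).
lanes_not_exposed = xeon_6138["pcie_lanes"] - xeon_6138p["pcie_lanes"]
links_not_exposed = xeon_6138["upi_links"] - xeon_6138p["upi_links"]

# Total package power: CPU TDP plus the low and high FPGA estimates.
total_tdp_w = tuple(xeon_6138p["cpu_tdp_w"] + w for w in xeon_6138p["fpga_tdp_w"])

print(f"PCIe lanes not exposed: {lanes_not_exposed}")
print(f"UPI links not exposed:  {links_not_exposed}")
print(f"Total TDP range:        {total_tdp_w[0]}-{total_tdp_w[1]} W")
```

The upper end of the range matches the table's 195 W estimate; the lower end would be 185 W if the FPGA draws only 60 W.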



## Launch app



- A virtual switch reference design
  - VMs on the main processor control packet switching and computation on the FPGA



| Feature | Specification | |
|---|---|---|
| Cores | Up to 28C with Intel® HT Technology | |
| FPGA | Altera® Arria 10 GX 1150 | |
| Socket TDP | Shared socket TDP<br>Up to 165W SKL & Up to 90W FPGA | |
| Socket | Socket P | |
| Scalability | Up to 2S - with SKL-SP or SKL + FPGA SKUs | |
| PCH | Lewisburg: DMI3 – 4 lanes; 14xUSB2 ports<br>Up to: 10xUSB3; 14xSATA3, 20xPCIe*3<br>New: Innovation Engine, 4x10GbE ports, Intel® QuickAssist Technology | |
| | **For CPU** | **For FPGA** |
| Memory | 6 channels DDR4<br>RDIMM, LRDIMM<br>2666 1DPC,<br>2133, 2400 2DPC | Low latency access to system memory via UPI & PCIe interconnect |
| Intel® UPI | 2 channels<br>(10.4, 9.6 GT/s) | 1 channel<br>(9.6 GT/s) |
| PCIe* | PCIe* 3.0<br>(8.0, 5.0, 2.5 GT/s)<br>32 lanes per CPU<br>Bifurcation support: x16, x8, x4 | PCIe* 3.0<br>(8.0, 5.0, 2.5 GT/s)<br>16 lanes per FPGA<br>Bifurcation support: x8 |
| High Speed Serial Interface<br>(Different board design based on HSSI config) | N/A | 2xPCIe 3.0 x8<br>Direct Ethernet<br>(4x10 GbE, 2x40 GbE, 10x10 GbE, 2x25 GbE) |
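To get a feel for what the link rates in the table mean for CPU-to-FPGA traffic, the sketch below converts GT/s into a rough peak bandwidth per direction. It rests on assumptions not stated in the slides: PCIe at 8.0 GT/s uses 128b/130b line coding while 5.0 and 2.5 GT/s use 8b/10b, and a UPI link is taken to carry 2 bytes per transfer per direction (which reproduces the commonly quoted 20.8 GB/s at 10.4 GT/s). The function names are hypothetical, and sustained throughput will be lower once protocol overhead is included.

```python
def pcie_gbytes_per_s(gt_per_s, lanes):
    """Rough peak PCIe bandwidth per direction in GB/s, before protocol overhead."""
    # Assumption: 8.0 GT/s (Gen3) uses 128b/130b coding, slower rates use 8b/10b.
    efficiency = 128 / 130 if gt_per_s >= 8.0 else 8 / 10
    # Each lane moves one bit per transfer; divide by 8 to get bytes.
    return gt_per_s * efficiency * lanes / 8


def upi_gbytes_per_s(gt_per_s, bytes_per_transfer=2):
    """Rough peak UPI bandwidth per direction in GB/s (2 bytes/transfer assumed)."""
    return gt_per_s * bytes_per_transfer


if __name__ == "__main__":
    # FPGA side of the table: 16 PCIe 3.0 lanes (as x8 + x8) and one UPI channel.
    print(f"PCIe 3.0 x16: {pcie_gbytes_per_s(8.0, 16):.2f} GB/s")
    print(f"PCIe 3.0 x8:  {pcie_gbytes_per_s(8.0, 8):.2f} GB/s")
    print(f"UPI 9.6 GT/s: {upi_gbytes_per_s(9.6):.2f} GB/s")
```

Under these assumptions the single UPI channel and the 16 PCIe lanes offer peak bandwidth of the same order, which fits the table's description of low-latency access to system memory over both interconnects.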









### SPD Course Path from now on



- Complete the homework assignments
  - MPI
  - TBB
  - OpenCL
- Choose a project topic
  - A data mining/stream mining algorithm
- Choose a technology
  - MPI, TBB, OpenCL
  - or more than one, if you want to exploit a hybrid technique



