Intel Editorial: Accelerating Innovations for Open and Sustainable HPC


Intel is answering computing’s insatiable demands while making sustainability a priority in the future of supercomputing for all.

HAMBURG, Germany, May 31, 2022–(BUSINESS WIRE)–The following is an op-ed by Jeff McVeigh of Intel Corporation:

This press release features multimedia. View the full release here: https://www.businesswire.com/news/home/20220531005771/en/

At the International Supercomputing Conference in Hamburg, Germany, on May 31, 2022, Jeff McVeigh, vice president and general manager of Intel Corporation’s Super Compute Group, announced Rialto Bridge, Intel’s data center graphics processing unit (GPU). Using the same architecture as Intel’s Ponte Vecchio data center GPU and combining enhanced tiles with Intel’s next process node, Rialto Bridge will offer up to 160 Xe cores, more FLOPs, more I/O bandwidth and higher TDP limits for considerably higher density, performance and efficiency. (Credit: Intel Corporation)

As we enter the exascale era and race toward zettascale, the technology industry’s contribution to global carbon emissions is also growing, with data centers and IT infrastructure among the primary drivers of new electricity consumption1.

This year, Intel committed to achieving net-zero greenhouse gas emissions in its global operations by 2040 and to developing more sustainable technology solutions for high performance computing (HPC). While daunting, it is achievable if we harness every component of the HPC stack: silicon, software and systems.

This is the focus of my talk at ISC 2022 in Hamburg, Germany.

It starts with silicon and heterogeneous computing architectures

We have a competitive HPC roadmap planned through 2024 that will provide a diverse portfolio of heterogeneous architectures. These architectures will enable order-of-magnitude performance gains while reducing power demands for general-purpose and emerging workloads such as AI, encryption and analytics.

The Intel® Xeon® processor code-named Sapphire Rapids with high-bandwidth memory (HBM) is a great example of how we are leveraging advanced packaging technologies and silicon innovations to deliver significant gains in performance, bandwidth and energy savings for HPC. With up to 64 gigabytes of high-bandwidth HBM2e memory in the package and accelerators built into the processor, we are unlocking memory-bandwidth-bound workloads while delivering substantial performance gains across key HPC use cases. On Sapphire Rapids HBM processors, we are seeing a 2x to 3x performance increase in weather, energy, manufacturing and physics research workloads2, along with notable gains in real-world Ansys Fluent and ParSeNet3 workloads.
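To make “memory-bandwidth-bound” concrete, below is a minimal C++/OpenMP sketch of a STREAM-triad-style kernel; it is illustrative only and not one of Intel’s benchmarks. Each iteration does trivial arithmetic but streams three large arrays through memory, so throughput is limited by how fast memory can feed the cores rather than by compute, which is the class of workload in-package HBM is intended to unlock. The array size and build command are assumptions.

```cpp
// STREAM-triad-style kernel: performance is bounded by memory bandwidth, not compute.
// Illustrative build: icpx -qopenmp -O3 triad.cpp
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
  const size_t n = 1ull << 26;                  // ~64M doubles per array (illustrative size)
  std::vector<double> a(n), b(n, 1.0), c(n, 2.0);
  const double scalar = 3.0;

  auto t0 = std::chrono::steady_clock::now();
  #pragma omp parallel for
  for (size_t i = 0; i < n; ++i)
    a[i] = b[i] + scalar * c[i];                // 2 loads + 1 store per element
  auto t1 = std::chrono::steady_clock::now();

  double sec = std::chrono::duration<double>(t1 - t0).count();
  double gbytes = 3.0 * n * sizeof(double) / 1e9;  // bytes moved per pass
  std::printf("triad: %.2f GB/s\n", gbytes / sec);
  return 0;
}
```

On a kernel like this, the reported GB/s tracks the memory subsystem rather than core count, which is why adding HBM to the package moves the needle for this workload class.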

Compute density is another imperative as we pursue multiple orders of magnitude of performance gains in AI and HPC supercomputing workloads. Our Ponte Vecchio data center GPU delivers strong performance on AI training and inference workloads, and we are also showing Ponte Vecchio accelerating high-fidelity simulation by 2x with OpenMC4.

Today we’re announcing the successor to this data center GPU, code-named Rialto Bridge. By evolving the Ponte Vecchio architecture and combining enhanced tiles with next process node technology, Rialto Bridge will deliver significantly higher density, performance and efficiency while maintaining software consistency.

Looking further ahead, Falcon Shores is the next major architectural innovation on our roadmap, combining x86 CPU and Xe GPU architectures into a single socket. This architecture is planned for 2024 and is expected to deliver more than 5x gains in performance per watt, 5x compute density, 5x memory capacity, and improved bandwidth5.

Principles of a successful software strategy: openness, choice, trust

Silicon is just sand without software to bring it to life. Our approach to software is to enable open development across the stack and to provide tools, platforms and software IP that make developers more productive and help them produce scalable, higher-performance, more efficient code that can take advantage of the latest silicon innovations without the burden of refactoring. The oneAPI industry initiative gives HPC developers cross-architecture programming, so code can target CPUs, GPUs and other specialized accelerators in an open, portable manner.

There are now more than 20 oneAPI Centers of Excellence at leading academic and research institutions around the world, and they are making significant progress, from more productive approaches to achieving performance portability at exascale to extending oneAPI and the Khronos Group’s SYCL abstraction layer for cross-architecture programming.
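To illustrate the cross-architecture model that oneAPI and SYCL promote, here is a minimal SYCL 2020 sketch; it is an illustrative example rather than code from the announcement, and the problem size is arbitrary. The same C++ kernel source is dispatched at run time to whichever device the queue selects, whether that is a CPU, a GPU or another supported accelerator.

```cpp
// Minimal SYCL 2020 example: one kernel source, any supported device.
// Illustrative build with a oneAPI DPC++ compiler: icpx -fsycl vadd.cpp
#include <sycl/sycl.hpp>
#include <cstdio>

int main() {
  sycl::queue q;                                 // default selector: GPU if present, else CPU
  constexpr size_t n = 1 << 20;                  // arbitrary problem size

  // Unified shared memory visible to both host and device.
  float *a = sycl::malloc_shared<float>(n, q);
  float *b = sycl::malloc_shared<float>(n, q);
  float *c = sycl::malloc_shared<float>(n, q);
  for (size_t i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

  // The same kernel is compiled for whichever device the queue targets.
  q.parallel_for(sycl::range<1>{n}, [=](sycl::id<1> i) {
    c[i] = a[i] + b[i];
  }).wait();

  std::printf("ran on: %s, c[0] = %.1f\n",
              q.get_device().get_info<sycl::info::device::name>().c_str(), c[0]);

  sycl::free(a, q); sycl::free(b, q); sycl::free(c, q);
  return 0;
}
```

Because device selection happens when the queue is constructed, the same binary can move between CPU-only nodes and GPU-accelerated nodes without refactoring the kernel.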

Making the connections for sustainable heterogeneous computing

As HPC and data center workloads evolve toward disaggregated architectures and heterogeneous computing, we will need tools that help us manage these complex and varied IT environments effectively.

Today, we’re introducing Intel® XPU Manager, an open-source solution for monitoring and managing Intel data center GPUs locally and remotely. It is designed to simplify administration and maximize reliability and uptime by running comprehensive diagnostics, monitoring utilization and performing firmware updates.

The Distributed Asynchronous Object Storage (DAOS) file system provides system-level optimizations for power-hungry data transfer and storage tasks. DAOS has a dramatic impact on file system performance, both improving overall access time and reducing the storage capacity required, which lowers the overall data center footprint and increases power efficiency. In IO500 results compared with Lustre, DAOS delivered a 70x6 improvement in hard-write file system performance.
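For context, DAOS exposes its object store to applications through native libraries as well as a POSIX interface via the dfuse client. The sketch below is an ordinary POSIX write loop of the kind the IO500 “hard write” test stresses, assuming (hypothetically) that the target path sits on a DAOS POSIX container mounted through dfuse; the mount point and sizes are illustrative, not taken from the announcement.

```cpp
// Hypothetical bandwidth-bound write loop; the file path assumes a DAOS POSIX
// container mounted through dfuse (the mount point is illustrative).
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <vector>

int main() {
  const char *path = "/mnt/dfuse/checkpoint.bin";   // assumed dfuse mount point
  const size_t block = 1 << 20;                     // 1 MiB per write
  const size_t blocks = 1024;                       // ~1 GiB total

  int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0) { std::perror("open"); return 1; }

  std::vector<char> buf(block, 0x5a);
  for (size_t i = 0; i < blocks; ++i) {
    if (write(fd, buf.data(), buf.size()) != static_cast<ssize_t>(buf.size())) {
      std::perror("write");
      close(fd);
      return 1;
    }
  }
  close(fd);   // the data lands in DAOS object storage rather than a block device
  return 0;
}
```

The point of the POSIX path is that existing application I/O like this can run unchanged while DAOS handles placement and metadata underneath.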

Meeting HPC sustainability together

We are proud to partner with like-minded customers and leading institutions around the world to advance a more sustainable and open HPC. Recent examples include our collaboration with the Barcelona Supercomputing Centre to establish a pioneering zettascale RISC-V laboratory, and our ongoing work with the University of Cambridge and Dell to evolve the existing Exascale Lab into the new Cambridge Zettascale Lab. These efforts build on our plans to create a strong European innovation ecosystem for the computing industry over the long term.

In the end, no single company can do this alone. It will take the entire ecosystem, spanning manufacturing, silicon, interconnects, software and systems. By working together, we can turn one of the most demanding HPC challenges of the century into the opportunity of the century and change the world for generations to come.

Jeff McVeigh is vice president and general manager of the Super Compute Group at Intel Corporation.

About Intel

Intel (Nasdaq: INTC) is an industry leader, creating world-changing technology that enables global progress and enriches lives. Inspired by Moore’s Law, we continuously work to advance the design and manufacturing of semiconductors to help address our customers’ greatest challenges. By embedding intelligence in the cloud, network, edge and every kind of computing device, we unleash the potential of data to transform business and society for the better. To learn more about Intel’s innovations, visit newsroom.intel.com and intel.com.

Notices and Disclaimers:

1 Andrae, Hypotheses for primary energy use, electricity use and CO2 emissions of global computing and its shares of the total between 2020 and 2030, WSEAS Trans Power Syst, 15 (2020)

2 Measured using the following workloads:

CloverLeaf

Tested by Intel as of 26/04/2022. 1 node, 2x Intel® Xeon® Platinum 8360Y processors, cores, HT on, Turbo on, 256 GB (16 x 16 GB DDR4 3200 MT/s) total memory, BIOS SE5C6200.86B.0021.D40.2101090208, Ubuntu 20.04, kernel 5.10, ucode 0xd0002a0, ifort 2021.5, Intel MPI 2021.5.1, build knobs: -xCORE-AVX512 -qopt-zmm-usage=high

Tested by Intel as of 19/04/22. 1 node, 2x pre-production Intel® Xeon® Scalable processors codenamed Sapphire Rapids with HBM, > cores, HT on, Turbo on, 128 GB total memory (HBM2e at 3200 MHz), BIOS EGSDCRB1.86B.0077.D11.2203281354, ucode revision=0x83000200, CentOS Stream 8, Linux version 5.16, ifort 2021.5, Intel MPI 2021.5.1, build knobs: -xCORE-AVX512 -qopt-zmm-usage=high

OpenFOAM

Tested by Intel as of 2022-01-26. 1 node, 2x Intel® Xeon® Platinum 83 processors, cores, HT on, Turbo on, 256 GB (16 x 16 GB 3200 MT/s, dual rank) total memory, BIOS SE5C6200.86B.0020.P23.2103261309, ucode 0xd000270, Rocky Linux 8.5, Linux version 4.18, OpenFOAM® v1912, Motorbike 28M @ 250 iterations; Build notes: Tools: Intel Parallel Studio 2020u4, build knobs: -O3 -ip -xCORE-AVX512

Tested by Intel as of 26/1/2022. 1 node, 2x pre-production Intel® Xeon® Scalable processors codenamed Sapphire Rapids with HBM, > cores, HT off, Turbo off, 128 GB total memory (HBM2e at 3200 MHz), pre-production platform and BIOS, CentOS 8, Linux version 5.12, OpenFOAM® v1912, Motorbike 28M @ 250 iterations; Build notes: Tools: Intel Parallel Studio 2020u4, build knobs: -O3 -ip -xCORE-AVX512

WRF

Tested by Intel as of 05/03/2022. 1 node, 2x Intel® Xeon® 8380 processors, 80 cores, HT on, Turbo on, 256 GB (16 x 16 GB 3200 MT/s, dual rank) total memory, BIOS version SE5C6200.86B.0020.P23.2103261309, ucode revision = 0xd000270, Rocky Linux 8.5, Linux version 4.18, WRF v4.2.2

Tested by Intel as of 05/03/2022. 1 node, 2x pre-production Intel® Xeon® Scalable processors codenamed Sapphire Rapids with HBM, > cores, HT on, Turbo on, 128 GB total memory (HBM2e at 3200 MHz), BIOS version EGSDCRB1.86B.0077.D11.2203281354, ucode revision = 0x83000200, CentOS Stream 8, Linux version 5.16, WRF v4.2.2

YASK

Tested by Intel as of 09/05/2022. 1 node, 2x Intel® Xeon® Platinum 8360Y processors, cores, HT on, Turbo on, 256 GB (16 x 16 GB DDR4 3200 MT/s) total memory, BIOS SE5C6200.86B.0021.D40.2101090208, Rocky Linux 8.5, kernel 4.18.0, ucode 0xd000270, build knobs: make -j YK_CXX='mpiicpc -cxx=icpx' arch=avx2 stencil=iso3dfd radius=8

Tested by Intel as of 03/05/22. 1 node, 2x pre-production Intel® Xeon® Scalable processors codenamed Sapphire Rapids with HBM, > cores, HT on, Turbo on, 128 GB total memory (HBM2e at 3200 MHz), BIOS EGSDCRB1.86B.0077.D11.2203281354, ucode revision=0x83000200, CentOS Stream 8, Linux version 5.16, build knobs: make -j YK_CXX='mpiicpc -cxx=icpx' arch=avx2 stencil=iso3dfd radius=8

3 Ansys Fluent

Tested by Intel as of 2/2022. 1 node, 2x Intel® Xeon® Platinum 8380 processors, 80 cores, HT on, Turbo on, 256 GB total memory (16 x 16 GB 3200 MT/s, dual rank), BIOS version SE5C6200.86B.0020.P23.2103261309, ucode revision=0xd000270, Rocky Linux 8.5, Linux version 4.18, Ansys Fluent 2021 R2, Aircraft_wing_14m; Build notes: built with Intel Compiler 19.3 and Intel MPI 2019u

Tested by Intel as of 2/2022. 1 node, 2x pre-production Intel® Xeon® Scalable processors codenamed Sapphire Rapids with HBM, >40 cores, HT off, Turbo off, 128 GB total memory (HBM2e at 3200 MHz), pre-production platform and BIOS, CentOS 8, Linux version 5.12, Ansys Fluent 2021 R2, Aircraft_wing_14m; Build notes: Intel Compiler 19.3 and Intel MPI 2019u8

Ansys ParSeNet

Tested by Intel as of 24/05/2022. 1 node, 2x Intel® Xeon® Platinum 83 processors, cores, HT on, Turbo on, 256 GB total memory (16 x 16 GB DDR4 3200 MT/s), BIOS SE5C6200.86B.0021.D40.2101090208, Ubuntu 20.04.1 LTS, kernel 5.10, ParSeNet (SplineNet), PyTorch 1.11.0, Torch-CCL 1.2.0, IPEX 1.10.0, MKL (2021.4, product build 20210904), oneDNN (v2.5.0)

Tested by Intel as of 18/04/2022. 1 node, 2x pre-production Intel® Xeon® Scalable processors codenamed Sapphire Rapids with HBM, 112 cores, HT on, Turbo on, 128 GB total memory (HBM2e 3200 MT/s), BIOS EGSDCRB1.86B.0077.D11.2203281354, CentOS Stream 8, kernel 5.16, ParSeNet (SplineNet), PyTorch 1.11.0, Torch-CCL 1.2.0, IPEX 1.10.0, MKL (2021.4, product build 20210904), oneDNN (v2.5.0)

4 Tested by Argonne National Laboratory as of May 23, 2022. 1 node, 2x AMD EPYC 7532, 256 GB DDR4 3200, HT on, Turbo on, ucode 0x8301038, 1x A100 GB PCIe, OpenSUSE Leap 15.3, Linux version 5.3.18, Libraries: CUDA 11.6 with OpenMP clang compiler. Build knobs: cmake --preset=llvm_a100 -DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_MODE=BATCH -DCMAKE_UNITY_BUILD_BATCH_SIZE=1000 -DCMAKE_INSTALL_PREFIX=./install -Ddebug=off -Doptimize=on -Dopenmp=on -Dnew_w=on -Ddevice_history=off -Ddisable_xs_cache=on -Ddevice_printf=off. Benchmark: depleted-fuel inactive batch performance of the HM-Large reactor with M particles. Tested by Intel as of 25/05/2022. 1 node, 2x Intel® Xeon® 8360Y Scalable processors, 256 GB DDR4 3200, HT on, Turbo on, ucode 0xd0002c1, 1x pre-production Ponte Vecchio, Ubuntu 20.04, Linux version 5.10.54, agama 434. Build knobs: cmake -DCMAKE_CXX_COMPILER="mpiicpc" -DCMAKE_C_COMPILER="mpiicc" -DCMAKE_CXX_FLAGS="-cxx=icpx -mllvm -indvars-widen-indvars=false -Xclang -fopenmp-declare-target-global-default-no-map -std=c++17 -Dgsl_CONFIG_CONTRACT_CHECKING_OFF -fsycl -DSYCL_SORT -D_GLIBCXX_USE_TBB_PAR_BACKEND=0" --preset=spirv -DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_MODE=BATCH -DCMAKE_UNITY_BUILD_BATCH_SIZE=1000 -DCMAKE_INSTALL_PREFIX=./install -Ddebug=off -Doptimize=on -Dopenmp=on -Dnew_w=on -Ddevice_history=off -Ddisable_xs_cache=on -Ddevice_printf=off. Benchmark: depleted-fuel inactive batch performance of the HM-Large reactor with M particles.

5 Falcon Shores performance targets based on estimates relative to current platforms as of February 2022. Results may vary.

6 Results may vary. Learn more about io500 and “Comparing DAOS Performance to Lustre Installation” on YouTube.

All product plans and roadmaps are subject to change without notice.

Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.

Intel technologies may require activation of enabled hardware, software, or service.

Performance varies depending on usage, configuration, and other factors. For more information, visit www.intel.com/PerformanceIndex.

Performance results are based on testing as of the dates shown in the configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure.

Your costs and results may vary.

Statements that refer to future plans or expectations are forward-looking statements. These statements are based on current expectations and involve risks and uncertainties that could cause actual results to differ materially from those expressed or implied in such statements. For more information on the factors that could cause actual results to differ materially, see our most recent earnings release and SEC filings at www.intc.com.

©Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

View the source version on businesswire.com: https://www.businesswire.com/news/home/20220531005771/en/

Contacts

Bats Jafferji
1-603-809-5145
bats.jafferji@intel.com
