Not long ago, Intel’s Gaudi 2 was benchmarked against NVIDIA’s H100 accelerator. Now Intel has presented the next generation, Gaudi 3, which is said to improve considerably on its predecessor.
Gaudi 3 is manufactured on a 5 nm process and uses the fifth generation of Intel’s Tensor Processor Cores (TPCs), 64 of them in total. It builds on the architecture of Gaudi 2, but offers a significant improvement in computing power, memory bandwidth and architectural efficiency. The processor consists of two compute dies that together carry 8 matrix multiplication engines (MMEs), and it provides 24 RDMA NIC ports at 200 Gbps each.
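To put the networking figure in perspective, the aggregate bandwidth of the NIC ports can be worked out with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the port figure means 200 gigabits per second per port, counts raw line rate in one direction, and ignores protocol overhead. The function name is made up for illustration.

```python
PORTS = 24            # RDMA NIC ports per accelerator, from the article
GBIT_PER_PORT = 200   # assumption: 200 gigabits per second per port


def aggregate_bandwidth(ports: int = PORTS, gbit_per_port: int = GBIT_PER_PORT) -> dict:
    """Return the raw aggregate line rate of all ports in several units."""
    total_gbit = ports * gbit_per_port
    return {
        "Gbit/s": total_gbit,
        "Tbit/s": total_gbit / 1000,
        "GB/s": total_gbit / 8,  # 8 bits per byte
    }


if __name__ == "__main__":
    for unit, value in aggregate_bandwidth().items():
        print(f"{value} {unit}")
```

Under these assumptions the ports add up to 4.8 Tbit/s, or 600 GB/s, per direction.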
The processor is paired with eight HBM2e stacks for a total of 128 GB of memory. According to Intel, the AI accelerator delivers 1.8 PFLOPS of FP8 and BF16 compute and 3.7 TB/s of memory bandwidth for training and inference. In addition, it has 96 MB of on-board SRAM and is said to offer enough memory to process large GenAI data sets on fewer Intel Gaudi 3 accelerators.
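The point about needing fewer accelerators can be illustrated with a rough sizing calculation. The sketch below is not Intel's methodology: it only counts the memory needed for model weights, ignores activations, KV cache and optimizer state, and treats 1 GB as 10^9 bytes. The function name and the choice of models are illustrative.

```python
import math

HBM_CAPACITY_GB = 128  # memory per accelerator, from the article


def min_accelerators(params_billion: float, bytes_per_param: int) -> int:
    """Smallest number of accelerators whose combined memory holds the weights alone."""
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    weights_gb = params_billion * bytes_per_param
    return math.ceil(weights_gb / HBM_CAPACITY_GB)


if __name__ == "__main__":
    # BF16 stores 2 bytes per parameter, FP8 stores 1 byte per parameter
    for name, size_b in [("7B model", 7), ("13B model", 13), ("180B model", 180)]:
        bf16 = min_accelerators(size_b, 2)
        fp8 = min_accelerators(size_b, 1)
        print(f"{name}: {bf16} accelerator(s) in BF16, {fp8} in FP8")
```

By this estimate, the weights of a 180B-parameter model need at least three accelerators in BF16 and two in FP8, while 7B and 13B models fit on a single one. Real deployments need additional headroom beyond the weights.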
Intel states that the Gaudi 3 MMEs are capable of running 64,000 operations in parallel. The Intel Gaudi software integrates the PyTorch framework and is said to offer optimized, community-based Hugging Face models, which Intel describes as the most common AI framework for GenAI developers today. This lets GenAI developers work at a high level of abstraction, which improves ease of use and productivity and makes it easier to port models across different hardware types.
Intel also emphasizes Gaudi 3’s speed compared with NVIDIA’s H100 and H200 accelerators. On average, performance is said to be 1.7x better. The comparison uses the Llama 2 model with 7B parameters, Llama 2 with 13B parameters and Falcon with 180B parameters; the GPT-3 model with 175B parameters was additionally used against the H100. According to Intel, Gaudi 3 beat the H100 in all of these models and the H200 in the Falcon model. Intel also claims a lead in power efficiency.
Gaudi 3 is offered in three versions. The first is the Intel Gaudi 3 AI Accelerator HL-325L OAM Mezzanine Card, which is rated at 900 W on paper. The second is the Intel Gaudi 3 AI Accelerator HLB-325 Baseboard, which has a stated TDP of 7.6 kW, but also carries 8 HL-325L OAMs. Finally, there is the Intel Gaudi 3 AI Accelerator HL-338 PCIe Add-In Card, which has a TDP of only 600 W.
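The baseboard figure can be checked against the per-module rating with simple arithmetic. The sketch below uses only the TDP values quoted above; reading the remainder as the budget for everything on the baseboard other than the modules is an assumption, not something Intel states here.

```python
OAM_TDP_W = 900          # per OAM module, from the article
OAMS_PER_BASEBOARD = 8   # modules on one baseboard, from the article
BASEBOARD_TDP_W = 7600   # 7.6 kW baseboard TDP, from the article

# Combined TDP of all eight modules
oam_total_w = OAM_TDP_W * OAMS_PER_BASEBOARD

# Assumption: whatever is left covers the rest of the baseboard
remainder_w = BASEBOARD_TDP_W - oam_total_w

if __name__ == "__main__":
    print(f"Eight OAMs: {oam_total_w} W")
    print(f"Remaining baseboard budget: {remainder_w} W")
```

The eight modules account for 7.2 kW of the 7.6 kW, which leaves 400 W for the rest of the board under this reading.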
The air-cooled version of the Gaudi 3 accelerator will ship in the second quarter of 2024 to OEMs including Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro, in industry-standard configurations of Universal Baseboard and Open Accelerator Module (OAM). General availability is planned for the third quarter of 2024. The Intel Gaudi 3 PCIe add-in card is expected to be available in the last quarter of 2024.
Source: Intel