The article discusses the role of Multi-Layer Perceptrons (MLPs) in Machine Learning (ML) and Artificial Intelligence (AI), and their implementation on Intel GPUs. MLPs, characterized by their fully connected layers, are universal approximators, capable of approximating any continuous function. The article also explains the implementation of fully-fused MLPs on Intel GPUs, which fuses all layers into a single kernel so that intermediate data stays in faster memories instead of being written back to global memory between layers. The SYCL implementation for Intel GPUs targets MLPs with arbitrary depth and a fixed layer width. The fully-fused approach significantly increases arithmetic intensity and performance, outperforming the Intel Extension for PyTorch (IPEX) as well as the CUDA PyTorch version running on Nvidia's H100 GPU.
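The fusion idea can be illustrated independently of the article's SYCL code: rather than launching one kernel per layer and materializing each layer's activations in global memory, all layers are evaluated in a single pass while the activation vector for one input stays in a small local buffer. The sketch below is plain C++ rather than the article's actual implementation, and the layer width of 64 and the ReLU activation are illustrative assumptions only.

```cpp
#include <array>
#include <vector>
#include <algorithm>
#include <cstddef>

// Illustrative fixed layer width; the article targets fixed-width MLPs,
// but the value 64 here is an assumption for the sketch.
constexpr std::size_t kWidth = 64;

// Weight matrix for one hidden layer (width x width).
using Matrix = std::array<std::array<float, kWidth>, kWidth>;

// "Fused" forward pass for a single input: every hidden layer is applied
// inside one loop, and the activations never leave this small local
// buffer -- the CPU analogue of keeping data in registers / shared local
// memory instead of round-tripping through global memory between layers.
std::array<float, kWidth> fused_forward(const std::vector<Matrix>& layers,
                                        std::array<float, kWidth> act) {
    std::array<float, kWidth> next{};
    for (const Matrix& w : layers) {
        for (std::size_t i = 0; i < kWidth; ++i) {
            float sum = 0.0f;
            for (std::size_t j = 0; j < kWidth; ++j)
                sum += w[i][j] * act[j];
            next[i] = std::max(sum, 0.0f);  // ReLU, assumed for illustration
        }
        act = next;  // stays local; no per-layer write-back
    }
    return act;
}
```

In the GPU setting, each work-group would apply the same per-layer loop to a tile of the batch with the weights staged in fast memory; the sketch only conveys why fusing layers raises arithmetic intensity by avoiding slow-memory traffic between layers.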