
Google TPU and AWS Graviton are custom chips that the two cloud providers designed in-house for their own data centers. Google TPU is a tensor processing unit that performs large-scale matrix operations for deep learning models, such as those used in natural language processing, computer vision, and recommender systems. AWS Graviton, by contrast, is an ARM-based general-purpose processor rather than a dedicated AI accelerator. It runs a wide range of general-purpose, compute-optimized, memory-optimized, and storage-optimized workloads, such as application servers, video encoding, high-performance computing, CPU-based machine learning inference, and open-source databases. Both companies position their chips as offering performance and cost benefits over traditional x86-based processors from Intel and AMD.

To better compete with AWS and Google, Microsoft is working on its own custom AI chip. The Information first reported on this project today. The new AI chip, code-named Athena, has been under development since 2019 and can be used for training large language models and supporting inference. According to the report, the new AI chips are already available to a small group of Microsoft and OpenAI employees for testing. Microsoft expects its own AI chip to outperform NVIDIA’s chips in both cost and performance.

Due to the sudden rise in the popularity of AI services, Microsoft has accelerated the development of its custom AI chip, which is expected to be available for deployment by next year.

It will be interesting to see how the partnership between Microsoft and NVIDIA evolves, since the two companies work closely together and NVIDIA’s AI chips already power Microsoft’s services worldwide.