Microsoft ONNX Runtime

Last year at Build, Microsoft announced Hybrid Loop, a new development pattern that enables hybrid AI scenarios spanning Azure and client devices. At Build 2023, Microsoft announced that ONNX Runtime and the Olive toolchain will allow developers to easily build Hybrid Loop-based AI apps for Windows.

With ONNX Runtime, developers can run AI models on Windows and other devices across CPU, GPU, and NPU, or in hybrid execution with Azure. ONNX Runtime also exposes the same API whether a model runs on the device or in the cloud, allowing developers to build hybrid inferencing scenarios.

With the new Azure Execution Provider (Azure EP) preview, developers can connect to models deployed in Azure Machine Learning, or even to the Azure OpenAI Service. Azure EP also lets developers choose at runtime between a larger model in the cloud and a smaller model on the local device.

Olive is an extensible toolchain that combines cutting-edge techniques for model compression, optimization, and compilation. It simplifies the optimization process and eliminates the need for deep hardware knowledge when optimizing models for the variety of Windows and other devices across CPU, GPU, and NPU.
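Olive workflows are typically described in a JSON configuration that names the input model and the optimization passes to apply. The fragment below is a hypothetical sketch only: the exact schema and pass names (such as `OnnxQuantization` here) vary across Olive versions, so consult the Olive documentation for the version you install.

```json
{
  "input_model": {
    "type": "ONNXModel",
    "config": { "model_path": "model.onnx" }
  },
  "passes": {
    "quantize": { "type": "OnnxQuantization" }
  },
  "engine": { "output_dir": "optimized_model" }
}
```

Given such a config, Olive runs the listed passes and emits an optimized model ready to load with ONNX Runtime.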

Hybrid Loop development, enabled by the Olive toolchain and ONNX Runtime, makes it easier for developers to create compelling AI experiences on Windows and other platforms with less engineering effort and better performance.

Because ONNX Runtime and the Olive toolchain are also cross-platform, developers can easily bring their AI experiences into their apps across Windows, Android, iOS, and Linux. The new Olive toolchain is now in preview.