At Computex 2024, NVIDIA and Microsoft announced a collaboration on a new Application Programming Interface (API). The API will let developers run AI-accelerated applications on RTX graphics cards, including the Small Language Models (SLMs) that form part of the Copilot runtime and underpin features such as Recall and Live Captions.
The new API will allow these applications to run locally on the GPU instead of the Neural Processing Unit (NPU). That opens the door to more powerful AI applications, since GPUs generally offer far more AI compute than NPUs, and it extends these features to PCs that don't currently fall under the Copilot+ umbrella.
In addition to GPU execution, the new API brings retrieval-augmented generation (RAG) capabilities to the Copilot runtime. RAG lets an AI model pull in specific information stored locally, so it can give more relevant, context-aware answers.
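Conceptually, RAG works by retrieving relevant local documents and prepending them to the model's prompt before generation. Here is a minimal sketch of that idea, with a toy keyword-overlap retriever standing in for a real embedding index; the document store, scoring function, and prompt format are all illustrative assumptions, not part of any announced API:

```python
# Illustrative RAG sketch: retrieve local context, then augment the prompt.
# A production system would use an embedding index, not word overlap.

def retrieve(query, documents, top_k=2):
    """Rank local documents by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query, documents):
    """Augment the user's question with locally retrieved context."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical local data the model could not otherwise know about.
docs = [
    "The meeting notes from Tuesday cover the Q3 budget.",
    "Live Captions transcribes audio in real time.",
    "The cafeteria menu changes every Monday.",
]

prompt = build_prompt("What do the meeting notes cover?", docs)
print(prompt)
```

The augmented prompt, rather than the bare question, is what gets sent to the language model, which is how a small local model can answer questions about files it was never trained on.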
Alongside the API, NVIDIA announced the RTX AI Toolkit at Computex. This developer suite, set to arrive in June, bundles tools and SDKs for optimizing AI models for specific applications. NVIDIA claims that with the RTX AI Toolkit, developers can make models four times faster and three times smaller than with open-source alternatives.
Read more: www.digitaltrends.com