GPU as a Service
Access GPU power faster, more intuitively and more efficiently, in a way that fits your business needs.
Dedicated GPU Resources. No Compromise.
GPUs excel at executing many tasks in parallel, which lets them handle the computational demands of high-volume data processing and analysis. This offering leverages NVIDIA’s Multi-Instance GPU (MIG) technology, a hardware-level feature capable of securely dividing a physical NVIDIA H200 GPU into up to seven independent, isolated instances. Unlike software-based virtualisation, each MIG instance has its own dedicated compute cores, high-bandwidth memory, and cache. This guarantees Quality of Service (QoS), ensuring that a client’s workload performance is predictable and unaffected by other tenants on the same physical hardware.
The entire environment is orchestrated by the NVIDIA AI Enterprise software suite, which provides the essential components for a seamless user experience.
On Demand
On-Demand is the most flexible model for accessing H200 GPU power. Clients pay only for the compute time they use, billed by the hour, with no upfront costs or long-term contracts.
This plan is best suited for:
- New customers who want to try the platform without committing to a long-term plan.
- Businesses with unpredictable workloads who need a plan that can adapt to their company’s demands.
- Developers and researchers who are running short-term tests and prototyping new models.
Reserved
The Reserved model is a strategic commitment. Clients who commit to a specific amount of H200 GPU capacity for a term of 1 to 5 years receive a substantial discount on the hourly rate. The commitment also guarantees that the capacity they have paid for will always be available for their workloads.
This plan is best suited for:
- Businesses with stable production workloads that require applications to run around the clock.
- Budget-conscious organizations that want highly predictable and manageable costs.
- Companies with mission-critical applications and AI services that can’t be without compute power.
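To make the trade-off between On-Demand and Reserved concrete, here is a minimal sketch of the arithmetic. It assumes that Reserved capacity is paid for every hour of the term whether or not it is used, which is our reading of "commitment" rather than a published billing rule. The function names and the rates in the usage note are hypothetical placeholders, not actual prices.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def on_demand_cost(rate_per_hour: float, hours_used: float) -> float:
    """On-Demand: pay only for the hours actually used."""
    return rate_per_hour * hours_used


def reserved_cost(rate_per_hour: float, term_years: int) -> float:
    """Reserved (assumption): every hour of the term is billed, used or not."""
    return rate_per_hour * HOURS_PER_YEAR * term_years


def break_even_utilisation(on_demand_rate: float, reserved_rate: float) -> float:
    """Fraction of the term a GPU must be busy before Reserved becomes cheaper.

    Derived from: on_demand_rate * utilisation * hours == reserved_rate * hours.
    """
    return reserved_rate / on_demand_rate
```

With hypothetical rates of 4.00 per hour On-Demand and 2.60 per hour Reserved, `break_even_utilisation(4.00, 2.60)` comes out at about 0.65: under these assumptions, a workload that keeps the GPU busy more than roughly 65% of the time is cheaper on Reserved, while a lighter or unpredictable workload is cheaper on On-Demand.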
Spot
Take advantage of our unused GPU capacity. Spot is the most cost-effective way to access powerful H200 GPU resources. The most critical aspect of Spot instances is that they can be interrupted, or “pre-empted”, at very short notice: if we need that capacity to serve an On-Demand or Reserved customer, the Spot instance will be terminated.
This plan is best suited for:
- Highly cost-sensitive projects, such as start-ups or academic research, that need to process massive amounts of data on a tight budget.
- Fault-tolerant tasks that can be paused and resumed (e.g. video rendering or scientific simulations).
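What "fault-tolerant" means in practice for a Spot workload can be sketched as a job that saves its progress after every unit of work and picks up from the last saved point when it is restarted. The example below is a generic illustration, not a platform API. The function `run_job`, the checkpoint file layout and the `stop_after` parameter (which only simulates a pre-emption) are all hypothetical, and because the way a pre-emption is signalled is not described here, the sketch relies on frequent checkpoints instead of a termination signal.

```python
import json
import os


def run_job(total_steps: int, checkpoint_path: str, stop_after=None) -> dict:
    """Run steps 0..total_steps-1, resuming from a checkpoint if one exists.

    stop_after is a demonstration aid: the job returns early at that step,
    as if the Spot instance had been pre-empted.
    """
    state = {"next_step": 0, "result": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the last saved step

    for step in range(state["next_step"], total_steps):
        if stop_after is not None and step >= stop_after:
            return state  # simulated pre-emption

        state["result"] += step  # placeholder for real work (a frame, a batch)
        state["next_step"] = step + 1

        # Write to a temporary file, then rename, so an interruption
        # mid-write cannot leave a corrupt checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return state
```

Calling `run_job(10, "ckpt.json", stop_after=4)` and then `run_job(10, "ckpt.json")` completes all ten steps without repeating the first four. In a real deployment the checkpoint would need to live on storage that outlives the terminated instance.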