GPU/PCI capability information in the Offer
This GAP introduces a capability description for GPU (and possibly other PCI) devices. Providers specify this information in their Offers so that Requestors can express specific processing requirements in their Demands.
Namespace that describes GPU capabilities
Property | Type | Applies to | Description |
---|---|---|---|
`golem.!exp.gap-35.v1.inf.gpu.model` | `string` | Offer | Indicates the name of the GPU model |
Namespace that describes GPU CUDA capabilities
Property | Type | Applies to | Description |
---|---|---|---|
`golem.!exp.gap-35.v1.inf.gpu.cuda.enabled` | `boolean` | Offer | Indicates whether CUDA is enabled. This relates to the image being run, not to the VM runtime. |
`golem.!exp.gap-35.v1.inf.gpu.cuda.cores` | `integer` | Offer | The number of CUDA cores. This can be taken from the general specification of each GPU; otherwise it is platform specific, which we would prefer to avoid. We should aim for platform-independent metrics that are easy to access in a health check. |
`golem.!exp.gap-35.v1.inf.gpu.cuda.version` | `string` | Offer | A version of CUDA that is guaranteed to work. The GPU itself does not know the range of compatible versions: there is a minimum CUDA version, and support can also be deprecated in a future CUDA version. Compatibility can also differ between CUDA and a given library; for example, TensorFlow had different GPU compatibility requirements for building from source and for using prebuilt binaries. Get the information from Nvidia and/or a general guideline from StackOverflow. The property is a version string consisting of a major and a minor version in `#.#` format. |
`golem.!exp.gap-35.v1.inf.gpu.cuda.compute-capability` | `string` | Offer | Every Nvidia GPU has a Compute Capability level, which identifies the CUDA features it supports. The property is a version string consisting of a major and a minor version in `#.#` format. |
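
Because both `cuda.version` and `cuda.compute-capability` are `#.#` strings, comparing them as plain strings gives wrong results (for example, `"10.0"` sorts before `"9.0"`). The following is a minimal sketch of a numeric comparison; the helper names `parse_version` and `meets_minimum` are hypothetical and not part of this GAP.

```python
from typing import Tuple


def parse_version(value: str) -> Tuple[int, int]:
    """Parse a '#.#' version string into a (major, minor) tuple of ints."""
    major, minor = value.strip().split(".")
    return int(major), int(minor)


def meets_minimum(offered: str, required: str) -> bool:
    """Return True if the offered version is at least the required one."""
    return parse_version(offered) >= parse_version(required)


# Hypothetical Offer fragment; the values are illustrative only.
offer = {
    "golem.!exp.gap-35.v1.inf.gpu.cuda.version": "11.8",
    "golem.!exp.gap-35.v1.inf.gpu.cuda.compute-capability": "8.6",
}

# Check the offered compute capability against a required minimum of 7.5.
ok = meets_minimum(
    offer["golem.!exp.gap-35.v1.inf.gpu.cuda.compute-capability"], "7.5"
)
```

Comparing tuples of integers keeps `8.10` above `8.6`, which a floating-point or string comparison would get wrong.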
Namespace that describes information about GPU clocks
Property | Type | Applies to | Description |
---|---|---|---|
`golem.!exp.gap-35.v1.inf.gpu.clocks.graphics.mhz` | `integer` | Offer | The maximum rate of the graphics clock as reported by nvidia-smi |
`golem.!exp.gap-35.v1.inf.gpu.clocks.memory.mhz` | `integer` | Offer | The maximum rate of the memory clock as reported by nvidia-smi |
`golem.!exp.gap-35.v1.inf.gpu.clocks.sm.mhz` | `integer` | Offer | The maximum rate of the streaming multiprocessor clock as reported by nvidia-smi. CUDA cores are driven by this clock. |
`golem.!exp.gap-35.v1.inf.gpu.clocks.video.mhz` | `integer` | Offer | The maximum rate of the video clock as reported by nvidia-smi |
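
As an illustration of how a Provider might fill in the clock properties, the sketch below maps one line of nvidia-smi CSV output to property names. The query shown in the comment and the sample values are assumptions, not something this GAP prescribes. The video clock is left out because the matching query field is not specified here.

```python
import csv
import io
from typing import Dict

PREFIX = "golem.!exp.gap-35.v1.inf.gpu.clocks"

# Assumed query (not defined by this GAP):
#   nvidia-smi --query-gpu=clocks.max.graphics,clocks.max.memory,clocks.max.sm \
#              --format=csv,noheader,nounits
# The order of FIELDS must match the order of the queried columns.
FIELDS = ("graphics", "memory", "sm")


def clocks_from_csv(output: str) -> Dict[str, int]:
    """Convert the first CSV row (the first GPU) into clock properties in MHz."""
    first_row = next(csv.reader(io.StringIO(output), skipinitialspace=True))
    return {
        f"{PREFIX}.{name}.mhz": int(value)
        for name, value in zip(FIELDS, first_row)
    }


# Illustrative sample output, one line per GPU; only the first line is used.
sample = "2100, 9501, 2100\n1800, 7000, 1800\n"
clock_properties = clocks_from_csv(sample)
```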
Namespace that describes information about GPU memory
Property | Type | Applies to | Description |
---|---|---|---|
`golem.!exp.gap-35.v1.inf.gpu.memory.bandwidth.gib` | `integer` | Offer | Optional. The theoretical maximum amount of data that the bus can handle per second |
`golem.!exp.gap-35.v1.inf.gpu.memory.total.gib` | `integer` | Offer | Indicates the amount of memory available to the GPU |
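
To illustrate how a Requestor might refer to these properties in a Demand, the sketch below builds a constraint expression. It assumes the LDAP-style filter syntax commonly used for Golem Demand constraints; the function name `gpu_constraints` and the chosen thresholds are hypothetical.

```python
NS = "golem.!exp.gap-35.v1.inf.gpu"


def gpu_constraints(min_memory_gib: int, require_cuda: bool = True) -> str:
    """Build an LDAP-style constraint string over the GPU properties.

    Assumption: Demand constraints are written as '(&(key>=value)(key=value))'.
    """
    clauses = [f"({NS}.memory.total.gib>={min_memory_gib})"]
    if require_cuda:
        clauses.append(f"({NS}.cuda.enabled=true)")
    # Combine all clauses with a logical AND.
    return "(&" + "".join(clauses) + ")"


# Example: demand at least 8 GiB of GPU memory with CUDA enabled.
constraints = gpu_constraints(8)
```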
This specification is based on the wish-list provided in #157.
For now, the assumption is that only the first PCI (possibly Nvidia) GPU is detected.
Other PCI (and similarly non-PCI) devices can be added as another sub-tree next to `golem.!exp.gap-35.v1.inf.gpu`.
If the `gpu` node is not present in the Offer, the Provider either does not have a GPU or does not allow using it.
If the `gpu` node is present but the Requestor does not demand it, it should be skipped in the Agreement.
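
The two rules above can be sketched as follows. This is only an illustration under stated assumptions: the Offer is modelled as a flat dictionary of properties, the Demand as a constraint string, and the helper names `has_gpu` and `agreement_properties` are hypothetical. How an agreement is actually negotiated is outside the scope of this sketch.

```python
from typing import Any, Dict

GPU_NODE = "golem.!exp.gap-35.v1.inf.gpu."


def has_gpu(offer: Dict[str, Any]) -> bool:
    """True if the Offer exposes any property under the gpu node."""
    return any(key.startswith(GPU_NODE) for key in offer)


def agreement_properties(offer: Dict[str, Any], demand_constraints: str) -> Dict[str, Any]:
    """Keep gpu properties only when the Demand refers to the gpu node."""
    if GPU_NODE in demand_constraints:
        return dict(offer)
    # The Requestor did not demand a GPU, so the gpu node is skipped.
    return {k: v for k, v in offer.items() if not k.startswith(GPU_NODE)}


# Hypothetical Offer; property values are illustrative only.
offer = {
    "golem.inf.mem.gib": 16,
    "golem.!exp.gap-35.v1.inf.gpu.model": "Example GPU",
    "golem.!exp.gap-35.v1.inf.gpu.memory.total.gib": 8,
}

without_gpu = agreement_properties(offer, "(golem.inf.mem.gib>=4)")
with_gpu = agreement_properties(
    offer, "(golem.!exp.gap-35.v1.inf.gpu.memory.total.gib>=8)"
)
```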
TBD
TBD
TBD
Copyright and related rights waived via CC0.