30 series cards use compute capability 8.6.
40 series cards use compute capability 8.9.
The enhancements of compute capability 8.9 over compute capability 8.6 are as follows:
- FP32 Operations: Devices of compute capability 8.9 have 2x more FP32 operations per cycle per SM than devices of compute capability 8.0, which is the same per-SM rate as 8.6 [1][2]. While a binary compiled for 8.0 will run as-is on 8.9, it is recommended to compile explicitly for 8.9 to benefit from the increased FP32 throughput [1][2].
- Tensor Core Operations: The NVIDIA Ada GPU architecture includes new Ada Fourth Generation Tensor Cores featuring the Hopper FP8 Transformer Engine [2].
- Memory System: Compute capability 8.9 has an increased L2 capacity [2].
Perhaps a new version needs to be compiled that supports only compute capability 8.9 and upwards, so that it is better optimized for the 40 series?
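
For reference, a build that targets the 40 series natively would pass a `-gencode` entry for that architecture to `nvcc`. Here is a minimal sketch of generating those flags, assuming the project is built with `nvcc`. The helper name `gencode_flags` is made up for illustration and is not part of any existing build script.

```python
def gencode_flags(capabilities, ptx_for_newest=True):
    """Build nvcc -gencode flags for compute capabilities given as strings like "8.9".

    Hypothetical helper for illustration only; assumes plain "major.minor" input.
    """
    # "8.9" -> "89"; drop duplicates and order from oldest to newest architecture.
    archs = sorted({cap.replace(".", "") for cap in capabilities}, key=int)

    # One native (SASS) target per requested architecture.
    flags = [f"-gencode=arch=compute_{a},code=sm_{a}" for a in archs]

    # Optionally embed PTX for the newest architecture so that later GPUs
    # can still JIT-compile the kernels.
    if ptx_for_newest and archs:
        newest = archs[-1]
        flags.append(f"-gencode=arch=compute_{newest},code=compute_{newest}")
    return flags


if __name__ == "__main__":
    # 8.9-only build, as suggested above.
    print(" ".join(gencode_flags(["8.9"])))
    # Combined build that keeps native 30 series support.
    print(" ".join(gencode_flags(["8.6", "8.9"])))
```

Passing only `"8.9"` gives the 8.9-and-upwards build asked about here, while passing both capabilities keeps 30 series cards on a native target. Whether the 8.9 target is measurably faster than the existing build on 40 series cards would still need benchmarking.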