Here is a comparative GFLOPS table of recent AMD Radeon and NVIDIA GeForce GPUs in FP32 (single-precision floating point) and FP64 (double-precision floating point). I compiled into a single table the values I found in various articles and reviews around the web.
GPU | FP32 GFLOPS | FP64 GFLOPS | Ratio |
GeForce RTX 3090 | 35580 | 556 | FP64 = 1/64 FP32 |
GeForce RTX 3080 | 29770 | 465 | FP64 = 1/64 FP32 |
Radeon RX 6900 XT | 23040 | 1440 | FP64 = 1/16 FP32 |
Radeon RX 6800 XT | 20740 | 1296 | FP64 = 1/16 FP32 |
GeForce RTX 3070 | 20310 | 317 | FP64 = 1/64 FP32 |
GeForce RTX 3060 Ti | 16200 | 253 | FP64 = 1/64 FP32 |
Radeon RX 6800 | 16170 | 1010 | FP64 = 1/16 FP32 |
TITAN V | 13800 | 6900 | FP64 = 1/2 FP32 |
GeForce RTX 2080 Ti | 13450 | 420 | FP64 = 1/32 FP32 |
Radeon RX 6700 XT | 13200 | 825 | FP64 = 1/16 FP32 |
GeForce RTX 3060 | 12740 | 199 | FP64 = 1/64 FP32 |
Radeon RX Vega 64 | 12700 | 790 | FP64 = 1/16 FP32 |
GeForce GTX 1080 Ti | 11340 | 354 | FP64 = 1/32 FP32 |
GeForce GTX 1080 Ti | 11300 | 350 | FP64 = 1/32 FP32 |
Radeon R9 295X2 | 11264 | 1408 | FP64 = 1/8 FP32 |
TITAN X Pascal | 11000 | 340 | FP64 = 1/32 FP32 |
Radeon RX Vega 56 | 10500 | 656 | FP64 = 1/16 FP32 |
GeForce RTX 2080 | 10070 | 314 | FP64 = 1/32 FP32 |
Radeon RX 5700 XT | 9754 | 609 | FP64 = 1/16 FP32 |
GeForce RTX 2070 SUPER | 9062 | 283 | FP64 = 1/32 FP32 |
GeForce GTX 1080 | 8873 | 277 | FP64 = 1/32 FP32 |
Radeon R9 Fury X | 8600 | 537 | FP64 = 1/16 FP32 |
GeForce GTX 1070 Ti | 8100 | 253 | FP64 = 1/32 FP32 |
Radeon HD 7990 | 7782 | 1946 | FP64 = 1/4 FP32 |
GeForce RTX 2070 | 7465 | 233 | FP64 = 1/32 FP32 |
Radeon RX 5600 XT | 7188 | 449 | FP64 = 1/16 FP32 |
GeForce GTX 1070 | 6463 | 202 | FP64 = 1/32 FP32 |
GeForce RTX 2060 | 6451 | 201 | FP64 = 1/32 FP32 |
Radeon RX 580 | 6175 | 385 | FP64 = 1/16 FP32 |
GeForce GTX 980 Ti | 6060 | 189 | FP64 = 1/32 FP32 |
Radeon RX 480 | 5834 | 364 | FP64 = 1/16 FP32 |
GeForce GTX Titan Black | 5645 | 1881 | FP64 = 1/3 FP32 |
Radeon R9 290X | 5632 | 704 | FP64 = 1/8 FP32 |
GeForce GTX 690 | 5622 | 234 | FP64 = 1/24 FP32 |
GeForce GTX 1660 Ti | 5437 | 169 | FP64 = 1/32 FP32 |
GeForce GTX 780 Ti | 5345 | 223 | FP64 = 1/24 FP32 |
Radeon RX 5500 XT | 5196 | 324 | FP64 = 1/16 FP32 |
Radeon HD 6990 | 5099 | 1276 | FP64 = 1/4 FP32 |
Radeon RX 570 | 5095 | 318 | FP64 = 1/16 FP32 |
GeForce GTX 1660 SUPER | 5027 | 157 | FP64 = 1/32 FP32 |
GeForce GTX 1660 | 5027 | 157 | FP64 = 1/32 FP32 |
GeForce GTX 980 | 4981 | 156 | FP64 = 1/32 FP32 |
Radeon RX 470 | 4900 | 306 | FP64 = 1/16 FP32 |
Radeon R9 290 | 4849 | 606 | FP64 = 1/8 FP32 |
GeForce GTX Titan | 4709 | 1523 | FP64 = 1/3 FP32 |
GeForce GTX 1650 SUPER | 4416 | 138 | FP64 = 1/32 FP32 |
GeForce GTX 1060 | 4400 | 137 | FP64 = 1/32 FP32 |
GeForce GTX 1060 6GB | 4375 | 136 | FP64 = 1/32 FP32 |
Radeon HD 7970 GHz | 4301 | 1075 | FP64 = 1/4 FP32 |
GeForce GTX 780 | 4156 | 173 | FP64 = 1/24 FP32 |
Radeon R9 280X | 4096 | 1024 | FP64 = 1/4 FP32 |
GeForce GTX 970 | 3920 | 122 | FP64 = 1/32 FP32 |
Radeon HD 7970 | 3789 | 947 | FP64 = 1/4 FP32 |
Radeon R9 280 | 3344 | 836 | FP64 = 1/4 FP32 |
Radeon HD 7950 Boost | 3315 | 828 | FP64 = 1/4 FP32 |
GeForce GTX 770 | 3210 | 134 | FP64 = 1/24 FP32 |
GeForce GTX 680 | 3090 | 129 | FP64 = 1/24 FP32 |
GeForce GTX 1650 | 2984 | 93 | FP64 = 1/32 FP32 |
Radeon HD 7950 | 2867 | 717 | FP64 = 1/4 FP32 |
Radeon HD 5870 | 2720 | 544 | FP64 = 1/5 FP32 |
Radeon HD 6970 | 2703 | 675 | FP64 = 1/4 FP32 |
Radeon R9 270X | 2688 | 168 | FP64 = 1/16 FP32 |
Radeon RX 560 | 2611 | 163 | FP64 = 1/16 FP32 |
Radeon HD 7870 | 2560 | 160 | FP64 = 1/16 FP32 |
GeForce GTX 590 | 2488 | 311 | FP64 = 1/8 FP32 |
GeForce GTX 670 | 2460 | 102 | FP64 = 1/24 FP32 |
GeForce GTX 660 Ti | 2460 | 102 | FP64 = 1/24 FP32 |
Radeon R9 270 | 2368 | 148 | FP64 = 1/16 FP32 |
GeForce GTX 760 | 2258 | 94 | FP64 = 1/24 FP32 |
Radeon HD 6950 | 2253 | 563 | FP64 = 1/4 FP32 |
GeForce GTX 1050 Ti | 2138 | 66 | FP64 = 1/32 FP32 |
Radeon HD 5850 | 2088 | 417 | FP64 = 1/5 FP32 |
Radeon R7 260X | 1971 | 123 | FP64 = 1/16 FP32 |
Radeon R7 265 | 1894 | 118 | FP64 = 1/16 FP32 |
GeForce GTX 660 | 1882 | 78 | FP64 = 1/24 FP32 |
GeForce GTX 1050 | 1862 | 58 | FP64 = 1/32 FP32 |
Radeon HD 7790 | 1792 | 128 | FP64 = 1/14 FP32 |
Radeon HD 7850 | 1761 | 110 | FP64 = 1/16 FP32 |
GeForce GTX 580 | 1581 | 197 | FP64 = 1/8 FP32 |
Radeon R7 260 | 1536 | 96 | FP64 = 1/16 FP32 |
GeForce GTX 650 Ti Boost | 1505 | 62 | FP64 = 1/24 FP32 |
GeForce GTX 650 Ti | 1425 | 60 | FP64 = 1/24 FP32 |
GeForce GTX 570 | 1405 | 175 | FP64 = 1/8 FP32 |
GeForce GTX 750 Ti | 1389 | 43 | FP64 = 1/32 FP32 |
Radeon HD 7770 GHz | 1280 | 80 | FP64 = 1/16 FP32 |
Radeon R7 250X | 1280 | 80 | FP64 = 1/16 FP32 |
Radeon RX 550 | 1211 | 75 | FP64 = 1/16 FP32 |
GeForce GT 1030 | 1127 | 35 | FP64 = 1/32 FP32 |
GeForce GTX 750 | 1110 | 34 | FP64 = 1/32 FP32 |
GeForce GTX 650 | 812 | 33 | FP64 = 1/24 FP32 |
Radeon R7 250 | 806 | 50 | FP64 = 1/16 FP32 |
GeForce GT 1010 | 751 | 31 | FP64 = 1/24 FP32 |
GeForce GT 730 | 692 | 28 | FP64 = 1/24 FP32 |
Radeon R7 240 | 500 | 31 | FP64 = 1/16 FP32 |
GeForce GT 710 | 366 | 15 | FP64 = 1/24 FP32 |
GeForce GT 210 | 39 | N.A. | N.A. |
Workstation cards:
GPU | FP32 GFLOPS | FP64 GFLOPS | Ratio |
FirePro W9100 | 5240 | 2620 | FP64 = 1/2 FP32 |
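The Ratio column ties the two GFLOPS columns together: FP64 GFLOPS is simply FP32 GFLOPS divided by the ratio's denominator. A minimal Python sketch of that arithmetic, using a few rows copied from the tables above (the `fp64_from_ratio` helper name is made up for illustration):

```python
# Rows copied from the tables above:
# (GPU, FP32 GFLOPS, listed FP64 GFLOPS, denominator N of the "1/N" ratio).
ROWS = [
    ("GeForce RTX 3090", 35580, 556, 64),
    ("Radeon RX 6900 XT", 23040, 1440, 16),
    ("TITAN V", 13800, 6900, 2),
    ("FirePro W9100", 5240, 2620, 2),
]


def fp64_from_ratio(fp32_gflops, denominator):
    """Derive FP64 GFLOPS from FP32 GFLOPS and the 1/N ratio."""
    return fp32_gflops / denominator


for name, fp32, listed_fp64, denom in ROWS:
    derived = fp64_from_ratio(fp32, denom)
    print(f"{name}: listed {listed_fp64}, derived {derived:.0f}")
```

The derived values match the listed ones after rounding, which suggests the FP64 column was computed from the FP32 column rather than measured separately.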
Which FP32/FP64 benchmark tool did you use? AIDA64?
The values don’t come from a benchmark tool; it’s just a compilation from articles and reviews.
So basically theoretical peak performance instead of actual peak performance.
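For context, a theoretical peak figure of this kind is normally derived from the spec sheet as shader count × clock × 2, since one fused multiply-add counts as two floating-point operations. A minimal Python sketch, assuming published specs that are not part of the table above (10496 CUDA cores at a 1695 MHz boost clock for the RTX 3090, 5120 stream processors at a 2250 MHz boost clock for the RX 6900 XT); the function name is hypothetical:

```python
def peak_fp32_gflops(shader_count, clock_mhz, flops_per_cycle=2):
    """Theoretical FP32 peak in GFLOPS.

    flops_per_cycle=2 assumes each shader retires one FMA per clock,
    and that an FMA is counted as two floating-point operations.
    """
    return shader_count * clock_mhz * flops_per_cycle / 1000.0


# Spec-sheet values below are assumptions, not taken from the table.
rtx_3090 = peak_fp32_gflops(shader_count=10496, clock_mhz=1695)
rx_6900_xt = peak_fp32_gflops(shader_count=5120, clock_mhz=2250)

print(f"RTX 3090:   {rtx_3090:.0f} GFLOPS")
print(f"RX 6900 XT: {rx_6900_xt:.0f} GFLOPS")
```

Both results land within a few GFLOPS of the table entries (35580 and 23040), which is consistent with the table listing spec-derived peaks rather than measurements.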
@DrBalthar – define “actual peak performance”
I believe you are referring to typical/average real-life performance, but that depends strongly on the app you test it with.
Well, not really. You can write very simple code that just does a MAD operation (as that’s the one they usually use for advertising FLOPS).
“Well not really you can write very simple code that just does a MAD operation (as that’s the one the usually use for advertising FLOPs)”
No, it was FPMADD, but since all new GPUs use FMA, which does basically the same thing with a more accurate final result, it’s just as good. And by the way, it wouldn’t be the best possible benchmark of “peak” performance, since many apps use only certain types of calculations. Bitcoin miners, IIRC, used plenty of integer and bit operations, which were much faster on AMD GPUs than on NVIDIA’s. So define a universal “actual peak performance” then? There’s no such thing.
The F in FLOP stands for floating point, so integer and bit operations are irrelevant. Using just FMA or FPMADD would still be the fairest test, as it is the only operation used, so there wouldn’t be any difference or cheating. Actual peak performance doesn’t mean real-life application performance. It just means actually validated numbers and not just marketing material.
@DrBalthar – sorry, but I disagree. Test results with full FMA coverage on all ALUs would be just as artificial as “theoretical peak performance” figures: they just don’t translate into real-world performance. Some apps use more data parallelism, some more task parallelism, some use both float and integer intensively, and others are focused on FP64 only. Some share a lot of the calculation with the CPU and some run solely on the GPU. FP MADD/FMA test results would mean absolutely nothing. For the very same reason, there’s no “actual peak performance” benchmark for x86-64 CPUs. But you do get a peek at real-world performance with the Linpack benchmark. Check top500.org, which lists both Rmax (Linpack results) and Rpeak (theoretical performance) numbers. In mixed GPU and CPU clusters the Rmax/Rpeak ratio is close to 60-70%. On CPU-only clusters it reaches 80-95%.
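The Rmax/Rpeak comparison above is just a ratio. A minimal Python sketch, with made-up figures chosen only to fall inside the ranges quoted in the comment (they are not taken from any real top500.org entry):

```python
def linpack_efficiency(rmax, rpeak):
    """Fraction of theoretical peak (Rpeak) achieved by Linpack (Rmax).

    Both values must use the same unit, e.g. TFLOPS.
    """
    if rpeak <= 0:
        raise ValueError("rpeak must be positive")
    return rmax / rpeak


# Hypothetical clusters, for illustration only.
mixed_gpu_cpu = linpack_efficiency(rmax=650.0, rpeak=1000.0)
cpu_only = linpack_efficiency(rmax=900.0, rpeak=1000.0)

print(f"Mixed GPU+CPU cluster: {mixed_gpu_cpu:.0%} of peak")
print(f"CPU-only cluster:      {cpu_only:.0%} of peak")
```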
GTX 690 @ ES BIOS 450W, PT 150%, +150 MHz, +605 MHz GDDR5
Base clock 1065 MHz – boost 1215 MHz – 7.2 GHz GDDR5
AIDA: 6.498 TFLOPS FP32 / 0.275 TFLOPS FP64
This is awesome. Thank you.
It’s necessary to take these numbers with a grain of salt, and it’s difficult to estimate just how much salt is needed for any given architecture.
I think a copy of the table that is sorted on FP64 would be useful as well.
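A sorted copy is easy to produce from the pipe-delimited rows. A minimal Python sketch, assuming the table is available as plain text in the layout shown above; only a handful of rows are included here, and the helper names are made up for illustration:

```python
# A few rows copied from the table above; paste the full table to sort it all.
TABLE = """\
GPU | FP32 GFLOPS | FP64 GFLOPS | Ratio |
GeForce RTX 3090 | 35580 | 556 | FP64 = 1/64 FP32 |
Radeon RX 6900 XT | 23040 | 1440 | FP64 = 1/16 FP32 |
TITAN V | 13800 | 6900 | FP64 = 1/2 FP32 |
Radeon HD 7990 | 7782 | 1946 | FP64 = 1/4 FP32 |
GeForce GT 210 | 39 | N.A. |
"""


def parse_rows(text):
    """Return (name, fp32, fp64) tuples; fp64 is None when listed as N.A."""
    rows = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) < 3 or not cells[1].isdigit():
            continue  # skip the header and any malformed line
        fp64 = int(cells[2]) if cells[2].isdigit() else None
        rows.append((cells[0], int(cells[1]), fp64))
    return rows


def sort_by_fp64(rows):
    """Sort by FP64 GFLOPS, highest first; rows without a value go last."""
    return sorted(rows, key=lambda row: (row[2] is None, -(row[2] or 0)))


for name, fp32, fp64 in sort_by_fp64(parse_rows(TABLE)):
    print(f"{name} | {fp32} | {fp64 if fp64 is not None else 'N.A.'}")
```

Sorted this way, cards with a 1/2 or 1/4 ratio move to the top even when their FP32 figures are far lower than those of newer cards.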