Matching SM Architectures (CUDA Arch and CUDA Gencode) For Various NVIDIA Cards - Blame Arnon

5/11/2018 Matching SM architectures (CUDA arch and CUDA gencode) for various NVIDIA cards - Blame Arnon
Posted by Arnon Shimoni
11/11/2016

I’ve seen some confusion regarding NVIDIA’s nvcc sm flags and what they’re used for:
When compiling with NVCC, the arch flag (‘-arch’) specifies the name of the NVIDIA GPU architecture that the CUDA files will be compiled for.
Gencodes (‘-gencode’) allow for more PTX generations, and can be repeated many times for different architectures.
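
To illustrate how the two flags relate: nvcc documents ‘-arch=sm_XX’ as shorthand for compiling against the virtual architecture compute_XX and embedding both the sm_XX binary and the compute_XX PTX. Below is a minimal Python sketch of that expansion. The helper name `expand_arch_shorthand` is made up for this example and is not part of any NVIDIA tool.

```python
import re

def expand_arch_shorthand(arch: str) -> str:
    """Return the -gencode flag equivalent to -arch=<arch> for a real arch such as 'sm_52'."""
    match = re.fullmatch(r"sm_(\d+)", arch)
    if match is None:
        raise ValueError(f"expected a real architecture like 'sm_52', got {arch!r}")
    cc = match.group(1)
    # One front-end (PTX) target, two back-end targets: the cubin and the PTX itself.
    return f"-gencode=arch=compute_{cc},code=[sm_{cc},compute_{cc}]"

print(expand_arch_shorthand("sm_52"))
```

When pasting the result into a shell, the bracketed list may need quoting so the shell does not try to expand it.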

When should different ‘gencodes’ or ‘cuda arch’ be used?
When you compile CUDA code, you should always pass exactly one ‘-arch’ flag, matching your most used GPU cards. This will enable faster runtime, because code generation will occur during compilation rather than when the program loads.
If the binary contains no precompiled code for the GPU it runs on (for example, because only PTX was embedded through ‘-gencode’), the GPU code generation is done at run time by the JIT compiler in the CUDA driver.
Sometimes, you would also like to enable some backwards compatibility. In those cases, you want to add some more ‘-gencode’ flags.

Find out which GPU you have, and which CUDA version you have first.
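
For the CUDA version part, one option is to read the release number that `nvcc --version` prints. The sketch below assumes nvcc is on the PATH and that its output contains a "release X.Y" fragment. The function names are hypothetical, and the sample line is only an example of the style nvcc prints. Finding the GPU model is not covered here.

```python
import re
import subprocess
from typing import Tuple

def parse_cuda_release(nvcc_output: str) -> Tuple[int, int]:
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))

def installed_cuda_release() -> Tuple[int, int]:
    """Run nvcc (assumed to be on PATH) and return its release as (major, minor)."""
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=True
    )
    return parse_cuda_release(result.stdout)

# Parsing a sample line, so the example runs on machines without CUDA installed.
sample = "Cuda compilation tools, release 9.0, V9.0.176"
print(parse_cuda_release(sample))
```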

Supported SM and Gencode variations
Below are the supported sm variations and sample cards from each generation.

Supported on CUDA 7 and later

Fermi (CUDA 3.2 and later, removed in CUDA 9):
SM20 or SM_20, compute_20 – Older cards such as GeForce 400, 500, 600, GT-630
Kepler (CUDA 5 and later):
SM30 or SM_30, compute_30 – Kepler architecture (generic – Tesla K40/K80, GeForce 700, GT-730)
    Adds support for unified memory programming
SM35 or SM_35, compute_35 – More specific Tesla K40
    Adds support for dynamic parallelism. Shows no real benefit over SM30 in my experience.
SM37 or SM_37, compute_37 – More specific Tesla K80
    Adds a few more registers. Shows no real benefit over SM30 in my experience.
http://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
Maxwell (CUDA 6 and later):
SM50 or SM_50, compute_50 – Tesla/Quadro M series
SM52 or SM_52, compute_52 – Quadro M6000, GeForce 900, GTX-970, GTX-980, GTX Titan X
SM53 or SM_53, compute_53 – Tegra (Jetson) TX1 / Tegra X1
Pascal (CUDA 8 and later):
SM60 or SM_60, compute_60 – GP100/Tesla P100 – DGX-1 (Generic Pascal)
SM61 or SM_61, compute_61 – GTX 1080, GTX 1070, GTX 1060, GTX 1050, GT 1030, Titan Xp, Tesla P40, Tesla P4
SM62 or SM_62, compute_62 – Drive-PX2, Tegra (Jetson) TX2, Denver-based GPU
Volta (CUDA 9 and later):
SM70 or SM_70, compute_70 – Tesla V100
SM71 or SM_71, compute_71 – probably not implemented
SM72 or SM_72, compute_72 – currently unknown
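
The list above can be turned into a small lookup. The following is only a sketch: it works at major-version granularity, so it ignores details such as Fermi needing CUDA 3.2 specifically, and it leaves out SM71 and SM72 because their status is uncertain. The table layout and the name `sms_for_cuda` are assumptions made for illustration.

```python
# (generation, first CUDA major version, CUDA major version that dropped it or None, SM numbers)
# Values are taken from the list above; only major versions are modelled.
GENERATIONS = [
    ("Fermi", 3, 9, (20,)),
    ("Kepler", 5, None, (30, 35, 37)),
    ("Maxwell", 6, None, (50, 52, 53)),
    ("Pascal", 8, None, (60, 61, 62)),
    ("Volta", 9, None, (70,)),
]

def sms_for_cuda(cuda_major: int) -> list:
    """Return the SM numbers from the list above that a given CUDA major version can target."""
    sms = []
    for _name, first, removed, numbers in GENERATIONS:
        if cuda_major < first:
            continue  # generation not yet supported by this toolkit
        if removed is not None and cuda_major >= removed:
            continue  # generation no longer supported by this toolkit
        sms.extend(numbers)
    return sms

print(sms_for_cuda(7))
print(sms_for_cuda(9))
```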
Sample flags
According to NVIDIA:
The arch= clause of the -gencode= command-line option to nvcc specifies the front-end
compilation target and must always be a PTX version. The code= clause specifies the back-
end compilation target and can either be cubin or PTX or both. Only the back-end target
version(s) specified by the code= clause will be retained in the resulting binary; at least one
must be PTX to provide Volta compatibility.
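
The rule in the quote can be checked mechanically. The sketch below splits a ‘-gencode’ flag into its front-end target and its back-end targets, rejects an arch= clause that is not a PTX (compute_XX) name, and reports which code= entries are cubin and which are PTX. The function name `classify_gencode` and the returned dictionary shape are hypothetical.

```python
import re

def classify_gencode(flag: str) -> dict:
    """Split '-gencode=arch=compute_XX,code=...' into front-end and back-end targets."""
    match = re.fullmatch(r"-gencode[= ]arch=(compute_\d+),code=(.+)", flag)
    if match is None:
        raise ValueError(f"not a -gencode flag with a compute_XX arch: {flag!r}")
    arch, code = match.groups()
    # code= may be a single name or a bracketed/quoted, comma-separated list.
    targets = [t for t in re.split(r'[,\[\]"\\]+', code) if t]
    for target in targets:
        if re.fullmatch(r"(sm|compute)_\d+", target) is None:
            raise ValueError(f"unexpected code= target: {target!r}")
    return {
        "arch": arch,                                            # front-end, always PTX
        "cubin": [t for t in targets if t.startswith("sm_")],    # real architectures
        "ptx": [t for t in targets if t.startswith("compute_")], # kept for JIT
    }

print(classify_gencode("-gencode=arch=compute_52,code=[sm_52,compute_52]"))
```

An empty "ptx" list across all flags would mean the binary carries no PTX, which is the case the quote warns about for newer GPUs.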
Sample flags for generation on CUDA 7 for maximum compatibility:
-arch=sm_30 \
-gencode=arch=compute_20,code=sm_20 \
...
-gencode=arch=compute_52,code=compute_52
Sample flags for generation on CUDA 8 for maximum compatibility:
-arch=sm_30 \
...
Sample flags for generation on CUDA 9 for maximum compatibility. Note the removed SM_20:
-arch=sm_30 \
...
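
The pattern visible in the flags above (one ‘-arch’, one cubin per architecture, and a final PTX entry for the newest architecture) can be generated rather than typed. This is a sketch only: the function name `max_compat_flags` is hypothetical, and the SM list passed in the usage line is an assumption chosen for illustration, not a reconstruction of the listings above.

```python
def max_compat_flags(sms, default_sm):
    """Build nvcc flags: one cubin per SM, plus PTX for the newest SM so later GPUs can JIT it."""
    sms = sorted(set(sms))
    if not sms:
        raise ValueError("at least one SM number is required")
    flags = [f"-arch=sm_{default_sm}"]
    # Machine code for every requested real architecture.
    flags += [f"-gencode=arch=compute_{sm},code=sm_{sm}" for sm in sms]
    # PTX for the newest architecture only, as the forward-compatibility fallback.
    newest = sms[-1]
    flags.append(f"-gencode=arch=compute_{newest},code=compute_{newest}")
    return flags

# Assumed SM list for illustration; pick the ones your CUDA version and cards need.
print(" \\\n".join(max_compat_flags([20, 30, 50, 52], default_sm=30)))
```

Embedding PTX only for the newest architecture keeps the binary smaller while still giving the driver something to JIT on GPUs released later.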

Alexander Stohr says: 02/12/2016 at 09:13
i can not find any hard information for term “SM62” on the web.
at least some are speculating that it is meant for Tegra.
what are your sources for your statements on “SM62”?

Arnon Shimoni says: 02/12/2016 at 11:38
You could be right… I’m not entirely sure

Yan says: 08/03/2017 at 10:00
Hi,
Then what happens if I only use the following at compile time
-gencode arch=compute_20,code=\”sm_20,compute_20\”
but run the compiled code on a 5.0 card? The JIT compiler will generate the GPU code, but is it going to compile with
-gencode arch=compute_50,code=\”sm_50,compute_50\”
I’ve been searching the web, but couldn’t find anything. Please advice.
Thanks,
Ian

Arnon Shimoni says: 09/03/2017 at 03:22
Hey Ian
If you’re compiling for a 5.0 card, the second option you suggested is better. If you have to have cross-compatibility, I’d recommend the first.

jg says: 24/05/2017 at 04:14
Thank you, very useful, what about sm_37 ?

Arnon Shimoni says: 24/05/2017 at 08:00
`sm_37` is for the Tesla K80 cards, but our experience proves that it’s not effective to compile for it specifically. sm_30 gives the same results and is better if you also have K40s or similar.

LostWorld says: 19/06/2017 at 17:57
kindly help me to find SM for GTX950 and compute_????

Arnon Shimoni says: 20/06/2017 at 01:30
-gencode=arch=compute_52,code=sm_52 – make sure you have CUDA 6.5 at least.

LostWorld says: 20/06/2017 at 03:26
thank you. so nice of u

Mandar Gogate says: 23/07/2017 at 15:46
Thank you.

Copyright © 2014 Arnon Shimoni Powered by ExpressCurate.
