Use CUDA 7.0 NVRTC with Thrust
Runtime Compilation (NVRTC) introduced in CUDA 7.0 allows to dynamically compile CUDA kernels during program execution (see example). This functionality allows to...
Runtime Compilation (NVRTC) introduced in CUDA 7.0 allows to dynamically compile CUDA kernels during program execution (see example). This functionality allows to...
Starting from GK110 (Tesla Kepler), “const restrict” annotation on kernel argument has an extra GPU-specific meaning: accesses to that argument should go through ...
Thrust is a very handy STL-like template library for rapid data processing on GPUs.
Largely thanks to LLVM, in recent years we’ve seen a significant increase of interest to domain-specific compilation tools research & development. With the re...
It is well-known that CUDA toolkit uses LLVM backend, but the used version number is not shown. We can use gdb and LLVM API function to print the version string:
In CUDA 6.0 release an extremely handy feature has been added to Visual Profiler: support for remote profiling. This means that you can run the profiler GUI from ...
OpenACC is known to be a fast method of developing quite efficient GPU-enabled applications. It is also possible to mix CUDA kernels and libraries with OpenACC ke...
Chart on the left shows the bandwidths of memory transfers on Jetson K1 (Click to enlarge). For the baseline we also added GTX680M’s host-device and device-host (...
We finally got the most wanted Jetson K1 board in the house! In this post we show how to turn a just unboxed tiny board into fully-functional CUDA development nod...
After installing CUDA driver from NVIDIA website, Ubuntu 13.04/14.04 window manager decorations (Unity, via Compiz) may stop working properly on Optimus machines ...