Dmitry Mikushin: Tags

AVX2 (1 post)

Building PyTorch without AVX2 on MacOS

In order to quickly explore PyTorch internals, I decided to compile and install a Debug build on my local machine. The first problem was that modern Clang surpris...

Apr 18, 2020 Building, Software Engineering, PyTorch, AVX2 Comments

Building (3 posts)

Embedding Jupyter-quality rich visual Python into a static website

We all love our CV/blog websites hosted on GitHub Pages. We also love Jupyter notebooks for revolutionizing the look and feel of daily data processing. Now imagine that...

May 30, 2020 Building, Software Engineering Comments

Setting up Jitsi Meet

Web-conferencing platforms are on the rise during these unprecedented times. On the other hand, the vulnerabilities of Zoom and its lack of privacy motivate us to ...

May 07, 2020 Building, Software Engineering, Jitsi, Web Conferencing Comments

Building PyTorch without AVX2 on MacOS

In order to quickly explore PyTorch internals, I decided to compile and install a Debug build on my local machine. The first problem was that modern Clang surpris...

Apr 18, 2020 Building, Software Engineering, PyTorch, AVX2 Comments

CUDA (4 posts)

Enabling GPU device debugging in HIP

In order to debug a GPU kernel with cuda-gdb, we add -G -O0 to the nvcc command line, which in the case of CMake would be:

Sep 27, 2022 GPU, CUDA, HIP Comments

Installing CUDA from DEB packages without graphics

You may want to keep your NVIDIA GPU out of any desktop rendering for many reasons. While this is the default on headless servers, personal sys...

Sep 03, 2022 CUDA Comments

How to get infinite loops to work in CUDA

The CUDA compiler does not handle infinite loops properly. For instance, the loop below will be completely eliminated from the resulting assembly, along with its ...

Jun 12, 2018 CUDA, Software Engineering Comments

How to fix CUDA and avx512vlintrin.h incompatibility issue

Recent 5.x and 6.x GCC compilers are causing NVCC to produce the following kind of weird compile errors:

May 08, 2018 CUDA, Software Engineering, GCC Comments

GCC (1 post)

How to fix CUDA and avx512vlintrin.h incompatibility issue

Recent 5.x and 6.x GCC compilers are causing NVCC to produce the following kind of weird compile errors:

May 08, 2018 CUDA, Software Engineering, GCC Comments

GPU (1 post)

Enabling GPU device debugging in HIP

In order to debug a GPU kernel with cuda-gdb, we add -G -O0 to the nvcc command line, which in the case of CMake would be:

Sep 27, 2022 GPU, CUDA, HIP Comments

Git (2 posts)

Migrating from CVS to Git

CVS is still around for many important projects, making it difficult to scale their development. Tutorials available for this topic are not robust enough for ease...

Mar 26, 2023 Linux, Git Comments

Working with Overleaf via Git

Overleaf deploys Git to track collaborative modifications to projects. Moreover, a user has an option to work with Overleaf’s Git backend directly. It supports Gi...

Sep 24, 2022 Overleaf, Git Comments

HIP (1 post)

Enabling GPU device debugging in HIP

In order to debug a GPU kernel with cuda-gdb, we add -G -O0 to the nvcc command line, which in the case of CMake would be:

Sep 27, 2022 GPU, CUDA, HIP Comments

Jitsi (1 post)

Setting up Jitsi Meet

Web-conferencing platforms are on the rise during these unprecedented times. On the other hand, the vulnerabilities of Zoom and its lack of privacy motivate us to ...

May 07, 2020 Building, Software Engineering, Jitsi, Web Conferencing Comments

LLVM (1 post)

Using C-Reduce to understand Clang compiler bugs

Suppose we have a crash while compiling a huge application from source, e.g. a Python package with native C++ code. A source file fails to compile with the followin...

Jun 28, 2022 LLVM, Software Engineering Comments

Linux (2 posts)

Migrating from CVS to Git

CVS is still around for many important projects, making it difficult to scale their development. Tutorials available for this topic are not robust enough for ease...

Mar 26, 2023 Linux, Git Comments

Use multiple mirrors in apt

There is an interesting and not so well-known feature of the APT package manager: the ability to automatically choose a download mirror for every individual operation.

Nov 16, 2022 Linux, Ubuntu Comments

Loongson (1 post)

An overview of Loongson 3A5000 laptop

Sep 18, 2022 Loongson Comments

Networking (1 post)

VPN tunnelling for ipv6 only providers

The number of providers having problems with ipv4 support is growing. Recently we came across an ISP, which offers only ipv6, and is able to connect only to ipv6 ...

Aug 02, 2022 Networking Comments

Overleaf (1 post)

Working with Overleaf via Git

Overleaf deploys Git to track collaborative modifications to projects. Moreover, a user has an option to work with Overleaf’s Git backend directly. It supports Gi...

Sep 24, 2022 Overleaf, Git Comments

PostgreSQL (2 posts)

Error-tolerant PostgreSQL for data rescue

Database corruption always happens before we prepare for it. “Back up or give up” is the most frequently recommended solution. The main reason is database engines...

Aug 21, 2023 PostgreSQL Comments

Updating PostgreSQL version 10 to version 14 in dockerized Zulip instance

Upgrading from Zulip 5 to Zulip 6 requires updating PostgreSQL from version 10 to version 14.

Jun 11, 2023 Zulip, PostgreSQL Comments

PyTorch (1 post)

Building PyTorch without AVX2 on MacOS

In order to quickly explore PyTorch internals, I decided to compile and install a Debug build on my local machine. The first problem was that modern Clang surpris...

Apr 18, 2020 Building, Software Engineering, PyTorch, AVX2 Comments

Qt (1 post)

QWebEngineView remains blank whatever I do

In the most recent version of PyQt5, QWebEngineView refuses to draw any page content. Apparently, the solution is to disable sandboxing, as mentioned in this comme...

Sep 08, 2022 Qt Comments

Software Engineering (22 posts)

Using C-Reduce to understand Clang compiler bugs

Suppose we have a crash while compiling a huge application from source, e.g. a Python package with native C++ code. A source file fails to compile with the followin...

Jun 28, 2022 LLVM, Software Engineering Comments

Embedding Jupyter-quality rich visual Python into a static website

We all love our CV/blog websites hosted on GitHub Pages. We also love Jupyter notebooks for revolutionizing the look and feel of daily data processing. Now imagine that...

May 30, 2020 Building, Software Engineering Comments

Setting up Jitsi Meet

Web-conferencing platforms are on the rise during these unprecedented times. On the other hand, the vulnerabilities of Zoom and its lack of privacy motivate us to ...

May 07, 2020 Building, Software Engineering, Jitsi, Web Conferencing Comments

Building PyTorch without AVX2 on MacOS

In order to quickly explore PyTorch internals, I decided to compile and install a Debug build on my local machine. The first problem was that modern Clang surpris...

Apr 18, 2020 Building, Software Engineering, PyTorch, AVX2 Comments

How to get infinite loops to work in CUDA

The CUDA compiler does not handle infinite loops properly. For instance, the loop below will be completely eliminated from the resulting assembly, along with its ...

Jun 12, 2018 CUDA, Software Engineering Comments

How to fix CUDA and avx512vlintrin.h incompatibility issue

Recent 5.x and 6.x GCC compilers are causing NVCC to produce the following kind of weird compile errors:

May 08, 2018 CUDA, Software Engineering, GCC Comments

Remote profiling with NVIDIA Visual Profiler on a SLURM-based cluster

GPU-equipped clusters are often managed by the SLURM job control system. Essentially, a developer logs into the frontend node via SSH, builds the application and then qu...

Nov 16, 2016 Software Engineering Comments

Using CUDA device functions from OpenACC

OpenACC enables rapid transition of serial C/C++/Fortran code into GPU-enabled parallel code. However, due to its high-level nature, OpenACC does not offer access to GPU-s...

Sep 16, 2016 Software Engineering Comments

CUDA-like runtime interface for Xeon Phi

The performance power of GPUs could be exposed to applications using two principal kinds of programming interfaces: with manual parallel programming (CUDA or Open...

Apr 12, 2016 Software Engineering Comments

OpenMP 4.0 on NVIDIA CUDA GPUs

Multiple presentations about OpenMP 4.0 support on NVIDIA GPUs date back to 2012. There is however still very limited OpenMP 4.0 production-ready tools availabili...

Oct 14, 2015 Software Engineering Comments

Use CUDA 7.0 NVRTC with Thrust

Runtime Compilation (NVRTC) introduced in CUDA 7.0 allows to dynamically compile CUDA kernels during program execution (see example). This functionality allows to...

Apr 29, 2015 Software Engineering Comments

Get extra 8% perf in bilinear interpolation on GPU using __restrict__ keyword

Starting from GK110 (Tesla Kepler), “const __restrict__” annotation on kernel argument has an extra GPU-specific meaning: accesses to that argument should go through ...

Mar 26, 2015 Software Engineering Comments

Thrust/CUDA tip: reuse temporary buffer across multiple transforms

Thrust is a very handy STL-like template library for rapid data processing on GPUs.

Oct 09, 2014 Software Engineering Comments

On-the-fly modification of LLVM IR code of CUDA sources

Largely thanks to LLVM, in recent years we’ve seen a significant increase of interest in research & development of domain-specific compilation tools. With the re...

Sep 23, 2014 Software Engineering Comments

How to find CUDA's version of LLVM backend

It is well known that the CUDA toolkit uses an LLVM backend, but the version it uses is not shown. We can use gdb and an LLVM API function to print the version string:

Jul 14, 2014 Software Engineering Comments

NVIDIA Visual Profiler allows to connect 64-bit Linux server from 32-bit Windows

In the CUDA 6.0 release, an extremely handy feature has been added to Visual Profiler: support for remote profiling. This means that you can run the profiler GUI from ...

Jul 13, 2014 Software Engineering Comments

Calling CUDA device function from OpenACC Fortran kernel

OpenACC is known to be a fast method of developing quite efficient GPU-enabled applications. It is also possible to mix CUDA kernels and libraries with OpenACC ke...

Jul 11, 2014 Software Engineering Comments

Jetson K1: bandwidthTest

Chart on the left shows the bandwidths of memory transfers on Jetson K1 (Click to enlarge). For the baseline we also added GTX680M’s host-device and device-host (...

Jun 15, 2014 Software Engineering Comments

Jetson K1: from unboxing straight to CUDA in 5 steps

We finally got the most wanted Jetson K1 board in the house! In this post we show how to turn a just unboxed tiny board into fully-functional CUDA development nod...

Jun 14, 2014 Software Engineering Comments

How to break Ubuntu 13.04/14.04 with vanilla CUDA driver and unbreak it back

After installing CUDA driver from NVIDIA website, Ubuntu 13.04/14.04 window manager decorations (Unity, via Compiz) may stop working properly on Optimus machines ...

Jun 01, 2014 Software Engineering Comments

Improving CUDA profiler output of the MPI-CUDA program

Consider we need to profile the following MPI-CUDA program on GPU cluster. The most obvious way to profile this code on console-only cluster would be to invoke th...

Apr 24, 2014 Software Engineering Comments

One non-obvious reason of 'Illegal instruction' in GPU code

If cuda-gdb throws Program received signal CUDA_EXCEPTION_4, Warp Illegal Instruction. for the following code line:

Apr 12, 2014 Software Engineering Comments

Ubuntu (1 post)

Use multiple mirrors in apt

There is an interesting and not so well-known feature of the APT package manager: the ability to automatically choose a download mirror for every individual operation.

Nov 16, 2022 Linux, Ubuntu Comments

Web Conferencing (1 post)

Setting up Jitsi Meet

Web-conferencing platforms are on the rise during these unprecedented times. On the other hand, the vulnerabilities of Zoom and its lack of privacy motivate us to ...

May 07, 2020 Building, Software Engineering, Jitsi, Web Conferencing Comments

Zulip (1 post)

Updating PostgreSQL version 10 to version 14 in dockerized Zulip instance

Upgrading from Zulip 5 to Zulip 6 requires updating PostgreSQL from version 10 to version 14.

Jun 11, 2023 Zulip, PostgreSQL Comments
