It is often useful to be able to call BLAS routines directly from Cython. Doing so avoids calling the corresponding NumPy functions (which would incur a performance penalty of running interpreted code and type and shape checking) as well as re-implementing linear algebra operations in Cython (which will likely be both incorrect and slower).
Existing Cython BLAS wrappers
Correspondingly, there are several ways of doing so.
- CythonGSL provides Cython wrappers for the GNU Scientific Library.
- tokyo wraps a lot of BLAS routines in Cython functions.
- This StackOverflow thread suggests a way of calling the BLAS version bundled with SciPy.
If these projects fit your requirements, great! You can read no further. In my code, however, I often find myself needing only one or two BLAS routines that are called in a tight inner loop – and in these cases I find it preferable to write my own quick wrapper with just these two functions.
Calling BLAS directly
Declaring BLAS functions is a straightforward application of the Cython
cdef extern machinery.
Getting the BLAS level 1 double inner product function is very straightforward:
cdef extern from 'cblas.h': double ddot 'cblas_ddot'(int N, double* X, int incX, double* Y, int incY) nogil
This gives a function that takes the length the vectors
N, the pointers to the first element of
Y, and their strides
Calling it is also very easy:
cpdef double run_blas_dot(double[::1] x, double[::1] y, int dim): # Get the pointers. cdef double* x_ptr = &x cdef double* y_ptr = &y return ddot(dim, x_ptr, 1, y_ptr, 1)
Declaring level 2 and level 3 functions is a little bit trickier as we need to take care of the various flags passed into the routines. Taking
DGEMV (double matrix-vector product) as an example, we need:
cdef extern from 'cblas.h': ctypedef enum CBLAS_ORDER: CblasRowMajor CblasColMajor ctypedef enum CBLAS_TRANSPOSE: CblasNoTrans CblasTrans CblasConjTrans void dgemv 'cblas_dgemv'(CBLAS_ORDER order, CBLAS_TRANSPOSE transpose, int M, int N, double alpha, double* A, int lda, double* X, int incX, double beta, double* Y, int incY) nogil
The first two
ctypedefs give us the flags governing the matrix-vector product operation:
CBLAS_ORDERdetermines whether the matrix
Auses row-major or column-major storage (C and Fortran arrays in NumPy parlance), and
CBLAS_TRANSPOSEdetermines whether the matrix
Ashould be transposed for the multiplication.
The final lines gives us the actual function signature. To call it, we need:
- the two parameters above,
- the dimensions of the
- the scaling constants
- the pointers to the
Yvector (where the
Yvector stores the result), and
- the strides of the
To call it:
cpdef run_blas_dgemv(double[:, ::1] A, double[::1] x, double[::1] y, int M, int N, double alpha, double beta): cdef double* A_ptr = &A[0, 0] cdef double* x_ptr = &x cdef double* y_ptr = &y dgemv(CblasRowMajor, CblasNoTrans, M, N, alpha, A_ptr, N, x_ptr, 1, beta, y_ptr, 1)
And that’s it: good enough for quick and dirty projects.