The point is that, for many operations, numpy calls functions from a linear algebra library, and that is exactly where the problem hides. Fortunately, everything is quite easy to fix.
So, there are three possible situations:
- you have not installed any linear algebra library, and numpy falls back on its built-in one, which, it must be said, is very slow;
- you have installed a classical library such as ATLAS or BLAS, which can only use one core;
- you have installed a modern library such as OpenBLAS, MKL, or similar.
Let's run a simple test. Execute this program:
import numpy as np

size = 10000
a = np.random.random_sample((size, size))
b = np.random.random_sample((size, size))
n = np.dot(a, b)
Then, if you are on Linux, run top; if you are on Windows, open the "Performance" tab in the Task Manager (brought up with Ctrl+Shift+Esc). If top shows a load of around 100% (that is, a single core), or if the CPU usage indicator on the Performance tab, on the contrary, shows a value many times lower than 100%, then your calculation occupies only one core — and this article is for you. Those whose processors are all busy can rejoice: everything is fine, and you may skip the rest.
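To put a number on what you see in the process monitor, you can also time the matrix product directly. A minimal sketch (the matrix size here is smaller than in the test above so that the timing run stays quick):

```python
import time

import numpy as np

size = 2000  # smaller than the main test so this finishes quickly
a = np.random.random_sample((size, size))
b = np.random.random_sample((size, size))

start = time.perf_counter()
n = np.dot(a, b)
elapsed = time.perf_counter() - start

# A matrix product does roughly 2 * size**3 floating-point operations
gflops = 2 * size**3 / elapsed / 1e9
print("%.2f s, %.1f GFLOPS" % (elapsed, gflops))
```

Comparing the GFLOPS figure before and after switching to a multithreaded BLAS makes the speedup concrete: with all cores in use it should rise several times.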
Solution for Windows
In theory you could, of course, find the source code of the libraries, recompile them, and then rebuild numpy. I have even heard that someone wrote that he saw people who claimed they had managed it... In practice, the easiest way is to install a scientific Python distribution such as Anaconda or Canopy. The distribution includes not only Python and numpy, but also a whole heap of useful libraries for computation and visualization.
Then you can rerun the initial test and see for yourself that the speed has increased many times over.
Solution for Linux
Actually, here too you can simply install a ready-made distribution — Anaconda, Canopy, or something else with all the libraries included. But if you prefer to build things with your own hands, read on: all the recipes are below.
Checking the libraries
As you remember, two options are possible:
- you have the "old-school" (or "outdated", whichever you prefer) libraries installed (for example, ATLAS);
- you have not installed any libraries, and numpy is using its built-in one (which is even slower).
If you have a recent numpy version (> 1.10), go to the directory where numpy is installed (usually /usr/local/lib/python2.7/dist-packages/numpy, although the path can vary with your Linux and Python versions) and run the following commands in the console:
cd core
ldd multiarray.so
In older numpy versions there is no multiarray.so, but there is _dotblas.so; run ldd on that instead. The output of the ldd command will show whether numpy uses third-party linear algebra libraries:
linux-vdso.so.1 => (0x00007fffe58a2000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f8adbff4000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f8adbdd6000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8adba10000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8adc68c000)
If you do not see libblas.so in the listing, numpy is using its internal library. If you do see it, you have ATLAS or BLAS installed.
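You can also ask numpy itself what it was built against, without leaving Python. np.show_config() is available in recent versions; note that the exact output format differs between numpy releases:

```python
import numpy as np

# Prints the BLAS/LAPACK configuration numpy was built with;
# look for "openblas", "atlas", "blas" or "mkl" in the output.
np.show_config()
```
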
In any case, the first thing you need is to get the linear algebra library right.
OpenBLAS is a good library of linear algebra algorithms and functions, which lie at the heart of modern data analysis and machine learning methods.
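It is not only np.dot that benefits: everything in np.linalg that is backed by BLAS/LAPACK routines speeds up as well. A small illustration (the matrix size here is arbitrary):

```python
import numpy as np

size = 500
a = np.random.random_sample((size, size))
b = np.random.random_sample(size)

# Solving a linear system goes through an LAPACK LU factorization,
# so it is exactly the kind of call that a fast BLAS accelerates.
x = np.linalg.solve(a, b)
residual = np.abs(a.dot(x) - b).max()
print("max residual: %g" % residual)
```
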
First of all you will need a Fortran compiler, since OpenBLAS is not compatible with the standard g77 compiler:
sudo apt-get install gfortran
Download OpenBLAS from GitHub (first changing into a directory suitable for the installation):
git clone https://github.com/xianyi/OpenBLAS.git
Now enter the directory and start the build:
cd OpenBLAS
make FC=gfortran
When compilation and the build finish successfully, install the library:
sudo make install
By default, the library is installed into /opt/OpenBLAS. If you want it somewhere else, run make install with the PREFIX option:
sudo make install PREFIX=/your/preferred/location
Switching the libraries
If earlier you found that you already have some linear algebra library installed, it is enough to run the library-switching command:
sudo update-alternatives --install /usr/lib/libblas.so.3 libblas.so.3 \
    /opt/OpenBLAS/lib/libopenblas.so 50
After this, OpenBLAS becomes the default linear algebra library not only for numpy, but for all your programs and libraries in general.
Run the test again and watch how all your processors are now busy during the calculation.
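A related knob: OpenBLAS decides how many threads to use from environment variables. If you ever need to limit it (for example, on a shared server), set the variable before numpy is loaded. A sketch, assuming the standard OpenBLAS environment variables:

```python
import os

# Must be set BEFORE numpy (and thus OpenBLAS) is imported;
# setting it afterwards has no effect on the already-loaded library.
os.environ["OPENBLAS_NUM_THREADS"] = "2"
os.environ["OMP_NUM_THREADS"] = "2"  # some builds read the OpenMP variable instead

import numpy as np

a = np.random.random_sample((1000, 1000))
n = np.dot(a, a)  # should now use at most two cores
```
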
Building the correct numpy
If your numpy was working with the built-in library, you need to rebuild it so that it picks up the freshly installed OpenBLAS.
First, get rid of the defective build:
sudo pip uninstall numpy
Then create a .numpy-site.cfg file with the following contents in your home directory:
[default]
include_dirs = /opt/OpenBLAS/include
library_dirs = /opt/OpenBLAS/lib

[openblas]
openblas_libs = openblas
include_dirs = /opt/OpenBLAS/include
library_dirs = /opt/OpenBLAS/lib

[lapack]
lapack_libs = openblas

[atlas]
atlas_libs = openblas
libraries = openblas
If you chose a non-standard location for OpenBLAS earlier, adjust the paths in the file accordingly. Now install numpy again:
sudo pip install numpy
When compilation and installation finish, run the initial test and make sure the processors no longer sit idle. That's all.
This article is a translation of the original post at habrahabr.ru/post/274331/
We believe that the knowledge available at the most popular Russian IT blog, habrahabr.ru, should be accessible to everyone, even if poorly translated. Shared knowledge makes the world better.