This is an old revision of the document!


Working with Python

Python is available on all Sterrewacht and Lorentz Institute GNU/Linux workstations. In most cases, both python 2 (currently 2.7) and python 3 (currently 3.6) are available.

Notes from a python introduction course are available on the observatory web site.

Python modules

Many common python packages (or modules), such as numpy, scipy, and astropy, are available to all users on every workstation. To display a list of available modules in python v2, type

python -m pip list

or similarly in python v3

python3 -m pip list
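The same check can be made from inside python. The following is a minimal sketch, not an official tool of the department: the helper name `is_available` and the list of module names are chosen here purely for illustration. It relies only on the standard library function `importlib.util.find_spec` and is meant for top-level module names.

```python
import importlib.util


def is_available(module_name):
    """Return True if the top-level module can be imported by this interpreter."""
    # find_spec locates the module without actually importing it.
    return importlib.util.find_spec(module_name) is not None


# Illustrative list of modules a script might need.
for name in ("numpy", "scipy", "astropy"):
    status = "available" if is_available(name) else "missing"
    print(f"{name}: {status}")
```

Run it with the interpreter you intend to use (python or python3), since each has its own set of installed modules.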

If the package you would like to use is not installed you have two options:

  • Contact the system administrators to install it globally (system wide).
  • Install it locally, that is in one of the directories writeable by you.

The two options are described in detail in the sections below.

Global installation

If you believe that the required package could be useful to other researchers in the Observatory or Lorentz Institute, you can request its installation through our helpdesk, https://helpdesk.strw.leidenuniv.nl/ (STRW) or https://helpdesk.lorentz.leidenuniv.nl/ (Lorentz). Please state your motivation and give detailed instructions on where to find the requested package and its license information. We will notify you when the installation is complete.

Local installation

Sometimes the requested python package is not useful to other researchers in your department, or you are a developer who wants to try out and modify development versions of installed packages or new packages. Generally speaking, a user or developer will want to install and manage python packages in alternative locations (usually outside the system python environment) for one of the following reasons:

  1. The package is not of interest to the majority of users.
  2. They don't have write access to the main Python site-packages directories.
  3. They want a custom stash of packages, that is not visible to other users.
  4. They want to isolate a set of packages to a specific python application, usually to minimize the possibility of version conflicts.

In this case you can follow one of the methods below.

METHOD 1: pip with the `--user' option

Python 2.6 introduced package installation via a “user scheme”. Under this scheme, Python distributions support an alternative install location that is specific to a user. A user's default install location is defined through the `site' module with the variable site.USER_BASE. On GNU/Linux, site.USER_BASE defaults to ~/.local. This mode of installation is turned on by passing the `--user' option to pip install, for instance

pip install --user SomePackage

To display the value of `site.USER_BASE', type

python -m site --user-base     # or: python3 -m site --user-base

and to show the path to your site-packages directory

python -m site --user-site     # or: python3 -m site --user-site

In the STRW and IL environments site.USER_BASE defaults to a user's ~/.local directory. It can nonetheless be changed by setting the environment variable PYTHONUSERBASE. For instance,

# in bash
export PYTHONUSERBASE=/somewhere/I/can/write/to
pip install --user SomePackage

will install `SomePackage' under /somewhere/I/can/write/to (libraries go to lib/pythonX.Y/site-packages, executables to bin).
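To see the effect of PYTHONUSERBASE without installing anything, one can ask a fresh interpreter which user base it would use. This is only a sketch: the helper name `user_base_for` is made up for this page, and the custom path is the placeholder used above.

```python
import os
import subprocess
import sys


def user_base_for(custom_base=None):
    """Ask a fresh interpreter which user base directory it would use."""
    env = dict(os.environ)
    if custom_base is not None:
        # Same effect as `export PYTHONUSERBASE=...` in the shell.
        env["PYTHONUSERBASE"] = custom_base
    result = subprocess.run(
        [sys.executable, "-c", "import site; print(site.getuserbase())"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


print(user_base_for())                             # current setting
print(user_base_for("/somewhere/I/can/write/to"))  # placeholder path from the text
```

The second call prints the custom location, which is the directory pip would use for a `--user' install in that environment.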

When using the `user' scheme to install packages, it is important to note that:

  • When globally installed packages are on the python path and they conflict with the installation requirements, they are ignored, not uninstalled.
  • When globally installed packages are on the python path and they satisfy the installation requirements, pip does nothing and reports that the requirement is satisfied.
  • pip will not perform a --user install in a virtualenv unless the virtualenv was created with --system-site-packages. Even then, pip will never install a package that conflicts with a package in the virtualenv site-packages.
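Because the behaviour of a `--user' install depends on whether a virtual environment is active, it can be useful to check this from python. A minimal sketch follows; the function name `in_virtual_environment` is an illustrative choice, not part of any library.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; the standard library venv
    # module and newer virtualenv releases make sys.base_prefix differ from
    # sys.prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("virtual environment active:", in_virtual_environment())
```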

METHOD 2: virtualenv

This guide refers to virtualenv version 12.0.7.

virtualenv is a tool that creates isolated Python environments. A python environment is essentially a folder that contains copies of all the files a Python project needs to run. In addition, each virtual environment contains a copy of the pip utility to manage packages. For example, suppose you would like to install pymatlab, which is not installed on the departmental workstations. You could then run

mkdir python_virt_envs && cd python_virt_envs
virtualenv --system-site-packages pymatlab

to create a virtual environment (folder) called pymatlab.
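For python 3 only, a comparable environment can also be created from within python using the standard library `venv` module. Note that this is a different tool from virtualenv, shown here as a sketch under that assumption; the temporary directory is used only to keep the example self-contained.

```python
import os
import tempfile
import venv

# Illustrative location; a temporary directory keeps the sketch self-contained.
base_dir = tempfile.mkdtemp(prefix="python_virt_envs_")
env_dir = os.path.join(base_dir, "pymatlab")

# system_site_packages=True mirrors the --system-site-packages option above;
# with_pip=False skips bootstrapping pip so that the example runs quickly.
venv.create(env_dir, system_site_packages=True, with_pip=False)

# pyvenv.cfg records how the environment was configured.
with open(os.path.join(env_dir, "pyvenv.cfg")) as cfg:
    print(cfg.read())
```

For real use you would normally pass with_pip=True so that the environment gets its own pip.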

Please note that your newly created virtual environment will be a `python2' one if you used `virtualenv' or a `python3' one if using `virtualenv-3'.

The last step before starting to use the newly generated environment is to activate it, that is to prepend its /bin folder to your $PATH environment variable. This is done by issuing

source pymatlab/bin/activate

or

source pymatlab/bin/activate.csh

if you are using csh. To confirm that pymatlab is active, virtualenv will change the terminal prompt ($PS1) to

(pymatlab)username@hostname:~/python_virt_envs$

to emphasize that you are operating in a virtual environment. To install pymatlab (or any other package) locally (in your virtual environment) run

pip install pymatlab

Your virtual environment should now have the same core python packages defined globally for all the Observatory or Lorentz Institute users plus any packages installed in the virtual environment.

In any case, it is advisable to keep a backup of your virtual environment configuration by creating a list of installed packages

pip freeze > packages.dat

This can help collaborators and fellow developers to reproduce your environment with

pip install -r packages.dat
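When two people exchange such lists, it helps to see where their environments differ. The sketch below parses the `name==version` lines that pip freeze writes and compares two lists. The function names and the version numbers in the sample are invented for illustration, and lines in other formats (editable installs, URLs) are simply skipped.

```python
def parse_freeze(text):
    """Parse `pip freeze` output into a {name: version} dictionary."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries that are not name==version pins.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def differing_pins(mine, theirs):
    """Return {name: (my_version, their_version)} for packages that differ."""
    names = sorted(set(mine) | set(theirs))
    return {n: (mine.get(n), theirs.get(n)) for n in names if mine.get(n) != theirs.get(n)}


# Invented sample data standing in for two packages.dat files.
mine = parse_freeze("numpy==1.15.1\nscipy==1.1.0\n")
theirs = parse_freeze("numpy==1.15.1\nscipy==1.0.0\n")
print(differing_pins(mine, theirs))
```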

When you are done working in a virtual environment, deactivate it by running

deactivate 

At any time, any virtual environment can be destroyed by removing the corresponding folder from the file system by executing

rm -rf ~/python_virt_envs/pymatlab

so do not panic if things do not work, just delete your virtual environment and start all over again.

Finally, it is possible to choose which python interpreter to use in your virtual environment by running virtualenv with the `-p' option

virtualenv -p /usr/bin/python3.4 pymatlab

Note: System administrators are not responsible for users' virtual environments and will not manage them. You are strongly advised to consult the documentation:

virtualenv --help

METHOD 3: easy_install

Easy Install is a python module (easy_install) that lets you automatically download, build, install, and manage Python packages. By default, easy_install installs python packages into Python’s main site-packages directory and manages them using a custom .pth file in that same directory. Very often, though, a user or developer wants easy_install to install and manage python packages in an alternative location. To install a package locally, type

easy_install -N --user pymatlab

This will install pymatlab under ~/.local/lib, ready to be imported in your next python session. Note: If you want to install your package in a location other than ~/.local, set the environment variable $PYTHONUSERBASE to a custom location, e.g.,

export PYTHONUSERBASE=/home/user/some/where/I/can/write

Please consult the docs to know more:

python -m easy_install --help

Example: how to let python search arbitrary library paths

Create/edit

~/.local/lib/python2.7/site-packages/my-super-library.pth

by appending the path of your choice, for instance

echo "/my/home/sweet/home/library" >> ~/.local/lib/python2.7/site-packages/my-super-library.pth

Python reads every .pth file found in its site-packages directories at startup and adds each existing directory listed there to sys.path.
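The mechanism can be tried out safely with temporary directories. In this sketch the two directories are stand-ins for a site-packages directory and a private library, and `site.addsitedir` is called by hand to do what python does for the real site-packages directories at startup.

```python
import os
import site
import sys
import tempfile

# Stand-ins for a site-packages directory and a private library directory.
site_dir = tempfile.mkdtemp(prefix="site_packages_")
library_dir = tempfile.mkdtemp(prefix="my_library_")

# A .pth file lists one directory per line.
with open(os.path.join(site_dir, "my-super-library.pth"), "w") as pth:
    pth.write(library_dir + "\n")

# addsitedir appends site_dir to sys.path and processes the .pth files in it.
site.addsitedir(site_dir)

print("library on sys.path:", library_dir in sys.path)
```

Directories listed in a .pth file that do not exist are silently ignored, so a typo in the path shows up as a missing entry in sys.path.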

Example: how to create a python environment module

First enable the modules package to also search private module directories

module load use.own

The line above will create the directory $HOME/privatemodules if it does not exist, and that directory will then be searched for environment module files.

Let us now install some packages to an arbitrary location and upgrade (only in $PYTHONUSERBASE) a package that is already installed system-wide

export PYTHONUSERBASE=/somewhere/I/can/write/to
pip install --user SomePackage
pip install -I --user SomePackageThatWASInstalledSystemwide

Create a file, say `$HOME/privatemodules/super-module' with the following contents

#%Module 1.0
#
#  
# 
# Needed only if executables were installed
prepend-path  PATH          /somewhere/I/can/write/to/bin
prepend-path  PYTHONPATH    /somewhere/I/can/write/to/lib/python2.7/site-packages

and type

module load super-module

and you are ready to use your newly created python environment. Note that a similar procedure can be repeated using python3.
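After loading such a module it is worth confirming that python really picks up the private copy of a package instead of the system-wide one. A minimal sketch, using a helper name (`module_origin`) invented for this page and `json` only as a module that is always present:

```python
import importlib.util


def module_origin(module_name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


# Replace "json" with the package you upgraded in $PYTHONUSERBASE.
print(module_origin("json"))
```

If the printed path does not start with the directory you added to PYTHONPATH, the system-wide copy is still taking precedence.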

Example: numpy with openBLAS

In this example we create a python2 virtual environment in which we will install the latest version of numpy that will use the openBLAS library.

Note: The procedure and paths below will work on any maris node and on the para cluster.

virtualenv py2_numpy_openBLAS
source py2_numpy_openBLAS/bin/activate
cd py2_numpy_openBLAS
mkdir numpy
pip install -d numpy numpy && cd numpy
tar xzf numpy-X.Y.Z.tar.gz
cd numpy-X.Y.Z/
cp site.cfg.example site.cfg

Edit site.cfg with your favorite editor so that it contains

[openblas]
libraries = openblas
library_dirs = /usr/lib64
include_dirs = /usr/include/openblas/
runtime_library_dirs = /usr/lib64

then install numpy

python setup.py install

If the installation goes smoothly, you should see

....
 
openblas_info:
  FOUND:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/lib64']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
    runtime_library_dirs = ['/usr/lib64']
 
....
 
Installed /some/where/py2_numpy_openBLAS/lib/python2.7/site-packages/numpy-X.Y.Z-py2.7-linux-x86_64.egg

Now that numpy is installed you could also install scipy, for instance

pip install scipy

openBLAS will automatically use multithreading on the basis of the computer resources and the executable. If you want more control over multithreading, you can either build openBLAS from source with a fixed number of threads or set the number of threads in your application. If neither method suits you, it is also possible to set the environment variable OPENBLAS_NUM_THREADS.

Warning: Be careful! Choose the number of threads with care, or your application will run slower than a single-threaded one!

Note: If your application is parallelized with OpenMP, please build OpenBLAS with USE_OPENMP=1.

Warning: If your application is already multi-threaded, it will conflict with OpenBLAS multi-threading. You must do one of the following:

  • Set export OPENBLAS_NUM_THREADS=1 in the environment.
  • Call openblas_set_num_threads(1) in the application at runtime.
  • Build a single-threaded version of OpenBLAS, e.g. make USE_THREAD=0
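For a python application, the environment variable can also be set from within the script itself, provided this happens before numpy is imported for the first time. The helper below is a sketch with an invented name (`limit_openblas_threads`); it only sets the variable for the current process.

```python
import os


def limit_openblas_threads(count=1):
    """Set OPENBLAS_NUM_THREADS for this process and return the value set."""
    os.environ["OPENBLAS_NUM_THREADS"] = str(count)
    return os.environ["OPENBLAS_NUM_THREADS"]


# Call this before the first `import numpy`: OpenBLAS reads the variable
# when the library is loaded, so setting it later has no effect.
limit_openblas_threads(1)
```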

In any case, please READ the docs.

Bypassing the existing python environment

Occasionally, something in the system-wide directories (e.g. /software/local/lib64/python2.7/site-packages) interferes with your python application. Perhaps you have code that requires a specific, older version of numpy or matplotlib. Just installing that version is not always sufficient. The trick is to set PYTHONPATH to point first to a directory in which you place a private sitecustomize.py, which then overrides the one we have placed in /usr/lib64/python2.7/site-packages (which is where we add the /software directories to the path for everyone). Here is how:

  mkdir /some/location/python_custom_dir
  setenv PYTHONPATH /some/location/python_custom_dir:/usr/lib64/python2.7/site-packages

The sitecustomize.py could be something like this:

  import sys
  import site
  # Build e.g. '2.7' from version_info; slicing sys.version breaks for two-digit minor versions
  mypath = '/usr/lib64/python%d.%d/site-packages' % sys.version_info[:2]
  # We want this directory right after the first path entry, to enforce the original defaults
  sys.path.insert(1, mypath)
  # In order to find also eggs and subdirectories, addsitedir seems necessary:
  site.addsitedir(mypath, known_paths=None)
working_with_python.1535615362.txt.gz · Last modified: 2018/08/30 07:49 by lenocil