We have seen that a module or package can be used in a python session via:
import numpy as np
But where do the files of a package actually reside?
When an import statement is executed, python looks for the package in a list of search paths (similarly to how
LD_LIBRARY_PATH works for shared libraries on Linux). The list is available in python as sys.path, for example:
['/Users/carlo/repos/pycourse/Slides', '/usr/local/lib/python', '/usr/local/Cellar/root/6.18.00/lib/root', '/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python37.zip', '/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7', '/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/lib-dynload', '', '/Users/carlo/Library/Python/3.7/lib/python/site-packages', '/usr/local/lib/python3.7/site-packages', '/usr/local/lib/python3.7/site-packages/qtconsole-4.4.1-py3.7.egg', '/usr/local/lib/python3.7/site-packages/openpyxl-2.5.5-py3.7.egg', '/usr/local/lib/python3.7/site-packages/tabulate-0.8.2-py3.7.egg', '/usr/local/lib/python3.7/site-packages/nose-1.3.7-py3.7.egg', '/usr/local/lib/python3.7/site-packages/scikit_learn-0.19.2-py3.7-macosx-10.12-x86_64.egg', '/usr/local/lib/python3.7/site-packages/scikit_image-0.14.0-py3.7-macosx-10.12-x86_64.egg', '/usr/local/lib/python3.7/site-packages/pynrrd-0.3.2-py3.7.egg', '/usr/local/lib/python3.7/site-packages/pydicom-1.1.0-py3.7.egg', '/usr/local/lib/python3.7/site-packages/scipy-1.1.0-py3.7-macosx-10.12-x86_64.egg', '/usr/local/lib/python3.7/site-packages/ipykernel-4.8.2-py3.7.egg', '/usr/local/lib/python3.7/site-packages/et_xmlfile-1.0.1-py3.7.egg', '/usr/local/lib/python3.7/site-packages/jdcal-1.4-py3.7.egg', '/usr/local/lib/python3.7/site-packages/Pillow-5.2.0-py3.7-macosx-10.12-x86_64.egg', '/usr/local/lib/python3.7/site-packages/networkx-2.1-py3.7.egg', '/usr/local/lib/python3.7/site-packages/dask-0.18.2-py3.7.egg', '/usr/local/lib/python3.7/site-packages/cloudpickle-0.5.5-py3.7.egg', '/usr/local/lib/python3.7/site-packages/PyWavelets-0.5.2-py3.7-macosx-10.12-x86_64.egg', '/usr/local/lib/python3.7/site-packages/pyzmq-17.1.2-py3.7-macosx-10.12-x86_64.egg', '/usr/local/lib/python3.7/site-packages/Keras-2.2.4-py3.7.egg', '/usr/local/lib/python3.7/site-packages/opencv_python-184.108.40.206-py3.7-macosx-10.12-x86_64.egg', 
'/usr/local/lib/python3.7/site-packages/PyYAML-5.1.1-py3.7-macosx-10.12-x86_64.egg', '/usr/local/lib/python3.7/site-packages/Keras_Preprocessing-1.1.0-py3.7.egg', '/usr/local/lib/python3.7/site-packages/Keras_Applications-1.0.8-py3.7.egg', '/usr/local/lib/python3.7/site-packages/dicom_tools-2.4-py3.7.egg', '/usr/local/lib/python3.7/site-packages', '/usr/local/lib/python3.7/site-packages/IPython/extensions', '/Users/carlo/.ipython']
When a module is imported, python searches for it in the listed paths, in order. In an interactive session the current directory is by default the first search path (when running a script, it is the directory of the script). The directory
site-packages usually contains the installed third-party modules and packages. Note that packages often come in
egg format (all files of a package are zipped together with meta-data files).
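To answer the question of where a package resides for one specific module, the import machinery can be queried directly. The following is a minimal sketch using the standard-library importlib; the helper name locate is hypothetical, and json is used only because it is always available.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


# "json" is a standard-library package, used here only as an example.
json_location = locate("json")
print(json_location)

# Show the first few entries of the search path, in search order.
for entry in sys.path[:3]:
    print(repr(entry))
```

The same call works for any installed package, for example locate("numpy") if numpy is installed.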
You can add to or modify the search path in two ways. The first is directly from a python program, by manipulating the sys.path list:
from os.path import join
On *NIX systems you can also define the environment variable
PYTHONPATH before starting a python session to extend the search path.
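As an illustration of extending the search path from within a program, here is a small self-contained sketch. The directory is a temporary one and the module name mymodule_demo is hypothetical; neither comes from the course material.

```python
import importlib
import sys
import tempfile
from os.path import join

# Create a throwaway directory holding a one-line module (hypothetical name).
extra_dir = tempfile.mkdtemp()
with open(join(extra_dir, "mymodule_demo.py"), "w") as handle:
    handle.write("ANSWER = 42\n")

# Appending puts the directory last in the search order;
# sys.path.insert(0, extra_dir) would give it priority instead.
sys.path.append(extra_dir)
importlib.invalidate_caches()  # make sure the newly written file is noticed

mymodule_demo = importlib.import_module("mymodule_demo")
print(mymodule_demo.ANSWER)
```

Setting PYTHONPATH to the same directory before starting python would make the module importable without touching sys.path in the program.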
PyPI (the Python Package Index) is a repository of published python packages (currently more than 180,000 projects) that can be easily installed.
The oldest way to install a package is to use easy_install, which comes with the python setuptools (it is now deprecated in favor of pip). For example, to install the python package pip for the whole system you can do:
#Don't do that
sudo easy_install pip
pip is a more flexible way to interact with PyPI. It usually comes with all python distributions, so you do not need to install it. The command line utility allows for the installation and removal of packages. For example, to install the package numpy for the whole system you can do:
#Don't do this
sudo pip install numpy
pip takes care of dependencies, installing them for you.
The most appreciated feature of pip is the possibility to specify a requirements file that lists the packages and versions you need, so that they are all installed in one go:
pip install -r requirements.txt
An existing python environment can be captured in such a file, so that it can be reproduced elsewhere:
pip freeze > requirements.txt
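To give an idea of what ends up in such a file, the sketch below lists the installed distributions in the same name==version form, using the standard-library importlib.metadata (python 3.8 or later). It is only an approximation of what pip freeze reports, not a replacement for it, and the function name frozen_requirements is hypothetical.

```python
from importlib import metadata


def frozen_requirements():
    """List installed distributions as 'name==version' lines, sorted by name."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


# Print only the first few entries to keep the output short.
for line in frozen_requirements()[:5]:
    print(line)
```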
virtualenv solves a very specific problem: it allows multiple Python projects that have different (and often conflicting) requirements to coexist on the same computer.
It also allows installing packages without super-user privileges (i.e. no sudo):
pip install virtualenv
virtualenv myenv
This will create an environment (a directory) called
myenv that contains a python distribution that can be activated:
source myenv/bin/activate
pip install -r requirements.txt
Now the specified packages are installed in a subdirectory of
myenv, creating an isolated environment. You can deactivate the environment with:
deactivate
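An environment of this kind can also be created from python itself. The sketch below uses the standard-library venv module, which is a different tool from virtualenv but follows the same idea; the location is a temporary directory chosen only for illustration.

```python
import os
import tempfile
import venv

# Create the environment in a throwaway location; "myenv" mirrors the name used above.
env_dir = os.path.join(tempfile.mkdtemp(), "myenv")

# with_pip=False keeps the sketch fast and offline; with_pip=True also installs pip inside.
venv.create(env_dir, with_pip=False)

# Environments created by venv carry a pyvenv.cfg marker file in their root directory.
print(os.path.isfile(os.path.join(env_dir, "pyvenv.cfg")))
```

The resulting directory is activated from the shell in the same way as one created by virtualenv.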
Anaconda is a free and open-source python distribution tailored to data science, maintained by a private company (Anaconda Inc.).
Similarly to pip/virtualenv, it provides a package and environment manager (conda).
After installing the anaconda distribution, packages can be installed (globally), similarly to pip, with:
conda install numpy
However, packages are usually installed in environments:
conda create --name myenv
conda activate myenv
conda install numpy
Similarly to pip, all needed packages can be specified via a file (in YAML format):
conda env create -f environment.yml
conda activate myenv
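For reference, such a file typically names the environment and lists channels and dependencies. The sketch below writes a minimal one from python; the contents (environment name, python version, package list) are illustrative assumptions, not the file used in this course.

```python
import tempfile
from pathlib import Path

# Hypothetical contents: environment name, channels, and the packages to install.
environment_yml = """\
name: myenv
channels:
  - defaults
dependencies:
  - python=3.7
  - numpy
"""

# Write the file to a throwaway directory and show it.
target = Path(tempfile.mkdtemp()) / "environment.yml"
target.write_text(environment_yml)
print(target.read_text())
```

The name entry is what determines the environment name to pass to conda activate afterwards.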
For this tutorial you should have anaconda pre-installed. A VM with everything pre-installed is also available. We have also created an environment that contains everything needed to run the python code. Remember to activate the environment with:
conda activate pycourse
This should be done in each new terminal. Note that the name of the active environment is prefixed to the terminal prompt.
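If in doubt, the active environment can also be checked from inside python. This is a small sketch: the function name environment_info is hypothetical, CONDA_DEFAULT_ENV is the variable conda sets on activation, and the sys.base_prefix comparison applies to venv and recent virtualenv versions.

```python
import os
import sys


def environment_info():
    """Collect a few hints about which python environment is active."""
    return {
        # Root directory of the running python installation or environment.
        "prefix": sys.prefix,
        # True inside a venv or recent virtualenv, where sys.prefix differs from the base install.
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        # Set by conda when an environment is activated; None otherwise.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in environment_info().items():
    print(f"{key}: {value}")
```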
Compared to the default interpreter, ipython provides additional features that are very useful in interactive sessions:
Tab-key with an incomplete word/command to see suggestions
Shell commands can be run by prefixing them with ! (e.g. !pwd). Note the form mydir = !pwd, which stores the output of the command in a python variable
Up-key to auto complete line to most recent matching line
Access the output of the last command with _ or with _<N> for the output of the Nth command
%magic help on magic subsystem itself
%timeit python-code-goes-here will time the python line, repeating it a large number of times to improve precision
%bookmark create favorite folders to easily cd into them
%cd change the current directory
%logstart/%logstop start/stop logging of interactive session and save it to a file
%pycat similar to cat but with python syntax highlighting
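Outside of ipython, what %timeit does can be approximated with the standard-library timeit module. The following is a minimal sketch; the timed statement and the repetition counts are arbitrary choices made for illustration.

```python
import timeit

statement = "sum(range(1000))"

# Run the statement 1000 times per measurement, and take 5 measurements.
timings = timeit.repeat(statement, number=1000, repeat=5)

# The minimum is usually the most stable estimate; divide to get the time per run.
best_per_run = min(timings) / 1000
print(f"{best_per_run * 1e6:.2f} microseconds per loop")
```

Unlike %timeit, this sketch does not choose the number of repetitions automatically.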
Jupyter was initially developed for python and now supports many programming languages. The kernels run the code (an
ipython interpreter in our case): they receive input from the browser and send back output.
Installation via conda:
conda activate <env>
conda install jupyter
#Other useful packages
conda install -c conda-forge jupyter_contrib_nbextensions nbconvert nb_conda nb_conda_kernels
Start jupyter with:
conda activate <env> #If needed
jupyter notebook
Jupyter is very popular, and several ways to share notebooks exist. Note that when a notebook is executed, the output of the code cells is stored in the notebook file as meta-data, so it can be rendered without re-running the code: