Table of Contents
1 00 - Introduction
1.1 First steps
1.1.1 Installing python
1.1.2 Installing extra dependencies
1.2 Course objectives
1.3 Motivation (editorial)
1.3.1 Concurrency vs. parallelism
1.3.2 Threads and processes
1.3.3 Thread scheduling
1.3.4 Releasing the GIL
from IPython.display import Image  # Image() is used below to display the slide figures inline
(Note – this material is under construction and might change significantly between now and June 14)
All the code used in the tutorial can be run on a Windows, Mac, or Linux laptop
We will use a Python distribution called Anaconda to run a series of Jupyter notebooks (you are reading a Jupyter notebook now).
You’re definitely encouraged to bring your laptop to the tutorial; before the session, please do the following:
Download and install Miniconda (the Python 3.6 version) from https://conda.io/miniconda.html on your laptop, accepting all the defaults for the install (I specified an install into a folder named ma36 below):
If you are running on Windows:
Press the ⊞ Win key and type: anaconda prompt
This should launch a cmd shell that we can use to install other packages
To test your installation, type
conda list
at the prompt, which should show you a list of installed packages starting with:
(base) C:\Users\paust>conda list
# packages in environment at C:\Users\paust\ma36:
If you are running macOS or Linux, launch a bash terminal after the install. Hopefully when you type:
conda list
you will see output that looks like:
% conda list
# packages in environment at /Users/phil/mb36:
#
# Name Version Build Channel
To install the software required to run the notebooks:
Download conda_packages.txt by right-clicking on the link.
From your cmd or bash terminal, cd to the folder containing that file and run:
conda install --file conda_packages.txt
If this succeeds, then typing the command:
python -c 'import numpy;print(numpy.__version__)'
should print:
1.14.2
(or possibly a higher version). Note that the Windows cmd shell does not strip single quotes, so use double quotes there: python -c "import numpy;print(numpy.__version__)"
If you have trouble with these steps, send me a bug report at paustin@eos.ubc.ca and we can iterate.
From Wikipedia:
“In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system.[1] The implementation of threads and processes differs between operating systems, but in most cases a thread is a component of a process. Multiple threads can exist within one process, executing concurrently and sharing resources such as memory, while different processes do not share these resources. In particular, the threads of a process share its executable code and the values of its variables at any given time.”
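Here is a minimal sketch (not part of the course notebooks) that makes the shared-memory point concrete: two threads append to the same Python list, and because threads within one process share memory, both workers see and modify the same object:

import threading

# Two threads append to the same list: threads in one process share memory,
# so both workers operate on the same `shared` object.
shared = []

def worker(label):
    for i in range(3):
        shared.append((label, i))

threads = [threading.Thread(target=worker, args=(name,)) for name in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared)  # six tuples in one list; the interleaving may vary from run to run

If these were separate processes instead of threads, each would get its own copy of shared and the appends would not be visible across them.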
Reference: Thomas Moreau and Olivier Griesel, PyParis 2017 [Mor2017]
If multiple threads are present in a Python process, the Python interpreter releases the GIL at specified intervals (5 milliseconds by default) to allow the other threads to execute:
Image(filename='images/morreau1.png') #[Mor2017]
If the computation running on a thread has released the GIL, then it can run independently of other threads in the process. Execution of these threads is scheduled by the operating system along with all the other threads and processes on the system.
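As an aside (this snippet is mine rather than from the slides, but the functions are part of the standard library), the 5 millisecond switch interval can be inspected and changed from Python itself:

import sys

# The interpreter asks the running thread to release the GIL every
# "switch interval" seconds so other threads get a chance to run.
print(sys.getswitchinterval())   # 0.005 s (5 ms) by default

sys.setswitchinterval(0.01)      # the interval can be tuned at runtime
print(sys.getswitchinterval())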
In particular, basic computational functions in NumPy, such as add (+) and subtract (-), release the GIL, as do universal math functions like cos and sin.
Image(filename='images/morreau2.png') #[Mor2017]
Image(filename='images/morreau3.png') #[Mor2017]
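As a rough, machine-dependent sketch (the array size, worker count, and any speedup below are assumptions, not course benchmarks), the following times np.cos run serially and in a thread pool; because np.cos releases the GIL, the threaded version can keep several cores busy inside a single Python process:

import numpy as np
from concurrent.futures import ThreadPoolExecutor
from timeit import default_timer as timer

# np.cos releases the GIL while it works on each row, so the thread pool
# can run several rows at the same time on a multi-core machine.
data = np.random.rand(4, 2_000_000)

start = timer()
serial = [np.cos(row) for row in data]
print(f"serial:   {timer() - start:.2f} s")

start = timer()
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(np.cos, data))
print(f"threaded: {timer() - start:.2f} s")

The actual speedup depends on your core count and memory bandwidth; the point is only that the threaded timing can drop below the serial one even though everything runs in one Python process.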