
MSIS 5223: Installing Python

Instructions

Python is a true object-oriented programming language, and many libraries, programs,
and software package versions exist for it. Unlike SAS, IBM Modeler, or Tableau,
you can edit Python using a variety of applications. As with R, you are not limited to a single
application. For example, you could use Microsoft Visual Studio or Eclipse to edit your Python
files. Even Notepad will do.
The advantages of using Python are similar to those of using R. If you like using Linux, a
version exists for that. If you are one of the few who own an Apple computer (i.e., OS X), a
distribution exists. For most people, the Windows 64-bit installer is the best option to work with.
This course will use the Anaconda package. What is Anaconda?
Essentially, someone was nice enough to collect all the appropriate files and libraries related to
analytics and put them together into a nice, easy installer. You can find out more by visiting the
website of Continuum Analytics at https://www.continuum.io/, the company behind Anaconda.

1. Basic Overview of Tools and Applications

At the most basic level, this course will focus on using the programming language Python
and the Python library Pandas to perform advanced statistical modeling. On its own, Python can
do very basic data manipulation; it lacks capabilities for more complex modeling. This is where
the library Pandas comes in (see http://pandas.pydata.org/ for more information on Pandas).
The idea behind the Pandas library came from R. The developers wanted features in Python
similar to those found in R: for example, dataframes as objects, simple reading and writing
of data, data management and handling, time series functions, and many other statistical
abilities not found in Python.
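
To make the idea of a dataframe concrete, here is a minimal sketch. It assumes Pandas is already installed (it is included with Anaconda). The function name load_scores and the sample data are hypothetical, made up only for illustration; in practice you would pass a file path to pd.read_csv() instead of text.

```python
import io


def load_scores(csv_text):
    """Parse CSV text into a Pandas dataframe (Pandas is imported lazily)."""
    import pandas as pd  # assumed to be installed as part of Anaconda
    return pd.read_csv(io.StringIO(csv_text))


# Hypothetical sample data standing in for a CSV file on disk.
SAMPLE = "student,score\nA,90\nB,80\nC,70\n"

try:
    df = load_scores(SAMPLE)
    print(df.head())           # the first rows of the dataframe
    print(df["score"].mean())  # a simple column-wise statistic
except ImportError:
    print("Pandas is not available yet; install Anaconda first.")
```

The dataframe behaves much like a data frame in R: columns are accessed by name, and summary functions such as mean() operate on a whole column at once.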
The Anaconda package that you will install contains a number of applications, most of
which you will not use during the semester. You are encouraged to use them, however, to gain a
broader knowledge of the tools available. One of the most important tools installed is iPython
(see http://ipython.readthedocs.io/en/stable/overview.html for more information). The goal of the
developers of iPython was to create an environment that is more powerful and not as limited as
the default interpreter in the standard Python distribution.

What does iPython bring that the standard Python distribution does not? It has an
improved Python shell (e.g., tab completion and object information), a decoupled two-process
model, and interactive parallel computing. The last two improvements are not important for this
course, so your focus should be on the first one. The extended Python shell, or console, works
much like the one found in R. This is where you can type in your code and receive some kind of
output (see screenshot below).

For example, say you would like to use the function pd.read_csv() to open a file;
however, you cannot remember any of the code and need some way to refresh your memory.
Simply type ?pd.read_csv and metadata about the function will appear within the console.
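
The ? shortcut only works inside the iPython console. As a rough sketch of the same lookup in plain Python, the built-in help() function and the standard inspect module expose the same documentation. The helper name quick_help is hypothetical, made up for illustration, and the example assumes Pandas has been installed through Anaconda.

```python
import inspect


def quick_help(func):
    """Return the first line of an object's docstring, or a placeholder."""
    doc = inspect.getdoc(func)
    if not doc:
        return "(no docstring)"
    return doc.splitlines()[0]


# In the iPython console you would type:   ?pd.read_csv
# In plain Python, help(pd.read_csv) prints the full documentation.
try:
    import pandas as pd  # assumed to be installed as part of Anaconda
    print(quick_help(pd.read_csv))            # one-line summary
    print(inspect.signature(pd.read_csv))     # the accepted arguments
except ImportError:
    print("Pandas is not available yet; install Anaconda first.")
```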

For those of you who are familiar with using IDEs (e.g., Visual Studio,
Eclipse), you can also use Visual Studio 2013 or Visual Studio 2015 in conjunction with Python
Tools for Visual Studio (PTVS) to code Python applications. The official website for PTVS is
microsoft.github.io/PTVS/.
This allows you to code and run Python from directly within Visual Studio, including tab
completion. Note, you must have an interpreter in order to run the code. You can use CPython,
IronPython, Anaconda, or Canopy. This is my preferred method for programming Python. As a
student enrolled in this course, you can download a free copy of Visual Studio 2015 Enterprise
from MSDNAA via DreamSpark.
After you install Visual Studio and PTVS, please read this webpage for an overview on
how to set up iPython to run within Visual Studio:
https://github.com/Microsoft/PTVS/wiki/Using-IPython-with-PTVS.
For a quick introduction to PTVS, please watch the following video: YouTube Video.

2. Download the Installation Package

The installation files for the Anaconda distribution of Python are located at
https://www.continuum.io/downloads. The installation files are listed by operating system,
starting with Windows, then Mac OS X, and finally Linux. Windows provides a couple of
options based on the architecture of your computer; in other words, whether you are using a
32-bit (i.e., x86) system or a 64-bit one. How do you know? Hold down the Windows key on your
keyboard and press the Pause/Break key. You should see a window like
that shown below, indicating whether your version of Windows is 32-bit or 64-bit.

If you do not have a Pause/Break key on your keyboard, you can also navigate to Control
Panel > System and Security > System. This will take you to the same screen shown on the
previous page.
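
If a Python interpreter is already on your machine (or later, once the installation in Section 3 is finished), a similar check can be done from Python itself. The sketch below uses only the standard library. Note that it reports whether the Python interpreter is a 32-bit or 64-bit build, which is not necessarily the same as the Windows version, since a 32-bit Python can run on 64-bit Windows. The function name interpreter_bits is hypothetical.

```python
import platform
import struct
import sys


def interpreter_bits():
    """Return 32 or 64, based on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


print("Python version:", sys.version.split()[0])
print("Interpreter build:", interpreter_bits(), "bit")
print("Machine type:", platform.machine())
```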

After determining the version, proceed to download the correct version for Python 3.5.
You will not be using Python 2.7 for this course. In the image below, the correct version for a
64-bit version of Windows is circled in red; for a 32-bit version of Windows, select the link
circled in green. Note, the file is over 350 MB in size, so ensure you have a good, solid
connection to the internet prior to downloading.

3. Installing Anaconda with iPython

Run the installer after your download completes. Follow along and accept the defaults.
The installation is large and takes some time. On a slower, older machine, this can take up to 10
minutes (that's what students have told me, anyway).
After installation is complete, you should now have access to all the various libraries,
files, and packages. Many applications and programs were installed. The program you will be
using the most is called iPython.
If you would like to explore and learn more about using iPython, please read this tutorial
overview: http://pandas.pydata.org/pandas-docs/stable/tutorials.html; or, you can just read
through the tutorials I have created for this course.
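
To confirm that the installation worked, you can run a short check like the one below from the iPython console. This is only a sketch: the function names is_installed and check_environment are hypothetical, and the default list of packages is simply the two this course relies on most.

```python
import importlib.util
import sys


def is_installed(package_name):
    """Return True if the named top-level package can be found for import."""
    return importlib.util.find_spec(package_name) is not None


def check_environment(packages=("pandas", "IPython")):
    """Map each package name to whether it can be imported."""
    return {name: is_installed(name) for name in packages}


print("Python", sys.version.split()[0])
for name, found in check_environment().items():
    print(name, "found" if found else "MISSING")
```

If either package is reported as missing, the most likely cause is that you are running a different Python installation than the one Anaconda installed.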
