(adapted from Step by step approach to perform data analysis in Python)
So you have decided to learn Python, but you don’t have prior programming experience, and you are unsure where to start and how much Python to learn.
These are some of the common questions beginners ask when getting started with Python (for data-centric applications):
- “How long does it take to learn Python?”
- “How much Python should I learn to perform data analysis?”
- “What are the best books/courses to learn Python?”
- “Should I be an expert Python programmer in order to work with data sets?”
It is fine to be confused when you begin learning a new skill; that is what the author of “learn anything in 20 hours” says.
However, the key phrase here is: Don’t Panic! This tutorial has been thought out and designed to show you that the following common belief is a misconception: performing data analysis in Python requires being proficient in Python programming.
Coding is fun, but you don't really need to be a coding ninja in Python to do data analysis.
All you need to get started is some basics of (Python) programming and a few very elementary software engineering concepts, just enough to avoid disasters when you go to production, whatever production means to you (e.g. deploying a system online, or sharing the code of your prototype or experiments in a public repo for reproducibility).
In this tutorial, you won't learn how to program in Python. If you are looking for a quick tutorial on Python programming, this may be the one for you: Python Programming Tutorial
For a glimpse of what to expect from this tutorial, I would suggest this 5-minute read:
5 amazingly powerful Python libraries for Data Science
To run the code included in this repository, we will be using Python 3 (which is not Python 2, by the way). Although using your own Python version (and environment) will be more than fine, for an easier and quicker setup of all the necessary Python packages, I would strongly suggest downloading and using the Anaconda Python distribution.
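If you are not sure what your current environment provides, a minimal sketch like the one below can help. It only checks that the interpreter is Python 3 and reports which packages cannot be found. Note that the package list is an assumption made for illustration (this page does not list the exact requirements), and `check_environment` is a hypothetical helper name, not something shipped with this repository.

```python
import sys
from importlib.util import find_spec

# Assumed package list, for illustration only: adjust it to whatever
# the notebooks you want to run actually import.
ASSUMED_PACKAGES = ["numpy", "scipy", "pandas", "matplotlib", "notebook"]


def check_environment(packages=ASSUMED_PACKAGES):
    """Return (is_python3, missing) for the running interpreter.

    `missing` lists the top-level package names that cannot be located.
    """
    is_python3 = sys.version_info.major == 3
    missing = [name for name in packages if find_spec(name) is None]
    return is_python3, missing


is_python3, missing = check_environment()
print("Python version:", sys.version.split()[0])
print("Running Python 3:", "yes" if is_python3 else "no")
print("Packages not found:", ", ".join(missing) if missing else "none")
```

If some packages are reported as not found, installing a full distribution such as Anaconda is usually the quickest way to get them all at once.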
(Most of) The materials in this tutorial will be provided as Jupyter Notebooks.
If you don't know what a Jupyter notebook is, or how to use it, please take a look at this quick introductory tour: IPython Notebook Beginner Guide.
For additional details and materials on Jupyter and IPython, here are some other suggested readings:
- Jupyter Notebook the Definitive Guide:
- What is a Jupyter Notebook
- Practical Introduction
- Notebook Examples
If you want an introductory overview of Python for Data Science, I strongly recommend
Scipy Lecture Notes: a community-driven project where you can find
tutorials (for non-experts) on the scientific Python ecosystem.
Additional books for further reading:
- SciPy and NumPy
- Python Data Science Handbook
- Elegant SciPy
- Python for Data Analysis
- Introduction to Machine Learning with Python
- Building Machine Learning Systems with Python
Some of the material included in this repository has been created by adapting the materials in the Python-ML-Course repository by luisPinedo. The original versions are available here: https://github.com/luisPinedo/python-ml-course