**Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining.**
- Introduction to Machine Learning and Pattern Classification
- Pre-Processing
- Model Evaluation
- Parameter Estimation
- Machine Learning Algorithms and Classification Models
- Clustering
- Collecting Data
- Data Visualization
- Statistical Pattern Classification Examples
- Resources
[Download a PDF version] of this flowchart.
- Predictive modeling, supervised machine learning, and pattern classification - the big picture [Markdown]
- Entry Point: Data - Using Python's sci-packages to prepare data for Machine Learning tasks and other data analyses [IPython nb]
- An Introduction to simple linear supervised classification using scikit-learn [IPython nb]
- Scaling and Normalization
- About Feature Scaling: Standardization and Min-Max-Scaling (Normalization) [IPython nb]
- Feature Selection
- Sequential Feature Selection Algorithms [IPython nb]
- Dimensionality Reduction
- Principal Component Analysis (PCA) [IPython nb]
- The effect of scaling and mean centering of variables prior to a PCA [PDF] [HTML]
- PCA based on the covariance vs. correlation matrix [IPython nb]
- Linear Discriminant Analysis (LDA) [IPython nb]
- Kernel tricks and nonlinear dimensionality reduction via PCA [IPython nb]
- An Overview of General Performance Metrics of Binary Classifier Systems [PDF]
- Cross-validation
- Streamline your cross-validation workflow - scikit-learn's Pipeline in action [IPython nb]
- Parametric Techniques
- Introduction to the Maximum Likelihood Estimate (MLE) [IPython nb]
- How to calculate Maximum Likelihood Estimates (MLE) for different distributions [IPython nb]
- Non-Parametric Techniques
- Kernel density estimation via the Parzen-window technique [IPython nb]
- The K-Nearest Neighbor (KNN) technique
- Regression Analysis
- Linear Regression
- Least-Squares fit [IPython nb]
- Non-Linear Regression
- Naive Bayes and Text Classification I - Introduction and Theory [View PDF] [Download PDF]
- Implementing a Weighted Majority Rule Ensemble Classifier in scikit-learn [IPython nb]
- Prototype-based clustering
- Hierarchical clustering
- Complete-Linkage Clustering and Heatmaps in Python [IPython nb]
- Density-based clustering
- Graph-based clustering
- Probabilistic-based clustering
- Collecting Fantasy Soccer Data with Python and Beautiful Soup [IPython nb]
- Download Your Twitter Timeline and Turn into a Word Cloud Using Python [IPython nb]
- Exploratory Analysis of the Star Wars API [IPython nb]
- Matplotlib examples - Exploratory data analysis of the Iris dataset [IPython nb]
- Supervised Learning
- Parametric Techniques
- Univariate Normal Density
- Ex1: 2-classes, equal variances, equal priors [IPython nb]
- Ex2: 2-classes, different variances, equal priors [IPython nb]
- Ex3: 2-classes, equal variances, different priors [IPython nb]
- Ex4: 2-classes, different variances, different priors, loss function [IPython nb]
- Ex5: 2-classes, different variances, equal priors, loss function, Cauchy distr. [IPython nb]
- Multivariate Normal Density
- Ex5: 2-classes, different variances, equal priors, loss function [IPython nb]
- Ex7: 2-classes, equal variances, equal priors [IPython nb]
- Non-Parametric Techniques
- Copy-and-paste ready LaTeX equations [Markdown]
- Open-source datasets [Markdown]
- Free Machine Learning eBooks [Markdown]
- Terms in data science defined in less than 50 words [Markdown]
- Useful libraries for data science in Python [Markdown]
- General Tips and Advice [Markdown]
- A matrix cheatsheet for Python, R, Julia, and MATLAB [HTML]