It can be difficult to understand what is going on inside machine learning, and what mathematics for machine learning really means. Much of this is because machine learning libraries and programming frameworks now take care of the mathematical and statistical logic for you.
Anybody who works with machine learning still needs to understand the mathematics and statistics behind it. Here’s a list of handy resources, split by topic, to address that need.
Mathematics For Machine Learning
Let’s start off with the mathematics knowledge you need. We’ll then move on to statistics, a branch of mathematics, and bring the two together into theory later. By the end of this, you should have a pretty good understanding of the math knowledge you already have and the areas where you need a refresher.
This introductory level resource helps summarize the mathematics needed for machine learning and lists many helpful resources. It can be a great place to start building your curriculum.
A Brown University course that focuses on the linear algebra needed for machine learning. This is a must-read resource for the mathematics of machine learning.
This tutorial works through why linear algebra is so important for machine learning. It gives reasons to improve your linear algebra skills and finishes with a list of resources and an overview of the topic.
A Coursera course on the mathematics of machine learning from Imperial College London. You can enroll for free. The course covers multivariate calculus and linear algebra, with a focus on eigenvectors and principal component analysis (PCA).
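To see how eigenvectors and PCA fit together, here is a minimal sketch in plain Python. It assumes a tiny, made-up 2D dataset and uses power iteration (a technique picked here for illustration, not taken from the course) to approximate the leading eigenvector of the covariance matrix, which is the first principal component. The function names `covariance_matrix` and `leading_eigenpair` are hypothetical.

```python
import math

def covariance_matrix(points):
    """Return the 2x2 sample covariance matrix of a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def leading_eigenpair(matrix, iterations=100):
    """Approximate the largest eigenvalue and its unit eigenvector by power iteration."""
    v = [1.0, 0.5]  # arbitrary non-zero starting vector (an assumption)
    for _ in range(iterations):
        # Multiply by the matrix, then rescale back to unit length.
        w = [matrix[0][0] * v[0] + matrix[0][1] * v[1],
             matrix[1][0] * v[0] + matrix[1][1] * v[1]]
        norm = math.hypot(w[0], w[1])
        v = [w[0] / norm, w[1] / norm]
    # Rayleigh quotient: v . (A v) for a unit vector v.
    av = [matrix[0][0] * v[0] + matrix[0][1] * v[1],
          matrix[1][0] * v[0] + matrix[1][1] * v[1]]
    eigenvalue = v[0] * av[0] + v[1] * av[1]
    return eigenvalue, v

# Made-up points lying roughly along the line y = 2x.
points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
cov = covariance_matrix(points)
eigenvalue, direction = leading_eigenpair(cov)

# Share of total variance captured by the first principal component.
explained = eigenvalue / (cov[0][0] + cov[1][1])
print("first principal direction:", direction)
print("explained variance ratio:", explained)
```

In practice you would reach for a linear algebra library rather than hand-rolled loops. The point is only that the first principal component is the direction along which the data varies the most.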
This free course offers a comprehensive overview of machine learning mathematics. It comes with lecture notes and exercises from MIT.
A book focused on optimization problems over convex sets, a topic that is highly relevant to machine learning.
This Khan Academy course has the same format as other Khan Academy courses. It is strictly focused on how to use calculus to solve optimization problems.
A tutorial that summarizes some of the basics of calculus (from limits to continuity). It then applies these ideas to machine learning.
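To show how calculus turns into optimization, here is a small gradient descent sketch. The function `f`, its derivative `df`, the starting point, and the learning rate are all assumptions chosen for illustration. The idea is simply to step repeatedly in the direction opposite to the derivative.

```python
def f(x):
    """A simple convex function with its minimum at x = 3."""
    return (x - 3.0) ** 2 + 1.0

def df(x):
    """Derivative of f, worked out by hand: d/dx (x - 3)^2 + 1 = 2(x - 3)."""
    return 2.0 * (x - 3.0)

def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to approach a local minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

x_min = gradient_descent(df, start=0.0)
print("approximate minimizer:", x_min)
print("function value there:", f(x_min))
```

Training many machine learning models follows the same pattern, with a loss function in place of `f` and many parameters in place of the single `x`.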
Statistics For Machine Learning
Now on to the statistics part of this article, where we learn how data is distributed and how machine learning models can be evaluated.
This introductory level course features videos and teaches you the statistics you need to understand most machine learning concepts.
It covers data distributions and an introduction to Bayesian probability. Importantly, it also teaches confidence intervals and the logic behind regression.
These are all topics that might be covered in a typical high school or university curriculum. Use the course as a refresher if you’ve seen the content before.
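As a quick refresher on confidence intervals, here is a minimal sketch using only Python's standard library. The sample values are made up, and the helper `mean_confidence_interval` is a hypothetical name. It uses the normal approximation, so for a sample this small a t-distribution would usually be the more careful choice.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(sample)
    center = mean(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center - margin, center + margin

# Hypothetical measurements, for illustration only.
sample = [4.9, 5.1, 5.0, 5.3, 4.8, 5.2, 5.0, 4.7, 5.4, 5.1, 4.9, 5.0]

low95, high95 = mean_confidence_interval(sample, confidence=0.95)
low99, high99 = mean_confidence_interval(sample, confidence=0.99)
print("95% interval:", (low95, high95))
print("99% interval:", (low99, high99))
```

Asking for more confidence widens the interval, which is the trade-off these courses spend time on.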
For those of you who learn better through videos, look no further than this YouTube playlist of a Harvard statistics intro.
It’s taught by Joe Blitzstein, Professor of the Practice in Statistics at Harvard University. The course covers everything from the basics of sampling to Markov chains.
This IPython notebook lets you play with probability in depth. Built by Peter Norvig, Google’s head of research, it goes through a theory-based introduction to probability. Then, it dives into warm-up exercises expressed in Python, from die games to combinatorial urn problems (for example: An urn contains 6 blue, 9 red, and 8 white balls. We select six balls at random. What is the probability of these outcomes […]).
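Here is a rough sketch of how an urn problem like this can be counted in Python. It is not taken from the notebook. The helper `probability_of_draw` is a hypothetical name, and the outcome checked (all six balls are red) is an assumption picked purely for illustration, since the list of outcomes is elided above.

```python
from fractions import Fraction
from math import comb

# Urn contents as described in the text.
urn = {"blue": 6, "red": 9, "white": 8}

def probability_of_draw(urn, wanted):
    """Probability of drawing exactly the wanted count of each color, without replacement.

    Colors missing from `wanted` are treated as drawn zero times.
    """
    total_balls = sum(urn.values())
    balls_drawn = sum(wanted.values())
    # Count the ways to pick the wanted number of balls of each color.
    favorable = 1
    for color, count in wanted.items():
        favorable *= comb(urn[color], count)
    # Divide by the number of ways to pick any `balls_drawn` balls.
    return Fraction(favorable, comb(total_balls, balls_drawn))

# Illustrative outcome (an assumption): all six selected balls are red.
p_all_red = probability_of_draw(urn, {"red": 6})
print(p_all_red, float(p_all_red))
```

Using `Fraction` keeps the answer exact, which makes it easier to check against a hand calculation.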
This article summarizes how you can create commonly used probability distributions in Python. It uses Seaborn to help you visualize the results. It’s a helpful exercise for really understanding the different probability distributions of data: generating random numbers from each distribution deepens your knowledge of probability as you work with it.
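If you want to try the idea without installing anything, here is a small sketch that draws random numbers from three distributions with Python's standard `random` module and compares each sample mean to the theoretical mean. It skips the Seaborn plotting the article uses, and the seed, sample size, and distribution parameters are arbitrary choices for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 50_000

# Draw n random numbers from each distribution.
samples = {
    "uniform(0, 1)": [rng.uniform(0, 1) for _ in range(n)],
    "normal(mu=0, sigma=1)": [rng.gauss(0, 1) for _ in range(n)],
    "exponential(rate=2)": [rng.expovariate(2.0) for _ in range(n)],
}

# Theoretical means: (a + b) / 2, mu, and 1 / rate respectively.
expected_means = {
    "uniform(0, 1)": 0.5,
    "normal(mu=0, sigma=1)": 0.0,
    "exponential(rate=2)": 0.5,
}

sample_means = {name: mean(values) for name, values in samples.items()}
for name, value in sample_means.items():
    print(f"{name}: sample mean {value:.4f}, expected {expected_means[name]}")
```

The sample means should land close to the theoretical ones, and they get closer as `n` grows.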
With a brief introduction to Bayesian inference, this article helps explain when to use Bayesian logic. It links to several helpful articles and resources so you can understand the magic of prior and posterior distributions.
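To make prior and posterior distributions a little less magical, here is a minimal sketch of a Bayesian update for a coin's probability of heads. It assumes a Beta prior with binomial data, a standard conjugate pair where the update is simple addition. The prior parameters and the observed flips are made up, and the function names are hypothetical.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing binomial data under a Beta prior."""
    return prior_a + successes, prior_b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Prior belief: Beta(2, 2), centered on a fair coin but not very confident.
prior = (2, 2)

# Made-up observations: 7 heads and 3 tails.
heads, tails = 7, 3

posterior = update_beta(*prior, heads, tails)
print("prior mean:", beta_mean(*prior))
print("posterior parameters:", posterior)
print("posterior mean:", beta_mean(*posterior))
```

The posterior mean sits between the prior mean and the observed frequency of heads, and more data pulls it further toward the observed frequency.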
If you want a more in-depth read into probability and Bayesian inference, look no further than this interactive book.
Machine Learning Theory
Now, it’s time to combine your math and statistics knowledge into a cohesive whole and apply it to machine learning theory and models.
This interactive introduction (scroll down and watch the magic happen) helps bring together the basics of machine learning. Walk through a decision tree analysis: whether a building is in San Francisco or New York City, based on the rent and square footage.
You’ll learn how data is processed and how models are trained and iterated upon to become more and more successful.
This broad overview of machine learning theory will help you categorize different machine learning algorithms. Then, it will teach you the statistical/mathematical foundation for each.
Running through all the different machine learning algorithms out there can be confusing. This Medium article from Towards Data Science breaks down the theory behind the top ten.
Here you’ll learn how to use the statistical properties of an algorithm’s outputs, and the amount of error and variance present, to measure how effective a machine learning model is.
You’ll learn about foundational concepts like the AUC (area under the curve), the accuracy score, and the confusion matrix, drawing heavily on the math and statistics for machine learning taught in the resources above.
Going beyond the default accuracy metric often reported first by machine learning frameworks, this article explains the nuances of precision and recall. It relates the two key metrics and others to machine learning models.
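To see why accuracy alone can mislead, here is a short sketch that builds a confusion matrix by hand and derives accuracy, precision, and recall from it. The labels and predictions are made up, and the helper names are hypothetical. Most machine learning libraries ship their own versions of these metrics.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            counts["tp"] += 1
        elif predicted == positive:
            counts["fp"] += 1
        elif actual == positive:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts

def safe_divide(numerator, denominator):
    """Return 0.0 instead of failing when a metric is undefined."""
    return numerator / denominator if denominator else 0.0

# Made-up, imbalanced data: only 2 of the 10 examples are positive.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

c = confusion_counts(y_true, y_pred)
accuracy = safe_divide(c["tp"] + c["tn"], len(y_true))
precision = safe_divide(c["tp"], c["tp"] + c["fp"])  # of predicted positives, how many were right
recall = safe_divide(c["tp"], c["tp"] + c["fn"])     # of actual positives, how many were found

print(c)
print("accuracy:", accuracy, "precision:", precision, "recall:", recall)
```

On this toy data the accuracy looks healthy while precision and recall are both much lower, which is exactly the gap that precision and recall are meant to expose.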
This Quora answer sums up foundational research papers in machine learning that highlight theoretical approaches. Use your math and statistics knowledge to put these approaches into code.
This section of Stack Overflow is filled with questions you’d have about machine learning theory, statistics, and mathematical expressions.
1- Build a mathematics for machine learning curriculum
2- Learn or refresh on calculus and optimization functions
3- Learn or refresh on linear algebra and eigenvectors/eigenvalues
4- Build a statistics for machine learning curriculum
5- Learn or refresh frequentist statistics, confidence intervals, p-values
6- Learn or refresh Bayesian inference
7- Combine your math and stats knowledge and learn machine learning theory, algorithms, and model evaluation