Week 1: Machine Learning Basics

Introduction

Before we can dive deep into large language models, it is essential to get familiar with fundamental concepts in machine learning. Some key topics that should be revised include:

Machine Learning Problems
- Supervised vs Unsupervised
- Classification vs Regression
Neural Networks
- Building Deep Neural Networks: MLPs & CNNs
- Activation Functions: Sigmoid, ReLU
- Loss Functions: Cross-Entropy Loss
- Optimization and Gradient Descent
Natural Language Processing
- Word Embeddings
- Tokenization
- RNNs

Additionally, working knowledge of libraries such as NumPy, Pandas, Scikit-Learn, PyTorch, and TensorFlow will be helpful.

Goals for the Week

The following guide from Maxime Labonne covers the essentials discussed above, and hence has been provided as-is for reference. It introduces essential knowledge about mathematics, Python, and neural networks. Your goal this week is to revise the fundamentals by going through this guide as needed.

Reference: https://github.com/mlabonne/llm-course?tab=readme-ov-file#-llm-fundamentals

Learning Guide

1. Mathematics for Machine Learning

Before mastering machine learning, it is important to understand the fundamental mathematical concepts that power these algorithms.

Linear Algebra: This is crucial for understanding many algorithms, especially those used in deep learning. Key concepts include vectors, matrices, determinants, eigenvalues and eigenvectors, vector spaces, and linear transformations.
Calculus: Many machine learning algorithms involve the optimization of continuous functions, which requires an understanding of derivatives, integrals, limits, and series. Multivariable calculus and the concept of gradients are also important.
Probability and Statistics: These are crucial for understanding how models learn from data and make predictions. Key concepts include probability theory, random variables, probability distributions, expectations, variance, covariance, correlation, hypothesis testing, confidence intervals, maximum likelihood estimation, and Bayesian inference.
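
To make the role of gradients in optimization concrete, here is a minimal sketch in plain Python. The function `f`, its minimum at (3, -1), the learning rate, and the step count are all illustrative assumptions chosen for this example, not part of the referenced guide. It compares an analytic gradient with a finite-difference estimate and then runs basic gradient descent.

```python
def f(x, y):
    """A simple convex bowl whose minimum is at (3, -1)."""
    return (x - 3) ** 2 + 2 * (y + 1) ** 2


def grad_f(x, y):
    """Analytic gradient of f: the vector of its partial derivatives."""
    return 2 * (x - 3), 4 * (y + 1)


def numerical_grad(func, x, y, h=1e-6):
    """Central-difference estimate of the gradient, handy for checking grad_f."""
    dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dx, dy


def gradient_descent(x, y, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient so that f decreases."""
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y


x_min, y_min = gradient_descent(0.0, 0.0)
print(f"approximate minimum: ({x_min:.4f}, {y_min:.4f})")
```

Training a neural network follows the same pattern, except that the function being minimized is the loss and the gradient is computed by backpropagation over many parameters.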

📚 Resources:

3Blue1Brown - The Essence of Linear Algebra: Series of videos that give a geometric intuition to these concepts.
StatQuest with Josh Starmer - Statistics Fundamentals: Offers simple and clear explanations for many statistical concepts.
AP Statistics Intuition by Ms Aerin: List of Medium articles that provide the intuition behind every probability distribution.
Immersive Linear Algebra: Another visual interpretation of linear algebra.
Khan Academy - Linear Algebra: Great for beginners as it explains the concepts in a very intuitive way.
Khan Academy - Calculus: An interactive course that covers all the basics of calculus.
Khan Academy - Probability and Statistics: Delivers the material in an easy-to-understand format.

2. Python for Machine Learning

Python is a powerful and flexible programming language that’s particularly good for machine learning, thanks to its readability, consistency, and robust ecosystem of data science libraries.

Python Basics: Python programming requires a good understanding of the basic syntax, data types, error handling, and object-oriented programming.
Data Science Libraries: Become familiar with NumPy for numerical operations, Pandas for data manipulation and analysis, and Matplotlib and Seaborn for data visualization.
Data Preprocessing: This involves feature scaling and normalization, handling missing data, outlier detection, categorical data encoding, and splitting data into training, validation, and test sets.
Machine Learning Libraries: Proficiency with Scikit-learn, a library providing a wide selection of supervised and unsupervised learning algorithms, is vital. Understanding how to implement algorithms like linear regression, logistic regression, decision trees, random forests, k-nearest neighbors (K-NN), and K-means clustering is important. Dimensionality reduction techniques like PCA and t-SNE are also helpful for visualizing high-dimensional data.
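
As a rough illustration of the preprocessing steps mentioned above, the sketch below splits a toy column of values into training and test sets and standardizes it. It uses only the standard library so the mechanics stay visible; the helper names (`split_rows`, `fit_scaler`, `apply_scaler`) and the sample data are hypothetical, and in practice Scikit-learn provides equivalent utilities.

```python
import random
from statistics import mean, pstdev


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(values):
    """Compute the mean and (population) standard deviation of the data."""
    return mean(values), pstdev(values)


def apply_scaler(values, mu, sigma):
    """Standardize values to z-scores using previously fitted statistics."""
    if sigma == 0:
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]


# Hypothetical toy feature column.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0, 165.0, 175.0, 155.0, 185.0, 172.0]

train, test = split_rows(heights_cm, test_fraction=0.2)
mu, sigma = fit_scaler(train)                   # statistics from training data only
train_scaled = apply_scaler(train, mu, sigma)
test_scaled = apply_scaler(test, mu, sigma)     # reuse the training statistics
```

The scaler is fitted on the training split and then reused on the test split, so that no information from held-out data leaks into preprocessing.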

📚 Resources:

Real Python: A comprehensive resource with articles and tutorials for both beginner and advanced Python concepts.
freeCodeCamp - Learn Python: Long video that provides a full introduction to all of the core concepts in Python.
Python Data Science Handbook: Free digital book that is a great resource for learning pandas, NumPy, Matplotlib, and Seaborn.
freeCodeCamp - Machine Learning for Everybody: Practical introduction to different machine learning algorithms for beginners.
Udacity - Intro to Machine Learning: Free course that covers PCA and several other machine learning concepts.

3. Neural Networks

Neural networks are a fundamental part of many machine learning models, particularly in the realm of deep learning. To utilize them effectively, a comprehensive understanding of their design and mechanics is essential.

Fundamentals: This includes understanding the structure of a neural network, such as layers, weights, biases, and activation functions (sigmoid, tanh, ReLU, etc.).
Training and Optimization: Familiarize yourself with backpropagation and different types of loss functions, like Mean Squared Error (MSE) and Cross-Entropy. Understand various optimization algorithms like Gradient Descent, Stochastic Gradient Descent, RMSprop, and Adam.
Overfitting: Understand the concept of overfitting (where a model performs well on training data but poorly on unseen data) and learn various regularization techniques (dropout, L1/L2 regularization, early stopping, data augmentation) to prevent it.
Implement a Multilayer Perceptron (MLP): Build an MLP, also known as a fully connected network, using PyTorch.
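
Before building an MLP in PyTorch, it may help to see what a forward pass and a loss computation look like without a framework. The following is a minimal plain-Python sketch, not a PyTorch implementation; the weights, biases, and input are hand-picked, hypothetical values, and the `dense` helper is an illustrative name.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Pass positive values through and clamp negatives to zero."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy loss for one example with a 0/1 label."""
    y_pred = min(max(y_pred, eps), 1.0 - eps)  # avoid log(0)
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1.0 - y_pred))


# Hypothetical parameters: 2 inputs -> 2 hidden units (ReLU) -> 1 output (sigmoid).
hidden_weights = [[1.0, -1.0], [0.5, 0.5]]
hidden_biases = [0.0, -0.5]
output_weights = [[1.0, -2.0]]
output_biases = [0.0]

x = [2.0, 1.0]
hidden = dense(x, hidden_weights, hidden_biases, relu)
prediction = dense(hidden, output_weights, output_biases, sigmoid)[0]
loss = binary_cross_entropy(1, prediction)
print(f"hidden={hidden}, prediction={prediction:.4f}, loss={loss:.4f}")
```

Training would then adjust the weights and biases to reduce `loss`, which is the part that backpropagation and an optimizer such as SGD or Adam automate in a framework.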

📚 Resources:

3Blue1Brown - But what is a Neural Network?: This video gives an intuitive explanation of neural networks and their inner workings.
freeCodeCamp - Deep Learning Crash Course: This video efficiently introduces all the most important concepts in deep learning.
Fast.ai - Practical Deep Learning: Free course designed for people with coding experience who want to learn about deep learning.
Patrick Loeber - PyTorch Tutorials: Series of videos for complete beginners to learn about PyTorch.

4. Natural Language Processing (NLP)

NLP is a fascinating branch of artificial intelligence that bridges the gap between human language and machine understanding. From simple text processing to understanding linguistic nuances, NLP plays a crucial role in many applications like translation, sentiment analysis, chatbots, and much more.

Text Preprocessing: Learn the common text preprocessing steps: tokenization (splitting text into sentences, words, or subword units), stemming (stripping affixes to reduce words to a crude root form), lemmatization (mapping words to their dictionary form, taking context into account), and stop word removal.
Feature Extraction Techniques: Become familiar with techniques to convert text data into a format that can be understood by machine learning algorithms. Key methods include Bag-of-words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF), and n-grams.
Word Embeddings: Word embeddings are a type of word representation that allows words with similar meanings to have similar representations. Key methods include Word2Vec, GloVe, and FastText.
Recurrent Neural Networks (RNNs): Understand the working of RNNs, a type of neural network designed to work with sequence data. Explore LSTMs and GRUs, two RNN variants that are capable of learning long-term dependencies.
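
The tokenization and feature extraction ideas above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: the tokenizer is a naive regular expression, the TF-IDF formula is one common unsmoothed variant (term count divided by document length, times log(N / document frequency)), and the example sentences are made up. Library implementations such as Scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their numbers will differ.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens (a deliberately naive rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens):
    """Count how often each token occurs, ignoring word order."""
    return Counter(tokens)


def tf_idf(documents):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up example corpus.
docs = ["The cat sat on the mat.", "The dog sat on the log."]
tokens = tokenize(docs[0])
bigrams = ngrams(tokens, 2)
bow = bag_of_words(tokens)
weights = tf_idf(docs)
```

Terms that occur in every document, such as "the" here, receive a weight of zero, while terms unique to one document are weighted up. Word embeddings go a step further by learning dense vectors instead of counts.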

📚 Resources:

Lena Voita - Word Embeddings: Beginner-friendly course about concepts related to word embeddings.
RealPython - NLP with spaCy in Python: Exhaustive guide about the spaCy library for NLP tasks in Python.
Kaggle - NLP Guide: A few notebooks and resources for a hands-on explanation of NLP in Python.
Jay Alammar - The Illustrated Word2Vec: A good reference to understand the famous Word2Vec architecture.
Jake Tae - PyTorch RNN from Scratch: Practical and simple implementation of RNN, LSTM, and GRU models in PyTorch.
colah’s blog - Understanding LSTM Networks: A more theoretical article about the LSTM network.