NLP - Neural Networks

date
Aug 22, 2024
type
Post
AI summary
The document discusses supervised learning in machine learning, where models are trained on labeled datasets to predict outputs. It explains neural networks, their structure, including input, hidden, and output layers, and the process of weight adjustment using gradient descent. Additionally, it covers deep learning as a subset of machine learning that utilizes multi-layer neural networks for complex tasks, with a focus on PyTorch for implementation and includes a code example for a simple neural network model.
slug
nlp-supervised-learning
status
Published
tags
NLP
summary
Discusses neural networks, their structure, including input, hidden, and output layers, and the process of weight adjustment using gradient descent. Additionally, this post covers deep learning as a subset of machine learning that utilizes multi-layer neural networks for complex tasks, with a focus on PyTorch for implementation and includes a code example for a simple neural network model.

Supervised Learning

Supervised Learning is a type of machine learning where the model is trained on a labeled dataset. In this approach, each training example consists of an input paired with the correct output (label). The model learns to map inputs to the correct outputs by finding patterns in the data. During training, the model makes predictions and is corrected based on the actual labels, which allows it to improve over time.
Common examples of supervised learning tasks include classification (e.g., determining whether an email is spam or not) and regression (e.g., predicting house prices based on features like size and location). The goal is for the model to generalize from the training data so that it can accurately predict outcomes for new, unseen data.
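As a tiny illustration of what "an input paired with the correct output" looks like in code, here is a made-up spam dataset as PyTorch tensors (the feature names and values are invented for the example):

```python
import torch

# Toy labeled dataset for spam classification (features and labels are made up):
# each row describes one email as [num_links, num_exclamation_marks, length_kb],
# and each label is the correct output: 1 = spam, 0 = not spam.
X = torch.tensor([[12.0, 9.0, 1.2],
                  [ 1.0, 0.0, 4.5],
                  [ 7.0, 5.0, 0.8],
                  [ 0.0, 1.0, 3.1]])
y = torch.tensor([1.0, 0.0, 1.0, 0.0])

# During training, the model predicts a label for each row of X and is
# corrected by comparing its prediction against the true label in y.
```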

Neural Networks

Neural Networks are a class of machine learning models inspired by the structure and function of the human brain. They consist of layers of interconnected nodes, or "neurons," which process and transmit information.

Neuron/Nodes

Each neuron receives inputs, applies a weight to each input, sums them up, passes the result through an activation function, and produces an output. This process is repeated across multiple layers, allowing the network to learn complex patterns in the data.
Examples of activation functions:
  • Sigmoid: σ(z) = 1 / (1 + e^(−z)), squashes the output into (0, 1)
  • Tanh: tanh(z), squashes the output into (−1, 1)
  • ReLU: max(0, z), zeroes out negative inputs
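Putting the pieces together, here is a minimal sketch of a single neuron's computation in PyTorch; the input, weight, and bias values are arbitrary:

```python
import torch

x = torch.tensor([0.5, -1.0, 2.0])   # inputs to the neuron
w = torch.tensor([0.8,  0.3, -0.5])  # one weight per input
b = torch.tensor(0.1)                # bias term

z = torch.dot(w, x) + b  # weighted sum of the inputs plus bias
a = torch.sigmoid(z)     # pass the sum through an activation function
print(z.item(), a.item())
```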

Layers

  1. Input Layer: This layer receives the raw input data (e.g., features of an image, numerical data, etc.). Each node in this layer corresponds to one feature of the input.
  2. Hidden Layers: These intermediate layers process the input data through weighted connections and activation functions. The hidden layers enable the network to learn complex, non-linear relationships in the data.
  3. Output Layer: The final layer produces the output of the network, which could be a single value (in regression tasks) or a set of probabilities (in classification tasks).
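A minimal sketch of how these three kinds of layers fit together in PyTorch; the layer widths and the 3-class output are illustrative assumptions:

```python
import torch.nn as nn

model = nn.Sequential(
    # Input layer -> first hidden layer: one input node per feature (10 here).
    nn.Linear(10, 32),
    nn.ReLU(),
    # Second hidden layer: learns non-linear combinations of earlier outputs.
    nn.Linear(32, 16),
    nn.ReLU(),
    # Output layer: 3 nodes, e.g. class scores for a 3-class classification task.
    nn.Linear(16, 3),
)
```

Note that PyTorch has no explicit "input layer" object; the input layer is simply the feature vector fed into the first Linear layer.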

Gradient Descent

Example loss function: Mean Squared Error (MSE)
MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)², where yᵢ is the true label and ŷᵢ is the model's prediction for the i-th training example.
For each weight, compute the gradient of the loss function with respect to that weight:
The gradient of each weight is a function of the incoming activation values and the weight applied to each incoming value, computed recursively from the output layer back toward the input (backpropagation). Each weight is then nudged in the direction that reduces the loss, w ← w − η · ∂L/∂w, where η is the learning rate. Each weight adjustment can be computed separately, allowing for parallelization on GPUs.
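To make this concrete, here is a minimal sketch of gradient descent on an MSE loss in PyTorch, using autograd to compute the gradients; the toy data, shapes, and learning rate are illustrative:

```python
import torch

# Toy labeled data: 8 examples with 3 features each, one target per example.
X = torch.randn(8, 3)
y_true = torch.randn(8, 1)

# A single layer of weights and a bias, tracked by autograd.
w = torch.randn(3, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

learning_rate = 0.1
for step in range(100):
    y_pred = X @ w + b                      # forward pass
    loss = ((y_pred - y_true) ** 2).mean()  # MSE loss
    loss.backward()                         # backpropagation fills w.grad, b.grad
    with torch.no_grad():
        w -= learning_rate * w.grad         # gradient descent: w <- w - lr * dL/dw
        b -= learning_rate * b.grad
        w.grad.zero_()                      # reset gradients for the next step
        b.grad.zero_()
```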

Deep Learning - PyTorch

Concepts

Deep Learning is a specialized subset of machine learning that uses neural networks with many layers (hence "deep") to learn from large amounts of data. Deep learning models, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), excel at automatically extracting features and learning complex patterns from raw data. This makes them particularly powerful for tasks like computer vision, natural language processing, and game playing.
In PyTorch, a neural network is a collection of modules.
  • Each module is a layer that stores weights and activations (if any)
  • Each module knows how to update its own weights (if any)
  • Each module keeps track of which modules pass it information (forming a chain)
  • Each module knows how to distribute the loss to its child modules
  • PyTorch supports parallelization on GPUs.
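A minimal sketch of these ideas, using a hypothetical TwoLayerNet module composed of two child Linear modules; calling backward() distributes the loss gradient through the chain so each child's parameters receive their own gradients:

```python
import torch
import torch.nn as nn

class TwoLayerNet(nn.Module):
    """A module composed of child modules; each child stores its own weights."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)  # child module with its own weight matrix
        self.fc2 = nn.Linear(8, 1)  # child module with its own weight matrix

    def forward(self, x):
        # The chain: fc1 passes its output to fc2; autograd records this.
        return self.fc2(torch.relu(self.fc1(x)))

net = TwoLayerNet()
loss = net(torch.randn(2, 4)).sum()
loss.backward()  # the loss is distributed backward through the chain of modules

# Each child module's parameters now hold their own gradient.
for name, param in net.named_parameters():
    print(name, param.grad.shape)
```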
Examples of layers in PyTorch include nn.Linear (fully connected), nn.Conv2d (convolutional), nn.RNN and nn.LSTM (recurrent), nn.ReLU and nn.Sigmoid (activations), and nn.Dropout (regularization).
Linear Layer (Affine Transformation):
y = xWᵀ + b, where x is the input, W is the layer's learned weight matrix, and b is its learned bias vector.
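As a quick check of this formula, here is a minimal sketch (the feature sizes are arbitrary): nn.Linear stores W and b as parameters and applies the affine transformation to its input.

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=3, out_features=2)  # W: shape (2, 3), b: shape (2,)
x = torch.randn(5, 3)                             # a batch of 5 inputs

y = layer(x)                                # the layer computes x @ W.T + b
y_manual = x @ layer.weight.T + layer.bias  # the same affine transformation by hand

print(torch.allclose(y, y_manual))  # True
```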

Code Example
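A minimal sketch of a simple neural network model in PyTorch, tying the pieces above together: a hidden layer with a ReLU activation, an MSE loss, and gradient descent via torch.optim.SGD. The layer sizes, learning rate, and toy data are illustrative assumptions, not a definitive implementation.

```python
import torch
import torch.nn as nn

# A simple feed-forward network: input -> hidden layer -> output layer.
class SimpleNet(nn.Module):
    def __init__(self, in_features=10, hidden=32, out_features=1):
        super().__init__()
        self.hidden = nn.Linear(in_features, hidden)
        self.output = nn.Linear(hidden, out_features)

    def forward(self, x):
        x = torch.relu(self.hidden(x))  # hidden layer with ReLU activation
        return self.output(x)           # single output value (regression)

# Toy labeled dataset: 100 examples, 10 features each, one target per example.
X = torch.randn(100, 10)
y = torch.randn(100, 1)

model = SimpleNet()
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(50):
    optimizer.zero_grad()      # clear gradients from the previous step
    y_pred = model(X)          # forward pass
    loss = loss_fn(y_pred, y)  # Mean Squared Error
    loss.backward()            # backpropagation
    optimizer.step()           # gradient descent update of all module weights
```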


© Qiwei Mao 2024