[Book] [Sebastian Raschka] Build a Large Language Model (From Scratch) [ENG, 2024]
GitHub:
https://github.com/rasbt/LLMs-from-scratch
[Videos] [Sebastian Raschka and Abhinav Kimothi] Master and Build Large Language Models [ENG, 54 Lessons (17h 15m) | 2.91 GB]
https://www.manning.com/livevideo/master-and-build-large-language-models
1.1. Python Environment Setup Video
1.2. Foundations to Build a Large Language Model (From Scratch)
2.1. Prerequisites to Chapter 2
2.2. Tokenizing text
2.3. Converting tokens into token IDs
2.4. Adding special context tokens
2.5. Byte pair encoding
2.6. Data sampling with a sliding window
2.7. Creating token embeddings
2.8. Encoding word positions
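
Below, a minimal Python sketch of the Chapter 2 data pipeline listed above (BPE tokenization with tiktoken, sliding-window input/target sampling, token plus positional embeddings). The class name and hyperparameters are illustrative, not necessarily the book's exact code.

# Sliding-window dataset over GPT-2 BPE token IDs
import tiktoken
import torch
from torch.utils.data import Dataset, DataLoader

class GPTDatasetV1(Dataset):
    def __init__(self, txt, tokenizer, max_length, stride):
        self.input_ids = []
        self.target_ids = []
        token_ids = tokenizer.encode(txt, allowed_special={"<|endoftext|>"})
        # Slide a window of max_length tokens over the text; the target is
        # the same window shifted one token to the right.
        for i in range(0, len(token_ids) - max_length, stride):
            self.input_ids.append(torch.tensor(token_ids[i:i + max_length]))
            self.target_ids.append(torch.tensor(token_ids[i + 1:i + max_length + 1]))

    def __len__(self):
        return len(self.input_ids)

    def __getitem__(self, idx):
        return self.input_ids[idx], self.target_ids[idx]

tokenizer = tiktoken.get_encoding("gpt2")
raw_text = "In the heart of the city stood the old library, a relic from a bygone era."
dataset = GPTDatasetV1(raw_text, tokenizer, max_length=4, stride=4)
loader = DataLoader(dataset, batch_size=2, shuffle=False)
inputs, targets = next(iter(loader))

# Token embeddings plus learned absolute position embeddings (GPT-2 style)
vocab_size, emb_dim = 50257, 256
tok_emb = torch.nn.Embedding(vocab_size, emb_dim)
pos_emb = torch.nn.Embedding(4, emb_dim)
x = tok_emb(inputs) + pos_emb(torch.arange(inputs.shape[1]))
print(x.shape)  # (batch, seq_len, emb_dim)
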
3.1. Prerequisites to Chapter 3
3.2. A simple self-attention mechanism without trainable weights | Part 1
3.3. A simple self-attention mechanism without trainable weights | Part 2
3.4. Computing the attention weights step by step
3.5. Implementing a compact self-attention Python class
3.6. Applying a causal attention mask
3.7. Masking additional attention weights with dropout
3.8. Implementing a compact causal self-attention class
3.9. Stacking multiple single-head attention layers
3.10. Implementing multi-head attention with weight splits
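
A minimal sketch of causal multi-head self-attention with weight splits in the spirit of Chapter 3: one Q/K/V projection each, split into heads, with a causal mask and dropout on the attention weights. Class name and hyperparameters are illustrative, not necessarily identical to the book's implementation.

import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, d_in, d_out, context_length, dropout, num_heads, qkv_bias=False):
        super().__init__()
        assert d_out % num_heads == 0, "d_out must be divisible by num_heads"
        self.d_out = d_out
        self.num_heads = num_heads
        self.head_dim = d_out // num_heads
        self.W_query = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_key = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_value = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.out_proj = nn.Linear(d_out, d_out)
        self.dropout = nn.Dropout(dropout)
        # Upper-triangular mask blocks attention to future positions.
        self.register_buffer(
            "mask", torch.triu(torch.ones(context_length, context_length), diagonal=1)
        )

    def forward(self, x):
        b, num_tokens, _ = x.shape
        # Project once, then split the last dimension into heads.
        q = self.W_query(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.W_key(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.W_value(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)
        attn_scores = q @ k.transpose(2, 3)
        attn_scores.masked_fill_(self.mask.bool()[:num_tokens, :num_tokens], float("-inf"))
        attn_weights = torch.softmax(attn_scores / self.head_dim**0.5, dim=-1)
        attn_weights = self.dropout(attn_weights)
        context = (attn_weights @ v).transpose(1, 2).reshape(b, num_tokens, self.d_out)
        return self.out_proj(context)

x = torch.randn(2, 6, 768)  # (batch, tokens, d_in)
mha = MultiHeadAttention(d_in=768, d_out=768, context_length=1024, dropout=0.1, num_heads=12)
print(mha(x).shape)  # torch.Size([2, 6, 768])
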
4.1. Prerequisites to Chapter 4
4.2. Coding an LLM architecture
4.3. Normalizing activations with layer normalization
4.4. Implementing a feed forward network with GELU activations
4.5. Adding shortcut connections
4.6. Connecting attention and linear layers in a transformer block
4.7. Coding the GPT model
4.8. Generating text
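
A minimal sketch of a pre-LayerNorm transformer block covering the Chapter 4 ingredients: layer normalization, a GELU feed-forward network, and shortcut connections. The book implements its own LayerNorm, GELU, and attention classes; this sketch substitutes PyTorch built-ins, and the sizes are illustrative GPT-2-small values.

import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, emb_dim=768, num_heads=12, context_length=1024, drop_rate=0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(emb_dim)
        self.att = nn.MultiheadAttention(emb_dim, num_heads, dropout=drop_rate, batch_first=True)
        self.norm2 = nn.LayerNorm(emb_dim)
        self.ff = nn.Sequential(            # feed-forward with GELU and 4x expansion
            nn.Linear(emb_dim, 4 * emb_dim),
            nn.GELU(),
            nn.Linear(4 * emb_dim, emb_dim),
        )
        self.drop = nn.Dropout(drop_rate)
        # Boolean causal mask: True marks positions a token may not attend to.
        mask = torch.triu(torch.ones(context_length, context_length, dtype=torch.bool), diagonal=1)
        self.register_buffer("causal_mask", mask)

    def forward(self, x):
        n = x.shape[1]
        shortcut = x                        # shortcut connection around attention
        x = self.norm1(x)
        x, _ = self.att(x, x, x, attn_mask=self.causal_mask[:n, :n], need_weights=False)
        x = self.drop(x) + shortcut
        shortcut = x                        # shortcut connection around the feed-forward net
        x = self.norm2(x)
        x = self.drop(self.ff(x)) + shortcut
        return x

block = TransformerBlock()
print(block(torch.randn(2, 6, 768)).shape)  # torch.Size([2, 6, 768])
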
5.1. Prerequisites to Chapter 5
5.2. Using GPT to generate text
5.3. Calculating the text generation loss: cross entropy and perplexity
5.4. Calculating the training and validation set losses
5.5. Training an LLM
5.6. Decoding strategies to control randomness
5.7. Temperature scaling
5.8. Top-k sampling
5.9. Modifying the text generation function
5.10. Loading and saving model weights in PyTorch
5.11. Loading pretrained weights from OpenAI
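
A minimal sketch of two Chapter 5 ideas: the next-token cross-entropy loss (perplexity is its exponential) and temperature-scaled top-k sampling. The helper names are illustrative, and the model is left out; any module that maps token IDs to logits would plug in here.

import torch

def calc_loss_batch(logits, targets):
    # logits: (batch, seq_len, vocab_size), targets: (batch, seq_len)
    return torch.nn.functional.cross_entropy(logits.flatten(0, 1), targets.flatten())

def generate_next_token(logits, temperature=1.0, top_k=None):
    # logits: (vocab_size,) for the last position of one sequence
    if top_k is not None:
        top_logits, _ = torch.topk(logits, top_k)
        # Mask out everything below the k-th largest logit.
        logits = torch.where(logits < top_logits[-1], torch.tensor(float("-inf")), logits)
    if temperature > 0:
        probs = torch.softmax(logits / temperature, dim=-1)
        return torch.multinomial(probs, num_samples=1)  # sample from the scaled distribution
    return torch.argmax(logits, dim=-1, keepdim=True)   # greedy decoding

vocab_size = 10
logits = torch.randn(2, 4, vocab_size)
targets = torch.randint(0, vocab_size, (2, 4))
print(calc_loss_batch(logits, targets))                 # scalar loss; perplexity = exp(loss)
print(generate_next_token(logits[0, -1], temperature=0.8, top_k=3))
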
6.1. Prerequisites to Chapter 6
6.2. Preparing the dataset
6.3. Creating data loaders
6.4. Initializing a model with pretrained weights
6.5. Adding a classification head
6.6. Calculating the classification loss and accuracy
6.7. Fine-tuning the model on supervised data
6.8. Using the LLM as a spam classifier
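
A minimal sketch of the Chapter 6 adaptation step: freeze the pretrained model, swap the language-modeling output head for a two-class (spam / not spam) classification head, and classify from the last token's logits. TinyLM is a hypothetical stand-in for the loaded GPT model, and the attribute names are illustrative.

import torch
import torch.nn as nn

def add_classification_head(model, emb_dim, num_classes=2):
    for param in model.parameters():
        param.requires_grad = False          # freeze the pretrained weights
    # The new head is trainable by default. (The book additionally leaves the
    # last transformer block and the final layer norm trainable.)
    model.out_head = nn.Linear(emb_dim, num_classes)
    return model

def classify_batch(model, input_ids):
    logits = model(input_ids)                # (batch, seq_len, num_classes)
    # Use the last token's logits: it has attended to the whole input.
    return torch.argmax(logits[:, -1, :], dim=-1)

class TinyLM(nn.Module):                     # stand-in for the pretrained GPT model
    def __init__(self, vocab_size=100, emb_dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.out_head = nn.Linear(emb_dim, vocab_size)
    def forward(self, x):
        return self.out_head(self.emb(x))

model = add_classification_head(TinyLM(), emb_dim=32)
batch = torch.randint(0, 100, (4, 10))       # 4 "texts" of 10 token IDs each
print(classify_batch(model, batch))          # 4 predicted class indices
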
7.1. Preparing a dataset for supervised instruction fine-tuning
7.2. Organizing data into training batches
7.3. Creating data loaders for an instruction dataset
7.4. Loading a pretrained LLM
7.5. Fine-tuning the LLM on instruction data
7.6. Extracting and saving responses
7.7. Evaluating the fine-tuned LLM
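
A minimal sketch of the Chapter 7 data preparation step: formatting one instruction record into an Alpaca-style prompt before batching. The instruction/input/output field names follow the common convention and are illustrative.

def format_input(entry):
    instruction_text = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request."
        f"\n\n### Instruction:\n{entry['instruction']}"
    )
    # The optional input field is appended only when it is non-empty.
    input_text = f"\n\n### Input:\n{entry['input']}" if entry["input"] else ""
    return instruction_text + input_text

entry = {
    "instruction": "Rewrite the sentence in passive voice.",
    "input": "The chef cooked the meal.",
    "output": "The meal was cooked by the chef.",
}
# During training, the target response is appended after the prompt:
prompt = format_input(entry) + f"\n\n### Response:\n{entry['output']}"
print(prompt)
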
U01M01-Python-Environment-Setup-Video
You can preview it on the official site:
https://livevideo.manning.com/module/1820_1_1/master-and-build-large-language-models/chapter-1—understanding-large-language-models/python-environment-setup-video?
$ pip install uv                          # install the uv package/environment manager
$ uv venv --python=python3.10             # create a virtual environment with Python 3.10
$ source .venv/bin/activate               # activate it (on Windows: .venv\Scripts\activate)
$ uv pip install -r requirements.txt      # install the repo's pinned dependencies
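
Optionally, a quick sanity check that the environment is usable (assuming torch and tiktoken are among the packages listed in the repo's requirements.txt):

$ python -c "import torch, tiktoken; print(torch.__version__)"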