Showing posts from May, 2026

ML Exam - Transformers and LLMs

ML Associate Exam Prep

Transformers and LLMs

Basic Concepts - Tokens and Embeddings

- Tokens = numerical representations of words or parts of words
- A word can consist of one or more tokens
- Punctuation marks (. " ,) are also usually tokens
- Words and tokens can loosely be thought of as the same thing, although strictly speaking they are different
- Embeddings = mathematical representations (vectors) that encode the "meaning" of a token

Evolution of the Transformer Architecture

1. RNNs and LSTMs

Recurrent Neural Networks (RNNs) are AI models designed for sequential data - like text or time series - that process inputs in order using an internal memory.

Long Short-Term Memory (LSTM) networks are a specialized type of RNN created to solve the "vanishing gradient" problem, allowing them to learn long-term dependencies that standard RNNs forget.

For many NLP tasks, RNNs and LSTMs have been made obsolete by Transformers, though they remain rele...
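The token-to-embedding pipeline from the Basic Concepts section can be sketched in a few lines. This is a toy, hand-built example - the vocabulary, the `##` sub-word prefix convention, and the `tokenize` helper are all illustrative, not a real tokenizer API - but it shows the two ideas above: one word can map to several tokens, and each token id looks up one "meaning" vector.

```python
# Toy sketch of text -> token ids -> embedding vectors.
# All names (vocab, tokenize, embeddings) are illustrative, not a real library.
import random

random.seed(0)

# Hand-built vocabulary: "transformers" is split into two sub-word tokens,
# and the period gets its own token, as described in the notes above.
vocab = {"trans": 0, "##formers": 1, "are": 2, "great": 3, ".": 4}

# Embedding table: one small random vector per token id (real models learn these).
dim = 4
embeddings = [[random.uniform(-1, 1) for _ in range(dim)] for _ in vocab]

def tokenize(words):
    """Map each word to one token id, or split unknown words into sub-words."""
    ids = []
    for w in words:
        if w in vocab:
            ids.append(vocab[w])
        else:
            # Crude sub-word fallback: greedily match the longest known prefix,
            # then look up the remainder as a "##"-continuation token.
            for prefix in sorted(vocab, key=len, reverse=True):
                if w.startswith(prefix):
                    ids.append(vocab[prefix])
                    ids.append(vocab["##" + w[len(prefix):]])
                    break
    return ids

ids = tokenize(["transformers", "are", "great", "."])
vectors = [embeddings[i] for i in ids]
print(ids)           # 4 words became 5 token ids: [0, 1, 2, 3, 4]
print(len(vectors))  # 5 embedding vectors, each of length dim = 4
```

Note how the 4-word sentence produces 5 tokens: "transformers" is not in the vocabulary as a whole word, so it is split into "trans" + "##formers", exactly the word-vs-token distinction the notes warn about.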