AI Practitioner Exam - Overarching

  

AI Practitioner Exam Prep

Overarching



Overall AI Abbreviations
AI = Artificial Intelligence
ML = Machine Learning
DL = Deep Learning
GenAI = Generative AI

Mathematical Subsets/Layers of AI
Five concentric circles, from broadest to narrowest:
AI = broad field (includes rule engines) where computers mimic human behavior or reasoning.
  ML = computer learns from data without explicit programming
    Neural Nets = technical family of learned models inspired by the human brain
      DL = neural networks with many layers; good for vision, audio, and language
        Gen AI = large, pretrained, multi-purpose base models that generate content

AI
   AI = broad field (includes rule engines) where computers mimic human behavior or reasoning.

ML
   ML or Traditional ML = computer learns from data without explicit programming. Each model can perform only one task. To succeed, the model must be carefully trained on the data.
   ML model = makes predictions or decisions. Data is used to create the ML model by an algorithm that tokenizes or vectorizes the data into vectors in a vector space.
   
   Learning Types: 
       A) Reinforcement Learning ML = AI tries a task, gets a score, and learns from feedback of rewards or penalties. Good for when the good/bad outcome is known but the path is not; the AI must try many attempts to learn. ex: race car game.

       B) Supervised Learning ML = trained on labeled data so it can handle new, unseen data. ex: data = car pictures labeled by maker and model; the AI is then given a new image and must identify the car's maker and model. Types are classification and regression. Example algorithms are linear learner, factorization machines, XGBoost, and KNN. Classification metrics are accuracy, precision, recall, F1, and AUC-ROC. Regression metrics are mean squared error and R-squared.
          Types:
          a) Classification Supervised ML = data is labeled with a known class or category. ex: data = car pictures labeled by maker and model.
          b) Regression Supervised ML = for continuous numbers where past numbers inform the future answer. ex1: weather forecasting, where data is historical and the AI must calculate future weather. ex2: home pricing, where the AI must predict a home's future price from historical data.

       C) Unsupervised Learning ML = starts with unlabeled data and looks for patterns within it. Types are anomaly detection, clustering, dimensionality reduction, embeddings, and topic modeling. Example algorithms are K-Means, LDA, Object2Vec, Random Cut Forest, IP Insights, and PCA.
          Types:
          a) Clustering Unsupervised ML = assigns data to a group with other similar, unlabeled data.
          b) Dimensionality Reduction Unsupervised ML = removes irrelevant dimensions/features from the vector data to shrink the data and its complexity while retaining the most important dimensions.
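The supervised-classification idea above (labeled training data, then predict a label for unseen data) can be sketched in plain Python with a minimal 1-nearest-neighbor classifier. The data is made up for illustration; this is a sketch of the concept, not a production algorithm.

```python
# Toy supervised classification: minimal 1-nearest-neighbor classifier.
# Training data is (features, label) pairs; prediction = label of closest point.

def distance(a, b):
    # Euclidean distance between two feature vectors
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def predict(training, new_point):
    # Return the label of the closest labeled training example
    nearest = min(training, key=lambda item: distance(item[0], new_point))
    return nearest[1]

# Hypothetical labeled data: [length, width] of a toy object
training = [([1.0, 1.1], "small"), ([5.0, 4.8], "large"), ([1.2, 0.9], "small")]
print(predict(training, [1.1, 1.0]))  # -> small
print(predict(training, [4.9, 5.1]))  # -> large
```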

   Hybrid Learning Types: 
       A) Semi Supervised ML = mostly unlabeled data, except for a few labeled examples.
       B) Self Supervised ML = unlabeled data; the model generates its own labels and trains using supervised learning algorithms.
       C) Multi-Instance ML = unlabeled individual data, but labeled group data.

   Statistical Inference Types: 
       A) Inductive ML = using evidence to determine an outcome. Builds a general model to predict future, unseen data.
       B) Deductive ML = applying general rules to reach specific outcomes.
       C) Transductive ML = predicts specific labels for fixed set of unlabeled data by using both labeled training data and distribution of the unlabeled test data. Optimizes for performance of specific dataset.

  ML Process: Training Data (labeled or not; structured or not) => ML algorithm => Model 
  ML Creation Steps: 1) Get data, 2) ML (supervised, unsupervised, reinforced), 3)  Inferencing (batch or real-time)
  ML Model Evaluation: with training set, validation set, and test set. 
  Feature engineering transforms data into features or inputs that will be valuable for the model. 
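The model-evaluation split above (training set, validation set, test set) can be sketched in plain Python. The 70/15/15 ratio here is illustrative, not a rule from the exam guide.

```python
# Sketch of splitting a dataset into training, validation, and test sets.
import random

def split_dataset(data, train_frac=0.7, val_frac=0.15, seed=42):
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)   # shuffle a copy, deterministically
    n = len(shuffled)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

data = list(range(100))                  # stand-in for 100 records
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))   # -> 70 15 15
```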

DL
   Deep Learning = uses neural networks. Good for vision and audio. Inspired by human brain.

GenAI
   GenAI = runs on pre-trained FMs that can run multiple tasks.


Challenges to GenAI and ML
   Challenges are a) bias and b) variance (sensitivity to noise, i.e., overfitting). This leads to the bias-variance tradeoff. A model can be underfitted, overfitted, or balanced. Balanced is ideal, with low variance and low bias. To overcome bias and variance errors, you should a) cross-validate, b) increase data, c) use regularization, d) use simpler models, e) apply dimensionality reduction, f) stop training early (early stopping) so the model does not memorize the training data.
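Cross-validation (item "a" above) can be sketched in plain Python: split the data into k folds, hold one fold out for validation, train on the rest, and rotate. The data here is made up for illustration.

```python
# k-fold cross-validation sketch: each fold takes a turn as the validation set.

def k_folds(data, k):
    # Yield (train, validation) pairs, one per fold
    fold_size = len(data) // k
    for i in range(k):
        start, end = i * fold_size, (i + 1) * fold_size
        validation = data[start:end]
        train = data[:start] + data[end:]
        yield train, validation

data = list(range(10))
for train, validation in k_folds(data, k=5):
    print(len(train), len(validation))   # -> 8 2 (printed five times)
```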


Gen AI
  Capabilities/Attributes: a) adaptable, b) creative or explore, c) data efficiency, d) personalize, e) responsive, f) scalable, g) simplify

  Challenges: a) regulatory violations, b) social risks, c) privacy concerns, d) toxicity, e) hallucination (not consistent with training data), f) incorrect conclusions, g) nondeterminism, h) intellectual property, i) plagiarism and cheating, and j) work disruption.

  Selection Factors: a) tasks and use cases for different 3rd party FMs, b) performance, c) capabilities, d) constraints, e) compliance.

  Business Metrics: a) user satisfaction, b) avg. revenue per user, c) conversion rate, d) cross-domain performance, e) efficiency, f) cost savings, g) time savings, h) quality improvement, i) productivity gains.

  Approaches: a) process automation, b) augmented decision making, c) personalization and stabilization, d) creative content generation, e) exploratory analysis.

  Prompt: composed of instructions, context, input data, and output indicator.
    Parameters for randomness and diversity influence the variation in generated responses by limiting the outputs. These parameters are: temperature, top P, top K.
    Strategies are ART, CoT, few-shot, RAG, ReAct, self-consistency, ToT, and zero-shot.
    Templates increase efficiency, consistency, and scalability. Good if must do in future.
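How the randomness parameters above restrict the output can be sketched in plain Python. The token probabilities are made up for illustration; real models work over full vocabularies.

```python
# Sketch of temperature, top-K, and top-P over a made-up next-token distribution.
import math

probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "zebra": 0.05}

def apply_temperature(probs, temperature):
    # Lower temperature sharpens the distribution; higher flattens it.
    scaled = {t: math.exp(math.log(p) / temperature) for t, p in probs.items()}
    total = sum(scaled.values())
    return {t: v / total for t, v in scaled.items()}

def top_k(probs, k):
    # Keep only the k most probable tokens, then renormalize.
    kept = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])
    total = sum(kept.values())
    return {t: v / total for t, v in kept.items()}

def top_p(probs, p):
    # Keep the smallest set of top tokens whose cumulative probability >= p.
    kept, cumulative = {}, 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    return {t: v / total for t, v in kept.items()}

print(top_k(probs, 2))    # only "cat" and "dog" survive
print(top_p(probs, 0.9))  # "cat", "dog", "fish" (0.5 + 0.3 + 0.15 >= 0.9)
```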

  FMs:
    FM Creation Steps: 0) decide use case (parts are: name, actors, preconditions, basic flow, alternative flow, postconditions, business rules, assumptions, requirements, notes), 1) get data (unlabeled) or select FM, 2) pretrain with SSL (if not a FM), 3) optional improve data (further pre-train on non-FM data) or prompt engineering, 4) optional optimize (see below), 5) optional evaluation with metrics or benchmarks, 6) deploy, 7) use to create answer.  
    FM Types: a) LLMs (such as diffusion models and multimodal models), b) GANs, c) VAEs
    FM Optimization Approaches: a) prompt engineering, b) RAG, c) fine tuning by instructions, or d) fine tuning by RLHF.


Abbreviations in AI
ART = Automatic Reasoning and Tool-use
ATLAS = Adversarial Threat Landscape for AI Systems
BERT score metric = Bidirectional Encoder Representations from Transformers
BLEU metric = Bilingual Evaluation Understudy metric
CNN = Convolutional Neural Networks
CoT = Chain-Of-Thought prompting
GANs = Generative Adversarial Networks
GLUE benchmark = General Language Understanding Evaluation benchmark
GPT = Generative Pre-trained Transformers
FM = Foundational Models
IDP = Intelligent Data Processing
LLM = Large Language Models
NLP = Natural Language Processing (NL = Natural Language)
NTM = Neural Topic Modeling algorithm 
PDP = Partial dependence plots
PRA = Privacy Reference Architecture
RAG = Retrieval-Augmented Generation 
RL = Reinforcement Learning 
RLHF = Reinforcement Learning from Human Feedback
ROUGE metric = Recall-Oriented Understudy for Gisting Evaluation metric
SLM = Small Language Model
SQuAD benchmark = Stanford Question Answering Dataset benchmark
SSL = Self-Supervised Learning
ToT = Tree of thoughts prompting
VAEs = Variational Auto Encoders
WMT benchmark = Workshop on Machine Translation benchmark

AI Term Definitions
Amplified Decision Making = helps humans make decisions in stressful times
Augmentation of Dataset = create more data for underrepresented. Purpose: fixes bias.
Batch Size = Number of training samples processed in one forward or backward pass before updating the model parameters.  Small batches give faster iterations and generalization, while large batches improve stability and GPU efficiency.
Batch Inferencing = start with lots of data and time.  Focus on all data. For slow decisions with historical understanding.  good for data analysis reports.
BERT score metric = Scores semantic similarity (using cosine) between two sentences. Good for analyzing text generation tasks or filling in missing words.
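The cosine similarity behind BERTScore-style comparisons can be computed in plain Python. The embedding vectors here are made up for illustration.

```python
# Cosine similarity between two embedding vectors: dot product divided by
# the product of their lengths. 1 = same direction, 0 = unrelated, -1 = opposite.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)

v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 4.0, 6.0]    # same direction as v1 -> similarity ~1.0
v3 = [-3.0, 0.5, 1.0]
print(cosine_similarity(v1, v2))  # -> 1.0 (approximately)
print(cosine_similarity(v1, v3))
```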
BLEU metric = score for how well the AI translation did against human translation samples.
CNN = deep learning method to process images into grids using a neural network with filters (e.g., a 3 x 3 block of pixels) that handle pattern recognition. Filters increasingly cover larger parts of the picture and become more abstract. Good for computer vision tasks, including image classification, OCR, object detection, and medical imaging. 
CoT prompts = user prompts that ask the AI to show its reasoning step by step.
Curating a Dataset = labeling the dataset
Data efficiency = could start with little data and create a lot of data
Diffusion models = in LLM. types: forward and reverse diffusion
"Embeddings" = are numbers for tokenswhere semantically similar tokens have similar vectors.
"Embedding"/"Transformation"/"Vectorization" process = the algorithm that creates the math vector.
Emergent = at large scales, these models develop skills that are not explicitly programmed into them.
Epochs = Neural networks. One epoch = every training sample was processed by model once. 
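How epochs and batch size relate can be sketched in plain Python: one epoch = one pass over all samples, processed batch_size samples at a time, with one parameter update per batch. The counts here are illustrative.

```python
# Epoch/batch bookkeeping sketch: count parameter updates across training.

def batches(samples, batch_size):
    # Yield the samples in consecutive chunks of batch_size
    for i in range(0, len(samples), batch_size):
        yield samples[i:i + batch_size]

samples = list(range(1000))   # stand-in for 1000 training samples
batch_size = 100
epochs = 3

updates = 0
for epoch in range(epochs):
    for batch in batches(samples, batch_size):
        updates += 1          # one parameter update per batch

print(updates)  # -> 30 (10 batches per epoch x 3 epochs)
```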
Exposure in prompt = PII or privacy issues in the output
"Feature" in ML = dataset property or characteristic used as ML models input to make predictions. Almost the same as "dimension". ex = square feet, actual price, asking price, etc.
Few-shot prompts = user prompts that give the AI a few examples as context.
"Fine tuning" = improve pre-trained language model using labeled data (ex: industry-specific data). Most important task for fine tuning is labeling with accurate and relevant labels. Types are instruction tuning, RHLF, adapting models for specific domains, transfer learning, and continuous pretraining. 
Forward Diffusion Model = hiding by adding noise till unrecognizable. Why? A: Maybe showing transform frame by frame. Similar to encryption.
FMs = large data models that are pre-trained on many possible types of data; general purpose and adapted to do multiple tasks.
GANs = generator and discriminator compete against each other in zero-sum game. creates synthetic data.
"Generalization" = model's ability to apply knowledge from training on new unseen data. 
Generative Pre-trained Transformers = start with NLP text and can generate natural language, SQL, or other output.
Hijacking a Prompt = Enter AI prompt that gives bad suggestions then publish about AI's responses.
Hyperparameters = are human-defined settings that make an AI model tick.
IDP = extracts and classifies unstructured data in docs. gives summaries and actionable insights.
Image Processing = processes computer vision (image or video) and time series (video frames, satellite photos, etc.). Areas are image classification, object detection, and semantic segmentation.
Inference = process of when model analyzes new data to decide the new output.
"Jailbreaking" a Prompt = Enter AI prompt that gives criminal or evil suggestions.
"Labeling" = id and tag data with meaningful labels of the contents of each data piece 
"Learning Rate" = compares multiple trials to see improvement rate.
LLM = large language models are a subset of FM. ex: Claude, Chat GPT, etc.
Logistic regression = Algorithm to classify. Estimates probability that input is in category.
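The logistic regression idea above can be sketched in plain Python: squash a weighted sum of the inputs through the sigmoid to get a class probability. The weights and bias here are made-up illustrative values, not trained ones.

```python
# Logistic regression inference sketch: weighted sum -> sigmoid -> probability.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(features, weights, bias):
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return sigmoid(z)

p = predict_probability([2.0, 1.0], weights=[0.8, -0.4], bias=-0.5)
print(p)                                      # probability of positive class
print("positive" if p >= 0.5 else "negative")
```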
"Masking" of Input = intentionally hiding parts of the input, forces models to understand context
"Model" in AI = a trained software program or algorithm designed to recognize patterns, categorize, make predictions, or generate outputs by analyzing data. Created by humans and computers. Humans select data to show it, tell the purpose, send in rule parameters, tag some of the labeled data, and validate the output. Computers take the untrained algorithm to find patterns in the data, creates the learning parameters/weights, and self optimize in ML.
Multi-Modal = uses multiple data types such as text, images, audio, video, and computer code.
Multi-Modal Embedding = uses multiple data types embedding them into a shared space. search focus.
Multi-Modal Generation = uses multiple types to create new content.
Neural networks = like brain. takes input then runs through hidden layer and outputs the answer.
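The input -> hidden layer -> output flow above can be sketched in plain Python with one hidden layer. The weights are made-up illustrative numbers, not trained values.

```python
# Minimal neural network forward pass: input -> hidden layer (ReLU) -> output.

def relu(x):
    return max(0.0, x)

def layer(inputs, weights, biases, activation):
    # Each neuron: weighted sum of inputs + bias, then activation function
    return [activation(sum(i * w for i, w in zip(inputs, ws)) + b)
            for ws, b in zip(weights, biases)]

inputs = [1.0, 2.0]
hidden = layer(inputs, weights=[[0.5, -0.2], [0.3, 0.8]],
               biases=[0.1, -0.1], activation=relu)
output = layer(hidden, weights=[[1.0, -1.0]], biases=[0.0],
               activation=lambda x: x)   # linear output layer
print(output)  # -> [-1.6]
```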
Parameters = are the learned (by computer refining) settings that make an AI model tick.
PDP = visualize the plot of one thing like age affects a second thing like income. good for transparency.
Perplexity metric = metric for language models, but not for text summaries.
"Point" in AI = exact coordinates on array with a number for each of the dimensions.
Prompt Engineering = improving the prompt with strategies such as ART, CoT, few-shot, RAG, ReAct, self-consistency, ToT, zero-shot.
Prompt Leaking = having the prompt ask about the model's instructions
RAG = looks up a few relevant docs to get the answer. Parts are a retrieval system and generative AI. ex1: chatbots that look up company data. ex2: AI agents using CRM employee data to arrive at an answer. ex3: legal analysis. ex4: health care answers.
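RAG's retrieval step can be sketched in plain Python: score documents against the question and feed the best match to the generative model as context. Real systems use embedding similarity; keyword overlap keeps the sketch simple. The documents and question are made up for illustration.

```python
# RAG retrieval sketch: pick the document most relevant to the question,
# then build the prompt that would be sent to the generative model.

docs = {
    "vacation": "Employees accrue 15 vacation days per year.",   # made-up policy text
    "expenses": "Submit expense reports within 30 days.",
    "remote":   "Remote work requires manager approval.",
}

def retrieve(question, docs):
    # Score each doc by how many question words it shares; return the best
    q_words = set(question.lower().split())
    def overlap(text):
        return len(q_words & set(text.lower().split()))
    return max(docs.values(), key=overlap)

question = "How many vacation days do employees get?"
context = retrieve(question, docs)
prompt = f"Context: {context}\nQuestion: {question}"   # what the LLM would see
print(context)
```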
Real-Time Inferencing = given real-time interactive data (such as self-driving cars or missile defense systems). Focus on recent data that arrived. For fast decisions to new conditions. Older data just for background.
Residual Neural Network = uses images and tries to "skip connections" method instead of CNN.
Reverse Diffusion Model = revealing. remove noise until clear image. Similar to decryption.
ROUGE metric = evaluates text summarization systems.
R-Squared score = calcs the variance proportion in the dependent variable explained by the model. 1 is perfect prediction, 0 is worst, rest is between.
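The R-squared score above can be computed directly from its definition: 1 minus (residual sum of squares / total sum of squares). The values are made up for illustration.

```python
# R-squared sketch: 1 = perfect prediction, 0 = no better than the mean.

def r_squared(actual, predicted):
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual SS
    ss_tot = sum((a - mean) ** 2 for a in actual)                  # total SS
    return 1 - ss_res / ss_tot

actual = [3.0, 5.0, 7.0, 9.0]
print(r_squared(actual, actual))                # perfect prediction -> 1.0
print(r_squared(actual, [4.0, 5.0, 6.0, 9.0])) # imperfect -> 0.9
```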
SLM = Rare. Used for edge devices.
SSL = What is it? A1: models learn from raw data by predicting masked (hidden) parts of a sentence or image. A2: makes use of the structure within the raw data to autogenerate labels.
Support Vector Machine = classifies tasks on tabular data. good for high dimensions on limited or small datasets.
Temperature of AI Output = randomness or creativity of the AI output. High temp = more random or creative.
Text Analysis = processes text and speech.  Areas are text classification, Word2Vec, machine translation, and topic modeling. Example algorithms are BlazingText, Sequence to Sequence, LDA,  and NTM. 
TF-IDF = statistical measurement of importance of a word within a document. Lessens filler words like "the" or "an".
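TF-IDF can be computed in plain Python from its definition: term frequency times inverse document frequency. A filler word like "the" that appears in every document scores zero. The toy corpus is made up for illustration.

```python
# TF-IDF sketch over a toy corpus of tokenized documents.
import math

docs = [
    ["the", "car", "is", "fast"],
    ["the", "house", "is", "big"],
    ["the", "car", "needs", "repair"],
]

def tf_idf(term, doc, docs):
    tf = doc.count(term) / len(doc)                   # term frequency in this doc
    doc_freq = sum(1 for d in docs if term in d)      # docs containing the term
    idf = math.log(len(docs) / doc_freq)              # inverse document frequency
    return tf * idf

doc = docs[0]
print(tf_idf("the", doc, docs))   # -> 0.0 ("the" appears in every doc)
print(tf_idf("fast", doc, docs))  # higher: distinctive to this document
```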
"Token" in AI = chopped up individual pieces.  ex: words in a sentence. 
Top K of AI Output = limits output to top K of probability words.
Top P of AI Output = limits output to top percentage of probability words.
Training in ML = iterative teaching a ML model to find patterns, make decisions, or generate content.
Transfer Learning = takes existing pre-trained model on supervised task and then fine tunes.
Zero-shot prompts = user prompts with no examples; the AI must handle a task it was not shown examples of.


Cost Considerations
  Cost factors are responsiveness and availability, redundancy and regional coverage, performance, token-based pricing, provisioned throughput, and refining your custom models.

Responsible AI
  Aspects: fairness, explainability, privacy\security (theft and exposure risk), veracity\robustness (operates well despite uncertainty), governance (max society benefit with min risk), interpretability\transparency, safety, and controllability.
  Business Benefits: trust, regulatory compliance, mitigated risks, competitive advantage, improved decision making, and improved products. 
  Model Selection: a) Narrow the use case so you can tune your model to your use case. Ex: favor recall or precision, b) pick by performance with some test data sets, c) responsible agency, d) environmental reasons, e) economical reasons.
  Dataset Prep: want balanced dataset so inclusive and diverse in data collection, curating by 1) preprocessing, 2) augmentation, and 3) regular auditing.
  Model Tradeoffs: a) interpretability vs. performance (wrong - see Lonestar), b) safety vs. transparency (wrong - see open source), c) control over the model.
  Human Centered Design: amplified decision making, unbiased decision making, RLHF.

Benchmarking Datasets
  Way 1: Human evaluation: 1) human SMEs create questions, 2) context is identified, 3) answers are created. 
  Way 2: LLM-as-judge: an LLM grades answers by comparing them against benchmarking datasets. 

Security and Compliance in AI
  AWS supports 143 security standards and compliance certifications, such as GDPR, HIPAA, ISO, PCI DSS, etc.
   Security scopes: Consumer app, Enterprise app, Pre-trained models, Fine-tuned models, and self-trained models.

OWASP Top 10 AI Security risks
  Prompt injection: Malicious user inputs that can manipulate the behavior of a language model
  Insecure output handling: Fail to properly sanitize or validate model outputs.
  Training data poisoning: Introducing bad data into a model's training set, so bad behaviors.
  Model denial of service: Exploits vulnerabilities in a model's architecture to disrupt its availability
  Supply chain vulnerabilities: Weaknesses in the software, hardware, or services to build a model.
  Sensitive info disclosure: Leak sensitive data through model outputs or other unintended channels
  Insecure plugin design: Flaws in the optional model components that can be exploited
  Excessive agency: Grants a model too much autonomy or capability.
  Overreliance: Over-dependence on a model's capabilities.
  Model theft: Unauthorized access or copying of a model's parameters or architecture.

  Overall the risks are fake content, prompt injection, and AI model weaknesses.
  Overall to secure data, you should control user access to the data and ensure data integrity.

Data Governance 
  Strategies: data quality and integrity, data protection and security, data lifecycle mgmt., responsible AI, governance structures and roles, and data sharing and collection.
  Approaches: policies, review cadence, review strategies, transparency, and team training standards.

Misc
   In multi-step tasks, AI agents are important in task coordination such as task sequence. 
   Storing vector data (such as embeddings from custom ML models) is a good fit for OpenSearch Service, since it is a fully managed service that supports vector data types for storing and querying embeddings efficiently.
   Model pruning is reducing model size and complexity.
