AWS Certification Prep - General Terms
General Misc Abbreviations
APIs = Application Programming Interfaces
docs = documents
ex = example
GPU = Graphics Processing Unit
IDE = Integrated Development Environment
ISV = Independent Software Vendor
NIST = National Institute of Standards and Technology
OWASP = Open Web Application Security Project
PII = Personally Identifiable Information
SME = Subject Matter Expert

General Terms
Bias = systematic error caused by missing or unrepresentative training data. Types are selection/sampling bias (unrepresentative data), interaction/participant bias (users withhold information, skewing responses), reporting/measurement bias (over-represents one group), confirmation/automation bias (reinforcing existing stereotypes, or over-trusting model output), and historical/societal bias (data reflects the prejudices of the historical setting in which it was collected).
Decision trees = CASE statement algorithm. Good for transparency.
Explainability = explains WHY a model produced a result, after the fact. Treats the model as a black box. Good for debugging and troubleshooting.
Interpretability = explains HOW a model makes decisions, before it runs. Treats the model as a white box (transparency of the model). Good for accountability and building trust.
Latency = time to get a response to a request. Real-time applications require low latency.
Nondeterminism = running the model with the same input data can produce different outputs.
Overfitting = the model works on the training data but not on the evaluation data, because it memorizes the data it has seen and fails to generalize to unseen examples.
Underfitting = the model performs poorly even on the training data, because it is too simple to capture the important features of the data, and so also fails to generalize to unseen examples.
Variance = sensitivity to noise or overfitting.

General Math Abbreviations
AUC = Area Under the Curve
GLM = Generalized Linear Models
KNN = K-Nearest Neighbors
LDA = Latent Dirichlet Allocation algorithm 
MSE = Mean Squared Error
PCA = Principal Component Analysis algorithm 
RMSE = Root Mean Squared Error
ROC = Receiver Operating Characteristic curve
TF-IDF = Term Frequency-Inverse Document Frequency
XGBoost = eXtreme Gradient Boosting

Math Terms
"Accuracy" in Classification = correct predictions / total predictions
Classification = data divided into categories
"Dimension" in ML = a dataset property or characteristic expressed as a mathematical dimension in vector math. Almost the same as "feature". Ex: the 3rd number in an array.
F1 Score in Classification = harmonic mean of precision and recall. F1 = 2 x (Precision x Recall) / (Precision + Recall).
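As a quick check of the F1 formula above, a minimal Python sketch (the precision and recall values are made up for illustration):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * (precision * recall) / (precision + recall)

# Illustrative values only: precision = 0.8, recall = 0.5
print(f1_score(0.8, 0.5))  # about 0.615
```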
Generalized Linear Models = a family of algorithms that extend linear regression to non-normal data.
Example regressions are: Logistic, Poisson, Gamma, and Tweedie.
Linear regression = algorithm that finds the "best-fit line" minimizing the difference between actual values and the model's predictions. Disadvantage: does not work well with non-normal data.
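A minimal sketch of fitting a best-fit line with NumPy's `polyfit` (the data points here are invented for illustration):

```python
import numpy as np

# Invented example data, roughly following y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# Fit a degree-1 polynomial, i.e. a straight line y = slope * x + intercept
slope, intercept = np.polyfit(x, y, deg=1)
predictions = slope * x + intercept  # the model's predictions along the line
```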
Precision in Classification = true positives / (true positives + false positives)
Recall in Classification = true positives / (true positives + false negatives)
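The two formulas above, plus accuracy, computed from an invented confusion matrix (the TP/FP/FN/TN counts are made up for illustration):

```python
# Made-up confusion-matrix counts for a binary classifier
tp, fp, fn, tn = 40, 10, 20, 30

precision = tp / (tp + fp)                  # 40 / 50 = 0.8
recall = tp / (tp + fn)                     # 40 / 60, about 0.667
accuracy = (tp + tn) / (tp + fp + fn + tn)  # 70 / 100 = 0.7
```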
Regularization = penalizes extreme weight values to help prevent linear models from overfitting.
Root Mean Squared Error = standard metric for regressions. Averages the squared differences between predictions and actuals, then takes the square root.
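A short sketch of the RMSE calculation in plain Python (the actual/predicted values are made up):

```python
import math

# Invented example values
actual = [3.0, 5.0, 2.0]
predicted = [2.5, 5.5, 2.0]

# Mean of the squared differences, then the square root
mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
rmse = math.sqrt(mse)  # about 0.408
```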
Vectors = numerical N-dimensional arrays. Ex: the X and Y coordinates of a point form a 2-D vector.
Vector Space = the cloud of points formed by all of the vectors.
VAEs = Variational Autoencoders. An encoder and decoder where the latent space follows a probability distribution such as a Gaussian.
XGBoost = gradient-boosted trees algorithm. Good at regression, classification, and ranking. Works well on tabular data.