AI Exam 1 - SL, UL, + TS

AI Practitioner Exam Prep - SL, UL, + TS

Abbreviations
ARIMA = AutoRegressive Integrated Moving Average
AUC-ROC = Area Under the Curve-Receiver Operating Characteristic
ETS = Error, Trend, Seasonal
GLM = Generalized Linear Models
KNN = K-Nearest Neighbors
L = Labeled Data
MAP = Mean Average Precision
NLP = Natural Language Processing
SL = Supervised Learning
TF-IDF = Term Frequency-Inverse Document Frequency
TS = Time Series
UL = Unsupervised Learning
VAEs = Variational Autoencoders
XGBoost = eXtreme Gradient Boosting

Abbreviation Problem (LDA)
Latent Dirichlet Allocation - UL, NLP. Dirichlet is a lazy (so UL) bible reader that looks through text (so NLP), groups words that tend to appear together into topics, and describes each document as a mix of those topics.
Linear Discriminant Analysis - <<previously defined>>

SL Terms (Predictive Models)
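AUC-ROC = SL. eval metric for binary classification (+ and -). think ROC is the trade-off curve (true positive rate vs. false positive rate across thresholds), AUC is the score (area under that curve). Common in NLP examples: reading docs (NLP) that are labeled (SL), e.g. reviews. 1 = perfect classification, 0.5 = random guessing.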
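
One way to read the AUC score is as the chance that a randomly picked positive example gets a higher score than a randomly picked negative one. A minimal sketch under that reading, where the function name `auc_roc` and the review scores are made up for illustration:

```python
def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a win. labels are 1 (positive) or 0 (negative).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical review data: 1 = positive review, 0 = negative review.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
auc = auc_roc(labels, scores)  # 8 of the 9 pairs are ordered correctly
```

A model that ranks every positive above every negative scores 1; a model that gives everything the same score lands at 0.5.
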
Decision Trees = SL. CASE statement. easy to explain. used for classification and regression.  
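KNN algorithm SL. Classification (common) or, rarely, Regression. classifies a data point by how similar its features are to others (neighbors): the K nearest neighbors vote. Classification answer is a class label; the share of neighbor votes (0 to 1) works as a probability.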
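
The neighbor vote can be sketched in a few lines. This is an illustrative toy, assuming Euclidean distance and a made-up two-feature dataset; `knn_classify` is a hypothetical helper, not a library function:

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """train is a list of (features, label) pairs.

    Returns the majority label among the k nearest neighbors and
    the share of those neighbors that voted for it (0 to 1).
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / len(nearest)


# Made-up training data: two features per point, labels "A" and "B".
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"),
]
label, share = knn_classify(train, (1.1, 1.0), k=3)
```

There is no training step here: the "model" is just the stored data, and all the work happens at prediction time.
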
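Linear regression algorithm SL. Regression. predicts with the "best-fit" line that minimizes the squared diff between predicted and actual output. Disadv: assumes a linear relationship and roughly normal errors, so it fits non-normal responses (counts, probabilities) poorly. Predictions can be negative. Answer is continuous range.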
  GLM = SL. framework that extends Linear regression to non-normal response distributions. GLM regressions: Logistic, Poisson, Gamma, and Tweedie.
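
For the plain linear regression case, the "best-fit" line has a closed-form least-squares answer for a single input. A small sketch, assuming one feature and made-up data; `fit_line` is a hypothetical name:

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on y = 2x - 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
prediction_at_zero = intercept + slope * 0.0  # a negative prediction
```

The prediction at x = 0 comes out negative, which is fine for a continuous target but a problem for counts or probabilities. That is the gap GLMs are meant to cover.
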
Logistic regression algorithm SL. Classification. estimates the probability that an input is in a category (binary outcome) by passing the Log-odds, a weighted sum of the inputs, through the Logistic function. Interpretable: the weights of different variables can be inspected and adjusted based on domain knowledge and expertise. Answer is 0 to 1.
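
The link between log-odds and probability can be shown directly. A minimal sketch with hand-picked weights (assumed for illustration, not fitted to any data):

```python
import math


def sigmoid(z):
    """Logistic function: maps any log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Inverse of the logistic function: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def predict_proba(features, weights, bias):
    """Weighted sum of the inputs (the log-odds) passed through the logistic function."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hand-picked weights: the two features cancel out, so log-odds = 0.
p = predict_proba([2.0, 1.0], weights=[0.5, -1.0], bias=0.0)
```

Log-odds of 0 means even odds, so the predicted probability is 0.5. Raising a weight pushes the log-odds, and so the probability, up for inputs where that feature is large.
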
XGBoost = SL. builds trees sequentially, with each new tree correcting previous errors. Good at regression, classifying, and ranking. Good on tabular data. Gradient boosted trees algorithm. 
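
The "each new tree corrects previous errors" idea can be sketched with plain gradient boosting on one-split trees (stumps). This is a toy under loud assumptions: squared-error loss, one feature, made-up data, and none of the regularization or second-order tricks the actual XGBoost library adds. All function names are hypothetical:

```python
def fit_stump(xs, residuals):
    """Pick the threshold on x that best splits the residuals into two means."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every split leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    return best[1], best[2], best[3]


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Start from the mean, then add stumps fitted to the current errors."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # what is still wrong
        t, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((t, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= t else right_mean)
                 for x, p in zip(xs, preds)]
    return base, stumps, preds


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Made-up tabular data with a jump between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9]
base, stumps, preds = boost(xs, ys)
```

Each round fits a stump to the residuals left by the rounds before it, so the training error keeps shrinking as stumps are added.
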

UL Terms (Pattern Discovery)
K-Means algorithm UL. Clustering. Like a party with no event planning: 1) K = number of leaders for people to cluster around, 2) each data point finds its closest leader, and 3) each leader moves to the Mean = math mean (center) of their group. Repeat 2 and 3 until the leaders stop moving. finds hidden patterns in unlabeled data.
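
The leader steps can be sketched as a short loop. This is an illustrative toy, assuming Euclidean distance, 2D points, and hand-picked starting leaders; `k_means` and `closest_leader` are hypothetical names:

```python
import math


def closest_leader(point, leaders):
    """Index of the leader nearest to this point."""
    return min(range(len(leaders)), key=lambda i: math.dist(point, leaders[i]))


def k_means(points, leaders, max_rounds=100):
    """Repeat: points pick their closest leader, leaders move to the group mean."""
    leaders = list(leaders)
    for _ in range(max_rounds):
        groups = [[] for _ in leaders]
        for p in points:
            groups[closest_leader(p, leaders)].append(p)
        new_leaders = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else leader
            for g, leader in zip(groups, leaders)
        ]
        if new_leaders == leaders:  # nobody moved, so the clusters are stable
            break
        leaders = new_leaders
    assignments = [closest_leader(p, leaders) for p in points]
    return leaders, assignments


# Made-up points forming two obvious groups; K = 2 starting leaders.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
leaders, assignments = k_means(points, leaders=[(0, 0), (10, 10)])
```

No labels are involved at any point: the groups come only from distances. The result does depend on K and on where the leaders start.
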
VAEs = UL. learns a template for data. 1) looks at data, 2) encoder reduces it to min traits (the latent space), 3) variation part adds randomness, 4) decoder rebuilds the data from the template. so learns recipe for the data. good for understanding and cleaning data, working with noisy data, and anomaly detection. Uses Bayesian Inference, Gaussian Distribution, and Kullback–Leibler (KL) Divergence.
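
Two of the pieces named above, the Gaussian randomness and the KL Divergence, have simple forms for a single latent trait. A sketch under stated assumptions: the encoder outputs a mean `mu` and standard deviation `sigma`, the target is a standard normal, and the sampling step uses the common "reparameterization" form (a term not in these notes). The encoder and decoder networks themselves are left out:

```python
import math
import random


def kl_to_standard_normal(mu, sigma):
    """KL divergence from N(mu, sigma^2) to N(0, 1) for one latent dimension."""
    return 0.5 * (mu ** 2 + sigma ** 2 - 1.0 - math.log(sigma ** 2))


def sample_latent(mu, sigma, rng):
    """The 'variation part': z = mu + sigma * noise, with noise drawn from N(0, 1)."""
    return mu + sigma * rng.gauss(0.0, 1.0)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
z = sample_latent(mu=0.3, sigma=0.5, rng=rng)
penalty = kl_to_standard_normal(mu=0.3, sigma=0.5)
```

The KL term is zero only when the encoder's output already matches the standard normal, and it grows as `mu` drifts from 0 or `sigma` drifts from 1. In training it is added to the reconstruction error.
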

Time Series Forecasting Terms (Traditional Statistics)
ARIMA = combines AutoRegression (AR), differencing (I, for Integrated), and Moving Average (MA) into a single model. Good for simple, single (univariate) time series datasets.
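
The I and AR pieces can be sketched by hand. This toy is roughly an ARIMA(1,1,0) with no intercept: it differences once, fits one AR coefficient by least squares, and skips the MA part entirely. The function names and the series are made up; real work would normally lean on a statistics library:

```python
def difference(series):
    """The I step: replace values with changes between neighbors to remove trend."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(series):
    """The AR step: least-squares phi for x[t] ~ phi * x[t-1], no intercept."""
    num = sum(a * b for a, b in zip(series, series[1:]))
    den = sum(a * a for a in series[:-1])
    return num / den


def forecast_next(series):
    """Predict the next change from the last change, then add it back on."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]


# Made-up series with a steady upward trend of 2 per step.
series = [10, 12, 14, 16, 18]
next_value = forecast_next(series)
```

Differencing turns the trending series into a flat run of 2s, the AR coefficient comes out as 1, and the forecast continues the trend to 20.
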
ETS = family of models based on Error, Trend, and Seasonal pieces using ExponenTial Smoothing. forecasts with a weighted average of past observations, where the weights decrease exponentially the older the observation is.
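
The exponentially decreasing weights fall out of a one-line update. A sketch of simple exponential smoothing only (the level piece, with no trend or seasonal terms), assuming a hypothetical smoothing factor `alpha` between 0 and 1:

```python
def exponential_smoothing(series, alpha):
    """Each new level = alpha * newest value + (1 - alpha) * previous level."""
    level = series[0]
    smoothed = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


# Made-up series; alpha = 0.5 weighs the newest point as much as all history.
smoothed = exponential_smoothing([10, 20, 30], alpha=0.5)
```

Unrolling the last value gives 22.5 = 0.5 * 30 + 0.25 * 20 + 0.25 * 10, so each step back in time halves the weight. The oldest point keeps the leftover weight because it seeds the level.
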

Optimize and Eval Terms
MAP = mean of average precision scores across queries or classes.
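
A worked sketch of the MAP calculation for ranked results. It assumes each query is given as a list of 1/0 relevance flags in ranked order, and that every relevant item appears somewhere in the list; the function names are hypothetical:

```python
def average_precision(ranked_relevance):
    """Average of precision@rank taken at each rank where a relevant item appears."""
    hits = 0
    precisions = []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(queries):
    """Mean of the per-query average precision scores."""
    return sum(average_precision(q) for q in queries) / len(queries)


# Two made-up queries: relevant results at ranks 1 and 3, then at rank 2.
queries = [[1, 0, 1], [0, 1]]
map_score = mean_average_precision(queries)
```

The first query scores (1/1 + 2/3) / 2 = 5/6, the second scores 1/2, and MAP is their mean, 2/3. Relevant items near the top of the ranking raise the score.
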
Regularization = penalizes extreme weight values to prevent overfitting.
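
The shrinking effect shows up clearly in a one-weight model. A minimal sketch assuming an L2 (ridge) penalty, one feature, and no intercept; `ridge_slope` and the data are made up for illustration:

```python
def ridge_slope(xs, ys, penalty):
    """Weight w that minimizes sum((y - w*x)^2) + penalty * w^2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)


# Made-up data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
plain = ridge_slope(xs, ys, penalty=0.0)    # no regularization
shrunk = ridge_slope(xs, ys, penalty=14.0)  # strong penalty
```

With no penalty the weight is the plain least-squares answer, 2. A penalty of 14 halves it to 1. A larger penalty pulls the weight further toward zero, trading some fit on the training data for a less extreme model.
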
TF-IDF = in NLP. scores word importance in a document by doing local count x log of (global rarity), i.e. term frequency x log(total documents / documents containing the word). Lessens filler words like "the" or "an".
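
The count-times-rarity score can be computed by hand on a tiny corpus. A sketch assuming the plain, unsmoothed formula (libraries often use smoothed variants) and made-up, pre-tokenized documents; `tf_idf` is a hypothetical helper:

```python
import math


def tf_idf(term, doc, corpus):
    """Local count of the term times log(total docs / docs containing the term)."""
    tf = doc.count(term)
    docs_with_term = sum(1 for d in corpus if term in d)
    if docs_with_term == 0:
        return 0.0
    idf = math.log(len(corpus) / docs_with_term)
    return tf * idf


# Made-up corpus of three tokenized documents.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "cat", "purred"],
]
filler_score = tf_idf("the", corpus[0], corpus)
rare_score = tf_idf("dog", corpus[1], corpus)
```

"the" appears in every document, so its rarity term is log(1) = 0 and its score vanishes. "dog" appears in only one document and scores log(3).
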
