ML Exam Prep

ML Associate Exam Prep


1: Data Prep for ML (28%)

Task 1.1: Ingest and store data
Knowledge of:
* Data formats and ingestion mechanisms (ex: validated and non-validated formats, Apache Parquet, JSON, CSV, Apache ORC, Apache Avro, RecordIO)
* How to use core data sources (ex: S3, EFS, FSx for NetApp ONTAP)
* How to use streaming data sources to ingest data (ex: Kinesis, Apache Flink, Apache Kafka)
* Storage options, including use cases and tradeoffs

Skills in:
* Extracting data from storage (ex: S3, EBS, EFS, RDS, DynamoDB) by using the appropriate service features (ex: S3 Transfer Acceleration, EBS Provisioned IOPS)
* Choosing appropriate data formats (ex: Parquet, JSON, CSV, ORC) based on data access patterns
* Ingesting data into SageMaker Data Wrangler and SageMaker Feature Store
* Merging data from multiple sources (ex: by using programming techniques, Glue, or Apache Spark)
* Troubleshooting and debugging data ingestion and storage issues that involve capacity and scalability
* Making initial storage decisions based on cost, performance, and data structure
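One way to internalize the format tradeoffs above is to serialize the same records two ways and compare. This is a minimal stdlib-only sketch (Parquet and ORC need extra libraries); the records are hypothetical.

```python
import csv, io, json

# Hypothetical sample records; the task above asks you to pick formats
# (Parquet, JSON, CSV, ORC) based on access patterns and size.
records = [
    {"user_id": i, "score": i * 0.5, "label": i % 2}
    for i in range(1000)
]

# CSV: row-oriented, schema carried once in the header line.
csv_buf = io.StringIO()
writer = csv.DictWriter(csv_buf, fieldnames=["user_id", "score", "label"])
writer.writeheader()
writer.writerows(records)
csv_bytes = len(csv_buf.getvalue().encode("utf-8"))

# JSON Lines: self-describing, so field names repeat in every record.
jsonl = "\n".join(json.dumps(r) for r in records)
jsonl_bytes = len(jsonl.encode("utf-8"))

print(csv_bytes, jsonl_bytes)  # JSON Lines is noticeably larger
```

The same repetition-of-schema effect is why columnar formats such as Parquet compress so well for analytical scans.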

Task 1.2: Transform data and perform feature engineering
Knowledge of:
* Data cleaning and transformation techniques (ex: detecting and treating outliers, imputing missing data, combining, deduplication)
* Feature engineering techniques (ex: data scaling and standardization, feature splitting, binning, log transformation, normalization)
* Encoding techniques (ex: one-hot encoding, binary encoding, label encoding, tokenization)
* Tools to explore, visualize, or transform data (ex: SageMaker Data Wrangler, Glue, Glue DataBrew)
* Services that transform streaming data (ex: Lambda, Spark)
* Data annotation and labeling services that create high-quality labeled datasets

Skills in:
* Transforming data by using AWS tools (ex: Glue, DataBrew, Spark running on EMR, SageMaker Data Wrangler)
* Creating and managing features (ex: by using SageMaker Feature Store)
* Validating and labeling data by using AWS services (ex: SageMaker Ground Truth, Mechanical Turk)

Task 1.3: Ensure data integrity and prepare data for modeling
Knowledge of:
* Pre-training bias metrics for numeric, text, and image data (ex: class imbalance [CI], difference in proportions of labels [DPL])
* Strategies to address CI in numeric, text, and image datasets (ex: synthetic data generation, resampling)
* Techniques to encrypt data
* Data classification, anonymization, and masking
* Implications of compliance requirements (ex: PII, protected health information [PHI], data residency)
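The two pre-training bias metrics named above are simple ratios, shown here hand-rolled following the definitions SageMaker Clarify documents: CI compares group sizes, DPL compares positive-label rates between groups. The counts are hypothetical.

```python
# CI = (n_a - n_d) / (n_a + n_d): ranges from -1 to 1, 0 means balanced
# group sizes between the advantaged (a) and disadvantaged (d) facets.
def class_imbalance(n_a, n_d):
    return (n_a - n_d) / (n_a + n_d)

# DPL = q_a - q_d: the gap in positive-outcome proportions between groups.
def diff_in_proportions_of_labels(pos_a, n_a, pos_d, n_d):
    return pos_a / n_a - pos_d / n_d

# Hypothetical dataset: 800 samples in group a (400 positive),
# 200 in group d (60 positive).
ci = class_imbalance(800, 200)
dpl = diff_in_proportions_of_labels(400, 800, 60, 200)
print(ci, dpl)
```

Values near zero suggest a balanced dataset; the oversampling and synthetic-data strategies above are ways to push CI back toward zero.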

Skills in:
* Validating data quality (ex: using DataBrew and Glue Data Quality)
* Identifying and mitigating sources of bias in data (ex: selection bias, measurement bias; by using SageMaker Clarify)
* Preparing data to reduce prediction bias (ex: by using dataset splitting, shuffling, and augmentation)
* Configuring data to load into the model training resource (ex: EFS, FSx)
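The shuffle-then-split step listed above is worth being able to do by hand. A minimal sketch, seeded so the 80/10/10 split is reproducible; the "dataset" is just a list of row indices.

```python
import random

# Seeded RNG so the split is reproducible across runs (repeatability matters
# for the reproducible-experiment skills later in the guide).
rng = random.Random(42)
samples = list(range(100))   # stand-in for dataset rows
rng.shuffle(samples)         # shuffling breaks any ordering bias in the file

n_train = int(0.8 * len(samples))
n_val = int(0.1 * len(samples))
train = samples[:n_train]
val = samples[n_train:n_train + n_val]
test = samples[n_train + n_val:]

print(len(train), len(val), len(test))  # 80 10 10
```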


2: ML Model Development (26%)

Task 2.1: Choose a modeling approach
Knowledge of:
* Capabilities and appropriate uses of ML algorithms to solve business problems
* How to use AI services (ex: Translate, Transcribe, Rekognition, Bedrock) to solve specific business problems
* Interpretability during model selection or algorithm selection
* SageMaker AI built-in algorithms and when to apply them

Skills in:
* Assessing available data and problem complexity to determine the feasibility of an ML solution
* Comparing and selecting appropriate ML models or algorithms to solve specific problems
* Choosing built-in algorithms, foundation models, and solution templates (ex: in SageMaker JumpStart and Amazon Bedrock)
* Selecting models or algorithms based on costs
* Selecting AI services to solve common business needs

Task 2.2: Train and refine models
Knowledge of:
* Elements in the training process (ex: epoch, steps, batch size)
* Methods to reduce model training time (ex: early stopping, distributed training)
* Factors that influence model size
* Methods to improve model performance
* Benefits of regularization techniques (ex: dropout, weight decay, L1 and L2)
* Hyperparameter tuning techniques (ex: random search, Bayesian optimization)
* Model hyperparameters and their effects on model performance (ex: number of trees in a tree-based model, number of layers in a neural network)
* Methods to integrate models that were built outside SageMaker AI into SageMaker AI
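Early stopping, one of the training-time reducers listed above, is just "stop when validation loss hasn't improved for `patience` checks." A sketch with a synthetic loss curve (improves, then plateaus) standing in for real epochs:

```python
def train_with_early_stopping(val_losses, patience=3):
    """Return (epoch stopped at, best loss seen, epoch of best loss)."""
    best, best_epoch, bad_checks = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, bad_checks = loss, epoch, 0  # new best: reset
        else:
            bad_checks += 1
            if bad_checks >= patience:      # no improvement for `patience` epochs
                return epoch, best, best_epoch
    return len(val_losses) - 1, best, best_epoch

# Loss improves until epoch 4, then plateaus; with patience=3 we stop at 7.
losses = [1.0, 0.8, 0.6, 0.5, 0.45, 0.46, 0.47, 0.48, 0.49, 0.50]
stopped_at, best_loss, best_epoch = train_with_early_stopping(losses)
print(stopped_at, best_loss, best_epoch)  # 7 0.45 4
```

In SageMaker training jobs the same idea shows up as early-stopping configuration on tuning jobs and framework callbacks.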

Skills in:
* Using SageMaker AI built-in algorithms and common ML libraries to develop ML models
* Using SageMaker AI script mode with SageMaker AI supported frameworks to train models (for example, TensorFlow, PyTorch)
* Using custom datasets to fine-tune pre-trained models (for example, Bedrock, SageMaker JumpStart)
* Performing hyperparameter tuning (ex: by using SageMaker AI automatic model tuning [AMT])
* Integrating automated hyperparameter optimization capabilities
* Preventing model overfitting, underfitting, and catastrophic forgetting (ex: by using regularization techniques, feature selection)
* Combining training models to improve outcome (ex: ensembling, stacking, boosting)
* Reducing model size (for example, by altering data types, pruning, updating feature selection, compression)
* Managing model versions for repeatability and audits (ex: SageMaker Model Registry)
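Random search, one of the tuning techniques above, reduces to sampling from a search space and keeping the best trial. A toy sketch: the "model" is a made-up loss surface minimized at lr=0.1; in practice SageMaker AMT launches real training jobs instead of calling a function.

```python
import random

def validation_loss(lr):
    # Pretend loss surface, best at lr = 0.1 (stands in for a training job).
    return (lr - 0.1) ** 2

rng = random.Random(0)          # seeded for reproducibility
best_lr, best_loss = None, float("inf")
for _ in range(50):             # 50 trials
    lr = rng.uniform(0.001, 1.0)    # sample from the search space
    loss = validation_loss(lr)
    if loss < best_loss:
        best_lr, best_loss = lr, loss

print(round(best_lr, 3), round(best_loss, 5))
```

Bayesian optimization improves on this by using past trials to pick the next sample rather than drawing uniformly.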

Task 2.3: Analyze model performance
Knowledge of:
* Model evaluation techniques and metrics (ex: confusion matrix, heat maps, F1 score, accuracy, precision, recall, Root Mean Square Error [RMSE], receiver operating characteristic [ROC], Area Under the ROC Curve [AUC])
* Methods to create performance baselines
* Methods to identify model overfitting and underfitting
* Metrics available in SageMaker Clarify to gain insights into ML training data and models
* Convergence issues
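Several of the metrics above fall out of the four confusion-matrix counts. A hand-computed sketch for a binary classifier, using hypothetical labels:

```python
def binary_metrics(y_true, y_pred):
    # The four confusion-matrix cells for the positive class (1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)   # of predicted positives, how many were right
    recall = tp / (tp + fn)      # of actual positives, how many were found
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
m = binary_metrics(y_true, y_pred)
print(m)  # accuracy 0.8, precision/recall/F1 all 0.75 for this sample
```

For the regression metrics in the list, RMSE is the square root of the mean squared residual over the same kind of paired lists.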

Skills in:
* Selecting and interpreting evaluation metrics and detecting model bias
* Assessing tradeoffs between model performance, training time, and cost
* Performing reproducible experiments by using AWS services
* Comparing the performance of a shadow variant to the performance of a production variant
* Using SageMaker Clarify to interpret model outputs
* Using SageMaker Model Debugger to debug model convergence

3: Deployment and Orchestration of ML Workflows (22%)

Task 3.1: Select deployment infrastructure based on existing architecture and requirements
Knowledge of:
* Deployment best practices (ex: versioning, rollback strategies)
* AWS deployment services (ex: Amazon SageMaker AI)
* Methods to serve ML models in real time and in batches
* How to provision compute resources in production and test environments (ex: CPU, GPU)
* Model and endpoint requirements for deployment endpoints (ex: serverless endpoints, real-time endpoints, asynchronous endpoints, batch inference)
* How to choose appropriate containers (ex: provided or customized)
* Methods to optimize models on edge devices (ex: SageMaker Neo)

Skills in:
* Evaluating performance, cost, and latency tradeoffs
* Choosing the appropriate compute environment for training and inference based on requirements (ex: GPU or CPU specs, processor family, networking bandwidth)
* Selecting the correct deployment orchestrator (ex: Apache Airflow, SageMaker Pipelines)
* Selecting multi-model or multi-container deployments
* Selecting the correct deployment target (ex: SageMaker AI endpoints, Kubernetes, ECS, Elastic Kubernetes Service [EKS], Lambda)
* Choosing model deployment strategies (ex: real time, batch)
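A hedged sketch of the real-time deployment target above: calling a SageMaker endpoint. The endpoint name is hypothetical, and only the request is built here so the snippet runs without AWS credentials; the commented boto3 call shows the live usage.

```python
import json

def build_invoke_args(endpoint_name, features):
    # Shape matches what sagemaker-runtime invoke_endpoint expects.
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Body": json.dumps({"instances": [features]}),
    }

args = build_invoke_args("churn-model-endpoint", [0.2, 1.0, 3.5])

# With credentials configured, the actual call would be:
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   response = runtime.invoke_endpoint(**args)
#   prediction = json.loads(response["Body"].read())

print(args["EndpointName"], json.loads(args["Body"]))
```

Batch and asynchronous inference swap this synchronous call for a transform job or an S3-backed request queue, which is the tradeoff the task above asks you to weigh.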

Task 3.2: Create and script infrastructure based on existing architecture and requirements
Knowledge of:
* Difference between on-demand and provisioned resources
* How to compare scaling policies
* Tradeoffs and use cases of IaC options (ex: CloudFormation, AWS CDK)
* Containerization concepts and AWS container services
* How to use SageMaker AI endpoint auto scaling policies to meet scalability requirements (ex: based on demand, time)

Skills in:
* Applying best practices to enable maintainable, scalable, and cost-effective ML solutions (ex: automatic scaling on SageMaker AI endpoints, dynamically adding Spot Instances, by using Amazon EC2 instances, by using Lambda behind the endpoints)
* Automating the provisioning of compute resources, including communication between stacks (ex: by using CloudFormation, AWS CDK)
* Building and maintaining containers (ex: ECR, EKS, ECS, by using bring your own container [BYOC] with SageMaker AI)
* Configuring SageMaker AI endpoints within the VPC network
* Deploying and hosting models by using the SageMaker AI SDK
* Choosing specific metrics for auto scaling (ex: model latency, CPU utilization, invocations per instance)
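The auto scaling skills above usually come down to a target-tracking policy on the endpoint variant. A sketch with hypothetical endpoint/variant names; the dict follows the shape boto3's Application Auto Scaling `put_scaling_policy` expects, with the live registration call commented out.

```python
# Resource ID format for a SageMaker endpoint variant.
resource_id = "endpoint/churn-model-endpoint/variant/AllTraffic"

scaling_policy = {
    "PolicyName": "invocations-target-tracking",
    "ServiceNamespace": "sagemaker",
    "ResourceId": resource_id,
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "TargetValue": 70.0,  # aim for ~70 invocations per instance per minute
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
}

# import boto3
# boto3.client("application-autoscaling").put_scaling_policy(**scaling_policy)

print(scaling_policy["PolicyType"])
```

Swapping the predefined metric for a custom CloudWatch metric (e.g. model latency) is how the "choosing specific metrics" skill above plays out in practice.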

Task 3.3: Use automated orchestration tools to set up continuous integration and continuous delivery (CI/CD) pipelines
Knowledge of:
* Capabilities and quotas for CodePipeline, CodeBuild, and CodeDeploy
* Automation and integration of data ingestion with orchestration services
* Version control systems and basic usage (for example, Git)
* CI/CD principles and how they fit into ML workflows
* Deployment strategies and rollback actions (for example, blue/green, canary, linear)
* How code repositories and pipelines work together

Skills in:
* Configuring and troubleshooting CodeBuild, CodeDeploy, and CodePipeline, including stages
* Applying continuous deployment flow structures to invoke pipelines (ex: Gitflow, GitHub Flow)
* Automating orchestration (ex: to deploy ML models, automate model building)
* Configuring training and inference jobs (ex: by using EventBridge rules, SageMaker Pipelines, CodePipeline)
* Creating automated tests in CI/CD pipelines (ex: integration tests, unit tests, end-to-end tests)
* Building and integrating mechanisms to retrain models
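The retraining mechanism above is often just an EventBridge schedule that starts a pipeline. A sketch with hypothetical names and ARNs; the dicts follow boto3's `put_rule` / `put_targets` shapes, and the live calls are commented out.

```python
retrain_rule = {
    "Name": "weekly-retrain",
    "ScheduleExpression": "cron(0 3 ? * MON *)",  # every Monday, 03:00 UTC
    "State": "ENABLED",
}

retrain_target = {
    "Rule": "weekly-retrain",
    "Targets": [{
        "Id": "start-pipeline",
        # Hypothetical pipeline and role ARNs (example account ID).
        "Arn": "arn:aws:sagemaker:us-east-1:123456789012:pipeline/retrain-pipeline",
        "RoleArn": "arn:aws:iam::123456789012:role/EventBridgeSageMakerRole",
    }],
}

# import boto3
# events = boto3.client("events")
# events.put_rule(**retrain_rule)
# events.put_targets(**retrain_target)

print(retrain_rule["ScheduleExpression"])
```

An event-pattern rule on a Model Monitor drift alarm, instead of a cron schedule, gives you retrain-on-drift rather than retrain-on-timer.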

4: ML Solution Monitoring, Maintenance, and Security (24%)

Task 4.1: Monitor model inference
Knowledge of:
* Drift in ML models
* Techniques to monitor data quality and model performance
* Design principles for ML lenses relevant to monitoring

Skills in:
* Monitoring models in production (ex: by using SageMaker Model Monitor)
* Monitoring workflows to detect anomalies or errors in data processing or model inference
* Detecting changes in the distribution of data that can affect model performance (ex: by using SageMaker Clarify)
* Monitoring model performance in production by using A/B testing
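One common way to quantify the distribution drift described above is the population stability index (PSI). SageMaker Model Monitor computes its own statistics; this is a hand-rolled stand-in over binned proportions, with a common rule of thumb: PSI < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 significant drift.

```python
import math

def psi(expected, actual):
    """PSI over binned proportions; both lists should each sum to 1."""
    return sum((a - e) * math.log(a / e)
               for e, a in zip(expected, actual))

baseline = [0.25, 0.25, 0.25, 0.25]  # training-time feature distribution
no_drift = [0.24, 0.26, 0.25, 0.25]  # production looks the same
drifted  = [0.10, 0.15, 0.25, 0.50]  # production mass moved to high bins

print(round(psi(baseline, no_drift), 4))  # well under 0.1
print(round(psi(baseline, drifted), 4))   # well over 0.25
```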

Task 4.2: Monitor and optimize infrastructure and costs
Knowledge of:
* Key performance metrics for ML infrastructure (for example, utilization, throughput, availability, scalability, fault tolerance)
* Monitoring and observability tools to troubleshoot latency and performance issues (for example, X-Ray, CloudWatch Lambda Insights, CloudWatch Logs Insights)
* How to use CloudTrail to log, monitor, and invoke re-training activities
* Differences between instance types and how they affect performance (for example, memory optimized, compute optimized, general purpose, inference optimized)
* Capabilities of cost analysis tools (ex: Cost Explorer, Billing and Cost Management, Trusted Advisor)
* Cost tracking and allocation techniques (ex: resource tagging)

Skills in:
* Configuring and using tools to troubleshoot and analyze resources (ex: CloudWatch Logs, CloudWatch alarms)
* Creating CloudTrail trails
* Setting up dashboards to monitor performance metrics (ex: by using QuickSight, CloudWatch dashboards)
* Monitoring infrastructure (ex: EventBridge events)
* Rightsizing instance families and sizes (ex: SageMaker AI Inference Recommender and Compute Optimizer)
* Monitoring and resolving latency and scaling issues
* Preparing infrastructure for cost monitoring (ex: by applying a tagging strategy)
* Troubleshooting capacity concerns that involve cost and performance (ex: provisioned concurrency, service quotas, auto scaling)
* Optimizing costs and setting cost quotas by using appropriate cost management tools (ex: Cost Explorer, Trusted Advisor, Budgets)
* Optimizing infrastructure costs by selecting purchasing options (ex: Spot Instances, On-Demand Instances, Reserved Instances, SageMaker AI Savings Plans)
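A sketch of the latency-monitoring skill above: a CloudWatch alarm on an endpoint's `ModelLatency` metric. Names are hypothetical; the dict matches boto3's `put_metric_alarm` shape (live call commented out). Note that `ModelLatency` is reported in microseconds.

```python
latency_alarm = {
    "AlarmName": "churn-endpoint-high-latency",
    "Namespace": "AWS/SageMaker",
    "MetricName": "ModelLatency",
    "Dimensions": [
        {"Name": "EndpointName", "Value": "churn-model-endpoint"},
        {"Name": "VariantName", "Value": "AllTraffic"},
    ],
    "Statistic": "Average",
    "Period": 60,                  # evaluate every minute
    "EvaluationPeriods": 3,        # require 3 consecutive breaches
    "Threshold": 500_000.0,        # 500 ms, expressed in microseconds
    "ComparisonOperator": "GreaterThanThreshold",
}

# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**latency_alarm)

print(latency_alarm["AlarmName"])
```

Pointing the alarm's action at an SNS topic or an auto scaling policy is what closes the loop from detection to remediation.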

Task 4.3: Secure AWS resources

Knowledge of:
* IAM roles, policies, and groups that control access to AWS services (ex: IAM, bucket policies, SageMaker Role Manager)
* SageMaker AI security and compliance features
* Controls for network access to ML resources
* Security best practices for CI/CD pipelines

Skills in:
* Configuring least privilege access to ML artifacts
* Configuring IAM policies and roles for users and applications that interact with ML systems
* Monitoring, auditing, and logging ML systems to ensure continued security and compliance
* Troubleshooting and debugging security issues
* Building VPCs, subnets, and security groups to securely isolate ML systems
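Least-privilege access to ML artifacts, the first skill above, typically means an IAM policy scoped to one bucket and prefix. A sketch with a hypothetical bucket name; the document is built in Python so it can be validated before attaching it to a role.

```python
import json

artifact_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Read-only access to the model artifacts, nothing else.
            "Sid": "ReadModelArtifactsOnly",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-ml-artifacts/models/*",
        },
        {
            # Listing is a bucket-level action, constrained to the prefix.
            "Sid": "ListArtifactPrefix",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::example-ml-artifacts",
            "Condition": {"StringLike": {"s3:prefix": ["models/*"]}},
        },
    ],
}

policy_document = json.dumps(artifact_policy)
print(len(json.loads(policy_document)["Statement"]))  # 2
```

SageMaker Role Manager generates policies of this shape from persona templates, which is usually the faster path on the job (and the one the exam expects you to know).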

