AWS ML Modules Questions PDF
Document Details

Uploaded by TimeHonoredEternity7101
Summary
This document contains practice questions for AWS (Amazon Web Services) Machine Learning modules. The questions cover topics such as machine learning algorithms, data preparation, model training, Amazon SageMaker, and natural language processing (NLP). Correct answers are provided for each question.
Full Transcript
1. Which statement describes machine learning?
The scientific study of algorithms and statistical models to perform tasks by using inference instead of instructions
The process of explicitly programming a computer to follow step-by-step instructions
The automation of repetitive manual tasks without the need for data analysis
The design of computer hardware to process data faster
Correct Answer: The scientific study of algorithms and statistical models to perform tasks by using inference instead of instructions

2. You are working on a machine learning problem that requires the system to choose between eight possible departments to route a customer. Which type of machine learning problem does this scenario describe?
Binary classification
Multiclass classification
Regression
Clustering
Correct Answer: Multiclass classification

3. Which type of training describes a machine learning application that interacts with its environment and learns to take actions that maximize rewards?
Supervised learning
Unsupervised learning
Reinforcement learning
Deep learning
Correct Answer: Reinforcement learning

4. You are working on a machine learning problem. At which stage would you verify that your data is all of a uniform type?
Feature selection
Data preparation
Model training
Evaluation
Correct Answer: Data preparation

5. You are working on a machine learning model that uses data from multiple countries. The countries are listed by using alphabetical abbreviations. Which stage involves converting these abbreviations to numerical values?
Data preparation
Model tuning
Model evaluation
Model training
Correct Answer: Data preparation

6. Your model is overfitting if it performs well on the training data, but not on the evaluation data.
True
False
Correct Answer: True

7. Which two Python libraries are commonly used for machine learning? (Choose two.)
math
pandas
scikit-learn
Jupyter Notebooks
Bokeh
Correct Answers: pandas, scikit-learn

8. When should you consider machine learning as a development methodology?
When you have a pre-defined model for the task
When you have a complete set of rules for the task
When you have large datasets with a large number of variables
When your data is stored in a data center
Correct Answer: When you have large datasets with a large number of variables

9. Which resources help define a machine learning (ML) problem? (Select TWO.)
Access to labeled data
A domain expert to consult
A traditional coded solution
Sufficient hardware
A neural network
Correct Answers: Access to labeled data, A domain expert to consult

10. When preparing data for supervised classification machine learning, which attributes should the data have? (Select TWO.)
Data should be labeled
Data should contain only instances of the target
Anyone in the company should be able to access the data
Data should be generated randomly by using genetic algorithms
Data should be representative of production
Correct Answers: Data should be labeled, Data should be representative of production

11. What can you learn by examining the statistics of your data?
Identifying anomalies in the data
Verifying that the data is formatted correctly
Removing outliers
Filling in missing data
Correct Answer: Identifying anomalies in the data

12. You have a preprocessed dataset that's ready for use in training a model. How should you divide your training data?
Use all the data to train the model.
Split the data into two equal sets. Use one half for training and the other half for testing.
Split the data into three sets. Use 80% for training, 10% for testing, and 10% for validation.
Split the data into two sets. Use 80% for training, and 20% for testing and validation.
Correct Answer: Split the data into three sets. Use 80% for training, 10% for testing, and 10% for validation.

13. You can select between single model and multi-model hosting with Amazon SageMaker.
True
False
Correct Answer: True

14. What is the purpose of a confusion matrix?
To plot the labels from the predicted dataset
To show the true or false positives, along with the true or false negatives
To show the correlation between two columns in the dataset
To stratify the classes across training and testing datasets
Correct Answer: To show the true or false positives, along with the true or false negatives

15. What does a correlation heatmap show?
The level of correlation between features in a dataset
The level of correlation between the test and the validation data
The level of correlation between the predicted and actual values
The level of correlation between encoded and text data
Correct Answer: The level of correlation between features in a dataset

16. Which of the following file formats does pandas support for data importing? (Select TWO.)
JSON
MS Word
CSV
Binary files
PDF
Correct Answers: JSON, CSV

17. Which Amazon service can you use to deploy machine learning instances and run Jupyter Notebooks?
Amazon Comprehend
Amazon SageMaker
Amazon Polly
Amazon Lex
Correct Answer: Amazon SageMaker

18. What is the goal of an Amazon SageMaker hyperparameter tuning job?
To optimize the validation metrics for training
To optimize the model parameters to produce the best model
To optimize the data inputs to produce the fastest prediction
To optimize the algorithm choice to produce the best model
Correct Answer: To optimize the model parameters to produce the best model

19. Which patterns are common in time series data? (Select TWO.)
Trends
Seasonal
Exponential
Star shaped
None of the above
Correct Answers: Trends, Seasonal

20. Which use cases apply to forecasting? (Select TWO.)
Predicting the inventory that's required for items in a warehouse
Predicting if an X-ray image contains an abnormality
Predicting the energy consumption of an office
Predicting the sentiment of a review
Determining if two images are of the same person
Correct Answers: Predicting the inventory that's required for items in a warehouse, Predicting the energy consumption of an office

21. Which datasets could be used as a time series dataset? (Select TWO.)
Sales data that contains items, purchase dates, and quantities
Web logs that contain IP addresses, pages, and timestamps
Chemical composition of food additives
Membership data that contains PII and a donate flag
Results from a one-time survey
Correct Answers: Sales data that contains items, purchase dates, and quantities; Web logs that contain IP addresses, pages, and timestamps

22. You have a dataset of temperature readings from a weather station. Temperature readings are logged every 5 minutes. You notice that there are several missing values each day. Which approach could you take? (Select TWO.)
Replace the missing values with zero
Forward fill the missing values
Backward fill the missing values
Use the sum of the temperatures for the day to fill the missing values
Remove the records that have the missing data
Correct Answers: Forward fill the missing values, Backward fill the missing values

23. Which scenarios are examples of appropriate downsampling? (Select TWO.)
Using mean to convert temperature readings every minute to an hourly value
Using sum to convert sales order information during the day to a daily total
Using mean to convert sales order information during the day to a daily total
Using sum to convert temperature readings every minute to an hourly value
Correct Answers: Using mean to convert temperature readings every minute to an hourly value, Using sum to convert sales order information during the day to a daily total

24. What are examples of seasonality that you might observe in time series data? (Select TWO.)
Quarterly, yearly
Spring, summer, fall, winter
Every two years
One-time sales events
Hourly
Correct Answers: Quarterly, yearly; Spring, summer, fall, winter

25. An Amazon SageMaker Canvas forecast model generates predictions for P10, P50, and P90. If the forecast predicts shoe sales, what do the P10, P50, and P90 tell you?
P10 indicates that 10% of the time, fewer than the predicted value will be ordered.
P50 indicates that 50% of the time, the exact number of the predicted value will be ordered.
P90 indicates that 90% of the time, more than the predicted value will be ordered.
The average of P10, P50, and P90 indicates the exact number of the predicted value that will be ordered.
Correct Answer: P10 indicates that 10% of the time, fewer than the predicted value will be ordered.

26. Which items in a dataset are required for generating a retail forecast with Amazon SageMaker Canvas?
Item data that includes an item and category
Item stock information that includes a timestamp, item, and stock quantity
Item pricing data including a timestamp, item, and price
Time series data that includes a timestamp, item, and quantity
Correct Answer: Time series data that includes a timestamp, item, and quantity

27. Amazon SageMaker Canvas provides various evaluation metrics for forecasting models. What is the benefit of column impact scores?
Column impact scores provide probable predictions to show how much uncertainty is associated with a forecast.
Column impact scores show forecast reliability by comparing target and forecasted values.
Column impact scores show how much an attribute contributes to the model's forecast.
Column impact scores show the model's forecast accuracy.
Correct Answer: Column impact scores show how much an attribute contributes to the model's forecast.

28. Which of the following are options Amazon SageMaker Canvas provides to refine your forecasting insights? (Select THREE.)
Specify a group column.
Change the values of the metrics to see how they affect the forecast.
Import holiday schedules to improve sales forecasting.
Compare model versions in production automatically.
Change values in the input data to see how they affect the forecast with a what-if scenario.
Correct Answers: Specify a group column; Import holiday schedules to improve sales forecasting; Change values in the input data to see how they affect the forecast with a what-if scenario

29. Which are common use cases for computer vision?
Image analysis
Facial recognition
Home security
All of the above
Correct Answer: All of the above

30. What is the location of an object in an image called?
A bounding box
An object box
Object coordinates
Object location
Correct Answer: A bounding box

31. Which capabilities are provided by Amazon Rekognition? (Select TWO.)
Searching libraries of images and videos
Adding labels to images
Image manipulation
Facial detection
Video editing
Correct Answers: Searching libraries of images and videos, Facial detection

32. When Amazon Rekognition performs predictions, it also provides a score that indicates the level of confidence in the prediction.
True
False
Correct Answer: True

33. What does Amazon Rekognition do with the results after it completes a video analysis?
Stores the results in an Amazon Relational Database Service (Amazon RDS) database
Starts an AWS Lambda function to notify the owner of the job
Publishes the results to an Amazon Simple Notification Service (Amazon SNS) topic
Stores the results in Amazon Simple Storage Service (Amazon S3)
Correct Answer: Publishes the results to an Amazon Simple Notification Service (Amazon SNS) topic

34. Which features are part of Amazon Rekognition Custom Labels? (Select TWO.)
UI for labeling images and defining bounding boxes
Automated selection of machine learning algorithms
Retrieval of text from an image
Facial analysis
Identification of celebrities
Correct Answers: UI for labeling images and defining bounding boxes, Automated selection of machine learning algorithms

35. What is the minimum number of images that are required to use automated data labeling by Amazon SageMaker Ground Truth?
5000
3000
1500
1250
Correct Answer: 1250

36. What is a confusion matrix?
A way to test if your model is working
A test to determine the accuracy of a classification model
A special output from Amazon Rekognition Custom Labels
A way to validate a linear regression model
Correct Answer: A test to determine the accuracy of a classification model

37. Which types of data are included in an Amazon SageMaker Ground Truth manifest file? (Select THREE.)
Confidence value
File type
Creation date
Class name
Number of images
File size
Correct Answers: Confidence value, Creation date, Class name

38. Which of the following are steps for preparing a custom dataset for object detection? (Select TWO.)
Collect images
Feature engineering
Train the model
Generate a confusion matrix
Correct Answers: Collect images, Train the model

39. Which issue is not a major challenge for natural language processing (NLP)?
Lack of precision
Meaning based on context
Multiple dependencies
Memory limitations
Correct Answer: Memory limitations

40. Which tasks are common preprocessing tasks for natural language processing (NLP) applications? (Select TWO.)
Removing noise
Normalizing similar terms
Adjusting for context
Removing proper nouns
Feature engineering
Correct Answers: Removing noise, Normalizing similar terms

41. Natural language processing (NLP) systems predate machine learning systems.
True
False
Correct Answer: True

42. Which models are common machine learning models for natural language processing (NLP) applications? (Select TWO.)
pandas
Bag of words
Word tokens
Term frequency and inverse document frequency
scikit-learn
Correct Answers: Bag of words, Term frequency and inverse document frequency

43. What is not a text-analysis category?
Auto-correcting text
Classifying text
Discovering similarities in text
Deriving relationships within text
Correct Answer: Auto-correcting text

44. Which capabilities are supported by Amazon Transcribe? (Select TWO.)
Change audio output in response to SSML tags
Convert streaming audio to text
Build subtitles for multiple languages
Translate text into another language
Analyze text for sentiment
Correct Answers: Convert streaming audio to text, Build subtitles for multiple languages

45. How can you change the way Amazon Polly pronounces words?
By slowing down the audio output
By adding Speech Synthesis Markup Language (SSML) tags to the text
By sending custom instructions through the API
By importing custom voices
Correct Answer: By adding Speech Synthesis Markup Language (SSML) tags to the text

46. Which capabilities are a part of Amazon Comprehend? (Select TWO.)
Translating a document into another language
Identifying the language used in a document
Identifying images in a document
Determining the sentiment in a document, such as positive, negative, neutral, or mixed
Converting text into speech
Correct Answers: Identifying the language used in a document; Determining the sentiment in a document, such as positive, negative, neutral, or mixed

47. Which of the following AWS services would you use to launch a workflow based on input to an Amazon Lex chatbot?
Amazon Simple Storage Service
Amazon Athena
AWS Lambda
All of the above
Correct Answer: AWS Lambda

48. You work for a company that builds applications that are used by a global audience. Which services could help you analyze how your customers use your applications? (Select TWO.)
Amazon Comprehend
Amazon Translate
Amazon Polly
Amazon Lex
Amazon Transcribe
Correct Answers: Amazon Comprehend, Amazon Translate

49. What type of artificial intelligence (AI) uses pre-trained large models to create content?
Expert systems
Deep learning
Traditional machine learning (ML)
Generative artificial intelligence (AI)
Correct Answer: Generative artificial intelligence (AI)

50. Amazon Q Developer enhances security in your code. Which type of vulnerability does Amazon Q Developer scan for?
Compliance best practices
AWS sustainability best practices
AWS security best practices
Reference libraries best practices
Correct Answer: AWS security best practices

51. Which statement accurately describes what prompt engineering is?
The process of designing and refining the instructions for a language model to generate specific types of output
The process of creating new input features from raw data that help machine learning algorithms better capture the underlying relationships
The process of preparing a model with additional data to better fit your personal use case
The process of preparing data that you use with your machine learning (ML) model
Correct Answer: The process of designing and refining the instructions for a language model to generate specific types of output

52. An organization is building a new application, and they want to be able to generate text from a text prompt. Which type of machine learning (ML) model should you choose for the application?
Diffusion model
Large language model (LLM)
Regression model
Image classifier model
Correct Answer: Large language model (LLM)

53. What is the name of the models used to create generative artificial intelligence (generative AI) applications?
Regression models
Foundation models (FMs)
Binary classification models
Forecast models
Correct Answer: Foundation models (FMs)

54. An organization wants to increase productivity by using an artificial intelligence (AI) service that can assist with generating code. Which AWS service can they embed in an integrated development environment (IDE) to generate code?
Amazon Q Developer
AWS Inferentia
Amazon Bedrock
Amazon SageMaker JumpStart
Correct Answer: Amazon Q Developer

55. Which task can Amazon Q Developer help with?
Computer vision
Image classification
Code generation
Audio generation
Correct Answer: Code generation

56. Which AWS managed service enables a user to access foundation models?
AWS Inferentia
Amazon Bedrock
Amazon SageMaker JumpStart
Amazon Q Developer
Correct Answer: Amazon Bedrock

57. Amazon Q Developer generates anything from code snippets to full functions in real time, based on the user's comments and existing code. What is a benefit of Amazon Q Developer code generation?
Write code with no coding experience
Review code on developer forums
Use collaborative coding
Bypass repetitive coding tasks
Correct Answer: Bypass repetitive coding tasks

58. What are requirements for choosing machine learning as a development methodology?
A pre-defined machine learning model
Large datasets with a large number of variables
A complete set of rules for decision making
A stand-alone data center
Correct Answer: Large datasets with a large number of variables

59. Which stage of the machine learning (ML) pipeline involves verifying that your data is all of a uniform type?
Problem formulation
Model training
Data preparation
Feature engineering
Correct Answer: Data preparation
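
Questions 4, 5, and 59 all point at the data preparation stage: checking that each column holds one type, and converting text categories such as country abbreviations into numbers. The sketch below illustrates both checks in plain Python. The records, column names, and helper functions are hypothetical and exist only for this illustration; in practice a library such as pandas would usually do this work.

```python
def non_uniform_columns(rows):
    """Return {column: set of type names} for columns holding more than one type."""
    seen = {}
    for row in rows:
        for column, value in row.items():
            seen.setdefault(column, set()).add(type(value).__name__)
    return {column: types for column, types in seen.items() if len(types) > 1}


def encode_labels(values):
    """Map each distinct label to an integer code, assigned in sorted order."""
    codes = {label: index for index, label in enumerate(sorted(set(values)))}
    return [codes[value] for value in values], codes


# Hypothetical records: one "amount" value was read in as text.
records = [
    {"country": "US", "amount": 12.5},
    {"country": "DE", "amount": "7.0"},
    {"country": "JP", "amount": 3.25},
    {"country": "US", "amount": 8.0},
]

# Step 1: find columns whose values are not all of one type.
mixed = non_uniform_columns(records)

# Step 2: repair the column by casting every value to float, then re-check.
for record in records:
    record["amount"] = float(record["amount"])
still_mixed = non_uniform_columns(records)

# Step 3: convert the country abbreviations to numerical codes.
encoded, mapping = encode_labels([record["country"] for record in records])

print(mixed)        # the "amount" column holds both float and str values
print(still_mixed)  # empty once the column has been cast
print(encoded, mapping)
```

Integer codes are the simplest encoding, but they imply an ordering between countries that does not exist; one-hot encoding is a common alternative when the model would otherwise read meaning into the numbers.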
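
Questions 22 and 23 cover filling gaps in a time series and downsampling it. The following is a minimal sketch in plain Python, with made-up readings and hypothetical helper names, of why forward fill and backward fill suit temperature data and why the aggregate has to match the quantity: mean for readings, sum for order counts.

```python
from statistics import mean


def forward_fill(values):
    """Replace each None with the most recent earlier reading.

    A leading None has no earlier reading and is left as None.
    """
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def backward_fill(values):
    """Replace each None with the next later reading."""
    return forward_fill(values[::-1])[::-1]


def downsample(values, bucket_size, aggregate):
    """Collapse consecutive groups of bucket_size values with aggregate."""
    return [
        aggregate(values[start:start + bucket_size])
        for start in range(0, len(values), bucket_size)
    ]


# Made-up temperature readings with gaps.
temperatures = [20.0, None, 21.0, 22.0, None, None, 25.0, 26.0]
ffilled = forward_fill(temperatures)
bfilled = backward_fill(temperatures)

# Temperatures are averaged: summing them would produce a meaningless value.
averaged = downsample(ffilled, 4, mean)

# Order quantities are summed: the daily total is the sum of the day's orders.
order_quantities = [3, 1, 4, 1, 5, 9]
per_day = downsample(order_quantities, 3, sum)

print(ffilled)
print(bfilled)
print(averaged, per_day)
```

Filling with zero, by contrast, would insert readings of 0 degrees that never occurred and would drag the averages down.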