ExamsBrite Dumps

Google Professional Machine Learning Engineer Question and Answers

Google Professional Machine Learning Engineer

Last Update Apr 15, 2026
Total Questions : 296

We are offering FREE Professional-Machine-Learning-Engineer Google exam questions. All you have to do is sign up and provide your details, practice with the free Professional-Machine-Learning-Engineer exam questions, and then move on to the complete pool of Google Professional Machine Learning Engineer test questions for further preparation.

Professional-Machine-Learning-Engineer PDF: $36.75 (regular price $104.99)

Professional-Machine-Learning-Engineer Testing Engine: $43.75 (regular price $124.99)

Professional-Machine-Learning-Engineer PDF + Testing Engine: $57.75 (regular price $164.99)
Questions 1

You are an ML engineer at a bank. You have developed a binary classification model using AutoML Tables to predict whether a customer will make loan payments on time. The output is used to approve or reject loan requests. One customer’s loan request has been rejected by your model, and the bank’s risks department is asking you to provide the reasons that contributed to the model’s decision. What should you do?

Options:

A.  

Use local feature importance from the predictions.

B.  

Use the correlation with target values in the data summary page.

C.  

Use the feature importance percentages in the model evaluation page.

D.  

Vary features independently to identify the threshold per feature that changes the classification.

Discussion 0
Questions 2

You built a deep learning-based image classification model by using on-premises data. You want to use Vertex AI to deploy the model to production. Due to security concerns, you cannot move your data to the cloud. You are aware that the input data distribution might change over time. You need to detect model performance changes in production. What should you do?

Options:

A.  

Use Vertex Explainable AI for model explainability. Configure feature-based explanations.

B.

Use Vertex Explainable AI for model explainability. Configure example-based explanations.

C.

Create a Vertex AI Model Monitoring job. Enable training-serving skew detection for your model.

D.

Create a Vertex AI Model Monitoring job. Enable feature attribution skew and drift detection for your model.

Discussion 0
Questions 3

You have a demand forecasting pipeline in production that uses Dataflow to preprocess raw data prior to model training and prediction. During preprocessing, you employ Z-score normalization on data stored in BigQuery and write it back to BigQuery. New training data is added every week. You want to make the process more efficient by minimizing computation time and manual intervention. What should you do?

Options:

A.  

Normalize the data using Google Kubernetes Engine

B.  

Translate the normalization algorithm into SQL for use with BigQuery

C.  

Use the normalizer_fn argument in TensorFlow's Feature Column API

D.  

Normalize the data with Apache Spark using the Dataproc connector for BigQuery
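
For reference, here is a minimal sketch (not part of the original exam item) of what option B's idea looks like: expressing Z-score normalization as SQL and running it in place in BigQuery from Python. The project, dataset, table, and column names are hypothetical placeholders.

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumed project ID

query = """
CREATE OR REPLACE TABLE `my-project.demand.features_normalized` AS
SELECT
  order_date,
  -- Z-score: (value - mean) / standard deviation, computed over the whole table
  (units_sold - AVG(units_sold) OVER ()) / STDDEV(units_sold) OVER () AS units_sold_z
FROM `my-project.demand.features_raw`
"""

# Runs entirely inside BigQuery, so no separate processing cluster is needed.
client.query(query).result()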

Discussion 0
Questions 4

You are building a custom image classification model and plan to use Vertex AI Pipelines to implement the end-to-end training. Your dataset consists of images that need to be preprocessed before they can be used to train the model. The preprocessing steps include resizing the images, converting them to grayscale, and extracting features. You have already implemented some Python functions for the preprocessing tasks. Which components should you use in your pipeline?

Options:

A.  

B.  

C.  

D.  

Discussion 0
Questions 5

You need to train a regression model based on a dataset containing 50,000 records that is stored in BigQuery. The data includes a total of 20 categorical and numerical features with a target variable that can include negative values. You need to minimize effort and training time while maximizing model performance. What approach should you take to train this regression model?

Options:

A.  

Create a custom TensorFlow DNN model.

B.  

Use BQML XGBoost regression to train the model

C.  

Use AutoML Tables to train the model without early stopping.

D.  

Use AutoML Tables to train the model with RMSLE as the optimization objective

Discussion 0
Questions 6

Your organization manages an online message board. A few months ago, you discovered an increase in toxic language and bullying on the message board. You deployed an automated text classifier that flags certain comments as toxic or harmful. Now some users are reporting that benign comments referencing their religion are being misclassified as abusive. Upon further inspection, you find that your classifier's false positive rate is higher for comments that reference certain underrepresented religious groups. Your team has a limited budget and is already overextended. What should you do?

Options:

A.  

Add synthetic training data where those phrases are used in non-toxic ways

B.  

Remove the model and replace it with human moderation.

C.  

Replace your model with a different text classifier.

D.  

Raise the threshold for comments to be considered toxic or harmful

Discussion 0
Questions 7

You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn’t meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?

Options:

A.  

Weight pruning

B.  

Dynamic range quantization

C.  

Model distillation

D.  

Dimensionality reduction
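
For reference, a minimal sketch (not part of the original exam item) of post-training dynamic range quantization with the TensorFlow Lite converter, the technique named in option B. The SavedModel path is a hypothetical placeholder.

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("./saved_model")  # assumed export directory
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enables dynamic range quantization
tflite_model = converter.convert()

# Write the quantized model so it can be bundled with the mobile app.
with open("model_quantized.tflite", "wb") as f:
    f.write(tflite_model)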

Discussion 0
Questions 8

You are developing an ML model using a dataset with categorical input variables. You have randomly split half of the data into training and test sets. After applying one-hot encoding on the categorical variables in the training set, you discover that one categorical variable is missing from the test set. What should you do?

Options:

A.  

Randomly redistribute the data, with 70% for the training set and 30% for the test set

B.  

Use sparse representation in the test set

C.  

Apply one-hot encoding on the categorical variables in the test data.

D.  

Collect more data representing all categories
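
For reference, a minimal sketch (not part of the original exam item) of keeping train/test one-hot columns aligned: fit the encoder on the training split only, then reuse it on the test split. The data and column name are illustrative; sparse_output requires scikit-learn 1.2 or later.

import pandas as pd
from sklearn.preprocessing import OneHotEncoder

train = pd.DataFrame({"color": ["red", "green", "blue", "red"]})
test = pd.DataFrame({"color": ["red", "green"]})  # "blue" never appears in the test set

encoder = OneHotEncoder(handle_unknown="ignore", sparse_output=False)
encoder.fit(train[["color"]])  # the category vocabulary comes from the training data

X_train = encoder.transform(train[["color"]])
X_test = encoder.transform(test[["color"]])  # same columns as training; the "blue" column is all zeros

print(X_train.shape, X_test.shape)  # (4, 3) (2, 3)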

Discussion 0
Questions 9

You are working on a prototype of a text classification model in a managed Vertex AI Workbench notebook. You want to quickly experiment with tokenizing text by using a Natural Language Toolkit (NLTK) library. How should you add the library to your Jupyter kernel?

Options:

A.  

Install the NLTK library from a terminal by using the pip install nltk command.

B.  

Write a custom Dataflow job that uses NLTK to tokenize your text and saves the output to Cloud Storage.

C.  

Create a new Vertex Al Workbench notebook with a custom image that includes the NLTK library.

D.  

Install the NLTK library from a Jupyter cell by using the !pip install nltk --user command.
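
For reference, a minimal sketch (not part of the original exam item) of what such notebook cells could look like in a Jupyter kernel; the sentence is illustrative.

# Cell 1: install the library into the running kernel's environment (Jupyter shell magic).
!pip install nltk --user

# Cell 2: quick tokenization experiment.
import nltk
nltk.download("punkt")  # tokenizer models are downloaded on first use

from nltk.tokenize import word_tokenize
print(word_tokenize("Vertex AI Workbench makes prototyping easy."))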

Discussion 0
Questions 10

You need to train a natural language model to perform text classification on product descriptions that contain millions of examples and 100,000 unique words. You want to preprocess the words individually so that they can be fed into a recurrent neural network. What should you do?

Options:

A.  

Create a one-hot encoding of words, and feed the encodings into your model.

B.  

Identify word embeddings from a pre-trained model, and use the embeddings in your model.

C.  

Sort the words by frequency of occurrence, and use the frequencies as the encodings in your model.

D.  

Assign a numerical value to each word from 1 to 100,000 and feed the values as inputs in your model.
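
For reference, a minimal sketch (not part of the original exam item) of option B's idea: initialize an Embedding layer from pre-trained word vectors and feed it into a recurrent network. The embedding matrix here is a random stand-in for vectors you would load from a real pre-trained model, and the sizes are illustrative.

import numpy as np
import tensorflow as tf

vocab_size, embedding_dim = 100_000, 128
pretrained_matrix = np.random.normal(size=(vocab_size, embedding_dim)).astype("float32")  # placeholder

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        input_dim=vocab_size,
        output_dim=embedding_dim,
        embeddings_initializer=tf.keras.initializers.Constant(pretrained_matrix),
        trainable=False,  # keep the pre-trained vectors fixed
    ),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])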

Discussion 0
Questions 11

You work for the AI team of an automobile company, and you are developing a visual defect detection model using TensorFlow and Keras. To improve your model performance, you want to incorporate some image augmentation functions such as translation, cropping, and contrast tweaking. You randomly apply these functions to each training batch. You want to optimize your data processing pipeline for run time and compute resources utilization. What should you do?

Options:

A.  

Embed the augmentation functions dynamically in the tf.data pipeline.

B.  

Embed the augmentation functions dynamically as part of Keras generators.

C.  

Use Dataflow to create all possible augmentations, and store them as TFRecords.

D.  

Use Dataflow to create the augmentations dynamically per training run, and stage them as TFRecords.
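
For reference, a minimal sketch (not part of the original exam item) of option A's idea: apply random augmentations inside the tf.data input pipeline so they run on the fly, in parallel, for every batch. The tensors and image sizes are illustrative.

import tensorflow as tf

def augment(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_crop(image, size=[200, 200, 3])
    image = tf.image.random_contrast(image, lower=0.8, upper=1.2)
    return image, label

dataset = (
    tf.data.Dataset.from_tensor_slices((tf.random.uniform([32, 224, 224, 3]), tf.zeros([32])))
    .map(augment, num_parallel_calls=tf.data.AUTOTUNE)  # augmentations run per element, per epoch
    .batch(8)
    .prefetch(tf.data.AUTOTUNE)
)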

Discussion 0
Questions 12

You need to create a working environment in Vertex AI Workbench for a team of data scientists. Each data scientist has different VM CPU and RAM and package requirements, and will be assigned a personal notebook instance. You want each instance to have a custom set of packages pre-installed. Your company wants to minimize the running cost of notebook instances. How should you create the environment?

Options:

A.  

Create Vertex AI Workbench instances based on a Deep Learning Containers container image. Disable idle shutdown of each instance.

B.  

Create Vertex AI Workbench instances based on a Deep Learning Containers container image. Use n1-standard-4 as the default machine type. Verify idle shutdown of each instance.

C.  

Create Vertex AI Workbench instances based on a custom container image derived from a Deep Learning Containers image. Verify idle shutdown of each instance.

D.  

Create Vertex AI Workbench instances based on a custom container image derived from a Deep Learning Containers image. Use n1-standard-4 as the default machine type. Verify idle shutdown of each instance.

Discussion 0
Questions 13

You are an AI engineer working for a popular video streaming platform. You built a classification model using PyTorch to predict customer churn. Each week, the customer retention team plans to contact customers identified as at-risk for churning with personalized offers. You want to deploy the model while minimizing maintenance effort. What should you do?

Options:

A.  

Use Vertex AI’s prebuilt containers for prediction. Deploy the container on Cloud Run to generate online predictions.

B.  

Use Vertex AI’s prebuilt containers for prediction. Deploy the model on Google Kubernetes Engine (GKE), and configure the model for batch prediction.

C.  

Deploy the model to a Vertex AI endpoint, and configure the model for batch prediction. Schedule the batch prediction to run weekly.

D.  

Deploy the model to a Vertex AI endpoint, and configure the model for online prediction. Schedule a job to query this endpoint weekly.

Discussion 0
Questions 14

You have developed a fraud detection model for a large financial institution using Vertex AI. The model achieves high accuracy, but stakeholders are concerned about potential bias based on customer demographics. You have been asked to provide insights into the model's decision-making process and identify any fairness issues. What should you do?

Options:

A.  

Enable Vertex AI Model Monitoring to detect training-serving skew. Configure an alert to send an email when the skew or drift for a model’s feature exceeds a predefined threshold. Retrain the model by appending new data to existing training data.

B.  

Compile a dataset of unfair predictions. Use Vertex AI Vector Search to identify similar data points in the model's predictions. Report these data points to the stakeholders.

C.

Use feature attribution in Vertex AI to analyze model predictions and the impact of each feature on the model's predictions.

D.  

Create feature groups using Vertex AI Feature Store to segregate customer demographic features and non-demographic features. Retrain the model using only non-demographic features.

Discussion 0
Questions 15

You work for a hospital that wants to optimize how it schedules operations. You need to create a model that uses the relationship between the number of surgeries scheduled and the number of beds used. You want to predict how many beds will be needed for patients each day in advance, based on the scheduled surgeries. You have one year of data for the hospital, organized in 365 rows.

The data includes the following variables for each day:

• Number of scheduled surgeries

• Number of beds occupied

• Date

You want to maximize the speed of model development and testing. What should you do?

Options:

A.  

Create a BigQuery table. Use BigQuery ML to build a regression model, with number of beds as the target variable and number of scheduled surgeries and date features (such as day of week) as the predictors.

B.

Create a BigQuery table. Use BigQuery ML to build an ARIMA model, with number of beds as the target variable and date as the time variable.

C.

Create a Vertex AI tabular dataset. Train an AutoML regression model, with number of beds as the target variable and number of scheduled minor surgeries and date features (such as day of the week) as the predictors.

D.

Create a Vertex AI tabular dataset. Train a Vertex AI AutoML Forecasting model with number of beds as the target variable, number of scheduled surgeries as a covariate, and date as the time variable.
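
For reference, a minimal sketch (not part of the original exam item) of what the BigQuery ML regression approach described in option A could look like when submitted from Python. The dataset, table, and column names are hypothetical placeholders.

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumed project ID

query = """
CREATE OR REPLACE MODEL `my-project.hospital.bed_demand_model`
OPTIONS (model_type = 'LINEAR_REG', input_label_cols = ['beds_occupied']) AS
SELECT
  beds_occupied,
  scheduled_surgeries,
  EXTRACT(DAYOFWEEK FROM date) AS day_of_week   -- simple date feature
FROM `my-project.hospital.daily_counts`
"""
client.query(query).result()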

Discussion 0
Questions 16

You are responsible for building a unified analytics environment across a variety of on-premises data marts. Your company is experiencing data quality and security challenges when integrating data across the servers, caused by the use of a wide range of disconnected tools and temporary solutions. You need a fully managed, cloud-native data integration service that will lower the total cost of work and reduce repetitive work. Some members on your team prefer a codeless interface for building Extract, Transform, Load (ETL) process. Which service should you use?

Options:

A.  

Dataflow

B.  

Dataprep

C.  

Apache Flink

D.  

Cloud Data Fusion

Discussion 0
Questions 17

You are a lead ML engineer at a retail company. You want to track and manage ML metadata in a centralized way so that your team can have reproducible experiments by generating artifacts. Which management solution should you recommend to your team?

Options:

A.  

Store your tf.logging data in BigQuery.

B.  

Manage all relational entities in the Hive Metastore.

C.  

Store all ML metadata in Google Cloud’s operations suite.

D.  

Manage your ML workflows with Vertex ML Metadata.

Discussion 0
Questions 18

You work on an operations team at an international company that manages a large fleet of on-premises servers located in a few data centers around the world. Your team collects monitoring data from the servers, including CPU/memory consumption. When an incident occurs on a server, your team is responsible for fixing it. Incident data has not been properly labeled yet. Your management team wants you to build a predictive maintenance solution that uses monitoring data from the VMs to detect potential failures and then alerts the service desk team. What should you do first?

Options:

A.  

Train a time-series model to predict the machines’ performance values. Configure an alert if a machine’s actual performance values significantly differ from the predicted performance values.

B.  

Implement a simple heuristic (e.g., based on z-score) to label the machines’ historical performance data. Train a model to predict anomalies based on this labeled dataset.

C.  

Develop a simple heuristic (e.g., based on z-score) to label the machines’ historical performance data. Test this heuristic in a production environment.

D.  

Hire a team of qualified analysts to review and label the machines’ historical performance data. Train a model based on this manually labeled dataset.

Discussion 0
Questions 19

You work for a biotech startup that is experimenting with deep learning ML models based on properties of biological organisms. Your team frequently works on early-stage experiments with new architectures of ML models, and writes custom TensorFlow ops in C++. You train your models on large datasets and large batch sizes. Your typical batch size has 1024 examples, and each example is about 1 MB in size. The average size of a network with all weights and embeddings is 20 GB. What hardware should you choose for your models?

Options:

A.  

A cluster with 2 n1-highcpu-64 machines, each with 8 NVIDIA Tesla V100 GPUs (128 GB GPU memory in total), and a n1-highcpu-64 machine with 64 vCPUs and 58 GB RAM

B.  

A cluster with 2 a2-megagpu-16g machines, each with 16 NVIDIA Tesla A100 GPUs (640 GB GPU memory in total), 96 vCPUs, and 1.4 TB RAM

C.  

A cluster with an n1-highcpu-64 machine with a v2-8 TPU and 64 GB RAM

D.  

A cluster with 4 n1-highcpu-96 machines, each with 96 vCPUs and 86 GB RAM

Discussion 0
Questions 20

You are implementing a batch inference ML pipeline in Google Cloud. The model was developed using TensorFlow and is stored in SavedModel format in Cloud Storage. You need to apply the model to a historical dataset containing 10 TB of data that is stored in a BigQuery table. How should you perform the inference?

Options:

A.  

Export the historical data to Cloud Storage in Avro format. Configure a Vertex AI batch prediction job to generate predictions for the exported data.

B.

Import the TensorFlow model by using the CREATE MODEL statement in BigQuery ML. Apply the historical data to the TensorFlow model.

C.

Export the historical data to Cloud Storage in CSV format. Configure a Vertex AI batch prediction job to generate predictions for the exported data.

D.

Configure a Vertex AI batch prediction job to apply the model to the historical data in BigQuery.
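
For reference, a minimal sketch (not part of the original exam item) of a Vertex AI batch prediction job that reads from and writes to BigQuery, as described in option D. The project, model ID, and table names are hypothetical placeholders.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model("projects/my-project/locations/us-central1/models/1234567890")  # assumed model ID

batch_job = model.batch_predict(
    job_display_name="historical-scoring",
    bigquery_source="bq://my-project.analytics.historical_data",
    bigquery_destination_prefix="bq://my-project.analytics",
    instances_format="bigquery",
    predictions_format="bigquery",
    machine_type="n1-standard-4",
)
batch_job.wait()  # blocks until the batch job finishes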

Discussion 0
Questions 21

You need to quickly build and train a model to predict the sentiment of customer reviews with custom categories without writing code. You do not have enough data to train a model from scratch. The resulting model should have high predictive performance. Which service should you use?

Options:

A.  

AutoML Natural Language

B.  

Cloud Natural Language API

C.  

AI Hub pre-made Jupyter Notebooks

D.  

AI Platform Training built-in algorithms

Discussion 0
Questions 22

You are building a TensorFlow text-to-image generative model by using a dataset that contains billions of images with their respective captions. You want to create a low-maintenance, automated workflow that reads the data from a Cloud Storage bucket, collects statistics, splits the dataset into training/validation/test datasets, performs data transformations, trains the model using the training/validation datasets, and validates the model by using the test dataset. What should you do?

Options:

A.  

Use the Apache Airflow SDK to create multiple operators that use Dataflow and Vertex AI services. Deploy the workflow on Cloud Composer.

B.

Use the MLflow SDK and deploy it on a Google Kubernetes Engine cluster. Create multiple components that use Dataflow and Vertex AI services.

C.

Use the Kubeflow Pipelines (KFP) SDK to create multiple components that use Dataflow and Vertex AI services. Deploy the workflow on Vertex AI Pipelines.

D.

Use the TensorFlow Extended (TFX) SDK to create multiple components that use Dataflow and Vertex AI services. Deploy the workflow on Vertex AI Pipelines.

Discussion 0
Questions 23

Your organization wants to make its internal shuttle service route more efficient. The shuttles currently stop at all pick-up points across the city every 30 minutes between 7 am and 10 am. The development team has already built an application on Google Kubernetes Engine that requires users to confirm their presence and shuttle station one day in advance. What approach should you take?

Options:

A.  

1. Build a tree-based regression model that predicts how many passengers will be picked up at each shuttle station.

2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the prediction.

B.  

1. Build a tree-based classification model that predicts whether the shuttle should pick up passengers at each shuttle station.

2. Dispatch an available shuttle and provide the map with the required stops based on the prediction

C.  

1. Define the optimal route as the shortest route that passes by all shuttle stations with confirmed attendance at the given time under capacity constraints.

2. Dispatch an appropriately sized shuttle and indicate the required stops on the map.

D.  

1. Build a reinforcement learning model with tree-based classification models that predict the presence of passengers at shuttle stops as agents and a reward function around a distance-based metric

2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the simulated outcome.

Discussion 0
Questions 24

You need to design an architecture that serves asynchronous predictions to determine whether a particular mission-critical machine part will fail. Your system collects data from multiple sensors from the machine. You want to build a model that will predict a failure in the next N minutes, given the average of each sensor’s data from the past 12 hours. How should you design the architecture?

Options:

A.  

1. HTTP requests are sent by the sensors to your ML model, which is deployed as a microservice and exposes a REST API for prediction

2. Your application queries a Vertex AI endpoint where you deployed your model.

3. Responses are received by the caller application as soon as the model produces the prediction.

B.  

1. Events are sent by the sensors to Pub/Sub, consumed in real time, and processed by a Dataflow stream processing pipeline.

2. The pipeline invokes the model for prediction and sends the predictions to another Pub/Sub topic.

3. Pub/Sub messages containing predictions are then consumed by a downstream system for monitoring.

C.  

1. Export your data to Cloud Storage using Dataflow.

2. Submit a Vertex AI batch prediction job that uses your trained model in Cloud Storage to perform scoring on the preprocessed data.

3. Export the batch prediction job outputs from Cloud Storage and import them into Cloud SQL.

D.  

1. Export the data to Cloud Storage using the BigQuery command-line tool

2. Submit a Vertex AI batch prediction job that uses your trained model in Cloud Storage to perform scoring on the preprocessed data.

3. Export the batch prediction job outputs from Cloud Storage and import them into BigQuery.

Discussion 0
Questions 25

You trained a model on data stored in a Cloud Storage bucket. The model needs to be retrained frequently in Vertex AI Training using the latest data in the bucket. Data preprocessing is required prior to retraining. You want to build a simple and efficient near-real-time ML pipeline in Vertex AI that will preprocess the data when new data arrives in the bucket. What should you do?

Options:

A.  

Create a pipeline using the Vertex AI SDK. Schedule the pipeline with Cloud Scheduler to preprocess the new data in the bucket. Store the processed features in Vertex AI Feature Store.

B.  

Create a Cloud Run function that is triggered when new data arrives in the bucket. The function initiates a Vertex AI Pipeline to preprocess the new data and store the processed features in Vertex AI Feature Store.

C.  

Build a Dataflow pipeline to preprocess the new data in the bucket and store the processed features in BigQuery. Configure a cron job to trigger the pipeline execution.

D.  

Use the Vertex AI SDK to preprocess the new data in the bucket prior to each model retraining. Store the processed features in BigQuery.
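
For reference, a minimal sketch (not part of the original exam item) of the event-driven pattern in option B: a function that fires when a new object lands in the bucket and initiates a preprocessing pipeline run. The pipeline template path, bucket names, and parameter name are hypothetical placeholders.

import functions_framework
from google.cloud import aiplatform

@functions_framework.cloud_event
def on_new_data(cloud_event):
    data = cloud_event.data
    new_file = f"gs://{data['bucket']}/{data['name']}"  # the object that triggered the event

    aiplatform.init(project="my-project", location="us-central1")
    job = aiplatform.PipelineJob(
        display_name="preprocess-new-data",
        template_path="gs://my-bucket/pipelines/preprocess.json",  # compiled pipeline spec (assumed)
        parameter_values={"input_file": new_file},
    )
    job.submit()  # returns immediately; the pipeline runs asynchronously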

Discussion 0
Questions 26

Your data science team is training a PyTorch model for image classification based on a pre-trained ResNet model. You need to perform hyperparameter tuning to optimize for several parameters. What should you do?

Options:

A.  

Convert the model to a Keras model, and run a Keras Tuner job.

B.  

Run a hyperparameter tuning job on AI Platform using custom containers.

C.  

Create a Kubeflow Pipelines instance, and run a hyperparameter tuning job on Katib.

D.  

Convert the model to a TensorFlow model, and run a hyperparameter tuning job on AI Platform.

Discussion 0
Questions 27

You have written unit tests for a Kubeflow Pipeline that require custom libraries. You want to automate the execution of unit tests with each new push to your development branch in Cloud Source Repositories. What should you do?

Options:

A.  

Write a script that sequentially performs the push to your development branch and executes the unit tests on Cloud Run

B.  

Using Cloud Build, set an automated trigger to execute the unit tests when changes are pushed to your development branch.

C.  

Set up a Cloud Logging sink to a Pub/Sub topic that captures interactions with Cloud Source Repositories. Configure a Pub/Sub trigger for Cloud Run, and execute the unit tests on Cloud Run.

D.  

Set up a Cloud Logging sink to a Pub/Sub topic that captures interactions with Cloud Source Repositories. Execute the unit tests using a Cloud Function that is triggered when messages are sent to the Pub/Sub topic

Discussion 0
Questions 28

You developed a Vertex AI pipeline that trains a classification model on data stored in a large BigQuery table. The pipeline has four steps, where each step is created by a Python function that uses the Kubeflow v2 API. The components have the following names:

You launch your Vertex AI pipeline as follows:

You perform many model iterations by adjusting the code and parameters of the training step. You observe high costs associated with the development, particularly the data export and preprocessing steps. You need to reduce model development costs. What should you do?

Options:

A.  

B.  

C.  

D.  

Discussion 0
Questions 29

You work for a pharmaceutical company based in Canada. Your team developed a BigQuery ML model to predict the number of flu infections for the next month in Canada. Weather data is published weekly and flu infection statistics are published monthly. You need to configure a model retraining policy that minimizes cost. What should you do?

Options:

A.  

Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model weekly.

B.

Download the weather and flu data each month. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model monthly.

C.

Download the weather and flu data each week. Configure Cloud Scheduler to execute a Vertex AI pipeline to retrain the model every month.

D.

Download the weather data each week, and download the flu data each month. Deploy the model to a Vertex AI endpoint with feature drift monitoring, and retrain the model if a monitoring alert is detected.

Discussion 0
Questions 30

You work for a company that manages a ticketing platform for a large chain of cinemas. Customers use a mobile app to search for movies they’re interested in and purchase tickets in the app. Ticket purchase requests are sent to Pub/Sub and are processed with a Dataflow streaming pipeline configured to conduct the following steps:

1. Check for availability of the movie tickets at the selected cinema.

2. Assign the ticket price and accept payment.

3. Reserve the tickets at the selected cinema.

4. Send successful purchases to your database.

Each step in this process has low latency requirements (less than 50 milliseconds). You have developed a logistic regression model with BigQuery ML that predicts whether offering a promo code for free popcorn increases the chance of a ticket purchase, and this prediction should be added to the ticket purchase process. You want to identify the simplest way to deploy this model to production while adding minimal latency. What should you do?

Options:

A.  

Run batch inference with BigQuery ML every five minutes on each new set of tickets issued.

B.  

Export your model in TensorFlow format, and add a tfx_bsl.public.beam.RunInference step to the Dataflow pipeline.

C.  

Export your model in TensorFlow format, deploy it on Vertex AI, and query the prediction endpoint from your streaming pipeline.

D.  

Convert your model with TensorFlow Lite (TFLite), and add it to the mobile app so that the promo code and the incoming request arrive together in Pub/Sub.
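
For reference, a minimal sketch (not part of the original exam item) of the in-pipeline inference pattern named in option B: a tfx_bsl RunInference step inside a Beam pipeline. The model path is hypothetical, the input is a single illustrative example, and the surrounding Pub/Sub and ticketing steps are elided.

import apache_beam as beam
import tensorflow as tf
from tfx_bsl.public.beam import RunInference
from tfx_bsl.public.proto import model_spec_pb2

inference_spec = model_spec_pb2.InferenceSpecType(
    saved_model_spec=model_spec_pb2.SavedModelSpec(
        model_path="gs://my-bucket/models/promo_model"  # exported TensorFlow SavedModel (assumed path)
    )
)

# Stand-in input: one tf.train.Example; in the real pipeline this would come from the Pub/Sub read step.
example = tf.train.Example(features=tf.train.Features(feature={
    "basket_value": tf.train.Feature(float_list=tf.train.FloatList(value=[25.0])),
}))

with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        | "CreateExamples" >> beam.Create([example])
        | "PredictPromoUplift" >> RunInference(inference_spec)  # in-pipeline inference, no remote endpoint call
    )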

Discussion 0
Questions 31

You are designing an ML recommendation model for shoppers on your company's ecommerce website. You will use Recommendations AI to build, test, and deploy your system. How should you develop recommendations that increase revenue while following best practices?

Options:

A.  

Use the " Other Products You May Like " recommendation type to increase the click-through rate

B.  

Use the " Frequently Bought Together ' recommendation type to increase the shopping cart size for each order.

C.  

Import your user events and then your product catalog to make sure you have the highest quality event stream

D.  

Because it will take time to collect and record product data, use placeholder values for the product catalog to test the viability of the model.

Discussion 0
Questions 32

You have trained an XGBoost model that you plan to deploy on Vertex AI for online prediction. You are now uploading your model to Vertex AI Model Registry, and you need to configure the explanation method that will serve online prediction requests to be returned with minimal latency. You also want to be alerted when feature attributions of the model meaningfully change over time. What should you do?

Options:

A.  

1. Specify sampled Shapley as the explanation method with a path count of 5.

2. Deploy the model to Vertex AI Endpoints.

3. Create a Model Monitoring job that uses prediction drift as the monitoring objective.

B.

1. Specify Integrated Gradients as the explanation method with a path count of 5.

2. Deploy the model to Vertex AI Endpoints.

3. Create a Model Monitoring job that uses prediction drift as the monitoring objective.

C.

1. Specify sampled Shapley as the explanation method with a path count of 50.

2. Deploy the model to Vertex AI Endpoints.

3. Create a Model Monitoring job that uses training-serving skew as the monitoring objective.

D.

1. Specify Integrated Gradients as the explanation method with a path count of 50.

2. Deploy the model to Vertex AI Endpoints.

3. Create a Model Monitoring job that uses training-serving skew as the monitoring objective.

Discussion 0
Questions 33

You are a data scientist at an industrial equipment manufacturing company. You are developing a regression model to estimate the power consumption in the company’s manufacturing plants based on sensor data collected from all of the plants. The sensors collect tens of millions of records every day. You need to schedule daily training runs for your model that use all the data collected up to the current date. You want your model to scale smoothly and require minimal development work. What should you do?

Options:

A.  

Develop a custom TensorFlow regression model, and optimize it using Vertex AI Training.

B.

Develop a regression model using BigQuery ML.

C.

Develop a custom scikit-learn regression model, and optimize it using Vertex AI Training.

D.

Develop a custom PyTorch regression model, and optimize it using Vertex AI Training.

Discussion 0
Questions 34

You recently trained an XGBoost model that you plan to deploy to production for online inference. Before sending a predict request to your model's binary, you need to perform a simple data preprocessing step. This step exposes a REST API that accepts requests in your internal VPC Service Controls and returns predictions. You want to configure this preprocessing step while minimizing cost and effort. What should you do?

Options:

A.  

Store a pickled model in Cloud Storage. Build a Flask-based app, package the app in a custom container image, and deploy the model to Vertex AI Endpoints.

B.

Build a Flask-based app, package the app and a pickled model in a custom container image, and deploy the model to Vertex AI Endpoints.

C.

Build a custom predictor class based on the XGBoost Predictor from the Vertex AI SDK, package it and a pickled model in a custom container image based on a Vertex built-in image, and deploy the model to Vertex AI Endpoints.

D.

Build a custom predictor class based on the XGBoost Predictor from the Vertex AI SDK, and package the handler in a custom container image based on a Vertex built-in container image. Store a pickled model in Cloud Storage and deploy the model to Vertex AI Endpoints.

Discussion 0
Questions 35

Your team has been tasked with creating an ML solution in Google Cloud to classify support requests for one of your platforms. You analyzed the requirements and decided to use TensorFlow to build the classifier so that you have full control of the model's code, serving, and deployment. You will use Kubeflow pipelines for the ML platform. To save time, you want to build on existing resources and use managed services instead of building a completely new model. How should you build the classifier?

Options:

A.  

Use the Natural Language API to classify support requests

B.  

Use AutoML Natural Language to build the support requests classifier

C.  

Use an established text classification model on AI Platform to perform transfer learning.

D.

Use an established text classification model on AI Platform as-is to classify support requests.

Discussion 0
Questions 36

While performing exploratory data analysis on a dataset, you find that an important categorical feature has 5% null values. You want to minimize the bias that could result from the missing values. How should you handle the missing values?

Options:

A.  

Remove the rows with missing values, and upsample your dataset by 5%.

B.  

Replace the missing values with the feature’s mean.

C.  

Replace the missing values with a placeholder category indicating a missing value.

D.  

Move the rows with missing values to your validation dataset.
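
For reference, a minimal sketch (not part of the original exam item) of the approach described in option C: treating "missing" as its own category instead of dropping rows or imputing a statistic. The data and column name are illustrative.

import pandas as pd

df = pd.DataFrame({"payment_method": ["card", None, "cash", "card", None]})

# Replace nulls with an explicit placeholder category so the model can learn from
# the fact that the value was missing.
df["payment_method"] = df["payment_method"].fillna("MISSING")

print(df["payment_method"].value_counts())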

Discussion 0
Questions 37

You work at a bank. You have a custom tabular ML model that was provided by the bank's vendor. The training data is not available due to its sensitivity. The model is packaged as a Vertex AI Model serving container, which accepts a string as input for each prediction instance. In each string, the feature values are separated by commas. You want to deploy this model to production for online predictions, and monitor the feature distribution over time with minimal effort. What should you do?

Options:

A.  

1. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint.

2. Create a Vertex AI Model Monitoring job with feature drift detection as the monitoring objective, and provide an instance schema.

B.

1. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint.

2. Create a Vertex AI Model Monitoring job with feature skew detection as the monitoring objective, and provide an instance schema.

C.

1. Refactor the serving container to accept key-value pairs as the input format.

2. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint.

3. Create a Vertex AI Model Monitoring job with feature drift detection as the monitoring objective.

D.

1. Refactor the serving container to accept key-value pairs as the input format.

2. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint.

3. Create a Vertex AI Model Monitoring job with feature skew detection as the monitoring objective.

Discussion 0
Questions 38

You need to build an ML model for a social media application to predict whether a user’s submitted profile photo meets the requirements. The application will inform the user if the picture meets the requirements. How should you build a model to ensure that the application does not falsely accept a non-compliant picture?

Options:

A.  

Use AutoML to optimize the model’s recall in order to minimize false negatives.

B.  

Use AutoML to optimize the model’s F1 score in order to balance the accuracy of false positives and false negatives.

C.  

Use Vertex AI Workbench user-managed notebooks to build a custom model that has three times as many examples of pictures that meet the profile photo requirements.

D.  

Use Vertex AI Workbench user-managed notebooks to build a custom model that has three times as many examples of pictures that do not meet the profile photo requirements.

Discussion 0
Questions 39

You recently developed a wide and deep model in TensorFlow. You generated training datasets using a SQL script that preprocessed raw data in BigQuery by performing instance-level transformations of the data. You need to create a training pipeline to retrain the model on a weekly basis. The trained model will be used to generate daily recommendations. You want to minimize model development and training time. How should you develop the training pipeline?

Options:

A.  

Use the Kubeflow Pipelines SDK to implement the pipeline. Use the BigqueryQueryJobOp component to run the preprocessing script and the CustomTrainingJobOp component to launch a Vertex AI training job.

B.

Use the Kubeflow Pipelines SDK to implement the pipeline. Use the DataflowPythonJobOp component to preprocess the data and the CustomTrainingJobOp component to launch a Vertex AI training job.

C.

Use the TensorFlow Extended SDK to implement the pipeline. Use the ExampleGen component with the BigQuery executor to ingest the data, the Transform component to preprocess the data, and the Trainer component to launch a Vertex AI training job.

D.

Use the TensorFlow Extended SDK to implement the pipeline. Implement the preprocessing steps as part of the input_fn of the model. Use the ExampleGen component with the BigQuery executor to ingest the data and the Trainer component to launch a Vertex AI training job.

Discussion 0
Questions 40

Your team is training a large number of ML models that use different algorithms, parameters, and datasets. Some models are trained in Vertex AI Pipelines, and some are trained on Vertex AI Workbench notebook instances. Your team wants to compare the performance of the models across both services. You want to minimize the effort required to store the parameters and metrics. What should you do?

Options:

A.  

Implement an additional step for all the models running in pipelines and notebooks to export parameters and metrics to BigQuery.

B.  

Create a Vertex AI experiment. Submit all the pipelines as experiment runs. For models trained on notebooks, log parameters and metrics by using the Vertex AI SDK.

C.

Implement all models in Vertex AI Pipelines. Create a Vertex AI experiment, and associate all pipeline runs with that experiment.

D.

Store all model parameters and metrics as model metadata by using the Vertex AI Metadata API.
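
For reference, a minimal sketch (not part of the original exam item) of how a notebook-trained model could log to a shared Vertex AI experiment, as described in option B. The project, experiment, run names, and metric values are hypothetical placeholders.

from google.cloud import aiplatform

aiplatform.init(
    project="my-project",
    location="us-central1",
    experiment="model-comparison",  # the experiment shared with the pipeline runs
)

aiplatform.start_run("notebook-xgboost-run-1")
aiplatform.log_params({"algorithm": "xgboost", "max_depth": 6, "learning_rate": 0.1})
# ... train the model here ...
aiplatform.log_metrics({"rmse": 0.42, "r2": 0.87})
aiplatform.end_run()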

Discussion 0
Questions 41

You trained a text classification model. You have the following SignatureDefs:

What is the correct way to write the predict request?

Options:

A.  

data = json.dumps({ " signature_name " : " serving_default ' \ " instances " : [fab ' , ' be1, ' cd ' ]]})

B.  

data = json dumps({ " signature_name " : " serving_default " ! " instances " : [[ ' a ' , ' b ' , " c " , ' d ' , ' e ' , ' f ' ]]})

C.  

data = json.dumps({ " signature_name " : " serving_default, " instances " : [[ ' a ' , ' b\ ' c ' 1, [d\ ' e\ T]]})

D.  

data = json dumps({ " signature_name " : f,serving_default " , " instances " : [[ ' a ' , ' b ' ], [c\ ' d ' ], [ ' e\ T]]})

Discussion 0
Questions 42

Your task is to classify whether a company logo is present in an image. You found out that 96% of the data does not include a logo. You are dealing with a data imbalance problem. Which metric should you use to evaluate the model?

Options:

A.  

F1 Score

B.  

RMSE

C.  

F score with higher precision weighting than recall

D.

F score with higher recall weighting than precision
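
For reference, a minimal sketch (not part of the original exam item) of how F1 and precision- or recall-weighted F-beta scores are computed for an imbalanced problem. The labels and predictions are illustrative.

from sklearn.metrics import f1_score, fbeta_score

y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]   # heavily imbalanced: few logo-present examples
y_pred = [0, 0, 0, 0, 0, 1, 0, 0, 1, 0]

print("F1:", f1_score(y_true, y_pred))
print("F2 (recall-weighted):", fbeta_score(y_true, y_pred, beta=2))        # beta > 1 favors recall
print("F0.5 (precision-weighted):", fbeta_score(y_true, y_pred, beta=0.5))  # beta < 1 favors precision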

Discussion 0
Questions 43

You work for a food product company. Your company's historical sales data is stored in BigQuery. You need to use Vertex AI's custom training service to train multiple TensorFlow models that read the data from BigQuery and predict future sales. You plan to implement a data preprocessing algorithm that performs min-max scaling and bucketing on a large number of features before you start experimenting with the models. You want to minimize preprocessing time, cost, and development effort. How should you configure this workflow?

Options:

A.  

Write the transformations in Spark using the spark-bigquery-connector, and use Dataproc to preprocess the data.

B.  

Write SQL queries to transform the data in-place in BigQuery.

C.  

Add the transformations as a preprocessing layer in the TensorFlow models.

D.  

Create a Dataflow pipeline that uses the BigQueryIO connector to ingest the data, process it, and write it back to BigQuery.

Discussion 0
Questions 44

You are analyzing customer data for a healthcare organization that is stored in Cloud Storage. The data contains personally identifiable information (PII). You need to perform data exploration and preprocessing while ensuring the security and privacy of sensitive fields. What should you do?

Options:

A.  

Use the Cloud Data Loss Prevention (DLP) API to de-identify the PII before performing data exploration and preprocessing.

B.

Use customer-managed encryption keys (CMEK) to encrypt the PII data at rest, and decrypt the PII data during data exploration and preprocessing.

C.

Use a VM inside a VPC Service Controls security perimeter to perform data exploration and preprocessing.

D.

Use Google-managed encryption keys to encrypt the PII data at rest, and decrypt the PII data during data exploration and preprocessing.

Discussion 0
Questions 45

You work as an analyst at a large banking firm. You are developing a robust, scalable ML pipeline to train several regression and classification models. Your primary focus for the pipeline is model interpretability. You want to productionize the pipeline as quickly as possible. What should you do?

Options:

A.  

Use Tabular Workflow for Wide & Deep through Vertex AI Pipelines to jointly train wide linear models and deep neural networks.

B.  

Use Google Kubernetes Engine to build a custom training pipeline for XGBoost-based models.

C.  

Use Tabular Workflow for TabNet through Vertex AI Pipelines to train attention-based models.

D.  

Use Cloud Composer to build the training pipelines for custom deep learning-based models.

Discussion 0
Questions 46

You work for a gaming company that develops massively multiplayer online (MMO) games. You built a TensorFlow model that predicts whether players will make in-app purchases of more than $10 in the next two weeks. The model’s predictions will be used to adapt each user’s game experience. User data is stored in BigQuery. How should you serve your model while optimizing cost, user experience, and ease of management?

Options:

A.  

Import the model into BigQuery ML. Make predictions using batch reading data from BigQuery, and push the data to Cloud SQL

B.  

Deploy the model to Vertex AI Prediction. Make predictions using batch reading data from Cloud Bigtable, and push the data to Cloud SQL.

C.  

Embed the model in the mobile application. Make predictions after every in-app purchase event is published in Pub/Sub, and push the data to Cloud SQL.

D.  

Embed the model in the streaming Dataflow pipeline. Make predictions after every in-app purchase event is published in Pub/Sub, and push the data to Cloud SQL.

Discussion 0
Questions 47

You have created multiple versions of an ML model and have imported them to Vertex AI Model Registry. You want to perform A/B testing to identify the best-performing model using the simplest approach. What should you do?

Options:

A.  

Split incoming traffic among separate Cloud Run instances of deployed models. Monitor the performance of each version using Cloud Monitoring.

B.  

Split incoming traffic to distribute prediction requests among the versions. Monitor the performance of each version using Looker Studio dashboards that compare logged data for each version.

C.  

Split incoming traffic among Google Kubernetes Engine (GKE) clusters and use Traffic Director to distribute prediction requests to different versions. Monitor the performance of each version using Cloud Monitoring.

D.  

Split incoming traffic to distribute prediction requests among the versions. Monitor the performance of each version using Vertex AI’s built-in monitoring tools.

Discussion 0
Questions 48

Your team needs to build a model that predicts whether images contain a driver's license, passport, or credit card. The data engineering team already built the pipeline and generated a dataset composed of 10,000 images with driver's licenses, 1,000 images with passports, and 1,000 images with credit cards. You now have to train a model with the following label map: ['drivers_license', 'passport', 'credit_card']. Which loss function should you use?

Options:

A.  

Categorical hinge

B.  

Binary cross-entropy

C.  

Categorical cross-entropy

D.  

Sparse categorical cross-entropy
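
For reference, a minimal sketch (not part of the original exam item) contrasting the two cross-entropy variants for a three-class label map: with integer class indices (0, 1, 2) you would use sparse categorical cross-entropy, and with one-hot vectors you would use categorical cross-entropy. The model architecture and input shape are illustrative.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(64, 64, 3)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),  # drivers_license, passport, credit_card
])

# Labels provided as integer indices into the label map:
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Equivalent setup if labels were one-hot encoded instead:
# model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])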

Discussion 0
Questions 49

You recently joined an enterprise-scale company that has thousands of datasets. You know that there are accurate descriptions for each table in BigQuery, and you are searching for the proper BigQuery table to use for a model you are building on AI Platform. How should you find the data that you need?

Options:

A.  

Use Data Catalog to search the BigQuery datasets by using keywords in the table description.

B.  

Tag each of your model and version resources on AI Platform with the name of the BigQuery table that was used for training.

C.  

Maintain a lookup table in BigQuery that maps the table descriptions to the table ID. Query the lookup table to find the correct table ID for the data that you need.

D.  

Execute a query in BigQuery to retrieve all the existing table names in your project using the INFORMATION_SCHEMA metadata tables that are native to BigQuery. Use the result to find the table that you need.

Discussion 0
Questions 50

You are experimenting with a built-in distributed XGBoost model in Vertex AI Workbench user-managed notebooks. You use BigQuery to split your data into training and validation sets using the following queries:

CREATE OR REPLACE TABLE `myproject.mydataset.training` AS

(SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.8);

CREATE OR REPLACE TABLE `myproject.mydataset.validation` AS

(SELECT * FROM `myproject.mydataset.mytable` WHERE RAND() <= 0.2);

After training the model, you achieve an area under the receiver operating characteristic curve (AUC ROC) value of 0.8, but after deploying the model to production, you notice that your model performance has dropped to an AUC ROC value of 0.65. What problem is most likely occurring?

Options:

A.  

There is training-serving skew in your production environment.

B.  

There is not a sufficient amount of training data.

C.  

The tables that you created to hold your training and validation records share some records, and you may not be using all the data in your initial table.

D.  

The RAND() function generated a number that is less than 0.2 in both instances, so every record in the validation table will also be in the training table.
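
For reference, a minimal sketch (not part of the original exam item) of a deterministic, non-overlapping alternative to the RAND()-based split in the question: hashing a key gives each row a stable split assignment. The key column ("id") and the Python wrapper are hypothetical.

from google.cloud import bigquery

client = bigquery.Client(project="myproject")

split_sql = """
CREATE OR REPLACE TABLE `myproject.mydataset.training` AS
SELECT * FROM `myproject.mydataset.mytable`
WHERE MOD(ABS(FARM_FINGERPRINT(CAST(id AS STRING))), 10) < 8;   -- ~80% of rows

CREATE OR REPLACE TABLE `myproject.mydataset.validation` AS
SELECT * FROM `myproject.mydataset.mytable`
WHERE MOD(ABS(FARM_FINGERPRINT(CAST(id AS STRING))), 10) >= 8;  -- remaining ~20%, no overlap
"""
client.query(split_sql).result()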

Discussion 0
Questions 51

You are developing a recommendation engine for an online clothing store. The historical customer transaction data is stored in BigQuery and Cloud Storage. You need to perform exploratory data analysis (EDA), preprocessing and model training. You plan to rerun these EDA, preprocessing, and training steps as you experiment with different types of algorithms. You want to minimize the cost and development effort of running these steps as you experiment. How should you configure the environment?

Options:

A.  

Create a Vertex AI Workbench user-managed notebook using the default VM instance, and use the %%bigquery magic commands in Jupyter to query the tables.

B.

Create a Vertex AI Workbench managed notebook to browse and query the tables directly from the JupyterLab interface.

C.

Create a Vertex AI Workbench user-managed notebook on a Dataproc Hub, and use the %%bigquery magic commands in Jupyter to query the tables.

D.

Create a Vertex AI Workbench managed notebook on a Dataproc cluster, and use the spark-bigquery-connector to access the tables.

Discussion 0
Questions 52

You created an ML pipeline with multiple input parameters. You want to investigate the tradeoffs between different parameter combinations. The parameter options are:

• input dataset

• Max tree depth of the boosted tree regressor

• Optimizer learning rate

You need to compare the pipeline performance of the different parameter combinations measured in F1 score, time to train and model complexity. You want your approach to be reproducible and track all pipeline runs on the same platform. What should you do?

Options:

A.  

1. Use BigQuery ML to create a boosted tree regressor, and use the hyperparameter tuning capability.

2. Configure the hyperparameter syntax to select different input datasets, max tree depths, and optimizer learning rates. Choose the grid search option.

B.

1. Create a Vertex AI pipeline with a custom model training job as part of the pipeline. Configure the pipeline's parameters to include those you are investigating.

2. In the custom training step, use the Bayesian optimization method with F1 score as the target to maximize.

C.

1. Create a Vertex AI Workbench notebook for each of the different input datasets.

2. In each notebook, run different local training jobs with different combinations of the max tree depth and optimizer learning rate parameters.

3. After each notebook finishes, append the results to a BigQuery table.

D.

1. Create an experiment in Vertex AI Experiments.

2. Create a Vertex AI pipeline with a custom model training job as part of the pipeline. Configure the pipeline's parameters to include those you are investigating.

3. Submit multiple runs to the same experiment, using different values for the parameters.

Discussion 0
Questions 53

While running a model training pipeline on Vertex AI, you discover that the evaluation step is failing because of an out-of-memory error. You are currently using TensorFlow Model Analysis (TFMA) with a standard Evaluator TensorFlow Extended (TFX) pipeline component for the evaluation step. You want to stabilize the pipeline without downgrading the evaluation quality while minimizing infrastructure overhead. What should you do?

Options:

A.  

Add tfma.MetricsSpec() to limit the number of metrics in the evaluation step.

B.  

Migrate your pipeline to Kubeflow hosted on Google Kubernetes Engine, and specify the appropriate node parameters for the evaluation step.

C.  

Include the flag --runner=DataflowRunner in beam_pipeline_args to run the evaluation step on Dataflow.

D.  

Move the evaluation step out of your pipeline and run it on custom Compute Engine VMs with sufficient memory.
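
For reference, a minimal sketch (not part of the original exam item) of how Beam pipeline arguments like those in option C could be passed so that Beam-based TFX components (such as the Evaluator) run on Dataflow. The project, region, bucket, and empty components list are hypothetical placeholders.

from tfx import v1 as tfx

beam_pipeline_args = [
    "--runner=DataflowRunner",
    "--project=my-project",
    "--region=us-central1",
    "--temp_location=gs://my-bucket/tmp",
]

# Beam-based components inherit these args, so memory-hungry evaluation work is
# distributed across Dataflow workers instead of a single pipeline worker.
pipeline = tfx.dsl.Pipeline(
    pipeline_name="training-pipeline",
    pipeline_root="gs://my-bucket/pipeline-root",
    components=[],  # the existing ExampleGen/Trainer/Evaluator components would go here
    beam_pipeline_args=beam_pipeline_args,
)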

Discussion 0
Questions 54

You are building a linear regression model on BigQuery ML to predict a customer's likelihood of purchasing your company's products. Your model uses a city name variable as a key predictive component. In order to train and serve the model, your data must be organized in columns. You want to prepare your data using the least amount of coding while maintaining the predictable variables. What should you do?

Options:

A.  

Create a new view with BigQuery that does not include a column with city information

B.  

Use Dataprep to transform the state column using a one-hot encoding method, and make each city a column with binary values.

C.  

Use Cloud Data Fusion to assign each city to a region labeled as 1, 2, 3, 4, or 5, and then use that number to represent the city in the model.

D.

Use TensorFlow to create a categorical variable with a vocabulary list. Create the vocabulary file, and upload it as part of your model to BigQuery ML.

Discussion 0
Questions 55

You are an ML engineer at a manufacturing company. You are creating a classification model for a predictive maintenance use case. You need to predict whether a crucial machine will fail in the next three days so that the repair crew has enough time to fix the machine before it breaks. Regular maintenance of the machine is relatively inexpensive, but a failure would be very costly. You have trained several binary classifiers to predict whether the machine will fail, where a prediction of 1 means that the ML model predicts a failure.

You are now evaluating each model on an evaluation dataset. You want to choose a model that prioritizes detection while ensuring that more than 50% of the maintenance jobs triggered by your model address an imminent machine failure. Which model should you choose?

Options:

A.  

The model with the highest area under the receiver operating characteristic curve (AUC ROC) and precision greater than 0.5.

B.  

The model with the lowest root mean squared error (RMSE) and recall greater than 0.5.

C.  

The model with the highest recall where precision is greater than 0.5.

D.  

The model with the highest precision where recall is greater than 0.5.
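
For reference, a minimal sketch (not part of the original exam item) of the selection rule described in the question: keep candidates whose precision exceeds 0.5 (more than half of triggered maintenance jobs address a real failure) and, among those, pick the one with the highest recall (most failures detected). The metric values are illustrative.

candidates = {
    "model_a": {"precision": 0.48, "recall": 0.90},
    "model_b": {"precision": 0.55, "recall": 0.82},
    "model_c": {"precision": 0.70, "recall": 0.60},
}

# Filter on the precision constraint, then maximize recall among the eligible models.
eligible = {name: m for name, m in candidates.items() if m["precision"] > 0.5}
best = max(eligible, key=lambda name: eligible[name]["recall"])
print(best)  # model_b: highest recall among models with precision > 0.5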

Discussion 0
Questions 56

Your organization's call center has asked you to develop a model that analyzes customer sentiments in each call. The call center receives over one million calls daily, and data is stored in Cloud Storage. The data collected must not leave the region in which the call originated, and no Personally Identifiable Information (PII) can be stored or analyzed. The data science team has a third-party tool for visualization and access which requires a SQL ANSI-2011 compliant interface. You need to select components for data processing and for analytics. How should the data pipeline be designed?

Options:

A.  

1 = Dataflow, 2 = BigQuery

B.  

1 = Pub/Sub, 2 = Datastore

C.  

1 = Dataflow, 2 = Cloud SQL

D.  

1 = Cloud Function, 2 = Cloud SQL

Discussion 0
Questions 57

You trained a model, packaged it with a custom Docker container for serving, and deployed it to Vertex AI Model Registry. When you submit a batch prediction job, it fails with this error: "Error: model server never became ready. Please validate that your model file or container configuration are valid." There are no additional errors in the logs. What should you do?

Options:

A.  

Add a logging configuration to your application to emit logs to Cloud Logging.

B.  

Change the HTTP port in your model's configuration to the default value of 8080.

C.

Change the healthRoute value in your model's configuration to /healthcheck.

D.

Pull the Docker image locally and use the docker run command to launch it locally. Use the docker logs command to explore the error logs.

Discussion 0
Questions 58

You work as an ML researcher at an investment bank and are experimenting with the Gemini large language model (LLM). You plan to deploy the model for an internal use case and need full control of the model’s underlying infrastructure while minimizing inference time. Which serving configuration should you use for this task?

Options:

A.  

Deploy the model on a Vertex AI endpoint using one-click deployment in Model Garden.

B.  

Deploy the model on a Google Kubernetes Engine (GKE) cluster manually by creating a custom YAML manifest.

C.  

Deploy the model on a Vertex AI endpoint manually by creating a custom inference container.

D.  

Deploy the model on a Google Kubernetes Engine (GKE) cluster using the deployment options in Model Garden.

Discussion 0
Questions 59

You built a custom Vertex AI pipeline job that preprocesses images and trains an object detection model. The pipeline currently uses 1 n1-standard-8 machine with 1 NVIDIA Tesla V100 GPU. You want to reduce the model training time without compromising model accuracy. What should you do?

Options:

A.  

Reduce the number of layers in your object detection model.

B.  

Train the same model on a stratified subset of your dataset.

C.  

Update the WorkerPoolSpec to use a machine with 24 vCPUs and 1 NVIDIA Tesla V100 GPU.

D.  

Update the WorkerPoolSpec to use a machine with 24 vCPUs and 3 NVIDIA Tesla V100 GPUs.

Discussion 0
Questions 60

You work for a global footwear retailer and need to predict when an item will be out of stock based on historical inventory data. Customer behavior is highly dynamic since footwear demand is influenced by many different factors. You want to serve models that are trained on all available data, but track your performance on specific subsets of data before pushing to production. What is the most streamlined and reliable way to perform this validation?

Options:

A.  

Use the TFX ModelValidator tools to specify performance metrics for production readiness

B.  

Use k-fold cross-validation as a validation strategy to ensure that your model is ready for production.

C.  

Use the last relevant week of data as a validation set to ensure that your model is performing accurately on current data

D.  

Use the entire dataset and treat the area under the receiver operating characteristics curve (AUC ROC) as the main metric.

Discussion 0
Questions 61

You work for a toy manufacturer that has been experiencing a large increase in demand. You need to build an ML model to reduce the amount of time spent by quality control inspectors checking for product defects. Faster defect detection is a priority. The factory does not have reliable Wi-Fi. Your company wants to implement the new ML model as soon as possible. Which model should you use?

Options:

A.  

AutoML Vision model

B.  

AutoML Vision Edge mobile-versatile-1 model

C.  

AutoML Vision Edge mobile-low-latency-1 model

D.  

AutoML Vision Edge mobile-high-accuracy-1 model

Discussion 0
Questions 62

You have been tasked with deploying prototype code to production. The feature engineering code is in PySpark and runs on Dataproc Serverless. The model training is executed by using a Vertex AI custom training job. The two steps are not connected, and the model training must currently be run manually after the feature engineering step finishes. You need to create a scalable and maintainable production process that runs end-to-end and tracks the connections between steps. What should you do?

Options:

A.  

Create a Vertex AI Workbench notebook. Use the notebook to submit the Dataproc Serverless feature engineering job. Use the same notebook to submit the custom model training job. Run the notebook cells sequentially to tie the steps together end-to-end.

B.  

Create a Vertex AI Workbench notebook. Initiate an Apache Spark context in the notebook, and run the PySpark feature engineering code. Use the same notebook to run the custom model training job in TensorFlow. Run the notebook cells sequentially to tie the steps together end-to-end.

C.  

Use the Kubeflow Pipelines SDK to write code that specifies two components:

- The first is a Dataproc Serverless component that launches the feature engineering job.

- The second is a custom component wrapped in the create_custom_training_job_from_component utility that launches the custom model training job.

D.  

Use the Kubeflow Pipelines SDK to write code that specifies two components:

- The first component initiates an Apache Spark context that runs the PySpark feature engineering code.

- The second component runs the TensorFlow custom model training code.

Create a Vertex AI Pipelines job to link and run both components.
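
For reference, a minimal sketch of the two-component pipeline described in options C and D, written with the Kubeflow Pipelines SDK and the google-cloud-pipeline-components library. The DataprocPySparkBatchOp parameters, script path, and project values are assumptions for illustration; create_custom_training_job_from_component is the utility named in option C.

```python
from kfp import compiler, dsl
from google_cloud_pipeline_components.v1.dataproc import DataprocPySparkBatchOp
from google_cloud_pipeline_components.v1.custom_job import (
    create_custom_training_job_from_component,
)


@dsl.component(base_image="python:3.10")
def train_model(features_uri: str):
    """Placeholder for the custom model training logic."""
    print(f"Training on features at {features_uri}")


# Wrap the lightweight component so it runs as a Vertex AI custom training job.
train_as_custom_job = create_custom_training_job_from_component(
    train_model, machine_type="n1-standard-8"
)


@dsl.pipeline(name="feature-engineering-and-training")
def pipeline(project: str, location: str, features_uri: str):
    # Step 1: a Dataproc Serverless batch runs the existing PySpark feature engineering code.
    feature_step = DataprocPySparkBatchOp(
        project=project,
        location=location,
        batch_id="feature-engineering",                                # illustrative
        main_python_file_uri="gs://my-bucket/feature_engineering.py",  # hypothetical path
    )
    # Step 2: the training job runs only after feature engineering finishes.
    train_as_custom_job(
        project=project, location=location, features_uri=features_uri
    ).after(feature_step)


compiler.Compiler().compile(pipeline, "pipeline.json")
```

Running the compiled definition as a Vertex AI Pipelines job records the lineage between the two steps, which provides the end-to-end tracking the question asks for.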

Discussion 0
Questions 63

You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on AI Platform for high-throughput online prediction. Which architecture should you use?

Options:

A.  

• Validate the accuracy of the model that you trained on preprocessed data

• Create a new model that uses the raw data and is available in real time

• Deploy the new model onto Al Platform for online prediction

B.  

• Send incoming prediction requests to a Pub/Sub topic

• Transform the incoming data using a Dataflow job

• Submit a prediction request to Al Platform using the transformed data

• Write the predictions to an outbound Pub/Sub queue

C.  

• Stream incoming prediction request data into Cloud Spanner

• Create a view to abstract your preprocessing logic.

• Query the view every second for new records

• Submit a prediction request to Al Platform using the transformed data

• Write the predictions to an outbound Pub/Sub queue.

D.  

• Send incoming prediction requests to a Pub/Sub topic

• Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic.

• Implement your preprocessing logic in the Cloud Function

• Submit a prediction request to Al Platform using the transformed data

• Write the predictions to an outbound Pub/Sub queue

Discussion 0
Questions 64

You recently deployed a model to a Vertex AI endpoint. Your data drifts frequently, so you have enabled request-response logging and created a Vertex AI Model Monitoring job. You have observed that your model is receiving higher traffic than expected. You need to reduce the model monitoring cost while continuing to quickly detect drift. What should you do?

Options:

A.  

Replace the monitoring job with a Dataflow pipeline that uses TensorFlow Data Validation (TFDV).

B.  

Replace the monitoring job with a custom SQL script to calculate statistics on the features and predictions in BigQuery.

C.  

Decrease the sample_rate parameter in the RandomSampleConfig of the monitoring job.

D.  

Increase the monitor_interval parameter in the ScheduleConfig of the monitoring job.
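
For reference, a minimal sketch of where sample_rate and monitor_interval live when creating a monitoring job with the google-cloud-aiplatform SDK. The endpoint, email address, feature name, and threshold are hypothetical.

```python
from google.cloud import aiplatform
from google.cloud.aiplatform import model_monitoring

aiplatform.init(project="my-project", location="us-central1")

job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="drift-monitoring",
    endpoint="projects/my-project/locations/us-central1/endpoints/1234567890",  # hypothetical
    # A lower sample_rate analyzes a smaller fraction of requests (reducing cost);
    # a larger monitor_interval runs the analysis less often.
    logging_sampling_strategy=model_monitoring.RandomSampleConfig(sample_rate=0.1),
    schedule_config=model_monitoring.ScheduleConfig(monitor_interval=1),  # hours
    alert_config=model_monitoring.EmailAlertConfig(user_emails=["ml-team@example.com"]),
    objective_configs=model_monitoring.ObjectiveConfig(
        drift_detection_config=model_monitoring.DriftDetectionConfig(
            drift_thresholds={"feature_1": 0.05}  # hypothetical feature
        )
    ),
)
```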

Discussion 0
Questions 65

You are training an object detection machine learning model on a dataset that consists of three million X-ray images, each roughly 2 GB in size. You are using Vertex AI Training to run a custom training application on a Compute Engine instance with 32-cores, 128 GB of RAM, and 1 NVIDIA P100 GPU. You notice that model training is taking a very long time. You want to decrease training time without sacrificing model performance. What should you do?

Options:

A.  

Increase the instance memory to 512 GB and increase the batch size.

B.  

Replace the NVIDIA P100 GPU with a v3-32 TPU in the training job.

C.  

Enable early stopping in your Vertex AI Training job.

D.  

Use the tf.distribute.Strategy API and run a distributed training job.
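
For reference, a minimal sketch of single-machine, multi-GPU data parallelism with tf.distribute.MirroredStrategy; the model and dataset are placeholders for the X-ray training pipeline.

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()  # replicates the model across all visible GPUs
print("Replicas in sync:", strategy.num_replicas_in_sync)

# Scale the global batch size with the replica count so each GPU keeps its per-device batch.
global_batch_size = 16 * strategy.num_replicas_in_sync

with strategy.scope():
    # Variables created in this scope are mirrored on every device.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(128, 128, 3)),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

# Stand-in data; in practice this would be the preprocessed image dataset.
images = tf.random.uniform((64, 128, 128, 3))
labels = tf.cast(tf.random.uniform((64, 1), maxval=2, dtype=tf.int32), tf.float32)
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(global_batch_size)

model.fit(dataset, epochs=1)
```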

Discussion 0
Questions 66

Your organization’s marketing team is building a customer recommendation chatbot that uses a generative AI large language model (LLM) to provide personalized product suggestions in real time. The chatbot needs to access data from millions of customers, including purchase history, browsing behavior, and preferences. The data is stored in a Cloud SQL for PostgreSQL database. You need the chatbot response time to be less than 100ms. How should you design the system?

Options:

A.  

Use BigQuery ML to fine-tune the LLM with the data in the Cloud SQL for PostgreSQL database, and access the model from BigQuery.

B.  

Replicate the Cloud SQL for PostgreSQL database to AlloyDB. Configure the chatbot server to query AlloyDB.

C.  

Transform relevant customer data into vector embeddings and store them in Vertex AI Search for retrieval by the LLM.

D.  

Create a caching layer between the chatbot and the Cloud SQL for PostgreSQL database to store frequently accessed customer data. Configure the chatbot server to query the cache.

Discussion 0
Questions 67

You are working on a binary classification ML algorithm that detects whether an image of a classified scanned document contains a company’s logo. In the dataset, 96% of examples don’t have the logo, so the dataset is very skewed. Which metrics would give you the most confidence in your model?

Options:

A.  

F-score where recall is weighed more than precision

B.  

RMSE

C.  

F1 score

D.  

F-score where precision is weighed more than recall

Discussion 0
Questions 68

You are the lead ML engineer on a mission-critical project that involves analyzing massive datasets using Apache Spark. You need to establish a robust environment that allows your team to rapidly prototype Spark models using Jupyter notebooks. What is the fastest way to achieve this?

Options:

A.  

Configure a Compute Engine instance with Spark and use Jupyter notebooks.

B.  

Set up a Dataproc cluster with Spark and use Jupyter notebooks.

C.  

Set up a Vertex AI Workbench instance with a Spark kernel.

D.  

Use Colab Enterprise with a Spark kernel.

Discussion 0
Questions 69

You are developing a custom TensorFlow classification model based on tabular data. Your raw data is stored in BigQuery, contains hundreds of millions of rows, and includes both categorical and numerical features. You need to apply min-max scaling to some numerical features and one-hot encoding to some categorical features, such as SKU names. Your model will be trained over multiple epochs. You want to minimize the effort and cost of your solution. What should you do?

Options:

A.  

1. Write a SQL query to create a separate lookup table to scale the numerical features.

2. Deploy a TensorFlow-based model from Hugging Face to BigQuery to encode the text features.

3. Feed the resulting BigQuery view into Vertex AI Training.

B.  

1. Use BigQuery to scale the numerical features.

2. Feed the features into Vertex AI Training.

3. Allow TensorFlow to perform the one-hot text encoding.

C.  

1. Use TFX components with Dataflow to encode the text features and scale the numerical features.

2. Export the results to Cloud Storage as TFRecords.

3. Feed the data into Vertex AI Training.

D.  

1. Write a SQL query to create a separate lookup table to scale the numerical features.

2. Perform the one-hot text encoding in BigQuery.

3. Feed the resulting BigQuery view into Vertex AI Training.
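
For reference, a minimal sketch of handling the encoding inside the TensorFlow model itself with Keras preprocessing layers, assuming the feature minimum/maximum and the SKU vocabulary have already been computed (for example, with a one-off BigQuery query). The feature names, statistics, and vocabulary are illustrative.

```python
import tensorflow as tf

price_min, price_max = 0.0, 500.0              # precomputed statistics (hypothetical)
sku_vocab = ["SKU-001", "SKU-002", "SKU-003"]  # precomputed vocabulary (hypothetical)

price_in = tf.keras.Input(shape=(1,), name="price")
sku_in = tf.keras.Input(shape=(1,), dtype=tf.string, name="sku")

# Min-max scaling as a fixed affine transform: (x - min) / (max - min).
price_scaled = tf.keras.layers.Rescaling(
    scale=1.0 / (price_max - price_min),
    offset=-price_min / (price_max - price_min),
)(price_in)

# One-hot encoding of the categorical SKU feature (plus one out-of-vocabulary slot).
sku_one_hot = tf.keras.layers.StringLookup(
    vocabulary=sku_vocab, output_mode="one_hot"
)(sku_in)

features = tf.keras.layers.Concatenate()([price_scaled, sku_one_hot])
output = tf.keras.layers.Dense(1, activation="sigmoid")(features)

model = tf.keras.Model(inputs=[price_in, sku_in], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy")
```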

Discussion 0
Questions 70

You lead a data science team at a large international corporation. Most of the models your team trains are large-scale models using high-level TensorFlow APIs on AI Platform with GPUs. Your team usually takes a few weeks or months to iterate on a new version of a model. You were recently asked to review your team’s spending. How should you reduce your Google Cloud compute costs without impacting the model’s performance?

Options:

A.  

Use AI Platform to run distributed training jobs with checkpoints.

B.  

Use AI Platform to run distributed training jobs without checkpoints.

C.  

Migrate to training with Kubeflow on Google Kubernetes Engine, and use preemptible VMs with checkpoints.

D.  

Migrate to training with Kubeflow on Google Kubernetes Engine, and use preemptible VMs without checkpoints.

Discussion 0
Questions 71

You work for a gaming company that has millions of customers around the world. All games offer a chat feature that allows players to communicate with each other in real time. Messages can be typed in more than 20 languages and are translated in real time using the Cloud Translation API. You have been asked to build an ML system to moderate the chat in real time while ensuring that performance is uniform across the various languages, without changing the serving infrastructure.

You trained your first model using an in-house word2vec model for embedding the chat messages translated by the Cloud Translation API. However, the model has significant differences in performance across the different languages. How should you improve it?

Options:

A.  

Add a regularization term such as the Min-Diff algorithm to the loss function.

B.  

Train a classifier using the chat messages in their original language.

C.  

Replace the in-house word2vec with GPT-3 or T5.

D.  

Remove moderation for languages for which the false positive rate is too high.
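
For reference, a minimal sketch of the MinDiff-style regularization mentioned in option A, assuming the tensorflow-model-remediation library; the base classifier and the per-language data slices are placeholders.

```python
import tensorflow as tf
from tensorflow_model_remediation import min_diff

original_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(128,)),
    tf.keras.layers.Dense(1),  # logit for "needs moderation / does not"
])


def fake_dataset(n: int) -> tf.data.Dataset:
    # Placeholder for embedded chat messages and moderation labels.
    x = tf.random.normal((n, 128))
    y = tf.cast(tf.random.uniform((n, 1), maxval=2, dtype=tf.int32), tf.float32)
    return tf.data.Dataset.from_tensor_slices((x, y)).batch(32)


train_ds = fake_dataset(512)
slice_a = fake_dataset(128)  # e.g., benign messages from an under-performing language
slice_b = fake_dataset(128)  # e.g., benign messages from a well-performing language

# MinDiff adds a penalty when the score distributions of the two slices diverge.
packed = min_diff.keras.utils.pack_min_diff_data(
    original_dataset=train_ds,
    sensitive_group_dataset=slice_a,
    nonsensitive_group_dataset=slice_b,
)

model = min_diff.keras.MinDiffModel(original_model, min_diff.losses.MMDLoss(), 1.0)
model.compile(optimizer="adam", loss=tf.keras.losses.BinaryCrossentropy(from_logits=True))
model.fit(packed, epochs=1)
```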

Discussion 0
Questions 72

You are deploying a new version of a model to a production Vertex AI endpoint that is serving traffic. You plan to direct all user traffic to the new model. You need to deploy the model with minimal disruption to your application. What should you do?

Options:

A.  

1. Create a new endpoint.

2. Create a new model. Set it as the default version. Upload the model to Vertex AI Model Registry.

3. Deploy the new model to the new endpoint.

4. Update Cloud DNS to point to the new endpoint.

B.  

1. Create a new endpoint.

2. Create a new model. Set the parentModel parameter to the model ID of the currently deployed model, and set it as the default version. Upload the model to Vertex AI Model Registry.

3. Deploy the new model to the new endpoint, and set the new model to 100% of the traffic.

C.  

1. Create a new model. Set the parentModel parameter to the model ID of the currently deployed model. Upload the model to Vertex AI Model Registry.

2. Deploy the new model to the existing endpoint, and set the new model to 100% of the traffic.

D.  

1. Create a new model. Set it as the default version. Upload the model to Vertex AI Model Registry.

2. Deploy the new model to the existing endpoint.
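
For reference, a minimal sketch of registering a new model version under the existing Model Registry entry (parentModel) and deploying it to the endpoint that is already serving traffic, using the google-cloud-aiplatform SDK; the resource IDs and container image are hypothetical.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"  # existing endpoint
)

# parent_model makes this upload a new version of the existing model resource.
new_version = aiplatform.Model.upload(
    parent_model="projects/my-project/locations/us-central1/models/9876543210",
    display_name="my-model",
    serving_container_image_uri="us-docker.pkg.dev/my-project/serving/my-model:v2",
    is_default_version=True,
)

# Deploy to the existing endpoint and send all traffic to the new version.
new_version.deploy(
    endpoint=endpoint,
    machine_type="n1-standard-4",
    traffic_percentage=100,
)
```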

Discussion 0
Questions 73

You are training a custom language model for your company using a large dataset. You plan to use the Reduction Server strategy on Vertex AI. You need to configure the worker pools of the distributed training job. What should you do?

Options:

A.  

Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs. Configure the third worker pool to have GPUs, and use the reduction server container image.

B.  

Configure the machines of the first two worker pools to have GPUs and to use a container image where your training code runs. Configure the third worker pool to use the reductionserver container image without accelerators, and choose a machine type that prioritizes bandwidth.

C.  

Configure the machines of the first two worker pools to have TPUs and to use a container image where your training code runs. Configure the third worker pool to use the reductionserver container image without accelerators, and choose a machine type that prioritizes bandwidth.

D.  

Configure the machines of the first two worker pools to have TPUs and to use a container image where your training code runs. Configure the third worker pool to have TPUs, and use the reductionserver container image.
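
For reference, a minimal sketch of a three-pool worker configuration for Reduction Server with the google-cloud-aiplatform SDK: the first two pools carry GPUs and the training container, while the third pool runs the published Reduction Server image on accelerator-free, bandwidth-oriented machines. The project, bucket, training image, and machine sizes are illustrative.

```python
from google.cloud import aiplatform

aiplatform.init(
    project="my-project", location="us-central1", staging_bucket="gs://my-bucket"
)

gpu_pool = {
    "machine_spec": {
        "machine_type": "n1-standard-16",
        "accelerator_type": "NVIDIA_TESLA_V100",
        "accelerator_count": 2,
    },
    "replica_count": 1,
    "container_spec": {
        "image_uri": "us-docker.pkg.dev/my-project/training/lm-train:latest"  # hypothetical
    },
}

worker_pool_specs = [
    gpu_pool,                          # pool 0: chief (GPUs, training container)
    {**gpu_pool, "replica_count": 3},  # pool 1: workers (GPUs, training container)
    {
        # pool 2: Reduction Server replicas -- no accelerators, high-bandwidth machines.
        "machine_spec": {"machine_type": "n1-highcpu-16"},
        "replica_count": 4,
        "container_spec": {
            "image_uri": "us-docker.pkg.dev/vertex-ai-restricted/training/reductionserver:latest"
        },
    },
]

job = aiplatform.CustomJob(
    display_name="reduction-server-training", worker_pool_specs=worker_pool_specs
)
job.run()
```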

Discussion 0
Questions 74

You are an AI architect at a popular photo-sharing social media platform. Your organization’s content moderation team currently scans images uploaded by users and removes explicit images manually. You want to implement an AI service to automatically prevent users from uploading explicit images. What should you do?

Options:

A.  

Develop a custom TensorFlow model in a Vertex AI Workbench instance. Train the model on a dataset of manually labeled images. Deploy the model to a Vertex AI endpoint. Run periodic batch inference to identify inappropriate uploads and report them to the content moderation team.

B.  

Train an image clustering model using TensorFlow in a Vertex AI Workbench instance. Deploy this model to a Vertex AI endpoint and configure it for online inference. Run this model each time a new image is uploaded to identify and block inappropriate uploads.

C.  

Create a dataset using manually labeled images. Ingest this dataset into AutoML. Train an image classification model and deploy it to a Vertex AI endpoint. Integrate this endpoint with the image upload process to identify and block inappropriate uploads. Monitor predictions and periodically retrain the model.

D.  

Send a copy of every user-uploaded image to a Cloud Storage bucket. Configure a Cloud Run function that triggers the Cloud Vision API to detect explicit content each time a new image is uploaded. Report the classifications to the content moderation team for review.

Discussion 0
Questions 75

You have been asked to productionize a proof-of-concept ML model built using Keras. The model was trained in a Jupyter notebook on a data scientist’s local machine. The notebook contains a cell that performs data validation and a cell that performs model analysis. You need to orchestrate the steps contained in the notebook and automate the execution of these steps for weekly retraining. You expect much more training data in the future. You want your solution to take advantage of managed services while minimizing cost. What should you do?

Options:

A.  

Move the Jupyter notebook to a Notebooks instance on the largest N2 machine type, and schedule the execution of the steps in the Notebooks instance using Cloud Scheduler.

B.  

Write the code as a TensorFlow Extended (TFX) pipeline orchestrated with Vertex AI Pipelines. Use standard TFX components for data validation and model analysis, and use Vertex AI Pipelines for model retraining.

C.  

Rewrite the steps in the Jupyter notebook as an Apache Spark job, and schedule the execution of the job on ephemeral Dataproc clusters using Cloud Scheduler.

D.  

Extract the steps contained in the Jupyter notebook as Python scripts, wrap each script in an Apache Airflow BashOperator, and run the resulting directed acyclic graph (DAG) in Cloud Composer.

Discussion 0
Questions 76

Your company manages an ecommerce website. You developed an ML model that recommends additional products to users in near real time based on items currently in the user's cart. The workflow will include the following processes:

1. The website will send a Pub/Sub message with the relevant data and then receive a message with the prediction from Pub/Sub.

2. Predictions will be stored in BigQuery.

3. The model will be stored in a Cloud Storage bucket and will be updated frequently.

You want to minimize prediction latency and the effort required to update the model. How should you reconfigure the architecture?

Options:

A.  

Write a Cloud Function that loads the model into memory for prediction. Configure the function to be triggered when messages are sent to Pub/Sub.

B.  

Create a pipeline in Vertex AI Pipelines that performs preprocessing, prediction, and postprocessing. Configure the pipeline to be triggered by a Cloud Function when messages are sent to Pub/Sub.

C.  

Expose the model as a Vertex AI endpoint. Write a custom DoFn in a Dataflow job that calls the endpoint for prediction.

D.  

Use the RunInference API with WatchFilePattern in a Dataflow job that wraps around the model and serves predictions.
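
For reference, a minimal sketch of option D's pattern in Apache Beam, assuming the WatchFilePattern utility and the model_metadata_pcoll hook on RunInference behave as shown; the topics, bucket paths, model handler choice, and parsing helpers are hypothetical.

```python
import json

import apache_beam as beam
import tensorflow as tf
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.tensorflow_inference import TFModelHandlerTensor
from apache_beam.ml.inference.utils import WatchFilePattern
from apache_beam.options.pipeline_options import PipelineOptions


def to_tensor(msg: bytes) -> tf.Tensor:
    # Hypothetical: turn the Pub/Sub payload into a model input tensor.
    return tf.constant(json.loads(msg)["features"], dtype=tf.float32)


def to_bytes(result) -> bytes:
    # Hypothetical: serialize the PredictionResult for the outbound topic.
    return repr(result.inference).encode("utf-8")


model_handler = TFModelHandlerTensor(model_uri="gs://my-bucket/models/recommender/current")

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as p:
    # Side input that emits a new model path whenever a file matching the pattern lands,
    # so the model is hot-swapped without redeploying the pipeline.
    model_updates = p | "WatchModelBucket" >> WatchFilePattern(
        file_pattern="gs://my-bucket/models/recommender/*", interval=600
    )
    _ = (
        p
        | "ReadRequests" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/cart-events")
        | "Preprocess" >> beam.Map(to_tensor)
        | "Predict" >> RunInference(model_handler, model_metadata_pcoll=model_updates)
        | "Format" >> beam.Map(to_bytes)
        | "WriteResponses" >> beam.io.WriteToPubSub(topic="projects/my-project/topics/predictions")
    )
```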

Discussion 0
Questions 77

You work at a leading healthcare firm developing state-of-the-art algorithms for various use cases. You have unstructured textual data with custom labels. You need to extract and classify various medical phrases with these labels. What should you do?

Options:

A.  

Use the Healthcare Natural Language API to extract medical entities.

B.  

Use a BERT-based model to fine-tune a medical entity extraction model.

C.  

Use AutoML Entity Extraction to train a medical entity extraction model.

D.  

Use TensorFlow to build a custom medical entity extraction model.

Discussion 0
Questions 78

You work with a team of researchers to develop state-of-the-art algorithms for financial analysis. Your team develops and debugs complex models in TensorFlow. You want to maintain the ease of debugging while also reducing the model training time. How should you set up your training environment?

Options:

A.  

Configure a v3-8 TPU VM. SSH into the VM to train and debug the model.

B.  

Configure a v3-8 TPU node. Use Cloud Shell to SSH into the host VM to train and debug the model.

C.  

Configure an n1-standard-4 VM with 4 NVIDIA P100 GPUs. SSH into the VM and use ParameterServerStrategy to train the model.

D.  

Configure an n1-standard-4 VM with 4 NVIDIA P100 GPUs. SSH into the VM and use MultiWorkerMirroredStrategy to train the model.

Discussion 0
Questions 79

You are creating a social media app where pet owners can post images of their pets. You have one million user uploaded images with hashtags. You want to build a comprehensive system that recommends images to users that are similar in appearance to their own uploaded images.

What should you do?

Options:

A.  

Download a pretrained convolutional neural network, and fine-tune the model to predict hashtags based on the input images. Use the predicted hashtags to make recommendations.

B.  

Retrieve image labels and dominant colors from the input images using the Vision API. Use these properties and the hashtags to make recommendations.

C.  

Use the provided hashtags to create a collaborative filtering algorithm to make recommendations.

D.  

Download a pretrained convolutional neural network, and use the model to generate embeddings of the input images. Measure similarity between embeddings to make recommendations.
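
For reference, a minimal sketch of the embedding-similarity approach in option D: a pretrained CNN is used as a feature extractor, and images are ranked by cosine similarity of their embeddings. The random tensors stand in for real user uploads.

```python
import numpy as np
import tensorflow as tf

# Pretrained backbone without its classification head yields one pooled embedding per image.
backbone = tf.keras.applications.EfficientNetB0(
    include_top=False, pooling="avg", weights="imagenet"
)


def embed(images: tf.Tensor) -> np.ndarray:
    images = tf.keras.applications.efficientnet.preprocess_input(images)
    emb = backbone(images, training=False).numpy()
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)  # L2-normalize


catalog = embed(tf.random.uniform((100, 224, 224, 3), maxval=255))  # prior uploads (stand-in)
query = embed(tf.random.uniform((1, 224, 224, 3), maxval=255))      # a new upload (stand-in)

# On L2-normalized embeddings, cosine similarity reduces to a dot product.
scores = catalog @ query.T
top_k = np.argsort(scores[:, 0])[::-1][:10]
print("Most similar catalog indices:", top_k)
```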

Discussion 0
Questions 80

You are developing a model to predict whether a failure will occur in a critical machine part. You have a dataset consisting of a multivariate time series and labels indicating whether the machine part failed. You recently started experimenting with a few different preprocessing and modeling approaches in a Vertex AI Workbench notebook. You want to log data and track artifacts from each run. How should you set up your experiments?

Options:

A.  

B.  

C.  

D.  

Discussion 0
Questions 81

Your organization wants you to compare various, widely available ML models for Gen AI use cases. The models you plan to compare are also available on Google Cloud. You have received curated internal benchmark datasets from several teams for their specific use cases and tasks. You need to submit a comprehensive report of your recommendations. You want to evaluate the models using the most efficient approach. What should you do?

Options:

A.  

Use Model Garden to deploy the candidate models to Vertex AI endpoints. Use the Gen AI Evaluation Service API to evaluate the performance of each deployed model on the internal benchmark datasets. Report the best models based on the experiments.

B.  

Stream raw data from open-source large language model leaderboards into a BigQuery dataset. Send the data to an internal Looker Studio dashboard. Evaluate the performance of each model by using open-source datasets that are similar to the internal benchmark datasets. Report the best models based on the dashboard metrics.

C.  

Review the model cards in Model Garden to evaluate each model's performance on open-source datasets that are similar to the internal benchmark datasets. Report the best models based on your analysis.

D.  

Download model weights from the respective provider website for each model. Write an inference script to deploy the candidate models to Vertex AI endpoints. Write an evaluation script to compare all deployed models on the internal benchmark datasets by using Vertex AI Experiments. Report the best models based on the experiments.
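
For reference, a minimal sketch of option A's flow, assuming the Gen AI evaluation interface exposed through vertexai.evaluation (EvalTask); the dataset columns, metric names, and candidate model IDs are illustrative assumptions.

```python
import pandas as pd
import vertexai
from vertexai.evaluation import EvalTask
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")

# Curated internal benchmark: prompts plus reference answers (hypothetical).
benchmark = pd.DataFrame(
    {
        "prompt": ["Summarize the attached policy ...", "Classify this support ticket ..."],
        "reference": ["Policy summary ...", "Billing issue"],
    }
)

results = {}
for model_name in ["gemini-1.5-pro", "gemini-1.5-flash"]:  # candidate models (illustrative)
    task = EvalTask(
        dataset=benchmark,
        metrics=["exact_match", "rouge_1"],  # computation-based metrics (illustrative)
        experiment="genai-benchmark",
    )
    results[model_name] = task.evaluate(model=GenerativeModel(model_name)).summary_metrics

print(results)
```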

Discussion 0
Questions 82

You work for a retail company that is using a regression model built with BigQuery ML to predict product sales. This model is being used to serve online predictions. Recently, you developed a new version of the model that uses a different architecture (custom model). Initial analysis revealed that both models are performing as expected. You want to deploy the new version of the model to production and monitor the performance over the next two months. You need to minimize the impact to the existing and future model users. How should you deploy the model?

Options:

A.  

Import the new model to the same Vertex AI Model Registry as a different version of the existing model. Deploy the new model to the same Vertex AI endpoint as the existing model, and use traffic splitting to route 95% of production traffic to the BigQuery ML model and 5% of production traffic to the new model.

B.  

Import the new model to the same Vertex AI Model Registry as the existing model. Deploy the models to one Vertex AI endpoint. Route 95% of production traffic to the BigQuery ML model and 5% of production traffic to the new model.

C.  

Import the new model to the same Vertex AI Model Registry as the existing model. Deploy each model to a separate Vertex AI endpoint.

D.  

Deploy the new model to a separate Vertex AI endpoint. Create a Cloud Run service that routes the prediction requests to the corresponding endpoints based on the input feature values.

Discussion 0
Questions 83

As the lead ML Engineer for your company, you are responsible for building ML models to digitize scanned customer forms. You have developed a TensorFlow model that converts the scanned images into text and stores them in Cloud Storage. You need to use your ML model on the aggregated data collected at the end of each day with minimal manual intervention. What should you do?

Options:

A.  

Use the batch prediction functionality of Al Platform

B.  

Create a serving pipeline in Compute Engine for prediction

C.  

Use Cloud Functions for prediction each time a new data point is ingested

D.  

Deploy the model on Al Platform and create a version of it for online inference.

Discussion 0
Questions 84

You need to train a ControlNet model with Stable Diffusion XL for an image editing use case. You want to train this model as quickly as possible. Which hardware configuration should you choose to train your model?

Options:

A.  

Configure one a2-highgpu-1g instance with an NVIDIA A100 GPU with 80 GB of RAM. Use float32 precision during model training.

B.  

Configure one a2-highgpu-1g instance with an NVIDIA A100 GPU with 80 GB of RAM. Use bfloat16 quantization during model training.

C.  

Configure four n1-standard-16 instances, each with one NVIDIA Tesla T4 GPU with 16 GB of RAM. Use float32 precision during model training.

D.  

Configure four n1-standard-16 instances, each with one NVIDIA Tesla T4 GPU with 16 GB of RAM. Use float16 quantization during model training.
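
For reference, a minimal sketch of what switching to bfloat16 mixed precision looks like in Keras; it illustrates the precision choice in the options rather than a full ControlNet training setup, and the model here is a placeholder.

```python
import tensorflow as tf

# Compute in bfloat16 while keeping variables in float32 for numerical stability.
tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, 3, activation="relu", input_shape=(256, 256, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    # Keep the output layer in float32 so the loss is computed at full precision.
    tf.keras.layers.Dense(10, dtype="float32"),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
```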

Discussion 0
Questions 85

You have deployed a model on Vertex AI for real-time inference. During an online prediction request, you get an “Out of Memory” error. What should you do?

Options:

A.  

Use batch prediction mode instead of online mode.

B.  

Send the request again with a smaller batch of instances.

C.  

Use base64 to encode your data before using it for prediction.

D.  

Apply for a quota increase for the number of prediction requests.

Discussion 0
Questions 86

You work on a growing team of more than 50 data scientists who all use AI Platform. You are designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?

Options:

A.  

Set up restrictive IAM permissions on the AI Platform notebooks so that only a single user or group can access a given instance.

B.  

Separate each data scientist's work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.

C.  

Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources.

D.  

Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about AI Platform resource usage. In BigQuery, create a SQL view that maps users to the resources they are using.

Discussion 0
Questions 87

You have developed an application that uses a chain of multiple scikit-learn models to predict the optimal price for your company's products. The workflow logic is shown in the diagram. Members of your team use the individual models in other solution workflows. You want to deploy this workflow while ensuring version control for each individual model and the overall workflow. Your application needs to be able to scale down to zero. You want to minimize the compute resource utilization and the manual effort required to manage this solution. What should you do?

Options:

A.  

Expose each individual model as an endpoint in Vertex Al Endpoints. Create a custom container endpoint to orchestrate the workflow.

B.  

Create a custom container endpoint for the workflow that loads each model's individual files. Track the versions of each individual model in BigQuery.

C.  

Expose each individual model as an endpoint in Vertex Al Endpoints. Use Cloud Run to orchestrate the workflow.

D.  

Load each model's individual files into Cloud Run. Use Cloud Run to orchestrate the workflow. Track the versions of each individual model in BigQuery.

Discussion 0
Questions 88

You are investigating the root cause of a misclassification error made by one of your models. You used Vertex AI Pipelines to train and deploy the model. The pipeline reads data from BigQuery, creates a copy of the data in Cloud Storage in TFRecord format, trains the model in Vertex AI Training on that copy, and deploys the model to a Vertex AI endpoint. You have identified the specific version of the model that misclassified, and you need to recover the data this model was trained on. How should you find that copy of the data?

Options:

A.  

Use Vertex AI Feature Store. Modify the pipeline to use the feature store, and ensure that all training data is stored in it. Search the feature store for the data used for the training.

B.  

Use the lineage feature of Vertex AI Metadata to find the model artifact. Determine the version of the model, identify the step that creates the data copy, and search in the metadata for its location.

C.  

Use the logging features in the Vertex AI endpoint to determine the timestamp of the model's deployment. Find the pipeline run at that timestamp. Identify the step that creates the data copy, and search in the logs for its location.

D.  

Find the job ID in Vertex AI Training corresponding to the training for the model. Search in the logs of that job for the data used for the training.

Discussion 0