
Designing and Implementing a Data Science Solution on Azure Questions and Answers


Last Update May 1, 2024
Total Questions : 407

Questions 1

You need to select a pre-built development environment for a series of data science experiments. You must use the R language for the experiments.

Which three environments can you use? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

Options:

A.  

ML.NET Library on a local environment

B.  

Azure Machine Learning Studio

C.  

Data Science Virtual Machine (DSVM)

D.  

Azure Databricks

E.  

Azure Cognitive Services

Questions 2

You create a new Azure subscription. No resources are provisioned in the subscription.

You need to create an Azure Machine Learning workspace.

What are three possible ways to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

A.  

Run Python code that uses the Azure ML SDK library and calls the Workspace.create method with name, subscription_id, resource_group, and location parameters.

B.  

Use an Azure Resource Management template that includes a Microsoft.MachineLearningServices/workspaces resource and its dependencies.

C.  

Use the Azure Command Line Interface (CLI) with the Azure Machine Learning extension to call the az group create function with --name and --location parameters, and then the az ml workspace create function, specifying -w and -g parameters for the workspace name and resource group.

D.  

Navigate to Azure Machine Learning studio and create a workspace.

E.  

Run Python code that uses the Azure ML SDK library and calls the Workspace.get method with name, subscription_id, and resource_group parameters.

Questions 3

You need to implement a new cost factor scenario for the ad response models as illustrated in the performance curve exhibit.

Which technique should you use?

Options:

A.  

Set the threshold to 0.5 and retrain if weighted Kappa deviates +/- 5% from 0.45.

B.  

Set the threshold to 0.05 and retrain if weighted Kappa deviates +/- 5% from 0.5.

C.  

Set the threshold to 0.2 and retrain if weighted Kappa deviates +/- 5% from 0.6.

D.  

Set the threshold to 0.75 and retrain if weighted Kappa deviates +/- 5% from 0.15.

Questions 4

You need to implement a model development strategy to determine a user’s tendency to respond to an ad.

Which technique should you use?

Options:

A.  

Use a Relative Expression Split module to partition the data based on centroid distance.

B.  

Use a Relative Expression Split module to partition the data based on distance travelled to the event.

C.  

Use a Split Rows module to partition the data based on distance travelled to the event.

D.  

Use a Split Rows module to partition the data based on centroid distance.

Questions 5

You are creating data wrangling and model training solutions in an Azure Machine Learning workspace.

You must use the same Python notebook to perform both data wrangling and model training.

You need to use the Azure Machine Learning Python SDK v2 to define and configure the Synapse Spark pool asynchronously in the workspace as dedicated compute.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Questions 6

You need to implement a feature engineering strategy for the crowd sentiment local models.

What should you do?

Options:

A.  

Apply an analysis of variance (ANOVA).

B.  

Apply a Pearson correlation coefficient.

C.  

Apply a Spearman correlation coefficient.

D.  

Apply a linear discriminant analysis.
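
To make the difference between two of the options above concrete, here is a minimal standard-library sketch that compares the Pearson and Spearman coefficients on a monotonic but non-linear relationship. The data and helper names are invented for illustration, and the rank helper assumes there are no tied values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: strength of the linear relationship."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Ranks from 1..n (assumes no ties, to keep the sketch short)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical feature and target with a monotonic, non-linear relationship.
xs = [1, 2, 3, 4, 5, 6]
ys = [x ** 3 for x in xs]

r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)
print(f"pearson={r_pearson:.3f} spearman={r_spearman:.3f}")
```

Spearman works on ranks, so it rewards any monotonic relationship, while Pearson only rewards a linear one.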

Questions 7

You must use the Azure Machine Learning SDK to interact with data and experiments in the workspace.

You need to configure the config.json file to connect to the workspace from the Python environment.

Which two additional parameters must you add to the config.json file in order to connect to the workspace? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.  

subscription_id

B.  

Key

C.  

resource_group

D.  

region

E.  

Login
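
For orientation, the following is a minimal sketch of writing and checking a workspace config.json using only Python's standard library. The three keys are the ones the Azure ML SDK reads from this file; every value is a placeholder, and the temporary directory exists only to keep the example self-contained.

```python
import json
import tempfile
from pathlib import Path

# Placeholder values; substitute your own subscription, resource group, and workspace.
config = {
    "subscription_id": "00000000-0000-0000-0000-000000000000",
    "resource_group": "example-rg",
    "workspace_name": "example-workspace",
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text(json.dumps(config, indent=2))
    loaded = json.loads(path.read_text())

# Check that nothing the SDK needs is missing before trying to connect.
required = {"subscription_id", "resource_group", "workspace_name"}
missing = required - set(loaded)
print("missing keys:", sorted(missing))
```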

Questions 8

You are working with a time series dataset in Azure Machine Learning Studio.

You need to split your dataset into training and testing subsets by using the Split Data module.

Which splitting mode should you use?

Options:

A.  

Regular Expression Split

B.  

Split Rows with the Randomized split parameter set to true

C.  

Relative Expression Split

D.  

Recommender Split
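
The key constraint with time series data is that the test set must come after the training set in time, so a random shuffle is unsuitable. A rough plain-Python sketch of a split on a date cutoff follows; the rows, the cutoff, and the function name are all hypothetical, and this is not the Split Data module itself.

```python
from datetime import date

# Hypothetical daily observations: (date, value).
rows = [(date(2024, 1, day), float(day)) for day in range(1, 11)]

def split_by_date(rows, cutoff):
    """Rows dated before the cutoff go to training, the rest to testing."""
    train = [r for r in rows if r[0] < cutoff]
    test = [r for r in rows if r[0] >= cutoff]
    return train, test

train, test = split_by_date(rows, date(2024, 1, 8))
print(len(train), len(test))
```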

Questions 9

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You create an Azure Machine Learning service datastore in a workspace. The datastore contains the following files:

• /data/2018/Q1.csv

• /data/2018/Q2.csv

• /data/2018/Q3.csv

• /data/2018/Q4.csv

• /data/2019/Q1.csv

All files store data in the following format:

id,f1,f2,l

1,1,2,0

2,1,1,1

3,2,1,0

You run the following code:

You need to create a dataset named training_data and load the data from all files into a single data frame by using the following code:

Solution: Run the following code:

Does the solution meet the goal?

Options:

A.  

Yes

B.  

No

Questions 10

You manage an Azure Machine Learning workspace.

You must log multiple metrics by using MLflow.

You need to maximize logging performance.

What are two possible ways to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

A.  

MlflowClient.log_batch

B.  

mlflow.log_metrics

C.  

mlflow.log_param

D.  

mlflow.log_metric

Questions 11

You manage an Azure Machine Learning workspace named workspace1 by using the Python SDK v2. You create a General Purpose v2 Azure storage account named mlstorage1. The storage account includes a publicly accessible container named mlcontainer1. The container stores 10 blobs with files in the CSV format.

You must develop Python SDK v2 code to create a data asset referencing all blobs in the container named mlcontainer1.

You need to complete the Python SDK v2 code.

How should you complete the code? To answer select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Questions 12

You have a Python data frame named salesData in the following format:

The data frame must be unpivoted to a long data format as follows:

You need to use the pandas.melt() function in Python to perform the transformation.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:
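
The original before and after tables are not reproduced here, so the following is only a sketch of how pandas.melt() unpivots a wide frame into long format. The column names (shop, 2017, 2018, year, value) and all numbers are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical wide-format frame: one row per shop, one column per year.
salesData = pd.DataFrame(
    {"shop": [0, 1, 2], "2017": [34, 65, 48], "2018": [25, 76, 55]}
)

# id_vars stays fixed per row; value_vars are the columns to unpivot.
long_format = pd.melt(
    salesData,
    id_vars="shop",
    value_vars=["2017", "2018"],
    var_name="year",
    value_name="value",
)
print(long_format)
```

Each of the three shops contributes one row per year column, so the long frame has six rows.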

Questions 13

You create an Azure Machine Learning workspace named workspace1. The workspace contains a Python SDK v2 notebook that uses MLflow to collect model training metrics and artifacts from your local computer.

You must reuse the notebook to run on an Azure Machine Learning compute instance in workspace1.

You need to continue to log training metrics and artifacts from your data science code.

What should you do?

Options:

A.  

Configure the tracking URI.

B.  

Instantiate the MLClient class.

C.  

Log in to workspace1.

D.  

Instantiate the job class.

Questions 14

You have a Jupyter Notebook that contains Python code that is used to train a model.

You must create a Python script for the production deployment. The solution must minimize code maintenance.

Which two actions should you perform? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.  

Refactor the Jupyter Notebook code into functions

B.  

Save each function to a separate Python file

C.  

Define a main() function in the Python script

D.  

Remove all comments and functions from the Python script
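
As an illustration of the notebook-to-script refactoring this question is about, here is a small sketch in which the notebook cells have become named functions called from a main() entry point. The data and the least-squares fit are invented stand-ins for real loading and training code.

```python
def load_data():
    """Stand-in for reading the training data; returns (x, y) pairs."""
    return [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

def train_model(data):
    """Fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return slope, intercept

def main():
    """Single entry point that wires the steps together."""
    data = load_data()
    slope, intercept = train_model(data)
    print(f"slope={slope:.3f} intercept={intercept:.3f}")
    return slope, intercept

if __name__ == "__main__":
    main()
```

Keeping each step in its own function makes the steps individually testable, and the main() guard lets the file be imported without running the training.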

Questions 15

You have a dataset that contains records of patients tested for diabetes. The dataset includes the patient's age.

You plan to create an analysis that will report the mean age value from the differentially private data derived from the dataset.

You need to identify the epsilon value to use in the analysis that minimizes the risk of exposing the actual data.

Which epsilon value should you use?

Options:

A.  

-1.5

B.  

-0.5

C.  

0.5

D.  

1.5
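
To show how epsilon controls the amount of noise, here is a hedged sketch of a differentially private mean using the Laplace mechanism, one common approach and not necessarily the one an Azure library applies. The ages, the clipping bounds, and the function names are hypothetical. Epsilon must be positive, and the noise scale is the query sensitivity divided by epsilon.

```python
import random

def laplace_scale(sensitivity, epsilon):
    """Noise scale of the Laplace mechanism; grows as epsilon shrinks."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon

def private_mean(values, lower, upper, epsilon, rng):
    """Mean of values clipped to [lower, upper], plus Laplace noise."""
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    # Changing one record moves the clipped mean by at most (upper - lower) / n.
    scale = laplace_scale((upper - lower) / len(clipped), epsilon)
    # The difference of two exponential draws with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_mean + noise

ages = [23, 35, 41, 52, 60, 67, 38, 49]  # hypothetical patient ages
rng = random.Random(0)
for eps in (0.5, 1.5):
    print(eps, round(private_mean(ages, 0, 100, eps, rng), 2))
```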

Questions 16

You have an Azure Machine Learning (ML) model deployed to an online endpoint.

You need to review container logs from the endpoint by using Azure ML Python SDK v2. The logs must include the console log from the inference server with print/log statements from the model's scoring script.

What should you do first?

Options:

A.  

Create an instance of the MLClient class.

B.  

Create an instance of the OnlineDeploymentOperations class.

C.  

Connect by using SSH to the inference server.

D.  

Connect by using Docker tools to the inference server.

Questions 17

You use Azure Machine Learning Studio to build a machine learning experiment.

You need to divide data into two distinct datasets.

Which module should you use?

Options:

A.  

Partition and Sample

B.  

Assign Data to Clusters

C.  

Group Data into Bins

D.  

Test Hypothesis Using t-Test

Questions 18

You deploy a real-time inference service for a trained model.

The deployed model supports a business-critical application, and it is important to be able to monitor the data submitted to the web service and the predictions the data generates.

You need to implement a monitoring solution for the deployed model using minimal administrative effort.

What should you do?

Options:

A.  

View the explanations for the registered model in Azure ML studio.

B.  

Enable Azure Application Insights for the service endpoint and view logged data in the Azure portal.

C.  

Create an MLflow tracking URI that references the endpoint, and view the data logged by MLflow.

D.  

View the log files generated by the experiment used to train the model.

Questions 19

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You plan to use a Python script to run an Azure Machine Learning experiment. The script creates a reference to the experiment run context, loads data from a file, identifies the set of unique values for the label column, and completes the experiment run:

from azureml.core import Run

import pandas as pd

run = Run.get_context()

data = pd.read_csv('data.csv')

label_vals = data['label'].unique()

# Add code to record metrics here

run.complete()

The experiment must record the unique labels in the data as metrics for the run that can be reviewed later.

You must add code to the script to record the unique label values as run metrics at the point indicated by the comment.

Solution: Replace the comment with the following code:

for label_val in label_vals:
    run.log('Label Values', label_val)

Does the solution meet the goal?

Options:

A.  

Yes

B.  

No

Questions 20

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are creating a new experiment in Azure Machine Learning Studio.

One class has a much smaller number of observations than the other classes in the training set.

You need to select an appropriate data sampling strategy to compensate for the class imbalance.

Solution: You use the Principal Components Analysis (PCA) sampling mode.

Does the solution meet the goal?

Options:

A.  

Yes

B.  

No
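
For context on what a sampling strategy for class imbalance does, here is a deliberately simple sketch of random oversampling, which duplicates minority-class rows until the classes are the same size. It is not SMOTE (which synthesizes new rows) and it is not a Studio module. The rows and function name are made up.

```python
import random
from collections import Counter

def oversample_minority(rows, label_index, rng):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_index], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced

# Hypothetical training set: 8 rows of class 0 and 2 rows of class 1.
rows = [(i, 0) for i in range(8)] + [(100, 1), (101, 1)]
balanced = oversample_minority(rows, label_index=1, rng=random.Random(0))
counts = Counter(label for _, label in balanced)
print(counts)
```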

Questions 21

You use Azure Machine Learning to implement hyperparameter tuning with a Bandit early termination policy.

The policy uses a slack_factor set to 0.1, an evaluation interval set to 1, and an evaluation delay set to b.

You need to evaluate the outcome of the early termination policy.

What should you evaluate? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:
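
As a rough illustration of how a Bandit policy's slack factor translates into a cutoff, the sketch below assumes the primary metric is being maximized, in which case a run is cancelled when its best metric falls below the best run's metric divided by (1 + slack_factor). The metric values and function names are hypothetical, and the delay and interval settings are left out.

```python
def bandit_cutoff(best_metric, slack_factor):
    """Lowest metric a run may report (when maximizing) without being cancelled."""
    return best_metric / (1 + slack_factor)

def falls_outside_slack(run_metric, best_metric, slack_factor):
    """True when the run is worse than the allowed slack from the best run."""
    return run_metric < bandit_cutoff(best_metric, slack_factor)

best = 0.8  # hypothetical best primary metric at the current interval
cutoff = bandit_cutoff(best, slack_factor=0.1)
print(round(cutoff, 4))
for metric in (0.70, 0.75):
    print(metric, falls_outside_slack(metric, best, 0.1))
```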

Questions 22

You load data from a notebook in an Azure Machine Learning workspace into a pandas data frame. The data contains 10,000 records. Each record consists of 10 columns.

You must identify the number of missing values in each of the columns.

You need to complete the Python code that will return the number of missing values in each of the columns.

Which code segments should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:
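
One common way to count missing values per column in pandas is to chain isnull() with sum(). The small frame below is invented for illustration and is much smaller than the 10,000-row data set described in the question.

```python
import pandas as pd

# Hypothetical frame with a few missing entries.
df = pd.DataFrame(
    {
        "age": [34, None, 51],
        "city": ["Oslo", "Lima", None],
        "score": [0.7, 0.9, 0.4],
    }
)

# isnull() marks each cell True or False; sum() then counts the True values per column.
missing_per_column = df.isnull().sum()
print(missing_per_column)
```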

Questions 23

You create an Azure Machine Learning workspace and a new Azure DevOps organization. You register a model in the workspace and deploy the model to the target environment.

All new versions of the model registered in the workspace must automatically be deployed to the target environment.

You need to configure Azure Pipelines to deploy the model.

Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Questions 24

You have an Azure Machine Learning workspace. You are connecting an Azure Data Lake Storage Gen2 account to the workspace as a data store. You need to authorize access from the workspace to the Azure Data Lake Storage Gen2 account.

What should you use?

Options:

A.  

Managed identity

B.  

SAS token

C.  

Service principal

D.  

Account key

Questions 25

You use the Azure Machine Learning SDK v2 for Python and notebooks to train a model. You use Python code to create a compute target, an environment, and a training script. You need to prepare information to submit a training job.

Which class should you use?

Options:

A.  

MLClient

B.  

command

C.  

BuildContext

D.  

EndpointConnection

Questions 26

You need to visually identify whether outliers exist in the Age column and quantify the outliers before the outliers are removed.

Which three Azure Machine Learning Studio modules should you use in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Options:
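
Outside the Studio modules, the "quantify the outliers" step can be sketched with the 1.5 x IQR rule, which is one common convention and not necessarily the one the exam scenario intends. The ages below are made-up values.

```python
import statistics

# Hypothetical Age column with one implausible value.
ages = [22, 25, 27, 29, 31, 33, 35, 38, 41, 120]

# Quartiles, then the usual 1.5 * IQR fences around them.
q1, _, q3 = statistics.quantiles(ages, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [a for a in ages if a < lower_fence or a > upper_fence]
print("outliers:", outliers)
```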

Questions 27

You use Azure Machine Learning to train a machine learning model.

You use the following training script in Python to perform logging:

You must use a Python script to define a sweep job.

You need to provide the primary metric and goal you want hyperparameter tuning to optimize.

NOTE: Each correct selection is worth one point.

Options:

Questions 28

You need to set up the Permutation Feature Importance module according to the model training requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Questions 29

You need to configure the Edit Metadata module so that the structure of the datasets match.

Which configuration options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Questions 30

You need to configure the Permutation Feature Importance module for the model training requirements.

What should you do? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Questions 31

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Questions 32

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have a Python script named train.py in a local folder named scripts. The script trains a regression model by using scikit-learn. The script includes code to load a training data file which is also located in the scripts folder.

You must run the script as an Azure ML experiment on a compute cluster named aml-compute.

You need to configure the run to ensure that the environment includes the required packages for model training. You have instantiated a variable named aml-compute that references the target compute cluster.

Solution: Run the following code:

Does the solution meet the goal?

Options:

A.  

Yes

B.  

No

Questions 33

You have the following Azure subscriptions and Azure Machine Learning service workspaces:

You need to obtain a reference to the ml-project workspace.

Solution: Run the following Python code:

Does the solution meet the goal?

Options:

A.  

Yes

B.  

No

Questions 34

You use the following code to run a script as an experiment in Azure Machine Learning:

You must identify the output files that are generated by the experiment run.

You need to add code to retrieve the output file names.

Which code segment should you add to the script?

Options:

A.  

files = run.get_properties()

B.  

files= run.get_file_names()

C.  

files = run.get_details_with_logs()

D.  

files = run.get_metrics()

E.  

files = run.get_details()

Questions 35

You create an Azure Machine Learning compute resource to train models. The compute resource is configured as follows:

  • Minimum nodes: 2
  • Maximum nodes: 4

You must decrease the minimum number of nodes and increase the maximum number of nodes to the following values:

  • Minimum nodes: 0
  • Maximum nodes: 8

You need to reconfigure the compute resource.

What are three possible ways to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

A.  

Azure Machine Learning designer

B.  

Azure CLI ml extension v2

C.  

Azure Machine Learning studio

D.  

BuildContext class in Python SDK v2

E.  

MLClient class in Python SDK v2

Questions 36

You need to implement early stopping criteria as stated in the model training requirements.

Which three code segments should you use to develop the solution? To answer, move the appropriate code segments from the list of code segments to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Options:

Questions 37

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Options:

Questions 38

You need to replace the missing data in the AccessibilityToHighway columns.

How should you configure the Clean Missing Data module? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Questions 39

You need to produce a visualization for the diagnostic test evaluation according to the data visualization requirements.

Which three modules should you recommend be used in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Options:

Questions 40

You need to select a feature extraction method.

Which method should you use?

Options:

A.  

Spearman correlation

B.  

Mutual information

C.  

Mann-Whitney test

D.  

Pearson’s correlation

Questions 41

You need to select a feature extraction method.

Which method should you use?

Options:

A.  

Mutual information

B.  

Mood’s median test

C.  

Kendall correlation

D.  

Permutation Feature Importance

Questions 42

You need to correct the model fit issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Questions 43

You need to configure the Filter Based Feature Selection module based on the experiment requirements and datasets.

How should you configure the module properties? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Questions 44

You need to select an environment that will meet the business and data requirements.

Which environment should you use?

Options:

A.  

Azure HDInsight with Spark MLlib

B.  

Azure Cognitive Services

C.  

Azure Machine Learning Studio

D.  

Microsoft Machine Learning Server

Questions 45

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Questions 46

You need to use the Python language to build a sampling strategy for the global penalty detection models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Questions 47

You need to implement a scaling strategy for the local penalty detection data.

Which normalization type should you use?

Options:

A.  

Streaming

B.  

Weight

C.  

Batch

D.  

Cosine

Questions 48

You need to define a modeling strategy for ad response.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Questions 49

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Questions 50

You need to resolve the local machine learning pipeline performance issue. What should you do?

Options:

A.  

Increase Graphics Processing Units (GPUs).

B.  

Increase the learning rate.

C.  

Increase the training iterations.

D.  

Increase Central Processing Units (CPUs).

Questions 51

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Questions 52

You need to modify the inputs for the global penalty event model to address the bias and variance issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Questions 53

You need to build a feature extraction strategy for the local models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Questions 54

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:
