
Data Engineering on Microsoft Azure Questions and Answers

Data Engineering on Microsoft Azure

Last Update: Apr 24, 2024
Total Questions: 316

We are offering free DP-203 Microsoft exam questions. Just sign up and provide your details to prepare with the free DP-203 exam questions, then move on to the complete pool of Data Engineering on Microsoft Azure test questions for full coverage.

DP-203 PDF: $38.5 (regular price $109.99)

DP-203 Testing Engine: $45.5 (regular price $129.99)

DP-203 PDF + Testing Engine: $59.5 (regular price $169.99)
Question 1

You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1. Table1 contains the following:

  • One billion rows
  • A clustered columnstore index
  • A hash-distributed column named Product Key
  • A column named Sales Date that is of the date data type and cannot be null

Thirty million rows will be added to Table1 each month.

You need to partition Table1 based on the Sales Date column. The solution must optimize query performance and data loading.

How often should you create a partition?

Options:

A. once per month

B. once per year

C. once per day

D. once per week

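For context on the syntax this question turns on, here is a minimal sketch of creating a date-partitioned, hash-distributed table in a dedicated SQL pool from Python. The connection details and the yearly partition boundaries are illustrative assumptions, not the question's answer.

import pyodbc

# Illustrative DDL: a hash-distributed, clustered-columnstore table partitioned
# on a date column with RANGE RIGHT boundaries; one boundary value per period.
DDL = """
CREATE TABLE dbo.Table1
(
    ProductKey INT NOT NULL,
    SalesDate  DATE NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(ProductKey),
    CLUSTERED COLUMNSTORE INDEX,
    PARTITION (SalesDate RANGE RIGHT FOR VALUES ('2023-01-01', '2024-01-01', '2025-01-01'))
);
"""

# Hypothetical connection details; any pyodbc-compatible Synapse connection works.
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.sql.azuresynapse.net;"
    "Database=SQLPool1;UID=sqladmin;PWD=<password>"
)
conn.autocommit = True
conn.cursor().execute(DDL)
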
Question 2

You have an Azure subscription linked to an Azure Active Directory (Azure AD) tenant that contains a service principal named ServicePrincipal1. The subscription contains an Azure Data Lake Storage account named adls1. Adls1 contains a folder named Folder2 that has a URI of https://adls1.dfs.core.windows.net/container1/Folder1/Folder2/.

ServicePrincipal1 has the access control list (ACL) permissions shown in the following table.

You need to ensure that ServicePrincipal1 can perform the following actions:

  • Traverse child items that are created in Folder2.
  • Read files that are created in Folder2.

The solution must use the principle of least privilege.

Which two permissions should you grant to ServicePrincipal1 for Folder2? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A. Access - Read

B. Access - Write

C. Access - Execute

D. Default - Read

E. Default - Write

F. Default - Execute

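As a rough illustration of how Access and Default ACL entries are applied to an ADLS Gen2 directory, here is a sketch using the azure-storage-file-datalake SDK. The service principal object ID and the r-x permission string are placeholders; the question's permissions table (not shown) determines what is actually required.

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://adls1.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
directory = service.get_file_system_client("container1").get_directory_client("Folder1/Folder2")

sp_object_id = "00000000-0000-0000-0000-000000000000"  # placeholder object ID

# An entry without a prefix is an Access ACL (applies to Folder2 itself);
# a "default:" entry is inherited by items created inside Folder2 later.
# Note: set_access_control replaces the whole ACL, so real code would first
# read and merge the existing entries.
directory.set_access_control(
    acl=f"user:{sp_object_id}:r-x,default:user:{sp_object_id}:r-x"
)
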
Question 3

You have an Azure Data Factory pipeline shown in the following exhibit.

The execution log for the first pipeline run is shown in the following exhibit.

The execution log for the second pipeline run is shown in the following exhibit.

For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.

Options:

Question 4

You configure monitoring for a Microsoft Azure SQL Data Warehouse implementation. The implementation uses PolyBase to load data from comma-separated value (CSV) files stored in Azure Data Lake Gen 2 using an external table.

Files with an invalid schema cause errors to occur.

You need to monitor for an invalid schema error.

For which error should you monitor?

Options:

A. EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [com.microsoft.polybase.client.KerberosSecureLogin] occurred while accessing external files.'

B. EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [No FileSystem for scheme: wasbs] occurred while accessing external file.'

C. Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11" for linked server "(null)". Query aborted - the maximum reject threshold (0 rows) was reached while reading from an external source: 1 rows rejected out of total 1 rows processed.

D. EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect: Error [Unable to instantiate LoginClass] occurred while accessing external files.'

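For reference, the PolyBase pattern the question describes looks roughly like the following sketch; the object names and the pre-existing data source and file format are assumptions. With REJECT_TYPE = VALUE and REJECT_VALUE = 0, a single schema-invalid row aborts the query with the reject-threshold error.

# Illustrative only: external table over CSV files in Data Lake Gen2.
EXTERNAL_TABLE_DDL = """
CREATE EXTERNAL TABLE dbo.SalesExternal
(
    SaleId   INT,
    SaleDate DATE,
    Amount   DECIMAL(10, 2)
)
WITH
(
    LOCATION = '/sales/',            -- folder in the Data Lake filesystem
    DATA_SOURCE = MyDataLakeSource,  -- assumed to exist
    FILE_FORMAT = CsvFileFormat,     -- assumed CSV file format object
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 0                 -- fail as soon as one row is rejected
);
"""
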
Question 5

You have an Azure Databricks workspace named workspace1 in the Standard pricing tier.

You need to configure workspace1 to support autoscaling all-purpose clusters. The solution must meet the following requirements:

  • Automatically scale down workers when the cluster is underutilized for three minutes.
  • Minimize the time it takes to scale to the maximum number of workers.
  • Minimize costs.

What should you do first?

Options:

A. Enable container services for workspace1.

B. Upgrade workspace1 to the Premium pricing tier.

C. Set Cluster Mode to High Concurrency.

D. Create a cluster policy in workspace1.

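For orientation, this is roughly what an autoscaling all-purpose cluster request to the Databricks Clusters API (2.0) looks like; the workspace URL, token, runtime version, node type, and worker counts are placeholders. The optimized autoscaling behavior the requirements describe (scaling down after a short idle window) is tied to the Premium plan.

import requests

cluster_spec = {
    "cluster_name": "autoscale-demo",
    "spark_version": "13.3.x-scala2.12",   # placeholder runtime version
    "node_type_id": "Standard_DS3_v2",     # placeholder VM size
    "autoscale": {"min_workers": 2, "max_workers": 8},
}

resp = requests.post(
    "https://adb-1234567890123456.7.azuredatabricks.net/api/2.0/clusters/create",
    headers={"Authorization": "Bearer <personal-access-token>"},  # hypothetical token
    json=cluster_spec,
)
print(resp.json())  # returns the new cluster_id on success
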
Question 6

A company uses Azure Stream Analytics to monitor devices.

The company plans to double the number of devices that are monitored.

You need to monitor a Stream Analytics job to ensure that there are enough processing resources to handle the additional load.

Which metric should you monitor?

Options:

A. Early Input Events

B. Late Input Events

C. Watermark delay

D. Input Deserialization Errors

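For monitoring context, a sketch of reading a Stream Analytics job metric with the azure-monitor-query package; the resource ID is a placeholder, and the metric ID used here is an assumption to verify against the metric list the portal shows for the job.

from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient

client = MetricsQueryClient(DefaultAzureCredential())

job_resource_id = (
    "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/"
    "Microsoft.StreamAnalytics/streamingjobs/<job-name>"  # placeholder
)

# Metric ID is an assumption; the portal labels it "Watermark delay".
response = client.query_resource(
    job_resource_id,
    metric_names=["OutputWatermarkDelaySeconds"],
    timespan=timedelta(hours=1),
)
for metric in response.metrics:
    for series in metric.timeseries:
        for point in series.data:
            print(point.timestamp, point.average)
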
Question 7

You have two Azure Blob Storage accounts named account1 and account2.

You plan to create an Azure Data Factory pipeline that will use scheduled intervals to replicate newly created or modified blobs from account1 to account2.

You need to recommend a solution to implement the pipeline. The solution must meet the following requirements:

  • Ensure that the pipeline only copies blobs that were created or modified since the most recent replication event.
  • Minimize the effort to create the pipeline.

What should you recommend?

Options:

A. Create a pipeline that contains a flowlet.

B. Create a pipeline that contains a Data Flow activity.

C. Run the Copy Data tool and select Metadata-driven copy task.

D. Run the Copy Data tool and select Built-in copy task.

Question 8

A company plans to use Apache Spark analytics to analyze intrusion detection data.

You need to recommend a solution to analyze network and system activity data for malicious activities and policy violations. The solution must minimize administrative efforts.

What should you recommend?

Options:

A. Azure Data Lake Storage

B. Azure Databricks

C. Azure HDInsight

D. Azure Data Factory

Question 9

You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1.

You have files that are ingested and loaded into an Azure Data Lake Storage Gen2 container named container1.

You plan to insert data from the files into Table1 and transform the data. Each row of data in the files will produce one row in the serving layer of Table1.

You need to ensure that when the source data files are loaded to container1, the DateTime is stored as an additional column in Table1.

Solution: In an Azure Synapse Analytics pipeline, you use a Get Metadata activity that retrieves the DateTime of the files.

Does this meet the goal?

Options:

A. Yes

B. No

Question 10

You have an Azure subscription that contains an Azure Data Lake Storage Gen2 account named storage1. Storage1 contains a container named container1. Container1 contains a directory named directory1. Directory1 contains a file named file1.

You have an Azure Active Directory (Azure AD) user named User1 that is assigned the Storage Blob Data Reader role for storage1.

You need to ensure that User1 can append data to file1. The solution must use the principle of least privilege.

Which permissions should you grant? To answer, drag the appropriate permissions to the correct resources. Each permission may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

Options:

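As background on what an append actually requires at the data plane, here is a sketch with azure-storage-file-datalake; the account, container, and path come from the question, while the credential is a placeholder. Appending needs write permission on the file plus execute on every parent folder in the path.

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeFileClient

file_client = DataLakeFileClient(
    account_url="https://storage1.dfs.core.windows.net",
    file_system_name="container1",
    file_path="directory1/file1",
    credential=DefaultAzureCredential(),
)

# Append at the current end of the file, then flush to commit.
offset = file_client.get_file_properties().size
data = b"appended row\n"
file_client.append_data(data, offset=offset, length=len(data))
file_client.flush_data(offset + len(data))
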
Question 11

You have an Azure subscription that contains an Azure Synapse Analytics workspace and a user named User1.

You need to ensure that User1 can review the Azure Synapse Analytics database templates from the gallery. The solution must follow the principle of least privilege.

Which role should you assign to User1?

Options:

A. Synapse User

B. Synapse Contributor

C. Storage Blob Data Contributor

D. Synapse Administrator

Question 12

You are designing a solution that will use tables in Delta Lake on Azure Databricks.

You need to minimize how long it takes to perform the following:

  • Queries against non-partitioned tables
  • Joins on non-partitioned columns

Which two options should you include in the solution? Each correct answer presents part of the solution.


Options:

A. Z-Ordering

B. Apache Spark caching

C. dynamic file pruning (DFP)

D. the clone command

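For context on two of the listed options, a minimal sketch as it would run in a Databricks notebook (where spark is predefined); the table and column names are placeholders.

# Z-Ordering: OPTIMIZE rewrites a Delta table's files, and ZORDER BY co-locates
# rows with similar values of the listed columns so data skipping can prune reads.
spark.sql("OPTIMIZE sales ZORDER BY (customer_id)")

# Apache Spark caching: keeps the table's data in executor memory after the
# first read, speeding up repeated queries and joins.
spark.table("sales").cache()
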
Question 13

You have an enterprise data warehouse in Azure Synapse Analytics named DW1 on a server named Server1.

You need to determine the size of the transaction log file for each distribution of DW1.

What should you do?

Options:

A. On DW1, execute a query against the sys.database_files dynamic management view.

B. From Azure Monitor in the Azure portal, execute a query against the logs of DW1.

C. Execute a query against the logs of DW1 by using the Get-AzOperationalInsightsSearchResult PowerShell cmdlet.

D. On the master database, execute a query against the sys.dm_pdw_nodes_os_performance_counters dynamic management view.

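For reference, the sys.dm_pdw_nodes_os_performance_counters DMV named in option D returns one row per performance counter per node, which is how per-distribution values surface in a dedicated SQL pool. A sketch of the kind of query involved, using SQL Server's standard log-size counter name:

# Illustrative query text; run against the dedicated SQL pool.
LOG_SIZE_QUERY = """
SELECT pdw_node_id, instance_name, cntr_value AS log_size_kb
FROM sys.dm_pdw_nodes_os_performance_counters
WHERE counter_name = 'Log File(s) Size (KB)';
"""
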
Question 14

You have an Azure Data Lake Storage Gen 2 account named storage1.

You need to recommend a solution for accessing the content in storage1. The solution must meet the following requirements:

  • List and read permissions must be granted at the storage account level.
  • Additional permissions can be applied to individual objects in storage1.
  • Security principals from Microsoft Azure Active Directory (Azure AD), part of Microsoft Entra, must be used for authentication.

What should you use? To answer, drag the appropriate components to the correct requirements. Each component may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

Options:

Question 15

You are building an Azure Stream Analytics job that queries reference data from a product catalog file. The file is updated daily.

The reference data input details for the file are shown in the Input exhibit. (Click the Input tab.)

The storage account container view is shown in the Refdata exhibit. (Click the Refdata tab.)

You need to configure the Stream Analytics job to pick up the new reference data.

What should you configure? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 16

You need to design the partitions for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 17

You use Azure Stream Analytics to receive Twitter data from Azure Event Hubs and to output the data to an Azure Blob storage account.

You need to output the count of tweets from the last five minutes every minute.

Which windowing function should you use?

Options:

A. Sliding

B. Session

C. Tumbling

D. Hopping

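For syntax context, a Stream Analytics query using one of the listed window types is sketched below as a Python string; the input/output aliases and the CreatedAt timestamp field are placeholders. A hopping window takes (time unit, window size, hop size), so the example emits every minute over the last five minutes.

# Illustrative Stream Analytics query text, shown for windowing syntax only.
ASA_QUERY = """
SELECT COUNT(*) AS TweetCount, System.Timestamp() AS WindowEnd
INTO [blob-output]
FROM [eventhub-input] TIMESTAMP BY CreatedAt
GROUP BY HoppingWindow(minute, 5, 1)
"""
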
Question 18

You need to design a data ingestion and storage solution for the Twitter feeds. The solution must meet the customer sentiment analytics requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 19

You need to design a data retention solution for the Twitter feed data records. The solution must meet the customer sentiment analytics requirements.

Which Azure Storage functionality should you include in the solution?

Options:

A. change feed

B. soft delete

C. time-based retention

D. lifecycle management

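For context on one of the options, this is the shape of an Azure Storage lifecycle-management policy document; the rule name, prefix, and day thresholds are placeholders.

# Illustrative lifecycle-management policy (the JSON document Azure Storage
# accepts): tier blobs to cool after 30 days, delete them after 365.
lifecycle_policy = {
    "rules": [
        {
            "enabled": True,
            "name": "retain-twitter-feed",
            "type": "Lifecycle",
            "definition": {
                "filters": {
                    "blobTypes": ["blockBlob"],
                    "prefixMatch": ["twitter-feed/"],  # placeholder prefix
                },
                "actions": {
                    "baseBlob": {
                        "tierToCool": {"daysAfterModificationGreaterThan": 30},
                        "delete": {"daysAfterModificationGreaterThan": 365},
                    }
                },
            },
        }
    ]
}
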
Question 20

You need to design a data retention solution for the Twitter feed data records. The solution must meet the customer sentiment analytics requirements.

Which Azure Storage functionality should you include in the solution?

Options:

A. time-based retention

B. change feed

C. soft delete

D. lifecycle management

Question 21

You need to implement versioned changes to the integration pipelines. The solution must meet the data integration requirements.

In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 22

You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.

Which type of integration runtime should you use?

Options:

A. Azure-SSIS integration runtime

B. self-hosted integration runtime

C. Azure integration runtime

Question 23

You need to implement the surrogate key for the retail store table. The solution must meet the sales transaction dataset requirements.

What should you create?

Options:

A. a table that has an IDENTITY property

B. a system-versioned temporal table

C. a user-defined SEQUENCE object

D. a table that has a FOREIGN KEY constraint

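For syntax context, a dedicated SQL pool dimension table with an IDENTITY property (option A) looks roughly like this; the table name, columns, and distribution choice are placeholders.

# Illustrative DDL: IDENTITY generates surrogate key values on load. In a
# dedicated SQL pool the values are unique but not guaranteed to be sequential.
DIM_STORE_DDL = """
CREATE TABLE dbo.DimRetailStore
(
    StoreSK   INT IDENTITY(1, 1) NOT NULL,  -- surrogate key
    StoreCode NVARCHAR(20) NOT NULL,
    StoreName NVARCHAR(100) NOT NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);
"""
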
Question 24

You need to ensure that the Twitter feed data can be analyzed in the dedicated SQL pool. The solution must meet the customer sentiment analytics requirements.

Which three Transact-SQL DDL commands should you run in sequence? To answer, move the appropriate commands from the list of commands to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Options:

Question 25

You need to implement an Azure Synapse Analytics database object for storing the sales transactions data. The solution must meet the sales transaction dataset requirements.

What should you do? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 26

You need to design an analytical storage solution for the transactional data. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 27

What should you do to improve high availability of the real-time data processing solution?

Options:

A. Deploy identical Azure Stream Analytics jobs to paired regions in Azure.

B. Deploy a High Concurrency Databricks cluster.

C. Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.

D. Set Data Lake Storage to use geo-redundant storage (GRS).

Question 28

What should you recommend using to secure sensitive customer contact information?

Options:

A. data labels

B. column-level security

C. row-level security

D. Transparent Data Encryption (TDE)

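For context on option B, column-level security in a dedicated SQL pool is expressed as a plain GRANT on a column list; the table, columns, and role below are placeholders.

# Illustrative T-SQL: SalesAnalyst can read only the listed columns, so any
# sensitive contact columns not in the list stay hidden from that role.
COLUMN_GRANT = """
GRANT SELECT (CustomerId, City, Region) ON dbo.Customers TO SalesAnalyst;
"""
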
Question 29

You need to design a data storage structure for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 30

Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 31

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

Options:

A. a server-level virtual network rule

B. a database-level virtual network rule

C. a database-level firewall IP rule

D. a server-level firewall IP rule
