Databricks Certified Data Analyst Associate Exam Question and Answers

Databricks Certified Data Analyst Associate Exam

Last Update May 8, 2024
Total Questions : 45

Questions 1

How can a data analyst determine if query results were pulled from the cache?

Options:

A. Go to the Query History tab and click on the text of the query. The slideout shows if the results came from the cache.

B. Go to the Alerts tab and check the Cache Status alert.

C. Go to the Queries tab and click on Cache Status. The status will be green if the results from the last run came from the cache.

D. Go to the SQL Warehouse (formerly SQL Endpoints) tab and click on Cache. The Cache file will show the contents of the cache.

E. Go to the Data tab and click Last Query. The details of the query will show if the results came from the cache.

Questions 2

Which of the following is an advantage of using a Delta Lake-based data lakehouse over common data lake solutions?

Options:

A. ACID transactions

B. Flexible schemas

C. Data deletion

D. Scalable storage

E. Open-source formats

Questions 3

Which of the following layers of the medallion architecture is most commonly used by data analysts?

Options:

A. None of these layers are used by data analysts

B. Gold

C. All of these layers are used equally by data analysts

D. Silver

E. Bronze

Questions 4

A data organization has a team of engineers developing data pipelines following the medallion architecture using Delta Live Tables. While the data analysis team working on a project is using gold-layer tables from these pipelines, they need to perform some additional processing of these tables prior to performing their analysis.

Which of the following terms is used to describe this type of work?

Options:

A. Data blending

B. Last-mile

C. Data testing

D. Last-mile ETL

E. Data enhancement

Questions 5

The stakeholders.customers table has 15 columns and 3,000 rows of data. The following command is run:

After running SELECT * FROM stakeholders.eur_customers, 15 rows are returned. After the command executes completely, the user logs out of Databricks.

After logging back in two days later, what is the status of the stakeholders.eur_customers view?

Options:

A. The view remains available and SELECT * FROM stakeholders.eur_customers will execute correctly.

B. The view has been dropped.

C. The view is not available in the metastore, but the underlying data can be accessed with SELECT * FROM delta.`stakeholders.eur_customers`.

D. The view remains available but attempting to SELECT from it results in an empty result set because data in views is automatically deleted after logging out.

E. The view has been converted into a table.

Questions 6

Which of the following is a benefit of Databricks SQL using ANSI SQL as its standard SQL dialect?

Options:

A. It has increased customization capabilities

B. It is easy to migrate existing SQL queries to Databricks SQL

C. It allows for the use of Photon's computation optimizations

D. It is more performant than other SQL dialects

E. It is more compatible with Spark's interpreters

Questions 7

A data analyst is working with gold-layer tables to complete an ad-hoc project. A stakeholder has provided the analyst with an additional dataset that can be used to augment the gold-layer tables already in use.

Which of the following terms is used to describe this data augmentation?

Options:

A. Data testing

B. Ad-hoc improvements

C. Last-mile

D. Last-mile ETL

E. Data enhancement

Questions 8

A data analyst has been asked to produce a visualization that shows the flow of users through a website.

Which of the following is used for visualizing this type of flow?

Options:

A. Heatmap

B. Choropleth

C. Word Cloud

D. Pivot Table

E. Sankey

Questions 9

A data analyst has set up a SQL query to run every four hours on a SQL endpoint, but the SQL endpoint is taking too long to start up with each run.

Which of the following changes can the data analyst make to reduce the start-up time for the endpoint while managing costs?

Options:

A. Reduce the SQL endpoint cluster size

B. Increase the SQL endpoint cluster size

C. Turn off the Auto stop feature

D. Increase the minimum scaling value

E. Use a Serverless SQL endpoint

Questions 10

Consider the following two statements:

Statement 1:

Statement 2:

Which of the following describes how the result sets will differ for each statement when they are run in Databricks SQL?

Options:

A. The first statement will return all data from the customers table and matching data from the orders table. The second statement will return all data from the orders table and matching data from the customers table. Any missing data will be filled in with NULL.

B. When the first statement is run, only rows from the customers table that have at least one match with the orders table on customer_id will be returned. When the second statement is run, only those rows in the customers table that do not have at least one match with the orders table on customer_id will be returned.

C. There is no difference between the result sets for both statements.

D. Both statements will fail because Databricks SQL does not support those join types.

E. When the first statement is run, all rows from the customers table will be returned and only the customer_id from the orders table will be returned. When the second statement is run, only those rows in the customers table that do not have at least one match with the orders table on customer_id will be returned.
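
The two statements are not reproduced in this copy, so the sketch below does not claim to show them. It only illustrates, in plain Python, the two behaviours that option B describes: keeping left-side rows that have at least one match (what SQL calls a semi join) and keeping left-side rows that have no match (an anti join). The sample rows and the helper names `semi_join` and `anti_join` are hypothetical and exist only for this illustration.

```python
# Hypothetical sample data; not taken from the exam.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
    {"customer_id": 3, "name": "Cy"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},  # second order for the same customer
    {"order_id": 12, "customer_id": 3},
]


def semi_join(left, right, key):
    """Keep each left row that has at least one match in right on key."""
    right_keys = {row[key] for row in right}
    # Each left row appears at most once, however many matches it has.
    return [row for row in left if row[key] in right_keys]


def anti_join(left, right, key):
    """Keep each left row that has no match in right on key."""
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] not in right_keys]


with_orders = semi_join(customers, orders, "customer_id")
without_orders = anti_join(customers, orders, "customer_id")

print([c["customer_id"] for c in with_orders])     # customers with orders
print([c["customer_id"] for c in without_orders])  # customers with no orders
```

Note that neither helper returns any column from the right-hand side, and that a customer with two orders still appears only once in the first result.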

Questions 11

Which of the following approaches can be used to ingest data directly from cloud-based object storage?

Options:

A. Create an external table while specifying the DBFS storage path to FROM

B. Create an external table while specifying the DBFS storage path to PATH

C. It is not possible to directly ingest data from cloud-based object storage

D. Create an external table while specifying the object storage path to FROM

E. Create an external table while specifying the object storage path to LOCATION

Questions 12

A data analyst has been asked to use the table sales_table below to get the percentage rank of products within each region by sales:

The result of the query should look like this:

Which of the following queries will accomplish this task?

A)

B)

C)

D)

Options:

A. Option A

B. Option B

C. Option C

D. Option D
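
The table, the expected result, and the four candidate queries are not reproduced in this copy. As a rough illustration of what "percentage rank within region" means, the Python sketch below computes the same quantity that the SQL window function `percent_rank()` yields when partitioned by region: (rank - 1) / (rows in partition - 1). The sample rows, the column order, and the descending sort on sales are assumptions made for illustration, not the exam's data.

```python
from collections import defaultdict

# Hypothetical rows: (region, product, sales). Not the exam's sales_table.
SALES = [
    ("East", "A", 300),
    ("East", "B", 200),
    ("East", "C", 100),
    ("West", "A", 50),
    ("West", "B", 50),  # tie with the row above
]


def percent_rank_by_region(rows):
    """Return (region, product, sales, pct_rank) for every input row.

    pct_rank = (rank - 1) / (n - 1) within each region, where rank is the
    1-based rank by sales in descending order (ties share a rank) and n is
    the number of rows in the region. A single-row region gets 0.0.
    """
    by_region = defaultdict(list)
    for region, product, sales in rows:
        by_region[region].append((product, sales))

    result = []
    for region, items in by_region.items():
        n = len(items)
        ordered = sorted(items, key=lambda item: item[1], reverse=True)
        for product, sales in ordered:
            # Rank = 1 + number of rows in the region with strictly higher sales.
            rank = 1 + sum(1 for _, other in items if other > sales)
            pct_rank = (rank - 1) / (n - 1) if n > 1 else 0.0
            result.append((region, product, sales, pct_rank))
    return result


for row in percent_rank_by_region(SALES):
    print(row)
```

The key point is the partitioning: the rank restarts in every region, so the top-selling product in each region gets 0.0 regardless of how its sales compare with other regions.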

Questions 13

A data analyst needs to use the Databricks Lakehouse Platform to quickly create SQL queries and data visualizations. It is a requirement that the compute resources in the platform can be made serverless, and it is expected that data visualizations can be placed within a dashboard.

Which of the following Databricks Lakehouse Platform services/capabilities meets all of these requirements?

Options:

A. Delta Lake

B. Databricks Notebooks

C. Tableau

D. Databricks Machine Learning

E. Databricks SQL
