
CCA175 Online Practice Questions and Answers

Question 4

Problem Scenario 17: You have been given the following MySQL database details as well as other info.
user=retail_dba
password=cloudera
database=retail_db
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the below assignment.

1.

Create a table in Hive as below:

create table departments_hive01(department_id int, department_name string, avg_salary int);

2.

Create another table in MySQL using the below statement:

CREATE TABLE IF NOT EXISTS departments_hive01(id int, department_name varchar(45), avg_salary int);

3.

Copy all the data from the departments table to departments_hive01 using:

insert into departments_hive01 select a.*, null from departments a;

Also insert the following records:

insert into departments_hive01 values(777, "Not known", 1000);
insert into departments_hive01 values(8888, null, 1000);
insert into departments_hive01 values(666, null, 1100);

4.

Now import the data from the MySQL table departments_hive01 into this Hive table. Please make sure that the data is visible using the below Hive command. Also, while importing, if a null value is found for the department_name column, replace it with "" (empty string), and for the id column replace it with -999.

select * from departments_hive01;
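A possible solution sketch (assuming the quickstart credentials given above; Sqoop's null-handling flags perform the required substitutions during the import):

sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username retail_dba \
--password cloudera \
--table departments_hive01 \
--hive-import \
--hive-table departments_hive01 \
--null-string "" \
--null-non-string -999 \
-m 1

Here --null-string replaces NULLs in string columns (department_name) with an empty string, and --null-non-string replaces NULLs in non-string columns (id) with -999.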

Question 5

Problem Scenario 45: You have been given 2 files, with the content as given below:

(spark12/technology.txt)
first,last,technology
Amit,Jain,java
Lokesh,kumar,unix
Mithun,kale,spark
Rajni,vekat,hadoop
Rahul,Yadav,scala

(spark12/salary.txt)
first,last,salary
Amit,Jain,100000
Lokesh,kumar,95000
Mithun,kale,150000
Rajni,vekat,154000
Rahul,Yadav,120000

Write a Spark program which will join the data based on first and last name and save the joined results in the following format: first,last,technology,salary
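One possible Scala solution sketch (run in the Spark shell; the output path spark12/joined_output is a hypothetical choice, since the scenario does not name one):

val tech = sc.textFile("spark12/technology.txt")
  .filter(line => !line.startsWith("first,"))  // drop the header row
  .map(_.split(","))
  .map(a => ((a(0), a(1)), a(2)))              // key by (first, last)
val sal = sc.textFile("spark12/salary.txt")
  .filter(line => !line.startsWith("first,"))
  .map(_.split(","))
  .map(a => ((a(0), a(1)), a(2)))
val joined = tech.join(sal)                    // ((first, last), (technology, salary))
joined.map { case ((f, l), (t, s)) => s"$f,$l,$t,$s" }
  .saveAsTextFile("spark12/joined_output")     // hypothetical output directory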

Question 6

Problem Scenario 8: You have been given the following MySQL database details as well as other info.

Please accomplish the following.

1.

Import the joined result of the orders and order_items tables, joined on orders.order_id = order_items.order_item_order_id.

2.

Also make sure the imported data is partitioned into 2 files, e.g. part-00000 and part-00001.

3.

Also make sure you use the order_id column for Sqoop to use for boundary conditions.
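A hedged Sqoop sketch: the scenario omits the connection details, so the quickstart credentials used elsewhere in this set are assumed, and the target directory p8_orders_join is a hypothetical name. A free-form query import requires the $CONDITIONS placeholder, and --split-by supplies the boundary-condition column:

sqoop import \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username retail_dba \
--password cloudera \
--query "select o.*, oi.* from orders o join order_items oi on o.order_id = oi.order_item_order_id where \$CONDITIONS" \
--target-dir p8_orders_join \
--split-by order_id \
-m 2

Using -m 2 runs two mappers and therefore produces two part files.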

Question 7

Problem Scenario 40: You have been given sample data as below in a file called spark15/file1.txt

3070811,1963,1096,,"US","CA",,1,
3022811,1963,1096,,"US","CA",,1,56
3033811,1963,1096,,"US","CA",,1,23

Below is the code snippet to process this file.

val field = sc.textFile("spark15/file1.txt")
val mapper = field.map(x => A)
mapper.map(x => x.map(x => {B})).collect

Please fill in A and B so it can generate the below final output:

Array(Array(3070811, 1963, 1096, 0, "US", "CA", 0, 1, 0),
Array(3022811, 1963, 1096, 0, "US", "CA", 0, 1, 56),
Array(3033811, 1963, 1096, 0, "US", "CA", 0, 1, 23))
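One plausible completion: A splits each line while keeping trailing empty fields, and B substitutes 0 for every empty field:

val field = sc.textFile("spark15/file1.txt")
// A: split on commas; the -1 limit keeps trailing empty fields
val mapper = field.map(x => x.split(",", -1))
// B: replace any empty field with 0, leaving other values untouched
mapper.map(x => x.map(x => if (x.isEmpty) 0 else x)).collect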

Question 8

Problem Scenario 96: Your Spark application requires extra Java options as below.

-XX:+PrintGCDetails -XX:+PrintGCTimeStamps

Please replace the XXX values correctly.

./bin/spark-submit --name "My app" --master local[4] --conf spark.eventLog.enabled=false --conf XXX hadoopexam.jar
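A hedged completion: extra JVM options are passed through the spark.driver.extraJavaOptions configuration property (spark.executor.extraJavaOptions is its executor-side counterpart), so XXX can be filled in as below:

./bin/spark-submit --name "My app" --master local[4] \
--conf spark.eventLog.enabled=false \
--conf "spark.driver.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
hadoopexam.jar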

Question 9

Problem Scenario 27: You need to implement a near real time solution for collecting information as it is submitted in files with the below information.

Data

echo "IBM,100,20160104" >> /tmp/spooldir/bb/.bb.txt echo "IBM,103,20160105" >> /tmp/spooldir/bb/.bb.txt mv /tmp/spooldir/bb/.bb.txt /tmp/spooldir/bb/bb.txt After few mins echo "IBM,100.2,20160104" >> /tmp/spooldir/dr/.dr.txt echo "IBM,103.1,20160105" >> /tmp/spooldir/dr/.dr.txt mv /tmp/spooldir/dr/.dr.txt /tmp/spooldir/dr/dr.txt

Requirements:

You have been given the below directory location (if not available then create it): /tmp/spooldir.

You have a financial subscription for getting stock prices from Bloomberg as well as Reuters, and using FTP you download new files every hour from their respective FTP sites into the directories /tmp/spooldir/bb and /tmp/spooldir/dr respectively.

As soon as a file is committed in either directory, it needs to be available in HDFS at the /tmp/flume/finance location, in a single directory.

Write a Flume configuration file named flume7.conf and use it to load the data into HDFS with the following additional properties.

1.

Spool /tmp/spooldir/bb and /tmp/spooldir/dr

2.

The file prefix in HDFS should be events.

3.

The file suffix should be .log.

4.

If a file is not committed and is in use, then it should have _ as a prefix.

5.

Data should be written as text to HDFS.
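A sketch of flume7.conf, assuming an agent named agent1 and a memory channel (the source and sink property names are standard Flume spooldir and HDFS sink settings):

agent1.sources = bb dr
agent1.channels = ch1
agent1.sinks = sink1

agent1.sources.bb.type = spooldir
agent1.sources.bb.spoolDir = /tmp/spooldir/bb
agent1.sources.bb.channels = ch1

agent1.sources.dr.type = spooldir
agent1.sources.dr.spoolDir = /tmp/spooldir/dr
agent1.sources.dr.channels = ch1

agent1.channels.ch1.type = memory

agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.channel = ch1
agent1.sinks.sink1.hdfs.path = /tmp/flume/finance
agent1.sinks.sink1.hdfs.filePrefix = events
agent1.sinks.sink1.hdfs.fileSuffix = .log
agent1.sinks.sink1.hdfs.inUsePrefix = _
agent1.sinks.sink1.hdfs.fileType = DataStream

The agent could then be started with: flume-ng agent --conf conf --conf-file flume7.conf --name agent1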

Question 10

Problem Scenario 66: You have been given the below code snippet.

val a = sc.parallelize(List("dog", "tiger", "lion", "cat", "spider", "eagle"), 2)

val b = a.keyBy(_.length)

val c = sc.parallelize(List("ant", "falcon", "squid"), 2)

val d = c.keyBy(_.length)

operation1

Write a correct code snippet for operation1 which will produce the desired output, shown below.

Array[(Int, String)] = Array((4,lion))
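One operation that yields exactly this result is subtractByKey, which keeps the pairs in b whose key does not appear in d:

// key lengths in b: 3, 5, 4, 3, 6, 5; key lengths in d: 3, 6, 5
// only key 4 ("lion") has no match in d
b.subtractByKey(d).collect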

Question 11

Problem Scenario 74: You have been given a MySQL DB with the following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.orders

table=retail_db.order_items

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Columns of orders table: (order_id, order_date, order_customer_id, order_status)

Columns of order_items table: (order_item_id, order_item_order_id, order_item_product_id, order_item_quantity, order_item_subtotal, order_item_product_price)

Please accomplish the following activities.

1.

Copy the "retail_db.orders" and "retail_db.order_items" tables to HDFS in the respective directories p89_orders and p89_order_items.

2.

Join these data using order_id in Spark and Python.

3.

Now fetch selected columns from the joined data: order_id, order_date and the amount collected on this order.

4.

Calculate the total orders placed for each date, and produce the output sorted by date.
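A pyspark sketch under the schemas stated above (it assumes the tables were first copied to HDFS with Sqoop into p89_orders and p89_order_items as plain comma-delimited text; column positions follow the listed schemas):

orders = sc.textFile("p89_orders").map(lambda line: line.split(","))
orderItems = sc.textFile("p89_order_items").map(lambda line: line.split(","))

# key both datasets by order_id for the join
ordersKV = orders.map(lambda o: (int(o[0]), (o[1], o[3])))    # (order_id, (order_date, order_status))
itemsKV = orderItems.map(lambda i: (int(i[1]), float(i[4])))  # (order_item_order_id, order_item_subtotal)

joined = ordersKV.join(itemsKV)  # (order_id, ((order_date, order_status), subtotal))

# order_id, order_date and the amount collected on each order
perOrder = joined.map(lambda kv: ((kv[0], kv[1][0][0]), kv[1][1])).reduceByKey(lambda a, b: a + b)

# total orders placed per date, sorted by date
perDate = ordersKV.map(lambda kv: (kv[1][0], 1)).reduceByKey(lambda a, b: a + b).sortByKey()
for row in perDate.collect():
    print(row)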

Question 12

Problem Scenario 93: You have to run your Spark application locally with 8 threads (locally on 8 cores). Replace XXX with the correct values.

spark-submit --class com.hadoopexam.MyTask XXX \
--deploy-mode cluster $SPARK_HOME/lib/hadoopexam.jar 10
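A hedged completion: XXX is the master URL, and local[8] runs the application locally with 8 worker threads. The rest of the command is kept exactly as the question gives it (note that a real Spark installation would reject --deploy-mode cluster with a local master; the snippet below simply fills in the placeholder):

spark-submit --class com.hadoopexam.MyTask \
--master local[8] \
--deploy-mode cluster $SPARK_HOME/lib/hadoopexam.jar 10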

Question 13

Problem Scenario 6: You have been given the following MySQL database details as well as other info.
user=retail_dba
password=cloudera
database=retail_db
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Compression Codec: org.apache.hadoop.io.compress.SnappyCodec
Please accomplish the following.

1.

Import the entire database such that it can be used as Hive tables; they must be created in the default schema.

2.

Also make sure each table's data is partitioned into 3 files, e.g. part-00000, part-00001, part-00002.

3.

Store all the generated Java files in a directory called java_output to evaluate them further.
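A hedged sketch using sqoop import-all-tables (--hive-import creates the tables in Hive's default database, --outdir collects the generated Java files, and -m 3 yields three part files per table):

sqoop import-all-tables \
--connect jdbc:mysql://quickstart:3306/retail_db \
--username retail_dba \
--password cloudera \
--hive-import \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--outdir java_output \
-m 3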

Exam Code: CCA175
Exam Name: CCA Spark and Hadoop Developer Exam
Last Update: Apr 25, 2024
Questions: 95