Cloudera
________________________________________________________________________________________________
https://www. pass4sures.com/
Question 1
Why should you stop an iterative machine learning algorithm as soon as the performance of the model on a test set
stops improving?
Answer: B
Question 2
A. ^A (Control-A)
B. , (comma)
C. \t (tab)
D. : (colon)
Answer: A
Explanation:
Reference:
http://blog.spryinc.com/2013/10/four-useful-tricks-for-working-with-hive.html (change the delimiter when exporting
a Hive table)
Question 3
Certain individuals are more susceptible to autism if they have particular combinations of genes expressed in their
DNA. Given a sample of DNA from persons who have autism and a sample of DNA from persons who do not have
autism, determine the best technique for predicting whether or not a given individual is susceptible to developing
autism.
A. Naive Bayes
B. Linear Regression
C. Survival analysis
D. Sequence alignment
Answer: B
Question 4
You are working with a logistic regression model to predict the probability that a user will click on an ad. Your model
has hundreds of features, and you’re not sure if all of those features are helping your prediction. Which regularization
technique should you use to prune features that aren’t contributing to the model?
A. Convex
B. Uniform
C. L2
D. L1
Answer: D
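As a rough illustration of why an L1 penalty can prune features while an L2 penalty only shrinks them, the sketch below compares the two closed-form shrinkage rules for a single coefficient. The weights and the penalty strength `lam` are made-up values, and the function names are hypothetical.

```python
def l1_shrink(w, lam):
    """Soft-thresholding: the proximal step for an L1 penalty lam * |w|."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0  # coefficients no larger than lam are set exactly to zero


def l2_shrink(w, lam):
    """Proximal step for an L2 penalty (lam / 2) * w**2: rescales, never zeroes."""
    return w / (1.0 + lam)


weights = [2.0, -0.05, 0.3, 0.01, -1.2]  # hypothetical fitted coefficients
lam = 0.1                                # hypothetical penalty strength

l1_weights = [l1_shrink(w, lam) for w in weights]
l2_weights = [l2_shrink(w, lam) for w in weights]

print("L1:", l1_weights)
print("L2:", l2_weights)
```

Under these assumptions the two small coefficients become exactly zero with L1, while every L2-shrunk coefficient stays nonzero.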
Question 5
A. A
B. B
C. C
Answer: A
Question 6
A. A
B. B
C. C
Answer: C
Question 7
A. A
B. B
C. C
Answer: B
Question 8
Under what two conditions does stochastic gradient descent outperform 2nd-order optimization techniques such as
iteratively reweighted least squares?
A. When the volume of input data is so large and diverse that a 2nd-order optimization technique can be fit to a
sample of the data
B. When the model’s estimates must be updated in real-time in order to account for new observations.
C. When the input data can easily fit into memory on a single machine, but we want to calculate confidence intervals
for all of the parameters in the model.
D. When we are required to find the parameters that return the optimal value of the objective function.
Answer: A,B
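To make the real-time update case concrete, here is a minimal sketch of stochastic gradient descent on a one-feature linear model with squared error, where each new observation triggers one cheap update. The data stream, the learning rate `lr`, and the function name `sgd_update` are assumptions chosen for illustration.

```python
def sgd_update(w, b, x, y, lr=0.05):
    """One stochastic gradient step for squared error on a single observation."""
    error = (w * x + b) - y
    # Gradient of 0.5 * error**2 with respect to w and b.
    return w - lr * error * x, b - lr * error


# Hypothetical stream generated by y = 2x + 1, visited one record at a time.
stream = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]

w, b = 0.0, 0.0
for _ in range(200):          # repeated passes stand in for a long stream
    for x, y in stream:
        w, b = sgd_update(w, b, x, y)

print(round(w, 3), round(b, 3))
```

Each update touches only the current record, so the estimates can follow new observations without refitting on the full data set.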
Question 9
What is the result of the following command (the database username is foo and password is bar)?
$ sqoop list-tables --connect jdbc:mysql://localhost/databasename --table --username foo --password bar
A. sqoop lists only those tables in the specified MySQL database that have not already been imported into HDFS
B. sqoop returns an error
Answer: C
Explanation:
Reference:
https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-15/getting-sqoop
Question 10
What is the most common reason for a k-means clustering algorithm to return a sub-optimal clustering of its input?
Answer: C
Question 11
There are 20 patients with acute lymphoblastic leukemia (ALL) and 32 patients with acute myeloid leukemia (AML),
both variants of a blood cancer.
The makeup of the groups is as follows:
Each individual has an expression value for each of 10,000 different genes. The expression value for each gene is a
continuous value between -1 and 1.
You’ve built your model for discriminating between AML and ALL patients and you find that it works quite well on
your current data. One month later, a collaborator tells you she has fresh data from 100 new AML/ALL patients. You
run the samples through your model, and it turns out your model has very poor predictive accuracy on the new
samples; specifically, your model predicts that all males have ALL. What is the most reliable way to fix this problem?
Answer: D
Question 12
There are 20 patients with acute lymphoblastic leukemia (ALL) and 32 patients with acute myeloid leukemia (AML),
both variants of a blood cancer.
The makeup of the groups is as follows:
Each individual has an expression value for each of 10,000 different genes. The expression value for each gene is a
continuous value between -1 and 1.
You want to use the data from the 52 patients in the scenario to improve the ability of doctors to
distinguish between ALL and AML. What type of data science problem is this?
A. Classification
B. Regression
C. Clustering
D. Filtering
Answer: A
Question 13
There are 20 patients with acute lymphoblastic leukemia (ALL) and 32 patients with acute myeloid leukemia (AML),
both variants of a blood cancer.
The makeup of the groups is as follows:
Each individual has an expression value for each of 10,000 different genes. The expression value for each gene is a
continuous value between -1 and 1.
With which type of plot can you encode the largest amount of the data visually?
Answer: C
Question 14
There are 20 patients with acute lymphoblastic leukemia (ALL) and 32 patients with acute myeloid leukemia (AML),
both variants of a blood cancer.
The makeup of the groups is as follows:
Each individual has an expression value for each of 10,000 different genes. The expression value for each gene is a
continuous value between -1 and 1.
Rather than use all 10,000 features to separate AML from ALL, you pick a small subset of features to separate them
optimally. Your feature vectors have 10,000 dimensions while you only have 52 data points. You use cross-validation to
test your chosen set of features. What three methods will choose the features in an optimal way?
Answer: C,D,F
Question 15
There are 20 patients with acute lymphoblastic leukemia (ALL) and 32 patients with acute myeloid leukemia (AML),
both variants of a blood cancer.
The makeup of the groups is as follows:
Each individual has an expression value for each of 10,000 different genes. The expression value for each gene is a
continuous value between -1 and 1.
You choose to perform agglomerative hierarchical clustering on the 10,000 features. How much RAM do you need to
hold the distance matrix, assuming each distance value is a 64-bit double?
A. ~ 800 MB
B. ~ 400 MB
C. ~ 160 KB
D. ~ 4 MB
Answer: B
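The memory estimate can be checked with a few lines of arithmetic. The sketch below assumes the 10,000 features are the items being clustered and that a megabyte is counted as 10^6 bytes; the variable names are illustrative.

```python
n_items = 10_000          # number of features being clustered
bytes_per_value = 8       # 64-bit double

full_matrix_bytes = n_items * n_items * bytes_per_value
# A distance matrix is symmetric with a zero diagonal, so only the
# n * (n - 1) / 2 pairwise distances need to be stored.
condensed_bytes = n_items * (n_items - 1) // 2 * bytes_per_value

print(full_matrix_bytes / 1e6, "MB for the full square matrix")
print(condensed_bytes / 1e6, "MB for the condensed upper triangle")
```

Storing only the distinct pairs roughly halves the footprint of the full square matrix.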
Question 16
You have a large m x n data matrix M. You decide you want to perform dimension reduction/clustering on your data
and have decided to use the singular value decomposition (SVD;
also called principal components analysis, PCA).
You performed singular value decomposition (SVD; also called principal components analysis or PCA) on your data
matrix but you did not center your data first. What does your first singular component describe?
Answer: C
Question 17
You have a large m x n data matrix M. You decide you want to perform dimension reduction/clustering on your data
and have decided to use the singular value decomposition (SVD;
also called principal components analysis, PCA).
Refer to the passage above.
What represents the SVD of the matrix M given the following information:
U is m x m unitary
V is n x n unitary
S is m x n diagonal
Q is n x n invertible
D is n x n diagonal
L is m x m lower triangular
U is m x m upper triangular
A. M = U S V
B. M = U P
C. M = Q D Q^-1
D. M = L U
Answer: A
Question 18
You have a large m x n data matrix M. You decide you want to perform dimension reduction/clustering on your data
and have decided to use the singular value decomposition (SVD;
also called principal components analysis, PCA).
For the moment, assume that your data matrix M is 500 x 2. The figure below shows a plot of the data.
A. Blue
B. Yellow
Answer: A
Question 19
Many machine learning algorithms involve finding the global minimum of a convex loss function, primarily because:
Answer: B
Question 20
Which two techniques should you use to avoid overfitting a classification model to a data set?
A. Include a small number of “noise” features that are not thought to be correlated with the dependent variable.
B. Replicate features that are thought to be significant predictors of the dependent variable multiple times for each
observation.
C. Separate your input data into a training set that is used for fitting and a test set that is used for evaluating the
model’s performance
D. Include a regularization term in the model’s objective function to control how precisely the model fits the data
E. Preprocess the data to exclude atypical observations from the model input
Answer: C,D
Question 21
You are building a k-nearest neighbor classifier (k-NN) on a labeled set of points in a high-dimensional space. You
determine that the classifier has a large error on the training data. What is the most likely problem?
Answer: B
Question 22
A. Flume is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis
programs, coupled with an infrastructure consisting of sources and sinks for importing and evaluating large data sets
B. Flume acts as a Hadoop filesystem for log files
C. Flume imports data from SQL/relational databases into your Hadoop cluster
D. Flume provides a query language for Hadoop similar to SQL
E. Flume is a distributed server for collecting and moving large amounts of data into HDFS as it’s produced from
streaming data flows
Answer: E
Question 23
You have a directory containing a number of comma-separated files. Each file has three columns and each filename
has a .csv extension. You want to have a single tab-separated file (all.tsv) that contains all the rows from all the files.
Which command is guaranteed to produce the desired output if you have more than 20,000 files to process?
A. find . -name '*.csv' -print0 | xargs -0 cat | tr ',' '\t' > all.tsv
B. find . -name 'name*.csv' | cat | awk 'BEGIN {FS = "," OFS = "\t"} {print $1, $2, $3}' > all.tsv
C. find . -name '*.csv' | tr ',' '\t' | cat > all.tsv
D. find . -name '*.csv' | cat > all.tsv
E. cat *.csv > all.tsv
Answer: A
Question 24
What are three benefits of running feature selection analysis before fitting a classification model?
Answer: D,E,F
Question 25
When optimizing a function using stochastic gradient descent, how frequently should you update your estimate of the
gradient?
Answer: A,C
Question 26
In what format are web server log files usually generated and how must you transform them in order to make them
usable for analysis in Hadoop?
Answer: A,B
Question 27
Answer: C
Explanation:
Reference:
http://www.cs.cmu.edu/~srosenth/papers/Rosenthal_RecSys20.pdf
Question 28
You are about to sample a 100-dimensional unit cube. To adequately sample any single given dimension, you need only
capture 10 points. How many points do you need in order to sample the complete 100-dimensional unit cube
adequately?
A. 10010
B. 1010
C. Log2(100)
D. 100
E. 1000
F. 1010
Answer: E
Question 29
You have acquired a new data source of millions of customer records, and you’ve loaded this data into HDFS. Prior to analysis,
you want to change all customer registration dates to the same date format, make all addresses uppercase, and remove all
customer names (for anonymization). Which process will accomplish all three objectives?
A. Adapt the data cleansing module in Mahout to your data, and invoke the Mahout library when you run your
analysis
B. Pull this data into an RDBMS using sqoop and scrub records using stored procedures
C. Write a script that receives records on stdin, corrects them, and then writes them to stdout.
Then, invoke this script in a map-only Hadoop Streaming job
D. Write a MapReduce job with a mapper to change words to uppercase and to reduce different forms of dates to a
single form
Answer: C
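A stdin-to-stdout cleaning script of the kind described in option C might look like the sketch below. The record layout (name, registration date, address, tab-separated), the accepted date formats, and all function names are assumptions for illustration; the question does not specify them.

```python
import io
from datetime import datetime

# Hypothetical input date formats; the real data may use others.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or the original text if unparseable."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return text.strip()


def clean_record(line):
    """Assumed layout: name <TAB> registration date <TAB> address."""
    _name, reg_date, address = line.rstrip("\n").split("\t")
    # The name is dropped for anonymization.
    return "\t".join([normalize_date(reg_date), address.upper()])


def run(stream, out):
    """Mapper loop: read records from stream, write cleaned records to out."""
    for line in stream:
        if line.strip():
            out.write(clean_record(line) + "\n")


# Demonstration on an in-memory sample; a deployed mapper would call
# run(sys.stdin, sys.stdout) instead.
sample = io.StringIO("Ann Lee\t03/15/2013\t12 Main St\n")
result = io.StringIO()
run(sample, result)
print(result.getvalue(), end="")
```

Because every record is handled independently, no reduce phase is needed, which is why a map-only streaming job fits this task.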
Question 30
A company has 20 software engineers working to fix bugs on a project. Over the past week, the team has fixed 100 bugs.
Although the average number of bugs fixed per engineer is five, none of the
engineers fixed exactly five bugs last week.
You want to understand how productive each engineer is at fixing bugs. What is the best way to visualize the
distribution of bug fixes per engineer?
Answer: A
Question 31
A company has 20 software engineers working to fix bugs on a project. Over the past week, the team has fixed 100 bugs.
Although the average number of bugs fixed per engineer is five, none of the
engineers fixed exactly five bugs last week.
A. The tech lead’s estimate of how many hours would be needed to fix the bug.
B. The priority of the bug according to the project manager
C. The number of years that the engineer who was assigned the bug has worked at the company
D. The number of bugs that had been found in each sub-component of the project
Answer: D
Question 32
In what way can Hadoop be used to improve the performance of Lloyd’s algorithm for k-means clustering on large
data sets?
Answer: B
Question 33
You have a data file that contains two trillion records, one record per line (comma separated).
Each record lists two friends and a unique message sent between them. Their names will not have commas.
Michael, John, Pabst, Blue Ribbon
Tiffany, James, BMX Racing
John, Michael, Natural Lemon Flavor
Analyze the pseudo code examples below and determine which set of mappers and reducers in the below pseudo
code snippets will solve for the mean number of messages each user sends to all of their friends.
For example, Michael may have three friends to whom he sends 6, 10, and 200 messages, respectively, so
Michael’s mean would be (6+10+200)/3. The solution may require a pipeline of two MapReduce jobs.
Answer: B
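One way to picture a two-job pipeline for this computation is the in-memory sketch below. It is not one of the exam's answer options; it assumes the first field is the sender and the second the recipient, adds two made-up records, and uses a hypothetical `run_job` helper as a stand-in for a MapReduce job.

```python
from collections import defaultdict

records = [
    "Michael, John, Pabst, Blue Ribbon",
    "Tiffany, James, BMX Racing",
    "John, Michael, Natural Lemon Flavor",
    "Michael, John, Second message",      # hypothetical extra records
    "Michael, Tiffany, Third message",
]


def run_job(inputs, mapper, reducer):
    """Tiny in-memory stand-in for one MapReduce job (map, shuffle, reduce)."""
    groups = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            groups[key].append(value)
    return [reducer(key, values) for key, values in sorted(groups.items())]


# Job 1: count messages per (sender, recipient) pair.
def pair_mapper(line):
    # Split on the first two commas only; the message itself may contain commas.
    sender, recipient, _message = [f.strip() for f in line.split(",", 2)]
    yield (sender, recipient), 1


def pair_reducer(pair, ones):
    return pair, sum(ones)


# Job 2: average the per-friend counts for each sender.
def sender_mapper(pair_count):
    (sender, _recipient), count = pair_count
    yield sender, count


def mean_reducer(sender, counts):
    return sender, sum(counts) / len(counts)


pair_counts = run_job(records, pair_mapper, pair_reducer)
means = dict(run_job(pair_counts, sender_mapper, mean_reducer))
print(means)
```

The first job is needed because the mean is taken over friends, not over messages, so the per-friend counts must exist before they can be averaged.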
Question 34
You have just run a MapReduce job to filter user messages to only those of a selected geographical region. The output
for this job is in a directory named westUsers, located just below your home directory in HDFS. Which command gathers
these records into a single file on your local file system?
Answer: B
Question 35
A function is convex if the line segment between two points, a and b, is greater than or equal to the value of the a x b
A. x^(1/2)
B. e^x
C. 2x-1
D. 1-x^2
Answer: A
Question 36
You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB. Because your
Hadoop cluster isn't optimized for storing and processing many small files, you decide to do the following actions:
1. Group the individual images into a set of larger files
2. Use the set of larger files as input for a MapReduce job that processes them directly with Python using Hadoop
streaming
Which data serialization system gives you the flexibility to do this?
A. CSV
B. XML
C. HTML
D. Avro
E. Sequence Files
F. JSON
Answer: B,F
Question 37
You have user profile records in an OLTP database that you want to join with web server logs which you have already
ingested into HDFS. What is the best way to acquire the user profile for use in HDFS?
Answer: B,D
Explanation:
Reference:
https://thinkbiganalytics.com/leading_big_data_technologies/ingestion-and-streaming-with-storm-kafka-flume/
Question 38
You are building a system to perform outlier detection for a large online retailer. You need to build a system to detect
if the total dollar value of sales is outside the norm for each U.S. state, as determined from the physical location of
the buyer for each purchase. The retailer's data sources are scattered across multiple systems and databases and are
unorganized with little coordination or shared data or keys between the various data sources.
Below are the sources of data available to you. Determine which three will give you the smallest set of data sources
but still allow you to implement the outlier detector by state.
A. Database of employees that includes only the employee ID, start date, and department
B. Database of users that contains only their user ID, name, and a list of every item the user has viewed
C. Transaction log that contains only basket ID, basket amount, time of sale completion, and a session ID
D. Database of user sessions that includes only session ID, corresponding user ID, and the corresponding IP address
E. External database mapping IP addresses to geographic locations
F. Database of items that includes only the item name, item ID, and warehouse location
G. Database of shipments that includes only the basket ID, shipment address, shipment date, and shipment method
Answer: C,D,E
Question 39
A. It does not require you to make strong assumptions about the data because it is nonparametric
B. It significantly reduces the size of the parameter space, thus reducing the risk of overfitting
C. It allows you to reduce bias with no tradeoff in variance
D. It guarantees convergence of the estimator
Answer: A
Question 40
What are two defining features of RMSE (root-mean-square error or root-mean-square deviation)?
A. It is sensitive to outliers
B. It is the mean value of recommendations of the K-equal partitions in the input data
C. It is the square of the median value of the error, where error is the difference between predicted rating and actual
ratings
D. It is appropriate for numeric data
E. It considers the order of recommendations
Answer: A,D
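A small numeric check shows how RMSE reacts to a single large error. The ratings below are invented for illustration; both prediction sets have the same total absolute error, but one concentrates it in a single outlier.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length numeric sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


actual = [4.0, 3.0, 5.0, 2.0]            # hypothetical ratings
close = [4.5, 2.5, 4.5, 2.5]             # every prediction off by 0.5
one_outlier = [4.0, 3.0, 5.0, 4.0]       # three exact, one off by 2.0

print(rmse(close, actual))        # 0.5
print(rmse(one_outlier, actual))  # 1.0
```

Squaring the errors before averaging is what makes the single large miss dominate the score.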
Question 41
Consider the following sample from a distribution that contains a continuous X and label Y that is either A or B:
Which is the best cut point for X if you want to discretize these values into two buckets in a way that minimizes the
sum of chi-square values?
A. X8
B. X6
C. X5
D. X4
E. X2
Answer: D
Question 42
Consider the following sample from a distribution that contains a continuous X and label Y that is either A or B:
Which is the best choice of cut points for X if you want to discretize these values into three buckets in a way that minimizes the
sum of chi-square values?
A. X5 and X8
B. X4 and X6
C. X3 and X8
D. X3 and X6
E. X2 and X1
Answer: E
Question 43
You want to understand more about how users browse your public website. For example, you want to know which pages
they visit prior to placing an order. You have a server farm of 200 web servers hosting your website. Which is the most
efficient process to gather these web servers’ access logs into your Hadoop cluster for analysis?
A. Sample the web server logs from the web servers and copy them into HDFS using curl
B. Channel these click streams into Hadoop using Hadoop Streaming
C. Write a MapReduce job with the web servers for mappers and the Hadoop cluster nodes for reducers
D. Import all user clicks from your OLTP databases into Hadoop using Sqoop
E. Ingest the server web logs into HDFS using Flume
Answer: E
Question 44
You have a large file of N records (one per line), and want to randomly sample 10% of them. You have two functions that
are perfect random number generators (though they are a bit slow):
random_uniform() generates a uniformly distributed number in the interval [0, 1]
random_permutation(M) generates a random permutation of the numbers 0 through M-1.
Below are three different functions that implement the sampling.
Method A
for line in file:
    if random_uniform() < 0.1:
        print(line)
Method B
i = 0
for line in file:
    if i % 10 == 0:
        print(line)
    i += 1
Method C
idxs = random_permutation(N)[:(N // 10)]
i = 0
for line in file:
    if i in idxs:
        print(line)
    i += 1
Which method will have the best runtime performance?
A. Method A
B. Method B
C. Method C
Answer: B
Question 45
You have a large file of N records (one per line), and want to randomly sample 10% of them. You have two functions that
are perfect random number generators (though they are a bit slow):
random_uniform() generates a uniformly distributed number in the interval [0, 1]
random_permutation(M) generates a random permutation of the numbers 0 through M-1.
Below are three different functions that implement the sampling.
Method A
for line in file:
    if random_uniform() < 0.1:
        print(line)
Method B
i = 0
for line in file:
    if i % 10 == 0:
        print(line)
    i += 1
Method C
idxs = random_permutation(N)[:(N // 10)]
i = 0
for line in file:
    if i in idxs:
        print(line)
    i += 1
Which method requires the most RAM?
A. Method A
B. Method B
C. Method C
Answer: C
Question 46
You have a large file of N records (one per line), and want to randomly sample 10% of them. You have two functions that
are perfect random number generators (though they are a bit slow):
random_uniform() generates a uniformly distributed number in the interval [0, 1]
random_permutation(M) generates a random permutation of the numbers 0 through M-1.
Below are three different functions that implement the sampling.
Method A
for line in file:
    if random_uniform() < 0.1:
        print(line)
Method B
i = 0
for line in file:
    if i % 10 == 0:
        print(line)
    i += 1
Method C
idxs = random_permutation(N)[:(N // 10)]
i = 0
for line in file:
    if i in idxs:
        print(line)
    i += 1
A. Method A
B. Method B
C. Method C
Answer: C
Question 47
You have a large file of N records (one per line), and want to randomly sample 10% of them. You have two functions that
are perfect random number generators (though they are a bit slow):
random_uniform() generates a uniformly distributed number in the interval [0, 1]
random_permutation(M) generates a random permutation of the numbers 0 through M-1.
Below are three different functions that implement the sampling.
Method A
for line in file:
    if random_uniform() < 0.1:
        print(line)
Method B
i = 0
for line in file:
    if i % 10 == 0:
        print(line)
    i += 1
Method C
idxs = random_permutation(N)[:(N // 10)]
i = 0
for line in file:
    if i in idxs:
        print(line)
    i += 1
Which method is least likely to give you exactly 10% of your data?
A. Method A
B. Method B
C. Method C
Answer: A
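The three sampling methods can be tried out with the runnable sketch below. It substitutes Python's `random` module for the question's `random_uniform` and `random_permutation`, works on an in-memory list of 1,000 made-up records, and stores Method C's indices in a set, which the original pseudocode does not do.

```python
import random


def method_a(lines, rng):
    # Independent coin flip per line: sample size varies from run to run.
    return [line for line in lines if rng.random() < 0.1]


def method_b(lines):
    # Every tenth line: no random numbers at all, fixed sample size.
    return [line for i, line in enumerate(lines) if i % 10 == 0]


def method_c(lines, rng):
    # Materialize a full permutation, keep the first N/10 indices.
    n = len(lines)
    idxs = set(rng.sample(range(n), n)[: n // 10])  # set for O(1) lookups
    return [line for i, line in enumerate(lines) if i in idxs]


rng = random.Random(0)            # seeded stand-in for the question's generators
lines = [f"record {i}" for i in range(1000)]

print(len(method_a(lines, rng)), len(method_b(lines)), len(method_c(lines, rng)))
```

Methods B and C always return N/10 records when N is a multiple of ten, while Method A's count is a random variable centered on N/10. Method C is the only one that holds a structure proportional to N in memory.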
Question 48
Assuming the trends shown in this chart continue, what would we expect the value of the revenue to be in Q1 of
2013?
A. $125,000
B. $170,000
C. $220,000
D. $250,000
Answer: A
Question 49
From historical data, you know that 50% of students who take Cloudera’s Introduction to Data Science: Building
Recommender Systems training course pass this exam, while only 25% of students who did not take the training
course pass this exam. You also know that 50% of this exam’s candidates also take Cloudera’s Introduction to Data
Science: Building Recommender Systems training course.
If we know that a person has passed this exam, what is the probability that they took Cloudera’s Introduction to Data
Science: Building Recommender Systems training course?
A. 2/3
B. 1/2
C. 3/4
D. 3/5
Answer: A
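The conditional probability can be worked through with Bayes' rule. The sketch below uses exact fractions and only the three percentages stated in the question; the variable names are illustrative.

```python
from fractions import Fraction

p_course = Fraction(1, 2)                 # P(took the course)
p_pass_given_course = Fraction(1, 2)      # P(pass | course)
p_pass_given_no_course = Fraction(1, 4)   # P(pass | no course)

# Law of total probability for the denominator.
p_pass = p_pass_given_course * p_course + p_pass_given_no_course * (1 - p_course)

# Bayes' rule: P(course | pass) = P(pass | course) * P(course) / P(pass)
p_course_given_pass = p_pass_given_course * p_course / p_pass

print(p_course_given_pass)  # 2/3
```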
Question 50
From historical data, you know that 50% of students who take Cloudera’s Introduction to Data Science: Building
Recommender Systems training course pass this exam, while only 25% of students who did not take the training
course pass this exam. You also know that 50% of this exam’s candidates also take Cloudera’s Introduction to Data
Science: Building Recommender Systems training course.
What is the probability that any individual exam candidate will pass the data science exam?
A. 3/8
B. 1/4
C. 1/8
D. 1/2
Answer: A
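The overall pass rate can be checked by counting through a cohort. The sketch below assumes a hypothetical cohort of 800 candidates, chosen only so that every group size is a whole number.

```python
from fractions import Fraction

candidates = 800                       # hypothetical cohort size
took_course = candidates // 2          # 50% take the course
no_course = candidates - took_course

passed_with_course = took_course // 2  # 50% of course takers pass
passed_without = no_course // 4        # 25% of the others pass

p_pass = Fraction(passed_with_course + passed_without, candidates)
print(p_pass)  # 3/8
```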
Question 51
You want to build a classification model to identify spam comments on a blog. You decide to use the words in the
comment text as inputs to your model. Which criteria should you use when deciding which words to use as features in
order to contribute to making the correct classification decision?
A. Choose words for your sample that are most correlated with the spam label
B. Choose words for your sample that occur most frequently in the text
C. Choose words for your sample that have the largest mutual information with the spam label
D. Choose words for your sample that are least correlated with the spam label
Answer: A
Question 52
A. 1, 3, 8, 34, 81
B. 1, 4, 13, 34, 81
C. 1, 1.5, 5, 24.5, 81
D. 1, 2.5, 8, 27.5, 81
Answer: A
Question 53
A. You can calculate unbiased estimators for the parameters of the distribution
B. It’s robust to outliers
C. It’s well-defined for any probability distribution
D. You can calculate it quickly using a relational database like MySQL, even when we have a large sample
Answer: D
Question 54
A. They sort all of the input samples and then look up the samples for each percentile
B. They maintain an index of input data as it is loaded into HDFS and load them into memory
C. They use pivots to assign each observation to the reducer that calculates each percentile
D. They assign sample observations to buckets and then aggregate the buckets to compute the approximations
Answer: C
Question 55
What is the best way to determine the learning rate parameters for stochastic gradient descent when the distribution
of the input data shifts over time?
A. The learning rate should be adjusted periodically based on the setting that optimizes the objective function over a
sample of recent observations
B. The learning rate should be a fixed number that decays as the number of observations in the data set increases
C. The learning rate should be the value that optimizes the value of the objective function over the first N samples in
the dataset
D. The learning rate should be a fixed number with a constant decay factor
E. The learning rate should be continuously adjusted based on the value that optimizes the objective function for the
most recent observation from the input data
Answer: C
Question 56
Which two machine learning algorithms should you consider as likely to benefit from discretizing continuous features?
Answer: A,B
Explanation:
Reference:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2656082/
Question 57
You’ve built a model that has ten different variables with complicated independence relationships between them, and
both continuous and discrete variables that have complicated, multi-parameter distributions.
Computing the joint probability distribution is complex, but it turns out that computing the conditional probabilities
for the variables is easy. What is the most computationally efficient method for computing the expected value?
A. Method of moments
B. Markov Chain Monte Carlo
C. Gibbs sampling
D. Numerical quadrature
Answer: B
Question 58
What is one limitation encountered by all systems that employ collaborative filtering and use preferences as input in
order to output product recommendations to consumers?
A. Consumers do not have stable ratings for the same product over time
B. There are too many consumers and too few products
C. Not every product has been rated by every consumer
D. There are too few consumers and too many products
Answer: A
Question 59
Answer: C
Explanation:
Reference:
http://www.mathworks.com/help/stats/naive-bayes-classification.html
Question 60
Which three metrics are useful in measuring the accuracy and quality of a recommender system?
A. Mutual Information
B. RMSE
C. Tanimoto coefficient
D. Pearson correlation
E. Precision
F. Recall
Answer: C,D,E
Explanation:
Reference:
https://lirias.kuleuven.be/bitstream/0/1456780/80821/1/datasets-cameraready.pdf