CS614 Midterm Online Quiz

CS614-Midterm

1 / 50

Data mining is a/an __________ approach, where browsing through data using data mining techniques may reveal something that might be of interest to the user as information that was unknown previously.

2 / 50

The _________ is only a small part in realizing the true business value buried within the mountain of data collected and stored within an organization's business systems and operational databases.

3 / 50

If every key in the data file is represented in the index file, then the index is:

4 / 50

Pre-join technique is used to avoid

5 / 50

With data mining, the best way to accomplish this is by setting aside some of your data in a vault to isolate it from the mining process; once the mining is complete, the results can be tested against the isolated data to confirm the model's _______.
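The "vault" idea in this question can be sketched as a simple holdout split. This is only an illustration under stated assumptions: the function name `split_holdout`, the 20% holdout fraction, and the fixed seed are hypothetical choices, not something the quiz specifies.

```python
import random


def split_holdout(records, holdout_fraction=0.2, seed=0):
    """Return (mining_set, holdout_set); the holdout set is never shown to the miner.

    holdout_fraction and seed are illustrative defaults (assumptions).
    """
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for a repeatable split
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


mining_set, holdout_set = split_holdout(range(100))
print(len(mining_set), len(holdout_set))
```

The model is built on `mining_set` only; `holdout_set` is used afterwards to check the mined results against data the process never saw.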

6 / 50

To measure or quantify similarity or dissimilarity, different techniques are available. Which of the following options represents the names of the available techniques?

7 / 50

Cube is a __________ entity containing values of a certain fact at a certain aggregation level at an intersection of a combination of dimensions.

8 / 50

If 'M' rows from table-A match the conditions in the query, then table-B is accessed 'M' times. Suppose table-B has an index on the join column. If 'a' I/Os are required to read the data block for each scan and 'b' I/Os for each data block, then the total cost of accessing table-B is _____________ logical I/Os approximately.

9 / 50

_________ breaks a table into multiple tables based upon common column values.

10 / 50

The input to the data warehouse can come from OLTP or transactional system but not from other third party database.

11 / 50

Horizontal splitting breaks a table into multiple tables based upon _______

12 / 50

Data mining evolved as a mechanism to cater for the limitations of _____ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data resources, etc.

13 / 50

A node of a B-Tree is stored in a memory block, and traversing a B-Tree involves ______ page faults.

14 / 50

To measure or quantify similarity or dissimilarity, different techniques are available. Which of the following options represents the names of the available techniques?

15 / 50

For a smooth DWH implementation we must be a technologist.

16 / 50

It is observed that every year the amount of data recorded in an organization is

17 / 50

People that design and build the data warehouse must be capable of working across the organization at all levels

18 / 50

For a relation to be in 4NF it must be:

19 / 50

For good decision making, data should be integrated across the organization to cross the LoB (Line of Business). This is to give the total view of the organization from:

20 / 50

Data mining evolved as a mechanism to cater for the limitations of ________ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data resources, etc.

21 / 50

NUMA stands for __________

22 / 50

Multidimensional databases typically use proprietary __________ format to store pre-summarized cube structures.

23 / 50

For a given data set, to get a global view in un-supervised learning we use

24 / 50

Ad-hoc access means to run such queries which are known already.

25 / 50

As opposed to the outcome of classification, estimation deals with a ____________ valued outcome.

26 / 50

De-Normalization normally speeds up

27 / 50

When performing objective assessments, companies follow a set of principles to develop metrics specific to their needs; it is hard to have a “one size fits all” approach. Which of the following statements represents the pervasive functional forms?

28 / 50

There are many variants of the traditional nested-loop join. If an index is exploited, then it is called ______

29 / 50

We must try to find the one access tool that will handle all the needs of their users.

30 / 50

_____ modeling technique is more appropriate for data warehouses.

31 / 50

In a DWH project, it is assured that the ___________ environment is similar to the production environment

32 / 50

5 million bales.

33 / 50

To judge effectiveness we perform data profiling twice.

34 / 50

Companies collect and record their own operational data, but at the same time they also use reference data obtained from _______ sources such as codes, prices etc.

35 / 50

Data mining evolved as a mechanism to cater for the limitations of ________ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data resources, etc.

36 / 50

Collapsing tables can be done on the ___________ relationships

37 / 50

For a DWH project, the key requirements are ________ and product experience.

38 / 50

To identify the __________________ required we need to perform data profiling

39 / 50

The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The __________ the portion of the program that must be executed sequentially, the greater the scalability of computation.

40 / 50

The STAR schema used for data design is a __________ consisting of fact and dimension tables.

41 / 50

The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The smaller the portion of the program that must be executed __________, the greater the scalability of the computation.
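The relationship described in this question is commonly quantified with Amdahl's law, speedup = 1 / (s + (1 - s) / n), where s is the sequential fraction and n the number of workers. The following is a minimal sketch; the function name `amdahl_speedup` and the sample fractions are illustrative assumptions, not part of the quiz.

```python
def amdahl_speedup(sequential_fraction: float, workers: int) -> float:
    """Upper bound on speedup when a fixed fraction of the work stays sequential."""
    if not 0.0 <= sequential_fraction <= 1.0:
        raise ValueError("sequential_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - sequential_fraction
    # Sequential part runs at full cost; parallel part is divided among workers.
    return 1.0 / (sequential_fraction + parallel_fraction / workers)


# Illustrative values only: compare different sequential fractions on 16 workers.
for s in (0.5, 0.1, 0.01):
    print(f"sequential fraction {s}: speedup bound {amdahl_speedup(s, 16):.2f}")
```

Running the loop shows the speedup bound rising as the sequential fraction shrinks, for a fixed number of workers.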

42 / 50

_____________ is a process which involves gathering information about a column through the execution of certain queries, with the intention of identifying erroneous records.

43 / 50

People that design and build the data warehouse must be capable of working across the organization at all levels

44 / 50

A ________ dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The ______ dimension is simply a structure that provides a convenient place to store the ______ attributes.

45 / 50

The goal of star schema design is to simplify ________

46 / 50

The degree of similarity between two records, often measured by a numerical value between _______, usually depends on application characteristics.

47 / 50

_______ is an application of information and data.

48 / 50

DOLAP allows download of “cube” structures to a desktop platform with the need for shared relational or cube server.

49 / 50

Investing years in architecture and forgetting the primary purpose of solving business problems results in an inefficient application. This is an example of a _________ mistake.

50 / 50

The key idea behind ___________ is to take a big task and break it into subtasks that can be processed concurrently on a stream of data inputs in multiple, overlapping stages of execution.
