CS614-Midterm
1 / 50
The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The __________ the portion of the program that must be executed sequentially, the greater the scalability of computation.
2 / 50
The technique that is used to perform these feats in data mining is called modeling, and this act of model building is something that people have been doing for a long time, certainly before the _______ of computers or data mining technology.
3 / 50
Data Transformation Services (DTS) provide a set of _____ that lets you extract, transform, and consolidate data from disparate sources into single or multiple destinations supported by DTS connectivity.
4 / 50
Pre-computed _______ can solve performance problems
5 / 50
De-Normalization normally speeds up
6 / 50
Investing years in architecture and forgetting the primary purpose of solving business problems results in an inefficient application. This is an example of a _________ mistake.
7 / 50
A virtual cube is used to query two similar cubes by creating a third “virtual” cube through a join between the two cubes.
8 / 50
_______________, if it fits into memory, costs only one disk I/O access to locate a record by a given key.
9 / 50
_____________ is a process which involves gathering information about a column by executing certain queries, with the intention of identifying erroneous records.
10 / 50
Focusing only on data warehouse delivery often ends up _________.
11 / 50
_______ is a class of Decision Support Environment.
12 / 50
With data mining, the best way to accomplish this is by setting aside some of your data in a vault to isolate it from the mining process; once the mining is complete, the results can be tested against the isolated data to confirm the model's _______.
13 / 50
Normalization affects performance
14 / 50
The users of a data warehouse are knowledge workers; in other words, they are _______ in the organization.
15 / 50
The pre-join technique is used to avoid
16 / 50
Data mining uses _________ algorithms to discover patterns and regularities in data.
17 / 50
The _____ modeling technique is more appropriate for data warehouses.
18 / 50
Different techniques are available to measure or quantify similarity or dissimilarity. Which of the following options names the available techniques?
19 / 50
Data Warehouse provides the best support for analysis while OLAP carries out the _________ task.
20 / 50
During the ETL process of an organization, suppose you have data which can be transformed using any of the transformation methods. Which of the following strategies will be your choice for least complexity?
21 / 50
NUMA stands for __________
22 / 50
The degree of similarity between two records, often measured by a numerical value between _______, usually depends on application characteristics.
23 / 50
A node of a B-Tree is stored in a memory block, and traversing a B-Tree involves ______ page faults.
24 / 50
Which statement is true for De-Normalization?
25 / 50
An optimized structure which is built primarily for retrieval, with update being only a secondary consideration, is
26 / 50
Horizontal splitting breaks a table into multiple tables based upon _______
27 / 50
Grain is the ________ level of data stored in the warehouse.
28 / 50
For good decision making, data should be integrated across the organization to cross the LoB (Line of Business). This is to give the total view of organization from:
29 / 50
For a smooth DWH implementation we must be a technologist.
30 / 50
We must try to find the one access tool that will handle all the needs of the users.
31 / 50
Collapsing tables can be done on the ___________ relationships
32 / 50
A single database couldn't serve both operational high-performance transaction processing and DSS analytical processing at the same time.
33 / 50
Rearranging the grouping of source data, delivering it to the destination database, and ensuring the quality of data are crucial to the process of loading the data warehouse. Data ____________ is vitally important to the overall health of a warehouse project. 1. Cleansing 2. Cleaning 3. Scrubbing Which of the following options is true?
34 / 50
Data mining is a/an ______ approach, where browsing through data using mining techniques may reveal something of interest to the user: information that was previously unknown.
35 / 50
Taken jointly, the extract programs or naturally evolving systems formed a spider web, also known as
36 / 50
If 'M' rows from table-A match the conditions in the query, then table-B is accessed 'M' times. Suppose table-B has an index on the join column. If 'a' I/Os are required to read the data block for each scan and 'b' I/Os for each data block, then the total cost of accessing table-B is approximately _____________ logical I/Os.
37 / 50
38 / 50
Relational databases allow you to navigate the data in ____________ that is appropriate using the primary, foreign key structure within the data model.
39 / 50
In a DWH project, it is assured that the ___________ environment is similar to the production environment.
40 / 50
Data mining evolved as a mechanism to cater for the limitations of _____ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data resources, etc.
41 / 50
The growth of master files and magnetic tapes exploded around the mid-_______.
42 / 50
As opposed to the outcome of classification, estimation deals with ____________ valued outcomes.
43 / 50
The performance in a MOLAP cube comes from the O(1) look-up time for the array data structure.
44 / 50
In a traditional MIS system, there is an almost linear sequence of queries.
45 / 50
The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The smaller the portion of the program that must be executed __________, the greater the scalability of the computation.
46 / 50
The automated, prospective analyses offered by data mining move beyond the analysis of past events provided by retrospective tools typical of ___________.
47 / 50
48 / 50
Pipeline parallelism focuses on increasing throughput of task execution, NOT on __________ sub-task execution time.
49 / 50
A ________ dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The ______ dimension is simply a structure that provides a convenient place to store the ______ attributes.
50 / 50
If someone told you that he had a good model to predict customer usage, the first thing you might try would be to ask him to apply his model to your customer _______, where you already knew the answer.