CS614-Midterm
1 / 50
Cube is a __________ entity containing values of a certain fact at a certain aggregation level at an intersection of a combination of dimensions.
2 / 50
Analytical processing uses ____________, instead of record-level access.
3 / 50
A node of a B-Tree is stored in a memory block, and traversing a B-Tree involves ______ page faults.
4 / 50
If every key in the data file is represented in the index file, then the index is:
5 / 50
Suppose the amount of data recorded in an organization is doubled every year. This increase is __________ .
6 / 50
All data is ______________ of something real.
I. An Abstraction
II. A Representation
Which of the following options is true?
7 / 50
A ________ dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The ______ dimension is simply a structure that provides a convenient place to store the ______ attributes.
8 / 50
NUMA stands for __________
9 / 50
If 'M' rows from table-A match the conditions in the query, then table-B is accessed 'M' times. Suppose table-B has an index on the join column. If 'a' I/Os are required to read the data block for each scan and 'b' I/Os for each data block, then the total cost of accessing table-B is approximately _____________ logical I/Os.
10 / 50
5 million bales.
11 / 50
The automated, prospective analyses offered by data mining move beyond the analysis of past events provided by retrospective tools typical of ___________.
12 / 50
Ad-hoc access means running queries that are already known.
13 / 50
Multi-dimensional databases (MDDs) typically use ___________ formats to store pre-summarized cube structures.
14 / 50
Grain is the ________ level of data stored in the warehouse.
15 / 50
_______________, if too big and does not fit into memory, will be expensive when used to find a record by given key.
16 / 50
________ in agriculture extension is that pest population beyond which the benefit of spraying outweighs its cost.
17 / 50
Taken jointly, the extract programs or naturally evolving systems formed a spider web, also known as
18 / 50
If some error occurs, execution will be terminated abnormally and all transactions will be rolled back. In this case, when we access the database, we will find it in the state it was in before the ____________.
19 / 50
_______ is an application of information and data.
20 / 50
The goal of ______ is to look at as few blocks as possible to find the matching records.
21 / 50
For good decision making, data should be integrated across the organization to cross the LoB (Line of Business). This gives a total view of the organization from:
22 / 50
Multidimensional databases typically use proprietary __________ format to store pre-summarized cube structures.
23 / 50
To identify the __________________ required, we need to perform data profiling.
24 / 50
It is observed that every year the amount of data recorded in an organization is
25 / 50
Execution can be completed successfully or it may be stopped due to some error. In case of successful completion of execution all the transactions will be ___________
26 / 50
Collapsing tables can be done on the ___________ relationships
27 / 50
The purpose of the House of Quality technique is to reduce ______ types of risk.
28 / 50
Data mining evolved as a mechanism to cater for the limitations of ________ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data resources, etc.
29 / 50
It is observed that every year the amount of data recorded in an organization:
30 / 50
Data mining derives its name from the similarities between searching for valuable business information in a large database, for example, finding linked products in gigabytes of store scanner data, and mining a mountain for a _________ of valuable ore.
31 / 50
32 / 50
To judge effectiveness we perform data profiling twice.
33 / 50
Rearranging the grouping of source data, delivering it to the destination database, and ensuring the quality of data are crucial to the process of loading the data warehouse. Data ____________ is vitally important to the overall health of a warehouse project.
1. Cleansing
2. Cleaning
3. Scrubbing
Which of the following options is true?
34 / 50
With data mining, the best way to accomplish this is by setting aside some of your data in a vault to isolate it from the mining process; once the mining is complete, the results can be tested against the isolated data to confirm the model's _______.
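The "vault" idea in this question can be illustrated with a minimal sketch. It assumes a toy data set of (usage, label) pairs and a deliberately simple threshold model standing in for the mining step; the names `holdout_split`, `fit_threshold` and `score_on` are hypothetical and not taken from the course material.

```python
import random


def holdout_split(records, holdout_fraction=0.3, seed=0):
    """Shuffle a copy of the records and set a fraction aside as the 'vault'."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    # The first n_holdout records go into the vault; the rest are mined.
    return shuffled[n_holdout:], shuffled[:n_holdout]


def fit_threshold(training):
    """Toy 'mining' step: midpoint between the two class means of one feature."""
    pos = [x for x, label in training if label == 1]
    neg = [x for x, label in training if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def score_on(threshold, records):
    """Fraction of records whose label agrees with the threshold rule."""
    correct = sum(1 for x, label in records if (x > threshold) == (label == 1))
    return correct / len(records)


# Assumed toy data: (monthly_usage, is_heavy_user)
data = [(float(u), 1 if u >= 50 else 0) for u in range(100)]

training, vault = holdout_split(data)
threshold = fit_threshold(training)   # the model only ever sees `training`
score = score_on(threshold, vault)    # tested against the isolated data
print(f"threshold={threshold:.1f}, agreement on vault={score:.2f}")
```

The point of the sketch is only the separation: the vault records play no part in fitting and are touched once, afterwards, to check the result.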
35 / 50
________ is the technique in which existing heterogeneous segments are reshuffled and relocated into homogeneous segments.
36 / 50
The STAR schema used for data design is a __________ consisting of fact and dimension tables.
37 / 50
During the ETL process of an organization, suppose you have data which can be transformed using any of the transformation methods. Which of the following strategies will be your choice for least complexity?
38 / 50
39 / 50
A data warehouse implementation without an OLAP tool is always possible.
40 / 50
De-Normalization normally speeds up
41 / 50
If someone told you that he had a good model to predict customer usage, the first thing you might try would be to ask him to apply his model to your customer _______, where you already knew the answer.
42 / 50
Pipeline parallelism focuses on increasing throughput of task execution, NOT on __________ sub-task execution time.
43 / 50
Companies collect and record their own operational data, but at the same time they also use reference data obtained from _______ sources such as codes, prices etc.
44 / 50
_____ contributes to an under-utilization of valuable and expensive historical data, and inevitably results in a limited capability to provide decision support and analysis.
45 / 50
Pre-computed _______ can solve performance problems
46 / 50
Data Transformation Services (DTS) provide a set of _____ that lets you extract, transform, and consolidate data from disparate sources into single or multiple destinations supported by DTS connectivity.
47 / 50
Slice and Dice is changing the view of the data.
48 / 50
Relational databases allow you to navigate the data in ____________ that is appropriate using the primary, foreign key structure within the data model.
49 / 50
In horizontal splitting, we split a relation into multiple tables on the basis of
50 / 50
The need to synchronize data upon update is called