CS614-Midterm
1 / 50
De-Normalization normally speeds up
2 / 50
3 / 50
For good decision making, data should be integrated across the organization to cross the LoB (Line of Business). This is to give a total view of the organization from:
4 / 50
Multi-dimensional databases (MDDs) typically use ___________ formats to store pre-summarized cube structures.
5 / 50
An optimized structure that is built primarily for retrieval, with update being only a secondary consideration, is
6 / 50
Data mining derives its name from the similarities between searching for valuable business information in a large database, for example, finding linked products in gigabytes of store scanner data, and mining a mountain for a _________ of valuable ore.
7 / 50
Data mining evolved as a mechanism to address the limitations of _____ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data resources, etc.
8 / 50
A ________ dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The ______ dimension is simply a structure that provides a convenient place to store the ______ attributes.
9 / 50
The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by _____________ tools typical of decision support systems.
10 / 50
Pre-join technique is used to avoid
11 / 50
To measure or quantify similarity or dissimilarity, different techniques are available. Which of the following options names the available techniques?
12 / 50
If 'M' rows from table-A match the conditions in the query, then table-B is accessed 'M' times. Suppose table-B has an index on the join column. If 'a' I/Os are required to read the index block for each scan and 'b' I/Os for each data block, then the total cost of accessing table-B is approximately _____________ logical I/Os.
13 / 50
Ad-hoc access means running queries that are already known.
14 / 50
_____________ is a process which involves gathering information about a column by executing certain queries, with the intention of identifying erroneous records.
15 / 50
The key idea behind ___________ is to take a big task and break it into subtasks that can be processed concurrently on a stream of data inputs in multiple, overlapping stages of execution.
16 / 50
The STAR schema used for data design is a __________ consisting of fact and dimension tables.
17 / 50
The performance in a MOLAP cube comes from the O(1) look-up time for the array data structure.
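The constant-time look-up mentioned in this statement can be sketched as follows. This is a minimal illustration, assuming a dense cube stored as one flat row-major array; the function name `cell_offset` and the 4 x 3 x 12 cube shape are hypothetical and not taken from the course material.

```python
def cell_offset(coords, shape):
    """Return the row-major offset of a cell in a dense cube stored as a flat array.

    The cost depends only on the number of dimensions, not on the number
    of cells, which is why a look-up is O(1) in the size of the cube.
    """
    offset = 0
    for index, size in zip(coords, shape):
        if not 0 <= index < size:
            raise IndexError("coordinate out of range")
        offset = offset * size + index
    return offset


# Hypothetical cube: 4 products x 3 regions x 12 months, pre-filled with zeros.
shape = (4, 3, 12)
cube = [0.0] * (4 * 3 * 12)

# Store and read back one pre-summarized value without scanning any other cell.
cube[cell_offset((2, 1, 5), shape)] = 1250.0
value = cube[cell_offset((2, 1, 5), shape)]
```

The address of a cell is computed arithmetically from its coordinates, so no index traversal or join is needed to reach it.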
18 / 50
B-Tree is used as an index to provide access to records
19 / 50
Which statement is true for De-Normalization?
20 / 50
Focusing only on data warehouse delivery often ends up _________.
21 / 50
As opposed to the outcome of classification, estimation deals with a ____________ valued outcome.
22 / 50
If every key in the data file is represented in the index file, then the index is:
23 / 50
In a DWH project, it is ensured that the ___________ environment is similar to the production environment.
24 / 50
During the ETL process of an organization, suppose you have data which can be transformed using any of the transformation methods. Which of the following strategies will be your choice for least complexity?
25 / 50
Many data warehouse project teams waste enormous amounts of time searching in vain for a _______.
26 / 50
Pakistan is one of the five major ________ countries in the world.
27 / 50
Collapsing tables can be done on the ___________ relationships
28 / 50
_______________, if it fits into memory, costs only one disk I/O access to locate a record by a given key.
29 / 50
Pipeline parallelism focuses on increasing throughput of task execution, NOT on __________ sub-task execution time.
30 / 50
Companies collect and record their own operational data, but at the same time they also use reference data obtained from _______ sources such as codes, prices etc.
31 / 50
Relational databases allow you to navigate the data in ____________ that is appropriate using the primary, foreign key structure within the data model.
32 / 50
Transactional fact tables do not have records for events that do not occur. These are called
33 / 50
34 / 50
The _________ is only a small part in realizing the true business value buried within the mountain of data collected and stored within organizations' business systems and operational databases.
35 / 50
Change Data Capture is one of the challenging technical issues in _____________
36 / 50
With data mining, the best way to accomplish this is by setting aside some of your data in a vault to isolate it from the mining process; once the mining is complete, the results can be tested against the isolated data to confirm the model's _______.
37 / 50
To identify the __________________ required we need to perform data profiling
38 / 50
Data warehousing and on-line analytical processing (OLAP) are _______ elements of a decision support system.
39 / 50
The technique that is used to perform these feats in data mining is called modeling, and this act of model building is something that people have been doing for a long time, certainly before the _______ of computers or data mining technology.
40 / 50
The goal of ___________ is to look at as few blocks as possible to find the matching record(s).
41 / 50
5 million bales.
42 / 50
The purpose of the House of Quality technique is to reduce ______ types of risk.
43 / 50
For a relation to be in 4NF it must be:
44 / 50
DOLAP allows download of "cube" structures to a desktop platform with the need for a shared relational or cube server.
45 / 50
Horizontal splitting breaks a table into multiple tables based upon _______
46 / 50
47 / 50
Data mining uses _________ algorithms to discover patterns and regularities in data.
48 / 50
All data is ______________ of something real.
I. An abstraction
II. A representation
Which of the following options is true?
49 / 50
We must try to find the one access tool that will handle all the needs of the users.
50 / 50
NUMA stands for __________