CS614-Midterm
1 / 50
The _________ is only a small part in realizing the true business value buried within the mountain of data collected and stored within an organization's business systems and operational databases.
2 / 50
To identify the __________________ required, we need to perform data profiling.
3 / 50
_________ breaks a table into multiple tables based upon common column values.
4 / 50
If w is the window size and n is the size of the data set, then the complexity of the merging phase in the BSN method is ___________
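A brief worked note, assuming BSN here is the basic sorted-neighborhood method for duplicate detection: after sorting, a window of w records slides over the n records, and each record entering the window is compared with the w - 1 records already in it, so the merging phase performs roughly

    (w - 1)\,n = O(wn)

comparisons; the sorting phase, at O(n \log n), is counted separately.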
5 / 50
Horizontal splitting breaks a table into multiple tables based upon _______
6 / 50
If every key in the data file is represented in the index file, then the index is:
7 / 50
_____ modeling technique is more appropriate for data warehouses.
8 / 50
Data mining is a/an ______ approach, where browsing through data using mining techniques may reveal something that might be of interest to the user as information that was previously unknown.
9 / 50
Suppose the amount of data recorded in an organization is doubled every year. This increase is __________.
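A one-line worked equation, where D_0 is a hypothetical starting volume: doubling every year means the data recorded after t years is

    D(t) = D_0 \cdot 2^{t},

i.e. the increase is exponential (geometric) rather than linear.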
10 / 50
Multidimensional databases typically use proprietary __________ format to store pre-summarized cube structures.
11 / 50
A virtual cube is used to query two similar cubes by creating a third “virtual” cube via a join between the two cubes.
12 / 50
To measure or quantify the similarity or dissimilarity, different techniques are available. Which of the following options represents the names of the available techniques?
13 / 50
Grain is the ________ level of data stored in the warehouse.
14 / 50
Data Transformation Services (DTS) provide a set of _____ that lets you extract, transform, and consolidate data from disparate sources into single or multiple destinations supported by DTS connectivity.
15 / 50
Pipeline parallelism focuses on increasing throughput of task execution, NOT on __________ sub-task execution time.
16 / 50
The need to synchronize data upon update is called
17 / 50
The degree of similarity between two records, often measured by a numerical value between _______, usually depends on application characteristics.
18 / 50
De-Normalization normally speeds up
19 / 50
The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The __________ the portion of the program that must be executed sequentially, the greater the scalability of the computation.
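A short note, assuming the question alludes to the Amdahl's-law relationship: if s is the fraction of the computation that must run sequentially, the speedup on N processors is bounded by

    S(N) = \frac{1}{s + (1 - s)/N} \le \frac{1}{s},

so the smaller the sequential portion, the greater the achievable scalability.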
20 / 50
If 'M' rows from table-A match the conditions in the query, then table-B is accessed 'M' times. Suppose table-B has an index on the join column. If 'a' I/Os are required to traverse the index for each scan and 'b' I/Os to read each data block, then the total cost of accessing table-B is _____________ logical I/Os approximately.
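A short worked example, reading 'a' as the per-probe index I/Os and 'b' as the data-block I/Os, with hypothetical values M = 1000, a = 3, b = 1 (these numbers are not from the question): each of the M probes of table-B costs a + b I/Os, so the total is approximately

    M \times (a + b) = 1000 \times (3 + 1) = 4000

logical I/Os.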
21 / 50
Pakistan is one of the five major ________ countries in the world.
22 / 50
A single database couldn't serve both operational, high-performance transaction processing and DSS analytical processing at the same time.
23 / 50
A dense index, if it fits into memory, costs only ______ disk I/O access to locate a record by a given key.
24 / 50
Companies collect and record their own operational data, but at the same time they also use reference data obtained from _______ sources, such as codes, prices, etc.
25 / 50
The technique that is used to perform these feats in data mining is called modeling, and this act of model building is something that people have been doing for a long time, certainly before the _______ of computers or data mining technology.
26 / 50
For a DWH project, the key requirements are ________ and product experience.
27 / 50
For good decision making, data should be integrated across the organization to cross the LoB (Line of Business). This is to give a total view of the organization from:
28 / 50
29 / 50
The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by _____________ tools typical of decision support systems.
30 / 50
It is observed that every year the amount of data recorded in an organization is
31 / 50
During the application specification activity, we must also give consideration to the organization of the applications.
32 / 50
The input to the data warehouse can come from OLTP or transactional systems but not from other third-party databases.
33 / 50
There are many variants of the traditional nested-loop join. If the index is built as part of the query plan and subsequently dropped, it is called
34 / 50
In a DWH project, it is ensured that the ___________ environment is similar to the production environment.
35 / 50
To judge effectiveness, we perform data profiling twice.
36 / 50
The purpose of the House of Quality technique is to reduce ______ types of risk.
37 / 50
The pre-join technique is used to avoid
38 / 50
In horizontal splitting, we split a relation into multiple tables on the basis of
39 / 50
DTS allows us to connect through any data source or destination that is supported by ____________
40 / 50
During the ETL process of an organization, suppose you have data that can be transformed using any of the transformation methods. Which of the following strategies would be your choice for the least complexity?
41 / 50
Analytical processing uses ____________ instead of record-level access.
42 / 50
Data mining evolved as a mechanism to cater for the limitations of ________ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data sources, etc.
43 / 50
The STAR schema used for data design is a __________ consisting of fact and dimension tables.
44 / 50
The goal of star schema design is to simplify ________
45 / 50
For a smooth DWH implementation, we must be a technologist.
46 / 50
A B-Tree is used as an index to provide access to records
47 / 50
48 / 50
If someone told you that he had a good model to predict customer usage, the first thing you might try would be to ask him to apply his model to your customer _______, where you already knew the answer.
49 / 50
Kimball's iterative data warehouse development approach drew on decades of experience to develop the _________.
50 / 50
The key idea behind ___________ is to take a big task and break it into subtasks that can be processed concurrently on a stream of data inputs in multiple, overlapping stages of execution.
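A minimal Python sketch of this idea, with illustrative stage names (extract/transform/load) and dummy records that are assumptions for the example, not part of the question: three stages run concurrently, connected by queues, so a downstream stage starts working on early records while upstream stages are still producing later ones.

    import queue
    import threading

    SENTINEL = None  # marks the end of the record stream

    def extract(out_q, n=10):
        # Stage 1: pretend to read n source records.
        for i in range(n):
            out_q.put({"id": i})
        out_q.put(SENTINEL)

    def transform(in_q, out_q):
        # Stage 2: transform each record as soon as it arrives.
        while True:
            rec = in_q.get()
            if rec is SENTINEL:
                out_q.put(SENTINEL)
                break
            rec["id_squared"] = rec["id"] ** 2
            out_q.put(rec)

    def load(in_q):
        # Stage 3: pretend to write each transformed record to the warehouse.
        while True:
            rec = in_q.get()
            if rec is SENTINEL:
                break
            print("loaded", rec)

    q1, q2 = queue.Queue(), queue.Queue()
    stages = [
        threading.Thread(target=extract, args=(q1,)),
        threading.Thread(target=transform, args=(q1, q2)),
        threading.Thread(target=load, args=(q2,)),
    ]
    for t in stages:
        t.start()
    for t in stages:
        t.join()

Because the queues decouple the stages, overall throughput improves even though the time spent on any single record (the sub-task execution time) is unchanged.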