CS614 Midterm Online Quiz

1 / 50

_______________, if it fits into memory, costs only one disk I/O access to locate a record by a given key.

A Sparse Index

None of These

A Dense Index

An Inverted Index

2 / 50

During the ETL process of an organization, suppose you have data that can be transformed using any of the transformation methods. Which of the following strategies would you choose for the least complexity?

One-to-One Scalar Transformation

One-to-Many Element Transformation

Many-to-One Element Transformation

Many-to-Many Element Transformation

3 / 50

The goal of ideal parallel execution is to completely parallelize those parts of a computation that are not constrained by data dependencies. The __________ the portion of the program that must be executed sequentially, the greater the scalability of computation.

Larger

Superior

Unambiguous

Smaller
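The relationship between the sequential portion of a program and its scalability is commonly formalized as Amdahl's law. The following is a minimal Python sketch under idealized assumptions (perfectly parallelizable remainder, no communication overhead); the function name `amdahl_speedup` is illustrative and does not come from the course material.

```python
def amdahl_speedup(sequential_fraction: float, processors: int) -> float:
    """Idealized speedup under Amdahl's law: 1 / (s + (1 - s) / p)."""
    if not 0.0 <= sequential_fraction <= 1.0:
        raise ValueError("sequential_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Compare how the speedup on 16 processors varies with the sequential part.
    for s in (0.5, 0.1, 0.01):
        print(f"sequential={s:.2f} speedup={amdahl_speedup(s, 16):.2f}")
```

Running the loop with different sequential fractions shows how strongly the part that cannot be parallelized limits the achievable speedup.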

4 / 50

______ is a class of Decision Support Environment.

Network

OLTP

OLAP

DBMS

5 / 50

_____________ is a process that involves gathering information about a column through the execution of certain queries, with the intention of identifying erroneous records.

Record Duplicate Detection

Data Anomaly Detection

Data profiling

None of these

6 / 50

An optimized structure that is built primarily for retrieval, with update being only a secondary consideration, is ______.

OLTP

Inverted Index

OLAP

DSS

7 / 50

A node of a B-Tree is stored in a memory block, and traversing a B-Tree involves ______ page faults.

O (n)

O (n^2)

O (log n)

O (n lg n)

8 / 50

Cube is a __________ entity containing values of a certain fact at a certain aggregation level at an intersection of a combination of dimensions.

Analytical

Logical

None of these

Physical

9 / 50

A Data Warehouse is about taking/collecting data from different ________ sources:

Harmonized

Identical

Homogeneous

Heterogeneous

10 / 50

Analytical processing uses ____________, instead of record level access.

Single-level hierarchy

None of the Given

Multi-level aggregates

Single-level aggregates

11 / 50

Horizontal splitting breaks a table into multiple tables based upon _______

Redundant data.

Range of Data.

Common column values.

Common Row values

12 / 50

To judge effectiveness we perform data profiling twice.

One before Transformation and the other after Transformation

One before Extraction and the other after Extraction

One before Loading and the other after Loading

13 / 50

Data mining derives its name from the similarities between searching for valuable business information in a large database, for example, finding linked products in gigabytes of store scanner data, and mining a mountain for a _________ of valuable ore.

Streak

Furrow

Vein

Trough

14 / 50

The _________ is only a small part in realizing the true business value buried within the mountain of data collected and stored within organizations' business systems and operational databases.

None of these

Dependence on technology

Independence from technology

15 / 50

It is observed that every year the amount of data recorded in an organization:

Remains same as previous year

Doubles

Quadruples

Triples

16 / 50

If every key in the data file is represented in the index file, then the index is:

Inverted Index

None of these

Dense Index

Sparse Index

17 / 50

People who design and build the data warehouse must be capable of working across the organization at all levels.

TRUE

FALSE

18 / 50

The automated, prospective analyses offered by data mining move beyond the analysis of past events provided by retrospective tools typical of ___________.

OLAP

Decision Support systems

OLTP

None of these

19 / 50

NUMA stands for __________

Non-uniform Memory Access

New Universal Memory Architecture

Non-updateable Memory Architecture

20 / 50

_________ breaks a table into multiple tables based upon common column values.

Vertical splitting

Horizontal splitting

21 / 50

If w is the window size and n is the size of the data set, then the complexity of the merging phase in the BSN method is ___________

O (w)

O (n)

O (w n)

O (w log n)
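The sliding-window idea behind this question can be sketched in Python. This is a minimal illustration, assuming BSN refers to the basic sorted neighborhood method in which records are sorted on a key and each record is compared only with the records that precede it inside a window of size w. The function name `sorted_neighborhood_pairs` and the `key` and `is_match` parameters are hypothetical and not taken from the course material.

```python
def sorted_neighborhood_pairs(records, key, is_match, window):
    """Sort records by key, then compare each record only with the
    (window - 1) records immediately before it in the sorted order.

    Returns the matching pairs and the number of comparisons performed.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    ordered = sorted(records, key=key)
    matches = []
    comparisons = 0
    for i, current in enumerate(ordered):
        # Only the previous (window - 1) records are candidates.
        for j in range(max(0, i - window + 1), i):
            comparisons += 1
            if is_match(ordered[j], current):
                matches.append((ordered[j], current))
    return matches, comparisons


if __name__ == "__main__":
    names = ["smith", "smyth", "jones", "jonas", "brown"]
    # Toy matching rule (an assumption): same first letter and same length.
    same_shape = lambda a, b: a[0] == b[0] and len(a) == len(b)
    print(sorted_neighborhood_pairs(names, key=lambda r: r, is_match=same_shape, window=2))
```

Counting the comparisons for different values of w and n gives a feel for how the cost of the window scan grows.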

22 / 50

In a _________ system, the contents change with time.

OLAP

OLTP

DSS

ATM

23 / 50

Data Transformation Services (DTS) provide a set of _____ that lets you extract, transform, and consolidate data from disparate sources into single or multiple destinations supported by DTS connectivity.

Guidelines

Tools

Documentations

24 / 50

Non-uniform distribution, when the data is distributed across the processors, is called ______.

Uncontrolled Distribution

Distributed Distribution

Pipeline Distribution

Skew in Partition

25 / 50

Data Warehouse provides the best support for analysis while OLAP carries out the _________ task.

Prediction

Whole

Analysis

Mandatory

26 / 50

Pipeline parallelism focuses on increasing throughput of task execution, NOT on __________ sub-task execution time.

Increasing

Decreasing

None of these

Maintaining

27 / 50

Slice and Dice is changing the view of the data.

TRUE

FALSE

28 / 50

To measure or quantify similarity or dissimilarity, different techniques are available. Which of the following options names the available techniques?

None of these

Euclidean distance is the only technique

Pearson correlation is the only technique

Both Pearson correlation and Euclidean distance
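For reference, the two measures named in the options can be computed as follows. This is a minimal pure-Python sketch, assuming two equal-length numeric sequences; the function names are illustrative and not taken from the course material.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_correlation(a, b):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("vectors must have the same length, at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a)
    spread_b = sum((y - mean_b) ** 2 for y in b)
    if spread_a == 0 or spread_b == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return covariance / math.sqrt(spread_a * spread_b)


if __name__ == "__main__":
    print(euclidean_distance([0, 0], [3, 4]))
    print(pearson_correlation([1, 2, 3], [2, 4, 6]))
```

Euclidean distance measures dissimilarity (smaller means more alike), while Pearson correlation measures how closely two vectors vary together.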

29 / 50

The performance in a MOLAP cube comes from the O(1) look-up time for the array data structure.

TRUE

FALSE

30 / 50

Investing years in architecture and forgetting the primary purpose of solving business problems results in an inefficient application. This is an example of the _________ mistake.

Extreme Technology Design

Extreme Architecture Design

None of these

31 / 50

If someone told you that he had a good model to predict customer usage, the first thing you might try would be to ask him to apply his model to your customer _______, where you already knew the answer.

Base

Drive

Log

File

32 / 50

The degree of similarity between two records, often measured by a numerical value between _______, usually depends on application characteristics.

0 and 99

0 and 1

0 and 10

0 and 100

33 / 50

A B-Tree is used as an index to provide access to records:

By scanning the entire metadata

By scanning the entire table

None of these

Without scanning the entire table

34 / 50

During the application specification activity, we also must give consideration to the organization of the applications.

FALSE

TRUE

35 / 50

The input to the data warehouse can come from OLTP or transactional systems but not from other third-party databases.

TRUE

FALSE

36 / 50

Pre-computed _______ can solve performance problems

Aggregates

Facts

Dimensions

37 / 50

NUMA stands for __________

New Universal Memory Architecture

Non-updateable Memory Architecture

Non-uniform Memory Access

38 / 50

________ is the technique in which existing heterogeneous segments are reshuffled and relocated into homogeneous segments.

Aggregation

Clustering

Segmentation

Partitioning

39 / 50

Pipeline parallelism focuses on increasing throughput of task execution, NOT on __________ sub-task execution time.

None of these

Increasing

Decreasing

Maintaining

40 / 50

_______ is an application of information and data.

Knowledge

Intelligence

Power

Education

41 / 50

The divide & conquer cube partitioning approach helps alleviate the ____________ limitations of MOLAP implementation.

Flexibility

Maintainability

Scalability

Security

42 / 50

The goal of ______ is to look at as few blocks as possible to find the matching records.

Indexing

Partitioning

Joining

None of these

43 / 50

Pre-computed _______ can solve performance problems

Aggregates

Facts

Dimensions

44 / 50

For a smooth DWH implementation we must be a technologist.

TRUE

FALSE

45 / 50

The purpose of the House of Quality technique is to reduce ______ types of risk.

Four

Three

All

Two

46 / 50

The key idea behind ___________ is to take a big task and break it into subtasks that can be processed concurrently on a stream of data inputs in multiple, overlapping stages of execution.

Distributed Parallelism

Pipeline Parallelism

Massive Parallelism

Overlapped Parallelism

47 / 50

The _____ modeling technique is more appropriate for data warehouses.

Dimensional

None of the given

Physical

Entity-relationship

48 / 50

Data mining evolved as a mechanism to cater for the limitations of ________ systems in dealing with massive data sets with high dimensionality, new data types, multiple heterogeneous data resources, etc.

OLTP

DWH

OLAP

DSS

49 / 50

The goal of star schema design is to simplify ________

None of these

Conceptual data model

Logical data model

Physical data model

50 / 50

_____ contributes to an under-utilization of valuable and expensive historical data, and inevitably results in a limited capability to provide decision support and analysis.

Data Stored in Heterogeneous Sources

Missing Data

The lack of data integration and standardization
