
1 BI & DW TECHNOLOGIES

1.1 Basic Concepts

1. How do you explain a DW to a layman?

2. Why is a DW often called a 'Decision Support System'? Give examples.

3. What is BI? How does a DW enable it?

4. How do you decide whether to build data marts or an EDW?

5. What are conformed dimensions? Why are conformed dimensions needed? What is a bus architecture?

6. What are slowly changing dimensions? What are the various methods of handling them?

7. What is the main objective of data warehouse design?

8. What is a subject-oriented view of a data warehouse?

9. What is an Operational Data Store (ODS)? How is it different from a data warehouse?

10. What are the data warehouse architecture goals?

11. What is a star schema?

12. What is a snowflake schema?

13. What is a surrogate key?

14. What are the components of a typical data warehouse architecture?

15. Explain the role of metadata in a data warehousing environment. Who are the users of metadata?

16. What are the popular metadata interchange standards currently available in the market? Explain them briefly.

17. What are the different phases involved in the data warehousing development life cycle? How does it differ from OLTP development life cycle models?
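Questions 11-13 (star schema, snowflake schema, surrogate key) can be made concrete with a small sketch. This is a minimal illustration, not from any particular warehouse: the table and column names are hypothetical, and sqlite3 is used only as a stand-in for a real RDBMS.

```python
import sqlite3

# A minimal star schema: one fact table surrounded by dimension tables.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,  -- surrogate key: a meaningless integer
    product_id  TEXT,                 -- natural/business key from the source system
    name        TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    cal_date TEXT
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    amount      REAL
);
""")
con.execute("INSERT INTO dim_product VALUES (1, 'P-100', 'Coke')")
con.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01')")
con.execute("INSERT INTO fact_sales VALUES (1, 20240101, 9.5)")

# A typical star join: constrain/group by dimensions, aggregate the fact.
row = con.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.name
""").fetchone()
print(row)  # ('Coke', 9.5)
```

Snowflaking would further normalize `dim_product` (e.g. splitting out a category table); the surrogate key insulates the fact table from changes to the business key.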

1.2 DBMS & Data Modeling

1. What is a multi-dimensional database (MDB)? How does it fit into DW?

2. What is the difference between a dimension and a hierarchy? Give examples.

3. What is drill-through? How do MDBs enable it?

4. What is the difference between a dimensional data model and a normalized data model?

5. How do you convert a logical model into a physical model? What design considerations do you generally employ?

6. What are Type 1, 2 and 3 changes?

7. What do you mean by snowflaking a dimension? Is it always essential to snowflake a dimension? What are the trade-offs (performance vs. ease of maintaining the dimension)?

8. What is a conformed dimension?

9. Give an example of a semi-additive and a fully additive fact.

10. Why is dimensional modeling relevant for data warehousing?

11. How does DW database design differ from OLTP database design?

12. Why denormalize for a DW?
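The Type 1/2/3 changes asked about in question 6 can be sketched in a few lines. This is a hypothetical Type 2 handler with made-up row layout and data: on a change, the current version is expired and a new version inserted, so history is preserved (Type 1 would overwrite in place; Type 3 would keep the prior value in a separate column).

```python
from datetime import date

# Each row: [surrogate_key, customer_id, city, valid_from, valid_to, is_current]
dim = [
    [1, "C1", "Pune", date(2020, 1, 1), None, True],
]

def scd2_update(dim, customer_id, new_city, change_date):
    """Type 2 SCD: close the current version, append a new one."""
    next_key = max(r[0] for r in dim) + 1
    for row in dim:
        if row[1] == customer_id and row[5] and row[2] != new_city:
            row[4] = change_date   # expire the old version
            row[5] = False
            dim.append([next_key, customer_id, new_city, change_date, None, True])
            return

scd2_update(dim, "C1", "Mumbai", date(2023, 6, 1))
current = [r for r in dim if r[5]]
print(len(dim), current[0][2])  # 2 Mumbai
```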

1.3 ETL

1. What was the biggest challenge you have faced in ETL?

2. What are your views on using an ETL tool vs. custom-built code?

3. What are first-generation and second-generation ETL tools? Give a few examples.

4. What is a staging area? How is it different from an ODS?

5. What are the various approaches to handling data refresh into the DWH?

6. What are the various ways to optimize the ETL time window?

7. 24x7 availability of the DWH is becoming a more common requirement these days. Given that there is a data refresh time window for the DWH, how will you ensure 100% uptime and end-user data access while the actual data refresh takes place?

1.4 OLAP

1. What is OLAP? How is it different from reporting?

2. What is the difference between ROLAP, MOLAP, HOLAP and DOLAP? How will you choose which technology to use?

3. Explain drill-up, drill-down and drill-across, with examples.

4. Explain slicing and dicing, with examples.

5. What are the advantages and disadvantages of ROLAP?

6. What are the advantages and disadvantages of MOLAP?

7. How do you decide which method to use: ROLAP, MOLAP or HOLAP?
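The slicing and dicing asked about in question 4 can be illustrated with a toy cube. The dimensions and cells below are invented for the example: slicing fixes one dimension to a single member, while dicing selects a sub-cube over member sets of several dimensions.

```python
# A toy cube keyed by (product, region, quarter) -> sales amount.
cube = {
    ("Coke", "North", "Q1"): 10, ("Coke", "South", "Q1"): 7,
    ("Pepsi", "North", "Q1"): 8, ("Coke", "North", "Q2"): 12,
}

def slice_(cube, quarter):
    # slice: fix the quarter dimension to one member
    return {k: v for k, v in cube.items() if k[2] == quarter}

def dice(cube, products, regions):
    # dice: keep cells inside chosen member sets of two dimensions
    return {k: v for k, v in cube.items() if k[0] in products and k[1] in regions}

q1 = slice_(cube, "Q1")                 # the Q1 "slice" of the cube
sub = dice(cube, {"Coke"}, {"North"})   # a Coke/North sub-cube across quarters
print(len(q1), sum(sub.values()))  # 3 22
```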

1.5 Metadata Management

1.6 Data Quality Management

1.7 Analytical CRM

1.8 Analytical SCM

1.9 E-Business Intelligence

1.10 ERP Intelligence

1.11 Enterprise Business Intelligence
The candidate should have 5+ years in the IT industry, with a minimum of 3 years in DW and some consulting experience.
The candidate should have experience in, or an understanding of, the following:
- Business case for BI (ROI, cost/benefit analysis, etc.)
- BI readiness assessment
- BI needs assessment
- BI roadmap preparation
- Enterprise data warehouse (architecting/development/implementation)
- Enterprise data modeling
- Metadata management
- Various approaches and trends in architecting an EDW

Design, Development & Implementation


1. What do you generally do to increase the performance of a query in a database?
Answer:
- Declare all possible constraints.
- Check the ODBC connection, if any.
- Normalize the database.
- Check for outdated statistics.
- Check the execution time after switching on that option.
- Examine the execution plan.
- Use the index creation wizard, etc.
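The advice above about examining the execution plan can be sketched quickly. The example below uses sqlite3 purely as a stand-in for any RDBMS (the table and index names are made up): the plan text shows a full scan before the index exists, and index usage afterwards.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, customer TEXT)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [(i, f"c{i % 100}") for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry a human-readable detail string last
    return " ".join(r[-1] for r in con.execute("EXPLAIN QUERY PLAN " + sql))

before = plan("SELECT * FROM orders WHERE customer = 'c7'")
con.execute("CREATE INDEX ix_cust ON orders(customer)")
after = plan("SELECT * FROM orders WHERE customer = 'c7'")
print("SCAN" in before, "ix_cust" in after)  # True True
```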

2. What are the various normal forms and their advantages and disadvantages? Explain with an example.

3. Why is a wide key in an index not good from a performance point of view? How do you get performance enhancements from a surrogate key?
Answer: The wider the key, the less actual data pertaining to the key fits on a database page, hence more I/O and poorer performance.

4. How do you insert text data or an image into a SQL Server table?
Answer: By creating a stored procedure in which a pointer variable is created pointing to the place where the data needs to be inserted, and then passing a reference to that newly created pointer.

5. Does dropping a table get logged, and can you recreate a dropped table with the data?
Answer: No, you cannot recreate a table once it is dropped. DELETE, by contrast, logs all the entries being deleted.

6. In what scenarios is scripting of a database job/task required?
Answer: This may be required when a job must run at the click of a button given to the user. The click can fire a trigger on a table, which in turn runs a job, for example scheduling a backup of the database, with no DBA intervention.

7. What is a bitmapped index, and does SQL Server have it?
Answer: SQL Server does not expose bitmapped indexes as a feature, though internally it uses them for some operations.
HINT: B-tree vs. bitmap indexing
RDBMSs use B-trees or some variation of them as the primary
indexing method. B-trees store index data in a hierarchy (or tree) of
pages. Each node in the tree contains a sorted list of key values and
links that correspond to ranges of key values between the listed
values. To find a specific data record given its key value, the program
reads the first node, or root, from the disk and compares the desired
key with the keys in the node to select a sub-range of key values to
search. It repeats the process with the node indicated by the
corresponding link. At the lowest level, the links indicate the data
records. The database system can move down through the levels of
the tree structure to find the simple index entries that contain the
location of the desired records or rows.
The advantage of B-tree indexes over sequential access is that, instead
of scanning pages that contain raw data, the RDBMS has to look at
only a few index pages until it finds pointers to the requested rows.
Bitmap indexes are commonly used by DBMSs to accelerate decision support queries. A bitmap index is a collection of bitmaps in which each bit is mapped to a record or row ID (RID). A bit in a bitmap is set if the corresponding RID has a given property P (e.g. the customer lives in Silicon Valley) and is otherwise reset. One advantage of bitmap indexes is that complex logical selection operations can be done very quickly by performing bit-wise AND, OR and NOT operations. Bitmaps are also compact representations of densely populated sets; with bitmap compression techniques they are compact representations of sparsely populated sets as well, hence their value in data warehousing and OLAP. Bitmap indexes are ideal for columns of low cardinality, i.e. columns whose values are often identical.
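The bit-wise AND/OR selection described above can be simulated directly. This sketch represents each bitmap as a Python integer (bit i is set when row i has the property); the rows and column values are invented for illustration.

```python
# Toy data: one bitmap per (column, value) pair over these rows.
rows = [
    {"region": "west", "segment": "retail"},
    {"region": "east", "segment": "retail"},
    {"region": "west", "segment": "corporate"},
    {"region": "west", "segment": "retail"},
]

def bitmap(rows, column, value):
    """Build a bitmap: bit i is set when rows[i][column] == value."""
    bm = 0
    for i, r in enumerate(rows):
        if r[column] == value:
            bm |= 1 << i
    return bm

west = bitmap(rows, "region", "west")
retail = bitmap(rows, "segment", "retail")

# WHERE region = 'west' AND segment = 'retail' becomes one bit-wise AND.
hits = west & retail
matching = [i for i in range(len(rows)) if hits >> i & 1]
print(matching)  # [0, 3]
```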

By reducing the number of read operations on data, bitmapped indexes offer better response time than traditional indexing methods such as B-tree indexes. The idea behind a bitmapped index is that one bit associates a specific value of an attribute with a row. For example, in a 5-million-row database, each distinct value in a column can have a bitmapped index consisting of 5 million bits, one for each record. When a bit is on (1), the value occurs in the record; when the bit is off (0), it does not. The index can identify records through their bitmap position, so bitmaps don't need pointers.
In addition to performance benefits, bitmap indexes require less storage space than their B-tree counterparts. According to Oracle, one of the DBMS vendors that has implemented bitmaps (Oracle also offers advanced partitioned indexes), the compressed bitmap index format can be as little as 25% of the size of a normal B-tree index when created on low-cardinality columns. (Of course, the reverse is true for bitmap indexes created on high-cardinality columns.) Bitmapped indexes thus offer several advantages, such as reduced storage overhead and the capability to perform compression operations on indexes. Bitmapped indexes are best for data with few unique values (not, for example, a product ID column in which each record has a unique value). Because of the overhead associated with updating them, bitmapped indexes are also best for nonvolatile data.
Although SQL Server uses bitmap indexing internally for some operations, it doesn't offer the feature as an option. That may change in the post-Shiloh (after SQL Server 2000) release.

8. How would SQL Server react when you create a table whose total record length is more than 8000 bytes? Will it allow you to insert data into such a table, and what will be the warning?
Answer: It will create the table, though with a warning. Likewise, anything you insert into such a table that exceeds 8060 bytes will be rejected and will not be inserted. All this applies only if the columns use variable-length data types; with fixed-length data types the table will not be created at all.

9. You often get the error that your process has been chosen as the victim of a deadlock and is being killed. What is the problem in this case, and what is the workaround?
Answer: There is a deadlock, i.e. two processes are simultaneously trying to acquire the same resources. You need to make both processes access the resources in the same order, and you can also set the deadlock priority to low for the less critical process.
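The deadlock fix of acquiring resources in a consistent order can be sketched with two locks. This is an illustrative threading example (the lock and worker names are made up): because both workers take lock_a before lock_b, the opposite-order interleaving that causes a deadlock cannot occur.

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()
results = []

def worker(name):
    # Consistent global lock order: lock_a first, then lock_b.
    # If one worker took lock_b first, the two could deadlock.
    with lock_a:
        with lock_b:
            results.append(name)

threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # ['t1', 't2']
```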

Data Warehousing

1. What are the advantages and disadvantages of MOLAP, HOLAP and ROLAP?

2. How do you archive the OLAP database, and what would be the extension of the backup files of a data mart residing on the OLAP server?
Answer: Using the msmdarch command, e.g. "\Program Files\Microsoft Analysis Services\Bin\msmdarch" /a myserver. The extension would be .CAB.

3. What is a data-driven query in the DTS environment?
Answer: It is a query that is fired based on the data value. For example, if product = coke is already in the dimension table but the package size has changed, the record in the dimension table is updated (an UPDATE query is fired); otherwise a new record is inserted (an INSERT query is fired), depending on how we are managing the slowly changing dimensions.

4. What is a virtual cube?
Answer: A virtual cube, like a view in a relational database, is a logical construct that itself contains no data. Just as a view is a join of multiple relations, a virtual cube is a join of multiple cubes.

5. How can you run a DTS package from the command prompt?
Answer: Through the DTSRUN command, which is scheduled from Enterprise Manager itself or through the NT scheduler.

6. Why should we be cautious when creating indexes in a data mart?
Answer: Indexes can slow down the load process of the data mart. Moreover, more indexes mean increased space requirements, which make the database more difficult to maintain and back up.

7. How do you generally design a data mart?
Answer: By interviewing top managers and business analysts; identifying the KPIs (key performance indicators) of the business; studying the sales and other reports currently used to analyze the business; and understanding the business.

8. Which is recommended: going for data-driven queries and using lookups for loading the data mart, or putting the lookup table in a flat file and doing the load?
Answer: The second option is recommended, as it is much faster, especially in scenarios where there is a lot of data to be loaded and the load window is quite small.

9. Why does a bitmap index become so crucial in a data warehouse environment?

10. What is an indexed view in SQL 2000?
Answer: You can have an index on a view for faster retrieval of data. Indexed views are used, for increased performance and flexibility, instead of aggregation tables for ROLAP partitions if the partition's source data is stored in SQL Server 2000 and certain criteria are met.

11. What is a linked cube?
Answer: A cube can be stored on a single Analysis server and then defined as a linked cube on other Analysis servers. End users connected to any of these Analysis servers can then access the cube. This arrangement avoids the more costly alternative of storing and maintaining copies of a cube on multiple Analysis servers. Linked cubes can be connected using TCP/IP or HTTP. To end users, a linked cube looks like a regular cube.
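The data-driven query described in question 3 is essentially an upsert: update the dimension row if the business key already exists, insert it otherwise. The sketch below uses an in-memory dict with invented product names in place of a real dimension table.

```python
# A tiny stand-in for a product dimension keyed by business key.
dimension = {"coke": {"package_size": "300ml"}}

def data_driven_load(dimension, product, attrs):
    """Upsert one dimension row; returns which action was taken."""
    if product in dimension:
        dimension[product].update(attrs)   # existing key: fire an UPDATE
        return "update"
    dimension[product] = dict(attrs)       # new key: fire an INSERT
    return "insert"

a = data_driven_load(dimension, "coke", {"package_size": "500ml"})
b = data_driven_load(dimension, "pepsi", {"package_size": "300ml"})
print(a, b, dimension["coke"]["package_size"])  # update insert 500ml
```

Note that overwriting in place like this is Type 1 SCD handling; a Type 2 approach would version the row instead.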

Oracle

1. What special features does Oracle 8i offer for DW?

2. What are the differences between a table, a normal view and a materialized view?

3. How do you decide on the rollback segment size and number in a DW vis-a-vis OLTP systems?

4. How are bitmap indexes useful for a DW?

5. What is the significance of parameters like PCTUSED, PCTFREE, MINEXTENTS and MAXEXTENTS in table creation?

6. What are the various ways of ensuring referential integrity in an Oracle database?

7. What are the advantages of packages?

8. What does pinning a package mean?

9. What are the various methods of dynamic SQL in Pro*C? How and when are they used?

10. How is dynamic SQL used in PL/SQL procedures?

11. Can we use a PL/SQL block inside a Pro*C program? If so, what is the syntax?

12. What is meant by an indicator variable?

13. What is the name of the package that is used for writing to an output file from a PL/SQL program?

14. What are X$ tables?

15. Is there any other way to create a user without using the CREATE USER command?

16. Why do we give an optimal size when creating a rollback segment?

17. Can we increase a tablespace's size without adding another datafile?

18. What is the db block size?

19. Can we increase db_block_size? If so, when and how?

20. What is the reason for using the ANALYZE command?

21. What is meant by roll-forward?

22. Can we use COMMIT inside a trigger?

23. What are schema objects and non-schema objects?

24. After deleting all records in a table, will the allocated blocks get deallocated?

25. What is a listener?

26. Without opening a database, can we query it, e.g. SELECT name FROM v$database?

27. What is the difference between the CREATE INDEX command and the CREATE INDEX command using UNRECOVERABLE?

28. What is meant by undocumented parameters? Give examples.

29. When we drop a table, which related objects get dropped with it?

30. What is the advantage of using redo log members when creating redo log files?

31. How should control files be organized when creating a database?

32. Can we take the SYSTEM tablespace offline while the database is running? If so, why?

33. What are the different types of backup routines used for taking backups?

34. Why should you have index and data tablespaces on separate disk units?

35. What is variable spanned blocking? Does Oracle support variable spanned blocking?
Answer: Oracle supports variable unspanned blocking. In variable spanned blocking, portions of the same record span multiple blocks, and managing it is more cumbersome.

36. What is the function of the UTLBSTAT/UTLESTAT report?
Answer: This report helps balance I/O and suggests whether we should swap tablespace and datafile combinations.

37. What is the difference between a hot and a cold backup?

38. Why should you mirror the redo logs?
Answer: Because an unmirrored redo log is a single point of failure: we lose the entire database if we lose a redo log file.

39. How do you estimate the size of a rollback segment? Under what kind of transactions is there a possibility of running out of temporary tablespace?

40. What is a control file in Oracle?
Answer: It contains information that the database uses to recover itself and maintain its integrity. It is better to have a number of control files on different disk drives.

41. What is the SGA?

42. List some background processes of Oracle.
Answer: PMON, SMON, DBWR, LGWR, CKPT, ARCH and RECO.

43. What is a trace file?
Answer: The background processes produce trace files, which contain information about user sessions. Date and time stamps help match each trace file to the user session that produced it.

44. What is the difference between paging and swapping in the context of Oracle?
Answer: Paging moves portions of an application or process from real memory to secondary storage, while swapping moves the entire process or application from real memory to secondary storage.

45. What do you mean by table and index splitting?
Answer: It is better to keep index and table tablespaces on separate disk units to avoid disk I/O contention. This aids performance.

46. What do you mean by hotspots in the context of Oracle?
Answer: Hotspots are files within the Oracle database that are most heavily read from or written to. There is a command, "monitor fileio", to see the files that are potential hotspots; it is accessible to the user SYS.

47. Sizing an index and a table: how do you size an index or a table?

48. How can you parallelize an SQL query?
Answer: With a parallel hint, e.g. /*+ PARALLEL(t, 8) */.

49. What does SHARED_POOL_SIZE in init.ora signify?
Answer: This parameter determines the amount of memory that Oracle uses for its library and data dictionary caches.

1.12 Informatica

1. What are the various types of sources supported by Informatica?
Answer (F = full, P = partial support):
- Adabas (F): PowerConnect for Mainframe
- Oracle (F)
- SQL Anywhere (F)
- COBOL (F)
- Unisys DMSII (P)
- PeopleSoft (F): PowerConnect for PeopleSoft
- Siebel (F): PowerConnect for Siebel
- VSAM (F)
- DB2 (F): either DB2 Connect or PowerConnect for Mainframe
- COBOL/flat files (F)
- Access (F)
- Sybase (F)
- Excel (F)

2. What are the various sources that PowerConnect is used for?
Answer: DB2, Adabas, PeopleSoft, Siebel, mainframe COBOL/flat files.

3. Does Informatica generate code compatible with the source?
Answer: PowerCenter uses either native connectivity or ODBC to extract from source systems, so the code is always compatible with the source.

4. What is PowerChannel?
Answer: Informatica PowerChannel is a product that greatly improves the movement of data across slow-speed networks, for example a WAN or the Internet. It is typically used for the movement of data between geographically dispersed locations (branch offices, for example) or for sending data to, and receiving data from, external agencies. PowerChannel provides a management environment for defining and executing these data movements using XML control constructs, allowing the data to be compressed and encrypted using RSA security algorithms. It ensures complete, secure and reliable delivery of the data files, including the ability to restart failed jobs from the point of failure.

5. What are the various types of extracts supported by Informatica?
Answer: Incremental loads are supported using PowerCenter mapping variables and parameters. Event-driven loads are also supported, either using mapping variable/parameter files or using the event-driven option in the scheduler; once an event is recognized during a mapping flow, the type of load (new, update or delete) is determined and an Update Strategy transformation object applies it. An entire copy of the source table (a full refresh) is also supported within PowerCenter.

6. What are the various types of transformations supported by the tool?
Answer: PowerCenter provides a number of functions that allow many different conversion operations:
- Aggregate: AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM and VARIANCE
- Character: ASCII, CHR, CONCAT, INITCAP, INSTR, LENGTH, LOWER, LPAD, LTRIM, RPAD, RTRIM, SUBSTR and UPPER
- Conversion: TO_CHAR, TO_DATE, TO_DECIMAL, TO_FLOAT, TO_INTEGER and TO_NUMBER
- Date: ADD_TO_DATE, DATE_COMPARE, DATE_DIFF, GET_DATE_PART, LAST_DAY, MAX, MIN, ROUND, SET_DATE_PART and TRUNC
- Numeric: ABS, CEIL, CUME, EXP, FLOOR, LN, LOG, MOD, MOVINGAVG, MOVINGSUM, POWER, ROUND, SIGN, SQRT and TRUNC
- Scientific: COS, COSH, SIN, SINH, TAN and TANH
- Special: ABORT, DECODE, ERROR, IIF and LOOKUP
- Test: ISNULL, IS_DATE, IS_NUMBER and IS_SPACES
- Variable: SETCOUNTVARIABLE, SETMAXVARIABLE, SETMINVARIABLE and SETVARIABLE
EBCDIC-to-ASCII conversion is handled automatically by PowerCenter.

7. Does it support SCD?
Answer: Slowly changing dimensions are supported within PowerCenter. Changes can be identified and keys generated using a combination of the transformation objects supplied as standard. PowerCenter also includes a wizard to create the following mappings:
- Type 1 Dimension mapping: loads a slowly changing dimension table by inserting new dimensions and overwriting existing dimensions. Used when you do not want a history of previous dimension data.
- Type 2 Dimension/Version Data mapping: loads a slowly changing dimension table by inserting new and changed dimensions, using a version number and an incremented primary key to track changes. Used when you want to keep a full history of dimension data and to track the progression of changes.
- Type 2 Dimension/Flag Current mapping: loads a slowly changing dimension table by inserting new and changed dimensions, using a flag to mark current dimension data and an incremented primary key to track changes. Used when you want to keep a full history of dimension data, tracking the progression of changes while flagging only the current dimension.
- Type 2 Dimension/Effective Date Range mapping: loads a slowly changing dimension table by inserting new and changed dimensions, using a date range to define current dimension data. Used when you want to keep a full history of dimension data, tracking changes with an exact effective date range.
- Type 3 Dimension mapping: loads a slowly changing dimension table by inserting new dimensions and updating values in existing dimensions. Used when you want to keep the current and previous dimension values in your dimension table.

8. What optimization is supported by the tool?
Answer: PowerCenter has a multi-threaded architecture that allows it to scale fully across as many processors as are available. It uses intrinsic parallelization and memory buffering to ensure the reading, transforming and writing processes are executed with the highest possible throughput. It also supports data partitioning for parallel extraction and processing. In addition, PowerCenter can use multiple server engines running in parallel, on the same or multiple hardware platforms, to give almost unlimited scalability. When reading from relational databases, optimizer hints can be included if necessary to ensure full advantage is taken of any indexes on the source data.

9. What are the backup and recovery options offered by the tool?
Answer: The Repository Manager client tool is used to back up and restore the PowerCenter metadata repository; this is a simple process that saves the repository as a flat file. Additionally, each object/component can be extracted into XML format and restored from XML format, allowing objects/components to be held externally. PowerCenter is designed as a data integration tool to populate data warehouses and similar targets; it is therefore not equipped with direct functionality to back up or recover a database, which is usually left to the database the warehouse resides on. However, PowerCenter has recovery options: it supports database commits, and the PowerCenter recovery option registers each processed row in the repository, so recovery loads to the warehouse can be achieved using PowerCenter.

10. What support for security does the tool offer?
Answer: PowerCenter has an integral security model for controlling the use and implementation of the toolset. This does not cover operating system security, as it sits on top of it: a user would need to access the operating system before being able to log in to, or run, PowerCenter. The repository is used to manage access to the source and target systems. Access control is configured using the Repository Manager client tool. Each user has a profile and belongs to a group; profiles and groups can be configured to give the required access to different objects within PowerCenter. PowerCenter uses work areas called folders, which can be restricted to owners, groups and others, with read, write and execute permissions settable for each. This security model enables single- and multi-user development environments.

11. Does the tool support rapid development methods and parallel work streams on all areas of ETL across teams?
Answer: Development and production environments are usually treated as separate repositories. Migration between the two can be managed using the client tools or via object import/export using XML. Development can proceed with multiple teams over multiple projects: folders can be created for separate projects, with permissions set to give different users the required access. PowerCenter uses its own locking model, which allows multiple users to develop in the repository at the same time. If developers wish to hold source objects under a configuration management tool such as PVCS, the metadata elements can be exported from the Informatica metadata repository in XML format and checked into the configuration management tool. Reusable components such as mapplets and transformations further enhance productivity; these objects can be placed in a shared folder so that multiple projects can share the same functions, either by copying them or by creating shortcuts.

12. Does it support a rapid learning curve?
Answer: PowerCenter is not a code generation tool; all operations are read from and written to the metadata repository. The repository is not proprietary and is stored on a relational database such as Oracle. The Designer client has the following options when visualizing sources and targets: analyze metadata (columns, precisions); analyze constraints; analyze data types; add a business description for each source/target and for each source/target column; analyze SQL queries from the source; and preview data on source and target.

13. What hardware is supported by the tool?
Answer: The following hardware platforms, operating systems and RDBMSs are supported:
- Repository: Oracle, Sybase, Informix, MS SQL Server
- Client tools: 32-bit Windows platforms (Windows 95/98/NT/2000)
- Server engine: Windows NT/2000, UNIX (HP-UX, Sun Solaris, AIX)

14. What is the metadata support offered by the tool?
Answer: PowerCenter holds the metadata regarding all transformations within the Informatica metadata repository. This metadata can be browsed to ascertain the usage of joins, aggregates, etc., and can be linked to the operational metadata stored within the repository to provide run-time information (throughput, frequency, etc.). The metadata repository is shareable with third-party BI vendors such as BusinessObjects, Hyperion Essbase, Cognos, MicroStrategy, SAS and Brio. Informatica also provides a web-based Metadata Reporter with pre-built reports for project/operational reporting.

15. Outline the steps to migrate mappings from one repository to another.
Answer: Copy from Designer, or export/import as .xml.

16. What is session partitioning?
Answer: Partitioning the session based on the source: parallel reader, transformation and writer threads.

17. Why use server variables?
Answer: They are written into the server properties; instead of hard-coding paths every time, use the variables. If the paths change, there is no need to change the code.

18. Why use mapping variables?

19. Give examples of active and passive transformations.
Answer: Active: Aggregator, Advanced External Procedure, Filter, Joiner, Normalizer, Rank, Router, Source Qualifier, Update Strategy. Passive: Expression, External Procedure, Lookup, Sequence Generator, Stored Procedure, XML Source Qualifier.

20. Outline the steps you would undertake to tune a mapping.

21. How will you ascertain that there is a source or target bottleneck in a mapping?

22. What is incremental aggregation? When will you use this feature?

23. What is the function of a Normalizer transformation?
Answer: It is used for COBOL sources. The data in COBOL sources is highly denormalized; to normalize it for use with relational databases, use the Normalizer transformation.

24. Give an example of when you would use a Router transformation.
Answer: It is similar to a Filter transformation and is used when the same data must be tested against multiple conditions.

25. Can you merge the result of two active transformations into a passive transformation? (No.)

26. How will you tune the performance of the Aggregator, Rank and Joiner transformations?

27. How will you estimate the size of the data and index caches?
Answer: Using the cache sizing formula.
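The incremental aggregation asked about in question 22 (keep the previous totals and fold in only the newly arrived rows, rather than re-aggregating all history) can be sketched in a few lines. This is a standalone illustration with made-up data, not PowerCenter's actual implementation.

```python
# Running totals survive between loads; each load applies only new rows.
totals = {}  # product -> running sales total

def incremental_aggregate(totals, new_rows):
    """Fold a batch of (product, amount) rows into the running totals."""
    for product, amount in new_rows:
        totals[product] = totals.get(product, 0) + amount
    return totals

incremental_aggregate(totals, [("coke", 10), ("pepsi", 5)])  # day 1 load
incremental_aggregate(totals, [("coke", 3)])                 # day 2: new rows only
print(totals)  # {'coke': 13, 'pepsi': 5}
```

This pays off when the history is large relative to each new batch; if historical rows can change, a full re-aggregation is still needed.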

1.13 Business Objects


Q.
No.
1.

Question

Answer

What does a universe


consist of?

Its a Semantic layer contains Information about universe structure,


classes and objects.

Q.
No.

Question

Answer

2.

What are the integrity


checks need to be done
while designing the
universe?

3.

What are the types of


joins being used in BO
universe design?
What is a loop?

Invalid syntax
Loops
Isolated tables
Isolated joins
Loops within contexts
Missing or incorrect cardinalities
Equi join
Outer Join
Theta Join
A Loop is caused by a circular set of joins which defines a closed path
thro a set of tables.
4. What is a BO supervisory module?
A: A graphical representation of the universe domain, security domain and document domain.

5. What are all the profiles that can be created by a supervisor?
A: General Supervisor, Supervisor, Designer, Supervisor-Designer, End User and Versatile User.

7. What are all the supervisor functions?
A: Set up and maintain the architecture; define users and groups; assign them appropriate security profiles; customize user and group profiles; security management.

8. What products are included in the BO suite?
A: Three products: the BusinessObjects client, Supervisor and Designer.

9. How many types of domains can we have in a Business Objects repository? What are the functions of each type of domain?
A: Three types of domains:
Security domain: each domain of a repository is identified in the security domain. When a domain is created, its reference is automatically stored in the security domain. The security domain also contains information on the identification of the various BUSINESSOBJECTS users and on the management of the different products.
Universe domain: the universe domain is a set of data structures containing the characteristics of the universes created with DESIGNER. In order for a universe to be shared, it must be exported to the universe domain by the designer or supervisor.
Document domain: the document domain is a set of data structures containing the documents created by end users with the BUSINESSOBJECTS end-user modules. In order to share documents, or have them refreshed during scheduled processing, end users must send them to the document domain.

10. What is Aggregate Awareness? Where do we define the Aggregate Aware function?
A: Aggregate awareness is a feature of DESIGNER that makes use of aggregate tables in a database. These are tables that contain precalculated data; their purpose is to enhance the performance of SQL transactions, so they are used to speed up the execution of queries. The Aggregate Aware function is defined in an object's Select statement in DESIGNER.

11. Can we create two users of the same name in different groups using Supervisor?
A: No. A user name is unique across the repository.
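The aggregate-awareness idea in Q10 — answer a query from a precomputed summary table whenever one covers the requested dimensions — can be pictured as a table-selection rule. The sketch below is conceptual only; the table names, grouping columns and row counts are invented for illustration and are not part of any BO universe:

```python
# Conceptual sketch of aggregate awareness: given several precomputed
# aggregate tables, answer a query from the smallest table whose
# grouping columns cover everything the query asks for.

AGGREGATE_TABLES = [
    # (table name, grouping columns it was precomputed at, row count)
    ("agg_rev_by_year", {"year"}, 10),
    ("agg_rev_by_year_region", {"year", "region"}, 50),
    ("fact_sales", {"year", "region", "city", "customer"}, 1_000_000),
]

def pick_table(query_dims):
    """Return the cheapest table that contains all requested dimensions."""
    candidates = [t for t in AGGREGATE_TABLES if set(query_dims) <= t[1]]
    return min(candidates, key=lambda t: t[2])[0]

print(pick_table(["year"]))          # -> agg_rev_by_year (smallest wins)
print(pick_table(["year", "city"]))  # -> fact_sales (must fall back)
```

The same trade-off drives the real feature: the more detail a query asks for, the fewer aggregate tables qualify, until only the fact table remains.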
12. Can the same user exist in two different groups with different profiles?
A: Yes, this is possible. For example, a user can belong to the HR group as well as the Project Managers group.

13. What are a universe, a class and an object? How are they related?
A: A universe is a business-oriented mapping of the data structure found in databases: tables, columns, joins, etc. A universe can represent any specific application, system, or group of users; for example, a universe can relate to a department in a company such as marketing or accounting. A universe is made up of classes, which are logical groupings of objects, and objects, which map to columns or derived expressions and are the items end users pick to build queries.

14. What are calculation contexts? What are Input and Output Contexts?
A: Calculation context: by default, BUSINESSOBJECTS determines the result of a measure based on the dimension or dimensions in the part of the report in which the measure is inserted. These sets of dimensions are called calculation contexts.
Input context: consists of one or more dimensions that go into the calculation.
Output context: consists of one or more dimensions that determine the result of the calculation.

15. Can a General Supervisor create two domains of the same type within a single data account?
A: No.

16. Where is the address of the security domain stored?
A: It is stored in the BOMain.key file.

17. What is the difference between deleting and removing a user?
A: Removing a user removes it from one particular group only, whereas deleting a user deletes the user name from all groups.

18. What is the difference between a formula and a local variable?
A: A variable is a formula with a name. Variables have certain advantages over formulas because they let you do things you cannot do with formulas alone: you cannot apply alerters, filters, sorts and breaks on columns or rows containing formulas, but you can on those containing variables; and you can include variables qualified as dimensions in drill hierarchies.

19. What is offline mode? How can you log on to Business Objects in offline mode? Can we refresh a document while working in offline mode using a client/server connection?
A: Offline mode: depending on how BUSINESSOBJECTS has been set up, you may have the option of starting BUSINESSOBJECTS in offline mode. Using BUSINESSOBJECTS in offline mode means that you are not connected to a repository, which in turn means that, whatever your connection type, you will not be able to retrieve and send documents using BROADCAST AGENT.
Client/server connection: if you are using a BUSINESSOBJECTS client/server connection offline and are not connected to a repository, you can still work with documents and universes stored locally on your computer, and even create and refresh documents, provided you have a connection to the database and the database connection and security information is stored on your computer.

20. What is a strategy? What are built-in and external strategies?
A: A strategy is a script that reads structural information from a database or flat file. This information can pertain to tables, columns, or joins.

In DESIGNER you can specify two types of strategies: built-in strategies and external strategies.
Built-in strategies: DESIGNER uses the following built-in strategies for creating the components of universes:
- The Objects Creation strategy, which tells DESIGNER how to define classes and objects automatically from the database's tables and columns.
- The Joins Creation strategy, which tells DESIGNER how to define joins automatically from the database's tables and columns.
- The Table Browser strategy, which tells DESIGNER how to read the table and column structures from the database's data dictionary.
You can view them in the Strategies tab of the Universe Parameters dialog box.
External strategies: external strategy files are declared in the STG section of the .PRM files located in the various RDBMS folders. All external strategy files contain a number of existing strategies delivered with Business Objects products. For example, a file may contain one object strategy, one join strategy and one table browser strategy, or multiple strategies of each type. In this file you can customize an existing strategy or create your own. Each external strategy file is specific to one RDBMS.

21. What does saving a universe in enterprise and workgroup mode mean?
A: DESIGNER lets you save universes in either enterprise or workgroup mode. Enterprise mode means that you are working in an environment with a repository; workgroup mode means that you are working without a repository. The mode in which you save your universes determines whether other designers are able to access them.

22. In how many ways is a universe identified? What is the value of the unique identifier if the universe is never exported to the repository?
A: A universe is identified by:
- A file name, which consists of up to 8 characters and a .unv extension.
- A long name, which consists of up to 35 characters and may describe the purpose of the universe more fully. This is the name by which end users identify the universe in BUSINESSOBJECTS or WEBINTELLIGENCE.
- A unique system identifier, assigned by the repository when you export the universe. This identifier is null if you have never exported the universe.

23. What are concatenated objects in a universe?
A: A concatenated object is an object you create by combining two existing objects. For example, say you wish to create an object called Full Name, which is a concatenation of the objects Last Name and First Name in the Customer class.

24. How are loops resolved during universe design? When should we use an alias and when a context to resolve loops?
A: You can resolve loops in two ways: using aliases, or using contexts.

There is no strict rule to follow for resolving loops; however, whenever possible you should use an alias instead of a context, because when you use a context you expose the BUSINESSOBJECTS end user to the database structure. When you create aliases and end up with object names that sound very different, aliases are probably the right solution; if you end up with object names that sound very similar, you should consider using contexts.

25. I have 2 data providers giving me the following results: (Country, City) and (City, Revenue). How would you give me country-wise revenue?
A: Use data synchronization to link the two data providers by City, then define a variable, say Revenue Country, whose formula uses the Multicube() function: Multicube(<Revenue>). Without the Multicube function you will get erroneous results.

26. Can you create reports using SQL? If yes, can you view & refresh them in Infoview?
A: You can create reports using SQL only in the full client (client/server mode), not in Infoview. You can push the reports into the repository and view and refresh them in Infoview, but you cannot edit them there.

27. What features of Business Objects enable you to enhance the performance of queries?
A: Aggregate awareness is the main answer to look for in this question. The interviewer can also ask you to explain the concept of aggregate awareness.

28. Can you create reports using free-hand SQL in a ZABO-installed full client?
A: No. You cannot have free-hand SQL, stored procedures or OLAP sources as data providers for creating reports from a ZABO full client.

29. Can WebIntelligence users create their own hierarchies?
A: No. Only full client users can do this.

30. Is it possible to create standard reporting templates in Infoview?
A: No. However, you can format your objects in Designer, including font styles, colors, borders, etc., and these formats will work for your Webi reports. Note that this formatting only works for the data returned by the query; it does not work for the column headers or, of course, your report title.

31. Suppose you have to schedule a report to run once a week, every Monday. This report contains prompts that specify the time period it is to run on; every time the report runs this time period changes, and therefore the prompts need to change. Is there any way to easily and quickly automate the updating of these prompts?
A: You will have to use a VBA macro to do this: the macro would calculate the time periods and pass the data to the report prompts. Alternatively, the time periods could come from a table instead of a prompt. Or, if the prompt refers to a constant such as a week number, you could remove the prompt and add a condition such as sales week = max(sales week), where max(sales week) could be either an additional object or a complex condition; this would bring back only the results for the maximum week number.
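The data-provider linking in Q25 can be pictured as an ordinary join-and-aggregate. The sketch below is conceptual only — the country, city and revenue values are invented — but it shows why linking the two result sets on City lets Revenue roll up to Country:

```python
# Sketch of linking two data providers: dp1 gives (Country, City),
# dp2 gives (City, Revenue); joining them on City lets us roll the
# revenue up to Country. All data values are invented.

dp1 = [("France", "Paris"), ("France", "Lyon"), ("US", "Boston")]
dp2 = [("Paris", 100), ("Lyon", 50), ("Boston", 70)]

# The link dimension (City) acts as the join key between providers.
city_to_country = dict((city, country) for country, city in dp1)

revenue_by_country = {}
for city, revenue in dp2:
    country = city_to_country[city]
    revenue_by_country[country] = revenue_by_country.get(country, 0) + revenue

print(revenue_by_country)  # -> {'France': 150, 'US': 70}
```

In BO the join itself is done by the linked dimension, and Multicube(<Revenue>) tells the report engine to aggregate the measure across the synchronized providers rather than repeat it per row.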

32. What is a short-cut join and what is its use?
A: Shortcut joins can be used in schemas containing redundant join paths leading to the same result, regardless of direction. They are also a means to improve performance and to avoid loops in design.

33. What are the major differences between BO Full Client and Infoview?
A: In Infoview you cannot:
a) create report-level variables;
b) do on-the-fly ranking;
c) have data from multiple data providers;
d) create reports from free-hand SQL and stored procedures (but you can create them from OLAP sources);
e) create custom hierarchies;
f) save a report in .pdf format without use of the SDK;
g) have multiple blocks in the same report, such as a chart and a table in the same report;
h) have multiple report tabs in the same document.
In Full Client you cannot have:
a) HTML link objects: these are objects which appear as a link in Infoview and which, when clicked, give you more information on that dimension by opening another HTML page or taking you to another report. Report linking is enabled using HTML link objects.

34. What is a user object and why do you need them?
A: A universe consists primarily of classes and objects, created by the universe designer. If the objects in a universe do not meet your needs, you can customize the universe by creating your own objects, which are called user objects. User objects appear in the User Objects class in the universe. You include them in queries in the same way that you include regular objects, and you do not need to define a connection to a database to define a user object. Based on one or more existing objects, user objects enable you to: make calculations at the database level; apply functions to text, for example to capitalize data; and group data.

35. How can you get months to sort correctly?
A: By default, BUSINESSOBJECTS sorts months in alphabetical order. To sort months correctly in chronological order you have to use the Month option in Custom Sort.

36. What is the concept of aliases in Business Objects?
A: In SQL an alias is an alternative name for a table. The purpose of aliases is to resolve structural issues in a database arising from SQL limitations. For example, one of the rules of SQL is that a table cannot be referenced twice in the same SQL statement, even when each reference is used for a different purpose; nonetheless, in some cases referencing it twice is necessary to obtain the desired query results. This is generally true in schemas where a table acts as a shared lookup for other tables in the database. In DESIGNER, an alias is just a pointer to another table. A designer places one or more aliases in the Structure pane so that BUSINESSOBJECTS and WEBINTELLIGENCE can generate the appropriate SQL statements for certain types of queries.

37. What is the concept of contexts in Business Objects?
A: A context is a rule that determines which of two paths is chosen when more than one path is possible in the database. With certain database structures you may need to use contexts rather than aliases to resolve loops. A situation where this commonly occurs is a transactional database with multiple fact tables (multiple stars) that share lookup tables. For example, the Club database contains statistical information about both sales and reservations. The statistics relating to each type of transaction are stored in distinct fact tables; however, because these fact tables share common dimensions, such as Resorts and Customers, the schema contains a loop. The only way to resolve this loop is to ensure that queries answer questions for one transaction or the other, such as: is the customer information needed from the perspective of sales or of reservations? The method for specifying the appropriate perspective is called a context. When a user runs a query from a universe containing contexts, BUSINESSOBJECTS or WEBINTELLIGENCE prompts the user to indicate the correct perspective for the query.
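Why a query must commit to one perspective is easy to demonstrate outside BO. The sketch below uses a handful of invented order and loan values for a single customer: fetching both fact sets through one combined query forms a Cartesian product and inflates both totals, which is exactly the situation contexts are designed to prevent:

```python
# One customer with three orders and two loans. A single combined
# query pairs every order row with every loan row (Cartesian product),
# so both totals are inflated; two separate queries (what a context
# forces) give the right numbers. Values are invented for illustration.
orders = [100.00, 150.00, 150.00]  # order values
loans = [50.00, 100.00]            # loan values

# One combined query: every order row pairs with every loan row.
combined = [(o, l) for o in orders for l in loans]
print(sum(o for o, _ in combined))  # -> 800.0 (each order counted len(loans) times)
print(sum(l for _, l in combined))  # -> 450.0 (each loan counted len(orders) times)

# Two separate queries, one per perspective:
print(sum(orders))  # -> 400.0
print(sum(loans))   # -> 150.0
```

The inflation factor is simply the row count of the other fact table, which is why the error grows with the data rather than staying constant.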
38. What is a chasm trap and how can you resolve it in BO Designer?
A: The chasm trap occurs when two many-to-one joins converge on a single table. You will get incorrect results if you run a query when the following circumstances exist:
- A many-to-one-to-many relationship exists among three tables in the universe structure.
- The query includes objects based on two tables, both at the "many" end of their respective joins.
- Multiple rows are returned for a single dimension.
In the example below a customer can place many orders and/or take out many loans (ORDERS <- CUSTOMER -> LOANS). If you run a query that returns the total order and loan values for a customer Paul, you get the following results:

Customer Name | Order Date | Order Value | Loans Date | Loans Value
Paul          | 12/01/99   | 100.00      | 05/08/97   | 50.00
Paul          | 14/04/99   | 150.00      | 05/08/97   | 50.00
Paul          | 20/09/99   | 150.00      | 05/08/97   | 50.00
Paul          | 12/01/99   | 100.00      | 03/06/97   | 100.00
Paul          | 14/04/99   | 150.00      | 03/06/97   | 100.00
Paul          | 20/09/99   | 150.00      | 03/06/97   | 100.00
              |            | Sum = 800   |            | Sum = 450

The total order value returned is 800 and the total loan value is 450. This is obviously an incorrect result: a Cartesian product of the CUSTOMER, ORDERS and LOANS tables has been returned. The correct results should be: total order value for Paul is 400, and total loan value for Paul is 150.
To resolve a chasm trap you need to make two separate queries and then combine the results. Depending on the type of objects defined for the fact tables and the type of end-user environment, you can use the following methods:
- Create a context for each fact table. This solution works in all cases for BUSINESSOBJECTS universes.
- Modify the SQL parameters for the universe so that separate SQL queries are generated for each measure. This solution only works for measure objects; it does not generate separate queries for dimension or detail objects.
- Break the universe into multiple universes, one for each fact table. This solution only applies to WEBINTELLIGENCE universes, when there are dimension objects in one or both fact tables, so that the two SELECT statements are synchronized and not joined.

39. How can you create a running sum for Revenue? Say you have a report which contains Country, Year & Revenue and you want to display another column which gives the running sum. After that, I want the running sum to reset whenever the Country changes. Say I have 2 countries, France and US; for France I have data for 3 years, FY93 (Revenue=2), FY94 (Revenue=3) and FY95 (Revenue=4), and likewise for US: FY93 (Revenue=5), FY94 (Revenue=6) and FY95 (Revenue=7). The plain running sum is 2, 5, 9, 14, 20, 27; when it resets as the country changes from France to US, the result should look like 2, 5, 9 then 5, 11, 18.
A: To create the running sum use the formula RunningSum(<Revenue>), and to reset it when Country changes make the formula RunningSum(<Revenue> ;<Country>). A reset context with more than one dimension looks like this: ;<Year>, <Region>
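The RunningSum reset behaviour can be checked with a short Python sketch that reproduces the France/US example using the data values given in the question:

```python
# Sketch of RunningSum(<Revenue>) vs RunningSum(<Revenue> ;<Country>):
# accumulate revenue row by row, optionally resetting the accumulator
# whenever the Country value changes. Data is the Q39 example.
rows = [("France", "FY93", 2), ("France", "FY94", 3), ("France", "FY95", 4),
        ("US", "FY93", 5), ("US", "FY94", 6), ("US", "FY95", 7)]

def running_sum(rows, reset_on_country=False):
    out, total, prev = [], 0, None
    for country, year, revenue in rows:
        if reset_on_country and country != prev:
            total = 0  # the reset dimension changed: start over
        total += revenue
        prev = country
        out.append(total)
    return out

print(running_sum(rows))                         # -> [2, 5, 9, 14, 20, 27]
print(running_sum(rows, reset_on_country=True))  # -> [2, 5, 9, 5, 11, 18]
```

The second call reproduces the reset sequence 2, 5, 9 then 5, 11, 18 described in the question.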

40. What is the difference between Count and CountAll?
A: The Count function counts values of a dimension object that are the same only one time; this is called a distinct count. The CountAll function counts all rows, including empty and duplicate rows.

41. Consider a report consisting of a table (Region, City, Revenue). The Year dimension is also available in the report, and you want to display one more column in the report, <Max Rev Per Year>. What will be the formula for this variable?
A: <Max Rev Per Year> = Max(<Revenue> ForEach <Year>)

42. Can you have multiple security domains? If yes, can these domains interact between themselves? If security domains cannot interact, how can you replicate users from one security domain to another?
A: Yes, you can have multiple security domains, but currently they CANNOT interact between themselves. To replicate users between security domains, use the Import/Export Users command from Supervisor.

43. Can you run BCA on a UNIX machine? If yes, can you schedule documents with VBA in them?
A: Yes, you can run BCA on UNIX, but you cannot run documents with VBA macros in them; for that you would need BCA to be running on NT.
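Both of the aggregation behaviours in Q40 and Q41 can be mimicked in plain Python. The sketch below is illustrative only (the sample values are invented), and it assumes Count skips empty values, which is how a distinct count typically behaves:

```python
# Count vs CountAll (Q40): Count is a distinct count and, in this
# sketch, ignores empty values; CountAll counts every row, empties
# and duplicates included. Sample values are invented.
cities = ["Paris", "Lyon", "Paris", None, "Lyon"]

count = len({c for c in cities if c is not None})  # distinct, non-empty
count_all = len(cities)                            # every row

print(count)      # -> 2
print(count_all)  # -> 5

# Max(<Revenue> ForEach <Year>) (Q41): maximum revenue taken per Year,
# shown here as a grouped maximum over (year, revenue) rows.
rows = [("FY93", 2), ("FY94", 3), ("FY93", 7), ("FY94", 6)]
max_per_year = {}
for year, revenue in rows:
    max_per_year[year] = max(max_per_year.get(year, revenue), revenue)
print(max_per_year)  # -> {'FY93': 7, 'FY94': 6}
```

In the report, ForEach extends the calculation's input context with Year even though Year is not displayed in the table, which is why the grouped maximum can differ per row.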

1) What kind of tool is BO?
2) What are the different kinds of repository domains in BO?
3) What are a Universe, a Class and an Object, and how are they interlinked? What different kinds of objects does BO have?
4) What do you mean by offline and online mode when working with Designer?
5) Explain the different kinds of connections, with an example.
6) What parameters are there in a Universe?
7) What is the difference between a built-in strategy and an external strategy?
8) What different types of joins does BO Designer support?
9) What are the cardinalities of a join?
10) What is the purpose of testing the integrity of a Universe?
11) What types of joins return incorrect results?
12) How do you resolve loops? Explain with an example.
13) What different kinds of functions are available while designing Universes?
14) Explain aggregate awareness.
15) What is a hierarchy, and how do you build one?
16) Explain slice and dice, and drill down/up/across.
17) How do you spot a hierarchy?
18) What levels of security does BO support?
19) Is it possible to create reports from different Universes in one document?
20) How do you link one Universe to another Universe?
21) Can a universe be linked to more than one data source?
22) What details does the BOMain.key file contain?
23) What options are present in WebIntelligence?
24) How do you save Universes and documents in the repository?
25) What are scheduling and broadcasting, and how are they done?
