NOTE TO READERS
The questions within this book have been compiled from theoretical and real-life
interviews from several sources. By using this book you should be able to identify
key areas you should improve. I personally use these questions as a refresher for a
lot of things I have learnt. In this release I improved the layout and fixed a few
wording mistakes.
You can find a list of online review lectures for SQL Server 2008 R2's BIDS suite at:
http://dl.dropbox.com/u/98625533/sql.html
Email suggestions and questions regarding this book to:
books@otbxsolutions.ca
Regards,
Neelan Joachimpillai
6. What is Consistency?
Consistency means that every transaction takes the database from one valid state to another: all defined rules and constraints hold both before and after it is performed.
7. What is Isolation?
Concurrent transactions are independent of each other: the intermediate changes of one transaction are not visible to the
others. The success of one transaction doesn't depend on the success of another.
8. What is Durability?
Guarantees that once a transaction is committed its changes are permanent. The database keeps track of changes (in the
transaction log) so that the server will be able to recover if an error occurs.
9. What is a DBMS?
A DBMS is a set of software programs used to manage and interact with databases.
21. What is the difference between a derived attribute, derived persistent attribute, and computed column?
A derived attribute is an attribute that is obtained from the values of other existing columns and does not exist on its own.
A derived persistent attribute is a derived attribute that is stored. A computed column is a column whose value is computed
from an expression, which can use other columns in the same table; it is not stored unless it is marked PERSISTED.
23. Is the relationship between a strong and weak entity always identifying?
Yes, this is the requirement.
SQL Questions
26. List the types of UDFs.
Scalar, inline table-valued, and multi-statement table-valued.
27. Describe what you know about PK, FK, and UK.
Primary key: creates a unique clustered index by default, doesn't accept null values, only one primary key per table.
Foreign key: references a primary key (or unique key) column. Can have null values. Enforces referential integrity.
Unique key: can have more than one per table. Accepts a null value (only one in SQL Server). Cannot have repeating values.
A table can have only one clustered index and a maximum of 999 nonclustered indexes.
28. What do you mean by CTEs? How will you use it?
CTEs, also known as common table expressions, are used to define a temporary named result set that only exists
for the duration of a query. You can reference its content by name in order to simplify a query's structure.
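As a minimal sketch of the idea, the following uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is runnable; the WITH syntax shown is the same in T-SQL). The table orders and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES ('ann', 50), ('ann', 70), ('bob', 20), ('cy', 200);
""")

# The CTE "totals" is a named result set that exists only while this
# single statement runs; the outer query then filters it by name.
cte_rows = conn.execute("""
    WITH totals AS (
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    )
    SELECT customer, total
    FROM totals
    WHERE total > 100
    ORDER BY customer
""").fetchall()

print(cte_rows)  # [('ann', 120), ('cy', 200)]
```

Without the CTE, the same filter would need a subquery or a HAVING clause; the CTE keeps the aggregation and the filtering as two readable steps.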
29. What is a sparse column?
It is a column that is optimized for holding null values.
30. What would the command: DENY CREATE TABLE TO Peter do?
It wouldn't allow the user Peter to perform the operation CREATE TABLE, regardless of the permissions granted through his roles.
31. What does the command: GRANT SELECT ON project TO Peter do?
It will allow the SELECT operation on the table project by Peter.
32. What does the command: REVOKE SELECT ON project FROM Peter do?
It will revoke the SELECT permission previously granted on that table to Peter.
33. What is new in SQL Server 2008?
Transparent database encryption; Change Data Capture (CDC) tables, for on-the-fly auditing of tables; the MERGE statement;
INSERT INTO, to bulk insert into a table from another table; the hierarchyid data type; filtered indexes; C-like compound
operators for numbers (such as +=); Resource Governor, for resource management; IntelliSense, for making programming
easier in SSMS; and execution plan freezing, to freeze in place how a query is executed.
34. What is new in SQL 2008 R2?
PowerPivot, maps, sparklines, data bars, and indicators to depict data.
40. What are before images, after images, undo activities and redo activities in relation to transactions?
Before images are copies of the data as it was before a transaction changed it; they are used to restore the data if the
transaction is rolled back. After images are copies of the data as changed by the transaction; they are used to roll
forward and enforce a committed transaction. Using the before images is called the undo activity. Using after images is
called the redo activity.
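The undo activity can be observed from the outside with a rollback. The sketch below uses Python's sqlite3 module rather than SQL Server (an assumption for runnability), and the account table is hypothetical; it only shows the visible effect, not how any engine stores its before images internally.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN and ROLLBACK below are issued explicitly by us.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")

conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 40 WHERE id = 1")
# Inside the transaction we see the changed value (the after image).
inside = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

# Rolling back is the undo activity: the before image is restored.
conn.execute("ROLLBACK")
after_rollback = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

print(inside, after_rollback)  # 40 100
```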
64. Are triggers automatic or not automatic? What is the syntax for creating a trigger?
Triggers are automatic: they run every time a specific operation is performed on a table.
The syntax for creating a trigger is as follows:
CREATE TRIGGER trg_t1 ON test_trigger
AFTER INSERT, DELETE
AS
BEGIN
PRINT 'Inside after trigger ...'
END
67. What is the difference between a unique clustered index and clustered index?
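A (non-unique) clustered index allows duplicate key values, while a unique clustered index does not. Both sort the data by
the index key, but with a non-unique clustered index a key value does not necessarily identify a single row.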
68. What is the difference between an inline function and a multi-statement function?
Multi-statement UDFs let you include multiple statements in the UDF. The UDF returns a result set that's populated inside
the function. Inline table-valued UDFs can include only one SELECT statement. Multi-statement
table-valued UDFs can add flexibility to many T-SQL solutions, but inline UDFs are almost always more efficient than
comparable multi-statement UDF solutions.
78. What is the difference between a stored procedure and user defined functions?
Difference between SPs and UDFs:
1) An SP is executed using EXEC, while a UDF is called inside a statement (for example, in a SELECT).
2) A UDF must return something, while an SP doesn't have to.
3) An SP can have output parameters, while a UDF cannot. Both can have return values though.
4) The return value of an SP is a scalar (an integer status code), while a UDF can return a scalar or a table.
5) In UDFs you cannot have TRY/CATCH statements.
79. In what kind of situations can you use user defined functions?
When you want to utilize the table output.
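84. ROW_NUMBER?
ROW_NUMBER() gives a unique sequential number to each row, in the order set by its ORDER BY clause.
85. DENSE_RANK?
DENSE_RANK() ranks rows based on the value of a column. Rows with equal values share the same rank, and the
ranking continues at the next consecutive number after repeats (no gaps).
86. NTILE?
NTILE(n) distributes the ordered rows into n groups that are as equal in size as possible.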
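The three ranking functions can be compared side by side. This is a sketch that runs on Python's sqlite3 module instead of SQL Server, assuming the bundled SQLite is version 3.25 or later (the first with window functions); the OVER (ORDER BY ...) syntax is the same in T-SQL. The scores table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (name TEXT, score INTEGER);
    INSERT INTO scores VALUES ('a', 90), ('b', 90), ('c', 80), ('d', 70);
""")

ranked = conn.execute("""
    SELECT name,
           score,
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS rn,  -- always unique
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dr,  -- ties share a rank, no gaps
           NTILE(2)     OVER (ORDER BY score DESC, name) AS nt   -- two equal-sized groups
    FROM scores
    ORDER BY rn
""").fetchall()

for row in ranked:
    print(row)
# ('a', 90, 1, 1, 1)
# ('b', 90, 2, 1, 1)
# ('c', 80, 3, 2, 2)
# ('d', 70, 4, 3, 2)
```

Rows a and b tie on score: ROW_NUMBER still separates them, while DENSE_RANK gives both rank 1 and continues with 2 for the next value.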
87. What is the difference between a Union and Union all in SQL?
Union combines multiple result sets into a single result set and removes duplicate rows, so only one
copy of each duplicated row is shown. Union all incorporates all rows into the result, including
duplicates.
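A small sketch of the difference, run through Python's sqlite3 module as a stand-in for SQL Server (an assumption for runnability; UNION and UNION ALL behave the same way in T-SQL). The tables t1 and t2 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (n INTEGER);
    CREATE TABLE t2 (n INTEGER);
    INSERT INTO t1 VALUES (1), (2);
    INSERT INTO t2 VALUES (2), (3);
""")

# UNION removes the duplicate row (2 appears in both tables).
union_rows = [r[0] for r in conn.execute(
    "SELECT n FROM t1 UNION SELECT n FROM t2 ORDER BY n")]

# UNION ALL keeps every row from both inputs.
union_all_rows = [r[0] for r in conn.execute(
    "SELECT n FROM t1 UNION ALL SELECT n FROM t2 ORDER BY n")]

print(union_rows)      # [1, 2, 3]
print(union_all_rows)  # [1, 2, 2, 3]
```

Because UNION has to detect duplicates, UNION ALL is usually the cheaper choice when duplicates are impossible or acceptable.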
91. Can we use Truncate command on a table which is referenced by FOREIGN KEY?
No.
DBCC CHECKDB
DBCC CHECKCATALOG
DBCC CHECKFILEGROUP
DBCC CHECKTABLE
99. Can you explain the types of Joins that we can have with SQL Server?
Inner join, Full outer join, left outer join, right outer join, and cross join.
100. When do you use SQL Profiler?
Microsoft SQL Server Profiler is a graphical user interface to SQL Trace for monitoring an instance of the Database Engine or
Analysis Services. You can capture and save data about each event to a file or table to analyze later. For example, you can
monitor a production environment to see which stored procedures are affecting performance by executing too slowly.
101. What is a Linked Server?
A linked server configuration enables SQL Server to execute commands against OLE DB data sources on remote servers.
102. When defining a FOREIGN KEY, what does the constraint ON DELETE CASCADE do?
If a row holding the referenced primary key is deleted, the rows referencing it are deleted as well. (ON UPDATE CASCADE does the same for changes to the key value.)
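A minimal sketch of a cascading delete, using Python's sqlite3 module instead of SQL Server (an assumption for runnability; note that SQLite needs PRAGMA foreign_keys = ON, which has no SQL Server equivalent). The dept and emp tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY);
    CREATE TABLE emp (
        id      INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES dept(id) ON DELETE CASCADE
    );
    INSERT INTO dept VALUES (1), (2);
    INSERT INTO emp VALUES (1, 1), (2, 1), (3, 2);
""")

# Deleting the parent row also deletes every child row that references it.
conn.execute("DELETE FROM dept WHERE id = 1")
remaining_emps = [r[0] for r in conn.execute("SELECT id FROM emp ORDER BY id")]

print(remaining_emps)  # [3]
```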
108. What should we do to copy the tables, schema and views from one SQL Server to another?
You can either use SQL Server's Import and Export Wizard, or you can generate an export script and run it against the other server.
113. What are three SQL keywords used to change or set someone's permissions?
GRANT, DENY, and REVOKE
117. What is the STUFF function and how does it differ from the REPLACE function?
Both STUFF and REPLACE are used to replace characters in a string.
SELECT REPLACE('abcdef','ab','xx')
results in xxcdef
SELECT REPLACE('defdefdef','def','abc')
results in abcabcabc
We cannot replace a specific occurrence of def using REPLACE.
SELECT STUFF('defdefdef',4, 3,'abc')
results in defabcdef
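To make the positional behaviour of STUFF concrete, here is a rough Python emulation of the two functions. The helper names stuff and replace are hypothetical, and the emulation assumes a case-sensitive comparison (in SQL Server, REPLACE follows the collation of its input).

```python
from typing import Optional


def stuff(text: str, start: int, length: int, replacement: str) -> Optional[str]:
    """Emulate T-SQL STUFF: delete `length` characters beginning at the
    1-based position `start`, then insert `replacement` at that position."""
    # T-SQL returns NULL when start is out of range or length is negative.
    if start < 1 or start > len(text) or length < 0:
        return None
    begin = start - 1
    return text[:begin] + replacement + text[begin + length:]


def replace(text: str, pattern: str, replacement: str) -> str:
    """Emulate T-SQL REPLACE: substitute every occurrence of `pattern`."""
    return text.replace(pattern, replacement)


print(replace("defdefdef", "def", "abc"))  # abcabcabc
print(stuff("defdefdef", 4, 3, "abc"))     # defabcdef
```

REPLACE searches by content and changes every match, while STUFF works purely by position, which is why it can target the second def only.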
118. What does it mean to have quoted_identifier on? What are the implications of having it off?
When SET QUOTED_IDENTIFIER is ON (default), identifiers can be delimited by double quotation
119. What are the different types of replication? How are they used?
Replication is a set of technologies for copying and distributing data and database objects from one database to another and
then synchronizing between databases to maintain consistency. Using replication, you can distribute data to different
locations and to remote or mobile users over local and wide area networks, dial-up connections, wireless connections, and
the Internet.
Transactional replication is typically used in server-to-server scenarios that require high
throughput, including: improving scalability and availability; data warehousing and reporting;
integrating data from multiple sites; integrating heterogeneous data; and offloading batch
processing. Merge replication is primarily designed for mobile applications or distributed server
applications that have possible data conflicts. Common scenarios include: exchanging data with
mobile users; consumer point of sale applications; and integration of data from multiple sites.
Snapshot replication is used to provide the initial data set for transactional and merge replication;
it can also be used when complete refreshes of data are appropriate.
120. What is the difference between a local and a global variable?
Local variables are user defined variables (prefixed with @) that exist for the duration of the batch or procedure in
which they are declared. Global variables (prefixed with @@, such as @@ROWCOUNT) are internal variables in SQL Server
that contain system and operation related values.
121. What is the difference between a local temporary table and a global temporary table?
A local temporary table (#name) is visible only to the session that created it and is dropped automatically when that
session ends or, if it was created inside a stored procedure, when the procedure finishes. A global temporary table
(##name) is visible to all sessions and is dropped when the session that created it ends and no other session is still
referencing it.
122. What are cursors? Name four types of cursors and when each one would be applied?
a. Base table
b. Static
c. Forward-only
d. Forward-only/Read-only
e. Keyset-driven
123. What is the purpose of UPDATE STATISTICS?
UPDATE STATISTICS is used to refresh the distribution statistics on a table or indexed view. The statistics are used by the
query optimizer to choose efficient execution plans, which speeds up the performance of queries.
124. How do you use DBCC statements to monitor various aspects of a SQL server installation?
DBCC statements are the Database Console Commands for SQL Server. Among other maintenance tasks, they allow you to check the physical and logical integrity of a database.
125. How do you load large data to the SQL server database?
You can use a BULK INSERT.
137. How would you Update the rows which are divisible by 10, given a set of numbers in column?
Use the modulus operator (%) in the WHERE clause of the UPDATE to find which numbers are divisible by 10.
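As a sketch, the statement could look like the following, run here through Python's sqlite3 module as a stand-in for SQL Server (an assumption for runnability; the % operator works the same way in T-SQL). The nums table and its flagged column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nums (n INTEGER, flagged INTEGER DEFAULT 0);
    INSERT INTO nums (n) VALUES (5), (10), (25), (30);
""")

# n % 10 = 0 is true only for values divisible by 10.
conn.execute("UPDATE nums SET flagged = 1 WHERE n % 10 = 0")

flagged_values = [r[0] for r in conn.execute(
    "SELECT n FROM nums WHERE flagged = 1 ORDER BY n")]

print(flagged_values)  # [10, 30]
```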
139. What do the 1NF, 2NF and 3NF forms of normalization entail?
The 1NF form gets rid of duplicate records and repeating attributes (every value is atomic).
The 2NF form gets rid of partial dependencies.
The 3NF form gets rid of transitive dependencies.
149. When utilizing the default constraint, if you do not enter a value, will the column be populated with the
default constraint specified?
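Yes. If the column is omitted from the INSERT statement (or the DEFAULT keyword is used), the column is populated with the default value. If you explicitly insert NULL, NULL is stored (if the column allows it) rather than the default.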
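How a default behaves when the column is omitted, set to NULL, or given a value can be checked with a short experiment. This sketch uses Python's sqlite3 module as a stand-in for SQL Server (an assumption for runnability; the three cases behave the same way in T-SQL for a nullable column). The tickets table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT DEFAULT 'new')")

# Column omitted: the default is applied.
conn.execute("INSERT INTO tickets (id) VALUES (1)")
# Explicit NULL: NULL is stored, the default is not applied.
conn.execute("INSERT INTO tickets (id, status) VALUES (2, NULL)")
# Explicit value: the value is stored as given.
conn.execute("INSERT INTO tickets (id, status) VALUES (3, 'done')")

ticket_rows = conn.execute("SELECT id, status FROM tickets ORDER BY id").fetchall()

print(ticket_rows)  # [(1, 'new'), (2, None), (3, 'done')]
```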
165. What is the difference between a dimensional modelling and relational modelling?
Dimensional modelling is used to design data warehouses, while relational modelling is used to create OLTPs.
SSIS Questions
2. How did you make sure that OLAP is loaded with the right data?
QA testing of the ETL processes. Test queries.
8. Name a few tasks you commonly use for SSIS in the data flow.
Slowly changing dimensions, data conversion, sort, multi-cast, merge, merge join, cache transform, lookup, script component,
conditional split, flat file source, OLE DB source, OLE DB command, OLE DB destination and SQL Server destination.
14. When will you use transactions and check points in SSIS?
Use transactions when you need to be able to roll back the work already done if something fails. Check points record which
tasks have completed, so that a failed package can be restarted from the point of failure instead of from the beginning.
If the roll back or re-execution of a package, container, or task requires a long time, use a check point over
a transaction.
15. What is the difference between a group container and a sequence container?
Group containers are just for viewing and do not have properties, while a sequence container allows you to set
properties on it and breaks up tasks into different task units.
16. What is the difference between a file system task and ftp task?
A file system task allows SSIS to manage files and folders on the local system, while an FTP task allows files
and folders to be managed on remote FTP servers.
23. What is the difference between a control flow and data flow?
A dataflow is a subset of a control flow in an SSIS package. Control flows dictate the order in which control tasks are
performed, while a dataflow dictates the flow of data within the individual control flow tasks. In the control flow you
can perform dataflow tasks in parallel.
24. What are the different ways you can do error handling in SSIS?
Event handlers, fail pipeline, transactions, and check points.
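39. How do you configure a script task and refer to an external variable? Remember that SSIS variable names are case sensitive.
List the variable in the task's ReadOnlyVariables or ReadWriteVariables property, then read or set it through Dts.Variables:
Public Sub Main()
    ' Requires "Imports System.IO" at the top of the script for File.Exists.
    ' User::path must be listed in ReadOnlyVariables and User::fileExists in ReadWriteVariables.
    Dim path As String = Dts.Variables("User::path").Value.ToString()
    ' Store whether the file exists in the package variable.
    Dts.Variables("User::fileExists").Value = File.Exists(path)
    Dts.TaskResult = ScriptResults.Success
End Sub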
41. What are the different types of log provider types in SSIS?
They are the exact same types of information stored in different formats. You can have the following provider types:
Windows Event log file, text files, XML files, SQL server databases and trace files.
49. What is the difference between pipelines and precedence constraints in SSIS?
Pipelines are for data flow, and precedence constraints are for the control flow.
50. What is an environment variable and how does it apply to SSIS configuration files?
An environment variable is a variable defined at the OS level. With SSIS package configurations you can use an INDIRECT
CONFIGURATION: the package stores the name of the environment variable, and the variable's value points to the
configuration. All you have to do is make sure that every computer the package is deployed to has the variable defined.
53. What is the task to call an SSIS package from inside another SSIS package?
By using the Execute Package Task.
55. What is the Parent package variable in Package Configuration used for?
It is used to pass variable values from parent to child packages. Remember that the settings have to be set on the child
package. Remember that you can only pass scalar values this way in SSIS.
59. Give an example of package deployment using the command line. (Remember the double quotes around the path name.)
dtutil /FILE "C:\Users\ConfusedInAManlyWay\Documents\Visual Studio 2008\Projects\March13\March13\child.dtsx" /DestServer localhost /COPY SQL;childpackage
63. What are the different types of transactions in SSIS and what do they do?
Required indicates that the container starts a transaction, unless one is already started by its parent container, in
which case it joins that transaction. Supported indicates that the container does not start a transaction, but joins any
transaction started by its parent container. Not Supported indicates that the container does not start a transaction or
join an existing transaction.
65. What is the difference between Fuzzy Grouping and Fuzzy Lookup?
Fuzzy grouping examines the similarity between rows on one or more columns and adds columns to the output that
describe the grouping and the similarity score. The grouping compares each row to the first instance of the row
that appears. A fuzzy lookup performs a join/lookup based on whether the row's columns being compared are
similar to a degree to the looked-up values being joined on.
69. What command line tools can you use with SSIS?
DTutil, DTexec, and DTexecUI are three that can be used.
72. How would you pass database information from SSIS to ASP.Net?
Using ASP.NET's DataReader.
74. What is the prerequisite for data to be put in Merge and Merge Join transformations?
The data from the pipeline must be sorted.
75. What is SSMS used for?
SQL Server Management Studio (SSMS) is used to deploy packages on the SQL Server. It is what you use to interact with the database.
76. What is the difference between the For container, For Each container?
The Foreach Loop container will run through all the items in an enumeration. The For Loop container is used when you have bounds (an initial value, a condition, and an increment).
79. What is the difference between a Fuzzy look up and look up?
The Lookup transformation uses an equi-join to locate matching records in the reference table. It returns either an exact
match or nothing from the reference table. In contrast, the Fuzzy Lookup transformation uses fuzzy matching to return
one or more close matches from the reference table.
80. What is the difference between a Merge and Merge Join operation?
A merge requires the two incoming data sets to have the same number of columns (with matching data types). It then performs
a union on the rows, keeping the sort order. A merge join simply requires a common column on which it can perform a full join, left join, or inner join.
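The two operations can be sketched in plain Python to show why both need sorted inputs. This is an illustration of the underlying algorithms, not of how SSIS implements them; the function names and sample rows are hypothetical, and only the inner-join variant of the merge join is shown.

```python
import heapq


def merge_sorted(left, right, key):
    """Like Merge: union two inputs of the same shape that are already
    sorted on `key`, keeping the combined output sorted."""
    return list(heapq.merge(left, right, key=key))


def merge_join_inner(left, right, key):
    """Like Merge Join (inner): pair rows with equal keys in one pass
    over two inputs that are already sorted on `key`."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Pair the current left row with every right row sharing this key.
            k = j
            while k < len(right) and key(right[k]) == lk:
                out.append((left[i], right[k]))
                k += 1
            i += 1
    return out


first_key = lambda row: row[0]

east = [(1, "a"), (3, "c")]
west = [(2, "b"), (4, "d")]
merged = merge_sorted(east, west, first_key)

customers = [(1, "ann"), (2, "bob"), (4, "dee")]
orders = [(1, "book"), (1, "pen"), (3, "ink"), (4, "cup")]
joined = merge_join_inner(customers, orders, first_key)

print(merged)  # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
print(joined)
```

Both functions advance through each input once, which only works because a smaller key can never appear later in a sorted input.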
81. What is the difference between using a dataflow task or bulk insert task?
Bulk insert doesn't do data validation, while the dataflow task will. You can't do transformations or error handling in bulk
inserts, while you can in data flows.
82. What is the difference between an EXECUTE PROCESS TASK and EXECUTE PACKAGE TASK?
In execute process tasks, you can execute batch files, and in execute package task you execute packages.
92. What are the different ways that SSIS offers to execute script tasks?
ActiveX Component, Script task, and Script component.
102. Why is it not advisable to have all your variables as Package level?
The broader the scope, the more memory the variable will consume during run time.
104. What is the difference between an ETL tool and an ETL process?
The strategy is the plan, and the ETL tool is the vehicle that delivers the process.
105. What is the difference between an ETL tool and ETL package?
An ETL tool is what is used to build a package. A package is a deliverable of an ETL tool. The package is an instance of my ETL
strategy where the logic is implemented and delivered.
SSRS Questions
108. What is SSRS used for?
SQL Server Reporting Services (SSRS) is a report generating tool from Microsoft. It is used for generating reports.
111. What is the difference between a sub-report, linked report, drill through, drill down, and report?
A sub-report is a report within a report, a linked report is basically a report based on a perspective of another report (you
set constraints on some parameters and generate a report from that), a drill through report is a report with bookmarks and
links, a drill down report is a report that has toggles.
117. What types of roles are available in SSRS 2008, and what are their purposes?
Item-level roles and system-level roles are the two types of roles available in SSRS 2008. An item-level role is a collection of
tasks related to operations on an object of the report object hierarchy of SSRS 2008. A system-level role is a collection of
tasks related to operations on server objects outside the report object hierarchy of SSRS 2008.
119. How do you add drop downs when creating selective reports?
You add a variable reference in your SELECT query for one of your datasets. You add another dataset that will hold solely the
values of the drop down and get the values from a query. Then you reference the value from a parameter. This will
automatically create a parameterized report.
120. What is the expressions option under the dataset menu for a cell used for?
Customizing how the value of a cell is presented.
121. What is a data bar column?
It is a pictorial graphical representation of the column value in comparison to the other values in
the column.
122. In a report if we have data like 100.88899, how will you make sure that you round it off to two decimal places using
SQL queries?
You can use the ROUND() function, for example ROUND(column_name, 2), to round a column in a database to two decimal places.
123. What is a tablix?
Tablix is a flexible report item that can be used to display data in a grid format, with layout possibilities ranging from
simple tables to advanced matrices. It is used to manage grid structures in SSRS.
124. What does selecting show data values on a data bar column do?
It will allow you to see the values next to the value bar in the data bar column rather than having a separate column
specifically for it.
125. What is conditional formatting?
It is when you change the format of your report based on the value in your data set.
126. What are some considerations that you should make when designing your SSIS package?
Minimize data in the buffer since often, you will be working with a subset of what will be running when the system goes
live. Remove redundant columns and data. Always use a SELECT statement rather than tables when configuring a data
source. Remove columns if you are no longer using them.
SSAS Questions
136. What is a cube (Analysis Services Database)?
It is a database that lives on top of a data mart or data warehouse. It scans the underlying schema and extracts the data.
137. What is a virtual cube?
A virtual cube is a cube object that used to exist in older versions of Analysis Services. It was basically a view of a
cube. It has been replaced by perspectives.
138. What is a data source view?
It will limit what tables can be altered and accessed by the system to the components in a view. It is the window to the
source connection.
139. What are the benefits of an OLAP cube over a relational database?
Consistently fast response. Meta-data based queries. Spreadsheet based formulas.
142. What is the benefit of using an OLAP database model over an OLAP spreadsheet model?
It avoids data explosion since the database structure is not solely 2D like spreadsheets.
156. What is the difference between the key usage of regular, parent and key?
A parent child dimension is based on two dimension table columns that together define the lineage relationships among
the members of the dimension. One column called the member key column identifies each member, the other column,
called the parent key column identifies the parent of each member. If the key usage of a column is set to regular, it states
the column is an attribute. To simplify, you can look at things as follows:
Parent = Foreign key (for the parent key)
Key = Primary key
regular = attributes
157. What is the difference between a data source view and a perspective?
Data source views are on the source connections. Cubes and dimensions are built on the data source view. The
perspective is built on the cube.
167. Can MOLAP mode exist without aggregations? Can HOLAP exist without aggregations?
Yes
Id
First name
Last name
Sum of id etc
Sum of fn etc ..
Sum of ln etc ..
title
Default: if aggregations already exist, SSAS will not create the same aggregations again. Mainly, default and
unrestricted allow Analysis Services to figure things out.
176. What does setting the aggregations performance value do?
Creates more aggregations in order to speed up the queries for different situations.
177. What is the difference between processing and deploying?
Processing is loading the values, and deploying is loading the structures.
181. For you to use partitioning in SQL or Microsoft analysis services, what version do you need?
You will need the Enterprise Edition of SQL Server. The Developer Edition includes the same features, but it is licensed only for development and testing.
186. What are the pros and cons of having perspectives and how would you use them?
Pros
A perspective is a definition that allows users to see a cube in a simpler way. A perspective is a subset of the features of
a cube. A perspective enables administrators to create views of a cube, helping users to focus on the most relevant data
for them. A perspective contains subsets of all objects from a cube.
Cons
A perspective cannot include elements that are not defined in the parent cube. You cannot have additional security on a
perspective. If you have access to the perspective, you have access to the entire cube. It is read only.
189. What is the difference between a table bound and query bound partition?
A table bound partition is a partition that is bound to a fact table. A query bound partition is one where the
contents of the partition are defined by a query (when doing this you must make sure that you do not overlook
any of the data).
190. How do you format your measures in a measure group? What is the name of the property and which tab
does it fall under?
You can format your measure using the Cube Structure tab in Analysis Services. You can select the measure and edit the
FormatString property in the Properties window.
191. Under the cube object, which tab would you use to specify the relationships between dimensions and fact
tables/measure groups?
Dimension Usage.
195. What is the difference between referenced, many to many, fact, and regular relationships in analysis services?
A regular dimension relationship between a cube dimension and a measure group exists when the key column for the
dimension is joined directly to the fact table. This direct relationship is based on a primary key-foreign key relationship in the
underlying relational database, but might also be based on a logical relationship that is defined in the data source view.
Fact dimensions relationships, frequently referred to as degenerate dimensions, are standard dimensions that are constructed
from attribute columns in fact tables instead of from attribute columns in dimension tables. Useful dimensional data is
sometimes stored in a fact table to reduce duplication. A single fact table joined to multiple dimension members is called a
many-to-many relationship. A reference dimension relationship between a cube dimension and a measure group exists when
the key column for the dimension is joined indirectly to the fact table through a key in another dimension table.
SQL Profiler
1. What is SQL Profiler used for?
Finding database bottlenecks, debugging stored procedures and user defined processes, keeping a log of the
database activity, identifying deadlocks, identifying blocking, and much more. It is a database tool used by
developers, QA testers and DBAs.
2. Must SQL Profiler be run via GUI or Is there another way to run SQL Profiler?
You can run the same traces via T-SQL commands (the sp_trace_* system stored procedures of SQL Trace). Therefore, you can make a trace run in the background.
3. What is a template in SQL Profiler?
It is a predefined configuration for SQL Profiler that will make it monitor specific database activity.
4. What is a filter in SQL Profiler?
It is a condition on the SQL Profiler data that will control what data Profiler records.
9. Give me an example of how you went about testing your ETL Package.
11. How can you make sure that your underlying data source is protected, i.e., how do you prevent SQL injection?
17. How do you decide how much time a test is going to take?
34. Describe the most complex stored procedure you've created and why it is the most complex.