Question 1.
Solution: Insertion sort is a simple sorting algorithm, a comparison sort in which the
sorted array (or list) is built one entry at a time. It is much less efficient on large lists than
more advanced algorithms such as quicksort, heapsort, or merge sort. However, insertion
sort provides several advantages:
• simple implementation
• efficient for (quite) small data sets
• adaptive, i.e. efficient for data sets that are already substantially sorted: the time
complexity is O(n + d), where d is the number of inversions
• more efficient in practice than most other simple quadratic (i.e. O(n²)) algorithms
such as selection sort or bubble sort: the average running time is about n²/4,
and the running time is linear in the best case
• stable, i.e. does not change the relative order of elements with equal keys
• in-place, i.e. only requires a constant amount O(1) of additional memory space
• online, i.e. can sort a list as it receives it.
insertionSort(array A)
begin
    for i := 1 to length[A]-1 do
    begin
        value := A[i];
        j := i - 1;
        done := false;
        repeat
            if A[j] > value then
            begin
                A[j + 1] := A[j];
                j := j - 1;
                if j < 0 then
                    done := true;
            end
            else
                done := true;
        until done;
        A[j + 1] := value;
    end;
end;
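The pseudocode above translates directly into a short runnable Python sketch (the function name is mine):

```python
def insertion_sort(a):
    """In-place insertion sort, mirroring the pseudocode above."""
    for i in range(1, len(a)):
        value = a[i]
        j = i - 1
        # Shift larger entries of the sorted prefix one slot to the right.
        while j >= 0 and a[j] > value:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = value
    return a
```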
ii).
Effectively, the list is divided into two parts: the sublist of items already sorted, which is
built up from left to right and is found at the beginning, and the sublist of items remaining
to be sorted, occupying the remainder of the array.
iii)
iv).
1. A small list will take fewer steps to sort than a large list.
2. Fewer steps are required to construct a sorted list from two sorted lists than two
unsorted lists. For example, you only have to traverse each list once if they're
already sorted (see the merge function below for an example implementation).
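Point 2 refers to a merge function "below", but no such function appears in this copy. A minimal sketch of a one-pass merge of two already-sorted lists (function name mine) would be:

```python
def merge(left, right):
    """Merge two already-sorted lists, traversing each list only once."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= keeps equal keys in order (stable)
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])       # at most one of these is non-empty
    result.extend(right[j:])
    return result
```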
v).
Quicksort is a well-known sorting algorithm developed by C. A. R. Hoare that, on
average, makes Θ(n log n) comparisons to sort n items. In the worst case it makes
Θ(n²) comparisons, though if implemented correctly this behavior is rare.
Typically, quicksort is significantly faster in practice than other Θ(n log n) algorithms,
because its inner loop can be implemented efficiently on most architectures, and for most
real-world data it is possible to make design choices that minimize the probability of
requiring quadratic time. Additionally, quicksort tends to make excellent use of the
memory hierarchy, taking good advantage of virtual memory and available caches.
Coupled with the fact that quicksort sorts in place, using only a small auxiliary stack, it
is very well suited to modern computer architectures.
Quicksort sorts by employing a divide and conquer strategy to divide a list into two sub-
lists.
[Figure: full example of quicksort on a random set of numbers. The boxed element is the pivot; it is always chosen as the last element of the partition.]
The base cases of the recursion are lists of size zero or one, which are always sorted.
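A simple (not in-place) Python sketch of this divide-and-conquer strategy, using the last element as pivot as in the example described above:

```python
def quicksort(a):
    """Quicksort with the last element as pivot.
    Base cases: lists of size zero or one are already sorted."""
    if len(a) <= 1:
        return a
    pivot = a[-1]
    less = [x for x in a[:-1] if x <= pivot]     # elements not greater than pivot
    greater = [x for x in a[:-1] if x > pivot]   # elements greater than pivot
    return quicksort(less) + [pivot] + quicksort(greater)
```

Production implementations partition in place instead of building new lists; this version trades that efficiency for clarity.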
Question 2.
a).
Suppose
o S(k) is true for a fixed constant k
Often k = 0
o S(n) ⇒ S(n+1) for all n >= k
Then S(n) is true for all n >= k.
Proof By Induction
• Claim: S(n) is true for all n >= k
• Basis:
o Show formula is true when n = k
• Inductive hypothesis:
o Assume formula is true for an arbitrary n
• Step:
o Show that formula is then true for n+1
David Luebke
Induction Example:
Gaussian Closed Form
• Prove 1 + 2 + 3 + … + n = n(n+1) / 2
o Basis:
If n = 0, then 0 = 0(0+1) / 2
o Inductive hypothesis:
Assume 1 + 2 + 3 + … + n = n(n+1) / 2
o Step (show true for n+1):
1 + 2 + … + n + (n+1) = (1 + 2 + … + n) + (n+1)
= n(n+1)/2 + (n+1) = (n+1)(n/2 + 1)
= (n+1)(n+2)/2 = (n+1)((n+1) + 1)/2
Induction Example:
Geometric Closed Form
• Prove a^0 + a^1 + … + a^n = (a^(n+1) - 1)/(a - 1)
for all a ≠ 1
o Basis: show that a^0 = (a^(0+1) - 1)/(a - 1)
a^0 = 1 = (a^1 - 1)/(a - 1)
o Inductive hypothesis:
Assume a^0 + a^1 + … + a^n = (a^(n+1) - 1)/(a - 1)
o Step (show true for n+1):
a^0 + a^1 + … + a^(n+1) = (a^0 + a^1 + … + a^n) + a^(n+1)
= (a^(n+1) - 1)/(a - 1) + a^(n+1) = (a^(n+2) - 1)/(a - 1)
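Both closed forms can be spot-checked numerically for small cases (a sanity check, not a proof; helper names are mine):

```python
def gauss(n):
    """Closed form for 1 + 2 + ... + n."""
    return n * (n + 1) // 2

def geometric(a, n):
    """Closed form for a^0 + a^1 + ... + a^n, for integer a != 1."""
    return (a ** (n + 1) - 1) // (a - 1)

# Compare each closed form against the explicit sum.
for n in range(10):
    assert sum(range(n + 1)) == gauss(n)
    assert sum(2 ** k for k in range(n + 1)) == geometric(2, n)
```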
Question 2.
b).
A classic example of a recursive procedure is the function used to calculate the factorial
of an integer.
Function definition: n! = 1 if n = 0, and n! = n × (n − 1)! if n > 0.
Pseudocode (recursive):
function factorial(n)
1. if n is 0, return 1
2. otherwise, return n × factorial(n − 1)
end factorial
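The pseudocode carries over to Python almost verbatim:

```python
def factorial(n):
    """Recursive factorial, following the definition above."""
    if n == 0:          # base case: 0! = 1
        return 1
    return n * factorial(n - 1)
```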
c).
A Turing machine is a theoretical device that manipulates symbols contained on a strip
of tape. Despite its simplicity, a Turing machine can be adapted to simulate the logic of
any computer algorithm, and is particularly useful in explaining the functions of a CPU
inside a computer. The Turing machine was described by Alan Turing in 1937,[1]
who called it an "a(utomatic)-machine". Turing machines are not intended as a practical
computing technology, but rather as a thought experiment representing a computing
machine. They help computer scientists understand the limits of mechanical computation.
A succinct definition of the thought experiment was given by Turing in his 1948 essay,
"Intelligent Machinery". Referring back to his 1936 publication, Turing writes that the
Turing machine, here called a Logical Computing Machine, consisted of:
...an infinite memory capacity obtained in the form of an infinite tape marked out into squares on
each of which a symbol could be printed. At any moment there is one symbol in the machine; it is
called the scanned symbol. The machine can alter the scanned symbol and its behavior is in part
determined by that symbol, but the symbols on the tape elsewhere do not affect the behavior of
the machine. However, the tape can be moved back and forth through the machine, this being one
of the elementary operations of the machine. Any symbol on the tape may therefore eventually
have an innings[2]. (Turing 1948, p. 61)
[Figure: the head is always over a particular square of the tape; only a finite stretch of squares is shown. The instruction to be performed (q4) appears over the scanned square. (Drawing after Kleene (1952), p. 375.)]
[Figure: here the internal state (q1) is shown inside the head, and the tape is depicted as infinite and pre-filled with "0", the symbol serving as blank. The system's full state (its configuration) consists of the internal state, the contents of the shaded squares including the blank scanned by the head ("11B"), and the position of the head. (Drawing after Minsky (1967), p. 121.)]
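The behavior described above (a scanned symbol, one-square tape moves, and transitions determined by the current state and symbol) can be sketched as a tiny simulator. Everything here is my own illustrative example, not Turing's notation; the sample machine simply flips every bit of its input and halts at the first blank.

```python
def run_tm(tape, transitions, state, blank="B", halt="halt", max_steps=1000):
    """Simulate a Turing machine on a two-way-infinite tape.
    transitions maps (state, scanned_symbol) -> (write, move, next_state),
    where move is "L" or "R". Returns the tape contents with blanks stripped."""
    cells = {i: s for i, s in enumerate(tape)}   # sparse tape representation
    head = 0
    for _ in range(max_steps):
        if state == halt:
            break
        symbol = cells.get(head, blank)          # the scanned symbol
        write, move, state = transitions[(state, symbol)]
        cells[head] = write                      # alter the scanned symbol
        head += 1 if move == "R" else -1         # move the tape one square
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)

# A one-state machine that flips each bit, then halts on the blank symbol.
flip = {
    ("q1", "0"): ("1", "R", "q1"),
    ("q1", "1"): ("0", "R", "q1"),
    ("q1", "B"): ("B", "L", "halt"),
}
```

For example, run_tm("1011", flip, "q1") yields "0100".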
Question 3.
a)i).
a)ii).
Load balancing algorithms based on
gradient methods and their analysis
through algebraic graph theory
Andrey G. Bronevich and Wolfgang Meyer
Abstract
The main results of this paper are based on the idea that most load balancing algorithms
can be described in the framework of optimization theory. This makes it possible to apply
classical results on convergence, its speed, and related questions. We emphasize that these
classical results were found independently, and that until now this connection has not been
shown clearly. In this paper, we analyze the load balancing algorithm based on the
steepest descent method. The analysis shows that the speed of convergence is
determined by eigenvalues of the Laplacian for the graph of a given load balancing
system. This consideration also leads to the problems of choosing an optimal structure for
a load balancing system. We prove that these optimal graphs have special Laplacians: the
multiplicities of their minimal and maximal positive eigenvalues must be greater than
one. Such a property is essential for strongly regular graphs, investigated in algebraic
graph theory.
b).
Kruskal's algorithm is an algorithm in graph theory that finds a minimum spanning tree
for a connected weighted graph. This means it finds a subset of the edges that forms a
tree that includes every vertex, where the total weight of all the edges in the tree is
minimized. If the graph is not connected, then it finds a minimum spanning forest (a
minimum spanning tree for each connected component). Kruskal's algorithm is an
example of a greedy algorithm.
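A compact Python sketch of Kruskal's algorithm (names and the union-find helper are my own): sort the edges by weight, then greedily add each edge that does not close a cycle.

```python
def kruskal(n, edges):
    """Kruskal's minimum spanning tree. Vertices are 0..n-1 and
    edges is a list of (weight, u, v) tuples. Returns the chosen
    edges; on a disconnected graph this is a minimum spanning forest."""
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):        # greedy: cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                     # skip edges that would form a cycle
            parent[ru] = rv              # merge the two components
            tree.append((w, u, v))
    return tree
```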
Question 4.a).
Formal analysis
From the initial description it's not obvious that quicksort takes Θ(n log n) time on
average. It's not hard to see that the partition operation, which simply loops over the
elements of the array once, uses Θ(n) time. In versions that perform concatenation, this
operation is also Θ(n).
In the best case, each time we perform a partition we divide the list into two nearly equal
pieces. This means each recursive call processes a list of half the size. Consequently, we
can make only log n nested calls before we reach a list of size 1. This means that the
depth of the call tree is Θ(log n). But no two calls at the same level of the call tree
process the same part of the original list; thus, each level of calls needs only Θ(n) time
all together (each call has some constant overhead, but since there are only Θ(n) calls at
each level, this is subsumed in the Θ(n) factor). The result is that the algorithm uses only
Θ(n log n) time.
An alternative approach is to set up a recurrence relation for T(n), the time needed to
sort a list of size n. Because a single quicksort call involves Θ(n) work for the partition
plus two recursive calls on lists of size n/2 in the best case, the relation would be:
T(n) = 2T(n/2) + Θ(n),
which solves to T(n) = Θ(n log n), matching the argument above.
In the worst case, however, the two sublists have size 1 and n − 1 (for example, if the
array consists entirely of equal elements), and the call tree becomes a linear chain of
n nested calls. The ith call does Θ(n − i) work, and the total is
(n − 1) + (n − 2) + … + 1 = Θ(n²). The recurrence relation is:
T(n) = T(n − 1) + Θ(n).
This is the same relation as for insertion sort and selection sort, and it solves to T(n) =
Θ(n²). Given knowledge of which comparisons are performed by the sort, there are
adaptive algorithms that can generate worst-case input for quicksort on the fly,
regardless of the pivot selection strategy.[4]
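The worst case can be observed directly by counting comparisons in a last-element-pivot quicksort (an illustrative sketch of mine): on already-sorted input every partition is maximally unbalanced, so the count grows as n(n − 1)/2.

```python
def quicksort_comparisons(a):
    """In-place quicksort (last element as pivot) that returns the
    number of element comparisons performed."""
    count = 0

    def sort(lo, hi):
        nonlocal count
        if lo >= hi:
            return
        pivot = a[hi]
        i = lo
        for j in range(lo, hi):          # partition step: Θ(hi - lo) work
            count += 1
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]        # put the pivot in its final place
        sort(lo, i - 1)
        sort(i + 1, hi)

    sort(0, len(a) - 1)
    return count
```

On sorted input of length 16 this makes 15 + 14 + … + 1 = 120 comparisons, exactly n(n − 1)/2.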
Question 4.b).
A greedy algorithm is any algorithm that follows the problem solving metaheuristic of
making the locally optimal choice at each stage[1] with the hope of finding the global
optimum.
For example, applying the greedy strategy to the traveling salesman problem yields the
following algorithm: "At each stage visit the unvisited city nearest to the current city".
Optimal substructure
"A problem exhibits optimal substructure if an optimal solution to the problem
contains optimal solutions to the sub-problems."[2] Said differently, a problem has
optimal substructure if the best next move always leads to the optimal solution.
An example of 'non-optimal substructure' would be a situation where capturing a
queen in chess (good next move) will eventually lead to the loss of the game (bad
overall move).
For many other problems, greedy algorithms fail to produce the optimal solution, and
may even produce the unique worst possible solutions. One example is the nearest
neighbor algorithm mentioned above: for each number of cities there is an assignment of
distances between the cities for which the nearest neighbor heuristic produces the unique
worst possible tour.[3]
Imagine a coin system with only 25-cent, 10-cent, and 4-cent coins. The greedy
algorithm would not be able to make change for 41 cents: after committing to one
25-cent coin and one 10-cent coin, it would be impossible to cover the remaining
6 cents with 4-cent coins. A person, or a more sophisticated algorithm, could make
change for 41 cents with one 25-cent coin and four 4-cent coins.
Question 4.d).
In computational complexity theory, the complexity class NP-complete (abbreviated
NP-C or NPC) is a class of problems having two properties:
• Any given solution to the problem can be verified quickly (in polynomial time);
the set of problems with this property is called NP (nondeterministic polynomial
time).
• If the problem can be solved quickly (in polynomial time), then so can every
problem in NP.
Although any given solution to such a problem can be verified quickly, there is no known
efficient way to locate a solution in the first place; indeed, the most notable characteristic
of NP-complete problems is that no fast solution to them is known. That is, the time
required to solve the problem using any currently known algorithm increases very
quickly as the size of the problem grows. As a result, the time required to solve even
moderately large versions of many of these problems easily reaches into the billions or
trillions of years, using any amount of computing power available today. As a
consequence, determining whether or not it is possible to solve these problems quickly is
one of the principal unsolved problems in computer science today.
While a method for computing the solutions to NP-complete problems using a reasonable
amount of time remains undiscovered, computer scientists and programmers still
frequently encounter NP-complete problems. An expert programmer should be able to
recognize an NP-complete problem so that he or she does not unknowingly waste time
trying to solve a problem which so far has eluded generations of computer scientists.
Instead, NP-complete problems are often addressed by using approximation algorithms.