DD2432 Artificial Neural Networks and Other Learning Systems
Exam 2009-03-12 at 14.00-19.00
Use a separate sheet for each question. Brief answers are preferred. Do not give several mutually conflicting answers to a question (no hedging, "ingen helgardering"). If you do, the alternative with the lowest score will be chosen.

Allowed tools: a calculator and a standard English to other-language dictionary may be used.

Good luck! / Erik and Örjan

Question 1 (4p)
Combine the terms with the right descriptions.

Terms:
A) an input line of an ANN node
B) the ANN node output value
C) the ANN weight between two units
D) the transfer function of an ANN node
E) the output line of an ANN node
F) weights can be positive or negative
G) the summation component of a node
H) using a threshold transfer function

Descriptions:
1) corresponds to the dendrite
2) corresponds to the axon
3) corresponds to the soma
4) corresponds to excitatory and inhibitory potentials
5) corresponds to spatial and temporal summation properties
6) corresponds to the synapse efficacy
7) corresponds to the all-or-none property of action potentials
8) corresponds to the neuron output frequency

Question 2 (4p)
Consider a one-layer bipolar (-1, 1) perceptron using the perceptron learning rule, a learning rate of 1 and a threshold of 0. It is initialized with all weights and the bias set to 0. The output (out) is set to 1 when the summed activation (net) is zero or more, and to -1 when it is below zero. Compute the weight matrix after 1 epoch of training given the inputs and targets below. Show your calculations, not just the final answer.

input:      target:
( 1  1)     -1
( 1 -1)      1
(-1  1)      1
(-1 -1)      1

Question 3 (2p)
The perceptron learning rule for a one-layer perceptron is said to stop learning unnecessarily early. What does this mean? Why/when is this bad?

Question 4 (4p)
A) Draw the network topology of a Backpropagation network that is to be used for data compression.
B) Write the formula (or draw the diagram) for the transfer function of the output nodes in a Backpropagation network that does classification into a set of N discrete classes.
C) In a project, a one-layer perceptron was used as a classifier of ultrasound data. Objects had to be classified into two categories (correct and malfunctioning), and this was fortunately simple (linearly separable). When the same equipment started to be used on a new set of objects, the one-layer network was not able to classify them correctly. It turned out that the two classes now had a sandwich-shaped decision surface: one class in the center (the cheese) and the other class as two surrounding regions (the two pieces of bread). Changing the network to a two-layer Backpropagation network solved the task effectively, even with very few hidden nodes. When a new change was introduced to the set of objects, the classifier did not work. Explain what the problem might be and how it can be solved.

Question 5 (4p)
The following tips are given for improving learning in a Backpropagation network.
A) Add random noise to the weights during learning.
B) Introduce momentum in the weight updating.
C) Order or weigh the training examples so that hard examples dominate.
D) Put the target values t inside the domain interval of the transfer function.
What is the idea behind each of these tips?

Question 6 (4p)
Generalization is an important property of ANNs. In this respect, describe what is meant by early stopping and by pruning.

Question 7 (4p)
A) What is the storage capacity of a Hopfield net with 100 units, assuming patterns have 50% 1s and 50% -1s?
B) How can this capacity be increased?
C) Why should the self-couplings (the diagonal elements of the weight matrix) be zero?
D) What is the basic assumption that Hopfield nets make about the patterns?

Question 8 (4p)
Boltzmann machines are guaranteed to converge to the global optimum, a very powerful property. The problem is that they are computationally expensive. This comes from a number of loops in the algorithm. Describe the loops for relaxation of a fixed network without learning, as well as what needs to be computed to do learning.
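
The nested loops that Question 8 refers to can be illustrated with a minimal sketch; this is one possible reading, not a model answer. It assumes binary 0/1 units, symmetric weights with a zero diagonal, the standard stochastic update P(s_i = 1) = 1 / (1 + exp(-net_i / T)), and a made-up annealing schedule. The names `relax` and `energy` and the example weights are hypothetical and do not come from the course material.

```python
import numpy as np


def relax(weights, bias, temperatures, sweeps_per_temperature, rng):
    """Anneal a fixed (non-learning) network of 0/1 units; return the final state."""
    n_units = len(bias)
    state = rng.integers(0, 2, size=n_units).astype(float)  # random start state
    for temperature in temperatures:             # loop 1: annealing schedule
        for _ in range(sweeps_per_temperature):  # loop 2: sweeps at this temperature
            for i in rng.permutation(n_units):   # loop 3: every unit, random order
                # loop 4 (hidden in the dot product): sum over connected units
                net = weights[i] @ state + bias[i]
                p_on = 1.0 / (1.0 + np.exp(-net / temperature))
                state[i] = 1.0 if rng.random() < p_on else 0.0
    return state


def energy(weights, bias, state):
    """Energy of a state, assuming symmetric weights with a zero diagonal."""
    return -0.5 * state @ weights @ state - bias @ state


# Illustrative 3-unit network (assumed values, chosen only for the demo).
rng = np.random.default_rng(0)
weights = np.array([[0.0, 2.0, -1.0],
                    [2.0, 0.0, 1.0],
                    [-1.0, 1.0, 0.0]])
bias = np.zeros(3)
final_state = relax(weights, bias, [10.0, 5.0, 1.0, 0.1], 20, rng)
print(final_state, energy(weights, bias, final_state))
```

For learning, a relaxation of this kind would have to be repeated for every training pattern in both a clamped and a free-running phase, in order to estimate the co-activation statistics <s_i s_j> that the Boltzmann learning rule compares between the two phases.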

Question 9 (4p)
Radial basis functions are commonly used as the transfer function of the input layer. Explain why. What does one want to achieve? What problems can be handled by doing this?

Question 10 (4p)
A) Reinforcement learning is not an unsupervised algorithm, and it does not depend on a detailed supervised error signal. Give an example of a reinforcement signal in an application and say when it is applied/used. Also describe very briefly what the application is about, so that we can understand the context of the error signal.
B) Describe briefly the credit assignment problem.

Question 11 (2p)
What does one want to accomplish by using Oja's or Sanger's rule/net? Why is this interesting to achieve? What assumption about the data does it build upon?

Question 12 (2p)
In competitive learning there is the so-called dead-unit problem. Give two ways to overcome this problem.

Question 13 (4p)
Below you find a number of problems that should be solved using one of the algorithms encountered in the course. For each case, name the algorithm. If you think it is important, also name some feature, such as the number of layers, the learning rule, the number of input/output nodes and the data fed to them, or the type of activation function.

A) In a molecular biology lab, a project concerns classifying tissue samples from patients into a small number (3) of cancer forms and, of course, into the class no cancer. Data from on the order of 400 patients have been collected in the database. A large number (200) of proteins are measured in the samples and subsequently used as input to an ANN. Learning turns out to be problematic, as you may see yourself. Subsequently, the protein data is examined in several ways, for example by plotting protein X versus protein Y for all patients. The conclusion is that the data does not seem to span the whole 200-D space, but rather to lie in sheets. What method/algorithm do you apply to the data before sending it in as input to the ANN classifier?

B) In your company, the staff has been screened with a battery of personality tests. Altogether, each person is characterized by 47 different measures of personality traits. Management now wants to use this to improve teamwork. One idea is to have focus groups working on problems that need to be solved rapidly. These groups should consist of persons who are similar. For each group, management will pick out a group leader. Your task is to pick out the 4 other persons who are most similar to the chosen person. Based on experience, you know that management wants to see your selection illustrated as a 2-D plot (management won't understand higher-dimensional plots).

C) As a consultant to a large grocery store, your task is to optimize customer exposure to a list of products that the store wants to sell. These products will be placed throughout the store. Exposure means that the customer should walk by as many of the products as possible. At your disposal you also have a list of very common products that many customers buy (assume for simplicity that all customers buy all these products). Assume that all products have a fixed location in the store (except for the exposure products, which are different every week). You can also assume that customers are smart and want to minimize their walking distance between the products they intend to buy. What algorithm do you use to decide where to put the exposure products?

D) Your company is developing a logic-based AI system to plan maintenance of the manufacturing machines. Since service is costly, it should of course be performed as seldom as possible, so maintenance should be planned close to the failure point. The goal is to produce a value indicating the time until a machine component is expected to fail unless serviced. The company has saved a lot of data over the years. The data consists of time points for maintenance as well as time points for failures for a list of machine components.
The AI project proceeded rapidly at first, finding the first 8 rules, but then progressed more and more slowly in terms of adding rules and getting closer to the desired prediction level. At one point, management decides that a quick and dirty solution is needed: something that can be made to work fairly well and be ready very soon.