You are on page 1of 105

Improving Precision of Robot Control Using a Hardware Implementation of a Kalman Filter

Alan Cheng

13

Table of Contents
I. Abstract 1. Introduction 2. Goal, Design, and Setup 3. Phase 1: Inverse Kinematics and the Kalman Filter 4. Phase 1: Software Design (Image Processing) and Verification 5. Phase 2: Logic Synthesis and Systolic Processing 6. Phase 2: Hardware Implementation of Faddeevs Algorithm 7. Extension: 3D Location of Block 8. Conclusion 9. References Appendix A: Program Source Code Appendix B: Phase II Verilog-HDL Program for Faddeevs Algorithm Appendix C: Data from Calculation of Variance for Measurement Error Covariance Matrix

I. Abstract
In the current era of technology, robotics has become an important area of scientific research. They are used in manufacturing, entertainment, and even home industries. However, a major problem that arises is the control of the robot. In an application like robotic surgery, it is crucial to have accurate control of the robot. Unfortunately due to noise and mechanical error, the movement of a robot can get less and less precise. The purpose of this project is to optimize control of a robotic arm in an inverse kinematics problem. A Kalman Filter is used to improve the control and correct the movement of the robot arm. There are three main phases in the project: Phase 1: software implementation of Kalman Filter and inverse kinematics, Phase 2: hardware implementation of Faddeevs Algorithm to efficiently perform matrix operations, and Phase 3: combined implementation of Kalman Filter in hardware and software robot control Phase 1 involves the use of a Kalman Filter. This filter gives an optimal estimation in a noisy environment. It consists of three major steps: prediction, measurement, and correction. Two algorithms were tested for the problem of inverse kinematics: one that uses a Kalman Filter, and one that doesnt. The algorithms were tested on the accuracy of grabbing a block placed in an arbitrary location. Results show that the algorithm with the Kalman Filter yielded a much higher accuracy and reliability than the other algorithm. Phase 2 addresses problems observed in Phase 1. One of the largest problems is that the software is not fast enough to maintain accurate real-time processing. Phase 2 solves the problem by moving the process of calculating the Kalman Filter into a FPGA. A method for a hardware implementation of Faddeevs Algorithm, which is useful in matrix computations, is used. The method uses the novel concept of macropulses and micropulses to regulate the flow of data in the algorithm. The algorithm was written in Verilog-HDL and verified using the ModelSim simulator. Phase 3 is currently being worked on. The goal is to combine the methods used in Phase 1 and Phase 2 to create more effective robot control. Future work includes the following: Use Faddeevs Algorithm to efficiently implement a Kalman Filter into a hardware circuit and have it working on a FPGA, and to connect the FPGA to the computer to higher accuracy of robot control (FPGA will be used for the Kalman Filter). Another important concept to be realized in Phase 3 is the speed-up of image processing. By improving the calculation speed of image processing, the software will be able to have realtime processing without delay from the camera. This can be done by moving the image processing stage onto the FPGA or CUDA.

1. Introduction
Technology has been improving and expanding at an exponential rate. At the same, the industry is started to rely more and more on automotive machines. Existing examples include assembly robots in manufacturing plants (most notably car manufacture) and transportation robots used to sort and move items in a warehouse (Amazons warehouse). An important problem which all of the robots have to solve is the problem of robot control. The problem of robot control is simply to optimize and have an accurate and precise control of a robot. However, a simple as it may seem, the problem itself is very complex, and is a very important part of current research in robotics. Many of the problems arising in robot control come from the environmental noise. Disruption in data can be caused by measurement, like impreciseness of image detection and recognition. Variation can also come from the accuracy of the robot servo motors, where over time the servo motor would start to get more and more inaccurate. In applications like robotic surgery, precision is the main concern where even a small movement in the wrong direction could be fatal. As a result, in order to produce a wider variety of higher quality robots, the problem of robot control must be perfected.

Figure 1.1: (left) Amazons warehouse; (right) surgical robot

2. Goal, Design, and Setup


2.1 Problem In current day technology, one of the most important areas of research is on robot control. This is because in many situations very accurate and precision movement is necessary. One notable is the problem of robotic surgery where if the robot is the accurate, it could mortally wound or kill the patient. Another example is during the manufacturing process of batteries where a needle connected to the battery juice is inserted into the battery head. However, the needle needs to be inserted in the exact middle of the battery to have the most beneficial result. Normal control mechanism like PID control sometimes isnt accurate enough, and as a result, engineers look towards filters like the Kalman Filter to gain higher precision. 2.2 Goal The purpose of this project is to optimize control of a robotic arm in an inverse kinematics problem. For the inverse kinematics problem, the robot arm must be able to successfully locate and grab a blue colored block. The block can be placed in any arbitrary location. Also, the robot should be able to operate in a noisy environment (because the realworld environment is almost never noise-free).

Figure 2.1: Example setup of problem. Note that noise is allowed in the environment. 2.3 Constraints of Hardware Because of practicality, constraints are introduced into the project. The following list shows the constraints of the project:

Cost Hardware cannot be expensive (includes: precise servo motors and highperformance computers) Computer - HP Pavilion dm4 with core i5 processor clocked at 2.4 GHz (laptop is outdated, may struggle to run high video image processing) o Cores 2 cores are on the computer (limits multi-threading and parallel processing) Robot Arm Has limited degrees of freedom and range

2.4 Lynxmotion AL5A and the Pololu Mini Maestro The Lynxmotion AL5A is a low cost solution (around $300) for a robot arm. It includes five degrees of freedom (which are shown in the figure below) with a range 180.The arm can also reach up to 14 inches around the base.

Gripper

Wrist Rotation (vertical)

Ulna (vertical)

Yellow dot for image detection

Humerus (vertical)

Base

Pololu Mini Maestro Servo Controller

Figure 2.2: Photograph of Lynxmotion AL5A Robot Arm and label of each part of the arm.

A Pololu Mini Maestro Servo Controller (12 servos) is used to control the robot arm. The control of the robot is done through pulse-width modulation instead of in degrees. Since the pulse-width modulation changes for each servo manufacturer, calibration is needed in order to know the control.

2.5 Visual Studio, C++, and OpenCV Microsofts Visual Studio 2010 was used for the integrated development environment of the project. The programming language of choice is C++ with the OpenCV 2.4.3 library used for image processing and recognition.

Figure 2.3: Screenshot of the Visual Studio 2010 Integrated Development Environment An overhead camera is used as the input for image detection. The setup is shown in figure 2.4.

Figure 2.5: Setup of overhead camera.

2.6 Discussion on Different Methods to Solve Problem The main goal is to have a noise resistant robot arm that would be able to solve the problem of the inverse kinematics. The following shows the methods considered in the hardware design of the robot arm: Problem 1: Method to control robot o Arduino Microcontroller

o Pololu Mini Maestro o Ultimately, the Pololu Mini Maestro was chosen because of an easier interface to the computer. If the Arduino was used, an additional servo-controller shield is necessary. Thus, using the Arduino also would lead to a more expensive cost. Problem 2: Method to detect blue block o Stereovision (the Kinect) o Overhead camera o Ultimately, an overhead camera was chose (again) because of an easier interface (stereovision is not required for 2 dimensions). However in the future, a change to using stereovision might be done to detect the 3D coordinates of the block Problem 3: Method for reducing noise o Kalman Filter o Extensive reduction of image processing noise o High quality and precise servo motors o PID controller (Proportional-integral-derivative controller) o Ultimately, the Kalman Filter was used. However, reduction of noise in image processing was also done via thresholding. High quality servo motors was not considered because of high costs. The use of only a PID controller was also not used because the Kalman Filter is optimal for linear functions.

2.7 Multi-stage Process The project is decomposed into 3 different phases: Phase 1: Software implementation of Kalman Filter and inverse kinematics o Phase 1 involves creating software to autonomously and accurately solve the inverse kinematics problem. This is achieved by using image processing to find the position of an object, using equations to solve the inverse kinematics, and to use a Kalman Filter to improve precision of robot control. Phase 2: Hardware implementation of Faddeevs Algorithm to efficiently perform matrix operations o Phase 2 involves the hardware implementation of Faddeevs Algorithm, which will then be used for an efficient implementation of a Kalman filter in Hardware. The Faddeevs Algorithm circuit created will be tested through hardware simulation. Phase3: Combined implementation of Kalman Filter in hardware and software robot control

o Phase 3 involves the following: successfully running a Kalman Filter on a FPGA, integration of the hardware Kalman Filter and the software robot control/image processing, and lastly speeding up image processing rates through CUDA or a FPGA

3. Phase 1: Inverse Kinematics and the Kalman Filter


3.1 The Inverse Kinematics Problem Inverse kinematics is the problem of calculating motions to get to a specified location of the end-effector (the tip of a robot arm). It is the opposite of forward kinematics, which is the problem of predicting the location of the end-effector given the joint movements. The inverse kinematics problem is commonly used in industrial robots to help with automotive assembly.

3.2 Inverse Kinematics in the Project In the project, the problem of inverse kinematics is to move to robot arm to a position in front of the blue block, and then to pick it up. Since only the x and y coordinates are used (the z coordinate will be implemented later) and that there are no obstacles in the environment (physical obstacles), trigonometry can be used to calculate the angles that the arm needs to rotate. Refer to figure 3.1 below.

Equation to finding angle of rotation: ( )

Equation to finding length from block:

Figure 3.1: Method for finding inverse kinematics of given problem Figure 3.1 shows the method for calculating the data necessarily to solve the inverse kinematics problem. A triangle can be constructed from the points of the robot arm (in initial position) as well as the position of the blue block (measured from camera by image recognition). Assuming that the position of the robot arm base is at (0, 0), the length of x can be simply determined by the x coordinate of the blue block, and likewise for the length of y. Then, angle is calculated by taking the arctangent of x/y (equation is derived in figure 3.2). ( )

( ) ( ( )) ( ) Figure 3.2: Derivation of equation for calculating angle in the problem The distance between the robot base and the blue block can be calculated using the Pythagorean Theorem. This is possible because the lengths of the base and height of the triangle are already known. A simple derivation is shown in figure 3.3. ( )

Figure 3.3: Derivation of equation for calculating distance to blue block After the robot is at the desired angle calculated with the equations above, the next step is to calculate the inverse kinematics to move the robot arm such that the gripper surrounds the block. This optimal point which the gripper (specifically the connection between the gripper and the wrist) has to move to is located at point (r, z). The movement of the arm to point (r, z) is also not trivial because in order to move the arm forward, three different joints have to be move simultaneously. Therefore, a different set of inverse kinematics equations is needed. Figure 3.4 shows a diagram of triangles that can be constructed to find the angles of each joint. The figure is drawn in respect to the Z axis (the X and Y axes must stay the same otherwise the robot would not be facing the right direction).

Figure 3.4: Calculation of inverse kinematics for moving towards the block (position to grab the block located at the grip point) Calculation of point (r, z), the Grip point r = r - (sin(gripAngle) * gripLength) z = z - baseHeight + (cos(gripAngle) * gripLength)

Calculations of angles required to move to the Grip point h = sqrt(z2 * r2) / 2 elbowAngle = asin(h / armLength) * 2 shoulderAngle = atan2(z / r) + ((PI - elbowAngle) / 2) wristAngle = PI + gripAngle - shoulderAngle - elbowAngle Figure 3.5: Equations to calculate points and angles shown in figure 3.3.

3.3The Kalman Filter

The Kalman Filter is utilized in this project to optimize the control and accuracy of the robot arm. It is able to make a model of a problem more noise resistant. Because in the real world, noise arise everywhere (lighting, slight movement of camera, inaccurate servo motors in the robot arm), having a more noise resistant model leads to high accuracy and precision. To simply put it, the Kalman Filter filters out noise, leaving only useful information. The Kalman Filter has three main parts: prediction, measurement, and correction. The general equations are shown in figure 3.3. The Kalman filter first predicts a state using a model, obtains some sort of measurement from sensors, and then uses both the predicted state and the measurements to create a more accurate state (see figure 3.5). Over time, the model of the problem will become closer to having almost no error. Before explaining each of the states separate, I will present an example outlining the parts of a Kalman Filter.

Figure 3.6: General equations of the Kalman Filter (source: http://bilgin.esme.org/BitsBytes/KalmanFilterforDummies.aspx)

Figure 3.7: Example graph of a Kalman Filter and corresponding measurements in a robot localization problem. (Source: http://2.bp.blogspot.com/_Eogk12oYmg/S7ZvOnBbTVI/AAAAAAAAAKg/2_IoODo8KoQ/s640/KalmanFusion.png)

3.3.1 Kalman Filter Example Suppose that there is a river in which you want to monitor the water level. You have a model to predict the water level every hour. However, the model is not 100% accurate. Therefore, you send some to the river to physically check the water level. But the person is only able to check the water level once a day during noon. Also, the measured water level from the person is not 100% accurate. As a result, a Kalman Filter is used to combine the outputs from both sources (model and person) to give a better estimate. In this example, the three stages are as following: Prediction theoretical model of river level Measurement physical person checking the river level Correction using both of the outputs from the prediction and measurement stage to create a better estimate.

Figure 3.8: Sample graph of the river problem. 3.3.2 Model for Kalman Filter One of the most important steps in the Kalman Filter is to create a model for the problem (general form equation shown in figure 3.7). The purpose of such model is to create a prediction (of the next state) for the current problem. If all the parameters in the project fit into the equations without contradiction, then a Kalman Filter can be used. In the first equation (which calculates the next state in time), is the state being predicted, A is a transition matrix (similar to slope in y=mx+b; it tells how the model should progress over time), is the previous state, B is a general form matrix or constant, is a control-input, and is the process noise value. The second equation calculates the measurement vector for the model. It contains the following variables: is a general matrix (typically used for unit conversion), is the current measured state, and is a variable denoting measurement noise.

Figure 3.9: Equations of the model for a Kalman Filter* *Note that the general equation will vary depending on function of Kalman Filter and type of Kalman Filter. The equation commonly used is: Xk = Fk1Xk1 + Bk1uk1 + Wk1. I have changed the variable name from F to A to match with the equations in the latter portions of this section. The first equation uses the form of y = mx + b, but also generalizes the equations to matrices and added variables for noise. This is because the common form of a Kalman Filter is for linear functions (although different variants like the Extended Kalman Filter can be used for non-linear models). In the model, a process noise value is added ( ). This value represents the

variation in the model. For example, in the river height problem described above, the process noise may be describing the percent error that the model has (since almost all models are not perfectly accurate). The second equation is more simplistic than the first. It is used to refine the measurement data. A measurement noise value, which is an estimation of the noise occurring in measurement, is included in the equation. This is because noise has to be accounted for in the model. The general equation (figure 3.7) can be decomposed into smaller equations. These equations form three major stages of the Kalman Filter: prediction, measurement, and correction. Note that these derived equations will vary depending of the current problem, and therefore are simplified.

3.3.3 Prediction Stage The first stage in the Kalman Filter is the prediction stage. In this stage, a model is used to predict the future state of a problem (i.e. river level model). In the equations shown below, the variables have the same meaning as the model shown in figure 3.7.

Figure 3.10: Equations for the prediction stage of the Kalman Filter. The first equation calculates the value of , which is the value of next state before correction. Notice that the only difference from the equation in Figure 3.8 to the model is a noticeable lack of the process noise variable. This is because in many models, process noise isnt apparent in the model. However, the variable is necessary for problems with significant processing noise. The next equation is used to calculate the value of the error covariance ( ) before correction. It introduces new values into the equations. Namely these values are: which is the previous error covariance value, and Q is the estimated process error covariance.

3.3.4 Measurement Stage The next stage of the Kalman Filter is the measurement stage. The purpose of this step is to simply obtain some sort of data from measurement. Therefore, there are no general equations for this stage. This process occurs in the river level problem as the person goes to physically measure the depth of the water.

3.3.5 Correction Stage The last and most important stage of the Kalman Filter is the correction stage. The purpose of this stage is to give a more accurate predicted state from the data outputted from both the prediction and measurement stages. To do this, there are three equations in the stage, and they are displayed in figure 3.9. New values introduced into the equations are: which is the measurement error covariance matrix, and which is the Kalman Gain. ( ( ( ) ) )

Figure 3.11: Equations used in the correction stage of the Kalman Filter The first equation calculates the Kalman Gain ( ). The Kalman Gain is a correction factor. It determines how much correction (measurement prediction) is needed. If the Kalman Gain is value 0.5, then 50% of correction is added. If the value is closer to 0, then the predicted state has a higher weight than the measurement state. When the value is 0, then the measurement state is completely ignored. The opposite happens as the Kalman Gain approaches 1. The second equation utilizes the Kalman Gain to calculate the value of the next (corrected) state. The Kalman Gain takes the difference between measurement and prediction values and applies it to the current state. Finally, the third equation calculates the new error covariance value (in a very similar manner to the calculation of the new state).

3.4 Implementation of Kalman Filter into Current Inverse Kinematics Problem Inverse Kinematics is used to determine both the rotation of the base and the forward movement of the robot arm. The Kalman Filter is used to optimize the values for the rotation of the base. The forward movement currently does not use Kalman Filters because in the current problem, the length of the gripper and the length of the block are fairly large, and can tolerate larger error values. However in the future, a Kalman Filter will be implemented to the forward movement of the robot arm. This is to increase precision in applications where grabbing can tolerate little error. Two Kalman Filters were used in the project. One Kalman Filter was used to minimize error in the rotation of the robot base. The other Kalman Filter was used to minimize error in the position of the blue block. This is because the position of the blue block is crucial to solving the

problem, and that an accurate position of the block is necessary. However, both Kalman Filters use the same model (only with different measurement values) The model of the Kalman Filter uses velocity, as were as position (x, y). This model works because given the previous velocity of the arm; a prediction can be made by adding the previous velocities to the previous state. Thus, the state ( ) is represented by a 4 by 1vector. The vector is in the format (x, y, velocity_x, velocity_y). The transition matrix (A) performs the operation of adding the velocity to the current state, and is shown in figure 3.12.
Velocity X Velocity Y

`
X X Y Velocity X Velocity Y Y

Figure 3.12: Transition matrix (A) In the prediction stage, the values/methods described above are used to determine the next state. In the measurement stage, image detection is done with the overview camera, and the (x, y) coordinates of both the block and robot arm (denoted by a yellow dot on the gripper) are recorded. Finally, the correction stage takes both the predicted state, as well as the image processing measured state to produce a more optimal state. An approximation was done to calculate the values of the error matrices. The process noise matrix was a 4 x 4 matrix with values of .0001 on the diagonal (from top left to bottom right). The measurement noise was calculated by taking a quick sample of camera measurements (x and y position) of the blue block. Three trials were done in total, with the block being in a different position each time (the sample shown in Appendix B). Then, the variation of each trial was taken, and then the variations were averaged. As a result, the matrix shown in figure 3.13 was found.

Figure 3.13: (Left) process noise matrix; (Right) measurement noise matrix. Notice that this matrix is only 2x2 because the velocity is not measured via measurement stage.

3.5 Implementation of Kalman Filter in C++ There actually exists a function in OpenCV to implement a Kalman Filter. It consists for four separate functions: one to initialize the Kalman Filter, one to predict to next state, one to receive and store the measurements, and one to correct the current state. The program coding is shown in figure 3.13. Initialization:
KalmanFilter KF(4, 2, 0); Mat_<float> state(4, 1); /* (x, y, Vx, Vy) */ Mat processNoise(4, 1, CV_32F); Mat_<float> measurement(2,1); measurement.setTo(Scalar(0)); KFs.statePre.at<float>(0) KFs.statePre.at<float>(1) KFs.statePre.at<float>(2) KFs.statePre.at<float>(3) = = = = x; y; 0; 0; x; y; 0; 0; 0,1,0,1, 0,0,1,0,

KFs.statePost.at<float>(0) KFs.statePost.at<float>(1) KFs.statePost.at<float>(2) KFs.statePost.at<float>(3)

= = = =

KFs.transitionMatrix = *(Mat_<float>(4, 4) << 1,0,1,0, 0,0,0,1);

setIdentity(KFs.measurementMatrix); setIdentity(KFs.processNoiseCov, Scalar::all(1e-4)); KFs.measurementNoiseCov = *(Mat_<float>(2, 2) << 0.197492,0, setIdentity(KFs.errorCovPost, Scalar::all(.1));

0,0.052607);

General Coding to Predict (Step 1)


Point step1() { Mat prediction = KFs.predict(); Point predictPt(prediction.at<float>(0),prediction.at<float>(1)); return predictPt; }

General Coding to Input Measurement*


void changeMeasure(float x,float y) { measurements(0) = x; measurements(1) = y; }

General Coding to Correct (Step 2)


Point step2() { Mat estimated = KFs.correct(measurements);

Point statePt(estimated.at<float>(0),estimated.at<float>(1)); return statePt; }

Figure 3.14: Example coding for the Kalman Filter *This function is not part of OpenCV (it is a user-defined function)

4. Phase 1: Software Design (Image Processing) and Verification


4.1 Software Outline The flowchart of the program is shown in figure 4.1. The general idea is as following (in order): Detect location of blue block using image processing Calculate inverse kinematics to the block o Use Kalman Filtering to correct the inverse kinematics such that the robot base is aligned with the blue block o Also use Kalman Filtering to continuously monitor the location of the blue block. When robot base is at right angle, move the arm forward and pickup the block.

Figure 4.1: Flowchart of Program 4.2. Computer Vision 4.2.1 Computer Vision

Computer vision is a computer with a camera to simulate human vision. It is how the computer can see the outside world. The image is actually a grid of numbers which represents the color of the pixel. After the image is captured, then it can be processed by computer. However, if the programmer was to do this from scratch, it would take lots of work to write code for processing each pixel. That is one of the reasons why OpenCV was made, to have an easier time programming.

4.2.2 OpenCV OpenCV is a free, open source, computer vision library. OpenCV is an external library for c and c++. Computer vision is used a lot in our common life. Such examples are Googles Street View, unmanned vehicles, etc.

4.2.3 Hue, Saturation, and Value Hue, Saturation, and Value, otherwise known as HSV, are used to detect the images color. The hue value is the pure color. Saturation is the colors strength (pure color to gray). Value is the brightness of the color. Thus, we can separate the image to get the grayscale of pure color, strength, and brightness.

http://ilab.usc.edu/wiki/index.php/HSV_And_H2SV_Color_Space

Figure 4.2.1: The HSV color cone.

4.2.4 Object Detection Object detection is important for making an autonomous machine. Without object detection, a robot can only rely on GPS or other navigational sensors. However, with image processing to know what to avoid, what to grab, etc. A method of object detection is using contours.

4.2.5 Contours Using contours is a way of image detection. Contours checks if all of the pixels surrounding an area are of the same color and keeps data of the shape. The easiest way for object detection is by using a binary image. A binary is a black and white image. Because there are only two colors, it is easier to tell which area is white and which area is black. If OpenCV did not exist, then there would be a loop checking over and over a pixel and its surroundings. It would use chain codes to store the information of each contour. However, OpenCV has a function called cvFindContours that does this process. All that is needed is a binary image and a storage area for the contours.

4.2.6 Chain Codes Chain codes are what stores the information of each and every contour. One of the most commonly used chain codes is Freemans chain code. Freemans chain code is not standardized, so the numbers may vary from source to source. This chain code contains an encoding for each of the eight cardinal directions. An example of the compass would be:
7 6 5 4 3 0 1 2

Figure 4.2.2: Example of a compass for Freemans chain code.

4.2.7 Contour Area The contour area is a shape feature. It can also be used to filter out small, noise contours. The contour area is the area inside contour. This could also be the number of pixels in an object. In OpenCV, the function cvContourArea finds the area of a given contour.

Figure 4.2.3: The contour area of a contour. The contour area of the contour (light blue) would be two pixels

4.2.8 Bounding Box A bounding box is a box which bounds or surrounds an object. The box would have the same length of the objects longest length, and the same width of the objects longest width.

Figure 4.2.4: A bounding box of a triangle (green). The bounding boxs area would be four pixels

4.2.9 Aspect Ratio The aspect ratio is an objects longest length divided by the objects longest width. This is also another shape feature. Since the bounding box has the length of the objects longest length, and the same width of the objects longest width. The aspect ratio is simply found by dividing the length and the width of the box.

Figure 4.2.5: The aspect ratio in the image is shown in purple

4.2.10 Extent The extent is a shape feature. The extent of an object can be found by dividing the bounding box area by the contour area. This is also known as the percentage of coverage.

Figure 4.2.6: The extent is found by dividing the bounding box (green) area by the contour area (red). The result is the black triangle.

4.2.11 Circularity The circularity of an object is the difference in area of an objects area to a circles area with the same perimeter. The equation is P2 / 4A where P is the perimeter and A is the area. The less circular a shape is, the higher the circularity.

Figure 4.2.7: The circularity of the triangle is the black circle.

4.2.12 Centroid Finding the centroid of a shape is very useful in robotics. Using the center coordinates, the robot can then relocate itself to the object and pick it up. In order to find the centroid of an object, moments is used. Moments are the following equation:

Figure 4.2.8: Moments equation; http://en.wikipedia.org/wiki/Image_moments. The centroid can be found by substituting values for i and j. By substituting i=0, j=1, into the equation and then dividing it by i=0, j=0, (the contour area), the center of the x-axis is found. By using i=1, j=0, and the same procedure is done, the center of the y-axis is found.

4.3 Image Processing The main purpose of the image processing stage is to locate the position of the blue block and the position of the robot arm via a yellow dot attached to the gripper. The positions would then be used to calculate the inverse kinematics, to determine when the robot is at the right angle, and also as input for measurement to the Kalman Filter. To determine when the robot arm is at the right angle, the angle of the base to the block is compared with the angle of the base to the robot arm (yellow dot). If they are equal (can have variation of up to 2 degrees), then the robot arms gripper has a straight line to pick up the block.

Figure 4.3: Determination if robot is at right angle. If the angles of both triangles shown above are equivalent, then the robot is directly facing towards the block. The whole image processing function can be described a flowchart. The first process is to receive a frame from the camera. Then, the frame is converted to 3 separate grey scale images: hue, saturation, and value. Using color thresholding (can be calibrated) the image is converted into a binary image. Then contours are found in the image (boundaries of objects). The contours are filtered using predetermined perimeters, and then the (x, y) position on the image is calculated by finding the center of the contour.

Figure 4.4: Flowchart for image processing function

4.4 Verification of Program The software must be verified in order to analyze if the goal was completed. Therefore, the robot arm will be verified in the following manner: 1. 2. 3. 4. 5. Assemble structure to mount overlooking camera (robot arm already assembled) Program software: 1 program with Kalman Filter and 1 program without Kalman Filter Calibrate robot movement from the overlooking camera Place the blue block in arbitrary location (can be from list of fixed locations) Run program with Kalman Filter. Observe if robot is able to pick up the block. 1 signifies that the block was picked up. 0 signifies that the block was missed. 6. Repeat step 5 multiple times (recommended at least 3) 7. Repeat step 5 and 6with the other program 8. Repeat steps 4-6 with different locations of the block Three different tests were done: one without additional noise, one with processing noise (shaking of the block), and one with measurement noise (shaking of the camera).

4.5 Results

Environment with no artificial noise:

Use of Location Kalman of Block Filter? A B C D E F G No Yes No Yes No Yes No Yes No Yes No Yes No Yes

Trial 1 0 1 0 1 0 0 1 1 1 1 1 1 1 0

Trial 2 1 1 0 0 1 1 1 1 1 1 1 1 1 1

Trial 3 0 1 1 1 0 1 1 1 1 1 1 1 1 1

It can be seen that even though the block was almost always successfully grab, the use of a Kalman Filter did improve the accuracy. However, at least one of the algorithms failed in one trial for the block placement at point A, B, C, and G. It should also be noted that points B and C have the highest percentage failure in total (3 in total: 2 for algorithm without Kalman Filter, and 1 for algorithm with Kalman Filter). Calculated accuracy percentage: No Kalman Filter: 76% Kalman Filter: 86%

Environment with artificial process noise:

Use of Location Kalman of Block Filter? No A B C D E F G Yes No Yes No Yes No Yes No Yes No Yes No Yes

Trial 1

Trial 2

Trial 3

0 1 0 1 1 1 1 0 0 1 1 1 1 1

1 1 1 1 1 1 0 1 1 1 1 1 1 1

1 1 0 1 0 1 1 1 1 1 1 1 1 1

It can be seen that the accuracy of both programs were not as different as a clean environment. This is because the when the robot moves forward, it touches the underlying paper and ultimately holds the block in place. Unlike the environment without noise, there was no significant trend to points that would cause failure. Calculated accuracy percentage: No Kalman Filter: 71% Kalman Filter: 95%

Environment with artificial noise measurement noise:

Use of Location Kalman of Block Filter? No A B C D E F G Yes No Yes No Yes No Yes No Yes No Yes No Yes

Trial 1

Trial 2

Trial 3

1 1 0 1 1 1 1 1 0 1 0 1 0 0

0 1 1 1 1 1 0 0 0 1 1 1 1 1

1 1 1 1 1 0 1 1 0 1 1 1 0 1

It can be seen that the accuracy of the software without Kalman Filter performed a lot worse, but the software with the Kalman filter stayed the same as in an environment without noise. Surprisingly unlike the other two environments, the software without Kalman Filter did not succeed in any trials with point E. Calculated accuracy percentage: No Kalman Filter: 57% Kalman Filter: 86%

5. Logic Synthesis and Systolic Processing


5.1 Problem outline One of the most major problems with the current implementation of the Kalman Filter is that the speed is too slow. There are two main reasons: 1) the Kalman Filter is not implemented in parallel, and 2) a medium-end computer does not have enough processing power to run image processing, detection, and the Kalman Filter smoothly. Therefore, one idea to fix both problems together is to use a hardware implementation of the Kalman Filter. By using the FPGA, we are able to have parallelism both internally (circuit design) and with the computer (2 threads will be running, the FPGA and the computer). Also, a hardware implementation will have much better speed than the computer because hardware is more optimized to the problem.

5.2 Review of Literature: Logic Synthesis 5.2.1 Boolean Algebra and Switching Algebra Boolean algebra is a two-valued algebraic system that was invented by George Boole in the book An Investigation of the Laws of Thought. It is a variant of algebra, in which the only values operated on was 0 (for false) and 1 (for truth). It is used in mathematic logic, computer programming, and digital logic. In the 1930s, Claude Shannon observed that Boolean algebra can be used for logic circuits and gates. Thus, he introduced switching algebra, a variant of Boolean algebra which can be used for analyzing logic circuits. In logic synthesis, both Boolean algebra and switching algebra are commonly used.

5.2.2 Truth Tables Truth tables are a way to graphical display the function of the current circuit. In standard truth tables, the inputs are on the left, and the outputs are on the right. In the columns for the input, a line for every single combination of values for each input is drawn. Normally, the values of the inputs are listed in normal binary coding. The output column shows the output for the corresponding input values on the left. Inputs A 0 0 0 0 1 Output(s)

Combinations of all possible input values

B C X 0 0 0 0 1 1 1 0 0 1 1 0 0 0 0

Corresponding output to the input combination

1 1 1

0 1 1

1 0 1

0 1 0

Figure 5.2.2: Sample truth table with description of each part.

5.2.3 Karnaugh Maps Karnaugh maps (K-map) are another graphical way to display functions on. However, unlike truth tables, it is very efficient at showing how to graphically synthesize circuits. In order to find minterms to a function in a K-map, one would only need to find groups of 1s (or 0s in POS) with size 2n.(more description about groups and Hamming Distance) An example of finding a POS function is shown in figure 5.4. Input values (for cd) In graycode encoding cd 00 ab 00 0 01 0 11 10 1 0 01 1 1 1 1 11 0 1 0 0 10 0 0 1 1 X Output(s) Figure 5.3: Karnaugh Map example with descriptions. Output value for corresponding inputs

Inputs

Input values (for ab) In graycode encoding

ac bc

cd ab 00 1 00 1 01 1 11 0 10

01 1 1 1 0

11 0 0 0 0

10 1 0 0 1 X

bcd

F(x) = ac + bc + bcd
Figure 5.4: Example Karnaugh Map with minterms labeled and function at bottom.

Notice that in Figure 5.4, the group of six is not grouped as one minterm. This is because in binary logic, there are 2n combinations for n bits, and 6 is not a result of 2n. It can also be noticed that there can exist groups which wrap around the edge of the Karnaugh Map. It is allowed because 00 and 10 are in Hamming Distance one.

5.2.4 Commonly Used Gates The following subsections will describe logic gates which are commonly used in logic synthesis.

5.2.4.1 AND Gate The AND gate is one of the most basic gates. It is commonly used in deriving other larger and complex gates and circuit. The output of the AND gate is 1 when both inputs are 1, and 0 otherwise. In the equation representation of the AND operator, the plus sign + is used. A X B A B 0 0 0 1 1 0 1 1 X 0 0 0 1

X=A+B

Figure 5.5: (Left) The graphical symbol for the AND gate, (Center) the truth table for the AND gate, (Right) the equation representation of the AND gate.

5.2.4.2 OR Gate The OR gate is another basic gate. Like the AND gate, it is commonly used in deriving other larger and complex gates and circuit. The output of the OR gate is 1 when either inputs are 1, and 0 otherwise. In the equation representation of the OR operator, the multiplication sign is used. A X B A B 0 0 0 1 1 0 1 1 X 0 1 1 1

X=AB

Figure 5.6: (Left) The graphical symbol for the OR gate, (Center) the truth table for the OR gate, (Right) the equation representation of the OR gate.

5.2.4.3 NOT Gate The NOT gate (or inverter) is another basic gate. Unlike both the AND and OR gate, the NOT gate has only 1 input. The function of the NOT gate is to invert the input. If the input is 0, the output is 1, and vica-versa. There several different notations for the NOT operation, and two are described in Figure 5.8. A X A X 0 1 1 0

X=A X = ~A X=A

Figure 5.7: (Left) The graphical symbol for the NOT gate, (Center) the truth table for the NOT gate, (Right) the equation representations of the NOT gate. Sometimes, the when the not gate is appended to the input or output of a logic gate, it is denoted with an empty circle.

=
Figure 5.8: Different notations of a not gate appended to a AND gate. 2. 1.4.4 XOR Gate Although the XOR (also called the EXOR gate or the Exclusive OR) gate is a combination of AND, OR, and NOT gates (shown in figure 5.2.4.4.1), it is fundamental to several logic synthesis methods such as ESOP minimization. The XOR gate is also the most basic gate in quantum technologies. In the XOR gate, the output is 1 when only one of the inputs has value 1, and 0 otherwise. The notation of the XOR operator is .

A B X

Figure 5.9: Decomposition of XOR gate to primitive AND, OR, and NOT gates.

A X B

A B 0 0 0 1 1 0 1 1

X 0 1 1 0

X=AB

Figure 5.10: (Left) The graphical symbol for the XOR gate, (Center) the truth table for the XOR gate, (Right) the equation representation of the XOR gate.

2. 1.4.5 Multiplexer The multiplexer is another gate produced by combining the primary gates (AND, OR, NOT). The function of the multiplexer is equivalent to a selector, where there is two inputs, and a third input wire determines which of the first two inputs are chosen for the output. Figure 5.11 shows the decomposition of the multiplexer to primary gates.
A

Control

Figure 5.11: Decomposition of multiplexer to its primary gates.

A B

A 0 0 0 0 1 1 1 1

B Control X 0 0 0 0 1 0 1 0 0 1 1 1 0 0 1 0 1 0 1 0 1 1 1 1

Control

Figure 5.12: (Left) The symbol for the multiplexer, (Right) the corresponding truth table to the multiplexer.

5.2.5 Arithmetic Gates 5.2.5.1 Adder

The adder is a gate which performs the arithmetic addition operator. There are two types of adder designs, the half adder and the full adder. The half adder takes two 1 bit inputs, and outputs a 2 bit output. The circuit is shown in figure 5.13.

X Half Sum Y

Carry-Out

A B 0 0 0 1 1 0 1 1

HS CO 0 0 1 0 1 0 0 1

Figure 5.13: Half Adder with truth table A full adder enables there to be inputs of arbitrary bits to be added together. For each bit addition, a carry can also be added as well. When full adders are combined, they create a ripple adder, which can be used to add numbers of multiple bits. X Y CIN S COUT 0 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 1 1 0 1 1 0 0 1 0 1 0 1 0 1 1 1 0 0 1 1 1 1 1 1

X Y Carry-In Sum

Carry-Out

Figure 5.14: Full Adder with truth table

X3

Y3

X2

Y2

X1

Y1

X0

Y0

X C4

Y C3

Y C2

Y C1

Y C0

COUT CIN S

COUT CIN S

COUT CIN S

COUT CIN S

S3

S2

S1

S0

Figure 5.15: Ripper Adder using full adders 5.2.5.2 Multiplier The function of the multiplier is to perform the multiplication operation. The circuit is constructed based on multiplication using addition. Thus, the circuit shown in figure 5.16 is constructed using half adders. Note that the multiplication is done using binary inputs, instead of integers.

X2

Y1

X2

X1

X2 X1 X Y2 Y1 Z4 Z3 Z2 Z1

Half Adders

Z4

Z3

Z2

Z1

Figure 5.16: Decomposition of multiplier to primitive gates.

5.2.6 Clock A clock is a waveform generator. It is an input to the circuit which continuously switches between the values 0 and 1. Several logic gates used in state machines require the use of a clock pulse, where the gate is activated by a rising edge (from 0 to 1) or a falling edge (from 1 to 0). Figure 5.17 describes the clock waveform.

tper

tper = Period 1/tper = Frequency

Figure 5.17: Clock waveform with description.

5.2.7 Flip-Flops and Latches Flip-flops and latches are the basis of sequential circuits. They are used in the majority of larger circuits (circuits used in industry). Flip-flops and latches can be used as gates which store memory. Following standard conventions, flip-flops are devices which change the output only at clock pulses, while latches are devices which change the outputs whenever the input is changed. As only synchronous circuits will be used, the discussion of latches will be limited.

5.2.7.1 D-Latch The D-latch is a latch which simply stores bits of information. The circuit is derived from the SR latch, and is shown in figure 5.18. It is used in asynchronous circuits and can also be used to build D Flip-Flops.

D Q

Enable

Figure 5.18: Decomposition of D-latch to primitive gates.

D En

Q Q

D 0 0 1 1

En 0 1 0 1

Q Last Q Last Q 0 1

Q Last Q Last Q 1 0

Figure 5.19: D-latch with truth table.

5.2.7.2 D-Flip Flop The D Flip-Flop is simply a D-latch in synchronous logic. The inputs are stored on each clock pulse instead of when the input changes. This flip-flop is derived from D-latches, where only a reader for the clock slope is needed (positive edge and negative edge). The notch in figure 5.20 represents a clock input. D 0 0 1 1 Clk X X Rising Rising Q Last Q Last Q 0 1 Q Last Q Last Q 1 0

Q Q

Figure 5.20: D Flip-flop with truth table.

5.3 Review of Literature: Field Programmable Gate Array and Verilog-HDL 5.3.1 Field-Programmable Gate Array

Field-programmable gate arrays, commonly known as FPGA, are integrated circuits which are re-programmable. FPGAs are also cost efficient, and can be programmable using hardwire description languages. The features of a FPGA can overcome the difficulty of using application-specific integrated circuits (ASIC). The interior of a FPGA commonly consist of logic elements. These logic elements consist of several logic gates (AND, OR) and also memory elements (flip-flops). For educational use, development boards for FPGA are made. These boards contain many IO (input and output) devices (switches, LEDs) and also on board memory (RAM).

Figure 5.21: A Terasic development board with a FPGA from Altera (from: http://www.terasic.com.tw/attachment/archive/573/image/image_65_thumb.jpg)

5.3.2 Hardware description languages Hardware descriptions languages are used in order to design electrical circuits. They are often used in programming FPGAs. A program in these languages can describe circuit operation, design, and organization. The most common hardware description languages are Verilog-HDL and VHDL.

5.3.3 Verilog-HDL Verilog-HDL is a hardware description language used to program electronic systems. It is based off the popular language of C. Verilog was created by Phil Moorby and Prabhu Goel during 1984. Since then, Verilog has become recognized as an IEEE standard, with the most recent extension in 2005.

5.4 Systolic Processing

Systolic processing is a method of designing parallel logic circuits. It is a set of simple processing elements with regular and local connections which takes external inputs and processes them in a predetermined manner in a pipelined fashion. To do this, the circuit uses clock pulses to control multiple registers at the same time. A systolic processor can be in 1 or 2 dimensions.

Figure 5.22: Two different types of layouts of a systolic processor. Each tile represents a register. One of the central concepts to is the idea of pipelining. In a pipeline system, the clock is used to push data from one register to the next simultaneously. This means that as soon as a register releases its stored information, it receives new information. At the same time, in between each register there can be multiple arithmetic/logic gates. Thus, we can process multiple inputs and the same time. One example where it is beneficial to use systolic processing is the concept of matrix multiplication. Matrix multiplication consists of both multiplication and addition. A way to construct such systolic processor for the problem of matrix multiplication is to emulate a loop by simultaneous performing addition and multiplication to a cell. The structure of a 2D array of shift registers helps shift the data around, resulting in the parallel calculation of numbers. Figure 5.23 shows the circuit mentioned.

][

Figure 5.23a: Matrix multiplication problem

B02 B01 B00

B12 B11 B10 0

B22 B21 B20 0 0

A02 A01 A00

C00 C01 C02

A12 A11 A01 0

C10 C11 C12

A22 A12 A02 0

C20 C21 C22

Figure 5.23b: Top level diagram of systolic processor for matrix multiplication

Figure 5.23c: Internal module of circuit found in figure 5.23a However, these variants are not the most optimal circuit/algorithm for the problem of matrix multiplication. The most optimal circuit uses something called Faddeevs Algorithm.

6. Phase 2: Hardware Implementation of Faddeevs Algorithm


6.1 Faddeevs Algorithm Faddeevs Algorithm is a general purpose algorithm that is useful for a wide range of matrix operations. It is especially useful in a systolic implementation as many operators can work in parallel to solve the otherwise slow operation of calculating the inverse matrix. In the figure below, by replacing the matrices A, B, C, and D with different values, a variety of matrix operations can be solved (output is in matrix D).

Figure 6.1: The matrix operations that Faddeevs Algorithm can perform. 6.2 Chuang and Hes Implementation Chuang and He presented in A versatile systolic array for matrix computations an implementation of the algorithm in hardware. Their realization contains 2 main modules: 1) Neighbor pivoting (Interchangement of rows to create 0s and ultimately a triangular structure) 2) Gaussian elimination (to create 0s, especially for matrices C and D) In the implementation, data moves from left to right and from top to bottom simultaneously. Shifting is done at the same time for all registers, which occurs after the outputs of each cells are calculated. The circuit is shown in figure 6.2.

Figure 6.2: General systolic structure of Faddeevs algorithm There are two types of cells in the circuit: a boundary cell and an internal cell. The boundary cell is responsible for creating a coefficient that will be able to create a zero. The internal cell is responsible for adjusting values using the created coefficient. An example is shown in figure 6.3. Each of the cells also has two different modes, which is control with a toggle. The toggle is enabled when the input to a cell is from the C or D matrices, and changes the function of the cell from Neighbor Pivoting to Gaussian Elimination. Pseudocode for each cell is shown in figure 6.4.

( )

( )

( )

( )

Figure 6.3: Example concept of boundary and internal cells. The boundary cells calculate the value needed to eliminate x (.5), and the internal cells multiply the value to the rest of the equation. Neighbor Pivoting:

Boundary Cell: Xin

Mout Vout

If |Xin| |X| then Begin Vout = 1 If Xin 0 then Mout = -X/Xin Else Mout = 0 X = Xin End Else Vout = 0 Mout = /Xin/X

Internal Cell: Xin If Vin = 1then Begin Xout = X + (Min * Xin) = Min X = Xin = Vin End Else Xout = Xin + (Min * X)

Min Vin

X
Xout

Mout Vout

Gaussian Elimination:

Boundary Cell: Xin

Mout Null

Mout = -Xin / X

Internal Cell: Xin

Min Null

X
Xout

Mout = Min Null Xout = Xin + Min * X

Figure 6.4: Function of the Boundary Cells and the Internal Cells

6.4 Problems the Chuang and He Implementation Although Chuang and He have proposed a theoretical method to realize Faddeevs algorithm in hardware, the actual implementation is more complex. This is because Chuang and Hes model assumes that there are no delays in the circuit. However, on actual hardware, the time it takes to perform an operation may crucially affect the outcome of other operations. For example: if the value of the divider output is used before it finishes, an undesirable value will be received as the output and will be used for future calculations. This is shown in figure 6.5. If data is taken before divider is done:

unknown 8 2

Divider

Adder

unknown

Figure 6.5: Importance of timing in actual hardware

6.5 Macropulses and Micropulses The novel idea of a macropulse and a micropulse are used to create an error-free synthesizable circuit. In the context of a systolic processor, a macropulse is the pulse that determines to shifting of data from register to register. A micropulse is an internal pulse that regulates the calculations inside each cell (operations between shifting of data between cells). Figure 6.6 shows an example waveform for a macropulse and a micropulse.

Figure 6.6: Graphical representation of micropulses and macropulses In the circuit realized, the micropulse and macropulse are generated by a finite state machine (FSM). The finite state machine works by separating parts of the circuit into different sections, and then controlling each part separately in a way that no errors occur. The order of calculations for each cell is show below.

Boundary Cell: Comparator |Xin| |X| Comparator Xin 0

Calculate -X/Xin Calculate -Xin/X

Mux to select assignment of Xout and X

Assign values to register and toggle macropulse

Internal Cell: Comparator Vin = 1

Calculate X+(Min*Xin) Calculate Xin+(Min*X)

Mux to select assignment of Xout and X

Assign values to register according to above macropulse

Figure 6.7: FSM for partition of circuit for boundary cell and internal cell

6.6 Fixed-Point Notation Another important issue is the calculation of decimal numbers in binary. It is necessary because in a Kalman Filter, the precise calculations are made using values like .0001. Decimal numerals can be represented in binary using fixed-point notation. The fixed-point notation consists of two sections: one for whole numerical values, and one for decimal values. The decimal portion is represented as following: (1 = , 11 = , 111 = 1/8, etc). This project uses a fixed-point notation as following: 21 bits for a real value, and 11 bits for a decimal value (32 bits in total).

Figure 6.8: Fixed-point notation used for project An example of the use of this notation is shown as following: 3.75 = 00000000000000000001111000000000 It can be seen as: 000000000000000000011.11000000000

6.7 Schematic of Circuit for Boundary and Internal Cells *Note for all circuits: anything that is 8bit in the drawing is actually 32 bits Boundary Cell:

Partitioning Finite State Machine Connected as enable to each gate (not shown)

Internal Cell:

Partitioning Finite State Machine Connected as enable to each gate (not shown)

7. Extension: 3D Location of Block


7.1 Problem: In reality, the real world isnt limit to two dimensions. Therefore, an implementation of software to solve inverse kinematics for three dimensions is necessary.

(x, y, z) z

y x (0, 0, 0)

Figure 7.1: Example diagram of 3D for inverse kinematics

7.2 Extension of Inverse Kinematics to 3D Space The equations for calculating inverse kinematics used in the software can already calculate the inverse kinematics for three dimensions. First, the base rotation is calculated in the X and Y axes. Then, a second angle is calculated in respect to the X and Z axes. Finally, the distance between the robot arm and the base is calculated with the respective servo angles needed to move that distance

Calculation of point (r, z), the Grip point r = r - (sin(gripAngle) * gripLength) z = z - baseHeight + (cos(gripAngle) * gripLength)

Calculations of angles required to move to the Grip point h = sqrt(z2 * r2) / 2 elbowAngle = asin(h / armLength) * 2 shoulderAngle = atan2(z / r) + ((PI - elbowAngle) / 2) wristAngle = PI + gripAngle - shoulderAngle - elbowAngle Figure 7.2: Equations used earlier for inverse kinematics. This set of equations can already solve for the third dimension In order to detect all three coordinates (X, Y, and Z) of the block, an additional camera is necessary. The camera could either be used for stereovision (like the Kinect), or the camera could be added to the side of the setup where it would see the block from the side instead of from the top. However, if the use of another camera would slow down the processing of the computer

even further. Therefore, the Kalman Filter implementation will have to be quickened by using parallelism with CUDA or a FPGA (as explained in section 4).

9. Conclusion
The goal of the project, which was to create a highly accurate and noise-resilient robot arm to solve the inverse kinematics problem, was obtained through the use of image processing and a Kalman Filter. In Phase 1, the program was able to successfully grab a blue block placed in any arbitrary position in the range of the robot arm. As shown in the results, the addition of a Kalman Filter to the software always increased the accuracy and the noise resistance. The Kalman Filter had an average of 89% success rate as compared to a 68% success rate for the software without a Kalman Filter. Phase 2 showed a method for a hardware implementation of Faddeevs Algorithm, which is useful in matrix computations. The method uses the concept of macropulses and micropulses to regulate the flow of data in the algorithm. The algorithm was written in Verilog-HDL and verified using the ModelSim simulator on various testbenches (and example matrix computations). The idea of Phase 3 is to combine the methods used in Phase 1 and Phase 2 to create more effective robot control. Phase 1 gives the general algorithm to solve inverse kinematics using image recognition. Phase 2 shows a hardware implementation for fast calculation of matrix operations. In Phase 3, the main goals are as following: Use Faddeevs Algorithm to efficiently implement a Kalman Filter into a hardware circuit and have it working on a FPGA. Connect the FPGA to the computer to higher accuracy of robot control (FPGA will be used for the Kalman Filter)

The following figures shows a flowchart of the connections in phase 3 Camera USB Computer
Image Processing with OpenCV Inverse Kinematics Calculation

Lynxmotion
Servo Controller

FPGA
Kalman Filter o Uses Faddeevs Algorithm to calculate matrices

USB

USB

Nios II Embedded Processor

Another important concept to be realized in Phase 3 is the speed-up of image processing. By improving the calculation speed of image processing, the software will be able to have realtime processing without delay from the camera. This can be done by moving the image processing stage onto the FPGA or CUDA. Lynxmotion
Servo Controller

Camera USB

USB Computer
Inverse Kinematics Calculation

FPGA
Image Processing Kalman Filter o Uses Faddeevs Algorithm to calculate matrices

USB

Nios II Embedded Processor

9. References
Bradski, G., & Kaehler, A. (2008). Learning OpenCV. Sebastopol, CA: O'Reilly Media, Inc. Brunl, T. (2006). Embedded Robotics. Springer-Verlag.Chen, G., & Guo, L. (2005). The FPGA Implementation Of Kalman Filter. Proceedings of the 5th WSEAS Int. Conf. on Signal Processing, Computational Geometry & Artificial Vision, 61-65. Retrieved from http://www.wseas.us/e-library/conferences/2005malta/papers/499-146.pdf Chen, S. Y. "Kalman filter for robot vision: a survey." Industrial Electronics, IEEE Transactions on 59.11 (2012): 4409-4420. Cheng, Alan, et al. "Methodology to Create Hardware Oracles for Solving Constraint Satisfaction Problems", ULSI Conference 2013 Chu, P. P. (2008). FPGA Prototyping by Verilog Examples. Hoboken, NJ: John Wiley & Sons. DeSouza, Guilherme N., and Avinash C. Kak. "Vision for mobile robot navigation: A survey." Pattern Analysis and Machine Intelligence, IEEE Transactions on 24.2 (2002): 237-267. Farrell, J. (2001). Object-Oriented Programming Using C++. Canada: Course Technology. Haskell , R. E. (2010). Digital Design Using Digilent FPGA Boards (3rd ed.). Rochester Hills, MI: LBE Books LLC. Hen-Geul Yeh (1986). Kalman filtering and systolic processors. IEEE International Conference on ICASSP '86. , vol.11, no., pp. 2139- 2142, Apr 1986. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1168641&isnumber=26344 Ishaque, B., Haider, B., Wasid, M., Alaul, S., Hassan, K., Ahsan, T., ... Alam, M. (2004). An Evolutionary Algorithm to Solve Cryptarithmetic Problem. TRANSACTIONS ON ENGINEERING, COMPUTING AND TECHNOLOGY, VI, 305-313. Kent, J. (2003). C++ Demystified. Emeryville, CA: McGraw-Hill. Parker, J. R. (1994). Practical Computer Vision Using C. Canada: John Wiley & Sons, Inc.. Prata, S. (2004). C++ Primer Plus (5th ed.). Indianapolis, IN: SAMs Publishing. Vahid, F., & Lysecky, R. (2007). Verilog for Digital Design. Hoboken, NJ: Wiley & Sons. Van Dinh Le, H. (1988) A New General Purpose Systolic Array for Matrix Computations, Portland State University Master Thesis in Electrical Engineering

Wakerley, J. F. (2006). Digital Design (4th ed.). Upper Saddle River, NJ: Pearson Prentice Hall. Yuen, S. G., Kettler, D. T., Novotny, P. M., Plowes, R. D., & Howe, R. D. (2009). Robotic motion compensation for beating heart intracardiac surgery.The International journal of robotics research, 28(10), 1355-1372. Yuen, S. G., P. M. Novotny, and R. D. Howe. "Quasiperiodic predictive filtering for robotassisted beating heart surgery." Robotics and Automation, 2008. ICRA 2008. IEEE International Conference on. IEEE, 2008. Zhao, Qi-jie, et al. "Kalman filter based vision predicting and object tracking method and its application [J]." Optics and Precision Engineering 5 (2008): 028.

Appendix A: Program Source Code

// Kalman Filter to Optimize Inverse Kinematics // Author: Alan Cheng // [NOTE] The program is not polished, and so there may still be unused variables // The computer to Mini Maestro interface coding is taken from Pololu #include #include #include #include #include #include #include #include #include #include #include #include #include "stdafx.h" <fstream> <iostream> <opencv2/imgproc/imgproc.hpp> <opencv2/video/tracking.hpp> <opencv2/highgui/highgui.hpp> <Windows.h> <String> <stdio.h> <cv.h> <math.h> <vector> <ctype.h>

using namespace cv; using namespace std; #define CVX_RED #define CVX_GREEN #define CVX_BLUE CV_RGB(0xff,0x00,0x00) CV_RGB(0x00,0xff,0x00) CV_RGB(0x00,0x00,0xff)

#define CVX_CYAN CV_RGB(0x00,0xff,0xff) #define CVX_MAGENTA CV_RGB(0xff,0x00,0xff) #define CVX_YELLOW CV_RGB(0xff,0xff,0x00) #define CVX_WHITE #define CVX_BLACK #define CVX_GRAY50 #define #define #define #define 3.94" #define CV_RGB(0xff,0xff,0xff) CV_RGB(0x00,0x00,0x00) CV_RGB(0x88,0x88,0x88) //base hight 2.65" //shoulder-to-elbow "bone" 5.75" //elbow-to-wrist "bone" 7.375" //gripper (incl.heavy duty wrist rotate mechanism) length

BASE_HGT 67.31 HUMERUS 95.0 ULNA 105.0 GRIPPER 80.00 PI 3.1415926

//--------------------------double tempcenterx = -1; double tempcentery = -1; double centerx = -1; double centery = -1; double ccenterx = -1; double ccentery = -1; int bcenterx = 0; int bcentery = 0; double distancex = 0; double distancey = 0; double bdistancex = 0; double bdistancey = 0; double xOffset = 95.00; //77 double pixelToMM = .5; //.461 float hum_sq = HUMERUS*HUMERUS;

float uln_sq = ULNA*ULNA; //--------------------------struct blueblock{ public: double aspectlow, aspecthigh, compactlow, compacthigh, extentlow, extenthigh, arealow, areahigh; }; struct colcircle{ public: double aspectlow, aspecthigh, compactlow, compacthigh, extentlow, extenthigh, arealow, areahigh; }; struct hsvThresh { int hT; int sT; int vT; };

#ifdef _DEBUG #define new DEBUG_NEW #endif

struct mouse_info_struct { int x,y; }; struct mouse_info_struct mouse_info = {-1,-1}, last_mouse;

class Kalman { public: Kalman() { KalmanFilter KF(4, 2, 0); Mat_<float> state(4, 1); /* (x, y, Vx, Vy) */ Mat processNoise(4, 1, CV_32F); Mat_<float> measurement(2,1); measurement.setTo(Scalar(0)); KFs = KF; measurements = measurement; } void setKalman(float x, float y) { KFs.statePre.at<float>(0) KFs.statePre.at<float>(1) KFs.statePre.at<float>(2) KFs.statePre.at<float>(3) = = = = x; y; 0; 0;

KFs.statePost.at<float>(0) = x;

KFs.statePost.at<float>(1) = y; KFs.statePost.at<float>(2) = 0; KFs.statePost.at<float>(3) = 0; KFs.transitionMatrix = *(Mat_<float>(4, 4) << 1,0,1,0, 0,0,0,1); 0,1,0,1, 0,0,1,0,

setIdentity(KFs.measurementMatrix); setIdentity(KFs.processNoiseCov, Scalar::all(1e-4)); KFs.measurementNoiseCov = *(Mat_<float>(2, 2) << 0.197492,0, setIdentity(KFs.errorCovPost, Scalar::all(.1)); } Point step1() { Mat prediction = KFs.predict(); Point predictPt(prediction.at<float>(0),prediction.at<float>(1)); return predictPt; } Point step2() { Mat estimated = KFs.correct(measurements); Point statePt(estimated.at<float>(0),estimated.at<float>(1)); return statePt; } void changeMeasure(float x,float y) { measurements(0) = x; measurements(1) = y; } private: KalmanFilter KFs; Mat_<float> measurements; };

0,0.052607);

void Image_Processing(Mat frame, blueblock b, hsvThresh hsvB1, hsvThresh hsvB2); void Image_Processing_Circle(Mat frame, colcircle c, hsvThresh hsvC1, hsvThresh hsvC2); void calculateNewAngle(Point endPt, double currentX, double currentY); double calculateAngle(double currentX, double currentY);

//----------------------HANDLE port; void initialposition(); void closeArm(); void moveBase(double bas_angle_r); void gripBar(); void set_arm( float x, float y, float z, float grip_angle_d); void moveRobotArm(double y, double z, double grip_angle_d); //----------------------vector<Point> mousev,kalmanv; void on_mouse(int event, int x, int y, int flags, void* param) { //if (event == CV_EVENT_LBUTTONUP)

{ last_mouse = mouse_info; mouse_info.x = x; mouse_info.y = y; // } } HANDLE openPort(const char * portName, unsigned int baudRate) { HANDLE port; DCB commState; BOOL success; COMMTIMEOUTS timeouts; /* Open the serial port. */ port = CreateFileA(portName, GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL); if (port == INVALID_HANDLE_VALUE) { switch(GetLastError()) { case ERROR_ACCESS_DENIED: fprintf(stderr, "Error: Access denied. Try closing all other programs that are using the device.\n"); break; case ERROR_FILE_NOT_FOUND: fprintf(stderr, "Error: Serial port not found. " "Make sure that \"%s\" is the right port name. " "Try closing all programs using the device and unplugging the " "device, or try rebooting.\n", portName); break; default: fprintf(stderr, "Error: Unable to open serial port. Error code 0x%x.\n", GetLastError()); break; } return INVALID_HANDLE_VALUE; } /* Set the timeouts. */ success = GetCommTimeouts(port, &timeouts); if (!success) { fprintf(stderr, "Error: Unable to get comm timeouts. GetLastError()); CloseHandle(port); return INVALID_HANDLE_VALUE; } timeouts.ReadIntervalTimeout = 1000; timeouts.ReadTotalTimeoutConstant = 1000; timeouts.ReadTotalTimeoutMultiplier = 0; timeouts.WriteTotalTimeoutConstant = 1000; timeouts.WriteTotalTimeoutMultiplier = 0; success = SetCommTimeouts(port, &timeouts); if (!success) cout << "got mouse " << x <<","<< y <<endl;

Error code 0x%x.\n",

{ fprintf(stderr, "Error: Unable to set comm timeouts. GetLastError()); CloseHandle(port); return INVALID_HANDLE_VALUE; } /* Set the baud rate. */ success = GetCommState(port, &commState); if (!success) { fprintf(stderr, "Error: Unable to get comm state. GetLastError()); CloseHandle(port); return INVALID_HANDLE_VALUE; } commState.BaudRate = baudRate; success = SetCommState(port, &commState); if (!success) { fprintf(stderr, "Error: Unable to set comm state. GetLastError()); CloseHandle(port); return INVALID_HANDLE_VALUE; } Error code 0x%x.\n",

Error code 0x%x.\n",

Error code 0x%x.\n",

/* Flush out any bytes received from the device earlier. */ success = FlushFileBuffers(port); if (!success) { fprintf(stderr, "Error: Unable to flush port buffers. GetLastError()); CloseHandle(port); return INVALID_HANDLE_VALUE; } return port; }

Error code 0x%x.\n",

/** Implements the Maestro's Get Position serial command. * channel: Channel number from 0 to 23 * position: A pointer to the returned position value (for a servo channel, the units are quarter-milliseconds) * Returns 1 on success, 0 on failure. * For more information on this command, see the "Serial Servo Commands" * section of the Maestro User's Guide: http://www.pololu.com/docs/0J40 */ BOOL maestroGetPosition(HANDLE port, unsigned char channel, unsigned short * position) { unsigned char command[2]; unsigned char response[2]; BOOL success; DWORD bytesTransferred; // Compose the command. command[0] = 0x90; command[1] = channel; // Send the command to the device.

success = WriteFile(port, command, sizeof(command), &bytesTransferred, NULL); if (!success) { fprintf(stderr, "Error: Unable to write Get Position command to serial port. Error code 0x%x.", GetLastError()); return 0; } if (sizeof(command) != bytesTransferred) { fprintf(stderr, "Error: Expected to write %d bytes but only wrote %d.", sizeof(command), bytesTransferred); return 0; } // Read the response from the device. success = ReadFile(port, response, sizeof(response), &bytesTransferred, NULL); if (!success) { fprintf(stderr, "Error: Unable to read Get Position response from serial port. Error code 0x%x.", GetLastError()); return 0; } if (sizeof(response) != bytesTransferred) { fprintf(stderr, "Error: Expected to read %d bytes but only read %d (timeout). " "Make sure the Maestro's serial mode is USB Dual Port or USB Chained.", sizeof(command), bytesTransferred); return 0; } // Convert the bytes received in to a position. *position = response[0] + 256*response[1]; return 1; } /** Implements the Maestro's Set Target serial command. * channel: Channel number from 0 to 23 * target: The target value (for a servo channel, the units are quarter-milliseconds) * Returns 1 on success, 0 on failure. * Fore more information on this command, see the "Serial Servo Commands" * section of the Maestro User's Guide: http://www.pololu.com/docs/0J40 */ BOOL maestroSetTarget(HANDLE port, unsigned char channel, unsigned short target) { unsigned char command[4]; DWORD bytesTransferred; BOOL success; // Compose command[0] command[1] command[2] command[3] the command. = 0x84; = channel; = target & 0x7F; = (target >> 7) & 0x7F;

// Send the command to the device. success = WriteFile(port, command, sizeof(command), &bytesTransferred, NULL); if (!success)

{ fprintf(stderr, "Error: Unable to write Set Target command to serial port. Error code 0x%x.", GetLastError()); return 0; } if (sizeof(command) != bytesTransferred) { fprintf(stderr, "Error: Expected to write %d bytes but only wrote %d.", sizeof(command), bytesTransferred); return 0; } return 1; }

void Image_Processing(Mat frame, blueblock b, hsvThresh hsv1, hsvThresh hsv2) { Mat img_edge; Mat hsvframe; double area; cvtColor(frame, hsvframe, CV_BGR2HSV); inRange(hsvframe,Scalar(hsv1.hT,hsv1.sT,hsv1.vT),Scalar(hsv2.hT,hsv2.sT,hsv2.vT),i mg_edge); namedWindow("Binary"); imshow( "Binary", img_edge ); //Extract the contours so that vector<vector<Point> > contours0; vector<vector<Point> > contours; vector<Vec4i> hierarchy; Mat cnt_img = Mat::zeros(480, 640, CV_8UC3); findContours( img_edge, contours0, hierarchy, RETR_TREE, CHAIN_APPROX_SIMPLE); contours.resize(contours0.size()); for( size_t k = 0; k < contours0.size(); k++ ) approxPolyDP(Mat(contours0[k]), contours[k], 3, true); /// Get the moments vector<Moments> mu(contours0.size() ); for( int i = 0; i < contours0.size(); i++ ) { mu[i] = moments( contours0[i], false ); } /// Get the mass centers: vector<Point2f> mc( contours.size() ); for( int i = 0; i < contours.size(); i++ ) { mc[i] = Point2f( mu[i].m10/mu[i].m00 , mu[i].m01/mu[i].m00 ); }

for( size_t k = 0; k < contours0.size(); k++ ) { area = contourArea(Mat(contours0[k]), false);

area = fabs(area); CvBox2D box; box = minAreaRect(Mat(contours0[k])); double extent; double aspec_ratio; if(box.size.width > box.size.height) aspec_ratio = box.size.width/box.size.height; else aspec_ratio = box.size.height/box.size.width; extent = area/(box.size.height*box.size.width); CvPoint pt1; CvPoint pt2; CvPoint pt3; CvPoint pt4; CvPoint2D32f pt[4]; if(area != 0) { tempcenterx = mc[k].x; tempcentery = mc[k].y; } if((area > b.arealow) && (area < b.areahigh) && (extent > b.extentlow) && (extent < b.extenthigh) && (aspec_ratio > b.aspectlow) && (aspec_ratio < b.aspecthigh)) { cvBoxPoints( box, pt ); pt1.x = pt[0].x; pt1.y = pt[0].y; pt2.x = pt[1].x; pt2.y = pt[1].y; pt3.x = pt[2].x; pt3.y = pt[2].y; pt4.x = pt[3].x; pt4.y = pt[3].y; line( frame, pt1, pt2, CVX_GREEN, 1, 8 ); line( frame, pt2, pt3, CVX_GREEN, 1, 8 ); line( frame, pt3, pt4, CVX_GREEN, 1, 8 ); line( frame, pt4, pt1, CVX_GREEN, 1, 8 ); pt1.x = (int)tempcenterx; pt1.y = (int)tempcentery-10; pt2.x = (int)tempcenterx; pt2.y = (int)tempcentery+10; line( frame, pt1, pt2, CVX_GREEN, 2, 8 ); pt1.x = (int)tempcenterx-10; pt1.y = (int)tempcentery; pt2.x = (int)tempcenterx+10; pt2.y = (int)tempcentery; line( frame, pt1, pt2, CVX_GREEN, 2, 8 ); centery = tempcentery*pixelToMM+xOffset; centerx = (340.0-tempcenterx)*pixelToMM; } } drawContours( cnt_img, contours, -1, Scalar(128,255,255), 3, CV_AA, hierarchy, 1); namedWindow( "contours", 1 ); imshow("contours", cnt_img); }

void Image_Processing_Circle(Mat frame, colcircle c, hsvThresh hsv1, hsvThresh hsv2) { Mat img_edge; Mat hsvframe; double area; cvtColor(frame, hsvframe, CV_BGR2HSV); inRange(hsvframe,Scalar(hsv1.hT,hsv1.sT,hsv1.vT),Scalar(hsv2.hT,hsv2.sT,hsv2.vT),i mg_edge); namedWindow("Binary2"); imshow( "Binary2", img_edge ); //Extract the contours vector<vector<Point> > contours0; vector<vector<Point> > contours; vector<Vec4i> hierarchy; Mat cnt_img = Mat::zeros(480, 640, CV_8UC3); findContours( img_edge, contours0, hierarchy, RETR_TREE, CHAIN_APPROX_SIMPLE); contours.resize(contours0.size()); for( size_t k = 0; k < contours0.size(); k++ ) approxPolyDP(Mat(contours0[k]), contours[k], 3, true); /// Get the moments vector<Moments> mu(contours0.size() ); for( int i = 0; i < contours0.size(); i++ ) { mu[i] = moments( contours0[i], false ); } /// Get the mass centers: vector<Point2f> mc( contours.size() ); for( int i = 0; i < contours.size(); i++ ) { mc[i] = Point2f( mu[i].m10/mu[i].m00 , mu[i].m01/mu[i].m00 ); }

for( size_t k = 0; k < contours0.size(); k++ ) { area = contourArea(Mat(contours0[k]), false); area = fabs(area); CvBox2D box; box = minAreaRect(Mat(contours0[k])); double extent; double aspec_ratio; if(box.size.width > box.size.height) aspec_ratio = box.size.width/box.size.height; else aspec_ratio = box.size.height/box.size.width; extent = area/(box.size.height*box.size.width); CvPoint pt1; CvPoint pt2; CvPoint pt3; CvPoint pt4;

CvPoint2D32f pt[4]; if(area != 0) { tempcenterx = mc[k].x; tempcentery = mc[k].y; } if((area > c.arealow) && (area < c.areahigh) && (extent > c.extentlow) && (extent < c.extenthigh) && (aspec_ratio > c.aspectlow) && (aspec_ratio < c.aspecthigh)) { cvBoxPoints( box, pt ); pt1.x = pt[0].x; pt1.y = pt[0].y; pt2.x = pt[1].x; pt2.y = pt[1].y; pt3.x = pt[2].x; pt3.y = pt[2].y; pt4.x = pt[3].x; pt4.y = pt[3].y; line( frame, pt1, pt2, CVX_GREEN, 1, 8 ); line( frame, pt2, pt3, CVX_GREEN, 1, 8 ); line( frame, pt3, pt4, CVX_GREEN, 1, 8 ); line( frame, pt4, pt1, CVX_GREEN, 1, 8 ); pt1.x = (int)tempcenterx; pt1.y = (int)tempcentery-10; pt2.x = (int)tempcenterx; pt2.y = (int)tempcentery+10; line( frame, pt1, pt2, CVX_GREEN, 2, 8 ); pt1.x = (int)tempcenterx-10; pt1.y = (int)tempcentery; pt2.x = (int)tempcenterx+10; pt2.y = (int)tempcentery; line( frame, pt1, pt2, CVX_GREEN, 2, 8 ); ccentery = tempcentery*pixelToMM+xOffset; ccenterx = (340.0-tempcenterx)*pixelToMM; } } drawContours( cnt_img, contours, -1, Scalar(128,255,255), 3, CV_AA, hierarchy, 1); namedWindow( "contours2", 1 ); imshow("contours2", cnt_img); }

double calculateAngle(double currentX, double currentY) { double angle = 0.0; angle =currentY/currentX; return angle; }

void initialposition() { BOOL success; unsigned short target, position;

int servo; maestroGetPosition(port, 0, &position); maestroSetTarget(port, maestroSetTarget(port, maestroSetTarget(port, maestroSetTarget(port, maestroSetTarget(port, } void moveBase(double bas_angle_r) { BOOL success; unsigned short position; 0, 1, 2, 3, 4, 5650); 4844); 9324); 9207); 1);

maestroGetPosition(port, 0, &position); printf("Current position is %d.\n", position); double bas_servopulse = position + (( ( bas_angle_r*180.0/PI )) * 40.0 ); unsigned short target = (unsigned short)(bas_servopulse+0.5); cout << "Setting target to "<< target << " (" << target/4 << " us)." << endl; maestroSetTarget(port, 0, target); } void gripBar() { BOOL success; double target; unsigned short position; maestroGetPosition(port, 4, &position); printf("Current position is %d.\n", position); target = 10000; cout << "Setting target to "<< target << " (" << target/4 << " us)." << endl; maestroSetTarget(port, 4, target); }

void set_arm( float x, float y, float z, float grip_angle_d) { float grip_angle_r = grip_angle_d*PI/180.0; //grip angle in radians for use in calculations //Base angle and radial distance from x,y coordinates float bas_angle_r = atan2( x, y ); float rdist = sqrt(( x * x ) + ( y * y )); //rdist is y coordinate for the arm y = rdist; //Grip offsets calculated based on grip angle float grip_off_z = ( sin( grip_angle_r )) * GRIPPER; float grip_off_y = ( cos( grip_angle_r )) * GRIPPER; //Wrist position

float wrist_z = ( z - grip_off_z ) - BASE_HGT; float wrist_y = y - grip_off_y; //Shoulder to wrist distance ( AKA sw ) float s_w = ( wrist_z * wrist_z ) + ( wrist_y * wrist_y ); float s_w_sqrt = sqrt( s_w ); //s_w angle to ground float a1 = atan2( wrist_z, wrist_y ); //s_w angle to humerus float a2 = acos((( hum_sq - uln_sq ) + s_w ) / ( 2 * HUMERUS * s_w_sqrt )); //shoulder angle float shl_angle_r = a1 + a2; float shl_angle_d = shl_angle_r*180.0/PI; //elbow angle float elb_angle_r = acos(( hum_sq + uln_sq - s_w ) / ( 2 * HUMERUS * ULNA )); float elb_angle_d = elb_angle_r*180.0/PI; float elb_angle_dn = -( 180.0 - elb_angle_d ); //wrist angle float wri_angle_d = ( grip_angle_d - elb_angle_dn ) - shl_angle_d; //Servo pulses float bas_servopulse float shl_servopulse float elb_servopulse float wri_servopulse

= = = =

5816.0 4844.0 6000.0 6000.0

+ (( ( bas_angle_r*180.0/PI )) * 40.0 ); + (( shl_angle_d - 90.0 ) * 27.0 ); - (( elb_angle_d - 90.0 ) * 40.0 ); + ( wri_angle_d * 40.0 );

//Set servos unsigned short target0, target1, target2, target3; target0 = (bas_servopulse+0.5); target1 = (shl_servopulse+0.5); target2 = (elb_servopulse+0.5); target3 = (wri_servopulse+0.5); maestroSetTarget(port, maestroSetTarget(port, maestroSetTarget(port, maestroSetTarget(port, } void moveRobotArm( double length, double z, double grip_angle_d) { double grip_angle_r = grip_angle_d*PI/180.0; //grip angle in radians for use in calculations double rdist = length; //rdist is y coordinate for the arm double y = rdist; //Grip offsets calculated based on grip angle float grip_off_z = ( sin( grip_angle_r )) * GRIPPER; float grip_off_y = ( cos( grip_angle_r )) * GRIPPER; 0, 1, 2, 3, target0); target1); target2); target3);

//Wrist position float wrist_z = ( z - grip_off_z ) - BASE_HGT; float wrist_y = y - grip_off_y; //Shoulder to wrist distance ( AKA sw ) float s_w = ( wrist_z * wrist_z ) + ( wrist_y * wrist_y ); float s_w_sqrt = sqrt( s_w ); //s_w angle to ground //float a1 = atan2( wrist_y, wrist_z ); float a1 = atan2( wrist_z, wrist_y ); //s_w angle to humerus float a2 = acos((( hum_sq - uln_sq ) + s_w ) / ( 2 * HUMERUS * s_w_sqrt )); //shoulder angle float shl_angle_r = a1 + a2; float shl_angle_d = shl_angle_r*180.0/PI; //elbow angle float elb_angle_r = acos(( hum_sq + uln_sq - s_w ) / ( 2 * HUMERUS * ULNA )); float elb_angle_d = elb_angle_r*180.0/PI; float elb_angle_dn = -( 180.0 - elb_angle_d ); //wrist angle float wri_angle_d = ( grip_angle_d - elb_angle_dn ) - shl_angle_d; //Servo pulses float shl_servopulse = 4844.0 + (( shl_angle_d - 90.0 ) * 27.0 ); float elb_servopulse = 6000.0 - (( elb_angle_d - 90.0 ) * 40.0 ); float wri_servopulse = 6000.0 + ( wri_angle_d * 40.0 ); //Set servos unsigned short target1, target2, target3; target1 = (shl_servopulse+0.5); target2 = (elb_servopulse+0.5); target3 = (wri_servopulse+0.5);

maestroSetTarget(port, 1, target1); maestroSetTarget(port, 2, target2); maestroSetTarget(port, 3, target3); }

void closeArm() { BOOL success; unsigned short target, position; int servo; maestroSetTarget(port, 4, 10000); } int _tmain(int argc, _TCHAR* argv[])

{ blueblock bb; colcircle cc; hsvThresh hsvB1; hsvThresh hsvB2; hsvThresh hsvC1; hsvThresh hsvC2; char * portName; double deltaX=0, deltaY=0; double barAngle = 0.0; double circleAngle = 0.0; double barDistance = 0.0; double circleDistance = 0.0; vector<Point> barPredict; vector<Point> circlePredict; vector<Point> barState; vector<Point> circleState; vector<Point> barMeasure; vector<Point> circleMeasure; vector<int> index; int baudRate; portName = "COM4"; baudRate = 9600; port = openPort(portName, baudRate); ifstream contoursettings; FILE *fptr2; fptr2 = fopen("C:\\Users\\Alan\\Documents\\KFDataTest.csv","a"); cout << "Reading files ..." << endl; contoursettings.open("C:\\Users\\Alan\\Documents\\Contour_Input_Block.txt"); contoursettings >> bb.arealow >> bb.areahigh >> bb.aspectlow >> bb.aspecthigh >> bb.extentlow >> bb.extenthigh; contoursettings.close(); FILE *fptr; fptr = fopen("C:\\temp\\barThreshold.txt","r"); fscanf(fptr,"%d %d %d %d %d %d\n", &hsvB1.hT, &hsvB1.sT, &hsvB1.vT, &hsvB2.hT, &hsvB2.sT, &hsvB2.vT); fclose(fptr); contoursettings.open("C:\\Users\\Alan\\Documents\\Contour_Input_Circle.txt"); contoursettings >> cc.arealow >> cc.areahigh >> cc.aspectlow >> cc.aspecthigh >> cc.extentlow >> cc.extenthigh; contoursettings.close(); fptr = fopen("C:\\temp\\circleThreshold.txt","r"); fscanf(fptr,"%d %d %d %d %d %d\n", &hsvC1.hT, &hsvC1.sT, &hsvC1.vT, &hsvC2.hT, &hsvC2.sT, &hsvC2.vT); fclose(fptr); Kalman kalmanCircle; Kalman kalmanBar; VideoCapture cap; cap.open(0);

Mat image, frame, frame2; initialposition();

centerx=-1; centery=-1; ccenterx=-1; ccentery=-1; for(;;) { cap >> frame; frame.copyTo(frame2); Image_Processing(frame, bb, hsvB1, hsvB2); Image_Processing_Circle(frame2, cc, hsvC1, hsvC2); namedWindow("Camera"); imshow( "Camera", frame ); namedWindow("Camera1"); imshow( "Camera1", frame2 ); char c = (char)waitKey(10); if( c == 27 ) { break; } } cap.release(); destroyWindow("Camera"); destroyWindow("Camera1"); destroyWindow("contours"); destroyWindow("contours2"); destroyWindow( "Binary" ); destroyWindow( "Binary2" ); cap.open(0); centerx=-1; centery=-1; ccenterx=-1; ccentery=-1; for(;;) { cap >> frame; frame.copyTo(frame2); Image_Processing(frame, bb, hsvB1, hsvB2); Image_Processing_Circle(frame2, cc, hsvC1, hsvC2); namedWindow("Camera"); imshow( "Camera", frame ); namedWindow("Camera1"); imshow( "Camera1", frame2 ); if(centery>0 && ccentery>0 ) break; char c = (char)waitKey(10); if( c == 27 ) break; } kalmanCircle.setKalman((float)ccenterx,(float)ccentery); kalmanBar.setKalman((float)centerx,(float)centery); double bas_angle_r = atan2( centerx, centery ); double rdist = sqrt(( centerx * centerx ) + ( (centery) * (centery) )); cap.release();

destroyWindow("Camera"); destroyWindow("Camera1"); destroyWindow("contours"); destroyWindow("contours2"); destroyWindow( "Binary" ); destroyWindow( "Binary2" ); barPredict.clear(); barState.clear(); barMeasure.clear(); circlePredict.clear(); circleState.clear(); circleMeasure.clear(); index.clear(); vector<double> barangle; vector<double> circleangle; cap.open(0); for(;;) { moveBase(bas_angle_r); Point statePtCircle; Point statePtBar; Point predictPtCircle; Point predictPtBar; Point measurePtBar; Point measurePtCircle; for(int i=0; i<1; i++) { cap >> frame; frame.copyTo(frame2); centerx = -999; centery =-999; ccenterx=-999; ccentery = -999; predictPtCircle = kalmanCircle.step1(); predictPtBar = kalmanBar.step1(); Image_Processing(frame, bb, hsvB1, hsvB2); Image_Processing_Circle(frame2, cc, hsvC1, hsvC2); if(centerx>-999 && centery >-999 && ccenterx > -999 && ccentery > 999) { measurePtBar.x = centerx; measurePtBar.y = centery; measurePtCircle.x = ccenterx; measurePtCircle.y = ccentery; kalmanCircle.changeMeasure((float)ccenterx,(float)ccentery); kalmanBar.changeMeasure((float)centerx,(float)centery); statePtCircle = kalmanCircle.step2(); statePtBar = kalmanBar.step2(); index.push_back(i); barMeasure.push_back(measurePtBar); circleMeasure.push_back(measurePtCircle); barPredict.push_back(predictPtBar); barState.push_back(statePtBar); circlePredict.push_back(predictPtCircle); circleState.push_back(statePtCircle); } namedWindow("Camera");

imshow( "Camera", frame ); namedWindow("Camera1"); imshow( "Camera1", frame2 ); }

/**/

ccenterx = statePtCircle.x; ccentery = statePtCircle.y; centerx = statePtBar.x; centery = statePtBar.y; if(centery > 0 && ccentery > 0) { double bar_angle = atan2( centerx, centery )*180.0/PI; double circle_angle = atan2( ccenterx, ccentery )*180.0/PI; barangle.push_back(bar_angle); circleangle.push_back(circle_angle); double delta_angle = bar_angle - circle_angle; if(fabs(delta_angle)<2) { moveBase(0); break; } else bas_angle_r = delta_angle *PI/180.0; } else bas_angle_r = 0.0; char c = (char)waitKey(10); if( c == 27 ) break; } FILE *fptrKalman = fopen("c:\\temp\\kalmanBarPredict.csv","w"); for(int i=0; i<barPredict.size(); i++) {

fprintf(fptrKalman,"%d,%d,%d,%d,%d,%d,%d,%f\n",index[i],barMeasure[i].x,barMeasure [i].y,barPredict[i].x,barPredict[i].y,barState[i].x,barState[i].y,barangle[i]); } fclose(fptrKalman); fptrKalman = fopen("c:\\temp\\kalmanCirclePredict.csv","w"); for(int i=0; i<circlePredict.size(); i++) { fprintf(fptrKalman,"%d,%d,%d,%d,%d,%d,%d,%f\n",index[i],circleMeasure[i].x,circleM easure[i].y,circlePredict[i].x,circlePredict[i].y,circleState[i].x,circleState[i].y,circl eangle[i]); } fclose(fptrKalman); moveRobotArm(rdist-60.0,50.0,0.0); Sleep(1000); gripBar(); Sleep(1000); set_arm(0.0, 200.0, 150.0, 0.0); destroyWindow("Camera"); destroyWindow("Camera1"); destroyWindow("contours"); destroyWindow("contours2"); destroyWindow( "Binary" ); destroyWindow( "Binary2" ); Sleep(2000); initialposition();

CloseHandle(port); return(0); }

Appendix B: Phase II Verilog-HDL Program for Faddeevs Algorithm

Circuit: module Divider32Bit( input signed[31:0] dividendIn, input signed[31:0] divisorIn, input start, input clk, output reg signed[31:0] quotient_out, output complete ); reg[31:0] dividend, divisor; //Parameterized values parameter Q = 11; parameter N = 32; reg signed[N-1:0] quotient; reg signed[N-1:0] dividend_copy; reg signed[2*(N-1)-1:0] divider_copy; reg [5:0] bit; reg done; reg XnegFlag, YnegFlag; // initial done = 1; //assign quotient_out = quotient; assign complete = done; always @( posedge clk ) begin if( start ) begin done <= 1'b0; XnegFlag = 0; YnegFlag = 0; if(dividendIn < 0) begin dividend = -1*dividendIn; XnegFlag = 1; end else dividend = dividendIn; if(divisorIn < 0) begin

divisor = -1*divisorIn; YnegFlag = 1; end else divisor = divisorIn; bit <= N+Q-2; quotient <= 0; dividend_copy <= {1'b0,dividend[N-2:0]}; divider_copy[2*(N-1)-1] <= 0; divider_copy[2*(N-1)-2:N-2] <= divisor[N-2:0]; divider_copy[N-3:0] <= 0; //set sign bit if((dividend[N-1] == 1 && divisor[N-1] == 0) || (dividend[N-1] == 0 && divisor[N-1] == 1)) quotient[N-1] <= 1; else quotient[N-1] <= 0; end else if(!done) begin //compare divisor/dividend if(dividend_copy >= divider_copy) begin //subtract dividend_copy <= dividend_copy - divider_copy; //set quotient quotient[bit] <= 1'b1; end //reduce divisor divider_copy <= divider_copy >> 1; //reduce bit counter bit <= bit - 1; //stop condition if(bit == 6'b111111) begin done <= 1'b1; if(((XnegFlag == 1) && ((YnegFlag == 0))) || ((XnegFlag == 0) && ((YnegFlag == 1)))) quotient_out = quotient*-1; else quotient_out = quotient;

XnegFlag = 1'b0; YnegFlag = 1'b0; end end end endmodule module delay(clk, Xin, Xout, loadSig, loadIn); input clk, loadSig; input[31:0] Xin, loadIn; output reg[31:0] Xout; always@(posedge loadSig) begin Xout = loadIn; end always@(posedge clk) begin Xout = Xin; end endmodule module delay1Bit(clk, Xin, Xout, loadSig, loadIn); input clk, loadSig; input Xin, loadIn; output reg Xout; always@(posedge clk) begin if(loadSig == 1) Xout = loadIn; else Xout = Xin; end endmodule //--------------------------------------------//--------------------------------------------module Inequality32Bit(A, B, Out, En); input[31:0] A, B; input En; output reg Out;

always@(posedge En) begin if(A !== B) Out = 1; else Out = 0; end endmodule module Comparator1Bit(A, Out, En); input A, En; output reg Out; always@(posedge En) begin if(A == 1) Out = 1; else Out = 0; end endmodule module Comparator(A, B, Out, En); input signed[31:0] A, B; input En; output reg Out; reg[31:0] X1, X2; always@(posedge En) begin if(A < 0) X1 = A * -1; else X1 = A; if(B < 0) X2 = B * -1; else X2 = B; if(X1 >= X2 ) Out = 1; else Out = 0;

end endmodule module DFF(clk, rst, en, D, Q); input clk, rst, en; input[31:0] D; output reg[31:0] Q; parameter[31:0] ZeroVal = 0; always @(posedge clk or posedge rst) begin if (rst == 1) begin Q = ZeroVal; end else if(en == 1) begin Q = D; end end endmodule module DFF1Bit(clk, rst, en, D, Q); input clk, rst, en; input D; output reg Q; always @(posedge clk or posedge rst) begin if (rst == 1) begin Q = 1'b0; end else if(en == 1) begin Q = D; end end endmodule module Adder32Bit(A, B, C, en); input [31:0] A, B;

input en; output reg[31:0] C; always@(posedge en) begin C = A + B; end endmodule //Need changes module Multiplier32Bit(A, B, Out, en); input signed[31:0] A, B; //Floating point at 11 input en; output reg signed[31:0] Out; reg signed[63:0] C; //Floating point at 22 reg XnegFlag, YnegFlag; reg signed[31:0] X1, X2, Y1; always@(posedge en) begin XnegFlag = 0; YnegFlag = 0; if(A < 0) begin X1 = -1*A; XnegFlag = 1; end else X1 = A; if(B < 0) begin X2 = -1*B; YnegFlag = 1; end else X2 = B; C = X1 * X2; Y1[31] = C[63]; Y1[30:10] = C[41:21]; Y1[10:0] = C[21:11];

if(((XnegFlag == 1) && ((YnegFlag == 0))) || ((XnegFlag == 0) && ((YnegFlag == 1)))) Out = Y1*-1; else Out = Y1; XnegFlag = 1'b0; YnegFlag = 1'b0; end endmodule module NegativeMultiplier32Bit(A, Out, En); input signed[31:0] A; input En; output reg signed[31:0] Out; always@(posedge En) Out = -1 * A; endmodule module Mux3Input32Bit(A, B, D, C, Out, clk); input[31:0] A, B, D; input[1:0] C; input clk; output reg[31:0] Out; always@(posedge clk) begin if(C == 2'b10) Out = B; else if(C == 2'b11) Out = D; else Out = A; end endmodule module ToggleMux(A, B, C, Out, clk); input C, clk; input[31:0] A, B; output reg [31:0] Out; always@(posedge clk) begin

if(C == 0) begin Out = A; end else begin Out = B; end end endmodule //--------------------------------------------module PartitionEnable(clk, DivideDone, reset, Comp1En, Comp2En, MultiEn, DivideEn, Mux1En, Mux2En, regEn, divClk); input clk, reset, DivideDone; output reg Comp1En, Comp2En, MultiEn, DivideEn, Mux1En, Mux2En, regEn, divClk; integer count = 0; always@(posedge clk or posedge reset) begin if(reset == 1) begin Comp1En = 1'b0; Comp2En = 1'b0; MultiEn = 1'b0; DivideEn = 1'b0; Mux1En = 1'b0; Mux2En = 1'b0; regEn = 1'b0; count = 0; end else begin if(count == 0) begin Comp1En = 1'b1; divClk = 1'b0; count = count + 1; end else if(count == 1) begin Comp2En = 1'b1; Comp1En = 1'b0; count = count + 1; end

else if(count == 2) begin MultiEn = 1'b1; Comp2En = 1'b0; count = count + 1; end else if(count == 3) begin DivideEn = 1'b1; MultiEn = 1'b0; count = count + 1; end else if((count == 4) && (DivideDone !== 1)) begin DivideEn = 1'b0; end else if((count == 4) && (DivideDone == 1)) begin Mux1En = 1'b1; DivideEn = 1'b0; count = count + 1; end else if(count == 5) begin Mux2En = 1'b1; Mux1En = 1'b0; count = count + 1; end else if(count == 6) begin regEn = 1'b1; Mux2En = 1'b0; count = count + 1; end else if(count == 7) begin divClk = 1'b1; regEn = 1'b0; count = 0; end end end endmodule //---------------------------------------------

module PartitionEnableIn(clk, divClk, reset, Comp1En, MultiEn, AdderEn, Mux1En, Mux2En); input clk, reset, divClk; output reg Comp1En, MultiEn, AdderEn, Mux1En, Mux2En; integer count = 0; always@(posedge clk or posedge reset) begin if(reset == 1) begin Comp1En = 1'b0; MultiEn = 1'b0; AdderEn = 1'b0; Mux1En = 1'b0; Mux2En = 1'b0; count = 0; end else begin if(count == 0) begin Comp1En = 1'b1; count = count + 1; end else if(count == 1) begin MultiEn = 1'b1; Comp1En = 1'b0; count = count + 1; end else if(count == 2) begin AdderEn = 1'b1; MultiEn = 1'b0; count = count + 1; end else if(count == 3) begin Mux1En = 1'b1; AdderEn = 1'b0; count = count + 1; end else if(count == 4) begin Mux2En = 1'b1; Mux1En = 1'b0;

count = count + 1; end else if(count == 5) begin Mux2En = 1'b0; count = count + 1; end else if(divClk == 1) count = 0; end end endmodule module ClockControl(clk, reset, sig1, sig2, regclk); input clk, reset, sig1, sig2; output reg regclk; reg counter = 0; always@(posedge(sig1&&sig2)) begin counter = 1; end always@(posedge clk or posedge reset) begin if(reset == 1) begin regclk = 1; counter = 0; end else if(counter == 1) begin regclk = 1; counter = counter+1; end else begin regclk = 0; counter = 0; end end endmodule

//--------------------------------------------//--------------------------------------------module BoundaryCell(toggle, toggleOut, clk, regclk, reset, X, V, Mout); input toggle, clk, reset; input signed[31:0] X; output V, regclk, toggleOut; output[31:0] Mout; parameter[31:0] ZeroVal = 0; wire InEqOut1, CompOut1, DivideDone, reset, Comp1En, Comp2En, MultiEn, DivideEn, Mux1En, Mux2En, regEn, divClk; wire divSignal1, divSignal2, InvToggle, DFFEn; wire[31:0] xReg, NegativeX, NegativexReg, Multi1Out, Multi2Out, Mux1Out, M; PartitionEnable EnableAll(clk, DivideDone, reset, Comp1En, Comp2En, MultiEn, DivideEn, Mux1En, Mux2En, regEn, divClk); delay1Bit RegisterT(divClk, toggle, toggleOut, reset, 1'b0); Comparator Comparator1(X, xReg, CompOut1, Comp1En); assign V = CompOut1; not NOTT(InvToggle, toggle); and ANDRegEn(DFFEn, CompOut1, InvToggle); Inequality32Bit Inequality2(X, ZeroVal, InEqOut1, Comp2En); DFF RegisterX(regEn, reset, DFFEn, X, xReg); NegativeMultiplier32Bit Multiplier1(X, NegativeX, MultiEn); NegativeMultiplier32Bit Multiplier2(xReg, NegativexReg, MultiEn); Divider32Bit Divider1(NegativexReg, X, DivideEn, clk, Multi1Out, divSignal1); Divider32Bit Divider2(NegativeX, xReg, DivideEn, clk, Multi2Out, divSignal2); ClockControl DoneSig(clk, reset, divSignal1, divSignal2, DivideDone); assign regclk = divClk; Mux3Input32Bit Mux1(Multi2Out, ZeroVal, Multi1Out, {CompOut1, InEqOut1}, Mux1Out, Mux1En); ToggleMux Mux2(Mux1Out, Multi2Out, toggle, M, Mux2En); delay RegisterM(divClk, M, Mout, reset, 0); endmodule module CompEnable(clk, regclk, outEn, reset);

input clk, regclk, reset; output reg outEn; integer counter = 0; always@(posedge reset) begin outEn = 0; end always@(negedge clk) begin if((regclk == 1) && (counter < 1)) begin counter = counter + 1'b1; outEn = 1; end else outEn = 0; end always@(negedge regclk) begin counter = 0; end endmodule //--------------------------------------------//--------------------------------------------module InternalCell(toggle, toggleOut, clk, regclk, reset, Xin, Xout, Min, Mout, Vin, Vout); input[31:0] Xin, Min; input clk, Vin, toggle, regclk, reset; output Vout, toggleOut; output [31:0] Xout, Mout; wire[31:0] xReg, XMwire, QMwire, XMQwire, QMXwire, Out1; wire MuxVEn, Comp1En, MultiEn, AdderEn, Mux1En, Mux2En; wire V, DFFEn, InvToggle; wire[31:0] M, Xmid; PartitionEnableIn EnableAllIn(clk, regclk, reset, Comp1En, MultiEn, AdderEn, Mux1En, Mux2En); Comparator1Bit ComparatorV(V, regEn, Comp1En); assign V = Vout;

assign M = Min; not NOTT(InvToggle, toggle); and ANDRegEn(DFFEn, regEn, InvToggle); delay1Bit RegisterV(regclk, Vin, Vout, reset, 1'b0); delay1Bit RegisterT(regclk, toggle, toggleOut, reset, 1'b0); delay RegisterM(regclk, Min, Mout, reset, 0); DFF RegisterX(regclk, reset, DFFEn, Xin, xReg); delay RegisterXout(regclk, Xmid, Xout, reset, 0); //delay XinDelay(regclk, Xin, X, reset, 0); Multiplier32Bit MultiplierXM(Xin, M, XMwire, MultiEn); Multiplier32Bit MultiplierQM(xReg, M, QMwire, MultiEn); Adder32Bit AdderXMQ(XMwire, xReg, XMQwire, AdderEn); Adder32Bit AdderQMX(QMwire, Xin, QMXwire, AdderEn); xnor XNORV(MuxVEn, V, 1'b1); ToggleMux Mux1(QMXwire, XMQwire, MuxVEn, Out1, Mux1En); ToggleMux Mux2(Out1, QMXwire, toggle, Xmid, Mux2En); endmodule //--------------------------------------------module FaddeevInputs(clk, clk2, globalLoad, toggle, a11, a12, a13, a14, a21, a22, a23, a24, a31, a32, a33, a34, a41, a42, a43, a44, b11, b12, b13, b14, b21, b22, b23, b24, b31, b32, b33, b34, b41, b42, b43, b44, c11, c12, c13, c14, c21, c22, c23, c24, c31, c32, c33, c34, c41, c42, c43, c44, d11, d12, d13, d14, d21, d22, d23, d24, d31, d32, d33, d34, d41, d42, d43, d44, o1, o2, o3, o4, o5, o6, o7, o8); input [31:0] a11, a12, a13, a14, a21, a22, a23, a24, a31, a32, a33, a34, a41, a42, a43, a44, b11, b12, b13, b14, b21, b22, b23, b24, b31, b32, b33, b34, b41, b42, b43, b44,

c11, c12, c13, c14, c21, c22, c23, c24, c31, c32, c33, c34, c41, c42, c43, c44, d11, d12, d13, d14, d21, d22, d23, d24, d31, d32, d33, d34, d41, d42, d43, d44; output[31:0] o1, o2, o3, o4, o5, o6, o7, o8; output toggle; input clk, globalLoad, clk2; parameter[31:0] ZeroVal = 0; wire[31:0] wireIn1, wireIn2, wireIn3, wireIn4, wireIn5, wireIn6, wireIn7, wireIn8; wire[31:0] wireA12, wireA13, wireA14, wireA21, wireA22, wireA23, wireA24, wireA31, wireA32, wireA33, wireA34, wireA41, wireA42, wireA43, wireA44, wireB11, wireB12, wireB13, wireB14, wireB21, wireB22, wireB23, wireB24, wireB31, wireB32, wireB33, wireB34, wireB41, wireB42, wireB43, wireB44, wireC11, wireC12, wireC13, wireC14, wireC21, wireC22, wireC23, wireC24, wireC31, wireC32, wireC33, wireC34, wireC41, wireC42, wireC43, wireC44, wireD11, wireD12, wireD13, wireD14, wireD21, wireD22, wireD23, wireD24, wireD31, wireD32, wireD33, wireD34, wireD41, wireD42, wireD43, wireD44; wire[31:0] wire031, wire041, wire042, wire111, wire112, wire113, wire121, wire122, wire123, wire124, wire131, wire132, wire133, wire134, wire135, wire141, wire142, wire143, wire144, wire145, wire146; delay DelayA11(clk, wireA21, wireIn1, globalLoad, a11); delay DelayA12(clk, wireA31, wireA21, globalLoad, a21); delay DelayA13(clk, wireA41, wireA31, globalLoad, a31); delay DelayA14(clk, wireC11, wireA41, globalLoad, a41); delay DelayC11(clk, wireC21, wireC11, globalLoad, c11); delay DelayC12(clk, wireC31, wireC21, globalLoad, c21); delay DelayC13(clk, wireC41, wireC31, globalLoad, c31); delay DelayC14(clk, ZeroVal, wireC41, globalLoad, c41); //--------delay Delay020(clk, wireA12, wireIn2, globalLoad, ZeroVal); delay DelayA21(clk, wireA22, wireA12, globalLoad, a12); delay DelayA22(clk, wireA32, wireA22, globalLoad, a22); delay DelayA23(clk, wireA42, wireA32, globalLoad, a32); delay DelayA24(clk, wireC12, wireA42, globalLoad, a42); delay DelayC21(clk, wireC22, wireC12, globalLoad, c12);

delay DelayC22(clk, wireC32, wireC22, globalLoad, c22); delay DelayC23(clk, wireC42, wireC32, globalLoad, c32); delay DelayC24(clk, ZeroVal, wireC42, globalLoad, c42); //--------delay Delay030(clk, wire031, wireIn3, globalLoad, ZeroVal); delay Delay031(clk, wireA13, wire031, globalLoad, ZeroVal); delay DelayA31(clk, wireA23, wireA13, globalLoad, a13); delay DelayA32(clk, wireA33, wireA23, globalLoad, a23); delay DelayA33(clk, wireA43, wireA33, globalLoad, a33); delay DelayA34(clk, wireC13, wireA43, globalLoad, a43); delay DelayC31(clk, wireC23, wireC13, globalLoad, c13); delay DelayC32(clk, wireC33, wireC23, globalLoad, c23); delay DelayC33(clk, wireC43, wireC33, globalLoad, c33); delay DelayC34(clk, ZeroVal, wireC43, globalLoad, c43); //--------delay Delay040(clk, wire041, wireIn4, globalLoad, ZeroVal); delay Delay041(clk, wire042, wire041, globalLoad, ZeroVal); delay Delay042(clk, wireA14, wire042, globalLoad, ZeroVal); delay DelayA41(clk, wireA24, wireA14, globalLoad, a14); delay DelayA42(clk, wireA34, wireA24, globalLoad, a24); delay DelayA43(clk, wireA44, wireA34, globalLoad, a34); delay DelayA44(clk, wireC14, wireA44, globalLoad, a44); delay DelayC41(clk, wireC24, wireC14, globalLoad, c14); delay DelayC42(clk, wireC34, wireC24, globalLoad, c24); delay DelayC43(clk, wireC44, wireC34, globalLoad, c34); delay DelayC44(clk, ZeroVal, wireC44, globalLoad, c44); //----------------//----------------delay Delay110(clk, wire111, wireIn5, globalLoad, ZeroVal); delay Delay111(clk, wire112, wire111, globalLoad, ZeroVal); delay Delay112(clk, wire113, wire112, globalLoad, ZeroVal); delay Delay113(clk, wireB11, wire113, globalLoad, ZeroVal); delay DelayB11(clk, wireB21, wireB11, globalLoad, b11); delay DelayB12(clk, wireB31, wireB21, globalLoad, b21); delay DelayB13(clk, wireB41, wireB31, globalLoad, b31); delay DelayB14(clk, wireD11, wireB41, globalLoad, b41); delay DelayD11(clk, wireD21, wireD11, globalLoad, d11); delay DelayD12(clk, wireD31, wireD21, globalLoad, d21); delay DelayD13(clk, wireD41, wireD31, globalLoad, d31); delay DelayD14(clk, ZeroVal, wireD41, globalLoad, d41); //--------delay Delay120(clk, wire121, wireIn6, globalLoad, ZeroVal);

delay Delay121(clk, wire122, wire121, globalLoad, ZeroVal); delay Delay122(clk, wire123, wire122, globalLoad, ZeroVal); delay Delay123(clk, wire124, wire123, globalLoad, ZeroVal); delay Delay124(clk, wireB12, wire124, globalLoad, ZeroVal); delay DelayB21(clk, wireB22, wireB12, globalLoad, b12); delay DelayB22(clk, wireB32, wireB22, globalLoad, b22); delay DelayB23(clk, wireB42, wireB32, globalLoad, b32); delay DelayB24(clk, wireD12, wireB42, globalLoad, b42); delay DelayD21(clk, wireD22, wireD12, globalLoad, d12); delay DelayD22(clk, wireD32, wireD22, globalLoad, d22); delay DelayD23(clk, wireD42, wireD32, globalLoad, d32); delay DelayD24(clk, ZeroVal, wireD42, globalLoad, d42); //--------delay Delay130(clk, wire131, wireIn7, globalLoad, ZeroVal); delay Delay131(clk, wire132, wire131, globalLoad, ZeroVal); delay Delay132(clk, wire133, wire132, globalLoad, ZeroVal); delay Delay133(clk, wire134, wire133, globalLoad, ZeroVal); delay Delay134(clk, wire135, wire134, globalLoad, ZeroVal); delay Delay135(clk, wireB13, wire135, globalLoad, ZeroVal); delay DelayB31(clk, wireB23, wireB13, globalLoad, b13); delay DelayB32(clk, wireB33, wireB23, globalLoad, b23); delay DelayB33(clk, wireB43, wireB33, globalLoad, b33); delay DelayB34(clk, wireD13, wireB43, globalLoad, b43); delay DelayD31(clk, wireD23, wireD13, globalLoad, d13); delay DelayD32(clk, wireD33, wireD23, globalLoad, d23); delay DelayD33(clk, wireD43, wireD33, globalLoad, d33); delay DelayD34(clk, ZeroVal, wireD43, globalLoad, d43); //--------delay Delay140(clk, wire141, wireIn8, globalLoad, ZeroVal); delay Delay141(clk, wire142, wire141, globalLoad, ZeroVal); delay Delay142(clk, wire143, wire142, globalLoad, ZeroVal); delay Delay143(clk, wire144, wire143, globalLoad, ZeroVal); delay Delay144(clk, wire145, wire144, globalLoad, ZeroVal); delay Delay145(clk, wire146, wire145, globalLoad, ZeroVal); delay Delay146(clk, wireB14, wire146, globalLoad, ZeroVal); delay DelayB41(clk, wireB24, wireB14, globalLoad, b14); delay DelayB42(clk, wireB34, wireB24, globalLoad, b24); delay DelayB43(clk, wireB44, wireB34, globalLoad, b34); delay DelayB44(clk, wireD14, wireB44, globalLoad, b44); delay DelayD41(clk, wireD24, wireD14, globalLoad, d14); delay DelayD42(clk, wireD34, wireD24, globalLoad, d24); delay DelayD43(clk, wireD44, wireD34, globalLoad, d34); delay DelayD44(clk, ZeroVal, wireD44, globalLoad, d44); //-----------------

toggleOn toggleActivator(clk2, globalLoad, wireIn4, toggle); assign o1 = wireIn1; assign o2 = wireIn2; assign o3 = wireIn3; assign o4 = wireIn4; assign o5 = wireIn5; assign o6 = wireIn6; assign o7 = wireIn7; assign o8 = wireIn8; endmodule module toggleOn(clk, reset, In5, toggle); input clk, reset; input [31:0] In5; output reg toggle; always@(posedge clk or posedge reset) begin if(reset == 1) toggle = 0; else if(In5 !== 0) toggle = 1; end endmodule

module FaddeevMain(clk, globalLoad, toggle, regclk, In1, In2, In3, In4, In5, In6, In7, In8, e11, e12, e13, e14, e21, e22, e23, e24, e31, e32, e33, e34, e41, e42, e43, e44); input [31:0] In1, In2, In3, In4, In5, In6, In7, In8; input toggle, clk, globalLoad; output regclk; output [31:0] e11, e12, e13, e14, e21, e22, e23, e24, e31, e32, e33, e34, e41, e42, e43, e44; parameter[31:0] ZeroVal = 0; wire V11, V12, V13, V14, V15, V16, V17, V18, V22, V23, V24, V25, V26, V27, V28, V33, V34, V35, V36, V37, V38, V44, V45, V46, V47, V48;

wire signed[31:0] M11, M12, M13, M14, M15, M16, M17, M18, M22, M23, M24, M25, M26, M27, M28, M33, M34, M35, M36, M37, M38, M44, M45, M46, M47, M48; wire signed[31:0] X11, X12, X13, X14, X15, X16, X17, X18, X22, X23, X24, X25, X26, X27, X28, X33, X34, X35, X36, X37, X38, X44, X45, X46, X47, X48; wire signed[31:0] DelayW11, DelayW12, DelayW13, DelayW21, DelayW22, DelayW31; wire clk2, t11, t12, t13, t14, t15, t16, t17, t18, t22, t23, t24, t25, t26, t27, t28, t33, t34, t35, t36, t37, t38, t44, t45, t46, t47, t48; BoundaryCell Boundary11(toggle, t11, clk, clk2, globalLoad, In1, V11, M11); InternalCell Internal12(t11, t12, clk, clk2, globalLoad, In2, X12, M11, M12, V11, V12); InternalCell Internal13(t12, t13, clk, clk2, globalLoad, In3, X13, M12, M13, V12, V13); InternalCell Internal14(t13, t14, clk, clk2, globalLoad, In4, X14, M13, M14, V13, V14); InternalCell Internal15(t14, t15, clk, clk2, globalLoad, In5, X15, M14, M15, V14, V15); InternalCell Internal16(t15, t16, clk, clk2, globalLoad, In6, X16, M15, M16, V15, V16); InternalCell Internal17(t16, t17, clk, clk2, globalLoad, In7, X17, M16, M17, V16, V17); InternalCell Internal18(t17, t18, clk, clk2, globalLoad, In8, X18, M17, M18, V17, V18); BoundaryCell Boundary22(t12, t22, clk, clk2, globalLoad, X12, V22, M22); InternalCell Internal23(t22, t23, clk, clk2, globalLoad, X13, X23, M22, M23, V22, V23); InternalCell Internal24(t23, t24, clk, clk2, globalLoad, X14, X24, M23, M24, V23, V24); InternalCell Internal25(t24, t25, clk, clk2, globalLoad, X15, X25, M24, M25, V24, V25); InternalCell Internal26(t25, t26, clk, clk2, globalLoad, X16, X26, M25, M26, V25, V26); InternalCell Internal27(t26, t27, clk, clk2, globalLoad, X17, X27, M26, M27, V26, V27); InternalCell Internal28(t27, t28, clk, clk2, globalLoad, X18, X28, M27, M28, V27, V28); BoundaryCell Boundary33(t23, t33, clk, clk2, globalLoad, X23, V33, M33); InternalCell Internal34(t33, t34, clk, clk2, globalLoad, X24, X34, M33, M34, V33, V34); InternalCell Internal35(t34, t35, clk, clk2, globalLoad, X25, X35, M34, M35, V34, V35); InternalCell Internal36(t35, t36, clk, clk2, globalLoad, X26, X36, M35, M36, V35, V36); InternalCell Internal37(t36, t37, clk, clk2, globalLoad, X27, X37, M36, M37, V36, V37); InternalCell Internal38(t37, t38, clk, clk2, globalLoad, X28, X38, M37, M38, V37, V38); BoundaryCell Boundary44(t34, t44, clk, clk2, globalLoad, X34, V44, M44); InternalCell Internal45(t44, t45, clk, clk2, globalLoad, X35, X45, M44, M45, V44, V45); InternalCell Internal46(t45, t46, clk, clk2, globalLoad, X36, X46, M45, M46, V45, V46); InternalCell Internal47(t46, t47, clk, clk2, globalLoad, X37, X47, M46, M47, V46, V47); InternalCell Internal48(t47, t48, clk, clk2, globalLoad, X38, X48, M47, M48, V47, V48);

//---------------delay DelayMain11(clk2, X45, DelayW11, globalLoad, ZeroVal); delay DelayMain12(clk2, DelayW11, DelayW12, globalLoad, ZeroVal); delay DelayMain13(clk2, DelayW12, DelayW13, globalLoad, ZeroVal); delay DelayMain14(clk2, DelayW13, e41, globalLoad, ZeroVal); delay DelayMain15(clk2, e41, e31, globalLoad, ZeroVal); delay DelayMain16(clk2, e31, e21, globalLoad, ZeroVal); delay DelayMain17(clk2, e21, e11, globalLoad, ZeroVal); delay DelayMain21(clk2, X46, DelayW21, globalLoad, ZeroVal); delay DelayMain22(clk2, DelayW21, DelayW22, globalLoad, ZeroVal); delay DelayMain23(clk2, DelayW22, e42, globalLoad, ZeroVal); delay DelayMain24(clk2, e42, e32, globalLoad, ZeroVal); delay DelayMain25(clk2, e32, e22, globalLoad, ZeroVal); delay DelayMain26(clk2, e22, e12, globalLoad, ZeroVal); delay DelayMain31(clk2, X47, DelayW31, globalLoad, ZeroVal); delay DelayMain32(clk2, X47, e43, globalLoad, ZeroVal); delay DelayMain33(clk2, e43, e33, globalLoad, ZeroVal); delay DelayMain34(clk2, e33, e23, globalLoad, ZeroVal); delay DelayMain35(clk2, e23, e13, globalLoad, ZeroVal); delay DelayMain41(clk2, X48, e44, globalLoad, ZeroVal); delay DelayMain42(clk2, e44, e34, globalLoad, ZeroVal); delay DelayMain43(clk2, e34, e24, globalLoad, ZeroVal); delay DelayMain44(clk2, e24, e14, globalLoad, ZeroVal); assign regclk = clk2; endmodule

Testbench //Testbench module Faddeev3x3TB(); reg clk, globalLoad; reg[31:0] a11, a12, a13, a21, a22, a23, a31, a32, a33, b11, b12, b13, b21, b22, b23, b31, b32, b33,

c11, c12, c13, c21, c22, c23, c31, c32, c33, d11, d12, d13, d21, d22, d23, d31, d32, d33; wire[31:0] e11, e12, e13, e21, e22, e23, e31, e32, e33; wire[31:0] In1, In2, In3, In4, In5, In6; wire toggle, regclk; FaddeevInputs3x3 InputMatrices(regclk, clk, globalLoad, toggle, a11, a12, a13, a21, a22, a23, a31, a32, a33, b11, b12, b13, b21, b22, b23, b31, b32, b33, c11, c12, c13, c21, c22, c23, c31, c32, c33, d11, d12, d13, d21, d22, d23, d31, d32, d33, In1, In2, In3, In4, In5, In6); FaddeevMain3x3 MatrixCalculation(clk, globalLoad, toggle, regclk, In1, In2, In3, In4, In5, In6, e11, e12, e13, e21, e22, e23, e31, e32, e33); initial begin forever begin clk = 0; #1 clk = 1; #1 clk = 0; end end initial begin //Insert matrices input below a11 = 32'b00000000000000000000100000000000*-1; a12 = 32'b00000000000000000010100000000000; a13 = 32'b00000000000000000001100000000000*-1; b11 = 32'b00000000000000000001000000000000*-1;

b12 = 32'b00000000000000000011100000000000*-1; b13 = 32'b00000000000000000011000000000000; a21 = 32'b00000000000000000001100000000000; a22 = 32'b00000000000000000010000000000000; a23 = 32'b00000000000000000000100000000000; b21 = 32'b00000000000000000000100000000000; b22 = 32'b00000000000000000001100000000000; b23 = 32'b00000000000000000000100000000000; a31 = 32'b00000000000000000011000000000000; a32 = 32'b00000000000000000011100000000000; a33 = 32'b00000000000000000001000000000000*-1; b31 = 32'b00000000000000000010100000000000; b32 = 32'b00000000000000000100100000000000; b33 = 32'b00000000000000000010000000000000; c11 = 32'b00000000000000000000100000000000*-1; c12 = 32'b00000000000000000001000000000000; c13 = 32'b00000000000000000010000000000000*-1; d11 = 32'b00000000000000000001000000000000; d12 = 32'b00000000000000000000100000000000; d13 = 32'b00000000000000000010100000000000*-1; c21 = 32'b00000000000000000001100000000000*-1; c22 = 32'b00000000000000000010000000000000*-1; c23 = 32'b00000000000000000000100000000000; d21 = 32'b00000000000000000001000000000000; d22 = 32'b00000000000000000010000000000000; d23 = 32'b00000000000000000011000000000000; c31 = 32'b00000000000000000010100000000000; c32 = 32'b00000000000000000001100000000000*-1; c33 = 32'b00000000000000000001000000000000*-1; d31 = 32'b00000000000000000001100000000000*-1; d32 = 32'b00000000000000000001000000000000; d33 = 32'b00000000000000000100100000000000; //----------globalLoad = 1; #9 globalLoad = 0; end endmodule

Appendix C: Data from Calculation of Variance for Measurement Error Covariance Matrix

Trial 1: 289 289 289 289 289 288 288 289 289 288 288 288 288 289 289 288 288 288 289 289 288 288 288 289 289 288 288 288 289 289 288 288 288 288 289 289 288 288 289 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274

Trial 2: 442 442 441 441 441 441 441 441 440 440 440 441 441 440 440 440 441 441 440 440 440 440 440 440 440 440 440 440 441 440 440 440 440 440 440 440 440 440 440 253 253 253 253 253 253 253 253 253 253 254 253 253 253 253 254 253 253 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254

Trial 3: 160 160 160 159 160 160 160 160 159 159 160 161 160 159 159 160 161 160 160 159 160 160 160 160 160 159 160 160 160 160 160 159 160 160 160 159 159 160 160 180 181 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180

Trial 1: Variance (N-1): 0.23703 0 Trial 2: Variance (N-1): 0.167723 0.147921 Trial 3: Variance (N-1): 0.187723 0.009901 Average(N-1): 0.197492 0.052607

289 288 288 288 288 288 289 289 288 288 288 289 288 288 288 288 288 288 288 288 288 289 289 289 288 288 289 289 288 288 288 289 289 288 288 288 289 289 288 288 288 289 288

274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274

440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440

254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254 254

160 160 159 159 160 160 160 160 160 160 160 160 160 160 160 160 160 160 160 159 160 160 160 160 160 160 160 160 160 160 160 160 160 160 160 160 160 160 160 160 160 159 160

180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180

288 288 288 289 289 288 288 288 289 289 288 288 288 288 289 289 288 288 288 288

274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274 274

440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440 440

254 254 254 254 254 254 254 254 253 254 254 253 254 254 254 254 254 254 254 254

160 160 160 160 159 160 160 161 160 159 160 160 160 160 160 159 159 160 160 160

180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180 180

You might also like