
Best-Fit Curves

Copy the data in Table 1 and plot it as shown in Figure 1.


x      f(x)
0      2.7
0.5    0.4
1      -2.3
1.5    -2.9
2      -0.8
2.5    2
3      3
3.5    1.2
4      -1.6
4.5    -3
5      -1.6
5.5    1.3
6      3
6.5    2
7      -0.9
7.5    -2.9
Table 1

Figure 1

The data is approximately sinusoidal, i.e. f(x) follows a sine function of x. It is not,
however, simply f(x) = sin(x). The points do not exactly match a sinusoid because they
represent measured data and there is some uncertainty in each point.
A general sine function has three properties. The first is the amplitude. While sin(x)
varies between −1 and 1, 2sin(x) has twice the amplitude, i.e. it varies between −2 and
2. The second property is the period. The function sin(x) repeats itself every 2π
radians, i.e. sin(x) = sin(x+2π) = sin(x+4π) = .... It is said to have a period of 2π
radians. The function sin(2x) has a period of π radians.
The third property of a sinusoidal function is the phase angle. The function sin(x+1)
has the same amplitude and the same period as sin(x). The difference between them is
shown in Figure 2. While, for example, the sine of x is zero when x is zero, the sine of
x+1 is zero when x = −1. The sine of x is 1 when x is π/2 while the sine of x+1 is 1
when x is π/2 − 1.
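
The three properties can be checked numerically. The short Python sketch below is an illustration only: Python is not part of the original exercise, and the helper name `sinusoid` is made up here. It samples A sin(bx + c) to check the amplitude of 2sin(x), the period of sin(2x), and the shift that the phase angle produces in sin(x+1).

```python
import math

def sinusoid(x, A=1.0, b=1.0, c=0.0):
    """General sinusoid A*sin(b*x + c)."""
    return A * math.sin(b * x + c)

xs = [0.1 * k for k in range(200)]  # sample points from 0 to 19.9

# Amplitude: 2*sin(x) stays between -2 and 2.
largest = max(abs(sinusoid(x, A=2.0)) for x in xs)

# Period: sin(b*x) repeats every 2*pi/b, so sin(2x) has period pi.
period = 2 * math.pi / 2
repeat_error = max(abs(sinusoid(x, b=2.0) - sinusoid(x + period, b=2.0))
                   for x in xs)

# Phase: sin(x + 1) is sin(x) shifted one unit to the left.
value_at_minus_one = sinusoid(-1.0, c=1.0)            # sin(0)
value_at_shifted_peak = sinusoid(math.pi / 2 - 1.0, c=1.0)  # sin(pi/2)

print(f"largest sampled |2 sin(x)|: {largest:.4f}")
print(f"largest change in sin(2x) after adding pi to x: {repeat_error:.2e}")
print(f"sin(x+1) at x = -1: {value_at_minus_one:.4f}")
print(f"sin(x+1) at x = pi/2 - 1: {value_at_shifted_peak:.4f}")
```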

Figure 2 (plot of sin(x) and sin(x+1) against x)

Returning to the data given above, the function f(x) is of the form
f(x) = Asin(bx + c)
and we want to find the values of A, b, and c for the curve that best fits the data. If
you try to add a trend-line to the graph in Excel you will see that the options are
limited: a sinusoidal fit function is not included. You can fit straight lines,
polynomials, etc., but none of these suits this data.
You could, for example, use two data points to estimate the amplitude of the sinusoid,
but you might not get a very accurate answer: each point has some uncertainty in it,
and you cannot know whether you have picked points at the peaks and troughs of the
function. The more points you use, the better an estimate you can get: that is why you
fit a trend-line to data.
Excel only allows a limited number of types of trend-lines. You will see with this
example, however, that you can fit any type of trend-line you want with the help of
Solver.

How is the best-fit function defined?


The curve that best fits a set of data can be defined as the one that minimises the sum
of the squares of the vertical distances between the data-points and the curve. For this
reason, this kind of best-fit curve is known as a least-squares best-fit.
Add in initial guesses for A, b, and c as shown below. Note that, when using Solver
(or Goal Seek), you always need to start with some initial values. Also add in
formulas to calculate Asin(bx+c) for each of the x values as shown. Select the three
columns and plot using an X-Y Plot as shown below. As you can see, the function
with A = 1 and b = c = 2 is a very poor fit for the measured data.

Figure 3

Now add another column containing the square of the difference between f(x) and
Asin(bx+c) for each x value, i.e. the squares of the differences between the values in
column D and those in column E. Since the values in column E represent the current
guess at what the best-fit curve is, these values are the squares of the differences
between this curve and the actual data-points. Put the sum of the values in column F
in a cell; this is the sum of the squares of the differences (as shown in Figure 4).

Figure 4
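
The spreadsheet calculation described above can also be written out in a few lines of Python, which may make it clearer what the cell holding the sum is computing. This is a sketch under assumptions: the function names are invented for illustration, and the data is typed in from Table 1. It evaluates the sum of squares for the initial guess A = 1, b = c = 2 and, for comparison, for the best-fit values quoted at the end of this section.

```python
import math

# Measured data from Table 1.
xs = [0.5 * k for k in range(16)]  # 0, 0.5, ..., 7.5
ys = [2.7, 0.4, -2.3, -2.9, -0.8, 2.0, 3.0, 1.2,
      -1.6, -3.0, -1.6, 1.3, 3.0, 2.0, -0.9, -2.9]

def model(x, A, b, c):
    """Current guess at the curve: A*sin(b*x + c)."""
    return A * math.sin(b * x + c)

def sum_of_squares(A, b, c):
    """Sum of squared vertical distances between the data and the curve."""
    return sum((y - model(x, A, b, c)) ** 2 for x, y in zip(xs, ys))

initial = sum_of_squares(1.0, 2.0, 2.0)       # the initial guess
quoted = sum_of_squares(3.016, 1.998, 2.012)  # the fit quoted in the text

print(f"sum of squares, initial guess: {initial:.3f}")
print(f"sum of squares, quoted fit:    {quoted:.3f}")
```

The smaller this number, the closer the curve lies to the data-points, so the quoted fit should give a far smaller value than the initial guess.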

Now the least-squares best-fit curve is defined as the curve which minimises this
value. In this case the values of A, b, and c define the curve, and we want to find the
values that minimise the number in cell D22. This can be done using Solver.
Note that Solver may not find the correct values the first time; the result depends on
your initial guesses. It may find a local minimum but not the global minimum. When
that happens, the poor fit should be clear from the chart. Try changing the values in
cells B1:B3 and running Solver again.
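
Why the starting guess matters can be seen by scanning over b. The Python sketch below is not what Solver does internally. It is an illustration that uses the identity Asin(bx + c) = p sin(bx) + q cos(bx), which makes the fit linear in p and q once b is fixed. For each trial b it finds the smallest achievable sum of squares, then lists the values of b at which that sum has a dip. The function name and the grid of b values are arbitrary choices made here.

```python
import math

# Measured data from Table 1.
xs = [0.5 * k for k in range(16)]
ys = [2.7, 0.4, -2.3, -2.9, -0.8, 2.0, 3.0, 1.2,
      -1.6, -3.0, -1.6, 1.3, 3.0, 2.0, -0.9, -2.9]

def best_sse_for_b(b):
    """Smallest sum of squares over A and c when b is held fixed.

    Solves the 2x2 normal equations for p and q in
    p*sin(b*x) + q*cos(b*x), which equals A*sin(b*x + c).
    """
    s = [math.sin(b * x) for x in xs]
    co = [math.cos(b * x) for x in xs]
    ss = sum(u * u for u in s)
    cc = sum(v * v for v in co)
    sc = sum(u * v for u, v in zip(s, co))
    sy = sum(u * y for u, y in zip(s, ys))
    cy = sum(v * y for v, y in zip(co, ys))
    det = ss * cc - sc * sc
    p = (sy * cc - cy * sc) / det
    q = (cy * ss - sy * sc) / det
    return sum((y - p * u - q * v) ** 2 for y, u, v in zip(ys, s, co))

bs = [0.5 + 0.01 * k for k in range(351)]  # b from 0.5 to 4.0
sse = [best_sse_for_b(b) for b in bs]

# A grid point counts as a local minimum if it is lower than both neighbours.
local_minima = [(bs[i], sse[i]) for i in range(1, len(bs) - 1)
                if sse[i] < sse[i - 1] and sse[i] < sse[i + 1]]

for b, value in local_minima:
    print(f"local minimum near b = {b:.2f}, sum of squares = {value:.3f}")
```

Each dip listed is a place where a search that only moves downhill could come to rest. Only the deepest one is the least-squares best fit.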
The actual best-fit curve is, in this case, approximately f(x) = 3.016 sin(1.998x +
2.012). Note that you may get a different value for c; it may differ from 2.012 by π,
−π, 2π, or any other integer multiple of π. Why is this?
With this method you can fit any kind of function to data.
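
For readers working outside Excel, the same idea can be sketched in Python using only the standard library. This is a stand-in for Solver, not a description of how Solver works: it uses a simple compass search that nudges one parameter at a time and keeps any change that lowers the sum of squares. The function names, the step sizes, and the starting guess of A = 3, b = 2, c = 2 (read roughly off the plotted data) are all assumptions made for this example.

```python
import math

# Measured data from Table 1.
xs = [0.5 * k for k in range(16)]
ys = [2.7, 0.4, -2.3, -2.9, -0.8, 2.0, 3.0, 1.2,
      -1.6, -3.0, -1.6, 1.3, 3.0, 2.0, -0.9, -2.9]

def sum_of_squares(params):
    """Sum of squared differences between the data and A*sin(b*x + c)."""
    A, b, c = params
    return sum((y - A * math.sin(b * x + c)) ** 2 for x, y in zip(xs, ys))

def compass_search(objective, start, step=0.1, tol=1e-7, max_iter=100000):
    """Minimise objective by trying +/- step on each parameter in turn.

    A move is kept only if it lowers the objective. When no move helps,
    the step is halved. The search stops once the step is below tol.
    """
    point = list(start)
    best = objective(point)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = point[:]
                trial[i] += delta
                value = objective(trial)
                if value < best:
                    point, best = trial, value
                    improved = True
                    break
        if not improved:
            step /= 2
    return point, best

(A, b, c), sse = compass_search(sum_of_squares, start=[3.0, 2.0, 2.0])
print(f"A = {A:.3f}, b = {b:.3f}, c = {c:.3f}, sum of squares = {sse:.4f}")
```

To fit a different kind of function, only the expression inside `sum_of_squares` and the list of starting values would need to change. Like Solver, this search only moves downhill, so a poor starting guess can leave it in a local minimum.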
