12/4/2014

BestSquareRootMethodAlgorithmFunction(PrecisionVSSpeed)CodeProject
Best Square Root Method - Algorithm - Function (Precision VS Speed)
Mahmoud Hesham ElMagdoub, 15 Sep 2010

CPOL


Info
First Posted: 1 Apr 2010
Views: 119,162
Downloads: 1,074
Bookmarked: 88 times


Introduction
I enjoy game programming with DirectX, and I noticed that the most frequently called method in most of my
games is the standard sqrt method from math.h. This made me search for functions faster than the standard
sqrt. After some searching, I found many functions that were much faster, but there is always a
compromise between speed and precision. The main purpose of this article is to help people choose the
square root method that best suits their program.

Background
In this article, I compare 14 different methods for computing the square root with the standard sqrt function as a
reference, and for each method I show its precision and speed compared to the sqrt method.

What this Article is Not About


1. Explaining how each method works
2. New ways to compute the square root

Using the Code


The code is simple. It basically contains:
1. main.cpp
Calls all the methods and, for each one, computes the speed and precision relative to the sqrt function.
2. SquareRootmethods.h
This header contains the implementation of the functions and the reference each one came from.

First I calculate the Speed and Precision of the sqrt method, which will be my reference.
For computing the Speed, I measure the time it takes to call the sqrt function M-1 times and assign this value
to RefSpeed.
For computing the Precision, I add the result of every call to a running total, RefTotalPrecision.
For measuring the runtime of the methods, I use the CDuration class found in this link, and I use
dur as an instance of that class.
    for (int j = 0; j < AVG; j++)
    {
        dur.Start();
        for (int i = 1; i < M; i++)
            RefTotalPrecision += sqrt((float)i);
        dur.Stop();
        Temp += dur.GetDuration();
    }
    RefTotalPrecision /= AVG;
    Temp /= AVG;
    RefSpeed = (float)(Temp) / CLOCKS_PER_SEC;

And for the other methods I do the same calculations, but in the end, I reference them to the sqrt.
    for (int j = 0; j < AVG; j++)
    {
        dur.Start();
        for (int i = 1; i < M; i++)
            TotalPrecision += sqrt1((float)i);
        dur.Stop();
        Temp += dur.GetDuration();
    }
    TotalPrecision /= AVG;
    Temp /= AVG;
    Speed = (float)(Temp) / CLOCKS_PER_SEC;
    cout << "Precision = "
         << (double)(1 - abs((TotalPrecision - RefTotalPrecision) / (RefTotalPrecision))) * 100 << endl;
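The two loops above follow the same pattern, so they can be folded into one helper. The following is a minimal sketch, not the article's actual harness: it assumes `std::chrono::steady_clock` in place of the CDuration class (whose source is not shown here), and `BenchResult`, `Measure` and `PrecisionPercent` are hypothetical names. Only M, AVG and the precision formula come from the code above.

```cpp
#include <chrono>
#include <cmath>

// Hypothetical result type: not part of the article's source.
struct BenchResult {
    double seconds;  // average wall time of one pass over i = 1 .. m-1
    double total;    // sum of the returned roots over one pass
};

// Times fn over i = 1 .. m-1, repeated avg times, like the loops above.
// steady_clock is an assumption standing in for CDuration.
template <typename Fn>
BenchResult Measure(Fn fn, int m, int avg)
{
    using Clock = std::chrono::steady_clock;
    double seconds = 0.0;
    double total = 0.0;
    for (int j = 0; j < avg; j++) {
        double passTotal = 0.0;
        const Clock::time_point start = Clock::now();
        for (int i = 1; i < m; i++)
            passTotal += fn((float)i);
        const Clock::time_point stop = Clock::now();
        seconds += std::chrono::duration<double>(stop - start).count();
        total = passTotal;  // every pass produces the same sum
    }
    return { seconds / avg, total };
}

// Same formula as the cout line above: 100 means "identical total".
double PrecisionPercent(double total, double refTotal)
{
    return (1.0 - std::fabs((total - refTotal) / refTotal)) * 100.0;
}
```

A call such as `Measure([](float x) { return sqrt1(x); }, M, AVG)` would then give the time and total for one method, to be compared against the same call made with the standard sqrt.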

NOTES:
1. I assume that an error in Precision counts the same whether the result is larger or smaller than the
reference; that's why I use "abs".
2. The Speed is reported as an actual percentage of the reference, while the Precision is reported as a
percentage decrease relative to the reference.
You can modify the value of M as you like; I initially set it to 10000.
You can modify AVG as well; the higher it is, the more accurate the results.

    #define M 10000
    #define AVG 10

Points of Interest
Precision-wise, the standard sqrt method is the best, but the other functions can be much faster, even 5 times
faster. I would personally choose Method N#14, as it has high precision and high speed, but I'll leave the
choice to you.
I took 5 samples and averaged them, and here is the output:

According to the analysis, the performance rank (Speed x Precision) of the above methods is:

NOTE: The performance of these methods depends highly on your processor and may change from one
computer to another.

The METHODS
Sqrt1
Reference: http://ilab.usc.edu/wiki/index.php/Fast_Square_Root
Algorithm: Babylonian Method + some manipulations on IEEE 32 bit floating point representation
    float sqrt1(const float x)
    {
        union
        {
            int i;
            float x;
        } u;
        u.x = x;
        u.i = (1<<29) + (u.i >> 1) - (1<<22);

        // Two Babylonian Steps (simplified from:)
        // u.x = 0.5f * (u.x + x/u.x);
        // u.x = 0.5f * (u.x + x/u.x);
        u.x =         u.x + x/u.x;
        u.x = 0.25f * u.x + x/u.x;
        return u.x;
    }

Sqrt2
Reference: http://ilab.usc.edu/wiki/index.php/Fast_Square_Root
Algorithm: The Magic Number Quake 3
    #define SQRT_MAGIC_F 0x5f3759df
    float sqrt2(const float x)
    {
        const float xhalf = 0.5f * x;

        union // get bits for floating value
        {
            float x;
            int i;
        } u;
        u.x = x;
        u.i = SQRT_MAGIC_F - (u.i >> 1);          // gives initial guess y0
        return x * u.x * (1.5f - xhalf * u.x * u.x); // Newton step, repeating increases accuracy
    }

Sqrt3
Reference: http://ilab.usc.edu/wiki/index.php/Fast_Square_Root
Algorithm: Log base 2 approximation and Newton's Method
    float sqrt3(const float x)
    {
        union
        {
            int i;
            float x;
        } u;
        u.x = x;
        u.i = (1<<29) + (u.i >> 1) - (1<<22);
        return u.x;
    }
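The integer expression in sqrt3 can be read as halving a base-2 logarithm. The derivation below is a sketch under stated assumptions: x is a positive, normalized IEEE 754 single with biased exponent E and mantissa fraction m in [0, 1), I_x denotes its bit pattern read as an integer, and the usual rough approximation log2(1 + m) ≈ m is used.

```latex
\begin{aligned}
x &= 2^{E-127}\,(1+m), \qquad I_x = 2^{23}\,(E + m) \\
\log_2 x &= (E - 127) + \log_2(1+m) \;\approx\; \frac{I_x}{2^{23}} - 127 \\
\log_2 \sqrt{x} = \tfrac{1}{2}\log_2 x
  \;&\Rightarrow\; \frac{I_{\sqrt{x}}}{2^{23}} - 127 \;\approx\; \frac{1}{2}\left(\frac{I_x}{2^{23}} - 127\right) \\
I_{\sqrt{x}} &\approx \frac{I_x}{2} + 127 \cdot 2^{22}
  \;=\; \frac{I_x}{2} + 2^{29} - 2^{22}
\end{aligned}
```

The last line is the expression `(1<<29) + (u.i >> 1) - (1<<22)`. Sqrt7 reaches the same value by adding `127 << 23` first and then shifting right by one.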

Sqrt4
Reference: I got it a long time ago from a forum and I forgot where; please contact me if you know its reference.
Algorithm: Bakhshali Approximation

    float sqrt4(const float m)
    {
        int i = 0;
        while ((i * i) <= m)
            i++;
        i--;                        // largest integer with i*i <= m
        float d = m - i * i;
        float p = d / (2 * i);
        float a = i + p;
        return a - (p * p) / (2 * a);
    }
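To make the arithmetic in sqrt4 easier to follow, here is a sketch that isolates the refinement step and takes the starting guess as a parameter. The name `bakhshali_step` is hypothetical. For m = 10 and a guess of 3, the intermediate values are d = 1, p = 1/6 and a = 3 + 1/6, and the returned value is close to sqrt(10).

```cpp
#include <cmath>

// Hypothetical helper: one Bakhshali refinement of the guess n for sqrt(m).
// Same arithmetic as the last four lines of sqrt4, with the guess passed in.
float bakhshali_step(const float m, const float n)
{
    const float d = m - n * n;     // remainder left by the guess
    const float p = d / (2 * n);   // first correction
    const float a = n + p;         // improved guess
    return a - (p * p) / (2 * a);  // second correction
}
```

Note that sqrt4 derives its guess from an integer search, so for inputs below 1 the guess is 0 and the division by `2 * i` breaks down; a caller would need a different starting guess in that range.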

Sqrt5
Reference: http://www.dreamincode.net/code/snippet244.htm
Algorithm: Babylonian Method
    float sqrt5(const float m)
    {
        float i = 0;
        float x1, x2;
        while ((i * i) <= m)
            i += 0.1f;
        x1 = i;
        for (int j = 0; j < 10; j++)
        {
            x2 = m;
            x2 /= x1;
            x2 += x1;
            x2 /= 2;
            x1 = x2;
        }
        return x2;
    }

Sqrt6
Reference: http://www.azillionmonkeys.com/qed/sqroot.html#calcmeth
Algorithm: Dependent on IEEE representation and only works for 32 bits

    double sqrt6(double y)
    {
        double x, z, tempf;
        // points at the high 32 bits of tempf (assumes a 32-bit unsigned long, little-endian)
        unsigned long *tfptr = ((unsigned long *)&tempf) + 1;
        tempf = y;
        *tfptr = (0xbfcdd90a - *tfptr) >> 1;
        x = tempf;
        z = y * 0.5;
        x = (1.5 * x) - (x * x) * (x * z); // The more you make replicates of this statement
                                           // the higher the accuracy, here only 2 replicates are used
        x = (1.5 * x) - (x * x) * (x * z);
        return x * y;
    }

Sqrt7
Reference: http://bits.stephanbrumme.com/squareRoot.html
Algorithm: Dependent on IEEE representation and only works for 32 bits

    float sqrt7(float x)
    {
        unsigned int i = *(unsigned int*) &x;
        // adjust bias
        i += 127 << 23;
        // approximation of square root
        i >>= 1;
        return *(float*) &i;
    }

Sqrt8
Reference: http://forums.techarena.in/softwaredevelopment/1290144.htm
Algorithm: Babylonian Method
    double sqrt9(const double fg)
    {
        double n = fg / 2.0;
        double lstX = 0.0;
        while (n != lstX)
        {
            lstX = n;
            n = (n + fg / n) / 2.0;
        }
        return n;
    }
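The loop in sqrt9 ends only when two successive iterates compare exactly equal, and it has no check for inputs that have no real square root. As a hedged illustration, the sketch below keeps the same Babylonian update and adds an input check and an iteration cap. The name `sqrt9_guarded`, the `maxIter` parameter and its default of 2000 are assumptions, not part of the referenced code.

```cpp
#include <cmath>
#include <limits>

// Hypothetical guarded variant of sqrt9: same update rule, plus an input
// check and an iteration cap (maxIter is an assumed, generous default).
double sqrt9_guarded(const double fg, const int maxIter = 2000)
{
    if (fg < 0.0 || std::isnan(fg))
        return std::numeric_limits<double>::quiet_NaN(); // no real root
    if (fg == 0.0 || std::isinf(fg))
        return fg;                                       // sqrt(0) = 0, sqrt(inf) = inf
    double n = fg / 2.0;
    double lstX = 0.0;
    for (int i = 0; i < maxIter && n != lstX; i++)
    {
        lstX = n;
        n = (n + fg / n) / 2.0;
    }
    return n;
}
```

The cap is deliberately large because the starting guess `fg / 2.0` is far from the root for very large or very small inputs, and each early iteration only roughly halves the distance.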

Sqrt9
Reference: http://www.functionx.com/cpp/examples/squareroot.htm
Algorithm: Babylonian Method
    double Abs(double Nbr)
    {
        if (Nbr >= 0)
            return Nbr;
        else
            return -Nbr;
    }

    double sqrt10(double Nbr)
    {
        double Number = Nbr / 2;
        const double Tolerance = 1.0e-7;
        do
        {
            Number = (Number + Nbr / Number) / 2;
        } while (Abs(Number * Number - Nbr) > Tolerance);

        return Number;
    }

Sqrt10
Reference: http://www.cs.uni.edu/~jacobson/C++/newton.html
Algorithm: Newton's Approximation Method (as named in the reference; the code below narrows a bracket by bisection)

    double sqrt11(const double number)
    {
        const double ACCURACY = 0.001;
        double lower, upper, guess;
        if (number < 1)
        {
            lower = number;
            upper = 1;
        }
        else
        {
            lower = 1;
            upper = number;
        }

        while ((upper - lower) > ACCURACY)
        {
            guess = (lower + upper) / 2;
            if (guess * guess > number)
                upper = guess;
            else
                lower = guess;
        }
        return (lower + upper) / 2;
    }
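Because sqrt11 halves its bracket on every pass, its running time grows with the logarithm of the bracket width divided by ACCURACY. The sketch below counts those passes for a given input; `bisection_iterations` is a hypothetical helper that copies the bracketing and loop condition of sqrt11 and takes the tolerance as a parameter.

```cpp
// Hypothetical helper: counts how many halvings the sqrt11 loop performs
// for a given input and absolute tolerance (same bracketing as sqrt11).
int bisection_iterations(const double number, const double accuracy)
{
    double lower = (number < 1) ? number : 1;
    double upper = (number < 1) ? 1 : number;
    int count = 0;
    while ((upper - lower) > accuracy)
    {
        const double guess = (lower + upper) / 2;
        if (guess * guess > number)
            upper = guess;
        else
            lower = guess;
        count++;
    }
    return count;
}
```

With the tolerance of 0.001 used above, an input of 4 takes 12 halvings and an input of 10000 takes 24. The tolerance is absolute, so for inputs much smaller than 1 the bracket can be as wide as the root itself when the loop stops.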

Sqrt11
Reference: http://www.drdobbs.com/184409869;jsessionid=AIDFL0EBECDYLQE1GHOSKH4ATMY32JVN
Algorithm: Newton's Approximation Method (as named in the reference; the code below is a binary search)

    double sqrt12(unsigned long N)
    {
        double n, p, low, high;
        if (2 > N)
            return (N);
        low = 0;
        high = N;
        while (high > low + 1)
        {
            n = (high + low) / 2;
            p = n * n;
            if (N < p)
                high = n;
            else if (N > p)
                low = n;
            else
                break;
        }
        return (N == p ? n : low);
    }

Sqrt12
Reference: http://cjjscript.q8ieng.com/?p=32
Algorithm: Babylonian Method
    double sqrt13(int n)
    {
        // double a = (eventually the main method will plug values into a)
        double a = (double) n;
        double x = 1;

        // For loop to get the square root value of the entered number.
        for (int i = 0; i < n; i++)
            x = 0.5 * (x + a / x);

        return x;
    }

Sqrt13
Reference: N/A
Algorithm: Assembly fsqrt
    double sqrt13(double n)
    {
        __asm {
            fld n
            fsqrt
        }
    }

Sqrt14
Reference: N/A
Algorithm: Assembly fsqrt 2
    double inline __declspec (naked) __fastcall sqrt14(double n)
    {
        _asm fld qword ptr [esp+4]
        _asm fsqrt
        _asm ret 8
    }
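The `__asm` blocks and `__declspec(naked)` used by the two fsqrt methods are Microsoft-specific, and the MSVC compiler does not accept inline assembly when targeting x64. For GCC or Clang, a rough equivalent can be sketched with extended inline assembly. The function name `sqrt_fsqrt_gcc` is hypothetical, and this has not been benchmarked against the methods above.

```cpp
#include <cmath>

// Hypothetical GCC/Clang counterpart of the fsqrt methods above.
// On x86 targets it issues the x87 fsqrt instruction through extended
// inline asm; on any other compiler or target it falls back to std::sqrt.
double sqrt_fsqrt_gcc(double n)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("fsqrt" : "+t" (n));  // "t" constraint = top of the x87 register stack
    return n;
#else
    return std::sqrt(n);
#endif
}
```

On x86-64 the value has to move between the SSE registers and the x87 stack, so calling the standard sqrt, which compilers typically map to a hardware square root instruction when optimizing, may well be as fast or faster; measuring on the target machine is the only way to know.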

History
1.3 (15 September 2010)
Added Method N#14 which is the best method till now
Added modified source code

1.2 (24 June 2010)


Added Method N#13
Added the Methods Performance Rank
Added modified source code

1.1 (3 April 2010)


Added Precision Timer instead of clock because it's more precise

Added the average feature

1.0 (31 March 2010)


Initial release
I hope this article will be of at least some help to those who are interested in this topic.

License
This article, along with any associated source code and files, is licensed under The Code Project Open License
(CPOL)


About the Author


Mahmoud Hesham ElMagdoub
Software Developer Senior Free Lancer
Egypt
BSc Computer Engineering, Cairo University, Egypt.
I love teaching and learning and I'm constantly changing
Feel free to discuss anything with me.
A Freelance UX designer, Code is my tool brush, I design for experience !


Comments and Discussions



Not sure methods are best

Member 10586125

6 May 14 2:17

I have just tried making a DLL and importing sqrt13 and sqrt14 into my C# project.
I tested them in my method for computing standard deviation.
I tried two versions, __fastcall and __stdcall.
The result is as follows: Math.Sqrt in C# is a little bit faster.
The best methods in the article, sqrt13 and sqrt14, are not suitable for x64.

GCC code for testing

Member 10551010

29 Apr 14 21:58

Would it be possible to translate your sqrt13 and sqrt14 algorithms into GCC? It would help Linux users who,
like myself, are non-experts in assembly to test it and maybe use it. Thanks!

Objective-C

10z

28 Apr 14 9:04

Is there a way to get sqrt14 to work in Objective-C?

another way to do it

Marcos Lohmann

20 Nov 13 5:40

I am using this function to calculate square root; it has a very good performance and precision.
It is based on median of lower/higher end points to reduce the number of iterations to find the answer.
At the deepest decimal positions it may run into an infinite loop, so I had to implement a break point based on
repeated lower/higher ends.

    float sqrt(float n) {
        if (n < 0) n = -1 * n;
        float low = 0, high = n, llow = high, lhigh = low, sqrt = 0, res = 0;
        while (res != n) {
            sqrt = (high + low) / 2;
            res = sqrt * sqrt;
            if (res > n) high = sqrt;
            else if (res < n) low = sqrt;
            // stop when the bracket no longer changes
            if (llow == low && lhigh == high) {
                break;
            } else {
                llow = low;
                lhigh = high;
            }
        }
        return sqrt;
    }

Re: another way to do it

Marcos Lohmann

20 Nov 13 6:41

But the latest one has an issue with numbers between 0 and 1, so I have implemented a routine that multiplies
the number by 100 as many times as needed for it to become greater than 1, then divides the result by 10 the
same number of times; that fixes the issue.

    float sqrt1(float n) {
        if (n < 0) return -1;
        if (n == 0) return 0;   // guard: the scaling loop below never ends for 0
        int times = 1;
        while (n < 1) {
            n = n * 100;
            times++;
        }
        if (n < 0) n = -1 * n;
        float low = 0, high = n, xlow = high, xhigh = low, sqrt = 0, res = 0;
        while (res != n) {
            sqrt = (high + low) / 2;
            res = sqrt * sqrt;
            if (res > n) high = sqrt;
            else if (res < n) low = sqrt;
            // stop when the bracket no longer changes
            if (xlow == low && xhigh == high) {
                break;
            } else {
                xlow = low;
                xhigh = high;
            }
        }
        // undo the scaling: each multiplication by 100 scaled the root by 10
        for (int i = 1; i < times; i++) {
            sqrt = sqrt / 10;
        }
        return sqrt;
    }

Further advice on precision calculation [modified]

Conor Manning

22 Feb 13 9:22

Mahmoud,
Further to Peter_in_2780's excellent comment, I'd like to point out that there is another error in your analysis of
the precision of each method.
After making the changes Peter suggested, by taking the sum of all of the absolute values of (correct -
answer), you will also need to change what you mean by 'correct'.
In your current implementation, you define precision as how far the answer is from your reference
implementation. Unfortunately, the reference implementation is simply another approximation; there's no
algorithm that can calculate the square root of any given real number with complete accuracy.
So to calculate the precision, you'll actually need to square your sqrt and see how close that is to the input. An
example should make this clear; here I'll take sqrt13:
precision = abs(sqrt13(x) * sqrt13(x) - x)
Something else you might want to consider is that M=10000 isn't really a high value. Some of the methods
listed here, such as the Quake method, have constant order, so I'd expect them to do much better than
iterative methods for high inputs.
I hope you'll find the time to edit the article, because otherwise developers might be misinformed. Thanks.
modified 22 Feb 13 14:30.
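The measurement proposed in this comment can be sketched as a small helper. This is only an illustration of the suggestion, not code from the article: `MeanResidual` is a hypothetical name, and averaging over x = 1 .. m-1 mirrors the range of the article's test loops.

```cpp
#include <cmath>

// Hypothetical helper following the suggestion above: average
// |f(x) * f(x) - x| over x = 1 .. m-1. Each term is taken in absolute
// value, so errors of opposite sign cannot cancel in the sum.
template <typename Fn>
double MeanResidual(Fn fn, const int m)
{
    double sum = 0.0;
    for (int i = 1; i < m; i++)
    {
        const double r = fn((float)i);
        sum += std::fabs(r * r - i);
    }
    return sum / (m - 1);
}
```

A call such as `MeanResidual([](float x) { return sqrt3(x); }, M)` gives a figure that does not depend on a reference implementation; lower is better, and 0 would mean every squared result equals its input.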


Good Read

Bassam AbdulBaki

29 May 12 8:26

Good read, especially that magic number from Quake III. If you're also testing accuracy, you may want to add
the new optimal magic number found using the Quake III approach: 0x5f37642f.
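One way to try a different constant is to make the magic number a parameter and measure the worst error over the same inputs. The sketch below assumes a 32-bit float; `sqrt_magic` and `max_relative_error` are hypothetical names, and the arithmetic is that of sqrt2 with one Newton step.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical parameterised form of sqrt2: the magic constant is an
// argument, so different constants can be compared on the same inputs.
float sqrt_magic(const float x, const std::uint32_t magic)
{
    const float xhalf = 0.5f * x;
    std::uint32_t i;
    std::memcpy(&i, &x, sizeof i);
    i = magic - (i >> 1);                  // initial guess for 1/sqrt(x)
    float y;
    std::memcpy(&y, &i, sizeof y);
    return x * y * (1.5f - xhalf * y * y); // one Newton step
}

// Largest relative error against std::sqrt over the integers 1 .. m-1.
double max_relative_error(const std::uint32_t magic, const int m)
{
    double worst = 0.0;
    for (int i = 1; i < m; i++)
    {
        const double exact = std::sqrt((double)i);
        const double err = std::fabs(sqrt_magic((float)i, magic) - exact) / exact;
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

Comparing `max_relative_error(0x5f3759df, M)` with `max_relative_error(0x5f37642f, M)` shows how the two constants behave after one Newton step on a given range; which one comes out ahead is left to the measurement.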

Re: Good Read

Mahmoud Hesham ElMagdoub

5 Jun 12 21:33

Thanks
I will add it in my next revision

Iam Nothing

actual time taken by the methods?

utkarshs

28 May 12 7:47

Speed here is compared in percentages, so what is the actual time taken by the functions? For example, how
much time would they take to calculate the square root of 2 to 100 decimal places? My method takes 27 seconds
for that, but has 100% accuracy. I mean, not relative to sqrt, but actual 100% accuracy. How would you rate it
on your scale? Answer please.

Re: actual time taken by the methods?

Mahmoud Hesham ElMagdoub

28 May 12 8:41

Thank you for your question.
I'm really glad you made a new algorithm.
About the time, I cannot give you an absolute time because it's relative and depends on your computer's speed.
You can try any of the algorithms in my article in your code and see how long it takes.
Please update me with the results.
Keep it up

Iam Nothing

Article Copyright 2010 by Mahmoud Hesham ElMagdoub
Everything else Copyright CodeProject, 1999-2014
