
Information about the compiler of this document:

Name: Ganesh Gopal
Email: 31gane@gmail.com

This document has been compiled from freely licensed resources available on the Internet. I would like to share it with anyone who would like to edit it and add more useful information for students, engineers, home makers, advanced users and scientists, so that they can liberate themselves from the iron fists of capitalistic and monopolising corporates. I would also like to thank all the organisations from which I have copied information. Those who would like to edit and add information are encouraged to do so. This document follows a Creative Commons licence.

Science the GNU Way, Part I


In my past several articles, I've looked at various packages to do all kinds of science. Sometimes, however, there just isn't a tool to solve a particular problem. That's the great thing about science. There is always something new to discover and study. But, this means it's up to you to develop the software tools you need to do your analysis. This article takes a look at the GNU Scientific Library, or GSL. This library is the Swiss Army library of routines that you will find useful in your work.

First, you need to get a copy of GSL and install it on your system. Because it is part of the GNU Project, it is hosted at http://www.gnu.org/s/gsl. You always can download and build from the source code, but all major distributions should have packages available. For example, on Debian-based systems, you need to install the package libgsl0-dev to develop your code and gsl-bin to run that code. GSL is meant for C and C++, so you also need a compiler. Most of you probably already are familiar with GCC, so I stick with that here. The next step is actually the last step: compiling your code. I'm looking at compiling now so that afterward I can focus on all the tools available in GSL. All the header files for GSL are stored in a subdirectory named gsl. So, for example, if you wanted to include the math header file, you would use:
#include <gsl/gsl_math.h>

All the functions are stored in a single library file called libgsl.a or libgsl.so. You also need a library to handle basic linear algebra. GSL provides one (in the file libgslcblas.so), but you can use your own. Additionally, you need to link in the math library. So, the final compile and link command should look like this:
gcc -o hello_world hello_world.c -lgsl -lgslcblas -lm

There are optional inline versions of some of the performance-critical functions in GSL. To use these, you need to include -DHAVE_INLINE in your compile command. To help with portability, GSL also provides its own versions of useful functions that exist only on certain platforms. As an example, the BSD math library has a function called hypot; GSL offers its own version, called gsl_hypot, that you can use on non-BSD platforms. Some functions have both a general algorithm and optimized versions for specific platforms. This way, if you are running on a SPARC, for example, you can select a version optimized for SPARC if one exists.

One of the first things you likely will want to do is check whether you are getting correct results from your code or whether there were errors. GSL has a number of functions and data structures for this, declared in the header file gsl_errno.h. Functions return a value of zero if everything is fine. If there were any problems in trying to complete the requested action, a nonzero value is returned.

This could be an actual error condition, like a wrong data type or a memory error, or it could be a condition like not being able to converge to within the requested accuracy in the function call. This is why you always need to check the return value of every GSL function call. The actual values returned in an error condition are error codes, defined in the file gsl_errno.h. They are defined as macros that start with GSL_. Examples include the following:

GSL_EDOM: domain error, used by functions when an argument doesn't fall into the domain over which the function is defined.
GSL_ERANGE: range error, either an overflow or an underflow.
GSL_ENOMEM: no memory available.

The library will use values only up to 1024; values above this are available for use in your own code. There also are string versions of these error codes. You can translate an error code to its text message with the function gsl_strerror().

Now that you know how to compile your program and what to do with errors, let's start looking at what kind of work you can do with GSL. Basic mathematical functions are defined in the file gsl_math.h. This part of GSL provides the set of mathematical constants from the BSD math library. All of the constants start with M_. Here are a few of them:

M_PI: pi.
M_SQRT2: the square root of 2.
M_EULER: Euler's constant.

There also are capabilities for dealing with infinities and non-numbers. Three macros define the values themselves: GSL_POSINF (positive infinity), GSL_NEGINF (negative infinity) and GSL_NAN (not a number). There also are functions to test variables: gsl_isnan (is it not a number?), gsl_isinf (is it infinite?) and gsl_finite (is it finite?). There is a macro to find the sign of a number: GSL_SIGN(x) returns the sign of x, 1 if it is positive and -1 if it is negative. If you are interested in seeing whether a number is even or odd, two macros are defined: GSL_IS_ODD(x) and GSL_IS_EVEN(x). These return 1 if the condition is true and 0 if it is not.
A series of elementary functions are part of the BSD math library. GSL provides versions of these for platforms that don't have native versions, including:

gsl_hypot: calculates the hypotenuse.
gsl_asinh, gsl_acosh, gsl_atanh: the inverse hyperbolic trig functions.

If you are calculating the power of a number, you would use gsl_pow_int(x,n), which gives you x to the power of n. There are specific versions for powers less than 10. So, if you wanted to find the cube of a number, you would use gsl_pow_3. These are very efficient and highly optimized. You even can inline these specialized functions when HAVE_INLINE is defined.

Several macros are defined to help you find the maximum or minimum of numbers, based on data type. The basic GSL_MAX(a,b) and GSL_MIN(a,b) simply return the maximum or minimum of the two numbers a and b. GSL_MAX_DBL and GSL_MIN_DBL find the maximum and minimum of two doubles using an inline function, and GSL_MAX_INT and GSL_MIN_INT do the same for integer arguments.

When you do any kind of numerical calculation on a computer, errors always are introduced by roundoff and truncation, because you can't exactly represent real numbers on a finite binary system. But, what if you want to compare two numbers and see whether they are approximately the same? GSL provides the function gsl_fcmp(x,y,epsilon). This function compares the two doubles x and y, and checks whether they are within epsilon of each other. If they are within this range, the function returns 0. If x < y, it returns -1, and it returns 1 if x > y.

Complex numbers are used in many scientific fields. Within GSL, complex data types are defined in the header file gsl_complex.h, and relevant functions are defined in gsl_complex_math.h. To store complex numbers, the data type gsl_complex is defined. This is a struct that stores the two portions. You can set the values with the functions gsl_complex_rect(x,y) or gsl_complex_polar(x,y). The first represents x+iy, whereas in the second, x is the radius and y is the angle in a polar representation. You can pull out the real and imaginary parts of a complex number with the macros GSL_REAL and GSL_IMAG. There is a function available to find the absolute value of a complex number, gsl_complex_abs(x), where x is of type gsl_complex. Because complex numbers actually are built up of two parts, even basic arithmetic is not simple. To do basic math, you can use the following:

gsl_complex_add(a,b)
gsl_complex_sub(a,b)
gsl_complex_mul(a,b)
gsl_complex_div(a,b)

You can calculate the conjugate with gsl_complex_conjugate(a) and the inverse with gsl_complex_inverse(a).

Functions also are provided for the usual mathematical operations. To calculate the square root, you would use gsl_complex_sqrt(x); to calculate the logarithm, gsl_complex_log(x). Several others are available too. Trigonometric functions are provided, like gsl_complex_sin(x), along with hyperbolic trigonometric functions and the relevant inverse functions. Now that you have the basics down, my next article will explore all the actual scientific calculations you can do. I'll look at statistics, linear algebra, random numbers and many other topics.

What is Scientific Programming?

This article will take you into the world of scientific programming, from simple numerical computations to some complex mathematical models and simulations. We will explore various computational tools, but our focus will remain scientific programming with Python. I have chosen Python because it combines remarkable power with clean, simple and easy-to-understand syntax. That some of the most robust scientific packages have been written in Python makes it a natural choice for scientific computational tasks. Scientific programming, or in broader terms, scientific computing, deals with solving scientific problems with the help of computers, so as to obtain results more quickly and accurately. Computers have long been used for solving complex scientific problems; however, advancements in computer science and hardware technologies over the years have also allowed students and academicians to play around with robust scientific computation tools. Although tools like Mathematica and Matlab remain commercial, the open source community has developed some equally powerful computational tools, which can easily be used by students and independent researchers. In fact, these tools are so robust that they are now used at educational institutions and research labs across the globe. So, let's move on to setting up a scientific environment.

Setting up the environment


Most UNIX systems and Linux distributions have Python installed by default. We will use Python 2.6.6 for the purposes of this article. It's recommended that you install IPython, as it offers enhanced introspection, additional shell syntax, syntax highlighting and tab-completion; it is available from the IPython website or your distribution's repositories. Next, we'll install the two most basic scientific computational packages for Python: NumPy and SciPy. The former is the fundamental package needed for scientific computing with Python. It contains a powerful N-dimensional array object, sophisticated functions, tools for integrating C/C++ and Fortran code, along with useful linear algebra, Fourier transform and random-number capabilities. The SciPy library is built to work with NumPy arrays, and provides many user-friendly and efficient numerical routines, for example for numerical integration and optimisation. Open the Synaptic Package Manager and install the python-numpy and python-scipy packages. Now that we have NumPy and SciPy installed, let's get our hands dirty with some mathematical functions and equations!

Figure 1: NumPy and SciPy Installation

Numerical computations with NumPy, SciPy and Maxima


NumPy offers efficient array computations with fixed-size, homogeneous, multi-dimensional array types, and a plethora of functions to perform various array operations. Array-programming languages like NumPy generalise operations on scalars to apply transparently to vectors, matrices and other higher-dimensional arrays. Python does not have a default array data type, and processing data with Python lists and for loops is dramatically slower than the corresponding operations in compiled languages like Fortran, C and C++. NumPy comes to the rescue, with its environment for array computation, similar to basic Matlab. You can create a simple array with the array function in NumPy:

In [1]: import numpy as np
In [2]: a = np.array([1,2,3,4,5])
In [3]: b = np.array([6,7,8,9,10])
In [4]: type(b)        # check the datatype
Out[4]: <type 'numpy.ndarray'>
In [5]: a+b
Out[5]: array([ 7,  9, 11, 13, 15])

You can also reshape a simple array into a matrix-like array using the shape attribute:

In [6]: c = np.array([1,4,5,7,2,6])
In [7]: c.shape = (2,3)
In [8]: c
Out[8]:
array([[1, 4, 5],
       [7, 2, 6]])    # reshaped into a 2x3 matrix
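The interactive session above can also be collected into a stand-alone script; nothing new here, just the same calls in file form:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([6, 7, 8, 9, 10])

# Elementwise addition: NumPy applies '+' across whole arrays at once.
c = a + b
print(c)          # [ 7  9 11 13 15]

# Reshape a flat 6-element array into 2 rows by 3 columns.
m = np.array([1, 4, 5, 7, 2, 6])
m.shape = (2, 3)
print(m)
```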

Matrix operations
Now let us take a look at some simple matrix operations. A matrix, and matrix multiplication, can be defined simply as follows:

# Defining a matrix and matrix multiplication
In [1]: import numpy as np
In [2]: x = np.array([[1,2,3],[4,5,6],[7,8,9]])
In [3]: y = np.array([[1,4,5],[2,6,5],[6,8,3]])   # another matrix
In [4]: z = np.dot(x,y)   # matrix multiplication using the dot function
In [5]: z
Out[5]:
array([[ 23,  40,  24],
       [ 50,  94,  63],
       [ 77, 148, 102]])

You can also create matrices in NumPy using the matrix class. However, it's preferable to use arrays, since most NumPy functions return arrays, and not matrices. Moreover, matrix objects have a maximum rank of 2; to hold rank-3 data, you need an array. Also, arrays are closer in semantics to tensor algebra than matrix objects are. The following example shows how to transpose a matrix and define a diagonal matrix:

In [6]: x = np.array([[1,2,3],[4,5,6],[7,8,9]])
In [7]: xT = np.transpose(x)   # take the transpose of the matrix
In [8]: xT
Out[8]:
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])
In [9]: n = np.diag(range(1,4))   # defining a diagonal matrix
In [10]: n
Out[10]:
array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])

Linear algebra
You can also solve linear algebra problems using the linalg package contained in SciPy. Let us look at a few more examples, calculating a matrix inverse and a determinant:

# Matrix inverse
In [1]: import numpy as np
In [2]: m = np.array([[1,3,3],[1,4,3],[1,3,4]])
In [3]: np.linalg.inv(m)   # take the inverse with the linalg.inv function
Out[3]:
array([[ 7., -3., -3.],
       [-1.,  1.,  0.],
       [-1.,  0.,  1.]])

# Calculating a determinant
In [4]: z = np.array([[0,0],[0,1]])
In [5]: np.linalg.det(z)
Out[5]: 0.0   # z is a singular matrix and hence its determinant is zero
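A useful habit is to check a computed inverse by multiplying it back with the original matrix, which should give the identity. A short sketch of this check:

```python
import numpy as np

m = np.array([[1, 3, 3],
              [1, 4, 3],
              [1, 3, 4]])

minv = np.linalg.inv(m)   # inverse is [[7,-3,-3],[-1,1,0],[-1,0,1]]

# m times its inverse should give the 3x3 identity matrix.
identity = np.dot(m, minv)
print(np.allclose(identity, np.eye(3)))   # True

# The determinant of a singular matrix is zero, so inv() would
# raise LinAlgError if we tried to invert z.
z = np.array([[0, 0], [0, 1]])
print(np.linalg.det(z))
```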

Integration
The scipy.integrate package provides several integration techniques, which can be used to solve simple and complex integrations. The package provides various methods to integrate functions. We will be discussing a few of them here. Let us first understand how to integrate the following functions:

# Simple integration of x^2
In [1]: import scipy as sp
In [2]: from scipy.integrate import quad
In [3]: sp.integrate.quad(lambda x: x**2, 0, 3)
Out[3]: (9.0, 9.9922072216264089e-14)

# Integration of 2^sqrt(x)/sqrt(x)
In [4]: from numpy import sqrt
In [5]: sp.integrate.quad(lambda x: 2**sqrt(x)/sqrt(x), 1, 3)
Out[5]: (3.8144772785946079, 4.2349205016052412e-14)

The first argument to quad is a callable Python object (i.e., a function, method or class instance). We have used a lambda function as the argument in this case. (A lambda function is one that takes any number of arguments, including optional arguments, and returns the value of a single expression.) The next two arguments are the limits of integration. The return value is a tuple, with the first element holding the estimated value of the integral, and the second element holding an upper bound on the error.
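The same two integrals as a stand-alone script (using np.sqrt explicitly so it runs outside an interactive pylab session):

```python
import numpy as np
from scipy.integrate import quad

# Integral of x^2 from 0 to 3; the exact answer is 3^3/3 = 9.
value, error = quad(lambda x: x**2, 0, 3)
print(value)

# Integral of 2^sqrt(x)/sqrt(x) from 1 to 3.
value2, error2 = quad(lambda x: 2**np.sqrt(x) / np.sqrt(x), 1, 3)
print(value2)
```

Note how unpacking the returned tuple into (value, error) makes the estimate and its error bound available separately.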

Differentiation
We can get a derivative at a point via automatic differentiation, supported by FuncDesigner and OpenOpt, which are scientific packages based on SciPy. Note that automatic differentiation is different from symbolic and numerical differentiation. In symbolic differentiation, the function is differentiated as an expression, which is then evaluated at a point. Numerical differentiation makes use of the method of finite differences. Automatic differentiation, however, is the decomposition of differentials provided by the chain rule. A complete understanding of automatic differentiation is beyond the scope of this article, so I'd recommend that interested readers refer to Wikipedia. Automatic differentiation works by decomposing the vector function into elementary sequences, which are then differentiated by a simple table lookup. Unfortunately, a deeper understanding of automatic differentiation is required to make full use of the scientific packages provided in Python. Hence, in this article, we'll focus on symbolic differentiation, which is easier to understand and implement. We'll be using a powerful computer algebra system known as Maxima for symbolic differentiation. Maxima is a version of the MIT-developed MACSYMA system, modified to run under CLISP. Written in Lisp, it allows differentiation, integration, solutions for linear or polynomial equations, factoring of polynomials, expansion of functions in Laurent or Taylor series, computation of the Poisson series, matrix and tensor manipulations, and two- and three-dimensional graphics. Open the Synaptic Package Manager and install the maxima package. Once installed, you can run it by executing the maxima command in the terminal. We'll be differentiating the following simple functions with the help of Maxima:

d/dx (x^4)
d/dx (sin x + tan x)
d/dx (1/log x)

Figure 2 displays Maxima in action.

Figure 2: Differentiation of some simple functions

You simply have to define the function in diff(), and Maxima will calculate the derivative for you:

(%i1) diff(x^4);
(%o1) 4 x^3 del(x)
(%i2) diff(sin(x) + tan(x));
(%o2) (sec(x)^2 + cos(x)) del(x)
(%i3) diff(1/log(x));
(%o3) - del(x)/(x log(x)^2)

The command diff(expr, var, num) differentiates the expression in Slot 1 with respect to the variable entered in Slot 2, a number of times determined by the positive integer in Slot 3. Unless a dependency has been established, all parameters and variables in the expression are treated as constants when taking the derivative. Similarly, you can also calculate higher-order differentials with Maxima.
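If you would rather do symbolic differentiation without leaving Python, the SymPy package (not used elsewhere in this article; treat this as an optional alternative to Maxima) computes the same three derivatives:

```python
import sympy as sp

x = sp.Symbol('x')

# The same three derivatives computed symbolically in Python.
d1 = sp.diff(x**4, x)                   # 4*x**3
d2 = sp.diff(sp.sin(x) + sp.tan(x), x)  # cos(x) + tan(x)**2 + 1
d3 = sp.diff(1 / sp.log(x), x)          # -1/(x*log(x)**2)

print(d1)
print(d2)
print(d3)
```

SymPy writes sec(x)^2 as tan(x)^2 + 1, which is the same function in a different form.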

Ordinary differential equations


Maxima can also be used to solve ODEs. We'll dive straight into some examples to understand how to solve ODEs with Maxima. Consider the following differential equations:

dx/dt = e^(-t) + x
d^2x/dt^2 - 4x = 0

Consider Figure 3.

Figure 3: Solving simple differential equations

Figure 4: Getting solutions of differential equations at a point

Let's rewrite our example ordinary differential equations using the noun form of diff, which uses a single quote. Then use ode2, and call the general solution gsoln. The function ode2 solves an ordinary differential equation (ODE) of the first or second order. It takes three arguments: an ODE given by eqn, the dependent variable dvar, and the independent variable ivar. When successful, it returns either an explicit or implicit solution for the dependent variable. %c is used to represent the integration constant in the case of first-order equations, and %k1 and %k2 the constants for second-order equations. We can also find the solution at predefined points using ic1, and call this particular solution psoln. Consider the following non-linear first-order differential equation:

(x^2 y) dy/dx = x y + x^3 - 1

Let's first define the equation, and then solve it with ode2. Further, let us find the particular solution at the point x=1, y=1 using ic1. We can also solve ODEs with NumPy and SciPy using the FuncDesigner and OpenOpt packages; however, both these packages make use of automatic differentiation, which is why Maxima was chosen over them here. ODEs can also be solved using the scipy.integrate.odeint function. We will use it later for mathematical modelling.
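Before the full spring-mass model later in the article, here is a minimal sketch of scipy.integrate.odeint on a single equation, dy/dt = -2y, whose exact solution is e^(-2t):

```python
import numpy as np
from scipy.integrate import odeint

def deriv(y, t):
    # Right-hand side of dy/dt = -2*y.
    return -2.0 * y

t = np.linspace(0, 2, 50)
y = odeint(deriv, 1.0, t)   # initial condition y(0) = 1

# Compare the numerical solution against the exact one, exp(-2t).
exact = np.exp(-2.0 * t)
print(np.max(np.abs(y[:, 0] - exact)))
```

The comparison against the analytic solution is a handy way to convince yourself the solver is configured correctly before moving to a system with no closed-form answer.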

Curve plotting with MatPlotLib


It's said that a picture is worth a thousand words, and there's no denying the fact that it's much more convenient to make sense of a scientific experiment by looking at the plots than by looking just at the raw data. In this article, we'll be focusing on MatPlotLib, which is a Python package for 2D plotting that produces production-quality graphs. MatPlotLib is customisable and extensible, and is integrated with LaTeX markup, which is really useful when writing scientific papers. Let us make a simple plot with the help of MatPlotLib:

#! /usr/bin/python
# Simple plot with MatPlotLib
import matplotlib.pyplot as plt

x = range(10)
plt.plot(x, [xi**3 for xi in x])
plt.show()

Figure 5: Simple plot with MatPlotLib

Let us take another example using the arange function; arange(x,y,z) is a part of NumPy, and it generates a sequence of elements from x to y with spacing z.

#! /usr/bin/python
# Simple plot with MatPlotLib
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 20, 2)
plt.plot(x, [xi**2 for xi in x])
plt.show()

We can also add labels, legends, the grid and axis name in the plot. Take a look at Figure 6, and the following code:

Figure 6: The plot after utilising the arange function

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 20, 2)
plt.title('Sample Plot')
plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.plot(x, [xi**3 for xi in x], label='Fast')
plt.plot(x, [xi**4 for xi in x], label='Slow')
plt.legend()
plt.grid(True)
plt.savefig('plot.png')   # save before show(), which discards the figure
plt.show()

Figure 7: Multiline plot with MatPlotLib

You can create various types of plots using MatPlotLib. Let us take a look at a pie plot and a scatter plot.

import matplotlib.pyplot as plt

plt.figure(figsize=(10,10))
plt.title('Distribution of Dark Energy and Dark Matter in the Universe')
x = [74.0, 22.0, 3.6, 0.4]
labels = ['Dark Energy', 'Dark Matter', 'Intergalactic gas', 'Stars, etc.']
plt.pie(x, labels=labels, autopct='%1.1f%%')
plt.show()

Figure 8: Pie chart with MatPlotLib

Figure 9: Scatter Plot with MatPlotLib

import matplotlib.pyplot as plt
import numpy as np

plt.title('Scatter Plot')
x = np.random.randn(200)
y = np.random.randn(200)
plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.scatter(x, y)
plt.show()

Similarly, you can plot histograms and bar charts using the plt.hist() and plt.bar() functions, respectively. In our next example, we will generate a plot by using data from a text file:

import matplotlib.pyplot as plt
import numpy as np

data = np.loadtxt('ndata.txt')
x = data[:,0]
y = data[:,1]
plt.figure(1, figsize=(6,4))
plt.grid(True)
lw = 1
plt.xlabel('x')
plt.plot(x, y, 'b', linewidth=lw)
plt.show()

After executing this program, it results in the plot shown in Figure 10.

Figure 10: Plotting by fetching data from the text file

Figure 11: Spring-mass system

So, what's happening here? First of all, we fetch data from the text file using the loadtxt function, which splits each non-empty line into a sequence of strings; empty or commented lines are simply skipped. The fetched data is then distributed into variables using slicing. The figure function creates a new figure of the specified dimensions, whereas the plot function creates a new line plot.
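A sketch of this loadtxt behaviour, using a small data file written on the fly (the file name and the values in it are made up purely for illustration):

```python
import numpy as np
import os
import tempfile

# Write a small two-column data file; the '#' line is a comment,
# which loadtxt skips, just like empty lines.
content = "# x  y\n0.0 0.0\n1.0 1.0\n2.0 4.0\n3.0 9.0\n"
path = os.path.join(tempfile.mkdtemp(), "ndata.txt")
with open(path, "w") as f:
    f.write(content)

data = np.loadtxt(path)
x = data[:, 0]    # first column, extracted with a slice
y = data[:, 1]    # second column
print(data.shape) # (4, 2)
```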

Mathematical modelling
Now that we have a basic understanding of various computation tools, we can move on to some more complex problems related to mathematics and physics. Let's take a look at one of the problems provided by the SciPy community. The example is available on the Internet (at the SciPy website). However, some of the methods explained in this example are deprecated; hence, we'll rebuild the example so that it works correctly with the latest versions of SciPy and NumPy. We're going to build and simulate a model based on a coupled spring-mass system, which is essentially a harmonic oscillator, in which a spring is stretched or compressed by a mass, thereby developing a restoring force in the spring, which results in harmonic motion when the mass is displaced from its equilibrium position. For an undamped system, the motion of Block 1 is given by the following differential equation:

m1 d^2x1/dt^2 + (k1 + k2) x1 - k2 x2 = 0

For Block 2:

m2 d^2x2/dt^2 + k2 x2 - k2 x1 = 0

In this example, we've taken a coupled spring-mass system that is subjected to a frictional force, thereby resulting in damping. Note that damping tends to reduce the amplitude of oscillations in an oscillatory system. For our example, let us assume that the lengths of the springs, when subjected to no external forces, are L1 and L2. The following differential equations define such a system:

m1 d^2x1/dt^2 + b1 dx1/dt + k1 (x1 - L1) - k2 (x2 - x1 - L2) = 0

and:

m2 d^2x2/dt^2 + b2 dx2/dt + k2 (x2 - x1 - L2) = 0

We'll be using the SciPy odeint function to solve this problem. The function works on first-order differential equations; hence, we'll rewrite the two second-order equations as four first-order equations:

dx1/dt = y1
dy1/dt = (-b1 y1 - k1 (x1 - L1) + k2 (x2 - x1 - L2)) / m1
dx2/dt = y2
dy2/dt = (-b2 y2 - k2 (x2 - x1 - L2)) / m2

Now, let's write a simple Python script, saved as two_springs.py, to define this problem:

#! /usr/bin/python

def vectorfield(w, t, p):
    x1, y1, x2, y2 = w
    m1, m2, k1, k2, L1, L2, b1, b2 = p
    f = [y1,
         (-b1*y1 - k1*(x1-L1) + k2*(x2-x1-L2)) / m1,
         y2,
         (-b2*y2 - k2*(x2-x1-L2)) / m2]
    return f

In this script, we have simply defined the above-mentioned equations programmatically. The argument w defines the state variables, t is the time, and p defines the vector of the parameters. In short, we have simply defined the vector field for the spring-mass system in this script. Now, let's define a script that uses odeint to solve the equations for a given set of parameter values, initial conditions and time interval. The script prints the points in the solution to the terminal.

#! /usr/bin/python
from scipy.integrate import odeint
import two_springs

# Parameter values
# Masses:
m1 = 1.0
m2 = 1.5
# Spring constants
k1 = 8.0
k2 = 40.0
# Natural lengths
L1 = 0.5
L2 = 1.0
# Friction coefficients
b1 = 0.8
b2 = 0.5

# Initial conditions
# x1 and x2 are the initial displacements; y1 and y2 are the initial velocities
x1 = 0.5
y1 = 0.0
x2 = 2.25
y2 = 0.0

# ODE solver parameters
abserr = 1.0e-8
relerr = 1.0e-6
stoptime = 10.0
numpoints = 250

# Create the time samples for the output of the ODE solver.
t = [stoptime*float(i)/(numpoints-1) for i in range(numpoints)]

# Pack up the parameters and initial conditions:
p = [m1, m2, k1, k2, L1, L2, b1, b2]
w0 = [x1, y1, x2, y2]

# Call the ODE solver.
wsol = odeint(two_springs.vectorfield, w0, t, args=(p,), atol=abserr, rtol=relerr)

# Print the solution.
for t1, w1 in zip(t, wsol):
    print t1, w1[0], w1[1], w1[2], w1[3]

The scipy.integrate.odeint function integrates a system of ordinary differential equations. It takes the following parameters:

func: callable(y, t0, ...): computes the derivative of y at t0.
y0: array: the initial condition on y (can be a vector).
t: array: a sequence of time points for which to solve for y; the initial value point should be the first element of this sequence.
args: tuple: extra arguments to pass to the function.

In our example, we have added atol and rtol as extra arguments to deal with absolute and relative errors. The zip function takes one or more sequences as arguments, and returns a series of tuples that pair up parallel items taken from those sequences. Copy the solution generated by this script to a text file using the cat command, and name the text file two_springs.txt. The following script uses MatPlotLib to plot the solution generated by two_springs_solver.py:

#! /usr/bin/python
from pylab import *
from matplotlib.font_manager import FontProperties
import numpy as np

data = np.loadtxt('two_springs.txt')
t = data[:,0]
x1 = data[:,1]
y1 = data[:,2]
x2 = data[:,3]
y2 = data[:,4]

figure(1, figsize=(6,4))
xlabel('t')
grid(True)
lw = 1
plot(t, x1, 'b', linewidth=lw)
plot(t, x2, 'g', linewidth=lw)
legend((r'$x_1$', r'$x_2$'), prop=FontProperties(size=16))
title('Mass Displacements for the Coupled Spring-Mass System')
savefig('two_springs.png', dpi=72)

On running the script, we get the plot shown in Figure 12. It clearly shows how the mass displacements are reduced with time for damped systems.
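The vector field and solver can also be condensed into one self-contained script, with no intermediate text file. The check at the end confirms that damping really does shrink the displacement of the second mass from its rest position (x2 = L1 + L2 = 1.5):

```python
import numpy as np
from scipy.integrate import odeint

def vectorfield(w, t, p):
    """Right-hand side of the damped coupled spring-mass system."""
    x1, y1, x2, y2 = w
    m1, m2, k1, k2, L1, L2, b1, b2 = p
    return [y1,
            (-b1*y1 - k1*(x1 - L1) + k2*(x2 - x1 - L2)) / m1,
            y2,
            (-b2*y2 - k2*(x2 - x1 - L2)) / m2]

# m1, m2, k1, k2, L1, L2, b1, b2 (same values as in the scripts above)
p = [1.0, 1.5, 8.0, 40.0, 0.5, 1.0, 0.8, 0.5]
w0 = [0.5, 0.0, 2.25, 0.0]          # x1, y1, x2, y2
t = np.linspace(0.0, 10.0, 250)

wsol = odeint(vectorfield, w0, t, args=(p,), atol=1.0e-8, rtol=1.0e-6)

# The oscillation of x2 around its equilibrium 1.5 should decay.
x2 = wsol[:, 2]
print(np.max(np.abs(x2[:50] - 1.5)), np.max(np.abs(x2[-50:] - 1.5)))
```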

Figure 12: Plot of the spring-mass system

In this article, we have covered some of the most basic operations in scientific computing. However, we can also model and simulate more complex problems with NumPy and SciPy. These tools are now actively used for research in quantum physics, cosmology, astronomy, applied mathematics, finance and various other fields. With this basic understanding of scientific programming, you're now ready to explore deeper realms of this exciting world!

Device Drivers, Part 1: Linux Device Drivers for Your Girl Friend
This series on Linux device drivers aims to present the usually technical topic in a way that is more interesting to a wider cross-section of readers.

"After a week of hard work, we finally got our driver working," were Pugs' first words when he met his girlfriend, Shweta.

"Why? What was your driver up to? Was he sick? And what hard work did you do?" asked Shweta.

Confused, Pugs responded, "What are you talking about?"

Now it was Shweta's turn to look puzzled, as she replied, "Why ask me? You tell me: which of your drivers are you talking about?"

When understanding dawned on him, Pugs groaned, "Ah, c'mon! Not my car drivers. I am talking about a device driver on my computer."

"I know about car and bus drivers, pilots, and even screwdrivers; but what is this device driver?" queried Shweta, puzzled.

That was all it took to launch Pugs into a passionate explanation of device drivers for the newbie; in particular, Linux device drivers, which he had been working on for many years.

Of drivers and buses


A driver drives, manages, controls, directs and monitors the entity under its command. What a bus driver does with a bus, a device driver does with a computer device (any piece of hardware connected to a computer) like a mouse, keyboard, monitor, hard disk, Web-camera, clock, and more.

Further, a pilot could be a person or even an automatic system monitored by a person (an auto-pilot system in airliners, for example). Similarly, a specific piece of hardware could be controlled by a piece of software (a device driver), or could be controlled by another hardware device, which in turn could be managed by a software device driver. In the latter case, such a controlling device is commonly called a device controller. This, being a device itself, often also needs a driver, which is commonly referred to as a bus driver. General examples of device controllers include hard disk controllers, display controllers, and audio controllers that in turn manage devices connected to them. More technical examples would be an IDE controller, PCI controller, USB controller, SPI controller, I2C controller, etc. Pictorially, this whole concept can be depicted as in Figure 1.

Figure 1: Device and driver interaction

Device controllers are typically connected to the CPU through their respectively named buses (collections of physical lines), for example, the PCI bus, the IDE bus, etc. In today's embedded world, we encounter more microcontrollers than CPUs; these are a CPU plus various device controllers built onto a single chip. This effective embedding of device controllers primarily reduces cost and space, making it suitable for embedded systems. In such cases, the buses are integrated into the chip itself. Does this change anything for the drivers, or more generically, on the software front? The answer is, not much, except that the bus drivers corresponding to the embedded device controllers are now developed under the architecture-specific umbrella.

Drivers have two parts


Bus drivers provide hardware-specific interfaces for the corresponding hardware protocols, and are the bottom-most horizontal software layers of an operating system (OS). Over these sit the actual device drivers. These operate on the underlying devices using the horizontal layer interfaces, and hence are device-specific. However, the whole idea of writing these drivers is to provide an abstraction to the user, and so, at the other end, these do provide an interface (which varies from OS to OS). In short, a device driver has two parts, which are: a) device-specific, and b) OS-specific. Refer to Figure 2.

Figure 2: Linux device driver partition

The device-specific portion of a device driver remains the same across all operating systems, and is more about understanding and decoding the device data sheets than about software programming. A data sheet for a device is a document with technical details of the device, including its operation, performance, programming, etc.; in short, a device user manual. Later, I shall show some examples of decoding data sheets as well. However, the OS-specific portion is the one that is tightly coupled with the OS mechanisms of user interfaces, and thus differentiates a Linux device driver from a Windows device driver and from a MacOS device driver.

Verticals
In Linux, a device driver provides a system call interface to the user; this is the boundary line between the so-called kernel space and user-space of Linux, as shown in Figure 2. Figure 3 provides further classification.

Figure 3: Linux kernel overview

Based on the OS-specific interface of a driver, in Linux, a driver is broadly classified into three verticals:

1. Packet-oriented or the network vertical
2. Block-oriented or the storage vertical
3. Byte-oriented or the character vertical

The CPU vertical and memory vertical, taken together with the other three verticals, give the complete overview of the Linux kernel, matching any textbook definition of an OS: "An OS performs five management functions: CPU/process, memory, network, storage, and device I/O." Though these two verticals could be classified as device drivers, where the CPU and memory are the respective devices, they are treated differently, for many reasons. These are the core functionalities of any OS, be it a micro-kernel or a monolithic kernel. More often than not, adding code in these areas is mainly a Linux porting effort, which is typically done for a new CPU or architecture. Moreover, the code in these two verticals cannot be loaded or unloaded on the fly, unlike the other three verticals. Henceforth, when we talk about Linux device drivers, we mean only the latter three verticals in Figure 3.

Let's get a little deeper into these three verticals. The network vertical consists of two parts: a) the network protocol stack, and b) the network interface card (NIC) device drivers, or simply network device drivers, which could be for Ethernet, Wi-Fi, or any other network horizontals. Storage, again, consists of two parts: a) file-system drivers, to decode the various formats on different partitions, and b) block device drivers for the various storage (hardware) protocols, i.e., horizontals like IDE, SCSI, MTD, etc.

With this, you may wonder if that is the only set of devices for which you need drivers (or for which Linux has drivers). Hold on a moment; you certainly need drivers for the whole lot of devices that interface with the system, and Linux does have drivers for them.
However, their byte-oriented accessibility puts all of them under the character vertical; this is, in reality, the majority bucket. In fact, because of the vast number of drivers in this vertical, character drivers have been further sub-classified, so you have tty drivers, input drivers, console drivers, frame-buffer drivers, sound drivers, etc. The typical horizontals here would be RS232, PS/2, VGA, I2C, I2S, SPI, etc.

Multiple-vertical drivers
One final note on the complete picture (the placement of all the drivers in the Linux driver ecosystem): the horizontals like USB, PCI, etc., span below multiple verticals. Why is that? Simple: you already know that you can have a USB Wi-Fi dongle, a USB pen drive, and a USB-to-serial converter. All are USB, but they come under three different verticals! In Linux, the bus drivers, or the horizontals, are often split into two parts, or even two drivers: a) the device-controller-specific part, and b) an abstraction layer over that for the verticals to interface with, commonly called the cores. A classic example would be the USB controller drivers ohci, ehci, etc., and the USB abstraction, usbcore.

Summing up
So, to conclude, a device driver is a piece of software that drives a device, though there are many classifications. In case it drives only another piece of software, we call it just a driver; examples are file-system drivers, usbcore, etc. Hence, all device drivers are drivers, but not all drivers are device drivers. "Hey, Pugs, hold on; we're getting late for class, and you know what kind of trouble we can get into. Let's continue from here later," exclaimed Shweta.

Jumping up, Pugs finished his explanation: "Okay. This is the basic theory about device drivers. If you're interested, later, I can show you the code, and all that we have been doing for the various kinds of drivers." And they hurried towards their classroom.

Device Drivers, Part 2: Writing Your First Linux Driver in the Classroom

This article, which is part of the series on Linux device drivers, deals with the concept of dynamically loading drivers: first writing a Linux driver, then building and loading it.

Shweta and Pugs reached their classroom late, to find their professor already in the middle of a lecture. Shweta sheepishly asked for his permission to enter. An annoyed Professor Gopi responded, "Come on! You guys are late again; what is your excuse, today?" Pugs hurriedly replied that they had been discussing the very topic for that day's class: device drivers in Linux. Pugs was more than happy when the professor said, "Good! Then explain about dynamic loading in Linux. If you get it right, the two of you are excused!"

Pugs knew that one way to make his professor happy was to criticise Windows. He explained, "As we know, a typical driver installation on Windows needs a reboot for it to get activated. That is really not acceptable; suppose we need to do it on a server? That's where Linux wins. In Linux, we can load or unload a driver on the fly, and it is active for use instantly after loading. Also, it is instantly disabled when unloaded. This is called dynamic loading and unloading of drivers in Linux." This impressed the professor. "Okay! Take your seats, but make sure you are not late again." The professor continued to the class, "Now you already know what is meant by dynamic loading and unloading of drivers, so I'll show you how to do it, before we move on to writing our first Linux driver."

Dynamically loading drivers


These dynamically loadable drivers are more commonly called modules, and are built into individual files with a .ko (kernel object) extension. Every Linux system has a standard place under the root of the file system (/) for all the pre-built modules. They are organised similarly to the kernel source tree structure, under /lib/modules/<kernel_version>/kernel, where <kernel_version> would be the output of the command uname -r on the system, as shown in Figure 1.
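To see how that path is composed, here is a small shell sketch; the FAT driver path at the end is only an illustrative example, since its exact sub-directory can vary across kernel versions:

```shell
# Compose the pre-built modules path for the currently running kernel.
KVER=$(uname -r)
MODDIR="/lib/modules/${KVER}/kernel"
echo "Pre-built modules live under: ${MODDIR}"
# For instance, the FAT driver (exact location may vary by kernel version):
echo "The FAT driver would be at: ${MODDIR}/fs/fat/fat.ko"
```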

Figure 1: Linux pre-built modules

Figure 2: Linux module operations

To dynamically load or unload a driver, use these commands, which reside in the /sbin directory, and must be executed with root privileges:

lsmod - lists the currently loaded modules
insmod <module_file> - inserts/loads the specified module file
modprobe <module> - inserts/loads the module, along with any modules it depends on
rmmod <module> - removes/unloads the module

Let's look at the FAT filesystem-related drivers as an example. Figure 2 demonstrates this complete process of experimentation. The module files would be fat.ko, vfat.ko, etc., in the fat (vfat for older kernels) directory under /lib/modules/`uname -r`/kernel/fs. If they are in compressed .gz format, you need to uncompress them with gunzip before you can insmod them. The vfat module depends on the fat module, so fat.ko needs to be loaded first. To perform the decompression and dependency loading automatically, use modprobe instead. Note that you shouldn't specify the .ko extension in the module's name when using the modprobe command. rmmod is used to unload the modules.

Our first Linux driver


Before we write our first driver, let's go over some concepts. A driver never runs by itself. It is similar to a library that is loaded for its functions to be invoked by a running application. It is written in C, but lacks a main() function. Moreover, it will be loaded/linked with the kernel, so it needs to be compiled in a way similar to the kernel, and the header files you can use are only those from the kernel sources, not from the standard /usr/include.

One interesting fact about the kernel is that it is an object-oriented implementation in C, as we will observe even with our first driver. Any Linux driver has a constructor and a destructor. The module's constructor is called when the module is successfully loaded into the kernel, and the destructor when rmmod succeeds in unloading the module. These two are like normal functions in the driver, except that they are specified as the init and exit functions, respectively, by the macros module_init() and module_exit(), which are defined in the kernel header module.h.

/* ofd.c - Our First Driver code */
#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>

static int __init ofd_init(void) /* Constructor */
{
    printk(KERN_INFO "Namaskar: ofd registered");
    return 0;
}

static void __exit ofd_exit(void) /* Destructor */
{
    printk(KERN_INFO "Alvida: ofd unregistered");
}

module_init(ofd_init);
module_exit(ofd_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Our First Driver");

Given above is the complete code for our first driver; let's call it ofd.c. Note that there is no stdio.h (a user-space header); instead, we use the analogous kernel.h (a kernel-space header). printk() is the equivalent of printf(). Additionally, version.h is included for the module version to be compatible with the kernel into which it is going to be loaded. The MODULE_* macros populate module-related information, which acts like the module's signature.

Building our first Linux driver


Once we have the C code, it is time to compile it and create the module file ofd.ko. We use the kernel build system to do this. The following Makefile invokes the kernel's build system from the kernel source, and the kernel's Makefile will, in turn, invoke our first driver's Makefile to build our first driver. To build a Linux driver, you need to have the kernel source (or, at least, the kernel headers) installed on your system. The kernel source is assumed to be installed at /usr/src/linux. If it's at any other location on your system, specify the location in the KERNEL_SOURCE variable in this Makefile.

# Makefile - makefile of our first driver

# if KERNELRELEASE is defined, we've been invoked from the
# kernel build system and can use its language.
ifneq (${KERNELRELEASE},)
	obj-m := ofd.o

# Otherwise we were called directly from the command line.
# Invoke the kernel build system.
else
	KERNEL_SOURCE := /usr/src/linux
	PWD := $(shell pwd)

default:
	${MAKE} -C ${KERNEL_SOURCE} SUBDIRS=${PWD} modules

clean:
	${MAKE} -C ${KERNEL_SOURCE} SUBDIRS=${PWD} clean
endif

With the C code (ofd.c) and the Makefile ready, all we need to do is invoke make to build our first driver (ofd.ko).

$ make
make -C /usr/src/linux SUBDIRS=... modules
make[1]: Entering directory `/usr/src/linux'
  CC [M]  .../ofd.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      .../ofd.mod.o
  LD [M]  .../ofd.ko
make[1]: Leaving directory `/usr/src/linux'

Summing up
Once we have the ofd.ko file, perform the usual steps as the root user, or with sudo.

$ su
# insmod ofd.ko
# lsmod | head -10

lsmod should show you the ofd driver loaded.

While the students were trying out their first module, the bell rang, marking the end of the session. Professor Gopi concluded, "Currently, you may not be able to observe anything other than the lsmod listing showing the driver has loaded. Where's the printk output gone? Find that out for yourselves, in the lab session, and update me with your findings. Also note that our first driver is a template for any driver you would write in Linux. Writing a specialised driver is just a matter of what gets filled into its constructor and destructor. So, our further learning will be to enhance this driver to achieve specific driver functionalities."

Device Drivers, Part 3: Kernel C Extras in a Linux Driver

This article in the series on Linux device drivers deals with the kernel's message logging, and kernel-specific GCC extensions.

Enthused by how Pugs impressed their professor in the last class, Shweta wanted to do so too. And there was soon an opportunity: finding out where the output of printk had gone. So, as soon as she entered the lab, she grabbed the best system, logged in, and began work. Knowing her professor well, she realised that he would have dropped a hint about the possible solution in the previous class itself. Going over what had been taught, she remembered the error output demonstration from insmod vfat.ko running dmesg | tail. She immediately tried that, and found the printk output there. But how did it come to be there? A tap on her shoulder roused her from her thoughts. "Shall we go for a coffee?" proposed Pugs. "But I need to..." "I know what you're thinking about," interrupted Pugs. "Let's go. I'll explain to you all about dmesg."

Kernel message logging


Over coffee, Pugs began his explanation. As far as parameters are concerned, printf and printk are the same, except that when programming for the kernel, we don't bother about the float formats %f, %lf and the like. However, unlike printf, printk is not designed to dump its output to some console. In fact, it cannot do so; it is something in the background, and executes like a library, only when triggered either from hardware-space or user-space. All printk calls put their output into the (log) ring buffer of the kernel. Then, the syslog daemon running in user-space picks them up for final processing and redirection to various devices, as configured in the configuration file /etc/syslog.conf.

You must have observed the out-of-place macro KERN_INFO in the printk calls in the last article. That is actually a constant string, which gets concatenated with the format string after it, into a single string. Note that there is no comma (,) between them; they are not two separate arguments. There are eight such macros, defined in linux/kernel.h in the kernel source, namely:

#define KERN_EMERG   "<0>" /* system is unusable */
#define KERN_ALERT   "<1>" /* action must be taken immediately */
#define KERN_CRIT    "<2>" /* critical conditions */
#define KERN_ERR     "<3>" /* error conditions */
#define KERN_WARNING "<4>" /* warning conditions */
#define KERN_NOTICE  "<5>" /* normal but significant condition */
#define KERN_INFO    "<6>" /* informational */
#define KERN_DEBUG   "<7>" /* debug-level messages */

Now, depending on these log levels (i.e., the first three characters in the format string), the syslog user-space daemon redirects the corresponding messages to their configured locations. A typical destination is the log file /var/log/messages, for all log levels. Hence, all the printk outputs are, by default, in that file. However, they can be configured differently, to a serial port (like /dev/ttyS0), for instance, or to all consoles, like what typically happens for KERN_EMERG. Now, /var/log/messages is buffered, and contains messages not only from the kernel, but also from various daemons running in user-space. Moreover, this file is often not readable by a normal user. Hence, a user-space utility, dmesg, is provided to directly parse the kernel ring buffer, and dump it to standard output. Figure 1 shows snippets from the two.

Figure 1: Kernels message logging

Kernel-specific GCC extensions


Shweta, frustrated since she could no longer show off as having discovered all these on her own, retorted, "Since you have explained all about printing in the kernel, why don't you also tell me about the weird C in the driver as well: the special keywords __init, __exit, etc.?"

"These are not special keywords. Kernel C is not weird C, but just standard C with some additional extensions from the C compiler, GCC. The macros __init and __exit are just two of these extensions. However, these do not have any relevance when we use them in a dynamically loadable driver, but only when the same code gets built into the kernel. All functions marked with __init get placed inside the init section of the kernel image automatically, by GCC, during kernel compilation; and all functions marked with __exit are placed in the exit section of the kernel image. What is the benefit of this? All functions with __init are supposed to be executed only once, during boot-up (and not executed again till the next boot-up). So, once they are executed during boot-up, the kernel frees up RAM by removing them (by freeing the init section). Similarly, all functions in the exit section are supposed to be called during system shutdown. Now, if the system is shutting down anyway, why do you need to do any cleaning up? Hence, the exit section is not even loaded into the kernel: another cool optimisation. This is a beautiful example of how the kernel and GCC work hand-in-hand to achieve a lot of optimisation, and many other tricks that we will see as we go along. And that is why the Linux kernel can only be compiled using GCC-based compilers: a closely knit bond."

The kernel functions' return guidelines


While returning from coffee, Pugs kept praising OSS and the community that's grown around it. "Do you know why different individuals are able to come together and contribute excellently without any conflicts, and in a project as huge as Linux, at that? There are many reasons, but most important amongst them is that they all follow and abide by inherent coding guidelines. Take, for example, the kernel programming guideline for returning values from a function. Any kernel function needing error handling typically returns an integer-like type, and the return value again follows a guideline. For an error, we return a negative number: a minus sign prefixed to a macro that is available through the kernel header linux/errno.h, which includes the various error-number headers under the kernel sources, namely asm/errno.h, asm-generic/errno.h, and asm-generic/errno-base.h. For success, zero is the most common return value, unless there is some additional information to be provided. In that case, a positive value is returned, the value indicating the information, such as the number of bytes transferred by the function."

Kernel C = pure C
Once back in the lab, Shweta remembered their professor mentioning that no /usr/include headers can be used for kernel programming. But Pugs had said that kernel C is just standard C with some GCC extensions. Why this conflict? Actually, this is not a conflict. Standard C is pure C: just the language. The headers are not part of it. Those are part of the standard libraries built for C programmers, based on the concept of reusing code. Does that mean that all standard libraries, and hence all ANSI standard functions, are not part of pure C? Yes, that's right. Then, was it really tough coding the kernel?

"Well, not for this reason. In reality, kernel developers have evolved their own set of required functions, which are all part of the kernel code. The printk function is just one of them. Similarly, many string functions, memory functions, and more, are all part of the kernel source, under various directories like kernel, ipc, lib, and so on, along with the corresponding headers under the include/linux directory." "Oh yes! That is why we need to have the kernel source to build a driver," agreed Shweta. "If not the complete source, at least the headers are a must. And that is why we have separate packages to install the complete kernel source, or just the kernel headers," added Pugs. "In the lab, all the sources are set up. But if I want to try out drivers on my Linux system in my hostel room, how do I go about it?" asked Shweta. "Our lab has Fedora, where the kernel sources are typically installed under /usr/src/kernels/<kernel_version>, unlike the standard /usr/src/linux. The lab administrators must have installed them using the command yum install kernel-devel. I use Mandriva, and installed the kernel sources using urpmi kernel-source," replied Pugs. "But I have Ubuntu," Shweta said. "Okay! For that, just use the apt-get utility to fetch the source: possibly apt-get install linux-source," replied Pugs.

Summing up
The lab session was almost over when Shweta suddenly asked, out of curiosity, "Hey Pugs, what's the next topic we are going to learn in our Linux device drivers class?" "Hmm... most probably character drivers," threw back Pugs. With this information, Shweta hurriedly packed her bag and headed towards her room, to set up the kernel sources and try out the next driver on her own. "In case you get stuck, just give me a call," smiled Pugs.

Device Drivers, Part 4: Linux Character Drivers

This article, which is part of the series on Linux device drivers, deals with the various concepts related to character drivers and their implementation.

Shweta, at her PC in her hostel room, was all set to explore the characters of Linux character drivers, before they were taught in class. She recalled the following lines from Professor Gopi's class: "...today's first driver would be the template for any driver you write in Linux. Writing any specialised/advanced driver is just a matter of what gets filled into its constructor and destructor..." With that, she took out the first driver's code, and pulled out various reference books, to start writing a character driver on her own. She also downloaded the online book Linux Device Drivers by Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman. Here is a summary of what she learnt.

Figure 1: Character driver overview

W's of character drivers
We already know what drivers are, and why we need them. What is so special about character drivers? If we write drivers for byte-oriented operations (or, in C lingo, character-oriented operations), then we refer to them as character drivers. Since the majority of devices are byte-oriented, the majority of device drivers are character device drivers. Take, for example, serial drivers, audio drivers, video drivers, camera drivers, and basic I/O drivers. In fact, all device drivers that are neither storage nor network device drivers are some type of character driver. Let's look into the commonalities of these character drivers, and how Shweta wrote one of them.

The complete connection


As shown in Figure 1, for any user-space application to operate on a byte-oriented device (in hardware space), it should use the corresponding character device driver (in kernel space). Character driver usage is through the corresponding character device file(s), linked to it through the virtual file system (VFS). What this means is that an application does the usual file operations on the character device file. Those operations are translated to the corresponding functions in the linked character device driver by the VFS. Those functions then do the final low-level access to the actual device to achieve the desired results.

Note that though the application does the usual file operations, their outcome may not be the usual one. Rather, the outcomes would be as driven by the corresponding functions in the device driver. For example, a write followed by a read may not fetch what has just been written to the character device file, unlike for regular files. Remember that this is the usual expected behaviour for device files. Let's take an audio device file as an example. What we write into it is the audio data we want to play back, say through a speaker. However, a read would get us the audio data being recorded, say through a microphone. The recorded data need not be the played-back data.

In this complete connection from the application to the device, there are four major entities involved:

1. Application
2. Character device file
3. Character device driver
4. Character device

The interesting thing is that all of these can exist independently on a system, without the others being present. The mere existence of these on a system doesn't mean they are linked to form the complete connection. Rather, they need to be explicitly connected. An application gets connected to a device file by invoking the open system call on the device file. Device file(s) are linked to the device driver by specific registrations done by the driver. The driver is linked to a device by its device-specific low-level operations. Thus we form the complete connection. With this, note that the character device file is not the actual device, but just a place-holder for the actual device.

Major and minor numbers


The connection between the application and the device file is based on the name of the device file. However, the connection between the device file and the device driver is based on the number of the device file, not the name. This allows a user-space application to have any name for the device file, and enables the kernel-space to have a trivial index-based linkage between the device file and the device driver. This device file number is more commonly referred to as the <major, minor> pair, or the major and minor numbers of the device file.

Earlier (till kernel 2.4), one major number was for one driver, and the minor number used to represent the sub-functionalities of the driver. With kernel 2.6, this distinction is no longer mandatory; there could be multiple drivers under the same major number, but obviously with different minor-number ranges. However, this is more common with the non-reserved major numbers, and the standard major numbers are typically preserved for single drivers. For example, 4 is for serial interfaces, 13 for mice, 14 for audio devices, and so on. The following command would list the various character device files on your system:

$ ls -l /dev/ | grep "^c"

<major, minor> related support in kernel 2.6


Type (defined in the kernel header linux/types.h):

dev_t - contains both the major and minor numbers

Macros (defined in the kernel header linux/kdev_t.h):

MAJOR(dev_t dev) - extracts the major number from dev
MINOR(dev_t dev) - extracts the minor number from dev
MKDEV(int major, int minor) - creates the dev from major and minor

Connecting the device file with the device driver involves two steps:

1. Registering for the <major, minor> range of device files.
2. Linking the device file operations to the device driver functions.

The first step is achieved using either of the following two APIs, defined in the kernel header linux/fs.h:

int register_chrdev_region(dev_t first, unsigned int cnt, char *name);
int alloc_chrdev_region(dev_t *first, unsigned int firstminor, unsigned int cnt, char *name);

The first API registers cnt device file numbers, starting from first, with the given name. The second API dynamically figures out a free major number, and registers cnt device file numbers starting from <the free major, firstminor>, with the given name. In either case, the /proc/devices kernel window lists the name alongside the registered major number.

With this information, Shweta added the following to the first driver's code:

#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>

static dev_t first; // Global variable for the first device number

In the constructor, she added:

if (alloc_chrdev_region(&first, 0, 3, "Shweta") < 0)
{
    return -1;
}
printk(KERN_INFO "<Major, Minor>: <%d, %d>\n", MAJOR(first), MINOR(first));

In the destructor, she added:

unregister_chrdev_region(first, 3);

It's all put together, as follows:

#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>

static dev_t first; // Global variable for the first device number

static int __init ofcd_init(void) /* Constructor */
{
    printk(KERN_INFO "Namaskar: ofcd registered");
    if (alloc_chrdev_region(&first, 0, 3, "Shweta") < 0)
    {
        return -1;
    }
    printk(KERN_INFO "<Major, Minor>: <%d, %d>\n", MAJOR(first), MINOR(first));
    return 0;
}

static void __exit ofcd_exit(void) /* Destructor */
{
    unregister_chrdev_region(first, 3);
    printk(KERN_INFO "Alvida: ofcd unregistered");
}

module_init(ofcd_init);
module_exit(ofcd_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Our First Character Driver");

Then, Shweta repeated the usual steps that she'd learnt for the first driver:

1. Build the driver (the .ko file) by running make.
2. Load the driver using insmod.
3. List the loaded modules using lsmod.
4. Unload the driver using rmmod.

Summing up
Additionally, before unloading the driver, she peeped into the /proc/devices kernel window to look for the registered major number with the name Shweta, using cat /proc/devices. It was right there. However, she couldn't find any device files created under /dev with the same major number, so she created them by hand using mknod, and then tried reading from and writing to those. Figure 2 shows all these steps.

Figure 2: Character device file experiments

Please note that the major number 250 may vary from system to system, based on availability. Figure 2 also shows the results Shweta got from reading and writing one of the device files. That reminded her that the second step to connect the device file with the device driver, which is linking the device file operations to the device driver functions, was not yet done. She realised that she needed to dig around for more information to complete this step, and also to figure out the reason for the missing device files under /dev. We will deal with her further learning in the next article.

Device Drivers, Part 5: Character Device Files Creation & Operations

This article is a continuation of the series on Linux device drivers, and carries on the discussion on character drivers and their implementation. In my previous article, I had mentioned that even with the registration for the <major, minor> device range, the device files were not created under /dev; instead, Shweta had to create them manually, using mknod. However, on further study, Shweta figured out a way to automatically create the device files, using the udev daemon. She also learnt the second step needed to connect the device file with the device driver: linking the device file operations to the device driver functions. Here is what she learnt.

Automatic creation of device files


Earlier, in kernel 2.4, the automatic creation of device files was done by the kernel itself, by calling the appropriate APIs of devfs. However, as the kernel evolved, kernel developers realised that device files were more related to user-space, and hence, as a policy, that is where they ought to be dealt with, not in the kernel. Based on this idea, the kernel now only populates the appropriate device class and device information into the /sys window for the device under consideration. User-space then needs to interpret it and take the appropriate action. In most Linux desktop systems, the udev daemon picks up that information, and accordingly creates the device files. udev can be further configured, via its configuration files, to tune the device file names, their permissions, their types, etc.

So, as far as the driver is concerned, the appropriate /sys entries need to be populated using the Linux device model APIs declared in <linux/device.h>, and the rest should be handled by udev. The device class is created as follows:

struct class *cl = class_create(THIS_MODULE, "<device class name>");

Then, the device info (<major, minor>) under this class is populated by:

device_create(cl, NULL, first, NULL, "<device name format>", ...);

Here, first is the dev_t with the corresponding <major, minor>. The corresponding complementary or inverse calls, which should be made in chronologically reverse order, are as follows:

device_destroy(cl, first);
class_destroy(cl);

Refer to Figure 1 for the /sys entries created using chardrv as the <device class name> and mynull as the <device name format>. It also shows the device file created by udev, based on the <major>:<minor> entry in the dev file.

Figure 1: Automatic device file creation

In the case of multiple minors, the device_create() and device_destroy() APIs may be put in a for loop, and the <device name format> string could be useful. For example, the device_create() call in a for loop indexed by i could be as follows:

device_create(cl, NULL, MKDEV(MAJOR(first), MINOR(first) + i), NULL, "mynull%d", i);

File operations
Whatever system calls (or, more commonly, file operations) we talk of for a regular file are applicable to device files as well. That's what we say: a file is a file, and in Linux, almost everything is a file from the user-space perspective. The difference lies in the kernel space, where the virtual file system (VFS) decodes the file type and transfers the file operations to the appropriate channel: a filesystem module in the case of a regular file or directory, and the corresponding device driver in the case of a device file. Our discussion focuses on the second case.

Now, for the VFS to pass the device file operations on to the driver, it should have been informed about them. And yes, that is what is called registering the file operations by the driver with the VFS. This involves two steps. (The parenthesised code refers to the null driver code below.) First, let's fill in a file operations structure (struct file_operations pugs_fops) with the desired file operations (my_open, my_close, my_read, my_write, ...) and initialise the character device structure (struct cdev c_dev) with that, using cdev_init(). Then, hand this structure to the VFS using the call cdev_add(). Both cdev_init() and cdev_add() are declared in <linux/cdev.h>. Obviously, the actual file operations (my_open, my_close, my_read, my_write) also have to be coded. So, to start with, let's keep them as simple as possible; say, as easy as the null driver.

The null driver


Following these steps, Shweta put the pieces together, attempting her first character device driver. Let's see what the outcome was. Here's the complete code of ofcd.c:

#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <linux/cdev.h>

static dev_t first; // Global variable for the first device number
static struct cdev c_dev; // Global variable for the character device structure
static struct class *cl; // Global variable for the device class

static int my_open(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: open()\n");
    return 0;
}
static int my_close(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: close()\n");
    return 0;
}
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    return 0;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    return len;
}

static struct file_operations pugs_fops =
{
    .owner = THIS_MODULE,
    .open = my_open,
    .release = my_close,
    .read = my_read,
    .write = my_write
};

static int __init ofcd_init(void) /* Constructor */
{
    printk(KERN_INFO "Namaskar: ofcd registered");
    if (alloc_chrdev_region(&first, 0, 1, "Shweta") < 0)
    {
        return -1;
    }
    if ((cl = class_create(THIS_MODULE, "chardrv")) == NULL)
    {
        unregister_chrdev_region(first, 1);
        return -1;
    }
    if (device_create(cl, NULL, first, NULL, "mynull") == NULL)
    {
        class_destroy(cl);
        unregister_chrdev_region(first, 1);
        return -1;
    }
    cdev_init(&c_dev, &pugs_fops);
    if (cdev_add(&c_dev, first, 1) == -1)
    {
        device_destroy(cl, first);
        class_destroy(cl);
        unregister_chrdev_region(first, 1);
        return -1;
    }
    return 0;
}

static void __exit ofcd_exit(void) /* Destructor */
{
    cdev_del(&c_dev);
    device_destroy(cl, first);
    class_destroy(cl);
    unregister_chrdev_region(first, 1);
    printk(KERN_INFO "Alvida: ofcd unregistered");
}

module_init(ofcd_init);
module_exit(ofcd_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Our First Character Driver");

Shweta repeated the usual build process, with some new test steps, as follows:

1. Build the driver (.ko file) by running make.
2. Load the driver using insmod.
3. List the loaded modules using lsmod.
4. List the major number allocated, using cat /proc/devices.
5. null driver-specific experiments (refer to Figure 2 for details).
6. Unload the driver using rmmod.

Figure 2: 'null driver' experiments

Summing up
Shweta was certainly happy; all on her own, she'd got a character driver written, which works the same as the standard /dev/null device file. To understand what this means, check the <major, minor> tuple for /dev/null, and similarly, also try out the echo and cat commands with it. However, one thing began to bother Shweta. She had got her own calls (my_open, my_close, my_read, my_write) in her driver, but wondered why they worked so unusually, unlike the system calls on any regular file. What was unusual? Whatever was written, she got nothing when reading: unusual, at least from the regular file operations perspective. How would she crack this problem? Watch out for the next article.

Device Drivers, Part 6: Decoding Character Device File Operations

This article, which is part of the series on Linux device drivers, continues to cover the various concepts of character drivers and their implementation, which were dealt with in the previous two articles [1, 2]. So, what was your guess on how Shweta would crack the problem? Obviously, with the help of Pugs. Wasn't it obvious? In our previous article, we saw how Shweta was puzzled by not being able to read any data, even after writing into the /dev/mynull character device file. Suddenly, a bell rang: not inside her head, but a real one at the door. And for sure, there was Pugs. "How come you're here?" exclaimed Shweta. "I saw your tweet. It's cool that you cracked your first character driver all on your own. That's amazing. So, what are you up to now?" asked Pugs. "I'll tell you, on the condition that you do not play spoilsport," replied Shweta. Pugs smiled, "Okay, I'll only give you advice." "And that too, only if I ask for it!" "I am trying to understand character device file operations," said Shweta. Pugs perked up, saying, "I have an idea. Why don't you decode and then explain what you've understood about it?" Shweta felt that was a good idea. She tailed the dmesg log to observe the printk output from her driver. Alongside, she opened her null driver code on her console, specifically observing the device file operations my_open, my_close, my_read, and my_write.

static int my_open(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: open()\n");
    return 0;
}
static int my_close(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: close()\n");
    return 0;
}

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    return 0;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    return len;
}

Based on the earlier understanding of the return value of functions in the kernel, my_open() and my_close() are trivial: their return types are int, and both return zero, meaning success. However, the return type of both my_read() and my_write() is not int; rather, it is ssize_t. On further digging through the kernel headers, that turns out to be a signed integer type. So, returning a negative number would be the usual error. But a non-negative return value has an additional meaning: for the read operation, it is the number of bytes read, and for the write operation, the number of bytes written.

Reading the device file


To understand this in detail, the complete flow has to be given a relook. Let's take the read operation first. When the user does a read from the device file /dev/mynull, that system call comes to the virtual file system (VFS) layer in the kernel. The VFS decodes the <major, minor> tuple, and figures out that it needs to redirect the call to the driver's function my_read(), that's registered with it. So from that angle, my_read() is invoked as a request to read, from us, the device-driver writers. And hence, its return value indicates to the requester (i.e., the user) how many bytes they are getting from the read request. In our null driver example, we returned zero, which means no bytes available, or in other words, the end of the file. And hence, when the device file is being read, the result is always nothing, independent of what is written into it. "Hmmm... So, if I change it to 1, would it start giving me some data?" asked Pugs, by way of verifying. Shweta paused for a while, looked at the parameters of the function my_read(), and answered in the affirmative, but with a caveat: the data sent would be some junk data, since my_read() is not really populating data into buf (the buffer variable that is the second parameter of my_read(), provided by the user). In fact, my_read() should write data into buf, according to len (the third parameter to the function), the count in bytes requested by the user. To be more specific, it should write less than, or equal to, len bytes of data into buf, and the number of bytes written should be passed back as the return value. No, this is not a typo: in the read operation, device-driver writers write into the user-supplied buffer. We read the data from (possibly) an underlying device, and then write that data into the user buffer, so that the user can read it. "That's really smart of you," said Pugs, sarcastically.

Writing into the device file


The write operation is the reverse. The user provides len (the third parameter of my_write()) bytes of data to be written, in buf (the second parameter of my_write()). The my_write() function would read that data, possibly write it to an underlying device, and return the number of bytes that have been successfully written. "Aha!! That's why all my writes into /dev/mynull have been successful, without actually doing any read or write," exclaimed Shweta, filled with happiness at understanding the complete flow of device file operations.

Preserving the last character


With Shweta not giving Pugs any chance to correct her, he came up with a challenge. "Okay. Seems like you are thoroughly clear with the read/write fundamentals; so, here's a question for you. Can you modify these my_read() and my_write() functions such that whenever I read /dev/mynull, I get the last character written into /dev/mynull?" Confidently, Shweta took on the challenge, and modified my_read() and my_write() to preserve the last written byte in a static global character variable (static char c), accessing the user-space buffer buf directly. "Almost there, but what if the user has provided an invalid buffer, or if the user buffer is swapped out? Wouldn't this direct access of the user-space buf just crash and oops the kernel?" pounced Pugs. Shweta, refusing to be intimidated, dived into her collated material and figured out that there are two APIs just to ensure that user-space buffers are safe to access, and then updated the functions. With the complete understanding of the APIs, she rewrote the code snippet as follows:

static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    if (copy_to_user(buf, &c, 1) != 0)
        return -EFAULT;
    else
        return 1;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    if (copy_from_user(&c, buf + len - 1, 1) != 0)
        return -EFAULT;
    else
        return len;
}

Then Shweta repeated the usual build-and-test steps as follows:

1. Build the modified null driver (.ko file) by running make.
2. Load the driver using insmod.
3. Write into /dev/mynull, say, using echo -n "Pugs" > /dev/mynull.
4. Read from /dev/mynull using cat /dev/mynull (stop by using Ctrl+C).
5. Unload the driver using rmmod.

On cat'ing /dev/mynull, the output was a non-stop infinite sequence of the character 's', as my_read() keeps giving out the last written character forever. So, Pugs intervened and pressed Ctrl+C to stop the infinite read, and tried to explain, "If this is to be changed to give the last character only once, my_read() needs to return 1 the first time, and zero from the second time onwards. This can be achieved using off (the fourth parameter of my_read())." Shweta nodded her head obligingly, just to bolster Pugs' ego.

Device Drivers, Part 7: Generic Hardware Access in Linux


This article, which is part of the series on Linux device drivers, talks about accessing hardware in Linux.

Shweta was all jubilant about her character driver achievements as she entered the Linux device drivers laboratory on the second floor of her college. Many of her classmates had already read her blog and commented on her expertise. And today was a chance to show off at another level. Till now, it was all software; but today's lab was on accessing hardware in Linux. In the lab, students are expected to learn by experiment how to access different kinds of hardware in Linux, on various architectures, over multiple lab sessions. Members of the lab staff are usually reluctant to let students work on the hardware straight away without any experience, so they had prepared some presentations for the students.

Generic hardware interfacing


As everyone settled down in the laboratory, lab expert Priti started with an introduction to hardware interfacing in Linux. Skipping the theoretical details, the first interesting slide was about generic architecture-transparent hardware interfacing (see Figure 1).

Figure 1: Hardware mapping

The basic assumption is that the architecture is 32-bit. For others, the memory map would change accordingly. For a 32-bit address bus, the address/memory map ranges from 0 (0x00000000) to 2^32 - 1 (0xFFFFFFFF). An architecture-independent layout of this memory map would be like what's shown in Figure 1: memory (RAM) and device regions (registers and memories of devices) mapped in an interleaved fashion. The actual addresses are architecture-dependent. For example, in the x86 architecture, the initial 3GB (0x00000000 to 0xBFFFFFFF) is typically for RAM, and the later 1GB (0xC0000000 to 0xFFFFFFFF) for device maps. However, if the RAM is less, say 2GB, device maps could start from 2GB (0x80000000). Run cat /proc/iomem to list the memory map on your system. Run cat /proc/meminfo to get the approximate RAM size on your system. Refer to Figure 2 for a snapshot.

Figure 2: Physical and bus addresses on an x86 system

Irrespective of the actual values, the addresses referring to RAM are termed physical addresses, and those referring to device maps, bus addresses, since these devices are always mapped through some architecture-specific bus: for example, the PCI bus in the x86 architecture, the AMBA bus in ARM architectures, the SuperHyway bus in SuperH architectures, etc. All the architecture-dependent values of these physical and bus addresses are either dynamically configurable, or are to be obtained from the data sheets (i.e., hardware manuals) of the corresponding architecture processors/controllers. The interesting part is that in Linux, none of these are directly accessible; they are to be mapped to virtual addresses and then accessed through those, thus making the RAM and device accesses generic enough. The corresponding APIs (prototyped in <asm/io.h>) for mapping and unmapping the device bus addresses to virtual addresses are:

void *ioremap(unsigned long device_bus_address, unsigned long device_region_size);
void iounmap(void *virt_addr);

Once mapped to virtual addresses, it depends on the device data sheet as to which set of device registers and/or device memory to read from or write into, by adding their offsets to the virtual address returned by ioremap(). For that, the following are the APIs (also prototyped in <asm/io.h>):

unsigned int ioread8(void *virt_addr);
unsigned int ioread16(void *virt_addr);
unsigned int ioread32(void *virt_addr);
void iowrite8(u8 value, void *virt_addr);
void iowrite16(u16 value, void *virt_addr);
void iowrite32(u32 value, void *virt_addr);

Accessing the video RAM of DOS days


After this first set of information, students were directed to the live experiments. The suggested initial experiment was with the video RAM of DOS days, to understand the usage of the above APIs. Shweta got onto the system, went through /proc/iomem (as in Figure 2), and got the video RAM address, ranging from 0x000A0000 to 0x000BFFFF. She added the above APIs, with appropriate parameters, into the constructor and destructor of her existing null driver, to convert it into a vram driver. Then she added the user access to the video RAM through the read and write calls of the vram driver; here's her new file video_ram.c:

#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <linux/cdev.h>
#include <linux/uaccess.h>
#include <asm/io.h>

#define VRAM_BASE 0x000A0000
#define VRAM_SIZE 0x00020000

static void __iomem *vram;
static dev_t first;
static struct cdev c_dev;
static struct class *cl;

static int my_open(struct inode *i, struct file *f)
{
    return 0;
}
static int my_close(struct inode *i, struct file *f)
{
    return 0;
}
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    int i;
    u8 byte;

    if (*off >= VRAM_SIZE)
    {
        return 0;
    }
    if (*off + len > VRAM_SIZE)
    {
        len = VRAM_SIZE - *off;
    }
    for (i = 0; i < len; i++)
    {
        byte = ioread8((u8 *)vram + *off + i);
        if (copy_to_user(buf + i, &byte, 1))
        {
            return -EFAULT;
        }
    }
    *off += len;
    return len;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    int i;
    u8 byte;

    if (*off >= VRAM_SIZE)
    {
        return 0;
    }
    if (*off + len > VRAM_SIZE)
    {
        len = VRAM_SIZE - *off;
    }
    for (i = 0; i < len; i++)
    {
        if (copy_from_user(&byte, buf + i, 1))
        {
            return -EFAULT;
        }
        iowrite8(byte, (u8 *)vram + *off + i);
    }
    *off += len;
    return len;
}

static struct file_operations vram_fops =
{
    .owner = THIS_MODULE,
    .open = my_open,
    .release = my_close,
    .read = my_read,
    .write = my_write
};

static int __init vram_init(void) /* Constructor */
{
    if ((vram = ioremap(VRAM_BASE, VRAM_SIZE)) == NULL)
    {
        printk(KERN_ERR "Mapping video RAM failed\n");
        return -1;
    }
    if (alloc_chrdev_region(&first, 0, 1, "vram") < 0)
    {
        return -1;
    }
    if ((cl = class_create(THIS_MODULE, "chardrv")) == NULL)
    {
        unregister_chrdev_region(first, 1);
        return -1;
    }
    if (device_create(cl, NULL, first, NULL, "vram") == NULL)
    {
        class_destroy(cl);
        unregister_chrdev_region(first, 1);
        return -1;
    }
    cdev_init(&c_dev, &vram_fops);
    if (cdev_add(&c_dev, first, 1) == -1)
    {
        device_destroy(cl, first);
        class_destroy(cl);
        unregister_chrdev_region(first, 1);
        return -1;
    }
    return 0;
}

static void __exit vram_exit(void) /* Destructor */
{
    cdev_del(&c_dev);
    device_destroy(cl, first);
    class_destroy(cl);
    unregister_chrdev_region(first, 1);
    iounmap(vram);
}

module_init(vram_init);
module_exit(vram_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Video RAM Driver");

Summing up
Shweta then repeated the usual steps:

1. Build the vram driver (video_ram.ko file) by running make with a changed Makefile.
2. Load the driver using insmod video_ram.ko.
3. Write into /dev/vram, say, using echo -n "0123456789" > /dev/vram.
4. Read the /dev/vram contents using od -t x1 -v /dev/vram | less. (The usual cat /dev/vram can also be used, but that would give all the binary content; od -t x1 shows it as hexadecimal. For more details, run man od.)
5. Unload the driver using rmmod video_ram.

With half an hour still left before the end of the practical class, Shweta decided to walk around and possibly help somebody else with their experiments.

Device Drivers, Part 8: Accessing x86-Specific I/O-Mapped Hardware

This article, which is part of the series on Linux device drivers, continues the discussion on accessing hardware in Linux. The second day in the Linux device drivers laboratory was expected to be quite different from the typical software-oriented class. Apart from accessing and programming architecture-specific I/O mapped hardware in x86, it had a lot to offer first-timers with regard to reading hardware device manuals (commonly called data sheets) and how to understand them to write device drivers. In contrast, the previous session about generic architecture-transparent hardware interfacing was about mapping and accessing memory-mapped devices in Linux without any device-specific details.

x86-specific hardware interfacing


Unlike most other architectures, x86 has an additional hardware accessing mechanism: direct I/O mapping. It is a direct 16-bit addressing scheme, and doesn't need mapping to a virtual address for access. These addresses are referred to as port addresses, or ports. Since this is an additional access mechanism, it has an additional set of x86 (assembly/machine code) instructions. And yes, there are the input instructions inb, inw, and inl for reading an 8-bit byte, a 16-bit word, and a 32-bit long word, respectively, from I/O mapped devices through ports. The corresponding output instructions are outb, outw and outl, respectively. The equivalent C functions/macros (available through the header <asm/io.h>) are as follows:

u8 inb(unsigned long port);
u16 inw(unsigned long port);
u32 inl(unsigned long port);
void outb(u8 value, unsigned long port);
void outw(u16 value, unsigned long port);
void outl(u32 value, unsigned long port);

The basic question that may arise relates to which devices are I/O mapped and what the port addresses of these devices are. The answer is pretty simple. As per the x86 standard, all these devices and their mappings are predefined. Figure 1 shows a snippet of these mappings through the kernel window /proc/ioports. The listing includes predefined DMA, the timer and the RTC, apart from serial, parallel and PCI bus interfaces, to name a few.

Figure 1: x86-specific I/O ports

Simplest: serial port on x86


For example, the first serial port is always I/O mapped from 0x3F8 to 0x3FF. But what does this mapping mean? What do we do with this? How does it help us to use the serial port? That is where a data sheet of the corresponding device needs to be looked up. A serial port is controlled by the serial controller device, commonly known as a UART (Universal Asynchronous Receiver/Transmitter), or at times a USART (Universal Synchronous/Asynchronous Receiver/Transmitter). On PCs, the typical UART used is the PC16550D. The data sheet for this [PDF] can be downloaded as part of the self-extracting package [BIN file] used for the Linux device driver kit, available at lddk.esrijan.com. Generally speaking, from where, and how, does one get these device data sheets? Typically, an online search with the corresponding device number should yield their data-sheet links. Then, how does one get the device number? Simple: by having a look at the device. If it is inside a desktop, open it up and check it out. Yes, this is the least you may have to do to get going with the hardware, in order to write device drivers. Assuming all this has been done, it is time to peep into the data sheet of the PC16550D UART.

Device driver writers need to understand the details of the registers of the device, as it is these registers that writers need to program, to use the device. Page 14 of the data sheet (also shown in Figure 2) shows the complete table of all the twelve 8-bit registers present in the UART PC16550D.

Figure 2: Registers of UART PC16550D

Each of the eight rows corresponds to the respective bit of the registers. Also, note that the register addresses start from 0 and go up to 7. The interesting thing is that a data sheet always gives the register offsets, which then need to be added to the base address of the device, to get the actual register addresses. Who decides the base address, and where is it obtained from? Base addresses are typically board/platform-specific, unless they are dynamically configurable, as in the case of PCI devices. In this case, i.e., a serial device on x86, it is dictated by the x86 architecture, and that precisely was the starting serial port address mentioned above: 0x3F8. Thus, the eight register offsets, 0 to 7, exactly map to the eight port addresses 0x3F8 to 0x3FF. So, these are the actual addresses to be read or written, for reading or writing the corresponding serial registers, to achieve the desired serial operations, as per the register descriptions. All the serial register offsets and the register bit masks are defined in the header <linux/serial_reg.h>. So, rather than hard-coding these values from the data sheet, the corresponding macros could be used instead. All the following code uses these macros, along with the following:

#define SERIAL_PORT_BASE 0x3F8

Operating on the device registers

To summarise the decoding of the PC16550D UART data sheet, here are a few examples of how to do read and write operations on the serial registers and their bits.

Reading and writing the Line Control Register (LCR):

u8 val;

val = inb(SERIAL_PORT_BASE + UART_LCR /* 3 */);
outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

Setting and clearing the Divisor Latch Access Bit (DLAB) in the LCR:

u8 val;

val = inb(SERIAL_PORT_BASE + UART_LCR /* 3 */);
/* Setting DLAB */
val |= UART_LCR_DLAB /* 0x80 */;
outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);
/* Clearing DLAB */
val &= ~UART_LCR_DLAB /* 0x80 */;
outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

Reading and writing the Divisor Latch:

u8 dlab;
u16 val;

dlab = inb(SERIAL_PORT_BASE + UART_LCR);
dlab |= UART_LCR_DLAB; // Setting DLAB to access the Divisor Latch
outb(dlab, SERIAL_PORT_BASE + UART_LCR);
val = inw(SERIAL_PORT_BASE + UART_DLL /* 0 */);
outw(val, SERIAL_PORT_BASE + UART_DLL /* 0 */);

Blinking an LED
To get a real experience of low-level hardware access and Linux device drivers, the best way would be to play with the Linux device driver kit (LDDK) mentioned above. However, just for a feel of low-level hardware access, a blinking LED may be tried, as follows: Connect a light-emitting diode (LED) with a 330-ohm resistor in series across Pin 3 (Tx) and Pin 5 (Gnd) of the DB9 connector of your PC. Pull the transmit (Tx) line up and down with a 500 ms delay, by loading and unloading the blink_led driver, using insmod blink_led.ko and rmmod blink_led, respectively. The driver file blink_led.ko can be created from its source file blink_led.c by running make with the usual driver Makefile. Given below is the complete blink_led.c:

#include <linux/module.h>
#include <linux/version.h>
#include <linux/types.h>
#include <linux/delay.h>
#include <asm/io.h>
#include <linux/serial_reg.h>

#define SERIAL_PORT_BASE 0x3F8

int __init init_module()
{
    int i;
    u8 data;

    data = inb(SERIAL_PORT_BASE + UART_LCR);
    for (i = 0; i < 5; i++)
    {
        /* Pulling the Tx line low */
        data |= UART_LCR_SBC;
        outb(data, SERIAL_PORT_BASE + UART_LCR);
        msleep(500);
        /* Defaulting the Tx line high */
        data &= ~UART_LCR_SBC;
        outb(data, SERIAL_PORT_BASE + UART_LCR);
        msleep(500);
    }
    return 0;
}

void __exit cleanup_module()
{
}

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Blinking LED Hack");

Looking ahead
You might have wondered why Shweta is missing from this article. She bunked all the classes! Watch out for the next article to find out why.

Device Drivers, Part 9: I/O Control in Linux

This article, which is part of the series on Linux device drivers, talks about the typical ioctl() implementation and usage in Linux. "Get me a laptop, and tell me about the x86 hardware interfacing experiments in the last Linux device drivers lab session, and also about what's planned for the next session," cried Shweta, exasperated at being confined to bed due to food poisoning at a friend's party. Shweta's friends summarised the session, and told her that they didn't know what the upcoming sessions, though related to hardware, would be about. When the doctor requested them to leave, they took the opportunity to plan and talk about the most common hardware-controlling operation: ioctl().

Introducing ioctl()
Input/Output Control (ioctl, in short) is a common operation, or system call, available in most driver categories. It is a one-bill-fits-all kind of system call. If there is no other system call that meets a particular requirement, then ioctl() is the one to use. Practical examples include volume control for an audio device, display configuration for a video device, reading device registers, and so on: basically, anything to do with device input/output, or device-specific operations, yet versatile enough for any kind of operation (for example, debugging a driver by querying driver data structures). The question is: how can all this be achieved by a single function prototype? The trick lies in its two key parameters: the command and the argument. The command is a number representing an operation. The argument is the corresponding parameter for the operation. The ioctl() implementation does a switch-case over the command to implement the corresponding functionality. The following has been its prototype in the Linux kernel for quite some time:

int ioctl(struct inode *i, struct file *f, unsigned int cmd, unsigned long arg);

However, from kernel 2.6.35, it changed to:

long ioctl(struct file *f, unsigned int cmd, unsigned long arg);

If there is a need for more arguments, all of them are put in a structure, and a pointer to the structure becomes the one command argument. Whether integer or pointer, the argument is taken as a long integer in kernel-space, and accordingly type-cast and processed. ioctl() is typically implemented as part of the corresponding driver, and then an appropriate function pointer is initialised with it, exactly as in other system calls like open(), read(), etc. For example, in character drivers, it is the ioctl or unlocked_ioctl (since kernel 2.6.35) function pointer field in the struct file_operations that is to be initialised. Again, like other system calls, it can be equivalently invoked from user-space using the ioctl() system call, prototyped in <sys/ioctl.h> as:

int ioctl(int fd, int cmd, ...);

Here, cmd is the same as what is implemented in the driver's ioctl(), and the variable argument construct (...) is a hack to be able to pass any type of argument (though only one) to the driver's ioctl(). Other parameters will be ignored. Note that both the command and command argument type definitions need to be shared across the driver (in kernel-space) and the application (in user-space). Thus, these definitions are commonly put into header files for each space.

Querying driver-internal variables


To better understand the boring theory explained above, here's the code set for the "debugging a driver" example mentioned earlier. This driver has three static global variables: status, dignity, and ego, which need to be queried and possibly operated from an application. The header file query_ioctl.h defines the corresponding commands and command argument type. A listing follows:

#ifndef QUERY_IOCTL_H
#define QUERY_IOCTL_H

#include <linux/ioctl.h>

typedef struct
{
    int status, dignity, ego;
} query_arg_t;

#define QUERY_GET_VARIABLES _IOR('q', 1, query_arg_t *)
#define QUERY_CLR_VARIABLES _IO('q', 2)
#define QUERY_SET_VARIABLES _IOW('q', 3, query_arg_t *)

#endif

Using these, the drivers ioctl() implementation in query_ioctl.c would be as follows: #include <linux/module.h> #include <linux/kernel.h> #include <linux/version.h> #include <linux/fs.h> #include <linux/cdev.h> #include <linux/device.h> #include <linux/errno.h> #include <asm/uaccess.h> #include "query_ioctl.h" #define FIRST_MINOR 0 #define MINOR_CNT 1 static dev_t dev; static struct cdev c_dev; static struct class *cl; static int status = 1, dignity = 3, ego = 5; static int my_open(struct inode *i, struct file *f) { return 0; } static int my_close(struct inode *i, struct file *f) { return 0; } #if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35)) static int my_ioctl(struct inode *i, struct file *f, unsigned int cmd, unsigned long arg) #else static long my_ioctl(struct file *f, unsigned int cmd, unsigned long arg) #endif { query_arg_t q; switch (cmd) { case QUERY_GET_VARIABLES: q.status = status; q.dignity = dignity; q.ego = ego; if (copy_to_user((query_arg_t *)arg, &q, sizeof(query_arg_t))) { return -EACCES; } break; case QUERY_CLR_VARIABLES: status = 0; dignity = 0; ego = 0; break; case QUERY_SET_VARIABLES:

if (copy_from_user(&q, (query_arg_t *)arg, sizeof(query_arg_t))) { return -EACCES; } status = q.status; dignity = q.dignity; ego = q.ego; break; default: return -EINVAL; } return 0; } static struct file_operations query_fops = { .owner = THIS_MODULE, .open = my_open, .release = my_close, #if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35)) .ioctl = my_ioctl #else .unlocked_ioctl = my_ioctl #endif }; static int __init query_ioctl_init(void) { int ret; struct device *dev_ret; if ((ret = alloc_chrdev_region(&dev, FIRST_MINOR, MINOR_CNT, "query_ioctl")) < 0) { return ret; } cdev_init(&c_dev, &query_fops); if ((ret = cdev_add(&c_dev, dev, MINOR_CNT)) < 0) { return ret; } if (IS_ERR(cl = class_create(THIS_MODULE, "char"))) { cdev_del(&c_dev); unregister_chrdev_region(dev, MINOR_CNT); return PTR_ERR(cl); } if (IS_ERR(dev_ret = device_create(cl, NULL, dev, NULL, "query"))) { class_destroy(cl); cdev_del(&c_dev);

        unregister_chrdev_region(dev, MINOR_CNT);
        return PTR_ERR(dev_ret);
    }
    return 0;
}

static void __exit query_ioctl_exit(void)
{
    device_destroy(cl, dev);
    class_destroy(cl);
    cdev_del(&c_dev);
    unregister_chrdev_region(dev, MINOR_CNT);
}

module_init(query_ioctl_init);
module_exit(query_ioctl_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Query ioctl() Char Driver");

And finally, the corresponding invocation functions from the application query_app.c would be as follows:

#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <sys/ioctl.h>
#include "query_ioctl.h"

void get_vars(int fd)
{
    query_arg_t q;

    if (ioctl(fd, QUERY_GET_VARIABLES, &q) == -1)
    {
        perror("query_apps ioctl get");
    }
    else
    {
        printf("Status : %d\n", q.status);
        printf("Dignity: %d\n", q.dignity);
        printf("Ego : %d\n", q.ego);
    }
}
void clr_vars(int fd)
{
    if (ioctl(fd, QUERY_CLR_VARIABLES) == -1)
    {
        perror("query_apps ioctl clr");

    }
}
void set_vars(int fd)
{
    int v;
    query_arg_t q;

    printf("Enter Status: ");
    scanf("%d", &v);
    getchar();
    q.status = v;
    printf("Enter Dignity: ");
    scanf("%d", &v);
    getchar();
    q.dignity = v;
    printf("Enter Ego: ");
    scanf("%d", &v);
    getchar();
    q.ego = v;
    if (ioctl(fd, QUERY_SET_VARIABLES, &q) == -1)
    {
        perror("query_apps ioctl set");
    }
}

int main(int argc, char *argv[])
{
    char *file_name = "/dev/query";
    int fd;
    enum
    {
        e_get,
        e_clr,
        e_set
    } option;

    if (argc == 1)
    {
        option = e_get;
    }
    else if (argc == 2)
    {
        if (strcmp(argv[1], "-g") == 0)
        {
            option = e_get;
        }
        else if (strcmp(argv[1], "-c") == 0)
        {
            option = e_clr;
        }
        else if (strcmp(argv[1], "-s") == 0)
        {
            option = e_set;

        }
        else
        {
            fprintf(stderr, "Usage: %s [-g | -c | -s]\n", argv[0]);
            return 1;
        }
    }
    else
    {
        fprintf(stderr, "Usage: %s [-g | -c | -s]\n", argv[0]);
        return 1;
    }
    fd = open(file_name, O_RDWR);
    if (fd == -1)
    {
        perror("query_apps open");
        return 2;
    }

    switch (option)
    {
        case e_get:
            get_vars(fd);
            break;
        case e_clr:
            clr_vars(fd);
            break;
        case e_set:
            set_vars(fd);
            break;
        default:
            break;
    }

    close(fd);

    return 0;
}

Now try out query_app.c and query_ioctl.c with the following operations:

Build the query_ioctl driver (query_ioctl.ko file) and the application (query_app file) by running make, using the following Makefile:

# If called directly from the command line, invoke the kernel build system.
ifeq ($(KERNELRELEASE),)

KERNEL_SOURCE := /usr/src/linux
PWD := $(shell pwd)

default: module query_app

module:
	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) modules

clean:

	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) clean
	${RM} query_app

# Otherwise KERNELRELEASE is defined; we've been invoked from the
# kernel build system and can use its language.
else

obj-m := query_ioctl.o

endif

Load the driver using insmod query_ioctl.ko.

With appropriate privileges and command-line arguments, run the application query_app:

./query_app to display the driver variables
./query_app -c to clear the driver variables
./query_app -g to display the driver variables
./query_app -s to set the driver variables (not mentioned above)

Unload the driver using rmmod query_ioctl.

Defining the ioctl() commands


"Visiting time is over," yelled the security guard. Shweta thanked her friends since she could understand most of the code now, including the need for copy_to_user(), as learnt earlier. But she wondered about _IOR, _IO, etc., which were used in defining commands in query_ioctl.h. These are usual numbers only, as mentioned earlier for an ioctl() command. Just that, now additionally, some useful command related information is also encoded as part of these numbers using various macros, as per the POSIX standard for ioctl. The standard talks about the 32-bit command numbers, formed of four components embedded into the [31:0] bits: 1. The direction of command operation [bits 31:30] -- read, write, both, or none -- filled by the corresponding macro (_IOR, _IOW, _IOWR, _IO). 2. The size of the command argument [bits 29:16] -- computed using sizeof() with the command argument's type -- the third argument to these macros. 3. The 8-bit magic number [bits 15:8] -- to render the commands unique enough -- typically an ASCII character (the first argument to these macros). 4. The original command number [bits 7:0] -- the actual command number (1, 2, 3, ...), defined as per our requirement -- the second argument to these macros. Check out the header <asm-generic/ioctl.h> for implementation details.

Device Drivers, Part 10: Kernel-Space Debuggers in Linux


This article, which is part of the series on Linux device drivers, talks about kernel-space debugging in Linux.

Shweta, back from hospital, was relaxing in the library, reading various books. Ever since she learned of the ioctl way of debugging, she was impatient to find out more about debugging in kernel-space. She was curious about how and where to run the kernel-space debugger, if there was any. This was in contrast with application/user-space debugging, where we have the OS running underneath, and a shell or a GUI over it to run the debugger (like gdb, and the data display debugger, ddd). Then she came across this interesting kernel-space debugging mechanism using kgdb, provided as part of the kernel itself, since kernel 2.6.26.

The debugger challenge in kernel-space


As we need some interface to be up to run a debugger to debug anything, a kernel debugger could be visualised in two possible ways:

1. Put the debugger into the kernel itself, accessible via the usual console. For example, in the case of kdb, which was not official until kernel 2.6.35, one had to download the source code (two sets of patches: one architecture-dependent, one architecture-independent) from this FTP address, and then patch these into the kernel source. However, since kernel 2.6.35, the majority of it is in the officially released kernel source. In either case, kdb support needs to be enabled in the kernel source, and the kernel then compiled, installed and booted with. The boot screen itself would give the kdb debugging interface.

2. Put a minimal debugging server into the kernel; a client would connect to it from a remote host or local user-space over some interface (say, serial or Ethernet). This is kgdb, the kernel's gdb server, to be used with gdb as its client. Since kernel 2.6.26, its serial interface is part of the official kernel release. However, if you're interested in a network interface, you still need to patch with one of the releases from the kgdb project page. In either case, you need to enable kgdb support in the kernel, recompile, install and boot the new kernel.

Please note that in both the above cases, the complete kernel source for the kernel to be debugged is needed, unlike for building modules, where just the headers are sufficient. Here is how to play around with kgdb over the serial interface.

Setting up the Linux kernel with kgdb


Here are the prerequisites: Either the kernel source package for the running kernel should be installed on your system, or a corresponding kernel source release should have been downloaded from kernel.org.

First of all, the kernel to be debugged needs to have kgdb enabled and built into it. To achieve that, the kernel source has to be configured with CONFIG_KGDB=y. Additionally, for kgdb over serial, CONFIG_KGDB_SERIAL_CONSOLE=y needs to be configured. CONFIG_DEBUG_INFO is preferred, for symbolic data to be built into the kernel, to make debugging with gdb more meaningful. CONFIG_FRAME_POINTER=y enables frame pointers in the kernel, allowing gdb to construct more accurate stack back-traces. All these options are available under "Kernel hacking" in the menu obtained in the kernel source directory (preferably as root, or using sudo), by issuing the following commands:

$ make mrproper   # To clean up properly
$ make oldconfig  # Configure the kernel same as the current running one
$ make menuconfig # Start the ncurses-based menu for further configuration

Figure 1: Configuring kernel options for kgdb

See the highlighted selections in Figure 1, for how and where these options would be:

KGDB: kernel debugging with remote gdb > CONFIG_KGDB
KGDB: use kgdb over the serial console > CONFIG_KGDB_SERIAL_CONSOLE
Compile the kernel with debug info > CONFIG_DEBUG_INFO
Compile the kernel with frame pointers > CONFIG_FRAME_POINTER

Once the configuration is saved, build the kernel (run make), and then run make install to install it, along with adding an entry for the installed kernel in the GRUB configuration file. Depending on the distribution, the GRUB configuration file may be /boot/grub/menu.lst, /etc/grub.cfg, or something similar. Once installed, the kgdb-related kernel boot parameters need to be added to this new entry, as shown in the highlighted text in Figure 2.

Figure 2: GRUB configuration for kgdb

kgdboc is for gdb connecting over the console, and the basic format is:

kgdboc=<serial_device>,<baud-rate>

where:

<serial_device> is the serial device file (port) on the system running the kernel to be debugged
<baud-rate> is the baud rate of this serial port

kgdbwait tells the kernel to delay booting till a gdb client connects to it; this parameter should be given only after kgdboc. With this, we're ready to begin. Make a copy of the vmlinux kernel image for use on the gdb client system. Reboot, and at the GRUB menu, choose the new kernel; it will then wait for gdb to connect over the serial port. All the above snapshots are with kernel version 2.6.33.14. The same should work for any 2.6.3x release of the kernel source. Also, the snapshots for kgdb are captured over the serial device file /dev/ttyS0, i.e., the first serial port.

Setting up gdb on another system


Following are the prerequisites: The serial ports of the system to be debugged and of the other system that runs gdb should be connected using a null modem (i.e., a cross-over serial) cable. The vmlinux kernel image built with kgdb enabled needs to be copied from the system to be debugged, into the working directory on the system where gdb is going to be run. To get gdb to connect to the waiting kernel, launch gdb from the shell and run these commands:

(gdb) file vmlinux
(gdb) set remote interrupt-sequence Ctrl-C
(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyS0
(gdb) continue

In the above commands, vmlinux is the kernel image copied from the system to be debugged.

Debugging using gdb with kgdb


After this, it is all like debugging an application from gdb. One may stop execution using Ctrl+C, add break points using b[reak], and step through execution using s[tep] or n[ext], the usual gdb way. There are enough GDB tutorials available online, if you need them. In fact, if you are not comfortable with text-based GDB, use any of the standard GUI tools over gdb, like ddd, Eclipse, etc.

Summing up
By now, Shweta was eager to try out kgdb. Since she needed two systems to try it out, she went to the Linux device drivers lab. There, she set up the systems and ran gdb as described above.

Device Drivers, Part 11: USB Drivers in Linux


This article, which is part of the series on Linux device drivers, gets you started with writing your first USB driver in Linux.

Pugs' pen drive was the device Shweta was playing with, when both of them sat down to explore the world of USB drivers in Linux. The fastest way to get the hang of it, and Pugs' usual way, was to pick up a USB device, and write a driver for it, to experiment with. So they chose a pen drive (a.k.a. USB stick) that was at hand: a JetFlash from Transcend, with vendor ID 0x058f and product ID 0x6387.

USB device detection in Linux


Whether a driver for a USB device is there or not on a Linux system, a valid USB device will always be detected at the hardware and kernel spaces of a USB-enabled Linux system, since it is designed (and detected) as per the USB protocol specifications. Hardware-space detection is done by the USB host controller, typically a native bus device, like a PCI device on x86 systems. The corresponding host controller driver would pick and translate the low-level physical layer information into higher-level USB protocol-specific information. The USB protocol formatted information about the USB device is then populated into the generic USB core layer (the usbcore driver) in kernel-space, thus enabling the detection of a USB device in kernel-space, even without having its specific driver. After this, it is up to various drivers, interfaces, and applications (which are dependent on the various Linux distributions), to have the user-space view of the detected devices. Figure 1 shows a top-to-bottom view of the USB subsystem in Linux.

Figure 1: USB subsystem in Linux

A basic listing of all detected USB devices can be obtained using the lsusb command, as root. Figure 2 shows this, with and without the pen drive plugged in. A -v option to lsusb provides detailed information.

Figure 2: Output of lsusb

Figure 3: USB's proc window snippet

Decoding a USB device section


To further decode these sections, a valid USB device needs to be understood first. All valid USB devices contain one or more configurations. A configuration of a USB device is like a profile, where the default one is the commonly used one. As such, Linux supports only one configuration per device, the default one. For every configuration, the device may have one or more interfaces. An interface corresponds to a function provided by the device. There would be as many interfaces as the number of functions provided by the device. So, say an MFD (multi-function device) USB printer can do printing, scanning and faxing; then it most likely would have at least three interfaces, one for each of the functions. So, unlike other device drivers, a USB device driver is typically associated/written per interface, rather than the device as a whole, meaning that one USB device may have multiple device drivers, and different device interfaces may have the same driver, though, of course, one interface can have a maximum of one driver only. It is okay and fairly common to have a single USB device driver for all the interfaces of a USB device. The Driver=... entry in the proc window output (Figure 3) shows the interface-to-driver mapping; a (none) indicates no associated driver. For every interface, there would be one or more end-points. An end-point is like a pipe for transferring information either into or from the interface of the device, depending on the functionality. Based on the type of information, the endpoints are of four types: Control, Interrupt, Bulk and Isochronous. As per the USB protocol specification, all valid USB devices have an implicit special control end-point zero, the only bi-directional end-point. Figure 4 shows the complete pictorial representation of a valid USB device, based on the above explanation.

Figure 4: USB device overview

Coming back to the USB device sections (Figure 3), the first letter on each line represents the various parts of the USB device specification just explained. For example, D for device, C for configuration, I for interface, E for endpoint, etc. Details about these and various others are available in the kernel source, in Documentation/usb/proc_usb_info.txt.

The USB pen drive driver registration


"Seems like there are so many things to know about the USB protocol to be able to write the first USB driver itself: device configuration, interfaces, transfer pipes, their four types, and so many other symbols like T, B, S, under a USB device specification," sighed Shweta. "Yes, but don't you worry; all of that can be covered in detail later. Let's do first things first: get the pen drive's interface associated with our USB device driver (pen_register.ko)," consoled Pugs. Like any other Linux device driver, here too, the constructor and the destructor are required, basically the same driver template that has been used for all the drivers. However, the content would vary, as this is a hardware protocol layer driver, i.e., a horizontal driver, unlike a character driver, which was one of the vertical drivers discussed earlier. The difference is that instead of registering with and unregistering from VFS, here this would be done with the corresponding protocol layer, the USB core in this case; instead of providing a user-space interface like a device file, it would get connected with the actual device in hardware-space. The USB core APIs for the same are as follows (prototyped in <linux/usb.h>):

int usb_register(struct usb_driver *driver);
void usb_deregister(struct usb_driver *);

As part of the usb_driver structure, the fields to be provided are the driver's name, the ID table for auto-detecting the particular device, and the two callback functions to be invoked by the USB core during hot plugging and hot removal of the device, respectively.

Putting it all together, pen_register.c would look like what follows:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/usb.h>

static int pen_probe(struct usb_interface *interface, const struct usb_device_id *id)
{
    printk(KERN_INFO "Pen drive (%04X:%04X) plugged\n", id->idVendor, id->idProduct);
    return 0;
}

static void pen_disconnect(struct usb_interface *interface)
{
    printk(KERN_INFO "Pen drive removed\n");
}

static struct usb_device_id pen_table[] =
{
    { USB_DEVICE(0x058F, 0x6387) },
    {} /* Terminating entry */
};
MODULE_DEVICE_TABLE (usb, pen_table);

static struct usb_driver pen_driver =
{
    .name = "pen_driver",
    .id_table = pen_table,
    .probe = pen_probe,
    .disconnect = pen_disconnect,
};

static int __init pen_init(void)
{
    return usb_register(&pen_driver);
}

static void __exit pen_exit(void)
{
    usb_deregister(&pen_driver);
}

module_init(pen_init);
module_exit(pen_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("USB Pen Registration Driver");

Then, the usual steps for any Linux device driver may be repeated:

Build the driver (.ko file) by running make.
Load the driver using insmod.
List the loaded modules using lsmod.
Unload the driver using rmmod.

But surprisingly, the results wouldn't be as expected. Check dmesg and the proc window to see the various logs and details. This is not because a USB driver is different from a character driver; there's a catch. Figure 3 shows that the pen drive has one interface (numbered 0), which is already associated with the usual usb-storage driver. Now, in order to get our driver associated with that interface, we need to unload the usb-storage driver (i.e., rmmod usb-storage) and replug the pen drive. Once that's done, the results would be as expected. Figure 5 shows a glimpse of the possible logs and a proc window snippet. Repeat hot-plugging in and hot-plugging out the pen drive to observe the probe and disconnect calls in action.

Figure 5: Pen driver in action

Summing up
"Finally! Something in action!" a relieved Shweta said. "But it seems like there are so many things (like the device ID table, probe, disconnect, etc.) yet to be understood to get a complete USB device driver in place." "Yes, you are right. Let's take them up, one by one, with breaks," replied Pugs, taking a break himself.

Device Drivers, Part 12: USB Drivers in Linux Continued


The 12th part of the series on Linux device drivers takes you further along the path to writing your first USB driver in Linux, continuing from the previous article.

Pugs continued, "Let's build upon the USB device driver coded in our previous session, using the same handy JetFlash pen drive from Transcend, with the vendor ID 0x058f and product ID 0x6387. For that, let's dig further into the USB protocol, and then convert our learning into code."

USB endpoints and their types


Depending on the type and attributes of the information to be transferred, a USB device may have one or more endpoints, each belonging to one of the following four categories:

Control: to transfer control information. Examples include resetting the device, querying information about the device, etc. All USB devices always have the default control endpoint zero.
Interrupt: for small and fast data transfers, typically of up to 8 bytes. Examples include data transfers for serial ports, human interface devices (HIDs) like keyboards, mouse, etc.
Bulk: for big but comparatively slower data transfers. A typical example is data transfers for mass-storage devices.
Isochronous: for big data transfers with a bandwidth guarantee, though data integrity may not be guaranteed. Typical practical usage examples include transfers of time-sensitive data like audio, video, etc.

Additionally, all but control endpoints could be in or out, indicating the direction of data transfer; in indicates data flow from the USB device to the host machine, and out, the other way. Technically, an endpoint is identified using an 8-bit number, the most significant bit (MSB) of which indicates the direction: 0 means out, and 1 means in. Control endpoints are bi-directional, and the MSB is ignored. Figure 1 shows a typical snippet of USB device specifications for devices connected on a system.

Figure 1: USB's proc window snippet

To be specific, the E: lines in the figure show examples of an interrupt endpoint of a UHCI host controller, and two bulk endpoints of the pen drive under consideration. Also, the endpoint numbers (in hex) are, respectively, 0x81, 0x01 and 0x82; the MSB of the first and third being 1 indicates in endpoints, represented by (I) in the figure; the second is an (O), or out, endpoint. MxPS specifies the maximum packet size, i.e., the data size that can be transferred in a single go. Again, as expected, for the interrupt endpoint it is 2 (<= 8), and 64 for the bulk endpoints. Ivl specifies the interval in milliseconds to be maintained between two consecutive data packet transfers for proper transfer, and is more significant for interrupt endpoints.

Decoding a USB device section


As we have just discussed the E: line, this is the right time to decode the relevant fields of the others as well. In short, these lines in a USB device section give a complete overview of the device as per the USB specifications, as discussed in our previous article. Refer back to Figure 1. The first letter of the first line of every device section is a T, indicating the position of the device in the USB tree, uniquely identified by the triplet <usb bus number, usb tree level, usb port>. D represents the device descriptor, containing at least the device version, device class/category, and the number of configurations available for this device. There would be as many C lines as the number of configurations, though typically it is one. C, the configuration descriptor, contains its index, the device attributes in this configuration, the maximum power (actually, current) the device would draw in this configuration, and the number of interfaces under this configuration. Depending on this, there would be at least that many I lines. There could be more, in case of an interface having alternates, i.e., the same interface number but with different properties, a typical scenario for Web-cams.

I represents the interface descriptor, with its index, alternate number, the functionality class/category of this interface, the driver associated with this interface, and the number of endpoints under this interface. The interface class may or may not be the same as that of the device class. And depending on the number of endpoints, there would be as many E lines, details of which have already been discussed earlier. The * after the C and I represents the currently active configuration and interface, respectively. The P line provides the vendor ID, product ID, and the product revision. S lines are string descriptors showing some vendor-specific descriptive information about the device. "Peeping into cat /proc/bus/usb/devices is good in order to figure out whether a device has been detected or not, and possibly to get a first-cut overview of the device. But most probably this information would be required in writing the driver for the device as well. So, is there a way to access it using C code?" Shweta asked. "Yes, definitely; that's what I am going to tell you about, next. Do you remember that as soon as a USB device is plugged into the system, the USB host controller driver populates its information into the generic USB core layer? To be precise, it puts that into a set of structures embedded into one another, exactly as per the USB specifications," Pugs replied.
The following are the exact data structures defined in <linux/usb.h>, ordered here in reverse, for flow clarity:

struct usb_device
{
    struct usb_device_descriptor descriptor;
    struct usb_host_config *config, *actconfig;
};
struct usb_host_config
{
    struct usb_config_descriptor desc;
    struct usb_interface *interface[USB_MAXINTERFACES];
};
struct usb_interface
{
    struct usb_host_interface *altsetting /* array */, *cur_altsetting;
};
struct usb_host_interface
{
    struct usb_interface_descriptor desc;
    struct usb_host_endpoint *endpoint /* array */;
};
struct usb_host_endpoint
{
    struct usb_endpoint_descriptor desc;
};

So, with access to the struct usb_device handle for a specific device, all the USB-specific information about the device can be decoded, as shown through the /proc window. But how does one get the device

handle? In fact, the device handle is not available directly in a driver; rather, the per-interface handles (pointers to struct usb_interface) are available, as USB drivers are written for device interfaces rather than the device as a whole. Recall that the probe and disconnect callbacks, which are invoked by the USB core for every interface of the registered device, have the corresponding interface handle as their first parameter. Refer to the prototypes below:

int (*probe)(struct usb_interface *interface, const struct usb_device_id *id);
void (*disconnect)(struct usb_interface *interface);

So, with the interface pointer, all information about the corresponding interface can be accessed; and to get the container device handle, the following macro comes to the rescue:

struct usb_device *device = interface_to_usbdev(interface);

Adding this new learning into last month's registration-only driver gets the following code listing (pen_info.c):

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/usb.h>

static struct usb_device *device;

static int pen_probe(struct usb_interface *interface, const struct usb_device_id *id)
{
    struct usb_host_interface *iface_desc;
    struct usb_endpoint_descriptor *endpoint;
    int i;

    iface_desc = interface->cur_altsetting;
    printk(KERN_INFO "Pen i/f %d now probed: (%04X:%04X)\n",
            iface_desc->desc.bInterfaceNumber, id->idVendor, id->idProduct);
    printk(KERN_INFO "ID->bNumEndpoints: %02X\n", iface_desc->desc.bNumEndpoints);
    printk(KERN_INFO "ID->bInterfaceClass: %02X\n", iface_desc->desc.bInterfaceClass);

    for (i = 0; i < iface_desc->desc.bNumEndpoints; i++)
    {
        endpoint = &iface_desc->endpoint[i].desc;
        printk(KERN_INFO "ED[%d]->bEndpointAddress: 0x%02X\n", i, endpoint->bEndpointAddress);
        printk(KERN_INFO "ED[%d]->bmAttributes: 0x%02X\n", i, endpoint->bmAttributes);
        printk(KERN_INFO "ED[%d]->wMaxPacketSize: 0x%04X (%d)\n",
                i, endpoint->wMaxPacketSize, endpoint->wMaxPacketSize);
    }

    device = interface_to_usbdev(interface);
    return 0;
}

static void pen_disconnect(struct usb_interface *interface)
{
    printk(KERN_INFO "Pen i/f %d now disconnected\n",
            interface->cur_altsetting->desc.bInterfaceNumber);
}

static struct usb_device_id pen_table[] =
{
    { USB_DEVICE(0x058F, 0x6387) },
    {} /* Terminating entry */
};
MODULE_DEVICE_TABLE (usb, pen_table);

static struct usb_driver pen_driver =
{
    .name = "pen_driver",
    .probe = pen_probe,
    .disconnect = pen_disconnect,
    .id_table = pen_table,
};

static int __init pen_init(void)
{
    return usb_register(&pen_driver);
}

static void __exit pen_exit(void)
{
    usb_deregister(&pen_driver);
}

module_init(pen_init);
module_exit(pen_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
MODULE_DESCRIPTION("USB Pen Info Driver");

Then, the usual steps for any Linux device driver may be repeated, along with the pen drive steps:

Build the driver (pen_info.ko file) by running make.
Load the driver using insmod pen_info.ko.
Plug in the pen drive (after making sure that the usb-storage driver is not already loaded).
Unplug the pen drive.
Check the output of dmesg for the logs.
Unload the driver using rmmod pen_info.

Figure 2 shows a snippet of the above steps on Pugs system. Remember to ensure (in the output of cat /proc/bus/usb/devices) that the usual usb-storage driver is not the one associated with the pen drive interface, but rather the pen_info driver.

Figure 2: Output of dmesg

Summing up
Before taking another break, Pugs shared two of the many mechanisms for a driver to specify its device to the USB core, using the struct usb_device_id table. The first one is by specifying the <vendor id, product id> pair using the USB_DEVICE() macro (as done above). The second one is by specifying the device class/category using the USB_DEVICE_INFO() macro. In fact, many more macros are available in <linux/usb.h> for various combinations. Moreover, multiple of these macros could be specified in the usb_device_id table (terminated by a null entry), for matching with any one of the criteria, enabling a single driver to be written for possibly many devices. "Earlier, you mentioned writing multiple drivers for a single device, as well. Basically, how do we selectively register or not register a particular interface of a USB device?" queried Shweta. "Sure. That's next in line in our discussion, along with the ultimate task in any device driver: the data-transfer mechanisms," replied Pugs.

Device Drivers, Part 13: Data Transfer to and from USB Devices

This article, which is part of the series on Linux device drivers, continues from the previous two articles. It details the ultimate step of data transfer to and from a USB device, using your first USB driver in Linux. Pugs continued, "To answer your question about how a driver selectively registers or skips a particular interface of a USB device, you need to understand the significance of the return value of the probe() callback. Note that the USB core would invoke probe for all the interfaces of a detected device, except the ones which are already registered; thus, while doing it for the first time, it will probe for all interfaces. Now, if the probe returns 0, it means the driver has registered for that interface. Returning an error code indicates not registering for it. That's all." "That was simple," commented Shweta. "Now, let's talk about the ultimate data transfers to and from a USB device," continued Pugs. "But before that, tell me, what is this MODULE_DEVICE_TABLE? This has been bothering me since you explained the USB device ID table macros," asked Shweta, urging Pugs to slow down. "That's trivial stuff. It is mainly for the user-space depmod," he said. "Module is another term for a driver, which can be dynamically loaded/unloaded. The macro MODULE_DEVICE_TABLE generates two variables in a module's read-only section, which are extracted by depmod and stored in global map files under /lib/modules/<kernel_version>. Two such files are modules.usbmap and modules.pcimap, for USB and PCI device drivers, respectively. This enables auto-loading of these drivers, as we saw the usb-storage driver getting auto-loaded."

USB data transfer


Time for USB data transfers. "Let's build upon the USB device driver coded in our previous sessions, using the same handy JetFlash pen drive from Transcend, with vendor ID 0x058f and product ID 0x6387," said Pugs, enthusiastically. USB, being a hardware protocol, forms the usual horizontal layer in the kernel space. And hence, for it to provide an interface to user-space, it has to connect through one of the vertical layers.

As the character (driver) vertical has already been discussed, it is the current preferred choice for the connection with the USB horizontal, in order to understand the complete data transfer flow. Also, we do not need to get a free unreserved character major number, but can use the character major number 180, reserved for USB-based character device files. Moreover, to achieve this complete character driver logic with the USB horizontal in one go, the following are the APIs declared in <linux/usb.h>:

int usb_register_dev(struct usb_interface *intf, struct usb_class_driver *class_driver);
void usb_deregister_dev(struct usb_interface *intf, struct usb_class_driver *class_driver);

Usually, we would expect these functions to be invoked in the constructor and the destructor of a module, respectively. However, to achieve the hot-plug-n-play behaviour for the (character) device files corresponding to USB devices, these are instead invoked in the probe and disconnect callbacks, respectively. The first parameter in the above functions is the interface pointer received as the first parameter in both probe and disconnect. The second parameter, struct usb_class_driver, needs to be populated with the suggested device file name and the set of device file operations, before invoking usb_register_dev. For the actual usage, refer to the functions pen_probe and pen_disconnect in the code listing of pen_driver.c below. Moreover, as the file operations (write, read, etc.) are now provided, that is exactly where we need to do the data transfers to and from the USB device. So, pen_write and pen_read below show the possible calls to usb_bulk_msg() (prototyped in <linux/usb.h>) to do the transfers over the pen drive's bulk endpoints 0x01 and 0x82, respectively. Refer to the E lines of the middle section in Figure 1 for the endpoint number listings of our pen drive.

Figure 1: USB specifications for the pen drive

Refer to the header file <linux/usb.h> under the kernel sources for the complete list of USB core API prototypes, including the other endpoint-specific data transfer functions like usb_control_msg(), usb_interrupt_msg(), etc. usb_rcvbulkpipe(), usb_sndbulkpipe() and many other such macros, also defined in <linux/usb.h>, compute the actual endpoint bit-mask to be passed to the various USB core APIs.

Note that a pen drive belongs to the USB mass storage class, which expects a set of SCSI-like commands to be transacted over the bulk endpoints. So, a raw read/write as shown in the code listing below may not really do a data transfer as expected, unless the data is appropriately formatted. Still, it summarises the overall code flow of a USB driver. To get a feel of real working USB data transfers in a simple and elegant way, one would need some kind of custom USB device, something like the one available here.

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/usb.h>

#define MIN(a,b) (((a) <= (b)) ? (a) : (b))
#define BULK_EP_OUT 0x01
#define BULK_EP_IN 0x82
#define MAX_PKT_SIZE 512

static struct usb_device *device;
static struct usb_class_driver class;
static unsigned char bulk_buf[MAX_PKT_SIZE];

static int pen_open(struct inode *i, struct file *f)
{
    return 0;
}
static int pen_close(struct inode *i, struct file *f)
{
    return 0;
}

static ssize_t pen_read(struct file *f, char __user *buf, size_t cnt, loff_t *off)
{
    int retval;
    int read_cnt;

    /* Read the data from the bulk endpoint */
    retval = usb_bulk_msg(device, usb_rcvbulkpipe(device, BULK_EP_IN),
            bulk_buf, MAX_PKT_SIZE, &read_cnt, 5000);
    if (retval)
    {
        printk(KERN_ERR "Bulk message returned %d\n", retval);
        return retval;
    }
    if (copy_to_user(buf, bulk_buf, MIN(cnt, read_cnt)))
    {
        return -EFAULT;
    }
    return MIN(cnt, read_cnt);
}
static ssize_t pen_write(struct file *f, const char __user *buf, size_t cnt, loff_t *off)
{
    int retval;
    int wrote_cnt = MIN(cnt, MAX_PKT_SIZE);

    if (copy_from_user(bulk_buf, buf, MIN(cnt, MAX_PKT_SIZE)))
    {
        return -EFAULT;
    }
    /* Write the data into the bulk endpoint */
    retval = usb_bulk_msg(device, usb_sndbulkpipe(device, BULK_EP_OUT),
            bulk_buf, MIN(cnt, MAX_PKT_SIZE), &wrote_cnt, 5000);
    if (retval)
    {
        printk(KERN_ERR "Bulk message returned %d\n", retval);
        return retval;
    }
    return wrote_cnt;
}

static struct file_operations fops =
{
    .open = pen_open,
    .release = pen_close,
    .read = pen_read,
    .write = pen_write,
};

static int pen_probe(struct usb_interface *interface, const struct usb_device_id *id)
{
    int retval;

    device = interface_to_usbdev(interface);

    class.name = "usb/pen%d";
    class.fops = &fops;
    if ((retval = usb_register_dev(interface, &class)) < 0)
    {
        /* Something prevented us from registering this driver */
        printk(KERN_ERR "Not able to get a minor for this device\n");
    }
    else
    {
        printk(KERN_INFO "Minor obtained: %d\n", interface->minor);
    }

    return retval;
}

static void pen_disconnect(struct usb_interface *interface)
{
    usb_deregister_dev(interface, &class);
}

/* Table of devices that work with this driver */
static struct usb_device_id pen_table[] =
{
    { USB_DEVICE(0x058F, 0x6387) },
    {} /* Terminating entry */
};
MODULE_DEVICE_TABLE(usb, pen_table);

static struct usb_driver pen_driver =
{
    .name = "pen_driver",
    .probe = pen_probe,
    .disconnect = pen_disconnect,
    .id_table = pen_table,
};

static int __init pen_init(void)
{
    int result;

    /* Register this driver with the USB subsystem */
    if ((result = usb_register(&pen_driver)))
    {
        printk(KERN_ERR "usb_register failed. Error number %d\n", result);
    }
    return result;
}

static void __exit pen_exit(void)
{
    /* Deregister this driver with the USB subsystem */
    usb_deregister(&pen_driver);
}

module_init(pen_init);
module_exit(pen_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("USB Pen Device Driver");

As a reminder, the usual steps for any Linux device driver may be repeated with the above code, along with the following steps for the pen drive:
- Build the driver (pen_driver.ko) by running make.
- Load the driver using insmod pen_driver.ko.
- Plug in the pen drive (after making sure that the usb-storage driver is not already loaded).
- Check for the dynamic creation of /dev/pen0 (0 being the minor number obtained; check the dmesg logs for the value on your system).
- Possibly try some write/read on /dev/pen0 (you will most likely get connection timeout and/or broken pipe errors, because of non-conforming SCSI commands).
- Unplug the pen drive and look for /dev/pen0 to be gone.
- Unload the driver using rmmod pen_driver.

Meanwhile, Pugs hooked up his first-of-its-kind creation, the Linux device driver kit (LDDK), into his system for a live demonstration of the USB data transfers. "Aha! Finally, a cool, complete, working USB driver," quipped Shweta, excited. "Want to have more fun? We could do a block driver over it," added Pugs. "Oh! Really?" asked Shweta, with glee. "Yes. But before that, we need to understand the partitioning mechanisms," commented Pugs.

Device Drivers, Part 14: A Dive Inside the Hard Disk for Understanding Partitions
This article, which is part of the series on Linux device drivers, takes you on a tour inside a hard disk.

"Doesn't it sound like a mechanical engineering subject: the design of the hard disk?" questioned Shweta. "Yes, it does. But understanding it gives us an insight into its programming aspect," reasoned Pugs, while waiting for the commencement of the seminar on storage systems. The seminar started with a few hard disks in the presenter's hand, and then a dive into her system, showing the output of fdisk -l (Figure 1).

Figure 1: Partition listing by fdisk

The first line shows the hard disk size in human-friendly format and in bytes. The second line mentions the number of logical heads, logical sectors per track, and the actual number of cylinders on the disk, together known as the geometry of the disk. The 255 heads indicate the number of platters or disks, as one read-write head is needed per disk. Let's number them, say D1, D2, ..., D255. Now, each disk would have the same number of concentric circular tracks, starting from the outside to the inside. In the above case, there are 60,801 such tracks per disk. Let's number them, say T1, T2, ..., T60801. A particular track number from all the disks forms a cylinder of the same number. For example, tracks T2 from D1, D2, ..., D255 together form the cylinder C2. Now, each track has the same number of logical sectors, 63 in our case, say S1, S2, ..., S63, and each sector is typically 512 bytes. Given this data, one can compute the total usable hard disk size, using the following formula:

Usable hard disk size in bytes = (number of heads or disks) * (number of tracks per disk) * (number of sectors per track) * (number of bytes per sector, i.e., the sector size)

For the disk under consideration, that is 255 * 60801 * 63 * 512 bytes = 500105249280 bytes. Note that this number may be slightly less than the actual hard disk size (500107862016 bytes, in our case). The reason is that the formula doesn't count the bytes in the last partial or incomplete cylinder. The primary cause of this mismatch is the difference between today's technology of organising the actual physical disk geometry and the traditional geometry representation using heads, cylinders and sectors. Note that in the fdisk output, we referred to the heads and sectors per track as logical, not physical. One may ask: if today's disks don't have such physical geometry concepts, why still maintain them and represent them in a logical form?
The main reason is to be able to continue with the same concepts of partitioning, and to maintain the same partition table formats, especially the most prevalent DOS-type partition tables, which depend heavily on this simplistic geometry. Note the computation of the cylinder size (255 heads * 63 sectors/track * 512 bytes/sector = 8225280 bytes) in the third line, and the demarcation of partitions in units of complete cylinders.

DOS-type partition tables


This brings us to the next important topic: understanding DOS-type partition tables. But first, what is a partition, and why should we partition? A hard disk can be divided into one or more logical disks, each of which is called a partition. This is useful for organising different types of data separately; for example, different operating system data, user data, temporary data, etc. So, partitions are basically logical divisions, and need to be maintained by metadata, which is the partition table. A DOS-type partition table contains four partition entries, each a 16-byte entry. Each of these entries can be depicted by the following C structure:

typedef struct
{
    unsigned char boot_type; // 0x00 - Inactive; 0x80 - Active (Bootable)
    unsigned char start_head;
    unsigned char start_sec:6;
    unsigned char start_cyl_hi:2;
    unsigned char start_cyl;
    unsigned char part_type;
    unsigned char end_head;
    unsigned char end_sec:6;
    unsigned char end_cyl_hi:2;
    unsigned char end_cyl;
    unsigned int abs_start_sec; // 32-bit on disk (unsigned long would be 8 bytes on 64-bit systems)
    unsigned int sec_in_part;
} PartEntry;

This partition table, followed by the two-byte signature 0xAA55, resides at the end of the disk's first sector, commonly known as the Master Boot Record (MBR). Hence, the starting offset of this partition table within the MBR is 512 - (4 * 16 + 2) = 446. Also, a 4-byte disk signature is placed at offset 440. The first 440 bytes of the MBR are typically used to place the first piece of boot code, which is loaded by the BIOS to boot the system from the disk. The part_info.c listing contains these various definitions, along with code for parsing and printing a formatted output of the partition table. From the partition table entry structure, note that the start and end cylinder fields are only 10 bits long, thus allowing a maximum cylinder number of 1023 only. For today's huge hard disks, this is in no way sufficient. Hence, in overflow cases, the corresponding <head, cylinder, sector> triplet in the partition table entry is set to the maximum value, and the actual value is computed using the last two fields: the absolute start sector number (abs_start_sec) and the number of sectors in the partition (sec_in_part).
The code for this too is in part_info.c:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define SECTOR_SIZE 512
#define MBR_SIZE SECTOR_SIZE
#define MBR_DISK_SIGNATURE_OFFSET 440
#define MBR_DISK_SIGNATURE_SIZE 4
#define PARTITION_TABLE_OFFSET 446
#define PARTITION_ENTRY_SIZE 16 // sizeof(PartEntry)
#define PARTITION_TABLE_SIZE 64 // sizeof(PartTable)
#define MBR_SIGNATURE_OFFSET 510
#define MBR_SIGNATURE_SIZE 2
#define MBR_SIGNATURE 0xAA55
#define BR_SIZE SECTOR_SIZE
#define BR_SIGNATURE_OFFSET 510
#define BR_SIGNATURE_SIZE 2
#define BR_SIGNATURE 0xAA55

typedef struct
{
    unsigned char boot_type; // 0x00 - Inactive; 0x80 - Active (Bootable)
    unsigned char start_head;
    unsigned char start_sec:6;
    unsigned char start_cyl_hi:2;
    unsigned char start_cyl;
    unsigned char part_type;
    unsigned char end_head;
    unsigned char end_sec:6;
    unsigned char end_cyl_hi:2;
    unsigned char end_cyl;
    unsigned int abs_start_sec; // 32-bit on disk (unsigned long is 8 bytes on 64-bit systems)
    unsigned int sec_in_part;
} PartEntry;

typedef struct
{
    unsigned char boot_code[MBR_DISK_SIGNATURE_OFFSET];
    unsigned int disk_signature;
    unsigned short pad;
    unsigned char pt[PARTITION_TABLE_SIZE];
    unsigned short signature;
} MBR;

void print_computed(unsigned long sector)
{
    unsigned long heads, cyls, tracks, sectors;

    sectors = sector % 63 + 1 /* As indexed from 1 */;
    tracks = sector / 63;
    cyls = tracks / 255 + 1 /* As indexed from 1 */;
    heads = tracks % 255;
    printf("(%3lu/%5lu/%1lu)", heads, cyls, sectors);
}

int main(int argc, char *argv[])
{
    char *dev_file = "/dev/sda";
    int fd, i, rd_val;
    MBR m;
    PartEntry *p = (PartEntry *)(m.pt);

    if (argc == 2)
    {
        dev_file = argv[1];
    }
    if ((fd = open(dev_file, O_RDONLY)) == -1)
    {
        fprintf(stderr, "Failed opening %s: ", dev_file);
        perror("");
        return 1;
    }
    if ((rd_val = read(fd, &m, sizeof(m))) != sizeof(m))
    {
        fprintf(stderr, "Failed reading %s: ", dev_file);
        perror("");
        close(fd);
        return 2;
    }
    close(fd);
    printf("\nDOS type Partition Table of %s:\n", dev_file);
    printf("  B Start (H/C/S)   End (H/C/S)  Type  StartSec    TotSec\n");
    for (i = 0; i < 4; i++)
    {
        printf("%d:%d (%3d/%4d/%2d) (%3d/%4d/%2d)  %02X %10u %9u\n",
            i + 1, !!(p[i].boot_type & 0x80),
            p[i].start_head,
            1 + ((p[i].start_cyl_hi << 8) | p[i].start_cyl),
            p[i].start_sec,
            p[i].end_head,
            1 + ((p[i].end_cyl_hi << 8) | p[i].end_cyl),
            p[i].end_sec,
            p[i].part_type,
            p[i].abs_start_sec, p[i].sec_in_part);
    }
    printf("\nRe-computed Partition Table of %s:\n", dev_file);
    printf("  B Start (H/C/S)   End (H/C/S)  Type  StartSec    TotSec\n");
    for (i = 0; i < 4; i++)
    {
        printf("%d:%d ", i + 1, !!(p[i].boot_type & 0x80));
        print_computed(p[i].abs_start_sec);
        printf(" ");
        print_computed(p[i].abs_start_sec + p[i].sec_in_part - 1);
        printf(" %02X %10u %9u\n",
            p[i].part_type, p[i].abs_start_sec, p[i].sec_in_part);
    }
    printf("\n");
    return 0;
}

As the above is an application, compile it with gcc part_info.c -o part_info, and then run ./part_info /dev/sda to check out your primary partitioning information on /dev/sda. Figure 2 shows the output of ./part_info on the presenter's system. Compare it with the fdisk output in Figure 1.

Figure 2: Output of ./part_info

Partition types and boot records


Now, as this partition table is hard-coded to have four entries, that's the maximum number of partitions you can have. These are called primary partitions, and each has an associated type in the corresponding partition table entry. These types are typically coined by various OS vendors, and hence sort of map to the various OSs like DOS, Minix, Linux, Solaris, BSD, FreeBSD, QNX, W95, Novell Netware, etc., to be used for/with the particular OS. However, this is more a formality than a real requirement. Besides this, one of the four primary partitions can be labelled as an extended partition, which has a special significance. As the name suggests, it is used to further extend the hard disk division, i.e., to have more partitions. These are called logical partitions, and are created within the extended partition. The metadata of these is maintained in a linked-list format, allowing an unlimited number of logical partitions (at least theoretically). For that, the first sector of the extended partition, commonly called the Boot Record (BR), is used like the MBR to store (the linked-list head of) the partition table for the logical partitions. Subsequent linked-list nodes are stored in the first sector of the subsequent logical partitions, referred to as the Logical Boot Record (LBR).

Each linked-list node is a complete 4-entry partition table, though only the first two entries are used: the first for the linked-list data, namely, information about the immediate logical partition, and the second as the linked list's next pointer, pointing to the list of the remaining logical partitions. To compare and understand the primary partitioning details on your system's hard disk, follow the steps (as the root user, hence with care) given below:

./part_info /dev/sda ## Displays the partition table on /dev/sda
fdisk -l /dev/sda ## To display and compare the partition table entries with the above

In case you have multiple hard disks (/dev/sdb, ...), hard disk device files with other names (/dev/hda, ...), or an extended partition, you may try ./part_info <device_file_name> on them as well. Trying it on an extended partition would give you the information about the starting partition table of the logical partitions. Right now, we have carefully and selectively played (read-only) with the system's hard disk. Why carefully? Because otherwise, we may render our system non-bootable. But no learning is complete without a total exploration. Hence, in our next session, we will create a dummy disk in RAM and do destructive exploration on it.

Device Drivers, Part 15: Disk on RAM Playing with Block Drivers

This article, which is part of the series on Linux device drivers, experiments with a dummy hard disk on RAM to demonstrate how block drivers work. After a delicious lunch, theory makes the audience sleepy. So, let's start with the code itself.

Disk On RAM source code


Let us create a directory called DiskOnRAM, which holds the following six files: three C source files, two C headers, and one Makefile.

partition.h
#ifndef PARTITION_H
#define PARTITION_H

#include <linux/types.h>

extern void copy_mbr_n_br(u8 *disk);

#endif

partition.c
#include <linux/string.h>
#include "partition.h"

#define ARRAY_SIZE(a) (sizeof(a) / sizeof(*a))

#define SECTOR_SIZE 512
#define MBR_SIZE SECTOR_SIZE
#define MBR_DISK_SIGNATURE_OFFSET 440
#define MBR_DISK_SIGNATURE_SIZE 4
#define PARTITION_TABLE_OFFSET 446
#define PARTITION_ENTRY_SIZE 16 // sizeof(PartEntry)
#define PARTITION_TABLE_SIZE 64 // sizeof(PartTable)
#define MBR_SIGNATURE_OFFSET 510
#define MBR_SIGNATURE_SIZE 2
#define MBR_SIGNATURE 0xAA55
#define BR_SIZE SECTOR_SIZE
#define BR_SIGNATURE_OFFSET 510
#define BR_SIGNATURE_SIZE 2
#define BR_SIGNATURE 0xAA55

typedef struct
{
    unsigned char boot_type; // 0x00 - Inactive; 0x80 - Active (Bootable)
    unsigned char start_head;
    unsigned char start_sec:6;
    unsigned char start_cyl_hi:2;
    unsigned char start_cyl;
    unsigned char part_type;
    unsigned char end_head;
    unsigned char end_sec:6;
    unsigned char end_cyl_hi:2;
    unsigned char end_cyl;
    unsigned int abs_start_sec; // 32-bit on disk (unsigned long is 8 bytes on 64-bit systems)
    unsigned int sec_in_part;
} PartEntry;

typedef PartEntry PartTable[4];

static PartTable def_part_table =
{
    {
        .boot_type = 0x00,
        .start_head = 0x00,
        .start_sec = 0x2,
        .start_cyl = 0x00,
        .part_type = 0x83,
        .end_head = 0x00,
        .end_sec = 0x20,
        .end_cyl = 0x09,
        .abs_start_sec = 0x00000001,
        .sec_in_part = 0x0000013F
    },
    {
        .boot_type = 0x00,
        .start_head = 0x00,
        .start_sec = 0x1,
        .start_cyl = 0x0A, // extended partition start cylinder (BR location)
        .part_type = 0x05,
        .end_head = 0x00,
        .end_sec = 0x20,
        .end_cyl = 0x13,
        .abs_start_sec = 0x00000140,
        .sec_in_part = 0x00000140
    },
    {
        .boot_type = 0x00,
        .start_head = 0x00,
        .start_sec = 0x1,
        .start_cyl = 0x14,
        .part_type = 0x83,
        .end_head = 0x00,
        .end_sec = 0x20,
        .end_cyl = 0x1F,
        .abs_start_sec = 0x00000280,
        .sec_in_part = 0x00000180
    },
    {
    }
};
static unsigned int def_log_part_br_cyl[] = {0x0A, 0x0E, 0x12};
static const PartTable def_log_part_table[] =
{
    {
        {
            .boot_type = 0x00,
            .start_head = 0x00,
            .start_sec = 0x2,
            .start_cyl = 0x0A,
            .part_type = 0x83,
            .end_head = 0x00,
            .end_sec = 0x20,
            .end_cyl = 0x0D,
            .abs_start_sec = 0x00000001,
            .sec_in_part = 0x0000007F
        },
        {
            .boot_type = 0x00,
            .start_head = 0x00,
            .start_sec = 0x1,
            .start_cyl = 0x0E,
            .part_type = 0x05,
            .end_head = 0x00,
            .end_sec = 0x20,
            .end_cyl = 0x11,
            .abs_start_sec = 0x00000080,
            .sec_in_part = 0x00000080
        },
    },
    {
        {
            .boot_type = 0x00,
            .start_head = 0x00,
            .start_sec = 0x2,
            .start_cyl = 0x0E,
            .part_type = 0x83,
            .end_head = 0x00,
            .end_sec = 0x20,
            .end_cyl = 0x11,
            .abs_start_sec = 0x00000001,
            .sec_in_part = 0x0000007F
        },
        {
            .boot_type = 0x00,
            .start_head = 0x00,
            .start_sec = 0x1,
            .start_cyl = 0x12,
            .part_type = 0x05,
            .end_head = 0x00,
            .end_sec = 0x20,
            .end_cyl = 0x13,
            .abs_start_sec = 0x00000100,
            .sec_in_part = 0x00000040
        },
    },
    {
        {
            .boot_type = 0x00,
            .start_head = 0x00,
            .start_sec = 0x2,
            .start_cyl = 0x12,
            .part_type = 0x83,
            .end_head = 0x00,
            .end_sec = 0x20,
            .end_cyl = 0x13,
            .abs_start_sec = 0x00000001,
            .sec_in_part = 0x0000003F
        },
    }
};

static void copy_mbr(u8 *disk)
{
    memset(disk, 0x0, MBR_SIZE);
    *(unsigned int *)(disk + MBR_DISK_SIGNATURE_OFFSET) = 0x36E5756D;
    memcpy(disk + PARTITION_TABLE_OFFSET, &def_part_table, PARTITION_TABLE_SIZE);
    *(unsigned short *)(disk + MBR_SIGNATURE_OFFSET) = MBR_SIGNATURE;
}
static void copy_br(u8 *disk, int start_cylinder, const PartTable *part_table)
{
    disk += (start_cylinder * 32 /* sectors / cyl */ * SECTOR_SIZE);
    memset(disk, 0x0, BR_SIZE);
    memcpy(disk + PARTITION_TABLE_OFFSET, part_table, PARTITION_TABLE_SIZE);
    *(unsigned short *)(disk + BR_SIGNATURE_OFFSET) = BR_SIGNATURE;
}
void copy_mbr_n_br(u8 *disk)
{
    int i;

    copy_mbr(disk);
    for (i = 0; i < ARRAY_SIZE(def_log_part_table); i++)
    {
        copy_br(disk, def_log_part_br_cyl[i], &def_log_part_table[i]);
    }
}

ram_device.h
#ifndef RAMDEVICE_H
#define RAMDEVICE_H

#include <linux/types.h> /* for sector_t, u8 */

#define RB_SECTOR_SIZE 512

extern int ramdevice_init(void);
extern void ramdevice_cleanup(void);
extern void ramdevice_write(sector_t sector_off, u8 *buffer, unsigned int sectors);
extern void ramdevice_read(sector_t sector_off, u8 *buffer, unsigned int sectors);

#endif

ram_device.c
#include <linux/types.h>
#include <linux/vmalloc.h>
#include <linux/string.h>
#include <linux/errno.h>

#include "ram_device.h"
#include "partition.h"

#define RB_DEVICE_SIZE 1024 /* sectors */
/* So, total device size = 1024 * 512 bytes = 512 KiB */

/* Array where the disk stores its data */
static u8 *dev_data;

int ramdevice_init(void)
{
    dev_data = vmalloc(RB_DEVICE_SIZE * RB_SECTOR_SIZE);
    if (dev_data == NULL)
        return -ENOMEM;
    /* Set up its partition table */
    copy_mbr_n_br(dev_data);
    return RB_DEVICE_SIZE;
}

void ramdevice_cleanup(void)
{
    vfree(dev_data);
}

void ramdevice_write(sector_t sector_off, u8 *buffer, unsigned int sectors)
{
    memcpy(dev_data + sector_off * RB_SECTOR_SIZE, buffer,
        sectors * RB_SECTOR_SIZE);
}
void ramdevice_read(sector_t sector_off, u8 *buffer, unsigned int sectors)
{
    memcpy(buffer, dev_data + sector_off * RB_SECTOR_SIZE,
        sectors * RB_SECTOR_SIZE);
}

ram_block.c
/* Disk on RAM Driver */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/types.h>
#include <linux/genhd.h>
#include <linux/blkdev.h>
#include <linux/errno.h>

#include "ram_device.h"

#define RB_FIRST_MINOR 0
#define RB_MINOR_CNT 16

static u_int rb_major = 0;

/*
 * The internal structure representation of our device
 */
static struct rb_device
{
    /* Size is the size of the device (in sectors) */
    unsigned int size;
    /* For exclusive access to our request queue */
    spinlock_t lock;
    /* Our request queue */
    struct request_queue *rb_queue;
    /* This is the kernel's representation of an individual disk device */
    struct gendisk *rb_disk;
} rb_dev;

static int rb_open(struct block_device *bdev, fmode_t mode)
{
    unsigned unit = iminor(bdev->bd_inode);

    printk(KERN_INFO "rb: Device is opened\n");
    printk(KERN_INFO "rb: Inode number is %d\n", unit);

    if (unit >= RB_MINOR_CNT)
        return -ENODEV;
    return 0;
}

static int rb_close(struct gendisk *disk, fmode_t mode)
{
    printk(KERN_INFO "rb: Device is closed\n");
    return 0;
}

/*
 * Actual data transfer
 */
static int rb_transfer(struct request *req)
{
    //struct rb_device *dev = (struct rb_device *)(req->rq_disk->private_data);

    int dir = rq_data_dir(req);
    sector_t start_sector = blk_rq_pos(req);
    unsigned int sector_cnt = blk_rq_sectors(req);

    struct bio_vec *bv;
    struct req_iterator iter;

    sector_t sector_offset;
    unsigned int sectors;
    u8 *buffer;

    int ret = 0;

    //printk(KERN_DEBUG "rb: Dir:%d; Sec:%lld; Cnt:%d\n", dir, start_sector, sector_cnt);

    sector_offset = 0;
    rq_for_each_segment(bv, req, iter)
    {
        buffer = page_address(bv->bv_page) + bv->bv_offset;
        if (bv->bv_len % RB_SECTOR_SIZE != 0)
        {
            printk(KERN_ERR "rb: Should never happen: "
                "bio size (%d) is not a multiple of RB_SECTOR_SIZE (%d).\n"
                "This may lead to data truncation.\n",
                bv->bv_len, RB_SECTOR_SIZE);
            ret = -EIO;
        }
        sectors = bv->bv_len / RB_SECTOR_SIZE;
        printk(KERN_DEBUG "rb: Sector Offset: %lld; Buffer: %p; Length: %d sectors\n",
            (long long)sector_offset, buffer, sectors);
        if (dir == WRITE) /* Write to the device */
        {
            ramdevice_write(start_sector + sector_offset, buffer, sectors);
        }
        else /* Read from the device */
        {
            ramdevice_read(start_sector + sector_offset, buffer, sectors);
        }
        sector_offset += sectors;
    }
    if (sector_offset != sector_cnt)
    {
        printk(KERN_ERR "rb: bio info doesn't match with the request info\n");
        ret = -EIO;
    }

    return ret;
}

/*
 * Represents a block I/O request for us to execute
 */
static void rb_request(struct request_queue *q)
{
    struct request *req;
    int ret;

    /* Gets the current request from the dispatch queue */
    while ((req = blk_fetch_request(q)) != NULL)
    {
#if 0
        /*
         * This function tells us whether we are looking at a filesystem request
         * - one that moves blocks of data
         */
        if (!blk_fs_request(req))
        {
            printk(KERN_NOTICE "rb: Skip non-fs request\n");
            /* We pass 0 to indicate that we successfully completed the request */
            __blk_end_request_all(req, 0);
            //__blk_end_request(req, 0, blk_rq_bytes(req));
            continue;
        }
#endif
        ret = rb_transfer(req);
        __blk_end_request_all(req, ret);
        //__blk_end_request(req, ret, blk_rq_bytes(req));
    }
}

/*
 * These are the file operations that are performed on the ram block device
 */
static struct block_device_operations rb_fops =
{
    .owner = THIS_MODULE,
    .open = rb_open,
    .release = rb_close,
};

/*
 * This is the registration and initialization section of the ram block device
 * driver
 */
static int __init rb_init(void)
{
    int ret;

    /* Set up our RAM device */
    if ((ret = ramdevice_init()) < 0)
    {
        return ret;
    }
    rb_dev.size = ret;

    /* Get registered */
    rb_major = register_blkdev(rb_major, "rb");
    if (rb_major <= 0)
    {
        printk(KERN_ERR "rb: Unable to get Major Number\n");
        ramdevice_cleanup();
        return -EBUSY;
    }

    /* Get a request queue (here the queue is created) */
    spin_lock_init(&rb_dev.lock);
    rb_dev.rb_queue = blk_init_queue(rb_request, &rb_dev.lock);
    if (rb_dev.rb_queue == NULL)
    {
        printk(KERN_ERR "rb: blk_init_queue failure\n");
        unregister_blkdev(rb_major, "rb");
        ramdevice_cleanup();
        return -ENOMEM;
    }

    /*
     * Allocate the gendisk structure. Memory allocation is involved here;
     * the number of minors passed is the maximum number of partitions
     * this disk can support.
     */
    rb_dev.rb_disk = alloc_disk(RB_MINOR_CNT);
    if (!rb_dev.rb_disk)
    {
        printk(KERN_ERR "rb: alloc_disk failure\n");
        blk_cleanup_queue(rb_dev.rb_queue);
        unregister_blkdev(rb_major, "rb");
        ramdevice_cleanup();
        return -ENOMEM;
    }

    /* Setting the major number */
    rb_dev.rb_disk->major = rb_major;
    /* Setting the first minor number */
    rb_dev.rb_disk->first_minor = RB_FIRST_MINOR;
    /* Initializing the device operations */
    rb_dev.rb_disk->fops = &rb_fops;
    /* Driver-specific own internal data */
    rb_dev.rb_disk->private_data = &rb_dev;
    rb_dev.rb_disk->queue = rb_dev.rb_queue;
    /*
     * Set this flag if you do not want partition information to show up in
     * cat /proc/partitions
     */
    //rb_dev.rb_disk->flags = GENHD_FL_SUPPRESS_PARTITION_INFO;
    sprintf(rb_dev.rb_disk->disk_name, "rb");
    /* Setting the capacity of the device in its gendisk structure */
    set_capacity(rb_dev.rb_disk, rb_dev.size);

    /* Adding the disk to the system */
    add_disk(rb_dev.rb_disk);
    /* Now the disk is "live" */
    printk(KERN_INFO "rb: Ram Block driver initialised (%d sectors; %d bytes)\n",
        rb_dev.size, rb_dev.size * RB_SECTOR_SIZE);

    return 0;
}

/*
 * This is the unregistration and uninitialization section of the ram block
 * device driver
 */
static void __exit rb_cleanup(void)
{
    del_gendisk(rb_dev.rb_disk);
    put_disk(rb_dev.rb_disk);
    blk_cleanup_queue(rb_dev.rb_queue);
    unregister_blkdev(rb_major, "rb");
    ramdevice_cleanup();
}

module_init(rb_init);
module_exit(rb_cleanup);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
MODULE_DESCRIPTION("Ram Block Driver");
MODULE_ALIAS_BLOCKDEV_MAJOR(rb_major);

You can also download the code demonstrated from here. As usual, executing make will build the Disk on RAM driver (dor.ko), combining the three C files. Check out the Makefile to see how.

Makefile
# If called directly from the command line, invoke the kernel build system.
ifeq ($(KERNELRELEASE),)

KERNEL_SOURCE := /usr/src/linux
PWD := $(shell pwd)

default: module

module:
	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) modules

clean:
	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) clean

# Otherwise KERNELRELEASE is defined; we've been invoked from the
# kernel build system and can use its language.
else

obj-m := dor.o
dor-y := ram_block.o ram_device.o partition.o

endif

To clean the built files, run the usual make clean. Once built, the following are the experimental steps (refer to Figures 1 to 3).

Figure 1: Playing with the Disk on RAM driver

Figure 2: xxd showing the initial data on the first partition (/dev/rb1)

Figure 3: Formatting the third partition (/dev/rb3)

Please note that all these need to be executed with root privileges:
- Load the driver dor.ko using insmod. This would create the block device files representing the disk on 512 KiB of RAM, with three primary and three logical partitions.
- Check out the automatically created block device files (/dev/rb*). /dev/rb is the entire disk, which is 512 KiB in size. rb1, rb2 and rb3 are the primary partitions, with rb2 being the extended partition, which contains the three logical partitions rb5, rb6 and rb7.
- Read the entire disk (/dev/rb) using the disk dump utility dd.
- Zero out the first sector of the disk's first partition (/dev/rb1), again using dd.
- Write some text into the disk's first partition (/dev/rb1) using cat.
- Display the initial contents of the first partition (/dev/rb1) using the xxd utility. See Figure 2 for the xxd output.
- Display the partition information for the disk using fdisk. See Figure 3 for the fdisk output.
- Quick-format the third primary partition (/dev/rb3) as a vfat filesystem (like your pen drive), using mkfs.vfat (Figure 3).
- Mount the newly formatted partition using mount, say at /mnt (Figure 3).
- The disk usage utility df would now show this partition mounted at /mnt (Figure 3). You may go ahead and store files there, but remember that this is a disk on RAM, and hence non-persistent.
- Unmount the partition using umount /mnt, and then unload the driver using rmmod dor. All data on the disk will be lost.

Now let's learn the rules


We have just played around with the disk on RAM (DOR), but without actually knowing the rules, i.e., the internal details of the game. So, let's dig into the nitty-gritty to decode the rules. Each of the three .c files represents a specific part of the driver. ram_device.c and ram_device.h abstract the underlying RAM operations like vmalloc()/vfree(), memcpy(), etc., providing disk operation APIs like init/cleanup and read/write. partition.c and partition.h provide the functionality to emulate the various partition tables on the DOR. Recall the pre-lunch session (i.e., the previous article) to understand the details of partitioning. The code in these is responsible for the partition information (the number, type, size, etc.) that is shown using fdisk. The ram_block.c file is the core block driver implementation, exposing the DOR as the block device files (/dev/rb*) to user-space. In other words, four of the five files, ram_device.* and partition.*, form the horizontal layer of the device driver, and ram_block.c forms the vertical (block) layer of the device driver. So, let's understand the latter in detail.

The block driver basics


Conceptually, block drivers are very similar to character drivers, especially with regard to the following:
- Usage of device files
- Major and minor numbers
- Device file operations
- The concept of device registration

So, if you already know character driver implementation, it would be easy to understand block drivers.

However, they are definitely not identical. The key differences are as follows:
- Abstraction for block-oriented versus byte-oriented devices.
- Block drivers are designed to be used by I/O schedulers, for optimal performance. Compare that with character drivers, which are used by the VFS directly.
- Block drivers are designed to be integrated with the Linux buffer cache mechanism for efficient data access. Character drivers are pass-through drivers, accessing the hardware directly.

These cause the implementation differences. Let's analyse the key code snippets from ram_block.c, starting at the driver's constructor rb_init(). The first step is to register for an 8-bit (block) major number (which implicitly means registering for all 256 8-bit minor numbers associated with it). The function for that is as follows:

int register_blkdev(unsigned int major, const char *name);

Here, major is the major number to be registered, and name is a registration label displayed under the kernel window /proc/devices. Interestingly, register_blkdev() tries to allocate and register a freely available major number when 0 is passed for its first parameter major; on success, the allocated major number is returned. The corresponding de-registration function is as follows:

void unregister_blkdev(unsigned int major, const char *name);

Both these are prototyped in <linux/fs.h>. The second step is to provide the device file operations, through the struct block_device_operations (prototyped in <linux/blkdev.h>), for the device files of the registered major number. However, these operations are very few compared to the character device file operations, and mostly insignificant. To elaborate, there are no operations even to read and write, which is surprising. But as we already know that block drivers need to integrate with the I/O schedulers, the read-write implementation is achieved through something called request queues.
So, along with providing the device file operations, the following need to be provided:
- The request queue for queuing the read/write requests
- The spin lock associated with the request queue, to protect its concurrent access
- The request function to process the requests in the request queue

Also, there is no separate interface for block device file creation, so the following are also provided:
- The device file name prefix, commonly referred to as disk_name (rb in our driver)
- The starting minor number for the device files, commonly referred to as first_minor

Finally, two block-device-specific things are also provided, namely:
- The maximum number of partitions supported for this block device, by specifying the total minors
- The underlying device size, in units of 512-byte sectors, for the logical block access abstraction

All these are registered through the struct gendisk, using the following function:

void add_disk(struct gendisk *disk);

The corresponding delete function is as follows:

void del_gendisk(struct gendisk *disk);

Prior to add_disk(), the various fields of struct gendisk need to be initialised, either directly or using various macros/functions like set_capacity(). major, first_minor, fops, queue and disk_name are the minimal fields to be initialised directly. And even before the initialisation of these fields, the struct gendisk needs to be allocated, using the function given below:

struct gendisk *alloc_disk(int minors);

Here, minors is the total number of partitions supported for this disk. And the corresponding inverse function would be:

void put_disk(struct gendisk *disk);

All these are prototyped in <linux/genhd.h>.

Request queue and the request function


The request queue also needs to be initialised and set up into the struct gendisk, before add_disk(). The request queue is initialised by calling:

struct request_queue *blk_init_queue(request_fn_proc *, spinlock_t *);

We provide the request-processing function and the initialised concurrency-protection spin-lock as parameters. The corresponding queue clean-up function is given below:

void blk_cleanup_queue(struct request_queue *);

The request (processing) function should be defined with the following prototype:

void request_fn(struct request_queue *q);

It should be coded to fetch a request from its parameter q, for instance, by using the following:

struct request *blk_fetch_request(struct request_queue *q);

Then it should either process it, or initiate processing. Whatever it does should be non-blocking, as this request function is called from a non-process context, and also after taking the queue's spin-lock. Moreover, only functions that do not release or take the queue's spin-lock should be used within the request function. A typical example of request processing, as demonstrated by the function rb_request() in ram_block.c, is given below:

while ((req = blk_fetch_request(q)) != NULL) /* Fetching a request */
{
	/* Processing the request: the actual data transfer */
	ret = rb_transfer(req); /* Our custom function */
	/* Informing that the request has been processed with return of ret */
	__blk_end_request_all(req, ret);
}

Requests and their processing


Our key function is rb_transfer(), which parses a struct request and accordingly does the actual data transfer. The struct request primarily contains the direction of data transfer, the starting sector for the data transfer, the total number of sectors for the data transfer, and the scatter-gather buffer for the data transfer. The various macros to extract these from the struct request are as follows:

rq_data_dir(req); /* Operation type: 0 - read from device; otherwise - write to device */
blk_rq_pos(req); /* Starting sector to process */
blk_rq_sectors(req); /* Total sectors to process */
rq_for_each_segment(bv, req, iter) /* Iterator to extract individual buffers */

rq_for_each_segment() is the special one, which iterates over the struct request (req) using iter, extracting the individual buffer information into the struct bio_vec (bv: basic input/output vector) on each iteration. Then, on each extraction, the appropriate data transfer is done, based on the operation type, invoking one of the following APIs from ram_device.c:

void ramdevice_write(sector_t sector_off, u8 *buffer, unsigned int sectors);
void ramdevice_read(sector_t sector_off, u8 *buffer, unsigned int sectors);

Check out the complete code of rb_transfer() in ram_block.c.

Summing up
With that, we have actually learnt about the beautiful block drivers, by traversing through the design of a hard disk, and playing around with partitioning, formatting and various other raw operations on a hard disk. Thanks for patiently listening. Now, the session is open for questions; please feel free to leave your queries as comments.

Device Drivers, Part 16: Kernel Window Peeping through /proc


This article, which is part of the series on Linux device drivers, demonstrates the creation and usage of files under the /proc virtual filesystem.

After many months, Shweta and Pugs got together for some peaceful technical romancing. All through, they had been using all kinds of kernel windows, especially through the /proc virtual filesystem (using cat), to help them decode various details of Linux device drivers. Here's a non-exhaustive summary listing:

/proc/modules - dynamically loaded modules
/proc/devices - registered character and block major numbers
/proc/iomem - on-system physical RAM and bus device addresses
/proc/ioports - on-system I/O port addresses (especially for x86 systems)
/proc/interrupts - registered interrupt request numbers
/proc/softirqs - registered soft IRQs
/proc/kallsyms - running kernel symbols, including from loaded modules
/proc/partitions - currently connected block devices and their partitions
/proc/filesystems - currently active filesystem drivers
/proc/swaps - currently active swaps
/proc/cpuinfo - information about the CPU(s) on the system
/proc/meminfo - information about the memory on the system, viz., RAM, swap, etc.

Custom kernel windows


"Yes, these have been really helpful in understanding and debugging Linux device drivers. But is it possible for us to also provide some help? Yes, I mean, can we create one such kernel window through /proc?" asked Shweta.

"Why just one? You can have as many as you want. And it's simple: just use the right set of APIs, and there you go."

"For you, everything is simple," Shweta grumbled.

"No yaar, this is seriously simple," smiled Pugs. "Just watch me creating one for you," he added. And in a jiffy, Pugs created the proc_window.c file below:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/proc_fs.h>
#include <linux/jiffies.h>

static struct proc_dir_entry *parent, *file, *link;
static int state = 0;

int time_read(char *page, char **start, off_t off, int count, int *eof, void *data)
{
	int len, val;
	unsigned long act_jiffies;

	len = sprintf(page, "state = %d\n", state);
	act_jiffies = jiffies - INITIAL_JIFFIES;
	val = jiffies_to_msecs(act_jiffies);
	switch (state)
	{
		case 0:
			len += sprintf(page + len, "time = %ld jiffies\n", act_jiffies);
			break;
		case 1:
			len += sprintf(page + len, "time = %d msecs\n", val);
			break;
		case 2:
			len += sprintf(page + len, "time = %ds %dms\n", val / 1000, val % 1000);
			break;
		case 3:
			val /= 1000;
			len += sprintf(page + len, "time = %02d:%02d:%02d\n",
				val / 3600, (val / 60) % 60, val % 60);
			break;
		default:
			len += sprintf(page + len, "<not implemented>\n");
			break;
	}
	len += sprintf(page + len, "{offset = %ld; count = %d;}\n", off, count);
	return len;
}

int time_write(struct file *file, const char __user *buffer, unsigned long count, void *data)
{
	if (count > 2)
		return count;
	if ((count == 2) && (buffer[1] != '\n'))
		return count;
	if ((buffer[0] < '0') || ('9' < buffer[0]))
		return count;
	state = buffer[0] - '0';
	return count;
}

static int __init proc_win_init(void)
{
	if ((parent = proc_mkdir("anil", NULL)) == NULL)
	{
		return -1;
	}

	if ((file = create_proc_entry("rel_time", 0666, parent)) == NULL)
	{
		remove_proc_entry("anil", NULL);
		return -1;
	}
	file->read_proc = time_read;
	file->write_proc = time_write;
	if ((link = proc_symlink("rel_time_l", parent, "rel_time")) == NULL)
	{
		remove_proc_entry("rel_time", parent);
		remove_proc_entry("anil", NULL);
		return -1;
	}
	link->uid = 0;
	link->gid = 100;
	return 0;
}

static void __exit proc_win_exit(void)
{
	remove_proc_entry("rel_time_l", parent);
	remove_proc_entry("rel_time", parent);
	remove_proc_entry("anil", NULL);
}

module_init(proc_win_init);
module_exit(proc_win_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Kernel window /proc Demonstration Driver");

And then Pugs did the following:
- Built the driver file (proc_window.ko) using the usual driver's Makefile
- Loaded the driver using insmod
- Showed various experiments using the newly created proc windows (refer to Figure 1)
- And finally, unloaded the driver using rmmod

Figure 1: Peeping through /proc

Demystifying the details


Starting from the constructor proc_win_init(), three proc entries have been created:
- Directory anil under /proc (i.e., with a NULL parent), with default permissions 0755, using proc_mkdir()
- Regular file rel_time in the above directory, with permissions 0666, using create_proc_entry()
- Soft link rel_time_l to the file rel_time, in the same directory, using proc_symlink()

The corresponding removal of these is done with remove_proc_entry() in the destructor, proc_win_exit(), in reverse chronological order. For every entry created under /proc, a corresponding struct proc_dir_entry is created. For each, many of its fields could be further updated as needed:
- mode - permissions of the file
- uid - user ID of the file
- gid - group ID of the file

Additionally, for a regular file, the following two function pointers, for reading from and writing to the file, could be provided, respectively:

int (*read_proc)(char *page, char **start, off_t off, int count, int *eof, void *data);
int (*write_proc)(struct file *file, const char __user *buffer, unsigned long count, void *data);

write_proc() is very similar to the character driver's file operation write(). The above implementation lets the user write a digit from 0 to 9, and accordingly sets the internal state. read_proc() in the above implementation provides the current state, and the time since the system booted up, in different units based on the current state. These are: jiffies in state 0; milliseconds in state 1; seconds and milliseconds in state 2; hours, minutes and seconds in state 3; and <not implemented> in other states. And to check the computation accuracy, Figure 2 highlights the system uptime in the output of top. read_proc's page parameter is a page-sized buffer, typically to be filled up with count bytes from offset off. But more often than not (because of less content), just the page is filled up, ignoring all other parameters.

Figure 2: Comparison with tops output

All the /proc-related structure definitions and function declarations are available through <linux/proc_fs.h>. The jiffies-related function declarations and macro definitions are in <linux/jiffies.h>. As a special note, the actual jiffies are calculated by subtracting INITIAL_JIFFIES, since on boot-up, jiffies is initialised to INITIAL_JIFFIES instead of zero.

Summing up
"Hey Pugs! Why did you set the folder name to anil? Who is this Anil? You could have used my name, or maybe yours," suggested Shweta. "Ha! That's a surprise. My real name is Anil; it's just that everyone in college knows me as Pugs," smiled Pugs. Watch out for further technical romancing from Pugs a.k.a. Anil.

Device Drivers, Part 17: Module Interactions

This article, which is part of the series on Linux device drivers, demonstrates various interactions with a Linux module. As Shweta and Pugs gear up for their final semester's project on Linux drivers, they're closing in on some final titbits of technical romancing. This mainly includes the various communications with a Linux module (a dynamically loadable and unloadable driver), like accessing its variables, calling its functions, and passing parameters to it.

Global variables and functions


One might wonder what the big deal is about accessing a module's variables and functions from outside it. Just make them global, declare them extern in a header, include the header, and access them, right? In the general application development paradigm, it is this simple, but in kernel development it isn't. Despite recommendations to make everything static by default, there have always been cases where non-static globals may be needed. A simple example could be a driver spanning multiple files, with function(s) from one file needing to be called in the other. Now, to avoid kernel namespace collisions even in such cases, every module is embodied in its own namespace. And we know that two modules with the same name cannot be loaded at the same time. Thus, by default, zero collision is achieved. However, this also implies that, by default, nothing from a module can be made really global throughout the kernel, even if we want it to be. And so, for exactly such scenarios, the <linux/module.h> header defines the following macros:

EXPORT_SYMBOL(sym)
EXPORT_SYMBOL_GPL(sym)
EXPORT_SYMBOL_GPL_FUTURE(sym)

Each of these exports the symbol passed as its parameter, additionally putting it in one of the default, _gpl or _gpl_future sections, respectively. Hence, only one of them needs to be used for a particular symbol, though the symbol could be either a variable name or a function name. Here's the complete code (our_glob_syms.c) to demonstrate this:

#include <linux/module.h>
#include <linux/device.h>

static struct class *cool_cl;
static struct class *get_cool_cl(void)
{
	return cool_cl;
}
EXPORT_SYMBOL(cool_cl);
EXPORT_SYMBOL_GPL(get_cool_cl);

static int __init glob_sym_init(void)
{
	if (IS_ERR(cool_cl = class_create(THIS_MODULE, "cool")))
	/* Creates /sys/class/cool/ */
	{
		return PTR_ERR(cool_cl);
	}
	return 0;
}

static void __exit glob_sym_exit(void)
{
	/* Removes /sys/class/cool/ */
	class_destroy(cool_cl);
}

module_init(glob_sym_init);
module_exit(glob_sym_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs.com>");
MODULE_DESCRIPTION("Global Symbols exporting Driver");

Each exported symbol also has a corresponding structure placed into (each of) the kernel symbol table (__ksymtab), kernel string table (__kstrtab), and kernel CRC table (__kcrctab) sections, marking it to be globally accessible. Figure 1 shows a filtered snippet of the /proc/kallsyms kernel window, before and after loading the module our_glob_syms.ko, which has been compiled using the usual driver Makefile.

Figure 1: Our global symbols module

The following code shows the supporting header file (our_glob_syms.h), to be included by modules using the exported symbols cool_cl and get_cool_cl:

#ifndef OUR_GLOB_SYMS_H
#define OUR_GLOB_SYMS_H

#ifdef __KERNEL__
#include <linux/device.h>

extern struct class *cool_cl;
extern struct class *get_cool_cl(void);
#endif

#endif

Figure 1 also shows the file Module.symvers, generated by compiling the module our_glob_syms. This contains the various details of all the symbols exported in its directory. Apart from including the above header file, modules using the exported symbols should possibly have this file Module.symvers in their build directory. Note that the <linux/device.h> header in the above examples is included for the various class-related declarations and definitions, which have already been covered in the earlier discussion on character drivers.

Module parameters
Being aware of passing command-line arguments to an application, it would be natural to ask if something similar can be done with a module. And the answer is: yes, it can. Parameters can be passed to a module while loading it, for instance, when using insmod. Interestingly enough, and in contrast with the command-line arguments to an application, these can be modified even later, through sysfs interactions. The module parameters are set up using the following macro (defined in <linux/moduleparam.h>, included through <linux/module.h>):

module_param(name, type, perm)

Here, name is the parameter name, type is the type of the parameter, and perm refers to the permissions of the sysfs file corresponding to this parameter. The supported type values are: byte, short, ushort, int, uint, long, ulong, charp (character pointer), bool or invbool (inverted Boolean). The following module code (module_param.c) demonstrates a module parameter:

#include <linux/module.h>
#include <linux/kernel.h>

static int cfg_value = 3;
module_param(cfg_value, int, 0764);

static int __init mod_par_init(void)
{
	printk(KERN_INFO "Loaded with %d\n", cfg_value);
	return 0;
}

static void __exit mod_par_exit(void)
{
	printk(KERN_INFO "Unloaded cfg value: %d\n", cfg_value);
}

module_init(mod_par_init);
module_exit(mod_par_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
MODULE_DESCRIPTION("Module Parameter demonstration Driver");

Figure 2: Experiments with the module parameter

Figure 3: Experiments with the module parameter (as root)

Note that before the parameter setup, a variable of the same name and compatible type needs to be defined. Subsequently, the following steps and experiments are shown in Figures 2 and 3:
- Building the driver (module_param.ko file) using the usual driver Makefile
- Loading the driver using insmod (with and without parameters)
- Various experiments through the corresponding /sys entries
- And finally, unloading the driver using rmmod

Note the following:
- The initial value (3) of cfg_value becomes its default value when insmod is done without any parameters.
- Permission 0764 gives rwx to the user, rw- to the group, and r-- to the others, on the file cfg_value under the parameters of module_param under /sys/module/.

Check for yourself:
- The output of dmesg/tail on every insmod and rmmod, for the printk outputs.
- Try writing into the /sys/module/module_param/parameters/cfg_value file as a normal (non-root) user.

Summing up
With this, the duo have a fairly good understanding of Linux drivers, and are all set to start working on their final semester project. Any guesses what their project is about? Hint: They have picked up one of the most daunting Linux driver topics. Let us see how they fare with it next month.

Steps to make Raspberry Pi Supercomputer


Prof Simon Cox Computational Engineering and Design Research Group Faculty of Engineering and the Environment University of Southampton, SO17 1BJ, UK. V0.2: 8th September 2012

First steps to get machine up


1. Get the image from http://www.raspberrypi.org/downloads
   I used: 2012-08-16-wheezy-raspbian.zip
2. Use Win32 Disk Imager to put the image onto an SD card (or on a Mac e.g. Disk Utility/dd): http://www.softpedia.com/get/CD-DVD-Tools/Data-CD-DVD-Burning/Win32-Disk-Imager.shtml
   You will use the Write option to put the image from the disk to your card.
3. Boot the Pi.
4. Expand the image to fill the card, using the option on screen when you first boot. If you don't do this on first boot, then you need to use
   $ sudo raspi-config
   http://elinux.org/RPi_raspi-config
5. Log in and change the password: http://www.simonthepiman.com/beginners_guide_change_my_default_password.php
   $ passwd

6. Log out and check that you typed it all OK (!)
   $ exit
7. Log back in again with your new password.
8. Refresh your list of packages in your cache:
   $ sudo apt-get update

Building MPI so we can run code on multiple nodes

9. Just doing this out of habit, but note we are not doing any more than just getting the list (upgrade is via sudo apt-get upgrade).
10. Get Fortran; after all, what is scientific programming without Fortran being a possibility?
    $ sudo apt-get install gfortran
11. Read about MPI on the Pi. This is an excellent post to read, just to show you are going to make it by the end, but don't type or get anything just yet; we are going to build everything ourselves: http://westcoastlabs.blogspot.co.uk/2012/06/parallel-processing-on-pi-bramble.html
    Note there are a few things to note here:
    a) Since we put Fortran in, we are good to go without excluding anything.
    b) The packages here are for armel, and we need armhf in this case, so we are going to build MPI ourselves.
12. Read a bit more before you begin: http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-1.4.1-installguide.pdf
    We are going to follow the steps from section 2.2 in this guide.
13. Make a directory to put the sources in:
    $ mkdir /home/pi/mpich2
    $ cd ~/mpich2
14. Get the MPI sources from Argonne:


$ wget http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/1.4.1p1/mpich2-1.4.1p1.tar.gz

15. Unpack them:
    $ tar xfz mpich2-1.4.1p1.tar.gz
16. Make yourself a place to put the compiled stuff; this will also make it easier to figure out what you have put in new on your system. Also, you may end up building this a few times.
    $ sudo mkdir /home/rpimpi/
    $ sudo mkdir /home/rpimpi/mpich2-install
    [I just chose rpimpi to replace the "you" in the Argonne guide, and I did the directory creation in two steps]
17. Make a build directory (so we keep the source directory clean of build things):
    $ mkdir /home/pi/mpich_build
18. Change to the BUILD directory:
    $ cd /home/pi/mpich_build
19. Now we are going to configure the build:
    $ sudo /home/pi/mpich2/mpich2-1.4.1p1/configure --prefix=/home/rpimpi/mpich2-install
    Make a cup of tea.

20. Make the files:
    $ sudo make
    Make another cup of tea.
21. Install the files:
    $ sudo make install
    Make another cup of tea; it will finish.
22. Add the place that you put the install to your path:
    $ export PATH=$PATH:/home/rpimpi/mpich2-install/bin
    Note: to permanently put this on the path, you will need to edit .profile:
    $ nano ~/.profile
    and add at the bottom these two lines:
    # Add MPI to path
    PATH="$PATH:/home/rpimpi/mpich2-install/bin"
23. Check whether things did install or not:
    $ which mpicc
    $ which mpiexec
24. Change directory back to home and create somewhere to do your tests:
    $ cd ~
    $ mkdir mpi_testing
    $ cd mpi_testing
25. Now we can test whether MPI works for you on a single node:
    mpiexec -f machinefile -n <number> hostname
    where machinefile contains a list of IP addresses (in this case just one) for the machines.
26. a) Get your IP address:
       $ ifconfig
    b) Put this into a single file called machinefile:
       $ nano machinefile
       Add this line:
       192.168.1.161 [or whatever your IP address was]
27. If you use
    $ mpiexec -f machinefile -n 1 hostname
    the output is:
    raspberrypi
28. Now to run a little C code. In the examples subdirectory of where you built MPI is the famous CPI example. You will now use MPI on your Pi to calculate Pi:
    $ cd /home/pi/mpi_testing
    $ mpiexec -f machinefile -n 2 ~/mpich_build/examples/cpi

Output is:
Process 0 of 2 is on raspberrypi
Process 1 of 2 is on raspberrypi
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
Celebrate if you get this far.

Flash me once
29. We now have a master copy of the main node of the machine, with all of the installed files for MPI in a single place. We now want to clone this card.
30. Shutdown your Pi very carefully:
    $ sudo poweroff
    Remove the SD card and pop it back into your SD card writer on your PC/other device. Use Win32 Disk Imager (or on a Mac e.g. Disk Utility/dd) to put the image FROM your SD card back TO your PC: http://www.softpedia.com/get/CD-DVD-Tools/Data-CD-DVD-Burning/Win32-Disk-Imager.shtml
    You will use the Read option to put the image from the card back to your PC. Let us call the image 2012-08-16-wheezy-raspbian_backup_mpi_master.img
31. Eject the card and put a fresh card into your PC/other device. Use Win32 Disk Imager to put the image onto the SD card (or on a Mac e.g. Disk Utility/dd): http://www.softpedia.com/get/CD-DVD-Tools/Data-CD-DVD-Burning/Win32-Disk-Imager.shtml
    You will use the Write option to put the image from the disk to your card, and choose the 2012-08-16-wheezy-raspbian_backup_mpi_master.img image you just created.
    [Note that there are probably more efficient ways of doing this; in particular, maybe avoid expanding the filesystem in step 4 of the first section.]
32. Put the card into your second Pi and boot this. You should now have two Raspberry Pis on. Unless otherwise stated, all the commands below are typed from the Master Pi that you built first.

Using SSH instead of password login between the Pis


33. Sort out RSA to allow quick log in. This is the best thing to read: http://steve.dynedge.co.uk/2012/05/30/logging-into-a-rasberry-pi-using-publicprivate-keys/
    In summary (working on the MASTER Pi node):
    $ cd ~
    $ ssh-keygen -t rsa -C raspberrypi@raspberrypi
    This sets a default location of /home/pi/.ssh/id_rsa to store the key.
    Enter a passphrase, e.g. myfirstpicluster. If you leave this blank (not such good security), then no further typing of passphrases is needed.
    $ cat ~/.ssh/id_rsa.pub | ssh pi@192.168.1.162 "cat >> .ssh/authorized_keys"
34. If you now log into your other Pi and do
    $ ls -al ~/.ssh
    you should see a file called authorized_keys; this is your ticket to no-login heaven on the nodes.

35. Now let us add the new Pi to the machinefile. (Log into it and get its IP address, as above.) Working on the Master Raspberry Pi (the first one you built):
    $ nano machinefile
    Make it read:
    192.168.1.161
    192.168.1.162
    [or whatever the two IP addresses you have for the machines are]
36. Now to run a little C code again. In the examples subdirectory of where you built MPI is the famous CPI example. The first time, you will need to enter the passphrase for the key you generated above (unless you left it blank), and also the password for the second Pi.
    $ cd /home/pi/mpi_testing
    $ mpiexec -f machinefile -n 2 ~/mpich_build/examples/cpi
    Output is:
    Process 0 of 2 is on raspberrypi
    Process 1 of 2 is on raspberrypi
    pi is approximately 3.1415926544231318, Error is 0.0000000008333387
    If you repeat this a second time, you won't need to type any passwords in. Hurray. Note that we have NOT changed the hostnames yet (so yes, the above IS running on the two machines, but they both have the same hostname at the moment).
37. If you change the hostname on your second machine (see Appendix 1, Hostname Script) and run:
    $ mpiexec -f machinefile -n 2 ~/mpich_build/examples/cpi
    Output:
    Process 0 of 2 is on raspberrypi
    Process 1 of 2 is on iridispi002
    Now you can see each process running on the separate nodes.

CONGRATULATIONS - YOU HAVE NOW FINISHED BUILDING A 2-NODE SUPERCOMPUTER. IF YOU FOLLOW THE STEPS BELOW, YOU CAN EXPAND THIS TO 64 (OR MORE) NODES.

Acknowledgements
Thanks to all of the authors of the posts linked to in this guide. Thanks to the team in the lab: Richard Boardman, Steven Johnston, Gereon Kaiping, Neil O'Brien, and Mark Scott. Also to Oz Parchment and Andy Everett (iSolutions). Thanks to Pavittar Bassi in Finance, who made all the orders for equipment happen so efficiently. And, of course, Professor Cox's son James, who provided specialist support on Lego and system testing.

Appendix 1 Scripts and other things to do


Flash me one more time (rinse and repeat for each additional node)
1. We now have a copy of the WORKER nodes of the machine, with all of the installed files for MPI in a single place, and the SSH key on it in the right place. We now want to clone this card. Power off the worker Pi very carefully and eject the card:
   $ sudo poweroff
2. Remove the SD card and pop it back into your SD card writer on your PC/other device. Use Win32 Disk Imager (or on a Mac e.g. Disk Utility/dd) to put the image FROM your SD card back to your PC: http://www.softpedia.com/get/CD-DVD-Tools/Data-CD-DVD-Burning/Win32-Disk-Imager.shtml
   You will use the Read option to put the image from the card back to your PC. Let us call the image 2012-08-16-wheezy-raspbian_backup_mpi_worker.img
3. Eject the card and put a fresh card into the machine. Use Win32 Disk Imager to put the image onto the SD card (or on a Mac e.g. Disk Utility/dd): http://www.softpedia.com/get/CD-DVD-Tools/Data-CD-DVD-Burning/Win32-Disk-Imager.shtml
   You will use the Write option to put the image from the disk to your card, and choose the 2012-08-16-wheezy-raspbian_backup_mpi_worker.img image you just created.
   [Note that there are probably more efficient ways of doing this; in particular, maybe avoid expanding the filesystem in step 4 of the first section.]

Hostname Script
If you want to rename each machine, you can do it from the Master node, using:
ssh pi@192.168.1.162 'sudo echo "iridispi002" | sudo tee /etc/hostname'
ssh pi@192.168.1.163 'sudo echo "iridispi003" | sudo tee /etc/hostname'
ssh pi@192.168.1.164 'sudo echo "iridispi004" | sudo tee /etc/hostname'
etc.
You should then reboot each worker node. If you re-run step 36 above again, you will get:
$ mpiexec -f machinefile -n 2 ~/mpich_build/examples/cpi
Output:
Process 0 of 2 is on raspberrypi
Process 1 of 2 is on iridispi002
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
This shows the master node still called raspberrypi, and the first worker called iridispi002.

Using Python
There are various Python bindings for MPI. This guide just aims to show how to get ONE of them working.
1. Let us use mpi4py. More info at:
   http://mpi4py.scipy.org/
   http://mpi4py.scipy.org/docs/usrman/index.html
   $ sudo apt-get install python-mpi4py
2. We also want to run the demo, so let us get the source too:
   $ cd ~
   $ mkdir mpi4py
   $ cd mpi4py
   $ wget http://mpi4py.googlecode.com/files/mpi4py-1.3.tar.gz
   $ tar xfz mpi4py-1.3.tar.gz
   $ cd mpi4py-1.3/demo
3. Repeat steps 1 and 2 on each of your other nodes (we did not bake this into the system image).

4. Run an example (on your master node):

$ mpirun.openmpi -np 2 -machinefile /home/pi/mpi_testing/machinefile python helloworld.py

Output is: Hello, World! I am process 0 of 2 on raspberrypi. Hello, World! I am process 1 of 2 on iridispi002.
5. $ mpiexec.openmpi -n 4 -machinefile /home/pi/mpi_testing/machinefile python helloworld.py

Output is:
Hello, World! I am process 2 of 4 on raspberrypi.
Hello, World! I am process 3 of 4 on iridispi002.
Hello, World! I am process 1 of 4 on iridispi002.
Hello, World! I am process 0 of 4 on raspberrypi.
6. These are handy to remove things, if your attempts to get mpi4py don't quite pan out:

$ sudo apt-get remove python-mpi4py
$ sudo apt-get autoremove

Keygen script commands


cat ~/.ssh/id_rsa.pub | ssh pi@192.168.1.161 "cat >> .ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh pi@192.168.1.162 "cat >> .ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh pi@192.168.1.163 "cat >> .ssh/authorized_keys"
etc.
These are for sending out the key exchanges, if you want to do this again having generated a new key.

Getting Pip for Raspberry Pi


1. We can install Pip, which gives us a nice way to set up Python packages (and uninstall them too). More info is at:
   http://www.pip-installer.org/en/latest/index.html
   http://www.pip-installer.org/en/latest/installing.html
   $ cd ~
   $ mkdir pip_testing
   $ cd pip_testing
2. A prerequisite for pip is distribute, so let's get that first and then install pip. The sudo is because the installation of these has to run as root.
   $ curl http://python-distribute.org/distribute_setup.py | sudo python
   $ curl https://raw.github.com/pypa/pip/master/contrib/get-pip.py | sudo python

Notes on making MPI Shared Libraries for Raspberry Pi


MPI libraries can also be built shared, so that they can be dynamically loaded. This gives library files that end in .so etc., not .a, and we can do that by building those MPI libraries again. This is a repeat of the steps above, but written out again using the suffix _shared on the directory names.
1. Make a directory to put the sources in:
   $ mkdir /home/pi/mpich2_shared
2. $ cd ~/mpich2_shared
3. Get the MPI sources from Argonne:
$ wget http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/1.4.1p1/mpich2-1.4.1p1.tar.gz

Unpack them:
$ tar xfz mpich2-1.4.1p1.tar.gz
4. Make yourself a place to put the compiled stuff; this will also make it easier to figure out what you have put in new on your system.
   $ sudo mkdir /home/rpimpi_shared/
   $ sudo mkdir /home/rpimpi_shared/mpich2-install_shared
   [I just chose rpimpi_shared to replace the "you" in the Argonne guide, and I made the directory creation in two steps]
5. Make a build directory (so we keep the source directory clean of build things):
   $ mkdir /home/pi/mpich_build_shared
6. Change to the BUILD directory:
   $ cd /home/pi/mpich_build_shared
7. Now we are going to configure the build:
$ sudo /home/pi/mpich2_shared/mpich2-1.4.1p1/configure -prefix=/home/rpimpi_shared/mpich2-install_shared --enable-shared

8. Make the files:

$ sudo make

9. Install the files:

$ sudo make install

10. Finally, add the install location to your path:

$ export PATH=$PATH:/home/rpimpi_shared/mpich2-install_shared/bin

Note: to put this on the path permanently, edit .profile:

$ nano ~/.profile

and add these two lines at the bottom:

# Add MPI Shared to path
PATH="$PATH:/home/rpimpi_shared/mpich2-install_shared/bin"
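As a quick sanity check that the PATH edit took effect, something like the following can be used (the directory name comes from the steps above; adjust it to your install location):

```shell
# Append the MPI install's bin directory (from the steps above) to PATH
# and confirm it is really there before trying to run mpicc/mpiexec.
MPI_BIN="/home/rpimpi_shared/mpich2-install_shared/bin"
PATH="$PATH:$MPI_BIN"
case ":$PATH:" in
    *":$MPI_BIN:"*) echo "MPI bin dir on PATH" ;;
    *)              echo "MPI bin dir missing" ;;
esac
```

The case pattern checks for the directory as a complete PATH entry, not just a substring, which avoids false matches against similarly named directories.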

The Quick Guide to QEMU Setup

In this series of articles, we will explore the basics of QEMU: OS installation, QEMU networking, and embedded system development for the ARM architecture. In this first part, let's begin with the basics. When I first began using computers, I was amazed: it required just a mouse-click to play songs, movies or games to entertain every age. It was like magic to me! Over time, I learnt about compiled programs and source code. My curiosity very soon made source code my passion. Even though compiled software packages are easily available, I love compiling from source, and that is just what I do for QEMU. QEMU is one of the best emulators out there, yet very few people use its full capabilities. Though we deal with the basics in this article, look forward to some interesting stuff later in the series!

Building QEMU from source


The first thing is to download the QEMU source code; the current version as of this writing is 0.14, and you'll find it here. Extract the tarball and go to the extracted directory:

$ tar -zxvf qemu-0.14.0.tar.gz
$ cd qemu-0.14.0

Run the configuration script. We will build QEMU for i386. (It can be built for other architectures too, like ARM, PPC, SPARC, etc.) Let's install the Ubuntu distro in the virtual machine; that's the reason we've chosen to build QEMU for the i386 architecture:

$ ./configure --target-list=i386-softmmu

Hopefully, you will not run into any trouble during the configure script run. If there's any issue, it will probably be some missing library or header files, which you can look for and install. Once you are done with the configure script, compile the source code with the make command. After compilation, QEMU binaries should be installed in their proper locations. On my Fedora system, I used the su command to get the necessary root privileges and installed the binaries using make install. To confirm that QEMU has been successfully installed, run qemu; a pop-up window like the one in Figure 1 will confirm the successful installation of QEMU.

Figure 1: Testing QEMU after installation

Creating a new virtual machine


If you are familiar with other virtualisation software, you might wonder how to go about creating a new VM. The first step is to create the hard disk image for the VM. Let's install the Ubuntu 9.10 OS in the VM; a disk image of 10 GB is sufficient for this purpose. To create it, run the following commands:

$ qemu-img create ubuntu.img 10G
$ ls -lh ubuntu.img
-rw-r--r--. 1 root root 10G Mar 11 11:54 ubuntu.img

The next step is to install Ubuntu (I already have a downloaded Ubuntu 9.10 (Karmic) ISO image in my current working directory):

$ qemu -hda ubuntu.img -boot d -cdrom ./ubuntu-9.10-desktop-i386.iso -m 512

In the above command, the -hda option specifies the disk image file, and -cdrom is the CD-ROM or ISO image to use as the optical drive for the VM. The -m option specifies how much RAM the virtual machine is given (in this case, I have allocated 512 MB; your choice should be based on your needs and hardware). Finally, we instruct QEMU to boot the VM from the ISO image by using the -boot d option. Once you run the above command, the VM will boot up and present the Ubuntu boot menu (see Figure 2).

Figure 2: Installing Ubuntu in QEMU

Follow the same installation steps you would use on a real machine. Once installed, you can boot the VM from the disk image with the following command:

$ qemu -m 512 -hda ubuntu.img

Figure 3 shows the VM running after booting from the virtual hard disk.

Figure 3: Booting the installed operating system

The next thing we need to do is set up networking.

QEMU networking
Setting up networking on QEMU is tricky. Let's make use of the virtual network kernel devices TAP and TUN, which are different from hardware Ethernet devices; TAP and TUN are supported only in the kernel (i.e., only in software). TAP operates at the data-link layer, and TUN at the network layer. QEMU can use the TAP interface to provide full networking support to the guest operating system. Before this, we need to install the VPN (Virtual Private Network) package on the host machine, and set up a bridge between the host and guest OS. Install the openvpn and bridge-utils packages:

# yum install openvpn
# yum install bridge-utils

Now, we will create two scripts for QEMU, qemu-ifup and qemu-ifdown, as given below:

#qemu-ifup
1 /sbin/ifconfig eth1 down
2 /sbin/ifconfig eth1 0.0.0.0 promisc up
3 openvpn --mktun --dev tap0
4 ifconfig tap0 0.0.0.0 up
5 brctl addbr br0
6 brctl addif br0 eth1
7 brctl addif br0 tap0
8 brctl stp br0 off
9 ifconfig br0 10.10.10.2 netmask 255.255.255.0

This script will be used to start QEMU networking. In the first line, the Ethernet device is disabled. For the interface to be part of a network bridge, it must have an IP address of 0.0.0.0, which is what we set in the second line. In lines 3 and 4, we create and bring up the TAP device/interface tap0. In the next few steps, a bridge is created with eth1 and tap0 as its members. Finally, we assign an IP address to the bridge. Following is what the qemu-ifdown script looks like:

#qemu-ifdown
1 ifconfig eth1 down
2 ifconfig eth1 -promisc
3 ifup eth1
4 ifconfig br0 down
5 brctl delbr br0
6 openvpn --rmtun --dev tap0

This script will be used to shut down QEMU networking; it is almost self-explanatory, shutting down both interfaces, deleting the bridge, and removing the tap0 device.
Copy these two files to your /etc directory, and test them:

# /etc/qemu-ifup
Wed Apr 6 15:53:50 2011 TUN/TAP device tap0 opened
Wed Apr 6 15:53:50 2011 Persist state set to: ON

# ifconfig br0
br0    Link encap:Ethernet  HWaddr 00:25:11:74:5B:0C
       inet addr:10.10.10.2  Bcast:10.10.10.255  Mask:255.255.255.0
       inet6 addr: fe80::225:11ff:fe74:5b0c/64 Scope:Link
       UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
       RX packets:0 errors:0 dropped:0 overruns:0 frame:0

       TX packets:29 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:0
       RX bytes:0 (0.0 b)  TX bytes:7539 (7.3 KiB)

# ifconfig tap0
tap0   Link encap:Ethernet  HWaddr C2:10:27:8C:B8:35
       UP BROADCAST MULTICAST  MTU:1500  Metric:1
       RX packets:0 errors:0 dropped:0 overruns:0 frame:0
       TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
       collisions:0 txqueuelen:100
       RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

The virtual device tap0 and bridge br0 are up, so our script is working fine. Now bring networking down, using the qemu-ifdown script:

# /etc/qemu-ifdown
Error: Connection activation failed: Device not managed by NetworkManager
Wed Apr 6 15:56:44 2011 TUN/TAP device tap0 opened
Wed Apr 6 15:56:44 2011 Persist state set to: OFF

Figure 4: QEMU networking using kernel virtual device

Everything is set up correctly; now it's time to boot the Ubuntu VM with full networking support. Start the networking (as root), and boot the VM (as an ordinary user):

# /etc/qemu-ifup
$ qemu -m 512 -hda ubuntu.img -net nic -net tap,ifname=tap0,script=no

When the machine boots up, assign an IP address to the eth0 interface inside the VM:

$ sudo ifconfig eth0 10.10.10.100 netmask 255.255.255.0

Now try to ping the bridge IP (results are shown in Figure 4):

$ ping 10.10.10.2

Using QEMU for Embedded Systems Development, Part 1

We covered the basic use of QEMU. Now let's dig deeper into its abilities, looking at the embedded domain. Techies who work in the embedded domain will be familiar with the ARM (Advanced RISC Machine) architecture. In the modern era, our lives have been taken over by mobile devices like phones, PDAs, MP3 players and GPS devices that use this architecture. ARM has cemented its place in the embedded devices market because of its low cost, lower power requirements, lower heat dissipation and good performance. Purchasing ARM development hardware can be an expensive proposition. Thankfully, the QEMU developers have added the ability to emulate the ARM processor to QEMU. You can use QEMU for two purposes in this arena: to run an ARM program, and to boot and run the ARM kernel. In the first case, you can run and test ARM programs without installing an ARM OS or its kernel. This feature is very helpful and time-saving. In the second case, you can try to boot the Linux kernel for ARM, and test it.

Compiling QEMU for ARM


In the last article, we compiled QEMU for x86. This time let's compile it for ARM. Download the QEMU source, if you don't have it already. Extract the tarball, change to the extracted directory, and configure and build it as follows:

$ tar -zxvf qemu-0.14.0.tar.gz
$ cd qemu-0.14.0
$ ./configure --target-list=arm-softmmu
$ make
$ su
# make install

You will find two output binaries, qemu-arm and qemu-system-arm, in the source code directory. The first is used to execute ARM binary files, and the second to boot the ARM OS.

Obtaining an ARM tool-chain


Let's develop a small test program. Just as you need the x86 tool-chain to develop programs for Intel, you need the ARM tool-chain for ARM program development. You can download it from here. Extract the archive's contents, and view a list of the available binaries:

$ tar -jxvf arm-2010.09-50-arm-none-linux-gnueabi-i686-pc-linux-gnu.tar.bz2
$ cd arm-2010.09/bin/
$ ls -l
-rwxr-xr-x 1 root root  569820 Nov 7 22:23 arm-none-linux-gnueabi-addr2line
-rwxr-xr-x 2 root root  593236 Nov 7 22:23 arm-none-linux-gnueabi-ar
-rwxr-xr-x 2 root root 1046336 Nov 7 22:23 arm-none-linux-gnueabi-as
-rwxr-xr-x 2 root root  225860 Nov 7 22:23 arm-none-linux-gnueabi-c++
-rwxr-xr-x 1 root root  572028 Nov 7 22:23 arm-none-linux-gnueabi-c++filt
-rwxr-xr-x 1 root root  224196 Nov 7 22:23 arm-none-linux-gnueabi-cpp
-rwxr-xr-x 1 root root   18612 Nov 7 22:23 arm-none-linux-gnueabi-elfedit
-rwxr-xr-x 2 root root  225860 Nov 7 22:23 arm-none-linux-gnueabi-g++
-rwxr-xr-x 2 root root  222948 Nov 7 22:23 arm-none-linux-gnueabi-gcc

Cross-compiling and running the test program for ARM


Now use the arm-none-linux-gnueabi-gcc tool to compile a test C program. Before proceeding, you should add the ARM tool-chain to your PATH:

# PATH=/(Your-path)/arm-2010.09/bin:$PATH

Create a small test program, test.c, with the basic "Hello world":

#include <stdio.h>

int main()
{
    printf("Welcome to Open World\n");
    return 0;
}

Use the ARM compiler to compile this program:

# arm-none-linux-gnueabi-gcc test.c -o test

Once the file is compiled successfully, check the properties of the output file, which show that the executable is built for ARM:

# file test
test: ELF 32-bit LSB executable, ARM, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.16, not stripped

Run the test program:

# qemu-arm -L /your-path/arm-2010.09/arm-none-linux-gnueabi/libc ./test
Welcome to Open World

While executing the program, you must link it against the ARM library; the -L option is used for this purpose.

Building the Linux kernel for ARM


So, you are now done with the ARM tool-chain and qemu-arm. The next step is to build the Linux kernel for ARM. The mainstream Linux kernel already contains supporting files and code for ARM; you need not patch it, as you had to some years ago. Download the latest version of Linux from kernel.org (v2.6.37 as of this writing), extract the tarball, enter the extracted directory, and configure the kernel for ARM:

# tar -jxvf linux-2.6.37.tar.bz2
# cd linux-2.6.37

# make menuconfig ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi-

Here, we specify the architecture as ARM, and invoke the ARM tool-chain to build the kernel. In the configuration window, navigate to Kernel Features, and enable "Use the ARM EABI to compile the kernel". (EABI is Embedded Application Binary Interface.) Without this option, the kernel won't be able to load your test program.

Modified kernel for u-boot


In subsequent articles, we will be doing lots of testing on U-Boot, and for that, we need a modified kernel. The kernel zImage files are not compatible with U-Boot, so let's use uImage instead, which is a kernel image with a header added to support U-Boot. Compile the kernel, electing to build a uImage for U-Boot. Once again, specify the architecture and use the ARM tool-chain:

# make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- uImage -s
Generating include/generated/mach-types.h
arch/arm/mm/alignment.c: In function 'do_alignment':
arch/arm/mm/alignment.c:720:21: warning: 'offset.un' may be used uninitialized in this function
.
.
Kernel: arch/arm/boot/Image is ready
Kernel: arch/arm/boot/zImage is ready
Image Name:   Linux-2.6.37
Created:      Thu May 5 16:59:28 2011
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    1575492 Bytes = 1538.57 kB = 1.50 MB
Load Address: 00008000
Entry Point:  00008000
Image arch/arm/boot/uImage is ready

After the compilation step, the uImage is ready. Check the file's properties:

# file arch/arm/boot/uImage
uImage: u-boot legacy uImage, Linux-2.6.37, Linux/ARM, OS Kernel Image (Not compressed), 1575492 bytes, Thu May 5 17:11:30 2011, Load Address: 0x00008000, Entry Point: 0x00008000, Header CRC: 0xFC6898D9, Data CRC: 0x5D0E1B70

Now test this image on QEMU; the result is shown in Figure 1:

# qemu-system-arm -M versatilepb -m 128M -kernel /home/manoj/Downloads/linux-2.6.37/ar
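The file check above works because a U-Boot legacy uImage begins with the magic number 0x27051956. A quick way to see this without the file utility, sketched here against a stand-in file containing only the four magic bytes (not a real image):

```shell
# Write just the 4-byte uImage magic (0x27051956, here as octal escapes)
# into a stand-in file; on a real build you would point the dump at
# arch/arm/boot/uImage instead.
printf '\047\005\031\126' > fake-uImage
# Dump the first four bytes as hex; a real uImage prints 27051956 too.
od -An -tx1 -N4 fake-uImage | tr -d ' \n'
```

This is only a header sanity check; file additionally decodes the name, size and checksum fields that mkimage wrote.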

Figure 1: ARM kernel inside QEMU

The kernel will crash at the point where it searches for a root filesystem, which we didn't specify in the above command. The next task is to develop a dummy filesystem for testing. It's very simple: write a small test C program, hello.c, and use it to build a small dummy filesystem:

#include <stdio.h>

int main()
{
    while (1) {
        printf("Hello Open World\n");
        getchar();
    }
}

The endless loop (while(1)) will print the message each time the user presses a key. Compile this program for ARM, but compile it statically; as we are trying to create a small dummy filesystem, we will not use any shared library in it. In GCC, the -static option does this for you:

# arm-none-linux-gnueabi-gcc hello.c -static -o hello

Use the output file to create a root filesystem. The cpio command is used for this purpose. Execute the following:

# echo hello | cpio -o --format=newc > rootfs
1269 blocks

Check the output file:

# file rootfs
rootfs: ASCII cpio archive (SVR4 with no CRC)

You now have a dummy filesystem ready. Test it with this command:

# qemu-system-arm -M versatilepb -m 128M -kernel /home/manoj/Downloads/linux-2.6.37/arch/arm/boot/uImage -initrd rootfs -append "root=/dev/ram rdinit=/hello"

Figure 2: ARM kernel with a dummy filesystem

When the kernel boots, it mounts rootfs as its filesystem, and starts the hello program as init. So now you are able to run ARM programs and boot the ARM kernel inside QEMU. The next step is to use U-Boot on QEMU. An array of testing is ahead of us, which we will cover in a forthcoming article.

Using QEMU for Embedded Systems Development, Part 2

In the previous articles, we learnt how to use QEMU for a generic Linux OS installation, for networking using OpenVPN and TAP/TUN, for cross-compilation of the Linux kernel for ARM, to boot the kernel from QEMU, and how to build a small filesystem and then mount it on the vanilla kernel. Now we will step out further.

First of all, I would like to explain the need for a bootloader. The bootloader is code that is used to load the kernel into RAM, and then specify which partition will be mounted as the root filesystem. The bootloader resides in the MBR (Master Boot Record). In general-purpose computing machines, an important component is the BIOS (Basic Input Output System). The BIOS contains the low-level drivers for devices like the keyboard, mouse, display, etc. It initiates the bootloader, which then loads the kernel. Linux users are very familiar with bootloaders like GRUB (Grand Unified Bootloader) and LILO (Linux Loader).

Micro-controller programmers are familiar with the term "bare-metal programming". It means that there is nothing between your program and the processor: the code you write runs directly on the processor. It becomes the programmer's responsibility to check every possible condition that could corrupt the system.

Now, let us build a small program for the ARM Versatile Platform Baseboard, which will run on the QEMU emulator, and then print a message on the serial console. Download the tool-chain for ARM EABI from here. As described in the previous article, add this tool-chain to your PATH. By default, QEMU redirects the serial console output to the terminal, when it is initialised with the -nographic option:

$ qemu-system-arm --help | grep nographic
-nographic      disable graphical output and redirect serial I/Os to console.
                When using -nographic, press 'ctrl-a h' to get some help.

We can make good use of this feature; let's write some data to the serial port, and it can be a good working example.
Before going further, we must check which processors the GNU EABI tool-chain supports, and which processors QEMU can emulate: we need a processor supported by both the tool-chain and the emulator. Let's check QEMU first. In the earlier articles, we compiled the QEMU source code, so use that source code to get the list of supported ARM processors:

$ cd (your-path)/qemu/qemu-0.14.0/hw
$ grep "arm" versatilepb.c
#include "arm-misc.h"
static struct arm_boot_info versatile_binfo;
    cpu_model = "arm926";

It's very clear that the arm926 is supported by QEMU. Let's check its availability in the GNU ARM

tool-chain:

$ cd (your-path)/CodeSourcery/Sourcery_G++_Lite/share/doc/arm-arm-none-eabi/info
$ cat gcc.info | grep arm | head -n 20
.
.
`strongarm1110', `arm8', `arm810', `arm9', `arm9e',
`arm920', `arm920t', `arm922t', `arm946e-s', `arm966e-s',
`arm968e-s', `arm926ej-s', `arm940t', `arm9tdmi', `arm10tdmi',
`arm1020t', `arm1026ej-s', `arm10e', `arm1020e', `arm1022e',
`arm1136j-s',

Great! The ARM926EJ-S processor is supported by the GNU ARM tool-chain. Now, let's write some data to the serial port of this processor. As we are not using any header file that describes the address of UART0, we must find it manually, from the file (your-path)/qemu/qemu-0.14.0/hw/versatilepb.c:

/* 0x101f0000 Smart card 0. */
/* 0x101f1000 UART0. */
/* 0x101f2000 UART1. */
/* 0x101f3000 UART2. */

Open source code is so powerful; it gives you each and every detail. UART0 is present at address 0x101f1000. For testing purposes, we can write data directly to this address, and check the output on the terminal. Our first test program is a bare-metal program running directly on the processor, without the help of a bootloader. We have to create three important files. First of all, let us develop a small application program (init.c):

volatile unsigned char * const UART0_PTR = (unsigned char *)0x101f1000;

void display(const char *string)
{
    while (*string != '\0') {
        *UART0_PTR = *string;
        string++;
    }
}

int my_init()
{
    display("Hello Open World\n");
    return 0;
}

Let's run through this code. First, we declared a volatile pointer variable, and assigned it the address of the serial port (UART0). The function my_init() is the main routine; it merely calls the function display(), which writes a string to UART0. Engineers familiar with base-level micro-controller programming will find this very easy. If you are not experienced in embedded systems programming, then you can stick to the basics of digital electronics: the microprocessor is an integrated chip, with input/output lines, different ports, etc.
The ARM926EJ-S has four serial ports (information obtained from its data-sheet), and they have their data lines (the address). When the processor is programmed to write data to one of the serial ports, it writes data to these lines. That's what this program does. The next step is to develop the startup code for the processor. When a processor is powered on, it jumps to a specified location, reads code from that location, and executes it. Even in the case of a reset (like on a desktop machine), the processor jumps to a predefined location. Here's the startup code,

startup.s:

.global _Start
_Start:
    LDR sp, =sp_top
    BL my_init
    B .

In the first line, _Start is declared as global. The next line is the beginning of _Start's code. We set the address of the stack to sp_top. (The LDR instruction moves the value of sp_top into the stack pointer, sp.) The BL instruction makes the processor jump to my_init (defined in init.c). Then the processor steps into an infinite loop with the instruction B ., which is like a while(1) or for(;;) loop. If we don't do this, our system will crash; a basic rule of embedded systems programming is that our code should end in an infinite loop. Now, the final task is to write a linker script for these two files (linker.ld):

ENTRY(_Start)
SECTIONS
{
    . = 0x10000;
    startup : { startup.o(.text) }
    .data : { *(.data) }
    .bss : { *(.bss) }
    . = . + 0x500;
    sp_top = .;
}

The first line tells the linker that the entry point is _Start (defined in startup.s). As this is a basic program, we can ignore the interrupts section. The QEMU emulator, when executed with the -kernel option, starts execution from the address 0x10000, so we must place our code at this address; that's what we have done in line 4. The SECTIONS block defines the different sections of the program. In it, startup.o forms the text (code) part; then come the data and bss parts. The final step is to define the address of the stack pointer. The stack usually grows downward, so it's better to give it a safe address. We have a very small code snippet, and can place the stack 0x500 bytes beyond the current position. The symbol sp_top will hold the address of the stack. We are now done with the coding part. Let's compile and link these files.
Assemble the startup.s file:

$ arm-none-eabi-as -mcpu=arm926ej-s startup.s -o startup.o

Compile init.c:

$ arm-none-eabi-gcc -c -mcpu=arm926ej-s init.c -o init.o

Link the object files into an ELF file:

$ arm-none-eabi-ld -T linker.ld init.o startup.o -o output.elf

Finally, create a binary file from the ELF file:

$ arm-none-eabi-objcopy -O binary output.elf output.bin

The above instructions are easy to understand. All the tools used are part of the ARM tool-chain; check their help/man pages for details. After all these steps, we finally run our program on the QEMU emulator:

$ qemu-system-arm -M versatilepb -nographic -kernel output.bin

The above command has been explained in previous articles (1, 2), so we won't go into the details. The binary file is executed on QEMU, and writes the message "Hello Open World" to UART0 of the ARM926EJ-S, which QEMU redirects as output in the terminal.

Using QEMU for Embedded Systems Development, Part 3

This is the last article in this series on QEMU. In the previous article, we worked on bare-metal programming, and discussed the need for a bootloader. Most GNU/Linux distros use GRUB as their bootloader (earlier, LILO was the choice). In this article, we will test the famous U-Boot (Universal Bootloader). In embedded systems, especially mobile devices, ARM processor-based devices lead the market, and for ARM, U-Boot is the best choice of bootloader. The good thing about it is that we can use it for different architectures, like PPC, MIPS, x86, etc. So let's get started.

Download and compile U-Boot


U-Boot is released under a GPL licence. Download it from this FTP server, which has every version of U-Boot available. For this article, I got version 1.2.0 (u-boot-1.2.0.tar.bz2). Extract the downloaded tarball and enter the source code directory:

# tar -jxvf u-boot-1.2.0.tar.bz2
# cd u-boot-1.2.0

To begin, we must configure U-Boot for a particular board. We will use the same ARM Versatile Platform Baseboard (versatilepb) we used in the previous article, so let's run:

# make versatilepb_config arch=ARM CROSS_COMPILE=arm-none-eabi-
Configuring for versatile board...
Variant:: PB926EJ-S

After configuration is done, compile the source code:

# make all arch=ARM CROSS_COMPILE=arm-none-eabi-
for dir in tools examples post post/cpu ; do make -C $dir _depend ; done
make[1]: Entering directory `/root/qemu/u-boot-1.2.0/tools'
ln -s ../common/environment.c environment.c
.
.
G++_Lite/bin/../lib/gcc/arm-none-eabi/4.4.1 -lgcc \
-Map u-boot.map -o u-boot
arm-none-eabi-objcopy --gap-fill=0xff -O srec u-boot u-boot.srec
arm-none-eabi-objcopy --gap-fill=0xff -O binary u-boot u-boot.bin

Find the size of the compiled U-Boot binary (around 72 KB in my case) with ls -lh u-boot*; we will use it later in this article. I assume that you have set up QEMU, networking and the ARM tool-chain, as explained in previous articles in this series (1, 2, 3). If not, then I suggest you read the last

three articles.

Boot U-Boot in QEMU


Now we can boot the U-Boot binary in QEMU, which is simple. Instead of specifying the Linux kernel as the file to boot in QEMU, use the U-Boot binary:

# qemu-system-arm -M versatilepb -nographic -kernel u-boot.bin

Run some commands in U-Boot, to check that it is working:

Versatile # printenv
bootargs=root=/dev/nfs mem=128M ip=dhcp netdev=25,0,0xf1010000,0xf1010010,eth0
bootdelay=2
baudrate=38400
bootfile="/tftpboot/uImage"
stdin=serial
stdout=serial
stderr=serial
verify=n

Environment size: 184/65532 bytes

Figure 1: U-Boot

The next step is to boot a small program from U-Boot. In the previous article, we wrote a small bare-metal program, so let us use that. We will create a flash binary image that includes u-boot.bin and the bare-metal program in it. The test

program from the last article will be used here again, with some modification. As the u-boot.bin size is around 72 KB, we will move our sample program upward in memory. In the linker script, change the starting address of the program:

ENTRY(_Start)
SECTIONS
{
    . = 0x100000;
    startup : { startup.o(.text) }
    .data : { *(.data) }
    .bss : { *(.bss) }
    . = . + 0x500;
    sp_top = .;
}

Compile the test program as shown below:

# arm-none-eabi-gcc -c -mcpu=arm926ej-s init.c -o init.o
# arm-none-eabi-as -mcpu=arm926ej-s startup.s -o startup.o
# arm-none-eabi-ld -T linker.ld init.o startup.o -o test.elf
# arm-none-eabi-objcopy -O binary test.elf test.bin

Now, our test program's binary file and u-boot.bin must be packed into a single file. Let's use the mkimage tool for this; you can locate it in the U-Boot source-code directory:

# mkimage -A arm -C none -O linux -T kernel -d test.bin -a 0x00100000 -e 0x00100000 test.uimg
Image Name:
Created:      Wed Jul 6 13:29:54 2011
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    148 Bytes = 0.14 kB = 0.00 MB
Load Address: 0x00100000
Entry Point:  0x00100000

Our sample binary file is ready. Let's combine it with u-boot.bin to create the final flash image file:

# cat u-boot.bin test.uimg > flash.bin

Calculate the starting address of the test program in the flash.bin file:

# printf "0x%X" $(expr $(stat -c%s u-boot.bin) + 65536)
0x21C68

Boot the flash image in QEMU:

# qemu-system-arm -M versatilepb -nographic -kernel flash.bin

Now verify the image address in U-Boot:

Versatile # iminfo 0x21C68
## Checking Image at 00021c68 ...
Image Name:
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    136 Bytes = 0.1 kB
Load Address: 00100000
Entry Point:  00100000
Verifying Checksum ... OK

The image is present at the address 0x21C68. Boot it by executing the bootm command:

Versatile # bootm 0x21C68
## Booting image at 00021c68 ...
Image Name:
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    148 Bytes = 0.1 kB
Load Address: 00100000

Entry Point:  00100000
OK

Starting kernel ...

Hello Open World

That's all, folks!
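Why add 65536 when computing the image address? QEMU's -kernel option loads the flash image at 0x10000 (65536, as noted in the linker-script discussion earlier), and test.uimg sits immediately after u-boot.bin inside flash.bin, so its address in memory is 0x10000 plus the size of u-boot.bin. A sketch of the arithmetic with a dummy file of known size (the real offset depends on your actual u-boot.bin):

```shell
# Stand-in for u-boot.bin with a known size of 72 * 1024 = 73728 bytes.
dd if=/dev/zero of=fake-u-boot.bin bs=1024 count=72 2>/dev/null
# Image address = load address of flash.bin (0x10000 = 65536)
#               + size of u-boot.bin, since test.uimg is appended right after it.
printf "0x%X\n" $(( $(stat -c%s fake-u-boot.bin) + 65536 ))
```

With the 73728-byte stand-in this prints 0x22000; against a real 72-and-a-bit KB u-boot.bin it yields a value like the 0x21C68 seen above.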

Kernel Development & Debugging Using the Eclipse IDE

This article is targeted at Linux newbies, kernel developers, and those who are new to Eclipse. It deals with development, building and debugging of the Linux kernel using the Eclipse IDE. Eclipse is an open source community, whose projects are focused on building an extensible development platform, runtimes, and application frameworks for building, deploying and managing software across the entire software lifecycle. Many people know (and hopefully love) it as a Java IDE, but many say that Eclipse is much more than just a Java IDE. For this article, I have used a configuration of Ubuntu 9.10 (32-bit) running on a system with a Core 2 Duo CPU and 2 GB of RAM.

Eclipse CDT installation


Download Eclipse CDT. Extract the tarball with the following command:

$ tar -zxvf eclipse-cpp-ganymede-SR2-linux-gtk.tar.gz

Change the working directory in your terminal window to the extracted eclipse folder, and start Eclipse as follows:

$ cd eclipse
$ ./eclipse

Building and debugging a simple C application


Once you have the Eclipse IDE running, you can start with a small C application. Select File > New > C Project. Give the project a name (say, Hello, for this article), and specify the location (keep the location field at its default value). Under Project Type, select Executable. In the submenu, select Hello World ANSI C Project. Click on Next, and you are prompted with a window for some more details about the project. Fill in the appropriate data, and click Next. The next window, for the debug/release configuration, should be answered depending on your requirements. Click Finish to conclude this step. Next, Eclipse will ask you whether to open the C/C++ perspective; select Yes here. You will find Project Explorer on the left side of the IDE. Expand the Hello directory here, and you will find the

src directory; expand this directory too. Double-click the hello.c file in the src tree to open it. It's a simple program, as you will see. Now it's time to compile the project. Right-click the project in the Project Explorer dialogue box, and click Build Project. At the bottom, you will find a tab bar with options like Problems, Tasks, Console, etc. The Console tab displays compilation messages such as the following:

**** Build of configuration Debug for project hello ****

make all
Building target: hello
Invoking: GCC C Linker
gcc -o "hello" ./src/hello.o
Finished building target: hello

A binary output file named hello is created. To run it, again right-click the project, and choose Run As > Local C/C++ Application. You will find the output of running the application in the Console tab.

Compiling the Linux kernel on Eclipse


Now let's get down to the real work: kernel compilation. Download a kernel; I have tried this with linux-2.6.34.tar.bz2. Save it to the folder in which you want to store this project. For example:

$ mv linux-2.6.34.tar.bz2 /usr/src

Extract the kernel source and change into the extracted folder:

$ tar -jxvf linux-2.6.34.tar.bz2
$ cd linux-2.6.34

Configure your kernel with your preferred option from config/menuconfig/xconfig/gconfig:

$ make menuconfig

Visit the Kernel Hacking section; select Kernel Debugging; check "Compile the kernel with debug info" and "Compile the kernel with frame pointers". Then save your configuration and start Eclipse.

Before stepping into kernel compilation, you must disable automatic building and indexing, just to save time. Go to Window > Preferences > General > Workspace and disable the option Build automatically. To disable indexing, visit Window > Preferences > General > C/C++ > Indexer and select No Indexer.

To proceed, select File > New > C Project. Give a name to your project, uncheck Use Default Location, and browse to your kernel source-code directory. From Project Type, select Makefile Project > Empty Project. If you want to cross-compile the kernel, and you already have some other tool-chain installed, you can choose it from Toolchain. Click Finish to complete this initial step. Repeat the same build steps we covered earlier. In the Console tab, you will see the following messages:

**** Build of configuration Linux GCC for project KernelLinux ****

make all
CHK include/linux/version.h
CHK include/generated/utsrelease.h
CALL scripts/checksyscalls.sh
CHK include/generated/compile.h

Now wait for a few minutes; feel free to make yourself some coffee.

Once the compilation is over, check whether the kernel image file was created successfully:

$ ls -l /usr/src/linux-2.6.34/arch/x86/boot/bzImage
-rw-r--r-- 1 manoj src 3589920 2010-11-30 12:51 /usr/src/linux-2.6.34/arch/x86/boot/bzImage

QEMU installation and testing a kernel image (bzImage)


To debug a program, it must be run. As we are already running Ubuntu's kernel, we have to run the newly compiled kernel in an emulator. For this purpose, we use QEMU, which is a generic open source machine emulator. The good thing about QEMU is that it can boot the Linux kernel directly. Let's install the package; for my setup, I used the following command:

$ sudo apt-get install qemu

Once QEMU is installed, we can boot the newly compiled kernel, as follows:

$ qemu -kernel /usr/src/linux-2.6.34/arch/x86/boot/bzImage

The kernel will start booting, and will run into trouble when it tries to open the root device. We haven't provided a root filesystem to QEMU, so this is expected. However, now we know that Eclipse and QEMU are both working as desired. We will write a faulty module for our experiment, and will compile it as part of the kernel itself. Open your kernel project in Eclipse. In Project Explorer, right-click the net directory; choose New > Folder; and give it a name (hello, for this article). Create two files inside the hello directory, hello.c and Makefile, as follows:

********************* hello.c ************************
#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/string.h>

int hello_init(void)
{
    printk("Module named Hello inserted\n");
    strcpy(NULL, "Hello");   /* deliberate fault: writes through a NULL pointer */
    return 0;
}

void hello_exit(void)
{
    printk("Module named Hello removed\n");
}

module_init(hello_init);
module_exit(hello_exit);
MODULE_LICENSE("GPL");

********************* Makefile *********************
obj-y := hello.o

Because hello.o is listed under obj-y, it is linked into the kernel image rather than built as a loadable module, so this one line is all the kbuild Makefile needs. Most kernel newbies start kernel hacking by reading O'Reilly's Linux Device Drivers book, so they can understand this code easily, but we will compile the hello module as part of the kernel. Edit the Makefile linux-2.6.34/net/Makefile, and append the following line, so that make can execute the

Makefile in the hello directory:

obj-y += hello/

Save the changes, and build your kernel again. If the kvm module is installed on your system, you must remove it from memory before launching QEMU, since this module might create problems when you set breakpoints in the kernel:

$ sudo rmmod kvm_intel
$ sudo rmmod kvm

You can boot the kernel with QEMU with this command line (but don't do it yet):

$ qemu -s -S -hda /dev/zero -kernel /usr/src/linux-2.6.34/arch/x86/boot/bzImage

(-S stops the emulated processor at startup. The -s option can be replaced with -gdb -p xxxx, but with -s the default port number 1234 is used.) We need to specify a root filesystem, so we use /dev/zero as a dummy filesystem. Our concern here is to debug the kernel during booting; a working filesystem can be used in place of /dev/zero if required, but the filesystem is only really needed once the kernel has booted properly, so we can debug the kernel while it's booting. I suggest you read the QEMU man pages for more information. For now, we can go with these three options. The -s option opens a gdbserver on TCP port 1234 for QEMU. To change the default port number, use -gdb -p xxxx (port number) instead. For example:

$ qemu -gdb -p 4567 -S -hda /dev/zero -kernel /usr/src/linux-2.6.34/arch/x86/boot/bzImage

To confirm, execute the following command (you should see results like these):

$ netstat -tap | grep qemu
tcp   0   0 *:4567   *:*   LISTEN   23184/qemu

You can see that QEMU is running, listening on port 4567. Go to the kernel source-code directory, where you will find a file named vmlinux. This is the kernel image that stores the debugging symbols. To debug our kernel running on QEMU, we will use this file:

$ cd /usr/src/linux-2.6.34
$ gdb ./vmlinux
GNU gdb (GDB) 7.0-ubuntu
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/src/linux-2.6.34/vmlinux...done.
(gdb)

Important: we need to set the automatic usage of hardware breakpoints:

(gdb) set breakpoint auto-hw

Set a breakpoint at start_kernel():

(gdb) break start_kernel
Breakpoint 1 at 0xc06ed629: file init/main.c, line 533.

Now continue the kernel booting. The kernel will stop at start_kernel().

(gdb) c
Continuing.
(gdb) n
533    smp_setup_processor_id();
(gdb) n
545    boot_init_stack_canary();
(gdb) n
547    cgroup_init_early();
(gdb) n
549    local_irq_disable();
(gdb) list
544     */
545    boot_init_stack_canary();
546
547    cgroup_init_early();
548
549    local_irq_disable();
550    early_boot_irqs_off();
551    early_init_irq_lock_class();

You can use GDB commands to debug the kernel. For more information on GDB commands, download the GDB manual.

Kernel debugging in Eclipse


As mentioned earlier, we need hardware breakpoints to debug a kernel running on an emulator, so it's better to enable auto hardware breakpoints in the gdbinit file. Edit the /etc/gdb/gdbinit file and append the following line:

set breakpoint auto-hw

Boot the bzImage in QEMU. Open your project in Eclipse. Click Run > Debug Configurations. Double-click Local C/C++ Application; a new configuration will be created. Name your new configuration, browse to your project, and in the C/C++ Application field specify the full path to the vmlinux file, such as /usr/src/linux-2.6.34/vmlinux. Click the Debugger tab, and select gdbserver Debugger from the Debugger options. In the "Stop on startup at" field, set the breakpoint to hello_init. Click the Connection tab, select TCP as the connection type, and 1234 as the port number. (We are debugging the kernel in two different ways. Earlier, when we debugged the kernel from command-line GDB, we could change the default port number. Here, since we are debugging with Eclipse, it is better to stick to the default options.) Apply the changes, and start debugging by clicking the Debug button. Eclipse will build the kernel again (check the messages in the Eclipse Console), and after a few minutes, a pop-up window will ask you to open the Debug perspective; select Yes. In QEMU, look at the kernel messages to confirm that the kernel is running. Our module's source file (hello.c) will be opened in Eclipse. To open a disassembly of the code, click Window > Show View > Disassembly. Press F6 to step through the running code. Once the strcpy function is executed, the kernel will crash (strcpy tries to copy data to a NULL pointer). In QEMU, you will find the backtrace of the kernel. Some knowledge of assembly coding is an advantage during debugging.

Playing with User-mode Linux

This article gives you hands-on experience in setting up a User-Mode Linux (UML) kernel and getting it up on a running Linux OS. We see how to share files between the host Linux and guest Linux, via the network and other methods. We also cover building a custom kernel, building modules for the UML kernel, inserting them into the running UML kernel, and debugging the kernel and modules with GDB. UML gives you the advantage of running Linux on top of a Linux distribution, without the need for privileged access. It runs in the form of an unprivileged user program, giving the end user the power to play with the OS. The Linux kernel, once compiled for the UML architecture, creates a machine-dependent binary which can execute itself and launch the UML kernel. UML was developed by Jeff Dike, and has been part of the vanilla Linux kernel since version 2.6.0. It is very useful for kernel developers to quickly test new code; and for administrators, to build sandboxed Linux virtual machines and honeypots, or to deploy new services without disturbing the production environment. Most steps mentioned in this document are distribution-agnostic, and can be tried on any Linux machine with the x86 or x86_64 architecture. Figure 1 depicts a conceptual layout of UML in relation to the hardware, host kernel and user-space.

Figure 1: UML conceptual diagram

Requirements for setting up UML


The basic requirements for setting up UML are:
- Access to a Linux machine with the x86 or x86_64 architecture (with or without root access)
- The Linux kernel build environment pre-installed (GCC, make, etc.)
- A downloaded kernel source tarball (version 2.6.x) from kernel.org (I used 2.6.35-rc3 in this article)
- A root filesystem (can be created, or you can download one; creating a rootfs from scratch is beyond the scope of this article)

Building the UML kernel and modules


Extract the kernel source archive:

$ tar -jxvf linux-2.6.35-rc3.tar.bz2

Enter the directory in which it was extracted (I'm using /code/kernel/lfy/), and issue the following command to configure the kernel for the UML architecture (see Figure 2):

$ ARCH=um make menuconfig

Figure 2: Kernel configuration for UM architecture

Figure 3: UML-specific configurations in menuconfig

ARCH defines the architecture for which the kernel is to be compiled; in this case, um stands for user mode. The make menuconfig command gives us an ncurses-based interface in which we can configure build options for the UML kernel (see Figure 3). To be able to debug the kernel, we need to enable the following options:
1. Compile the kernel with debugging info (see Figure 4).
2. Compile the kernel with frame pointers.
3. Enable loadable module support (in the main menu).

Figure 4: Selection of debug options

Once you've chosen these options and saved the configuration, proceed with kernel compilation. (If you aren't familiar with the process in general, you may want to refer to one of the many kernel compilation how-tos on the Web.) Issue this specific make command to compile the kernel:

$ ARCH=um make

Once the kernel compilation is done, you should have a binary named linux created in the same directory; see Figure 5.

Figure 5: List of files after kernel compilation

Kernel modules need to be installed in a directory (of the host system), so that we can later copy them to the /lib/modules/ path inside the UML system. In my case, the target directory is /code/kernel/lfy/mods:

$ ARCH=um make modules_install INSTALL_MOD_PATH=/code/kernel/lfy/mods

Extending the root filesystem (optional)


As mentioned in the requirements section, we're using a downloaded root filesystem file for the UML kernel. If you need to copy many compiled kernel modules, or other large files, into the UML filesystem, you will probably need more space in the downloaded root filesystem. You can quickly resize the filesystem with the following three-step procedure:

1. Add space at the end of the filesystem file:

$ dd if=/dev/zero count=1024 bs=1024k >> FedoraCore6-AMD64-root_fs

This adds 1 GB to the end of the root filesystem file. Be careful to use the >> (double greater-than) redirection operator to append to it; if you use the single greater-than symbol, the rootfs contents will be destroyed, leaving just a 1 GB file of zeroes.

2. Do a forced check of the filesystem:

$ e2fsck -f FedoraCore6-AMD64-root_fs

3. Resize the filesystem to use the added space:

$ resize2fs FedoraCore6-AMD64-root_fs
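The warning about >> versus > in step 1 can be seen safely on a throwaway file instead of a real root filesystem. This sketch only demonstrates the redirection behaviour; the file name and sizes are arbitrary.

```shell
# Demonstrate why '>>' matters when growing an image file:
# '>>' appends to the file, '>' truncates it first.
IMG=$(mktemp)
dd if=/dev/zero of="$IMG" bs=1024 count=4 2>/dev/null   # a 4 KB stand-in "rootfs"
dd if=/dev/zero bs=1024 count=2 2>/dev/null >> "$IMG"   # append 2 KB: 4096 + 2048 = 6144
echo "size after append: $(wc -c < "$IMG") bytes"
dd if=/dev/zero bs=1024 count=2 2>/dev/null > "$IMG"    # truncates! only 2048 bytes remain
echo "size after truncate: $(wc -c < "$IMG") bytes"
```

With the single > operator the original 4 KB of "filesystem" is gone, which is exactly what would happen to the downloaded rootfs image.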

First boot of UML


Now we are ready to boot UML for the first time.

1. Boot a UML instance with the following command line (shown in Figure 6):

$ linux-2.6.35-rc3/linux ubda=FedoraCore6-AMD64-root_fs mem=256M

Figure 6: Booting UML

In this command line, ubda specifies the filesystem image that is to be used as the root filesystem. If you need to pass more than one filesystem image, use arguments like ubdb, ubdc, etc. The optional mem parameter specifies the memory (RAM) allocated to the UML instance (it defaults to 128M if not specified).

2. Access the host filesystem in the UML instance, and copy the previously compiled kernel modules to UML's filesystem. This can be done in various ways; I am highlighting a couple of methods here.

1. hostfs method (root access not required): hostfs is a UML filesystem that provides access to the host system's files. Once the UML system is booted, execute the following steps:

1. Create a directory in the UML instance, where you will mount the host filesystem:

# mkdir /host

2. Mount the host directory that contains the modules built for the UML kernel:

# mount none /host -t hostfs -o /code/kernel/lfy/mods

3. Once the host directory is mounted, copy the module files to /lib/modules of the UML instance with a simple cp command.

2. Network update method (root access required): A network update involves setting up a bridge between the host and the UML system. Once the network is set up, files can be copied over the network using scp, or via an NFS share from the host, mounted in the UML system. Execute the following steps:

1. On the host, you will need to have the bridge-utils package installed. You can download the source code and compile it in your host OS. After that, run the following steps (in the same order):

# brctl addbr br0                    (create bridge br0)
# tunctl -u `id -u surya`            (create a tap device, and assign permissions to an ordinary user; replace surya with your desired user account)
# ifconfig eth0 0.0.0.0 promisc up   (set the system network interface and the tap interface to promiscuous mode)
# ifconfig tap1 0.0.0.0 promisc up
# brctl addif br0 eth0               (add the system interface to the bridge)
# brctl addif br0 tap1               (add the tap device to the bridge)
# ifconfig br0 up                    (bring up the br0 device with DHCP -- see note below)

2. Once the above setup is done, the UML instance can be restarted with the following modified command line:

$ linux-2.6.35-rc3/linux ubda=FedoraCore6-AMD64-root_fs mem=256M eth0=tuntap,tap1

3. As mentioned, use scp to copy the modules, or create an NFS share from the host, mount it in the UML instance, and copy the modules.

Note: This setup assumes that there is a running DHCP server in the host's network. If this is not the case, interface br0 on the host and eth0 in the UML guest have to be assigned static IP addresses. Remember that in this setup, the UML system lies on the same network as the host system. We follow this model of setting up the network because if administrators want to provide sandboxed UML test environments for users who need full privileges, they would either need the UML instances to be on the same network as the host, or would need to configure a custom iptables setup.
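The bridge commands above can be collected into a small script for review before running them as root. This is a hedged sketch, not part of the original article: the user, interface and tap names are placeholders, and with DRYRUN=1 (the default here) the script only prints what it would run.

```shell
# Sketch of the host-side bridge setup. Adjust BRIDGE_USER/IFACE/TAP
# for your host; set DRYRUN=0 and run as root to actually apply.
BRIDGE_USER=${BRIDGE_USER:-surya}
IFACE=${IFACE:-eth0}
TAP=${TAP:-tap1}
DRYRUN=${DRYRUN:-1}

run() {
    # In dry-run mode, print the command instead of executing it.
    if [ "$DRYRUN" = 1 ]; then echo "would run: $*"; else "$@"; fi
}

run brctl addbr br0                       # create the bridge
run tunctl -u "$BRIDGE_USER"              # create a tap device owned by the user
run ifconfig "$IFACE" 0.0.0.0 promisc up  # NIC and tap into promiscuous mode
run ifconfig "$TAP" 0.0.0.0 promisc up
run brctl addif br0 "$IFACE"              # enslave both interfaces to the bridge
run brctl addif br0 "$TAP"
run ifconfig br0 up                       # bring the bridge up
```

Reviewing the printed commands first is prudent here, since a mistake while re-plumbing eth0 can cut off a remote host.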

Debugging the Linux kernel in UML


Since UML is considered to be an application, it can be debugged with the standard GDB debugger, as follows. Load the linux binary in GDB:

$ gdb linux-2.6.35-rc3/linux

This gives us a gdb prompt. Since we did not specify the command-line arguments for the UML instance on the GDB command line, we set the arguments here (the eth0 argument assumes that you have set up bridged network access between the host and the UML instance, as described above):

(gdb) set args ubda=FedoraCore6-AMD64-root_fs mem=256M eth0=tuntap,tap1

Figure 7 illustrates the UML kernel being booted from within GDB. Once we passed the arguments with the set args command, we placed a breakpoint on the start_kernel() function in the kernel code, and then instructed GDB to run the program. As you can see, after UML initialisation, GDB stopped execution when it reached the breakpoint. If you did not start UML under GDB, you can also attach GDB to the UML guest later:

$ gdb linux-2.6.35-rc3/linux 2666

(Here, 2666 is the PID of the UML instance. See Figure 8 for an illustration of attaching GDB to a running UML instance.)

Figure 7: Debugging UML instance with GDB

Figure 8: Attaching GDB to a running UML instance

Compiling custom modules for UML


If you have written a custom kernel module that you need to insert into the running UML kernel, the module needs to be compiled for the UML architecture, against the same kernel version that UML is running. Your Makefile could look like what's shown below:

obj-m := uml-mod.o
KPATH := /code/kernel/lfy/mods/lib/modules/2.6.35-rc3/build
PWD := $(shell pwd)

all:
	$(MAKE) -C $(KPATH) SUBDIRS=$(PWD) modules

Here, KPATH defines the path to the UML kernel's build directory. Remember to pass ARCH=um with the make command:

$ ARCH=um make

This will compile your custom kernel module for the UML kernel. Once the module is compiled successfully, you can copy the .ko file from the host system to the UML instance using hostfs or networking, as given above, and then insert it into the running UML kernel.

Debugging modules with UML


Loadable modules are a great advantage of the Linux kernel: pieces of kernel code can be dynamically plugged into and out of the running kernel. However, a buggy module can cause problems with the system, and will need to be debugged. As these modules are inserted into the kernel at a later stage, GDB has no knowledge of the relevant symbol information, or of the location of the module in memory. We need to feed this information to GDB once the module is loaded, in order to debug the module. GDB has a command, add-symbol-file, which takes the .ko module file (which you are trying to debug) as its first argument, and the address of the .text section of the module as the second argument. The .text address can be obtained from /sys/module/<modulename>/sections/.text.
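The mechanics can be sketched as a tiny shell fragment that composes the gdb command from the sysfs value. Here a temporary stand-in file plays the role of /sys/module/loop/sections/.text, and the module path and address are example values, so the fragment can be tried outside a UML session.

```shell
# Sketch: build the add-symbol-file command for gdb from the module's
# .text address. On a real system, read /sys/module/<name>/sections/.text
# inside the guest; a stand-in file with an example value is used here.
SECTS=$(mktemp -d)
echo 0x7187c000 > "$SECTS/.text"      # stand-in for the sysfs .text file

MOD=drivers/block/loop.ko             # example path to the module's .ko
ADDR=$(cat "$SECTS/.text")
echo "add-symbol-file $MOD $ADDR"     # paste this line at the (gdb) prompt
```

This keeps the address and the .ko path together, which helps avoid the classic mistake of loading symbols at the wrong base address.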

Figure 9: Module debugging steps in the UML instance

Let's consider an example, using the module loop.ko.

In the UML instance:

1. Insert the module loop.ko into the UML kernel. (If it is not compiled, you can compile loop.ko and copy it to the UML system.)

# insmod/modprobe loop.ko

2. Obtain the address from /sys/module/loop/sections/.text (see Figure 9):

# cat /sys/module/loop/sections/.text

3. To debug the loop.ko module, prepare a sample image file and format it, ready to be mounted at a later stage:

# dd if=/dev/zero of=fs.img count=2 bs=1024k
# mkfs.ext3 fs.img

On the host system:

1. In a terminal window different from the one you started the UML instance from, find the process ID of the UML instance.

2. Attach GDB to the running UML instance, specifying the PID. For example:

$ gdb uml-linux-image 8892

3. In GDB, load the debug symbol information for the module:

(gdb) add-symbol-file /code/kernel/lfy/linux-kernel/drivers/block/loop.ko 0x7187c000

(The last argument here is the .text address of the loop module, obtained in the second step we ran in the UML instance, above.)

4. Test whether the module is properly loaded:

(gdb) p loop_unplug

(loop_unplug is a function in the drivers/block/loop.c file. This GDB command should show the .text address we used earlier; once you see it, you know you are able to access the module through GDB.)

5. Now, put a breakpoint on the loop_unplug() function:

(gdb) b loop_unplug

6. Finally, type c at the gdb prompt, to continue running until the breakpoint is encountered.

Figure 10 illustrates the above steps. Back in the UML instance, we can trigger our breakpoint with the following steps:

# mkdir test
# mount -o loop fs.img test

This should hit the breakpoint in the running GDB instance on the host system. We can then view and debug the code of the module. The purpose of this article was to provide an introduction to UML, and a step-by-step guide to setting up a UML system and debugging the kernel and modules.
The methods mentioned in this article are just a few of the many available ways to set up and play around with UML. To learn more, you can subscribe to the UML mailing lists or visit the UML home page.

Figure 10: Module debugging steps on the host system

Debugging the Linux Kernel with debugfs

debugfs is a simple memory-based filesystem, designed specifically to debug Linux kernel code; it is not to be confused with the debugfs filesystem utility. Introduced by Greg Kroah-Hartman through a post to the Linux kernel mailing list (LKML) in December 2004, debugfs helps kernel developers export large amounts of debug data into user-space. This article introduces debugfs and its application programming interface, along with sample code. The intended audience is kernel and device-driver developers with some knowledge of Linux kernel development, though the experienced can use it to refresh their skills as well. Unlike /proc and /sysfs, debugfs has no rules about what information can be put in, and hence its use for debugging is encouraged. Using printk can suffice for reading values, but it does not allow developers to change the values of variables from user-space.

Prerequisites
You need a test system with a kernel development environment, plus the Linux kernel source code for reference. Pretty much any distribution with a kernel version higher than 2.6.10-rc3 supports debugfs, but the newer the kernel, the more API support you get. My setup consists of Fedora 13 (x86_64) and the latest linux-next kernel.

Setting up DebugFS
If you are using one of the latest distributions, chances are that debugfs is already set up on your machine. If you're compiling the kernel from scratch, make sure you enable debugfs in the kernel configuration. Once you reboot into your newly compiled kernel, check if debugfs is already mounted, with the following command:

# mount | grep debugfs
none on /sys/kernel/debug type debugfs (rw)

If you see output as above, you have debugfs pre-mounted. If not, you can mount it (as root) with the command shown below:

# mount -t debugfs nodev /sys/kernel/debug

If you want it to be available on every reboot, append an entry to /etc/fstab as follows:

debugfs /sys/kernel/debug debugfs defaults 0 0

Once mounted, you can view a lot of files and directories in /sys/kernel/debug, each belonging to one subsystem or another.

The debugfs API


To access the API, include linux/debugfs.h in your source file. The ideal way to start using debugfs is to create a directory of your own within /sys/kernel/debug; the rest of your files can then be placed within this directory.

struct dentry *debugfs_create_dir(const char *name, struct dentry *parent);

Here, name is the name of the directory, and parent is the parent directory (if NULL, the directory is created in /sys/kernel/debug). If debugfs is not enabled in the kernel, -ENODEV is returned. If you need to create a single file within debugfs, you can call the following function:

struct dentry *debugfs_create_file(const char *name, mode_t mode, struct dentry *parent, void *data, struct file_operations *fops);

Here, name is the file name to be created; mode stands for the permissions of the created file; parent specifies the parent directory in which the file should be created (if NULL, it defaults to the debugfs root, /sys/kernel/debug); data is stored in the i_private field of the resulting inode; and fops is the file operations. If you need to write to and read from a single value, you can use this to create an unsigned 8-bit value:

struct dentry *debugfs_create_u8(const char *name, mode_t mode, struct dentry *parent, u8 *value);

Here, value is a pointer to the variable that is to be read and written. A few other helper functions to create files holding single integer values are:

struct dentry *debugfs_create_u16
struct dentry *debugfs_create_u32
struct dentry *debugfs_create_u64

(Refer to fs/debugfs/file.c for more information.) Similar functions that give hex output are:

struct dentry *debugfs_create_x8(const char *name, mode_t mode, struct dentry *parent, u8 *value)
struct dentry *debugfs_create_x16
struct dentry *debugfs_create_x32
struct dentry *debugfs_create_x64

Note: debugfs_create_x64 is the most recent addition to the API; added in May 2010 by Huang Ying, it shows that debugfs is still in active development.

Clean up!
One important thing to remember is that all the files and directories created using the above APIs are to be cleaned up by the creator, using the following function:

void debugfs_remove(struct dentry *dentry);

Here, dentry is a pointer to the file or directory that needs to be cleaned up. Other available debugfs APIs are:

debugfs_create_symlink -- create a symbolic link in debugfs
debugfs_remove_recursive -- recursive removal of a directory tree
debugfs_rename -- rename a file or directory
debugfs_initialized -- find out whether debugfs is registered
debugfs_create_bool -- read/write a boolean value
debugfs_create_blob -- read a binary large object

You can refer to the code for these and other functions in fs/debugfs/{inode.c, file.c} in the kernel source. On a general note, debugfs is not considered a stable Application Binary Interface (ABI), and hence it should only be used to collect debug information on a temporary basis. For more information in this regard, you can refer to an LWN article.

Code sample
Note: You need to be running the latest kernel to make this program work. This sample program, my_debugfs.c, shows a use of debugfs.

#include <linux/init.h>
#include <linux/module.h>
#include <linux/debugfs.h> /* this is for the debugfs API */
#include <linux/fs.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Surya Prabhakar <surya_prabhakar@dell.com>");
MODULE_DESCRIPTION("sample code for DebugFS functionality");

#define len 200

u64 intvalue, hexvalue;
struct dentry *dirret, *fileret, *u64int, *u64hex;
char ker_buf[len];
int filevalue;

/* read file operation */
static ssize_t myreader(struct file *fp, char __user *user_buffer,
                        size_t count, loff_t *position)
{
	return simple_read_from_buffer(user_buffer, count, position, ker_buf, len);
}

/* write file operation */
static ssize_t mywriter(struct file *fp, const char __user *user_buffer,
                        size_t count, loff_t *position)
{
	if (count > len)
		return -EINVAL;
	return simple_write_to_buffer(ker_buf, len, position, user_buffer, count);
}

static const struct file_operations fops_debug = {
	.read = myreader,
	.write = mywriter,
};

static int __init init_debug(void)
{
	/* create a directory named dell in /sys/kernel/debug */
	dirret = debugfs_create_dir("dell", NULL);

	/* create a file in the above directory;
	   this requires read and write file operations */
	fileret = debugfs_create_file("text", 0644, dirret, &filevalue, &fops_debug);

	/* create a file which takes in a u64 value */
	u64int = debugfs_create_u64("number", 0644, dirret, &intvalue);

	if (!u64int) {
		printk("error creating int file");
		return -ENODEV;
	}

	/* takes a hexadecimal value */
	u64hex = debugfs_create_x64("hexnum", 0644, dirret, &hexvalue);
	if (!u64hex) {
		printk("error creating hex file");
		return -ENODEV;
	}

	return 0;
}
module_init(init_debug);

static void __exit exit_debug(void)
{
	/* removing the directory recursively also cleans up all the files inside it */
	debugfs_remove_recursive(dirret);
}
module_exit(exit_debug);

Building the module


Create a Makefile like the following one, to compile and test this code:

obj-m := my_debugfs.o
KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)

all:
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules

Further reading and references


For those who are unaware, there's a lot of information already available; LWN and the LKML mailing list are good sources of knowledge. If you want to look at the relevant kernel source code, see include/linux/debugfs.h and fs/debugfs/inode.c. You can observe the latest changes in debugfs by examining the online Git repository at kernel.org. Within the kernel source, you can also refer to Documentation/filesystems/debugfs.txt.

Kernel Tracing with ftrace, Part 1

This article explains how to set up ftrace and understand how to trace functions. It should be useful for current kernel and device-driver developers who want to debug kernel issues, and also for students who are keen to pursue a career in Linux systems programming. ftrace (Function Tracer) is the Swiss Army knife of kernel tracing. It is a tracing mechanism built right into the Linux kernel, with the capability to see exactly what is happening in the kernel, and debug it. ftrace is more than a mere function tracer, and has a wide variety of tracing abilities to debug and analyse a number of issues like latency, unexpected code paths, performance problems, etc. It can also be used as a good learning tool. ftrace was introduced in kernel 2.6.27 by Steven Rostedt and Ingo Molnar. It comes with its own ring buffer for storing trace data, and uses the GCC profiling mechanism.

Prerequisites
You need a 32-bit or 64-bit Linux machine with a kernel development environment, and as new a kernel as possible (the newer the kernel, the more tracing options you get). I use a Fedora Core 13 (x86_64) machine in my environment, but any distribution would suffice.

Setting up Ftrace
debugfs needs to be set up on the machine you want to use ftrace on. If you are unaware of how to set up debugfs, refer to my debugfs article from last month. debugfs should be mounted on /sys/kernel/debug, and if tracing is enabled, you should be able to see a directory called tracing under it. If debugfs is not mounted, you can issue the following command:

# mount -t debugfs nodev /sys/kernel/debug

If you are unable to see the tracing subdirectory, you will need to enable tracing in the kernel configuration, and recompile. Look for the following options to be enabled in the kernel configuration path (refer to Figure 1), under Kernel Hacking -> Tracers:

1. Kernel Function Tracer (FUNCTION_TRACER)
2. Kernel Function Graph Tracer (FUNCTION_GRAPH_TRACER)
3. Enable/disable ftrace dynamically (DYNAMIC_FTRACE)
4. Trace max stack (STACK_TRACER)

Figure 1: Kernel configuration options for tracing

Depending on your architecture, a few more tracers can be enabled during compilation, as per requirements. The listed tracers are for debugging. Once the kernel compilation is complete, and you have booted into the new kernel, tracing can be initiated.

Figure 2: Tracing files

Tracing
Files in the tracing directory (/sys/kernel/debug/tracing) control the tracing ability (refer to Figure 2 for a list of files). A few files could be different, depending upon what tracers you selected during kernel configuration. You can obtain information on these files from the <kernel source>/Documentation/tracing directory. Let's explore a few of the important ones:

available_tracers: shows which tracers are compiled in to trace the system.
current_tracer: displays which tracer is currently enabled; can be changed by echoing a new tracer into it.
tracing_enabled: lets you enable or disable the current tracer.
trace: the actual trace output.
set_ftrace_pid: sets the PID of the process for which the trace needs to be performed.

To find out the available tracers, just cat the available_tracers file. Tracers in the space-separated output include nop (not a tracer; set by default), function (the function tracer), function_graph (the function graph tracer), etc.:

# cat available_tracers
blk function_graph mmiotrace wakeup_rt wakeup irqsoff function sched_switch nop

Figure 3: Sample trace output

Once you identify the tracer that you want to use, enable it (ftrace takes only one tracer at a time):

# cat current_tracer               ## see what tracer is currently in use
# echo function > current_tracer   ## select a particular tracer
# cat current_tracer               ## check whether we got what we wanted

To start tracing, use the following commands:

# echo 1 > tracing_enabled         ## initiate tracing
# cat trace > /tmp/trace.txt       ## save the contents of the trace to a temporary file
# echo 0 > tracing_enabled         ## disable tracing
# cat /tmp/trace.txt               ## see the output of the trace file

The trace output is now in the trace.txt file. A sample output of a function trace obtained with the above commands is shown in Figure 3.
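The echo/cat sequence above can be wrapped in a small helper function. In this hedged sketch, TRACING defaults to a stand-in temporary directory so the control flow can be exercised without root or debugfs; on a real system you would point it at /sys/kernel/debug/tracing, where the control files are provided by the kernel rather than created by hand.

```shell
# Sketch of a tiny trace wrapper around the ftrace control files.
TRACING=${TRACING:-$(mktemp -d)}
# Stand-in control files (on a real system these already exist):
: > "$TRACING/current_tracer"
: > "$TRACING/tracing_enabled"
: > "$TRACING/trace"

trace_run() {
    echo function > "$TRACING/current_tracer"   # select the function tracer
    echo 1 > "$TRACING/tracing_enabled"         # start tracing
    "$@"                                        # run the workload to be traced
    echo 0 > "$TRACING/tracing_enabled"         # stop tracing
    cat "$TRACING/trace"                        # dump whatever was captured
}

trace_run true   # trace a trivial workload; on real ftrace this prints trace data
```

The point of the wrapper is that tracing is stopped immediately after the workload finishes, so the ring buffer is not flooded by unrelated activity while you read it.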

Kernel Tracing with ftrace, Part 2


In my previous article, we had a working setup of ftrace, and explored options to enable and disable it. In this article, we will explore a few more of ftrace's capabilities. Let's begin with tracer options. The output of the tracing can be controlled by a file called trace_options. Various fields can be enabled and disabled by updating options in the file /sys/kernel/debug/tracing/trace_options. A sample of trace_options can be viewed in Figure 1.

Figure 1: Trace options

To disable a tracing option, the keyword no needs to be prefixed to the option name. For example, echo notrace_printk > trace_options. (Remember not to put a space between no and the option.) To enable the option again, you could use, for instance, echo trace_printk > trace_options.

ftrace for a specific process


ftrace allows you to perform tracing even for a specific process. In the /sys/kernel/debug/tracing directory, the file set_ftrace_pid needs to be updated with the PID of the process you want to be traced. The traceprocess.sh sample script below shows how to capture the PID on the go, and enable tracing.

#!/bin/bash
DPATH="/sys/kernel/debug/tracing"
PID=$$

## Quick basic checks
[ `id -u` -ne 0 ] && { echo "needs to be root" ; exit 1; }            # check for root permissions
[ -z "$1" ] && { echo "needs process name as argument" ; exit 1; }    # check for args to this script
mount | grep -i debugfs &> /dev/null
[ $? -ne 0 ] && { echo "debugfs not mounted, mount it first"; exit 1; }  # check for debugfs mount

# flush existing trace data
echo nop > $DPATH/current_tracer

# set the function tracer
echo function > $DPATH/current_tracer

# enable the current tracer
echo 1 > $DPATH/tracing_enabled

# write the current process id to the set_ftrace_pid file
echo $PID > $DPATH/set_ftrace_pid

# start the tracing
echo 1 > $DPATH/tracing_on

# execute the process
exec $*

You can refine it with your own innovations. Run it with the command whose process you want to trace as the argument, as shown in Figure 2, where we traced the ls command.

Figure 2: Executing traceprocess.sh and viewing trace output

Once tracing is complete, you need to clear the set_ftrace_pid file, for which you can use the following command:

:> set_ftrace_pid

Function graph tracer


The function graph tracer tracks the entry and exit of a function, and is quite useful for tracking its execution time. Functions with a duration of over 10 microseconds are marked with a +, and those over 100 microseconds with a !. To enable the function graph tracer, use echo function_graph > current_tracer. Sample output is shown in Figure 3.

Figure 3: Trace output of function graph tracer

Figure 4: Listing filter functions, using wild-cards

There are a lot of tracers; the entire list is in linux/Documentation/trace/ftrace.txt. Tracers are enabled or disabled by echoing the tracer name into the current_tracer file.

Dynamic tracing
We can easily get inundated with the amount of data the function tracer throws at us. There is a dynamic way to filter just the functions we need, and eliminate those we don't: specify them in the file set_ftrace_filter. (First find the function(s) you want, from the available_filter_functions file.) See Figure 4 for an example of dynamic tracing. As you can see, you can even use wild-cards for the function names. I used all the vmalloc_ functions, and set them with echo 'vmalloc_*' > set_ftrace_filter.

Event tracing
Tracing can also be triggered when particular events happen on the system. Available system events are listed in the file available_events:

[root@DELL-RnD-India tracing]# cat available_events | head -10
kvmmmu:kvm_mmu_pagetable_walk
kvmmmu:kvm_mmu_paging_element
kvmmmu:kvm_mmu_set_accessed_bit
kvmmmu:kvm_mmu_set_dirty_bit
kvmmmu:kvm_mmu_walker_error
kvmmmu:kvm_mmu_get_page
kvmmmu:kvm_mmu_sync_page
kvmmmu:kvm_mmu_unsync_page
kvmmmu:kvm_mmu_prepare_zap_page
kvm:kvm_entry

For example, to enable an event, you would use echo sys_enter_nice >> set_event (note that you append the event name to the file, using the >> append redirector, not >). To disable an event, precede the event name with a !: echo '!sys_enter_nice' >> set_event. See Figure 5 for a sample event tracing scenario. The available events are listed in the events directory as well.

Figure 5: Available tracing events, setting and unsetting them.

For further details about event tracing, read the file Documentation/trace/events.txt in the kernel source directory.

trace-cmd and KernelShark


trace-cmd, introduced by Steven Rostedt in his July 2009 post to the LKML, makes it easy to manipulate the tracer. Follow these steps to get the latest version, including the GUI tool KernelShark, installed on your system:

wget http://www.hr.kernel.org/pub/linux/analysis/trace-cmd/trace-cmd-1.0.5.tar.gz
tar -zxvf trace-cmd-1.0.5.tar.gz
cd trace-cmd*
make
make gui          # compiles GUI tools (KernelShark)
make install
make install_gui  # installs GUI tools

With trace-cmd, tracing becomes a breeze (see Figure 6 for sample usage):

trace-cmd list                   ## see available events
trace-cmd record -e syscalls ls  ## trace the syscalls of 'ls'
                                 ## (a file called trace.dat gets created in the current directory)
trace-cmd report                 ## displays the report from trace.dat

Figure 6: Using trace-cmd for recording and reporting

KernelShark, installed by the make install_gui step above, can be used to analyse the trace data in the file trace.dat, as shown in Figure 7.

KGDB with VirtualBox: Debug a Live Kernel

Debugging an application live has always been easy for application developers, but debugging a live kernel has never been a simple option for kernel developers; it involves multiple machines with serial connections. This article shows how to use virtualisation atop a running OS to help debug a live kernel on a single machine. Readers are expected to have prior knowledge of how to use GDB, know the fundamentals of the Linux kernel, and understand custom kernel compilation, apart from knowing how to use VirtualBox or any other virtualisation software. KGDB is an amazing Linux kernel debugging tool. It can debug the kernel while it is running, set breakpoints, and step through the code. Earlier, KGDB used to be a bunch of patches that had to be carefully merged into the mainline kernel. However, since version 2.6.26, KGDB has been merged into the mainline, and only needs to be enabled during kernel compilation. A typical KGDB setup requires two machines connected by a serial cable: one is the source machine, from which debugging is done, and the other (the destination) is being debugged. With virtualisation, however, we can do away with that second machine. When we combine VirtualBox with KGDB on a single machine, the host OS is the source machine, while the guest OS (a Linux kernel compiled with KGDB enabled) is the destination. A virtual serial port is enabled between the host and the guest. Figure 1 displays such a setup.

Figure 1: KGDB with the VirtualBox set-up

Prerequisites
For this setup, we'll need:

A host running a Linux system (you can have a host OS other than Linux, but this article does not cover that). My host system runs Ubuntu Maverick Meerkat 10.10, 64-bit.

VirtualBox software installed on the host OS. I used the VirtualBox 4.0 distribution-specific binary obtained from the project website.

The socat binary installed on the host. This is used to link the pipe file (FIFO) that is created by VirtualBox with a pseudo-terminal on the host system. Normally, GDB takes a physical terminal file (like ttyS0) as the remote target, but in our case, we will instead provide a pseudo-terminal created by socat as the remote target for GDB. Refer to the socat man pages for more information.

A VM installed with a Linux guest OS (I used Fedora 14). The VirtualBox documentation shows how to create a VM, if you need it, so I won't repeat it here. Help on how to install an OS in a VirtualBox VM is also in the documentation. I downloaded the Fedora 14 ISO from the Fedora site, attached it to the VM, booted the VM and installed Fedora.

The Linux kernel source, accessible to the VM (and the host, too; see below). This can be picked up from kernel.org; I used version 2.6.37. It is used to recompile the guest OS kernel with KGDB-specific options. How to make the source available in the VM is described in the 'File-sharing between machines' subsection below.

Setting up the guest


Configuring the virtual serial port in VirtualBox
Right-click your virtual machine instance, and go to the Settings tab. On the Port 1 tab, choose Enable Serial Port. Select Port Mode to be Host pipe. Enter a pipe file-name in the Port/File Path text field, as shown in Figure 2.

Figure 2: Configuring the serial port in VirtualBox

File-sharing between machines


I have set up networking for the VM and in my Fedora guest, so that I can easily access files on the host. You could set up an optional NFS server on the host machine, and create an NFS share for the kernel source directory. This share is mounted within the guest, and the kernel is compiled and installed from the guest command prompt. View Figure 3 for an idea of my setup.

Figure 3: The NFS share set-up with VirtualBox

There are many benefits to such a setup: the shared kernel source can be used to directly do a make install of the kernel within the guest. While debugging, you will need the kernel source files on the host OS, so that they are accessible to GDB; and let's not forget the vmlinux file, which is passed as one of the arguments to the debugger. If you are debugging kernel modules, you can edit the module source (in the NFS-shared folder) from the host, while you debug the guest. There are many other ways of doing this, which you can explore on the Internet.

Preparing and installing the kernel on a guest OS


The kernel source can be compiled either on the guest or the host. Since I have Ubuntu on the host and Fedora in the guest, I preferred to compile the kernel in the guest itself. Wherever you compile it, you will obviously need the build environment set up. Again, preparing a build environment is a task that is documented very well on the Internet. An important point to note is that compiling the kernel in the VM (guest OS) is extremely time-consuming. During kernel configuration (once you do a make menuconfig), ensure you enable the following options:

Kernel hacking  (options for kernel hacking)
    Kernel debugging  (features for kernel debugging)
    Compile the kernel with debug info  (kernel and modules are compiled with the -g option)
    Compile the kernel with frame pointers  (frame-pointer registers are used to keep track of the stack)
    KGDB: Kernel debugger  (enable KGDB)
        KGDB: Use over serial console  (enable serial console support; see Figure 4)

Figure 4: KGDB serial console options

Once the kernel is compiled, doing a make modules_install and make install from within the guest will install the newly compiled kernel. Figure 5 shows the GRUB option for the new kernel after a guest reboot.

Figure 5: Newly compiled kernel in the GRUB menu

Figure 6: Options in Grub prompt

Editing the bootloader


To enable KGDB serial console support in the guest, we need to append the options kgdboc=ttyS0,115200 and kgdbwait to the kernel command line. Here, kgdboc (KGDB over console) uses ttyS0 with the baud rate defined as 115200. The kgdbwait option tells the kernel to wait until we connect to it with GDB. From the GRUB boot-loader screen, press e and append the options to the kernel line, as shown in Figure 6. An alternative, if you plan to be debugging frequently, is to edit the bootloader configuration file (/etc/grub.conf, in my case) and update the kernel command line, as shown in Figure 7. This second method requires a reboot of the VM to activate the new kernel options, unless you have also edited them at the GRUB prompt.
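For reference, a sketch of what the edited kernel line in grub.conf might look like. The kernel version and root device below are placeholders for illustration, not values taken from the article; only the trailing kgdboc=ttyS0,115200 kgdbwait part is what this step adds:

```
kernel /vmlinuz-2.6.37 ro root=/dev/sda1 kgdboc=ttyS0,115200 kgdbwait
```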

Figure 7: Editing kernel options in /etc/grub.conf

Booting with KGDB options


Once you have these options in the kernel command-line and boot from it, the kernel boots till it gets to the stage where it waits for the remote GDB connection over the (virtual) serial port, as shown in Figure 8.

Figure 8: KGDB waiting for the remote connection

Linking the serial file on the host to the pseudo-terminal


We need to use socat to do this linking, with the following command:

# socat -d -d /code/guest/serial PTY:

Here, PTY: is the pseudo-terminal, and /code/guest/serial is the virtual serial port pipe file created by VirtualBox on my host machine as per the VM settings done earlier. When this command is run, it reports the pseudo-terminal number that's allocated (in my case, /dev/pts/7), as shown in Figure 9.

Figure 9: Socat with the pipe file and pseudo-tty

It's important to remember that you should not terminate the socat command; it needs to keep running in the background for us to be able to use the pseudo-terminal, else it breaks the stream.

Firing up GDB
Enter the kernel source directory on the host, and start GDB, telling it to connect to the remote target, which is the pseudo-terminal returned by socat:

# cd linux-2.6.37
# gdb ./vmlinux
(gdb) target remote /dev/pts/7

This connects us to the waiting Linux kernel session in the VM. If we type continue at the GDB prompt, we will see booting resume in the VM's guest OS. To get back to the GDB prompt on the host, you need to run the following command in the guest:

# echo g > /proc/sysrq-trigger

This will break the running session, and give you control in GDB. This can be used to insert breakpoints and do other debugging operations, like those seen in Figure 10. If you need to debug a kernel module, insert the module in the guest, obtain the .text address of the module (from /sys/module/<module_name>/sections/.text) and use it as an argument to the GDB command add-symbol-file.

Figure 10: GDB connected to the guest OS

As you can see, with the above setup, a great deal of control can be achieved in debugging a live kernel.

How to create libraries with gcc?

Static Libraries with gcc:


A static library (or statically-linked library) is a set of routines, external functions and variables which are resolved in a caller at compile-time and copied into a target application by a compiler, linker, or binder, producing an object file and a stand-alone executable. Here I will try to explain how to create a static library with gcc.

Code for the library:


First, we write some code for our library. The function below computes and returns the factorial of a number.

fact.c

int fact (int f)
{
    if (f <= 1)
        return 1;
    return f * fact (f - 1);
}

We need to create a header file to declare our function.

fact.h

int fact (int);

Creating static library:


Static libraries are simply a collection of ordinary object files; conventionally, static library names end with the suffix .a. To generate an object file, use the -c flag with gcc:

$ gcc -c fact.c -o fact.o

Flags:
-c  Compile or assemble the source files, but do not link. The compiler output is object files corresponding to each source file.
-o  Specify the output file name.

A static library can contain more than one object file. We need to collect all the object files into a single file. This single file is our static library. We can use the archiver (ar) to create it:

$ ar rcs libfact.a fact.o

Options:
r  Insert the files into the archive (with replacement).
c  Create the archive.
s  Write an object-file index into the archive, or update an existing one, even if no other change is made to the archive.

Note: the library name must start with the three letters lib and have the suffix .a.

A program using the library:


main.c
#include <stdio.h>
#include "fact.h"

int main(int argc, char *argv[])
{
    printf("%d\n", fact(4));
    return 0;
}

Linking our static library:


$ gcc -static main.c -L. -lfact -o fact

Options:
-L  Add a directory to the list of directories to be searched for libraries given with -l.

Now run the program:

$ ./fact
24

Shared Libraries with gcc:


Shared libraries are libraries that are loaded by programs when they start. When a shared library is installed or updated, all programs that start afterwards automatically use the new version of the library.

Code for the library:


This is the code that goes into the library: a function that calculates and returns the factorial of a number.

fact.c

int fact (int n)
{
    int fact = 1;
    if (n == 0 || n == 1)
        return 1;
    while (n > 1) {
        fact = fact * n;
        n--;
    }
    return fact;
}

Now create a header file to declare our function.

fact.h

int fact (int);

Creating Shared library:


First, create the object files that will go into the shared library, using the gcc -fPIC or -fpic flag. The -fPIC and -fpic options enable position-independent code generation, a requirement for shared libraries:

$ gcc -fPIC -Wall -g -c fact.c

Every shared library has a special name called the soname. The soname has the prefix lib, the name of the library, the phrase .so, followed by a period and a version number. We can use ld, the GNU linker, to create our shared library. ld combines a number of object and archive files, relocates their data and ties up symbol references:

$ ld -shared -soname libfoo.so.1 -o libfoo.so.1.0 fact.o

Installing Shared library:


Once you've created a shared library, you'll want to install it. The simple approach is to copy the library into one of the standard directories (e.g., /usr/lib) and run ldconfig(8). Now copy libfoo.so.1.0 to /usr/lib:

$ sudo cp libfoo.so.1.0 /usr/lib/
$ sudo ldconfig -n /usr/lib/

Now create a symbolic link to our library:

$ sudo ln -sf /usr/lib/libfoo.so.1 /usr/lib/libfoo.so

Linking our Shared library:


This is the program that uses our foo library.

main.c

#include <stdio.h>
#include "fact.h"

int main(int argc, char *argv[])
{
    printf("%d\n", fact(4));
    return 0;
}

Compile:

$ gcc main.c -lfoo -o fact

Now run the program:

$ ./fact
24

Lisp: Tears of Joy, Part 1

Lisp has been hailed as the world's most powerful programming language. But only a few programmers use it because of its cryptic syntax and reputation for being appropriate only for those in academia. This is rather unfortunate, since Lisp isn't that hard to grasp. Currently, only the top percentile of programmers use Lisp; so if you want to be among the crème de la crème, these articles are for you. I am going to hate my job. Those were my initial thoughts when I received an assignment at work, a few years ago. I had been asked to maintain and leverage a client module written in Lisp. My perception of Lisp was that of an ancient functional programming language with a cryptic syntax, used only by academicians and scientists to carry out experiments in the domain of Artificial Intelligence. And Lisp's ever-present parentheses were enough to drive anyone crazy! At that time, I believed that I was an ace at a cool, new-age, object-oriented programming language, which was the medium of my expression: I ate, drank, slept, and dreamt in that language. It made me the God of my machine universe. I also believed I was exceptionally attractive to women, and spent hours in front of the mirror doing my hair, and drove my sluggish and worn-out Kinetic Honda as if it was a 1340cc Suzuki Hayabusa. I now know better.

Déjà vu
Those of us who witnessed the shift from the non-structured programming paradigm to procedural programming, and then towards object-oriented programming, will relate to what Paul Graham had to say in his book, Hackers & Painters. It is something to the effect of: You can't trust the opinion of others regarding which programming language will make you a better programmer. You're satisfied with whatever programming language you happen to use, because it dictates the way you think about programs. I know this from my own experience, as a high-school kid writing programs in BASIC. That language didn't even support recursion. It's hard to imagine writing programs without using recursion, but I didn't miss it at the time. I thought in BASIC. And I was a whiz at it. Master of all I surveyed.

Three weeks into hacking Lisp, I had a feeling of déjà vu; the previous experience being when I first progressed from BASIC to C, and from C to C++ and Java. With each leap, I would be happily surprised with the growing power (of programming) at my fingertips. Time and again, I would wonder how I ever coded without objects, methods, encapsulation, polymorphism, inheritance, etcetera! One may say that it was syntactic sugar at work. But not with Lisp, which is pure ecstasy. It's not just beautiful, but strangely beautiful. In his famous essay, How to Become a Hacker, Eric Steven Raymond (author of many best-sellers, including The Cathedral and the Bazaar) writes: "Lisp is worth learning for the profound enlightenment experience you will have when you finally get it. That experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot."

Lisp enlightens you as a hacker


So what's so great about Lisp? How does it enlighten you as a hacker? Lisper Paul Graham explains this so proficiently and methodically that it would be inappropriate to answer this question in any words other than his. The five languages (Python, Java, C/C++, Perl, and Lisp) that Eric Raymond recommends to hackers fall at various points on the power continuum. Where they fall relative to one another is a sensitive topic, but I think Lisp is at the top. And to support this claim, I'll tell you about one of the things I find missing when I look at the other four languages. How can you get anything done in them without macros?

Many languages have something called a macro. But Lisp macros are unique. Lisp code is made out of Lisp data objects, and not in the trivial sense that the source files contain characters, and strings are one of the data types supported by the language. Lisp code, after it's read by the parser, is made of data structures that you can traverse. If you understand how compilers work, you'll realise that what's really going on is not so much that Lisp has a strange syntax (parentheses everywhere!), but that it has no syntax. You write programs in the parse trees that get generated within the compiler when other languages are parsed. But these parse trees are fully accessible to your programs. You can write programs that manipulate them. In Lisp, these programs are called macros. They are programs that write programs. (If you ever were to enter The Matrix, you'd be happy that you were a Lisp maestro.)

We know that Java must be pretty good, because it is the cool programming language. Or is it? Within the hacker subculture, there is another language called Perl that is considered a lot cooler than Java. But there is yet another, Python, whose users tend to look down on Perl, and another called Ruby, that some see as the heir apparent to Python.
If you look at these languages in order (Java, Perl, Python and Ruby), you'll notice an interesting pattern; at least, you'll notice this pattern if you are a Lisp hacker. Each one is progressively more like Lisp. Python copies even those features that many Lisp hackers consider to be mistakes. And if you'd shown people Ruby in 1975 and described it as a dialect of Lisp with syntax, no one would have argued with you. Programming languages have almost caught up with 1958! Lisp was first discovered by John McCarthy in 1958, and popular programming languages are only now catching up with the ideas he developed back then.

Lisp enlightens you as an individual


All married men would relate to Steven Levy when he illustrates an example of how hackers think, in his book, Hackers: Heroes of the Computer Revolution. Marge Saunders would drive back into the garage on a weekend morning, and would ask her husband, Bob, "Would you like to help me bring in the groceries?" He would reply, "No." Stunned, she would drag in the groceries herself. After the same thing occurred a few times, she exploded, hurling curses at him and demanding to know why he always said "No" to her question. "That's a stupid question to ask," he said. "Of course I won't like to help you bring in the groceries. If you ask me if I'll help you bring them in, that's another matter." When I used to program in my favourite OO programming language, my response was no different. Luckily for me, I discovered Lisp. It gave me a holistic view of the self, the cosmos, and also taught me that there are better responses to a question than a simple Yes/No. From then on, I have learned that the right answer to a Marge Saunders-like question would be, "Sure, dear! Do you need me to do anything else for you?" Needless to say, my wife is a happier person, and we celebrated our seventh anniversary last month.

The functional programming edge


In his famous paper, Why Functional Programming Matters, computer scientist R John M Hughes says that conventional languages place conceptual limits on the way problems can be modularised. Functional languages push those limits back. Two features of functional languages in particular, higher-order functions and lazy evaluation, can contribute greatly to modularity. As an example, Lisp allows us to manipulate lists and trees, program several numerical algorithms, and implement the alpha-beta heuristic (an algorithm from Artificial Intelligence used in game-playing programs). Since modularity is the key to successful programming, functional languages like Lisp are vitally important to the real world.

Getting started
Any language that obeys the central principles of Lisp is considered a Lisp dialect. However, the vast majority of the Lisp community uses two Lisps: ANSI Common Lisp (often abbreviated CL) and Scheme. Here, I will be exclusively talking about the ANSI Common Lisp dialect, the slightly more popular of the two. Many great Lisp compilers are available, but one in particular is the easiest to get started with: CLISP, an open source Common Lisp compiler. It is simple to install, and runs on any operating system Windows platforms, Macs, and Linux variants. Mac users may want to consider LispWorks, which will be easier to get running on their machines.

Installing CLISP
You can download a CLISP installer from its official website. On Windows, you simply run an installer program. On a Mac, there are some additional steps, which are detailed on the website. On a Debian-based Linux machine, you should find that CLISP already exists in your standard software repositories: just run apt-get install clisp at the command line, and you'll have CLISP installed automatically. For other Linux distributions (Fedora, openSUSE, etc.), you can use the standard packages listed under Linux packages on the CLISP website (if it's not available in the distro repository).

Starting it up
To start CLISP, run clisp at your command line. If all goes according to plan, you'll see the prompt shown in Figure 1.

Figure 1: Starting CLISP

Like all Common Lisp environments, CLISP will automatically place you into a read-eval-print loop (REPL) after you start it up. This means that you can immediately start typing in Lisp code. Try it out by typing (* 7 (+ 4 3)). You'll see the result printed below the expression:

> (* 7 (+ 4 3))
49

In the expression (* 7 (+ 4 3)), the * and the + are called operators, and the numbers 7, 4, and 3 are called the arguments. In everyday life, we would write this expression as ((4 + 3) * 7), but in Lisp, we put the operator first, followed by the arguments, with the whole expression enclosed in parentheses. This is called prefix notation, because the operator comes first. By the way, if you make a mistake and CLISP starts acting crazy, just type :q and it'll fix everything. When you want to shut down CLISP, just type (quit).
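As a quick cross-check of the prefix example, the familiar infix form evaluates to the same value. Shell arithmetic is used here (rather than Lisp) purely so the result can be verified non-interactively:

```shell
# Lisp prefix: (* 7 (+ 4 3))  is the infix expression  (4 + 3) * 7
result=$(( (4 + 3) * 7 ))
echo "$result"
```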

Whats under the hood?


Let's not go down the traditional route of starting with the ABCs (learning the syntax of the language, its core features, etc.). Sometimes, the promise of what lies beneath is more tantalising than baring it all. Conrad Barski gets you excited about Lisp by showing you how to write a game in it. Let's adopt his method, and write a simple command-line game using the binary search algorithm. We know that the binary search technique continually divides the data in half, progressively narrowing down the search space, until it finds a match, or there are no more items to process. It's the classic guess-my-number game. Ask your friend (or better, your non-technical boss, who yelled at you the last time you fell asleep at a meeting) to pick a number between 1 and 100 (in his head), and your Lisp program will guess it in no more than 7 iterations.

This is how Barski explains the game: To create this game, we need to write three functions: guess-my-number, smaller, and bigger. The player simply calls these functions from the REPL. To call a function in Lisp, you put parentheses around it, along with any parameters you wish to give the function. Since these particular functions don't require any parameters, we simply surround their names in parentheses when we enter them. Here's the strategy behind the game:

1. Determine the upper and lower (big and small) limits of the player's number. In our case, the smallest possible number would be 1, and the biggest would be 100.
2. Guess a number in between these two numbers.
3. If the player says the number is smaller, lower the big limit.
4. If the player says the number is bigger, raise the small limit.
5. We'll also need a mechanism to start over with a different number.

In Common Lisp, functions are defined with defun, as follows:

(defun function_name (arguments)
  ...)

As the player calls the functions that make up our game, the program will need to track the small and big limits. In order to do this, we'll need to create two global variables called *small* and *big*. A variable that is defined globally in Lisp is called a top-level definition. We can create new top-level definitions with the defparameter function:

> (defparameter *small* 1)
*SMALL*
> (defparameter *big* 100)
*BIG*

The asterisks surrounding the names *big* and *small*, affectionately called earmuffs, are completely arbitrary and optional. Lisp sees the asterisks as part of the variable names, and ignores them. Lispers like to mark all their global variables in this way as a convention, to make them easy to distinguish from local variables, which we'll discuss in subsequent articles. Also, spaces and line breaks are completely ignored when Lisp reads in your code. Now, the first function we'll define is guess-my-number.
This uses the values of the *big* and *small* variables to generate a guess of the player's number. The definition looks like the following:

> (defun guess-my-number ()
    (ash (+ *small* *big*) -1))
GUESS-MY-NUMBER

Whenever we run a piece of code like this in the REPL, the resulting value of the entered expression will be printed. Every command in ANSI Common Lisp generates a return value. The defun command, for instance, simply returns the name of the newly created function. This is why we see the name of the function parroted back to us in the REPL after we call defun. What does this function do? As discussed earlier, the computer's best guess in this game will be a number in between the two limits. To accomplish this, we choose the average of the two limits. However, if the average number ends up being a fraction, we'll want to use a near-average number, since we're guessing only whole numbers.

We implement this in the guess-my-number function by first adding the numbers that represent the high and low limits, then using the arithmetic shift function, ash, to halve the sum of the limits and shorten the result. The built-in Lisp function ash looks at a number in binary form, and then shifts its binary bits to the left or right, dropping any bits lost in the process. For example, the number 11 written in binary is 1011. We can move the bits in this number to the left with ash by using 1 as the second argument:

> (ash 11 1)
22

We can move the bits to the right (and lop off the bit on the end) by passing in -1 as the second argument:

> (ash 11 -1)
5

Let's see what happens when we call our new function:

> (guess-my-number)
50

Since this is our first guess, the output we see when calling this function tells us that everything is working as planned: the program picked the number 50, right between 1 and 100. Now, let's write our smaller and bigger functions. Like guess-my-number, these are global functions defined with defun:

> (defun smaller ()
    (setf *big* (1- (guess-my-number)))
    (guess-my-number))
SMALLER
> (defun bigger ()
    (setf *small* (1+ (guess-my-number)))
    (guess-my-number))
BIGGER

First, let's use defun to start the definition of a new global function, smaller. Next, use the setf function to change the value of our global variable *big*. Since we know the number must be smaller than the last guess, the biggest it can now be is 1 less than that guess. The code (1- (guess-my-number)) calculates this: it first calls our guess-my-number function to get the most recent guess, and then it uses the function 1-, which subtracts 1 from the result. Finally, we want our smaller function to show us a new guess. We do this by putting a call to guess-my-number as the final line in the function body. This time, guess-my-number will use the updated value of *big*, causing it to calculate the next guess.
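The shifts that ash performs, and the halving trick built on them, are easy to check with shell arithmetic (shell rather than Lisp here, purely so the numbers can be verified outside a REPL):

```shell
# 11 is 1011 in binary; shifting left appends a 0, shifting right drops the low bit.
left=$(( 11 << 1 ))           # 10110 in binary -> 22, like (ash 11 1)
right=$(( 11 >> 1 ))          # 101 in binary   -> 5,  like (ash 11 -1)

# (ash (+ *small* *big*) -1) with the starting limits 1 and 100:
guess=$(( (1 + 100) >> 1 ))   # 101 halved and floored -> 50
```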
The final value of our function will be returned automatically, causing our new guess (generated by guess-my-number) to be returned by the smaller function. The bigger function works in exactly the same manner, except that it raises the *small* value instead. After all, if you call the bigger function, you are saying your number is bigger than the previous guess, so the smallest it can now be (which is what the *small* variable represents) is one more than the previous guess. The function 1+ simply adds 1 to the value returned by guess-my-number.

To complete our game, let's add a function start-over, to reset our global variables:

    > (defun start-over ()
        (defparameter *small* 1)
        (defparameter *big* 100)
        (guess-my-number))

As you can see, the start-over function resets the values of *small* and *big* and then calls guess-my-number again to return a new starting guess. Whenever you want to start a brand-new game with a different number, you can call this function to reset the game. Figure 2 shows our game in action, with the number 74 as our guess.
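For easy reference, here is the whole game gathered into a single listing. This simply collects the definitions discussed above into one place, with the two defparameter forms for the range limits included at the top:

```lisp
;; Top and bottom of the current guessing range.
(defparameter *small* 1)
(defparameter *big* 100)

;; The computer's guess: the midpoint of the range,
;; computed by halving the sum with an arithmetic shift.
(defun guess-my-number ()
  (ash (+ *small* *big*) -1))

;; Player says "smaller": the answer is below the last guess,
;; so the new upper limit is one less than that guess.
(defun smaller ()
  (setf *big* (1- (guess-my-number)))
  (guess-my-number))

;; Player says "bigger": the answer is above the last guess,
;; so the new lower limit is one more than that guess.
(defun bigger ()
  (setf *small* (1+ (guess-my-number)))
  (guess-my-number))

;; Reset the range and make a fresh first guess.
(defun start-over ()
  (defparameter *small* 1)
  (defparameter *big* 100)
  (guess-my-number))
```

Load this into the REPL, call (start-over), and answer each guess with (smaller) or (bigger) until the program converges on your number.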

Figure 2: The game in action

In forthcoming articles, we'll go the ABC way, starting with the basic syntax and semantics of the language. Power corrupts. Lisp is power. Study it hard. Be evil. And let's plan for world domination!

Lisp: Tears of Joy, Part 2

I think that it's extraordinarily important that we in computer science keep the fun in computing. When it started out, it was an awful lot of fun. Of course, the paying customers got shafted every now and then, and after a while, we began to take their complaints seriously. We began to feel as if we really were responsible for the successful, error-free perfect use of these machines. I don't think we are. I think we're responsible for stretching them, setting them off in new directions, and keeping fun in the house. I hope the field of computer science never loses its sense of fun. Above all, I hope we don't become missionaries. Don't feel as if you're a Bible salesman. The world has too many of those already. What you know about computing, other people will learn. Don't feel as if the key to successful computing is only in your hands. What's in your hands, I think and hope, is intelligence: the ability to see the machine as more than when you were first led up to it, that you can make it more.

That quote by Alan J Perlis (the first recipient of the Turing Award) would resonate with those who read my previous scholarly article on Lisp, which, after all, is about having fun and stretching the capabilities of the computer and the programming language itself. One of the reasons Lisp does that is because it is designed to be extensible. Read on for more details. There is no substitute for a good book while learning Lisp. Start with a free resource. I recommend the following resources that are available online:

On Lisp by Paul Graham
Practical Common Lisp by Peter Seibel
Successful Lisp by David B Lamkins

The following text snippets are from Paul Graham's website, which carries extracts of his books, On Lisp and ANSI Common Lisp.

Bottom-up programming
Lisp is designed to be extensible; it lets you define new operators yourself. This is possible because the Lisp language is made out of the same functions and macros as your own programs. So it's no more difficult to extend Lisp than to write a program in it. In fact, it's so easy (and so useful) that extending the language is standard practice. As you're writing your program down toward the language, you build the language up toward your program. You work bottom-up, as well as top-down. Almost any program can benefit from having the language tailored to suit its needs, but the more complex the program, the more valuable bottom-up programming becomes. A bottom-up program can be written as a series of layers, each one acting as a sort of programming language for the one above. TeX was one of the earliest programs to be written this way. You can write programs bottom-up in any language, but Lisp is by far the most natural vehicle for this style. Bottom-up programming leads naturally to extensible software. If you take the principle of bottom-up programming all the way to the topmost layer of your program, then that layer becomes a programming language for the user. Because the idea of extensibility is so deeply rooted in Lisp, it makes for the ideal language to write extensible software. Working bottom-up is also the best way to get reusable software. The essence of writing reusable software is to separate the general from the specific; bottom-up programming inherently creates such a separation. Instead of devoting all your efforts towards writing a single, monolithic application, you devote part of your effort to building a language, and part to writing a (proportionately smaller) application on top of it. What's specific to this application will be concentrated in the topmost layer. The layers beneath will form a language for writing applications like this one. And what could be more reusable than a programming language?
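As a tiny illustration of this layering (my own sketch, not from Graham's text), suppose a program repeatedly needs the average of some numbers. In Lisp, you simply grow the language a new operator, and the rest of the program is then written in terms of it:

```lisp
;; A new "operator" for our program's vocabulary:
;; the average of any number of arguments.
(defun average (&rest numbers)
  (/ (reduce #'+ numbers)
     (length numbers)))

;; Higher layers of the program can now speak in terms of
;; average, as if it had always been part of the language.
(average 1 2 3 4 5)   ; => 3
```

Once average exists, it is indistinguishable from a built-in operator at the call site; that is the sense in which the language grows up toward the program.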

Rapid prototyping
Lisp allows you to not just write more sophisticated programs, but to write them faster. Lisp programs tend to be short; the language gives you bigger concepts, so you don't have to use as many. As Frederick Brooks (best known for his book The Mythical Man-Month) has pointed out, the time it takes to write a program depends mostly on its length. So this fact alone means that Lisp programs take less time to write. The effect is amplified by Lisp's dynamic character: in Lisp, the edit-compile-test cycle is so short that programming happens in real-time. Bigger abstractions, and an iterative environment, can change the way organisations develop software. The phrase "rapid prototyping" describes a kind of programming that began with Lisp: in Lisp, you can often write a prototype in less time than it would take to write the spec for one. What's more, such a prototype can be so abstract that it makes a better spec than one written in English. And Lisp allows you to make a smooth transition from prototype to production software. When Common Lisp programs are written with an eye for speed and compiled by modern compilers, they run as fast as programs written in any other high-level language. All of this obviously means that you can now spend less time working, and finally take your family out for that dinner you've been promising them for the last three years. Happy boss, happy family.

Questions answered, and a response to cynical feedback


I received many questions about the Lisp programming language after the first article was published in the June 2011 edition of LINUX For You. Fortunately, David B Lamkins, author of the book Successful Lisp, has provided comprehensive answers to most of these questions here. The questions he answers are:

1. I looked at Lisp earlier, and didn't understand it.
2. I can't see the program for the parentheses.
3. Lisp is very slow compared to my favourite language.
4. No one else writes programs in Lisp.
5. Lisp doesn't let me use graphical interfaces.
6. I can't call other people's code from Lisp.
7. Lisp's garbage collector causes unpredictable pauses when my program runs.
8. Lisp is a huge language.
9. Lisp is only for artificial intelligence research.
10. Lisp doesn't have any good programming tools.
11. Lisp uses too much memory.
12. Lisp uses too much disk space.
13. I can't find a good Lisp compiler.

Leave a comment if you couldn't locate the answer to your question in the link provided, or were unsatisfied with the quality or quantum of information presented, and I will try to respond with the profound knowledge locked up in my highly-evolved brain. A riveting quality of my mind is that it spontaneously goes into a meditative state half-way through a negative feedback, and instantly rejects the hypothesis that I may be too dumb to understand that I am a bad writer. My superior brain and eyes then work in sync to dismiss any dissenting remarks or profanities. This enables me to rant on with the technical mumbo-jumbo that you now hold in your hand. If you sent negative feedback about my previous article to the editor, and did not receive a reply, you now know why!

Getting acquainted
Lisp, whose name is an acronym for LISt Processing, was designed to provide symbol-manipulating capabilities to attack programming problems such as symbolic differentiation and the integration of algebraic expressions. Despite its inception as a mathematical formalism, Lisp is a practical programming language. The basic elements of Lisp include its primary data structure, called the s-expression. This also includes the Lisp interpreter, which is the heart of any Lisp system, and is basically a machine that carries out processes described in the Lisp language. The Lisp interpreter performs computations on s-expressions through a process called evaluation. Since its earliest days, Lisp implementations have been variously interpreted or compiled, and often both. No modern commercial version of Lisp is without a compiler. The fact that modern Lisp variants often come with an interpreter as well, is simply a convenience for some implementations, to encourage late-binding semantics and promote flexibility, including interactive debugging.

The prompt
I assume that you have CLISP on your machine (please read Part I for instructions on how to install and use CLISP). The exact command to start a Lisp process may vary from one OS to another. Any Lisp system will include an interactive front-end called the toplevel. You type Lisp expressions into the toplevel, and the system displays their values. Lisp usually displays a prompt to tell you that it is waiting for you to type something. Many implementations of Common Lisp use > (the greater-than symbol) as the toplevel prompt; some may use an entirely different prompt, and others may issue no prompts at all.

S-expressions
A Lisp command begins with a parenthesis. Next, specify the name of an operation that you would like to perform. Then give the arguments you want to use. Finish the whole thing off with a final parenthesis. For example, if you want to add two numbers together, type something like the following code:

    > (+ 1 2)
    3

In this example, you only typed the expression in parentheses; the prompt and the answer were output by Lisp. In many implementations of Lisp, the user types a carriage return to signal that the command is complete. Other implementations will go to work as soon as they see the closing parenthesis. Also, some implementations will output additional blank lines along with the result. In the expression (+ 1 2), the + is called the addition operator, and the numbers 1 and 2 are called the arguments. In other programming languages, this expression would be written as 1 + 2, but in Lisp the + operator is put first, followed by the arguments, with the whole expression enclosed in a pair of parentheses. This is called prefix notation, because the operator comes first. Lisp's prefix notation offers convenience to programmers. For example, if you want to add five numbers together, in ordinary notation (and programming languages) you have to use + four times:

    1 + 2 + 3 + 4 + 5

while in Lisp, you just add other arguments:

    > (+ 1 2 3 4 5)
    15

Commands like these are called s-expressions, which stands for symbolic expressions. This is a very general term, applicable to just about anything one can say in Lisp. The basic elements of s-expressions are lists and atoms. Lists are delimited by parentheses, and can contain any number of elements separated by white-space. Atoms are everything else. The elements of lists are themselves s-expressions (in other words, atoms or nested lists). More about lists and atoms will follow in the next article.
Comments, which, technically speaking, are not s-expressions, start with a semi-colon, extend to the end of a line, and are essentially treated like white-space:

    > ;demo of multiplication operator
      (* 3 7)
    21

Because operators can take a varying number of arguments, we need parentheses to show where an expression begins and ends. Expressions can be nested; that is, the arguments in an expression may themselves be complex expressions. Suppose you want to compute the value of ((4*5)/(2+3)); the equivalent s-expression in Lisp is:

    > (/ (* 4 5) (+ 2 3))
    4

Evaluation
Lisp evaluates everything. It even evaluates its arguments. In Lisp, numbers evaluate to themselves; that is, any time Lisp tries to evaluate 1, the answer is always 1:

    > 1
    1
    > 2
    2

In Lisp, + is a function, and an expression like (+ 1 2) is a function call. When Lisp evaluates a function call, it does so in two steps:

1. First, the arguments are evaluated, from left to right. In the case of (+ 1 2), each argument evaluates to itself, so the values of the arguments are 1 and 2, respectively.
2. The values of the arguments are passed to the function named by the operator. In this case, it is the + function, which returns 3.

If any of the arguments are themselves function calls, they are evaluated according to the same rules. So when (/ (* 4 5) (+ 2 3)) is evaluated, this is what happens:

1. Lisp evaluates (* 4 5): 4 evaluates to 4 and 5 evaluates to 5. These values are passed to the function *, which returns 20.
2. Lisp evaluates (+ 2 3): 2 evaluates to 2 and 3 evaluates to 3. These values are passed to the function +, which returns 5.
3. The values 20 and 5 are sent to the function /, which returns 4.

Not all the operators in Common Lisp are functions, but most are. And function calls are always evaluated this way. The arguments are evaluated left-to-right, and their values are passed to the function, which returns the value of the expression as a whole. This is called the evaluation rule for Common Lisp. Lisp is somewhat unusual in that virtually everything gets done by executing a function. In fact, there is really no more to Lisp than that. All the complexity of Lisp comes from the particular functions Lisp provides, and the details of how various types of functions are treated by the interpreter.

Call 911!!!
If you type something that Lisp can't understand, it will display an error message and put you into a program called a debugger. The ordinary Lisp debugger provides the ability to suspend evaluation of a form. While the evaluation is suspended (a state commonly known as a break loop), you may examine the run-time stack, check the values of local or global variables, or change those values.

The exact nature of the debugger is not part of Common Lisp, per se. Rather, this is left to the creators of various Common Lisp implementations. For example, there is no Common Lisp standard specifying the nature of the prompt to be displayed after an error. Nor is there standardisation of the commands the debugger will accept. In some implementations, the user normally does not type a command at all; rather, a special key is reserved just for this purpose. In CLISP, to get out of the break loop and get back to the toplevel, just type :q as shown below:

    [1]> (/ 1 0)
    *** - /: division by zero
    The following restarts are available:
    ABORT    :R1    Abort main loop
    Break 1 [2]> :q
    [3]>

You must determine how the equivalent of :q has been realised in your local Lisp implementation. Experiment, check the implementation manual, or consult a local expert.

What next?
All the X-Men (movie) fans out there will agree with me when I say that the latest addition to the series, X-Men: First Class, has been the best yet. What could be better than learning about the origins of characters and a story line you have come to love? This, in fact, seems to be a trend in Hollywood, with movies like Star Wars, the Batman series, etc. I am going to follow that trend, and in my third article, give you a glimpse of the evolution of Lisp: the modest beginnings of the language you have now started to love (if you are still reading this, you probably have; or is it just my charming self you cannot get enough of?!). I will also pick up the pace, as we have barely scratched the surface of Lisp. After all, plans for world domination are never fun if they take forever!

Lisp: Tears of Joy, Part 3

Genesis
By now, most of us agree that LISP is the world's greatest programming language. For those of you who still disagree, and think it probably stands for "Lots of Irritating Superfluous Parentheses", my suggestion would be to read this article with a few bottles of strong beer (for the hacker types) or wine (for the high-brow literary elite). Guy L Steele Jr and Richard P Gabriel, in their paper The Evolution of LISP [PDF], say that the origin of LISP was guided more by institutional rivalry, one-upmanship, and the glee born of technical cleverness that is characteristic of the hacker culture, than by a sober assessment of technical requirements. How did it all start? Early thoughts about a language that eventually became LISP started in 1956, when John McCarthy attended the Dartmouth Summer Research Project on Artificial Intelligence. Actual implementation began in the fall of 1958. These are excerpts from the essay Revenge of the Nerds by Paul Graham: LISP was not really designed to be a programming language, at least not in the sense we mean today, which is something we use to tell a computer what to do. McCarthy did eventually intend to develop a programming language in this sense, but the LISP that we actually ended up with was based on something separate that he did as a theoretical exercise: an effort to define a more convenient alternative to the Turing Machine. As McCarthy said later, "Another way to show that LISP was neater than Turing machines was to write a universal LISP function, and show that it is briefer and more comprehensible than the description of a universal Turing machine." This was the LISP function eval, which computes the value of a LISP expression. Writing eval required inventing a notation representing LISP functions as LISP data, and such a notation was devised for the purposes of the paper, with no thought that it would be used to express LISP programs in practice.
Some time in late 1958, Steve Russell, one of McCarthy's graduate students, looked at this definition of eval and realised that if he translated it into machine language, the result would be a LISP interpreter.

This was a big surprise at the time. Here is what McCarthy said about it later in an interview: "Steve Russell said, 'Look, why don't I program this eval', and I said to him, 'Ho, ho, you're confusing theory with practice; this eval is intended for reading, not for computing.' But he went ahead and did it. That is, he compiled the eval in my paper into [IBM] 704 machine code, fixing bugs, and then advertised this as a LISP interpreter, which it certainly was. So at that point LISP had essentially the form that it has today." Suddenly, in a matter of weeks, I think, McCarthy found his theoretical exercise transformed into an actual programming language, and a more powerful one than he had intended. (Read the complete essay here.) Now, despite how much I love LISP and think that all the pages of LINUX For You should be dedicated to it, I also know that if I want to keep writing for this magazine, I need to stick to the word limit. So, as interesting as the origin of LISP is, we should now move on to the real task at hand (and for those who've lost the plot, it is learning LISP).

Defining local variables in LISP


To define a local variable, use the let command. A let expression has two parts. The first is a list of variable declarations, where we can declare one or more local variables valid within the body of the let. The second is the body of the let, where we can use these variables. The expressions in the body are evaluated in order. Here's an example:

    > (let ((x 18)
            (y 13)
            (z 15))
        (+ x y z))
    46

Here, I have defined three local variables x, y, and z, and assigned them values of 18, 13 and 15, respectively. Each pair of local variable names and values assigned to them needs to be surrounded by a set of parentheses. Also, the indentation, white-space, and line breaks are discretionary; therefore, you can stack up the variables and their values in a let expression to resemble a table. This is why I have placed y directly underneath x, and z directly below y. Figure 1 identifies the parts of the let.

Figure 1: 'let' expression
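One detail worth noting (my own aside, beyond the figure): the value given to each let variable need not be a literal number. It can be any expression, and each value expression is evaluated before the body runs. A small sketch:

```lisp
;; Each initial value is itself evaluated first,
;; so locals can be computed rather than literal.
(let ((sum   (+ 18 13 15))   ; evaluates to 46
      (count 3))
  (/ sum count))             ; => 46/3, a Lisp rational
```

Note that the result here is the exact rational 46/3, not a truncated integer; Common Lisp keeps ratios exact unless you ask for floating-point arithmetic.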

Defining local functions in LISP


To define a local function, use the flet command. This, too, has two parts: (1) the function declaration; and (2) the flet body, where the function may be called. The function declaration itself consists of a function name, the arguments to that function, and the function body, where we put the function's code. As with let, we can define one or more functions within the scope of the flet. But remember that these are local functions visible only in the body; they may not call one another, and so cannot be recursive. Here's an example:

    > (flet ((foo (n) (+ n 5))
             (bar (n) (+ n 10)))
        (bar (foo 2)))
    17

Here, I have declared two functions: foo and bar. In this code example, the body of the flet first calls foo with 2 as its argument, to return a value of 7; it then calls bar to add 10 to this value, leading to 17 as the final value of the whole flet expression. See Figure 2.

Figure 2: 'flet' expression

Let's see what happens when bar tries to call foo in its function body:

    > (flet ((foo (n) (+ n 5))
             (bar (n) (+ (foo 2) n)))
        (bar 10))
    *** - EVAL: undefined function FOO
    The following restarts are available:
    USE-VALUE      :R1      Input a value to be used instead of (FDEFINITION 'FOO).
    RETRY          :R2      Retry
    STORE-VALUE    :R3      Input a new value for (FDEFINITION 'FOO).
    ABORT          :R4      Abort main loop
    >

That's right! You get slapped in the face with an error (Figure 3). The flet command creates local functions, which are invisible inside another function body.

To define recursive local functions, use labels. Local functions defined by a labels expression can refer to any other functions defined there, including themselves. It is identical in its basic structure to the flet command. Here's the preceding code, written with labels:

    > (labels ((foo (n) (+ n 5))
               (bar (n) (+ (foo 2) n)))
        (bar 10))
    17

In this code example, the body of the labels expression calls the local function bar. Since labels allows local functions to see each other, the bar function calls another local function from inside its body. That is foo, which adds 2 to 5, resulting in a value of 7, which in turn is added to 10, the argument of the bar function, resulting in a final value of 17. This is where I have to leave you this month (as I have run out of space!), but before I go, I would like to thank everybody who wrote in to provide feedback about the article last month.
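Since labels lets a local function see itself, it is also the natural tool for a local recursive helper. Here is a small sketch of my own (not one of the article's examples): a countdown defined entirely inside a labels expression, something flet could not express because inside flet the function's own name is unknown:

```lisp
;; A local recursive function, possible only with labels.
(labels ((count-down (n)
           (if (zerop n)
               '(lift-off!)                     ; base case
               (cons n (count-down (1- n))))))  ; recursive call
  (count-down 5))
;; => (5 4 3 2 1 LIFT-OFF!)
```

The recursion bottoms out when n reaches zero, and each level conses its own number onto the front of the list returned by the level below.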

Lisp: Tears of Joy, Part 4

... For God wrote in Lisp code / When He filled the leaves with green. / The fractal flowers and recursive roots: / The most lovely hack I've seen. / And when I ponder snowflakes, never finding two the same, / I know God likes a language with its own four-letter name. ...

Partial lyrics of the song God Lives on Terra by Julia Ecklar.

I would like to think these lines speak to all of you out there reading this article: the ones who've fallen in love with Lisp, the ones who still don't get what all the fuss is about, and those who need some more nudging to fall off the fence (onto the right, Lisp-loving side). For those of you in the second and third categories, I've answered a few of your questions below, which might help you along the path to enlightenment. I am standing on the shoulders of giants. Some text snippets are quoted from these two URLs:

1. http://c2.com/cgi/wiki?InterpretedLanguage
2. http://www.catalysoft.com/articles/goodAboutLisp.html

Lisp is slow because it is interpreted


Common Lisp is not an interpreted language. In fact, there is not even a reasonable definition of an interpreted language. The only two reasonable definitions of an interpreted language that I can think of are:

1. A language that can be implemented with an interpreter.
2. A language that must be implemented with an interpreter.

In the first case, all languages are interpreted. In the second case, no language is interpreted.

Sometimes, we confuse interpreted and interactive. We tend to think that whenever there is an interactive loop such as Lisp's read-eval-print loop, there must also be an interpreter. That is false. The eval part can very well be implemented with a compiler. Sometimes, the belief is that even though it is possible to implement Common Lisp with a compiler, it is usually done with an interpreter, and hence most implementations are slow. This might be the case for programs written by amateurs, but is seldom the case with systems written by hackers. Almost every Common Lisp system in existence uses a compiler. Most Common Lisp compilers are capable of generating very good, very fast code. Obtaining fast code may require the hacker to add declarations that assist the compiler in optimising the code, since fast object code is not derived from naïve source code. Also, mortals have a tendency to exaggerate the importance of speed. There are quite a number of applications that use very slow systems, such as Perl, Shell, Tcl/Tk, Awk, etc. It is not true that maximum speed is always required.
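As a sketch of what such declarations look like (the actual speed-up depends entirely on the implementation, so take the idea rather than any numbers), type and optimize declarations tell the compiler it may generate specialised code instead of fully generic arithmetic:

```lisp
;; Without declarations, the arithmetic must handle any
;; numeric type at run time; with them, a good compiler
;; can emit much tighter machine code.
(defun sum-squares (n)
  (declare (type fixnum n)
           (optimize (speed 3) (safety 1)))
  (let ((total 0))
    (dotimes (i n total)
      (incf total (* i i)))))

(sum-squares 4)   ; => 14  (0 + 1 + 4 + 9)
```

Implementations such as SBCL will even print compiler notes telling you where a declaration would let them optimise further, which makes this tuning process quite interactive.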

Lisp is not used in the industry


...or, "I've not seen Lisp in advertisements for employment." Lisp is used. Several major commercial software packages are written in Common Lisp or some other Lisp variant. It is hard to know in what language commercial software is written (since the user should not have to care), but there are a few that are well known. Interleaf, a documentation system, is written in Lisp. So is AutoCAD, a system for computer-aided design. Both are major applications in their domains. While not a commercial software system, Emacs is an important system written in Lisp. But even if Common Lisp were not used at all in industry, this would not be a good argument. The level of sophistication of the software industry is quite low with respect to programming languages, tools and methods. The universities should teach advanced languages, tools and methods with the hope of having industry use them one day, as opposed to teaching bad ones that happen to be used today. Students who want training in particular tools that happen to be in demand at the moment should quit university, and apply for more specific training programs. Lisp was, and is, one of the dominant languages for Artificial Intelligence (AI) programming, but should not be seen as a language that is exclusive to the AI world. The AI community embraced the language in the 1980s because it enabled rapid development of software in a way that was not possible with other mainstream languages of the time, such as C. In the 1980s and early 1990s, the emphasis of mainstream software development methodologies was on getting it right the first time, an approach that demanded upfront effort on careful system specifications, and did not allow for changes in specifications during later stages of the software development life-cycle. In AI, software development needed to be much more agile, so that inevitable changes in specifications could more easily be accommodated by iterative development cycles.
Lisp is ideal for this, as it can be tested interactively and provides for concise, quick coding, using powerful high-level constructs such as list processing and generic functions. In other words, Lisp was ahead of its time, because it enabled agile software development before it became respectable in mainstream software. Besides, whether a language such as Common Lisp is used in the industry depends a lot more on the individual student than on industry. There is a widespread myth among students that the industry is this monolithic entity whose tools and methods cannot be altered. In reality, the industry consists of people. Whatever the industry uses is whatever the people working there use. Instead of refusing to learn sophisticated tools and techniques, the student can resolve to try and become one of the forerunners in the industry, after graduation.

Recursive or iterative?
Unlike some other functional programming languages, Lisp provides for both recursive and iterative programming styles. As an example, consider writing a function to compute the factorial of a positive integer n. The factorial, written n!, is the product of all integers between 1 and n; i.e., n! = n × (n-1) × (n-2) × ... × 2 × 1. A function to compute this number can be written in a recursive style as follows:

    (defun fac (n)
      (if (= n 0)
          1
          (* n (fac (- n 1)))))

You should read the definition of the function as follows: if n is zero, then return 1; otherwise, return the product of n and the factorial of n-1. The same function can be written in an iterative style as follows:

    (defun fac (n)
      (let ((result 1))
        (dotimes (i n)
          (setq result (* result (+ i 1))))
        result))

This function uses let to set up a local variable called result with an initial value of 1. It then performs n iterations of a central loop, each time multiplying the result by the loop's counter variable (one must be added to this counter, as dotimes counts from zero). Finally, the value of the result variable is returned as the result of the function.
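A third style worth mentioning (my own addition, beyond the article's two versions) is tail recursion with an accumulator: the recursive call is the very last thing the function does, so an implementation that optimises tail calls can run it in constant stack space, much like the loop. Note that the Common Lisp standard does not require tail-call optimisation, though many compilers perform it:

```lisp
;; Tail-recursive factorial: the running product travels
;; down in the acc parameter instead of piling up on the
;; stack as pending multiplications.
(defun fac-tail (n &optional (acc 1))
  (if (= n 0)
      acc
      (fac-tail (- n 1) (* n acc))))

(fac-tail 5)   ; => 120
```

Compare this with the first recursive version, where each call must wait for the inner (fac (- n 1)) to return before it can multiply; here, each call is already finished when it hands control to the next.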

Object-oriented
Like many other modern programming languages, Common Lisp is object-oriented. In fact, it was the first ANSI-standard object-oriented language, incorporating CLOS (the Common Lisp Object System). CLOS provides a set of functions that enable the definition of classes, their inheritance hierarchies and their associated methods. A class defines slots whose values carry information about object instances. The class definition can also specify default values for slots, and additional non-trivial behavioural mechanisms (such as integrity checking) performed at object creation time. As an example of object-orientation in Lisp, consider the definition of some shapes, which can be either circles or rectangles. A shape's position is given by x and y coordinates. Further, a circle has a radius, whereas a rectangle has a width and a height. It should be possible to compute the area of circles and rectangles. A working set of definitions is given below:

    (defclass shape () (x y))

    (defclass rectangle (shape)
      ((width  :initarg :width)
       (height :initarg :height)))

    (defclass circle (shape)
      ((radius :initarg :radius)))

    (defmethod area ((obj circle))
      (let ((r (slot-value obj 'radius)))
        (* pi r r)))

    (defmethod area ((obj rectangle))
      (* (slot-value obj 'width)
         (slot-value obj 'height)))

Using these definitions, a rectangle (instance) can be created with make-instance, for example, with a width of 3 and a height of 4:

    > (setq my-rect (make-instance 'rectangle :width 3 :height 4))
    #<RECTANGLE @ #x8aee2f2>

Its area can be computed by calling the method area:

    > (area my-rect)
    12

Below, I've attempted to address, in my clever-yet-so-simple manner of explaining concepts (I hope that's not laughter I hear), some of the topics my very-soon-to-be fellow Lispers have had trouble understanding from books.
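For symmetry (my own sketch, following the same pattern as the rectangle), a circle instance works the same way. The point to notice is that area is a single generic function, and CLOS dispatches to the right method based on the class of the argument:

```lisp
;; The generic function area picks the circle method
;; automatically, based on the class of its argument.
(setq my-circle (make-instance 'circle :radius 2))

(area my-circle)   ; => roughly 12.566, i.e. pi * 2 * 2
```

Because the circle method uses the built-in constant pi (a long-float in Common Lisp), the result is a floating-point number, unlike the exact integer returned for the rectangle.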

Quote
Lisp evaluates everything. Sometimes you don't want Lisp to do that; in situations like these, quote is used to override Lisp's normal evaluation rules. This quote is a special operator. It has a distinct evaluation rule of its own, which is to evaluate nothing. It takes a single argument, and returns it verbatim:

    > (quote (+ 7 2))
    (+ 7 2)

This may seem to be a mundane function, since it does nothing. But the need to prevent Lisp evaluating something arises so often that a shorthand notation is provided for quoting expressions. Common Lisp defines ' (a single quote or apostrophe) as an abbreviation for quote. Be careful not to substitute quote with ` (the backquote character). ` (referred to as backquote by Common Lisp programmers, and quasiquote by Scheme programmers) has a different interpretation, which I will cover in forthcoming articles.

    > '(the list (+ 2 3 5) sums 3 elements)
    (THE LIST (+ 2 3 5) SUMS 3 ELEMENTS)

Lisp programs are expressed as lists. This means that Lisp programs can generate Lisp code. Lisp programmers often write programs to write programs for them (such programs are called macros). The quote function is crucial to engineer this functionality. If a list is quoted, evaluation returns the list itself; if it is not quoted, the list is treated as Lisp code, and evaluation returns its value:

    > (list '(+ 0 1 1 2 3 5) (+ 0 1 1 2 3 5))
    ((+ 0 1 1 2 3 5) 12)

Atom, list, and predicates


Things inside a list that are not themselves lists, but words and numbers (in our examples so far), are called atoms. Atoms are separated by whitespace or parentheses. You may use the term atom to refer to just about any Lisp object that is not viewed as having parts. Atoms like 2.718 and 42 are called numeric atoms, or just numbers.

Atoms like foo, bar, and + are called symbolic atoms, or just symbols. Both atoms and lists are called symbolic expressions, or more succinctly, expressions. Symbolic expressions are also referred to as s-expressions, s being short for symbolic, to distinguish them from m-expressions, m being short for meta. Symbols are sometimes referred to as literal atoms. You can check whether an s-expression is an atom by using predicates: functions that return true or false when testing a value for a particular property. In Lisp, false is indicated by NIL, and true is often signalled by the special symbol T, but anything other than NIL is considered to indicate true. The predicate atom determines whether something is an atom; similarly, the function listp determines whether something is a list.

> (atom 'functional)
T
> (atom '(a b c))
NIL
> (listp '(a b c))
T
> (listp 2.718)
NIL

setq
Most programming languages provide a special notation for assignment. Lisp does no such thing. Instead, Lisp uses its routine workhorse, the function. Lisp has a special function, called setq, that assigns the value of its second argument to the symbol given as its first. To accomplish the equivalent of x <- 47 in Lisp, we write the following:

> (setq x 47)
47
> (setq y (+ 4 7))
11
> (+ 3 (setq z (* 3 7)))
24
> z
21
> (setq my-medium-of-expression "Lisp")
"Lisp"

setq gives each variable symbol the value of the corresponding value expression. The setq form can take any even number of arguments, which should be alternating symbols and values. It returns the value of its last value expression.

> (setq month "August" day 04 year 2011)
2011
> day
4
> month
"August"

Lisp tries to evaluate arguments before applying a function. When we evaluate a setq, the symbol (in our example, x, y, and z) may not be defined. This should cause an error; the only reason it does not is that setq is a special function, and Lisp may handle the arguments of each special function idiosyncratically. setq does not cause its first argument to be evaluated, while the second argument is subject to normal treatment. Fortunately, there are only a small number of special functions in Lisp.

What's next? Drum-roll (and if this were a television series, there would be fireworks going off now)... Macros. After weeks of reading about the power of macros, you will finally learn how to write them. For those of you who don't know what I'm talking about, think Agent Smith (the dark-glasses-wearing AI program in the movie The Matrix) turning into a self-replicating computer virus. If, for a second, you ignore the fact that he was the antagonist in the movie, and that nobody likes computer viruses, and focus instead on the concept of programs that can actually write programs, I am sure the start of next month will see you parked at a newsstand waiting for the next edition of LFY. And I look forward to it as well, as that will take us one step closer to our plans of world domination.

Lisp: Tears of Joy, Part 5

"If you tile a floor with tiles the size of your thumbnail, you don't waste many." (Paul Graham)

Doug Hoyte, creator of Antiweb, gives a lot of credit to macros for efficient Lisp performance. He says that while other languages give you small, square tiles, Lisp lets you pick tiles of any size and of any shape. With C, programmers use a language that is directly tied to the capabilities of a fancy fixnum adder. Aside from procedures and structures, little abstraction is possible in C. By contrast, Lisp was not designed around the capabilities and limitations of the machines. Instead of inquiring what makes a program fast, it's better to ask what makes a program slow. The root causes can be roughly classified into three broad categories: (1) bad algorithms, (2) bad data structures, and (3) general code. All language implementations need good algorithms. An algorithm is a presumably well-researched procedural description of how to execute a programming task. Because the investment required in coming up with an algorithm is so much larger than that of implementing one, the use of algorithms is ubiquitous throughout all of computer science. Somebody has already figured out how, why, and how quickly an algorithm works; all you have to do to use an algorithm is translate its pseudo-code into something that your system can understand. Because Common Lisp implementations have typically been well implemented and continuously improved upon for decades, they generally employ some of the best and quickest algorithms around for most common tasks. Good data structures are also necessary for any decent programming language. Data structures are so important that ignoring them will cause any language implementation to slow down to a crawl. Optimising data structures essentially comes down to a concept called locality, which basically says that the data accessed most frequently should be the fastest to access.

Data structures and locality can be observed clearly at almost every level of computing where performance gains have been sought: large sets of CPU registers, memory caches, databases, and caching network proxies, to name a few. Lisp offers a huge set of standard data structures, and they are generally implemented very well. If Lisp provides such good algorithms and data structures, how is it even possible that Lisp code can be slower than code in other languages? The explanation is based on the most important design decision of Lisp, i.e., general code, a concept otherwise familiar to us as duality of syntax. When we write Lisp code, we use as many dualities as possible; the very structure of the language encourages us to do so. Why are Lisp programs usually much shorter than programs in other programming languages? Part of the reason is that any given piece of Lisp code can be used for so much more than a corresponding piece of code in another language, so you reuse it more often. From the perspective of someone more familiar with another programming language, it can feel strange to have to write more to get less, but this is an important Lisp design decision: duality of syntax. The more dualities attached to each expression, the shorter a program seems to be. Does this mean that to achieve or exceed C's performance, we need to make our Lisp programs as long and dangerous as their corresponding C programs? No. Lisp has macros. So, enough of selling the concept; let's get down to it, shall we? We'll start with some basics.

Lambda: the anonymous function
At times, you'll need a function in only one place in your program. You could create a function with defun and call it just once. Sometimes, this is the best thing to do, because you can give the function a descriptive name that will help you read the program at some later date. But sometimes the function you need is so trivial or so obvious that you don't want to have to invent a name, or worry about whether the name might be in use somewhere else. For situations like this, Lisp lets you create an unnamed, or anonymous, function, using the lambda form. A lambda looks like a defun form without the name:

(lambda (a b c n) (+ (* a (* n n)) (* b n)))

You can't evaluate a lambda form on its own; it must appear only where Lisp expects to find a function, normally as the first element of a form:

> (lambda (a b c n) (+ (* a (* n n)) (* b n)))
#<anonymous interpreted function 21AE8A6A>
> ((lambda (a b c n) (+ (* a (* n n)) (* b n))) 1 3 5 7)
70
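Anonymous functions are most often handed directly to higher-order functions such as mapcar; here is a small sketch (my own example, not from the original session):

```lisp
;; Square each element of a list without naming the squaring function.
(mapcar #'(lambda (n) (* n n)) '(1 2 3 4))
;; => (1 4 9 16)
```

The lambda form exists only at its point of use, so there is no risk of its name colliding with anything else in the program.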

Macro
The term macro does not have the same meaning in Lisp as it has in other languages. A Lisp macro can be anything from an abbreviation to a compiler for a new language. Macros (in the Lisp sense) are still unique to Lisp. This is partly because in order to have macros, you probably have to make your language look as strange as Lisp. It may also be because if you do add that final increment of power, you can no longer claim to have invented a new language, but only a new dialect of Lisp. If you define a language that has car, cdr, cons, quote, cond, atom, eq, and a notation for functions expressed as lists, then you can build all the

rest of Lisp out of it. This is, in fact, the defining quality of Lisp: it was in order to make this possible that McCarthy gave Lisp the shape it has. Because macros can be used to transform Lisp into other programming languages and back, you'd soon discover that all other languages are just skins on top of Lisp. Programming with Lisp is programming at a higher level. Where most languages invent and enforce syntactic and semantic rules, Lisp is general and malleable. With Lisp, you make the rules. A defmacro form looks like a defun form. It has a name, a list of argument names, and a body. The macro body returns a form to be evaluated. In other words, you need to write the body of the macro such that it returns a form, not a value. When Lisp evaluates a call to your macro, it first evaluates the body of your macro definition, and then evaluates the result of the first evaluation. By way of comparison, a function's body is evaluated to return a value.

> (defmacro setq-literal (place literal)
    `(setq ,place ',literal))
SETQ-LITERAL
> (setq-literal a b)
B
> a
B
> (defmacro reverse-cons (rest first)
    `(cons ,first ,rest))
REVERSE-CONS
> (reverse-cons nil a)
(B)

setq-literal works like setq, except that neither argument is evaluated. Remember that setq evaluates its second argument. The body of setq-literal has a form that begins with a ` (backquote). A backquote behaves like quote, suppressing evaluation of all the enclosed forms, except where a comma appears within the backquoted form. A symbol following the comma is evaluated. So, in our call to (setq-literal a b) above, this is what happens:

1. Bind place to the symbol a.
2. Bind literal to the symbol b.
3. Evaluate the body `(setq ,place ',literal), following these steps: evaluate place to get the symbol a; evaluate literal to get the symbol b; return the form (setq a 'b). Neither the backquote nor the commas appear in the returned form.

Neither a nor b is evaluated in a call to setq-literal, but for different reasons: a is unevaluated because it appears as the first argument of setq; b is unevaluated because it appears after a quote in the form returned by the macro.

The operation of (reverse-cons nil a) is similar:

1. Bind rest to the symbol nil.
2. Bind first to the symbol a.
3. Evaluate the body `(cons ,first ,rest), following these steps: evaluate first to get the symbol a; evaluate rest to get the symbol nil; return the form (cons a nil).

Both arguments of reverse-cons are evaluated, because cons evaluates its arguments and our macro body doesn't quote either argument; a evaluates to the symbol b (assigned by the earlier call to setq-literal), and nil evaluates to itself.

Macro expansion
Lisp evaluates the code produced by the macro immediately. Therefore, when you use a macro, you get to see only the result of the entire evaluation of the macro, but you do not get to see the code produced by the macro along the way. Since it is sometimes useful in debugging to view this intermediate code, Lisp provides us with a way of doing so. The function macroexpand takes as input an s-expression that may be a call to a macro. If it is, macroexpand will apply the macro to its arguments to produce the object code. It will not evaluate this code, but merely return it. The Lisp interpreter uses macroexpand to expand a call to a macro that it is trying to evaluate. Using this function will give you an accurate picture of what the Lisp interpreter itself sees. Here's how it works for the examples given above:

> (macroexpand '(setq-literal a b))
(SETQ A (QUOTE B))
T
> (macroexpand '(reverse-cons nil a))
(CONS A NIL)
T

Since macroexpand is a function, it evaluates its arguments. This is why you have to quote the form you want expanded.
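Common Lisp also provides macroexpand-1, which performs exactly one step of expansion; this matters when a macro expands into another macro call, and it is the function used later in this series with lookup-sin. A sketch, assuming the setq-literal macro defined above:

```lisp
;; setq-literal as defined earlier in this article.
(defmacro setq-literal (place literal)
  `(setq ,place ',literal))

;; macroexpand-1 expands the outermost macro call exactly once,
;; returning the expansion and a second value of T.
(macroexpand-1 '(setq-literal a b))
;; => (SETQ A 'B), T
```

For a macro like this one, whose expansion is not itself a macro call, macroexpand and macroexpand-1 return the same form.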

A convenient macro
It is generally appropriate to enclose functional arguments inside a call to the special function function. In particular, the expression (function (lambda...)) occurs frequently. Common Lisp allows the shorthand #'x to stand for (function x), just as 'x stands for (quote x). Let's write a special macro to help abbreviate this particular combination. We can define a macro flambda that expands into (function (lambda...)) as follows:

> (defmacro flambda (&rest l)
    (list 'function (cons 'lambda l)))
FLAMBDA
> (macroexpand '(flambda (x y) (cons y x)))
(FUNCTION (LAMBDA (X Y) (CONS Y X)))
T

Since a lambda expression may contain any number of forms, we used the &rest parameter designator to define a rest parameter. This parameter will get assigned the list of all the arguments supplied to flambda. For example, in the call to flambda above, the rest parameter l gets assigned the value ((x y) (cons y x)). flambda then just sticks the atom lambda on the front of this list, and then puts the resulting lambda expression in a list that begins with the symbol function. Thus, the sample flambda expression given above as an argument to macroexpand can be seen as a shorthand for writing the full (function (lambda...)) call.
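As a quick check that flambda really does abbreviate (function (lambda ...)), here is a hypothetical session applying its result with funcall:

```lisp
;; flambda as defined above.
(defmacro flambda (&rest l)
  (list 'function (cons 'lambda l)))

;; The expansion is an ordinary function object, so FUNCALL accepts it.
(funcall (flambda (x y) (cons y x)) 1 2)
;; => (2 . 1)
```

This behaves identically to (funcall #'(lambda (x y) (cons y x)) 1 2), which is exactly the point of the abbreviation.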

pop
A more interesting example is the macro pop, which takes as its argument a symbol whose value is a list. It reduces this list to its cdr, and returns the first element of the list as its value. In effect, pop treats the list like a stack, and pops off its top element. A call to pop is equivalent to executing the following piece of code:

(prog1 (car stack) (setq stack (cdr stack)))

The function prog1 evaluates all its arguments in sequence, returning the value of the first one. Thus, for any particular list, you can execute a line of code like the one shown above, which carries out the desired function. What you could do is define a macro that expands into code of this form. Thus you can get (pop stack) to expand into the prog1 shown above. pop is actually defined in Common Lisp. Here, we will write our own definition, for the purpose of an example:

> (defmacro example-pop (stack)
    (list 'prog1
          (list 'car stack)
          (list 'setq stack (list 'cdr stack))))
EXAMPLE-POP

Let's see how the macro works:

> (setq stack '(a b c))
(A B C)
> (example-pop stack)
A
> stack
(B C)

example-pop returned the first element of stack as its value. In addition, it reduced stack to its cdr. Note that it would be difficult to write this as an ordinary function, since it needs to get at the actual stack name to change its value. You would always have to quote the name of the stack you were passing, because arguments supplied to ordinary functions always get evaluated. However, with macros, this is not a problem.
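The same macro reads more clearly with the backquote notation introduced earlier; this is an equivalent sketch of example-pop that produces exactly the same prog1 expansion as the list-building version:

```lisp
;; Backquote version: the template mirrors the code we want generated,
;; with commas marking the spots where the STACK argument is inserted.
(defmacro example-pop (stack)
  `(prog1 (car ,stack)
     (setq ,stack (cdr ,stack))))
```

Expanding (example-pop stack) with either definition yields (PROG1 (CAR STACK) (SETQ STACK (CDR STACK))).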

Special functions and macros


Many of the so-called special functions are actually macros. Most notably, defun, prog, cond, setf and defmacro itself are all Common Lisp macros. For example, a call to setf turns into a call to a more basic assignment function: a call to setf whose first argument is a symbol turns into a setq, as follows:

> (macroexpand '(setf a 'b))
(SETQ A 'B)

However, if the left-hand side of the call to setf is a call to the function get, the setf turns into a call to some internal Common Lisp property-list-changing function. Remember that the line between macros and special functions is rather thin. A Common Lisp implementation may implement some special functions as macros, and some macros as special forms, although it must still provide a macro definition in the latter case. The use of macros is limited only by your own creativity. While you get creative with macros, I will tweak my example code snippets for the next article, where I will discuss recursion, closures, and advanced macros. Enjoy playing God with the machine world.

Lisp: Tears of Joy, Part 6

Closures
Suppose you want to write a function that saves some value it can examine the next time it runs. For example, functions that generate random numbers often use the previously generated random number to help generate the next. To do this, you need to lay aside a value such that a function can get at it next time. It's a risky proposition if you use a global variable or a property list to store such a value. If you get careless, and write another function that uses the same global variable or property, the two functions may interact in a way you never intended. It is even more problematic if you wish to have several incarnations of the same function around, each remembering what it computed previously, but not interfering with one another. Using a single, global object does not accommodate this. For example, suppose you want a function that will give you the next even number each time it is called. You could write this as a function that uses the global variable *evenum* to remember the last value it computed:

> (defun generate-even ()
    (setq *evenum* (+ *evenum* 2)))
GENERATE-EVEN
> (setq *evenum* 0)
0
> (generate-even)
2
> (generate-even)
4
> (generate-even)
6

This works fine, but some other function could come along and clobber *evenum* in between calls. And we could not use generate-even to generate two sequences of even numbers simultaneously, each independent of the other.

The solution to this problem is to make a version of a function that has variables only it can access. This is done by taking a function that contains some free symbols, and producing a new function in which all those free symbols are given their own, unique variables (free symbols are symbols used to designate variables within a function, but which are not formal parameters of that function). This is as if we replaced each free symbol with a new symbol, one we were certain no other function could ever use. When this new function is run, then, its free symbols will reference variables that no other function can reference. If the values of some of these variables are changed, their new values will be retained until the next time the function is run. However, changes to these variables will not have any effect outside of this function; moreover, the values of these variables cannot be accessed or altered outside of this function. In this example, we would like to produce a version of generate-even that has its own private copy of the free variable *evenum*. When we create this version of generate-even, we would like its version of *evenum* to be initialised at whatever value *evenum* currently has. No matter what happens subsequently to the original version, this will not affect the new version. When we run this new version, it would update its private copy of *evenum*. This would not affect the version of *evenum* known to the original function, but the new, updated copy of *evenum* would be available to the new function the next time it is run. In other words, we take a sort of snapshot of a function, with respect to the current status of its free symbols. We can then manipulate this picture, rather than the function itself. The picture has about the same logical structure as the original, but if we change something in the picture, the original does not change. 
In fact, we should be able to take any number of such snapshots, and manipulate each one a bit differently. The alterations to each snapshot would serve to record its current state of affairs. Each snapshot could be looked at and altered quite independently of the others. When we take such a snapshot of a function, it is called a closure of that function. The name is derived from the idea that variables denoted by the free symbols of that function, normally open to the world outside that function, are now closed to the outside world. In Common Lisp, closures of functions are created by supplying the function function with a lambda expression as argument. If the free symbols of that lambda expression happen to be parameters of the function within which the lambda expression is embedded, function will return a snapshot of the function denoted by the lambda expression. This snapshot will contain its own variables corresponding to each of the lambda expression's free symbols. For example, we can use function to write a version of generate-even that will produce a closure that includes the free symbol *evenum* in the picture:

> (defun generate-even (*evenum*)
    (function (lambda () (setq *evenum* (+ *evenum* 2)))))
GENERATE-EVEN
> (setq gen-even-1 (generate-even 0))
#<FUNCTION :LAMBDA NIL (SETQ *EVENUM* (+ *EVENUM* 2))>
> (funcall gen-even-1)
2
> (funcall gen-even-1)
4
> (funcall gen-even-1)
6

When generate-even is called, Lisp creates a new variable corresponding to *evenum*, as Lisp always produces new variables corresponding to the formal parameters of a function. Then the function function returns a closure of the specified lambda expression. Since a new variable corresponding to *evenum* exists at this time, the closure gets this version of *evenum* as its own. When we exit this call to generate-even, no other code can reference this variable; it is closed off to the outside world. We save this closure by assigning it to the variable gen-even-1. Next, let us use funcall to invoke this function (funcall is like apply, but expects the arguments right after the function name; in this case, there are none, as the lambda expression of generate-even, and hence the closure produced from it, is a function of no arguments). Lisp prints out the closure as #<FUNCTION :LAMBDA NIL (SETQ *EVENUM* (+ *EVENUM* 2))>. Remember that the notation used to print closures is not a part of the Common Lisp standard. This is because it is not really meaningful to talk about printing a closure. Therefore, other implementations of Common Lisp may use a different notation. We run this closure a couple of times, and each time it produces a new value. We can create as many independent closures of the same function as we like. For example, we can make another closure of generate-even right now:

> (setq gen-even-2 (generate-even 0))
#<FUNCTION :LAMBDA NIL (SETQ *EVENUM* (+ *EVENUM* 2))>
> (funcall gen-even-2)
2
> (funcall gen-even-2)
4
> (funcall gen-even-1)
8
> (funcall gen-even-1)
10
> (funcall gen-even-2)
6

This second closure starts off with its version of *evenum* at the value 0. Each closure has its own independent variable corresponding to the symbol *evenum*. Therefore, a call to one function has no effect on the value of *evenum* in the other.
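Closures need not be built from a function's formal parameters; a let binding inside an ordinary defun works just as well, and is a common idiom. Here is a sketch of that variant (my own, not from the original article):

```lisp
;; Each call to MAKE-EVEN-GENERATOR creates a fresh LET binding of N,
;; visible only to the closure returned from inside the LET.
(defun make-even-generator (start)
  (let ((n start))
    #'(lambda () (setq n (+ n 2)))))

(setq gen (make-even-generator 0))
(funcall gen)   ; => 2
(funcall gen)   ; => 4
```

As with the parameter-based version, each generator owns a private n, so two generators created this way never interfere with each other.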

Close that function!


When we close off the free symbols of a function, a new problem presents itself. The closed variables are inaccessible outside of the closure, so it would be difficult to write a set of functions that shared the same closed variable. For example, suppose we want to write a pair of functions where one returns the next even number, and the other the next odd number. However, we want them to work in tandem, so that a call to one advances the other. For example, if we call the even number generator three times in a row, it should return 2, 4, and 6. Then a call to the odd number generator should return 7. If we call it again, it should return 9. The next time we call the even number generator, it should return 10.

It is easy to write a single pair of such functions. For example, we could do the following:

> (defun generate-even ()
    (setq *seed* (cond ((evenp *seed*) (+ *seed* 2))
                       (t (1+ *seed*)))))
GENERATE-EVEN
> (defun generate-odd ()
    (setq *seed* (cond ((oddp *seed*) (+ *seed* 2))
                       (t (1+ *seed*)))))
GENERATE-ODD
> (setq *seed* 0)
0
> (generate-even)
2
> (generate-odd)
3
> (generate-even)
4
> (generate-even)
6
> (generate-odd)
7

However, if we want to make closures of these functions, we are in trouble. If we use function to produce a closure of each function, each closure would get its own version of *seed*. The closure of generate-even could not influence the closure of generate-odd, and the converse would also be true. But this is not what we want. The solution to this problem is to create closures of a number of functions in the same context. The functions closed together would share their variables with one another, but not with anyone else. For example, here is a function that creates a list of two closures, one of which generates even numbers and the other, odd:

> (defun generate-even-odd (*seed*)
    (list (function (lambda ()
            (setq *seed* (cond ((evenp *seed*) (+ *seed* 2))
                               (t (1+ *seed*))))))
          (function (lambda ()
            (setq *seed* (cond ((oddp *seed*) (+ *seed* 2))
                               (t (1+ *seed*))))))))
GENERATE-EVEN-ODD
> (setq fns (generate-even-odd 0))
(#<FUNCTION :LAMBDA NIL (SETQ *SEED* (COND ((EVENP *SEED*) (+ *SEED* 2)) (T (1+ *SEED*))))>
 #<FUNCTION :LAMBDA NIL (SETQ *SEED* (COND ((ODDP *SEED*) (+ *SEED* 2)) (T (1+ *SEED*))))>)
> (funcall (car fns))
2
> (funcall (car fns))
4

> (funcall (cadr fns))
5
> (funcall (cadr fns))
7
> (funcall (car fns))
8

Some readers who aren't actively using Lisp may have forgotten some of what we covered earlier in this series, so here is a brief explanation of the keywords used in the code above.

(cond ((test expression*)*)): evaluates test expressions until one returns true. If that test has no corresponding expressions, returns the value of the test. Otherwise, evaluates the expressions in order, returning the value(s) of the last. If no test returns true, returns nil.

car and cdr: the primitive functions for extracting the elements of lists. The car of a list is the first element, and the cdr is everything after the first element:

> (car '(a b c))
A
> (cdr '(a b c))
(B C)

caddr: Common Lisp defines functions like caddr, which is an abbreviation for car of cdr of cdr. All the functions of the form c_x_r, where _x_ is a string of up to four a's or d's, are defined in Common Lisp. With the possible exception of cadr, which refers to the second element, it is not a good idea to use them in code that anyone else is going to read.

(evenp i): returns true if i is even; (oddp i) returns true if i is odd.

The call to generate-even-odd produces a pair of closures, one for each call to function. This pair shares access to an otherwise private copy of the variable *seed*. Subsequent calls to generate-even-odd would create additional pairs of such closures, each pair sharing a variable all its own.

Lisp: Tears of Joy, Part 7

Macros: can't get enough of them!


You already know from my previous articles that macros are Lisp programs that generate other Lisp programs. The generated Lisp code has fully parenthesised notation, and so does the macro that generates the code. In the simplest case, a macro substitutes forms within a template, clearly establishing a visual correspondence between the generating code and the generated code. Complex macros can use the full power of the Lisp language to generate code according to the macro parameters; often, a template form is wrapped in code that constructs appropriate sub-forms, but even this approach is just a typical pattern of use, and not a requirement (or restriction) of the Lisp macro facility. To refresh our memories, let's examine the mechanism by which the Lisp system translates code generated by a macro. You define a macro with a defmacro form; defmacro is like defun, but instead of returning values, the body of the defmacro returns a Lisp form. Your program calls a macro the same way it calls a function, but the behaviour is quite different. First, none of the macro's parameters are evaluated, ever. Macro parameters are bound literally to the corresponding arguments in the macro definition. If you pass (* 7 (+ 3 2)) to a macro, the argument in the body of the macro definition is bound to the literal list (* 7 (+ 3 2)), and not the value 35. Next, the macro expander is invoked, receiving all of the actual parameters bound to their corresponding arguments as named by the defmacro form. The macro expander is merely the body of the defmacro form, which is just Lisp code; the only catch is that the Lisp system expects the macro expander to return a Lisp form. The Lisp system then evaluates whatever form the macro expander returns.

A Lisp implementation may expand macros at different times. A macro could be expanded just once, when your program is compiled. Or it could be expanded on first use as your program runs, and the expansion could be cached for subsequent reuse. A properly written macro will behave the same under all of these implementations. Let's explore a real-world example of using a macro to extend Lisp into the problem domain. In addition to providing a macro expander, our new macro will automatically generate an environment that will be referenced by the expander. Our example will show how to move computations from runtime to compile-time, and how to share information computed at compile-time. Let's say you're working on an interactive game that makes heavy use of the trigonometric function sin r in computing player motion and interaction. You've already determined that calling the Lisp function sin is too time-consuming; you also know that your program will work just fine with approximate results for the computation of sin r. You'd like to define a lookup-sin macro to do the table lookup at runtime, and also hide the details of table generation, which would just clutter your program's source code. Your macro will be invoked as (lookup-sin radians divisions), where radians is always in the range of zero to one-half pi, and divisions is the number of discrete values available as the result of lookup-sin. At runtime, the macro expander will just compute the index into a lookup table, and return the value from the table. The table will be generated at compile-time (on most Lisp systems). Furthermore, only one table will ever be generated for a given value of divisions in the macro call.

;; This is where we cache all of the sine tables generated during
;; compilation. The tables stay around at runtime so they can be used
;; for lookups.
(defvar *sin-tables* (make-hash-table)
  "A hash table of tables of sine values. The hash is keyed by the
number of entries in each sine table.")

;; This is a helper function for the lookup-sin macro. It is used only
;; at compile time.
(defun get-sin-table-and-increment (divisions)
  "Returns a sine lookup table and the number of radians quantised by
each entry in the table. Tables of a given size are reused. A table
covers angles from zero to pi/2 radians."
  (let ((table (gethash divisions *sin-tables* :none))
        (increment (/ pi 2 divisions)))
    (when (eq table :none)
      ;; Uncomment the next line to see when a table gets created.
      ;; (print '|Making new table|)
      (setq table
            (setf (gethash divisions *sin-tables*)
                  (make-array (1+ divisions) :initial-element 1.0)))
      (dotimes (i divisions)
        (setf (aref table i) (sin (* increment i)))))
    (values table increment)))

;; The macro calls the helper at compile time, and returns an AREF form
;; to do the lookup at runtime.
(defmacro lookup-sin (radians divisions)
  "Return a sine value via table lookup."
  (multiple-value-bind (table increment)
      (get-sin-table-and-increment divisions)
    `(aref ,table (round ,radians ,increment))))

Let us examine what is happening here. When this program runs, it executes just aref (and the associated round) to look up the sin r value.

> (pprint (macroexpand-1 '(lookup-sin (/ pi 4) 50)))
(AREF #(0.0D0 0.03141075907812829D0 0.06279051952931338D0 0.09410831331851433D0 0.12533323356430426D0 0.15643446504023087D0 0.18738131458572463D0 0.21814324139654257D0 0.2486898871648548D0 0.2789911060392293D0 0.3090169943749474D0 0.3387379202452914D0 0.368124552684678D0 0.3971478906347806D0 0.4257792915650727D0 0.4539904997395468D0 0.4817536741017153D0 0.5090414157503713D0 0.5358267949789967D0 0.5620833778521306D0 0.5877852522924731D0 0.6129070536529765D0 0.6374239897486898D0 0.6613118653236518D0 0.6845471059286887D0 0.7071067811865476D0 0.7289686274214116D0 0.7501110696304596D0 0.7705132427757893D0 0.7901550123756904D0 0.8090169943749475D0 0.8270805742745618D0 0.8443279255020151D0 0.8607420270039436D0 0.8763066800438637D0 0.8910065241883678D0 0.9048270524660196D0 0.9177546256839811D0 0.9297764858882515D0 0.9408807689542256D0 ...)
      (ROUND (/ PI 4) 0.031415926535897934D0))

Note that the macro call makes no mention of a lookup table. Tables are generated as needed by (and for) the compiler.

> (lookup-sin (/ pi 4) 50)
0.7071067811865476D0

In the macro expansion, the #(...) is the printed representation of the lookup table for 50 divisions of the quarter circle. This table is stored in the *sin-tables* hash table, where it is shared by every macro call to (lookup-sin angle 50). We don't even have to do a hash lookup at runtime, because the macro expander has captured the free variable table from the multiple-value-bind form in lookup-sin.

Macros that define macros


Macros that define macros are used infrequently, partly because it's hard to think of a good use for this technique, and partly because it's difficult to get right. The following macro, based on an example in Paul Graham's book On Lisp, can be used to define synonyms for the names of Lisp functions, macros, and special forms.

> (defmacro defsynonym (old-name new-name)
    "Define OLD-NAME to be equivalent to NEW-NAME when used in
the first position of a Lisp form."
    `(defmacro ,new-name (&rest args)
       `(,',old-name ,@args)))
DEFSYNONYM

> (defsynonym make-pair cons)
MAKE-PAIR

> (make-pair 'a 'b)
(A . B)

Macros are always a little bit dangerous, because code containing a macro call does not automatically get updated if you change the definition of the macro. You can always establish your own convention to help you remember that you need to recompile certain code after you change a macro definition. But there's always the possibility that you'll forget, or make a mistake.

Ultimately, the likelihood that you'll inadvertently end up with code that was compiled with an old version of a macro is directly proportional to how often you're likely to change the macro. A macro like defsynonym practically begs to be used again and again as you generate new code. If you change your mind about the old name to associate with a given new name, all of your previously compiled code will still refer to the old name that you had decided upon earlier. I'll leave you here to have fun with your macros. In the next article, we'll take a peek into the Common Lisp Object System (CLOS), which allows you to build very sophisticated object-based systems. If you care to code with a strongly object-oriented mindset, you will probably find all the OOP language functionality you need in Common Lisp. CLOS has many advanced object-oriented features that you won't find in many other places. Because of this, CLOS has often been used as a research tool for studying OOP ideas.

Lisp: Tears of Joy, Part 8

In rare moments of self-reflection, when I allow myself to doubt my skills as a Lisp evangelist, I sometimes wonder if I have left behind some of my fellow programmers who favour the object-oriented style of programming. Just because I have been focusing on Lisp as a functional programming paradigm, it doesn't mean we don't have a role for you in our plans of world domination. Read on to know where you fit in.

Functional vs object-oriented (OO) programming


With an OO approach, programmers write code that describes in exacting detail the steps that the computer must take to accomplish the goal. They focus on how to perform tasks, and how to track changes in state. They would use loops, conditions and method calls as their primary flow control, and instances of structures or classes as primary manipulation units. OO tries to control state behind object interfaces.

In contrast, functional programming (FP) involves composing the problem as a set of functions to be executed. FP programmers focus on what information is desired and what transformations are required, by carefully defining the input to each function and what each function returns. They would use function calls, including recursion, as their primary flow control, functions as first-class objects and data collections as primary manipulation units. FP tries to minimise state by using pure functions as much as possible. According to Michael Feathers: "OO makes code understandable by encapsulating moving parts. FP makes code understandable by minimising moving parts."

Conrad Barski points out that critics of the OO programming style may complain that object-oriented techniques force data to be hidden away in a lot of disparate places, by requiring them to live inside many different objects. Having data located in disparate places can make programs difficult to understand, especially if that data changes over time.
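The contrast above can be sketched in a few lines of Lisp. This is an illustrative toy, not code from any library; the class and function names are invented for the example. The OO version mutates state held inside an object, while the functional version computes the same kind of answer purely from its input:

```lisp
;; OO/imperative style: state lives in an object and is mutated in place.
(defclass counter ()
  ((total :initform 0 :accessor counter-total)))

(defmethod add-amount ((c counter) amount)
  (incf (counter-total c) amount))   ; side effect: the object changes

;; Functional style: no mutation; the result is simply the return value.
(defun sum-amounts (amounts)
  (reduce #'+ amounts :initial-value 0))

;; (sum-amounts '(1 2 3)) computes 6 without changing anything anywhere.
```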

Therefore, many Lispers prefer to use functional techniques over object-oriented techniques, though the two can often be used together with some care. Nonetheless, there are still many domains in which object-oriented techniques are invaluable, such as user interface programming or simulation programming. On the other hand, James Hague, in his assessment of functional programming, argues that "100 per cent pure functional programming doesn't work. Even 98 per cent pure functional programming doesn't work. But if the slider between functional purity and 1980s BASIC-style imperative messiness is kicked down a few notches, say, to 85 per cent, then it really does work. You get all the advantages of functional programming, but without the extreme mental effort and un-maintainability that increases as you get closer and closer to perfectly pure."

CLOS
If OO is what gets you going, Common Lisp offers the most sophisticated object-oriented programming framework of any major programming language. It's called the Common Lisp Object System (CLOS). It is customisable at a fundamental level, using the Meta-Object Protocol (MOP). It has been claimed that there's really nothing like it anywhere else in programming. It lets you control incredibly complex software without losing control over the code. Let the tears of joy flow!

Object-oriented programming in Common Lisp


What is CLOS?
#1: It is a layered system designed for flexibility. One of the design goals of CLOS is to provide a set of layers that separate different programming language concerns from one another. The first level of the Object System provides a programmatic interface to object-oriented programming. This level is designed to meet the needs of most serious users, and to provide a syntax that is crisp and understandable. The second level provides a functional interface into the heart of the Object System. This level is intended for programmers who are writing very complex software or a programming environment. The first level is written in terms of this second level. The third level provides the tools for programmers who are writing their own object-oriented language. It allows access to the primitive objects and operators of the Object System. It is this level at which the implementation of the Object System itself is based. The layered design of CLOS is founded on the meta-object protocol, a protocol that is used to define the characteristics of an object-oriented system. Using the meta-object protocol, other functional or programmatic interfaces to the Object System, as well as other object systems, can be written.

#2: It is based on the concept of generic functions rather than on message-passing. This choice is made for two reasons: 1. there are some problems with message-passing in operations of more than one argument; 2. the concept of generic functions is a generalisation of the concept of ordinary Lisp functions.

A key concept in object-oriented systems is that given an operation and a tuple of objects on which to apply the operation, the code that is most appropriate to perform the operation is selected, based on the classes of the objects. In most message-passing systems, operations are essentially properties of classes, and this selection is made by packaging a message that specifies the operation and the objects to which it applies, before sending that message to a suitable object. That object then takes responsibility for selecting the appropriate piece of code. These pieces of code are called methods.

#3: It is a multiple inheritance system. Another key concept in object-oriented programming is the definition of structure and behaviour on the basis of the class of an object. Classes thus impose a type system: the code that is used to execute operations on objects depends on the classes of the objects. The sub-class mechanism allows classes to be defined that share the structure and the behaviour of other classes. This sub-classing is a tool for modularisation of programs.

#4: It provides a powerful method combination facility. Method combination is used to define how the methods that are applicable to a set of arguments can be combined to provide the values of a generic function. In many object-oriented systems, the most specific applicable method is invoked, and that method may invoke other, less specific methods. When this happens, there is often a combination strategy at work, but that strategy is distributed throughout the methods as local control structure. Method combination brings the notion of a combination strategy to the surface, and provides a mechanism for expressing that strategy.

#5: The primary entities of the system are all first-class objects. In the Common Lisp Object System, generic functions and classes are first-class objects with no intrinsic names. It is possible and useful to create and manipulate anonymous generic functions and classes.
The concept of first-class is important in Lisp-like languages. A first-class object is one that can be explicitly made and manipulated; it can be stored in any location that can hold general objects.
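A short sketch may make #2 and #5 concrete. Because a generic function can specialise on the classes of all of its arguments, it handles the "operations of more than one argument" problem that single-receiver message-passing struggles with; and because it is a first-class object, it can be passed around like any other function. The classes and method bodies here are invented purely for illustration:

```lisp
;; A generic function dispatching on the classes of BOTH arguments.
(defgeneric collide (a b))

(defclass asteroid () ())
(defclass ship () ())

(defmethod collide ((a asteroid) (b asteroid)) 'rocks-bounce)
(defmethod collide ((a ship) (b asteroid)) 'ship-takes-damage)

;; (collide (make-instance 'ship) (make-instance 'asteroid))
;; selects the second method, based on the classes of both arguments.
;; A message-passing system would have to ask one object or the other.

;; The generic function itself is first-class: #'collide can be stored
;; in a variable, passed to MAPCAR, FUNCALLed, and so on.
```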

What CLOS is not


It does not make for a great pickup conversation at the bar. I tried. It did not work! It also does not attempt to solve problems of encapsulation. The inherited structure of a class depends on the names of the internal parts of the classes from which it inherits. CLOS does not support subtractive inheritance. Within Common Lisp, there is a primitive module system that can be used to help create separate internal namespaces.

Classes
The defclass macro is used to define a new class. The definition of a class consists of its name, a list of its direct super-classes, a set of slot specifiers and a set of class options. The direct super-classes of a class are those from which the new class inherits structure and behaviour. When a class is defined, the order in which its direct super-classes are mentioned in the defclass form defines a local precedence order on the class and those super-classes. The local precedence order is represented as a list consisting of the class, followed by its direct super-classes, in the order mentioned in the defclass form. The following two classes define a representation of a point in space. The x-y-position class is a sub-class of the position class:

> (defclass position () ())

> (defclass x-y-position (position)
    ((x :initform 0)
     (y :initform 0))
    (:accessor-prefix position-))

The position class is useful if we want to create other sorts of representations for spatial positions. The x and y coordinates are initialised to 0 in all instances, unless explicit values are supplied for them. To refer to the x coordinate of an instance of x-y-position, you would write:

> (position-x position)

To alter the x coordinate of that instance, you would write:

> (setf (position-x position) new-x)

The macro defclass is part of the Object System programmatic interface and, as such, is on the first of the three levels of the Object System.

Generic functions
The class-specific operations of the Common Lisp Object System are provided by generic functions and methods. A generic function is one whose behaviour depends on the classes or identities of the arguments supplied to it. The methods associated with the generic function define the class-specific operations of the generic function. Like an ordinary Lisp function, a generic function takes arguments, performs a series of operations and returns values. An ordinary function has a single body of code that is always executed when the function is called. A generic function is able to perform different series of operations and to combine the results of the operations in different ways, depending on the class or identity of one or more of its arguments. Generic functions are defined by means of the defgeneric-options and defmethod macros. The defgeneric-options macro is designed to allow for the specification of properties that pertain to the generic function as a whole, and not just to individual methods. The defmethod form is used to define a method. If there is no generic function of the given name, however, it automatically creates a generic function with default values for the argument precedence order (left-to-right, as defined by the lambda-list), the generic function class (the class standard-generic-function), the method class (the class standard-method) and the method combination type (standard method combination).
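Note that defgeneric-options is the spelling used in the original CLOS specification that this section follows; in ANSI Common Lisp the macro became defgeneric. A minimal modern sketch, with a shape class invented for the example:

```lisp
;; ANSI spelling: DEFGENERIC rather than DEFGENERIC-OPTIONS.
(defgeneric area (shape)
  (:documentation "Return the area of SHAPE."))

(defclass circle ()
  ((radius :initarg :radius :reader radius)))

;; Had the DEFGENERIC form been omitted, this DEFMETHOD would have
;; created the generic function implicitly, with default values.
(defmethod area ((s circle))
  (* pi (radius s) (radius s)))

;; (area (make-instance 'circle :radius 2)) computes 4*pi.
```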

Methods
The class-specific operations provided by generic functions are themselves defined and implemented by methods. The class or identity of each argument to the generic function indicates which method or methods are eligible to be invoked. A method object contains a method function, an ordered set of parameter specialisers that specify when the given method is applicable, and an ordered set of qualifiers that are used by the method combination facility to distinguish between methods. The defmethod macro is used to create a method object. A defmethod form contains the code that is to be run when the arguments to the generic function cause the method it defines to be selected.
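Qualifiers are easiest to see with the standard method combination's :before and :after methods. In this invented sketch, every applicable :before method runs first, then the most specific primary (unqualified) method, then the :after methods:

```lisp
(defclass account ()
  ((balance :initform 100 :accessor balance)))

;; Primary method: the unqualified one.
(defmethod withdraw ((a account) amount)
  (decf (balance a) amount))

;; :BEFORE method, distinguished by its qualifier; here it guards
;; the primary method.
(defmethod withdraw :before ((a account) amount)
  (when (> amount (balance a))
    (error "Insufficient funds")))

;; :AFTER method: runs last, useful for side effects such as logging.
(defmethod withdraw :after ((a account) amount)
  (format t "~&New balance: ~A~%" (balance a)))
```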

If a defmethod form is evaluated, and a method object corresponding to the given generic function name, parameter specialisers and qualifiers already exists, then the new definition replaces the old. Generic functions can be used to implement a layer of abstraction on top of a set of classes. For example, the x-y-position class can be viewed as containing information in polar coordinates. Two methods have been defined, position-rho and position-theta, that calculate the rho and theta coordinates given an instance of x-y-position:

> (defmethod position-rho ((pos x-y-position))
    (let ((x (position-x pos))
          (y (position-y pos)))
      (sqrt (+ (* x x) (* y y)))))

> (defmethod position-theta ((pos x-y-position))
    (atan (position-y pos) (position-x pos)))

It is also possible to write methods that update the virtual slots position-rho and position-theta:

> (defmethod-setf position-rho ((pos x-y-position)) (rho)
    (let* ((r (position-rho pos))
           (ratio (/ rho r)))
      (setf (position-x pos) (* ratio (position-x pos)))
      (setf (position-y pos) (* ratio (position-y pos)))))

> (defmethod-setf position-theta ((pos x-y-position)) (theta)
    (let ((rho (position-rho pos)))
      (setf (position-x pos) (* rho (cos theta)))
      (setf (position-y pos) (* rho (sin theta)))))

To update the rho coordinate, you may write:

> (setf (position-rho pos) new-rho)

This is precisely the same syntax that would be used if the positions were explicitly stored as polar coordinates.

Class redefinition
The Common Lisp Object System provides a powerful class-redefinition facility. When a defclass form is evaluated, and a class with the given name already exists, the existing class is redefined. Redefining a class modifies the existing class object to reflect the new class definition. You may define methods on the generic function class-changed to control the class redefinition process. This generic function is invoked automatically by the system after defclass has been used to redefine an existing class. For example, suppose it becomes apparent that the application that requires representing positions uses polar coordinates more than it uses rectangular coordinates. It might make sense to define a sub-class of position that uses polar coordinates:

> (defclass rho-theta-position (position)
    ((rho :initform 0)
     (theta :initform 0))
    (:accessor-prefix position-))

The instances of x-y-position can be automatically updated by defining a class-changed method:

> (defmethod class-changed ((old x-y-position) (new rho-theta-position))
    ;; Copy the position information from old to new to make new
    ;; be a rho-theta-position at the same position as old.
    (let ((x (position-x old))
          (y (position-y old)))
      (setf (position-rho new) (sqrt (+ (* x x) (* y y)))
            (position-theta new) (atan y x))))

At this point, we can change an instance of the class x-y-position, p1, to be an instance of rho-theta-position by using change-class:

> (change-class p1 'rho-theta-position)

Inheritance
Inheritance is the key to program modularity within CLOS. A typical object-oriented program consists of several classes, each of which defines some aspect of behaviour. New classes are defined by including the appropriate classes as super-classes, thus gathering the desired aspects of behaviour into one class. In general, slot descriptions are inherited by sub-classes. That is, slots defined by a class are usually slots implicitly defined by any sub-class of that class, unless the sub-class explicitly shadows the slot definition. A class can also shadow some of the slot options declared in the defclass form of one of its super-classes by providing its own description for that slot. A sub-class inherits methods in the sense that any method applicable to an instance of a class is also applicable to instances of any sub-class of that class (all other arguments to the method being the same). The inheritance of methods acts the same way regardless of whether the method was created by using defmethod or by using one of the defclass options that cause methods to be generated automatically. I hope with this article I have managed to convince OO programmers that Lisp is generous enough to cater to your style of thinking. Stick with me, and I promise that you won't be disappointed. So far we've seen how to fit the nuts and bolts into the engine. Next month, we'll learn how to paint it a nice shiny red: I am referring to Graphical Programming in Lisp!

References
Let Over Lambda, Doug Hoyte
CLOS: Integrating Object-Oriented and Functional Programming, Richard P. Gabriel, Jon L. White, Daniel G. Bobrow

Lisp: Tears of Joy, Part 9

A popular myth about Common Lisp is that it does not support a GUI. This is not true. Lisp systems have supported GUIs since the late 1970s, long before low-cost consumer computers adopted them. The Xerox Alto, developed at Xerox PARC in 1973, was the first computer to use the desktop metaphor and a mouse-driven GUI. It was not a commercial product. Later, the Xerox Star was introduced by Xerox Corporation in 1981. This was the first commercial system to incorporate the GUI, and it came with Lisp and Smalltalk for the research and software development market (the Apple Macintosh, released in 1984, was the first commercially successful product to use a multi-panel window GUI). Windows and Macintosh users occasionally find the Lisp GUI coarse and a bit unfamiliar. Several commercial Lisp environments offer graphical interface builders that let you build widgets with point/click and drag/drop techniques, but the look and feel is often plain. These Lisp environments typically provide wrappers around the collection of graphic routines supported by the OS, such that these wrappers can be used from within your Lisp program.

Examples using LispWorks and CAPI toolkit


LispWorks is an IDE for ANSI Common Lisp. It runs on Linux, Mac OS X, Windows and other operating systems. It uses the Common Application Programmer's Interface (CAPI), a library for implementing portable window-based application interfaces. CAPI is a conceptually simple, CLOS-based model of interface elements and their interaction. It provides a standard set of these elements and their behaviours, as well as giving you the opportunity to define elements of your own. CAPI currently runs under the X Window System with either GTK+ or Motif, and on Microsoft Windows and Mac OS X. Using CAPI with Motif is deprecated. Let's create a few GUI elements using LispWorks and CAPI.

Menus
You can create menus for an application using the menu class. Let us start by creating a test-callback and a hello function, which we'll need to create and test our GUIs.

(defun test-callback (data interface)
  (display-message "Data ~S in interface ~S" data interface))

(defun hello (data interface)
  (declare (ignore data interface))
  (display-message "Hello World"))

The following code then creates a CAPI interface with a menu, Foo, which contains four items. Choosing any of these items displays its arguments. Each item has the callback specified by the :callback keyword.

(make-instance 'menu
               :title "Foo"
               :items '("One" "Two" "Three" "Four")
               :callback 'test-callback)

(make-instance 'interface
               :menu-bar-items (list *))

(display *)

A submenu can be created simply by specifying a menu as one of the items of the top-level menu.

(make-instance 'menu
               :title "Bar"
               :items '("One" "Two" "Three" "Four")
               :callback 'test-callback)

(make-instance 'menu
               :title "Baz"
               :items (list 1 2 * 4 5)
               :callback 'test-callback)

(contain *)

This creates an interface that has a menu called Baz, which itself contains five items. The third item is another menu, Bar, which contains four items. Once again, selecting any item returns its arguments. Menus can be nested as deeply as required, using this method. The menu-component class lets you group related items together in a menu. This allows similar menu items to share properties such as callbacks, and to be visually separated from other items in the menus. Menu components are actually choices. Here is a simple example of a menu component. This creates a menu called Items, which has four items. Menu 1 and Menu 2 are ordinary menu items, but Item 1 and Item 2 are created from a menu component, and are therefore grouped together in the menu, as shown in Figure 1:

(setq component (make-instance 'menu-component
                               :items '("item 1" "item2")
                               :print-function 'string-capitalize
                               :callback 'test-callback))

(contain (make-instance 'menu
                        :title "Items"
                        :items (list "menu 1" component "menu 2")
                        :print-function 'string-capitalize
                        :callback 'hello)
         :width 150
         :height 0)

Figure 1: Menu with menu components

Radio buttons
Menu components allow you to specify, via the :interaction keyword, selectable menu items, either as multiple-selection or single-selection items. This is like having radio buttons or check boxes as items in a menu, and is a popular technique among many GUI-based applications. The following example shows you how to include a panel of radio buttons in a menu (see Figure 2):

(setq radio (make-instance 'menu-component
                           :interaction :single-selection
                           :items '("This" "That")
                           :callback 'hello))

(setq commands (make-instance 'menu
                              :title "Commands"
                              :items (list "Command 1" radio "Command 2")
                              :callback 'test-callback))

(contain commands)

Figure 2: Radio buttons in the menu

The menu items This and That are radio buttons, only one of which may be selected at a time. The other items are just ordinary commands, as in the previous example. Note that CAPI automatically groups items that are parts of a menu component, so that they are separated from other items in the menu.

Checked menu
The above example also illustrates the use of more than one callback in a menu, which of course is the usual case when you are developing real applications. Choosing either of the radio buttons displays one message on the screen, and choosing either Command 1 or Command 2 returns the arguments of the callback. Checked menu items can be created by passing :multiple-selection to the :interaction keyword, as illustrated below (see Figure 3):

(setq letters (make-instance 'menu-component
                             :interaction :multiple-selection
                             :items (list "Alpha" "Beta")))

(contain (make-instance 'menu
                        :title "Greek"
                        :items (list letters)
                        :callback 'test-callback))

Figure 3: Menu with multiple-selection 'checked' items

Note how the items in the menu component inherit the callback given to the parent, eliminating the need to specify a separate callback for each item or component in the menu. Within a menu or component, you can specify alternatives that are invoked by modifier keys for a main menu item. The menu-item class lets you create individual menu items, which can be passed to menu-components or menus via the :items keyword. Using this class, you can assign different callbacks to different menu items. Remember that each instance of a menu item must not be used in more than one place at a time.

(setq test (make-instance 'menu-item
                          :title "Test"
                          :callback 'test-callback))

(setq hello (make-instance 'menu-item
                           :title "Hello"
                           :callback 'hello))

(setq group (make-instance 'menu-component
                           :items (list test hello)))

(contain group)

Figure 4: Individual menu

The combination of menu items, menu components and menus can create a hierarchical structure. The menu in the code below has five elements, one of which is itself a menu (with three menu items); the remainder are menu components and menu items. Items in a menu inherit values from their parent, allowing similar elements to share relevant properties whenever possible.

(defun menu-item-name (data)
  (format nil "Menu Item ~D" data))

(defun submenu-item-name (data)
  (format nil "Submenu Item ~D" data))

(contain
 (make-instance
  'menu
  :items (list (make-instance 'menu-component
                              :items '(1 2)
                              :print-function 'menu-item-name)
               (make-instance 'menu-component
                              :items (list 3
                                           (make-instance 'menu
                                                          :title "Submenu"
                                                          :items '(1 2 3)
                                                          :print-function 'submenu-item-name))
                              :print-function 'menu-item-name)
               (make-instance 'menu-item :data 42))
  :print-function 'menu-item-name))

Figure 5: Menu hierarchy

Rather than create GUI elements programmatically, you can use the Interface Builder that LispWorks also contains, a tool for constructing graphical user interfaces for Lisp applications. You can design and test each window or dialogue in your application, and the Interface Builder generates the necessary source code to create the windows you have designed; all you need to do is add callbacks to the generated code, so that your own source code is utilised.

Lisp: Tears of Joy, Part 10

There was a joke back in the '80s, when Reagan's SDI (Strategic Defense Initiative) programme was in full swing, that someone stole the Lisp source code to the missile interceptor program, and to prove it, he showed the last page of the code:

))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
))))))))))))))))))))))))))))))))))))))))))))))))))

In our quest for supreme knowledge, through the past nine articles in the Lisp series, we've travelled far enough to discover that LISP does not stand for Lots of Irritating Superfluous Parentheses (I know, strange!). In this epilogue, let's summarise in brief why hacking Lisp code could be a profound experience in your life.

Lisp's Functional Programming advantage


Functional Programming is the art of writing programs that work by returning values, instead of modifying things. Functional Programming also enables you to create fabulously powerful and very efficient abstract programs; it is a mathematical approach to programming. In math, for at least the last 400 years, functions have played a very central role. Functions express the connection between parameters (the input) and the result (the output) of certain processes. In each computation, the result depends, in a certain way, on the parameters. Therefore, a function is a good way of specifying a computation. This is the basis of the functional programming style. A program consists of the definition of one or more functions. With the execution of a program, the function is provided with parameters, and the result must be calculated. Writing code in a functional style guarantees that a function does only one thing (returns a value), and is dependent on one thing (the parameters passed to it). This equips you to control side effects. However, some side effects are almost always necessary for a program to actually do something. This means that you can't write a useful program that has the entirety of its code written in the functional style.

In Article 8, where we talked about CLOS, I pointed out James Hague's assessment of functional programming, where he argues that "100 per cent pure functional programming doesn't work. Even 98 per cent pure functional programming doesn't work. But if the slider between functional purity and 1980s BASIC-style imperative messiness is kicked down a few notches, say, to 85 per cent, then it really does work. You get all the advantages of functional programming, but without the extreme mental effort and unmaintainability that increases as you get closer and closer to perfectly pure."

Macros
Macros may be the single most important reason why Lispers put up with all those annoying parentheses in their code. These very parentheses enable Lisp's powerful macro system. Paul Graham, who comes close to being a Lisp missionary, points out that Lisp code is made out of Lisp data objects, and not in the trivial sense that the source files contain characters, and strings are one of the data types supported by the language. Lisp code, after it's read by the parser, is made of data structures that you can traverse. If you understand how compilers work, you'll figure out that what's really going on is not so much that Lisp has a strange syntax (parentheses everywhere!) as that Lisp has no syntax. You write programs in the parse trees that get generated within the compiler when other languages are parsed, but these parse trees are fully accessible to your programs; you can write programs that manipulate them. In Lisp, these programs are called macros. They are programs that write programs.
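As a tiny demonstration of "programs that write programs", here is a classic sketch: a WHILE construct, which Common Lisp does not define, built as a macro that rewrites itself into a DO loop before compilation:

```lisp
(defmacro while (test &body body)
  "Repeat BODY as long as TEST is true."
  `(do ()
       ((not ,test))
     ,@body))

;; (macroexpand-1 '(while (< i 10) (incf i)))
;; => (DO () ((NOT (< I 10))) (INCF I))
;; The macro receives its arguments as ordinary list structure,
;; builds a new list, and that list is the code the compiler sees.
```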

A great medium to express recursion


Recursion is the act of defining an object or solving a problem in terms of itself. Properly used, recursion is a powerful problem-solving technique, both in artificial domains like mathematics and computer programming, as well as in real life. "The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions," wrote Niklaus Wirth in his 1976 book, Algorithms + Data Structures = Programs. Lisp is the best programming language to use when working with recursive problems. Daniel P. Friedman and Matthias Felleisen demonstrate this case for Lisp in their book The Little Lisper. Lisp is inherently symbolic; programmers do not have to make an explicit mapping between the symbols of their own language and the representations in the computer. Recursion is Lisp's natural computational mechanism; the primary programming activity is the creation of (potentially) recursive definitions.
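A small example of what Friedman and Felleisen mean, written for this article rather than taken from their book: the definition below mirrors the shape of the data it consumes. A tree is either the empty list, an atom, or a cons of two smaller trees, and the function has exactly one clause for each case:

```lisp
(defun count-atoms (tree)
  "Count the atoms in an arbitrarily nested list."
  (cond ((null tree) 0)                   ; the empty tree
        ((atom tree) 1)                   ; a leaf
        (t (+ (count-atoms (car tree))    ; recurse into both halves
              (count-atoms (cdr tree))))))

;; (count-atoms '(a (b (c d)) e)) => 5
```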

What else?
There are many more reasons why one should try out Lisp. Apart from the ones mentioned above, I have covered a few in my previous articles. This series was my small attempt to evangelise Lisp and plant a seed of curiosity in those who are seeking a better way of life in the maddening programming world. If the seed has taken root in your mind and you'd like to explore further, feel free to contact me by leaving a comment, should you have any questions, or if you need pointers to resources that may aid your discovery of the world's most powerful programming language. Have fun with Lisp!

Reference
Wirth, Niklaus (1976), Algorithms + Data Structures = Programs, Prentice-Hall

A Beginner's Guide to Using pyGTK and Glade


pyGTK and Glade allow anyone to create functional GUIs quickly and easily. The beauty of pyGTK and Glade is they have opened up cross-platform, professional-quality GUI development to those of us who'd rather be doing other things but who still need a GUI on top of it all. Not only does pyGTK allow neophytes to create great GUIs, it also allows professionals to create flexible, dynamic and powerful user interfaces faster than ever before. If you've ever wanted to create a quick user interface that looks good without a lot of work, and you don't have any GUI experience, read on. This article is the direct result of a learning process that occurred while programming Immunity CANVAS (www.immunitysec.com/CANVAS). Much of what was learned while developing the GUI from scratch was put in the pyGTK FAQ, located at www.async.com.br/faq/pygtk/index.py?req=index. Another URL you no doubt will be using a lot if you delve deeply into pyGTK is the documentation at www.gnome.org/~james/pygtk-docs. It is fair to say that for a small company, using pyGTK over other GUI development environments, such as native C, is a competitive advantage. Hopefully, after reading this article, everyone should be able to put together a GUI using Python, the easiest of all languages to learn. As a metric, the CANVAS GUI was written from scratch, in about two weeks, with no prior knowledge of pyGTK. It then was ported from GTK v1 to GTK v2 (more on that later) in a day, and it is now deployed to both Microsoft Windows and Linux customers.

The Cross-Platform Nature of pyGTK


In a perfect world, you never would have to develop for anything but Linux running your favorite distribution. In the real world, you need to support several versions of Linux, Windows, UNIX or whatever else your customers need. Choosing a GUI toolkit depends on what is well supported on your customers' platforms. Nowadays, choosing Python as your development tool in any new endeavor is second nature if speed of development is more of a requirement than runtime speed. This combination leads you to choose from the following alternatives for Python GUI development: wxPython, Tkinter, pyGTK and Python/Qt.

Keeping in mind that I am not a professional GUI developer, here are my feelings on why one should choose pyGTK. wxPython has come a long way and offers attractive interfaces but is hard to use and get working, especially for a beginner. Not to mention, it requires both Linux and Windows users to download and install a large binary package. Qt, although free for Linux, requires a license to be distributed for Windows. This probably is prohibitive for many small companies who want to distribute on multiple platforms. Tkinter is the first Python GUI development kit and is available with almost every Python distribution. It looks ugly, though, and requires you to embed Tk into your Python applications, which feels like going backward.

For a beginner, you really want to split the GUI from the application as much as possible. That way, when you edit the GUI, you don't have to change a bunch of things in your application or integrate any changes into your application. For these reasons alone, pyGTK might be your choice. It neatly splits the application from the GUI. Using libglade, the GUI itself is held as an XML file that you can continue to edit, save multiple versions of or whatever else you want, as it is not integrated with your application code. Furthermore, using Glade as a GUI builder allows you to create application interfaces quickly; so quickly that if multiple customers want multiple GUIs, you could support them all easily.
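To make that concrete, here is a schematic fragment of the kind of XML libglade reads. This is an illustrative sketch, not output from a real Glade session; the exact attributes and properties Glade writes vary by version, and the widget ids here are placeholders:

```xml
<?xml version="1.0"?>
<glade-interface>
  <!-- a top-level window containing one horizontal box with a label -->
  <widget class="GtkWindow" id="window1">
    <property name="title">My Application</property>
    <child>
      <widget class="GtkHBox" id="hbox1">
        <child>
          <widget class="GtkLabel" id="label1">
            <property name="label">Host:</property>
          </widget>
        </child>
      </widget>
    </child>
  </widget>
</glade-interface>
```

Editing the interface means editing (or regenerating) this file; the Python code that loads it stays untouched.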

Version Issues with GTK and pyGTK


Two main flavors of GTK are available in the wild, GTK versions 1 and 2. Therefore, at the start of a GUI-building project, you have to make some choices about what to develop and maintain. It is likely that Glade v1 came installed on your machine. You may have to download Glade v2 or install the development packages for GTK to compile the GTK v2 libglade. Believe me, it is worth the effort. GTK v2 offers several advantages, including a nicer overall look, installers for Windows with Python 2.2 and accessibility extensions that allow applications to be customized for blind users. In addition, version 2 comes installed on many of the latest distributions, although you still may need to install development RPMs or the latest pyGTK package.

GTK v2, and hence pyGTK v2, offers a few slightly more complex widgets (Views). In the hands of a mighty GUI master, they result in awesome applications, but they really confuse beginners. However, a few code recipes mean you can treat them as you would their counterparts in GTK v1, once you learn how to use them. As an example, after developing the entire GUI for CANVAS in GTK v1, I had to go back and redevelop it (which took exactly one day) in GTK v2. Support was lacking for GTK v1 on my customers' Linux boxes, but installing GTK v2 was easy enough. The main exception is Ximian Desktop, which makes pyGTK and GTK v1 easy to install. So, if your entire customer base is running that, you may want to stay with GTK v1. One thing to keep in mind, though: a Python script is available for converting projects from Glade v1 to Glade v2, but not vice versa. So if you're going to do both, develop first in Glade v1, convert, and then reconcile any differences.

An Introduction to Glade v2
The theory behind using Glade and libglade is that it wastes time to create your GUI using code. Sitting down and telling the Python interpreter where each widget goes, what color it is and what the defaults are is a huge time sink. Anyone who's programmed in Tcl/Tk has spent days doing this. Not only that, but changing a GUI created with code can be a massive undertaking at times. With Glade and libglade, instead of creating code, you create XML files, and your code links to those files wherever a button, an entry box or an output text buffer is located.

To start, you need Glade v2 if you don't have it already. Even if you do, you may want the latest version of it. Downloading and installing Glade v2 should be easy enough once you have the GTK v2 development packages (the -devel RPMs) installed. However, for most people new to GUI development, the starting window for Glade is intimidatingly blank. To begin your application, click the Window icon. Now, you should have a big blank window on your screen (Figure 1).

Figure 1. The cross-hatched area in the starting window is a place to put another widget.

The important thing to learn about GUI development is that there are basically two types of objects: widgets, such as labels and entry boxes and other things you can see, and containers for those widgets. Most likely, you will use one of three kinds of containers: the vertical box, the horizontal box or the table. To create complex layouts, it's easiest to nest these containers together in whatever order you need. For example, click on the vertical box icon. Clicking on the hatched area in window1 inserts three more areas where you can add widgets. Your new window1 should look like Figure 2.

Figure 2. A basic three-pane vbox with the top pane selected.

You now can select any of those three areas and further divide it with a horizontal box. If you don't like the results, you always can go back and delete, cut and paste or change the number of boxes from the Properties menu (more on that later).

Figure 3. The top pane has been split by a two-pane hbox, which is selected.

You can use these sorts of primitives to create almost any sort of layout. Now that we have a beginning layout, we can fill it with widgets that actually do something. In this case, I'll fill them with a label, a text entry, a spinbutton and a button. At first this looks pretty ugly (Figure 4).

Figure 4. The initial window filled in with widgets.

Remember that GTK auto-adjusts the sizes of the finished product when it is displayed, so everything is packed together as tightly as possible. When the user drags the corner of the window, it's going to autoexpand as well. You can adjust these settings in the Properties window (go to the main Glade window and click View→Show Properties). The Properties window changes different values for different kinds of widgets. If the spinbutton is focused, for example, we see the options shown in Figure 5.

Figure 5. The Glade interface for changing a widget's properties is customized for each type of widget.

By changing the Value option, we can change what the spinbutton defaults to when displayed. Also important is to change the Max value. A common mistake is to change the Value to something high but forget the Max, which causes the spinbutton initially to display the default but then revert to the Max value when it is changed, confusing the user. In our case, we're going to use the spinbutton as a TCP port, so I'll set the Max to 65535, the minimum to 1 and the default to 80.

Then, focus on label1 and change it to read Host:. By clicking on window1 in the main Glade window, you can focus on the entire window, allowing you to change its properties as well. You also can do this by bringing up the widget tree window and clicking on window1. Changing the name to serverinfo and the title to Server Info sets the titlebar and the internal Glade top-level widget name appropriately for this application. If you go to the widget tree view and click on hbox1, you can increase the spacing between Host: and the text-entry box. This may make it look a little nicer. Our finished GUI looks like Figure 6.

Figure 6. The GUI in Glade does not look exactly like it does when rendered, so don't worry about the size of the Host: area.

Normally, this would take only a few minutes to put together. After a bit of practice, you'll find that putting together even the most complex GUIs using Glade can be accomplished in minutes. Compare that to the time it takes to type in all those Tk commands manually to do the same thing. This GUI, of course, doesn't do anything yet. We need to write the Python code that loads the .glade file and does the actual work. In fact, I tend to write two Python files for each Glade-driven project. One file handles the GUI, and the other file doesn't know anything about that GUI. That way, porting from GTK v1 to GTK v2 or even to another GUI toolkit is easy.

Creating the Python Program


First, we need to deal with any potential version skew. I use the following code, although a few other entries mentioned in the FAQ do similar things:
#!/usr/bin/env python
import sys

try:
    import pygtk
    # tell pyGTK, if possible, that we want GTKv2
    pygtk.require("2.0")
except:
    # Some distributions come with GTK2, but not pyGTK
    pass

try:
    import gtk
    import gtk.glade
except:
    print "You need to install pyGTK or GTKv2 ",
    print "or set your PYTHONPATH correctly."
    print "try: export PYTHONPATH=",
    print "/usr/local/lib/python2.2/site-packages/"
    sys.exit(1)

# now we have both gtk and gtk.glade imported
# Also, we know we are running GTK v2

Now we are going to create a GUI class called appGUI. Before we do that, though, we need to open button1's properties and add a signal. To do that, click the three dots, scroll to clicked, select it and then click Add. You should end up with something like Figure 7.

Figure 7. After Adding the Event (Signal) Handler

With this in place, the signal_autoconnect causes any click of the button to call one of our functions (button1_clicked). You can see the other potential signals to be handled in that list as well. Each widget may have different potential signals. For example, capturing a text-changed signal on a text-entry widget may be useful, but a button never changes because it's not editable. Initializing the application and starting gtk.mainloop() gets the ball rolling. Different event handlers need to have different numbers of arguments. The clicked event handler gets only one argument, the widget that was clicked. While you're at it, add the destroy event to the main window, so the program exits when you close the window. Don't forget to save your Glade project.
class appgui:
    def __init__(self):
        """
        In this init we are going to display the main
        serverinfo window
        """
        gladefile = "project1.glade"
        windowname = "serverinfo"
        self.wTree = gtk.glade.XML(gladefile, windowname)
        # we only have two callbacks to register, but
        # you could register any number, or use a
        # special class that automatically
        # registers all callbacks. If you wanted to pass
        # an argument, you would use a tuple like this:
        # dic = { "on_button1_clicked" :
        #         (self.button1_clicked, arg1, arg2), ... }
        dic = { "on_button1_clicked" : self.button1_clicked,
                "on_serverinfo_destroy" : gtk.mainquit }
        self.wTree.signal_autoconnect(dic)
        return

    ##### CALLBACKS
    def button1_clicked(self, widget):
        print "button clicked"

# we start the app like this...
app = appgui()
gtk.mainloop()

It's important to make sure, if you installed pyGTK from source, that you set the PYTHONPATH environment variable to point to /usr/local/lib/python2.2/site-packages/ so pyGTK can be found correctly. Also, make sure you copy project1.glade into your current directory. You should end up with something like Figure 8 when you run your new program. Clicking GO! should produce a nifty "button clicked" message in your terminal window.

Figure 8. The Initial Server Info GUI

To make the application actually do something interesting, you need to have some way to determine which host and which port to use. The following code fragment, put into the button1_clicked() function, should do the trick:
host = self.wTree.get_widget("entry1").get_text()
port = int(self.wTree.get_widget("spinbutton1").get_value())
if host == "":
    return
import urllib
page = urllib.urlopen("http://" + host + ":" + str(port) + "/")
data = page.read()
print data

Now when GO! is clicked, your program should go off to a remote site, grab a Web page and print the contents on the terminal window. You can spice it up by adding more rows to the hbox and putting other widgets, like a menubar, into the application.

You also can experiment with using a table instead of nested hboxes and vboxes for layout, which often creates nicer looking layouts with everything aligned.

TextViews
You don't really want all that text going to the terminal, though, do you? It's likely you want it displayed in another widget or even in another window. To do this in GTK v2, use the TextView and TextBuffer widgets. GTK v1 had an easy-to-understand widget called, simply, GtkText. Add a TextView to your Glade project and put the results in that window. You'll notice that a scrolledwindow is created to encapsulate it. Add the lines below to your init() to create a TextBuffer and attach it to your TextView. Obviously, one of the advantages of the GTK v2 way of doing things is the two different views can show the same buffer. You also may want to go into the Properties window for scrolledwindow1 and set the size to something larger so you have a decent view space:
self.logwindowview = self.wTree.get_widget("textview1")
self.logwindow = gtk.TextBuffer(None)
self.logwindowview.set_buffer(self.logwindow)

In your button1_clicked() function, replace the print statement with:


self.logwindow.insert_at_cursor(data,len(data))

Now, whenever you click GO! the results are displayed in your window. By dividing your main window with a set of vertical panes, you can resize this window, if you like (Figure 9).

Figure 9. Clicking GO! loads the Web page and displays it in the TextView.

TreeViews and Lists


Unlike GTK v1, under GTK v2 a tree and a list basically are the same thing; the difference is the kind of store each of them uses. Another important concept is the TreeIter, which is a datatype used to store a pointer to a particular row in a tree or list. It doesn't offer any useful methods itself, that is, you can't ++ it to step through the rows of a tree or list. However, it is passed into the TreeView methods whenever you want to reference a particular location in the tree. So, for example:
import gobject
self.treeview = self.wTree.get_widget("treeview1")
self.treemodel = gtk.TreeStore(gobject.TYPE_STRING,
                               gobject.TYPE_STRING)
self.treeview.set_model(self.treemodel)

defines a tree model with two columns, each containing a string. The following code adds some titles to the top of the columns:
self.treeview.set_headers_visible(gtk.TRUE)

renderer = gtk.CellRendererText()
column = gtk.TreeViewColumn("Name", renderer, text=0)
column.set_resizable(gtk.TRUE)
self.treeview.append_column(column)

renderer = gtk.CellRendererText()
column = gtk.TreeViewColumn("Description", renderer, text=1)
column.set_resizable(gtk.TRUE)
self.treeview.append_column(column)

self.treeview.show()

You could use the following function to add data manually to your tree:
def insert_row(model, parent, firstcolumn, secondcolumn):
    myiter = model.insert_after(parent, None)
    model.set_value(myiter, 0, firstcolumn)
    model.set_value(myiter, 1, secondcolumn)
    return myiter

Here's an example that uses this function. Don't forget to add treeview1 to your glade file, save it and copy it to your local directory:
model = self.treemodel
insert_row(model, None, 'Helium',
           'Control Current Helium')
syscallIter = insert_row(model, None,
                         'Syscall Redirection',
                         'Control Current Syscall Proxy')
insert_row(model, syscallIter, 'Syscall-shell',
           'Pop-up a syscall-shell')

The screenshot in Figure 10 shows the results. I've replaced the TextView with a TreeView, as you can see.

Figure 10. An Example TreeView with Two Columns

A list is done the same way, except you use ListStore instead of TreeStore. Also, most likely you will use ListStore.append() instead of insert_after().

Using Dialogs
A dialog differs from a normal window in one important way: it returns a value. To create a dialog box, click on the dialog box button and name it. Then, in your code, render it with gtk.glade.XML(gladefile, dialogboxname). Then call get_widget(dialogboxname) to get a handle to that particular widget and call its run() method. If the result is gtk.RESPONSE_OK, the user clicked OK. If not, the user closed the window or clicked Cancel. Either way, you can destroy() the widget to make it disappear. One catch when using dialog boxes: if an exception happens before you call destroy() on the widget, the now unresponsive dialog box may hang around, confusing your users. Call widget.destroy() right after you receive the response and all the data you need from any entry boxes in the widget.
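The "destroy it no matter what" advice boils down to a try/finally pattern, which can be sketched without GTK at all. The FakeDialog class below is a stand-in invented for this sketch to mimic run()/destroy(); it is not part of pyGTK:

```python
RESPONSE_OK = 1  # stand-in for gtk.RESPONSE_OK

class FakeDialog:
    """Mimics just enough of a GTK dialog to show the cleanup pattern."""
    def __init__(self):
        self.destroyed = False
    def run(self):
        return RESPONSE_OK     # pretend the user clicked OK
    def destroy(self):
        self.destroyed = True  # in GTK this would remove the window

def ask_user(dialog):
    try:
        response = dialog.run()
        # read any entry boxes here, while the widget still exists
        return response == RESPONSE_OK
    finally:
        dialog.destroy()       # runs even if an exception was raised above

dialog = FakeDialog()
ok = ask_user(dialog)
print(ok, dialog.destroyed)    # True True
```

Because destroy() sits in the finally block, the dialog is cleaned up on the exception path too, which is exactly the failure mode described above.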

Using input_add() and gtk.mainiteration() to Handle Sockets


Some day, you probably will write a pyGTK application that uses sockets. When doing so, be aware that while your events are being handled, the application isn't doing anything else. When waiting on a socket.accept(), for example, you are going to be stuck looking at an unresponsive application. Instead, use gtk.input_add() to add any sockets that may have read events to GTK's internal list. This allows you to specify a callback to handle whatever data comes in over the sockets. One catch when doing this is you often want to update your windows during your event, necessitating a call to gtk.mainiteration(). But if you call gtk.mainiteration() while within gtk.mainiteration(), the application freezes. My solution for CANVAS was to wrap any calls to gtk.mainiteration() within a check to make sure I wasn't recursing. I check for pending events, like a socket accept(), any time I write a log message. My log function ends up looking like this:

def log(self, message, color):
    """
    logs a message to the log window
    right now it just ignores the color argument
    """
    message = message + "\n"
    self.logwindow.insert_at_cursor(message, len(message))
    self.handlerdepth += 1
    if self.handlerdepth == 1 and gtk.events_pending():
        gtk.mainiteration()
    self.handlerdepth -= 1
    return
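Stripped of GTK, the guard is just a depth counter, and you can watch it work in plain Python. Here process_events is a hypothetical stand-in for the gtk.events_pending()/gtk.mainiteration() pair:

```python
class Logger:
    def __init__(self, process_events):
        self.handlerdepth = 0
        self.lines = []
        self.process_events = process_events  # stand-in for gtk.mainiteration()

    def log(self, message):
        self.lines.append(message)
        self.handlerdepth += 1
        # service pending events only at the outermost call, so an
        # event handler that itself logs cannot recurse forever
        if self.handlerdepth == 1:
            self.process_events()
        self.handlerdepth -= 1

# simulate a pending event whose handler writes a log message of its own
logger = Logger(lambda: logger.log("nested event message"))
logger.log("outer message")
print(logger.lines)   # the nested call ran once, then stopped
```

Without the depth check, the outer log would trigger the event, whose handler logs again, which would service events again, and so on; with it, the nested call simply appends its message and returns.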

Moving a GUI from GTK v1 to GTK v2


The entry in the pyGTK FAQ on porting your application from GTK v1 to GTK v2 is becoming more and more complete. However, you should be aware of a few problems you're going to face. Obviously, all of your GtkText widgets need to be replaced with GtkTextView widgets. The corresponding code in the GUI also must be changed to accommodate that move. Likewise, any lists or trees you've done in GTK v1 have to be redone. What may come as a surprise is that you also need to redo all dialog boxes, remaking them in GTK v2 format, which looks much nicer. Also, a few syntax changes occurred, such as GDK moving to gtk.gdk and libglade moving to gtk.glade. For the most part, these are simple search and replaces. Use GtkTextBuffer.insert_at_cursor() instead of GtkText.insert_defaults() and radiobutton.get_active() instead of radiobutton.active, for example. You can convert your Glade v1 file into a Glade v2 file using the libglade distribution's Python script. This gets you started on your GUI, but you may need to load Glade v2 and do some reconfigurations before porting your code.

Final Notes

Don't forget you can cut and paste from the Glade widget tree. This can make a redesign quick and painless. Unset any possible positions in the Properties window so your startup doesn't look weird. If you have a question you think other people might too, add it to the pyGTK FAQ. The GNOME IRC server has a useful #pygtk channel. I couldn't have written CANVAS without the help of the people on the channel, especially James Henstridge. It's a tribute to the Open Source community that the principal developers often are available to answer newbie questions. The finished demo code is available from ftp.linuxjournal.com/pub/lj/listings/issue113/6586.tgz.

Dave Aitel is the founder of Immunity, Inc., a New York-based security consulting company. CANVAS is Immunity's penetration testing and exploit development framework, written entirely in Python using pyGTK.
More information on Immunity is available at www.immunitysec.com.

Writing a GUI app with Python & Glade


Here is what is to be created:

This tutorial will assume a basic knowledge of programming in general, and an understanding of Python. You don't even really need that, but if you don't know Python well, you might need to look some stuff up; Google is your friend. This tutorial also assumes you are running Linux. I don't know about Glade on Windows, so you'd have to check that out yourself. You will also need to install Python and Glade.

Writing a GUI application to do a task may seem like a difficult job, but in fact, it can be very simple. Python is the language of choice for me; it's simple and fast to develop in, with a fair amount of power behind it. First of all, we need an idea of what to do. Let's do something very basic to begin with: a program to add two numbers and display the output. We'll start by creating a simple CLI app, which we'll convert. tutCLI.py: Code:
class adder:
    result = 0
    def __init__( self, number1, number2 ):
        self.result = int( number1 ) + int( number2 )
    def giveResult( self ):
        return str(self.result)

endIt = False
while ( endIt == False ):
    print "Please input two integers you wish to add: "
    number1 = raw_input( "Enter the first number: " )
    number2 = raw_input( "Enter the second number: " )
    try:
        thistime = adder( number1, number2 )
    except ValueError:
        print "Sorry, one of your values was not a valid integer."
        continue
    print "Your result is: " + thistime.giveResult()
    goagain = raw_input( "Do you want to eXit or go again? ('X' to eXit, anything else to continue): " )
    if ( goagain == "x" or goagain == "X" ):
        endIt = True

This is a simple Python script to get the input. The class 'adder' does the actual addition, taking two inputs in the constructor (__init__), adding the two, and storing the result in the 'result' member. The rest is the stuff that makes our CLI work. It gives the user instructions and takes some input, then it adds the numbers (note that this is in a try/except block to catch any exception that is thrown, so we don't show ugly errors to the user if they input something which is not an integer). It then prints the result and asks if the user wants to go again, all pretty self-explanatory. Now we need to create a GUI for this. Open up Glade. Press the 'New Window' button on the left, under 'Toplevels'. This will give you an empty window.
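Before moving on to the GUI, the adder logic can be exercised on its own (the class is repeated here so the snippet stands alone); this also shows exactly when the ValueError we catch is raised:

```python
class adder:
    result = 0
    def __init__(self, number1, number2):
        self.result = int(number1) + int(number2)
    def giveResult(self):
        return str(self.result)

# input arrives as strings, just as raw_input would deliver it
total = adder("2", "40").giveResult()
print(total)                 # 42

try:
    adder("2", "forty")      # int("forty") raises ValueError
    caught = False
except ValueError:
    caught = True
print(caught)                # True
```

Because the class knows nothing about input or output, the same test works whether the caller is the CLI loop above or the GUI we are about to build.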

Before we begin, I will make note that glade works in a way that web developers will be used to, by using relative positioning, and splitting the area to arrange things, not by absolute values. You will need to create a vertical box with 3 items. This splits our window into three segments. The top will be instructions, the bottom buttons and information, the middle the entry areas.

Create a label, and put it up top, don't worry about the text at the moment. Next create a table 2 wide and 3 down in the middle, and finally a horizontal box with 2 items at the bottom.

In the middle, put a label in each as the three left-hand items, and a Text Entry as the right hand portion.

In the bottom, make another 2-item horizontal box in the bottom-left hand corner. Now add an image and a label to this box. On the right of the main box, add a two-item button box in the right hand slot. Add a button to each of these slots.

Add some appropriate text to the top label, and Number 1, Number 2, and Result to the labels on the left. This can all be done in the properties area on the right of your screen. Change your image in the bottom left hand corner to 'Stock' and then choose the stock image 'Warning'. Make the label next to it a warning (like "Sorry, one of your values was not a valid integer." in our CLI). For the two buttons, choose 'Stock' for both, and make one 'gtk-quit' and one 'gtk-add' - for obvious reasons.

OK, so we have everything set up, but it looks decidedly wonky and out of proportion. To counter this, you need to edit the way the items expand. Select the first Text Area and go into 'packing'. Turn off the vertical 'Expand' option. Repeat this for all of the Text Areas and labels. You will need to go through your items, forcing them to expand or not, until it looks correct. (Only items in tables have individual horizontal and vertical expand options. For example, the button box should have 'expand' off.) Remember, you can use the tree view to the right to select items, so go through each one one-by-one and try it both expanded and not, and see which works. You'll soon get the hang of what needs to be expanded to make the GUI function correctly.

You should end up with something like this:

Once you have, we still have a little to do in Glade. Choose your Window (probably 'window1' in your treeview) and go to the 'General' tab in the properties window. Change the name to something more suitable; 'windowMain' is what I'll use. Set Window Title to something you want the user to see ("Latty's Amazing Adder!" in my case). Next go to the signals tab and open 'GtkObject'. Click on "destroy", then, in the dropdown menu under 'handler', choose 'on_windowMain_destroy'. This is the event that the GUI will give your application to say the user has tried to quit the application (by clicking on the close icon). Make sure to click somewhere else so it goes out of edit mode before doing anything else, otherwise it'll not save that signal.

Repeat this process for everything you will need to access, changing the names of the entries to something appropriate (these don't need any signals), changing the name of the bottom left hand hbox which contains your warning to something more apt, and changing the names of the two buttons and then creating 'clicked' signals for both of them. Almost there! Finally, we need to make a few small edits. Select the hbox which contains your warning (which should now be renamed; hboxWarning in my case), then set 'visible' to false. We don't want this warning shown to begin with. You need to do the opposite for the main window, which you do want visible, so make sure it is. You will also want to set 'sensitive' (under 'Common') on the result Text Entry to false; this box is for the user to see the result, they don't need to edit it. This will 'grey out' the box (although not in Glade, for some reason). You also need to go to the button box containing your two buttons and set 'pack type' under 'packing' to end, so it doesn't move around when the warning appears and disappears.

There! Done with the GUI. Save it in the same folder as where you will develop your app. "main.glade" in my case. OK, let's begin with our GUI script. First of all, we need to import the libraries we will need. Code:
import sys
try:
    import pygtk
    pygtk.require("2.0")
except:
    pass
try:
    import gtk
    import gtk.glade
except:
    print("GTK Not Available")
    sys.exit(1)

The sys library is there purely to allow us to call 'sys.exit' - used to exit the application. The rest is to import the GTK, our graphical library. It should all make sense. Next we need to have our adder class again, this is unchanged from our original application (the wonders of object orientation, kids): Code:
class adder:
    result = 0
    def __init__( self, number1, number2 ):
        self.result = int( number1 ) + int( number2 )
    def giveResult( self ):
        return str(self.result)

Now, here comes the juicy bit, the GUI class. Code:
class leeroyjenkins:
    wTree = None
    def __init__( self ):
        self.wTree = gtk.glade.XML( "main.glade" )
        dic = {
            "on_buttonQuit_clicked" : self.quit,
            "on_buttonAdd_clicked" : self.add,
            "on_windowMain_destroy" : self.quit,
        }
        self.wTree.signal_autoconnect( dic )
        gtk.main()

    def add(self, widget):
        try:
            thistime = adder(
                self.wTree.get_widget("entryNumber1").get_text(),
                self.wTree.get_widget("entryNumber2").get_text() )
        except ValueError:
            self.wTree.get_widget("hboxWarning").show()
            self.wTree.get_widget("entryResult").set_text("ERROR")
            return 0
        self.wTree.get_widget("hboxWarning").hide()
        self.wTree.get_widget("entryResult").set_text(thistime.giveResult())

    def quit(self, widget):
        sys.exit(0)

Basically, it loads the widget tree from your glade file, and then creates a list of signals and what methods they point to. In our case, closing the window and clicking quit both make it quit (a member function which is pretty obvious), and clicking add calls our add member function. The gtk.main() function just makes a loop which will display the GUI and handle any signals as you have told it to. Notice members called by signals need a 'widget' parameter. This is the widget that set off that signal. We don't need this, but you might in another case. Our add function should look very familiar, and should be pretty self-explanatory. Alright, that's it! Here is the end result in full: Code:
import sys
try:
    import pygtk
    pygtk.require("2.0")
except:
    pass
try:
    import gtk
    import gtk.glade
except:
    print("GTK Not Available")
    sys.exit(1)

class adder:
    result = 0
    def __init__( self, number1, number2 ):
        self.result = int( number1 ) + int( number2 )
    def giveResult( self ):
        return str(self.result)

class leeroyjenkins:
    wTree = None
    def __init__( self ):
        self.wTree = gtk.glade.XML( "main.glade" )
        dic = {
            "on_buttonQuit_clicked" : self.quit,
            "on_buttonAdd_clicked" : self.add,
            "on_windowMain_destroy" : self.quit,
        }
        self.wTree.signal_autoconnect( dic )
        gtk.main()

    def add(self, widget):
        try:
            thistime = adder(
                self.wTree.get_widget("entryNumber1").get_text(),
                self.wTree.get_widget("entryNumber2").get_text() )
        except ValueError:
            self.wTree.get_widget("hboxWarning").show()
            self.wTree.get_widget("entryResult").set_text("ERROR")
            return 0
        self.wTree.get_widget("hboxWarning").hide()
        self.wTree.get_widget("entryResult").set_text(thistime.giveResult())

    def quit(self, widget):
        sys.exit(0)

letsdothis = leeroyjenkins()

Note the last line which creates an object of our GUI class, this makes the program actually begin.
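What signal_autoconnect does can be mimicked in plain Python, which helps demystify the wiring: it is essentially a dictionary lookup from handler names to bound methods. FakeWidgetTree below is a stand-in written for this sketch; it is not the real gtk.glade.XML class:

```python
class FakeWidgetTree:
    """Stand-in that mimics signal_autoconnect: a name-to-callable map."""
    def __init__(self):
        self.handlers = {}
    def signal_autoconnect(self, dic):
        self.handlers.update(dic)
    def emit(self, signal_name, widget=None):
        # look up the handler by name and call it, passing the widget,
        # just as GTK passes the emitting widget to your callback
        return self.handlers[signal_name](widget)

class App:
    def __init__(self, tree):
        self.clicks = 0
        tree.signal_autoconnect({"on_buttonAdd_clicked": self.add})
    def add(self, widget):
        self.clicks += 1
        return "added"

tree = FakeWidgetTree()
app = App(tree)
print(tree.emit("on_buttonAdd_clicked"))   # added
```

Because the handlers are bound methods, each "signal" carries its object's state along with it; that is why clicking add can update widgets that belong to the same class instance.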

There we have it, a simple GUI app with Python and Glade. It's not that hard to do. This example, of course, may seem like a lot of work for little result, as the resulting application is not that useful, but what you have learnt here can easily be applied to other things. If you want to see a fully functional app written in python/glade, check out simpleconf. This was written for OCNix and is made in exactly the same way as is outlined here. You should be able to see how it functions. The source code is right there, so take a look. Edit: Forgot to mention, there is a reason to use the stock buttons wherever possible. They are automatically translated to the language the user is currently using, and change icon with the theme the user is using, so it always fits in.

GTK+ and Glade3 GUI Programming Tutorial


Part 1 - Designing a User Interface Using Glade 3

1. Quick Overview of GTK+ Concepts
2. Introduction to Glade3
3. Getting Familiar with the Glade Interface
4. Manipulating Widget Properties
5. Specifying Callback Functions for Signals
6. Adding Widgets to the GtkWindow
7. How Packing Effects the Layout
8. Editing the Menu (or Toolbar)
9. Final Touches to the Main Window
10. Getting Additional Help Using Glade
11. What Next?

Designing a User Interface using Glade3


In part 1 of the GTK+ and Glade3 GUI Programming Tutorial series, we will be designing the graphical user interface (GUI) for a GTK+ text editor application (shown left) which will be used throughout these tutorials. This GUI design will be created using the Glade Interface Designer and is completely independent of the programming language used to implement the design, which will come in subsequent tutorials.

Quick Overview of GTK+ Concepts

If you have no experience with GTK+, you may struggle with some of the concepts I am going to cover. Although I am going to attempt to teach some of these concepts on the fly, it would serve you well to read up on these ideas further, perhaps after working through part 1 of this tutorial. Understanding the fundamental concepts of GTK+ will be instrumental in your ability to effectively use Glade.

First of all, GTK+ is not a programming language. GTK+ is a toolkit, or a collection of libraries, which developers can use to develop GUI applications for Linux, OSX, Windows, and any other platform on which GTK+ is available. It can be thought of in the same terms as MFC or the Win32 API on Windows, Swing and SWT in Java, or Qt (the "other" Linux GUI toolkit used by KDE). Although GTK+ itself is written in C, there are a multitude of language "bindings" allowing programmers to develop GTK+ applications in the language of their choice, including C++, Python, Perl, PHP, Ruby, and many others.

GTK+ is based on 3 main libraries: GLib, Pango, and ATK; however, we primarily work with GTK+ and let GTK+ do its magic with those 3 libraries. GLib wraps most of the standard C library functions for portability (allowing your code to run on Windows and Linux if desired). We use GLib a lot when working in C or C++, which I will explain more thoroughly when implementing our design using C.

Higher-level languages such as Python and Ruby won't have to worry about GLib, as they have their own standard libraries which provide similar functionality.

GTK+ and associated libraries implement an object-oriented approach through GObject. How this works isn't important just yet, and different programming languages will expose this to you differently; however, it's important to understand that GTK+ uses object orientation (yes, even in C). Every piece of a GTK+ GUI is comprised of one or more "widgets", which are objects. All widgets are derived from a base widget called GtkWidget. For example, an application's window is a widget called GtkWindow. The toolbar within that window is a widget called GtkToolbar. Although a GtkWindow is also a GtkWidget, a GtkWidget is not necessarily a GtkWindow. Child widgets are derived from their parent objects to extend the functionality of that object. These are standard OOP (object-oriented programming) concepts (hint: Google search "object oriented programming" if this is a new concept). We can look at any widget in the GTK+ reference documentation to see which objects it is derived from. In the case of GtkWindow, it looks something like this:
GObject
 +----GInitiallyUnowned
       +----GtkObject
             +----GtkWidget
                   +----GtkContainer
                         +----GtkBin
                               +----GtkWindow
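The derivation chain above can be mimicked with plain Python classes to see what "a GtkWindow is a GtkWidget, but a GtkWidget is not necessarily a GtkWindow" means in practice. This is purely an illustrative sketch; the class names mirror the real GTK+ types, but there is no GTK+ code here.

```python
# Stand-in classes mirroring the GtkWindow object hierarchy shown above.
# Illustration only -- the real types live in GTK+, not here.

class GObject: pass
class GInitiallyUnowned(GObject): pass
class GtkObject(GInitiallyUnowned): pass
class GtkWidget(GtkObject): pass
class GtkContainer(GtkWidget): pass
class GtkBin(GtkContainer): pass
class GtkWindow(GtkBin): pass

window = GtkWindow()
print(isinstance(window, GtkWidget))        # -> True: every GtkWindow is a GtkWidget
print(isinstance(GtkWidget(), GtkWindow))   # -> False: not the other way around
```

This "is-a" relationship is exactly why the reference documentation for a parent object applies to its descendants.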

As you can see, a GtkWindow is derived from GtkBin, which is derived from GtkContainer, and so on. For your first application, you don't need to worry about anything above the GtkWidget object. The reason this hierarchy is so important is that when you're looking for functions, properties, and signals for any particular widget, you need to realize that the functions, properties, and signals of its parent objects apply to it as well. In part 2, this will become even more apparent when writing code for this example application.

We also begin to see a naming convention emerge, which is pretty handy: we can easily tell which library an object or function is from. All objects beginning with Gtk are from GTK+. Later, we'll see things like GladeXML, which is part of Libglade, or GError, which is part of GLib. All objects (and thus widgets) are in camel case. The functions which manipulate these objects are in lower case with underscores for spaces. For example, gtk_window_set_title() is a function to set the title property of a GtkWindow object.

All the reference documentation you will need is available online from library.gnome.org/devel/references; however, it is much easier to use Devhelp, which is likely available as a package for your distribution. Devhelp allows you to browse and search the API documentation for the libraries you have installed on your system (assuming you install that library's documentation as well).

Introduction to Glade3

Glade is a RAD (Rapid Application Development) tool for designing GTK+ applications. Glade is a GTK+ application itself. It is simply a piece of software developers use to simplify the process of laying out an application's interface. Glade creates what will henceforth be referred to as a "glade file". A glade file is actually an XML file which describes the hierarchy of the widgets comprising the interface. Glade originally generated C code to build the GUI (and you'll still find examples and tutorials doing this).
This was later discouraged in favor of using a library, Libglade, to build the interface at run time. And finally, as of Glade3, the old method has become deprecated. That means the ONLY thing Glade does is allow you to generate a glade file which describes how the GUI is going to be built. This allows more flexibility for the developer, prevents having to re-compile applications when a minor interface change is needed, and allows more programming languages to be used with Glade.

Glade3 has had significant changes since previous versions such as Glade2. Glade3 has been available for some time, and you shouldn't have any problems obtaining it. The package manager for your distribution (yum, aptitude, etc.) should have Glade3 available. You should note, however, that the package will have a 3 in its name. Where 'glade' may be the name of the old package, 'glade-3' or 'glade3' will be the package name for the new version on which this tutorial is based. Glade is also available from source at glade.gnome.org.

Getting Familiar with the Glade Interface

Start up Glade and let's get familiar with its interface. I will be referring to various aspects of Glade by the names described here. On the left-hand side is the "Palette". The Palette is like that of a graphics editing application. It is a palette of GtkWidgets which you can use to design your application. In the middle area (which is empty when you first start Glade) is the "Editor". This is where you see your design in progress. On the right-hand side is the "Inspector" on top and the widget "Properties" below that. The Inspector shows your design as a tree, allowing you to access and view the hierarchy of the widgets making up your design. We manipulate various properties of widgets in the Properties tabs, including specifying callback functions for signals (explained later).

So, the very first thing we're going to do is create a toplevel widget and save our file. To do this, click on the GtkWindow icon in the Palette under the 'Toplevels' section. You should notice a gray box show up inside the Editor area of Glade. This is the workable area of a GtkWindow. The titlebar, close button, etc. will be added to the widget by the window manager (i.e. GNOME), so we don't see it while editing. We will always start with a toplevel widget in Glade, typically a GtkWindow. Before going further, save the file as "tutorial.glade".

Now the file you just saved, "tutorial.glade", is an XML file. If you were to open it up in a text editor, it would look something like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE glade-interface SYSTEM "glade-2.0.dtd">
<!--Generated with glade3 3.4.0 on Tue Nov 20 14:05:37 2007 -->
<glade-interface>
  <widget class="GtkWindow" id="window1">
    <property name="events">GDK_POINTER_MOTION_MASK | GDK_POINTER_MOTION_HINT_MASK | GDK_BUTTON_PRESS_MASK | GDK_BUTTON_RELEASE_MASK</property>
    <child>
      <placeholder/>
    </child>
  </widget>
</glade-interface>
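Since the glade file is ordinary XML, any language with an XML parser can read it. Here is a quick sketch using Python's standard library; the string below is a trimmed copy of the file, with the DOCTYPE and the events property dropped for brevity.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the tutorial.glade contents.
glade_xml = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<glade-interface>
  <widget class="GtkWindow" id="window1">
    <child>
      <placeholder/>
    </child>
  </widget>
</glade-interface>"""

# Walk the tree and list every widget's class and id -- the same
# information Libglade or GtkBuilder uses to instantiate the GUI.
root = ET.fromstring(glade_xml)
for widget in root.iter("widget"):
    print(widget.get("class"), widget.get("id"))   # -> GtkWindow window1
```

This is only a demonstration that the file format is open; in part 2 we will let GtkBuilder do this parsing for us.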

As you can see, it's just a simple XML file. In part 2 we will be using C with Libglade to parse this XML file and generate the UI at run time. Being XML, this could just as easily be parsed by a Python program or any other language. As we continue to work in Glade, this file will be updated to describe our interface in this XML format any time we save. Exit out of your text editor and return to Glade.

Manipulating Widget Properties

The Editor of Glade now shows an empty GtkWindow widget. We are going to manipulate some of the widget's properties. If you look in the Properties pane of Glade, you will see 4 tabs: 'General', 'Packing', 'Common', and 'Signals'. Let's talk about the first 2 tabs. GtkWidgets typically have various properties which manipulate how they function and/or how they are displayed on the screen. If you look at the reference documentation for a GtkWindow and scroll down to the "Properties" section, you'll see a list of the properties for a GtkWindow. These are typically the properties which appear in the 'General' tab of the Glade Properties pane and will vary from widget to widget. The name property exists for every widget in Glade and is what we will use to reference the widget when it comes time to write code for the application. Change the 'name' property of this GtkWindow from "window1" to "window". Then, add the text "GTK+ Text Editor" to the 'Window Title' property.

We'll discuss the 'Packing' tab in a bit, but first, let's look at the 'Common' tab. This tab also contains properties which belong to our GtkWindow; however, we don't see them in the reference documentation for GtkWindow. This is because this is where we set properties which are inherited from parent objects. Looking at the reference documentation for a GtkWindow in the section called "Object Hierarchy", you'll see the objects from which GtkWindow is derived. Click on the GtkContainer link to jump to the reference documentation for a GtkContainer. You'll notice that GtkContainer has a property called "border-width", and we have a property in Glade for "Border width" at the bottom of the 'Common' tab. We'll learn more about what a container widget is later; however, this demonstrates how important that object hierarchy is. Since many widgets are derived from GtkContainer, Glade puts its properties into the 'Common' tab. In the "Object Hierarchy" section of the reference documentation for a GtkContainer, you'll see that it is derived from a GtkWidget. Now click the GtkWidget link to jump to the reference documentation for a GtkWidget. The GtkWidget has a bunch of properties, many of which are also shown in the 'Common' tab of Glade's Properties pane. These are properties which are common to all GTK+ widgets, since all GTK+ widgets are derivatives of GtkWidget.
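The 'Common' tab is really just inheritance of properties at work. A toy Python sketch may make it concrete; these are stand-in classes and made-up attribute names, not real GTK+ properties or API.

```python
# Toy sketch of why 'border-width' shows up under the 'Common' tab:
# a property defined on an ancestor is usable on every descendant.
# Stand-in classes only; the real properties belong to GTK+.

class GtkWidget:
    def __init__(self):
        self.visible = False       # GtkWidget-level: common to ALL widgets

class GtkContainer(GtkWidget):
    def __init__(self):
        super().__init__()
        self.border_width = 0      # GtkContainer-level: common to containers

class GtkBin(GtkContainer):
    pass

class GtkWindow(GtkBin):
    def __init__(self):
        super().__init__()
        self.title = ""            # GtkWindow-level: specific to windows

window = GtkWindow()
window.border_width = 1            # inherited from GtkContainer
window.title = "GTK+ Text Editor"  # defined on GtkWindow itself
print(window.border_width, window.visible)   # -> 1 False
```

Glade simply gathers the inherited levels (GtkWidget, GtkContainer, and so on) into the 'Common' tab, and keeps the widget's own properties in 'General'.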

Specifying Callback Functions for Signals

Objects emit a "signal" when something happens that might be useful to the programmer. These are similar to "events" in Visual Basic. If a user does anything within your GUI, chances are signals are being emitted. As a programmer, you choose which signals you want to capture in order to perform a task, and connect a callback function to each of those signals. The first signal we'll learn, and the one you'll use in just about every GTK+ application you write, is the "destroy" signal emitted by GtkObject. This signal is emitted whenever a GtkObject is destroyed. This is important, because when the user closes the window through the little 'x' up in the title bar or any other means, the widget is destroyed. We want to capture this signal and exit our application properly. This is better illustrated when we write code for this GUI; however, for now, let's just specify the function we want to call when the "destroy" signal is emitted for our GtkWindow. Look at the 'Signals' tab in the Glade Properties pane. You see a tree view where GtkWindow and each of the objects from which it is derived are listed. If you expand GtkObject, you'll see all the signals emitted by GtkObject. These correspond to the reference documentation for a GtkObject in the "Signals" section.

Under the 'Handler' column, click the gray text "<Type here>" to begin editing. Select 'on_window_destroy' from the drop-down and then hit ENTER. We can type anything we want here; however, Glade provides a drop-down list of some of the more common callback function naming conventions. How this value is used depends on how the programmer connects signals in the code; however, for this tutorial, we want the GtkWindow's "destroy" signal to be associated with the handler string "on_window_destroy". We'll look at this closer in Part 2.
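Before moving on, it may help to see what "connecting a signal to a handler" means with all the GTK+ machinery stripped away. The sketch below is a toy Python model (FakeWindow is a made-up class for illustration): Glade only records the handler name as a string, and at run time the program maps that name to an actual function, much as gtk_builder_connect_signals() will do for us in Part 2.

```python
# Toy model of the signal/callback mechanism -- no GTK+ involved.
# 'FakeWindow' is a made-up stand-in class for illustration.

class FakeWindow:
    def __init__(self):
        self.handlers = {}              # signal name -> callback function

    def connect(self, signal, callback):
        self.handlers[signal] = callback

    def emit(self, signal):
        # Fire whatever callback was connected to this signal, if any.
        if signal in self.handlers:
            self.handlers[signal]()

events = []

def on_window_destroy():
    events.append("quit")               # stands in for gtk_main_quit()

window = FakeWindow()
window.connect("destroy", on_window_destroy)
window.emit("destroy")                  # simulate the user closing the window
print(events)                           # -> ['quit']
```

The real signal system is far richer (arguments, return values, multiple handlers per signal), but the shape is the same: register a callback under a signal name, and it runs when the signal fires.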

At this point, we actually have a working GUI. We could write a few lines of code in C, Python, Ruby, or any number of programming languages which would show our empty window and then properly terminate when we clicked the "x" in the titlebar. For this tutorial, however, I will be showing you how to build the entire GUI in Glade3 before writing any code. To satisfy any possible curiosity, though, here is what a simple program implementing this Glade interface so far would look like.

In C:
/* First run tutorial.glade through gtk-builder-convert with this command:

       gtk-builder-convert tutorial.glade tutorial.xml

   Then save this file as main.c and compile it using this command
   (those are backticks, not single quotes):

       gcc -Wall -g -o tutorial main.c `pkg-config --cflags --libs gtk+-2.0` -export-dynamic

   Then execute it using:

       ./tutorial
*/

#include <gtk/gtk.h>

void
on_window_destroy (GtkObject *object, gpointer user_data)
{
    gtk_main_quit ();
}

int
main (int argc, char *argv[])
{
    GtkBuilder *builder;
    GtkWidget  *window;

    gtk_init (&argc, &argv);

    builder = gtk_builder_new ();
    gtk_builder_add_from_file (builder, "tutorial.xml", NULL);
    window = GTK_WIDGET (gtk_builder_get_object (builder, "window"));
    gtk_builder_connect_signals (builder, NULL);
    g_object_unref (G_OBJECT (builder));

    gtk_widget_show (window);
    gtk_main ();

    return 0;
}

In Python (note: you must set the 'visible' property of 'window' to "Yes" in the 'Common' properties tab in Glade)
#!/usr/bin/env python

# First run tutorial.glade through gtk-builder-convert with this command:
#
#     gtk-builder-convert tutorial.glade tutorial.xml
#
# Then save this file as tutorial.py and make it executable using this command:
#
#     chmod a+x tutorial.py
#
# And execute it:
#
#     ./tutorial.py

import pygtk
pygtk.require("2.0")
import gtk

class TutorialApp(object):
    def __init__(self):
        builder = gtk.Builder()
        builder.add_from_file("tutorial.xml")
        builder.connect_signals({ "on_window_destroy" : gtk.main_quit })
        self.window = builder.get_object("window")
        self.window.show()

if __name__ == "__main__":
    app = TutorialApp()
    gtk.main()

Again, I'm not going to go over the details of the code used to implement this GUI in this part of the tutorial, but instead focus on using Glade3. However, you can see that implementing an interface designed in Glade takes just a few lines of code in the language of your choosing!

Adding Widgets to the GtkWindow

If you recall from the reference documentation for a GtkWindow, the GtkWindow is a derivative of GtkContainer. Widgets derived from GtkContainer are container widgets, meaning they can contain other widgets. This is another fundamental concept of GTK+ programming. If you come from a Windows programming background, you may be expecting to just drop a bunch of widgets onto the window and drag them around into the position you want them. But this is not how GTK+ works--and for good reason.

GTK+ widgets are "packed" into various containers. Containers can be packed into other containers, and so forth. There are various packing properties which affect how space is allocated for widgets packed into containers. Through these packing properties and nesting of containers, we can have complex GUI designs without having to write code to handle the resizing and re-positioning of our widgets. This is probably one of the more difficult concepts for a new GTK+ developer, so let's just see it in action!

The GtkWindow is a derivative of the container GtkBin, which is a container that contains only one child widget. But this text editor is going to have 3 main sections: a menu bar, a text editing area, and a status bar. Therefore, we use a fundamental GTK+ widget, the GtkVBox. The GtkVBox (vertical box) is a container widget which can contain any number of child widgets stacked up vertically like rows (GtkHBox is the horizontal equivalent). Click on the GtkVBox icon in the Glade Palette under the 'Containers' section. You'll notice that the 'Select' toolbar button at the top of the Glade window is no longer depressed and your mouse cursor is the GtkVBox icon with a plus (+) sign when hovering over the Glade Editor. This means you are ready to drop the GtkVBox somewhere. Click in the gray area of the Editor, which is the empty space of the GtkWindow. A dialog box pops up asking you for the 'Number of items'. In this case we want 3 rows, so leave it as 3 and click 'OK'.

You should see the GtkWindow in the Editor divided into 3 rows now. These are the 3 empty child widgets of the GtkVBox we just added. You should also notice that the 'Select' toolbar button at the top of Glade is once again depressed--allowing you to select widgets within the Editor.

Next, click on the GtkMenuBar icon in the Glade Palette under 'Containers'. Drop this one in the top row of the GtkVBox you just added.

Now click on the GtkScrolledWindow icon in the Glade Palette under 'Containers'. Drop this one into the middle row of the GtkVBox. When you do that, it may not seem like anything has happened. However, you should notice that the middle row looks selected. It's not--the GtkScrolledWindow is. The reason you don't see anything is that a GtkScrolledWindow doesn't have any initial visible components. It's a container which will provide scroll bars when its child widget gets too large. We'll need this for our text editing widget.

Click the GtkTextView icon in the Glade Palette under 'Control and Display'. Drop this one right on top of that GtkScrolledWindow (the middle row). We have now just added the GtkTextView to the GtkScrolledWindow which was added to the GtkVBox.

Finally, click on the GtkStatusbar icon in the Glade Palette under 'Control and Display' and drop it into the bottom row.

And there you have it; the basic layout of our GTK+ text editor. If you look at the Glade Inspector you will see the parent-child relationship of our design.

The Inspector will come in handy. You cannot always click on a widget in the Editor, as it might not be visible. For example, you cannot click on the GtkScrolledWindow we added, because you can only see its child, the GtkTextView. Therefore, if you need to change the properties of "scrolledwindow1", you will have to select it in the Inspector. I mentioned earlier how "packing" is an often frustrating concept for new GTK+ developers. Therefore, I'm going to show you first hand how various packing properties affect the layout of your design.

How Packing Affects the Layout

When you look at the interface we've designed so far, you may take for granted how "smart" Glade was. How did it know we didn't want to make the status bar taller? Moreover, if you resize the window, how does it know that the text editing area grows to fill the new vertical space? Well, Glade guessed. It applied default packing options which are often what we want--but not always. The best way to learn about packing is to play around with packing properties in Glade, as you can see the effects in real time. First, a quick description of the applicable properties. Once you get a feel for GTK+, you may want to read more on packing and space allocation.

homogeneous: A property of the container widget which, when set, tells GTK+ to allocate the same amount of space for each child.

expand: A property of the child being packed, specifying whether it should receive extra space when the parent grows.

fill: A property of the child being packed, specifying whether any extra space should be given to the child or used as padding around the child.

Let's look at the default packing for our design. The GtkScrolledWindow has "expand"=TRUE, which means it receives extra space when the parent grows, and it has "fill"=TRUE, which means it uses the extra space it receives. This is how we want it to work.

Widget                                 Property       Value
GtkVBox "vbox1"                        homogeneous    FALSE
GtkMenuBar "menubar1"                  expand         FALSE
                                       fill           TRUE
GtkScrolledWindow "scrolledwindow1"    expand         TRUE
                                       fill           TRUE
GtkStatusbar "statusbar1"              expand         FALSE
                                       fill           TRUE
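To make expand and fill concrete before we play with them in Glade, here is a toy Python model of how a vertical box might hand out extra space. This is a deliberate simplification of GTK+'s real size-allocation machinery, using the widget names from the table above and made-up pixel sizes.

```python
def vbox_allocate(children, extra, homogeneous=False):
    """Toy model of GtkVBox vertical space allocation.

    children: list of (name, requested_height, expand, fill)
    extra:    pixels left over after each child's requested size is granted
    Returns {name: allocated_height}. A simplification for illustration only.
    """
    if homogeneous:
        # Every child gets an equal share of the total height.
        total = sum(req for _, req, _, _ in children) + extra
        share = total // len(children)
        return {name: share for name, _, _, _ in children}

    expanders = [name for name, _, expand, _ in children if expand]
    bonus = extra // len(expanders) if expanders else 0
    alloc = {}
    for name, req, expand, fill in children:
        if expand and fill:
            alloc[name] = req + bonus   # child grows into the extra space
        else:
            alloc[name] = req           # extra becomes padding or goes unused
    return alloc

# The defaults from the table above, with made-up pixel heights:
children = [("menubar1", 20, False, True),
            ("scrolledwindow1", 100, True, True),
            ("statusbar1", 20, False, True)]

print(vbox_allocate(children, extra=60))
# -> {'menubar1': 20, 'scrolledwindow1': 160, 'statusbar1': 20}
```

With the defaults, only the scrolled window (expand=TRUE, fill=TRUE) absorbs the 60 extra pixels, which is exactly the behavior described in the walkthrough that follows: the text area grows while the menu bar and status bar keep their natural heights.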

Now, let's see what homogeneous does. Select the GtkVBox in the Glade Inspector and change its "Homogeneous" property under the 'General' properties tab to "Yes". Now the parent, "vbox1", allocates the same amount of space to each of its children. Since all 3 child widgets have "fill"=TRUE, they fill up the extra space allocated to them.

Set the "Homogeneous" property back to "No". Click on the GtkScrolledWindow "scrolledwindow1" in the Glade Inspector and set the "Expand" property in the 'Packing' properties tab to "No". Now none of the child widgets will recieve the extra space when the GtkVBox grows. I've highlighted the 3 children in the image below to illustrate this. Each of the child widgets is it's initially requested size. The extra space allocated to the GtkVBox is simply unused since none of it's children want it (but still belongs to the GtkVBox).

Now set the "Expand" property of the GtkScrolledWindow back to "Yes" and change the "Fill" property to "No" instead. This time, the extra space is allocated to the GtkScrolledWindow since "expand"=TRUE, however, the GtkScrolledWindow doesn't use the space it was allocated since "fill"=FALSE.

Set the "Fill" property back to "Yes" to restore our original packing properties. I know it seems odd at first, but as you continue to work in Glade, you'll start to pick up on how these packing properties work and once you've conquored that part of the learning curve, you'll be amazed at how little work you have to do related to the position and size of your GUI's elements. Editing the Menu (or Toolbar) Glade3 comes with a new Menu and Toolbar editor. Although we aren't using a GtkToolbar in this tutorial, the process is very similar to that of the GtkMenubar. We will use the Glade3 Menu Editor to remove many of the items we won't be using and to specify signal handlers for the menu items we will be using. Although you can manipulate the properties and signals of GtkMenuItems from the standard Glade properties pane and can remove items from the Glade Inspector, the Glade3 menu editor provides a simpler way to edit your application's menu. Select the GtkMenuBar by clicking it in the Glade Editor or in the Glade Inspector. Then right-click and select 'Edit...' from the popup menu. This will launch the menu editor.

The menu editor contains properties just like Glade's main Properties pane, and signals at the bottom just like the 'Signals' tab of Glade's main Properties pane. The main difference in the editor is the tree view on the left. It allows you to easily add and remove menu items and drag them around. Go ahead and remove the one labeled "_View". This is the only menu item which Glade generates that we won't be using in this tutorial. For the remaining menu items we will need to do a few things. We need to rename them so that we have a clean, legible way to reference them in our source code. Then, we'll make some changes as a workaround for a bug that might affect some readers using GTK+ 2.12.9 or below (Bug #523932), and finally we'll specify signal handlers. The steps for each of the menu items will be the same, so I'll just walk you through the 'New' menu item. Remember, this is all done in the menu editor, but it can also be done using the Inspector and Properties pane in Glade.

1. Click on the new menu item 'gtk-new'
2. Change the 'Name' property under 'Menu Item' to "new_menu_item"
3. Change the 'Label' property under 'Properties' to "_New" (note the underscore; that's an accelerator)
4. Change the 'Stock Item' property under 'Properties' to "None"
5. Click on another menu item such as 'gtk-open' and then click back to 'gtk-new' to refresh the properties (Bug #533503)
6. Change the 'Stock Image' under 'Internal Image Properties' to the 'New' image
7. Specify a handler for the "activate" signal called "on_new_menu_item_activate" (this is done just like before, where we click the tree view to get a drop-down list of possible signal names)

Now repeat those steps for each of the menu items: 'gtk-open', 'gtk-save', etc. Below are screenshots of the before and after on the 'New' menu item:

Glade Menu Editor before changes

Glade Menu Editor after changing 'New' menu item

Final Touches to the Main Window

It's never very easy to understand references to things like "textview1", "textview2", etc. when you're coding. So now that you know how to set properties, change the names of the following widgets (remember, the name is in the 'General' tab of the Properties pane):

1. Rename "textview1" to "text_view"
2. Rename "statusbar1" to "statusbar"

And just to make it look a little nicer when the scroll bars are visible, let's add a border and shadow to the GtkScrolledWindow:

1. Change the "Shadow Type" to "Etched In" in the 'General' properties tab for "scrolledwindow1"
2. Change the "Border Width" to 1 in the 'Common' properties tab for "scrolledwindow1"
3. Change the "Left Margin" to 2 in the 'General' properties for "text_view"
4. Change the "Right Margin" to 2 in the 'General' properties for "text_view"

The finished glade file can be downloaded here: tutorial.glade

Getting Additional Help Using Glade

If you have additional questions about using Glade, you can always ask on the glade-users mailing list or post your question in the GTK+ Forums.

What Next?

In GTK+ and Glade3 GUI Programming Tutorial - Part 2, I will talk a little bit about choosing a programming language to implement the GUI we just created.

Part 2 - Choosing a Programming Language for GTK+ Development

1. Which is the BEST Language?
2. Language Choice Considerations
3. A Look at Python vs. C
4. What Next?

Which is the BEST Language?

Let's get this out of your system now. This is a question for which you can spend the rest of your life reading answers--and each will be different. The problem is that this is the wrong question to ask, as the answer is different for every person in each different circumstance. Each language comes with its advantages and its drawbacks. The question to ask is: which language is well suited for me on this particular project?

Language Choice Considerations

The important thing to remember when starting in with a language is to keep an open mind about other languages. You may start GTK+ programming with language X, and later switch to language Y once you know and understand how its benefits are suited to your task. The GTK+ concepts will remain the same from language to language.

1. Experience Level

How experienced you are with programming in general, as well as how much time, patience, and devotion you are willing to spend, are important factors in choosing a language. People without any programming experience have to learn fundamental programming concepts as well as the syntax and features of a new language. An experienced programmer can pick up a new language very quickly in comparison and can focus more on what the language has to offer as opposed to its learning curve. Furthermore, if you're already a PHP expert, perhaps starting GTK+ development with PHP might appeal to you. Maybe you took a course on C++ in college and want to start there. Maybe you've only worked with Visual Basic but are ready to take the plunge and learn C.

2. Activity and Community Support

GTK+ is written in C. Other languages are available through "language bindings" which "wrap" the functionality. How active the project providing said bindings is, is an important factor. You want to choose a language that is up to date with new releases of GTK+ and bug fixes (all the languages I've mentioned are pretty well up to date).
Furthermore, a strong user base, and thus a large community, will be important, as you get most of your support from the community. The more people using a particular language for GTK+, the more information there will be readily available.

3. Efficient Programmer vs. Efficient Program

There's often a trade-off between how easy the language is to use and how efficient the program is in terms of speed and how much you can do with it on a lower level. For many applications, the difference in efficiency of any two languages is negligible--and a new programmer would never even notice. For this reason, the increase in productivity is often the deciding factor. As an example, if I needed to write a program which allowed me to simply interface with some command-line utility through a GUI, I would likely choose Python or Ruby. However, if I were going to develop a sophisticated, powerful IDE, I would likely choose C or C++. In fact, you can even use several languages in one project! You could write the memory- or processor-intensive routines in C or C++ and do the rest in Python or Ruby.

4. Language Sexiness

That's right--how a language looks and feels is often a factor. You spend a lot of time staring at that code. How it flows on the screen, how it reads, and the overall development process in a particular language might appeal to you more than another. You should enjoy the programming you're doing. It's great having options, isn't it?

A Look at Python vs. C

I have chosen to fork this tutorial into 2 languages based on the above criteria. It is my humble opinion that C and Python fit the above criteria best. Both have very, very strong community support and are being used for a large portion of the projects developed for Linux, and especially GNOME. Furthermore, they sit on roughly opposite ends of the spectrum with regard to the efficiency vs. productivity debate. You could even follow this tutorial down both paths and compare the 2 languages yourself.

If you have no programming experience, or perhaps just a little experience with something like Visual Basic or PHP, I recommend starting with Python. Even if you are an experienced programmer with C, C++, or Java experience, you may want to learn Python. It's an exciting modern language, fun to program with, and incredibly quick to learn and use. For Rapid Application Development in Linux, Glade and Python make a great team. Learn more about the Python GTK+ binding PyGTK at www.pygtk.org.

If you're an experienced programmer or dedicated student, it may be worth your while to learn C or C++ for GTK+ development--especially if you're already familiar with C or C++. Learning GTK+ in C makes switching to another language such as Python a breeze. Furthermore, you'll have more options for contributing to existing projects. Personally, I do the majority of my GTK+ development in C despite the extra time it takes.

What's Next?

In GTK+ and Glade3 GUI Programming Tutorial - Part 3, I will talk about setting up your development environment and walk through a minimal implementation of the Glade file we created in part 1 using both C and Python.

Part 3 - Writing a Basic Program to Implement the Glade File

1. Setting Up Your Development Environment
2. GtkBuilder and LibGlade
3. The Minimal Application
4. Compiling and Running the Application
5. Stepping Through the Code
6. What's Next?

Setting Up Your Development Environment

To work with GTK+ and complete this step of the tutorial, you will need a text editor, a terminal window, the GTK+ development libraries, and optionally Devhelp, the developer's help reference. If you're new to the Linux world, welcome to lots of options. There isn't one particular editor or IDE that is the "standard". Most developers actually work with their favorite text editor and a terminal window. Although there are some "full featured" IDEs out there, you may want to stick with a plain text editor and terminal at this point, so as not to be overwhelmed or tripped up by features and automated tasks the IDE might perform. I do my work using Gedit, the default GNOME text editor. There is a plugins package available for Gedit which contains a terminal plugin, and I have written a Gedit Symbol Browser Plugin which allows you to quickly jump to functions in your source code. Below are Gedit screenshots from my system while working on this tutorial.

Working in C

Working in Python

The development libraries you will need depend both on your distribution and on whether you want to work in Python or C. Although the process can vary greatly depending on your platform and distribution, I can give you the information you need to get started. If you have problems installing any of the packages you need, post your problem or question at the GTK+ Forums or a forum for your distribution. When it comes to development libraries on Linux, you can often get all the packages you need using your distribution's package manager to resolve dependencies. For example, on Ubuntu you can likely just issue the command 'sudo aptitude install libgtk2.0-dev'. This command will install the GTK+ development package, its dependencies, their dependencies, and so on. It's important to install the "development packages". These are suffixed with "-dev" in Ubuntu/Debian and "-devel" in Redhat/Fedora. The development packages include header files and other includes that allow you to build applications which use a particular library. Just remember: "package" allows you to run applications using that library, where "package-dev" or "package-devel" allows you to write applications using that library. Another suffix you will see on packages is "-doc", such as "libgtk2.0-doc". This will be the documentation for that library, and once installed it will allow you to browse the documentation using Devhelp, the GNOME developer's help browser.

If you're programming with C, you should install the following packages with their dependencies: build-essential, libgtk2.0-dev, libgtk2.0-doc, libglib2.0-doc, devhelp (package names may vary depending on distribution; these are for Ubuntu).

If you're programming with Python, you should install the following packages with their dependencies: python2.5-dev, python2.5-doc, python2.5-gtk2, python-gtk2-doc, python-gobject-doc, devhelp (package names may vary depending on distribution; these are for Ubuntu).
GtkBuilder and LibGlade

If you recall, the file we created with Glade in part 1 of this tutorial series (tutorial.glade) was an XML file describing our GUI. The actual GUI will be built by our program. Therefore, the program will have to open and parse the XML file and create instances of the widgets described within. There are already libraries written to perform this task, two in fact. LibGlade was originally the library used to parse the glade file and create the widgets described within. At the time of writing, this is still the more commonplace method used in other tutorials and books. However, GTK+ 2.12 introduced an object called GtkBuilder which does essentially the same thing and is built right into GTK+. As this is intended to eventually replace LibGlade, we will be using GtkBuilder in this tutorial. However, as you learn and look at code elsewhere on the internet, keep in mind that anywhere you see LibGlade being used, GtkBuilder can be used instead.

Since (at the time of writing) GtkBuilder is relatively new, Glade does not yet support saving in the GtkBuilder format. The GtkBuilder format is still an XML file, but with a slightly different schema. That means that in order to use GtkBuilder on a glade file, we must first convert it to the GtkBuilder format. GTK+ 2.12 provides a conversion script for this process, which will already be installed on your system at this point. You can read some of the common questions I get about all this LibGlade/GtkBuilder stuff in the Libglade to GtkBuilder F.A.Q. So we now convert the glade XML file tutorial.glade to the GtkBuilder XML file tutorial.xml with the following command:
gtk-builder-convert tutorial.glade tutorial.xml
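Out of curiosity, you can peek at what the converted file looks like. The GtkBuilder schema is plain XML, so it can be inspected with Python's standard library. Below is a minimal, hand-written sketch of the schema (illustrative only; a real tutorial.xml generated from your glade file will contain many more objects, properties, and packing information):

```python
import xml.etree.ElementTree as ET

# A hand-written fragment in the GtkBuilder schema (illustrative sketch;
# a real converted file holds every widget from the glade design).
xml_text = """<interface>
  <object class="GtkWindow" id="window">
    <property name="title">GTK+ Text Editor</property>
    <signal name="destroy" handler="on_window_destroy"/>
  </object>
</interface>"""

root = ET.fromstring(xml_text)
for obj in root.iter("object"):
    # Each <object> carries the widget class and the name you assigned
    # in Glade; GtkBuilder walks this tree and instantiates each one.
    print(obj.get("class"), obj.get("id"))
```

Each `<object>` element names a widget class and the id you assigned in Glade, and each `<signal>` element records the handler name that will be connected later.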

The file 'tutorial.xml' is the file we will actually parse in our program; however, we still need tutorial.glade when we want to make changes using Glade. This is only necessary until Glade supports the GtkBuilder format in a later version (they are aiming to have this ready by Glade 3.6, which you can follow along with in Bug #490678).

The Minimal Application

We're finally ready to write some code! Let's just recap what we've done so far.

1. Using Glade, we created tutorial.glade which describes our user interface.
2. We've selected which language we will use to write our program; Python, C, or both.
3. We have a text editor and a terminal window available.
4. We have installed the development libraries we need to program GTK+ applications.
5. Using gtk-builder-convert, we converted tutorial.glade to tutorial.xml for use with GtkBuilder.

Now, before we start digging into all the details of what each line of code does, we are going to write a minimal application just to ensure everything works and to get acquainted with the development process. So, open up your text editor and type in the following...

If you're programming in C
#include <gtk/gtk.h>

void
on_window_destroy (GtkObject *object, gpointer user_data)
{
        gtk_main_quit ();
}

int
main (int argc, char *argv[])
{
        GtkBuilder *builder;
        GtkWidget  *window;

        gtk_init (&argc, &argv);

        builder = gtk_builder_new ();
        gtk_builder_add_from_file (builder, "tutorial.xml", NULL);
        window = GTK_WIDGET (gtk_builder_get_object (builder, "window"));
        gtk_builder_connect_signals (builder, NULL);
        g_object_unref (G_OBJECT (builder));

        gtk_widget_show (window);
        gtk_main ();

        return 0;
}

Save this file as 'main.c' in the same directory as 'tutorial.xml'.

If you're programming in Python
#!/usr/bin/env python

import sys
import gtk

class TutorialTextEditor:

    def on_window_destroy(self, widget, data=None):
        gtk.main_quit()

    def __init__(self):
        builder = gtk.Builder()
        builder.add_from_file("tutorial.xml")
        self.window = builder.get_object("window")
        builder.connect_signals(self)

if __name__ == "__main__":
    editor = TutorialTextEditor()
    editor.window.show()
    gtk.main()

Save this file as 'tutorial.py' in the same directory as 'tutorial.xml'.

Compiling and Running the Application

If you're programming in C

Since C is a compiled language, we need to use the gcc compiler to compile our source code into a binary application. In order for gcc to know where the GTK+ libraries are that it needs to link to and what compiler flags to use, we use a program called pkg-config. When we installed the GTK+ development package, a pkg-config metadata file named 'gtk+-2.0.pc' was installed on our system. This file tells the pkg-config program which version of the GTK+ libraries is installed and where the include files live on our system. To illustrate this, type the following command in your terminal:
pkg-config --modversion gtk+-2.0

The output should show the version of GTK+ you have installed. On my system it shows '2.12.0'. Now let's look at what compiler flags are needed to build a GTK+ application on my system:
pkg-config --cflags gtk+-2.0

The output of that command shows a bunch of -I switches which specify include paths for the compiler to use. This tells gcc where to look for include files when we use '#include' in our application. The very first one on my system is '-I/usr/include/gtk-2.0'. That means that when I use '#include <gtk/gtk.h>' in my code, gcc will be able to find '/usr/include/gtk-2.0/gtk/gtk.h'. Any time you use a '#include <library/header.h>' style include in your code that is not part of the standard C library, there should be a '-I/path/to/library' style option passed to gcc. These libraries can be installed in different locations based on distribution, operating system, or user preference. Good thing we have pkg-config to handle all of this for us!

Let's compile our application so far. Issue this command in the terminal (make sure you are in the same directory in which both 'main.c' and 'tutorial.xml' reside):
gcc -Wall -g -o tutorial main.c -export-dynamic `pkg-config --cflags --libs gtk+-2.0`

The '-Wall' option tells gcc to show all warnings. The '-g' option generates debugging information, which will be useful should you have a bug in your application and need to step through the code using a debugger such as gdb. The option '-o tutorial' tells gcc to write the output executable to a file named 'tutorial'. 'main.c' is the file gcc will compile. The '-export-dynamic' option has to do with how we connect signals to callback functions, which will be discussed when we step through the code (gcc's '-rdynamic' flag is an equivalent way to ask the linker to export all symbols).

And finally, the pkg-config command appears. Notice how it is enclosed in backticks (those are not single quotes). The backtick is usually to the left of the '1' key on the keyboard, with the tilde character (~). This tells our shell to first execute the command 'pkg-config --cflags --libs gtk+-2.0' and substitute the output of that command into the current command. So if you execute 'pkg-config --cflags --libs gtk+-2.0' on your system and then paste its output onto the end of that gcc command, it would be virtually the same thing. By using pkg-config to append the include paths and library paths to our compile command, we can use the same command to compile our program on any system, regardless of where those libraries are installed.

After your application compiles, there should be a new executable file named 'tutorial' which you execute using:
./tutorial

When you do so, you are going to see several warnings from GTK+, something along the lines of "Gtk-WARNING **: Could not find signal handler 'xxxxxx'". Don't worry about those for now. They are telling us that we specified a signal handler in our glade file which we have not yet written a function for. I'll address these when we step through the code. But you should have seen your GTK+ Text Editor window show up, and clicking the 'x' in the window titlebar should properly terminate the application. If for some reason you were not able to get the application to compile or execute, post your error messages and any other information in the GTK+ Forums.

If you're programming in Python

Since Python is an interpreted language, we don't need to compile our program. We simply invoke the Python interpreter, which we actually do with the first line in our source code. So all we need to do to run our Python program is change the permissions so that the file is executable and then run it. Change the permissions using:
chmod a+x tutorial.py

And now you can run it using:


./tutorial.py

You should have seen your GTK+ Text Editor window show up, and clicking the 'x' in the window titlebar should properly terminate the application. If for some reason you were not able to get the application to execute, post your error messages and any other information in the GTK+ Forums.

Stepping Through the Code

Note: You should be looking up each new function in the GTK+ reference documentation as I introduce them. Get to know that documentation; it will be your best friend. Install Devhelp, use it! I will provide a link to the online reference documentation each time I introduce a new function in case you were unable to install Devhelp.

Including the GTK+ Library

If you're programming in C

Hopefully you know enough about C programming to understand the first line, '#include <gtk/gtk.h>'. If you don't, you should probably go back and work through a basic C programming tutorial before continuing with this one. By including gtk.h, we are indirectly including a multitude of header files. In fact, with only a few exceptions, we are including all of the GTK+ library and its dependencies, including GLib. If you want to know exactly what is being included, just take a look at that file! Essentially, when you're looking through the reference manuals, you have access to most of the functions beginning with gtk_, g_, gdk_, pango_, and atk_.

If you're programming in Python

Hopefully you know enough about Python programming to understand the first two lines, 'import sys' and 'import gtk'. If you don't, you should probably go back and work through a basic Python programming tutorial before continuing with this one. We now have access to all of the gtk module's classes.

Initializing the GTK+ Library

Python implicitly initializes the GTK+ library for you. In C, however, we must initialize the GTK+ library before ANY call to a GTK+ function!

If you're programming in C
gtk_init (&argc, &argv);

Looking in 'main()' we see that we initialize GTK+ before anything else using the gtk_init() function.

Building the Interface with GtkBuilder

In a GTK+ application written entirely through code, that is, without the assistance of Glade or another interface designer, we would have to programmatically create each widget, set the various properties of that widget, and add it to its parent widget where applicable. Each of these steps could require several lines of code for each widget. That can be tedious. Just think about the interface we created in part 1. There are over 20 widgets defined (including all the menu items). To create all those widgets through pure code could exceed a hundred lines of code once all the properties were applied! Good thing we're using Glade and GtkBuilder. With just two lines of code, GtkBuilder will open and parse tutorial.xml, create all the widgets defined within, apply their properties, and establish the widgets' parent-child relationships. Once that is done, we can ask the builder for references to the widgets we want to further manipulate or otherwise reference.

If you're programming in C
builder = gtk_builder_new ();
gtk_builder_add_from_file (builder, "tutorial.xml", NULL);

The first variable we declared in main() was a pointer to a GtkBuilder object. We initialize that pointer using gtk_builder_new(). This function will create a new GtkBuilder object and return the pointer to that object which we are storing in the 'builder' variable. Just about all GTK+ objects will be created in this fashion.

The builder object at this point hasn't built any UI elements yet. We can use gtk_builder_add_from_file() to parse our XML file 'tutorial.xml' and add its contents to the builder object. We are passing NULL as the third parameter to gtk_builder_add_from_file() because we are not going to learn about GError just yet. So we do not have any error checking yet, and if the tutorial.xml file is not found or some other error occurs, our program will crash, but we'll address that later. You will notice that after calling gtk_builder_new() to create a new builder object, all the other gtk_builder_xxx functions take that builder object as the first parameter. This is how GTK+ implements object-oriented programming in C, and it is consistent across all GTK+ objects (compare that with how Python, a natural OOP language, implements the same thing below).

If you're programming in Python
builder = gtk.Builder()
builder.add_from_file("tutorial.xml")

When we initialize the TutorialTextEditor class with 'editor = TutorialTextEditor()', the class's initialization method, '__init__', is called. The first thing this method does is initialize a new gtk.Builder class with gtk.Builder(). The builder instance is local to the __init__ method because once we build our UI, we will no longer need the builder object. The builder object at this point hasn't built any UI elements yet. We use gtk.Builder.add_from_file() to parse our XML file 'tutorial.xml' and add its contents to the builder object.

Getting References to Widgets From GtkBuilder

Once the builder has created all of our widgets, we will want to get references to some of those widgets. We only need references to some of the widgets because some of them have already done their job and need no further manipulation. For example, the GtkVBox which holds our menu, text view, and statusbar has already done its job of laying out our design, and our code does not need to access it. So, we need to get a reference to any widget we will manipulate during the lifetime of our application and store it in a variable. At this point in the tutorial, we only need to reference the GtkWindow named "window" so that we can show it.

If you're programming in C
window = GTK_WIDGET (gtk_builder_get_object (builder, "window"));

A couple of things are happening here. First, let's look at gtk_builder_get_object(). The first parameter is the object from which we want to get an object. Again, this is how OOP is implemented in C. The second parameter is the name of the object we want to get a pointer to. This corresponds to the 'name' we specified for the widget in Glade during part 1. If you recall, we named the main application GtkWindow 'window'. So, that's what we pass to gtk_builder_get_object().

The gtk_builder_get_object() function returns a pointer to a GObject, and we are storing this pointer in 'window', which we declared at the beginning of main() as a pointer to a GtkWidget. Moreover, we know that the object we are trying to get is a GtkWindow. This is why I placed so much emphasis on the 'Object Hierarchy' of widgets and GTK+ objects. If you look at the Object Hierarchy for a GtkWindow, you will see that GtkWidget is one of its ancestors, as is GObject. Therefore, a GtkWindow is a GObject and it is a GtkWidget. This is a fundamental OOP concept and critical to working with GTK+. So, the GTK_WIDGET() wrapped around the call to gtk_builder_get_object() is a convenience macro used for casting. You can cast a GTK+ widget into any of its ancestors using one of these casting macros. All GTK+ objects have them available. So, 'GTK_WIDGET(something)' is the same as '(GtkWidget*)something'. We're casting the pointer to a GObject returned from the call to gtk_builder_get_object() into a pointer to a GtkWidget, as that's what 'window' was declared as.
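The is-a relationship that makes these casts safe can be illustrated with a toy Python class hierarchy (hypothetical stand-in classes, not the real GObject types):

```python
# Toy stand-ins mirroring the real GObject -> GtkWidget -> GtkWindow chain.
class GObject:
    pass

class GtkWidget(GObject):
    pass

class GtkWindow(GtkWidget):
    pass

w = GtkWindow()
# A GtkWindow is both a GtkWidget and a GObject, so "casting" up the
# hierarchy (what GTK_WIDGET() and G_OBJECT() do in C) is always safe.
print(isinstance(w, GtkWidget), isinstance(w, GObject))
```

Casting down the hierarchy, by contrast, is only valid when the object really is an instance of the more specific type, which is why knowing the Object Hierarchy matters.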

Finally, the reason we declared window as a pointer to a GtkWidget at the beginning of main(), rather than as a pointer to a GtkWindow, is convention. We could have declared it as a GtkWindow* and that would still have been correct. All GTK+ widgets are derived from GtkWidget, so we can always declare a variable pointing to any GTK+ widget as such. Many functions take GtkWidget* as a parameter and many functions return GtkWidget*, and thus it usually makes sense to declare your variables as such and simply cast them to the specific widget where applicable (which you'll see later).

If you're programming in Python
self.window = builder.get_object("window")

We are using gtk.Builder.get_object() to get the object named "window" from the builder. This corresponds to the 'name' we specified for the widget in Glade during part 1. If you recall, we named the main application's GtkWindow 'window'. So, that's what we pass to get_object(). We assign the returned object to self.window so that we have access to the application's window anywhere within the TutorialTextEditor class.

Connecting Callback Functions to Signals

In part 1 we specified "handlers" for various "signals" in our interface. If you recall, GTK+ emits signals for various events that occur. This is a fundamental concept of GUI programming. Our application needs to know when the user does something so that it can respond to that action. As we'll see soon, our application just sits around in a loop waiting for something to happen. We will be using GtkBuilder to connect the signal handlers we defined using Glade with callback functions in our code. GtkBuilder will look at our code's symbols and connect the appropriate handlers for us.

In part 1 we specified a handler named 'on_window_destroy' for the "destroy" signal of the GtkWindow named 'window'. Therefore, GtkBuilder will expect to find a function or method named 'on_window_destroy'. The "destroy" signal is emitted when a GtkObject is destroyed. As we'll see in the next bit of code, our application is going to sit in an infinite loop waiting for events to happen. When the user closes the window (such as by clicking the 'x' in the titlebar), our application will need to break out of the loop and terminate. By connecting a callback to the "destroy" signal of the GtkWindow, we will know when to terminate. Therefore, this is a signal you will use in almost every GTK+ application you write.

Note: The method being used to connect callbacks to signals in this example is equivalent to using the glade_xml_signal_autoconnect() function when using LibGlade instead of GtkBuilder.

If you're programming in C
gtk_builder_connect_signals (builder, NULL);

When we call gtk_builder_connect_signals(), we pass the builder object as the first parameter, as always. The second parameter allows us to pass user data (anything we want) to our callback function. This will be important later, but for now we'll just pass NULL. This function uses GModule, a part of GLib used to dynamically load modules, to look at our application's symbol table (function names, variable names, etc.) to find the function name that matches the handler name we specified in Glade. In Glade we specified a handler for the GtkWindow's "destroy" signal called 'on_window_destroy'. So, gtk_builder_connect_signals() is looking for a function named 'on_window_destroy' that matches the signature of the callback function for the "destroy" signal. If you recall from part 1, the "destroy" signal belongs to GtkObject. Therefore, we find the prototype for the callback function in the manual for GtkObject under the 'Signals' section: GtkObject "destroy" signal. This tells us what the prototype for our callback function should look like. Based on the prototype specified in the manual, I wrote the following function:
void
on_window_destroy (GtkObject *object, gpointer user_data)
{
        gtk_main_quit ();
}

So now gtk_builder_connect_signals() will find this function, see that it both matches the name of the handler we specified in Glade and has a compatible signature (takes the same arguments) as that specified for the "destroy" signal, and make the connection. Now our function on_window_destroy() will be called when the GtkWindow 'window' is destroyed. In on_window_destroy() we just call gtk_main_quit() to properly terminate our application. This function breaks out of the main loop, which I will talk about more when we get there in just a bit. Right after the call to gtk_builder_connect_signals() there was a call to g_object_unref().
g_object_unref (G_OBJECT (builder));

This is because we are no longer going to use the GtkBuilder object. We used it to construct our widgets and then obtained pointers to the widgets we needed to reference, so now we can free all the memory it used for the parsed XML. You'll also notice that we are using one of those casting macros to cast (GtkBuilder*) to (GObject*). We must do this because g_object_unref() takes a GObject* as a parameter. Since a GtkBuilder is derived from a GObject (as are all widgets), this is perfectly valid.

If you're programming in Python
builder.connect_signals(self)

In Glade we specified a handler for the GtkWindow's "destroy" signal called 'on_window_destroy'. So, gtk.Builder.connect_signals() is looking for a method named 'on_window_destroy' that matches the signature of the callback method for the "destroy" signal. If you recall from part 1, the "destroy" signal belonged to GtkObject. Therefore, we find the prototype for the callback function in the manual for gtk.Object under the 'signals' section: gtk.Object "destroy" Signal. This tells us what the prototype for our callback method should look like. Based on the prototype specified in the manual, I wrote the following method:
def on_window_destroy(self, widget, data=None):
    gtk.main_quit()
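The name-based lookup that connect_signals() performs can be sketched in a few lines of plain Python (an illustrative sketch of the idea, not PyGTK's actual implementation; the real callback would call gtk.main_quit() rather than return a string):

```python
class Handlers:
    def on_window_destroy(self, widget, data=None):
        return "quit"   # stands in for gtk.main_quit() in this sketch

# connect_signals(self) does roughly this for every handler name
# found in the builder XML file:
handler_name = "on_window_destroy"   # the name we typed into Glade
obj = Handlers()
callback = getattr(obj, handler_name, None)
if callback is not None:
    print(callback(None))  # the loop would invoke this when the signal fires
```

If no attribute with the handler's name exists, the connection cannot be made, which is exactly why the warnings we saw earlier complained about missing signal handlers.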

So now builder.connect_signals() will find this method, see that it both matches the name of the handler we specified in Glade and has a compatible signature (takes the same arguments) as was specified for the "destroy" signal, and make the connection. Now our method on_window_destroy() will be called when the GtkWindow 'window' is destroyed. In on_window_destroy() we just call gtk.main_quit() to properly terminate our application. This function breaks out of the main loop, which I will talk about more when we get there in just a bit.

Showing the Application Window

Before we enter the GTK+ main loop (discussed next), we want to show our GtkWindow widget, as our app doesn't do much good if it's not even visible.

If you're programming in C
gtk_widget_show (window);

Calling gtk_widget_show() sets the widget's GTK_VISIBLE flag, telling GTK+ to show the widget (which will happen within the GTK+ main loop, discussed next).

If you're programming in Python
editor.window.show()

Calling gtk.Widget.show() tells GTK+ to show the widget (which will happen within the GTK+ main loop, discussed next).

Entering the GTK+ Main Loop

The main loop in GTK+ is an infinite loop which performs all of the "magic". This is how GUI programming works. Once we build our GUI and set up our program, we enter the GTK+ main loop and just wait for an event to occur which we care about (such as closing the window). A lot is happening inside this main loop; however, as a beginner you can simply think of it as an infinite loop in which GTK+ checks the state of things, updates the UI, and emits signals for events. After entering the main loop, our application isn't doing anything (but GTK+ is). When the user resizes the window, minimizes it, clicks on it, presses keys, etc., GTK+ is checking each of these events and emitting signals for them. However, our application is currently connected to only one signal, the "destroy" signal of 'window'. When the window is closed and the "destroy" signal is emitted, the GTK+ main loop turns control over to the handler function we have connected to that signal, which breaks us out of the GTK+ main loop, thus allowing our application to terminate.

If you're programming in C
gtk_main ();

If you're programming in Python


gtk.main()
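Conceptually, the loop that gtk_main() / gtk.main() enters looks something like this toy dispatcher (a greatly simplified sketch, not GTK+'s actual implementation):

```python
# Toy main loop: wait for events, dispatch to connected handlers,
# and stop when a handler calls the quit function.
handlers = {}
running = True

def main_quit():            # gtk.main_quit() analogue
    global running
    running = False

# "Connect" a handler, as connect_signals() did for us:
handlers["destroy"] = main_quit

# Pretend the user generated these events:
events = ["expose", "key-press", "destroy"]
for event in events:        # the "infinite" loop, bounded here for the demo
    if event in handlers:
        handlers[event]()   # hand control to the connected callback
    if not running:
        break               # main_quit() was called; leave the loop

print(running)
```

Events with no connected handler are simply processed internally by the toolkit; only the "destroy" handler here changes our program's state and ends the loop.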

In Summary

1. Application uses GtkBuilder to create the GUI from the XML file.
2. Application gets a reference to the main window widget.
3. Application connects the 'on_window_destroy' handler to the "destroy" signal.
4. Application flags the window to be shown.
5. Application enters the GTK+ main loop (window is shown).
6. User clicks the 'x' in the titlebar, as a result of which the GTK+ main loop emits the "destroy" signal.
7. Handler 'on_window_destroy' breaks out of the GTK+ main loop.
8. Application terminates normally.

What's Next?

In the next part of the tutorial we will begin to move through the code a bit faster, implementing the remaining functionality of our GTK+ text editor entirely through signal handlers.

Part 4 - Fully Implementing the GTK+ Text Editor


In this part of the tutorial we will be finishing the text editor by defining functions/methods for the remaining signal handlers we defined in our glade file in part 1. This will include opening and saving files, showing messages and dialogs to the user, cut/copy/paste operations, and an 'About' dialog.

Structuring The Code And Declaring Functions
Defining Some Utility Functions/Methods
Opening and Saving Files
Cut, Copy, and Paste Operations
About Dialog

Structuring The Code

The application we wrote in part 3 was very minimal, just to ensure we had done everything correctly so far. Now we want to go in and tighten up that code a bit, allowing for some error handling and coding convention.

If you're programming in C

There are quite a few things that we want to do when implementing this application in C that will make our life easier. For simplicity, I'm keeping all of the code in a single C source file; however, in real-world applications we would likely group functionality into individual source and header files.
#define BUILDER_XML_FILE "tutorial.xml"

The first thing I'm going to do, immediately following the line with '#include <gtk/gtk.h>', is put the name of the interface XML file into a #define so that it's more convenient to change. If I were to later deploy this application, my install script could change that define to the location the file was installed to.
typedef struct
{
        GtkWidget *window;
        GtkWidget *statusbar;
        GtkWidget *text_view;
        guint statusbar_context_id;
        gchar *filename;
} TutorialTextEditor;

Next I'm going to define a structure. We can pass a single variable to our signal handler functions as user_data. By using a struct for this, we can access any of the widgets which are commonly needed from any of the callback functions. We'll see how this works later. So this 'TutorialTextEditor' struct contains variables that we will need throughout the lifetime of the application while handling signals.
/* main window callback prototypes */
void on_window_destroy (GtkObject *object, TutorialTextEditor *editor);
gboolean on_window_delete_event (GtkWidget *widget, GdkEvent *event,
                                 TutorialTextEditor *editor);

/* file menu callback prototypes */
void on_new_menu_item_activate (GtkMenuItem *menuitem, TutorialTextEditor *editor);
void on_open_menu_item_activate (GtkMenuItem *menuitem, TutorialTextEditor *editor);
void on_save_menu_item_activate (GtkMenuItem *menuitem, TutorialTextEditor *editor);
void on_save_as_menu_item_activate (GtkMenuItem *menuitem, TutorialTextEditor *editor);
void on_quit_menu_item_activate (GtkMenuItem *menuitem, TutorialTextEditor *editor);

/* edit menu callback prototypes */
void on_cut_menu_item_activate (GtkMenuItem *menuitem, TutorialTextEditor *editor);
void on_copy_menu_item_activate (GtkMenuItem *menuitem, TutorialTextEditor *editor);
void on_paste_menu_item_activate (GtkMenuItem *menuitem, TutorialTextEditor *editor);
void on_delete_menu_item_activate (GtkMenuItem *menuitem, TutorialTextEditor *editor);

/* help menu callback prototypes */
void on_about_menu_item_activate (GtkMenuItem *menuitem, TutorialTextEditor *editor);

/* misc. function prototypes */
void error_message (const gchar *message);
gboolean init_app (TutorialTextEditor *editor);
gboolean check_for_save (TutorialTextEditor *editor);
gchar *get_open_filename (TutorialTextEditor *editor);
gchar *get_save_filename (TutorialTextEditor *editor);
void load_file (TutorialTextEditor *editor, gchar *filename);
void write_file (TutorialTextEditor *editor, gchar *filename);
void reset_default_status (TutorialTextEditor *editor);

Next I declare a whole bunch of function prototypes. All of the functions beginning with 'on_' are signal handler functions and correspond to the handlers we specified in the Glade file in part 1. Just like we did with the 'on_window_destroy' handler in part 3, the prototypes are based on the signatures as defined in the GTK+ reference documentation. I didn't just make up the arguments these functions take; I used the prototypes almost exactly as they appear in the manual, with two differences. First, I changed the name from 'user_function' to the name of the handler I specified in Glade. Second, I changed the 'gpointer user_data' argument specified in the manual to 'TutorialTextEditor *editor'. The reason for this is that 'gpointer' is an untyped pointer (a void pointer); therefore, I can make it a pointer to anything. For this tutorial, we are going to pass our TutorialTextEditor structure to every one of these signal handlers as the user data, so I specify that right in the declaration of the function.

The remaining function prototypes I'm declaring are miscellaneous functions that I use somewhere in the code. These types of functions would typically be added as the need comes up and as you refactor your code (especially when you're a beginner). So we're going to go ahead and pretend that we had enough foresight to know we were going to need them now.
int
main (int argc, char *argv[])
{
        TutorialTextEditor *editor;

        if (gtk_check_version (2, 12, 0) != NULL)
        {
                g_warning ("You need to install GTK+ 2.12 or newer!");
                return 1;
        }

        editor = g_slice_new (TutorialTextEditor);

        gtk_init (&argc, &argv);

        if (init_app (editor) == FALSE)
        {
                g_slice_free (TutorialTextEditor, editor);
                return 1; /* error loading UI */
        }

        gtk_widget_show (editor->window);
        gtk_main ();

        g_slice_free (TutorialTextEditor, editor);
        return 0;
}

And now we change our main() function a bit. We declare a variable as a pointer to the TutorialTextEditor struct we defined earlier. Next I do a GTK+ version check. I'm doing this because GtkBuilder requires GTK+ 2.12 or newer; however, in a real-world application you would do your GTK+ version check in the ./configure script, which is beyond the scope of this tutorial. Next we initialize 'editor' using the g_slice_new() function. This function is part of GLib's memory management facilities. What it does is allocate the memory our struct needs. This is similar to casting the return of a call to malloc(), but cleaner, and it uses GLib's optimized memory management.

Get on the D-BUS


Software programs, the kernel and even your phone can keep in touch and make the whole desktop work the way you want. Here's how D-BUS works, and how applications are using it.

D-BUS is an interprocess communication (IPC) system, providing a simple yet powerful mechanism allowing applications to talk to one another, communicate information and request services. D-BUS was designed from scratch to fulfill the needs of a modern Linux system. D-BUS' initial goal is to be a replacement for CORBA and DCOP, the remote object systems used in GNOME and KDE, respectively. Ideally, D-BUS can become a unified and agnostic IPC mechanism used by both desktops, satisfying their needs and ushering in new features.

D-BUS, as a full-featured IPC and object system, has several intended uses. First, D-BUS can perform basic application IPC, allowing one process to shuttle data to another: think UNIX domain sockets on steroids. Second, D-BUS can facilitate sending events, or signals, through the system, allowing different components in the system to communicate and ultimately to integrate better. For example, a Bluetooth daemon can send an incoming call signal that your music player can intercept, muting the volume until the call ends. Finally, D-BUS implements a remote object system, letting one application request services and invoke methods from a different object: think CORBA without the complications.

Why D-BUS Is Unique


D-BUS is unique from other IPC mechanisms in several ways. First, the basic unit of IPC in D-BUS is a message, not a byte stream. In this manner, D-BUS breaks up IPC into discrete messages, complete with headers (metadata) and a payload (the data). The message format is binary, typed, fully aligned and simple. It is an inherent part of the wire protocol. This approach contrasts with other IPC mechanisms where the lingua franca is a random stream of bytes, not a discrete message.

Second, D-BUS is bus-based. The simplest form of communication is process to process. D-BUS, however, provides a daemon, known as the message bus daemon, that routes messages between processes on a specific bus. In this fashion, a bus topology is formed, allowing processes to speak to one or more applications at the same time. Applications can send to or listen for various events on the bus.

A final unique feature is the creation of not one but two of these buses, the system bus and the session bus. The system bus is global, system-wide and runs at the system level. All users of the system can communicate over this bus with the proper permissions, allowing the concept of system-wide events. The session bus, however, is created during user login and runs at the user, or session, level. This bus is used solely by a particular user, in a particular login session, as an IPC and remote object system for the user's applications.
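The difference between a raw byte stream and discrete messages can be sketched with a minimal framing scheme (illustrative only; this is not the actual D-BUS wire format):

```python
import struct

# Frame a header and payload into one discrete message: two 4-byte
# big-endian lengths, then the header bytes, then the payload bytes.
def pack_message(header: bytes, payload: bytes) -> bytes:
    return struct.pack("!II", len(header), len(payload)) + header + payload

def unpack_message(data: bytes):
    hlen, plen = struct.unpack("!II", data[:8])
    header = data[8:8 + hlen]
    payload = data[8 + hlen:8 + hlen + plen]
    return header, payload

msg = pack_message(b"method_call", b"hello")
print(unpack_message(msg))
```

With framing like this, a receiver always knows where one message ends and the next begins; with a bare byte stream, every application would have to invent that bookkeeping itself, which is exactly the burden D-BUS's message-based design removes.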

D-BUS Concepts
Messages are sent to objects. Objects are addressed using path names, such as /org/cups/printers/queue. Processes on the message bus are associated with objects and implemented interfaces on that object.

D-BUS supports multiple message types, such as signals, method calls, method returns and error messages. Signals are notifications that a specific event has occurred. They are simple, asynchronous, one-way heads-up messages. Method call messages allow an application to request the invocation of a method on a remote object. Method return messages provide the return value resulting from a method invocation. Error messages provide exceptions in response to a method invocation.

D-BUS is fully typed and type-safe. Both a message's header and payload are fully typed. Valid types include byte, Boolean, 32-bit integer, 32-bit unsigned integer, 64-bit integer, 64-bit unsigned integer, double-precision floating point and string. A special array type allows for the grouping of types. A DICT type allows for dictionary-style key/value pairs.

D-BUS is secure. It implements a simple protocol based on SASL profiles for authenticating one-to-one connections. On a bus-wide level, the reading of and the writing to messages from a specific interface are controlled by a security system. An administrator can control access to any interface on the bus. The D-BUS daemon was written from the ground up with security in mind.

Why D-BUS?
These concepts make nice talk, but what is the benefit? First, the system-wide message bus is a new concept. A single bus shared by the entire system allows for propagation of events, from the kernel (see The Kernel Event Layer sidebar) to the uppermost applications on the system. Linux, with its well-defined interfaces and clear separation of layers, is not very integrated. D-BUS' system message bus improves integration without compromising fine engineering practices. Now, events such as disk full and printer queue empty or even battery power low can bubble up the system stack, available for whatever application cares, allowing the system to respond and react. The events are sent asynchronously, and without polling.

The Kernel Event Layer


The Kernel Event Layer is a kernel-to-user communication mechanism that uses a high-speed netlink socket to communicate asynchronously with user space. This mechanism can be tied into D-BUS, allowing the kernel to send D-BUS signals! The Kernel Event Layer is tied to sysfs, the tree of kobjects that lives at /sys on modern Linux systems. Each directory in sysfs is tied to a kobject, which is a structure in the kernel used to represent objects; sysfs is an object hierarchy exported as a filesystem. Each Kernel Event Layer event is modeled as though it originated from a sysfs path. Thus, the events appear as if they emit from kobjects. The sysfs paths are easily translatable to D-BUS paths, making the Kernel Event Layer and D-BUS a natural fit. The Kernel Event Layer was merged into the 2.6.10-rc1 kernel.

Second, the session bus provides a mechanism for IPC and remote method invocation, possibly providing a unified system between GNOME and KDE. D-BUS aims to be a better CORBA than CORBA and a better DCOP than DCOP, satisfying the needs of both projects while providing additional features. And, D-BUS does all this while remaining simple and efficient.

Adding D-BUS Support to Your Application


The core D-BUS API, written in C, is rather low-level and large. On top of this API, bindings integrate with programming languages and environments, including Glib, Python, Qt and Mono. On top of providing language wrappers, the bindings provide environment-specific features. For example, the Glib bindings treat D-BUS connections as GObjects and allow messaging to integrate into the Glib mainloop. The preferred use of D-BUS is definitely through the language- and environment-specific bindings, both for ease of use and improved functionality. Let's look at some basic uses of D-BUS in your application. We first look at the C API and then poke at some D-BUS code using the Glib interface.

The D-BUS C API


Using D-BUS starts with including its header:
#include <dbus/dbus.h>

The first thing you probably want to do is connect to an existing bus. Recall from our initial D-BUS discussion that D-BUS provides two buses, the session and the system bus. Let's connect to the system bus:
DBusError err;
DBusConnection *conn;

dbus_error_init (&err);

conn = dbus_bus_get (DBUS_BUS_SYSTEM, &err);
if (!conn) {
        fprintf (stderr, "%s: %s\n",
                 err.name, err.message);
        return 1;
}

Connecting to the system bus is a nice first step, but we want to be able to send messages from a wellknown address. Let's acquire a service:
dbus_bus_acquire_service (conn, "org.pirate.parrot",
                          0, &err);
if (dbus_error_is_set (&err)) {
        fprintf (stderr, "%s: %s\n",
                 err.name, err.message);
        dbus_connection_disconnect (conn);
        return 1;
}

Now that we are on the system bus and have acquired the org.pirate.parrot service, we can send messages originating from that address. Let's send a signal:
DBusMessage *msg;
DBusMessageIter iter;

/* create a new message of type signal */
msg = dbus_message_new_signal ("/org/pirate/parrot/attr",
                               "org.pirate.parrot.attr",
                               "Feathers");

/* build the signal's payload up */
dbus_message_iter_init (msg, &iter);
dbus_message_iter_append_string (&iter, "Shiny");
dbus_message_iter_append_string (&iter, "Well Groomed");

/* send the message */
if (!dbus_connection_send (conn, msg, NULL))
        fprintf (stderr, "error sending message\n");

/* drop the reference count on the message */
dbus_message_unref (msg);

/* flush the connection buffer */
dbus_connection_flush (conn);

This sends the Feathers signal from org.pirate.parrot.attr with a payload consisting of two fields, each strings: Shiny and Well Groomed. Anyone listening on the system message bus with sufficient permissions can subscribe to this service and listen for the signal. Disconnecting from the system message bus is a single function:
if (conn)
        dbus_connection_disconnect (conn);

The Glib Bindings


Glib (pronounced gee-lib) is the base library of GNOME. It is on top of Glib that Gtk+ (GNOME's GUI API) and the rest of GNOME is built. Glib provides several convenience functions, portability wrappers, a family of string functions and a complete object and type systemall in C. The Glib library provides an object system and a mainloop, making object-based, event-driven programming possible, even in C. The D-BUS Glib bindings take advantage of these features. First, we want to include the right header files:
#include <dbus/dbus.h>
#include <dbus/dbus-glib.h>

Connecting to a specific message bus with the Glib bindings is easy:


DBusGConnection *conn;
GError *err = NULL;

conn = dbus_g_bus_get (DBUS_BUS_SESSION, &err);
if (!conn) {
        g_printerr ("Error: %s\n", err->message);
        g_error_free (err);
}

In this example, we connected to the per-user session bus. This call associates the connection with the Glib mainloop, allowing multiplexed I/O with the D-BUS messages. The Glib bindings use the concept of proxy objects to represent instantiations of D-BUS connections associated with specific services. The proxy object is created with a single call:
DBusGProxy *proxy;

proxy = dbus_g_proxy_new_for_service (conn,
                                      "org.fruit.apple",
                                      "/org/fruit/apple",
                                      "org.fruit.apple");

This time, instead of sending a signal, let's execute a remote method call. This is done using two functions. The first function invokes the remote method; the second retrieves the return value. First, let's invoke the Peel remote method:
DBusGPendingCall *call;

call = dbus_g_proxy_begin_call (proxy, "Peel",
                                DBUS_TYPE_INVALID);

Now let's check for errors and retrieve the results of the method call:
GError *err = NULL;
int ret;

if (!dbus_g_proxy_end_call (proxy, call, &err,
                            DBUS_TYPE_INT32, &ret,
                            DBUS_TYPE_INVALID)) {
        g_printerr ("Error: %s\n", err->message);
        g_error_free (err);
}

The Peel method here returns a single value, an integer. If dbus_g_proxy_end_call() returned nonzero, it succeeded, and the variable ret holds the value returned by the remote method. The data types that a specific method accepts and returns are determined by the remote method. For example, we could not have passed DBUS_TYPE_STRING instead of DBUS_TYPE_INT32.

The main benefit of the Glib bindings is mainloop integration, allowing developers to manage multiple D-BUS messages intertwined with other I/O and UI events. The header file <dbus/dbus-glib.h> declares multiple functions for connecting D-BUS to the Glib mainloop.

Conclusion

D-BUS is a powerful yet simple IPC system that will improve, with luck, the integration and functionality of Linux systems. Users are encouraged to investigate new applications that use D-BUS. With this article in hand, D-BUS shouldn't be a scary new dependency, but a shining new feature. The on-line Resources list some interesting applications that use D-BUS. Developers are encouraged to investigate implementing D-BUS support in their applications. There are also some Web sites that provide more information on using D-BUS. Of course, the best reference is existing code, and thankfully there is plenty of that.

Resources for this article: www.linuxjournal.com/article/7926.

Robert Love is a kernel hacker in Novell's Ximian Group and is the author of Linux Kernel Development. Robert is heavily involved in both the Linux kernel and GNOME communities. He holds degrees in Computer Science and Mathematics from the University of Florida, and he enjoys photography.

Draw Great Graphs with Open Flash Charts, Part 1

Ever wondered how those beautiful graphs are created, or those pie charts, line charts and bar graphs? What if you had the power to build those without having to scratch your head too much, the open source way? Well, all you need is Open Flash Charts! Open Flash Charts (OFC) is an open source charting tool that lets you draw amazing Flash-based graphs on your website. First, let us discuss how to set up the Flash charting component and what the prerequisites are. Then we will move on to drawing a sample bar graph.

The basics
To get up and running with OFC, first download the code library files from the official site. Download the Version 2 Lug Wyrm Charmer zip file. This contains the code library for PHP, Ruby, Perl, Python, Java and even .NET, as you can read on the site. We will concentrate on PHP in this article. Unzip the file to the root folder of your site. Once that's done, we are ready to draw the first graph.

Drawing graphs
Let us try to draw a bar chart that fetches data from a MySQL database. Since the data will be passed to the graph from the database once the graph is completed, it will automatically change, depending on the data in the database, thus providing complete abstraction for the users. To draw the graph, you need to first decide on the values to be plotted on the X and Y axes. For our sample graph, let us consider a students marks table as the data to be plotted on the graph (see table below). Let us plot each students marks against the total marks, so there will be four bars in the chart, each corresponding to a student and the marks obtained by him. Name Ramesh Ram Rajesh Raghav Class 10 10 10 10 Marks 75 45 85 99

To start plotting the graph, first define a data file, which can fetch data from the database. The data file has an SQL query to actually fetch the data, which is in the form of an array. Now, let us look at a sample data file (BarGraphDataFile.php) to understand how it is constructed.

<?php
$die = false;
$link = @mysql_connect('localhost', 'root', '') or ($die = true);
if($die) {
    echo '<h3>Database connection error!!!</h3>';
    echo 'A connection to the Database could not be established.<br />';
    echo 'Please check your username, password, database name and host.<br />';
    echo 'Also make sure <i>mysql.class.php</i> is rightly configured!<br /><br />';
}
mysql_select_db('test');

include_once 'php-ofc-library/open-flash-chart.php';

$query = mysql_query('SELECT DISTINCT Marks, Name FROM test_openflash');
while($queryRow = mysql_fetch_array($query, MYSQL_ASSOC)) {
    $dataForGraph[] = intval($queryRow['Marks']);
    $XAxisLabel[] = $queryRow['Name'];
}

$title = new title( 'The marks obtained by students as of : '.date("D M d Y").' are' );
$title->set_style( '{color: #567300; font-size: 14px}' );

$chart = new open_flash_chart();
$chart->set_title( $title );

$x_axis_labels = new x_axis_labels();
$x_axis_labels->set_labels($XAxisLabel);

$y_axis = new y_axis();
$x_axis = new x_axis();
$y_axis->set_range( 0, 100, 10 );
$x_axis->set_labels($x_axis_labels);

$chart->set_x_axis( $x_axis );
$chart->add_y_axis( $y_axis );

$bar = new bar_glass();
$bar->colour('#BF3B69');
$bar->key('Marks obtained', 12);
$bar->set_values($dataForGraph);
$chart->add_element($bar);

mysql_close($link);

echo $chart->toPrettyString();
?>

Now, look at the code to understand what is happening. First, you opened a database connection to fetch the data. Upon successfully connecting, you included the OFC PHP file to get the required function definitions. You then proceeded to actual data fetching via an SQL query that gets the distinct names and corresponding marks obtained by students.

You stored the data in two arrays, one holding the marks (dataForGraph), the other the student names (XAxisLabel). The arrays are named based on their usage, since the marks will be the actual data to draw the graph and the names will be used to label the X axis. You then created a title object, set its colour and font size, then created a chart object and assigned the title object as the chart title. Next, the X axis label object was created and you set the values by passing the XAxisLabel array. Then the X axis and Y axis objects were created. Finally, a bar object was created and the data in the dataForGraph array was passed to it. The last line formats the data so that it can be used by the OFC components.

Next, we have to create the file that displays the graph. This is quite simple with the following code:

<html>
<head>
<script type="text/javascript" src="js/swfobject.js"></script>
<script type="text/javascript">
swfobject.embedSWF(
    "open-flash-chart.swf", "bar_chart",
    "600", "400", "9.0.0", "expressInstall.swf",
    {"data-file": "BarGraphDataFile.php"}
);
</script>
</head>
<body>
<div id="bar_chart"></div>
</body>
</html>

Save this file as Reports.html. In this file, we have included the JS and SWF files required for plotting the graph. The data file is also passed here so that it can be invoked to get the data required to plot the graph. When viewed in the browser, the chart is as shown in Figure 1.

Figure 1: Final chart

Draw Great Graphs with Open Flash Charts, Part 2


In the first part of this series, I gave a brief introduction to Open Flash Charts (OFC) and discussed how to draw bar charts with it. In this article, I will explain how to draw a pie chart. Since we have already discussed the prerequisites of OFC and how to set it up in the previous article, let us go straight to drawing the chart.

First, we need to define the values to be plotted on the pie chart. For our sample chart, let's look at the example of a classroom. Our pie chart will show the distribution of the number of students getting different grades in a class of 28. Our data is shown in the following table:

Grade                     No. of students
A (more than 80 marks)    5
B (60-80 marks)           10
C (40-60 marks)           10
D (0-40 marks)            3
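As in the previous part, the data file below expects this data in MySQL. A schema and sample rows matching its query might look like the following; the table and column names come from the data file, while the column types are illustrative.

```sql
-- Illustrative schema for the grade-distribution example
CREATE TABLE test_piechart (
    Grade  VARCHAR(2),
    Number INT
);

INSERT INTO test_piechart (Grade, Number) VALUES
    ('A', 5),
    ('B', 10),
    ('C', 10),
    ('D', 3);
```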

Now let us draw a pie chart showing the marks distribution. Let's first store the above data in a database (I assume MySQL). Once that's done, let's create the data file, with the SQL query that fetches the data to be plotted. The data is then properly formatted and passed on to the interfaces provided by OFC, which then renders the graph to the Web page. There are several functions available, depending on the type of graph to be plotted; so select the function according to your need, and pass it the data. The data file for our example pie chart is as follows:

<?php
$die = false;
$link = @mysql_connect('localhost', 'test_user', 'test_pwd') or ($die = true);
if($die) {
    echo '<h3>Database connection error!!!</h3>';
    echo 'A connection to the Database could not be established.<br />';
    echo 'Please check your username, password, database name and host.<br />';
    echo 'Also make sure <i>mysql.class.php</i> is rightly configured!<br /><br />';
}
mysql_select_db('test_database');

include_once 'php-ofc-library/open-flash-chart.php';

$query = mysql_query('SELECT DISTINCT Grade, Number FROM test_piechart');
while($queryRow = mysql_fetch_array($query, MYSQL_ASSOC)) {
    $label[] = $queryRow['Grade'];
    $dataForGraph[] = intval($queryRow['Number']);
}

$title = new title( 'The grades distribution : '.date("D M d Y").' are' );
$title->set_style( '{color: #567300; font-size: 14px}' );

$chart = new open_flash_chart();
$chart->set_title( $title );

$pie = new pie();
$pie->set_alpha(0.6);
$pie->set_start_angle( 35 );
$pie->add_animation( new pie_fade() );
$pie->set_tooltip( '#val# of #total#<br>#percent# of total strength' );
$pie->set_colours( array('#1C9E05', '#FF368D', '#1A3453', '#1A3789') );
$pie->set_values( array(
    new pie_value($dataForGraph[0], "Grade " . $label[0]),
    new pie_value($dataForGraph[1], "Grade " . $label[1]),
    new pie_value($dataForGraph[2], "Grade " . $label[2]),
    new pie_value($dataForGraph[3], "Grade " . $label[3])
) );
$chart->add_element( $pie );

echo $chart->toPrettyString();
?>

Here, let's first connect to MySQL, then select the database (customise it to your settings). Let's include the OFC library file, to make the APIs available. The database query returns the data, which is then saved to the arrays. Now let's start using OFC. First, create the graph title object, and use set_style to set its colour and font. Then create a new chart object and pass it the title using set_title. Next, create the pie object, using the set_alpha, set_start_angle and add_animation methods to give additional effects to the pie chart. The set_tooltip method adds tool-tips to the pie, displaying information about a slice on mouseover. The argument passed to this method displays the value of a slice and the total sum of values; the second line shows the percentage. In the set_colours method, the colours are passed as an array for the different pie slices. The last method used for the pie object is set_values, where the value of each pie slice and its label are passed as a pair. Finally, the pie object is added to the chart using add_element. The last line is used to format the data properly, so that the HTML file can use the data.
Save this file as data_file.php in the Web server root folder. Next, given below is the HTML file to be used:

<html>
<head>
<title></title>
<script type="text/javascript" src="js/swfobject.js"></script>
<script type="text/javascript">
swfobject.embedSWF(
    "open-flash-chart.swf", "pie_chart",
    "700", "400", "9.0.0", "expressInstall.swf",
    {"data-file": "data_file.php"}
);
</script>
</head>
<body>
<div id="pie_chart"></div>
</body>
</html>

Here, I have invoked the data file I just saved as data_file.php. The size of the graph to be plotted can be passed as arguments; here, it is 700 by 400 pixels. The rest of the HTML file is just the addition of <div id="pie_chart">, created in the header, where I gave the name pie_chart while embedding the Flash. Ensure the div ID is the same as the name given while embedding the Flash object. Save this file as Pie_display.html in the Web server root folder. When accessed through the browser, you can see the graph in action, as in Figure 1.

Figure 1: The final graph

Hope you liked the article, and that it was useful. Any queries or suggestions are most welcome. In the next article in this series, we will look at how to create other innovative charts.

Draw Great Graphs with Open Flash Charts, Part 3

In the earlier parts of this series, we looked at using Open Flash Charts (OFC) to create great-looking bar charts and pie charts. In this part, let us look at how to create line charts and draw multiple lines in the same graph, which shows even more data in a very effective manner. As always, let us first define our data for the graph. For our line chart, let us take a classroom scenario. For each subject, we have the total number of students that appeared, and the number of students that got a first class, as shown in the table below. Let us plot this data as multiple lines: one showing the total number of students, and the second the number of students getting a first class.

Subject       Total no. of students   Total no. with first class
Physics       100                     30
Mathematics   70                      40
English       50                      40
Chemistry     60                      50
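Once more, the data file below expects this data in MySQL. A matching schema and sample rows might look like the following; the table and column names come from the data file's query, and the column types are illustrative.

```sql
-- Illustrative schema for the line-chart example
CREATE TABLE test_linechart (
    Subject           VARCHAR(50),
    Total_Students    INT,
    Total_First_Class INT
);

INSERT INTO test_linechart (Subject, Total_Students, Total_First_Class) VALUES
    ('Physics',     100, 30),
    ('Mathematics',  70, 40),
    ('English',      50, 40),
    ('Chemistry',    60, 50);
```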

Get going
I assume that the data is already stored in a MySQL database, and will proceed straight to the data file (with the SQL query to fetch the data to be plotted, you can proceed to format it and pass it on to OFC to render the graph). The data file to draw the line chart for our example is as follows:

<?php
$die = false;
$link = @mysql_connect('localhost', 'test_user', 'test_pwd') or ($die = true);
if($die) {
    echo '<h3>Database connection error!!!</h3>';
    echo 'A connection to the Database could not be established.<br />';
    echo 'Please check your username, password, database name and host.<br />';
    echo 'Also make sure <i>mysql.class.php</i> is rightly configured!<br /><br />';
}
mysql_select_db('test_database');

include_once 'php-ofc-library/open-flash-chart.php';

$query = mysql_query('SELECT DISTINCT Subject, Total_Students, Total_First_Class AS Number FROM test_linechart');
while($queryRow = mysql_fetch_array($query, MYSQL_ASSOC)) {
    $labels[] = $queryRow['Subject'];
    $data_1[] = intval($queryRow['Total_Students']);
    $data_2[] = intval($queryRow['Number']);
}

$default_dot = new dot();
$default_dot->size(5)->colour('#DFC329');

$line_dot = new line();
$line_dot->set_default_dot_style($default_dot);
$line_dot->set_width( 4 );
$line_dot->set_colour( '#DFC329' );
$line_dot->set_values( $data_1 );
$line_dot->set_key( "Students Appearing", 10 );

$default_hollow_dot = new hollow_dot();
$default_hollow_dot->size(5)->colour('#6363AC');

$line_hollow = new line();
$line_hollow->set_default_dot_style($default_hollow_dot);
$line_hollow->set_width( 1 );
$line_hollow->set_colour( '#6363AC' );
$line_hollow->set_values( $data_2 );
$line_hollow->set_key( "Students Getting First Class", 10 );

$y = new y_axis();
$y->set_range( 0, 100, 10 );

$x_label = new x_axis_labels();
$x_label->set_labels($labels);

$x = new x_axis();
$x->set_labels($x_label);

$chart = new open_flash_chart();
$chart->set_title( new title( 'Line Charts Example' ) );
$chart->set_y_axis( $y );
$chart->set_x_axis( $x );

// here we add our data sets to the chart:
$chart->add_element( $line_dot );
$chart->add_element( $line_hollow );

echo $chart->toPrettyString();
?>

In the above code, first connect to the database, and fire the query to fetch all rows in the table. The result set is saved in three different arrays; the first array holds the Subjects, which are used to create the X axis labels. The second and the third arrays save the data to be plotted. Then get down to plotting the data. First a dot object is created, and its size and colour defined. Then a new line object is created; the dot object is passed as the default dot style for the line. Then the width and colour of the line are set. Finally, set the data to be plotted, by passing the array containing the data to the set_values method. The key is set using the set_key method. Similarly, the next line object is created and the data to be plotted is passed to it. Then the X axis and Y axis objects are created. For the Y axis, the range is set as 0-100, with an interval of 10. For the X axis, the labels are created by passing the labels array we have already saved. Finally, the chart object is created, the X and Y axes are set, and the line objects are passed as elements to the chart. This completes the data file. Save the above code as data_file.php in the Web server root folder.

Next, start with the HTML file to be used, which is as follows:

<html>
<head>
<title></title>
<script type="text/javascript" src="js/swfobject.js"></script>
<script type="text/javascript">
swfobject.embedSWF(
    "open-flash-chart.swf", "line_chart",
    "500", "500", "9.0.0", "expressInstall.swf",
    {"data-file": "data_file.php"}
);
</script>
</head>
<body>
<div id="line_chart"></div>
</body>
</html>

In this HTML file, include the data file, data_file.php. The size of the graph to be plotted can be passed as arguments; here, 500 by 500 pixels. Next, add the div line_chart, which is specified in the header as where to embed the Flash object. Save this file as line_display.html in the Web server root folder.

Figure 1: Our graph plot

OPENCV
