
BIT Year 3 Semester 6 Detailed Syllabi

IT6205

IT6205 Systems & Network Administration


(Compulsory)
INTRODUCTION
This is a compulsory course designed for Semester 6 of the Bachelor of Information
Technology Degree program. The course provides the theoretical and practical
knowledge required to administer computer systems and networks.
CREDITS: 03
LEARNING OUTCOMES
After successful completion of this course students will be able to:

Describe the role/scope of a system and network administrator

Install various operating systems

Manage computer systems and undertake operational tasks

Provide network services to users

Apply scripting tools for automating system administration

Describe virtualization

MINOR MODIFICATIONS
When minor modifications are made to this syllabus, they will be reflected in the
Virtual Learning Environment (VLE), and the latest version can be downloaded from
the relevant course page of the VLE. Please send your suggestions and comments
through the VLE: http://vle.bit.lk
ONLINE LEARNING MATERIALS AND ACTIVITIES
If you are a registered student of the BIT degree program, you can access all learning
materials and this syllabus in the VLE: http://vle.bit.lk. It is very important to
participate in the learning activities given in the VLE to learn this subject.
FINAL EXAMINATION
The final exam of the course will be held at the end of the semester. Each course in
Semester 6 is evaluated using a two-hour structured question paper.

OUTLINE OF SYLLABUS
Topic                                                    Hours

1- Introduction to System & Network Administration       03
2- Installing an Operating System                        04
3- Host Management                                       13*
4- Network Management                                    15*
5- Automating System Administration                      05*
6- Virtualization                                        05

Total for the subject                                    45

* Students are expected to do practical work to complete their learning in these
topics.
The recommended operating system for this module is CentOS 6 or later.

REQUIRED MATERIALS
Main Reading
Ref 1: Evi Nemeth, Garth Snyder, Trent R. Hein and Ben Whaley, UNIX and Linux
System Administration Handbook (4th Edition), Pearson Education, Inc., 2011.
Ref 2: https://lopsa.org/CodeOfEthics

DETAILED SYLLABUS:
Section 1 : Introduction to System & Network Administration (03hrs)
Instructional Objectives

Describe the essential duties of a system and network administrator

Identify similarities and differences among Linux Distributions

Find the required information using Man/info pages and other documents

1.1. Essential duties of the system administrator [Ref 1: pg. 4-6]


1.1.1. Adding/Removing Hardware
1.1.2. Monitoring & troubleshooting of the system
1.1.3. Maintain Local Documentation
1.1.4. Fire fighting
1.2. Unix and Linux Distributions [Ref 1: pg. 7-9]
1.3. Ethics [Ref 2:]
1.4. Man pages and online documentation [Ref 1: pg. 16-18]
1.5. RFCs and Other Documents [Ref 1: pg. 18-20]

Section 2: Installing an Operating System (04 hrs)


Instructional Objectives

Describe operating system concepts and installation procedures

Identify the role of the boot loader in a dual-boot system

Material /Sub Topics


2.1. Boot Process [Ref 1: pg. 78-82]
2.1.1. Boot Process Steps
2.1.2. Kernel Initialization
2.1.3. Hardware Configuration
2.1.4. Recovery Mode
2.1.5. Activation of Start up Scripts
2.2. Rebooting & shutting down* [Ref 1: pg. 100-101]

Section 3: Host Management (13 hrs)


Instructional Objectives

Plan and execute system management procedures

Describe user management

Install additional packages using package management tools

Characterize different types of disk storage

Describe how processes are controlled

Characterize different file system formats and their attributes

Use file related commands

Material /Sub Topics


3.1. Root Privileges* [Ref 1: pg. 110-117]
3.1.1. su
3.1.2. sudo
3.2. User Management* [Ref 1: pg. 174-200]
3.2.1. passwd file
3.2.2. group file
3.2.3. Home Directory
3.2.4. Setting permission and ownership
3.2.5. Adding/deleting users
3.2.6. Disabling logins
3.3. Software Installation and Management (rpm, yum, apt)* [Ref 1: pg. 381-391]
3.3.1. Managing packages with rpm
3.3.2. Managing packages with Yum
3.3.3. Managing packages with Apt tool
3.4. Disk Storage [Ref 1: pg. 206-264]
3.4.1. Storage Hardware Interface
3.4.2. Disk Partitioning and RAID
3.4.3. File Systems and Mounting*
3.5. Controlling Processes* [Ref 1: pg. 120-138]
3.5.1. Process Attributes
3.5.2. Signals & States
3.5.3. Nice
3.5.4. top
3.5.5. proc

3.6. File System [Ref 1: pg. 140-158]


3.6.1. Path Names
3.6.2. File Names
3.6.3. File Tree
3.6.4. File Types & Attributes*
3.6.5. File commands: chmod, chown, chgrp, umask*

Section 4 : Network Management (15 hrs)
Instructional Objectives

Plan and execute network management procedures

Identify user requirements and plan for deployment/configuration of network services

Material /Sub Topics


4.1. Network Configuration [Ref 1: pg. 469-472, 476-483]

4.1.1. Host Name & IP configuration


4.1.2. Configuration of the Basic Routing and Default Gateway
4.2. Configuring a Web Server (Apache)* [Ref 1: pg. 956-958, 963-971]
4.3. Configuring a DNS Server (Bind)* [Ref 1: pg. 552-596]
Section 5: Automating System Administration (05 hrs)
Instructional Objectives

Use appropriate scripting tools to automate periodic processes

Material /Sub Topics


5.1. Shell Basics [Ref 1: pg. 29-36]
5.2. Bash Scripting* [Ref 1: pg. 37-52]
5.3. Periodic Processes* [Ref 1: pg. 283-287]
Section 6 : Virtualization (05 hrs)
Instructional Objectives

Identify different types of Virtualization

Describe the given types of Virtualization

Identify available tools for Virtualization

Material /Sub Topics


6.1. Virtualization [Ref 1: pg. 983-997]
6.1.1. Full virtualization
6.1.2. Para virtualization
6.1.3. Native virtualization
6.1.4. Cloud Computing
6.1.5. Virtualization with Linux
6.1.6. Introduction to Xen
6.1.7. Introduction to KVM

* Students are recommended to do practical work on these topics.

IT 6205
Section 1.0
Introduction to System & Network
Administration

2012, University of Colombo School of Computing

1.1 Essential duties of the system administrator


What is Systems Administration?


System Administration, sis'tem ad-min'is-tra'shon, n.
Activities which directly support the operations and integrity
of computing systems and their use and which manage
their intricacies (complexity).
These activities minimally include system installation,
configuration, integration, maintenance, performance
management, data management, security management,
failure analysis/recovery, and user support.
In an inter-networked computing environment, the computer
network is often included as part of the complex
computing system.

Being a SysAdmin Professional


System Administration, if done well, should
be equal parts:
Technical skills
People & communications skills
Problem solving & Common sense
Personal Commitment

SysAdmin involves a tension between authority and responsibility on one hand and
service and co-operation on the other.

System Administration: An introduction


Who is a system administrator?
Anyone who manages a computer not solely for
their own use.

What are the goals of system administration?


Ensure that computing systems run correctly and
as efficiently as possible
Ensure that all users can and do use the
computing systems to carry out their required
work in the easiest and most efficient manner.
These are conflicting goals.

Who is a System Administrator?


Someone who takes care of the systems while others are using them.
System running smoothly and efficiently
Users able to work in an easy and efficient manner

"My job is like an airplane pilot's. When I'm doing it well, you
might not even notice me, but my mistakes are often quite
spectacular."

Tasks of a Network Administrator


Security Management
Performance Management

Planning for Growth


Fault Management and Recovery
Account/User Management
Networked Application Support

Security Management
Firewalls
Usernames

Password control
Resource Access Control


Performance Management
Availability
Response Time
Accuracy


Planning for Growth


A Network (or any organisation) is not static
Growth means increased load on a network. This
must be planned for.

Systems eventually need replacement. This must


be planned for in advance


Fault Management and Recovery


Monitoring
Reporting status
Testing
Fixes and Patches
Updates
Repairs
Change Management


Account / User Management


Communication Facilities: Connection, Rental, Charges
Hardware Usage: Lease, Rent, Hire
Consumables Usage: Power, Paper, Media
Software Usage: Licensing, Tolls, Application usage

Account / User Management


Accounts are Managed for:
Intrusion detection / prevention

Charging for Services


Legal protection of the Organisation


Networked Application Support


Client / Server systems support
Internet support
Server support

Applications and Hardware


Helpdesk
Trouble report / Bug fixes

Printing
eMail

How to be a Sys/Net Admin


(Yet another Job Description)
Learn Operating System basics
Learn shell utilities and script programming

Learn how to Install and Configure OS


Learn Web, DNS, Email, Proxy,
Learn TCP/IP networking
Learn about system tuning and accounting
Learn Compile and Customization

Goals of System/Network Administration


Put together a network of computers

Get them running


Keep them running (despite Users.)

Provide a Service to Users


Requires skills of
Mechanic
Sociologist
Researcher

Challenges of System/Network
Administration
Systems or Network Administration is
more than just installing computers or
networks.
It is about planning and designing an
efficient community of computers that
allows users to get their jobs done.


Challenges of Administration
Design Logical, Efficient networks

Easily deploy & update many machines


Decide what services are needed
know the business tasks & customers

Plan and implement adequate security


Provide comfortable User environment
Be able to fix errors and problems
Keep track of & be able to use knowledge

Comparison of System/Network
Management Styles
Fire-Fighting
Managing by responding to situations when
they happen (Reactive)
Preventative management
Monitor network and make repairs and
changes before problems appear (Proactive)
These are two opposite extremes.
Most real managers combine both.


Fire-Fighting
Investigate the Fault or Problem
Isolate the problem and identify/define it
Use tests and tools to diagnose the problem
Solve the problem and document the solution
Prioritize multiple problems


Preventative Management Techniques


Capacity Planning
Simulation and Testing
load generators
Benchmarks
Performance Monitors and System Tuning
Network analysis and modelling
Load balancing
Hardware upgrades


Sources of Information for System/Network Administrators
Manuals and Online Documentation

World Wide Web


RFCs, FYIs,
News groups, Discussion lists, WebLogs, Blogs, ...
Meetings, Seminars, Examinations
SAGE/Usenix, Microsoft TechNet/TechEd, RHCE
How-To books

Successful System Administration


Need to find a balance between
Authority and responsibility
Service and cooperation
A few Basic strategies
Plan it before you do it
Make it reversible
Make changes incrementally
Test, test, test before you unleash it on the
world
Know how things REALLY work.

Successful System Administration


Example: editing system configuration files.
Keep a copy before any change to the configuration file
For the original version, use a suffix such as .dist or .orig
For further changes, use a suffix such as .old, .sav, .yymmdd, etc.
Keep the current modification date
cp -p
Plan how to back out if the change doesn't work (say, the system does not even boot)
For example, boot to single user mode and copy the old version back
Test the change in a non-production environment first
Eliminate the most obvious problems
Make one major change at a time
Make the test easier
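
The backup habit described above can be sketched as a small shell function. This is only an illustration under stated assumptions: backup_config is a hypothetical helper name, not a standard tool, and the choice of .orig for the first copy and .yymmdd for later copies simply follows the suffixes listed on the slide.

```shell
# Sketch: keep a copy of a configuration file before editing it.
# backup_config is a hypothetical helper, not a standard command.
backup_config() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "backup_config: no such file: $file" >&2
        return 1
    fi
    if [ ! -e "$file.orig" ]; then
        # First change: preserve the pristine version.
        # cp -p keeps the modification time, mode and (where permitted) ownership.
        cp -p "$file" "$file.orig"
        echo "$file.orig"
    else
        # Later changes: keep a date-stamped copy (yymmdd).
        local stamp
        stamp="$(date +%y%m%d)"
        cp -p "$file" "$file.$stamp"
        echo "$file.$stamp"
    fi
}
```

A possible use is `backup_config /etc/hosts` before opening the file in an editor; the function prints the name of the copy, which is the file to restore if the change has to be backed out.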

Successful System Administration


Successful system administration
Careful planning
Habit
Changing the root password regularly
Faithfully making backups (no matter how tedious)
Testing every change several times
Sticking to policies you've set

Handling crises
Have the foresight
Take time to anticipate and plan for the emergency
Prevent crises by carrying out all careful procedures.

Final Word on System Admin


The task of system administration is a balancing act. It
requires patience, understanding, knowledge and
experience.
(eg. Working in a casualty ward of a hospital .....)
In order to be good at system administration, a certain
amount of dedication is required with both theoretical and
practical skills.
Even though the best system administration tools are free,
companies are actively seeking to pay consultants/system
administrators to set up and maintain administration tools
for them!

1.2 Unix and Linux Distributions


Multics

Bell Labs joined the Multics project of GE and MIT in 1965
A primitive version of Multics ran on the GE 645
Bell Labs ended its participation in Multics in 1969


Unics / UNIX
The former Multics group at Bell Labs wanted to continue system programming
Ken Thompson used a cast-off PDP-7 to play the game Space Travel
With Dennis Ritchie, Ken gradually implemented an operating system for the PDP-7
The new OS was named Unics, as opposed to Multics


UNIX
UNIX was originally written in assembler and B
Dennis Ritchie improved B and named the result C
In 1973, most of UNIX was rewritten in C
UNIX was migrated from the PDP-7 to a PDP-11


UNIX


http://www.bbc.co.uk/news/technology-15287391


BSD
In the 1970s AT&T was under a court order not to sell software
AT&T gave UNIX away to universities, charging only for media
Ken Thompson took UNIX to the University of California, Berkeley
Berkeley released the BSD (Berkeley Software Distribution) version of UNIX
BSD too went through many releases until BSD 4.4 was released. This too became
accepted in the commercial world. So two competing versions reigned, namely
System V and BSD.

System V & GNU


In 1984, AT&T was divested and was allowed to sell UNIX
AT&T developed more versions until it released a commercial version called
System III, and this was followed by System V Release 4, SVR4 (supported by
many vendors)
UNIX became commercial, with restricted source code
Richard M Stallman (RMS) left the MIT AI Lab to found the GNU (GNU's Not UNIX)
Project under the Free Software Foundation
The goal of GNU was to create a free UNIX-like operating system
GNU defined the word "free" as in "free speech", not as in "free beer"

GNU
GNU distributed its software under the
GNU General Public License (GPL)
The GPL mandated that changes to GPLed
programs also be under the GPL
By 1990, the GNU system was almost
complete
GNU Hurd, the kernel of the GNU
system, was not ready

http://www.gnu.org/


Finally on Unix
Most Unix versions were based on BSD or System V
IEEE developed a standard to enable various flavors of
Unix to inter-network. This ANSI standard, known as
POSIX (Portable OS Interface for Computer Environments),
is the collective name of a family of
related standards specified by the IEEE to define
the application programming interface (API), along with
shell and utilities interfaces, for software compatible with
variants of the Unix operating system, although the
standard can apply to any operating system. The
term POSIX was suggested by Richard Stallman in
response to an IEEE request for a memorable name.

What is Linux?
Linux is a free Unix-type operating system originally
created by Linus Torvalds with the assistance of
developers around the world.
Linux is an independent POSIX implementation and
includes true multitasking, virtual memory, shared libraries,
demand loading, proper memory management, TCP/IP
networking, and other features consistent with Unix-type
systems.
The source code for Linux is freely available to everyone.
'Linux' refers to the kernel part of the OS.
The kernel will run on many platforms: PDP/11, Alpha,
Cray, x86, PowerPC, PDAs and many more.
Today .

http://www.linux.org

The Origins of Linux


The Beginning
The core of the Linux operating system was coded
by a Finnish programmer called Linus Benedict
Torvalds in 1991, when he was just 21. He had got a
new 386, and he found the existing DOS and UNIX
too expensive and inadequate.
In those days, a UNIX-like tiny, free OS called Minix
was extensively used for academic purposes. Since
its source code was available, Linus decided to take
Minix as a model. In his own words,
"I wanted to write a better Minix than Minix."
In order to encourage wide dissemination of his OS, Linus made the source
code open to the public. At the end of 1992 there were about a hundred Linux
developers. The next year there were 1000. And the numbers multiplied every
year.

The Origins of Linux (Contd.)


1991 Linux is created as a hobby by Linus
1992 First public version (Linux 0.02)
1993 First prefabricated Linux distributions
1996 Support for non-Intel processors
1999 2.2 kernel released
Then 2.4 and 2.6 kernels
Though Linus never imagined it, Linux quickly became a general tool for
computing. People stopped looking at Linux as a toy, and began to think
about it seriously. Today there are thousands of applications that can be
run on Linux, from Office Suites to 3D games. Hundreds of Linux User
Groups the world over discuss ways to make Linux work better. Numerous
web sites, and thousands of newsgroups and mailing lists, talk
about Linux.

Linux Lineage
While many UNIX systems are based on System V of AT&T or BSD (Berkeley
Software Distribution) of the University of California, Berkeley, Linux has been
developed without using the source code of these two systems.
As a result, Linux can function as an independent UNIX-type operating system
and can be freely redistributed without infringing the license. The
development of Linux has been based on the activities of many volunteers, and
its functions and reliability are comparable with any of the commercially
marketed UNIX systems.

[Figure: lineage diagram relating UNIX, BSD, System V, PC UNIX, FreeBSD, Minix and Linux]

What is Linux Really?


Linux itself is just the kernel
The heart of the system; takes care of memory management,
interrupt handling, etc. (i.e. a common interface between
user processes and hardware)
The kernel is only useful when used in conjunction with other software
- GNU Project
- XFree86
- Others

[Figure: system layers: Shell, Utility, Kernel, Hardware]


Linux Distributions


What is a Linux Distribution?


Takes the kernel (2.6) & other software and sells/gives them to you
Provide a friendly method of installing the system

Provide security updates and bug fixes


Provide a method for installing and removing extra software
A packaging system
Provide their own utility software, e.g.
Printer setup,
Network setup,
And so on

Each distribution has its own characteristics
Though the OS is the same, the bundled software varies from one
distribution to another.

What Makes Distributions Different?


System Installer
Anaconda (Red Hat, Fedora and ...)
Yast (SuSE)
Package Management
RPM (Red Hat, Fedora, SuSE and ...)
DEB (Debian based distros)
TGZ (Slackware based distros)
Configuration System
Yast (SuSE)
system-config-* (Fedora)
Packages

Major Distribution Types


Distribution:
Linux kernel
Command (GNU tools)
Library (glib)
Appl/Installation tools

Distribution Types
RedHat: Mandrake, Suse
Debian: Storm, Corel
Slackware: Plamo
Source-based: Gentoo, Sorcerer


Popular Linux Distributions

http://distrowatch.com/

Linux Kernel Release Number


Release Number a.b.c
a - major release
b - odd means development release; even means stable release
c - minor number (patch number of the major release)
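
As a rough illustration of the a.b.c rule above, the following shell sketch classifies a release number by the parity of its second component. classify_kernel is a hypothetical helper name introduced only for this example, and the rule is meaningful only for the older kernel series that followed this numbering convention; later kernels do not use it.

```shell
# Sketch: classify an a.b.c kernel release number by the parity of b.
# classify_kernel is a hypothetical helper, not a standard command.
classify_kernel() {
    local version="$1" rest minor
    case "$version" in
        *.*) ;;                              # need at least "a.b"
        *) echo "unknown"; return 1 ;;
    esac
    rest="${version#*.}"                     # drop the major number "a."
    minor="${rest%%.*}"                      # keep only "b"
    case "$minor" in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
    if [ $((minor % 2)) -eq 0 ]; then
        echo "stable"
    else
        echo "development"
    fi
}
```

For example, `classify_kernel 2.6.32` reports a stable release and `classify_kernel 2.5.7` a development release. On a running system the release string can be obtained with `uname -r`.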

Linux Kernel maintainer


Release 2.0.x - David Weinehall
Release 2.2.x - Marc-Christian Petersen
(former maintainer Alan Cox)
Release 2.4.x - Willy Tarreau
(former maintainer Marcelo Tosatti)
Release 2.6.x - Linus Torvalds (Current Development)
Latest stable version is 3.0 (check the Internet)
http://www.kernel.org/

Linux Kernel
Linus Torvalds releases Linux kernel version 3 to celebrate
20 years of penguin-powered computing

This third iteration, currently named 3.0.0-rc1, comes 15
years after 2.0 first hit the web. Also included is code
optimized for AMD's Fusion and Intel's Ivy and Sandy
Bridge silicon, and some updated graphics drivers, too.
Despite these tasty new treats, Torvalds is quick to point out that this
new release is an evolutionary change, and unleashing the big
three-oh was all about moving into a third decade of distribution, not about
overhauling the OS.

What is GNU/Linux?
GNU/Linux is an operating system composed of:
the Linux core kernel
GNU software (free software)
GNU/Linux is free.
You can redistribute and modify GNU/Linux as long as you
don't violate the GPL.


1.3 Ethics


Ethics
Systems and Network administrators play a critical role in the
security and availability of the systems and networks they are
responsible for. During the course of their duties it is inevitable
that they will come into contact with sensitive, personal or
restricted information.
For these reasons system and network administrators must
display an exemplary work ethic.

Systems Administration is a profession.


It is a powerful profession.
A Systems Administrator must be ethical

Respect private information


Do not abuse power

Being a System Administrator


Systems Administrators need extremely high privileges,
which involve rights over other people's files and directories.
They must have such rights as they need access to
directories and files to investigate problems, change
passwords, perform backups, etc.
The Super User: The user with such far-reaching powers is
known as the super-user. In UNIX this user is called root.
Root owns the UNIX implementation and has rights to
everything, including deleting all files (even the kernel!)
on the system. The root password must only be known by
one person (and a copy kept in a sealed envelope in a safe).
For security reasons, you cannot have many people who know the
root password.


SAGE Code of Ethics (1/3)


The integrity of a system administrator must be beyond
reproach.
SAs come in contact with privileged information regularly
Need to protect integrity and privacy of data
Must uphold law and policies as established for their systems

A system administrator shall not unnecessarily infringe
upon the rights of users.
No tolerance for discrimination except when required for job
Must not exercise special powers to access information except
when necessary


SAGE Code of Ethics (2/3)


Communications of system administrators with all whom
they may come in contact shall be kept to the highest
standards of professional behavior.
Must keep users informed of computing matters that might affect
them
Must give impartial advice, and disclose any potential conflicts of
interest

The continuance of professional education is critical to
maintaining currency as a system administrator.
Reading, study, training, and sharing knowledge and
experiences are requirements

SAGE Code of Ethics (3/3)


A system administrator must maintain an exemplary
work ethic.
A sysadmin can have a significant impact on an organization; a
high level of trust is maintained by exemplary behavior

At all times system administrators must display
professionalism in the performance of their duties.
Need to be professional, even when dealing with management,
vendors, users, or other sysadmins

SAGE is now known as Usenix LISA (Special Interest
Group for Sysadmins). For more information visit:
https://www.usenix.org/lisa

Ethics LOPSA
The League of Professional System Administrators
(LOPSA) is a nonprofit corporation with members
throughout the world. Their mission is to advance the
practice of system administration; to support, recognize,
educate, and encourage its practitioners; and to serve
the public through education and outreach on system
administration issues.
LOPSA's System Administrators' Code of Ethics can be
found at: https://lopsa.org/CodeOfEthics


1.4 Man Pages & Online Documentation


man page
The Linux equivalent of HELP is man (manual)
A man page (short for manual page) is online software
documentation, serving as content for the man system, for an
entity typically encountered in Unix /Linux systems.
Such entities include computer programs (including library
and system calls), formal standards and conventions, and
even abstract concepts. A user may invoke a man page by
issuing the man command.
Use man <command> to display help for that command
Use man -k <keyword> to find all commands with that
keyword
Output is presented a page at a time. Use b to scroll
backward, f or the space bar to scroll forward, and q to quit

man page layout


All man pages follow a common layout that is optimized for
presentation on a simple ASCII text display, possibly without
any form of highlighting or font control. Sections present may
include:
NAME: The name of the command or function, followed by a
one-line description of what it does.

SYNOPSIS: In the case of a command, you get a formal
description of how to run it and what command line options it
takes. For program functions, a list of the parameters the
function takes and which header file contains its definition. For
experienced users, this may be all the documentation they
need.


man page layout


DESCRIPTION: A textual description of the functioning of the
command or function.
EXAMPLES: Some examples of common usage.
SEE ALSO: A list of related commands or functions.

Other sections may be present, but these are not well
standardized across man pages. Common examples include:
OPTIONS, EXIT STATUS, ENVIRONMENT, KNOWN BUGS,
FILES, AUTHOR, REPORTING BUGS, HISTORY and
COPYRIGHT.


Linux Documentation Project


The Linux Documentation Project (LDP) is working on
developing good, reliable documentation for the Linux
operating system.
The overall goal of the LDP is to collaborate in taking
care of all of the issues of Linux documentation, ranging
from online documentation (man pages, HTML, and so
on) to printed manuals covering topics such as installing,
using, and running Linux.
Visit http://tldp.org/docs.html for more details.


1.5 RFCs and Other Documents


Request for Comments (RFC)


A Request for Comments (RFC) is a memorandum published
by the Internet Engineering Task Force (IETF) describing
methods, behaviors, research, or innovations applicable to the
working of the Internet and Internet-connected systems.
The IETF adopts some of the proposals published as RFCs
as Internet standards.

Request for Comments (RFC) documents were invented
by Steve Crocker in 1969 to help record unofficial notes on
the development of the ARPANET. They have since become
the official record for Internet specifications, protocols,
procedures, and events.
To connect to the RFC repository maintained by the IETF,
visit: http://www.ietf.org/rfc.html

Request for Comments (RFC)


Anyone can submit a document to be an RFC, although in
practice they are generated by the Internet Engineering Task
Force, and then reviewed by the IETF groups, various experts,
and the RFC Editor before publication. An RFC is never updated,
although it may be superseded by later revisions. RFC 2026, The
Internet Standards Process -- Revision 3, provides a good
description of the Internet standards development process, and is
updated by RFC 3932.
RFC Editor: Funded by the Internet Society to edit and publish
RFCs online. The RFC Editor maintains the master repository of
RFCs as well as RFC meta-data, which can be searched online.
The search results include the meta-data, links to the RFC text
itself, and links to any errata. Visit: http://www.rfc-editor.org/index.html

End of Section 1.0


IT 6205
Section 2.0
Installing an Operating System


2.1 Boot Process


Boot Process
[Figure: boot sequence: BIOS -> GRUB -> Linux Kernel -> init -> /etc/inittab -> /etc/rc.d/rc.sysinit -> runlevel-specific scripts (/etc/rc.d/rc3.d or /etc/rc.d/rc5.d) -> Login Shell]

Welcome to Linux ..


2.2 Rebooting & Shutting down


Rebooting & Shutting down


Linux systems provide various utilities that allow a
system administrator to reboot or shut down the
system.
Using a proper method to reboot or shut down a
Linux system protects data by terminating
processes and synchronizing the file systems.
The shutdown -r or reboot command allows you to
reboot a Linux system if you are logged in as root at a
command prompt.
The init command can also be used to reboot the system by
entering runlevel 6 (init 6).

Rebooting & Shutting down


cont.
The init command allows you to change the current
runlevel; for a shutdown, this value is 0 (init 0).
The shutdown, halt, or poweroff command allows you to
shut down a Linux system.
Depending on the GUI you have installed, it may also provide
options to reboot or shut down the system.


End of Section 2.0


IT 6205
Section 3.0
Host Management


3.1 Root Privileges


Root Privileges
su
Create a shell with the effective user ID of the given
user. If no user is specified, create a shell for a
privileged user (root).
su [option] [user]
[Screenshot callouts: "Login as a user." and "The - switch will allow you to login as root."]

Root Privileges
sudo
If you have privileges, sudo allows you to
execute commands as superuser.
sudo [options] [command]
The main advantage of sudo is that you can create
policies for users and limit which programs they
are allowed to execute.
These policies are stored in the /etc/sudoers file.


3.2 User Management


User Management
passwd file
- Located at /etc/passwd
- When a user is created, the account information is
stored in the passwd file. Each record consists of
seven fields separated by colons (':').
- On systems that use shadow passwords, the Password
field holds a placeholder (x) and the password hash is
kept in /etc/shadow.
Username : Password : User Identifier (UID) : Group
Identifier (GID) : Name of the User : Home Directory : Program
or Shell
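The seven-field layout can be seen by splitting one record on the colon. This is a minimal sketch: the record is a made-up account (not a real user), and parse_passwd is a hypothetical helper name.

```shell
# Split one /etc/passwd-style record into its seven colon-separated fields.
# The record below is a made-up example, not a real account.
record='saman:x:1001:1001:Saman Perera:/home/saman:/bin/bash'

parse_passwd() {
    printf '%s\n' "$1" | {
        # IFS=: applies only to this read command.
        IFS=: read -r user pass uid gid gecos home shell
        printf 'user=%s uid=%s gid=%s home=%s shell=%s\n' \
            "$user" "$uid" "$gid" "$home" "$shell"
    }
}

parse_passwd "$record"
```

The fifth field (the user's name) may contain spaces, which is why the record is split on colons rather than on whitespace.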

User Management
Group file
- Located at /etc/group
- The group file is a text file that defines the groups
on the system.
- Each record in the group file has four fields:
- Groupname : Password : Group ID : Users


User Management
Home Directory
- Personal workspace of the user. Typically only the
user and the superuser have access to this
personal directory. User directories are
stored under /home/[user]
- Eg: /home/saman
Setting permission and ownership
- Linux has three types of permissions:
Read, Write, Execute
- We can assign permissions using octal numbers, where
each digit is the sum of read (4), write (2), and execute (1).

User Management
Triplet for u: rwx => 4 + 2 + 1 = 7
Triplet for g: r-x => 4 + 0 + 1 = 5
Triplet for o: r-x => 4 + 0 + 1 = 5
Which makes : 755
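The same arithmetic can be written as a short script. This is only a sketch; symbolic_to_octal is a hypothetical helper name, and it expects exactly the nine permission characters (no leading file-type character).

```shell
# Convert a nine-character symbolic mode such as rwxr-xr-x to octal.
symbolic_to_octal() {
    mode=$1
    result=''
    while [ -n "$mode" ]; do
        triplet=$(printf '%s' "$mode" | cut -c1-3)   # next three characters
        mode=$(printf '%s' "$mode" | cut -c4-)       # remainder
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac   # read
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # write
        case $triplet in ??x) digit=$((digit + 1)) ;; esac   # execute
        result=$result$digit
    done
    printf '%s\n' "$result"
}

symbolic_to_octal rwxr-xr-x   # prints 755
```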


User Management
Adding/deleting users
- The superuser or a privileged user can add or
remove users from the system.
- useradd [options] [user]
- userdel [options] [user]


User Management
Modify user account information
- usermod [options] [user]
Disabling logins
- A system administrator can block users
temporarily without deleting their accounts.
On FreeBSD this is done with pw lock [user] and
pw unlock [user]; on Linux use passwd -l [user]
and passwd -u [user].
- Another way to block a user is
- usermod -L [user]

3.3 Software Installation & Management


Software Installation & Management


rpm
rpm [options]
A freely available packaging system for
software distribution and installation. RPM
packages are built, installed, and queried with
the rpm and rpmbuild commands. The rpm command
options are grouped into three subgroups for:
- Querying and verifying packages
- Installing, upgrading, and removing packages
- Performing miscellaneous functions

Software Installation & Management


rpm install

rpm remove
To remove a package, use the -e option:
rpm -ev [package]


Software Installation & Management


yum (Yellowdog Updater, Modified)
yum [command] [package name/s]
yum automatically checks all
configured repositories to resolve package
dependencies during an installation or upgrade.
You can add a new yum software repository by appending its
definition to the end of /etc/yum.conf or by placing it in a
separate file named [anyname].repo in the
/etc/yum.repos.d/ directory.
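A repository definition is a short INI-style file. The sketch below writes one to a scratch directory so that it can be run without root; the repository id, name, and URL are placeholders (assumptions), not a real mirror.

```shell
# Write a sample repository definition to a scratch directory.
# On a real system the file would live in /etc/yum.repos.d/ and need root.
repo_dir=$(mktemp -d)
cat > "$repo_dir/example.repo" <<'EOF'
[example]
name=Example Repository
baseurl=http://repo.example.com/centos/6/os/x86_64/
enabled=1
gpgcheck=0
EOF

# Show the repository ids defined in the file (the [section] headers).
grep '^\[' "$repo_dir/example.repo"
```

Signature checking is switched off here only to keep the sample short; a production repository would normally set gpgcheck=1 together with a gpgkey entry. The scratch directory can be removed afterwards with `rm -r "$repo_dir"`.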


Software Installation & Management


apt
apt [command] [package name/s]
The Advanced Package Tool. A freely
available packaging system for software
distribution and installation.
You can add a new apt software repository URL to the
/etc/apt/sources.list file or to a separate file
named [anyname].list in the /etc/apt/sources.list.d
directory.


Software Installation & Management


Apt install


3.4 Disk Storage


Storage Hardware Interfaces


ATA (Advanced Technology Attachment)
ATA, known in earlier revisions as IDE, was
developed as a simple, low-cost interface for
PCs. It was originally called Integrated Drive
Electronics because it put the hardware
controller in the same box as the disk platters
and used a relatively high-level protocol for
communication between the computer and the
disks.


Storage Hardware Interfaces


PATA (Parallel ATA)
This style of disk is nearly obsolete, but the
installed base is enormous. PATA disks are
often labeled as IDE to distinguish them from
SATA drives. PATA disks are medium to fast in
speed, generous in capacity, and unbelievably
cheap.


Storage Hardware Interfaces


SATA (Serial ATA)
SATA is the successor to PATA. In addition to
supporting much higher transfer rates, SATA
simplifies connectivity with tidier cabling and a
longer maximum cable length.
SCSI
SCSI is one of the most widely supported disk
interfaces. It comes in several flavors, all of
which support multiple disks on a bus and
various speeds and communication styles.

Disk Partitioning
RAID (Redundant Array of Inexpensive Disks)
RAID is normally used to spread data among
several physical hard drives with enough
redundancy that, should any one drive fail, the data
will still be intact. Once created, a RAID array
appears as one device which can be used
much like a regular partition.


Disk Partitioning
There are several kinds of RAID; the two
most common are described here.
RAID-1 (mirroring): uses two essentially identical drives, each
with a complete set of the data.
RAID-5: set up using three or more drives,
with the data spread in such a way that any one drive
failing will not result in data loss.
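The price of this redundancy is usable capacity. The following sketch works it out for identical drives, under the usual assumptions that RAID-1 stores a full copy on each drive and RAID-5 spends one drive's worth of space on parity; raid_usable_gb is a hypothetical helper name.

```shell
# Usable capacity in GB for n identical drives of size_gb each.
raid_usable_gb() {
    level=$1 n=$2 size_gb=$3
    case $level in
        1) echo "$size_gb" ;;                  # mirror: one drive's worth
        5) echo $(( (n - 1) * size_gb )) ;;    # one drive's worth goes to parity
        *) echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}

raid_usable_gb 1 2 500   # prints 500
raid_usable_gb 5 4 500   # prints 1500
```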


Disk Partitioning
LVM (Logical Volume Manager)
LVM is a way of grouping drives and/or partitions
so that, instead of dealing with hard and
fast physical partitions, the data is managed on a
virtual basis, where the virtual partitions can be
resized.


File Systems and Mounting.


File system types
Ext2: This is like the UNIX file system. It has the
concepts of blocks, inodes and directories.
Ext3: It is the ext2 filesystem enhanced with
journalling capabilities. Journalling allows fast file
system recovery.
Isofs (iso9660): The file system used on CD-ROMs.
Sysfs: A RAM-based filesystem initially based
on ramfs. It is used to export kernel objects so
that end users can access them easily.
Procfs: The proc file system acts as an interface
to internal data structures in the kernel.

File Systems and Mounting.


Mounting File Systems
To access any file system, it is first necessary to
mount it. Likewise, when access to a particular
file system is no longer desired, it is necessary to
unmount it. To mount any file system, two pieces
of information must be specified:
- A means of uniquely identifying the desired disk
drive and partition, such as a device file name, file
system label, or devlabel-managed symbolic link
- A directory under which the mounted file system
is to be made available.

3.5 Controlling Processes


Controlling Processes
Process Attributes
PID or process ID, an integer.
PPID or parent process ID, an integer.
Nice number, the degree of friendliness of the process
towards other processes (process priority is calculated
from nice numbers and recent CPU usage).
TTY, the terminal to which the process is connected
RUID, or real user ID. The user issuing the command.
EUID, or effective user ID. The one determining access
permissions to system resources.

Controlling Processes
Process Attributes ctd:
EGID, or effective group owner. Different from
RGID when SGID has been applied to a file.
RGID, or real group owner. The group of the
user who started the process


Controlling Processes
Signals
Signals are a way of sending simple messages to
processes.
Most of these messages are already defined and can
be found in <linux/signal.h>.
Signals can only be processed when the process is in
user mode.
If a signal is sent to a process that is in kernel
mode, it is dealt with immediately on returning to user
mode.
Signals are one of the oldest inter-process
communication methods used by Unix systems.
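A shell can both send signals (kill) and handle them (trap). The sketch below starts a child shell that installs a handler for SIGUSR1 and then signals itself; the message strings are arbitrary choices for this example.

```shell
# Run a child shell that installs a handler for SIGUSR1, then signals itself.
# $$ sits inside single quotes, so it expands in the child: only the child
# shell receives the signal.
output=$(sh -c '
    trap "echo caught SIGUSR1" USR1
    kill -USR1 $$
    echo "still running after the signal"
')
printf '%s\n' "$output"
```

Without the trap line, the default action for SIGUSR1 would terminate the child shell before it reached the final echo.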

Controlling Processes
Process states in Linux:
Running: Process is either running or ready to run
Interruptible: a blocked state in which the process is
waiting for an event or a signal from another process
Uninterruptible: a blocked state in which the process waits
for a hardware condition and cannot handle any
signal
Stopped: the process is stopped or halted and can be
restarted by some other process
Zombie: the process has terminated, but its entry is
still present in the process table.

Controlling Processes
Commands
top - display top CPU processes
top [-] [d delay] [p pid]
top provides an ongoing look at processor
activity in real time. It displays a listing of the
most CPU-intensive tasks on the system, and
can provide an interactive interface for
manipulating processes. It can sort the tasks by
CPU usage, memory usage and runtime.


Controlling Processes
proc - process information pseudo-filesystem
/proc is a pseudo-filesystem which is used as
an interface to kernel data structures rather than
reading and interpreting /dev/kmem. Most of it is
read-only, but some files allow kernel variables
to be changed.


Controlling Processes
proc - cpuinfo


Controlling Processes
proc - meminfo
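Because the files under /proc are plain text, ordinary text tools are enough to read kernel data. The sketch below (Linux only) pulls single values out of "Key: value" style files; proc_value is a hypothetical helper name.

```shell
# Read one value from a "Key:   value" style file under /proc (Linux only).
proc_value() {
    # $1 = file, $2 = key
    awk -F':[ \t]*' -v key="$2" '$1 == key { print $2; exit }' "$1"
}

proc_value /proc/meminfo MemTotal     # total RAM, reported in kB
proc_value "/proc/$$/status" PPid     # parent process ID of this shell
```

The second call shows that process attributes such as the PPID are exposed the same way, through the per-process directory /proc/[pid].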


Controlling Processes
nice - run a program with modified scheduling
priority
nice [OPTION] [COMMAND [ARG]...]
Run COMMAND with an adjusted scheduling
priority. With no COMMAND, print the current
scheduling priority. The adjustment is 10 by default.
Niceness ranges from -20 (highest priority) to 19
(lowest).
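Both behaviours described above can be observed without root, since any user may lower the priority of their own processes. A minimal sketch, using an adjustment of 5 chosen for this example:

```shell
# With no command, nice prints the current niceness.
base=$(nice)

# Run "nice" itself at an adjusted niceness so that it reports the new value.
adjusted=$(nice -n 5 nice)

echo "current niceness: $base, after 'nice -n 5': $adjusted"
```

Negative adjustments (raising the priority) normally require superuser privileges.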


Controlling Processes
watch - execute a program periodically,
showing output fullscreen
watch [OPTION] <command>


Controlling Processes
time - The time command runs the specified
program command with the given arguments.
When the command finishes, time prints
timing statistics about this program run.
time [OPTION] <command>


3.6 File System


File System
Path Names
In Linux every file has an absolute path in a single
directory tree, unlike Windows with its drive letters.
Everything starts from root ( / )
Paths use / as the separator.
Eg: /home/jithendra/Downloads


File System
File Names
Filenames can contain any normal text
character including spaces and special
characters.
Filenames can be almost any length.
It is best to stick to a-z, A-Z,_, -, and
numbers.

If a filename contains a special character or a
space you may need to put quotes around the
whole path.
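The effect of the quotes can be checked directly. In this sketch the file name and the helper count_args are made up for the example, and all work happens in a scratch directory.

```shell
# Work in a scratch directory so nothing outside it is touched.
dir=$(mktemp -d)
name='my report.txt'
touch "$dir/$name"          # quotes keep the name as one argument

# count_args prints how many arguments it received.
count_args() { echo "$#"; }
count_args $name            # unquoted: the shell splits at the space, prints 2
count_args "$name"          # quoted: one argument, prints 1

ls "$dir"
rm -r "$dir"
```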

File System
File Tree


File System
File Types
Regular File: an ordinary file holding data, text, or
a program.
Directory: a special type of file that
lists other files.
Symbolic Link: a reference
to another file (a shortcut to that file).


File System
File Types
Socket: a special type of file that provides
inter-process networking protected by the file
system's access control
Named Pipe: a special type of file that acts
more or less like a socket and forms a way for
processes to communicate with each other
without using network socket semantics.
Device File: character devices and block
devices

File System
File commands
chmod: changes the permissions of a file
Permissions
u - User who owns the file.
g - Group that owns the file.
o - Other.
a - All.
r - Read the file.
w - Write or edit the file.
x - Execute or run the file as a program.

File System
chmod
Numeric Permissions: chmod can also set
permissions using numeric (octal) values:
400 read by owner      040 read by group      004 read by anybody (other)
200 write by owner     020 write by group     002 write by anybody
100 execute by owner   010 execute by group   001 execute by anybody



http://www.oreillynet.com/linux/cmd/cmd.csp?path=c/chmod
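A mode is built by adding the values for the permissions wanted. The sketch below adds three of the values from the list and applies the result to a scratch file; the file name demo.txt is an arbitrary choice for this example.

```shell
# owner read (400) + owner write (200) + group read (040) = 640
mode=$(printf '%03o' $(( 0400 + 0200 + 0040 )))
echo "$mode"

# Apply the mode to a scratch file and read it back from ls.
dir=$(mktemp -d)
touch "$dir/demo.txt"
chmod "$mode" "$dir/demo.txt"
perms=$(ls -l "$dir/demo.txt" | cut -c1-10)
echo "$perms"
rm -r "$dir"
```

The leading zeros make the shell treat the numbers as octal, which is why 0400 + 0200 + 0040 prints as 640 rather than a decimal sum.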

File System
File commands
chown: change file owner and group
chown [OPTION] [OWNER][:[GROUP]] FILE
chgrp: change the group that has access to
a file or directory.
chgrp newgroup filenames


File System
Umask
The User Mask
What determines the default permissions when a new
file is created?
Default permissions before applying the mask are
completely insecure:
rw-rw-rw- (octal 666) for files
rwxrwxrwx (octal 777) for directories


File System
Umask
The system default can be changed with the umask command (a
shell builtin).
A umask statement is placed in a startup script (typically
/etc/profile or /etc/bashrc).
It reassigns the default file and directory permissions.


File System
Umask
Use umask without arguments to show your current
permission setting
The Bash builtin shows 4 digits, e.g. 0066 (the inode actually stores
12 binary permission bits)
/usr/bin/umask shows 0022
To change the default permission setting:
umask xyz (x, y, and z are octal digits)
x, y, and z do not state the rwx permissions
directly; each bit that is set removes the corresponding permission
umask 0066
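The relationship between the mask and the resulting permissions is a bitwise operation: the bits set in the umask are cleared from the base permissions (666 for files, 777 for directories). A minimal sketch, with apply_umask as a hypothetical helper name:

```shell
# Resulting default mode = base permissions with the umask bits cleared.
apply_umask() {
    # $1 = base mode (666 for files, 777 for directories), $2 = umask
    printf '%03o\n' $(( 0$1 & ~0$2 ))
}

apply_umask 666 022   # prints 644
apply_umask 777 022   # prints 755
apply_umask 666 066   # prints 600
apply_umask 777 066   # prints 711

# The same effect on real files, inside a subshell so the change to the
# mask does not leak into the rest of the session.
dir=$(mktemp -d)
( umask 066; touch "$dir/file"; mkdir "$dir/sub" )
ls -ld "$dir/file" "$dir/sub" | cut -c1-10
rm -r "$dir"
```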

File System
The User Mask Value Table


End of Section 3.0


IT 6205
Section 4.0
Network Administration


4.1 Network Configuration


Detecting the Ethernet card


Run the following command and you should see all the
network interface cards detected.
ifconfig -a
If you can see eth0 then you can start setting up TCP/IP
networking.
If you cannot see eth0 in the output of ifconfig then you
can check whether the kernel has identified any
Ethernet controllers on the PCI bus. You can use the
following command for this.
lspci


Assigning IP address to the interface
You can use ifconfig to assign an IP address to the
interfaces.
ifconfig eth0 192.168.1.1 netmask 255.255.255.0 up
This assigns the IP address 192.168.1.1 to the eth0
interface. This interface configuration is not persistent across
reboots. Later, we will discuss how the network interfaces
can be configured using configuration files which keep
these values across reboots.
Assigning more than one IP address (IP aliasing)
ifconfig eth0:0 192.168.1.1 netmask 255.255.255.0 up
ifconfig eth0:1 192.168.1.2 netmask 255.255.255.0 up

Setting up the routing table


The route command is used to set up the routing table.
The IP layer consults the routing table to figure out how
an IP packet is sent towards the destination. When you
configure the network interface the routing table picks up
the first route entry automatically.

You can also add the routing table entry for your own
network explicitly as follows.
route add -net 192.168.1.0 netmask 255.255.255.0 eth0
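The network address passed to route is the interface address ANDed with the netmask, octet by octet. The sketch below works this out for the address used on the earlier slide; network_address is a hypothetical helper name.

```shell
# AND each octet of the address with the matching octet of the netmask.
# The body runs in a subshell, so the change to IFS stays local.
network_address() (
    IFS=.
    set -- $1 $2      # split both dotted quads into eight octets
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

network_address 192.168.1.1 255.255.255.0   # prints 192.168.1.0
```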


Setting up the routing table


cont.
To communicate with the rest of the Internet your
network must have at least one router that knows how
to guide your packets towards the destination in other
networks. This router must have one network interface
connected to your network. We can set this router as
our default gateway as follows:

route add default gw 192.168.1.254


Testing your NIC


You can check the routing table by using either one of the
following two commands.
route
netstat -nr
Configuring name resolution
You must know the IP address of at least one DNS server,
accessible from your network, to resolve names. Applications
use a set of library routines (resolver library) to resolve
names and these routines consult a file called
/etc/resolv.conf.
nameserver 192.248.16.91

Testing your NIC cont.


Testing
You can test your network by sending messages to other
hosts and bouncing them off those hosts. You can use the
ping command for this.
ping 192.168.1.254


Testing your NIC


More Testing
You can test your network by these commands as well.
ping -i 5 127.0.0.1 (Interval between packets in seconds; fractions such as 0.1 are allowed)
ping -c 5 -q 127.0.0.1 (Count & quiet summary)
ping -s 100 127.0.0.1 (Packet size in bytes)
ping -w 5 destination (Deadline: stop after 5 seconds)
netstat -a (List all ports)
netstat -at (List all TCP ports; -u for UDP)


Testing your NIC cont..


netstat -l (List all listening ports; -t TCP; -u UDP)
netstat -s (Statistics for each protocol)
netstat -c (Print the information continuously)
netstat -r (List the kernel routing table)


Configuring the network


Files and Scripts
As mentioned before, the IP address assigned using the
ifconfig command and the routing table entries do not
persist across reboots. To keep them, the settings must be
stored in configuration files. The names and locations of
these files differ between distributions.


Configuring the network


cont.
The primary network configuration files are as follows:

/etc/hosts - The main purpose of this file is to resolve
hostnames that cannot be resolved any other
way. It can also be used to resolve hostnames
on small networks with no DNS server.
/etc/resolv.conf - This file specifies the IP addresses of DNS
servers and the search domain.


Configuring the Network cont.


Files and Scripts
/etc/sysconfig/network - Specifies routing and host information for all
network interfaces.

/etc/sysconfig/network-scripts/ifcfg-<interface-name> - For each
network interface on a Red Hat Linux system, there is a corresponding
interface configuration script. Each of these files provides
information specific to a particular network interface.

NOTE: The /etc/sysconfig/networking/ directory is used by the Network
Administration Tool (system-config-network) and its contents should not be
edited manually.


Configuring the Network cont.


DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
NETWORK=192.16.1.0
NETMASK=255.255.255.0
IPADDR=192.16.1.1
USERCTL=no
BROADCAST=<address>, where <address> is the broadcast
address. This directive is deprecated.
DEVICE=<name>, where <name> is the name of the physical device.
Most of these entries are self-explanatory. BOOTPROTO can be set to
dhcp to use the Dynamic Host Configuration Protocol to configure the
interface. USERCTL indicates whether users other than the superuser are
allowed to bring up or shut down this interface.
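An ifcfg file consists of plain KEY=value lines, so a shell can read it by sourcing it. The sketch below uses a scratch copy holding a subset of the entries from the slide, so it runs without root; the real file would be /etc/sysconfig/network-scripts/ifcfg-eth0.

```shell
# Write a scratch copy of an interface configuration file.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.16.1.1
NETMASK=255.255.255.0
EOF

# Source the file: every KEY=value line becomes a shell variable.
. "$cfg"
echo "$DEVICE: $IPADDR/$NETMASK (onboot=$ONBOOT, bootproto=$BOOTPROTO)"
rm -f "$cfg"
```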

Dynamic Host Configuration Protocol (DHCP)

Dynamic Host Configuration Protocol (DHCP) is a
network protocol that automatically assigns TCP/IP
information to client machines.
Each DHCP client connects to the centrally located DHCP
server, which returns that client's network configuration
(including the IP address, gateway, and DNS servers).
The client retrieves this information from the DHCP
server.


Dynamic Host Configuration Protocol (DHCP)

DHCP is also useful if an administrator wants to change
the IP addresses of a large number of systems. Instead of
reconfiguring all the systems, the administrator can just edit one DHCP
configuration file on the server for the new set of IP
addresses.
If the DNS servers for an organization change, the
changes are made on the DHCP server, not on the DHCP
clients.
When the administrator restarts the network or reboots
the clients, the changes will go into effect.


Configuring a DHCP Server

To configure a DHCP server, you must create
the dhcpd.conf configuration file in the /etc/ directory.
There are two types of statements in the configuration file:
Parameters - State how to perform a task, whether to
perform a task, or what network configuration options to
send to the client.
Declarations - Describe the topology of the network,
describe the clients, provide addresses for the clients, or
apply a group of parameters to a group of declarations.


Configuring a DHCP Server

The parameters that start with the keyword option are
referred to as options. These options control DHCP
options, whereas parameters configure values that are
not optional or control how the DHCP server behaves.
Parameters (including options) declared before a section
enclosed in curly brackets ({ }) are considered global
parameters. Global parameters apply to all the sections
below them.
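The distinction between global parameters and declarations is easier to see in a small file. The sketch below writes a sample configuration to a scratch file instead of /etc/dhcpd.conf; the network, gateway, and DNS addresses are reused from earlier slides, while the lease times and the address range are assumptions chosen for illustration.

```shell
# Write a sample dhcpd.conf to a scratch file (no root needed).
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Global parameters: apply to every declaration below.
default-lease-time 600;
max-lease-time 7200;
option domain-name-servers 192.248.16.91;

# Declaration: one subnet and the addresses the server may hand out.
subnet 192.168.1.0 netmask 255.255.255.0 {
    option routers 192.168.1.254;
    option subnet-mask 255.255.255.0;
    range 192.168.1.100 192.168.1.200;
}
EOF

# Count the parameters that are DHCP options (lines starting with "option").
grep -c '^[[:space:]]*option ' "$conf"
```

On a machine with the ISC DHCP server installed, a file like this can usually be syntax-checked with `dhcpd -t -cf "$conf"` before it is copied into place.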


Configuring a DHCP Server Example

Save your /etc/dhcpd.conf file
Start the dhcpd daemon with the /etc/rc.d/init.d/dhcpd start command


4.2 Configuring Web Server


Web Server Basics


What is a web server?
- Program that responds to requests for documents ("http daemon")
- Uses the Hypertext Transfer Protocol (HTTP) to communicate
- Physical machine which runs the program
HTTP is
- Designed for document transfer
- Generic: not tied to web browsers exclusively; can serve any data type
- Stateless: no persistent client/server connection

Serving a Page
User of client machine types in a URL
Server name is translated to an IP address via DNS
Client connects to server using IP address and port number
Client determines path and file to request
Client sends HTTP request to server
Server determines which file to send
Server sends response code and the document
Connection is broken
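The first client-side steps amount to taking the URL apart into the server name (for the DNS lookup), the port (for the connection), and the path (for the request). A minimal sketch for plain http:// URLs follows; split_url is a hypothetical helper name.

```shell
# Split an http:// URL into host, port and path using parameter expansion.
split_url() {
    rest=${1#http://}              # drop the scheme
    hostport=${rest%%/*}           # text before the first slash
    case $rest in
        */*) path=/${rest#*/} ;;   # everything after the first slash
        *)   path=/ ;;             # no path given: request the root
    esac
    host=${hostport%%:*}
    case $hostport in
        *:*) port=${hostport##*:} ;;
        *)   port=80 ;;            # default HTTP port
    esac
    echo "host=$host port=$port path=$path"
}

split_url http://myserver/index.html   # prints host=myserver port=80 path=/index.html
```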


Apache History
The NCSA (National Center for Supercomputing Applications, University
of Illinois) web server was the most popular public domain HTTP
daemon.
Developed by Rob McCool.
Killer application of Linux.
Rob McCool left NCSA in mid 1994.
Many webmasters developed their own extensions and bug
fixes that were in need of a common distribution.


http://httpd.apache.org/ABOUT_APACHE.html
http://www.ncsa.uiuc.edu/
http://hoohoo.ncsa.uiuc.edu/


The server continued to grow in popularity, but
incompatibilities between versions began to develop.
Eventually a small group of administrators began
working together to regain control.
Brian Behlendorf and Cliff Skolnick put together a
mailing list and logins for core developers. 8 core
contributors formed the foundation of the original
Apache Group.
The resulting patched project came to be known as
"a patchy server", or Apache server.
Today Apache possesses a level of complexity that
easily surpasses some OSs.


[Figure: Apache lineage. Labels: NCSA, patches, Apache 0.9, Apache 1.2, Apache 1.3.29, Shambhala, Apache 2.0, APR, APR Utils, New Proxy, Java, PHP, Perl, modules, httpd-2.x.x]

Installing Apache
rpm -Uvh httpd-2.x_NN.rpm
Source package: http://httpd.apache.org/
Extract the package (Apache 2.x) with
# gzip -d httpd-2_x_NN.tar.gz
# tar xvf httpd-2_x_NN.tar
# cd httpd-2.x.NN (the extracted directory)
The next step is to configure the Apache source tree for your
particular platform and personal requirements. To configure the
source tree simply type
# ./configure --enable-mods-shared=all
(Type ./configure --help to see the options available with ./configure)
# make
# make install

Apache Configuration
Basic site setup:

/site_home (www)
/conf
/logs
/htdocs
Generally, the default site home is /etc/httpd
The configuration file is httpd.conf
config files reside in the conf directory


Apache Configuration
Validating the Configuration Files
/usr/local/apache/bin/apachectl configtest
Syntax OK
To start your server, type the command:
/usr/local/apache/bin/apachectl start
To stop your server, type the command:
/usr/local/apache/bin/apachectl stop


Apache Configuration
Who runs the httpd daemon?

Superuser?
Security risk
Only user who can access port 80
Solution is for master process to be started by root, bind
to socket, then change to another user
Nobody?
Not portable across UNIXES
Creating a user and group to run the web server
on Linux, /usr/sbin/useradd and /usr/sbin/groupadd
or /etc/passwd and /etc/group
Put user and group info in httpd.conf
User apache
Group apache

Apache Configuration
Create new user on Linux:
/usr/sbin/useradd -d /home/apache -g GID -m -s /bin/bash -u UID apache
look up GID and UID in /etc/group and /etc/passwd, respectively
Add the user to the httpd.conf file
edit /etc/httpd/conf/httpd.conf
add the following:
User apache
Group apache
The httpd daemon will now start processes as apache
Setting the host name
Add the following to httpd.conf:
ServerName yourmachinename
where yourmachinename is the name used by your server

Apache Configuration
Setting the default document directory
Also called document root
Example:
default document dir: /var/www/html (/etc/httpd/htdocs)
request for http://myserver/index.html will look for
/etc/httpd/htdocs/index.html
Add the following to httpd.conf
DocumentRoot /var/www/html or DocumentRoot html
DO NOT add a slash at the end of the directory path
ServerRoot: The top of the directory tree under which the server's
configuration, error, and log files are kept.
ServerRoot "/etc/httpd"

Setting The Host Name


Set the hostname in httpd.conf

If the host is www.foobar.com, use:

ServerName www.foobar.com
Designates the default host name (later - virtual
host names)
For example, use localhost
Starting the server

/etc/init.d/httpd start
/usr/local/apache/bin/apachectl start


Error Responses
Apache can respond to an error by
- Sending a simple default error page
- Sending a customized error page
- Redirecting to a local URL
- Redirecting to an external URL
Configured using the ErrorDocument directive in the config file
Syntax
ErrorDocument [HTTP code] [URL]
Examples
A nicer 404 Not Found message
ErrorDocument 404 /nodocument.html
Static access denied message
ErrorDocument 403 "Sorry, Dave   : note the single quote mark at the start of the string (Apache 1.3 syntax; Apache 2.x encloses the message in a pair of quotes)

Error Responses
Examples
Redirect server errors to an error logging CGI program:
ErrorDocument 500 /cgi-bin/log-error
Log strange incoming requests, like DELETE
ErrorDocument 400 /cgi-bin/log-hacks

Some HTTP Error Codes

400 Bad Request


401 Unauthorized
403 Forbidden
404 Not Found
500 Internal Server Error
503 Service Unavailable

Notes

Any error response page starting with http:// will cause the
server to send a redirect to the client
#ErrorDocument 402 http://www.example.com/subscription_info.html

Log Files
First line of troubleshooting when setting up a server
Apache provides flexible logging
Logs are written in a customizable format
Logs can be written directly to a file or to an external program.
Conditional logging can be done based on the characteristics of the
request.
Directives provided for this:
TransferLog - To create a log file
LogFormat - To set a custom format
CustomLog - To define a log file and format
TransferLog & CustomLog directives can be used multiple times in each
server to cause each request to be logged to multiple files.
It is important to remember log file rotation as well.


Log Formats
Define the log locations in httpd.conf:

ErrorLog - logs server errors
ErrorLog "/var/log/httpd/error_log" (/etc/httpd/logs/errors)
TransferLog - logs user requests and results
TransferLog /var/log/httpd/access_log
(/etc/httpd/logs/access)
TransferLog Format (Common Log Format, CLF)
192.216.5.178 - - [06/Oct/1999:20:03:47 -0700] "GET / HTTP/1.0" 200 1945
IP address of client
The client's identity & the remote user name if using HTTP
authentication (missing in example: - -)
Date, Time and Time Zone of the request

Log Formats
Content of the HTTP request
Server response code
Content length in bytes
The default CLF can be altered to store more information using the
LogFormat directive.
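The field list for the Common Log Format can be checked against the sample entry with a few lines of awk. This is a sketch for entries shaped exactly like the sample (a request of three words); parse_clf is a hypothetical helper name.

```shell
# The sample access log entry from the slide.
line='192.216.5.178 - - [06/Oct/1999:20:03:47 -0700] "GET / HTTP/1.0" 200 1945'

parse_clf() {
    printf '%s\n' "$1" | awk '{
        # Fields are space-separated; the timestamp and the request span several.
        time = $4 " " $5;             gsub(/\[|\]/, "", time)
        request = $6 " " $7 " " $8;   gsub(/"/, "", request)
        printf "client=%s time=%s request=%s status=%s bytes=%s\n", $1, time, request, $9, $10
    }'
}

parse_clf "$line"
```

The same function can be applied to a whole log with a loop such as `while IFS= read -r l; do parse_clf "$l"; done < access_log`.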


Log Formats
ErrorLog Format

[Wed Oct 6 20:06:04 1999] [error] [client 192.216.5.178]
File does not exist: /home/httpd/html/foobar
Date
Error, Information, Security, etc.
Client IP address
Message
Current state of our config file
User apache
Group apache
ServerName localhost
DocumentRoot /var/www/html
TransferLog /var/log/httpd/access_log
ErrorLog /var/log/httpd/error_log

Log Formats

LogFormat "%H %m %t %U" simple
CustomLog logs/access.log simple
This will log the protocol, method, time and URL path requested
Exercise:
Try these with your configured Apache server.
LogFormat "%h" ip
LogFormat "%h %l %u %t \"%r\" %>s %b" detailed
CustomLog logs/access.log detailed
CustomLog logs/ip.log ip


Basic Server Settings


The server can set the Content-Type header in several ways
DefaultType directive
example: DefaultType text/html (or application/octet-stream)
sets the type for any document not otherwise recognized
AddType directive
example: AddType image/jpeg .jpg (or AddType application/x-tar .tgz)
associates a MIME type with a file extension, or overrides the MIME configuration
can be used to set multiple types

Basic Server Settings


TypesConfig directive
TypesConfig /etc/mime.types
Describes where the mime.types file (or equivalent) is to be found.
In the default installation, this is /etc/mime.types
By far the easiest way to associate MIME types with file extensions
Needs an Apache module (mod_mime) to work!


Basic Server Settings


Our MIME-smart httpd.conf
LoadModule mime_module /etc/httpd/modules/mod_mime.so
User www
Group www
ServerRoot /home/www
ServerName localhost
DocumentRoot /home/www/htdocs
ErrorLog logs/errors
TypesConfig /etc/mime.types
DefaultType text/plain


Apache is Modular
Loadable modules
Routines which extend the server
Module examples:
MIME type determination
Server-parsed documents
CGI handler
The Apache Group distributes 34 modules
The LoadModule directive selectively includes modules in the server


Apache is Modular
Loadable Modules vs. Compiled Modules

At install time, you have the option of including any of the loadable modules in the base server
increases server executable size
easier to access module directives, like TypesConfig
Which is better? Personal preference


Apache is Modular
How do I know which directives are in which modules?

The quick reference in the textbook lists directives
If they are core, they don't need LoadModule
$ httpd -l lists all the statically compiled modules
Try the online docs below, which show module names
with directive documentation:
http://www.apache.org/docs/mod/directives.html


The Options Directive


Options directive

quick way to set several server features


Core directive (not in a module)
Applies to entire server, virtual host, directory
Uses + to turn on an option and - to turn it off
ExecCGI
allow execution of CGI scripts
FollowSymLinks
server will follow UNIX symbolic links (ln -s source target) to find files
SymLinksIfOwnerMatch
only follow symlinks if the target file is owned by the link owner


The Options Directive


Indexes
if a URL that maps to a directory is requested and there is no index.html, display a file index
Includes
allow server-side includes (SSI is covered later)
IncludesNOEXEC
SSI allowed, but #exec is disabled
MultiViews
complex file mapping function
off by default
All
all options except MultiViews

The Options Directive


Syntax
Options Indexes FollowSymLinks
Options +Includes -Indexes
Options -ExecCGI
Using + and - merges the listed options with those already in effect instead of replacing them
Used in <Directory> and <Files> blocks
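The merge behaviour of + and - can be modelled in a few lines. This is a rough sketch in Python, assuming a hypothetical helper apply_options; it is not Apache code, and rejecting a mix of signed and unsigned options is a choice made for this sketch.

```python
def apply_options(current, directive_args):
    """Return the option set in effect after one Options directive.

    current        -- set of option names inherited from the enclosing scope
    directive_args -- the directive's arguments, e.g. "+Includes -Indexes"
    """
    args = directive_args.split()
    signed = [a for a in args if a[0] in "+-"]
    if not signed:
        # No + or - anywhere: the listed options replace the inherited ones.
        return set(args)
    if len(signed) != len(args):
        raise ValueError("mixing signed and unsigned options is ambiguous")
    result = set(current)
    for arg in args:
        if arg[0] == "+":
            result.add(arg[1:])
        else:
            result.discard(arg[1:])
    return result
```

Starting from Indexes and FollowSymLinks, applying "+Includes -Indexes" leaves FollowSymLinks and Includes, while a plain "ExecCGI" would leave only ExecCGI.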


File and Directory Control


Config file directives so far can be divided into two groups:
Server process control
User, Group, ServerRoot, LoadModule, DocumentRoot, ErrorLog, TypesConfig
File and Directory control
ErrorDocument, DefaultType, AddType, Options
How do we specify different settings for different directories?
<Directory> and <DirectoryMatch>
How do we specify different settings for specific files or file types?
<Files> and <FilesMatch>

<Directory>
What is it?
Encloses a group of config file directives which will apply only to a specified directory and its subdirectories
Most directives not related to the server process can be used
Syntax
Example: set file indexing and symlink following in the smallco area
<Directory /clients/smallco>
Options Indexes FollowSymLinks
</Directory>

<Directory>
Syntax
Example: specify a special 404 "not found" document for Cars
<Directory /clients/cars>
ErrorDocument 404 /cars404.html
</Directory>
The directory path is a full path to the directory affected
Wildcards * and ? may be used
* matches any sequence of characters
? matches any single character
[] may also be used to enclose character ranges
e.g. site_v[1-4] will match site_v1, site_v2, site_v3 and site_v4
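The wildcard rules can be tried out with Python's fnmatch module. This is only a sketch: directory_matches is a hypothetical helper, the paths are made up, and it compares one path component at a time on the assumption that a wildcard should never span a "/".

```python
from fnmatch import fnmatchcase

def directory_matches(pattern, path):
    """True if `path` is the directory named by `pattern` or lies below it."""
    pattern_parts = pattern.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    if len(path_parts) < len(pattern_parts):
        return False
    # Compare one path component at a time so that * and ? never span a "/".
    return all(
        fnmatchcase(part, pat)
        for pat, part in zip(pattern_parts, path_parts)
    )

print(directory_matches("/clients/site_v[1-4]", "/clients/site_v3/htdocs"))
print(directory_matches("/clients/site_v[1-4]", "/clients/site_v5"))
```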


<Directory>
Syntax
Can also use regular expressions with the tilde character
<Directory ~ "^/clients/(cars|vans)">
matches /clients/cars and /clients/vans but not /web/clients/cars
Don't worry about regular expressions if you haven't learned about them yet!
<DirectoryMatch>
What is it?
Exactly the same as <Directory ~ regex>
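To see why the pattern accepts /clients/cars but rejects /web/clients/cars, it can be tested outside Apache. The sketch below uses Python's re module as a stand-in for Apache's regex engine (an assumption; the two behave the same for this simple pattern), and regex_directory_matches is a hypothetical name.

```python
import re

# The pattern from the example; ^ anchors it to the start of the path.
CLIENT_DIRS = re.compile(r"^/clients/(cars|vans)")

def regex_directory_matches(path):
    """True if the regular expression matches the file system path."""
    return CLIENT_DIRS.search(path) is not None

for candidate in ("/clients/cars", "/clients/vans", "/web/clients/cars"):
    print(candidate, regex_directory_matches(candidate))
```

Because the pattern has no closing anchor, a path such as /clients/carsharing would match as well.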

<Location>
What is it?
Encloses a group of config file directives that apply only to a given URL
Similar to <Directory>, but operates on the requested URL
Why do we need it?
Some directives alter the URL to get the file path
Usage
Most useful in conjunction with the SetHandler directive
SetHandler specifies which program will be used to handle a request
<LocationMatch>
Same as <Location ~ regex>



<Files>
What is it?
Allows directives to be applied to only certain files or file types
Can be nested inside <Directory> sections
Processed in the order they appear in the httpd.conf file
Example:
Make any file with a .foo extension be served as text/plain
<Files *.foo>
ForceType text/plain
</Files>
Guess what ForceType does
Syntax
<Files filename>
Can use * and ? wildcards
Can use regular expressions with the <Files ~ regex> syntax
Can be used both inside and outside a <Directory> section

Virtual Hosts
More than one apparent server on one machine
One instance of Apache can serve multiple web sites
ISPs do this all the time
Can also be used for intranets to separate departmental sites, for example
Virtual Host types
IP-based virtual hosts
Most common method
Requires a different IP address for each virtual host
Requires configuration of the network interface card
Supported under HTTP/1.0

Virtual Hosts
Virtual Host types
Name-based virtual hosts
many host names pointing to the same IP address
practically unlimited number of servers
easy to configure
no additional hardware or software
BUT the client must support HTTP/1.1
old browsers may lack support
A <VirtualHost> block can contain <Directory>, <Files>, and <Location> sections
These sections are processed after the corresponding ones outside the <VirtualHost>

Virtual Hosts Example


Example Scenario
We are BigISP.com, an Internet service provider running Apache on Linux
We have two clients, Smallco and Bigco, Inc.
We want to set up web sites for these companies at:
www.smallco.com
www.bigco.com


Virtual Hosts Example


Setup
We're going to create the following directories on our server:
/home/www          server root
/home/www/conf     server config files
/home/www/logs     default log file dir
/home/www/htdocs   default doc dir
/clients/smallco   Smallco's area
/clients/bigco     Bigco's area
Under each client's directory, create logs and htdocs/html


IP-Based Virtual Hosts


Adding an IP address under Linux
Not needed for name-based hosting
Assumptions
there is already one network card (eth0) up and running with an IP address
you have been assigned additional IP addresses for the machine
you are logged in as root
Add two more IP addresses to eth0:
ifconfig eth0:0 192.168.123.2
route add -host 192.168.123.2
ifconfig eth0:1 192.168.123.3
route add -host 192.168.123.3
Check by running ifconfig
To make this permanent, add these lines to /etc/rc.d/rc.local

IP-Based Virtual Hosts


Need to associate virtual host names with the new IP addresses
If you were assigned IP addresses by the network admin, this may already be set up
If not, add the following to /etc/hosts to associate names with IP numbers
192.168.123.2 www.smallco.com
192.168.123.3 www.bigco.com
The <VirtualHost> directive
contains directives we've already seen, like
ServerName
DocumentRoot
ErrorLog
TransferLog
ServerAdmin
applies these settings to requests that come in for the specified host


IP-Based Virtual Hosts


Example Scenario

<VirtualHost www.smallco.com>
ServerName www.smallco.com
ServerAdmin webmaster@mail.smallco.com
DocumentRoot /clients/smallco/htdocs
ErrorLog /clients/smallco/logs/errors
TransferLog /clients/smallco/logs/access
</VirtualHost>
<VirtualHost www.bigco.com>
ServerName www.bigco.com
ServerAdmin root@mail.bigco.com
DocumentRoot /clients/bigco/htdocs
ErrorLog /clients/bigco/logs/errors
TransferLog /clients/bigco/logs/access
</VirtualHost>

IP-Based Virtual Hosts


The <VirtualHost> directive
The hostname can be either a name or an IP address
If a name is used, the IP address is looked up via DNS
Pro: easy to administer
Con: if DNS is down when Apache is started, so is the virtual server
If an IP address is used and ServerName is not specified, a reverse DNS lookup will be performed to get the name
Con: if DNS is down, name-based requests to the server fail
Solution: use the IP address in <VirtualHost> together with the ServerName directive


IP-Based Virtual Hosts


Revised example:
<VirtualHost 192.168.123.2>
ServerName www.smallco.com
ServerAdmin webmaster@mail.smallco.com
DocumentRoot /clients/smallco/htdocs
ErrorLog /clients/smallco/logs/errors
TransferLog /clients/smallco/logs/access
</VirtualHost>
similar revision for www.bigco.com


Name-Based Virtual Hosts


Example Scenario #2
Smallco and Bigco Inc. have arranged for www.smallco.com
and www.bigco.com to point to our IP address, 192.168.123.1
BigISP was designated as the primary name server for those
names
This is a network administrator task
Our server is server.bigisp.com
Our /etc/hosts may look like
192.168.123.1 server.bigisp.com
Need to add the additional hostnames after server.bigisp.com
192.168.123.1 server.bigisp.com www.smallco.com www.bigco.com

Name-Based Virtual Hosts


Config file
NameVirtualHost 192.168.123.1
<VirtualHost 192.168.123.1>
ServerName www.smallco.com
ServerAdmin webmaster@mail.smallco.com
DocumentRoot /clients/smallco/htdocs
ErrorLog /clients/smallco/logs/errors
TransferLog /clients/smallco/logs/access
</VirtualHost>
<VirtualHost 192.168.123.1>
ServerName www.bigco.com
ServerAdmin root@mail.bigco.com
DocumentRoot /clients/bigco/htdocs
ErrorLog /clients/bigco/logs/errors
TransferLog /clients/bigco/logs/access
</VirtualHost>
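With both sites on one IP address, the server has to choose a virtual host from the Host header of each request. The sketch below models that choice in Python; it is a simplification, not Apache's actual matching code, and select_vhost and the VHOSTS table are hypothetical names that mirror the config above.

```python
# Each entry mirrors one <VirtualHost> block, in config-file order.
VHOSTS = [
    {"server_name": "www.smallco.com", "aliases": [],
     "document_root": "/clients/smallco/htdocs"},
    {"server_name": "www.bigco.com", "aliases": [],
     "document_root": "/clients/bigco/htdocs"},
]

def select_vhost(host_header, vhosts=VHOSTS):
    """Pick the virtual host for a request arriving on the shared IP."""
    if host_header:
        # Host names are case-insensitive and may carry a ":port" suffix.
        name = host_header.split(":")[0].lower()
        for vhost in vhosts:
            if name == vhost["server_name"] or name in vhost["aliases"]:
                return vhost
    # No Host header (old HTTP/1.0 client) or no match: first vhost wins.
    return vhosts[0]

print(select_vhost("www.bigco.com")["document_root"])
print(select_vhost(None)["document_root"])
```

The fallback to the first entry is why the order of <VirtualHost> blocks matters for clients that send no Host header.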

Name-Based Virtual Hosts


Notes
The main server at /home/www is no longer available
can get around that by adding a new <VirtualHost> entry for server.bigisp.com
Older browsers (which do not send a Host header) will not work
the first virtual host in the config file is always used for them
possible workaround with the ServerPath directive
Older browser workaround
NameVirtualHost 192.168.123.1
<VirtualHost 192.168.123.1>
ServerName www.smallco.com
ServerPath /smallco
DocumentRoot /clients/smallco/htdocs
</VirtualHost>

Name-Based Virtual Hosts


ServerPath workaround
Any request starting with /smallco will be served by this virtual host
Pages must be accessed as www.smallco.com/smallco
Requires that any pages in the site use only relative links or links starting with /smallco/
Newer browsers unaffected


Virtual Hosts
Two ways of running Apache for virtual hosts
Multiple httpd daemons, suitable when
security matters: vhost1 cannot read vhost2's data
the host machine has enough system resources
Single httpd daemon, suitable when
some shared configuration is acceptable
the host machine will service a high volume of requests
Setting up multiple daemons
requires multiple httpd installations
Setting up a single daemon
One site home and config file
The config file contains the <VirtualHost> directive

Virtual Hosts
Almost any configuration directive can be put inside <VirtualHost>
Exceptions are mainly directives that control the httpd daemon, like
User, Group
ServerRoot
BindAddress
MinSpareServers, MaxSpareServers, MaxRequestsPerChild


Virtual Hosts
Example Scenario #3
Bigco Inc merges with Medium Corp
Web sites are consolidated, so requests to www.mediumco.com should now go to www.bigco.com
Medium Corp has designated BigISP as their primary nameserver
Add www.mediumco.com to /etc/hosts
Use the ServerAlias directive


Virtual Hosts
Example Scenario #3

<VirtualHost 192.168.123.1>
ServerName www.bigco.com
ServerAlias www.mediumco.com
ServerAdmin root@mail.bigco.com
DocumentRoot /clients/bigco/htdocs
ErrorLog /clients/bigco/logs/errors
TransferLog /clients/bigco/logs/access
</VirtualHost>


Virtual Hosts Example 1


Serving the same content on different IP addresses
(such as an internal and external address)

NameVirtualHost 192.168.1.1
NameVirtualHost 172.20.30.40
<VirtualHost 192.168.1.1 172.20.30.40>
DocumentRoot /www/server1
ServerName server.example.com
ServerAlias server
</VirtualHost>


Virtual Hosts Example 2


Mixed name-based and IP-based vhosts
Listen 80
# Name-based
NameVirtualHost 172.20.30.40
<VirtualHost 172.20.30.40>
DocumentRoot /www/example1
ServerName www.example1.com
</VirtualHost>
<VirtualHost 172.20.30.40>
DocumentRoot /www/example2
ServerName www.example2.org
</VirtualHost>
<VirtualHost 172.20.30.40>
DocumentRoot /www/example3
ServerName www.example3.net
</VirtualHost>

# IP-based
<VirtualHost 172.20.30.50>
DocumentRoot /www/example4
ServerName www.example4.edu
</VirtualHost>
<VirtualHost 172.20.30.60>
DocumentRoot /www/example5
ServerName www.example5.gov
</VirtualHost>


4.3 Configuring DNS Server


What is DNS?
DNS (Domain Name System)
A database that is used by TCP/IP applications to map between hostnames and IP addresses
Characteristics of DNS
A hierarchical namespace for hosts and IP addresses
A host table implemented as a distributed database
A Client/Server system
Components of DNS
Namespace and Resource Record
Name Server
Resolver (Client)


Domain name resolution

[Figure: an application (HTTP) passes the hostname neon.tcpip-lab.edu to the resolver, which sends it to the name server; the IP address 128.143.71.21 travels back along the same path]
User program issues a request for the IP address of a hostname
Local resolver formulates a DNS query to the name server of the host
Name server checks if it is authorized to answer the query.
If yes, it responds.
Otherwise, it will query other name servers
When the name server has the answer, it sends it to the resolver.

Recursive and Iterative Queries


There are two types of queries:
Recursive queries
Iterative (non-recursive) queries
The type of query is determined by a bit in the DNS query
Recursive query: When the name server of a host cannot
resolve a query itself, it queries other name servers on the
resolver's behalf and returns the final answer
Iterative query: When the name server of a host cannot
resolve a query, it returns to the resolver a referral to
another name server
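The difference between the two query types can be shown with a toy model. The sketch below is Python with invented data: the SERVERS table, the function names and the address 192.0.2.10 (a documentation-only address) are all assumptions for illustration, and no real DNS traffic is involved.

```python
# Toy name servers: for the one name we look up, each server either knows
# the answer or refers the asker to the next server. All data is invented.
NAME = "ucsc.cmb.ac.lk"
SERVERS = {
    "root":      {NAME: ("referral", "lk")},
    "lk":        {NAME: ("referral", "ac.lk")},
    "ac.lk":     {NAME: ("referral", "cmb.ac.lk")},
    "cmb.ac.lk": {NAME: ("answer", "192.0.2.10")},
}

def ask(server, name):
    """Send one query to one server and return its (kind, value) reply."""
    return SERVERS[server][name]

def resolve_iteratively(name, server="root"):
    """The asker follows every referral itself; returns (address, servers asked)."""
    asked = []
    while True:
        asked.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, asked
        server = value  # referral: repeat the query at the named server

def resolve_recursively(name, server="root"):
    """The queried server chases the referrals and hands back only the answer."""
    kind, value = ask(server, name)
    if kind == "answer":
        return value
    return resolve_recursively(name, value)

print(resolve_iteratively(NAME))
print(resolve_recursively(NAME))
```

In the iterative case the asker sees every referral; in the recursive case it sees only the final answer.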


What is DNS? Cont.


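[Figure: resolution of the address (A) record for ucsc.cmb.ac.lk. The resolver queries the local name server. The local name server queries the root (.) name server and receives a referral to the lk name server, queries the lk name server and receives a referral to the ac.lk name server, queries the ac.lk name server and receives a referral to the cmb.ac.lk name server, and finally queries the cmb.ac.lk name server, which returns the answer. The local name server passes the answer to the resolver. The domain tree shown has lk and jp under the root; ac, gov and com under lk; cmb and mrt under ac.]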

Root Servers
http://www.root-servers.org

What is DNS? Cont.

Top Level Domains


What is DNS? Cont.

Reverse lookup


What is DNS? Cont.


Namespace
DNS namespace is a tree of domains
Refers to the actual database of IP addresses and their
associated names
At the highest level of the hierarchy sit the root servers
Zone
A DNS zone refers to a certain portion or administrative
space within the global Domain Name System. Each DNS
zone represents a boundary of authority subject to
management by certain entities and is administered as a
single separate entity. The total of all DNS zones, which are
organized in a hierarchical tree-like order of cascading
lower-level domains, forms the DNS namespace.


What is DNS? Cont.


Resource Records (RR)
Resource records are the data elements that define the
structure and content of the domain name space. All DNS
operations are ultimately formulated in terms of resource
records.
Name Server
The server programs that store information about the
domain name space
Resolver (Client)
The programs that extract information from name servers in
response to client requests


Zones and Delegations


Zones are administrative spaces
Zone administrators are responsible for a portion of a domain's name space
Authority is delegated from a parent to a child
[Figure: the lk domain tree divided into zones: the lk zone, the ucsc.lk zone and the msc.ucsc.lk zone]


What is BIND?
BIND (Berkeley Internet Name Domain system)
An open source software package that implements the DNS protocol and provides name service on UNIX and NT systems
Characteristics of BIND
Like DNS itself, a client/server system
Client side: resolver; server side: named
Components of BIND
DNS Server (named)
Answers queries about hostnames and IP addresses
Asks other servers and caches their responses
Performs zone transfers
DNS Resolver library
Contains the routines that you need to write your application
May use the generate query or the name server library routines
Tools for verifying the proper operation of the DNS server
nslookup & dig

Where do I Start? - Client


Client Configuration
/etc/resolv.conf
Format
search domainname    // define your resolver's default domain and search list
domain domainname    // define your resolver's default domain
nameserver ipaddr    // tells your resolver to query a particular name server
Example
% more /etc/resolv.conf
nameserver 203.252.57.2
nameserver 203.252.32.4
domain cmb.ac.lk
Check /etc/nsswitch.conf: which is used first, DNS or the /etc/hosts file?
# /etc/nsswitch.conf:
hosts: files dns
..
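The resolv.conf keywords described above are simple enough to read with a short script. This is a minimal sketch in Python; parse_resolv_conf is a hypothetical helper, it handles only the three keywords shown and # comments, and it works on a string rather than the real /etc/resolv.conf.

```python
def parse_resolv_conf(text):
    """Collect the nameserver, domain and search entries of a resolv.conf."""
    config = {"nameservers": [], "domain": None, "search": []}
    for raw_line in text.splitlines():
        line = raw_line.split("#")[0].strip()  # drop # comments and blanks
        if not line:
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            config["nameservers"].append(values[0])
        elif keyword == "domain" and values:
            config["domain"] = values[0]
        elif keyword == "search":
            config["search"] = values
    return config

EXAMPLE = """\
nameserver 203.252.57.2
nameserver 203.252.32.4
domain cmb.ac.lk
"""
print(parse_resolv_conf(EXAMPLE))
```

The name servers come back in file order, which is also the order in which the resolver tries them.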

Where do I Start? - Server


Type of Server
Primary
Secondary
Cache only
Stub Server
Installation of BIND
Distribution: ISC (Internet Systems Consortium, formerly the Internet Software Consortium)
Configure Name Server
Network configuration
BIND boot file configuration
BIND-4 boot file: named.boot (script-like syntax)
BIND-8 boot file: named.conf (C-like syntax)
Resource Record configuration

/etc/named.conf file
options {
directory "/var/named";
};

// root servers file
zone "." IN {
type hint;
file "root.hints";
};

// for forward zone
zone "cmb.ac.lk" IN {
type master;
file "zone/cmb.ac.lk";
};

// for localhost
zone "0.0.127.in-addr.arpa" IN {
type master;
file "zone/127.0.0";
};

// for reverse zone
zone "20.168.192.in-addr.arpa" IN {
type master;
file "zone/192.168.20";
};


Resource Records (RRs)


Type    Value   Meaning
A       1       a host address
NS      2       an authoritative name server
MD      3       a mail destination (Obsolete - use MX)
MF      4       a mail forwarder (Obsolete - use MX)
CNAME   5       the canonical name for an alias
SOA     6       marks the start of a zone of authority
MB      7       a mailbox domain name (EXPERIMENTAL)
MG      8       a mail group member (EXPERIMENTAL)
MR      9       a mail rename domain name (EXPERIMENTAL)
NULL    10      a null RR (EXPERIMENTAL)
WKS     11      a well known service description
PTR     12      a domain name pointer
HINFO   13      host information
MINFO   14      mailbox or mail list information
MX      15      mail exchange
TXT     16      text strings


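Resource Record: SOA (Start Of Authority)
ucsc.lk. 3600 IN SOA NS.UCSC.LK. admin.ucsc.lk. (
        2002021301 ; serial
        30M        ; refresh
        15M        ; retry
        1W         ; expiry
        1D )       ; negative ttl
NS.UCSC.LK. is the master name server
admin.ucsc.lk. is the contact address (the mailbox admin@ucsc.lk, with the @ written as a dot)
2002021301 is the version number (serial)
The remaining values are timing parameters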

Resource Records (RRs) cont


Example
SOA Record
cmb.ac.lk.  IN  SOA  ns.cmb.ac.lk. admin.cmb.ac.lk. (
        2003081001 ; Serial (2003-08-10 #01)
        86400      ; Refresh (daily)
        1800       ; Retry (30 minutes)
        1209600    ; Expire (2 weeks)
        86400 )    ; Minimum TTL (1 day)
; end of SOA
NS (Name Server) Record
cmb.ac.lk.  IN  NS  ns.cmb.ac.lk.
            IN  NS  ns2.cmb.ac.lk.
            IN  NS  ns.ac.lk.
The owner name cmb.ac.lk. may be replaced with @


Resource Records (RRs) cont


A (Address) and CNAME (Canonical Name) Record
; Host Address
ns.cmb.ac.lk.  IN  A      192.168.20.100
ns2            IN  A      192.168.30.100
localhost      IN  A      127.0.0.1
; Aliases
www            IN  CNAME  namal
ftp            IN  CNAME  www


Resource Records (RRs) cont


MX (Mail eXchanger) Record
@      IN  MX  10  mail.cmb.ac.lk.
       IN  MX  20  namal.cmb.ac.lk.
mail   IN  A   192.168.20.50
namal  IN  A   192.168.20.55
PTR (Pointer) Record
50.20.168.192.in-addr.arpa.  IN  PTR  mail.cmb.ac.lk.
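The owner name of a PTR record is the IP address with its octets reversed, followed by in-addr.arpa. Python's standard ipaddress module can build it, as in this small sketch; ptr_owner_name is a hypothetical helper name.

```python
import ipaddress

def ptr_owner_name(address):
    """Return the in-addr.arpa owner name (with trailing dot) for an IPv4 address."""
    ip = ipaddress.IPv4Address(address)
    # reverse_pointer reverses the octets and appends ".in-addr.arpa".
    return ip.reverse_pointer + "."

print(ptr_owner_name("192.168.20.50"))
```

For the mail host above this yields the same owner name as the PTR record in the zone file.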


Glue Record
A glue record is the IP address of a name server held at
the domain name registry. Glue records are required when
you wish to set the name servers of a domain name to
a hostname under the domain name itself.
Example:
To set the name servers of cmb.ac.lk to aduna.cmb.ac.lk
and ns.ac.lk,
you would also need to provide the glue records (i.e. the IP
addresses) for aduna.cmb.ac.lk and ns.ac.lk.


Glue Record
If you did not provide the glue records for these name
servers, your domain name would not work, because anyone
requiring DNS information for it would get stuck in a loop:
Q: What is the name server for cmb.ac.lk?
A: aduna.cmb.ac.lk
Q: What is the IP address of aduna.cmb.ac.lk?
A: Don't know, try looking at the name server for cmb.ac.lk
Q: What is the name server for cmb.ac.lk?
A: aduna.cmb.ac.lk


More on DNS
RFC 1537 recommends the following values for top-level domain
servers in the SOA (values for non-top-level domains in parentheses):
86400   ; Refresh 24 hours (8 hours)
7200    ; Retry 2 hours (2 hours)
2592000 ; Expire 30 days (7 days)
345600  ; Minimum TTL 4 days (1 day)

What values you choose for your SOA record will depend upon the
needs of your site. In general, longer times cause less load on your
systems and slow the propagation of changes; shorter times
increase the load on your systems and speed up the propagation of
changes.
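SOA timers appear both as plain seconds (86400) and with unit suffixes (30M, 1W, 2d). The sketch below converts either form to seconds; it is Python written for illustration, ttl_to_seconds is a hypothetical helper, and it handles a single number with one optional suffix only.

```python
import re

# Seconds per unit suffix used in zone files (case-insensitive).
UNIT_SECONDS = {"": 1, "s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

def ttl_to_seconds(value):
    """Convert a zone-file time such as "30M", "1W" or "86400" to seconds."""
    match = re.fullmatch(r"(\d+)([smhdw]?)", value.strip().lower())
    if match is None:
        raise ValueError("unsupported time value: %r" % value)
    number, unit = match.groups()
    return int(number) * UNIT_SECONDS[unit]

for sample in ("30M", "1W", "2d", "86400"):
    print(sample, ttl_to_seconds(sample))
```

Converting everything to seconds makes it easy to compare a zone's timers with the recommended values.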

More on DNS
In newer versions of BIND, the TTL field in the SOA record is interpreted as the
"negative caching" time (see RFC 2308). The default TTL value is
defined by the $TTL directive in the first line of your zone file, e.g.

$TTL 4d
@ IN SOA ..


Negative Caching
Classical DNS caching stores only the results of successful name
resolutions. It is also possible for DNS servers to cache the results
of unsuccessful name resolution attempts; this is called negative
caching.
For example, suppose you mistakenly thought
the name of a company's web site was www.uccs.lk and
typed that into your browser. Your local DNS server would be
unable to resolve the name, and would mark that name as
unresolvable in its cache: a negative cache entry. Note that
regular caching is sometimes called positive caching to
contrast it with negative caching.


Negative Caching
The value to be used for negative caching in a zone is now
specified by the Minimum field in the Start Of
Authority resource record for each zone. As mentioned
above, this was formerly used to specify the
default TTL for a zone.
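Positive and negative caching can be pictured as one table whose entries expire at different times. The following is a toy model in Python, not how BIND stores its cache; the ResolverCache class, its method names and the address 192.0.2.20 are assumptions made for this sketch.

```python
import time

class ResolverCache:
    """Toy DNS cache that remembers both answers and failed lookups."""

    def __init__(self, negative_ttl, clock=time.monotonic):
        self.negative_ttl = negative_ttl  # taken from the SOA Minimum field
        self.clock = clock
        self.entries = {}  # name -> (address or None, expiry time)

    def store_answer(self, name, address, ttl):
        self.entries[name] = (address, self.clock() + ttl)

    def store_failure(self, name):
        # None marks a negative entry: "this name does not resolve".
        self.entries[name] = (None, self.clock() + self.negative_ttl)

    def lookup(self, name):
        """Return ('hit', address), ('negative', None) or ('miss', None)."""
        entry = self.entries.get(name)
        if entry is None or entry[1] <= self.clock():
            self.entries.pop(name, None)
            return ("miss", None)
        address, _expiry = entry
        return ("hit", address) if address is not None else ("negative", None)
```

A lookup that returns "negative" can be answered at once without asking any name server again, until the negative TTL runs out.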


More on DNS
Root DNS servers (13 in total): a.root-servers.net, ..,
m.root-servers.net
The a.root-servers.net root server handles about 12,000 queries/sec
Why 13 Root Servers?
Primary Name Server
unique
has the SOA (Start of Authority) of that domain
records of the domain are added, changed and removed here
Secondary NS
possibly many for a domain
copies the name records from the Primary NS periodically
adding, changing or removing records here is pointless
provides fault tolerance when the Primary NS is down

More on DNS
Cache poisoning: an attacker obtains the ability to put
data into our nameserver's cache
Create a separate user for the DNS server, with its shell set
to /bin/false.
(named -u dns_user -g dns_group)
BIND 8.2.0 and 8.2.1 were vulnerable to a remote root exploit!
(current release BIND V 9........., check this yourself)


More on DNS
configuration syntax
/etc/named.boot (v4)
/etc/named.conf (v8,9)
New Name Daemon Control program
ndc (v8), rndc (v9)
Hostnames can contain letters, numbers, and hyphens, and
may not start with a hyphen. Underscore _ is not a valid
character in a hostname.
UDP/TCP port 53
UDP query/response ( < 512 bytes )
Under normal conditions, DNS UDP traffic accounts for more
than 99% of the total DNS traffic of a given server!
TCP response (> 512 bytes) + zone transfer
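The hostname rule stated above (letters, numbers and hyphens, no leading hyphen, no underscore) can be checked mechanically. This is a minimal Python sketch of exactly that rule; is_valid_hostname is a hypothetical helper, and length limits are deliberately left out.

```python
import re

# One label: letters, digits and hyphens, not starting with a hyphen.
LABEL = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]*")

def is_valid_hostname(hostname):
    """Check every dot-separated label against the rule stated above."""
    if hostname.endswith("."):
        hostname = hostname[:-1]  # allow one trailing dot (absolute name)
    labels = hostname.split(".")
    return all(LABEL.fullmatch(label) is not None for label in labels)

for name in ("ns2.cmb.ac.lk", "my_host.cmb.ac.lk", "-bad.lk"):
    print(name, is_valid_hostname(name))
```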

More on DNS
Average size of a DNS packet is 150 bytes
Diagnostic tool on the web: http://www.dnsreport.com
The BIND name daemon control interface program (ndc)
can provide version information when used with newer
versions of BIND:
# ndc status


DNS Example
options {
directory "/etc/namedb";
pid-file "/var/run/named/pid";
version "Unknown";
listen-on { a.b.c.d; };
allow-transfer {"none";};
};
// UCSC Domains and their settings
zone "." in {
type hint;
file "named.root";
};
zone "cmb.ac.lk" {
type master;
file "master/cmb.ac.lk.db";
allow-transfer { p.q.r.s; };
also-notify { p.q.r.s; };
};
// On a secondary server, the same zone is declared as a slave instead:
zone "cmb.ac.lk" {
type slave;
file "slave/cmb.ac.lk.db";
masters {a.b.c.d;};
notify no;
};
zone "248.192.IN-ADDR.ARPA" {
type master;
file "master/cmb.ac.lk-rev.db";
allow-transfer { p.q.r.s; };
also-notify { p.q.r.s; };
};


DNS Example
$TTL 3h
cmb.ac.lk.  IN  SOA  aduna.cmb.ac.lk. root.aduna.cmb.ac.lk. (
        2011092702 ; serial
        3h         ; refresh every 3 hrs
        1h         ; retry every hour
        2w         ; expire after 14 days
        2d )       ; 2 days
$ORIGIN cmb.ac.lk.
@       IN  TXT  "University of Colombo, Sri Lanka"
        IN  NS   aduna.cmb.ac.lk.
        IN  NS   ns.ac.lk.
        IN  A    10.20.50.112
        IN  MX   10 king.cmb.ac.lk.
        IN  MX   50 queen.cmb.ac.lk.
aduna   IN  A    10.20.50.234


DNS Tools
Domain Information Groper (dig) is a network administration command-line tool for querying Domain Name System (DNS) name servers for any
desired DNS records.
# dig A www.ucsc.lk


End of Section 4


IT 6205
Section 5.0
Automating System Administration


5.1 Shell Basics


Shells
The shell is a UNIX program that interprets the commands you enter
from the keyboard
UNIX provides several shells, including the Bourne shell, the Korn
shell, and the C shell
Steve Bourne at AT&T Bell Laboratories developed the Bourne shell
as one of the earliest UNIX command processors
The Korn shell includes many extensions, such as a history feature
that lets you use a keyboard shortcut to retrieve commands you
previously entered
The C shell is designed for C programmers' use
Linux uses the freeware Bash shell as its default command interpreter
(compatible with the Bourne shell, created & distributed by the GNU
project)
You can choose the one that best suits your way of working.

Choosing Your Shell


You choose a shell when the system admin sets up your user
account
Bourne shell: sh
Korn shell: ksh
C shell: csh
Bash: bash
Enhanced C shell (a freeware shell derived from the C shell): tcsh
Z shell (a freeware shell derived from the Korn shell): zsh
After you choose your shell, the system administrator stores
your choice in your account record, and it becomes your
assigned shell
UNIX uses this shell any time you log on (try %echo $SHELL)
However, you can switch from one shell to another by typing the
shell's name (such as tcsh, bash, or zsh) on your command line
(to change your assigned shell itself, try % chsh)
Example of /etc/passwd file:
saman:xxxxx:500:500:Saman Silva:/home/saman:/bin/tcsh
root:xxxxxxxx:0:0:root:/root:/bin/bash
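The assigned shell is the seventh, colon-separated field of each /etc/passwd line. The short Python sketch below extracts it from a line given as a string; login_shell is a hypothetical helper and does not read the real file.

```python
def login_shell(passwd_line):
    """Return (user name, assigned shell) from one /etc/passwd line."""
    fields = passwd_line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields")
    return fields[0], fields[6]

print(login_shell("saman:xxxxx:500:500:Saman Silva:/home/saman:/bin/tcsh"))
print(login_shell("root:xxxxxxxx:0:0:root:/root:/bin/bash"))
```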


Command-line Editing
Shells support certain keystrokes for performing command-line
editing
For example, Bash supports the left and right arrow keys, which
move the cursor on the command line
Not all shells support command-line editing in the same manner
Multiple Command Entry
You may type more than one command on the command line by
separating each command with a semicolon(;)
When you press Enter, UNIX executes the commands in the order
you entered them
You can use the clear command to clear your screen; it has no
options or arguments
You can access the command history with the up and down arrow
keys with most shells

User Interaction with the Shell


User logs in

shell shows the prompt

User types a command

shell executes the appropriate program

User interacts with the program

User logs off



5.2 Bash Scripting


Shell Scripts
What are they for?
To automate certain common activities a user
performs routinely.
They serve the same purpose as batch files in
DOS/Windows.
Example:
rename 1000 files from upper case to lowercase


What are Shell Scripts


Just text/ASCII files with:
a set of standard UNIX/Linux commands (ls, mv, cp,
less, cat, etc.) along with
flow of control
some conditional logic and branching (if-then),
loop structures (foreach, for, while), and
I/O facilities (echo, print, set, ...).
They allow use of variables.
They are interpreted by a shell directly.
Some of them (csh, tcsh) share some of C syntax.
DOS/Win equivalent - batch files (.bat)

Why not use C/C++ for that?


C/C++ programming requires compilation and linkage,
and maybe libraries, which may not be available (e.g., on production
servers).
For typical tasks, shell scripts are much faster to develop,
debug, and maintain (because they are
interpreted and do not require compilation).


Shell Script Invocation


Specify the shell directly:
% tcsh myshellscript
% tcsh -v myshellscript
(-v = verbose, useful for debugging)

Make the script executable first and then run it as a
command (set the execute permission):
% chmod u+x myshellscript

Then either this:


% myshellscript
(if the path variable has . in it; security issue!)

Or:
% ./myshellscript
(should always work)
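The two invocation styles can be tried end to end with a throwaway script. This is only a sketch: hello.sh and the temporary directory are hypothetical, and sh stands in for whichever shell the script targets.

```shell
# Create a throwaway script (hello.sh is a hypothetical name) in a temp directory.
workdir=$(mktemp -d)
cat > "$workdir/hello.sh" <<'EOF'
#!/bin/sh
echo "hello from a script"
EOF

# 1. Name the interpreter explicitly; no execute permission is needed.
sh "$workdir/hello.sh"

# 2. Give the owner execute permission, then run the script by its path.
chmod u+x "$workdir/hello.sh"
greeting=$("$workdir/hello.sh")
echo "$greeting"
```

Running the script by an explicit path avoids any dependence on the PATH variable.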

Shell Script Invocation (2)


If you get an error:
myshellscript: command not found
then probably . is not in your path or there's no
execution bit set.
When writing scripts, choose unique names that
preferably do not match system commands.
A bad name would be test, for example, since
many shells have a built-in command with that name.
To disambiguate, always precede the script name with ./ or
an absolute path in case you have to give your script
a not very creative name.


Start Writing a Shell Script


The very first line, often called the 'shebang' (#!), should
precede any other line to ensure that the right shell is
invoked.
#!/bin/tcsh
# This is for tcsh

#!/bin/bash
# For Bourne-Again Shell

#!/bin/sh
# This is for Bourne Shell

Comments start with '#', with the exception of #! and $#, which
are special character sequences.
Everything on a line after # is ignored unless the # is part of a
quoted string or a special character sequence.

Bourne Shell Script Constructs


Reference
System/Internal Variables
Control Flow (if, for, case)


Internal Variables
$#   Number of command-line arguments supplied
$0   Ourselves (i.e., name of the shell script executed, with path)
$1   First argument to the script
$2   Second argument, and so on
$?   Exit status of the last command
$$   Our PID
$!   PID of the last background process
$-   Current shell status (the option flags in effect)
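A short sketch of these variables in use follows. show_args is a hypothetical helper; inside a function, $#, $1, and $2 refer to the function's own arguments in the same way they refer to a script's arguments at top level.

```shell
# show_args is a hypothetical helper used only for illustration.
show_args() {
    echo "count: $#"
    echo "first: $1"
    echo "second: $2"
}
show_args alpha beta

# $? holds the exit status of the most recent command (false always fails).
false || echo "false exited with status $?"

echo "PID of this shell: $$"
sleep 1 &
echo "PID of the background job: $!"
wait    # let the background job finish before continuing
```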

Internal Variables (2)


Use the shift command to shift the arguments one
position to the left.
Assume input:
./shift.sh 1 2 foo bar

$0 = <directory-of>/shift.sh
$1 = 1
$2 = 2
$3 = foo
$4 = bar

After shift:

$0 = <directory-of>/shift.sh
$1 = 2
$2 = foo
$3 = bar
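A common use of shift is to walk through all arguments one at a time. The sketch below wraps the loop in a hypothetical function, print_all, so it can be tried without creating a script file; in a script the same loop would operate on the script's own arguments.

```shell
# print_all is a hypothetical helper: it prints each argument, consuming them with shift.
print_all() {
    while [ $# -gt 0 ]; do
        echo "arg: $1"
        shift            # drop $1; the old $2 becomes the new $1
    done
}
print_all 1 2 foo bar
```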


Environment
These (and very many others) are available to your shell:
$PATH - set of directories to look for commands
$HOME - home directory
$MAIL - location of the user's mailbox
$PWD - present working directory
$PS1 - primary prompt
$PS2 - secondary (continuation) prompt
$IFS - characters treated as field separators (blanks)


Control Flow: if
General Syntax:
if [ <expression> ]; then
<statements>
elif [ <expression> ]; then
<statements>
else
<statements>
fi

<expression> can be either a logical expression or a
command, and is usually a combination of both.
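The general syntax can be exercised with a small sketch. classify is a hypothetical function, and the sample passwd-style line is made up for the demonstration.

```shell
# classify is a hypothetical helper showing if / elif / else with numeric tests.
classify() {
    if [ "$1" -lt 0 ]; then
        echo negative
    elif [ "$1" -eq 0 ]; then
        echo zero
    else
        echo positive
    fi
}
classify 5

# A command can serve directly as the condition: its exit status decides the branch.
if echo "root:x:0:0" | grep -q '^root:'; then
    echo "line describes root"
fi
```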


if
Some Logical Operators:
-eq --- Equal
-ne --- Not equal
-lt --- Less than
-gt --- Greater than
-o --- OR
-a --- AND
File or directory?
-f --- file
-d --- directory
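The file tests and the AND operator might be combined as in this sketch. kind_of is a hypothetical helper, and the paths passed to it are only examples.

```shell
# kind_of is a hypothetical helper that reports what a path refers to.
kind_of() {
    if [ -f "$1" ]; then
        echo "regular file"
    elif [ -d "$1" ]; then
        echo "directory"
    else
        echo "other or missing"
    fi
}
kind_of /etc/passwd
kind_of /

# -a joins two tests with AND inside one pair of brackets.
if [ -d / -a -r / ]; then
    echo "/ is a readable directory"
fi
```

In newer scripts the same condition is often written as two bracketed tests joined with &&, which avoids ambiguity when operands look like operators.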

for
Syntax:
for variable in <list of values/words>[;]
do
command1
command2

done

List can also be a result of a command.
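To illustrate a list produced by a command, the sketch below loops over the account names cut from two passwd-style lines. The sample data is copied from the /etc/passwd example earlier in this topic; the variable names are assumptions.

```shell
# Two sample lines in /etc/passwd format (illustrative data only).
passwd_sample='saman:xxxxx:500:500:Saman Silva:/home/saman:/bin/tcsh
root:xxxxxxxx:0:0:root:/root:/bin/bash'

count=0
# The command substitution yields one word per account name.
for name in $(printf '%s\n' "$passwd_sample" | cut -d: -f1); do
    echo "account: $name"
    count=$((count + 1))
done
echo "accounts seen: $count"
```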


for
for file in *.txt;
do
echo "File $file:";
echo "===START===";
cat "$file";
echo "===END===";
done


while
Syntax
while <expression>
do
command1
command2

done
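A minimal while loop, assuming nothing beyond the syntax above: it adds the integers 1 through 5, repeating as long as the test succeeds.

```shell
i=1
total=0
while [ "$i" -le 5 ]; do
    total=$((total + i))   # accumulate the running sum
    i=$((i + 1))           # advance the counter so the loop terminates
done
echo "sum of 1..5 = $total"
```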


until
Syntax
until <expression>
do
command1
command2

done
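until is the mirror image of while: the body repeats as long as the test fails. A small countdown sketch, with the starting value chosen arbitrarily:

```shell
n=3
until [ "$n" -eq 0 ]; do
    echo "countdown: $n"
    n=$((n - 1))
done
echo "finished with n=$n"
```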


Exercise
Copy every *.conf file in the current directory to a
file with the same name plus the suffix .org.

for file in *.conf;
do cp "$file" "$file.org";
done


More Examples
#!/bin/bash
# This is my script to make a backup of a .conf file
d=$(date +%d%m%y);
cp -pv "$1" "$1.$d.org";
echo "Copying Finished";
vi "$1"

# Print each .txt file between START/END markers
for i in *.txt;
do
echo "File name: $i";
echo "=====START=======";
cat "$i";
echo "=====END=======";
done;


More Examples
#!/bin/bash
# ${1##*.} strips everything up to the last dot, leaving the extension
if [ "${1##*.}" = "tar" ]
then
echo "This appears to be a tarball."
else
echo "At first glance, this does not appear to be a tarball."
fi
if [ "$2" = "help" ]
then
echo " ===============HELP ============";
fi


5.3 Periodic Processes


Cron
Cron gives the ability to run commands periodically on
the system.
Cron jobs can be set up by the administrator or by
users.
The system-wide cron table is stored in /etc/crontab
Users can edit their own cron jobs with: crontab -e
List with: crontab -l


Cron cont
Each entry in a user's crontab has 6 fields (entries in /etc/crontab have an
additional user field before the job):
Minutes 0-59
Hours 0-23 (Mid-night is 0)
Day of the month 1-31
Month of the year 1-12
Day of the week 0-6 (Sunday is 0)
Job to be executed
* all legal values
, multiple entries are separated by comma
# implies comments


Cron Example
Field Rules:
single number, e.g. 1
range, e.g. 1-4
range with step, e.g. 0-59/5
list, e.g. 1,3,5,7
wildcard, i.e. *
0 17 * * 1,2,3,4,5 /usr/backup
Run /usr/backup at 5pm Monday-Friday every week, in every month in
the year
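To make the field order concrete, the following sketch splits a crontab entry into its named parts. describe_cron is a hypothetical helper, not a cron utility; it only labels the fields and does not validate them.

```shell
# describe_cron ENTRY: label the five time fields and the job of a crontab line.
describe_cron() {
    printf '%s\n' "$1" | {
        # read splits on whitespace; the last variable receives the rest of the line,
        # so a job containing spaces stays intact.
        read -r minute hour dom month dow job
        echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
        echo "job=$job"
    }
}
describe_cron '0 17 * * 1,2,3,4,5 /usr/backup'
```

The variables are always quoted when printed so that the * wildcard is shown literally instead of being expanded to file names by the shell.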
The cron daemon is started by the rc files. Once started, it never terminates. It checks
the crontab files every minute (for any changes)
Cron allows us to schedule programs for periodic execution. However,
cron is not a general facility for scheduling program execution off-hours;
for that, use the at command

More Cron Examples


0 6 */2 * * mailq -v | mail -s "Stuck Mails" nimal
Uses mailq every two days to test whether there is any mail stuck in the
mail queue and sends the mail to administrator (nimal@...)
0 2 1 */2 * mt -f /dev/rft0 rewind; tar cf /dev/rft0 /etc
Runs at 2:00AM on the first day of the month in every other month to
backup the /etc to the tape (make sure the tape is in the drive!!)
The same can be written as:
0 2 1 jan,mar,may,jul,sep,nov * mt -f /dev/rft0 rewind; tar cf /dev/rft0 /etc

0 0 * * * cmd - Every night at 00:00 hours
5 4 * * 6 cmd - 4:05am on Saturdays
0 1 */5 * * cmd - At 1:00am on every 5th day: 1st, 6th, 11th, and so on
0 1 1-15 * * cmd - At 1:00am on every day from 1st to 15th, inclusive
* * * 12 4,5 cmd - Every minute of every Thursday and Friday in December


End of Section 5.0


Virtualization

24 Virtualization

As enterprise data centers continue to rack up servers to slake the insatiable information appetite of the modern business, system administrators struggle with a
technical conundrum: how can existing systems be managed more efficiently to
save power, space, and cooling costs while continuing to meet the needs of users?
Software vendors have historically discouraged administrators from running their
applications with other software, citing potential incompatibilities and in some
cases even threatening to discontinue support in cases of noncompliance. The result has been a flood of single-purpose servers. Recent estimates have pegged the
utilization of an average server at somewhere between 5% and 15%, and this number continues to drop as server performance rises.
One answer to this predicament is virtualization: allowing multiple, independent
operating systems to run concurrently on the same physical hardware. Administrators can treat each virtual machine as a unique server, satisfying picky vendors
(in most cases) while simultaneously reducing data center costs. A wide variety of
hardware platforms support virtualization, and the development of virtualization-specific CPU instructions and the increasing prevalence of multicore processors
have vastly improved performance. Virtual servers are easy to install and require
less maintenance (per server) than physical machines.
Implementations of virtualization have changed dramatically over the years, but
the core concepts are not new to the industry. Big Blue used virtual machines in
early mainframes while researching time-sharing concepts in the 1960s, allowing
users to share processing and storage resources through an abstraction layer. The
same techniques developed by IBM were used throughout the mainframe heyday
of the 1970s until the client-server boom of the 1980s. The technology lay dormant during the 1980s and 1990s until the cost and manageability problems of
enormous server farms rekindled interest in virtualization for modern systems.
VMware is widely credited with having started the current virtualization craze by
creating a virtualization platform for the Intel x86 architecture in 1999.
Today, virtualization technology is a flourishing business, with many vendors
twisting knobs and pushing buttons to create unique entries into the market. VMware remains a clear leader and offers products targeted at businesses of all sizes,
along with management software to support highly virtualized organizations. The
open source community has responded with a project known as Xen, which is
supported commercially by a company called XenSource, now owned by Citrix.
With the release of Solaris 10, Sun introduced some powerful technology known
collectively as zones and containers that can run more than 8,000 virtual systems
on a single Solaris deployment. These are just a few of the players in the market.
There are dozens of competing products, each with a slightly different niche.
(See page 206 for more information about storage area networks.)

Although server virtualization is our primary focus in this chapter, the same concepts apply to many other areas of the IT infrastructure, including networks, storage, applications, and even desktops. For example, when storage area networks or
network-attached storage are used, pools of disk space can be provisioned as a
service, creating additional space on demand. Applying virtualization to the desktop can be useful for system administrators and users alike, allowing for custom-tailored application environments for each user.
The many virtualization options have created a struggle for hapless UNIX and
Linux administrators. With dozens of platforms and configurations to choose
from, identifying the right long-term approach can be a daunting prospect. In this
chapter, we start by defining the terms used for virtualization technologies, continue with a discussion of the benefits of virtualization, proceed with tips for selecting the best solution for your needs, and finally, work through some hands-on
implementation activities for some of the most commonly used virtualization
software on our example operating systems.

24.1 VIRTUAL VERNACULAR


The virtualization market has its own set of confusing terms and concepts. Mastering the lingo is the first step toward sorting out the various options.
Operating systems assume they are in control of the system's hardware, so running two systems simultaneously causes resource conflicts. Server virtualization is
an abstraction of computing resources that lets operating systems run without
direct knowledge of the underlying physical hardware. The virtualization software
parcels out the physical resources such as storage, memory, and CPU, dynamically
allocating their use among several virtual machines.
UNIX administrators should understand three distinct paradigms: full virtualization, paravirtualization, and OS-level virtualization. Each model resolves the resource contention and hardware access issues in a slightly different manner, and
each model has distinct benefits and drawbacks.
Full virtualization
Full virtualization is the most accepted paradigm in production use today. Under this model, the operating system is unaware that it is running on a
virtualized platform. A hypervisor, also known as a virtual machine monitor, is
installed between the virtual machines (guests) and the hardware.
Such hypervisors are also known as bare-metal hypervisors since they control the
physical hardware. The hypervisor provides an emulation layer for all of the host's
hardware devices. The guest operating system is not modified. Guests make direct
requests to the virtualized hardware, and any privileged instructions that guest
kernels attempt to run are intercepted by the hypervisor for appropriate handling.
Bare-metal virtualization is the most secure type of virtualization because guest
operating systems are isolated from the underlying hardware. In addition, no kernel modifications are required, and guests are portable among differing underlying architectures. As long as the virtualization software is present, the guest can
run on any processor architecture. (Translation of CPU instructions does, however, incur a modest performance penalty.)
VMware ESX is an example of a popular full virtualization technology. The general structure of these systems is depicted in Exhibit A.

Exhibit A Full virtualization architecture
[Figure: guest operating systems 0 through N run on a fully virtualized hypervisor (e.g., VMware ESX), which runs on the system hardware (disk, CPU, memory).]

Paravirtualization
Paravirtualization is the technology used by Xen, the leading open source virtual
platform. Like full virtualization, paravirtualization allows multiple operating systems to run in concert on one machine. However, each OS kernel must be modified to support hypercalls, or translations of certain sensitive CPU instructions.
User-space applications do not require modification and run natively on Xen machines. A hypervisor is used in paravirtualization just as in full virtualization.
The translation layer of a paravirtualized system has less overhead than that of a
fully virtualized system, so paravirtualization does lead to nominal performance
gains. However, the need to modify the guest operating system is a dramatic
downside and is the primary reason why Xen paravirtualization has scant support
outside of Linux and other open source kernels.
Exhibit B shows a paravirtualized environment. It looks similar to the fully virtualized system in Exhibit A, but the guest operating systems interface with the hypervisor through a defined interface, and the first guest is privileged.

Exhibit B Paravirtualization architecture
[Figure: modified guest operating systems 0 through N, with Guest OS 0 acting as the privileged guest (host), run on a paravirtualized hypervisor (e.g., Xen, LDoms), which runs on the system hardware (disk, CPU, memory).]

Operating system virtualization


OS-level virtualization systems are very different from the previous two models.
Instead of creating multiple virtual machine environments within a physical system, OS-level virtualization lets an operating system create multiple, isolated application environments that reference the same kernel. OS-level virtualization is
properly thought of as a feature of the kernel rather than as a separate layer of
software abstraction.
Because no true translation or virtualization layer exists, the overhead of OS-level
virtualization is very low. Most implementations offer near-native performance.
Unfortunately, this type of virtualization precludes the use of multiple operating
systems since a single kernel is shared by all guests (or containers as they are
commonly known in this context).1 AIX workload partitions and Solaris containers and zones are examples of OS-level virtualization.
OS-level virtualization is illustrated in Exhibit C.
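Exhibit C OS-level virtualization architecture
[Figure: several virtual machines (containers) share a single host kernel, which runs on the system hardware (disk, CPU, memory). Figure label: OS virtualization (e.g., Solaris containers, HP Integrity VM, IBM workload partitions).]
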
Native virtualization
In an attempt to distinguish their hardware offerings, the silicon heavyweights
AMD and Intel are competing head to head to best support virtualization through
hardware-assisted (native) virtualization. Both companies offer CPUs that include virtualization instructions, eliminating the need for the translation layer
used in full and paravirtualization. Today, all major virtualization players can take
advantage of these processors' features.
Cloud computing
In addition to traditional virtualization, a relatively recent offering in the industry
known informally (and, to some, begrudgingly) as cloud computing is an alternative to locally run server farms. Cloud computing offers computing power as a
service, typically attractively priced on an hourly basis. The most obvious benefit
is the conversion of server resources into a form of infrastructure analogous to
power or plumbing. Administrators and developers never see the actual hardware
they are using and need have no knowledge of its structure. The name comes from
the traditional use of a cloud outline to denote the Internet in network diagrams.
As a system administration book, this one focuses on cloud computing at the
server level, but applications are also being moved to the cloud (commonly
known as software-as-a-service, or SaaS). Everything from email to business
productivity suites to entire desktop environments can be outsourced and managed independently.
1. This is not entirely true. Solaris containers have a feature called branded zones that allows Linux
binaries to run on a Solaris kernel.

Cloud services are commonly bundled with a control interface that adjusts capacity on demand and allows one-click provisioning of new systems. Amazon's Elastic Compute Cloud (EC2) is the most mature of the first-generation services of
this type. It has been widely adopted by companies that offer next-generation web
platforms. Love it or hate it, utility computing is gaining traction with bean counters as a cheaper alternative to data centers and localized server infrastructure.
Talking heads in the IT industry believe that cloud technologies in their myriad
forms are the future of computing.
Cloud computing relies on some of the same ideas as virtualization, but it should
be considered a distinct set of technologies in its own right.
Live migration
A final concept to consider is the possibility of migrating virtual machines from
one physical machine to another. Most virtualization software lets you move virtual machines in real time between running systems, in some cases without interruptions in service or loss of connectivity. This feature is called live migration. It's
helpful for load balancing, disaster recovery, server maintenance, and general system flexibility.
Comparison of virtualization technologies
Although the various virtualization options are conceptually different, each technique offers similar results in the end. Administrators access virtual systems in
the same way as they access any normal node on the network. The primary differences are that hardware problems may affect multiple systems at once (since they
share hardware) and that resource contention issues must be debugged at the
same level at which virtualization is implemented (e.g., in the hypervisor).

24.2 BENEFITS OF VIRTUALIZATION


Given the many blessings of virtual computing, it's surprising that it took so many
years to be developed and commercially accepted. Cost savings, reduced energy
use, simplified business continuity, and greater technical agility are some of the
main drivers of the adoption of virtual technologies.
Cost is a major factor in all new IT projects, and with virtualization, businesses
realize immediate short-term cost savings because they purchase fewer servers.
Instead of acquiring new servers for a new production application, administrators
can spin up new virtual machines and save in up-front purchasing costs as well as
ongoing support and maintenance fees. Cooling requirements are cut dramatically since virtual servers do not generate heat, resulting in additional savings.
Data centers also become easier to support and less expensive to maintain. With
some organizations consolidating up to 30 physical servers onto a single virtual
host, a quick glance at the savings in rack space alone is sure to set data center
managers blushing with pride.

A reduced ecological impact is an easy marketing win for businesses as well. Some
estimates suggest that nearly one percent of the world's electricity is consumed by
power-hungry data centers.2 Modern multicore CPUs are used more efficiently
when several virtual machines are running simultaneously.

Business continuity (that is, the ability of a company to survive physical and logical crises with minimal impact on business operations) is a vexing and expensive problem for system administrators. Complex approaches to disaster recovery
are simplified when virtual servers can be migrated from one physical location to
another with a single command. The migration technologies supported by most
virtualization platforms allow applications to be location independent.
Because hypervisors can be accessed independently of the virtual servers they
support, server management ceases to be grounded in physical reality and becomes fully scriptable. System administrators can respond quickly to customer
requests for new systems and applications by making use of template-driven
server provisioning. Scripts can automate and simplify common virtual system
administration tasks. A virtual server's boot, shutdown, and migration chores can
be automated by shell scripts and even scheduled through cron. Discontinued operating systems and applications can be moved off unsupported legacy hardware
onto modern architectures.
Virtualization increases availability. Live migration allows physical servers to be
taken down for maintenance without downtime or interruptions in service. Hardware upgrades do not impact the business, either. When it's time to replace an
aging machine, the virtual system is immediately portable without a painful upgrade, installation, test, and cutover cycle.
Virtualization makes the rigorous separation of development, test, staging, and
production environments a realistic prospect, even for smaller businesses. Historically, maintaining these separate environments has been too expensive for many
businesses to bear, even though regulations and standards may have demanded it.
The individual environments may also benefit; for example, quality assurance testers can easily restore a test environment to its baseline configuration.
In terms of immediate gratification, few technologies seem to offer as many possibilities as server virtualization. As we'll see in the next section, however, virtualization is not a panacea.
2. Estimated by Jonathan Koomey in his excellent study "Estimating total power consumption by servers
in the U.S. and the world."

24.3 A PRACTICAL APPROACH


The transition to a virtualized environment must be carefully planned, managed,
and implemented. An uncoordinated approach will lead to a motley assortment of
unstable, unmanageable implementations that do more harm than good. Furthermore, the confidence of stakeholders is easily lost: early missteps can complicate
future attempts to move reluctant users to new platforms. Slow and steady wins
the race.
It's important to choose the right systems to migrate since some applications are
better suited to virtualization than others. Services that already have high utilization might be better left on a physical system, at least at the outset. Other services
that are best left alone include these:

Resource-intensive backup servers or log hosts
High-bandwidth applications, such as intrusion detection systems
Busy I/O-bound database servers
Proprietary applications with hardware-based copy protection
Applications with specialized hardware needs, such as medical systems or certain scientific data gathering applications

Good candidates for virtualization include these:

Internet-facing web servers that query middleware systems or databases
Underused stand-alone application servers
Developer systems, such as build or version control servers
Quality assurance test hosts and staging environments
Core infrastructure systems, such as LDAP directories, DHCP and DNS servers, time servers, and SSH gateways

Starting with a small number of less critical systems will help establish the organization's confidence and develop the expertise of administrators. New applications
are obvious targets since they can be built for virtualization from the ground up.
As the environment stabilizes, you can continue to migrate systems at regular intervals. Large organizations might find that 25 to 50 servers per year is a sustainable pace.
Plan for appropriate infrastructure support in the new environment. Storage and
network resources should support the migration plans. If several systems on the
same physical host will reside on separate physical networks, plan to trunk the
network interfaces. Include appropriate attachments for systems that will use
space on a SAN. Make smart decisions about locating similar systems on the same
physical hardware to simplify the infrastructure. Finally, make sure that every virtual machine has a secondary home to which it can migrate in the event of maintenance or hardware problems on the primary system.
Don't run all your mission-critical services on the same physical hardware, and
don't overload systems with too many virtual machines.
Thanks to rapid improvements in server hardware, administrators have lots of
good options for virtualization. Multicore, multiprocessor architectures are an obvious choice for virtual machines since they reduce the need for context switches
and facilitate the allocation of CPU resources. New blade server products from
major manufacturers are designed for virtual environments and offer high I/O
and memory capacity. Solid state disk drives have inherent synergy with virtualization because of their fast access times and low power consumption.

24.4 VIRTUALIZATION WITH LINUX


Two major projects are vying for the title of Linux virtualization champion: Xen
and KVM. In one corner, Xen is an established, well-documented platform with
wide support from the distribution heavyweights. In the other corner, KVM has
been accepted by Linus Torvalds into the mainstream Linux kernel. It enjoys a
growing fan base, and both Ubuntu and Red Hat are supporting it.
In this section we'll stay out of the ring and stay focused on the pertinent system
administration details for each technology.

Introduction to Xen
Initially developed by Ian Pratt as a research project at the University of Cambridge, the Linux-friendly Xen has grown to become a formidable virtualization
platform, challenging even the commercial giants in terms of performance, security, and especially cost. As a paravirtual hypervisor, the Xen virtual machine
monitor claims a mere 0.1% to 3.5% overhead, far less than fully virtualized solutions. Because the Xen hypervisor is open source, a number of management tools
exist with varying levels of feature support. The Xen source is available from
xen.org, but many distributions already include native support.
Xen is a bare-metal hypervisor that runs directly on the physical hardware. A running virtual machine is called a domain. There is always at least one domain, referred to as domain zero (or dom0). Domain zero has full hardware access, manages the other domains, and runs all device drivers. Unprivileged domains are
referred to as domU. All domains, including dom0, are controlled by the Xen hypervisor, which is responsible for CPU scheduling and memory management. A
suite of daemons, tools, and libraries completes the Xen architecture and enables
communication between domU, dom0, and the hypervisor.
Several management tools simplify common Xen administration tasks such as
booting and shutting down, configuring, and creating guests. Xen Tools is a collection of Perl scripts that simplify domU creation. MLN, or Manage Large Networks, is another Perl script that creates complex virtual networks out of clean,
easily understood configuration files. ConVirt is a shockingly advanced GUI tool
for managing guests. It includes drag-and-drop live migration, agentless multiserver support, availability and configuration dashboards, and template-driven
provisioning for new virtual machines. For hardened command-line junkies, the
unapologetic built-in tool xm fits the bill.
Linux distributions vary in their support of Xen. Red Hat originally expended
significant resources on including Xen in its distributions before ditching it for
the competing KVM software. Xen is well supported in SUSE Linux, particularly
in the Enterprise 11 release. Canonical, the company behind Ubuntu Linux, has
taken an odd approach with Xen, wavering on support in most releases before
finally dropping it in version 8.10 in favor of KVM (although Xen is still mentioned in documentation). Once installed, basic Xen usage differs little among
distributions. In general, we recommend Red Hat or SUSE for a large Xen-based
virtualization deployment.
Xen essentials
A Linux Xen server requires a number of daemons, scripts, configuration files,
and tools. Table 24.1 lists the most interesting puzzle pieces.
Table 24.1 Xen components
Path                   Purpose
/etc/xen               Primary configuration directory
  xend-config.sxp      Top-level xend configuration file
  auto                 Guest OS config files to autostart at boot time
  scripts              Utility scripts that create network interfaces, etc.
/var/log/xen           Xen log files
/usr/sbin/xend         Master Xen controller daemon
/usr/sbin/xm           Xen guest domain management tool

Each Xen guest domain configuration file in /etc/xen specifies the virtual resources available to a domU, such as disk devices, CPU, memory, and network
interfaces. There is one configuration file per domU. The format is extremely flexible and gives administrators granular control over the constraints that will be
applied to each guest. If a symbolic link to a domU configuration file is added to
the auto subdirectory, that guest OS will be automatically started at boot time.
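The autostart mechanism is just a symbolic link. The sketch below demonstrates it in a temporary directory standing in for /etc/xen, so it can be tried without a Xen host; the guest name chef is borrowed from the installation example later in this section, and on a real dom0 the equivalent link would be created under /etc/xen/auto as root.

```shell
xen_root=$(mktemp -d)            # stand-in for /etc/xen (assumption for the demo)
mkdir "$xen_root/auto"
: > "$xen_root/chef"             # empty placeholder for a domU configuration file

# Link the guest's configuration into auto so that it would start at boot time.
ln -s "$xen_root/chef" "$xen_root/auto/chef"
ls -l "$xen_root/auto"
```

Removing the link again disables autostart without touching the guest's configuration file.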
The xend daemon handles domU creation, migration, and other management
tasks. It must always remain running and typically starts at boot time. Its configuration file, /etc/xen/xend-config.sxp, specifies the communication settings for
the hypervisor and the resource constraints for dom0. It also configures facilities
for live migration.
(See the footnote on page 308 for more info about sparse files.)

Guest domains' disks are normally stored in virtual block devices (VBDs) in
dom0. The VBD can be connected to a dedicated resource such as a physical disk
drive or logical volume. Or it can be a loopback file, also known as a file-backed
VBD, created with dd. Performance is better with a dedicated disk or volume, but
files are more flexible and can be managed with normal Linux commands (such as
mv and cp) in domain zero. Backing files are sparse files that grow as needed.
Unless the system is experiencing performance bottlenecks, a file-backed VBD is
usually the better choice. It's a simple process to transfer a VBD onto a dedicated
disk if you change your mind.
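The text says only that a file-backed VBD is created with dd; the exact invocation below is an assumption. It creates a 64 MiB sparse backing file in a temporary location (a real guest image would be far larger and would live somewhere like /vm).

```shell
img=$(mktemp)        # stand-in for a backing file path (assumption for the demo)

# count=0 writes no data; seek moves the end of the file out to
# 65536 blocks of 1024 bytes, leaving a 64 MiB file with no blocks written.
dd if=/dev/zero of="$img" bs=1024 count=0 seek=65536 2>/dev/null

ls -l "$img"         # apparent size of the file
du -k "$img"         # disk space actually allocated so far
```

Comparing the two listings shows why sparse files are convenient: the guest sees the full apparent size, while dom0 spends disk space only as data is written.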

Similarly, virtual network interfaces (aka VIFs) can be set up in multiple ways.
The default is to use bridged mode, in which each guest domain is a node on the
same network as the host. Routed and NAT modes configure guest domains to be
on a private network, accessible to each other and domain 0 but hidden from the
rest of the network. Advanced configurations include bonded network interfaces
and VLANs for guests on different networks. If none of these options fit the bill,
Xen network scripts are customizable to meet almost any unique need.
Xen guest installation with virt-install
One tool for simple guest installation is virt-install, bundled as part of Red Hat's
virt-manager application.3 virt-install is a command-line OS provisioning tool.
It accepts installation media from a variety of sources, such as an NFS mount, a
physical CD or DVD, or an HTTP location.
For example, the installation of a guest domain might look like this:
redhat$ sudo virt-install -n chef -f /vm/chef.img -l http://example.com/myos -r 512 --nographics

This is a typical Xen guest domain with the name chef, a disk VBD location of
/vm/chef.img, and installation media obtained through HTTP. The instance has
512MiB of RAM and uses no X Windows-based graphics support during installation. virt-install downloads the files needed to start the installation and then
kicks off the installer process.
You'll see the screen clear, and you'll go through a standard text-based Linux installation, including network configuration and package selection. After the installation completes, the guest domain reboots and is ready for use. To disconnect
from the guest console and return to dom0, type <Control-]>.
See page 1138 for more details on VNC.

It's worth noting that although this incantation of virt-install provides a text-based installation, graphical support through Virtual Network Computing (VNC)
is also available.
The domain's configuration is stored in /etc/xen/chef. Here's what it looks like:
name = "chef"
uuid = "a85e20f4-d11b-d4f7-1429-7339b1d0d051"
maxmem = 512
memory = 512
vcpus = 1
bootloader = "/usr/bin/pygrub"
on_poweroff = "destroy"
on_reboot = "restart"
on_crash = "restart"
vfb = [ ]
disk = [ "tap:aio:/vm/chef.dsk,xvda,w" ]
vif = [ "mac=00:16:3e:1e:57:79,bridge=xenbr0" ]
3. Install the python-virtinst package for virt-install support on Ubuntu.

You can see that the NIC defaults to bridged mode. In this case, the VBD is a
block tap file that provides better performance than does a standard loopback
file. The writable disk image file is presented to the guest as /dev/xvda. This particular disk device definition, tap:aio, is recommended by the Xen team for performance reasons.
The xm tool is convenient for day-to-day management of virtual machines, such
as starting and stopping VMs, connecting to their consoles, and investigating current state. Below, we show the running guest domains, then connect to the console for chef. IDs are assigned in increasing order as guest domains are created,
and they are reset when the host reboots.
redhat$ sudo xm list
Name         ID  Mem(MiB)  VCPUs  State   Time(s)
Domain-0      0      2502      2  r-----    397.2
chef         19       512      1  -b----     12.8
redhat$ sudo xm console 19
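Because domain IDs change whenever guests are re-created or the host reboots, scripts usually look the ID up by name. The following is a minimal sketch that assumes the column layout of xm list shown above. The helper name domain_id is hypothetical.

```shell
# Sketch: print the ID of a named domain, reading `xm list` output
# on standard input. Exits nonzero if the domain is not listed.
domain_id() {
    awk -v name="$1" '
        NR > 1 && $1 == name { print $2; found = 1 }   # skip header row
        END { exit !found }
    '
}
```

For example, sudo xm list | domain_id chef prints the current ID of the chef domain.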

To effect any customization of a guest domain, such as attaching another disk or
changing the network to NAT mode instead of bridged, you should edit the guest's
configuration file in /etc/xen and reboot the guest. The xmdomain.cfg man page
contains excellent detail on additional options for guest domains.
Xen live migration
A domain migration is the process of moving a domU from one physical host to
another, and a live migration does so without any loss of service. Practically
speaking, this is one of the handiest and most magical of virtualization tricks for
system administrators. Open network connections are maintained, so any SSH
sessions or active HTTP connections will not be lost. Hardware maintenance, operating system upgrades, and physical server reboots are all good opportunities to
use migration magic.
One important requirement for implementing migrations is that storage must be
shared. Any storage needed by the domU, such as the disk image files on which
the virtual machine is kept, must be accessible to both host servers. File-backed
virtual machines are simplest for live migration since they're usually contained in
a single portable file. But a SAN, NAS, NFS share, or iSCSI unit are all acceptable
methods of sharing files among systems. However the VBD is shared, be sure to
run the domU on only one physical server at a time. Linux filesystems do not
support direct, concurrent access by multiple hosts.
Additionally, because the IP and MAC addresses of a virtual machine follow it
from one host to another, each server must be on the same layer 2 and IP subnets.
Network hardware learns the new location of the MAC address once the virtual
machine begins sending traffic over the network.
Once these basic requirements are met, all you need are a few configuration
changes to the hypervisor configuration file, /etc/xen/xend-config.sxp, to enable
migrations. Table 24.2 describes the pertinent options; they are all commented
out in a default Xen installation. After making changes, restart xend by running
/etc/init.d/xend restart.
Table 24.2 Live migration options in the xend configuration file
Option                        Description
xend-relocation-server        Enables migration; set to yes
xend-relocation-port          Network port used for migration activities
xend-relocation-address       Interface to listen on for migration connections. If
                              unspecified, Xen listens on all interfaces in dom0.
xend-relocation-hosts-allow   Hosts from which to allow connections a
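Since these options ship commented out, it can help to confirm which ones are actually active before restarting xend. The sketch below lists only the uncommented relocation settings. The function name active_relocation_options is hypothetical, and the values in the comment are illustrative, not recommendations.

```shell
# Sketch: show the relocation options that are currently enabled
# (not commented out) in the xend configuration file.
# Active settings are parenthesized entries such as:
#   (xend-relocation-server yes)
#   (xend-relocation-hosts-allow '^localhost$')
active_relocation_options() {
    conf="${1:-/etc/xen/xend-config.sxp}"
    grep -E '^[[:space:]]*\(xend-relocation-' "$conf"
}
```

If the command prints nothing, migration is still disabled. Check in particular that xend-relocation-hosts-allow appears in the output with a nonempty value.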

In the process of migrating a virtual machine between hosts, the domU's memory
image traverses the network in an unencrypted format. Administrators should
keep security in mind if the guest has sensitive data in memory.
Before attempting a migration, the guest's configuration file must be in place on
both the source and destination servers. If the location of the disk image files differs between hosts (e.g., if one server mounts the shared storage in /xen and the
other in /vm), this difference should be reflected in the disk = parameter of the
domain's configuration file.
The migration itself is simple:
redhat$ sudo xm migrate --live chef server2.example.com

Assuming that our guest domain chef is running, the command migrates it to
another Xen host, server2.example.com. Omitting the --live flag pauses the domain prior to migration. We find it entertaining to run a ping against chef's IP
address during the migration to watch for dropped packets.
KVM
KVM, the Kernel-based Virtual Machine, is a full virtualization tool that has been
included in the mainline Linux kernel since version 2.6.20. It depends on the Intel
VT and AMD-V virtualization extensions found on current CPUs.4 It is the default virtualization technology in Ubuntu, and Red Hat has also changed gears
from Xen to KVM after acquiring KVM's parent company, Qumranet.
Since KVM virtualization is supported by the CPU hardware, many guest operating systems are supported, including Windows. The software also depends on a
modified version of the QEMU processor emulator.
4. Does your CPU have them? Try egrep '(vmx|svm)' /proc/cpuinfo to find out. If the command displays no output, the extensions are not present. On some systems, the extensions must be enabled in
the system BIOS before they become visible.
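The same check can be wrapped in a small function that reports which extension is present. This is a sketch only. The function name virt_extension is hypothetical, and the optional file argument exists so that the logic can be tried against a saved copy of /proc/cpuinfo.

```shell
# Sketch: report which hardware virtualization extension the CPU
# advertises, based on the flags listed in /proc/cpuinfo.
virt_extension() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -qw vmx "$cpuinfo"; then
        echo "Intel VT"      # vmx flag
    elif grep -qw svm "$cpuinfo"; then
        echo "AMD-V"         # svm flag
    else
        echo "none"          # absent, or disabled in the BIOS
    fi
}
```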

a. This should never be blank; otherwise, connections will be allowed from all hosts.

Under KVM, the Linux kernel itself serves as the hypervisor; memory management and scheduling are handled through the host's kernel, and guest machines
are normal Linux processes. Enormous benefits accompany this unique approach
to virtualization. For example, the complexity introduced by multicore processors
is handled by the kernel, and no hypervisor changes are required to support them.
Linux commands such as top, ps, and kill show and control virtual machines, just
as they would for other processes. The integration with Linux is seamless.
Administrators should be cautioned that KVM is a relatively young technology,
and it should be heavily tested before being promoted to production use. The
KVM site itself documents numerous incompatibilities when running guests of
differing operating system flavors. Reports of live migrations breaking between
different versions of KVM are common. Consider yourself forewarned.
KVM installation and usage
Although the technologies behind Xen and KVM are fundamentally different, the
tools that install and manage guest operating systems are similar. As under Xen,
you can use virt-install to create new KVM guests. Use the virsh command to
manage them.5 These utilities depend on Red Hat's libvirt library.
Before the installation is started, the host must be configured to support networking in the guests.6 In most configurations, one physical interface is used to bridge
network connectivity to each of the guests. Under Red Hat, the network device
configuration files are in /etc/sysconfig/network-scripts. Two device files are required: one each for the bridge and the physical device.
In the examples below, peth0 is the physical device and eth0 is the bridge:
/etc/sysconfig/network-scripts/ifcfg-peth0
DEVICE=peth0
ONBOOT=yes
BRIDGE=eth0
HWADDR=XX:XX:XX:XX:XX:XX
/etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Bridge

Here, the eth0 device receives an IP address through DHCP.
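The pair of device files can also be generated by a script, which is handy when several hosts need the same bridge layout. The sketch below assumes that Red Hat's network scripts read files named ifcfg-<device>. The function name write_bridge_configs and its argument order are hypothetical.

```shell
# Sketch: write the physical-device and bridge configuration files.
# Arguments: target directory, physical device, bridge name, MAC address.
write_bridge_configs() {
    dir="$1"; phys="$2"; bridge="$3"; hwaddr="$4"

    # Physical interface: enslaved to the bridge, carries no IP itself.
    cat > "$dir/ifcfg-$phys" <<EOF
DEVICE=$phys
ONBOOT=yes
BRIDGE=$bridge
HWADDR=$hwaddr
EOF

    # Bridge: obtains the host's IP address through DHCP.
    cat > "$dir/ifcfg-$bridge" <<EOF
DEVICE=$bridge
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Bridge
EOF
}
```

Write the files to a scratch directory first and review them before copying them into /etc/sysconfig/network-scripts, since a mistake here can cut off network access to the host.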


The flags passed to virt-install vary slightly from those used for a Xen installation. To begin with, the --hvm flag indicates that the guest should be hardware
virtualized, as opposed to paravirtualized. In addition, the --connect argument
guarantees that the correct default hypervisor is chosen, since virt-install sup
5. You can use virsh to manage Xen domUs as well, if you wish.
6. This is equally true with Xen, but xend does the heavy lifting, creating interfaces in the background.
