SUBMITTED TO: SACHIN KUMAR ROLL NO.RD3803B50 REG. NO. 10811875 MCA 2ND SEM
ACKNOWLEDGEMENT I feel immense pleasure in giving credit for my project work not to one individual alone, as this work is the integrated effort of all those concerned with it. I want to thank all my teachers, who guided me along the way. I sincerely express my gratitude and thanks to MR. MANISH KUMAR and the other staff members of LPU, Phagwara, Jalandhar, for helping me complete my term paper and make it a great success. I would like to express my deep sense of gratitude to Lovely School Of Engineering, Phagwara, which introduced me to the subject and under whose guidance I was able to complete my topic. Last but not least, I thank my teacher.
The operating system thus allows the "dissociation" of programmes and hardware, mainly to simplify resource management and offer the user a simplified Man-machine interface (MMI) to overcome the complexity of the actual machine. The operating system has various roles:
Management of the processor: the operating system is responsible for managing allocation of the processor between the different programmes using a scheduling algorithm. The type of scheduler is totally dependent on the operating system, according to the desired objective.
Management of the random access memory: the operating system is responsible for managing the memory space allocated to each application and, where relevant, to each user. If there is insufficient physical memory, the operating system can create a memory zone on the hard drive, known as "virtual memory". The virtual memory lets you run applications requiring more memory than there is available RAM on the system. However, this memory is a great deal slower.
Management of input/output: the operating system allows unification and control of access of programmes to material resources via drivers (also known as peripheral administrators or input/output administrators).
Management of execution of applications: the operating system is responsible for smooth execution of applications by allocating the resources required for them to operate. This means an application that is not responding correctly can be "killed".
Management of authorisations: the operating system is responsible for security relating to execution of programmes by guaranteeing that the resources are used only by programmes and users with the relevant authorisations.
File management: the operating system manages reading and writing in the file system and the user and application file access authorisations.
Information management: the operating system provides a certain number of indicators that can be used to diagnose the correct operation of the machine.
COMPONENTS OF OPERATING SYSTEM:---The operating system comprises a set of software packages that can be used to manage interactions with the hardware. The following elements are generally included in this set of software:
The kernel, which represents the operating system's basic functions such as management of memory, processes, files, main inputs/outputs and communication functionalities.
The shell, allowing communication with the operating system via a control language, letting the user control the peripherals without knowing the characteristics of the hardware used, management of physical addresses, etc.
The file system, allowing files to be recorded in a tree structure.
MULTI THREAD SYSTEM:---An operating system is known as multi-threaded when several "tasks" (also known as processes) may be run at the same time. The applications consist of a sequence of instructions known as "threads". These threads will be alternately active, on standby, suspended or destroyed, according to the priority accorded to them, or may be run simultaneously. A system is known as pre-emptive when it has a scheduler (also called planner), which, according to priority criteria, allocates the machine time between the various processes requesting it. The system is called a shared time system when a time quota is allocated to each process by the scheduler. This is the case of multi-user systems, which allow several users to use different or similar applications on the same machine at the same time. The system is then referred to as a "transactional system". To do this, the system allocates a period of time to each user.
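The time-quota idea described above can be illustrated with a small simulation. This is only a minimal sketch of round-robin scheduling, one common way to share time between processes; the process names, burst times and the `quantum` parameter are hypothetical values chosen for illustration, and real schedulers also take priorities into account.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Simulate a shared time system: each process runs at most `quantum` units per turn.

    burst_times: dict mapping a process name to the total CPU time it needs.
    Returns a list of (process, start, end) slices in execution order.
    """
    ready = deque(burst_times.items())  # queue of (name, remaining time)
    clock = 0
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        run = min(quantum, remaining)  # the process is pre-empted after its quota
        timeline.append((name, clock, clock + run))
        clock += run
        if remaining > run:
            # Not finished yet: go to the back of the queue and wait for the next turn.
            ready.append((name, remaining - run))
    return timeline


# Hypothetical workload: three processes sharing one processor.
timeline = round_robin({"editor": 5, "compiler": 3, "player": 4}, quantum=2)
for name, start, end in timeline:
    print(f"{start:2d}-{end:2d}  {name}")
```

Each process receives the processor in turn, so no single application can monopolise the machine.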
MULTI PROCESSOR SYSTEM:--Multi-processing is a technique that involves operating several processors in parallel to obtain a higher calculation power than that obtained using a high-end processor or to increase the availability of the system (in the event of processor breakdown). The term SMP (Symmetric Multiprocessing or Symmetric Multiprocessor) refers to an architecture in which all processors access the same shared memory. A multiprocessor system must be able to manage memory sharing between several processors and also distribute the workload.
EMBEDDED SYSTEM:--Embedded systems are operating systems designed to operate on small machines, such as PDAs (personal digital assistants) or autonomous electronic devices (space probes, robots, on-board vehicle computers, etc.) with reduced autonomy. Thus an essential feature of embedded systems is their advanced energy management and ability to operate with limited resources. The main "general use" embedded systems for PDAs are as follows: PalmOS; Windows CE / Windows Mobile / Windows Smartphone.
REAL TIME SYSTEM:-Real time systems, used mainly in industry, are systems designed to operate in a time-constrained environment. A real time system must also operate reliably according to specific time constraints; in other words, it must be able to properly process information received at clearly-defined intervals (regular or otherwise). Here are some examples of real time operating systems: OS-9; RTLinux (RealTime Linux); QNX; VxWorks.
TYPES OF OPERATING SYSTEM:--There are several types of operating system, defined according to whether they can simultaneously manage information measuring 16 bits, 32 bits, 64 bits or more.

System | Programming | Single user | Multi-user | Single task | Multi-task
DOS | 16 bits | X | | X |
Windows3.1 | 16/32 bits | X | | | not pre-emptive
Windows95/98/Me | 32 bits | X | | | cooperative
WindowsNT/2000 | 32 bits | | X | | pre-emptive
WindowsXP | 32/64 bits | | X | | pre-emptive
Unix / Linux | 32/64 bits | | X | | pre-emptive
MAC/OS X | 32 bits | | X | | pre-emptive
VMS | 32 bits | | X | | pre-emptive
MULTI USER SYSTEM - A multi-user operating system allows many different users to take
advantage of the computer's resources simultaneously. The operating system must make sure that the requirements of the various users are balanced, and that each of the programs they are using has sufficient and separate resources so that a problem with one user doesn't affect the entire community of
users. Unix, VMS and mainframe operating systems, such as MVS, are examples of multi-user operating systems.
It's important to differentiate between multi-user operating systems and single-user operating systems that support networking. Windows 2000 and Novell Netware can each support hundreds or thousands of networked users, but the operating systems themselves aren't true multi-user operating systems. The system administrator is the only "user" for Windows 2000 or Netware. The network support and all of the remote user logins the network enables are, in the overall plan of the operating system, a program being run by the administrative user.
Virus
In computers, a virus is a program or programming code that replicates by being copied or initiating its copying to another program, computer boot sector or document. Viruses can be transmitted as attachments to an e-mail note or in a downloaded file, or be present on a diskette or CD. The immediate source of the e-mail note, downloaded file, or diskette you've received is usually unaware that it contains a virus. Some viruses wreak their effect as soon as their code is executed; other viruses lie dormant until circumstances cause their code to be executed by the computer. Some viruses are benign or playful in intent and effect ("Happy Birthday, Ludwig!") and some can be quite harmful, erasing data or causing your hard disk to require reformatting. A virus that replicates itself by resending itself as an e-mail attachment or as part of a network message is known as a worm.

Generally, there are three main classes of viruses:

File infectors. Some file infector viruses attach themselves to program files, usually selected .COM or .EXE files. Some can infect any program for which execution is requested, including .SYS, .OVL, .PRG, and .MNU files. When the program is loaded, the virus is loaded as well. Other file infector viruses arrive as wholly-contained programs or scripts sent as an attachment to an e-mail note.

System or boot-record infectors. These viruses infect executable code found in certain system areas on a disk. They attach to the DOS boot sector on diskettes or the Master Boot Record on hard disks. A typical scenario (familiar to the author) is to receive a diskette from an innocent source that contains a boot disk virus. When your operating system is running, files on the diskette can be read without triggering the boot disk virus. However, if you leave the diskette in the drive, and then turn the computer off or reload the operating system, the computer will look first in your A drive, find the diskette with its boot disk virus, load it, and make it temporarily impossible to use your hard disk. (Allow several days for recovery.) This is why you should make sure you have a bootable floppy.

Macro viruses. These are among the most common viruses, and they tend to do the least damage. Macro viruses infect your Microsoft Word application and typically insert unwanted words or phrases.

The best protection against a virus is to know the origin of each program or file you load into your computer or open from your e-mail program. Since this is difficult, you can buy anti-virus software that can screen e-mail attachments and also check all of your files periodically and remove any viruses that are found. From time to time, you may get an e-mail message warning of a new virus. Unless the warning is from a source you recognize, chances are good that the warning is a virus hoax.

The computer virus, of course, gets its name from the biological virus. The word itself comes from a Latin word meaning slimy liquid or poison.
Virus classification:--There is no evidence that viruses possess a common ancestor or are in any way phylogenetically related. Nevertheless, classification along the lines of the Linnean system into families, genera, and species has been utilized. Based on the organisms they infect, the first broad division of viruses is into bacterial, plant, and animal viruses. Within these classes, other criteria for subdivision are used. Among these are general morphology; envelope or the lack of it; nature of the genome (DNA or RNA); structure of the genome (single- or double-stranded, linear or circular, fragmented or nonfragmented); mechanisms of gene expression and virus replication (positive- or negative-strand RNA); serological relationship; host and tissue susceptibility; pathology (symptoms, type of disease).
Animal viruses
The families of animal viruses are sometimes subdivided into subfamilies; the suffix -virinae may then be used. The subgroups of a family or subfamily are equivalent to the genera of the Linnean classification. The animal DNA viruses are divided into five families: Poxviridae, Herpesviridae, Adenoviridae, Papovaviridae, and Parvoviridae. RNA animal viruses may be either single-stranded or double-stranded. The single-stranded are further subdivided into positive-strand and negative-strand RNA viruses, depending on whether the RNA contains the messenger RNA (mRNA) nucleotide sequence or its complement, respectively. Further, the RNA genes may be located on one or several RNA molecules (nonfragmented or fragmented genomes, respectively). The positive-strand RNA animal viruses contain six families: Picornaviridae, Caliciviridae, Coronaviridae, Togaviridae, Retroviridae, and Nodamuraviridae. The nucleocapsid of negative-strand RNA animal viruses contains an RNA-dependent
RNA polymerase required for the transcription of the negative strand into the positive mRNAs. Virion RNA is neither capped nor polyadenylated. The group is divided into five families: Arenaviridae, Orthomyxoviridae, Paramyxoviridae, Rhabdoviridae, and Bunyaviridae. The double-stranded RNA animal viruses contain only one group, the Reoviridae.
Bacterial viruses
Bacterial viruses are also known as bacteriophages or phages. They may be tailed or nontailed. Nontailed phages are further subdivided into those with envelopes and those without. Tailed phages, which do not have envelopes, are divided into three families: Myoviridae, Styloviridae, and Pedoviridae. The group of nontailed DNA bacteriophages contains seven families, each with a distinctive morphology: Tectiviridae, Corticoviridae, Inoviridae, Microviridae, Leviviridae, Plasmaviridae, and Cystoviridae. Only the latter two families have envelopes.
Plant viruses
Plant viruses are divided into groups, rather than families, except those which belong to the families Rhabdoviridae and Reoviridae. The group, and correspondingly subgroup and type, can be viewed as analogous to family, genus, and species, respectively. Most common among plant viruses are those with a single-stranded, capped but not polyadenylated, positive-strand RNA.

Virus classification involves naming and placing viruses into a taxonomic system. Like the classification systems used for cellular organisms, virus classification is the subject of ongoing debate and proposals. This is largely due to the pseudo-living nature of viruses, which are not yet definitively living or non-living. As such, they do not fit neatly into the established biological classification system in place for cellular organisms, such as eukaryotes and prokaryotes. Virus classification is based mainly on phenotypic characteristics, including morphology, nucleic acid type, mode of replication, host organisms, and the type of disease they cause. A combination of two main schemes is currently in widespread use for the classification of viruses. David Baltimore, a Nobel Prize-winning biologist, devised the Baltimore classification system, which places viruses into one of seven groups. These groups are designated by Roman numerals and separate viruses based on their mode of replication and genome type. Accompanying this broad method of classification are specific naming conventions and further classification guidelines set out by the International Committee on Taxonomy of Viruses.
Baltimore classification
The Baltimore classification of viruses is based on the method of viral mRNA synthesis. Baltimore classification (first defined in 1971) is a classification system which places viruses into one of seven groups depending on a combination of their nucleic acid (DNA or RNA), strandedness (single-stranded or double-stranded), sense, and method of replication. Other classifications are determined by the disease caused by the virus or its morphology, neither of which is satisfactory, because different viruses can cause the same disease or look very similar. In addition, viral structures are often difficult to determine under the microscope. Classifying viruses according to their genome means that those in a given category will all behave in a similar fashion, offering some indication of how to proceed with further research. Viruses can be placed in one of the seven following groups:
I: dsDNA viruses (e.g. Adenoviruses, Herpesviruses, Poxviruses)
II: ssDNA viruses (+)sense DNA (e.g. Parvoviruses)
III: dsRNA viruses (e.g. Reoviruses)
IV: (+)ssRNA viruses (+)sense RNA (e.g. Picornaviruses, Togaviruses)
V: (-)ssRNA viruses (-)sense RNA (e.g. Orthomyxoviruses, Rhabdoviruses)
VI: ssRNA-RT viruses (+)sense RNA with DNA intermediate in life-cycle (e.g. Retroviruses)
VII: dsDNA-RT viruses (e.g. Hepadnaviruses)
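The seven groups listed above amount to a small decision procedure on genome properties. The following is a minimal sketch of that lookup; the function name `baltimore_group` and its parameter names are hypothetical, introduced only for illustration, and the mapping itself follows the list above.

```python
def baltimore_group(nucleic_acid, strands, sense=None, reverse_transcribing=False):
    """Return the Baltimore group (Roman numeral) for a genome description.

    nucleic_acid: "DNA" or "RNA"
    strands: "ds" (double-stranded) or "ss" (single-stranded)
    sense: "+" or "-"; only consulted for single-stranded RNA without reverse transcription
    reverse_transcribing: True for viruses that replicate through reverse transcription
    """
    genome = (nucleic_acid, strands)
    if reverse_transcribing:
        if genome == ("RNA", "ss"):
            return "VI"   # ssRNA-RT, e.g. retroviruses
        if genome == ("DNA", "ds"):
            return "VII"  # dsDNA-RT, e.g. hepadnaviruses
        raise ValueError(f"no reverse-transcribing group for {genome}")
    if genome == ("DNA", "ds"):
        return "I"
    if genome == ("DNA", "ss"):
        return "II"
    if genome == ("RNA", "ds"):
        return "III"
    if genome == ("RNA", "ss"):
        if sense == "+":
            return "IV"
        if sense == "-":
            return "V"
        raise ValueError("single-stranded RNA needs sense '+' or '-'")
    raise ValueError(f"unknown genome description {genome}")


print(baltimore_group("RNA", "ss", sense="-"))  # group of e.g. orthomyxoviruses
```

Expressing the scheme as a function makes it clear that group membership depends only on the genome and its replication route, not on morphology or disease.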
DNA viruses
Group I: viruses possess double-stranded DNA.
Group II: viruses possess single-stranded DNA.

Virus Family | Examples (common names) | Virion naked/enveloped | Capsid symmetry | Nucleic acid type | Group
1. Adenoviridae | Adenovirus | Naked | Icosahedral | ds | I
2. Papillomaviridae | Papillomavirus | Naked | Icosahedral | ds circular | I
3. Parvoviridae | Parvovirus B19 | Naked | Icosahedral | ss | II
4. Herpesviridae | Herpes simplex virus, varicella-zoster virus, cytomegalovirus, Epstein-Barr virus | Enveloped | Icosahedral | ds | I
5. Poxviridae | Smallpox virus, vaccinia virus | Complex coats | Complex | ds | I
6. Hepadnaviridae | Hepatitis B virus | Enveloped | Icosahedral | ds circular (partially double-stranded) | VII
7. Polyomaviridae | Polyoma virus; JC virus (progressive multifocal leucoencephalopathy) | Naked | Icosahedral | ds circular | I
8. Circoviridae | Transfusion Transmitted Virus | Naked | Icosahedral | ss circular | II

RNA viruses
Group III: viruses possess double-stranded RNA genomes, e.g. rotavirus. These genomes are always segmented.
Group IV: viruses possess positive-sense single-stranded RNA genomes. Many well known viruses are found in this group, including the picornaviruses (a family that includes well-known viruses like Hepatitis A virus, enteroviruses, rhinoviruses, poliovirus, and foot-and-mouth virus), SARS virus, hepatitis C virus, yellow fever virus, and rubella virus.
Group V: viruses possess negative-sense single-stranded RNA genomes. The deadly Ebola and Marburg viruses are well known members of this group, along with influenza virus, measles, mumps and rabies.

Virus Family | Examples (common names) | Virion naked/enveloped | Capsid symmetry | Nucleic acid type | Group
1. Reoviridae | Reovirus, Rotavirus | Naked | Icosahedral | ds | III
2. Picornaviridae | Enterovirus, Rhinovirus, Hepatovirus, Cardiovirus, Aphthovirus, Poliovirus, Parechovirus, Erbovirus, Kobuvirus, Teschovirus, Coxsackie | Naked | Icosahedral | ss | IV
3. Caliciviridae | Norwalk virus, Hepatitis E virus | Naked | Icosahedral | ss | IV
4. Togaviridae | Rubella virus | Enveloped | Icosahedral | ss | IV
5. Arenaviridae | Lymphocytic choriomeningitis virus | Enveloped | Complex | ss(-) | V
6. Flaviviridae | Dengue virus, Hepatitis C virus, Yellow fever virus | Enveloped | Icosahedral | ss | IV
7. Orthomyxoviridae | Influenzavirus A, Influenzavirus B, Influenzavirus C, Isavirus, Thogotovirus | Enveloped | Helical | ss(-) | V
8. Paramyxoviridae | Measles virus, Mumps virus, Respiratory syncytial virus | Enveloped | Helical | ss(-) | V
9. Bunyaviridae | California encephalitis virus, Hantavirus | Enveloped | Helical | ss(-) | V
10. Rhabdoviridae | Rabies virus | Enveloped | Helical | ss(-) | V
11. Filoviridae | Ebola virus, Marburg virus | Enveloped | Helical | ss(-) | V
12. Coronaviridae | Corona virus | Enveloped | Helical | ss | IV
13. Astroviridae | Astrovirus | Naked | Icosahedral | ss | IV
14. Bornaviridae | Borna disease virus | | | |
Virus Problems
The primary reason viruses are such a problem is the vulnerability of IS resources. Safeguard programs take time to run, and many users are in too much of a hurry to wait. Another reason for a virus's spread is that users often simply are not aware of the virus's presence until it is too late. This is true for both stand-alone and networked computers.

Generally, there are two main classes of viruses. The first class consists of the FILE INFECTORS, which attach themselves to ordinary program files. These usually infect executable files. The second category is SYSTEM or BOOT-RECORD INFECTORS: those viruses which infect executable code found in certain system areas on a disk which are not ordinary files. On DOS based systems, there are ordinary boot-sector viruses, which infect only the DOS boot sector, and MBR viruses, which infect the Master Boot Record on fixed disks and the DOS boot sector on diskettes. Examples include Brain, Stoned, Empire, Azusa, and Michelangelo. Such viruses are always resident viruses. Finally, a few viruses are able to infect both (the Tequila virus is one example). These are often called MULTI-PARTITE viruses or BOOT-AND-FILE viruses.
Virus Symptoms
There are various symptoms which indicate a virus is present. Symptoms include messages, music and graphical displays. However, the main indicators are changes in file sizes and contents.
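Since changes in file sizes and contents are named as the main indicators, one way to watch for them is to record a baseline and compare against it later. The sketch below is an illustration under stated assumptions, not a substitute for antivirus software: the function names `snapshot` and `changed_files` are hypothetical, and the approach assumes the baseline was taken while the files were known to be clean.

```python
import hashlib
from pathlib import Path


def snapshot(directory):
    """Record (size, SHA-256 digest) for every regular file under `directory`."""
    root = Path(directory)
    records = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            records[str(path.relative_to(root))] = (
                len(data),
                hashlib.sha256(data).hexdigest(),
            )
    return records


def changed_files(baseline, current):
    """Return the names of files that were modified, added or removed."""
    names = set(baseline) | set(current)
    # A missing entry yields None, which never equals a (size, digest) pair.
    return sorted(name for name in names if baseline.get(name) != current.get(name))
```

A typical use is to call `snapshot` once on a directory of program files, store the result, and later compare a fresh snapshot with `changed_files`. The digest catches content changes even when the file size stays the same.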
1. Notify your ADP System Manager and the ADP Security Office of the infection and take the necessary actions to minimize the spread of the virus within your activity.
2. Notify all activities that may have received infected diskettes or network files from your activity. Everyone concerned must know about the virus so that it may be stopped and removed.
3. If possible, capture samples of the virus(es) on diskette (no more than 1 diskette per virus). Forward them with the information in paragraph 5 below via your ISSM for analysis to the NRL IS Security Office, Code 1220.2.
4. Use Toolbox or a commercial antiviral software to remove the infection.
5. Provide the following information to NRL IS Security Office via your ISSM.
a) Name of the virus
b) How the virus was first detected and identified
c) Damage or observations resulting when the virus triggers
d) Damage caused to your systems, if any
e) Source of the virus, if known
f) Other locations, within or outside of your activity, possibly infected as a result of sharing infected media or files
g) Number and types of systems infected (i.e. hard disks and servers)
h) Number of floppy diskettes infected (approximate)
i) Method of clean-up (removal software, format disk, etc.)
j) Number of work hours expended to remove the infection (approximate)
k) Your name, phone and location
The ADP Security Office will make an immediate and thorough investigation of all virus infections reported.
Prevention
Scan all disks before they are used.
Be cautious of all newly acquired software.
Check new software for infection before it is run for the first time.
Never boot from an unprotected diskette.
Backup files and programs.
Watch for unusual operation indicators.
Use virus detection software.
If you believe that your computer is infected with a virus - DON'T PANIC! Sometimes a badly thought out attempt to remove a virus will do much more damage than the virus could have done. If you are not sure what to do, leave your computer turned off until you contact the NRL IS Security Group to remove the virus for you. Viruses can be extremely unforgiving unless they are removed correctly.
Steps
1. Open command prompt. Go to Windows, then Run, and type "cmd". Press enter.
2. Type "cd\" and press enter to get to the root directory of c:\.
3. Type "attrib -h -r -s autorun.inf" and press enter.
4. Type "del autorun.inf" and press enter.
5. Repeat the same process with other drives and restart your computer.
6. Restart your computer and it's done. Enjoy the freedom to open hard drives on a double click.
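The steps above can also be sketched as a small script. This is only an illustration under assumptions: the function name `remove_autorun` is hypothetical, and the sketch clears just the read-only attribute through the standard library. The hidden and system attributes that "attrib -h -s" clears are Windows-specific and are not handled here.

```python
import os
import stat
from pathlib import Path


def remove_autorun(drive_root):
    """Delete autorun.inf from `drive_root`; return True if a file was removed."""
    target = Path(drive_root) / "autorun.inf"
    if not target.is_file():
        # Same situation as "file not found" at the prompt: skip this drive.
        return False
    # Clear the read-only flag first, otherwise deletion can be refused.
    os.chmod(target, stat.S_IREAD | stat.S_IWRITE)
    target.unlink()
    return True
```

Calling the function once per drive root (for example "C:\\" and "D:\\", both hypothetical here) mirrors repeating the manual procedure on each drive. A restart is still needed afterwards.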
Tips
Sometimes the command prompt returns a "file not found" error for autorun.inf, because some of your hard drives may not contain the autorun.inf file. Skip those drives and try the next ones.
Warnings
After deleting the file from all of your hard drives, immediately restart your computer. Don't try to open your drives by double clicking before restarting the machine, otherwise you'll have to repeat the whole procedure again.
DIMACS Working Group on Analogies Between Computer and Biological Viruses and Their Immune Systems
Introduction
Defenses against benign and malignant computer virus attacks are of considerable importance to both industrial and defense sectors. With technology bounding ahead, today's computers have growing vulnerabilities. At the same time, the scale of sophistication in computer viruses is rapidly increasing, as shown in Figure 1. Viruses have the capability to spread over multiple types of networks and various protocols (email, HTTP, instant messaging, phone, etc.), each with a separate mechanism for infection and spread. The operational issues are also quite important in identifying both military and non-military applications of information flow. The growing number of malicious methods of attack highlights the fact that there exist major deficiencies in detecting and controlling computer virus attacks.

Currently, we find that large-scale attacks are still sequential, and each can be classified into three stages. First, there is the pre-attack stage, when security deficiencies in a particular system are identified and new viruses are formulated. Next, there is the initial occurrence, when the virus is spreading freely without detection by filtering or cure software. Finally, defensive strategies against the virus are developed and the cleanup stage occurs. Typically, the cleanup stage is very long, since a small number of cases continue to spread long after the initial outbreak dissipates. This process is depicted in Figure 2.

One great concern is the increasing length of the initial occurrence stage, the significant gap between the initial time of attack and implementation of defensive strategies for a new computer virus. In part, this is due to the sophistication of virus writers, allowing computer viruses to evolve more rapidly than biological viruses. Some recent highly damaging viruses used methods where the virus automatically spread with no human in the loop, for example automatic emails, HTTP requests by IIS web servers, and guessing IP addresses to attack. New mathematical models need to be developed which include a delay between attack and defense, and keep available data in mind.
[Figure 1: Attack sophistication versus attacker knowledge requirements, 1980 to 1995. Methods labeled: password guessing, disabling audits, session hijacking, sniffers, packet spoofing, denial of service, GUI, stealth diagnostics.]
[Figure 2: Number of infections by stage, with the pre-attack and initial occurrence stages labeled.]
For example, these models can be used to show trends in the data that will help quickly identify a new attack in the initial occurrence stage. Averting an outbreak early in this stage is what we have termed quenching. It has been well documented that early detection in biological viruses leads to the most effective countermeasures. It is this type of observation, or analogy, that gives insight into effective defensive strategies. This inspired a closer investigation into the analogies between computer viruses and biological viruses and their immune systems. By revisiting the original population dynamics approach to modeling computer viruses [3] and incorporating some of the more recent work from other fields in biology, this study offers promise for greater understanding of both.

The main objective of the working group is to identify, by mathematical analysis, new modeling techniques for describing quantitatively the spread of computer virus attacks and possible defensive strategies. The main themes were the creation of new models to account for the analogies and differences between the spread of computer and biological viruses, identification and metrics of virus spread, and discussions about control and constraints to preserve information flow. The approach to addressing the group's objectives was to allow for open-ended cross-disciplinary presentations of new ideas, which may or may not have been published in the open literature. These were followed by discussion and debate, with breakout sessions to solidify the identification of problems. This meeting brought together experts from computer science, applied mathematics, physics, and epidemiology to discuss what are, in their opinion, the important issues that need to be addressed.

Current approaches to the detection and control of computer virus spread were discussed, including information flow modeling, operational network modeling, examination of real world virus data, and validation of models against available data. Part of the problem in trying to assess model validity is that data are not widely available and are often incomplete. Network data of virus spread is inherently nonstationary and spatio-temporal in nature. Unlike well-formed physics experiments, a computer virus attack is transient, making control difficult. The new models are meant to aid in the development of early prediction methods, detection strategies, and optimal defensive measures. Of course, all defensive strategies are limited by constraints, whether they are human resources, time, money, or privacy. These factors were also considered during evaluation.
Models
The goals of constructing models are fairly general, but address specific needs for control and containment of computer virus modeling. Namely, the models constructed should: By exploring the analogies between computer and biological viruses, the group identified the building blocks for new mathematical models.
Classes
Computer viruses are spread from computer to computer via some form of network. Most of the intercomputer transport occurs over the Internet, but it can also occur through social contacts, such as the infamous sneaker-net, where the virus is transported via removable media. Therefore, network modeling is a natural choice to describe transport through a network. For this purpose, there are basically two kinds of modeling: continuum modeling and agent-based modeling [2]. In agent-based modeling, each individual node is tracked as being either susceptible to infection, immune or cured, or infectious. That is, there is a vector of states that is assigned to each node. Continuum modeling, on the other hand, consists of averaging over all nodes and compartmentalizing the states. So for a compartmental model, we assume the system consists of a set of evolution equations which describe the total number of nodes in each state, such as the number of susceptible, infective, cured, etc.

Both classes of models have advantages and disadvantages. Continuum models have the advantage of being qualitatively analyzable for stability, with a rich theoretical foundation. They permit both forward and backward sensitivity analysis. The models are defined in terms of available node parameters, such as exposure rate, infectious period, etc. They can predict mean behavior of outbreaks in the presence of small stochastic effects in time. In contrast, they cannot describe individual nodal events, or low probability events, which might occur in long-tailed distributions, and they possess a severe limitation on the dimension of the state space which may be attached to individual nodes.

Agent-based models belong to a completely different class of models. Unlike the continuum models, these models detail low probability events, directly incorporate the behavior of individuals, and have few limitations on the state space. However, they do have limitations. This class has very few analytical tools by which it can be studied, since the models consist of Monte-Carlo simulations. In short, no backward sensitivity analysis can be performed on these models, and the individual level data has to be derived from very detailed population data, making these models computationally less efficient than the continuum models.

One hybrid class of models that connects both continuum and agent-based models is the class of patch models. Patch models consider local space averages of nodes, which are defined as a sub-population group. Each sub-population group is then modeled as a continuum model, and the entire population is formed by networking the sub-populations. The analogy is a large business or university cluster of nodes connected to the outside world via a router. So, the sub-populations model clusters of locally connected nodes, which are inter-connected by routers. The patch models have many of the best features of both continuum and agent-based models, and can possess the correct limiting cases of both. Current defensive strategies work on this premise. A general network consists of various sized chunks with the outsides completely connected and the insides completely connected, but outside-to-inside connections are very rare. Firewalls restrict traffic in and out of the outside connections, but there are generally no restrictions on the inside, usually called an intranet.
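To make the contrast between the two model classes concrete, the sketch below runs a compartmental (continuum) model and an agent-based Monte-Carlo model side by side. Everything specific here is an assumption for illustration: the three states (susceptible, infectious, cured), the rate symbols `beta` (exposure rate) and `gamma` (inverse of the infectious period), a fully mixed network, and simple Euler time stepping. It is not the working group's own model.

```python
import random


def continuum_model(s0, i0, r0, beta, gamma, dt, steps):
    """Euler-integrate the compartment equations
    dS/dt = -beta*S*I/N,  dI/dt = beta*S*I/N - gamma*I,  dR/dt = gamma*I."""
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r
    history = [(s, i, r)]
    for _ in range(steps):
        newly_infected = beta * s * i / n * dt
        newly_cured = gamma * i * dt
        s, i, r = s - newly_infected, i + newly_infected - newly_cured, r + newly_cured
        history.append((s, i, r))
    return history


def agent_model(n, i0, beta, gamma, dt, steps, seed=0):
    """Track every node individually; each step is one Monte-Carlo draw per node."""
    rng = random.Random(seed)
    states = ["I"] * i0 + ["S"] * (n - i0)
    history = [(n - i0, i0, 0)]
    for _ in range(steps):
        infectious = states.count("I")
        p_infect = min(1.0, beta * infectious / n * dt)  # per-node infection chance
        p_cure = min(1.0, gamma * dt)                    # per-node cure chance
        updated = []
        for state in states:
            if state == "S" and rng.random() < p_infect:
                updated.append("I")
            elif state == "I" and rng.random() < p_cure:
                updated.append("R")
            else:
                updated.append(state)
        states = updated
        history.append((states.count("S"), states.count("I"), states.count("R")))
    return history


# Hypothetical parameters: 1000 nodes, 5 initially infected.
mean_field = continuum_model(995, 5, 0, beta=0.5, gamma=0.1, dt=0.1, steps=500)
one_run = agent_model(1000, 5, beta=0.5, gamma=0.1, dt=0.1, steps=500, seed=1)
```

The continuum run gives a single smooth mean trajectory, while each agent-based run with a different `seed` gives a different outcome, including rare runs where the outbreak dies out early. That is the low probability behavior the continuum equations cannot show.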
In addition, there is the medium of transmission. In biology there are networks consisting of human webs, sexual webs, food webs, etc., while technological networks such as the Internet and email transport computer viruses. There are also non-equilibrium effects, such as seasonality in biology and fast periodic driving forces in networks. In summary, there are numerous analogies between biological and computer viruses. However, there are a few distinct differences that are important in classifying new models. In particular, biological viruses usually grow slowly, while computer viruses spread globally and quickly. Biological transmission involves prolonged contacts, whereas in networks there is obviously a maximum possible rate of transmission. Humans are self-regulating against viruses, while computers are not. This is one of the main reasons new control mechanisms are needed. The speed of attack differs greatly between the two kinds of virus, and both are out of equilibrium, with different time scales. One must also consider the speed of growth of the networks on which the virus spreads. The two types of viruses also differ in sophistication: the biological virus is autonomous, evolving, and sequential, whereas the computer virus is highly regulated and static. In general, biological networks are less connected than computer networks.
Data Issues
One of the main challenges to modeling virus outbreaks is quantifying predictions against measured data. In both biology and computer science networks, case histories are incompletely measured. Much of the network data consists just of counting connections in both cases. Temporal behavior is ignored in most cases, although now some companies are beginning to acquire time stamps along with virus transmission between nodes. One of the big issues surrounding data is that of confidentiality and privacy. This is due in part to public and government pressure put on health care industries. In the computer world, however, masking is possible, relieving that concern. One can ask what the ideal data set should contain. In discussion, the data should reflect virus propagation in time along directed graphs, which is necessary for setting up the correct network topology. Unfortunately, most data sets count the number of infections on some domain, as shown in Figure 3. Clearly, more data is better, but it should be collected under controlled circumstances. There should also be a network test bed, and some companies, such as some of the antivirus companies, have set up dummy networks. Non-stationary effects should also be taken into account, since real networks grow in time. In addition, there is a diversity of computing environments, which also change in time.
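The difference between the ideal data set and a count-only data set can be illustrated with a short sketch. The event records, host names, and field layout below are hypothetical and chosen only for illustration. The point is that timestamped source-to-target records yield both the directed transmission graph and the infection counts, whereas counts alone cannot be turned back into a topology.

```python
from collections import defaultdict

# Hypothetical records: (timestamp in seconds, infecting host, infected host).
events = [
    (0, "a", "b"),
    (5, "a", "c"),
    (9, "b", "d"),
    (9, "c", "e"),
    (14, "d", "f"),
]


def transmission_graph(events):
    """Directed graph: source host -> list of (infected host, timestamp)."""
    graph = defaultdict(list)
    for t, src, dst in sorted(events):
        graph[src].append((dst, t))
    return dict(graph)


def cumulative_infections(events, bin_width):
    """Count-only view: cumulative infections at the end of each time bin."""
    per_bin = defaultdict(int)
    for t, _src, _dst in events:
        per_bin[t // bin_width] += 1
    total, curve = 0, []
    for b in range(max(per_bin) + 1):
        total += per_bin[b]
        curve.append((b * bin_width, total))
    return curve


if __name__ == "__main__":
    print(transmission_graph(events))
    print(cumulative_infections(events, bin_width=5))
```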
One of the interesting points that came up was the interface of support networks, where one might have a monitoring network connected to a large network. This is a potentially new branch of study that raises a myriad of questions. Another important direction, though not new, is better sensitivity analysis in transient regions. Because transmission expands rapidly, inaccurate predictions of growth rates may lead to inaccurate control responses. Other considerations include modeling of clusters, mixing patterns on the network, meta-populations, and transport. Another area of research is the comparison of local versus global (hub) strategies for control. Local methods take a Bayesian approach to modeling the normal behavior of a computer node, and then flag any activity in the tails of the distribution. Such methods are finding their way into the control of spam mail. Other methods of checking for unwanted statistical changes to the operating system already exist. However, local methods tend to have an imbalance between false alarms and missed detections, making optimization of local control difficult. In contrast, global methods may monitor large hub connections, where new software controls may be put in place. Hubs tend to be organized by a distance-minimizing criterion in terms of connection length. Therefore, to accomplish some sort of global control, it makes sense to analyze traffic around these centers in the presence of a virus or worm. Given that most network information comes from node counting, knowing how information is transmitted through the major hubs is valuable for control.

Figure 3: Data of the number of infections for the Code Red Worm.
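The local approach described above can be sketched as follows. This is only one possible reading, under stated assumptions: a node's normal activity is summarized as a count per time interval (for example, outgoing connections per minute), the counts are modeled as Poisson with a Gamma prior on the rate, and a new observation is flagged when its posterior predictive tail probability falls below a threshold alpha. The prior parameters, the threshold, and the function names are all hypothetical.

```python
import math


def predictive_tail(history, observed, prior_shape=1.0, prior_rate=1.0):
    """P(count >= observed) under a Gamma-Poisson model of normal behavior.

    history  : past per-interval counts for this node
    observed : the new count to score
    The Gamma(prior_shape, prior_rate) prior is an illustrative assumption.
    """
    shape = prior_shape + sum(history)  # posterior shape
    rate = prior_rate + len(history)    # posterior rate
    # The posterior predictive is negative binomial with this success probability.
    p = rate / (rate + 1.0)
    below = 0.0
    for k in range(observed):
        log_pmf = (math.lgamma(k + shape) - math.lgamma(k + 1)
                   - math.lgamma(shape)
                   + shape * math.log(p) + k * math.log(1.0 - p))
        below += math.exp(log_pmf)
    return max(0.0, 1.0 - below)


def is_anomalous(history, observed, alpha=0.001):
    """Flag activity that lies in the upper tail of the predictive distribution."""
    return predictive_tail(history, observed) < alpha


if __name__ == "__main__":
    normal = [3, 5, 4, 6, 5, 4, 5, 3, 4, 5]
    for count in (5, 12, 40):
        print(count, predictive_tail(normal, count), is_anomalous(normal, count))
```

The threshold alpha is where the imbalance between false alarms and missed detections shows up: a larger alpha flags more ordinary bursts, while a smaller alpha lets slower worms pass unnoticed.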
Mathematical Methods
Given the problems mentioned above in modeling the spread of computer viruses, many powerful tools from mathematics may be used to model and analyze viral growth on networks. Many computer viruses spread via electronic mail, using computer users' email address books as a source of addresses for new victims. These address books form a directed social network of connections between individuals over which the virus spreads. In Newman et al. [5], the structure of such a social network was studied using data drawn from a large computer installation. Some of the preliminary conclusions were that targeted computer vaccination strategies were much more effective than random vaccinations, and that priority orderings for security upgrades were more effective as well. This study provides an example topology, drawn from data, on which email viruses can spread. Each type of computer virus has an associated network, and finding an accurate approximation of a network's topology is important for designing a model of the viral spread.

First and foremost, graph theory is the tool used to quantify the topology on which viruses propagate [6,7]. In most cases, propagation is studied on static graphs, as shown in Figure 4, but because of Internet growth, new tools are needed to understand the time-dependent nature of the connections. Some statistical physicists have already attempted to quantify how the network grows, but have yet to address how to quantify the topology and its effects on spreading. In addition to all-to-all and random networks, more analysis of these new types of networks must be completed to study their transient temporal behavior. For example, in the continuum limit, how does one take into account the non-stationary character of the Internet topology? Does a limit even make sense if the network topology is time dependent? Can new instabilities arise as the network changes, due to the scale-free nature of networks? Bifurcation analysis is an excellent class of tools for describing such phenomena.

Figure 4: Current topology designs for the World-Wide Web.

Because of the inherent randomness of both the spatial connections and the temporal fluctuations of nodes switching on and off, stochastic analysis will need to be incorporated into the modeling effort. Currently, most analysis of stochastic perturbations tends to be local. However, virus propagation on networks is a spatio-temporal problem, with local perturbations correlated over long distances in the network. Therefore, stochastic analysis of spatio-temporal systems of discrete and continuous equations needs to be extended to address the complex behavior of the network.

Finally, much of the work that has been done, and that will continue, is computer simulation of the spread of computer viruses. Realistic models that include packet transfer and collisions exist, but they are very expensive to run even on a limited linear backbone of simulated email servers. To bring about realistic modeling, better algorithms and parallel architectures will probably need to be developed if simulation tools are required to operate in real time. This is especially true if feed-forward control hardware is to be developed with any sort of software sophistication. These are just some of the mathematical and computational fields that are necessary to describe and analyze the spread of viruses on networks. The list is by no means exhaustive, and these may simply be the obvious choices. Nonetheless, the mathematical community needs to be heavily involved if any real progress on a non-local scale is to be achieved.
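As an illustration of the kind of simulation discussed above, the sketch below compares targeted and random vaccination in a small agent-based Monte Carlo experiment. It is not the method or the data of [5]: the network is a synthetic preferential-attachment graph standing in for a scale-free topology, and the infection probability, cure probability, network size, and number of vaccinated nodes are assumed values chosen only for illustration.

```python
import random


def preferential_attachment_graph(n, m, rng):
    """Grow a graph in which each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    adj = {v: set() for v in range(n)}
    for u in range(m + 1):  # small fully connected core
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
    # Each node appears in `ends` once per incident edge.
    ends = [u for u in range(m + 1) for _ in adj[u]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            ends.extend((new, t))
    return adj


def outbreak_size(adj, immune, p_infect, p_cure, rng):
    """Run one outbreak from a random node; return how many were ever infected."""
    patient_zero = rng.choice([v for v in adj if v not in immune])
    infectious = {patient_zero}
    ever_infected = {patient_zero}
    while infectious:
        newly = set()
        for u in infectious:
            for v in adj[u]:
                if (v not in immune and v not in ever_infected
                        and rng.random() < p_infect):
                    newly.add(v)
        cured = {u for u in infectious if rng.random() < p_cure}
        ever_infected |= newly
        infectious = (infectious - cured) | newly
    return len(ever_infected)


def mean_outbreak(adj, immune, trials, rng, p_infect=0.3, p_cure=0.5):
    total = sum(outbreak_size(adj, immune, p_infect, p_cure, rng)
                for _ in range(trials))
    return total / trials


def compare_strategies(n=400, m=2, n_vaccinated=40, trials=100, seed=1):
    """Mean outbreak size when vaccinating hubs versus random nodes."""
    rng = random.Random(seed)
    adj = preferential_attachment_graph(n, m, rng)
    by_degree = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    targeted = set(by_degree[:n_vaccinated])
    at_random = set(rng.sample(sorted(adj), n_vaccinated))
    return (mean_outbreak(adj, targeted, trials, rng),
            mean_outbreak(adj, at_random, trials, rng))


if __name__ == "__main__":
    targeted, at_random = compare_strategies()
    print("mean outbreak, targeted vaccination:", targeted)
    print("mean outbreak, random vaccination:  ", at_random)
```

Each outbreak here is a single Monte Carlo realization, so the comparison is only meaningful as an average over many trials. This is the computational cost of agent-based simulation noted earlier.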
Conclusions
Computer virus attacks remain a serious threat to the national and international information infrastructure, and may be analyzed through mathematical and computational models. Predicting virus outbreaks is extremely difficult because of the human nature of the attacks, but more importantly, detecting outbreaks early with a low probability of false alarms seems quite difficult. Nonetheless, by developing models based on better data collection, it is possible to characterize essential properties of the attacks from a network topology viewpoint. More resilient networks may be designed to control attacks while retaining information flow on the networks. Novel methods of control, both static and adaptive, can be designed to slow the spread of an attack, while other means are designed to eradicate the virus from the network. Since the dynamics are based on non-stationary networks, the problem is of a truly complex spatio-temporal nature. Most of the epidemiological analysis on networks to date has been static and ignores the time dependence. Given such complex dynamics, it is clear that mathematical and computer modeling will play a role for a very long time.
Research Challenges
There is a need to collect better data. We need more details on how and where viruses propagate. (Approach anti-virus companies and universities to assist in collecting this data.)
For testing hypotheses, there is a need for a more accurate network test bed. How do we design an experimental environment that accurately mimics the dynamics of a real network?
Models need to identify the speed and transients of virus spread, not just prevalence.
There is a need for a richer topology in models, i.e., one that captures complex network structures.
A better sensitivity analysis needs to be developed for agent-based modeling.
One needs to address the differences between a conceptual model and an operational model.
Better prevention tactics need to address the associated network resilience.
New control designs are needed: quenching, similar to a biological innate immune response; fight and cure, similar to a biological adaptive immune response; and distributed responses that are locally strong / globally benign.
References
[1] J.L. Aron, M. O'Leary, R.A. Gove, S. Azadegan, and M.C. Schneider, The benefits of a notification process in addressing the worsening computer virus problem: Results of a survey and a simulation model, Computers & Security 21 (2002), 142-163.
[2] L. Billings, W.M. Spears, and I.B. Schwartz, A unified prediction of computer virus spread in connected networks, Physics Letters A 297 (2002), 261-266.
[3] J. Kephart and S. White, Directed-graph epidemiological models of computer viruses, in Proc. of the 1991 Computer Society Symposium on Research in Security and Privacy (1991), 343-359.
[4] E. Makinen, Comment on "A framework for modeling trojans and computer virus infection", Comput. J. 44 (2001), 321-323.
[5] M. Newman, S. Forrest, and J. Balthrop, Email networks and the spread of computer viruses, submitted to Physical Review E (2002).
[6] R. Pastor-Satorras and A. Vespignani, Epidemic dynamics and endemic states in complex networks, Physical Review E 63 (2001), art. no. 066117.
[7] R. Pastor-Satorras and A. Vespignani, Epidemic spreading in scale-free networks, Physical Review Letters 86 (2001), 3200-3203.