You are on page 1of 11

Database: A database is a repository for a collection of computerized data files after proposed and compiled to produce result.

Database System: is a computerized system whose overall purpose is to maintain a database and to make the information in that database available on demand. Types: DBMS and RDBMS DBMS: Database Management System ( eg. FOXPRO, DBASE, MS-ACCESS, MYSQL etc) RDBMS:Relational Database Management System ( eg. MYSQL, ORACLE, SQL SERVER etc) DBMS: A database-management system (DBMS) is a set of computer programs that controls the creation, maintenance, and the use of a database. It allows organizations to place control of database development in the hands of database administrators (DBAs) and other specialists. A DBMS is a system software package that helps the use of integrated collection of data records and files known as databases. It allows different user application programs to easily access the same database. DBMSs may use any of a variety of database models, such as the network model or relational model. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way. Instead of having to write computer programs to extract information, user can ask simple questions in a query language. Thus, many DBMS packages provide Fourth-generation programming language (4GLs) and other application development features. It helps to specify the logical organization for a database and access and use the information within a database. It provides facilities for controlling data access, enforcing data integrity, managing concurrency, and restoring the database from backups. A DBMS also provides the ability to logically present database information to users. RDBMS: A relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as introduced by E. F. Codd. Most popular commercial and open source databases currently in use are based on the relational database model. A short definition of an RDBMS may be a DBMS in which data is stored in the form of tables and the relationship among the data is also stored in the form of tables. E.F. Codd: Edgar Frank "Ted" Codd (August 23, 1923 April 18, 2003) was a British computer scientist who, while working for IBM, invented the relational model for database management, the theoretical basis for relational databases. He made other valuable contributions to computer science, but the relational model, a very influential general theory of data management, remains his most mentioned achievement. Rule 0: The system must qualify as relational, as a database, and as a management system. Rule 1: The information rule:All information in the database is to be represented in one and only one way, namely by values in column positions within rows of tables. Rule 2: The guaranteed access rule:All data must be accessible. This rule is essentially a restatement of the fundamental requirement for primary keys. It says that every individual scalar value in the database must be logically addressable by specifying the name of the containing table, the name of the containing column and the primary key value of the containing row. Rule 3: Systematic treatment of null values: The DBMS must allow each field to remain null (or empty). Specifically, it must support a representation of "missing information and inapplicable information" that is systematic, distinct from all regular values (for example, "distinct from zero or any other number", in the case of numeric values), and independent of data type. It is also implied that such representations must be manipulated by the DBMS in a systematic way. Rule 4: Active online catalog based on the relational model: The system must support an online, inline, relational catalog that is accessible to authorized users by means of their regular query language. That is, users must be able to access the database's structure (catalog) using the same query language that they use to access the database's data.

Rule 5: The comprehensive data sublanguage rule: The system must support at least one relational language thatHas a linear syntax and Can be used both interactively and within application programs, Rule 6: The view updating rule: All views that are theoretically updatable must be updatable by the system. Rule 7: High-level insert, update, and delete: The system must support set-at-a-time insert, update, and delete operators. This means that data can be retrieved from a relational database in sets constructed of data from multiple rows and/or multiple tables. This rule states that insert, update, and delete operations should be supported for any retrievable set rather than just for a single row in a single table. Rule 8: Physical data independence: Changes to the physical level (how the data is stored, whether in arrays or linked lists etc.) must not require a change to an application based on the structure. Rule 9: Logical data independence: Changes to the logical level (tables, columns, rows, and so on) must not require a change to an application based on the structure. Logical data independence is more difficult to achieve than physical data independence. Rule 10: Integrity independence: Integrity constraints must be specified separately from application programs and stored in the catalog. It must be possible to change such constraints as and when appropriate without unnecessarily affecting existing applications. Rule 11: Distribution independence: The distribution of portions of the database to various locations should be invisible to users of the database. Existing applications should continue to operate successfully. Rule 12: The nonsubversion rule: If the system provides a low-level (record-at-a-time) interface, then that interface cannot be used to subvert the system, for example, bypassing a relational security or integrity constraint. Just for referenc: E.F. Codds rules(not in course details)

Usage of DBMS Applications


Banking: For customer information, accounts, and loans, and banking transactions. Airlines: For reservations and schedule information. Airlines were among the first to use databases in a geographically distributed mannerterminals situated around the world accessed the central database system through phone lines and other data networks. Universities: For student information, course registrations, and grades. Credit card transactions: For purchases on credit cards and generation of monthly statements. Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling cards, and storing information about the communication networks. Finance: For storing information about holdings, sales, and purchases of financial instruments such as stocks and bonds. Sales: For customer, product, and purchase information. Manufacturing: For management of supply chain and for tracking production of items in factories, inventories of items in warehouses/stores, and orders for items. Human resources: For information about employees, salaries, payroll taxes and benefits, and for generation of paychecks.

Advantages of DBMS: It has advantage over (traditional) File System due to resolving:
Data redundancy and inconsistency. Since different programmers create the files and application programs over a long period, the various files are likely to have different formats and the programs may be written in several programming languages. Moreover, the same information may be duplicated in several places (files). For example, the address and telephone number of a particular customer may appear in a file that consists of savings-account records and in a file that

consists of checking-account records. This redundancy leads to higher storage and access cost. In addition, it may lead to data inconsistency; that is, the various copies of the same data may no longer agree. For example, a changed customer address may be reflected in savings-account records but not elsewhere in the system. Difficulty in accessing data. Suppose that one of the bank officers needs to find out the names of all customers who live within a particular postal-code area. The officer asks the data-processing department to generate such a list. Because the designers of the original system did not anticipate this request, there is no application program on hand to meet it. There is, however, an application program to generate the list of all customers. The bank officer has now two choices: either obtain the list of all customers and extract the needed information manually or ask a system programmer to write the necessary application program. Both alternatives are obviously unsatisfactory. Suppose that such a program is written, and that, several days later, the same officer needs to trim that list to include only those customers who have an account balance of $10,000 or more. As expected, a program to generate such a list does not exist. Again, the officer has the preceding two options, neither of which is satisfactory. The point here is that conventional file-processing environments do not allow needed data to be retrieved in a convenient and efficient manner. More responsive data-retrieval systems are required for general use. Data isolation. Because data are scattered in various files, and files may be in different formats, writing new application programs to retrieve the appropriate data is difficult. Integrity problems. The data values stored in the database must satisfy certain types of consistency constraints. For example, the balance of a bank account may never fall below a prescribed amount (say, $25). Developers enforce these constraints in the system by adding appropriate code in the various application programs. However, when new constraints are added, it is difficult to change the programs to enforce them. The problem is compounded when constraints involve several data items from different files. Atomicity problems. A computer system, like any other mechanical or electrical device, is subject to failure. In many applications, it is crucial that, if a failure occurs, the data be restored to the consistent state that existed prior to the failure. Consider a program to transfer $50 from account A to account B. If a system failure occurs during the execution of the program, it is possible that the $50 was removed from account A but was not credited to account B, resulting in an inconsistent database state. Clearly, it is essential to database consistency that either both the credit and debit occur, or that neither occur. That is, the funds transfer must be atomicit must happen in its entirety or not at all. It is difficult to ensure atomicity in a conventional file-processing system. Concurrent-access anomalies. For the sake of overall performance of the system and faster response, many systems allow multiple users to update the data simultaneously. In such an environment, interaction of concurrent updates may result in inconsistent data. Consider bank account A, containing $500. If two customers withdraw funds (say $50 and $100 respectively) from account A at about the same time, the result of the concurrent executions may leave the account in an incorrect (or inconsistent) state. Suppose that the programs executing on behalf of each withdrawal read the old balance, reduce that value by the amount being withdrawn, and write the result back. If the two programs run concurrently, they may both read the value $500, and write back $450 and $400, respectively. Depending on which one writes the value last, the account may contain either $450 or $400, rather than the correct value of $350. To guard against this possibility, the system must maintain some form of supervision. But supervision is difficult to provide because data may be accessed by many different application programs that have not been coordinated previously. Security problems. Not every user of the database system should be able to access all the data. For example, in a banking system, payroll personnel need to see only that part of the database that has information about the various bank employees. They do not need access to information about customer accounts. But, since application programs are added to the system in an ad hoc manner, enforcing such security constraints is difficult.

Data Independence: Data independence is a form of database management that eeps data separated from all programs that make use of the data. As a cornerstone for the idea of a DBMS or database management system, data independence ensures that the data cannot be redefined or reorganized by any of the programs that make use of the data. In this manner, the data remains accessible, but is also stable and cannot be corrupted by the applications using it.

Disadvantages of DBMS:

Security might be compromised (without good controls). Integrity might be compromised (without good controls). Additional hardware might be required. Performance overhead might be significant. Successful operation is crucial (the enterprise might be highly vulnerable to failure). The system is likely to be complex (though such complexity should be concealed from the user).

Types of Data Independence:


a) Physical Data Independence: The ability to change the physical schema without changing the logical schema is called physical data independence. For example, a change to the internal schema, such as using different file organization or storage structures, storage devices, or indexing strategy, should be possible without having to change the conceptual or external schemas. b) Logical Data Independence: The ability to change the logical (conceptual) schema without changing the External schema (User View) is called logical data independence. For example, the addition or removal of new entities, attributes, or relationships to the conceptual schema should be possible without having to change existing external schemas or having to rewrite existing application programs.

Data Model: is a collection of conceptual tools for describing data, data relationships, data semantics, and
consistency constraints. An implementation of a given data model is a physical realization on a real machine of the components of that model. In a nutshell: The model is what users have to know about; the implementation is what users don't have to know about.

Types of Data Models


a) b) c) d) e) f) Flat File Model Hierarchical Data Model etc. Relational Model Object Oriented Data Model Network Model Entity-Relationship Model (E-R Model)

Flat Model: The flat (or table) model consists of a single, two-dimensional array of data elements, where all members of a given column are assumed to be similar values, and all members of a row are assumed to be related to one another.

Hierarchal Model: In a hierarchical model, data is organized into a tree-like structure, implying a single upward link in each record to describe the nesting, and a sort field to keep the records in a particular order in each same-level list. Hierarchical structures were widely used in the early mainframe database management systems, such as the Information Management System (IMS) by IBM, and now describe the structure of XML documents. This structure allows one 1:N relationship between two types of data. This structure is very efficient to describe many relationships in the real world; recipes, table of contents, ordering of paragraphs/verses, any nested and sorted information. However, the hierarchical structure is inefficient for certain database operations when a full path (as opposed to upward link and sort field) is not also included for each record.

Relational Model: he basic data structure of the relational model is the table, where information about a particular entity (say, an employee) is represented in rows (also called tuples) and columns. Thus, the "relation" in "relational database" refers to the various tables in the database; a relation is a set of tuples. The columns enumerate the various attributes of the entity (the employee's name, address or phone number, for example), and a row is an actual instance of the entity (a specific employee) that is represented by the relation. As a result, each tuple of the employee table represents various attributes of a single employee. A relational database contains multiple tables, each similar to the one in the "flat" database model. One of the strengths of the relational model is that, in principle, any value occurring in two different records (belonging to the same table or to different tables), implies a relationship among those two records. Yet, in order to enforce explicit integrity constraints, relationships between records in tables can also be defined explicitly, by identifying or non-identifying parent-child relationships characterized by assigning cardinality (1:1, (0)1:M, M:M). Tables can also have a designated single attribute or a set of attributes that can act as a "key", which can be used to uniquely identify each tuple in the table.

Object-Oriented Data Model: n recent years, the object-oriented paradigm has been applied to database technology, creating a new programming model known as object databases. These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same type system as the application program. This aims to avoid the overhead of converting information between its representation in the database (for example as rows in tables) and its representation in the application program (typically as objects). At the same time, object databases attempt to introduce the key ideas of object programming, such as encapsulation and polymorphism, into the world of databases.

Network Model: The network model organizes data using two fundamental constructs, called records and sets. Records contain fields, which may be organized hierarchically. Sets (not to be confused with mathematical sets) define one-to-many relationships between records: one owner, many members. A record may be an owner in any number of sets, and a member in any number of sets. The network model is a variation on the hierarchical model, to the extent that it is built on the concept of multiple branches (lower-level structures) emanating from one or more nodes (higher-level structures), while 7

the model differs from the hierarchical model in that branches can be connected to multiple nodes. The network model is able to represent redundancy in data more efficiently than in the hierarchical model.

E-R Model: In software engineering, an entity-relationship model (ERM) is an abstract and conceptual representation of data. Entity-relationship modeling is a database modeling method, used to produce a type of conceptual schema or semantic data model of a system, often a relational database, and its requirements in a top-down fashion. Diagrams created by this process are called entity-relationship diagrams, ER diagrams, or ERDs.

The Building Blocks of E-R Model: Entity, Relationship and Attributes. Entity: An entity is a thing or object in the real world that is distinguishable from other objects. For example, each
person is an entity, and bank accounts can be considered as entities.

Attributes: Entities are described in a database by a set of attributes. For example, the attributes account-number and
balance may describe one particular account in a bank, and they form attributes of the account entity set. Similarly, attributes customer-name, customer-street address and customer-city may describe a customer entity.

Relationship: A relationship is an association among several entities. For example, a depositor relationship associates
a customer with each account that she has. The set of all entities of the same type and the set of all relationships of the same type are termed an entity set and relationship set, respectively.

The overall logical structure (schema) of a database can be expressed graphically by an E-R diagram, which is built up from the following components: Rectangles, which represent entity sets Ellipses, which represent attributes Diamonds, which represent relationships among entity sets Lines, which link attributes to entity sets and entity sets to relationships

Two-related entities Some example,

an entity with an attribute

Relationship Types: There are 4 types of relationships

One to One: An entity in A is associated with at most one entity in B, and an entity in B is associated with at
most one entity in A.

In one-to-one relationship, two tables are associated in such a way that each record in first table can have only one matching record in second table, and each record in second table can have only one matching record in first table. Consider the relationship between countries and their capitals given below.

One to Many: An entity in A is associated with any number (zero or more) of entities in B. An entity in B,
however, can be associated with at most one entity in A.

In one-to-many relationship, two tables are associated in such a way that each record in the first table can have many matching records in second table, but a record in second table has only one matching record in first table. Roll No Name City Roll No Phone No 1 Shahbaz Lahore 1 7534313 2 Rashid Multan 1 0391643569 3 Mukaram Karachi 2 7597696 4 Shakeel Faisalabad 2 0303030303 10

Nusrat

Sialkot

Student Table

3 3 4 4

7592077 09798303269 77411198 +4466596565

Student Phone Table

Many to One: An entity in A is associated with at most one entity in B. An entity in B, however, can be
associated with any number (zero or more) of entities in A.

Many to Many: An entity in A is associated with any number (zero or more) of entities in B, and an entity in B is
associated with any number (zero or more) of entities in A.

In many-to-many relationship, two tables are associated in such a way that each record in the first table can have many matching records in second table, and a record in second table can have many matching records in first table

11

You might also like