Thursday, July 30, 2009

Physical Database Design-Purpose of the physical database design

- Process of producing a description of the implementation of the database on secondary storage; it describes the base relations, file organizations, and indexes used to achieve efficient access to the data, and any associated integrity constraints and security measures.
- Does not include implementing files and databases ( i.e. creating them and loading data into them)
- It produce technical specifications use during the implementation phase
- The database can be single centralized or multiple distributed sites (here single centralized only)
- Physical database design is critical. Decisions made during this stage have a major impact on data accessibility, response times, securing and user friendliness
- Database administration plays a major role in physical database design.
- In most situations, many physical database design decisions are implicit or eliminated when you choose the database management technologies.
- Since many organizations have standards for OS, DBMS, and data access languages, you must deal only with those choices.
- The primary goal of physical database design is data processing efficiency.
- With ever-decreasing costs for computer technology (both speed and space), it is important for you to design the physical database to minimize the time required by users to interact with the IS
- Designing physical files and databases requires certain information:
- Normalized relations, including volume estimates
- Definitions of each attribute
- Descriptions of where and when data are used: entered, retrieved, deleted, and updated (including frequencies)
- expectations or requirements for response time and data security, backup, recovery, retention, and integrity
- descriptions of the technologies (DBMS) used for implementing the database Physical database design requires several critical decisions that will affect the integrity and performance. These key decisions include:
- Choosing the storage format (called data type) for each attribute from the logical data model. The format is chosen to minimize storage space and to minimize data integrity
- Grouping attributes from the logical data model into physical records
- Arranging similarly structured records in secondary memory (hard disk or any other means) so that individual or group of records (file organizations) can be stored, retrieved, and updated rapidly.
Consideration must be given to protecting and recovering data after errors are found
- Selecting structures (called indexes and database architectures) for storing and connecting files to make retrieving related data more efficient
- Preparing strategies for handling queries against the database that will optimize performance and take advantage of the file organization and indexes

Sunday, July 19, 2009

Relational Algebra Introduction

Relational Query Languages

Query languages: Allow manipulation and retrieval of data from a database.
- Relational model supports simple, powerful QLs:
• Strong formal foundation based on logic.
• Allows for much optimization.
- Query Languages != programming languages!
• QLs not intended to be used for complex calculations.
• QLs support easy, efficient access to large data sets.

Formal Relational Query Languages

Two mathematical Query Languages form the basis for “real” languages (e.g. SQL), and for implementation:

Relational Algebra: More operational, very useful for
representing execution plans.
Relational Calculus: Lets users describe what they
want, rather than how to compute it. (Non-procedural,declarative.)

Note: Understanding Algebra & Calculus is key to
understanding SQL, query processing!

Preliminaries

A query is applied to relation instances, and the
result of a query is also a relation instance.
• Schemas of input relations for a query are fixed (but
query will run over any legal instance)
• The schema for the result of a given query is also
fixed. It is determined by the definitions of the query
language constructs.

Relational Algebra: 5 Basic Operations

- Selection ( ): Selects a subset of rows from relation
(horizontal).


- Projection ( ): Retains only wanted columns from relation
(vertical).
-
Cross-product ( ): Allows us to combine two relations.


- Set-difference ( — ): Tuples in r1, but not in r2.

- Union ( ): Tuples in r1 or in r2.

Wednesday, July 15, 2009

The design process in DBMS Further

Storing information about

- In this example the file is storing information about:
• the staff members who attending courses
• the departments in which those staff members work
• the courses that are being offered
• the actual attendance of an employee on a course

- the above problems could be avoided if we can keep the
information about each of these things separately, and then
make cross-references where required (relational database approach)


Relational database approach

- In a relational database, the data values held are
stored in tables (an arrangement of rows and columns, just like a spread sheet)

- one of the main advantage is its ability to set up and automatically maintain cross-references
(relationships) between the tables.
- the structure that hold the data is called a ‘relation’. Common term is table. A row is called a
‘tuple’ and a column is called an ‘attribute’.

Dr.E.F.Codd invented these new words.

A systematic approach to DB design

- implementing large information system can be extremely complicated
- break down the work into series of steps, and each step into manageable tasks.
- the basic steps to be followed during the design and construction are as follows.
• Decide what is the purpose of the DB
• Determine the information requirements of the DB
• Produce a logical model of the information requirements of DB
• Convert the logical data model to a physical data model appropriate to the technology being used
• Implement the physical design using the chosen method or product

Friday, July 10, 2009

Learn DBMS-Purpose of Database System

What is usefulness of the DBMS?

- In the early days, database applications were built on top of file systems

- Drawbacks of using file systems to store data:

Data redundancy and inconsistency
• Multiple file formats, duplication of information in different files

Difficulty in accessing data
• Need to write a new program to carry out each new task
Data isolation — multiple files and formats
Integrity problems
• Integrity constraints (e.g. account balance > 0) become part of program code
• Hard to add new constraints or change existing ones

Atomicity of updates
• Failures may leave database in an inconsistent state with partial updates carried out
- E.g. transfer of funds from one account to another should either complete or not happen at all

Concurrent access by multiple users
• Concurrent access needed for performance
• Uncontrolled concurrent accesses can lead to inconsistencies
- E.g. two people reading a balance and updating it at the same time

Security problems
- Database systems offer solutions to all the above problems

Monday, July 6, 2009

Logical Database Design and the Relational Model in DBMS

What is Relational Model - Basic Definitions
- relational model represent the data in the form of tables
- based on mathematical theory
- consists of three components
Data structure: Data are organized in the form of tables with rows and columns
Data manipulation: powerful operations (using SQL) are used to manipulate data stored in the relations Data integrity: facilities are included to specify business rules

What is Relational Data Structure
- a relation is a named, two-dimensional table of data
- can express the structure of a relation by a shorthand notation in which name of the relation is followed (in parentheses) by the name of the attributes in that relation
- e.g. EMPLOYEE(Emp_ID,Name,Dept_Name,Salary)
- Relational Keys: every relation must have a primary key
- e.g. EMPLOYEE(Emp_ID,Name,Dept_Name,Salary)
- a composite key is a primary key that consists of more than one attributes
-E.g the primary key for a relation DEPENDENT would likely consists of the combination of Emp_Id and Dependent_Name
- A foreign key : is an attributes (possibly composite) in a relation of a database that serve as the primary key of another relation in the same database.
-E.g EMPLOYEE(Emp_ID,Name,Dept_Name,Salary), DEPARTMENT(Dept_Name,Location,Fax)

learn Properties of Relations
- not all the tables are relations
- relations have several properties
- unique name
- an entry at the intersection of each row and column is atomic (or single valued)
- each row is unique
- each attribute within a table has a unique name
- the columns of a relation can be interchanged without changing the meaning or use of the relation
- the sequence of rows (top to bottom) is insignificant (can be interchanged)

Example Database
- A relational database consists of any number of relations
- The structure of the database is described through the use of conceptual Schema
- Two common methods for expressing a (conceptual) schema
- Short text statements: each relation is named and the names of its attributes follow in parentheses
- A graphical representation: each relation is represented by a rectangle containing the attributes for the relation
Text statements have the advantage of simplicity, while the graphical representation provides a better means of expressing referential integrity constraints.




Text description for the relations

CUSTOMER(Customer_ID,Customer_Name,Address,City,State, Zip)
ORDER(Order_ID,Order_Date,Customer_ID)
ORDERLINE(Order_ID,Product_ID,Quantity)
PRODUCT(Product_ID,Product_Description,Product_Finish,Unit_Price,On_H and)
- The primary key for the ORDERLINE is a composite key.
- Customer_ID is a foreign key in the ORDER relation; this allows the user to associate an order with the customer who submitted the order.
- ORDERLINE has two foreign keys; Order_ID and Product_ID; these keys allow the user to associate each line on an order with the relevant order and product