Database Management System(DBMS)
A complete notes on DBMS for BCA,BIT and BE students. Topics: Database, Database management system, Database Models ,ER diagram ,File Oriented Approach, Database approach
Database Management System
Introduction to DBMS:
Database is a structured collection of records or data that is stored in a computer system. It is a repository or collection of logically related and similar data. Database stores similar kinds of data that is organized in a manner that the data can be modified, added, delete, and information can be derived from it and use when it needed.
In order to have highly efficient database system, you need to incorporate a program that manages the queries and information stored on the system is called Database Management System (DBMS).
Example of real life databases are:
Dictionary: A database of words organized alphabetically along with meaning.
Telephone Directory: A database of telephone number and address.
Railway Timetable: A database of train organized by train name and time.
In early days, data was stored in files. For an application multiple files are required to be created. Each files stores and maintains its own related data. For example, a student information system would include files like student profile, course, fee etc.
There are many drawbacks of using the file system.
Data Redundancy: storing same data in multiple locations or files.
Data inconsistency: Replication of data in more than one file cause data integrity problem. For example, if a student’s address is stored in four different files, an address change need to be updated in each file separately.
The files in which the data is stored can have different file formats.
Database Oriented Approach
Database approach provides solutions for handling the problems of the file system approach. In this approach data are stored in central database and can access those data from different place.
The main characteristics of database approach are
- Data Redundancy is minimized.
- Data Inconsistency is reduced.
- Data is shared.
- Data is independence.
- Data integrity is maintained.
- Data security is maintained.
- Backup and recovery is support.
The information stored inside a database is represented using data modeling. The data model describes the structure of the database. A data model consists of components for describing the data, the relationship among them, and the semantics (relationship between words, phrases, signs) of data and the constraints (limitation or restriction) that hold data. A data model is an abstract model that describes how the data is represented and use.
The types of database models are described below:
Entity-Relationship (E-R) Model:
It is the real world model which represents the entities constraints in the database. The entities are further described in the database using attributes. The relation between entities is shown using the relationship. E-R model is represents diagrammatical using an E-R diagram.
Entity: An entity is the basic unit for modeling. It is a real world objects that exists physically which are tangible like students, employees, machines etc. or conceptually which are intangible like an event, job title etc.
For example, a student information system may consist of entities like student_profile, marks, course ect.
Attributes: An attribute describes some properties or characteristics of the entity. For example, student_profile may consist of attributes like student_name, student_address, student_age etc.
Relationship: An association or link between two entities is represented using a relationship. For example, student enroll in course is a relationship set between entities students and course.
There are four kinds of relationships.
One-to-One: An entity of A is linked with at most one entity of B and vice versa.
One-to-Many: An entity of A is liked with many entities of B, but an entity of B is liked with at most one entity of A.
Many-to-One: Many entities of A are liked with at most one entity of B, but an entity of B is liked with many numbers of entities of A.
Many-to-Many: An entity of A is linked with many numbers of entities of B and vice versa.
Fig: E-R Database Model
Here, Phone_No is multivalued entity which means one student can have multiple phone numbers. Age is the derived attribute which derived from DOB attribute.
Relational Database Model:
It is the most common types of database model in which data are arranged in two dimensional tables which are made of columns and rows. Table, record, field, key, and data values are the terms associated with a relational model. In this model each column represents a field, also called attribute and each row represents a record. The relationship between the two tables is implemented through a common attribute in the table not by physical links or pointers.
Several commercial products like DB2, ORACLE, SQL Server, SYBASE are relational database.
Fig: Relational Database Model
Hierarchical Database Model:
It developed by IMB and is the oldest database model. This model helps to establish a logical relationship among various data elements of multiple files and arrange the elements in a hierarchy and structure is like tree structure. In this model each record on one level can be related to multiple records on the next lower level. A record that has subsidiary record is called parent and the subsidiary record is called children.
Fig: Hierarchical Database Model
Network Database Model:
It is the modified version of hierarchical model. In this model, each record in the database can have multiple parents. The relationship among data elements can have many-to-many.
Fig: Network Database Model
Structure Query Language is a database computer language designed for managing data in relational database management system (RDBMS), and originally based upon relational algebra. It is used to creating and manipulation database. SQL works along with database programs such as DB2, Oracle, Informix, MS Access, SM SQL Server etc.
There are basic four operations in the SQL as select, update, insert, and delete. The select statement allows users to query the database for specific information. Update, insert and delete allow users to update the data, insert the data and delete the data from the table respectively.
We have database name TestDB and Table name Customers
SELECT* From Customers;
|1||Ram Krishna||Sita Devi||Kageshwori||KTM||44600||Nepal|
|2||Avisek Thakur||Naryan Karki||Gokarna||KTM||44621||Nepal|
|3||Nishan Niraula||Krish Rai||Katunje||BKT||44623||Nepal|
SELECT* From Customers where CustomerID=2;
|2||Avisek Thakur||Naryan Karki||Gokarna||KTM||44621||Nepal|
SELECT CustomerName, Country, PostalCode FROM Customers where CustomerID=2;
Feature of SQL:
- SQL is an English-like language. It use words such as select, insert, delete, update as part of its command set.
- SQL is non-procedural language.
- SQL processes set of records like table rather than a single record at a time.
- SQL provides command for variety of task including:
- Querying data
- inserting, updating, deleting rows in a table
- Creating, modifying and deleting database objects
- Controlling access to the database and database object
- Guaranteeing database consistency
Advantage of SQL:
- No codding needed: it is easy to manage database without any need of write program code.
- Well defined standards.
- SQL can be used in the PCs, servers, laptops, and even some of the mobile phones.
- This is used for communicating with the database and receives answers to the complex questions in a second.
- With the help of SQL language, the users can make different views of database structure.
Disadvantage of SQL:
- SQL has a complex interface that makes it difficult for some user to access it.
- The programmers who use SQL don’t have a full control over the database because of the hidden business rules.
Database Design and Data Security:
Database design provides a mean to represent the real world entities in a form that can be processed by computer. Database model present the process of abstracting real world entities and store in to database representation. They give us methods to capture the static and dynamic attributes of real entities. For example, if we want to capture the attributes of human beings, then we can scan this photograph, finger-print, retina images, get his details like name, address, age, gender, phone number, height, weight, DOB, blood group, profession, etc. But all of these information may not be required.
To develop a good database design, designer has to understand the meaning of information and the use of it. Once we develop the understanding and have identified the use of information in the application, we can determine how much and what kind of information we required.
Fig: Static and Dynamic Information
We could also determine in what format these information should be captured and represented in the database. During the determination of application’s information requirement, we can find what kind of data is essential and desired for application and which one is not needed at all.
Database design has following benefits:
- They provide a means to represent real world objects in computer usable form.
- They capture and represent association and relationship among the real world objects.
- They allow the database designer to capture the static and dynamic organization and flow of information within a modeled enterprise.
- They help in improving the maintainability, scalability (capability to upgrade on demand), and reliability of the system.
Data security refers to protective digital privacy measures that are applied to prevent unauthorized access to computers, database, and websites. It is a process of protecting files, database, accounts, on a network by adopting a set of controls, applications, and techniques that identify the relative importance of different datasets, their sensitivity, regulatory compliance requirements and then applying appropriate protections to secure these resources. Most business organization has massive amounts of data or information which may want to keep confidential and protected from other.
Data security is very important or critical for most companies and businesses. Besides formal organizations it is also very critical for home computer users as well. Information like client detail, bank detail, account detail, personal files etc. must be well protected for everyone because if it gets into wrong hands, it can be miss use, the personal reputation may be affected.
Now a days, threats and hackers are increasing rapidly. If once the damage is happens it’s too expensive and complicated to fix it and get back the original data and information that’s why data security is very important.
Some essential ways to secure data are:
- Establish Strong Password
- Strong Firewall security
- Antivirus Protection
- Secure Systems such as PCs, Laptops
- Secure Mobile Phones
- Backup Data Regularly
- Periodic Monitor of Data
- Surf Safely: don’t click unwanted link on the internet
A data warehouse (DW) is a collection of corporate information and data derived from operational systems and external data sources. A data warehouse is designed to support business decisions by allowing data consolidation, analysis and reporting at different aggregate levels. Data is populated into the DW through the processes of extraction, transformation and loading.
In a data warehouse, data from many heterogeneous sources is extracted into a single area, transformed according to the decision support system needs and stored into the warehouse. For example: company stores information pertaining to its employees, their salaries, developed products, customer information, sales and invoices. The CEO might want to ask a question pertaining to the latest cost-reduction measures; the answers will involve analysis of all of this data. This is a main service of the data warehouse, i.e., allowing executives to reach business decisions based on all these disparate raw data items.
Thus, a data warehouse contributes to future decision making. As in the above example, a firm administrator can query warehouse data to find out the market demand of a particular product, sales data by geographical region or answers other inquiries. This provides insight about required steps to more effectively market a particular product. Unlike an operational data store, a data warehouse contains aggregate historical data, which may be analyzed to reach critical business decisions. Despite associated costs and effort, most major corporations today use data warehouses.
Fig: Data warehouse
It provides the multidimensional view of consolidated data in a warehouse. Additionally, the data warehouse environment supports ETL (Extraction, Transform and Load) solutions, data mining capabilities, statistical analysis, reporting and OLAP (Online Analytical Processing) Tools, which help in interactive and efficient data analysis in a multifaceted view.
The major characteristics of data warehouse are:
Subject Oriented: it deals with subject area of organization. Example: an insurance company uses a data warehouse organize their data by customer, premium, and claim etc.
Integrated: data warehouse receives data from different resources. Filtering and transforming data from different resources into one consistent database is known as integration.
Non-Volatile: data are updated frequently. Data warehouse can save data for long run.
Time-variant: data can store for long time so, later older data can user for forecasting.
There are three main components of data warehousing:
- A data integration layer that extracts data from operational systems, such as Excel, ERP, CRM or financial applications.
- A data staging area where data is cleanup and organized.
- A presentation area where data is warehoused and made available for use.
Data Warehouse are used for
- Carrying out data mining to gain new insights from the information held in many large databases
- Conducting market research by analyzing large volumes of data in-depth
- An online business analyzing user behavior to make business decisions
Application areas of data warehouse are:
Here, are most common sectors where Data warehouse is used:
In the Airline system, it is used for operation purpose like crew assignment, analyses of route profitability, frequent flyer program promotions, etc.
It is widely used in the banking sector to manage the resources available on desk effectively. Few banks also used for the market research, performance analysis of the product and operations.
Healthcare sector also used Data warehouse to strategize and predict outcomes, generate patient's treatment reports, share data with tie-in insurance companies, medical aid services, etc.
In the public sector, data warehouse is used for intelligence gathering. It helps government agencies to maintain and analyze tax records, health policy records, for every individual.
Investment and Insurance sector:
In this sector, the warehouses are primarily used to analyze data patterns, customer trends, and to track market movements.
In retail chains, Data warehouse is widely used for distribution and marketing. It also helps to track items, customer buying pattern, promotions and also used for determining pricing policy.
A data warehouse is used in this sector for product promotions, sales decisions and to make distribution decisions.
This Industry utilizes warehouse services to design as well as estimate their advertising and promotion campaigns where they want to target clients based on their feedback and travel patterns.
Data mining is the process of analyzing hidden patterns of data according to different perspectives for categorization into useful information, which is collected and assembled in common areas, such as data warehouses, for efficient analysis, data mining algorithms, facilitating business decision making and other information requirements to ultimately cut costs and increase revenue. Data mining is also known as data discovery and knowledge discovery.
Following figure illustrates the necessary stages in data mining.
Fig: Stages in Data Mining
- Selection: in the initial stage target data is selecting and segmenting according to some criteria e.g. all those people who own a car.
- Preprocessing: this is the data cleansing stage where certain information is removed which is unnecessary. Eg: remove the gender of patient when studying pregnancy.
- Transformation: In this process, data is transformed into a form suitable for the data mining process. Data is consolidated so that the mining process is more efficient and the patterns are easier to understand.
- Data Mining: Data Mining is a process to identify interesting patterns and knowledge from a large amount of data. In these steps, intelligent patterns are applied to extract the data patterns. The data is represented in the form of patterns such as language L, facts F, statement S etc. and models are structured using classification and clustering techniques.
- Interpretation: The patterns identified by the system are interpreted into knowledge which can then be used to support human decision-making.
There are four main objective of data mining as:
- Sequence or path analysis: Finding patterns where one event leads to another, later event.
- Classification: Finding whether certain facts fall into predefined groups.
- Clustering: Finding groups of related facts not previously known.
- Forecasting: Discovering patterns in data that can lead to reasonable predictions.
Database Administration consists of everything required to manage a database and make it available as needed. The database administrator (DBA) is the person who manages, backs up and ensures the availability of the data produced and consumed by today’s organizations via their IT systems. The DBA is a critically important role in many of today’s IT departments, and by extension, their organizations overall.
For example, consider a commercial bank: the DBA is the person who ensures that the bank teller has easy, fast access to your information, and can quickly access your bank balance and transaction history. In this example, the DBA is a system or application-database administrator—a general DBA role responsible for most aspects of the organization’s databases. However, this is not the only kind of DBA.
A database administrator's responsibilities may include:
- Installing and upgrading the database server and/or application tools.
- Planning for and allocating the database system’s physical requirements, such as memory, disk space, network requirements, etc.
- Modifying the database structure using information provided by application developers.
- Creating users profiles, and ensuring system security by careful allocation of user permissions.
- Ensuring compliance with database vendor license agreement, including number of installations, and taking care of licensing renewals.
- Creating a backup and recovery strategy for the database, and regularly testing the backups to ensure usability.
- Monitoring technical support for both database systems and related applications.
- Creating reports by querying from database (as per need). These reports can be in the form of pre-formatted reports using the application frontend, or custom-made ad hoc reports by the database administrator.
- Monitoring and optimizing the database’s performance using either manual or automated tools. This may be a DBA’s most important tasks.
- Migrating database instances to new hardware and new versions of software from on premise to cloud based databases and vice versa.
Special Thanks : Lec. Pratik Chand, Morgan Int’l College