Wednesday 21 May 2014

Data modeling and data models

The below post is notes prepared by me by studying the book "Database Systems Design, Implementation and Management" by Peter Rob and Carlos Coronel
Content, examples and diagrams are taken from that book.

UNIT-I Chapter-II
Data modeling and data models


Q: What is a data model?
Ans: Data model is blue print containing all the instructions to build a database that will meet all the end –user requirements.
This blue print contains both text descriptions in plain, unambiguous language and clear useful diagrams depicting the main data elements.


Q: Explain The importance of data models?
Ans:Data models are communication tool that enables interaction among the designer, the application programmer and end user.
Data models are used to represent real world data and how the different degrees of data abstraction enables data modeling.
Ex: a house blue print is an abstraction; you cannot live in a blue print, Similarly the data model is an abstraction, you cannot draw the required data out of the data model. As you cannot build a perfect house without blue print, you cannot create a good database without creating an appropriate data model.


Q:What are Business Rules?
Ans: Business rule is a description of policy, procedure within a specific organization. Properly written business rules are used to define entities, attributes, relationships and constraints.
Example1: Consider 2 business rules
  • A customer may generate many invoices.
  • An invoice is generated by only one customer.
These business rules establish 2 entities (CUSTOMER and INVOICE) and a 1:M relationship.
Example 2: A business rule is as follows
  • A training session cannot be scheduled for <10 employees or for >30  employees
This rule establishes a constraint (not <10 employees or for >30  employees) , two entities (EMPLOYEE and TRAINING) and a relationship between these entities.


Q: How to Discover Business Rules
Ans: The main sources of business rules are
  • company managers,
  • policy makers,
  • department managers and
  • written documentation such as a company’s procedures, standards or operation manuals,
  • direct interviews with end users.
The process of identifying and documenting business rules is essential to database design for several reasons.
  • They help standardize the company’s view of data
  • They allow designer to develop  relationship participation rules and constraints and to create an accurate data model.


Q: Why not all business rules can be modeled?
Ans: For ex: No pilot can fly more than 10 hours within 24- hour period .
Such a business rule can be enforced by application software and not by database design.
Q: Explain about hierarchical model?
Ans: The hierarchical structure contains levels or segments.
Within the hierarchy, the top layer (also called root) is the parent of the segment directly beneath it.
Advantages:
  1. It promotes data sharing.
  2. Parent child relationship promotes conceptual simplicity and data integrity.
  3. Database security is provided and enforced by DBMS.
  4. It is efficient with 1:M relationships.


Disadvantages:
    • Complex to implement and difficult to manage as it requires knowledge of physical data storage characteristics.
    • Changes in structure require changes in all application programs.
    • Can implement only 1:M relationships. So it has implementation limitations.
    • No standards.


This technology is best applied when conceptual model also resembles a tree and most data accesses begin with the same root file.


Q: Explain about network model?
Ans: Network model allows a record to have more than one parent.
Advantages:
  • It can handle M:N and multi parent relationship types.
  • Data access is more flexible
  • There are standards defined to implement this model.
  • It includes DDL and DML commands in DBMS


Disadvantages:
  • Little data independence.
  • Changes in structure requires changes in all application programs.


Q: What is CODASYL and DBTG?
Ans: To help establish database standards, the conference on data systems languages (CODASYL) created Database Task Group (DBTG) in late 1960s.


The final DBTG report contained specifications for 3 crucial database components.
The schema is the conceptual organization of the entire database as viewed by DBA
The subschema defines the portion of the database as seen by the application programs.
The application programs invoke the subschema required to access the appropriate database file.
A data management language that defines the environment in which data can be managed.



Q: Explain about The Relational Model ?
Ans: Here tables are called as “Relations”
Rows are called “Tuples” and column names as “attributes”.
Every attribute has a domain. A domain is set of permissible values that can be given to an attribute.
A common attribute existing in any two tables creates a relationship between the tables.
It supports  relationship types (1:1, 1: M or M: N)


The RDBMS manages all the physical details, while the user sees the relational database as collection of tables. (it enables you to view data logically rather than physically.)
The RDBMS uses SQL to translate user queries into instructions for retrieving the required data. The SQL engine executes all queries.


Advantages
  • Promotes data and structural independence.
  • Tabular view improves conceptual simplicity.
  • Adhoc query capability is based on SQL
  • RDBMS isolates end user from physical level details.


Disadvantages:
  • RDBMS requires substantial hardware and software overhead.
  • Conceptual simplicity gives untrained people the tools to use good system poorly.
  • It may produce islands of information problems as individuals and departments can easily develop their own applications.


Q: Explain about The Entity Relationship Model?
Ans: ER models are normally represented in an entity relationship diagram (ERD)
The ER model is based on the following components:
Entity: Entity is anything about which data are to be collected and stored
Attribute: Attributes are characteristics of entities.
Relationship:A relationship is an association between entities.
Advantages:
  • Visual modeling yields conceptual simplicity.
  • Visual representation makes it an effective communication tool.
  • It can be integrated with dominant relational model.
Disadvantages
  • There is limited constraint representation.
  • There is limited relationship representation.
  • There is no data manipulation language.


Q: Explain the various notations used with ERDs ?
Ans: The various notations used with ERDs are
  • The chen notation favors conceptual simplicity.
  • The crow’s foot notation favors implementation – oriented approach.
  • The UML notation can be used for both conceptual and implementation modeling.


Q: Explain about Object Oriented model?
Ans: In this model both the data and their relationships are contained in a single structure known as an Object.
Object includes information about relationships between facts within the object and relationships with other objects.
The OODM is the basis of OODBMS
The OODM is said to be semantic data model because semantic indicated meaning.
The object oriented data model is based on the following components
  • An object is an abstraction of a real-world entity.
  • Attributes describe the properties of an object.
  • Objects that share similar characteristics are grouped in classes.
  • A class is a collection of similar objects with shared structure (attributes)  and behaviour (methods)  (where as entities do not have methods)
  • Classes are organized in class hierarchy (which represents an upside – down tree in which each class has only one parent)
  • Inheritance is the ability of an object within class hierarchy to inherit the attributes and methods of the classes above it.
Object oriented data models are depicted using UML diagrams.


Advantages:
  • Semantic content is added
  • Visual representation includes semantic content.
  • Inheritance promotes data integrity.
Disadvantages:
  • No widely accepted standard.
  • It is a complex navigational system.
  • There is a steep learning curve.
  • High system overhead slows transaction.


Q) Distinguish between Logical and Physical data independence.
Logical Data Independence:
Logical data independence is the ability to modify the conceptual schema without having alteration in external schemas or application programs. Alterations in the conceptual schema may include addition or deletion of fresh entities, attributes or relationships and should be possible without having alteration to existing external schemas or having to rewrite application programs.
Physical Data Independence:
Physical data independence is the ability to modify the internal schema without having alteration to the conceptual schemas or application programs. Alteration in the internal schema might include.
* Using new storage devices.
* Using different data structures.
* Switching from one access method to another.
* Using different file organizations or storage structures.
* Modifying indexes.

Explain about the Conceptual, Internal and external and Physical Model
(Or)
Explain about different levels of data abstraction
(Or)
Explain about three schema architecture.
Ans:
The Conceptual Model
    1. The conceptual model represents a global view of the organization’s data as viewed by all end-users of an information system.
    2. It describes all entities and their attributes, the relationships among these entities and the constraints on these relationships.   
    3. The conceptual model forms the basis for the conceptual schema - a description of the database structure.
    4. The conceptual model is independent of both software (DBMS and OS) and hardware.
    5. The E-R model is the most widely used to represent conceptual model
The Internal Model
    1. The internal model adapts the conceptual model to a specific DBMS (e.g., hierarchical, network, and relational).
    2. The internal model is software-dependent but hardware-independent.
    3. Development of the internal model is especially important to hierarchical and network database models because the data access efficiencies in these models can be affected by improper design.


The External Model
    1. The external model is the end user’s/ applications programmer’s view (local view)of the database environment.
    2. It is concerned about a specific business operation.
    3. It is implemented through the CREATE VIEW command in SQL.


Benefits of the external model
    • Application program development is simplified because the programmer does not have to be concerned about data not relevant to his/her application.
    • Communication with the end-user is simplified.
    • Identification of data required to support each business unit’s operation is simplified.
    • Access control and security can be easily implemented.




The Physical Model
    • The physical model operates at the lowest level of abstraction, describing the way data is stored on storage media such as disks or tapes.
    • It requires the definition of both the physical storage devices and the access methods required to reach the data within those storage devices.
    • The physical model is both software and hardware-dependent.


No comments:

Post a Comment