Author: saqibkhan

  • Generalization Aggregation

    The ER Model has the power of expressing database entities in a conceptual hierarchical manner. As the hierarchy goes up, it generalizes the view of entities, and as we go deep in the hierarchy, it gives us the detail of every entity included.

    Going up in this structure is called generalization, where entities are clubbed together to represent a more generalized view. For example, a particular student named Mira can be generalized along with all the students. The entity shall be a student, and further, the student is a person. The reverse is called specialization where a person is a student, and that student is Mira.

    Generalization

    As mentioned above, the process of generalizing entities, where the generalized entities contain the properties of all the generalized entities, is called generalization. In generalization, a number of entities are brought together into one generalized entity based on their similar characteristics. For example, pigeon, house sparrow, crow and dove can all be generalized as Birds.

    Generalization

    Specialization

    Specialization is the opposite of generalization. In specialization, a group of entities is divided into sub-groups based on their characteristics. Take a group Person for example. A person has name, date of birth, gender, etc. These properties are common in all persons, human beings. But in a company, persons can be identified as employee, employer, customer, or vendor, based on what role they play in the company.

    Specialization

    Similarly, in a school database, persons can be specialized as teacher, student, or a staff, based on what role they play in school as entities.

    Inheritance

    We use all the above features of ER-Model in order to create classes of objects in object-oriented programming. The details of entities are generally hidden from the user; this process known as abstraction.

    Inheritance is an important feature of Generalization and Specialization. It allows lower-level entities to inherit the attributes of higher-level entities.

    Inheritance

    For example, the attributes of a Person class such as name, age, and gender can be inherited by lower-level entities such as Student or Teacher.

  • Weak Entity Types

    In relational database mapping, we study different types of entity sets. There are two major types of entities: strong entities and weak entities. Understanding weak entity types is essential for grasping how relationships are organized and represented in a database. Unlike strong entities, weak entities cannot exist independently—they rely on other entities for their identification and existence.

    In this chapter, we will explore weak entity types, understand why they are important, and illustrate their use with examples for better clarity.

    Understanding Weak Entity Types in Relational Database Mapping

    A weak entity cannot be uniquely identified by its own attributes alone. In contrast, strong entities have a primary key that uniquely identifies each instance. Weak entities, however, depend on another entity, known as the owner entity, for identification. To uniquely identify a weak entity, we must use a combination of its own attributes and the attributes from its owner entity.

    Key Characteristics of Weak Entity Types

    Weak entities have special distinguishing features, such as –

    • No Primary Key − A weak entity lacks a unique primary key of its own.
    • Dependence on Owner Entity − A weak entity exists only through its relationship with a strong (owner) entity.
    • Partial Key − A weak entity has a partial key, a set of attributes that, when combined with the primary key of the owner entity, uniquely identifies the instances of the weak entity.

    The Role of Identifying Relationships

    To function properly within a database, every weak entity must be linked to its owner entity through an identifying relationship. This relationship ensures that the weak entity is connected to the strong entity’s primary key. Identifying relationships are crucial for maintaining the integrity and traceability of weak entities in relational database design. Let’s understand it better with the help of an example.

    Example − In a system that keeps track of a company’s projects, the “Project” is a strong entity type, however the “Project_Department” is a weak entity type.

    The “Project” has a primary key known as “Project_ID.” However, the “Project_Department” entity cannot be identified just by its own attributes; it needs the “Project_ID” to be unique.

    Here, Project (Strong Entity) is the entity type having attributes like “Project_ID,” “Project_Name,” and “Project_Budget.” The “Project_ID” acts as the primary key for this entity.

    Project_IDProject_NameProject_Budget

    Project_Department (Weak Entity) has the attributes of this weak entity that include “Dept_Name” and “Dept_Manager.” However, these attributes alone are not enough to make each “Project_Department” unique across all projects.

    Identifying the Relationship

    In our example, let’s consider “Project_Department” relies on the “Project” entity through an identifying relationship called “Belongs_To”. Now each instance of “Project_Department” must be associated with a specific project, which makes the combination of “Project_ID” and “Dept_Name” the composite key that uniquely identifies the department within that project.

    Importance of Weak Entity Types

    We understood the basics of weak entity types. Let us now try to figure out why we need weak entities in the first place. Weak entities are important for capturing real-world scenarios where certain data cannot stand alone.

    Real-World Constraints

    In real-world database designing, sometimes we need to model scenarios where one entity depends another entity. For example, a project department within a company might exist only within the context of a particular project. And we can feel that it cannot be understood or identified without knowing which project it is associated with. Using a weak entity allows databases to accurately represent such cases.

    Ensuring Data Integrity

    When we work with weak entities, it helps to maintain the integrity of related data. By linking a weak entity to its owner through an identifying relationship, we can make sure that the data is tied to a specific project, person, or event. Every data record has to be correctly associated and maintained. This property is needed for preventing orphan records or data inconsistencies.

    The Concept of Partial Keys

    The idea of partial key is an interesting and quite useful concept for an attribute or set of attributes that uniquely identify a weak entity within the context of its relationship with the owner entity. It is not enough to identify the weak entity on its own. When combined with the primary key of the owner, it serves this purpose.

    partial key refers to an attribute (or a set of attributes) that can uniquely identify a weak entity, but only within the context of its relationship with an owner (strong) entity. On its own, a partial key is insufficient to identify a weak entity. However, when combined with the primary key of the owner entity, it can uniquely identify a weak entity.

    Example − Consider the previously discussed example involving Project_Department. In this case, Dept_Name acts as the partial key. On its own, the Dept_Name does not uniquely identify a department. But when combined with the Project_ID from the associated Project entity, it becomes unique within that context.

    How to Identify Weak Entities in ER Models?

    Now that we understand the concept of weak entities, it’s important to recognize them when designing a database. In Entity-Relationship (ER) diagrams, weak entities can be identified through specific notations:

    • Double Rectangles − In ER diagrams, weak entities are represented using double rectangles.
    • Double Diamonds − Identifying relationships that connect weak entities to their owner entities are shown using double diamonds.
    • Composite Keys − A composite key, which is formed by combining the partial key of the weak entity with the primary key of the owner entity, should be clearly represented in the ER diagram.

    Conclusion

    In this chapter, we touched upon the basics of what weak entity types are and why they are significant in database models. We also understood how these entities differ from strong entities and the role of identifying relationships in connecting them to their owner entities. With the help of an example, we demonstrated how weak entity types work in practice. We also highlighted the basic concepts of partial keys and how they fit into identifying relationships.

  • Weak Entity Types

    In relational database mapping, we study different types of entity sets. There are two major types of entities: strong entities and weak entities. Understanding weak entity types is essential for grasping how relationships are organized and represented in a database. Unlike strong entities, weak entities cannot exist independently—they rely on other entities for their identification and existence.

    In this chapter, we will explore weak entity types, understand why they are important, and illustrate their use with examples for better clarity.

    Understanding Weak Entity Types in Relational Database Mapping

    A weak entity cannot be uniquely identified by its own attributes alone. In contrast, strong entities have a primary key that uniquely identifies each instance. Weak entities, however, depend on another entity, known as the owner entity, for identification. To uniquely identify a weak entity, we must use a combination of its own attributes and the attributes from its owner entity.

    Key Characteristics of Weak Entity Types

    Weak entities have special distinguishing features, such as –

    • No Primary Key − A weak entity lacks a unique primary key of its own.
    • Dependence on Owner Entity − A weak entity exists only through its relationship with a strong (owner) entity.
    • Partial Key − A weak entity has a partial key, a set of attributes that, when combined with the primary key of the owner entity, uniquely identifies the instances of the weak entity.

    The Role of Identifying Relationships

    To function properly within a database, every weak entity must be linked to its owner entity through an identifying relationship. This relationship ensures that the weak entity is connected to the strong entity’s primary key. Identifying relationships are crucial for maintaining the integrity and traceability of weak entities in relational database design. Let’s understand it better with the help of an example.

    Example − In a system that keeps track of a company’s projects, the “Project” is a strong entity type, however the “Project_Department” is a weak entity type.

    The “Project” has a primary key known as “Project_ID.” However, the “Project_Department” entity cannot be identified just by its own attributes; it needs the “Project_ID” to be unique.

    Here, Project (Strong Entity) is the entity type having attributes like “Project_ID,” “Project_Name,” and “Project_Budget.” The “Project_ID” acts as the primary key for this entity.

    Project_IDProject_NameProject_Budget

    Project_Department (Weak Entity) has the attributes of this weak entity that include “Dept_Name” and “Dept_Manager.” However, these attributes alone are not enough to make each “Project_Department” unique across all projects.

    Identifying the Relationship

    In our example, let’s consider “Project_Department” relies on the “Project” entity through an identifying relationship called “Belongs_To”. Now each instance of “Project_Department” must be associated with a specific project, which makes the combination of “Project_ID” and “Dept_Name” the composite key that uniquely identifies the department within that project.

    Importance of Weak Entity Types

    We understood the basics of weak entity types. Let us now try to figure out why we need weak entities in the first place. Weak entities are important for capturing real-world scenarios where certain data cannot stand alone.

    Real-World Constraints

    In real-world database designing, sometimes we need to model scenarios where one entity depends another entity. For example, a project department within a company might exist only within the context of a particular project. And we can feel that it cannot be understood or identified without knowing which project it is associated with. Using a weak entity allows databases to accurately represent such cases.

    Ensuring Data Integrity

    When we work with weak entities, it helps to maintain the integrity of related data. By linking a weak entity to its owner through an identifying relationship, we can make sure that the data is tied to a specific project, person, or event. Every data record has to be correctly associated and maintained. This property is needed for preventing orphan records or data inconsistencies.

    The Concept of Partial Keys

    The idea of partial key is an interesting and quite useful concept for an attribute or set of attributes that uniquely identify a weak entity within the context of its relationship with the owner entity. It is not enough to identify the weak entity on its own. When combined with the primary key of the owner, it serves this purpose.

    partial key refers to an attribute (or a set of attributes) that can uniquely identify a weak entity, but only within the context of its relationship with an owner (strong) entity. On its own, a partial key is insufficient to identify a weak entity. However, when combined with the primary key of the owner entity, it can uniquely identify a weak entity.

    Example − Consider the previously discussed example involving Project_Department. In this case, Dept_Name acts as the partial key. On its own, the Dept_Name does not uniquely identify a department. But when combined with the Project_ID from the associated Project entity, it becomes unique within that context.

    How to Identify Weak Entities in ER Models?

    Now that we understand the concept of weak entities, it’s important to recognize them when designing a database. In Entity-Relationship (ER) diagrams, weak entities can be identified through specific notations:

    • Double Rectangles − In ER diagrams, weak entities are represented using double rectangles.
    • Double Diamonds − Identifying relationships that connect weak entities to their owner entities are shown using double diamonds.
    • Composite Keys − A composite key, which is formed by combining the partial key of the weak entity with the primary key of the owner entity, should be clearly represented in the ER diagram.

    Conclusion

    In this chapter, we touched upon the basics of what weak entity types are and why they are significant in database models. We also understood how these entities differ from strong entities and the role of identifying relationships in connecting them to their owner entities. With the help of an example, we demonstrated how weak entity types work in practice. We also highlighted the basic concepts of partial keys and how they fit into identifying relationships.

  • Relationship Types and Relationship Sets in DBMS

    In relational databases, it is essential to understand the concept of relationships. Relationships are used for managing how the data interacts. Relationships help define how data interacts within the database. The concepts of relationship types and relationship sets are used to structure these interactions, allowing for meaningful and efficient data storage and retrieval.

    In this chapter, we’ll explore these ideas through examples to clarify their purpose and significance.

    Basic Concepts of Relationship Types

    Relationship types refer to the specific ways in which entities relate to each other within a database system. You can think of them as the rules or “kinds” of connections established between different data tables. Each relationship type defines how one entity interacts with another, guiding the structure and logic of data linkage.

    To grasp the basics, consider how relationships in data resemble real-life connections. For example, in a school database, Students and Courses are two separate entities. The relationship between them could be defined as Enrollment. This relationship shows which students are enrolled in which courses. Here, Enrollment is the relationship type that links the Student entity set with the Course entity set.

    Characteristics of Relationship Types

    In relationships, we see several characteristics, as given below:

    • Cardinality − Defines the number of entities involved. These could be one-to-one, one-to-many, or many-to-many relationship.
    • Attributes − Sometimes, relationships have their own attributes. Like, the date of enrollment could be an attribute of the “enrollment” relationship type.
    • Participation Constraints − Specifies whether an entity’s participation in the relationship is total or partial.

    Relationship Sets Explained

    Next we must focus on the relationship sets. A relationship set is essentially a collection of relationships of the same type. It is a specific instance of all possible pairs (or combinations) of entity instances that satisfy the relationship type. If the “enrollment” is our relationship type, then the set of all current enrollments in the school makes up the relationship set.

    Let us see the school database again. If three students, Alice, Bob, and Carlos, are enrolled in two courses, the Math and Science, the relationship set of “enrollment” might look like this:

    • Alice is enrolled in Math
    • Bob is enrolled in Science
    • Carlos is enrolled in both Math and Science

    Here, the set includes these specific instances. The mapping out the active links between the Student and Course entity sets.

    Examples of Relationship Sets

    Consider the “Works_On” relation. It needs the Employee and Project relations.

    Employee Relation:

    FnameMinitLnameSsnBdateAddressSexSalarySuper_ssnDno

    Project Relation

    PnamePnumberPlocationDnum

    Works_On relation

    EssnPnoHours

    Attributes − The relationship type “Works_On” have attributes like “Hours,” this is indicating how many hours an employee works on a specific project.

    Relationship Set − Now consider John, who works 20 hours on Project X and 10 hours on Project Y. Jane, on the other hand, spends 15 hours on Project X and 25 hours on Project Z. So, the set of these instances forms the relationship set for “Works_On.”

    Examples of Relationship Sets

    Degree of a Relationship Type

    The term “degree of a relationship type” indicates the number of participating entity types. This section covers the most common degrees: binary, ternary, and higher-order relationships.

    Binary Relationships

    Binary relationships, as the name suggests, needs two entity types. The “Works_On” example is a classic case of a binary relationship, as it connects “Employee” and “Project”.

    Let’s take another example. Suppose in a school database, a “Teaches” relationship between the entity types “Teacher” and “Subject” is a binary relationship.

    Ternary and Higher-Degree Relationships

    Relations could be more expanded. Ternary relationships involve three entity types. For instance, if a company needs to model which “Employee” works on which “Project” and for which “Client,” is another relation. It is a ternary relationship.

    In higher-degree relationships, complexity increases. We can imagine a four-way link connecting the relations “Employee,” “Project,” “Client,” and “Location”.

    Attributes of Relationship Types

    As we know, the columns are attributes and they do not just belong to entities. The relationship types can have their own attributes that provide more detail.

    Attributes in “Works_On”

    As we have seen in the “Works_On” relationship, the attributes called “Hours” are important. They indicate how much time an employee dedicates to a project. If John works 30 hours on Project Z, the attribute “Hours” would hold that data in the relationship set entry linking John and Project Z.

    Mapping Cardinalities

    Mapping cardinalities indicate the number of associations between entities. Mapping cardinalities play an important role in accurately describing real-world constraints.

    • One-to-One (1:1) − A one-to-one relationship means one entity in set A is related to one entity in set B. For example, each “Principal” (entity type) will have only one “School” (another entity type).
    • One-to-Many (1 : N) − A one-to-many relationship means one entity in set A is connected to multiple entities in set B. In the eBook example, each “Employee” may work on several “Projects.”
    • Many-to-Many (M : N) − Many-to-many relationships mean entities from both sets can be connected in numerous ways. The “Works_On” relationship type is a perfect example. Employees can work on different projects, and projects can have multiple employees.

    Conclusion

    In this chapter, we explained in detail the concepts of “relationship types” and “relationship sets”. We understood their importance in databases, and how they illustrate connections between data. We also explored the basic definitions, learned how to represent the relationships in ER diagrams.

    In addition, we touched upon the concept of attributes and degrees of relationship types, before concluding the chapter with a brief introduction of “mapping cardinalities” and covering how these concepts work in real-world database modeling.

  • ER Diagram Representation

    Let us now learn how the ER Model is represented by means of an ER diagram. Any object, for example, entities, attributes of an entity, relationship sets, and attributes of relationship sets, can be represented with the help of an ER diagram.

    Entity

    Entities are represented by means of rectangles. Rectangles are named with the entity set they represent.

    Entities in a school database

    Attributes

    Attributes are the properties of entities. Attributes are represented by means of ellipses. Every ellipse represents one attribute and is directly connected to its entity (rectangle).

    Simple Attributes

    If the attributes are composite, they are further divided in a tree like structure. Every node is then connected to its attribute. That is, composite attributes are represented by ellipses that are connected with an ellipse.

    Composite Attributes

    Multivalued attributes are depicted by double ellipse.

    Multivalued Attributes

    Derived attributes are depicted by dashed ellipse.

    Derived Attributes

    Relationship

    Relationships are represented by diamond-shaped box. Name of the relationship is written inside the diamond-box. All the entities (rectangles) participating in a relationship, are connected to it by a line.

    Binary Relationship and Cardinality

    A relationship where two entities are participating is called a binary relationship. Cardinality is the number of instance of an entity from a relation that can be associated with the relation.

    • One-to-one − When only one instance of an entity is associated with the relationship, it is marked as ‘1:1’. The following image reflects that only one instance of each entity should be associated with the relationship. It depicts one-to-one relationship.
    • One-to-many − When more than one instance of an entity is associated with a relationship, it is marked as ‘1:N’. The following image reflects that only one instance of entity on the left and more than one instance of an entity on the right can be associated with the relationship. It depicts one-to-many relationship.
    • Many-to-one − When more than one instance of entity is associated with the relationship, it is marked as ‘N:1’. The following image reflects that more than one instance of an entity on the left and only one instance of an entity on the right can be associated with the relationship. It depicts many-to-one relationship.
    • Many-to-many − The following image reflects that more than one instance of an entity on the left and more than one instance of an entity on the right can be associated with the relationship. It depicts many-to-many relationship.

    Participation Constraints

    • Total Participation − Each entity is involved in the relationship. Total participation is represented by double lines.
    • Partial participation − Not all entities are involved in the relationship. Partial participation is represented by single lines.
    Participation Constraints
  • ER Model Basic Concepts

    The ER model defines the conceptual view of a database. It works around real-world entities and the associations among them. At view level, the ER model is considered a good option for designing databases.

    Entity

    An entity can be a real-world object, either animate or inanimate, that can be easily identifiable. For example, in a school database, students, teachers, classes, and courses offered can be considered as entities. All these entities have some attributes or properties that give them their identity.

    An entity set is a collection of similar types of entities. An entity set may contain entities with attribute sharing similar values. For example, a Students set may contain all the students of a school; likewise a Teachers set may contain all the teachers of a school from all faculties. Entity sets need not be disjoint.

    Attributes

    Entities are represented by means of their properties, called attributes. All attributes have values. For example, a student entity may have name, class, and age as attributes.

    There exists a domain or range of values that can be assigned to attributes. For example, a student’s name cannot be a numeric value. It has to be alphabetic. A student’s age cannot be negative, etc.

    Types of Attributes

    • Simple attribute − Simple attributes are atomic values, which cannot be divided further. For example, a student’s phone number is an atomic value of 10 digits.
    • Composite attribute − Composite attributes are made of more than one simple attribute. For example, a student’s complete name may have first_name and last_name.
    • Derived attribute − Derived attributes are the attributes that do not exist in the physical database, but their values are derived from other attributes present in the database. For example, average_salary in a department should not be saved directly in the database, instead it can be derived. For another example, age can be derived from data_of_birth.
    • Single-value attribute − Single-value attributes contain single value. For example − Social_Security_Number.
    • Multi-value attribute − Multi-value attributes may contain more than one values. For example, a person can have more than one phone number, email_address, etc.

    These attribute types can come together in a way like −

    • simple single-valued attributes
    • simple multi-valued attributes
    • composite single-valued attributes
    • composite multi-valued attributes

    Entity-Set and Keys

    Key is an attribute or collection of attributes that uniquely identifies an entity among entity set.

    For example, the roll_number of a student makes him/her identifiable among students.

    • Super Key − A set of attributes (one or more) that collectively identifies an entity in an entity set.
    • Candidate Key − A minimal super key is called a candidate key. An entity set may have more than one candidate key.
    • Primary Key − A primary key is one of the candidate keys chosen by the database designer to uniquely identify the entity set.

    Relationship

    The association among entities is called a relationship. For example, an employee works_at a department, a student enrolls in a course. Here, Works_at and Enrolls are called relationships.

    Relationship Set

    A set of relationships of similar type is called a relationship set. Like entities, a relationship too can have attributes. These attributes are called descriptive attributes.

    Degree of Relationship

    The number of participating entities in a relationship defines the degree of the relationship.

    • Binary = degree 2
    • Ternary = degree 3
    • n-ary = degree

    Mapping Cardinalities

    Cardinality defines the number of entities in one entity set, which can be associated with the number of entities of other set via relationship set.

    • One-to-one − One entity from entity set A can be associated with at most one entity of entity set B and vice versa.
    • One-to-many − One entity from entity set A can be associated with more than one entities of entity set B however an entity from entity set B, can be associated with at most one entity.
    • Many-to-one − More than one entities from entity set A can be associated with at most one entity of entity set B, however an entity from entity set B can be associated with more than one entity from entity set A.
    • Many-to-many − One entity from A can be associated with more than one entity from B and vice versa.
  • Handling Constraint Violations

    Relational database systems are not just about storing data; they also need to focus on maintaining the integrity and consistency of data during various operations.

    In this chapter, we will take a close look at update operations, along with the concept of transactions, and strategies for handling constraint violations, with detailed examples for a better understanding.

    Insert, Delete, and Update Operations

    There are three major operations that we normally perform on a database –

    • Insert − Adds new tuples to a table.
    • Delete − Removes tuples from a table.
    • Update (or Modify) − Changes the values in existing tuples.

    The update operation in a database allows modifications to the existing values of one or more attributes within a relation. The goal is to maintain a consistent state that adheres to all integrity constraints.

    All the three operations have the potential to breach the constraints defined in the database schema. Constraints such as key constraints, entity integrity, referential integrity, etc., are used for preserving data quality.

    Operations and Constraint Violations

    The insert operation is used to add a new record to a relation. However, it can trigger several constraint violations:

    • Domain Constraint − If the data type or range of a value in an attribute is incorrect, this constraint is breached.
    • Key Constraint − Violated when a primary key value in the new tuple matches an existing primary key.
    • Entity Integrity − Cause error if a primary key in the new tuple contains a NULL value.
    • Referential Integrity − Occurs when a foreign key references a non-existent value in the related table.

    Consider we are attempting to insert the tuple:

    <'Alicia','J','Zelaya','999887777','1960-04-05','6357 Windy Lane, Katy, TX', F,28000,'987654321',4>

    into the EMPLOYEE table. In this case, a key constraint violation will occur if 999887777 already exists as a primary key.

    The delete operation can violate referential integrity because the tuple being removed is referenced by foreign keys in other tables. For instance, deleting an employee whose ID is used in the WORKS_ON table would create orphaned records. This will disrupt data relationships. For example, deleting an EMPLOYEE with Ssn = ‘333445555’ could violate referential integrity because tuples in DEPARTMENT, WORKS_ON, and DEPENDENT might reference this employee.

    Now, let’s check what kind of constraint violations might occur while updating a database. Updating the existing data can be complex, especially when altering primary keys or foreign keys. Modifying a primary key is equivalent to deleting the original record and inserting a new one, which may trigger key constraints or referential integrity issues. For example, updating the Dno (department number) of an EMPLOYEE record to a value that does not exist in the DEPARTMENT table results in a referential integrity violation.

    Handling Constraint Violations

    In this section, let’s understand how to handle the constraint violations that occur during the insert, delete, and update operations.

    Handling Insertion Violations

    When an insertion violates constraints, the most common response is to reject the insertion and notify the user. But, there are situations where DBMS can assist by suggesting corrective measures:

    • Prompting for valid values − The DBMS might ask for a non-NULL value for primary keys if the violation is due to NULL entries.
    • Proposing related tuple insertion − If referential integrity fails, the system may suggest inserting a corresponding record into the referenced table.

    For example, attempting to insert an EMPLOYEE with a Dno value that does not exist can prompt the DBMS to suggest inserting a valid department first.

    Handling Deletion Violations

    Delete operations may also cause problems, primarily affecting referential integrity. The strategies to manage these violations are:

    • Restrict − Prevents the deletion if it disrupts related records.
    • Cascade − Automatically deletes all related tuples. This is useful when removing an entry that is referenced in multiple tables.
    • Set NULL / Default − Changes foreign key references to NULL or a preset default value when the primary key is deleted.

    For example, if an EMPLOYEE tuple with Ssn = ‘123456789’ is deleted, then the DBMS may set the Essn in WORKS_ON to NULL. This will avoid referential integrity breaches.

    Managing Update Operation

    Update operations can also breach constraints, particularly when modifying attributes tied to keys:

    • Updating a primary key − The DBMS treats it like a combination of delete and insert. If other tables reference the original primary key, this may lead to referential integrity problems.
    • Updating a foreign key − The DBMS must verify that the new value references an existing primary key or set it to NULL.

    For example, changing an employee’s Ssn from ‘123456789’ to ‘987654321’ may lead to issues if ‘987654321’ is already in use as a primary key elsewhere.

    Transactions and Their Importance

    Along with database operations, another important concept is transactions. In a database context, a transaction is an atomic sequence of operations that includes reading or updating the data. Transactions ensure that operations are performed as a single unit, maintaining the database’s consistency.

    Key properties of transactions include:

    • Atomicity − Ensures that all parts of a transaction are completed; otherwise, none are applied.
    • Consistency − Keeps the database in a valid state.
    • Isolation − Transactions do not interfere with each other.
    • Durability − Once a transaction is committed, changes are permanent.

    For example, a bank transfer operation where one account is debited, and another is credited must either complete fully or revert to the initial state to prevent inconsistencies.

    Handling Transactions in Practice

    Transactions are nothing but a set of rules that follow certain behavior. Transactions often involve multiple steps that could individually violate constraints. Atomicity means the DBMS checks for violations at each step:

    • Before committing − The DBMS verifies all constraints. If a violation is detected, the entire transaction rolls back to prevent partial changes.
    • After a rollback − The database state returns to what it was before the transaction began, preserving data consistency.

    For example, in an order processing system, a transaction might include checking stock availability, updating inventory, and recording the sale. If the stock update fails due to an inventory constraint, the entire transaction should revert.

    Referential Integrity Violation

    Let’s check a practical scenario of constraint violation. Consider the following UPDATE operation:

    UPDATE EMPLOYEE SET Dno =10WHERE Ssn ='999887777';

    If Dnumber = 10 does not exist in DEPARTMENT, then this update fails due to a referential integrity violation. The DBMS must check to ensure that relationships remain consistent.

    Combining Strategies for Complex Violations

    In some cases, a combination of cascade, set NULL, and restrict strategies might be applied based on specific needs:

    • Cascade for deep deletions − When deleting a department, cascade deletes associated employees and projects.
    • Set NULL for non-critical references − Updates foreign key values to NULL when the primary record is removed.
    • Restrict for critical data − Prevents deletion if essential references are present, such as financial records.

    Conclusion

    In this chapter, we explained how update operations, transactions, and constraint handling are integral to maintaining data integrity in relational databases. We understood the importance of managing insert, delete, and update operations without violating any constraints. We also explored how the DBMS reacts to constraint violations with strategies like restrict, cascade, and set NULL. Additionally, we examined how transactions help ensure that changes maintain consistency and follow atomicity.

  • Relational Database Schemas

    In a relational database model, the structure of data is maintained through a defined schema. A relational schema serves as the framework that outlines how data is organized and managed within a relational database. It includes attributes and the relationships between them. This framework ensures that data is stored in a way that maintains consistency and integrity.

    In this chapter, we will explore relational database schemas in depth, understand key concepts through examples, and explain how constraints help maintain data reliability.

    Relational Database Schema

    A relational database schema, represented by S, acts as a map for how data is structured in the database. It consists of a set of relation schemas and a collection of integrity constraints.

    The schema outlines how data interacts within and between tables. Each relation schema specifies the table’s name, its attributes, and their domains.

    Components of a Schema

    A relational schema has three major components:

    • Relation Schema (R) − The structure of a table, written as R(A1, A2, …, An), where A1, A2, …, An are attributes.
    • Attribute − A column within a table, each with a specific domain that limits the type of data it can contain.
    • Database State (DB) − The current data held in all tables conforming to the schema. A valid state meets all schema constraints.

    Example − Consider a relational schema for a COMPANY database:

    • EMPLOYEE (Ssn, Name, Salary, Dno)
    • DEPARTMENT (Dnumber, Dname, Mgr_ssn, Mgr_start_date)
    • PROJECT (Pnumber, Pname, Plocation, Dnum)

    Each relation schema outlines the columns of its respective table. It also indicates how data should be represented.

    Relational Database Schemas – Key Features and Components

    Let us now understand some of the key features and components of Relational Database Schemas:

    Attributes and Their Domains

    Every attribute within a relation schema has an associated domain that defines its allowable values. The domains enforce data type restrictions and value limits, ensuring consistency within the database.

    Consider an example where the Ssn attribute in the EMPLOYEE table might be limited to a pattern that matches valid social security numbers (e.g., XXX-XX-XXXX). Similarly the Age attribute could be restricted to integer values between 18 and 65.

    We know domains help prevent errors by ensuring the supplied data conforms to expected types and formats. So, an Age attribute defined with INTEGER CHECK (Age BETWEEN 18 AND 65) would block an insertion of Age = 17 or Age = 70.

    Relation States

    relation state is the set of tuples (rows) that exist in a table at any given time. Each tuple must comply with the attribute domains defined in the schema.

    For a better understanding, consider the following EMPLOYEE table with data:

    FnameLnameSsnSalaryDno
    JohnSmith123456789300005
    FranklinWong333445555400005
    AliciaZelaya999887777250004
    JenniferWallace987654321430004
    RameshNarayan666884444380005
    JoyceEnglish453453453250005
    AhmadJabbar987987987250004
    JamesBorg888665555550001

    For example, a tuple in the EMPLOYEE table might be represented as,

    <John, Smith,123-45-6789,30000,5>

    This tuple states that John Smith, identified by his Ssn, earns a salary of $30,000 and works in the department with Dno = 5. (We have removed other attributes for simplicity)

    Integrity Constraints

    Integrity constraints make sure that the data remains accurate and logically consistent. Integrity constraints include:

    • Key Constraints − Enforce uniqueness for specified attributes, known as keys, so that no two tuples share the same values for these attributes.
    • Entity Integrity − Ensures primary key attributes cannot have NULL values, maintaining the uniqueness and identifiability of each tuple.
    • Referential Integrity − Ensures foreign keys match primary keys in related tables or are NULL.
    DnameDnumberMgr_ssnMgr_start_date
    Research53334455551988-05-22
    Administration49876543211995-01-01
    Headquarters18886655551981-06-19

    In the COMPANY database, the Dno attribute in the EMPLOYEE table is a foreign key referencing the Dnumber attribute in the DEPARTMENT table. If an EMPLOYEE record has Dno = 5, Dnumber = 5 must exist in the DEPARTMENT table for referential integrity to hold.

    Practical Examples of Relational Database Schemas

    Let us go through the COMPANY database schema and see how the tables relate and enforce constraints.

    Here is the PROJECT table:

    PnamePnumberPlocationDnum
    ProductX1Bellaire5
    ProductY2Sugarland5
    ProductZ3Houston5
    Computerization10Stafford4
    Reorganization20Houston1
    Newbenefits30Stafford4
    • EMPLOYEE (Ssn, Name, Salary, Dno) − The Ssn attribute acts as the primary key, uniquely identifying each employee. The Dno attribute is a foreign key referencing Dnumber in the DEPARTMENT table.
    • DEPARTMENT (Dnumber, Dname, Mgr_ssn, Mgr_start_date) − The Dnumber attribute is the primary key, uniquely identifying each department. The Mgr_ssn attribute references Ssn in the EMPLOYEE table, indicating which employee manages the department.
    • PROJECT (Pnumber, Pname, Plocation, Dnum) − Pnumber serves as the primary key. Dnum is a foreign key referencing Dnumber in DEPARTMENT.

    If an employee record is inserted with Dno = 10, but Dnumber = 10 does not exist in the DEPARTMENT table, the insertion will be rejected due to a referential integrity violation.

    Super Keys and Candidate Keys

    Super keys and candidate keys help identify unique tuples within a relation.

    • The term “super key” is used for any combination of attributes that uniquely identifies a tuple.
    • A candidate key is a minimal super key without redundant attributes.

    Consider a CAR schema:

    CAR (License_number, Engine_serial_number, Make, Model)

    In this case,

    • License_number is a candidate key, as it uniquely identifies each car.
    • License_number, Engine_serial_number together is a superkey, but not minimal, so it is not a candidate key.

    The Importance of Primary Keys

    A primary key is nothing but a chosen candidate key. It must have unique, non-NULL values. This constraint ensures every record can be uniquely identified.

    For example, Ssn is the primary key in the EMPLOYEE table. Trying to insert a record with Ssn = NULL will violate entity integrity, resulting in a rejected insertion.

    Relationships and Foreign Keys

    Foreign keys are attributes in one table that link to primary keys in another. Foreign keys are meant for creating relationship between tables. They help ensure consistency when dealing with related data.

    Consider the WORKS_ON table with attributes Essn and Pno:

    EssnPnoHours
    123456789132.5
    12345678927.5
    666884444340.0
    453453453120.0
    453453453220.0
    333445555210.0
    333445555310.0
    3334455551010.0
    3334455552010.0
    9998877773030.0
    9998877771010.0
    9879879871035.0
    987987987305.0
    9876543213020.0
    9876543212015.0
    88866555520NULL

    Essn references Ssn in EMPLOYEE, and Pno references Pnumber in PROJECT. This relationship defines which employees are working on which projects.

    Violation Case − If an employee’s Ssn is deleted from EMPLOYEE but still exists as Essn in WORKS_ON, then referential integrity gets violated unless action is taken (e.g., cascading the delete or setting Essn to NULL).

    Conclusion

    To conclude, relational database schemas are foundational for defining how data is structured and managed. In this chapter, we highlighted the components of a schema, such as relation schemas, attributes, and their domains. Thereafter, we used the COMPANY schema to illustrate the importance of primary and foreign keys. We also understood how integrity constraints like entity and referential integrity help maintain data reliability.

  • Relational Model Constraints

    Relational databases are widely used; they provide structured and reliable ways to store and access the data. To ensure data integrity and consistency, relational databases mostly rely on certain rules known as constraints. These constraints are used to maintain the quality and reliability of data across different tables and records.

    In this chapter, we will take a look at the types of relational model constraints, supplemented with practical examples and explanations for a better understanding.

    Relational Model Constraints

    Relational model constraints are set of rules applied to the structure and data of a database to maintain logical consistency. They ensure that data is valid according to the rules defined in the database schema. These constraints can be broken down into three main types as given below −

    • Inherent model-based constraints − Implied by the nature of the relational model itself.
    • Schema-based constraints − Explicitly defined within the database schema and enforced by the database system.
    • Application-based constraints − Managed through external application logic as they cannot be directly expressed in the schema.

    Types of Relational Model Constraints

    Relational model constraints can be of the following types −

    • Domain Constraints
    • Key Constraints
    • Entity Integrity Constraint
    • Referential Integrity Constraint

    Let’s understand each of these in detail.

    Domain Constraints

    Domain constraints are used to specify that, every attribute in a database must contain only values from a predefined set known as domain. This ensures that each column in a table holds data of a particular type and format.

    For example, consider an EMPLOYEE table with an Age column that only allows integer values between 18 and 65. This domain constraint ensures that inserting a value such as “17” or “70” would violate the rule and be rejected by the system.

    If we define Age as INTEGER CHECK (Age BETWEEN 18 AND 65), any attempt to insert Age = 17 will trigger an error. The domain constraints is used to maintain data accuracy by preventing improper data types or out-of-bounds values.

    Key Constraints

    Key constraints are used for each record within a relation. It can be uniquely identified. This uniqueness is achieved through keys.

    Keys can be of the following types −

    • Superkey − Any set of attributes that uniquely identifies a tuple in a relation.
    • Candidate key − A minimal superkey. Removing any attribute would break the uniqueness property.
    • Primary key − A chosen candidate key that uniquely identifies each tuple in the table.

    For example, consider a table called STUDENT as given below −

    NameSSNHome PhoneAddressOffice PhoneAgeGPA
    Dick Davidson422-11-2320NULL3452 Elgin Road(817)749-1253253.53
    Barbara Benson533-69-1238(817)839-84617384 Fontana LaneNULL193.25
    Rohan Panchal489-22-1100(817)376-9821265 Lark Lane(817)749-6492283.93
    Chung-cha Kim381-62-1245(817)375-4409125 Kirby RoadNULL182.89
    Benjamin Bayer305-61-2435(817)373-16162918 Bluebonnet LaneNULL193.21

    The SSN (Social Security Number) serves as a primary key because each student must have a unique SSN. The primary key constraint guarantees that no two students can have the same SSN.

    This is important because the primary keys help to maintain the identity of each record. If a STUDENT table did not have a primary key, then distinguishing between duplicate records would be difficult, which will lead to a potential data ambiguity.

    Entity Integrity Constraint

    The entity integrity constraint states that, the primary key of a relation must always have a non-NULL value. It is needed because primary keys are used to identify tuples, and a NULL value would make identification impossible.

    Consider a table called CUSTOMER. If Customer_ID is defined as the primary key, then any attempt to insert a record with “Customer_ID = NULL” will violet the entity integrity constraint. It will be rejected by the system.

    Practically, we can consider a situation where a banking system has a Customer relation with primary keys representing account numbers. If one record had Account_ID = NULL, then it would be impossible to reference or link this record.

    Referential Integrity Constraint

    Referential integrity constraint states that, the relationships between tables remain consistent. A foreign key in one table must either match a primary key value in another table or be NULL. This constraint can preserve the logical connections between records in different tables.

    Suppose we have an EMPLOYEE table with a Dno column. Here, it references the Dnumber from the DEPARTMENT table. If Dno is set to a value that does not exist in the DEPARTMENT table, then it violates the referential integrity constraint.

    Detailed Scenario − See the Employee table −

    FnameMinitLnameSSNBdateAddressSexSalarySuper_ssnDno
    JohnBSmith1234567891965-01-09731 Fondren, Houston, TXM300003334455555
    FranklinTWong3334455551955-12-08638 Voss, Houston, TXM400008886655555
    AliciaJZelaya9998877771968-01-193321 Castle, Spring, TXF250009876543214
    JenniferSWallace9876543211941-06-20291 Berry, Bellaire, TXF430008886655554
    RameshKNarayan6668844441962-09-15975 Fire Oak, Humble, TXM380003334455555
    JoyceAEnglish4534534531972-07-315631 Rice, Houston, TXF250003334455555
    AhmadVJabbar9879879871969-03-29980 Dallas, Houston, TXM250009876543214
    JamesEBorg8886655551937-11-10450 Stone, Houston, TXM55000NULL1

    See the Department table −

    DnameDnumberMgr_ssnMgr_start_date
    Research53334455551988-05-22
    Administration49876543211995-01-01
    Headquarters18886655551981-06-19

    Valid Insertion − Adding an EMPLOYEE with Dno = 4 is allowed if Dnumber = 4 exists in DEPARTMENT.

    Violation Example − Inserting an EMPLOYEE with Dno = 99, where Dnumber = 99 does not exist in DEPARTMENT, would be rejected.

    This constraint is particularly useful in maintaining relationships between parent and child tables. So, deleting a department referenced by employees must either be restricted or cascaded to maintain integrity.

    Handling Constraint Violations

    In this section, let’s understand the kind of operations that can cause constraint violations and what precautionary measures can be taken in order to prevent such operations.

    Insertion Violations

    An insertion operation can violate various constraints because of −

    • Domain Constraint Violation − Inserting a non-integer value in an Age column defined as an integer will be rejected.
    • Key Constraint Violation − Adding a tuple with a duplicate Ssn in a STUDENT table where Ssn is the primary key will result in an error.
    • Entity Integrity Violation − Attempting to insert a row with NULL as the primary key value violates the entity integrity constraint.
    • Referential Integrity Violation − Inserting an EMPLOYEE with a Dno value that does not match any Dnumber in DEPARTMENT.

    Deletion and Referential Integrity

    Deleting a tuple can trigger constraint violations, particularly for referential integrity. For example, deleting the DEPARTMENT record that is referenced by the Dno of an EMPLOYEE tuple results in referential integrity issues.

    We can solve this issue by Restrict operation, like prevent deletion if it causes a violation. Or cascade, that is, automatically delete all dependent records. There is another solution: Set NULL / Default, i.e., modify the foreign key in dependent records to NULL or a default value.

    Update Operations

    Updating a record can affect primary keys and foreign keys. Modifying a primary key is similar to deleting the original record and inserting a new one, which may violate existing constraints. Updates can be two types:

    • Safe Update − Changing an employee’s salary does not affect primary or foreign keys and is permissible.
    • Risky Update − Modifying Ssn in EMPLOYEE to an existing Ssn violates the primary key constraint.

    Other Constraints and Business Rules

    We have studied the rules, but there are some other application-based constraints, or business rules. These rules include those that cannot be easily defined within the schema, such as −

    • “An employee’s salary must not exceed that of their supervisor.”
    • “Total weekly working hours for an employee should not exceed 56 hours.”

    These are typically enforced using application logic or triggers.

    Conclusion

    In this chapter, we presented an elaborate explanation of how relational model constraints play a crucial role in maintaining data integrity and consistency in a database. We understood the different types of constraints, such as domain constraint, key constraint, entity integrity constraints, and referential integrity constraints. We discussed practical examples for these constraints in action. We also reviewed how constraint violations can be handled and the importance of application-based rules.

  • Relation Data Model

    Relational data model is the primary data model, which is used widely around the world for data storage and processing. This model is simple and it has all the properties and capabilities required to process data with storage efficiency.

    Concepts

    Tables − In relational data model, relations are saved in the format of Tables. This format stores the relation among entities. A table has rows and columns, where rows represents records and columns represent the attributes.

    Tuple − A single row of a table, which contains a single record for that relation is called a tuple.

    Relation instance − A finite set of tuples in the relational database system represents relation instance. Relation instances do not have duplicate tuples.

    Relation schema − A relation schema describes the relation name (table name), attributes, and their names.

    Relation key − Each row has one or more attributes, known as relation key, which can identify the row in the relation (table) uniquely.

    Attribute domain − Every attribute has some pre-defined value scope, known as attribute domain.

    Constraints

    Every relation has some conditions that must hold for it to be a valid relation. These conditions are called Relational Integrity Constraints. There are three main integrity constraints −

    • Key constraints
    • Domain constraints
    • Referential integrity constraints

    Key Constraints

    There must be at least one minimal subset of attributes in the relation, which can identify a tuple uniquely. This minimal subset of attributes is called key for that relation. If there are more than one such minimal subsets, these are called candidate keys.

    Key constraints force that −

    • in a relation with a key attribute, no two tuples can have identical values for key attributes.
    • a key attribute can not have NULL values.

    Key constraints are also referred to as Entity Constraints.

    Domain Constraints

    Attributes have specific values in real-world scenario. For example, age can only be a positive integer. The same constraints have been tried to employ on the attributes of a relation. Every attribute is bound to have a specific range of values. For example, age cannot be less than zero and telephone numbers cannot contain a digit outside 0-9.

    Referential integrity Constraints

    Referential integrity constraints work on the concept of Foreign Keys. A foreign key is a key attribute of a relation that can be referred in other relation.

    Referential integrity constraint states that if a relation refers to a key attribute of a different or same relation, then that key element must exist.