Document Details

FascinatingPanther1765

Uploaded by FascinatingPanther1765

Guwahati University

Tags

databases database management systems data modeling information systems

Summary

This document covers various aspects of databases, including their fundamental concepts, characteristics, and advantages. It delves into topics like database management systems (DBMS), the ANSI/SPARC architecture, data independence, and normalization. The key focus is on the theoretical understanding and features of databases and associated technologies.

Full Transcript

1NoQ:- What is a Database? Answer: A database is an organized collection of data. The data are typically organized to model aspects of reality in a way that supports processes requiring information. 2NoQ:-What is Database Management System? Answer: Database management systems (DBMSs) are specially...

1NoQ:- What is a Database? Answer: A database is an organized collection of data. The data are typically organized to model aspects of reality in a way that supports processes requiring information. 2NoQ:-What is Database Management System? Answer: Database management systems (DBMSs) are specially designed software applications that interact with the user, other applications, and the database itself to capture and analyze data. A general-purpose DBMS is a software system designed to allow the definition, creation, querying, update, and administration of databases. Well-known DBMSs include MySQL, Microsoft SQL Server, and Oracle etc. 3NO Q:-What are the major characteristics of database approach? Answer: 1. Self-describing nature of the database system: Self-describing nature of the database is the approach that the database system not only contains the database itself but also contains the description and definition of the structure of the database and its constraints. 2. Insulation between data and program, and data abstraction: In traditional file system the file structure is embedded in the application, therefore if there is any change in the file structure then it may require that all the programs are changed which access these file. In contrast, the DBMS system does not require such changes in most of the case. The file structure here is stored in the DBMS catalog separate from the access program. We can call this property as DATA INDEPENDENCE. 3. Support multiple user view of the data: A database typically has many users and each user may have a different perspective or view of the database. So it is important that a multiple user database that has a variety of distinct application provides the facility for defining multiple views. 4. Sharing of data and multiple user transaction processing: A DBMS must allow multiple users to share the data at the same time. It must include concurrency control software to ensure that multiple users trying to update the same data do it in a controlled manner so that the result of the update is correct. 4No Q:- What are the advantage and disadvantages of DBMS? Answer: Advantage of DBMS Reduction of Redundancies: Centralized control of data by the DBA avoids unnecessary duplication of data and effectively reduces the total amount of data storage required Elimination of Inconsistencies: The main advantage of avoiding duplication is the elimination of inconsistencies that tend to be present in redundant data files. Shared Data: A database allows the sharing of data under its control by any number of application programs or users. Integrity: Centralized control can also ensure that adequate checks are incorporated in the DBMS to provide data integrity. Data integrity means that the data contained in the database is both accurate and consistent. Security: Data is of vital importance to an organization and may be confidential. Such confidential data must not be accessed by unauthorized persons. Disadvantage of DBMS Cost of software/hardware and migration: A significant disadvantage of the DBMS system is cost. In addition to the cost of purchasing or developing the software, the hardware has to be upgraded to allow for the extensive programs and work spaces required for their execution and storage. Problem associated with centralization: While centralization reduces duplication, the lack of duplication requires that the database be adequately backed up so that in the case of failure the data can be recovered. Complexity of Backup and Recovery: Backup and recovery operations are fairly complex in a DBMS environment, and this is exacerbated in a concurrent multi user database system. 5. No Q:-Explain the ANSI/SPARC architecture of Database System. Answer: The ANSI/SPARC architecture is divided into three levels, Known as the internal, conceptual, and external levels. The Internal level is the one closest to physical storage- i.e. it is the one concerned with the way the data is physically stored. The External level is the one closest to the users-, it is the one concerned with the way the data is viewed by individual users The conceptual level is a “level of indirection” between the other two. If the external level is concerned with individual user views, then the conceptual level is concerned with a community user view. In other words, there will be many distinct external views, each consisting of a more or less abstract representation of some portion of the total database, and there will be precisely one conceptual view, consisting of a similarly abstract representation of the database in its entirety. Likewise, there will be precisely one internal view, representing the total database as physically stored. 6. No Q :-What is Data Independence? Answer: Data Independence can be defined as the immunity of applications to change in storage structure and access technique – which implies that the applications concerned do not depend on any one particular storage structure or access technique. Logical data independence is the capacity to change the conceptual schema without having to change external schemas or application programs. Changes to the conceptual schema may be needed to expand the database or to reduce the database. Application programs that reference the external schema constructs must work as before, after the conceptual schema undergoes a logical reorganization. Physical data independence is the capacity to change the internal schema without having to change the conceptual or external schemas. Changes to the internal schema may be needed because some physical files had to be reorganized or update. If the same data as before remains in the database, we should not have to change the conceptual schema. 7.No Q:- What is database Administrator (DBA)? What are the functions of DBA? Answer: The data administrator is the person who makes the strategic and policy decisions regarding the data of the enterprise, and the database administrator (DBA) is the person who provides the necessary technical support for implementing those decisions. The functions of DBA are as follows: It is the job of DBA to define the conceptual schema of the database of an enterprise The DBA must also decide how the data is to be represented in the stored database i.e. to define the internal schema of the database of an enterprise. It is the business of the DBA to liaise with users, to ensure that the data they require is available, and to write the necessary external schemas. It is the job of DBA to define the security and integrity rules of the database of an enterprise The DBA must define and implement an appropriate recovery scheme of the database to backup storage, and procedures for reloading the database when necessary. The DBA is responsible for so organizing the system as to get the performance that is “best for the enterprise” and for making the appropriate adjustments as requirements change. 8.No Q:- What is a Data model? Explain. Answer: A data model – a collection of concepts that can be used to describe the structure of a database. By structure of a database we mean the data types, relationships, and constraints that should hold on the data. Most data models also include a set of basic operation for specifying retrievals and updates on the database. High-level or Conceptual data models provide concepts that are close to the way many users perceive data, whereas low-level or physical data models provide concepts that describe the details of how data is stored in the computer. Conceptual data models use concepts such as entities, attributes and relationships. An entity represents a real world object or concept, such as an employee or a project that is described in the database. An attribute represents some property of interest that further describes an entity, such as the employee’s name or salary. A relationship among two or more entities represents an interaction among the entities; for example, a works-on relationship between an employee and a project. Representational or implementation data models are the models used most frequently in traditional commercial DBMSs, and they include the widely-used relational data model, as well as the so called legacy data models- the network and hierarchical models. Representational data models represent data by using record structures and hence are sometimes called record-based data models. Object data model as a new family of higher-level implementation data models that are closer to conceptual data models. Physical data models describe how data is stored in the computer by representing information such as record formats, record orderings, and access paths. An access path is a structure that makes the search for particular database records efficient. 9. No Q:-What are database schemas, Instances, Database State? Answer: The description of the database is called the database schema, which is specified during database design and is not expected to change frequently. Most data models have certain conventions for displaying the schemas as diagrams. A displayed schema is called a schema diagram. The actual data in a database may change quite frequently. The data in the database at a particular moment in time is called a database state or snapshot. It is also called the current set of occurrences or instances in the database. The DBMS stores the descriptions of the schema constructs and constraints also called the meta-data – in the DBMS catalog so that DBMS software can refer to the schema whenever it needs to. The schema is sometimes called the intension, and a database state an extension of the schema. Sometimes changes need to be applied to the Database schema once in a while as the application requirements change. For example, we may decide that another data item needs to be stored for each record in a file. This is known as schema evolution. 10.No Q:- Write a note on DBMS Languages. Answer: Once the design of a database is completed and a DBMS is chosen to implement the database, is to specify conceptual and internal schemas for the database and any mapping between the two. In many DBMSs where no strict separation of levels is maintained, one language, called the data definition language (DDL,) is used by the DBA and by database designer to define both schemas. The DBMSs will have a DDL compiler whose function is to process DDL statements in order to identify descriptions of the schema constructs and to store the schema description in the DBMS catalog. In DBMSs Where a clear separation is maintained between the conceptual and internal levels, the DDL is used to specify the conceptual schema only. Another language, the storage definition language (SDL), is used to specify the internal schema. For true three-schema architecture, we would need a third language, the view definition language (VDL), to specify user views and their mappings to the conceptual schema, but in most DBMSs the DDL is used to define both conceptual and external schemas. Once the database schemas are compiled and the database is populated with data, users must have some means to manipulate the database. Typical manipulations include retrieval, insertion, deletion, and modification of the data. The DBMS provides a data manipulation language (DML) for these purposes. There are two main types of DMLs. They are high-level or nonprocedural DML and low-level or procedural DML. High- level DMLs are also called set-at-a-time or set-oriented DMLs. Low- Level DMLs are also called record-at-a-time DMLs. A high-level or nonprocedural DML for example SQL can be used on its own to specify complex database operations in concise manner. A low-level or procedural DML must be embedded in a general-purpose programming Language. A query in high-level DML often specifies which data to retrieve rather than how to retrieve it; hence, such languages are called declarative. Whenever DML commands, whether high-level or low-level, are embedded in a general-purpose programming language, the language is called the host language and the DML is called the data sublanguage. On the other hand, a high-level DML used in a stand- alone interactive manner is called a query Language. 11.NoQ:- Who are the persons involved in the design, use, and maintenance of database? Answer: The persons involved in the design, use and maintenance of database are as follows: Database Administrators: In any organization where many persons use the same resources, there is a need for a chief administrator to oversee and manage these resources. The DBA is responsible for authorizing access to the database, for coordinating and monitoring its use, and for acquiring software and hardware resource as needed. Database Designers: They are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data. End users: They are the people whose jobs require access to the database for querying, updating and generating reports; the database primarily exists for their use. There are several categories of end users: Casual end user occasionally accesses the database, but they may need different information each time. They use a sophisticated database query language to specify their requests and are typically middle or high level managers or other occasional browsers. Naive or Parametric end users make up a sizable portion of database end users. Their main job function revolves around constantly querying and updating the database, using standard types of queries and updates – called canned transactions. Sophisticated end users include engineers, scientists, business analysts, and others who thoroughly familiarize themselves with the facilities of the DBMS so as to implement their applications to meet their complex requirements. Stand-alone users maintain personal databases by using ready-made program packages that provide easy to use menu or graphics-based interfaces. An example is the user of a tax package that stores a variety of personal financial data for tax purposes. System Analysts and Application Programmers (Software Engineers) : System analysts determine the requirements of end users, especially naive and parametric end users, and develop specification for canned transactions that meet these requirements. Application programmers implement these specifications as programs; then they test, debug, document, and maintain these canned transactions. 12.No Q:- Write a note on Relational Model. Answer: The relational model represents the database as a collection of relations. Informally, each relation resembles a table of values or to some extent a “flat” file of records. When a relation is thought of as a table of values, each row in the table represents a collection of related data values. In the relational model, each row in the table represents a fact that typically corresponds to a real-world entity or relationship. The table name and column names are used to help in interpreting the meaning of the values in each row. In the formal relational model terminology, a row is called a tuple, a column header is called an attribute, and the table is called a relation. The data type describing the type of values that can appear in each column is called a domain. A domain of a relational schema is a pool of legal values, from which the values of specific attributes are taken. 13.No Q:- What do you mean by Key Fields in a relation? Answer: Key fields are individual fields or group of fields which are used to identifies a particular record in a relation. For example: Student ID, Account No, Mobile No etc. 14.No Q:- What do you mean by Candidate Key in a relation? Answer: Set of one or more attributes whose values uniquely determine each entity. Several candidate key may exist in a relation. Let K be a set of attributes of the relation R. Then, K is a candidate key if and if it satisfies the following two conditions: Uniqueness: No legal values of R ever contain two distinct tuples with the same value. Irreducibility: No proper subset of K has the uniqueness property. 15.No Q:- What do you mean by Super Key? Answer: A super key is defined in the relational model of database organization as a set of attributes of a relation variable for which it holds that in all relations assigned to that variable, there are no two distinct tuples (rows) that have the same values for the attributes in this set. Equivalently a super key can also be defined as a set of attributes of a relation schema upon which all attributes of the schema are functionally dependent. 16.No Q:- What do you mean by primary Key? Answer: Primary key is a candidate Key that is chosen by the database designer as the principal means of identifying entities within an entity set. For example Student ID, Employee ID can be used as primary key. 17.No Q:- What do you mean by Alternate Key? Answer: An alternate key is any candidate key which is not selected to be the primary key. 18. No Q:-What do you mean by Secondary Key? Answer: The attributes that are not even the Super Key but can be still used for identification of records (not unique) are known as Secondary Key. Secondary Key can be used to identify a group of record. For Example Secondary Key can be Name, Address, Salary, Department ID etc. as they can identify the records but they might not be unique. 19.No Q:- What do you mean by Composite Key or compound Key? Answer: A composite key, in the context of relational databases, is a combination of two or more columns in a table that can be used to uniquely identify each row in the table. Uniqueness is only guaranteed when the columns are combined; when taken individually the columns do not guarantee uniqueness. 20. No Q:-What is mean by foreign Key? Answer: In the context of relational databases, a foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table. In other words, a foreign key is a column or a combination of columns that is used to establish and enforce a link between two tables. 21.No Q:- What is an Entity? Answer: An entity may be defined as a thing capable of an independent existence that can be uniquely identified. An entity is an abstraction from the complexities of a domain. Entities are some aspect of the real world that can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. 22.No Q:- What is a weak and strong entity? Answer: A weak entity is an entity that is existence-dependent on some other entity, in the sense that it cannot exist if that other entity does not also exist. The other entity in which the weak entity is dependent upon is called the strong entity. The entity set which does not have sufficient attributes to form a primary key is called as Weak entity set. An entity set that has a primary key is called as Strong entity set. 23.No Q:- What is an Attribute? Explain the differences types of attributes. Answer: An Attribute is a property that describes an entity. For Example, the employee is a entity and employee’s name, age, address, salary and job etc are the attribute. Simple and Composite Attribute Simple attribute consists of a single atomic value. A simple attribute cannot be subdivided. For example the attributes age, sex etc is simple attributes. A composite attribute is an attribute that can be further subdivided. For example the attribute ADDRESS can be subdivided into street, city, state, and zip code. Single Valued and Multi Valued attribute A single valued attribute can have only a single value. For example a person can have only one 'date of birth', 'age' etc. That is a single valued attributes can have only single value. But it can be simple or composite attribute. That is 'date of birth' is a composite attribute , 'age' is a simple attribute. But both are single valued attributes. Multivalued attributes can have multiple values. For instance a person may have multiple phone numbers, multiple degrees etc. Multivalued attributes are shown by a double line connecting to the entity in the ER diagram. Stored and Derived Attributes The value for the derived attribute is derived from the stored attribute. For example 'Date of birth' of a person is a stored attribute. The value for the attribute 'AGE' can be derived by subtracting the 'Date of Birth'(DOB) from the current date. Stored attribute supplies a value to the related attribute. Complex Attribute A complex attribute that is both composite and multi valued. 24. No Q:-What is an Entity-Relationship model (E-R Model)? Answer: In software engineering, an entity–relationship model (ER model) is a data model for describing the data or information aspects of a business domain or its process requirements, in an abstract way that lends itself to ultimately being implemented in a database such as a relational database. The main components of ER models are entities (things) and the relationships that can exist among them, and databases. 25.No Q:- What is E-R diagram? Answer: An entity-relationship diagram (ERD) is a data modelling technique that graphically illustrates an information system’s entities and the relationships between those entities. An ERD is a conceptual and representational model of data used to represent the entity framework infrastructure. 27.No Q:- What is Cardinality ratio? Answer: Cardinality ratio specifies number of relationship instances that an entity may participate in. Cardinality ratio are one to one(1:1), one to many(1:M) and many to many (M:M). 26.No Q:- What are the symbols used to draw a E-R diagram? Answer: Relationships: To understand what we mean by relationships among data items, we need some familiarity with data model terminologies. An entity is a thing which can be easily identified. It may be an object, place, person, concept or activity for which data need to be recorded. In diagramming technique, entities are named and represented inside a box as shown in Instructor and Student are the entities:- Instructor Student An association among entities leads to relationships. For example, there is some relationship between Instructor and Student as seen. Instructor Data Relationship Model (association Entities between entities) Student This relationship represents the fact that an instructor teaches several students and a student is taught by several instructors. Three types of relationships exist among entities. These are:- a) One-to-One b) One-to-Many c) Many-to-Many One-to-One Relationship(1:1):- A one-to-one(1:1) relationship is an association only between two entities. For example, in a university each department has only one head of the department. Moreover, one faculty Member cannot head more than one department. This shows one-to- one relationship between department and the person as head. Head of the Department Department One-to-Many Relationship(1:M):- Relationship exists when one entity is related to more than one entity. For example a father may have many children but a child has one father. Father Children Many-to-Many Relationship(M:M):- A many-to-many(M:M) relationship describes entities that may have many relationships among each other. For Example, one customer may buy many items and one item may be bought by many customers. Customers Items Q. What is Normalization? Answer: Normalization is the name given to the process of simplifying the relationship among data elements in a record. Normalization replaces a collection of data in a record structure by another record a collection of data in a record structure by another record design which is simpler, more predictable and therefore more manageable. Need for Normalization: Normalization redundancy. Redundancy is the unnecessary repetition of a field. It can cause problems with storage, repetition of a field. It can cause problems with storage, retrieval and updation of data. Redundancy can lead to: (a) Inconsistencies– errors are more likely to occur when facts are repeated. (b) Update anomalies– inserting, modifying and deleting data may cause inconsistencies. Inconsistency occurs when we perform updation or deletion of data in one relation, while forgetting to make corresponding changes in other relations. A fully normalized record consists of: (a) A primary key that identifies that entity. (b) A set of attributes that describe that entity. FIRST NORMAL FORM(1NF) When a table has no repeating groups, it is said to be in first normal form(1 NF). That is for each cell in a table(one row and one column), there can be only one value. This value should be atomic in the sense that it cannot be decomposed into smaller pieces. SECOND NORMAL FORM(2NF) A table is in Second normal form (2NF) if every non-key column depends on the entire key (not just part of it). This issue arise only for composite keys (with multiple columns). THIRD NORMAL FORM(3NF) A relation is in the third normal form if it is in second normal form and no non-prime attribute is functionally dependent on other non-prime attributes. BCNF(BOYCE-CODD NORMAL FORM) Boyce-Codd Normal Form (BCNF) was proposed as a simpler form of simpler form of Third Normal Form (3NF), but it is much more strict than 3NF, meaning that every relation in BCNF is also in 3NF. FOURTH NORMAL FORM(4NF) A generalization of the Boyce codd normal form to a relation which includes the multivalued dependencies is called fourth Normal Form. FIFTH NORMAL FORM(5NF) The Fourth Normal Form is by no means the “ultimate” normal form. As we saw earlier, multivalued dependencies help us understand and tackle some forms of repetition of information that cannot be understood in terms of functional Dependencies. There are types of constraints called join dependencies that generalize multivalued dependencies, and lead to another normal form called project-join normal form(PJNF). The PJNF is called Fifth Normal Form. DOMAIN-KEY NORMAL FORM The DKNF is considered to be the highest form of normalization, since all insertion and deletion anomalies are eliminated and all general constraints can be verified by using only the domain constraints and key constraints. DENORMALIZATION In some exceptional cases, database designers use the redundancy to improve performance for specific applications. They select such a schema that has redundant information that means it is not normalized. Structured Query Language: SQL is the standard language for making 1. No Q:-Explain the ANSI/SPARC architecture of Database System. Or Explain three level architecture of DBMS? Answer: The ANSI/SPARC architecture is divided into three levels, Known as the internal, conceptual, and external levels. The Internal level is the one closest to physical storage- i.e. it is the one concerned with the way the data is physically stored. The External level is the one closest to the users-, it is the one concerned with the way the data is viewed by individual users The conceptual level is a “level of indirection” between the other two. If the external level is concerned with individual user views, then the conceptual level is concerned with a community user view. In other words, there will be many distinct external views, each consisting of a more or less abstract representation of some portion of the total database, and there will be precisely one conceptual view, consisting of a similarly abstract representation of the database in its entirety. Likewise, there will be precisely one internal view, representing the total database as physically stored.

Use Quizgecko on...
Browser
Browser