AD3391 Database Design and Management Lecture Notes 1 PDF
Document Details
Uploaded by Deleted User
Tags
Summary
These lecture notes cover database design and management topics, including mapping Entity-Relationship (EER) models to Object-Database Management Systems (ODBMS) schemas. The notes detail object-level declarations, relationship properties, and reference attributes for various database relationships.
Full Transcript
EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com ...
EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Downloaded from EnggTree.com EnggTree.com Mapping EER to ODB schema –Object identifier –reference types –row types –UDTs –Subtypes and super types –user-defined routines –Collection types –Object Query Language; No-SQL: CAP theorem –Document-based: MongoDB data model and CRUD operations; Column-based: Hbase data model and CRUD operations. 5.1 Mapping EER to ODB schema It is relatively straightforward to design the type declarations of object classes for an ODBMS from an EER schema that contains neither categories nor n-ary relation- Step 1. Create an ODL class for each EER entity type or subclass. The type of the ODL class should include all the attributes of the EER class. Multivalued attributes are typically declared by using the set, bag, or list constructors. If the values of the multivalued attribute for an object should be ordered, the list constructor is chosen; if duplicates are allowed, the bag constructor should be chosen; otherwise, the set constructor is chosen. Composite attributes are mapped into a tuple constructor (by using a struct declaration in ODL). Declare an extent for each class, and specify any key attributes as keys of the extent. (This is possible only if an extent facility and key constraint declarations are available in the ODBMS.) Step 2. Add relationship properties or reference attributes for each binary relationship into the ODL classes that participate in the relationship. These may be created in one or both directions. If a binary relationship is represented by references in both directions, declare the references to be relationship properties that are inverses of one another, if such a facility exists. If a binary relationship is represented by a reference in only one direction, declare the reference to be an attribute in the referencing class whose type is the referenced class name. Depending on the cardinality ratio of the binary relationship, the relationship properties or reference attributes may be single-valued or collection types. They will be single- valued for binary relationships in the 1:1 or N:1 direction; they are collection types (set- valued or list-valued) for relationships in the 1: N or M: N direction. An alternative way to map binary M: N relationships is discussed in step 7. If relationship attributes exist, a tuple constructor (struct) can be used to create a structure of the form < reference, relationship attributes >, which may be included instead of the reference attribute. However, this does not allow the use of the inverse 111 Downloaded from EnggTree.com EnggTree.com constraint. Additionally, if this choice is represented in both directions, the attribute values will be represented twice, creating redundancy. This implicitly uses a tuple constructor at the top level of the type declaration, but in general, the tuple constructor is not explicitly shown in the ODL class declarations. Further analysis of the application domain is needed to decide which constructor to use because this information is not available from the EER schema. The ODL standard provides for the explicit definition of inverse relationships. Some ODBMS products may not provide this support; in such cases, programmers must maintain every relationship explicitly by coding the methods that update the objects appropriately. The decision whether to use set or list is not available from the EER schema and must be determined Object and Object-Relational Databases Step 3. Include appropriate operations for each class. These are not available from the EER schema and must be added to the database design by referring to the origi- nal requirements. A constructor method should include program code that checks any constraints that must hold when a new object is created. A destructor method should check any constraints that may be violated when an object is deleted. Other methods should include any further constraint checks that are relevant. Step 4. An ODL class that corresponds to a subclass in the EER schema inherits (via extends) the type and methods of its superclass in the ODL schema. Its specific (noninherited) attributes, relationship references, and operations are specified, as discussed in steps 1, 2, and 3. Step 5. Weak entity types can be mapped in the same way as regular entity types. An alternative mapping is possible for weak entity types that do not participate in any relationships except their identifying relationship; these can be mapped as though they were composite multivalued attributes of the owner entity type, by using the set < struct <... >> or list < struct <... >> constructors. The attributes of the weak entity are included in the struct <... > construct, which corresponds to a tuple constructor. Attributes are mapped as discussed in steps 1 and 2. Step 6. Categories (union types) in an EER schema are difficult to map to ODL. It is possible to create a mapping similar to the EER-to-relational mapping by declaring a class to represent the category and defining 1:1 relationship between the category and each of its superclasses. Another option is to use a union type, if it is available. Step 7. An n-ary relationship with degree n > 2 can be mapped into a separate class, with appropriate references to each participating class. These references are based on mapping a 1: N relationship from each class that 112 Downloaded from EnggTree.com EnggTree.com represents a participating entity type to the class that represents the n-ary relationship. An M: N binary relationship, especially if it contains relationship attributes, may also use this mapping option, if desired. 5.2 Object identifier An object identifier (OID) is an unambiguous, long-term name for any type of object or entity. The OID mechanism finds application in diverse scenarios, particularly in security, and is endorsed by the International Telecommunication Union (ITU), the Internet Engineering Task Force (IETF), and ISO. What is an OID? An object identifier (OID) is an extensively used identification mechanism jointly developed by ITU-T and ISO/IEC for naming any type of object, concept or "thing" with a globally unambiguous name which requires a persistent name (long life-time). It is not intended to be used for transient naming. OIDs, once allocated, should not be re-used for a different object/thing. It is based on a hierarchical name structure based on the "OID tree". This naming structure uses a sequence of names, of which the first name identifies a top-level "node" in the OID tree, and the next provides further identification of arcs leading to sub- nodes beneath the top-level, and so on to any depth. A critical feature of this identification mechanism is that it makes OIDs available to a great many organizations and specifications for their own use (including countries, ITU- T Recommendations, ISO and IEC International Standards, specifications from national, regional or international organizations, etc.). How are OIDs allocated and what is a registration authority? At each node, including the root, there is a requirement for some organization or standard to be responsible for allocating arcs to sub-nodes and recording that allocation (together with the organization the subordinate node has been allocated to), not necessarily publicly. This activity is called a Registration Authority (RA). In the OID tree, RAs are generally responsible only for allocation of sub-arcs to other RAs that then control their own sub-nodes. In general, the RA for a sub-node operates independently in allocating further sub-arcs to other organizations, but can be 113 Downloaded from EnggTree.com EnggTree.com constrained by rules imposed by its superior, should the superior so wish. The registration tree is indeed managed in a completely decentralized way (a node gives full power to its children). The registration tree is defined and managed following the ITU-T X.660 & X.670 Recommendation series (or the ISO/IEC 9834 series of International Standards) What is an OID repository? Initially, it was left for each Registration Authority (RA)in the hierarchy to maintain its own record of allocation beneath that RA, and to keep those allocations private if it so chose. There was never any policing of this. An RA in the hierarchy was its own master and operated autonomously. In the early 1990s Orange developed software for their internal use which was generic enough to provide a publicly available repository of OID allocations. Information on OIDs is often buried inside the databases (perhaps sometimes paper) maintained by an immense number of RAs. The information can be hard to access and is sometimes private. Today this OID repository is regarded as the easiest way to access a large amount of the publicly available information on OIDs: Many OIDs are recorded but it does not contain all existing OIDs. This OID repository is not an official Registration Authority, so any OID described on this web site has to be officially allocated by the RA of its parent OID. The accuracy and completeness of this OID repository rely on crowdsourcing, i.e., each user is welcome to contribute data. 5.3 reference type In SQL, a is a pointer; a scalar constructed SQL. It points to a row of a Base table that has the with REF value property – that is, a points to a UDT value. Reference s A is defined by a descriptor that contains three pieces of information: 1. The ’s name: REF. 2. The name of the UDT that the is based on. (The UDT is known as the referenced type.) 3. The scope of the : a (possibly empty) list of the names of the Base tables that make up the ’s scope. REF 114 Downloaded from EnggTree.com EnggTree.com The required syntax for a specification is as follows. :: = REF () [ SCOPE [reference scope check] ] ::= REFERENCES ARE [NOT] CHECKED [ ON DELETE {CASCADE | SET NULL | SET DEFAULT | RESTRICT | NO ACTION} ] A specification defines a pointer: its value is a value that references some site. (A site either does or does not have a REF value.) For example, this REF specification defines a based on a UDT (the “referenced type”) called my_udt: REF(my_udt) As already mentioned, a REF is a pointer. The value in a REF coloumn “refers” to a row in a Base table that has the with REF value property (this is a known as a typed table). The row that the REF value points to contains a value of the UDT that the REF Column is based on. If you’re putting a REF specification in an SQL-Schema statement, the that owns the containing Schema must have the USAGE Privilege on “”. If you’re putting a REF specification in any other SQL statement, then your current must have the USAGE Privilege on “”. For each site that has a REF value and is defined to hold a value of the referenced UDT, there is exactly one REF value – at any time, it is distinct from the REF value of any other site in your SQL-environment. The of the REF value is REF (UDT). [NON-PORTABLE] The data type and size of a REF value in an application program must be some number of octets but is non-standard because the SQL Standard requires implementors to define the octet-length of a REF value. A REF value might have a scope: it determines the effect of a dereference operator on that value. A REF value’s scope is a list of Base table names and consists of every row in every one of those Base tables. The optional SCOPE clause of a specification identifies REF’s scope. Each Table named in the SCOPE clause must be a referenceable Base table with a structured type that is the same as the structured type of the UDT that REF is based on. Here is an examples: 115 Downloaded from EnggTree.com EnggTree.com CREATE TABLE Table_1 ( column_1 SMALLINT, column_2 REF(my_udt) SCOPE Table_2); If you omit the SCOPE clause, the scope defaults to the Table that owns the Column you’re defining. If your REF specification with a SCOPE clause is part of a , it must include this : REFERENCES ARE [NOT] CHECKED ON DELETE NO ACTION. If a REF specification with a SCOPE clause is part of a , it must include a with or without the optional ON DELETE sub-clause. The clause may not be used under any other circumstances. A is a subtype of a if (a) both are s and (b) the UDT referenced by the first is a subtype of the UDT referenced by the second. If you want to restrict your code to Core SQL, don’t use the REF. Reference Operations A is compatible with, and comparable to, all other s of the same referenced type – that is, s are mutually comparable and mutually assignable if they are based on the same UDT. CAST In SQL, CAST is a scalar operator that converts a given scalar value to a given scalar. CAST, howver, can´t be used with s. To cast REF values, you´ll have to use a user-defined cast. It isn’t, of course, possible to convert the values of every into the values of every other. You can cast a source to a UDT target and to any SQL predefined target (except for s and s) provided that a user-defined cast exist for this purpose and your current has the EXECUTE Privilege on that user-defined cast. When you cast a to any legal target, your DBMS incokes the user-defined cast. When you cast a to any legal target, your DBMS invokes the user-defined cast routine´s argument. The cast result in the value returned by the user-defined cast. Assignment In SQL, when a is assigned to a target, the assignment is straightforward – however, assignment is possible only if your source’s UDT is a subtype of the UDT of your target. 116 Downloaded from EnggTree.com EnggTree.com [Obscure Rule] Since only SQL accepts null values, if your source is NULL, then your target’s value is not changed. Instead, your DBMS will set its indicator parameter to -1, to indicate that an assignment of the null value was attempted. If your target doesn’t have an indicator parameter, the assignment will fail: your DBMS will return the SQLSTATE error 22002 "data exception-null value, no indicator parameter". Going the other way, there are two ways to assign a null value to an SQL-data target. Within SQL, you can use the NULL in an INSERT or an UPDATE statement to indicate that the target should be set to NULL; that is, if your source is NULL, your DBMS will set your target to vNULL``. Outside of SQL, if your source has an indicator parameter that is set to -1, your DBMS will set your target to NULL (regardless of the value of the source). (An indicator parameter with a value less than -1 will cause an error: your DBMS will return the SQLSTATE error 22010 "data exception-invalid indicator parameter value".) We’ll talk more about indicator parameters in our chapters on SQL binding styles. Comparison SQL provides only two scalar comparison operators – = and – to perform operations on s. Both will be familiar; there are equivalent operators in other computer languages. Two REF values are comparable if they’re both based on the same UDT. If either of the comparands are NULL, the result of the operation is UNKNOWN. Other Operations With SQL, you have several other operations that you can perform on s. Scalar functions SQL provides two scalar functions that operate on or return a : the and the. The required syntax for a is as follows. :: = reference_argument -> The operates on two operands — the first must evaluate to a that has a non-empty scope and the second must be the name of an Attribute of the ’s UDT. The allows you to access a Column of the row identified by a REF value; it returns a result whose is the of and whose value is the value of the system-generated Column of the Table in the ’s scope (where the system-generated Column is equal to reference_argument). That is, given a REF value, the returns the value at the site referenced by that REF value. If the REF value doesn’t identify a site (perhaps because the site it once identified has been destroyed), the returns NULL. 117 Downloaded from EnggTree.com EnggTree.com If you want to restrict your code to Core SQL, don’t use the. The required syntax for a is as follows. :: = reference_argument -> DEREF operates on any expression that evaluates to a that has a non- empty scope. It returns the value referenced by a REF value. Your current must have the SELECT WITH HIERARCHY Privilege on reference_argument's scope Table. If you want to restrict your code to Core SQL, don’t use DEREF. Set Functions SQL provides three set functions that operate on a : COUNT and GROUPING. Since none of these operate exclusively with REF arguments, we won’t discuss them here; look for them in our chapter on set functions. Predicates In addition to the comparison operators, SQL provides eight other predicates that operate on s: the , the , the , the , the , the , the and the. Each will return a boolean value: either TRUE, FALSE or UNKNOWN. Since none of them operates strictly on s, we won’t discuss them here. Look for them in our chapters on search conditions. 5.4 ROWTYPE Attribute Row s A is defined by a descriptor that contains three pieces of information: The ’s name: ROW. The ’s degree: the number of Fields that belong to the row. A descriptor for every Field that belongs to the row. The Field descriptor contains the name of the Field, the Field’s ordinal position in the , the Field’s and nullability attribute (or, if the Field is based on a Domain, the name of that Domain), the Field’s Character set and default Collation (for character string s) and the Field’s (for s). ROW Example: 118 Downloaded from EnggTree.com EnggTree.com The required syntax for a specification is as follows. ::= ROW ( [ {,}... ]) ::= { | } [ ] [ COLLATE ] A specification defines a row of data: it consists of a sequence of one or more parenthesized {,} pairs, known as Fields. The degree of a row is the number of Fields it contains. A value of a row consists of one value for each of its Fields, while a value of a Field is a value of the Field’s. Each Field in a row must have a unique name. Example of a specification: ROW (field_1 INT, field_2 DATE, field_3 INTERVAL (4) YEAR) A identifies a Field and is either a or a that is unique (for all Fields and Columns) within the Table it belongs to. You can define a Field’s either by putting a specification after or by putting a after the. The of a Field can be any type other than a – in particular, it can itself be a. Example, of a specification; It defines a row with one Field (called field_1) whose defined is DATE: ROW (field_1 DATE) [Obscure Rule] If the of a Field is CHAR, VARCHAR or CLOB, the Character set that the Field’s values must belong to is determined as follows: If the contains a specification that includes a CHARACTER SET clause, the Field’s Character set is the Character set named. Your current must have the USAGE Privilege on that Character set. If the does not include a specification, but the Field is based on a Domain whose definition includes a CHARACTER SET clause, the Field’s Character set is the Character set named. If the does not include any CHARACTER SET clause at all – either through a specification or through a Domain definition – the Field’s Character set is the Character set named in the DEFAULT CHARACTER SET clause of the CREATE SCHEMA statement that defines the Schema that the Field belongs to. 119 Downloaded from EnggTree.com EnggTree.com For example, the effect of these two SQL statements: CREATE SCHEMA bob AUTHORIZATION bob DEFAULT CHARACTER SET INFORMATION_SCHEMA.LATIN1; CREATE TABLE Table_1 ( column_1 ROW( field_1 CHAR(10), field_2 INT)); is to create a Table in Schema bob. The Table has a Column with a ROW , containing two Fields. The character string Field’s set of valid values are fixed length character strings, exactly 10 characters long, all of whose characters must be found in the INFORMATION_SCHEMA.LATIN1 Character set – the Schema’s default Character set. The effect of these two SQL statements: CREATE SCHEMA bob AUTHORIZATION bob DEFAULT CHARACTER SET INFORMATION_SCHEMA.LATIN1; CREATE TABLE Table_1 ( column_1 ROW( field_1 CHAR(10) CHARACTER SET INFORMATION_SCHEMA.SQL_CHARACTER, field_2 INT)); is to create the same Table with one difference: this time, the character string Field’s values must consist only of characters found in the INFORMATION_SCHEMA.SQL_CHARACTER Character set – the explicit Character set specification in CREATE TABLE constrains the Field’s set of values. The Schema’s default Character set does not. [Obscure Rule] If the of a Field is CHAR, VARCHAR, CLOB, NCHAR, NCHAR VARYING or NCLOB, and your does not include a COLLATE clause, the Field has a coercibility attribute of COERCIBLE – but if your includes a COLLATE clause, the Field has a coercibility attribute of IMPLICIT. In either case, the Field’s default Collation is determined as follows: If the includes a COLLATE clause, the Field’s default Collation is the Collation named. Your current must have the USAGE Privilege on that Collation. If the does not include a COLLATE clause, but does contain a 120 Downloaded from EnggTree.com EnggTree.com specification that includes a COLLATE clause, the Field’s default Collation is the Collation named. Your current must have the USAGE Privilege on that Collation. If the does not include a COLLATE clause, but the Field is based on a Domain whose definition includes a COLLATE clause, the Field’s default Collation is the Collation named. If the does not include any COLLATE clause at all – either explicitly, through a specification or through a Domain definition – the Field’s default Collation is the default Collation of the Field’s Character set. [Obscure Rule] If the of a Field is REF(UDT), your current must have the USAGE Privilege on that UDT. If the of a Field includes REF with a , your must also include this clause: REFERENCES ARE [NOT] CHECKED ON DELETE NO ACTION – to indicate whether references are to be checked or not. Do not add a clause under any other circumstances. If a Field is defined with REFERENCES ARE CHECKED, and a is included in the , then there is an implied DEFERRABLE INITIALLY IMMEDIATE Constraint on the Field. This Constraint checks that the Field´s values are also found in the corresponding Field of the system-generated Column of the Table named in the. If the of a Field in a row is a UDT, then the current must have the USAGE Privilege on that UDT. A is a subtype of a if (a) both are s with the same degree and (b) for every pair of corresponding s, the s are the same and the of the Field in the first is a supertype of the of the Field in the second. A returns a row. The required syntax for a is as follows. ::= ROW { | | } A row of data values belonging to a Table (or a query result, which is also a Table) is also considered to be a. In a Table, each Column of a data row corresponds to a Field of the : the Column and Field have the same ordinal positions in the Table and , respectively. A allows you to access a specific row of a Table or a query result. Here is an example of a that would return a row of a Table named TABLE_1: 121 Downloaded from EnggTree.com EnggTree.com ROW(Table_1) A returns a Field of a row. The required syntax for a is as follows. ::= row_argument. A allows you to access a specific Field of a row. It operates on two arguments: the first must evaluate to a and the second must be the name of a Field belonging to that row. If the value of row_argument is NULL, then the specified Field is also NULL. If row_argument has a non-null value, the value of the Field reference is the value of the specified Field in row_argument. Here is an example of a that would return the value of a Field named FIELD_1 that belongs to a row of TABLE_1: ROW(Table_1).field_1 An is used to construct a row of data. The required syntax for a is as follows. ::= element_expression | [ ROW ] (element_expression [ {,element_expression}... ]) | ( ) element_expression ::= element_expression | NULL | ARRAY[] | ARRAY??(??) | DEFAULT A allows you to assign values to the Fields of a row, using either a list of element_expressions of the result of a subquery. An element_expression may be any expression that evaluates to a scalar value with a that is assignable to the corresponding Field’s. A subquery – ( ) – is discussed in our chapter on complex queries. The result is a row whose n-th Field value is the value of the n-th element_expression (or whose value is the value of the subquery) you specify. If your element_expression is 122 Downloaded from EnggTree.com EnggTree.com NULL, the corresponding Field is assigned the null value. If your element_expression is ARRAY [] or ARRAY??(??), the corresponding Field is assigned an empty array. If your element_expression is DEFAULT, the corresponding Field is assigned its default value. Here is an example of a : ROW ('hello',567, DATE '1994-07-15’, NULL, DEFAULT, ARRAY []) This example constructs a row with six Fields. The first Field has a character string value of 'hello', the second has a numeric value of 567, the third has a date value of '1994-07-15', the fourth has a null value, the fifth has a value that is the fifth Field’s default value and the sixth has a value that is an empty array. This would be valid for this specification: ROW (field_1 CHAR (5), field_2 SMALLINT, field_3 DATE, field_4 BIT (4), field_5 domains_1, field_6 INT ARRAY ) A serves the same purpose for a row as a does for a predefined. It has the same format as the ’s ROW () – but instead of a series of s inside the size delimiters, it contains comma- delimited values of the correct for each Field. For example, if your specification is: ROW (field_1 INT, field_2 CHAR (5), field_3 BIT (4)) a valid would be: ROW (20,'hello’, B'1011') If you construct a row with a subquery, the row takes on the of the subquery’s result. An empty subquery result constructs a one-Field row whose value is NULL. A non-empty subquery result constructs a one-Field row whose value is the subquery result. If you want to restrict your code to Core SQL, (a) don’t use the ROW or s and s and, when using a , (b) don’t use ARRAY[] or ARRAY??(??) as an element_expression,(c) don’t construct a row with more than one Field,(d) don’t use the ROW in front of your element_expression, and (e) don’t use a subquery to construct your row. Row Operations 123 Downloaded from EnggTree.com EnggTree.com A row is compatible with, and comparable to, any row with compatible Fields – that is, rows are mutually comparable and mutually assignable only if they have the same number of Fields and each corresponding pair of Fields are mutually comparable and mutually assignable. Rows may not be directly compared with, or directly assigned to, any other class, though implicit type conversions of their Fields can occur in expressions, SELECTs, INSERTs, DELETEs and UPDATEs. Explicit row type conversions are not possible. Assignment In SQL, when a is assigned to a target, the assignment is done one Field at a time – that is, the source’s first Field value is assigned to the target’s first Field, the source’s second Field value is assigned to the target’s second Field, and so on. Assignment of a to another is possible only if (a) both s have the same number of Fields and (b) each corresponding pair of Fields have s that are mutually assignable. [Obscure Rule] Since only SQL accepts null values, if your source is NULL, then your target’s value is not changed. Instead, your DBMS will set its indicator parameter to -1, to indicate that an assignment of the null value was attempted. If your target doesn’t have an indicator parameter, the assignment will fail: your DBMS will return the SQLSTATE error 22002 "data exception-null value, no indicator parameter". Going the other way, there are two ways to assign a null value to an SQL-data target. Within SQL, you can use the NULL in an INSERT or an UPDATE statement to indicate that the target should be set to NULL; that is, if your source is NULL, your DBMS will set your target to NULL. Outside of SQL, if your source has an indicator parameter that is set to -1, your DBMS will set your target to NULL (regardless of the value of the source). (An indicator parameter with a value less than -1 will cause an error: your DBMS will return the SQLSTATE error 22010 "data exception-invalid indicator parameter value".) We’ll talk more about indicator parameters in our chapters on SQL binding styles. Comparison SQL provides the usual scalar comparison operators – = and and < and and >= – to perform operations on rows. All of them will be familiar; there are equivalent operators in other computer languages. Two rows are comparable if (a) both have the same number of Fields and (b) each corresponding pair of Fields have s that are mutually comparable. Comparison is between pairs of Fields in corresponding ordinal positions – that is, the first Field of the first row is compared to the first Field of the second row, the second Field of the first row is compared to the second Field of the second row, an so on. If either comparand is NULL the result of the operation is UNKNOWN. 124 Downloaded from EnggTree.com EnggTree.com The result of a comparison depends on two things: (a) the comparison operator and (b) whether any Field is NULL. The order of comparison is: If the comparison operator is = or : First the Field pairs which don´t include NULLs, then the pairs which do. If the comparison operator is anything other than = or : Field pairs from left to right. Comparison stops when the result is unequal or UNKNOWN, or when there are no more Fields. The result of the row comparison is the result of the last Field pair comparison. Here are the possibilities. If the comparison operator is =. The row comparison is (a) TRUE if the comparison is TRUE for every pair of Fields, (b) FALSE if any non-null pair is not equal, and (c) UNKNOWN if at least one Field is NULL and all non-null pairs are equal. For example: ROW (1,1,1) = ROW (1,1,1) -- returns TRUE ROW (1,1,1) = ROW (1,2,1) -- returns FALSE ROW (1, NULL,1) = ROW (2,2,1) -- returns FALSE ROW (1, NULL,1) = ROW (1,2,1) -- returns UNKNOWN Comparison operator is. The row comparison is (a) TRUE if any non-null pair is not equal, (b) FALSE if the comparison is FALSE for every pair of Fields, and (c) UNKNOWN if at least one Field is NULL and all non-null pairs are equal. For example: ROW (1,1,1) ROW (1,2,1) -- returns TRUE ROW (1, NULL,2) ROW (2,2,1) -- returns TRUE ROW (2,2,1) ROW (2,2,1) -- returns FALSE ROW (1, NULL,1) ROW (1,2,1) -- returns UNKNOWN Comparison operator is anything other than = or. The row comparison is (a) TRUE if the comparison is TRUE for at least one pair of Field and every pair before the TRUE result is equal, (b) FALSE uf the comparison is FALSE for at least one pair of Fields and every pair before the FALSE result is equal, and (c) UNKNOWN if the comparison is UNKNWON for at least one pair of Fields and every pair before the UNKNOWN result is equal. Comparison stops as soon as any of these results (TRUE, FALSE, or UNKNOWN) is established. For example: ROW (1,1,1) < ROW (1,2,1) -- returns TRUE 125 Downloaded from EnggTree.com EnggTree.com ROW (1, NULL,1) < ROW (2, NULL,0) -- returns TRUE ROW (1,1,1) < ROW (1,1,1) -- returns FALSE ROW (3, NULL,1) < ROW (2, NULL,0) -- returns FALSE ROW (2, NULL,1) < ROW (1,2,0) -- returns UNKNOWN ROW (NULL,1,1) < ROW (2,1,0) -- returns UNKNOWN SQL also provides three quantifiers – ALL, SOME, ANY – which you can use along with a comparison operator to compare a row value with the collection of values returned by a. Place the quantifier after the comparison operator, immediately before the. For example: SELECT row_column FROM Table_1 WHERE row_column < ALL ( SELECT row_column FROM Table_2); ALL returns TRUE either (a) if the collection is an empty set (i.e.: if it contains zero rows) or (b) if the comparison operator returns TRUE for every value in the collection. ALL returns FALSE if the comparison operator returns FALSE for at least one value in the collection. SOME and ANY are synonyms. They return TRUE if the comparison operator returns TRUE for at least one value in the collection. They return FALSE either (a) if the collection is an empty set or (b) if the comparison operator returns FALSE for every value in the collection. The search condition = ANY (collection) is equivalent to “IN (collection)`` 5.5 UDTs A UDT is defined by a descriptor that contains twelve pieces of information: 1. The , qualified by the of the Schema it belongs to. 2. Whether the UDT is ordered. 3. The UDT’s ordering form: either EQUALS, FULL or NONE. 4. The UDT’s ordering category: either RELATIVE, HASH or STATE. 5. The that identifies the UDT’s ordering function. 6. If the UDT is a direct subtype of one or more other UDTs, then the names of 126 Downloaded from EnggTree.com EnggTree.com those UDTs. 7. If the UDT is a distinct type, then the descriptor of the it’s based on; otherwise an Attribute descriptor for each of the UDT’s Attributes. 8. The UDT’s degree: the number of its Attributes. 9. Whether the UDT is instantiable or not instantiable. 10. Whether the UDT is final or not final. 11. The UDT’s Transform descriptor. 12. If the UDT’s definition includes a method signature list, a descriptor for each method signature named. To create a UDT, use the CREATE TYPE statement (either as a stand-alone SQL statement or within a CREATE SCHEMA statement). CREATE TYPE specifies the enclosing Schema, names the UDT and identifies the UDT’s set of valid values. To destroy a UDT, use the DROP TYPE statement. None of SQL3’s UDT syntax is Core SQL, so if you want to restrict your code to Core SQL, don’t use UDTs. UDT Names A identifies a UDT. The required syntax for a is: ::= [. ] unqualified name A is a or a that is unique (for all Domains and UDTs) within the Schema it belongs to. The which qualifies a names the Schema that the UDT belongs to and can either be explicitly stated, or a default will be supplied by your DBMS as follows: If a in a CREATE SCHEMA statement isn’t qualified, the default qualifier is the name of the Schema you’re creating. If the unqualified is found in any other SQL statement in a Module, the default qualifier is the name of the Schema identified in the SCHEMA clause or AUTHORIZATION clause of the MODULE statement that defines that Module UDT Example Here’s an example of a UDT definition: CREATE TYPE book_udt AS -- the UDT name will be book_udt title CHAR (40), -- title is the first attribute buying_price DECIMAL (9,2), -- buying_price is the second attribute selling_price DECIMAL (9,2) -- selling_price is the third attribute 127 Downloaded from EnggTree.com EnggTree.com NOT FINAL -- this is a mandatory Finality Clause METHOD profit () RETURNS DECIMAL (9,2); -- profit is a method, defined later This CREATE TYPE statement results in a UDT named BOOK_UDT. The components of the UDT are three attributes (named TITLE, BUYING_PRICE and SELLING_PRICE) and one method (named PROFIT). The three name-and-data-type pairs title CHAR (40) and buying_price DECIMAL (9,2) and selling_price DECIMAL (9,2) are the UDT’s Attribute definitions. The words NOT FINAL matter only for subtyping, which we’ll get to later. Briefly, though, if a UDT definition doesn’t include an UNDER clause, the finality clause must specify NOT FINAL. The clause METHOD profit () RETURNS DECIMAL (9,2) is a teaser. Like an Attribute, a “method” is a component of a UDT. However, this method – PROFIT – is actually a declaration that a function named PROFIT exists. This function isn’t defined further in the UDT definition – there is a separate SQL statement for defining functions: CREATE METHOD. All we can see at this stage is that PROFIT has a name and a (predefined) data type>, just as regular Attributes do. Some people would call PROFIT a “derived Attribute”. 5.6 Super type and Sub type Purpose of the Supertypes and Subtypes Supertypes and subtypes occur frequently in the real world: food order types (eat in, to go) grocery bag types (paper, plastic) payment types (check, cash, credit) You can typically associate ‘choices’ of something with supertypes and subtypes. For example, what will be the method of payment – cash, check or credit card? Understanding real world examples helps us understand how and when to model them. Evaluating Entities Often some instances of an entity have attributes and/or relationships that other instances do not have. Imagine a business which needs to track payments from customers. 128 Downloaded from EnggTree.com EnggTree.com Customers can pay by cash, by check, or by credit card. All payments have some common attributes: payment date, payment amount, and so on. But only credit cards would have a “card number” attribute. And for credit card and check payments, we may need to know which CUSTOMER made the payment, while this is not needed for cash payments Should we create a single PAYMENT entity or three separate entities CASH, CHECK, and CREDIT CARD? And what happens if in the future we introduce a fourth method of payment? Subdivide an Entity Sometimes it makes sense to subdivide an entity into subtypes. This may be the case when a group of instances has special properties, such as attributes or relationships that exist only for that group. In this case, the entity is called a “supertype” and each group is called a “subtype”. Subtype Characteristics A subtype: Inherits all attributes of the supertype Inherits all relationships of the supertype Usually has its own attributes or relationships Is drawn within the supertype Never exists alone May have subtypes of its own 129 Downloaded from EnggTree.com EnggTree.com Supertype Example EXAM is a supertype of QUIZ, MIDTERM, and FINAL. The subtypes have several attributes in common. These common attributes are listed at the supertype level. The same applies to relationships. Subtypes inherit all attributes and relationships of the supertype entity. Read the diagram as: Every QUIZ, MIDTERM, or FINAL is an EXAM (and thus has attributes such as description, weight, date, and grade). 130 Downloaded from EnggTree.com EnggTree.com Conversely: Every EXAM is either a QUIZ, a MIDTERM, or a FINAL. Always More Than One Subtype When an ER model is complete, subtypes never stand alone. In other words, if an entity has a subtype, a second subtype must also exist. This makes sense. A single subtype is exactly the same as the supertype. This idea leads to the two subtype rules: Exhaustive: Every instance of the supertype is also an instance of one of the subtypes. All subtypes are listed without omission. Mutually Exclusive: Each instance of a supertype is an instance of only one possible subtype. At the conceptual modeling stage, it is good practice to include an OTHER subtype to make sure that your subtypes are exhaustive — that you are handling every instance of the supertype. Subtypes Always Exist Any entity can be subtyped by making up a rule that subdivides the instances into groups. But being able to subtype is not the issue—having a reason to subtype is the issue. When a need exists within the business to show similarities and differences between instances, then subtype. 131 Downloaded from EnggTree.com EnggTree.com Correctly Identifying Subtypes When modeling supertypes and subtypes, you can use three questions to see if the subtype is correctly identified: Is this subtype a kind of supertype? Have I covered all possible cases? (exhaustive) Does each instance fit into one and only one subtype? (mutually exclusive) Nested Subtypes You can nest subtypes. For ease of reading — “readability” — you would usually show subtypes with only two levels, but there is no rule that would stop you from going beyond two levels. 5.7 User-Defined routines (UDR) User-defined routines (UDR) are functions that perform specific actions that you can define in your SIL™ programs for a later use. These can considerably improve the 132 Downloaded from EnggTree.com EnggTree.com readability and maintainability of your code. Syntax function ( param1, param2,...) { Instruction1;... InstructionN; return ; } Example function zero () { return 0; } number a = zero (); Parameters The list of parameters in the definition of a UDR can be of any length (including 0) and their respective types can be any valid SIL™ type. Eample: function zero () { return 0; } function doSomething(string s, number n1, number [] n2, boolean flag, string [] oneMore){.... } Constant Parameters Parameters of user-defined routines can be made read-only in the scope of the routine by adding the keyword "const" before the parameter definition in the signature of the routine. function f (const string s) {... 133 Downloaded from EnggTree.com EnggTree.com } Variable visibility There are three categories of variables that can be used in a UDR: Local variables These are the variables you define in the body of the UDR. These can be used throughout the body of the UDR. On exit, the values of these variables are lost. function example () { number a = 3; number b = a + 10; // use here variables a and b } Parameter variables These are the values passed to the UDR in the list of parameters. Because SIL™ uses a "pass-by-value" policy, even though you modify the value of these variables in the body of the function, on exit, their original values will be restored. function increment (number a) { a = a + 1; // the value of a is only modified locally return a; } number b = 0; number c = increment(b); // the value of b does not change print(b); // this prints 0 print(c); // this prints 1 Global variables These are the variables that are already defined and can be used right away (issue fields, customfields and any variables defined before the routine). You can use issue fields and custom fields anywhere in your code (including in the UDR body) without having to declare them. function print Key () { print(key); 134 Downloaded from EnggTree.com EnggTree.com } Return value Return values can be used to communicate with the context that called the UDR or to halt its execution. Examples function isEven(number a){ return (a % 2 == 0); } function increment (number a) { return a + 1; } number b = increment (2); Notice that there is no need to declare the type of the return value; this will be evaluated at runtime. Therefore, even though the check on the following program will be ok, at runtime the value of d will NOT be modified because of the incompatibility between date (on the right-hand-side) and number (on the left-hand-side). function increment (number a) { return a + 1; } date d = increment (2); You can return simply from a routine without specifying a value. However, you should always remember that by design routines return a value, even if it is undefined. The following code is therefore valid: function f (number a) { if (a > 0) { print("positive"); return; } 135 Downloaded from EnggTree.com EnggTree.com if (a == 0) {print("ZERO");} } //[...................] string s =f (4); //s is still undefined, no value was returned if(isNull(s)) { ? print ("S IS NULL!"); //this will be printed } else { ? print ("S IS NOT NULL!"); } Of course, the above code will print the text 'S IS NULL' in the log. 5.8 Collection types A collection is an ordered group of elements having the same data type. Each element is identified by a unique subscript that represents its position in the collection. PL/SQL provides three collection types − Index-by tables or Associative array Nested table Variable-size array or Varray Oracle documentation provides the following characteristics for each type of collections – Collection Number of Subscript Dense or Where Can Be Type Elements Type Sparse Created Object Type Attribute Associative Unbounded String or Either Only in No array (or integer PL/SQL index-by block table) Nested Unbounded Integer Starts Either in Yes table dense, can PL/SQL become block or at sparse schema level Variablesize Bounded Integer Always Either in Yes array dense PL/SQL 136 Downloaded from EnggTree.com EnggTree.com (Varray) block or at schema level Both types of PL/SQL tables, i.e., the index-by tables and the nested tables have the same structure and their rows are accessed using the subscript notation. However, these two types of tables differ in one aspect; the nested tables can be stored in a database column and the index-by tables cannot. Index-By Table An index-by table (also called an associative array) is a set of key-value pairs. Each key is unique and is used to locate the corresponding value. The key can be either an integer or a string. An index-by table is created using the following syntax. Here, we are creating an index- by table named table_name, the keys of which will be of the subscript_type and associated values will be of the element_type TYPE type_name IS TABLE OF element_type [NOT NULL] INDEX BY subscript_type; table_name type_name; Example Following example shows how to create a table to store integer values along with names and later it prints the same list of names. DECLARE TYPE salary IS TABLE OF NUMBER INDEX BY VARCHAR2(20); salary_list salary; name VARCHAR2(20); BEGIN -- adding elements to the table salary_list('Rajnish') := 62000; salary_list('Minakshi') := 75000; salary_list('Martin') := 100000; salary_list('James') := 78000; -- printing the table name := salary_list.FIRST; WHILE name IS NOT null LOOP 137 Downloaded from EnggTree.com EnggTree.com dbms_output.put_line ('Salary of ' || name || ' is ' || TO_CHAR(salary_list(name))); name := salary_list.NEXT(name); END LOOP; END; / When the above code is executed at the SQL prompt, it produces the following result − Salary of James is 78000 Salary of Martin is 100000 Salary of Minakshi is 75000 Salary of Rajnish is 62000 PL/SQL procedure successfully completed. Example Elements of an index-by table could also be a %ROWTYPE of any database table or %TYPE of any database table field. The following example illustrates the concept. We will use the CUSTOMERS table stored in our database as − Select * from customers; +----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | +----+----------+-----+-----------+----------+ DECLARE CURSOR c_customers is select name from customers; TYPE c_list IS TABLE of customers.Name%type INDEX BY binary_integer; name_list c_list; counter integer:=0; 138 Downloaded from EnggTree.com EnggTree.com BEGIN FOR n IN c_customers LOOP counter:= counter +1; name_list(counter):= n.name; dbms_output.put_line('Customer('||counter||'):'||name_lis t(counter)); END LOOP; END; / When the above code is executed at the SQL prompt, it produces the following result − Customer (1): Ramesh Customer (2): Khilan Customer (3): kaushik Customer (4): Chaitali Customer (5): Hardik Customer (6): Komal PL/SQL procedure successfully completed Nested Tables A nested table is like a one-dimensional array with an arbitrary number of elements. However, a nested table differs from an array in the following aspects − An array has a declared number of elements, but a nested table does not. The size of a nested table can increase dynamically. An array is always dense, i.e., it always has consecutive subscripts. A nested array is dense initially, but it can become sparse when elements are deleted from it. A nested table is created using the following syntax − TYPE type_name IS TABLE OF element_type [NOT NULL]; table_name type_name; This declaration is similar to the declaration of an index-by table, but there is no INDEX BY clause. A nested table can be stored in a database column. It can further be used for simplifying SQL operations where you join a single-column table with a larger table. An associative 139 Downloaded from EnggTree.com EnggTree.com array cannot be stored in the database. Example The following examples illustrate the use of nested table − DECLARE TYPE names_table IS TABLE OF VARCHAR2(10); TYPE grades IS TABLE OF INTEGER; names names_table; marks grades; total integer; BEGIN names := names_table('Kavita', 'Pritam', 'Ayan', 'Rishav', 'Aziz'); marks:= grades(98, 97, 78, 87, 92); total := names.count; dbms_output.put_line('Total '|| total || ' Students'); FOR i IN 1.. total LOOP dbms_output.put_line('Student:'||names(i)||', Marks:' || marks(i)); end loop; END; / When the above code is executed at the SQL prompt, it produces the following result − Total 5 Students Student:Kavita, Marks:98 Student:Pritam, Marks:97 Student:Ayan, Marks:78 Student:Rishav, Marks:87 Student:Aziz, Marks:92 PL/SQL procedure successfully completed. Example 140 Downloaded from EnggTree.com EnggTree.com Elements of a nested table can also be a %ROWTYPE of any database table or %TYPE of any database table field. The following example illustrates the concept. We will use the CUSTOMERS table stored in our database as − Select * from customers; +----+----------+-----+-----------+----------+ | ID | NAME | AGE | ADDRESS | SALARY | +----+----------+-----+-----------+----------+ | 1 | Ramesh | 32 | Ahmedabad | 2000.00 | | 2 | Khilan | 25 | Delhi | 1500.00 | | 3 | kaushik | 23 | Kota | 2000.00 | | 4 | Chaitali | 25 | Mumbai | 6500.00 | | 5 | Hardik | 27 | Bhopal | 8500.00 | | 6 | Komal | 22 | MP | 4500.00 | +----+----------+-----+-----------+----------+ DECLARE CURSOR c_customers is SELECT name FROM customers; TYPE c_list IS TABLE of customerS.No.ame%type; name_list c_list := c_list(); counter integer :=0; BEGIN FOR n IN c_customers LOOP counter := counter +1; name_list.extend; name_list(counter) := n.name; dbms_output.put_line('Customer('||counter||'):'||name_list(counter)); END LOOP; END; / When the above code is executed at the SQL prompt, it produces the following result − Customer(1): Ramesh Customer(2): Khilan Customer(3): kaushik Customer(4): Chaitali 141 Downloaded from EnggTree.com EnggTree.com Customer(5): Hardik Customer(6): Komal PL/SQL procedure successfully completed. Variable size array(Varray) type The PL/SQL programming language provides a data structure called the VARRAY, which can store a fixed-size sequential collection of elements of the same type. A varray is used to store an ordered collection of data, however it is often better to think of an array as a collection of variables of the same type. All varrays consist of contiguous memory locations. The lowest address corresponds to the first element and the highest address to the last element. Varrays in PL/SQL An array is a part of collection type data and it stands for variable-size arrays. We will study other collection types in a later chapter 'PL/SQL Collections'. Each element in a varray has an index associated with it. It also has a maximum size that can be changed dynamically. Creating a Varray Type A varray type is created with the CREATE TYPE statement. You must specify the maximum size and the type of elements stored in the varray. The basic syntax for creating a VARRAY type at the schema level is − CREATE OR REPLACE TYPE varray_type_name IS VARRAY(n) of Where, varray_type_name is a valid attribute name, n is the number of elements (maximum) in the varray, element_type is the data type of the elements of the array. Maximum size of a varray can be changed using the ALTER TYPE statement. For example, 142 Downloaded from EnggTree.com EnggTree.com CREATE Or REPLACE TYPE namearray AS VARRAY (3) OF VARCHAR2(10); / Type created. The basic syntax for creating a VARRAY type within a PL/SQL block is − TYPE varray_type_name IS VARRAY(n) of For example – TYPE namearray IS VARRAY(5) OF VARCHAR2(10); Type grades IS VARRAY(5) OF INTEGER; Let us now work out on a few examples to understand the concept Example 1 The following program illustrates the use of varrays DECLARE type namesarray IS VARRAY(5) OF VARCHAR2(10); type grades IS VARRAY(5) OF INTEGER; names namesarray; marks grades; total integer; BEGIN names := namesarray('Kavita', 'Pritam', 'Ayan', 'Rishav', 'Aziz'); marks:= grades(98, 97, 78, 87, 92); total := names.count; dbms_output.put_line('Total '|| total || ' Students'); FOR i in 1.. total LOOP dbms_output.put_line('Student: ' || names(i) || ' Marks: ' || marks(i)); END LOOP; END; / 143 Downloaded from EnggTree.com EnggTree.com When the above code is executed at the SQL prompt, it produces the following result Total 5 Students Student: Kavita Marks: 98 Student: Pritam Marks: 97 Student: Ayan Marks: 78 Student: Rishav Marks: 87 Student: Aziz Marks: 92 PL/SQL procedure successfully completed. 5.9 Object Query Language; No-SQL: CAP theorem CAP theorem The CAP theorem is about how distributed database systems behave in the face of network instability. When working with distributed systems over unreliable networks we need to consider the properties of consistency and availability in order to make the best decision about what to do when systems fail. The CAP theorem introduced by Eric Brewer in 2000 states that any distributed database system can have at most two of the following three desirable properties Consistency: Consistency is about having a single, up-to-date, readable version of our data available to all clients. Our data should be consistent - no matter how many clients reading the same items from replicated and distributed partitions we should get consistent results. All writes are atomic and all subsequent requests retrieve the new value. High availability: This property states that the distributed database will always allow database clients to make operations like select or update on items without delay. Internal communication failures between replicated data shouldn’t prevent operations on it. The database will always return a value as long as a single server is running. Partition tolerance: This is the ability of the system to keep responding to client requests even if there’s a communication failure between database partitions. The system will still function even if network communication between partitions is temporarily lost. Note that the CAP theorem only applies in cases when there’s a connection failure between partitions in our cluster. The more reliable our network, the lower the probability we will need to think about this theorem. The CAP theorem helps us 144 Downloaded from EnggTree.com EnggTree.com understand that once we partition our data, we must determine which options best match our business requirements: consistency or availability. Remember: at most two of the aforementioned three desirable properties can be fulfilled, so we have to select either consistency or availability. 5.10 MongoDB CRUD Operations Data Model Design Effective data models support your application needs. The key consideration for the structure of your documents is the decision to embed or to use references. Embedded Data Models With MongoDB, you may embed related data in a single structure or document. These schema are generally known as "denormalized" models, and take advantage of MongoDB's rich documents. Consider the following diagram: Embedded data models allow applications to store related pieces of information in the same database record. As a result, applications may need to issue fewer queries and updates to complete common operations. In general, use embedded data models when: you have "contains" relationships between entities. See Model One-to-One Relationships with Embedded Documents. you have one-to-many relationships between entities. In these relationships the "many" or child documents always appear with or are viewed in the context of the "one" or parent documents. See Model One-to-Many Relationships with Embedded Documents. 145 Downloaded from EnggTree.com EnggTree.com In general, embedding provides better performance for read operations, as well as the ability to request and retrieve related data in a single database operation. Embedded data models make it possible to update related data in a single atomic write operation. To access data within embedded documents, use dot notation to "reach into" the embedded documents. See query for data in arrays and query data in embedded documents for more examples on accessing data in arrays and embedded documents. Embedded Data Model and Document Size Limit Documents in MongoDB must be smaller than the maximum BSON document size. For bulk binary data, consider GridFS. Normalized Data Models Normalized data models describe relationships using references between documents. In general, use normalized data models: when embedding would result in duplication of data but would not provide sufficient read performance advantages to outweigh the implications of the duplication. to represent more complex many-to-many relationships. to model large hierarchical data sets. CRUD operations CRUD operations create, read, update, and delete documents. Create Operations: Create or insert operations add new documents to a collection. If the collection does not currently exist, insert operations will create the collection. MongoDB provides the following methods to insert documents into a collection: 146 Downloaded from EnggTree.com EnggTree.com db.collection.insertOne() db.collection.insertMany() In MongoDB, insert operations target a single collection. All write operations in MongoDB are atomic on the level of a single document. Read Operations: Read operations retrieve documents from a collection; i.e. query a collection for documents. MongoDB provides the following methods to read documents from a collection: db.collection.find() You can specify query filters or criteria that identify the documents to return. Update Operations: Update operations modify existing documents in a collection. MongoDB provides the following methods to update documents of a collection: db.collection.updateOne() New in version 3.2 db.collection.updateMany() New in version 3.2 db.collection.replaceOne() New in version 3.2 In MongoDB, update operations target a single collection. All write operations in MongoDB are atomic on the level of a single document. You can specify criteria, or filters, that identify the documents to update. These filters use the same syntax as read operations. Delete Operations: Delete operations remove documents from a collection. MongoDB 147 Downloaded from EnggTree.com EnggTree.com provides the following methods to delete documents of a collection: db.collection.deleteOne() New in version 3.2 db.collection.deleteMany() New in version 3.2 In MongoDB, delete operations target a single collection. All write operations in MongoDB are atomic on the level of a single document. You can specify criteria, or filters, that identify the documents to remove. These filters use the same syntax as read operations. 5.11 HBase Data Model and CRUD Operations The HBase Data Model is designed to handle semi-structured data that may differ in field size, which is a form of data and columns. The data model’s layout partitions the data into simpler components and spread them across the cluster. HBase's Data Model consists of various logical components, such as a table, line, column, family, column, column, cell, and edition. Table: An HBase table is made up of several columns. The tables in HBase defines upfront during the time of the schema specification. Row: An HBase row consists of a row key and one or more associated value columns. Row 148 Downloaded from EnggTree.com EnggTree.com keys are the bytes that are not interpreted. Rows are ordered lexicographically, with the first row appearing in a table in the lowest order. The layout of the row key is very critical for this purpose. Column: A column in HBase consists of a family of columns and a qualifier of columns, which is identified by a character: (colon). Column Family: Apache HBase columns are separated into the families of columns. The column families physically position a group of columns and their values to increase its performance. Every row in a table has a similar family of columns, but there may not be anything in a given family of columns. The same prefix is granted to all column members of a column family. For example, Column courses: history and courses: math, are both members of the column family of courses. The character of the colon (:) distinguishes the family of columns from the qualifier of the family of columns. The prefix of the column family must be made up of printable characters. During schema definition time, column families must be declared upfront while columns are not specified during schema time. They can be conjured on the fly when the table is up and running. Physically, all members of the column family are stored on the file system together. Column Qualifier The column qualifier is added to a column family. A column standard could be content (html and pdf), which provides the content of a column unit. Although column families are set up at table formation, column qualifiers are mutable and can vary significantly from row to row. Cell: A Cell store data and is quite a unique combination of row key, Column Family, and the Column. The data stored in a cell call its value and data types, which is every time treated as a byte []. Timestamp: In addition to each value, the timestamp is written and is the identifier for a given version of a number. 149 Downloaded from EnggTree.com EnggTree.com The timestamp reflects the time when the data is written on the Region Server. But when we put data into the cell, we can assign a different timestamp value. CRUD Operations 1. Create a data-Hbase Inserting Data using HBase Shell- to create data in an HBase table. To create data in an HBase table, the following commands and methods are used: put command, add () method of Put class, and put () method of HTable class. As an example, we are going to create the following table in HBase. z Using put command, you can insert rows into a table. Its syntax is as follows: put’’,’row1’,’’,’’ Inserting the First Row Let us insert the first-row values into the emp table as shown below. hbase(main): 005:0> put 'emp','1','personal data:name','raju' 0 row(s) in 0.6600 seconds hbase(main): 006:0> put 'emp','1','personal data:city','hyderabad' 0 row(s) in 0.0410 seconds hbase(main): 007:0> put 'emp','1','professional data:designation','manager' 150 Downloaded from EnggTree.com EnggTree.com 0 row(s) in 0.0240 seconds hbase(main): 007:0> put 'emp','1','professional data: salary','50000' 0 row(s) in 0.0240 seconds Insert the remaining rows using the put command in the same way. If you insert the whole table, you will get the following output. hbase(main): 022:0> scan 'emp' ROW COLUMN+CELL 1 column=personal data:city, timestamp=1417524216501, value=hyderabad 1 column=personal data:name, timestamp=1417524185058, value=ramu 1 column=professional data:designation, timestamp=1417524232601, value=manager 1 column=professional data:salary, timestamp=1417524244109, value=50000 2 column=personal data:city, timestamp=1417524574905, value=chennai 2 column=personal data:name, timestamp=1417524556125, value=ravi 2 column=professional data:designation, timestamp=1417524592204, value=sr:engg 2 column=professional data:salary, timestamp=1417524604221, value=30000 3 column=personal data:city, timestamp=1417524681780, value=delhi 3 column=personal data:name, timestamp=1417524672067, value=rajesh 3 column=professional data:designation, timestamp=1417524693187, value=jr:engg 3 column=professional data:salary, timestamp=1417524702514, value=25000 Inserting Data Using Java API You can insert data into Hbase using the add () method of the Put class. You can save it using the put () method of the HTable class. These classes belong to the org.apache.hadoop.hbase.client package. Below given are the steps to create data in a Table of HBase. Step 1: Instantiate the Configuration Class The Configuration class adds HBase configuration files to its object. You can create a 151 Downloaded from EnggTree.com EnggTree.com configuration object using the create () method of the HbaseConfiguration class as shown below. Configuration conf = HbaseConfiguration.create(); Step 2: Instantiate the HTable Class You have a class called HTable, an implementation of Table in HBase. This class is used to communicate with a single HBase table. While instantiating this class, it accepts configuration object and table name as parameters. You can instantiate HTable class as shown below. HTable hTable = new HTable(conf, tableName); Step 3: Instantiate the PutClass To insert data into an HBase table, the add () method and its variants are used. This method belongs to Put, therefore instantiate the put class. This class requires the row name you want to insert the data into, in string format. You can instantiate the Put class as shown below. Put p = new Put (Bytes.toBytes("row1")); Step 4: Insert Data The add () method of Put class is used to insert data. It requires 3-byte arrays representing column family, column qualifier (column name), and the value to be inserted, respectively. Insert data into the HBase table using the add () method as shown below. p.add(Bytes.toBytes("coloumn family "), Bytes.toBytes("column name"), Bytes.toBytes("value")); Step 5: Save the Data in Table After inserting the required rows, save the changes by adding the put instance to the put () method of HTable class as shown below. hTable.put(p); Step 6: Close the HTable Instance After creating data in the HBase Table, close the HTable instance using the close () method as shown below. hTable.close(); Given below is the complete program to create data in HBase Table. import java.io.IOException; 152 Downloaded from EnggTree.com EnggTree.com import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hbase.HBaseConfiguration; import org.apache.hadoop.hbase.client.HTable; import org.apache.hadoop.hbase.client.Put; import org.apache.hadoop.hbase.util.Bytes; public class InsertData{ public static void main (String [] args) throws IOException { // Instantiating Configuration class Configuration config = HBaseConfiguration.create(); // Instantiating HTable class HTable hTable = new HTable(config, "emp"); // Instantiating Put class // accepts a row name. Put p = new Put (Bytes.toBytes("row1")); // adding values using add () method // accepts column family name, qualifier/row name, value p.add(Bytes.toBytes("personal"), Bytes.toBytes("name"), Bytes.toBytes("raju")); p.add(Bytes.toBytes("personal"), Bytes.toBytes("city"), Bytes.toBytes("hyderabad")); p.add (Bytes.toBytes("professional"), Bytes.toBytes("designation"), Bytes.toBytes("manager")); p.add(Bytes.toBytes("professional"),Bytes.toBytes("salary"), Bytes.toBytes("50000")); // Saving the put Instance to the HTable. hTable.put(p); System.out.println("data inserted"); // closing HTable 153 Downloaded from EnggTree.com EnggTree.com hTable.close(); } } Compile and execute the above program as shown below. $javac InsertData.java $java InsertData The following should be the output: data inserted 2. Updating Data using HBase Shell You can update an existing cell value using the put command. To do so, just follow the same syntax and mention your new value as shown below. put ‘table name’,’row’,'Column family:column name',’new value’ The newly given value replaces the existing value, updating the row. Example Suppose there is a table in HBase called emp with the following data. hbase(main): 003:0> scan 'emp' ROW COLUMN + CELL row1 column = personal:name, timestamp = 1418051555, value = raju row1 column = personal:city, timestamp = 1418275907, value = Hyderabad row1 column = professional:designation, timestamp = 14180555,value = manager row1 column = professional:salary, timestamp = 1418035791555,value = 50000 1 row(s) in 0.0100 seconds The following command will update the city value of the employee named ‘Raju’ to Delhi. hbase(main): 002:0> put 'emp','row1','personal: city','Delhi' 0 row(s) in 0.0400 seconds The updated table looks as follows where you can observe the city of Raju has been changed to ‘Delhi’. hbase(main): 003:0> scan 'emp' ROW COLUMN + CELL 154 Downloaded from EnggTree.com EnggTree.com row1 column = personal:name, timestamp = 1418035791555, value = raju row1 column = personal:city, timestamp = 1418274645907, value = Delhi row1 column = professional:designation, timestamp = 141857555,value = manager row1 column = professional:salary, timestamp = 1418039555, value = 50000 1 row(s) in 0.0100 seconds Updating Data Using Java API You can update the data in a particular cell using the put () method. Follow the steps given below to update an existing cell value of a table. Step 1: Instantiate the Configuration Class Configuration class adds HBase configuration files to its object. You can create a configuration object using the create () method of the HbaseConfiguration class as shown below. Configuration conf = HbaseConfiguration.create(); Step 2: Instantiate the HTable Class You have a class called HTable, an implementation of Table in HBase. This class is used to communicate with a single HBase table. While instantiating this class, it accepts the onfiguration object and the table name as parameters. You can instantiate the HTable class as shown below. HTable hTable = new HTable(conf, tableName); Step 3: Instantiate the Put Class To insert data into HBase Table, the add () method and its variants are used. This method belongs to Put, therefore instantiate the put class. This class requires the row name you want to insert the data into, in string format. You can instantiate the Put class as shown below. Put p = new Put (Bytes.toBytes("row1")); Step 4: Update an Existing Cell The add () method of Put class is used to insert data. It requires 3-byte arrays representing column family, column qualifier (column name), and the value to be inserted, respectively. Insert data into HBase table using the add () method as shown below. p.add(Bytes.toBytes("coloumn family "), Bytes.toBytes("column 155 Downloaded from EnggTree.com EnggTree.com name"),Bytes.toBytes("value")); p.add(Bytes.toBytes("personal"), Bytes.toBytes("city"),Bytes.toBytes("Delih")); Step 5: Save the Data in Table After inserting the required rows, save the changes by adding the put instance to the put () method of the HTable class as shown below. hTable.put(p); Step 6: Close HTable Instance After creating data in HBase Table, close the HTable instance using the close () method as shown below. hTable.close(); Given below is the complete program to update data in a particular table. import java.io.IOException; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hbase.HBaseConfiguration; import org.apache.hadoop.hbase.client.HTable; import org.apache.hadoop.hbase.client.Put; import org.apache.hadoop.hbase.util.Bytes; public class UpdateData{ public static void main (String [] args) throws IOException { // Instantiating Configuration class Configuration config = HBaseConfiguration.create(); // Instantiating HTable class HTable hTable = new HTable(config, "emp"); // Instantiating Put class //accepts a row name Put p = new Put (Bytes.toBytes("row1")); // Updating a cell value p.add(Bytes.toBytes("personal"), 156 Downloaded from EnggTree.com EnggTree.com Bytes.toBytes("city"), Bytes.toBytes("Delih")); // Saving the put Instance to the HTable. hTable.put(p); System.out.println("data Updated"); // closing HTable hTable.close(); } } Compile and execute the above program as shown below. $javac UpdateData.java $java UpdateData The following should be the output: data Updated 3. Reading Data using HBase Shell The get commands and the get () method of HTable class are used to read data from a table in HBase. Using get command, you can get a single row of data at a time. Its syntax is as follows: get’’,’row1’ Example The following example shows how to use the get command. Let us scan the first row of the emp table. hbase(main): 012:0> get 'emp', '1' COLUMN CELL personal: city timestamp = 1417521848375, value = hyderabad personal: name timestamp = 1417521785385, value = ramu professional: designation timestamp = 1417521885277, value = manager professional: salary timestamp = 1417521903862, value = 50000 4 row(s) in 0.0270 seconds Reading a Specific Column 157 Downloaded from EnggTree.com EnggTree.com Given below is the syntax to read a specific column using the get method. hbase> get 'table name', ‘rowid’, {COLUMN ⇒ ‘column family:column name ’} Example Given below is the example to read a specific column in HBase table. hbase(main): 015:0> get 'emp', 'row1', {COLUMN ⇒ 'personal:name'} COLUMN CELL personal:name timestamp = 1418035791555, value = raju 1 row(s) in 0.0080 seconds Reading Data Using Java API To read data from an HBase table, use the get () method of the HTable class. This method requires an instance of the Get class. Follow the steps given below to retrieve data from the HBase table. Step 1: Instantiate the Configuration Class Configuration class adds HBase configuration files to its object. You can create a configuration object using the create () method of the HbaseConfiguration class as shown below. Configuration conf = HbaseConfiguration.create(); Step 2: Instantiate the HTable Class You have a class called HTable, an implementation of Table in HBase. This class is used to communicate with a single HBase table. While instantiating this class, it accepts the configuration object and the table name as parameters. You can instantiate the HTable class as shown below. HTable hTable = new HTable(conf, tableName); Step 3: Instantiate the Get Class You can retrieve data from the HBase table using the get () method of the HTable class. This method extracts a cell from a given row. It requires a Get class object as parameter. Create it as shown below. Get get = new Get(toBytes("row1")); Step 4: Read the Data While retrieving data, you can get a single row by id, or get a set of rows by a set of row ids, or scan an entire table or a subset of rows. 158 Downloaded from EnggTree.com EnggTree.com You can retrieve an HBase table data using the add method variants in Get class. To get a specific column from a specific column family, use the following method. get.addFamily(personal) To get all the columns from a specific column family, use the following method. get.addColumn(personal, name) Step 5: Get the Result Get the result by passing your Get class instance to the get method of the HTable class. This method returns the Result class object, which holds the requested result. Given below is the usage of get () method. Result result = table.get(g); Step 6: Reading Values from the Result Instance The Result class provides the getValue() method to read the values from its instance. Use it as shown below to read the values from the Result instance. byte [] value = result. getValue(Bytes.toBytes("personal"),Bytes.toBytes("name")); byte [] value1 = result. getValue(Bytes.toBytes("personal"),Bytes.toBytes("city")); Given below is the complete program to read values from an HBase table. import java.io.IOException; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hbase.HBaseConfiguration; import org.apache.hadoop.hbase.client.Get; import org.apache.hadoop.hbase.client.HTable; import org.apache.hadoop.hbase.client.Result; import org.apache.hadoop.hbase.util.Bytes; public class RetriveData{ public static void main (String [] args) throws IOException, Exception { // Instantiating Configuration class Configuration config = HBaseConfiguration.create(); // Instantiating HTable class HTable table = new HTable(config, "emp"); 159 Downloaded from EnggTree.com EnggTree.com // Instantiating Get class Get g = new Get (Bytes.toBytes("row1")); // Reading the data Result result = table.get(g); // Reading values from Result class object byte [] value = result. getValue(Bytes.toBytes("personal"),Bytes.toBytes("name")); byte [] value1 = result. getValue(Bytes.toBytes("personal"),Bytes.toBytes("city")); // Printing the values String name = Bytes.toString(value); String city = Bytes.toString(value1); System.out.println("name: " + name + " city: " + city); } } Compile and execute the above program as shown below. $javac RetriveData.java $java RetriveData The following should be the output: name: Raju city: Delhi Deleting a Specific Cell in a Table Using the delete command, you can delete a specific cell in a table. The syntax of delete command is as follows: delete ‘’, ‘’, ‘’, ‘’ Example Here is an example to delete a specific cell. Here we are deleting the salary. hbase(main): 006:0> delete 'emp', '1', 'personal data:city', 1417521848375 0 row(s) in 0.0060 seconds Deleting All Cells in a Table Using the “deleteall” command, you can delete all the cells in a row. Given below is the 160 Downloaded from EnggTree.com EnggTree.com syntax of deleteall command. deleteall ‘’, ‘’, Example Here is an example of “deleteall” command, where we are deleting all the cells of row1 of emp table. hbase(main): 007:0> deleteall 'emp','1' 0 row(s) in 0.0240 seconds Verify the table using the scan command. A snapshot of the table after deleting the table is given below. hbase(main): 022:0> scan 'emp' ROW COLUMN + CELL 2 column = personal data:city, timestamp = 1417524574905, value = chennai 2 column = personal data:name, timestamp = 1417524556125, value = ravi 2 column = professional data:designation, timestamp = 1417524204, value = sr:engg 2 column = professional data:salary, timestamp = 1417524604221, value = 30000 3 column = personal data:city, timestamp = 1417524681780, value = delhi 3 column = personal data:name, timestamp = 1417524672067, value = rajesh 3 column = professional data:designation, timestamp = 1417523187, value = jr:engg 3 column = professional data:salary, timestamp = 1417524702514, value = 25000 4. Deleting Data Using Java API You can delete data from an HBase table using the delete () method of the HTable class. Follow the steps given below to delete data from a table. Step 1: Instantiate the Configuration Class Configuration class adds HBase configuration files to its object. You can create a configuration object using the create () method of the the HbaseConfiguration class as shown below. Configuration conf = HbaseConfiguration.create(); Step 2: Instantiate the HTable Class You have a class called HTable, an implementation of Table in HBase. This class is 161 Downloaded from EnggTree.com EnggTree.com used to communicate with a single HBase table. While instantiating this class, it accepts the configuration object and the table name as parameters. You can instantiate the HTable class as shown below. HTable hTable = new HTable(conf, tableName); Step 3: Instantiate the Delete Class Instantiate the Delete class by passing the rowid of the row that is to be deleted, in byte array format. You can also pass timestamp and Rowlock to this constructor. Delete delete = new Delete(toBytes("row1")); Step 4: Select the Data to be Deleted You can delete the data using the delete methods of the Delete class. This class has various delete methods. Choose the columns or column families to be deleted using those methods. Take a look at the following examples that show the usage of Delete class methods. delete.deleteColumn(Bytes.toBytes("personal"), Bytes.toBytes("name")); delete.deleteFamily(Bytes.toBytes("professional")); Step 5: Delete the Data Delete the selected data by passing the delete instance to the delete () method of the HTable class as shown below. table.delete(delete); Step 6: Close the HTableInstance After deleting the data, close the HTable Instance. table.close(); Given below is the complete program to delete data from the HBase table. import java.io.IOException; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hbase.HBaseConfiguration; import org.apache.hadoop.hbase.client.Delete; import org.apache.hadoop.hbase.client.HTable; import org.apache.hadoop.hbase.util.Bytes; public class DeleteData { 162 Downloaded from EnggTree.com EnggTree.com public static void main (String [] args) throws IOException { // Instantiating Configuration class Configuration conf = HBaseConfiguration.create(); // Instantiating HTable class HTable table = new HTable(conf, "employee"); // Instantiating Delete class Delete delete = new Delete (Bytes.toBytes("row1")); delete.deleteColumn(Bytes.toBytes("personal"), Bytes.toBytes("name")); delete.deleteFamily(Bytes.toBytes("professional")); // deleting the data table.delete(delete); // closing the HTable object table.close(); System.out.println("data deleted......"); } } Compile and execute the above program as shown below. $javac Deletedata.java $java DeleteData The following should be the output:data deleted 163 Downloaded from EnggTree.com