Anda di halaman 1dari 19

Database Testing SQL Commands

In order to do complete justice to the Database testing, it is equally important to know the various SQL commands. You will find below the commonly used SQL commands that could be put to use to test the concerned database. BASICS SELECT clause SELECT col_name FROM table_name E.g. SELECT store_name FROM store_information DISTINCT clause SELECT DISTINCT column_name FROM table_name E.g. SELECT DISTINCT store_name FROM store_information WHERE clause SELECT col_name FROM table_name WHERE condition E.g. SELECT store_name FROM store_information WHERE sales > 1000 AND/ OR conditions SELECT store_name FROM store_information WHERE sales > 1000 OR (Sales < 600 AND Sales > 275) Using IN SELECT * FROM store_information WHERE store_name IN (Delhi, Mumbai) Using BETWEEN

SELECT * FROM store_information WHERE sales_date BETWEEN Jan-06-1999 AND Jan-10-1999 Using LIKE patterns Typical patterns used with LIKE are - A_Z - ABC% - %MUM - %AN% SELECT * FROM store_information WHERE store_name LIKE %AN% Will display all those store_names with two consecutive characters as AN Something like Bangalore Mangalore ORDER BY clause SELECT store_name, sales, Sales_date FROM store_information ORDER BY sales DESC Will display the store_name, sales and sales_date columns in the descending order of the sales value. Aggregate functions These functions act on a numeric data to generate aggregation. The aggregate functions are - AVG - COUNT - MAX - MIN - SUM SUM SELECT SUM(sales) FROM store_information Will generate the sum of the entire sales value within the sales_information table.

COUNT SELECT COUNT(store_name) FROM store_information Will count and display the total number of store names from the store_information table. Using GROUP BY clause SELECT store_name, SUM(sales) FROM store_information GROUP BY store_name Will display store_name and the sum against each store_name. Here you will find that basic query column is the store_name and the associated aggregate function is the SUM of the sales value. Using HAVING clause Acts on the Grouped by details to induce a condition on the aggregate. SELECT store_name, SUM(sales) FROM store_information GROUP BY store_name HAVING SUM(sales) > 1500 Will first get the aggregate SUM of sales value against each store_name. The HAVING will exercise the additional condition of the SUM being greater than 1500. Only these records will be displayed. Using JOINS SELECT A1.region_name REGION, SUM(A2.Sales) SALES FROM Geography A1, Store_information A2 WHERE A1.Store_name = A2.store_name GROUP BY A1.region_name Will match the column store_name from tables Geography and Store_information and display the output grouped by REGION. OUTER JOIN SELECT A1.region_name REGION, SUM(A2.Sales) SALES FROM Geography A1, Store_information A2 WHERE A1.Store_name = A2.store_name (+) GROUP BY A1.store_name

Will display output even if there are no matches in the second table. Other way of writing the same is SELECT A1.region_name REGION, SUM(A2.Sales) SALES FROM Geography A1 LEFT OUTER JOIN Store_Information A2 ON A1.store_name = A2.store_name GROUP BY A1.store_name Using UNION SELECT sales_date FROM store_information UNION SELECT S_date FROM Internet_Sales Will display the sales_date and S_date from the respective tables and display the combined output. Using INTERSECT SELECT sales_date FROM store_information INTERSECT SELECT S_date FROM Internet_Sales Will display only the common dates from each of the table. Using MINUS SELECT sales_date FROM store_information MINUS SELECT S_date FROM Internet_Sales Will display the Date from Store_information but not available in Internet_sales Using CONCATENATION MySQL/ Oracle SELECT CONCAT(region_name, store_name) FROM geography WHERE store_name = Mumbai Will combine the column values for region_name with store_name

E.g. If region_name is West and store_name is Mumbai Then result will be WestMumbai In Oracle SELECT region_name || || store_name FROM georgraphy WHERE store_name = Mumbai In SQL Server SELECT region_name + + store_name FROM geography WHERE store_name = Mumbai Using SUBSTR SELECT SUBSTR(store_name,3) FROM geography WHERE store_name = Mumbai Will return the value mbai Trim Functions - TRIM - LTRIM - RTRIM INNER JOIN SELECT emp.Name, Orders.Product FROM emp INNER JOIN Orders ON emp.emp_id = Orders.emp_id Will display the matching records for emp_id in emp table with that of the Orders table. i.e. all those employees who have placed an order will appear in the output. LEFT JOIN SELECT emp.Name, Orders.Product FROM emp LEFT JOIN Orders ON emp.emp_id = Orders.emp_id Will display records from the emp table irrespective of whether it has a match in the Orders table.

RIGHT JOIN SELECT emp.Name, Orders.Product FROM emp RIGHT JOIN Orders ON emp.emp_id = Orders.emp_id Will display records from the Orders table irrespective of the match available in the emp table. SELF JOIN SELECT last_name, first_name FROM employees WHERE city IN (SELECT city FROM employees WHERE last_name = Gopi AND first_name = Abhilash) This query writing with a subquery will be simplified by using a SELF JOIN. SELECT e1.last_name, e1.first_name FROM employees e1, employees e2 WHERE e1.city = e2.city AND e2.last_name = Gopi AND e2.first_name = Abhilash SUB QUERY A SQL statement embedded within another SQL query by the means of the WHERE or the HAVING statement is called a subquery. Syntax is SELECT column_name1 FROM table_name WHERE column_name2 [COMPARISON OPERATOR] (SELECT column_name1 FROM table_name WHERE [Condition]) E.g. SELECT SUM(Sales) FROM Store_Information WHERE store_name IN (SELECT store_name FROM geography WHERE region_name = West)

INSERT clause - This statement is used to insert a row of data into a table. - All inserted values are enclosed using single quote strings. - Syntax is INSERT INTO table_name (col1, col2, colx) Values (value1, value2, valuex) E.g. INSERT INTO citylist (name, state, population, zipcode) Values (Abhilash,Kerala,11400000,682-020) Note that the values for name, state, and zipcode are strings and hence included in single quotes. Numeric value does not have a single quote. UPDATE clause - Used to update the values in a table by using a Key. E.g. UPDATE employee SET Status = 0 WHERE emp_id = 111 It will set the status of the employee to 0 where the emp_id is 111 Hope these basic set of SQL commands will enable you to effectively test a database. In case of further details or special case, please do let me know. Thank you.

Software Testing Data Flow Testing


Data Flow Testing Data Flow testing targets the lifecycle of a particular piece of data (i.e. a variable) in an application. The technique to implement an effective Data Flow testing is to look for patterns of data usage based on which risky areas of the code can be found and more test cases can be applied. Lets see some basics as regards DATA. # DATA is used in 4 ways (D)efined, (U)sed in a predicate, (U)sed in a calculation, and (K)illed. E.g. Usage of a piece of data in a calculation after it has been killed, show an anomaly in the code; Therefore a bug. # Most Failures involve Execution of an incorrect definition Incorrect assignment or input statement Predicate is faulty Definition is missing Explore the sequences of events related to the data state and the unreasonable things that can happen to data. Explore the effect of using the value produced by each and every computation. Considerable effort needs to be given to data declaration statements, since most of the code contains nearly half of the contents as data declaration. Some examples are given below Sum = expression Read (var1, var2, var3) Write (total1, total2, total3) Method call : CompareValue(val1, val2) While : while Boolean do statements For statement : for (val1 = init to final) If then else : If Boolean then S1;

if Boolean then S1 else S2. case : Case var1: Statement 1: Statement 2: # Use the techniques given below for an effective data flow test. DD -bug DK -bug DU normal KD normal KK harmless, but a bug KU bug UD normal UK normal UU -normal where D Define / U Use / K Kill

Software Testing Database Testing


Database Testing? It is done as part of a detailed test activity for the Database and its interaction with the Application. The testing is done to ensure that right data is being input through the User Interface, and the right data is being retrieved from the database and displayed in the relevant reports. Essence of Database Testing You have a SQL database, some stored procedures, and a layer of code sitting between your application and the database. How can you put tests in place to make sure your code really is reading and writing the right data from the database? An Approach # Some of us may do a test as follows set up database, write code to access database, run code, do a SELECT to see if the records showed up in the database. If the query shows up some records in the database, they are done.

Hmmm, Well, Does this really scrutinize the database operations and verify the same in detail. Absolutely NOT. Visual Inspection in DB testing What has been done right now is a simple VISUAL INSPECTION. The problem with visual inspection is that the tester/ developer does not do it often. Also it may be possible that when the tester/ developer makes changes to a system, and maybe months later, the code is broken & some data will go missing. Most of us may not spend much time checking the data itself, so it may take a while for this mistake to surface. How do you start? In order to go about a Database Test, it is advisable to have a good test set. What is a good test set? A good test set is self-sufficient and creates all the data it needs. Testing can be simplified if you can get the database in a known state before a test is run. One way to do this is to have a separate unit test database The Test cases should be designed in such a way that the tester/ developer can start with the database in the initial state and then proceed to test from that state onwards. Eg. test cases to clean out the database before starting any tests. An Approach One of the things to be taken care of while testing a database is to have more than one database. These database can be run on one server if necessary, or synonyms can be created for efficient testing. Types of Databases. # Types of databases (required for testing) 1 The production database. Contains Live data. No testing is done on this database. 2 Your local development database. Most which is the testing is carried out on this database. 3 A populated development database. A database shared by all developers so that they can run the application and see it work with realistic amounts of data, rather than the hand full of records the testers have in their test database. This may not be a mandatory database, but is used by most Projects to ensure that the application code work with lots of data (i.e., a copy of the production databases data).

4. A deployment database. Also called and Integration Database, where the tests are run prior to deployment to make sure that local database changes have been applied. Also ensure that any database structure or stored procedure changes have been made to the production database before you go live with your code. Concepts of a Database The most simplest element of a database is a cell. The cell contents represent an attribute, when considered on a broad perspective. Collection of related attributes is called a ROW. Collection of different attributes values, and representing an attribute is called as a column. Collection of rows is called a table. Collection of tables is called a DATABASE. Database Normalization Normalization is the process of removing redundant data from your tables in order to improve storage efficiency, data integrity and scalability. This improvement is balanced against an increase in complexity and potential performance losses from the joining of the normalized tables at query-time. Database Normalization # Consider a Bookshop storing the following information Title Author Author Biography ISBN Price Subject Number of Pages Publisher Description Review Reviewer Name Database Normalization Suppose it contains records pertaining to 2 authors, and we assume that each name is 20 characters long i.e. 20 bytes; then it is found that it will contain 20 x 10 = 200 byes for storing 10 records of each author. It will also fail DATA INTEGRITY, since to repeat their names, somebody might make a mistake. Database Normalization

Imagine this scenario occurs for only 2 authors. What if, we had few 1000 authors. Hence Normalization is necessary. What is Normalization Normalization is the process of removing redundant data from your tables in order to improve storage efficiency, data integrity and scalability. Normal Forms We can have 5 normal forms. The 5 normal forms are rarely implemented. All Databases are normalized only up to the 3rd Normal Form. The drawback of further Normalization is the generation / creation of complex query structure.

First Normal Form

Remove redundant data from the horizontal rows. This ensures that there is no duplication of data in a given row, and every column stores the least amount of information possible (i.e. atomic data). First Normal Form The Book Information given below is organized as First Normal Form

Relationship Once this breakup is achieved, Now it is time to create a relationship. Relationship exists between two tables - One to One - One to Many - Many to Many Relationship The relationship between the Book table and the Author table is a many-to-many relationship: A book can have more than one author, and an author can write more than one book. To represent a many-to-many relationship in a relational database we need a third table to serve as a link between the two: First Normal Form -Relationship Second Normal Form

Where the First Normal form deals with redundancy of data across a horizontal row, Second Normal Form (2NF) deals with redundancy of data in vertical columns.

Third Normal Form Identify and separate data that is not fully dependent on the primary key, but dependent on another value in the table. Eg. In the REVIEWER table, the Reviewer_Street and Reviewer_City fields are really dependent on the Reviewer_PostalCode and not on the reviewer_ID. Hence we may do the following.

Fourth Normal Form Where non-key attributes depend on the key column exclusive of other non-key columns are eliminated. Fifth Normal Form 5NF indicates when an entity cannot be further decomposed. 5NF is complex and not intuitive. In simple words the 5NF says After a table has been decomposed into 3 or smaller tables, must be capable of being joined again on common keys to form the original table. Benefits of Normalization Greater overall database organization Reduction of redundant data Data consistency within the database A much more flexible database design A better handle on database security Drawbacks of Normalization reduced database performance. Impact on CPU usage, memory usage, and input/output (I/O). User related Database Design considerations What data should be stored in the database? How will the user access the database?

What privileges does the user require? How should the data be grouped in the database? What data is the most commonly accessed? How is all data related in the database? What measures should be taken to ensure accurate data? SUB QUERY It is possible to embed a SQL statement within another. When this is done on the WHERE or the HAVING statements, we have a subquery construct. What is subquery useful for? It can also be used to join tables. Also, there are cases where the only way to correlate two tables is through a subquery. The syntax is as follows: SELECT column_name1 FROM table_name WHERE column_name2 [Comparison Operator] (SELECT column_name1 FROM table_name WHERE [Condition]) INSERT clause This statement is used to insert a row of data in a table. All inserted values are enclosed using single quote strings. The syntax of this command is: insert into tablename (column1name,column2namecolumnxname) values (value1,value2valuex); Example insert into citylist (name, state, population, zipcode) values (Abhi, Kerala, 11400000, 682 020); Advanced Concepts of Normalization Database Dependency can be put forth with respect to two aspects, (a) Functional and (b) Multivalued dependencies. These express the presence of structure in database relations, that can be utilized in the database design process. The discovery of database dependencies can be viewed as an induction problem, in which general rules (dependencies) are obtained from specific facts (relation). Functional Dependency

# Functional dependency Why is one relational schema better than the other? Informal design guidelines Semantics of the attributes : design a relational schema so that it is easy to explain its meaning. Reduce redundant values in tuples: Minimize storage space Reduce the null value in tuples: Nulls should apply in exceptional cases only. Avoid generating spurious tuples. Design relations that can be joined with equality conditions on attributes that are primary or foreign keys Functional Dependency Definition (Functional Dependency) : A functional dependency, denoted by X-> Y between two sets of attributes X and Y that are subsets of the attributes of relation R, specifies that the values in a tuple corresponding to the attributes in Y are uniquely determined by the values corresponding to the attributes in X. Eg. The Emp Number uniquely determines a name EMPID -> Name Functional Dependency Functional dependencies are determined by the semantics of the relation, in general they cannot be determined by inspection of an instance of the relation. That is, a functional dependency is a constraint and not a property derived from a relation. KEYS # Keys We say that a set of one or more attributes (A_1,., A_n) is a key for a relation R if Those attributes functionally determine all other attributes of the relation No proper subset of those attributes functionally determines all other attributes of R . i.e. a key is minimal. A set of attributes that contains a key is called a superkey. Thus every key is a superkey but not every superkey is minimal. If a relation has more than one key, one of the keys is designated as the primary key. Database Designing (Advanced Concepts) The Schema Lending-schema = (bname, assets, bcity, loan#, cname, amount) Is a bad design The set of functional dependencies we required to hold on this schema is Bname -> assets, bcity Loan# -> amount, bname Database Designing (Advanced Concepts)

if we decompose it into Branch-schema = (bname, assets, bcity) Loan-info-schema = (bname, loan#, amount) Borrow-schema = (cname, loan#) We can say that this decomposition has several desirable properties. Lossless join decomposition Dependency preservation Database Designing (Advanced Concepts) First we decompose Lending-schema into Branch-schema = (bname, bcity, assets) Loan-info-schema = (bname, cname, loan#, amount) Since bname assets bcity, the augmentation rule for functional dependencies implies that bname bname assets bcity Since Branch-schema Borrow-schema = bname, our decomposition is lossless join. Database Designing (Advanced Concepts) Next we decompose Borrow-schema into Loan-schema = (bname, loan#, amount) Borrow-schema = (cname, loan#) As loan# is the common attribute, and loan# amount bname This is also a lossless-join decomposition. Comparison BCNF and 3NF Comparison of BCNF and 3NF We have seen BCNF and 3NF. It is always possible to obtain a 3NF design without sacrificing lossless-join or dependencypreservation. If we do not eliminate all transitive dependencies, we may need to use null values to represent some of the meaningful relationships. Repetition of information occurs. These problems can be illustrated with Banker-schema. As banker-name bname , we may want to express relationships between a banker and his or her branch. The table above shows how we must either have a corresponding value for customer name, or include a null. Repetition of information also occurs. Every occurrence of the bankers name must be accompanied by the branch name.

If we must choose between BCNF and dependency preservation, it is generally better to opt for 3NF. If we cannot check for dependency preservation efficiently, we either pay a high price in system performance or risk the integrity of the data. The limited amount of redundancy in 3NF is then a lesser evil. # To summarize, our goal for a relational database design is BCNF. Lossless-join. Dependency-preservation. # If we cannot achieve this, we accept 3NF Lossless-join. Dependency-preservation. Multivalued Dependency When designing a database using the E-R modelling technique, a common problem that arises is that of an entity having multivalued attributes. The multivalued dependency relates to this problem when more than one multivalued attributes exist. Multivalued Dependency Consider the following relation that represents an entity employee that has one mutlivalued attribute proj: emp (e#, dept, salary, proj) For example, e# -> dept implies only one dept value for each value of e#. Multivalued Dependency Not all information in a database is single-valued. Eg: proj in an employee relation may be the list of all projects that the employee is currently working on. Although e# determines the list of all projects that an employee is working on, e# -> proj is not a functional dependency.

Multivalued Dependency The fourth and fifth normal forms deal with multivalued dependencies. Before discussing the 4NF and 5NF we discuss the following example to illustrate the concept of multivalued dependency programmer (emp_name, qualifications, languages) Multivalued Dependency

The relation includes two multivalued attributes of entity programmer; qualifications and languages. Multivalued Dependency The above relation is therefore in 3NF (even in BCNF) but it still has some disadvantages. Suppose a programmer has several qualifications (B.Sc, Dip. Comp. Sc, etc) and is proficient in several programming languages; how should this information be represented? Multivalued Dependency Note that there is no relationship between qualifications and programming languages. All these variations have some disadvantages. If the information is repeated we face the same problems of repeated information and anomalies as we did when second or third normal form conditions are violated. If there is no repetition, there are still some difficulties with search, insertions and deletions. Multivalued Dependency The basis of the above decomposition is the concept of multivalued dependency (MVD). Functional dependency A ->B relates one value of A to one value of B while multivalued dependency A ->> B defines a relationship in which a set of values of attribute B are determined by a single value of A.

Anda mungkin juga menyukai