
Database Basics

Objectives of Database Its advantages and application in Corporate


Shareability:
An ability to share data resources is a fundamental objective of database management.

This means different people and different processes can use the same actual data at the same time. It involves:
- Serving different types of users with varying skill levels
- Handling different user views of the same stored data
- Combining interrelated data
- Controlling concurrent updates so as to maintain data integrity

Database Basics
Evolvability
Evolvability refers to the ability of the DBMS to change in response to growing user needs and advancing technology. Evolvability is the system characteristic that enhances future availability of the data resources. Evolvability is not the same as expandability or extensibility, which imply extending or adding to the system, which then grows ever larger. Evolvability covers expansion or contraction, both of which may occur as the system changes to fit the ever changing needs and desires of the using environment.

Database Basics

Integrity
The importance and pervasiveness of the need to maintain database integrity is rooted in the reality that man is imperfect. Destruction, errors and improper disclosure must be anticipated and explicit mechanisms provided for handling them. The three primary facets of database integrity are:
- Protecting the existence of the database
- Maintaining the quality of the database
- Ensuring the privacy of the database

DBMS
Data Dictionary & Metadata

- A database contains information about entities of interest to users in an organization.
- When created, the database itself becomes an entity about which information must be kept for various data administration purposes.
- The data dictionary (DD), or system catalog, is a database about the database.
- The contents of a DD are commonly referred to as metadata.
- The DD can be updated and queried much like a regular database.
- The DBMS often maintains the DD.
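As a small illustration of "a database about the database", the sketch below queries a system catalog. It assumes SQLite (through Python's standard `sqlite3` module) rather than any DBMS named in these slides, and the `patients` table is a hypothetical example. SQLite keeps its catalog in the built-in `sqlite_master` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical user table, created only so the catalog has something to describe.
conn.execute("CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, lname TEXT NOT NULL)")

# The catalog is itself a table: ask it which user tables exist.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(patients)")]

print(tables)
print(columns)
```

Other systems expose the same idea under different names, so the catalog table and pragma used here should be read as SQLite specifics.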

DBMS
Benefits of Data Dictionary
Benefits include:
- improved documentation and control
- consistency in data use
- easier data analysis
- reduced data redundancy
- simpler programming
- the enforcement of standards
- better means of estimating the effect of change

DBMS
Metadata
Metadata: data that describes the properties and context of user data, including data types, field sizes, allowable values, and data context. It is stored as part of the database, but separate from the user data.

DBMS

Data Independence
With knowledge of the three-schema architecture, the term data independence can be explained as follows: each higher level of the data architecture is immune to changes in the next lower level.

Physical Independence: The logical schema may stay unchanged even though the storage space or type of some data is changed for reasons of optimisation or reorganisation.

Logical Independence: The external schema may stay unchanged for most changes of the logical schema. This is especially desirable because the application software then does not need to be modified or recompiled.
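Logical independence can be sketched with a view standing in for the external schema. The example assumes SQLite via Python's `sqlite3` module; the `person`, `phone` and `contact` names are hypothetical. The logical schema is restructured, the view definition (the mapping) is rewritten, and the application's query is left untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Original logical schema: one table. The view plays the role of an external schema.
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, phone TEXT);
    INSERT INTO person VALUES (1, 'Ann', '555-0100');
    CREATE VIEW contact AS SELECT id, name, phone FROM person;
""")

APP_QUERY = "SELECT name, phone FROM contact ORDER BY id"  # what the application runs
before = conn.execute(APP_QUERY).fetchall()

# Logical schema change: phone numbers move to their own table.
# Only the view definition is rewritten; APP_QUERY stays exactly as it was.
conn.executescript("""
    DROP VIEW contact;
    CREATE TABLE phone (person_id INTEGER, number TEXT);
    INSERT INTO phone SELECT id, phone FROM person;
    CREATE TABLE person_new (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO person_new SELECT id, name FROM person;
    DROP TABLE person;
    ALTER TABLE person_new RENAME TO person;
    CREATE VIEW contact AS
        SELECT p.id, p.name, ph.number AS phone
        FROM person AS p JOIN phone AS ph ON ph.person_id = p.id;
""")
after = conn.execute(APP_QUERY).fetchall()

print(before == after)
```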

Distributed Database
Types of Distributed Database System
Distributed database systems are of the following types:
- Homogeneous Distributed Database System
- Heterogeneous Distributed Database System

Distributed Database
Homogeneous Distributed Database System
- All sites have identical software.
- They are aware of each other and agree to cooperate in processing user requests.
- It appears to the user as a single system.

A Homogeneous Distributed Database System Example

- A distributed system connects three databases: hq, mfg, and sales.
- An application can simultaneously access or modify the data in several databases in a single distributed environment.

Distributed Database
Heterogeneous Distributed Database System
In a heterogeneous distributed database system, at least one of the databases uses different schemas and software.
- A database system having a different schema may cause a major problem for query processing.
- A database system having different software may cause a major problem for transaction processing.

Distributed Database
Features of Distributed Database System
- Replication: the system maintains multiple copies of data, stored at different sites, for faster retrieval and fault tolerance.
- Fragmentation: a relation is partitioned into several fragments stored at distinct sites.
- Replication and fragmentation can be combined: a relation is partitioned into several fragments, and the system maintains several identical replicas of each fragment.
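The two features can be sketched in plain Python. This is only an in-memory illustration under made-up data: the relation `r`, its columns, and the placement of fragments on sites are assumptions. It uses horizontal fragmentation (splitting by rows), which is one way of partitioning a relation.

```python
from collections import defaultdict

# A hypothetical relation r, held as a list of rows.
r = [
    {"id": 1, "region": "east", "amount": 10},
    {"id": 2, "region": "west", "amount": 20},
    {"id": 3, "region": "east", "amount": 30},
]

def fragment_by(rows, key):
    """Horizontal fragmentation: split rows into disjoint fragments by a column value."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[key]].append(row)
    return dict(fragments)

def replicate(fragments, placement):
    """Replication: store an identical copy of each fragment at every listed site."""
    sites = defaultdict(dict)
    for name, site_names in placement.items():
        for site in site_names:
            sites[site][name] = [dict(row) for row in fragments[name]]
    return dict(sites)

fragments = fragment_by(r, "region")
# The placement of fragments on the sites hq, mfg and sales is made up.
sites = replicate(fragments, {"east": ["hq", "sales"], "west": ["hq", "mfg"]})

# The original relation is the union of its fragments.
reconstructed = sorted(
    (row for frag in fragments.values() for row in frag), key=lambda row: row["id"])
print(reconstructed == r)
print(sorted(sites))
```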

Distributed Database
Advantages of Distributed Database System
- Availability: failure of a site containing relation r does not make r unavailable if replicas exist.
- Parallelism: queries on r may be processed by several nodes in parallel.
- Reduced data transfer: relation r is available locally at each site containing a replica of r.

Distributed Database
Disadvantages of Distributed Database System
- Increased cost of updates: each replica of relation r must be updated.
- Increased complexity of concurrency control: concurrent updates to distinct replicas may lead to inconsistent data unless special concurrency control mechanisms are implemented.
- One solution: choose one copy as the primary copy and apply concurrency control operations on the primary copy.
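The primary-copy idea can be sketched as a single-process simulation. The class and the `stock` item below are hypothetical, there is no network or failure handling, and a thread lock stands in for whatever concurrency control a real system would apply at the primary site.

```python
import threading

class PrimaryCopyItem:
    """One replicated data item; the first site listed holds the primary copy."""

    def __init__(self, sites, value):
        self.primary = sites[0]
        self.copies = {site: value for site in sites}
        self._lock = threading.Lock()  # the lock lives only at the primary site

    def update(self, fn):
        # Every writer must lock the primary copy, so updates are serialized there.
        with self._lock:
            new_value = fn(self.copies[self.primary])
            # Propagate to every replica: this is the increased cost of updates.
            for site in self.copies:
                self.copies[site] = new_value
            return new_value

stock = PrimaryCopyItem(["hq", "mfg", "sales"], 100)
threads = [threading.Thread(target=stock.update, args=(lambda v: v - 1,))
           for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(stock.copies)
```

Because all writers pass through one lock, no decrement is lost and all replicas end with the same value.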

Data Warehousing & Data Mining


Benefits of Data Mining
Data mining in customer relationship management applications can contribute significantly to the bottom line. Rather than randomly contacting a prospect or customer through a call center or sending mail, a company can concentrate its efforts on prospects that are predicted to have a high likelihood of responding to an offer. More sophisticated methods may be used to optimize resources across campaigns so that one may predict to which channel and to which offer an individual is most likely to respond across all potential offers. Businesses employing data mining may see a return on investment. Data mining can also be helpful to human-resources departments in identifying the characteristics of their most successful employees.

Introduction to MS Access

Why choose MS-Access over SPSS / Excel?


Although there is always overlap, the following rules might help when deciding when / when not to use MS Access:
- MS Access is best used for long-term data storage and/or data sharing.
- MS Excel is best used for minor data collection, manipulation, and especially visualization.
- SPSS is best used for minor data collection and especially data analysis.
- It is easy to export data from MS Access to Excel or SPSS.

What is in an MS-Access file - 1?


Although the term database typically refers to a collection of related data tables, an Access database includes more than just data. In addition to tables, you can add:
- Saved queries (stored procedures): organizing and/or manipulating data
- Forms: GUI interaction with data, event programming
- Reports: customized results for printing (~ static forms)
- Macros and VB programs: for extending functionality

Microsoft provides some logical integration of these tools through wizards. However, these are pretty basic; most developers must pick and choose the best approach when implementing applications.

What is in an MS-Access file - 2?


Unless advanced techniques are employed, all entities are stored in one *.mdb file. When running, a locking file (*.ldb) is also visible. Only the mdb file needs to be copied to transfer the database to another computer or location. Ex. MSCI_ByrneGuestLecture.mdb

What is in an MS-Access file - 3?


VB + Macros: event-driven automation, etc.
Forms (active)
Reports (static)
Queries
Tables: Demographics, Ethnicity, Labs, H&P

Microsoft Access Module 1 Summary


MS-Access is a powerful relational database program. It has many integrated features and can be greatly customized to fit most personal/departmental needs for data collection and storage.

Microsoft Access Module 2

Creating / Working with Tables

Tables Glucose Measurement Database

We wish to construct a database to track waking glucose measurements for an indefinite amount of time on 100 patients receiving 3 possible drug combinations.

Why would this be difficult in MS-Excel or SPSS?

Tables Overview
- Think of Access as a collection of spreadsheets that are relationally linked.
- Store data one time, in one place.
- Do not store calculated data.

Demographics: Patient_ID, Fname, Lname, Address, Phone, Gender, Race, DOB, Height
Glucose: Glucose_ID, Patient_ID, Date, Weight, Med_ID, Glucose
Meds: Med_ID, DrugCombonation
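The three linked tables can be sketched outside Access as follows. This assumes SQLite via Python's `sqlite3` module; the column types, the sample rows and the glucose values are made up, and only the table and field names come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE Demographics (
        Patient_ID INTEGER PRIMARY KEY, Fname TEXT, Lname TEXT, Address TEXT,
        Phone TEXT, Gender TEXT, Race TEXT, DOB TEXT, Height REAL);
    CREATE TABLE Meds (
        Med_ID INTEGER PRIMARY KEY,
        DrugCombonation TEXT);  -- field name spelled as on the slide
    CREATE TABLE Glucose (
        Glucose_ID INTEGER PRIMARY KEY,
        Patient_ID INTEGER NOT NULL REFERENCES Demographics(Patient_ID),
        Date TEXT, Weight REAL,
        Med_ID INTEGER REFERENCES Meds(Med_ID),
        Glucose REAL);
""")

# Each fact is stored once: the patient's name lives only in Demographics.
conn.execute("INSERT INTO Demographics (Patient_ID, Fname, Lname, Height) "
             "VALUES (1, 'Ann', 'Lee', 165)")
conn.execute("INSERT INTO Meds VALUES (1, 'A+B')")
conn.executemany(
    "INSERT INTO Glucose (Patient_ID, Date, Weight, Med_ID, Glucose) VALUES (?, ?, ?, ?, ?)",
    [(1, "2024-01-01", 62.0, 1, 5.4), (1, "2024-01-02", 61.8, 1, 5.1)])

# The relationships let a query reassemble the full picture on demand.
rows = conn.execute("""
    SELECT d.Lname, g.Date, g.Glucose, m.DrugCombonation
    FROM Glucose AS g
    JOIN Demographics AS d ON d.Patient_ID = g.Patient_ID
    JOIN Meds AS m ON m.Med_ID = g.Med_ID
    ORDER BY g.Date
""").fetchall()
print(rows)
```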

Table Demonstration - Live

General Setup for Tables
- Describe general options
- Show validation rule
- Relationships
- Lookup option

Table Relationships - Live

Table Relationships Describe Cascade Features

Table Import / Link - Live


- Importing a table makes a copy of existing data.
- Linking a table lets you control existing data through Access (exercise caution!).

Note that you may import non-Access files.

MS Access Module 2 Summary


Data storage principles
1. Attempt to store data 1 time / 1 place.
2. Do not store data that may be calculated from other fields (utilize queries).
3. Strive for very discrete data storage (no ambiguity: garbage in / garbage out).
4. Choose a real or arbitrary (autonumber) unique identifier for each record.

Relationships
Use table relationships to automatically cascade delete and update records.

Other Data Sources
Import = Copy; Link = Live Connect.
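Cascading behaviour can be sketched with foreign-key actions. The example assumes SQLite via Python's `sqlite3` module instead of the Access relationship dialog, and uses cut-down versions of the Demographics and Glucose tables with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce relationships
conn.executescript("""
    CREATE TABLE Demographics (Patient_ID INTEGER PRIMARY KEY, Lname TEXT);
    CREATE TABLE Glucose (
        Glucose_ID INTEGER PRIMARY KEY,
        Patient_ID INTEGER REFERENCES Demographics(Patient_ID)
            ON DELETE CASCADE ON UPDATE CASCADE,
        Glucose REAL);
    INSERT INTO Demographics VALUES (1, 'Lee'), (2, 'Cruz');
    INSERT INTO Glucose (Patient_ID, Glucose) VALUES (1, 5.4), (1, 5.1), (2, 6.0);
""")

# Cascade update: renumbering a patient renumbers that patient's measurements.
conn.execute("UPDATE Demographics SET Patient_ID = 20 WHERE Patient_ID = 2")
# Cascade delete: removing a patient removes that patient's measurements.
conn.execute("DELETE FROM Demographics WHERE Patient_ID = 1")

remaining = conn.execute("SELECT Patient_ID, Glucose FROM Glucose").fetchall()
print(remaining)
```

Cascade delete removes child records without asking, which is convenient but worth enabling deliberately.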

Microsoft Access Module 3

Creating / Working with Queries

Query Overview - 1
An MS-Access query is a set of stored SQL instructions that manipulate and/or select data from one or more tables.

- Select Query: data grouping and/or filtering
- Make-Table Query: select + creates/populates a new table
- Update Query: updates fields from specified table data
- Append Query: runs a query on one table, appends results to a table
- Delete Query: deletes selected records from a table
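The five query types map onto ordinary SQL statements. The sketch below assumes SQLite via Python's `sqlite3` module and a made-up Glucose table; Access SQL differs in places (for example, its make-table query is written as SELECT ... INTO, where SQLite uses CREATE TABLE ... AS SELECT).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Glucose (Glucose_ID INTEGER PRIMARY KEY, Patient_ID INTEGER, Glucose REAL);
    INSERT INTO Glucose (Patient_ID, Glucose) VALUES (1, 5.4), (1, 9.8), (2, 6.0);
""")

# Select query: filter rows.
high = conn.execute(
    "SELECT Patient_ID, Glucose FROM Glucose WHERE Glucose > 7").fetchall()

# Make-table query: select and populate a new table in one step.
conn.execute("CREATE TABLE HighGlucose AS SELECT * FROM Glucose WHERE Glucose > 7")

# Update query: change field values in the rows that match.
conn.execute("UPDATE Glucose SET Glucose = 6.1 WHERE Glucose_ID = 3")

# Append query: run a select on one table and add the results to another.
conn.execute(
    "INSERT INTO HighGlucose SELECT * FROM Glucose WHERE Glucose > 6 AND Glucose <= 7")

# Delete query: remove the selected records.
conn.execute("DELETE FROM Glucose WHERE Patient_ID = 1")

high_table = conn.execute(
    "SELECT Glucose_ID, Glucose FROM HighGlucose ORDER BY Glucose_ID").fetchall()
left = conn.execute("SELECT Glucose_ID FROM Glucose").fetchall()
print(high, high_table, left)
```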

One Table Query Example - Live


Use this button to toggle between design, sheet and SQL views.

Right-Click + Add to add table(s)

Custom sort by one or more fields.

Drag and Drop Fields

2-Table Query Example - Live


Right-Click + Add to add table(s). Note that the relationship is often created automatically.

Drag and Drop Fields

Calculated Field: BMI: [Weight]/([Height]/100)^2

Right-clicking the gray area above a field enables property changes.

Query Calculating Fields


Name the calculated field, then type a colon, then type the equation using brackets ( [ ] ) around table fields. If there is ambiguity in the field names between tables, you may need to type table.[field] format. Ex: BMI: [Weight]/([Height]/100)^2
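The same calculated field can be sketched outside Access. This assumes SQLite via Python's `sqlite3` module and, for brevity, a hypothetical single Patients table holding both Weight (kg) and Height (cm); the sample values are made up.

```python
import sqlite3

def bmi(weight_kg, height_cm):
    """BMI as on the slide: [Weight]/([Height]/100)^2."""
    return weight_kg / (height_cm / 100) ** 2

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patients (Patient_ID INTEGER PRIMARY KEY, Weight REAL, Height REAL);
    INSERT INTO Patients VALUES (1, 64.0, 160.0);
""")

# The calculated field lives in the query, not in the table.
# SQLite has no ^ operator, so the square is written as a product.
row = conn.execute("""
    SELECT Patient_ID,
           Weight / ((Height / 100.0) * (Height / 100.0)) AS BMI
    FROM Patients
""").fetchone()

print(row[0], round(row[1], 1), round(bmi(64.0, 160.0), 1))
```

Computing BMI in the query keeps it correct whenever Weight or Height changes, which is the reason for not storing calculated data.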

Query Sorting Data

Choose Ascending or Descending in the Sort row. This query would sort by Gender, THEN by Race.
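In SQL terms, a sort on two fields is an ORDER BY with two keys. A minimal sketch, assuming SQLite via Python's `sqlite3` module and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Demographics (Patient_ID INTEGER PRIMARY KEY, Gender TEXT, Race TEXT);
    INSERT INTO Demographics (Gender, Race) VALUES
        ('M', 'White'), ('F', 'White'), ('M', 'Asian'), ('F', 'Black');
""")

# Sort keys apply left to right: Gender first, Race breaks ties within a gender.
ordered = conn.execute(
    "SELECT Gender, Race FROM Demographics ORDER BY Gender ASC, Race ASC").fetchall()
print(ordered)
```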

Query Filtering Data


You need not show a field in the results to use it as a filter.

This query will return all records in the database for:
- Females
- who are not white
- whose height is greater than 150 cm
- and who weigh between 60 and 70 kg
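The filter described above corresponds to a WHERE clause. The sketch assumes SQLite via Python's `sqlite3` module; the single Patients table, the coded values ('F', 'White') and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patients (
        Patient_ID INTEGER PRIMARY KEY, Gender TEXT, Race TEXT, Height REAL, Weight REAL);
    INSERT INTO Patients VALUES
        (1, 'F', 'Black', 162, 65),
        (2, 'F', 'White', 170, 66),
        (3, 'M', 'Asian', 175, 68),
        (4, 'F', 'Asian', 148, 61),
        (5, 'F', 'Black', 158, 72);
""")

# Gender, Race, Height and Weight are used as criteria but are not returned.
matches = conn.execute("""
    SELECT Patient_ID FROM Patients
    WHERE Gender = 'F' AND Race <> 'White'
      AND Height > 150 AND Weight BETWEEN 60 AND 70
    ORDER BY Patient_ID
""").fetchall()
print(matches)
```

Only patient 1 passes every criterion: each of the others fails exactly one of them.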

Query Filter Operators


=             equals
>             greater than
>=            greater than or equal
<             less than
<=            less than or equal
<>            not equal to
Between       between two values
Is Null       field is empty
Is Not Null   field is not empty
Like          matches a pattern (Like "John*")
OR            logical OR (one or other is true)
AND           logical AND (both are true)
etc.

Query Grouping Data - 1


Clicking the Totals Button Enables Grouping, Counting and Statistical Options

Running this Query indicates there are 203 Females and 261 Males in the database. Notice new Total row. Each field (column) can be set.

Query Grouping Data -2


Totals Options Include: Group By, Sum, Avg, Min, Max, Count, StDev, Var
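The Totals row corresponds to GROUP BY with aggregate functions. A minimal sketch, assuming SQLite via Python's `sqlite3` module and made-up rows; SQLite's core has no StDev or Var aggregate, so only the options it shares with Access are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Demographics (Patient_ID INTEGER PRIMARY KEY, Gender TEXT, Height REAL);
    INSERT INTO Demographics (Gender, Height) VALUES
        ('F', 160), ('F', 170), ('M', 180), ('M', 175), ('M', 185);
""")

# One output row per Gender; every other column must be an aggregate.
totals = conn.execute("""
    SELECT Gender, COUNT(*), AVG(Height), MIN(Height), MAX(Height)
    FROM Demographics
    GROUP BY Gender
    ORDER BY Gender
""").fetchall()
print(totals)
```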

Query Export Data


1) Create and save the query.
2) Use OfficeLinks (Excel toggle option) to analyze it with Excel.
3) Data is automatically exported to Excel.
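Outside Access, a comparable hand-off is to write query results to a CSV file that Excel can open. The sketch below is not the OfficeLinks feature; it assumes SQLite via Python's `sqlite3` module and the standard `csv` module, with a made-up table.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Glucose (Patient_ID INTEGER, Date TEXT, Glucose REAL);
    INSERT INTO Glucose VALUES (1, '2024-01-01', 5.4), (2, '2024-01-01', 6.0);
""")
cursor = conn.execute(
    "SELECT Patient_ID, Date, Glucose FROM Glucose ORDER BY Patient_ID")

buffer = io.StringIO()  # swap for open("glucose.csv", "w", newline="") to write a file
writer = csv.writer(buffer)
writer.writerow([col[0] for col in cursor.description])  # header row from column names
writer.writerows(cursor)  # one CSV line per result row
csv_text = buffer.getvalue()
print(csv_text)
```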

Microsoft Access Module 4

Creating / Working with Forms/Reports

Graphical User Interface (GUI)


Although it is possible to enter data directly into a table, you can enhance data quality by forcing data entry through forms. Depending upon your users, you may wish to set things up so they never even see the database window. In other words, you can design your application so they only touch the data through programmed forms.

Graphical User Interface (GUI)


Continuing with the glucose database we formulated earlier, we'll now attempt to build a graphical user interface to:
- Collect data
- Periodically report data through pre-formatted reports
- Quit the program

GUI Forms/Report Live

Out of Program
