
CHAPTER IV DATABASES AND FILES

1. DATABASE

Specific Instructional Objective In this chapter, trainees will learn what a database is and how a
database works.

What is a database? A database is a collection of one or more files that are linked to each other.

How does a database work? In a database, the file or files it contains are always kept open, so
the information in them can be processed continuously without having to open, retrieve and
close a file each time it is needed (which can be very time consuming).

By comparison, with something that is not a database (an MS Word document, for example), each
time we want to see the information it holds, the file has to be opened and, once we have
finished, closed again.

How are the files in a database linked? Each file is linked using a particular key. There may be
just one key or several keys, linking to one or several files.

Example:
In a file of Orders, each Order record will have in it the Customer Number and the
Product Numbers (one for each item on the order). Customer Number is also the Key
to the Customer record, in which details like name and address are held. Product
Number is also the Key to the Product or Stock record, which holds the Description,
Unit Price etc.

FileOrganisation 31
Data Relationships

Introduction In order to understand how these links are defined and created, data relationships must
be understood.

These are defined using an Entity Model or Logical Data Structure (LDS). This
represents the relationships between the Entities in the system in diagrammatic form.

Definitions Entity: Something we need to hold information about (in effect, a Record).
Attribute: Something we need to know about an Entity (in effect, a Field).

Example We will use this example over the following pages to show you how data relationships
work within a database:

A Sales Order system needs to hold information about:

Orders
Products
Customers
Invoices

These are the entities in the system, which are represented in diagrams by name, in the form
CUSTOMER.

We then have to determine what the relationships are between these entities, and how
we can uniquely identify records within them.

Defining the Relationship

Introduction We define data relationships by drawing up the relationships between entities using
symbols. Relationships can be either ONE-TO-MANY or MANY-TO-MANY.

ONE-TO-MANY Relationships This is done by drawing up the ONE-TO-MANY relationships between
entities using the symbols:-

[Diagram: a connecting line with its MANY end and its ONE end marked]

Therefore, in our sales order processing system, the relationship between ORDER and
CUSTOMER would be shown as:

[Diagram: ORDER (MANY end) joined to CUSTOMER (ONE end)]

This shows that for every Order, there is only ONE Customer, but for each Customer
there are MANY Orders.

MANY-TO-MANY Relationships Some of the relationships, as they appear initially, are not
One-to-Many but Many-to-Many.

For example:
There is such a relationship between Order and Product, as there is generally more
than one Product on any one Order (one on each line), and any one Product can appear on
many Orders. This makes it impossible to identify uniquely a particular record within the
group and therefore we have to find a way of establishing one-to-many relationships
throughout the Model.
The existence of a many-to-many in the diagram indicates the presence of a hidden
or composite entity.

How do we deal with a MANY-TO-MANY Relationship? See the diagram below:

Diagram This is how we deal with a MANY-TO-MANY relationship:

FROM:

ORDER (many) ---- (many) PRODUCT

TO:

ORDER (one) ---- (many) ORDER LINE (many) ---- (one) PRODUCT
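
The step from ORDER and PRODUCT to ORDER LINE can be sketched in Python. This is only an
illustration under assumed names: the sample orders, the field names (order_no, line_no,
product_no) and the helper to_order_lines are hypothetical and are not taken from any
particular system.

```python
# Hypothetical sample data: each order lists several products, and a product
# can appear on several orders, so ORDER to PRODUCT is many-to-many.
orders = {1001: ["P1", "P2"], 1002: ["P2"]}


def to_order_lines(orders):
    """Split each order into ORDER LINE records, one per product on the order."""
    lines = []
    for order_no, product_nos in orders.items():
        for line_no, product_no in enumerate(product_nos, start=1):
            # Each line belongs to exactly ONE order and names exactly ONE product.
            lines.append(
                {"order_no": order_no, "line_no": line_no, "product_no": product_no}
            )
    return lines


order_lines = to_order_lines(orders)
```

Each ORDER LINE record now sits on the MANY end of two one-to-many relationships, one with
ORDER and one with PRODUCT.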

Linking Files

Introduction Once the final entity model has been drawn, the database files and their keys are
identified, together with the links between files. The basic structure is then complete
and can be implemented.

How does linking work? This allows links to be set up between the Order file and the Customer
file, and between the Order file and the Stock file, providing the computer with a sort of map
to help it navigate its way through the files to find the relevant records.

So, if an Order record is accessed, the computer can then identify the Customer for
that Order and find the appropriate record in the Customer file and, if the Customer
record is accessed first, the machine can find all the Orders for that Customer. In the
same way, it can find the necessary Product information in the Stock file.
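
The navigation described above can be sketched in Python, with dictionaries standing in for
keyed files. The sample records and the three helper functions are hypothetical and only
illustrate the idea of following a key from one file to another.

```python
# Hypothetical sample files. Customer and Stock are keyed by their own numbers;
# each Order record carries the keys of the Customer and Product it refers to.
customers = {"C01": {"name": "Acme Ltd"}, "C02": {"name": "Bravo Co"}}
stock = {"P1": {"description": "Widget"}, "P2": {"description": "Gadget"}}
orders = [
    {"order_no": 1, "customer_no": "C01", "product_no": "P1"},
    {"order_no": 2, "customer_no": "C02", "product_no": "P2"},
    {"order_no": 3, "customer_no": "C01", "product_no": "P2"},
]


def customer_for_order(order):
    """Order accessed first: follow its Customer Number to the Customer record."""
    return customers[order["customer_no"]]


def product_for_order(order):
    """Follow the Order's Product Number to the Stock record."""
    return stock[order["product_no"]]


def orders_for_customer(customer_no):
    """Customer accessed first: find all Orders that carry its key."""
    return [o for o in orders if o["customer_no"] == customer_no]
```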

What are the advantages of linking files this way? This has a number of advantages:

Firstly, it means that data does not have to be held more than once in the system, as is
often necessary in conventional systems in order to avoid having to access too many
different files.

Secondly, it allows great flexibility in the way in which reports can be produced
because all (or nearly all) the information in the system is readily available.

Example In the following example, if Customer, Stock and Order files are held on a database,
the information required to produce the Invoice is already present in those files
without having to hold it again elsewhere.

Note:
The Orders file holds keys to other files. These are Foreign Keys, and are the means used
by databases to navigate from one file to another.

See the example diagram below.

Diagram

CUSTOMER FILE
*Customer Number
Name and Address
Payment Terms

ORDERS FILE
*Order Number
#Customer Number
#Product Number
Quantity

STOCK FILE
*Product Number
Description of Goods
Unit Price

INVOICE FILE
*Invoice Number (generated by computer)
Invoice Date (generated by computer)
#Customer Number
Name and Address
#Order Number
Quantity
#Product Number
Description
Unit Price
Total Price (calculated by computer)

NB: * = Key field
# = Foreign Key

As can be seen above, no additional information is necessary to create the Invoice.


In a conventional system, it would be necessary to hold all of the Product and
Customer information within the Orders file in order to create an invoice without
retrieving and accessing other files.
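
As a rough sketch of the point above, the following Python builds an Invoice record purely
from the Customer, Stock and Orders data, using the foreign keys held on the Order. The
sample values, the field names and the build_invoice helper are assumptions made for
illustration only.

```python
from decimal import Decimal

# Hypothetical sample files, laid out as in the diagram.
customers = {"C01": {"name_and_address": "Acme Ltd, 1 High Street"}}
stock = {"P1": {"description": "Widget", "unit_price": Decimal("2.50")}}
order = {"order_no": 1, "customer_no": "C01", "product_no": "P1", "quantity": 4}


def build_invoice(order, invoice_no, invoice_date):
    """Assemble an Invoice from data already held in the linked files."""
    customer = customers[order["customer_no"]]  # follow foreign key to Customer
    product = stock[order["product_no"]]        # follow foreign key to Stock
    return {
        "invoice_no": invoice_no,        # generated by computer
        "invoice_date": invoice_date,    # generated by computer
        "customer_no": order["customer_no"],
        "name_and_address": customer["name_and_address"],
        "order_no": order["order_no"],
        "quantity": order["quantity"],
        "product_no": order["product_no"],
        "description": product["description"],
        "unit_price": product["unit_price"],
        "total_price": order["quantity"] * product["unit_price"],  # calculated
    }


invoice = build_invoice(order, invoice_no=9001, invoice_date="2000-01-31")
```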

Foreign Keys Note that there are also a number of Foreign Keys within the Invoice record. This
means that there are also links between the Invoice file and the files for which it holds
those Keys.
This is where the database gets its flexibility in reporting from. It can use the links
between files to produce, for example, reports for Orders by Product, Orders by Customer,
or even Orders by Customer for a specified range of Products, again without having to spend
time finding and opening various files. Provided there is a relationship via a Foreign Key,
the database can go directly to the record it wants and get the information required.

The Database Management System

Introduction The DBMS is a special piece of software which deals with the organisation, storage
and retrieval of the information held in the database and acts as a link between the
Operating System and the Application Programs in use.

What does the DBMS do? It is more than just a data handling facility and consists of a range of
system utilities, including:

A Data Dictionary - which holds all the field definitions centrally, so that data is
defined once and once only, unlike conventional systems which require a separate
definition for each occurrence of an item of data.
A Recovery Facility - which enables recreation of the database in the event of a
systems failure.
Access & Privacy Controls - which allow restriction of Read, Write, Create, Amend
and Delete access to specified areas of the database.
A Report Generator - which allows users to easily specify and produce ad-hoc
reports by combining data from different areas of the database.
A Data Accessing Language - which enables navigation through the structure of the
database to retrieve and store any data required.

Database System Documentation

Introduction In addition to the usual system documentation - file layouts, system specifications,
user guides (which will be just like those of a conventional system), a database system
will have two additional forms of documentation:
Schema
Subschemas

Schemas Also known as a Bachman Diagram, the Schema is a diagram showing the files in
the database and their relationships. It shows which files are linked and therefore
which access path is required to retrieve any record from a file. A schema for the
example used previously might look like the diagram below:

[Schema diagram: CUSTOMER at the top; ORDER and INVOICE below it; ORDER LINE and INVOICE LINE
below those; STOCK at the bottom]

This is a very simplified example, but it shows the relationships between files. Note
that Order Line and Invoice Line have to be separate files in order to have a unique
relationship with Stock, as there is one product for each line on the Invoice and the
Order.
In a full size Schema, there would be many more files and more information. In
addition to identifying the links between files, a Schema will also show the proper file
names and whether a record can be accessed directly via its Key or whether it has to be
accessed through its Parent, i.e. the file to which it belongs. For example, Order Line
is the Child of Order and can only be retrieved once the basic Order Header
information has been accessed from the Order file.

Subschemas Subschemas - or Views - show which parts of the database are accessible by particular
programs. Obviously most programs do not need to access the whole of the database
and there are good reasons why database administrators would not want them to:
Firstly, there is a time overhead in having large parts of the database open to a
program - the less of the database it accesses, the less time the processing takes.
Secondly, it is good security practice to restrict access to those files required for
processing and no more - it reduces the chances of accidental or deliberate interference
with other areas. For example, it could prevent the amendment of a routine program to
get into a critical area of the database.
Subschemas are not generally in the form of a diagram. Usually they are in the form
of a printout which shows the program (or program suite) name, together with a list of
files that are available to it. They are a useful aid in determining exactly how a system
works, as knowing which files are accessed gives a clearer idea of what a program is
doing.

Types of Database Organisation

Introduction There are three common types of database:


Hierarchical

Network

Relational

Hierarchical Organised into groups of PARENTS and CHILDREN (records and subrecords).
Originally only one parent to a child and vice versa.
Each file holds the address of the file in which its subrecords are held. (Referred to as
PARTITIONING).

Network Organised into networks of records and subrecords, with a number of parents and
children possible in each set.
Each record holds a pointer to the address of its first sub-record of each type, which in
turn points to the next sub-record. The last sub-record points back to the main record.
Records can also hold backward pointers pointing to the previous record, home
pointers pointing directly back to the parent record and soft pointers pointing directly
to the ultimate owner record. (Referred to as CHAINING).

Relational Records are held in two-dimensional tables, just like flat files. Each record (TUPLE)
in a file (RELATION) is identified by the same Prime Key field, which is the path to the record.

Although this is the LOGICAL view of how the data is held, in fact it is really held
in no particular order at all. The DBMS keeps track of it by first deciding where to
put the data on the disk (via a calculation based on the Key) and reversing the
calculation when it wants to find and retrieve the information.

Make Up of a DBF File

Introduction When a DBF File is imported into MS Access, the program uses information
contained in the file header to define the file e.g. number of records and field
information.

Following is a breakdown of how this part of a DBF File is laid out.

DBF File Information The first 32-byte section of a DBF File holds the file information.
All other 32-byte sections hold field information.

For Example:
If the DBF File has five fields, there will be a total of six 32-byte sections:
1 x File Information
5 x Field Information

Description of File Bytes Following is a table laying out the content of the File Bytes:

Byte Number   Description

1             File Type Identifier:
                3    .dbf without memo (FoxBase+/FoxPro/dBASE III PLUS/dBASE IV)
                131  .dbf with memo (FoxBase+/dBASE III PLUS)
                139  .dbf with memo (dBASE IV)
                245  .dbf with memo (FoxPro)
2             Year of last update
3             Month of last update
4             Day of last update
5-8           Number of records in file.
              The number of records is calculated with the following formula:
              (byte#5) + (byte#6 * 256) + (byte#7 * 256 * 256) + (byte#8 * 256 * 256 * 256)
9-10          Offset to start of data.
              The offset to the start of data is computed from the beginning of the file
              to the first data record. The offset is calculated with the formula
              (byte#9) + (byte#10 * 256)
11-12         Size of record.
              The size of the records is calculated with the formula
              (byte#11) + (byte#12 * 256). This number represents the sum of the field
              sizes plus 1. The extra 1 is the deletion flag.
13-28         Not used
29            0    No .cdx file attached to the database
              1    .cdx file attached to the database
30-32         Not used
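
The byte arithmetic in the table above can be sketched in Python with the standard struct
module. The function name parse_dbf_header and the sample bytes are assumptions made for
illustration, and the year byte is returned exactly as stored. Byte numbers in the table
count from 1, while Python indexes from 0, so byte#29 is header[28].

```python
import struct


def parse_dbf_header(header: bytes) -> dict:
    """Decode the 32-byte file information section described in the table."""
    if len(header) < 32:
        raise ValueError("the file information section is 32 bytes long")
    # "<" = low byte first, which is what the (byte * 256) formulas describe:
    # 4 single bytes, one 4-byte count, then two 2-byte values.
    (file_type, year, month, day,
     n_records, data_offset, record_size) = struct.unpack("<4BIHH", header[:12])
    return {
        "file_type": file_type,
        "last_update": (year, month, day),
        "n_records": n_records,
        "data_offset": data_offset,
        "record_size": record_size,
        "has_cdx": header[28] == 1,
    }


# Hypothetical header: type 3, 261 records, data at offset 193, 55-byte records.
sample = bytes([3, 99, 12, 31, 5, 1, 0, 0, 193, 0, 55, 0]) + bytes(16) + bytes([1]) + bytes(3)
info = parse_dbf_header(sample)
```

Here the record count comes out as 5 + (1 * 256), matching the formula for bytes 5-8.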

Description of Field Bytes Following is a table laying out the content of the Field Bytes:

Byte Number   Description

1-10          Field Name
11            Must be 0
12            Field Type:
                67  C  Character
                68  D  Date
                76  L  Logical
                77  M  Memo
                78  N  Numeric
                71  G  General
13-14         Offset of field from beginning of record.
              The offset is calculated with the following formula:
              (byte#13) + (byte#14 * 256)
15-16         Not used
17-18         Field length (non-numeric fields).
              The non-numeric field length is calculated with the following formula:
              (byte#17) + (byte#18 * 256)
17            Field length (numeric fields)
18            Field decimals (numeric fields)
19-32         Not used
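
A field information section can be decoded in the same spirit. The sketch below follows the
byte positions given in the table (converted to Python's 0-based indexing); the function name
parse_dbf_field and the two sample descriptors are hypothetical.

```python
def parse_dbf_field(desc: bytes) -> dict:
    """Decode one 32-byte field information section, per the table above."""
    if len(desc) != 32:
        raise ValueError("a field information section is 32 bytes long")
    # Bytes 1-10 hold the name, padded with zero bytes; byte 11 must be 0.
    name = desc[0:10].split(b"\x00")[0].decode("ascii")
    field_type = chr(desc[11])              # byte 12, e.g. 67 -> "C"
    offset = desc[12] + desc[13] * 256      # bytes 13-14
    if field_type == "N":
        length, decimals = desc[16], desc[17]           # bytes 17 and 18
    else:
        length, decimals = desc[16] + desc[17] * 256, 0  # bytes 17-18 combined
    return {"name": name, "type": field_type, "offset": offset,
            "length": length, "decimals": decimals}


# Hypothetical descriptors: a 20-character field and a numeric field with 2 decimals.
char_desc = b"CUSTNAME" + bytes(3) + b"C" + bytes([1, 0]) + bytes(2) + bytes([20, 0]) + bytes(14)
num_desc = b"CREDLIM" + bytes(4) + b"N" + bytes([21, 0]) + bytes(2) + bytes([8, 2]) + bytes(14)
char_field = parse_dbf_field(char_desc)
num_field = parse_dbf_field(num_desc)
```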

2. FILE

File Organisation
Organising Files

Introduction This section describes the ways that computer files may be physically organised. The
method of organisation will depend on speed of access required, cost and the
hardware available.

Sequential Files Records are held in the order of a key field. The file must be read one record after
another to find a specific record. For example, if a file is sequential by record
number, the computer must read records 1 to 99 first to find record number 100.

Indexed-Sequential Files Records are held in the order of a key field. However, to avoid
reading through all the previous records to find a particular record, the computer creates
an index that holds the key of each record and its location. The computer will look up the
key field in the index and use the location reference to find the record. For smaller files,
this may take longer than simply accessing the records sequentially. However, as file
sizes increase, the time saved by using the index increases.
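
The difference between the two access methods can be sketched in Python. This is an
in-memory illustration only: the list stands in for a file held in key order, the dictionary
stands in for the index, and all names are hypothetical.

```python
# Hypothetical file of 100 records held in key order (record numbers 1 to 100).
records = [{"key": k, "data": f"record {k}"} for k in range(1, 101)]


def sequential_find(records, key):
    """Read one record after another until the key is found; count the reads."""
    reads = 0
    for rec in records:
        reads += 1
        if rec["key"] == key:
            return rec, reads
    return None, reads


# The index holds the key of each record and its location in the file.
index = {rec["key"]: position for position, rec in enumerate(records)}


def indexed_find(records, index, key):
    """Look the key up in the index, then go straight to that location."""
    position = index.get(key)
    return None if position is None else records[position]
```

Finding record number 100 sequentially takes 100 reads (records 1 to 99, then the record
itself), while the indexed lookup goes directly to its location.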

Algorithmic or Random Access Files The computer performs a calculation (i.e. an algorithm) on
the key of the record. The result of this calculation determines the address of the record in
the file. A computer can perform this calculation much faster than it can read an index.
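
A minimal sketch of the idea in Python follows. The calculation used here (the remainder
after dividing the key by the number of slots) is just one simple choice, and keeping a
small list per slot for keys that land on the same address is an assumption of this sketch,
not something the text prescribes.

```python
N_SLOTS = 7  # hypothetical number of addresses available in the file
slots = [[] for _ in range(N_SLOTS)]


def record_address(key: int, n_slots: int = N_SLOTS) -> int:
    """Calculate the address of a record from its key."""
    return key % n_slots


def store(key: int, data: str) -> None:
    """Place the record at the address calculated from its key."""
    slots[record_address(key)].append((key, data))


def fetch(key: int):
    """Repeat the same calculation to go straight to the record's address."""
    for stored_key, data in slots[record_address(key)]:
        if stored_key == key:
            return data
    return None


store(1004, "customer 1004")
store(1011, "customer 1011")  # 1011 calculates to the same address as 1004
```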

Free Text Files These files have no records or keys but simply consist of text. They are usually read
as a whole.

For example: A document produced by a word processor.

Database Files A collection of data stored in tables linked by keys under the control of a Data Base
Management System (DBMS).

File types

Introduction This section describes the main file types and what they are used for.

Master Files A file which contains standing data information which changes relatively
infrequently and is used for reference purposes. Sometimes called Table files or
Look-up tables.

Example Common examples of master files are:


Customer files
Product files
Tax Rate files

Transaction Files Files containing data that is updated regularly by day-to-day processing,
e.g. tax payments, sales, deliveries, orders etc. They change continually as some activity is
always taking place. Essentially, it is the transaction data that is processed through
the system and standing data that is used to process and control it.

Example Common examples of transaction files are:


General Ledger Transaction files
Invoice files
Purchase Order files

File Layouts

Introduction This section describes the different types of data that may be held in a file and how
these are documented in a file layout.

What a File Layout Contains In order for a program to access information within a file, a
description of that information has to be given to the computer in the form of a File
Specification or Layout. The computer needs to know the following things about each item:

Record Format (whether all records have the same length (Fixed Length Files)
or they vary depending on what they contain (Variable Length)).
Record Length (or maximum length for variable records).
Field Name (as decided by the programmer).
Data Type for each Field (e.g. alpha, numeric, packed, binary).
Format and Length of each Field (known as the PICTURE).
Start Position of each field in the file. In standard COBOL notation, this is
determined by adding up the preceding field lengths. Other forms of notation
define the start and finish positions of the field rather than specifying the field
length.

Data Types The following are common representations of data types:

Data Type                            Representation

Alpha, Alphanumeric or Character     A or C
Numeric                              N
Binary                               B or COMP1 or CSR
                                     (NB. Binary fields are usually specified as 2, 4 or 8
                                     byte fields.)
Packed                               P or COMP3

Picture A picture specifies what kind of data is held in a field (e.g. numeric) and how many
digits or characters that data item will occupy. PICTURE can be abbreviated to PIC.

Shown below is how different types of data are represented in the PIC format:

Data Type Picture Explanation


Numeric 9 One 9 is used for each digit in the field. For
example, a 3-digit data item is represented as
999.
Alphanumeric X One X is used for each character in the field.
For example a 4-character data item is represented
as XXXX.
Sign S Used in value fields to indicate whether the
amount is positive or negative. For example, a 3
digit signed field is represented as S999.
Note: In a packed field, a sign will add half a byte
to the length of the field.
Decimal Places V V marks the position of the decimal point in a
value. For example, a field with three integer digits
and 2 decimal places is represented as 999V99.
Note: The V stands for virtual decimal place
which means that it does not take up any space in
the field.

Abbreviating Long Pictures To represent a long field, the picture can be abbreviated in the
following way:

PIC 99999 = PIC 9(5)
PIC XXXXXXXXXXXX = PIC X(12)
PIC S99999V99 = PIC S9(5)V99
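
The abbreviation rule can be checked with a short Python sketch that expands the repeat
counts back into the long form. The helper names expand_pic and pic_digits are hypothetical,
and the sketch only handles the 9, X and A symbols with a bracketed count, not the full
PICTURE syntax.

```python
import re


def expand_pic(pic: str) -> str:
    """Expand bracketed repeat counts, e.g. 'S9(5)V99' becomes 'S99999V99'."""
    return re.sub(
        r"([9XA])\((\d+)\)",
        lambda m: m.group(1) * int(m.group(2)),
        pic.upper(),
    )


def pic_digits(pic: str) -> int:
    """Count the digit positions in a picture (S and V take no digit position)."""
    return expand_pic(pic).count("9")
```

For example, expand_pic("X(12)") gives twelve X characters, and pic_digits("S9(5)V99")
gives 7.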


Filler There are often fields within records that never need to be referred to. These fields
are called Filler. Filler is often deliberately incorporated into a file to allow
space for expansion in the event of system changes.

Within COBOL, Filler is a reserved word for the name of a data item which coding
will never refer to directly. There can be several fields with the name Filler.
Programs will not object to the fact that there are many fields with the same name.

Note: When using interrogation packages such as ACL, a unique name is required,
i.e. FILLER1, FILLER2, etc.


Example of a File Layout A typical file layout for a Customer in a Sales Ledger might look like
this:

Field Name   Type   Pic         Notes
CUSTNO       N      9(4)        Customer No - numeric, max 9999
CUSTNAME     C      X(20)       Customer name, up to 20 characters
CUSTADDR     C      X(30)       Address, up to 30 characters
CREDLIM      B      9(6)        Credit Limit, max value 999,999
CUSTBAL      P      S9(7)V99    A/c Balance, max + or - 9,999,999.99

The record length for the above file would be:

Field        Length   Explanation
CUSTNO       4
CUSTNAME     20
CUSTADDR     30
CREDLIM      3        (Minimum field length req'd to hold max value 999,999)
CUSTBAL      5        (A half byte for each digit + a half-byte for the sign)
Total in Bytes 62
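
The record length above can be reproduced with a short calculation. The Python below is a
sketch under stated assumptions: the field_length helper is hypothetical, binary fields are
treated as unsigned and given the minimum whole number of bytes (as in the example, rather
than the usual 2, 4 or 8), and packed fields take half a byte per digit plus half a byte for
the sign.

```python
import math


def field_length(data_type: str, size: int) -> int:
    """Bytes needed for a field, given its type and its digits or characters."""
    if data_type in ("N", "C"):
        return size                                   # one byte per digit/character
    if data_type == "B":
        max_value = 10 ** size - 1                    # e.g. 6 digits -> 999,999
        return math.ceil(max_value.bit_length() / 8)  # minimum whole bytes
    if data_type == "P":
        return math.ceil((size + 1) / 2)              # half byte per digit + sign
    raise ValueError(f"unknown data type: {data_type}")


# (field name, type, number of digits or characters) from the layout above.
layout = [
    ("CUSTNO", "N", 4),
    ("CUSTNAME", "C", 20),
    ("CUSTADDR", "C", 30),
    ("CREDLIM", "B", 6),
    ("CUSTBAL", "P", 9),   # S9(7)V99 has 9 digit positions
]
lengths = {name: field_length(kind, size) for name, kind, size in layout}
record_length = sum(lengths.values())
```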

Standing Data and Control of Master Files

Introduction The use of standing data in master files is crucial to the operation of all systems to a
greater or lesser extent. It is often the main means of ensuring that any data input to
the system is processed correctly.

What is Standing Data? Any information used for reference purposes. It tends to be relatively
static (i.e. it is changed infrequently). Typical examples are:

Standing Data                            System
Tax codes held in a Product file         GST System
Names & Addresses in a Customer file     Sales Order System
Tax code held in an Employee file        Payroll System

What is a Master File? Any file containing standing data. It does not have to contain such data
exclusively - it can even contain transaction data or other data, such as current account
balances, which is amended regularly. The basic reference material within the file, however,
will remain fairly stable.

Storage of Standing Data in Processing An item of standing data is often used in a number of
programs. For this reason, standing data is normally held in table form in master files for
reference by each program.

In this way only the file has to be amended in the event of a data change. Table files,
or Look-up Tables, are terms used for some Master files.

It is not desirable to "hard code" standing data in the separate programs, as they would all
have to be amended in the event of a change in the data, e.g. a tax rate change.

Why is Control over Access to Master Files so important? Accurate standing data is vital to the
correctness of processing, i.e. garbage in = garbage out.

If the standing data is incorrect, e.g. if the wrong GST code is allocated to a product,
data will be processed incorrectly.

Control (or lack of it) over access to master files for input/amendment of that data is
therefore a major area of concern to the auditor.

InternalControls 91

Controls over Master File Access

Hardware controls
e.g. keys to terminals, which limit access to certain parts of the system.

Software controls
A combination of User-IDs and passwords to restrict access.

Access permission(s)
Limiting functions which can be performed depending on the type of access
allowed. For example:

- Read: gives read-only access to information
- Write: gives general permission to write to a file
- Amend: allows amendment but not insertion or deletion
- Copy: allows copying of data but gives no write access
- Insert: allows new records to be created
- Delete: allows removal of records

Log records
Files and/or reports showing access that has taken place into specified areas of
the system, giving details of fields amended etc.

Update reports
Files and/or reports showing original and amended records in restricted areas
each time a change is made.

Audit Use of Master File Controls Because of the need for control over access to standing data,
many master files contain fields for:

Date Created
Date Last Amended

The existence of these fields allows a test program to be installed. Once the existing
standing data has been checked manually, this can report all records where the date
in either of the above two fields is later than the date when the data was last checked
and known to be correct. The date criteria can be updated each time to show the date
of the previous run.

In this way a very large file of standing data can be checked on a continuing basis
with very little effort.
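
Such a test program might look like the following Python sketch. The record layout, the
field names date_created and date_last_amended, and the sample dates are assumptions made
for illustration.

```python
from datetime import date


def changed_since(records, last_checked):
    """Report records created or amended after the date the data was last checked."""
    return [
        rec for rec in records
        if rec["date_created"] > last_checked
        or rec["date_last_amended"] > last_checked
    ]


# Hypothetical master file records.
master_file = [
    {"code": "A", "date_created": date(1999, 1, 5), "date_last_amended": date(1999, 1, 5)},
    {"code": "B", "date_created": date(1999, 1, 5), "date_last_amended": date(2000, 3, 1)},
    {"code": "C", "date_created": date(2000, 4, 2), "date_last_amended": date(2000, 4, 2)},
]

# Data was last checked and known to be correct on this (assumed) date.
exceptions = changed_since(master_file, last_checked=date(1999, 12, 31))
```

Only records B and C are reported, so only those need to be re-checked before the date
criterion is moved forward for the next run.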

Computerized Record Keeping Requirements

Revenue Procedure 98-25: Record Keeping Requirement


When a taxpayer's records are maintained within an Automated Data Processing system, essential basic requirements are
summarized in Revenue Procedure 98-25.

Machine-sensible records must:


- Be retained so long as their contents may become material in the administration of any internal revenue law;
- Reconcile with the taxpayer's books and return;
- Contain sufficient transaction-level detail so that the information and the source documents underlying the machine-
sensible records can be identified;
- Be made available to the Service upon request and be capable of being processed.

Taxpayers must also:


- Maintain system documentation relating to retained records;
- Provide the Service at the time of an examination with computer resources that are necessary to process the machine-
sensible records;
- Notify its Field Director of any machine-sensible records lost, destroyed, or no longer capable of being processed, and
describe its plan to replace or restore the affected records.

Third Party Services


A taxpayer's use of a third party (such as a service bureau) to provide services with respect to machine-sensible records does
not relieve the taxpayer of its record keeping obligations and responsibilities.

Revenue Procedure 97-22: Electronic Storage System


Revenue Procedure 97-22 is applicable when a taxpayer's records are maintained by using an electronic storage system that
either images their hardcopy (paper) books and records or transfers their computerized books and records to an electronic
storage media such as an optical disk.

Record Retention Agreement


A taxpayer may request to enter into a Record Retention Limitation Agreement. The request must identify and describe those
records the taxpayer proposes not to retain and explain why those records will not become material to the administration of
any internal revenue law.
Currently, Record Retention Limitation Agreements are not commonly used.

Types of Computer Records

Acceptable Record Formats

- ASCII (American Standard Code for Information Interchange for DOS),
- ANSI (for Windows 3.1+),
- Unicode (Windows 95, 98 and NT),
- EBCDIC (Extended Binary Coded Decimal Interchange Code for IBM mainframe computers),
- ACCESS 97 or 2000, and
- EXCEL 97 or 2000.

Access can import data in ASCII, ANSI, and Unicode formats while EBCDIC files must be converted to a compatible
format before they can be imported.

Types of EP Data Files
EP audits utilize the data processing files normally encountered in taxable
CEP corporate audits, in addition to those created and maintained solely for
use in administering employee plans. These files include:
-General ledger summary/detail files,
-Employee information/payroll system files,
-Forms 1099-R and W-2 files, and
-Participant account records.

Figure 1. IRS rules on Computerized Record Keeping Requirements

ModulKomputerAudit 1