SPSS Limited
Maygrove House
67 Maygrove Road
LONDON
NW6 2EG
England
1 Introduction
1.1 What Quantum does
1.2 Stages in a Quantum run
4 Basic elements
4.1 Data constants
Individual constants
Strings of data constants
4.2 Numbers
Whole numbers
Real numbers
4.3 Variables and arrays
Data variables
Integer variables
Real variables
Reading real numbers from columns
4.4 Subscription
5 Expressions
5.1 Arithmetic expressions
Combining arithmetic expressions
Counting the number of codes in a column
Generating a random number
5.2 Logical expressions
Comparing values
Comparing data variables and data constants
Checking the arithmetic value of a field of columns
Combining logical expressions
Quantum User’s Guide Volume 1
About this guide
The Quantum User’s Guide is written primarily for Quantum spec writers. It is also a useful
reference for Quanvert database administrators and others who prepare data for use with Quanvert
or Quanvert Text.
This guide is not intended as a tutorial or teach-yourself document. Instead, it provides a complete
and detailed description of the Quantum language and the Quantum programs. However, the guide
has been designed with your needs in mind. If you are an experienced user, you will find the Quick
Reference boxes at the start of each section helpful as a reminder of syntax. If you are less
experienced, you will probably prefer the more detailed explanations and examples in the main
body of each section.
The Quantum User’s Guide is divided into four volumes, which are described in more detail below.
All the volumes contain a comprehensive index that covers all four volumes.
Volume 1
• Chapters 1 to 3 give you an overview of the language and explain the basic concepts of
Quantum spec writing.
• Chapter 6, ‘How Quantum reads data’, describes types of records, data structure, trailer cards,
reserved variables, merging data files and reading non-standard data files.
• Chapter 7, ‘Writing out data’, describes creating a new data file, copying records to a print file,
and writing to a report file.
• Chapter 9, ‘Flow control’, describes the if and else statements, routing around statements,
loops, rejecting records, jumping to the tabulation section and canceling the run.
• Chapter 11, ‘Data validation’, describes the require statement, column and code validation, and
validating logical expressions.
• Chapter 12, ‘Data correction’, describes forced cleaning, on-line data correction, creating clean
and dirty data files, correcting data from a corrections file, and missing values in numeric
fields.
• Chapter 13, ‘Using subroutines in the edit’, describes how to call up subroutines, the
subroutines in the Quantum library, writing your own subroutines and calling functions from
C libraries.
• Chapter 14, ‘Creating new variables’, describes how to name and define variables in your
Quantum spec.
• Chapter 16, ‘Running Quantum under Unix and DOS’, describes how to compile and run your
Quantum program.
Volume 2
• Chapter 2, ‘The hierarchy of the tabulation section’, describes the components of a tabulation
program, the hierarchies of Quantum, how to define run conditions, the options that are
available on the a, sectbeg, flt and tab statements, the default options file and some sample
tables.
• Chapter 3, ‘Introduction to axes’, describes how to create an axis, the types of elements within
an axis, how to define conditions for an element, the n count creating elements, subheadings,
netting and axes within axes.
• Chapter 4, ‘More about axes’, describes the col, val, fld and bit statements, filtering within an
axis, and options on axis elements.
• Chapter 5, ‘Statistical functions and totals’, describes totals, averages, means, the standard
deviation, standard error and error variance statements and how to create percentiles.
• Chapter 6, ‘Using axes as columns’, describes special considerations for when axes are used
for the columns of a table.
• Chapter 7, ‘Creating tables’, describes the syntax of the tab statement, multidimensional tables,
multilingual surveys, combining tables, printing more than one table per page, and suppressing
percentages and statistics with small bases.
• Chapter 8, ‘Table texts’, describes table titles, underlining titles, printing text at the foot of a
page, table and page numbers and controlling table justification.
• Chapter 9, ‘Filtering groups of tables’, describes general filter statements, named filters and
nested filter sections.
• Chapter 10, ‘Include and substitution’, describes filing and retrieving statements, symbolic
parameters and grid tables.
• Chapter 11, ‘A sample Quantum job’, provides an example of a Quantum specification and the
tables it produces.
• Appendix B, ‘Error messages’, contains a list of compilation error messages with suggestions
as to why you may see them and how to solve the problems which caused them to appear.
• Appendix C, ‘Options in the tabulation section’, provides a summary of the options available
in the tabulation section.
Volume 3
• Chapter 1, ‘Weighting’, describes the weighting methods that you can use in Quantum.
• Chapter 2, ‘Row and table manipulation’, describes how to create new rows and tables using
previously created tables or parts of previously created tables.
• Chapter 3, ‘Dealing with hierarchical data’, describes how to use analysis levels in Quantum.
• Chapter 4, ‘Descriptive statistics’, describes the axis-level and table-level statistical tests that
are available in Quantum and provides details of the chi-squared tests, non-parametric tests on
frequencies and Friedman’s two-way analysis of variance.
• Chapter 5, ‘Z, T and F tests’, describes the Z, T and F tests that are available in Quantum.
• Chapter 6, ‘Other tabulation facilities’, describes how to include C code and edit statements in
the tabulation section and how to sort tables.
• Chapter 7, ‘Special T Statistics’, describes the special T statistics that are available in
Quantum.
• Chapter 8, ‘Creating a table of contents’, describes how to create a formatted list of the tables
that are produced by a Quantum run.
• Chapter 9, ‘Laser printed tables with PostScript’, describes how to convert the standard
tabulation output into a file suitable for printing on a PostScript laser printer.
• Appendix A, ‘Options in the tabulation section’, provides a summary of the options available
in the tabulation section.
Volume 4
• Chapter 1, ‘Files used by Quantum’, describes files you may need to create in order to use
certain Quantum facilities, including the variables file, the levels file, the default options file,
the run definitions file, the merges file, the corrections file, the rim weighting parameters file,
and the C subroutine code file, aliases for Quantum statements, customized texts, and user-
definable limits.
• Chapter 2, ‘Files created by Quantum’, describes many of the files created during a run and
draws your attention to those of particular interest.
• Chapter 3, ‘Quantum Utilities’, describes how to tidy up after a Quantum run and how to check
column and code usage.
• Chapter 4, ‘Data conversion programs’, describes the q2cda and qv2cda programs that convert
tables into comma-delimited ASCII format, the qtspss and nqtspss programs that convert
Quantum data into SPSS format, and the qtsas and nqtsas programs that convert Quantum data
into SAS format.
• Chapter 5, ‘Preparing a study for Quanvert’, describes the tasks you need to perform before
converting a Quantum spec and data file into a Quanvert database.
• Chapter 6, ‘Files for Quanvert users’, describes files that are specific to either Quanvert Text
or Windows-based Quanvert.
• Chapter 7, ‘Creating and maintaining Quanvert databases’, describes how to create and
maintain Quanvert databases.
• Appendix B, ‘Error messages’, contains a list of compilation error messages with suggestions
as to why you may see them and how to solve the problems that cause them to appear.
• Appendix D, ‘Using the extended ASCII character set’, explains how you can use Quantum with
data that contains characters in the extended ASCII character set.
• Appendix E, ‘ASCII to punch code conversion table’, provides a table showing ASCII to punch
code conversions.
• Appendix F, ‘Will this job run on my machine’, offers suggestions on how you can check
whether a particularly large job will run on your computer.
When showing the syntax of a statement, as in the Quick Reference sections, all keywords are
printed in bold. Parameters, such as question texts or responses, whose values are user-defined are
shown in italics. Optional parameters are enclosed in square brackets, that is, [ ].
The ☞ symbol marks a reference for further reading related to the current topic.
Comments
SPSS MR welcomes any comments you may have about this guide, and any suggestions for ways in
which it could be improved.
1 Introduction
Quantum is a highly sophisticated and very flexible computer language designed to simplify the
process of obtaining useful information from a set of questionnaires.
Quantum has been designed with market researchers in mind so its syntax and grammar are similar
to English. Nevertheless, it is still a computer language and as such should be used with precision
and understanding.
The four volumes of the Quantum User’s Guide have three basic functions:
• To provide you with enough information about how Quantum works to enable you to carry out
a specific task.
• To help you work out what went wrong when errors occur or when your output is not what you
expected.
• Generate tables (in different languages, provided that the translated texts exist).
Any Quantum run may perform as many or as few of these tasks as you like, but for each run the
basic format is the same.
First, the data is read onto a disk. Data on disk can come from a number of different sources, for
example:
• It may be entered directly via a terminal by a telephone interviewer using Quancept CATI.
• It may be collected over the World Wide Web using software such as Quancept Web.
Next, the tasks to be performed are defined using the Quantum language.
Then, Quantum translates these tasks into instructions that the computer can understand.
Finally, the computer itself uses this program to run your job.
Quantum comprises two sections: an edit section and a tabulation section. The edit section
checks and validates the data, generates lists and reports, corrects data, produces new data files,
recodes data, and creates new variables. The tabulation section produces tables and performs
statistical calculations.
Quantum reads the records in the data file one at a time and passes them through the various parts
of the Quantum program. As long as there are records remaining in the data file, the loop of ‘read
a record −> edit −> tabulate’ is repeated; once the last record has been processed, the tables are
ready for printing.
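The ‘read a record −> edit −> tabulate’ loop can be pictured with a short Python sketch. This is only an illustration of the flow described above, not Quantum's implementation: run_job, strip_trailing_blanks and count_first_column are hypothetical names, and the two helper functions are stand-ins for a real edit section and tabulation section.

```python
def run_job(records, edit, tabulate):
    """Pass each record through the edit, then the tabulation, one at a time."""
    tables = {}
    for record in records:              # read a record
        edited = edit(record)           # edit section
        tabulate(edited, tables)        # tabulation section updates the tables
    return tables                       # tables are complete after the last record


# Hypothetical stand-ins for an edit section and a tabulation section.
def strip_trailing_blanks(record):
    return record.rstrip()


def count_first_column(record, tables):
    code = record[0]                    # code held in column 1 of the record
    tables[code] = tables.get(code, 0) + 1


tables = run_job(["1  ", "2", "1"], strip_trailing_blanks, count_first_column)
```

The point of the sketch is that the tables accumulate record by record and are only final once the data file is exhausted.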
If errors occur at any point in a Quantum run, an error message is printed telling you what is wrong.
☞ For details of the error messages that can occur, see appendix B, ‘Error messages’ in the
Quantum User’s Guide Volume 2.
2 Your Quantum program
Your Quantum program is the basic requirement for any Quantum run. It tells the computer what
tasks it has to perform. All Quantum programs are written in the Quantum language which both
you and the computer can understand. When writing in this language you must take care that you
say exactly what you mean; otherwise your output may not be quite what you expect. The computer
cannot guess at what you mean it to do; it only does what you tell it.
All Quantum programs are stored in separate files on the computer. Each file has a unique name
which may be made up of any characters on your keyboard, but you are advised to use only letters
and numbers in your filenames.
*include edit
a;dsp;spechar=–*;decp=0;flush
*include tabs
*include axes
where the file called edit contains editing instructions, the file called tabs contains statements
defining the tables required, and axes contains statements which define the individual rows and
columns each table is to have. The a; statement lists characteristics that all tables are to have,
although some of these characteristics can be overridden for individual tables or individual table
elements.
A Quantum program is made up of a series of statements defining the actions to be taken. If you
are typing Quantum programs on your screen you will notice that statements of more than 80
characters wrap around onto the next line and appear to be on two lines in the file. As long as these
statements have 200 characters or fewer, Quantum can read them, but you may prefer to make the
lines shorter for ease of reading on your screen.
In the following sections, we will explain briefly the types of statements you can use.
Edit statements
Quantum edit statements contain a Quantum keyword and other texts and numbers. Statements in
the edit section can generally start in any column, although comments and continuation characters
must start in column 1. A line may contain one or more statements, as long as each statement is
separated by a semicolon.
Edit statements may be preceded by a label number of up to five digits allowing them to be
referenced by other parts of the program, for example:
Here we are adding the number in column 56 to those in columns 57 and 58 and saving the result
in a variable called ‘total’. If this value is greater than eight we go to statement 100, otherwise we
continue with the statement immediately after the if line.
Quantum offers you the ability to check and verify your data prior to tabulation. Suppose your
questionnaire contains a series of questions to be answered only by people buying a specific brand
of tea. You may want to check that everyone who didn’t buy tea has a blank in all columns related
to tea. On the other hand, if they did buy a specific brand of tea, you could check whether the codes
in the following columns were within a specific range.
The statement that you would use for this type of test is require. To perform the test given as an
example, we might write:
This says that if column 24 contains a ‘1’, then columns 25 to 30 must not be blank, otherwise, if
column 24 does not contain a ‘1’, then columns 25 to 30 must all be blank.
More generalized checking facilities exist which enable you to produce frequency distributions of
numeric data (e.g., how many respondents have the number 201 in columns 13 to 15) or holecounts
(marginals) which show the broad pattern of coding across all columns in the data. Words
associated with these are list and count.
When errors are found in the data, you have several courses of action open to you. You may:
if (c224’5’) write
to write out all records in which column 24 of card 2 contains a 5. The records are written to the
default print file, out2.
Incidentally, many of the statements mentioned in this section may be used for other purposes,
rather than just to deal with errors.
Quantum offers you many aids to efficient programming. Repetitive checks may be specified once
with instructions to Quantum to repeat them a given number of times or until a certain condition is
satisfied. The word associated with loops of this kind is do.
There are two sorts of routing: you may either go to another edit statement (go to) or you may send
the record straight on to the tabulation section (return).
Tabulation statements
Tabulation statements tell Quantum which tables are required and how to create them. They consist
of a start letter or keyword to identify the type, and may be followed by other keywords, numbers
or text. They are used to define rows and columns (elements), the variables that are to be cross-
tabulated (axes) and finally, the tables themselves.
There are also statements for weighting your data and for creating tables by manipulating the
contents of tables created previously in the current run or even in other runs.
Writing in the Quantum language is very easy but as with all computer languages it needs to be
done with care and precision to obtain the required results.
The characters and symbols that you may use in Quantum are:
Where symbols have two meanings, the meaning required will become clear in the context in which
the symbol is employed.
Quantum is a ‘free-format’ language which means that within reason you may enter your program
however you like. Statements occupy columns 1 to 200 of successive lines and may be written in
uppercase or lowercase or a combination. Thus:
The exception to this is text in tables, where the text is printed on the tables in the same case as you
write it in your Quantum program. Additionally, you must set up table text so that it fits on the
paper when you print your tables. Therefore, if you want the table title to be printed on two lines,
you must write it on two lines in your program.
Generally, spaces are allowed anywhere in a Quantum program except within Quantum keywords.
As we mentioned earlier, Quantum has separate edit and tabulation sections which may or may not
be in the same file. If your program contains an edit, it must precede the tabulation statements and
must be enclosed by the words ed and end, each on a separate line, thus:
ed
.
edit statements
.
end
Errors will occur if either of these words is missing. If there is no edit, these statements are not
needed.
3.3 Comments
Comment statements insert comments or information into the Quantum program. They do not
affect the way your program works because they are ignored when the program is run to produce
tables.
Comments are identified either by a capital C in column 1 or by a slash and an asterisk (/*) in
columns 1 and 2. If a comment needs more than one line, each line must start with the
appropriate notation; otherwise, Quantum assumes that the line requires some sort of action.
It is a good idea to put comment statements in your program in case someone else has to take over
your job or alternatively to remind yourself what you are doing and why. For example:
3.4 Continuation
Any Quantum statement may be continued over several lines by starting the second and subsequent
lines with + or ++, depending on where the statement is split.
A single plus sign is used when the statement is split between keywords. This assumes that a
semicolon appears at the end of each continued line, whether or not there is actually one there. Take
the statement:
if (c132’12’.and.t5.gt.50) write $t5 incorrect$; else; write ofil
This could be split in three places with a single plus sign for a continuation:
if (c132’12’.and.t5.gt.50)
+write $t5 incorrect$
+else
+write ofil
We have omitted the semicolons at the end of each line, but it would not be wrong to leave them in.
The double-plus sign introduces an internal continuation of a long statement over several lines.
Statements may be split between lexics; that is, between keywords, conditions, lists of numbers,
and so on, but not in the middle of any of these. In our previous example, we could write:
if (c132’12’.and.
++t5.gt.50) write $t5 incorrect$; else; write ofil
A double plus is needed here because we have split an expression in which one parameter is
dependent on the other. The statement on the first line means nothing on its own, neither does the
second line, hence the ++. We could equally well have split the expression before the .and. or
before or after the .gt. operator. Splitting it between t and 5, or in any similar place, is incorrect
because the two characters by themselves do not mean anything.
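The two continuation rules can be modelled with a few lines of Python. This is a sketch that follows the description above literally, not Quantum's own parser: join_continuations is a hypothetical name, straight quotes stand in for the printed quote marks, and the implied semicolon is added at the end of every line that is followed by a single-plus line.

```python
def join_continuations(lines):
    """Rebuild logical statements from physical lines that use + and ++."""
    statements = []
    for line in lines:
        if line.startswith("++") and statements:
            # Internal continuation: glue the text straight onto the statement.
            statements[-1] += line[2:]
        elif line.startswith("+") and statements:
            # Split between keywords: a semicolon is assumed at the end of the
            # continued line whether or not one was typed.
            previous = statements[-1].rstrip()
            if not previous.endswith(";"):
                previous += ";"
            statements[-1] = previous + " " + line[1:]
        else:
            statements.append(line)
    return statements


single_plus = join_continuations([
    "if (c132'12'.and.t5.gt.50)",
    "+write $t5 incorrect$",
    "+else",
    "+write ofil",
])
double_plus = join_continuations([
    "if (c132'12'.and.",
    "++t5.gt.50) write $t5 incorrect$; else; write ofil",
])
```

With ++ the two halves are joined with nothing in between, which is why the split must fall between complete items and never inside one such as t5.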
Quick Reference
To have possible syntax errors (that is, ones which Quantum can process even though they are not
quite perfect) treated as fatal, type:
check_
To have possible syntax errors flagged but otherwise ignored (the default), type:
nocheck_
When the Quantum compiler is checking your program and finds an error it flags the incorrect
statement with an explanatory error message and continues with the next statement. If any of these
errors are fatal — that is, Quantum cannot convert your statement into C code — the run will be
terminated.
Sometimes Quantum finds statements which are not quite correct, but which it can still convert into
C. In these cases the compiler flags the statement with the message ‘Possible syntax error’ and
continues as if nothing were wrong. You can choose to have this type of error treated as fatal and
have the run terminated at the end of the compilation by entering the statement check_ (note the
underscore at the end) at the start of your edit.
The statement nocheck_ causes possible syntax errors to be flagged but ignored, and this is the
default.
Quick Reference
To have more or fewer than the default of 20 error messages displayed on your screen, type:
errprint n
before the edit and tabulation sections, where n is the number of messages you wish to see.
When the Quantum compiler finds errors in your program, it copies them to the compilation listing
file. It also displays the first twenty messages on your screen. You may increase or decrease this
number by placing the statement:
errprint n
at the top of your main program file, before the edit and tabulation sections.
n is the number of messages you want to see on your screen: it must be an integer. Thus:
errprint 5
prints the first five error messages on the screen and in the listing file, and then any others only in
the file.
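The effect of errprint can be sketched in Python as follows. This only models the behavior described above; route_messages and its errprint parameter are hypothetical names, and the real compiler's output format is not reproduced.

```python
def route_messages(messages, errprint=20):
    """Return (screen, listing) for a list of compilation error messages."""
    if not isinstance(errprint, int) or errprint < 0:
        raise ValueError("errprint n must be a non-negative integer")
    screen = messages[:errprint]    # only the first n messages reach the screen
    listing = list(messages)        # every message goes to the listing file
    return screen, listing


all_messages = [f"error {i}" for i in range(1, 8)]
screen, listing = route_messages(all_messages, errprint=5)
```

Here seven errors are found: five appear on the screen and all seven appear in the listing.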
• Data constants.
• Integer numbers.
• Real numbers.
4.1 Data constants
Individual constants
Quick Reference
To refer to one or more codes in a single column, type:
’codes’
An individual constant is one or more of the codes 1234567890–& or blank. The – is sometimes
referred to as the 11 or X punch, and & is sometimes called the 12, V or Y punch. Each code
represents one answer to a question. For example, let’s take the question ‘What is your favorite
color?’ which has the response list:
Red 1
Yellow 2
Blue 3
Green 4
Black 5
White 6
coded into one column. If my favorite color is green, this will appear in the data file as a 4 in the
appropriate column, just as if your favorite color is red, there will be a 1 in that column.
To refer to these answers inside your Quantum program (maybe we only want our table to include
those respondents whose favorite color is blue), type in the code enclosed in single quotes:
’3’
You will also have to tell Quantum which column to look in.
☞ To find out how to refer to columns, see ‘Data variables’ later in this chapter.
Several codes may be combined in the same column and are called multicodes. Throughout this
manual when we talk of multicodes or multicoding we mean two or more codes in the same
column. Suppose the next question asks me to choose three colors from the same list; I pick yellow,
black and white. If these answers were all coded in the same column (a multicoded column), we
would refer to them by typing:
’256’
or any other variation of those three codes. Quantum does not care what order you enter the codes
in.
If you have a series of consecutive codes in the order &–01234567890–&, you may either type each
code separately or you may enter the first and last codes separated by a slash (/) meaning ‘through’,
as shown below:
As you can see, the last two examples mean exactly the same thing. However, the notations ’0/&’
and ’0–&’ are not the same: ’0/&’ means ’01234567890–&’ whereas ’0–&’ is ’0’, ’–’ and ’&’
only.
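The ‘through’ notation and the rule that code order does not matter can be illustrated with a small Python sketch. It is a model of the rules stated above rather than Quantum code: expand_codes and same_codes are hypothetical names, an ASCII hyphen stands in for the – code, and only a single first/last pair is handled.

```python
CODE_ORDER = "&-01234567890-&"   # the order given above, with "-" for the 11 punch


def expand_codes(spec):
    """Expand 'first/last' into every code from first through last."""
    if len(spec) == 3 and spec[1] == "/":
        start = CODE_ORDER.index(spec[0])
        end = CODE_ORDER.index(spec[2], start)   # first match at or after start
        return CODE_ORDER[start:end + 1]
    return spec                                   # no slash: codes as typed


def same_codes(first, second):
    """Two multicodes are the same if they hold the same codes in any order."""
    return set(first) == set(second)
```

For instance, expand_codes("0/&") gives the thirteen-character string quoted above, whereas "0-&" is left as the three codes 0, – and &.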
Some combinations of codes represent ASCII characters; that is, they represent characters which
you can type on your screen:
The only time you would use letters rather than codes (that is, ‘A’ rather than ‘&1’) is when the
questionnaire tells you that a column should contain a letter.
☞ For further information, see appendix E, ‘ASCII to punch code conversion table’ in the
Quantum User’s Guide Volume 4.
Sometimes we may need to write a notation for ‘no codes’ — for instance, if my favorite color does
not appear in the list of choices. To do this, we write ’ ’ (that is, a blank enclosed in single quotes).
✎ The notation ’ ’ is a special case since blank is not really a code. If you type a blank inside
single quotes with any other characters Quantum will follow its usual rule of ignoring spaces.
This means that references of the form ’ 12 ’ are read as ’12’.
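The special treatment of blank can be sketched in Python. This is an illustrative model only; normalise_constant is a hypothetical name and the quotes are assumed to have been removed already.

```python
def normalise_constant(text):
    """Apply the blank rules to the text found between single quotes."""
    if text.strip() == "":
        return " "                  # blank on its own means 'no codes'
    return text.replace(" ", "")    # otherwise spaces are ignored
```

So a constant typed with stray spaces around 12 is read as the two codes 1 and 2, while a lone blank keeps its special meaning.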
Strings of data constants
Quick Reference
To refer to a string of codes in a field of columns, type:
$codes$
When data constants are single-coded or the multicodes correspond to ASCII characters (for
example, ‘A’, ‘B’) they may be strung together. Strings of data constants are sometimes called
literals or column fields. Strings are enclosed in dollar signs, with the component single codes
losing their single quotes. For example:
The first string is five columns long with 1 in the first column, 2 in the second, 3 in the third, and
so on. The third string is six columns wide with the fourth column being blank.
• When the answers to a question are represented by codes of more than 1 digit. For example, in
a car ownership survey the car make and model owned may be represented by a 3-digit code.
To pick up respondents owning a particular type of car you would need to check whether the
relevant columns contained the code for that car. For instance, to look for owners of Ford
Escorts you might ask Quantum to search for the string $132$ in a particular field of columns.
4.2 Numbers
Quantum can print figures in tables with up to ten characters; figures that require more than ten
characters are printed as asterisks. For example, 12345678.12 appears as 12345678.1 when
displayed with one decimal place, but as asterisks (*) when displayed with two decimal places.
However, you can use the scale= option to apply a scaling factor before printing.
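The ten-character limit can be checked with a short Python sketch. It is a model of the rule above, not Quantum's print routine: format_cell is a hypothetical name, and filling the whole cell with ten asterisks is an assumption, since the text says only that asterisks are printed.

```python
def format_cell(value, decimals, width=10):
    """Format a figure with the given decimal places, or asterisks if too wide."""
    text = f"{value:.{decimals}f}"
    if len(text) > width:
        return "*" * width          # assumed: asterisks fill the whole cell
    return text


one_place = format_cell(12345678.12, 1)
two_places = format_cell(12345678.12, 2)
```

With one decimal place the figure needs exactly ten characters and is printed; with two it needs eleven and is replaced by asterisks.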
Whole numbers
Quantum can deal with whole numbers in the range −1,073,741,824 to +1,073,741,823 with an
accuracy of up to six significant figures. Numbers with more than six significant figures are
rounded up or down depending on the value of the remaining figures.
☞ For some examples of how Quantum rounds figures up and down, see ‘Real numbers’ later in
this chapter.
Your data will contain whole numbers whenever there are questions requiring numeric responses:
for example, the question ‘How many children do you have?’ can only be answered with a whole
number. If the respondent has three children, the number 3 will appear in the appropriate column
in his or her data record, whereas a respondent with five children will have a 5 in that column
instead.
Whole numbers are also used if you want to perform arithmetic calculations during the run, for
instance to multiply a field by a number.
Real numbers
Real numbers are numbers containing decimal points. To be valid, they must have at least one digit
on either side of the decimal point:
Quantum deals with real numbers of any size with accuracy up to six significant figures. Numbers
with more than six significant figures have the sixth figure rounded up or down depending on the
value of the remaining figures.
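Rounding to six significant figures can be illustrated in Python. This is a sketch of the idea only: round_sig is a hypothetical name, and Python's own rounding of exact halfway cases may differ from Quantum's, so the examples avoid them.

```python
def round_sig(value, figures=6):
    """Round a number to the given count of significant figures."""
    return float(f"{value:.{figures}g}")


large = round_sig(1234567.0)      # seventh figure 7 rounds the sixth figure up
small = round_sig(0.1234564)      # seventh figure 4 leaves the sixth unchanged
```

The same rule applies whatever the size of the number: only the first six significant figures are kept.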
By default, Quantum calculates cell values in single precision. However, when working with very
large numbers, you can produce more accurate results by using the double precision option (dp) on
the a statement.
☞ For further details on double precision, see chapter 2, ‘The hierarchy of the tabulation section’
in the Quantum User’s Guide Volume 2.
4.3 Variables and arrays
There are three types of variable — data, integer and real — each used for storing different types
of information. You may create your own variables with names representing the type of
information stored (for example, the variable called meals might contain a count of the number of
meals eaten during the day) or you may use the ones offered automatically by Quantum.
Sometimes it is useful for a group of variables to have the same name. Each variable can then be
addressed by its position in the group. This arrangement is known as an array. Arrays are discussed
further in the following sections.
Data variables
Quick Reference
To refer to a single data variable in the C array, type:
cnumber
To refer to a field of data variables in the C array, type:
c(start_pos,end_pos)
before the edit section. To refer to it, use the same notation as above but replace the c with the
variable’s name.
At the start of every job, Quantum provides you with an array of 1,000 data cells called C. This
array is sometimes referred to as the C matrix. The individual cells are called C-variables. Each
C-variable stores one ‘column’ of data. Quantum reads data from your data file into this array: we
will discuss exactly how it does this in chapter 6, ‘How Quantum reads data’. For the time being,
let’s say we have a very small questionnaire which uses 43 columns to store the data. Quantum will
read the data for each respondent into cells 1 to 43 of the C array, one respondent at a time. The
codes from column 1 of the data are copied into cell 1 of the C array, the codes from column 2 of
the data are copied into cell 2, and so on. When Quantum has finished with that respondent’s data
it clears out the cells in the C array and reads the data for the next respondent, placing it in cells 1
to 43 of the array.
We can access this data by defining the columns whose contents we wish to inspect or change. Let’s
take the questions about color that we mentioned earlier. The printed questionnaire tells us that the
respondent’s favorite color will be coded into column 15. To look at this column we would write:
c15 or c(15)
The C may be in uppercase or lowercase, and the parentheses around the column number are
optional. In the same way, the last column of our small questionnaire is written:
c43 or c(43)
Now suppose we want to look at a field of columns such as the questionnaire serial number in
columns 1 to 5. All we have to do is tell Quantum that the serial number is in a field starting in
column 1 and ending in column 5, as follows:
c(1,5)
C variables are reset to blank before a new respondent’s data is read. Thus, you can be certain that
Quantum never muddles the contents of column 10 for the first respondent with those of c10 for
the second respondent.
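The way the C array is filled and addressed can be modeled in a few lines of Python. This is only an illustrative sketch, not Quantum code: the class name CArray and its methods are invented for the example, and each column is assumed to hold a single character, so multicoded columns are not represented.

```python
class CArray:
    """Toy model of the C array (hypothetical helper, not part of Quantum)."""

    def __init__(self, size=1000):
        self.size = size
        self.cells = [" "] * size          # every cell starts blank

    def load(self, record):
        """Blank the array, then copy one respondent's record in from column 1."""
        self.cells = [" "] * self.size
        for position, code in enumerate(record[: self.size]):
            self.cells[position] = code

    def c(self, start, end=None):
        """c(15) returns column 15; c(1, 5) returns the field in columns 1 to 5."""
        if end is None:
            end = start
        return "".join(self.cells[start - 1 : end])


array = CArray()
array.load("00042" + " " * 9 + "3")   # serial number in 1-5, color code in column 15
print(array.c(1, 5))                  # prints 00042
print(array.c(15))                    # prints 3
array.load("00043")                   # next respondent: column 15 is blank again
print(repr(array.c(15)))              # prints ' '
```

Because load blanks every cell before copying, nothing from the first record survives into the second, which is the behavior described above.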
As we mentioned above, you may create your own data variables to store specific pieces of data.
For instance, in a shopping survey we may want to store data about visits to Sainsburys in an array
called ‘sains’ and data about visits to Safeways in an array called ‘safe’.
Before we can use these arrays, we must create them. If each array is to contain 100 cells or
columns of data, we would write:
data sains 100s
data safe 100s
before the edit section. The s at the end of each statement causes Quantum to recognize that,
for example, safe1 is the same as safe(1), just as it knows that c15 and c(15) refer to the same
column of data. If you created the arrays without the s, then Quantum would not recognize safe1
as being the same as safe(1).
Data variables which you create remain blank until you copy data into them. If the data about visits
to Sainsburys is stored in columns 30 to 45, then we might copy this into cells 30 to 45 of the array
called sains. If we then want to use this data we can write statements which refer to sains30 to
sains45. Unless you subsequently change the data in sains(30,45), each time you refer to one of
those cells it is exactly the same as referring to c30, c45, and so on, in the C array, and to columns
30, 45, and so on, in the data file.
In this simple example, there is not much to be gained (apart from an immediate improvement in
readability) by using your own data variables. However, when you have many columns of data per
respondent, or a complicated Quantum program, named data variables can be very useful for
improving readability and also for providing simple yet powerful facilities for data manipulation.
☞ To find out more about creating and using named data variables, see chapter 14, ‘Creating new
variables’.
Integer variables
Quick Reference
To define an integer variable, type:
name[cell_number]
Integer variables store whole numbers. Strings of integer variables are called integer arrays, and
each cell in the array may store any whole number from −1,073,741,824 to +1,073,741,823.
At the start of each run, Quantum provides an array of 200 integer variables called T. The first cell
in this array is the integer variable t1 which may store any value within the given range; the second
cell in the array is the integer variable called t2 which may also store any value within the given
range.
To illustrate the difference between a data variable and an integer variable, let’s suppose that our
data contains the value of the respondent’s car to the nearest whole pound. If the value is £6,000,
this will take up 4 columns in the data (assuming that we are only concerned with the digits); that
is, four data variables, the first of which will contain the 6, and the other three of which will all
contain zeroes.
If we placed this same value in an integer variable, we would only need one variable to store the
whole value, because each integer variable can store any whole number in the range given above.
We have already mentioned that Quantum provides an integer array of 200 integer variables. You
may create your own arrays using statements similar to those shown above for data variables.
Suppose you have a household survey in which you have collected the value of each car that the
family owns. You want to set up an integer array in which to store each value, so you write:
int carval 10s
This creates an array called carval which contains ten separate integer variables called carval1 to
carval10. Notice that we have followed the array size with the letter s so that we can omit the
parentheses from the individual variable names. We can then copy the value of the first car into
carval1, the value of the second car into carval2, and so on. If a particular household owns three
cars valued at £6,000, £2,500 and £500, then carval1 would have a value of 6,000, carval2 would
be 2,500 and carval3 would be 500.
If you create your own integer variables, it is recommended that you name them with names that
reflect their purpose in the run, as we have done in our example.
☞ To find out more about creating and using named integer variables, see chapter 14, ‘Creating
new variables’.
All integer variables have a value of zero at the start of a run, and they are not reset between
respondents. If you want your integer variables to store information about the current record only,
you must include statements in the edit to reset those variables to zero when a new record is read.
For example, we might write:
carval1 = 0
at the start of the edit to reset the first integer variable of the carval array to zero.
✎ You can also reset an integer variable to zero by using a clear statement.
☞ For further information about the clear statement, see section 8.7, ‘Clearing variables’.
T-variables with non-zero values are printed out at the end of the run.
Real variables
Quick Reference
To define a real variable, type:
name[cell_number]
You may define real variables and arrays to store real numbers with accuracy up to six significant
figures. Values with more than six significant figures have the sixth figure rounded up or down
according to the value of the extra figures.
☞ For further information about real values, see ‘Real numbers’ earlier in this chapter.
As with integer variables, the names of real variables should give some clue to the type of
information they contain. Real arrays are created by statements of the form:
real liters 5s
This example creates a real array called liters which has five real variables named liters1 to liters5.
It can store five real values, the first in liters1 and the fifth in liters5.
☞ To find out more about creating and using named real variables, see chapter 14, ‘Creating new
variables’.
Quantum also provides a set of 100 real variables named X which you may use.
✎ All real variables start with a value of 0.0 and are not reset to zero between respondents.
As an example, let’s say that the data contains information on how long, on average, each person
in the household spent watching television during a given week. We want to manipulate these
figures so we create an array of real variables in which to store the average viewing figures:
real tvwatch 8s
This provides room for up to eight people’s figures. If our household contains four people with
viewing averages of 20.8 hours, 15.75 hours, 9.75 hours and 10.0 hours, then tvwatch1 will have a
value of 20.8, tvwatch2 will have a value of 15.75, tvwatch3 will be 9.75 and tvwatch4 will be 10.0
hours. The rest of the variables in the array have values of 0.0.
Real variables with non-zero values at the end of the run are not printed out automatically. If you
want to see these values, you will need to write them using a report statement.
☞ For further information about report, see section 7.3, ‘Writing to a report file’.
Reading real numbers from columns
Quick Reference
To read real values from the C array, type:
cx(start_col,end_col)
As we have already said, data from the questionnaire is read into columns for use during the run.
When the data contains real numbers you will have to tell Quantum that the dot is to be treated as
a decimal point rather than as a multicode representing a number of different answers. The way to
do this is to refer to the field as cx:
cx(15,20) cx(131,135)
Here we have two fields containing real numbers: the first is six columns wide including the
decimal point, which means that the number itself contains five digits, whereas the second is only
five columns wide with four digits. Notice that there is no need to tell Quantum where the decimal
point is.
4.4 Subscription
As we have shown above, you may refer to specific variables in integer and real arrays and cells or
columns in data arrays by naming their position in the array, for example c15, carval2 or liters5.
Variables within an array may also be referred to using any arithmetic expression. In this case,
parentheses must be used. For example:
c(t1) The column number depends on the value of t1. If t1 has a value of 10, then the
variable is c10; if t1 is 67, the variable is c67.
c(t4,t5) The field delimiters depend on the values of t4 and t5. If t4 has a value of 12 and
t5 has a value of 19, the column field referred to is c(12,19).
t(c4) The variable number depends on the value in c4. If c4 contains a single code in
the range 1 to 9, the integer variable will be one of t1 to t9 depending on the exact
value in c4. If c4 is multicoded, then the result is nonsense.
time(c4*23) The variable number is the result of multiplying the value in c4 by 23. As in the
previous example, c4 must be single-coded in the range 1 to 9 for this example
to make sense. Thus, if c4 contains just a 4, the value of the expression is 92 so
the variable referred to is time92.
When variables are referenced in this way, the value of the expression must be positive. The
expression c(t1−5) is acceptable as long as t1 is greater than 5. If the expression has a zero or
negative value, Quantum will issue an array dimension error when it comes to read the data during
the datapass. Also, if the variable refers to columns, the value of the subscript must not exceed 32,767.
These are called subscripted variables and they greatly increase the flexibility with which you can
write your edit.
✎ Subscription may be used in repetitive processes to save you writing the same thing over and
over again.
☞ For an example, see section 9.5, ‘Loops’.
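The rule that a subscript must be positive can be pictured with a small Python model. This is a sketch under stated assumptions rather than Quantum's own mechanism: the function name cell is invented, the array is an ordinary list addressed from 1, and the error text merely echoes the wording used above.

```python
def cell(array, subscript):
    """Return cell number `subscript` of a 1-based array, as t(c4) or c(t1) would."""
    if subscript < 1 or subscript > len(array):
        raise IndexError(f"array dimension error: subscript {subscript}")
    return array[subscript - 1]


t = list(range(1, 201))        # stand-in for t1..t200, each holding its own number
c4 = 4                         # a single-coded column holding the code 4
print(cell(t, c4))             # prints 4, that is, the contents of t4

t1 = 5
try:
    cell(t, t1 - 5)            # the subscript works out as 0, so it is rejected
except IndexError as error:
    print(error)               # prints array dimension error: subscript 0
```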
Quantum recognizes two types of expression — arithmetic and logical. Arithmetic expressions are
used to produce numeric values and logical expressions, when evaluated, produce a value of true
or false.
The simplest form of arithmetic expression is a single positive or negative number such as 10 or
-26.5 or an integer or real variable.
Although the C array is data, columns may also be used in arithmetic when the response coded into
those columns is a numeric response, such as a respondent’s age or the number of different shops
he or she visited. For example, if columns 243 to 247 contain the codes 4,7,2,6 and 0 respectively
the value in c(243,247) could be read as 47,260. Similarly, if columns 45 to 48 contain 7, 8, a dot
and 2 respectively, the value in cx(45,48) would be 78.2.
Blank columns in a field are ignored when the codes in those columns are evaluated. Thus, if
columns 20 to 21 contain the codes 6 and 7 respectively, and column 22 is blank, the codes in
c(20,22) will be evaluated as 67. A similar result is produced if the blank column appears anywhere
else in the field. The strings $67 $, $6 7$ and $ 67$ in c(20,22) all produce an arithmetic value of 67.
The same applies to multicoded columns. If you use a multicoded column as part of an arithmetic
expression, the multicoded column will be ignored. The exception to this is a multicode of a digit
and a minus sign which creates a negative number: a minus sign anywhere in a numeric field
negates the value in the field as a whole, not just the number it is multicoded with. For example:
----+----1----+----2
5 3778 is 5378
9
0
2---+----3----+----4
12-4 is -1234
3
4---+----5----+----6
83- is -83
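The rules for turning a field into a number can be summarized in a short Python sketch. It is an illustration only: the function name field_value is invented, each column is modeled as a string of the codes it contains, an empty field is assumed to evaluate to 0, and a column holding a minus sign together with more than one digit is assumed to be ignored like any other multicode while still negating the field.

```python
def field_value(columns):
    """Arithmetic value of a field given as one string of codes per column."""
    digits = []
    negative = False
    for codes in columns:
        codes = codes.replace(" ", "")
        if "-" in codes:
            negative = True                 # a minus anywhere negates the field
            codes = codes.replace("-", "")
        if len(codes) == 1 and codes.isdigit():
            digits.append(codes)            # a single digit contributes to the value
        # blank columns and other multicodes contribute nothing
    value = int("".join(digits)) if digits else 0
    return -value if negative else value


print(field_value(["6", "7", " "]))             # prints 67
print(field_value(["6", " ", "7"]))             # prints 67
print(field_value(["5", "3", "790", "7", "8"])) # prints 5378
print(field_value(["1", "2", "3-", "4"]))       # prints -1234
print(field_value(["8", "3", "-"]))             # prints -83
```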
Expressions – Chapter 5 / 25
Quantum User’s Guide Volume 1
Quick Reference
To combine arithmetic expressions, type:
variable operator variable
where variable is a numeric value or the name of a variable containing a numeric value, and
operator is one of the arithmetic operators +, −, * (multiply) or / (divide).
More often than not, you will want to combine numeric expressions to form a larger expression,
for instance to count the number of records read with a given code in a named column.
Arithmetic expressions are linked with any of the arithmetic operators listed below:
+ (addition) * (multiplication)
− (subtraction) / (division)
Expressions may contain more than one of these operators, for instance:
t5 + c(134,136) / otot
c(150,152) * 10 + 2.5
Quantum evaluates the parts of an expression in the following order:
1. Expressions in parentheses
2. Multiplication and division
3. Addition and subtraction
If you wish to change this order you should enclose the expressions which go together in
parentheses. The first expression in the example above will be evaluated by dividing the value in
columns 134 to 136 by otot and adding the result to t5. If you change the expression to:
(t5 + c(134,136)) / otot
this adds the values of t5 and c(134,136) first and then divides that by otot. Let’s substitute numbers
and compare the results. If t5=10, otot=5 and the value in c(134,136) is 125, the two versions of the
expression would read as follows:
10 + 125 / 5 = 35
(10 + 125) / 5 = 27
Where two integer expressions are combined, the result is integer (any decimal places are ignored),
but if an expression contains a real then the result will be real. Therefore, if t1=5 and t2=3, then:
t1 + 4 = 9
t1 + 4.0 = 9.0
t1 * t2 = 15
t1 / t2 = 1
t1 * 1.0 = 5.0
t1 * 1.0 / t2 = 1.66667
If you use parentheses in expressions which contain both integer and real variables, you need to
take extra care to ensure that your expression is producing the correct results. Let’s look at an
example to illustrate how an expression can look correct but can still produce unexpected results.
An expression such as:
100.0 * 2 / 70
yields a result of 2.85714 (that is, 200.0/70). The final value will be 2.85714 if the result is saved
in a real variable, or 2 if it is saved in an integer variable.
If we use parentheses:
100.0 * (2 / 70)
the result is 0.0 (or 0 if saved in an integer variable). The reason for this is as follows. Because
Quantum evaluates expressions in parentheses before it deals with the rest of the expression, it
treats that expression as integer arithmetic. The rules for integer arithmetic dictate that real results
are truncated at the decimal point, so the true result of 0.0285714 becomes 0. Any multiplication
involving zero is always zero, so the final result is zero.
If you find that a run gives unexpected zero results, try looking for expressions of this type and
checking whether the parenthesized part of the expression has been truncated because the integer
division results in a decimal number.
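The effect of mixing integer and real arithmetic can be reproduced with a small Python model. This is a sketch, not Quantum's implementation: the function name combine is invented, Python's int and float stand in for integer and real values, and the operands 100.0, 2 and 70 are chosen only because they give the results 2.85714 and 0.0 quoted above.

```python
def combine(operator, left, right):
    """Combine two values; two integers give an integer, anything else a real."""
    if operator == "+":
        result = left + right
    elif operator == "-":
        result = left - right
    elif operator == "*":
        result = left * right
    elif operator == "/":
        result = left / right
    else:
        raise ValueError(f"unknown operator {operator!r}")
    if isinstance(left, int) and isinstance(right, int):
        return int(result)      # integer arithmetic: decimal places are dropped
    return float(result)


t1, t2 = 5, 3
print(combine("/", t1, t2))                                # prints 1
print(round(combine("/", combine("*", t1, 1.0), t2), 5))   # prints 1.66667

unbracketed = combine("/", combine("*", 100.0, 2), 70)     # 200.0 / 70
bracketed = combine("*", 100.0, combine("/", 2, 70))       # 100.0 * 0
print(round(unbracketed, 5), bracketed)                    # prints 2.85714 0.0
```

The parenthesized version fails because the inner division is carried out on two integers before the real value is involved.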
Quick Reference
To count the number of codes in a column or list of columns, type:
numb(cn1,cn2, ... ,cnn)
If any columns are followed by a code reference, only those codes will be counted for those
columns.
The function numb is an arithmetic expression which counts the number of codes in a column or
list of columns. Its format is:
numb(cn1,cn2, ... ,cnn)
where cn1 to cnn are the columns whose codes are to be counted. So, if we wanted to count the
number of codes in columns 132 to 135 we would type:
numb(c132,c133,c134,c135)
Notice that even though the columns are consecutive, each one is entered separately, with each
column number preceded by a ‘c’. It is incorrect to define only the start and end columns of a field
when using numb. Therefore it is wrong to write numb(c(132,135)) and, if you write a statement
such as this, Quantum will flag it as an error.
Sometimes you will only be interested in certain codes, for instance you may want to know how
many 1, 2 or 3 codes there are in a group of columns. In this case the function is entered as:
numb(cn1’p1’,cn2’p2’, ... ,cnn’pn’)
where p1 to pn are the codes to be counted. Only the named codes are counted; any others
appearing in the columns are ignored. Let’s say our data on card 1 is as follows:
column 115: codes 1, 6, 8 and 9
column 121: codes 2 to 6
column 157: codes 1 to 7
and we want to count the number of codes in column 115 and also the number of codes in the range
‘5/8’ in columns 121 and 157. The expression would be entered as:
numb(c115,c121’5/8’,c157’5/8’)
When Quantum checks these columns and codes, it will tell us that there are 9 codes in these
columns which are within the given ranges. These codes are all four codes in column 115 (we did
not specify which codes to count in that column), codes 5 and 6 in column 121 (codes 2 to 4 are
outside the given range), and codes 5 to 7 in column 157 (codes 1 to 4 are outside the given range).
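The count of 9 can be checked with a small Python model of numb. It is a sketch only: the names CODE_ORDER, expand_codes and the tuple convention for a column with a code reference are invented for the example, each column is modeled as a string of its codes, and the ordering of the twelve codes (1 to 9, then 0, minus and ampersand) is an assumption.

```python
CODE_ORDER = "1234567890-&"    # assumed order of the twelve codes in a column


def expand_codes(spec):
    """Expand a code list such as '5/8' or '1/46' into a set of single codes."""
    codes = set()
    i = 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "/":
            low = CODE_ORDER.index(spec[i])
            high = CODE_ORDER.index(spec[i + 2])
            codes.update(CODE_ORDER[low : high + 1])
            i += 3
        else:
            codes.add(spec[i])
            i += 1
    return codes


def numb(*items):
    """Count codes; an item is a column, or a (column, codes) pair to filter it."""
    total = 0
    for item in items:
        column, wanted = item if isinstance(item, tuple) else (item, None)
        present = set(column) - {" "}
        if wanted is not None:
            present &= expand_codes(wanted)
        total += len(present)
    return total


c115, c121, c157 = "1689", "23456", "1234567"
print(numb(c115, (c121, "5/8"), (c157, "5/8")))   # prints 9
```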
Quick Reference
To generate a random number in the range 1 to n, type:
random(n)
Quantum can generate random numbers automatically with the random function:
random(n)
where n is the maximum value the random number may take. So, to generate a random number in
the range 1 to 100, the expression would read:
random(100)
The number produced may be saved for later use in an integer variable or column, thus:
rnum=random(32)
c(110,112)=random(156)
When using random with columns, always make sure that the number of columns allocated to the
number is sufficient to store the highest possible number that can be generated. In our example, we
need three columns in order to store numbers up to 156.
✎ random generates a different random value each time it is run, even on reruns of the same job.
If you want to retain the same set of random values between runs, copy them into the data the
first time you run the job.
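As a rough illustration of the column-width rule, the following Python sketch draws a number in the range 1 to n and works out how many columns are needed to hold the largest possible value. The function names are invented, and Python's random.randint merely stands in for Quantum's random function.

```python
import random


def quantum_random(n):
    """Stand-in for random(n): a whole number from 1 to n inclusive."""
    return random.randint(1, n)


def columns_needed(n):
    """Number of columns required to store the largest value random(n) can give."""
    return len(str(n))


value = quantum_random(156)
print(1 <= value <= 156)        # prints True
print(columns_needed(156))      # prints 3
print(columns_needed(32))       # prints 2
```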
Logical expressions are used for comparing values, codes and variables.
Comparing values
Quick Reference
To compare the values of two arithmetic expressions, type:
arith_exp1 log_operator arith_exp2
where log_operator is one of the operators .eq., .gt., .ge., .lt., .le. or .ne.
Values are compared when you need to check whether an expression has a given value — for
example, did the respondent buy more than 10 pints of milk?
Values are compared by placing arithmetic expressions on either side of one of the following
operators:
.eq. Equal to
.gt. Greater than
.ge. Greater than or equal to
.lt. Less than
.le. Less than or equal to
.ne. Not equal to / unequal to
If the number of pints of milk that the respondent bought is stored in columns 114 and 115, the
expression to check whether he bought more than ten pints would be:
c(114,115) .gt. 10
If the number in these columns is greater than ten the expression is true, otherwise it is false.
In chapter 4, ‘Basic elements’, we said that integer variables may take numeric values or the logical
values true and false depending upon whether or not the value is zero. To check whether the
respondent bought any packets of frozen vegetables, we can either write:
fveg .gt. 0
to check the numeric value of the variable fveg, or we can simply say:
fveg
to check whether the logical value of fveg is true. To check whether fveg is false (that is, zero), we
would write:
.not. fveg
☞ For further information about .not., see ‘Combining logical expressions’ later in this chapter.
In virtually every Quantum run you will want to check which codes occur in which columns. This
is easily done using logical expressions. There are several forms of expression depending on
whether you are checking a column or a field of columns.
Data variables
Quick Reference
To test whether a data variable contains at least one of a list of codes, type:
var_name’codes’
To test whether a data variable contains none of the listed codes, type:
var_namen’codes’
To test whether a data variable contains exactly the given codes and nothing else, type:
var_name = ’codes’
To test whether a data variable contains exactly the given letter and nothing else, type:
var_name = ’letter’
To test whether two data variables contain identical codes, type:
var_name1 = var_name2
To test whether a data variable contains codes other than those listed, type:
var_nameu’codes’
To test whether two data variables do not contain identical codes, type:
var_name1uvar_name2
To check whether a column or data variable contains certain codes, place the codes, enclosed in
single quotes, immediately after the name of the column or data variable.
The expression:
Cn’p’
checks whether a column (n) contains a certain code or codes (p). The expression is true as long as
column n contains at least one of the given codes. It does not matter if there are other codes present
since these are ignored.
For example, to check whether column 6 contains any of the codes 1 through 4 we would type:
c6’1/4’
The expression is true if c6 contains any of the codes 1, 2, 3 or 4 or any combination of those codes,
regardless of what other codes may also be present. For instance, the column:
----+----1
5
7
9
-
makes the expression false, because it contains none of the codes 1 to 4.
In our original example we chose the codes 1 through 4. You can, of course, use any codes you like
and they may be entered in any order.
The opposite test is the expression:
cnN’p’
which checks that a column does not contain the given code or codes. The expression is true as long
as the column does not contain any of the listed codes.
For example:
c478n’5/7&’
is true as long as column 478 does not contain a 5, 6, 7 or & or any combination of them. A
multicode of ‘189’ returns the logical value true, because it does not contain any of the codes
‘5/7&’ whereas a multicode of ‘1589’ makes the expression false because it contains a ‘5’.
The ‘=’ operator is used to check that the contents of a column are identical to either the given
codes or the given letters.
The expression:
c312=’1/46’
is true as long as c312 contains all of the codes 1 through 4 and 6, and nothing else. The expression:
c142=’ ’
checks that column 142 is blank. The equals sign is optional when checking for blanks, so we could
simply write:
c142’ ’
The expression:
c124=’A’
checks that column 124 contains the letter A and nothing else.
The ‘=’ operator may also be used to compare the contents of two data variables. For example:
c56=c79
checks whether c56 contains exactly the same codes as c79. If so, the expression is true, otherwise
it is false. For example, if column 79 contains a ‘9’ and column 56 does not, the expression
yields the value false.
If you have defined your own data variables, you could write a statement of the form:
brand1=c79
to check whether the data variable called brand1 contains the same codes as c79.
The unequals operator is written:
cnU’p’
This checks whether column n contains something other than just the code ‘p’. Suppose we have
two sets of data:
----+----4 ----+----4
1 1
4 5
7 9
and we write:
c34u’7’
The expression is true for both sets of data. In the first example, the ‘7’ is multicoded with a ‘1’ and
a ‘4’, while in the second example, column 34 does not contain a ‘7’ at all. The only time this
expression is false is when column 34 contains a ‘7’ and nothing else.
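The four column tests described in this section behave like simple set operations, as the following Python sketch suggests. It is a model rather than Quantum code: all function names are invented, a column is represented as a string of the codes it contains, the code order 1 to 9, 0, minus, ampersand is an assumption, and the shortened blank test written without an equals sign is not modeled.

```python
CODE_ORDER = "1234567890-&"    # assumed order of the twelve codes in a column


def expand_codes(spec):
    """Expand a code list such as '1/4' or '5/7&' into a set of single codes."""
    codes = set()
    i = 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "/":
            low = CODE_ORDER.index(spec[i])
            high = CODE_ORDER.index(spec[i + 2])
            codes.update(CODE_ORDER[low : high + 1])
            i += 3
        else:
            codes.add(spec[i])
            i += 1
    return codes


def codes_in(column):
    """The set of codes present in a column; a blank column gives an empty set."""
    return set(column) - {" "}


def contains_any(column, spec):      # models cn'p'
    return bool(codes_in(column) & expand_codes(spec))


def contains_none(column, spec):     # models cnn'p'
    return not contains_any(column, spec)


def is_exactly(column, spec):        # models cn='p'
    return codes_in(column) == expand_codes(spec) - {" "}


def differs_from(column, spec):      # models cnu'p'
    return not is_exactly(column, spec)


print(contains_any("579-", "1/4"))    # prints False
print(contains_none("189", "5/7&"))   # prints True
print(contains_none("1589", "5/7&"))  # prints False
print(is_exactly("12346", "1/46"))    # prints True
print(differs_from("147", "7"), differs_from("159", "7"), differs_from("7", "7"))
# prints True True False
```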
Quick Reference
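To test whether a field contains a given list of codes, type:
var_name(start,end) = $codes$
To test whether a field contains a given list of letters, type:
var_name(start,end) = $letters$
To test whether two fields contain identical codes, type:
var_name1(start1,end1) = var_name2(start2,end2)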
To test whether the codes in one field differ from a given string, type:
var_name(start,end)u$codes$
To test whether the codes in one field differ from those in another, type:
var_name1(start1,end1)uvar_name2(start2,end2)
The contents of data fields must be enclosed in dollar signs with each code in the string referring
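to a separate column in the field. For instance, to check whether columns 47 to 50 contain the codes
-, 6, 4 and 9 respectively we would type:
c(47,50)=$-649$
This expression is true if the data is:
+----5----+
-649
but false if the data is:
+----5----+
-529
164&
because each column then contains a code in addition to the one named in the string.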
Just as you can test whether a field contains a given list of codes, you can also check
whether it contains a given list of letters. For example, to check whether columns 55 to 57
contain the string AAA, we would type:
c(55,57)=$AAA$
+----5-----+
AAA
All our examples have used columns, but the same rules apply to data variables that you define
yourself. For example:
rating(1,4)=$1234$
checks whether the field rating1 to rating4 contains the codes 1, 2, 3 and 4 in that order. That is, it
checks whether rating1 contains a 1, whether rating2 contains a 2, and so on.
When checking the contents of fields in this way, make sure that you enter as many columns as
there are codes in the string (that is, five codes require five columns). The exception to this rule
occurs when you are checking for blanks when the expression may be shortened to:
c(50,80)=$ $
This type of statement may also be used to compare two fields, to check whether the second field
contains exactly the same codes as the first field. When you compare one field with another,
Quantum takes each column in the first field in turn and looks to see whether the corresponding
column in the second field contains exactly the same codes. For example, if the first column of the
first field contains a code 1 and a code 2 and nothing else, then Quantum will check whether the
first column of the second field also contains a code 1 and a code 2 and nothing else. If all columns
of the second field are identical to their counterparts in the first field, then the expression is true;
otherwise it is false. Here is an example:
c(129,132)=c(356,359)
For this expression to be true, column 129 must contain exactly the same codes as column 356,
column 130 must be exactly the same as column 357, and so on. Once again, the two expressions
on either side of the equals sign must be the same length.
✎ Comparisons of one data variable against another are concerned with columns and codes: they
are not concerned with the arithmetic values of the codes in the fields as a whole.
If we have:
----+----3----+----
02 2
the expression:
c(24,25)=c(34,35)
is false because the string $02$ is not the same as the string $ 2$. If you want to compare fields
arithmetically (for example, is 02 the same as 2) then you will need to use the .eq. operator:
c(24,25).eq.c(34,35)
to test whether the value in c(34,35) was equal to the value in c(24,25).
☞ For further information about the .eq. operator, see ‘Comparing values’, earlier in this chapter.
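The difference between comparing codes and comparing values can be seen in a short Python sketch. The function names same_codes and same_value are invented, a field is modeled as a sequence with one string of codes per column, and same_value assumes single-coded digit columns with blanks ignored.

```python
def same_codes(first, second):
    """Model of field1=field2: each column must hold exactly the same codes."""
    if len(first) != len(second):
        raise ValueError("the two fields must be the same length")
    return all(set(a) - {" "} == set(b) - {" "} for a, b in zip(first, second))


def same_value(first, second):
    """Model of field1.eq.field2: compare the numbers the fields represent."""
    def value(field):
        digits = "".join(field).replace(" ", "")
        return int(digits) if digits else 0
    return value(first) == value(second)


print(same_codes("02", " 2"))                   # prints False
print(same_value("02", " 2"))                   # prints True
print(same_codes(["12", "3"], ["21", "3"]))     # prints True
```

The last call shows that the order in which codes are listed within a multicoded column does not matter; only the set of codes in each column is compared.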
To check whether the codes in one field do not match a given string or the codes in another field,
we can use the u (unequals) operator:
c(m,n)u$codes$
c(m,n)uc(m1,n1)
If the codes in the field c(m,n) do not match the given string or the codes in c(m1,n1) then the
expression is true. If the two fields are identical, then the expression is false.
✎ The comparison is of codes in columns, where the columns are compared on a one to one
basis. It is not a comparison of a field with a numeric value, or of the numeric values in two
fields. Numeric comparisons for inequality are written with the .ne. operator.
☞ For further information about numeric comparisons, see ‘Comparing values’, earlier in this
chapter.
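For example, the expression:
c(67,69)u$123$
is false only when columns 67 to 69 contain exactly the codes 1, 2 and 3:
+----7----+
123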
The expression:
c(67,69)uc(77,79)
is true as long as columns 67 to 69 differ by at least one code from columns 77 to 79. If our data is:
+----7----+----8
123 256
the expression is true because each of columns 77 to 79 differs from the corresponding column in
67 to 69. Also, if we have:
+----7----+----8
123 123
5
the expression is true because column 77 is multicoded ‘15’. The only time the expression is false
is when columns 67 to 69 are identical to columns 77 to 79.
Quick Reference
To test whether a value in a field is within a specified range, type:
range(start,end,minimum,maximum)
Blanks at the start of the field cause this statement to give a false result. To ignore leading blanks,
type:
rangeb(start,end,minimum,maximum)
The logical expression range checks whether the number in a field of columns is within a given
range. If so, the expression is true, otherwise it is false. The format of this statement is:
range(start,end,min,max)
where start and end are column numbers and min and max are the range delimiters. For example,
the statement:
range(137,139,100,150)
will return the value true if the number in columns 37 to 39 of card 1 is in the range 100 to 150.
✎ It is important to remember that this statement is designed for use with purely numeric
columns. Columns which contain blanks, multicodes or an ampersand (12 punch)
automatically cause the statement to be false. The exception to this is a multicode of a digit
and a minus sign (11 code) which converts the whole field to a negative number.
A variation of range is rangeb which allows columns to the left of the field to be blank if the
number is right-justified in the field. In all other respects it is exactly the same as range. If our data
is:
----+----2
123 6
the expression:
rangeb(17,18,1,10)
will be true because the string $ 6$ will be read as 6. With range the value would be false.
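The difference between range and rangeb can be modeled in Python as follows. This is a sketch with invented names: a field is represented as a string with one character per column, so multicodes are not modeled, and negative numbers formed with a minus sign are left out for brevity.

```python
def in_range(field, minimum, maximum, leading_blanks_ok=False):
    """Model of range (default) or rangeb (leading_blanks_ok=True)."""
    text = field.lstrip(" ") if leading_blanks_ok else field
    if not (text.isascii() and text.isdigit()):
        return False            # blanks, ampersands or an empty field fail the test
    return minimum <= int(text) <= maximum


print(in_range("123", 100, 150))                        # prints True
print(in_range(" 6", 1, 10))                            # prints False
print(in_range(" 6", 1, 10, leading_blanks_ok=True))    # prints True
print(in_range("6 ", 1, 10, leading_blanks_ok=True))    # prints False
```

The last call shows that only blanks to the left of a right-justified number are tolerated.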
rangeb(15,18,2000,3000)
Quick Reference
To combine logical expressions, link them with the operators .and., .or. and .not.
Two or more logical expressions may be combined into a single expression using the operators
.and., .or. and .not.
Any number of subexpressions may be combined to form a larger expression, but whether the result
is true or false depends upon the values of the subexpressions and also upon the operators used to
combine them.
The .and. operator requires that all the expressions preceding and following the .and. be true for
the whole expression to be true. Thus, the statement:
int1.eq.9 .and. c116’1’
is true if the integer variable int1 has a value of 9 and column 116 contains a 1. If either
subexpression is false, the whole expression is false too.
By comparison, the .or. operator requires that one expression or the other, or both, be true in order
for the whole expression to be true.
c(249,251)=$159$ .or. numb(c132,c135) .gt. 4
For this expression to be true, columns 249 to 251 must contain nothing but a ‘1’, ‘5’ and ‘9’
respectively or the number of codes in columns 132 and 135 must be greater than 4. It is also true if
both expressions are true. However, if both are false, the overall result is false.
Expressions are reversed (negated) simply by preceding them with the keyword .not. Although it
is not wrong to use it with a single variable, it is more generally used to reverse an expression
containing the keywords .and. and .or. Thus, it is not wrong to write .not.c15’1/5’ but it is much
simpler to write this as c15n’1/5’.
✎ Take care when using .not. with the .eq. operator. Statements of the form:
.not. c(1,3) .eq. 100
are incorrect and will not work. They should be written as either:
(.not.(c(1,3).eq.100))
with the expression to be reversed enclosed in parentheses, or, more efficiently, as:
(c(1,3).ne.100)
Any of the operators .and., .or. and .not. may appear in a statement more than once, as long as you
use parentheses to define the order of evaluation.
For example:
(c15’1/47’ .or. c16’3579’) .and. c22’&’
causes Quantum to check whether the .or. condition is true before dealing with the .and. Suppose
our data is:
our data is:
----+----2----+
13 &
79
The first expression (c15’1/47’) is true because column 15 contains a 1 and a 7 and the second
expression (c16’3579’) is also true since the codes it contains are amongst those listed as
acceptable. Thus, the .or. condition is true. Column 22 contains an ampersand so the last expression
is also true, and therefore the expression as a whole is true.
If both expressions in the parentheses were false, the whole expression would be false.
When you use .not. with expressions in parentheses, be very careful that what you write is what
you mean. Let’s take the conditions male and married and forget about columns and codes for the
minute. The condition:
Male .and. Married
refers to married men. The opposite of this is:
.not.(Male .and. Married)
which refers to unmarried men and all women. This can also be written as:
.not.Male .or. .not.Married
The first .not. collects all the women, the second collects everyone who is not married (for example,
single, widowed, and so on), and together they collect everyone who is female, unmarried, or both.
We use .or. instead of .and. here because the latter will gather unmarried women but will ignore the
unmarried men and married women.
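The rules for reversing .and. and .or. expressions can be checked exhaustively, since two conditions give only four combinations. The Python sketch below does this with Python's own not, and, and or standing in for .not., .and. and .or.; the function name is invented for the example.

```python
from itertools import product


def reversal_rules_hold():
    """Check both reversal rules for every combination of male and married."""
    for male, married in product([True, False], repeat=2):
        # reversing an .and. gives an .or. of the reversed parts
        if (not (male and married)) != ((not male) or (not married)):
            return False
        # reversing an .or. gives an .and. of the reversed parts
        if (not (male or married)) != ((not male) and (not married)):
            return False
    return True


print(reversal_rules_hold())   # prints True
```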
Reversing .or. expressions works in exactly the same way. The expression:
means anyone who is Male, or anyone who is Married, or anyone who is Male and Married. The
opposite of this is:
which means anyone who is neither Male nor Married; that is, anyone who is a
woman and is unmarried. This can be written as:
3----+----4----+----5----+----6----+
     519                      1
     9                        &
the expression is true because c(135,137) do not contain just the codes 5, 1 and 9 (c135 is
multicoded), and c160 does not contain any of the codes 6 through 0. The expression will only be
false if:
• column 135 contains a 5 only, column 136 contains a 1 only and column 137 contains a 9 only,
and
• column 160 contains any of the codes 6 through 0, either singly or as a multicode.
We could therefore write the expression as:
Quick Reference
To compare the value of a variable or an arithmetic expression to a list of numbers, type:
in the edit section. Ranges of numbers may be entered in the list as start:end. If the item is a
reference to a field containing blanks, enter the values as strings of codes enclosed in dollar signs.
From time to time you may need to check whether a variable or arithmetic expression has one of a
given list of values. For example, if the questionnaire codes brands of frozen vegetables as 3-digit
codes into columns 145 to 147 we might want to check that only valid codes appeared in this field.
This is achieved using the logical expression .in. as follows:
where variable-name is that of the variable to be checked and list is a list of permissible values.
The arithmetic expression is an expression consisting of data or integer variables, arithmetic
operators and integer values as described earlier in this chapter. If the variable or arithmetic
expression has one of the listed values, the expression is true; if not, it is false.
The left-hand side of the expression may contain integer variables, columns or data variables
containing whole numbers, or expressions using these types of variables. If it is a data variable, then
the list may contain codes enclosed in dollar signs. Quantum will then compare the codes in the
data variable with the codes inside the dollar signs. We could therefore check that the frozen
vegetables have been coded correctly by keying in a statement which says:
Quantum will flag any records in which c(145,147) does not contain exactly 205, 206, 207, 210,
215 or 220 (that is, three single-coded columns) as incorrect.
If the data variable contains a valid positive or negative whole number, then the list may also
contain such values. Ranges of values may be entered in the form min:max, where min is the lowest
acceptable value and max is the highest. Since the frozen vegetables have numeric codes, we could
write the expression as:
Any columns in the field which contain non-numeric data (for example, multicodes) will be flagged
as incorrect, as will any which contain values which do not match the specification.
Sometimes, though, the codes and numbers will not be interchangeable. If you have 2-digit codes
in a 3-column field, the statement:
unless column 206 is always blank. If the 2-digit codes have been padded on the left with zeros
instead of blanks (that is, 010, 011) or if they all start in column 206 (that is, $10 $, $11 $), then the
first expression will be false, even though the second one will still be true.
☞ For a fuller explanation of the difference between codes and numbers, see the earlier sections
of this chapter.
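The difference between comparing codes and comparing numbers can be illustrated outside Quantum. The following is a minimal Python sketch, not Quantum syntax: the helper names are hypothetical, a field is modelled as a fixed-width string, and the numeric reading (ignoring surrounding blanks and leading zeros) is an assumption based on the behaviour described above.

```python
import re


def matches_codes(field, codes):
    """Code comparison: the field must match one code string exactly,
    character for character, including the position of any blanks."""
    return field in codes


def matches_numbers(field, numbers):
    """Numeric comparison: the field is read as a whole number, so
    surrounding blanks and leading zeros do not affect the result."""
    text = field.strip()
    if not re.fullmatch(r"-?\d+", text):
        return False  # non-numeric content never matches
    return int(text) in numbers


CODES = [" 10", " 11"]   # 2-digit codes right-justified in 3 columns
NUMBERS = [10, 11]

right_justified = " 10"  # blank-padded on the left
zero_padded = "010"      # padded on the left with a zero
left_justified = "10 "   # code starts in the first column of the field
```

Here only the right-justified field passes the code comparison, while all three fields pass the numeric comparison.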
If the left-hand side of the expression is an integer variable or an arithmetic expression, the list may
contain positive or negative whole numbers:
Lists may contain up to 247 values or codes, which may be entered in any order. In our examples,
we have always entered them in ascending order, but this is not a requirement of Quantum. The
exception is numeric ranges, which must be entered in the form lowest:highest.
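The membership test described in this section can be sketched as follows. This is an illustration in Python, not Quantum syntax: the function name in_list and the use of a (lowest, highest) pair to stand for a range are assumptions made for the example.

```python
def in_list(value, allowed):
    """Return True if value equals a listed number or falls inside a
    listed (lowest, highest) range. Items may appear in any order."""
    for item in allowed:
        if isinstance(item, tuple):
            lowest, highest = item
            if lowest > highest:
                # Ranges must be written lowest first, as the text requires.
                raise ValueError("range must be lowest:highest")
            if lowest <= value <= highest:
                return True
        elif value == item:
            return True
    return False


# The frozen vegetable codes: 205 to 207, 210, 215 and 220.
FVEG = [(205, 207), 210, 215, 220]
```

For instance, in_list(206, FVEG) is true because 206 lies in the range 205 to 207, while in_list(208, FVEG) is false.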
Naming lists
Quick Reference
To assign a name to a list of values, type:
definelist name=(list)
in the edit section, where list is a comma-separated list of numbers, ranges or code strings enclosed
in dollar signs.
If you have a list that is used more than once you may give it a name and refer to it by that name
instead of typing in the complete list each time. To name a list, write:
definelist name=(list)
For example:
definelist fveg=(205:207,210,215,220)
To use a defined list, simply replace the list with the name:
✎ You cannot use a definelist in an .in. statement with a data-mapped variable. Quantum cannot
handle this syntax because it needs to read the data in the definelist differently for data-mapped
variables (as strings instead of column punches) but does not know at the time the definelist
is parsed whether it will be used with a data-mapped variable.
Quick Reference
To speed up your Quantum program by converting expressions of the form c(1,4)=$1234$ into
C in a more efficient way, type:
inline n
where n is the maximum field width to be converted in this manner. This statement must appear at
the start of the edit.
If you have a large edit, you can speed up the time it takes to run by including the inline statement
in your edit. This instructs the Quantum compiler to convert expressions of the form
c(1,4)=$1234$ into statements in the C programming language in a different way to the way it
normally does. You need not worry about these different methods of conversion, apart from
deciding whether or not to use them.
If you want to speed your program up, place a statement of the form:
inline n
at the beginning of the edit section, where n is the maximum field width to be converted in the
special way.
For example:
inline 6
Here we are saying that fields of six columns or less should be converted in the special way rather
than in the normal way.
6 How Quantum reads data
In order for the answered questionnaire to be processed, the information contained on the
questionnaire must be read into the computer into a location where Quantum can access it. This is
done by reading the data into the data variable array called C which is supplied automatically with
every Quantum run. You may then access this data by addressing this array.
Different types of records are read into the C array in different ways.
Quantum deals with three types of record: ordinary, multicard and multicard with trailer cards.
Ordinary records
These are strings of codes and numbers, one per respondent, up to a maximum of 32,767 characters
per respondent.
Multicard records
When data originates from punched cards and each questionnaire requires more than 80 columns,
the data is spread over several cards. So that all cards belonging to a particular respondent may be
easily identified, each questionnaire is assigned a serial number which is entered as part of the data
for each card. Within this, each card has a unique card type or card number to distinguish it from
others in the group. It is important that both the serial number and card type be in the same relative
positions on all cards in the file, since this is the only way that Quantum can tell which data belongs
to which respondent.
If the questionnaire serial number is in columns 1 to 4 of each card and the card type is in column 5,
and we are looking at questionnaire 1005, we will see that it has two cards whose first five columns
are 10051 and 10052 respectively. Quantum can deal with records that contain up to 327 cards per
respondent.
Occasionally you may have multicard records in which each ‘card’ is greater than 80 columns. The
notes that follow refer to multicard records of up to 100 columns per card.
☞ For information on how Quantum deals with ‘cards’ of more than 100 columns, see section
6.10, ‘Multicard records of more than 100 columns per card’.
Sometimes a record contains very repetitive data which is tabulated over and over again in the same
way. For instance, a shopping survey may ask the respondent a series of identical questions for each
store he visited. In this case, there may be a separate card for each store.
Processing this type of data is often easier if we treat all cards containing the same questions as if
they were, in fact, one card with one card number. These cards are called trailer cards.
Thus, if the respondent visited five stores, and the questions about these stores are coded on a
card 2, the record for that respondent would contain five cards of type 2. If demographic details
were stored on a card 1, the whole record would be 6 cards in all. In Quantum, the demographic
data would be described as the higher level and the stores as the lower level.
Another example of data gathered at different levels might be a travel survey in which respondents
are asked about the places they visited and their method of travelling. The highest level may be
demographic information about the respondents, the second level would be the various trips they
made and the third level might be information about the various modes of transport they used. If
we were to draw a chart of a record, it would look like this:
                         Respondent
                              |
        ---------------------------------------------
        |                     |                     |
      Trip1                 Trip2                 Trip3
        |                     |                     |
Tran1 Tran2 Tran3        Tran1 Tran2        Tran1 Tran2 Tran3
Here, we have three groups of data at level 2 and eight groups of data at level 3.
Data is read into the C array automatically, one record at a time. The way data is read depends upon
the record structure. If a record contains carriage return characters (CTRL+M), those characters are
always ignored.
Ordinary records
Ordinary records are read into cell 1 onwards of the array. Therefore, for example, the 50th column
is referenced as c50 and the 200th cell as c200.
Multicard records
Records are read into c101 to c200 for card 1, c201 to c300 for card 2, and so on. For example,
80-column cards are read into c101 to c180 for card 1 and c201 to c280 for card 2. Columns 181–
200, 281–300, and so on remain blank. In this case, the C array may be pictured as ten rows of 100
cells each. Column 50 of card 1 is then accessed by referring to it as c150, and column 67 of card
8 is referred to as c867.
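The addressing rule for cards of up to 100 columns is simple arithmetic. The following Python sketch illustrates it; the function names cell and card_field are hypothetical and are not part of Quantum.

```python
def cell(card, column):
    """C array cell holding the given column of the given card,
    for multicard records with up to 100 columns per card."""
    if card < 1:
        raise ValueError("card types start at 1")
    if not 1 <= column <= 100:
        raise ValueError("column must be between 1 and 100")
    return card * 100 + column


def card_field(card):
    """First and last cells of the row reserved for a card."""
    return cell(card, 1), cell(card, 100)
```

Thus cell(1, 50) gives 150 and cell(8, 67) gives 867, matching the references c150 and c867 above.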
☞ For information on longer records, see section 6.10, ‘Multicard records of more than 100
columns per card’.
If you have records with more than nine cards, you need to extend the size of the C array by using
max=. This also tells Quantum which cells to clear between records.
☞ For further details on max=, see ‘Highest card type number’ later in this chapter.
It is also possible to read cards into the array sequentially regardless of card type: the first card goes
in c(101,200), the second in c(201,300), the third in c(301,400), and so on.
☞ For further information, see ‘Record type’ in section 6.8, ‘Describing the data structure’.
Each time an ordinary record or set of cards comprising a multicard record is read in, that data is
processed first by the edit section and then by the tabulation section of your program. The complete
record is edited and tabulated in one go. The exception to this is the trailer card record where
processing can take place a number of times within each record for each lower level.
To ensure that only the part of the edit section applying to a particular level is used, the edit section
is defined separately for each level. Similarly, the table instructions specify the level at which the
table should be incremented.
☞ For more information about levels, see chapter 3, ‘Dealing with hierarchical data’ in the
Quantum User’s Guide Volume 3.
By using the Levels facility, the user need not know how Quantum deals with trailer card data
internally. However, there are occasions when it may be necessary to edit or tabulate the data
without using levels. To do this, it is necessary to know more about how trailer cards are processed.
Quantum deals with trailer cards in a number of ‘reads’. Cards are read into the appropriate rows
of the C array until:
• A card is located with a card type matching that of the previous card (for example, two
consecutive card 2’s), or
• A card is read with a type lower than its predecessor and matching one of the card types already
read in during the current ‘read’ (for example, a card 2, a card 3, and then another card 2).
In order to produce useful tables, you will need to know which cards are currently in the C array.
Quantum has four reserved variables — thisread, allread, firstread and lastread — which it uses to
keep track of which cards it has read for each respondent.
thisread
The array called thisread is used to check which cards have been read in during the current read.
thisread1 will be true (or 1) if a card type 1 has just been read in; thisread2 will be true if a card 2
has just been read, and so on.
There are nine such variables (thisread1 to thisread9) available unless extra card types have been
specified using the max= option. In this case, these variables will be numbered 1 to max; if there
are 13 cards, we will have thisread1 to thisread13.
☞ For further details on max=, see ‘Highest card type number’ later in this chapter.
allread
allread notes which cards have been read in so far for this questionnaire. If cards 1, 2 and 3 have
been read so far, allread1, allread2 and allread3 will all be true. Additionally, each cell of allread
will contain the number of cards of the given type read in — for instance, if two cards of type 3
have been read, allread3 will be true and it will contain the number 2.
As with thisread, there are nine allread variables available unless extra card types have been
specified with max=.
The variables firstread and lastread become true when the first and last cards in a record have been
read in.
Examples
You can use these variables in your program to associate specific parts of the edit or tabulation
section with specific types of data. For instance:
Let’s take an example and look at the contents of the C array and the values of thisread, allread,
firstread and lastread. Suppose the record has five cards: 1, 2, 2, 2 and 3 of 80 columns each. The
first ‘read’ places card 1 in c(101,180) and the first card 2 in c(201,280). The second card 2 is not
read into the array yet because it has the same card type as the previous card. As this is the start of
a new respondent, firstread is true (or 1), and because cards 1 and 2 have been read, thisread1,
thisread2, allread1 and allread2 are also true.
The second ‘read’ deals only with the second card 2 since it is followed by another card of the same
type. thisread2 is true, as are allread1 and allread2. Also, allread2 contains the value 2 because we
have read in 2 card 2s so far. Note that thisread1 is now false (or 0) as no card 1 was read this time.
On the third and final ‘read’ the third card 2 is read into c(201,280) and card 3 is copied into
c(301,380). lastread is true because we have reached the end of the record, thisread2 and thisread3
are true because we have just read cards 2 and 3, and allread1, allread2 and allread3 are true because
this record contains cards 1, 2 and 3. allread2 now contains the value 3 because there were 3 card 2s
altogether.
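The walk-through above can be reproduced with a short simulation. This is a sketch in Python under stated assumptions, not Quantum's implementation: it applies only the two stopping rules given earlier, takes a record as a list of card types, and uses hypothetical function names. thisread is shown as the list of card types in the current read and allread as a running count per card type.

```python
from collections import Counter


def split_into_reads(card_types):
    """Split one respondent's cards into 'reads'. A read ends when the
    next card has the same type as the previous card, or has a lower
    type that has already been read during the current read."""
    reads, current = [], []
    for card in card_types:
        if current:
            same_as_previous = card == current[-1]
            lower_and_seen = card < current[-1] and card in current
            if same_as_previous or lower_and_seen:
                reads.append(current)
                current = []
        current.append(card)
    if current:
        reads.append(current)
    return reads


def trace(card_types):
    """Values of the reserved variables after each read."""
    reads = split_into_reads(card_types)
    allread = Counter()
    rows = []
    for number, cards in enumerate(reads):
        allread.update(cards)
        rows.append({
            "cards": cards,
            "thisread": sorted(set(cards)),
            "allread": dict(allread),
            "firstread": number == 0,
            "lastread": number == len(reads) - 1,
        })
    return rows


rows = trace([1, 2, 2, 2, 3])
```

For the record with cards 1, 2, 2, 2 and 3 this yields three reads containing cards 1 and 2, card 2 alone, and cards 2 and 3, with allread for card 2 rising from 1 to 3.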
The chart below summarizes the cards read and the variables which will be true after each read.
If Quantum reads a record in which the repeated cards are out of sequence, it inserts blank cards
of the appropriate types wherever necessary to force the cards into the correct sequence. For
example, if the record contains the cards 1, 2, 4, 3, 4, 4 in that order, Quantum will generate a
completely blank card 3 when it reads the first card 4. The record is then processed as if it contained
cards 1, 2, 3, 4, 3, 4, 4.
It is sometimes useful to know that in the case of multicard records the first card of the next record
is waiting in columns 1 to 100 of the array. Beware of overwriting these columns.
In section 6.4, ‘Trailer cards’ we discussed the reserved variable thisread, which keeps track of
which cards have been read in during the current read, and allread, which keeps track of all cards
read in for the current record. Other reserved variables associated with reading in data are:
lastrec      Set to true when the last record in the file has been read or, in the case of trailer
             cards, when the last read of the last record has occurred.
rec_count    Stores the number of records read in so far.
card_count   Counts the number of cards read so far.
You can use spare columns in the C array for data manipulation and storing additional information.
However, it may be clearer to store this information in named variables where the name gives some
indication of the type of data stored.
In ordinary records you can use the space beyond the end of the record. If the record length is 120
columns, you can use columns 121 to 1000.
✎ For ordinary records, only columns 1 to reclen are reset to blanks, where reclen is the
maximum record length as defined by the reclen= keyword on the struct statement.
☞ For further information about defining the record length, see ‘Record length’ in the next
section.
In multicard records you may not use c(1,100). However, you may use any columns between the
end of the card (reclen) and the end of that row of the C array. For instance, when reclen=80 you
may use c(181,200), c(281,300) and so on. You may also use full sets of columns in which there is
no data: that is, if the record has only four cards (1, 2, 3 and 4), then c(501,1000) are the spare
columns you may use. Additionally, cells c101 to c(100+reclen), c201 to c(200+reclen), and so on
are reset to blanks before the next record is read in.
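The spare fields in a multicard record can be listed with a little arithmetic. The following Python sketch is illustrative only: the function name spare_fields is hypothetical, and it assumes rows of 100 cells and the default of nine card rows when max= is not used.

```python
def spare_fields(reclen, cards_present, max_card=9):
    """List the (first, last) cells free for your own use in a
    multicard record. c(1,100) is never included."""
    spare = []
    for card in range(1, max_card + 1):
        row_start = card * 100 + 1
        row_end = card * 100 + 100
        if card in cards_present:
            # Only the cells beyond the end of the card are free.
            if reclen < 100:
                spare.append((row_start + reclen, row_end))
        else:
            # No data is read into this row, so all of it is free.
            spare.append((row_start, row_end))
    return spare


fields = spare_fields(80, {1, 2, 3, 4})
```

With reclen=80 and cards 1 to 4 this lists c(181,200), c(281,300), c(381,400) and c(481,500), followed by the five whole rows that make up c(501,1000).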
Quick Reference
To describe the structure of the data, type:
struct; options
All programs dealing with multicard records must contain a struct statement unless the data
contains trailer cards which will be read and tabulated using the levels facility. In this case you may
choose between using a struct statement or using a levels file. If the run has no struct statement and
no levels file, Quantum assumes that the data contains ordinary records to be read into c1 onwards
of the C array.
☞ For information about levels and how to describe the levels data structure, see chapter 3,
‘Dealing with hierarchical data’ in the Quantum User’s Guide Volume 3.
The struct statement is used to define the type of records, the location of the serial number and card
type in the record and the number of the highest card type if greater than 9. Its format is:
struct; options
Record type
Quick Reference
To define the record type, type:
struct; read=n
where n is 0 for ordinary records, 2 to read multicard records in sections according to the card type,
or 3 to read multicard records all in one go.
Quantum recognizes two types of record: single card and multicard. The type of record is defined
by the keyword read= on the struct statement:
• Ordinary records — Ordinary records are defined using read=0. Each record is read into c1
onwards of the array. Since it is the default, you need only use it when other options are
required; for example, when the records contain serial numbers and you wish to have the serial
number printed out as part of the record, or when you are working with long records of more
than 100 columns.
• Multicard records — Multicard records are identified by the keyword read=2. Each card in
the record is read into the row corresponding to the card type of that card — that is, card 1 in
c(101,200), card 2 in c(201,300), and so on.
We mentioned briefly that it is possible to read in all the cards of a multicard record at once,
ignoring the card type. The first card goes in c(101,200), the second in c(201,300), and so on.
This is achieved with read=3.
Record length
Quick Reference
To define the record length of records greater than 100 columns, type:
struct; reclen=n
The keyword reclen=n defines the maximum number of characters to be read into the C array, the
number of cells to be reset to blanks and the number of cells to be written out by the write statement.
With ordinary records reclen may take any value, but with multicard records the maximum is
reclen=1000. In both cases, the default is reclen=100. When data is read into the array, any record
which is longer than reclen characters is truncated to that length and a warning message is printed.
When ordinary records are written out with write or split, cells c1 to c(reclen) are copied, with any
trailing blanks being ignored. For instance, if we have:
struct;read=0;reclen=200
and the current record is only 157 characters long, the record written out will be 157 characters
long. This length can be overridden by an option on a filedef statement.
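The effect of reclen on ordinary records can be modelled as follows. This is a Python sketch of the behaviour described above, not of Quantum itself: the function names are hypothetical, the warning is reduced to a true/false flag, and the filedef override is not modelled.

```python
def read_ordinary(line, reclen=100):
    """Model reading an ordinary record: keep at most reclen
    characters and pad the remaining cells with blanks."""
    truncated = len(line) > reclen  # Quantum prints a warning here
    cells = line[:reclen].ljust(reclen)
    return cells, truncated


def write_ordinary(cells, reclen=100):
    """Model writing an ordinary record: copy c1 to c(reclen),
    ignoring any trailing blanks."""
    return cells[:reclen].rstrip(" ")


# A 157-character record read with reclen=200, as in the example.
cells, truncated = read_ordinary("x" * 157, reclen=200)
written = write_ordinary(cells, reclen=200)
```

The array holds 200 cells after the read, but the record written out is again 157 characters long.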
When multicard records are written out, columns c101 to c(100+reclen), c201 to c(200+reclen),
and so on will be output. Thus, if we write:
struct;read=2;reclen=70
and we have 2 cards per record, Quantum will write out c(101,170) and c(201,270).
Finally, with ordinary records cells c1 to c(reclen) are reset to blanks between records, but with
multicard records cells c101 to c(100+reclen), c201 to c(200+reclen), and so on are reset.
☞ For information about the write statement, see section 7.1, ‘Print files’.
For information about the split statement, see section 12.4, ‘Creating clean and dirty data
files’.
For information about the filedef statement, see section 7.4, ‘Defining the file type’.
Quick Reference
To define the location of the serial number in each record, type:
struct; ser=c(m,n)
The keyword ser=c(m,n) defines the field of columns containing the respondent serial number. For
example, if the serial number is in columns 1 to 5 of an ordinary record we would write:
struct;read=0;ser=c(1,5)
and for a multicard record with the serial number in the same columns:
struct;read=2;ser=c(1,5)
Notice that even with multicard records we only give the actual column numbers containing the
serial number, rather than card type and column number as is usually the case when identifying
columns in such records. This is because the column numbers refer to all cards in the data set rather
than to a single card in the file.
Quick Reference
For multicard records only, to define the location of the card type in the record, type:
struct; crd=cn
Defining the card type location is much the same as defining the position of the serial number in
the record. The keyword is crd=cn for a single digit card type or crd=c(m,n) for a card type of more
than one digit. Once again, m and n are column numbers only, not card type and column number.
For example:
struct;read=2;ser=c(1,4);crd=c5
tells us that we have a multicard record with serial numbers in columns 1 to 4 and the card type in
column 5 of each card. Each card will be read into the row corresponding to its card number.
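To show what ser= and crd= describe, the following Python sketch picks the serial number and card type out of each card and groups consecutive cards by serial number. It is an illustration only: the function names are hypothetical and the defaults mirror the example above (serial number in columns 1 to 4, card type in column 5).

```python
from itertools import groupby


def field(card_image, start, end):
    """Columns start to end of a card, numbered from 1."""
    return card_image[start - 1:end]


def identify(card_image, ser=(1, 4), crd=(5, 5)):
    """Return the (serial number, card type) of one card."""
    return field(card_image, *ser), field(card_image, *crd)


def group_by_serial(card_images, ser=(1, 4)):
    """Group consecutive cards that share a serial number."""
    return [list(cards)
            for _, cards in groupby(card_images,
                                    key=lambda c: field(c, *ser))]


records = group_by_serial(["10051", "10052", "10061"])
```

The cards 10051 and 10052 are recognized as cards 1 and 2 of respondent 1005, while 10061 starts a new respondent.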
Quick Reference
For multicard records only, to define cards which must be present in each record, type:
struct; req=card_numbers
where card_numbers is either a comma-separated list of card numbers, or a range of sequential card
numbers in the form start:end or start/end.
Sometimes some cards will be optional and others mandatory. You define the cards which must
appear in every record by using the keyword req= followed by the numbers of the cards that each
respondent must have. For example:
req=1,2
tells us that cards 1 and 2 must be present in each record for that record to be accepted. Any other
cards are optional. If a record is read without one of these cards, the error message ‘Card Missing
in Set’ and a note of the record’s position in the file are printed and the record is ignored.
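The check made with req= can be sketched as follows. This Python example is illustrative rather than a description of Quantum's internals: the function names are hypothetical, and accepting a mixture of single numbers and ranges in one specification is an assumption made for convenience.

```python
def expand_cards(spec):
    """Expand a card specification such as '1,2', '1:4' or '1/4'
    into a list of card numbers."""
    cards = []
    for part in spec.split(","):
        part = part.strip()
        separator = ":" if ":" in part else "/" if "/" in part else None
        if separator:
            start, end = (int(n) for n in part.split(separator))
            cards.extend(range(start, end + 1))
        else:
            cards.append(int(part))
    return cards


def missing_cards(required_spec, cards_in_record):
    """Required cards that are absent from a record. A non-empty
    result corresponds to the 'Card Missing in Set' error."""
    present = set(cards_in_record)
    return [c for c in expand_cards(required_spec) if c not in present]
```

With req=1,2 a record holding cards 1 and 3 is reported as missing card 2, whereas a record holding cards 1, 2 and 5 is accepted.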
If you have ranges for required card types, you may type the numbers of the lowest and highest
cards separated by a slash (/) or a colon (:) rather than listing each card type separately. For
example, if cards 1 to 4 are all required, you may type:
req=1/4
Quick Reference
For multicard records only, to define cards which may appear more than once in a record, type:
struct; rep=card_numbers
where card_numbers is either a comma-separated list of card numbers, or a range of sequential card
numbers in the form start:end or start/end.
If the data contains trailer cards and the Levels facility is not used, you must list their card types
with the keyword rep=. For instance, if card 2 is a trailer card we would write rep=2. Where there
is more than one trailer card, each card type is listed separated by a comma. If cards 2, 3 and 4 are
all trailer cards we could write:
rep=2,3,4
If you have ranges for repeated card types, you may type the numbers of the lowest and highest
cards separated by a slash (/) or a colon (:) rather than listing each card type separately.
If rep= is not used and a record is read with two or more cards of the same type, the last card of
that type will be accepted and the message ‘Identical duplicate’ or ‘Non-identical duplicate’ and a
note of the record’s position in the file will be printed. For example:
Record structure error: serial 026, card 234 in run, card 234 in dfile
card type 2 — non-identical duplicate
Because rep= refers to trailer cards only, it will be ignored if read=2 and crd= are not both present
on the struct statement.
Quick Reference
For multicard records only, to define the highest card type in the record, if there are more than nine
cards per record, type:
struct; max=n
The only time you need to inform Quantum of the highest card type is when you have records with
more than nine cards. This is so that Quantum can allocate sufficient cells in the C array to store
the extra cards. The highest card type is defined with max=n, where n is the number of the highest
card type. Cells 1 to max*reclen are then cleared between respondents. For example, to read a data
set with 11 cards per respondent we might write:
struct;read=2;ser=c(1,4);crd=c5;req=1,2,3,4;max=11
If you forget max=, and a record is read with more than nine cards, the message ‘Too many cards
per record’ is printed and the record is rejected. On the other hand, if a card is read with a card type
higher than that defined with max=, the record is rejected with the message ‘Card number out of
range’.
✎ Since the maximum size of the C array is 32,767 cells, the maximum value you can set with
max= is 327 cards.
Quick Reference
For multicard records only, to define the location in the C array of cards with alphanumeric card
types, type:
struct; order=card_types
where card_types is a list of card type numbers and letters in the order they are to appear in the C
array.
From time to time you may need to read in records with alphabetic as well as numeric card types.
This generally happens in a multicard data set containing more than nine cards per record where
only one column has been allocated to the card type.
Quantum can deal with this data but first you have to say where in the C array the alphabetic card
types should go. This is done with the keyword:
order=n
where n is one or more of the codes ‘1234567890–&’ or the letters A to Z (in upper or lower case)
not separated by spaces.
The card type bearing the first number in the list is read into c(101,200), the card bearing the second
code in the list is read into c(201,300), and so on. For example, suppose each record has ten cards
— 1 to 9 and A — our struct statement might say:
struct;read=2;ser=c(1,4);crd=c5;max=10;order=123456789A
Data from card A would be read into cells 1001 to 1100 of the C array.
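The mapping from card type to position in the C array follows from the position of the code in the order= list. The following Python sketch illustrates this for rows of 100 cells; the function name cells_for_card is hypothetical, and matching codes exactly as typed (without folding upper and lower case) is an assumption.

```python
def cells_for_card(order, card_code):
    """First and last cells of the row that receives cards of the
    given type, based on the code's position in the order= list."""
    if len(card_code) != 1:
        raise ValueError("card type must be a single character")
    row = order.index(card_code) + 1  # ValueError if not listed
    return row * 100 + 1, row * 100 + 100


ORDER = "123456789A"
```

Card A is tenth in the list, so cells_for_card(ORDER, "A") gives cells 1001 to 1100.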
Quick Reference
For multicard records only, to define the location of the merge sequence number in trailer cards,
type:
struct; seq=cn
When trailer card data is merged during a run with the merge facility, you may wish trailer cards
to be merged in a specific order, according to a sequence number entered as part of the data. The
location of this sequence number can be defined with the keyword seq=cn for a single column code
or seq=c(m,n) for a multicolumn code. For more information on merging data see the next section.
When we say that Quantum allows you to merge data files, we do not mean that Quantum takes
data from a number of files and merges it to create a new file. Rather, we mean that data can be
read from a series of files during a Quantum run. Of course, the merged data can then be written
out to a new file for future use.
Quantum provides two methods for merging data. The first is designed for studies where you have
different card types in different files; for example, cards 1 and 2 in the file data1 and card 3 in the
file data2. In this case, merging is by serial number and, optionally, card type and trailer card
sequence number.
The second method is designed for situations where you want to merge a field of data from an
external file into records from the main data file. For example, you may have a file of
manufacturers’ codes which refer to a number of products. If each record in the main data file
contains the product the respondent preferred, you may wish to merge the appropriate
manufacturer’s code from the external file into the main data in the C array. In this case, merging
is based on finding matching keys in the main record and the records in the external file.
Data for a study may be spread across a number of files. This is particularly useful with large
surveys because it means that you can put each card type in a different file and simply merge in the
cards required for the current batch of tables. For example, if we require tables from cards 4 and 5,
we need not even read in cards 1, 2, 3 and 6.
Data from up to 16 files may be merged; that is, the main data file and 15 others. It may be merged
on serial number and, within that, on card type. With trailer card data, you also have the option of
merging trailer cards according to a sequence number entered as part of the data.
In order for the merge to be successful, all files must be sorted in ascending order with the serial
number, card type and sequence number in the same position. Quantum reads the locations from
the keywords ser=, crd= and seq= on the struct statement.
To merge data files you must create a file called merges telling Quantum which items to merge on,
and which files to merge. The type of merge is represented by a number:
1 Merge on serial number. Cards are read in from each data file according to their serial number
only — the card type and sequence number, if any, are ignored. You might use this option
when you have two files, dat01 containing cards of type 1 and dat02 containing cards of
type 2, and you want the files to be merged so that card type 1 is read into the C array, followed
by card type 2.
3 Merge on serial number and card type (default). With this option, cards with the same serial
number read from different data files are merged to form a single record by comparing the
serial number and card type. Cards within a record are then sorted sequentially from 1 so that
each card is read into the appropriate cells of the C array. For example, if dat01 contains cards
1 and 3, and dat02 contains cards of type 2, the merge will produce records containing cards
1, 2 and 3 in that order.
5 Merge on serial number, card type and sequence number. This is similar to merge type 3,
except that trailer cards are merged according to their sequence number. For example, if dat01
contains cards 1 and 2, where card 2 is a trailer card with a sequence number of 2, and dat02
contains cards 2 and 3, where card 2 is a trailer card with a sequence number of 1, the merged
record will contain cards 1, 2/1, 2/2, and 3, in that order.
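The three merge types differ only in the key used to interleave the sorted files. The following Python sketch models this with the standard heapq.merge function. It is an illustration, not Quantum's algorithm: each card is represented as a (serial, card type, sequence number) tuple, a sequence number of 0 stands for cards that have none, and the function names are hypothetical.

```python
import heapq


def merge_key(card, merge_type):
    """Part of the card used for comparison under each merge type."""
    serial, card_type, sequence = card
    if merge_type == 1:
        return (serial,)
    if merge_type == 3:
        return (serial, card_type)
    if merge_type == 5:
        return (serial, card_type, sequence)
    raise ValueError("merge type must be 1, 3 or 5")


def merge_files(files, merge_type=3):
    """Interleave files that are each already sorted on the key.
    Cards with equal keys keep the order in which the files are given."""
    return list(heapq.merge(*files,
                            key=lambda card: merge_key(card, merge_type)))


# Merge type 5 example: card 2 is a trailer card with a sequence number.
dat01 = [(1, 1, 0), (1, 2, 2)]
dat02 = [(1, 2, 1), (1, 3, 0)]
merged = merge_files([dat01, dat02], merge_type=5)
```

The merged record holds cards 1, 2/1, 2/2 and 3 in that order, as in the description of merge type 5.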
The type of merge is the first item in the merges file, and is followed by the names of the files to
be merged with the main data file named in the Quantum command line. Items may be entered on
separate lines or all on the same line separated by semicolons. For example, if we want to merge
data in files dat02 and dat03 with data in the main file, dat01, by serial number, card type and
sequence number, the merges file would look like this:
5; dat02; dat03
Notice that we have not mentioned dat01 in the merges file because it will be named on the
Quantum command line instead.
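The layout of the merges file can be illustrated outside Quantum. The following Python sketch is not part of Quantum: the function name parse_merges is hypothetical, and it assumes only what is described above, namely that items are separated by semicolons or new lines, that the first item is the merge type, and that the remaining items are file names. It also assumes that the merge type is always given.

```python
def parse_merges(text):
    """Split the contents of a merges file into (merge_type, file_names).

    Illustration only: this models the file layout described in the text,
    it is not how Quantum itself reads the file.
    """
    # Items may be on separate lines or on one line separated by semicolons.
    items = [item.strip() for item in text.replace("\n", ";").split(";")]
    items = [item for item in items if item]
    # Assumption: the merge type is always present as the first item.
    merge_type = int(items[0])
    return merge_type, items[1:]


print(parse_merges("5; dat02; dat03"))
print(parse_merges("5\ndat02\ndat03"))
```

Both calls describe the same merge: type 5, with dat02 and dat03 merged into the main file named on the command line.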
✎ This facility is not designed to work with merge files that contain *include or #include
statements to read additional data files into the current data file. All merge files must be named
in the merges file, which accepts pathnames if the data files are not in the project directory.
Quick Reference
To merge extra data from an external data file into the data currently in the C array, type:
int_variable = mergedata($ex_file$,key_field,key_start,copy_to,data_start)
where
key_field is the location of the key in the main data file, entered using the standard Quantum
notation for columns and fields.
key_start is the start column of the key in the external data file.
copy_to is the field in the main data record in which to place the external data. The field is
defined using the standard Quantum notation for columns and fields.
The mergedata statement merges a field of data from an external file with the main data at the
datapass stage of the Quantum run. Merging is by means of a data key present in both the main
records and the records in the external file. If a record in the external file has a key which matches
that of a record in the main data file, the external data will be merged into a user-defined field of
the main record when it is read into the C array.
In order for data to be merged correctly, both the main data file and the external file must be sorted
in ascending order by key value. If the key is the record serial number then the data file will already
be sorted in the correct order (assuming, of course, that the data is sorted by serial number). If you
are using a key that is not the record serial number you must sort the data file so that it is ordered
by key rather than by serial number.
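The statement takes the form:
int_variable = mergedata($ex_file$,key_field,key_start,copy_to,data_start)
where: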
int_variable is the name of an integer variable in which the function can place its return value.
ex_file is the name of the file containing the extra data. It must be enclosed in dollar
signs.
key_field is the location of the key in the main data file, entered using the standard
Quantum notation for columns and fields.
key_start is the start column of the key in the external data file, for example, 1 if the key
starts in column 1. The length of the key is taken from the length of key_field.
copy_to is the field in the main data record in which to place the external data. The field
is defined using the standard Quantum notation for columns and fields.
data_start is the start column of the data to be copied. Quantum copies as many columns as
are defined by copy_to.
For example:
t1 = mergedata($manuf_codes$,c(178,180),15,c(168,175),1)
tells Quantum to compare the key in columns 178 to 180 of the main record with the key which
starts in column 15 of the external records in the file manuf_codes.
Because the key field in the main record is 3 columns long, Quantum reads columns 15 to 17 of
each external record to obtain its key. If the keys match, Quantum copies the data from the external
record into columns 168 to 175 of the main record in the C array. The external data to be copied
starts in column 1 and, since the destination field is 8 columns long, Quantum copies 8 columns
starting at that column.
This statement returns a value of 1 if a match was found (i.e., merging took place), or 0 if not.
There is no limit on the number of mergedata statements in a specification, but you may only merge
data from up to nine different files per record.
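The column arithmetic in this example can be checked with a small model. The Python sketch below is an illustration, not Quantum code: the function reuses the name mergedata only for readability, the record contents are invented, and it simply scans a list of external records instead of reading two sorted files in step as Quantum does. Columns are numbered from 1, as in the text.

```python
def mergedata(record, external_records, key_field, key_start, copy_to, data_start):
    """Model of the documented behaviour. Returns 1 on a match, 0 otherwise.

    record           -- list of single characters, one per column of the main record
    external_records -- list of strings, one per external record
    key_field        -- (first, last) columns of the key in the main record
    copy_to          -- (first, last) columns that receive the external data
    """
    key_first, key_last = key_field
    key_length = key_last - key_first + 1      # key length comes from key_field
    key = "".join(record[key_first - 1:key_last])

    copy_first, copy_last = copy_to
    copy_length = copy_last - copy_first + 1   # columns copied come from copy_to

    for external in external_records:
        external_key = external[key_start - 1:key_start - 1 + key_length]
        if external_key == key:
            data = external[data_start - 1:data_start - 1 + copy_length]
            record[copy_first - 1:copy_last] = list(data.ljust(copy_length))
            return 1
    return 0


# Invented sample data: the key is in columns 178-180 of the main record and
# in columns 15-17 of each external record; the data to copy is in columns 1-8.
main_record = [" "] * 200
main_record[177:180] = list("042")
manuf_codes = ["ACME LTD".ljust(14) + "041", "BRIGHTON".ljust(14) + "042"]

t1 = mergedata(main_record, manuf_codes, (178, 180), 15, (168, 175), 1)
print(t1, "".join(main_record[167:175]))
```

Here the second external record matches, so the eight columns starting at column 1 of that record are copied into columns 168 to 175 of the main record and the function returns 1.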
Errors
Errors can occur if your run contains a mergedata statement and either the main data file or the file
of supplementary data for merging has records with duplicate keys or records that are out of
sequence. In some cases the run is also canceled after all data has been read, when a complete error
report is available. The following table lists the situations when duplicate or out of sequence data
may occur and shows what happens to your job.
Circumstance                                    Message                       Run canceled?
----------------------------------------------  ----------------------------  -------------
read=0 and the main data file contains records  WARNING: FILE name CONTAINS   No
with duplicate keys                             DUPLICATES IN key_field
read=2 and the main data file contains records  WARNING: FILE name CONTAINS   Yes
with duplicate keys                             DUPLICATES IN key_field
read=0 or read=2 and the supplementary data     WARNING: FILE name CONTAINS   Yes
file contains records with duplicate keys       DUPLICATES IN key_field
read=0 or read=2 and records in the main data   WARNING: FILE name OUT OF     Yes
file are out of sequence                        SEQUENCE IN key_field
read=0 or read=2 and records in the             WARNING: FILE name OUT OF     Yes
supplementary data file are out of sequence     SEQUENCE IN key_field
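Because duplicate or out of sequence keys can cancel a run, it may be worth checking the keys before the files are merged. The following Python sketch is one possible pre-check and is not part of Quantum. The function name check_keys is hypothetical, and the sketch assumes that keys are fixed-width strings that sort correctly as text.

```python
def check_keys(keys):
    """Return the problems found in a list of key values taken in file order.

    Assumption: keys are fixed-width strings, so comparing them as text
    gives the same order as comparing them as numbers.
    """
    problems = set()
    for previous, current in zip(keys, keys[1:]):
        if current == previous:
            problems.add("duplicates")
        elif current < previous:
            problems.add("out of sequence")
    return sorted(problems)


print(check_keys(["001", "002", "003"]))
print(check_keys(["001", "001", "002"]))
print(check_keys(["001", "003", "002"]))
```

An empty list means the keys are in ascending order with no repeats, which is the condition both files must satisfy for merging.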
Occasionally you may have multicard records in which each card contains more than 100 columns.
To process this data, Quantum extends the width of the C array to 10 rows of 1,000 cells each
(that is, 10,000 cells in all) when a struct statement with reclen>100 is present. Data is read into
c(1001,2000) for card 1, c(2001,3000) for card 2, and so on. The last three digits are used for the
column number and the other digits are used for the card number.
All other points mentioned previously for multicard records apply, but column numbers refer to the
extended rather than the default C array. For example, in the default C array c(1,100) stores the first
card of the next record, whereas in the extended C array this data is stored in c(1,1000).
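The relationship between card number, column number and cell number can be written as a small calculation. The Python sketch below is an illustration only, with a hypothetical function name. It assumes the layouts implied in this chapter: 100 cells per card in the default C array (so that card 3, column 8 is c308) and 1,000 cells per card in the extended array.

```python
def cell(card, column, extended=False):
    """Return the C array cell number for a column of a card.

    Assumption: each card occupies 100 cells in the default array and
    1,000 cells in the extended array, starting after the cells reserved
    for the first card of the next record.
    """
    cells_per_card = 1000 if extended else 100
    if not 1 <= column <= cells_per_card:
        raise ValueError("column is outside the card")
    return card * cells_per_card + column


print(cell(3, 8))                    # default array
print(cell(1, 1, extended=True))     # first column of card 1, extended array
print(cell(2, 1, extended=True))     # first column of card 2, extended array
```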
Occasionally you may have to process data which does not come in the standard formats described
in this chapter. For instance, records may be strung out one after the other without being separated
by a new-line character. Quantum provides limited facilities for reading non-standard data.
☞ For further details, see ‘Reading non-standard data files’ in chapter 10, ‘Include and
substitution’ of the Quantum User’s Guide Volume 2.
There are three ways of writing out your data once it has been read into the C array. You may
write it to a print file, to a data file or to a report file.
Data and print files are both accessed by the write statement, but the exact format of the statement
varies according to the type of file and the information being written. You write to report files using
the report statement.
Print files are printouts of records or parts of records with headings, descriptive texts and page
numbers. They cannot be used as data for subsequent Quantum runs.
Quick Reference
To write a record or part of a record to a print file, type:
The word write by itself prints out a whole record in the form it is when the write statement is
executed, together with a ruler showing which codes fall in which columns, the line number of the
record in the data file and the message ‘write’ indicating that the record was generated by a write
statement. Any multicodes in the record are shown as asterisks, but you may change this with an
option on the filedef statement.
☞ For information on the filedef statement, see section 7.4, ‘Defining the file type’.
If the record contains more than one card, each card is listed separately beneath the ruler. For
example, the statement:
write
produces output such as:
1 in file
----+----1----+----2-- ... --9----+----0
columns 1 - 100 are |12345
write
2 in file
----+----1----+----2-- ... --9----+----0
columns 1 - 100 are |23456
write
Each write statement will produce a line in the default print file, out2, telling you how many records
were written out, as follows:
2 (1%) write
Which cards are printed from multi-card records depends upon which cards have been read in so
far. Quantum looks at the ‘allread’ variables and writes out cards for those which are true; so for
example, if allread1, allread2 and allread3 are true, cards 1, 2 and 3 will be printed. If you have
changed the contents of these variables prior to printing out the record, you will see the cards for
which allread is true rather than those which were originally read.
The example above was very simple; more often than not your program will contain several write
statements and you will want some way of identifying which records were printed by which
statement and why. If the write is dependent upon some other statement — for instance, it is part
of an if statement — the whole statement is printed underneath each record, thus:
67 in file
----+----1----+----2-- ... --9----+----0
columns 1 - 100 are |0015263-16*735 *837361 ... 79&
if (c14n’1/4’) write
Here, as you can see, we are checking whether column 14 contains a 1/4. This record has been
printed out because it contains a ‘5’ instead.
Sometimes it is more helpful to have an explanatory text printed instead of the statement itself. In
this case all that is necessary is to follow the word write with the text to be printed enclosed in dollar
signs:
Record 17 51 in file
----+----1----+----2-- ... --9----+----0
columns 101 - 200 are |00170116548986131*46*1 ...
columns 201 - 300 are |0017026464515 875 ** ...
columns 301 - 400 are |0017031929-5897231 ...
c308 incorrect
too many choices
Record 32 94 in file
----+----1----+----2-- ... --9----+----0
columns 101 - 200 are |003201837021 **53798 ...
columns 201 - 300 are |0032021353452 763736 ...
columns 301 - 400 are |003203212 & ...
too many choices
Our first statement writes out all records in which column 308 does not contain any of the codes
1/5, and the second picks up all records having more than 3 codes in columns 117 to 119.
Normally all output from write goes to the default print file, and whenever the current record is
written to this file, the variable printed_ becomes true. You may change the output file by following
the word write with the name of the file to write to. For example:
All files named on write statements must be defined on a filedef statement before they are used.
☞ For information on the filedef statement, see section 7.4, ‘Defining the file type’.
If two or more write statements apply to a single record, the record is printed out once in the state
it was when the first applicable write was read, with all relevant write statements or texts listed
below it. If a record satisfies two or more write statements which write to different files, Quantum
writes the record out once for each statement, in the state it is when each write is executed.
✎ If you want to write out more than one field at a time, or to print more than one text, you can
define those fields and/or texts on an ident statement. All write statements from that point on
will then print those fields and texts.
☞ To find out more about ident, read section 7.5, ‘Default print parameters for write statements’.
Often you will not want to write out the whole record, especially if it contains several cards.
Therefore Quantum allows you to include a field specification in a write statement to print only
selected portions of an incorrect record. For example:
checks that columns 110 and 119 both contain a 2, and if so prints out columns 110 to 120 in the
print file, followed by the text Married woman. If you are writing out fewer than ten columns,
Quantum does not print a ruler above the codes.
If you are dealing with multi-card records, you may prefer to use this form of write to print only
the card containing the error, rather than all cards in the record. If we take our previous example
where we were checking the contents of column 308:
✎ The write statement can only write out information from the C array.
Quick Reference
To write records or fields to a data file, type:
write may also be used to copy records to a data file. This is useful if you want to separate a
particular card type from the rest of the data, or if you want to correct errors and save the corrected
data in a new file for later tabulation.
write filename
If you use write in a levels job to write data to a new data file, the statement write datafile at
any level will write out data for that level only. Additionally, if the write statement is inside an if
clause, or a return statement is encountered, then only relevant data is written for that level. To
write out data for all levels, you will need one write statement per level.
In all cases, records are written in the state they are when the write is executed, and all cards read
in with the current read are copied; that is, all cards for which thisread is true. For instance, if
thisread1, thisread2 and thisread3 are true, Quantum will write out cards 1, 2 and 3. To prevent any
of these cards being written, you may set the appropriate variable to false (zero); therefore to write out
only card 1 of our three cards, we would write:
thisread2=0; thisread3=0
write newdat
Any number of writes to data files are allowed in the edit, and each one may write to a different file.
Records written by write are normally as long as the record length defined with reclen on the struct
statement. You may change this with len= on the filedef statement. The exception is where records
end with blank columns. In this case Quantum ignores the blank columns. If you want to create a
data file of fixed length records, and your data is single coded, you can use the reportn statement.
If your data is multicoded you can convert it to single coded first by using the explode statement.
☞ For further information about explode, see ‘Converting multicoded data to single-coded data’
in chapter 13, ‘Using subroutines in the edit’.
If your data is multicoded and you need to preserve the multicodes, the only way of writing out
fixed length records if the data currently has trailing blank columns is to insert a dummy code in
the last column of those records.
New cards can be created by copying information into spare columns of the C array. To save these
as part of a new data file you will have to give each new card the same respondent serial number
as the rest of the data in the array and a card type which may or may not be unique. In the example
below, we are moving some information from card 1 of a 2-card data set into a new card 3. The
comments explain what each statement is doing.
Quick Reference
To write information to a report file, type:
Use reportn rather than just report to start a new line each time the statement is executed.
A report file is a special type of print file in which you can print out records, fields or variables in
the format of your choice. To write information in a report file, use the report statement, as follows:
where filename is the name of the file to be written to, and parameters define exactly what is to be
written.
Lines in a report may be up to 1024 characters long. Report does not start a new line automatically
at the end of each write, but you may tell it to do so by following the keyword report with the
letter n:
In both cases, the named file must be identified as a report file using a filedef statement, as
described in section 7.4, ‘Defining the file type’.
The parameter list defines what is to be printed in the report file. It may contain variables, texts,
and special characters representing tabs and spaces.
Data variables
Quick Reference
To print the contents of a data variable, type:
var_name or var_name(start,end)
To print the contents of a field, evaluated as an integer right-justified in a field of a given width,
type:
var_name:field_width
To print the contents of every column in a field, even if they are multicoded or blank, type:
start:field_width
where start is the first position in the field. You may also use this notation to print fields whose
contents evaluate to a value greater than the maximum integer value Quantum can deal with.
All data variables that are single coded are printed using as many positions as there are columns in
the variable. For example, if the data is:
----+----4
511 538253
  2
  &
the statement:
prints the contents of columns 31, 35 and 40 one after the other, as follows:
553
The statement:
538253
In both examples the last column of the field contains a code. If the last column or columns
of a field are blank, Quantum omits those columns when printing the contents of the field. (You
can get round this by entering the field specification as start:field_width as described later in this
section.)
A single data variable that is blank is printed as such, while a single data variable that is multicoded
is printed as an asterisk. The statement:
5 *1
If a variable refers to a string that contains multicoded or blank columns, Quantum ignores the
multicodes and blanks and evaluates the contents of the remaining columns as an integer. For
example, using the data shown above, the statement:
prints a line containing the value 51538253. The value starts in the first print position available.
✎ If the field you wish to print is very long, its contents may produce an incorrect value when
evaluated as an integer (the maximum integer value which Quantum can deal with is
1,073,741,824). You can get round this by specifying the first column and the field width as
described below.
If you want to see all columns in a field which contains blanks or multicodes, or you need to have
the correct evaluation of a long field, you will need to deal with each column in the field separately.
You could type each column number separately, but it is quicker just to specify the start column
and the total number of columns you want to print starting at that column.
start:field_width
The output from this command would be 51* 538253, the same as if you had typed each column
number separately. As before, the data is printed starting in the first print position available.
You can use this alternative notation with field specifications too. In this instance Quantum will
evaluate the contents of the field as an integer and will print the result right-justified in a field of
the given width. If you type, for example:
Quantum will print the value 51538253 in positions 3 to 10 of a ten-position field. The first two
positions will be blank.
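The effect of evaluating a field as an integer and printing it right-justified can be modeled as follows. This Python sketch is not Quantum code. The function names are hypothetical, an asterisk stands for a multicoded column as in the printed output above, and returning 0 for a field that contains no digits is an assumption made to keep the sketch complete.

```python
def field_as_integer(columns):
    """Evaluate a field as an integer, ignoring blanks and multicodes ('*')."""
    digits = "".join(ch for ch in columns if ch.isdigit())
    # Assumption: a field with no digits evaluates to 0.
    return int(digits) if digits else 0


def report_field(columns, field_width):
    """Return the field's integer value right-justified in field_width positions."""
    return str(field_as_integer(columns)).rjust(field_width)


columns_31_to_40 = "51* 538253"   # column 33 is multicoded, column 34 is blank
print(repr(report_field(columns_31_to_40, 10)))
```

The value 51538253 has eight digits, so in a ten-position field the first two positions are blank.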
This notation is also useful if you need to create data files with fixed length records, and some
records end with blank columns. Writing records to a data file preserves multicodes but ignores
trailing blank columns. Writing to a report file allows you to create a single-coded data file with
fixed length records. If your data is multicoded you will need to convert it to single-coded form
before writing it out. You can do this by ‘exploding’ any multicodes into a field of single codes.
You use the explode statement for this.
☞ For information on how to use explode, see ‘Converting multicoded data to single-coded data’
in chapter 13, ‘Using subroutines in the edit’.
Once your data is in single-coded form you can then write the whole record out to a report file using
a reportn statement as follows:
Integer variables
Quick Reference
To print the contents of an integer variable, type:
var_name[:field_width]
If the report statement names a variable by itself, Quantum prints the variable’s value starting in
the first print position available. If the specification includes a field width, Quantum prints the
variable’s value right-justified in a field of the given width. Any extra columns on the left of the
field width are shown as blanks.
var_name[:field_width]
If you type the variable name by itself, without a field width, Quantum prints it left-justified starting
in the first available position on the line. If you would prefer values to be printed right-justified,
follow the variable name with a colon and a field width. Quantum will then print all values for that
variable right-justified in a field of the width you have given. For example:
prints the values of the variable called codenums right-justified in a field five positions wide.
Values that are shorter than five characters are padded on the left with blanks.
Real variables
Quick Reference
To print the value of a real variable, type:
var_name[:field_width.dec_places]
where field_width is the width of the field in which the values are to be printed (values are right-
justified and padded on the left with blanks if necessary) and dec_places is the number of decimal
places to be shown for each number. If you omit these parameters, Quantum prints the values
starting in the first available print position and with six decimal places.
var_name[:field_width.dec_places]
If you type the variable name by itself, without a field width and a number of decimal places,
Quantum will print the variable’s value with six decimal places and starting in the first print
position available.
You can control the layout by defining a field width and the number of decimal places required.
For example, by typing:
you can create a neat column of figures all with two decimal places and all right-justified in a field
six characters wide.
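The layout rules for integer and real variables can be summarized in a short model. The Python sketch below is an illustration only: the function names are hypothetical, and what happens when a value is wider than the field is not described here, so the sketch simply lets such a value overflow.

```python
def report_integer(value, field_width=None):
    """Print from the first available position, or right-justified in a field."""
    text = str(value)
    return text if field_width is None else text.rjust(field_width)


def report_real(value, field_width=None, dec_places=6):
    """Six decimal places by default; right-justified when a width is given."""
    text = f"{value:.{dec_places}f}"
    return text if field_width is None else text.rjust(field_width)


print(repr(report_integer(42, 5)))        # padded on the left with blanks
print(repr(report_real(3.5)))             # default of six decimal places
print(repr(report_real(3.14159, 6, 2)))   # two decimal places, width six
```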
Quick Reference
To print text, type:
$text$
[number]x
print_post
Most reports require some sort of text or spacing on the line, either on the same line as the values
or on lines by themselves to create titles, column headings, and the like.
$text$
To print spaces between the values on a line, you can either use spaces or tabs. To print a given
number of spaces between one value and the next, type:
[number]x
where number is the number of spaces required. The default is one space.
If you are producing tabular or columnar output you’ll probably find tabs are more useful for
creating blank space since they allow you to skip to a particular print position on the line. For
example, typing:
25t
takes you directly to position 25 on the line, regardless of the current print position. Compare this
with 25x which moves you 25 positions on from your current position.
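The difference between the two kinds of spacing is easier to see side by side. The following Python sketch models a report line. It is not Quantum code, the class name ReportLine is hypothetical, and it assumes that a tab to a position already passed leaves the line unchanged, which is not stated in this chapter.

```python
class ReportLine:
    """Model of one line of a report being built from left to right."""

    def __init__(self):
        self.text = ""

    def put(self, value):
        """Print a value starting in the first available position."""
        self.text += str(value)
        return self

    def x(self, number=1):
        """Move on by a number of spaces from the current position (default 1)."""
        self.text += " " * number
        return self

    def t(self, position):
        """Skip to an absolute print position, counted from 1."""
        # Assumption: if the line is already past this position, nothing happens.
        self.text = self.text.ljust(position - 1)
        return self


print(repr(ReportLine().put("AB").t(5).put("C").text))   # C lands in position 5
print(repr(ReportLine().put("AB").x(5).put("C").text))   # C is 5 spaces after AB
```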
Examples
in a file called summary. Printing starts in position (column) 20 because we started the parameter
list with the keyword 20t. The variable brda is an integer variable whose value is to be right-
justified in a field three columns wide. Notice also how we have inserted spaces between the texts
and the value of brda.
The statements:
/* only print title if this is the first record in the data file
if (.not. rchk)
+reportn yogurt 30t,$Serial Numbers for Yogurt Buyers$
if (c119’1’) reportn yogurt c(1,4)
.
rchk = 1
produce a report showing the serial numbers of all respondents who buy yogurt. As you can see,
we have given our report a title.
As a final example, let’s look at the difference between printing a field of columns all in one go and
printing them one at a time. If our data is:
+----4----+
  18 036
   &  /
      7
the statement:
c(37,43) is 106
c(37,43) is 1* 0*6
✎ You cannot write information to the standard print file (usually called out2) using report. To
do this use the function qfprnt.
☞ For information about qfprnt, see section 7.6, ‘Writing out data in a user-defined format’.
Quick Reference
To define a report file, type:
where mpa, mpd and mpe indicate that multipunches should be printed across the page, down the
page, or as an asterisk and then listed below the record.
All files named on write and report statements must be defined by a filedef statement before they
are used. This tells Quantum whether the file is a report, print or data file, and defines more
specifically how the output should be written. So that you can be sure that all filenames will be
recognized, you are advised to place all filedef statements at the beginning of the edit.
where filename is the name of the report file and report is a mandatory keyword indicating that the
file is a report file.
✎ If you are writing out more than 200 characters to a report file, you need to set len= on the
filedef statement to more than 200 to ensure that no lines are truncated.
Quantum normally creates report files in the main project directory. If you want the report file to
be created in a different directory, follow the filename with =pathname. When specifying a
pathname, the filename acts as a short-hand reference (tag). This means that you still have to tell
Quantum the filename by appending it to the pathname.
For example, to declare a report file called repfile1 that is to be created in the directory /home/ben,
you would write:
where filename is the name of the output file and data is a mandatory keyword indicating that the
named file is a data file. As with report files you may use the optional =pathname parameter to
name the directory in which the data file should be created.
All records written to data files are as long as the record length defined with reclen on the struct
statement. If you wish to change this, add the option len=reclen to the filedef statement, thus:
This example says that records written to the data file newdat1 must be 80 columns long.
where filename is the name of the print file with an optional pathname, print is a mandatory
keyword indicating that the file is a printout file, and options is a list of optional keywords defining
more specifically how the records should be written. Filename lengths are as described above for
data files.
len=n Length of output record if different from reclen= on the struct statement.
$text$ Heading text to be printed at the top of each page.
mpa Prints the codes in a multicode across the page enclosed in curly brackets. For example:
000401 635495{134}45111
Here, we have a multicode of ‘134’. The ruler is of little use when multicodes are printed
in this manner, so you may prefer to suppress it with the option norule.
mpd Prints the codes in a multicode down the page, thus:
----+----1----+----2
000401 635495145111
             3
             4
mpe Prints multicodes as an asterisk, but lists the individual codes within each multicode
beneath the record. For example:
----+----1----+----2
000401 635495*45111
Column 14 contains codes 134
norule Turns off the ruler.
noser Prevents the messages ‘Record nnn’ and ‘n in File’ from being printed.
The default output file is a print file called out2, and the default output style is as described above.
To change the output style for this (for example, to suppress the ruler or print multicodes in a
different format), simply use a filedef statement naming this file and giving the appropriate options
from the list above:
Quick Reference
To define default print parameters for write statements, type:
Any number of texts, variable names and fields are allowed. Items are printed in the order they are
listed.
To turn off ident defaults and return to the standard write behavior, type:
noident
The ident statement gives you increased control over the content of the print file by allowing you
to print more than one field of columns and one text per write statement.
Each ident statement may contain any number of texts, variable names and columns as long as each
one is separated from the others by a comma. The order in which you define items with this
statement controls the order in which they will be printed. For example, if you type:
and Quantum finds a record which fails this test, it will print the following:
Notice that the text defined with ident does not replace the text given with write. If you do not
define a message on the write statement, Quantum will print the complete statement as it usually
does.
In this example there is not much difference between using ident and writing the test as:
The real power comes when you want to write out more than one field and/or text per write
statement, or if you want to write out the values of data, integer or real variables. For example, if
you type:
t(1) is 10
t(2) is 15
t(3) is 20
in the print file (the values reported will, of course, be the values of the variables as they are in your
run).
✎ In ident statements you can refer to a field of adjacent entries in a data variable array by
specifying the first and last entries. For example, you can specify c(1,12) to refer to columns
1 through 12 of the C array. However, like most other Quantum statements, you cannot use
this syntax for other types of variable, such as integer arrays.
ident t(1,3)
You can combine texts, columns and variable names. The statements:
might print:
You could use this type of output for checking records which may be incorrectly coded for use with
field and bit statements.
☞ For information about field, see section 8.6, ‘Reading numeric codes into an array’.
For information about bit, see section 4.4, ‘Responses with numeric codes: bit’ in the
Quantum User’s Guide Volume 2.
When ident writes out data variables, it prints the data according to the specification on the filedef
statement for the file to which you are writing the data. If the filedef statement includes the keyword
norule to suppress the ruler, the data is written out without a ruler, otherwise the ruler is always
printed above the data, as in the previous example.
You can alter this behavior without having to respecify the filedef command by typing a + or − sign
at the end of the ident keyword. If filedef normally requests a ruler, type:
to print the listed variables without a ruler. If filedef normally suppresses the ruler, type:
To switch off ident and revert to the standard write behavior, type:
noident
Quick Reference
To write data to the standard print file (usually called out2) in a format of your choice, type:
where format defines the format in which the data is to be written and the data types of the variables
used. variables is a comma-separated list of the variables to be written out. Variables must be listed
in the order they are used in the format statement.
The format string consists of optional text interspersed with references to variables in the list:
%num_posi Print an integer variable in the next num_pos positions on the line. If the
variable has a negative value the value is printed starting with a minus
sign.
%num_pos.dec_plr Print a real variable in the next num_pos positions on the line and with
dec_pl decimal places. The number of print positions must allow for the
required number of decimal places and a decimal point.
%num_colc Print num_col columns starting with the column whose name or number
appears in the variable list. Columns are printed as texts not punch codes;
that is, multicodes are converted to letters where possible.
%numberb Print number blank spaces.
write and report are both powerful statements for writing out data, but they do have limitations
which you may find restrictive in some circumstances. The write statement lets you write data out
to a print file, including the standard print file (usually called out2), but it always writes the data in
a fixed format that you cannot change. The report statement lets you write out data and text in any
format you like, but only to a report file. You cannot write to a print file with report.
The qfprnt function brings together the functionality of write and report by writing text and data to
the standard print file in a format of your choice. To use it, type:
where format defines the format in which the data is to be written and the data types of the variables
used. variables is a comma-separated list of the variables to be written out. Variables must be listed
in the order they are used in the format statement.
If the respondent tested five products this statement will appear in the standard print file as:
The underscore character in front of the 5 represents a space and appears as such in the print file.
We’ll explain why we have printed it here shortly. First, let’s look at the qfprnt statement itself.
The format section of the statement consists of text to be printed exactly as it is written and
references to variables whose values are to be substituted in the text at the given points. In this
example we are writing out the value of the numeric (integer) variable t1. The variable is named in
the variable list section of the statement and is represented by the characters %2i in the format
section.
There are three parts to the variable’s reference. The % sign signals to Quantum that it has reached
a variable reference: all references start with a % sign. The i says that the variable is an integer
variable and the 2 says how many print positions to reserve for printing this variable. In the example
two positions are reserved for printing the value of t1, but since the value of t1 is only 5, Quantum
prints the value on the right of the reserved space and fills the remaining positions with spaces. In
the sample output we have used an underscore to represent this space.
As before, the underscore represents a space used to pad a value to the full field width.
This qfprnt statement produces the correct results because the variables are in the same order as
their references in the format section. This is your responsibility. As long as a variable has the same
type as the reference in the corresponding position in the format section, Quantum will print its
value at that point in the statement. So, if we had written:
As you can see, Quantum does not increase the number of print positions to accommodate the value
it needs to print. Instead, it prints asterisks. In this example, the asterisks would alert you to the fact
that there is something wrong with the qfprnt specification, but this would not always be so.
More often than not you’ll be printing positive values. If Quantum needs to print a negative
number, it prints the minus sign directly in front of the first digit, just as you would write it
manually.
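The padding and overflow behavior of an integer reference such as %2i can be imitated outside Quantum. The following Python sketch is an illustration only, not Quantum code: the function name format_int is hypothetical, and it assumes that a value too long for its field fills the whole reserved width with asterisks, which the text above does not state explicitly.

```python
def format_int(value: int, width: int) -> str:
    """Imitate an integer reference such as %2i (illustrative only)."""
    # str() puts the minus sign of a negative value directly before the first digit.
    text = str(value)
    if len(text) > width:
        # Assumption: the whole reserved field is filled with asterisks.
        return "*" * width
    # Right-justify the value and pad the remaining positions with spaces.
    return text.rjust(width)


# Show padding spaces as underscores, as the sample output in the text does.
print(format_int(5, 2).replace(" ", "_"))
print(format_int(123, 2))
print(format_int(-7, 3).replace(" ", "_"))
```

The field width is never enlarged: a value either fits and is right-justified, or is replaced by asterisks.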
Besides integer variables, you can also print real variables, columns or fields of columns and blank
strings. You use a reference similar to the one you’ve seen for integer variables.
%num_pos.dec_plr
where num_pos is the number of print positions required and dec_pl is the number of decimal
places. As an example, the statement:
prints the value of the real variable called liters in a field 5 positions wide. The value is printed with
two decimal places so, allowing for the decimal point, the maximum value that can be printed is
99.99:
Quantum can also print the text values of a column, a field of columns or a data variable. By this
we mean that Quantum converts multicodes to letters or other keyboard characters before printing
them. Multicodes that do not correspond to letters or characters are printed as asterisks. For
example, the multicode ‘&1’ translates into the letter A and would be printed as such; the multicode
‘&123’ is simply a collection of codes and would therefore be printed as an asterisk.
%numberc
in the format section, where number is the number of print positions required, and the name of a
single column in the corresponding position in the variable list. Quantum will then print number
columns starting at the named column. For example:
might produce:
----+---2
9462&5736
5 1 8
9
The statement:
%numberb
where number is the number of blanks you want. You’ll find this useful if you want to indent lines
or print values in columns.
This chapter describes how to assign values to variables and the statements emit, delete and
priority, all of which may be used to alter the contents of a variable. Emit, delete and priority are
used only with columns whereas assignment statements can deal with character, integer and real
variables.
When we say that these statements change the contents of a column we mean that they change the
contents of that column as it exists during the run: at no time do they change the corresponding
column in the data file.
An assignment statement normally means ‘put the specified information into the given variable
overwriting anything already in that variable’. It can be used with any type of variable to perform
any of the following tasks:
• To replace certain codes in one column with those from a second column.
• To copy codes from groups of columns into another column using the logical operators and, or
and xor.
In spite of the diversity of these functions the basic format of any assignment statement is:
variable=item
Remember that comments can be identified by an uppercase C in column 1. If the first variable in
your statement starts with a C, make sure that you type it in lower case otherwise the whole line
will be read as a comment and will be ignored. For example:
Alternatively, you may precede assignment statements with the word set, thus:
set c(15,16)=$12$
Copying codes
Quick Reference
To copy codes into a single data variable, overwriting the variable’s original contents, type:
variable=’codes’
var_name(start,end)=$codes$
variable1 = variable2
Assignment statements are most commonly used to copy codes into a column or to copy the
contents of one variable into another. For instance:
c121=’159’
c121=c134
In the first example we are copying the codes 1, 5 and 9 into column 121 overwriting whatever is
already there. The second example copies everything in column 134 into column 121, again
overwriting what was originally there. Column 134 remains unchanged.
You can also copy strings of characters into fields of columns. Let’s say we want to copy the code
59642 into columns 76 to 80 of card 3; we would write:
c(376,380)=$59642$
Notice that the characters to be copied into the array are enclosed in dollar signs as is the rule when
dealing with strings.
\;
Quantum uses a semicolon to mark the end of a statement, and will issue an error message if it finds
a semicolon by itself in the middle of a string. The backslash in front of the semicolon tells
Quantum to read the next character as an ordinary character with no special meaning. For example:
c(376,380)=$59\;42$
When characters are being copied into columns, the equals sign may be omitted:
Just as the contents of a single column can be copied into another, so the contents of one field can
be copied into another field. For example:
c(10,19)=c(70,79) or c(20,22)=c(45,47)
copies the contents of c(70,79) into c(10,19) and the contents of c(45,47) into c(20,22), in both
cases overwriting the original contents of those columns.
Data variables in assignment statements may be subscripted. The following are valid:
c(t1)=c145
c(178,180)=c(t4,t5)
c(t3,t5)=c(t10,t10+2)
When subscripting columns, remember that the current values of the integer variables will be
substituted in the expression before the statement itself is executed. If t3=120 and t10=240, the
statement:
c(t3,t3+2)=c(t10,t10+2)
means:
c(120,122)=c(240,242)
Generally you will know how many columns are required to hold the information they will
receive, but this is not always the case. What if the field on the left of the equals sign is longer than
the string to be copied into it? Quantum always copies a string starting with the right-most column
and transferring it into the right-most column of the field. It continues in this way until all
characters have been copied, then if there are still columns left in the field they are reset to blanks.
When strings are copied in this way they are called ‘right-justified and blank-padded’.
and we enter:
c(241,245)=c(185,187)
If there are fewer characters than there are columns in the field, the characters are right-justified in
the field with the remaining columns set to blanks. If the reverse is true, and there are more
characters than there are columns in the field, the error message ‘Attempt to set too many columns
into too few columns’ is issued.
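The right-justification rule can be sketched in Python as follows. This is an illustration of the behavior described above, not Quantum code; the function name copy_into_field is hypothetical, and a Python exception stands in for Quantum's error message.

```python
def copy_into_field(chars: str, width: int) -> str:
    """Right-justify chars in a field of the given width, blank-padded."""
    if len(chars) > width:
        # Mirrors the error message quoted in the text.
        raise ValueError("Attempt to set too many columns into too few columns")
    # Copying starts from the right-most column; unused columns become blanks.
    return chars.rjust(width)


# Three characters copied into a five-column field, as in c(241,245)=c(185,187).
print(repr(copy_into_field("123", 5)))
```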
c(145,150)=c(143,148)
copies the contents of columns 143 to 148 into columns 145 to 150, so:
----+----5 ----+----5
83645902 becomes 83836459
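The overlapping copy above behaves as if the whole source field were read before any destination column is written. The Python sketch below reproduces the example under that assumption, which is inferred from the result shown rather than stated in the text; the helper copy_columns and the dictionary representation of the card are hypothetical.

```python
def copy_columns(card: dict, dest: tuple, src: tuple) -> None:
    """Copy the src column range into the dest column range of card, in place."""
    src_start, src_end = src
    dest_start, dest_end = dest
    # Assumption: the source is read in full first, so overlap cannot corrupt it.
    chars = [card[col] for col in range(src_start, src_end + 1)]
    width = dest_end - dest_start + 1
    if len(chars) > width:
        raise ValueError("Attempt to set too many columns into too few columns")
    # Right-justify and blank-pad, as for any field copy.
    padded = [" "] * (width - len(chars)) + chars
    for offset, ch in enumerate(padded):
        card[dest_start + offset] = ch


# Columns 143 to 150 hold 83645902, as in the example above.
card = {143 + i: ch for i, ch in enumerate("83645902")}
copy_columns(card, dest=(145, 150), src=(143, 148))
result = "".join(card[col] for col in range(143, 151))
print(result)
```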
When a field is set to blanks it is never wrong to type in as many blanks (enclosed in dollar signs)
as there are columns in the field, but it is much quicker and more efficient to type, say:
c(301,380)=$ $
Quick Reference
To replace a code or set of codes in one data variable with a code or set of codes in a second data
variable, type:
variable1’codes1’=variable2’codes2’
codes1 and codes2 must contain the same number of codes, and the codes must be in
superimposable order (e.g., ‘123’ and ‘456’, but not ‘123’ and ‘135’).
Assignment statements are also used to replace parts of one column with those of another, leaving
the remaining contents of that column intact. Note that this is the only time that assignment does
not overwrite everything in the recipient variable. Let’s start with a simple example. Suppose we
have:
and we want column 124 to contain a ‘1’ only if column 159 contains a ‘7’. We would write:
c124’1’=c159’7’
However, if we wrote:
c124’3’=c159’3’
meaning that c124 should only contain a ‘3’ if c159 contains a ‘3’, Quantum would give us:
As you can see, the ‘3’ in c124 has been deleted because there is no ‘3’ in c159. Both examples
could equally well be written using if, else, emit and delete, but an assignment statement is much
more efficient when you have a set of codes to check for.
☞ For further information about if, see section 9.1, ‘Statements of condition – if’.
For further information about else, see section 9.2, ‘Statements of condition – else’.
For further information about emit, see section 8.2, ‘Adding codes into a column’.
For further information about delete, see section 8.3, ‘Deleting codes from a column’.
c10’123’=c11’456’
+----1----+
14
35
4
+----1----+
14
25
4
Column 10 contains a ‘1’ and a ‘2’ because c11 contains a ‘4’ and a ‘5’. The ‘3’ that was originally
there has been removed because there was no ‘6’ in c11. The ‘4’ in column 10 remains untouched
because it has no corresponding code in c11.
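Partial assignment can be pictured as a code-by-code test: each code in the left-hand list is set or removed according to whether its partner in the right-hand list is present. The Python sketch below models columns as sets of code characters; the function name partial_assign and the set representation are illustrative assumptions, not part of Quantum.

```python
def partial_assign(dest: set, codes1: str, src: set, codes2: str) -> set:
    """Model dest'codes1' = src'codes2' and return the new contents of dest."""
    result = set(dest)
    for code1, code2 in zip(codes1, codes2):
        if code2 in src:
            result.add(code1)      # partner code present: set the code
        else:
            result.discard(code1)  # partner code absent: remove the code
    # Codes of dest that are not named in codes1 are left untouched.
    return result


# c10'123'=c11'456' with c10 holding 1, 3, 4 and c11 holding 4, 5.
c10 = partial_assign({"1", "3", "4"}, "123", {"4", "5"}, "456")
print(sorted(c10))
```

Because the result is built from a copy, the same sketch also covers statements that name the same column on both sides of the equals sign.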
Partial assignment need not have different column numbers either side of the equals sign. Quantum
accepts statements of the form:
c127’0/3’ = c127’1/4’
which can be used for recoding incorrectly coded data. Following the rule described above, the
example places a ‘0’ in column 127 if the column contained a ‘1’, a ‘1’ if it contained a ‘2’, and so on.
When entering codes with this type of statement, make sure that there are the same number of codes
on either side of the equals sign and that they are in the same relative positions in the order
&-0123456789. In the previous example we used ‘123’ and ‘456’. We could also have used ‘&-0’,
‘789’ or ‘234’ instead of ‘456’, to name but a few alternatives. The important thing is that the two
groups follow the same pattern: if the first set names alternate codes (for example, ‘1357’) then so
must the second (for example, ‘&024’).
c21’&-0’=c92’456’
c21’05’=c86’49’
c56’ 0’=c91’15’
c78’123’=c81’367’
The statement for columns 56 and 91 is incorrect because blank is not a valid code here; the
statement for columns 78 and 81 is wrong because the codes ‘367’ cannot be superimposed on
‘123’ (either 345 or 567 would be correct).
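One way to check two code lists before writing such a statement is to compare their positions in the order &-0123456789: the lists are superimposable when every pair of codes is the same distance apart. The Python sketch below is an illustrative check only; the function name is_superimposable is hypothetical, and it handles explicit code lists, not ranges such as ‘0/3’.

```python
CODE_ORDER = "&-0123456789"


def is_superimposable(codes1: str, codes2: str) -> bool:
    """Return True if codes2 follows the same pattern as codes1."""
    if not codes1 or len(codes1) != len(codes2):
        return False
    # Blank and any other character outside the code order are rejected.
    if any(ch not in CODE_ORDER for ch in codes1 + codes2):
        return False
    # Every code must be shifted by the same number of positions.
    shifts = {
        CODE_ORDER.index(b) - CODE_ORDER.index(a)
        for a, b in zip(codes1, codes2)
    }
    return len(shifts) == 1


for left, right in [("&-0", "456"), ("05", "49"), (" 0", "15"), ("123", "367")]:
    print(repr(left), repr(right), is_superimposable(left, right))
```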
Quick Reference
To store the value of an arithmetic expression in a variable, type:
variable = expression
In many of your Quantum programs you will need to save the result of some arithmetic expression
in a variable. The variable may be a column or an integer or real variable and the arithmetic
information may be the contents of a column, integer or real variable, an integer or real number, or
the results of the functions numb or random. It can also include arithmetic expressions which have
been manipulated using the arithmetic operators +, −, / and *. Here are some examples to start with:
var1=100
/* Next statement expects that variable ntim is < 10
c135=ntim
/* In next example, if c31’5678’, variable np=4
np=numb(c31)
/* Increment rect (record total) by 1 for each record processed
rect=rect+1
Copying a number into an integer or real variable is easy because the variable has no predetermined
size — that is, Quantum does not say that such variables may only store numbers of up to, say, three
digits. Integer variables can store any whole number in the range -2,147,483,648 to +2,147,483,647
and real variables may take values of any magnitude with six digits of accuracy.
Suppose our questionnaire tells us how many pints of milk a respondent bought and we want to
save this in an integer variable called npt. Here’s what we might write:
npt=c(125,126)
Similarly, if we know how many miles the respondent travels to work each day, and we want to
convert this to kilometers, we could save the conversion in a real variable called km:
km=c(213,214) * 1.609
If the respondent travels 5 miles, km will have the value 8.045, but if he or she travels 9 miles, km
would be 14.481.
The main difference between the two examples is the type of variable in which the results are saved.
The number of pints bought will always be a whole number so we save it in an integer variable,
whereas the conversion from miles to kilometers is likely to produce a real number so we save it in
a real variable.
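The arithmetic in the two examples can be checked with a few lines of Python. This is only a sketch of the calculation: the function name miles_to_km is hypothetical, and Python's int and float types stand in for Quantum's integer and real variables.

```python
def miles_to_km(miles: int) -> float:
    """Convert whole miles to kilometers using the factor from the text."""
    return miles * 1.609


npt = 3                # a count of pints is a whole number, so an integer suits it
km5 = miles_to_km(5)   # a conversion usually has a fractional part, so a real suits it
km9 = miles_to_km(9)
print(npt, round(km5, 3), round(km9, 3))
```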
When copying a real value into an integer variable or vice versa, remember that the accuracy of the
result depends upon the type of variable in which the value is saved. Real values saved in integer
variables are truncated at the decimal point, thus:
but integer values placed in a real variable are saved as reals with decimal places and accuracy to
6 significant figures:
Integer variables are often used to count the number of respondents having a specific characteristic.
For instance, to count the number of respondents holidaying at home and the number taking
holidays abroad we can say,
☞ This example uses the if statement that is described in chapter 9, ‘Flow control’.
Whenever a record is read with c113’1’, the variable home will be incremented by one and
whenever a record is read with c113’2’ the variable abroad will be increased by 1.
Let’s say we have five respondents who took the following holidays:
At the start of the run, the variables home and abroad are both zero. After these records have been
processed, home will equal 3 and abroad will be 2. The person unlucky enough to have no holiday
at all will be ignored.
In the example above we were accumulating information about holiday habits for all respondents
together, but on many occasions you will want to store information on a per respondent basis
instead. Normally, integer and real variables are not reset between respondents, but all you need
do to overcome this is to enter a statement at the start of your edit to reset the variable in question
to zero each time a new record is read. For instance:
home=0
☞ We will discuss in more detail the times when you might want to do this when we describe the
do statement in section 9.5, ‘Loops’.
Columns which contain single codes may be treated as a whole number. For instance, if our data is:
+----2----+
4922
the statement:
value=c(219,222)
will assign the value 4922 to value. If any of the columns are blank or multicoded in any way, they
are ignored.
+----2----+ +----2----+
49 2 and 4912
2
Columns
Columns may also store arithmetic information, but unlike other variables they have a predefined
size which means they can only store numbers of a certain size. For instance, c(1,10) can store
numbers of up to ten digits whereas c(1,3) only stores numbers of up to three digits.
If the number is negative Quantum places the minus sign in the column immediately to the left of
the first digit, but if there are no spare columns the first digit will be dropped and the minus sign
placed in the left-hand column. If t5=−278, the statement:
but:
Note that this does not hold true for negative numbers whose length exceeds the field width by
more than one character. Then, the number is copied into the field from the right and the minus sign
and any excess digits are ignored. Thus, if t5=−1278, c(42,44) will contain the number 278.
If the value to be saved has fewer digits than there are columns in the field, it will be right-justified
in the field and the remaining columns padded with zeros.
When copying real numbers into columns, Quantum needs to know how many decimal places are
required. This is done by following the variable with a colon and a digit defining the number of
places. For example, if x5=10.22, the statement:
cx(15,19):2=x5
results in:
----+----2----
10.22
If the real number has more decimal places than we have allowed for, say 3 instead of 2, the extra
decimal places will be ignored.
Quick Reference
To copy codes which are present in at least one of a list of columns, type:
To copy codes which are present in only one of a list of columns, type:
If any of these statements includes codes (p), only those codes are checked for. Any unlisted codes
are then ignored.
The final type of assignment is copying codes from a set of columns. The codes copied depend
upon the type of operator used:
where ca, cb, and cc are the columns whose codes are to be compared. Note that even if you are
comparing codes in consecutive columns, each column must be identified separately, preceded by
a c.
Suppose we have:
----+----4
111
/22
453
77
and we type:
c181=and(c137,c138,c139)
Notice that even though the codes ‘3’ and ‘7’ appear in more than one column they are not copied
to c181 because they are not common to all columns.
Let’s take the same three columns with the or operator. We type:
c182=or(c137,c138,c139)
c182 contains a list of all codes present in at least one of the named columns.
c183=xor(c137,c138,c139)
yields:
Here only two codes have been copied because all other codes appear in more than one column. If
one column was blank, this would be ignored if there were other codes unique to one column. Only
if there were no other unique codes would column 183 be blank. For instance, if we have
c11=’ ’, c12=’12’, c13=’13’ and we type:
c14=xor(c11,c12,c13)
we would have c14=’23’, but if c13 were to contain a ‘12’ instead, c14 would be blank.
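The three operators correspond to familiar set operations if each column is pictured as a set of codes: and keeps codes found in every column, or keeps codes found in at least one column, and xor keeps codes found in exactly one column. The Python sketch below illustrates this; the function names and the set representation are assumptions made for the illustration.

```python
from collections import Counter


def and_codes(*columns: set) -> set:
    """Codes common to all columns."""
    return set.intersection(*columns)


def or_codes(*columns: set) -> set:
    """Codes present in at least one column."""
    return set.union(*columns)


def xor_codes(*columns: set) -> set:
    """Codes present in exactly one column."""
    counts = Counter(code for column in columns for code in column)
    return {code for code, n in counts.items() if n == 1}


# The example from the text: c11 is blank, c12 holds 12, c13 holds 13.
c11, c12, c13 = set(), {"1", "2"}, {"1", "3"}
print(sorted(xor_codes(c11, c12, c13)))
```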
All our examples so far have referred to whole columns, but sometimes you will only be interested
in specific codes in those columns. To write this in Quantum, follow each column number with the
positions to be checked enclosed in single quotes. Any unnamed codes in those columns are then
automatically ignored. Here is an example. Our data is:
----+----4----+----5
1 1 2
/ 3 /
5 5 6
Even though columns 31, 41 and 45 all contain a ‘3’ and a ‘5’, Quantum only copies the ‘3’ because
the ‘5’ is not part of our specification. We have used the same code specification for all three
columns, but you can use whatever combination you like.
✎ These types of statement are extremely useful for setting up shorthand references to the codes
present in a group of columns. Say, for instance, that you wanted various statements
throughout the edit to be executed only if there was a ‘1’ in one or more of c110, c112, c120
and c125. You can always write out each column and code separately each time:
if(c110’1’.or.c112’1’.or.c120’1’.or.c125’1’) .....
but it is simpler and much more efficient to say:
c181=or(c110,c112,c120,c125)
if (c181’1’) ...
especially if you will need to refer to the contents of these columns again later on in the edit.
This facility may also be used to simplify what would otherwise be complicated filter
conditions in the tabulation section.
Quick Reference
To add codes into a column in addition to those that are already there, type:
The emit statement inserts codes into a column leaving the original contents intact. Its format is:
emit cn’p’
Suppose we have:
----+----7
4
5
&
----+----7
3
4
5
&
More than one column may be entered on each line, provided that each one is separated by a
comma.
✎ emit can only be used with single columns; string variables are not valid: emit c(109,110)$99$
does not work.
Quick Reference
To delete selected codes from a column, type:
The delete statement is the opposite of emit in that it deletes codes from a column leaving the
remainder intact. Its format is:
delete cn’p’
Suppose we have:
+----1----+
5
6
8
9
+----1----+
6
8
9
More than one deletion may be effected with the same delete statement as long as each column is
separated by a comma.
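Seen side by side, emit and delete are mirror images: one adds codes to a column without disturbing the rest, the other removes codes without disturbing the rest. The Python sketch below models a column as a set of codes; the function names and the sample values are illustrative assumptions.

```python
def emit(column: set, codes: str) -> set:
    """Add codes to a column, keeping its existing contents."""
    return column | set(codes)


def delete(column: set, codes: str) -> set:
    """Remove codes from a column, keeping the remainder."""
    return column - set(codes)


# Illustrative values: add a 3 to a column holding 4, 5 and &,
# then remove a 5 from a column holding 5, 6, 8 and 9.
after_emit = emit({"4", "5", "&"}, "3")
after_delete = delete({"5", "6", "8", "9"}, "5")
print(sorted(after_emit), sorted(after_delete))
```

Emitting a code that is already present, or deleting one that is absent, leaves the column unchanged in this model.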
Quick Reference
To force single-coding of multicoded columns, type:
where a code at the start of the list should be accepted in preference to any later in the list.
Sometimes when you are cleaning your data you will come across a column which is multicoded
when it ought to contain only one code. You can either print out the record and change the incorrect
codes later or you can have Quantum do it for you automatically. When data is to be corrected
automatically, you will need to write a statement saying which codes should be discarded and
which are to be kept. Obviously, there can be no hard and fast rule since the codes may vary
between questionnaires, so what you may do is assign each code a priority so that when a certain
code is found Quantum knows that all others in that column are to be deleted.
where cn is the column whose codes are to be checked and ‘code1’ to ‘coden’ are the positions to
check, entered in order of priority, the most important first.
✎ priority checks only the listed positions; if any other codes are present they are ignored.
Suppose one of the questions in a survey asks respondents to give their overall opinion of a product,
rated on a scale of 1 (Poor) to 5 (Excellent). You have been told that if the question has accidentally
been multicoded you are to assume that the higher rating is correct and delete the lower rating from
the column. You will not know beforehand exactly what multicodes there are, if any, but you will
know the column and the possible codes it may contain, and also that low codes should be discarded
in favor of high ones. If this question is coded into column 249, you could write:
This causes Quantum to scan column 249 to see first whether it contains a ‘5’ and, if so, to delete
all subsequent codes in the list. If c249 contains a ‘5’ and nothing else, obviously there will be no
extra codes to delete; this does not matter. If there is no ‘5’ in c249, Quantum then checks whether
it contains a ‘4’; if so, any other codes in the range ‘1/3’ are deleted, otherwise the program skips
to the next code in the list and checks for that. If none of the listed codes are found, the column
remains unchanged.
If our first record has c249’53’ Quantum will give us c249=’5’, but if the second has c249’942’ we
will end up with c249’94’; the ‘9’ has not been removed because it was not one of the named
positions.
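The scan described above can be sketched in Python: walk the listed codes in priority order, and as soon as one is found, delete every code that comes later in the list. The function name priority and the set representation of a column are assumptions made for this illustration.

```python
def priority(column: set, ordered_codes: list) -> set:
    """Keep the highest-priority listed code and drop the lower-priority ones."""
    result = set(column)
    for position, code in enumerate(ordered_codes):
        if code in result:
            # Delete all listed codes of lower priority; unlisted codes stay.
            result -= set(ordered_codes[position + 1:])
            break
    return result


ratings = ["5", "4", "3", "2", "1"]  # highest rating first
print(sorted(priority({"5", "3"}, ratings)))
print(sorted(priority({"9", "4", "2"}, ratings)))
```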
You can also use priority to force a field to be single-coded simply by listing the columns and codes
to be checked in order of importance. If a listed code is found in the first column, any other listed
codes will be removed from that column, as will any that appear in subsequent columns. For
example, if our record is:
-----+----6
22
3
5
and we write:
-----+----6
2
However:
-----+----6 -----+----6
22 would become 2&
3
&
In the previous example, we have named two different columns on the same priority statement
because together they form a field which must be single coded overall. If you want to force two
completely separate columns to be single-coded, you must write two priority statements, one for
each column. If our data is:
+----3----+
21
33
6
the statement
priority c129’1’,’2’,’3’,c130’1’,’2’,’3’
+----3----+
26
but:
results in:
+----3----+
21
6
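The difference between one priority statement covering a field and separate statements for each column can be reproduced with a small Python model. The sketch is illustrative: the function priority_field, the dictionary of column sets and the list of (column, code) pairs are hypothetical representations of the statement.

```python
def priority_field(record: dict, specs: list) -> dict:
    """Apply one priority list of (column, code) pairs across several columns."""
    result = {col: set(codes) for col, codes in record.items()}
    found = False
    for col, code in specs:
        if code in result[col]:
            if found:
                result[col].discard(code)  # a higher-priority code already won
            else:
                found = True               # first listed code present is kept
    return result


record = {129: {"2", "3"}, 130: {"1", "3", "6"}}
codes = ["1", "2", "3"]
spec129 = [(129, c) for c in codes]
spec130 = [(130, c) for c in codes]

# One statement naming both columns: the field is single-coded overall.
together = priority_field(record, spec129 + spec130)
# Two statements: each column is single-coded on its own.
separate = priority_field(priority_field(record, spec129), spec130)
print(together, separate)
```

In both cases the ‘6’ in column 130 survives because it is not one of the listed codes.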
Quick Reference
To choose a random code from a list of codes, type:
data_var_name=rpunch(’codes’)
data_var_name=rpunch(col_number)
Occasionally you may wish to set a random code into a column, perhaps because the code in that
column is incorrect. To do this, write:
cvar = rpunch(’p’)
where cvar is the column into which one of the codes ‘p’ is to go. For example:
c115 = rpunch(’1/5’)
c115 = rpunch(c120)
Once this statement has been executed, column 115 will contain one of the codes present in
column 120.
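The effect of rpunch can be imitated in Python with random.choice. This is a sketch only: it assumes each candidate code is equally likely, which the text does not state, and the function here is a hypothetical stand-in that takes an explicit list of codes, not a range such as ‘1/5’.

```python
import random


def rpunch(codes, rng=random):
    """Return one code chosen at random from codes (a string or set of codes)."""
    # Sorting gives a stable sequence to choose from when codes is a set.
    # Assumption: every code has the same chance of being chosen.
    return rng.choice(sorted(codes))


c115 = rpunch("12345")        # one of the codes 1 to 5
c116 = rpunch({"2", "7"})     # one of the codes present in another column
print(c115, c116)
```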
Quick Reference
To set up an array based on numeric codes in the data, type:
column_specs are references to the fields containing the numeric codes. code is a non-numeric code
present in those fields and cell_number is the cell of the array which should be incremented
whenever that code is encountered.
Cells in the array are reset to zero at the start of each new record. To prevent this happening, enter
the statement name as fieldadd rather than field. The rest of the statement is as shown.
On some studies you will find responses which are represented by numbers rather than codes. There
are various methods of checking and tabulating these responses. Which one you use depends on
whether you want to know the number of respondents whose record contains a given code in a field
or group of fields, or the number of times a code appears in a group of fields.
To illustrate this, let’s suppose the question and response list in the questionnaire are as follows:
Q6A: Which films did you see on your last three visits to the
cinema?
If you want a table which shows how many people saw each film, one way of tabulating this data
is to use a fld statement in the axis which tells Quantum which columns to read and which codes
represent each film.
☞ For information about the fld statement, see section 4.3, ‘Responses with numeric codes: fld’
in the Quantum User’s Guide Volume 2.
Another way is to use a combination of field in the edit and bit in the axis. This is particularly
efficient if, rather than wanting to count the number of people who saw each film, you want to count
the number of times each film was seen.
The field statement counts the number of times a particular code appears in a list of fields for each
respondent. It stores these counts in an integer array that consists of as many cells as there are codes
to count. In the films example, the array will have five cells. Cell 1 will hold the number of times
code 01 appears in the fields c(12,13), c(14,15) and c(16,17). If the respondent saw Green Card
then Batman 2 and then Green Card again, his/her data will be:
1----+----2
040504
Cell 4 (Green Card) of the array will be set to 2, and cell 5 (Batman 2) of the array will be set to 1.
You can then tabulate the contents of this array using a bit statement in the axis.
output_array is the name of the array in which you wish to store the counts of responses. You can
use spare columns in the C array, but you may find your program is easier to read if you define an
integer array of your own with a name which reflects the type of information it contains. For
example, if you want an integer array called films, you might write:
int films 5s
ed
field films = .....
When you define the integer array, make sure that you request as many cells as there are codes in
the data. In this example there are five films so you define the array as having five cells. Quantum
automatically creates an extra cell (cell 0) which it uses to count responses for which there is no
cell allocated. If there were six films, for example, Quantum would increment cell 0 each time it
found code 06 in the films columns. You might like to check the value of this cell as a means of
reporting on invalid codes:
Negative and zero values also cause cell zero to be incremented. Codes which are shorter than the
field width are accepted as long as they are left-padded with blanks or zeros. Codes which are
shorter than the field width and which are right-padded with blanks only increment cell zero.
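The counting rules above can be summarized in a short Python model. It is a sketch, not a description of how Quantum is implemented: the function name field_counts is hypothetical, and the treatment of a wholly blank field (skipped here) is an assumption, since the text does not cover that case.

```python
def field_counts(fields: list, n_cells: int) -> list:
    """Count numeric codes found in fields; index 0 collects unallocated codes."""
    cells = [0] * (n_cells + 1)  # cell 0 plus cells 1..n_cells
    for text in fields:
        if text.strip() == "":
            continue  # assumption: a completely blank field is not counted
        digits = text.lstrip(" ")  # left-padding with blanks is accepted
        if digits.isascii() and digits.isdigit() and 1 <= int(digits) <= n_cells:
            cells[int(digits)] += 1
        else:
            # Zero, negative, out-of-range and right-padded codes go to cell 0.
            cells[0] += 1
    return cells


# Columns 12 to 17 hold 040504: Green Card, Batman 2, Green Card.
films = field_counts(["04", "05", "04"], n_cells=5)
print(films)
```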
The input_specs part of the statement defines the columns to read. You have a number of choices
here. First, you may list each column or field reference one after the other, separated by commas.
The list must be enclosed in parentheses. In our example this would be:
Second, if you have sequential fields as you do here, you can type the start columns of each field
followed by the field length. The list of start columns is separated by commas and enclosed in
parentheses, and the field length comes after the closing parenthesis and starts with a colon. If you
use this notation for the film example you would write:
If you wish, you can abbreviate this further by typing just the start columns of the first and last
fields, followed by the field length.
Third, if the fields are not sequential, you list the start columns and field width of each group of
columns (as shown above) and separate each group with a slash. For example, to read data from
columns 12 to 17 and 52 to 57, with each field being two columns wide, you would type:
You can also use this notation for single non-sequential fields. For example:
The special_specs part of the statement is optional. You use it when a field contains non-numeric
codes such as $&&$ for None of these films. If you want to count codings of this type, you must
remember to allocate cells in the array for each code or group of codes you wish to count. You then
include the notation:
$code$ = cell_number
int films 6s
ed
field films = (c12, c14, c16) :2, $&&$=6
If you want to count more than one non-numeric code, list each one individually, separated by
commas.
✎ To tabulate data counted by a field statement, you use a bit statement which names the integer
array you have created and defines the element texts associated with each cell of the array.
☞ For further information about the bit statement, see section 4.4, ‘Responses with numeric
codes: bit’ in the Quantum User’s Guide Volume 2.
Quantum normally resets the cells of the integer array to zero at the start of each record. If you want
counts to continue from one record to another, use a fieldadd statement instead of field. For
example:
✎ The advantage of using field or fieldadd is that they automatically count the number of times
a code appears in a list of fields. If you want a table which uses this information, you just tell
Quantum to increment the counts in the table by the values stored in the appropriate cells of
the array.
You can also manipulate the values stored in the cells before you tabulate the data. For
example, if you had codes for Aliens 1, 2 and 3, you might wish to merge them into a single
cell for all Aliens films so that the tabulation spec is easier to write.
Quick Reference
To remove values from variables, type:
where var1 to varn are any valid Quantum variable or range of variables. For example:
Data variables are reset to blank, integer variables are reset to 0 and real variables are reset to 0.0.
Variables can also be cleared using assignment statements (e.g., t1=0), but there are advantages to
using clear instead. Firstly, clear is much easier to write. Secondly, with clear the compiler checks
that the subscripts are in the correct range (e.g., 1 to 33 if ‘myarray’ has only 33 cells); this is not
possible with the loop method because the subscript is a variable. However, if you use variables as
subscripts with clear (e.g., clear c(t1,t1+5)), subscript checking once again cannot be done.
Quick Reference
To prevent Quantum from checking array boundaries during a run, type:
nobounds
Quantum normally terminates if it detects that you are writing beyond the end of an array. For
example:
Here, we have defined an integer array called ‘number’ as having 10 cells. When Quantum reads
the assignment statement and detects that it refers to ‘number(11)’ it will terminate because there
are only 10 cells in the array, not 11. The same would be true for statements which referred to, say,
t201 when the size of the T array had not been extended past the default of 200 cells.
The exceptions to this are emit, delete, partial column moves and reads from fetch files.
☞ emit, delete and partial column moves are discussed earlier in this chapter. For further
information about fetch files, see ‘The fetch statement’ in chapter 13, ‘Using subroutines in
the edit’.
While they may save you time in the long run, these checks do mean that your job will run slightly
slower than it otherwise would.
If you wish to run without these checks, insert a nobounds statement near the start of the edit.
Quick Reference
To assign a value to a T variable in the data file, type:
*set tn = value
You may use a *set statement in the data file to assign a value to a T variable. Its format is:
*set tn = value
where n is a number between 1 and 200 (unless you have increased the number of T-variables).
The statement must start in column 1. You may type ‘set’ in upper or lower case, and may follow
it with any number of spaces. If Quantum reads anything that it cannot interpret as a T variable, it
terminates the run immediately.
This facility is available in all jobs with or without levels (trailer cards). You may use it as many
times as you need throughout the data file to assign different values to the same T-variable, or to
assign different values to a number of T-variables.
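The layout rules for a *set line can be expressed as a small Python check. This is an illustrative sketch, not Quantum's own parser: the function name parse_set_line is hypothetical, the value is assumed to be a whole number, "any number of spaces" is read as zero or more, and the t is accepted in either case, none of which is confirmed by the text.

```python
import re

# The asterisk must be in column 1; 'set' may be upper or lower case.
# Assumptions: zero or more spaces after 'set', and an integer value.
SET_LINE = re.compile(r"^\*set *t(\d+) *= *(-?\d+) *$", re.IGNORECASE)


def parse_set_line(line: str, t_size: int = 200) -> tuple:
    """Return (n, value) for a line of the form '*set tn = value'."""
    match = SET_LINE.match(line)
    if match is None:
        raise ValueError("cannot interpret line as a T variable assignment")
    index, value = int(match.group(1)), int(match.group(2))
    if not 1 <= index <= t_size:
        raise ValueError("T variable number out of range")
    return index, value


print(parse_set_line("*set t5 = 12"))
print(parse_set_line("*SET    t200=7"))
```

Raising an exception here plays the role of Quantum terminating the run when it cannot interpret the line.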
Statements in the edit section are usually dealt with in the order in which they occur in the program.
Quantum provides statements which may be used to alter this normal order of execution, for
example, by missing out a statement or repeating a group of statements a number of times.
Quick Reference
To define statements to be executed if a certain condition is true, type:
The if statement has exactly the same meaning as in English; it defines a statement whose execution
depends upon the value of a logical expression. Let’s first take an English sentence to explain this:
we might say ‘If it is raining, I will take my umbrella’. Here, the statement is ‘I will take my
umbrella’ and it depends upon the logical expression ‘It is raining’. If the expression is true (i.e., it
is raining), the statement is executed (I take my umbrella); if it is false (no rain), it is ignored (I don’t
even think about my umbrella).
Now let’s take a Quantum sentence. We have a shopping survey in which respondents have been
asked to name the supermarkets in which they shop at least once a week. These responses are coded
into column 21 of card 1, and we want to keep a count of the number of respondents shopping in
Safeway (code 4). Our sentence would say ‘If column 21 contains a 4, increment our counter by 1’.
2. The logical expression whose value controls the action to be taken, enclosed in parentheses.
☞ For further information about logical expressions, see section 5.2, ‘Logical expressions’.
Thus, to translate our sentence into the Quantum language, we would write:
if (c121’4’) safe=safe+1
The logical expression to be tested states that the number of codes in columns 10, 11 and 12 is
greater than three. If it is true, and there are, say, 5 codes altogether in those columns, we will add
a 9 into column 20 in addition to what is already there. On the other hand, if there are 3 or fewer
codes in that field we leave column 20 as it is and continue with the statement on the line
immediately after the if. For instance:
+----1----+----2----+ +----1----+----2----+
62- 1 62- 1
0 / yields 0 /
4 4
9
but:
+----1----+----2----+ +----1----+----2----+
2- 1 2- 1
0 / yields 0 /
4 4
Once the emit statement has been executed, Quantum continues with the statement on the next line.
The statement to be executed if the expression is true may be any Quantum statement, even another
if. For example:
if (c130’1’) if (c131’9’) c181’19’
says ‘if c130 contains a ‘1’, and then if c131 contains a ‘9’, then put the multicode ‘19’ in c181’.
This statement is not incorrect, but it can be more efficiently written as:
if (c130’1’.and.c131’9’) c181’19’
The if keyword may be followed by a whole series of statements as long as each one is separated
by a semicolon. These statements will then be executed in the order in which they appear. For
example:
if (t4.le.5) c235’45’; emit c567’2’; delete c789’0’
This says, if the value of t4 is less than or equal to 5, put the multicode ‘45’ in column 235
overwriting whatever is there already, then add a ‘2’ into column 567 and, finally, remove the ‘0’
from column 789.
✎ You cannot switch missing values processing on or off with an if statement. A missingincs
statement is always executed wherever it appears in the edit. This means that although the
compiler will accept statements of the form:
if (....) missingincs 1
Quantum will, in fact, switch on missingincs for the rest of the edit or until a missingincs 0
statement is read. It does not switch on missingincs selectively for only those records that
satisfy the expression defined by the if clause.
☞ For further information about missingincs, see section 12.6, ‘Missing values in numeric
fields’.
Quick Reference
To define statements to be executed if a given condition does not exist, type:
In Quantum the keyword else means ‘otherwise’. In English we would say ‘If it’s raining I’ll take
the car, otherwise I’ll walk’; in Quantum we write:
This says, if the expression is true, execute the statements immediately after the if, but if it is false,
execute those following the else. For example:
if (c76’4’) t3=1; delete c76’3’; else; t3=2; emit c77’2’
Here, if c76 contains a ‘4’, t3 is set to 1 and a ‘3’ is deleted from c76. However, if c76 does not
contain a ‘4’, t3 is set to 2 and a ‘2’ is added into c77.
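The behavior of if with else can also be modeled outside Quantum. The following Python sketch is an illustration only: it assumes that each column is held as a set of codes, and the names apply_edit, record and t are hypothetical and not part of Quantum.

```python
def apply_edit(record, t):
    """Model the example: if c76 holds a '4', set t3 to 1 and delete '3'
    from c76; otherwise set t3 to 2 and add '2' into c77."""
    if "4" in record[76]:
        t[3] = 1
        record[76].discard("3")   # delete: remove the code if it is present
    else:
        t[3] = 2
        record[77].add("2")       # emit: add the code, keeping the others


# Hypothetical record in which column 76 is multicoded '3' and '4'.
rec = {76: {"3", "4"}, 77: set()}
t = {}
apply_edit(rec, t)
print(t[3], sorted(rec[76]), sorted(rec[77]))
```

Only one of the two branches is ever taken for a given record, which is the point of the else keyword.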
The else keyword may only be used as part of an if statement and must be separated from the if by
at least a semicolon. Statements of the form:
are correct, but since action is only required if the expression is not true, it is more usual to write:
Sometimes your Quantum program will include statements which refer to certain respondents only;
for instance you will only want to check the data associated with a particular brand of soap powder
if the respondent bought that powder. These statements may be routed over when the respondent
does not buy the powder by using the go to (or goto) statement, followed by a statement number.
The statement:
if (c121n’1’) go to 50
causes Quantum to go immediately to the statement labeled 50 if column 121 does not contain a
‘1’ (for example, the respondent did not buy Brand A soap powder). Any statements between this
if statement and statement 50 are ignored whenever a record is read where c121n’1’ is true.
The statement labeled 50 may be any Quantum statement, but many people just write:
50 continue
to gather all respondents together before continuing through the rest of the program. This statement
is described in the next section. All labels must be attached to statements: a label by itself is an error
and Quantum will tell you so.
You may route forwards or backwards in your program, but when routing backwards, take care that
you are not creating a situation from which it is impossible to escape: the following will go on and
on forever if you let it:
10 t1=t1+1
- - other statements - -
go to 10
The only way to avoid situations like this is to make sure that somewhere between statement 10
and go to is another statement that routes you past the go to at some time, for example:
10 t1=t1+1
- - other statements - -
if (t1.gt.10) go to 15
go to 10
15 continue
9.4 continue
Quick Reference
Attach the keyword:
continue
This statement is a dummy statement whose sole purpose is to join various bits of a program
together. It is often used with a statement label as a destination for routing with go to, or to identify
the end of a loop.
☞ To find out more about using continue with loops, see ‘do with individually specified numeric
values’ in the following section.
9.5 Loops
Quick Reference
To define a set of repetitive statements, type:
do label_number int_variable=value_list
statements
label_number statement
Loops are extremely important structures because they enable the same set of basic statements to
be executed over and over again on a changing series of numbers, columns or codes. Their use can
reduce the work involved in checking data. The statement which introduces a loop is do which is
formatted as follows:
• An integer variable (for numbers or columns) or a letter (for codes) whose value is to be used
by the statements in the loop.
• An equals sign.
• A list of whole numbers, integer variables or codes which are the values the integer variable or
letter is to take. These may be entered in two ways (see below).
Loops should be terminated by any statement other than go to, stop, return, another do or an if
containing any of these words. The main purpose of the terminating statement is to identify the end
of the loop and send the program back to the start of the loop. Go to and return send the record
elsewhere, stop terminates the run and another do indicates the start of another loop. The statement
most often used to terminate a loop is the dummy statement continue. Any statement that
terminates a loop must be preceded by a label number.
☞ For information about the return statement, see section 9.7, ‘Jumping to the tabulation
section’.
We will now go on to discuss the various ways of defining the values in the value list.
Quick Reference
To define a loop to be repeated for a set of given values, type:
The simplest way to define the values for the loop is to list them individually. In this case, values
must be whole numbers, separated by commas with the whole list enclosed in parentheses. For
example:
do 20 t5 = (125,130,140,145)
if (c(t5,t5+4).gt.3000) c(t5,t5+4)=$ $
20 continue
Before we discuss what this loop is doing, let’s look at the way it has been written. The do statement
tells us three things, namely that the loop is terminated by the statement labeled 20, the integer
variable to be used is t5, and the statements within the loop are to be repeated four times (there are
four values in the list). The statement labeled 20 is continue which just sends Quantum back to do.
The purpose of this loop is to check whether the contents of four fields are greater than 3000, and
if so to reset those columns to blank. The first time through the loop, t5=125. When substituted into
the if statement it yields:
if (c(125,129).gt.3000) c(125,129)=$ $
The next statement is continue which sends us back to the top of the loop. t5 is now pointing to the
second value in the list, 130. The if statement reads:
if (c(130,134).gt.3000) c(130,134)=$ $
This process is repeated until t5 has taken all values in the list. There is no need to include
statements which check the value of t5 and jump out of the loop when the last value is reached:
Quantum keeps a count of how many values there are and it knows that once the last value has been
reached it should continue with the statements following the loop.
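As a rough illustration of what this loop does, here is a Python sketch. It is a model under stated assumptions, not Quantum code: the C array is represented as a list in which index n stands for column n, blank or non-numeric fields are simply skipped, and the function name blank_large_fields is hypothetical.

```python
def blank_large_fields(c, starts, width=5, limit=3000):
    """Blank every width-column field that starts at one of the listed
    positions and holds a number greater than limit."""
    for t5 in starts:                       # t5 takes each listed value in turn
        field = "".join(c[t5:t5 + width])   # models c(t5,t5+4)
        if field.strip().isdigit() and int(field) > limit:
            c[t5:t5 + width] = [" "] * width
    return c


# Hypothetical C array: index n stands for column n; index 0 is unused.
c = [" "] * 201
c[125:130] = list("03500")   # greater than 3000, so this field is blanked
c[130:135] = list("00120")   # not greater than 3000, so it is left alone
blank_large_fields(c, (125, 130, 140, 145))
print(repr("".join(c[125:135])))
```

The loop body is written once and the list of starting columns supplies the four values of t5, just as the value list does in the do statement.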
Quick Reference
To define a loop which will be executed for a range of values, type:
If the incremental value is 1 and the loop has one range only, the incremental value may be omitted.
Sometimes there will be a pattern to the numbers in the list: for example, they may increase in steps
of 5. You may list them all individually if you prefer, but it is quicker to enter them as a range with
a start, end and incremental value (in our example, 5) separated by commas. The start value must
be smaller than the end value, and the increment must be positive. Quantum checks the start and
end values and if the start is larger than the end value, the statements inside the loop will not be
executed at all. If the increment is negative, the loop will be executed for the start value only.
do 20 t5 = 125,145,5
if (c(t5,t5+4).gt.3000) c(t5,t5+4)=$ $
20 continue
This loop is very similar to that used in the previous section. It will be executed for all values of t5
between t5=125 and t5=145 where the value is incremented by 5 each time. The loop says:
if (c(125,129).gt.3000) c(125,129)=$ $
if (c(130,134).gt.3000) c(130,134)=$ $
if (c(135,139).gt.3000) c(135,139)=$ $
if (c(140,144).gt.3000) c(140,144)=$ $
if (c(145,149).gt.3000) c(145,149)=$ $
You may enter as many range specifications as you like on one line, as long as each one is separated
by a slash (/):
do 15 t1 = 25,35,2 / 50,62,3
if (numb(c(t1)).gt.1) c(t1)’ ’
15 continue
This loop replaces eleven if statements: t1 will take the values 25, 27, 29, 31, 33, 35, 50, 53, 56, 59
and 62.
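The way a list of ranges expands into individual values can be sketched in Python. This is an illustration only: expand_ranges is a hypothetical helper, and the order in which the two special cases are tested (start larger than end first, then a negative increment) is an assumption, since the text does not say which rule takes precedence. An increment of zero is not described in the text and is not handled here.

```python
def expand_ranges(spec):
    """Expand (start, end, step) ranges into the values the loop
    variable takes, in the order in which they are used."""
    values = []
    for start, end, step in spec:
        if start > end:
            continue                  # start larger than end: body never runs
        if step < 0:
            values.append(start)      # negative increment: start value only
            continue
        values.extend(range(start, end + 1, step))
    return values


# Models: do 15 t1 = 25,35,2 / 50,62,3
t1_values = expand_ranges([(25, 35, 2), (50, 62, 3)])
print(t1_values, len(t1_values))
```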
If the loop has only one range, and the incremental value is 1, the 1 may be omitted. If t3=11 and
t4=15:
do 15 t2 = t3,t4
if (numb(c(t2)).gt.1) c(t2)’ ’
15 continue
checks that columns 11, 12, 13, 14 and 15 each contain no more than 1 code. If not, the column is
reset to blank.
do with codes
Quick Reference
To repeat a set of statements for all codes in a given range, type:
Sometimes you will want to repeat a statement or set of statements for a given set of codes, rather
than for columns or other types of variable. The way to do this is to write a do statement which,
instead of naming an integer variable and whole numbers, defines a list of codes and a temporary
variable which points to each code in turn. When you want to refer to the current code, you simply
enter the name of the temporary variable and Quantum will substitute the value of the current code
in the statement before it is executed.
to execute statements for all codes in the range ‘p1’ to ‘p2’, where the sequence of codes is
&-01234567890–&;
or:
In both formats, note that the variable name and the codes must all be enclosed in single quotes.
Additionally, you may not use the notation ’ ’ to indicate a blank code, nor may you use the
temporary variable in partial column moves (that is, in statements of the form c(1,4)=c(3,6)).
Here is an example which illustrates how to check for certain codes in a series of columns:
do 10 ’code’ = (’1’,’3’,’5’)
if (c110’code’ .or. c111’code’) emit c180’code’
10 continue
This loop is executed three times, once for each of the three listed codes. The first time the loop is
executed, the statement will read:
if (c110’1’ .or. c111’1’) emit c180’1’
Nested loops
Loops may contain other loops: this is called nesting. Loops may be nested up to six levels deep,
but they must not overlap. Also, each loop must have a separate terminating statement. In other
words, they must always take the form:
do 60 t2 =                   do 60 t2 =
do 70 t3 =                   do 70 t3 =
do 80 t4 =                   .
.                  or        70 continue
.                            do 80 t4 =
80 continue                  .
70 continue                  80 continue
60 continue                  60 continue
It is possible to route from inside a loop to outside, but not from outside to inside. The following is
permissible:
do 150 t1 = 125,145,5
if (numb(c(t1)).eq.1) c189’1’; go to 76; else; c(t1)’ ’
150 continue
76 continue
What we are saying in this loop is that if a given column specified by t1 is single-coded (i.e.,
contains one code only) we set a spare column equal to 1 and send the record out of the loop. If not,
we set the column being checked to blank and return to the top of the loop to get the next value of
t1. This process continues until a single-coded column is found, or until all values of t1 have been
tried.
The following, however, is not permissible:
if (c176’3’) go to 76
.
.
do 150 t1 = 125,145,5
76 if (numb(c(t1)).eq.1) c(t1+1)’&’
150 continue
If c176’3’, the program would jump into the middle of the loop and have an unidentified value for
t1. An error message will be printed under the offending statement.
Quick Reference
To reject a record from the tables, but to include it in the rest of the edit, type:
reject [level_name]
In a levels job, include a level name to reject all data at the given level.
Normally all records are passed straight from the edit to the tabulation section regardless of whether
or not they contain errors. The reject statement tells Quantum to continue editing the record but
not to include it in the tables. The record is also excluded from the weighting and, where split is
used, from the clean file; it may be found in the dirty file instead.
if (c73’8’) reject
if (c80’1’) t5=t5+1
end
These statements reject records in which column 73 contains an ‘8’ from the tabulations but not from the rest of
the edit. Therefore, even if c73’8’, the record is still checked for a ‘1’ in column 80 and if one is
found, t5 is incremented.
Whenever a record is rejected the variable rejected_ becomes true. You may use this variable in
your program to deal with rejected records differently from accepted records. For instance, we
may wish to write all rejected records out in the file rejfil for later inspection and correction:
The variables rec_rej and rec_acc count the total number of records rejected and accepted so far.
You may wish to check these variables and terminate the run if too many records are rejected.
☞ There is an example of how to do this in section 9.9, ‘Canceling the run’ below.
If you are working with hierarchical (levels or trailer card) data, reject at a given level will reject
all data at that level. Additionally, data at a level higher than that currently being edited may be
rejected from tables — for instance, in the edit of data at the item level, you may reject all data at
person level. The syntax for this is:
reject levelname
✎ When used with split, reject at any level rejects the whole record from the clean file.
☞ For more information about levels data, see chapter 3, ‘Dealing with hierarchical data’ in the
Quantum User’s Guide Volume 3.
Quick Reference
To send the record to the tabulation section, type:
return
The word return in Quantum bears no relation to the same word in English. It does not mean go
back to the start of the edit or anything like that, rather it means ‘terminate the edit immediately
and jump to the tabulation section’. Once the record is tabulated Quantum reads in another record
as usual. If there is no tabulation section, the next record is read in straight away.
The return keyword is often used with reject to reject a record without finishing the edit. For
example:
if (c73’8’) reject; return
if (c80’1’) t5=t5+1
end
Here any records in which c73’8’ are rejected from the tables, but, because reject is followed by
return which sends records to the tabulation section, editing is terminated immediately. Thus, only
records in which c73n’8’ will be tested for a ‘1’ in column 80. Compare this example with the one
in section 9.6, ‘Rejecting records’ above.
✎ Do not put reject after return because it will never be reached. Once the return is read, the edit
is terminated immediately and the record is passed to the tabulation section without the rest of
the statement ever being read:
if (c73’8’) return;reject
Quick Reference
To stop editing records and start tabulating records read so far, type:
stop [num_times_execute]
On some surveys you may want to run test tables on a few records only. This can be done using the
word stop.
stop tells Quantum to stop the run and print tables once editing has been completed on the current
record. For example, we may want test tables for 50 people who own goldfish, so we set up a
counter and terminate the run when it reaches 50:
If we did not wish to restrict ourselves to goldfish owners, and were satisfied with just the first 100
respondents, we could use the reserved variable rec_count in our test and stop when it reached 100:
if (rec_count.eq.100) stop
Alternatively, to be sure that we stop when 100 records have been accepted for tabulation, we could
write:
if (rec_acc.eq.100) stop
When the stop statement is executed, the reserved variable stopped_ becomes true.
A variation of stop is
stop n
where n is the number of times the statement is to be executed. If stop is part of a routing pattern
in the edit, it may be necessary to read in more than the n records to execute the statement n times.
As an example, here is another way of counting goldfish owners:
Here, the stop statement is only executed whenever we find someone who owns a goldfish. We may
need to read data for 72 respondents before we reach our target of 50 goldfish owners.
When either form of stop is used, editing and tabulation are completed for the respondent at which
the condition is fulfilled, and no more records are read. Therefore, if we have to process 72
respondents in order to find 50 goldfish owners, a holecount requested by the edit would include
72 records and errors in those 72 records would be included in the error listings.
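The relationship between the number of times stop n is executed and the number of records read can be pictured with a small Python model. It is a sketch only: records_read_before_stop is a hypothetical name, and each record is reduced to a single true or false value saying whether the stop statement is reached for that record.

```python
def records_read_before_stop(reached_flags, n):
    """reached_flags: one boolean per record, True when the stop statement
    is reached for that record. Returns the number of records read once a
    'stop n' has been executed n times, or all of them if it never is."""
    executed = 0
    for read, reached in enumerate(reached_flags, start=1):
        if reached:
            executed += 1
            if executed == n:
                return read
    return len(reached_flags)


# Hypothetical data: the statement is reached for records 1, 3, 4 and 6.
sample = [True, False, True, True, False, True]
read = records_read_before_stop(sample, 3)
print(read)
```

The record on which the count is reached is still included, which matches the statement that editing and tabulation are completed for that respondent.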
Quick Reference
To cancel a run, type:
cancel [num_times_execute]
The word cancel, which is similar in format to stop, terminates the run immediately, producing
tables only for those respondents already passed to the tabulation section. It is often used to halt a
run when too many errors have been detected in the data. For instance, to cancel the run when more
than 100 errors have been found, we might have:
To cancel the run when more than 50 records have been rejected, we could write:
if (rec_rej.gt.50) cancel
Alternatively, cancel may be followed by a number indicating that the run should be canceled
when the statement has been executed a specific number of times:
cancel 100
cancels the run when this statement has been executed 100 times.
As with stop, holecounts and error listings will only contain information about records read prior
to the cancellation condition being fulfilled. If 400 records are read before 101 errors are found, we
will see the errors for those 400 records.
Quick Reference
To send a record temporarily to the tab section, type:
process
The process statement is similar to return but must not be confused with it. When return is
executed, the record is sent on to the tabulation section; after the tables are completed for that
record, the program returns to the start of the edit section and the next record is read in.
When process is executed, the record is also sent immediately to the tabulation section where it is
used in table creation. However, after the record has been tabulated, control is passed back to the
edit section to the statement immediately following the word process. The record continues
through the edit and any statements after process applicable to the record are executed. At the end
of the edit the record is passed through the tabulation section again.
The process statement is used when you need to tabulate portions of a record more than once. For
example, if our survey asks shoppers about the brands of bread they purchased the last four times
they visited the shops, our data may be set out as follows:
Suppose we wish to create a table showing the total number of loaves of each brand bought by all
(or selected groups of) respondents during their four trips to the store. The simplest way to do this
is to set up an axis of the form:
l brd;inc=c135
n23Number of Loaves Bought
col 134;Brand A;Brand B;Brand C;Brand D
and then to write:
process
in the edit at the point where you want to tabulate the record for the first brand.
c(134,135)=c(136,137)
process
This overwrites the information about the first purchase with information about the second
purchase, and the record is processed a second time. The total number of loaves bought on the
second trip will be added to the total number of loaves bought on the first trip.
c(134,135)=c(138,139)
process
c(134,135)=c(140,141)
process
When we finish, the total number of loaves of each brand bought by all respondents during those
four visits will be contained in the relevant cells of the axis.
In a situation like this we would probably put the process statements in a loop at the end of the edit,
although this is not strictly necessary. For example:
do 10 t1 = 134,140,2
c(134,135)=c(t1,t1+1)
process
10 continue
This performs exactly the same task as the list of statements shown earlier; it is just a more efficient
way of writing them.
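To see how repeated passes through the tabulation section build up the totals, consider the following Python model of the loop. It is a sketch under assumptions: the record is a dictionary keyed by column number, the brand code is in column 134 and the number of loaves in column 135 (as the axis above implies), the respondent's data is invented, and the extra pass made by the end statement is deliberately left out. The names tabulate and edit are hypothetical.

```python
from collections import Counter


def tabulate(record, totals):
    """Stand-in for the tabulation section: add the number of loaves in
    column 135 to the cell for the brand code in column 134."""
    totals[record[134]] += record[135]


def edit(record, totals):
    # Mirrors: do 10 t1 = 134,140,2 / c(134,135)=c(t1,t1+1) / process
    for t1 in range(134, 141, 2):
        record[134], record[135] = record[t1], record[t1 + 1]
        tabulate(record, totals)


totals = Counter()
# Hypothetical respondent: brand code and loaves bought on each of four trips.
rec = {134: "1", 135: 2, 136: "3", 137: 1, 138: "1", 139: 4, 140: "2", 141: 3}
edit(rec, totals)
print(dict(totals))
```

Each call to tabulate corresponds to one process statement, so the respondent contributes to the axis four times.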
✎ Be careful if process is the last statement in your edit: the record will be passed to the
tabulation section by process and then again by the end statement. If this is not what you want,
omit the last process.
☞ For another example of process, see ‘Incrementing tables more than once per respondent’ in
chapter 4, ‘More about axes’ in the Quantum User’s Manual Volume 2.
There are a number of ways of examining your data once it has been read into the C array. You
may:
• Create a frequency distribution reporting the different values found in a column or field of
columns.
• Write out specific records and examine them individually, as discussed in chapter 7, ‘Writing
out data’.
10.1 Holecounts
Holecounts are used to obtain an overall picture of the data before you write your edit program. For
each column they show:
• A distribution of the codes — for example, how many respondents have a 2 in column 56.
• The density of coding — how many respondents have one, two, or three or more codes in each
column.
There is an example of a holecount on the next page. The first column tells us the columns for
which codes are being counted; in this case it is columns 1 to 16 of card 1. The numbers across the
top are the individual codes, and the total in the top left-hand corner is the total number of
respondents (records): our data has 605 respondents.
As you can see, there are two numbers in each cell; an absolute figure and a percentage. The former
tells us how many records were found with a specific code in a column and the latter tells us what
percentage of the total data that is.
For example, there are 169 records with a code 1 in column 14 and this is 27.9% of the total.
Similarly, 32 records have a code 4 in column 15 which is 5.3% of the total records. Notice that
when the cell total is zero, no percentage figure is printed: this all makes it easier to see the pattern
of coding in each column.
The four right-hand columns of the holecount show the density of coding in each column. The
column headed Den1 shows the total number of records with only one code of any sort in the
column. Den2 is the number of records with two codes in the column, and Den3+ tells us how many
records were multicoded with three or more codes in that column. The TOTAL is the total number
of codes in that column, with every code in a multicoded record counted separately.
Let’s look at column 115. 162 records have one code only in that column; six have two codes and
one has three or more codes. The total number of codes in this column is 177, and each card has an
average of 0.29 codes in this column.
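In this example the total of 177 codes is consistent with 162 + (2 × 6) + 3, assuming that the one heavily multicoded record holds exactly three codes, and 177 codes spread over 605 records gives the average of 0.29. The Python sketch below shows one way these figures could be derived for a single column. It is an illustration only: each record's column is modeled as a set of codes, the data is invented, and holecount_column is a hypothetical name.

```python
from collections import Counter


def holecount_column(columns):
    """columns: one set of codes per record for a single column.
    Returns the code counts, the density counts and the total codes."""
    codes = Counter()
    density = Counter()
    for col in columns:
        codes.update(col)             # count every code in the column
        if len(col) == 1:
            density["Den1"] += 1
        elif len(col) == 2:
            density["Den2"] += 1
        elif len(col) >= 3:
            density["Den3+"] += 1
    total = sum(codes.values())
    return codes, density, total


# Hypothetical data: five records, one of them blank in this column.
data = [{"1"}, {"1", "4"}, set(), {"2", "4", "7"}, {"4"}]
codes, density, total = holecount_column(data)
mean_codes = total / len(data)
print(dict(codes), dict(density), total, mean_codes)
```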
The holecount is the starting place in your search for errors. There are many holecounts in which
it is immediately apparent that the presence of certain codes indicates an error. It is also clear
whether or not the column should be multicoded.
Creating a holecount
Quick Reference
To create a holecount, type:
where text is the heading to be printed at the top of each page. This is optional; if it is omitted the
holecount will simply be headed ‘Holecount’. Our example was created by the statement:
Quantum itself accepts double quotes in the holecount heading, but the C compiler which processes
the code that Quantum creates from your specification does not. Generally, it will issue an error
message that refers to a missing ) symbol at the point the double quote occurs. To prevent this
happening, precede the double quote with a backslash. For example:
You may count as many or as few columns as you like, as long as the columns to be counted are
consecutive: to count, say, columns 135 to 140 and columns 160 to 180 you will need two
statements, one for each field.
Records are counted as they stand when the count statement is executed. If you have previously altered any
columns, say, with assignment or emit statements, the count will refer to the columns as they are
after the alterations rather than as they were in the original data file. Similarly, any changes which
are effected after the count are not reflected in the output.
✎ If you place a count statement in a loop, Quantum sums the counts for all the columns in the
statement and reports the total number of codes as the count for the first column only.
Filtered holecounts
A filtered holecount is one in which only records fulfilling a specific condition are counted. It
can be created using the if statement to define the occasions when a record should be counted.
For example, suppose we only wish to include male respondents in our holecount. Our statement
might be:
Normally, trailer cards of a given type are treated as one card and are counted together. Thus, the
number of codes in a column for a particular trailer card contains the sum of all codes found in that
column on all trailer cards of the given type (e.g., all card 2s).
You may, however, prefer to produce holecounts on such cards based on their relative position
within the group of trailer cards. For example, suppose card 2 is a trailer card and we wish to make
a holecount on the third card 2 of each group. In chapter 6 we said that the variable allread2 is true
when a card 2 has been read in for the current record, and that it keeps count of the number of card
2s read. So, to produce a holecount for the third card 2, we would write:
We can also create filtered holecounts of trailer cards based on characteristics of the individual
cards. Suppose we have a trailer card for each store visited, in which the store is identified in c79.
The trailer card is the 5-card. We would write:
Multiplied holecounts
Quick Reference
To create a multiplied or weighted holecount, type:
where text is the holecount title and c(m_start,m_end) is the field in the C array containing the
multiplier or weight for each record.
In ordinary holecounts, the cells are simply counts of records: each time a record is read with a
specific code in a given column, the relevant cell in the holecount is incremented by one. If 231
records have a 7 in column 79, the figure in that cell will be 231.
Holecounts may also be created by incrementing each cell by the value found in a column field in
the record. This value is the record’s ‘multiplier’. If the multiplier is 15, and the record has a 6 in
column 152, the count for c152’6’ will be incremented by 15 rather than by 1 for this record. You
may hear this type of holecount referred to as a weighted holecount because multiplying a record
by a given value is the equivalent of weighting it.
✎ If the multiplier is being calculated during the run, it must be placed in the C array using wttran
before the holecount is requested.
☞ For further details on weighting and wttran, see section 1.9, ‘Copying weights into the data’
in the Quantum User’s Guide Volume 3.
where c(m,n) is the field to be counted, text is the optional heading to be printed at the top of each
page, and c(x,y) is the field containing the multiplier for the record. If this field contains a real
number, it must be referenced as cx(x,y) otherwise the decimal point will be ignored (for example,
1.5 will be read as 15).
The number labeled TOTAL at the top of each page of output is no longer the total number of records
in the data file, rather it is the number of records after each record has been multiplied by its
multiplier. This is best illustrated by an example. If we are producing a holecount for c(20,30), and
of our 50 respondents, 20 have a multiplier of 2.5, 15 have a multiplier of 2.6 and 15 have a
multiplier of 3.0, the total at the top of the page will be 134 respondents, calculated as follows:
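The arithmetic is (20 × 2.5) + (15 × 2.6) + (15 × 3.0) = 50 + 39 + 45 = 134. The short Python sketch below repeats the calculation. The variable names are illustrative only, and Decimal is used so that the products are exact.

```python
from decimal import Decimal

# (number of respondents, multiplier) for each group in the example.
groups = [(20, Decimal("2.5")), (15, Decimal("2.6")), (15, Decimal("3.0"))]

respondents = sum(n for n, _ in groups)
weighted_total = sum(n * m for n, m in groups)

for n, m in groups:
    print(f"{n} x {m} = {n * m}")
print(f"{respondents} respondents, TOTAL = {weighted_total}")
```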
Multipliers may be part of the original data file or they may be calculated during the edit. Both real
and integer values are valid, even though the cell counts in the output will always be shown as
whole numbers. This does not mean that you lose accuracy with real multipliers. Quantum stores
the cell counts with as many decimal places as are necessary until the count is complete, whereupon
it rounds all values ending in .49 or less down and all values ending in .5 or more up.
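The rounding rule can be sketched in Python as follows. This is an assumption about the behavior rather than a description of Quantum's internals: it treats the rule as ordinary round-half-up applied once to the accumulated cell total, and displayed_count is a hypothetical name.

```python
from decimal import Decimal, ROUND_HALF_UP


def displayed_count(cell_total):
    """Round an accumulated cell total to the whole number shown in the
    output: fractions below .5 go down, .5 and above go up."""
    exact = Decimal(str(cell_total))
    return int(exact.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


for value in ("612.49", "612.5", "612.9"):
    print(value, displayed_count(value))
```

Because the rounding is applied only when the count is complete, no accuracy is lost while the multipliers are being accumulated.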
The figures used to create the multiplied holecount would then be 22.4, 12.7, or 11.9, depending
upon the contents of c104 in each record. Suppose we have 27 home owners (that is, 27 people have
c104’2’), the count for a ‘2’ in column 4 of card 1 would be 604.8 (27 × 22.4), which would appear
in the output file as 605.
• Since we are copying a real number into a field of columns we use the notation cx to refer to
the columns and follow them with the number of decimal places required.
• Because the word count is written in lower case it may start in column 1. If it had been written
in upper case it would need to start in a column other than 1 to prevent it being read as a
comment.
A frequency distribution enables you to inspect the contents of a field of columns containing
alphabetic or numeric data. For example, in a shopping survey the price the respondent paid for a
bottle of mineral water may be stored in columns 112 to 114. A frequency distribution will tell you
how many respondents bought mineral water at a particular price. This is very useful for
determining how the values in these fields should be grouped for tabulation, as well as for rough
estimates of medians.
By default, each distribution has two parts. In the first part, the values in the column field are sorted
in alphabetic or numeric order; in the second, they are sorted in rank order, according to the number
of times each one occurs in the data. Any multicodes in the field are decoded and the constituent
codes are listed. Each distribution shows both absolute and cumulative figures as well as
percentages for both. At the end of the alphabetic sort, Quantum prints:
• The sum of factors — that is, the sum of all wholly numeric items (values which occur more
than once are counted as many times as they occur).
• The mean for the numeric items listed (that is, the sum of factors divided by the number of
numeric items).
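The two sort orders and the summary figures can be illustrated with a short Python sketch. It is a model under assumptions: the field values are invented, ties in the rank order are broken alphabetically (the text does not say how Quantum orders ties), and frequency_distribution is a hypothetical name.

```python
from collections import Counter


def frequency_distribution(values):
    """Return the alphabetic sort, the rank sort, the sum of factors and
    the mean of the wholly numeric items."""
    counts = Counter(values)
    alphabetic = sorted(counts.items())                          # by value
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))  # by frequency
    numeric = [int(v) for v in values if v.strip().isdigit()]
    sum_of_factors = sum(numeric)      # repeated values count each time
    mean = sum_of_factors / len(numeric) if numeric else None
    return alphabetic, ranked, sum_of_factors, mean


# Hypothetical contents of a three-column price field for seven records.
prices = ["111", "099", "111", "105", "n/a", "099", "111"]
alphabetic, ranked, sum_of_factors, mean = frequency_distribution(prices)
print(alphabetic)
print(ranked)
print(sum_of_factors, mean)
```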
If the field is numeric and the run has missing values processing switched on, fields that are non-
numeric will contain the value missing_. This value is counted as zero by the sum of factors, mean
and standard deviation lines of the report.
Statements are provided for requesting a frequency distribution sorted in alphabetic or numeric
order only.
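The two-part report described above can be modelled outside Quantum. The following Python sketch is an illustration only: the function name, the tie-breaking rule in the rank sort and the sample data are assumptions, and it does not reproduce the layout of a real Quantum listing.

```python
from collections import Counter

def frequency_distribution(values):
    """Return (alphabetic rows, rank rows, sum of factors, mean) for a list of field values.

    Each row is (value, count, percent, cumulative count, cumulative percent).
    """
    counts = Counter(values)
    total = len(values)

    def with_cumulative(items):
        rows, running = [], 0
        for value, n in items:
            running += n
            rows.append((value, n, 100.0 * n / total, running, 100.0 * running / total))
        return rows

    # First part: values in alphabetic (or numeric) order.
    alphabetic = with_cumulative(sorted(counts.items()))
    # Second part: most frequent value first. Breaking ties by value is an assumption.
    ranked = with_cumulative(sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])))

    # Sum of factors: every wholly numeric item, counted once per occurrence.
    numeric = [int(v) for v in values if v.strip().isdigit()]
    sum_of_factors = sum(numeric)
    mean = sum_of_factors / len(numeric) if numeric else None
    return alphabetic, ranked, sum_of_factors, mean

# Hypothetical data: ten respondents and the price each one paid.
prices = ["111"] * 3 + ["212"] * 5 + ["114"] * 2
alphabetic, ranked, sum_of_factors, mean = frequency_distribution(prices)
```

With this sample, the alphabetic part starts with 111 and the rank part starts with 212, the most frequent value.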
Quick Reference
To create a frequency distribution sorted in alphabetic and rank orders, type:
To produce a frequency distribution sorted in alphabetic order only, type lista instead of list. For a
distribution sorted in rank order only, type listr instead of list.
A frequency distribution, as shown in the example on the next page, is created with the list
statement, as follows:
list c(m,n) [$text$]
where c(m,n) is the column field whose contents are to be listed and text is the heading to be printed
at the top of each page. If no heading text is given, the heading ‘Frequency Distribution’ is used
instead.
The list statement, as shown above, produces both the alphabetic and the rank-sorted
distributions. To request an alphabetic distribution only, type:
The first example produces a frequency distribution of the contents of c(107,108) sorted in numeric
order; the second example generates a list of car brands which will be sorted in alphabetic order.
Additionally, we are using subscripts to represent the column numbers. If t1 has a value of 36,
Quantum will list the values found in columns 36 to 40.
The rules for double quotes in the text are the same as for holecounts, that is, you must precede
them with a backslash.
The list in the diagram below shows a frequency distribution for the column field c(123,125). It
was created by the statement:
list c(123,125) $PRICE PAID$
Since it was run on a data file containing 200 respondents, the total is 200.
Let’s start with the first table — the alphabetical sort. The figures in the column headed ‘string’ are
the values found in columns 123 to 125, in this case, the price paid for a bottle of mineral water.
The next column (item) tells us how many times each code occurred in those columns — that is,
how many people paid each price. We can see the actual number of people and also what
percentage of the total sample that is. For instance, 31 respondents paid 111p which is 15.5% of the
total (200).
The columns labeled cumulative show accumulated totals and percentages for each value found.
There are 86 respondents who paid between 111p and 114p, and these are 43.0% of the total
respondents.
The second table shows exactly the same information presented in rank order, with the most
frequently occurring value first. The example shows that this is 212, and that 41 respondents or
20.5% of all the respondents paid 212p for a bottle of mineral water.
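The percentages quoted in this walkthrough, and the mean shown in the summary lines of the sample report, can be checked by hand. The short Python sketch below repeats the arithmetic; the variable and function names are illustrative and the figures are the ones quoted in the text.

```python
TOTAL_RESPONDENTS = 200

def percent_of_total(count, total=TOTAL_RESPONDENTS):
    """Percentage of the total sample, to one decimal place."""
    return round(100.0 * count / total, 1)

share_111p = percent_of_total(31)             # 31 respondents paid 111p
cumulative_111_to_114 = percent_of_total(86)  # 86 respondents paid between 111p and 114p
share_212p = percent_of_total(41)             # 41 respondents paid 212p

# Mean value = sum of factors / number of numeric items.
mean_price = round(32218.00 / TOTAL_RESPONDENTS, 2)
```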
Unlike count, if list is part of a loop, it will be executed once for each pass through the loop. All
values found will be entered in the same list: Quantum does not create a separate listing for each
pass through the loop.
PRICE PAID
Total = 200 Alphabetical Sort
Number of categories = 14
Number of numeric items = 200
Sum of factors = 32218.00
Mean Value = 161.09
Std deviation = 67.97
PRICE PAID
Total = 200 Rank Sort
Quick Reference
To create a multiplied or weighted frequency distribution, type:
where text is the frequency distribution title and c(m_start,m_end) is the field in the C array
containing the multiplier or weight for each record. If the multiplier contains a decimal point,
reference it as cx(m_start,m_end).
For a distribution sorted in alphabetic or rank order only, type lista or listr as appropriate instead
of list.
Creating multiplied frequency distributions is exactly the same as creating multiplied holecounts:
As with count, c(m,n) is the column field whose values are to be listed, text is the optional heading
to be printed at the top of the page, and c(x,y) is the field containing the multiplier. If the multiplier
contains a decimal point, reference it as cx(x,y), otherwise the decimal point will be ignored and,
for example, 1.5 will be read as 15. Multipliers may either be part of the original data, or they may
be created during the edit, in which case they must be placed in the C array with a wttran statement
before the frequency distribution is requested.
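The effect of the multiplier, and of reading it with or without its decimal point, can be sketched as follows. This Python model is an assumption-laden illustration: the function names and sample records are hypothetical, and the flag honour_decimal_point stands in for the choice between the cx and c notations.

```python
from collections import defaultdict

def read_multiplier(field, honour_decimal_point):
    """Interpret the characters of a multiplier field.

    When the decimal point is honoured (the cx case) '1.5' is read as 1.5.
    When it is ignored (the c case) '1.5' is read as 15, as the text warns.
    """
    if honour_decimal_point:
        return float(field)
    return float(field.replace(".", ""))

def weighted_distribution(records, honour_decimal_point=True):
    """Add up the multiplier, rather than 1, for each value listed."""
    totals = defaultdict(float)
    for value, weight_field in records:
        totals[value] += read_multiplier(weight_field, honour_decimal_point)
    return dict(totals)

# Hypothetical records: (value of the listed field, multiplier field).
records = [("111", "1.5"), ("111", "2.0"), ("212", "0.5")]
with_point = weighted_distribution(records, honour_decimal_point=True)
without_point = weighted_distribution(records, honour_decimal_point=False)
```

Comparing the two results shows why a multiplier containing a decimal point must not be referenced as a plain column field: every weight is inflated by a factor of ten.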
Multiplied frequency distributions are generally required when you are producing weighted tables
and you want to check that you have the correct number of people in each row of a table.
☞ For further information about weighting and wttran, see section 1.9, ‘Copying weights into
the data’ in the Quantum User’s Guide Volume 3.
In earlier chapters, we discussed ways of examining the data for a set of records (with count) or for
an individual record (with write). In general, however, we want to check the validity of the data for
individual records by putting in the edit a set of testing sentences which will tell us not only whether
a record contains an error but also what that error is.
There are two types of checking sentence. The first involves checking whether a column contains
the correct type of coding (single-coding or multicoding) and whether the codes in that column are
valid. Take the question on a respondent’s sex which may be Male, coded c106’1’, or Female,
coded c106’2’. c106 must be single-coded because a person cannot have two sexes, and the only
codes which may appear in that column are 1 and 2. Any record in which c106 is not single-coded
with a 1 or a 2 will be flagged as incorrect.
The second type of checking involves making sure that columns whose contents depend on the
contents of other columns contain the correct codes. For instance, suppose the questionnaire asks
whether the respondent has ever used a particular brand of washing up liquid. The answer is coded
into c125 as a ‘1’ for Yes and a ‘2’ for No. If the answer is Yes, the next questions concerning price
and quality are asked. If c125’2’, indicating that the respondent has not used that brand of washing
up liquid, the following columns must be blank. Conversely, if c125’1’, the following columns
must be coded according to the codes on the questionnaire.
11.1 require
Both tasks listed above can be carried out using if but sometimes they can become very complicated
and repetitive. Therefore, Quantum has an additional testing statement, require, specifically
designed to increase the efficiency of this checking process.
☞ For more information on the if statement, see section 9.1, ‘Statements of condition – if’.
The require statement can be used for three kinds of test:
• Column validation. Tests columns against a given set of characteristics and deals with records
not meeting the requirements according to a specified action code.
• Testing the validity of a logical expression. Tests a logical expression and, if it is true,
continues with the next statement. If the expression is false, the record is dealt with according
to the given action code.
• Testing the equivalence of logical expressions. Compares the logical value of a group of
logical expressions. If all are true or all are false, the run continues with the next statement, but
if the expressions yield a mixture of values the specified error action is carried out.
The actions which are carried out when the stated conditions are violated are determined by an error
action code defined either in the require statement itself or in a global statement placed at the start
of the edit.
☞ For information about the error action code, see ‘The action code’ in the following section.
The require statement has three forms, depending upon the function it performs, and these are
described in the subsequent sections. Each one must start with the word require which may be
abbreviated to r.
Quick Reference
To validate columns and codes, type:
where code is the error action code, condition is the type of coding required, and col1 and col2 are
the columns or fields to be tested.
For example:
r /5/ nb c110, c125
Our example checks that columns 110 and 125 are not blank (nb). Any records in which this is not
the case are written out to a new file and rejected from any tables that may be produced (/5/).
Quick Reference
To define a default error action code, type:
rqd number
The action code is a number between 0 and 7 which tells Quantum what to do with records that do
not match the required conditions (for example, records which are blank but which should contain
codes). The action code may either be entered as a parameter on each require statement or, if it is
the same for all statements, on an rqd statement.
0 Print a summary of errors only — records are not listed individually, but a count is kept of the
number of records failing each require statement. This is printed out at the end of the run.
3 Print the record and reject it from the tables. This is the default.
5 Write the record into the output data file, punchout.q and reject it from the tables.
6 Print the record in the print file, out2, and write it into the output data file, punchout.q.
To write a statement which would print out incorrect records but include them in the tables, we
would write:
r /2/ ....
Similarly, to have all incorrect records printed in the print file, written into the output data file and
rejected from the tables, we would write:
r /7/ ....
In both cases the action code is part of the individual require statement, but where the same action
applies to all requires, it is quicker and more efficient to define the action code on an rqd statement
at the beginning of the edit. For instance, if all erroneous records are to be written out and rejected
we would write:
rqd 5
The default action (code 3) is to print the record out and reject it from the tables.
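The action codes described above (0, 2, 3, 5, 6 and 7) are consistent with reading the code as a sum of three component actions: 1 for reject, 2 for print and 4 for write. That reading is an inference from the listed codes, not something the text states, so treat the Python sketch below as a memory aid and not as a definition. The names REJECT, PRINT, WRITE and describe_action are hypothetical.

```python
# Inferred component values (an assumption, not stated in the text).
REJECT, PRINT, WRITE = 1, 2, 4

def describe_action(code):
    """Describe an error action code between 0 and 7."""
    if not 0 <= code <= 7:
        raise ValueError("action code must be between 0 and 7")
    if code == 0:
        return "summary only"
    parts = []
    if code & PRINT:
        parts.append("print")
    if code & WRITE:
        parts.append("write")
    if code & REJECT:
        parts.append("reject")
    return " + ".join(parts)

descriptions = {code: describe_action(code) for code in range(8)}
```

Under this reading code 3 is "print + reject" and code 7 is "print + write + reject", matching the descriptions given for those codes.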
Checking with require can be as simple or complex as you like. In this section, we will start with
the simplest checks and deal with each extra feature in turn. We will assume, unless otherwise
stated, that the error action code is the default Print and Reject (code 3) and will omit it from most
of the examples accordingly.
The most basic form of the require statement simply checks whether the column or field of columns
contains the correct type of code; it does not check the individual codes themselves. Code types
may be:
b Blank
nb Not blank (single-coded or multicoded)
sp Single-coded (literally, single-punched)
spb Single-coded or blank
One of these types must follow the word require since it tells Quantum what to check for.
All that remains is to say which columns are to be inspected; just list each column or field of
columns at the end of the statement. If more than one column or field is defined, each one must be
separated by a comma.
----+----1----+----2----+----3----+----4----+
002411123481231&- *1927235537*&& 1 1 1
The statement:
r nb c10, c(25,35)
checks that columns 10, and 25 to 35 inclusive are not blank — they may contain any number of
codes. This record satisfies both conditions so it passes on to the next statement in the edit.
The statement:
r sp c11, c15, c23, c41
looks to see whether columns 11, 15, 23 and 41 are single-coded. In our record they are, but if this
were not the case (say c11’123’) the record would be printed out and rejected from any tables that
may be produced. Additionally, Quantum would tell us ‘Column 11 is 123’.
✎ Be careful when using field specifications with require: the condition applies to each column
individually, not to the field as a whole. For instance:
r sp c(1,4)
means that each of columns 1, 2, 3 and 4 must contain one code. It does not mean that the field
must contain one code overall. To check that a field contains one code only, use numb.
Very often some columns on the questionnaire are not used, so you might like to check that all such
columns are blank in the data file. In our example, let’s say that columns 51 to 70 are not used. To
check that there are no stray codes in these columns we would write:
r b c(51,70)
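The four code types, and the rule that a condition applies to each column of a field individually, can be modelled in a few lines. In this Python sketch a column is represented as a set of codes, with the empty set standing for a blank column; that representation and the function names are assumptions made for illustration.

```python
def check_column(codes, condition):
    """Test one column (a set of codes; empty set = blank) against a code type."""
    n = len(codes)
    if condition == "b":      # blank
        return n == 0
    if condition == "nb":     # not blank: single-coded or multicoded
        return n >= 1
    if condition == "sp":     # single-coded
        return n == 1
    if condition == "spb":    # single-coded or blank
        return n <= 1
    raise ValueError("unknown code type: " + condition)

def require(columns, condition):
    """Apply the condition to every column of the field individually."""
    return all(check_column(codes, condition) for codes in columns)

# Hypothetical four-column field in which the third column is multicoded.
field = [{"1"}, {"2"}, {"1", "2", "3"}, {"4"}]
field_is_sp = require(field, "sp")
field_is_nb = require(field, "nb")
unused_columns_blank = require([set(), set(), set()], "b")
```

The field fails the sp test because one of its columns is multicoded, even though the other three columns each hold a single code.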
Quick Reference
To define a message to be printed when a record fails a test, type:
When incorrect records are printed out, require automatically prints a short text describing the
error. Normally, it tells you what codes were found in the column which is wrong, but if this is not
what you want, you may define your own error text by entering it enclosed in dollar signs at the
end of the statement. This text will then be printed in place of the default text when errors are found.
For example, if c329 is multicoded when it should be single-coded, the statement:
r sp c329
will print the whole record and tell us which codes were found in that multicode:
Column 329 is 13
Instead of being told which codes the column contains, you may prefer to see a message linking the
error to a question on the questionnaire. In this case you will need to add your own error text as
follows:
Quick Reference
To check for specific codes in a column, type:
where codes1 are the codes to be tested for in column or field col1, and codes2 are the codes to be
tested for in column or field col2.
Any codes which are present in col1 but are not listed in codes1 are ignored. The same applies to
any other column and code pairs listed.
Sometimes it is not sufficient to check just the type of coding, and you will want to know whether
the codes found are valid for that column. To do this, we use the information given in the previous
section as a base, and add on our first ‘optional extra’.
To check whether a column or field of columns contains specific codes, follow the column
specification with the codes to be checked, enclosed in single quotes. For example:
r /5/ sp c223’1/5’
tells us that column 223 should be single-coded within the range of codes 1 through 5. Any other
codes in this column are ignored. Thus, a record in which c223’14’ is incorrect because it contains
two of the listed codes, whereas a record in which c223’27’ is correct because it contains only a 2
from the range ‘1/5’. Of course, any record which does not contain a 1, 2, 3, 4 or 5 at all is also
incorrect, regardless of whether or not it is single-coded: c223’9’ is just as wrong as c223’789&’.
Codes may also be defined with all other code types, thus:
r /3/ nb c156’2/6’
If c156 does not contain at least one of the codes 2 through 6 (regardless of anything else it may
contain) the record is printed out. Column 156 may be multicoded as long as at least one of the
codes is within the required range.
----+----6
9
Even though it checks for blanks, require b may be followed by columns and codes. You would do
this when you are checking that a column is either blank or, if not blank, that it does not contain
certain codes. Here’s an example to clarify this:
r b c134’1/8’
This statement tells Quantum that column 134 must never contain any of the codes 1 through 8:
only ‘09-&’ or blank are acceptable. This is the opposite of r sp and r nb, both of which list valid
codes. Any record failing this condition will be printed and rejected via the default action code 3.
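The three cases above differ only in how many of the listed codes must be present, since codes outside the list are ignored. The Python sketch below models this. It assumes that the twelve code positions are ordered 1 to 9, then 0, - and &, which is what the ranges quoted in the text suggest; the function names are hypothetical and a column is represented as a string of its codes.

```python
CODE_ORDER = "1234567890-&"  # assumed order of the twelve code positions

def expand(spec):
    """Expand a code specification such as '1/5' or '12' into a set of codes."""
    codes = set()
    i = 0
    while i < len(spec):
        if i + 2 < len(spec) and spec[i + 1] == "/":
            first = CODE_ORDER.index(spec[i])
            last = CODE_ORDER.index(spec[i + 2])
            codes.update(CODE_ORDER[first:last + 1])
            i += 3
        else:
            codes.add(spec[i])
            i += 1
    return codes

def require_codes(column, condition, spec):
    """Test a column against a code type, looking only at the listed codes."""
    listed = set(column) & expand(spec)  # codes not in the list are ignored
    if condition == "sp":   # exactly one of the listed codes
        return len(listed) == 1
    if condition == "nb":   # at least one of the listed codes
        return len(listed) >= 1
    if condition == "b":    # none of the listed codes
        return len(listed) == 0
    raise ValueError("code type not modelled here: " + condition)

results = {
    "sp 14": require_codes("14", "sp", "1/5"),
    "sp 27": require_codes("27", "sp", "1/5"),
    "sp 9": require_codes("9", "sp", "1/5"),
    "nb 19": require_codes("19", "nb", "2/6"),
    "b 09-&": require_codes("09-&", "b", "1/8"),
}
```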
Exclusive codes
Quick Reference
To check that a column or field contains no codes other than those listed, type:
If col1 contains any codes other than those given in codes1, the test is false.
Now that you know how to check codes, the next thing to discuss is how to check that all other code
positions are blank. Statements of the form:
r sp ca’p’
accept all records containing only one of the codes ‘p’ in column a, regardless of what other codes
are also present. To check that a column contains only the listed codes and nothing else, follow the
code specification with the letter o (for only) in upper or lower case. For example, to indicate that
c356 must be single-coded in the range ‘1/5’ and that all other positions (‘6/&’) must be blank, you
should type:
r sp c356’1/5’o
Any of the following would cause the record to be printed and rejected:
The require statement may define conditions for more than one column. Just follow each column
with the code positions to be checked and separate each set with a comma:
Here the columns to be checked are consecutive but have been listed separately because they each
have different sets of valid codes. If all columns could be single-coded in the range 1 to 7 we might
abbreviate this to:
r sp c(164,168)’1/7’ $q10a/e$
since this notation means that each column in the field must be single-coded within the given range
rather than that the field as a whole may contain only one of those codes.
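The only flag adds a second condition to the test: besides containing exactly one listed code, the column must contain nothing else. A minimal Python sketch of that rule follows. The code order, the function names and the string representation of a column are assumptions made for illustration.

```python
CODE_ORDER = "1234567890-&"  # assumed order of the twelve code positions

def code_range(first, last):
    """Return the set of codes from first to last inclusive."""
    start, stop = CODE_ORDER.index(first), CODE_ORDER.index(last)
    return set(CODE_ORDER[start:stop + 1])

def single_coded_only(column, allowed):
    """True if the column holds exactly one allowed code and no other codes."""
    present = set(column)
    listed = present & allowed
    others = present - allowed
    return len(listed) == 1 and not others

allowed = code_range("1", "5")
accepts_3 = single_coded_only("3", allowed)
accepts_27 = single_coded_only("27", allowed)   # 7 lies outside the range
accepts_14 = single_coded_only("14", allowed)   # two listed codes
accepts_blank = single_coded_only("", allowed)  # no code at all
```

Without the only flag a column containing 27 would pass, because the 7 would simply be ignored.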
Quick Reference
To define a correction code to be used as a replacement for codes which fail the required condition,
type:
new_code is the code or codes to be inserted in col1 if it fails the test condition. Any codes already
in that column are overwritten.
As you know, records found to have errors are printed, coded and/or rejected according to the error
action code. When the run is finished you will look at these records and, if possible, correct the
errors by using the on-line edit or correction file facilities.
☞ For information about on-line editing and the corrections file, see chapter 12, ‘Data
correction’.
Occasionally you will know in advance what to do with certain types of error; say, for instance, the
respondent’s sex has been miscoded. You may decide or be told to recode this person as a ‘3’ in
the appropriate column indicating that the sex was not known. The way to do all this in one go is
to write the normal require statement that checks columns and codes, and to follow the code
specification with a colon (:) and the replacement code (in this case ‘3’) enclosed in single quotes,
thus:
r sp c106’12’:’3’
Any record in which c106 is not single-coded with either a ‘1’ or a ‘2’ will have the contents of
c106 overwritten with a ‘3’.
if (numb(c106’12’).ne.1) c106’3’;
+write $c106 incorrect$
When working with fields, it is not possible to define replacement strings for the field as a whole.
You should, however, note that if a single replacement code is given for a field of columns, any
incorrect columns in that field will be overwritten with the replacement code. The correct columns
remain untouched.
If we have:
+----4----+
1927
+----4----+
1&2&
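The before and after records shown above can be reproduced with a small model of the rule that only the incorrect columns of a field are overwritten. In this Python sketch each character stands for one single-coded column, the valid range 1 to 5 is an assumption chosen to match the data shown, and the function name is hypothetical.

```python
def apply_replacement(field, valid_codes, replacement):
    """Overwrite each column whose code is not valid; leave correct columns alone."""
    return "".join(code if code in valid_codes else replacement for code in field)

before = "1927"
after = apply_replacement(before, valid_codes="12345", replacement="&")
```

The columns holding 9 and 7 are replaced, while the columns holding 1 and 2 are left untouched.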
✎ If you use this facility, remember that the replacement code is an alteration to the data, and as
such is operative only as long as each record is in the C array. If you want to save these
modifications you must include a statement in your edit which will write records to another
file. Statements which write out new data files are split and write. Alternatively, you can use
one of the action codes which writes records to the output data file.
☞ For information about split, see section 12.4, ‘Creating clean and dirty data files’.
For information about write, see section 7.1, ‘Print files’.
Quick Reference
To define defaults for all columns or fields tested, type:
The defaults may be overridden for an individual column by following the column with the
required coding, only flag and replacement code as usual.
By now you will have guessed that require statements can become lengthy things, especially when
specific codes have to be checked, replacement characters defined and error texts entered. In many
cases some, if not all, of these items will be common to the majority of the columns listed in the
statement; for instance, several non-consecutive columns may have the same set of valid codes.
When this happens you may enter these common items at the beginning of the require statement
as defaults for that statement. There are several ways of doing this, so let’s take the statement:
Both statements check whether columns 127, 129, 131 and 133 are single-coded in the range 0 to 9
or are blank. If the − or & codes appear in any of these columns, or if the columns are multicoded,
the offending records will be printed and rejected.
Defaults defined at the start of a require may be overridden for an individual column or field by
following that item with the new specification. For example:
tells us that columns 10, 12 and 15 must be single-coded in the range 1 to 5 while column 20 must
be single-coded in the range 1 to 3.
This checks that columns 10, 12, 15 and 24 are single-coded in the range 1 to 5 and that none of
the codes ‘6/&’ are present in those columns. Column 20 has its own code specification which
overrides not only the default codes but also the Only operator. Quantum will check that c20
contains only one of the codes 1 to 7, but it will ignore anything it finds in the range ‘8/&’.
This is exactly the same as the previous example except that we have added a replacement code to
be used when errors are found. This code refers to all columns named with this require, even
though column 20 has a different set of valid codes.
Quick Reference
To evaluate a logical expression, type:
The require statement can be used to evaluate a logical expression. If the expression is false, the
record will be dealt with according to the specified (or default) action code. If the expression is true,
the program continues with the next statement.
This type of require also has four parts, two of which are optional:
☞ Items 1, 2 and 4 are exactly as described in section 11.3, ‘Validating logical expressions’
above.
For further information about logical expressions, see chapter 5, ‘Expressions’.
For example:
r (c133’4’.and.c140n’5’) $Cols 33/40 incorrect$
says that c133 must contain a ‘4’ and c140 must not contain a ‘5’. If one or other or both expressions
are false, Quantum prints the record out with the message ‘Cols 33/40 incorrect’ and rejects it from
the tables.
This type of require statement is often used to check the number of codes present in a column or
group of columns. For example, if the questionnaire specifies that the respondent should name no
more than three products in his answer, you might write:
r (numb(c139).le.3)
causing any record in which column 39 is multicoded with more than 3 codes to be printed and
rejected. This statement has no error text, so any records printed will be followed by the require
statement itself.
Quick Reference
To test whether a group of logical expressions all have the same logical value, type:
Require can evaluate groups of expressions and perform given tasks depending on whether all
expressions are true or all are false. When all the expressions have the same value (i.e., all true or
all false) Quantum continues with the next statement in the program, whereas if some are true and
some are false, the record being tested will be dealt with according to the given (or default) error
action code.
This type of statement is generally used to check routing patterns. For example: if a ‘2’ in c125
means that the respondent did not try Brand A washing powder, we would expect columns 126 to
145 which record his opinion of it to be blank. On the other hand, if he tried the washing powder,
we would expect to find his opinions about it coded in columns 126 to 145. This can be written:
r = (c125’2’) (c(126,145)=$ $)
which says that to be accepted, a record must either have a ‘2’ in column 125 and blanks in columns
126 to 145, or something other than a ‘2’ in c125 with at least one code somewhere in c(126,145).
The following data is designed to clarify this.
----+----3----+----4----+----5
2 15
is accepted, so is
----+----3----+----4----+----5
15 42674 262&03 37 73
9 4 0
but
----+----3----+----4----+----5
2 6 8 15
is rejected, so is
----+----3----+----4----+----5
3 635
The first example is accepted because both expressions are true, the second is accepted because
both expressions are false. The third and fourth examples are both rejected because, in each, one
expression is true and the other is false.
Note that in this example, if column 125 does not contain a ‘2’ we are only checking that columns
126 to 145 contain at least one code; we are not checking whether those codes are correct.
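The equivalence test reduces to asking whether all the expressions share one logical value. The Python sketch below models the routing check on c125 and c(126,145). The function names are hypothetical, and the sample field contents are invented because the exact column positions in the printed records are not recoverable here.

```python
def require_equivalent(*expressions):
    """Pass if every expression is true or every expression is false."""
    values = [bool(expression) for expression in expressions]
    return all(values) or not any(values)

def check_routing(c125, c126_145):
    """Model of: r = (c125'2') (c(126,145)=$ $)."""
    did_not_try = "2" in c125
    opinions_blank = c126_145.strip() == ""
    return require_equivalent(did_not_try, opinions_blank)

blank = " " * 20
coded = "15 42674".ljust(20)

not_tried_and_blank = check_routing("2", blank)   # both true: accepted
tried_and_coded = check_routing("1", coded)       # both false: accepted
not_tried_but_coded = check_routing("2", coded)   # mixed: fails
tried_but_blank = check_routing("3", blank)       # mixed: fails
```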
When Quantum executes a require statement, it sets the variable failed_ to True if the data fails the
require statement or to False if the record passed the requirement. You can then test whether failed_
is True and take whatever actions you wish. For example, if you are checking that the respondent’s
sex is coded as a ‘1’ or a ‘2’ only, you may wish to blank out the column if it contains any other
code or codes. You could write this as:
r sp c123’12’
if (failed_) set c123’ ’
The test for failure is made on the last require statement executed for the current record. This may
not always be the most recent require statement in the program, and it may not be the require
statement you intend Quantum to execute. If you write:
r sp c112’1/5’
if (c115’1’) r b c116
if (failed_) set c116’ ’
the test for failure could apply to either of the previous statements. If column 115 does not contain
a ‘1’, the second require statement will not be executed and failed_ will be True if column 112 is
not single-coded in the range ‘1/5’. If column 115 contains a ‘1’, then failed_ will be True if column
116 is not blank.
You can get around this potential problem by setting failed_ to zero (the equivalent of False) just
before the require statement you wish to test. For example:
r sp c112’1/5’
failed_ = 0
if (c115’1’) r b c116
if (failed_) set c116’ ’
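The stale value problem is easier to see when the sequence is traced step by step. The Python sketch below models failed_ as an attribute that is updated only when a require is actually executed. The class, the function and its parameters are hypothetical stand-ins for the three statements in the example.

```python
class Edit:
    """Minimal model: failed reflects the last require that was executed."""

    def __init__(self):
        self.failed = False

    def require(self, passed):
        self.failed = not passed
        return passed

def run_edit(c112_ok, c115_is_1, c116_blank, reset_before_second):
    edit = Edit()
    edit.require(c112_ok)            # r sp c112'1/5'
    if reset_before_second:
        edit.failed = False          # failed_ = 0
    if c115_is_1:
        edit.require(c116_blank)     # if (c115'1') r b c116
    return edit.failed               # value seen by: if (failed_) set c116' '

# c112 is wrong and c115 is not 1, so the second require never runs.
stale = run_edit(False, False, True, reset_before_second=False)
fixed = run_edit(False, False, True, reset_before_second=True)
# c115 is 1 and c116 is not blank, so the second require fails.
genuine = run_edit(True, True, False, reset_before_second=True)
```

Without the reset, the failure of the first require leaks into the later test and c116 would be blanked for the wrong reason.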
Require is often part of an if statement saying “If this is true, then that also must be true”. In our
previous example with r= we were saying two things:
• if the respondent didn’t try Brand A, the columns associated with it must be blank, or
• if he tried Brand A, there must be a code in at least one of the associated columns.
Sometimes this type of test is too stringent and will reject records in which the data is perfectly
correct. For example, the extra questions for people who tried the product may not contain a
specific code for Refused or No Answer, so anyone who tried the product but refused to answer the
extra questions would have blanks in the relevant columns. This data is perfectly correct but would
be rejected by the r= statement which expects at least one column to contain a code. Therefore, we
need to write a statement that will only check whether columns 126 to 145 are all blank if the
respondent didn’t try the product; if the respondent tried the product we do not care whether he
answered the extra questions or not. The statement for this is:
This says that if the respondent did not try Brand A, all columns associated with it must be blank,
but if he tried the product we expect those columns to be single-coded in the range ‘1/7’ or blank.
One can also make require statements apply to smaller sets of data by having records for which
they would be irrelevant go around the statements. Let’s say c112 records whether there are
children in the household. If c112’1’ there are children and c113 and c114 must contain answers.
We could write:
if (c112n’1’) go to 30
r nb c(113,114)
30 continue
This means that all irrelevant records (respondents without children) would not be tested.
✎ This system makes sense when there are several requires and you want to avoid a whole set
of identical if statements. It’s more efficient and it’s easier to follow. Remember, as well, that
you can put in comments to remind yourself what you are doing and why.
It is always possible to deal with data which has been incorrectly coded and/or entered. If the errors
themselves cannot be corrected because correct codes cannot be determined, the incorrect data can
be collected under some miscellaneous heading in the tabulations.
However, a cleaner data set can be obtained by correcting or removing invalid data whenever
possible.
• Replace the incorrect codes with specific codes using edit forcing statements.
• Write a file of corrections to be merged with the original data when it is read in by a Quantum
program.
Changing the contents of the original data file is not a function of Quantum: you will need to use
the data editing program, ded, for this. If you do need to edit the original data file, you should
always take a copy of it first in case your editing does not have the desired effect.
☞ For further information about ded, see the SPSS MR Utilities Manual.
This section does not introduce any new keywords; instead it tells you how to combine the
statements that you already know in order to clean your data.
A record which generates too many error messages, or which is clearly incorrect can be removed,
as noted. Suppose its serial number is 2004. Then we have:
This rejects the record from the rest of the edit and the tabulation section as well.
This statement should be at the beginning of the edit to avoid unnecessary editing of a useless
record.
Columns within a record can be removed by blanking them out or setting them to a common reject
code, often a minus or ampersand.
For example:
All records in which c125 contains neither a 1 nor a 2 will have the contents of that column replaced
with an ampersand, and whatever is in c(126,145) blanked out. As a real-life example, suppose a 1
in c125 means that the respondent visited the market, and a 2 in that column means he did not.
Information about purchases made at the market is stored in c(126,145). If column 125 contains
neither a 1 nor a 2, we cannot clearly establish whether or not the respondent visited the market, so
we set c125 to a special code and blank out any information about purchases.
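The forcing rule described above can be written out as a small function. This Python sketch is a model of the behaviour only: the function name is hypothetical, each column is represented as a string of its codes, and the field c(126,145) is represented as a 20-character string.

```python
def force_unknown_visit(c125, c126_145):
    """Return the cleaned (c125, c(126,145)) pair for one record."""
    if "1" in c125 or "2" in c125:
        # The routing column is usable, so the record is left as it is.
        return c125, c126_145
    # Neither a 1 nor a 2: set the reject code and blank the dependent field.
    return "&", " " * len(c126_145)

purchases = "3 17   42".ljust(20)
kept = force_unknown_visit("1", purchases)
forced = force_unknown_visit("7", purchases)
```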
Inserting correct data is generally more difficult than removing invalid data, because you very often
don’t know what the correct data is. However, if you do know, you can correct the data record by
record, or make the same correction for any record which is incorrect. For instance:
corrects the record whose serial number is 2222 by setting a 2 into c112 and blanking out
c(113,114).
If you do not know what the correct data is, you may decide to replace the incorrect code or codes
with a valid code chosen at random. For example:
if (c(101,104)=$3625$) c145=rpunch(’1/5’)
replaces whatever was in column 145 with one of the codes 1 through 5 for the record whose serial
number is 3625.
✎ When correcting data on a record-by-record basis, it is more convenient to use the methods
outlined below.
Quick Reference
To allow interactive correction of errors, type:
online [label_number]
at the point at which you want to make corrections. label_number is the label of the statement to
execute when the record is returned to the main edit with an rt command. The default is to return
to the start of the edit.
On-line correction is a method whereby Quantum interrupts processing when incorrect records are
found, so that corrections, if any, may be made interactively. The record may then be re-edited to
check for further errors straight away.
When an incorrect record is found, the current contents of the C array are written to the print file,
out2, as usual, and a message is displayed on your screen indicating the record’s position in the data
file. Any messages associated with the write or require statement finding the error are also
displayed, and you then have the opportunity to accept the record as it is, reject it, correct it or re-
edit it. The record itself is not displayed unless you request it.
online
You may put in as many online statements as you like, but as long as there is one online statement
in the edit, on-line editing will be possible both at the point where the statement occurs and also at
the end of the edit. If there are no errors to be corrected, Quantum ignores the online statements.
Once an incorrect record has passed through the on-line edit, you may leave it to continue through
the rest of the standard edit until it reaches the end statement or you may return it to the start of the
edit to be retested. If you prefer, you may name a statement to which records should return simply
by giving that statement a label number and following online with that number. For example:
online 45
✎ Runs containing on-line edits must be run from a terminal rather than in the background until
the edit section is finished; otherwise you will not know when there is a record awaiting
correction.
Any corrections made during on-line editing are effective only during the current run unless
your edit contains one of the commands split or write to create a new data file. If your program
calls the on-line editor but does not contain split or write, a warning message will be displayed
when your program is checked.
Like any other editor, the Quantum on-line editor has its own set of commands, many of which are
similar in appearance and function to statements you would write in a normal Quantum edit. There
are three types of editing command:
• Those which terminate on-line editing either for the individual record or for the file as a whole.
Quick Reference
To display the record being edited, type:
di [column(s)]
As we said in the introduction to on-line editing, Quantum displays any messages associated with
the write or require statement finding the error, but does not automatically display the record itself.
It also displays an arrow prompting you for a command. To display the full record in its current
state, type display or di. The whole record is displayed underneath a ruler, as with the write
statement.
Sometimes it is easier to see the error if you print out the incorrect column or columns separately
rather than looking at the whole record. To see a column or field only, just follow the di command
with the numbers of the columns you wish to see. For example:
Column fields may be entered as just two column numbers separated by a comma, the parentheses
and the C being optional. Thus, the second example could equally well be written:
di 115,130
When a single column is displayed, the individual codes comprising a multicode are shown, but
when fields are displayed, a ruler is printed and multicodes appear as asterisks (*). Here is an
example:
-> di 25,35
+--- 3 ---+
613*9 2 144
-> di 28
159
->
In the first example, the asterisk represents a multicode, whereas in the second example where only
one column is displayed, the codes 1, 5 and 9 are a multicode in column 28.
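The difference between the two kinds of display can be modeled with a short Python sketch. This is an illustration only, not part of Quantum: the dictionary used to hold the record and the function names display_field and display_column are assumptions, and the positions of the blank columns in the sample record are guessed from the printed example.

```python
def display_field(record, start, end):
    """Render columns start..end (1-based, inclusive) as one line of text."""
    cells = []
    for col in range(start, end + 1):
        codes = record.get(col, "")   # codes held in this column, e.g. "159"
        if len(codes) > 1:
            cells.append("*")         # a multicode inside a field shows as *
        elif codes:
            cells.append(codes)       # a single code shows as itself
        else:
            cells.append(" ")         # an empty column shows as a blank
    return "".join(cells)


def display_column(record, col):
    """A single column is shown with every code it contains."""
    return record.get(col, "")


# Columns 25 to 35 of the record from the example; column 28 is multicoded.
record = {25: "6", 26: "1", 27: "3", 28: "159", 29: "9",
          31: "2", 33: "1", 34: "4", 35: "4"}
print(display_field(record, 25, 35))
print(display_column(record, 28))
```

The sketch leaves out the ruler that Quantum prints above a field.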
Correcting records
Quick Reference
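To overwrite the current contents of a column or field with a new code or string, type:
s column(s) codes
To add codes to a column or field, type:
e column(s) codes
To delete codes from a column or field, type:
de column(s) codes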
The words used for correcting records are set, emit and delete which are usually abbreviated to s,
e and de. They work in exactly the same way as their counterparts in the ordinary edit section:
s overwrites the original contents of a column or field with new information; e appends a single
code to the codes that are already in a column and de removes one or more codes from a column
leaving the remainder intact.
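If it helps to think of a column as a set of codes, the three commands can be pictured with the following Python sketch. It is a model for illustration only: the function names and the dictionary that stands in for the C array are assumptions, not Quantum syntax.

```python
def set_codes(record, col, codes):
    """s: overwrite whatever is in the column with the given codes."""
    record[col] = set(codes)


def emit_codes(record, col, codes):
    """e: add codes to the column, keeping those already there."""
    record.setdefault(col, set()).update(codes)


def delete_codes(record, col, codes):
    """de: remove the given codes, leaving any others intact."""
    record.setdefault(col, set()).difference_update(codes)


record = {5: {"1", "2"}}       # column 5 starts as the multicode '12'
set_codes(record, 5, "7")      # column 5 now holds only 7
emit_codes(record, 5, "3")     # column 5 now holds 3 and 7
delete_codes(record, 5, "7")   # column 5 now holds only 3
print(sorted(record[5]))
```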
There are many variations of these commands, all of which are equally correct. Just choose the one
that you find most convenient. Here are some examples. The first group are set statements for
overwriting the contents of a column or field with the given code or string of codes.
If you want to overwrite a single column with a single code, use one of the four formats on the first
line. In all cases you may type in the full command word (set) or the abbreviation (s). All four
variations replace whatever is currently in c5 with a code 7.
The examples on the second line are for overwriting a single column with a multicode. Notice that
if you use the = notation, the single quotes enclosing the multicode are optional.
The last line illustrates how to overwrite a field of columns with a string — in this case to replace
the current contents of columns 123 to 126 with the codes 4, 5, 6 and 7 respectively.
In all on-line set statements you may omit the set or s at the beginning of the command, thus:
When it comes to adding codes to columns, the on-line editor has an option that the ordinary editor
does not. Whereas the ordinary emit statement only allows you to specify single columns, the on-
line editor also allows you to emit strings of single-codes into a field of columns. Thus, the syntax
of the on-line emit statement is:
The same notes apply to deleting codes: the on-line editor allows you to delete codes from a single
column or a field:
✎ In all the examples we have just shown, the c, equals sign, single quotes and dollar signs are
optional as long as the components of each statement are separated by spaces. Additionally,
in assignments, set (or s) is optional.
Whenever you alter columns with set, emit or delete, the on-line edit checks that the columns you
are editing are within the range of the C array for the current job. If you are using the default array
of 1,000 cells, c1001 and above are out of range for editing.
Quick Reference
To accept a record whether or not it has been corrected, type:
ac
To terminate the edit and send the record to the tabulation section, type:
rt
To reject the record from the tables but continue the edit, type:
rj
The following commands may be used to determine a record’s path through the remainder of the
edit section and the tabulation section:
ac (accept) Accepts the record up to the point at which the online statement occurs, whether
or not it has been corrected. The record continues on through the rest of the edit
and will only be re-presented for correction by other online statements or at the
end of the edit if other errors are found. Records accepted in this way are written
to the clean data file if split or write are used.
rt (return) Terminates the edit for that record: that is, the record is assumed to have reached
the end statement. If split or write has not yet been reached, the record will not be
written to the clean data file even though it will be included in any tables produced
by the run.
rj (reject) Rejects the record. The record continues through the edit unless it is terminated
with rt. The record is copied to the dirty data file.
Quick Reference
To add new cards to the output data file, type:
ad card_num1[card_num2 ... ]
To remove cards from the output data file, type:
rm card_num1[card_num2 ... ]
The add command adds new cards to the output data file and rm removes cards from it. To add a
card type, type add or ad followed by the number of the card type to be added. If you are adding
several different cards at once, separate the card type numbers by spaces. Quantum will then set the
appropriate thisread variable to be true so that the new card type will be written out with the rest of
the data. Thus:
-> ad 3 4
will set thisread3 and thisread4 to be true so that the new cards 3 and 4 will be written out. Each
card will contain as many columns as the record length defined for the current run. If the C array
already contains data for a card 3 or 4, Quantum issues an error message to this effect.
Removing cards is exactly the same, except that the appropriate thisread variables are reset to false
to prevent the unwanted cards from being written out. It does not alter the data in your original data
file. If you try to delete a card that is not currently in the C array (i.e., the thisread variable is already
false) an error message is displayed.
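The effect of ad and rm on the thisread variables can be sketched in Python as follows. This is a simplified model, not Quantum code: the class name CardFlags is hypothetical, and Quantum's error messages are represented here by a ValueError.

```python
class CardFlags:
    """Tracks which card types will be written out for the current record."""

    def __init__(self, cards_present):
        # thisread[n] is True when card n is present in the C array.
        self.thisread = {card: True for card in cards_present}

    def add(self, *cards):
        for card in cards:
            if self.thisread.get(card, False):
                raise ValueError(f"card {card} is already in the record")
            self.thisread[card] = True    # card will now be written out

    def remove(self, *cards):
        for card in cards:
            if not self.thisread.get(card, False):
                raise ValueError(f"card {card} is not in the record")
            self.thisread[card] = False   # card will no longer be written out


flags = CardFlags([1, 2])
flags.add(3, 4)      # like: ad 3 4
flags.remove(2)      # like: rm 2
print(flags.thisread)
```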
Quick Reference
To return the record to the start of the main edit section, type:
ed
The edit command (abbreviation, ed) re-edits the record by sending it back to the start of the edit
or to the statement number given with online. If no more errors occur, the record is copied to the
clean data file.
If you prefer, you may hit the return key instead of typing ed.
Quick Reference
To cancel on-line editing for the rest of the data file, type:
ca
cancel (abbreviation, ca) cancels on-line editing but continues passing records through the standard
edit program. Any errors found subsequently are not displayed on the screen for correction, but
records are still placed in the clean or dirty files as appropriate.
The on-line edit commands we have just described are the defaults which are programmed into
Quantum. If you wish, you may redefine these command names or translate them into a language
other than English, or define your own abbreviations. You do this in the translatable texts file.
☞ To find out about this file, see section 1.9, ‘Customized texts’ in the Quantum User’s Guide
Volume 4.
Quick Reference
To write correct records out to a clean data file and incorrect records out to a dirty data file, type:
split [only]
at the point at which records are to be written out. Type split only if the edit does not alter the
contents of the record and you want to copy records directly from the original data file rather than
from an intermediate file.
Clean and dirty data files are the terms used to refer to files of correct and incorrect or rejected
records created automatically by the edit statement split.
Each time a record is read and reaches split, it is written out to the appropriate file in its current
state. If any changes have been made with assignment statements, emit, delete, priority, require or
the on-line edit, they will be saved in the clean data file if the record is now correct or in the dirty
data file if the record still contains errors or has been rejected.
Split may occur several times in the edit, but each record will be written out once only. In the
example below, the second split is redundant since all records will have been written out by the first
one. The data to be checked is:
Let’s suppose that the record has reached the require statement without error. Since c234’2’ and
c309’3’, the record is correct so it is copied to the clean file. However, when the next statement is
read and the contents of c146 are checked, we find that it contains a ‘5’ which means that it must
be rejected and should be copied to the dirty file by the second split. This does not happen because
it has already been written out by the previous split. For this example to place the record in the dirty
file instead, it should read:
r sp c234’1/5’,c309’1/5-&’ :’&’
if (c146’12’) emit c180’1’;else; reject
split
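The rule that each record is written out once only, in the state it is in when it first reaches split, can be modeled with a short Python sketch. The class and function names are assumptions made for illustration, and a single in_error flag stands in for both failed checks and reject.

```python
class Record:
    def __init__(self, serial):
        self.serial = serial
        self.in_error = False     # set by a failed check or by reject
        self.written_to = None    # "clean", "dirty", or None if not yet split


def split(record, clean, dirty):
    """Write the record once, according to its state when split is reached."""
    if record.written_to is not None:
        return                    # already written out: later splits do nothing
    if record.in_error:
        dirty.append(record.serial)
        record.written_to = "dirty"
    else:
        clean.append(record.serial)
        record.written_to = "clean"


def run_edit(split_before_check):
    clean, dirty = [], []
    rec = Record(serial=1)
    if split_before_check:
        split(rec, clean, dirty)  # record still looks correct here
    rec.in_error = True           # a later statement rejects the record
    split(rec, clean, dirty)
    return clean, dirty


print(run_edit(split_before_check=True))    # record ends up in the clean list
print(run_edit(split_before_check=False))   # record ends up in the dirty list
```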
Split is often used at the end of an edit after online. This causes all records found in error by write
and require statements to be offered in the on-line edit for correction and then saved in the clean
or dirty file according to the type of on-line commands you use. For example, if a record is flagged
as incorrect and you correct those errors, the record will be placed in the clean data file. The same
is true if you use ac to accept the record even if you do not make corrections. If you reject the record
with rj, the record will be placed in the dirty data file. By putting both statements at the end of the
edit you can be sure of seeing all erroneous records and of saving all records in their final state.
If some records are rejected from the run using reject;return, these records will not be included in
the clean or dirty files unless the data is split before the records are rejected:
split
if (c132n’1/9’) reject; return
In this example, because split appears in the edit before reject;return, all records will appear in one
or other of the clean or dirty files (depending on whether or not they contain errors) even though
records in which c132 does not contain any of the codes 1 through 9 have their edit terminated and
are rejected from the tables.
If the order of the two statements is reversed, so that split appears after reject; return, only
records in which c132 contains any of the codes 1 through 9 will appear in the clean or dirty files.
Again, which file the records are written to depends on whether or not they contain errors.
☞ For further information about using reject, see section 9.6, ‘Rejecting records’.
For further information about using return, see section 9.7, ‘Jumping to the tabulation
section’.
By default, an intermediate data file is created for splitting. The name of this file is clean.q. If the
run does not contain statements which alter the data (for example, recoding with assignment
statements or creating new columns) then this file will be identical to the original data file. In such
cases, you may save disk space during the run by splitting the original data file instead with the
statement:
split only
When we talk about the original data file, we do not mean that Quantum alters your original data
file in any way; merely that it reads records directly from this file and allocates them to the clean
and dirty files rather than taking a backup copy of this file and reading records from there.
✎ You may not use split only when the datapass reads input from another program (for example,
when you use a corrections file to correct records rather than writing a forced edit or using the
on-line edit). Instead, you should run Quantum using the corrections file only and write all
records to a new data file. Then run the datapass on this new data file.
If you do an on-line edit but forget split or write, your changes will not be saved. Also if you
have created new cards and have not made thisread true for the new cards (for example,
thisread3=1 for a new card 3), they will not be written out.
If you use split on a levels (trailer card) job, splitting is switched on for all levels and must
therefore be part of the top level edit. Additionally, it must appear once only and must not be
part of an if statement. A reject statement at any level rejects the whole record and writes it to
the dirty file.
Quick Reference
To correct data using a corrections file, create a file called corrfile containing statements of the
form:
The last method of correcting errors is to create a file of corrections which will be merged with the
original data when it is read by a Quantum program. The correction file must exist in the directory
or partition in which you will be running your job.
Corrections are made by comparing the serial number of the record currently in the C array with
the serial number given with each correction in corrfile. Consequently, all serial numbers in corrfile
must be in the same order as those in the data file. The format for a correction record is:
serial ; corrections
for records without trailer cards, or:
serial /n ; corrections
for records containing trailer cards. In both cases, serial is the record serial number and corrections
are the corrections to be made. The /n in the trailer card format is the read number defining the
trailer card to be corrected; it can be found from the error listing. For example, if our data contains
a card 1, three card 2s and a card 3, and we want to correct an error on the third card 2, the read
number would be /3 because the third card 2 is read into the C array during the third read. If /n is
omitted, the read number is assumed to be 1.
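To make the layout of a correction record concrete, here is a small Python sketch that splits one line of corrfile into its parts. It is not how Quantum reads the file. The function name is hypothetical, the corrections themselves are treated as opaque text, the sample corrections are illustrative only, and the sketch assumes that no semicolon appears inside a correction.

```python
def parse_correction(line):
    """Split 'serial [/n] ; correction ; correction ...' into its parts."""
    head, _, rest = line.partition(";")
    serial_text, slash, read_text = head.partition("/")
    serial = int(serial_text)
    read_number = int(read_text) if slash else 1   # /n defaults to 1
    corrections = [part.strip() for part in rest.split(";") if part.strip()]
    return serial, read_number, corrections


# Illustrative lines only; the correction text is not interpreted here.
print(parse_correction("10; s c112'1'; e c212'3'"))
print(parse_correction("123 /4 ; c120'5'"))
```

Because corrections are matched by serial number in file order, a fuller version would also check that successive serial numbers arrive in the same order as the data.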
As in the on-line edit, the s and the equals signs may be omitted. If the correction refers to a field
of columns, you may define a string of codes in place of a single code.
Any number of corrections may be specified for a record as long as each correction is separated by
a semicolon. The data to be corrected may be a single column or a field, and the corrections may
be single-codes or multicodes enclosed in single quotes or strings enclosed in dollar signs. If the
data variable is larger than the string it is to contain, the string will be right-justified and padded
with blanks. If the string is longer than the data variable, a warning message is issued.
The first record to be corrected is that with serial number 10. Column 112 is to be overwritten with
a ‘1’, a ‘3’ is to be added into column 212, column 314 is to be overwritten with the multicode ‘34’
and the ‘3’ in column 115 is to be deleted.
The second correction is to the cards in the C array after the fourth read for serial number 123. Both
corrections involve overwriting the original data with new codes.
✎ Correcting data with a corrections file is considerably faster than using a forced edit of the
form:
if (c(101,103)=$123$) c109’2’
Corrections in corrfile are made before the statements in the edit section of your program are
executed. If you are rerunning your previous job to correct errors and you have not altered the edit
in any way, you may save more time by telling Quantum to read the data but not to recompile and
load your program. This is done with the option –r on the Quantum command line.
☞ For further information about options for Quantum runs, see chapter 16, ‘Running Quantum
under Unix and DOS’.
The term missing values refers to data in numeric fields that is either non-numeric or totally blank.
You may find them in data gathered from questions of the type shown below:
If the respondent replies ‘no’ to question 1 or does not answer it at all, question 2 is not asked and
columns 9 and 10 are left blank. If the respondent replies ‘yes’ to question 1 then question 2 should
be coded either with a numeric value or, perhaps, with && for a don’t know answer. The blank data
and && are missing values.
You may also find missing values when a numeric field is incorrectly coded with a combination of
numbers and letters. This is usually the result of mistyping when the data is entered and can often
be corrected by looking at the questionnaire itself and then cleaning the data within the edit section
of the run.
Missing values processing is an optional feature. If you use it, Quantum automatically detects
missing values and provides a variety of facilities for dealing with them in both the edit and
tabulation sections of your run. In the edit section you have:
• Manual assignment of the special value missing_ to variables of your choice within the edit.
You can use missing values processing in the edit section, in the tabulation section, or both. To
switch it on in the edit section, type:
missingincs 1
To switch it off again, type:
missingincs 0
You may use these statements any number of times in the edit to toggle between using and not using
the missing values features.
✎ The missingincs statement is always executed wherever it appears in the edit. This means that
although the compiler will accept statements of the form:
if (....) missingincs 1
Quantum will, in fact, switch on missingincs for the rest of the edit or until a
missingincs 0 statement is read. It does not switch on missingincs selectively for only
those records that satisfy the expression defined by the if clause.
If a job contains an edit and a tab section and missing values processing is used in the edit, the
setting of missingincs carries forward from the edit to the tab section. If the edit uses missing values
processing but the tab section does not require it, remember to end the edit with a
missingincs 0 statement.
The general rules for non-numeric data variables in arithmetic assignments are as follows:
• Blanks in an otherwise numeric field are ignored, but totally blank fields are read as zero.
• &’s in an otherwise numeric field are ignored, but fields full of &’s are read as zero.
• Multicodes in an otherwise numeric field are ignored, but a field in which all columns are
multicoded is read as zero.
If you switch on missing values processing these rules are modified so that any field that is not
totally numeric or a combination of numbers and blanks is counted as missing.
Here is a table showing samples of data in a numeric field and the difference missing values
processing makes to the way that data is interpreted:
If you print variables whose values are missing_ in a report file or write them out to a data file,
Quantum will show their values as −1,048,576 rather than as the word missing_.
If an arithmetic expression uses a variable whose value is missing, the value of the expression
differs depending on whether or not missing values processing is switched on. If missing values
processing is switched on the value of the expression is always missing_. If it is switched off, the
value of the expression is always zero. For example, if c(1,3) contains the string ABC:
missingincs 1
t1 = c(1,3) * 100
sets t1 to missing_, whereas:
missingincs 0
t1 = c(1,3) * 100
sets t1 to zero.
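The rules for reading numeric fields, and their effect on arithmetic, can be summarized in a Python sketch. This is a simplified model under stated assumptions: each character of the string stands for one column, any character other than a digit or a blank (for example & or a letter) stands for a non-numeric column, and the names field_value, times and MISSING are invented for the illustration.

```python
MISSING = "missing_"      # stand-in for Quantum's special missing_ value
DIGITS = "0123456789"


def field_value(text, missing_values):
    """Numeric value of a field, with missing values processing on or off."""
    digits = "".join(ch for ch in text if ch in DIGITS)
    if missing_values:
        numeric_or_blank = all(ch in DIGITS or ch == " " for ch in text)
        if not digits or not numeric_or_blank:
            return MISSING            # blank or non-numeric field is missing
        return int(digits)
    # Processing off: non-numeric columns are ignored, empty result is zero.
    return int(digits) if digits else 0


def times(value, factor, missing_values):
    """value * factor, following the rule for expressions with missing values."""
    if value == MISSING:
        return MISSING if missing_values else 0
    return value * factor


for switch in (True, False):
    t1 = times(field_value("ABC", switch), 100, switch)
    print(switch, t1)
```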
If you have other values that you want to replace with the missing value in the edit, you may do so
by typing a standard assignment of the form:
variable_name = missing_
Since missing_ is a special value you cannot use statements of the form:
to test whether a variable has the special missing value. Instead, use the function:
ismissing(variable_name)
For example:
if (ismissing(t4)) ....
Subroutines can be used to make your program more readable by eliminating the need to use go tos
in certain circumstances. If you use a subroutine with a name describing its purpose it will be
immediately apparent what is to be done, and it will mean you don’t have to go skipping backwards
and forwards in the program in order to understand what it is doing.
Quick Reference
To call a subroutine, type:
call routine[(arguments)]
To use any subroutine, enter the call statement at the point at which the routine is required. The call
statement simply says:
call routine[(arguments)]
where routine is the name of the subroutine to be used and arguments are any other items of
information required by the routine. These will differ from routine to routine and are clearly
explained in the appropriate section below.
Quantum has its own library of subroutines which you may call from within your Quantum
program.
Quick Reference
To load data from a look-up file, type:
call fetch($file_name$,key_col,put_col)
To load data from a look-up file and generate a report of used and unused keys, type:
call fetchx($file_name$,key_col,put_col,keys)
where keys is a number whose value determines whether used or unused keys are listed in the
report.
Sometimes you will have additional information available that is not part of each respondent’s data
record but that nevertheless needs to be read into the C array for use in the analysis. For instance,
suppose we did some additional work on a chocolate purchasing survey and collected information
about the cost of various types of chocolate bars. We can transfer this information to the array in
two ways. We can either write an edit to check which brand has been bought and then copy the
appropriate price into the record using if and an assignment statement, or, in a much simpler
operation, we can put the costs into a look-up file and call them up as required with the fetch
statement.
A look-up file is one which contains information to be transferred into the data record at a given
point. Each item in the file has a unique key associated with it; this is very often the code
representing that information in the data. If brands A, B, C and D are represented by the codes 1
through 4 in the data, the costs for those brands must have the keys 1 through 4 as well. Similarly,
if a Ford Escort car is coded 274 in the data, the additional information for a Ford Escort would be
identified by the key 274 in the look-up file.
Data in the look-up file must be sorted in alphabetical order and must be formatted as follows:
• The first line must contain exactly two whole numbers anywhere on the line. The first is the
key length, the second is the total record length including the key.
• All other lines must start with the key which may be followed by any other information as
necessary.
The look-up for our chocolate survey is named costs and is as follows:
1 4
1 14
2 15
3 21
4 17
The first line tells us that the key is one character long and that each record is four characters
long in total (the space in column 2 is part of the data). The other lines refer to the individual
chocolate bars: Brand A (coded 1) costs 14 pence, Brand B (coded 2) costs 15 pence, Brand C
(coded 3) costs 21 pence and Brand D (coded 4) costs 17 pence.
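The file layout just described can be checked with a few lines of Python. The sketch below is an illustration, not Quantum's own reader: the function name parse_lookup is an assumption, and 'alphabetical order' is taken to mean ordinary string sort order.

```python
def parse_lookup(text):
    """Read a look-up file: a header line, then one keyed record per line."""
    lines = [line for line in text.splitlines() if line.strip()]
    # The header holds exactly two whole numbers: key length, record length.
    key_len, rec_len = (int(n) for n in lines[0].split())
    table = {}
    for line in lines[1:]:
        padded = line.ljust(rec_len)[:rec_len]       # force the record length
        table[padded[:key_len]] = padded[key_len:]   # key -> data after the key
    keys = list(table)
    if keys != sorted(keys):
        raise ValueError("keys are not in sorted order")
    return key_len, rec_len, table


COSTS = "1 4\n1 14\n2 15\n3 21\n4 17\n"
key_len, rec_len, table = parse_lookup(COSTS)
print(key_len, rec_len, repr(table["1"]))
```

Note that the data stored against key 1 is three characters long, a blank followed by 14, because the key itself is not part of the data.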
To transfer data from the look-up file to the record in the C array, enter the fetch statement in your
edit at the point at which data is to be copied. Fetch is a C routine and is invoked by typing:
call fetch($file_name$,key_col,put_col)
where file_name is the name of the look-up file, key_col is the start column of the key in the record,
and put_col is the start column of the field into which the data is to be copied.
Data copied from a look-up file does not retain its key at the beginning. If you look at the example
in the previous section, the data transferred for records with key 1 will be $ 14$.
Suppose, in our chocolate survey, that the first brand bought is stored in c135, the second in c150
and the third in c165. Brands are coded 1 through 4 as noted above, and costs are to be copied into
fields starting in columns 136, 151 and 166 respectively. To deal with all three purchases we will
call fetch three times, once per purchase. For the first purchase we would write:
call fetch($costs$,c135,c136)
When the first record is read, Quantum inspects c135 and compares its contents with the first field
of the look-up file. If c135’1’ (brand A was bought) and a matching key is found in costs, the
information associated with that key is copied into the C array starting at c136. In our example,
brand A chocolate bars cost 14 pence so c(136,138) will contain $ 14$. If a matching key cannot
be found in costs, the destination area c(136,138) will be blanked out.
Calls for the second and third purchases would be entered as:
call fetch($costs$,c150,c151)
call fetch($costs$,c165,c166)
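What each of these calls does to the record can be pictured with the following Python sketch. It is a model only: the dictionary standing in for the C array, the function name and its extra key_len and data_len parameters are assumptions, whereas the real fetch takes these lengths from the first line of the look-up file.

```python
def fetch(table, key_len, data_len, record, key_col, put_col):
    """Copy the data for the key found at key_col into the record at put_col."""
    key = "".join(record.get(key_col + i, " ") for i in range(key_len))
    # If the key is not in the look-up table, the destination is blanked out.
    data = table.get(key, " " * data_len)
    for offset, ch in enumerate(data):
        record[put_col + offset] = ch
    return key in table


costs = {"1": " 14", "2": " 15", "3": " 21", "4": " 17"}

record = {135: "1"}                      # brand A was the first purchase
fetch(costs, 1, 3, record, 135, 136)
print(repr("".join(record[c] for c in range(136, 139))))
```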
When you read additional data in from fetch files, Quantum writes a summary of what it has done
to the file out2. The format of the report is as shown here:
This tells you that the run used two fetch files. The first file, cost1, contained seven keys; five were
present in the data and two were not. The file was called 893 times altogether and 869 times the
key in the data was found in the fetch file. The 24 misses refer to keys that were present in the data
but not in cost1.
The second file was called cost2 and contained three keys, all of which were present in the data. The
file was called 196 times and every time the key in the data was found in cost2.
Nine digits are allowed for each column making the maximum count in a column 999,999,999.
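The figures in the summary are related to one another as shown in this Python sketch, which counts them for a made-up list of keys. The function name and the names of the counts are hypothetical, and the layout of the printed report is not reproduced.

```python
def fetch_summary(file_keys, looked_up):
    """Count keys and calls for one fetch file.

    file_keys: the keys present in the fetch file.
    looked_up: the key taken from the data on each call, in order.
    """
    file_keys = set(file_keys)
    used = {key for key in looked_up if key in file_keys}
    hits = sum(1 for key in looked_up if key in file_keys)
    return {
        "keys_in_file": len(file_keys),
        "keys_used": len(used),
        "keys_unused": len(file_keys - used),
        "calls": len(looked_up),
        "hits": hits,
        "misses": len(looked_up) - hits,   # keys in the data but not the file
    }


summary = fetch_summary(["1", "2", "3", "4"], ["1", "1", "3", "9", "2", "9"])
print(summary)
```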
If you want a list of which keys were used and unused, use fetchx instead of fetch. fetchx has the
same syntax as fetch except that it has an extra parameter at the end which tells Quantum which
additional information is required. Possible values for this parameter are:
So, to load data from a fetch file called costs and to see a list of used and unused keys, you would
type:
call fetchx($costs$,c150,c151,3)
If you use fetchx more than once, the key listings are printed after the summary line to which they
refer. If the listing goes over onto a new page the column headings are repeated at the top of the
page.
Quick Reference
To write a multicoded column out as a single-coded field, type:
call explode(mc_start_col,num_cols,’codes’,sc_start_col)
When Quantum converts multicoded data into single-coded data, it takes the codes in the multicode
and transfers each one to a separate column in the data, thus creating a single-coded field of
columns in addition to the original multicode. You may choose which codes should be exploded in
this manner, and also the start column of the single-coded field. To do this, type:
call explode(mc_start_col,num_cols,’codes’,sc_start_col)
where mc_start_col is the first multicoded column to be converted, num_cols is the number of
sequential columns to be converted, codes are the codes to be written out as single codes, and
sc_start_col is the first column in the single-coded field.
Codes are exploded in the order 1234567890–&. If the first code specified in codes is present in
the multicode, that code will be copied into the first column of the single-coded field. If the code
is not present, the column is blank. For instance, if our data is:
----+----5
1
/
4
and we write:
we will have:
----+----5----+
1 1234
/
4
If we write:
call explode(c132,2,’1/5’,c140)
then:
----+----4              ----+----4----+----5
 14                      14      12 4    45
 25           becomes    25
 46                      46
 7                       7
The explode statement says ‘explode codes 1 to 5 in the two columns starting at column 132 into a
field starting at column 140’. Quantum copies a ‘1’ into c140 because there is a ‘1’ in c132, and a
‘2’ into c141 because there is also a ‘2’ in c132. Column 142 is blank because there is not a ‘3’ in
c132, and so on. Notice that the ‘7’ in c132 and the ‘6’ in c133 have been ignored because they are
not part of the code specification with explode.
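The same operation can be written out as a Python sketch, which may make the column arithmetic easier to follow. It is a model rather than the Quantum routine: the record is held in a dictionary, the codes are passed as a plain string such as "12345" rather than a code specification, and an ordinary hyphen stands for the minus code.

```python
ORDER = "1234567890-&"   # the order in which codes are exploded


def explode(record, mc_start_col, num_cols, codes, sc_start_col):
    """Write each requested code of each multicoded column to its own column."""
    wanted = sorted(set(codes), key=ORDER.index)
    out_col = sc_start_col
    for col in range(mc_start_col, mc_start_col + num_cols):
        present = record.get(col, "")
        for code in wanted:
            # One output column per requested code: the code itself or a blank.
            record[out_col] = code if code in present else " "
            out_col += 1


# c132 holds the multicode 1, 2, 4, 7 and c133 holds 4, 5, 6.
record = {132: "1247", 133: "456"}
explode(record, 132, 2, "12345", 140)
print(repr("".join(record[c] for c in range(140, 150))))
```

The original multicodes in columns 132 and 133 are left as they were; the single-coded field is created in addition to them.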
If explode is called for any record in the data file, Quantum prints a map in the out2 print file listing
the contents of the multicoded columns and the columns into which the codes were transferred. If
explode is not called for any record, no map is produced.
Writing subroutines in C
Quick Reference
To write C subroutines, either type them into a file called private.c in the project directory, or insert
them in the Quantum run immediately before or after the edit section as follows:
#c
C statements
#endc
You can also include executable C statements in the edit section itself, as long as you enclose the
code within #c and #endc statements.
Subroutines written in the C language must be filed in the file private.c in the current directory so
that they will be compiled automatically with the rest of your Quantum program. If you have
already compiled your subroutines before doing your Quantum run, the compiled version must be
stored in the file private.o in the current directory.
Alternatively, you can insert complete C functions immediately before or after the edit section as
long as you enclose the code between #c and #endc statements as shown here:
#c
/* C code
#endc
Here are some examples of how to include a function, square, that calculates the square root of a
number. The Quantum edit that calls this function may look something like this:
real square 1f
ed
cx(181,190):4 = square(cx(1,3))
filedef srdata data
write srdata
end
When calling C functions, be sure to add the f option (where f stands for function) to the end of the
declaration as shown above. If you omit this, Quantum will not recognize the function name and
will issue a syntax error.
✎ If the function you are calling does not return a value, or if you do not need to save the return
value, you can use call to call the function and you do not need to declare it.
☞ For further information about call, see section 13.1, ‘Calling up subroutines’.
In the first example, the C code for the function is stored in the file private.c:
#include <math.h>
double square(double dval)
{
return (sqrt(dval));
}
In the second example, the C code for the function has been included directly after the end
statement in the Quantum run, and is enclosed by #c and #endc statements.
real square 1f
ed
.
end
#c
#include <math.h>
double square(double dval)
{
return (sqrt(dval));
}
#endc
It is also possible to include executable C statements directly into the Quantum edit section. Again,
the code must be surrounded by #c and #endc statements. Here is an example that calls the standard
C sqrt function directly and assigns the result to the Quantum variable x1.
ed
#c
#include <math.h>
x1 = sqrt(2.0);
#endc
cx(181,190):4 = x1
filedef srdata data
write srdata
end
In addition, any standard C library function, such as sqrt, can be declared and used directly in
Quantum. So the above example can also be written as:
real sqrt 1f
ed
x1=2
cx(181,190):4 = sqrt(x1)
filedef srdata data
write srdata
end
☞ For more details, see ‘Calling functions from C libraries’, later in this chapter.
Quick Reference
To define a subroutine written in the Quantum language, type:
return
subroutine name [(var1 [,var2, ...]) ]
Quantum statements
return
Subroutines written in Quantum must be placed at the end of the edit section, before the end
statement, and preceded by a return, thus:
Each subroutine starts with a subroutine statement and ends with a return. The format of the
subroutine statement is:
subroutine name [(var1 [,var2, ...])]
where name is the name of the subroutine. If you define more than one subroutine their names must
be unique within the first six characters of the name so, for example, sqroot and sqrt are acceptable
whereas sqroot and sqroot1 are not.
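If you have many subroutines, a quick check of their names outside Quantum can catch clashes before the run. The Python sketch below is one way to do it. The function name is hypothetical, and the sketch assumes that names are compared exactly as typed.

```python
def clashing_names(names, significant=6):
    """Return pairs of names that are identical in their first six characters."""
    seen = {}
    clashes = []
    for name in names:
        prefix = name[:significant]
        if prefix in seen:
            clashes.append((seen[prefix], name))
        else:
            seen[prefix] = name
    return clashes


print(clashing_names(["sqroot", "sqrt"]))      # no clash
print(clashing_names(["sqroot", "sqroot1"]))   # these two clash
```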
var1, var2, and so on, are variables which the subroutine will use. These variables are generally
referred to as the arguments of the subroutine.
When you use subroutines, Quantum differentiates between variables defined in the variables file
or before the ed statement and those defined after the ed or subroutine statements.
☞ For more information on the variables file, see chapter 14, ‘Creating new variables’ in this
volume, and chapter 1, ‘Files used by Quantum’ in the Quantum User’s Guide Volume 4.
Variables defined in the variables file or before ed are called external variables and may be
accessed and changed by statements within a subroutine. Variables defined after ed or inside a
subroutine are local variables and cannot be changed by a subroutine. For example:
real cost 1
int items 1
ed
int nshop 1
/* edit statements
return
/* subroutines
end
The variables cost and items are defined before the ed statement. This means they are external
variables and can have their values changed by a subroutine. The variable nshop is defined after ed
so it is a local variable. This means it cannot have its value changed by the subroutine, even though
its value can be passed to the subroutine for use by it.
Information stored in external variables is always available within a subroutine, and may be
accessed and changed regardless of whether you pass it as an argument to the subroutine. For
example, if we define an integer variable called items in the variables file, we can read its contents
and change them in the subroutine even if we do not include items as part of the call statement.
We might write:
call sub1
return
subroutine sub1
if (items.gt.5) emit c134’1’
return
end
This checks, inside the subroutine, whether the value of items is greater than 5 and, if so, inserts a
‘1’ in column 134. We do not pass the value of items to the subroutine because it is an external
variable which is available to the subroutine as a matter of course. Because items is an external
variable we could change its value in the subroutine if we wished. For instance, we could reset it
to zero.
Local variables which are required in the subroutine must be passed to the routine as arguments.
If the items variable was defined after the ed statement we would have to name it on the call
statement and on the subroutine statement thus:
ed
int items 1
call sub1(items)
return
subroutine sub1(items)
if (items.gt.5) emit c134’1’
return
end
This example performs the same task as the previous one. The difference is that this time items is
a local variable, so we must pass it to the subroutine. Once inside the subroutine, we cannot change
the value of items in any way.
In neither example is it necessary to pass c134 as an argument as all cells in the C array are external
variables.
When you use a subroutine which requires arguments, be sure that you call it with as many
arguments as are listed on the subroutine statement for that subroutine. If you give too many or too
few arguments, errors will occur.
For example:
call conv(gallons,liters)
.
subroutine conv(gallons,liters)
is correct because we call the subroutine with the same number of arguments as there are in its
definition, but:
call conv(aa,bb,cc)
.
subroutine conv(aa,bb,cc,dd)
is incorrect because we are calling conv with one argument fewer than its definition specifies.
When you return to the edit from a subroutine, any changes made to external variables will still
exist, but values assigned to local variables defined in the subroutine will not be accessible from
the main edit program. For example:
call sub1
return
subroutine sub1
int doneit 1
if (items.gt.5) emit c134’1’
items = 0
doneit = 1
return
end
Once the subroutine has been executed and control has returned to the edit, the value of items will
be zero but doneit will have no value at all.
Arguments
Generally, subroutines only need arguments when you are passing the values of local edit variables
to the subroutine. All arguments on the call statement must have a corresponding argument of the
same type on the subroutine statement. This is because Quantum does not compare the names of
the arguments on the call and subroutine lines. It simply passes the value of the first argument given
with call to the first argument named with subroutine and so on. For instance, if gallons and liters
are local edit variables and we want to use their values in the subroutine calc, we might write:
int gallons 1s
real liters 1s
ed
call calc(gallons,liters)
.
subroutine calc(input,output)
int input
real output
Here, the value of gallons is passed to input while the value of liters is passed to output. Input and
output are variables used solely within the subroutine so they are defined in the subroutine.
As we have said, external variables can always be changed by a subroutine whether or not they are
passed as arguments. If the subroutine is called once only, you would call it without any arguments
and then refer to the variables to be changed by name inside the routine. For example:
However, if you have a subroutine that is called more than once with different external variables,
you would represent them with local variables in the subroutine. For instance:
Here, n1 represents c120 or c220 and n2 represents total or tot2. n1 and n2 are local to the
subroutine so they are defined after the subroutine statement.
All local variables named on the subroutine statement must be defined in that subroutine. Real or
integer variables passed to the subroutine must be defined as such in the routine. For example:
subroutine conv(gallons,liters,price)
/* number of gallons bought
int gallons
/* equivalent in liters
real liters
/* price per gallon
real price
Single data variables (columns in the C array or user-defined data variables with one cell only) are
passed to a subroutine by naming the variable on a data statement as shown here:
subroutine chk(flav,prefb)
/* flavors bought
data flav
/* brand preferred
data prefb
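Data fields of more than one column are passed as integer values and are defined in the
subroutine with an int statement:
subroutine ctyp(car)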
/* make of car owned
int car
Any multicodes present in this field are ignored. If you have a multicoded field and you want to be
able to access the codes in each multicode, you must treat the field as a series of single data
variables and pass each one separately, using a data statement, rather than passing the field as a
whole. When variables are passed with call they are written in exactly the same way as you would
write them anywhere else in your edit. For example:
call sub1(c15,gallons,cost,c(20,28))
passes the address of the data variable c15, and the integer values of the variables gallons and cost
and the field c(20,28).
Notice that in the main definitions the size of the variable is defined, whereas in the subroutine
definition no size is required since all values are passed as integer values or, in the case of a single
data variable, as an address.
We have conducted a survey to test the market for a new TV station which would be available via
the satellite network. When it comes to asking how likely respondents would be to take this new
channel, people who already subscribe to the satellite network are asked slightly different questions
from those who do not. However, the possible responses to each set of questions are identical.
One way of checking these answers is to write a subroutine and call it up using variables to define
the columns to be checked. For example:
ed
/* c(21,23) is for those already subscribing
/* c(24,26) is for those who don’t subscribe
if (c17’1’) call subchk(21,22,23); else; call subchk(24,25,26)
/* rest of edit
return
subroutine subchk(high,low,dep)
/* high – willingness to take at $20
/* low – willingness to take at $10
/* dep – willingness to pay advance deposit
int high
int low
int dep
r sp ’1/59’ c(high), c(low), c(dep)
return
end
As our comments show, the fields to be checked are c(21,23) for those already subscribing to the
satellite network and c(24,26) for non-subscribers. Both calls to the subroutine subchk name the
columns in the field individually. This is because we want to look at the codes present in each
column. We have not defined the data variables at the start of the edit because they are read
automatically from Quantum’s variables file. This means that they are external variables and can
have their values accessed by the subroutine.
The subroutine statement uses local variables with names describing the contents of the variables
they represent. The variable high represents c21 and c24 which tell us how likely the respondent
would be to take the new station if it cost $20 a month. Similarly the variable low represents c22
and c25 and dep represents c23 and c26. All local variables are defined in the subroutine as the
name of the variable they represent.
The require statement simply checks whether each column is single-coded in the range ‘1/59’.
If you glance back at the example, you’ll notice that although we’re talking about columns in the
data, we’ve actually treated them as integers. The call to the subroutine simply gives the column
numbers without a preceding ‘c’. The subroutine itself defines its arguments as integers and then
uses them as pointers into the C array. There are two reasons for this:
• First, it allows Quantum to report the column numbers correctly if it finds records which fail
the require statement. Passing columns to a subroutine as data variables causes Quantum
always to refer to column 0 in the output from require regardless of the true column number
which is in error.
• Second, it enables you, if you wish, to set new codes into the columns used in the subroutine.
Normally, any changes made to the C array inside a subroutine are forgotten when control
passes back to the main program. Referring to the columns as pointers into the C array, as in
this example, causes any changes to the C array to be remembered when the subroutine
finishes.
✎ The notes in this section are for guidance only. SPSS does not own the source code for
functions in the C libraries and therefore cannot support them. If you have any problems,
consult your C compiler reference guide.
The C runtime and maths libraries contain a number of general-purpose functions, some of which
may be useful in Quantum programs. For example, if you want to square a number or calculate a
square root, you will almost certainly find functions that do this in one of the C libraries.
Before you use a C function in Quantum, read the documentation on that function to find out what
parameters it needs, and of what type. Having done this, you then need to provide this information
in a format Quantum understands. In order to explain how you do this, we’ll use the pow function
which raises a value to a given power.
The Unix documentation for pow( ) states that the function expects two arguments, both of which
are double precision real variables. This means that your Quantum program will need to hold the
value and the power (exponent) in x variables:
x1 = 5
x2 = 2
x3 = pow(x1, x2)
Even if one of the arguments is a constant, as both are in this example, you must assign the values
to variables as Quantum will not accept real constants within the function’s parentheses.
pow( ) returns a value which you want to use in your Quantum program. In order to do this, you
must define the function in the variables section of your run (that is, in the variables file or at the
top of your program, before the ed statement). The function’s type must be set to the type of data
the function returns. pow( ) returns a double precision value so we define it as:
real pow 1f
Placed at the top of the program, before the ed statement, this gives:
real pow 1f
ed
x1 = cx(11,14)
x2 = 2.0
x3 = pow(x1, x2)
end
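For comparison, the same C library function can be tried outside Quantum. The sketch below uses Python's math.pow, which converts both arguments to double precision floats and returns a float, mirroring the C prototype described above. It is Python, not Quantum syntax; the names x1, x2 and x3 simply echo the Quantum example.

```python
import math

# Both arguments are held in variables as floating-point values,
# just as the Quantum example holds them in x variables.
x1 = 5.0
x2 = 2.0

# math.pow always works in double precision and returns a float.
x3 = math.pow(x1, x2)

print(x3)  # 25.0
```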
The table below lists the various C return types and shows how to define them in Quantum:
When looking things up in this table, bear in mind the following points:
• Quantum uses long integers, so all integer variable types except ‘unsigned long’ can be
accommodated.
• Quantum does not support unsigned values, but this is only a problem with ‘unsigned long’
variables.
If you are not interested in the value the function returns, or the function does not return a value at
all, you can treat it as a subroutine and run it using call, as you would for the standard Quantum
functions. For example:
Whether you call C library functions as subroutines or functions, you need to specify the arguments
correctly in Quantum so that they are converted to the appropriate C variable types. In general, the
safest option is to store any real or integer arguments in Quantum real or integer variables, as in the
pow( ) example, and then call the function with those variables as the arguments. This is
particularly important when dealing with Quantum data variables.
You can pass text strings as they are, as you saw for printf, but you cannot pass text held in data
variables.
✎ Quantum stores all names in lower case. So if you want to reference an external function
whose name includes upper case characters, you need to define a function in private.c using a
name in lower case, to call the external function.
☞ For more information about private.c, see section 1.12, ‘C subroutine code file’ in The
Quantum User’s Guide Volume 4.
In chapter 4, ‘Basic elements’, we said that Quantum automatically provides you with an array of
1,000 data variables in which to store data, 200 integer variables for storing whole numbers and
100 real variables for storing real numbers. We also said that you may create your own data, integer
and real variables with names representing the type of information they contain. In this chapter we
will discuss how to increase the number of variables that Quantum provides and how to create your
own named variables.
All variables in a program must have a unique name, which can be up to 253 characters long. The
name must not contain spaces and must start with a letter.
You can use only the following characters in a name: the letters A through Z, the digits 0 through
9, and the underscore (_).
You may choose any name you like, but you are advised to use names which have some relevance
to the type of data they contain — for instance, total_income for a variable which contains a
respondent’s total income.
Also, remember that Quantum is case insensitive and therefore does not distinguish between
uppercase and lowercase letters. For example, COUNTRIES_Visited is the same as
Countries_Visited.
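The naming rules above can be summarized as a simple check. This is a minimal sketch in Python (not Quantum syntax); is_valid_name and same_variable are hypothetical helper names, and the check covers only the rules stated here (length, first character, permitted characters and case insensitivity):

```python
import re

MAX_NAME_LENGTH = 253  # from the text: names can be up to 253 characters long

# A letter first, then any mix of letters, digits and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_name(name):
    """Return True if the name obeys the rules stated in this section."""
    return len(name) <= MAX_NAME_LENGTH and NAME_PATTERN.fullmatch(name) is not None


def same_variable(first, second):
    """Names that differ only in case refer to the same variable."""
    return first.lower() == second.lower()


print(is_valid_name("total_income"))   # True
print(is_valid_name("total income"))   # False: contains a space
print(is_valid_name("2nd_visit"))      # False: does not start with a letter
print(same_variable("COUNTRIES_Visited", "Countries_Visited"))  # True
```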
Variable names can include digits. However, if a name ends with a digit and you are using the ‘s’
option, you still have to refer to the individual columns using parentheses. For example, if you
create a data variable called safe with the ‘s’ option, you can refer to its twelfth column as:
safe12
However, if you create a data variable whose name ends with a number, such as safe1,
Quantum does not recognize safe112 as column 12 of the data. So you have to write:
safe1(12)
So, to avoid unexpected conflicts during a Quantum run, it is probably simpler to name
your variables using the letters A through Z and the underscore character only.
Quick Reference
To define a data variable, type:
Type s after the variable’s size if you want to be able to omit the parentheses from references to
single cells in the variable.
Before Quantum will recognize named variables in your program, you must say what type of
information the variable is to contain and how many cells it should have. If you wish to increase
the size of the C array, you must indicate how many cells you require.
There are three places that you can declare named variables:
• In the variables file. Variables declared here are available in the edit and tab sections of your
program and also in subroutines, and may be changed by the edit or by a subroutine.
• At the start of your program before the ed statement. Variables declared here are available in
the edit and tab sections of your program and also in subroutines and may be changed by the
edit or by a subroutine.
• In the edit after the ed statement. Variables declared here are available in the edit section only
and may only be changed there. They are unknown to the tab section and to subroutines.
Each declaration consists of the following:
• The variable type: data, int or real.
• The variable name: C, T or X to increase the number of data, integer or real variables available;
any name for a new variable.
• The variable size. This is generally the number of cells the variable is to have.
For example:
data c 1500
increases the size of the C array to 1500 cells. This provides space for records with up to 14 cards
per respondent.
int number_of_trips 5
creates an integer variable called number_of_trips which can store up to five whole numbers.
real price 10
creates a real variable called price which can store up to ten real numbers.
✎ Increasing the C array with a data, int or real statement does not cause Quantum to clear the
extra cells between records. However, when you increase the C array by using the max=
option on the struct statement, Quantum automatically clears the entire array between records.
☞ For further information on max=, see ‘Highest card type number’ in chapter 6, ‘How Quantum
reads data’.
When we first talked about variables we said that the individual cells of an array may be referenced
by following the name of the array by the cell number enclosed in parentheses. Therefore:
We also mentioned that you may omit the parentheses when you are referring to a single cell in the
C array so that c100 means the same as c(100).
To make this possible you must follow the variable size with the letter ‘s’. This is particularly
important when you are increasing the size of the C array as, without it, any references to, say, c15
will cause errors. For instance, if we write:
data c 1200s
we are increasing the size of the C array to 1200 cells — enough for 11 cards per record. Because
the array size is followed by ‘s’ we can write c1056 when we mean c(1056): Quantum will
substitute the parentheses automatically.
The dimension of the C array will be taken automatically from the value of max= on the struct
statement if this is greater than the dimension requested in the variables file or at the start of your
program file. For example, if you have:
int c 1300s
struct;max=15;ser=c(1,4);crd=c(79,80); ....
in your program, the C array will be increased to 1600 cells to accommodate card type 15.
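The three sizes quoted in this chapter (1500 cells for 14 cards, 1200 cells for 11 cards, and 1600 cells for card type 15) are consistent with each card occupying a block of 100 cells above the first 100. The sketch below, in Python rather than Quantum syntax, works through that arithmetic. The formula is inferred from those three figures only and is not stated by Quantum itself; the function names are our own.

```python
# Assumption inferred from the figures in the text: card n lives in the block
# of cells starting at n*100, so the array needs (n + 1) * 100 cells.
CELLS_PER_CARD = 100


def highest_card(cells):
    """Highest card number that fits in a C array of the given size."""
    return cells // CELLS_PER_CARD - 1


def cells_for_max_card(max_card):
    """Cells needed to hold cards up to max_card (the max= value on struct)."""
    return (max_card + 1) * CELLS_PER_CARD


def effective_c_array_size(declared_cells, max_card):
    """The larger of the declared size and the size implied by max=."""
    return max(declared_cells, cells_for_max_card(max_card))


print(highest_card(1500))                # 14
print(highest_card(1200))                # 11
print(effective_c_array_size(1300, 15))  # 1600
```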
Note the difference between the following two definitions:
int brand 1s
int brand 1
The former creates the variable ‘brand’ as an array, and you can refer to it in your program as
brand1. The latter creates a single named variable that must be referred to as brand.
If you are not increasing the number of data, integer or real variables or creating new variables,
there is no need to set up a variables file. Quantum will read the default values from its own
variables file, as follows:
data c 1000s
colreal cx c
real x 100s
int t 200s
This gives you the 1000 data variables, 100 real variables and 200 integer variables mentioned in
chapter 4, ‘Basic elements’.
The second statement (colreal cx c) informs Quantum that variables referred to as cx are, in fact,
data variables whose contents are to be treated as real numbers.
An alternative method of naming variables is to define them as part of your Quantum program.
Variables which you want to use within your program and which you want to be able to change in
a subroutine must be defined before the ed statement. These are called external variables.
Variables which are to be used during the edit, and whose values may be passed to a subroutine but
not changed by it may be defined after the ed statement. These are termed local variables.
Here is an example:
☞ For further information about external and local variables, see ‘Passing information between
the edit and a subroutine’ in chapter 13, ‘Using subroutines in the edit’.
Data-mapped variables can be used to store the answers to questions, both numerical and
categorical. When storing numerical information, a data-mapped variable can be treated in the
same way as other numerical variables. Categorical values are generally stored and retrieved as text
strings, that is, the response texts of a question.
As the name suggests, data-mapped variables are typically used in conjunction with one or more
data-mapping files and allow Quantum specs to be written without needing column and code
information. Instead, the Quantum spec can be written so that it automatically retrieves the information
it needs from the data-mapping files used. Using this technique, you can specify conditions in your
Quantum run by referring to the response texts that appear in your questionnaire, rather than having
to specify the columns and codes that are involved. An example of this could be:
While it is possible to use data-mapped variables on their own, they then offer little beyond
what is already available. The real power of these variables comes when you start using
data-mapping files.
A data-mapping file contains information about what is contained in a data file and where specific
information is located within it. The Quancept data acquisition package produces such a file, the
project qdi (questionnaire data information) file. This file contains details of all the variables
defined, their possible values (that is, the possible responses), and where the information for
particular variables is located in the data records.
Instead of having to specify the columns which refer to the data, you can simply use the names
they were given in the data-mapping file. You do not need to write specifications to transfer
this information — it happens automatically in the same way as Quantum automatically sets
entries in the C array according to the data read in. This means that Quantum will set the values
of data-mapped variables for variables whose names appear in the mapping file that is being
used.
The data-mapping file contains the column and code values for each of the responses in a
categorical question. Because of this, there is no need to write this in your Quantum
specifications; you can just refer to the response text itself. At first, this may seem a bit
cumbersome. For example, it may seem easier to write:
c=c233’1’
as opposed to:
c=opinions$I liked the first brand much more than the second$
However, you do not need to specify the whole response text, just enough to uniquely identify
it. (In addition, it is very likely that the specifications will have been automatically generated
rather than hand written.) The above example could therefore be written as:
c=opinions$I liked\$
You type in the characters which uniquely identify the text and then append the \ character to
ignore the remaining characters in the string. This is described in more detail later.
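The idea of matching a response by a unique leading fragment can be pictured as follows. This is a minimal sketch in Python (not Quantum syntax) of one plausible reading: a pattern ending in \ matches any response text that starts with the preceding characters, and the match must be unique. The function match_response, the second response text and the case-sensitive comparison are all assumptions made for illustration.

```python
def match_response(pattern, response_texts):
    """Return the single response text identified by the pattern."""
    if pattern.endswith("\\"):
        # Trailing backslash: ignore the remaining characters of the text.
        prefix = pattern[:-1]
        matches = [text for text in response_texts if text.startswith(prefix)]
    else:
        matches = [text for text in response_texts if text == pattern]
    if len(matches) != 1:
        raise ValueError(
            f"{pattern!r} matches {len(matches)} responses, expected exactly 1"
        )
    return matches[0]


opinions = [
    "I liked the first brand much more than the second",
    "I disliked both brands",  # hypothetical second response
]

print(match_response("I liked\\", opinions))
```

With these two responses, the fragment "I liked" is enough to identify the first text, whereas "I " alone would be ambiguous and is rejected.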
When referring to data fields by name, and response codes by the response text, the data
locations are derived entirely from the data-mapping files. The advantage of this is that if the
data layout changes, all you need to do is use the data-mapping file for the new data set.
Furthermore, Quantum lets you analyze many data sets with many different mapping files —
all in the same run. You do not need to write complex recoding specifications as this is handled
automatically, which in turn, means less chance of error.
Data-mapping files can contain a complete description of one or more data files. This can
include field names, response texts and their locations. In fact, everything that would be
required to generate the main body of a Quantum run can be held in data-mapping files and the
Quantum specifications can be generated automatically. Obviously, there will always be some
reason why specifications generated in this way would need to be manually adjusted, but this
can be kept to a minimum, freeing you from the routine tasks (where mistakes are likely). This
allows you to concentrate on the more complex requirements.
☞ For more details about generating a Quantum specification automatically, see section 15.10,
‘Automatically generating a Quantum spec’.
All this does not mean that you must have data-mapping files to use data-mapped variables.
However, without a data-mapping file, you would have to manually load values into the data-
mapped variables, which removes many of these advantages.
The data-mapping file describes the content and layout of one or more data files. It contains the
following information:
• For numerical fields, the location of the fields in the data file (that is, the card and column
specifications).
• For categorical fields, the response text and location (that is, the card, column, and codes) and,
possibly, the unique ID for each category.
• Additional information that is not used by Quantum such as the limits of a numerical range.
Data-mapped variables can be defined in two ways:
• As a normal variable definition, using the mapvar variable type followed by the variable name
and size.
For example:
mapvar my_variable 1
Where data-mapped variables are defined using mapvar, the following should be noted:
— If you define an array of mapvar variables (that is, by specifying a size greater than 1), the
actual size of the array is determined by its use and not by the size specified. For example,
if you have the question ‘Which colors did you paint the walls of each room’, you could
specify an array of rooms as:
mapvar rooms 2
Note, however, that defining a size of 1 will always define a single variable.
— The s (special) and f (function) options have no meaning and so are not valid for mapvar
variables.
• Using the *usemap statement to introduce a map file. The syntax for this statement is:
*usemap mapfile_name
For example:
*usemap project.qdi
If you use a *usemap statement to introduce a map file, a variable is automatically defined for
each item in the file.
✎ Although you may use the same name for a mapvar variable and a *usemap file, you cannot use
the same name for a data-mapped variable and any other type of variable. For example, you
could assign the name preferences to both a data-mapped variable and a map file, but you
could not then use this name for a data, integer or real variable.
As mentioned earlier, data-mapped variables can hold both numerical and categorical data. If a
data-mapped variable is storing numerical data, it can be used in exactly the same way as any other
numerical variable. For example, you can:
• Use the value in assignments and arithmetic expressions, for example:
my_mapvar=t1+23
x1=my_mapvar
t1=t1+my_mapvar-4
• Test the value using logical operators (that is, .eq., .ne., .lt., .le., .ge. and .gt.), for example:
n01Total bought;c=my_mapvar.gt.0;inc=my_mapvar
• Analyze the value using the val statement, for example:
val my_mapvar;=;base=Total;1;2;3;i;4-5;6+
• Analyze the value using the var statement (described later), for example:
var my_var;=;base=Total;1;2;3;i;4-5;6+
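For data-mapped variables storing categorical data, use is similar to that of data variables that are
storing categorical data. In this case, you can:
• Copy the responses from one variable to another, for example:
my_mapvar1=my_mapvar2
• Test whether a response is present, for example:
n01Drank Pepsi;c=my_mapvar$Pepsi$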
• Analyze the value using the var statement (described later), for example:
var my_mapvar;base=Total;Coke;Pepsi;Other=$_other$;DK=rej
In addition to normal response code names, packages such as Quancept allow certain special
responses in the data. In order to check or set these names, the following special response texts are
recognized by Quantum:
When using data-mapped variable arrays, you can refer to the array element just as you would any
other variable array, that is, by specifying the element using the numerical index. However, if the
data-mapping file contains names for the array elements (note that using a qdi file which was
generated by Quancept will create arrays for variables that are iterated and the elements are named
after the iterations), then you can use those names to reference specific array elements. For
example, if you had the variable array wrate in the Quantum run that stores the rating given to the
widget suppliers Wilsons Wonderful Widgets and Just Widgets, you could refer to the rating for
each supplier as:
wrate(1)
wrate(2)
or:
Also, if the array elements have unique IDs associated with them, then these too may be used to
refer to the elements. As with other uses of unique IDs, the ID text is converted to a response text
format by placing it within parentheses and prepending the underscore character. Therefore, using
the same example as above, you could write this as:
wrate($_(wilsons)$)
wrate($_(justwids)$)
If you are not using a mapping file, or the data-mapped variable is not represented in the mapping
file, then each array element will be created when it is first used.
One of the most powerful features of data-mapped variables is that you don’t have to assign values
to them. This is done automatically; as records are read, these variables are automatically initialized
according to the data. All you have to do is to introduce the data-mapping file prior to reading data.
The data-mapping file will contain item definitions for fields contained in each data record. Where
those item names match the name of a data-mapped variable, the variable is initialized as a data
record is read. (Variables whose names do not correspond to any of the items in the map file will
be cleared.) So, if you just want to analyze your data, simply introduce the map file at the beginning
of your Quantum spec to define the variables, write your analysis specs using those variables, then
introduce the map file again immediately before reading your data.
Of course, if you have a second data file with a different mapping scheme, you would:
You can see above how your Quantum run specifications do not change at all. Also note that you
only need to introduce one of the maps in your Quantum specifications. This is because you are just
using the map file to define your variables. If the same items exist in both files, you do not need to
define them twice.
There are various reasons why you may wish to explicitly assign values to data-mapped variables,
so naturally you can set values into data-mapped variables. Below is a summary of how you can
achieve this:
• Assign a numerical value.
You can assign the value of any numerical expression directly to a data-mapped variable. For
example:
q23 = t1 + 7
If the variable is either clear or already holds a numerical value, then the result of the arithmetic
expression is stored as a numerical value.
If, however, the variable is already set to one or more categorical responses, then Quantum
attempts to set the categorical response that corresponds to the result of the arithmetic
expression. For example, if the result of the expression is 5, then Quantum sets the 5th
categorical response and all other categorical responses are cleared. Looking at the categoric
question:
You may then have a data-mapped variable called Q23. Typically, you would expect the
variable to be tested for the exclusive response Mail as:
if (q23=$Mail$) ...
However, you could refer to it by its numeric value (that is, 4):
if (q23.eq.4) ...
So, if you wanted to explicitly set Q23 to be $Mail$, you can do it in one of two ways:
q23 = $Mail$
q23 = 4
However, since the variable is associated with a list of responses, then the following would
give a data error:
q23 = 7
This is because there is not a seventh response in the list. If, however, the variable were a true
numeric type, this would be fine.
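The link between a response number and a response text can be modelled as a position in the response list. The sketch below is in Python, not Quantum syntax. The response list is hypothetical: the only facts taken from the text are that Mail is the fourth response and that the list has no seventh response. The function name is our own.

```python
# Hypothetical response list for Q23; only the position of "Mail" (4th) and the
# absence of a 7th response come from the text.
Q23_RESPONSES = ["Telephone", "In person", "Fax", "Mail", "E-mail"]


def response_for_number(response_texts, number):
    """Return the response text selected by a 1-based response number."""
    if not 1 <= number <= len(response_texts):
        # Corresponds to the data error described for q23 = 7.
        raise ValueError(f"data error: there is no response number {number}")
    return response_texts[number - 1]


print(response_for_number(Q23_RESPONSES, 4))  # Mail
```

Calling response_for_number(Q23_RESPONSES, 7) raises an error, in the same way that q23 = 7 gives a data error for a variable tied to this list.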
• Assign a response.
You can assign a specific response either by using the method for assigning a numerical value,
or by using the following syntax:
variable_name = $response_text$
For example:
q1 = $Once a week$
If unique ID texts are defined for responses in the data-mapping file, you can assign a response
using its unique ID text. The syntax for this is:
variable_name = $_(unique_ID_text)$
For example:
q1 = $_(Once a week)$
• Assign the value of another data-mapped variable.
You can assign the value(s) of one data-mapped variable to another data-mapped variable
simply by specifying:
target_variable = source_variable
For example:
aware_copy = q23
If either the source variable or the target variable holds a numerical value, then the numerical
value of the source variable is copied to the target variable. If this is not the case, then the
categorical responses are copied.
Categorical responses are transferred by matching the text. This means that the target variable
may not contain the same positional value for a response text as the source. For example, if the
variable drank_most_recently contained the response texts:
$Coke$
$Pepsi$
then here the response text $Coke$ would be referenced as response number 1. If the variable
drank_at_all contained the response texts:
$Pepsi$
$Coke$
then here, the response text $Coke$ could be referenced as response number 2. If a respondent
had $Coke$ as the answer to drank_most_recently, then the statement:
drank_at_all = drank_most_recently
would result in both variables having the value $Coke$. This would, however, be response
number 1 in drank_most_recently, but response number 2 in drank_at_all.
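The effect of matching by text rather than by position can be seen in a small model. This sketch is in Python, not Quantum syntax. It uses the two response lists from the example above; the function response_numbers is a hypothetical helper.

```python
# Response texts in the order given in the example above.
drank_most_recently_texts = ["Coke", "Pepsi"]
drank_at_all_texts = ["Pepsi", "Coke"]


def response_numbers(selected_texts, response_texts):
    """Return the 1-based positions of the selected texts in a response list."""
    return sorted(response_texts.index(text) + 1 for text in selected_texts)


# The respondent answered $Coke$; the copy carries the text, not the position.
answer = {"Coke"}

print(response_numbers(answer, drank_most_recently_texts))  # [1]
print(response_numbers(answer, drank_at_all_texts))         # [2]
```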
• Collecting the logical OR of the responses from several data-mapped variables.
If two or more variables contain categorical responses, then you use the OR function to assign
the combination of all of the responses as follows:
For example:
Responses are transferred over using the response text as described above. Bear in mind that
the following special responses are exclusive and can only appear in the absence of all other
responses:
$_ref$
$_dk$
$_null$
$_na$
If an assignment results in a combination of any of these exclusive codes and one or more other
responses, Quantum removes the exclusive special responses from the target variable. If an
assignment results in more than one exclusive special response and no other responses,
Quantum removes all but one of the exclusive special responses using a defined order of
precedence. The order of precedence is $_ref$, $_dk$, $_null$, $_na$. So, if an assignment
results in $_null$ and $_dk$, Quantum removes the $_null$ response and leaves the $_dk$.
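The two clean-up rules for exclusive special responses can be written out as a short procedure. This is a minimal sketch in Python (not Quantum syntax) that follows the rules exactly as described above; the function name resolve_exclusive is our own, and the response texts are shown without their $ delimiters.

```python
# Exclusive special responses, in the order of precedence given in the text.
EXCLUSIVE_PRECEDENCE = ["_ref", "_dk", "_null", "_na"]


def resolve_exclusive(responses):
    """Apply the exclusive-response rules to the result of an assignment."""
    responses = set(responses)
    ordinary = responses - set(EXCLUSIVE_PRECEDENCE)
    if ordinary:
        # Rule 1: any other response removes all exclusive special responses.
        return ordinary
    # Rule 2: with only exclusive responses, keep the one of highest precedence.
    for special in EXCLUSIVE_PRECEDENCE:
        if special in responses:
            return {special}
    return set()


print(resolve_exclusive({"_null", "_dk"}))  # {'_dk'}
print(resolve_exclusive({"Coke", "_dk"}))   # {'Coke'}
```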
• Collecting the logical AND of the responses from several data-mapped variables.
In a similar way to the OR function, you can use the AND function to collect only responses that
appear on every one of the specified list of variables. You can assign the result to a variable as
follows:
For example:
As with the OR function, exclusive special responses are unset if they are not valid.
• Collecting the logical XOR (exclusive OR) of the responses from several variables.
Again, similar to the OR function, you can use XOR to collect responses that appear on only one
variable of a specified list of variables. This means that if a response is not mentioned on any
of the variables, it is not collected. In addition, if the same response is mentioned on two or
more of the variables, that response will also not be collected. You can assign the result to a
variable as follows:
For example:
As with the OR function, exclusive special responses are unset if they are not valid.
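The difference between the three functions is easiest to see side by side. The following Python sketch models each variable as a set of response texts; the function names combine_or, combine_and and combine_xor and the sample responses are hypothetical, and the sketch leaves out the handling of exclusive special responses.

```python
from collections import Counter


def combine_or(*variables):
    """Responses mentioned on at least one of the variables."""
    return set().union(*variables)


def combine_and(*variables):
    """Responses mentioned on every one of the variables (at least one variable required)."""
    return set.intersection(*map(set, variables))


def combine_xor(*variables):
    """Responses mentioned on exactly one of the variables."""
    counts = Counter(text for responses in variables for text in set(responses))
    return {text for text, n in counts.items() if n == 1}


q25 = {"Fridge", "Toaster"}
q26 = {"Toaster", "Freezer"}
print(sorted(combine_or(q25, q26)))   # ['Freezer', 'Fridge', 'Toaster']
print(sorted(combine_and(q25, q26)))  # ['Toaster']
print(sorted(combine_xor(q25, q26)))  # ['Freezer', 'Fridge']
```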
Although data-mapped variables can hold either numerical or categorical data, they can, in fact,
have four possible states:
Regardless of the kind of data being stored, you can always test the numerical value of a data-
mapped variable by using the standard logical operators:
• If the data-mapped variable contains a numerical value, then the value of the variable is tested.
• If the data-mapped variable contains a single categorical response, then the value of the
variable is the response number (that is, the first response is counted as 1).
• If the data-mapped variable contains several categorical responses, the value of the variable is
zero.
• Finally, if the data-mapped variable is unset, then the value of the variable is zero.
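The four cases above can be written as one small function. This Python sketch is an illustration only; the function name numeric_value and the way a variable's contents are represented (None for unset, a number for a numerical value, a list of 1-based response numbers for categorical data) are assumptions.

```python
def numeric_value(value):
    """Return the value that a numerical test would see for a data-mapped variable."""
    if value is None:
        return 0            # unset variable
    if isinstance(value, (int, float)):
        return value        # numerical value is tested directly
    if len(value) == 1:
        return value[0]     # single categorical response: its response number
    return 0                # several categorical responses


print(numeric_value(42))      # 42
print(numeric_value([2]))     # 2
print(numeric_value([1, 3]))  # 0
print(numeric_value(None))    # 0
```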
You can test any data-mapped variable to see if it holds a specific categorical response text in the
following ways:
• To test if a data-mapped variable has only one categorical response, and it is the one specified,
you can use the = operator.
• To test if a data-mapped variable has one or more categorical responses stored, including the
one specified, you can use the & operator.
Testing for categorical response texts is achieved by specifying the variable name, the test operator
and the response text. The syntax is very similar to the way the presence of response punch codes
is tested when using standard data variables. However, it is not possible to test for the presence of
several response texts in a single test. The following examples show how you might check for
responses using standard data variables with punch codes, and using data-mapped variables with
response texts:
Standard data variable with a punch code: c123’1’
Data-mapped variable with a response text: q23$Yes$
In the same way as using standard variables, you can omit the test operator, in which case & is
assumed. You can also combine or negate tests using the logical operators .or., .and. and .not., and
adjust the order of evaluation using parentheses.
In addition to using just the response texts associated with a given variable, you can also:
• Use one of the special response texts described earlier (that is, one of $_base$, $_normal$,
$_dk$, $_ref$, $_other$, $_na$, $_null$, $_precode$, $_special$, $_answered$ and
$_possible$).
• Use the unique ID associated with a response using the syntax $_(unique_ID)$.
• If you have specified the unique ID on an element (using the uniqid= keyword), then you may
use the special response text $_uniqid$ as a shorthand for $_(unique_ID)$.
• You may specify only as much of the response text as is needed to uniquely identify it. When
doing so, you must append a \ character to the text. For example, $Very impressed$ may
become $Very\$ and $Quite impressed$ may become $Quite\$ (or, in fact, just $V\$ and
$Q\$ if these strings are unique).
• You may also specify response texts (or the unique ID) using uppercase characters, lowercase
characters, or any combination of the two.
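The abbreviation and case rules in the last two points can be modeled as follows. This Python sketch is only an illustration of the matching behavior described above; the function names matches and find_response are hypothetical, and raising an error for an ambiguous abbreviation is an assumption about how a non-unique text might be treated.

```python
def matches(spec, response_text):
    """Compare a specified text with a full response text, ignoring case.

    A trailing backslash means that only the leading characters were given.
    """
    spec = spec.lower()
    text = response_text.lower()
    if spec.endswith("\\"):
        return text.startswith(spec[:-1])
    return spec == text


def find_response(spec, response_texts):
    """Return the single response text identified by spec."""
    hits = [t for t in response_texts if matches(spec, t)]
    if len(hits) != 1:
        raise ValueError(f"{spec!r} identifies {len(hits)} responses, expected 1")
    return hits[0]


texts = ["Very impressed", "Quite impressed", "Not impressed"]
print(find_response("Very\\", texts))  # Very impressed
print(find_response("q\\", texts))     # Quite impressed
```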
The main use for data-mapped variables in analysis specifications is to define conditions for tables
and table elements. You can do this in the following ways:
As already discussed, you can test the values of data-mapped variables as part of logical
expressions. This means that you can use the standard c= keyword on tab and l statements to
create analysis conditions. For example, you might create the following axis:
l q23
ttlQ.23 Which of the following have you bought in the last week?
n10Total
n03
n01Fridge; c=q23 $Fridge$
n01Freezer; c=q23 $Freezer$
n01Microwave oven; c=q23 $Microwave\$
n01Coffee maker; c=q23 $Filter coffee maker$
n01Toaster; c=q23 $Toaster$
n01None of these; c=q23 $None\$
n01Don’t know/Not answered; c=-
Just as col and val statements make it a lot easier to write simple specifications using standard
data variables, the var statement provides the same shortcuts for data-mapped variables. For
example, you could write the above axis as follows:
l q23
ttlQ.23 Which of the following have you bought in the last week?
var q23;base=Total;hd;Fridge;Freezer;Microwave oven;
+Coffee maker=$filter coffee\$;Toaster;
+None of these;Don’t know/Not answered=rej
By default, the var statement uses the element text as the response text to create the condition
required. If this is not correct (as with the Coffee maker element), you can specify the required
response text using the = operator.
When analyzing data-mapped variables that contain numerical values, you can use either the
var or val statements. For example, the following two statements are equivalent:
Whether analyzing numerical or categorical data, var has an added advantage over col and val
equivalents in that you can combine several variables on one statement. This ability extends
the power of the var statement so that it becomes an equivalent to the fld statement too. To
combine two or more variables, simply place a comma-separated list of variables where you
would normally specify the single variable. For example, if the variable q25 held information
about the first appliance purchased and the variable q26 on the second purchased, you could
use the var statement to combine them as follows:
l q25_26
var q25,q26;base=Total;hd=Appliances purchased;hd=---------------
+Fridge;Freezer;Microwave oven;
+Coffee maker=$filter coffee\$;Toaster;
+None of these;Don’t know/Not answered=rej
Often, lists of items such as these are specified in the data using a code number. Therefore, if,
instead of the actual names, a numerical code is given to each type of appliance and the codes
are assigned to q25 and q26, the above example can be written as:
l q25_26
var q25,q26;base=Total;hd=Appliances purchased;hd=--------------
+Fridge=134;Freezer=135;Microwave oven=102;
+Coffee maker=117;Toaster=203;
+None of these=0;Don’t know/Not answered=rej
You can substitute variable names, response texts or array element names by using the text
substitution keywords available on *include and *def statements in exactly the same way as you
use them for any other purpose in Quantum. For grid axes, however, the following two new
keywords are available to facilitate the substitution of variable names and response texts:
If you wish to use a different data-mapped variable for each column in a grid axis, you can use
the var(#)= keyword. In the grid column specification, this keyword is used to specify the
name of the variable to be substituted. In the grid side specifications, use var# in place of the
variable name (where # can be any number in the range 1 to 9).
For example:
l gridax
n01First purchase; var(1)=q25
n01Second purchase; var(1)=q26
side
var var1;base=Total;hd=Appliances purchased;hd=-----------------
+Fridge;Freezer;Microwave oven;
+Coffee maker=$filter coffee\$;Toaster;
+None of these;Don’t know/Not answered=rej
n01Fridge or Freezer;c= var1$Fridge$ .or. var1$Freezer$
If you are using a data-mapped variable array, then you must follow the var(#)= keyword by a
specific array element. For example:
l rates
n01Rating for Wilsons Wonderful Widgets;var(1)=wrate($wilsons\$)
n01Rating for Widgets R Us;var(1)=wrate($widgets\$)
side
var var1;=;1;2;3;4;5;Don’t know/No Answer=rej
Where a grid axis requires different response texts to be used for each column in the grid, then
the resp(#)= keyword must be used to specify the response texts on the column specifications.
The value you assign must be the full specification, including the quoting dollar characters. In
the side specification, use the special response text $resp(#)$.
l rates
n01Fridge;resp(1)=$fridge$
n01Freezer;resp(1)=$freezer$
n01Microwave oven;resp(1)=$microwave\$
n01Coffee maker;resp(1)=$filter coffee\$
n01Toaster;resp(1)=$toaster$
side
n01First purchase;c=q25$resp(1)$
n01Second purchase;c=q26$resp(1)$
n01Any purchase;c=or(q25,q26)$resp(1)$
Earlier, it was mentioned that you could use the AND, OR and XOR functions to combine the
values of several categorical data-mapped variables. As well as assigning the result to another
variable, Quantum also allows the value to be tested directly. Using the & and = operators, you
can test for response texts in the result of these functions. For example:
This is a useful feature. However, if two or more tests are to be performed on the result of the
same operation, it may be better to assign the result to a new variable and test that. This saves
you repeating the AND, OR or XOR operation many times.
Sometimes, you need to know the number of responses to a categorical question (for example,
to calculate the average number of mentions). You can use the numb function to count the
number of precoded categories that are set in one or more data-mapped variables. Similarly to
AND, OR and XOR, the numb function is given a comma-separated list of data-mapped variables
to act upon. The function returns a count of the number of precoded responses set on all of the
given variables. This can be used in the usual way.
For example:
t1 = numb(q25, q26)
n01Average number of purchases;inc=numb(q23)
flt ;c=numb(q23).gt.1
numb counts only precoded responses. This means that it will include all user-defined
responses and also the $_other$ response; it does not include the $_dk$, $_ref$, $_na$ or
$_null$ responses.
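The counting rule can be modeled in a few lines of Python. This is a sketch only: each variable is represented as a set of response texts, the sample responses are hypothetical, and the sketch assumes that a response set on two variables is counted once for each variable, which the description above does not state explicitly.

```python
# Special responses that the count leaves out.
NOT_COUNTED = {"_dk", "_ref", "_na", "_null"}


def numb(*variables):
    """Count the precoded responses set on the given variables."""
    return sum(
        1
        for responses in variables
        for text in responses
        if text not in NOT_COUNTED
    )


q25 = {"Fridge", "_other"}
q26 = {"_dk"}
print(numb(q25, q26))  # 2
```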
Quick Reference
To create a data-mapped Quantum spec from an existing qdi file, at the command line, type:
Using an existing Quancept qdi file, you can automatically generate a data-mapped Quantum
specification. The qdi file contains details of all the variables defined, their possible values (that is,
the possible responses), and where in the data records the information for particular variables is
located. The generated Quantum run file includes the statement that names the qdi file, which
Quantum then refers to for information during the run. You may need to adjust the generated
specifications manually, but in general, this automatic creation can save a great deal of time
(especially for new spec writers) and reduce the likelihood of errors.
The main body of the Quantum specification is generated using the qdiaxes program. This program
reads the qdi file and creates the following files:
• A run file containing a struct statement, a *usemap statement for the specified qdi file, *include
statements for the tab and axes files, and a dummy breakdown axis.
• A table specification file containing a tab statement line for each data item in the qdi file.
• An axes file containing the axis specification for each data item in the qdi file.
✎ The Quancept utility, qditum, can also generate a basic Quantum specification from a qdi file.
However the specification that it creates does not use the data-mapping feature.
☞ For information about qditum, see the Quancept Utilities Manual.
To create a Quantum spec file from an existing Quancept qdi file, type:
where:
–a This option causes qdiaxes to remove all text strings of the form < ... > from
question texts. This is useful for Quancept Web projects where the text may
contain embedded HTML directives.
Note that text-formatting codes resulting from a Quancept CAPI script are
always removed since they are meaningless to Quantum.
input_qdi_file The name of the qdi file — with or without the qdi suffix.
output_filename The base name for the output files. qdiaxes appends the relevant suffix to each
Quantum output file.
reads the qdi input file holidays.qdi and generates the corresponding Quantum files (that is,
holidays.run, holidays.tab and holidays.axs). Since the –t parameter is not specified, the response
texts (written to the holidays.axs file) are truncated to 12 characters.
Quancept CAPI: Because Quancept CAPI runs in a graphical environment, more text formatting
control is provided on the interviewing screen than is possible in character-based Quancept. These
text formatting options are defined in the script by a formatting code enclosed in angle brackets.
For example, to switch bold on and off, the following codes are used:
These formatting codes serve no purpose in the generated Quantum texts and so are always
removed by the qdiaxes program.
Quancept Web: In addition to the formatting control provided with Quancept CAPI, the Quancept
Web product allows the scriptwriter to embed HTML directives into texts. Such directives are
usually enclosed in angle brackets and can optionally be removed from the qdiaxes output by using
the –a option.
☞ For details on all the text-formatting options available with Quancept CAPI and Quancept
Web, see the documentation relating to these products.
To reduce the size of specifications, Quantum allows you to truncate the response text specification
to the least number of characters that will uniquely identify the response in question. qdiaxes uses
this feature in generated specifications but allows you to define the minimum number of characters.
If the –t option was set to a number greater than the longest response text, say 50, then the
conditions generated would read:
If, however, the text was truncated to the minimum number of unique characters, in this case 3, then
the conditions would read:
✎ The \ character informs Quantum to ignore the remaining characters in the string.
But, by applying the default truncation length of 12, they would then read:
You may prefer the texts to be shorter or longer. Either way, the –t option on the command line can
accommodate your preference.
✎ Note that response texts are never reduced below a minimum threshold; that is, either the limit
set by the –t option, or a default of 12.
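The interaction between uniqueness and the minimum length can be sketched in Python. This is not the qdiaxes algorithm, which is not shown in this guide; it is a model that assumes the text is cut at the first length, starting from the minimum, at which it is unique among the other response texts. The function name abbreviate and the sample texts are hypothetical.

```python
def abbreviate(text, all_texts, min_len=12):
    """Shorten a response text to a unique prefix of at least min_len characters.

    A shortened text ends with a backslash. Texts that are no longer than
    min_len, or that cannot be made unique, are returned in full.
    """
    others = [t.lower() for t in all_texts if t.lower() != text.lower()]
    for n in range(min_len, len(text)):
        prefix = text[:n].lower()
        if not any(other.startswith(prefix) for other in others):
            return text[:n] + "\\"
    return text


texts = ["Filter coffee maker", "Filter jug", "Toaster"]
print(abbreviate("Filter coffee maker", texts, min_len=3))   # Filter c\
print(abbreviate("Filter coffee maker", texts))              # Filter coffe\
print(abbreviate("Filter coffee maker", texts, min_len=50))  # Filter coffee maker
```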
qdiaxes generates three types of file: a run file, a tab file and an axs file. Each file is given the
same base name and qdiaxes adds the appropriate suffix to each.
The generated Quantum filename.run file contains the following seven lines:
struct;read=2;ser=c(1,5);crd=c(6,7)
*usemap input_qdi_file
*include output_filename.tab
*include output_filename.axs
l XXXXXX
n10All Respondents
The *usemap statement instructs Quantum to refer to the qdi file for all the relevant variable
information. The two *include statements tell Quantum to read in the contents of the generated tab
and axis files.
The last two statements form the dummy breakdown by which all axes are tabbed. This enables top
lines to be produced immediately.
These statements act as a template for your specification; you can add to, delete and amend the
statements accordingly.
The filename.tab file contains a tab statement for each data item in the qdi file being processed.
There are two types of tab statements generated.
Apart from grid axes produced for data items with more than one iteration, all axes are tabbed by
the dummy breakdown ‘XXXXXX’, that is:
These can easily be stream-edited to use any standard analysis breakdown which you may wish to
define. Data items with multiple iterations produce grid axes which are tabbed as usual, that is:
Where item_name is the name of a data item in the qdi file and XXXXXX or GRID is a dummy name
against which to tabulate the first axis.
You will, no doubt, need to make changes to these statements in the axes file.
The axes file, filename.axs, holds the Quantum axis names and specifications. Axes are generated
in the following format:
— Each axis is named using the same name as the data item in the qdi file.
— A left-justified table title (ttl) is generated containing the name of the axis.
— The required number of ttl statements for each line of the item text — up to 80 characters
per line. Longer texts are wrapped onto multiple ttls.
— For each axis, the following three element statements are generated:
n01All Respondents
n00 ;c=.not.(item_name=$_na$)
n01All Answering
n01iteration_text;var(1)=item_name($iteration_name$)
— Following this specification, a side statement is generated. (side is used to separate the
column definitions from the row definitions.)
— Categoric items will produce an n01 element for each category as follows:
Any category that has a unique ID associated with it will also have the appropriate
Quantum uniqid= keyword generated. For example:
3. Read and process the data using the program created at step 2.
You can either run all stages automatically one after the other, or you can run a specific stage in
isolation.
Your computer may have more than one version of Quantum available, for example, a standard
version for client tables and a newer version for in-house testing. To indicate which version you
wish to use, you must assign the pathname of that version to an environment variable called
QTHOME and then add the Quantum bin directory to your path.
On Unix systems, you define QTHOME in your login file with a setenv statement. For example:
Under DOS, the version to use is set at the time the software is installed. For information on how to
switch between versions, see your installation instructions.
quantum This version silently deletes all temporary files created during a run unless you
include the option –k on the command line.
quantumx This version does not delete temporary files.
In the examples of commands in the rest of this section, we will use the word quantum to mean
quantum or quantumx.
At installations where automatic deletion of temporary files is not desirable, you may find that the
administrator has renamed the files so that quantumx is called quantum, and vice versa. You should
check this before you run your first job.
☞ For further details on file deletion, see section 3.1, ‘Tidying up after a Quantum run’, in the
Quantum User’s Guide Volume 4.
Quick Reference
To run a complete Quantum job, type:
If you omit the program and/or data file names, Quantum will prompt you for them as it needs them.
If you omit the name of the tables output file, Quantum will save any tables in a file called tab_.
• Run only one section of the job such as the compilation stage or the table creation stage.
• Define a run ID when you want to do more than one run in the directory.
• Define the names of directories in which Quantum should look for program and data files or
create intermediate files.
• Convert the Quantum program and data files into a Quanvert database.
They are:
✎ The option to create a Quanvert database is only available if the Quanvert Database
Administration software is installed.
☞ For further information about creating a Quanvert database, see chapter 7, ‘Creating and
maintaining Quanvert databases’ in the Quantum User’s Guide Volume 4.
Quantum can deal with compressed data files whose file names end with a .Z suffix. If you have a
data file of this type, there is no need to specify the suffix on the command line. Quantum always
checks first for a file with the exact name you typed on the command line. If it cannot find this file,
it makes a second search for that file with a .Z suffix.
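The two-step search can be pictured with the following Python sketch. It models only the lookup order described above and does not decompress anything; the function name locate_data_file is hypothetical.

```python
import os


def locate_data_file(name):
    """Return the file to read: the exact name first, then the name with a .Z suffix."""
    for candidate in (name, name + ".Z"):
        if os.path.isfile(candidate):
            return candidate
    return None  # neither file exists
```

With this model, a request for a file called survey finds survey.Z only when no file called survey exists in the same directory.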
Quantum can also cope with files that start with records you wish to ignore, or in which records are
not terminated by a new-line character. We refer to these globally as non-standard data files. If
you have a file of this type, create a dummy data file and enter the name of that file on the quantum
command line.
☞ For further information about dummy data files see ‘Reading non-standard data files’ in
chapter 10, ‘Include and substitution’, in the Quantum User’s Guide Volume 2.
Quick Reference
To compile a Quantum program, type:
quantum –c [program_file]
The first step in any Quantum run is to check the syntax of your Quantum specification and to
convert it into C code. We call this compilation. You can run the compilation stage by itself by
typing the quantum command with the –c option:
quantum –c [program_file]
The compilation creates many files, the most important of which are:
out1 The program file listing made as the program is checked. If errors are found,
Quantum marks them in this file.
colmap A listing of all the columns and codes referred to by all non-ignored axes.
☞ For further information on the contents of these files, see chapter 2, ‘Files created by
Quantum’, in the Quantum User’s Guide Volume 4.
Quick Reference
To load the C code created by a compilation under Unix, type one of:
quantum –l data_file
After a successful compilation, Quantum converts the C code created by the Quantum compile into
a program and, if there are no problems, reads the data. We call this program the datapass program.
You can run this stage as a separate task on Unix systems by typing:
or:
quantum –l data_file
This stage also creates a number of files, most of which are normally deleted at the end of the run.
The file you need to know about is:
Quick Reference
To read the data file after a previous compile and load, type:
quantum –r data_file
The datapass program reads and processes data according to the definitions in your Quantum
program file. Normally, this happens as an automatic extension of the load phase, but if you have
corrected errors in the data or added more data to the data file, you may rerun the datapass without
recompiling and reloading your program file. To do this, type:
quantum –r data_file
The datapass reads and processes each record separately. If you requested that data should be
separated into clean and dirty data files, or that it should be written out to another file, Quantum
will do so during this stage. Any holecounts or frequency distributions are also created now.
Finally, Quantum sets flags indicating the cells and tables in which each record is to be included.
Quick Reference
The weight, accumulation and manipulation programs cannot be run separately.
The weighting program, weight, weights records according to the figures given in your Quantum
program file. If the run has no weighting, the weighting program is ignored.
The accumulation program, accum, builds a file containing the cell values for each table.
If your job uses row or table manipulation, Quantum runs a program called manip. This carries out
your manipulation requests and creates a second file of cell values. Note that this file contains
values for all tables whether or not they are the result of manipulation.
You can run the weighting, accumulation and manipulation stages only as part of a complete
Quantum run.
Quantum creates the following files, amongst others, during these stages:
Quick Reference
To create tables, type:
quantum –o [program_file]
The final step in most runs is to take the cell values and use them to create tables. Quantum reads
the page and table headings and positions them as requested. If tables are to be sorted, added or
placed side by side, the relevant figures are rearranged or combined.
To change the table layout without changing the cell counts (for example, to print more decimal
places for percentages, or to use special characters for absolute zero or rounding) you may rerun
just the compilation and output stages using the command:
quantum –o [program_file]
Files created during this phase which you should know about are:
If you want to rerun a single table only, you may run the Quantum output program by name rather
than via the Quantum shell script. Type:
where tab_file is the name of the file to which the table will be written and table_num is the number
of the table you wish to reprint. For example, to rerun table 10 and save it in the file tab_10 you
would type:
qout -o tab_10 -t 10
Quick Reference
To create a log file under Unix, type:
To run the job in the background, append & to the end of the command line.
The notes in this section do not apply to DOS Quantum since these facilities are not available on
that platform.
Quantum normally runs interactively. With large jobs, this can lock up your terminal for a
considerable time, so you may wish to use facilities provided with your operating system to run
your jobs in the background. This then frees up your terminal for other uses.
When you run jobs in the background, they still write messages to your screen unless you redirect
them to a log file. On the systems that need it, Quantum provides the –l option for this purpose. It
writes any messages that would normally appear on your screen into a file called log instead. You
use it on the quantum command in addition to any other options required for the job. For example,
to run a complete job in the background under Unix, you might type:
✎ On some systems your system manager may prefer you to run large jobs via the batch system.
Quick Reference
To run more than one job in a directory, assign a unique suffix to each run by typing:
You may run more than one job in a directory without overwriting existing files by assigning a
unique suffix to each run. All files created during this run will have names which end with a dot
and the given string. For example:
File names that already contain a dot will not have a suffix appended. If your run creates clean and
dirty data files, these will retain their original names, clean.q and dirty.q.
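The naming rule can be expressed as a short Python sketch. It is an illustration only; the function name run_file_name and the run ID wave2 are hypothetical, while out1, colmap and clean.q are file names mentioned in this chapter.

```python
def run_file_name(name, run_id):
    """Return the name a file would have in a run that uses the given suffix."""
    if "." in name:
        return name               # names that already contain a dot are left alone
    return f"{name}.{run_id}"     # otherwise append a dot and the run ID


print(run_file_name("out1", "wave2"))     # out1.wave2
print(run_file_name("colmap", "wave2"))   # colmap.wave2
print(run_file_name("clean.q", "wave2"))  # clean.q
```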
We advise you to avoid a suffix of Z since this is the suffix assigned to compressed files and it may
lead to confusion if compressed files also exist.
Quick Reference
To create intermediate files in a directory other than the project directory, type:
Quantum can create its temporary work files in a directory other than that in which the job is
running. The directory is named using the option –td on the command line:
This example tells Quantum to create temporary files in a subdirectory called temp in the project
directory.
Creating temporary files in a different directory is one way of improving the performance of large
jobs running under DOS. When the number of files associated with a job rises above 500, you’ll find
that the job runs more quickly if the temporary files are created in a different directory. You’ll also
find it more convenient to scan directories’ contents when the number of files in each one is
reduced.
You may also find that using –td when creating a Quanvert database helps to keep the project
directory clean of unwanted files. It is also useful if you need to do multiple Quantum runs to create
the database. As long as you use a different temporary directory for each run, you can then combine
the directories with qvmerge to create the Quanvert database.
✎ The option to create a Quanvert database is only available if the Quanvert Database
Administration software is installed.
☞ For further information about creating a Quanvert database, see chapter 7, ‘Creating and
maintaining Quanvert databases’ in the Quantum User’s Guide Volume 4.
Quick Reference
To read program, data or include files from a directory other than the one in which the program is
being run, and to create permanent files such as report files in that same directory, type:
Quantum normally reads its program, data and include files from the directory in which you are
running the program, and creates permanent output files such as print or report files in that
directory. If you want to use a different directory, define it on the command line with the option
-pd. An example using Unix pathname notation is:
The exceptions are filedef and include with absolute pathnames. In these cases Quantum uses the
directory named in the pathname.
This index covers all four volumes of the Quantum User’s Guide. The page references consist of the volume
number followed by the page number; for example 2-6 is page 6 of Volume 2, 3-166 is page 166 of Volume 3,
and so on.
Quantum User’s Guide Volume 1
ed, start of edit section 1-8
  with levels 3-50
edheap=, limit for edit statement 4-9
Edit, processing missing values 1-172
Editing
  axis coding requirements 2-25
  in tabulation section 3-124
  interactive correction of errors 1-160
  with levels 3-50
effbase, effective base 2-119, 2-153, 3-147, 3-149
Effective base 2-119, 2-153, 3-147, 3-149
Element texts
  define breakpoints in 2-163
  printing | and ! in 3-202
Elements
  all zero, ignoring 2-116
  assign to subgroups 2-79, 2-114
  base 2-56, 2-57
    non-printing 2-57
  basic counts 2-50
    non-printing 2-56
    required for statistics 3-71
  blank lines 2-58
  cases already counted 2-48
  cases not yet counted 2-48
  conditions on 2-46, 2-52
  count creating 2-49
  distribution of records between 2-129
  excluding from totals 2-116
  extra text 2-58
  ignore in column axes 2-115
  ignore in higher dimensions 2-115
  ignore in row axes 2-116
  indent text when split 2-115
  intermediate figures for special T stats 3-157
  maximum values of inc= 2-124
  minimum values of inc= 2-124
  number per create in Quanvert Text 4-83
  percentage differences 2-124
  print all-zero 2-45
  rejecting one from another 2-125
  reprint at top of continued tables 2-109
  responses with numeric codes 2-94, 2-97
  selecting for special T stats 3-145
  set maximum per run 4-9
  simplifying complex conditions 2-52
  splitting long texts 2-51
  subheadings 2-62
  sum of suppressed 2-118
  suppress all-zero 2-15, 2-44
  suppressed, accumulating in tables of nets 2-72
  text continuation 2-66
  types of 2-45
  underlining text on 2-119
  unsorted, in sorted table 2-116
  weight factors for 3-15
  weighted target for 3-14
elms=, elements for special t-tests 3-155
elms=, maximum number of elements per axis 4-9
else, conditional actions 1-117
emit, insert codes in columns 1-102
#end, finish edit in tab section 3-124
#endc, end C code 1-183, 3-123
End of data file, checking for 1-52
end, end of edit section 1-8
endlevel, edit at end of level 3-51
endnet, end a net 2-67, 2-113
#endpostscript, end PostScript code 3-213
endsort, end secondary level sorting 2-113, 3-134
  terminating more than one level 3-135
Environment variables
  QTAXES 4-10
  QTEDHEAP 4-10
  QTELMS 4-10
  QTFORM 3-201
  QTHEAP 4-10
  QTHOME 1-223
  QTINCHEAP 4-10
  QTINCS 4-10
  QTINLISTHEAP 4-10
  QTLEXCHARS 4-10
  QTMANIPHEAP 4-10
  QTNAMEVARS 4-10
  QTNOPAGE 4-23
  QTNOWARN 4-11
  QTSPSSRC 4-55
  QTTEXTDEFS 4-10
.eq., logical equality 1-30
Error messages
  accum stage 297, 4-163
  C compilation stage 296, 4-162
  compilation stage 271, 4-137
  datapass stage 297, 4-163
  include files 2-226
  percentiles 2-151
  printing on the screen 1-11
Error variance of the mean 2-136
  formula 2-157
  in weighted jobs 2-143
  suppress if has small base 2-20
  suppress if small base 2-196
Errors, correcting 1-5, 1-10, 1-170
errprint, print error messages on the screen 1-11
ex, table manipulation 3-34
ex=, manipulation expression 2-119, 3-26, 3-32
  secure databases 2-45, 2-118, 4-116, 4-118
Examining records
  count 1-133
  list 1-138
  online edit 1-160
  qfprnt 1-84
  report 1-70
  require 1-145
  write 1-65
Levels file 4-2
lexchars=, increase limit for text strings 4-10
License expiry warning 4-11
Limits
  increasing 4-9
  list of 2-265, 4-131
  numbers 1-16
linesaft, blank lines after column headings 2-14, 2-162
linesbef, blank lines before column headings 2-14, 2-162
list, create frequency distribution 1-139
lista, alphabetic frequency distribution 1-139
listr, ranked frequency distribution 1-139
Lists
  alphabetic 1-139
  creating 1-139
  named 1-44
  preventing use of in Quanvert Text 4-85
  ranked 1-139
Local variables 1-199
  with subroutines 1-186
Location, test for in matched samples 3-85
Log files 1-230
Logical expressions
  arithmetic value of field 1-38
  checking equivalence of 1-154
  combining 1-39
  comparing data variables 1-31
  comparing values 1-30
  comparing variables to a list 1-42
  data-mapped variables 1-205, 1-210
  negating 1-40
  range 1-38
  validating 1-153
  with c= 2-119
  with if 1-115
Logos, printing on tables 3-209
Long texts, splitting 2-51
Look-up files 1-178
  list used/unused keys 1-180
Loops
  function of 1-119
  nesting 1-123
  with routing 1-124
Lotus-123, convert Quantum data for use with 4-32
lsd, least significant difference test 3-175
lst_, frequency distribution file 1-228, 4-17
.lt., less than 1-30

M

m, create a manipulated row 3-25
  define manipulation expression 3-26
  options on 3-25
machine.def, qvpack/qvtrans alias file 4-128
manipclean, delete all except manipulation files 4-25
manipheap=, limit for element manipulation 4-9
Manipulated cell counts file 3-38, 4-22
Manipulated elements, in sorted tables 3-141
Manipulation
  apply spechar and nz options to manipulated elements 2-14, 3-31
  averages 3-33
  example of 3-40, 3-42
  expressions with 3-41
  manipulated cell counts file 3-38
  more than one table 3-39
  on n statements 3-32
  parts of tables 3-41
  program 1-229
  replacing numbers in tables 3-35
  row, example of 3-30
  run definitions file 3-38, 4-3
  run ids for 3-38
  tables from dummy data 3-43
  tables from other runs 3-38
  using automatic table ids 3-36
  using element ids 3-28
  using overall position 3-37
  using previously manipulated figures 3-38
  using relative position 3-29, 3-37
  using row texts 3-27
  using your own ids 3-36
  whole tables 3-34
manipz, apply spechar and nz options to manipulated elements 2-14, 3-31
mapvar, define data-mapped variable 1-203
Matched samples, testing difference in location 3-85
max, maximum manipulation operator 3-26
max=, highest card type 1-57, 1-198, 3-47, 4-2
maxim, maximum values of inc= 2-28, 2-124
  example of use 2-37
maxima.qt, limits file 4-10
maxsub=, maximum sub-records per record in levels data 3-48, 4-2
maxwt=, maximum weight 3-8
mcnemar, McNemar’s test for differences 3-83
McNemar’s test for differences 3-83
  formula 3-91
mean, t-test on column means 3-164
Means
  analysis levels with 3-61
  decimal places with 2-139
  error variance 2-136
  formula 2-156
  least significant difference test 3-175
  print maximum values of 2-37
  print minimum values of 2-37
  produced by list 1-139
  sorted table of 3-141
  standard deviation 2-136
  standard error 2-136
  suppress if have small base 2-20, 2-196
Proportions (continued)
  test of differences
    between overlapping samples 3-99
    between subsamples 3-97
  t-test on column 3-160
  two sample test of difference 3-95
pstab, create PostScript tables 3-198
ptf, translation file 2-176, 4-23, 4-77
Punch codes, ASCII equivalents 4-175
punch()=, symbolic parameters for codes 2-232
punchout.q, records written out by require 1-228, 4-18
pvals, print P-values for special T stats 3-159
P-values
  Newman-Keuls test 3-165
  paired preference test 3-173
  significant net difference test 3-169
  t-test on column means 3-164
  t-test on column proportions 3-163

Q

q2cda, Quantum tables to CDA 2-82, 4-32
  column headings 2-169
  options with 4-35
qdi files 1-201, 1-217
qdiaxes, generate Quantum spec 1-217
qextras.lst file for Quanvert (Windows) 4-91
qfprnt, write out data in user-defined format 1-84
qnaire.txt file for Quanvert (Windows) 4-91
qotext.dat 4-82
qout, output program 1-230
qqhct, holecount file 4-17
qsj, split or join databases 4-125, 4-127
QTAXES, maximum number of axes per run 4-10
qteclean, delete files created by edit-only run 4-25
QTEDHEAP, to adjust edit statement complexity 4-10
QTELMS, max number of elements per axis 4-10
qtext, convert Quantum data to text format 4-167
QTFORM define special characters for laser printing 3-201
QTHEAP, max number of characters per axis 4-10
QTHOME, Quantum home directory 1-223
QTINCHEAP, max number of characters for inc= variables 4-10
QTINCS, maximum different inc= per run 4-10
QTINLISTHEAP, adjust definelist complexity 4-10
qtlclean, delete temporary compilation files 4-25
QTLEXCHARS, max size of long text strings 4-10
qtm_ex_, datapass program 1-227
QTMANIPHEAP, max size of expressions 4-10
QTNAMEVARS, max num of named variables 4-10
QTNOPAGE, suppress blank page 4-23
QTNOWARN, suppress license expiry warning 4-11
qtoclean, delete files created by quantum -o 4-25
qtsas, convert Quantum data & spec to SAS 4-56
qtspss, convert Quantum data & spec to SPSS 4-38
  how differs from nqtspss 4-44
QTSPSSRC, nqtspss options 4-55
QTTEXTDEFS, max num of text symbolic params 4-10
Quancept 1-201, 1-205, 1-217, 1-218
Quantum program
  components of 1-3
  format of 1-8
  modify for Quanvert 4-68
  options with 1-224
  storing 1-3
  which version to use 1-223
Quanvert 4-67
  add with 4-71
  allow creation of new axes 4-96
  allow use of special T statistics 4-76
  alpha variables 4-73, 4-74
  axis titles 4-68
  create database 4-93
  create uniq_id variable 4-121
  defining axes 4-68
  effective base elements 3-149
  export grids to SAS and SPSS 2-40, 2-249
  files 4-94
  files which must be present 4-96
  filters 4-71
  levels cross-reference files 4-94, 4-95
  levels data 4-72, 4-73
  missing values 4-74
  n25 with 4-76
  naming weighting matrices 4-71
  norow/nocol/nohigh with 4-75
  numeric variables 2-123, 3-60, 4-70
  page width suggestions 4-71
  prepare weighted databases 4-71
  prevent access to weighted/unweighted data 4-116
  process with 4-75
  reduce disk space for database 4-75
  respondent serial numbers 4-71
  secure databases 2-45, 2-118
  special T statistics 4-76
  temporary directories 1-232
  text at bottom of tables 4-71
  trailer cards with 4-72
  weighting matrices 3-8
Quanvert (Windows) 4-67
  database icon 4-90
  databases 4-86
  languages 4-77
  levels data 4-88
  news file 4-90
  notes file 4-90
  packing extra files 4-91
  percentiles 2-151
  questionnaire file 4-91
report, write data to report file 1-70
report=, report type for rim weighting 3-21
req=, required card types 1-56
require, validating codes and columns 1-144
  action codes 1-145
  actions when test fails 1-156
  automatic error correction 1-151
  checking codes in columns 1-148
  checking exclusive codes 1-150
  checking logical expressions 1-153
  checking routing 1-155
  checking type of coding 1-146
  comments with 1-147
  correcting errors from 1-160
  data output file for 4-18
  data validation 1-143
  defaults with 1-152
  equivalence of logical expressions 1-154
  file of records failing 4-17
  with if 1-157
Required card types 4-2
  defining 1-56, 3-46
Reserved variables
  allread 1-50
  card_count 1-52
  firstread 1-51, 3-64
  lastread 1-51, 3-65
  lastrec 1-52
  number of cards read so far 1-52
  number of records accepted 1-125
  number of records read so far 1-52
  number of records rejected 1-125
  printed_ 1-67
  rec_acc 1-125
  rec_count 1-52
  rec_rej 1-125
  record written to out 1-67
  rejected_ 1-125
  stop statement executed 1-127
  stopped_ 1-127
  this record rejected 1-125
  thisread 1-50
  with trailer cards 1-50
Reserved words with flip 4-70
Resetting variables between respondents 1-97
resp(#)=, substitution for data-mapped variables 1-215
Response, assign to data-mapped variable 1-209
return, go to tabulation section 1-126
  with levels 3-50
  with reject 1-126
rgrid, rotated grid tables 2-245
Rim weighting 3-3, 3-7, 3-19
  efficiency, formula 4-19, 4-21
  parameters file 4-5
  report for each iteration 3-21
  root mean square 3-4, 3-20, 4-20
  summary information for 4-19
rim, rim weighting 3-7
rinc, rows take precedence when paginating large tables 2-19, 2-107
Risk level for special T stats 3-156
rj, reject record in online edit 1-165
rm, delete cards in online edit 1-166
Root mean square 3-4, 3-20
  formula 4-20
Rotated grid tables 2-245
round, forced rounding to 100% 2-19, 2-32
Rounding to 100% 2-19, 2-32
Routing
  checking 1-155
  using go to 1-118
  with loops 1-124
Row manipulation 3-25
  expressions for 2-119
  ids for 2-115
Row offsets with added tables 2-184
Row percentages 2-16
  force to round to 100% 2-19
  suppress small 2-21
Row ranks in tables 2-16
row, row element 2-116, 2-140
Rows
  alignment of text in laser printed tables 3-203
  basic counts 2-83
  created with col 2-83
  indenting folded text 2-13
  reprint at top of continued tables 2-109, 2-114
  sorting 2-20, 3-126
  suppressing small 2-21
  text width 2-20
  text width in Quanvert Text 4-85
rpunch, set a random code into a column 1-107
rqd, default action code for require 1-146
rsort, sort rows 2-20, 3-125
rt, terminate online edit for current record 1-165
Run conditions, defining 2-8
Run defaults file see Default options file
Run definitions file 4-3
Run file, generate from qdi file 1-220
Run ids for table manipulation 3-38

S

s, assignment in online edit 1-163
s, side element for manipulation 3-41
Sample Quantum job 2-253
Sample tables
  cumulative percentages 2-34
  hitch/squeeze 2-191
  inc= 2-136
  indices 2-35
  means 2-36
  multidimensional tables 2-172
V W