Dr Asim Waris

Course Information

Grading

OHT 1 and 2 25%

Final 50%

Objectives

process and how it is a core component of

evidence-based medicine

Understand features, strengths and limitations of

descriptive, observational and experimental

studies

Distinguish between association and causation

Understand roles of chance, bias and

confounding in the evaluation of research

Text Books

Introduction to Statistics for Biomedical Engineers (pdf)

Applied Statistics and Probability for Engineers (pdf)

Statistical and medical ethics

Misuse of patients

1. Are the proposed procedures or diagnostic techniques safe?

2. Is it ethical to withhold the treatment under evaluation from some patients (namely, the controls)?

3. Is it ethical to bring certain persons into the trial?

4. Has informed consent been obtained from all patients?

5. Is it ethical to offer inducements to people to participate in a trial?

6. Is it ethical to use double-blind techniques?

7. Is it ethical for patients to be randomly allocated to the different treatment and control groups?

8. How far can one go with placebos and dummy treatments? Can placebo or sham surgery be justified?

9. Who should make the decision about the answers to these questions? The persons in charge of the investigation? All members of the investigation team? Clinical colleagues? A formal ethics committee of clinical colleagues? A formal ethics committee of non-medical people? A formal ethics committee of medical and non-medical people?

Statistical and medical ethics

Misuse of statistics

1. Why is it unethical to publish results that for statistical reasons are incorrect?

2. Why is it unethical to present results in a misleading way?

3. Why should professional statistical advice be sought at the beginning of an investigation?

Statistical and medical ethics

The World Medical Association (WMA) has developed the Declaration of Helsinki as a statement of ethical principles for medical research involving human subjects, including research on identifiable human material and data. The first version was adopted in 1964, and by the General Assembly in October 2008 it had been amended six times. Later revisions have followed (for example, in 2013), and only the current version is regarded as official.

The Scientific Method

Observation → Hypothesis (H) → Experiment → Results

Results: either the evidence supports H, or the evidence is inconsistent with H (in which case, revise H).

Variable and Data

The raw data of an investigation consist of observations made on individuals.

The number of individuals is called the sample size. Any aspect of an individual that is

measured, like blood pressure, or recorded, like age or sex, is called a variable.

Variable and Data

A first step in deciding how to analyse data is to classify the variables into their different types.

There are two major types of variable –

categorical variables and metric variables

Variables

A variable is any characteristic measured on each individual unit in a statistical study.

1. Categorical variable

2. Numerical variable

Discrete variable

Continuous variable

Variables

Quantitative (numerical) variables are variables that are measured in terms of numbers.

Some examples of quantitative variables are height, weight, and shoe size.

1. A discrete variable is a variable that has either a finite number of possible values or a countable number of possible values.

2. A continuous variable is a variable that has an infinite number of possible values that is not countable.

Variable

Qualitative variables are variables that express a qualitative attribute such as hair color, eye color, religion, favorite movie, gender, etc. They can also be referred to as categorical variables.

1) Nominal categorical variables: Consider the variable blood type. Let's assume for simplicity that there are only four different blood types: O, A, B, and AB. We can first determine the blood type of subjects and then allocate the result to one of the four blood type categories. The order is not important.

2) Ordinal categorical variables: The ordering of the categories is not arbitrary as it was with nominal variables. It is now possible to order the categories in a meaningful way, e.g. a stress score or the Glasgow Coma Scale.

It is very important to note that such scores and scales are not real numbers: they label ordered categories, and the gaps between successive scores need not be equal. So it is generally not appropriate to apply arithmetic operations, such as averaging, to ordinal data.
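The distinction between nominal and ordinal variables affects which summaries make sense. The following is a minimal Python sketch, not part of the original slides: the data values are invented for illustration, and the choice of the mode for nominal data and the median for ordinal data is one common convention rather than the only option.

```python
from collections import Counter
from statistics import median

# Hypothetical observations, invented purely for illustration.
blood_types = ["O", "A", "B", "AB", "O", "A", "O"]   # nominal: no natural order
gcs_scores = [15, 14, 9, 15, 3, 12, 15]              # ordinal: Glasgow Coma Scale totals

# Nominal data: count how often each category occurs and report the mode.
blood_type_counts = Counter(blood_types)
most_common_blood_type = blood_type_counts.most_common(1)[0][0]

# Ordinal data: the order is meaningful, so the median is a valid summary
# even though differences between scores are not true numerical distances.
median_gcs = median(gcs_scores)

print(blood_type_counts)
print("Most common blood type:", most_common_blood_type)
print("Median GCS score:", median_gcs)
```

The median only uses the ordering of the scores, which is why it is preferred here over the arithmetic mean.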

Examples

Determine whether the following quantitative variables are discrete or continuous.

(a) The number of light bulbs that burn out in a room of 10 light bulbs in the next year.

(b) The number of branches on a randomly selected oak tree.

(c) The length of time between calls to 911.

Dependent and Independent Variables

Variables are properties or characteristics of some

event, object, or person that can take on different

values or amounts (as opposed to constants such

as π that do not vary). When conducting research,

experimenters often manipulate variables.

For example, an experimenter might compare the

effectiveness of four types of antidepressants. In

this case, the variable is “type of antidepressant.”

In general, the independent variable is

manipulated by the experimenter and its effects

on the dependent variable are measured.

Example

Can blueberries slow down aging? A study indicates that antioxidants found in

blueberries may slow down the process of aging. In this study, 19-month-old

rats (equivalent to 60-year-old humans) were fed either their standard diet or a

diet supplemented by either blueberry, strawberry, or spinach powder. After

eight weeks, the rats were given memory and motor skills tests. Although all

supplemented rats showed improvement, those supplemented with blueberry

powder showed the most notable improvement.

What is the independent variable?

Dietary supplement: none, blueberry, strawberry, or spinach.

What are the dependent variables?

Performance on the memory test and the motor skills test.

Objectives of statistics

Categories of statistics

Descriptive statistics

Are concerned with the presentation, organization, summarization and description of data, for example through graphical representations and tables.

Inferential statistics

Allow us to generalize from our sample of data to a larger group of subjects.

They consist of estimation and hypothesis testing.
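To make the two categories concrete, here is a small Python sketch that is not from the original slides. The blood pressure values are invented, and the 1.96 multiplier is a large-sample normal approximation chosen for simplicity; treat it as an assumption rather than the recommended method for a sample this small.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of systolic blood pressures (mmHg), invented for illustration.
sample = [118, 125, 130, 122, 135, 128, 121, 140, 126, 125]

# Descriptive statistics: summarize the sample itself.
n = len(sample)
sample_mean = mean(sample)
sample_sd = stdev(sample)          # sample standard deviation (n - 1 denominator)

# Inferential statistics (estimation): an approximate 95% confidence interval
# for the population mean. The multiplier 1.96 is a normal approximation and
# an assumption here; with n = 10 a t-based multiplier would give a somewhat
# wider interval.
standard_error = sample_sd / sqrt(n)
ci_low = sample_mean - 1.96 * standard_error
ci_high = sample_mean + 1.96 * standard_error

print("Sample mean:", sample_mean)
print("Sample SD:", round(sample_sd, 2))
print("Approximate 95% CI:", round(ci_low, 1), "to", round(ci_high, 1))
```

The first three quantities describe only the ten observations; the interval is a statement about the larger group the sample was drawn from.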

What is the description of

“data”?

Data consist of observations made on

individuals. Normally, we collect observations on

a sample from a much larger group called the

population.

Different samples from the same population will

give different results, a phenomenon called

sampling variation
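Sampling variation can be seen in a short simulation. This Python sketch is illustrative only: the population is simulated, and its mean (120) and standard deviation (15) are arbitrary assumptions.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 1,000 measurements (assumed mean 120, SD 15).
population = [rng.gauss(120, 15) for _ in range(1000)]

# Draw three independent samples of 20 and compare their means.
sample_means = [mean(rng.sample(population, 20)) for _ in range(3)]

print("Population mean:", round(mean(population), 1))
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean:", round(m, 1))
```

Each sample comes from the same population, yet the three sample means differ from one another and from the population mean.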

DATA

The set of values collected for the variable of each

of the elements belonging to the sample

Data Collection

Types of observations

Examples of observations about people are gender,

age, height, eye color, responsiveness to treatment,

life expectancy, etc.

These are called variables; they can be dependent or independent (as already discussed).

Types of variables

We also need to distinguish between outcome and exposure variables, in addition to identifying the type of each of the variables in the data set.

The outcome variable is the variable that is the

focus of our attention, whose variation or

occurrence we are seeking to understand. In

particular we are interested in identifying factors, or

exposures, that may influence the size or the

occurrence of the outcome variable.

Population vs. Sample

Population: The set of all measurements of interest to the investigator, e.g. the average FSc marks of NUST students.

Sample: Any subset of measurements selected from the population.

Example: Is the current government achieving international standards in health care?

To estimate this, we take a sample that is a good representative of the entire population.

Example

A teacher wants to know how well the students in his class did on their last test. The teacher asks the 10

students sitting in the front row to state their latest

test score. He concludes from their report that the

class did extremely well.

What is the sample? What is the population? Can

you identify any problems with choosing the

sample in the way that the teacher did?

Principles of Sampling

undertaking

2. Carefully identify the population for sampling

3. Choose the variables you will measure in the

study

4. Decide appropriate design for producing the

data

5. Collect the data

Principles of Sampling

The population should be explicitly described in

order to obtain a sample that provides accurate

information.

For example: poverty, or the average money spent on food.

Principles of Sampling

3. Choose the variables you will measure in the study

We must determine what will be measured and how it will be measured. In order not to overlook an important issue, we should attempt to identify all relevant variables prior to data collection.

For example, suppose we want to study obesity in youth. Possible variables are genes, demographics, food habits, exercise, etc.

Principles of Sampling

Statistical designs fall into two categories: surveys and experiments.

All election polls are surveys (Gallup, Harris, Roper, etc.).

An experiment is an attempt to determine a cause-and-effect relationship between variables.

Methods of sampling – nonprobabilistic

Non-probability sampling is a sampling technique where the samples are gathered in a process that does not give all the individuals in the population equal chances of being selected.

Students in a class or co-workers in a workplace.

Volunteers.

Judgment sample.

Quota sample: obtain a cross-section of a population, e.g. by age and sex for individuals, or by region, firm size, and industry for businesses. This may be reasonably representative.

The sampling distribution of a statistic cannot be obtained using any of the above methods, so formal statistical inference is not possible.

Statistical inference is "the theory, methods, and practice of forming judgements about the parameters of a population and the reliability of statistical relationships, typically on the basis of random sampling".

Random Sample

A random sample of size n is a set of n elements chosen from the population in such a way that all samples of that size have the same chance of being selected.

E.g. physical mixing, or randomly selecting elements from the population.
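A simple random sample can be drawn in a few lines of Python. This is a minimal sketch, not from the original slides: the patient IDs are hypothetical, and shuffling a list is used as a software stand-in for physical mixing.

```python
import random
from math import comb

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of five patient IDs.
population = ["P1", "P2", "P3", "P4", "P5"]
n = 3

# Number of distinct samples of size n. Under simple random sampling each
# of them has the same probability, 1 / possible_samples.
possible_samples = comb(len(population), n)

# Method 1: let the library draw n elements without replacement.
drawn = rng.sample(population, n)

# Method 2: "physical mixing" - shuffle a copy, then take the first n.
mixed = population[:]
rng.shuffle(mixed)
drawn_by_mixing = mixed[:n]

print("Possible samples of size 3:", possible_samples)
print("Drawn with sample():", drawn)
print("Drawn by mixing:", drawn_by_mixing)
```

Both methods give every subset of three patients the same chance of being selected, which is the defining property stated above.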

Methods of sampling – probabilistic

Random sampling methods – each member has an equal probability of being selected.

Systematic – every kth case. Equivalent to random if patterns in the list are unrelated to the issues of interest. E.g. telephone book.

Stratified samples – sample from each stratum or subgroup of a population. E.g. region, size of firm.

Cluster samples – sample only certain clusters of members of a population. E.g. city blocks, firms.

Multistage samples – combinations of random, systematic, stratified, and cluster sampling.

Probabilistic

Stratified sampling. With stratified sampling, the population is divided into groups, based on some characteristic. Then, within each group, a probability sample (often a simple random sample) is selected. In stratified sampling, the groups are called strata. As an example, suppose we conduct a national survey. We might divide the population into groups or strata, based on geography: north, east, south, and west. Then, within each stratum, we might randomly select survey respondents.

Cluster sampling. With cluster sampling, every member of the population is assigned to one, and only one, group. Each group is called a cluster. A sample of clusters is chosen, using a probability method (often simple random sampling). Only individuals within sampled clusters are surveyed. Note the difference between cluster sampling and stratified sampling. With stratified sampling, the sample includes elements from each stratum. With cluster sampling, in contrast, the sample includes elements only from sampled clusters.

Multistage sampling. With multistage sampling, we select a sample by using combinations of different sampling methods. For example, in Stage 1, we might use cluster sampling to choose clusters from a population. Then, in Stage 2, we might use simple random sampling to select a subset of elements from each chosen cluster for the final sample.

Systematic random sampling. With systematic random sampling, we create a list of every member of the population. From the list, we randomly select the first sample element from the first k elements on the population list. Thereafter, we select every kth element on the list. This method is different from simple random sampling, since not every possible sample of n elements is equally likely.
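The four designs described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a survey-grade implementation: the population of 20 people in four regions is hypothetical, the function names are invented for this example, and the same regional groups are reused as both strata and clusters only to keep the code short.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: 20 people, 5 in each of four regions.
regions = {
    region: [f"{region}-{i}" for i in range(5)]
    for region in ("north", "east", "south", "west")
}
everyone = [person for members in regions.values() for person in members]

def systematic_sample(items, k, rng):
    """Random start within the first k items, then every kth item."""
    start = rng.randrange(k)
    return items[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Simple random sample of n_per_stratum members from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

def multistage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: choose clusters. Stage 2: random sample within each."""
    stage1 = cluster_sample(clusters, n_clusters, rng)
    return {name: rng.sample(members, n_per_cluster)
            for name, members in stage1.items()}

systematic = systematic_sample(everyone, k=4, rng=rng)
stratified = stratified_sample(regions, n_per_stratum=2, rng=rng)
clustered = cluster_sample(regions, n_clusters=2, rng=rng)
multistage = multistage_sample(regions, n_clusters=2, n_per_cluster=2, rng=rng)

# Systematic sampling with k = 4 can only ever produce 4 different samples,
# one per starting position, so most 5-element subsets can never be drawn.
possible_systematic = {tuple(everyone[s::4]) for s in range(4)}

print("Systematic:", systematic)
print("Stratified:", stratified)
print("Cluster:", clustered)
print("Multistage:", multistage)
print("Distinct systematic samples possible:", len(possible_systematic))
```

Note how the stratified result contains every region, while the cluster and multistage results contain only the regions chosen in the first step.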

Example

A research scientist is interested in studying the

experiences of twins raised together versus those

raised apart. She obtains a list of twins from the

National Twin Registry, and selects two subsets of

individuals for her study. First, she chooses all those in

the registry whose last name begins with Z. Then she

turns to all those whose last name begins with B.

Because there are so many names that start with B,

however, our researcher decides to incorporate only

every other name into her sample for B.

What is the population? What is the sample? Was the

sample picked by simple random sampling? Is it

biased?

Sampling Terminology

Parameter

fixed, unknown number that describes the

population

Statistic

known value calculated from a sample

a statistic is often used to estimate a parameter

Variability

different samples from the same population may

yield different values of the sample statistic

Sampling Distribution

tells what values a statistic takes and how often it takes

those values in repeated sampling
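For a very small population the sampling distribution can be written out exactly. The Python sketch below is an illustration, not part of the original slides: the four-value population is hypothetical, and it enumerates every possible sample of size 2 instead of sampling repeatedly.

```python
from collections import Counter
from itertools import combinations

# Hypothetical population of four measurements.
population = [2, 4, 6, 8]

# Parameter: the population mean, a fixed number (usually unknown in practice).
parameter = sum(population) / len(population)

# Statistic: the mean of each possible sample of size 2.
sample_means = [sum(s) / len(s) for s in combinations(population, 2)]

# Sampling distribution: which values the statistic takes, and how often.
sampling_distribution = Counter(sample_means)

print("Population mean (parameter):", parameter)
for value in sorted(sampling_distribution):
    print("Sample mean", value, "occurs", sampling_distribution[value], "time(s)")
```

The sample mean varies from sample to sample, but its values are centred on the parameter it is used to estimate.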

