Geostatistics

Mar 11, 2019

© All Rights Reserved

“Geostatistics: study of phenomena that vary in space and/or time” (Deutsch, 2002)

“Geostatistics can be regarded as a collection of numerical techniques that deal with the characterization of spatial attributes, employing primarily random models in a manner similar to the way in which time series analysis characterizes temporal data.” (Olea, 1999)

“Geostatistics offers a way of describing the spatial continuity of natural phenomena and provides adaptations of classical regression techniques to take advantage of this continuity.” (Isaaks and Srivastava, 1989)

Geostatistics

Geostatistics deals with spatially autocorrelated data.

Autocorrelation: correlation between elements of a series and others from the same series separated from them by a given interval. (Oxford American Dictionary)

Some spatially autocorrelated parameters of interest to reservoir engineers: facies, reservoir thickness, porosity, permeability.

Figure: A plot showing 100 random numbers with a "hidden" sine function, and an autocorrelation (correlogram) of the series on the bottom.

Autocorrelation

Autocorrelation values range from +1 to -1. An autocorrelation of +1 represents perfect positive correlation (i.e. an increase seen in one time series is matched by a proportionate increase in the other time series), while a value of -1 represents perfect negative correlation (i.e. an increase seen in one time series is matched by a proportionate decrease in the other time series).

Autocorrelation can be useful in analysis. For example, if you know a stock historically has a high positive autocorrelation value and you witnessed the stock making solid gains over the past several days, you might reasonably expect the movements over the upcoming several days (the leading time series) to match those of the lagging time series and to move upwards.

Figure: Visual comparison of convolution, cross-correlation and autocorrelation.
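As a rough illustration of the correlogram idea, the sketch below computes the sample autocorrelation of a noisy sine series at several lags. The series (period 20, noise level 0.5) and the function name `autocorrelation` are assumptions made for this example, not data from the slides.

```python
import math
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


random.seed(0)
# Hypothetical series: a sine wave with period 20 hidden in Gaussian noise.
series = [math.sin(2 * math.pi * t / 20) + random.gauss(0, 0.5)
          for t in range(100)]

# Autocorrelation versus lag (the correlogram) for lags 0..30.
correlogram = [autocorrelation(series, lag) for lag in range(31)]
```

With a period of 20, the correlogram is expected to be negative near lag 10 (half a period) and positive again near lag 20, which is how the hidden periodicity shows up despite the noise.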

Basic Components of Geostatistics

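- (Semi)variogram analysis: characterization of spatial correlation
- Kriging: optimal interpolation; generates best linear unbiased estimate at each location; employs semivariogram model
- Stochastic simulation: generation of multiple equiprobable images of the variable; also employs semivariogram model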

Variography

Variography is the process of estimating the theoretical semivariogram. Steps: (1) exploratory data analysis, (2) check for global trend, (3) computation of the empirical semivariogram, (4) binning and fitting a semivariogram model, (5) computation of directional variograms to identify anisotropy.

Kriging

Figure: Example of one-dimensional data interpolation by kriging, with confidence intervals. Squares indicate the location of the data. The kriging interpolation is in red. The confidence intervals are represented by gray areas.

Stochastic simulation

A stochastic process is a family of random variables depending on a variable parameter (which is usually time).

Exploratory Analysis of Example Data

Our example data consist of vertically averaged porosity values, in percent, in Zone A of the Big Bean Field (fictitious, but based on data from a real field). Porosity values are available from 85 wells distributed throughout the field, which is approximately 20 km in east-west extent and 16 km north-south. The porosities range from 12% to 17%. Here are the data values posted at the well locations:

Geostatistical methods are optimal when data are:

- normally distributed and
- stationary (mean and variance do not vary significantly in space)

Significant deviations from normality and stationarity can cause problems, so it is always best to begin by looking at a histogram or similar plot to check for normality and a posting of the data values in space to check for significant trends. The posting above shows some hint of a SW-NE trend, which we will check later.

Looking at the histogram (with a normal density superimposed) and a normal quantile-quantile plot shows that the porosity distribution does not deviate too severely from normality.

Spatial Covariance, Correlation and Semivariance

Covariance and correlation are measures of the similarity between two different variables. To extend these to measures of spatial similarity, consider a scatterplot where the data pairs represent measurements of the same variable made some distance apart from each other. The separation distance is usually referred to as “lag”, as used in time series analysis. We’ll refer to the values plotted on the vertical axis as the lagged variable, although the decision as to which axis represents the lagged values is somewhat arbitrary. Here is a scatterplot of porosity values at wells separated by a nominal lag of 1000 m:

Spatial Covariance, Correlation and Semivariance (continued)

Because of the irregular distribution of wells, we cannot expect to find many pairs of data values separated by exactly 1000 m, if we find any at all. Here we have introduced a “lag tolerance” of 500 m, pooling the data pairs with separation distances between 500 and 1500 m in order to get a reasonable number of pairs for computing statistics. The actual lags for the data pairs shown in the crossplot range from 566 m to 1456 m, with a mean lag of 1129 m.
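The pooling step can be sketched as follows: every pair of wells whose separation falls within the nominal lag plus or minus the tolerance is collected into one bin. The coordinates are made up for illustration (they are not the Big Bean data), and the helper name `pairs_in_lag_bin` is hypothetical.

```python
import math
from itertools import combinations


def pairs_in_lag_bin(points, lag, tol):
    """Return (i, j, distance) for point pairs with |distance - lag| <= tol."""
    selected = []
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        d = math.dist(p, q)
        if abs(d - lag) <= tol:
            selected.append((i, j, d))
    return selected


# Hypothetical well coordinates in metres (easting, northing).
wells = [(0, 0), (600, 0), (0, 1400), (3000, 3000), (3900, 3000)]

# Nominal lag 1000 m with a 500 m tolerance: separations of 500 to 1500 m.
pairs = pairs_in_lag_bin(wells, lag=1000, tol=500)
mean_lag = sum(d for _, _, d in pairs) / len(pairs)
```

The mean of the actual separations in a bin generally differs from the nominal lag, which is why statistics are often plotted against the mean lag of the contributing pairs.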

Spatial Covariance, Correlation and Semivariance (continued)

The three statistics shown on the crossplot are the covariance, correlation, and semivariance between the porosity values on the horizontal axis and the lagged porosity values on the vertical axis.

To formalize the definition of these statistics, we need to introduce some notation. Following standard geostatistical practice, we’ll use:

- u: vector of spatial coordinates (with components x, y or “easting” and “northing” for our 2D example)
- z(u): variable under consideration as a function of spatial location (porosity in this example)
- h: lag vector representing separation between two spatial locations
- z(u+h): lagged version of variable under consideration

Sometimes z(u) will be referred to as the “tail” variable and z(u+h) will be referred to as the “head” variable, since we can think of them as being located at the tail and head of the lag vector, h. The scatterplot of tail versus head values for a certain lag, h, is often called an h-scattergram.

Given N(h) pairs of data separated by lag h (plus or minus the lag tolerance), we can compute the covariance, correlation, and semivariance for lag h.
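A common way to write these three estimators in the notation above is sketched here. The symbols $m_{0}$ and $m_{+h}$ (means of the tail and head values) and $\sigma_{0}$ and $\sigma_{+h}$ (their standard deviations) are introduced for this sketch and are assumptions about the intended notation.

```latex
\begin{align}
C(h) &= \frac{1}{N(h)} \sum_{\alpha=1}^{N(h)} z(u_\alpha)\, z(u_\alpha + h) \;-\; m_{0}\, m_{+h} \\
\rho(h) &= \frac{C(h)}{\sigma_{0}\, \sigma_{+h}} \\
\gamma(h) &= \frac{1}{2 N(h)} \sum_{\alpha=1}^{N(h)} \bigl[ z(u_\alpha + h) - z(u_\alpha) \bigr]^{2}
\end{align}
```

Here the sums run over the $N(h)$ tail and head pairs in the lag bin.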

The semivariance is the moment of inertia or spread of the h-scattergram about the 45° (1 to 1) line shown on the plot.

Covariance and correlation are both

measures of the similarity of

the head and tail values. Semivariance

is a measure of the dissimilarity.
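To make the similarity and dissimilarity measures concrete, here is a minimal sketch that computes all three statistics from a list of tail values and the matching head values. The porosity numbers are invented for illustration, and the function name `lag_statistics` is hypothetical.

```python
import math


def lag_statistics(tail, head):
    """Covariance, correlation and semivariance of paired tail/head values."""
    n = len(tail)
    m_tail = sum(tail) / n
    m_head = sum(head) / n
    # Covariance: mean of products minus product of means.
    cov = sum(t * h for t, h in zip(tail, head)) / n - m_tail * m_head
    s_tail = math.sqrt(sum((t - m_tail) ** 2 for t in tail) / n)
    s_head = math.sqrt(sum((h - m_head) ** 2 for h in head) / n)
    corr = cov / (s_tail * s_head)
    # Semivariance: half the mean squared head-minus-tail difference.
    semivar = sum((h - t) ** 2 for t, h in zip(tail, head)) / (2 * n)
    return cov, corr, semivar


# Hypothetical porosity pairs (percent) from one lag bin.
tail = [13.0, 14.0, 15.0, 16.0]
head = [13.5, 13.5, 15.5, 15.5]
cov, corr, semivar = lag_statistics(tail, head)
```

Points lying exactly on the 1 to 1 line would give a semivariance of zero; the further the pairs scatter from that line, the larger it becomes.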

Here are the h-scatterplots for

nominal lags of 2000 m and

3000 m.

Note that the covariance and

correlation decrease and the

semivariance increases with

increasing separation distance.

The plot above shows all three

statistics versus actual mean lag for

the contributing data pairs at each

lag. The shortest lag shown (the

nominally “zero” lag) includes six

data pairs with a mean lag of 351 m.

The correlation versus lag is referred

to as the correlogram and the

semivariance versus lag is the

semivariogram. The covariance

versus lag is generally just referred

to as the covariance

function.

The empirical functions that we have plotted (computed from the sample data) are of course just estimators of the theoretical functions C(h), ρ(h), and γ(h), which can be thought of as population parameters. Estimating these functions based on irregularly distributed data (the usual case) can be very tricky due to the need to pool data pairs into lag bins.

Larger lag spacings and tolerances allow more data pairs for

estimation but reduce the amount of detail in the

semivariogram (or covariance or correlogram). The problem

is particularly difficult for the shorter lags, which tend to

have very few pairs (six in this example). This is

unfortunate, since the behavior near the origin is the most

important to characterize.

Under the condition of second-order stationarity (spatially constant mean and variance), the covariance function, correlogram, and semivariogram obey the following relationships:

$C(0) = \mathrm{Var}(z(u))$

$\rho(h) = C(h) / C(0)$

$\gamma(h) = C(0) - C(h)$

In words, the lag-zero covariance should be equal to the global variance of the variable under consideration, the correlogram should look like the covariance function scaled by the variance, and the semivariogram should look like the covariance function turned upside down:
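These relationships can be checked numerically with a small sketch. It assumes an exponential covariance model, C(h) = sill * exp(-3h/a), chosen purely for illustration; the function names and parameter values are hypothetical.

```python
import math


def covariance(h, sill=1.0, practical_range=1000.0):
    """Hypothetical exponential covariance model: sill * exp(-3h / a)."""
    return sill * math.exp(-3.0 * h / practical_range)


def correlogram(h, **params):
    """Covariance scaled by the variance C(0)."""
    return covariance(h, **params) / covariance(0.0, **params)


def semivariogram(h, **params):
    """Covariance function turned upside down: C(0) - C(h)."""
    return covariance(0.0, **params) - covariance(h, **params)


lags = [0, 250, 500, 1000, 2000]
# Rows of (lag, C(h), rho(h), gamma(h)) for a variance (sill) of 2.0.
table = [(h,
          covariance(h, sill=2.0),
          correlogram(h, sill=2.0),
          semivariogram(h, sill=2.0)) for h in lags]
```

In every row the covariance and semivariance add up to the variance, and the correlogram is the covariance divided by that same variance.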

Semivariance

The semivariance at lag h is half the mean squared difference between paired values, $\gamma(h) = \frac{1}{2N} \sum_{i=1}^{N} [z(u_i + h) - z(u_i)]^2$, where the sum runs over the pairs and N is the number of pairs of data points separated by lag distance h.

The sill is the amount of semivariance achieved at the plateau of the curve, and is approximately equal to the variance of the data. The range is the lag distance beyond which the data are no longer correlated. Data within the range are correlated and can be used for making predictions. These two values can be calculated by fitting a model to the semivariogram. Different models will yield different values for the sill and range. The nugget is the semivariance as h approaches 0 (the intercept of the fitted model), and is a measure of the inherent variability in the data or the noise of the data.

The range and sill

When you look at the model of a

semivariogram, you'll notice that

at a certain distance, the model

levels out. The distance where the

model first flattens out is known

as the range. Sample locations

separated by distances closer than

the range are spatially

autocorrelated, whereas locations

farther apart than the range are

not.

The value that the semivariogram

model attains at the range (the

value on the y-axis) is called the

sill. The partial sill is the sill

minus the nugget.

The nugget

Theoretically, at zero separation distance (lag = 0), the semivariogram value is 0. However, at an infinitesimally small separation distance, the semivariogram often exhibits a nugget effect, which is some value greater than 0. For example, if the semivariogram model intercepts the y-axis at 2, then the nugget is 2.

The nugget effect can be attributed to

measurement errors or spatial sources of

variation at distances smaller than the

sampling interval or both. Measurement

error occurs because of the error inherent

in measuring devices. Natural phenomena

can vary spatially over a range of scales.

Variation at microscales smaller than the

sampling distances will appear as part of

the nugget effect. Before collecting data, it

is important to gain some understanding of

the scales of spatial variation.

Modeling the Semivariogram

Using h to represent lag distance, a to represent (practical) range,

and c to represent sill, the five most frequently used models are:
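The model list itself is not reproduced here. As a hedged sketch, the functions below implement five models commonly listed in geostatistics texts under the practical-range convention (nugget, spherical, exponential, Gaussian, and power); whether these are exactly the five intended is an assumption, and the function names and the `omega` exponent are illustrative.

```python
import math


def nugget_model(h, c):
    """Pure nugget: 0 at h = 0, sill c everywhere else."""
    return 0.0 if h == 0 else c


def spherical(h, a, c):
    """Spherical model: reaches the sill c exactly at range a."""
    if h >= a:
        return c
    r = h / a
    return c * (1.5 * r - 0.5 * r ** 3)


def exponential(h, a, c):
    """Exponential model: reaches about 95% of the sill at practical range a."""
    return c * (1.0 - math.exp(-3.0 * h / a))


def gaussian(h, a, c):
    """Gaussian model: parabolic near the origin, about 95% of sill at a."""
    return c * (1.0 - math.exp(-3.0 * h ** 2 / a ** 2))


def power(h, c, omega):
    """Power model: no sill; requires 0 < omega < 2."""
    return c * h ** omega


# Example: semivariance at half the range for a sill of 2 and range of 1000 m.
half_range_values = {
    "spherical": spherical(500.0, 1000.0, 2.0),
    "exponential": exponential(500.0, 1000.0, 2.0),
    "gaussian": gaussian(500.0, 1000.0, 2.0),
}
```

The factor of 3 in the exponential and Gaussian models is what makes `a` a practical range: those models approach the sill asymptotically, so the range is taken as the lag at which about 95% of the sill is reached.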

Above are semi-variograms of effective porosity and permeability logs produced with a core-calibrated multi-mineral petrophysical analysis. Short ranges usually imply a high degree of heterogeneity, while long ranges tend to imply larger structures and less heterogeneity.

Using the vertical semi-variogram, one can obtain a rough idea of what the horizontal semi-variogram may be, based on ranges for different lithological and depositional facies.

Having more wells in the near vicinity would help to establish the true horizontal semi-variogram, but a wildcat well may be the only one within miles, so the modeling geologist must be able to make a rough estimate of the ratio of vertical to horizontal semi-variance.

Stochastic Modelling

Stochastic or geostatistical

modeling is a method of

determining heterogeneity and

uncertainty in a spatial

distribution such as a petroleum

reservoir. Before drilling,

optimum placement of the well is

key to maximize profits while

minimizing uncertainty. Multiple

realizations give many "what if"

type scenarios providing a general

overview of the inherent

uncertainty inevitable with sparse

well control.

Why

A deterministic model, which describes a population of bacteria, considers continuous concentrations of molecules. However, in a single bacterium, the quantity of the different proteins is of the order of 100, and the concentrations thus take discrete values. These values depend on events (production, degradation) which are hard to predict, and must therefore be approached in terms of probability of occurrence (or propensity).

How

… conditions and to see what they might be like.

To introduce that randomness we use a new function: propensities. As an example, we consider 4 possible reactions. Each reaction has a probability of happening in the next time step.

… regarding the propensities.

When we run a stochastic simulation once, we get a trajectory. This graph represents the random evolution of the variables. Because of the randomness, if we run the script another time we will get a different trajectory. That is why, to be able to interpret the results, we have to run the script hundreds or thousands of times.

Instead of describing a process which can only evolve in one way, in a stochastic

or random process there is some indeterminacy : even if the initial condition is

known, there are several directions in which the process may evolve.

Traditional continuous and deterministic biochemical rate equations do not accurately predict cellular reactions since they rely on bulk reactions that require the interactions of millions of molecules. In contrast, the Gillespie algorithm allows a discrete and stochastic simulation of a system with few reactants because every reaction is explicitly simulated. When simulated, a Gillespie realization represents a random walk of the entire system.

We will use the following notations and quantities:

Now that we have presented the theory behind the stochastic approach, let us have a look at the algorithm.

The algorithm comprises 5 steps.
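The individual steps are not listed here. As a hedged sketch, the code below follows the commonly described Gillespie direct method (evaluate propensities, draw the waiting time, pick a reaction, update the state, repeat); that these match the five steps intended above is an assumption. The two-reaction production and degradation system and its rate constants are hypothetical.

```python
import random


def gillespie(x0, propensity_fns, state_changes, t_end, rng):
    """Gillespie direct method; returns a list of (time, state) tuples."""
    t, x = 0.0, list(x0)
    trajectory = [(t, tuple(x))]
    while True:
        # Evaluate the propensity of every reaction in the current state.
        props = [f(x) for f in propensity_fns]
        total = sum(props)
        if total == 0:
            break  # no reaction can fire any more
        # Draw the exponentially distributed waiting time to the next event.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose which reaction fires, with probability propensity / total.
        idx = rng.choices(range(len(props)), weights=props)[0]
        # Apply that reaction's change to the molecule counts and record it.
        x = [xi + d for xi, d in zip(x, state_changes[idx])]
        trajectory.append((t, tuple(x)))
    return trajectory


# Hypothetical system: production at constant rate 10, degradation at 0.1 * x.
propensities = [lambda x: 10.0, lambda x: 0.1 * x[0]]
changes = [(+1,), (-1,)]
traj = gillespie((0,), propensities, changes, t_end=50.0, rng=random.Random(1))
```

Each call with a different seed gives a different trajectory, so summary statistics are normally taken over many runs rather than read off a single one.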

Simulation

Simulation is broadly defined as the process of replicating reality using a model. In geostatistics, simulation is the realization of a random function (surface) that has the same statistical features as the sample data used to generate it (measured by the mean, variance, and semivariogram). Gaussian geostatistical simulation (GGS), more specifically, is suitable for continuous data and assumes that the data, or a transformation of the data, has a normal (Gaussian) distribution. The main assumption behind GGS is that the data is stationary: the mean, variance, and spatial structure (semivariogram) do not change over the spatial domain of the data.

Another key assumption of GGS is that the

random function being modeled is a

multivariate Gaussian random function.

GGS offers an advantage over kriging.

Because kriging is based on a local average

of the data, it produces smoothed output.

GGS, on the other hand, produces better

representations of the local variability

because it adds the local variability that is

lost in kriging back into the surfaces it

generates. The variability that GGS

realizations add to the predicted value at a

particular location has a mean of zero, so that

the average of many GGS realizations tends

toward the kriging prediction. This concept

is illustrated in the figure below. Different

realizations are represented as a stack of

output layers, and the distribution of values

at a particular coordinate is Gaussian, with a

mean equal to the kriged estimate for that

location and a spread that is given by the

kriging variance at that location.
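The claim that realizations average out to the kriged estimate can be illustrated at a single location. The sketch below is not a full sequential Gaussian simulation: it assumes simple kriging with a known mean, two data points in one dimension, and an exponential covariance model, and then draws many values from the resulting Gaussian distribution. All names and numbers are hypothetical.

```python
import math
import random


def cov(h, sill=1.0, practical_range=10.0):
    """Hypothetical exponential covariance model."""
    return sill * math.exp(-3.0 * abs(h) / practical_range)


def simple_kriging_2pt(x0, xs, zs, mean):
    """Simple kriging estimate and variance at x0 from exactly two data."""
    c11, c22, c12 = cov(0.0), cov(0.0), cov(xs[0] - xs[1])
    b1, b2 = cov(x0 - xs[0]), cov(x0 - xs[1])
    # Solve the 2x2 kriging system for the weights by Cramer's rule.
    det = c11 * c22 - c12 * c12
    w1 = (b1 * c22 - b2 * c12) / det
    w2 = (b2 * c11 - b1 * c12) / det
    estimate = mean + w1 * (zs[0] - mean) + w2 * (zs[1] - mean)
    variance = cov(0.0) - (w1 * b1 + w2 * b2)
    return estimate, variance


# Porosity data (percent) at x = 0 and x = 10; estimate at x = 4.
estimate, variance = simple_kriging_2pt(4.0, xs=(0.0, 10.0),
                                        zs=(14.0, 16.0), mean=15.0)

# Many simulated values at x = 4: Gaussian around the kriged estimate.
rng = random.Random(42)
realizations = [rng.gauss(estimate, math.sqrt(variance)) for _ in range(20000)]
average = sum(realizations) / len(realizations)
```

Any single realization can differ noticeably from the kriged value, but the average over many of them settles close to it, while their spread reflects the kriging variance.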

Realizations

There are many applications in

which spatially dependent

variables are used as input for

models (for example, flow

simulation in petroleum

engineering). In these cases,

uncertainty in the model's results

is evaluated by producing a

number of simulations using the

following procedure:

1. A large number of equally

probable realizations are

simulated for the variable.

2. The model (generally termed

a transfer function) is run using

the simulated variable as input.

3. The model runs are

summarized to evaluate

variability in the model's output.
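The three-step procedure above can be sketched in a few lines. Both the "simulation" (a single Gaussian porosity value per realization) and the transfer function (pore volume from porosity) are deliberately simplistic stand-ins chosen for illustration; a real study would use full spatial realizations and a flow simulator.

```python
import random
import statistics


def simulate_porosity(rng, mean=15.0, sd=1.0):
    """Stand-in for one realization: a single porosity value in percent."""
    return rng.gauss(mean, sd)


def transfer_function(porosity_pct, bulk_volume_m3=1.0e6):
    """Hypothetical model: pore volume (m3) from porosity and bulk volume."""
    return bulk_volume_m3 * porosity_pct / 100.0


rng = random.Random(7)

# Step 1: simulate many equally probable realizations of the variable.
realizations = [simulate_porosity(rng) for _ in range(1000)]

# Step 2: run the transfer function on each realization.
outputs = [transfer_function(p) for p in realizations]

# Step 3: summarize the variability of the model output.
q1, median, q3 = statistics.quantiles(outputs, n=4)
```

The interquartile range `q3 - q1` is one simple summary of how much the input uncertainty propagates into the model result.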

Realization of effective porosity produced with the sequential Gaussian

simulation (sGs) algorithm and the semi-variograms shown above.

Realization of permeability produced with the sequential Gaussian

simulation (sGs) algorithm and the semi-variograms shown above.

How many realizations should be generated?

Results from simulation studies should not depend on the number of realizations that were generated. One way to determine how many realizations to generate is to compare the statistics for different numbers of realizations in a small portion of the data domain (a subset is used to save time). The statistics tend toward a fixed value as the number of realizations increases. The statistics examined in the example below are the first and third quartiles, which were calculated for a small region (subset) of simulated elevation surfaces (in feet above sea level) for the state of Wisconsin, USA.

The top graph shows fluctuations in elevation for the first 100 realizations. The lower graph shows results for 1,000 realizations.
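A convergence check of this kind can be sketched as follows. The simulated values are random Gaussian numbers invented for the example (they are not the Wisconsin elevation data), and the helper name `running_quartiles` is hypothetical.

```python
import random
import statistics

rng = random.Random(3)
# Hypothetical simulated values at one location, one per realization.
values = [rng.gauss(300.0, 25.0) for _ in range(1000)]


def running_quartiles(values, counts):
    """First and third quartiles using only the first `count` realizations."""
    result = {}
    for count in counts:
        q1, _, q3 = statistics.quantiles(values[:count], n=4)
        result[count] = (q1, q3)
    return result


quartiles = running_quartiles(values, counts=[10, 50, 100, 500, 1000])
```

Comparing the entries for successive counts shows whether the quartiles have stopped drifting; once they change little, generating more realizations adds little information.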

Some Geostatistics Textbooks

C.V. Deutsch, 2002, Geostatistical Reservoir Modeling, Oxford University Press, 376 pages.
- Focuses specifically on modeling of facies, porosity, and permeability for reservoir simulation.

C.V. Deutsch and A.G. Journel, 1998, GSLIB: Geostatistical Software Library and User's Guide, Second Edition, Oxford University Press, 369 pages.
- Owner's manual for the GSLIB software library; serves as a standard reference for concepts and terminology.

P. Goovaerts, 1997, Geostatistics for Natural Resources Evaluation, Oxford University Press, 483 pages.
- A nice introduction with examples focused on an environmental chemistry dataset; includes more advanced topics like factorial kriging.

E.H. Isaaks and R.M. Srivastava, 1989, An Introduction to Applied Geostatistics, Oxford University Press, 561 pages.
- Probably the best introductory geostatistics textbook; intuitive development of concepts from first principles with clear examples at every step.

P.K. Kitanidis, 1997, Introduction to Geostatistics: Applications in Hydrogeology, Cambridge University Press, 249 pages.
- A somewhat different take, with a focus on generalized covariance functions; includes discussion of geostatistical inversion of (groundwater) flow models.

M. Kelkar and G. Perez, 2002, Applied Geostatistics for Reservoir Characterization, Society of Petroleum Engineers Inc., 264 pages.
- Covers much the same territory as Deutsch's 2002 book; jam-packed with figures illustrating concepts.

R.A. Olea, 1999, Geostatistics for Engineers and Earth Scientists, Kluwer Academic Publishers, 303 pages.
- Step by step mathematical development of key concepts, with clearly documented numerical examples.
