Uncertainty in Engineering

Subset Sampling

Reliability assessments, quantified by failure probabilities, are usually performed with the aid of Monte Carlo simulation. For small failure probabilities, however, and especially for complex nonlinear structures, Monte Carlo simulation reaches its limits, because the number of samples required for a given accuracy grows in inverse proportion to the failure probability. Subset sampling is a more advanced simulation method that promises to compensate for this drawback.

The basic idea of subset sampling is to subdivide the failure event F into a sequence of m nested partial failure events (subsets) F1 ⊃ F2 ⊃ ··· ⊃ Fm = F. Determining a small failure probability PF by Monte Carlo simulation generally requires the expensive simulation of rare events. The division into subsets (subproblems) makes it possible to replace the simulation of one rare event by a set of simulations of more frequent events. The failure regions Fi are defined by a series of limit values gi, i = 1, ..., m, so that Fi contains all realizations x with g(x) < gi; m denotes the total number of subsets, which is not known in advance.


This enables the computation of the failure probability as the product of P(F1) and the conditional probabilities P(Fi+1|Fi):

\[ P_F = P(F_m) = P(F_1) \prod_{i=1}^{m-1} P(F_{i+1} \mid F_i) \]
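As a rough numerical illustration of this product, the following sketch splits a small failure probability into equally likely conditional events. The numbers (six subsets, each with probability 0.1) are assumptions chosen for illustration and are not taken from the source.

```python
import math

# Assumed illustration: a failure probability of 1e-6 split into
# m = 6 nested subsets, each with (conditional) probability 0.1.
conditional_probabilities = [0.1] * 6   # P(F1), P(F2|F1), ..., P(F6|F5)

# Product of P(F1) and the conditional probabilities.
p_f = math.prod(conditional_probabilities)
print(f"P_F = {p_f:.1e}")
```

Each factor corresponds to an event that occurs in roughly one of ten samples, which is far cheaper to simulate than an event that occurs once in a million samples.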


The choice of the failure regions Fi, and hence of the partial failure probabilities Pi = P(Fi | Fi−1), influences the accuracy of the simulation. It is convenient to specify the limit values gi, i = 1, ..., m, so that nearly equal partial failure probabilities Pi, i = 2, ..., m, are obtained for each subset. Unfortunately, it is difficult to specify the limit values gi in advance so that a prescribed probability Pi results. The limit values therefore have to be determined adaptively during the simulation.

Subset sampling algorithm

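In the first step, the probability P1 = P(F1) is estimated by direct Monte Carlo simulation with N1 samples xk(1) drawn from the joint probability density function f(x), using the indicator function of F1:

\[ P_1 = P(F_1) \approx \frac{1}{N_1} \sum_{k=1}^{N_1} I_{F_1}\bigl(x_k^{(1)}\bigr) \]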
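A minimal sketch of this first step, assuming a performance function `g`, a sampler for the random variables, and a given limit value. The function name `direct_mc_probability` and the toy problem are hypothetical and serve only to show the counting estimator.

```python
import random

def direct_mc_probability(g, sample, limit, n, rng):
    """Estimate P(g(X) < limit) as the fraction of n samples below the limit."""
    hits = 0
    for _ in range(n):
        x = sample(rng)          # one realization of the random variables
        if g(x) < limit:         # indicator of the partial failure region
            hits += 1
    return hits / n

rng = random.Random(0)
# Hypothetical toy problem: X uniform on [0, 1], g(x) = x, limit value 0.3,
# so the exact probability is 0.3.
p1 = direct_mc_probability(lambda x: x, lambda r: r.random(), 0.3, 20_000, rng)
print(p1)
```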


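To obtain the conditional probabilities P(Fi+1|Fi), the evaluation of the respective conditional probability density functions

\[ f(x \mid F_i) = \frac{I_{F_i}(x)\, f(x)}{P(F_i)} \]

is necessary, where f(x) is the joint probability density function of the random variables and I_{F_i}(x) is the indicator function of Fi. Markov chain Monte Carlo simulation in conjunction with the Metropolis-Hastings algorithm generates samples according to f(x|Fi) in a numerically efficient way.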
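The sampling from f(x|Fi) can be sketched as follows. This is an illustration under assumptions, not the source's implementation: it uses a symmetric uniform random-walk proposal (the Metropolis special case of Metropolis-Hastings), and the names `conditional_chain`, `log_pdf`, and `step` are hypothetical. A candidate is first accepted or rejected with respect to f(x) and is then kept only if it lies in the failure region.

```python
import math
import random

def conditional_chain(x0, log_pdf, g, limit, n_steps, step, rng):
    """Random-walk Metropolis chain whose target is f(x | g(x) < limit).

    x0 must already satisfy g(x0) < limit.
    """
    chain = [x0]
    x = x0
    for _ in range(n_steps):
        # Symmetric proposal: uniform perturbation of every component.
        candidate = [xi + rng.uniform(-step, step) for xi in x]
        # Metropolis acceptance with respect to the unconditional density f(x).
        log_ratio = log_pdf(candidate) - log_pdf(x)
        accepted = log_ratio >= 0 or rng.random() < math.exp(log_ratio)
        # Candidates outside the failure region are rejected: the chain stays put.
        if accepted and g(candidate) < limit:
            x = candidate
        chain.append(x)
    return chain

def log_std_normal(x):
    """Log density of independent standard normal variables (up to a constant)."""
    return -0.5 * sum(xi * xi for xi in x)

rng = random.Random(1)
# Hypothetical example: one standard normal variable conditioned on x > 2.
samples = conditional_chain([2.5], log_std_normal, lambda x: 2.0 - x[0], 0.0, 500, 1.0, rng)
```

Because rejected candidates repeat the current state, the samples of a chain are correlated, which is what the efficiency discussion of this article accounts for.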


The starting sample of subset i+1 is selected randomly from those samples x(i) of subset i that are located in the failure region Fi, i.e. that satisfy g(x(i)) < gi, i = 1, ..., m−1. The limit value gi of the i-th subset is determined adaptively during the simulation. For this purpose, the Ni tuples (xk(i), g(xk(i))), k = 1, ..., Ni, are sorted in ascending order of the values g(xk(i)), and gi is chosen so that the predefined partial failure probability p0 is obtained, i.e. a fraction p0 of the samples satisfies g(xk(i)) < gi.
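The adaptive choice of the limit value can be sketched as below. The convention of placing the limit midway between the last retained and the first discarded value is an assumption (other conventions pick one of the two sorted values directly), and the name `adaptive_limit` is hypothetical.

```python
def adaptive_limit(samples, g_values, p0):
    """Return (limit, seeds): the limit value and the samples below it.

    Requires 0 < p0 * len(samples) < len(samples).
    """
    # Sort the tuples (g(x_k), x_k) in ascending order of g.
    pairs = sorted(zip(g_values, samples), key=lambda pair: pair[0])
    n_seeds = int(round(p0 * len(pairs)))
    # Assumed convention: midpoint between the last retained and the
    # first discarded g-value.
    limit = 0.5 * (pairs[n_seeds - 1][0] + pairs[n_seeds][0])
    seeds = [x for value, x in pairs[:n_seeds]]
    return limit, seeds

g_values = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 1.0]
samples = [[v] for v in g_values]        # toy samples with one variable each
limit, seeds = adaptive_limit(samples, g_values, p0=0.2)
print(limit, seeds)
```

The returned seeds are the samples located in the failure region Fi and serve as starting points for the Markov chains of the next subset.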

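If the adaptively determined limit value satisfies gi < 0, the last subset m of the simulation has been reached and the limit value is set to gm = 0, so that Fm = F. The last conditional failure probability P(Fm|Fm−1) can be estimated as the fraction of the Nm samples of the last subset that lie in the failure region:

\[ P(F_m \mid F_{m-1}) \approx \frac{1}{N_m} \sum_{k=1}^{N_m} I_F\bigl(x_k^{(m)}\bigr) \]

The failure probability PF may now be computed as

\[ P_F \approx P(F_1) \prod_{i=1}^{m-1} P(F_{i+1} \mid F_i) \]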

The subset sampling algorithm is illustrated for three subsets in the following figure.
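In code, the steps of this section can be combined into a compact sketch. It rests on several assumptions that are not taken from the source: the random variables are independent standard normal (not beta-distributed as in the example of this article), failure is defined by g(x) < 0, the proposal is a uniform random walk, the limit value is the midpoint between two sorted values, and p0 times n is an integer that divides n. All names, including `subset_sampling`, are hypothetical.

```python
import math
import random

def subset_sampling(g, dim, n=1000, p0=0.1, step=1.0, max_levels=20, seed=0):
    """Estimate P(g(X) < 0) for independent standard normal X (an assumption)."""
    rng = random.Random(seed)

    def log_pdf(x):
        return -0.5 * sum(v * v for v in x)

    n_seeds = int(round(p0 * n))     # samples kept per subset
    chain_len = n // n_seeds         # new states generated per seed

    # Subset 1: direct Monte Carlo simulation.
    xs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    gs = [g(x) for x in xs]
    p_f = 1.0
    for _ in range(max_levels):
        # Adaptive limit value from the ascending sorted g-values.
        order = sorted(range(n), key=lambda k: gs[k])
        limit = 0.5 * (gs[order[n_seeds - 1]] + gs[order[n_seeds]])
        if limit <= 0.0:
            # Last subset reached: count samples in the actual failure region.
            return p_f * sum(1 for v in gs if v < 0.0) / n
        p_f *= n_seeds / n           # partial failure probability, about p0
        # Markov chains started from the samples located in the current subset.
        new_xs, new_gs = [], []
        for k in order[:n_seeds]:
            x, gx = xs[k], gs[k]
            for _ in range(chain_len):
                cand = [v + rng.uniform(-step, step) for v in x]
                log_ratio = log_pdf(cand) - log_pdf(x)
                if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                    g_cand = g(cand)
                    if g_cand < limit:   # keep only candidates in the subset
                        x, gx = cand, g_cand
                new_xs.append(x)
                new_gs.append(gx)
        xs, gs = new_xs, new_gs
    # Fallback if max_levels is exhausted before the failure region is reached.
    return p_f * sum(1 for v in gs if v < 0.0) / n

# Hypothetical test problem with a known solution: P_F = Phi(-beta).
beta = 3.0
estimate = subset_sampling(lambda x: beta - (x[0] + x[1]) / math.sqrt(2.0), dim=2)
exact = 0.5 * math.erfc(beta / math.sqrt(2.0))
print(estimate, exact)
```

The linear performance function is chosen only because its exact failure probability is known, so the estimate can be checked against it.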


Numerical efficiency

The numerical efficiency can be assessed by the number of samples required. In contrast to Monte Carlo simulation, the required sample size for subset sampling cannot be determined in advance. A coarse approximation of the total number of samples NT that is sufficient to estimate PF is given by

\[ N_T \approx \frac{|\ln P_F|^{r}}{|\ln p_0|^{r}} \cdot \frac{(1+\gamma)(1-p_0)}{p_0\, \delta^2} \]

The parameter p0 denotes the predefined failure probability of the respective subset and δ denotes the coefficient of variation of the estimate of PF. The exponent r < 3 reflects the correlation between the estimates Pi of the individual subsets. The value of γ represents the correlation among the Markov chain samples and itself depends on the selected proposal density function q. For γ > 0, the accuracy of the conditional probability estimator Pi is reduced compared with the case of independently distributed samples (γ = 0).
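To give a feeling for the orders of magnitude, the sketch below compares sample counts under two assumed approximations: N_T ≈ (|ln PF| / |ln p0|)^r (1 + γ)(1 − p0) / (p0 δ²) for subset sampling, and N ≈ (1 − PF) / (PF δ²) for direct Monte Carlo simulation. The default values (δ = 0.3, γ = 0, r = 3) and the function names are illustrative assumptions, not values from the source.

```python
import math

def samples_subset(p_f, p0=0.1, delta=0.3, gamma=0.0, r=3.0):
    """Assumed approximation of the total sample count for subset sampling."""
    levels = abs(math.log(p_f)) / abs(math.log(p0))   # approximate number of subsets
    return levels ** r * (1.0 + gamma) * (1.0 - p0) / (p0 * delta ** 2)

def samples_monte_carlo(p_f, delta=0.3):
    """Direct Monte Carlo: delta^2 = (1 - p_f) / (p_f * N), solved for N."""
    return (1.0 - p_f) / (p_f * delta ** 2)

for p_f in (1e-3, 1e-6):
    print(f"P_F = {p_f:.0e}: subset sampling {samples_subset(p_f):.0f}, "
          f"Monte Carlo {samples_monte_carlo(p_f):.0f}")
```

Under these assumptions the subset sampling count grows with a power of |ln PF|, whereas the Monte Carlo count grows like 1/PF, so the gap widens quickly as PF decreases.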

The estimated computational effort of subset sampling versus Monte Carlo simulation is shown in the following diagrams.




Example


The subset sampling method is demonstrated by means of a numerical example. The performance function g(x1, x2) depends on the variables x1 and x2, which are assumed to be beta-distributed random variables. The generated realizations in the respective subsets are illustrated in the following figures.

1st subset (direct Monte Carlo simulation)

2nd subset (Markov chain Monte Carlo simulation)



3rd subset (Markov chain Monte Carlo simulation)

