LHS sampling can return a perfectly correlated set of points #316

@dk-teknologisk-mon

The current LHS algorithm builds the design by creating a list of N equally spaced values along each dimension and then randomizing their order. Nothing in the randomization step guards against the values coming out in the same (for example sorted) order in every dimension, which places all points on the diagonal. The problem is worst for spaces with few dimensions and a small number of points, where the risk of getting pearls on a string along the diagonal is greatest.
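To put a rough number on that risk, here is a minimal sketch. It assumes the design is built from one independently shuffled column of bin midpoints per dimension; `naive_lhs` is a hypothetical stand-in written for this comment, not the actual ProcessOptimizer code. Under that assumption, a 2D design with 5 points should land exactly on the diagonal or anti-diagonal in about 2 out of 5! = 120 draws.

```python
import numpy as np


def naive_lhs(n, n_dims, rng):
    """Hypothetical stand-in for the LHS: shuffle bin midpoints per dimension."""
    midpoints = (np.arange(n) + 0.5) / n  # n equally spaced values in (0, 1)
    return np.column_stack([rng.permutation(midpoints) for _ in range(n_dims)])


rng = np.random.default_rng(0)
n_trials = 10_000
n_diagonal = 0
for _ in range(n_trials):
    design = naive_lhs(5, 2, rng)
    r = np.corrcoef(design[:, 0], design[:, 1])[0, 1]
    # |r| == 1 only when the two columns share the same or the reversed order
    if abs(r) > 1 - 1e-9:
        n_diagonal += 1

fraction = n_diagonal / n_trials
print(f"Fraction of perfectly correlated designs: {fraction:.4f}")
```

If the real implementation randomizes differently, the exact fraction will differ, but the same check can be run against `opt.space.lhs` output to see how often it happens in practice.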

There are many potential ways to improve the algorithm. One is to generate a relatively large number of candidate designs and pick the one that is best in terms of the maximin criterion, i.e. the largest minimum pairwise distance between points in the space. An alternative is to select for the lowest covariance between dimensions, i.e. the design closest to orthogonal.
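A minimal sketch of the candidate-selection idea, again built on a hypothetical `naive_lhs` stand-in rather than the library's own sampler. `maximin_lhs`, `min_pairwise_distance`, `max_abs_correlation` and the `n_candidates` parameter are names made up for illustration. The correlation score is included only to show that the alternative criterion could be swapped in with `min(candidates, key=max_abs_correlation)`.

```python
import numpy as np


def naive_lhs(n, n_dims, rng):
    """Hypothetical stand-in for the LHS: shuffle bin midpoints per dimension."""
    midpoints = (np.arange(n) + 0.5) / n
    return np.column_stack([rng.permutation(midpoints) for _ in range(n_dims)])


def min_pairwise_distance(design):
    """Smallest Euclidean distance between any two points of the design."""
    diffs = design[:, None, :] - design[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(len(design), k=1)  # each pair once, no self-pairs
    return dists[upper].min()


def max_abs_correlation(design):
    """Largest absolute Pearson correlation between any two dimensions."""
    corr = np.corrcoef(design, rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return np.abs(off_diagonal).max()


def maximin_lhs(n, n_dims, rng, n_candidates=100):
    """Draw n_candidates designs and keep the one with the largest minimum distance."""
    candidates = [naive_lhs(n, n_dims, rng) for _ in range(n_candidates)]
    return max(candidates, key=min_pairwise_distance)


rng = np.random.default_rng(41)
best = maximin_lhs(5, 2, rng)
print(best)
print("min pairwise distance:", min_pairwise_distance(best))
print("max |correlation|:", max_abs_correlation(best))
```

Every candidate is still a valid Latin hypercube, so the selection only changes which design is returned, at the cost of generating `n_candidates` designs per call.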

Furthermore, the internal LHS algorithm is currently hard-seeded when the Optimizer is constructed. I'm not sure this is ideal; maybe we should pass an RNG to the LHS call instead?
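One possible shape for such an interface, as a sketch only: the function name `lhs_with_rng` and its `rng` argument are hypothetical and do not reflect the current `Space.lhs` signature. It relies on `np.random.default_rng`, which accepts `None`, an integer seed, or an existing `Generator` (returned unaltered), so callers can either fix a seed or share one random stream across calls.

```python
import numpy as np


def lhs_with_rng(n, n_dims, rng=None):
    """Hypothetical LHS that takes a seed or a numpy Generator instead of a fixed seed."""
    rng = np.random.default_rng(rng)  # None, int seed, or Generator all work here
    midpoints = (np.arange(n) + 0.5) / n
    return np.column_stack([rng.permutation(midpoints) for _ in range(n_dims)])


# Same integer seed: reproducible design
a = lhs_with_rng(5, 2, rng=41)
b = lhs_with_rng(5, 2, rng=41)

# Shared Generator: each call advances the stream
shared = np.random.default_rng(41)
c = lhs_with_rng(5, 2, rng=shared)
d = lhs_with_rng(5, 2, rng=shared)
```

Here `a`, `b` and `c` are identical because they all start from a fresh stream seeded with 41, while `d` is drawn from the advanced stream and will generally differ.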

Here is one example of the problem:

from ProcessOptimizer import Optimizer
import matplotlib.pyplot as plt

space = [(0.0, 1.0), (0.0, 1.0)]
n_points = 5  # Number of points in the design
seed = 41

# Build optimizer
opt = Optimizer(
    dimensions=space,
    lhs=True,
    n_initial_points=n_points,
    random_state=seed,
)

# Draw the LHS design directly from the optimizer's space
points = opt.space.lhs(n=n_points, seed=seed)

# Plot the design, one marker per point
plt.figure()
for point in points:
    plt.scatter(point[0], point[1])
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.title("Badly correlated initial points")
plt.show()

[Image: scatter plot produced by the code above]
