The present LHS algorithm builds the design by creating a list of N equally spaced values along each dimension, followed by a randomization step. Nothing in this procedure prevents the values from ending up in sorted (or nearly sorted) order in several dimensions at once, which gives strongly correlated points. The problem is largest for spaces with few dimensions and a low number of points, where the risk of getting "pearls on a string" along the diagonal is greatest.
There are many potential ways to improve the algorithm. One is to create a relatively large number of candidate designs and pick the one that is best in terms of the maximin criterion, i.e. the design with the largest minimum pairwise distance between points in the space. An alternative is to optimize for the lowest covariance between dimensions, i.e. the most orthogonal design.
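To make the maximin idea concrete, here is a minimal sketch using only NumPy. It is not the ProcessOptimizer implementation: `lhs_candidate`, `min_pairwise_distance` and `maximin_lhs` are hypothetical helper names, the design lives on the unit hypercube, and bin midpoints are assumed as the "equally spaced values".

```python
import numpy as np


def lhs_candidate(n_points, n_dims, rng):
    """One candidate design: shuffled bin midpoints in each dimension (assumed layout)."""
    levels = (np.arange(n_points) + 0.5) / n_points
    # Shuffle the same set of levels independently for every dimension
    return np.column_stack([rng.permutation(levels) for _ in range(n_dims)])


def min_pairwise_distance(design):
    """Smallest Euclidean distance between any two distinct points."""
    diff = design[:, None, :] - design[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Only look at the upper triangle so the zero self-distances are skipped
    upper = np.triu_indices(len(design), k=1)
    return dist[upper].min()


def maximin_lhs(n_points, n_dims, n_candidates=1000, rng=None):
    """Return the candidate with the largest minimum pairwise distance."""
    rng = np.random.default_rng(rng)  # accepts None, an int seed or a Generator
    best, best_score = None, -np.inf
    for _ in range(n_candidates):
        candidate = lhs_candidate(n_points, n_dims, rng)
        score = min_pairwise_distance(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best


design = maximin_lhs(n_points=5, n_dims=2, n_candidates=200, rng=0)
```

Every candidate is still a valid Latin hypercube, so the selection only changes which of them is used. For 5 points in 2 dimensions a perfectly diagonal design has the smallest possible score, so it is only returned if every candidate happens to score that low. The same loop could rank candidates by the off-diagonal entries of `np.corrcoef(design, rowvar=False)` instead, if low correlation is the preferred criterion.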
Furthermore, the internal LHS algorithm is currently hard-seeded when the Optimizer is constructed. I'm not sure this is ideal; maybe we should pass an RNG to the LHS call instead?
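As a rough illustration of what passing an RNG could look like, the sketch below uses a hypothetical stand-in function, `lhs_design`, that takes a `numpy.random.Generator` rather than a seed. It does not reflect the current `space.lhs` signature, and the bin-midpoint layout is an assumption.

```python
import numpy as np


def lhs_design(n_points, n_dims, rng):
    """Hypothetical LHS call that draws from a caller-supplied Generator."""
    levels = (np.arange(n_points) + 0.5) / n_points
    return np.column_stack([rng.permutation(levels) for _ in range(n_dims)])


rng = np.random.default_rng(41)
first = lhs_design(5, 2, rng)
# The generator has advanced, so this is a fresh draw rather than a repeat
second = lhs_design(5, 2, rng)

# Re-creating the generator from the same seed reproduces the same sequence
replay = lhs_design(5, 2, np.random.default_rng(41))
```

With this pattern the caller controls reproducibility: one seeded generator shared by the Optimizer and the LHS call gives a repeatable run, while repeated LHS calls are not forced to return the identical design.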
Here is one example of the problem:
from ProcessOptimizer import Optimizer
import matplotlib.pyplot as plt

space = [(0.0, 1.0), (0.0, 1.0)]
n_points = 5  # Number of points in the design
seed = 41

# Build optimizer
opt = Optimizer(
    dimensions=space,
    lhs=True,
    n_initial_points=n_points,
    random_state=seed,
)

# Draw the LHS design directly from the space, using the same seed
design = opt.space.lhs(n=n_points, seed=seed)

# Plot the starting version
plt.figure()
for point in design:  # separate loop variable so the design is not overwritten
    plt.scatter(point[0], point[1])
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.title("Badly correlated initial points")
plt.show()
