I've tried adding an init image to the pipeline, but the output always ends up blurred no matter how many steps I run, as if a Gaussian blur had been applied to the image. Would a different scheduler affect this? This is how I build the input tensor from the image:
public static Tensor<float> GetLatentSampleFromImage(Image<RgbaVector> image, int batchSize, int width, int height)
{
    // noise here?
    // Round both dimensions down to a multiple of 64 and size the tensor to match,
    // so the tensor is never larger than the resized image.
    width -= width % 64;
    height -= height % 64;
    image.Mutate(ctx => ctx.Resize(width, height));

    const int channels = 3;
    var latents = new DenseTensor<float>(new[] { batchSize, channels, height, width });

    // Open the pixel buffer once and walk every row inside the callback.
    // Note: only batch index 0 is filled.
    image.ProcessPixelRows(accessor =>
    {
        for (int y = 0; y < accessor.Height; y++)
        {
            Span<RgbaVector> row = accessor.GetRowSpan(y);
            for (int x = 0; x < row.Length; x++) // maybe .transpose(0, 3, 1, 2)?
            {
                // RgbaVector components are floats in [0, 1]; map them to [-1, 1].
                latents[0, 0, y, x] = row[x].R * 2f - 1f;
                latents[0, 1, y, x] = row[x].G * 2f - 1f;
                latents[0, 2, y, x] = row[x].B * 2f - 1f;
            }
        }
    });
    return latents;
}
Thanks for this great project, by the way! After building the tensor, I encode it through the vae_encoder. Does that look about right?
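For reference, here is a minimal sketch of how I understand the encode step, not a confirmed implementation. It assumes the vae_encoder is an ONNX model run through Microsoft.ML.OnnxRuntime, that it returns an already-sampled latent as its first output, and that the Stable Diffusion v1 scaling factor of 0.18215 applies. `VaeEncoderSketch`, `EncodeImage`, and `encoderPath` are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

public static class VaeEncoderSketch
{
    // Assumption: latent scaling factor used by Stable Diffusion v1 models.
    private const float LatentScale = 0.18215f;

    // encoderPath is a hypothetical path to the exported vae_encoder ONNX model.
    // pixels is the [batch, 3, height, width] tensor with values in [-1, 1].
    public static DenseTensor<float> EncodeImage(string encoderPath, Tensor<float> pixels)
    {
        using var session = new InferenceSession(encoderPath);

        // Read the input name from the model instead of hard-coding it.
        string inputName = session.InputMetadata.Keys.First();
        var inputs = new List<NamedOnnxValue>
        {
            NamedOnnxValue.CreateFromTensor(inputName, pixels)
        };

        using var results = session.Run(inputs);

        // Copy the first output into managed memory before the results are disposed.
        DenseTensor<float> latents = results.First().AsTensor<float>().ToDenseTensor();

        // Scale the latents in place by the assumed factor.
        var span = latents.Buffer.Span;
        for (int i = 0; i < span.Length; i++)
        {
            span[i] *= LatentScale;
        }

        return latents;
    }
}
```

As far as I can tell, image-to-image pipelines then add scheduler noise to these scaled latents at the chosen starting timestep rather than to the pixels, so a missing scaling or noise step may be related to the blur I'm seeing.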