Skip to content

Conversation

@dlindhol
Copy link
Member

I've started making some utilities for working with S3 buckets, initially to support an S3 granule list adapter (coming soon). I decided to play with both the "regular" and the effectful approach. I have a number of TODOs for issues that we can discuss.

The testing also raises some questions. The first time I ran with localstack, it did take about a minute for it to start up. Subsequent runs were much faster. Not sure how it managed that. I wonder how it would play out in CI. I'd hate to slow down the tests too much. Note that I also tested with real buckets, breaking my "rule" for external dependencies.

@dlindhol
Copy link
Member Author

Error in CI already...

@dlindhol dlindhol requested review from jli2003 and lindholc June 13, 2025 22:45
S3Client.builder.region(region).build

/** Makes an effectful S3 client. */
//TODO: Async here? delay?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're only using delay, you only need Sync.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was playing type tetris to get as generic of an F as possible. I wasn't sure if I needed to go for Async here when making an AsyncClient. I guess making is separate from the client-ing so this could be Sync. Anything more generic?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is some confusion on S3Client. S3Client is not pure, it's just a synchronous (read: blocking, since it needs to do network stuff) S3 client.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going for a simple way to get a client for those who don't want to deal with effects. I suppose that has no place here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you see any value keeping an effectful synchronous client constructor?

import software.amazon.awssdk.services.s3.S3Client
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request

object S3Utils {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike the "bag of methods" util objects.

I think it would be better to define a trait and wrap an S3 client implementation:

trait S3Client[F[_]] {
  def getKeys(...): Stream[F, String]
}

object S3Client {
  def fromAsyncClient[F[_]: Sync](client: S3AsyncClient): Resource[F, S3Client] =
    new S3Client {
      override def getKeys(...): Stream[F, String] = ???
    }
}

This has the benefit of being able to make a no-op implementation that just returns a list of canned values for testing. With S3Utils.getKeys you're forced to use a real S3 client.

"ch.qos.logback" % "logback-classic" % logbackVersion % Test,
"com.github.fs2-blobstore" %% "s3" % "0.9.15",
"software.amazon.awssdk" % "s3" % "2.31.59",
"com.dimafeng" %% "testcontainers-scala-localstack-v2" % "0.43.0" % Test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could try s3proxy instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants