Description
Currently we support object storage (Amazon S3 or compatible services) by allowing users to install S3-compatible client libraries inside containers and images and letting their workloads connect to external object storage services.
There is increasing demand for a "managed" object storage abstraction from both customers and storage vendors. In Backend.AI, a managed storage space is abstracted as a vfolder.
Potential Mapping Designs
- Bucket-to-VFolder
- Subdirectory-to-VFolder
For simplicity, we’d choose the first one: bucket-to-vfolder.
Component Extensions
Storage Proxy
- Let’s treat the connection configuration to a specific object storage service as a storage-proxy backend.
  - Backend type: s3-compatible ones (including MinIO), …
  - Each volume configuration includes the endpoint and the service credentials (like API keys).
- cf) Each vfolder is mapped to a specific storage-proxy volume via its `host` field.
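As a rough illustration of the volume configuration described above, here is a minimal sketch. The class names, field names, the `"s3"` backend label, and the `"<proxy>:<volume>"` layout of the host field are all assumptions made for this example, not the actual storage-proxy schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectStorageVolume:
    """Hypothetical volume entry for an S3-compatible storage-proxy backend."""
    name: str            # volume name referenced by a vfolder's host field
    endpoint: str        # object storage service endpoint URL
    access_key: str      # service credential (API key)
    secret_key: str      # service credential (API secret)
    backend: str = "s3"  # assumed backend type label


@dataclass(frozen=True)
class VFolderEntry:
    """Hypothetical vfolder record under the bucket-to-vfolder mapping."""
    name: str
    host: str    # assumed to look like "<proxy>:<volume>"
    bucket: str  # exactly one bucket per vfolder


def resolve_volume(vfolder, volumes):
    """Look up the volume configuration that a vfolder's host field points to."""
    _proxy, sep, volume_name = vfolder.host.partition(":")
    if not sep:
        raise ValueError(f"malformed host field: {vfolder.host!r}")
    return volumes[volume_name]
```

With this shape, the endpoint and credentials live only in the volume configuration, and a vfolder carries just the bucket name plus a pointer to the volume.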
Manager
- Depending on the storage backend type of a vfolder, we need to pass the object storage endpoint & credentials to the agent when creating sessions.
- Similarly to the unmanaged vfolders (https://lablup.atlassian.net/browse/BA-114), we need to apply specialized lifecycle implementations. e.g.:
- Decouple actual filesystem-level creation/deletion of vfolders.
- Allow users to register/deregister vfolder entries assuming that the original bucket’s lifecycle is managed by the object-storage solution.
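The two manager-side points above could be sketched roughly as follows. `BucketVFolderRegistry`, its method names, and the dictionary layout are hypothetical and only illustrate the idea that registering or deregistering a vfolder never touches the bucket itself, while session creation hands the endpoint and credentials to the agent.

```python
class BucketVFolderRegistry:
    """Hypothetical registry of vfolders whose buckets are managed externally."""

    def __init__(self, volumes):
        # volumes: host name -> {"endpoint": ..., "access_key": ..., "secret_key": ...}
        self._volumes = volumes
        self._entries = {}  # vfolder name -> (host, bucket)

    def register(self, name, host, bucket):
        """Record a vfolder entry; the bucket is assumed to exist already."""
        if name in self._entries:
            raise ValueError(f"vfolder {name!r} is already registered")
        if host not in self._volumes:
            raise KeyError(f"unknown host {host!r}")
        self._entries[name] = (host, bucket)

    def deregister(self, name):
        """Drop the entry only; the bucket and its objects are left untouched."""
        return self._entries.pop(name)

    def mount_spec(self, name):
        """Build what the manager would pass to the agent at session creation."""
        host, bucket = self._entries[name]
        volume = self._volumes[host]
        return {
            "bucket": bucket,
            "endpoint": volume["endpoint"],
            "access_key": volume["access_key"],
            "secret_key": volume["secret_key"],
        }
```

Keeping bucket creation and deletion out of this class is the point of the sketch: the object-storage solution remains the owner of the bucket lifecycle.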
Agent
- Before creating a container, the agent lets `s3fs` mount the bucket into the local filesystem and bind-mounts it into the container.
- After destroying a container, the agent unmounts the bucket.
- We need to keep track of the references to a specific bucket to prevent duplicate mounts and premature unmounts when there are multiple containers using the same bucket.
- We may need to handle potential system instability due to frequent mount/unmount in the filesystem.
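The reference tracking mentioned above could look roughly like the following sketch. `BucketMountTracker` is a hypothetical name, the mount and unmount actions are injected as plain callables so that the real `s3fs` invocation stays out of the example, and a thread lock is assumed as the concurrency guard (an asyncio-based agent would use its own primitive).

```python
import threading


class BucketMountTracker:
    """Hypothetical per-node tracker that mounts a bucket once and counts users."""

    def __init__(self, mount, unmount):
        self._mount = mount      # callable(bucket): performs the actual mount
        self._unmount = unmount  # callable(bucket): performs the actual unmount
        self._refs = {}          # bucket -> number of containers using it
        self._lock = threading.Lock()

    def acquire(self, bucket):
        """Called before creating a container; mounts only on first use."""
        with self._lock:
            count = self._refs.get(bucket, 0)
            if count == 0:
                self._mount(bucket)  # if this raises, no reference is recorded
            self._refs[bucket] = count + 1

    def release(self, bucket):
        """Called after destroying a container; unmounts only on last release."""
        with self._lock:
            count = self._refs.get(bucket, 0)
            if count == 0:
                raise KeyError(f"bucket {bucket!r} is not mounted")
            if count == 1:
                del self._refs[bucket]
                self._unmount(bucket)
            else:
                self._refs[bucket] = count - 1
```

Two containers sharing one bucket then trigger a single mount and a single unmount, which also reduces the mount/unmount churn noted in the last bullet.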
Technical Considerations
- How to implement read-only mounts?
- In the storage-proxy, we need to keep mounts of ALL registered buckets to implement managed vfolder interaction APIs. Could this place too much burden on the storage-proxy nodes?
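One possible direction for the read-only question, assuming `s3fs` remains the mount helper, is to pass the generic FUSE `ro` mount option when building the command line. The helper name `build_s3fs_command` and its parameters are hypothetical; `url`, `passwd_file`, and `use_path_request_style` are existing `s3fs` options, the last of which is assumed to be wanted for MinIO-style endpoints.

```python
def build_s3fs_command(bucket, mountpoint, endpoint, passwd_file, read_only=False):
    """Build an s3fs argument list; the caller decides how to execute it."""
    cmd = [
        "s3fs", bucket, mountpoint,
        "-o", f"url={endpoint}",
        "-o", f"passwd_file={passwd_file}",
        "-o", "use_path_request_style",  # assumed: path-style requests for MinIO-like services
    ]
    if read_only:
        cmd += ["-o", "ro"]  # generic FUSE option: mount the filesystem read-only
    return cmd
```

This only covers the host-side mount. Whether read-only should instead (or also) be enforced at the bind-mount into the container is left open here, as in the question above.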
JIRA Issue: BA-255