Example template for using Conda + Docker to build reproducible, easy-to-deploy models.
The blog post goes into more detail. Find it here:
https://binal.pub/2018/10/data-science-with-docker-and-conda/
As an example, here's my normal development process. Using it, I can get from development to production with little friction, knowing that my code will work as expected and that it won't negatively affect other processes on the production server.
- Clone the template. Update `environment.yml` with the packages I know I'll need, and run `docker-compose build`. This builds the development image with all the packages I defined installed in it.
- Create a `.env_dev` file with development environment variables.
- Run `docker-compose up` and navigate to JupyterLab, which will be running on http://localhost:8888. Access it by entering the token `local_dev`.
- From there, prototype and develop a model/process using Jupyter Notebooks, saving any notebooks I create along the way into `/notebooks` as a development diary. Any final artifacts/models I plan on using in production I save within `/code`.
- Once I have a final version of my code, save it (and any models it relies on) into `/code`.
- Update the `docker-compose.prod.yml` file's `command` section to point to my script's name, and the `image` section to point to my Docker registry (something like `my_registry/my_project:0.1`).
- Run `docker-compose -f docker-compose.prod.yml build`. This builds the production version of the image, packaging everything in the `/code` and `/notebooks` directories directly onto the image.
- Run `docker-compose -f docker-compose.prod.yml push`, which pushes that packaged image into my organization's Docker registry.
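
To make the `command` and `image` edits concrete, here is a minimal sketch of what the relevant part of `docker-compose.prod.yml` might look like. The service name `app`, the script name `run_model.py`, and the build context are assumptions for illustration; the file shipped with the template may be laid out differently.

```yaml
# docker-compose.prod.yml (illustrative sketch, not the template's actual file)
version: "3"
services:
  app:                                  # hypothetical service name
    build:
      context: .                        # assumes the Dockerfile sits at the repository root
    image: my_registry/my_project:0.1   # registry/repository:tag that `push` will upload to
    command: python /code/run_model.py  # hypothetical script saved under /code
```

With both `build` and `image` set on the service, the `build` step tags the resulting image with that name and the `push` step uploads it to the registry the name points to.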
At this point I now have an image that contains all my code, models, and other artifacts I need, that's preinstalled with exact versions of the Python packages and dependencies I require. It's stored in a central location where I can easily pull it down onto other servers.
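
The "exact versions" guarantee comes from pinning packages in `environment.yml`. A minimal sketch, assuming a typical data science stack; the environment name, package list, and version numbers below are illustrative and not taken from the template:

```yaml
# environment.yml (illustrative sketch; names and versions are examples only)
name: my_project            # hypothetical environment name
channels:
  - conda-forge
dependencies:
  - python=3.6
  - jupyterlab=0.35.0       # pinned so the dev image is rebuilt identically
  - pandas=0.23.4
  - scikit-learn=0.20.0
```

Pinning each dependency to a specific version means that rebuilding the image later, or on another machine, resolves to the same packages that were used during development.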