MMvec refactor #166

We're going to go with [PyTorch](https://www.pytorchlightning.ai/) or [NumPyro](https://github.com/pyro-ppl/numpyro).  The framework will have the following skeleton:

`model.py` (or `mmvec.py`)

```python
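import torch
import torch.nn as nn
from torch.distributions import Multinomial

class MMvec(nn.Module):
    def __init__(self, num_microbes, num_metabolites, latent_dim):
        super().__init__()
        self.encoder = nn.Embedding(num_microbes, latent_dim)
        # nn.Sequential takes modules as positional arguments, not a list
        self.decoder = nn.Sequential(nn.Linear(latent_dim, num_metabolites), nn.Softmax(dim=-1))
        # TODO : may want to have a better softmax

    def forward(self, X, Y):
        """ X is one-hot encodings (B x num_microbes).  Y is metabolite abundances (B x num_metabolites).  B is the batch size"""
        # nn.Embedding expects integer indices, so recover them from the one-hot rows
        z = self.encoder(X.argmax(dim=-1))
        pred_y = self.decoder(z)
        # validate_args=False: total_count defaults to 1, but each row of Y has its own total
        lp = Multinomial(probs=pred_y, validate_args=False).log_prob(Y).mean()
        return lp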
```

`train.py` (could use PyTorch Lightning)
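
As a starting point, here is a rough sketch of what the training loop could look like in plain PyTorch, before deciding whether Lightning is worth the dependency. It is not a settled design. The `fit` function, its `patience` and `lr` defaults, the synthetic data, and the compact `MMvec` variant (integer microbe indices as input, logits instead of a softmax) are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
from torch.distributions import Multinomial


class MMvec(nn.Module):
    """Compact variant of the skeleton (assumption: index inputs, logit outputs)."""

    def __init__(self, num_microbes, num_metabolites, latent_dim):
        super().__init__()
        self.encoder = nn.Embedding(num_microbes, latent_dim)
        self.decoder = nn.Linear(latent_dim, num_metabolites)

    def forward(self, idx, Y):
        # idx: (B,) integer microbe indices; Y: (B, num_metabolites) counts
        logits = self.decoder(self.encoder(idx))
        # validate_args=False because each row of Y has its own total count
        return Multinomial(logits=logits, validate_args=False).log_prob(Y).mean()


def fit(model, idx, Y, val_idx, val_Y, lr=0.1, max_epochs=200, patience=10):
    """Hypothetical helper: maximize the mean log-likelihood with full-batch Adam
    and stop once the validation log-likelihood stops improving."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    best, stale, history = float("-inf"), 0, []
    for epoch in range(max_epochs):
        model.train()
        opt.zero_grad()
        loss = -model(idx, Y)  # negative mean log-likelihood
        loss.backward()
        opt.step()

        model.eval()
        with torch.no_grad():
            val_lp = model(val_idx, val_Y).item()
        history.append(val_lp)

        if val_lp > best + 1e-4:
            best, stale = val_lp, 0
        else:
            stale += 1
            if stale >= patience:
                break  # simple patience-based early stop
    return history


# Synthetic stand-in data (assumption: real data comes from paired tables)
torch.manual_seed(0)
num_microbes, num_metabolites, latent_dim = 5, 4, 2
idx = torch.randint(0, num_microbes, (64,))
Y = torch.randint(0, 20, (64, num_metabolites)).float()

model = MMvec(num_microbes, num_metabolites, latent_dim)
history = fit(model, idx[:48], Y[:48], idx[48:], Y[48:])
```

If Lightning is adopted, the body of the loop would presumably move into a `training_step` and the patience logic into its early-stopping callback.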

The wishlist
- Early stopping (see [video](https://pytorch-lightning.readthedocs.io/en/stable/common/early_stopping.html) for example)
- [ArviZ](https://github.com/arviz-devs/arviz) for diagnostics
- Typing would be great.  See [torchtyping](https://github.com/patrick-kidger/torchtyping)
- Torchtests could be cool also.  See [torchtest](https://github.com/suriyadeepan/torchtest)
- Being Bayesian would be nice.  [SWAG](https://pytorch.org/blog/pytorch-1.6-now-includes-stochastic-weight-averaging/) is the laziest approach
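
For the last item, the linked post describes stochastic weight averaging (SWA), which ships in `torch.optim.swa_utils`. SWAG goes further by fitting a Gaussian over the weight iterates, which is not shown here. The sketch below only illustrates the averaging step. The `nn.Linear` stand-in model, the random data, and the `swa_start` epoch are assumptions made for illustration.

```python
import torch
import torch.nn as nn
from torch.optim.swa_utils import AveragedModel

torch.manual_seed(0)
model = nn.Linear(3, 2)            # stand-in for the MMvec module
swa_model = AveragedModel(model)   # keeps a running average of model's parameters
opt = torch.optim.SGD(model.parameters(), lr=0.05)

x, y = torch.randn(32, 3), torch.randn(32, 2)  # random placeholder data
swa_start = 10                     # hypothetical epoch at which averaging begins

for epoch in range(30):
    opt.zero_grad()
    loss = ((model(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
    if epoch >= swa_start:
        # fold the current weights into the running average
        swa_model.update_parameters(model)

# Predictions from the averaged weights
swa_pred = swa_model(x)
```

The averaged model gives a point estimate only. Getting uncertainty out of it would need the extra covariance bookkeeping that SWAG adds on top.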
