Distributed inoperable in Docker Hub version #65

@dimalvovs

Distributed CoGAPS does not work in the Docker Hub image of pycogaps (fertiglab/pycogaps), although standard CoGAPS runs fine in the same image. A newer Docker image is also available in GHCR, so it would make sense to maintain only one of the two.

Steps to reproduce:

  1. Pull the image: docker pull fertiglab/pycogaps
  2. Run the image: docker run -it --entrypoint /bin/bash fertiglab/pycogaps
  3. Validate that the standard (non-distributed) version works:
echo "if __name__ == '__main__':
    from PyCoGAPS.parameters import *
    from PyCoGAPS.pycogaps_main import CoGAPS
    import scanpy as sc

    modsimpath = 'data/ModSimData.txt'
    modsim = sc.read_text(modsimpath)

    params = CoParams(path=modsimpath)
    params.printParams()

    setParams(params, {
        'nIterations':10000,
        'seed': 42,
        'nPatterns': 3
    })

    params.printParams()
    start = time.time()
    result = CoGAPS(modsimpath, params)
    end = time.time()
    print('TIME:', end - start)

    result.write('data/dist_modsim.h5ad')" > test2.py

python3 test2.py 

______      _____       _____   ___  ______  _____ 
| ___ \    /  __ \     |  __ \ / _ \ | ___ \/  ___|
| |_/ /   _| /  \/ ___ | |  \// /_\ \| |_/ /\ `--. 
|  __/ | | | |    / _ \| | __ |  _  ||  __/  `--. |
| |  | |_| | \__/\ (_) | |_\ \| | | || |    /\__/ /
\_|   \__, |\____/\___/ \____/\_| |_/\_|    \____/ 
       __/ |                                       
      |___/             
                                 
                    

-- Standard Parameters --
nPatterns:  3
nIterations:  1000
seed:  0
sparseOptimization:  False


-- Sparsity Parameters --
alpha: 0.01
maxGibbsMass:  100.0

setting distributed parameters - call this again if you change nPatterns
if you wish to perform genome-wide distributed cogaps, please run setParams(params, "distributed", "genome-wide")

-- Standard Parameters --
nPatterns:  3
nIterations:  10000
seed:  42
sparseOptimization:  False


-- Sparsity Parameters --
alpha: 0.01
maxGibbsMass:  100.0

This is pycogaps version  0.0.1
Running Standard CoGAPS on ModSimData.txt ( 25 genes and 20 samples) with parameters: 

-- Standard Parameters --
nPatterns:  3
nIterations:  10000
seed:  42
sparseOptimization:  False


-- Sparsity Parameters --
alpha: 0.01
maxGibbsMass:  100.0

Data Model: Dense, Normal
Sampler Type: Sequential
Loading Data...Done! (00:00:00)
-- Equilibration Phase --
1000 of 10000, Atoms: 64(A), 45(P), ChiSq: 1830, Time: 00:00:00 / 00:00:00
2000 of 10000, Atoms: 69(A), 42(P), ChiSq: 1466, Time: 00:00:00 / 00:00:00
3000 of 10000, Atoms: 80(A), 50(P), ChiSq: 1229, Time: 00:00:00 / 00:00:00
4000 of 10000, Atoms: 73(A), 54(P), ChiSq: 1212, Time: 00:00:00 / 00:00:00
5000 of 10000, Atoms: 86(A), 52(P), ChiSq: 1151, Time: 00:00:00 / 00:00:00
6000 of 10000, Atoms: 81(A), 52(P), ChiSq: 1151, Time: 00:00:00 / 00:00:00
7000 of 10000, Atoms: 75(A), 48(P), ChiSq: 1178, Time: 00:00:00 / 00:00:00
8000 of 10000, Atoms: 70(A), 57(P), ChiSq: 1155, Time: 00:00:00 / 00:00:00
9000 of 10000, Atoms: 73(A), 54(P), ChiSq: 1173, Time: 00:00:00 / 00:00:00
10000 of 10000, Atoms: 79(A), 58(P), ChiSq: 1159, Time: 00:00:00 / 00:00:00
-- Sampling Phase --
1000 of 10000, Atoms: 74(A), 51(P), ChiSq: 1125, Time: 00:00:00 / 00:00:00
2000 of 10000, Atoms: 78(A), 56(P), ChiSq: 1161, Time: 00:00:00 / 00:00:00
3000 of 10000, Atoms: 79(A), 57(P), ChiSq: 1166, Time: 00:00:00 / 00:00:00
4000 of 10000, Atoms: 69(A), 55(P), ChiSq: 1176, Time: 00:00:00 / 00:00:00
5000 of 10000, Atoms: 80(A), 55(P), ChiSq: 1175, Time: 00:00:00 / 00:00:00
6000 of 10000, Atoms: 81(A), 48(P), ChiSq: 1168, Time: 00:00:00 / 00:00:00
7000 of 10000, Atoms: 73(A), 56(P), ChiSq: 1151, Time: 00:00:00 / 00:00:00
8000 of 10000, Atoms: 72(A), 51(P), ChiSq: 1156, Time: 00:00:00 / 00:00:00
9000 of 10000, Atoms: 75(A), 60(P), ChiSq: 1155, Time: 00:00:00 / 00:00:00
10000 of 10000, Atoms: 80(A), 50(P), ChiSq: 1179, Time: 00:00:00 / 00:00:00

GapsResult result object with 25 features and 20 samples
3 patterns were learned

TIME: 0.9086663722991943
  4. Create the test script with the distributed configuration:
echo "if __name__ == '__main__':
    from PyCoGAPS.parameters import *
    from PyCoGAPS.pycogaps_main import CoGAPS
    import scanpy as sc

    modsimpath = 'data/ModSimData.txt'
    modsim = sc.read_text(modsimpath)

    params = CoParams(path=modsimpath)
    params.printParams()

    setParams(params, {
        'nIterations':10000,
        'seed': 42,
        'nPatterns': 3,
        'useSparseOptimization': True,
        'distributed': 'genome-wide'
    })

    params.setDistributedParams(nSets=2)
    params.printParams()
    start = time.time()
    result = CoGAPS(modsimpath, params)
    end = time.time()
    print('TIME:', end - start)

    result.write('data/dist_modsim.h5ad')" > test.py
  5. Run the program: python3 test.py
  6. The observed output contains an error:
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 313, in callInternalCoGAPS
    gapsresult = standardCoGAPS(adata, params, uncertainty, transposeData=params.coparams["transposeData"])
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 166, in standardCoGAPS
    result = GapsResultToAnnData(gapsresultobj, adata, prm)
  File "/home/user/pycogaps-docker/PyCoGAPS/helper_functions.py", line 434, in GapsResultToAnnData
    Pmean = toNumpy(gapsresult.Pmean)[prm.coparams["subsetIndices"], :]
IndexError: index 22 is out of bounds for axis 0 with size 20
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 23, in <module>
    result = CoGAPS(modsimpath, params)
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 44, in CoGAPS
    result = distributedCoGAPS(path, params, uncertainty=None)
  File "/home/user/pycogaps-docker/PyCoGAPS/pycogaps_main.py", line 197, in distributedCoGAPS
    result = list(result)
  File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
IndexError: index 22 is out of bounds for axis 0 with size 20
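
The numbers in the error match the data shape reported in the log above (25 genes and 20 samples). One possible reading, which is an assumption and has not been checked against the PyCoGAPS source, is that subsetIndices holds gene indices from the genome-wide split, while Pmean has one row per sample. The minimal NumPy sketch below only illustrates that kind of mismatch. The arrays and the index_rows helper are hypothetical stand-ins, not PyCoGAPS objects.

```python
import numpy as np

# Shape reported in the log: 25 genes, 20 samples, 3 patterns.
n_genes, n_samples, n_patterns = 25, 20, 3

# Hypothetical stand-in for toNumpy(gapsresult.Pmean): one row per sample.
pmean = np.zeros((n_samples, n_patterns))

# Hypothetical gene subset. The values are made up, except 22,
# which is the index named in the traceback.
subset_indices = np.array([1, 5, 22])


def index_rows(matrix, indices):
    """Select rows by index; return (rows, None) or (None, error message)."""
    try:
        return matrix[indices, :], None
    except IndexError as exc:
        return None, str(exc)


# Gene indices applied to a sample-sized axis: index 22 does not fit in 20 rows.
rows, error = index_rows(pmean, subset_indices)
print("Pmean-like matrix:", error)

# The same indices fit a gene-sized axis (25 rows) without error.
gene_rows, gene_error = index_rows(np.zeros((n_genes, n_patterns)), subset_indices)
print("gene-sized matrix:", gene_rows.shape, gene_error)
```

If this reading is correct, the first call fails in the same way as the traceback and the second succeeds. That would suggest the subset indices are being applied along the wrong axis for a genome-wide run.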

Labels: bug