Understanding CSFS Samples and Discretization in ASMC's Prepare Decoding Tool

Hi there, 

I'm using the compiled C++ version of Prepare Decoding to create decoding quantities files for fastSMC, with the goal of analyzing IBD segments. I have the demo files from ASMC_data and frequency files generated from my own dataset, which contains 1600 samples and around 500,000 variants. For the discretization, I used the file included in the package, "input_30_-100-2000.disc".

When I set 'CSFSsamples=1600' to match the sample count, the run hit a memory issue and ended in a core dump. Lowering 'CSFSsamples' to 300 fixed the problem.

What does the 'CSFSsamples' value actually mean? Does it need to match the sample count in the frequency file, or in the '.haps', '.samples', and '.map' files that will be used in the fastSMC analysis later (n = 1600)? Also, is there a maximum value for 'CSFSsamples'?

Additionally, I'd like to know how to define my own number of quantiles for the discretization in the C++ version. I noticed that the Python version lets the user define the discretization like this: discretization=[[30.0, 15], [100.0, 15], 39]. How can I do the same in the C++ version?
