Thank you for your question, and sorry for the late reply!
Unfortunately, I have limited availability to work on the performance of the PAC component, and I do not expect improvements to this part of the package in the near future.
However, as suggested here, you can try downsampling your dataset with a method such as geosketch, infer the appropriate number of clusters on the subsample, and then use that number to cluster your entire dataset.
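To make the suggestion concrete, here is a minimal sketch of the subsampling step in Python. It is an illustration under stated assumptions, not the package's or geosketch's actual API: it draws cells uniformly at random as a stand-in for geosketch (which picks a geometrically representative subset instead), and the subsample size of 5000 is a hypothetical value, not a recommendation from this thread. Since PAC is defined over pairs of cells, the number of pairs is used as a rough proxy for the amount of work; it is not a measurement of runtime.

```python
import random


def subsample_indices(n_cells, n_keep, seed=0):
    """Pick n_keep distinct cell indices uniformly at random.

    Uniform sampling is a simple stand-in here; geosketch would
    select a geometry-aware subset instead.
    """
    if not 0 < n_keep <= n_cells:
        raise ValueError("n_keep must be between 1 and n_cells")
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_cells), n_keep))


def n_pairs(n):
    """Number of unordered pairs among n cells."""
    return n * (n - 1) // 2


n_cells = 36260  # dataset size given in the question
n_keep = 5000    # hypothetical subsample size (assumption)

idx = subsample_indices(n_cells, n_keep)

# Workflow outline:
# 1. Keep only the rows of the data matrix listed in idx.
# 2. Run the consensus clustering / PAC analysis on that subset
#    to choose the number of clusters.
# 3. Cluster the full dataset once with the chosen number.

print(f"pairs in full data:  {n_pairs(n_cells):,}")
print(f"pairs in subsample:  {n_pairs(n_keep):,}")
```

The fixed seed keeps the subsample reproducible, so the number of clusters inferred on it can be re-derived later.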
Hi,
This is a nice package and the documentation is helpful. There is one issue I am having. I am running the following command:
consensus_cluster(data, k_max=15, n_reps=100, p_sample=0.8, p_feature=0.8)
My data comes from 8 samples totaling 36,260 cells and 3,664 genes. It is scaled data. When I run that code, it says it will take an estimated 5 days to run. I have 256 GB of memory and 64 cores. Is there a way to run this command in parallel? I need to check up to 90 clusters, so at 5 days per 15 clusters this will take a long time.
Also, is it normal to take 5 days to check the first 15 clusters?
Thank you,
-Damien