groupCorr cor_exp_th = 0.90 separate two > 0.9995 #50

Open
cbonnefoy opened this issue Mar 13, 2020 · 6 comments


@cbonnefoy

Hello, and thanks in advance for your help.

I am using CAMERA

xsgf <- as(fdata, "xcmsSet")
xsaf <- xsAnnotate(xsgf, sample = c(3:5), polarity = "positive")
xsaFf <- groupFWHM(xsaf, sigma = 6, perfwhm = 0.6, intval = "maxo")
xsaFCIf <- findIsotopes(xsaFf, maxcharge = 3, maxiso = 4, ppm = 5, mzabs = 0.015, intval = "maxo", minfrac = 1, isotopeMatrix = NULL, filter = TRUE)
xsFCf <- groupCorr(xsaFCIf, cor_exp_th = 0.90, calcCiS = FALSE, calcCaS = TRUE)
xsaFCIAf <- findAdducts(xsFCf, polarity = "positive")

At the end, features that shared a pc-group after groupFWHM end up in different pc-groups.

Example

line 2227 and line 2239
after FWHM pc-group = 21 for both
after isotopes, corr and adducts pc-group = 756 and 757 respectively
the correlation coefficient calculated by
calcCaS(xsaFCIf,corval=0.90, pval=0.05, intval="maxo")
is 0.9999

Could someone explain why?
What is the role of the pval? How could I access it?

Christelle

@stanstrup
Contributor

The p-value comes from the regression done by Hmisc::rcorr.
I am not a statistician, so I won't explain but the documentation says "[...] P, the asymptotic P-values".
I think you can find an explanation here: https://statisticsbyjim.com/regression/interpret-coefficients-p-values-regression/

It is unclear whether your question was about something else...
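
For reference, here is a minimal sketch of how the p-value can be read out of the result of Hmisc::rcorr. The data are simulated, and the base R `cor.test` call is included only as a cross-check for the Pearson case. This assumes Hmisc is installed, and it is not necessarily how CAMERA calls rcorr internally.

```r
# Simulated intensities for two features across 12 samples (made-up data)
set.seed(1)
x <- rnorm(12)
y <- x + rnorm(12, sd = 0.5)

# Base R: Pearson correlation test, p-value only
p_base <- cor.test(x, y)$p.value

# Hmisc::rcorr returns a list with matrices r (correlations),
# n (number of pairs used) and P (p-values); the off-diagonal
# entry of P is the p-value for the x/y pair.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  res <- Hmisc::rcorr(x, y, type = "pearson")
  p_hmisc <- res$P[1, 2]
}
```

The `n` matrix is worth checking alongside `P`, since it shows how many sample pairs each correlation was actually computed from.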

@cbonnefoy
Author

Thanks for your comment Jan.

Now I can calculate the p_values by Hmisc::rcorr.

I understand that it tests the significance of the correlation, which depends on the number of pairs used. Because I have only 12 samples, even a correlation of 0.6 passes the test (its p-value is less than 0.05).
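
The dependence on the number of pairs can be checked in base R. This is a minimal sketch, assuming a Pearson correlation and the usual two-sided t-test with n - 2 degrees of freedom. The helper name `cor_p_value` is made up for illustration and is not part of CAMERA or Hmisc.

```r
# Two-sided p-value for a Pearson correlation r observed on n pairs,
# using t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom.
# cor_p_value is a hypothetical helper, not a CAMERA or Hmisc function.
cor_p_value <- function(r, n) {
  t_stat <- r * sqrt((n - 2) / (1 - r^2))
  2 * pt(-abs(t_stat), df = n - 2)
}

p_12 <- cor_p_value(0.6, 12)  # 12 pairs: below 0.05
p_8  <- cor_p_value(0.6, 8)   # same r on 8 pairs: above 0.05
```

With few samples the p-value filter is weak, so in practice the correlation threshold does most of the filtering.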

What I don't understand is why two features that are in the same pseudospectrum before groupCorr are split into two different ones afterwards, given that the correlation is > 0.90 and the p-value is far below 0.05.

@stanstrup
Contributor

stanstrup commented Mar 16, 2020

Sounds strange. Are they the only features in the group?
You could try graphMethod="lpc" to see whether it is the clustering algorithm that does something strange.
I think example data is probably needed to investigate this further.

@cbonnefoy
Author

Yes, when they split they are often the only feature in the group.

I only tested graphMethod on one psg_list. Both methods result in splitting, even if the splitting is slightly different.

I noticed that the pc-group numbers after splitting are very close; they differ by only one.
It seems to me that the features are rejected from the originating group, but I don't know how.

Here is an example for 3 molecules. Have a look at carbamazepine

Thanks

FWHM_Corr_1_12_Carbamazepine_Sulfamethoxazole_Ketoprofene.xlsx

@stanstrup
Contributor

Which intensity values did you use for that analysis? You used maxo for groupFWHM but default for groupCorr is into.
I am wondering if there are some NA values in play. Did you fill peaks?
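
To illustrate why NA values matter here: when a correlation is computed only on samples where both features have a value, the effective number of pairs can be much smaller than the number of samples. This is a toy sketch in base R with invented intensities. It does not reproduce how CAMERA treats NA values internally.

```r
# Toy intensity matrix: 12 samples (rows) x 2 features (columns).
# The values are invented for illustration only.
int <- cbind(
  f1 = c(10, 12, 15, 20, 22, 25, 30, 33, 35, 40, 42, 45),
  f2 = c(11, NA, 16, NA, 23, NA, 31, NA, 36, NA, 43, NA)
)

# Number of samples in which both features have a value
n_pairs <- sum(complete.cases(int))

# Correlation computed only on those complete pairs
r <- cor(int[, "f1"], int[, "f2"], use = "pairwise.complete.obs")
```

Here only half of the 12 samples contribute to the correlation, so the coefficient and its p-value rest on far fewer points than the sample count suggests.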

@cbonnefoy
Author

I used maxo for groupCorr too.
I filled peaks, but for some features there are still many NAs.

I agree that part of my problem may come from that, but I think there is another reason, because I have examples where no peak is missing.
