The number of overlaps are more than the true set #12

Open
KopalliV opened this issue Nov 8, 2024 · 0 comments

KopalliV commented Nov 8, 2024

Hi, thank you very much for regioneR.
I have a question about counting overlaps.
I am trying to test the significance of the overlap between my variant set and a true set.
The true set has around 37k variants, but I get more than 44k observed overlaps, even when I pass count.once=TRUE.
This is the code I used:

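library(regioneR)

# Genome: one row per chromosome (tab-separated, no header)
gn <- read.csv("Chr.len", sep = "\t", header = FALSE)
names(gn) <- c("chr", "start", "end")
# True set (B)
bt <- read.csv("true_set.pos", sep = "\t", header = FALSE)
names(bt) <- c("chr", "start", "end")
# My variant set (A)
at <- read.csv("variants.pos", sep = "\t", header = FALSE)
names(at) <- c("chr", "start", "end")
# Permutation test on the number of overlaps between A and B
pt <- permTest(A = at, B = bt, randomize.function = randomizeRegions, evaluate.function = numOverlaps, ntimes = 100, genome = gn, count.once = TRUE)
plot(pt)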

'at' has 179656 variant positions, and 'bt', the true set, has 37555 variant positions.
pt[["numOverlaps"]][["observed"]]
[1] 43896
If my B has 37555 variants, then shouldn't my observed overlap count be <= 37555? Am I missing something basic here?
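
One possible explanation, sketched below on toy data: as far as I understand the regioneR documentation, count.once=TRUE in numOverlaps counts each region of A at most once when it overlaps one or more regions of B. If that reading is right, the observed value is bounded by the size of A (179656 here), not by the size of B. Several A regions hitting the same B region would each add one to the count, which could happen if B entries span more than one base or if positions repeat in A (that part is a guess about the input files). The data frames and the helper overlaps_any below are hypothetical and written in base R only to illustrate the counting; they are not regioneR's implementation.

```r
# Toy intervals on a single chromosome (hypothetical data, not from the issue).
A <- data.frame(start = c(1, 5, 9, 50), end = c(3, 7, 11, 52))  # 4 query regions
B <- data.frame(start = 2, end = 10)                            # 1 "true set" region

# Hypothetical helper: for each region of `a`, TRUE if it overlaps at least
# one region of `b` (closed intervals, same chromosome assumed).
overlaps_any <- function(a, b) {
  vapply(seq_len(nrow(a)), function(i) {
    any(a$start[i] <= b$end & a$end[i] >= b$start)
  }, logical(1))
}

n_A_hit <- sum(overlaps_any(A, B))  # A regions that hit B, each counted once
n_B_hit <- sum(overlaps_any(B, A))  # B regions that hit A, each counted once

print(n_A_hit)  # 3, which is larger than nrow(B) == 1
print(n_B_hit)  # 1, which can never exceed nrow(B)
```

If the quantity of interest is "how many true-set variants are recovered", swapping the arguments, for example numOverlaps(A = bt, B = at, count.once = TRUE), might give a count bounded by 37555; the same swap would then be needed in the permTest call, and whether that matches the intended null model is a separate question.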
