TPM microsats model transition matrix error #2237

GertjanBisschop · 2023-12-08T11:33:52Z

When specifying the following MicrosatMutationModel I get an "Each row of the transition matrix must be non-negative and sum to one" error.

lo = 50
hi = 500
m = 0.25
p = 0.5
model = msprime.TPM(p=p, m=m, lo=lo, hi=hi)

I just followed the instructions from the docs to generate a random model. If this is not a valid parameter combination then this should have been caught I think. @andrewkern any idea what is going on here?

The text was updated successfully, but these errors were encountered:

andrewkern · 2023-12-08T14:58:29Z

looks like that value of lo is revealing a bug. if I change that value everything works. it's weird because I can't confirm that the rows don't sum to 1, e.g.

>>> x = msprime.mutations.general_microsat_rate_matrix(s=0, u=0.5, v=0, p=p, m=m, lo=50, hi=500)
>>> np.sum(x, axis=1)
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1.])

my first blush reaction is that something wonky is happening with matrix_mutation_model class on the C side? not sure yet

andrewkern · 2023-12-08T18:53:11Z

a little further digging -- this looks like rounding error in the transition matrix construction

x = msprime.mutations.general_microsat_rate_matrix(s=0, u=0.5, v=0, p=p, m=m, lo=50, hi=500)
print(x[np.where(x<0)])

[-2.22044605e-16]

andrewkern · 2023-12-08T18:57:07Z

the negative element in the matrix is on the diagonal of the rate matrix, so the numerical issue is creeping in here most likely

(from msprime.mutations.general_microsat_rate_matrix)

    row_sums = rate_matrix.sum(axis=1, dtype="float64")
    np.fill_diagonal(rate_matrix, 1.0 - row_sums)

andrewkern · 2023-12-08T18:58:20Z

@petrelharp any recs on how best to beat this? set small diagonal values to zero?

petrelharp · 2023-12-12T00:02:23Z

Well, here's the problem, I think:

>>> 1 == (1 / 1.0000000000000001)
True

In the problematic code, the max row sum is alpha=1.0000000000000002 before normalizing; then dividing by this value does not change the max row sum (i.e., it's still 1.0000000000000002).

Doing as you suggest and changing that line to

    np.fill_diagonal(rate_matrix, np.fmax(0.0, 1.0 - row_sums))

fixes the error at least with this set of parameters. I'm not sure if there's a better way other than changing the code that throws the error to be more tolerant of a little slop?

I've got a PR, will throw it up.

jeromekelleher · 2023-12-12T09:10:26Z

Fix LGTM. Messiness like this near zeros is inevitable I think.

jeromekelleher added this to the 1.3.0 milestone Dec 8, 2023

petrelharp added a commit to petrelharp/msprime that referenced this issue Dec 12, 2023

avoid negative probs, closes tskit-dev#2237

4b9669f

petrelharp mentioned this issue Dec 12, 2023

avoid negative probs, closes #2237 #2238

Merged

mergify bot closed this as completed in cb3a469 Dec 12, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TPM microsats model transition matrix error #2237

TPM microsats model transition matrix error #2237

GertjanBisschop commented Dec 8, 2023

andrewkern commented Dec 8, 2023

andrewkern commented Dec 8, 2023

andrewkern commented Dec 8, 2023 •

edited by petrelharp

Loading

andrewkern commented Dec 8, 2023

petrelharp commented Dec 12, 2023

jeromekelleher commented Dec 12, 2023

TPM microsats model transition matrix error #2237

TPM microsats model transition matrix error #2237

Comments

GertjanBisschop commented Dec 8, 2023

andrewkern commented Dec 8, 2023

andrewkern commented Dec 8, 2023

andrewkern commented Dec 8, 2023 • edited by petrelharp Loading

andrewkern commented Dec 8, 2023

petrelharp commented Dec 12, 2023

jeromekelleher commented Dec 12, 2023

andrewkern commented Dec 8, 2023 •

edited by petrelharp

Loading