
Standardize the sampler-constructor interface #16

Closed
brandonwillard opened this issue Mar 26, 2022 · 5 comments
Assignee: rlouf
Labels: enhancement (New feature or request), help wanted (Extra attention is needed), important, refactoring (A change that improves the codebase but doesn't necessarily introduce a new feature)

@brandonwillard (Member)

Currently, this library provides a few functions that construct samplers for very specific models given the inputs/parameters of said models. We should standardize these constructor functions under a function interface that takes model graphs as inputs (i.e. graphs that represent the models supported by the constructor functions).

This is a first step toward the more general case of #3.

Since our current constructor functions apply to very specific types of models, we can keep things simple: employ some very basic/restricted unification/pattern matching and generalize from there.

For example:

import numpy as np
import aesara
import aesara.tensor as at


srng = at.random.RandomStream(seed=2309)

X = at.as_tensor(np.random.normal(size=(100, 10)))

# Horseshoe sub-graph
tau_rv = srng.halfcauchy(size=1)
lambda_rv = srng.halfcauchy(size=10)
beta_rv = srng.normal(0, lambda_rv * tau_rv, size=10)

# The "observation" model
eta = X @ beta_rv
p = at.sigmoid(-eta)
Y_rv = srng.nbinom(10, p)


# `obs` is a vector of observed counts for `Y_rv` and `n_samples` the number
# of posterior samples to draw
gibbs_sampler = horseshoe_nbinom_gibbs(Y_rv, obs, n_samples)

where horseshoe_nbinom_gibbs would extract the X, beta_rv, and any other necessary components and subsequently perform the same actions as aemcmc.negative_binomial_horseshoe_gibbs.

@brandonwillard added the enhancement, help wanted, important, and refactoring labels on Mar 26, 2022
@brandonwillard changed the title from "Standardize the sampler-constructor interfaces" to "Standardize the sampler-constructor interface" on Mar 26, 2022
@brandonwillard (Member, Author) commented Mar 26, 2022

Here's an example of how the unification/matching could work for the negative-binomial regression example above:

import aesara.tensor as at

from etuples import etuple, etuplize
from unification import unify, var
from aesara.tensor.math import Dot
from aesara.graph.opt_utils import optimize_graph
from aesara.graph.unify import eval_if_etuple


X_lv = var()
b_lv = var()
h_lv = var()
neg_one_lv = var()

sigmoid_dot_pattern = etuple(
    etuplize(at.sigmoid),
    etuple(etuplize(at.mul), neg_one_lv, etuple(etuple(Dot), X_lv, b_lv)),
)
nbinom_sig_dot_pattern = etuple(
    etuplize(at.random.nbinom), var(), var(), var(), h_lv, sigmoid_dot_pattern
)

zero_lv = var()
halfcauchy_1_lv, halfcauchy_2_lv = var(), var()
horseshoe_pattern = etuple(etuplize(at.random.normal), var(), var(), var(),
                           zero_lv, etuple(etuplize(at.mul), halfcauchy_1_lv, halfcauchy_2_lv))


def extract_horseshoe_components(graph):
    """Extract the local and global shrinkage terms from a Horseshoe prior graph."""
    s = unify(graph, horseshoe_pattern)

    if s is False:
        raise ValueError("Input graph does not match")

    halfcauchy_1 = eval_if_etuple(s[halfcauchy_1_lv])

    if halfcauchy_1.owner is None or not isinstance(halfcauchy_1.owner.op, type(at.random.halfcauchy)):
        raise ValueError("Input graph does not match")

    halfcauchy_2 = eval_if_etuple(s[halfcauchy_2_lv])

    if halfcauchy_2.owner is None or not isinstance(halfcauchy_2.owner.op, type(at.random.halfcauchy)):
        raise ValueError("Input graph does not match")

    if halfcauchy_1.type.shape == (1,):
        lambda_rv = halfcauchy_2
        tau_rv = halfcauchy_1
    elif halfcauchy_2.type.shape == (1,):
        lambda_rv = halfcauchy_1
        tau_rv = halfcauchy_2
    else:
        raise ValueError("Input graph does not match")

    return (lambda_rv, tau_rv)


def horseshoe_nbinom_gibbs(Y_rv, obs, n_samples):
    # Canonicalize the input graph first, so that we unify against a consistent
    # form of a negative-binomial regression (with a Horseshoe prior) graph
    Y_rv = optimize_graph(Y_rv)

    Y_et = etuplize(Y_rv)

    s = unify(Y_et, nbinom_sig_dot_pattern)

    if s is False:
        raise ValueError("Input graph does not match")

    # Make sure the matched `mul` factor really is the constant -1 (i.e. a negation)
    if (s[neg_one_lv].data != -1).any():
        raise ValueError("Input graph does not match")

    # Evaluate these when they're `ExpressionTuple`s so that we have Aesara graphs
    # with which to work
    h = eval_if_etuple(s[h_lv])
    beta_rv = eval_if_etuple(s[b_lv])
    X = eval_if_etuple(s[X_lv])

    lambda_rv, tau_rv = extract_horseshoe_components(beta_rv)

    # Construct the sampler given the inputs...
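
(As a quick check of the matching above, the helper can be exercised directly on the Horseshoe sub-graph from the issue description; this is only a sketch, and it assumes beta_rv and the pattern-matching code above are both in scope:)

# Canonicalize the Horseshoe sub-graph before matching, as in `horseshoe_nbinom_gibbs`
beta_opt_rv = optimize_graph(beta_rv)

lambda_match, tau_match = extract_horseshoe_components(beta_opt_rv)

# The global shrinkage term should come back as the scalar-shaped half-Cauchy variate
assert tau_match.type.shape == (1,)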

@rlouf self-assigned this on Mar 26, 2022
@rlouf (Member) commented Mar 28, 2022

I cannot make the unification work in your example, I think because of the at.mul operator. If I etuplize the part of the graph corresponding to the horseshoe prior in your example, I get:

import aesara.tensor as at
from aesara.graph.opt_utils import optimize_graph

from etuples import etuplize
import numpy as np

# Build the horseshoe prior graph
srng = at.random.RandomStream(seed=2309)
X = at.as_tensor(np.random.normal(size=(100, 10)))
tau_rv = srng.halfcauchy(size=1)
lambda_rv = srng.halfcauchy(size=10)
beta_rv = srng.normal(0, lambda_rv * tau_rv, size=10)

beta_opt_rv = optimize_graph(beta_rv)
beta_et = etuplize(beta_opt_rv)
print(beta_et)
# e(e(<class 'aesara.tensor.random.basic.NormalRV'>, normal, 0, (0, 0), floatX, False), RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FF73DB41660>), TensorConstant{(1,) of 10}, TensorConstant{11}, TensorConstant{0}, e(e(<class 'aesara.tensor.elemwise.Elemwise'>, mul, <frozendict {}>), e(e(<class 'aesara.tensor.random.basic.HalfCauchyRV'>, halfcauchy, 0, (0, 0), floatX, False), RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FF73DB2EE40>), TensorConstant{(1,) of 10}, TensorConstant{11}, TensorConstant{0.0}, TensorConstant{1.0}), e(e(<class 'aesara.tensor.random.basic.HalfCauchyRV'>, halfcauchy, 0, (0, 0), floatX, False), RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FF73DB2E9E0>), TensorConstant{(1,) of 1}, TensorConstant{11}, TensorConstant{0.0}, TensorConstant{1.0})))

The e(<class 'aesara.tensor.elemwise.Elemwise'>, mul, <frozendict {}>) term is what is of interest here. On the other hand:

etuplize(at.mul)
# <class 'aesara.tensor.elemwise.Elemwise'>, <aesara.scalar.basic.Mul object at 0x7ff740e2a3d0>, <frozendict {}>

so unify(beta_et, horseshoe_pattern), where horseshoe_pattern is defined as in your example, returns False.

Here is a self-contained example that reproduces the behaviour with at.mul:

from etuples import etuple
from unification import unify, var

a_tt = at.scalar()
b_tt = at.scalar()
prod_tt = a_tt * b_tt  # same with at.mul(a_tt, b_tt)

prod_pattern = etuple(etuplize(at.mul), var(), var())

unify(prod_tt, prod_pattern)
# False

@brandonwillard (Member, Author) commented Mar 28, 2022

Perhaps there's a version difference, because that example unifies in my environment (e.g. etuples at 0.3.3 and 0.3.5, unification at 0.4.4, aesara at 2.5.3):

import aesara.tensor as at

from etuples import etuple, etuplize
from unification import unify, var


a_tt = at.scalar()
b_tt = at.scalar()
prod_tt = a_tt * b_tt  # same with at.mul(a_tt, b_tt)

prod_pattern = etuple(etuplize(at.mul), var(), var())

unify(prod_tt, prod_pattern)
# {~_1: <TensorType(float64, ())>, ~_2: <TensorType(float64, ())>}

@brandonwillard (Member, Author)

Oh, make sure you're using the logical-unification package. The original unification package won't work.
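
(For anyone running into the same thing: the maintained package is distributed on PyPI as logical-unification but is imported as unification, so it's easy to end up with the wrong distribution installed. A small sketch, using only the standard library, for checking which one is actually in the environment:)

from importlib.metadata import PackageNotFoundError, version

try:
    # Succeeds only if the `logical-unification` distribution is installed
    print("logical-unification", version("logical-unification"))
except PackageNotFoundError:
    print("`logical-unification` is not installed; the `unification` module "
          "likely comes from the old `unification` distribution")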

rlouf added a commit that referenced this issue Apr 6, 2022
The samplers currently assume that there is an intercept term in the
regressions, and this term is treated separately from the other
regression parameters (Makalic & Schmidt 2016). This is limiting for two
reasons:

1. We don't always want to run regressions with an intercept parameter;
at the very least we'd like to control its prior value.
2. This complicates the work for #16.

As a result, in this commit we generalize the Gibbs samplers by removing
the separation between the intercept beta0 and the other regression parameters.
@rlouf (Member) commented Apr 7, 2022

Note that calling optimize_graph to obtain a canonical version of the graph is important, even in such a simple example. If we don't canonicalize the graph, only the following expression for sigmoid_dot_pattern will match the model above:

sigmoid_dot_pattern = etuple(
    etuplize(at.sigmoid),
    etuple(etuplize(at.neg), etuple(etuple(Dot), X_lv, b_lv))
)
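
(To make the difference concrete, here is a minimal sketch, assuming Y_rv from the issue description and nbinom_sig_dot_pattern from the pattern-matching snippet are in scope, that unifies the raw and the canonicalized graphs against the at.mul-based pattern:)

from aesara.graph.opt_utils import optimize_graph

# Without canonicalization, `sigmoid(-eta)` contains an `at.neg` node, so the
# `at.mul`-based pattern does not match
print(unify(etuplize(Y_rv), nbinom_sig_dot_pattern))
# False

# After canonicalization the negation is rewritten as a multiplication by -1,
# and the pattern matches
Y_opt_rv = optimize_graph(Y_rv)
print(unify(etuplize(Y_opt_rv), nbinom_sig_dot_pattern) is not False)
# True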

@rlouf closed this as completed on Apr 21, 2022