
Standardize the sampler-constructor interface #16

Closed
brandonwillard opened this issue Mar 26, 2022 · 5 comments
Assignee: rlouf
Labels: enhancement (New feature or request), help wanted (Extra attention is needed), important, refactoring (A change that improves the codebase but doesn't necessarily introduce a new feature)

@brandonwillard (Member)

Currently, this library provides a few functions that construct samplers for very specific models given the inputs/parameters of said models. We should standardize these constructor functions under a function interface that takes model graphs as inputs (i.e. graphs that represent the models supported by the constructor functions).

This is a first step toward the more general case of #3.

Since our current constructor functions apply to very specific types of models, we can keep things simple: employ some very basic/restricted unification/pattern matching and generalize from there.

For example:

import numpy as np
import aesara
import aesara.tensor as at


srng = at.random.RandomStream(seed=2309)

X = at.as_tensor(np.random.normal(size=(100, 10)))

# Horseshoe sub-graph
tau_rv = srng.halfcauchy(size=1)
lambda_rv = srng.halfcauchy(size=10)
beta_rv = srng.normal(0, lambda_rv * tau_rv, size=10)

# The "observation" model
eta = X @ beta_rv
p = at.sigmoid(-eta)
Y_rv = srng.nbinom(10, p)


# `obs` is a vector of observed counts for `Y_rv` and `n_samples` the number
# of posterior samples to draw
gibbs_sampler = horseshoe_nbinom_gibbs(Y_rv, obs, n_samples)

where horseshoe_nbinom_gibbs would extract the X, beta_rv, and any other necessary components and subsequently perform the same actions as aemcmc.negative_binomial_horseshoe_gibbs.

@brandonwillard added the enhancement, help wanted, important, and refactoring labels on Mar 26, 2022
@brandonwillard changed the title from "Standardize the sampler-constructor interfaces" to "Standardize the sampler-constructor interface" on Mar 26, 2022
@brandonwillard (Member, Author) commented Mar 26, 2022

Here's an example of how the unification/matching could work for the negative-binomial regression example above:

import aesara.tensor as at

from etuples import etuple, etuplize
from unification import unify, var
from aesara.tensor.math import Dot
from aesara.graph.opt_utils import optimize_graph
from aesara.graph.unify import eval_if_etuple


X_lv = var()
b_lv = var()
h_lv = var()
neg_one_lv = var()

sigmoid_dot_pattern = etuple(
    etuplize(at.sigmoid),
    etuple(etuplize(at.mul), neg_one_lv, etuple(etuple(Dot), X_lv, b_lv)),
)
nbinom_sig_dot_pattern = etuple(
    etuplize(at.random.nbinom), var(), var(), var(), h_lv, sigmoid_dot_pattern
)

zero_lv = var()
halfcauchy_1_lv, halfcauchy_2_lv = var(), var()
horseshoe_pattern = etuple(etuplize(at.random.normal), var(), var(), var(),
                           zero_lv, etuple(etuplize(at.mul), halfcauchy_1_lv, halfcauchy_2_lv))


def extract_horseshoe_components(graph):
    """Extract the local and global shrinkage terms from a Horseshoe prior graph."""
    s = unify(graph, horseshoe_pattern)

    if s is False:
        raise ValueError("Input graph does not match")

    halfcauchy_1 = eval_if_etuple(s[halfcauchy_1_lv])

    if halfcauchy_1.owner is None or not isinstance(halfcauchy_1.owner.op, type(at.random.halfcauchy)):
        raise ValueError("Input graph does not match")

    halfcauchy_2 = eval_if_etuple(s[halfcauchy_2_lv])

    if halfcauchy_2.owner is None or not isinstance(halfcauchy_2.owner.op, type(at.random.halfcauchy)):
        raise ValueError("Input graph does not match")

    if halfcauchy_1.type.shape == (1,):
        lambda_rv = halfcauchy_2
        tau_rv = halfcauchy_1
    elif halfcauchy_2.type.shape == (1,):
        lambda_rv = halfcauchy_1
        tau_rv = halfcauchy_2
    else:
        raise ValueError("Input graph does not match")

    return (lambda_rv, tau_rv)


def horseshoe_nbinom_gibbs(Y_rv, obs, n_samples):
    # Canonicalize the input graph first, so that we unify against a consistent
    # form of a negative-binomial regression (with a Horseshoe prior) graph
    Y_rv = optimize_graph(Y_rv)

    Y_et = etuplize(Y_rv)

    s = unify(Y_et, nbinom_sig_dot_pattern)

    if s is False:
        raise ValueError("Input graph does not match")

    # Make sure the matched `mul` factor really is the constant -1 (i.e. a negation)
    if (s[neg_one_lv].data != -1).any():
        raise ValueError("Input graph does not match")

    # Evaluate these when they're `ExpressionTuple`s so that we have Aesara graphs
    # with which to work
    h = eval_if_etuple(s[h_lv])
    beta_rv = eval_if_etuple(s[b_lv])
    X = eval_if_etuple(s[X_lv])

    lambda_rv, tau_rv = extract_horseshoe_components(beta_rv)

    # Construct the sampler given the inputs...
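
(As a quick check of the matching above, the helper can be exercised directly on the Horseshoe sub-graph from the issue description; this is only a sketch, and it assumes beta_rv and the pattern-matching code above are both in scope:)

# Canonicalize the Horseshoe sub-graph before matching, as in `horseshoe_nbinom_gibbs`
beta_opt_rv = optimize_graph(beta_rv)

lambda_match, tau_match = extract_horseshoe_components(beta_opt_rv)

# The global shrinkage term should come back as the scalar-shaped half-Cauchy variate
assert tau_match.type.shape == (1,)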

@rlouf self-assigned this on Mar 26, 2022
@rlouf (Member) commented Mar 28, 2022

I cannot make the unification work in your example, I think because of the at.mul operator. If I etuplize the part of the graph corresponding to the horseshoe prior in your example, I get:

import aesara.tensor as at
from aesara.graph.opt_utils import optimize_graph

from etuples import etuplize
import numpy as np

# Build the horseshoe prior graph
srng = at.random.RandomStream(seed=2309)
X = at.as_tensor(np.random.normal(size=(100, 10)))
tau_rv = srng.halfcauchy(size=1)
lambda_rv = srng.halfcauchy(size=10)
beta_rv = srng.normal(0, lambda_rv * tau_rv, size=10)

beta_opt_rv = optimize_graph(beta_rv)
beta_et = etuplize(beta_opt_rv)
print(beta_et)
# e(e(<class 'aesara.tensor.random.basic.NormalRV'>, normal, 0, (0, 0), floatX, False), RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FF73DB41660>), TensorConstant{(1,) of 10}, TensorConstant{11}, TensorConstant{0}, e(e(<class 'aesara.tensor.elemwise.Elemwise'>, mul, <frozendict {}>), e(e(<class 'aesara.tensor.random.basic.HalfCauchyRV'>, halfcauchy, 0, (0, 0), floatX, False), RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FF73DB2EE40>), TensorConstant{(1,) of 10}, TensorConstant{11}, TensorConstant{0.0}, TensorConstant{1.0}), e(e(<class 'aesara.tensor.random.basic.HalfCauchyRV'>, halfcauchy, 0, (0, 0), floatX, False), RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7FF73DB2E9E0>), TensorConstant{(1,) of 1}, TensorConstant{11}, TensorConstant{0.0}, TensorConstant{1.0})))

The e(<class 'aesara.tensor.elemwise.Elemwise'>, mul, <frozendict {}>) term is what is of interest here. On the other hand:

etuplize(at.mul)
# <class 'aesara.tensor.elemwise.Elemwise'>, <aesara.scalar.basic.Mul object at 0x7ff740e2a3d0>, <frozendict {}>

so unify(beta_et, horseshoe_pattern), where horseshoe_pattern is defined as in your example, returns False.

Here is a self-contained example that reproduces the behaviour with at.mul:

from etuples import etuple
from unification import unify, var

a_tt = at.scalar()
b_tt = at.scalar()
prod_tt = a_tt * b_tt  # same with at.mul(a_tt, b_tt)

prod_pattern = etuple(etuplize(at.mul), var(), var())

unify(prod_tt, prod_pattern)
# False

@brandonwillard (Member, Author) commented Mar 28, 2022

Perhaps there's a version difference, because that example unifies in my environment (e.g. etuples at 0.3.3 and 0.3.5, unification at 0.4.4, aesara at 2.5.3):

import aesara.tensor as at

from etuples import etuple, etuplize
from unification import unify, var


a_tt = at.scalar()
b_tt = at.scalar()
prod_tt = a_tt * b_tt  # same with at.mul(a_tt, b_tt)

prod_pattern = etuple(etuplize(at.mul), var(), var())

unify(prod_tt, prod_pattern)
# {~_1: <TensorType(float64, ())>, ~_2: <TensorType(float64, ())>}

@brandonwillard (Member, Author)

Oh, make sure you're using the logical-unification package. The original unification package won't work.
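
(For anyone running into the same thing: the maintained package is distributed on PyPI as logical-unification but is imported as unification, so it's easy to end up with the wrong distribution installed. A small sketch, using only the standard library, for checking which one is actually in the environment:)

from importlib.metadata import PackageNotFoundError, version

try:
    # Succeeds only if the `logical-unification` distribution is installed
    print("logical-unification", version("logical-unification"))
except PackageNotFoundError:
    print("`logical-unification` is not installed; the `unification` module "
          "likely comes from the old `unification` distribution")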

rlouf added a commit that referenced this issue Apr 6, 2022
The samplers currently assume that there is an intercept term in the
regressions, and this term is treated separately from the other
regression parameters (Makalic & Schmidt 2016). This is limiting for two
reasons:

1. We don't always want to run regressions with an intercept parameter;
at the very least we'd like to control its prior value.
2. This complicates the work for #16.

As a result, in this commit we generalize the Gibbs samplers by removing
the separation between the intercept beta0 and the other regression parameters.
@rlouf (Member) commented Apr 7, 2022

Note that calling optimize_graph to obtain a canonical version of the graph is important, even in such a simple example. If we don't canonicalize the graph, only the following expression for sigmoid_dot_pattern will match the model above:

sigmoid_dot_pattern = etuple(
    etuplize(at.sigmoid),
    etuple(etuplize(at.neg), etuple(etuple(Dot), X_lv, b_lv))
)
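
(To make the difference concrete, here is a minimal sketch, assuming Y_rv from the issue description and nbinom_sig_dot_pattern from the pattern-matching snippet are in scope, that unifies the raw and the canonicalized graphs against the at.mul-based pattern:)

from aesara.graph.opt_utils import optimize_graph

# Without canonicalization, `sigmoid(-eta)` contains an `at.neg` node, so the
# `at.mul`-based pattern does not match
print(unify(etuplize(Y_rv), nbinom_sig_dot_pattern))
# False

# After canonicalization the negation is rewritten as a multiplication by -1,
# and the pattern matches
Y_opt_rv = optimize_graph(Y_rv)
print(unify(etuplize(Y_opt_rv), nbinom_sig_dot_pattern) is not False)
# True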

@rlouf closed this as completed on Apr 21, 2022