State Space Model (Base class)

class SSM[source]#

A base class for state space models. Such models consist of parameters, which we may learn, as well as hyperparameters, which specify static properties of the model. This base class allows parameters to be indicated in a standardized way so that they can easily be converted to/from unconstrained form for optimization.

Abstract Methods

Models that inherit from SSM must implement a few key functions and properties:

  • initial_distribution() returns the distribution over the initial state given parameters

  • transition_distribution() returns the conditional distribution over the next state given the current state and parameters

  • emission_distribution() returns the conditional distribution over the emission given the current state and parameters

  • log_prior() (optional) returns the log prior probability of the parameters

  • emission_shape returns a tuple specification of the emission shape

  • inputs_shape returns a tuple specification of the input shape, or None if there are no inputs.

The shape properties are required for properly handling batches of data.

Sampling and Computing Log Probabilities

Once these have been implemented, subclasses will inherit the ability to sample and compute log joint probabilities from the base class functions:

  • sample() draws samples of the states and emissions for given parameters

  • log_prob() computes the log joint probability of the states and emissions for given parameters
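As a rough illustration of what sample() and log_prob() compute, the sketch below performs ancestral sampling and evaluates the log joint for a small discrete-state model in plain Python. The names (draw, sample_chain, log_joint, pi0, A, B) and the numbers are assumptions made for this example and are not part of the library's API; the actual methods operate on a ParameterSet and JAX arrays.

```python
import math
import random

# Hypothetical 2-state model with 3 possible emission symbols.
pi0 = [0.6, 0.4]                         # initial distribution p(z_1)
A = [[0.9, 0.1], [0.2, 0.8]]             # A[i][j] = p(z_{t+1}=j | z_t=i)
B = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]   # B[k][c] = p(y_t=c | z_t=k)


def draw(probs, rng):
    """Draw an index from a discrete distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_chain(num_timesteps, rng):
    """Ancestral sampling: draw z_1, then alternate emissions and transitions."""
    states, emissions = [], []
    z = draw(pi0, rng)
    for _ in range(num_timesteps):
        states.append(z)
        emissions.append(draw(B[z], rng))
        z = draw(A[z], rng)
    return states, emissions


def log_joint(states, emissions):
    """log p(z_{1:T}, y_{1:T}): initial term + transition terms + emission terms."""
    lp = math.log(pi0[states[0]])
    for z_prev, z_next in zip(states[:-1], states[1:]):
        lp += math.log(A[z_prev][z_next])
    for z, y in zip(states, emissions):
        lp += math.log(B[z][y])
    return lp


states, emissions = sample_chain(5, random.Random(0))
print(states, emissions, log_joint(states, emissions))
```

The log joint factorizes exactly as the three abstract distributions suggest, which is why implementing them is enough for the base class to provide both operations.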

Inference

Many subclasses of SSMs expose basic functions for performing state inference.

  • marginal_log_prob() computes the marginal log probability of the emissions, summing over latent states

  • filter() computes the filtered posteriors

  • smoother() computes the smoothed posteriors

Learning

Likewise, many SSMs will support learning with expectation-maximization (EM) or stochastic gradient descent (SGD).

For expectation-maximization, subclasses must implement the E- and M-steps.

  • e_step() computes the expected sufficient statistics for a sequence of emissions, given parameters

  • m_step() finds new parameters that maximize the expected log joint probability

Once these are implemented, the generic SSM class can fit the model with EM:

  • fit_em() runs EM to find parameters that maximize the likelihood (or posterior probability).

For SGD, any subclass that implements marginal_log_prob() inherits the base class fitting function:

  • fit_sgd() runs SGD to minimize the negative marginal log probability.

abstract initial_distribution(params, inputs)[source]#

Return an initial distribution over latent states.

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • inputs (Float[Array, 'input_dim'] | None) – optional inputs \(u_t\)

Returns:

distribution over initial latent state, \(p(z_1 \mid \theta)\).

Return type:

Distribution

abstract transition_distribution(params, state, inputs)[source]#

Return a distribution over next latent state given current state.

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • state (Float[Array, 'state_dim']) – current latent state \(z_t\)

  • inputs (Float[Array, 'input_dim'] | None) – current inputs \(u_t\)

Returns:

conditional distribution of next latent state \(p(z_{t+1} \mid z_t, u_t, \theta)\).

Return type:

Distribution

abstract emission_distribution(params, state, inputs=None)[source]#

Return a distribution over emissions given current state.

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • state (Float[Array, 'state_dim']) – current latent state \(z_t\)

  • inputs (Float[Array, 'input_dim'] | None) – current inputs \(u_t\)

Returns:

conditional distribution of current emission \(p(y_t \mid z_t, u_t, \theta)\)

Return type:

Distribution

log_prior(params)[source]#

Return the log prior probability of any model parameters.

Returns:

log prior probability.

Return type:

lp (Scalar)

Parameters:

params (ParameterSet)

abstract property emission_shape: Tuple[int]#

Return a pytree matching the pytree of tuples specifying the shape of a single time step’s emissions.

For example, a GaussianHMM with \(D\) dimensional emissions would return (D,).

property inputs_shape: Tuple[int] | None#

Return a pytree matching the pytree of tuples specifying the shape of a single time step’s inputs.

sample(params, key, num_timesteps, inputs=None)[source]#

Sample states \(z_{1:T}\) and emissions \(y_{1:T}\) given parameters \(\theta\) and (optionally) inputs \(u_{1:T}\).

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • key (Array) – random number generator

  • num_timesteps (int) – number of timesteps \(T\)

  • inputs (Float[Array, 'num_timesteps input_dim'] | None) – inputs \(u_{1:T}\)

Returns:

latent states and emissions

Return type:

Tuple[Float[Array, ‘num_timesteps state_dim’], Float[Array, ‘num_timesteps emission_dim’]]

log_prob(params, states, emissions, inputs=None)[source]#

Compute the log joint probability of the states and observations

Parameters:
  • params (ParameterSet)

  • states (Float[Array, 'num_timesteps state_dim'])

  • emissions (Float[Array, 'num_timesteps emission_dim'])

  • inputs (Float[Array, 'num_timesteps input_dim'] | None)

Return type:

float | Float[Array, ‘’]

marginal_log_prob(params, emissions, inputs=None)[source]#

Compute log marginal likelihood of observations, \(\log \sum_{z_{1:T}} p(y_{1:T}, z_{1:T} \mid \theta)\).

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • emissions (Float[Array, 'ntime emission_dim']) – emissions \(y_{1:T}\)

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional inputs \(u_{1:T}\)

Returns:

marginal log probability

Return type:

float | Float[Array, ‘’]
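For discrete latent states, the sum over \(z_{1:T}\) can be computed in time linear in \(T\) with the forward algorithm. The sketch below is a plain-Python illustration under assumed toy parameters (pi0, A, B are made up for this example); it is not the library's implementation. A brute-force sum over all state sequences is included only to check the result.

```python
import math
from itertools import product

# Hypothetical 2-state model with 3 possible emission symbols.
pi0 = [0.6, 0.4]
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]


def forward_marginal_log_prob(emissions):
    """Forward algorithm, normalizing at each step for numerical stability."""
    K = len(pi0)
    log_norm = 0.0
    pred = pi0[:]                              # p(z_t | y_{1:t-1})
    for y in emissions:
        alpha = [pred[k] * B[k][y] for k in range(K)]
        norm = sum(alpha)                      # p(y_t | y_{1:t-1})
        log_norm += math.log(norm)
        filt = [a / norm for a in alpha]       # p(z_t | y_{1:t})
        pred = [sum(filt[i] * A[i][j] for i in range(K)) for j in range(K)]
    return log_norm


def brute_force_marginal_log_prob(emissions):
    """Sum the joint over every state sequence; only feasible for tiny T."""
    K, T = len(pi0), len(emissions)
    total = 0.0
    for zs in product(range(K), repeat=T):
        p = pi0[zs[0]] * B[zs[0]][emissions[0]]
        for t in range(1, T):
            p *= A[zs[t - 1]][zs[t]] * B[zs[t]][emissions[t]]
        total += p
    return math.log(total)


ys = [0, 2, 1, 2]
print(forward_marginal_log_prob(ys), brute_force_marginal_log_prob(ys))
```

The two values should agree up to floating-point error; the forward pass costs \(O(TK^2)\) while the brute-force sum costs \(O(K^T)\).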

filter(params, emissions, inputs=None)[source]#

Compute filtering distributions, \(p(z_t \mid y_{1:t}, u_{1:t}, \theta)\) for \(t=1,\ldots,T\).

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • emissions (Float[Array, 'ntime emission_dim']) – emissions \(y_{1:T}\)

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional inputs \(u_{1:T}\)

Returns:

filtering distributions

Return type:

Posterior

smoother(params, emissions, inputs=None)[source]#

Compute smoothing distributions, \(p(z_t \mid y_{1:T}, u_{1:T}, \theta)\) for \(t=1,\ldots,T\).

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • emissions (Float[Array, 'ntime emission_dim']) – emissions \(y_{1:T}\)

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional inputs \(u_{1:T}\)

Returns:

smoothing distributions

Return type:

Posterior
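To make the difference between filtering and smoothing concrete, here is a minimal forward-backward sketch for a discrete-state model in plain Python. The parameters and the function name forward_backward are assumptions for this example; the library returns a Posterior object rather than nested lists.

```python
# Hypothetical 2-state model with 3 possible emission symbols.
pi0 = [0.6, 0.4]
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]


def forward_backward(emissions):
    """Return filtered p(z_t | y_{1:t}) and smoothed p(z_t | y_{1:T}) for all t."""
    K, T = len(pi0), len(emissions)

    # Forward pass: condition on emissions up to and including time t.
    filtered, pred = [], pi0[:]
    for y in emissions:
        alpha = [pred[k] * B[k][y] for k in range(K)]
        norm = sum(alpha)
        filt = [a / norm for a in alpha]
        filtered.append(filt)
        pred = [sum(filt[i] * A[i][j] for i in range(K)) for j in range(K)]

    # Backward pass: beta[k] is proportional to p(y_{t+1:T} | z_t = k).
    smoothed = [None] * T
    beta = [1.0] * K
    for t in range(T - 1, -1, -1):
        post = [filtered[t][k] * beta[k] for k in range(K)]
        norm = sum(post)
        smoothed[t] = [p / norm for p in post]
        y = emissions[t]
        beta = [sum(A[i][j] * B[j][y] * beta[j] for j in range(K)) for i in range(K)]
        scale = sum(beta)
        beta = [b / scale for b in beta]       # rescale to avoid underflow
    return filtered, smoothed


filtered, smoothed = forward_backward([0, 2, 1, 2])
print(filtered[0], smoothed[0])
```

At the final time step the two distributions coincide, since there are no future emissions left to condition on; at earlier steps the smoothed posterior also uses information from later emissions.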

e_step(params, emissions, inputs=None)[source]#

Perform an E-step to compute expected sufficient statistics under the posterior, \(p(z_{1:T} \mid y_{1:T}, u_{1:T}, \theta)\).

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • emissions (Float[Array, 'num_timesteps emission_dim']) – emissions \(y_{1:T}\)

  • inputs (Float[Array, 'num_timesteps input_dim'] | None) – optional inputs \(u_{1:T}\)

Returns:

tuple of the expected sufficient statistics under the posterior and the marginal log probability of the emissions.

Return type:

Tuple[SuffStatsSSM, float | Float[Array, ‘’]]

m_step(params, props, batch_stats, m_step_state)[source]#

Perform an M-step to find parameters that maximize the expected log joint probability.

Specifically, compute

\[\theta^\star = \mathrm{argmax}_\theta \; \mathbb{E}_{p(z_{1:T} \mid y_{1:T}, u_{1:T}, \theta)} \big[\log p(y_{1:T}, z_{1:T}, \theta \mid u_{1:T}) \big]\]

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • props (PropertySet) – properties specifying which parameters should be learned

  • batch_stats (SuffStatsSSM) – sufficient statistics from each sequence

  • m_step_state (Any) – any required state for optimizing the model parameters.

Returns:

new parameters

Return type:

ParameterSet

fit_em(params, props, emissions, inputs=None, num_iters=50, verbose=True)[source]#

Compute parameter MLE/MAP estimate using Expectation-Maximization (EM).

EM aims to find parameters that maximize the marginal log probability,

\[\theta^\star = \mathrm{argmax}_\theta \; \log p(y_{1:T}, \theta \mid u_{1:T})\]

It does so by iteratively forming a lower bound (the “E-step”) and then maximizing it (the “M-step”).
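The alternation between the two steps can be sketched for a small HMM with categorical emissions. This is a plain-Python illustration of maximum-likelihood EM (no prior), with hypothetical function names (toy_e_step, toy_m_step, toy_fit_em) and made-up data; it is not the library's implementation.

```python
import math


def toy_e_step(pi0, A, B, ys):
    """Return expected sufficient statistics and the marginal log likelihood."""
    K, T = len(pi0), len(ys)
    # Scaled forward pass: alphas[t][k] = p(z_t = k | y_{1:t}).
    alphas, log_lik, pred = [], 0.0, pi0
    for y in ys:
        a = [pred[k] * B[k][y] for k in range(K)]
        c = sum(a)
        log_lik += math.log(c)
        a = [v / c for v in a]
        alphas.append(a)
        pred = [sum(a[i] * A[i][j] for i in range(K)) for j in range(K)]
    # Backward pass, accumulating expected counts as we go.
    init = None
    trans = [[0.0] * K for _ in range(K)]
    emis = [[0.0] * len(B[0]) for _ in range(K)]
    beta = [1.0] * K
    for t in range(T - 1, -1, -1):
        g = [alphas[t][k] * beta[k] for k in range(K)]
        s = sum(g)
        gamma = [v / s for v in g]                 # p(z_t | y_{1:T})
        for k in range(K):
            emis[k][ys[t]] += gamma[k]
        if t == 0:
            init = gamma
        else:
            # xi[i][j] is proportional to p(z_{t-1} = i, z_t = j | y_{1:T}).
            xi = [[alphas[t - 1][i] * A[i][j] * B[j][ys[t]] * beta[j]
                   for j in range(K)] for i in range(K)]
            z = sum(map(sum, xi))
            for i in range(K):
                for j in range(K):
                    trans[i][j] += xi[i][j] / z
            nb = [sum(A[i][j] * B[j][ys[t]] * beta[j] for j in range(K))
                  for i in range(K)]
            m = sum(nb)
            beta = [v / m for v in nb]
    return (init, trans, emis), log_lik


def normalize(row):
    s = sum(row)
    return [v / s for v in row]


def toy_m_step(stats):
    """Closed-form maximizer: normalize the expected counts."""
    init, trans, emis = stats
    return normalize(init), [normalize(r) for r in trans], [normalize(r) for r in emis]


def toy_fit_em(pi0, A, B, ys, num_iters=15):
    log_liks = []
    for _ in range(num_iters):
        stats, ll = toy_e_step(pi0, A, B, ys)
        log_liks.append(ll)
        pi0, A, B = toy_m_step(stats)
    return (pi0, A, B), log_liks


ys = [0, 0, 1, 0, 2, 2, 1, 2, 2, 0, 0, 0, 1, 2, 2, 2]
init_params = ([0.5, 0.5], [[0.6, 0.4], [0.3, 0.7]],
               [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]])
params, log_liks = toy_fit_em(*init_params, ys)
print(log_liks[0], log_liks[-1])
```

Because each M-step maximizes the bound formed in the preceding E-step, the recorded log likelihoods should never decrease from one iteration to the next, which is a useful sanity check on any EM implementation.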

Note: emissions and inputs can either be single sequences or batches of sequences.

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • props (PropertySet) – properties specifying which parameters should be learned

  • emissions (Float[Array, 'num_timesteps emission_dim'] | Float[Array, 'num_batches num_timesteps emission_dim']) – one or more sequences of emissions

  • inputs (Float[Array, 'num_timesteps input_dim'] | Float[Array, 'num_batches num_timesteps input_dim'] | None) – one or more sequences of corresponding inputs

  • num_iters (int) – number of iterations of EM to run

  • verbose (bool) – whether or not to show a progress bar

Returns:

tuple of new parameters and log likelihoods over the course of EM iterations.

Return type:

Tuple[ParameterSet, Float[Array, ‘num_iters’]]

fit_sgd(params, props, emissions, inputs=None, optimizer=(<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>), batch_size=1, num_epochs=50, shuffle=False, key=Array([0, 0], dtype=uint32))[source]#

Compute parameter MLE/MAP estimate using Stochastic Gradient Descent (SGD).

SGD aims to find parameters that maximize the marginal log probability,

\[\theta^\star = \mathrm{argmax}_\theta \; \log p(y_{1:T}, \theta \mid u_{1:T})\]

by minimizing the _negative_ of that quantity.

Note: emissions and inputs can either be single sequences or batches of sequences.

On each iteration, the algorithm grabs a minibatch of sequences and takes a gradient step. One pass through the entire set of sequences is called an epoch.
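The minibatch and epoch structure described above can be sketched with a deliberately simple stand-in model: a single Bernoulli emission probability, optimized in unconstrained (logit) form with plain gradient steps. Everything here (toy_fit_sgd, the learning rate, the data) is an assumption for illustration; the library instead differentiates the marginal log probability and uses an optax optimizer.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def neg_log_prob_and_grad(logit, batch):
    """Negative log probability per emission, and its gradient w.r.t. the logit."""
    p = sigmoid(logit)
    n = sum(len(seq) for seq in batch)
    loss = -sum(math.log(p) if y else math.log(1.0 - p)
                for seq in batch for y in seq) / n
    grad = sum(p - y for seq in batch for y in seq) / n
    return loss, grad


def toy_fit_sgd(logit, sequences, batch_size=2, num_epochs=200,
                lr=0.5, shuffle=True, seed=0):
    rng = random.Random(seed)
    losses = []
    order = list(range(len(sequences)))
    for _ in range(num_epochs):                 # one epoch = one pass over all sequences
        if shuffle:
            rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = [sequences[i] for i in order[start:start + batch_size]]
            loss, grad = neg_log_prob_and_grad(logit, batch)
            logit -= lr * grad                  # gradient step on the minibatch loss
            losses.append(loss)
    return logit, losses


sequences = [[1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 1], [1, 1, 1, 0]]
logit, losses = toy_fit_sgd(0.0, sequences)
print(sigmoid(logit), len(losses))
```

With four sequences and a batch size of two, each epoch contributes two gradient steps, so the loss history has one entry per minibatch rather than one per epoch.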

Parameters:
  • params (ParameterSet) – model parameters \(\theta\)

  • props (PropertySet) – properties specifying which parameters should be learned

  • emissions (Float[Array, 'num_timesteps emission_dim'] | Float[Array, 'num_batches num_timesteps emission_dim']) – one or more sequences of emissions

  • inputs (Float[Array, 'num_timesteps input_dim'] | Float[Array, 'num_batches num_timesteps input_dim'] | None) – one or more sequences of corresponding inputs

  • optimizer (GradientTransformation) – an optax optimizer for minimization

  • batch_size (int) – number of sequences per minibatch

  • num_epochs (int) – number of epochs of SGD to run

  • key (Array) – a random number generator for selecting minibatches

  • verbose – whether or not to show a progress bar

  • shuffle (bool)

Returns:

tuple of new parameters and losses (negative scaled marginal log probs) over the course of SGD iterations.

Return type:

Tuple[ParameterSet, Float[Array, ‘niter’]]

Parameters#

Parameters and their associated properties are stored as jax.DeviceArray and dynamax.parameters.ParameterProperties, respectively. They are bundled together into a dynamax.parameters.ParameterSet and a dynamax.parameters.PropertySet, which are simply aliases for immutable data structures (in our case, NamedTuple).

class ParameterSet(*args, **kwargs)[source]#

A NamedTuple with parameters stored as jax.DeviceArray in the leaf nodes.

class PropertySet(*args, **kwargs)[source]#

A matching NamedTuple with ParameterProperties stored in the leaf nodes.

class ParameterProperties(trainable=True, constrainer=None)[source]#

A PyTree containing parameter metadata (properties).

Note: the properties are stored in the aux_data of this PyTree so that changes will trigger recompilation of functions that rely on them.

Parameters:
  • trainable (bool) – flag specifying whether or not this parameter should be fit (i.e., whether it is adjustable).

  • constrainer (Optional tfb.Bijector) – bijector mapping to constrained form.
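The role of a constrainer can be illustrated with a scalar example: a positive parameter is mapped to the whole real line, updated there, and mapped back. The sketch below uses softplus purely as an assumed example of such a bijection, written in plain Python; it does not use the tfb.Bijector interface, and the gradient value is made up.

```python
import math


def to_constrained(x):
    """Softplus: maps any real number to a positive value."""
    return math.log1p(math.exp(x))


def to_unconstrained(y):
    """Inverse softplus; valid for y > 0."""
    return math.log(math.expm1(y))


rate = 2.5                       # a positive parameter, e.g. an emission rate
u = to_unconstrained(rate)       # optimize over u on the whole real line
u -= 0.1 * 3.0                   # one gradient step with a made-up gradient of 3.0
print(to_constrained(u))         # the updated parameter is still positive
```

Optimizing in the unconstrained space means the optimizer never has to know about the constraint; any real-valued update maps back to a valid parameter.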

Hidden Markov Model#

Abstract classes#

class HMM(num_states, initial_component, transition_component, emission_component)[source]#

Bases: SSM

Abstract base class of Hidden Markov Models (HMMs).

The model is defined as follows

\[z_1 \mid u_1 \sim \mathrm{Cat}(\pi_0(u_1, \theta_{\mathsf{init}}))\]
\[z_t \mid z_{t-1}, u_t, \theta \sim \mathrm{Cat}(\pi(z_{t-1}, u_t, \theta_{\mathsf{trans}}))\]
\[y_t \mid z_t, u_t, \theta \sim p(y_t \mid z_t, u_t, \theta_{\mathsf{emis}})\]

where \(z_t \in \{1,\ldots,K\}\) is a discrete latent state. There are parameters for the initial distribution, the transition distribution, and the emission distribution:

\[\theta = (\theta_{\mathsf{init}}, \theta_{\mathsf{trans}}, \theta_{\mathsf{emis}})\]

For “standard” models, we will assume the initial distribution is fixed and the transitions follow a simple transition matrix,

\[z_1 \mid u_1 \sim \mathrm{Cat}(\pi_0)\]
\[z_t \mid z_{t-1}=k \sim \mathrm{Cat}(\pi_k)\]

where \(\theta_{\mathsf{init}} = \pi_0\) and \(\theta_{\mathsf{trans}} = \{\pi_k\}_{k=1}^K\).

The parameters are stored in a HMMParameterSet object.

We have implemented many subclasses of HMM for various emission distributions.

Parameters:
  • num_states (int) – number of discrete states

  • initial_component (HMMInitialState) – object encapsulating the initial distribution

  • transition_component (HMMTransitions) – object encapsulating the transition distribution

  • emission_component (HMMEmissions) – object encapsulating the emission distribution

property emission_shape#

Return a pytree matching the pytree of tuples specifying the shape of a single time step’s emissions.

For example, a GaussianHMM with \(D\) dimensional emissions would return (D,).

initial_distribution(params, inputs=None)[source]#

Return an initial distribution over latent states.

Parameters:
  • params – model parameters \(\theta\)

  • inputs – optional inputs \(u_t\)

Returns:

distribution over initial latent state, \(p(z_1 \mid \theta)\).

transition_distribution(params, state, inputs=None)[source]#

Return a distribution over next latent state given current state.

Parameters:
  • params – model parameters \(\theta\)

  • state – current latent state \(z_t\)

  • inputs – current inputs \(u_t\)

Returns:

conditional distribution of next latent state \(p(z_{t+1} \mid z_t, u_t, \theta)\).

emission_distribution(params, state, inputs=None)[source]#

Return a distribution over emissions given current state.

Parameters:
  • params – model parameters \(\theta\)

  • state – current latent state \(z_t\)

  • inputs – current inputs \(u_t\)

Returns:

conditional distribution of current emission \(p(y_t \mid z_t, u_t, \theta)\)

log_prior(params)[source]#

Return the log prior probability of any model parameters.

Returns:

log prior probability.

Return type:

lp (Scalar)

marginal_log_prob(params, emissions, inputs=None)[source]#

Compute log marginal likelihood of observations, \(\log \sum_{z_{1:T}} p(y_{1:T}, z_{1:T} \mid \theta)\).

Parameters:
  • params – model parameters \(\theta\)

  • emissions – emissions \(y_{1:T}\)

  • inputs – optional inputs \(u_{1:T}\)

Returns:

marginal log probability

filter(params, emissions, inputs=None)[source]#

Compute filtering distributions, \(p(z_t \mid y_{1:t}, u_{1:t}, \theta)\) for \(t=1,\ldots,T\).

Parameters:
  • params – model parameters \(\theta\)

  • emissions – emissions \(y_{1:T}\)

  • inputs – optional inputs \(u_{1:T}\)

Returns:

filtering distributions

smoother(params, emissions, inputs=None)[source]#

Compute smoothing distributions, \(p(z_t \mid y_{1:T}, u_{1:T}, \theta)\) for \(t=1,\ldots,T\).

Parameters:
  • params – model parameters \(\theta\)

  • emissions – emissions \(y_{1:T}\)

  • inputs – optional inputs \(u_{1:T}\)

Returns:

smoothing distributions

e_step(params, emissions, inputs=None)[source]#

The E-step computes expected sufficient statistics under the posterior. In the generic case, we simply return the posterior itself.

initialize_m_step_state(params, props)[source]#

Initialize any required state for the M step.

For example, this might include the optimizer state for Adam.

m_step(params, props, batch_stats, m_step_state)[source]#

Perform an M-step to find parameters that maximize the expected log joint probability.

Specifically, compute

\[\theta^\star = \mathrm{argmax}_\theta \; \mathbb{E}_{p(z_{1:T} \mid y_{1:T}, u_{1:T}, \theta)} \big[\log p(y_{1:T}, z_{1:T}, \theta \mid u_{1:T}) \big]\]

Parameters:
  • params – model parameters \(\theta\)

  • props – properties specifying which parameters should be learned

  • batch_stats – sufficient statistics from each sequence

  • m_step_state – any required state for optimizing the model parameters.

Returns:

new parameters

class HMMInitialState(m_step_optimizer=(<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>), m_step_num_iters=50)[source]#

Abstract class for HMM initial distributions.

abstract distribution(params, inputs=None)[source]#

Return a distribution over the initial latent state

Returns:

conditional distribution of initial state.

Parameters:
  • params (ParameterSet)

  • inputs (Float[Array, 'input_dim'] | None)

Return type:

Distribution

abstract initialize(key=None, method='prior', **kwargs)[source]#

Initialize the model parameters and their corresponding properties.

Parameters:
  • key (PRNGKey | None) – random number generator

  • method (str) – specifies the type of initialization

Returns:

tuple of parameters and their corresponding properties

Return type:

Tuple[ParameterSet, PropertySet]

abstract log_prior(params)[source]#

Compute the log prior probability of the initial distribution parameters.

Parameters:

params (ParameterSet) – initial distribution parameters

Return type:

float | Float[Array, ‘’]

collect_suff_stats(params, posterior, inputs=None)[source]#

Collect sufficient statistics for updating the initial distribution parameters.

Parameters:
  • params (ParameterSet) – initial distribution parameters

  • posterior (HMMPosterior) – posterior distribution over latent states

  • inputs (Float[Array, 'num_timesteps input_dim'] | None) – optional inputs

Returns:

PyTree of sufficient statistics for updating the initial distribution

Return type:

PyTree

initialize_m_step_state(params, props)[source]#

Initialize any required state for the M step.

For example, this might include the optimizer state for Adam.

Parameters:
  • params (ParameterSet)

  • props (PropertySet)

Return type:

Any

m_step(params, props, batch_stats, m_step_state, scale=1.0)[source]#

Perform an M-step on the initial distribution parameters.

Parameters:
  • params (ParameterSet) – current initial distribution parameters

  • props (PropertySet) – parameter properties

  • batch_stats (PyTree) – PyTree of sufficient statistics from each sequence, as output by collect_suff_stats().

  • m_step_state (Any) – any state required for the M-step

  • scale (float) – how to scale the objective

Returns:

Parameters that maximize the expected log joint probability.

Return type:

ParameterSet

class HMMTransitions(m_step_optimizer=(<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>), m_step_num_iters=50)[source]#

Abstract class for HMM transitions.

abstract distribution(params, state, inputs=None)[source]#

Return a distribution over the next latent state

Parameters:
  • params (ParameterSet) – transition parameters

  • state (int) – current latent state

  • inputs (Float[Array, 'input_dim'] | None) – current inputs

Returns:

conditional distribution of next state.

Return type:

Distribution

abstract initialize(key=None, method='prior', **kwargs)[source]#

Initialize the model parameters and their corresponding properties.

Parameters:
  • key (PRNGKey | None) – random number generator

  • method (str) – specifies the type of initialization

Returns:

tuple of parameters and their corresponding properties

Return type:

Tuple[ParameterSet, PropertySet]

abstract log_prior(params)[source]#

Compute the log prior probability of the transition distribution parameters.

Parameters:

params (ParameterSet) – transition distribution parameters

Return type:

float | Float[Array, ‘’]

collect_suff_stats(params, posterior, inputs=None)[source]#

Collect sufficient statistics for updating the transition distribution parameters.

Parameters:
  • params (ParameterSet) – transition distribution parameters

  • posterior (HMMPosterior) – posterior distribution over latent states

  • inputs (Float[Array, 'num_timesteps input_dim'] | None) – optional inputs

Returns:

PyTree of sufficient statistics for updating the transition distribution

Return type:

PyTree

initialize_m_step_state(params, props)[source]#

Initialize any required state for the M step.

For example, this might include the optimizer state for Adam.

Parameters:
  • params (ParameterSet)

  • props (PropertySet)

Return type:

Any

m_step(params, props, batch_stats, m_step_state, scale=1.0)[source]#

Perform an M-step on the transition distribution parameters.

Parameters:
  • params (ParameterSet) – current transition distribution parameters

  • props (PropertySet) – parameter properties

  • batch_stats (PyTree) – PyTree of sufficient statistics from each sequence, as output by collect_suff_stats().

  • m_step_state (Any) – any state required for the M-step

  • scale (float) – how to scale the objective

Returns:

Parameters that maximize the expected log joint probability.

Return type:

ParameterSet

class HMMEmissions(m_step_optimizer=(<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>), m_step_num_iters=50)[source]#

Abstract class for HMM emissions.

abstract property emission_shape: Tuple[int]#

Return a pytree matching the pytree of tuples specifying the shape(s) of a single time step’s emissions.

For example, a Gaussian HMM with D dimensional emissions would return (D,).

abstract distribution(params, state, inputs=None)[source]#

Return a distribution over the emission

Parameters:
  • params (ParameterSet) – emission parameters

  • state (int) – current latent state

  • inputs (Float[Array, 'input_dim'] | None) – current inputs

Returns:

conditional distribution of the emission

Return type:

Distribution

abstract initialize(key=None, method='prior', **kwargs)[source]#

Initialize the model parameters and their corresponding properties.

Parameters:
  • key (PRNGKey | None) – random number generator

  • method (str) – specifies the type of initialization

Returns:

tuple of parameters and their corresponding properties

Return type:

Tuple[ParameterSet, PropertySet]

abstract log_prior(params)[source]#

Compute the log prior probability of the emission distribution parameters.

Parameters:

params (ParameterSet) – emission distribution parameters

Return type:

float | Float[Array, ‘’]

collect_suff_stats(params, posterior, emissions, inputs=None)[source]#

Collect sufficient statistics for updating the emission distribution parameters.

Parameters:
  • params (ParameterSet) – emission distribution parameters

  • posterior (HMMPosterior) – posterior distribution over latent states

  • emissions (Float[Array, 'num_timesteps emission_dim']) – observed emissions

  • inputs (Float[Array, 'num_timesteps input_dim'] | None) – optional inputs

Returns:

PyTree of sufficient statistics for updating the emission distribution

Return type:

PyTree

initialize_m_step_state(params, props)[source]#

Initialize any required state for the M step.

For example, this might include the optimizer state for Adam.

Parameters:
  • params (ParameterSet)

  • props (PropertySet)

Return type:

Any

m_step(params, props, batch_stats, m_step_state, scale=1.0)[source]#

Perform an M-step on the emission distribution parameters.

Parameters:
  • params (ParameterSet) – current emission distribution parameters

  • props (PropertySet) – parameter properties

  • batch_stats (PyTree) – PyTree of sufficient statistics from each sequence, as output by collect_suff_stats().

  • m_step_state (Any) – any state required for the M-step

  • scale (float) – how to scale the objective

Returns:

Parameters that maximize the expected log joint probability.

Return type:

ParameterSet

High-level models#

The HMM implementations below cover common emission distributions. If the emissions are exponential family distributions, the models implement closed-form EM updates. For HMMs with emissions outside the exponential family, these models default to a generic M-step implemented in HMMEmissions.

Unless otherwise specified, these models have standard initial distributions and transition distributions with conjugate, Bayesian priors on their parameters.

Initial distribution:

\[p(z_1 \mid \pi_1) = \mathrm{Cat}(z_1 \mid \pi_1)\]
\[p(\pi_1) = \mathrm{Dir}(\pi_1 \mid \alpha 1_K)\]

where \(\alpha\) is the prior concentration on the initial distribution \(\pi_1\).

Transition distribution:

\[p(z_t \mid z_{t-1}, \theta) = \mathrm{Cat}(z_t \mid A_{z_{t-1}})\]
\[p(A) = \prod_{k=1}^K \mathrm{Dir}(A_k \mid \beta 1_K + \kappa e_k)\]

where \(\beta\) is the prior concentration on the rows of the transition matrix \(A\) and \(\kappa\) is the stickiness, which biases the prior toward transition matrices with larger values along the diagonal.

These hyperparameters can be specified in the HMM constructors, and they default to weak priors without any stickiness.
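To see how the stickiness \(\kappa\) shapes the prior, the sketch below builds the Dirichlet concentration for each row of \(A\), namely \(\beta 1_K + \kappa e_k\), and draws one transition matrix from it. It is a plain-Python illustration with hypothetical helper names (sticky_concentration, sample_dirichlet); the library uses its own distribution objects.

```python
import random


def sticky_concentration(num_states, beta=1.1, kappa=0.0):
    """Row k has concentration beta everywhere, plus kappa on the diagonal entry."""
    return [[beta + (kappa if j == k else 0.0) for j in range(num_states)]
            for k in range(num_states)]


def sample_dirichlet(conc, rng):
    """Draw from a Dirichlet by normalizing independent gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in conc]
    s = sum(g)
    return [v / s for v in g]


rng = random.Random(0)
conc = sticky_concentration(3, beta=1.1, kappa=10.0)
A = [sample_dirichlet(row, rng) for row in conc]
for row in A:
    print([round(v, 3) for v in row])
```

With \(\kappa = 0\) every entry of the concentration equals \(\beta\), so the prior treats self-transitions like any other transition; increasing \(\kappa\) shifts prior mass toward the diagonal.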

class BernoulliHMM(num_states, emission_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_prior_concentration0=1.1, emission_prior_concentration1=1.1)[source]#

Bases: HMM

An HMM with conditionally independent Bernoulli emissions.

Let \(y_t \in \{0,1\}^N\) denote a binary vector of emissions at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \prod_{n=1}^N \mathrm{Bern}(y_{tn} \mid \theta_{z_t,n})\]
\[p(\theta) = \prod_{k=1}^K \prod_{n=1}^N \mathrm{Beta}(\theta_{k,n}; \gamma_0, \gamma_1)\]

where \(\theta_{k,n} \in [0,1]\) for \(k=1,\ldots,K\) and \(n=1,\ldots,N\) are the emission probabilities and \(\gamma_0, \gamma_1\) are their prior pseudocounts.
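The product over the \(N\) conditionally independent dimensions becomes a sum in log space. A minimal plain-Python sketch, with made-up emission probabilities and a hypothetical helper name, evaluates the emission log probability of one binary vector under each state:

```python
import math


def bernoulli_emission_log_prob(y, probs_k):
    """log p(y_t | z_t = k): sum of per-dimension Bernoulli log probabilities."""
    return sum(math.log(p) if y_n else math.log(1.0 - p)
               for y_n, p in zip(y, probs_k))


# Hypothetical emission probabilities for K = 2 states and N = 3 dimensions.
emission_probs = [[0.9, 0.2, 0.5],
                  [0.1, 0.7, 0.5]]
y_t = [1, 0, 1]
log_probs = [bernoulli_emission_log_prob(y_t, row) for row in emission_probs]
print(log_probs)
```

These per-state log probabilities are the quantities an inference routine needs at each time step.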

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • emission_dim (int) – number of conditionally independent emissions \(N\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_prior_concentration0 (float | Float[Array, '']) – \(\gamma_0\)

  • emission_prior_concentration1 (float | Float[Array, '']) – \(\gamma_1\)

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_probs=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Note: in the future we may support more initialization schemes, like K-Means.

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters. Defaults to jr.PRNGKey(0).

  • method (str) – method for initializing unspecified parameters. Currently, only “prior” is allowed. Defaults to “prior”.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities. Defaults to None.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix. Defaults to None.

  • emission_probs (Float[Array, 'num_states emission_dim'] | None) – manually specified emission probabilities. Defaults to None.

Returns:

Model parameters and their properties.

Return type:

Tuple[ParameterSet, PropertySet]

class CategoricalHMM(num_states, emission_dim, num_classes, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_prior_concentration=1.1)[source]#

Bases: HMM

An HMM with conditionally independent categorical emissions.

Let \(y_t \in \{1,\ldots,C\}^N\) denote a vector of \(N\) conditionally independent categorical emissions from \(C\) classes at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \prod_{n=1}^N \mathrm{Cat}(y_{tn} \mid \theta_{z_t,n})\]
\[p(\theta) = \prod_{k=1}^K \prod_{n=1}^N \mathrm{Dir}(\theta_{k,n}; \gamma 1_C)\]

where \(\theta_{k,n} \in \Delta_C\) for \(k=1,\ldots,K\) and \(n=1,\ldots,N\) are the emission probabilities and \(\gamma\) is their prior concentration.

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • emission_dim (int) – number of conditionally independent emissions \(N\)

  • num_classes (int) – number of multinomial classes \(C\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_prior_concentration – \(\gamma\)

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_probs=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Note: in the future we may support more initialization schemes, like K-Means.

Parameters:
  • key (PRNGKey, optional) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters. Defaults to jr.PRNGKey(0).

  • method (str, optional) – method for initializing unspecified parameters. Currently, only “prior” is allowed. Defaults to “prior”.

  • initial_probs (array, optional) – manually specified initial state probabilities. Defaults to None.

  • transition_matrix (array, optional) – manually specified transition matrix. Defaults to None.

  • emission_probs (array, optional) – manually specified emission probabilities. Defaults to None.

Returns:

Model parameters and their properties.

Return type:

Tuple[ParameterSet, PropertySet]

class GammaHMM(num_states, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, m_step_optimizer=(<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>), m_step_num_iters=50)[source]#

Bases: HMM

An HMM whose emissions come from a gamma distribution.

Let \(y_t \in \mathbb{R}_+\) denote non-negative emissions. In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \mathrm{Ga}(y_{t} \mid \alpha_{z_t}, \beta_{z_t})\]

with emission concentration \(\alpha_k \in \mathbb{R}_+\) and emission rate \(\beta_k \in \mathbb{R}_+\).
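For reference, the log density of a gamma distribution with concentration \(\alpha\) and rate \(\beta\) is \(\alpha \log \beta - \log \Gamma(\alpha) + (\alpha - 1) \log y - \beta y\). The plain-Python sketch below evaluates it for each state; the per-state parameter values and the helper name gamma_log_pdf are assumptions for this example.

```python
import math


def gamma_log_pdf(y, concentration, rate):
    """Log density of Ga(y | concentration, rate) for y > 0."""
    return (concentration * math.log(rate) - math.lgamma(concentration)
            + (concentration - 1.0) * math.log(y) - rate * y)


# Hypothetical per-state parameters for K = 2 states.
emission_concentrations = [1.0, 4.0]
emission_rates = [0.5, 2.0]
y_t = 2.0
log_probs = [gamma_log_pdf(y_t, a, b)
             for a, b in zip(emission_concentrations, emission_rates)]
print(log_probs)
```

With concentration 1 the gamma reduces to an exponential distribution, which gives a quick check: the first entry equals \(\log 0.5 - 0.5 \cdot 2\).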

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • m_step_optimizer (GradientTransformation) – optax optimizer, like Adam.

  • m_step_num_iters (int) – number of optimizer steps per M-step.

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_concentrations=None, emission_rates=None, emissions=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_concentrations (Float[Array, 'num_states'] | None) – manually specified emission concentrations.

  • emission_rates (Float[Array, 'num_states'] | None) – manually specified emission rates.

  • emissions (Float[Array, 'num_timesteps'] | None) – emissions for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class GaussianHMM(num_states, emission_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_prior_mean=0.0, emission_prior_concentration=0.0001, emission_prior_scale=0.0001, emission_prior_extra_df=0.1)[source]#

Bases: HMM

An HMM with multivariate normal (i.e. Gaussian) emissions.

Let \(y_t \in \mathbb{R}^N\) denote the vector-valued emission at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \mathcal{N}(y_{t} \mid \mu_{z_t}, \Sigma_{z_t})\]

with \(\theta = \{\mu_k, \Sigma_k\}_{k=1}^K\) denoting the emission means and emission covariances.

The model has a conjugate normal-inverse-Wishart prior,

\[p(\theta) = \prod_{k=1}^K \mathcal{N}(\mu_k \mid \mu_0, \kappa_0^{-1} \Sigma_k) \mathrm{IW}(\Sigma_{k} \mid \nu_0, \Psi_0)\]

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • emission_dim (int) – dimension of the emission vector \(N\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_prior_mean (float | Float[Array, ''] | Float[Array, 'emission_dim']) – \(\mu_0\)

  • emission_prior_concentration (float | Float[Array, '']) – \(\kappa_0\)

  • emission_prior_extra_df (float | Float[Array, '']) – \(\nu_0 - N > 0\), the “extra” degrees of freedom, above and beyond the minimum of \(\nu_0 = N\).

  • emission_prior_scale (float | Float[Array, ''] | Float[Array, 'emission_dim emission_dim']) – \(\Psi_0\)

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_means=None, emission_covariances=None, emissions=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_means (Float[Array, 'num_states emission_dim'] | None) – manually specified emission means.

  • emission_covariances (Float[Array, 'num_states emission_dim emission_dim'] | None) – manually specified emission covariances.

  • emissions (Float[Array, 'num_timesteps emission_dim'] | None) – emissions for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class DiagonalGaussianHMM(num_states, emission_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_prior_mean=0.0, emission_prior_mean_concentration=0.0001, emission_prior_concentration=0.1, emission_prior_scale=0.1)[source]#

Bases: HMM

An HMM with conditionally independent normal (i.e. Gaussian) emissions.

Let \(y_t \in \mathbb{R}^N\) denote the vector-valued emission at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \prod_{n=1}^N \mathcal{N}(y_{t,n} \mid \mu_{z_t,n}, \sigma_{z_t,n}^2)\]
or equivalently
\[p(y_t \mid z_t, \theta) = \mathcal{N}(y_{t} \mid \mu_{z_t}, \mathrm{diag}(\sigma_{z_t}^2))\]

where \(\sigma_k^2 = [\sigma_{k,1}^2, \ldots, \sigma_{k,N}^2]\) are the emission variances of each dimension in state \(z_t=k\). The complete set of parameters is \(\theta = \{\mu_k, \sigma_k^2\}_{k=1}^K\).
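The equivalence of the two forms above can be checked numerically. This is a minimal standard-library sketch with hypothetical function names and made-up values; it is not how the library evaluates the density.

```python
import math

def normal_log_pdf(y, mean, var):
    """Log density of a univariate normal with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)

def diag_gaussian_log_pdf_product(y, mean, var):
    """Product form: sum of N univariate normal log densities."""
    return sum(normal_log_pdf(yn, mn, vn) for yn, mn, vn in zip(y, mean, var))

def diag_gaussian_log_pdf_joint(y, mean, var):
    """Joint form: multivariate normal log density with covariance diag(var)."""
    n = len(y)
    log_det = sum(math.log(v) for v in var)
    mahalanobis = sum((yn - mn) ** 2 / vn for yn, mn, vn in zip(y, mean, var))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + mahalanobis)

# Hypothetical emission and parameters for one state k, with N = 3.
y_t = [0.3, -1.2, 2.0]
mu_k = [0.0, -1.0, 1.5]
sigma_sq_k = [1.0, 0.25, 4.0]

lp_product = diag_gaussian_log_pdf_product(y_t, mu_k, sigma_sq_k)
lp_joint = diag_gaussian_log_pdf_joint(y_t, mu_k, sigma_sq_k)
```

The two values agree up to floating-point error, since the determinant of a diagonal matrix is the product of its diagonal entries.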

The model has a conjugate normal-inverse-gamma prior,

\[p(\theta) = \prod_{k=1}^K \prod_{n=1}^N \mathcal{N}(\mu_{k,n} \mid \mu_0, \kappa_0^{-1} \sigma_{k,n}^2) \mathrm{IGa}(\sigma_{k,n}^2 \mid \alpha_0, \beta_0)\]

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • emission_dim (int) – number of conditionally independent emissions \(N\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_prior_mean (float | Float[Array, ''] | Float[Array, 'emission_dim']) – \(\mu_0\)

  • emission_prior_mean_concentration (float | Float[Array, ''] | Float[Array, 'emission_dim']) – \(\kappa_0\)

  • emission_prior_concentration (float | Float[Array, '']) – \(\alpha_0\)

  • emission_prior_scale (float | Float[Array, '']) – \(\beta_0\)

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_means=None, emission_scale_diags=None, emissions=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_means (Float[Array, 'num_states emission_dim'] | None) – manually specified emission means.

  • emission_scale_diags (Float[Array, 'num_states emission_dim'] | None) – manually specified emission standard deviations \(\sigma_{k,n}\)

  • emissions (Float[Array, 'num_timesteps emission_dim'] | None) – emissions for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class SphericalGaussianHMM(num_states, emission_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_prior_mean=0.0, emission_prior_mean_covariance=1.0, emission_var_concentration=1.1, emission_var_rate=1.1, m_step_optimizer=(<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>), m_step_num_iters=50)[source]#

Bases: HMM

An HMM with conditionally independent normal emissions with the same variance along each dimension. These are called spherical Gaussian emissions.

Let \(y_t \in \mathbb{R}^N\) denote the vector-valued emission at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \prod_{n=1}^N \mathcal{N}(y_{t,n} \mid \mu_{z_t,n}, \sigma_{z_t}^2)\]
or equivalently
\[p(y_t \mid z_t, \theta) = \mathcal{N}(y_{t} \mid \mu_{z_t}, \sigma_{z_t}^2 I)\]

where \(\sigma_k^2\) is the emission variance in state \(z_t=k\). The complete set of parameters is \(\theta = \{\mu_k, \sigma_k^2\}_{k=1}^K\).

The model has a non-conjugate, factored prior

\[p(\theta) = \prod_{k=1}^K \mathcal{N}(\mu_{k} \mid \mu_0, \Sigma_0) \mathrm{Ga}(\sigma_{k}^2 \mid \alpha_0, \beta_0)\]

Note: In future versions we may implement a conjugate prior for this model.

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • emission_dim (int) – number of conditionally independent emissions \(N\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_prior_mean (float | Float[Array, ''] | Float[Array, 'emission_dim']) – \(\mu_0\)

  • emission_prior_mean_covariance (float | Float[Array, ''] | Float[Array, 'emission_dim emission_dim']) – \(\Sigma_0\)

  • emission_var_concentration (float | Float[Array, '']) – \(\alpha_0\)

  • emission_var_rate (float | Float[Array, '']) – \(\beta_0\)

  • m_step_optimizer (GradientTransformation) – optax optimizer, like Adam.

  • m_step_num_iters (int) – number of optimizer steps per M-step.

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_means=None, emission_scales=None, emissions=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_means (Float[Array, 'num_states emission_dim'] | None) – manually specified emission means.

  • emission_scales (Float[Array, 'num_states'] | None) – manually specified emission scales (sqrt of diagonal of covariance matrix).

  • emissions (Float[Array, 'num_timesteps emission_dim'] | None) – emissions for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class SharedCovarianceGaussianHMM(num_states, emission_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_prior_mean=0.0, emission_prior_concentration=0.0001, emission_prior_scale=0.0001, emission_prior_extra_df=0.1)[source]#

Bases: HMM

An HMM with multivariate normal (i.e. Gaussian) emissions where the covariance matrix is shared by all discrete states.

Let \(y_t \in \mathbb{R}^N\) denote the vector-valued emission at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \mathcal{N}(y_{t} \mid \mu_{z_t}, \Sigma)\]

where \(\Sigma\) is the shared emission covariance.

The complete set of parameters is \(\theta = (\{\mu_k\}_{k=1}^K, \Sigma)\).

The model has a conjugate prior,

\[p(\theta) = \mathrm{IW}(\Sigma \mid \nu_0, \Psi_0) \prod_{k=1}^K \mathcal{N}(\mu_{k} \mid \mu_0, \kappa_0^{-1} \Sigma)\]

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • emission_dim (int) – dimension of the emission vector \(N\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_prior_mean (float | Float[Array, ''] | Float[Array, 'emission_dim']) – \(\mu_0\)

  • emission_prior_concentration (float | Float[Array, '']) – \(\kappa_0\)

  • emission_prior_scale (float | Float[Array, '']) – \(\Psi_0\)

  • emission_prior_extra_df (float | Float[Array, '']) – \(\nu_0 - N > 0\), the “extra” degrees of freedom, above and beyond the minimum of \(\nu_0 = N\).

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_means=None, emission_covariance=None, emissions=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_means (Float[Array, 'num_states emission_dim'] | None) – manually specified emission means.

  • emission_covariance (Float[Array, 'emission_dim emission_dim'] | None) – manually specified emission covariance.

  • emissions (Float[Array, 'num_timesteps emission_dim'] | None) – emissions for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class LowRankGaussianHMM(num_states, emission_dim, emission_rank, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_diag_factor_concentration=1.1, emission_diag_factor_rate=1.1, m_step_optimizer=(<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>), m_step_num_iters=50)[source]#

Bases: HMM

An HMM with multivariate normal (i.e. Gaussian) emissions where the covariance matrix is low rank plus diagonal.

Let \(y_t \in \mathbb{R}^N\) denote the vector-valued emission at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \mathcal{N}(y_{t} \mid \mu_{z_t}, \Sigma_{z_t})\]

where \(\Sigma_k\) factors as,

\[\Sigma_k = U_k U_k^\top + \mathrm{diag}(d_k)\]

with low rank factors \(U_k \in \mathbb{R}^{N \times M}\) and diagonal factor \(d_k \in \mathbb{R}_+^{N}\).

The complete set of parameters is \(\theta = \{\mu_k, U_k, d_k\}_{k=1}^K\).
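To make the factorization concrete, here is a small standard-library sketch that assembles \(\Sigma_k = U_k U_k^\top + \mathrm{diag}(d_k)\) from nested lists and compares the number of stored covariance entries against a full covariance. The function names and values are hypothetical.

```python
def low_rank_plus_diag(U, d):
    """Return U U^T + diag(d) as a nested list, for U of shape N x M."""
    n, m = len(U), len(U[0])
    return [[sum(U[i][r] * U[j][r] for r in range(m)) + (d[i] if i == j else 0.0)
             for j in range(n)]
            for i in range(n)]

def num_covariance_params(n, m=None):
    """Stored covariance entries per state.

    With m given: N*M low rank entries plus N diagonal entries.
    Without m: the N(N+1)/2 distinct entries of a full symmetric matrix.
    """
    return n * m + n if m is not None else n * (n + 1) // 2

# Hypothetical factors for one state with N = 3 and M = 1.
U_k = [[1.0], [2.0], [0.0]]
d_k = [0.5, 0.5, 0.5]
Sigma_k = low_rank_plus_diag(U_k, d_k)

# For N = 50 and M = 2 the factored form is far smaller than a full covariance.
low_rank_count = num_covariance_params(50, 2)
full_count = num_covariance_params(50)
```

Because every entry of \(d_k\) is positive, the resulting matrix is positive definite even when \(M < N\).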

This model does not have a conjugate prior. Instead, we place a gamma prior on the diagonal factors,

\[p(\theta) \propto \prod_{k=1}^K \prod_{n=1}^N \mathrm{Ga}(d_{k,n} \mid \alpha_0, \beta_0)\]

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • emission_dim (int) – dimension of the emission vector \(N\)

  • emission_rank (int) – rank of the low rank factors, \(M\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_diag_factor_concentration (float | Float[Array, '']) – \(\alpha_0\)

  • emission_diag_factor_rate (float | Float[Array, '']) – \(\beta_0\)

  • m_step_optimizer (GradientTransformation) – optax optimizer, like Adam.

  • m_step_num_iters (int) – number of optimizer steps per M-step.

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_means=None, emission_cov_diag_factors=None, emission_cov_low_rank_factors=None, emissions=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_means (Float[Array, 'num_states emission_dim'] | None) – manually specified emission means.

  • emission_cov_diag_factors (Float[Array, 'num_states emission_dim'] | None) – manually specified diagonal factors \(d_k\) of the emission covariances.

  • emission_cov_low_rank_factors (Float[Array, 'num_states emission_dim emission_rank'] | None) – manually specified low rank factors \(U_k\) of the emission covariances.

  • emissions (Float[Array, 'num_timesteps emission_dim'] | None) – emissions for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class MultinomialHMM(num_states, emission_dim, num_classes, num_trials, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_prior_concentration=1.1)[source]#

Bases: HMM

An HMM with conditionally independent multinomial emissions.

Let \(y_{t,n} \in \mathbb{N}^C\) denote a vector of \(C\) counts for each of \(N\) conditionally independent multinomial emissions at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \prod_{n=1}^N \mathrm{Mult}(y_{tn} \mid R, \theta_{z_t,n})\]
\[p(\theta) = \prod_{k=1}^K \prod_{n=1}^N \mathrm{Dir}(\theta_{k,n}; \gamma 1_C)\]

where \(\theta_{k,n} \in \Delta_C\) for \(k=1,\ldots,K\) and \(n=1,\ldots,N\) are the emission probabilities and \(\gamma\) is their prior concentration.
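The emission log probability above is a sum of \(N\) multinomial log pmfs. A minimal standard-library sketch, with hypothetical function names and values, might look like this:

```python
import math

def multinomial_log_pmf(counts, probs):
    """Log pmf of Mult(counts | R, probs), where R = sum(counts)."""
    num_trials = sum(counts)
    log_coeff = (math.lgamma(num_trials + 1)
                 - sum(math.lgamma(c + 1) for c in counts))
    # Skip zero counts so that classes with zero probability do not hit log(0).
    return log_coeff + sum(c * math.log(p)
                           for c, p in zip(counts, probs) if c > 0)

def multinomial_emission_log_prob(y_t, theta_k):
    """Sum over the N conditionally independent multinomial emissions."""
    return sum(multinomial_log_pmf(y_tn, theta_kn)
               for y_tn, theta_kn in zip(y_t, theta_k))

# Hypothetical example: N = 2 emissions, C = 2 classes, R = 2 trials.
y_t = [[1, 1], [2, 0]]
theta_k = [[0.5, 0.5], [0.5, 0.5]]
log_p = multinomial_emission_log_prob(y_t, theta_k)
```

Here the first emission has probability 0.5 and the second 0.25, so the joint probability is 0.125.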

Parameters:
  • num_states – number of discrete states \(K\)

  • emission_dim – number of conditionally independent emissions \(N\)

  • num_classes – number of multinomial classes \(C\)

  • num_trials – number of multinomial trials \(R\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_prior_concentration (float | Float[Array, '']) – \(\gamma\)

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_probs=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Note: in the future we may support more initialization schemes, like K-Means.

Parameters:
  • key (PRNGKey, optional) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters. Defaults to jr.PRNGKey(0).

  • method (str, optional) – method for initializing unspecified parameters. Currently, only “prior” is allowed. Defaults to “prior”.

  • initial_probs (array, optional) – manually specified initial state probabilities. Defaults to None.

  • transition_matrix (array, optional) – manually specified transition matrix. Defaults to None.

  • emission_probs (array, optional) – manually specified emission probabilities. Defaults to None.

Returns:

Model parameters and their properties.

Return type:

Tuple[ParameterSet, PropertySet]

class PoissonHMM(num_states, emission_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_prior_concentration=1.1, emission_prior_rate=0.1)[source]#

Bases: HMM

An HMM with conditionally independent Poisson emissions.

Let \(y_t \in \mathbb{N}^N\) denote a vector of count emissions at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \prod_{n=1}^N \mathrm{Po}(y_{tn} \mid \theta_{z_t,n})\]
\[p(\theta) = \prod_{k=1}^K \prod_{n=1}^N \mathrm{Ga}(\theta_{k,n}; \gamma_0, \gamma_1)\]

where \(\theta_{k,n} \in \mathbb{R}_+\) for \(k=1,\ldots,K\) and \(n=1,\ldots,N\) are the emission rates and \(\gamma_0, \gamma_1\) are their prior concentration and rate, respectively.
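The Poisson emission log probability factorizes over the \(N\) dimensions. The following standard-library sketch uses hypothetical function names and values to show the computation for one state:

```python
import math

def poisson_log_pmf(count, rate):
    """Log pmf of Po(count | rate) for a non-negative integer count."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)

def poisson_emission_log_prob(y_t, rates_k):
    """Sum over the N conditionally independent Poisson emissions."""
    return sum(poisson_log_pmf(y, r) for y, r in zip(y_t, rates_k))

# Hypothetical example with N = 2 and rates for one state k.
y_t = [0, 2]
theta_k = [1.0, 2.0]
log_p = poisson_emission_log_prob(y_t, theta_k)
```

For these values the result equals \(\log 2 - 3\): the first dimension contributes \(-1\) and the second contributes \(\log 2 - 2\).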

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • emission_dim (int) – number of conditionally independent emissions \(N\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_prior_concentration (float | Float[Array, '']) – \(\gamma_0\)

  • emission_prior_rate (float | Float[Array, '']) – \(\gamma_1\)

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_rates=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Note: in the future we may support more initialization schemes, like K-Means.

Parameters:
  • key – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters. Defaults to jr.PRNGKey(0).

  • method – method for initializing unspecified parameters. Currently, only “prior” is allowed. Defaults to “prior”.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities. Defaults to None.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix. Defaults to None.

  • emission_rates (Float[Array, 'num_states emission_dim'] | None) – manually specified emission rates. Defaults to None.

Returns:

Model parameters and their properties.

Return type:

Tuple[ParameterSet, PropertySet]

class GaussianMixtureHMM(num_states, num_components, emission_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_weights_concentration=1.1, emission_prior_mean=0.0, emission_prior_mean_concentration=0.0001, emission_prior_extra_df=0.0001, emission_prior_scale=0.0001)[source]#

Bases: HMM

An HMM with mixture of multivariate normal (i.e. Gaussian) emissions.

Let \(y_t \in \mathbb{R}^N\) denote the vector-valued emission at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \sum_{c=1}^C w_{z_t,c} \mathcal{N}(y_{t} \mid \mu_{z_t, c}, \Sigma_{z_t, c})\]

with \(\theta = \{\{\mu_{k,c}, \Sigma_{k, c}\}_{c=1}^C, w_k \}_{k=1}^K\) denoting the emission means and emission covariances for each discrete state \(k\) and component \(c\), as well as the emission weights \(w_k \in \Delta_C\), which specify the probability of each component in state \(k\).

The model has a conjugate normal-inverse-Wishart prior,

\[p(\theta) = \prod_{k=1}^K \mathrm{Dir}(w_k \mid \gamma 1_C) \prod_{c=1}^C \mathcal{N}(\mu_{k,c} \mid \mu_0, \kappa_0^{-1} \Sigma_{k,c}) \mathrm{IW}(\Sigma_{k, c} \mid \nu_0, \Psi_0)\]

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • num_components (int) – number of mixture components \(C\)

  • emission_dim (int) – dimension of the emission vector \(N\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_weights_concentration (float | Float[Array, ''] | Float[Array, 'num_components']) – \(\gamma\)

  • emission_prior_mean (float | Float[Array, ''] | Float[Array, 'emission_dim']) – \(\mu_0\)

  • emission_prior_mean_concentration (float | Float[Array, '']) – \(\kappa_0\)

  • emission_prior_extra_df (float | Float[Array, '']) – \(\nu_0 - N > 0\), the “extra” degrees of freedom, above and beyond the minimum of \(\nu_0 = N\).

  • emission_prior_scale (float | Float[Array, ''] | Float[Array, 'emission_dim emission_dim']) – \(\Psi_0\)

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_weights=None, emission_means=None, emission_covariances=None, emissions=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_weights (Float[Array, 'num_states num_components'] | None) – manually specified emission weights.

  • emission_means (Float[Array, 'num_states num_components emission_dim'] | None) – manually specified emission means.

  • emission_covariances (Float[Array, 'num_states num_components emission_dim emission_dim'] | None) – manually specified emission covariances.

  • emissions (Float[Array, 'num_timesteps emission_dim'] | None) – emissions for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class DiagonalGaussianMixtureHMM(num_states, num_components, emission_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_weights_concentration=1.1, emission_prior_mean=0.0, emission_prior_mean_concentration=0.0001, emission_prior_shape=1.0, emission_prior_scale=1.0)[source]#

Bases: HMM

An HMM with mixture of multivariate normal (i.e. Gaussian) emissions with diagonal covariance.

Let \(y_t \in \mathbb{R}^N\) denote the vector-valued emission at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid z_t, \theta) = \sum_{c=1}^C w_{z_t,c} \mathcal{N}(y_{t} \mid \mu_{z_t, c}, \mathrm{diag}(\sigma_{z_t, c}^2))\]

or, equivalently,

\[p(y_t \mid z_t, \theta) = \sum_{c=1}^C w_{z_t,c} \prod_{n=1}^N \mathcal{N}(y_{t,n} \mid \mu_{z_t, c, n}, \sigma_{z_t, c, n}^2)\]

The parameters are \(\theta = \{\{\mu_{k,c}, \sigma_{k, c}^2\}_{c=1}^C, w_k \}_{k=1}^K\) denoting the emission means and emission variances for each discrete state \(k\) and component \(c\), as well as the emission weights \(w_k \in \Delta_C\), which specify the probability of each component in state \(k\).
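A mixture density like the one above is usually evaluated in log space with a log-sum-exp over components. The sketch below is a standard-library illustration with hypothetical function names and values, not the library's implementation.

```python
import math

def normal_log_pdf(y, mean, var):
    """Log density of a univariate normal with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def diag_mixture_log_pdf(y, weights, means, variances):
    """Log density of a mixture of diagonal Gaussians for one discrete state."""
    terms = []
    for w, mu_c, var_c in zip(weights, means, variances):
        log_component = sum(normal_log_pdf(yn, mn, vn)
                            for yn, mn, vn in zip(y, mu_c, var_c))
        terms.append(math.log(w) + log_component)
    return log_sum_exp(terms)

# Hypothetical state with C = 2 components and N = 2 dimensions.
y_t = [0.5, -0.5]
w_k = [0.3, 0.7]
mu_k = [[0.0, 0.0], [1.0, -1.0]]
sigma_sq_k = [[1.0, 1.0], [0.5, 2.0]]
log_p = diag_mixture_log_pdf(y_t, w_k, mu_k, sigma_sq_k)
```

A useful sanity check is that a mixture of identical components reduces to a single diagonal Gaussian, regardless of the weights.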

The model has a conjugate normal-inverse-gamma prior,

\[p(\theta) = \prod_{k=1}^K \mathrm{Dir}(w_k \mid \gamma 1_C) \prod_{c=1}^C \prod_{n=1}^N \mathcal{N}(\mu_{k,c,n} \mid \mu_0, \kappa_0^{-1} \sigma_{k,c,n}^2) \mathrm{IGa}(\sigma_{k, c, n}^2 \mid \alpha_0, \beta_0)\]

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • num_components (int) – number of mixture components \(C\)

  • emission_dim (int) – number of conditionally independent emissions \(N\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_weights_concentration (float | Float[Array, ''] | Float[Array, 'num_components']) – \(\gamma\)

  • emission_prior_mean (float | Float[Array, ''] | Float[Array, 'emission_dim']) – \(\mu_0\)

  • emission_prior_mean_concentration (float | Float[Array, '']) – \(\kappa_0\)

  • emission_prior_shape (float | Float[Array, '']) – \(\alpha_0\)

  • emission_prior_scale (float | Float[Array, '']) – \(\beta_0\)

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_weights=None, emission_means=None, emission_scale_diags=None, emissions=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_weights (Float[Array, 'num_states num_components'] | None) – manually specified emission weights.

  • emission_means (Float[Array, 'num_states num_components emission_dim'] | None) – manually specified emission means.

  • emission_scale_diags (Float[Array, 'num_states num_components emission_dim'] | None) – manually specified emission scales (sqrt of the variances). Defaults to None.

  • emissions (Float[Array, 'num_timesteps emission_dim'] | None) – emissions for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class LinearRegressionHMM(num_states, input_dim, emission_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0)[source]#

Bases: HMM

An HMM whose emissions come from a linear regression with state-dependent weights. This is also known as a switching linear regression model.

Let \(y_t \in \mathbb{R}^N\) and \(u_t \in \mathbb{R}^M\) denote vector-valued emissions and inputs at time \(t\), respectively. In this model, the emission distribution is,

\[p(y_t \mid z_t, u_t, \theta) = \mathcal{N}(y_{t} \mid W_{z_t} u_t + b_{z_t}, \Sigma_{z_t})\]

with emission weights \(W_k \in \mathbb{R}^{N \times M}\), emission biases \(b_k \in \mathbb{R}^N\), and emission covariances \(\Sigma_k \in \mathbb{R}_{\succeq 0}^{N \times N}\).

The emissions parameters are \(\theta = \{W_k, b_k, \Sigma_k\}_{k=1}^K\).
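As a rough illustration of the emission mean \(W_{z_t} u_t + b_{z_t}\), the following plain-Python sketch evaluates it for a given state. The helper name emission_mean and the toy values are hypothetical and not part of the library; the nested lists only mirror the documented shapes (num_states, emission_dim, input_dim) and (num_states, emission_dim).

```python
def emission_mean(weights, biases, state, inputs):
    """Mean of p(y_t | z_t = state, u_t), i.e. W_k u_t + b_k.

    weights: nested list of shape (K, N, M); biases: nested list of shape (K, N).
    """
    W, b = weights[state], biases[state]
    return [sum(w * u for w, u in zip(row, inputs)) + b_n
            for row, b_n in zip(W, b)]

# Hypothetical toy values: K=2 states, N=2 emission dims, M=3 input dims.
weights = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
           [[0.0, 0.0, 2.0], [1.0, 1.0, 1.0]]]
biases = [[0.0, 0.0], [0.5, -0.5]]
u_t = [1.0, 2.0, 3.0]
mean_state0 = emission_mean(weights, biases, 0, u_t)
mean_state1 = emission_mean(weights, biases, 1, u_t)
```

The same input produces a different mean in each state, which is what makes the regression "switching".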

We do not place a prior on the emission parameters.

Note: in the future we may add a matrix-normal-inverse-Wishart prior (see pg 576).

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • input_dim (int) – input dimension \(M\)

  • emission_dim (int) – emission dimension \(N\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.
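One plausible reading of the stickiness hyperparameter, sketched here as an assumption rather than as the library's implementation, is that each row of the transition matrix gets a Dirichlet prior whose concentration is \(\beta\) everywhere, plus the stickiness on the diagonal entry. The helper name transition_concentration is hypothetical.

```python
def transition_concentration(num_states, concentration=1.1, stickiness=0.0):
    """Dirichlet concentration for each row of the transition matrix.

    Assumption: stickiness is added to the diagonal (self-transition) entries only.
    """
    return [[concentration + (stickiness if i == j else 0.0)
             for j in range(num_states)]
            for i in range(num_states)]

conc = transition_concentration(3, concentration=1.1, stickiness=10.0)
```

Under this reading, a larger stickiness makes self-transitions more probable a priori, which favors longer state dwell times.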

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_weights=None, emission_biases=None, emission_covariances=None, emissions=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_weights (Float[Array, 'num_states emission_dim input_dim'] | None) – manually specified emission weights.

  • emission_biases (Float[Array, 'num_states emission_dim'] | None) – manually specified emission biases.

  • emission_covariances (Float[Array, 'num_states emission_dim emission_dim'] | None) – manually specified emission covariances.

  • emissions (Float[Array, 'num_timesteps emission_dim'] | None) – emissions for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class LogisticRegressionHMM(num_states, input_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, emission_matrices_scale=100000000.0, m_step_optimizer=(<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>), m_step_num_iters=50)[source]#

Bases: HMM

An HMM whose emissions come from a logistic regression with state-dependent weights. This is also known as a switching logistic regression model.

Let \(y_t \in \{0,1\}\) and \(u_t \in \mathbb{R}^M\) denote binary emissions and inputs at time \(t\), respectively. In this model, the emission distribution is,

\[p(y_t \mid z_t, u_t, \theta) = \mathrm{Bern}(y_{t} \mid \sigma(w_{z_t}^\top u_t + b_{z_t}))\]

with emission weights \(w_k \in \mathbb{R}^{M}\) and emission biases \(b_k \in \mathbb{R}\).

We use \(L_2\) regularization on the emission weights, which can be thought of as a Gaussian prior,

\[p(\theta) \propto \prod_{k=1}^K \prod_{m=1}^M \mathcal{N}(w_{k,m} \mid 0, \varsigma^2)\]
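The two formulas above can be sketched in plain Python as follows. The function names are hypothetical, and the log prior is given only up to an additive constant, which is all an optimizer needs.

```python
import math

def bernoulli_prob(w, b, u):
    """P(y_t = 1 | z_t = k, u_t) = sigmoid(w_k^T u_t + b_k), computed stably."""
    logit = sum(wi * ui for wi, ui in zip(w, u)) + b
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)

def l2_log_prior(weights, scale):
    """Log of prod_k prod_m N(w_{k,m} | 0, scale^2), dropping constants."""
    return -0.5 * sum(w * w for row in weights for w in row) / scale ** 2

prob = bernoulli_prob([0.0, 0.0], 0.0, [1.0, 2.0])
penalty = l2_log_prior([[1.0, 2.0], [0.0, 0.0]], scale=1.0)
```

With the default emission_matrices_scale of 1e8 shown in the signature, the penalty term is very small, so the prior is close to flat.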

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • input_dim (int) – input dimension \(M\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • emission_matrices_scale (float | Float[Array, '']) – \(\varsigma\)

  • m_step_optimizer (GradientTransformation) – optax optimizer, like Adam.

  • m_step_num_iters (int) – number of optimizer steps per M-step.

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_weights=None, emission_biases=None, emissions=None, inputs=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_weights (Float[Array, 'num_states input_dim'] | None) – manually specified emission weights.

  • emission_biases (Float[Array, 'num_states'] | None) – manually specified emission biases.

  • emissions (Float[Array, 'num_timesteps'] | None) – emissions for initializing the parameters with kmeans.

  • inputs (Float[Array, 'num_timesteps input_dim'] | None) – inputs for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class CategoricalRegressionHMM(num_states, num_classes, input_dim, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0, m_step_optimizer=(<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>), m_step_num_iters=50)[source]#

Bases: HMM

An HMM whose emissions come from a categorical regression with state-dependent weights. This is also known as a switching multiclass logistic regression model.

Let \(y_t \in \{1, \ldots, C\}\) and \(u_t \in \mathbb{R}^M\) denote categorical emissions and inputs at time \(t\), respectively. In this model, the emission distribution is,

\[p(y_t \mid z_t, u_t, \theta) = \mathrm{Cat}(y_{t} \mid \mathrm{softmax}(W_{z_t} u_t + b_{z_t}))\]

with emission weights \(W_k \in \mathbb{R}^{C \times M}\) and emission biases \(b_k \in \mathbb{R}^C\).

This model does not have a prior.
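A minimal plain-Python sketch of the class probabilities \(\mathrm{softmax}(W_{z_t} u_t + b_{z_t})\) for one state follows. The helper name class_probs and the toy values are illustrative only.

```python
import math

def class_probs(W, b, u):
    """Softmax of W u + b for a single state; W has shape (C, M), b has shape (C,)."""
    logits = [sum(w * x for w, x in zip(row, u)) + b_c for row, b_c in zip(W, b)]
    offset = max(logits)  # subtract the max logit for numerical stability
    exps = [math.exp(logit - offset) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy values: C=3 classes, M=2 input dims.
probs = class_probs([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0], [2.0, 1.0])
```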

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • num_classes (int) – number of emission classes \(C\)

  • input_dim (int) – input dimension \(M\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

  • m_step_optimizer (GradientTransformation) – optax optimizer, like Adam.

  • m_step_num_iters (int) – number of optimizer steps per M-step.

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_weights=None, emission_biases=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_weights (Float[Array, 'num_states num_classes input_dim'] | None) – manually specified emission weights.

  • emission_biases (Float[Array, 'num_states num_classes'] | None) – manually specified emission biases.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

class LinearAutoregressiveHMM(num_states, emission_dim, num_lags=1, initial_probs_concentration=1.1, transition_matrix_concentration=1.1, transition_matrix_stickiness=0.0)[source]#

Bases: HMM

An autoregressive HMM whose emissions are a linear function of the previous emissions with state-dependent weights. This is also known as a switching vector autoregressive model.

Let \(y_t \in \mathbb{R}^N\) denote vector-valued emissions at time \(t\). In this model, the emission distribution is,

\[p(y_t \mid y_{1:t-1}, z_t, \theta) = \mathcal{N}(y_{t} \mid \sum_{\ell = 1}^L W_{z_t, \ell} y_{t-\ell} + b_{z_t}, \Sigma_{z_t})\]

with emission weights \(W_{k,\ell} \in \mathbb{R}^{N \times N}\) for each lag \(\ell=1,\ldots,L\), emission biases \(b_k \in \mathbb{R}^N\), and emission covariances \(\Sigma_k \in \mathbb{R}_{\succeq 0}^{N \times N}\).

The emissions parameters are \(\theta = \{\{W_{k,\ell}\}_{\ell=1}^L, b_k, \Sigma_k\}_{k=1}^K\).

We do not place a prior on the emission parameters.

Note: in the future we may add a matrix-normal-inverse-Wishart prior (see pg 576).

Parameters:
  • num_states (int) – number of discrete states \(K\)

  • emission_dim (int) – emission dimension \(N\)

  • num_lags (int) – number of lags \(L\)

  • initial_probs_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\alpha\)

  • transition_matrix_concentration (float | Float[Array, ''] | Float[Array, 'num_states']) – \(\beta\)

  • transition_matrix_stickiness (float | Float[Array, '']) – optional hyperparameter to boost the concentration on the diagonal of the transition matrix.

initialize(key=Array([0, 0], dtype=uint32), method='prior', initial_probs=None, transition_matrix=None, emission_weights=None, emission_biases=None, emission_covariances=None, emissions=None)[source]#

Initialize the model parameters and their corresponding properties.

You can either specify parameters manually via the keyword arguments, or you can have them set automatically. If any parameters are not specified, you must supply a PRNGKey. Parameters will then be sampled from the prior (if method==prior).

Parameters:
  • key (PRNGKey) – random number generator for unspecified parameters. Must not be None if there are any unspecified parameters.

  • method (str) – method for initializing unspecified parameters. Both “prior” and “kmeans” are supported.

  • initial_probs (Float[Array, 'num_states'] | None) – manually specified initial state probabilities.

  • transition_matrix (Float[Array, 'num_states num_states'] | None) – manually specified transition matrix.

  • emission_weights (Float[Array, 'num_states emission_dim emission_dim_times_num_lags'] | None) – manually specified emission weights. The weights are stored as matrices \(W_k = [W_{k,1}, \ldots, W_{k,L}] \in \mathbb{R}^{N \times N \cdot L}\).

  • emission_biases (Float[Array, 'num_states emission_dim'] | None) – manually specified emission biases.

  • emission_covariances (Float[Array, 'num_states emission_dim emission_dim'] | None) – manually specified emission covariances.

  • emissions (Float[Array, 'num_timesteps emission_dim'] | None) – emissions for initializing the parameters with kmeans.

Returns:

Model parameters and their properties.

Return type:

Tuple[HMMParameterSet, HMMPropertySet]

sample(params, key, num_timesteps, prev_emissions=None)[source]#

Sample states \(z_{1:T}\) and emissions \(y_{1:T}\) given parameters \(\theta\).

Parameters:
  • params (HMMParameterSet) – model parameters \(\theta\)

  • key (PRNGKey) – random number generator

  • num_timesteps (int) – number of timesteps \(T\)

  • prev_emissions (Float[Array, 'num_lags emission_dim'] | None) – (optionally) preceding emissions \(y_{-L+1:0}\). Defaults to zeros.

Returns:

latent states and emissions

Return type:

Tuple[Float[Array, ‘num_timesteps state_dim’], Float[Array, ‘num_timesteps emission_dim’]]

compute_inputs(emissions, prev_emissions=None)[source]#

Helper function to compute the matrix of lagged emissions.

Parameters:
  • emissions (Float[Array, 'num_timesteps emission_dim']) – \((T \times N)\) array of emissions

  • prev_emissions (Float[Array, 'num_lags emission_dim'] | None) – \((L \times N)\) array of previous emissions. Defaults to zeros.

Returns:

\((T \times N \cdot L)\) array of lagged emissions. These are the inputs to the fitting functions.

Return type:

Float[Array, ‘num_timesteps emission_dim_times_num_lags’]
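To make the shape of the lagged-emissions matrix concrete, here is a plain-Python sketch. It assumes that prev_emissions is in chronological order and that each row is ordered most recent lag first, matching the \([W_{k,1}, \ldots, W_{k,L}]\) layout of the weights; the library's actual ordering may differ, and the helper name lagged_inputs is hypothetical.

```python
def lagged_inputs(emissions, num_lags, prev_emissions=None):
    """Build a (T, N * L) nested list whose row t is [y_{t-1}, ..., y_{t-L}]."""
    dim = len(emissions[0])
    if prev_emissions is None:
        prev_emissions = [[0.0] * dim for _ in range(num_lags)]
    # Chronological history: the L preceding emissions, then y_1, ..., y_T.
    padded = list(prev_emissions) + list(emissions)
    rows = []
    for t in range(len(emissions)):
        row = []
        for lag in range(1, num_lags + 1):
            # emissions[t] sits at padded[t + num_lags], so lag l is l steps earlier.
            row.extend(padded[t + num_lags - lag])
        rows.append(row)
    return rows

inputs = lagged_inputs([[1.0], [2.0], [3.0]], num_lags=2)
```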

Low-level inference#

class HMMPosterior(marginal_loglik, filtered_probs, predicted_probs, smoothed_probs, initial_probs, trans_probs=None)[source]#

Simple wrapper for properties of an HMM posterior distribution.

Transition probabilities may be either 2D or 3D depending on whether the transition matrix is fixed or time-varying.

Parameters:
  • marginal_loglik (float | Float[Array, '']) – \(\log p(y_{1:T} \mid \theta) = \log \sum_{z_{1:T}} p(y_{1:T}, z_{1:T} \mid \theta)\).

  • filtered_probs (Float[Array, 'num_timesteps num_states']) – \(p(z_t \mid y_{1:t}, \theta)\) for \(t=1,\ldots,T\)

  • predicted_probs (Float[Array, 'num_timesteps num_states']) – \(p(z_t \mid y_{1:t-1}, \theta)\) for \(t=1,\ldots,T\)

  • smoothed_probs (Float[Array, 'num_timesteps num_states']) – \(p(z_t \mid y_{1:T}, \theta)\) for \(t=1,\ldots,T\)

  • initial_probs (Float[Array, 'num_states']) – \(p(z_1 \mid y_{1:T}, \theta)\) (also present in smoothed_probs but here for convenience)

  • trans_probs (Float[Array, 'num_timesteps num_states num_states'] | Float[Array, 'num_states num_states'] | None) – \(p(z_t, z_{t+1} \mid y_{1:T}, \theta)\) for \(t=1,\ldots,T-1\). (If the transition matrix is fixed, these probabilities may be summed over \(t\). See note above.)

class HMMPosteriorFiltered(marginal_loglik, filtered_probs, predicted_probs)[source]#

Simple wrapper for properties of an HMM filtering posterior.

Parameters:
  • marginal_loglik (float | Float[Array, '']) – \(\log p(y_{1:T} \mid \theta) = \log \sum_{z_{1:T}} p(y_{1:T}, z_{1:T} \mid \theta)\).

  • filtered_probs (Float[Array, 'num_timesteps num_states']) – \(p(z_t \mid y_{1:t}, \theta)\) for \(t=1,\ldots,T\)

  • predicted_probs (Float[Array, 'num_timesteps num_states']) – \(p(z_t \mid y_{1:t-1}, \theta)\) for \(t=1,\ldots,T\)

hmm_filter(initial_distribution, transition_matrix, log_likelihoods, transition_fn=None)[source]#

Forward filtering.

Transition matrix may be either 2D (if transition probabilities are fixed) or 3D if the transition probabilities vary over time. Alternatively, the transition matrix may be specified via transition_fn, which takes in a time index \(t\) and returns a transition matrix.

Parameters:
  • initial_distribution (Float[Array, 'num_states']) – \(p(z_1 \mid u_1, \theta)\)

  • transition_matrix (Float[Array, 'num_timesteps num_states num_states'] | Float[Array, 'num_states num_states']) – \(p(z_{t+1} \mid z_t, u_t, \theta)\)

  • log_likelihoods (Float[Array, 'num_timesteps num_states']) – \(\log p(y_t \mid z_t, u_t, \theta)\) for \(t=1,\ldots, T\).

  • transition_fn (Callable[[Int], Float[Array, 'num_states num_states']] | None) – function that takes in an integer time index and returns a \(K \times K\) transition matrix.

Returns:

filtered posterior distribution

Return type:

HMMPosteriorFiltered
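For intuition, the forward filtering recursion can be sketched in plain Python for a fixed (2D) transition matrix. This is an illustrative re-implementation under that assumption, not the library's code. It returns the same three quantities that HMMPosteriorFiltered holds.

```python
import math

def forward_filter(initial_probs, transition_matrix, log_likelihoods):
    """Return (marginal_loglik, filtered_probs, predicted_probs)."""
    num_states = len(initial_probs)
    predicted = list(initial_probs)
    marginal_loglik = 0.0
    filtered_probs, predicted_probs = [], []
    for log_lik in log_likelihoods:
        predicted_probs.append(predicted)
        # Condition on y_t; subtract the max log likelihood for stability.
        offset = max(log_lik)
        unnorm = [p * math.exp(ll - offset) for p, ll in zip(predicted, log_lik)]
        norm = sum(unnorm)
        filtered = [w / norm for w in unnorm]
        marginal_loglik += math.log(norm) + offset
        filtered_probs.append(filtered)
        # Predict: p(z_{t+1} = j | y_{1:t}) = sum_i filtered[i] * A[i][j].
        predicted = [sum(filtered[i] * transition_matrix[i][j] for i in range(num_states))
                     for j in range(num_states)]
    return marginal_loglik, filtered_probs, predicted_probs

# Hypothetical two-state example with two observations.
A = [[0.9, 0.1], [0.1, 0.9]]
log_liks = [[math.log(0.8), math.log(0.2)]] * 2
loglik, filtered, predicted = forward_filter([0.5, 0.5], A, log_liks)
```

The per-step normalizers multiply to \(p(y_{1:T} \mid \theta)\), which is why the marginal log likelihood comes for free from the filter.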

hmm_smoother(initial_distribution, transition_matrix, log_likelihoods, transition_fn=None, compute_trans_probs=True)[source]#

Compute the smoothed state probabilities using a general Bayesian smoother.

Transition matrix may be either 2D (if transition probabilities are fixed) or 3D if the transition probabilities vary over time. Alternatively, the transition matrix may be specified via transition_fn, which takes in a time index \(t\) and returns a transition matrix.

Note: This is the discrete SSM analog of the RTS smoother for linear Gaussian SSMs.

Parameters:
  • initial_distribution (Float[Array, 'num_states']) – \(p(z_1 \mid u_1, \theta)\)

  • transition_matrix (Float[Array, 'num_timesteps num_states num_states'] | Float[Array, 'num_states num_states']) – \(p(z_{t+1} \mid z_t, u_t, \theta)\)

  • log_likelihoods (Float[Array, 'num_timesteps num_states']) – \(\log p(y_t \mid z_t, u_t, \theta)\) for \(t=1,\ldots, T\).

  • transition_fn (Callable[[Int], Float[Array, 'num_states num_states']] | None) – function that takes in an integer time index and returns a \(K \times K\) transition matrix.

  • compute_trans_probs (bool)

Returns:

posterior distribution

Return type:

HMMPosterior
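The backward pass of such a smoother can be sketched as follows, starting from filtered probabilities that are assumed to have been computed already. The sketch assumes a fixed transition matrix and strictly positive predicted probabilities; the helper name backward_smooth is hypothetical.

```python
def backward_smooth(filtered_probs, transition_matrix):
    """Turn filtered probabilities p(z_t | y_{1:t}) into smoothed p(z_t | y_{1:T})."""
    num_states = len(transition_matrix)
    smoothed = [list(filtered_probs[-1])]  # at t = T, smoothed equals filtered
    for filtered in reversed(filtered_probs[:-1]):
        nxt = smoothed[-1]
        # One-step-ahead prediction p(z_{t+1} = j | y_{1:t}).
        predicted = [sum(filtered[i] * transition_matrix[i][j] for i in range(num_states))
                     for j in range(num_states)]
        # Reweight each state by how well it explains the smoothed next state.
        smoothed.append([
            filtered[i] * sum(transition_matrix[i][j] * nxt[j] / predicted[j]
                              for j in range(num_states))
            for i in range(num_states)
        ])
    return smoothed[::-1]

# Filtered probabilities for a hypothetical two-state, two-step example.
A = [[0.9, 0.1], [0.1, 0.9]]
filtered = [[0.8, 0.2], [0.592 / 0.644, 0.052 / 0.644]]
smoothed = backward_smooth(filtered, A)
```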

hmm_two_filter_smoother(initial_distribution, transition_matrix, log_likelihoods, transition_fn=None, compute_trans_probs=True)[source]#

Compute the smoothed state probabilities using the two-filter smoother, a.k.a. the forward-backward algorithm.

Transition matrix may be either 2D (if transition probabilities are fixed) or 3D if the transition probabilities vary over time. Alternatively, the transition matrix may be specified via transition_fn, which takes in a time index \(t\) and returns a transition matrix.

Parameters:
  • initial_distribution (Float[Array, 'num_states']) – \(p(z_1 \mid u_1, \theta)\)

  • transition_matrix (Float[Array, 'num_timesteps num_states num_states'] | Float[Array, 'num_states num_states']) – \(p(z_{t+1} \mid z_t, u_t, \theta)\)

  • log_likelihoods (Float[Array, 'num_timesteps num_states']) – \(\log p(y_t \mid z_t, u_t, \theta)\) for \(t=1,\ldots, T\).

  • transition_fn (Callable[[Int], Float[Array, 'num_states num_states']] | None) – function that takes in an integer time index and returns a \(K \times K\) transition matrix.

  • compute_trans_probs (bool)

Returns:

posterior distribution

Return type:

HMMPosterior
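A compact plain-Python sketch of the forward-backward idea is given below: a forward message and an independent backward message are computed, then multiplied and normalized. It assumes a fixed transition matrix and exponentiates the log likelihoods directly, so it is only suitable for small toy problems.

```python
import math

def two_filter_smoother(initial_probs, transition_matrix, log_likelihoods):
    """Return smoothed probabilities p(z_t | y_{1:T}) for each time step."""
    K = len(initial_probs)
    liks = [[math.exp(ll) for ll in row] for row in log_likelihoods]
    T = len(liks)
    # Forward pass: alphas[t][k] is proportional to p(z_t = k | y_{1:t}).
    alphas, predicted = [], list(initial_probs)
    for lik in liks:
        alpha = [p * l for p, l in zip(predicted, lik)]
        norm = sum(alpha)
        alpha = [a / norm for a in alpha]
        alphas.append(alpha)
        predicted = [sum(alpha[i] * transition_matrix[i][j] for i in range(K))
                     for j in range(K)]
    # Backward pass: betas[t][k] is proportional to p(y_{t+1:T} | z_t = k).
    betas = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta = [sum(transition_matrix[i][j] * liks[t + 1][j] * betas[t + 1][j]
                    for j in range(K)) for i in range(K)]
        norm = sum(beta)
        betas[t] = [b / norm for b in beta]
    # Combine the two messages and normalize.
    smoothed = []
    for alpha, beta in zip(alphas, betas):
        weights = [a * b for a, b in zip(alpha, beta)]
        norm = sum(weights)
        smoothed.append([w / norm for w in weights])
    return smoothed

A = [[0.9, 0.1], [0.1, 0.9]]
log_liks = [[math.log(0.8), math.log(0.2)]] * 2
smoothed = two_filter_smoother([0.5, 0.5], A, log_liks)
```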

hmm_fixed_lag_smoother(initial_distribution, transition_matrix, log_likelihoods, window_size, transition_fn=None)[source]#

Compute the smoothed state probabilities using the fixed-lag smoother.

The smoothed probability estimates are

\[p(z_t \mid y_{1:t+L}, u_{1:t+L}, \theta)\]

where \(L\) is the window size.

Transition matrix may be either 2D (if transition probabilities are fixed) or 3D if the transition probabilities vary over time. Alternatively, the transition matrix may be specified via transition_fn, which takes in a time index \(t\) and returns a transition matrix.

Parameters:
  • initial_distribution (Float[Array, 'num_states']) – \(p(z_1 \mid u_1, \theta)\)

  • transition_matrix (Float[Array, 'num_timesteps num_states num_states'] | Float[Array, 'num_states num_states']) – \(p(z_{t+1} \mid z_t, u_t, \theta)\)

  • log_likelihoods (Float[Array, 'num_timesteps num_states']) – \(\log p(y_t \mid z_t, u_t, \theta)\) for \(t=1,\ldots, T\).

  • window_size (Int) – the number of future steps to use, \(L\)

  • transition_fn (Callable[[Int], Float[Array, 'num_states num_states']] | None) – function that takes in an integer time index and returns a \(K \times K\) transition matrix.

Returns:

posterior distribution

Return type:

HMMPosterior

hmm_posterior_mode(initial_distribution, transition_matrix, log_likelihoods, transition_fn=None)[source]#

Compute the most likely state sequence. This is called the Viterbi algorithm.

Parameters:
  • initial_distribution (Float[Array, 'num_states']) – \(p(z_1 \mid u_1, \theta)\)

  • transition_matrix (Float[Array, 'num_timesteps num_states num_states'] | Float[Array, 'num_states num_states']) – \(p(z_{t+1} \mid z_t, u_t, \theta)\)

  • log_likelihoods (Float[Array, 'num_timesteps num_states']) – \(\log p(y_t \mid z_t, u_t, \theta)\) for \(t=1,\ldots, T\).

  • transition_fn (Callable[[Int], Float[Array, 'num_states num_states']] | None) – function that takes in an integer time index and returns a \(K \times K\) transition matrix.

Returns:

most likely state sequence

Return type:

Int[Array, ‘num_timesteps’]
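The Viterbi recursion can be sketched in plain Python as a max-product pass followed by backtracking. The sketch assumes a fixed (2D) transition matrix and is meant only to illustrate the algorithm; the helper names are hypothetical.

```python
import math

def safe_log(p):
    """Log that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else -math.inf

def viterbi(initial_probs, transition_matrix, log_likelihoods):
    """Return the most likely state sequence as a list of integer states."""
    K = len(initial_probs)
    scores = [safe_log(initial_probs[k]) + log_likelihoods[0][k] for k in range(K)]
    backpointers = []
    for log_lik in log_likelihoods[1:]:
        new_scores, pointers = [], []
        for j in range(K):
            # Best predecessor of state j at this step.
            best_i = max(range(K),
                         key=lambda i: scores[i] + safe_log(transition_matrix[i][j]))
            pointers.append(best_i)
            new_scores.append(scores[best_i]
                              + safe_log(transition_matrix[best_i][j]) + log_lik[j])
        scores = new_scores
        backpointers.append(pointers)
    # Backtrack from the best final state.
    state = max(range(K), key=lambda k: scores[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical example: two observations favor state 0, then three favor state 1.
A = [[0.9, 0.1], [0.1, 0.9]]
favor0 = [math.log(0.8), math.log(0.2)]
favor1 = [math.log(0.2), math.log(0.8)]
path = viterbi([0.5, 0.5], A, [favor0, favor0, favor1, favor1, favor1])
```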

hmm_posterior_sample(rng, initial_distribution, transition_matrix, log_likelihoods, transition_fn=None)[source]#

Sample a latent sequence from the posterior.

Parameters:
  • rng (PRNGKey) – random number generator

  • initial_distribution (Float[Array, 'num_states']) – \(p(z_1 \mid u_1, \theta)\)

  • transition_matrix (Float[Array, 'num_timesteps num_states num_states'] | Float[Array, 'num_states num_states']) – \(p(z_{t+1} \mid z_t, u_t, \theta)\)

  • log_likelihoods (Float[Array, 'num_timesteps num_states']) – \(\log p(y_t \mid z_t, u_t, \theta)\) for \(t=1,\ldots, T\).

  • transition_fn (Callable[[Int], Float[Array, 'num_states num_states']] | None) – function that takes in an integer time index and returns a \(K \times K\) transition matrix.

Returns:

sample of the latent states, \(z_{1:T}\)

Return type:

Int[Array, ‘num_timesteps’]
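One standard way to draw such a sample is forward filtering followed by backward sampling, sketched below in plain Python. Whether the library uses exactly this scheme is not stated here, so treat it as an assumption. A random.Random instance stands in for the PRNGKey, and a fixed transition matrix is assumed.

```python
import math
import random

def posterior_sample(rng, initial_probs, transition_matrix, log_likelihoods):
    """Draw one state sequence z_{1:T} from p(z_{1:T} | y_{1:T})."""
    K = len(initial_probs)
    # Forward pass: filtered probabilities p(z_t | y_{1:t}).
    filtered_probs, predicted = [], list(initial_probs)
    for log_lik in log_likelihoods:
        offset = max(log_lik)
        weights = [p * math.exp(ll - offset) for p, ll in zip(predicted, log_lik)]
        norm = sum(weights)
        filtered = [w / norm for w in weights]
        filtered_probs.append(filtered)
        predicted = [sum(filtered[i] * transition_matrix[i][j] for i in range(K))
                     for j in range(K)]
    # Backward pass: sample z_T, then z_t given z_{t+1} for t = T-1, ..., 1.
    states = [rng.choices(range(K), weights=filtered_probs[-1])[0]]
    for filtered in reversed(filtered_probs[:-1]):
        nxt = states[-1]
        weights = [filtered[i] * transition_matrix[i][nxt] for i in range(K)]
        states.append(rng.choices(range(K), weights=weights)[0])
    return states[::-1]

A = [[0.9, 0.1], [0.1, 0.9]]
log_liks = [[math.log(0.8), math.log(0.2)]] * 4
sample = posterior_sample(random.Random(0), [0.5, 0.5], A, log_liks)
```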

parallel_hmm_filter(initial_probs, transition_matrix, log_likelihoods)#

Parallel implementation of the forward filtering algorithm with jax.lax.associative_scan.

Note: for this function, the transition matrix must be fixed. We may add support for nonstationary transition matrices in a future release.

Parameters:
  • initial_probs (Float[Array, 'num_states']) – \(p(z_1 \mid u_1, \theta)\)

  • transition_matrix (Float[Array, 'num_states num_states']) – \(p(z_{t+1} \mid z_t, u_t, \theta)\)

  • log_likelihoods (Float[Array, 'num_timesteps num_states']) – \(\log p(y_t \mid z_t, u_t, \theta)\) for \(t=1,\ldots, T\).

Returns:

filtered posterior distribution

Return type:

HMMPosteriorFiltered
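One way to see why filtering admits an associative scan is that the unnormalized forward messages are prefix products of \(K \times K\) matrices, and matrix multiplication is associative. The plain-Python sketch below computes those prefixes sequentially with itertools.accumulate; a parallel scan would produce the same prefixes with logarithmic depth. The element construction here is an illustrative assumption and may differ from the operator the library uses.

```python
import math
from itertools import accumulate

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(Y)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def filter_via_prefix_products(initial_probs, transition_matrix, log_likelihoods):
    """Filtered probabilities obtained from prefix products of per-step matrices."""
    K = len(initial_probs)
    liks = [[math.exp(ll) for ll in row] for row in log_likelihoods]
    # First element: every row is the initial distribution weighted by p(y_1 | z_1).
    elements = [[[initial_probs[j] * liks[0][j] for j in range(K)] for _ in range(K)]]
    # Later elements: M_t[i][j] = p(z_t = j | z_{t-1} = i) * p(y_t | z_t = j).
    elements += [[[transition_matrix[i][j] * lik[j] for j in range(K)] for i in range(K)]
                 for lik in liks[1:]]
    filtered = []
    for prefix in accumulate(elements, matmul):
        row = prefix[0]  # every row holds the same unnormalized forward message
        norm = sum(row)
        filtered.append([x / norm for x in row])
    return filtered

A = [[0.9, 0.1], [0.1, 0.9]]
log_liks = [[math.log(0.8), math.log(0.2)]] * 2
filtered = filter_via_prefix_products([0.5, 0.5], A, log_liks)
```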

parallel_hmm_smoother(initial_probs, transition_matrix, log_likelihoods)#

Parallel implementation of HMM smoothing with jax.lax.associative_scan.

Notes:

  • This implementation uses the automatic differentiation of the HMM log normalizer rather than an explicit implementation of the backward message passing.

  • The transition matrix must be fixed. We may add support for nonstationary transition matrices in a future release.

Parameters:
  • initial_probs (Float[Array, 'num_states']) – \(p(z_1 \mid u_1, \theta)\)

  • transition_matrix (Float[Array, 'num_states num_states']) – \(p(z_{t+1} \mid z_t, u_t, \theta)\)

  • log_likelihoods (Float[Array, 'num_timesteps num_states']) – \(\log p(y_t \mid z_t, u_t, \theta)\) for \(t=1,\ldots, T\).

Returns:

smoothed posterior distribution

Return type:

HMMPosteriorFiltered

Types#

class HMMParameterSet(*args, **kwargs)[source]#

Container for HMM parameters.

Parameters:
  • initial – (ParameterSet) initial distribution parameters

  • transitions – (ParameterSet) transition distribution parameters

  • emissions – (ParameterSet) emission distribution parameters

class HMMPropertySet(*args, **kwargs)[source]#

Container for HMM parameter properties.

Parameters:
  • initial – (PropertySet) initial distribution properties

  • transitions – (PropertySet) transition distribution properties

  • emissions – (PropertySet) emission distribution properties

Linear Gaussian SSM#

High-level class#

class LinearGaussianSSM(state_dim, emission_dim, input_dim=0, has_dynamics_bias=True, has_emissions_bias=True)[source]#

Linear Gaussian State Space Model.

The model is defined as follows

\[p(z_1) = \mathcal{N}(z_1 \mid m, S)\]
\[p(z_t \mid z_{t-1}, u_t) = \mathcal{N}(z_t \mid F_t z_{t-1} + B_t u_t + b_t, Q_t)\]
\[p(y_t \mid z_t) = \mathcal{N}(y_t \mid H_t z_t + D_t u_t + d_t, R_t)\]

where

  • \(z_t\) is a latent state of size state_dim,

  • \(y_t\) is an emission of size emission_dim

  • \(u_t\) is an input of size input_dim (defaults to 0)

  • \(F\) = dynamics (transition) matrix

  • \(B\) = optional input-to-state weight matrix

  • \(b\) = optional dynamics bias vector

  • \(Q\) = covariance matrix of dynamics (system) noise

  • \(H\) = emission (observation) matrix

  • \(D\) = optional input-to-emission weight matrix

  • \(d\) = optional emission bias vector

  • \(R\) = covariance matrix of emission (observation) noise

  • \(m\) = mean of initial state

  • \(S\) = covariance matrix of initial state

The parameters of the model are stored in a ParamsLGSSM. You can create the parameters manually, or by calling initialize().
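To make the generative process concrete, here is a plain-Python sketch for the scalar case (state_dim = emission_dim = 1, no inputs, time-invariant parameters). The function name sample_lgssm_1d is hypothetical and random.Random stands in for the JAX key; it is not the library's sample() method.

```python
import random

def sample_lgssm_1d(rng, num_timesteps, m, S, F, b, Q, H, d, R):
    """Sample states and emissions from a scalar linear Gaussian SSM.

    S, Q and R are variances, so their square roots are passed to gauss().
    """
    states, emissions = [], []
    z = rng.gauss(m, S ** 0.5)  # z_1 ~ N(m, S)
    for t in range(num_timesteps):
        if t > 0:
            z = rng.gauss(F * z + b, Q ** 0.5)  # z_t ~ N(F z_{t-1} + b, Q)
        states.append(z)
        emissions.append(rng.gauss(H * z + d, R ** 0.5))  # y_t ~ N(H z_t + d, R)
    return states, emissions

states, emissions = sample_lgssm_1d(
    random.Random(0), num_timesteps=5,
    m=0.0, S=1.0, F=0.9, b=0.0, Q=0.1, H=1.0, d=0.0, R=0.5)
```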

Parameters:
  • state_dim (int) – Dimensionality of latent state.

  • emission_dim (int) – Dimensionality of observation vector.

  • input_dim (int) – Dimensionality of input vector. Defaults to 0.

  • has_dynamics_bias (bool) – Whether model contains an offset term \(b\). Defaults to True.

  • has_emissions_bias (bool) – Whether model contains an offset term \(d\). Defaults to True.

property emission_shape#

Return a pytree of tuples, matching the structure of the emissions, that specifies the shape of a single time step’s emissions.

For example, a GaussianHMM with \(D\) dimensional emissions would return (D,).

property inputs_shape#

Return a pytree of tuples, matching the structure of the inputs, that specifies the shape of a single time step’s inputs.

initialize(key=Array([0, 0], dtype=uint32), initial_mean=None, initial_covariance=None, dynamics_weights=None, dynamics_bias=None, dynamics_input_weights=None, dynamics_covariance=None, emission_weights=None, emission_bias=None, emission_input_weights=None, emission_covariance=None)[source]#

Initialize model parameters that are set to None, and their corresponding properties.

Parameters:
  • key (Array) – Random number key. Defaults to jr.PRNGKey(0).

  • initial_mean (Float[Array, 'state_dim'] | None) – parameter \(m\). Defaults to None.

  • initial_covariance – parameter \(S\). Defaults to None.

  • dynamics_weights – parameter \(F\). Defaults to None.

  • dynamics_bias – parameter \(b\). Defaults to None.

  • dynamics_input_weights – parameter \(B\). Defaults to None.

  • dynamics_covariance – parameter \(Q\). Defaults to None.

  • emission_weights – parameter \(H\). Defaults to None.

  • emission_bias – parameter \(d\). Defaults to None.

  • emission_input_weights – parameter \(D\). Defaults to None.

  • emission_covariance – parameter \(R\). Defaults to None.

Returns:

parameters and their properties.

Return type:

Tuple[ParamsLGSSM, ParamsLGSSM]

initial_distribution(params, inputs=None)[source]#

Return an initial distribution over latent states.

Parameters:
  • params (ParamsLGSSM) – model parameters \(\theta\)

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional inputs \(u_t\)

Returns:

distribution over initial latent state, \(p(z_1 \mid \theta)\).

Return type:

Distribution

transition_distribution(params, state, inputs=None)[source]#

Return a distribution over next latent state given current state.

Parameters:
  • params (ParamsLGSSM) – model parameters \(\theta\)

  • state (Float[Array, 'state_dim']) – current latent state \(z_t\)

  • inputs (Float[Array, 'ntime input_dim'] | None) – current inputs \(u_t\)

Returns:

conditional distribution of next latent state \(p(z_{t+1} \mid z_t, u_t, \theta)\).

Return type:

Distribution

emission_distribution(params, state, inputs=None)[source]#

Return a distribution over emissions given current state.

Parameters:
  • params (ParamsLGSSM) – model parameters \(\theta\)

  • state (Float[Array, 'state_dim']) – current latent state \(z_t\)

  • inputs (Float[Array, 'ntime input_dim'] | None) – current inputs \(u_t\)

Returns:

conditional distribution of current emission \(p(y_t \mid z_t, u_t, \theta)\)

Return type:

Distribution

sample(params, key, num_timesteps, inputs=None)[source]#

Sample states \(z_{1:T}\) and emissions \(y_{1:T}\) given parameters \(\theta\) and (optionally) inputs \(u_{1:T}\).

Parameters:
  • params (ParamsLGSSM) – model parameters \(\theta\)

  • key (Array) – random number generator

  • num_timesteps (int) – number of timesteps \(T\)

  • inputs (Float[Array, 'ntime input_dim'] | None) – inputs \(u_{1:T}\)

Returns:

latent states and emissions

Return type:

Tuple[Float[Array, ‘num_timesteps state_dim’], Float[Array, ‘num_timesteps emission_dim’]]

marginal_log_prob(params, emissions, inputs=None)[source]#

Compute log marginal likelihood of observations, \(\log \int p(y_{1:T}, z_{1:T} \mid \theta) \, \mathrm{d}z_{1:T}\).

Parameters:
  • params (ParamsLGSSM) – model parameters \(\theta\)

  • emissions (Float[Array, 'ntime emission_dim']) – emissions \(y_{1:T}\)

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional inputs \(u_{1:T}\)

Returns:

marginal log probability

Return type:

float | Float[Array, ‘’]

filter(params, emissions, inputs=None)[source]#

Compute filtering distributions, \(p(z_t \mid y_{1:t}, u_{1:t}, \theta)\) for \(t=1,\ldots,T\).

Parameters:
  • params (ParamsLGSSM) – model parameters \(\theta\)

  • emissions (Float[Array, 'ntime emission_dim']) – emissions \(y_{1:T}\)

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional inputs \(u_{1:T}\)

Returns:

filtering distributions

Return type:

PosteriorGSSMFiltered

smoother(params, emissions, inputs=None)[source]#

Compute smoothing distribution, \(p(z_t \mid y_{1:T}, u_{1:T}, \theta)\) for \(t=1,\ldots,T\).

Parameters:
  • params (ParamsLGSSM) – model parameters \(\theta\)

  • emissions (Float[Array, 'ntime emission_dim']) – emissions \(y_{1:T}\)

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional inputs \(u_{1:T}\)

Returns:

smoothing distributions

Return type:

PosteriorGSSMSmoothed

posterior_predictive(params, emissions, inputs=None)[source]#

Compute marginal posterior predictive smoothing distribution for each observation.

Parameters:
  • params (ParamsLGSSM) – model parameters.

  • emissions (Float[Array, 'ntime emission_dim']) – sequence of observations.

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional sequence of inputs.

Returns:

posterior predictive means \(\mathbb{E}[y_{t,d} \mid y_{1:T}]\) and standard deviations \(\mathrm{std}[y_{t,d} \mid y_{1:T}]\)

Return type:

Tuple[Float[Array, ‘ntime emission_dim’], Float[Array, ‘ntime emission_dim’]]
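In the scalar case, the posterior predictive moments follow from pushing the smoothed state moments through the emission model. The sketch below assumes smoothed means and variances are already available and that the predictive variance is \(H^2 P_t + R\); the helper name is hypothetical.

```python
import math

def posterior_predictive_1d(smoothed_means, smoothed_vars, H, d, R):
    """Return predictive means and standard deviations of y_t given y_{1:T}."""
    means = [H * mu + d for mu in smoothed_means]
    # State uncertainty mapped through H, plus observation noise.
    stds = [math.sqrt(H * H * var + R) for var in smoothed_vars]
    return means, stds

# Hypothetical smoothed moments for a two-step sequence.
pred_means, pred_stds = posterior_predictive_1d([0.8, 1.4], [0.4, 0.6], H=1.0, d=0.0, R=1.0)
```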

e_step(params, emissions, inputs=None)[source]#

Perform an E-step to compute expected sufficient statistics under the posterior, \(p(z_{1:T} \mid y_{1:T}, u_{1:T}, \theta)\).

Parameters:
  • params (ParamsLGSSM) – model parameters \(\theta\)

  • emissions (Float[Array, 'num_timesteps emission_dim'] | Float[Array, 'num_batches num_timesteps emission_dim']) – emissions \(y_{1:T}\)

  • inputs (Float[Array, 'num_timesteps input_dim'] | Float[Array, 'num_batches num_timesteps input_dim'] | None) – optional inputs \(u_{1:T}\)

Returns:

Expected sufficient statistics under the posterior.

Return type:

Tuple[SuffStatsLGSSM, float | Float[Array, ‘’]]

m_step(params, props, batch_stats, m_step_state)[source]#

Perform an M-step to find parameters that maximize the expected log joint probability.

Specifically, compute

\[\theta^\star = \mathrm{argmax}_\theta \; \mathbb{E}_{p(z_{1:T} \mid y_{1:T}, u_{1:T}, \theta)} \big[\log p(y_{1:T}, z_{1:T}, \theta \mid u_{1:T}) \big]\]

Parameters:
  • params (ParamsLGSSM) – model parameters \(\theta\)

  • props (ParamsLGSSM) – properties specifying which parameters should be learned

  • batch_stats (SuffStatsLGSSM) – sufficient statistics from each sequence

  • m_step_state (Any) – any required state for optimizing the model parameters.

Returns:

new parameters

Return type:

Tuple[ParamsLGSSM, Any]

Low-level inference#

lgssm_filter(params, emissions, inputs=None)[source]#

Run a Kalman filter to produce the marginal likelihood and filtered state estimates.

Parameters:
  • params (ParamsLGSSM) – model parameters

  • emissions (Float[Array, 'ntime emission_dim']) – array of observations.

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

Returns:

filtered posterior object

Return type:

PosteriorGSSMFiltered
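The recursion behind a Kalman filter can be sketched for a scalar model with no bias and no inputs as follows. This is plain Python written for illustration, with hypothetical names. It is not the library's code.

```python
import math

def kalman_filter_1d(ys, F, Q, H, R, m0, P0):
    """Scalar Kalman filter.

    Returns the marginal log likelihood, filtered means, and filtered variances.
    """
    loglik = 0.0
    means, variances = [], []
    pred_m, pred_P = m0, P0              # prior on z_1
    for y in ys:
        # Update: condition the predicted state on the observation y_t.
        S = H * pred_P * H + R           # innovation variance
        resid = y - H * pred_m           # innovation
        loglik += -0.5 * (math.log(2 * math.pi * S) + resid ** 2 / S)
        K = pred_P * H / S               # Kalman gain
        m = pred_m + K * resid
        P = (1 - K * H) * pred_P
        means.append(m)
        variances.append(P)
        # Predict: push the filtered estimate through the dynamics.
        pred_m, pred_P = F * m, F * P * F + Q
    return loglik, means, variances

loglik, means, variances = kalman_filter_1d(
    [1.0, 1.2, 0.9], F=1.0, Q=0.1, H=1.0, R=0.5, m0=0.0, P0=1.0
)
```

The marginal log likelihood accumulates the log density of each innovation, which is why filtering yields it as a by-product.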

lgssm_smoother(params, emissions, inputs=None)[source]#

Run forward-filtering, backward-smoothing to compute expectations under the posterior distribution on latent states. Technically, this implements the Rauch-Tung-Striebel (RTS) smoother.

Parameters:
  • params (ParamsLGSSM) – a ParamsLGSSM instance (or object with the same fields)

  • emissions (Float[Array, 'ntime emission_dim']) – array of observations.

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

Returns:

smoothed posterior object.

Return type:

PosteriorGSSMSmoothed
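The backward pass of an RTS smoother can be sketched for a scalar model as below. It takes filtered moments as given, and the example values are hypothetical. This is an illustration in plain Python, not the library's implementation.

```python
def rts_smoother_1d(filt_means, filt_vars, F, Q):
    """Backward pass of a scalar RTS smoother, given filtered means and variances."""
    T = len(filt_means)
    sm_means, sm_vars = list(filt_means), list(filt_vars)  # last step equals the filtered one
    for t in range(T - 2, -1, -1):
        pred_m = F * filt_means[t]               # E[z_{t+1} | y_{1:t}]
        pred_P = F * filt_vars[t] * F + Q        # Var[z_{t+1} | y_{1:t}]
        G = filt_vars[t] * F / pred_P            # smoother gain
        sm_means[t] = filt_means[t] + G * (sm_means[t + 1] - pred_m)
        sm_vars[t] = filt_vars[t] + G * (sm_vars[t + 1] - pred_P) * G
    return sm_means, sm_vars

# Hypothetical filtered moments for three time steps.
filt_means, filt_vars = [0.6, 0.9, 1.0], [0.33, 0.23, 0.2]
sm_means, sm_vars = rts_smoother_1d(filt_means, filt_vars, F=1.0, Q=0.1)
```

Each smoothed variance is no larger than the corresponding filtered variance, because the smoother also conditions on later observations.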

lgssm_posterior_sample(key, params, emissions, inputs=None, jitter=0)[source]#

Run forward-filtering, backward-sampling to draw samples from \(p(z_{1:T} \mid y_{1:T}, u_{1:T})\).

Parameters:
  • key (Array) – random number key.

  • params (ParamsLGSSM) – parameters.

  • emissions (Float[Array, 'ntime emission_dim']) – sequence of observations.

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional sequence of inputs.

  • jitter (float | Float[Array, ''] | None) – padding to add to the diagonal of the covariance matrix before sampling.

Returns:

one sample of \(z_{1:T}\) from the posterior distribution on latent states.

Return type:

Float[Array, 'ntime state_dim']
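A minimal sketch of the backward-sampling pass for a scalar model follows. It assumes the filtered moments are already available. The function name, the example values, and the way `jitter` is added to the conditional variance are assumptions made for illustration only.

```python
import random

def backward_sample_1d(filt_means, filt_vars, F, Q, rng, jitter=0.0):
    """Draw one trajectory z_{1:T} from the posterior, given filtered moments."""
    T = len(filt_means)
    zs = [0.0] * T
    # Sample the final state from the last filtered distribution.
    zs[-1] = rng.gauss(filt_means[-1], (filt_vars[-1] + jitter) ** 0.5)
    for t in range(T - 2, -1, -1):
        pred_P = F * filt_vars[t] * F + Q
        G = filt_vars[t] * F / pred_P
        # Condition the filtered distribution at time t on the sampled z_{t+1}.
        mean = filt_means[t] + G * (zs[t + 1] - F * filt_means[t])
        var = filt_vars[t] - G * pred_P * G + jitter
        zs[t] = rng.gauss(mean, var ** 0.5)
    return zs

# Hypothetical filtered moments; a seeded generator makes the draw reproducible.
zs = backward_sample_1d([0.6, 0.9, 1.0], [0.33, 0.23, 0.2], F=1.0, Q=0.1,
                        rng=random.Random(0))
```

Unlike the smoother, which returns marginal moments, this produces a single joint draw, so repeated calls with different random states give different trajectories.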

Types#

class ParamsLGSSM(initial, dynamics, emissions)[source]#

Parameters of a linear Gaussian SSM.

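Parameters:
  • initial (ParamsLGSSMInitial) – parameters of the initial distribution

  • dynamics (ParamsLGSSMDynamics) – parameters of the dynamics distribution

  • emissions (ParamsLGSSMEmissions) – parameters of the emission distribution
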
class ParamsLGSSMInitial(mean, cov)[source]#

Parameters of the initial distribution

\[p(z_1) = \mathcal{N}(z_1 \mid \mu_1, Q_1)\]

The tuple doubles as a container for the ParameterProperties.

Parameters:
  • mean – initial mean \(\mu_1\)

  • cov – initial covariance \(Q_1\)

class ParamsLGSSMDynamics(weights, bias, input_weights, cov)[source]#

Parameters of the dynamics distribution

\[p(z_{t+1} \mid z_t, u_t) = \mathcal{N}(z_{t+1} \mid F z_t + B u_t + b, Q)\]

The tuple doubles as a container for the ParameterProperties.

Parameters:
  • weights (ParameterProperties | Float[Array, 'state_dim state_dim'] | Float[Array, 'ntime state_dim state_dim']) – dynamics weights \(F\)

  • bias (ParameterProperties | Float[Array, 'state_dim'] | Float[Array, 'ntime state_dim']) – dynamics bias \(b\)

  • input_weights (ParameterProperties | Float[Array, 'state_dim input_dim'] | Float[Array, 'ntime state_dim input_dim']) – dynamics input weights \(B\)

  • cov (ParameterProperties | Float[Array, 'state_dim state_dim'] | Float[Array, 'ntime state_dim state_dim'] | Float[Array, 'state_dim_triu']) – dynamics covariance \(Q\)

class ParamsLGSSMEmissions(weights, bias, input_weights, cov)[source]#

Parameters of the emission distribution

\[p(y_t \mid z_t, u_t) = \mathcal{N}(y_t \mid H z_t + D u_t + d, R)\]

The tuple doubles as a container for the ParameterProperties.

Parameters:
  • weights (ParameterProperties | Float[Array, 'emission_dim state_dim'] | Float[Array, 'ntime emission_dim state_dim']) – emission weights \(H\)

  • bias (ParameterProperties | Float[Array, 'emission_dim'] | Float[Array, 'ntime emission_dim']) – emission bias \(d\)

  • input_weights (ParameterProperties | Float[Array, 'emission_dim input_dim'] | Float[Array, 'ntime emission_dim input_dim']) – emission input weights \(D\)

  • cov (ParameterProperties | Float[Array, 'emission_dim emission_dim'] | Float[Array, 'ntime emission_dim emission_dim'] | Float[Array, 'emission_dim'] | Float[Array, 'ntime emission_dim'] | Float[Array, 'emission_dim_triu']) – emission covariance \(R\)

class PosteriorGSSMFiltered(marginal_loglik, filtered_means=None, filtered_covariances=None, predicted_means=None, predicted_covariances=None)[source]#

Marginals of the Gaussian filtering posterior.

Parameters:
  • marginal_loglik (float | Float[Array, ''] | Float[Array, 'ntime']) – marginal log likelihood, \(\log p(y_{1:T} \mid u_{1:T})\)

  • filtered_means (Float[Array, 'ntime state_dim'] | None) – array of filtered means \(\mathbb{E}[z_t \mid y_{1:t}, u_{1:t}]\)

  • filtered_covariances (Float[Array, 'ntime state_dim state_dim'] | None) – array of filtered covariances \(\mathrm{Cov}[z_t \mid y_{1:t}, u_{1:t}]\)

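  • predicted_means (Float[Array, 'ntime state_dim'] | None) – array of predicted means \(\mathbb{E}[z_t \mid y_{1:t-1}, u_{1:t-1}]\)

  • predicted_covariances (Float[Array, 'ntime state_dim state_dim'] | None) – array of predicted covariances \(\mathrm{Cov}[z_t \mid y_{1:t-1}, u_{1:t-1}]\)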

class PosteriorGSSMSmoothed(marginal_loglik, filtered_means, filtered_covariances, smoothed_means, smoothed_covariances, smoothed_cross_covariances=None)[source]#

Marginals of the Gaussian filtering and smoothing posterior.

Parameters:
  • marginal_loglik (float | Float[Array, '']) – marginal log likelihood, \(\log p(y_{1:T} \mid u_{1:T})\)

  • filtered_means (Float[Array, 'ntime state_dim']) – array of filtered means \(\mathbb{E}[z_t \mid y_{1:t}, u_{1:t}]\)

  • filtered_covariances (Float[Array, 'ntime state_dim state_dim']) – array of filtered covariances \(\mathrm{Cov}[z_t \mid y_{1:t}, u_{1:t}]\)

  • smoothed_means (Float[Array, 'ntime state_dim']) – array of smoothed means \(\mathbb{E}[z_t \mid y_{1:T}, u_{1:T}]\)

  • smoothed_covariances (Float[Array, 'ntime state_dim state_dim']) – array of smoothed marginal covariances, \(\mathrm{Cov}[z_t \mid y_{1:T}, u_{1:T}]\)

  • smoothed_cross_covariances (Float[Array, 'ntime_minus1 state_dim state_dim'] | None) – array of smoothed cross products, \(\mathbb{E}[z_t z_{t+1}^T \mid y_{1:T}, u_{1:T}]\)

Nonlinear Gaussian SSM#

High-level class#

class NonlinearGaussianSSM(state_dim, emission_dim, input_dim=0)[source]#

Nonlinear Gaussian State Space Model.

The model is defined as follows

\[p(z_t \mid z_{t-1}, u_t) = \mathcal{N}(z_t \mid f(z_{t-1}, u_t), Q_t)\]
\[p(y_t \mid z_t, u_t) = \mathcal{N}(y_t \mid h(z_t, u_t), R_t)\]
\[p(z_1) = \mathcal{N}(z_1 \mid m, S)\]

where the variables and model parameters are

  • \(z_t\) = hidden variables of size state_dim

  • \(y_t\) = observed variables of size emission_dim

  • \(u_t\) = input covariates of size input_dim (defaults to 0)

  • \(f\) = dynamics (transition) function

  • \(h\) = emission (observation) function

  • \(Q\) = covariance matrix of dynamics (system) noise

  • \(R\) = covariance matrix for emission (observation) noise

  • \(m\) = mean of initial state

  • \(S\) = covariance matrix of initial state

These parameters of the model are stored in a separate object of type ParamsNLGSSM.

Parameters:
  • state_dim (int)

  • emission_dim (int)

  • input_dim (int)

property emission_shape#

Return a pytree matching the pytree of tuples specifying the shape of a single time step’s emissions.

For example, a GaussianHMM with \(D\)-dimensional emissions would return (D,).

property inputs_shape#

Return a pytree matching the pytree of tuples specifying the shape of a single time step’s inputs.

initial_distribution(params, inputs=None)[source]#

Return an initial distribution over latent states.

Parameters:
  • params (ParamsNLGSSM) – model parameters \(\theta\)

  • inputs (Float[Array, 'input_dim'] | None) – optional inputs \(u_t\)

Returns:

distribution over initial latent state, \(p(z_1 \mid \theta)\).

Return type:

Distribution

transition_distribution(params, state, inputs=None)[source]#

Return a distribution over next latent state given current state.

Parameters:
  • params (ParamsNLGSSM) – model parameters \(\theta\)

  • state (Float[Array, 'state_dim']) – current latent state \(z_t\)

  • inputs (Float[Array, 'input_dim'] | None) – current inputs \(u_t\)

Returns:

conditional distribution of next latent state \(p(z_{t+1} \mid z_t, u_t, \theta)\).

Return type:

Distribution

emission_distribution(params, state, inputs=None)[source]#

Return a distribution over emissions given current state.

Parameters:
  • params (ParamsNLGSSM) – model parameters \(\theta\)

  • state (Float[Array, 'state_dim']) – current latent state \(z_t\)

  • inputs (Float[Array, 'input_dim'] | None) – current inputs \(u_t\)

Returns:

conditional distribution of current emission \(p(y_t \mid z_t, u_t, \theta)\)

Return type:

Distribution

Low-level inference#

extended_kalman_filter(params, emissions, num_iter=1, inputs=None, output_fields=['filtered_means', 'filtered_covariances', 'predicted_means', 'predicted_covariances'])[source]#

Run an (iterated) extended Kalman filter to produce the marginal likelihood and filtered state estimates.

Parameters:
  • params (ParamsNLGSSM) – model parameters.

  • emissions (Float[Array, 'ntime emission_dim']) – observation sequence.

  • num_iter (int) – number of linearizations around posterior for update step (default 1).

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

  • output_fields (List[str] | None) – list of fields to return in posterior object. These can take the values “filtered_means”, “filtered_covariances”, “predicted_means”, “predicted_covariances”, and “marginal_loglik”.

Returns:

posterior object.

Return type:

PosteriorGSSMFiltered
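The role of num_iter can be illustrated with a scalar (iterated) EKF update step. The sketch below is plain Python with hypothetical names and a hand-written derivative, whereas an actual implementation would typically obtain the Jacobian by automatic differentiation. With num_iter=1 it reduces to the standard EKF update.

```python
import math

def ekf_update_1d(pred_m, pred_P, y, h, dh, R, num_iter=1):
    """(Iterated) EKF update for a scalar state.

    Each iteration relinearizes the emission function h around the latest
    posterior mean; the first linearization is around the predicted mean.
    """
    m = pred_m
    for _ in range(num_iter):
        Hj = dh(m)                          # derivative of h at the linearization point
        y_hat = h(m) + Hj * (pred_m - m)    # linearized prediction of the emission
        S = Hj * pred_P * Hj + R            # innovation variance
        K = pred_P * Hj / S                 # gain
        m, P = pred_m + K * (y - y_hat), pred_P - K * S * K
    return m, P

# Hypothetical example with emission function h(z) = sin(z).
m1, P1 = ekf_update_1d(0.5, 0.2, y=0.6, h=math.sin, dh=math.cos, R=0.1, num_iter=1)
m3, P3 = ekf_update_1d(0.5, 0.2, y=0.6, h=math.sin, dh=math.cos, R=0.1, num_iter=3)
```

Extra iterations matter most when the emission function is strongly curved near the predicted mean; for a linear function every iteration gives the same result.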

iterated_extended_kalman_filter(params, emissions, num_iter=2, inputs=None)[source]#

Run an iterated extended Kalman filter to produce the marginal likelihood and filtered state estimates.

Parameters:
  • params (ParamsNLGSSM) – model parameters.

  • emissions (Float[Array, 'ntime emission_dim']) – observation sequence.

  • num_iter (int) – number of linearizations around posterior for update step (default 2).

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

Returns:

posterior object.

Return type:

PosteriorGSSMFiltered

extended_kalman_smoother(params, emissions, filtered_posterior=None, inputs=None)[source]#

Run an extended Kalman (RTS) smoother.

Parameters:
  • params (ParamsNLGSSM) – model parameters.

  • emissions (Float[Array, 'ntime emission_dim']) – observation sequence.

  • filtered_posterior (PosteriorGSSMFiltered | None) – optional output from filtering step.

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

Returns:

posterior object.

Return type:

PosteriorGSSMSmoothed

iterated_extended_kalman_smoother(params, emissions, num_iter=2, inputs=None)[source]#

Run an iterated extended Kalman smoother (IEKS).

Parameters:
  • params (ParamsNLGSSM) – model parameters.

  • emissions (Float[Array, 'ntime emission_dim']) – observation sequence.

  • num_iter (int) – number of linearizations around posterior for update step (default 2).

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

Returns:

posterior object.

Return type:

PosteriorGSSMSmoothed

unscented_kalman_filter(params, emissions, hyperparams, inputs=None, output_fields=['filtered_means', 'filtered_covariances', 'predicted_means', 'predicted_covariances'])[source]#

Run an unscented Kalman filter to produce the marginal likelihood and filtered state estimates.

Parameters:
  • params (ParamsNLGSSM) – model parameters.

  • emissions (Float[Array, 'ntime emission_dim']) – array of observations.

  • hyperparams (UKFHyperParams) – hyper-parameters.

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

  • output_fields (List[str] | None) – list of fields to return in posterior object.

Returns:

posterior object.

Return type:

PosteriorGSSMFiltered
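The building block of an unscented Kalman filter is the unscented transform, which propagates a Gaussian through a nonlinear function using a small set of sigma points. Below is a scalar sketch in plain Python. The scaling parameters alpha, beta, and kappa follow the common textbook parameterization, and whether they correspond one-to-one to the fields of UKFHyperParams is an assumption.

```python
import math

def unscented_transform_1d(m, P, g, alpha=1.0, beta=0.0, kappa=2.0):
    """Approximate the mean and variance of g(z) for z ~ N(m, P), scalar z."""
    n = 1                                        # state dimension
    lam = alpha ** 2 * (n + kappa) - n           # scaling parameter
    spread = math.sqrt((n + lam) * P)
    points = [m, m + spread, m - spread]         # 2n + 1 sigma points
    w_mean = [lam / (n + lam), 0.5 / (n + lam), 0.5 / (n + lam)]
    w_cov = [w_mean[0] + (1 - alpha ** 2 + beta)] + w_mean[1:]
    gs = [g(z) for z in points]
    mean = sum(w * v for w, v in zip(w_mean, gs))
    var = sum(w * (v - mean) ** 2 for w, v in zip(w_cov, gs))
    return mean, var

# A linear function is handled exactly: mean 2m + 1, variance 4P.
lin_mean, lin_var = unscented_transform_1d(0.5, 0.2, lambda z: 2.0 * z + 1.0)
# For g(z) = z^2 the transformed mean matches the exact value m^2 + P.
sq_mean, _ = unscented_transform_1d(0.5, 0.2, lambda z: z * z)
```

A filter applies this transform twice per step: once through the dynamics function for prediction and once through the emission function for the update.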

unscented_kalman_smoother(params, emissions, hyperparams, inputs=None)[source]#

Run an unscented Kalman (RTS) smoother.

Parameters:
  • params (ParamsNLGSSM) – model parameters.

  • emissions (Float[Array, 'ntime emission_dim']) – array of observations.

  • hyperparams (UKFHyperParams) – hyper-parameters.

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional inputs.

Returns:

posterior object.

Return type:

PosteriorGSSMSmoothed

Types#

class ParamsNLGSSM(initial_mean, initial_covariance, dynamics_function, dynamics_covariance, emission_function, emission_covariance)[source]#

Parameters of a nonlinear Gaussian SSM (NLGSSM).

\[p(z_t \mid z_{t-1}, u_t) = \mathcal{N}(z_t \mid f(z_{t-1}, u_t), Q_t)\]
\[p(y_t \mid z_t, u_t) = \mathcal{N}(y_t \mid h(z_t, u_t), R_t)\]
\[p(z_1) = \mathcal{N}(z_1 \mid m, S)\]

If you have no inputs, the dynamics and emission functions do not need to take \(u_t\) as an argument.

Parameters:
  • dynamics_function (Callable[[Float[Array, 'state_dim']], Float[Array, 'state_dim']] | Callable[[Float[Array, 'state_dim'], Float[Array, 'input_dim']], Float[Array, 'state_dim']]) – \(f\)

  • dynamics_covariance (Float[Array, 'state_dim state_dim']) – \(Q\)

  • emission_function (Callable[[Float[Array, 'state_dim']], Float[Array, 'emission_dim']] | Callable[[Float[Array, 'state_dim'], Float[Array, 'input_dim']], Float[Array, 'emission_dim']]) – \(h\)

  • emission_covariance (Float[Array, 'emission_dim emission_dim']) – \(R\)

  • initial_mean (Float[Array, 'state_dim']) – \(m\)

  • initial_covariance (Float[Array, 'state_dim state_dim']) – \(S\)
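To show how these pieces fit together as a generative model, here is a scalar sketch in plain Python that samples from a nonlinear Gaussian SSM with no inputs. The function name and the particular choices of \(f\) and \(h\) are hypothetical. With no inputs, both callables take only the state, matching the first alternative in the signatures above.

```python
import math
import random

def sample_nlgssm_1d(f, h, Q, R, m, S, num_timesteps, rng):
    """Sample states and emissions from a scalar nonlinear Gaussian SSM."""
    states, emissions = [], []
    z = rng.gauss(m, math.sqrt(S))                        # z_1 ~ N(m, S)
    for _ in range(num_timesteps):
        states.append(z)
        emissions.append(rng.gauss(h(z), math.sqrt(R)))   # y_t ~ N(h(z_t), R)
        z = rng.gauss(f(z), math.sqrt(Q))                 # z_{t+1} ~ N(f(z_t), Q)
    return states, emissions

# Hypothetical dynamics and emission functions.
states, emissions = sample_nlgssm_1d(
    f=lambda z: 0.9 * math.sin(z), h=lambda z: z ** 2,
    Q=0.1, R=0.05, m=0.0, S=1.0, num_timesteps=5, rng=random.Random(0),
)
```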

Generalized Gaussian SSM#

High-level class#

class GeneralizedGaussianSSM(state_dim, emission_dim, input_dim=0)[source]#

Generalized Gaussian State Space Model.

The model is defined as follows

\[p(z_t \mid z_{t-1}, u_t) = \mathcal{N}(z_t \mid f(z_{t-1}, u_t), Q_t)\]
\[p(y_t \mid z_t, u_t) = q(y_t \mid h(z_t, u_t), R(z_t, u_t))\]
\[p(z_1) = \mathcal{N}(z_1 \mid m, S)\]

where the variables and model parameters are

  • \(z_t\) = hidden variables of size state_dim

  • \(y_t\) = observed variables of size emission_dim

  • \(u_t\) = input covariates of size input_dim (defaults to 0)

  • \(f\) = dynamics (transition) function

  • \(h\) = emission (observation) function

  • \(Q\) = covariance matrix of dynamics (system) noise

  • \(R\) = covariance function for emission (observation) noise

  • \(m\) = mean of initial state

  • \(S\) = covariance matrix of initial state

The parameters of the model are stored in a separate object of type ParamsGGSSM.

For example usage, see probml/dynamax.

property emission_shape#

Return a pytree matching the pytree of tuples specifying the shape of a single time step’s emissions.

For example, a GaussianHMM with \(D\)-dimensional emissions would return (D,).

initial_distribution(params, inputs=None)[source]#

Return an initial distribution over latent states.

Parameters:
  • params (ParamsGGSSM) – model parameters \(\theta\)

  • inputs (Float[Array, 'input_dim'] | None) – optional inputs \(u_t\)

Returns:

distribution over initial latent state, \(p(z_1 \mid \theta)\).

Return type:

Distribution

transition_distribution(params, state, inputs=None)[source]#

Return a distribution over next latent state given current state.

Parameters:
  • params (ParamsGGSSM) – model parameters \(\theta\)

  • state (Float[Array, 'state_dim']) – current latent state \(z_t\)

  • inputs (Float[Array, 'input_dim'] | None) – current inputs \(u_t\)

Returns:

conditional distribution of next latent state \(p(z_{t+1} \mid z_t, u_t, \theta)\).

Return type:

Distribution

emission_distribution(params, state, inputs=None)[source]#

Return a distribution over emissions given current state.

Parameters:
  • params (ParamsGGSSM) – model parameters \(\theta\)

  • state (Float[Array, 'state_dim']) – current latent state \(z_t\)

  • inputs (Float[Array, 'input_dim'] | None) – current inputs \(u_t\)

Returns:

conditional distribution of current emission \(p(y_t \mid z_t, u_t, \theta)\)

Return type:

Distribution

Low-level inference#

conditional_moments_gaussian_filter(model_params, inf_params, emissions, num_iter=1, inputs=None)[source]#

Run an (iterated) conditional moments Gaussian filter to produce the marginal likelihood and filtered state estimates.

Parameters:
  • model_params (ParamsGGSSM) – model parameters.

  • inf_params (EKFIntegrals | UKFIntegrals | GHKFIntegrals) – inference parameters that specify how to compute moments.

  • emissions (Float[Array, 'ntime emission_dim']) – array of observations.

  • num_iter (int) – optional number of linearizations around prior/posterior for update step (default 1).

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

Returns:

posterior object.

Return type:

PosteriorGSSMFiltered
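The idea of computing moments of the emission under the predicted state distribution, then applying a Gaussian update, can be sketched for a scalar state as follows. This illustration uses a fixed 3-point Gauss-Hermite rule to evaluate the integrals. The function name and the Poisson-style example are hypothetical, and this is not the library's implementation.

```python
import math

def cmgf_update_1d(pred_m, pred_P, y, h, R_fn):
    """Single conditional-moments Gaussian update for a scalar state.

    h(z) is the conditional mean of y given z, R_fn(z) its conditional variance.
    Integrals over z ~ N(pred_m, pred_P) use 3-point Gauss-Hermite quadrature.
    """
    s = math.sqrt(3.0 * pred_P)
    points = [pred_m, pred_m + s, pred_m - s]
    weights = [2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0]
    y_hat = sum(w * h(z) for w, z in zip(weights, points))               # E[y]
    S = sum(w * (R_fn(z) + (h(z) - y_hat) ** 2)                          # Var[y]
            for w, z in zip(weights, points))
    C = sum(w * (z - pred_m) * (h(z) - y_hat)                            # Cov[z, y]
            for w, z in zip(weights, points))
    K = C / S
    return pred_m + K * (y - y_hat), pred_P - K * S * K

# Poisson-style emissions: conditional mean and variance are both exp(z).
m_post, P_post = cmgf_update_1d(0.0, 0.5, y=3.0, h=math.exp, R_fn=math.exp)
# Sanity check: with linear-Gaussian emissions this matches a Kalman update.
m_lin, P_lin = cmgf_update_1d(0.0, 1.0, y=1.0, h=lambda z: 2.0 * z, R_fn=lambda z: 0.5)
```

Swapping the quadrature rule for a first-order linearization or for sigma points changes only how the three moments are computed, which is the choice that the inf_params argument selects.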

iterated_conditional_moments_gaussian_filter(model_params, inf_params, emissions, num_iter=2, inputs=None)[source]#

Run an iterated conditional moments Gaussian filter.

Parameters:
  • model_params (ParamsGGSSM) – model parameters.

  • inf_params (EKFIntegrals | UKFIntegrals | GHKFIntegrals) – inference parameters that specify how to compute moments.

  • emissions (Float[Array, 'ntime emission_dim']) – array of observations.

  • num_iter (int) – optional number of linearizations around prior/posterior for update step (default 2).

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

Returns:

posterior object.

Return type:

PosteriorGSSMFiltered

conditional_moments_gaussian_smoother(model_params, inf_params, emissions, filtered_posterior=None, inputs=None)[source]#

Run a conditional moments Gaussian smoother.

Parameters:
  • model_params (ParamsGGSSM) – model parameters.

  • inf_params (EKFIntegrals | UKFIntegrals | GHKFIntegrals) – inference parameters that specify how to compute moments.

  • emissions (Float[Array, 'ntime emission_dim']) – array of observations.

  • filtered_posterior (PosteriorGSSMFiltered | None) – optional output from filtering step.

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

Returns:

posterior object.

Return type:

PosteriorGSSMSmoothed

iterated_conditional_moments_gaussian_smoother(model_params, inf_params, emissions, num_iter=2, inputs=None)[source]#

Run an iterated conditional moments Gaussian smoother.

Parameters:
  • model_params (ParamsGGSSM) – model parameters.

  • inf_params (EKFIntegrals | UKFIntegrals | GHKFIntegrals) – inference parameters that specify how to compute moments.

  • emissions (Float[Array, 'ntime emission_dim']) – array of observations.

  • num_iter (int) – optional number of linearizations around prior/posterior for update step (default 2).

  • inputs (Float[Array, 'ntime input_dim'] | None) – optional array of inputs.

Returns:

posterior object.

Return type:

PosteriorGSSMSmoothed

Types#

class ParamsGGSSM(initial_mean, initial_covariance, dynamics_function, dynamics_covariance, emission_mean_function, emission_cov_function, emission_dist=<function ParamsGGSSM.<lambda>>)[source]#

Container for Generalized Gaussian SSM parameters. Specifically, it defines the following model:

\[p(z_t \mid z_{t-1}, u_t) = \mathcal{N}(z_t \mid f(z_{t-1}, u_t), Q_t)\]
\[p(y_t \mid z_t, u_t) = q(y_t \mid h(z_t, u_t), R(z_t, u_t))\]
\[p(z_1) = \mathcal{N}(z_1 \mid m, S)\]

This differs from NLGSSM by allowing a general emission model. If you have no inputs, the dynamics and emission functions do not need to take \(u_t\) as an argument.

Parameters:
  • initial_mean (Float[Array, 'state_dim']) – \(m\)

  • initial_covariance (Float[Array, 'state_dim state_dim']) – \(S\)

  • dynamics_function (Callable[[Float[Array, 'state_dim']], Float[Array, 'state_dim']] | Callable[[Float[Array, 'state_dim'], Float[Array, 'input_dim']], Float[Array, 'state_dim']]) – \(f\). This has the signature \(f: Z \times U \to Z\) or \(f: Z \to Z\).

  • dynamics_covariance (Float[Array, 'state_dim state_dim']) – \(Q\)

  • emission_mean_function (Callable[[Float[Array, 'state_dim']], Float[Array, 'emission_dim']] | Callable[[Float[Array, 'state_dim'], Float[Array, 'input_dim']], Float[Array, 'emission_dim']]) – \(h\). This has the signature \(h: Z \times U \to Y\) or \(h: Z \to Y\).

  • emission_cov_function (Callable[[Float[Array, 'state_dim']], Float[Array, 'emission_dim emission_dim']] | Callable[[Float[Array, 'state_dim'], Float[Array, 'input_dim']], Float[Array, 'emission_dim emission_dim']]) – \(R\). This has the signature \(R: Z \times U \to Y \times Y\) or \(R: Z \to Y \times Y\).

  • emission_dist (Callable[[Float[Array, 'state_dim'], Float[Array, 'state_dim state_dim']], Distribution]) – the observation distribution \(q\). This is a callable that takes the predicted mean and covariance of Y, and returns a tfp distribution object: \(q: Y \times (Y \times Y) \to \mathrm{Dist}(Y)\).

Utilities#

find_permutation(z1, z2)[source]#

Find the permutation of the state labels in sequence z1 so that they best align with the labels in z2.

Parameters:
  • z1 (Int[Array, 'num_timesteps']) – The first state vector.

  • z2 (Int[Array, 'num_timesteps']) – The second state vector.

Returns:

permutation such that jnp.take(perm, z1) best aligns with z2. Because every label in z1 must index into perm, len(perm) = max(z1.max(), z2.max()) + 1.
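As an illustration of what "best aligns" means, the brute-force sketch below tries every relabeling and keeps the one with the largest overlap. It is plain Python with a hypothetical function name, it assumes perm has one entry per label, and it is practical only for a handful of states. An efficient implementation would more likely solve a linear assignment problem on the label overlap matrix.

```python
from itertools import permutations

def find_permutation_bruteforce(z1, z2):
    """Return perm maximizing the number of time steps where perm[z1[t]] == z2[t]."""
    num_states = max(max(z1), max(z2)) + 1   # one entry per label (assumption of this sketch)
    best = max(
        permutations(range(num_states)),
        key=lambda perm: sum(perm[a] == b for a, b in zip(z1, z2)),
    )
    return list(best)

# The two sequences describe the same segmentation with relabeled states.
z1 = [0, 0, 1, 1, 2, 2]
z2 = [1, 1, 2, 2, 0, 0]
perm = find_permutation_bruteforce(z1, z2)
relabeled = [perm[a] for a in z1]            # plays the role of jnp.take(perm, z1)
```

This kind of relabeling is useful when comparing inferred discrete states with ground truth, since state labels in a latent variable model are identifiable only up to permutation.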