Approximate Bayesian Computation

BiochemNetABC.abc_smc — Method

abc_smc(pm::ParametricModel, l_obs, func_dist; nbr_particles, alpha, kernel_type, NT,
        duration_time, bound_sim, sym_var_aut, verbose)

Run the ABC-SMC algorithm with the pm parametric model.

func_dist(l_sim, l_obs) is the distance function between the simulations and the observations; it corresponds to $\rho(\eta(y_{sim}), \eta(y_{exp}))$. l_obs::Vector{<:T2} is a collection of observations. func_dist must have a signature of the form func_dist(l_sim::Vector{T1}, l_obs::Vector{T2}).

If pm is defined on a ContinuousTimeModel, then T1 must satisfy T1 <: Trajectory.

Note (distance function and distributed ABC): if you use abc_smc with multiple workers, func_dist has to be defined on every worker by using @everywhere.
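
As an illustration of the expected signature, here is a minimal sketch of a distance function defined with @everywhere. The name mean_euclidean_dist and the use of plain Vector{Float64} samples are assumptions made for this example; with a ContinuousTimeModel, l_sim would hold Trajectory objects and the body would need to read values from them.

```julia
using Distributed  # standard library; provides @everywhere

# Hypothetical distance: mean Euclidean distance between paired samples.
# Each simulation and observation is assumed to be a Vector{Float64}.
# @everywhere makes the definition available on every worker process
# (with a single process it simply defines the function locally).
@everywhere function mean_euclidean_dist(l_sim::Vector{Vector{Float64}},
                                         l_obs::Vector{Vector{Float64}})
    @assert length(l_sim) == length(l_obs)
    total = 0.0
    for (sim, obs) in zip(l_sim, l_obs)
        total += sqrt(sum(abs2, sim .- obs))
    end
    return total / length(l_obs)
end

# Toy data: the first pair matches exactly, the second differs by (3, 4).
l_obs = [[1.0, 2.0], [3.0, 4.0]]
l_sim = [[1.0, 2.0], [0.0, 0.0]]
d = mean_euclidean_dist(l_sim, l_obs)
```

A function of this shape could then be passed as the func_dist argument, provided its element types match what the model actually simulates.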


BiochemNetABC.abc_model_choice_dataset — Method

abc_model_choice_dataset(models,
                         summary_stats_observations,
                         summary_stats_func::Function, distance_func::Function,
                         k::Int, N_ref::Int; dir_results::Union{Nothing,String} = nothing)

Creates a reference table for ABC model choice with a discrete uniform prior distribution over the models.


BiochemNetABC.abc_model_choice_dataset — Method

abc_model_choice_dataset(models, models_prior,
                         summary_stats_observations,
                         summary_stats_func::Function, distance_func::Function,
                         k::Int, N_ref::Int; dir_results::Union{Nothing,String} = nothing)

Creates a reference table for ABC model choice.

The mandatory arguments are:

models is a list of objects inherited from Model or ParametricModel,
models_prior: the prior over the models (by default: discrete uniform distribution)
summary_stats_observations are the summary statistics of the observations,
summary_stats_func::Function: the function that computes the summary statistics over a model simulation,
distance_func: the distance function over the summary statistics space,
N_ref: the number of samples in the reference table,
k: the k nearest samples from the observations to keep in the reference table (k < N_ref).

The result is an AbcModelChoiceDataset with fields:

summary_stats_matrix: the (Nstats, Nref) features matrix. Accessible via .X.
summary_stats_observations: the observations used for simulating the dataset.
models_indexes: the labels vector. Accessible via .y.

If specified, dir_results is the directory where the summary statistics matrix and associated models are stored (CSV).
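
To make the roles of N_ref, k, the summary statistics and the distance concrete, here is a self-contained sketch of how a reference table of this kind can be built. It is not the package's implementation: the toy simulators, toy_reference_table, and every helper name below are hypothetical, and only the general idea (draw a model from the prior, simulate, summarize, keep the k nearest samples) comes from the description above.

```julia
using Random

# Toy stand-ins for models (hypothetical): each returns one simulated sample.
sim_model_1(rng) = randn(rng, 50)          # draws from N(0, 1)
sim_model_2(rng) = 2.0 .+ randn(rng, 50)   # draws from N(2, 1)
toy_models = [sim_model_1, sim_model_2]

# Hypothetical summary statistics (mean and range) and distance between them.
summary_stats(x) = [sum(x) / length(x), maximum(x) - minimum(x)]
distance(s1, s2) = sqrt(sum(abs2, s1 .- s2))

function toy_reference_table(rng, models, models_prior, ss_obs, N_ref, k)
    X = zeros(length(ss_obs), N_ref)   # (Nstats, Nref) features matrix
    y = zeros(Int, N_ref)              # labels vector (model indexes)
    dists = zeros(N_ref)
    cum_prior = cumsum(models_prior)
    for i in 1:N_ref
        # Draw a model index from the prior by inverting its CDF.
        m = min(searchsortedfirst(cum_prior, rand(rng)), length(models))
        s = summary_stats(models[m](rng))
        X[:, i] = s
        y[i] = m
        dists[i] = distance(s, ss_obs)
    end
    # Keep only the k samples closest to the observed summary statistics.
    keep = partialsortperm(dists, 1:k)
    return X[:, keep], y[keep]
end

rng = MersenneTwister(1234)
ss_obs = summary_stats(sim_model_2(rng))
X, y = toy_reference_table(rng, toy_models, [0.5, 0.5], ss_obs, 200, 50)
```

Here X plays the role of the features matrix and y that of the labels vector, each restricted to the k = 50 retained samples out of N_ref = 200 simulations.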


BiochemNetABC.posterior_proba_model — Method

posterior_proba_model(rf_abc::RandomForestABC)

Estimates the posterior probability of the model $P(M = \widehat{M}(s_{obs}) | s_{obs})$ with the Random Forest ABC method.


BiochemNetABC.rf_abc_model_choice — Method

rf_abc_model_choice(abc_trainset;
                    k::Int = N_ref, distance_func::Function = (x,y) -> 1, 
                    hyperparameters_range::Dict)

Run the Random Forest Approximate Bayesian Computation model choice method with an already simulated dataset.

The mandatory arguments are:

abc_trainset: an already simulated dataset, created with abc_model_choice_dataset.

The optional arguments are:

hyperparameters_range: a dict of hyperparameter value ranges for the cross-validation fit of the Random Forest (by default: Dict(:n_estimators => [200], :min_samples_leaf => [1], :min_samples_split => [2])). See the scikit-learn documentation of RandomForestClassifier for the hyperparameter names.

The result is a RandomForestABC object with fields:

reference_table: an AbcModelChoiceDataset that corresponds to the reference table of the algorithm,
clf: a random forest classifier (PyObject from scikit-learn),
summary_stats_observations: the summary statistics of the observations,
estim_model: the underlying model of the observations inferred with the RF-ABC method.


BiochemNetABC.rf_abc_model_choice — Method

rf_abc_model_choice(models, summary_stats_observations,
                    summary_stats_func::Function, N_ref::Int;
                    k::Int = N_ref, distance_func::Function = (x,y) -> 1, 
                    hyperparameters_range::Dict)

Run the Random Forest Approximate Bayesian Computation model choice method.

The mandatory arguments are:

models is a list of objects inherited from Model or ParametricModel,
summary_stats_observations are the summary statistics of the observations,
N_ref: the number of samples in the reference table.
summary_stats_func::Function: the function that computes the summary statistics over a model simulation.

The optional arguments are:

models_prior: the prior over the models (by default: discrete uniform distribution)
k: the k nearest samples from the observations to keep in the reference table (by default: k = N_ref)
distance_func: the distance function; it has to be defined if k < N_ref
hyperparameters_range: a dict of hyperparameter value ranges for the cross-validation fit of the Random Forest (by default: Dict(:n_estimators => [200], :min_samples_leaf => [1], :min_samples_split => [2])). See the scikit-learn documentation of RandomForestClassifier for the hyperparameter names.
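
The hyperparameters_range dict maps each hyperparameter name to a list of candidate values. The sketch below shows how such ranges expand into individual candidate settings, assuming an exhaustive grid search; whether the package's cross-validation enumerates candidates this way is an assumption. The values other than the documented defaults, and the names param_names, value_lists and grid, are illustrative only.

```julia
# Hypothetical ranges, using the same keys as the documented default.
hyperparameters_range = Dict(:n_estimators => [100, 200],
                             :min_samples_leaf => [1, 5],
                             :min_samples_split => [2])

# Fix an ordering of the names and collect the matching value lists.
param_names = collect(keys(hyperparameters_range))
value_lists = [hyperparameters_range[p] for p in param_names]

# Enumerate every combination of values, one Dict per candidate setting.
grid = vec([Dict(zip(param_names, vals))
            for vals in Iterators.product(value_lists...)])
n_candidates = length(grid)
```

With these ranges the grid holds 2 x 2 x 1 = 4 candidate settings, so wider ranges multiply the number of forests a grid search would fit.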

The result is a RandomForestABC object with fields:

reference_table: an AbcModelChoiceDataset that corresponds to the reference table of the algorithm,
clf: a random forest classifier (PyObject from scikit-learn),
summary_stats_observations: the summary statistics of the observations,
estim_model: the underlying model of the observations inferred with the RF-ABC method.
