Approximate Bayesian Computation related methods
BiochemNetABC.abc_smc — Method
abc_smc(pm::ParametricModel, l_obs, func_dist; nbr_particles, alpha, kernel_type, NT, duration_time, bound_sim, sym_var_aut, verbose)
Run the ABC-SMC algorithm with the parametric model pm.
func_dist(l_sim, l_obs) is the distance function between simulations and observations; it corresponds to $\rho(\eta(y_{sim}), \eta(y_{exp}))$. l_obs::Vector{<:T2} is a collection of observations. func_dist must have a signature of the form func_dist(l_sim::Vector{T1}, l_obs::Vector{T2}).
If pm is defined on a ContinuousTimeModel, then T1 should satisfy T1 <: Trajectory.
Note (distance function and distributed ABC): if you use abc_smc with multiple workers, func_dist has to be defined on every worker by using @everywhere.
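The signature required for func_dist can be illustrated with a minimal sketch. The name mean_sq_dist and the use of plain Vector{Float64} values as simulations and observations are assumptions made for illustration; with a ContinuousTimeModel, l_sim would hold Trajectory objects instead. The @everywhere prefix defines the function on every available process, which is what a distributed run needs.

```julia
using Distributed  # standard library; provides @everywhere

# Hypothetical distance function: mean squared error, averaged over paired
# simulations and observations. Each element is a plain vector of sampled
# values here, standing in for whatever the model actually simulates.
@everywhere function mean_sq_dist(l_sim::Vector{Vector{Float64}},
                                  l_obs::Vector{Vector{Float64}})
    total = 0.0
    for (y_sim, y_obs) in zip(l_sim, l_obs)
        total += sum(abs2, y_sim .- y_obs) / length(y_obs)
    end
    return total / length(l_obs)
end

l_obs = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
l_sim = [[1.0, 2.0, 5.0], [0.0, 1.0, 0.0]]
d = mean_sq_dist(l_sim, l_obs)
```

A function of this shape would then be passed as the third positional argument of abc_smc.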
BiochemNetABC.abc_model_choice_dataset — Method
abc_model_choice_dataset(models,
    summary_stats_observations,
    summary_stats_func::Function, distance_func::Function,
    k::Int, N_ref::Int; dir_results::Union{Nothing,String} = nothing)
Creates a reference table for ABC model choice with a discrete uniform prior distribution over the models.
BiochemNetABC.abc_model_choice_dataset — Method
abc_model_choice_dataset(models, models_prior,
    summary_stats_observations,
    summary_stats_func::Function, distance_func::Function,
    k::Int, N_ref::Int; dir_results::Union{Nothing,String} = nothing)
Creates a reference table for ABC model choice.
The mandatory arguments are:
- models: a list of objects that inherit from Model or ParametricModel,
- models_prior: the prior over the models (by default: discrete uniform distribution),
- summary_stats_observations: the summary statistics of the observations,
- summary_stats_func::Function: the function that computes the summary statistics over a model simulation,
- distance_func: the distance function over the summary statistics space,
- N_ref: the number of samples in the reference table,
- k: the k nearest samples from the observations to keep in the reference table (k < N_ref).
The result is an AbcModelChoiceDataset with fields:
- summary_stats_matrix: the (Nstats, Nref) features matrix. Accessible via .X.
- summary_stats_observations: the observations used for simulating the dataset.
- models_indexes: the labels vector. Accessible via .y.
If specified, dir_results is the directory where the summary statistics matrix and associated models are stored (CSV).
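The role of k and distance_func can be sketched independently of the package. The helper keep_k_nearest below is hypothetical and is not the package's implementation; it only shows one way to keep the k columns of a (Nstats, Nref) matrix that are closest to the observed summary statistics, together with their model labels.

```julia
# Hypothetical helper (not part of the package): keep the k reference samples
# whose summary statistics are closest to those of the observations.
function keep_k_nearest(X::Matrix{Float64}, y::Vector{Int},
                        s_obs::Vector{Float64}, distance_func::Function, k::Int)
    n_ref = size(X, 2)
    @assert k <= n_ref
    # One distance per reference sample (one sample per column of X).
    dists = [distance_func(X[:, j], s_obs) for j in 1:n_ref]
    kept = sortperm(dists)[1:k]  # indices of the k smallest distances
    return X[:, kept], y[kept]
end

# Euclidean distance over the summary statistics space.
euclid(a, b) = sqrt(sum(abs2, a .- b))

X = [0.0 1.0 5.0 0.9;
     0.0 1.0 5.0 1.1]   # 2 statistics, 4 reference samples
y = [1, 2, 1, 2]        # index of the model that produced each column
s_obs = [1.0, 1.0]

X_k, y_k = keep_k_nearest(X, y, s_obs, euclid, 2)
```

Here the two retained columns are the second and fourth, both simulated from model 2.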
BiochemNetABC.posterior_proba_model — Method
posterior_proba_model(rf_abc::RandomForestABC)
Estimates the posterior probability of the model $P(M = \widehat{M}(s_{obs}) | s_{obs})$ with the Random Forest ABC method.
BiochemNetABC.rf_abc_model_choice — Method
rf_abc_model_choice(abc_trainset;
    k::Int = N_ref, distance_func::Function = (x,y) -> 1,
    hyperparameters_range::Dict)
Run the Random Forest Approximate Bayesian Computation model choice method with an already simulated dataset.
The mandatory arguments are:
- abc_trainset: a dataset already simulated with abc_model_choice_dataset.
The optional arguments are:
- hyperparameters_range: a dict with the ranges of hyperparameter values for the cross-validation fit of the Random Forest (by default: Dict(:n_estimators => [200], :min_samples_leaf => [1], :min_samples_split => [2])). See the scikit-learn documentation of RandomForestClassifier for the hyperparameter names.
The result is a RandomForestABC object with fields:
- reference_table: an AbcModelChoiceDataset that corresponds to the reference table of the algorithm,
- clf: a random forest classifier (PyObject from scikit-learn),
- summary_stats_observations: the summary statistics of the observations,
- estim_model: the underlying model of the observations inferred with the RF-ABC method.
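The shape of a hyperparameters_range dict is easiest to see on an example. The sketch below reproduces the default quoted above and builds a wider, purely illustrative range. The helper grid_size is hypothetical, and it assumes the cross-validation explores every combination of the listed values, which the docstring does not state.

```julia
# Default range quoted in the docstring: one candidate value per hyperparameter.
default_range = Dict(:n_estimators => [200],
                     :min_samples_leaf => [1],
                     :min_samples_split => [2])

# Hypothetical wider range; the values are illustrative, not recommendations.
wider_range = Dict(:n_estimators => [100, 200, 500],
                   :min_samples_leaf => [1, 5],
                   :min_samples_split => [2, 10])

# Hypothetical helper: number of combinations in an exhaustive grid
# (assumption: every combination of the listed values is tried).
grid_size(r::Dict) = prod(length(v) for v in values(r))

n_default = grid_size(default_range)
n_wider = grid_size(wider_range)
```

Under that assumption the default range amounts to a single fit configuration, while the wider range would multiply the number of configurations to cross-validate.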
BiochemNetABC.rf_abc_model_choice — Method
rf_abc_model_choice(models, summary_stats_observations,
    summary_stats_func::Function, N_ref::Int;
    k::Int = N_ref, distance_func::Function = (x,y) -> 1,
    hyperparameters_range::Dict)
Run the Random Forest Approximate Bayesian Computation model choice method.
The mandatory arguments are:
- models: a list of objects that inherit from Model or ParametricModel,
- summary_stats_observations: the summary statistics of the observations,
- N_ref: the number of samples in the reference table,
- summary_stats_func::Function: the function that computes the summary statistics over a model simulation.
The optional arguments are:
- models_prior: the prior over the models (by default: discrete uniform distribution),
- k: the k nearest samples from the observations to keep in the reference table (by default: k = N_ref),
- distance_func: the distance function; it has to be defined if k < N_ref,
- hyperparameters_range: a dict with the ranges of hyperparameter values for the cross-validation fit of the Random Forest (by default: Dict(:n_estimators => [200], :min_samples_leaf => [1], :min_samples_split => [2])). See the scikit-learn documentation of RandomForestClassifier for the hyperparameter names.
The result is a RandomForestABC object with fields:
- reference_table: an AbcModelChoiceDataset that corresponds to the reference table of the algorithm,
- clf: a random forest classifier (PyObject from scikit-learn),
- summary_stats_observations: the summary statistics of the observations,
- estim_model: the underlying model of the observations inferred with the RF-ABC method.