Approximate Bayesian Computation related methods
BiochemNetABC.abc_smc — Method
abc_smc(pm::ParametricModel, l_obs, func_dist; nbr_particles, alpha, kernel_type, NT, duration_time, bound_sim, sym_var_aut, verbose)
Run the ABC-SMC algorithm with the parametric model pm.
func_dist(l_sim, l_obs) is the distance function between simulations and observations; it corresponds to $\rho(\eta(y_{sim}), \eta(y_{exp}))$. l_obs::Vector{<:T2} is a collection of observations. func_dist must have a signature of the form func_dist(l_sim::Vector{T1}, l_obs::Vector{T2}).
If pm is defined on a ContinuousTimeModel, then T1 must satisfy T1 <: Trajectory.
!!! Distance function and distributed ABC
If you use abc_smc with multiple workers, func_dist has to be defined on every worker by using @everywhere.
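As an illustration of the signature and of the @everywhere requirement, here is a minimal sketch of a distance function. The name mean_euclidean_dist, the use of plain numeric vectors instead of Trajectory objects, and the choice of the element-wise mean as summary statistic are all assumptions made for this example; they are not part of the package.

```julia
using Distributed

# @everywhere evaluates the definition on every process, so that workers
# can call the function during a distributed run.
@everywhere function mean_euclidean_dist(l_sim::Vector{<:AbstractVector},
                                         l_obs::Vector{<:AbstractVector})
    # Hypothetical summary statistic: element-wise mean of each collection.
    mean_sim = sum(l_sim) / length(l_sim)
    mean_obs = sum(l_obs) / length(l_obs)
    # Euclidean distance between the two summaries.
    return sqrt(sum(abs2, mean_sim .- mean_obs))
end

l_obs = [[1.0, 2.0], [3.0, 4.0]]
l_sim = [[1.0, 2.0], [3.0, 8.0]]
d = mean_euclidean_dist(l_sim, l_obs)
```

A function of this shape could then be passed as the func_dist argument, provided its element types match what the model actually simulates.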
BiochemNetABC.abc_model_choice_dataset — Method
abc_model_choice_dataset(models,
summary_stats_observations,
summary_stats_func::Function, distance_func::Function,
k::Int, N_ref::Int; dir_results::Union{Nothing,String} = nothing)
Creates a reference table for ABC model choice with a discrete uniform prior distribution over the models.
BiochemNetABC.abc_model_choice_dataset — Method
abc_model_choice_dataset(models, models_prior,
summary_stats_observations,
summary_stats_func::Function, distance_func::Function,
k::Int, N_ref::Int; dir_results::Union{Nothing,String} = nothing)
Creates a reference table for ABC model choice.
The mandatory arguments are:
- models: a list of objects that inherit from Model or ParametricModel,
- models_prior: the prior over the models (by default: discrete uniform distribution),
- summary_stats_observations: the summary statistics of the observations,
- summary_stats_func::Function: the function that computes the summary statistics over a model simulation,
- distance_func: the distance function over the summary statistics space,
- N_ref: the number of samples in the reference table,
- k: the number of samples nearest to the observations to keep in the reference table (k < N_ref).
The result is an AbcModelChoiceDataset with fields:
- summary_stats_matrix: the (Nstats, Nref) features matrix. Accessible via .X.
- summary_stats_observations: the observations used for simulating the dataset.
- models_indexes: the labels vector. Accessible via .y.
If specified, dir_results is the directory where the summary statistics matrix and associated models are stored (CSV).
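To make the role of k and distance_func concrete, the following sketch keeps the k columns of a features matrix that are closest to the observed summary statistics. The helper keep_k_nearest and the toy data are hypothetical and written only for this illustration; the package's internal selection may differ.

```julia
# Hypothetical helper, not part of the package: keep the k columns of a
# (Nstats, Nref) matrix X that are nearest to s_obs, with their labels y.
function keep_k_nearest(X::AbstractMatrix, y::AbstractVector,
                        s_obs::AbstractVector, distance_func::Function, k::Int)
    # Distance of every simulated summary (one column each) to the observations.
    distances = [distance_func(X[:, j], s_obs) for j in axes(X, 2)]
    # Column indices of the k smallest distances, in increasing order.
    kept = partialsortperm(distances, 1:k)
    return X[:, kept], y[kept]
end

# Toy reference table: 2 summary statistics, 4 simulations, 2 candidate models.
X = [0.0 1.0 5.0 3.0;
     0.0 1.0 5.0 3.0]
y = [1, 2, 1, 2]
s_obs = [1.0, 1.0]
euclid(a, b) = sqrt(sum(abs2, a .- b))

X_k, y_k = keep_k_nearest(X, y, s_obs, euclid, 2)
```

With k equal to the number of columns, nothing is discarded; this matches the default k = N_ref documented for rf_abc_model_choice.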
BiochemNetABC.posterior_proba_model — Method
posterior_proba_model(rf_abc::RandomForestABC)
Estimates the posterior probability of the model $P(M = \widehat{M}(s_{obs}) | s_{obs})$ with the Random Forest ABC method.
BiochemNetABC.rf_abc_model_choice — Method
rf_abc_model_choice(abc_trainset;
k::Int = N_ref, distance_func::Function = (x,y) -> 1,
hyperparameters_range::Dict)
Run the Random Forest Approximate Bayesian Computation model choice method with an already simulated dataset.
The mandatory arguments are:
- abc_trainset: a dataset already simulated with abc_model_choice_dataset.
The optional arguments are:
- hyperparameters_range: a dict with the ranges of hyperparameter values for the cross-validation fit of the Random Forest (by default: Dict(:n_estimators => [200], :min_samples_leaf => [1], :min_samples_split => [2])). See the scikit-learn documentation of RandomForestClassifier for the hyperparameter names.
The result is a RandomForestABC object with fields:
- reference_table: an AbcModelChoiceDataset that corresponds to the reference table of the algorithm,
- clf: a random forest classifier (PyObject from scikit-learn),
- summary_stats_observations: the summary statistics of the observations,
- estim_model: the underlying model of the observations inferred with the RF-ABC method.
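The hyperparameters_range argument is an ordinary Julia Dict keyed by symbols. The sketch below builds a wider grid than the default and counts how many combinations it contains; the candidate values are arbitrary choices for illustration, and the assumption that the cross-validation explores the full grid is not confirmed by this page.

```julia
# Same keys as the documented default, with more candidate values
# (the values themselves are illustrative assumptions).
hyperparameters_range = Dict(:n_estimators => [200, 500],
                             :min_samples_leaf => [1, 5],
                             :min_samples_split => [2])

# Number of hyperparameter combinations in the grid, assuming an
# exhaustive search over it.
n_combinations = prod(length, values(hyperparameters_range))
```

Each extra candidate value multiplies the number of fits, so small grids are a sensible starting point.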
BiochemNetABC.rf_abc_model_choice — Method
rf_abc_model_choice(models, summary_stats_observations,
summary_stats_func::Function, N_ref::Int;
k::Int = N_ref, distance_func::Function = (x,y) -> 1,
hyperparameters_range::Dict)
Run the Random Forest Approximate Bayesian Computation model choice method.
The mandatory arguments are:
- models: a list of objects that inherit from Model or ParametricModel,
- summary_stats_observations: the summary statistics of the observations,
- N_ref: the number of samples in the reference table,
- summary_stats_func::Function: the function that computes the summary statistics over a model simulation.
The optional arguments are:
- models_prior: the prior over the models (by default: discrete uniform distribution),
- k: the number of samples nearest to the observations to keep in the reference table (by default: k = N_ref),
- distance_func: the distance function; it has to be defined if k < N_ref,
- hyperparameters_range: a dict with the ranges of hyperparameter values for the cross-validation fit of the Random Forest (by default: Dict(:n_estimators => [200], :min_samples_leaf => [1], :min_samples_split => [2])). See the scikit-learn documentation of RandomForestClassifier for the hyperparameter names.
The result is a RandomForestABC object with fields:
- reference_table: an AbcModelChoiceDataset that corresponds to the reference table of the algorithm,
- clf: a random forest classifier (PyObject from scikit-learn),
- summary_stats_observations: the summary statistics of the observations,
- estim_model: the underlying model of the observations inferred with the RF-ABC method.