Novel approach to reconstruct context dependent metabolic network

Protocol: Novel approach to reconstruct context dependent metabolic network

Created by Ming Wu, 07/30/2012

0. Prepare the genetic metabolic network model

Current reconstruction of metabolic network for different species that are available can be found in BiGG database (

Most of the networks can be download in the form of SBML.

The software COBRA ( has become the standard in this field. our codes are add-ons to COBRA, so one has to know COBRA first. There is a good introduction and protocol in

Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nature Protocols6,1290–1307(2011)doi:10.1038/nprot.2011.308

What is flux balance analysis?Nature Biotechnology28,245–248(2010)doi:10.1038/nbt.1614

I suggest reading these reviews and protocols before moving on to the novel approach. The protocol is mostly based on E.coli, but it is a good example to show how to do flux balance analysis and how to use COBRA to analyze a metabolic model.

Installation of COBRA may take some effort. One can refer to the documentation of COBRA. The tricky thing will be install the SBML library, so I provide a MATLAB transformed human generic metabolic network model here


so that you don't have to worry about converting SBML to MATLAB structure in the beginning, if you are working on human system.

Another important thing is to install a tool for Linear optimization, which is not included COBRA. I have tried CPLEX and Gurobi. It will be a little bit challenging in the beginning to install andconfigure, since there will be some parts need to be re-compiled using MATLAB. Students in our lab can start from our account on Windows server (Ming) in which I have configured the environments.

Here is some hint of installing Gurobi:

mex -O -largeArrayDims -I"F:\HOME_F\MING\gurobi460\win64\include" "gurobi_mex.c" ...
"F:\HOME_F\MING\Gurobi460\win64\lib\gurobi46.lib" "G:\Program Files\MATLAB\R2009b\extern\lib\win64\microsoft\libut.lib"

Each time you start analyze with COBRA, you may want to initiate as:


if Gurobi is used for linear optimization.

If you are working on human cancer, this is an objective function for biomass production. (refer to our paper if you don't know what this is). Assume "model1" is your human network, then

model1=addReaction(model1,'bioMassR', '0.53602 ala-L[c] + 0.023165 amp[c] + 0.05444 asn-L[c] + 3.1282 asp-L[c] + 100 atp[c] + 0.044228 chsterol[c] + 0.0030671 clpn_hs[c] + 0.038608 cmp[c] + 0.016827 dag_hs[c] + 0.0094984 damp[c] + 0.0063323 dcmp[c] + 0.0063323 dgmp[c] + 0.0094984 dtmp[c] + 0.75797 gln-L[c] + 0.86266 glu-L[c] + 0.62815 gly[c] + 0.043755 gmp[c] + 100 h2o[c] + 0.00676 lpchol_hs[c] + 0.020642 mag_hs[c] + 0.087773 pa_hs[c] + 0.016562 pail_hs[c] + 0.09182 pchol_hs[c] + 0.10722 pe_hs[c] + 0.081241 pro-L[c] + 0.031502 ps_hs[c] + 0.16751 ser-L[c] + 0.01291 sphmyln_hs[c] + 0.14696 tag_hs[c] + 0.023165 ump[c] + 0.03514 xolest_hs[c] -> 100 adp[c] + 100 h[c] + 100 pi[c]');

1. Prepare context specific gene expression information

What we need is a structure variable in MATLAB with two fields:
  • Locus
  • Data
The "Locus" is a list ofEntrez Gene ID, the Data is a vector corresponding to Locus showing the gene expression level.

The gene expression level can be either 0/1 or -1/0/1 for GIMME approach. The gene expression level should be -1/0/1 or determined by gene expression data in the novel approach, as discussed in the following.

One could obtain context gene expression from literature, i. e. to find out whether a gene is active/nonactive or in the "control" level. One could also determine the 0/1s from discretization or thresholding of expression data.

For the novel approach, three variable is needed for context information of gene expression:
  • meta_exp_geneFC: log2 fold change of the gene expression compare the system of interest (e.g. liver) with the reference system.
  • meta_exp_geneFDR: describing whether or not the change is significant, for example, larger than 2 fold change can be considered as significant
  • meta_exp_geneLocus: the Entrez Gene ID
The three variables are all column vectors, each rowrepresentone gene.

When using literature information, the expression should be -1/0/1, and the changes are determined by comparing with ref condition, e.g. generic human model has the gene vector in which all gene are at 1 level. Changes larger or equal than 1 is indicate in meta_exp_geneFDR as 1. When using expression data, the changes and FDRs should be from differential expression analysis.

2. reconstruct context dependent model based on information of gene expression

The GIMME approach (refer to documentation of COBRA and paper of GIMME from Passon's group) has been implemented in COBRA, as example here:

[tissueModel_liver,rxns_liver] = createTissueSpecificModel(model1,expressionData,1,0,[],'GIMME' ,[],1);

Refer to COBRA documentation to understand the parameters here. The result is a tissue specific model obtained by removing some reactions from original model.

The novel approach is by changing the bounds to define a context dependent model:
our codes are here, including
  • extractGPRs.m
  • UB_update_withFC.m


Preprocessing: extracting the gene-reaction relationships:

[Parse_GPR,P_corrRxn] = extractGPRs(model1);

Preprocessing(when start from human generic model) flux variability analysis to define the original bounds:


Update bounds based on gene--reaction states


in which exp_generic is the expression profile (in the structure form) for the ref condition, for human generic model, this is what you can do:



(You need to first initialize an empty structure form of exp_generic)

Thus you have "NewModel" which is a context dependent model.