Abstract
Purpose
Microarrays have been utilized in many biological, physiological and pharmacological studies as a high-throughput genomic technique. Several generations of Affymetrix GeneChip® microarrays are widely used in gene expression studies. However, differences in intensities of signals for different probe sets that represent the same gene on various types of Affymetrix chips make comparison of datasets complicated.
Materials and Methods
A power coefficient scaling factor was applied in the pharmacokinetic/pharmacodynamic (PK/PD) modeling to account for differences in probe set sensitivities (i.e., signal intensities). Microarray data from muscle and liver following methylprednisolone 50 mg/kg i.v. bolus and 0.3 mg/kg/h infusion regimens were taken as an exemplar.
Results
The scaling factor applied to the pharmacodynamic output function was used to solve the problem of intensity differences between probe sets. This approach yielded consistent pharmacodynamic parameters for the applied models.
Conclusions
Modeling of pharmacodynamic/pharmacogenomic (PD/PG) data from diverse chips should be performed with caution due to differential probe set intensities. In such circumstances, a power scaling factor can be applied in the modeling.
Similar content being viewed by others
INTRODUCTION
Gene expression profiling is useful in understanding gene expression changes and interactions underlying basic biological processes. Microarray techniques provide a unique tool in assessing numerous mRNA expression levels in parallel in a high-throughput manner. Microarrays have been utilized in the research of many biological, physiological and pharmacological processes and provide multiple biomarkers in drug screening. Oligonucleotide gene chips produced by Affymetrix are widely used in microarray studies. As genechip technology evolves, several generations of chips have emerged. For example, Affymetrix has marketed to date both a RG_U34 and RAE230A genechip line for the rat genome. Probes for the same gene (i.e., the small unique sequences which are complementary to a particular gene) differ in sequence on these two types of chips. Futhermore, within a particular type of chip there are often multiple probe sets for the same gene. Differences in the signal intensity obtained on hybridization of these different probe sets to the same mRNA make comparison of data on various types of chips and of data for multiple probe sets for the same gene on the same chip problematic.
Corticosteroids (CS) are potent anti-inflammatory and immunosuppressive drugs. They are commonly used for the treatment of allergies and autoimmune diseases such as asthma, rheumatoid arthritis, nephritic syndrome, lupus, and immunosuppression for organ transplant patients. The clinical use of CS is associated with numerous side effects such as Cushing’s syndrome, steroid diabetes, osteoporosis, dyslipidemia, arteriosclerosis, and muscle wasting. Almost every tissue is involved in the biological functions of CS. Microarrays allow assessment of the diverse functions of CS by increasing the number of genes explored to thousands in a single study. We have used microarrays in the framework of a time series as a high-throughput data collection method in order to obtain the scope of data necessary for pharmacokinetic/pharmacodynamic/pharmacogenomic (PK/PD/PG) modeling of pharmacological events triggered by CS (1).
Simultaneous modeling of PK/PD/PG profiles from diverse conditions (dosing regimens, dose levels, administration routes) would be beneficial in understanding the mechanisms and dynamics of biological processes. However, in comparing data from various array types, different signal intensities of probe sets for the same gene should be taken into consideration. In the present study, a scaling factor was incorporated in the PK/PD/PG modeling of microarray data obtained from two different Affymetrix chips. Our aim is to provide a strategy to seperate sensitivity differences between probe sets from drug-induced pharmacodynamic changes and to reliably resolve the dynamics of pharmacological events induced by drugs. In the present report, microarray data from two different dosing regimens—acute and chronic methylprednisolone (MPL) studies were taken as examples in our simultaneous modeling. Two different Rat Affymetrix GeneChip® Rat Genome microarrays, RG_U34A and RAE230A, were used for acute and chronic studies.
MATERIALS AND METHODS
Experimental
Tissues were obtained from animal studies previously performed in our laboratory (2–4). Male adrenalectomized (ADX) Wistar rats (Harlan Sprague-Dawley Inc., IN) were used in both acute and chronic studies. In the acute study, a single intravenous bolus of 50 mg/kg MPL (Solu-Medrol, Pharmacia-Upjohn Company, MI) was given to 47 animals via right jugular vein cannula. Rats were sacrificed by aortic exsanguinations at 16 time points between 0 to 72 h after dosing. Four vehicle-treated rats were designated as controls and were considered as sacrificed at time point zero. In the 7-day infusion study, 0.3 mg/kg/h of MPL reconstituted in supplied diluent was administered to 40 rats. The infusions were given using Alzet osmotic pumps (model 2001, flow rate 1 ul/h; Alza, Palo Alto, CA). Pumps were subcutaneously implanted between the shoulder blades on the back. Rats were sacrificed by aortic exsanguinations at ten time points throughout the 7-day study. Four control rats were implanted with a saline-filled pump. Liver and gastrocnemius muscles were rapidly excised, quickly frozen in liquid nitrogen, and stored at −80°C.
Microarrays
Frozen muscle and liver tissues were ground into powder. Total RNA extractions were carried out by a Trizol-chloroform based extraction method followed by further purification on RNeasy columns. Isolated tissue RNA from each individual sample was used to prepare biotinylated cRNA target according to manufacturer protocols. The biotinylated cRNAs from the acute study were hybridized to 51 individual Affymetrix GeneChip® RG_U34A (Affymetrix, Santa Clara, CA). The target cRNAs from the chronic study were hybridized to 44 individual Affymetrix GeneChip® RAE230A (Affymetrix, Santa Clara, CA). Oligonucleotide microarrays were utilized for high reproducibility between separate arrays. The entire data set has been submitted to the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus database (GSE490, GSE5101, GSE487) and is also available online at http://pepr.cnmcresearch.org/ (5).
Data Mining
Data mining of the microarray data was described in previously published articles (6–8). A new approach was developed based on the unique nature of high-throughput gene arrays coupled to a time series design. In brief, Affymetrix Microarray Suite 5.0 (MAS 5.0) was used for initial data acquisition. Based on comparison of matched and mismatched pairs, a ‘call’ of present (P), absent (A) and marginal (M) was determined for each probe set on each chip. The results were then transferred to another program, GeneSpring 7.0 (Silicon Genetics, Redwood City, CA). The expression data of each probe set at various time points were expressed as a ratio of the mean of the four control values for that probe set and were referred as “normalized intensity.” A series of filters was applied to the data, which included filtering for expression of the probe set in the tissue, filtering for regulation of the probe set by the drug, and quality-control filtering.
PK/PD/PG MODELING
The pharmacokinetics of MPL and glucocorticoid receptor dynamics in both liver and muscle following MPL acute and chronic treatments have been analyzed (2–4). The receptor dynamics model was built under the assumption that unbound MPL diffuses through cell membranes and binds with cytosolic free receptors. Drug-receptor complex is translocated into nuclei where it binds to glucocorticoid response elements (GREs) in the target DNA. The binding of drug-receptor complex and GRE enhances or inhibits the expression of target genes. CS is known to inhibit the expression of its own receptor (9). After dissociation from DNA, GR receptors are recycled into cytosol, where part of the receptors are degraded and the rest may be further activated by MPL. In the following pharmacogenomic modeling, nuclear drug receptor complex (DR(N)) is assumed to be the main driving force for target gene transcriptional regulation. The gene array data were obtained from a ‘giant rat’ study design reflecting naïve pooling of data from a group of animals (2–4). Assuming that the errors from the observed and predicted responses are normally distributed, the ADAPT II program (10) with the maximum likelihood method was applied for all fittings. The following variance model was used:
where δ1 and δ2 are variance parameters and Y(t) represents the model output function at time t. The goodness-of-fit criteria included visual inspection of the fitted curves, estimator criterion value, sum of squared residuals, Akaike information criterion, Schwartz criterion, and coefficients of variation (CV) of the estimated parameters.
RESULTS
Several genes were selected as exemplars for PK/PD modeling using data from two different types of chips. The first one is myristoylated alanine-rich C-kinase substrate. The dynamics of this gene in muscle was described by an indirect response model where DR(N) inhibits the transcription of the gene. The differential equation and initial condition describing the mRNA transcription (mRNA) are:
where rate constants represent zero-order synthesis rate (ks) and first-order degradation (kd) of mRNA. In the modeling, ks is considered as a secondary parameter calculated from kd and the baseline, Imax is the maximum inhibition of gene transcription, and IC50 represents the concentration of DR(N) at which the production rate of mRNA is reduced to 50% of its baseline level. The baselines of mRNA in acute (BmRNA_A) and chronic (BmRNA_C) studies were fixed to 1. The output functions of mRNA in acute (YA(t)) and chronic (YC(t)) studies were:
where mRNAA(t) and mRNAC(t) represent mRNA expression at time t in acute and chronic studies. Scaling factors (SFA and SFC) are used to correct for different signal intensities between probe sets. The same output functions were used for all of the following pharmacogenomic modeling. Since relative sensitivity was considered in the present report, SFA was fixed to 1 in the modeling.
A characteristic of Affymetrix microarrays is that in many cases, a chip contains multiple probe sets for the same gene. One probe set in the acute study and three probe sets in the chronic study remained after data mining for myristoylated alanine-rich C-kinase substrate in muscle. Simultaneous modeling of acute and chronic data was performed using the acute profile and three different chronic profiles. The fitted curves are shown in Fig. 1. The estimated parameters are listed in Table I. The mRNA expression following both acute and chronic MPL dosing was simultaneously modeled using Eq. 2. The resultant gene-specific dynamic parameters kd, Imax and IC50 are similar. The estimated SFC parameter is between 1.192 and 1.925. The scaling factor represents the relative sensitivity (ie., signal intensity) of a probe set to a fixed amount of change. A higher scaling factor value suggests a higher sensitivity of the probe set. The scaling factor was employed in the function as a power coefficient. All three estimated scaling factors are higher than 1, suggesting higher sensitivity of the probe sets on the chronic chip (RAE 230A) than that on the acute chip (RG_U34A).
The second exemplar was tropomyosin 1α. It was assumed that tropomyosin 1α mRNA expression was described by a direct inhibitory effect from DR(N) and an indirect inhibitory effect from an intermediate biosignal (BS) in muscle. The differential equations are:
In addition to the symbols used previously, ke represents the first-order rate constant describing the production of BS and was fixed at 0.005 h−1 in the modeling, IC50_BS depicts the concentration of BS at which the production rate of mRNA is reduced to 50% of its baseline level under inhibition of the intermediate biosignal, γ is the slope factor of the Hill function and was fixed to 1 in fitting. The baselines were fixed to 1. Eqs. 3 and 4 were used as the output functions and SFA was fixed to 1.
One probe set in the acute study (RG_U34A) and two probe sets in the chronic study (RAE230A) remained after data mining for tropomyosin 1α in muscle. The estimated parameters for simultaneous acute and chronic data modeling are listed in Table II while the fitted curves are shown in Fig. 2. The proposed model was able to capture the expression of tropomyosin 1α expression following both treatments. The modeling results in almost identical dynamic parameters kd, IC50 and IC50_BS. The two probe sets on the RAE230A chip from the chronic study yield SFC values of 1.291 and 2.088, suggesting higher relative sensitivities than the RG_U34A probe. Although on the same chip, the relative sensitivity of probe set 1379936_at is higher than probe set 1390471_at, leading to a greater extent of signal reduction following drug treatment.
Mitogen-activated protein kinase 6 is the third exemplar. Like tropomyosin 1α, the model assumed dual inhibitory effects from DR(N) and an unknown biosignal. The pharmacodynamics of mitogen-activated protein kinase 6 in muscle was also described by Eqs. 5 and 6. The kd was fixed to 0.5 h−1 in the fitting due to correlation between parameters. The baselines were fixed to 1. Since two different probe sets were present on chip U34A while only one probe set was on 230A, SFC was fixed to 1. The SFA, ke and γ were fitted with other parameters. The best-fit curves and the mRNA expression data are shown in Fig. 3. As shown in Table III, all parameters except SFA are very similar for the two fittings. For probe set M64301_at, the estimated SFA is close to 1, indicating similar sensitivity of M64301_at and the probe on Chip 230A, 1368273_at. The SFA of 0.810 suggests a lower sensitivity of probe set M64301_g_at than the other two probes.
The fourth exemplar was mitochondrial precursor receptor from liver. Expression of this gene was modeled by a stimulatory effect from DR(N) and an indirect inhibitory effect from an intermediate BS according to:
In addition to the symbols already defined, S is a linear coefficient representing the stimulatory efficiency of DR(N) and BS is described by Eq. 6. Baselines were fixed to 1. The best-fit curves and the mRNA expression data are shown in Fig. 4. Simultaneous modeling of acute and chronic data from different probe sets yield similar kd and S values (Table IV). The SFC values for both probe sets on chip RAE230A are higher than 1. The estimated ke and IC50_BS are different. Notice that both parameters estimated from the pair U21871_at/1398870_at are about twice the values from modeling the other pair U21871_at/ 1370785_s_at. Examination of the model and equations suggests possible correlation between these two parameters.
In addition to probe sets on two different chips, probe sets representing the same gene on the same chip may also display differences in intensity. Three probe sets representing ornithine decarboxylase remained after data mining in liver following 50 mg/kg MPL i.v. bolus treatment. An indirect response model with DR(N) stimulation was applied to capture the dynamics of mRNA expression for this gene as follows:
All parameters have been defined previously. Modeling was performed previously using different kd and S values for each individual probe set (1). In the present report, the data were reanalyzed by performing simultaneous modeling of all three profiles using the scaling factor approach. Eq. 3 was used as the output function. SFA for probe set J04791_s_at was fixed to 1 while SFA for the other two probe sets were estimated. All baselines were fixed to 1. The best-fitted curves are shown in Fig. 5. Similar profiles but different amplitudes of up-regulation are evident for these probe sets, with the normalized peak values at 4, 6, and 8. Simultaneous fitting generated different SFA values, 0.598 for J04792_at and 0.827 for X07944exon#1–12_s_at. Although the change of scaling factor among these three probes was moderate, the net effect of the change on the output function was high since it is a power coefficient. The S was estimated at 0.0237 (fmol/mg protein)−1 and kd at 0.311 h−1 with reasonable precision (CV less than 20%).
DISCUSSION
In all of the above examples, SF coefficients were used to account for relative sensitivities (i.e., signal intensities). In the modeling, we assume the sensitivity of one probe set is 1.0, while we estimate the relative sensitivity of the other probe sets compared to the first one. Correlations between SF values and other parameters were small, and usually SF could be accurately estimated with a reasonable CV. The scaling factor approach has been applied to data from both liver and muscle. The similarity in estimated gene-specific dynamic parameters could be considered as partial validation of the approach.
Two rat genome microarrays, Affymetrix Genechip® RG_U34A and RAE230A, were utilized in our pharmacogenomic studies. The newer array, RAE230A, was used for chronic studies while the first-generation rat genome set RG_U34A was used for acute studies as RAE230A chips were not available at the time the acute studies were performed. RG_U34A contains 8,799 probe sets, including more than 7,000 genes and about 1,000 ESTs. RAE230A contains 15,967 probe sets representing a greater number of genes than those present on RG_U34A chip. Besides differences in the numbers of probe sets, the array design of the RAE230A is different from RG_U34A (11). The new design allows detection of the true 3’-transcript end and prevents selection of probe sets against aberrantly extended clusters. The probe quality is ensured by a thermodynamic multiple linear regression model which improves selection of probes that hybridize well to the correct target and reduces non-specific hybridization. Each probe set on the RAE230A chip contains 11 matched and mismatched probe pairs while each probe set on the RG_U34A chip has 16 probe pairs. The nucleotide length of each probe on RG_U34A is longer than that on RAE230A. The diversities of probe length and sequence may lead to discrepancies in probe GC content, Tm value, and hybridization efficiency, leading to differences in signal intensities. Since the differences between probe sets will not be the same for each gene under consideration, we cannot use a generalized method for normalization. The difference in relative sensitivity is gene specific but not chip specific. Thus each gene has to be accounted for instead of each chip. This can be seen by the wide range of SF values estimated in the present report.
Affymetrix has examined their microarrays and did a comparison between Rat Genome U34A and 230A chips (11). Their study shows that the new RAE230A design, with a reduced probe set size, produced more informative data compared to RG_U34A. The majority of RAE230A probe sets tend to produce higher signals relative to the corresponding probe sets on the old RG_U34A chip (11). In total, 51% of the 230A probe sets have significantly higher signals. Only 5.7% of the sets have significantly higher signals for the RG_U34A probe sets. The remaining 43% of probe sets exhibit similar intensity between the two chip designs. This is consistent with our findings in modeling as SFC tends to be higher than SFA. In all the probe sets that we analyzed in both liver and muscle, 56% of these have an estimated SFC higher than 1.2 when SFA is fixed at 1. The SFC values for 32% of probe sets are between 0.8 to 1.2, while the remaining 12% have a SFC value lower than 0.8.
In summary, a scaling factor was applied sucessfully to solve the problem of intensity differences between probe sets for microarray data coming from different genechips reflecting pharmacogenomic effects of corticosteroids.
Abbreviations
- ADX:
-
adrenalectomized
- BS:
-
biosignal
- CS:
-
corticosteroids
- CV:
-
coefficient of variation
- DR(N):
-
nucleus drug-receptor complex
- GR:
-
glucocorticoid receptor
- GRE:
-
glucocorticoid response element
- MPL:
-
methylprednisolone
- PD:
-
pharmacodynamics
- PG:
-
pharmacogenomics
- PK:
-
pharmacokinetics
References
J. Y. Jin, R. R. Almon, D. C. DuBois, and W. J. Jusko. Modeling of corticosteroid pharmacogenomics in rat liver using gene microarrays. J. Pharmacol. Exp. Ther. 307:93–109 (2003).
Y. N. Sun, D. C. DuBois, R. R. Almon, and W. J. Jusko. Fourth-generation model for corticosteroid pharmacodynamics: a model for methylprednisolone effects on receptor/gene-mediated glucocorticoid receptor down-regulation and tyrosine aminotransferase induction in rat liver. J. Pharmacokinet. Biopharm. 26:289–317 (1998).
Y. N. Sun, L. I. McKay, D. C. DuBois, W. J. Jusko, and R. R. Almon. Pharmacokinetic/Pharmacodynamic models for corticosteroid receptor down-regulation and glutamine synthetase induction in rat skeletal muscle by a Receptor/Gene-mediated mechanism. J. Pharmacol. Exp. Ther. 288:720–728 (1999).
R. Ramakrishnan, D. C. DuBois, R. R. Almon, N. A. Pyszczynski, and W. J. Jusko. Pharmacodynamics and pharmacogenomics of methylprednisolone during 7-day infusions in rats. J. Pharmacol. Exp. Ther. 300:245–256 (2002).
R. R. Almon, J. Chen, G. Snyder, D. C. DuBois, W. J. Jusko, and E. P. Hoffman. In vivo multi-tissue corticosteroid microarray time series available online at Public Expression Profile Resource (PEPR). Pharmacogenomics 4:791–799 (2003).
R. R. Almon, D. C. DuBois, W. H. Piel, and W. J. Jusko. The genomic response of skeletal muscle to methylprednisolone using microarrays: tailoring data mining to the structure of the pharmacogenomic time series. Pharmacogenomics 5:525–552 (2004).
R. R. Almon, D. C. Dubois, J. Y. Jin, and W. J. Jusko. Pharmacogenomic responses of rat liver to methylprednisolone: an approach to mining a rich microarray time series. Aaps J. 7:E156–E194 (2005).
R. R. Almon, W. Lai, D. C. DuBois, and W. J. Jusko. Corticosteroid-regulated genes in rat kidney: mining time series array data. Am. J. Physiol. 289:E870–E882 (2005).
Y. Dong, L. Poellinger, J. A. Gustafsson, and S. Okret. Regulation of glucocorticoid receptor expression: evidence for transcriptional and posttranslational mechanisms. Mol. Endocrinol. 2:1256–1264 (1988).
D. Z. D’Argenio and A. Schumitzky. ADAPT II user’s guide: pharmacokinetic/pharmacodynamic systems analysis software, Biomedical Simulation Resource, Los Angeles, 1997.
Technical note: array design and performance of the genechip rat expression set 230. http://www.affymetrix.com/support/technical/technotesmain.affx.
Acknowledgments
We would like to acknowledge Ms Nancy Pyszczynski for the conduct of animal studies. This work was supported by grants GM24211 and GM67650 from the National Institute of General Medical Sciences, National Institutes of Health, in part by grant R24HD050846 from NICHD National Center for Medical Rehabilitation Research and a research grant from NASA.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Yao, Z., Zhao, B., Hoffman, E.P. et al. Application of Scaling Factors in Simultaneous Modeling of Microarray Data from Diverse Chips. Pharm Res 24, 643–649 (2007). https://doi.org/10.1007/s11095-006-9215-y
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11095-006-9215-y