1 Department of Surgery, University of Texas Medical School at Houston, Houston, Texas. Electronic address: email@example.com.
2 Department of Surgery, University of Texas Medical School at Houston, Houston, Texas.
3 Division of Rheumatology, Department of Medicine, University of Texas Medical School, Houston, Texas.
Since 1990, numerous public repositories have been created to store the vast genomic data sets generated by microarray experiments. We hypothesized that a secondary analysis of a publicly available hepatocellular carcinoma (HCC) data set could generate new findings and additional hypotheses.
The Gene Expression Omnibus at the National Center for Biotechnology Information was queried for available data sets matching 'HCC' and 'clinical data.' Genes that passed filtering and normalization criteria were analyzed using the class comparison and class prediction functions in BRB-ArrayTools. Ingenuity Pathway Analysis software was used to identify gene networks that were potentially up- or down-regulated.
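The class comparison step described above can be illustrated with a minimal sketch. It is not the BRB-ArrayTools implementation: it assumes a per-gene Welch t-statistic with a permutation p-value, and the function names, the significance threshold (`alpha`), and the toy expression values are all hypothetical, since the abstract does not state them.

```python
import random
import statistics


def t_statistic(a, b):
    """Welch two-sample t-statistic for two lists of expression values."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    if se == 0:
        return 0.0  # no variation in either group: treat as no evidence
    return (statistics.mean(a) - statistics.mean(b)) / se


def permutation_p_value(a, b, n_perm, rng):
    """Two-sided p-value obtained by shuffling the class labels."""
    observed = abs(t_statistic(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(t_statistic(pooled[:len(a)], pooled[len(a):])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def class_comparison(expression, labels, alpha=0.01, n_perm=999, seed=0):
    """Return {gene: p} for genes whose p-value falls below alpha.

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of bools, True for the first class (e.g. high AFP)
    alpha:      hypothetical threshold, not taken from the abstract
    """
    rng = random.Random(seed)
    selected = {}
    for gene, values in expression.items():
        high = [v for v, is_high in zip(values, labels) if is_high]
        low = [v for v, is_high in zip(values, labels) if not is_high]
        p = permutation_p_value(high, low, n_perm, rng)
        if p < alpha:
            selected[gene] = p
    return selected


# Toy data (invented for illustration): 8 samples per class.
base = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5]
labels = [True] * 8 + [False] * 8
expression = {
    "gene_shifted": [v + 5.0 for v in base] + base,  # clearly higher in class 1
    "gene_null": base + base,                        # identical in both classes
}
selected = class_comparison(expression, labels)
print(sorted(selected))
```

With thousands of genes tested at once, a real analysis would also need to control for multiple comparisons; that step is omitted here.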
The data set GDS274, which measured gene expression in primary HCC lesions with or without hepatic metastases from a cohort of Chinese patients, was identified as appropriate and was imported into BRB-ArrayTools. A total of 9984 genes passed filtering criteria. Clinical data demonstrated that alpha-fetoprotein (AFP) >100 ng/mL was predictive of worse survival (hazard ratio [HR] 5.87, 95% confidence interval: 1.11-31.0). A class comparison between patients with AFP >100 ng/mL and those with AFP <100 ng/mL demonstrated 92 genes to be differentially expressed. Ingenuity pathway analyses identified the top networks associated with the observed gene expression.
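As a quick consistency check on the reported hazard ratio, the sketch below recovers the standard error implied by the confidence interval. It assumes a Wald-type interval that is symmetric on the log scale, as is typical for a Cox model; the abstract does not state which model was used, so the derived values are illustrative only.

```python
import math

# Values reported in the abstract.
hr, lower, upper = 5.87, 1.11, 31.0

# Assumption: 95% Wald interval, i.e. log(HR) +/- z * SE.
z = 1.959964

# Standard error of log(HR) implied by the interval width.
log_se = (math.log(upper) - math.log(lower)) / (2 * z)

# The midpoint of the interval on the log scale should reproduce the point estimate.
implied_hr = math.exp((math.log(upper) + math.log(lower)) / 2)

# Approximate two-sided p-value from the normal distribution.
z_stat = math.log(hr) / log_se
p_value = math.erfc(abs(z_stat) / math.sqrt(2))

print(f"SE of log(HR): {log_se:.3f}")
print(f"HR implied by CI midpoint: {implied_hr:.2f}")
print(f"Approximate two-sided p: {p_value:.3f}")
```

Under this assumption the interval midpoint agrees with the reported point estimate. The width of the interval reflects considerable uncertainty in the estimate.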
Using available HCC microarray data, we identified genes that were differentially expressed according to whether AFP exceeded 100 ng/mL. Canonical pathway analysis identified functional gene pathways and their associated upstream regulators. This study shows that publicly available data can be put to further use to generate new findings. Investigators should consider secondary analyses of these data sets before embarking on new genomic experiments.