General instructions (read before running the code snippets)

In this second workflow, we will create a protein-protein interaction network of the up- and down-regulated genes in the NanoString dataset (FOLFIRINOX treatment). Afterwards, we will extend the network with gene-pathway associations to see which pathways contain the differentially expressed genes.


Setup

Loading libraries

options(connectionObserver = NULL)

library(dplyr)
library(rWikiPathways)
library(RCy3)
library(RColorBrewer)
library(rstudioapi)
library(readr)
setwd(dirname(getActiveDocumentContext()$path))

Load differential gene expression dataset

Make sure you ran workflow 1 beforehand, so the differential gene expression file has been generated.

We take series 1 from the following dataset (NHBE mock treated versus SARS-CoV-2 infected): https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE147507

which is related to the following publication:

Blanco-Melo, Daniel, et al. “Imbalanced host response to SARS-CoV-2 drives development of COVID-19.” Cell 181.5 (2020): 1036-1045.

dataset <- read.csv("data/DEG_TreatmentFOLFIRINOX.csv", header = TRUE)
dataset$X <- gsub('-mRNA', '', dataset$X)

# identifier mapping to Entrez Gene
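# org.Hs.eg.db is referenced by namespace because the package is not attached in the setup chunk
hgcn2entrez <- clusterProfiler::bitr(dataset$X, fromType = "SYMBOL", toType = c("ENTREZID", "SYMBOL"), OrgDb = org.Hs.eg.db::org.Hs.eg.db)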
'select()' returned 1:1 mapping between keys and columns
0.72% of input gene IDs are fail to map...
data.mapped <- merge(dataset, hgcn2entrez, by.x="X", by.y="SYMBOL", all.x = TRUE)

# filter genes without Entrez Gene identifier
data <- data.mapped %>% tidyr::drop_na(ENTREZID)
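
The effect of the left join followed by the NA filter can be seen on a toy table. This is a minimal sketch in base R: the fold changes and all `toy.*` names are made up for illustration, and NOTAGENE is deliberately unmappable.

```r
# Toy expression table; NOTAGENE has no Entrez Gene identifier (hypothetical data)
toy.dataset <- data.frame(X = c("TP53", "KRAS", "NOTAGENE"),
                          Log2.fold.change = c(1.2, -0.9, 0.3),
                          stringsAsFactors = FALSE)
# Toy mapping table in the shape produced by the identifier mapping step
toy.map <- data.frame(SYMBOL = c("TP53", "KRAS"),
                      ENTREZID = c("7157", "3845"),
                      stringsAsFactors = FALSE)

# Left join: all.x = TRUE keeps unmapped genes, with ENTREZID set to NA
toy.mapped <- merge(toy.dataset, toy.map, by.x = "X", by.y = "SYMBOL", all.x = TRUE)

# Keep only rows with an Entrez Gene identifier (base R equivalent of drop_na)
toy.data <- toy.mapped[!is.na(toy.mapped$ENTREZID), ]
print(toy.data)
```

Genes that fail to map are dropped here, so they cannot appear in the STRING query later on.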

data.up <- unique(data[data$P.value < 0.05 & data$Log2.fold.change > 0.58, c(1,2)])
data.down <- unique(data[data$P.value < 0.05 & data$Log2.fold.change < -0.58, c(1,2)])
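
The cutoff of 0.58 is on the log2 scale and corresponds to roughly a 1.5-fold change. The sketch below illustrates the thresholds on a hypothetical toy table (the `toy*` names and values are not part of the workflow data) and shows how up- and down-regulated genes could be selected together via the absolute fold change.

```r
# A log2 fold change of 0.58 is roughly a 1.5-fold change on the linear scale
fold <- 2^0.58

# Hypothetical toy table with the two columns used by the filter
toy <- data.frame(X = c("A", "B", "C", "D"),
                  P.value = c(0.01, 0.20, 0.03, 0.04),
                  Log2.fold.change = c(1.1, 2.0, -0.7, 0.1),
                  stringsAsFactors = FALSE)

toy.up   <- toy[toy$P.value < 0.05 & toy$Log2.fold.change >  0.58, ]
toy.down <- toy[toy$P.value < 0.05 & toy$Log2.fold.change < -0.58, ]

# Up- and down-regulated genes combined: threshold on the absolute value.
# Note that abs() wraps the fold change only, not the comparison.
toy.deg  <- toy[toy$P.value < 0.05 & abs(toy$Log2.fold.change) > 0.58, ]
print(toy.deg)
```

Gene B has a large fold change but fails the p-value filter, and gene D is significant but changes too little, so neither is selected.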

PPI network analysis

Next, we will create a protein-protein interaction network with all differentially expressed genes using the STRING database.

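# query: comma-separated Entrez Gene IDs of all differentially expressed genes (same thresholds as above)
query <- paste(unique(data$ENTREZID[data$P.value < 0.05 & abs(data$Log2.fold.change) > 0.58]), collapse = ",")
commandsRun(paste0('string protein query cutoff=0.7 newNetName="PPI network" query="',query,'" limit=0'))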
[1] "Loaded network 'STRING network - PPI network - 4' with 84 nodes and 192 edges"
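
To make the structure of the stringApp command easier to see, the sketch below assembles the same command string from a few placeholder identifiers without contacting Cytoscape. The helper `build_string_command` and the `toy.*` variables are hypothetical and only serve as illustration.

```r
# Hypothetical Entrez Gene IDs standing in for the differentially expressed genes
toy.ids <- c("7157", "3845", "1956")

# The command takes one comma-separated string of identifiers
toy.query <- paste(toy.ids, collapse = ",")

# Assemble the command; cutoff is the STRING confidence score threshold
build_string_command <- function(query, cutoff = 0.7, name = "PPI network") {
  paste0('string protein query cutoff=', cutoff,
         ' newNetName="', name, '" query="', query, '" limit=0')
}

cmd <- build_string_command(toy.query)
print(cmd)

# With Cytoscape and stringApp running, the string would be passed on as:
# RCy3::commandsRun(cmd)
```

Keeping the cutoff as a function argument makes it simple to rerun the query at a different confidence level, for example `build_string_command(toy.query, cutoff = 0.4)`.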

Let’s explore the network

  • Q1: How many of the differentially expressed genes were found in STRING?
  • Q2: Are all genes connected in the network?
  • Q3: Change the confidence cutoff in the commandsRun call from 0.7 (high confidence) to 0.4 (medium confidence). What changes?
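
One possible way to approach Q1 programmatically is to compare the submitted identifiers with the "query term" column of the node table. The helper `count_matched` below is a hypothetical sketch shown on toy data; the commented lines assume Cytoscape is running with the STRING network open.

```r
# Hypothetical helper: how many submitted identifiers came back as network nodes?
count_matched <- function(query.ids, node.terms) {
  sum(unique(query.ids) %in% node.terms)
}

# Toy example: three IDs submitted, two found in the network
submitted <- c("7157", "3845", "1956")
found     <- c("7157", "1956")
n.found <- count_matched(submitted, found)
print(n.found)

# With the network open in Cytoscape, the node terms could be fetched like this
# (assumes the "query term" column used for the data mapping in this workflow):
# node.terms <- RCy3::getTableColumns("node", "query term")[["query term"]]
# count_matched(submitted.ids, node.terms)  # submitted.ids: the IDs sent to STRING
```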

Data visualization

Use the same visualization you created in workflow 1 to visualize the gene expression data on the network.

loadTableData(data, data.key.column = "ENTREZID", table.key.column = "query term")
[1] "Success: Data loaded in defaultnode table"
RCy3::copyVisualStyle("default","ppi")
RCy3::setNodeLabelMapping("display name", style.name="ppi")
NULL
RCy3::lockNodeDimensions(TRUE, style.name="ppi")
data.values <- c(-1, 0, 1)
node.colors <- rev(brewer.pal(length(data.values), "RdBu"))
setNodeColorMapping("Log2.fold.change", data.values, node.colors, default.color = "#99FF99", style.name = "ppi")
NULL
RCy3::setVisualStyle("ppi")
                message 
"Visual Style applied." 
RCy3::toggleGraphicsDetails()
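
The continuous mapping above interpolates the node fill color between three breakpoints (-1, 0, 1). The sketch below mimics that interpolation in base R so the gradient can be inspected without Cytoscape. It assumes the three colors are the ones returned by `rev(brewer.pal(3, "RdBu"))` and that values beyond the breakpoints take the end colors; `fold_change_to_hex` is a hypothetical helper, not part of RCy3.

```r
# Colours assumed to match rev(brewer.pal(3, "RdBu")): blue, near-white, red
node.colors <- c("#67A9CF", "#F7F7F7", "#EF8A62")
ramp <- grDevices::colorRamp(node.colors)

# Hypothetical helper: map a log2 fold change onto the -1..1 gradient
fold_change_to_hex <- function(lfc) {
  clamped <- pmax(-1, pmin(1, lfc))   # values beyond the breakpoints take the end colours
  rgb(ramp((clamped + 1) / 2), maxColorValue = 255)
}

print(fold_change_to_hex(c(-1, 0, 1, 2.5)))
```

Down-regulated genes end up blue, unchanged genes near white, and up-regulated genes red. Nodes without a fold change value receive the default color given in the mapping call.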

Interpretation

  • Q4: Do you see clusters of up- or down-regulated genes in the PPI network?

Pathway information

Next, we will add information about participation of the differentially expressed genes in molecular pathway models.

# run CyTargetLinker

wp <- file.path(getwd(), "data/wikipathways-hsa-20200710.xgmml")

commandsRun(paste0('cytargetlinker extend idAttribute="query term" linkSetFiles="', wp, '"'))
[1] "Extension step: 1"                                                            
[2] "Linkset: WikiPathways pathway-gene network_Homo sapiens_WikiPathways_20200710"
[3] "Added edges: 612"                                                             
[4] "Added nodes: 258"                                                             
commandsRun('cytargetlinker applyLayout network="current"')
commandsRun('cytargetlinker applyVisualstyle network="current"')
RCy3::setNodeLabelMapping("display name", style.name="CyTargetLinker")
NULL
# there is an issue in the latest version with visualization of the added edges - the workaround below solves this for now
RCy3::cloneNetwork()
network 
  43781 
RCy3::setVisualStyle("default")
                message 
"Visual Style applied." 
RCy3::setVisualStyle("CyTargetLinker")
                message 
"Visual Style applied." 
# TODO: VISUAL STYLE
# adapt the visual style to also show the differential gene expression as the node fill color
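
One way the TODO above might be addressed is to reuse the fold-change gradient from the "ppi" style on the "CyTargetLinker" style. This is only a sketch under assumptions: `color_ctl_by_fold_change` is a hypothetical wrapper, the colors are assumed to equal `rev(brewer.pal(3, "RdBu"))`, and the call replaces whatever node fill mapping the CyTargetLinker style defines.

```r
# Hypothetical helper: reuse the fold-change gradient on the CyTargetLinker style.
# Nothing is sent to Cytoscape until the function is called.
color_ctl_by_fold_change <- function(style.name = "CyTargetLinker") {
  data.values <- c(-1, 0, 1)
  node.colors <- c("#67A9CF", "#F7F7F7", "#EF8A62")  # assumed rev(brewer.pal(3, "RdBu"))
  RCy3::setNodeColorMapping("Log2.fold.change", data.values, node.colors,
                            default.color = "#99FF99", style.name = style.name)
}

# Run with Cytoscape open and the extended network selected:
# color_ctl_by_fold_change()
```

Pathway nodes added by CyTargetLinker carry no fold change value, so they would be drawn in the default color.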

Interpretation

  • Q5: How many differentially expressed genes are in at least one of the pathways?
  • Q6: Are the genes also functionally related based on the PPI network?
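
Q5 can also be approached by counting in R once the pathway-gene associations are available as a table. The sketch below uses a hypothetical edge table (`toy.edges`) and hypothetical identifiers; in practice the edges would have to be exported from the extended network first.

```r
# Hypothetical edge table of the extended network: pathway -> gene associations
toy.edges <- data.frame(pathway = c("WP1", "WP1", "WP2", "WP3"),
                        gene    = c("7157", "3845", "7157", "1956"),
                        stringsAsFactors = FALSE)
toy.deg.ids <- c("7157", "3845", "1956", "673")

# Genes with at least one pathway association
in.pathway <- toy.deg.ids[toy.deg.ids %in% toy.edges$gene]
print(length(in.pathway))

# Number of pathways per gene, most connected first
pathways.per.gene <- sort(table(toy.edges$gene), decreasing = TRUE)
print(pathways.per.gene)
```

Genes that appear in many pathways may dominate the extended network, which is worth keeping in mind when interpreting Q6.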

Save Cytoscape output and session

# Saving output
png.file <- file.path(getwd(), "ppi-network.png")
exportImage(png.file,'PNG', zoom = 500)
cys.file <- file.path(getwd(), "ppi-network.cys")
saveSession(cys.file) 

# comment out the following line if you want to keep working on the visualization in Cytoscape
RCy3::closeSession(save.before.closing = FALSE)