Comprehensive RNA-seq analysis of alternative splicing events that distinguishes between metastatic oral cancer of gingiva and tongue

Vishwas Sharma 1 , Dinesh Kumar 2 , Harpreet Singh 3, Sanjay Gupta 1 *

1 Division of Cytopathology, ICMR-National Institute of Cancer Prevention and Research, Noida, Uttar Pradesh, India

2 ICMR-National Institute of Cancer Prevention and Research, Noida, Uttar Pradesh, India

3 Informatics, Systems & Research Management (ISRM), ICMR, New Delhi

*Corresponding Author: Sanjay Gupta                    * Email: sanjaydr17@gmail.com

Equal contributiom

 

Abstract

Introduction: Oral cancer (OC) is a multifactorial disease caused due to various genomic changes. Alternative splicing (AS) is a regulatory genetic process through which messenger RNA forms diverse protein variants. This study aims to study the variation in the AS events at tongue and gingiva locations of OC.

Materials and Methods: Forty-five paired end OC RNA-seq data were downloaded from Sequence Read Archive (SRA) data repository. Twenty four paired end OC (tongue 13, gingival 11) RNA sequences passed the stringent inclusion/ exclusion criteria which were analyzed following Tuxedo pipeline. The ClueGO (v2.5.8) tool in Cytoscape app manager (v3.7.1) was used for gene set enrichment analysis keeping false discovery rate (FDR <=0.05).

Results: Eighty-three genes were identified to be significantly alternatively spliced when comparison was made between RNA sequences from normal tissues and tumor tissues from the gingiva region (p<0.05). Similarly, 39 genes were found to be significantly alternatively spliced when comparison was made between normal tissues and tumor tissues from tongue region of OC. Of these, only 4 genes i.e. AHR, AL356488.2, KREMEN1, SH3TC1 were similar in gingiva and tongue whereas others were unique to their location.

Conclusion: Genome-wide AS events vary considerably in gingival and tongue locations of OC. Hence, these events need to be thoroughly investigated for defining the treatment strategy. Further functional studies are needed to decipher the role of AS in OC.

Keywords: Gingiva cancer, Oral cancer, RNA-seq, Alternative splicing, Tongue cancer

Introduction

Oral cancer (OC) is the sixth most common type of cancer in the world. As per the Globocan 2020 data, the global new cases diagnosed with this disease accounted for ~3.7 million, of which ~1.7 million died (1). OC encompasses cancer occurring at various locations such as lips, tongue, gingiva, cheeks, the floor of the mouth, hard and soft palate, sinuses, and throat. Oral carcinogenesis is a multi-step process that involves the accumulation of various genomic alterations leading to aberrations in the genetic landscape (2). Knowledge of these variations may help decide the therapeutic strategy for treatment.

The alternative splicing (AS) or differential splicing is a regulatory genetic mechanism of choosing different combinations of splice-sites in an mRNA precursor (pre-mRNA) to form variably spliced mRNAs. These events contribute significantly to the etiopathogenesis of OC. In our previous systematic review, we have highlighted the RNA-seq based studies which depicted the role of AS in Head and Neck cancer (HNC) (3). Through this review, we have noticed that the role of AS events at different locations of OC had not been addressed in the available literature. To fill this research gap, we aim to study the variability in the AS events at two different locations of OC i.e. tongue and gingiva.

Materials and Methods

In order to understand the AS events at different locations of OC, the screening of the RNA-seq data was undertaken on Sequence Read Archive (SRA) data repository on 29th September 2020. Briefly, the search terms “oral cancer [All Fields] AND "Homo sapiens"[orgn] AND ("biomol rna"[Properties] AND "library layout paired"[Properties] AND "platform illumina"[Properties] AND "filetype fastq"[Properties])” were used for screening the data/sequences. The detailed search strategy is illustrated in Figure 1. Briefly, a total of 45 paired-end RNA-seq data were obtained from SRA. Of these, 21 paired-end sequences were on OC cell lines and hence excluded from the study (Table 1). The remaining 24 paired-end OC (tongue 13, gingival 11) sequences were included following the inclusion/exclusion criteria mentioned below. All these sequences were from a study by Bhattacharya et al. (4).

Figure 1. Detailed description of the records being screened.  

Inclusion criteria

i) Oral cancer at the selected location (case-control design).

ii) RNA-seq data.

iii) Sequencing performed on Illumina genome analyzer or Hi-Seq.

iv) Pair-end sequencing reads.

Exclusion criteria

i) Sequencing performed on a platform other than Illumina.

ii) Samples other than humans.

iii) Sequencing reads less than 3 million base pairs.

The raw fastq reads were downloaded and accessed for high-quality sequences using the FastQC program. The software Trimmomatic (v0.40) was used for trimming the sequences and removing the adapter content. For trimming, the parameters HEADCROP:12 TRAILING:1 SLIDINGWINDOW:4:20 MINLEN:50 were used. The mapping of the reads with the human genome (GRCh38) was done through Tophat using default parameters, followed by assembly through Cufflinks. The differentially spliced genes were identified through Cuffdiff using Jensen–Shannon divergence test between isoforms. P-value <0.05 was considered to be significant. The ClueGO (v2.5.8) tool in Cytoscape app manager (v3.7.1) was used for gene set enrichment analysis (GSEA) keeping false discovery rate (FDR <=0.05), two-sided hypergeometric test with Bonferroni step down correction and kappa score 0.4 with other default parameter was used for GO biological process and molecular function.


 


Table 1. Details of screening and samples being excluded or included.

Sample type

Layout

Inclusion/Exclusion remark

RNAseq_HSC2_10mM

Paired end

Oral cancer cell line HSC2 treated with 10mM metformin. Since the study is on cell line it is excluded from the analysis.

RNAseq_HSC2_control

Paired end

Oral cancer cell line HSC2 without Metformin. Since the study is on cell line it is excluded from the analysis.

RNAseq_Cal27_10mM

Paired end

Oral cancer cell line Cal27 treated with 10mM metformin. Since the study is on cell line it is excluded from the analysis.

RNAseq_Cal27_control

Paired end

Oral cancer cell line Cal27 without Metformin. Since the study is on cell line it is excluded from the analysis.

GSM4726150: tongue9-A64034; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726149: tongue8-A64033; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726148: tongue7-A64032; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726147: tongue6-A64031; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726146: tongue5-A64029; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726145: tongue4-A64028; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726144: tongue3-A64027; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726143: tongue2-A64025; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726142: tongue10-A64036; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726141: tongue1-A64024; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726140: tcont3-A64037; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726139: tcont2-A64035; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726139: tcont2-A64035; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726137: gingiva9-A64048; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726136: gingiva8-A64047; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726135: gingiva7-A64046; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726134: gingiva6-A64045; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726133: gingiva5-A64044; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726132: gingiva4-A64043; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726131: gingiva3-A64041; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726130: gingiva2-A64040; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726129: gingiva1-A64039; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726128: gcont2-A64042; Homo sapiens; RNA-Seq

Paired end

Included

GSM4726127: gcont1-A64038; Homo sapiens; RNA-Seq

Paired end

Included

GSM4550990: TW2.6; Homo sapiens; RNA-Seq

Paired end

Oral cancer cell line TW2. Since the study is on cell line it is excluded from the analysis

GSM4550989: OEC-M1; Homo sapiens; RNA-Seq

Paired end

Oral cancer cell line OEC-M1. Since the study is on cell line it is excluded from the analysis

GSM4550988: OC3; Homo sapiens; RNA-Seq

Paired end

Oral cancer cell line OC3. Since the study is on cell line it is excluded from the analysis

GSM4550987: CGHNC9; Homo sapiens; RNA-Seq

Paired end

Oral cancer cell line CGHNC9. Since the study is on cell line it is excluded from the analysis

GSM4550986: CGHNC8; Homo sapiens; RNA-Seq

Paired end

Oral cancer cell line CGHNC9. Since the study is on cell line it is excluded from the analysis

GSM4550985: CGHNK6; Homo sapiens; RNA-Seq

Paired end

Oral Keratinocyte cell line CGHNK6. Since the study is on cell line it is excluded from the analysis

GSM4550984: CGHNK2; Homo sapiens; RNA-Seq

Paired end

Oral Keratinocyte cell line CGHNK2. Since the study is on cell line it is excluded from the analysis

RNA-Seq_SAS_control

Paired end

Human tongue squamous cell carcinoma cell line control. Since the study is on cell line it is excluded from the analysis

RNA-Seq_SAS_HGK

Paired end

Human tongue squamous cell carcinoma cell line case treated with Hydroxygenkwanin drug. Since the study is on cell line it is excluded from the analysis

RNA-Seq_OECM_control

Paired end

Human gingival squamous carcinoma cell line control. Since the study is on cell line it is excluded from the analysis

RNA-Seq_OECM_HGK

Paired end

Human gingival squamous carcinoma cell line treated with Hydroxygenkwanin drug. Since the study is on cell line it is excluded from the analysis

HuR-CP1 (HuR cleavage product-1)+134; Homo sapiens; RNA-Seq

Paired end

Oral cancer cell line transfected with gene HuR-CP1 (HuR cleavage product-1). Since the study is on cell line it is excluded from the analysis

HuR-D226A (asp-aln mutant)_133; Homo sapiens; RNA-Seq

Paired end

Oral cancer cell line transfected with gene HuR-D226A (asp-aln mutant)_133. Since the study is on cell line it is excluded from the analysis

HUR FL (full-length HuR)_132; Homo sapiens; RNA-Seq

Paired end

Oral cancer cell line transfected with gene HUR FL (full-length HuR)_132. Since the study is on cell line it is excluded from the analysis

GFP control_131; Homo sapiens; RNA-Seq

Paired end

Oral cancer cell line control (GFP control). Since the study is on cell line it is excluded from the analysis

CELF1 KD replicate 1; Homo sapiens; RNA-Seq

Paired end

Si CELF1 treated tongue suqamous cell carcinoma cell line. Since the study is on cell line it is excluded from the analysis

WT replicate 1; Homo sapiens; RNA-Seq

Paired end

Tongue squamous cell carcinoma cell line control. Since the study is on cell line it is excluded from the analysis





Results

A total of 83 genes were found to be significantly alternatively spliced when a comparison was made between normal tissues and tumor tissue samples obtained from the gingiva region (p<0.05). Similarly, 39 genes were found to be significantly alternatively spliced when a comparison was made between normal tissues and tumor tissue samples obtained from the tongue region (Table 2).

 Of these, only 4 genes i.e. AHR, AL356488.2, KREMEN1, SH3TC1 were found to be commonly alternatively spliced in gingiva as well as tongue whereas others were unique to the location, that is, either gingiva or tongue (Figure 2).

Table 2. Details of differentially spliced genes in samples obtained from gingiva and tongue region of oral cancer.

Gingiva

 

Tongue

AC010323.1, NDUFA7

 

ABR

AHR

 

AHR

AL049629.2

 

AKR1C2

AL356488.2

 

AL356488.2

ANKRD11

 

AP1S1

ASCC2

 

ATP6V1E1

ASDURF, ASNSD1

 

ATP9B

ATF5

 

ATXN1

ATP13A2

 

BAZ1B

ATP5F1B

 

CDC37, MIR1181

BID

 

COL4A5

CCT5

 

COL5A1

COL12A1

 

DIS3L2

COL7A1

 

EGLN1

CORO2A

 

FBLIM1

CPSF1

 

GAMT

CTU2

 

GNB1

E2F3

 

IFI16

EMP3

 

JUP

ERAP1

 

KREMEN1

EXOC4

 

LIMK2

FBLN2

 

LMF2

FEZ1

 

MED14

FMNL2

 

NAPRT

GFOD1

 

NFATC3

GPD2

 

NRBF2

GTF3C1

 

NSUN5

HJURP

 

PIGP

HMBS

 

PLEKHG5

HP1BP3

 

S100A13

HSPBP1

 

SH3TC1

IK,MIR3655

 

SLC38A7

INF2

 

SPP1

INPP4A

 

TAB3

ISOC2

 

TENM4

KCTD15

 

TRIB2

KIAA1217

 

TYK2

KREMEN1

 

USP22

LMTK2

 

VWA1

LPXN

 

 

MARF1

 

 

MCM3AP

 

 

MGAT4B

 

 

NDRG1

 

 

NMD3

 

 

NOL7

 

 

PAFAH1B1

 

 

PDCD11

 

 

PIP5K1C

 

 

PLCD3

 

 

PNKP

 

 

POLA1

 

 

PPP1CC

 

 

PPP1R14B-AS1

 

 

PRRC2A

 

 

PSMD2

 

 

SACM1L

 

 

SEC14L1

 

 

SH3TC1

 

 

SLC3A2

 

 

SNIP1

 

 

STARD3NL

 

 

SUN1

 

 

TAF13

 

 

TASP1

 

 

TBC1D23

 

 

TDRD7

 

 

TGFA

 

 

TJP2

 

 

TMEM131L

 

 

TMEM138

 

 

TMEM177

 

 

TNFRSF10A

 

 

TNRC6B

 

 

TPM1

 

 

TTC38

 

 

TXNIP

 

 

UGP2

 

 

VPS18

 

 

YBX3

 

 

ZC3HAV1

 

 

ZFAND6

 

 

ZMIZ1

 

 

 

 

Figure 2. Details of differentially spliced genes in samples obtained from gingiva and tongue region of oral cancer.

Gene set enrichment analysis

Gingiva: The GSEA of AS genes revealed their significant association with cell death cycles, vesicle targeting, and tethering processes, inositol phosphate metabolism process, Charcot-Marie-Tooth disease, and cerebral cortex radial glia guided migration signaling (Figure 3a).

Tongue: The GSEA of the genes AS in the tongue revealed their role in prostaglandin signaling, formation of collagen trimers, poly-pyrimidine binding, and regulation of neuron projection regeneration (Figure 3b).

 


 

Figure 3. Depiction of gene set enrichment analysis for biological processes and cellular signals associated with differentially spliced genes in tongue (a) and gingiva (b).

 



Discussion

The result of this study suggests that variability in the AS events exists at the location level in OC. Hence, to understand the etiology of OC at the genetic level, the location from where the samples are obtained is an important parameter that needs to be considered. Understanding the signature of AS events, at the gene level, at different locations of OC may help us in developing molecular target/s for early prediction/diagnosis of OC. Moreover, it may also be useful in deciding the best therapeutic option from the available ones, for the management/treatment of OC.

The gene set enrichment analysis (GSEA) of the alternatively spliced genes revealed various genes whose role in cancer development has already been well defined, for instance, the genes that code for INPP4A, and PLCD3 were found to be alternatively spliced, play a significant role in inositol phosphate metabolism. The suppressed INPP4A gene leads to enhanced PIP3 level which promotes AKT1-dependent tumor growth and subsequent metastasis (5). The phospholipase C delta 3, i.e. PLCD3 is anti apoptosis molecule that plays a significant role in cancer cell proliferation and migration. The other gene CCT5 chaperonin has been found to be upregulated in p53-mutated tumors (6) and has been suggested as a potential cancer biomarker (7). The ZC3HAV1 regulates KRAS and acts as metastasis-promoting factor in pancreatic cancer (8). Another gene EXOC4 and VPS18 play an imperative role in vesicles docking involved in exocytosis. The gene prostaglandin E2 (PGE2) acts as a stimulator for tumor progression (9). High expression of the prostaglandin receptors has been observed in squamous cell carcinomas. The genes coding for AKR1C2 and GNB1 proteins were found to be differentially spliced which are linked with prostaglandin stimulus. We also noticed differential splicing in mRNA of genes coding for COL4A5 and COL5A1 in tongue and COL12A1 and COL7A1 in gingival cancerous tissue, which is responsible for the formation of collagen trimers and the formation of larger collagen molecules. Finally higher expression of Ataxin-1 (ATXN1), a proto-oncogene, has been reported in cervical cancer and considered as a potent tumor genetic factor (10). Cancer-causing genes show differential splicing which projects the roles of different isoforms for tumorigenesis and maintaining the mRNA expression to a threshold tumorigenic stage. Overall, the genes related to carcinogenesis, cancer progression, and metastasis show AS which points toward the role of different isoforms to establish the cancer cells.

The functional validation of the AS genes is needed to confirm the findings. However, we report the genes whose role in other cancers is already well defined. Further, studies at other locations of OC such as lips, floor of the mouth, etc are needed to decipher the role of AS events in metastatic OC. We were unable to segregate the data based on nodal status in metastatic OC as directed in the study by Bhattacharya et el.,4. Studying AS in OC in relation to the nodal status will be helpful to understand the association between its genetic and clinical features. We have performed analysis on 24 paired-end OC samples that were freely available in the data repositories. However, due to the limited sample size, there is a possibility that some genes might be alternatively spliced but have been missed as they did not reach the significant statistical cutoff.  A more profound analysis of a larger dataset is required to critically analyze and understand the role of AS events in OC.

Conclusion

Genome-wide AS events at different locations of OC vary which needs thorough investigation for defining the treatment strategy.

Author contributions

VS and SG designed the study, VS and DK collected the information and analyzed the data, and wrote the first draft of the manuscript. HS reviewed the manuscript. SG led the group, reviewed, and wrote the final draft.

Conflict of interest

The authors declare that they have no competing interests.

Funding

This work was funded by the Indian Council of Medical Research by grant No. 2019-3557, this supported the fellowship of Dr. Vishwas Sharma.

Acknowledgments

The authors are thankful to Dr. Amit Kumar for providing critical suggestions on the manuscript. The grammar was checked by Grammarly software and the plagiarism was checked by iThenticate software.

 

References

1.     Globocan 2020 fact sheethttps://gco.iarc.fr/today/data/factsheets/cancers/1-Lip-oral-cavity-fact-sheet.pdf (2020).

2.     Ram H, Sarkar J, Kumar H, et al. Oral Cancer: Risk Factors and Molecular Pathogenesis. J Maxillofac Oral Surg. 2011; 10: 132–137.

3.     Sharma V, Nandan A, Singh H, et al. Events of alternative splicing in head and neck cancer via RNA sequencing - an update. BMC Genomics. 2019; 20: 442.

4.     Bhattacharya A, Janal MN, Veeramachaneni R, et al. Oncogenes overexpressed in metastatic oral cancers from patients with pain: potential pain mediators released in exosomes. Sci Rep. 2020; 10: 14724.

5.     Ye Y, Jin L, Wilmott JS, et al. PI(4,5)P2 5-phosphatase A regulates PI3K/Akt signalling and has a tumour suppressive role in human melanoma. Nat Commun. 2013; 4: 1508.

6.     Ooe A, Kato K, Noguchi S. Possible involvement of CCT5, RGS3, and YKT6 genes up-regulated in p53-mutated tumors in resistance to docetaxel in human breast cancers. Breast Cancer Res Treat. 2007; 101: 305–315.

7.     Gao H, Zheng M, Sun S, et al. Chaperonin containing TCP1 subunit 5 is a tumor associated antigen of non-small cell lung cancer. Oncotarget. 2017; 8: 64170–64179.

8.     Huang W, Hua H, Xiao G, et al. ZC3HAV1 promotes the proliferation and metastasis via regulating KRAS in pancreatic cancer. Aging (Albany NY). 2021; 13: 18482–18497.

9.     Hoshikawa. Expression of prostaglandin E2 receptors in oral squamous cell carcinomas and growth inhibitory effects of an EP3 selective antagonist, ONO-AE3-240. Int J Oncol. 2009; 34.

10.   Kang A-R, An H-T, Ko J, et al. Ataxin-1 regulates epithelial-mesenchymal transition of cervical cancer cells. Oncotarget. 2017; 8: 18248–18259.