3. GENOME-WIDE ANALYSIS OF NAC TRANSCRIPTION FACTOR FAMILY IN COWPEA

3.2 METHODOLOGY

3.2.1 Identification of VuNAC TF family

A BLASTP search (e-value ≤ 0.05) was conducted against the cowpea proteome (taxid: 3917), seeded with six distinct NAC domains from Arabidopsis and rice, i.e., AtNAM (AT1G52880), ATAF1 (AT1G01720), CUC1 (AT3G15170), AtNAC1 (AT1G56010), AtNAC2 (AT5G04410), and OsNAC003 (AK061716), downloaded from the NCBI database (https://www.ncbi.nlm.nih.gov/). The non-overlapping hits were subjected to a hidden Markov model (HMM) search against the Pfam (http://pfam.xfam.org/) and SMART (http://smart.embl-heidelberg.de/) databases to examine the presence of NAC and other associated domains [321, 322]. The corresponding protein, CDS, gene, and promoter sequences of the identified VuNAC TFs were retrieved from the NCBI database for further analysis [59, 323].
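The hit-filtering step described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the study: it assumes the BLASTP results were exported in the standard 12-column tabular format (-outfmt 6), it reads "non-overlapping hits" as "each subject protein counted once", and the function name unique_hits is hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 0.05  # threshold stated in the text


def unique_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return subject IDs with at least one hit at or below the cutoff.

    Assumes BLAST tabular output (-outfmt 6): column 2 holds the subject
    ID and column 11 the e-value. A subject hit by several seed domains
    is reported once, ranked by its best (lowest) e-value.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        subject, evalue = row[1], float(row[10])
        if evalue <= cutoff and evalue < best.get(subject, float("inf")):
            best[subject] = evalue
    # Sort by e-value, then by name so the order is deterministic
    return sorted(best, key=lambda s: (best[s], s))
```

Pooling the results of all six seed searches into one table before calling the function would give a single non-redundant candidate list for the subsequent Pfam and SMART domain checks.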

The theoretical pI and molecular weight of the proteins were estimated with the Compute pI/Mw tool (http://web.expasy.org/compute_pi/) using the ‘average’ resolution setting [324]. The sequences of the Arabidopsis NAC proteins (AtNAC) used in this study were downloaded from the Plant Transcription Factor Database 5.0 (http://planttfdb.gao-lab.org/).
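For orientation, the molecular weight part of this calculation can be approximated as the sum of average residue masses plus one water molecule. The sketch below is not the ExPASy implementation; it assumes that the ‘average’ setting refers to average isotopic masses, uses commonly tabulated residue masses, does not attempt the pI estimate, and the function name molecular_weight is hypothetical.

```python
# Average (isotope-weighted) amino acid residue masses in daltons
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01524  # one water is added for the free N- and C-termini


def molecular_weight(sequence):
    """Average molecular weight (Da) of an unmodified protein sequence."""
    sequence = sequence.strip().upper()
    if not sequence:
        raise ValueError("empty sequence")
    # A KeyError here signals a non-standard residue code (e.g. X or B)
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS
```

Non-standard residue codes are deliberately rejected rather than guessed, so ambiguous sequences have to be resolved before the weight is reported.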

3.2.2 Alignment and classification

To classify the VuNAC proteins into functionally distinct groups, a combination of two approaches, i.e., multiple sequence alignment and phylogenetic analysis, was employed. First, the domain sequences were aligned with Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) using the default settings, and the representation of conserved amino acid residues was generated with GeneDoc software. The conserved NAC sub-domains were screened and classified into their respective groups and subgroups based on sequence similarity. Subsequently, the phylogenetic clustering of 130 VuNAC proteins, seeded with 75 reference AtNAC proteins, was conducted in MEGA 6.0 using the Neighbour-Joining (NJ) method (Poisson correction, pairwise deletion, and 1000 bootstrap replicates) to obtain a classification comparable to that of the former approach [325].
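The distance measure behind the NJ tree can be illustrated with a short sketch. The Poisson-corrected distance between two aligned protein sequences is d = -ln(1 - p), where p is the proportion of differing sites; with pairwise deletion, only the columns that are gapped in the pair being compared are skipped. The code below is a minimal illustration of that formula, not MEGA's implementation, and the function name and gap symbols are assumptions.

```python
import math

GAP_SYMBOLS = {"-", "?"}  # assumed symbols for gaps and missing data


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance between two aligned protein sequences.

    Pairwise deletion: a column is ignored only when one of the two
    sequences being compared has a gap there.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in GAP_SYMBOLS or b in GAP_SYMBOLS:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differing / compared
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)
```

A matrix of these distances over all sequence pairs is the input that the NJ algorithm clusters; the bootstrap replicates repeat the calculation on resampled alignment columns.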

3.2.3 Motif detection, NLS, and TMM in VuNAC TFs

The NAC proteins were subjected to motif discovery analysis using MEME Suite v. 5.3.1 (http://meme-suite.org/tools/meme/) to identify conserved motifs, with the optimum search parameters (motif width range: 6-50; maximum number of motifs: 50; minimum sites per motif: 2; maximum sites per motif: 600) [326]. The presence of nuclear localization signals (NLS) in the proteins was examined using NLStradamus, which applies a four-state HMM suited to finding multipartite NLSs (prediction cut-off 0.3) [327]. To identify the transmembrane regions, we used TMHMM server v. 2.0 (https://services.healthtech.dtu.dk/service.php?TMHMM-2.0) [328].

3.2.4 Gene structure, chromosomal location, and duplication

To illustrate the exon-intron organization within the coding region of the VuNAC genes, Gene Structure Display Server 2.0 (http://gsds.gao-lab.org/) was used [329]. The gene phylogenetic tree generated by MEGA 6.0 and the CDS/gene sequences were used as input.

The physical chromosomal locations of the genes were mapped graphically, indicating the paralogous and orthologous gene duplications. A chromosome region containing two or more genes within 200 kb of one another was defined as a gene cluster. Two or more genes from the same phylogenetic group located within 100 kb of each other on the same chromosome were denoted as tandem duplications, while genes located on different chromosomes were defined as segmental duplications [330].
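The duplication rules above can be expressed as a small pairwise classifier. This is a sketch under assumptions that the text does not spell out: distance is measured between gene start coordinates, membership of the same phylogenetic group is required for both duplication types, and the Gene record and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

TANDEM_MAX_BP = 100_000  # 100 kb threshold stated in the text


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # start coordinate in bp (assumed reference point)
    group: str   # phylogenetic group label


def classify_pair(a: Gene, b: Gene) -> Optional[str]:
    """Label a gene pair as 'tandem', 'segmental', or None."""
    if a.group != b.group:
        return None  # assumption: only same-group pairs count as duplicates
    if a.chrom != b.chrom:
        return "segmental"
    if abs(a.start - b.start) <= TANDEM_MAX_BP:
        return "tandem"
    return None  # same chromosome but too far apart for a tandem pair
```

Applying the function to every pair of genes within a phylogenetic group would give the tandem and segmental pairs to draw on the chromosome map.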

3.2.5 Promoter analysis and study of regulatory elements

The 1.5 kb sequences upstream of the transcription start site (TSS) of the VuNAC genes were retrieved from NCBI for promoter analysis. To investigate the presence of regulatory cis-elements, two promoter analysis tools, PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) and PLACE (https://www.dna.affrc.go.jp/PLACE/?action=newplace), were used [331, 332]. The heat map of over-represented cis-elements was generated using the MultiExperiment Viewer (MeV) tool.
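Which coordinates the 1.5 kb upstream window covers depends on the strand of the gene. The helper below is a minimal sketch of that bookkeeping, assuming 1-based inclusive genomic coordinates; the function name and arguments are hypothetical and do not correspond to any NCBI retrieval option.

```python
PROMOTER_LENGTH = 1500  # 1.5 kb upstream region, as stated in the text


def promoter_interval(tss, strand, chrom_length, length=PROMOTER_LENGTH):
    """Return (start, end) of the region upstream of the TSS.

    Coordinates are 1-based and inclusive, and the TSS itself is excluded.
    For minus-strand genes the upstream region lies at higher coordinates,
    and the extracted sequence must be reverse-complemented before
    scanning it for cis-elements.
    """
    if strand == "+":
        start, end = tss - length, tss - 1
    elif strand == "-":
        start, end = tss + 1, tss + length
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip to the chromosome ends so the window never runs off the sequence
    return max(1, start), min(chrom_length, end)
```

Genes close to a chromosome end yield a window shorter than 1.5 kb, which is worth flagging when cis-element counts are compared across promoters.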

3.2.6 RNA-sequencing and analysis

Healthy seeds of a drought-resilient cowpea genotype (Kannanado, IITA) were germinated for four days and then grown in soil pots for six weeks under long-day photoperiod conditions (16 h light/8 h dark) at 28 °C, with white-light illumination (110 µmol photons m⁻² s⁻¹), to attain the mature vegetative stage. Some germinated seedlings were cultured hydroponically in modified Hoagland medium for 15 days (until the first trifoliate leaves were fully expanded) [333]. Mature leaves from the soil-grown plants and roots from the seedlings (5 g of each) were sampled for RNA sequencing and the associated library preparation. The expression patterns of the VuNAC genes in leaf and root tissues of different growth stages were analyzed separately. An initial quality assessment of the raw reads was carried out, and the low-quality reads (Phred quality score < 30) were removed using FastQC v0.11.7 [334] and the adapter sequences using NGSQC Toolkit v. 2.3.3 [335]. The high-quality reads of each sample were mapped separately to the cowpea genome, available in the NCBI Genome database under BioProject ID PRJNA381312, using HISAT2, and the transcriptome was then reconstructed from the RNA-seq reads using StringTie [336, 337]. The expression of the VuNAC transcripts was calculated as fragments per kilobase of exon per million fragments mapped (FPKM) using Cuffdiff v2.2.1.
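The FPKM unit itself is a simple normalization: the fragment count of a transcript is divided by its exon length in kilobases and by the total number of mapped fragments in millions. The sketch below shows only this arithmetic; Cuffdiff estimates fragment counts with its own statistical model, which is not reproduced here, and the function name is hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped.

    fragments              -- fragments assigned to the transcript
    exon_length_bp         -- summed exon length of the transcript, in bp
    total_mapped_fragments -- all mapped fragments in the library
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp to kb) * 1e6 (fragments to millions of fragments)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)
```

For example, a 2,000 bp transcript with 500 assigned fragments in a library of 20 million mapped fragments has an FPKM of 12.5, since 500 / 2 kb / 20 million = 12.5.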

3.2.7 Gene-interactome analysis

The orthologous Arabidopsis NAC TFs were used as a model to predict the gene interactome using the ATTED-II v. 10.1 tool (https://atted.jp/) [338]. The VuNAC genes were grouped into co-expressing clusters. The genes in the constructed network were subjected to ontology analysis using the PANTHER tool (http://go.pantherdb.org/geneListAnalysis.do) to predict their associated roles and regulatory genes [339].