Phylogenetic tree of WRKY TFs from C. arietinum, Medicago truncatula, and Arabidopsis thaliana

Materials and Methods:
The comparative phylogenetic tree was constructed from the protein sequences of putative WRKY TFs from C. arietinum (CarWRKY), Medicago truncatula (MedtrWRKY), and Arabidopsis thaliana (AtWRKY), with the AtWRKYs serving as reference sequences (Eulgem et al. 2000; Song and Nan 2014). For this purpose, the peptide sequences of 96 MedtrWRKYs were downloaded from the plant TF database webserver (PlantTFDB v4.0; Jin et al. 2016; http://planttfdb.cbi.pku.edu.cn/family.php?sp=Mtr&fam=WRKY), and the protein sequences of 72 AtWRKYs were obtained from Eulgem et al. (2000). Multiple sequence alignment of all 238 WRKYs was carried out using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/; Sievers et al. 2011). The resulting alignment was used to compute the phylogenetic tree by the neighbor-joining method (Tamura et al. 2013) with 1000 bootstrap replicates in the Molecular Evolutionary Genetics Analysis tool (MEGA v7.0; Kumar et al. 2016). Evolutionary distances were computed using the Poisson correction method and are expressed in units of the number of amino acid substitutions per site. All positions with less than 95% site coverage were eliminated; that is, fewer than 5% alignment gaps, missing data, and ambiguous bases were allowed at any position. The final dataset contained a total of 71 amino acid positions. The resulting tree was then used to infer the evolutionary history and possible functional roles of the WRKY TFs.
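The site-coverage filter and the Poisson correction described above can be illustrated with a minimal Python sketch. This is not the MEGA implementation: the function names, the set of symbols treated as gaps or ambiguous residues, and the toy alignment are assumptions made for illustration only. The distance follows the standard Poisson correction, d = -ln(1 - p), where p is the proportion of differing amino acid sites between two sequences.

```python
import math

# Symbols treated as gap / missing / ambiguous (an assumption for this sketch).
GAP_CHARS = set("-?X")


def filter_sites(alignment, min_coverage=0.95):
    """Keep only alignment columns whose site coverage is >= min_coverage.

    Site coverage is the fraction of sequences carrying an unambiguous
    residue at that column.
    """
    n_seqs = len(alignment)
    length = len(alignment[0])
    keep = []
    for col in range(length):
        present = sum(1 for seq in alignment if seq[col] not in GAP_CHARS)
        if present / n_seqs >= min_coverage:
            keep.append(col)
    return ["".join(seq[c] for c in keep) for seq in alignment]


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p) between two aligned sequences."""
    pairs = [
        (a, b)
        for a, b in zip(seq_a, seq_b)
        if a not in GAP_CHARS and b not in GAP_CHARS
    ]
    if not pairs:
        raise ValueError("no comparable sites between the two sequences")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        return math.inf  # distance is undefined when every site differs
    return -math.log(1.0 - p)


# Hypothetical toy alignment (not real WRKY sequences).
toy = ["MKWRKYGQK", "MKWRKYGQK", "M-WRKYGEK"]
filtered = filter_sites(toy)  # the gapped second column is dropped
d = poisson_distance(filtered[0], filtered[2])
```

In this toy case one of the eight retained sites differs between the first and third sequences, so p = 1/8 and d = -ln(0.875). A pairwise matrix of such distances is the input that a neighbor-joining procedure would then cluster.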
Analysis Results:
An unrooted neighbor-joining comparative phylogenetic tree was constructed from 238 protein sequences of CarWRKYs, MedtrWRKYs, and AtWRKYs. The tree divided the WRKY proteins into three major clusters of orthologous genes (MCOG-I, II, and III) (Fig. 3). MCOG-I and MCOG-II were further subdivided into seven subclasses, i.e., IN, IC, IIa, IIb, IIc, IId, and IIe, as reported by Eulgem et al. (2000). MCOG-II is the largest major group of the phylogenetic tree, with 120 WRKYs distributed in five subgroups, i.e., 22 in IIa, 13 in IIb, 40 in IIc, 27 in IId, and 18 in IIe. MCOG-I is the second-largest major group, with 79 WRKYs distributed in two subgroups, i.e., 57 in IN and 22 in IC. MCOG-III is the smallest major group, with 39 WRKYs from all three species. Moreover, orthologous gene pairs were identified among the three species in all three major groups (Table S1). For instance, MCOG-I contained 21 orthologous gene pairs, 14 in subgroup IN and 7 in IC. Similarly, 31 orthologous gene pairs were identified in MCOG-II, distributed across its five subgroups as follows: 4 in IIa, 4 in IIb, 10 in IIc, 7 in IId, and 6 in IIe. Only six pairs of orthologous genes were identified in MCOG-III. Notably, the CarWRKYs shared more homology with the MedtrWRKYs than with the AtWRKYs. Overall, 57 orthologous gene pairs were identified between chickpea and M. truncatula. This sequence similarity between the WRKY proteins of the two species reflects the fact that both chickpea and M. truncatula are members of the galegoid clade of the Papilionoideae subfamily of the Fabaceae family. It has already been reported that chickpea proteins share greater homology with those of M. truncatula than with those of A. thaliana (Varshney et al. 2013).
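As a quick consistency check, the subgroup counts reported above can be tallied against the group totals and the 238 sequences in the tree. The sketch below only re-adds the numbers given in the text; the dictionary layout and variable names are illustrative assumptions.

```python
# WRKY members per subgroup, as reported in the text.
member_counts = {
    "MCOG-I": {"IN": 57, "IC": 22},
    "MCOG-II": {"IIa": 22, "IIb": 13, "IIc": 40, "IId": 27, "IIe": 18},
    "MCOG-III": {"III": 39},
}

# Orthologous gene pairs per subgroup, as reported in the text.
ortholog_pairs = {
    "MCOG-I": {"IN": 14, "IC": 7},
    "MCOG-II": {"IIa": 4, "IIb": 4, "IIc": 10, "IId": 7, "IIe": 6},
    "MCOG-III": {"III": 6},
}

# Sum each group's subgroups, then sum the groups.
member_totals = {g: sum(sub.values()) for g, sub in member_counts.items()}
pair_totals = {g: sum(sub.values()) for g, sub in ortholog_pairs.items()}
total_members = sum(member_totals.values())
```

The per-group sums reproduce the reported totals of 79, 120, and 39 members, which together account for all 238 sequences, and the per-group ortholog sums reproduce the reported 21, 31, and 6 pairs.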