Highly efficient, processive and multifunctional recombinant endoglucanase RfGH5_4 from Ruminococcus flavefaciens FD-1 v3 for recycling lignocellulosic plant biomasses

The gene encoding the endoglucanase RfGH5_4 from R. flavefaciens FD-1 v3 was cloned, expressed in Escherichia coli BL21(DE3) cells and purified. RfGH5_4 showed a molecular size of 41 kDa and maximum activity at pH 5.5 and 55 °C. It was stable between pH 5.0 and 8.0, retaining 85% activity, and between 5 °C and 45 °C, retaining 75% activity, after 60 min. RfGH5_4 displayed maximum activity (U/mg) against barley β-D-glucan (665), followed by carboxymethyl cellulose (450), xyloglucan (343), konjac glucomannan (285), phosphoric acid swollen cellulose (86), beechwood xylan (21.7) and carob galactomannan (16), thereby displaying multifunctionality. The catalytic efficiency (mL mg⁻¹ s⁻¹) of RfGH5_4 against carboxymethyl cellulose (146) and konjac glucomannan (529) was significantly high. TLC and MALDI-TOF-MS analyses of RfGH5_4-treated hydrolysates of cellulosic and hemicellulosic polysaccharides displayed oligosaccharides with a degree of polymerization (DP) between DP2 and DP11. TLC, HPLC and Processivity-Index analyses revealed RfGH5_4 to be a processive endoglucanase: for the first 30 min it hydrolysed cellulose to cellotetraose, followed by persistent release of cellotriose and cellobiose. RfGH5_4 yielded sufficiently high Total Reducing Sugar (TRS, mg/g) from alkali pre-treated sorghum (72), finger millet (62), sugarcane bagasse (38) and cotton (27) in a 48 h saccharification reaction. Thus, RfGH5_4 can be considered a potential endoglucanase for renewable energy applications.


Introduction
Cellulose is the most abundant polysaccharide on earth and is primarily present in plant biomass [1]. It is an unbranched polysaccharide made up of glucose units joined by β-1,4-linkages. Cellulose can be hydrolyzed to the monosaccharide glucose and subsequently converted to bioethanol [2]. Therefore, cellulosic plant biomass is seen as a significant renewable energy source. Cellulose is hydrolyzed synergistically to cellooligosaccharides, cellobiose and glucose by various bacterial or fungal cellulases, namely endoglucanase (EC 3.2.1.4), exoglucanase (EC 3.2.1.91) and β-glucosidase (EC 3.2.1.21). Endoglucanase randomly hydrolyzes the cellulose chain to cellooligosaccharides, whereas exoglucanase produces cellobiose units [3]. The cellooligosaccharides and cellobiose produced are then hydrolyzed to D-glucose by β-glucosidase [4]. Cellulases are distributed over 15 different glycoside hydrolase (GH) families of the Carbohydrate Active Enzyme (CAZy) database (http://www.cazy.org). The classification is based on the amino acid sequence and function of the enzyme [5]. The glycoside hydrolase family 5 (GH5) is one of the largest GH families. Subfamily 4 (GH5_4) of family GH5 is known for cellulases showing broad substrate specificity. The enzymes from the subfamily GH5_4 can display multifunctional activities by acting as endoglucanase ( [6]. The conversion of lignocellulosic biomass to bioethanol currently faces a bottleneck, namely the unavailability of catalytically efficient and stable endoglucanases. Moreover, lignocellulosic biomass contains hemicellulose along with cellulose, so it is highly advantageous if the endoglucanase hydrolyzes more than one cellulosic and hemicellulosic substrate. Recently, an alkali-stable bifunctional β-glucanase, Pgl5A of subfamily 4 of family GH5 from Paenibacillus sp. S09, was characterized; it showed a specific activity of 94.5 U/mg on tamarind xyloglucan, but its activity against CMC-Na was negligible (0.5 U/mg) [7]. Another multifunctional alkali-tolerant family GH5 endoglucanase, Thrcel5A, from Thermoactinospora rubra YIM 77501T was reported to show a specific activity of 85.7 U/mg against CMC-Na and 22.9 U/mg against beechwood xylan at the optimum pH of 8.5 and 60 °C [8]. An endoglucanase, SoCel5, from bacterium Stegonsporium opalus from family GH5 showed a specific activity of 350 U/mg and a catalytic efficiency of 77 mL mg⁻¹ s⁻¹ on carboxymethyl cellulose (CMC-Na) at pH 6 and 60 °C [9]. Another endoglucanase, CelR5 of family GH5 from a rhizosphere metagenomic library, showed a specific activity of 15 U/mg and a kcat of 9.7 s⁻¹ against CMC-Na [10]. Moreover, a thermophilic microbial consortium developed on rice straw from vermicompost showed only 20 U/mg of endoglucanase activity [11]. Lignocellulose deconstruction, however, demands endoglucanases with higher catalytic efficiency, a broad range of substrate specificity, pH stability and thermostability. Multifunctional and efficient endoglucanases could reduce the cost of bioethanol production and increase productivity, thereby making the bioethanol industry economically sustainable. The aim of this study was to explore a new endoglucanase, RfGH5_4, from the ruminant gut bacterium Ruminococcus flavefaciens FD-1 v3 for lignocellulosic biomass conversion.
Bacteria inhabiting the rumen of herbivores have long been under selective pressure to use cellulose. These rumen microorganisms have evolved efficient machinery, i.e., the cellulosomes, to deconstruct cellulose. Ruminococcus flavefaciens FD-1 v3 is an anaerobic, mesophilic, Gram-positive bacterium. The FD-1 v3 strain of R. flavefaciens was isolated from bolus earlier [12]. R. flavefaciens resides in the intestine of monogastric mammalian herbivores as part of the natural microbiome [13]. It is the most populated cellulolytic inhabitant of the animal rumen. Genome sequencing analysis of R. flavefaciens FD-1 v3 uncovered a vast array of cellulosomal genes, which are putatively considered versatile for the efficient degradation of lignocellulose [14]. A putative multienzyme complex (cellulosome) of family GH5 from R. flavefaciens FD-1 v3, named RfGH5 1/2, consists of an endoglucanase module (RfGH5_4) at the N-terminal domain followed by a family 80 Carbohydrate Binding Module (CBM80), a catalytic endo-mannanase module (RfGH5_7) and a dockerin at the C-terminal [13]. The module RfGH5_7 was recently characterized as an efficient endo-mannanase [15]. In the present investigation, the role of the catalytic module RfGH5_4 of RfGH5 1/2 has been investigated. The gene encoding the putative endoglucanase RfGH5_4 from R. flavefaciens FD-1 v3 (GenBank Accession Number WP_009984467.1) was cloned and expressed in the E. coli BL21 expression system. The purified enzyme was biochemically characterized, and its application in the saccharification of various lignocellulosic biomasses was investigated to explore its potential for bioethanol production.
This study describes the biochemical and kinetic parameters of a multifunctional endoglucanase (RfGH5_4), such as pH and temperature optima, stability and catalytic efficiency. These properties play a pivotal role in evaluating the importance of a cellulase in the renewable energy sector. Based on its unique properties, RfGH5_4 is proposed as a potent cellulase for lignocellulosic bioethanol production. The efficiency of RfGH5_4 in lignocellulosic biomass hydrolysis was tested by saccharification of six different alkali pre-treated biomasses, namely cotton main stalk, cotton small branches, sorghum stalk, sugarcane bagasse, finger millet stalk and maize leaves. The hydrolysis of hemicellulosic chains such as xyloglucan and glucomannan by multifunctional RfGH5_4 would further increase the accessibility of the cellulose content of lignocellulose. Moreover, RfGH5_4 was found to be a processive endoglucanase that releases cellobiose, which will reduce the need for cellobiohydrolase during the saccharification of biomass and thereby help to reduce the cost of the saccharification process. In addition, RfGH5_4 was found to be remarkably stable in ethanol, making it a suitable endoglucanase for the Simultaneous Saccharification and Fermentation (SSF) process for bioethanol production.

Bacterial strains, vectors
pHTP1, the bacterial expression vector, was received from Nzytech genes and enzymes Pvt. Ltd., Lisbon, Portugal. E. coli TOP10 competent cells were used to amplify recombinant pHTP1 vector containing the gene encoding RfGH5_4. E. coli BL21 (DE3) (Novagen) cells were used for the expression of the RfGH5_4 enzyme.

Molecular architecture of RfGH5_4
The sequence of a multienzyme complex of family GH5 from R. flavefaciens FD-1 v3, named RfGH5 1/2, which contains the putative catalytic module RfGH5_4, was taken from the CAZy database (GenBank accession number WP_009984467.1). The amino acid sequence of RfGH5 1/2 was analysed for conserved domains and homologs by using the conserved domain database, NCBI-CDD (http://www.ncbi.nlm.nih.gov/cdd), and NCBI-blast analysis [16], respectively. The signal peptide was located in the RfGH5 1/2 sequence with the assistance of the SignalP 3.0 server (http://www.cbs.dtu.dk/services/SignalP).

Gene amplification and cloning
The gene encoding RfGH5_4 was amplified from the genomic DNA of R. flavefaciens FD-1 v3 by PCR using an NZYProof DNA polymerase (Nzytech genes and enzymes Ltd., Lisbon, Portugal) kit following the manufacturer's protocol. The forward and reverse primers used for PCR were F: 5′-TCAGCAAGGGCTGAGGGCTTCCAACATGACCGCAAG-3′ and R: 5′-TCAGCGGAAGCTGAGGTTATACTCCGAGTACTTCCATC-3′, respectively. PCR amplification was performed by initial denaturation at 95 °C for 3 min, followed by 30 cycles of denaturation at 95 °C for 30 s, annealing at 50 °C for 30 s and extension at 72 °C for 60 s, and a final extension at 72 °C for 10 min. The amplified gene encoding RfGH5_4 was cloned into the linearized pHTP1 vector by restriction endonuclease- and ligase-independent, temperature-dependent directional cloning using the NZYEasy cloning and expression kit (MB282), following the manufacturer's protocol. The pHTP1 vector supplied with the kit, which contains the complementary overhangs, was added to the PCR product amplified using the primers with appropriate 5′ extensions (underlined in the primer sequences above). Through base-pair complementation, the gene of interest was joined to the vector using the NZYEasy enzyme mix. The cloned gene encoding RfGH5_4 was sequenced to confirm that no mutations were generated during the amplification process. A His₆-tag was incorporated at the N-terminal of RfGH5_4 by the pHTP1 vector. The recombinant plasmid containing the pHTP1 vector and the gene encoding RfGH5_4 was transformed into E. coli TOP10 competent cells for plasmid amplification.

Expression and purification of RfGH5_4
The E. coli BL21 (DE3) cells were transformed with the recombinant pHTP1 plasmid containing the gene encoding RfGH5_4. The expression and purification of RfGH5_4 were performed as per the protocol given in the literature [17]. Briefly, the expression of RfGH5_4 by E. coli BL21 (DE3) cells was carried out in 400 mL LB medium containing 50 μg/mL kanamycin and 1 mM IPTG by incubating at 24 °C. The cell pellet was resuspended in 5 mL of 20 mM sodium phosphate buffer (pH 7.0) containing 50 mM imidazole and 300 mM NaCl and subjected to sonication. The purification of RfGH5_4 was performed by Immobilized Metal Ion Affinity Chromatography (IMAC). The cell-free extract obtained by centrifugation of the sonicated pellet was loaded onto a Ni²⁺-charged IMAC column (1.0 mL HiTrap, GE Healthcare, USA). The His₆-tagged RfGH5_4 protein was eluted with 5 mL of elution buffer (20 mM sodium phosphate, pH 7.0, 300 mM NaCl, 300 mM imidazole). The eluted RfGH5_4 protein was dialyzed against 20 mM sodium phosphate buffer (pH 7.0) and run on a 12% (w/v) sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gel to check its purity. The gel was stained with Coomassie Brilliant Blue R-250 to observe the protein bands. The Bradford method was used for estimating the concentration of purified protein, with Bovine Serum Albumin (BSA) as the standard.

Substrate specificity of RfGH5_4
The enzyme activity of RfGH5_4 was determined against different carbohydrate polymers: β-1,4-glucans (CMC-Na, HEC, PASC, avicel and cellulose powder), mixed β-1,3;1,4-linked glucans (barley β-D-glucan and lichenan), tamarind xyloglucan, konjac glucomannan, carob galactomannan and xylans (beechwood and birchwood xylan). The reaction mixture (100 μL) consisted of 1% (w/v) of the polysaccharide substrate dissolved in 20 mM citrate-phosphate buffer (pH 5.5) and 10 μL of purified RfGH5_4 enzyme (5 μg/mL). The reaction mixture was incubated at 55 °C for an optimized time period of 2 min. For avicel and cellulose powder, the reaction was carried out at 30 °C with rotation at 200 rpm for 2 h, as reported earlier [18]. The temperature of 30 °C was selected for the assay of avicel and cellulose powder because RfGH5_4 is not thermally stable at 55 °C when incubated for more than 10 min. The released reducing sugar was quantified by the methods of Nelson [19] and Somogyi [20], with D-glucose as the standard, to calculate the enzyme activity of RfGH5_4. One unit of specific activity was defined as the μmol of glucose produced from the substrate per minute per mg of the enzyme (μmol/min/mg) under the optimized reaction conditions. All the assays were performed in triplicate.
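The unit definition above can be sketched as a short calculation. This is only an illustration: the function name and the reducing-sugar reading (0.45 μmol) are hypothetical, and the enzyme amount assumes that 5 μg/mL is the final concentration in the 100 μL reaction.

```python
def specific_activity(glucose_umol, minutes, enzyme_mg):
    """Specific activity in U/mg: umol glucose equivalents per min per mg enzyme."""
    if minutes <= 0 or enzyme_mg <= 0:
        raise ValueError("time and enzyme mass must be positive")
    return glucose_umol / (minutes * enzyme_mg)


# Assay geometry taken from the text; 5 ug/mL is ASSUMED to be the final
# enzyme concentration in the 100 uL reaction.
reaction_volume_ml = 0.100
enzyme_conc_ug_per_ml = 5.0
enzyme_mg = enzyme_conc_ug_per_ml * reaction_volume_ml / 1000.0

# Hypothetical reducing-sugar reading from a glucose standard curve (umol).
glucose_umol = 0.45

activity = specific_activity(glucose_umol, minutes=2.0, enzyme_mg=enzyme_mg)
print(f"Specific activity: {activity:.0f} U/mg")
```

Keeping the enzyme mass as an explicit argument makes it easy to redo the calculation if 5 μg/mL refers to the enzyme stock rather than to the final reaction concentration.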

Biochemical characterization of RfGH5_4
The effect of pH on the enzyme activity of RfGH5_4 was studied by using 20 mM buffers covering the following pH ranges: sodium citrate-phosphate (pH 3.0-7.0) and sodium phosphate (pH 5.8-8.0). The reaction mixture (100 μL) at each pH, containing 1% (w/v) CMC-Na and 10 μL (5 μg/mL) of the enzyme, was incubated at 55 °C for 2 min. The released reducing sugar was estimated as described in Section 2.6 and the enzyme activity was calculated. The effect of temperature on the activity of RfGH5_4 was studied by incubating the 100 μL reaction mixture in 20 mM citrate-phosphate buffer (pH 5.5) for 2 min at temperatures between 30 °C and 80 °C. The pH stability of RfGH5_4 was assessed by incubating 100 μL of enzyme at various pH values in the buffers 20 mM sodium citrate-phosphate (pH 3.0-7.0), 20 mM sodium phosphate (pH 5.8-8.0), 20 mM MES (pH 5.5-6.7) and 20 mM Tris-Cl (pH 7.5-9.0) at 30 °C for 60 min. A 10 μL aliquot of the enzyme was then taken in a 100 μL reaction mixture and assayed at the optimized reaction pH (5.5) and temperature (55 °C) for 2 min. The thermostability of RfGH5_4 was analysed by incubating 100 μL of the enzyme (5 μg/mL) at different temperatures (5 °C-70 °C) for 60 min. A 10 μL aliquot of this incubated enzyme was used for the enzyme assay in a 100 μL reaction mixture, as mentioned earlier, to calculate the specific activity at the optimum reaction pH (5.5) and temperature (55 °C). The highest activity obtained in the pH stability and thermostability experiments was considered 100% to calculate the relative pH stability and thermostability of RfGH5_4, respectively. All the assays were performed in triplicate.
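The normalisation used for the stability data (highest activity set to 100%) can be sketched as follows. The activity values in the dictionary are hypothetical placeholders, not measurements from this study, and the function name is an illustrative choice.

```python
def relative_activity(activities):
    """Scale activities so that the highest value in the series equals 100%."""
    peak = max(activities.values())
    if peak <= 0:
        raise ValueError("at least one activity must be positive")
    return {condition: 100.0 * value / peak
            for condition, value in activities.items()}


# Hypothetical residual activities (U/mg) after 60 min pre-incubation,
# keyed by incubation temperature in degrees Celsius.
residual = {5: 340.0, 30: 336.0, 45: 255.0, 55: 10.0}

relative = relative_activity(residual)
for temperature, percent in sorted(relative.items()):
    print(f"{temperature:>3} C: {percent:5.1f}% of maximum")
```

Because the reference is the best value within each experiment, percentages from the pH stability and thermostability series are not directly comparable with each other unless the two maxima coincide.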

Protein melting curve of RfGH5_4
The temperature at which RfGH5_4 melts was deduced by subjecting 50 μg/mL of enzyme in freshly prepared 20 mM MES buffer (pH 5.5) to increasing temperature, with an increment of 1 °C per min from 25 °C to 100 °C, on a UV-visible spectrophotometer (Varian, Cary 100 Bio) fitted with a Peltier temperature controller. The change in absorbance at 280 nm (A₂₈₀) was recorded at varying temperatures, thereby generating a curve of absorbance against temperature. The effect of 10 mM K⁺ and Ca²⁺ ions on the stability and melting temperature of RfGH5_4 was also analysed. An independent experiment was carried out by incubating the enzyme at different temperatures in the presence of 10 mM EDTA to examine its effect on the enzyme structure.

Kinetic parameter analysis of RfGH5_4 against polysaccharides
The kinetic parameters of RfGH5_4 were evaluated by the Michaelis-Menten equation under steady-state conditions using different substrates. The enzyme (5 μg/mL) was incubated with different concentrations of the carbohydrate substrates barley β-D-glucan, CMC-Na, tamarind xyloglucan, lichenan and konjac glucomannan in a 100 μL reaction volume in 20 mM citrate-phosphate buffer (pH 5.5) at 55 °C for 2 min. The equivalent concentration of substrate was taken as the blank and all the reactions were performed in triplicate. The enzyme activity was calculated as mentioned in Section 2.6. A double-reciprocal (Lineweaver-Burk) plot [22] of the Michaelis-Menten equation [23] was generated by using GraphPad Prism 6 to determine the kinetic parameters, namely Km, Vmax, turnover number (kcat) and catalytic efficiency (kcat/Km, mL mg⁻¹ s⁻¹).
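A double-reciprocal fit of this kind can be sketched in plain Python. This is not the GraphPad Prism procedure used in the study, only an ordinary least-squares line through 1/v against 1/[S]. The substrate concentrations are hypothetical, and the velocities are synthetic, noise-free values generated from the Michaelis-Menten equation with the Km and Vmax reported for CMC-Na, so the fit simply recovers those inputs.

```python
def lineweaver_burk(substrate, velocity):
    """Fit 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax by ordinary least squares.

    Returns (Km, Vmax) in the units of the inputs.
    """
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in velocity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # Km / Vmax
    intercept = mean_y - slope * mean_x  # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic, noise-free data: Km (mg/mL) and Vmax (U/mg) as reported for
# CMC-Na; the substrate concentrations (mg/mL) are hypothetical.
KM_INPUT, VMAX_INPUT = 2.47, 525.0
substrate = [0.5, 1.0, 2.0, 4.0, 8.0]
velocity = [VMAX_INPUT * s / (KM_INPUT + s) for s in substrate]

km, vmax = lineweaver_burk(substrate, velocity)
print(f"Km = {km:.2f} mg/mL, Vmax = {vmax:.0f} U/mg")
```

With real, noisy measurements the reciprocal transform gives the lowest substrate concentrations the largest influence on the fitted line, so a direct non-linear fit of the Michaelis-Menten equation is a common cross-check.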

Hydrolysis of various cellulosic and hemicellulosic substrates by RfGH5_4
For mass spectrometric analysis, 1 μL of hydrolysate produced was mixed with 1 μL of DHB (10 mg/mL) matrix (in 50:50%, v/v acetonitrile: water, and 0.1%, v/v trifluoroacetic acid). The analysis was performed as per the earlier reported protocol [24]. MALDI-TOF MS was performed on time-of-flight mass spectrometer Autoflexspeed (Bruker Daltonics, Bremen, Germany) in positive reflectron mode. It was provided with the accelerating voltage of 4.9 × 2400 V. The time delay for pulse ion extraction and laser shot frequency was set at 120 ns and 2000 Hz, respectively. Two thousand shots per sample were recorded. The m/z range was plotted from 350 so that interference of high-intensity DHB matrix signals could be minimized.
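For orientation, the expected positions of linear hexose oligosaccharides in such a spectrum can be estimated from monoisotopic masses. The sketch below assumes that the oligosaccharides are detected as sodium adducts, [M+Na]⁺, which is common for neutral sugars in positive-mode MALDI but is not stated in the text; it applies to gluco- and manno-oligosaccharides only, not to xylo-oligosaccharides.

```python
# Monoisotopic masses in Da.
HEXOSE_RESIDUE = 162.052825  # C6H10O5, one anhydro-hexose unit
WATER = 18.010565            # H2O, added once per linear chain
SODIUM_ION = 22.989221       # Na minus one electron (ASSUMED adduct)


def sodium_adduct_mz(dp):
    """m/z of the singly charged [M+Na]+ ion of a linear hexose oligomer."""
    if dp < 1:
        raise ValueError("degree of polymerization must be at least 1")
    return dp * HEXOSE_RESIDUE + WATER + SODIUM_ION


for dp in range(1, 12):
    print(f"DP{dp:<2} [M+Na]+ = {sodium_adduct_mz(dp):9.4f}")
```

Under this assumption, DP2 (about m/z 365) is the smallest hexose oligomer that lies above the plotted lower limit of m/z 350, and successive peaks are spaced by one hexose residue (about 162).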

Processivity of RfGH5_4
The mode of action of RfGH5_4 was further investigated to determine whether the enzyme hydrolyzes cellulosic substrates in a processive manner (consistent release of two short oligosaccharides) or a non-processive manner (random release of higher oligosaccharides). The reaction mixture (100 μL) containing 1% (w/v) PASC (10 mg/mL) in 20 mM citrate-phosphate buffer (pH 5.5) and 10 μL of 50 μg/mL RfGH5_4 was incubated at 30 °C for different time intervals, namely 1, 2, 5, 10, 15 and 30 min and 1, 4, 12 and 24 h. In view of the prospective role of RfGH5_4 in SSF, 30 °C was selected as the reaction temperature. The Processivity Index (PI; reducing sugar in the soluble fraction/reducing ends in the insoluble fraction) was calculated by estimating the total reducing sugar in the soluble fraction and the reducing ends in the insoluble fraction, as discussed in Section 2.6. The TLC analysis was performed as described in Section 2.11. The oligosaccharide products generated during hydrolysis of PASC were also analysed by HPLC, using a MetaCarb 67C analytical reverse phase column (300 × 6.5 mm) at 85 °C with double distilled water at a flow rate of 0.5 mL/min as the mobile phase. The chromatogram generated for each reaction was analysed for the presence of oligosaccharides using Lab Solution software (Shimadzu, Japan) and was compared with the chromatograms of cellooligosaccharide standards (G1, G2, G3 and G4) generated under the same experimental conditions.
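The Processivity Index defined above is a simple ratio, sketched here for a time course. All readings in the example are hypothetical, and both fractions are assumed to be expressed in the same units (μmol of glucose equivalents).

```python
def processivity_index(soluble_reducing_sugar, insoluble_reducing_ends):
    """PI = reducing sugar in the soluble fraction / reducing ends in the
    insoluble fraction (both in the same units)."""
    if insoluble_reducing_ends <= 0:
        raise ValueError("insoluble reducing ends must be positive")
    return soluble_reducing_sugar / insoluble_reducing_ends


# Hypothetical time course on PASC:
# (time in min, soluble umol, insoluble umol).
timecourse = [(5, 0.04, 0.03), (30, 0.12, 0.05), (240, 0.40, 0.08)]

pi_by_time = {minutes: processivity_index(soluble, insoluble)
              for minutes, soluble, insoluble in timecourse}
for minutes, pi in pi_by_time.items():
    print(f"{minutes:>4} min: PI = {pi:.2f}")
```

A ratio that rises over time means soluble products accumulate faster than new chain ends appear on the insoluble substrate, which is the pattern expected when the enzyme makes repeated cuts along a chain before dissociating.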
For saccharification, each dried pre-treated biomass (2%, w/v) was individually suspended in 600 μL of 20 mM citrate-phosphate buffer (pH 5.5). Sodium azide (10 μL; 0.05%, w/v) was also added to the reaction to prevent any contamination. The pre-treated biomasses were independently saccharified in a final reaction volume of 1 mL at 30 °C and 180 rpm for 48 h by using 390 μL of protein from a 50 μg/mL stock of purified RfGH5_4 endoglucanase. The final reaction concentration of RfGH5_4 was 19.5 μg/mL, or 87 PASC U/g of biomass. The TRS was estimated as described in Section 2.6, whereas the TLC analysis of the hydrolysed products of the various biomasses was performed as discussed in Section 2.11.
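The enzyme dose stated above follows from a simple dilution, and the TRS yield (mg/g) follows from the measured sugar concentration and the biomass mass. In the sketch below the stock concentration and volumes are from the text, while the TRS reading and the 20 mg biomass mass (2%, w/v, taken relative to the 1 mL final volume) are assumptions made only for illustration.

```python
def final_concentration(stock_ug_per_ml, added_ul, final_volume_ul):
    """Concentration (ug/mL) after diluting a stock into the final volume."""
    return stock_ug_per_ml * added_ul / final_volume_ul


def trs_yield_mg_per_g(trs_mg_per_ml, reaction_ml, biomass_mg):
    """Total reducing sugar released per gram of dry biomass (mg/g)."""
    if biomass_mg <= 0:
        raise ValueError("biomass mass must be positive")
    return trs_mg_per_ml * reaction_ml / (biomass_mg / 1000.0)


# Values from the text: 390 uL of a 50 ug/mL stock in a 1 mL reaction.
enzyme_ug_per_ml = final_concentration(50.0, 390.0, 1000.0)
print(f"Final enzyme concentration: {enzyme_ug_per_ml:.1f} ug/mL")

# Hypothetical TRS reading; biomass mass ASSUMED to be 20 mg.
example_yield = trs_yield_mg_per_g(trs_mg_per_ml=1.0, reaction_ml=1.0,
                                   biomass_mg=20.0)
print(f"Example TRS yield: {example_yield:.0f} mg/g")
```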

Cloning, expression and purification of RfGH5_4
The gene (1053 bp) encoding the catalytic module RfGH5_4, without signal peptide and linker sequences, was cloned in the pHTP1 expression vector. It was expressed in E. coli BL21 (DE3) cells and purified by IMAC. The purified protein showed a single, homogeneous band with a molecular size of approximately 41 kDa on a 12% (w/v) SDS-PAGE gel (Fig. 1b, Lane 7). This was in agreement with the molecular mass of 41.18 kDa calculated theoretically from its amino acid sequence. The purified RfGH5_4 was used for biochemical characterization.
The multifunctionality is less commonly observed among endoglucanases. This characteristic of RfGH5_4 is noteworthy here, which hydrolyses cellulosic as well as hemicellulosic substrates with a notably higher or significant activity. As indicated in a recent report, aromatic residues like tryptophan near the active site protruding outwards from the surface of the protein could be credited for the substrate selectivity and multifunctionality of GH5_4 glucanases [34]. The efficient hydrolysis of cellulosic substrates like CMC-Na and PASC and various hemicellulosic polymers such as barley β-D-glucan and lichenan by RfGH5_4 makes it a potential cellulase candidate in the valorisation of lignocellulosic biomass. Overall, the multifunctional nature of RfGH5_4 widens its canvas in the application sectors like bioenergy, synthesis of prebiotics, modification of cellulosic surfaces, paper, pulp, food and pharmaceutical industry [35].

Biochemical properties of RfGH5_4
The optimization of pH and temperature for the assay of RfGH5_4 was performed by using CMC-Na as the substrate. Maximum activity of RfGH5_4 was recorded at 55 °C and pH 5.5 (Fig. 2a and b, respectively). RfGH5_4 retained most of its activity in the temperature range of 5 °C-45 °C after 1 h of incubation (Fig. 2c); it retained 75% activity at 45 °C. However, the enzyme was less stable at its optimum reaction temperature of 55 °C, at which it was almost inactive after 1 h of incubation. RfGH5_4 displayed stability in the pH range 5-8 in 20 mM sodium phosphate and MES buffers (Fig. 2d). Thus, despite being optimally active at a higher temperature, RfGH5_4 would be most suitable for cellulosic bioethanol production at 30 °C, where it displays a significant activity of 149 U/mg. Moreover, as 30 °C is an appropriate temperature for fermentation, RfGH5_4 could help in single-step fermentation such as SSF. The ethanol tolerance analysis of RfGH5_4 at 30 °C for 96 h showed that RfGH5_4 is able to tolerate around 20% ethanol, at which it retained 80% of its activity (Fig. S1). The highest concentration of ethanol reported during SSF is around 7% [36], and RfGH5_4 retained 95% activity at 7% ethanol after 96 h of incubation. Thus, the higher ethanol tolerance of RfGH5_4 enhances its suitability for SSF.
Figure caption note: All the experiments were carried out in triplicate (n = 3), and the mean ± SD for each experiment is shown.

Effect of metal ions and other additives on RfGH5_4 activity
The influence of various cations and additives on the activity of RfGH5_4 was explored using CMC-Na as the substrate. The activity of RfGH5_4 increased in the presence of 10 mM K⁺ or Li⁺ ions (Table 2). A family GH5 endoglucanase, CelRH5 from the rhizosphere, also showed a similar increase in enzyme activity in the presence of Li⁺ or K⁺ ions [10]. K⁺ and Li⁺ cations can alter the active site structure of an endoglucanase, thereby increasing the enzyme activity [37]. The K⁺ ions may act as a mediator between the substrate and the enzyme active site, which is characteristic of a Type II enzyme activator, thereby inducing conformational changes in RfGH5_4, as also reported earlier [38]. Thus, the increase in the enzyme activity of RfGH5_4 in the presence of K⁺ ions could be due to their participation at the active site, thereby stabilizing the intermolecular interactions. The use of K⁺ ions to enhance the enzyme activity of RfGH5_4 during synergistic saccharification of lignocellulosic biomasses could be further explored. Mg²⁺ and Na⁺ ions at 10 mM decreased the enzyme activity of RfGH5_4 by 33% and 8%, respectively. Surprisingly, Ca²⁺ ions at 10 mM drastically lowered the RfGH5_4 enzyme activity to 29%. Similarly, Ni²⁺, Fe²⁺, Co²⁺, Zn²⁺, Mn²⁺ and Cu²⁺ ions adversely affected the enzyme activity of RfGH5_4, causing an 80% to 100% loss of activity (Table 2). A similar abrupt decrease in enzyme activity caused by these metal ions was also reported for the endoglucanases Ba-EGA from Bacillus sp. AC-1 [39] and CelRH5 from the rhizosphere [10].
Interestingly, RfGH5_4 enzyme activity increased by 10% in the presence of 10 mM EDTA. Similarly, 10 mM EGTA increased the enzyme activity of RfGH5_4 by 32% (Table 2). The increase in enzyme activity caused by EDTA and EGTA indicated the absence of Ca²⁺ or Mg²⁺ ions in the RfGH5_4 structure [40]. This increase could be attributed to a flexible conformational change, increased kcat and overall stability of RfGH5_4 imparted by these chelating agents, as reported for an endoglucanase from the fungus Aspergillus aculeatus. Guanidine hydrochloride at 10 mM decreased the RfGH5_4 enzyme activity by 23%, whereas the activity was reduced by 15% in the presence of 100 mM urea. RfGH5_4 retained 98% and 99% of its enzyme activity in the presence of 1% (v/v) DMSO and 10% (v/v) glycerol, respectively. The enzyme activity of RfGH5_4 increased by 8.7% in the presence of 0.5% (v/v) Triton X-100, whereas 1% (v/v) Tween 80 decreased it by 17% (Table 2). RfGH5_4 lost 100% of its enzyme activity in the presence of 0.1% SDS, which indicates the sensitivity of RfGH5_4 towards anionic detergents. Endoglucanase CS10 from Hermetia illucens was reported to show a similar decrease in enzyme activity in the presence of guanidine hydrochloride, urea and SDS [41].

Protein melting analysis of RfGH5_4
The complete melting of RfGH5_4 was observed at 70 °C (Tm). The addition of 10 mM EDTA slightly increased the Tm of RfGH5_4 by 2 °C (to 72 °C), showing the absence of any metal ion in the enzyme structure (Fig. 2e). This corroborated the result from the previous section, where incubation of RfGH5_4 with 10 mM EDTA increased the enzyme activity by 10%, indicating the absence of any metal ion in the structure. A decrease in the Tm of RfGH5_4 by 2 °C was observed with 10 mM Ca²⁺. This was also in agreement with the results of the previous section, where 10 mM Ca²⁺ ions lowered the enzyme activity to 29%, thereby displaying the role of Ca²⁺ ions in destabilizing the RfGH5_4 structure. K⁺ ions at 10 mM did not affect the Tm of RfGH5_4 (70 °C), although they increased the enzyme activity of RfGH5_4, as mentioned in the previous section. It is possible that the K⁺ cations play a role in the catalysis of RfGH5_4 by affecting its conformation rather than its stability, as observed for an endoglucanase from a plant root ericoid mycorrhizal fungus, Leohumicola sp. [37,42].

Kinetic parameters of RfGH5_4
The kinetic parameters of the endoglucanase RfGH5_4 against various carbohydrate substrates were determined and are summarized in Table 3. RfGH5_4 showed remarkably high Vmax and low Km against different substrates (Fig. 3a, b, c and d). It displayed a turnover number (kcat, s⁻¹) of 473 and 360.3 for β-D-glucan and CMC-Na, respectively. The Vmax of 525 U/mg of RfGH5_4 against CMC-Na was significantly higher than the Vmax of 160.6 U/mg reported for endoglucanase CS10 cloned from the gut microflora of the black soldier fly, Hermetia illucens [41]. The catalytic efficiency (kcat/Km) and Km of RfGH5_4 for CMC-Na were deduced to be 146 mL mg⁻¹ s⁻¹ and 2.47 mg/mL, respectively (Table 3).
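The CMC-Na values quoted above are mutually consistent if the turnover number is obtained from Vmax and the theoretical molecular mass of 41.18 kDa. The text does not state this conversion explicitly, so the relation below is an assumption; it uses 1 U = 1 μmol/min and the fact that a mass in kDa is numerically equal to mg/μmol.

```latex
\begin{align*}
k_{cat} &= \frac{V_{max}\, M_r}{60}
         = \frac{525\ \mu\mathrm{mol\,min^{-1}\,mg^{-1}} \times 41.18\ \mathrm{mg\,\mu mol^{-1}}}{60\ \mathrm{s\,min^{-1}}}
         \approx 360.3\ \mathrm{s^{-1}} \\[4pt]
\frac{k_{cat}}{K_m} &= \frac{360.3\ \mathrm{s^{-1}}}{2.47\ \mathrm{mg\,mL^{-1}}}
         \approx 146\ \mathrm{mL\,mg^{-1}\,s^{-1}}
\end{align*}
```

Both results match the reported kcat of 360.3 s⁻¹ and catalytic efficiency of 146 mL mg⁻¹ s⁻¹ for CMC-Na.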
Recently, a multifunctional glucomannanase, 6XSU of subfamily GH5_4 from R. flavefaciens, was reported with a kcat/Km (mL mg⁻¹ s⁻¹) of 5.66 for glucomannan and 1.55 for xyloglucan at 30 °C [34]. These values are significantly lower than those of RfGH5_4 against these two substrates, 529 and 294, respectively, at 55 °C. This large difference in the kcat/Km values of RfGH5_4 and 6XSU could be due to the difference in the reaction temperature used for determination of the kinetic parameters. At present, fungal cellulases dominate the industrial cellulase market. The most studied and industrially used organism for cellulase production, Trichoderma reesei, displays a catalytic efficiency of 7.3 mL mg⁻¹ s⁻¹ and a Km of 5 mg/mL (at pH 5.0, 30 °C) for its endoglucanase EGIII (Cel12A) against CMC-Na [43]. EGIII displays the highest catalytic efficiency among all the T. reesei endoglucanases. An attempt to increase the catalytic efficiency of EGIII resulted in a 2R4 mutant with enhanced pH stability and thermostability but a decreased catalytic efficiency of 4.8 mL mg⁻¹ s⁻¹ [44]. The phylogenetic analysis of RfGH5_4 and fungal endoglucanases of the CAZy GH5 family showed that RfGH5_4 shares the nearest evolutionary relationship with the GH5 fungal endoglucanases Epi2 (Epidinium caudatum), CelA (Neocallimastix frontalis) and CelD (Neocallimastix patriciarum) (Fig. S2). The multiple sequence alignment of RfGH5_4 with GH5 fungal endoglucanases was also performed (Fig. S3); it revealed that the catalytic residues of RfGH5_4 (E168 and E292) have been retained through evolution and are also conserved in all the GH5 fungal endoglucanases, including the popular EGII (Cel5A) (formerly EGIII-Cel5A) of Trichoderma reesei. Other conserved residues are H122 and W325. Singh et al. [1] reported a specific activity of 32.1 U/mg for an ultraviolet (UV)-irradiated CMCase mutant, CMCase-UV2 from Bacillus amyloliquefaciens SS35, which is significantly lower than the 525 U/mg activity of RfGH5_4 against CMC-Na (Table 3). Another endoglucanase, CtCel5E (GH5) from Clostridium thermocellum, had a specific activity of 736.2 U/mg against CMC-Na [45]. However, the kcat/Km of CtCel5E against CMC-Na (12.40 mL mg⁻¹ s⁻¹, with a Km of 2.1 mg/mL, at pH 5.0 and 50 °C) is remarkably lower than the 146 mL mg⁻¹ s⁻¹ of RfGH5_4 (Table 4). The processive endoglucanases EG5C [46] and EG5C-1 D70Q/S235W [47] from Bacillus subtilis BS-5 showed kcat/Km values of 38 and 37.3 mL mg⁻¹ s⁻¹, respectively, against CMC-Na as the substrate.
It could be concluded that the kcat/Km of RfGH5_4 is much higher than those of EGIII and CtCel5E. Moreover, the Km of RfGH5_4 against CMC-Na is relatively lower than those of EGIII and CtCel5E (Table 4).
RfGH5_4 hydrolyzed tamarind xyloglucan with a significantly high specific activity (Vmax, 381 U/mg) at relatively low substrate concentration (Km, 0.89 mg/mL). An endoglucanase, XEG5 of Paenibacillus sp. XEG5, showed a Vmax of 18.4 U/mg against tamarind xyloglucan, which is notably lower than that of RfGH5_4 [51]. Moreover, the appreciable catalytic efficiency of RfGH5_4 over a wide range of cellulosic and hemicellulosic substrates is noteworthy (Table 3). The kinetic parameters and the catalytic efficiencies of some recently studied (including this study) and commercially available endoglucanases are described in Table 4. RfGH5_4 stands out as an efficient multifunctional endoglucanase among the other known endoglucanases. The comparison underlines the potential of RfGH5_4 for industrial applications such as lignocellulosic bioethanol production over the other available endoglucanases.
Table notes: All the experiments were carried out in triplicate (n = 3), and the mean ± SD for each experiment is shown. The specific activity of the control was 341 U/mg and was considered as 100%. The assays were carried out in triplicate (n = 3) with 5 μg/mL (final concentration) of RfGH5_4 at the optimum pH 5.5 (20 mM citrate-phosphate buffer) and 55 °C for 2 min.

Hydrolysis mechanism and multifunctionality of RfGH5_4
3.5.1. Hydrolytic mechanism analysis of RfGH5_4 using TLC
A chromatogram developed with the CMC-Na hydrolysates produced by RfGH5_4 displayed cellotriose, cellotetraose and other higher oligosaccharides after 2 min of reaction in a 24 h hydrolysis (Fig. 4a). However, after 1 h, cellotriose was the predominant oligosaccharide along with other higher oligosaccharides (>DP4), confirming the endo-acting catalytic mode of RfGH5_4. Quantification of reducing sugars revealed that the hydrolysis process achieved saturation after 12 h of the reaction (Fig. 4b).
P.V. Gavande et al.

TLC and MALDI-TOF MS of RfGH5_4 hydrolyzed products from different substrates
The hydrolysis of various substrates by RfGH5_4 was analysed and confirmed by TLC. The substrates used were barley β-D-glucan, CMC-Na, lichenan, tamarind xyloglucan, HEC, konjac glucomannan, beechwood xylan, birchwood xylan and carob galactomannan. Oligosaccharides of various degrees of polymerization were released by RfGH5_4. Cellobiose, cellotriose, cellotetraose and other higher cellooligosaccharides could be detected in the barley β-D-glucan, CMC-Na and HEC hydrolysates (Fig. 4 c, Lanes 2, 3 and 6, respectively). Cellotetraose was the primary product in lichenan and konjac glucomannan hydrolysis (Fig. 4 c, Lanes 4 and 7, respectively). The hydrolysis of xyloglucan showed multiple spots on the TLC plate, which were attributed to different xyloglucan oligosaccharides (XyGs), as confirmed by MALDI-TOF MS analysis (Fig. 4 c, Lane 5). The TLC analysis also confirmed the hydrolysis of beechwood xylan, birchwood xylan and carob galactomannan by RfGH5_4 (Fig. 4 c, Lanes 8, 9 and 10, respectively), for which the released oligosaccharides (DP2 to DP11) were also identified by MALDI-TOF MS analysis. Cellooligosaccharides larger than cellotetraose were observed in the hydrolysates of Avicel and cellulose powder (Fig. 4 d, Lanes 2 and 3, respectively). However, PASC was hydrolysed to oligosaccharides smaller than cellotetraose, such as cellotriose and cellobiose (Fig. 4 d, Lane 4). The crystallinity index (CrI) of PASC was recorded as 0 to 0.04, which makes the cellulose chains freely accessible to the endoglucanase, unlike Avicel, for which the CrI ranges between 0.5 and 0.6 [52]. The higher the CrI of a cellulosic substrate, the lower the accessibility of the cellulose chains to an endoglucanase, and vice versa.
The MALDI-TOF MS (DP, m/z ratio) analysis revealed that RfGH5_4 randomly hydrolyses β-D-glucan to give cellooligosaccharides ranging from DP2 to DP12, thus further confirming RfGH5_4 as an endoglucanase (Fig. S4 a). The lichenan hydrolysate of RfGH5_4 also contained cellooligosaccharides in the range of DP2-DP12 (data not shown). MALDI-TOF MS analysis of the PASC hydrolysate revealed mainly cellobiose (DP2) and cellotriose (DP3) (Fig. S4 b), indicating the processive hydrolysis of amorphous cellulose by RfGH5_4, as also described and confirmed in Section 3.6. The CMC-Na hydrolysate of RfGH5_4 showed cellooligosaccharides ranging between DP2 and DP6 (data not shown). Only higher cellooligosaccharides of size DP8 to DP10 were detected in the hydrolysates of Avicel and cellulose powder by MALDI-TOF MS analysis (data not shown), thus confirming the hydrolysis of complex cellulose by RfGH5_4. Glucose was not detected in any of these hydrolysates by MALDI-TOF MS analysis.
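The relation between DP and the expected m/z value can be sketched as follows for linear, unsubstituted hexose oligosaccharides such as cellooligosaccharides. The sketch assumes singly charged sodium adducts, [M+Na]+, which are commonly observed for neutral oligosaccharides in MALDI-TOF MS; the ion type actually assigned in this study is not stated in this section, and the function name is hypothetical. It does not apply to substituted or pentose-containing products such as those released from CMC-Na, xyloglucan or xylan.

```python
# Monoisotopic masses in Da.
HEXOSE_RESIDUE = 162.05282   # anhydro-hexose unit, C6H10O5
WATER = 18.01056             # H2O added once per free oligosaccharide
SODIUM_ION = 22.98922        # Na+ (assumed adduct, see lead-in)


def hexose_oligomer_mz(dp, adduct_mass=SODIUM_ION):
    """Return the expected m/z of a singly charged hexose oligomer.

    dp: degree of polymerization (number of hexose units, >= 1)
    adduct_mass: mass of the singly charged adduct ion in Da
    """
    if dp < 1:
        raise ValueError("dp must be at least 1")
    neutral_mass = dp * HEXOSE_RESIDUE + WATER
    return neutral_mass + adduct_mass


# DP2 to DP12 is the range reported for the beta-D-glucan hydrolysate.
for dp in range(2, 13):
    print(f"DP{dp:<2d} [M+Na]+ m/z = {hexose_oligomer_mz(dp):.3f}")
```

Adjacent peaks in such a series are separated by one hexose residue (about 162.05 Da), which is how a ladder of DP values can be read from a spectrum.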

Processivity of RfGH5_4 on cellulose
The mode of cellulose hydrolysis by RfGH5_4 was elucidated. The processivity of RfGH5_4 was studied at different time intervals up to 24 h by TLC analysis. RfGH5_4 hydrolysed the cellulosic substrate PASC and gave cellotetraose during the first 30 min of the reaction. However, consistent release of cellotriose and cellobiose was observed after 1 h of the reaction (Fig. 5 a). This consistent cleavage of PASC into cellotriose and cellobiose is the processive behaviour of the RfGH5_4 endoglucanase, a mode known as processivity, as reported earlier [53]. The HPLC chromatogram of PASC hydrolysates from different time intervals also showed that the release of cellooligosaccharides from PASC was confined to cellotriose (G3) and cellobiose (G2) after 1 h of the reaction, further confirming the processivity of RfGH5_4 (Fig. 5 b). The chromatogram of cellooligosaccharide standards (G1-G4) was generated to identify the respective oligosaccharides in the PASC hydrolysates (Fig. S5). The PI of the hydrolysis reaction of RfGH5_4 on PASC increased consistently from 0.1 to 4.9 (Table 5) in a 1 min to 24 h assay, further confirming the processive behaviour of RfGH5_4.
In the PI analysis, the accumulation of soluble reducing sugars usually increases with time, whereas the reducing ends of cellulose in the insoluble un-hydrolysed substrate fractions remain constant. Processive endoglucanases show an increasing PI over the incubation period, as also observed for RfGH5_4 (Fig. 5 c). Zheng et al. [53] also reported the initial release of cellotetraose followed by cellotriose and cellobiose from PASC by a GH5 endoglucanase, EG1, from the fungus Volvariella volvacea. As observed for RfGH5_4, the processive endoglucanases EG5C and EG5C-1 from Bacillus subtilis also showed the release of cellobiose and cellotriose from PASC and Avicel [46]. Processivity is rarely observed in endoglucanases from family GH5. Therefore, the processive endoglucanases from family GH5 should be explored for their applications in the renewable energy sector.
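The PI reasoning above can be sketched in a few lines of Python, assuming the commonly used definition of PI as the ratio of soluble reducing sugars to reducing ends in the insoluble fraction; the exact formula applied in this study is given in its methods and is not reproduced in this section. The function name and all numbers in the time course are hypothetical placeholders chosen only to show the trend, and they are not the values of Table 5.

```python
def processivity_index(soluble_reducing_sugar, insoluble_reducing_ends):
    """Return PI as soluble / insoluble reducing ends (same unit for both)."""
    if insoluble_reducing_ends <= 0:
        raise ValueError("insoluble reducing ends must be positive")
    if soluble_reducing_sugar < 0:
        raise ValueError("soluble reducing sugar must be non-negative")
    return soluble_reducing_sugar / insoluble_reducing_ends


# Hypothetical time course (NOT data from the study): soluble reducing
# sugars accumulate while insoluble reducing ends stay roughly constant.
time_course = [
    # (time in h, soluble reducing sugar, insoluble reducing ends)
    (0.5, 0.2, 0.5),
    (1.0, 0.5, 0.5),
    (6.0, 1.2, 0.5),
    (24.0, 2.0, 0.5),
]

for hours, soluble, insoluble in time_course:
    pi = processivity_index(soluble, insoluble)
    print(f"{hours:5.1f} h: PI = {pi:.1f}")
```

With a constant denominator, any continued release of soluble sugars raises the PI, which is the pattern expected for a processive enzyme.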
A drawback of using non-processive endoglucanases in biomass conversion is that, unlike RfGH5_4, they produce only higher oligosaccharides. Therefore, the processive endoglucanase RfGH5_4, which releases a significant amount of cellobiose through its processivity during lignocellulosic biomass conversion, can be used for bioethanol production.

Hydrolysis of lignocellulosic biomasses by RfGH5_4
The suitability of RfGH5_4 endoglucanase for saccharification of lignocellulosic biomass was examined. Six different alkali pretreated biomasses, viz. CMS, CSB, SBG, SDR, FMS and MZL, were hydrolysed by RfGH5_4 into oligosaccharides of various degrees of polymerization (DP2-DP4 and >DP4) during 48 h of saccharification (Fig. 6 a). The highest TRS (mg/g of biomass) was achieved with 87 U/g of RfGH5_4 through the hydrolysis of SDR (72), followed by FMS (62), SBG (38), CSB (27), CMS (19) and MZL (8.5) (Fig. 6 b). The TRS yields generated by RfGH5_4 from alkali pretreated SDR and SBG were significantly higher than the earlier reported TRS yields for these biomasses (34.2 mg/g for SDR [27] and 5.9 mg/g for SBG [54]). Jamaldheen et al. [27] used endoglucanase (CtCel8A, 80 U/mg) and β-glucosidase (CtBgl1A, 33 U/mg) of C. thermocellum for SDR saccharification. Nath et al. [54] employed a chimera, CtGH1-L1-CtGH5-F194A (240 U/g), containing β-glucosidase and endoglucanase, along with cellobiohydrolase, CtCBH5A (360 U/g), from C. thermocellum to saccharify SBG. However, the TRS yield from RfGH5_4 saccharified CMS and CSB was lower than the value (490 mg/g) given in a previous report [25], where alkali pre-treated 5% (w/v) cotton stalk was saccharified by 100 U of a commercial cellulase enzyme mixture at 50 °C for 72 h. The TRS obtained in the current study could be lower because a lower biomass loading (pre-treated CMS or CSB), 2% (w/v), was saccharified for a shorter period (48 h) using a single in-house produced enzyme, the endoglucanase RfGH5_4, at 30 °C, which is a cost-effective process with lower energy requirements. Moreover, all the TRS yields from the aforementioned reports resulted from the synergistic action of the three cellulases, namely endoglucanase, cellobiohydrolase and β-glucosidase, whereas the TRS from saccharification reported in the present study was produced by RfGH5_4 alone.
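The arithmetic behind a TRS yield expressed in mg/g can be sketched as below. The sketch assumes that the yield is normalised to the mass of pretreated biomass loaded into the reaction; the function name is hypothetical, and the sugar concentration of 1.44 mg/mL is not a measured value from the study but is back-calculated from the reported 72 mg/g for SDR at the 2% (w/v) loading.

```python
def trs_yield_mg_per_g(reducing_sugar_mg_per_ml, biomass_loading_percent_wv):
    """Convert a reducing sugar concentration to a yield per gram of biomass.

    reducing_sugar_mg_per_ml: total reducing sugar in the hydrolysate (mg/mL)
    biomass_loading_percent_wv: biomass loading in % (w/v), i.e. g per 100 mL
    """
    if biomass_loading_percent_wv <= 0:
        raise ValueError("biomass loading must be positive")
    biomass_g_per_ml = biomass_loading_percent_wv / 100.0
    return reducing_sugar_mg_per_ml / biomass_g_per_ml


# 2% (w/v) loading is stated in the text; 1.44 mg/mL is back-calculated
# from the reported 72 mg/g and serves only as an illustration.
print(f"TRS yield: {trs_yield_mg_per_g(1.44, 2.0):.1f} mg/g")
```

Because the yield is normalised to the biomass mass, values obtained at different loadings (for example 2% versus 5% w/v) can be compared directly, although reaction time, temperature and enzyme dose still differ between studies.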
Sorghum stalk and finger millet stalk were found to be the best lignocellulosic substrates for RfGH5_4. Annual residual biomass of sorghum is estimated around 11 metric tons in India [27]. Finger millet is a drought tolerant rain-fed crop in India [28]. However, as these biomasses are a feed for livestock, their use in the renewable energy sector could be avoided, or they could be used only after ensuring the food security of livestock. It is noteworthy that RfGH5_4 hydrolyzed the cotton biomasses, CSB and CMS, quite effectively. Cotton is an important cash crop worldwide, with roughly 32 million hectares under cotton cultivation, of which India accounts for about 10 million hectares [25]. Farmers usually burn biomass such as cotton stalk after cotton plucking is complete. Cultivation of cotton generates abundant residual cotton biomass, which could be used as a rich source of lignocellulose. Thus, residual cotton stalk could be a good biomass resource for lignocellulosic bioethanol production. As a future prospect, efficient hydrolysis of cotton and other biomasses by the synergistic action of RfGH5_4 endoglucanase, cellobiohydrolase and β-glucosidase could overcome the bottleneck of low saccharification yield. Because RfGH5_4 is a stable endoglucanase at 30 °C, as discussed in Section 3.4, the saccharification reaction could be extended for more than 96 h, thus giving a higher yield of glucose. RfGH5_4 endoglucanase hydrolysed various biomasses, giving cellobiose (DP2) along with cellooligosaccharides of higher DP (DP3 and above), as observed in the TLC analysis after 48 h of saccharification (Fig. 6 a). Therefore, RfGH5_4 endoglucanase not only reduces the need for cellobiohydrolase to generate cellobiose during synergistic saccharification, but will also boost the saccharification process even if an additional cellobiohydrolase is used.
RfGH5_4 could also be immobilized using a metal ion assisted, recyclable, pH-responsive polymer such as Eudragit S100 to hydrolyse lignocellulose in the bioethanol industry [55]. Moreover, RfGH5_4 is capable of hydrolysing hemicellulosic polysaccharides, as described in Section 3.5.2, which further decreases the requirement for additional hemicellulases. Overall, this study affirms that RfGH5_4 could potentially serve the lignocellulose conversion industry as a multifunctional yet efficient processive endoglucanase.
This study established RfGH5_4 as an efficient yet multifunctional endoglucanase. RfGH5_4 was found to hydrolyse both cellulosic and hemicellulosic polysaccharides. It gave maximum activity at pH 5.5 and 55 °C and was stable in the pH range 5.0-8.0 and between 5 °C and 45 °C. Its catalytic efficiency against CMC-Na and konjac glucomannan was significantly superior to that of earlier reported endoglucanases. The melting temperature (Tm) of RfGH5_4 was 70 °C, which was unaffected by EDTA or EGTA, indicating the absence of inherent divalent metal ions in the RfGH5_4 structure. TLC and MALDI-TOF MS analyses of RfGH5_4 treated hydrolysates of various polysaccharides showed the presence of the respective oligosaccharides (DP2-DP11), revealing the endo- and multiligand activity of the enzyme. RfGH5_4 hydrolysed PASC processively, generating cellobiose along with cellotriose and thereby reducing the need for cellobiohydrolase. The efficient deconstruction and saccharification of various complex lignocellulosic biomasses was achieved using RfGH5_4, which makes it a potential endoglucanase for bioethanol production. The efficiency, multifunctionality, remarkable stability in ethanol, suitability for SSF and capability to hydrolyse a diverse range of lignocellulosic biomasses put RfGH5_4 in the category of cellulases important for the renewable energy sector. The multifunctionality of RfGH5_4 could be further explored for the generation of various carbohydrate oligosaccharides useful in the feed, food, prebiotics and health sectors.

Declaration of competing interest
CMGAF is a financial beneficiary of the company that sells both the cloning kits used in this study and the enzyme that is described in this paper.