Gray Box Modeling of Supercritical Nimbin Extraction from Neem Seeds Using Methanol as Co-solvent

This paper studies the yield of nimbin extraction from neem seeds using supercritical carbon dioxide with methanol as co-solvent. The mass transfer coefficient, expressed in terms of the Sherwood number, is estimated by a neuro-fuzzy network. The estimated mass transfer coefficient is then fed to the system model, which is a set of partial differential equations. The proposed gray box model was validated with experimental data.


INTRODUCTION
The neem tree (Azadirachta indica) is native to tropical South East Asia. It is fast growing, can survive drought and poor soil, and keeps its leaves all year round. It is a tall tree, up to 30 meters in height. Many white flowers, which smell like honey, appear for the first time when the tree is 2 to 3 years old. The tree bears fruit after 3 to 5 years. Inside the fruit there is a seed about 1.5 cm long. Neem trees can be grown in areas that receive between 400 and 1500 mm of rainfall annually. The tree grows best at an altitude of less than 1,500 meters. Neem trees survive temperatures as high as 44°C and as low as 4°C [1].
The seeds of neem do not live long and are usually planted as soon as possible after the fruit ripens. To help the seeds live longer, the fruit pulp should be removed by hand and the seeds should be dried to 20% moisture content (wet basis). If the seeds have been properly air dried, they can be kept for up to twelve months under refrigeration at 4°C [1].
There are many ways to use the extracts of the neem tree. Neem extracts contain a natural chemical called azadirachtin. The substance is found in all parts of the tree. The leaves are used effectively, though the chemical is much more concentrated in the fruit, especially in the seeds. One of the important uses of neem products is to fight crop pests and diseases. Neem extracts do not usually kill insects immediately. They change the feeding or life cycle of the insect until it is no longer able to live or reproduce. This might mean that the neem extract takes a long time to work if the pest attack is severe. Other insects will avoid a plant treated with neem extracts. Neem-based pesticides are suitable for use in developing countries because the useful chemicals can be easily removed from the neem without expensive and complicated equipment. The neem cake that is left after the oil is extracted from the seed is also useful for controlling several pests that live in the soil, particularly nematodes. In addition to its antiseptic action, neem extract and cake have other medicinal properties, such as anti-pyretic, anti-fungal, anti-inflammatory and antihistamine activity, owing to the presence of nimbin, one of the components extracted from neem seeds [1].
Neem seeds also contain 45% oil, which is valuable [1]. A number of different acids and other components have been identified in this oil, such as oleic acid (50-60%), palmitic acid (13-15%), stearic acid (14-19%), linoleic acid (8-16%), arachidic acid (1-3%), nimbin (0.12%), nimbinin (0.01%), nimbidin (1.4%) and nimbidiol (0.5%) [2]. There are traditional ways of removing this oil. One hundred to one hundred and fifty milligrams of oil can be extracted for every kilogram of neem seed [1]. Besides the traditional ways, the oil can be extracted by industrial methods such as supercritical fluid extraction (SFE). Carbon dioxide is a common solvent in SFE. It has low toxicity, low critical temperature and pressure, high purity, low cost, low surface tension and viscosity, and high diffusivity [3,4].
Several researchers have studied oil extraction from neem seeds. Johnson and Morgan [5] used a supercritical fluid to extract nimbin from neem seed and investigated the effect of co-solvent and pressure on the extraction yield. Their results showed that the nimbin extraction yield increases when methanol is used as co-solvent. Tonthubthimthong et al. [6] studied the influence of temperature, pressure, flow rate and sample weight on the extraction yield of nimbin from neem seed. They found that the best extraction conditions in their experiments were 308 K, 23 MPa and a flow rate of 1.24 cm$^3$ min$^{-1}$ for a sample of 2 g of neem seed. O. P. Sidhu et al. [7] studied the quantitative variability of nimbin and salannin in neem from India. Large populations from various agro-climatic zones were evaluated to identify the variability in nimbin and salannin. Tonthubthimthong et al. [8] studied the extraction of nimbin from neem seeds, including the influence of temperature, pressure and particle diameter on the extraction yield. In another part of their experiments, they found that the extraction yield increased when methanol was used as co-solvent. A model for the extraction of nimbin from neem seed with supercritical CO$_2$ was developed by Mongkholkhajornsilp et al. [9]. They assumed that axial dispersion is negligible and, in order to improve the performance of the model, considered a new mass transfer correlation.
The aim of the present work is to study the yield of nimbin extraction from neem seeds using supercritical carbon dioxide with methanol as co-solvent. A hybrid approach is employed to model the supercritical process: a neuro-fuzzy network is designed and combined with the process model to improve the modeling capability. Based on our literature survey, the idea of gray box modeling in the SFE of neem is new.
There are generally three approaches to building mathematical models:
White box modeling, where everything is considered to be known from physical laws.
Black box modeling (system identification), where all knowledge derives from measurements.
Gray box modeling, where both physical laws and observed measurements are used to design a model.
The last approach assumes that the structure of the model is given by physical laws as a parameterized function. In the next step, the model parameters are obtained from the observed data (measurements) [10]. The term "gray box" describes a symbolic approach to engineering computation that offers distinct benefits in education and research.
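As a toy illustration of this two-step idea, the sketch below assumes a model whose structure is fixed as y = a·x (standing in for a physical law) and estimates the single parameter a from measurements by least squares. The function name and the data are hypothetical and are unrelated to the extraction model of this paper.

```python
def fit_slope(x_values, y_values):
    """Least-squares estimate of a in the assumed model y = a * x."""
    numerator = sum(x * y for x, y in zip(x_values, y_values))
    denominator = sum(x * x for x in x_values)
    if denominator == 0.0:
        raise ValueError("all x values are zero; the slope is undetermined")
    return numerator / denominator


# Hypothetical measurements that follow y = 2x exactly.
a_hat = fit_slope([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The model structure comes from prior knowledge and only the parameter comes from data, which is the essence of the gray box approach described above.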
White box analysis involves analyzing and understanding the equation model. White box testing is typically very effective in finding programming and implementation errors. Black box analysis refers to analyzing a running program by probing it with various inputs. This kind of testing requires only a running program and does not make use of equation analysis of any kind. Black box testing is not as effective as white box testing in obtaining knowledge of the equations and their behavior, but black box modeling is much easier to accomplish and usually requires less expertise than white box modeling. Gray box analysis combines white box techniques with black box modeling and usually requires using several tools together. Gray box techniques combine both methods in a powerful way: black box modeling can scan the system across networks, while white box modeling requires the equations in order to analyze the behavior statically.
The remainder of this paper is organized as follows. After this introductory section, the mathematical modeling and its equations are described. In Section 3, neuro-fuzzy modeling and its results are presented. The gray box structure is explained in the same part. Results and discussion form the last section.

MATHEMATICAL MODELING
Meireles et al. [11] have presented a mechanistic model for the extraction yield in the SFE process. The variation of the extraction yield over a time interval is the number of moles of solute extracted in that interval per initial mole of solute in the extractor. This gives Equation (1); combining it with Equation (2) gives Equation (3), where F is the extraction yield, v is the flow velocity, $\tau$ is the dimensionless time, L is the extractor length, A is the extractor area, $\varepsilon$ is the void fraction of the packed bed, $n_0$ (kmol) is the initial number of moles of solute in the bed, C (kmol m$^{-3}$) is the oil concentration in the supercritical phase and z is the dimensionless axial coordinate along the bed. Equation 3 is combined with our neuro-fuzzy network in the gray box approach. Its initial condition is given by Equation (4).

Equation 3 with its initial condition (Equation 4) is coupled to several other equations, and they all have to be solved together. Details of these equations can be found in the literature [11]. As mentioned there, the mass transfer coefficient (K) is needed to solve this set of equations. Equation 5 shows that K can be calculated if the Sherwood number (Sh) is known, where $R_p$ (m) and $D_m$ (m$^2$ s$^{-1}$) are the particle radius and the molecular diffusion coefficient respectively.

Therefore, in this paper the Sherwood number is estimated with a neuro-fuzzy network. The network takes the non-dimensional Grashof (Gr), Reynolds (Re) and Schmidt (Sc) numbers as inputs. It is then combined with Equation 3, and its result (the Sherwood number) is used to find the extraction yield. Fig. (1) illustrates the gray box modeling schematically. The experimental data of Tonthubthimthong et al. [8] were used for validation of the proposed model.
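As a rough illustration of how a Sherwood number estimate feeds the yield calculation, the sketch below makes two assumptions that are inferred from the symbol definitions and are not taken from [11]: the Sherwood number is defined on the particle diameter, so that K = Sh·D_m/(2·R_p), and the yield rate in dimensionless time is dF/dτ = (ε·A·L/n_0)·C at the bed outlet (z = 1), with F = 0 at τ = 0. The function names and all numerical values are hypothetical.

```python
def mass_transfer_coefficient(sh, d_m, r_p):
    """K (m/s) from the Sherwood number, assuming Sh = K * (2 * r_p) / d_m."""
    return sh * d_m / (2.0 * r_p)


def extraction_yield(c_out, d_tau, eps, area, length, n0):
    """Cumulative yield F at each dimensionless time step.

    Assumes dF/dtau = eps * area * length * C_out / n0 with F(0) = 0,
    integrated with the trapezoidal rule over equally spaced steps d_tau.
    """
    factor = eps * area * length / n0
    f = [0.0]
    for c_prev, c_next in zip(c_out, c_out[1:]):
        f.append(f[-1] + factor * 0.5 * (c_prev + c_next) * d_tau)
    return f


# Hypothetical numbers, chosen only to exercise the functions.
k = mass_transfer_coefficient(sh=4.0, d_m=1.0e-8, r_p=5.0e-4)
yields = extraction_yield(
    c_out=[2.0e-3, 1.5e-3, 1.0e-3, 0.5e-3],  # kmol m^-3 at the bed outlet
    d_tau=1.0, eps=0.4, area=1.0e-4, length=0.1, n0=1.0e-7,
)
```

In the full model the outlet concentration would come from solving the coupled bed equations with K as a parameter; here it is supplied directly to keep the sketch short.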

NEURO-FUZZY MODELING
The neuro-fuzzy method is a combination of artificial neural networks and fuzzy logic. In this section these two techniques are described briefly. Next, the neuro-fuzzy method is described and the results of the designed network are presented.

Neural Networks
Artificial Neural Networks (ANN) emerged after McCulloch and Pitts [12] presented neurons in 1943 as models of biological neurons and as conceptual components for circuits that could perform computational tasks. Minsky and Papert [12] published their book Perceptrons in 1969 and analyzed the perceptron models. This book strongly influenced interest in ANNs in research and industrial design [12]. Processing units called "neurons" form the foundation of an ANN. These units are connected to each other through weighted connections to send and receive signals. Each unit receives signals computed by its neighbors and computes a signal that is sent on to other units. The connection weights are tuned while the ANN is being built. Fig. (2) shows a schematic of an ANN unit.
As illustrated in Fig. (2), signals from other units enter each unit via weighted connections. Each input signal is multiplied by its weight, and the sum of these products plus a special weight called the "bias", whose input signal is always 1 (indicated by k in Fig. 2), forms the net input to the function. The result of the function is the output signal of the unit and is sent to other units.
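The computation performed by a single unit can be sketched as follows. The text does not name the function applied to the net input, so the hyperbolic tangent used here is an assumption, as are the function name and the example values.

```python
import math


def neuron_output(inputs, weights, bias, activation=math.tanh):
    """Weighted sum of the inputs plus the bias, passed through an activation.

    The bias is a weight whose input signal is always 1.
    """
    net = sum(x * w for x, w in zip(inputs, weights)) + bias * 1.0
    return activation(net)


# Hypothetical two-input unit: net input = 0.5*0.8 + (-1.0)*0.2 + 0.1 = 0.3
out = neuron_output(inputs=[0.5, -1.0], weights=[0.8, 0.2], bias=0.1)
```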
An ANN may contain one or more layers of units, with different structures for both the connections between units and the connections between layers. The outputs of the units in the last layer are the final results of the ANN. The parameters of this structure are calculated so that the results differ as little as possible from the targets. This process is known as training.

Fuzzy Logic
In 1965, Lotfi A. Zadeh, the Persian professor of computer science at the University of California in Berkeley, presented a new computational technique known as "Fuzzy Logic (FL)" [13][14][15]. This method helps scientists to define intermediate values between conventional evaluations such as true/false or hot/cold. It makes it possible to formulate notions such as "rather hot" or "very cold" mathematically and to program computers with a human-like way of thinking [16]. Fuzzy systems are an alternative to the traditional notions of set membership and logic, which have their origins in ancient Greek philosophy. In their efforts to devise a concise theory of logic, and later mathematics, Aristotle and the philosophers who preceded him posited the so-called "Laws of Thought" [17].
The most basic notion of fuzzy systems is the fuzzy (sub)set. Sets in classical mathematics are crisp, and scientists are familiar with them. For example, within a set such as the numbers between 0 and 1, a subset (A) can be defined, such as the numbers between 0 and 0.2. The characteristic function of A, which assigns the number 1 or 0 to each element of the main set depending on whether the element is in the subset A or not, is shown in Fig. (3). The elements that are in the set A are assigned the number 1 and the others are assigned the number 0.

A fuzzy set allows us to define another subset (B) that can contain elements that are in set A, elements that are not in set A, and elements that lie between these two cases. The aim of using fuzzy sets is to make computers more "intelligent". The interpretation of the numbers now assigned to the elements is more difficult. As before, the number 1 assigned to an element means that the element is in the set B, and 0 means that the element is definitely not in the set B. All other values mean a gradual membership of the set B. This is shown in Fig. (4). The membership function is a graphical representation of the magnitude of participation of each input. The rules use the input membership values as weighting factors to determine their influence on the fuzzy output sets of the final output conclusion.
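The difference between the two kinds of set can be sketched in a few lines. The crisp subset A = [0, 0.2] follows the example in the text. The fuzzy subset B, with full membership up to 0.2 and a linear fall to zero at 0.4, is a hypothetical shape chosen for illustration and is not read from Fig. (4).

```python
def crisp_membership(x, low=0.0, high=0.2):
    """Characteristic function of the crisp subset A = [low, high]: 1 or 0."""
    return 1.0 if low <= x <= high else 0.0


def fuzzy_membership(x, full=0.2, zero=0.4):
    """Hypothetical fuzzy subset B: membership 1 up to `full`,
    decreasing linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


# x = 0.3 is outside A, yet it belongs to B to an intermediate degree.
in_a = crisp_membership(0.3)
in_b = fuzzy_membership(0.3)
```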

Neuro-Fuzzy Approach
As mentioned in the previous section, fuzzy logic can be applied to transform linguistic concepts into mathematical and computational structures for many purposes. Fuzzy systems, however, have a limited ability to learn and adapt to changing conditions, which is one of the most important advantages of neural networks (i.e., training with unprocessed data) [18,19]. Combining the two methods can overcome this problem and helps us to design a system that can learn and is also amenable to human perception [19,20]. Besides, merging fuzzy logic systems and neural networks helps researchers to choose and design the parameters of fuzzy logic inference [21]. The lack of a systematic procedure for choosing the type of membership functions and the parameter set leads us to use a network structure like that of neural networks. An adaptive neuro-fuzzy inference system (ANFIS) has been proposed for this purpose [22] and has given good results in many studies.
ANFIS applies a combination of the error back-propagation algorithm and the least squares method as a hybrid algorithm to adjust the membership functions of a fuzzy logic system optimally [23]. After the final error of the system (the difference between results and targets) is determined, the derivatives of the squared error with respect to each node's output are fed back through the system as error signals during the backward pass, exactly as in the back-propagation learning rule of common feed-forward neural networks, in order to alter the membership function parameters [24].
Two styles of fuzzy inference are the most popular and widely used: Mamdani-style and Sugeno-style inference. Mamdani-style inference, based on Lotfi Zadeh's 1973 paper [14], is better suited to human input, but its defuzzification step takes longer to compute [24]. Sugeno-style inference, based on the Takagi-Sugeno-Kang method of fuzzy inference, is more attractive for neuro-fuzzy systems [25]. It is computationally efficient, works well with linear techniques and with optimization and adaptive techniques, guarantees continuity of the output surface and is well suited to mathematical analysis [24]. Sugeno fuzzy inference usually uses a first-order or zero-order function in the consequent of its rules. For a first-order Sugeno fuzzy model, a common rule set with two fuzzy if-then rules can be described as below:

Rule 1: If x is $A_1$ and y is $B_1$, then $z = f_1$
Rule 2: If x is $A_2$ and y is $B_2$, then $z = f_2$

where x and y are the inputs and z is the output. $A_i$ and $B_i$ (i = 1, 2) are the fuzzy sets in the antecedent. $f_i$ (i = 1, 2) are crisp functions in the consequent, which calculate the result of each rule by

$$f_i = p_i x + q_i y + r_i \qquad (6)$$

where $p_i$, $q_i$ and $r_i$ (i = 1, 2) are design parameters that are determined during the training process. Fig. (5) shows these rules schematically.
In Fig. (6), a typical structure of an ANFIS based on Sugeno fuzzy modeling is shown. For simplicity, two inputs x and y and one output z are considered. Two membership functions are considered for each input in this network.
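A circle indicates a fixed node and a square indicates an adaptive node. The first layer of this structure contains the nodes that generate the output of each membership function for the inputs:

$$O^1_i = \mu_{A_j}(x), \quad i = 1, 2 \text{ for } j = 1 \text{ and } i = 3, 4 \text{ for } j = 2 \qquad (7)$$

$$O^1_i = \mu_{B_j}(y), \quad i = 5, 6 \text{ for } j = 1 \text{ and } i = 7, 8 \text{ for } j = 2 \qquad (8)$$

where $\mu_{A_j}$ and $\mu_{B_j}$ are the membership functions for x and y respectively. In the second layer, the nodes labeled $\Pi$ calculate the firing strength of each rule as the product of its incoming membership values:

$$w_i = \mu_{A_j}(x)\,\mu_{B_k}(y), \quad i = 1, 2, 3, 4; \quad j, k \in \{1, 2\} \qquad (9)$$

The nodes labeled N in the third layer calculate the ratio of each rule's firing strength to the sum of the firing strengths of all rules:

$$\bar{w}_i = \frac{w_i}{\sum_{m=1}^{4} w_m}, \quad i = 1, 2, 3, 4 \qquad (10)$$

After this ratio is calculated, the outputs of the fourth layer are obtained by

$$O^4_i = \bar{w}_i z_i, \quad i = 1, 2, 3, 4 \qquad (11)$$

Using the Sugeno fuzzy modeling style in this network, the $z_i$ are calculated as in Equation (6):

$$z_i = p_i x + q_i y + r_i, \quad i = 1, 2, 3, 4 \qquad (12)$$

The single node in the fifth layer computes the overall output as the summation of all incoming signals, as expressed in Equation (13):

$$z = \sum_{i=1}^{4} \bar{w}_i z_i \qquad (13)$$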
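The five layers can be traced in a short forward-pass sketch for the two-input, four-rule case. It assumes Gaussian membership functions for both inputs and the product as the rule operator. The function names, the rule ordering and the parameter values are hypothetical, and no training is performed.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership degree of x."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_forward(x, y, mf_x, mf_y, consequents):
    """Forward pass of a two-input first-order Sugeno ANFIS.

    mf_x, mf_y: lists of (center, sigma) pairs, one per membership function.
    consequents: one (p, q, r) tuple per rule, ordered as
                 (A1,B1), (A1,B2), (A2,B1), (A2,B2).
    Returns the overall output z and the normalized firing strengths.
    """
    # Layer 1: membership degrees of each input
    mu_a = [gaussian(x, c, s) for c, s in mf_x]
    mu_b = [gaussian(y, c, s) for c, s in mf_y]
    # Layer 2: firing strength of each rule (product of memberships)
    w = [a * b for a in mu_a for b in mu_b]
    # Layer 3: normalized firing strengths
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: each rule's linear consequent weighted by its normalized strength
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: overall output is the sum of the weighted rule outputs
    return sum(rule_out), w_bar


# Hypothetical parameters: identical consequents make z = x + 2y + 3
# regardless of the firing strengths, which gives an easy sanity check.
z, w_bar = anfis_forward(
    x=0.5, y=1.0,
    mf_x=[(0.0, 1.0), (1.0, 1.0)],
    mf_y=[(0.0, 1.0), (2.0, 1.0)],
    consequents=[(1.0, 2.0, 3.0)] * 4,
)
```

During training, the membership parameters (center, sigma) and the consequent parameters (p, q, r) would be the quantities adjusted by the hybrid algorithm.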

Neuro-Fuzzy Results
During the design of a neuro-fuzzy network, the data sets are divided into two parts: "train" and "test" data. The train data are used to train the network, that is, to teach it the relationship between the input and output data. Setting some data aside for checking the network helps to avoid overfitting.
In this paper, 16 data sets were used to design the neuro-fuzzy network, which estimates the Sherwood number as the dependent (target) variable from three independent variables (the Reynolds, Schmidt and Grashof numbers). These data sets were obtained from the literature (Tonthubthimthong et al.) [26]. The data sets for training and checking have to be selected so that the designed network covers the whole range of new data. Fig. (10) illustrates the distribution of the training and checking data sets that gave the best selection for our goal (i.e., the lowest error for both train and check data).
Twelve data sets were selected for training and designing the network, and the 4 remaining sets were used to test whether the network can estimate new, untrained data as well as the trained data. The best membership function types were found to be "Gaussian" for the first two inputs (Re and Sc numbers) and "Triangular" for the last (Gr number). Fig. (11) shows these functions as characterized after training the best network.
In this study the mean square error (MSE) was used to describe the ability of the obtained network. After training, the MSE of the best network was $1.9934 \times 10^{-13}$ for the train data sets and $2.9856 \times 10^{-4}$ for the checking data. Both errors are acceptably small, so the results of the network can be regarded as good estimates. It is reasonable that the training error is much lower than the testing error, because the network is designed on the training data and is familiar with all of them, whereas the testing data enter the network for the first time. Table 1 shows some specifications of the best network obtained. Figs. (12 and 13) illustrate the results of the network in estimating the train and check data respectively. In these figures the target values of the Sherwood number are plotted against the network outputs. The network results for the train data are close to the targets, and their differences are very small. Although the error for the checking data is not as small as that for the train data, it is reasonable for a neuro-fuzzy network.
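For reference, the error measure can be computed as in the sketch below. The Sherwood number values are hypothetical placeholders and are not the data of this study.

```python
def mean_square_error(targets, predictions):
    """Mean of the squared differences between targets and predictions."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("targets and predictions must be non-empty and equal in length")
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical checking set of four Sherwood numbers (targets vs. network output).
check_targets = [1.20, 1.45, 1.80, 2.10]
check_outputs = [1.21, 1.43, 1.82, 2.09]
check_mse = mean_square_error(check_targets, check_outputs)
```

Reporting the MSE separately for the train and check sets, as done above, is what reveals whether the network has overfitted the training data.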

Gray Box Modeling Results
After the Sherwood number was found with the designed neuro-fuzzy network (as described in section 4.1) and Equation 3 was applied, a gray box model that computes the extraction yield was obtained. It can be seen that the capability of the model to estimate the extraction yield decreases at longer extraction times. This is caused by using an average value of the Grashof number for all times. The Grashof number depends on time and changes at each instant, so this assumption introduces error into the computations.

CONCLUSION
In this work the ability of the neuro-fuzzy approach to model the SFE of nimbin from neem seeds with co-solvent was shown. A gray box model was designed as a combination of a neuro-fuzzy network, which estimates the Sh number, and a mathematical modeling equation, which calculates the extraction yield. The ability of this technique to give good results for new inputs, even outside the training range, was shown. Although an average value of Gr was used for all extraction times instead of local values, an acceptable error was obtained in estimating the extraction yield.
Classical and statistical models are much more sensitive than the neuro-fuzzy method to the range of the available data and to the accuracy of the assumptions. Those techniques need experimental data that do not span a wide range. This work showed that the neuro-fuzzy method can be applied to various ranges of experimental data with the expectation of good and reasonable results. It is therefore suggested that this approach be used in other investigations in place of the classical methods and that the results be compared.
Figs. (7 to 9) show the variation of Sh with Re, Sc and Gr, which illustrates that a wide range of values is covered for both the input and output variables.

Fig. (12). Comparison of experimental data and data predicted by the network for the training data set.

Fig. (13). Comparison of experimental and predicted data by the proposed model for validation data set.
Fig. (14) depicts the results of gray box modeling for the estimation of the neem extraction yield. The proposed model reproduces the trends of the extraction.