Transparent computational intelligence models for pharmaceutical tableting process
© Khalid et al. 2016
Received: 8 February 2016
Accepted: 23 March 2016
Published: 11 April 2016
The pharmaceutical industry is tightly regulated owing to health concerns. Over the years, the use of computational intelligence (CI) tools has increased in pharmaceutical research and development, manufacturing, and quality control. Quality characteristics of tablets, such as tensile strength, are important indicators of expected tablet performance. Predictive yet transparent CI models are needed that can be analysed for insights into the formulation and development process.
This work uses data from a galenical tableting study and computational intelligence methods such as decision trees, random forests, fuzzy systems, artificial neural networks, and symbolic regression to establish models for tensile strength. Data were divided into training and test folds according to a tenfold cross-validation scheme, and root mean square error (RMSE) was used as the evaluation metric. Tree based ensembles and symbolic regression methods are presented as transparent models, with extracted rules and a mathematical formula, respectively, explaining the CI models in greater detail.
CI models for tensile strength of tablets based on the formulation design and process parameters have been established. The best models exhibit a normalized RMSE of 7 %. Rules from fuzzy systems and random forests are shown to increase transparency of CI models. A mathematical formula generated by symbolic regression is presented as a transparent model.
CI models explain the variation of tensile strength according to formulation and manufacturing process characteristics. CI models can be further analyzed to extract actionable knowledge making the artificial learning process more transparent and acceptable for use in pharmaceutical quality and safety domains.
Solid dosage forms are dominant on the pharmaceutical market. It is estimated that tablets, the most common and popular oral dosage form, constitute more than two-thirds of the global market. They are usually prepared by compressing uniform volumes of powder mixtures consisting of an active pharmaceutical ingredient (API) with suitable excipients such as diluents, binders, disintegrating agents, glidants, lubricants, taste maskers, etc. Therefore, understanding the physicochemical properties of the ingredients and the mechanical behavior of powders during the tableting process is very important for the quality of tablets, with mechanical strength being one of the most important parameters.
It has been observed that the upstream process of formulation design and manufacturing has an intrinsic effect on the physical and mechanical properties of the tablet: an important one being expressed as tensile strength. Tensile strength of tablets is an indicator of how strongly the ingredients are compacted and it gives an indirect measure of how the tablet will perform once consumed. Development of formulations and optimization of tableting conditions are intrinsically complex in nature—leading to reliance on empirical methods in practice (Sun 2009). Variations while manufacturing the tablet could lead to an undesirably slow rate of disintegration if the tablet is too hard or failure during packaging and shipping if the tablet is weak. Disintegration and dissolution are equally important considerations within the scope of Quality by Design (QbD) (ICH 2009).
It is imperative to have an in-depth understanding of the process and its parameters and their response to different formulations and manufacturing conditions. The complexity of the problem requires the use of advanced empirical approaches. Constant supervision of the process is needed since the intermediate points at which variation might be introduced are numerous. For example, unwarranted changes in moisture conditions (Gupta et al. 2005), particle sizes and pores (Nicklasson and Podczeck 2007), crystalline forms of molecules (Maghsoodi 2012), effects of roller compaction (Sun and Himmelspach 2006), and batch sizes can cause significant variation in the quality characteristics of the product. It is expensive and time-consuming to run experimental tests for all possible upstream process combinations in order to optimize the endpoint. It is much cheaper and faster to develop predictive models using computational intelligence (CI), which can be used as guidance tools in a competitive and rapidly changing environment.
The use of CI in pharmaceutical manufacturing has been demonstrated in previous works, all aiming to increase understanding of systems and to use CI models as a stepping stone towards implementation of the QbD approach (Ibrić et al. 2012). Neural networks, fuzzy systems, and other techniques have been used for various applications in pharmaceutical environments (Bourquin et al. 1998; Shao et al. 2007; Landin et al. 2012) including but not limited to assessment of tensile strength and dissolution profiles.
In a complex system, a change in one component has a profound effect which cannot be explained by the sum of all changes within the system components. Complex adaptive systems absorb the changes upstream of the process and evolve as they progress (Chaffee and McNeill 2007). In the case of pharmaceutical tableting, complexities are abundant. Powder physical and chemical properties, powder mixtures, the response of powder mixtures to the pharmaceutical processes (mixing, roll compaction, milling, tableting, coating, etc.), and the processes themselves can add to the complexity within the whole system. The combined effect of variations added at different stages of the process is non-linear, sometimes unintentional, and adds to the difficulty of prediction. Paley and Eva (2010) argue that the use of complex systems can capture the unintentional behaviors of entities, as has been observed in the case of powder segregation in the feeder of the tableting machine (Ortiza et al. 2014).
Although CI techniques have rapidly gained pace within the pharmaceutical technology sphere, their black-box mode of work remains a reason for skepticism among regulatory authorities. The demand for transparent CI models stems from the need to ensure the efficacy and safety of a drug by fulfilling modern requirements for control and understanding of every element of the process, including modeling procedures. This work attempts to develop models for the quality characteristic of tensile strength using various CI methods and to dissect the best tree based models to extract rules describing the model. Finally, we present symbolic regression, the output of which can be represented in the form of an equation clearly showing the relationship of input variables to the outcome.
This paper is an extended version of Khalid et al. (2015).
Data and methods
Data was collected from a galenical study conducted on tableting using an undisclosed API in fixed quantity and four excipients (Silica Aerogel, MicroCrystalline Cellulose, Magnesium stearate, Sodium CarboxyMethyl Cellulose) in varying quantities. The study followed a vertex centroid experiment design generating 17 unique mixtures. For the 17 mixtures, two die compaction machines with three compaction pressures and two compaction speeds were used. One additional mixture from the preliminary trials was added. Details of the excipients used and experimental conditions are explained in the source publication for data (Bourquin et al. 1998).
Data transformation and organization
The main data set was divided into ten training and test files following the tenfold cross-validation (10cv) procedure using the cvTools library from CRAN. According to the 10cv procedure, the data set was divided into ten pairs of training and test folds containing 90 and 10 % of the data, respectively. Models were built on nine folds and tested on the tenth over ten iterations.
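The fold construction described above can be sketched in a few lines. This is a minimal illustration in plain Python rather than the R library used in the study; the function name `kfold_indices`, the seed, and the data set size of 200 records are assumptions made for the example.

```python
import random

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled samples over k folds as evenly as possible.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Hypothetical data set of 200 records: each split holds 180 training
# and 20 test indices, and every record is tested exactly once.
splits = list(kfold_indices(200, k=10))
```

Each record appears in exactly one test fold, so the ten test errors together cover the whole data set.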
Tree based methods
The following tree based methods were used.
Cubist is an implementation of a tree based modeling approach in R where the resulting tree is a set of linear models at each node, from the root to the last node. A tree is generated on the complete training data set and the best node of the tree is converted into a rule. Linear models are fit at the terminal node, and their results are smoothed with the predictions of linear models from earlier nodes within the tree. This process is continued recursively until all the variables have been covered by a single rule or a set of rules. This is also known as the separate-and-conquer approach (Fürnkranz 1999). Furthermore, boosting-like mechanisms are applied where response adjustment is carried out for successive models based on the predictions of the previous models (Quinlan 1992). Cubist exhibits speed and an impressive generalization ability with regression problems. The Cubist algorithm has been used in pharmaceutical research for ADME/ADMETox prediction models (Gupta et al. 2010).
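To make the idea of rules that carry linear models more concrete, the following sketch shows how a prediction could be computed from such a rule set. It is not the Cubist implementation: the rule format, the thresholds, the coefficients, and the choice to average the predictions of all matching rules are assumptions of this example, and smoothing and boosting are left out.

```python
def predict_rule_model(rules, x):
    """Average the linear-model predictions of every rule whose conditions hold."""
    preds = []
    for rule in rules:
        if all(cond(x) for cond in rule["conditions"]):
            coef = rule["coef"]
            preds.append(rule["intercept"] + sum(coef[name] * x[name] for name in coef))
    if not preds:
        raise ValueError("no rule covers this sample")
    return sum(preds) / len(preds)

# Two hypothetical rules; thresholds and coefficients are invented for illustration.
rules = [
    {"conditions": [lambda x: x["Mg"] <= 0.5],
     "intercept": 1.0, "coef": {"Compr": 0.05}},
    {"conditions": [lambda x: x["Mg"] > 0.5],
     "intercept": 0.6, "coef": {"Compr": 0.03, "Mg": -0.2}},
]
sample = {"Mg": 0.3, "Compr": 20.0}
prediction = predict_rule_model(rules, sample)  # only the first rule applies
```

Because each rule is a readable condition plus a linear formula, the contribution of every input to a given prediction can be traced by hand.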
Random forest and interactive trees
Random forests is a tree based model where many tree predictors are combined to form one model. Each tree is created on an independent and random sample taken from the training data set. In one forest, the sample distribution is kept the same for all the trees. The generalization error of a forest depends on the errors of individual trees and the correlation between the trees (Breiman 2001). Random forests are known to be good for classification problems, but they also work well with regression and feature selection problems. To extract rules from randomForest models, the CRAN package inTrees was used. inTrees extracts, measures, prunes, and selects rules from tree based ensembles (Houtao 2014; Pacławski et al. 2015).
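The principle of building each tree on its own random sample and averaging the trees can be illustrated with a deliberately reduced sketch. It uses one-split trees (stumps) on bootstrap samples and omits the random feature subsetting at each split that random forests also use, so it is closer to plain bagging than to the randomForest package. The toy data (two inputs, one response) are invented for the example.

```python
import random

def fit_stump(X, y):
    """Fit a one-split regression tree (stump) minimizing squared error."""
    best = None
    for feat in range(len(X[0])):
        for thr in sorted({row[feat] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[feat] <= thr]
            right = [yi for row, yi in zip(X, y) if row[feat] > thr]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, feat, thr, ml, mr)
    if best is None:  # all sampled rows identical: always predict the mean
        mean = sum(y) / len(y)
        return (0, float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, row):
    feat, thr, ml, mr = stump
    return ml if row[feat] <= thr else mr

def fit_forest(X, y, n_trees=25, seed=0):
    """Fit each stump on its own bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)

# Invented toy data: columns are a lubricant amount and a compression setting.
X = [[0.2, 10], [0.3, 20], [0.9, 10], [1.1, 20], [0.25, 30], [1.0, 30]]
y = [2.0, 2.2, 1.0, 1.1, 2.4, 1.2]
forest = fit_forest(X, y)
pred = predict_forest(forest, [0.2, 10])
```

Every stump is itself a readable rule (one threshold on one input), which is the property that rule extraction from tree ensembles builds on.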
Artificial neural networks
MON-MLP are generalized feed forward multi layer perceptron neural networks which work in a monotone fashion using NLM as their training algorithm. They allow two hidden layers with a choice of two activation and transfer (tanh and linear) functions. MON-MLPs are known to be robust with regression problems (Cannon 2005).
Fuzzy systems were built with an evolutionary approach: a genetic algorithm is used to construct a fuzzy system able to fit the given training data. This fuzzy system can then be used as a prediction model composed of fuzzy logic rules that can be further analyzed to provide a plausible linguistic representation. One implementation of genetic algorithm based fuzzy systems in R is fugeR (Bujard 2012).
In this experiment, a maximum of 500 generations and a population of 1000 were allowed. Out of every generation, 20 % of the population was set to be elitist. The rule sets generated in these experiments were set to sizes of 10, 20, and 50 rules, with a maximum of 1–10 variables allowed per rule. Each input and output variable is assigned membership functions describing the range of values the variable can take. A collection of such rules guides the input variable values to the predicted output.
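How membership functions and rules turn input values into a prediction can be shown with a small sketch. The text does not describe the membership function shapes or the defuzzification used by fugeR, so the triangular memberships, the minimum operator for combining conditions, and the weighted-average output used here are assumptions, as are all numbers.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzy_predict(rules, sample):
    """Weighted average of rule outputs; each weight is the rule's firing strength."""
    num = den = 0.0
    for antecedents, output in rules:
        # Firing strength: the weakest membership among the rule's conditions.
        strength = min(triangular(sample[var], *mf) for var, mf in antecedents.items())
        num += strength * output
        den += strength
    return num / den if den > 0 else None

# Two hypothetical rules: (membership parameters per input, output value).
rules = [
    ({"Mg": (0.0, 0.2, 0.8), "Dwell": (0.0, 20.0, 60.0)}, 2.5),
    ({"Mg": (0.4, 1.2, 2.0), "Dwell": (30.0, 70.0, 110.0)}, 1.0),
]
pred = fuzzy_predict(rules, {"Mg": 0.5, "Dwell": 40.0})
```

In this example the first rule fires with strength 0.5 and the second with 0.125, so the prediction lies much closer to the first rule's output.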
This work makes use of fuzzy systems with co-evolution and symbolic regression methods in its course. The aim is to create models and use the models to extract rules and a mathematical formula in case of symbolic regression.
Symbolic regression by RGP
Genetic programming (GP) involves the automatic generation of computer programs to perform a user defined task. GP is a bio-inspired algorithm based on evolutionary principles for solving complex problems (Poli et al. 2008). Although RGP computations are expensive in time and computational power, their results are simple representations of the problem obtained without a priori information about it. RGP offers various options for the initialization, variation, and selection procedures inherent in GP.
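The core objects of symbolic regression, candidate formulas represented as expression trees and scored by their error on the data, can be sketched as follows. This is not RGP: the sketch only generates random candidates and keeps the best one, leaving out the crossover, mutation, and selection steps of a real GP run. The operator set, the variable names `x` and `z`, and the toy data are assumptions.

```python
import math
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def random_expr(rng, variables, depth=2):
    """Build a random expression tree as nested tuples (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return rng.choice(variables)          # terminal: input variable
        return round(rng.uniform(-2, 2), 2)       # terminal: numeric constant
    op = rng.choice(list(OPS))
    return (op, random_expr(rng, variables, depth - 1),
            random_expr(rng, variables, depth - 1))

def evaluate(expr, sample):
    """Evaluate an expression tree for one sample (a dict of variable values)."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, sample), evaluate(right, sample))
    if isinstance(expr, str):
        return sample[expr]
    return expr

def fitness_rmse(expr, data):
    """Root mean square error of the expression over all samples."""
    return math.sqrt(sum((evaluate(expr, s) - s["y"]) ** 2 for s in data) / len(data))

def random_search(data, variables, n_candidates=500, seed=0):
    """Keep the best of n random candidates (a stand-in for the evolutionary loop)."""
    rng = random.Random(seed)
    candidates = [random_expr(rng, variables) for _ in range(n_candidates)]
    return min(candidates, key=lambda e: fitness_rmse(e, data))

# Toy data generated from y = x + 2 * z.
data = [{"x": x, "z": z, "y": x + 2 * z} for x in (0.0, 1.0, 2.0) for z in (0.5, 1.5)]
best = random_search(data, ["x", "z"])
```

Whatever the search procedure, the result is a formula that can be printed and inspected, which is what makes this family of models transparent.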
The population size was set to 1000 and the modeling process was set to 5 million evolution steps divided into ten stages. After each stage, the models were tested according to the tenfold cross-validation method. RMSE of 0.12 was used as an additional algorithm stop condition based on the guidance of previous results generated by other tree based packages. The equations were created on the whole data set initially and then selected ones were optimized using SANN algorithm followed by the BFGS method (Nash and Varadhan 2011; Nash 2014).
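For reference, the RMSE used here as a stop condition and its normalized form can be computed as in the sketch below. The text does not state how the RMSE was normalized; dividing by the range of the observed values is an assumption of this example, and the numbers are invented.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

def nrmse_percent(observed, predicted):
    """RMSE as a percentage of the observed range (an assumed normalization)."""
    return 100.0 * rmse(observed, predicted) / (max(observed) - min(observed))

# Invented values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
error = rmse(observed, predicted)
error_pct = nrmse_percent(observed, predicted)
```

Normalizing by the range makes errors comparable across outcomes measured on different scales.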
Results and discussion
Normalized RMSE (%) and R2 for tensile strength
Results from the fugeR package can be represented in the form of linguistic rules, although no automatic method exists to defuzzify the system into a human readable form. Manual effort is needed to extract rules and membership functions from fugeR results and to map the membership functions back to the fuzzy rules. A set of rules (Additional file 1) is generated where each rule contains information about two input variables interacting with each other. The rules guide the input variable values to the predicted output. FugeR models, however accurate in their predictions, generate rules that are sometimes redundant and contradictory to each other within the same model, raising doubts about the validity and safety of using the models in a pharmaceutical environment.
Random forests show comparable results to monmlp and fugeR. The advantage of using random forests is that they are rule based techniques and that the output can be generated in a linguistic manner for further analysis using the inTrees package. Such rules, once simplified, can be used as guidance towards understanding and informed manipulation of the system. However, there are a few impediments: the rules created are so large in number that generalizing through them can be daunting, and they might represent the problem so broadly that the results become variable. Variability in how the system processes inputs to compute results is to be avoided owing to consistency and quality considerations within the pharmaceutical industry and with the regulatory authorities. With symbolic regression, such variability can be avoided as the solution can be represented in the form of an equation.
Rules generated from randomForest models using inTrees
Mg > 0.35 and Dwell ≤ 47.94
Mg ≤ 0.35 and Dwell > 47.94
Dwell > 17.75 and Compr. ≤ 24
SA ≤ 0.405 and MC ≤ 19.795
Mg ≤ 0.675 and NaCMC > 3.885 and Compr. ≤ 16
Mg ≤ 0.675 and Dwell > 47.94 and Compr. > 16 and Compr. ≤ 24
NaCMC ≤ 3.82 and Dwell ≤ 47.94 and Compr. > 24
NaCMC ≤ 2.635 and Dwell > 47.94 and Compr. > 24
SA ≤ 0.405 and Mg > 0.675 and Dwell ≤ 47.94
NaCMC ≤ 2.57 and Dwell ≤ 47.94 and Compr. > 24
SA ≤ 1.595 and Mg ≤ 0.35 and Dwell > 47.94
SA > 0.575 and Mg > 1.285 and Dwell > 24.52 and Compr. ≤ 24
SA > 1.595 and Dwell ≤ 72.5
MC > 25.095 and Mg ≤ 0.35 and Dwell > 24.52 and Dwell ≤ 47.94
SA ≤ 1.595
SA ≤ 0.405 and Mg > 0.35 and Dwell ≤ 47.94
Prediction range values for low, medium, and high in Table 2
Tensile strength (N/mm2)
The inTrees algorithm discretizes all the input and output values in the data set before dividing them into three quantiles based on the outcome value (Table 3). The rules are then extracted and pruned to define the outcome as low, medium, or high, corresponding to the initially defined three quantiles of the outcome values. In Table 2, the rules are presented as conditions in a simple linguistic manner which can be interpreted and used as guidance to create a product with tensile strength within a certain known range. ‘Freq’ and ‘err’ are the number of occurrences of a particular condition and the number of cases that deviate from it in the data set, respectively.
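The two steps described above, labeling the outcome by quantile and counting how often a rule applies and how often it is wrong, can be sketched as follows. This is not the inTrees code. The condition is taken from the first rule in Table 2, but the six samples, the tensile strength values (`TS`), and the predicted label attached to the rule are hypothetical, and `freq` and `err` are computed as counts following the description in the text.

```python
def tercile_labels(values):
    """Label each value low/medium/high by its rank-based tercile."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [None] * len(values)
    for rank, i in enumerate(order):
        labels[i] = ("low", "medium", "high")[min(3 * rank // len(values), 2)]
    return labels

def rule_stats(condition, predicted_label, samples, labels):
    """Return (freq, err): cases matching the condition, and matches with another label."""
    matched = [lab for s, lab in zip(samples, labels) if condition(s)]
    freq = len(matched)
    err = sum(1 for lab in matched if lab != predicted_label)
    return freq, err

# Hypothetical samples; TS stands for tensile strength.
samples = [
    {"Mg": 0.5, "Dwell": 20.0, "TS": 0.8},
    {"Mg": 1.0, "Dwell": 30.0, "TS": 0.6},
    {"Mg": 0.2, "Dwell": 60.0, "TS": 2.4},
    {"Mg": 0.3, "Dwell": 70.0, "TS": 2.1},
    {"Mg": 0.7, "Dwell": 40.0, "TS": 1.5},
    {"Mg": 0.4, "Dwell": 90.0, "TS": 1.4},
]
labels = tercile_labels([s["TS"] for s in samples])

# Condition "Mg > 0.35 and Dwell <= 47.94"; the label "low" is assumed here.
freq, err = rule_stats(
    lambda s: s["Mg"] > 0.35 and s["Dwell"] <= 47.94, "low", samples, labels
)
```

With these invented samples the condition matches three cases, one of which carries a different label, so the rule would be reported with a frequency of 3 and an error of 1.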
Equation 3 is simple and represents the problem in a concrete manner. The original data set contains six input features while the equation contains only the two most important ones. This is an example of feature selection behaviour by rgp, which has been observed in other instances as well (Mendyk et al. 2015). Feature selection concentrates the effect of crucial inputs in the system and discards the trivial ones in an attempt to capture more information in the model while making it simpler. Out of all the inputs, magnesium stearate and dwell time were selected as critical features. Although the rgp prediction error for tensile strength was the highest in the ranking (Table 1), it is the most transparent model of all methods tried. The choice of this equation was a tradeoff between simplicity and predictive performance, because the complexity of rgp models closer to the best generalization error increased exponentially and the resulting models were overfitted.
A complex systems perspective
Prediction of tensile strength benefits the drug discovery and production chain by preventing failures beforehand. This can be extended to designing a strategy that takes into account design problems and their solutions at an early stage in the drug discovery and manufacture life cycle (Thomke and Fujimoto 2000), while also conforming with the QbD principles of the FDA (ICH 2009). The developed CI models allow several approaches to be tested within the boundaries without the necessity of performing assays or conducting experiments in the laboratory. Increased understanding of the components of the system and how they interact leads to higher success rates in delivering the drug to market in less time and at lower cost. Variables critical for (data-driven) predictive ability are discussed here, as opposed to variables already known to be typical for product quality. Our results focus on highlighting variables which are important in increasing the predictive ability of the system for tensile strength. In this case, the amount of magnesium stearate and the speed of the tableting machine (dwell time) were found to be the most important variables for predicting tensile strength. Lesko et al. (2000) argue that predictability in pharmaceutical drug manufacture is of utmost importance, as it can only be achieved by truly understanding the drug, the underlying interactions, and the prevailing conditions, knowledge of which will directly influence the design of the production process (Van Dyck and Peter 2006).
CI models represent the problem of tensile strength satisfactorily. Furthermore, the models have been analyzed in an attempt to make them more transparent. Rules were extracted from randomForest models and represented in a simple and understandable manner which can be used by the pharmaceutical industry for research and regulatory purposes. A mathematical formula was created using symbolic regression which defines the problem of tensile strength for the particular data set used. Symbolic regression results exhibit feature selection behavior, taking into account only the input variables which contribute most towards the output. The latter is a starting point for further considerations about possible mechanisms governing the analyzed problem. Tensile strength is a factor describing the mechanical strength of a tablet. As the addition of magnesium stearate was found to be responsible for tablets being less durable, it might be hypothesized that the hydrophobic character of this excipient disrupts some hydrophilic interactions between particles in the tablet mass (Hersen-Delesalle et al. 2007). This conforms very well with one of the theories of tablet formation, where residual and/or crystalline water present in the bulk material of the tablet mass during compression is relocated and causes re-crystallization of the material in between particles, thus creating inter-particle bonds contributing to the strength of the resulting tablets (Crouter and Briens 2014).
MHK carried out data wrangling, modeling experiments, and drafted the manuscript, PKT and PK participated in the modeling experiments and manuscript, JS deployed the inTrees methods and participated in manuscript draft, RJ participated in drafting the manuscript, and AM participated in data collection, modeling, and manuscript. All authors read and approved the final manuscript.
Two authors, Mohammad Hassan Khalid and Pezhman Kazemi, were supported by the IPROCOM Marie Curie initial training network, funded through the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme FP7/2007-2013/under REA Grant agreement No. 316555.
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- Bolhuis GK, Lerk CF, Zijlstra HT, Boer AHD (1975) Film formation by magnesium stearate during mixing and its effect on tabletting. Pharm Weekbl Sci 110:317–325
- Bourquin J, Schmidli H, van Hoogevest P, Leuenberger H (1998) Comparison of artificial neural networks (ANN) with classical modelling techniques using different experimental designs and data from a galenical study on a solid dosage form. Eur J Pharm Sci 6:287–301
- Breiman L (2001) Random forests. Mach Learn. doi:10.1023/A:1010933404324
- Bujard A (2012) fugeR: FUzzy GEnetic, a machine learning algorithm to construct prediction model based on fuzzy logic. In: R Packag. version 0.1.2. http://cran.r-project.org/package=fugeR. Accessed 1 Jan 2015
- Cannon AJ (2005) Package “monmlp”: Multi-layer perceptron neural network with partial monotonicity constraints. doi:10.1007/11550907
- Chaffee MW, McNeill MM (2007) A model of nursing as a complex adaptive system. Nurs Outlook 55(5):232–241
- Crouter A, Briens L (2014) The effect of moisture on the flowability of pharmaceutical excipients. AAPS PharmSciTech 15:65–74. doi:10.1208/s12249-013-0036-0
- Duberg M, Nyström C (1982) Studies on direct compression of tablets. VI. Evaluation of methods for the estimation of particle fragmentation during compaction. Acta Pharm Suec 19:421–436
- Fukui E, Miyamura N, Kobayashi M (2001) Effect of magnesium stearate or calcium stearate as additives on dissolution profiles of diltiazem hydrochloride from press-coated tablets with hydroxypropylmethylcellulose. Int J Pharm 216:137–146
- Fürnkranz J (1999) Separate-and-conquer rule learning. Artif Intell Rev 13:3–54. doi:10.1023/A:1006524209794
- Gupta A, Peck GE, Miller RW, Morris KR (2005) Influence of ambient moisture on the compaction behavior of microcrystalline cellulose powder undergoing uni-axial compression and roller-compaction: a comparative study using near-infrared spectroscopy. J Pharm Sci 94:2301–2313. doi:10.1002/jps.20430
- Gupta RR, Gifford EM, Liston T et al (2010) Using open source computational tools for predicting human metabolic stability and additional absorption, distribution, metabolism, excretion, and toxicity properties. Drug Metab Dispos 38:2083–2090. doi:10.1124/dmd.110.034918
- Hentzschel CM, Alnaief M, Smirnova I et al (2012) Tableting properties of silica aerogel and other silicates. Drug Dev Ind Pharm 38:462–467. doi:10.3109/03639045.2011.611806
- Hersen-Delesalle C, Leclerc B, Couarraze G et al (2007) The effects of relative humidity and super-disintegrant concentrations on the mechanical properties of pharmaceutical compacts. Drug Dev Ind Pharm 33:1297–1307. doi:10.1080/03639040701384918
- Houtao D (2014) Interpreting Tree Ensembles with inTrees. CRAN. ArXiv:1408.5456. Freely available at https://cran.r-project.org/web/packages/inTrees/index.html
- Ibrić S, Djuriš J, Parojčić J, Djurić Z (2012) Artificial neural networks in evaluation and optimization of modified release solid dosage forms. Pharmaceutics 4:531–550. doi:10.3390/pharmaceutics4040531
- ICH (2009) Guidance for industry: Q8(R2) Pharmaceutical development. Freely available at http://www.fda.gov/downloads/Drugs/.../Guidances/ucm073507.pdf
- Khalid MH, Tuszyński PK, Szlek J et al (2015) From black-box to transparent computational intelligence models: a pharmaceutical case study, 2015. In: 13th International Conference on Frontiers of Information Technology (FIT), Islamabad, Pakistan, 2015, pp 114–118. doi: 10.1109/FIT.2015.30
- Landin M, Rowe RC, York P (2012) Establishing and analyzing the design space in the development of direct compression formulations by gene expression programming. Int J Pharm 434:35–42. doi:10.1016/j.ijpharm.2012.04.078
- Lesko LJ, Rowland M, Peck CC, Blaschke TF (2000) Optimizing the science of drug development: opportunities for better candidate selection and accelerated evaluation in humans. Pharm Res 17(11):1335–1344
- Maghsoodi M (2012) How spherical crystallization improves direct tableting properties: a review. Adv Pharm Bull 2:253–257. doi:10.5681/apb.2012.039
- Mendyk A, Güres S, Jachowicz R et al (2015) From heuristic to mathematical modeling of drugs dissolution profiles: application of artificial neural networks and genetic programming. Comput Math Methods Med 2015:1–9. doi:10.1155/2015/863874
- Nash JC (2014) On best practice optimization methods in R. J Stat Softw 60:1–14
- Nash JC, Varadhan R (2011) Unifying optimization algorithms to aid software system users: optimx for R. J Stat Softw. doi:10.18637/jss.v043.i09
- Nicklasson H, Podczeck F (2007) Evaluation of the role of pores during strength testing in compacts made from different particle size fractions of sucrose. Chem Pharm Bull 55:29–33. doi:10.1248/cpb.55.29
- Ortiza DM, Muzzio FJ, Mendez R (2014) Particle size segregation promoted by powder flow in confined space: the die filling process case. Powder Technol 262:215–222
- Pacławski A, Szlęk J, Lau R et al (2015) Empirical modeling of the fine particle fraction for carrier-based pulmonary delivery formulations. Int J Nanomedicine 10:801–810. doi:10.2147/IJN.S75758
- Paley J, Eva G (2010) Complexity theory as an approach to explanation in healthcare: a critical discussion. Int J Nurs Stud 48(2):269–279. doi:10.1016/j.ijnurstu.2010.09.012
- Poli R, Langdon WB, McPhee NF (2008) A field guide to genetic programming. (With contributions by J.R. Koza). Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk
- Quinlan JR (1992) Learning with continuous classes. Mach Learn 92:343–348. Freely available at http://sci2s.ugr.es/keel/pdf/algorithm/congreso/1992-Quinlan-AI.pdf
- Rodvold DM, McLeod DG, Brandt JM et al (2001) Introduction to artificial neural networks for physicians: taking the lid off the black box. Prostate 46:39–44
- Shao Q, Rowe RC, York P (2007) Comparison of neurofuzzy logic and decision trees in discovering knowledge from experimental data of an immediate release tablet formulation. Eur J Pharm Sci 31:129–136. doi:10.1016/j.ejps.2007.03.003
- Sun CC (2009) Materials science tetrahedron–a useful tool for pharmaceutical research and development. J Pharm Sci 98:1671–1687. doi:10.1002/jps.21552
- Sun CC, Himmelspach MW (2006) Reduced tabletability of roller compacted granules as a result of granule size enlargement. J Pharm Sci 95:200–206. doi:10.1002/jps.20531
- Thomke S, Fujimoto T (2000) The effect of ‘frontloading’ problem-solving on product development performance. J Prod Innov Manage 17(2):128–142
- Van Dyck W, Peter MA (2006) Pharmaceutical discovery as a complex system of decisions: the case of front-loaded experimentation. Emergence-Mahwah-Lawrence Erlbaum- 8(3):40
- Vezin WR, Pang HM, Khan KA, Malkowska S (2008) The effect of precompression in a rotary machine on tablet strength. Drug Dev Ind Pharm 9(8):1465–1474. doi:10.3109/03639048309052388
- Vromans H, Lerk CF (1988) Densification properties and compactibility of mixtures of pharmaceutical excipients with and without magnesium stearate. Int J Pharm 46:183–192. doi:10.1016/0378-5173(88)90076-2
- Xu M, Heng PWS, Liew CV (2015) Formulation and process strategies to minimize coat damage for compaction of coated pellets in a rotary tablet press: a mechanistic view. Int J Pharm 499:29–37. doi:10.1016/j.ijpharm.2015.12.068