Cumulative growth models over time have played, and still play, an important role in many fields of study. The objective of this research was to compare the Weibull model with the Regressive Objective Regression (ROR) methodology, based on the cumulative number of new COVID-19 cases in Iraq in 2020. The cumulative daily new cases were modeled with a linear mathematical model obtained through the ROR methodology, which explains their behavior using data lagged by up to 15 days, and this model was compared with the non-linear Weibull model fitted to the same data. The linear ROR methodology gave better results: the variance explained was 100%, the F statistic was higher, and the Root Mean Square Error (RMSE) improved by 72.38%; in addition, the linear model outperformed the non-linear model in six parameters. It is concluded that the cumulative cases of COVID-19 can be modeled with both models, and that the cumulative cases of this new disease can be predicted 15 days in advance by means of ROR mathematical modeling. Such forecasts can support better management of the pandemic and help reduce the number of deaths and of severe and critical patients. This is the first time that a ROR model has been applied to growth processes in data with respect to time.
Key findings:
The research compares the Weibull model with the ROR methodology for modeling the cumulative new COVID-19 cases in Iraq during 2020. The linear ROR method outperformed the non-linear Weibull model, with 100% variance explained, higher F statistic, and improved RMSE by 72.38%. Predictions 15 days in advance were feasible, aiding pandemic management despite ROR's novel application to time-dependent growth data.
What is known and what is new?
The abstract highlights the importance of cumulative growth models in various fields and introduces a novel comparison between the Weibull model and the Regressive Objective Regression (ROR) methodology for modeling COVID-19 cases in Iraq in 2020. While both models can predict cumulative cases, the ROR method, for the first time applied to growth data over time, outperforms the Weibull model, offering more accurate predictions and potential for better pandemic management.
What is the implication, and what should change now?
The implication of the research is significant, as it demonstrates the efficacy of the Regressive Objective Regression (ROR) methodology in predicting cumulative COVID-19 cases, offering potential for improved pandemic management. Moving forward, greater adoption and refinement of ROR models in analyzing growth data over time could enhance forecasting accuracy and aid in mitigating the impact of the pandemic.
The current situation that the planet is experiencing because of the new coronavirus is one more effect of harmful anthropogenic activity accumulated over thousands of years [1-3].
The novel coronavirus (2019-nCoV), identified on December 31, 2019 in Wuhan, China, and now officially named SARS-CoV-2, causes COVID-19. This virus is the first of its family to give rise to a pandemic, declared by the World Health Organization (WHO) on March 11, 2020 [4]. Global epidemiological studies of coronaviruses (CoV) over 15 years have shown that bats in Asia, Europe, Africa, America and Australia are reservoirs for a wide variety of viruses, harboring and spreading these infectious agents quite easily, which increases their transmission capacity [5-7]. According to the Research Group Mathematical Models in Science and Technology: Development, Analysis, Numerical Simulation and Control (MOMAT) of the Institute of Interdisciplinary Mathematics of the Complutense University of Madrid, Spain, the application of the Be-CoDiS (Between-Countries Disease Spread) model to the COVID-19 pandemic projected numerically that this viral phenomenon would be present in the world until July 2020 [8], but reality has surpassed the models.
Several models and methodologies have been applied to the study, analysis and modeling of COVID-19 around the world. Notable among them are: the linear First-Order Ordinary Differential Equation (EDOPO); the simple linear regression model; the Generalized Logistic Growth Model (GLM); the structured Susceptible-Exposed-Infected-Removed (SEIR) model; the Bayesian probability mathematical model; the SIRD model; the conceptual model; the simulation model; and the nonlinear Gompertz, Richards and Weibull models, among many others [9-15]. It is therefore important to estimate the trend in the behavior of the epidemiological curve of the COVID-19 pandemic [10, 13, 14, 16].
Cumulative growth models over time have played an important role in many fields of study, and many researchers have contributed to the development of the relevant models [8,9,13-16]. Among the most common nonlinear models are: Gompertz, Weibull, negative exponential, Richards (logistic, monomolecular), Brody, Mitscherlich, von Bertalanffy and S-shaped models, among others [9, 13]. About 77 equations for sigmoidal growth models are known; they are used in epidemics, bioassays, agriculture, engineering, tree diameter and forest height distribution [9]. Among the most commonly used growth models are those of Gompertz, Richards and Weibull. Their formulas can be seen in detail in Ban (2021) [15], which gives a detailed mathematical description of them, although the root mean squared errors (RMSE) reported there for the nonlinear models are large.
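As an illustration of what a Weibull-type growth curve looks like, the following minimal sketch uses one common three-parameter form, y(t) = K(1 - exp(-(t/scale)^shape)). This parameterization, the function name `weibull_growth` and the parameter values are assumptions made for illustration only; they are not necessarily the form or the estimates used in [15].

```python
import math


def weibull_growth(t, k_max, scale, shape):
    """One common Weibull-type growth curve (assumed form, not taken from [15]).

    t      : time since the start of the series (t >= 0)
    k_max  : upper asymptote of the cumulative count
    scale  : time scale of the curve (same units as t)
    shape  : shape parameter controlling how sharp the S-curve is
    """
    return k_max * (1.0 - math.exp(-((t / scale) ** shape)))


# Hypothetical parameter values, chosen only to show the sigmoidal shape.
curve = [weibull_growth(t, k_max=100.0, scale=10.0, shape=2.0) for t in range(0, 31, 5)]
```

The curve starts at zero, rises slowly, accelerates, and then flattens toward `k_max`, which is the sigmoidal behavior the growth models above have in common.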
The aim of the present research was to compare the Weibull model, which is a nonlinear model, with the ROR methodology, which is a linear model, using the cumulative number of new COVID-19 cases in Iraq.
For this work we used the data on cumulative new cases of the pandemic in Iraq, taken from the article "Statistical modeling of the novel COVID-19 epidemic in Iraq" by Ban Ghanim Al-Ani, published in the journal "Epidemiol. Methods" in 2021. The data were transcribed manually and entered into SPSS for further processing; they cover the period from the start of the pandemic series (March 13, 2020) until July 22, 2020. The forecast was performed with the Regressive Objective Regression (ROR) methodology, which has been applied to different variables, such as viruses that have circulated in Villa Clara province, Cuba [16], and particularly to COVID-19 in Cuba [17]. A short- and long-term forecast up to August 6, 2020 was performed with da
In the linear ROR methodology, we must first create the dichotomous variables DS and DI and the variable NoC, where NoC is the case number in the database (its coefficient in the model represents the trend of the series). DS = 1 if NoC is odd and DS = 0 if NoC is even; DI takes the opposite values (DI = 0 when DS = 1, and vice versa). DS represents a sawtooth function and DI the same function in inverted form, so that the variable to be modeled is trapped between these parameters and a large amount of variance is explained. A detailed explanation can be found in Fimia et al. (2020) [18].
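The construction of the ROR predictors can be sketched as follows. This is a minimal illustration, assuming that DS is 1 for odd NoC and DI is its complement, and that the lagged predictors are simply the cumulative series shifted by 1 and 15 observations; the function name `build_ror_rows` and the dictionary layout are hypothetical. The input values are the first 17 observations listed in Table 4.

```python
def build_ror_rows(cases, short_lag=1, long_lag=15):
    """Build one row of ROR predictors per observation that has both lags available.

    cases: list of cumulative case counts, ordered by date.
    """
    rows = []
    for i in range(long_lag, len(cases)):
        noc = i + 1                      # 1-based case number in the database
        ds = 1 if noc % 2 == 1 else 0    # sawtooth: 1 on odd case numbers (assumed coding)
        di = 1 - ds                      # inverted sawtooth
        rows.append({
            "NoC": noc,
            "DS": ds,
            "DI": di,
            "Lag1New": cases[i - short_lag],   # value 1 observation earlier
            "Lag15New": cases[i - long_lag],   # value 15 observations earlier
            "Cases": cases[i],                 # dependent variable
        })
    return rows


# First 17 cumulative counts from Table 4 (13-MAR-2020 to 29-MAR-2020).
cases = [15, 29, 35, 56, 62, 73, 88, 109, 128, 161, 211, 241, 277, 353, 401, 442, 525]
rows = build_ror_rows(cases)
```

The first usable row corresponds to case number 16, which is consistent with Table 4 reporting predicted values only from the 16th observation onward.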
The improvement index between the two models (Weibull vs ROR) was calculated using the SKILL_SCORE formula with a slight modification: the Root Mean Squared Error (RMSE) was used instead of the Mean Squared Error (MSE). All data processing and analysis was carried out with the IBM SPSS statistical package, Version 19.
A summary of the linear ROR model obtained is shown, where 100 % of the variance is explained (Table 1).
Table 1. Model summary (c, d).
Model | R | R squared (b) | Adjusted R squared | Standard error of the estimate | Durbin-Watson |
1 | 1.000 (a) | 1.000 | 1.000 | 178.924 | 1.124 |
a. Predictors: Lag1New, DS, DI, NoC, Lag15New.
b. For regression through the origin (the model without intercept), R-squared measures the proportion of the variability in the dependent variable about the origin explained by the regression. This CANNOT be compared to R-squared for models that include an intercept.
c. Dependent variable: Cases-Iraq.
d. Linear regression through the origin.
The analysis of variance of the model was highly significant, with a Fisher's F of 1 020 727.429 (Sig. = .000) (Table 2).
Table 2. ANOVA (a, b).
Model | Source | Sum of squares | df | Mean square | F | Sig. |
1 | Regression | 163387352764.683 | 5 | 32677470552.937 | 1020727.429 | .000 (c) |
1 | Residual | 3585557.317 | 112 | 32013.905 | | |
1 | Total | 163390938322.000 (d) | 117 | | | |
a. Dependent variable: Cases-Iraq.
b. Linear regression through the origin.
c. Predictors: Lag1New, DS, DI, NoC, Lag15New.
d. This total sum of squares is not corrected for the constant because the constant is zero for regression through the origin.
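The quantities in Tables 1 and 2 are linked by standard regression identities, which can be checked with a short calculation. The sketch below uses only the sums of squares and degrees of freedom reported in the ANOVA table; the variable names are illustrative.

```python
import math

# Values reported in the ANOVA table (regression through the origin).
ss_regression = 163387352764.683
ss_residual = 3585557.317
ss_total = 163390938322.000      # uncorrected total, as noted in the table footnote
df_regression = 5
df_residual = 112

ms_regression = ss_regression / df_regression   # mean square of the regression
ms_residual = ss_residual / df_residual         # mean square of the residuals

f_statistic = ms_regression / ms_residual       # Fisher's F
std_error = math.sqrt(ms_residual)              # standard error of the estimate
r_squared = 1.0 - ss_residual / ss_total        # R squared about the origin

print(f"F = {f_statistic:.3f}")
print(f"Standard error of the estimate = {std_error:.3f}")
print(f"R squared = {r_squared:.6f}")
```

The standard error of the estimate is the square root of the residual mean square, so it coincides with the RMSE of 178.924 quoted for the ROR model. R squared is slightly below 1 and is reported as 1.000 after rounding to three decimals.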
Table 3 shows that all the variables are significant (Sig. ≤ .003) and that the trend coefficient is positive and significant, so the disease tends to increase. The linear model depends on the data lagged by 15 days (Lag15New) and on the data lagged by 1 day (Lag1New).
Table 3. Coefficients (a, b).
Model | Variable | B (unstandardized) | Standard error | Beta (standardized) | t | Sig. |
1 | DS | -181.845 | 59.741 | -.003 | -3.044 | .003 |
1 | DI | -181.722 | 59.460 | -.003 | -3.056 | .003 |
1 | Trend (NoC) | 5.237 | 1.091 | .011 | 4.801 | .000 |
1 | Lag15New | -.106 | .006 | -.059 | -16.789 | .000 |
1 | Lag1New | 1.087 | .004 | 1.052 | 244.841 | .000 |
a. Dependent variable: Cases-Iraq.
b. Linear regression through the origin.
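To make the fitted model concrete, the coefficients in Table 3 can be combined into a one-step prediction. The sketch below assumes that DS is 1 for odd case numbers and DI is its complement, and that the lagged predictors are the cumulative counts 1 and 15 observations earlier; the function name `ror_predict` is hypothetical. Because the published coefficients are rounded to three decimals, the result only approximates the predicted values printed in Table 4.

```python
# Unstandardized coefficients B as reported in Table 3.
COEF = {
    "DS": -181.845,
    "DI": -181.722,
    "NoC": 5.237,        # trend
    "Lag15New": -0.106,
    "Lag1New": 1.087,
}


def ror_predict(noc, lag1, lag15):
    """Prediction of the linear ROR model through the origin (no intercept)."""
    ds = 1 if noc % 2 == 1 else 0   # assumed coding: 1 on odd case numbers
    di = 1 - ds
    return (COEF["DS"] * ds
            + COEF["DI"] * di
            + COEF["NoC"] * noc
            + COEF["Lag15New"] * lag15
            + COEF["Lag1New"] * lag1)


# Case 16 (28-MAR-2020): previous day = 401 cases, 15 days earlier = 15 cases.
pred_16 = ror_predict(noc=16, lag1=401, lag15=15)
# Case 17 (29-MAR-2020): previous day = 442 cases, 15 days earlier = 29 cases.
pred_17 = ror_predict(noc=17, lag1=442, lag15=29)
```

Both values fall within a fraction of a case of the predictions listed in Table 4 for those dates (336.26088 and 384.44873). The coefficient of Lag1New close to 1 shows that the previous day's cumulative count carries most of the prediction.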
In Figure 1 we can see that the residuals do not differ from a normal distribution with zero mean and variance 0.983.
Figure 1. Histogram of the Standardized Residuals for the Linear ROR Model
The modeling results show that the predicted values coincide with the real values of the pandemic, and the errors are very small, close to zero (Figure 2).
Figure 2. Cumulative New Cases Forecast for Iraq According to ROR Linear Model
The 15-day-ahead forecast is shown in Table 4, where a marked increase in cases can be seen. If the trends of this model were to continue and the pandemic were not managed with severe measures, this would be an unfavorable scenario for decision makers: 160873.71497 cumulative cases could have been reached by August 6, 2020, according to the linear ROR model using only the parameter Lag15New.
Table 4. Case summaries (a).
No. | Date | Cases-Iraq | Unstandardized Predicted Value | Unstandardized Residual | |
1 | 13-MAR-2020 | 15 | . | . | |
2 | 14-MAR-2020 | 29 | . | . | |
3 | 15-MAR-2020 | 35 | . | . | |
4 | 16-MAR-2020 | 56 | . | . | |
5 | 17-MAR-2020 | 62 | . | . | |
6 | 18-MAR-2020 | 73 | . | . | |
7 | 19-MAR-2020 | 88 | . | . | |
8 | 20-MAR-2020 | 109 | . | . | |
9 | 21-MAR-2020 | 128 | . | . | |
10 | 22-MAR-2020 | 161 | . | . | |
11 | 23-MAR-2020 | 211 | . | . | |
12 | 24-MAR-2020 | 241 | . | . | |
13 | 25-MAR-2020 | 277 | . | . | |
14 | 26-MAR-2020 | 353 | . | . | |
15 | 27-MAR-2020 | 401 | . | . | |
16 | 28-MAR-2020 | 442 | 336.26088 | 105.73912 | |
17 | 29-MAR-2020 | 525 | 384.44873 | 140.55127 | |
18 | 30-MAR-2020 | 590 | 479.37201 | 110.62799 | |
19 | 31-MAR-2020 | 624 | 552.90016 | 71.09984 | |
20 | 01-APR-2020 | 668 | 594.57252 | 73.42748 | |
21 | 02-APR-2020 | 716 | 646.33854 | 69.66146 | |
22 | 03-APR-2020 | 774 | 702.27172 | 71.72828 | |
23 | 04-APR-2020 | 857 | 768.19259 | 88.80741 | |
24 | 05-APR-2020 | 927 | 861.73826 | 65.26174 | |
25 | 06-APR-2020 | 1018 | 939.42853 | 78.57147 | |
26 | 07-APR-2020 | 1098 | 1038.38317 | 59.61683 | |
27 | 08-APR-2020 | 1128 | 1127.25888 | .74112 | |
28 | 09-APR-2020 | 1175 | 1161.40513 | 13.59487 | |
29 | 10-APR-2020 | 1214 | 1209.54336 | 4.45664 | |
30 | 11-APR-2020 | 1248 | 1252.19875 | -4.19875 | |
31 | 12-APR-2020 | 1274 | 1289.91813 | -15.91813 | |
32 | 13-APR-2020 | 1296 | 1314.73678 | -18.73678 | |
33 | 14-APR-2020 | 1311 | 1336.87184 | -25.87184 | |
34 | 15-APR-2020 | 1330 | 1354.92872 | -24.92872 | |
35 | 16-APR-2020 | 1378 | 1376.02890 | 1.97110 | |
36 | 17-APR-2020 | 1409 | 1428.46507 | -19.46507 | |
37 | 18-APR-2020 | 1435 | 1461.12270 | -26.12270 | |
38 | 19-APR-2020 | 1470 | 1485.94134 | -15.94134 | |
39 | 20-APR-2020 | 1498 | 1521.67435 | -23.67435 | |
40 | 21-APR-2020 | 1527 | 1547.81874 | -20.81874 | |
41 | 22-APR-2020 | 1573 | 1575.97153 | -2.97153 | |
42 | 23-APR-2020 | 1604 | 1628.14165 | -24.14165 | |
43 | 24-APR-2020 | 1659 | 1661.96496 | -2.96496 | |
44 | 25-APR-2020 | 1716 | 1722.96213 | -6.96213 | |
45 | 26-APR-2020 | 1743 | 1786.41864 | -43.41864 | |
46 | 27-APR-2020 | 1824 | 1818.36432 | 5.63568 | |
47 | 28-APR-2020 | 1899 | 1909.17455 | -10.17455 | |
48 | 29-APR-2020 | 1981 | 1994.45008 | -13.45008 | |
49 | 30-APR-2020 | 2049 | 2086.66498 | -37.66498 | |
50 | 01-MAY-2020 | 2155 | 2160.83622 | -5.83622 | |
51 | 02-MAY-2020 | 2192 | 2277.86156 | -85.86156 | |
52 | 03-MAY-2020 | 2242 | 2320.67478 | -78.67478 | |
53 | 04-MAY-2020 | 2327 | 2376.41804 | -49.41804 | |
54 | 05-MAY-2020 | 2376 | 2471.18349 | -95.18349 | |
55 | 06-MAY-2020 | 2439 | 2526.47582 | -87.47582 | |
56 | 07-MAY-2020 | 2499 | 2595.42523 | -96.42523 | |
57 | 08-MAY-2020 | 2575 | 2662.45991 | -87.45991 | |
58 | 09-MAY-2020 | 2663 | 2744.58339 | -81.58339 | |
59 | 10-MAY-2020 | 2714 | 2839.29195 | -125.29195 | |
60 | 11-MAY-2020 | 2809 | 2897.21375 | -88.21375 | |
61 | 12-MAY-2020 | 2928 | 2996.98630 | -68.98630 | |
62 | 13-MAY-2020 | 3039 | 3123.72078 | -84.72078 | |
63 | 14-MAY-2020 | 3089 | 3240.77542 | -151.77542 | |
64 | 15-MAY-2020 | 3156 | 3293.26570 | -137.26570 | |
65 | 16-MAY-2020 | 3300 | 3359.95990 | -59.95990 | |
66 | 17-MAY-2020 | 3450 | 3517.89008 | -67.89008 | |
67 | 18-MAY-2020 | 3507 | 3680.71914 | -173.71914 | |
68 | 19-MAY-2020 | 3620 | 3739.01521 | -119.01521 | |
69 | 20-MAY-2020 | 3773 | 3861.74036 | -88.74036 | |
70 | 21-MAY-2020 | 3860 | 4026.69611 | -166.69611 | |
71 | 22-MAY-2020 | 4168 | 4120.00000 | 48.00000 | |
72 | 23-MAY-2020 | 4365 | 4452.02493 | -87.02493 | |
73 | 24-MAY-2020 | 4528 | 4661.90455 | -133.90455 | |
74 | 25-MAY-2020 | 4744 | 4838.99947 | -94.99947 | |
75 | 26-MAY-2020 | 5031 | 5068.78562 | -37.78562 | |
76 | 27-MAY-2020 | 5353 | 5373.43202 | -20.43202 | |
77 | 28-MAY-2020 | 5769 | 5716.71852 | 52.28148 | |
78 | 29-MAY-2020 | 6075 | 6168.86806 | -93.86806 | |
79 | 30-MAY-2020 | 6335 | 6499.42919 | -164.42919 | |
80 | 31-MAY-2020 | 6764 | 6772.08399 | -8.08399 | |
81 | 01-JUN-2020 | 7283 | 7227.52029 | 55.47971 | |
82 | 02-JUN-2020 | 8064 | 7790.86365 | 273.13635 | |
83 | 03-JUN-2020 | 8736 | 8632.75809 | 103.24191 | |
84 | 04-JUN-2020 | 9742 | 9352.20163 | 389.79837 | |
85 | 05-JUN-2020 | 10994 | 10441.37084 | 552.62916 | |
86 | 06-JUN-2020 | 12262 | 11774.70607 | 487.29393 | |
87 | 07-JUN-2020 | 13377 | 13136.94801 | 240.05199 | |
88 | 08-JUN-2020 | 14164 | 14336.76366 | -172.76366 | |
89 | 09-JUN-2020 | 15310 | 15174.26372 | 135.73628 | |
90 | 10-JUN-2020 | 16571 | 16394.62845 | 176.37155 | |
91 | 11-JUN-2020 | 17666 | 17736.01686 | -70.01686 | |
92 | 12-JUN-2020 | 18846 | 18887.28704 | -41.28704 | |
93 | 13-JUN-2020 | 20105 | 20142.34393 | -37.34393 | |
94 | 14-JUN-2020 | 21211 | 21488.37300 | -277.37300 | |
95 | 15-JUN-2020 | 22596 | 22649.97583 | -53.97583 | |
96 | 16-JUN-2020 | 24150 | 24105.48962 | 44.51038 | |
97 | 17-JUN-2020 | 25613 | 25716.65659 | -103.65659 | |
98 | 18-JUN-2020 | 27248 | 27240.72375 | 7.27625 | |
99 | 19-JUN-2020 | 29118 | 28916.07451 | 20.92549 | |
100 | 20-JUN-2020 | 30764 | 30820.98777 | -56.98777 | |
101 | 21-JUN-2020 | 32572 | 32480.52868 | 91.47132 | |
102 | 22-JUN-2020 | 34398 | 34332.58111 | 65.41889 | |
103 | 23-JUN-2020 | 36598 | 36238.70922 | 359.29078 | |
104 | 24-JUN-2020 | 39035 | 38513.48397 | 521.51603 | |
105 | 25-JUN-2020 | 41089 | 41033.38872 | 55.61128 | |
106 | 26-JUN-2020 | 43158 | 43154.90192 | 3.09808 | |
107 | 27-JUN-2020 | 45298 | 45283.46494 | 14.53506 | |
108 | 28-JUN-2020 | 47047 | 47481.05986 | -434.05986 | |
109 | 29-JUN-2020 | 49005 | 49269.70352 | -264.70352 | |
110 | 30-JUN-2020 | 51420 | 51256.15708 | 163.84292 | |
111 | 01-JUL-2020 | 53604 | 53721.10405 | -117.10405 | |
112 | 02-JUL-2020 | 55916 | 55944.89825 | -28.89825 | |
113 | 03-JUL-2020 | 58250 | 58289.32603 | -39.32603 | |
114 | 04-JUL-2020 | 60375 | 60633.00348 | -258.00348 | |
115 | 05-JUL-2020 | 62171 | 62773.04268 | -602.04268 | |
116 | 06-JUL-2020 | 64597 | 64538.61685 | 58.38315 | |
117 | 07-JUL-2020 | 67338 | 66986.69427 | 351.30573 | |
118 | 08-JUL-2020 | 69508 | 69737.71031 | -229.71031 | |
119 | 09-JUL-2020 | 72356 | 71842.83116 | 513.16884 | |
120 | 10-JUL-2020 | 75090 | 74725.60144 | 364.39856 | |
121 | 11-JUL-2020 | 77402 | 77482.64824 | -80.64824 | |
122 | 12-JUL-2020 | 79631 | 79773.80521 | -142.80521 | |
123 | 13-JUL-2020 | 81653 | 82015.95187 | -362.95187 | |
124 | 14-JUL-2020 | 83863 | 84011.23685 | -148.23685 | |
125 | 15-JUL-2020 | 86144 | 86162.15917 | -18.15917 | |
126 | 16-JUL-2020 | 88167 | 88414.96410 | -247.96410 | |
127 | 17-JUL-2020 | 90216 | 90373.57843 | -157.57843 | |
128 | 18-JUL-2020 | 92526 | 92358.36103 | 167.63897 | |
129 | 19-JUL-2020 | 94689 | 94648.69001 | 40.30999 | |
130 | 20-JUL-2020 | 96803 | 96814.37438 | -11.37438 | |
131 | 21-JUL-2020 | 99000 | 98859.80270 | 140.19730 | |
132 | 22-JUL-2020 | 101258 | 100962.29504 | 295.70496 | |
133 | 23-JUL-2020 | . | 103191.34418 | . | |
134 | 24-JUL-2020 | . | 118206.33323 | . | |
135 | 25-JUL-2020 | . | 122244.11662 | . | |
136 | 26-JUL-2020 | . | 125597.27251 | . | |
137 | 27-JUL-2020 | . | 128932.37848 | . | |
138 | 28-JUL-2020 | . | 131882.01664 | . | |
139 | 29-JUL-2020 | . | 135190.68524 | . | |
140 | 30-JUL-2020 | . | 138500.70648 | . | |
141 | 31-JUL-2020 | . | 141549.17571 | . | |
142 | 01-AUG-2020 | . | 144536.38277 | . | |
143 | 02-AUG-2020 | . | 147984.19541 | . | |
144 | 03-AUG-2020 | . | 151130.02668 | . | |
145 | 04-AUG-2020 | . | 154305.11699 | . | |
146 | 05-AUG-2020 | . | 157498.25724 | . | |
147 | 06-AUG-2020 | . | 160873.71497 | . | |
Total | N | 147 | 132 | 132 | 117 |
a. Limited to the first 200 cases. |
Table 5 shows the results for both models across a set of statistical parameters. The F statistic of the linear ROR model, as well as six other parameters, show better results than the nonlinear Weibull model; however, both linear and nonlinear models can be used in the modeling and prediction of this disease.
Table 5. Comparison of Linear ROR and Nonlinear Weibull Model
Criterion | Nonlinear Weibull model | Linear model ROR |
F | 97 393.322 | 1 020 727.429 |
RMSE | 647.619 | 178.924 |
BIAS | 0.000 | 0.000 |
MAE | 531.753 | 118.5490 |
AIC | 1711.894 | 1432.64 |
AICC | 1712.207 | 7760.64 |
BIC | 1723.425 | 1765.81 |
WIC | 1725.274 | 1215.15 |
Note: Weibull results were taken from the article "Statistical modeling of the novel COVID-19 epidemic in Iraq", by Ban Ghanim Al-Ani, 2020, https://doi.org/10.1515/em-2020-0025 [15].
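For readers less familiar with the error criteria in Table 5, the sketch below shows the usual definitions of BIAS (mean residual), MAE (mean absolute residual) and RMSE (root of the mean squared residual). It is illustrative only: it uses four rows of Table 4 (cases 27 to 30), so the resulting numbers are not comparable with the full-sample values in Table 5, and the helper name `error_metrics` is hypothetical. Note also that this plain RMSE divides by the number of observations, whereas the SPSS standard error of the estimate divides by the residual degrees of freedom.

```python
import math


def error_metrics(observed, predicted):
    """Return (bias, mae, rmse) for paired observed and predicted values."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    n = len(residuals)
    bias = sum(residuals) / n                               # mean residual
    mae = sum(abs(r) for r in residuals) / n                # mean absolute error
    rmse = math.sqrt(sum(r * r for r in residuals) / n)     # root mean squared error
    return bias, mae, rmse


# Cases 27 to 30 of Table 4 (08-APR-2020 to 11-APR-2020).
observed = [1128, 1175, 1214, 1248]
predicted = [1127.25888, 1161.40513, 1209.54336, 1252.19875]

bias, mae, rmse = error_metrics(observed, predicted)
```

RMSE is never smaller than MAE on the same residuals, because squaring gives more weight to large errors.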
Finally, the rate of improvement of one model over the other, that is, the modeling skill, was calculated as follows:

$$\mathrm{SKILL\_SCORE} = 1 - \frac{\mathrm{RMSE}_{\text{model to verify}}}{\mathrm{RMSE}_{\text{established model}}}$$
Here a slight modification was made to the Wilks (1987) [19] formula: the RMSE was used instead of the MSE of the original formula. The established model is the Weibull model and the model to be verified is the ROR model. The improvement was 72.38 %, so we can affirm that in this case the linear model outperforms the nonlinear model, at least for the cumulative COVID-19 cases studied in Iraq.
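The skill score can be reproduced from the RMSE values in Table 5. The following is a minimal sketch of the RMSE-based formula described above; the function name `skill_score` is illustrative.

```python
def skill_score(rmse_to_verify, rmse_established):
    """RMSE-based skill score: fraction by which the verified model reduces
    the RMSE of the established (reference) model."""
    return 1.0 - rmse_to_verify / rmse_established


rmse_ror = 178.924       # linear ROR model (model to verify), Table 5
rmse_weibull = 647.619   # nonlinear Weibull model (established model), Table 5

skill = skill_score(rmse_ror, rmse_weibull)
print(f"Improvement of ROR over Weibull: {100 * skill:.2f} %")
```

With the rounded RMSE values from Table 5 the score comes out at roughly 72.4 %, in line with the 72.38 % reported; a small difference in the last digit can arise from rounding of the inputs. A score of 0 would mean no improvement over the established model, and a negative score would mean the verified model is worse.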
In the summary shown in Table 1, 100 % of the variance is explained, with an error of 178.9 cases. As the model is intended for the short term, the Durbin-Watson statistic is small, which leaves room for introducing more independent variables into a future model if necessary. Nevertheless, the model obtained yielded good results, in agreement with results obtained in other studies with different infectious entities, including COVID-19 [20-22].
The trend of the disease was increasing: according to the linear ROR model, a cumulative total of 160 873.7 cases could be reached by August 6, 2020. In terms of trend, these results coincide with those obtained for the pandemic in Cuba [20-21].
The linear ROR model outperformed the non-linear Weibull model by 72.38 %, but this result should not be taken as general, since it comes from a single study. Moreover, both models offer very good possibilities for use in COVID-19 modeling, so they constitute excellent tools for pandemic monitoring, analysis and prediction [11,12,15], with a high level of application in many fields and areas of science [23-25]. This is also the first time that a ROR model has been applied to growth processes in data with respect to time, which represents a step forward in the mathematical modeling of COVID-19, a disease that is in itself an epidemic of paradoxes [26]. In view of everything analyzed in this research, it is important to estimate the trend in the behavior of the epidemiological curve of the pandemic, because we do not yet know whether the virus will become endemic, recurring year after year, or will finally be controlled [26-29].
The ROR mathematical modeling showed better results than the non-linear Weibull model in the F statistic as well as in six other parameters. However, both linear and non-linear models show good results in the modeling and prediction of COVID-19. Despite being a new disease, its cumulative cases can be predicted 15 days in advance using the ROR methodology, which can support better management of the pandemic and help reduce the number of deaths and of severe and critical patients.
Funding: No funding sources.
Conflict of interest: None declared.
Ethical approval: The study was approved by the Institutional Ethics Committee of University of Medical Sciences of Villa Clara.