Here we propose some steps for calculating the shadow economy, treated as a latent or simply a response variable, in models that involve highly volatile observables such as money aggregates, when the number of data points is small. In this case we propose to check for possible extreme behavior or self-organization regimes in the series by testing a log-periodic fit to the data. To improve the linear regressions, critical points (if found) are excluded from the series by simply truncating it. Next, the presence of more general regimes is analyzed using empirical mode decomposition techniques, and we estimate that the best truncated series should exclude the edges of such regimes. In the case of short-term regimes, we propose to use series over intervals that include many cycles. This technique worked for the calculation of the informal economy in the Republic of Macedonia for the short period [2004, 2016], and it is expected to improve the calculation in other cases as well.
1. Some remarks on modeling with economic and financial time series
In a general consideration of econometric modeling, we attempt to fit the observed values of an economic observable to a function of other ones. From a mathematical point of view, it is desirable that the variables involved in the model be stationary. If the data series are not stationary, the unit root must be removed by some additional operation. In some models for estimating the unregistered economy, a key variable, the money aggregate , , is by nature non-stationary and usually volatile. Procedures of data elaboration for this case are detailed at length in the literature, for example in ,  etc. A more detailed algebraic analysis and remarks on those procedures are given in reference , etc. However, indexes and money aggregates can be characterized by a special non-linear dynamics called self-organization behavior ,  that cannot easily be handled in linear models. When dealing with a model for estimating a latent quantity such as the informal economy, it is worth considering the additional effects of critical behavior in the dynamics of the model's variables. Those effects are expected to become important if the series consist of a small number of data points. In the calculation of the informal economy in the Republic of Macedonia for the period [2004, 2014] using the standard models CDA (currency demand approach) and MIMIC (multiple indicators, multiple causes) as proposed in  etc., we noticed that the results were not satisfactory. Calculations for other periods have been reported in many references, as in ,  etc., but apparently using more data. In our initial work we proposed to use monthly data to improve the calculation and to overcome the problem of the small number of data points, which makes regressions less reliable. But the key variable of the model, the currency in circulation, showed self-organization-like behavior in the monthly series. So we must take this problematic behavior into account in the modeling process.
Next, briefly, we can expect that an observable can be measured or known accurately only if its (eigen)state is stationary. But in transition economies (as in our case), there exists at least one point at which a total regime change occurs. Other disturbances could be present too. In short, real systems are not as well behaved as mathematical models require them to be. We therefore assumed that removing those shortcomings could improve the result of linear modeling, and we considered them in a preliminary analysis. In our case study we first analyzed the state itself from a general point of view, and then explored the specifics of the dynamics of the series used for the calculation.
2. Some proposals and approaches
As a starting point, we calculated the expected position of the variable under study and tried to recognize its general trend. Here we refer to the parametric distribution analyzed in ,  and , say
called the q-Gaussian. Its specific advantage is that the q parameter represents the distance from the Gaussian distribution . Using this relationship, we estimated the margins of the informal economy for the country from the distribution over global economies, and acknowledged the general tendencies found in other analyses as in  or . Next, we expect to improve the result of the model by addressing complex dynamics issues for the money aggregate variables used in the regressions of the CDA or MIMIC models. In reference , self-organization and discrete scale invariance (DSI) have been analyzed theoretically. Therein, specific functions called log-periodic have been proposed to capture the DSI or fractal dynamics of the quantity under study. It reads
where tc is the critical time. According to , it signifies the moment where a regime change is most probable to occur. Other treatments include more terms than in (2), as analyzed in  or in . We assume that linear models are expected to fit better if the series do not contain critical points. In the following we discuss the dynamics related to the critical time tc, leaving the cyclic frequency (ω) and the power exponent (m) out of the comments. Another problem that is supposed to affect the linearization procedure in the models mentioned above is the presence of regimes. It is reasonable to use series belonging to one single regime and far from its edges, so we propose to analyze the regimes in the series used. A very interesting method based on error reduction analysis, called the empirical mode decomposition (EMD) technique, is discussed in  and , and in many applications to real systems. In short, the method adaptively represents non-stationary signals as sums of zero-mean AM-FM components , . Highly dynamical series that are difficult to examine with analytic techniques can be investigated using the EMD approach instead. Many improvements of this calculation have been introduced successively, and some of them are discussed in . In the following we describe the procedures used in our concrete calculation.
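For orientation, the two functional forms named in this section are commonly written as below. This is a sketch in standard notation under stated assumptions, not a reproduction of the authors' own equations: the symbols \beta, \mu, Z_q, A, B, C and \phi are assumed here, while q, t_c, m and \omega are the parameters named in the text.

```latex
% q-Gaussian density (standard Tsallis form); [.]_+ means the bracket is set to 0 when negative
p_q(x) = \frac{1}{Z_q}\,\Bigl[\,1 - (1-q)\,\beta\,(x-\mu)^2\,\Bigr]_+^{\frac{1}{1-q}},
\qquad
\lim_{q \to 1} p_q(x) = \frac{1}{Z_1}\, e^{-\beta (x-\mu)^2}

% first-order log-periodic function, defined for t < t_c
y(t) = A + B\,(t_c - t)^m + C\,(t_c - t)^m \cos\bigl(\omega \ln(t_c - t) - \phi\bigr)
```

In the first form, q = 1 recovers the Gaussian, so |q - 1| can be read as a distance from Gaussian behavior. In the second, the oscillations accelerate as t approaches t_c, which is why t_c is interpreted as the most probable moment of a regime change.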
3. Estimation of the boundaries from descriptive analysis
Using the straightforward analysis discussed in , the size of the informal economy has been calculated for individual countries in many works . International organizations such as the IMF or the World Bank regularly update their own estimates of this parameter. Referring to them, we estimate that the informal economy of the country for the period [2006, 2016] lies in a zone centered around 35%-40%, as seen in Figure 1. Recall that this parameter is usually expressed as a fraction of GDP.
Fitting the q-Gaussian distribution, we observe that informality in the world is characterized by two distinct classes, grouped in two separate distributions. The first one has a nearly Gaussian shape centered at 15% and includes the developed economies. The other shows a distorted Gaussian distribution centered at 40%. Examining the q parameter in different periods, we found that the distribution of informal economies in the second group shows a tendency toward stabilization. This is read from the value of the q parameter, which became marginally smaller with time. Therefore we can accept that the economy of the Republic of Macedonia (RM) might have an estimated informal economy around 40% in this period, and probably has a stabilizing trend toward smaller values. As an immediate consequence, estimates of the informal economy of the country that fall outside these reference values must be reviewed in detail or even rejected. Considering other calculations as in  or , we noticed that the informal economy has been reported with different trends during [2004, 2014]. Therefore, we fixed the boundaries from the picture above. In particular, the lower threshold is set at 15%, the value characteristic of developed countries as seen in Figure 1, because we are sure that the economy under analysis does not belong to this group. By empirical application of linear models we obtained values of the country's informal economy outside those mentioned above, and therefore we proposed a more careful analysis based on the dynamics of the variables used.
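To illustrate why the q parameter can be read as a distance from the Gaussian shape, the following minimal sketch evaluates an unnormalized q-Gaussian in the standard Tsallis form. The function names and the parameters beta and mu are assumptions introduced for illustration; they are not taken from the calculation described above, and no real data are used.

```python
import math

def q_exponential(u, q):
    """Tsallis q-exponential e_q(u); reduces to exp(u) when q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(u)
    base = 1.0 + (1.0 - q) * u
    if base <= 0.0:
        return 0.0  # outside the compact support that appears for q < 1
    return base ** (1.0 / (1.0 - q))

def q_gaussian_unnormalized(x, q, beta=1.0, mu=0.0):
    """Unnormalized q-Gaussian e_q(-beta * (x - mu)^2); beta and mu are assumed names."""
    return q_exponential(-beta * (x - mu) ** 2, q)

# Compare the tail weight three units from the center for several q values.
for q in (1.0, 1.2, 1.5):
    print(q, q_gaussian_unnormalized(3.0, q))
```

For q slightly above 1 the curve is nearly Gaussian, while larger q gives markedly heavier tails, so a q value that decreases over time is consistent with the stabilization reading given in the text.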
4. The dynamics of some important variables and factors
In the CDA method, the indicator of the informal economy is assumed to be the ratio of money in circulation to some other money aggregate or component, as explained in ,  etc. We used the ratio C/M, where C is the amount of money in circulation and M is taken to be the base money (M0), narrow money (M1), broad money (M2) etc. According to the remarks above, the regression should be performed on series that do not include extreme behavior or critical points. To achieve that, we checked the log-periodic fit of the data series. In this procedure we varied the start and end dates of the partial series under test, using an ad hoc genetic algorithm as in . First we considered quarterly data, which gave 4 times more points than yearly data, but the number is still too small for such a quantitative analysis. Those series are nevertheless acceptable in CDA or MIMIC modeling, because the values of important model variables such as GDP and GNI are available in this format. So in this case we learned the behavior only qualitatively. Monthly series are better in this respect, even though they cannot be used directly in the CDA model, for example, because some important variables do not appear in monthly records. But we can still analyze them to better understand the dynamics of the variable under study. Certainly some properties are expected to disappear in series with a longer sampling period, but the "unwanted" zones can still be localized this way.
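The log-periodic check described above can be sketched as follows. This is not the ad hoc genetic algorithm used in the study: as a simpler, deterministic stand-in, the sketch searches a grid of (tc, m, ω) values and solves for the remaining linear coefficients by least squares. The function names, the grids and the synthetic series are all assumptions made for illustration.

```python
import numpy as np

def lppl_design(t, tc, m, omega):
    """Design matrix of the log-periodic form for fixed (tc, m, omega); needs tc > max(t)."""
    dt = tc - t
    power = dt ** m
    phase = omega * np.log(dt)
    return np.column_stack([np.ones_like(t), power,
                            power * np.cos(phase), power * np.sin(phase)])

def fit_lppl_grid(t, y, tc_grid, m_grid, omega_grid):
    """Grid search over the nonlinear parameters; linear ones come from least squares."""
    best = None
    for tc in tc_grid:
        if tc <= t.max():
            continue  # the critical time must lie after the last observation
        for m in m_grid:
            for omega in omega_grid:
                X = lppl_design(t, tc, m, omega)
                coef, *_ = np.linalg.lstsq(X, y, rcond=None)
                sse = float(np.sum((y - X @ coef) ** 2))
                if best is None or sse < best["sse"]:
                    best = {"tc": float(tc), "m": float(m),
                            "omega": float(omega), "coef": coef, "sse": sse}
    return best

# Synthetic monthly series with a known critical time (illustration only, not real data).
t = np.arange(1.0, 171.0)
y = lppl_design(t, 204.0, 0.5, 8.0) @ np.array([1.0, -0.02, 0.004, 0.002])
fit = fit_lppl_grid(t, y, tc_grid=np.arange(175.0, 241.0),
                    m_grid=[0.3, 0.5, 0.7], omega_grid=[6.0, 8.0, 10.0])
print(fit["tc"], fit["m"], fit["omega"], fit["sse"])
```

To mimic the varying start and end dates, the same fit can be repeated on slices such as `t[i:j], y[i:j]`, comparing the residuals and the stability of the estimated tc across windows. On short or noisy series the minimum is often shallow, which is one reason false alarms are possible.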
By fitting a log-periodic function to the quarterly data of the C/D variable in [2002, 2016], we found that the system has possibly entered a near-DSI regime which is expected to change near January 2023 (c.p). This result is not reliable quantitatively, because it lies outside the period of the study, but it can be taken as an argument that the critical zone is not adjacent to the end of the period considered. Using the monthly series of the ratio C/D, we found a good log-periodic fit, so a regime is expected to be present in its trend. It starts near coordinate 33, which corresponds to September 2004, and the critical time was expected to be around 171 months later, that is around 2017 (c.p). So the end of our series (2016) seems to fall in a particular region where extreme behavior is characteristic. This "problematic" edge is better excluded, and therefore we use the period [2005, 2014] in our calculation of the informal economy. In this case the values of informality obtained using different models lie in the range 28%-43% and do not differ remarkably. We used this approach to detect which variable was the most suitable as the first indicator of the informal economy according to the CDA models. But theoretically, other C/M ratios are more important for such models.
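The truncation step can be sketched as a small helper that keeps only observations lying a safe distance before an estimated critical time. The `margin` parameter is hypothetical: the text does not state a rule for how far from the critical zone the series should stop, so the value below is chosen only to reproduce the window [2005, 2014], and the values are placeholders rather than real data.

```python
def truncate_before_critical(times, values, t_start, t_critical, margin):
    """Keep observations with t_start <= t <= t_critical - margin."""
    cutoff = t_critical - margin
    kept = [(t, v) for t, v in zip(times, values) if t_start <= t <= cutoff]
    return [t for t, _ in kept], [v for _, v in kept]

# Yearly placeholders for 2002..2016; the critical zone is assumed to lie around 2017.
years = list(range(2002, 2017))
ratio = [0.0] * len(years)  # placeholder values, not real data
kept_years, kept_ratio = truncate_before_critical(years, ratio, 2005, 2017, margin=3)
print(kept_years[0], kept_years[-1])
```

The regressions would then be run on the truncated series only, and the sensitivity of the result to the chosen margin can be checked by repeating the fit for neighboring values.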
Considering the ratio C/M2 as another candidate for the response variable in the CDA model, we observe that a shorter regime is found in its underlying behavior. It probably started at coordinate 112, that is April 2012, and finished at coordinate 170, that is January 2017. The results are presented in Table 1. Under such conditions, the whole period discussed, [2004, 2016], contains a regime change point for the variable C/M2, and moreover it corresponds to a bubble-like behavior around the end of 2017 (Figure 3). The data points around 2017 impose high deviations when used in our linear models such as CDA or MIMIC. Next, taking into account the important weight of remittances in the country, we proposed to analyze the ratio of C to the broad money M22. In this case we found that more than one critical-like process might underlie the dynamics of the variable C/M22, as seen in Figure 4. We expect that the presence of more than one critical point reduces the deviation effect of each of them. With acceptable statistical significance, we identified a medium-term regime that starts at coordinate 82 and ends at 159.
There are also some short-range processes lasting around 2 years, as shown in Figure 4. Hence the variable C/M22, spanning the whole target interval [2002, 2016], is found admissible for use in the models.
In this case we observe that the CDA and MIMIC approaches give similar results for the whole period considered, say [2004, 2016]. At this point we underline two important facts. First, the findings here give some grounds for caution in the linear analysis, but they are not considered sufficient for trustworthy modeling. Second, the dynamics observed in monthly data are not expected to carry over to yearly data, but the presence of special points strongly suggests that the linear relationship could break down near them. However, verifying the presence of log-periodicity requires more data and more sophisticated analysis, so false alarms are possible. Being aware of this, we use those results as a precautionary measure in the linear regressions, as mentioned above.
As long as the presence of regimes is deduced but their nature is not verified, the best strategy is to identify the real trend and regimes, no matter what type they really are. Where a DSI regime is likely to be present, we identify a critical point that behaves differently from the others. But other processes could be present, and the resulting regime is complex, hence unknown. For this case we suggest analyzing the empirical regime using empirical mode decomposition.
5. Empirical regime identification
We used empirical mode decomposition (the EEMD algorithm)  to obtain the underlying trend for the whole interval studied, setting aside the characteristic local or self-organization dynamics discussed above. With this technique, a complicated nonlinear and non-stationary signal is decomposed into so-called intrinsic mode functions (IMFs), which are not rigorously orthogonal. Interestingly, the last mode gives the trend that underlies the data. Important comments on such applications are provided in , and detailed calculations in . By construction, the last IMF represents the long-range trend of the series. In Figure 5, the last IMF of the C/M1 analysis shows a possible regime that is expected to end around coordinate 240 and that possibly started near the start point of our series (January 2002). Considering the above discussion, according to which the regime contains log-periodic behavior as well, we suppose that the empirical regime found here coincides with the self-organization one. Therefore, if we choose our series in the interval [2004, 2016], the edge effects of the regimes, of whatever type, are removed. For a better analysis we should consider the other IMFs; not surprisingly, 2-10 year cycles have been observed, but those findings have not been used in our work. In the same way, we found that the variable C/M22 has an underlying regime (last IMF cycle) that started many months before the first point of our series and will finish around 2020 (Figure 6). So the period [2004, 2016] contains no edges of a medium-term regime and hence, according to the above assumption, it is appropriate for use in linear models.
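The idea of reading the trend from the slowest component can be sketched with a deliberately simplified decomposition. This is plain EMD, not the noise-assisted EEMD used in the study, and it replaces the usual cubic-spline envelopes with piecewise-linear ones pinned to the end points, so it is only a rough illustration. All function names, stopping parameters and the synthetic series are assumptions.

```python
import numpy as np

def _local_extrema(x):
    """Indices of strict interior local maxima and minima."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima

def _envelope(idx, x):
    """Piecewise-linear envelope through the extrema, pinned to both end points."""
    n = len(x)
    knots = np.concatenate(([0], idx, [n - 1]))
    return np.interp(np.arange(n), knots, x[knots])

def sift(x, max_sift=50, tol=1e-3):
    """Extract one mode by repeatedly subtracting the mean of the two envelopes."""
    h = x.copy()
    for _ in range(max_sift):
        maxima, minima = _local_extrema(h)
        if len(maxima) == 0 or len(minima) == 0:
            break
        h_new = h - 0.5 * (_envelope(maxima, h) + _envelope(minima, h))
        converged = np.sum((h - h_new) ** 2) <= tol * np.sum(h ** 2)
        h = h_new
        if converged:
            break
    return h

def emd(x, max_imfs=8):
    """Return (modes, residual); the residual plays the role of the long-range trend."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = _local_extrema(residual)
        if len(maxima) == 0 or len(minima) == 0:
            break  # nothing left to sift: what remains is the trend
        imf = sift(residual)
        imfs.append(imf)
        residual = residual - imf
    return imfs, residual

# Synthetic monthly series: slow drift plus two cycles (illustration only, not real data).
t = np.arange(180)
x = 0.3 + 0.002 * t + 0.05 * np.sin(2 * np.pi * t / 12) + 0.02 * np.sin(2 * np.pi * t / 40)
imfs, trend = emd(x)
print(len(imfs), trend[0], trend[-1])
```

By construction the extracted modes and the residual add back to the original series, so the residual (or the slowest mode) can be inspected for the start and end of a medium-term regime before choosing the interval used in the linear models.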
We performed the corrected calculation and found that the informal economy in the country reached a maximum of 38%-41% around the years 2010-2011 and had decreased to 32%-34% by 2016. It is now slowly decreasing.
In this work we propose a way to improve linear modeling of econometric variables when the data series cover a short period or when complex dynamics are present in them. In modeling the informal economy (for the Republic of Macedonia) as a hidden variable, we obtained better results when using series that do not include critical points, edges of regimes etc. It worked in our concrete case, and it is expected to be fruitful in other similar circumstances.
- The Shadow Economy. Schneider Friedrich, Enste Dominik H. 2013.
- Estimating the size of the shadow economies of 162 countries using the MIMIC method. Schneider Friedrich.
- Estimating panel data duration models with censored data. Lee Sokbae. 2003-sep.
- Estimation of a Model with Multiple Indicators and Multiple Causes of a Single Latent Variable. Joreskog Karl G, Goldberger Arthur S. Journal of the American Statistical Association. 1975-sep.
- Clarifications to Questions and Criticisms on the Johansen-Ledoit-Sornette Bubble Model. Sornette Didier, Woodard Ryan, Yan Wanfeng, Zhou Wei-Xing. SSRN Electronic Journal. 2011.
- Bubble diagnosis and prediction of the 2005–2007 and 2008–2009 Chinese stock market bubbles. Jiang Zhi-Qiang, Zhou Wei-Xing, Sornette Didier, Woodard Ryan, Bastiaensen Ken, Cauwels Peter. Journal of Economic Behavior and Organization. 2010-jun: 149-162.
- Structural Model Evaluation and Modification: An Interval Estimation Approach. Steiger James H. Multivariate Behavioral Research. 1990-apr: 173-180.
- Prediction of Consumer Behavior Regarding Purchasing Remanufactured Products: A Logistics Regression Model. Yilmaz Kadri G, Belbag Sedat. International Journal of Business and Social Research. 2016-feb.
- Migration and Development: Managing Mutual Effects. Sriskandarajah Dhananjayan.
- Computational applications of nonextensive statistical mechanics. Tsallis Constantino. Journal of Computational and Applied Mathematics. 2009-may: 51-58.
- Tsallis statistics and magnetospheric self-organization. Pavlos GP, Karakatsanis LP, Xenakis MN, Sarafopoulos D, Pavlos EG. Physica A: Statistical Mechanics and its Applications. 2012-jun: 3069-3080.
- On a q-Central Limit Theorem Consistent with Nonextensive Statistical Mechanics. Umarov Sabir, Tsallis Constantino, Steinberg Stanly. Milan Journal of Mathematics. 2008-mar: 307-328.
- A non-Gaussian option pricing model with skew. Borland Lisa, Bouchaud Jean-Philippe. Quantitative Finance. 2004-oct: 499-514.
- "Good" and "Bad" Investments: Everything You Always Wanted to Know about Ukrainian Commanders but Were Afraid to Ask. Komin Michael, Vileykis Alexander. Connections: The Quarterly Journal. 2016: 57-71.
- The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, Yen N.-C., Tung CC, Liu HH. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences. 1998-mar: 903-995.
- Ensemble Empirical Mode Decomposition: A Noise-Assisted Data Analysis Method. Wu Zhaohua, Huang Norden E. Advances in Adaptive Data Analysis. 2009-jan: 1-41.
- On the Influence of Sampling on the Empirical Mode Decomposition. Rilling G, Flandrin P. 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
- Complexity methods used in the study of some real systems with weak characteristic properties. Prenga Dode, Ifti Margarita. 2016.