Title: | A Non-Parametric Statistical Significance Test for Rolling Window Correlation |
---|---|
Description: | Estimates and plots (as a single plot and as a heat map) the rolling window correlation coefficients between two time series and computes their statistical significance through a non-parametric, computing-intensive method. This method addresses the effects of multiple testing (inflation of the Type I error) when the statistical significance is estimated for rolling window correlation coefficients. The method is based on Monte Carlo simulations that permute one of the variables under analysis (e.g., the dependent) while keeping the other (e.g., the independent) fixed. Parallel computing is used to reduce the computation time. The 'NonParRolCor' package also provides examples with synthetic and real-life environmental time series to exemplify its use. Methods derived from R. Telford (2013) <https://quantpalaeo.wordpress.com/2013/01/04/> and J.M. Polanco-Martinez and J.L. Lopez-Martinez (2021) <doi:10.1016/j.ecoinf.2021.101379>. |
Authors: | Josue M. Polanco-Martinez [aut, cph, cre] |
Maintainer: | Josue M. Polanco-Martinez <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.8.0 |
Built: | 2025-02-13 04:42:34 UTC |
Source: | https://github.com/cran/NonParRolCor |
'NonParRolCor' estimates and plots, as a single plot and as a heat map, the rolling window correlation coefficients between two regular time series (sampled on identical time points) together with their statistical significance. The statistical significance is computed through a non-parametric, computing-intensive method (Telford 2013, Polanco-Martínez and López-Martínez 2021). This method (test) addresses the effects of the multiple testing problem (inflation of the Type I error) when the statistical significance is estimated for rolling correlation coefficients. The method is based on Monte Carlo simulations that permute one of the variables under analysis (e.g., the dependent) while keeping the other (e.g., the independent) fixed. Parallel computing is used to reduce the computation time. The package has been designed especially for environmental (climate and ecological) data, although it can be applied to other kinds of data sets as well. 'NonParRolCor' contains four functions: (1) 'rolcor_estim_1win' and (2) 'rolcor_estim_heatmap' estimate the rolling window correlation coefficients and their statistical significance for only one window-length and for all possible window-lengths, respectively; (3) 'plot_rolcor_estim_1win' and (4) 'plot_rolcor_estim_heatmap' plot the time series under analysis and the statistically significant correlation coefficients, for only one window-length as a simple plot and for all possible window-lengths as a heat map, respectively. The functions in 'NonParRolCor' are highly flexible, since they take several parameters that control the estimation of the correlation and the features of the plots, e.g., to remove a potential linear trend from the time series under analysis or to personalise the plot of the time series.
The 'NonParRolCor' package also provides examples with synthetic ('syntheticdata' data set) and real-life environmental ('ecodata' data sets) time series to exemplify its use.
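The permutation idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper names roll_cor and perm_critval are hypothetical, and taking the maximum absolute coefficient of each permuted series (one common way to account for the many windows being tested) is an assumption; the actual procedure is described in Polanco-Martínez and López-Martínez (2021).

```r
# Rolling (centred) window correlation between x and y; width must be odd.
# NOTE: roll_cor and perm_critval are hypothetical helpers, not package functions.
roll_cor <- function(x, y, width, method = "pearson") {
  n <- length(x)
  half <- (width - 1) / 2
  out <- rep(NA_real_, n)              # windows that do not fit stay NA
  for (i in (half + 1):(n - half)) {
    idx <- (i - half):(i + half)
    out[i] <- cor(x[idx], y[idx], method = method)
  }
  out
}

# Critical value from the permutation distribution: permute y, keep x fixed,
# and record the largest absolute rolling correlation of each permutation
# (assumed max-statistic approach).
perm_critval <- function(x, y, width, MCSim = 200, prob = 0.95) {
  max_abs <- numeric(MCSim)
  for (m in seq_len(MCSim)) {
    y_perm <- sample(y)
    max_abs[m] <- max(abs(roll_cor(x, y_perm, width)), na.rm = TRUE)
  }
  unname(quantile(max_abs, probs = prob))
}

set.seed(1)
x <- rnorm(120)
y <- 0.8 * x + rnorm(120, sd = 0.5)
rc <- roll_cor(x, y, width = 21)
crit <- perm_critval(x, y, width = 21, MCSim = 200)
significant <- !is.na(rc) & abs(rc) >= crit   # coefficients beyond the critical value
```

Because each permutation contributes only its most extreme coefficient, the resulting critical value is stricter than one obtained window by window, which is the kind of Type I error control the text refers to.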
Package: | NonParRolCor |
Type: | Package |
Version: | 0.8 |
Date: | 2020-10-30 |
License: | GPL (>= 2) |
LazyLoad: | yes |
The NonParRolCor package contains four functions: (1) rolcor_estim_1win and (2) rolcor_estim_heatmap, which estimate the rolling window correlation coefficients and their statistical significance for only one window-length and for all possible window-lengths, respectively; and (3) plot_rolcor_estim_1win and (4) plot_rolcor_estim_heatmap, which plot the time series under scrutiny and create a simple plot and a heat map, respectively, of the rolling window correlation coefficients that are statistically significant. NonParRolCor also contains three data sets, (1) syntheticdata, (2) ecodata and (3) ecodata2, to exemplify the use of these functions. The significance test is based on and inspired by Telford (2013) and Polanco-Martínez (2019), whereas the simple plots and heat maps are based on Polanco-Martínez (2020). The non-parametric statistical significance test is described in detail in Polanco-Martínez and López-Martínez (2021).
Dependencies: stats, gtools, pracma, colorspace, scales, foreach, parallel, doParallel.
Josué M. Polanco-Martínez (a.k.a. jomopo).
Excellence Unit GECOS, IME, Universidad de Salamanca, Salamanca, SPAIN.
BC3 - Basque Centre for Climate Change, Leioa, SPAIN.
Web1: https://scholar.google.es/citations?user=8djLIhcAAAAJ&hl=en/.
Web2: https://www.researchgate.net/profile/Josue-Polanco-Martinez/.
Email: [email protected]
José L. López-Martínez.
Faculty of Mathematics, Universidad Autónoma de Yucatán (UADY), Tizimín, MEXICO.
Web1: https://scholar.google.es/citations?user=552PKVEAAAAJ&hl=es/.
Web2: https://www.researchgate.net/profile/Jose-Lopez-Martinez-3/.
Email: [email protected].
Acknowledgement:
The first author acknowledges the SEPE (Spanish Public Service of Employment) and the Excellence Unit GECOS (reference number CLU-2019-03), Universidad de Salamanca, for their funding support. Special thanks to The Donegal Irish Pub (Portugalete) for providing space for research and coding.
Polanco-Martínez, J. M. and López-Martínez, J. L. (2021). A non-parametric method to test the statistical significance in rolling window correlations, and applications to ecological time series. Ecological Informatics, 60, 101379. <URL: doi:10.1016/j.ecoinf.2021.101379>.
Polanco-Martínez, J. M. (2020). RolWinMulCor: an R package for estimating rolling window multiple correlation in ecological time series. Ecological Informatics, 60, 101163. <URL: doi:10.1016/j.ecoinf.2020.101163>.
Polanco-Martínez, J. M. (2019). Dynamic relationship analysis between NAFTA stock markets using nonlinear, nonparametric, non-stationary methods. Nonlinear Dynamics, 97(1), 369-389. <URL: doi:10.1007/s11071-019-04974-y>.
Telford, R.: Running correlations – running into problems. (2013). <URL: https://quantpalaeo.wordpress.com/2013/01/04/>.
The data set ecodata contains four columns: the first (named “Years”) is the time (years from 1989 to 2008, monthly resolution); the second (named “SST”) contains monthly anomalies of sea surface temperature (SST) south of Gran Canaria (28.5 N/16.5 W) (NOAA 2021a); the third (named “NAO”) contains the monthly index of the North Atlantic Oscillation (NAO) (NOAA 2021b); and the last (named “CPUE”) contains monthly catches of common octopus (measured as CPUE, or Catch Per Unit of Effort) from an artisanal fishery in the southwest of Gran Canaria (Caballero-Alfonso et al. 2010, Polanco et al. 2011, Polanco-Martínez 2012).
data(ecodata)
One file in ASCII format containing four columns and 240 rows, columns are separated by spaces.
Caballero-Alfonso, A., Ganzedo, U., Trujillo-Santana, A., Polanco, J., del Pino, A. S., Ibarra-Berastegi, G., Castro-Hernández, J. (2010). The role of climatic variability on the short-term fluctuations of octopus captures at the Canary Islands. Fisheries Research, 102(3), 258-265. <URL: doi:10.1016/j.fishres.2009.12.006>.
NOAA Optimum Interpolation (OI) Sea Surface Temperature (SST) V2, <URL: https://psl.noaa.gov/data/gridded/data.noaa.oisst.v2.html>, accessed: 2021-02-28.
NAO index, <URL: https://psl.noaa.gov/data/correlation/nao.data>, accessed: 2021-02-28.
Polanco, J., Ganzedo, U., Sáenz, J., Caballero-Alfonso, A. M., & Castro-Hernández, J. J. (2011). Wavelet analysis of correlation among Canary Islands octopus captures per unit effort, sea-surface temperatures and the North Atlantic Oscillation. Fisheries Research, 107(1-3), 177-183. <URL: doi:10.1016/j.fishres.2010.10.019>.
Polanco-Martínez, J.M. (2012). Aplicación de técnicas estadísticas en el estudio de fenómenos ambientales y ecosistémicos, Ph.D. thesis, University of Basque Country, Spain. <URL: https://addi.ehu.es/handle/10810/11295/>.
The data set ecodata2 contains three columns: the first (named “Years”) is the time (years from 1700 to 1936, yearly resolution); the second (named “TSI”) contains reconstructions of total solar irradiance (Lean 2000); and the third contains the first principal component (PC1) of the reconstructed Atlantic Bluefin Tuna (BFT) captures (Ganzedo et al. 2016, Polanco-Martínez et al. 2018).
data(ecodata2)
One file in ASCII format containing three columns and 237 rows, columns are separated by spaces.
Lean, J. (2000). Evolution of the Sun's spectral irradiance since the Maunder Minimum. Geophysical Research Letters, 27(16), 2425-2428. <URL: doi:10.1029/2000GL000043>. Lean Web TSI data set: <URL: https://www.ncei.noaa.gov/access/paleo-search/study/5788/>.
Ganzedo, U., Polanco-Martínez, J. M., Caballero-Alfonso, A. M., Faria, S. H., Li, J., Castro-Hernández, J. J. (2016). Climate effects on historic bluefin tuna captures in the Gibraltar Strait and Western Mediterranean. Journal of Marine Systems, 158, 84-92. <URL: doi:10.1016/j.jmarsys.2016.02.002>.
Polanco-Martínez, J. M., Caballero-Alfonso, A. M., Ganzedo, U., Castro-Hernández, J. J. (2018). A reconstructed database of historic bluefin tuna captures in the Gibraltar Strait and Western Mediterranean. Data in Brief, 16, 206-210. <URL: doi:10.1016/j.dib.2017.11.028>.
The plot_rolcor_estim_1win function plots the time series under study and creates a simple plot of the statistically significant rolling window correlation coefficients obtained by the rolcor_estim_1win function.
plot_rolcor_estim_1win(inputdata, corcoefs, CRITVAL, widthwin, left_win, righ_win, varX="X", varY="Y", coltsX="black", coltsY="blue", rmltrd=TRUE, Scale=TRUE, HeigWin1=2.05, HeigWin2=2.75, colCOEF="black", CEXLAB=1.15, CEXAXIS=1.05, LWDtsX=1, LWDtsY=1, LWDcoef=1, colCRITVAL="black", pchCRIVAL=16)
inputdata |
The same data matrix (time, first and second variable) that was used with the |
corcoefs |
Rolling correlation coefficients estimated with the |
CRITVAL |
The critical values computed through the function |
widthwin |
|
left_win , righ_win
|
These parameters are used to accommodate (to the left and right) the times of the rolling window correlation coefficients and these are provided by the |
varX , varY
|
Names of the first (e.g., X) and the second (e.g., Y) variables contained in |
coltsX , coltsY
|
Colors used to plot the variables; the defaults are “black” for the first variable and “blue” for the second, but other colors can be used. |
rmltrd |
Remove the linear trend from the variables under analysis (“TRUE” by default; “FALSE” otherwise). It is advisable to remove the trend before estimating the rolling window correlation coefficients, especially for large window-lengths. |
Scale |
Scale (“TRUE” by default; “FALSE” otherwise) is used to “normalize” or “standardize” the variables under analysis. It is highly advisable to “normalize/standardize” the time series under study so that they are on the same scale. |
HeigWin1 , HeigWin2
|
Proportion of window's size to plot the time series under analysis ( |
colCOEF |
The color used to plot the correlation coefficients; the default is “black”, but other colors can be used. |
CEXLAB , CEXAXIS
|
These parameters set the sizes of the X- and Y-axis labels and of the X- and Y-axis annotations; the defaults are 1.15 and 1.05, respectively, but other values can be used. |
LWDtsX , LWDtsY
|
Line-widths used to plot the first and the second variable; the default is 1 for both, but other widths can be used. |
LWDcoef |
The line-width used to plot the correlation coefficients; the default is 1, but other values can be used. |
colCRITVAL |
|
pchCRIVAL |
|
The plot_rolcor_estim_1win function plots the variables (time series) under analysis and, for the selected window-length, the rolling correlation coefficients that are statistically significant, which are estimated through a non-parametric, computing-intensive method. The plot_rolcor_estim_1win function uses the outputs of rolcor_estim_1win. To implement this method we extend the works of Telford (2013), Polanco-Martínez (2019) and Polanco-Martínez (2020), and to implement the simple plot we follow Polanco-Martínez (2020). The test/method to determine the statistical significance is described in Polanco-Martínez and López-Martínez (2021).
Outputs: A plot of the time series under analysis, and for the selected window-length, the rolling window correlation coefficients that are statistically significant. This multi-plot can be saved in your preferred format.
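Saving a two-panel figure of this kind can be sketched with base R graphics devices. The series and coefficients below are hypothetical stand-ins (not package output), and using 2.05 and 2.75 as relative panel heights is only an assumption about how HeigWin1 and HeigWin2 might be interpreted.

```r
# Hypothetical stand-ins for the two series and the rolling coefficients.
set.seed(2)
tt <- 1:100
x <- cumsum(rnorm(100))
y <- cumsum(rnorm(100))
coefs <- sin(tt / 15)

out_file <- tempfile(fileext = ".pdf")   # any device (png(), svg(), ...) works the same way
pdf(out_file, width = 7, height = 6)
# Two stacked panels; the relative heights are an assumption (see lead-in).
layout(matrix(1:2, nrow = 2), heights = c(2.05, 2.75))
plot(tt, x, type = "l", col = "black", xlab = "Time", ylab = "X, Y",
     ylim = range(c(x, y)))
lines(tt, y, col = "blue")
plot(tt, coefs, type = "l", xlab = "Time", ylab = "Correlation",
     ylim = c(-1, 1))
dev.off()                                # closes the device and writes the file
```

The same pattern should apply to the package's plotting functions: open a device, call the plotting function, then close the device with dev.off().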
Josué M. Polanco-Martínez (a.k.a. jomopo).
Excellence Unit GECOS, IME, Universidad de Salamanca, Salamanca, SPAIN.
BC3 - Basque Centre for Climate Change, Leioa, SPAIN.
Web1: https://scholar.google.es/citations?user=8djLIhcAAAAJ&hl=en/.
Web2: https://www.researchgate.net/profile/Josue-Polanco-Martinez/.
Email: [email protected].
Polanco-Martínez, J. M. and López-Martínez, J. L. (2021). A non-parametric method to test the statistical significance in rolling window correlations, and applications to ecological time series. Ecological Informatics, 60, 101379. <URL: doi:10.1016/j.ecoinf.2021.101379>.
Polanco-Martínez, J. M. (2020). RolWinMulCor: an R package for estimating rolling window multiple correlation in ecological time series. Ecological Informatics, 60, 101163. <URL: doi:10.1016/j.ecoinf.2020.101163>.
# Code to test the function "plot_rolcor_estim_1win"
library(NonParRolCor)
# Defining NonParRolCor parameters
# MCSim=2 only keeps this example fast; please use at least 1000.
MCSim <- 2
Np <- 2 # Number of cores
X_Y <- rolcor_estim_1win(as.matrix(syntheticdata[1:350,]), CorMethod="pearson",
                         widthwin=21, Align="center", rmltrd=TRUE, Scale=TRUE,
                         MCSim=MCSim, Np=Np, prob=0.95)
plot_rolcor_estim_1win(syntheticdata[1:350,],
                       corcoefs=X_Y$Correlation_coefficients,
                       CRITVAL=X_Y$CRITVAL, widthwin=X_Y$widthwin,
                       left_win=X_Y$left_win, righ_win=X_Y$righ_win)
The plot_rolcor_estim_heatmap function plots the time series under study and creates a heat map of the statistically significant rolling window correlation coefficients obtained by the rolcor_estim_heatmap function.
plot_rolcor_estim_heatmap(inputdata, corcoefs, CRITVAL, Rwidthwin="", typewidthwin="", widthwin_1=3, widthwin_N=dim(inputdata)[1], varX="X", varY="Y", coltsX="black", coltsY="blue", LWDtsX=1, LWDtsY=1, CEXLAB=1.15, CEXAXIS=1.05)
inputdata |
The same data matrix (time, first and second variable) that was used with the |
corcoefs |
Rolling correlation coefficients estimated with the |
CRITVAL |
The critical values computed through the function |
Rwidthwin |
|
typewidthwin |
Contains the type (“FULL” or “PARTIAL”) of heat map that will be plotted, this information is provided by |
widthwin_1 |
First value for the size (length) of the windows when the option |
widthwin_N |
Last value for the size (length) of the windows when the option |
varX , varY
|
Names of the first (e.g., X) and the second (e.g., Y) variables contained in |
coltsX , coltsY
|
Colors used to plot the variables; the defaults are “black” for the first variable and “blue” for the second, but other colors can be used. |
LWDtsX , LWDtsY
|
Line-widths used to plot the first and the second variable; the default is 1 for both, but other widths can be used. |
CEXLAB , CEXAXIS
|
These parameters set the sizes of the X- and Y-axis labels and of the X- and Y-axis annotations; the defaults are 1.15 and 1.05, respectively, but other values can be used. |
The plot_rolcor_estim_heatmap function plots the variables (time series) under analysis and a heat map of the rolling correlation coefficients that are statistically significant. This function supersedes the function heatmap_NonParRolCor of the previous version of NonParRolCor. The plot_rolcor_estim_heatmap function uses the outputs of the rolcor_estim_heatmap function. To implement this method we extend the works of Telford (2013), Polanco-Martínez (2019) and Polanco-Martínez (2020), and to implement the heat map we follow Polanco-Martínez (2020). The test/method to determine the statistical significance is described in Polanco-Martínez and López-Martínez (2021). plot_rolcor_estim_heatmap uses the functions diverge_hcl (package:colorspace) and alpha (package:scales) to create the palette of colors.
Outputs: A plot of the time series under analysis and a heat map (a multi-plot via screen) of the rolling correlation coefficients that are statistically significant. This multi-plot can be saved in your preferred format.
Josué M. Polanco-Martínez (a.k.a. jomopo).
Excellence Unit GECOS, IME, Universidad de Salamanca, Salamanca, SPAIN.
BC3 - Basque Centre for Climate Change, Leioa, SPAIN.
Web1: https://scholar.google.es/citations?user=8djLIhcAAAAJ&hl=en/.
Web2: https://www.researchgate.net/profile/Josue-Polanco-Martinez/.
Email: [email protected].
Polanco-Martínez, J. M. and López-Martínez, J. L. (2021). A non-parametric method to test the statistical significance in rolling window correlations, and applications to ecological time series. Ecological Informatics, 60, 101379. <URL: doi:10.1016/j.ecoinf.2021.101379>.
Polanco-Martínez, J. M. (2020). RolWinMulCor: an R package for estimating rolling window multiple correlation in ecological time series. Ecological Informatics, 60, 101163. <URL: doi:10.1016/j.ecoinf.2020.101163>.
# Code to test the function "plot_rolcor_estim_heatmap"
library(NonParRolCor)
# Defining NonParRolCor parameters
TYPEWIDTHWIN <- "PARTIAL"
# Number of Monte-Carlo simulations (MCSim), please use at least 1000.
# WARNING: MCSim=2, it's just to test this example!
MCSim <- 2
Np <- 2 # Number of cores
X_Y <- rolcor_estim_heatmap(syntheticdata[1:350,], CorMethod="pearson",
                            typewidthwin=TYPEWIDTHWIN, widthwin_1=29,
                            widthwin_N=51, Align="center", rmltrd=TRUE,
                            Scale=TRUE, MCSim=MCSim, Np=Np)
plot_rolcor_estim_heatmap(syntheticdata[1:350,], X_Y$matcor, X_Y$CRITVAL,
                          Rwidthwin=X_Y$Windows, typewidthwin=TYPEWIDTHWIN,
                          widthwin_1=29, widthwin_N=51)
The rolcor_estim_1win function estimates the rolling window correlation coefficients for only one window-length (time-scale) between two time series sampled on identical time points, and their statistical significance via a non-parametric, computing-intensive method. To carry out the computational implementation we extend the works of Telford (2013), Polanco-Martínez (2019) and Polanco-Martínez (2020). The test/method to determine the statistical significance is described in Polanco-Martínez and López-Martínez (2021). The rolcor_estim_1win function is highly flexible, since it takes several parameters that control the estimation of the correlation. These parameters are described below.
rolcor_estim_1win(inputdata, CorMethod="pearson", widthwin=3, Align="center", rmltrd=TRUE, Scale=TRUE, MCSim=1000, Np=2, prob=0.95)
inputdata |
A matrix of 3 columns: time (regular/evenly spaced), the first (e.g., the independent) variable, and the second (e.g., the dependent) variable. Please verify if |
CorMethod |
The method used to estimate the correlations; the default is “pearson”, but other options (“spearman” and “kendall”) are available (please look at: R>?cor.test). |
widthwin |
The window size or length that indicates the window's size to compute the rolling window correlations. |
Align |
To align the rolling object, NonParRolCor uses three options: “left”, “center”, and “right” (please look at: R>?running). However, there are some restrictions, which have been described above. We recommend using the “center” option to ensure that variations in the correlations are aligned with the variations in the relationships of the variables under study, rather than being shifted to the left or to the right (Polanco-Martínez 2019, 2020), but this implies that the window-lengths must be odd. |
rmltrd |
Remove the linear trend from the variables under analysis (“TRUE” by default; “FALSE” otherwise). It is advisable to remove the trend before estimating the rolling window correlation coefficients, especially for large window-lengths. |
Scale |
Scale (“TRUE” by default; “FALSE” otherwise) is used to “normalize” or “standardize” the variables under analysis. It is highly advisable to “normalize/standardize” the time series under study so that they are on the same scale. |
MCSim |
Number of Monte-Carlo simulations to permute the second variable. It is advisable to use at least 1000 simulations. |
Np |
Number of CPU cores, by default is 2. Please verify the number of cores of your computer. WARNING: it is not advisable to use the maximum number of cores of your computer. |
prob |
Numeric vector of probabilities with values in the interval [0,1]; by default prob=0.95 (p=0.05). Please look at R>?quantile, Telford (2013) or Polanco-Martínez and López-Martínez (2021) for more information. |
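What the rmltrd and Scale options amount to can be sketched in base R. The helpers detrend_linear and standardize are hypothetical names introduced for illustration; the package may implement these steps differently (for instance through its pracma dependency).

```r
# Remove a linear trend: keep the residuals of a least-squares fit on time.
# NOTE: hypothetical helper, not a function of the package.
detrend_linear <- function(x, tt = seq_along(x)) {
  as.numeric(residuals(lm(x ~ tt)))
}

# Standardize to zero mean and unit standard deviation.
standardize <- function(x) as.numeric(scale(x))

set.seed(3)
tt <- 1:200
x <- 0.05 * tt + rnorm(200)                 # noise around a linear trend
x_clean <- standardize(detrend_linear(x, tt))
```

Detrending first matters because a shared trend alone can produce high correlations in long windows, which is why the text advises it especially for large window-lengths.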
The rolcor_estim_1win function estimates the rolling window correlation coefficients for only one window-length, and their statistical significance, between two time series sampled on identical time points. The function rolcor_estim_1win uses the functions cor.test (package:stats) and running (package:gtools) to estimate the correlation coefficients and to compute the rolling window correlations, and the functions foreach (package:foreach) and makeCluster (package:parallel) to parallelize the estimation of the statistical significance.
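How Monte Carlo permutations can be spread over Np cores is sketched below with the base parallel package alone. This is an assumption-laden illustration: it uses parLapply rather than the foreach interface named above, and it permutes a single full-length correlation instead of rolling windows to keep the example short.

```r
library(parallel)

set.seed(4)
x <- rnorm(100)
y <- 0.5 * x + rnorm(100)

MCSim <- 50
Np <- 2                                  # keep below the number of available cores
cl <- makeCluster(Np)
clusterSetRNGStream(cl, iseed = 42)      # reproducible random streams on the workers
# Each task permutes y (x stays fixed) and returns one correlation.
perm_cors <- parLapply(cl, seq_len(MCSim),
                       function(m, x, y) cor(x, sample(y)),
                       x = x, y = y)
stopCluster(cl)                          # always release the workers
perm_cors <- unlist(perm_cors)
critval <- unname(quantile(abs(perm_cors), probs = 0.95))
```

detectCores() from the same package reports how many cores are available; leaving at least one free is in line with the warning about Np.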
Outputs:
A list containing six elements: Correlation_coefficients and CRITVAL contain the rolling window correlation coefficients and the critical values used to determine their statistical significance; CorMethod is the method used to estimate the correlation coefficients (e.g., Pearson, Spearman or Kendall); widthwin contains the window-length (time-scale); and left_win and righ_win are used to accommodate the times of the rolling window correlation coefficients.
Josué M. Polanco-Martínez (a.k.a. jomopo).
Excellence Unit GECOS, IME, Universidad de Salamanca, Salamanca, SPAIN.
BC3 - Basque Centre for Climate Change, Leioa, SPAIN.
Web1: https://scholar.google.es/citations?user=8djLIhcAAAAJ&hl=en/.
Web2: https://www.researchgate.net/profile/Josue-Polanco-Martinez/.
Email: [email protected].
José L. López-Martínez.
Faculty of Mathematics, Universidad Autónoma de Yucatán (UADY), Tizimín, MEXICO.
Web1: https://scholar.google.es/citations?user=552PKVEAAAAJ&hl=es/.
Web2: https://www.researchgate.net/profile/Jose-Lopez-Martinez-3/.
Email: [email protected].
Polanco-Martínez, J. M. and López-Martínez, J. L. (2021). A non-parametric method to test the statistical significance in rolling window correlations, and applications to ecological time series. Ecological Informatics, 60, 101379. <URL: doi:10.1016/j.ecoinf.2021.101379>.
Polanco-Martínez, J. M. (2020). RolWinMulCor: an R package for estimating rolling window multiple correlation in ecological time series. Ecological Informatics, 60, 101163. <URL: doi:10.1016/j.ecoinf.2020.101163>.
Polanco-Martínez, J. M. (2019). Dynamic relationship analysis between NAFTA stock markets using nonlinear, nonparametric, non-stationary methods. Nonlinear Dynamics, 97(1), 369-389. <URL: doi:10.1007/s11071-019-04974-y>.
Telford, R.: Running correlations – running into problems. (2013). <URL: https://quantpalaeo.wordpress.com/2013/01/04/>.
# Code to test the function "rolcor_estim_1win"
library(NonParRolCor)
# Defining the 'NonParRolCor' parameters
# Number of Monte-Carlo simulations (MCSim), please use at least 1000.
# WARNING: MCSim=2, it's just to test this example!
MCSim <- 2
Np <- 2 # Number of cores
X_Y <- rolcor_estim_1win(syntheticdata[1:350,], CorMethod="pearson",
                         widthwin=3, Align="center", rmltrd=TRUE, Scale=TRUE,
                         MCSim=MCSim, Np=Np, prob=0.95)
The rolcor_estim_heatmap function estimates the rolling window correlation coefficients for all possible window-lengths (time-scales), or for a band of window-lengths, between two time series sampled on identical time points, and their statistical significance via a non-parametric, computing-intensive method. To carry out the computational implementation we extend the works of Telford (2013), Polanco-Martínez (2019) and Polanco-Martínez (2020). The test/method to determine the statistical significance is described in Polanco-Martínez and López-Martínez (2021). The rolcor_estim_heatmap function is highly flexible, since it takes several parameters that control the estimation of the correlation. These parameters are described below.
rolcor_estim_heatmap(inputdata, CorMethod="pearson", typewidthwin="FULL", widthwin_1=3, widthwin_N=dim(inputdata)[1], Align="center", rmltrd=TRUE, Scale=TRUE, MCSim=1000, prob=0.95, Np=2)
inputdata |
A matrix of 3 columns: time (regular/evenly spaced), the first (e.g., the independent) variable, and the second (e.g., the dependent) variable. Please verify if |
CorMethod |
The method used to estimate the correlations, by default is “pearson”, but other options (“spearman” and “kendall”) are available (please look at: R>?cor.test). |
typewidthwin |
“FULL” is to estimate the windows from 2, 4, ..., to dim(inputdata)[1]) if |
widthwin_1 |
First value for the size (length) of the windows when the option |
widthwin_N |
Last value for the size (length) of the windows when the option |
Align |
To align the rolling object, NonParRolCor uses three options: “left”, “center”, and “right” (please look at: R>?running). However, there are some restrictions, which have been described above. We recommend using the “center” option to ensure that variations in the correlations are aligned with the variations in the relationships of the variables under study, rather than being shifted to the left or to the right (Polanco-Martínez 2019, 2020), but this implies that the window-lengths must be odd. |
rmltrd |
Remove the linear trend from the variables under analysis (“TRUE” by default; “FALSE” otherwise). It is advisable to remove at least the linear trend before estimating the rolling window correlation coefficients, especially for large window-lengths. |
Scale |
Scale (“TRUE” by default; “FALSE” otherwise) is used to “normalize” or “standardize” the variables under analysis. It is highly advisable to “normalize/standardize” the time series under study so that they are on the same scale. |
MCSim |
Number of Monte-Carlo simulations to permute the second variable. It is advisable to use at least 1000 simulations. |
prob |
Numeric vector of probabilities with values in the interval [0,1]; by default prob=0.95 (p=0.05). Please look at R>?quantile, Telford (2013), or Polanco-Martínez and López-Martínez (2021) for more information. |
Np |
Number of CPU cores, by default is 2. Please verify the number of cores of your computer. WARNING: it is not advisable to use the maximum number of cores of your computer. |
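For the “center” alignment, which requires odd window-lengths, a band of windows and the matrix of rolling correlations behind a heat map might be built as follows. This is a base R sketch, not the package's code: roll_cor_center and cor_by_window are hypothetical names, and stepping by 2 from widthwin_1 to widthwin_N is an assumption derived from the odd-length requirement.

```r
# Centred rolling correlation for one odd window width (hypothetical helper).
roll_cor_center <- function(x, y, width) {
  n <- length(x)
  half <- (width - 1) / 2
  out <- rep(NA_real_, n)
  for (i in (half + 1):(n - half)) {
    idx <- (i - half):(i + half)
    out[i] <- cor(x[idx], y[idx])
  }
  out
}

widthwin_1 <- 29
widthwin_N <- 51
widths <- seq(widthwin_1, widthwin_N, by = 2)   # odd lengths 29, 31, ..., 51

set.seed(5)
x <- rnorm(150)
y <- 0.6 * x + rnorm(150)

# One row per window-length, one column per time point; edges are NA.
cor_by_window <- t(sapply(widths, function(w) roll_cor_center(x, y, w)))
```

Each row has width - 1 missing values at its edges, so longer windows cover fewer time points; this is why heat maps of rolling correlations narrow as the window-length grows.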
The rolcor_estim_heatmap function estimates the rolling window correlation coefficients, and their statistical significance, between two time series sampled on identical time points, for all possible window-lengths or for a band of window-lengths. This function supersedes the function estimation_NonParRolCor of the previous version of NonParRolCor. The function rolcor_estim_heatmap uses the functions cor.test (package:stats) and running (package:gtools) to estimate the correlation coefficients and to compute the rolling window correlations, and the functions foreach (package:foreach) and makeCluster (package:parallel) to parallelize the estimation of the rolling window correlations.
Outputs:
A list containing eight elements: the_matrixCOR and CRITVAL contain the rolling window correlation coefficients and their respective critical values; nwin and Rwidthwin contain the number of window-lengths (time-scales) and the window-lengths themselves; left_win and righ_win are used to accommodate the times of the rolling window correlation coefficients; finally, MCSim indicates the number of Monte-Carlo simulations and prob the significance level.
Josué M. Polanco-Martínez (a.k.a. jomopo).
Excellence Unit GECOS, IME, Universidad de Salamanca, Salamanca, SPAIN.
BC3 - Basque Centre for Climate Change, Leioa, SPAIN.
Web1: https://scholar.google.es/citations?user=8djLIhcAAAAJ&hl=en/.
Web2: https://www.researchgate.net/profile/Josue-Polanco-Martinez/.
Email: [email protected].
José L. López-Martínez.
Faculty of Mathematics, Universidad Autónoma de Yucatán (UADY), Tizimín, MEXICO.
Web1: https://scholar.google.es/citations?user=552PKVEAAAAJ&hl=es/.
Web2: https://www.researchgate.net/profile/Jose-Lopez-Martinez-3/.
Email: [email protected].
Polanco-Martínez, J. M. and López-Martínez, J. L. (2021). A non-parametric method to test the statistical significance in rolling window correlations, and applications to ecological time series. Ecological Informatics, 60, 101379. <URL: doi:10.1016/j.ecoinf.2021.101379>.
Polanco-Martínez, J. M. (2020). RolWinMulCor: an R package for estimating rolling window multiple correlation in ecological time series. Ecological Informatics, 60, 101163. <URL: doi:10.1016/j.ecoinf.2020.101163>.
Polanco-Martínez, J. M. (2019). Dynamic relationship analysis between NAFTA stock markets using nonlinear, nonparametric, non-stationary methods. Nonlinear Dynamics, 97(1), 369-389. <URL: doi:10.1007/s11071-019-04974-y>.
Telford, R.: Running correlations – running into problems. (2013). <URL: https://quantpalaeo.wordpress.com/2013/01/04/>.
# Code to test the function "rolcor_estim_heatmap"
library(NonParRolCor)
# Defining the 'NonParRolCor' parameters
TYPEWIDTHWIN <- "PARTIAL"
# Number of Monte-Carlo simulations (MCSim), please use at least 1000.
# WARNING: MCSim=2, it's just to test this example!
MCSim <- 2
Np <- 2 # Number of cores
X_Y <- rolcor_estim_heatmap(syntheticdata[1:350,], CorMethod="pearson",
                            typewidthwin=TYPEWIDTHWIN, widthwin_1=29,
                            widthwin_N=51, Align="center", rmltrd=TRUE,
                            Scale=TRUE, MCSim=MCSim, Np=Np)
The data set syntheticdata contains three columns: the first (named “Times”) holds the times (from 1 to 500), and the second (named “X”) and third (named “Y”) were generated by a bi-variate AR1 process with similar autocorrelation coefficients of 0.25. The two bi-variate AR1 time series are correlated, with positive (direct) correlation (0.85) for the first 250 elements and negative (inverse) correlation (-0.85) for the last 250 elements (Polanco-Martínez and López-Martínez 2021).
data(syntheticdata)
One file in ASCII format containing three columns and 500 rows, columns are separated by spaces.
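A data set with the same structure can be simulated in base R as sketched below. This is not the script used to produce syntheticdata; the construction through correlated innovations and the object name sim are assumptions made for illustration.

```r
set.seed(6)
n <- 500
phi <- 0.25                                    # AR(1) coefficient for both series
rho <- c(rep(0.85, 250), rep(-0.85, 250))      # target correlation per half

# Innovations with time-varying correlation rho between e1 and e2.
z1 <- rnorm(n)
z2 <- rnorm(n)
e1 <- z1
e2 <- rho * z1 + sqrt(1 - rho^2) * z2

# Recursive filter: x[t] = e[t] + phi * x[t - 1]
X <- as.numeric(stats::filter(e1, filter = phi, method = "recursive"))
Y <- as.numeric(stats::filter(e2, filter = phi, method = "recursive"))

sim <- data.frame(Times = 1:n, X = X, Y = Y)
```

Because both series share the same AR coefficient, the correlation between X and Y within each half matches that of the innovations (about 0.85 and -0.85), apart from sampling noise and a short carry-over at the change point.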
Author's own production (Josué M. Polanco-Martínez) based on: mpiktas (<URL: https://stats.stackexchange.com/users/2116/mpiktas/>). How to simulate two correlated AR(1) time series?, Cross Validated. (2013). <URL: https://stats.stackexchange.com/q/71831/> (version: 2013-10-03).
Polanco-Martínez, J. M. and López-Martínez, J. L. (2021). A non-parametric method to test the statistical significance in rolling window correlations, and applications to ecological time series. Ecological Informatics, 60, 101379. <URL: doi:10.1016/j.ecoinf.2021.101379>.