Time Series Analysis Module
Analyze correlated time-series data from simulations, experiments, or any sequential measurement process.
Capabilities
Automatic equilibration detection
Integrated autocorrelation time / statistical inefficiency estimation
Geweke convergence diagnostic
Sub-sampling utilities: regular, random, or block-averaged
Equilibration Length Estimation
Scan candidate truncation points and return the one that maximizes the
effective sample size, following the automated procedure of Chodera
[chodera2016]. The search grid can be thinned with nskip, and the tail
of the data can be excluded from the scan via ignore_end (an int number
of points, a float fraction of the series, or None for one-quarter of
the series). Geyer estimators impose larger minimum tail lengths.
Parallel evaluation is supported with number_of_cores. Constant series
are detected automatically and return (0, n_data).
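The scan described above can be sketched in a few lines of NumPy. This is a minimal illustration of the idea, not the module's implementation: the names sketch_equilibration_length and _si are hypothetical, the inner estimator is a simple direct-sum autocorrelation time, and treating a float ignore_end as a fraction of the series length is an assumption. Parallel evaluation and the Geyer estimators are left out.

```python
import numpy as np


def _si(x):
    """Hypothetical helper: direct-sum statistical inefficiency,
    truncated at the first non-positive autocorrelation value."""
    n = x.size
    dx = x - x.mean()
    var = np.dot(dx, dx) / n
    if var == 0.0:
        return float(n)
    si = 1.0
    for t in range(1, n - 1):
        c = np.dot(dx[:-t], dx[t:]) / ((n - t) * var)
        if c <= 0.0:
            break
        si += 2.0 * c * (1.0 - t / n)
    return max(si, 1.0)


def sketch_equilibration_length(x, nskip=1, ignore_end=None):
    """Return (t0, si) for the truncation point t0 that maximizes
    the effective sample size (n - t0) / si of the tail x[t0:]."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if x.max() == x.min():
        return 0, float(n)  # constant series, as described above
    if ignore_end is None:
        ignore_end = n // 4  # None means one-quarter of the series
    elif isinstance(ignore_end, float):
        ignore_end = int(ignore_end * n)  # assumed: fraction of the series
    best_t0, best_si, best_neff = 0, float(n), 0.0
    for t0 in range(0, n - ignore_end, nskip):
        si = _si(x[t0:])
        neff = (n - t0) / si
        if neff > best_neff:
            best_t0, best_si, best_neff = t0, si, neff
    return best_t0, best_si


# Synthetic series: white noise plus a decaying initial transient.
rng = np.random.default_rng(1)
n_points = 2000
series = rng.standard_normal(n_points) + 5.0 * np.exp(-np.arange(n_points) / 50.0)
t0, si_tail = sketch_equilibration_length(series, nskip=20)
```

Each candidate costs one statistical-inefficiency evaluation on the remaining tail, so a larger nskip trades resolution in t0 for a proportionally shorter scan.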
Statistical Inefficiency
Four estimators are provided:
statistical_inefficiency – standard integrated autocorrelation time
geyer_r_statistical_inefficiency – Geyer initial monotone sequence [geyer1992], [geyer2011]
geyer_split_r_statistical_inefficiency – split-chain variant (needs ≥ 8 points)
geyer_split_statistical_inefficiency – split-chain minimum variance
All functions accept an optional second array y to compute
cross-correlation instead of auto-correlation. FFT convolution is used
by default for series longer than 30 points; it can be disabled with
fft=False. The summation window is controlled with
minimum_correlation_time. Constant data return si = n_data.
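As an illustration of the standard estimator, the sketch below computes the normalized autocorrelation function with a zero-padded FFT and sums it into a statistical inefficiency. It is not the module's code. The function names are hypothetical, and two choices are assumptions: minimum_correlation_time is read as the number of lags that are always included in the sum, and the sum stops at the first non-positive lag beyond that window. Only the auto-correlation case is shown.

```python
import numpy as np


def autocorrelation_fft(x):
    """Normalized autocorrelation C(t), t = 0..n-1, via zero-padded FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dx = x - x.mean()
    nfft = 1 << (2 * n - 1).bit_length()  # power of two >= 2n - 1
    f = np.fft.rfft(dx, nfft)
    acov = np.fft.irfft(f * np.conj(f), nfft)[:n]
    acov /= np.arange(n, 0, -1)  # lag t is averaged over n - t pairs
    return acov / acov[0]


def sketch_statistical_inefficiency(x, minimum_correlation_time=3):
    """si = 1 + 2 * sum_t (1 - t/n) * C(t), with a truncated sum."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if x.max() == x.min():
        return float(n)  # constant data: si = n_data
    c = autocorrelation_fft(x)
    si = 1.0
    for t in range(1, n - 1):
        # Assumed rule: stop at the first non-positive lag past the window.
        if c[t] <= 0.0 and t > minimum_correlation_time:
            break
        si += 2.0 * c[t] * (1.0 - t / n)
    return max(si, 1.0)


# Two test series: white noise and an AR(1) process with phi = 0.9,
# whose exact statistical inefficiency is (1 + phi) / (1 - phi) = 19.
rng = np.random.default_rng(0)
white = rng.standard_normal(5000)
shocks = rng.standard_normal(20000)
ar1 = np.zeros(20000)
for i in range(1, ar1.size):
    ar1[i] = 0.9 * ar1[i - 1] + shocks[i]

si_white = sketch_statistical_inefficiency(white)
si_ar1 = sketch_statistical_inefficiency(ar1)
```

Zero-padding to at least 2n - 1 points keeps the circular FFT convolution from wrapping around, so the result matches the direct lag-by-lag sum up to floating-point error.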
Geweke Diagnostic
Compute z-scores by comparing early and late segments of the chain [geweke1992], [plummer2006]. Signature:
geweke(x, first=0.1, last=0.5, intervals=20)
Returns an (intervals, 2) array whose columns are [start index, z-score].
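The early-versus-late comparison can be sketched as follows. This is an illustration under simplifying assumptions, not the module's implementation: sketch_geweke is a hypothetical name, the start indices are spaced evenly over the first (1 - last) fraction of the chain, and each segment's standard error uses the plain sample variance as if the points were independent. Geweke's original diagnostic uses spectral density estimates at zero frequency instead, which accounts for autocorrelation within each segment.

```python
import numpy as np


def sketch_geweke(x, first=0.1, last=0.5, intervals=20):
    """Return an (intervals, 2) array with columns [start index, z-score]."""
    x = np.asarray(x, dtype=float)
    if not (0.0 < first < 1.0 and 0.0 < last < 1.0 and first + last < 1.0):
        raise ValueError("first and last must lie in (0, 1) and sum to < 1")
    end = x.size - 1
    # Assumed grid: evenly spaced starts within the first (1 - last) fraction.
    starts = np.linspace(0, int((1.0 - last) * end), intervals,
                         endpoint=False).astype(int)
    out = np.empty((intervals, 2))
    for k, start in enumerate(starts):
        span = end - start
        early = x[start:start + int(first * span)]
        late = x[int(end - last * span):]
        # Simplification: variance of the mean assuming independent points.
        se2 = early.var(ddof=1) / early.size + late.var(ddof=1) / late.size
        out[k] = start, (early.mean() - late.mean()) / np.sqrt(se2)
    return out


rng = np.random.default_rng(2)
stationary = rng.standard_normal(4000)
drifting = stationary + np.linspace(0.0, 5.0, 4000)

z_stationary = sketch_geweke(stationary)
z_drifting = sketch_geweke(drifting)
```

For a converged chain the z-scores should scatter around zero, while a drift between the early and late segments pushes them far from it.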
Sampling Utilities
Three modes are available through a single entry point:
uncorrelated – take every si-th point
random – one random point per si-length block
block_averaged – average of each si-length block
Pre-computed indices can be supplied via uncorrelated_sample_indices
to avoid recomputing the statistical inefficiency.
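The three modes differ only in how each si-length block is reduced to a single value, which the sketch below makes explicit. It is an illustration, not the module's entry point: sketch_subsample is a hypothetical name, rounding si up to an integer stride is an assumption, and so is dropping an incomplete trailing block in the random and block_averaged modes.

```python
import numpy as np


def sketch_subsample(x, si, sample_method="uncorrelated", rng=None):
    """Reduce a correlated series to approximately independent samples."""
    x = np.asarray(x, dtype=float)
    step = max(1, int(np.ceil(si)))  # assumed: round si up to an integer
    if sample_method == "uncorrelated":
        return x[::step]  # every si-th point
    n_blocks = x.size // step
    # Assumed: an incomplete trailing block is dropped.
    blocks = x[:n_blocks * step].reshape(n_blocks, step)
    if sample_method == "random":
        rng = np.random.default_rng() if rng is None else rng
        cols = rng.integers(0, step, size=n_blocks)
        return blocks[np.arange(n_blocks), cols]  # one random point per block
    if sample_method == "block_averaged":
        return blocks.mean(axis=1)  # average of each block
    raise ValueError(f"unknown sample_method: {sample_method!r}")


values = np.arange(10.0)
every_third = sketch_subsample(values, si=3)
random_pick = sketch_subsample(values, si=3, sample_method="random",
                               rng=np.random.default_rng(3))
block_means = sketch_subsample(values, si=3, sample_method="block_averaged")
```

Block averaging uses every point of each complete block, whereas the other two modes discard all but one point per block.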
Quick Example
import numpy as np
from kim_convergence.timeseries import (
    estimate_equilibration_length,
    statistical_inefficiency,
    geweke,
    uncorrelated_time_series_data_samples,
)

data = np.random.randn(10000)  # your correlated series
eq, si = estimate_equilibration_length(data, nskip=10)
z = geweke(data[eq:], intervals=20)
uncorr = uncorrelated_time_series_data_samples(
    data[eq:], si=si, sample_method='block_averaged')
Hints
Geyer estimators are preferred for noisy or slowly-decaying correlations
Block averaging is best for thermodynamic observables
Use ignore_end to exclude non-stationary tails
Set number_of_cores > 1 to accelerate long scans
For series shorter than 30 points, FFT is disabled automatically
The module works with any time-ordered data, not just simulation trajectories