DMAP-AI calculation documentation

Calculation process and formulas used in DMAP-AI

This page documents how the current DMAP-AI Research Version calculates SPI and related diagnostic outputs. It is a methods page, not a step-by-step tutorial. The SPI workflow first uses a Gamma-distribution approach and then uses an empirical distribution as a safe fallback when the Gamma calculation becomes unstable or returns invalid values.

Overall calculation workflow

The Research Version calculates drought indicators from precipitation for one selected point. The current free workflow is point-based and focused on SPI.

  1. Read the selected point and dates. The tool receives latitude, longitude, start date, end date, selected data source, and optional baseline period.
  2. Download or read precipitation. The backend obtains precipitation from NASA POWER or ERA5-Land/CDS depending on the user selection.
  3. Clean the precipitation series. Missing values are removed from the fitting set. Precipitation is treated as a non-negative variable.
  4. Aggregate precipitation. For the current yearly Research Version workflow, precipitation is summed over each calendar year.
  5. Define the baseline sample. If the user provides a valid baseline period, that period is used. If not, the analysis period can be used as the fitting baseline.
  6. Calculate SPI. DMAP-AI first tries the Gamma-distribution SPI calculation. If the Gamma probability is invalid, not finite, or numerically unstable, the backend falls back to an empirical distribution.
  7. Calculate categories and diagnostics. The returned outputs include SPI categories, drought events, severity metrics, wavelet diagnostics, optional copula diagnostics, and JSON metadata.

Standardized Precipitation Index (SPI)

SPI transforms accumulated precipitation into a standard normal variable. Negative values represent drier-than-normal conditions, positive values represent wetter-than-normal conditions, and values near zero represent near-normal conditions.

Step 1

Precipitation accumulation

Let \(P_t\) be precipitation at time step \(t\). For an accumulation window of length \(k\), the accumulated precipitation is:

\[ X_t^{(k)} = \sum_{j=0}^{k-1} P_{t-j} \]

For the current yearly Research Version workflow, \(X_y\) is the Jan–Dec precipitation total for year \(y\).

For yearly SPI, the time step is one year. For future monthly workflows, the same formula can be applied to monthly accumulations such as 1-month, 3-month, 6-month, or 12-month SPI.
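The accumulation formula can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `accumulate` and the choice to return `None` where a full window is not yet available are assumptions, not a description of the DMAP-AI backend.

```python
def accumulate(precip, k):
    """Rolling k-step sums X_t = P_t + P_{t-1} + ... + P_{t-k+1}.

    Positions without a full window of k values are returned as None
    (an assumed convention for this sketch).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    out = []
    for t in range(len(precip)):
        if t + 1 < k:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(precip[t - k + 1 : t + 1]))
    return out


monthly = [10.0, 0.0, 25.0, 5.0, 40.0]   # hypothetical monthly totals in mm
three_month = accumulate(monthly, 3)      # [None, None, 35.0, 30.0, 70.0]
```

With \(k=1\) the series is returned unchanged, which corresponds to the yearly case where each value is already a calendar-year total.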

Step 2

Gamma probability model

The Gamma distribution is fitted to valid baseline precipitation accumulations. The probability density function is:

\[ g(x;\alpha,\beta)=\frac{x^{\alpha-1}e^{-x/\beta}}{\Gamma(\alpha)\beta^{\alpha}}, \qquad x>0 \]

Here \(\alpha\) is the shape parameter, \(\beta\) is the scale parameter, and \(\Gamma(\alpha)\) is the Gamma function.

The Gamma cumulative probability is:

\[ G(x)=\int_{0}^{x} g(u;\alpha,\beta)\,du \]
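As a rough illustration, the Gamma cumulative probability can be evaluated with the Python standard library alone. This page does not state how \(\alpha\) and \(\beta\) are estimated, so the method-of-moments fit below is an assumption made for the sketch (maximum likelihood is another common choice). The function names are hypothetical, and in practice a library routine such as SciPy's Gamma distribution would usually replace the hand-written series.

```python
import math


def fit_gamma_moments(sample):
    """Assumed fit: method of moments, alpha = mean^2 / var, beta = var / mean."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    if mean <= 0 or var <= 0:
        raise ValueError("degenerate sample; Gamma fit is not defined")
    return mean * mean / var, var / mean


def gamma_cdf(x, alpha, beta, tol=1e-12, max_terms=10000):
    """G(x) as the regularized lower incomplete gamma function (power series).

    Adequate for moderate x / beta; not tuned for extreme arguments.
    """
    if x <= 0:
        return 0.0
    z = x / beta
    term = 1.0 / alpha
    total = term
    for n in range(1, max_terms):
        term *= z / (alpha + n)
        total += term
        if term < tol * total:
            break
    log_prefactor = alpha * math.log(z) - z - math.lgamma(alpha)
    return min(1.0, math.exp(log_prefactor) * total)


alpha, beta = fit_gamma_moments([400.0, 500.0, 600.0])  # hypothetical totals in mm
g_at_mean = gamma_cdf(500.0, alpha, beta)
```

For \(\alpha=1\) the Gamma distribution reduces to the exponential distribution, so `gamma_cdf(2.0, 1.0, 1.0)` should agree with \(1-e^{-2}\), which gives a simple check of the series.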

Zero precipitation adjustment

Cumulative probability used for SPI

If zero precipitation values exist in the fitting sample, their probability is included before transforming to SPI.

\[ q = \frac{n_0}{n} \]

\[ H(x)=q+(1-q)G(x) \]

\(n_0\) is the number of zero values and \(n\) is the number of valid baseline values. For annual precipitation, \(q\) is usually zero.

SPI transform

Transform probability to standard normal space

SPI is the inverse standard normal value associated with the cumulative probability.

\[ SPI_t = \Phi^{-1}\left(H(X_t)\right) \]

\(\Phi^{-1}\) is the inverse cumulative distribution function of the standard normal distribution.

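To avoid infinite SPI values when \(H(x)\) equals 0 or 1, probabilities are clipped to the interval \([\varepsilon, 1-\varepsilon]\), where \(\varepsilon\) is a small positive constant: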

\[ H^*(x)=\min\left(1-\varepsilon,\max\left(\varepsilon,H(x)\right)\right) \]
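The zero adjustment, the clipping, and the inverse-normal transform can be combined in one short function. This is a minimal sketch: the function name and the value `eps=1e-6` are assumptions, since this page does not state the clipping constant used by the backend.

```python
from statistics import NormalDist


def spi_from_gamma_probability(g, n_zero, n_valid, eps=1e-6):
    """Map a Gamma CDF value G(x) to SPI.

    H = q + (1 - q) * G with q = n_zero / n_valid, then clip to
    [eps, 1 - eps] and apply the inverse standard normal CDF.
    eps is an assumed value for this sketch.
    """
    q = n_zero / n_valid
    h = q + (1.0 - q) * g
    h_clipped = min(1.0 - eps, max(eps, h))
    return NormalDist().inv_cdf(h_clipped)


near_normal = spi_from_gamma_probability(0.5, n_zero=0, n_valid=30)
dry_with_zeros = spi_from_gamma_probability(0.0, n_zero=3, n_valid=30)
```

In the second call \(G(x)=0\) but \(q=0.1\), so \(H=0.1\) and the SPI is about \(-1.28\) rather than minus infinity. With \(G(x)=1\), the clipping keeps the result finite.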

Backend stability rule

Empirical distribution fallback used by DMAP-AI

In the backend, SPI first uses the Gamma-distribution calculation. However, Gamma fitting can become unstable when the sample is too short, the values are nearly identical, the variance is very small, or the fitted distribution returns an invalid cumulative probability. In those cases, DMAP-AI uses an empirical distribution fallback instead of returning NaN.

\[ R(x_i)=\#\{x_j \le x_i\} \]

\[ F_{emp}(x_i)=\frac{R(x_i)-0.5}{n} \]

\[ SPI_{emp}(x_i)=\Phi^{-1}\left(F_{emp}^*(x_i)\right) \]

\(R(x_i)\) is the rank of value \(x_i\) among the \(n\) valid baseline values. \(F_{emp}^*(x_i)\) is the clipped empirical probability.

This fallback keeps the output usable when the Gamma method fails. The empirical SPI should be interpreted as a rank-based standardized value. It preserves the ordering of dry and wet years but does not assume that the precipitation sample follows a fitted Gamma distribution.
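The empirical fallback can be sketched as follows. The function name, the `eps` value, and the treatment of values that fall below every baseline value (rank 0, clipped to `eps`) are assumptions of this sketch rather than documented backend behavior.

```python
from statistics import NormalDist


def empirical_spi(values, baseline, eps=1e-6):
    """Rank-based SPI: Phi^-1 of the clipped empirical probability (R - 0.5) / n."""
    n = len(baseline)
    std_normal = NormalDist()
    out = []
    for x in values:
        rank = sum(1 for b in baseline if b <= x)   # R(x) = #{x_j <= x}
        f = (rank - 0.5) / n
        f = min(1.0 - eps, max(eps, f))             # clipped probability F*_emp
        out.append(std_normal.inv_cdf(f))
    return out


baseline = [300.0, 400.0, 500.0, 600.0, 700.0]      # hypothetical annual totals in mm
spi_emp = empirical_spi(baseline, baseline)
```

For these five values the empirical probabilities are 0.1, 0.3, 0.5, 0.7, and 0.9, so the median year maps to an SPI of zero and the ordering of dry and wet years is preserved.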

SPI category thresholds and classification formulas

The category tables convert calculated SPI values into drought or wetness labels. The SPI values themselves do not change when the category method changes.

Classic SPI

Classic SPI threshold table

| SPI range | Category | Meaning |
| --- | --- | --- |
| \(SPI \le -2.00\) | Extreme drought | Very strong dry signal. |
| \(-2.00 < SPI \le -1.50\) | Severe drought | Strong dry signal. |
| \(-1.50 < SPI \le -1.00\) | Moderate drought | Moderate dry signal. |
| \(-1.00 < SPI < 1.00\) | Near normal | Close to normal precipitation conditions. |
| \(1.00 \le SPI < 1.50\) | Moderately wet | Moderate wet signal. |
| \(1.50 \le SPI < 2.00\) | Very wet | Strong wet signal. |
| \(SPI \ge 2.00\) | Extremely wet | Very strong wet signal. |
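The classic thresholds translate directly into a chain of comparisons. The function name `classify_classic` is hypothetical; the boundaries follow the table above, with dry-side limits inclusive on the upper end and wet-side limits inclusive on the lower end.

```python
def classify_classic(spi):
    """Return the classic SPI category label for one SPI value."""
    if spi <= -2.0:
        return "Extreme drought"
    if spi <= -1.5:
        return "Severe drought"
    if spi <= -1.0:
        return "Moderate drought"
    if spi < 1.0:
        return "Near normal"
    if spi < 1.5:
        return "Moderately wet"
    if spi < 2.0:
        return "Very wet"
    return "Extremely wet"


labels = [classify_classic(v) for v in (-2.3, -1.0, 0.0, 1.0, 2.0)]
```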

USDM-style labels

SPI/SPEI-based U.S. Drought Monitor-style categories

| SPI range used in DMAP-AI | Label used in DMAP-AI | Official SPI/SPEI range shown by USDM |
| --- | --- | --- |
| \(SPI \le -2.00\) | D4 – Exceptional drought | -2.00 or less |
| \(-2.00 < SPI \le -1.60\) | D3 – Extreme drought | -1.60 to -1.99 |
| \(-1.60 < SPI \le -1.30\) | D2 – Severe drought | -1.30 to -1.59 |
| \(-1.30 < SPI \le -0.80\) | D1 – Moderate drought | -0.80 to -1.29 |
| \(-0.80 < SPI \le -0.50\) | D0 – Abnormally dry | -0.50 to -0.79 |
| \(SPI > -0.50\) | No drought / normal or wet | -0.49 or above |

These thresholds are adapted from the official U.S. Drought Monitor drought classification table, which gives approximate SPI/SPEI ranges for D0–D4 categories. In DMAP-AI, they are used only as SPI-based communication labels. They are not an official U.S. Drought Monitor map because the official USDM uses multiple indicators, local information, expert assessment, and drought impacts.
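Because every USDM-style class is bounded above by an inclusive threshold, the lookup can be written as a table search. The sketch below uses `bisect_left` from the standard library; the function name and the table layout are illustrative assumptions.

```python
from bisect import bisect_left

# Inclusive upper bounds of D4..D0, in ascending order, and their labels.
USDM_BOUNDS = [-2.0, -1.6, -1.3, -0.8, -0.5]
USDM_LABELS = [
    "D4 – Exceptional drought",
    "D3 – Extreme drought",
    "D2 – Severe drought",
    "D1 – Moderate drought",
    "D0 – Abnormally dry",
    "No drought / normal or wet",
]


def classify_usdm_style(spi):
    """Return the USDM-style communication label for one SPI value.

    bisect_left returns the index of the first bound >= spi, so a value
    equal to a bound falls into the drier class, matching the table.
    """
    return USDM_LABELS[bisect_left(USDM_BOUNDS, spi)]
```

For example, an SPI of \(-1.6\) is labeled D3 and an SPI of \(-0.4\) is labeled as no drought.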

Percentile and k-means methods

Distribution-based categories

Percentile classes use the empirical probability position of each SPI value:

\[ p_i = F_{emp}(SPI_i) \]

K-means groups SPI values by minimizing within-cluster variation:

\[ \min_{C_1,\ldots,C_K}\sum_{k=1}^{K}\sum_{SPI_i\in C_k}\left(SPI_i-\mu_k\right)^2 \]

Clusters are sorted by their centroids \(\mu_k\), from driest to wettest, before labels are assigned.
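A one-dimensional k-means grouping can be sketched with Lloyd's algorithm. This page does not describe the initialization or the stopping rule, so the evenly spread starting centroids and the iteration cap below are assumptions, and `kmeans_1d` is a hypothetical name. It assumes \(K\) is no larger than the number of values.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm in 1-D.

    Returns (labels, centroids) with centroids sorted from driest to wettest
    and labels given as indices into that sorted list.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Assumed initialization: spread starting centroids evenly over the sorted sample.
    centroids = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]

    def nearest(v):
        return min(range(k), key=lambda j: (v - centroids[j]) ** 2)

    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v)].append(v)
        # Keep the old centroid if a cluster received no members.
        updated = [sum(g) / len(g) if g else centroids[j] for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated

    order = sorted(range(k), key=lambda j: centroids[j])
    rank_of = {j: r for r, j in enumerate(order)}
    labels = [rank_of[nearest(v)] for v in values]
    return labels, [centroids[j] for j in order]


spi = [-2.1, -1.9, -0.1, 0.0, 0.1, 1.8, 2.0]
labels, centers = kmeans_1d(spi, 3)
```

Here label 0 is the driest cluster and label 2 the wettest, so category names can be attached in centroid order.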

Drought severity, event duration, and magnitude

Severity calculations use only drought periods. They do not measure wetness magnitude.

Event detection

Drought-event indicator

Let \(\theta\) be the selected drought threshold, such as \(-1.0\) for moderate drought and worse. A time step is part of a drought event when:

\[ I_t = \begin{cases} 1, & SPI_t \le \theta \\ 0, & SPI_t > \theta \end{cases} \]

A drought event is a consecutive sequence of time steps where \(I_t=1\).
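Event detection amounts to finding runs of consecutive time steps at or below the threshold. In the sketch below, the function name is hypothetical and treating a missing value (`None`) as a break between events is an assumption.

```python
def find_drought_events(spi, theta=-1.0):
    """Return inclusive (start, end) index pairs for runs where SPI <= theta."""
    events = []
    start = None
    for t, value in enumerate(spi):
        in_drought = value is not None and value <= theta
        if in_drought and start is None:
            start = t                       # a new event begins
        elif not in_drought and start is not None:
            events.append((start, t - 1))   # the event ended at the previous step
            start = None
    if start is not None:                   # event still open at the end of the record
        events.append((start, len(spi) - 1))
    return events


spi = [0.2, -1.2, -1.5, 0.1, -1.0, 0.4, -2.1]
events = find_drought_events(spi)           # [(1, 2), (4, 4), (6, 6)]
```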

Duration and minimum SPI

Event summary metrics

For a drought event \(E=[a,b]\), the duration and minimum SPI are:

\[ D_E=b-a+1 \]

\[ SPI_{min,E}=\min_{t\in E}(SPI_t) \]

For yearly analysis, duration is measured in years. For monthly analysis, duration would be measured in months.

Magnitude

Drought deficit and magnitude

The drought deficit for one time step is the distance below the selected threshold:

\[ d_t = \max(0,\theta-SPI_t) \]

The drought magnitude of an event is the sum of drought deficits during that event:

\[ M_E=\sum_{t\in E} d_t=\sum_{t\in E}(\theta-SPI_t) \]

Because \(SPI_t\le\theta\) inside a drought event, every term \(\theta-SPI_t\) is non-negative.

Mean event intensity can also be calculated as:

\[ \bar{I}_E=\frac{M_E}{D_E} \]

Magnitude is expressed in SPI × time step. For yearly SPI, it can be read as SPI-years. A larger value means the drought was deeper, longer, or both. A value greater than 2 is not a universal category by itself; it must be interpreted with the selected threshold, duration, and minimum SPI.
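The event summary metrics can be computed together for one event. This is a minimal sketch with hypothetical names; it takes the inclusive start and end indices of an event that has already been identified.

```python
def event_metrics(spi, start, end, theta=-1.0):
    """Duration, minimum SPI, magnitude, and mean intensity for event [start, end]."""
    segment = spi[start : end + 1]
    duration = end - start + 1                                # D_E = b - a + 1
    magnitude = sum(max(0.0, theta - v) for v in segment)     # M_E = sum of deficits d_t
    return {
        "duration": duration,
        "min_spi": min(segment),
        "magnitude": magnitude,
        "mean_intensity": magnitude / duration,
    }


spi = [0.2, -1.2, -1.5, 0.1]
metrics = event_metrics(spi, start=1, end=2)
```

For this two-step event with \(\theta=-1.0\), the deficits are 0.2 and 0.5, so the magnitude is 0.7 SPI × time step and the mean intensity is 0.35.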

Wavelet diagnostics

Wavelet diagnostics show how drought variability changes across time and period. In the Research Version, the period unit follows the selected time step: years for yearly SPI and months for monthly SPI.

Scalogram

Continuous wavelet transform

The wavelet coefficient at time \(\tau\) and scale \(s\) can be written as:

\[ W_x(s,\tau)=\sum_{t=1}^{N}x_t\,\psi^*\left(\frac{t-\tau}{s}\right) \]

\(x_t\) is the SPI series of length \(N\), \(\psi^*\) is the complex conjugate of the mother wavelet, and \(s\) is the wavelet scale, which corresponds to the period of the fluctuation.

Power

Wavelet power and global spectrum

The wavelet power is:

\[ P_x(s,\tau)=|W_x(s,\tau)|^2 \]

The global wavelet spectrum summarizes power over the whole record:

\[ G_x(s)=\frac{1}{N}\sum_{\tau=1}^{N}|W_x(s,\tau)|^2 \]
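The transform, power, and global spectrum can be illustrated by direct summation. This page does not name the mother wavelet, so the complex Morlet wavelet with \(\omega_0=6\) is an assumption of this sketch, as are the function names. The sum follows the formula as written on this page, without the scale normalization that wavelet libraries usually apply, and the double loop is meant for short series only.

```python
import cmath
import math


def morlet(eta, omega0=6.0):
    """Complex Morlet mother wavelet (assumed choice; omega0 = 6 is a common default)."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * eta) * math.exp(-0.5 * eta * eta)


def wavelet_power(x, scales):
    """power[s][tau] = |sum_t x_t * conj(psi((t - tau) / s))|^2."""
    n = len(x)
    power = {}
    for s in scales:
        row = []
        for tau in range(n):
            w = sum(x[t] * morlet((t - tau) / s).conjugate() for t in range(n))
            row.append(abs(w) ** 2)
        power[s] = row
    return power


def global_spectrum(power):
    """Average the power over all time positions for each scale."""
    return {s: sum(row) / len(row) for s, row in power.items()}


# Synthetic series with a period of 8 time steps.
series = [math.cos(2 * math.pi * t / 8) for t in range(64)]
spectrum = global_spectrum(wavelet_power(series, [2, 4, 8, 16]))
peak_scale = max(spectrum, key=spectrum.get)
```

For the synthetic 8-step oscillation, the global spectrum is largest at scale 8 among the four scales tested.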

Coherence

Wavelet coherence

For two series \(x\) and \(y\), such as SPI and precipitation, wavelet coherence is:

\[ R^2_{xy}(s,\tau)=\frac{\left|S\left(s^{-1}W_{xy}(s,\tau)\right)\right|^2}{S\left(s^{-1}|W_x(s,\tau)|^2\right)S\left(s^{-1}|W_y(s,\tau)|^2\right)} \]

\[ W_{xy}(s,\tau)=W_x(s,\tau)W_y^*(s,\tau) \]

\(W_{xy}\) is the cross-wavelet transform and \(S\) is a smoothing operator. Coherence ranges from 0 to 1.

On this page, strong fluctuation means strong wavelet power at a specific time and period: SPI changed repeatedly and strongly at that time scale. It does not automatically mean that drought was severe every time that period appears.

Copula diagnostics used in the Research Version

The Copula tab in the current Research Version is a bivariate, rank-based dependence diagnostic. It converts the selected paired variables to pseudo-observations in the unit square, draws the U–V scatter plot, and reports dependence and drought-persistence summaries. It is useful for describing association and persistence, but it is not a drought forecast by itself.

User-selected pairing

Step 1: build the paired variables

The user selects a variable pairing in the Copula tab. For the option shown as SPI vs SPI (lag 1; persistence), the paired variables are consecutive SPI values:

\[ X_i = SPI_t, \qquad Y_i = SPI_{t+1} \]

This pairing is used to examine whether dry or wet SPI conditions tend to persist into the next time step.

Other pairings follow the same structure. For example, a precipitation–SPI pairing uses one variable from precipitation and the other from SPI for the same matched time steps. Records with missing or invalid paired values are excluded before the diagnostic is calculated.

Pseudo-observations

Step 2: convert pairs to U and V values

The Copula tab does not plot the original SPI or precipitation values directly. It first converts each paired variable to a rank-based probability value between 0 and 1:

\[ U_i = \frac{rank(X_i)}{n+1}, \qquad V_i = \frac{rank(Y_i)}{n+1} \]

Here, \(n\) is the number of valid paired observations. This is why the chart is labeled U–V scatter, and both axes range from 0 to 1.

In the U–V scatter plot, each dot is one valid paired time step. For SPI-related variables, small values near 0 represent the driest or lowest-ranked conditions, while values near 1 represent the wettest or highest-ranked conditions.
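The pairing and the rank transform for the lag-1 SPI option can be sketched as below. The function names are hypothetical, and giving tied values the average of their ranks is an assumption, since this page does not state a tie rule. The input is assumed to contain only valid SPI values.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions (assumed tie rule)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1            # average of positions i..j, 1-based
        for m in range(i, j + 1):
            ranks[order[m]] = shared
        i = j + 1
    return ranks


def lag1_pseudo_observations(spi):
    """Build X_i = SPI_t, Y_i = SPI_{t+1} and return U = rank(X)/(n+1), V = rank(Y)/(n+1)."""
    x, y = spi[:-1], spi[1:]
    n = len(x)
    u = [r / (n + 1) for r in average_ranks(x)]
    v = [r / (n + 1) for r in average_ranks(y)]
    return u, v


u, v = lag1_pseudo_observations([0.5, -1.2, 0.1, 1.4, -0.3])
```

Five SPI values give four pairs, so the ranks are divided by \(n+1=5\) and every U and V value lies strictly between 0 and 1.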

Dependence summary

Step 3: calculate correlation and Kendall tau

The summary box reports the correlation between the rank-based U and V values as the Gaussian rho value:

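\[ \hat{\rho}=Corr(U,V) \]

Correlation is bounded by \(-1 \le \hat{\rho} \le 1\). A positive value means the two variables tend to move in the same rank direction. A negative value means they tend to move in opposite rank directions. Because \(U\) and \(V\) are scaled ranks, this value coincides with Spearman's rank correlation of the original pairs.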

Kendall tau is also reported as a nonparametric rank-dependence measure:

\[ \hat{\tau}=\frac{C-D}{\frac{1}{2}n(n-1)} \]

\(C\) is the number of concordant pairs and \(D\) is the number of discordant pairs.

Values near zero indicate weak dependence. In the SPI lag-1 persistence example, a positive value means low SPI ranks tend to be followed by low SPI ranks and high SPI ranks tend to be followed by high SPI ranks.
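Both dependence measures can be computed directly from the U and V values. The sketch below uses hypothetical function names and a simple \(O(n^2)\) pair count for Kendall tau, which is adequate for short yearly records. Tied pairs are counted as neither concordant nor discordant, consistent with the formula above.

```python
import math


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    spread_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    spread_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (spread_u * spread_v)


def kendall_tau(u, v):
    """tau = (C - D) / (n (n - 1) / 2)."""
    n = len(u)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (u[i] - u[j]) * (v[i] - v[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (0.5 * n * (n - 1))


u = [0.2, 0.4, 0.6, 0.8]
v = [0.2, 0.4, 0.8, 0.6]
rho = pearson(u, v)
tau = kendall_tau(u, v)
```

In this example one of the six pairs is discordant, so \(\hat{\tau}=(5-1)/6\approx 0.67\), while the rank correlation is 0.8.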

Gaussian family

Why tail-dependence values are not shown in the tool

The current Research Version uses a Gaussian copula to summarize overall dependence. In a Gaussian copula, lower-tail and upper-tail dependence are theoretically zero for any non-perfect correlation.

Because these values are not informative in normal DMAP-AI runs, the Research Version does not show lower-tail or upper-tail dependence in the Copula summary. The more useful outputs are Gaussian rho, Kendall tau, the U–V scatter, sample size, and the persistence summaries such as P(drought continues next step).

Sample size

Sample size used

The summary item Sample size used is the number of valid paired observations after the selected pairing is built and missing or invalid values are removed:

\[ n = \#\{(X_i,Y_i): X_i \text{ and } Y_i \text{ are valid and finite}\} \]

For the lag-1 SPI persistence option, one time step is lost because each pair needs both \(SPI_t\) and \(SPI_{t+1}\). A series of \(N\) valid SPI values therefore yields at most \(N-1\) pairs.

Persistence / duration

Drought persistence summary for SPI ≤ -1.0

When the selected pairing is SPI vs SPI lag 1, the summary box also reports persistence statistics based on a drought threshold of \(SPI \le -1.0\). Define a drought indicator as:

\[ D_t= \begin{cases} 1, & SPI_t \le -1.0 \\ 0, & SPI_t > -1.0 \end{cases} \]

The probability that drought continues into the next step is calculated from times that are already in drought:

\[ P(\text{drought continues next step})= \frac{\#\{t:D_t=1 \text{ and } D_{t+1}=1\}}{\#\{t:D_t=1\}} \]

The probability of a new drought next step is calculated from times that are not currently in drought:

\[ P(\text{new drought next step})= \frac{\#\{t:D_t=0 \text{ and } D_{t+1}=1\}}{\#\{t:D_t=0\}} \]

The approximate mean drought length is the average length of consecutive drought runs:

\[ \bar{L}=\frac{1}{m}\sum_{j=1}^{m} L_j \]

\(L_j\) is the length of drought run \(j\), and \(m\) is the number of drought runs. If the available record is too limited for this estimate, the tool can report NA.
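The three persistence statistics can be sketched as one function. The names are hypothetical, and returning `None` in place of NA when a denominator is zero is an assumption. The transition counts use only time steps that have a following step, which matches the lag-1 pairing.

```python
def persistence_summary(spi, threshold=-1.0):
    """Empirical drought transition probabilities and mean run length."""
    d = [1 if v <= threshold else 0 for v in spi]
    pairs = list(zip(d[:-1], d[1:]))                     # (D_t, D_{t+1})
    after_drought = [nxt for cur, nxt in pairs if cur == 1]
    after_normal = [nxt for cur, nxt in pairs if cur == 0]

    runs, length = [], 0
    for flag in d:
        if flag:
            length += 1
        elif length:
            runs.append(length)
            length = 0
    if length:
        runs.append(length)

    return {
        "p_continue": sum(after_drought) / len(after_drought) if after_drought else None,
        "p_new": sum(after_normal) / len(after_normal) if after_normal else None,
        "mean_length": sum(runs) / len(runs) if runs else None,
    }


summary = persistence_summary([-1.2, -1.5, 0.3, -1.1, 0.2, 0.4])
```

For this series the drought indicator is 1, 1, 0, 1, 0, 0. Drought continues in one of three drought steps, a new drought starts after one of two non-drought steps, and the two runs of lengths 2 and 1 give a mean length of 1.5.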

How to read the Copula tab

How the displayed results should be interpreted

| Displayed item | Meaning in the current Research Version |
| --- | --- |
| U–V scatter | Rank-based paired values in the unit square. Each dot is one valid paired time step. |
| Pair type | The selected variable pairing, such as SPI vs SPI lag 1 for persistence analysis. |
| Family: Gaussian | The dependence summary is reported in a Gaussian-copula-style framework. |
| Gaussian rho | Correlation of the rank-based U and V values. |
| Kendall tau | Rank-dependence measure based on concordant and discordant pairs. |
| Persistence / duration | Empirical transition statistics using \(SPI \le -1.0\) as the drought threshold. |

The Copula tab is a descriptive diagnostic. It should be interpreted together with the SPI table, charts, and drought severity tab. It does not replace the main SPI calculation and should not be interpreted as an official drought forecast.

Data-source processing formulas

The current Research Version uses point-based precipitation data. The same SPI formulas are used after precipitation is converted to a consistent unit and time scale.

NASA POWER

NASA POWER precipitation

NASA POWER is the default free point-based data source in the Research Version. Daily precipitation values are summed to the selected analysis period:

\[ P_{year} = \sum_{d=1}^{D_y} P_d \]

\(D_y\) is the number of available days in year \(y\).

ERA5-Land / CDS

ERA5-Land/CDS precipitation

ERA5-Land/CDS is an advanced option. If precipitation is returned as meters of water equivalent, it is converted to millimeters before aggregation:

\[ P_{mm}=1000\times P_m \]

After unit conversion, values are summed over the selected period in the same way as the NASA POWER workflow.
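The aggregation and the unit conversion can be sketched together. The input layout (pairs of date and value), the function name, and the `in_meters` flag are assumptions made for illustration; the actual NASA POWER and CDS responses are structured differently and need to be parsed first.

```python
from collections import defaultdict
from datetime import date


def annual_totals(daily, in_meters=False):
    """Sum daily precipitation by calendar year, returning totals in mm.

    daily: iterable of (date, value) pairs; a value of None marks a missing day.
    in_meters: when True, values are meters of water and are multiplied by 1000.
    """
    totals = defaultdict(float)
    for day, value in daily:
        if value is None:
            continue                       # missing days do not contribute
        totals[day.year] += value * 1000.0 if in_meters else value
    return dict(totals)


records = [
    (date(2020, 1, 1), 0.002),
    (date(2020, 6, 1), 0.010),
    (date(2020, 6, 2), None),
    (date(2021, 3, 5), 0.005),
]
totals_mm = annual_totals(records, in_meters=True)
```

The hypothetical records above give about 12 mm for 2020 and 5 mm for 2021.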

Quality rules

Valid values used for fitting

The fitting sample uses valid numerical precipitation accumulations. In formula form:

\[ \mathcal{B}=\{X_t: X_t \text{ is valid, finite, and inside the baseline period}\} \]

The Gamma and empirical SPI calculations are based on this valid baseline set. Missing or invalid values should not be used to estimate the distribution.
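The baseline set \(\mathcal{B}\) can be built with a simple filter. The function name and the dictionary input keyed by year are assumptions; the non-negativity check reflects the cleaning rule that precipitation is treated as a non-negative variable.

```python
import math


def baseline_sample(totals_by_year, start_year, end_year):
    """Keep finite, non-negative totals whose year lies inside the baseline period."""
    return [
        x
        for year, x in sorted(totals_by_year.items())
        if start_year <= year <= end_year
        and x is not None
        and math.isfinite(x)
        and x >= 0.0
    ]


totals = {1990: 500.0, 1991: float("nan"), 1992: None, 1993: 620.0, 2005: 480.0}
sample = baseline_sample(totals, 1990, 2000)   # [500.0, 620.0]
```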

References for methods

These references support the scientific methods described on this page.

  1. U.S. Drought Monitor. Drought Classification. National Drought Mitigation Center, University of Nebraska–Lincoln. https://droughtmonitor.unl.edu/About/AbouttheData/DroughtClassification.aspx
  2. U.S. Drought Monitor. What is the USDM? National Drought Mitigation Center, University of Nebraska–Lincoln. https://droughtmonitor.unl.edu/About/WhatistheUSDM.aspx
  3. McKee, T. B., Doesken, N. J., and Kleist, J. (1993). The relationship of drought frequency and duration to time scales. Proceedings of the 8th Conference on Applied Climatology.
  4. Edwards, D. C., and McKee, T. B. (1997). Characteristics of 20th century drought in the United States at multiple time scales. Colorado State University.
  5. Guttman, N. B. (1999). Accepting the Standardized Precipitation Index: a calculation algorithm. Journal of the American Water Resources Association.
  6. World Meteorological Organization (2012). Standardized Precipitation Index User Guide.
  7. Torrence, C., and Compo, G. P. (1998). A practical guide to wavelet analysis. Bulletin of the American Meteorological Society.
  8. Grinsted, A., Moore, J. C., and Jevrejeva, S. (2004). Application of the cross wavelet transform and wavelet coherence to geophysical time series. Nonlinear Processes in Geophysics.
  9. Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publications de l’Institut de Statistique de l’Université de Paris.
  10. Nelsen, R. B. (2006). An Introduction to Copulas. Springer.