Training AI Models to Predict Permeability of Porous Media

Problem Statement

Permeability is a fundamental property of a porous material that quantifies how easily fluid passes through it under a pressure gradient. It is the proportionality constant in Darcy's Law:

$$Q = \frac{k A \, \Delta p}{\mu L}$$

Q: flow rate · k: permeability · μ: viscosity · A: cross-sectional area · Δp: pressure drop · L: length
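As a worked illustration, Darcy's Law in its standard form, Q = k·A·Δp / (μ·L), can be rearranged to recover permeability from a measured flow rate. The sketch below assumes SI units throughout; the function name and the sample values are hypothetical and are not taken from the study.

```python
def permeability_from_darcy(q, mu, length, area, dp):
    """Rearranged Darcy's Law: k = Q * mu * L / (A * dp), all in SI units."""
    if area <= 0 or dp <= 0:
        raise ValueError("area and pressure drop must be positive")
    return q * mu * length / (area * dp)

# Illustrative values only (not from the study)
q = 1.0e-8      # flow rate, m^3/s
mu = 1.0e-3     # dynamic viscosity, Pa*s (roughly water)
length = 0.01   # sample length, m
area = 1.0e-4   # cross-sectional area, m^2
dp = 1000.0     # pressure drop, Pa

k = permeability_from_darcy(q, mu, length, area, dp)
print(f"k = {k:.3e} m^2")
```

With these inputs the result is about 1e-12 m², close to one darcy.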

Predicting permeability is important for groundwater flow, oil recovery, and filtration. Traditional methods based on direct flow simulation are slow, computationally expensive, and difficult to apply to complex structures.

Slow simulations — hours per sample
High computational cost — resource-intensive 3D flow
Complex structures — intricate, heterogeneous geometry

Goal

Develop a machine learning model that quickly and accurately predicts permeability, reducing computation time while maintaining accuracy.

Machine learning — data-driven approaches
Faster prediction — orders-of-magnitude speedup
Accurate results — geometric, topological, & network features

Method

Generated 1000 synthetic 3D porous structures composed of randomly distributed, overlapping spheres (diameter 5–15 voxels, porosity 0.5) using PuMA [1], and computed their permeability via direct flow simulation. Extracted pore network representations using the SNOW2 algorithm in PoreSpy [2], identifying pore centers (nodes) and throats (edges). Calculated structural features (diameter, diffusivity, surface area, tortuosity), topological features (persistent homology via GUDHI [3] alpha complexes and the HomCloud distance transform), and network features (connectivity, centrality, edge length), then used them to train a neural network.
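The network descriptors named above can be illustrated on a toy graph. The following is a minimal sketch, assuming pores are given as 3D coordinates and throats as pairs of pore indices. It computes the mean coordination number (degree), the largest degree centrality, and the mean throat (edge) length using only the standard library. The data layout, function name, and sample network are hypothetical; this is not the PoreSpy/SNOW2 output format, and the study may have used different centrality measures.

```python
import math
from statistics import mean

def network_features(pores, throats):
    """Summarize a pore network given node coordinates and edge index pairs."""
    n = len(pores)
    degree = [0] * n
    lengths = []
    for i, j in throats:
        degree[i] += 1
        degree[j] += 1
        lengths.append(math.dist(pores[i], pores[j]))
    # Degree centrality: fraction of the other nodes each node is linked to
    centrality = [d / (n - 1) for d in degree]
    return {
        "mean_coordination": mean(degree),
        "max_degree_centrality": max(centrality),
        "mean_throat_length": mean(lengths),
    }

# Hypothetical 4-pore network (coordinates in voxels)
pores = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (0, 4, 0)]
throats = [(0, 1), (1, 2), (2, 3), (0, 2)]
features = network_features(pores, throats)
print(features)
```

Summary statistics like these give a fixed-length feature vector per sample, which is what a neural network input layer requires regardless of how many pores each structure contains.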

Figure 1: Pipeline from synthetic structure generation (PuMA) through pore network extraction (SNOW2/PoreSpy), feature calculation, to ML model training.

Topological Feature Extraction

Figure 2: Alpha Complex Filtration

Balls grow around void-space points; simplices form when circumradius ≤ α, capturing connected components (H₀), loops (H₁), and cavities (H₂). Computed using GUDHI [3].
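To make the H₀ part of this filtration concrete, the sketch below tracks when connected components merge as balls grow around a set of points. It is a deliberately simplified stand-in, not the GUDHI alpha complex: it sorts all pairwise edges by length and merges components with a union-find structure (single-linkage clustering), recording each merge at a radius of half the edge length. Loops (H₁) and cavities (H₂) are not computed. The function name and sample points are hypothetical.

```python
import math
from itertools import combinations

def h0_death_radii(points):
    """Radii at which connected components merge as balls grow around points."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(dist / 2)  # two balls of radius dist/2 just touch
    return deaths

# Two tight pairs of points, far apart from each other
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (10.0, 4.0)]
deaths = h0_death_radii(points)
print(deaths)
```

Every component is born at radius 0, so each value in `deaths` corresponds to one finite H₀ bar; one component never dies. Large death values indicate well-separated clusters of points.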

Figure 3: HomCloud Distance Transform

Signed Euclidean distance transform on binary 3D images. Superlevel filtration reveals birth/death of topological features as threshold decreases.
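The idea can be sketched on a 1D binary "image" in place of the 3D volumes used here. The example below assumes a sign convention of positive distance inside void and negative inside solid (the convention in HomCloud may differ), computes the signed distance by brute force, and counts connected components of the superlevel set as the threshold decreases. All names and data are hypothetical, and HomCloud itself is not called.

```python
def signed_distance_1d(image):
    """Signed distance: positive inside void (1), negative inside solid (0)."""
    n = len(image)
    out = []
    for i, v in enumerate(image):
        # Distance to the nearest cell of the opposite phase
        d = min(abs(i - j) for j in range(n) if image[j] != v)
        out.append(d if v == 1 else -d)
    return out

def count_components(values, threshold):
    """Number of connected runs in the superlevel set {values >= threshold}."""
    count, inside = 0, False
    for v in values:
        if v >= threshold and not inside:
            count += 1
        inside = v >= threshold
    return count

image = [0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]  # 1 = void, 0 = solid
sdt = signed_distance_1d(image)
thresholds = sorted(set(sdt), reverse=True)
counts = [count_components(sdt, t) for t in thresholds]
for t, c in zip(thresholds, counts):
    print(f"threshold {t:>2}: {c} component(s)")
```

In this toy case the wider pore appears first (birth at 3), the narrower pore appears at 2, and the two merge once the threshold drops to -1. Birth values therefore encode pore size, and death values encode the solid barrier separating pores.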

Results and Conclusions

The ML model predicted 1000 permeabilities in 4.66 s vs 92,480 s with PuMA — a ~20,000× speedup.
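The speedup figure follows directly from the two reported timings. The short calculation below reproduces it and adds per-sample times, assuming both timings refer to the same 1000 samples; the variable names are illustrative.

```python
ml_time_s = 4.66        # reported ML prediction time, seconds
puma_time_s = 92_480    # reported PuMA direct simulation time, seconds
n_samples = 1000        # assumed to be the same sample set for both timings

speedup = puma_time_s / ml_time_s
print(f"speedup: {speedup:,.0f}x")
print(f"PuMA per sample: {puma_time_s / n_samples:.1f} s")
print(f"ML per sample:   {ml_time_s / n_samples * 1000:.2f} ms")
```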

Best accuracy (MAPE ≈ 4.7%) came from combining all features. Structural and topological features performed similarly: persistence features from alpha complexes and structural descriptors gave MAPE ≈ 4.71% and 4.76%, respectively. Network-based features from SNOW2 were faster to compute but less accurate (MAPE ≈ 6–12%).
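For reference, MAPE (mean absolute percentage error) averages the relative error of each prediction and expresses it in percent. A minimal sketch follows, with made-up permeability values rather than data from the study; the function name is hypothetical.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)

# Illustrative values only (not from the study)
k_true = [2.0, 4.0, 5.0, 10.0]
k_pred = [2.2, 3.8, 5.0, 9.5]
error = mape(k_true, k_pred)
print(f"MAPE = {error:.2f}%")
```

Because each error is divided by the true value, MAPE weights low- and high-permeability samples equally, which suits a quantity that can span orders of magnitude. It is undefined when a true value is zero.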

Figure 4: Model test error (%) vs. feature extraction time (s) [log scale]. Star marks PuMA direct computation baseline (92,480 s).

Societal Impact and Ethics

Fast, accurate permeability predictions benefit critical applications:

Groundwater management — monitoring aquifer flow, contaminant transport, and water resource sustainability
CO₂ sequestration — modeling subsurface carbon storage capacity and long-term seal integrity
Energy systems — optimizing oil/gas recovery, geothermal reservoir design, and hydrogen storage
Industrial filtration — designing efficient membrane and filter systems for water treatment

Ethical concerns include model generalizability across different porous media types, potential training data bias from synthetic-only datasets, and overreliance on ML predictions without experimental validation. Transparent reporting, cross-validation, and responsible deployment are essential.

Future Work

We plan to extend our method to more complex datasets, such as the coquina rock sample shown below [4]. These structures present greater computational challenges due to their size and heterogeneity. By applying our pipeline to real-world samples, we aim to investigate whether ML methods can robustly predict permeability despite increased structural complexity. We also plan to explore additional topological descriptors and ensemble methods to improve prediction accuracy.

Figure 5: Cross-section of man-made coquina sample [4]. Pixel brightness corresponds to solid fraction (lighter = higher). Image: 1000 × 1000 px, each 40 μm × 40 μm.

References

[1] Ferguson, J. C., et al. (2021). Update 3.0 to PuMA: The Porous Microstructure Analysis software. SoftwareX, 15, 100775.
[2] Gostick, J., et al. (2019). PoreSpy: A Python toolkit for quantitative analysis of porous media images. Journal of Open Source Software, 4(37), 1296.
[3] GUDHI library: Topological data analysis and geometric inference in higher dimensions. https://gudhi.inria.fr/
[4] M. Carvalho, PUC, Rio de Janeiro, Brazil. Private communication.