NJIT Grace Hopper AI Research Institute Seed Program
Tier 1 Spark Grant

Training AI Models to Predict Permeability of Porous Media

Catherin Neena Lalu, Manav Arora, Ebru Dagdelen, Aakash Karlekar, Matthew Illingworth, Jonathan Jaquette, Linda Cummings, Lou Kondic
Department of Mathematical Sciences, New Jersey Institute of Technology
NSF Grants DMS-2201627 and DMR-2410985
Problem Statement

Permeability is a fundamental property of a porous material that quantifies how easily fluid passes through it under a pressure difference. It is central to Darcy's Law:

$$ Q = \frac{k A \, \Delta p}{\mu L} $$

Q: flow rate · k: permeability · μ: viscosity · A: area · Δp: pressure drop · L: length
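As a small worked example of how these quantities relate, the sketch below evaluates Darcy's Law in the form Q = k A Δp / (μ L) and then rearranges it to recover k from a known flow rate. The function names and all numerical values are illustrative assumptions, not values from this study.

```python
def darcy_flow_rate(k, area, dp, mu, length):
    """Flow rate from Darcy's Law, Q = k * A * dp / (mu * L), in SI units."""
    return k * area * dp / (mu * length)


def permeability_from_flow(q, area, dp, mu, length):
    """Darcy's Law rearranged for permeability: k = Q * mu * L / (A * dp)."""
    return q * mu * length / (area * dp)


# Illustrative values only (assumed, not taken from the poster).
k = 1.0e-12      # permeability in m^2
area = 1.0e-4    # cross-sectional area in m^2
dp = 1.0e4       # pressure drop in Pa
mu = 1.0e-3      # viscosity in Pa*s (roughly water)
length = 0.01    # sample length in m

q = darcy_flow_rate(k, area, dp, mu, length)
k_recovered = permeability_from_flow(q, area, dp, mu, length)
print(f"Q = {q:.3e} m^3/s, recovered k = {k_recovered:.3e} m^2")
```

A direct flow simulation plays the role of the first function (it yields Q for a given geometry), and the permeability label used for training is then obtained as in the second.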

Predicting permeability is important for groundwater flow, oil recovery, and filtration. Traditional methods are slow, computationally expensive, and difficult to apply to complex structures.

  • Slow simulations — hours per sample
  • High computational cost — resource-intensive 3D flow
  • Complex structures — intricate, heterogeneous geometry
Goal

Develop a machine learning model that quickly and accurately predicts permeability, reducing computation time while maintaining accuracy.

  • Machine learning — data-driven approaches
  • Faster prediction — orders-of-magnitude speedup
  • Accurate results — geometric, topological, & network features
Method

We generated 1000 synthetic 3D porous structures composed of randomly distributed overlapping spheres (diameter 5–15 voxels, porosity 0.5) using PuMA [1] and computed their permeability via direct flow simulations. We extracted pore network representations using the SNOW2 algorithm in PoreSpy [2], which identifies pore centers (nodes) and throats (edges). We then calculated structural features (diameter, diffusivity, surface area, tortuosity), topological features (persistent homology via GUDHI [3] alpha complexes and the HomCloud distance transform), and network features (connectivity, centrality, edge length), and used them to train a neural network.
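To make the structure-generation step concrete, here is a minimal pure-Python sketch that places random overlapping solid spheres in a voxel grid until a target porosity is reached. It is a stand-in for illustration only and is not PuMA's generator; the function name, grid size, and seed are assumptions, while the radius range (2.5–7.5 voxels) and target porosity (0.5) follow the values quoted above.

```python
import random


def overlapping_spheres(size, target_porosity, r_min, r_max, seed=0):
    """Fill a size^3 grid with random overlapping solid spheres until the
    void fraction is at or below target_porosity.

    Returns (solid, porosity), where solid[x][y][z] is True for solid voxels.
    """
    rng = random.Random(seed)
    solid = [[[False] * size for _ in range(size)] for _ in range(size)]
    n_total = size ** 3
    n_solid = 0
    while 1.0 - n_solid / n_total > target_porosity:
        cx, cy, cz = (rng.uniform(0, size) for _ in range(3))
        r = rng.uniform(r_min, r_max)
        # Visit only the voxels inside the sphere's bounding box.
        for x in range(max(0, int(cx - r)), min(size, int(cx + r) + 1)):
            for y in range(max(0, int(cy - r)), min(size, int(cy + r) + 1)):
                for z in range(max(0, int(cz - r)), min(size, int(cz + r) + 1)):
                    inside = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r * r
                    if inside and not solid[x][y][z]:
                        solid[x][y][z] = True
                        n_solid += 1
    return solid, 1.0 - n_solid / n_total


# Assumed grid size of 32^3 voxels; diameters of 5-15 voxels as in the text.
grid, porosity = overlapping_spheres(32, 0.5, 2.5, 7.5, seed=0)
print(f"porosity = {porosity:.3f}")
```

Because whole spheres are added one at a time, the final porosity lands slightly below the target rather than exactly on it.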

Figure 1: Pipeline from synthetic structure generation (PuMA) through pore network extraction (SNOW2/PoreSpy), feature calculation, to ML model training.
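The network features named in the Method (connectivity, centrality, edge length) can be illustrated on a toy pore network. The sketch below is an assumption about what such descriptors might look like, not the feature set actually used; the pore coordinates, throat list, and feature names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy network: pore centers (nodes) and throats (edges).
pores = {0: (0.0, 0.0, 0.0), 1: (3.0, 0.0, 0.0), 2: (3.0, 4.0, 0.0), 3: (0.0, 4.0, 0.0)}
throats = [(0, 1), (1, 2), (2, 3), (0, 2)]


def network_features(pores, throats):
    """Summarize a pore network with a few simple graph descriptors."""
    degree = defaultdict(int)
    lengths = []
    for a, b in throats:
        degree[a] += 1
        degree[b] += 1
        lengths.append(math.dist(pores[a], pores[b]))  # center-to-center length
    n = len(pores)
    return {
        "mean_degree": sum(degree.values()) / n,  # connectivity
        "max_degree_centrality": max(degree.values()) / (n - 1),
        "mean_throat_length": sum(lengths) / len(lengths),
    }


features = network_features(pores, throats)
print(features)
```

Fixed-length summaries like these are one way to turn a variable-size network into an input vector for a neural network.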
Topological Feature Extraction
Figure 2: Alpha Complex Filtration
Balls grow around void-space points; simplices form when circumradius ≤ α, capturing connected components (H₀), loops (H₁), and cavities (H₂). Computed using GUDHI [3].
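The idea of growing balls can be sketched for the simplest case, H₀: two components merge when their balls first touch, that is, when the radius reaches half the distance between the points. The pure-Python example below tracks those merge radii with a union-find structure. It is a simplified illustration covering connected components only and does not use GUDHI; loops (H₁) and cavities (H₂) require the full alpha complex. The function name and sample points are assumptions.

```python
import math
from itertools import combinations


def h0_death_radii(points):
    """Radii at which connected components merge as balls grow around points.

    Every component is born at radius 0; one component never dies.
    """
    parent = list(range(len(points)))

    def find(i):
        # Find the component representative, with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d / 2.0)  # balls of radius d/2 touch at distance d
    return deaths


# Hypothetical 2D points: a tight cluster of three plus one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
deaths = h0_death_radii(points)
print(deaths)
```

The one large death radius reflects the isolated point, which is the kind of scale information a persistence diagram records.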
Figure 3: HomCloud Distance Transform
Signed Euclidean distance transform on binary 3D images. Superlevel filtration reveals birth/death of topological features as threshold decreases.
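A one-dimensional toy version shows how a superlevel filtration of a signed distance transform produces births and deaths. This sketch does not call HomCloud; the sign convention (positive in void, negative in solid), the helper names, and the sample array are assumptions chosen for illustration.

```python
def signed_distance_1d(solid):
    """Signed distance from each cell to the nearest cell of the other phase.

    Assumed convention: positive in void, negative in solid.
    """
    n = len(solid)
    out = []
    for i in range(n):
        d = min(abs(i - j) for j in range(n) if solid[j] != solid[i])
        out.append(-d if solid[i] else d)
    return out


def count_components(mask):
    """Number of maximal runs of True values in a 1D mask."""
    return sum(1 for i, m in enumerate(mask) if m and (i == 0 or not mask[i - 1]))


# 1 = solid, 0 = void: a wide pore and a narrow pore separated by solid.
solid = [1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1]
sdf = signed_distance_1d(solid)

thresholds = [3, 2, 1, -1]  # decreasing threshold, as in a superlevel filtration
counts = [count_components([v >= t for v in sdf]) for t in thresholds]
for t, c in zip(thresholds, counts):
    print(f"threshold {t:>2}: {c} component(s)")
```

Under this convention, the wide pore appears first at its center, the narrow pore is born at a lower threshold, and the two merge once the threshold drops into the solid phase, so birth values reflect pore size.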
Results and Conclusions

The ML model predicted 1000 permeabilities in 4.66 s vs 92,480 s with PuMA — a ~20,000× speedup.
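The arithmetic behind the quoted speedup can be checked directly from the two timings. The variable names below are illustrative; the numbers are the ones reported above.

```python
n_samples = 1000
t_ml = 4.66        # seconds for all ML predictions
t_puma = 92480.0   # seconds for all PuMA direct simulations

speedup = t_puma / t_ml
per_sample_ml = t_ml / n_samples
per_sample_puma = t_puma / n_samples

print(f"speedup: {speedup:.0f}x")
print(f"per sample: ML {per_sample_ml * 1e3:.2f} ms, PuMA {per_sample_puma:.2f} s")
```

The ratio comes out just under 20,000, consistent with the rounded figure quoted above.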

Best accuracy (MAPE ≈ 4.7%) came from combining all features. Structural and topological features performed similarly, while network features alone were faster but less accurate.

Persistence features from Alpha Complexes and structural descriptors produced nearly identical results (MAPE ≈ 4.71% and 4.76%). Network-based features from SNOW2 resulted in higher errors (~6–12%) but were faster to compute.
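For reference, the error metric can be computed as follows, assuming MAPE here denotes the usual mean absolute percentage error. The function name and the sample permeability values are hypothetical and are not results from this study.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical permeabilities in m^2: simulated reference vs. model prediction.
k_true = [1.0e-12, 2.0e-12, 4.0e-12]
k_pred = [1.1e-12, 1.9e-12, 4.0e-12]

error = mape(k_true, k_pred)
print(f"MAPE = {error:.2f}%")
```

Because each error is divided by the true value, MAPE weights samples equally even when permeabilities span a wide range of magnitudes.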

Figure 4: Model test error (%) vs. feature extraction time (s) [log scale]. Star marks PuMA direct computation baseline (92,480 s).
Societal Impact and Ethics

Fast, accurate permeability predictions benefit critical applications:

  • Groundwater management — monitoring aquifer flow, contaminant transport, and water resource sustainability
  • CO₂ sequestration — modeling subsurface carbon storage capacity and long-term seal integrity
  • Energy systems — optimizing oil/gas recovery, geothermal reservoir design, and hydrogen storage
  • Industrial filtration — designing efficient membrane and filter systems for water treatment

Ethical concerns include model generalizability across different porous media types, potential training data bias from synthetic-only datasets, and overreliance on ML predictions without experimental validation. Transparent reporting, cross-validation, and responsible deployment are essential.

Future Work

We plan to extend our method to more complex datasets, such as the coquina rock sample shown below [4]. These structures present greater computational challenges due to their size and heterogeneity. By applying our pipeline to real-world samples, we aim to investigate whether ML methods can robustly predict permeability despite increased structural complexity. We also plan to explore additional topological descriptors and ensemble methods to improve prediction accuracy.

Figure 5: Cross-section of man-made coquina sample [4]. Pixel brightness corresponds to solid fraction (lighter = higher). Image: 1000 × 1000 px, each 40 μm × 40 μm.
References
  1. Ferguson, J. C., et al. (2021). Update 3.0 to PuMA: The Porous Microstructure Analysis software. SoftwareX, 15, 100775.
  2. Gostick, J., et al. (2019). PoreSpy: A Python toolkit for quantitative analysis of porous media images. Journal of Open Source Software, 4(37), 1296.
  3. GUDHI library — Topological data analysis and geometric inference in higher dimensions. https://gudhi.inria.fr/
  4. M. Carvalho, PUC — Rio de Janeiro, Brazil — private communication.