CloudSR

ICLR 2026 Workshop on Machine Learning for Remote Sensing

Recovering Cloud Microstructures with
Cascaded Diffusion Inversion

Hanan Gani   Guy Pulik   Daniel Rosenfeld   Duncan Watson-Parris   Salman Khan

MBZUAI University of California, San Diego UAEREP
Two-stage diffusion inversion framework overview

Overview of the proposed two-stage framework. Stage 1 learns cross-sensor alignment from real paired data, and Stage 2 refines cloud structures using high-resolution supervision.

Abstract

High-resolution satellite imagery is important for observing fine-scale cloud structures that inform cloud microphysics analysis and weather-modification strategies. Existing super-resolution approaches are mostly designed for natural images and struggle to transfer to cross-sensor, multi-spectral cloud imagery. We propose a two-stage diffusion-based super-resolution framework that recovers high-resolution cloud microstructures by diffusion inversion.

Stage 1 trains on real paired data to learn robust degradation handling and inter-sensor alignment, while Stage 2 uses self-supervised downgrading of high-resolution data to refine structure and texture synthesis. Across both SEVIRI → VIIRS and MSG → MTG, the method improves reconstruction quality and visual fidelity over transformer and diffusion baselines.

Method

Stage 1: Cross-Sensor Alignment

The first stage is trained on real paired low- and high-resolution samples to learn a physically consistent mapping under temporal mismatch, geometric differences, and cross-sensor appearance shift.

Stage 2: Structural Refinement

The second stage uses high-resolution data with synthetic degradations so the supervision is perfectly aligned. This helps recover cloud filaments, boundaries, and fine-scale structures more reliably.

The main idea is to separate robust cross-sensor mapping from high-frequency structure recovery, then combine both stages in a cascaded inference pipeline.

Results

Qualitative Comparison

Qualitative comparison for SEVIRI to VIIRS and MSG to MTG

Compared with SwinIR, StableSR, and SinSR, the proposed method preserves sharper cloud structures while avoiding oversharpened artifacts.

Gradient Preservation

Gradient preservation ratio plot

The cascaded setup moves closest to the ideal gradient preservation ratio of 1.0.

Model SEVIRI → VIIRS MSG → MTG
PSNR ↑ Grad. ≈ 1 Percep ↓ PSNR ↑ Grad. ≈ 1 Percep ↓
SwinIR 18.37 0.19 0.45 15.91 0.53 0.75
StableSR 20.52 1.72 0.44 18.71 2.94 0.42
SinSR 20.69 0.46 0.37 24.0 0.96 0.30
Ours 21.25 1.06 0.28 24.0 1.03 0.29

Citation

@inproceedings{gani2026recovering,
  title     = {Recovering Cloud Microstructures with Cascaded Diffusion Inversion},
  author    = {Gani, Hanan and Pulik, Guy and Rosenfeld, Daniel and Watson-Parris, Duncan and Khan, Salman},
  booktitle = {ICLR 2026 Workshop on Machine Learning for Remote Sensing (ML4RS)},
  year      = {2026}
}