SpatialDataAgent: Autonomous Spatial Omics Data Curation at Decade Scale

Preprint (bioRxiv), created on 31 May 2026

Fragmented metadata in spatial omics archives has left large volumes of multimodal molecular-histological data inaccessible as 'dark data'. Here, we introduce SpatialDataAgent, an agentic workflow for autonomous spatial omics data curation that combines schema-constrained evidence evaluation with a self-refining standardization agent. Applied to a decade of GEO records, SpatialDataAgent identified 769 paired H&E-spatial transcriptomics (ST) datasets, a 6.4-fold expansion over existing manually curated baselines. Within the benchmarking window, the framework achieved a 141% increase in high-confidence (Class A) paired datasets. These datasets were automatically filtered and assembled into HESRT, a data lake containing 29.2 million spots/cells. Together, these results establish a blueprint for evidence-grounded autonomous curation of multimodal biomedical archives.

Ji, J.-H., Zou, Q., Cheng, J., She, Z., Hao, Y., Liu, W., Zhang, D., Wang, Z., Yu, J.-T., Yuan, Z.
