---
title: HAT Super-Resolution for Satellite Images
emoji: 🛰️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
---
# HATSAT - Super-Resolution for Satellite Images
This Hugging Face Space demonstrates a fine-tuned Hybrid Attention Transformer (HAT) model for satellite image super-resolution. The model performs 4x upscaling of satellite imagery, enhancing the resolution while preserving important geographical and structural details.
## Model Details
- Architecture: HAT (Hybrid Attention Transformer)
- Upscaling Factor: 4x
- Input Channels: 3 (RGB)
- Training: Fine-tuned on satellite imagery dataset
- Base Model: HAT model pre-trained on ImageNet
## Model Configuration
- Window Size: 16
- Embed Dimension: 180
- Depths: [6, 6, 6, 6, 6, 6]
- Number of Heads: [6, 6, 6, 6, 6, 6]
- Compress Ratio: 3
- Squeeze Factor: 30
- Overlap Ratio: 0.5
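For reference, this configuration maps directly onto the constructor of the `HAT` architecture in the official repository (`hat/archs/hat_arch.py`). Below is a minimal sketch; `mlp_ratio` and `upsampler` are not listed above and are assumptions taken from the repository's standard SRx4 config:

```python
# Minimal sketch: instantiating HAT with the configuration listed above.
# Requires the official HAT repository (github.com/XPixelGroup/HAT).
from hat.archs.hat_arch import HAT

model = HAT(
    upscale=4,                    # 4x super-resolution
    in_chans=3,                   # RGB input
    window_size=16,
    embed_dim=180,
    depths=[6, 6, 6, 6, 6, 6],
    num_heads=[6, 6, 6, 6, 6, 6],
    compress_ratio=3,
    squeeze_factor=30,
    overlap_ratio=0.5,
    mlp_ratio=2,                  # assumption: repo default for SRx4
    upsampler='pixelshuffle',     # assumption: standard HAT upsampler
)
```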
## Usage
1. Upload a satellite image (RGB format)
2. The model will automatically upscale it by 4x
3. Download the enhanced high-resolution result
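The Space can also be called programmatically. A hedged sketch with `gradio_client` follows; `user/hatsat` and `/predict` are placeholders, so check the Space's "Use via API" panel for the actual Space id and endpoint name:

```python
# Sketch: querying the Space from Python via gradio_client.
from gradio_client import Client, handle_file

client = Client("user/hatsat")  # placeholder Space id
result = client.predict(
    handle_file("low_res_satellite.png"),  # path to an RGB input image
    api_name="/predict",                   # placeholder endpoint name
)
print(result)  # local path of the 4x upscaled output image
```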
## Training Details
The model was fine-tuned using:
- Loss Function: L1Loss
- Optimizer: Adam (lr=2e-5)
- Training Iterations: 20,000
- Scheduler: MultiStepLR with milestones at [10000, 50000, 100000, 130000, 140000] (only the 10,000-iteration milestone falls within the 20,000-iteration run)
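Put together, this corresponds to the following PyTorch sketch. The decay factor `gamma=0.5` and the data loader are assumptions not stated above; `model` is the HAT instance from the configuration sketch:

```python
import torch

criterion = torch.nn.L1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer,
    milestones=[10000, 50000, 100000, 130000, 140000],
    gamma=0.5,  # assumption: typical BasicSR-style decay factor
)

for it, (lr_img, hr_img) in enumerate(loader):  # loader yields LR/HR pairs
    loss = criterion(model(lr_img), hr_img)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()          # schedule stepped per iteration
    if it + 1 == 20_000:      # 20,000 training iterations
        break
```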
## Applications
This model is particularly useful for:
- Enhancing low-resolution satellite imagery
- Geographic analysis and mapping
- Environmental monitoring
- Urban planning and development
- Agricultural monitoring
## Technical Implementation
The model implements several key architectural components:
- Hybrid Attention Blocks (HAB): Combining window-based self-attention with channel attention
- Overlapping Cross-Attention Blocks (OCAB): Enabling information exchange across neighboring windows via overlapping attention
- Residual Hybrid Attention Groups (RHAG): Stacked attention layers with residual connections
- Channel Attention Blocks (CAB): For feature refinement
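As a concrete example, here is a sketch of the CAB pattern, showing where the `compress_ratio` (3) and `squeeze_factor` (30) from the configuration enter the block. The structure follows `hat_arch.py` in the official repository, though names here are illustrative:

```python
# Sketch of the Channel Attention Block (CAB) pattern used in HAT.
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, num_feat, squeeze_factor=30):
        super().__init__()
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                           # global context
            nn.Conv2d(num_feat, num_feat // squeeze_factor, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(num_feat // squeeze_factor, num_feat, 1),
            nn.Sigmoid())                                      # per-channel weights

    def forward(self, x):
        return x * self.attention(x)

class CAB(nn.Module):
    def __init__(self, num_feat, compress_ratio=3, squeeze_factor=30):
        super().__init__()
        self.cab = nn.Sequential(
            nn.Conv2d(num_feat, num_feat // compress_ratio, 3, 1, 1),
            nn.GELU(),
            nn.Conv2d(num_feat // compress_ratio, num_feat, 3, 1, 1),
            ChannelAttention(num_feat, squeeze_factor))

    def forward(self, x):
        return self.cab(x)
```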
## Performance
The model was trained for 20,000 iterations, with PSNR and SSIM tracked on a satellite imagery validation set throughout training.
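For reproducing the evaluation, PSNR and SSIM can be computed per image pair; this sketch uses scikit-image, which is an assumption (the training framework's own metric implementations work equally well):

```python
# Sketch: per-image PSNR/SSIM evaluation with scikit-image.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(sr: np.ndarray, hr: np.ndarray) -> tuple[float, float]:
    """sr, hr: HxWx3 uint8 arrays (super-resolved output, ground truth)."""
    psnr = peak_signal_noise_ratio(hr, sr, data_range=255)
    ssim = structural_similarity(hr, sr, channel_axis=-1, data_range=255)
    return psnr, ssim
```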
## Acknowledgments
This model is a fine-tuned version of HAT (Hybrid Attention Transformer), trained on the SEN2NAIPv2 dataset.
### Base Model: HAT
- GitHub Repository: https://github.com/XPixelGroup/HAT
- Paper: Activating More Pixels in Image Super-Resolution Transformer
- Authors: Xiangyu Chen, Xintao Wang, Jiantao Zhou, Yu Qiao, Chao Dong
### Training Dataset: SEN2NAIPv2
- HuggingFace Dataset: https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2
- Description: High-resolution satellite imagery dataset for super-resolution tasks
## Citation
If you use this model in your research, please cite both the original HAT paper and the SEN2NAIPv2 dataset:
```bibtex
@article{chen2023hat,
  title={Activating More Pixels in Image Super-Resolution Transformer},
  author={Chen, Xiangyu and Wang, Xintao and Zhou, Jiantao and Qiao, Yu and Dong, Chao},
  journal={arXiv preprint arXiv:2205.04437},
  year={2022}
}

@misc{sen2naipv2,
  title={SEN2NAIPv2: A Large-Scale Dataset for Satellite Image Super-Resolution},
  author={TACO Foundation},
  year={2024},
  url={https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2}
}
```