---
title: HAT Super-Resolution for Satellite Images
emoji: 🛰️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
---

# HATSAT - Super-Resolution for Satellite Images

This Hugging Face Space demonstrates a fine-tuned **Hybrid Attention Transformer (HAT)** model for satellite image super-resolution. The model performs 4x upscaling of satellite imagery, enhancing the resolution while preserving important geographical and structural details.

## Model Details

- **Architecture**: HAT (Hybrid Attention Transformer)
- **Upscaling Factor**: 4x
- **Input Channels**: 3 (RGB)
- **Training**: Fine-tuned on satellite imagery dataset
- **Base Model**: HAT, pre-trained on ImageNet

## Model Configuration

- **Window Size**: 16
- **Embed Dimension**: 180
- **Depths**: [6, 6, 6, 6, 6, 6]
- **Number of Heads**: [6, 6, 6, 6, 6, 6]
- **Compress Ratio**: 3
- **Squeeze Factor**: 30
- **Overlap Ratio**: 0.5
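
The configuration above can be collected into a single dict. This is a sketch only: the keyword names mirror those used by the HAT constructor in the XPixelGroup/HAT repository, but the exact argument names accepted by a given release may differ.

```python
# Hypothetical configuration mirroring the values listed above.
# Key names follow the HAT repository's conventions but are not guaranteed
# to match the constructor signature of every release.
hat_config = {
    "upscale": 4,              # 4x super-resolution
    "in_chans": 3,             # RGB input
    "window_size": 16,
    "embed_dim": 180,
    "depths": [6, 6, 6, 6, 6, 6],        # blocks per RHAG
    "num_heads": [6, 6, 6, 6, 6, 6],     # attention heads per RHAG
    "compress_ratio": 3,
    "squeeze_factor": 30,
    "overlap_ratio": 0.5,
}
```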

## Usage

1. Upload a satellite image (RGB format)
2. The model will automatically upscale it by 4x
3. Download the enhanced high-resolution result
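
The Space can also be called programmatically. A minimal sketch using `gradio_client` is below; the Space ID and API endpoint name are placeholders, so substitute the actual values shown on this Space's "Use via API" page.

```python
def upscale_via_space(image_path, space_id="user/hatsat"):
    """Send an image to the Space and return the 4x upscaled result.

    `space_id` and `api_name` are placeholders; check the Space's
    "Use via API" page for the real values. Requires: pip install gradio_client
    """
    # Imported lazily so the helper below still works without the package.
    from gradio_client import Client, handle_file
    client = Client(space_id)
    return client.predict(handle_file(image_path), api_name="/predict")


def expected_output_size(width, height, scale=4):
    """A 4x model maps an input of (width, height) to (4*width, 4*height)."""
    return width * scale, height * scale
```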

## Training Details

The model was fine-tuned using:
- **Loss Function**: L1Loss
- **Optimizer**: Adam (lr=2e-5)
- **Training Iterations**: 20,000
- **Scheduler**: MultiStepLR with milestones at [10000, 50000, 100000, 130000, 140000]
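
Since training ran for 20,000 iterations, only the first milestone (10,000) was actually reached; the later milestones come from the base config. A small pure-Python sketch of the resulting schedule, assuming the common decay factor gamma=0.5 (the gamma used here is not stated above):

```python
def lr_at_iteration(it, base_lr=2e-5,
                    milestones=(10000, 50000, 100000, 130000, 140000),
                    gamma=0.5):
    """MultiStepLR: multiply the learning rate by gamma at each milestone passed.

    gamma=0.5 is an assumption (a common choice); the actual value used
    in training is not listed in this README.
    """
    passed = sum(1 for m in milestones if it >= m)
    return base_lr * (gamma ** passed)
```

For example, the learning rate starts at 2e-5 and halves to 1e-5 after iteration 10,000, the only decay step within the 20,000-iteration run.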

## Applications

This model is particularly useful for:
- Enhancing low-resolution satellite imagery
- Geographic analysis and mapping
- Environmental monitoring
- Urban planning and development
- Agricultural monitoring

## Technical Implementation

The model implements several key architectural components:
- **Hybrid Attention Blocks (HAB)**: Combining window-based and overlapping attention
- **Overlapping Cross-Attention Blocks (OCAB)**: For enhanced feature extraction
- **Residual Hybrid Attention Groups (RHAG)**: Stacked attention layers with residual connections
- **Channel Attention Blocks (CAB)**: For feature refinement
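
The channel-attention idea behind CAB can be shown in miniature: pool each channel to a scalar, squash it to a gate, and rescale the channel. This toy sketch omits the convolutions and the squeeze-and-excitation MLP (squeeze factor 30) that the real block uses; it only illustrates the gating mechanism.

```python
import math

def channel_attention(channels):
    """Toy channel attention: gate each 2D channel map by the sigmoid
    of its global mean.

    The real CAB in HAT computes the gate with a small conv/MLP
    (squeeze factor 30); this keeps only the pool -> gate -> rescale idea.
    """
    out = []
    for ch in channels:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        gate = 1.0 / (1.0 + math.exp(-mean))  # sigmoid of the pooled mean
        out.append([[v * gate for v in row] for row in ch])
    return out
```

Channels with stronger average activation receive a gate closer to 1 and pass through nearly unchanged, while weakly activated channels are suppressed.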

## Performance

The model was trained for 20,000 iterations, with PSNR and SSIM monitored on a satellite imagery validation set throughout training.

## Acknowledgments

This model is a fine-tuned version of **HAT (Hybrid Attention Transformer)**, trained on the **SEN2NAIPv2** dataset.

### Base Model: HAT
- **GitHub Repository**: [https://github.com/XPixelGroup/HAT](https://github.com/XPixelGroup/HAT)
- **Paper**: [Activating More Pixels in Image Super-Resolution Transformer](https://arxiv.org/abs/2205.04437)
- **Authors**: Xiangyu Chen, Xintao Wang, Jiantao Zhou, Yu Qiao, Chao Dong

### Training Dataset: SEN2NAIPv2
- **HuggingFace Dataset**: [https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2](https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2)
- **Description**: High-resolution satellite imagery dataset for super-resolution tasks

## Citation

If you use this model in your research, please cite both the original HAT paper and the SEN2NAIPv2 dataset:

```bibtex
@article{chen2023hat,
  title={Activating More Pixels in Image Super-Resolution Transformer},
  author={Chen, Xiangyu and Wang, Xintao and Zhou, Jiantao and Qiao, Yu and Dong, Chao},
  journal={arXiv preprint arXiv:2205.04437},
  year={2022}
}

@misc{sen2naipv2,
  title={SEN2NAIPv2: A Large-Scale Dataset for Satellite Image Super-Resolution},
  author={TACO Foundation},
  year={2024},
  url={https://huggingface.co/datasets/tacofoundation/SEN2NAIPv2}
}
```