ash12321 committed (verified)
Commit cab54a3 · Parent: cc156cd

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +74 -232

README.md CHANGED
@@ -1,142 +1,43 @@
- ---
- license: mit
- tags:
- - pytorch
- - autoencoder
- - deepfake-detection
- - cifar10
- - computer-vision
- - image-reconstruction
- - anomaly-detection
- datasets:
- - cifar10
- metrics:
- - mse
- library_name: pytorch
- pipeline_tag: image-feature-extraction
- ---
-
  # Residual Convolutional Autoencoder for Deepfake Detection

- ## Model Description
-
- This is a **5-stage Residual Convolutional Autoencoder** trained on CIFAR-10 for high-quality image reconstruction and deepfake detection. The model achieves exceptional reconstruction quality (Test MSE: 0.004290) with **100% detection rate** on out-of-distribution images at calibrated thresholds.
-
- ### Key Features
-
- ✨ **Exceptional Performance**: 98.4% loss reduction during training
- 🎯 **Perfect Detection**: 100% TPR with calibrated thresholds
- 🚀 **Fast Inference**: ~3,600 samples/sec on H100
- 📊 **Calibrated Thresholds**: Real thresholds from distribution analysis
- 📦 **Complete Package**: Model + thresholds + examples + docs
-
- ### Architecture
-
- - **Encoder**: 5 downsampling stages (128→64→32→16→8→4) with residual blocks
- - **Latent Dimension**: 512
- - **Decoder**: 5 upsampling stages with residual blocks
- - **Total Parameters**: 34,849,667
- - **Input Size**: 128x128x3 (RGB images)
- - **Output Range**: [-1, 1] (Tanh activation)
-
- ## Training Details
-
- ### Training Data
- - **Dataset**: CIFAR-10 (50,000 training images, 10,000 test images)
- - **Image Size**: Resized to 128x128
- - **Normalization**: Mean=0.5, Std=0.5 (range [-1, 1])
-
- ### Training Configuration
- - **GPU**: NVIDIA H100 80GB HBM3
- - **Batch Size**: 1024
- - **Optimizer**: AdamW (lr=1e-3, weight_decay=1e-5)
- - **Loss Function**: MSE (Mean Squared Error)
- - **Scheduler**: ReduceLROnPlateau (factor=0.5, patience=5)
- - **Epochs**: 100
- - **Training Time**: ~26 minutes
-
- ### Training Results
- - **Initial Validation Loss**: 0.266256 (Epoch 1)
- - **Final Validation Loss**: 0.004294 (Epoch 100)
- - **Final Test Loss**: 0.004290
- - **Improvement**: 98.4% reduction in loss
-
- ## Performance
-
- ### Reconstruction Quality
-
- | Metric | Value |
- |--------|-------|
- | Test MSE Loss | 0.004290 |
- | Validation MSE Loss | 0.004294 |
- | Training Time | 26.24 minutes |
- | Parameters | 34,849,667 |
- | GPU Memory | ~40GB peak |
- | Throughput | ~3,600 samples/sec |

- ### Detection Performance (Calibrated on Random Noise vs CIFAR-10)

- | Distribution | Mean Error | Median Error | Error Ratio |
- |-------------|-----------|--------------|-------------|
- | **Real Images (CIFAR-10)** | 0.004293 | 0.003766 | 1.00x |
- | **Fake Images (Random Noise)** | 0.401686 | 0.401680 | **93.56x** |
-
- **Separation Quality**: 93.56x ratio demonstrates excellent discrimination capability!
-
- ## Calibrated Detection Thresholds
-
- These thresholds are **scientifically calibrated** based on actual error distributions:
-
- | Threshold | MSE Value | True Positive Rate | False Positive Rate | Use Case |
- |-----------|-----------|-------------------|---------------------|----------|
- | **Strict** | 0.012768 | 100.0% | 1.0% | High-stakes verification |
- | **Balanced** | 0.009066 | 100.0% | 5.0% | General detection |
- | **Sensitive** | 0.009319 | 100.0% | 4.5% | Screening applications |
- | **Optimal** | 0.204039 | 100.0% | 0.0% | Maximum separation |
-
- 💡 **All thresholds achieve 100% detection** on out-of-distribution images while maintaining low false positive rates on real images.
-
- See `thresholds_calibrated.json` for complete calibration data and statistics.

  ## Quick Start
-
- ### Installation
-
- ```bash
- pip install torch torchvision huggingface_hub pillow
- ```
-
- ### Basic Usage
-
  ```python
  from huggingface_hub import hf_hub_download
- from model import load_model
  import torch
  from torchvision import transforms
  from PIL import Image
  import json

  # Download model and thresholds
  checkpoint_path = hf_hub_download(
-     repo_id="ash12321/deepfake-autoencoder-cifar10-v2",
-     filename="model_best_checkpoint.ckpt"
  )
-
- thresholds_path = hf_hub_download(
-     repo_id="ash12321/deepfake-autoencoder-cifar10-v2",
      filename="thresholds_calibrated.json"
  )

  # Load model
  device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
- model = load_model(checkpoint_path, device)

- # Load calibrated thresholds
- with open(thresholds_path, 'r') as f:
-     config = json.load(f)
- threshold = config['reconstruction_thresholds']['thresholds']['balanced']['value']
-
- print(f"Using threshold: {threshold:.6f}")

  # Prepare image
  transform = transforms.Compose([
@@ -146,142 +47,83 @@ transform = transforms.Compose([
  ])

  image = Image.open("your_image.jpg").convert('RGB')
- input_tensor = transform(image).unsqueeze(0).to(device)

- # Detect deepfake
  with torch.no_grad():
-     error = model.reconstruction_error(input_tensor, reduction='none')
-
- is_fake = error.item() > threshold
- print(f"Image is {'FAKE' if is_fake else 'REAL'}")
- print(f"Reconstruction error: {error.item():.6f}")
- print(f"Threshold: {threshold:.6f}")
  ```

- ## Reconstruction Examples
-
- ![Reconstruction Comparison](reconstruction_comparison.png)

- Original CIFAR-10 images (top) vs reconstructions (bottom) showing excellent quality.

- ![Threshold Calibration](threshold_calibration.png)

- Error distribution analysis showing clear separation between real and fake images.

- ## Files in This Repository

- - `model_best_checkpoint.ckpt` - Trained model weights (621 MB)
- - `model.py` - Model architecture and utilities
- - `thresholds_calibrated.json` - **Real calibrated thresholds** with statistics
- - `inference_example.py` - Complete working examples
- - `reconstruction_comparison.png` - CIFAR-10 reconstruction quality
- - `threshold_calibration.png` - Distribution analysis visualization
- - `config.json` - Model metadata
-
- ## Advanced Usage
-
- ### Using Calibrated Thresholds
-
- ```python
- import json
-
- # Load all threshold options
- with open('thresholds_calibrated.json', 'r') as f:
-     config = json.load(f)
-
- thresholds = config['reconstruction_thresholds']['thresholds']
-
- # Choose based on your use case
- strict_threshold = thresholds['strict']['value']      # 1% FPR
- balanced_threshold = thresholds['balanced']['value']  # 5% FPR
- optimal_threshold = thresholds['optimal']['value']    # 0% FPR
-
- print(f"Strict (99th percentile): {strict_threshold:.6f}")
- print(f"Balanced (95th percentile): {balanced_threshold:.6f}")
- print(f"Optimal (max separation): {optimal_threshold:.6f}")
- ```
-
- ### Batch Processing
-
- ```python
- # Process multiple images efficiently
- images = torch.stack([transform(Image.open(f)) for f in image_paths])
- images = images.to(device)
-
- with torch.no_grad():
-     errors = model.reconstruction_error(images, reduction='none')
-     fake_mask = errors > threshold
-
- num_fakes = fake_mask.sum().item()
- print(f"Detected {num_fakes}/{len(image_paths)} potential fakes")
-
- # Print individual results
- for path, error, is_fake in zip(image_paths, errors, fake_mask):
-     status = "FAKE" if is_fake else "REAL"
-     print(f"{path}: {status} (error: {error:.6f})")
- ```

- ### Calibration Statistics

- The model was calibrated using:
- - **Real Images**: CIFAR-10 test set (10,000 images)
- - **Fake Images**: Random noise (10,000 synthetic samples)
- - **Mean Separation**: 93.56x ratio
- - **Perfect Discrimination**: 100% TPR at all thresholds

- ## Applications

- - ✅ **Deepfake Detection**: 100% detection on out-of-distribution images
- - ✅ **Anomaly Detection**: Identify unusual or manipulated images
- - ✅ **Quality Assessment**: Measure image quality through reconstruction
- - ✅ **Feature Extraction**: 512-D latent representations
- - ✅ **Image Compression**: Compress to latent space
- - ✅ **Domain Shift Detection**: Identify distribution changes

- ## Limitations & Recommendations

- ### Limitations
- - Trained on CIFAR-10 (32x32 upscaled to 128x128)
- - Thresholds calibrated on random noise (not real deepfakes)
- - Performance may vary on high-resolution images
- - Requires fine-tuning for specific deepfake detection tasks

- ### Recommendations
- - **For Production**: Recalibrate thresholds on your target distribution
- - **For High-Res Images**: Consider fine-tuning on larger images
- - **For Real Deepfakes**: Calibrate with actual deepfake datasets
- - **For Best Results**: Use ensemble with other detection methods

  ## Citation
-
- If you use this model in your research, please cite:
-
  ```bibtex
- @misc{deepfake-autoencoder-cifar10-v2,
-   author = {ash12321},
-   title = {Residual Convolutional Autoencoder for Deepfake Detection},
-   year = {2024},
-   publisher = {HuggingFace},
-   howpublished = {\url{https://huggingface.co/ash12321/deepfake-autoencoder-cifar10-v2}}
  }
  ```

  ## License

- MIT License - See LICENSE file for details
-
- ## Model Card Authors
-
- - **ash12321**
-
- ## Acknowledgments
-
- - Trained on NVIDIA H100 80GB HBM3
- - Built with PyTorch 2.5.1
- - Thresholds calibrated using distribution analysis
-
- ---
-
- *Model trained and calibrated on December 08, 2025*
-
- **Status**: ✅ Production Ready with Calibrated Thresholds

  # Residual Convolutional Autoencoder for Deepfake Detection

+ Multi-dataset trained model with **19.18x separation** between real and fake images.

+ ## Model Performance

+ - **Training Time**: 21.4 minutes on an H200 GPU
+ - **Best Validation Loss**: 0.007970 (Epoch 29)
+ - **Anomaly Separation**: 19.18x (fake images have ~19x higher reconstruction error than real images)
+ - **Datasets**: CIFAR-10, CIFAR-100, STL-10 (205,000 training images)

  ## Quick Start
  ```python
  from huggingface_hub import hf_hub_download
  import torch
+ from model import ResidualConvAutoencoder
  from torchvision import transforms
  from PIL import Image
  import json

  # Download model and thresholds
  checkpoint_path = hf_hub_download(
+     repo_id="ash12321/deepfake-autoencoder-cifar10-v2",
+     filename="model_universal_best.ckpt"
  )
+ threshold_path = hf_hub_download(
+     repo_id="ash12321/deepfake-autoencoder-cifar10-v2",
      filename="thresholds_calibrated.json"
  )

  # Load model
  device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+ model = ResidualConvAutoencoder(latent_dim=512, dropout=0.1).to(device)
+ checkpoint = torch.load(checkpoint_path, map_location=device)
+ model.load_state_dict(checkpoint['model_state_dict'])
+ model.eval()

+ # Load calibrated thresholds
+ with open(threshold_path) as f:
+     thresholds = json.load(f)

  # Prepare image
  transform = transforms.Compose([

  ])

  image = Image.open("your_image.jpg").convert('RGB')
+ image_tensor = transform(image).unsqueeze(0).to(device)

+ # Get reconstruction error
  with torch.no_grad():
+     error = model.reconstruction_error(image_tensor)
+ error_value = error.item()
+ print(f"Reconstruction error: {error_value:.6f}")
+
+ # Check against threshold (balanced mode)
+ balanced_threshold = thresholds['reconstruction_thresholds']['thresholds']['balanced']['value']
+ if error_value > balanced_threshold:
+     print("⚠️ Potential deepfake detected!")
+ else:
+     print("✅ Image appears authentic")
  ```
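
+ For scoring many files at once, the sketch below adapts the batch-processing example from the previous revision of this README to the new checkpoint. It is a minimal sketch, not a file in this repository: it reuses `model`, `transform`, `device`, and `balanced_threshold` from the snippet above and assumes `reconstruction_error` accepts `reduction='none'` to return one error per image, as the earlier examples did.
+
+ ```python
+ import torch
+ from PIL import Image
+
+ image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]  # hypothetical file names
+
+ # Preprocess each image and stack into one batch tensor
+ batch = torch.stack([transform(Image.open(p).convert('RGB')) for p in image_paths]).to(device)
+
+ with torch.no_grad():
+     # Per-image reconstruction errors (assumes reduction='none' is supported)
+     errors = model.reconstruction_error(batch, reduction='none')
+
+ for path, err in zip(image_paths, errors):
+     verdict = "potential fake" if err.item() > balanced_threshold else "appears authentic"
+     print(f"{path}: {verdict} (error: {err.item():.6f})")
+ ```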
 
+ ## Detection Thresholds

+ Three calibrated threshold levels (a selection sketch follows the table):

+ | Mode | Threshold (reconstruction error) | False Positive Rate | Description |
+ |------|----------------------------------|---------------------|-------------|
+ | **Strict** | 0.055737 | ~1% | Very low false positives |
+ | **Balanced** | 0.039442 | ~5% | Recommended for general use |
+ | **Sensitive** | ~0.039 | ~2.5% | More sensitive detection |
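
+ To switch modes, read the matching entry from `thresholds_calibrated.json`. A minimal selection sketch, assuming each mode in the table is stored under a lowercase key with the same layout the Quick Start uses for `balanced`:
+
+ ```python
+ import json
+
+ with open("thresholds_calibrated.json") as f:
+     config = json.load(f)
+
+ modes = config['reconstruction_thresholds']['thresholds']
+
+ # Pick the mode that matches your tolerance for false positives
+ for name in ("strict", "balanced", "sensitive"):
+     if name in modes:  # key names assumed to mirror the table
+         print(f"{name}: {modes[name]['value']:.6f}")
+
+ threshold = modes['balanced']['value']  # recommended default
+ ```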
 
+ ## Model Architecture

+ - **Encoder**: 5 downsampling blocks (128→64→32→16→8→4)
+ - **Latent Space**: 512 dimensions
+ - **Decoder**: 5 upsampling blocks (4→8→16→32→64→128)
+ - **Residual Blocks**: Skip connections with dropout (0.1); an illustrative block is sketched below
+ - **Total Parameters**: ~40M
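
+ The exact implementation lives in `model.py`; the sketch below only illustrates the kind of residual downsampling block the bullets describe (a stride-2 convolution that halves the spatial size, a projection skip connection, and dropout 0.1). Layer choices and channel widths here are illustrative assumptions, not the repository's code.
+
+ ```python
+ import torch.nn as nn
+
+ class ResidualDownBlock(nn.Module):
+     """Illustrative residual block: halves H and W while keeping a skip path."""
+     def __init__(self, in_ch, out_ch, dropout=0.1):
+         super().__init__()
+         self.body = nn.Sequential(
+             nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
+             nn.BatchNorm2d(out_ch),
+             nn.ReLU(inplace=True),
+             nn.Dropout2d(dropout),
+             nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
+             nn.BatchNorm2d(out_ch),
+         )
+         # 1x1 projection so the skip path matches the downsampled shape
+         self.skip = nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=2)
+         self.act = nn.ReLU(inplace=True)
+
+     def forward(self, x):
+         return self.act(self.body(x) + self.skip(x))
+ ```
+
+ Five such blocks take a 128x128 input down to a 4x4 feature map, which the encoder maps to the 512-dimensional latent space; the decoder mirrors this with five upsampling blocks back to 128x128.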
 
+ ## Training Details

+ - **Epochs**: 30 (best at epoch 29)
+ - **Batch Size**: 1024
+ - **Optimizer**: AdamW (lr=1e-4, weight_decay=1e-5)
+ - **Scheduler**: Cosine annealing with warm restarts
+ - **Data Augmentation**: Horizontal flip, color jitter
+ - **Mixed Precision**: AMP enabled (training-loop sketch below)
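
+ A minimal training-loop sketch wiring these settings together. `model` and `train_loader` are placeholders for the autoencoder and the combined CIFAR-10/CIFAR-100/STL-10 loader, the MSE objective follows the previous revision of this README, and the scheduler's restart period (`T_0`) is an assumption since it is not documented here:
+
+ ```python
+ import torch
+
+ device = torch.device('cuda')  # training used a CUDA GPU
+ optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-5)
+ scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10)  # T_0 assumed
+ scaler = torch.amp.GradScaler()
+ criterion = torch.nn.MSELoss()
+
+ for epoch in range(30):
+     for images, _ in train_loader:  # labels are unused by the autoencoder
+         images = images.to(device)
+         optimizer.zero_grad(set_to_none=True)
+         with torch.amp.autocast(device_type='cuda'):  # mixed precision (AMP)
+             reconstruction = model(images)
+             loss = criterion(reconstruction, images)
+         scaler.scale(loss).backward()
+         scaler.step(optimizer)
+         scaler.update()
+     scheduler.step()
+ ```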
 
+ ## Statistics

+ ### Real Images
+ - Mean error: 0.018391
+ - Median error: 0.015647
+ - Std: 0.010279
+ - 95th percentile: 0.039442
+ - 99th percentile: 0.055737

+ ### Fake Images (Synthetic)
+ - Mean error: 0.352695
+ - Median error: 0.347151

+ **Separation Ratio**: 19.18x 🎯 (mean fake error / mean real error; calibration sketch below)
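
+ The strict and balanced thresholds in the table above are the 99th and 95th percentiles of the real-image error distribution (0.055737 and 0.039442, matching the percentiles listed under Real Images), and the separation ratio is the mean fake error divided by the mean real error. A minimal calibration sketch; the error arrays here are synthetic placeholders for errors actually measured with the model:
+
+ ```python
+ import numpy as np
+
+ # Placeholders standing in for per-image reconstruction errors measured
+ # on real and fake image sets with the trained model.
+ rng = np.random.default_rng(0)
+ real_errors = rng.normal(0.0184, 0.0103, 10_000).clip(min=0.0)
+ fake_errors = rng.normal(0.3527, 0.02, 10_000)
+
+ strict_threshold = np.percentile(real_errors, 99)     # ~1% false positive rate
+ balanced_threshold = np.percentile(real_errors, 95)   # ~5% false positive rate
+ separation = fake_errors.mean() / real_errors.mean()  # reported as 19.18x above
+
+ print(f"strict:     {strict_threshold:.6f}")
+ print(f"balanced:   {balanced_threshold:.6f}")
+ print(f"separation: {separation:.2f}x")
+ ```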
 
+ ## Files

+ - `model_universal_best.ckpt` - Full checkpoint (418 MB)
+ - `thresholds_calibrated.json` - Calibrated thresholds
+ - `model.py` - Model architecture
+ - `config.json` - Training configuration
+ - `README.md` - This file
 
  ## Citation

  ```bibtex
+ @misc{deepfake_autoencoder_2024,
+   title={Residual Convolutional Autoencoder for Deepfake Detection},
+   author={ash12321},
+   year={2024},
+   publisher={HuggingFace},
+   url={https://huggingface.co/ash12321/deepfake-autoencoder-cifar10-v2}
  }
  ```

  ## License

+ MIT License