---
license: other
title: DCLR_OPTIMISER_CIFAR-10
sdk: gradio
emoji: 🚀
colorFrom: yellow
colorTo: blue
short_description: Tuned DCLR showed superior accuracy vs. Lion and Adam
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/685edcb04796127b024b4805/BKs_WnksHY5pXPe1ciVt3.png
sdk_version: 6.0.0
---
# 🚀 DCLR Optimizer for CIFAR-10 Image Classification
This Hugging Face Space showcases the DCLR (Dynamic Consciousness-based Learning Rate) optimizer applied to a SimpleCNN model for image classification on the CIFAR-10 dataset.
## 🧠 Introduction to the DCLR Optimizer
The DCLR optimizer is a novel approach that dynamically adjusts the learning rate based on the model's 'consciousness' (represented by the entropy of its output activations) and the gradient norm. Unlike traditional optimizers with fixed or schedule-based learning rates, DCLR aims to adapt more intelligently to the training landscape. It's designed to promote faster convergence and potentially better generalization by moderating step sizes based on the certainty of predictions and the steepness of the loss surface.
In our analysis, the best-tuned DCLR configuration demonstrated superior test accuracy compared to Adam and to its original untuned configuration, while remaining competitive with Lion and DCLRAdam, highlighting its potential when properly configured.
## 💡 How to Use the Gradio Demo
- Upload an Image: Drag and drop an image (or click to upload) from your local machine into the designated area in the Gradio interface.
- Webcam/Sketch (Optional): If enabled, you might be able to use your webcam or draw an image directly.
- Get Predictions: The model will automatically process your image and display the top 3 predicted classes for the CIFAR-10 dataset along with their confidence scores.
Try uploading images of planes, cars, birds, cats, deer, dogs, frogs, horses, ships, or trucks to see how well the model classifies them!
## 🏗️ Model Architecture and Dataset
### Model: SimpleCNN
The model used is a SimpleCNN, a lightweight Convolutional Neural Network designed for basic image classification tasks. It consists of:
- Two convolutional layers (`nn.Conv2d`), each followed by a ReLU activation and max pooling.
- Fully connected layers (`nn.Linear`) to process the flattened feature maps and output class scores.
### Dataset: CIFAR-10
The model was trained on the CIFAR-10 dataset, which comprises 60,000 32x32 color images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images. The 10 classes are:
`plane`, `car`, `bird`, `cat`, `deer`, `dog`, `frog`, `horse`, `ship`, `truck`
## ⚙️ Hyperparameter Tuning and Optimal Parameters
Extensive hyperparameter tuning was performed for the DCLR optimizer using a grid search over different learning rates (`lr`) and `lambda_` values. The tuning process trained the model for a fixed number of epochs (5 for initial screening) with each combination and compared the resulting test accuracies.
Our analysis identified the following optimal hyperparameters for DCLR on this task:
- Learning Rate (`lr`): 0.1
- Lambda (`lambda_`): 0.1
This best-tuned DCLR configuration achieved a final test accuracy of 70.70% over 20 epochs, significantly outperforming the original DCLR configuration and optimizers such as Adam and DCLRConscious, while performing competitively with Lion and DCLRAdam.
## 📊 Performance Visualizations
Here are the performance plots comparing tuned DCLR against the other optimizers:

- Training Performance (Loss and Accuracy over Epochs)
- Final Test Accuracy Comparison
## 🙏 Acknowledgments
- The DCLR optimizer is inspired by research into dynamic learning-rate adaptation based on information theory.
- The CIFAR-10 dataset is provided by the Canadian Institute for Advanced Research.
- Thanks to Gradio and Hugging Face for providing an excellent platform for sharing ML demos.

