metadata
language: en
license: mit
library_name: sklearn
tags:
  - sklearn
  - gold-price-prediction
  - time-series
  - classification
  - financial-prediction
datasets:
  - custom
metrics:
  - accuracy
  - f1-score
  - roc-auc
model-index:
  - name: Gold Price Direction Predictor
    results:
      - task:
          type: classification
          name: Binary Classification
        dataset:
          type: custom
          name: Antam Gold Prices
        metrics:
          - type: accuracy
            value: 0.55
            name: Accuracy
          - type: f1
            value: 0.56
            name: F1 Score
          - type: roc_auc
            value: 0.58
            name: ROC AUC

Gold Price Direction Predictor

This model predicts the next-day direction of gold prices (up or down) based on historical Antam gold price data and technical indicators.

Model Description

  • Model Type: Binary Classification (Gradient Boosting / XGBoost / LightGBM)
  • Task: Predict whether gold price will go up or down the next day
  • Input: Feature vector with technical indicators (returns, lags, RSI, MACD, Bollinger Bands, etc.)
  • Output: Probability that the price goes up (0 to 1), converted to an up/down label with a tuned decision threshold (0.52 in the usage example)

Intended Uses & Limitations

Intended Uses

  • Financial analysis and decision support
  • Educational purposes for machine learning in finance
  • Research on gold price prediction

Limitations

  • Trained on historical Antam gold prices only
  • May not generalize to other markets or time periods
  • Holdout accuracy is about 55% (ROC AUC about 0.58), only slightly better than a coin flip and not reliable enough to act on alone
  • Requires up-to-date feature computation for real-time use

How to Use

Loading the Model

from huggingface_hub import hf_hub_download
from joblib import load

# Download model
model_path = hf_hub_download("theonegareth/GoldPricePredictor", "gold_direction_model.joblib")
model = load(model_path)

Making Predictions

The model expects a pandas DataFrame containing the same feature columns, with the same names, that were used in training.

import pandas as pd

# Example feature vector (you need to compute these from your data)
features = pd.DataFrame({
    'ret': [0.01],
    'log_ret': [0.00995],
    'ret_lag_1': [0.005],
    # ... all required features
})

# Predict probability of going up
proba_up = model.predict_proba(features)[:, 1]
prediction = (proba_up >= 0.52).astype(int)  # Using optimized threshold

Feature Engineering

To use this model, you need to compute the same features from your gold price data:

  • Daily returns and log returns
  • Lagged returns (1-5 days)
  • Rolling means and stds (3,5,10,20 days)
  • RSI (14-day)
  • MACD and signal
  • Bollinger Bands
  • Day of week and month

See the training notebooks for the complete add_features_adaptive function.
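
For orientation, the feature list above could be computed roughly as follows. This is a sketch under stated assumptions, not the repository's add_features_adaptive function: the helper name `add_basic_features`, most column names, the conventional MACD spans (12/26/9), the 20-day, 2-standard-deviation Bollinger Bands, the simple-moving-average RSI, and applying the rolling statistics to returns are all assumptions. Only `ret`, `log_ret`, and `ret_lag_1` appear in this README.

```python
import numpy as np
import pandas as pd


def add_basic_features(prices: pd.Series) -> pd.DataFrame:
    """Illustrative feature builder; column names are mostly hypothetical."""
    out = pd.DataFrame(index=prices.index)
    out["ret"] = prices.pct_change()
    out["log_ret"] = np.log(prices).diff()

    # Lagged returns, 1 to 5 days back.
    for lag in range(1, 6):
        out[f"ret_lag_{lag}"] = out["ret"].shift(lag)

    # Rolling mean and standard deviation of returns (assumed input series).
    for window in (3, 5, 10, 20):
        out[f"ret_mean_{window}"] = out["ret"].rolling(window).mean()
        out[f"ret_std_{window}"] = out["ret"].rolling(window).std()

    # 14-day RSI using simple moving averages of gains and losses.
    delta = prices.diff()
    avg_gain = delta.clip(lower=0).rolling(14).mean()
    avg_loss = (-delta.clip(upper=0)).rolling(14).mean()
    out["rsi_14"] = 100 - 100 / (1 + avg_gain / avg_loss)

    # MACD line and signal line with the conventional 12/26/9 spans.
    ema_fast = prices.ewm(span=12, adjust=False).mean()
    ema_slow = prices.ewm(span=26, adjust=False).mean()
    out["macd"] = ema_fast - ema_slow
    out["macd_signal"] = out["macd"].ewm(span=9, adjust=False).mean()

    # Bollinger Bands: 20-day mean plus/minus two standard deviations.
    middle = prices.rolling(20).mean()
    width = 2 * prices.rolling(20).std()
    out["bb_upper"] = middle + width
    out["bb_lower"] = middle - width

    # Calendar features taken from the DatetimeIndex.
    out["day_of_week"] = prices.index.dayofweek
    out["month"] = prices.index.month
    return out


# Synthetic daily prices stand in for the Antam series.
dates = pd.date_range("2024-01-01", periods=60, freq="D")
rng = np.random.default_rng(42)
prices = pd.Series(1_000_000 * np.exp(np.cumsum(rng.normal(0, 0.005, size=60))), index=dates)
features = add_basic_features(prices).dropna()  # drop indicator warm-up rows
```

The first 20 rows are dropped because the longest rolling window needs that much history. The real model will only work if the columns match the names and definitions used in the training notebooks.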

Training Data

  • Source: Antam historical gold prices (Indonesian market)
  • Period: [Insert date range from your data]
  • Features: 25+ technical indicators
  • Target: Next-day price direction (up=1, down=0)
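
The target definition above can be illustrated with a short sketch. The helper name `make_direction_target` is hypothetical, and treating an unchanged price as 0 is an assumption, since the README only defines up=1 and down=0.

```python
import pandas as pd


def make_direction_target(prices: pd.Series) -> pd.Series:
    """Label each day 1 if the next day's price is higher, else 0.

    Assumption: an unchanged next-day price is labeled 0.
    """
    next_price = prices.shift(-1)
    target = (next_price > prices).astype(int)
    # The last day has no next-day price, so it cannot be labeled.
    return target[next_price.notna()]


prices = pd.Series([100.0, 101.0, 101.0, 99.0, 102.0])
target = make_direction_target(prices)
print(target.tolist())  # [1, 0, 0, 1]
```

Features for a given day must be built only from prices up to that day, otherwise the label leaks into the inputs.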

Performance

Based on holdout testing:

  • Accuracy: ~0.55
  • F1 Score: ~0.56
  • ROC AUC: ~0.58

See the confusion matrix, ROC curve, and feature importance plots in the repository.
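
Metrics of this kind can be reproduced with scikit-learn, as in the sketch below. The helper `evaluate_direction` and the toy labels and probabilities are illustrative assumptions and are not the model's actual holdout data. The 0.52 threshold is taken from the usage example.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, roc_auc_score


def evaluate_direction(y_true, proba_up, threshold=0.52):
    """Score up/down predictions; ROC AUC uses the raw probabilities."""
    y_pred = (np.asarray(proba_up) >= threshold).astype(int)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, proba_up),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }


# Toy values for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
proba_up = [0.70, 0.40, 0.51, 0.60, 0.55, 0.30]
report = evaluate_direction(y_true, proba_up)
```

Accuracy and F1 depend on the chosen threshold, while ROC AUC does not, which is why the three numbers can move independently.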

Training Procedure

  1. Data preprocessing and feature engineering
  2. Time-series split for cross-validation
  3. Hyperparameter tuning with RandomizedSearchCV
  4. Model selection based on F1 score
  5. Threshold optimization for final predictions

Models compared: Gradient Boosting, XGBoost, LightGBM
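
Steps 2 to 5 might look roughly like the sketch below, shown for scikit-learn's GradientBoostingClassifier only. The synthetic data, the parameter grid, the split sizes, and the choice of F1 as the threshold-tuning objective are all assumptions. The README does not state the actual search space or which metric the threshold was optimized for.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit

# Synthetic, chronologically ordered data stands in for the real features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + rng.normal(scale=2.0, size=300) > 0).astype(int)

# Chronological split: older rows for tuning, later rows for the threshold.
X_train, y_train = X[:200], y[:200]
X_val, y_val = X[200:250], y[200:250]

# Randomized search scored by F1, with folds that never train on the future.
search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions={
        "n_estimators": [25, 50, 100],
        "max_depth": [2, 3],
        "learning_rate": [0.01, 0.05, 0.1],
    },
    n_iter=4,
    scoring="f1",
    cv=TimeSeriesSplit(n_splits=3),
    random_state=0,
)
search.fit(X_train, y_train)

# Pick the probability cutoff that maximizes F1 on the validation slice.
proba_val = search.best_estimator_.predict_proba(X_val)[:, 1]
thresholds = np.linspace(0.30, 0.70, 41)
scores = [f1_score(y_val, (proba_val >= t).astype(int), zero_division=0) for t in thresholds]
best_threshold = float(thresholds[int(np.argmax(scores))])
```

Tuning the threshold on a slice separate from the final holdout keeps the reported holdout metrics from being optimistically biased.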

Contact

For questions or issues, please open an issue on this repository.

License

MIT License