## Metric Description

The **VQAaccuracy** metric is used for evaluating the accuracy of visual question answering (VQA) models. It is designed to be robust to the variability in how different humans may phrase their answers. The accuracy for an answer (`ans`) predicted by the model is calculated as:

$$
\text{Acc}(ans) = \min\left(\frac{\#\,\text{humans that said } ans}{3},\ 1\right)
$$

This metric aligns with the official VQA evaluation by averaging the machine accuracies over all possible sets of human annotators.
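
To make the formula and the averaging concrete, here is a minimal, hypothetical sketch of how a single predicted answer could be scored against the human answers collected for one question. It is not this metric's actual source code: the function name is illustrative, and the leave-one-annotator-out loop follows the official VQA evaluation, which additionally normalizes answers (punctuation, contractions, and so on) before matching, a step omitted here.

```python
# Hypothetical sketch only -- not this metric's implementation.
def answer_accuracy(prediction: str, human_answers: list[str]) -> float:
    """Average min(#matching humans / 3, 1) over all leave-one-annotator-out sets."""
    scores = []
    for i in range(len(human_answers)):
        # Drop annotator i, then count exact matches among the remaining answers.
        others = human_answers[:i] + human_answers[i + 1:]
        matches = sum(ans == prediction for ans in others)
        scores.append(min(matches / 3, 1.0))
    return sum(scores) / len(scores)

# An answer given by many annotators saturates at the maximum score of 1.0.
print(answer_accuracy("yes", ["yes"] * 8 + ["no", "maybe"]))  # 1.0
```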

## How to Use

The **VQAaccuracy** metric can be used to evaluate the performance of a VQA model by comparing the predicted answers to a set of ground truth answers. The metric can be integrated into your evaluation pipeline as follows:

### Inputs
- **predictions** (`list` of `str`): The predicted answers generated by the VQA model.
- **references** (`list` of `list` of `str`): The ground truth answers for each question; each inner list contains the acceptable human answers for the corresponding prediction.

The accuracy values range from 0 to 100, with higher values indicating better performance.

Here is an example of how to use the **VQAaccuracy** metric:
```python
>>> from evaluate import load
>>> vqa_accuracy = load("Kamichanw/vqa_accuracy")
>>> predictions = ["yes", "2", "blue"]
>>> references = [["yes", "yeah", "yep"], ["2", "two"], ["blue", "bluish"]]
>>> results = vqa_accuracy.compute(predictions=predictions, references=references)
>>> print(results)
{"overall": 24.07}
```
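
As the output above shows, `compute` returns a dictionary; its `overall` entry holds the accuracy over the full set of predictions, reported as a value between 0 and 100.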

## Limitations and Bias