Kamichanw committed
Commit df20d76 · verified · 1 Parent(s): 9544977

Update README.md

Files changed (1)
  1. README.md +9 -18
README.md CHANGED
@@ -13,13 +13,11 @@ pinned: false
 
 ## Metric Description
 The **VQAaccuracy** metric is used for evaluating the accuracy of visual question answering (VQA) models. It is designed to be robust to the variability in how different humans may phrase their answers. The accuracy for an answer (`ans`) predicted by the model is calculated as:
-$$
-\text{Acc}(ans) = \min\left(\frac{\# \text{humans that said} ans}{3}, 1\right)
-$$
+$ \text{Acc}(ans) = \min\left(\frac{\# \text{humans that said } ans}{3}, 1\right) $
 This metric aligns with the official VQA evaluation by averaging the machine accuracies over all possible sets of human annotators.
 
 ## How to Use
-The **VQAaccuracy** metric can be used to evaluate the performance of a VQA model by comparing the predicted answers to a set of ground truth answers. The metric can be integrated into your evaluation pipeline as follows:
+The **VQAAccuracy** metric can be used to evaluate the performance of a VQA model by comparing the predicted answers to a set of ground truth answers. The metric can be integrated into your evaluation pipeline as follows:
 
 ### Inputs
 - **predictions** (`list` of `str`): The predicted answers generated by the VQA model.
@@ -39,20 +37,13 @@ The accuracy values range from 0 to 100, with higher values indicating better performance.
 Here is an example of how to use the **VQAaccuracy** metric:
 
 ```python
-from evaluate import load
-
-# Load the metric
-vqa_accuracy = load("Kamichanw/vqa_accuracy")
-
-# Example predictions and references
-predictions = ["yes", "2", "blue"]
-references = [["yes", "yeah", "yep"], ["2", "two"], ["blue", "bluish"]]
-
-# Compute the accuracy
-results = vqa_accuracy.compute(predictions=predictions, references=references)
-
-print(results)
-# Output: {"overall": 24.07}
+>>> from evaluate import load
+>>> vqa_accuracy = load("Kamichanw/vqa_accuracy")
+>>> predictions = ["yes", "2", "blue"]
+>>> references = [["yes", "yeah", "yep"], ["2", "two"], ["blue", "bluish"]]
+>>> results = vqa_accuracy.compute(predictions=predictions, references=references)
+>>> print(results)
+{"overall": 24.07}
 ```
 
 ## Limitations and Bias
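
The "averaging the machine accuracies over all possible sets of human annotators" step in the Metric Description can be sketched in a few lines. The snippet below is illustrative only, not the implementation shipped in Kamichanw/vqa_accuracy: it assumes the official VQA annotation format of 10 human answers per question, skips the official answer normalization, and the helper name `vqa_accuracy_single` is made up for this example.

```python
# Illustrative sketch only -- assumes 10 human answers per question (official
# VQA format) and skips the official answer-normalization step.
def vqa_accuracy_single(pred: str, human_answers: list[str]) -> float:
    """Average min(#matches / 3, 1) over every leave-one-out annotator subset."""
    scores = []
    for i in range(len(human_answers)):
        # Drop annotator i and count matches among the remaining answers.
        others = human_answers[:i] + human_answers[i + 1:]
        matches = sum(a == pred for a in others)
        scores.append(min(matches / 3.0, 1.0))
    return 100.0 * sum(scores) / len(scores)  # scaled to 0-100 like the metric

# Two of ten annotators said "yes": dropping a "no" annotator leaves 2 matches
# (score 2/3), dropping a "yes" leaves 1 (score 1/3), so the average is 60.0.
humans = ["yes"] * 2 + ["no"] * 8
print(round(vqa_accuracy_single("yes", humans), 2))  # 60.0
```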