#relevancy_attention
talbaumel
08/20/2017, 7:34 AM
See pretrained blackbox, DUC 5:
ROUGE-1:
  rouge_1_f_score:   0.0425, confidence interval (0.0371, 0.0481)
  rouge_1_recall:    0.0224, confidence interval (0.0195, 0.0254)
  rouge_1_precision: 0.5250, confidence interval (0.4830, 0.5623)
ROUGE-2:
  rouge_2_f_score:   0.0075, confidence interval (0.0061, 0.0089)
  rouge_2_recall:    0.0039, confidence interval (0.0032, 0.0046)
  rouge_2_precision: 0.1029, confidence interval (0.0846, 0.1231)
ROUGE-L:
  rouge_l_f_score:   0.0401, confidence interval (0.0352, 0.0453)
  rouge_l_recall:    0.0211, confidence interval (0.0184, 0.0240)
  rouge_l_precision: 0.4976, confidence interval (0.4599, 0.5334)
talbaumel
08/20/2017, 7:43 AM
Just relevant, DUC 5:
ROUGE-1:
  rouge_1_f_score:   0.0397, confidence interval (0.0349, 0.0449)
  rouge_1_recall:    0.0209, confidence interval (0.0183, 0.0237)
  rouge_1_precision: 0.5138, confidence interval (0.4735, 0.5481)
ROUGE-2:
  rouge_2_f_score:   0.0057, confidence interval (0.0047, 0.0066)
  rouge_2_recall:    0.0029, confidence interval (0.0024, 0.0034)
  rouge_2_precision: 0.0845, confidence interval (0.0691, 0.1008)
ROUGE-L:
  rouge_l_f_score:   0.0376, confidence interval (0.0331, 0.0426)
  rouge_l_recall:    0.0198, confidence interval (0.0173, 0.0225)
  rouge_l_precision: 0.4894, confidence interval (0.4488, 0.5244)
talbaumel
08/20/2017, 7:51 AM
Binary Oracle, DUC 5:
talbaumel
08/20/2017, 7:51 AM
ROUGE-1:
  rouge_1_f_score:   0.0465, confidence interval (0.0410, 0.0524)
  rouge_1_recall:    0.0246, confidence interval (0.0215, 0.0278)
  rouge_1_precision: 0.5284, confidence interval (0.4857, 0.5683)
ROUGE-2:
  rouge_2_f_score:   0.0080, confidence interval (0.0064, 0.0095)
  rouge_2_recall:    0.0042, confidence interval (0.0034, 0.0050)
  rouge_2_precision: 0.0997, confidence interval (0.0803, 0.1197)
ROUGE-L:
  rouge_l_f_score:   0.0437, confidence interval (0.0387, 0.0490)
  rouge_l_recall:    0.0231, confidence interval (0.0203, 0.0260)
  rouge_l_precision: 0.4985, confidence interval (0.4574, 0.5355)
talbaumel
08/20/2017, 7:55 AM
Her model seems to be better across all tests, but the ratios remain as reported.
talbaumel
08/20/2017, 8:30 AM
https://docs.google.com/spreadsheets/d/1c3osuzfkaeIp7mZgZVA-zhI4b5pv7zGLKkApAlzyHx0/edit?usp=sharing
talbaumel
08/20/2017, 9:56 AM
Added the same experiment, but comparing against the query instead of the model summaries. I have no idea why, but the scores are even better.