The 88th National Convention of IPSJ (Information Processing Society of Japan)

7Y-08
Image Caption Generation That Balances Diversity and Accuracy
○李 鵬月, 植田佳祐, 呉 偉 (Shizuoka University)
We propose a new framework for diverse beam search (DBS) and evaluate its effectiveness in neural image captioning. Our method introduces an adaptive, dynamically scheduled weight that balances diversity and accuracy throughout the decoding process. Experiments on the MS-COCO dataset show consistent improvements in both quality and diversity metrics over standard beam search and conventional DBS, so the generated captions are at once more accurate and more diverse.
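To make the idea of a step-dependent diversity weight concrete, the following is a minimal sketch of group-wise DBS with a Hamming diversity penalty whose strength changes per decoding step. It is not the authors' implementation: the abstract does not specify the schedule, so the linear decay in `linear_decay_weight`, the function names, and the toy model `toy_log_probs` are all illustrative assumptions.

```python
import math
from typing import Callable, List, Sequence, Tuple

Hypothesis = Tuple[List[int], float]  # (token ids, cumulative log-probability)


def linear_decay_weight(step: int, max_steps: int,
                        lam_start: float = 1.0, lam_end: float = 0.0) -> float:
    """Hypothetical schedule (an assumption, not from the paper):
    interpolate the diversity weight linearly from lam_start to lam_end."""
    if max_steps <= 1:
        return lam_end
    frac = step / (max_steps - 1)
    return lam_start + (lam_end - lam_start) * frac


def diverse_beam_search(
    log_probs_fn: Callable[[Sequence[int]], List[float]],
    vocab_size: int,
    num_groups: int,
    beams_per_group: int,
    max_steps: int,
    weight_fn: Callable[[int, int], float],
) -> List[List[Hypothesis]]:
    """Group-wise DBS with a Hamming penalty scaled by weight_fn(step, max_steps)."""
    groups: List[List[Hypothesis]] = [[([], 0.0)] for _ in range(num_groups)]
    for step in range(max_steps):
        lam = weight_fn(step, max_steps)
        # How often each token was already chosen at this step by earlier groups.
        counts = [0] * vocab_size
        for g in range(num_groups):
            candidates = []
            for tokens, score in groups[g]:
                lp = log_probs_fn(tokens)
                for tok in range(vocab_size):
                    new_score = score + lp[tok]
                    # The penalty affects ranking only; the stored score stays unpenalized.
                    penalized = new_score - lam * counts[tok]
                    candidates.append((penalized, new_score, tokens + [tok]))
            candidates.sort(key=lambda c: c[0], reverse=True)
            groups[g] = [(toks, s) for _, s, toks in candidates[:beams_per_group]]
            for toks, _ in groups[g]:
                counts[toks[-1]] += 1
    return groups


def toy_log_probs(prefix: Sequence[int]) -> List[float]:
    """Toy stand-in for a captioning model: a fixed 3-token distribution."""
    return [math.log(0.5), math.log(0.3), math.log(0.2)]


example = diverse_beam_search(
    toy_log_probs, vocab_size=3, num_groups=3, beams_per_group=1, max_steps=2,
    weight_fn=lambda step, n: linear_decay_weight(step, n, 5.0, 0.0),
)
print([hyps[0][0] for hyps in example])
```

With this toy model the strong early weight pushes the three groups onto different first tokens, and once the weight has decayed to zero every group takes the most likely token, giving `[[0, 0], [1, 0], [2, 0]]`. A constant `weight_fn` recovers conventional DBS, and a weight of zero reduces each group to standard beam search.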