Information Processing Society of Japan (IPSJ), 88th National Convention

6ZN-04
Speaker Diarization and Sentiment Analysis for Conversations Captured with 360-Degree Video
○何 承錦, Oky Dicky Ardiansyah Prima (Iwate Prefectural University)
We propose a multimodal analysis pipeline for 360-degree meeting environments that integrates audiovisual speaker diarization with emotion analysis. In contrast to conventional systems, the proposed method combines visual and audio embeddings with YOLOv8-based tracking to enable robust speaker identification in omnidirectional settings. The system synchronizes the audio and video streams to determine who is speaking and where that speaker is located, and from this it generates structured meeting records with corresponding affective evaluations. Experimental results indicate that the system performs reliably in complex, multi-speaker scenarios, offering a comprehensive framework for automated meeting documentation.
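As an illustration of the "who is speaking and where" step, the following is a minimal sketch and not the authors' implementation. It assumes that diarization has already produced time-stamped speech segments, that person tracking has produced per-frame box centers in an equirectangular frame, and that each observation carries a visual speaking score. The names `SpeechSegment`, `TrackObservation`, `mouth_activity`, `azimuth_deg`, and `assign_speaker_tracks` are all hypothetical and do not appear in the abstract.

```python
from dataclasses import dataclass


@dataclass
class SpeechSegment:
    speaker: str   # diarization label (hypothetical format)
    start: float   # seconds
    end: float     # seconds


@dataclass
class TrackObservation:
    track_id: int          # ID from a person tracker
    time: float            # seconds, on the same clock as the audio
    x_center: float        # pixel column of the box center
    mouth_activity: float  # assumed visual speaking score in [0, 1]


def azimuth_deg(x_center, frame_width):
    """Map an equirectangular pixel column to an azimuth in [-180, 180)."""
    return (x_center / frame_width) * 360.0 - 180.0


def assign_speaker_tracks(segments, observations, frame_width):
    """For each speech segment, pick the track with the highest mean
    speaking score during that segment and report its mean azimuth.

    Simplification: averaging x_center ignores wrap-around at the
    left/right seam of the 360-degree frame.
    """
    results = []
    for seg in segments:
        scores, columns = {}, {}
        for obs in observations:
            if seg.start <= obs.time <= seg.end:
                scores.setdefault(obs.track_id, []).append(obs.mouth_activity)
                columns.setdefault(obs.track_id, []).append(obs.x_center)
        if not scores:
            # No visual evidence overlaps this segment.
            results.append((seg.speaker, None, None))
            continue
        best = max(scores, key=lambda t: sum(scores[t]) / len(scores[t]))
        mean_x = sum(columns[best]) / len(columns[best])
        results.append((seg.speaker, best, azimuth_deg(mean_x, frame_width)))
    return results


if __name__ == "__main__":
    segments = [SpeechSegment("S0", 0.0, 2.0)]
    observations = [
        TrackObservation(1, 0.5, 960.0, 0.9),
        TrackObservation(1, 1.5, 960.0, 0.7),
        TrackObservation(2, 1.0, 2880.0, 0.1),
    ]
    print(assign_speaker_tracks(segments, observations, frame_width=3840))
```

Each result tuple pairs a diarization label with a track ID and a direction, which is the kind of record a structured meeting log could be built from. A real system would also need to handle overlapping speech and the frame seam.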