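FIT2022: The 21st Forum on Information Technology

Event Sessions

Top Conference Session 4-2: Computer Vision

September 14, 9:30-12:00
Event Venue 4

Chair: Yusuke Sugano (The University of Tokyo)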

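9:30-9:50 Talk (1) [Title] Normal Integration via Inverse Plane Fitting with Minimum Point-to-Plane Distance
Xu Cao (Specially Appointed Assistant Professor, Matsushita Laboratory, Department of Multimedia Engineering, Graduate School of Information Science and Technology, Osaka University)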
[Original publication] Xu Cao, Boxin Shi, Fumio Okura, and Yasuyuki Matsushita, "Normal Integration via Inverse Plane Fitting with Minimum Point-to-Plane Distance," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2382-2391, 2022.
[Abstract] This paper studies the recovery of a three-dimensional (3D) shape from its normal map, i.e., normal integration, via a geometric method. Unlike existing variational methods, we cast the normal integration problem as an inverse plane fitting problem, which minimizes the point-to-plane distances given the plane normals. As a result, our method can be more robust than traditional variational methods while remaining a convex least-squares formulation.
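	[Biography] Completed the doctoral program at the Graduate School of Information Science and Technology, Osaka University, in 2022. Specially Appointed Assistant Professor at the same graduate school since that year.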
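9:50-10:10 Talk (2) [Title] Divergence Optimization for Noisy Universal Domain Adaptation
Qing Yu (Aizawa Laboratory, Department of Information and Communication Engineering, Graduate School of Information Science and Technology, The University of Tokyo)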
[Original publication] Qing Yu, Atsushi Hashimoto, and Yoshitaka Ushiku, "Divergence Optimization for Noisy Universal Domain Adaptation," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2515-2524, 2021.
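[Abstract] This work proposes noisy universal domain adaptation, a problem setting that brings unsupervised domain adaptation closer to practical conditions, together with a method for solving it. To achieve universal domain adaptation while coping with unreliable annotations, the method uses two separate classifier branches and controls the divergence between their predictions, which yields high recognition performance.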
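	[Biography] Completed the master's program at the Graduate School of Interdisciplinary Information Studies, The University of Tokyo, in 2020. Currently enrolled in the doctoral program at the Aizawa Laboratory, Graduate School of Information Science and Technology, The University of Tokyo. JSPS Research Fellow (DC1). Conducts research on computer vision in the laboratory.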
10:10-10:30 Talk (3) [Title] Augmentation Strategies for Learning with Noisy Labels
Kento Nishi (Harvard University)
[Original publication] Nishi, Kento, et al. "Augmentation Strategies for Learning with Noisy Labels." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
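[Abstract] For algorithms that tackle the problem of learning with noisy labels, this paper proposes using one set of augmentations for loss modeling and analysis and a different set for learning, and demonstrates that doing so improves performance on every benchmark evaluated.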
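	[Biography] Born in Chiba Prefecture in 2004. Has lived in California, USA, since 2011. Began programming in the fourth grade of elementary school and, after entering high school, became interested mainly in AI/ML. Visiting researcher at the UCSB Four Eyes Lab since summer 2020. Papers co-written as lead author were accepted to AAAI and CVPR the following year. Graduated from Lynbrook High School in June 2022 and entered Harvard University in August of the same year.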
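10:30-10:50 Talk (4) [Title] Unsupervised Learning of Depth and Depth-of-Field Effect from Natural Images with Aperture Rendering Generative Adversarial Networks
Takuhiro Kaneko (Distinguished Researcher, NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation)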
[Original publication] Kaneko, T.: Unsupervised Learning of Depth and Depth-of-Field Effect from Natural Images with Aperture Rendering Generative Adversarial Networks, Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15679–15688 (2021).
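[Abstract] Aiming at unsupervised learning of depth and the depth-of-field effect from natural images, we propose a new deep generative model called Aperture Rendering Generative Adversarial Networks (AR-GAN). By incorporating the mechanism of a camera aperture, AR-GAN can represent the depth-of-field effect while imposing optical constraints. Experiments show that AR-GAN is effective on a variety of natural image datasets.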
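	[Biography] Completed the master's program at the Graduate School of The University of Tokyo in 2014. Joined Nippon Telegraph and Telephone Corporation the same year and was assigned to NTT Communication Science Laboratories. Completed the doctoral program at the Graduate School of The University of Tokyo in 2020. Ph.D. (Information Science and Technology). Distinguished Researcher at NTT Communication Science Laboratories since 2020. Conducts research on computer vision, signal processing, machine learning, and deep learning, with a focus on image generation, speech synthesis, and voice conversion. Recipient of awards including the Japan Society of Mechanical Engineers Hatakeyama Award, the ICPR Best Student Paper Award, the Speech Research Group Research Encouragement Award, and the University of Tokyo Graduate School Dean's Award.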
10:50-11:10 Talk (5) [Title] AutoDO: Efficient Automatic Optimization of Training Data Using Implicit Differentiation
Denis Gudovskiy (Senior Researcher, Panasonic AI Lab, USA)
[Original publication] Gudovskiy, D., Rigazio, L., Ishizaka, S., Kozuka, K., and Tsukizawa, S.: AutoDO: Robust AutoAugment for Biased Data With Label Noise via Scalable Probabilistic Implicit Differentiation, Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16601-16610 (2021).
[Abstract] In our AutoDO model, we explicitly estimate a set of per-point hyperparameters to flexibly change the distribution of the training data. In particular, we include hyperparameters for augmentation, loss weights, and soft labels, which are jointly estimated using implicit differentiation.
	[Biography] Denis Gudovskiy is a senior researcher at Panasonic AI Lab in Mountain View. He specializes in deep learning-based algorithms for AI applications. His portfolio of research projects includes optimization of deep neural networks for edge AI devices, explainable AI tools, and automatic dataset management for computer vision applications. Before joining Panasonic in 2016, Denis held research and engineering positions in the wireless divisions of Intel, Olympus, and Huawei. Denis received his M.Sc. in Computer Engineering from the University of Texas at Austin in 2008.
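11:10-11:30 Talk (6) [Title] What If We Use Only Real Data for Scene Text Recognition?
Jeonghun Baek (third-year doctoral student, Aizawa-Yamakata-Matsui Laboratory, Department of Information and Communication Engineering, Graduate School of Information Science and Technology, The University of Tokyo)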
[Original publication] Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa: What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels. Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7013-7022 (2021).
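[Abstract] In the field of Scene Text Recognition (STR), it used to be implicit common sense that real data were scarce (10,000 images or fewer) and that training on real data alone gave low accuracy. Today, all state-of-the-art methods are trained on a large synthetic corpus of 16M images, and real data are often not used for training at all and tend to be neglected. We considered that this implicit common sense was hindering research on using a small amount of real data efficiently, and we overturned it. We gathered the real data that have recently been increasing, 276K images in total, which is only 1.7% of the 16M synthetic images, and showed that a model can be trained adequately using real data alone.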
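	[Biography] Graduated from the Department of Mechanical Systems Engineering, Tokyo University of Agriculture and Technology, in 2014. Completed graduate studies at the Graduate School of Informatics, Kyoto University, in 2016. Joined NCSOFT Corp. in 2016 and carried out research and development in natural language processing. Took charge of scene text recognition at NAVER Corp. in 2018, and won first and third places at international scene text recognition competitions in 2019. Entered the doctoral program at the Graduate School of Information Science and Technology, The University of Tokyo, in 2020, and conducts research on scene text recognition.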