omukazu

About me

I'm Kazumasa Omura (大村和正), currently working for Nikkei Inc.
My research interests lie in discourse and practical applications of NLP.

Email: omukazu5313 at gmail.com

Education

2024/03/25 - 2021/04/01

Doctor

Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University

Doctoral Dissertation: "Studies on Data-Driven Discourse Relation Recognition toward Natural Language Understanding"
[webpage]
2021/03/23 - 2019/04/01

Master

Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University
2019/03/26 - 2015/04/01

Bachelor

Department of Electrical and Electronic Engineering, Faculty of Engineering, Kyoto University
High School

General Course of Ishikawa Prefectural Kanazawa Izumigaoka Senior High School

Experience

Present - 2024/04/01

Researcher

Nikkei Innovation Lab at Nikkei Inc.
2024/03/31 - 2022/04/01

JSPS DC Research Fellow (DC2)

Research Project: "Building a Commonsense Reasoning Model Considering Inference Process on Event Relational Knowledge"
[webpage]
2022/03/31 - 2021/04/01

Information/AI/Data Science Doctoral Fellowship

[webpage]
2024/03/31 - 2020/10/01

Student Intern

Nikkei Innovation Lab at Nikkei Inc.
2021/03/31 - 2020/07/01

OA (Office Assistant)

Language Media Lab at Kyoto University
Technical assistance work related to education and research support

Publications

Refereed - Journal Paper

JNLP

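「基本イベントに基づく常識推論データセットの構築と利用」 ("Construction and Use of a Commonsense Inference Dataset Based on Basic Events")
大村和正, 河原大輔, 黒橋禎夫
Journal of Natural Language Processing Vol.30 No.4, December 2023, pp. 1206-1239
Paper Award (論文賞) (4/38) [webpage] [news]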
[paper] [webpage]

Refereed - Conference Papers

LREC

COLING

long

poster

"An Empirical Study of Synthetic Data Generation for Implicit Discourse Relation Recognition"
Kazumasa Omura, Fei Cheng, and Sadao Kurohashi
In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Turin, Italy, pp. 1073–1085 (1,554/3,471)
[proceeding] [report]

ICCE

short

oral

"Toward Game-Based Learning of Japanese Writing for Elementary School Students"
Kazumasa Omura, Kei Kubo, Frederic Bergeron, and Sadao Kurohashi
In Proceedings of the 31st International Conference on Computers in Education (ICCE 2023), Shimane, Japan, pp. 655-660 (accepted as a short paper, 14/33)
[proceeding] [webpage] [app] [prototype]

COLING

long

virtual oral

"Improving Commonsense Contingent Reasoning by Pseudo-data and its Application to the Related Tasks"
Kazumasa Omura and Sadao Kurohashi
In Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022), October 2022, Gyeongju, Republic of Korea, pp. 812-823 (522/1,563)
Outstanding Paper Award (11/634) [ss] [closing] [news]
[proceeding] [webpage]

EMNLP

main

long

virtual oral

"A Method for Building a Commonsense Inference Dataset based on Basic Events"
Kazumasa Omura, Daisuke Kawahara, and Sadao Kurohashi
In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 2020, Online, pp. 2450–2460 (602/2,445)
[proceeding] [webpage]

----------

ACL

demo

"KWJA: A Unified Japanese Analyzer Based on Foundation Models"
Nobuhiro Ueda, Kazumasa Omura, Takashi Kodama, Hirokazu Kiyomaru, Yugo Murawaki, Daisuke Kawahara, and Sadao Kurohashi
In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations (ACL 2023 Demo), July 2023, Toronto, Canada, pp. 538-548 (58/155)
[proceeding] [webpage]

NLP-COVID19

"A System for Worldwide COVID-19 Information Aggregation"
Akiko Aizawa, Frederic Bergeron, Junjie Chen, Fei Cheng, Katsuhiko Hayashi, ..., Kazumasa Omura, ..., Masashi Toyoda, Nobuhiro Ueda, Honai Ueoka, Masao Utiyama, and Ying Zhong (in alphabetical order)
In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, November 2020, Online
[proceeding] [website]

COIN

"Diversity-aware Event Prediction based on a Conditional Variational Autoencoder with Reconstruction"
Hirokazu Kiyomaru, Kazumasa Omura, Yugo Murawaki, Daisuke Kawahara, and Sadao Kurohashi
In Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing (COIN), November 2019, Hong Kong, pp. 113-122
[proceeding]

Non-Refereed - Articles

JNLP

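『「基本イベントに基づく常識推論データセットの構築と利用」の研究過程』 ("The Research Process behind 'Construction and Use of a Commonsense Inference Dataset Based on Basic Events'")
大村和正
Journal of Natural Language Processing Vol.31 No.2, June 2024, pp. 748-754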
[article]

JNLP

"A Method for Building a Commonsense Inference Dataset based on Basic Events"
大村和正
Journal of Natural Language Processing Vol.28 No.1, March 2021, pp. 287-291
[article]

Non-Refereed - Others

Domestic

NLP

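「検索クエリログを用いない自然な質問のマイニングの検討」 ("A Study on Mining Natural Questions without Search Query Logs")
大村和正, 石原祥太郎
The 31st Annual Meeting of the Association for Natural Language Processing, March 2025, Nagasaki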
[proceeding]

NLP

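「極性と重要度を考慮した決算短信からの業績要因文の抽出」 ("Extracting Performance-Factor Sentences from Earnings Summaries Considering Polarity and Importance")
大村和正, 白井穂乃, 石原祥太郎, 澤紀彦
The 29th Annual Meeting of the Association for Natural Language Processing, March 2023, Okinawa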
[proceeding]

NLP

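「疑似問題による常識推論能力の改善と関連タスクへの効果」 ("Improving Commonsense Reasoning Ability with Pseudo-Problems and Its Effect on Related Tasks")
大村和正, 黒橋禎夫
The 28th Annual Meeting of the Association for Natural Language Processing, March 2022, Online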
[proceeding]

NLP

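「決算短信からの業績要因文の抽出に向けた業績発表記事からの訓練データの生成」 ("Generating Training Data from Earnings Announcement Articles toward Extracting Performance-Factor Sentences from Earnings Summaries")
大村和正, 白井穂乃, 石原祥太郎, 澤紀彦
The 28th Annual Meeting of the Association for Natural Language Processing, March 2022, Online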
[proceeding]

ICT Innovation

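「テキストからの蓋然的関係知識の獲得と計算機および人間の学習への活用」 ("Acquiring Contingent Relational Knowledge from Text and Applying It to Machine and Human Learning")
大村和正, 黒橋禎夫
The 16th Kyoto University ICT Innovation, February 2022, Online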

NLP

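「ことばつなぎゲーム:ゲーミフィケーションによる小学生の作文教育」 ("The Word-Connecting Game: Teaching Writing to Elementary School Students through Gamification")
大村和正, 久保圭, 黒橋禎夫
The 27th Annual Meeting of the Association for Natural Language Processing, March 2021, Online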
[proceeding]

NLP

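「基本イベントに基づく常識推論データセットの構築」 ("Construction of a Commonsense Inference Dataset Based on Basic Events")
大村和正, 河原大輔, 黒橋禎夫
The 26th Annual Meeting of the Association for Natural Language Processing, March 2020, Online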
[errata]

----------

YANS

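「単語・文・文書を統合的に扱う主観的な日本語難易度付きコーパスの構築に向けて」 ("Toward Building a Corpus with Subjective Japanese Difficulty Annotations That Handles Words, Sentences, and Documents in an Integrated Manner")
前川大輔, 大村和正, 樽本空宙, 石原祥太郎, 梶原智之
The 20th Symposium of the Young Researcher Association for NLP Studies (YANS2025), September 2025, Shizuoka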

NLP

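「テキスト生成モデルによる日本語形態素解析」 ("Japanese Morphological Analysis with Text Generation Models")
児玉貴志, 植田暢大, 大村和正, 清丸寛一, 村脇有吾, 河原大輔, 黒橋禎夫
The 29th Annual Meeting of the Association for Natural Language Processing, March 2023, Okinawa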
[proceeding]

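「KWJA: 汎用言語モデルに基づく日本語解析器」 ("KWJA: A Japanese Analyzer Based on General-Purpose Language Models")
植田暢大, 大村和正, 児玉貴志, 清丸寛一, 村脇有吾, 河原大輔, 黒橋禎夫
The 253rd Meeting of the IPSJ Special Interest Group on Natural Language Processing, September 2022, Kyoto
Excellent Research Award (優秀研究賞) (2/20) [webpage]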
[proceeding]