Joongwon (Daniel) Kim
I am a second-year Ph.D. student in the Natural Language Processing group at the University of Washington. I am thankful to be advised by Hannaneh Hajishirzi.
Previously, I was an undergrad at the University of Pennsylvania, working with Chris Callison-Burch and Mark Yatskar.
My research interests lie in natural language processing and machine learning, currently in improving open-source LLM performance on complex reasoning tasks with planning and tool usage.
I am supported by the NSF-GRFP Fellowship.
- 10/2023: Our work TaskWeb: Selecting Better Source Tasks for Multi-task NLP has been accepted to EMNLP 2023! New version coming soon.
- 05/2023: Our new preprint TaskWeb: Selecting Better Source Tasks for Multi-task NLP has been released!
- 09/2022: I began my Ph.D. at the University of Washington!
- 04/2022: I have been awarded the NSF-GRFP Fellowship (2022-27).
- 03/2022: I have been awarded the CSE Educators' Endowed Fellowship in Computer Science & Engineering from the Allen School.
- 12/2021: I have received an honorable mention in the CRA Outstanding Undergraduate Researcher Awards 2022.
Google Scholar
Publications / Pre-Prints
TaskWeb: Selecting Better Source Tasks for Multi-task NLP
Joongwon Kim, Akari Asai, Gabriel Ilharco, Hannaneh Hajishirzi
Proceedings of EMNLP, 2023 (long)
We introduce TaskWeb, our benchmark of pairwise task transfers between 22 different NLP tasks across three different model types, sizes, and adaptation methods.
Based on TaskWeb, we propose a new method TaskShop for estimating transferability between source and target tasks with only a small number of target examples.
We demonstrate that selecting helpful source tasks with our method allows us to perform multi-task learning on much smaller training sets and still improve zero-shot performance across various target tasks.
Induce, Edit, Retrieve: Language Grounded Multimodal Schema for Instructional Video Retrieval
Yue Yang, Joongwon Kim, Artemis Panagopoulou, Mark Yatskar, Chris Callison-Burch
ODRUM @ CVPR, 2022 (spotlight talk)
We built schemas for goal-oriented tasks by aligning YouTube videos with wikiHow steps. Then, we proposed methods for editing the schemas to handle unseen but related tasks.
Finally, we leveraged our schemas to perform instructional video retrieval on several datasets and demonstrated that our method improves over other retrieval approaches.
BiSECT: Learning to Split and Rephrase Sentences with Bitexts
Joongwon Kim*, Mounica Maddela*, Reno Kriz, Wei Xu, Chris Callison-Burch
Proceedings of EMNLP, 2021 (long)
We curated a multilingual corpus for sentence splitting by using machine translation over parallel corpora. Moreover, we developed a sentence splitter with controllable generation. We showed that our dataset and model outperformed existing methods in both automatic and human evaluations. Work done in collaboration with Georgia Tech.