Kalvin Chang 張郁騰 /ʈ͡ʂɑŋ˦ y˥˨ tʰəŋ˨˦/
Linguistically motivated NLP

BAIR
UC Berkeley
I am an incoming CS PhD student at Berkeley AI Research and Berkeley NLP, advised by Alane Suhr. My research aims to build data-efficient approaches to learning the long tail of language variation in both text and speech. I have published in top NLP and speech conferences, with 7 co-first-authored publications across ASR, NLP, and computational linguistics.
I was fortunate to work on computational linguistics and speech processing with David Mortensen and Shinji Watanabe at Carnegie Mellon, where I earned a Master’s of Language Technologies (Rank 1) and a BS in Computer Science (with University Honors).
news
Date | Update |
---|---|
Feb 13, 2025 | Accepted to the Toyota Technological Institute at Chicago’s CS PhD program, the University of Cambridge’s PhD programs in Engineering and in Computation, Cognition, and Language, the University of Edinburgh’s PhD in Informatics program, the University of Waterloo’s CS PhD program, and UC Berkeley’s CS PhD program. |
Jan 31, 2025 | Awarded a Gates Cambridge Scholarship, as one of 35 recipients selected from 600 US applicants. |
Dec 08, 2024 | Selected to attend the inaugural SDAIA Winter School on multi-modal LLMs as a Researcher to work on ASR for code-switching. |
Oct 17, 2024 | Presented four posters at the SANE 2024 Workshop [1] [2] [3] [4]. |
Sep 10, 2024 | Won Honorable Mention at the Interspeech 2024 Responsible Speech Foundation Models Special Session for “Self-supervised Speech Representations Still Struggle with African American Vernacular English” (Chang* et al., 2024). |
Aug 19, 2024 | Returned to CMU LTI as a Visiting Scholar in WAVLab and ChangeLingLab, advised by Professors Shinji Watanabe and David Mortensen. |
selected publications
- Self-supervised Speech Representations Still Struggle with African American Vernacular English. Interspeech, 2024
- Evaluating Self-supervised Speech Models on a Taiwanese Hokkien Corpus. In 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2023
- Transformed Protoform Reconstruction. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Jul 2023
- Automating Sound Change Prediction for Phylogenetic Inference: A Tukanoan Case Study. In Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change, Dec 2023
- WikiHan: A New Comparative Dataset for Chinese Languages. In Proceedings of the 29th International Conference on Computational Linguistics, Oct 2022
- Phonotactic Complexity across Dialects. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024
- PWESuite: Phonetic Word Embeddings and Tasks They Facilitate. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024