<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://dialect-erc.github.io//feed.xml" rel="self" type="application/atom+xml"/><link href="https://dialect-erc.github.io//" rel="alternate" type="text/html"/><updated>2026-01-26T09:15:11+00:00</updated><id>https://dialect-erc.github.io//feed.xml</id><title type="html">DIALECT</title><subtitle>Natural Language Understanding for non-standard languages and dialects </subtitle><entry><title type="html">Keynote Talk at ACL 2024</title><link href="https://dialect-erc.github.io//news/Keynote-Talk-at-ACL-2024" rel="alternate" type="text/html" title="Keynote Talk at ACL 2024"/><published>2024-08-14T00:00:00+00:00</published><updated>2024-08-14T00:00:00+00:00</updated><id>https://dialect-erc.github.io//news/Keynote-Talk-at-ACL-2024</id><content type="html" xml:base="https://dialect-erc.github.io//news/Keynote-Talk-at-ACL-2024">&lt;p&gt;At &lt;a href=&quot;https://2024.aclweb.org/program/keynotes/#barbara-plank&quot;&gt;ACL 2024&lt;/a&gt; Prof. Barbara Plank held a keynote presentation on the topic “Are LLMs Narrowing Our Horizon? Let’s Embrace Variation in NLP!”&lt;/p&gt; &lt;p&gt;While acknowledging the remarkable achievements in NLP and their increasing integration into society, Barbara highlighted concerns about the field becoming more homogeneous. She presented a compelling case for embracing variation across three essential dimensions: model inputs, outputs, and research approaches. This strategy, she argued, is key to developing more trustworthy and innovative human-facing NLP systems.&lt;/p&gt; &lt;p&gt;&lt;img src=&quot;/assets/img/news/acl-talk2.jpg&quot; alt=&quot;Keynote Talk at ACL 2024&quot; class=&quot;object-cover object-center w-full&quot; itemprop=&quot;image&quot; /&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">At ACL 2024 Prof. 
Barbara Plank held a keynote presentation on the topic “Are LLMs Narrowing Our Horizon? Let’s Embrace Variation in NLP!”</summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://dialect-erc.github.io//acl-logo.png"/><media:content medium="image" url="https://dialect-erc.github.io//acl-logo.png" xmlns:media="http://search.yahoo.com/mrss/"/></entry><entry><title type="html">Natural Language Processing For Bavarian</title><link href="https://dialect-erc.github.io//news/Natural-Language-Processing-for-Bavarian" rel="alternate" type="text/html" title="Natural Language Processing For Bavarian"/><published>2024-04-17T00:00:00+00:00</published><updated>2024-04-17T00:00:00+00:00</updated><id>https://dialect-erc.github.io//news/Natural-Language-Processing-for-Bavarian</id><content type="html" xml:base="https://dialect-erc.github.io//news/Natural-Language-Processing-for-Bavarian">&lt;p&gt;We are proud to present our recent research on NLP for Bavarian / &lt;strong&gt;NLP fi Bairisch&lt;/strong&gt;!&lt;/p&gt; &lt;p&gt;Dialects have long been a blind spot for NLP research, which has focused largely on the ‘standard’ language variant(s). With this project, we aim to help close this gap.&lt;/p&gt; &lt;p&gt;Accepted works to appear at LREC-COLING 2024 in Turin this year:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt;Siyao Peng, Zihang Sun, Huangyan Shan, Marie Kolm, Verena Blaschke, Ekaterina Artemova and Barbara Plank. &lt;em&gt;Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data.&lt;/em&gt; In LREC-COLING 2024.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Verena Blaschke, Barbara Kovačić, Siyao Peng, Hinrich Schütze and Barbara Plank. &lt;em&gt;MaiBaam: A Multi-Dialectal Bavarian Universal Dependency Treebank.&lt;/em&gt; In LREC-COLING 2024.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Miriam Winkler, Virginija Juozapaityte, Rob van der Goot and Barbara Plank. 
&lt;em&gt;Slot and Intent Detection Resources for Bavarian and Lithuanian: Assessing Translations vs Natural Queries to Digital Assistants.&lt;/em&gt; In LREC-COLING 2024.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt;</content><author><name></name></author><summary type="html">We are proud to present our recent research on NLP for Bavarian / NLP fi Bairisch!</summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://dialect-erc.github.io//BavarianNLP.png"/><media:content medium="image" url="https://dialect-erc.github.io//BavarianNLP.png" xmlns:media="http://search.yahoo.com/mrss/"/></entry><entry><title type="html">Survey: Corpora For Germanic Low Resource Language Varieties</title><link href="https://dialect-erc.github.io//news/Survey-Corpora-for-Germanic-low-resource-language-varieties" rel="alternate" type="text/html" title="Survey: Corpora For Germanic Low Resource Language Varieties"/><published>2023-05-01T00:00:00+00:00</published><updated>2023-05-01T00:00:00+00:00</updated><id>https://dialect-erc.github.io//news/Survey:-Corpora-for-Germanic-low-resource-language-varieties</id><content type="html" xml:base="https://dialect-erc.github.io//news/Survey-Corpora-for-Germanic-low-resource-language-varieties">&lt;p&gt;What corpora are available for Germanic low-resource language varieties?&lt;/p&gt; &lt;p&gt;We presented a survey and &lt;a href=&quot;https://github.com/mainlp/germanic-lrl-corpora&quot;&gt;repository&lt;/a&gt; for Germanic low-resource language varieties at NoDaLiDa 2023 in Tórshavn, Faroe Islands on May 22nd-24th, 2023:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;Verena Blaschke, Hinrich Schütze and Barbara Plank. 
&lt;a href=&quot;https://aclanthology.org/2023.nodalida-1.41/&quot;&gt;A Survey of Corpora for Germanic Low-Resource Languages and Dialects.&lt;/a&gt; In NoDaLiDa 2023.&lt;/li&gt; &lt;/ul&gt;</content><author><name></name></author><summary type="html">What corpora are available for Germanic low-resource language varieties?</summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://dialect-erc.github.io//GerLowResourceVarieties.png"/><media:content medium="image" url="https://dialect-erc.github.io//GerLowResourceVarieties.png" xmlns:media="http://search.yahoo.com/mrss/"/></entry><entry><title type="html">Language Technologies For Digital Inclusion</title><link href="https://dialect-erc.github.io//news/Language-technologies-for-digital-inclusion" rel="alternate" type="text/html" title="Language Technologies For Digital Inclusion"/><published>2023-04-29T00:00:00+00:00</published><updated>2023-04-29T00:00:00+00:00</updated><id>https://dialect-erc.github.io//news/Language-technologies-for-digital-inclusion</id><content type="html" xml:base="https://dialect-erc.github.io//news/Language-technologies-for-digital-inclusion">&lt;p&gt;Featured in the LMU news: “Barbara Plank researches natural language processing (NLP) at LMU. She works on language technologies and artificial intelligence with a strong focus on human concerns.”&lt;/p&gt; &lt;p&gt;Read the article in English: &lt;a href=&quot;https://www.lmu.de/en/newsroom/news-overview/news/language-technologies-for-digital-inclusion.html&quot;&gt;Language technologies for digital inclusion&lt;/a&gt; or in German: &lt;a href=&quot;https://www.lmu.de/de/newsroom/newsuebersicht/news/sprachtechnologien-fuer-die-digitale-teilhabe.html&quot;&gt;Sprachtechnologien für die digitale Teilhabe&lt;/a&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">Featured in the LMU news: “Barbara Plank researches natural language processing (NLP) at LMU. 
She works on language technologies and artificial intelligence with a strong focus on human concerns.”</summary></entry><entry><title type="html">On Ground Truth In Machine Learning: Human Label Variation</title><link href="https://dialect-erc.github.io//news/On-Ground-Truth-in-machine-learning-Human-Label-Variation" rel="alternate" type="text/html" title="On Ground Truth In Machine Learning: Human Label Variation"/><published>2022-12-13T00:00:00+00:00</published><updated>2022-12-13T00:00:00+00:00</updated><id>https://dialect-erc.github.io//news/On-Ground-Truth-in-machine-learning:-Human-Label-Variation</id><content type="html" xml:base="https://dialect-erc.github.io//news/On-Ground-Truth-in-machine-learning-Human-Label-Variation">&lt;p&gt;The problem of &lt;em&gt;human label variation&lt;/em&gt; arises in AI when human annotators assign different valid labels to the same item. This is a ubiquitous problem in AI in general, and it is especially pronounced in problems where language is involved, as language is ambiguous, among other reasons. Yet most AI systems today are trained on the assumption that there exists a single &lt;em&gt;ground truth&lt;/em&gt;, that is, a single valid interpretation per item.&lt;/p&gt; &lt;p&gt;We presented &lt;a href=&quot;https://twitter.com/MaiNLPlab/status/1600795488605073409&quot;&gt;several papers at EMNLP 2022&lt;/a&gt; in Abu Dhabi which challenge this assumption of a single ground truth and look at human label variation. Here are some selected highlights:&lt;/p&gt; &lt;ul&gt; &lt;li&gt; &lt;p&gt;Barbara Plank. &lt;a href=&quot;https://aclanthology.org/2022.emnlp-main.731/&quot;&gt;&lt;em&gt;The “Problem” of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation.&lt;/em&gt;&lt;/a&gt; In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP 2022.&lt;/p&gt; &lt;/li&gt; &lt;li&gt; &lt;p&gt;Joris Baan, Wilker Aziz, Barbara Plank and Raquel Fernández. 
&lt;a href=&quot;https://aclanthology.org/2022.emnlp-main.124/&quot;&gt;&lt;em&gt;Stop Measuring Calibration When Humans Disagree.&lt;/em&gt;&lt;/a&gt; In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP 2022.&lt;/p&gt; &lt;/li&gt; &lt;/ul&gt; &lt;p&gt;The first is a position paper on the problem of human label variation. The second looks at calibration through the lens of human label variation: Calibration is a popular framework to evaluate whether a neural network knows when it does not know, i.e., whether its predictive probabilities are a good indication of how likely a prediction is to be correct. Correctness is commonly estimated against the human majority class (a single ground truth). What does this mean in light of human label variation? Read more here:&lt;/p&gt; &lt;p&gt;&lt;a href=&quot;https://twitter.com/mxmeij/status/1601832608073388032&quot;&gt;Image credits to Max Müller-Eberstein&lt;/a&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">The problem of human label variation arises in AI when human annotators assign different valid labels to the same item. This is a ubiquitous problem in AI in general, and it is especially pronounced in problems where language is involved, as language is ambiguous, among other reasons. Yet most AI systems today are trained on the assumption that there exists a single ground truth, that is, a single valid interpretation per item.</summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://dialect-erc.github.io//mme-jbaan2022.jpg"/><media:content medium="image" url="https://dialect-erc.github.io//mme-jbaan2022.jpg" xmlns:media="http://search.yahoo.com/mrss/"/></entry></feed>