End-to-End Low-Resource Speech Translation for Swiss German Dialects (E2E_SG)

Lay summary

Content and goal of the research project

Our goal is to find out how Speech Translation (ST) can best be applied to the various Swiss German dialects. To this end, we will first create a controlled dataset by collecting the same data for several dialects. This dataset forms the basis for experiments in which we aim to answer, among others, the following questions:

How much training data is needed to train an ST system for Swiss German?
What differences exist between the dialects with regard to the achievable transcription accuracy?
Is it better to train a single global ST system, or are dialect-specific systems more appropriate?
How can we make data for one particular dialect (e.g. Zurich German) usable for other dialects (e.g. St. Gallen German)?
How do we handle dialect-specific vocabulary?

Scientific and societal context of the research project

Our work will generate important insights into the recognition of spoken Swiss German by means of ST. This will make it possible to develop speech-processing systems for the Swiss German dialects that can be used, for example, for transcribing meetings, for voice assistants, or for subtitling.

Abstract

This project investigates the application of recent findings in Speech Translation to Swiss German. Speech Translation (ST) is the task of translating spoken utterances in one language into written text in a different language. It serves as an essential tool for breaking down language barriers in various communication settings and is a promising means of preserving endangered languages. We will investigate how ST can be applied to Swiss German dialects, i.e. to translate speech in Swiss German into text in Standard German. This has numerous important real-life applications, e.g. voice bots such as Siri or Alexa, interview transcription, generating meeting protocols, evaluation of call-center dialogues, etc. The rationale behind transcribing to Standard German is that this will provide a unified written form and will allow us to benefit from the abundance of NLP methods that exist for Standard German as a well-studied high-resource language (e.g. part-of-speech tagging, named entity recognition, sentiment analysis, summarisation, etc.). ST is an appealing approach for Swiss German, since it does not rely on an intermediate textual representation in the source language (note that there is no unified written form for Swiss German). However, it usually requires a large amount of annotated training data, which is not available for the numerous Swiss German dialects. For this reason, we will investigate how ST systems could be built for Swiss German dialects without the need to generate annotated data for each dialect. More precisely, we will

1. create a large-scale parallel corpus of 450 hours of audio in 7 major Swiss dialects and corresponding translations in Standard German text. This corpus is generated in a fully controlled environment, allowing it to be used for scientifically sound experiments on Swiss German dialects.
2. implement 3 ST systems and run experiments on how to optimally train an ST system for different dialects.
3. investigate how to translate training data between different dialects of Swiss German, using Speech-to-Speech translation, Vocabulary Enhancement and Voice Adaptation, to mitigate the lack of training data for most Swiss German dialects.
4. compile a set of recommendations and best practices on how to create general-purpose ST systems for Swiss German dialects.

The results of this project will pave the way for developing practical ST solutions for Swiss German on a broad scale and contribute to the prevalence of Swiss German as part of Switzerland’s cultural heritage in the digital age.

The project is conducted by researchers from Zurich University of Applied Sciences (ZHAW), University of Applied Sciences of North-Western Switzerland (FHNW) and University of Zurich (UZH). It runs for one year with approx. 2 full-time positions.

Last updated: 19.06.2024

SNSF
Project funding (Div. I-III)
Original data source 200729

People

Mark Cieliebak
