D-Nikud: Enhancing Hebrew Diacritization with LSTM and Pretrained Models

Rosenthal, Adi; Shaked, Nadav

arXiv:2402.00075 (cs)

[Submitted on 30 Jan 2024]

Abstract: We present D-Nikud, a novel approach to Hebrew diacritization that integrates the strengths of LSTM networks and a BERT-based (transformer) pre-trained model. Inspired by the methodology of Nakdimon, we combine its approach with the TavBERT pre-trained model; our system also incorporates advanced architectural choices and diverse training data. Our experiments show state-of-the-art results on several benchmark datasets, with a particular emphasis on modern texts and on more specific diacritization cases, such as gender.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2402.00075 [cs.CL]
	(or arXiv:2402.00075v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2402.00075

Submission history

From: Nadav Shaked
[v1] Tue, 30 Jan 2024 22:07:12 UTC (157 KB)
