Phonikud

Overcoming Phonetic Underspecification
for Hebrew Text-To-Speech

1 Independent Researcher, 2 Reichman University, 3 Cisco Systems, 4 Tel Aviv University, 5 Carnegie Mellon University
Interspeech 2026

For further improvements to Hebrew G2P, see our follow-up work, Renikud.

Figure: Model architecture

Introduction

Text-to-speech for Modern Hebrew is hard because standard Hebrew writing leaves phonetic features such as vowels and stress underspecified. Phonikud is an open-source grapheme-to-phoneme (G2P) system that produces fully specified IPA transcriptions for more accurate Hebrew TTS. The project also introduces ILSpeech, a Hebrew corpus of audio, text, and IPA transcriptions for G2P benchmarking, TTS training, and audio-to-IPA evaluation.

What Makes Us Different

Real-Time Inference

Feeds IPA phonemes to real-time TTS engines such as Piper.

Edge Deployment

Runs locally on a Raspberry Pi and other edge devices.

Data-Efficient Training

Fine-tunes TTS with as little as 2 hours of data.

Hebrew Phonetics

Handles stress and vocal shva, which other systems miss.

Assistive Tech

Low-latency screen reader support, even offline.

Open TTS Dataset

Studio-quality Hebrew speech with IPA annotations.

Open Models & Training

Weights, TTS models, and training code included.

Fine-Grained Phonetic Control

Edit phonemes directly or let G2P handle it.

From Text to Speech

See how Phonikud transforms Hebrew text through each stage.

1. Text
השפה העברית נשמעת יפה כשמבטאים אותה נכון
Input: Regular Hebrew text without vowel markings ("The Hebrew language sounds beautiful when it is pronounced correctly")
2. Diacritics
הַשָּׂפָה הָעִבְרִית נִשְׁמַ֫עַת יָפָה כְּשֶׁמְּֽבַטְּאִים אוֹתָהּ נָכוֹן
Enhanced diacritics with stress markers and vocal shva
3. Phonemes
hasafˈa haʔivʁˈit niʃmˈaʔat jafˈa kʃemevatʔˈim ʔotˈa naχˈon.
Phonikud converts the diacritized text to a precise IPA transcription
4. Audio
Real-time TTS synthesis from phonemes
Flexible Input
You can enter the pipeline at any stage: give plain text and let the model add diacritics, supply the diacritics yourself, or input phonemes directly. Try it in the demo!
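As a rough illustration of how a front end might route input to the right stage, the sketch below guesses the stage from the characters a string contains. It is not part of Phonikud: the function name detect_stage and the Unicode-range heuristic are assumptions made for this example.

```python
import unicodedata

# Hebrew letters alef..tav (U+05D0..U+05EA).
HEBREW_LETTERS = range(0x05D0, 0x05EB)
# Hebrew cantillation marks and vowel points (U+0591..U+05C7).
HEBREW_MARKS = range(0x0591, 0x05C8)


def detect_stage(s: str) -> str:
    """Guess the pipeline stage of `s` (hypothetical helper, not Phonikud API)."""
    has_letter = any(ord(ch) in HEBREW_LETTERS for ch in s)
    # Combining marks only: category "Mn" skips punctuation such as maqaf.
    has_mark = any(
        ord(ch) in HEBREW_MARKS and unicodedata.category(ch) == "Mn" for ch in s
    )
    if has_letter and has_mark:
        return "diacritics"
    if has_letter:
        return "text"
    # No Hebrew letters at all: assume the caller passed IPA phonemes.
    return "phonemes"


print(detect_stage("השפה העברית"))        # text
print(detect_stage("hasafˈa haʔivʁˈit"))  # phonemes
```

Under this heuristic, text that carries only a few marks, such as a single stress mark on otherwise unpointed words, is also classified as diacritics. A real front end would likely want stricter validation before skipping a stage.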

Method Comparison

Comparative evaluation of Phonikud against existing Hebrew TTS approaches

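Systems compared: ElevenLabs (Eleven v3), Google (Gemini v2.5), RoboShaul (1st place), and Phonikud (ours, v1 alpha).

Text samples:
הוא צפה בס֫רט וראה חיה שצ֫פה במ֫ים 🐸
("He watched a movie and saw an animal floating in the water")
הוא רצה את זה גם אבל היא ר֫צה מהר והקד֫ימה אותו 🏃‍♀️
("He wanted it too, but she ran fast and got ahead of him")
בוא תרד לאכול יש בור֫קס עם ת֫רד 🥬
("Come down to eat, there's bourekas with spinach")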
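The sample sentences above contain words that are spelled identically and differ only in stress, which is why they are hard for systems that read unmarked text. The sketch below makes that concrete by stripping the Hebrew combining marks and grouping tokens that collapse to the same spelling. The helper names strip_marks and stress_homographs are hypothetical, and the example assumes the stress mark is U+05AB, as it appears in the samples.

```python
import unicodedata
from collections import defaultdict


def strip_marks(token: str) -> str:
    """Drop Hebrew combining marks (U+0591..U+05C7), keeping the letters."""
    return "".join(
        ch for ch in token
        if not (0x0591 <= ord(ch) <= 0x05C7 and unicodedata.category(ch) == "Mn")
    )


def stress_homographs(sentence: str) -> dict[str, set[str]]:
    """Group tokens that differ only in their marks (hypothetical helper)."""
    groups: dict[str, set[str]] = defaultdict(set)
    for token in sentence.split():
        groups[strip_marks(token)].add(token)
    # Keep only spellings that occur with more than one marking.
    return {bare: forms for bare, forms in groups.items() if len(forms) > 1}


# The same three letters, once unmarked and once with a stress mark (U+05AB)
# after the first letter, as in the second sample sentence.
sample = "\u05e8\u05e6\u05d4 \u05e8\u05ab\u05e6\u05d4"
print(stress_homographs(sample))
```

Once the marks are removed the two tokens are indistinguishable, so a TTS system that sees only plain text has to infer the intended pronunciation from context. Note that this simple whitespace split does not handle prefixed forms, such as the pair in the first sample.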

Citation

@inproceedings{kolani2026phonikud,
  title={Phonikud: Overcoming Phonetic Underspecification for Hebrew Text-To-Speech},
  author={Yakov Kolani and Maxim Melichov and Cobi Calev and Morris Alper},
  booktitle={Proc. Interspeech 2026},
  year={2026},
  url={https://arxiv.org/abs/2506.12311},
}