Persian ASR Repository

This repository provides resources and tools for Automatic Speech Recognition (ASR) in Persian. It includes datasets, pretrained models, and a speech segmentation tool.

Persian ASR Dataset and Preparation Scripts

This section contains information and scripts to prepare and use the Persian ASR dataset. Detailed instructions can be found in the README located in the data directory.

Pretrained Persian ASR Models

We provide several pretrained ASR models for Persian and will release more in the future.

Available Models:

Model	Hugging Face Repository	WER (Greedy Decoding)	WER (Beam=5)	WER (Beam=5 + LM)
Wav2Vec2 XLS-R 300M	wav2vec2-xls-r-300m-fa	27.92%	27.89%	22.63%
Conformer Medium	nemo-conformer-medium-fa	32.08%	31.94%	27.47%
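For readers unfamiliar with the metric in the table, the following is a minimal sketch of how word error rate (WER) is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` is hypothetical and is not taken from this repository; the exact text normalization and aggregation used to produce the numbers above are not stated here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

When scoring a whole test set, WER is usually aggregated as total errors over total reference words rather than as an average of per-utterance values.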

Usage:

Before running the model scripts, prepare the dataset in .jsonl format. Preparation scripts for common datasets are in the data directory; the data README explains how to generate the required .jsonl files.
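As an illustration of the .jsonl format in general (one JSON object per line), here is a minimal sketch for writing and reading such a file. The field names `audio_filepath`, `text`, and `duration` are assumptions chosen for illustration, as are the helper names `write_manifest` and `read_manifest`; the schema this repository actually expects is documented in the data README.

```python
import json
from pathlib import Path


def write_manifest(samples, path):
    """Write one JSON object per line and return the number of lines written."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as f:
        for sample in samples:
            # ensure_ascii=False keeps Persian text human-readable in the file.
            f.write(json.dumps(sample, ensure_ascii=False) + "\n")
            count += 1
    return count


def read_manifest(path):
    """Read a .jsonl file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Hypothetical entries; field names and paths are placeholders.
    samples = [
        {"audio_filepath": "clips/0001.wav", "text": "سلام دنیا", "duration": 1.8},
        {"audio_filepath": "clips/0002.wav", "text": "صبح بخیر", "duration": 1.2},
    ]
    write_manifest(samples, "example.jsonl")
    print(read_manifest("example.jsonl"))
```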

To use a pretrained model, navigate to the corresponding model folder under models. Each model folder contains two scripts: train.py and inference.py. You can get detailed usage instructions by running:

python train.py --help
python inference.py --help

CTC-Based Speech Segmentation Tool

This tool performs speech segmentation based on Connectionist Temporal Classification (CTC). For detailed usage instructions, refer to the README inside the segmentation tool folder.
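To give a rough intuition for how CTC outputs can carry timing information, the sketch below collapses per-frame label predictions into timed segments: consecutive frames with the same non-blank label form one segment. This is only an illustration under stated assumptions and is not the algorithm implemented by the tool in this repository, which may instead align audio against a known transcript. The function name `frames_to_segments`, the blank index `0`, and the 20 ms frame duration are all assumed values.

```python
from itertools import groupby


def frames_to_segments(frame_labels, blank=0, frame_duration=0.02):
    """Collapse per-frame CTC labels into (label, start_sec, end_sec) tuples."""
    segments = []
    frame = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        if label != blank:
            # A run of identical non-blank labels is one emitted token.
            start = frame * frame_duration
            end = (frame + length) * frame_duration
            segments.append((label, start, end))
        frame += length
    return segments


if __name__ == "__main__":
    # Hypothetical per-frame argmax labels; 0 is the assumed blank index.
    labels = [0, 0, 5, 5, 0, 7, 7, 7, 0]
    for label, start, end in frames_to_segments(labels):
        print(f"label={label} start={start:.2f}s end={end:.2f}s")
```

Because a blank frame breaks a run, the same label appearing on both sides of a blank yields two separate segments, which matches the usual CTC collapsing rule.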

Installation

To use the tools and models in this repository, clone the repository and install the necessary dependencies:

git clone https://github.com/alifarrokh/persian-asr.git
cd persian-asr
pip install -r requirements.txt

Contribution

We welcome contributions! Please submit a pull request or open an issue to suggest improvements or report bugs.
