Auto Transcription

A script made in 2022 to automate the transcription of audio files using Whisper AI.

Description

Auto Transcription is a Bash script that automates the transcription of audio files using Whisper AI. The script converts each audio file to a standard format, generates a transcription, writes that transcription to the file's ID3v2 tag, and also saves it as a .txt file in a structured output directory.
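
As a rough illustration of the conversion step, the sketch below builds an ffmpeg command line for the 320 kbps MP3 target named in the Features list. It is not taken from run.sh: the function names `build_convert_cmd` and `convert` are hypothetical, and the choice of the `libmp3lame` encoder is an assumption.

```python
import subprocess
from pathlib import Path


def build_convert_cmd(src: Path, dst: Path, bitrate: str = "320k") -> list[str]:
    """Return an ffmpeg command that re-encodes src as an MP3 at the given bitrate."""
    return [
        "ffmpeg",
        "-y",                      # overwrite dst if it already exists
        "-i", str(src),            # input file
        "-codec:a", "libmp3lame",  # MP3 encoder (assumed; not confirmed by the source)
        "-b:a", bitrate,           # target audio bitrate
        str(dst),
    ]


def convert(src: Path, dst: Path) -> None:
    """Run the conversion and raise CalledProcessError if ffmpeg fails."""
    subprocess.run(build_convert_cmd(src, dst), check=True)
```

Keeping the command builder separate from the function that runs it makes the arguments easy to inspect or log before any file is touched.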

Features

- Automated transcription of audio files using Whisper AI.
- Conversion of audio files to a standard format (320 kbps MP3).
- Organization of output files in a structured directory.
- Logging of processing details.
- Removal of special characters and normalization of file names.
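
The last feature can be pictured with a minimal sketch. The exact rules used in tools/aux_functions.sh are not documented here, so the function name `normalize_filename` and the rules it applies (drop accents, replace every other special character with an underscore, lower-case the extension) are assumptions for illustration only.

```python
import re
import unicodedata


def normalize_filename(name: str) -> str:
    """Return an ASCII-only, underscore-separated version of a file name."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = name, ""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_stem = (
        unicodedata.normalize("NFKD", stem)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace each run of characters other than letters, digits, or hyphens with "_".
    cleaned = re.sub(r"[^A-Za-z0-9-]+", "_", ascii_stem).strip("_")
    return f"{cleaned}.{ext.lower()}" if ext else cleaned
```

For example, `normalize_filename("Olá, Mundo!.MP3")` returns `"Ola_Mundo.mp3"`.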

Prerequisites

- Bash
- Python 3
- ffmpeg
- id3v2
- Whisper AI model

Installation

Clone the repository:

git clone https://github.com/your_username/auto_transcription.git
cd auto_transcription

Install the required dependencies:

sudo apt-get install ffmpeg id3v2
pip install openai-whisper

Set the input and output paths in config/config.xml:

<general_params>
    input_path=/path/to/your/input
    output_path=/path/to/your/output
</general_params>
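
Note that the block above holds plain key=value lines rather than nested XML elements. A minimal sketch of reading it, assuming that layout, follows; the function name `read_general_params` is hypothetical and this is not how run.sh itself is known to parse the file.

```python
import re


def read_general_params(text: str) -> dict[str, str]:
    """Parse the key=value lines found between the <general_params> tags."""
    match = re.search(r"<general_params>(.*?)</general_params>", text, re.DOTALL)
    if match is None:
        raise ValueError("no <general_params> block found")
    params = {}
    for line in match.group(1).splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)  # split on the first "=" only
        params[key.strip()] = value.strip()
    return params
```

Called on the contents of the file, this returns a dictionary with `input_path` and `output_path` keys, which makes it easy to check that both paths are set before starting a long run.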

Select the Whisper model to load in tools/whisper_fun.py (larger models are more accurate but slower and need more memory):
```
model = whisper.load_model("large-v3")
```
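
For context, a minimal sketch of how a loaded model might be used to produce the .txt output is shown below. It relies on the `transcribe` method of the openai-whisper package, which returns a dictionary containing a `"text"` entry. The helper names `load_model` and `transcribe_to_txt` are hypothetical and do not necessarily match what tools/whisper_fun.py does.

```python
from pathlib import Path


def load_model(name: str = "large-v3"):
    """Load a Whisper model; the import is deferred because loading is expensive."""
    import whisper  # provided by the openai-whisper package

    return whisper.load_model(name)


def transcribe_to_txt(model, audio_path: Path, txt_path: Path) -> str:
    """Transcribe one audio file, save the text to txt_path, and return it."""
    result = model.transcribe(str(audio_path))  # dict with a "text" key
    text = result["text"].strip()
    txt_path.write_text(text + "\n", encoding="utf-8")
    return text
```

A call such as `transcribe_to_txt(load_model(), Path("track1.mp3"), Path("track1.txt"))` would write the transcription next to the audio file. Passing the model in as an argument means it is loaded once and reused for every file.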

Input File Organization

run.sh expects the input files to follow a specific directory structure. Organize them according to these guidelines:

Input Directory Structure:
- The input directory is the one set as input_path inside the <general_params> block of config.xml.
- Inside the input directory, create subdirectories for each person. Each person's directory should contain subdirectories for different events.
Event Directory Naming:
- Each event directory should be named in the format yyyy_mm_dd_eventName, where yyyy is the year, mm is the month, and dd is the day of the event.
- Example: 2024_05_24_BirthdayParty.
Audio Files:
- Place the audio files (in .mp3 format) inside the respective event directories.
- Expect file names to change: the script removes special characters and normalizes file names as it processes them.
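
The guidelines above can be checked before a run with a short sketch like the following. It assumes only the layout described here (person directories, event directories named yyyy_mm_dd_eventName, .mp3 files inside); the names `parse_event_dir` and `find_audio_files` are hypothetical and are not part of the repository.

```python
import re
from pathlib import Path

# yyyy_mm_dd_eventName, e.g. 2024_05_24_BirthdayParty
EVENT_DIR_RE = re.compile(r"^(\d{4})_(\d{2})_(\d{2})_(.+)$")


def parse_event_dir(name: str):
    """Split an event directory name into (year, month, day, event), or return None."""
    match = EVENT_DIR_RE.match(name)
    if match is None:
        return None
    year, month, day, event = match.groups()
    if not (1 <= int(month) <= 12 and 1 <= int(day) <= 31):
        return None  # reject impossible months or days
    return int(year), int(month), int(day), event


def find_audio_files(input_root: Path):
    """Yield (person, event_dir_name, mp3_path) for each .mp3 in a valid event directory."""
    for person_dir in sorted(p for p in input_root.iterdir() if p.is_dir()):
        for event_dir in sorted(e for e in person_dir.iterdir() if e.is_dir()):
            if parse_event_dir(event_dir.name) is None:
                continue  # skip directories that do not follow the naming format
            for mp3 in sorted(event_dir.glob("*.mp3")):
                yield person_dir.name, event_dir.name, mp3
```

Listing the results of `find_audio_files(Path("/path/to/your/input"))` before running the script shows which files would be picked up and makes misnamed event directories easy to spot.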

Example Directory Structure

/media/pedro/Arquivos/acervos/input
├── Person1
│   ├── 2024_05_24_BirthdayParty
│   │   ├── track1.mp3
│   │   └── track2.mp3
│   └── 2024_06_15_Conference
│       ├── track1.mp3
│       └── track2.mp3
└── Person2
    └── 2024_07_10_Wedding
        ├── track1.mp3
        └── track2.mp3

Usage

Place your audio files in the input directory specified in config.xml.
Run the script:
```
./run.sh
```

The script will process the audio files, generate transcriptions, and save the output in the specified output directory.

Directory Structure

bash/auto_transcription
├── config
│   └── config.xml
├── logs
├── run.sh
└── tools
    ├── aux_functions.sh
    ├── gather_txt.sh
    └── whisper_fun.py

pedro-pscunha/auto-transcript
