new-contrib: Audio Whisper API with Local Device Microphones #1271

Open
wants to merge 14 commits into
base: main
29 changes: 27 additions & 2 deletions authors.yaml
@@ -106,13 +106,13 @@ aaronwilkowitz-openai:
charuj:
name: "Charu Jaiswal"
website: "https://www.linkedin.com/in/charu-j-8a866471"
avatar: "https://avatars.githubusercontent.com/u/18404643?v=4"
avatar: "https://avatars.githubusercontent.com/u/18404643?v=4"

rupert-openai:
name: "Rupert Truman"
website: "https://www.linkedin.com/in/rupert-truman/"
avatar: "https://avatars.githubusercontent.com/u/171234447"

keelan-openai:
name: "Keelan Schule"
website: "https://www.linkedin.com/in/keelanschule/"
@@ -138,6 +138,31 @@ eszuhany-openai:
website: "https://www.linkedin.com/in/szuhany/"
avatar: "https://avatars.githubusercontent.com/u/164391912"


papertigers:
name: "Karl Moritz Hermann"
website: "https://www.linkedin.com/in/kmhermann/"
avatar: "https://pbs.twimg.com/profile_images/1249352292081569792/MbZrKr9v_400x400.jpg"

harieki:
name: "Harry Jackson"
website: "https://www.linkedin.com/in/harry-jackson-b9951317/"
avatar: "https://pbs.twimg.com/profile_images/1621193381993125888/0UN2oFYx_400x400.jpg"

zlipkin-openai:
name: "Zoe Lipkin"
website: "https://www.linkedin.com/in/zoelipkin"
avatar: "https://pbs.twimg.com/profile_images/1651768911416500224/5mhCRdH2_400x400.jpg"

iamkobe:
name: "Yingning Cheng"
website: "https://www.linkedin.com/in/yingningc/"
avatar: "https://media.licdn.com/dms/image/D4D03AQEH0XZfnJXb6A/profile-displayphoto-shrink_800_800/0/1697817712538?e=1704931200&v=beta&t=cRLGzAFgnYJw1ZBG8abCQhxNd4di9TyWw-MG5_ZEVgk"

carlkho:
name: "Carl Kho"
website: "http://carlkho.com"
avatar: "https://avatars.githubusercontent.com/u/106736711?v=4"
pap-openai:
name: "Pierre-Antoine Porte"
website: "https://www.linkedin.com/in/portepa/"
569 changes: 569 additions & 0 deletions examples/Whisper_transcribe_device_microphone.ipynb
Collaborator

Unless the reader speaks Filipino, they can't test this part out – how about translating from a more common second language like Spanish?

Also, an indefinite recording makes many notebooks crash – set a 5-10 second limit as well.

# Demo: Transcribe lengthy Filipino speech and translate into English with proper grammar and punctuation
result = transcribe_audio(
    debug=False,
    prompt="Filipino spoken. Proper grammar and punctuation. Skip fillers.",
    timed_recording=False,
    record_seconds=0,
    is_english=False,
)

print("\nTranscription/Translation:", result)
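
The time cap suggested in the comment above could be sketched roughly as follows. This is an illustration only, not the notebook's implementation: `record_with_limit`, `read_chunk`, and `clock` are hypothetical names, and `read_chunk` stands in for whatever call reads one buffer from the microphone stream.

```python
import time


def record_with_limit(read_chunk, max_seconds=10.0, clock=time.monotonic):
    """Collect audio chunks until max_seconds have elapsed, then stop.

    read_chunk: hypothetical zero-argument callable returning one buffer of
                raw audio bytes (for example, a wrapper around a stream read).
    clock:      injectable time source, so the loop can be tested without waiting.
    """
    if max_seconds <= 0:
        # Refuse unbounded recordings, which is the case that can hang a notebook.
        raise ValueError("max_seconds must be positive")
    deadline = clock() + max_seconds
    frames = []
    while clock() < deadline:
        frames.append(read_chunk())
    return b"".join(frames)
```

Passing the clock in as a parameter is a design choice made here for testability; a real recorder would likely use the default `time.monotonic` and a blocking read from the audio stream.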

Collaborator

Combining transcription and translation in this one function is a bit awkward, and it also drops the prompt param for translations. (The prompt should be in English for a translation and in the language of choice for a transcription.) I'd split this out into two clear helper functions, one for translate and one for transcribe.

def process_audio(file_name, is_english=True, prompt=""):
    with open(file_name, "rb") as audio_file:
        if is_english:
            response = client.audio.transcriptions.create(
                model="whisper-1", file=audio_file, prompt=prompt
            )
        else:
            response = client.audio.translations.create(
                model="whisper-1", file=audio_file
            )

        return response.text.strip()
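
One way the split proposed in the comment above might look is sketched below. The function names `transcribe_file` and `translate_file` are hypothetical, and `client` is assumed to be an `openai.OpenAI()` instance (or any object exposing the same `audio.transcriptions.create` and `audio.translations.create` methods); it is passed in explicitly so the helpers do not depend on a global.

```python
def transcribe_file(client, file_name, prompt=""):
    """Transcribe speech in its original language.

    The prompt, if given, is assumed to be written in the language being spoken.
    """
    with open(file_name, "rb") as audio_file:
        response = client.audio.transcriptions.create(
            model="whisper-1", file=audio_file, prompt=prompt
        )
    return response.text.strip()


def translate_file(client, file_name, prompt=""):
    """Translate speech into English.

    Unlike the combined function, the prompt is forwarded here too;
    it is assumed to be written in English.
    """
    with open(file_name, "rb") as audio_file:
        response = client.audio.translations.create(
            model="whisper-1", file=audio_file, prompt=prompt
        )
    return response.text.strip()
```

With the official SDK, usage would be along the lines of `client = OpenAI()` followed by `translate_file(client, "speech.wav", prompt="Hello, welcome.")`, where the file name and prompt text are placeholders.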

Collaborator

I don't think this is how we intend for the prompt parameter to be used – looking at our docs, it works more as an example (or examples) than as an instruction.
Screenshot 2024-11-26 at 4 03 42 PM
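
To illustrate the distinction drawn in this comment, here is a sketch of a prompt built as example text rather than as an instruction. `build_example_prompt` is a hypothetical helper, and the sample sentences and terms are invented for illustration; the idea is that the prompt reads like a preceding piece of transcript in the desired style.

```python
def build_example_prompt(example_sentences, terms=()):
    """Join example transcript text and optional spellings into one prompt string.

    The result is meant to look like earlier transcript content (style and
    vocabulary to imitate), not like a command addressed to the model.
    """
    parts = [s.strip() for s in example_sentences if s.strip()]
    if terms:
        parts.append(", ".join(terms) + ".")
    return " ".join(parts)


# Instruction-style prompt, as used in the notebook under review:
instruction_prompt = "Filipino spoken. Proper grammar and punctuation. Skip fillers."

# Example-style alternative (invented sample text, fully punctuated):
example_prompt = build_example_prompt(
    ["Hello, and welcome to the show. Today, we're talking about speech recognition."],
    terms=["OpenAI", "Whisper"],
)
print(example_prompt)
```

For a transcription, the example text would presumably be written in the language being spoken; English is used here only to keep the sketch readable.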

Large diffs are not rendered by default.

Binary file removed examples/data/10k/lyft_2021.pdf
Binary file not shown.
Binary file removed examples/data/10k/uber_2021.pdf
Binary file not shown.

This file was deleted.

Binary file removed examples/data/recommendations_embeddings_cache.pkl
Binary file not shown.
1,001 changes: 0 additions & 1,001 deletions examples/data/sample_clothes/sample_styles_with_embeddings.csv

This file was deleted.

Binary file added images/whisper_onChatGPTApp_cvk.gif
225 changes: 215 additions & 10 deletions registry.yaml
@@ -1492,7 +1492,11 @@
- completions
- chatgpt-and-api

- title: GPT Actions library - Zapier
path: examples/chatgpt/gpt_actions_library/gpt_action_zapier.ipynb
date: 2024-08-05
authors:
@@ -1509,8 +1513,138 @@
- dylanra-openai
tags:
- completions
- functions
- assistants

- title: Using Whisper API to transcribe text using your Device Microphone
path: examples/Whisper_transcribe_device_microphone.ipynb
date: 2024-08-24
authors:
- carlkho
tags:
- whisper
<<<<<<< HEAD
- audio
- transcribe
- translate
=======

- title: Azure AI Search with Azure Functions and GPT Actions in ChatGPT
path: examples/chatgpt/rag-quickstart/azure/Azure_AI_Search_with_Azure_Functions_and_GPT_Actions_in_ChatGPT.ipynb
date: 2024-07-08
authors:
- maxreid-openai
tags:
- embeddings
- chatgpt
- tiktoken
- completions

- title: GPT Actions library - getting started
path: examples/chatgpt/gpt_actions_library/.gpt_action_getting_started.ipynb
date: 2024-07-09
authors:
- aaronwilkowitz-openai
tags:
- gpt-actions-library
- chatgpt

- title: GPT Actions library - BigQuery
path: examples/chatgpt/gpt_actions_library/gpt_action_bigquery.ipynb
date: 2024-07-09
authors:
- aaronwilkowitz-openai
tags:
- gpt-actions-library
- chatgpt

- title: Data Extraction and Transformation in ELT Workflows using GPT-4o as an OCR Alternative
path: examples/Data_extraction_transformation.ipynb
date: 2024-07-09
authors:
- charuj
tags:
- completions
- vision

- title: GPT Actions library - Outlook
path: examples/chatgpt/gpt_actions_library/gpt_action_outlook.ipynb
date: 2024-07-15
authors:
- rupert-openai
tags:
- gpt-actions-library
- chatgpt

- title: GPT Actions library - Sharepoint (Return Docs)
path: examples/chatgpt/gpt_actions_library/gpt_action_sharepoint_doc.ipynb
date: 2024-05-24
authors:
- maxreid-openai
tags:
- gpt-actions-library
- chatgpt

- title: GPT Actions library - Sharepoint (Return Text)
path: examples/chatgpt/gpt_actions_library/gpt_action_sharepoint_text.ipynb
date: 2024-05-24
authors:
- maxreid-openai
tags:
- gpt-actions-library
- chatgpt

- title: GPT Actions library (Middleware) - Azure Functions
path: examples/chatgpt/gpt_actions_library/gpt_middleware_azure_function.ipynb
date: 2024-05-24
authors:
- maxreid-openai
tags:
- gpt-actions-library
- chatgpt

- title: GPT Actions library - Canvas LMS
path: examples/chatgpt/gpt_actions_library/gpt_action_canvaslms.ipynb
date: 2024-07-17
authors:
- keelan-openai
tags:
- gpt-actions-library
- chatgpt

- title: GPT Actions library - Salesforce
path: examples/chatgpt/gpt_actions_library/gpt_action_salesforce.ipynb
date: 2024-07-18
authors:
- aa-openai
tags:
- gpt-actions-library
- chatgpt

- title: GPT Actions library - Gmail
path: examples/chatgpt/gpt_actions_library/gpt_action_gmail.ipynb
date: 2024-07-24
authors:
- alwestmo-openai
tags:
- gpt-actions-library
- chatgpt

- title: GPT Actions library - Jira
path: examples/chatgpt/gpt_actions_library/gpt_action_jira.ipynb
date: 2024-07-24
authors:
- rupert-openai
tags:
- gpt-actions-library
- chatgpt

- title: GPT Actions library - Notion
path: examples/chatgpt/gpt_actions_library/gpt_action_notion.ipynb
date: 2024-07-25
authors:
- dan-openai
tags:
- gpt-actions-library
- chatgpt

- title: Introduction to Structured Outputs
path: examples/Structured_Outputs_Intro.ipynb
@@ -1520,6 +1654,7 @@
tags:
- completions
- functions
- assistants

- title: GPT Actions library (Middleware) - Google Cloud Function
path: examples/chatgpt/gpt_actions_library/gpt_middleware_google_cloud_function.md
@@ -1551,6 +1686,15 @@
- chatgpt
- chatgpt-middleware

- title: GPT Actions library - Confluence
path: examples/chatgpt/gpt_actions_library/gpt_action_confluence.ipynb
date: 2024-07-31
authors:
- eszuhany-openai
tags:
- gpt-actions-library
- chatgpt

- title: GPT Actions library - Google Drive
path: examples/chatgpt/gpt_actions_library/gpt_action_google_drive.ipynb
date: 2024-08-11
@@ -1561,26 +1705,86 @@
- chatgpt
- chatgpt-productivity

- title: GPT Actions library - Snowflake Direct
path: examples/chatgpt/gpt_actions_library/gpt_action_snowflake_direct.ipynb
date: 2024-08-13
- title: GPT Actions library - SQL Database
path: examples/chatgpt/gpt_actions_library/gpt_action_sql_database.ipynb
date: 2024-07-31
authors:
- evanweiss-openai
tags:
- chatgpt
- gpt-actions-library

- title: GPT Actions library - Box
path: examples/chatgpt/gpt_actions_library/gpt_action_box.ipynb
date: 2024-08-02
authors:
- gladstone-openai
- keelan-openai
tags:
- gpt-actions-library
- chatgpt
- chatgpt-data

- title: GPT Actions library - Snowflake Middleware
path: examples/chatgpt/gpt_actions_library/gpt_action_snowflake_middleware.ipynb
date: 2024-08-14
- title: GPT Actions library - OneDrive
path: examples/chatgpt/gpt_actions_library/gpt_action_onedrive.ipynb
date: 2024-07-10
authors:
- gladstone-openai
- girishd-openai
tags:
- gpt-actions-library
- chatgpt
- chatgpt-data

- title: Using ChatGPT’s Code Interpreter to generate SVG images
path: examples/Code_Interpreter_SVG.ipynb
date: 2024-07-19
authors:
- kerens
tags:
- chatgpt
- vision
- completions
- python

- title: GPT-4o for Speech-to-Text Transcription
path: examples/Speech-to-Text_Transcription_GPT4o.ipynb
date: 2024-08-05
authors:
- kerens
tags:
- chatgpt
- completions
- functions
- python
- whisper

- title: GPT Actions library - Quickstart
path: examples/chatgpt/gpt_actions_library/gpt_actions_library_quickstart.ipynb
date: 2024-07-07
authors:
- aaronwilkowitz-openai
tags:
- gpt-actions-library
- chatgpt

- title: How to fine-tune GPT-3.5-Turbo
path: examples/fine_tuning_gpt_3_5_turbo.ipynb
date: 2024-08-11
authors:
- aayushg
tags:
- completions

- title: Using Whisper API to transcribe text using your Device Microphone
path: examples/Whisper_transcribe_device_microphone.ipynb
date: 2024-08-24
authors:
- carlkho
tags:
- whisper
- audio
- transcribe
- translate

- title: GPT Actions library - Retool Workflow
path: examples/chatgpt/gpt_actions_library/gpt_action_retool_workflow.md
date: 2024-08-28
@@ -1608,3 +1812,4 @@
tags:
- completions
- reasoning
>>>>>>> 587df2b578e6e55dcbd67324ff0f670bc19ad2f6