
Python: Include a function_invoke_attempt index with Streaming CMC #10009

Open · wants to merge 1 commit into main
Conversation

moonbox3 (Contributor)

Motivation and Context

During auto function calling, we yield all messages back without any indication of which invocation attempt they relate to. This information could help the caller understand the order in which message chunks were received during the auto function invocation loop.

Depending on the behavior of auto function calling, the request_index iterates up to maximum_auto_invoke_attempts. Today the caller doesn't know which auto function invoke attempt they're currently on -- so simply handing back all yielded messages can be confusing. In a follow-up PR, we will handle adding the request_index (perhaps with a different name) to make it easier to know which streaming message chunks to concatenate, which should help reduce the confusion down the line.
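For context, here is a simplified conceptual sketch of that loop (not the actual implementation, which lives in ChatCompletionClientBase; `has_pending_function_calls` is a hypothetical helper standing in for the real termination check):

```python
# Conceptual sketch only; the real auto function invocation loop is more involved.
async def auto_invoke_stream(inner_stream, chat_history, settings,
                             maximum_auto_invoke_attempts: int):
    for request_index in range(maximum_auto_invoke_attempts):
        async for chunk in inner_stream(chat_history, settings):
            # New in this PR: tag each chunk with the attempt that produced
            # it, so callers can group chunks per attempt.
            chunk.function_invoke_attempt = request_index
            yield chunk
        # Hypothetical helper: stop once the model returns no more tool calls.
        if not has_pending_function_calls(chat_history):
            break
```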

Description

This PR adds:

  • The function_invoke_attempt attribute on the StreamingChatMessageContent class. This helps callers/users track which streaming chat message chunks belong to which auto function invocation attempt (see the usage sketch below).
  • A new keyword argument on _inner_get_streaming_chat_message_contents that lets the function_invoke_attempt int be passed through to the StreamingChatMessageContent creation in each AI service. This additive keyword argument should not break existing code.
  • Updated unit tests.
  • One combined sample: three previously distinct samples related to auto function calling are merged, and the user can configure other chat completion services that support auto function calling.
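As a rough usage sketch (the service setup and method names here are illustrative assumptions based on the Semantic Kernel Python chat completion API, not code from this PR):

```python
# Sketch: grouping streamed chunks by the new function_invoke_attempt field.
# Assumes `service` is a chat completion service configured with auto
# function calling; the setup below is illustrative only.
from collections import defaultdict

from semantic_kernel.contents import ChatHistory


async def collect_chunks_by_attempt(service, settings, kernel):
    chunks_by_attempt: dict[int, list] = defaultdict(list)
    history = ChatHistory()
    history.add_user_message("What's the weather in Seattle today?")
    async for chunk_list in service.get_streaming_chat_message_contents(
        chat_history=history, settings=settings, kernel=kernel
    ):
        for chunk in chunk_list:
            # The new attribute tells us which pass of the auto function
            # invocation loop produced this chunk.
            chunks_by_attempt[chunk.function_invoke_attempt].append(chunk)
    return chunks_by_attempt
```

Chunks within the same attempt can then be concatenated, while chunks from different attempts stay separate.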


@moonbox3 moonbox3 self-assigned this Dec 18, 2024
@moonbox3 moonbox3 requested a review from a team as a code owner December 18, 2024 10:45
@markwallace-microsoft markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Dec 18, 2024
@markwallace-microsoft (Member)

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| semantic_kernel/connectors/ai/chat_completion_client_base.py | 127 | 2 | 98% | 406, 416 |
| semantic_kernel/connectors/ai/anthropic/services/anthropic_chat_completion.py | 158 | 6 | 96% | 141, 147, 160, 166, 170, 382 |
| semantic_kernel/connectors/ai/azure_ai_inference/services/azure_ai_inference_chat_completion.py | 105 | 6 | 94% | 110–113, 122, 144, 168 |
| semantic_kernel/connectors/ai/bedrock/services/bedrock_chat_completion.py | 136 | 14 | 90% | 117, 139, 164, 168–171, 229, 247–266, 325 |
| semantic_kernel/connectors/ai/google/google_ai/services/google_ai_chat_completion.py | 119 | 4 | 97% | 126, 153, 179, 181 |
| semantic_kernel/connectors/ai/google/vertex_ai/services/vertex_ai_chat_completion.py | 119 | 4 | 97% | 121, 148, 174, 176 |
| semantic_kernel/connectors/ai/mistral_ai/services/mistral_ai_chat_completion.py | 122 | 38 | 69% | 119–122, 132, 147–150, 165, 181–185, 200–208, 225–233, 246–259, 265, 274–278, 323–326 |
| semantic_kernel/connectors/ai/ollama/services/ollama_chat_completion.py | 138 | 34 | 75% | 116, 141, 145–146, 156, 169, 186, 206–207, 211, 224–234, 245–247, 258–267, 279, 289–290, 312, 323–324, 350, 359–367 |
| semantic_kernel/connectors/ai/onnx/services/onnx_gen_ai_chat_completion.py | 72 | 6 | 92% | 69–70, 100, 126, 174, 180 |
| semantic_kernel/connectors/ai/open_ai/services/open_ai_chat_completion_base.py | 127 | 7 | 94% | 71, 81, 102, 122, 143, 179, 291 |
| semantic_kernel/contents/streaming_chat_message_content.py | 70 | 1 | 99% | 225 |
| TOTAL | 16777 | 1849 | 89% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 2966 | 4 💤 | 0 ❌ | 0 🔥 | 1m 14s ⏱️ |

@eavanvalkenburg (Member) left a comment:


Added some big questions for you :D

print("\n[No tool calls returned by the model]")


async def handle_streaming(
@eavanvalkenburg (Member):

To me this makes it seem quite complex, as if it takes lots of code to make streaming function calling work. Can't we do two samples: one fully auto and streaming only, the other non-auto-invoke with this stuff?

@moonbox3 (Contributor):
I agree we should break the samples into streaming and non-streaming. A similar structure already exists in the chat_completion concept samples.

@@ -148,9 +148,10 @@ def _create_streaming_chat_message_content(
chunk: ChatCompletionChunk,
choice: ChunkChoice,
chunk_metadata: dict[str, Any],
+    function_invoke_attempt: int = 0,
@eavanvalkenburg (Member):
Could we make function_invoke_attempt part of the settings/FunctionChoiceBehavior? This seems quite messy.

@@ -51,6 +53,12 @@ class StreamingChatMessageContent(ChatMessageContent, StreamingContentMixin):
__add__: Combines two StreamingChatMessageContent instances.
"""

+    function_invoke_attempt: int | None = Field(
@eavanvalkenburg (Member):
I can see why this is needed more for streaming, but wouldn't regular CMC also benefit from this?

@moonbox3 (Contributor):
I don't think the regular CMC will benefit from this, because regular CMCs aren't chunks. They contain the full content, and users won't need to concatenate them, so they don't need to know which request attempt a message belongs to. The ordering of the CMCs in the chat history already encodes that information.
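For example (a minimal sketch; `chunks_by_attempt` is a hypothetical dict mapping each attempt index to the list of chunks collected from the stream):

```python
from functools import reduce
from operator import add

# Concatenate chunks per attempt via StreamingChatMessageContent.__add__;
# chunks from different attempts are kept separate.
full_messages = {
    attempt: reduce(add, chunks)
    for attempt, chunks in chunks_by_attempt.items()
}
for attempt, message in sorted(full_messages.items()):
    print(f"[attempt {attempt}] {message.content}")
```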


@@ -154,6 +154,7 @@ async def _inner_get_streaming_chat_message_contents(
self,
chat_history: "ChatHistory",
settings: "PromptExecutionSettings",
+    function_invoke_attempt: int = 0,
Contributor:
When does a user want to pass in a value that's not 0?

