[FR] Let deepflow trigger profiling on demand, instead of only giving the agent continuous profiling through config #8550

Open
2 of 3 tasks
zcl1115 opened this issue Nov 25, 2024 · 0 comments
zcl1115 commented Nov 25, 2024

Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

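We analyzed deepflow's AutoProfiling capability and found that it is continuous profiling, started for the processes that match a rule configured through static_config.ebpf.on-cpu-profile.regex. This works well when there are only a few applications. For a large cluster with complex internal business workloads, however, continuous profiling has two problems: it is hard to configure, because process names follow no unified convention, and much of the profiling is wasted, because users are not interested in those processes.
We would therefore like deepflow to expose AutoProfiling as an AlarmProfiling capability: in specific situations (for example, when an alert fires or when a user runs it manually), there should be an entry point that triggers profiling of one particular process.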
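To make the configuration problem concrete, here is a minimal Python sketch of regex-based process selection. It is only an illustration: the process names and the pattern are made up, and Python's `re` stands in for whatever regex engine deepflow-agent actually uses for static_config.ebpf.on-cpu-profile.regex.

```python
import re

# Hypothetical process names on one node; real clusters will differ.
process_names = ["java", "order-service", "python3", "pay_gateway", "nginx"]

# Hypothetical pattern of the kind one would put in
# static_config.ebpf.on-cpu-profile.regex. Without a naming convention,
# every service has to be listed explicitly.
pattern = re.compile(r"^(order-service|pay_gateway)$")


def continuously_profiled(names, regex):
    """Return the process names that a static regex would select."""
    return [name for name in names if regex.match(name)]


selected = continuously_profiled(process_names, pattern)
print(selected)
```

Every selected process is profiled all the time, whether or not anyone looks at the result, and the pattern has to be edited whenever a service is added or renamed.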

Use case

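We have configured a request success rate alert on the data collected by deepflow-agent. When that alert fires, we want to work out from the alert the IP of the node where the success rate dropped, and then actively trigger profiling of the affected process on that node. There are two possible ways to trigger it:

  1. Send a command to the node through our internal command channel, which calls a specific deepflow-agent command and passes the pid of the process to profile
  2. Have deepflow-server send a command to the corresponding instance, passing the process name or pid from the process table to profile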
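The flow above could look roughly like the following Python sketch. Everything in it is hypothetical: `Alert`, `ProfilingRequest`, `build_request`, `on_alert` and the `dispatch` callback are names invented for illustration, and neither deepflow-agent nor deepflow-server is assumed to expose such an interface today. The `dispatch` callback stands in for either option, the internal command channel or a command sent by deepflow-server.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert payload produced by the alerting system."""
    metric: str
    node_ip: str
    process_name: str
    pid: Optional[int] = None


@dataclass(frozen=True)
class ProfilingRequest:
    """Hypothetical one-shot profiling request for a single process."""
    node_ip: str
    target: str      # "pid:<n>" or "name:<process name>"
    duration_s: int


def build_request(alert: Alert, duration_s: int = 60) -> ProfilingRequest:
    # Prefer the pid when the alert carries one, otherwise fall back to
    # the process name, mirroring the "process name or pid" wording.
    if alert.pid is not None:
        target = f"pid:{alert.pid}"
    else:
        target = f"name:{alert.process_name}"
    return ProfilingRequest(alert.node_ip, target, duration_s)


def on_alert(alert: Alert,
             dispatch: Callable[[ProfilingRequest], None]) -> Optional[ProfilingRequest]:
    """Trigger profiling only for request success rate alerts."""
    if alert.metric != "request_success_rate":
        return None
    request = build_request(alert)
    dispatch(request)  # hypothetical transport: command channel or server
    return request
```

For example, `on_alert(Alert("request_success_rate", "10.0.0.5", "order-service", pid=4321), print)` would build a request targeting `pid:4321` on node `10.0.0.5` and hand it to the transport. Keeping the transport behind a callback means the same trigger logic serves both options.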

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct
