OpenLineage failed to send DAG start event #44984
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; no need to wait for approval.
cc @kacpermuda seems like #42448 didn't fix this issue?
I think it's unrelated to #42448, but it should not happen anyway. Thanks @paul-laffon-dd for reporting that, I'll investigate it in my free time.
Thanks @kacpermuda. From my understanding of the issue, there is one argument of the dag_started that is trying to serialize an operator where the What do you think of switching to a
From what I understand, there were MANY problems with the previous implementation using ThreadPoolExecutor. The problem is that
The solution should be to not serialize any operators. Not exactly sure where the operator is coming from (I don't think it's this:
Regarding ThreadPoolExecutor, we've switched away from that solution since it caused even worse issues: #39235. @paul-laffon-dd, do you know where it's coming from, or do you have a reproduction?
I don't know which facet this is coming from, and I don't have a way to deterministically reproduce it. From my understanding, this only happens if the result of
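For anyone following the serialization discussion above, here is a minimal sketch of the general failure mode, not the provider's actual code. It assumes the event arguments get pickled on their way to a worker process (the thread does not confirm this), and `FakeOperator`, `is_picklable`, and `describe` are hypothetical names made up for illustration. It shows that a payload holding an object with a thread lock cannot be pickled, while a thread pool never pickles its arguments and so accepts the same payload.

```python
import pickle
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeOperator:
    """Hypothetical stand-in for an operator; this is not an Airflow class."""

    def __init__(self, task_id):
        self.task_id = task_id
        # Lock objects cannot be pickled, so any instance is unpicklable.
        self._lock = threading.Lock()


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def describe(payload):
    """Toy 'event emitter' that only reads the DAG id from the payload."""
    return payload["dag_id"]


# Plain data (strings, lists, dicts) is safe to hand to another process.
safe_payload = {"dag_id": "example_dag", "task_ids": ["t1"]}

# A payload that drags an operator-like object along is not.
unsafe_payload = {"dag_id": "example_dag", "tasks": [FakeOperator("t1")]}

print(is_picklable(safe_payload))
print(is_picklable(unsafe_payload))

# A thread pool shares memory with the caller, so nothing is pickled
# and the unpicklable payload is accepted without error.
with ThreadPoolExecutor(max_workers=1) as pool:
    thread_result = pool.submit(describe, unsafe_payload).result()
print(thread_result)
```

Under that assumption, reducing the payload to plain data before submission (as in `safe_payload`) is one way to avoid serializing operators at all, which matches the direction suggested in the comments.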
Apache Airflow Provider(s)
openlineage
Versions of Apache Airflow Providers
1.14.0
Also seeing missing start DAG events for versions <= 1.12.0. However, those versions weren't logging the exception, making it difficult to determine if this is the same issue.
Apache Airflow version
2.10.1
Operating System
Amazon Linux
Deployment
Amazon (AWS) MWAA
Deployment details
MWAA with:
apache-airflow-providers-openlineage==1.14.0
OPENLINEAGE_URL pointing to a webserver logging all received requests
What happened
The OpenLineage provider failed to send some DAG start events, with the following exception in the scheduler logs:
What you think should happen instead
No response
How to reproduce
The failures to send events are non-deterministic and appear to be caused by a race condition. They seem to occur more frequently when multiple DAGs are being scheduled simultaneously.
I used this code to reproduce the issue, and it failed to send at least one DAG start almost every minute.
Anything else
No response
Are you willing to submit PR?
Code of Conduct