[Bug] Data Inconsistency in RealtimeToOffline Minion tasks #14659

Open
Harnoor7 opened this issue Dec 13, 2024 · 1 comment


Harnoor7 commented Dec 13, 2024

Currently, for RealtimeToOffline (RTO) tasks, the watermark is updated in the executor after the segments are uploaded. As a result, segments can be uploaded to the offline table without the RTO metadata watermark being updated (for example, if the task fails between the two steps). In addition, the RTO task generator does not check whether a segment was already processed in a previous minion run.

Because of this, the offline table can end up with inconsistent data.
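
To illustrate the gap described above, here is a minimal sketch of the sequence, not the actual Pinot code. The class `RtoWatermarkGap`, its fields, and the `failAfterUpload` flag are hypothetical names made up for this example. It assumes the generator derives the next window from the watermark and that the executor uploads first and updates the watermark second.

```java
import java.util.ArrayList;
import java.util.List;

public class RtoWatermarkGap {
    // Hypothetical stand-in for the watermark stored in the RTO task metadata.
    long watermarkMs = 0L;
    // Log of every window whose segments were uploaded, in order.
    final List<String> processedWindows = new ArrayList<>();

    // One task run: the window is derived from the watermark (generator),
    // then segments are uploaded and the watermark is advanced (executor).
    void runOnce(long bucketMs, boolean failAfterUpload) {
        long start = watermarkMs;
        long end = start + bucketMs;
        processedWindows.add(start + "-" + end); // step 1: upload segments
        if (failAfterUpload) {
            return; // simulated failure between upload and watermark update
        }
        watermarkMs = end; // step 2: update watermark
    }

    public static void main(String[] args) {
        RtoWatermarkGap rto = new RtoWatermarkGap();
        rto.runOnce(1000L, true);  // upload succeeds, watermark stays at 0
        rto.runOnce(1000L, false); // next run picks the same window again
        System.out.println(rto.processedWindows); // [0-1000, 0-1000]
        System.out.println(rto.watermarkMs);      // 1000
    }
}
```

In this sketch the same window is processed twice because nothing checks whether it was already uploaded.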

Harnoor7 commented

Since the generated offline segment names will be the same across runs, the existing segments simply get replaced. But if the user modifies the task config, the segment names can change, and then we get incorrect query results.
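
A rough sketch of why the name matters, under the assumption that an upload with an existing segment name replaces that segment. The naming scheme in `segmentName`, the prefixes, and the row counts are made up for illustration and are not Pinot's actual segment name generator.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RtoSegmentNames {
    // Hypothetical offline table: segment name -> number of rows in that segment.
    final Map<String, Integer> offlineTable = new LinkedHashMap<>();

    // Hypothetical naming scheme: a config-derived prefix plus the time window.
    static String segmentName(String prefix, long startMs, long endMs) {
        return prefix + "_" + startMs + "_" + endMs;
    }

    // Uploading under an existing name replaces the old segment.
    void upload(String prefix, long startMs, long endMs, int rows) {
        offlineTable.put(segmentName(prefix, startMs, endMs), rows);
    }

    // Total rows a query over the offline table would see.
    int totalRows() {
        int total = 0;
        for (int rows : offlineTable.values()) {
            total += rows;
        }
        return total;
    }

    public static void main(String[] args) {
        RtoSegmentNames sameConfig = new RtoSegmentNames();
        sameConfig.upload("myTable", 0L, 1000L, 100);
        sameConfig.upload("myTable", 0L, 1000L, 100); // rerun, same name: replaced
        System.out.println(sameConfig.totalRows()); // 100

        RtoSegmentNames changedConfig = new RtoSegmentNames();
        changedConfig.upload("myTable", 0L, 1000L, 100);
        changedConfig.upload("myTable_v2", 0L, 1000L, 100); // rerun, new name: both kept
        System.out.println(changedConfig.totalRows()); // 200
    }
}
```

With the changed name, both segments cover the same window, so the same rows are counted twice.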

Harnoor7 changed the title from "[Bug] Data Inconsistency issues in RealtimeToOffline Minion tasks" to "[Bug] Data Inconsistency in RealtimeToOffline Minion tasks" on Dec 15, 2024