[Bug] Archiver working on Segments (day/week/month/range/year) on every run, even with parameter --skip-segments-today #22868
Comments
Hi @peterbo, thank you for raising this issue. We also became aware of it some time ago, and included a fix in
There is still a gap though: if archiving is not running on the first day of a period, it will still run too often. For example, the week period for a segment will be skipped on a Monday, but it is archived multiple times on a Tuesday, even though "--skip-segments-today" implies it should only take the Monday data into account and run exactly once on Tuesday. A fix for this behaviour is already planned.
Can you give
Thank you, @mneudert. I haven't updated many instances to 5.2.0 yet, but I'll give it a shot!
Hi @mneudert, I can confirm that this still exists in 5.2.0 (as you already mentioned).
Even though archiving runs once an hour, it invalidates archives for today and yesterday:
It works on the previous 30 days range 18.11 until yesterday (no new data invalidated this archive):
And also on week/month/year/range archives (with and without segments):
Also, it is ignoring the time_before_today_archive_considered_outdated setting, which is set to 3000:
The instance was updated to 5.2.0 yesterday around noon, so the new invalidation behaviour from #22546 is already live.
Thank you for giving the new release a try @peterbo.
That message will, under most circumstances, always be displayed, even if nothing is actually invalidated in the end. If you run archiving with verbose logging (
In this case it skipped both the "All Visits" (
If you have "previous30" configured to be archived, either by configuring it in The range will currently always be subject to I expect this will be declared as a bug, and planned for a fix.
As you have noticed, with For your custom dimension it should match the described and planned-to-be-fixed bug around periods not starting today.
What happened?
Two examples:
INFO [2024-12-16 17:02:50] 45194 Archived website id 1, period = month, date = 2024-12-01, segment = 'referrerType==campaign', 562709 visits found. Time elapsed: 11.308s
INFO [2024-12-16 17:06:42] 45194 Archived website id 1, period = year, date = 2024-01-01, segment = 'dimension1==logged-in;dimension2==consent_given', 1559273 visits found. Time elapsed: 34.031s
Next run:
INFO [2024-12-16 17:13:08] 74184 Archived website id 1, period = month, date = 2024-12-01, segment = 'referrerType==campaign', 562709 visits found. Time elapsed: 5.748s
INFO [2024-12-16 17:13:42] 74184 Archived website id 1, period = year, date = 2024-01-01, segment = 'dimension1==logged-in;dimension2==consent_given', 1559273 visits found. Time elapsed: 21.569s
Archiver is called like this:
./console core:archive --no-ansi --skip-segments-today
Additionally, the archiver seems to be ignoring time_before_today_archive_considered_outdated, which is set to 3000 (the archiving runs above are only about 10 minutes apart).
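To make the repetition easier to spot in longer logs, here is a minimal sketch (not part of Matomo) that parses "Archived website id ..." lines like the ones above and reports every archive that was processed again before the configured number of seconds had passed. The function name `find_repeats`, the regular expression, and the `SAMPLE` list are my own assumptions for illustration. The script only measures gaps between log entries; it makes no claim about how Matomo itself applies time_before_today_archive_considered_outdated to each period.

```python
import re
from datetime import datetime

# Assumed log layout, taken from the four sample lines in this report.
LOG_LINE = re.compile(
    r"INFO \[(?P<ts>[^\]]+)\] (?P<pid>\d+) Archived website id (?P<site>\d+), "
    r"period = (?P<period>\w+), date = (?P<date>[\d-]+), "
    r"segment = '(?P<segment>[^']*)'"
)


def find_repeats(lines, ttl_seconds):
    """Return (archive_key, gap_in_seconds) for every archive that was
    processed again less than ttl_seconds after its previous run."""
    last_seen = {}
    repeats = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # not an "Archived website id" line
        ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S")
        key = (match["site"], match["period"], match["date"], match["segment"])
        if key in last_seen:
            gap = (ts - last_seen[key]).total_seconds()
            if gap < ttl_seconds:
                repeats.append((key, gap))
        last_seen[key] = ts
    return repeats


# The four log lines quoted in this report.
SAMPLE = [
    "INFO [2024-12-16 17:02:50] 45194 Archived website id 1, period = month, date = 2024-12-01, segment = 'referrerType==campaign', 562709 visits found. Time elapsed: 11.308s",
    "INFO [2024-12-16 17:06:42] 45194 Archived website id 1, period = year, date = 2024-01-01, segment = 'dimension1==logged-in;dimension2==consent_given', 1559273 visits found. Time elapsed: 34.031s",
    "INFO [2024-12-16 17:13:08] 74184 Archived website id 1, period = month, date = 2024-12-01, segment = 'referrerType==campaign', 562709 visits found. Time elapsed: 5.748s",
    "INFO [2024-12-16 17:13:42] 74184 Archived website id 1, period = year, date = 2024-01-01, segment = 'dimension1==logged-in;dimension2==consent_given', 1559273 visits found. Time elapsed: 21.569s",
]

repeats = find_repeats(SAMPLE, ttl_seconds=3000)
for (site, period, date, segment), gap in repeats:
    print(f"site {site} {period} {date} [{segment}] re-archived after {gap:.0f}s")
```

For the lines above, both the month and the year segment archive are reported as repeated, with gaps of 618 and 420 seconds respectively, both well below 3000.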
What should happen?
Segments should be archived only once when the archiver is called with the parameter "--skip-segments-today", not on every run (the exception being when a previous period was invalidated by new data, which is not the case here).
How can this be reproduced?
In a Matomo instance that contains Segments, call the archiver and view the processing information.
Matomo version
5.1.2
PHP version
8.3
Server operating system
Debian
What browsers are you seeing the problem on?
Not applicable (e.g. an API call etc.)
Computer operating system
No response
Relevant log output
No response