Fix: IT fails in anvilprod due to multiple values for is_supplementary (#5229) #5231

nadove-ucsc
Contributor

@nadove-ucsc nadove-ucsc commented May 22, 2023

Connected issues: #5229

Checklist

Author

  • PR is a draft
  • Target branch is develop
  • Name of PR branch matches issues/<GitHub handle of author>/<issue#>-<slug>
  • PR title references all connected issues
  • PR title matches¹ that of a connected issue or comment in PR explains why they're different
  • For each connected issue, there is at least one commit whose title references that issue
  • PR is connected to all connected issues via ZenHub
  • PR description links to connected issues
  • Added partial label to PR or this PR completely resolves all connected issues

¹ When the issue title describes a problem, the corresponding PR
title is Fix: followed by the issue title

Author (reindex, API changes)

  • Added r tag to commit title or this PR does not require reindexing
  • Added reindex label to PR or this PR does not require reindexing
  • Added a (compatible changes) or A (incompatible ones) tag to commit title or this PR does not modify the Azul service API
  • Added API label to connected issues or this PR does not modify the Azul service API

Author (chains)

  • This PR is blocked by previous PR in the chain or this PR is not chained to another PR
  • Added base label to the blocking PR or this PR is not chained to another PR
  • Added chained label to this PR or this PR is not chained to another PR

Author (upgrading)

  • Documented upgrading of deployments in UPGRADING.rst or this PR does not require upgrading
  • Added u tag to commit title or this PR does not require upgrading
  • Added upgrade label to PR or this PR does not require upgrading

Author (operator tasks)

  • Added checklist items for additional operator tasks or this PR does not require additional tasks

Author (hotfixes)

  • Added F tag to main commit title or this PR does not include permanent fix for a temporary hotfix
  • Reverted the temporary hotfixes for any connected issues or the prod branch has no temporary hotfixes for any connected issues

Author (before every review)

  • Rebased PR branch on develop, squashed old fixups
  • Ran make requirements_update or this PR does not touch requirements*.txt, common.mk, Makefile and Dockerfile
  • Added R tag to commit title or this PR does not touch requirements*.txt
  • Added reqs label to PR or this PR does not touch requirements*.txt
  • make integration_test passes in personal deployment or this PR does not touch functionality that could break the IT

Peer reviewer (after requesting changes)

Uncheck the Author (before every review) checklists.

Peer reviewer (after approval)

  • PR is not a draft
  • Ticket is in Review requested column
  • Requested review from primary reviewer
  • Assigned PR to primary reviewer

Primary reviewer (after requesting changes)

Uncheck the before every review checklists. Update the N reviews label.

Primary reviewer (after approval)

  • Actually approved the PR
  • Labeled connected issues as demo or no demo
  • Commented on connected issues about demo expectations or all connected issues are labeled no demo
  • Decided if PR can be labeled no sandbox
  • PR title is appropriate as title of merge commit
  • N reviews label is accurate
  • Moved ticket to Approved column
  • Assigned PR to current operator

Operator (before pushing the merge commit)

  • Checked reindex label and r commit title tag
  • Checked that demo expectations are clear or all connected issues are labeled no demo
  • PR has checklist items for upgrading instructions or PR is not labeled upgrade
  • Squashed PR branch and rebased onto develop
  • Sanity-checked history
  • Pushed PR branch to GitHub
  • Pushed PR branch to GitLab dev and added sandbox label or PR is labeled no sandbox
  • Pushed PR branch to GitLab anvildev or PR is labeled no sandbox
  • Build passes in sandbox deployment or PR is labeled no sandbox
  • Build passes in anvilbox deployment or PR is labeled no sandbox
  • Reviewed build logs for anomalies in sandbox deployment or PR is labeled no sandbox
  • Reviewed build logs for anomalies in anvilbox deployment or PR is labeled no sandbox
  • Deleted unreferenced indices in sandbox or this PR does not remove catalogs or otherwise cause unreferenced indices
  • Deleted unreferenced indices in anvilbox or this PR does not remove catalogs or otherwise cause unreferenced indices
  • Started reindex in sandbox or this PR does not require reindexing sandbox
  • Started reindex in anvilbox or this PR does not require reindexing sandbox
  • Checked for failures in sandbox or this PR does not require reindexing sandbox
  • Checked for failures in anvilbox or this PR does not require reindexing sandbox
  • Added PR reference to merge commit title
  • Collected commit title tags in merge commit title
  • Moved connected issues to Merged column in ZenHub
  • Pushed merge commit to GitHub

Operator (after pushing the merge commit)

  • Shortened the PR chain or this PR is not labeled base
  • Pushed merge commit to GitLab dev or PR is labeled no sandbox
  • Pushed merge commit to GitLab anvildev or PR is labeled no sandbox
  • Build passes on GitLab dev¹
  • Reviewed build logs for anomalies on GitLab dev¹
  • Build passes on GitLab anvildev¹
  • Reviewed build logs for anomalies on GitLab anvildev¹
  • Deleted PR branch from GitHub
  • Deleted PR branch from GitLab dev
  • Deleted PR branch from GitLab anvildev

¹ When pushing the merge commit is skipped due to the PR being
labelled no sandbox, the next build triggered by a PR whose merge commit is
pushed determines this checklist item.

Operator (reindex)

  • Deleted unreferenced indices in dev or this PR does not remove catalogs or otherwise cause unreferenced indices
  • Deleted unreferenced indices in anvildev or this PR does not remove catalogs or otherwise cause unreferenced indices
  • Started reindex in dev or this PR does not require reindexing
  • Started reindex in anvildev or this PR does not require reindexing
  • Checked for and triaged indexing failures in dev or this PR does not require reindexing
  • Checked for and triaged indexing failures in anvildev or this PR does not require reindexing
  • Emptied fail queues in dev deployment or this PR does not require reindexing
  • Emptied fail queues in anvildev deployment or this PR does not require reindexing

Operator

  • Unassigned PR

Shorthand for review comments

  • L line is too long
  • W line wrapping is wrong
  • Q bad quotes
  • F other formatting problem

@github-actions github-actions bot added the orange [process] Done by the Azul team label May 22, 2023
@coveralls

coveralls commented May 22, 2023

Coverage Status

Changes Unknown when pulling 95334f0 on issues/nadove-ucsc/5229-it-fails-anvilprod-multiple-supp into develop.

@codecov

codecov bot commented May 22, 2023

Codecov Report

Merging #5231 (77f0b87) into develop (61f6e0d) will increase coverage by 0.00%.
The diff coverage is 0.00%.

❗ Current head 77f0b87 differs from pull request most recent head 95334f0. Consider uploading reports for the commit 95334f0 to get more accurate results

@@           Coverage Diff            @@
##           develop    #5231   +/-   ##
========================================
  Coverage    84.38%   84.38%           
========================================
  Files          149      148    -1     
  Lines        18312    18271   -41     
========================================
- Hits         15452    15418   -34     
+ Misses        2860     2853    -7     
Impacted Files              Coverage Δ
test/integration_test.py    0.00% <0.00%> (ø)

... and 3 files with indirect coverage changes
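
The headline percentages follow from the Hits and Lines rows of the diff above. A minimal sketch of that arithmetic, assuming the displayed value is truncated rather than rounded to two decimals (an assumption that is consistent with the numbers shown, not something the report states); `coverage_pct` is a hypothetical helper, not part of Codecov or Azul:

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    """Return covered lines as a percentage, truncated to two decimals.

    Truncation (rather than rounding) is an assumption made so that the
    result lines up with the figures displayed in the report.
    """
    return math.floor(hits / lines * 10_000) / 100


base = coverage_pct(15452, 18312)  # develop
head = coverage_pct(15418, 18271)  # this PR
print(base, head)
```

Both values come out as 84.38. Before truncation the two ratios differ by roughly 0.003 percentage points, which is why the report can show fewer hits and still describe the change as an increase of 0.00%.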

@nadove-ucsc
Contributor Author

Fix verified by running IT with anvilprod snapshots and confirming that a bundle containing both supplementary and non-supplementary files was indexed.

achave11-ucsc
achave11-ucsc previously approved these changes May 23, 2023
Member

@achave11-ucsc achave11-ucsc left a comment

LGTM

@achave11-ucsc achave11-ucsc marked this pull request as ready for review May 23, 2023 05:18
Member

@hannes-ucsc hannes-ucsc left a comment

Assuming you ran the IT on the branch you pushed, I find it difficult to follow how the IT is passing without the fix for #5207, which your branch does not include. Without that fix, the IT frequently fails early due to an oversized partition. Do you know why the IT passes on your branch without that fix?

On my branch (issues/hannes-ucsc/5015-anvilprod), which includes this fix and the one for #5207, the IT fails with extra items in the set of indexed bundles, or missing items in the set of expected bundles:

https://gitlab.prod.anvil.gi.ucsc.edu/ucsc/azul/-/jobs/4499

That branch uses pinned seed (6634795309975096822). Could you please re-run IT from your branch, and with that seed?

I will run IT with a random seed on my branch.

@hannes-ucsc hannes-ucsc added the 0 reviews [process] Lead didn't request any changes label May 23, 2023
@hannes-ucsc
Member

Are you using a common prefix in the deployment you ran the IT against?

@hannes-ucsc
Member

> I will run IT with a random seed on my branch.

Also fails, and in the same way (AssertionError: Items in the first set but not the second).
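
For readers unfamiliar with that message: it is the wording `unittest.TestCase.assertSetEqual` uses when the first set contains members the second lacks. A minimal sketch that reproduces it, assuming the IT compares a set of indexed bundles against a set of expected ones roughly this way; the helper name and the bundle IDs are made up for illustration:

```python
import unittest
from typing import Optional


def set_mismatch_message(first: set, second: set) -> Optional[str]:
    """Return the assertSetEqual failure message, or None if the sets match."""
    case = unittest.TestCase()
    try:
        case.assertSetEqual(first, second)
    except AssertionError as e:
        return str(e)
    return None


# Hypothetical bundle IDs: 'b1' was indexed but is not among the expected ones
indexed = {'b1', 'b2'}
expected = {'b2'}
print(set_mismatch_message(indexed, expected))
```

The message lists the offending members one per line, so the extra or missing bundle IDs can be read straight from the failure output.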

@hannes-ucsc hannes-ucsc removed their assignment May 23, 2023
@nadove-ucsc
Contributor Author

> Are you using a common prefix in the deployment you ran the IT against?

I was, but I have now removed it. https://service.nadove5.anvil.gi.ucsc.edu/repository/sources

@nadove-ucsc
Contributor Author

> That branch uses pinned seed (6634795309975096822). Could you please re-run IT from your branch, and with that seed?

I rebased on develop, fixed the seed, and tweaked the partition size restrictions to compensate for the removal of the common prefix. With this patch in place, all tests pass on nadove5.

Subject: [PATCH] fix
---
Index: test/integration_test.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/test/integration_test.py b/test/integration_test.py
--- a/test/integration_test.py	(revision d51d7c94425facef9ca708dc25c6b8fa69faf292)
+++ b/test/integration_test.py	(date 1684890437821)
@@ -198,7 +198,7 @@
         super().setUp()
         # All random operations should be made using this seed so that test
         # results are deterministically reproducible
-        self.random_seed = randint(0, sys.maxsize)
+        self.random_seed = 6634795309975096822
         self.random = Random(self.random_seed)
         log.info('Using random seed %r', self.random_seed)
 
@@ -265,7 +265,7 @@
         fqids = self.azul_client.list_bundles(catalog, source, partition_prefix)
         bundle_count = len(fqids)
         partition = f'Partition {effective_prefix!r} of source {source.spec}'
-        if not config.is_sandbox_or_personal_deployment:
+        if True:
             # For sources that use partitioning, 512 is the desired partition
             # size. In practice, we observe the reindex succeeding with sizes
             # >700 without the partition size becoming a limiting factor. From
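
Pinning the seed, as the first hunk of the patch above does, makes every draw from the test's private `Random` instance repeatable across runs. A minimal sketch of that property; `draw_prefixes` and the idea of drawing hex prefixes are hypothetical stand-ins, since the patch does not show how the IT actually consumes `self.random`:

```python
from random import Random

HEX_DIGITS = '0123456789abcdef'


def draw_prefixes(seed: int, count: int, length: int = 2) -> list[str]:
    """Draw `count` random hex prefixes from a generator private to this call.

    Using a dedicated Random(seed) instead of the module-level functions keeps
    the sequence independent of any other code that uses the random module.
    """
    rng = Random(seed)
    return [
        ''.join(rng.choice(HEX_DIGITS) for _ in range(length))
        for _ in range(count)
    ]


seed = 6634795309975096822  # the pinned seed from the discussion above
first_run = draw_prefixes(seed, 3)
second_run = draw_prefixes(seed, 3)
print(first_run == second_run)  # same seed, same sequence
```

With `randint(0, sys.maxsize)` as the seed, two runs would generally draw different sequences, which is why a failure seen on one branch could not be reproduced on another without agreeing on the seed first.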

@nadove-ucsc nadove-ucsc force-pushed the issues/nadove-ucsc/5229-it-fails-anvilprod-multiple-supp branch from dc441c6 to d51d7c9 Compare May 24, 2023 01:31
@nadove-ucsc
Contributor Author

To reproduce @hannes-ucsc's observed error, we needed to make this change:

Subject: [PATCH] fix
---
Index: src/azul/terra.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/azul/terra.py b/src/azul/terra.py
--- a/src/azul/terra.py	(revision f6e73608ab5b3f0e4caacfb7ed375cb309256ca8)
+++ b/src/azul/terra.py	(date 1685045014024)
@@ -646,7 +646,7 @@
                 # FIXME: AnVIL deployments conflate indexer and public SAs
                 #        https://github.com/DataBiosphere/azul/issues/4398
                 service_account=(config.ServiceAccount.indexer
-                                 if config.is_anvil_enabled() and not config.is_hca_enabled()
+                                 if False
                                  else config.ServiceAccount.public)
             )
         )

to emulate the effect of this commit on his branch.

@nadove-ucsc nadove-ucsc force-pushed the issues/nadove-ucsc/5229-it-fails-anvilprod-multiple-supp branch from d51d7c9 to 9a10cd1 Compare May 25, 2023 23:48
@nadove-ucsc
Contributor Author

Files incorrectly marked as supplementary were causing the bundles that contain them to be counted twice: once (correctly) as a primary bundle and once (incorrectly) as a supplementary bundle. My first fix didn't work because, in the /files/ index, no non-supplementary files were aggregated along with the supplementary ones.
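
The double counting described here can be illustrated with a small, self-contained sketch. The record layout, field names, and both counting functions are hypothetical simplifications for illustration; they are not Azul's index schema or the fix in this PR:

```python
# Hypothetical, simplified file records (not Azul's actual schema)
files = [
    {'bundle': 'b1', 'is_supplementary': False},
    {'bundle': 'b1', 'is_supplementary': True},   # incorrectly marked
    {'bundle': 'b2', 'is_supplementary': False},
    {'bundle': 's1', 'is_supplementary': True},   # genuinely supplementary
]


def naive_bundle_count(files: list[dict]) -> int:
    """Count primary and supplementary bundles separately and add them up.

    A bundle containing files of both kinds lands in both sets, so it is
    counted twice.
    """
    primary = {f['bundle'] for f in files if not f['is_supplementary']}
    supplementary = {f['bundle'] for f in files if f['is_supplementary']}
    return len(primary) + len(supplementary)


def distinct_bundle_count(files: list[dict]) -> int:
    """Count each bundle once, regardless of how its files are marked."""
    return len({f['bundle'] for f in files})


print(naive_bundle_count(files), distinct_bundle_count(files))
```

In this toy data, bundle `b1` is the one that inflates the naive count, because it contributes to both the primary and the supplementary set.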

Member

@hannes-ucsc hannes-ucsc left a comment

With this change, the IT passes on my branch, too.

Please request a PL slot about this.

@hannes-ucsc hannes-ucsc removed their assignment May 26, 2023
@nadove-ucsc nadove-ucsc force-pushed the issues/nadove-ucsc/5229-it-fails-anvilprod-multiple-supp branch from 9a10cd1 to 77f0b87 Compare May 26, 2023 22:27
@hannes-ucsc hannes-ucsc force-pushed the issues/nadove-ucsc/5229-it-fails-anvilprod-multiple-supp branch from 77f0b87 to 95334f0 Compare May 27, 2023 00:18
@achave11-ucsc achave11-ucsc merged commit 1f28050 into develop May 30, 2023
@achave11-ucsc achave11-ucsc deleted the issues/nadove-ucsc/5229-it-fails-anvilprod-multiple-supp branch May 30, 2023 23:28
@achave11-ucsc achave11-ucsc removed their assignment May 30, 2023
Labels
0 reviews [process] Lead didn't request any changes orange [process] Done by the Azul team