[PoC] Catch ImagePullBackOff from receptor work unit and surface to job_explanation #15689

Open
wants to merge 1 commit into
base: devel

Conversation

TheRealHaoLiu
Member

SUMMARY

Catch ImagePullBackOff from the receptor work unit and surface it to job_explanation instead of only result_traceback, so we stop getting the

Failed to JSON parse a line from worker stream. Error: Expecting value: line 1 column 1 (char 0) Line with invalid JSON data: b''

for this specific case

ISSUE TYPE
  • Bug, Docs Fix or other nominal change
COMPONENT NAME
  • API
AWX VERSION

ADDITIONAL INFORMATION


sonarqubecloud bot commented Dec 5, 2024

logger.warning(detail)
log_name = self.task.instance.log_format
logger.warning(f"Could not launch pod for {log_name}. ImagePullBackOff.")
self.task.runner_callback.delay_update(job_explanation=f'{detail}')
Member


Why wouldn't this hit the later line, L559, self.task.runner_callback.delay_update(result_traceback=f'Receptor detail:\n{detail}')? Because it had output that you didn't find helpful?

Instead of adding another condition, I would rather suggest this change:

diff --git a/awx/main/tasks/receptor.py b/awx/main/tasks/receptor.py
index 576ad661d5..aba140434e 100644
--- a/awx/main/tasks/receptor.py
+++ b/awx/main/tasks/receptor.py
@@ -549,9 +549,9 @@ class AWXReceptorJob:
                         receptor_output = b"".join(lines).decode()
                     if receptor_output:
                         self.task.runner_callback.delay_update(result_traceback=f'Worker output:\n{receptor_output}')
-                    elif detail:
+                    if detail:
                         self.task.runner_callback.delay_update(result_traceback=f'Receptor detail:\n{detail}')
-                    else:
+                    if not (detail or receptor_output):
                         logger.warning(f'No result details or output from {self.task.instance.log_format}, status:\n{state_name}')
                 except Exception:
                     logger.exception(f'Work results error from job id={self.task.instance.id} work_unit={self.task.instance.work_unit_id}')
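To make the control-flow difference in the diff above concrete, here is a minimal standalone sketch. It is not AWX code: the function names `report_elif` and `report_independent` are hypothetical, and a list of tuples stands in for the `delay_update` and `logger.warning` calls.

```python
def report_elif(receptor_output, detail):
    """Current chain: detail is reported only when there is no worker output."""
    updates = []
    if receptor_output:
        updates.append(("result_traceback", f"Worker output:\n{receptor_output}"))
    elif detail:
        updates.append(("result_traceback", f"Receptor detail:\n{detail}"))
    else:
        updates.append(("warning", "No result details or output"))
    return updates


def report_independent(receptor_output, detail):
    """Suggested form: worker output and detail are each checked on their own."""
    updates = []
    if receptor_output:
        updates.append(("result_traceback", f"Worker output:\n{receptor_output}"))
    if detail:
        updates.append(("result_traceback", f"Receptor detail:\n{detail}"))
    if not (detail or receptor_output):
        updates.append(("warning", "No result details or output"))
    return updates
```

The two functions only differ when both values are non-empty: the chained version emits one update, the independent version emits two. How two successive `result_traceback` updates would be combined in AWX is not shown here.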

Member Author


Trying to get it into job_explanation instead of result_traceback. I'm sick of that JSON parse error. This way we start to whittle down the occurrences of the JSON parse error.

Member Author


The key change is setting job_explanation.

Member


If your position is that the receptor detail should go in job_explanation instead of result_traceback, then you should modify the existing line (using f'Receptor detail:\n{detail}'), instead of adding a boutique condition for the narrow case you are looking at.
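As a rough sketch of what that one-line change could look like, assuming the existing branch structure is kept: `RecordingCallback` and `surface_receptor_detail` are hypothetical stand-ins written for illustration, not AWX classes, and the detail string is a placeholder rather than real receptor output.

```python
class RecordingCallback:
    """Hypothetical stand-in for runner_callback; it only records keyword updates."""

    def __init__(self):
        self.updates = {}

    def delay_update(self, **kwargs):
        self.updates.update(kwargs)


def surface_receptor_detail(callback, receptor_output, detail):
    """Mirror the existing branches, routing detail to job_explanation."""
    if receptor_output:
        callback.delay_update(result_traceback=f"Worker output:\n{receptor_output}")
    elif detail:
        # The only change from the existing line: job_explanation, not result_traceback.
        callback.delay_update(job_explanation=f"Receptor detail:\n{detail}")


cb = RecordingCallback()
surface_receptor_detail(cb, "", "placeholder detail text")
```

With this shape, any receptor detail lands in job_explanation, not only the ImagePullBackOff case.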
