Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge Python 3 compatible changes #519

Merged
merged 15 commits into from
Feb 11, 2020
Merged

Conversation

replaceafill
Copy link
Member

@replaceafill replaceafill commented Feb 7, 2020

This PR cherry picks most of the Python 3 compatible changes from https://github.com/artefactual/archivematica-storage-service/tree/dev/issue-806-python-3-upgrade. The only two commits left out are:

An extra commit was created to fix linting.

I tested this in Jenkins using the matrix-qa features and even though I got a GPG space related error in the aip-encryption.feature, a local test run with docker using the same feature file passes. The black-box tag also passes in docker, so I'm fairly confident this will pass AMAUATs if it's merged.

Connected to archivematica/Issues#806

@replaceafill replaceafill marked this pull request as ready for review February 7, 2020 05:49
@replaceafill replaceafill requested a review from a team February 7, 2020 05:50
@ross-spencer ross-spencer self-requested a review February 7, 2020 14:53
Copy link
Contributor

@ross-spencer ross-spencer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work @replaceafill -- just one potential bug noticed. I should note as well, that like you I was running this code throughout a good part of the py3 work so there's some good confidence built.

I ran pylint and I wonder if you would address the absolute_import changes mentioned here, and feel free to leave this out, but if you had the time, could you maybe address the final compatibility issue recognized by the linter? I believe (I could be wrong) this is because we're reusing e for a variable we might otherwise call entry when e was previously defined as an exception object e. I think then pylint --py3k will return a huge thumbs up!

pylint --py3k $(find * | grep py$)
************* Module locations.migrations.0024_allow_blank_aws_auth
[W1618 no-absolute-import] import missing `from __future__ import absolute_import` File: storage_service/locations/migrations/0024_allow_blank_aws_auth.py, line 4, in 
************* Module locations.migrations.0023_s3_bucket_field
[W1618 no-absolute-import] import missing `from __future__ import absolute_import` File: storage_service/locations/migrations/0023_s3_bucket_field.py, line 4, in 
************* Module locations.migrations.0025_update_package_size
[W1618 no-absolute-import] import missing `from __future__ import absolute_import` File: storage_service/locations/migrations/0025_update_package_size.py, line 4, in 
************* Module locations.tests.test_s3
[W1618 no-absolute-import] import missing `from __future__ import absolute_import` File: storage_service/locations/tests/test_s3.py, line 1, in 
************* Module locations.tests.test_space
[W1618 no-absolute-import] import missing `from __future__ import absolute_import` File: storage_service/locations/tests/test_space.py, line 1, in 
************* Module locations.models.space
[W1661 exception-escape] Using an exception object that was bound by an except handler File: storage_service/locations/models/space.py, line 757, in Space.browse_rsync

-----------------------------------
Your code has been rated at 9.99/10

Thank you too for the Jenkins report! I really appreciate its use in this context. I think Santi would too! We just need to figure out a way to make it trigger on command via the PR!

@@ -1,3 +1,5 @@
from __future__ import absolute_import
from __future__ import print_function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should try isort or consider waiting until it is in CI, we would end up with from __future__ import absolute_import, print_function here I think. There may be other cases below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree the imports look strange. But I'd rather we wait to define what style we're going to use for isort and then run it once accross all projects like we did with black.

@@ -1,3 +1,5 @@
from __future__ import absolute_import
from __future__ import print_function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the patterns in the migration, it seems that certain future imports are brought in as a matter of course to increase Py3 compatibility? I can see the benefit of that. Just looking to clarify as I begin the CR where I don't believe the absolute_import has an impact on this particular script -- I think this would be hard to determine across the entire code-base so another + for doing it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During the Python 3 sprint I noticed several conversations in the python-modernize project regarding this:

I think that if the imports don't hurt and linting doesn't complain then it's +1 from me 😊

from django import forms
from django.contrib import auth
from django.utils.translation import ugettext_lazy as _

from common import utils

from locations.models import Location, Space
from six.moves import zip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -61,11 +62,11 @@
import scandir
from django.core.management.base import BaseCommand
from django.db.utils import IntegrityError
from django.utils.six.moves import input
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting. Looking at six vs. django.utils.six is that the former is better maintained and more suited for our purpose?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the same module, but Django's version is older:

@@ -369,7 +370,7 @@ def check_if_aip_already_exists(aip_uuid):
" want to import this AIP anyway (and destroy the existing one),"
' then enter "y" or "yes": '.format(aip_uuid)
)
user_response = input(prompt)
user_response = eval(input(prompt))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure the impact of eval here so I gave it a try with the management command. I am seeing an error, but perhaps I am missing something?

Command

python manage.py import_aip --compression-algorithm "7z with bzip" --decompress-source --pipeline 15affa1a-0bb8-499b-ad98-ce18b5b0f494 --aip-storage-location a272a155-2cfc-4a4d-92c9-613ebfd3394d demo_csv-ebca30c2-285a-4835-8722-de29981afdf5.7z

Error

An AIP with UUID ebca30c2-285a-4835-8722-de29981afdf5 already exists in this Storage Service? If you want to import this AIP anyway (and destroy the existing one), then enter "y" or "yes": y
Traceback (most recent call last):
  File "manage.py", line 11, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 354, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 346, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python2.7/site-packages/django/core/management/base.py", line 394, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python2.7/site-packages/django/core/management/base.py", line 445, in execute
    output = self.handle(*args, **options)
  File "/src/storage_service/common/management/commands/import_aip.py", line 153, in handle
    options["tmp_dir"],
  File "/src/storage_service/common/management/commands/import_aip.py", line 429, in import_aip
    check_if_aip_already_exists(aip_uuid)
  File "/src/storage_service/common/management/commands/import_aip.py", line 375, in check_if_aip_already_exists
    user_response = eval(input(prompt))
  File "<string>", line 1, in <module>
NameError: name 'y' is not defined

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice catch! The module was already using six and it seems like modernize considered the input to be the Python keyword so it reconverted it.

line
for line in subprocess.check_output("7z").splitlines()
if b"Version" in line
][0].decode("utf8")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -294,7 +295,7 @@ class Migration(migrations.Migration):
blank=True,
help_text=b"Identifier for the Archivematica pipeline",
unique=True,
verbose_name=b"UUID",
verbose_name="uuid",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the case-change significant here? UUID vs. uuid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was a mistake. Thanks for catching it!

@@ -1,6 +1,7 @@
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from __future__ import absolute_import
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be last migration in the list, just noting that these all apply okay when run.


__all__ = ("Package",)


LOGGER = logging.getLogger(__name__)


@six.python_2_unicode_compatible
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useful decorator to know about.

@@ -308,6 +309,7 @@ def test_move_from_ss_chunked_file(self):
shutil.copy(os.path.join(FIXTURES_DIR, "chunk_file.txt"), str(self.tmpdir))
file_path = str(self.tmpdir / "chunk_file.txt")
self.ds_object.CHUNK_SIZE = 10 * 1024 # Set testing chunk size
self.ds_object.BUFFER_SIZE = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, interesting, were these BUFFER_SIZE additions (403) determined by the Python 3 migration?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly this was needed because the default buffer size is 1MB and it also affects the chunking functionality in Duracloud.

The files used in the test fixtures are very small (a few KBs) so they didn't really trigger chunking. Setting the buffer size this small guarantees we'll hit that logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

except Exception as e:
LOGGER.warning("rsync list failed: %s", e, exc_info=True)
except Exception as error:
LOGGER.warning("rsync list failed: %s", error, exc_info=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really appreciate this one, thanks @replaceafill!

@@ -308,6 +309,7 @@ def test_move_from_ss_chunked_file(self):
shutil.copy(os.path.join(FIXTURES_DIR, "chunk_file.txt"), str(self.tmpdir))
file_path = str(self.tmpdir / "chunk_file.txt")
self.ds_object.CHUNK_SIZE = 10 * 1024 # Set testing chunk size
self.ds_object.BUFFER_SIZE = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

Copy link
Contributor

@ross-spencer ross-spencer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this all looks good @replaceafill. And what does pylint say?

archivematica-storage-service$ pylint --py3k $(find * | grep py$)

-------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 9.99/10, +0.01)

💯👍

@replaceafill replaceafill merged commit 6ac2c22 into qa/0.x Feb 11, 2020
@replaceafill replaceafill deleted the dev/issue-806-merge-python-3 branch February 11, 2020 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants