Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace iso3166list with multiple optional iso3166 #9

Open
KDean-Dolphin opened this issue Jun 26, 2024 · 5 comments
Open

Replace iso3166list with multiple optional iso3166 #9

KDean-Dolphin opened this issue Jun 26, 2024 · 5 comments
Assignees

Comments

@KDean-Dolphin
Copy link

The use of iso3166list is a problem for simple validators that are concerned only with structural validation (i.e., type and length). The way AIs 423 and 425 are defined, a structural validator would allow "1234" to pass.

423 ? N..15,iso3166list req=01,02 ex=426 # COUNTRY - INITIAL PROCESS
424 ? N3,iso3166 req=01,02 ex=426 # COUNTRY - PROCESS
425 ? N..15,iso3166list req=01,02 ex=426 # COUNTRY - DISASSEMBLY

I suggest these be rewritten using the optional terminology as follows:

423         ?  N3,iso3166 [N3,iso3166] [N3,iso3166] [N3,iso3166] [N3,iso3166] req=01,02 ex=426                     # COUNTRY - INITIAL PROCESS
424         ?  N3,iso3166                    req=01,02 ex=426                     # COUNTRY - PROCESS
425         ?  N3,iso3166 [N3,iso3166] [N3,iso3166] [N3,iso3166] [N3,iso3166] req=01,02 ex=426                     # COUNTRY - DISASSEMBLY

An alternative would be to borrow the repetition syntax from regex:

423         ?  N3,iso3166{1,5}               req=01,02 ex=426                     # COUNTRY - INITIAL PROCESS
424         ?  N3,iso3166                    req=01,02 ex=426                     # COUNTRY - PROCESS
425         ?  N3,iso3166{1,5}               req=01,02 ex=426                     # COUNTRY - DISASSEMBLY
@terryburton
Copy link
Member

terryburton commented Jun 27, 2024

We have a similar issues with minimum/specific lengths with AIs such as (8007) "X..34,iban" where an ISBN cannot practically be as short as a single character and it is only the linter than enforces the length. The same with the key linter that (when not hooking into an external GCP lookup function) ensures a minimum of (currently) four leading digits.

I've never considered the current behaviour to be a problem since we provide the linters, as well as reference code (the Syntax Engine) for how to apply them, so code that is concerned only with structural validation can be easily enhanced. If people wanted a structure-only solution then maybe they could resort to the published regexs instead.

Nevertheless, I'm still tempted by the suggested N3,iso3166 [N3],iso3166 [N3],iso3166 ... syntax for the ISO 3166 list examples by virtue of the fact that is makes a linter redundant. Being smart by "doing more with less" is a worthwhile goal!

The suggested syntax works in the specific cases that you mention, but we have to be very careful about introducing a syntax that selects between more than a single optional component since it may easily lead to ambiguity, e.g. when the linters or character sets differ. Given X3 [X5],wibble [N5],wobble and an input ABC12345, do we parse 12345 using csetX and process it with wibble or do we parse it with csetN and process it with wobble?

We must carefully assert that the validation process involves a left to right match against each component of the AI's format specification for consecutive format-defined-size prefixes of the given data, declaring success if only optional components remain when we run out of data.

I've pushed a Syntax Dictionary update (currently with another test AI (85)) as well as some corresponding unit tests to a "multi-optional" branch of the Syntax Engine: gs1/gs1-syntax-engine@2c765a6

The existing Syntax Engine code already does everything you would expect when it encounters multiple optional trailing components. It also survives fuzzing which generally is a good way of finding unexpected issues with changes of syntax. I'll give it some more thought and see if anything comes up.

I would feel more reassured by the approach if we can introduce constraints into the GenSpecs for the AI system as a whole:

Either (preferably), forbid AIs from being defined with multiple optional components (currently there is only ever one optional final component) so that we are free to define fine-grained substructure in the BSR without fear of the syntax clashing with the GenSpecs in the future. This is the status quo, but I have not see it codified anywhere, so it's not clear whether it will remain so.

Or (less ideal), define that (1) all optional components must be at the end of the format specification (just as is currently the case) and that (2) all non-terminal components, whether mandatory or optional, must have fixed lengths (plainly obvious to avoid ambiguity). Then define a single correct matching action for the case of multiple optional components — preferably using left-to-right matching described above, and certainly not resorting to a "feasibility check" in order to resolve ambiguity.

All current AI definitions already satisfy both of the above constraint scenarios. Therefore any agreed restrictions are design constraints for future work and do not change the current GS1 system.

Without some codified constrains I think that we risk defining a behaviour for the BSR that gets redefined in the future by updates to the GenSpecs, possibly even with the GenSpec's behaviour being defined on an individual AI basis (for future AIs). This would not align with the BSRs existing notion of left-to-right componentisation.

Whilst the alternative N3,iso3166{1,5} syntax is perhaps less open to misinterpretation (since it cannot be misconstrued as selecting between differing optional components) I am reluctant to extend the current syntax with repetition so that it is not a basic left-to-right match (or give a false impression that a regex engine is required).

@terryburton
Copy link
Member

Done in 6d37ba2. Thanks for the suggestion.

Leaving this open until we've captured the GS1 GenSpecs AI design constraints issue elsewhere.

@KDean-Dolphin
Copy link
Author

How would you like to approach the GenSpecs issue? I think the text above is a good foundation for a Work Request.

@terryburton
Copy link
Member

How would you like to approach the GenSpecs issue? I think the text above is a good foundation for a Work Request.

@PetaDing-GS1 may be able to advise whether this can be bundled into the scope of any existing / planned work. Otherwise I'm certainly happy for someone to champion a new work item based on the above text.

@terryburton
Copy link
Member

Following discussion, @PetaDing-GS1 to consider how to progress.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants