Improve Go Case Conversion #98

sergiosalvatore · 2024-10-04T20:56:14Z

- provide the ability to augment or replace the set of initialisms that are used
- provide an EncodeCasingFunc for the Go style
- introduce the concept of "atoms" so that Go will not attempt to break up names that we want to be kept together

silvenac

lgtm, just nits

silvenac · 2024-10-04T23:47:22Z

tagformat/caseconversion/case_conversion.go

+
+// SetInitialisms replaces the set of initialisms used by the GoCaseConverter
+// with the argument.
+func (g *GoCaseConverter) SetInitialisms(initialisms []string) {


do we want to set or recommend any limits on the max number of initialisms or atoms? (stealing your "we should have limits on everything" philosophy 😉)

sergiosalvatore · 2024-10-07

It's a good thought. I just put in some protections against short initialisms or atoms that would certainly cause problems. Given that Dials is a library, I think imposing limits on the number of initialisms or atoms is probably a step too far.

silvenac · 2024-10-04T23:50:30Z

tagformat/caseconversion/case_conversion.go

+// AddInitialisms adds the passed initialisms to the set of initialisms.
+func (g *GoCaseConverter) AddInitialism(initialism ...string) {
+	g.initialisms = append(g.initialisms, initialism...)
+	sort.Strings(g.initialisms)


if we don't set any limits on how long the lists of initialisms are, maybe we should insert at the right sort index, but probably this is a micro-optimization. also do we care about duplicates?

sergiosalvatore · 2024-10-07

My impression is that users should not be calling AddInitialism in a loop and instead would just call it once in main at startup to configure the extra things they want to add.
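
For reference, the "insert at the right sort index" idea from this thread could look roughly like the sketch below. It is only an illustration: insertSorted is a hypothetical helper, not code from this PR, and it assumes Go 1.21+ for the slices package. It also skips duplicates, which is one possible answer to the duplicates question.

```go
package main

import (
	"fmt"
	"slices"
)

// insertSorted returns sorted with each of vals inserted at its sort
// position, skipping values that are already present. sorted must already
// be in ascending order. (Hypothetical helper, not part of the PR.)
func insertSorted(sorted []string, vals ...string) []string {
	for _, v := range vals {
		idx, found := slices.BinarySearch(sorted, v)
		if found {
			continue // duplicate: leave the slice unchanged
		}
		sorted = slices.Insert(sorted, idx, v)
	}
	return sorted
}

func main() {
	initialisms := []string{"HTTP", "ID", "URL"}
	initialisms = insertSorted(initialisms, "JSON", "ID", "API")
	fmt.Println(initialisms) // prints [API HTTP ID JSON URL]
}
```

Each insert still shifts the tail of the slice, so for a one-time call at startup the append-then-sort approach in the diff is likely just as good; the main practical difference here is the duplicate handling.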

dfinkel

a couple comments

dfinkel · 2024-10-09T12:18:08Z

tagformat/caseconversion/case_conversion.go

+			continue
+		}
+		chunkLen := 2
+		for chunkLen <= len(word) {


You shouldn't need to iterate over sizes like this.

The doc for SearchStrings says:

The return value is the index to insert x if x is not present

... so the longest prefix present will be a nearby predecessor to the index that's actually returned.
I think it would be better to iterate backwards from the returned index until either the entry at that offset is no longer a prefix or you hit the beginning of the list. (assuming that it isn't an exact match)

(If we're going to do exact matches after case-conversion, we might as well use a set, and get O(1) checks for the inner loop.)

sergiosalvatore · 2024-10-17

That's a good point. I think the longest prefix (or an exact match) will necessarily be the entry at the position one less than the return value of SearchStrings, or 0 if there are no prefixes. I've updated the code to use this logic and it appears to work as expected with the fewest loop iterations.

dfinkel · 2024-10-18

At some point I convinced myself that the longest prefix would be in the previous position, but while writing that comment, I had trouble convincing myself of it again (which is why I was thinking about iterating backwards).

e.g. if we're looking for abcd and there are entries, a and aa, then aa will be in the preceding position, but we need to grab a, which would be another position earlier.

If we enforce that entries aren't prefixes of one another, then that simplification works, though. (I think I originally thought about this scheme in the context of Fresnel's topic prefix ACLs, where it makes no sense for them to be prefixes of each other (but, in hindsight, we probably aren't validating that 🫤 ))
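
To make the a / aa / abcd case above concrete, here is a minimal sketch of the backwards walk. It is an illustration only, not the code in this PR: longestPrefix is a hypothetical helper, and it assumes the list is sorted with sort.Strings and contains no empty strings. The walk stops at the first entry that is a prefix (which is the longest one, since prefixes of the same word sort by length) or as soon as an entry no longer shares the word's first byte.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// longestPrefix returns the longest entry of sorted that is a prefix of
// word (or equal to it). sorted must be in sort.Strings order and must not
// contain the empty string. (Hypothetical helper, not part of the PR.)
func longestPrefix(sorted []string, word string) (string, bool) {
	if word == "" {
		return "", false
	}
	idx := sort.SearchStrings(sorted, word)
	if idx < len(sorted) && sorted[idx] == word {
		return word, true // exact match
	}
	// Every proper prefix of word sorts before word, so walk backwards
	// from the insertion index.
	for i := idx - 1; i >= 0; i-- {
		if strings.HasPrefix(word, sorted[i]) {
			return sorted[i], true
		}
		if sorted[i][0] != word[0] {
			break // no earlier entry can be a prefix of word
		}
	}
	return "", false
}

func main() {
	entries := []string{"a", "aa"}
	p, ok := longestPrefix(entries, "abcd")
	fmt.Println(p, ok) // prints: a true
}
```

Here SearchStrings returns 2, the preceding entry aa is not a prefix of abcd, and the walk continues one more step to a. If entries are validated so that none is a prefix of another, the loop body runs at most once and the "one position earlier" shortcut is equivalent.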

dfinkel · 2024-10-09T12:23:19Z

tagformat/caseconversion/case_conversion.go

+		} else if initialismOffset < len(g.initialisms) && g.initialisms[initialismOffset] == maybeInitialism {
+			if i == 0 {
+				b.WriteString(w)
+			} else {
+				b.WriteString(maybeInitialism)
+			}
+		} else {
+			if i == 0 {
+				b.WriteString(w)
+			} else {
+				b.WriteString(cases.Title(language.English, cases.NoLower).String(w))


seems like you can pull the i == 0 check here up into the outer if/else cascade and simplify both branches.

sergiosalvatore · 2024-10-17

While there is some duplication in statements, I feel like the original way I wrote it is easier to understand, but I've updated it nonetheless to simplify the branches.

dfinkel · 2024-10-18

yeah, it might warrant a comment.
It is a little odd to see that i == 0 check at the beginning now.
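
For anyone reading this thread later, the hoisted shape being discussed might look something like the sketch below, with the kind of comment suggested above. This is an assumption-heavy illustration rather than the PR's code: joinWords and upperFirst are hypothetical names, the initialisms are held in a set (as floated in the other thread) instead of the sorted slice, and upperFirst only roughly stands in for cases.Title(language.English, cases.NoLower) so that the example needs nothing outside the standard library.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// upperFirst upper-cases the first rune of w and leaves the rest untouched.
// (Hypothetical stand-in for the cases.Title call in the diff.)
func upperFirst(w string) string {
	r, size := utf8.DecodeRuneInString(w)
	if size == 0 || r == utf8.RuneError {
		return w
	}
	return string(unicode.ToUpper(r)) + w[size:]
}

// joinWords concatenates words in the style sketched in the review: the
// first word is written unchanged, later words are upper-cased if they are
// known initialisms and title-cased otherwise. initialisms is assumed to
// hold upper-case keys. (Hypothetical helper, not part of the PR.)
func joinWords(words []string, initialisms map[string]struct{}) string {
	var b strings.Builder
	for i, w := range words {
		upper := strings.ToUpper(w)
		_, isInitialism := initialisms[upper]
		switch {
		case i == 0:
			// The first word is never re-cased, initialism or not, so it
			// is handled before any initialism lookup matters.
			b.WriteString(w)
		case isInitialism:
			b.WriteString(upper)
		default:
			b.WriteString(upperFirst(w))
		}
	}
	return b.String()
}

func main() {
	set := map[string]struct{}{"ID": {}, "HTTP": {}}
	fmt.Println(joinWords([]string{"user", "id", "value"}, set)) // prints userIDValue
	fmt.Println(joinWords([]string{"http", "server"}, set))      // prints httpServer
}
```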

sergiosalvatore force-pushed the impl-encode-go-camel-case branch from 102af63 to fdd3c0f Compare October 4, 2024 21:07

silvenac approved these changes Oct 7, 2024

sergiosalvatore force-pushed the impl-encode-go-camel-case branch from fdd3c0f to 925f319 Compare October 7, 2024 14:32

dfinkel reviewed Oct 9, 2024

sergiosalvatore force-pushed the impl-encode-go-camel-case branch 2 times, most recently from b16213c to cca74ee Compare October 18, 2024 19:24

Improve Go Case Conversion

c275b0a

sergiosalvatore force-pushed the impl-encode-go-camel-case branch from cca74ee to c275b0a Compare December 2, 2024 19:46
