why does security/.../removals.txt not work with Age? #800
Comments
I'll have to look at it
On Thu, May 2, 2024, 21:18 Markus Scherer wrote:
I just added the following to the security data input file removals.txt (PR #777). Why does the Age property not seem to work here?
I also tried a simple \p{Age=16} without the intersection with the list of scripts. No effect either.
@macchiati ideas?
# PAG meeting 2024-04-18 before Unicode 16 beta:
# [Mark]: Policy is that by default
# new characters in scripts that are not Excluded or Limited Use,
# are marked as Uncommon_Use & communicate to SEW
# to ask if there are any exceptions (needed in customary modern widespread use).
# ----
# https://www.unicode.org/reports/tr31/#Table_Recommended_Scripts
# ----
# TODO: This should work with the following set pattern but doesn't;
# and neither with \p{Age=16}. Why?
# [\P{Age=15.1}&[\p{script=Zyyy}\p{script=Zinh}\p{script=Arab}\p{script=Armn}\p{script=Beng}\p{script=Bopo}\p{script=Cyrl}\p{script=Deva}\p{script=Ethi}\p{script=Geor}\p{script=Grek}\p{script=Gujr}\p{script=Guru}\p{script=Hang}\p{script=Hani}\p{script=Hebr}\p{script=Hira}\p{script=Kana}\p{script=Knda}\p{script=Khmr}\p{script=Laoo}\p{script=Latn}\p{script=Mlym}\p{script=Mymr}\p{script=Orya}\p{script=Sinh}\p{script=Taml}\p{script=Telu}\p{script=Thaa}\p{script=Thai}\p{script=Tibt}]] ; uncommon_use
The following works to get just the characters added after version 14.0 and up to 15.1 (> 14.0 and ≤ 15.1):
[\p{script=Zyyy}\p{script=Zinh}\p{script=Arab}\p{script=Armn}\p{script=Beng}\p{script=Bopo}\p{script=Cyrl}\p{script=Deva}\p{script=Ethi}\p{script=Geor}\p{script=Grek}\p{script=Gujr}\p{script=Guru}\p{script=Hang}\p{script=Hani}\p{script=Hebr}\p{script=Hira}\p{script=Kana}\p{script=Knda}\p{script=Khmr}\p{script=Laoo}\p{script=Latn}\p{script=Mlym}\p{script=Mymr}\p{script=Orya}\p{script=Sinh}\p{script=Taml}\p{script=Telu}\p{script=Thaa}\p{script=Thai}\p{script=Tibt}
&\p{Age=15.1}
-\p{Age=14.0}]
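To make the set algebra above concrete, here is a minimal Python sketch. It assumes that \p{Age=x} is cumulative (everything assigned in version x or earlier), which is what the "> 14.0 and ≤ 15.1" description implies. The table, the entry names, and the helper functions are made up for illustration; this is not real UCD data and not the unicodetools code.

```python
# Toy stand-in for the character database: name -> (script, age).
# All entries are hypothetical and exist only to illustrate the set algebra.
TOY_UCD = {
    "latn_old":  ("Latn", (1, 1)),
    "latn_14":   ("Latn", (14, 0)),
    "latn_15":   ("Latn", (15, 0)),
    "zyyy_15_1": ("Zyyy", (15, 1)),
    "latn_16":   ("Latn", (16, 0)),
    "xxxx_15":   ("Xxxx", (15, 0)),  # a script outside the listed ones
}

def age_at_most(version):
    # Cumulative reading of \p{Age=version} (an assumption, see above):
    # every entry assigned in `version` or earlier.
    return {name for name, (_, age) in TOY_UCD.items() if age <= version}

def in_scripts(scripts):
    # Union of \p{script=...} over the given script codes.
    return {name for name, (script, _) in TOY_UCD.items() if script in scripts}

# Shortened stand-in for the full list of scripts in the pattern.
LISTED_SCRIPTS = {"Zyyy", "Latn"}

# [scripts & \p{Age=15.1} - \p{Age=14.0}], applied left to right.
added_after_14 = (in_scripts(LISTED_SCRIPTS) & age_at_most((15, 1))) - age_at_most((14, 0))
print(sorted(added_after_14))
```

Under this reading, subtracting \p{Age=14.0} is what removes everything that already existed in 14.0, so only the entries with ages 15.0 and 15.1 in the listed scripts remain.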
BTW, the following is a more concise way to list the scripts, if you are
using the modified version of UnicodeSet in the unicodetools:
\p{script=Zyyy}\p{script=Zinh}\p{script=Arab}\p{script=Armn}\p{script=Beng}\p{script=Bopo}\p{script=Cyrl}\p{script=Deva}\p{script=Ethi}\p{script=Geor}\p{script=Grek}\p{script=Gujr}\p{script=Guru}\p{script=Hang}\p{script=Hani}\p{script=Hebr}\p{script=Hira}\p{script=Kana}\p{script=Knda}\p{script=Khmr}\p{script=Laoo}\p{script=Latn}\p{script=Mlym}\p{script=Mymr}\p{script=Orya}\p{script=Sinh}\p{script=Taml}\p{script=Telu}\p{script=Thaa}\p{script=Thai}\p{script=Tibt}
==>
\p{script=/Zyyy|Zinh|Arab|Armn|Beng|Bopo|Cyrl|Deva|Ethi|Geor|Grek|Gujr|Guru|Hang|Hani|Hebr|Hira|Kana|Knda|Khmr|Laoo|Latn|Mlym|Mymr|Orya|Sinh|Taml|Telu|Thaa|Thai|Tibt/}
I often wish we had that in the stock ICU...
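For anyone who has to feed such a pattern to stock ICU, the shorthand can be expanded mechanically. The following is a rough Python sketch of that expansion; the function name `expand_alternation` is hypothetical, and it only handles a plain `A|B|C` list of values between the slashes, not the general regular expressions that the unicodetools version may accept.

```python
import re

# Matches \p{prop=/A|B|C/} where every alternative is a plain word
# (letters, digits, underscore). Anything fancier is left untouched.
_SHORTHAND = re.compile(r"\\p\{(\w+)=/(\w+(?:\|\w+)*)/\}")

def expand_alternation(pattern: str) -> str:
    """Rewrite \\p{prop=/A|B/} as \\p{prop=A}\\p{prop=B}."""
    def repl(match):
        prop = match.group(1)
        values = match.group(2).split("|")
        # One \p{prop=value} item per alternative, concatenated (a union
        # when it appears inside a [...] set).
        return "".join(f"\\p{{{prop}={value}}}" for value in values)
    return _SHORTHAND.sub(repl, pattern)

print(expand_alternation(r"[\p{script=/Zyyy|Zinh|Latn/}&\p{Age=15.1}]"))
```

Ordinary property expressions such as \p{Age=15.1} contain no slashes, so the sketch passes them through unchanged.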
On Fri, May 3, 2024 at 7:17 AM Mark Davis Ⓤ wrote:
I'll have to look at it