concatenated profane words make "false positive" prediction #53

Open
ciapecki opened this issue Oct 1, 2024 · 1 comment

Comments

@ciapecki

ciapecki commented Oct 1, 2024

from profanity_check import predict_prob

print(predict_prob(['fuck', 'shit', 'fuckshit']))
# [1. 0.99999982 0.03636672]

Is there a way to have the last element of the array treated as profane as well?

@dimitrismistriotis
Owner

Thanks for the issue.

We have had similar discussions in the past, including when the code was "living" on GitLab, so it is nice to have this here for reference.

To do so, we would need to update the dataset with more sentences in which fuckshit is annotated as profane. Currently we are using the original dataset and have not discussed updating it, although I am open to the possibility if someone can point to a good corpus.
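
Until the dataset is updated, one possible stopgap is to check on the caller's side whether a token splits cleanly into already known profane words, and to combine that with the model's probability. The following is only a minimal sketch under stated assumptions: `KNOWN_PROFANE`, `is_made_of_known_words`, and `flag_tokens` are hypothetical names that are not part of the library, the word list is a two-word placeholder, and the 0.5 threshold is an assumed cut-off.

```python
from typing import Iterable, List, Sequence

# Hypothetical word list for illustration only; it is not shipped with the library.
KNOWN_PROFANE = frozenset({"fuck", "shit"})


def is_made_of_known_words(token: str, vocabulary: Iterable[str] = KNOWN_PROFANE) -> bool:
    """Return True if the whole token can be split into words from the vocabulary."""
    words = {w.lower() for w in vocabulary}
    token = token.lower()
    if not token or not words:
        return False
    longest = max(len(w) for w in words)
    # reachable[i] is True when token[:i] can be segmented into vocabulary words.
    reachable = [True] + [False] * len(token)
    for end in range(1, len(token) + 1):
        for start in range(max(0, end - longest), end):
            if reachable[start] and token[start:end] in words:
                reachable[end] = True
                break
    return reachable[-1]


def flag_tokens(
    tokens: Sequence[str],
    probabilities: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not a library default
) -> List[bool]:
    """Flag a token if the model score is high or the token is a pure concatenation."""
    return [
        prob >= threshold or is_made_of_known_words(tok)
        for tok, prob in zip(tokens, probabilities)
    ]


# Probabilities copied from the report above.
print(flag_tokens(["fuck", "shit", "fuckshit"], [1.0, 0.99999982, 0.03636672]))
```

Because the whole token has to be covered by known words, a word that merely contains one of them as a substring is not flagged by this check. It does not replace retraining on a better corpus, which remains the proper fix.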
