Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control

sfanxiang/Nano

Nano

Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control.

This is the repository for https://arxiv.org/abs/2211.05750.

Instructions

If at stage 0, run `python scripts/gen.py --stage 0 --prompts <prompts>`.

Otherwise, run the two training scripts in order: `python scripts/train_classifier.py && python scripts/train_lm.py <prompts>`. Then generate text by running `python scripts/gen.py --stage <stage> --dev <device> --start-idx <index> --prompts <prompts>` once for each prompt, incrementing `<index>` each time.

`<prompts>` has the format `[(<number0>, '<prompt0>'), (<number1>, '<prompt1>')]`, where each `<number>` is how many samples to generate for the corresponding prompt.
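The format above reads like a Python list literal of `(count, prompt)` tuples. The following is a minimal sketch of building and sanity-checking such a string before passing it on the command line. It assumes the scripts accept a Python-style literal, which this README does not state explicitly. The helper names `format_prompts` and `parse_prompts` and the example prompts are hypothetical and are not part of the repository.

```python
import ast
import shlex


def format_prompts(pairs):
    """Render (count, prompt) pairs as a Python list literal.

    Hypothetical helper: mirrors the format shown in this README.
    """
    return repr([(int(count), str(prompt)) for count, prompt in pairs])


def parse_prompts(text):
    """Parse the literal back and check its shape.

    ast.literal_eval only evaluates literals, so it does not run arbitrary code.
    """
    value = ast.literal_eval(text)
    if not isinstance(value, list):
        raise ValueError("prompts must be a list")
    for item in value:
        if not (isinstance(item, tuple) and len(item) == 2):
            raise ValueError(f"expected a (number, prompt) tuple, got {item!r}")
        count, prompt = item
        if not isinstance(count, int) or not isinstance(prompt, str):
            raise ValueError(f"expected (int, str), got {item!r}")
    return value


# Example prompts are made up for illustration.
prompts_arg = format_prompts([(2, "The weather today"), (3, "My favorite book")])
print(prompts_arg)

# The literal contains quotes, spaces, and parentheses, so it needs shell quoting.
print("python scripts/gen.py --stage 0 --prompts " + shlex.quote(prompts_arg))
```

Quoting matters here because an unquoted literal would be split or interpreted by the shell before the script sees it.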

To label the results locally, run `python scripts/label.py --stage <stage>`.
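The steps above can be sketched as a command plan for one stage. This is only an illustration under several assumptions that the README does not confirm: `<index>` starts at 0 and increases by 1, the full `<prompts>` list is passed on every `gen.py` call, and the device string `cuda:0` is a valid value for `--dev`. The function `stage_commands` is hypothetical. The sketch only prints the commands and does not run them.

```python
import shlex


def stage_commands(stage, prompts, device="cuda:0"):
    """Return the commands for one stage as argument lists.

    Hypothetical helper. `device` and the start index convention are assumptions.
    """
    prompts_arg = repr(prompts)
    commands = []
    if stage == 0:
        commands.append(
            ["python", "scripts/gen.py", "--stage", "0", "--prompts", prompts_arg]
        )
    else:
        # The README chains these with &&, so the second should run
        # only if the first succeeds.
        commands.append(["python", "scripts/train_classifier.py"])
        commands.append(["python", "scripts/train_lm.py", prompts_arg])
        # One generation run per prompt, with an incrementing start index.
        for index in range(len(prompts)):
            commands.append([
                "python", "scripts/gen.py",
                "--stage", str(stage),
                "--dev", device,
                "--start-idx", str(index),
                "--prompts", prompts_arg,
            ])
    # Labeling applies to the results of the stage just generated.
    commands.append(["python", "scripts/label.py", "--stage", str(stage)])
    return commands


# Example prompts are made up for illustration.
plan = stage_commands(1, [(2, "The weather today"), (3, "My favorite book")])
for command in plan:
    print(shlex.join(command))  # shlex.join requires Python 3.8+
```

Keeping each command as an argument list avoids quoting mistakes if the plan is later passed to `subprocess.run` instead of being printed.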
