Unitxt 1.13.0

@elronbandel elronbandel released this 25 Sep 06:37
80b284f

Unitxt 1.13.0 - Multi Modality and Types

New type handling capabilities

The most significant change in this release is the introduction of type serializers to unitxt.
Type serializers are in charge of taking a specific type of data structure, such as Table or Dialog, and serializing it to a textual representation.
Now you can define tasks in unitxt that have complex types such as Table or Dialog and define serializers that handle their transformation to text.

This lets you control the representation of different types from the recipe API:

from unitxt import load_dataset
from unitxt.struct_data_operators import SerializeTableAsMarkdown

serializer = SerializeTableAsMarkdown(shuffle_rows=True, seed=0)
dataset = load_dataset(card="cards.wikitq", template_card_index=0, serializer=serializer)

If you want to serialize this table differently, you can switch to any of the many available table serializers.
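
To make concrete what a table serializer produces, here is a minimal pure-Python sketch that turns a header-and-rows table into Markdown text, with optional seeded row shuffling. The function `table_to_markdown` is hypothetical and written for illustration only. It is not the implementation of `SerializeTableAsMarkdown`, and the exact output format of the library may differ.

```python
import random
from typing import Any, Dict, List


def table_to_markdown(table: Dict[str, Any], shuffle_rows: bool = False, seed: int = 0) -> str:
    """Render {'header': [...], 'rows': [[...], ...]} as a Markdown table (illustrative only)."""
    rows: List[List[Any]] = [list(row) for row in table["rows"]]
    if shuffle_rows:
        # A dedicated Random instance keeps the shuffle reproducible for a given seed.
        random.Random(seed).shuffle(rows)
    lines = [
        "|" + "|".join(str(h) for h in table["header"]) + "|",
        "|" + "|".join("---" for _ in table["header"]) + "|",
    ]
    lines += ["|" + "|".join(str(cell) for cell in row) + "|" for row in rows]
    return "\n".join(lines)


example = {"header": ["name", "age"], "rows": [["Alice", 30], ["Bob", 25]]}
print(table_to_markdown(example))
```

Shuffling rows with a fixed seed changes the order the model sees while keeping runs reproducible, which is presumably the point of the `shuffle_rows` and `seed` arguments in the example above.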

Defining a New Type

If you wish to define a new type with custom serializers, you can do so using Python's typing library:

from typing import Any, List, TypedDict

class Table(TypedDict):
    header: List[str]
    rows: List[List[Any]]

Once your type is ready, register it with unitxt's type handling in the code you are running:

from unitxt.type_utils import register_type

register_type(Table)

Now your type can be used anywhere across unitxt (e.g., in task definitions or serializers).
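
As a side note on how such a type behaves in plain Python: a `TypedDict` is an ordinary `dict` at runtime, and its field types are available through `typing.get_type_hints`. The sketch below uses only the standard library. The helper `has_table_shape` is hypothetical and is not how unitxt validates registered types.

```python
from typing import Any, List, TypedDict, get_type_hints


class Table(TypedDict):
    header: List[str]
    rows: List[List[Any]]


def has_table_shape(value: Any) -> bool:
    """Loose structural check (illustrative only): right keys, list-valued fields."""
    if not isinstance(value, dict):
        return False
    if set(value) != set(get_type_hints(Table)):
        return False
    if not isinstance(value["header"], list) or not isinstance(value["rows"], list):
        return False
    return all(isinstance(row, list) for row in value["rows"])


table: Table = {"header": ["city", "population"], "rows": [["Paris", 2100000]]}
print(type(table) is dict)     # a TypedDict value is a plain dict at runtime
print(has_table_shape(table))
```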

Defining a Serializer For a Type

To define a serializer for your custom type, or for any combination of typing types, subclass SingleTypeSerializer:

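from typing import Any, Dict
from unitxt.serializers import SingleTypeSerializer  # import path assumed

class MySerializer(SingleTypeSerializer):
    serialized_type = Table
    def serialize(self, value: Table, instance: Dict[str, Any]) -> str:
        # your code to turn a value of type Table into a string
        ...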
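
For illustration, here is one way the body of such a serializer might look, rendering a table as comma-separated lines. To keep the snippet runnable without unitxt installed, it subclasses `SingleTypeSerializerStandIn`, a hypothetical stand-in defined in the snippet itself. In real code you would subclass unitxt's `SingleTypeSerializer` instead. The class name `TableAsCsvSerializer` is also made up for this sketch.

```python
from typing import Any, Dict, List, TypedDict


class Table(TypedDict):
    header: List[str]
    rows: List[List[Any]]


class SingleTypeSerializerStandIn:
    """Hypothetical stand-in for unitxt's SingleTypeSerializer (not the real class)."""

    serialized_type: Any = None

    def serialize(self, value: Any, instance: Dict[str, Any]) -> str:
        raise NotImplementedError


class TableAsCsvSerializer(SingleTypeSerializerStandIn):
    serialized_type = Table

    def serialize(self, value: Table, instance: Dict[str, Any]) -> str:
        # One comma-separated line for the header, then one line per row.
        lines = [",".join(value["header"])]
        lines += [",".join(str(cell) for cell in row) for row in value["rows"]]
        return "\n".join(lines)


table: Table = {"header": ["name", "age"], "rows": [["Alice", 30], ["Bob", 25]]}
print(TableAsCsvSerializer().serialize(table, instance={}))
```

The sketch does no quoting or escaping, so cells that contain commas or newlines would need extra handling in a real serializer.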

Multi-Modality

You can now process Image-Text to Text or Image-Audio to Text datasets in unitxt.
For example, if you want to load the doc-vqa dataset, you can do so by:

from unitxt import load_dataset

dataset = load_dataset(
    card="cards.doc_vqa.en",
    template="templates.qa.with_context.title",
    format="formats.models.llava_interleave",
    loader_limit=20,
)

Since unitxt already has data augmentation mechanisms, it is natural to use them for images as well. For example, if you want your images in grey scale:

dataset = load_dataset(
    card="cards.doc_vqa.en",
    template="templates.qa.with_context.title",
    format="formats.models.llava_interleave",
    loader_limit=20,
    augmentor="augmentors.image.grey_scale", # <= Just like the text augmenters!
)
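
For intuition about what a grey-scale augmentation does to an image, the following pure-Python sketch converts RGB pixels to single luminance values using the common ITU-R BT.601 weights. The choice of weights is an assumption made for illustration: these notes do not say how `augmentors.image.grey_scale` is implemented, and `to_grey_scale` is a hypothetical helper.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def to_grey_scale(image: List[List[Pixel]]) -> List[List[int]]:
    """Map each (R, G, B) pixel to one 0-255 luminance value (BT.601 weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


# A 2x2 image: white, black, pure red, pure blue.
image = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
print(to_grey_scale(image))
```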

Then, if you want to get the scores of a model on this dataset, you can use:

from unitxt.inference import HFLlavaInferenceEngine
from unitxt.text_utils import print_dict
from unitxt import evaluate

inference_model = HFLlavaInferenceEngine(
    model_name="llava-hf/llava-interleave-qwen-0.5b-hf", max_new_tokens=32
)

test_dataset = dataset["test"].select(range(5))

predictions = inference_model.infer(test_dataset)
evaluated_dataset = evaluate(predictions=predictions, data=test_dataset)

print_dict(
    evaluated_dataset[0],
    keys_to_print=["source", "media", "references", "processed_prediction", "score"],
)

Multi-modality support in unitxt builds on the type handling introduced in the previous section, adding two new types: Image and Audio.

What's Changed

New Contributors

Full Changelog: 1.12.4...1.13.0