DorisExporter marshal all data to single json array lead to stream load request too small #36896

Open
flashmouse opened this issue Dec 19, 2024 · 4 comments · May be fixed by #36912
Labels
exporter/doris needs triage New item requiring triage

Comments

@flashmouse

Component(s)

exporter/doris

Describe the issue you're reporting

DorisExporter uses stream load to insert data into Doris. Stream load has a default size limit, streaming_load_max_mb (default 10GB), but DorisExporter currently marshals all data into a single JSON array, so the request size is restricted by streaming_load_json_max_mb instead (default only 100MB). The maximum body size of each stream load request that DorisExporter sends is therefore only 100MB. As a result, the OTel Collector has to send many small requests to Doris, which is very inefficient.

I hope DorisExporter can reduce the frequency of write requests to Doris. I think either of the approaches below may be feasible:

  1. marshal each dTrace to a single JSON line, and set strip_outer_array=false
  2. convert []*dTrace to CSV format to bypass the streaming_load_json_max_mb limit
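
A minimal sketch of the first approach, for illustration only. The `dTrace` struct here is a hypothetical, trimmed-down stand-in for the exporter's real type, and `marshalJSONLines` and `streamLoadHeaders` are made-up helper names. The `read_json_by_line` header is an assumption about what Doris needs in order to parse one object per line; only `strip_outer_array=false` comes from the proposal above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// dTrace is a hypothetical stand-in for the exporter's trace row type.
type dTrace struct {
	ServiceName string `json:"service_name"`
	TraceID     string `json:"trace_id"`
	SpanName    string `json:"span_name"`
}

// marshalJSONLines writes one JSON object per line instead of wrapping
// all rows in a single outer JSON array.
func marshalJSONLines(rows []*dTrace) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode appends a newline after each value
	for _, row := range rows {
		if err := enc.Encode(row); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

// streamLoadHeaders returns the request headers this sketch assumes.
// read_json_by_line is an assumption, not taken from the issue text.
func streamLoadHeaders() map[string]string {
	return map[string]string{
		"format":            "json",
		"read_json_by_line": "true",
		"strip_outer_array": "false",
	}
}

func main() {
	rows := []*dTrace{
		{ServiceName: "checkout", TraceID: "t1", SpanName: "GET /cart"},
		{ServiceName: "checkout", TraceID: "t2", SpanName: "POST /pay"},
	}
	body, err := marshalJSONLines(rows)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
	fmt.Println(streamLoadHeaders())
}
```

Because each row is encoded independently, the body can also be streamed or cut at any line boundary, which is not possible with a single outer array.
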
flashmouse added the needs triage label Dec 19, 2024

@joker-star-l
Contributor

Thanks for the suggestion. I will change it using the first method you mentioned.

@joker-star-l
Contributor

By the way, another reason for too many requests might be that you don't have a batch processor in the config file.

@flashmouse
Author

I had already added the batch processor and got an error saying my data size exceeded streaming_load_json_max_mb, so I decreased the batch processor's parameters to fit within the limit.
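
For illustration, the size constraint described here could also be handled on the sending side by cutting an encoded batch at row boundaries. This is only a sketch under assumptions: `splitBySize` is a hypothetical helper, not something the exporter is known to implement, and it assumes rows are already encoded one per line.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitBySize groups already-encoded rows into newline-terminated bodies.
// Each body is at most maxBytes long, except that a single row larger
// than maxBytes is emitted alone in its own body.
func splitBySize(rows [][]byte, maxBytes int) [][]byte {
	var bodies [][]byte
	var cur bytes.Buffer
	for _, row := range rows {
		need := len(row) + 1 // the row plus its trailing newline
		if cur.Len() > 0 && cur.Len()+need > maxBytes {
			// Copy the bytes out before resetting the shared buffer.
			bodies = append(bodies, append([]byte(nil), cur.Bytes()...))
			cur.Reset()
		}
		cur.Write(row)
		cur.WriteByte('\n')
	}
	if cur.Len() > 0 {
		bodies = append(bodies, append([]byte(nil), cur.Bytes()...))
	}
	return bodies
}

func main() {
	rows := [][]byte{[]byte("aaaa"), []byte("bbbb"), []byte("cc")}
	for i, body := range splitBySize(rows, 10) {
		fmt.Printf("request %d: %d bytes\n", i, len(body))
	}
}
```

In practice `maxBytes` would be derived from whichever server-side limit applies to the chosen format, with some headroom.
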
