
reader can not read any data until the writer close the file #215

Open
quantsword opened this issue Mar 27, 2017 · 3 comments

@quantsword commented Mar 27, 2017

I ran two programs in separate processes. One program created a file and appended 64 KB at a time until the file size reached 1 GB. The other program was launched after the writer started (once the file was created) to read the same file concurrently. In my first attempt, the reader could not read anything, even after the writer finished and exited. I then added a call to "UpdateFilesize" to refresh the file size whenever a read returned 0; with that change, the reader could read data once the writer finished.

I believe the reader should be allowed to read even if the writer has not closed the file yet. It is fine for the writer to keep the lease on the last chunk, but completed chunks should be readable by everybody.

@mckurt
Contributor

mckurt commented Mar 31, 2017

Hi,

I replicated this setup: one writer and one reader running concurrently. As soon as a chunk was fully written, i.e. became stable, the reader started reading that chunk. However, my setup used 1x replication without striping.

Mehmet

@mikeov
Contributor

mikeov commented Mar 31, 2017

Striped files, including RS files, cannot be read until they are closed; this is by design: the close call is what sets the logical file size (EOF). Until a striped file is closed, its logical file size remains 0.

@quantsword
Author

Thanks for your response.

Here is the rationale for why this is needed.

Scenario #1: it is typical for a data-processing pipeline to use files to connect the output of one processing step to the input of the next. Step 1 generates items and writes them to a file; step 2 reads that file and does further processing. If step 2 is a map process (no sorting, shuffling, etc. needed), it can start processing as soon as data is readable from the file. After it has consumed all available items, it can wait until more data becomes ready.

Scenario #2: data is appended to a QFS file while an index is built at the same time. A data request can arrive at any time for any record. If a record is available according to the index (the index can be a local file rather than a QFS file), the reader will try to read the QFS file at the specified location. It can wait a little, but not until the QFS file is closed, since appends will keep happening.
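The tail-follow pattern in scenario #1 can be sketched with plain POSIX files. This is only a conceptual illustration of the desired behavior (refresh the visible size, read any newly completed bytes, otherwise wait), not the QFS client API:

```python
import os
import tempfile

def tail_read(path, offset, chunk=64 * 1024):
    """Read up to `chunk` bytes appended past `offset`; return (data, new_offset)."""
    size = os.stat(path).st_size        # refresh the file size first
    if size <= offset:
        return b"", offset              # nothing new yet; caller retries later
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(min(chunk, size - offset))
    return data, offset + len(data)

# Simulated pipeline: the writer appends, the reader consumes new bytes
# without ever waiting for the writer to close the file.
with tempfile.NamedTemporaryFile(delete=False) as tf:
    path = tf.name
consumed = b""
offset = 0
with open(path, "ab") as writer:
    for i in range(4):
        writer.write(b"record-%d;" % i)
        writer.flush()                  # make the append visible to the reader
        data, offset = tail_read(path, offset)
        consumed += data                # every record is read before close
os.remove(path)
```

On a local filesystem this works because `stat` always reflects appended bytes; the issue here is that for striped QFS files the logical size is not updated until close, so the `stat` step returns 0.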

My question: after QFS has collected 6 stripes and generated 3 recovery stripes, it pushes these 9 stripes to 9 chunk servers. At that point the write client could notify the meta server with an updated file size. Are there any design concerns with this approach?
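To make the arithmetic behind that suggestion concrete: only complete stripe groups (6 data stripes plus 3 recovery stripes here) are fully recoverable, so the size the client could safely report rounds the appended bytes down to a whole group. A minimal sketch, assuming a 64 KB stripe size (the stripe size is an assumption for illustration, not a QFS default I am asserting):

```python
def readable_size(bytes_appended, data_stripes=6, stripe_size=64 * 1024):
    """Largest logical size covered by complete stripe groups.

    Each complete group holds data_stripes * stripe_size bytes of data
    (recovery stripes add redundancy, not logical size), so round down
    to a whole number of groups.
    """
    group = data_stripes * stripe_size          # data bytes per complete group
    return (bytes_appended // group) * group

# With 6 data stripes of 64 KB, one group covers 393216 bytes:
print(readable_size(1_000_000))  # -> 786432 (two complete groups)
```

The last partial group would stay unreadable until the writer either completes it or closes the file, which matches keeping the lease on the last chunk.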

Thanks.
