ObjectPool for pages and a SlicingArena #743

Open
JohannesLichtenberger opened this issue Oct 1, 2024 · 7 comments
@JohannesLichtenberger
Member

Page instances should be recycled when the BufferManager evicts pages, as we currently always allocate new instances. Instead, we should preallocate "empty" instances, clear them once they are evicted from the cache, and put them back into one or two ObjectPools (for IndirectPages and KeyValueLeafPages). In the off-heap branch, we should also allocate a big memory chunk upfront and use a slicing allocator to slice it into smaller chunks for the KeyValueLeafPages.
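
To make the recycling idea concrete, here is a minimal sketch of a preallocating pool. It is an illustration under assumptions, not Sirix code: `SimplePagePool`, its methods, and the fallback-to-allocation policy are all hypothetical names and choices.

```java
import java.util.ArrayDeque;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical fixed-capacity pool; not part of Sirix or of any library.
final class SimplePagePool<T> {
  private final ArrayDeque<T> free = new ArrayDeque<>();
  private final Supplier<T> factory;
  private final Consumer<T> resetter;
  private final int capacity;

  SimplePagePool(int capacity, Supplier<T> factory, Consumer<T> resetter) {
    this.capacity = capacity;
    this.factory = factory;
    this.resetter = resetter;
    // Preallocate "empty" instances upfront.
    for (int i = 0; i < capacity; i++) {
      free.push(factory.get());
    }
  }

  // Hands out a pooled instance, falling back to a fresh one if the pool is empty.
  synchronized T acquire() {
    T instance = free.poll();
    return instance != null ? instance : factory.get();
  }

  // Clears the instance and puts it back; surplus instances are left to the GC.
  synchronized void release(T instance) {
    resetter.accept(instance);
    if (free.size() < capacity) {
      free.push(instance);
    }
  }

  // Number of instances currently waiting in the pool.
  synchronized int available() {
    return free.size();
  }
}
```

One pool per page type (for example one for IndirectPages and one for KeyValueLeafPages) would follow the same shape, each with its own factory and reset function.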

@XiangyuTan-learning

Hi @JohannesLichtenberger, I am brand new to making PRs for open source projects. Do you mind if I take on this issue? I saw there is a "good first issue" label on it. Thanks!

@JohannesLichtenberger
Member Author

@XiangyuTan-learning do you have software engineering experience? I think it may be OK for developers who are new to the project, but you will probably need some experience...

@XiangyuTan-learning

@JohannesLichtenberger, I am a second-year master's student specialising in Software Engineering. The reason for bothering you is that one of my course assignments requires making a PR to an open source project, so.... However, I am only familiar with Java. Can you please give me some advice on whether this issue is a good choice for me, or whether I should pick another one? Thanks

@JohannesLichtenberger
Member Author

JohannesLichtenberger commented Oct 13, 2024

You can try... The main issue is that we must reuse KeyValueLeafPages, as we're currently allocating too much garbage. Once a page is evicted from a RecordPageCache, it could be reused for a new key-value leaf page that is read from disk, instead of always creating new objects (in PageKind, the KeyValueLeafPages are deserialized, and a new instance is currently created each time). We should thus use an object pool instead (for instance StormPot: in libraries.gradle, add stormpot : 'com.github.chrisvest:stormpot:3.2'), and we have to include it in sirix-core.

Then, my main idea is to release the page to the ObjectPool once it's evicted from the RecordPageCache (it's not pinned anymore, and the TinyLFU algorithm decides to evict the page...). It can then be returned to the pool, and once a new KeyValueLeafPage is needed, it should be fetched from the ObjectPool instead of creating a new instance. Furthermore, before releasing the page, the MemorySegment(s) should be cleared (maybe filled with 0-bytes) and all fields should be reset as if it were a new instance.
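
The evict, reset, and reuse cycle described here could look roughly like the sketch below. Everything in it is a stand-in chosen for illustration: `LeafPageSketch` is a hypothetical substitute for KeyValueLeafPage, a `byte[]` replaces the MemorySegment, and a plain LRU `LinkedHashMap` replaces the RecordPageCache with its TinyLFU policy. Pinning and thread safety are deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for KeyValueLeafPage; the real class has more state.
final class LeafPageSketch {
  long pageKey = -1;
  int revision = -1;
  final byte[] slots = new byte[64]; // stand-in for the page's MemorySegment

  // Resets all fields so the instance looks newly created.
  void reset() {
    pageKey = -1;
    revision = -1;
    Arrays.fill(slots, (byte) 0);
  }
}

// Hypothetical LRU cache standing in for the RecordPageCache (not thread-safe).
final class EvictingPageCache extends LinkedHashMap<Long, LeafPageSketch> {
  private final int maxEntries;
  final ArrayDeque<LeafPageSketch> pool = new ArrayDeque<>();

  EvictingPageCache(int maxEntries) {
    super(16, 0.75f, true); // access order, so the eldest entry is the LRU one
    this.maxEntries = maxEntries;
  }

  // Called by LinkedHashMap after each insertion; acts as the eviction hook.
  @Override
  protected boolean removeEldestEntry(Map.Entry<Long, LeafPageSketch> eldest) {
    if (size() <= maxEntries) {
      return false;
    }
    LeafPageSketch evicted = eldest.getValue();
    evicted.reset();    // clear data and fields before the page is pooled
    pool.push(evicted); // hand the instance back for later reuse
    return true;
  }

  // Fetches a recycled page if one is available, otherwise allocates a new one.
  LeafPageSketch newPage(long pageKey, int revision) {
    LeafPageSketch page = pool.isEmpty() ? new LeafPageSketch() : pool.pop();
    page.pageKey = pageKey;
    page.revision = revision;
    return page;
  }
}
```

In a real cache the reset-and-release step would presumably sit in the cache's eviction callback and run only for pages that are no longer pinned.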

Another thing to note: in the branch that uses off-heap memory to store the page data in a slotted page, the KeyValueLeafPage has an Arena to create a MemorySegment for the slotted page (for the slots/the data). Instead of using an Arena per page and closing it, we should use a global arena and create all KeyValueLeafPages in the ObjectPool upfront with a SlicingAllocator (which uses chunks from a big MemorySegment).
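
As a rough sketch of the slicing idea, assuming Java 22 or later (where `java.lang.foreign` is a final API): one big segment is allocated from a single arena and then carved into fixed-size slices with `SegmentAllocator.slicingAllocator`. The class name `SlicedPageMemory`, the method `preallocate`, and the 8-byte alignment are assumptions made for illustration only.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and sizes are illustrative only.
final class SlicedPageMemory {
  // Carves pageCount slices of pageSize bytes out of one big segment owned by arena.
  // pageSize is assumed to be a multiple of 8 so that no padding is needed.
  static List<MemorySegment> preallocate(Arena arena, int pageCount, long pageSize) {
    // One upfront allocation; segments allocated by an Arena are zero-initialized.
    MemorySegment chunk = arena.allocate(pageCount * pageSize, 8);
    SegmentAllocator slicer = SegmentAllocator.slicingAllocator(chunk);
    List<MemorySegment> slices = new ArrayList<>(pageCount);
    for (int i = 0; i < pageCount; i++) {
      // Each call returns the next consecutive, non-overlapping slice of the chunk.
      slices.add(slicer.allocate(pageSize, 8));
    }
    return slices;
  }
}
```

Each slice could then back one pooled page, and clearing a page before reuse reduces to `slice.fill((byte) 0)`. Note that memory allocated from `Arena.global()` is never freed, so the chunk size has to be chosen with that in mind.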

Hope this makes some sense to you...

@JohannesLichtenberger
Member Author

@XiangyuTan-learning do you think you can work on this? It's the most pressing issue right now.

It's the update-slots-to-memorysegments branch you should work on...

@XiangyuTan-learning

@JohannesLichtenberger, thanks, I would like to work on it. But my assignment only gives me about two weeks to do this, and I am not sure whether I can implement it in that time. Besides, as you said, it's the most pressing issue for the project. So I am wondering whether I can just start working on it without it being assigned to me; then, if someone capable of implementing it shows up, you can assign it to them directly. I don't want to hurt the project because of my limited experience. What do you think?

@JohannesLichtenberger
Member Author

JohannesLichtenberger commented Oct 15, 2024

I'll work on it myself, I guess ;)

4 participants
@JohannesLichtenberger @XiangyuTan-learning and others