MWS: Add support for large tiddlers to be stored as attachment files #8022

Closed
Jermolene wants to merge 131 commits

Jermolene (Member) commented Mar 1, 2024

Note: This PR targets the branch for MultiWikiServer

This PR adds support for storing large tiddlers as attachment files. There are several motivations:

  • To be able to efficiently handle tiddlers larger than SQLite's roughly 1GB limit on string size
  • To be able to efficiently stream large tiddlers directly over HTTP

Questions

This PR includes a change to the schema. At this point, we can probably just merge it and instruct testers to delete their existing database files, first taking a backup with --mws-save-archive if necessary. But it is also a good opportunity to think through how we want to handle database upgrades in the future. Perhaps we should start versioning the schema and add a metadata table that records the version number. Or maybe it would be more resilient to make the creation of each table and field conditional on whether it already exists.
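To make the first option concrete, here is a minimal sketch of a versioned upgrade path. It is written in Python against the standard `sqlite3` module purely for illustration; MWS itself is JavaScript, and the table names, column names (`metadata`, `attachment_ref`) and the `MIGRATIONS` list are all assumptions rather than the actual MWS schema. It also uses `CREATE TABLE IF NOT EXISTS`, which is the mechanism the second option would lean on.

```python
import sqlite3

# Hypothetical migrations: entry i upgrades the schema from version i to i + 1.
# Table and column names are illustrative, not the actual MWS schema.
MIGRATIONS = [
    "CREATE TABLE IF NOT EXISTS tiddlers (bag_name TEXT, title TEXT, text TEXT)",
    "ALTER TABLE tiddlers ADD COLUMN attachment_ref TEXT",
]


def get_schema_version(db: sqlite3.Connection) -> int:
    """Read the stored version, creating the metadata table on first use."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS metadata (key TEXT PRIMARY KEY, value TEXT)"
    )
    row = db.execute(
        "SELECT value FROM metadata WHERE key = 'schema_version'"
    ).fetchone()
    return int(row[0]) if row else 0


def upgrade(db: sqlite3.Connection) -> int:
    """Apply migrations newer than the stored version; return the new version."""
    version = get_schema_version(db)
    for statement in MIGRATIONS[version:]:
        db.execute(statement)
        version += 1
        db.execute(
            "INSERT OR REPLACE INTO metadata (key, value) "
            "VALUES ('schema_version', ?)",
            (str(version),),
        )
        # Commit after each step so a later failure keeps earlier progress.
        db.commit()
    return version
```

Running `upgrade()` on a fresh database applies every migration, and running it again is a no-op, so the same call could run at every startup. The trade-off against purely conditional creation is that migrations must be kept in order and never edited once released.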

Progress

  • Extend schema to add an attachment reference to the tiddlers table
  • Use an attachment when saving a tiddler above a given size
  • Route to stream attachments back to the browser
  • Make the attachment size limit configurable (currently fixed at 200KB)
  • Tweak design to allow ACL checks to be performed in the GET /attachments/:attachment_name route. The plan is to identify attachments by the combination of bag name and title, which will allow the database to be consulted for security checks before allowing access
  • Stream multipart form data directly to the attachment store so that we can efficiently upload large tiddlers
  • Fix uploading the same file twice; this may be particularly problematic if done via two browsers that happen to send different content types
  • Avoid making text tiddlers become attachments
  • Support for deleting attachments when the associated tiddler is modified or deleted
  • Update --mws-save-archive to handle attachments
  • Review error handling
  • Apply length limits while processing multipart form data to prevent denial-of-service attacks
  • Support HTTP range requests so that streamed videos support seeking
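
For the range request item at the end of the list above, the core of the work is interpreting the `Range` header against the attachment size. The following is a rough sketch in Python (for illustration only; MWS is JavaScript) that handles a single `bytes` range. The function name `parse_single_range` is hypothetical, and the sketch deliberately rejects multi-range requests rather than implementing them.

```python
import re
from typing import Optional, Tuple

# Matches "bytes=START-END", "bytes=START-" and "bytes=-SUFFIX".
_RANGE_RE = re.compile(r"bytes=(\d*)-(\d*)", re.ASCII | re.IGNORECASE)


def parse_single_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) byte offsets for a file of `size` bytes.

    Returns None if the header is malformed, lists several ranges,
    or cannot be satisfied.
    """
    match = _RANGE_RE.fullmatch(header.strip())
    if not match:
        return None
    first, last = match.groups()
    if not first:
        # Suffix form "bytes=-N": the final N bytes of the file.
        if not last or int(last) == 0 or size == 0:
            return None
        return max(size - int(last), 0), size - 1
    start = int(first)
    # An end past the last byte is clamped to the end of the file.
    end = min(int(last), size - 1) if last else size - 1
    if start >= size or start > end:
        return None
    return start, end
```

A server using something like this would answer a successful parse with status 206 and a `Content-Range: bytes start-end/size` header, then stream only that slice of the attachment file. Note that the sketch returns None for both malformed and unsatisfiable ranges; a real route would need to tell them apart, since an unsatisfiable range calls for a 416 response while a malformed header is normally ignored.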

This is needed in order for our CI to be able to run the tests
It is now possible to create and edit tiddlers, using the existing tiddlywebadaptor syncing mechanism. There are a lot of hacks and lumpiness to make things compatible, so I think I will end up with an independent implementation
May not actually be needed
Avoids "string too long" errors when working with big tiddlers (>100MB)
Cannot yet specify the bags for the new recipe
Currently hard-wired to kick in for tiddlers over 10MB (measured in base64 representation for binary tiddlers)
I intend to further refactor things, but wanted to capture the point at which things are working end-to-end.
And fix up the response timing
Jermolene (Member, Author) commented

These changes have now reached a stable point, so I'm going to merge them into #7915 and carry across the todo items that are not yet complete.

@Jermolene Jermolene closed this Mar 10, 2024
@Jermolene Jermolene deleted the multi-wiki-support-attachments branch March 13, 2024 18:04