[Delayed Clipboard Rendering] Privacy issue while reading data for web custom types #439
Comments
We suggest the following solution to address this problem: e.g.
In the browser, the source url format will be present in the ClipboardItem object which can then be used by the web app to verify if it can trust the source of the web custom formats.
|
I'm not sure how that addresses the leak? |
@annevk So if the target app doesn't request the web custom format, then the system clipboard wouldn't call back into the source app (in this case browser) where the callback is triggered for the web custom format. The malicious site wouldn't be able to know where the user pasted the content as the callback wouldn't be triggered in this case. Am I misunderstanding the issue? Also, after discussing at the TPAC meeting, we realized that the relationship between source and target app isn't always 1:1 when it comes to web custom formats. There can be multiple apps that could support the same web custom format, so the malicious site can only guess where the user is pasting. The scenario is also way too constrained, in that the user has to copy from the site that inserted the web custom format and paste it exactly into the app that supports this format. This isn't always the case when it comes to copy/paste. |
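A minimal sketch of the flow described in the comment above, assuming a simplified in-memory stand-in for the system clipboard. `SimulatedClipboard`, `observedReads`, and the format name `web application/x-example-editor` are hypothetical and not part of any browser API. It only illustrates that the source's callback runs when, and only when, a target requests that specific format, which is the signal the privacy concern is about.

```typescript
// Hypothetical model of a system clipboard with delayed rendering.
// This is an illustration only, not a real browser or OS API.
type Renderer = () => string;

class SimulatedClipboard {
  private renderers: Record<string, Renderer> = {};
  private rendered: Record<string, string> = {};

  // Copy: the source advertises formats without producing any data yet.
  write(formats: Record<string, Renderer>): void {
    this.renderers = { ...formats };
    this.rendered = {};
  }

  // Paste: the source callback runs only for the format the target requests,
  // and at most once per format.
  read(format: string): string | undefined {
    if (Object.prototype.hasOwnProperty.call(this.rendered, format)) {
      return this.rendered[format];
    }
    if (!Object.prototype.hasOwnProperty.call(this.renderers, format)) {
      return undefined;
    }
    const data = this.renderers[format]();
    this.rendered[format] = data;
    return data;
  }
}

// The source site records which callbacks fired: this is the privacy signal.
const observedReads: string[] = [];
const clipboard = new SimulatedClipboard();
clipboard.write({
  "text/plain": () => {
    observedReads.push("text/plain");
    return "hello";
  },
  "web application/x-example-editor": () => {
    observedReads.push("web application/x-example-editor");
    return "<editor-specific payload>";
  },
});

// A target that only reads plain text never triggers the custom-format callback.
const pastedText = clipboard.read("text/plain");
```

In this model the source learns nothing about the custom format unless a target actually asks for it, which matches the argument above; the open question in the thread is what the source learns when a target does ask.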
Discussed this issue with privacy folks internally. Removing the support for web custom formats in delay rendering makes the API less useful, but we also want to limit the privacy impact. The proposal is to restrict the number of web custom formats that can be delay rendered. That way a random site can't advertise a large number of web custom formats with delay rendering and cast a wide net to track the user's paste activity. e.g. if we allow only 5 (arbitrary number) web custom formats that can be delay rendered, then it limits the options for sites to track the user paste activity. |
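A rough sketch of how such a cap could be checked, assuming the limit of 5 mentioned above (an arbitrary number) and that web custom formats are identified by the `web ` prefix. `FormatEntry` and `withinDelayedCustomFormatLimit` are hypothetical names used for illustration, not part of any spec text.

```typescript
// Web custom formats are written with a "web " prefix in front of the MIME type.
const WEB_CUSTOM_PREFIX = "web ";

// Arbitrary cap taken from the comment above; the real number is undecided.
const MAX_DELAYED_CUSTOM_FORMATS = 5;

// Hypothetical description of one advertised clipboard format.
interface FormatEntry {
  type: string;     // e.g. "text/plain" or "web application/x-example"
  delayed: boolean; // true if the data is produced by a callback at paste time
}

// Returns true if the number of delay-rendered web custom formats is within the cap.
// Built-in formats and eagerly written custom formats do not count toward it.
function withinDelayedCustomFormatLimit(
  entries: FormatEntry[],
  limit: number = MAX_DELAYED_CUSTOM_FORMATS
): boolean {
  const delayedCustom = entries.filter(
    (entry) => entry.delayed && entry.type.indexOf(WEB_CUSTOM_PREFIX) === 0
  );
  return delayedCustom.length <= limit;
}

// Helper that builds n delay-rendered custom formats for the usage example.
function makeDelayedCustomFormats(n: number): FormatEntry[] {
  const result: FormatEntry[] = [];
  for (let i = 0; i < n; i++) {
    result.push({ type: WEB_CUSTOM_PREFIX + "application/x-example-" + i, delayed: true });
  }
  return result;
}

const builtin: FormatEntry = { type: "text/html", delayed: true };
const fiveCustomAllowed = withinDelayedCustomFormatLimit(makeDelayedCustomFormats(5).concat([builtin]));
const sixCustomAllowed = withinDelayedCustomFormatLimit(makeDelayedCustomFormats(6));
```

A cap like this narrows how wide a net a single write can cast; it does not remove the signal for the formats that remain under the cap.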
EditingWG meeting minutes: |
Here is the meeting link which is scheduled for 11/30 at 8am: https://us04web.zoom.us/j/75936156146?pwd=zwBgjXu4EBOlUnrurHo0PZ7Ka1RU5X.1 |
8am in Salem, Oregon? |
I think it's Nov 30, 2023 08:00 AM Pacific Time (US and Canada). Same time as our EditingWG meetings. |
Is that meeting on 11/30 the right one still? I had gotten a meeting invite and then it was cancelled. |
@smaug---- Yes, 11/30 is the right one. Link to Zoom meeting: https://us04web.zoom.us/j/75936156146?pwd=zwBgjXu4EBOlUnrurHo0PZ7Ka1RU5X.1 |
Meeting minutes: Github issue: #439 Meeting (11/30) Attendees: Scribe: Dan Clark (Microsoft)
Anupam: Want to discuss privacy concerns related to delayed rendering of custom formats. When source app delay renders format, registers callback. When the user pastes into an app that tries to read it, the system clipboard calls back to the source app which triggers a callback. In the callback the source app populates the data. For web custom format, random site can register callback for custom format. Don’t need to produce data at copy time. When the callback is triggered during paste, the source app can determine that the paste is happening.
Olli: These issues are there. Even if you know one external app then you know a lot. Copy-pasting from company A to B products can happen and advertisements can be targeted to the user. |
(I wrote this to whatwg Matrix channel too) I wonder if async clipboard rendering could happen in a worklet which can't communicate back to the main page. When copy is triggered, page would pass structured clone (including blobs and what not) to the worklet and then paste might or might not happen later by reading the data (converted to the right type) from the worklet. This is not trying to address the concerns Apple has around sanitization (I'm also personally less worried about that somewhat separate issue). |
I think the issue is that web apps don't want to do the work to generate the data at all if it isn't required during paste. It would save a lot of cost in terms of COGS, CPU cycles etc. I don't think most sites would even care about delay rendering, but the sites that do, need a way to efficiently produce data for formats. |
I assume sites do need to snapshot something when copy happens, so that they know what internal data to use when delayed rendering eventually happens during paste. (Data could get modified between copy and paste, but user wanted the data when copy happened). Would be good to get some feedback from folks who might use the API |
Regarding the separate worklet: the issue here is that Excel keeps the data/model/business logic to process the data solely on the server. The "COGS" is server CPU cycles plus bandwidth to send the data to the user's device (several pre-serialized copies, even). @snianu can correct me if I'm misunderstanding/misrepresenting this. It's not clear to me how many sites use this architecture so I agree that it's important to gather more feedback from a broader range of potential clients of this API, as @smaug---- suggests. Adobe did weigh in but it seems that the details, such as limiting delayed rendering to built-in formats, limiting to 5 custom formats, a worklet, etc, may affect whether the API is still considered useful for Photoshop. |
I also attended the meeting.
…On Thu, Nov 30, 2023, 19:15 snianu ***@***.***> wrote:
Meeting minutes:
Delayed Clipboard Rendering
Github issue: #439 <#439>
Explainer:
https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/DelayedClipboard/DelayedClipboardRenderingExplainer.md
Meeting (11/30)
Attendees:
Evan Stade (Google)
Anne van Kesteren (Apple)
Anupam Snigdha (Microsoft)
Sanket Joshi (Microsoft)
Wenson Hsieh (Apple)
Megan Gardner (Apple)
Christine Hollingsworth (Google)
Thomas Steiner (Google)
Ajay Rahtekar (Google)
Ayu Ishii (Google)
Olli Pettay (Mozilla)
Simon Pieters (Mozilla)
Scribe: Dan Clark (Microsoft)
Anupam: Want to discuss privacy concerns related to delayed rendering of
custom formats. When source app delay renders format, registers callback.
When the user pastes into an app that tries to read it, the system
clipboard calls back to the source app which triggers a callback. In the
callback the source app populates the data. For web custom format, random
site can register callback for custom format. Don’t need to produce data at
copy time. When the callback is triggered during paste, the source app can
determine that the paste is happening.
The source app can then determine that the user is pasting into Photoshop.
Proposed mitigation in issue is restricting delayed rendering to only
builtin formats. That’s really restrictive for API since web custom formats
can have high fidelity content. Source app doesn’t know destination app,
doesn’t want to spend cycles to generate the data for custom format if not
needed. Primary use case is to support delayed rendering for custom
formats, so trying to find middleground to reduce fingerprinting concerns.
Photoshop can advertise photoshop format, figma can advertise figma format,
etc. If we allow site to advertise more than 1 custom format then there are
a couple cases. One case where site is malicious, paste will fail if page
doesn’t generate data. Or will paste if page does generate data.
Fingerprinting is related to first case where paste fails, malicious site is
able to detect user paste activity. In 1st case where paste fails, will be
bad experience for user. User would only try as one time thing. Try to
paste, it fails. User won’t try again. So in malicious site, do we want to
restrict number of formats?
Simon: Don’t understand why only problematic if paste fails.
Anupam: source app already knows web custom format is intended for their
app. Already know ecosystem.
Simon: That’s part of the issue. Don’t want the site to know when a given
app is installed or not.
Anne: not sure you’re thinking about failing correctly. Source app can
write multiple formats at the same time. Upon paste, maybe custom one gets
read, maybe not. In case where it does, they have signal user is using
particular app. When it doesn’t, paste doesn’t necessarily fail because
falls back to text/plain.
Anupam: Native apps try to read the most preferred formats first. If there
is custom format, will try to read that first.
Anne:
Anupam: If plaintext read and delay rendered, can’t determine destination
app.
Anne: You’re saying paste fails therefore user will stop, but it’s not
clear that the paste will fail.
Anupam: Talking about custom formats only here.
Anne: It can populate builtin formats as well so paste will always succeed.
Anupam: Builtin formats aren’t a concern here.
Wenson: With builtins, not a fingerprinting concern because user knows
what’s being pasted. With custom formats there’s more obscure data that can
be written.
Anupam: Fingerprinting concern is for custom formats
Simon: Not obvious can’t fingerprint with standard formats
Wenson: Referring to URLs, text/plain, text/html, and image/png. In all
these cases in Webkit, with HTML if thing writing data writes extra info in
HTML comment, we’ll sanitize it when writing to clipboard. Only visible
content preserved. For all builtin types can sanitize to preserve privacy.
Simon: Important thing to me is you’re talking about only those 4 types.
Evan: In Chromium, Not removing all comments. There will be signs in HTML,
don’t think sanitization scrubs all info about where it’s coming from.
Wenson: We try to. Really only visible text.
Sanket: Concern raised here is that w/ delayed rendering, source app gets
callback. Info getting exposed is where user pasted. If malicious site
exposes format for e.g. photoshop, they can know that user pasted into
photoshop.
Anne: Secondary concern: is now communication channel between source and
dest app.
Simon: That’s the point of the feature.
Sanket: When user pastes, the app has to produce the payload. The point
Anupam is making is why this is unique to web custom formats is that when
site’s rendering native format, may be read by N different destinations.
Problem with custom formats is they may be very app/ecosystem specific.
Goal is can we find mitigation that enables API to be usable with custom
formats since it’s important for them to be delay renderable. While still
minimizing fingerprinting risk.
Wenson: Use case: paste from web version to native version of same app.
Know the app itself owns the website too. So no risk there since it’s the
same owner. Can we use that to allow delay rendering in that case, if it’s
the same associated domain for the destination.
Sanket: Discussed this before briefly. Question is what’s the mechanism.
Evan: That’s important question but also sounds like reduction of interop.
In example we’re using here, some other version of Office can’t interop
with most popular version. Not a good direction.
Johannes: This was the concern raised last time, could shut out smaller
competitors.
Anne: Like the same origin limitation which makes it not that useful.
Anupam: Need to know what the payload should be, can’t just be random
bytes. Or paste fails in dest app. Paste fails if you don’t populate the
data, or dest app doesn’t know what format should be, can’t parse it. Can’t
just populate random bytes.
Sanket: Case that Johannes and others are talking about, there’s a format
you want to use across an ecosystem. E.g. Office/text format used by
multiple editors. Yes the mitigation we’re discussing here would limit
that. But don’t think use cases heard from web devs are about that. Use
cases are in the realm of what Wenson described. Custom formats used for
copy/paste within that ecosystem.
Anupam: If other apps, let’s say office publishes web custom format
publicly and want other apps to read it, can move it from custom format to
builtin bucket.
Johannes: Concern is if MSFT word doesn’t want competitors, can close it.
No one else can create interoperable thing with Word.
Anupam: By design. Not meant to act like built in. Great if they all know
what it is, but if they don’t then it’s by design they won’t be able to
read it. If there’s web text/html format, and has HTML content and
metadata, and Office/Photoshop wants to publish this format and want other
apps to read it. Can describe the format
Johannes: Is that currently the case? If I create Word competitor, am I
initially shut out to receive paste from Word or can I try to write a
parser for Word format?
Anupam: There’s no restriction. Any app can read it. App can add
enterprise policy to restrict but that’s out of scope for this discussion.
So yes, in general clipboard does not restrict what app can read. If you
don’t know what data is present in custom format, wouldn’t be able to
process it. Can use HTML parser but if it has metadata the parsing will
fail. Fingerprinting concern is that with web custom formats, if it’s
restricted to a particular ecosystem/set of apps, in last meeting we
proposed mitigation of restricting delay rendering to 5 or small number.
Source app can’t cast wider net and advertise all custom formats on
clipboard and hope to detect one.
Anne: Even for single format don’t think we have something that works yet.
Fingerprinting concern and concern you can’t sanitize it. Don’t see a
viable way around those other than same origin/same-app-bounds solution.
Neither seem great because seem to prevent other origins from competing
with yours.
Evan: I still think contents of HTML can’t be effectively sanitized. Uses
this font size, this layout, nests table elements in this way…
But that info only goes in one direction when you’re pasting into web app.
This can go the other way, which is potentially bad. That said you know
only the one app. Fingerprinting is to identify user based on which apps are
installed but you only get info about one and it’s fuzzy.
Simon: Have options for targeting many formats. Build up search tree, with
couple of pastes get a good idea of where paste is going, user habits.
Evan: Narrows the vector if you have to convince user to copy/paste. Is
sign of user trust.
In Chromium, for certain ways in case you use async API there’s a
permission prompt. We don’t love these but it’s a way to get the user to
make a trust decision. User won’t just grant it to totally malicious site.
If it’s their chosen office software..what kind of site are we concerned
about for fingerprinting?
Olli: Company A’s office products may want to know if you’re pasting to
Company B’s products, which might be useful for Company A’s advertising.
Sanket: Copy-paste should succeed in cases where the formats are expected
to work. It’s very hard for different apps to provide legit copy-paste with
formats that they are not aware of.
Olli: Not talking about failing. Company A learns about user paste
activity and advertises things.
Sanket: Primary goal is for the paste to succeed.
Anne: Why can’t it learn how to write the format?
Sanket: It’s hard to do.
Anne: That is not really tricky.
Evan: This is tangential. This is for delay web custom formats not
built-in formats. Delay formats are important for sophisticated apps. We
are only allowing it for a few web custom formats. If we can’t provide that
then why do we need delay rendering?
Sanket: It’s costly to generate the payload so we need delay rendering
even if it's just for one format. For photoshop, the cost for generating
format is wasted if it's not pasted in photoshop.
Web custom formats can be used to extend support for more formats. Target
apps
Olli: These issues are there. Even if you know one external app then you
know a lot. Copy-pasting from company A to B products can happen and
advertisements can be targeted to the user.
Evan: If you are typing in a website then they already know a lot about
you. Weigh the utility for delay rendering. Are we concerned about
malicious website?
Olli: That is one part. Even if there is one app then they can exploit a
zero day vulnerability.
Evan: Most attacks cast a wider net. If there is a zero day then target
everyone.
Anne: Target everyone then it gets noticed.
Sanket: The thing is fingerprinting not security. You can write malicious
data to the clipboard. Callback is what indicates the source app. Web devs
are asking for it so there is utility for delay rendering. Trade off is
worthy.
Wenson: Is the use case for 1 or 2 formats time intensive?
Sanket: Only handful of custom types. Eco system specific formats are
handful.
Wenson: Can be mitigated by asking for all delay formats including the
builtin types during paste.
Sanket: All custom or builtin formats?
Wenson: Either would have the advantage of not leaking types?
Sanket: It would kill the benefits of delay rendering. Custom format is
pasted then we would trigger callback for both custom and builtin formats
which is a waste of resources.
If it’s just the custom formats, then it might be ok?
Anne: Not quite sure how that would mitigate it? Concern is with custom
format (one or more).
There is a secondary concern that we can't sanitize.
Wenson: The only advantage with an all at once approach is you can’t cast
a wider net. Can only target one specific app. Doesn’t address Anne’s
concern though.
Anne: Not sure about custom formats.
Wenson: Same origin and same app is fine. No privacy risk.
Evan: All of the things are concerned with all formats?
Anne: We are concerned about custom formats.
Evan: Different concern.
Anne: It’s a concern with custom format.
Evan: Not sure what to do with that concern. We’re trying to make progress
with delay rendering.
Anne: When I proposed custom formats I didn’t have that concern in my mind.
Evan: There is a proposal for reading unsanitized html. Is that something
you thought about?
Wenson: Only allow for same origin. Sanitize for cross origin.
Evan: That is tangential but was just curious.
Sanket: Where are we at? Fingerprinting is unique to web custom format.
What is the objection?
Simon: If we add more native formats then it might have the same issue
Anne: Monitor user habits but delayed clipboard rendering of builtin
formats.
|
@smaug---- is Firefox open to considering support for delay rendering of web custom formats if there are partners (like Adobe) interested in using this API? Note that Webkit doesn't support web custom format in cross-origin, and they have voiced concerns with delay rendering of web custom formats in general. Firefox and Chromium are the only browsers that support web custom formats in cross and same origins, so if we think support for web custom format provides value, and without it, delay rendering API is less useful, then we need to find a way to mitigate this privacy issue for web custom formats. Restricting it to 5 (and > 1) web custom formats sounds like a reasonable workaround. Thoughts? |
I think we aren't at least against it. The privacy issue needs to be fixed though, and the 5 types approach doesn't seem to really help much, which is why I was thinking worklet based solutions, which would also help with responsiveness. |
To mitigate this privacy concern and at the same time allow UAs to support delay rendering of web custom formats, we propose the following:
The spec will be written in a way that supports all behaviors, so UAs that do not support delay rendering of web custom formats are still compliant with the spec. Adding @benjamind to chime in on this issue, as this API would help Adobe web properties reach parity with the native app. |
That doesn't seem like an acceptable model. That will inevitably lead to a race to the bottom, making privacy worse for all end users. |
@annevk Is Webkit going to support web custom formats for cross-origin sites? |
(with all my hats off) it's hard to see how we could if this privacy issue isn't adequately addressed. |
@hober @annevk Does the below solution address the privacy concern with web custom formats? If not, is there anything specific that you think is problematic?
|
The Web Editing Working Group just discussed
The full IRC log of that discussion
<dandclark> topic: [Delayed Clipboard Rendering] Privacy issue while reading data for web custom types
<dandclark> github: https://github.com//issues/439
<dandclark> snianu: TAG had feedback on this, had same privacy concern. Proposed mitigations that don't work. I responded in thread.
<dandclark> ...: At this point, issue has been dragging for too long. Privacy issue is from web custom format. Can we do this just for builtin formats?
<dandclark> ...: Can standardize all the infrastructure for this separately without web custom formats, privacy issue doesn't have to block that.
<dandclark> ...: Can we separate the two things? Make web custom formats as separate proposal, move forward with that separately if we find some mitigation.
<dandclark> ...: But that wouldn't require API changes to async clipboard API.
<dandclark> ...: Proposal is standardize for builtin formats. Secondary proposal to restrict custom formats to just 1. But if that's controversial let's resolve to just support builtin formats for delayed clipboard rendering.
<dandclark> smaug: Concern is if we do it only for builtins, and then want to do it later for custom formats, what if API shape needs to be different.
<dandclark> smaug: Could be confusing.
<dandclark> snianu: With the builtin formats, only change is to add callback instead of Blob.
<dandclark> ...: If we don't have callbacks for web custom formats, if we find some other way for devs to do that it would be some other API. To support web custom format we could add overloaded constructor. It's easily extendable. Don't have to change entire API surface.
<dandclark> smaug: If there's something that can support both use cases, hopefully we can build something forward-compatible.
<dandclark> snianu: Only thing that changes is the callback.
<dandclark> snianu: Say we add more security restrictions, that doesn't need a web API change. If you use web custom format you are subject to these restrictions, if not it just works.
<dandclark> snianu: Builtin formats add real value.
<dandclark> ...: And we're adding more.
<dandclark> whsieh: Hard to imagine case where having callback wouldn't be backwards-compatible.
<dandclark> whsieh: Provides value in short term.
<dandclark> smaug: Maybe.
<dandclark> whsieh: Even then, callback could return object we add.
<dandclark> smaug: Would be nice to see concrete proposal. E.g. I'd like to see using streams to write the data.
<dandclark> snianu: Looking into streams doesn't have to block the current API.
<dandclark> snianu: Proposal is the same, just don't allow custom formats.
<dandclark> smaug: Callbacks were resolved, but what they actually do was not resolved.
<dandclark> ...: Producing data as blob requires lots of memory, streams could solve that.
<dandclark> snianu: Could extend it to support streams in callback.
<dandclark> snianu: Could also talk to partners about this.
<dandclark> snianu: Some are interested in that. Could do it separately, doesn't have to block this.
<dandclark> smaug: That's fine. Let's get issue filed.
<dandclark> snianu: I'll file issue and link it.
<dandclark> RESOLVED: Privacy concern doesn't exist if we only support built-in formats. We'll do builtins first and then custom formats in the future.
<snianu> https://github.com/w3c/clipboard-apis/issues/191
<dandclark> s/and then custom formats in the future./and then consider custom formats in the future. |
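To make the "callback instead of Blob" point from the log concrete, here is a minimal sketch under stated assumptions: `Payload` is a plain string standing in for a Blob, and `ItemValue` and `materialize` are hypothetical names, not the proposed API surface. It only shows why accepting either an eager value or a callback keeps existing callers working.

```typescript
// Stand-in for Blob so the sketch runs anywhere; the real API deals in Blobs.
type Payload = string;

// A clipboard item value is either data produced at copy time
// or a callback that produces it later, on demand.
type ItemValue = Payload | (() => Payload);

// Resolve a value at paste time: call the callback if there is one,
// otherwise return the eagerly written data unchanged.
function materialize(value: ItemValue): Payload {
  return typeof value === "function" ? value() : value;
}

// Counts how often the expensive format was actually generated.
let renderCount = 0;

const itemData: Record<string, ItemValue> = {
  // Existing usage: data supplied up front.
  "text/plain": "eager text",
  // Delayed usage: nothing is generated until a target asks for this format.
  "text/html": () => {
    renderCount += 1;
    return "<p>rendered on demand</p>";
  },
};

const eagerResult = materialize(itemData["text/plain"]);
const countBeforeHtmlRead = renderCount;
const delayedResult = materialize(itemData["text/html"]);
```

Because the eager branch is untouched, code written against the value-only form behaves the same, which is the backwards-compatibility argument made in the discussion.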
There's a fundamental issue with custom formats in that they cannot be sanitized and thus allow for exfiltration of data the end user might not expect across applications. I think the timeout was proposed to address a separate issue, but that does not seem like a great solution as the end user will now have this latency inflicted upon them. Only supporting this for built-in formats as resolved seems like a reasonable way forward. |
@annevk commented on this in the WebKit standards-positions repo:
@evanstade @inexorabletash @sanketj We want to discuss possible mitigations for this issue.