Add more detail to security section #185

Closed
cwilso opened this issue Mar 26, 2018 · 23 comments · Fixed by #250 or #271

Labels
category: editorial (https://www.w3.org/policies/process/#class-2), Needs Edits (https://speced.github.io/spec-maintenance/about/)

Comments

@cwilso
Contributor

cwilso commented Mar 26, 2018

Perhaps some of the detail from mozilla/standards-positions#58 (comment).

@cwilso cwilso self-assigned this Mar 19, 2019
@cwilso cwilso modified the milestones: V2, V1 Mar 19, 2019
@chr15m

chr15m commented May 15, 2019

As per the linked discussion it might be a good idea to specifically make reference to the upload-firmware threat that Mozilla is worried about.

To address that specific concern, which is only an issue if the page wants to do 0xF0 prefixed SysEx messages, maybe it would be a good idea to add a suggestion that the user is prompted/warned. Other messages in the same range (e.g. 0xF1, 0xF8 etc.) are not a problem when it comes to this particular security concern.

@toyoshim
Contributor

The current spec already covers this case.
SysEx is not available by default.
Sites need to call the API with the sysex option, which results in a permission prompt.

In fact, the spec suggests always prompting, and Chrome plans to do so within months.
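
As a rough illustration of the opt-in described above, a page might request access as follows. `requestMidi` is a hypothetical helper name; `requestMIDIAccess`, the `sysex` option and `sysexEnabled` come from the Web MIDI API. The navigator object is passed in as a parameter only so the sketch can be exercised outside a browser.

```javascript
// Hypothetical helper: request MIDI access, opting in to SysEx only when asked.
// `nav` is expected to be `navigator` when running in a browser.
function requestMidi(nav, wantSysex) {
  if (!nav || typeof nav.requestMIDIAccess !== 'function') {
    return Promise.reject(new Error('Web MIDI is not available here'));
  }
  // Without { sysex: true } the returned MIDIAccess cannot send or receive SysEx.
  return nav.requestMIDIAccess({ sysex: Boolean(wantSysex) });
}

// Browser usage (the user agent may show a permission prompt):
// requestMidi(navigator, true).then(
//   (access) => console.log('SysEx enabled:', access.sysexEnabled),
//   (err) => console.log('Access denied or unsupported:', err.name)
// );
```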

@chr15m

chr15m commented May 16, 2019

Thank you, yes, the current spec covers the basic case and "prompt always" is good.

However from the security perspective that Mozilla advocates it seems like the granularity of the cases in the spec is not high enough. They are worried about case (3) below where insecure user devices may be connected:

  1. No-SysEx use (safe).
  2. SysEx use but without 0xF0 custom SysEx messages (safe).
  3. SysEx use with 0xF0 custom messages (potentially unsafe).

In my experiments the {"sysex": true} setting is required for both (2) & (3) MIDI messages. For example, to enable "Song Position Pointer" (0xF2) messages in Chrome I had to pass the sysex option. SPP messages belong to (2) above.

From what I am reading from Mozilla it seems like they think situation (3) is sufficiently dangerous as to warn the user more sternly and there is nothing in the spec about this concern.

In addition it seems like (2) is closer to (1) in terms of security and functionality (e.g. SPP messages are more like note-on and control-change messages than they are like custom sysex messages). Even in the MIDI spec they are not called "SysEx" but fall in the "System Common Messages" section.
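
To make the status-byte ranges under discussion concrete, here is a minimal sketch that sorts a MIDI 1.0 status byte by range. The function name and the category labels are chosen for illustration only, and the sketch does not describe how any browser decides what requires the sysex option.

```javascript
// Classify a MIDI 1.0 status byte by the range it falls in.
// The returned labels are illustrative names, not spec terminology.
function classifyStatus(status) {
  if (!Number.isInteger(status) || status < 0x80 || status > 0xff) {
    return 'not-a-status-byte'; // data bytes are 0x00-0x7F
  }
  if (status < 0xf0) return 'channel';            // 0x80-0xEF: note on/off, control change, ...
  if (status === 0xf0) return 'system-exclusive'; // 0xF0: start of a SysEx message
  if (status < 0xf8) return 'system-common';      // 0xF1-0xF7, e.g. 0xF2 Song Position Pointer
  return 'system-real-time';                      // 0xF8-0xFF, e.g. 0xF8 Timing Clock
}
```

With this split, 0xF2 (SPP) lands in the same bucket as the other System Common messages rather than with 0xF0.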

I personally do not feel the security issues raised are as serious as some people indicate on that thread, but I understand their position and I would like to see MIDI support land in Firefox.

@toyoshim
Contributor

We will clarify a subset of safe sysex common messages and will update the spec to explain the three levels of privileges. Some platforms may not support the strongest vendor/product-specific messages, like updating firmware via sysex messages.

@bome

bome commented Oct 12, 2020

@toyoshim that sounds fantastic! Hopefully it'll convince Mozilla that a safe implementation of Web MIDI is possible.

@chr15m, "SysEx without 0xF0" is inaccurate, please check out the MMA's classification of MIDI messages. I've just amended it with the respective status bytes to clarify. The MMA's position is that only category 5 is potentially harmful.

@chr15m

chr15m commented Oct 13, 2020

@bome great! Thank you for this more detailed info.

@hoch
Member

hoch commented Sep 11, 2023

2023 TPAC Audio WG Discussion:

  • Break up the current privacy/security section into a separate section. (This is a V1 blocker.)
  • Review the details provided in this issue and select the ones that are relevant to the new security section. (Not a V1 blocker, but relatively easy to do)
  • Create, review, and merge a PR.

@mjwilson-google
Contributor

Here is some more information on breaking up the privacy/security section: https://w3c.github.io/documentreview/#how_to_get_horizontal_review

@mjwilson-google
Contributor

From reviewing the RFCs:

I think the way I will split this is to have the privacy section focus on fingerprinting and tracking concerns, and the security section focus on everything else. The initial breaking up might require some minor rewriting as well to make everything flow properly.

@jwt27

jwt27 commented Sep 28, 2023

I was told to also post my concerns here.

My idea was to develop a hardware synth with a web interface, but the SecureContext requirement makes this use case practically impossible. Implementing SSL/TLS in an embedded http/websocket server would add significant overhead, and dealing with SSL certificates here is entirely impractical.

I understand the concerns about sysex output, although it's not entirely clear to me how SSL improves this. For device enumeration and input though, I do believe the SecureContext requirement is overly restrictive, and I'm hoping it can be relaxed.

@cwilso
Contributor Author

cwilso commented Sep 28, 2023

I'm not sure I understand your use case; could you elaborate? I'm not sure why you would have an embedded server - if you are developing a web interface for a hardware synth, you can put the web interface on a public-facing web server (even on GitHub); if you were putting an embedded server on the hardware synth itself, you probably don't need Web MIDI (because the server could directly drive the synth).

@jwt27

jwt27 commented Sep 28, 2023

Maybe it's a bit niche, but I thought it would be a valid use case. The idea is to have a simple http server integrated in the synth itself, serving a static html/js page. On the client side you can then connect a controller via WebMIDI and communicate directly with the synth via websockets.

True, I suppose I don't need it since you could connect the controller directly by MIDI cables. But it would be very convenient to have, and could also enable more exotic workflows (eg. collaborative use by sharing over a vpn, if latency permits).

@chr15m

chr15m commented Sep 28, 2023

This is the same situation when using a native wrapper library like Cordova, Capacitor, Ionic etc. Most things work on these platforms because browsers generally make an exception to the SSL rules for pages served over localhost. Chrome allows webmidi over localhost for example. If Firefox does not, it would rule out using the Firefox engine in such situations.

@bradisbell

@jwt27 Your use case actually isn't niche at all, and I think it should be given more consideration. We have the same problem on every network connected appliance that has a web server built-in. None of the clients for these devices are allowed to use any of the more "advanced" web platform features. With Google's persistent push to prevent functionality for origins without HTTPS, all of these types of devices' capabilities are limited. What's worse is that this results in quite insecure workarounds, like requiring all your users to install software to access your synth (or your network camera, or router, or, etc.).

I've made a more general "web we want" post about this: WebWeWant/webwewant.fyi#245 (comment) Unfortunately it's been mostly ignored and I do not know how to pursue it further. A comment about your synth use case would be welcomed there, but I don't know if it's worth your time as I'm not sure posts there are properly considered.

@mjwilson-google
Contributor

I hear your pain trying to make a web-enabled hardware device. Self-signed certificates are annoying to your users, and any overhead in one part of the system directly takes away from the other parts.

The way the specification is written now, all of the interfaces require a SecureContext and we don't have separate input and output methods for SysEx and non-SysEx messages. I don't see a clear way to relax the specification only for non-SysEx use cases. I am also reluctant to propose major changes to the interfaces since there are already shipping implementations of Web MIDI.
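
For page authors, the practical consequence can be checked up front. The following is a minimal sketch, assuming the entry point is simply absent outside a secure context; `midiAvailability` and its return labels are hypothetical names, while `isSecureContext` and `requestMIDIAccess` are the real platform properties.

```javascript
// Hypothetical helper: report why Web MIDI may be unusable in a given global.
// `globalObj` is expected to be `window` (or `globalThis`) in a browser.
function midiAvailability(globalObj) {
  // Pages served over plain HTTP from a non-local host are not secure contexts.
  if (!globalObj.isSecureContext) return 'insecure-context';
  const nav = globalObj.navigator;
  if (!nav || typeof nav.requestMIDIAccess !== 'function') return 'unsupported';
  return 'available';
}

// Browser usage:
// console.log(midiAvailability(window));
```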

If there is a specific suggestion for how to spec this out please make a pull request and the working group can review it. Please keep in mind that this use case is at least uncommon, even if not "niche", and that the current security model was considered and discussed at length.

I don't think the Web MIDI API is going to drive sweeping changes around how HTTPS is used on the Web, unfortunately. One thing to keep in mind is that even if it's not intended, it's possible to expose embedded web servers to the greater Internet. Web APIs have to assume that they will be used on the Web, and I personally feel like it's worth being a bit conservative when there is potential for abuse. But as noted above, if we can find a clean way to modify the spec which satisfies this use case I am open to it.

@cwilso
Contributor Author

cwilso commented Sep 28, 2023

@jwt27 Perhaps I'm not understanding how this hardware device would be connected. I think you mean the hardware synth would have a network connection that you would use for its connection to the computer, and you would have some kind of discovery, or a hard-coded IP address on local network, and navigate to that. That seems... odd. If you're building a synthesizer, why wouldn't you make it a MIDI device (likely USB-MIDI), if only to make it easy to insert into a typical music setup (i.e. work like any other synth device)? And if you DID do that, then it's pretty trivially easy to have a MIDI loop app somewhere on the Web that connects your device (which would have a known device name) and any controller you like, and it would be trivial to have that web app live on a secure connection. (Even with a service worker, so after the first time you wouldn't need to be network-connected.)
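
A "MIDI loop" page of the kind described here could be quite small. The sketch below is one possible shape, not a reference implementation: `connectLoop` and the device name `'My Synth'` are hypothetical, and `access` is any object shaped like a `MIDIAccess` (Map-like `inputs` and `outputs`).

```javascript
// Hypothetical helper: forward every message from the other inputs to the
// output whose name matches `synthName`. Returns false if no such output exists.
function connectLoop(access, synthName) {
  let synthOut = null;
  for (const output of access.outputs.values()) {
    if (output.name === synthName) synthOut = output;
  }
  if (!synthOut) return false; // synth not connected

  for (const input of access.inputs.values()) {
    if (input.name === synthName) continue; // don't echo the synth back to itself
    input.onmidimessage = (event) => synthOut.send(event.data);
  }
  return true;
}

// Browser usage, from a page served over HTTPS:
// navigator.requestMIDIAccess().then((access) => connectLoop(access, 'My Synth'));
```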

@bradisbell you are very incorrect in painting this as "Google's persistent push to prevent [powerful] functionality for origins without HTTPS"; this is an industry-wide push, and as I previously pointed out, is a strong recommendation from the W3C TAG (only one member of the TAG works for Google). This is a security choice by people who have deeply explored how web security needs to work.

As an aside, I don't think it would be a good idea to regress to not requiring a secure context; at the very least this would need to be justified to the TAG (who review Web APIs).

@jwt27

jwt27 commented Sep 29, 2023

Maybe part of the problem is that users have no say in how a "secure context" is defined. I think they should at least be able to whitelist local subnets.

> And if you DID do that, then it's pretty trivially easy to have a MIDI loop app somewhere on the Web that connects your device (which would have a known device name) and any controller you like, and it would be trivial to have that web app live on a secure connection. (Even with a service worker, so after the first time you wouldn't need to be network-connected.)

Sure, all of that can be done via external services. But it would be much nicer if it was all self-contained in one box.

@mjwilson-google
Contributor

The Secure Contexts specification does allow user-agents to have a way to allow end users to configure a set of origins as trustworthy, although this is intended for development:
https://w3c.github.io/webappsec-secure-contexts/#is-origin-trustworthy
https://w3c.github.io/webappsec-secure-contexts/#development-environments
This doesn't guarantee that every user-agent has this ability though.
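
To show why an embedded server on a LAN address falls outside the definition, here is a simplified sketch of the trustworthiness check. It is an approximation written for this thread and omits parts of the linked algorithm; `isPotentiallyTrustworthy`, the `userAllowlist` parameter and the example addresses are all assumptions for illustration.

```javascript
// Simplified approximation of "is this origin potentially trustworthy?".
// `userAllowlist` stands in for a user-agent setting listing trusted origins.
function isPotentiallyTrustworthy(urlString, userAllowlist = []) {
  const url = new URL(urlString);
  if (url.protocol === 'https:' || url.protocol === 'wss:') return true;
  const host = url.hostname;
  // Loopback addresses: 127.0.0.0/8 and ::1.
  if (/^127(\.\d{1,3}){3}$/.test(host) || host === '[::1]') return true;
  if (host === 'localhost' || host.endsWith('.localhost')) return true;
  if (url.protocol === 'file:') return true;
  // Anything else is trusted only if the user configured it explicitly.
  return userAllowlist.includes(url.origin);
}
```

Under these rules a synth serving `http://192.168.1.50/` is not trustworthy unless the user adds that origin to the allowlist, which matches the situation described in this thread.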

It sounds like this discussion might be expanding beyond the scope of Web MIDI. If the issue is about how secure context is defined, it's probably more useful to discuss over there since we can't change that with the Web MIDI spec (https://github.com/w3c/webappsec-secure-contexts/issues). If it is about how a particular browser or other user-agent implements secure context it's probably better to work with that project's bug tracker, for the same reason.

Of course it's fine to continue discussing here. We should consider what SecureContext is trying to protect against:
https://w3c.github.io/webappsec-secure-contexts/#threat-models-risks
The following quote is particularly relevant, since it shows that we can't rely solely on user permissions: "Granting permissions to unauthenticated origins is, in the presence of a network attacker, equivalent to granting the permissions to any origin."

Enumerating devices is a potential fingerprinting vector, so access to those lists should be in a SecureContext. But as I understand it, the Web MIDI API can't really function without first using MIDIAccess. So I don't see how we could satisfy @jwt27's use case while keeping device enumeration secure. Am I missing something?

@cwilso
Contributor Author

cwilso commented Sep 29, 2023

> > And if you DID do that, then it's pretty trivially easy to have a MIDI loop app somewhere on the Web that connects your device (which would have a known device name) and any controller you like, and it would be trivial to have that web app live on a secure connection. (Even with a service worker, so after the first time you wouldn't need to be network-connected.)

> Sure, all of that can be done via external services. But it would be much nicer if it was all self-contained in one box.

But I'm not sure how you would put this in "one box". Your user needs to "go" to something - the best experience would probably be via WebUSB, since it could redirect to a web app upon the device being plugged in. But in your "local server" case, how would you set up that local server? Making the hardware a TCP/IP device, with an embedded web server? (I'm presuming this is what you mean, since that's the case where it would be more difficult to use HTTPS.) The user experience of discovering that seems like it would be less than ideal; you'd need to provide network configuration and discovery (unless it has a hardcoded fixed IP address). The only benefit of this is that it would work if the entire system was not connected to the Internet. By contrast, you could either have a web URL to direct the user to and a MIDI-connected synth (what I suggested above), or you could make your hardware device a USB device that supported WebUSB (in which case it can pop up a "WebSynth detected, click to go to mywebsynth.com" dialog when the device is connected).

@jwt27

jwt27 commented Sep 29, 2023

> But I'm not sure how you would put this in "one box". Your user needs to "go" to something - the best experience would probably be via WebUSB, since it could redirect to a web app upon the device being plugged in. But in your "local server" case, how would you set up that local server? Making the hardware a TCP/IP device, with an embedded web server? (I'm presuming this is what you mean, since that's the case that would be more difficult to use HTTPS.). The user experience discovering that seems like it would be less ideal; you'd need to provide network configuration and discovery (unless you have it a hardcoded fixed IP address). The only benefit of this is it would work if the entire system was not connected to the Internet. By contrast, you could either have a web URL to direct the user to and a MIDI-connected synth (what I suggested above), or you could make your hardware device a USB device that supported WebUSB (in which case it can pop up a "WebSynth detected, click to go to mywebsynth.com" dialog when the device is connected.

IP configuration could be static or via DHCP, and the IP address would be shown on the synth's display. Or it could announce its hostname via mDNS or similar. In any case, configuration would be a one-time event, I don't see this as a major hurdle.

> The Secure Contexts specification does allow user-agents to have a way to allow end users to configure a set of origins as trustworthy, although this is intended for development: https://w3c.github.io/webappsec-secure-contexts/#is-origin-trustworthy https://w3c.github.io/webappsec-secure-contexts/#development-environments This doesn't guarantee that every user-agent has this ability though.

> It sounds like this discussion might be expanding beyond the scope of Web MIDI. If the issue is about how secure context is defined, it's probably more useful to discuss over there since we can't change that with the Web MIDI spec (https://github.com/w3c/webappsec-secure-contexts/issues). If it is about how a particular browser or other user-agent implements secure context it's probably better to work with that project's bug tracker, for the same reason.

Thanks, I see this discussion already exists:
w3c/webappsec-secure-contexts#60
Will subscribe there too.

> Of course it's fine to continue discussing here. We should consider what SecureContext is trying to protect against: https://w3c.github.io/webappsec-secure-contexts/#threat-models-risks The following quote is particularly relevant, since it shows that we can't rely solely on user permissions: "Granting permissions to unauthenticated origins is, in the presence of a network attacker, equivalent to granting the permissions to any origin."

> Enumerating devices is a potential fingerprinting vector, so access to those lists should be in a SecureContext. But as I understand it, the Web MIDI API can't really function without first using MIDIAccess. So I don't see how we could satisfy @jwt27's use case while keeping device enumeration secure. Am I missing something?

A possible solution is to simply remove all identifiable information. In an insecure context, enumeration could always return a "Default MIDI Device", regardless of whether one exists. Which specific device that is mapped to is then left up to the browser, eg. by prompting the user on the permission dialog.
Access to multiple devices would then still be locked behind a SecureContext, but I expect the general use case requires only a single input and/or output device.
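
The idea can be sketched as follows. This models the proposal only; it is not how the Web MIDI API behaves today, and `visiblePorts`, `DEFAULT_PORT` and the port object shape are hypothetical.

```javascript
// Sketch of the proposal above: NOT current Web MIDI behaviour.
// In an insecure context, enumeration yields one generic entry and nothing else.
const DEFAULT_PORT = Object.freeze({
  id: 'default',
  name: 'Default MIDI Device',
  manufacturer: '',
});

function visiblePorts(realPorts, isSecureContext) {
  if (isSecureContext) return realPorts.slice();
  // Always exactly one generic entry, whether or not any device exists, so the
  // list carries no information about the attached hardware.
  return [DEFAULT_PORT];
}
```

Because the insecure result is identical for every machine, the list itself cannot be used for fingerprinting; which physical device backs the default entry would be left to the browser and its permission dialog.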

@mjwilson-google
Contributor

Oops, didn't mean to close this. Sorry.

@mjwilson-google mjwilson-google added the Needs Discussion (The issue needs more discussion before it can be fixed.) label Sep 30, 2023
@mjwilson-google mjwilson-google removed their assignment Oct 15, 2023
@mjwilson-google
Contributor

TPAC 2024 notes:
The requirement for CR is that we have all known security issues described. But it is not necessary to have all of them solved. It's ok to say we have identified a security issue but are not doing anything to resolve it.

So the way to resolve this issue is to go back over the thread and make sure all the issues are described in the section, and to also mark if there are no mitigations.

@mjwilson-google mjwilson-google added Needs Edits (https://speced.github.io/spec-maintenance/about/) and removed Needs Discussion (The issue needs more discussion before it can be fixed.) labels Sep 24, 2024
@mjwilson-google
Contributor

I would like to close this out soon with another PR, since it is a wide review blocker.

I went over the link in the first comment of this issue:
mozilla/standards-positions#58
There are many comments, but the main concern is about malicious firmware updates causing USB-MIDI devices to act as HID devices which would be a complete system compromise.

The issue linked to the following public doc about security analysis and mitigation:
https://docs.google.com/document/d/1SjYRmNvQKxOPHufWbx6n0NOeTKKd_O1WIiTBZAjVj5E/edit#heading=h.lavkfo6fb57g
This defines five classes of MIDI messages, and goes over some possible mitigation strategies for the firmware loading issue.

Finally, we have discussion in this issue about secure context and the difficulties it causes.

The current security section of the spec is organized by discussing the difference between sending or receiving short or SysEx messages (four different combinations). However, SysEx sending/receiving is discussed as one point in each subsequent list. It also only mentions firmware download in passing, whereas this seems to have been the biggest security issue that has been discussed so far.

Given the above, my plan is the following:

  • Don't change current spec language around secure context. Sorry @jwt27, I think this is out of scope for the Web MIDI spec. I hope that the secure context definition can be updated in a way that is safe and also makes local embedded device deployment easy; if that happens then Web MIDI implementations should automatically pick up the changes without us having to change any spec language here.
  • Discuss malicious firmware update issue first and suggest mitigation strategies, with new prose. In particular, there was a lot of discussion around allowlist / blocklist, permissions, and secure context in the linked issue. Also note that the main concern is for older devices, and note that at least one actual device has been identified that doesn't require user interaction to initiate a firmware update mode.
  • Rework the existing prose into two new categories: short messages + MMA-defined SysEx, and manufacturer-defined SysEx. Then discuss sending / receiving combined in each category. This corresponds to classes 1-4 and class 5 in the public doc linked above.
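
One way to picture the two categories in the last bullet is to look at the ID byte that follows 0xF0. The sketch below is an illustration under stated assumptions, not proposed spec text: the function names and labels are hypothetical, the mapping onto the doc's classes 1-4 and class 5 is an assumption, and the non-commercial ID 0x7D is conservatively treated as manufacturer-defined.

```javascript
// Split messages into the two categories named in the plan above.
function sysexCategory(data) {
  if (data.length === 0 || data[0] !== 0xf0) return 'short-message';
  const id = data[1];
  // 0x7E = Universal Non-Real Time, 0x7F = Universal Real Time (MMA-defined).
  if (id === 0x7e || id === 0x7f) return 'mma-defined-sysex';
  // Everything else, including three-byte manufacturer IDs that start with 0x00.
  return 'manufacturer-defined-sysex';
}

// Hypothetical gate: pass manufacturer-defined SysEx only when explicitly allowed.
function maySend(data, allowManufacturerSysex) {
  if (sysexCategory(data) !== 'manufacturer-defined-sysex') return true;
  return Boolean(allowManufacturerSysex);
}
```

A gate like this would sit in front of the send path; a firmware-update payload would fall into the manufacturer-defined category and be held back unless a stronger permission had been granted.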

I didn't see any other security issues that weren't covered by the existing prose. If anybody has additional input please feel free to add here, otherwise we can discuss on the PR when it's ready. If I have any difficulties reworking the existing prose I will raise them here.
