Virtual MIDI ports #45
I've spent some time thinking about this feature lately and I want to share some ideas and open questions I have. At the simplest, what I see we could have is this:

```webidl
dictionary MIDIPortOptions {
    DOMString name;
    DOMString manufacturer;
    DOMString version;
};

partial interface MIDIAccess {
    MIDIInput createVirtualOutput(MIDIPortOptions options);
    MIDIOutput createVirtualInput(MIDIPortOptions options);
};
```

As you can see, the resulting ports would naturally have no backing hardware device. One open question is whether we want to have these as factories on the MIDIAccess object or as plain constructors. However, this needs to be thought through. I've heard that some iOS apps use virtual MIDI ports to communicate with each other. If that is the case, we need to consider whether a web app pretending to be another native application should be considered a potential risk. In the worst-case nightmare scenario, an app would be transmitting Lua (or similar) code via MIDI, which could result in a cross-application scripting attack, possibly leveraging all the privileges the user has granted that application. Another, much likelier case would be that a user's credentials would be transferred from one application to another, similar to OAuth except that the authentication would happen in another application instead of on the web, and an intercepting application could steal these credentials. |
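A hedged sketch of how the proposed factories might be used. Nothing here is implemented in any browser; `FakeMIDIAccess` is a stand-in stub invented purely to illustrate the call shape. Note that the return types in the proposal are deliberately crossed: a virtual *output* appears to other applications as an output port, so the creating application reads from it via a `MIDIInput`-like object, and vice versa.

```javascript
// Hypothetical sketch of the proposed factory API. Since no browser
// implements createVirtualOutput(), a minimal stand-in MIDIAccess is
// stubbed here purely to illustrate the intended call shape.
class FakeMIDIAccess {
  createVirtualOutput(options) {
    // Other apps would see this port as an output (a destination they can
    // send to); the creating app reads what they send, so the factory hands
    // back an input-like object, matching the MIDIInput return type above.
    return {
      name: options.name,
      manufacturer: options.manufacturer,
      version: options.version,
      onmidimessage: null,
    };
  }
}

const access = new FakeMIDIAccess();
const port = access.createVirtualOutput({
  name: "My Web Synth", // names here are made up for illustration
  manufacturer: "example.com",
  version: "1.0",
});
port.onmidimessage = (e) => console.log("received", e.data);
console.log(port.name); // "My Web Synth"
```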
Here are my thoughts on virtual ports. If we start to support virtual ports, we should consider each device's latency more seriously. Also, virtual ports can be used for controlling remote MIDI devices over the internet. Even in this use case, latency is important and should be handled correctly. So in the v2 spec, MIDIPort may want to have an additional attribute for the latency from event arrival to audio playback. |
I agree about the latency; we need to take that into account. Some use cases:
So basically, I think what we need is a way for normal ports to read their latency (if not available, report 0) and for virtual ports to write their latency, e.g.:

```webidl
partial interface MIDIInput {
    readonly attribute double latency;
};

partial interface MIDIOutput {
    readonly attribute double latency;
};

partial interface VirtualMIDIInput {
    attribute double latency;
};

partial interface VirtualMIDIOutput {
    attribute double latency;
};
```
|
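To make the latency attribute concrete, here is a minimal sketch (plain JavaScript; the function name and units are illustrative, not from the spec) of how a sequencer might use it: subtract the port's reported output latency from the musical target time when scheduling a send, treating an unreported latency as 0 as suggested above.

```javascript
// Illustrative sketch, not part of any spec: compensate a send timestamp
// for a port's reported latency. Latency is assumed to be in seconds;
// timestamps are DOMHighResTimeStamp-style milliseconds.
// A latency of 0 doubles as "unknown", per the suggestion above.
function compensatedTimestamp(targetTimeMs, port) {
  const latencySec = port.latency || 0; // unknown -> assume zero latency
  return targetTimeMs - latencySec * 1000;
}

// Example: a port reporting 15 ms of output latency should be sent to
// 15 ms early so the sound lands on the musical target time.
const port = { latency: 0.015 };
console.log(compensatedTimestamp(1000, port)); // 985
```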
I'd prefer these to be constructors instead of a factory. Agreed about the latency, but I'm not sure about using 0 to mean both zero latency and unknown... but then, making the attribute nullable might not be great either. |
I can't think of any case where the default behavior in the case of unknown latency would not be to assume zero latency, so for most cases there would be no extra work to account for the unknown latency situation, hence the suggestion. If we're able to come up with sane scenarios where you'd benefit from it being non-zero and unknown, I'll be happy to use another value.
I agree, but we'll have to carefully assess whether there's a security risk in that. |
I understand the security issues you mentioned above - but those appear to be orthogonal to having a constructor or factory (maybe I'm missing something). |
It comes down to whether we need to ask for permission or not, and if we do, the factory method has the permission model already set up (to get the MIDIAccess instance), whereas for global constructors there isn't one. That is, unless the constructor takes a MIDIAccess instance as an argument (in which case I'd argue that it doesn't make sense to detach it from the MIDIAccess) or we throw if a valid MIDIAccess instance hasn't been created during the current session. |
I'm not sure that the security/privacy model for virtual ports will be the same as for non-virtual ports, as I expect one would want to have virtual ports exposed to native software as well? |
I continue to hear demand for this from nearly every vendor I talk to. |
#126 is pretty close to this issue, but: The situation has become more urgent than it was last year because operating systems are no longer providing GM Synths. Whether this gets into the Web MIDI API itself or not, it would be great to have a shim. |
Just for playing back a SMF file with GM synths or your own custom synths, Web MIDI is not needed at all. Web Audio is enough for such purpose. Did you see the link I posted in #126? The important thing here is that we need a standardized way to share software synths written in JavaScript. |
@notator The delta between virtual input and virtual output ports is nearly zero. If we do one, we should do the other. On the other hand, for "I'm asking for 'one' that can be loaded with custom sounds" - you're asking for an IMPLEMENTATION of a virtual device that enables custom sound loading; this isn't going to get baked into the Web MIDI API, since it would be like declaring that one particular programmable synth is the right one. |
I think this is a great idea but feels wide open to very different definitions of what constitutes a "virtual device", and where its implementation might live (in a browser page? in the browser itself, persistently? in native apps?). Also there's some overlap with existing "virtual" device mechanisms for MIDI, e.g. in iOS. And how persistent or transient is such a device? Is it only there when the page that instantiates it happens to be open? Not to mention all the security considerations. In short virtual devices seem cool and I'm sure vendors are asking for them (hey I want them too), but I wonder if we all know exactly what we mean when we say it, and if we mean the same thing. It feels like more research and discussion is needed to nail the use cases and make this idea definite enough to implement. Also, Web Audio has a very similar problem to solve in terms of inter-app communication. I would hope that Web Audio and Web MIDI could wind up adopting a similar approach to abstracting sources and destinations for both audio and MIDI data. |
If we solved issue #99 and then used ServiceWorkers (https://github.com/slightlyoff/ServiceWorker/blob/master/explainer.md) then it wouldn't be hard to extend the spec to allow for virtual ports and have a reasonable lifecycle story. (At least on Linux and Mac. Windows has no standard way to do virtual ports.) |
SGTM on the Worker story. I think exposing virtual ports to applications outside the web browser could be optional on platforms whose underlying system supports such a feature. |
@cwilso Hi Chris, I think it would be a good idea to concentrate on virtual output devices first, then see what the answer implies for input. That's because I can see from toyoshim's link [1] that output looks pretty ripe already... @toyoshim Hi! Thanks for the link. Great stuff! Is that your site? There is, of course, quite a lot to discuss:
To help thinking about threading, here's a use case: I'm a beginner with web workers, but currently imagine setting up something like this before the performance begins: If things do indeed work like that, then I'm the one who has control over the life cycle of all the threads and devices. But can SharedWorkers or ServiceWorkers access virtual output devices? I have been unable to find out. |
@notator It is not my site; it belongs to a friend of mine, a well-known music application developer in Japan. |
@toyoshim Ah yes, I forgot: +1 for "The important thing here is that we need a standardized way to share software synths written in JavaScript." I think the interface that should be implemented by software synths can be more or less dictated by the powers-that-be here. This is much easier than defining an interface for hardware manufacturers. The interface should, I think, be modelled on the one for hardware synthesizers. And, if it's not clear enough already, my original request from #126 is no longer on the table. :-) |
@agoode Service Worker WILL NOT fix this. You wouldn't be able to keep an AudioContext alive inside a SW; SWs are designed to come alive when needed, but not be resident/running all the time. For a soft synth, you need to wake up and be alive (when routed, usually). Let's keep this focused: THIS issue is about creating virtual MIDI ports, input and output, that can then be used by other programs on the system while this application is resident - i.e., creating a MIDI "pipe". The other issue (#124) is for managing and referring to virtual instruments - including, presumably, how you initialize them. (ServiceWorker might be involved there, but it's not going to solve it by itself.) Whether the MIDI API is available to Workers (#99) is relevant in that you'll probably need it in the context of whatever the initialization context is for #124, but it's also useful in more narrow contexts (e.g. I want to run my sequencer in a non-UI thread). |
Forgot to say: @joeberkovitz: Note the above, I'm trying to keep each of these issues separated, because the bedrock they detail is independently useful. This issue, for example - you could utilize a Web MIDI synth that you loaded in your browser from Ableton (since it could show up as a MIDI device in OSX). The top-of-the-heap "virtual device" spec is #124, and yes, it's heavily related to the virtual device/routing issue in Web Audio; I'd expect at the very least you'd want to be able to bundle them. |
Currently, I'm a bit confused, but would very much like to understand this. So much has happened since the original posting of this issue in January 2013, that I think it would be helpful to start a fresh one. In particular, I suspect that the meaning of "virtual port" has changed. @rianhunter (I've now added a thumbs up to your comment above). Could you say more precisely what you are trying to achieve? A motivation and concrete use-case would be very helpful!
If communications were going via a well defined API, then restrictions (e.g. no SysEx messages) could apply. As I understand it, the port could either be created by the native application itself, or be part of the operating system that is running the application. In either case, the API's restrictions could be strictly enforced by the OS. Maybe that would be a necessary requirement for the API: It would need the cooperation of the OS vendors. So they'd have to be involved in development of the API at an early stage. |
Sure, that might be generally helpful. I think my use case represents a generic one, but please lmk if you disagree. I provide a web application to users of my synthesizer that exposes all MIDI functions that my synthesizer supports. Critically, I provide an interface to upload samples to the synthesizer (which uses SysEx messages) but also every other CC, NRPN, and RPN that my synth supports, to help them quickly get up to speed. Another core feature is that the web application allows users to save "sessions" which represent every MIDI setting currently set by the web application. This allows them to quickly restore their current settings. This application intends to operate and feel like a native application, so asking the user permission to use MIDI is perfectly fine. As my product has been around for a few years, a common use case has emerged where my users will have multiple MIDI sources connected to the synthesizer, one of which is my web application. This is a problem for users who use the web application to manage device state, because if they change a CC through a separate MIDI controller then my web application is not aware of that change, so they have to key it in twice if they want to preserve that setting in the web application: once through their MIDI controller and once through the web application. My idea is to provide a virtual MIDI input port from the web application, through which they can route all of their controllers; the web application would resend those messages to the output port corresponding to the device. That way, every setting changed through a hardware controller will be reflected in the web application. You can find the application in question here: https://www.supermidipak.com/app/ |
Thank you, this is a well-known bug for us. Yes, technically it's a Windows bug. We still have a responsibility to design web APIs that are possible to implement safely, with the understanding that OSes and other applications will often have bugs.
I think the ultimate goal is to allow web applications to send and receive MIDI messages to and from native applications, and possibly to and from other web applications. This issue #45 is about creating virtual MIDI input and output ports associated with a web application that are visible to native applications on the same system, which may be a step in that direction. My current understanding of "virtual port" is that it shows up in the operating system as if it were any other MIDI device, and can be accessed via the Web MIDI API like any other port. I read through things in more detail and discussed with @cwilso directly, and it looks like the big potential security issue here is what the last paragraph of the second comment on this issue mentions: two different web applications could open virtual MIDI ports and have a channel to send arbitrary data to each other. This may circumvent the protections the web platform provides for certain types of data transmission. It's subtly different from existing loopback MIDI devices on the user's computer because the user wouldn't necessarily have knowledge of or control over ports created by the web applications. Even restricting to non-SysEx messages would allow arbitrary data transmission, since the data could be encoded into CC or note on/off messages. I don't think the OS could do much to help with this either, since the data would all still be valid MIDI messages.
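To illustrate why restricting virtual ports to non-SysEx messages would not prevent arbitrary data transfer, here is a small sketch (names and the CC numbers chosen are illustrative, not from any real exploit): any byte stream can be split into 7-bit chunks and carried entirely in ordinary Control Change messages.

```javascript
// Sketch: arbitrary bytes tunneled through ordinary (non-SysEx) Control
// Change messages. Each CC data byte carries only 7 bits, so every 8-bit
// input byte is split across two CC messages on channel 1 (status 0xB0).
// CC numbers 20 and 21 are arbitrary choices for this illustration.
function encodeAsCC(bytes) {
  const messages = [];
  for (const b of bytes) {
    messages.push([0xB0, 20, b & 0x7f]);        // low 7 bits on CC #20
    messages.push([0xB0, 21, (b >> 7) & 0x01]); // high bit on CC #21
  }
  return messages;
}

function decodeFromCC(messages) {
  const bytes = [];
  for (let i = 0; i < messages.length; i += 2) {
    bytes.push(messages[i][2] | (messages[i + 1][2] << 7));
  }
  return bytes;
}

const secret = [0x48, 0x69, 0xFF];
const wire = encodeAsCC(secret);  // every message is a valid MIDI CC message
console.log(decodeFromCC(wire));  // [72, 105, 255] - bytes recovered intact
```

Since the receiving side only needs to see valid CC messages, neither the OS nor a message-type filter can distinguish this from a控... from legitimate controller traffic, which is the point being made above.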
This is something we are actively working on in Chromium, so you can consider it a Chromium bug (see https://crbug.com/1420307). The spec is clear that the user should be prompted for all access (https://webaudio.github.io/web-midi-api/#dom-navigator-requestmidiaccess), although that section should be updated a bit (see #220).
It seems like your use case would be satisfied by an input-only port, which would also satisfy the soft-synth case. Only allowing virtual input ports feels like it would be safer to me, although at some point we should involve an actual security expert in this discussion. Of course, if we could safely implement both input and output that would be more versatile. |
I think this is the only WebMIDI-specific issue that has been raised in this discussion of this feature. When considering the threat model, this doesn't seem like an issue if both WebMIDI applications have gained access to WebMIDI through an explicit permission from the user. Right now WebMIDI is considered an API that only trusted applications may use, i.e. there are no untrusted use-cases for WebMIDI (even though this may have been the initial goal). If there were a mitigation for malicious use of virtual ports by applications that have already been granted trust by the user, that would then imply the existence of a third level of trust: applications which may use virtual ports unrestricted. This would make the trust model more complex, and I also think it would require justification for this extra threat level: why would we distrust an otherwise trusted application to specifically use virtual ports maliciously? In any case, my suggested way to mitigate that issue is to hide, by default, WebMIDI-created virtual ports from other WebMIDI applications. Perhaps later, there could be an extra level of permissions requested to allow those ports to be exposed to other WebMIDI applications if there was enough demand.
Just want to point out that only allowing virtual input ports isn't any safer than providing both ports when considering the inter-domain communication channel threat specifically. Was there another threat you were considering for this mitigation?
That's a relief. Thanks for the reference! |
@rianhunter said
Completely hiding virtual ports from other WebMIDI apps seems to me to be overkill, but I agree with you that there could well be a way to solve the problem by having an extra level of permissions. @mjwilson-google Thanks for the clarification, I think we're now on the same page. :-)
Could the problem be solved by requiring that ports created by web applications (=virtual ports) always ask for the user's permission, regardless of whether the request for use was coming from an OS/native application or another web application? That would elevate the permission status of virtual ports to that of the existing hardware ports. |
I'm not a security expert, but my understanding is that there is a type of side-channel attack that manipulates users to do things on trusted sites that expose information about themselves. So it's not necessarily about the application being malicious, it's the existence of the side-channel that is the security risk. I'm not sure if it's a real danger in this particular case given the permissions in place, but it's something that may come up in a security review.
That seems like it would eliminate this threat to me, too. We should probably think about how this could be implemented.
Right, because the "listener" site could open its port first and the "sender" site could connect to it directly. Good point.
Something like always showing a prompt every time a virtual port is created or connected to? It could make the user more aware, although we want to avoid "prompt fatigue" where users start clicking OK on everything without reading it because there are too many prompts. @cwilso also suggested a persistent indicator, although we didn't work through any details. If we can find a similar existing API (not necessarily MIDI-related) and examine what it does that might give us some ideas, too. Also, just to be clear about why we're having this discussion, even if something is in the specification the browser vendors could block their own implementations due to security concerns. I think we're doing the right thing for now thinking through the possible security issues and mitigations. Once we have rough consensus here I will ask the browser vendors for opinions from their security teams. In other words, even though I'm a spec editor I'm not the final authority on if a mitigation is good enough, and my goal is to specify something that will actually get implemented safely by all the implementers. |
Thanks Michael, that sounds great. I'm going to file a bug on Chromium regarding this specific issue (I'll CC you) just so there is some documented progress being tracked on that front and maybe to provoke more interested parties there. Working on my own, I may be able to have a prototype available for Windows and Linux in the next month or so. Maybe sooner if others help me out. |
It seems like I'm not able to classify the issue as "Feature" or add you to the CC list but you can find the created issue here: https://bugs.chromium.org/p/chromium/issues/detail?id=1515390 |
Hello, just wanted to share some thoughts as a regular Web MIDI and virtual MIDI port user: On platform capability... To my knowledge, the ability to define virtual MIDI devices is not generally available across platforms. @cwilso You mentioned the possibility of a newer Windows API that might enable this feature? I checked the docs and didn't see one, but could you please take a look? Maybe I'm missing it or am not understanding. In any case, if the platform does not support virtual MIDI devices, then this all seems moot. I agree with previous comments that it does not make sense to ship device drivers with the browser. The only working virtual MIDI driver I'm aware of on Windows is from Tobias Erichsen. https://www.tobias-erichsen.de/software/virtualmidi.html On security considerations around cross-domain/cross-application communication... The purpose of such a virtual port is to enable applications to communicate with each other over MIDI. Hampering this in any way defeats almost all of the usefulness of the functionality. Yes, we want web applications to be able to communicate with other web and non-web applications alike. Cross-domain MIDI communication should be possible if the users allow it to be. It is up to the users to decide what they want to do, and up to the user agent to carry out what they want. |
@bradisbell I was thinking of the WinRT MIDI API (https://learn.microsoft.com/en-us/windows/uwp/audio-video-camera/midi). I suspect this is not applicable, but I'm a long way out from when I used to write Windows apps. :) Tobias' work is pretty much what would need to be incorporated on Windows, I think. As for the security considerations: trust me, I understand the purpose of virtual ports, and I 100% understand how useful they would be in integration between web and native as well as web-to-web. I first started using MIDI (and programming it) in the late '80s/early '90s. At the same time, I've helped build the Web pretty much since its inception too - and we can't just make something possible because a set of users want it, when it might very negatively impact an incredibly larger set of users. We will need to create a design that is bulletproof for all users, and it is going to have to pass muster with the security and privacy horizontal reviews (both Chromium ones and W3C ones, I mean). I'm just setting expectations that I doubt very much "just put it behind a permission" is going to be good enough. |
Thank you, it's important to hear as many perspectives as possible.
I agree, and I think this the key point here is "if the users allow it". I think most users wouldn't expect allowing site A to use their MIDI devices and then allowing site B to use their MIDI devices would also allow sites A and B to send arbitrary data between each other. That is what we have to be careful about: that we don't allow more than the user intended to allow. |
@mjwilson-google If the user allows Site A to create a virtual MIDI device, and Site B to use MIDI devices, then the user will certainly expect that Site A and Site B could communicate arbitrarily. Again, that is a key, if not the, use case... allowing interconnection between applications, web or otherwise. If it's necessary, I see no problem with another tier of permissions. Currently, in practice we have:
This could be added to mitigate concerns:
Adding a virtual MIDI device implies adding a device with full MIDI capability. MIDI devices are, without known exception, available to any application on the host that supports MIDI. Therefore, the user should not expect a virtual MIDI device to be any different or otherwise limited. As long as we indicate to the user that they are allowing permission to the application for adding virtual MIDI devices, I don't see a problem. |
@cwilso I just checked that documentation page and unfortunately it doesn't look like it shows how to create virtual MIDI ports. If this feature requires a kernel driver, then that makes things a lot more complicated. I have done Windows driver development in the past and writing a virtual MIDI driver doesn't seem too hard (as in, it wouldn't take longer than a year end-to-end), but logistically it would be a lot more difficult in terms of deploying with browsers. I've been in this sort of situation before and the ideal thing to do would be to lobby Microsoft to add this functionality to Windows. The odds of that working out might seem far-fetched, but in comparison to getting Firefox, Chrome, and others to coordinate on the bundling, installing, and auto-upgrading of a kernel driver, it's probably more likely :) Plus, Microsoft has a stake in not incentivizing the creation of more third-party kernel drivers. I'm happy to attempt to reach out to Microsoft myself, but maybe this is something better suited for @mjwilson-google; I'm also happy to coordinate with Microsoft engineers on building such a bundled first-party driver. @bradisbell I think we are all in agreement in terms of keeping the permissions model as simple as possible while also making the API as powerful as possible. I think we have decent first-pass ideas on the table to address this issue, but it would be best to get more feedback from someone who spends their working days thinking about web security. |
@bradisbell Yes, if we add another explicit permission it should be clear. As @cwilso pointed out, the security reviews will have the final say and experience has shown that only adding a permission isn't always enough. That said, we can definitely propose it and see what feedback we get. I will try to summarize the recent discussions:
I think you brought up a good point that doing this is basically installing a device on the user's system. Some of my coworkers work on other device-related web APIs, and I talked with one of them who doesn't think that there is any other web API that installs a virtual device currently, and that it could break cross-origin protections. This might be the difficult point to understand; I don't fully understand all the details either but much of the web is built on assumptions about how different origins can communicate with each other. Anything that breaks these assumptions will receive a lot of scrutiny. Again, this isn't to say that this is impossible. But it looks like we might be doing something new, and if we can't satisfy the security and privacy reviews we won't be able to move forward. If the above list looks complete then I can bring it up at the next Audio Working Group meeting and ask for opinions and for the browser vendors to do a preliminary security review. Also, reminder that anyone is welcome to join the W3C audio community group and attend working group meetings (https://www.w3.org/community/audio-comgp/).
Thank you for your confidence in me, but realistically I don't think I can drive this effectively right now. I will bring it up at the next Audio Working Group meeting though. |
Just FYI, I can think of a few ways to implement this. OSes usually provide an ID with a MIDI device, even on Windows. If not, there are application-level workarounds, e.g. adding a forced prefix to devices created through Web MIDI.
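The forced-prefix workaround mentioned above could look roughly like this. This is only a sketch: the prefix string, the origin-in-name convention, and the function names are all made up for illustration; a real user agent would pick its own scheme.

```javascript
// Sketch of the forced-prefix idea: a user agent could namespace every
// virtual port created through Web MIDI so native apps (and users) can
// always distinguish it from hardware. The prefix here is hypothetical.
const WEB_MIDI_PREFIX = "Web MIDI: ";

// Build the name the OS would see, embedding the creating page's origin
// so the user can tell which site owns the port.
function systemPortName(origin, requestedName) {
  return `${WEB_MIDI_PREFIX}${origin} - ${requestedName}`;
}

// Anything carrying the prefix is identifiable as a web-created port,
// which would let a browser hide such ports from other web apps by default.
function isWebMidiVirtualPort(name) {
  return name.startsWith(WEB_MIDI_PREFIX);
}

const name = systemPortName("synth.example", "My Soft Synth");
console.log(name);                       // "Web MIDI: synth.example - My Soft Synth"
console.log(isWebMidiVirtualPort(name)); // true
```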
No problem. I'll see what I can do. I'll try to join the next audio working group meeting as well. |
Thanks for reaching out to me on Discord. Windows MIDI Services, which will ship in-box in the latest supported Windows 10 and 11 releases in 2024, includes a number of the features you need. Because we now have a Windows service in the middle, new transports are written as user-mode service plugins (COM components), not kernel drivers. We do want to avoid writing KS drivers for anything which doesn't need it. App-to-app MIDI is part of that, just like the built-in diagnostics loopback endpoints are. Network MIDI 2.0 (coming), app-to-app MIDI, our diagnostics loopbacks, and Bluetooth MIDI (coming) are all written as user-mode components in the service. The project is OSS, but is mirrored internally for inclusion in Windows builds. I'll have another release out within approximately one week (there are a few bugs in the message scheduling logic in Release 2 which block some folks) that you may want to look at. Note that only apps using the new API will be able to create the virtual endpoints. That means that they need to be UMP-aware apps. Once created, they will likely be made available to the legacy APIs (winmm, older WinRT MIDI). We need to verify there are no issues there. We do recommend anyone writing new MIDI code this year use the new API completely, as it can do everything the old API can do, including talking with MIDI 1.0 devices, plus more. It also has a much faster USB implementation, auto-translation of MIDI 1.0 bytestream messages to/from MIDI 1.0 devices, etc. The API itself uses only the new Universal MIDI Packet for messaging, however. https://aka.ms/midirepo PS: @cwilso nice to see you around :) |
BTW, while considering any new features for Web MIDI, you may want to consider MIDI 2.0 as well. Our new API in Windows is MIDI 2.0-centered, and Apple, Linux (ALSA), and Android also have MIDI 2.0 support now. It's taking us longer on Windows because we've completely rewritten MIDI from the ground up to support all this. In the MIDI Association (I am the chair of the executive board), there have been some folks interested in Web MIDI 2.0, but no takers yet for working with the W3C to formalize it. |
Yes, we have an issue for that here: #211. I am trying to get the current specification to Recommendation status first. I am in support of this: I think if we can get MIDI 2.0 on the web that will help drive adoption, and it's good to know that the platform support is there. |
Quick update: the next WG meeting is tentatively scheduled for January 31, 2024. I already put this on the schedule. |
Another update: WG/CG meeting has been actually scheduled for January 25 at 09:00 Pacific time: |
Conclusion from WG meeting today:
|
In speaking with Korg at NAMM, they really wanted to have the ability to define virtual MIDI input/output ports - e.g., to build a software synthesizer and add it to the available MIDI devices when other apps/pages query the system.
Yamaha requested the same feature, even to the point of potentially creating a reference software synth.
We had talked about this feature early on, but cut it from v1; truly adding the device into the system's list of available MIDI devices (e.g. so other native Windows/OSX apps could access it) would likely be quite hard, involving writing virtual device drivers, etc., which is why we decided not to include it in v1. We might consider a more limited feature of virtual devices that are only exposed to web applications; this might still be a substantial amount of work, but I wanted to capture the feedback.