# Testing Firefox in "headless" mode #603
Using Xvfb is more like a real user experience, certainly. Xvfb uses the normal Firefox code but renders to a framebuffer that is never drawn to the screen. Headless mode more or less stubs out the rendering code. Headless mode may be faster since it does less. It could be more reliable if the X server does something asynchronously that is synchronous in the headless implementation. At some point headless mode on Linux passed all web-platform tests that Linux+xvfb did on our CI (excluding disabled tests), without needing special metadata just for headless. But I don't think that configuration is actually run, so it's hard to know whether that level of compatibility is being maintained. Overall I would hesitate to move to using headless by default for a case where we are so interested in the precise behaviour of the browser |
Thanks, James! |
Like @jgraham, I'd hesitate to move away from the codepaths that actually get run by users. Certainly headless is a useful performance optimisation for many things, but it isn't testing what users actually see, and I think that (more than performance) is what we should be focusing on. (Performance, after all, can be solved in other ways, like more parallelism.) |
@jugglinmike have you looked at running Chrome headless? I think there is an argument for exercising the headless codepaths, namely that this is what web developers setting up cross-browser testing are likely to do if they know how. It's certainly a risk that there are differences between headless and headful, but perhaps we could do a weekly or monthly run of headful to compare against, rather than always spending more cycles? Aside: It would be great if |
A datapoint is that we have not had many deviations in results between Xvfb and headless mode in the Firefox CI, as https://searchfox.org/mozilla-central/search?q=headless&case=false&regexp=false&path=testing%2Fmarionette%2Fharness%2F**%2F*.py shows. To the contrary, running Firefox in headless mode provides more stable results because Marionette does not have to interact with unpredictable window managers and the inherent asynchronous nature of the X11 protocol. Headless will also be faster because anything related to painting and window manipulation is stubbed out. Headless mode is controlled through the There is arguably a stronger argument for using headless in the WPT CI than there is for wpt.fyi result collection, where you might value speed over correctness. I would be strongly opposed to using headless in the Firefox CI, where emulating the user’s environment by interacting with the WM, loading libgtk, &c. is paramount to ensure Firefox is not only theoretically usable. I think the open question is what difference in actual test results we can expect from running in headless mode? If there are no differences, isn’t the Xvfb approach only theoretically better in this context? |
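One way to approach the open question above is to diff the per-test statuses of two runs. The following is a minimal Python sketch, not part of wptrunner or wpt.fyi: the function name `diff_results`, the dictionary format (test name mapped to a status string), and the sample data are all assumptions made for illustration.

```python
def diff_results(headful, headless):
    """Return {test: (headful_status, headless_status)} for tests that differ.

    Hypothetical helper. Both arguments map a test name to a status string.
    A test missing from one run is reported with None for that run.
    """
    differences = {}
    for test in sorted(set(headful) | set(headless)):
        a = headful.get(test)
        b = headless.get(test)
        if a != b:
            differences[test] = (a, b)
    return differences


if __name__ == "__main__":
    # Made-up example data, not real WPT results.
    xvfb_run = {"/a.html": "PASS", "/b.html": "PASS", "/c.html": "FAIL"}
    headless_run = {"/a.html": "PASS", "/b.html": "TIMEOUT"}
    for test, (a, b) in diff_results(xvfb_run, headless_run).items():
        print(f"{test}: xvfb={a} headless={b}")
```

An empty result would support the claim that the Xvfb approach is only theoretically better; any entries would be the failure categories to investigate.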
I think that we should make For local runs I can see the advantage of implying |
I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1434382 about a wptrunner |
@andreastt the searchfox.org link you shared doesn't seem to list test statuses, should I be able to see a list of tests that do differ between headless and Xvfb there? |
I've filed web-platform-tests/wpt#13005 for the CLI feature request. |
@foolip For the functional Marionette tests, all tests are expected to pass unless marked to skip, so we only have one test for the remote protocol that we can’t get working. Looking at WPT results I’m sure is a more relevant datapoint, but I wanted to point out that we don’t have many known-headless-issues in our WebDriver implementation. |
@foolip I haven't attempted to use Chrome's headless feature. There hasn't been a need to change the approach that was initially implemented, and I've had my doubts about authenticity. I'm comfortable if this makes our results less representative of a web developer's testing experience if it means the results are more representative of a user's browsing experience. To the extent that these things don't align, we're sure to surprise some people no matter which "side" we choose. Targeting the user is what the web developer is doing, anyway. If their process is flawed, then that's clearly a problem, but it may be something we can help fix. I just think we should consider the "headless" mode the special case. Hoping to get some clarity about the actual effect on test results, I enabled Chrome's headless mode on Bocoup's fork and triggered a TaskCluster build. The experiment was fairly catastrophic: over 2,000 testharness.js tests timed out, and 2 of the 26 TaskCluster tasks never completed. It's safe to say that I'm missing something. If anyone reading has any tips on how the feature should be enabled for WPT, then I'd be happy to try again! |
Oh, wow, that's not quite ready to land then :) I'm afraid I don't have any tips. If we wanted to pursue this, we'd have to dig into the failure categories one by one. |
Just out of curiosity, I would be interested to know how Firefox fares in that experiment. Setting |
I'll give it a shot, @andreastt . Is setting |
> I'll give it a shot, @andreastt . Is setting MOZ_HEADLESS=1 different than using the -headless flag?
They are equivalent.
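For reference, the two switches can be sketched as launch configurations. This is a minimal Python sketch rather than wptrunner's code: the helper name `headless_launch` and the `firefox` binary name are assumptions; only `MOZ_HEADLESS=1` and `-headless` come from this thread.

```python
import os


def headless_launch(binary="firefox", use_env=True, base_env=None):
    """Return (argv, env) for starting Firefox in headless mode.

    Hypothetical helper: `binary` is an assumed name or path. The thread
    states that MOZ_HEADLESS=1 and the -headless flag are equivalent.
    """
    env = dict(os.environ if base_env is None else base_env)
    argv = [binary]
    if use_env:
        # Variant 1: request headless mode through the environment.
        env["MOZ_HEADLESS"] = "1"
    else:
        # Variant 2: request headless mode through a command-line flag.
        argv.append("-headless")
    return argv, env


if __name__ == "__main__":
    for use_env in (True, False):
        argv, env = headless_launch(use_env=use_env, base_env={})
        print(argv, env)
```

The returned pair could be handed to something like `subprocess.Popen(argv, env=env)`; the environment variant has the advantage of not touching the argument list at all.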
|
@andreastt I wanted to verify this locally before experimenting with all of WPT
After a few minutes, it exits with:
Are you set up to try this on your end? If not, can I get you more debugging |
@jugglinmike do you have a link to the TaskCluster / Buildbot log? |
Nope, that was a result of running WPT locally. I can generate logs for you, though. Full command:
Output: https://gist.github.com/jugglinmike/151e08f22d7b4427e0bd670294d37d49 |
Can you include |
Sure, though the command
Produced a log file that has no more information than the previous file: https://gist.github.com/jugglinmike/7cc77c951cb50bedeb374d60fb6f936a Should I be using |
But in such a case you should tweak the test runner and set the preference |
Thanks, @whimboo! I think we're getting somewhere. Full command:
Output: https://gist.github.com/jugglinmike/97f57b6aa7c3e5d3d92ffd5bc8cd612a In particular:
|
Hm, please note that the initialization of Marionette got aborted because the component is not enabled! I also don't see that the |
Depending on what version of Firefox you’re using, you need to set the preference in the correct case. Since recently, Nightly only supports The log does say that it’s connecting to Marionette, and Marionette does appear to be enabled. The If you could try |
@andreastt, nope. The case is irrelevant to Marionette so far. I exactly requested to revert that for your patch on https://bugzilla.mozilla.org/show_bug.cgi?id=1482829 before it landed. So while the internal Marionette component is enabled, the internal |
I think @whimboo may be on to something, since making that change had no discernable effect on the log.
The following commands also hang:
However, explicitly specifying
Which is strange. For kicks, I tried specifying a bogus argument instead of
That also hangs, giving a pretty good indication of the problem here. The WPT CLI currently replaces the In the meantime, I've triggered a build on TaskCluster by explicitly specifying both arguments: https://tools.taskcluster.net/groups/F8b0wLxIRY2VkFN0Ykn9Dw |
Yes, regarding the log pref casing I was misremembering events. The casing shouldn’t matter. Also the debug log shows
Thanks! But shouldn’t it be |
Yup, that wasn't to say |
As of [1] (merged on 2018-09-24), the WPT CLI enables the "headless" mode of Chrome and Firefox by default. This mode is more convenient for contributors running the tests from their development system, but it also relies on functionality which is not enabled during typical usage of the browsers. Results collected in this mode are therefore less authentic than results collected using a virtual display. @jgraham commented on this in [2]:

> unless we have some reasonable assurance that headless mode matches
> non-headless in all cases (e.g. the relevant browser running both
> configurations in CI with identical results) I don't think we should
> enable it for wpt.fyi-ingested runs.

That issue includes evidence that the results are not equivalent, meaning "headless" mode is not appropriate for publicly documenting the capabilities of each browser. Extend the results collection system to use the new `--no-headless` command-line argument.

[1] web-platform-tests/wpt#13076
[2] web-platform-tests#603
Please note that we don't run headless in our own CI yet. So we aren't aware of differences ourselves. But lately we were at least talking about enabling headless for wdspec as the first step. So this may happen soon. |
In its default configuration, Firefox requires a display in order to run. This project currently executes Firefox using `xvfb`, a utility that creates virtual displays in the X windowing system. Recently, we have been struggling with a regression that appears to be related to the communication between Firefox and the virtual display (see gh-592).
As of version 55, Firefox implements a "headless" mode which allows the browser to be run in the absence of a display. In gh-592, @whimboo recommended enabling that feature instead of configuring Firefox to use a virtual display. In addition to side-stepping the regression, they mentioned that it would be faster overall.
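The two approaches described above can be sketched side by side. This is an illustrative Python sketch, not this project's actual configuration: the helper name `display_strategy` is hypothetical, and it assumes the virtual display is created with the `xvfb-run` wrapper script and that headless mode is requested with the `MOZ_HEADLESS` environment variable.

```python
def display_strategy(command, strategy):
    """Return (argv, extra_env) for running `command` without a real display.

    Hypothetical helper.
    "xvfb": wrap the command with xvfb-run (assumed to be installed) so the
    browser renders to a virtual X display.
    "headless": leave the command alone and set MOZ_HEADLESS=1.
    """
    if strategy == "xvfb":
        # --auto-servernum asks xvfb-run to pick a free display number.
        return ["xvfb-run", "--auto-servernum"] + list(command), {}
    if strategy == "headless":
        return list(command), {"MOZ_HEADLESS": "1"}
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    for strategy in ("xvfb", "headless"):
        print(strategy, display_strategy(["firefox"], strategy))
```

Keeping the choice behind a single switch like this would make it straightforward to run both configurations and compare their results.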
There are implementation details which the feature's documentation does not describe but which may adversely influence conformance test results:
`vh` and `vh`)
More generally, I wonder about authenticity. Which is more true-to-life: Firefox with a virtual frame buffer or Firefox in headless mode?
@whimboo are you familiar with the feature? Can you speak to any of these concerns? Or do you know who I could contact to learn more?