Wayland really breaks things… Just for now?

This post is in part a response to an aspect of Nate’s post “Does Wayland really break everything?“, but also my reflection on discussing Wayland protocol additions, a unique pleasure that I have been involved with for the past months¹.

Some facts

Before I start I want to make a few things clear: The Linux desktop will be moving to Wayland² – this is a fact at this point (and has been for a while), sticking to X11 makes no sense for future projects. From reading Wayland protocols and working with it at a much lower level than I ever wanted to, it is also very clear to me that Wayland is an exceptionally well-designed core protocol, and so are the additional extension protocols (xdg-shell & Co.). The modularity of Wayland is great, it gives it incredible flexibility and will for sure turn out to be good for the long-term viability of this project (and also provides a path to correct protocol issues in future, if one is found). In other words: Wayland is an amazing foundation to build on, and a lot of its design decisions make a lot of sense!

The shift towards people seeing “Linux” more as an application developer platform, and taking PipeWire and XDG Portals into account when designing for Wayland is also an amazing development and I love to see this – this holistic approach is something I always wanted!

Furthermore, I think Wayland removes a lot of functionality that shouldn’t exist in a modern compositor – and that’s a good thing too! Some of X11’s features and design decisions had clear drawbacks that we shouldn’t replicate. I highly recommend to read Nate’s blog post, it’s very good and goes into more detail. And due to all of this, I firmly believe that any advancement in the Wayland space must come from within the project.

But!

But! Of course there was a “but” coming – I think while developing Wayland-as-an-ecosystem we are now entrenched into narrow concepts of how a desktop should work. While discussing Wayland protocol additions, a lot of concepts clash, people from different desktops with different design philosophies debate the merits of those over and over again never reaching any conclusion (just as you will never get an answer out of humans whether sushi or pizza is the clearly superior food, or whether CSD or SSD is better). Some people want to use Wayland as a vehicle to force applications to submit to their desktop’s design philosophies, others prefer the smallest and leanest protocol possible, other developers want the most elegant behavior possible. To be clear, I think those are all very valid approaches.

But this also creates problems: By switching to Wayland compositors, we are already forcing a lot of porting work onto toolkit developers and application developers. This is annoying, but just work that has to be done. It becomes frustrating though if Wayland provides toolkits with absolutely no way to reach their goal in any reasonable way. For Nate’s Photoshop analogy: Of course Linux does not break Photoshop, it is Adobe’s responsibility to port it. But what if Linux was missing a crucial syscall that Photoshop needed for proper functionality and Adobe couldn’t port it without that? In that case it becomes much less clear on who is to blame for Photoshop not being available.

A lot of Wayland protocol work is focused on the environment and design, while applications and work to port them often is considered less. I think this happens because the overlap between application developers and developers of the desktop environments is not necessarily large, and the overlap with people willing to engage with Wayland upstream is even smaller. The combination of Windows developers porting apps to Linux and having involvement with toolkits or Wayland is pretty much nonexistent. So they have less of a voice.

A quick detour through the neuroscience research lab

I have been involved with Freedesktop, GNOME and KDE for an incredibly long time now (more than a decade), but my actual job (besides consulting for Purism) is that of a PhD candidate in a neuroscience research lab (working on the morphology of biological neurons and its relation to behavior). I am mostly involved with three research groups in our institute, which is about 35 people. Most of us do all our data analysis on powerful servers which we connect to using RDP (with KDE Plasma as desktop). Since I joined, I have been pushing the envelope a bit to extend Linux usage to data acquisition and regular clients, and to have our data acquisition hardware interface well with it. Linux brings some unique advantages for use in research, besides the obvious one of having every step of your data management platform introspectable with no black boxes left, a goal I value very highly in research (but this would be its own blogpost).

In terms of operating system usage though, most systems are still Windows-based. Windows is what companies develop for, and what people use by default and are familiar with. The choice of operating system is very strongly driven by application availability, and WSL being really good makes this somewhat worse, as it removes the need for people to switch to a real Linux system entirely if there is the occasional software requiring it. Yet, we have a lot more Linux users than before, and use it in many places where it makes sense. I also developed a novel data acquisition software that even runs on Linux-only and uses the abilities of the platform to its fullest extent. All of this resulted in me asking existing software and hardware vendors for Linux support a lot more often. Vendor-customer relationship in science is usually pretty good, and vendors do usually want to help out. Same for open source projects, especially if you offer to do Linux porting work for them… But overall, the ease of use and availability of required applications and their usability rules supreme. Most people are not technically knowledgeable and just want to get their research done in the best way possible, getting the best results with the least amount of friction.

KDE/Linux usage at a control station for a particle accelerator at Adlershof Technology Park, Germany, for reference (by 25years of KDE)³

Back to the point

The point of that story is this: GNOME, KDE, RHEL, Debian or Ubuntu: They all do not matter if the necessary applications are not available for them. And as soon as they are, the easiest-to-use solution wins. There are many facets of “easiest”: In many cases this is RHEL due to Red Hat support contracts being available, in many other cases it is Ubuntu due to its mindshare and ease of use. KDE Plasma is also frequently seen, as it is perceived a bit easier to onboard Windows users with it (among other benefits). Ultimately, it comes down to applications and 3rd-party support though.

Here’s a dirty secret: In many cases, porting an application to Linux is not that difficult. The thing that companies (and FLOSS projects too!) struggle with and will calculate the merits of carefully in advance is whether it is worth the support cost as well as continuous QA/testing. Their staff will have to do all of that work, and they could spend that time on other tasks after all.

So if they learn that “porting to Linux” not only means added testing and support, but also means to choose between the legacy X11 display server that allows for 1:1 porting from Windows or the “new” Wayland compositors that do not support the same features they need, they will quickly consider it not worth the effort at all. I have seen this happen.

Of course many apps use a cross-platform toolkit like Qt, which greatly simplifies porting. But this just moves the issue one layer down, as now the toolkit needs to abstract Windows, macOS and Wayland. And Wayland does not contain features to do certain things or does them very differently from e.g. Windows, so toolkits have no way to actually implement the existing functionality in a way that works on all platforms. So in Qt’s documentation you will often find texts like “works everywhere except for on Wayland compositors or mobile”⁴.

Many missing bits or altered behavior are just papercuts, but those add up. And if users will have a worse experience, this will translate to more support work, or people not wanting to use the software on the respective platform.

What’s missing?

Window positioning

SDI applications with multiple windows are very popular in the scientific world. For data acquisition (for example with microscopes) we often have one monitor with control elements and one larger one with the recorded image. There is also other configurations where multiple signal modalities are acquired, and the experimenter aligns windows exactly in the way they want and expects the layout to be stored and to be loaded upon reopening the application. Even in the image from Adlershof Technology Park above you can see this style of UI design, at mega-scale. Being able to pop-out elements as windows from a single-window application to move them around freely is another frequently used paradigm, and immensely useful with these complex apps.

It is important to note that this is not a legacy design, but in many cases an intentional choice – these kinds of apps work incredibly well on larger screens or many screens and are very flexible (you can have any window configuration you want, and switch between them using the (usually) great window management abilities of your desktop).

Of course, these apps will work terribly on tablets and small form factors, but that is not the purpose they were designed for and nobody would use them that way.

I assumed for sure these features would be implemented at some point, but when it became clear that that would not happen, I created the ext-placement protocol which had some good discussion but was ultimately rejected from the xdg namespace. I then tried another solution based on feedback, which turned out not to work for most apps, and now proposed xdg-placement (v2) in an attempt to maybe still get some protocol done that we can agree on, exploring more options before pushing the existing protocol for inclusion into the ext Wayland protocol namespace. Meanwhile though, we can not port any application that needs this feature, while at the same time we are switching desktops and distributions to Wayland by default.

Window position restoration

Similarly, a protocol to save & restore window positions was already proposed in 2018, 6 years ago now, but it has still not been agreed upon, and may not even help multiwindow apps in its current form. The absence of this protocol means that applications can not restore their former window positions, and the user has to move them to their previous place again and again.

Meanwhile, toolkits can not adopt these protocols and applications can not use them and can not be ported to Wayland without introducing papercuts.

Window icons

Similarly, individual windows can not set their own icons, and not-installed applications can not have an icon at all because there is no desktop-entry file to load the icon from and no icon in the theme for them. You would think this is a niche issue, but for applications that create many windows, providing icons for them so the user can find them is fairly important. Of course it’s not the end of the world if every window has the same icon, but it’s one of those papercuts that make the software slightly less user-friendly. Even applications with fewer windows like LibrePCB are affected, so much so that they rather run their app through Xwayland for now.

I decided to address this after I was working on data analysis of image data in a Python virtualenv, where my code and the Python libraries used created lots of windows all with the default yellow “W” icon, making it impossible to distinguish them at a glance. This is xdg-toplevel-icon now, but of course it is an uphill battle where the very premise of needing this is questioned. So applications can not use it yet.

Limited window abilities requiring specialized protocols

Firefox has a picture-in-picture feature, allowing it to pop out media from a mediaplayer as separate floating window so the user can watch the media while doing other things. On X11 this is easily realized, but on Wayland the restrictions posed on windows necessitate a different solution. The xdg-pip protocol was proposed for this specialized usecase, but it is also not merged yet. So this feature does not work as well on Wayland.

Automated GUI testing / accessibility / automation

Automation of GUI tasks is a powerful feature, so is the ability to auto-test GUIs. This is being worked on, with libei and wlheadless-run (and stuff like ydotool exists too), but we’re not fully there yet.

Wayland is frustrating for (some) application authors

As you see, there is valid applications and valid usecases that can not be ported yet to Wayland with the same feature range they enjoyed on X11, Windows or macOS. So, from an application author’s perspective, Wayland does break things quite significantly, because things that worked before can no longer work and Wayland (the whole stack) does not provide any avenue to achieve the same result.

Wayland does “break” screen sharing, global hotkeys, gaming latency (via “no tearing”) etc, however for all of these there are solutions available that application authors can port to. And most developers will gladly do that work, especially since the newer APIs are usually a lot better and more robust. But if you give application authors no path forward except “use Xwayland and be on emulation as second-class citizen forever”, it just results in very frustrated application developers.

For some application developers, switching to a Wayland compositor is like buying a canvas from the Linux shop that forces your brush to only draw triangles. But maybe for your avant-garde art, you need to draw a circle. You can approximate one with triangles, but it will never be as good as the artwork of your friends who got their canvases from the Windows or macOS art supply shop and have more freedom to create their art.

Triangles are proven to be the best shape! If you are drawing circles you are creating bad art!

Wayland, via its protocol limitations, forces a certain way to build application UX – often for the better, but also sometimes to the detriment of users and applications. The protocols are often fairly opinionated, a result of the lessons learned from X11. In any case though, it is the odd one out – Windows and macOS do not pose the same limitations (for better or worse!), and the effort to port to Wayland is orders of magnitude bigger, or sometimes in case of the multiwindow UI paradigm impossible to achieve to the same level of polish. Desktop environments of course have a design philosophy that they want to push, and want applications to integrate as much as possible (same as macOS and Windows!). However, there are many applications out there, and pushing a design via protocol limitations will likely just result in fewer apps.

The porting dilemma

I spent probably way too much time looking into how to get applications cross-platform and running on Linux, often talking to vendors (FLOSS and proprietary) as well. Wayland limitations aren’t the biggest issue by far, but they do start to come come up now, especially in the scientific space with Ubuntu having switched to Wayland by default. For application authors there is often no way to address these issues. Many scientists do not even understand why their Python script that creates some GUIs suddenly behaves weirdly because Qt is now using the Wayland backend on Ubuntu instead of X11. They do not know the difference and also do not want to deal with these details – even though they may be programmers as well, the real goal is not to fiddle with the display server, but to get to a scientific result somehow.

Another issue is portability layers like Wine which need to run Windows applications as-is on Wayland. Apparently Wine’s Wayland driver has some heuristics to make window positioning work (and I am amazed by the work done on this!), but that can only go so far.

A way out?

So, how would we actually solve this? Fundamentally, this excessively long blog post boils down to just one essential question:

Do we want to force applications to submit to a UX paradigm unconditionally, potentially loosing out on application ports or keeping apps on X11 eternally, or do we want to throw them some rope to get as many applications ported over to Wayland, even through we might sacrifice some protocol purity?

I think we really have to answer that to make the discussions on wayland-protocols a lot less grueling. This question can be answered at the wayland-protocols level, but even more so it must be answered by the individual desktops and compositors.

If the answer for your environment turns out to be “Yes, we want the Wayland protocol to be more opinionated and will not make any compromises for application portability”, then your desktop/compositor should just immediately NACK protocols that add something like this and you simply shouldn’t engage in the discussion, as you reject the very premise of the new protocol: That it has any merit to exist and is needed in the first place. In this case contributors to Wayland and application authors also know where you stand, and a lot of debate is skipped. Of course, if application authors want to support your environment, you are basically asking them now to rewrite their UI, which they may or may not do. But at least they know what to expect and how to target your environment.

If the answer turns out to be “We do want some portability”, the next question obviously becomes where the line should be drawn and which changes are acceptable and which aren’t. We can’t blindly copy all X11 behavior, some porting work to Wayland is simply inevitable. Some written rules for that might be nice, but probably more importantly, if you agree fundamentally that there is an issue to be fixed, please engage in the discussions for the respective MRs! We for sure do not want to repeat X11 mistakes, and I am certain that we can implement protocols which provide the required functionality in a way that is a nice compromise in allowing applications a path forward into the Wayland future, while also being as good as possible and improving upon X11. For example, the toplevel-icon proposal is already a lot better than anything X11 ever had. Relaxing ACK requirements for the ext namespace is also a good proposed administrative change, as it allows some compositors to add features they want to support to the shared repository easier, while also not mandating them for others. In my opinion, it would allow for a lot less friction between the two different ideas of how Wayland protocol development should work. Some compositors could move forward and support more protocol extensions, while more restrictive compositors could support less things. Applications can detect supported protocols at launch and change their behavior accordingly (ideally even abstracted by toolkits).

You may now say that a lot of apps are ported, so surely this issue can not be that bad. And yes, what Wayland provides today may be enough for 80-90% of all apps. But what I hope the detour into the research lab has done is convince you that this smaller percentage of apps matters. A lot. And that it may be worthwhile to support them.

To end on a positive note: When it came to porting concrete apps over to Wayland, the only real showstoppers so far⁵ were the missing window-positioning and window-position-restore features. I encountered them when porting my own software, and I got the issue as feedback from colleagues and fellow engineers. In second place was UI testing and automation support, the window-icon issue was mentioned twice, but being a cosmetic issue it likely simply hurts people less and they can ignore it easier.

What this means is that the majority of apps are already fine, and many others are very, very close! A Wayland future for everyone is within our grasp!

I will also bring my two protocol MRs to their conclusion for sure, because as application developers we need clarity on what the platform (either all desktops or even just a few) supports and will or will not support in future. And the only way to get something good done is by contribution and friendly discussion.

Footnotes

Apologies for the clickbait-y title – it comes with the subject ︎
When I talk about “Wayland” I mean the combined set of display server protocols and accepted protocol extensions, unless otherwise clarified. ︎
I would have picked a picture from our lab, but that would have needed permission first ︎
Qt has awesome “platform issues” pages, like for macOS and Linux/X11 which help with porting efforts, but Qt doesn’t even list Linux/Wayland as supported platform. There is some information though, like window geometry peculiarities, which aren’t particularly helpful when porting (but still essential to know). ︎
Besides issues with Nvidia hardware – CUDA for simulations and machine-learning is pretty much everywhere, so Nvidia cards are common, which causes trouble on Wayland still. It is improving though. ︎