Self-hosting a Matrix server for 5 years

yaky.dev

259 points by the-anarchist 2 days ago


lousken - 2 days ago

Same setup here since 2017. Since then, RAM usage has decreased by 60%. An admin panel is not something I need, but it would be a nice-to-have. I started with Postgres, as I wouldn't go for anything else if I wanna use it for decades. It takes 2.5 GB for 10 users and I don't mind if it grows to 10 or 20; that's something I expected. I never did a cleanup of anything. I just dumped the DB and recently moved to a new OVH VPS with an NVMe SSD, and it flies.

The fact that I cannot delete attachments that users delete is certainly my biggest irritation: 50 GB of stuff that I am not sure whether I can delete. Considering the size, though, I am just gonna bite the bullet; a couple of terabytes should not be a problem in 25 years. But this is def something I would love to see addressed sooner rather than later. It must be a pain even for the matrix.org server team.

After moving to a better server I do not have issues with slow notifications unless the phone has been sleeping for a longer period of time, which is an Android optimization (I'd assume). It is more reliable than Teams at this point. One of my friends had issues, but removing 15 old devices fixed them.

As for Element X, I did call out the "another rewrite" issue, especially with Android, and I do think it makes things worse. I still do not know how I am supposed to fix calling and video between old and new clients. For now I don't bother with new clients and everyone is using the old ones, but it is starting to become an issue, as the classic clients are in maintenance-only mode.

Almondsetat - 2 days ago

>this also creates a situation where anything said across federation cannot be unsaid, which is an ironic situation for a protocol/system that often comes up when talking about privacy.

How is it ironic? No protocol in the world can force anyone to delete anything from their own device. Chat apps that implement this function are either proprietary (so you cannot control what they can do) or, if OSS, do it on a pinky-promise-basis.

The_President - 2 days ago

Ran a homeserver for 5 years on a minimal VPS and it worked fine. Upsides - works everywhere, self hosted, feature complete. Client software in the ecosystem mostly felt bloated, with the exception of NeoChat. By 2022 the clients could no longer call each other. Decommissioned it this year in favor of traditional XMPP which works fine and it's nice that notifications are appropriately processed, finally.

Our team highly appreciates the work done in Matrix; it's just unfortunate that the elephant in the room was never addressed at the start of the project, which is the need for a -simple- first-party administrative dashboard or tool to manage users, storage, and configuration. Without that core component, you've got a layer of complexity between an admin and an audit, which will increase the likelihood of misconfiguration or resource management issues.

styanax - 2 days ago

As a former user I felt these pain points trying to do nothing more than have a very active one-on-one chat with a good friend. Tens of messages an hour, maybe 2 years running. Using matrix.org and the pre-X clients. It's fine for group chat (IRC style) but that's not a high bar.

(a) the encryption between using a mobile and the webapp desyncs/breaks all the time, it just sucks. I mean you'll get "cannot decrypt" a lot, have to bounce back and forth and generally try and force it to re-sync properly again. Sometimes never worked at all. Lots of issues on GH over the years.

(b) as mentioned in this article, insane delays on new message notif and sending and receiving. Just logging in on the webapp every morning took minutes of some sort of mysterious sync process, often the mobile app had the same problems. The X stuff may fix this, we were pre-X.

(c) cleanup. There's no message retention set on matrix.org, when I wanted to extract and remove our past chats the process and experience was excruciatingly bad. It took tens of hours over several weekends of the webapp (mobile completely non-op in practice for this) polling and loading old content, just so I could select 100 at a time to delete and then it took an hour. Once I started culling back over a year or so, the loading got longer and longer and longer, until eventually it 100% stopped working at all to load old messages.

Signal and DeltaChat are far, far better experiences for one-on-one chats with friends & family. The Delta client is a bit UI/UX behind but not horrible; e.g. you can't correct a typo in a sent message in Delta, unlike Signal - because each msg is a unique gpg-encrypted "email" rather than a database object that can be re-manipulated.

jamesbelchamber - a day ago

> The only thing that I don't really understand is the decision on data replication. If a user on server A joins a room on server B, recent room data is copied from server B to server A and then kept in sync on both servers.

The idea here is that rooms are abstracted from servers and sort-of exist ephemerally. This has the advantage/disadvantage of making it hard for the underlying infrastructure to exert control over the hosted communities, and seems to have become a distinguishing feature of federation.
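A toy sketch may make the replication model described above more concrete. This is not the Matrix federation protocol (there is no event graph, signing, or state resolution here); `Homeserver`, `create_room`, `join`, and `send` are hypothetical names invented purely for illustration.

```python
# Toy model only: real Matrix federation uses a signed event graph and
# state resolution. All class and method names here are hypothetical.

class Homeserver:
    def __init__(self, name):
        self.name = name
        self.rooms = {}  # room_id -> this server's own copy of the event list

    def create_room(self, room_id, network):
        self.rooms[room_id] = []
        network.setdefault(room_id, set()).add(self)

    def join(self, room_id, via, network, recent=50):
        # Copy recent history from a server that is already in the room...
        self.rooms[room_id] = list(via.rooms[room_id][-recent:])
        # ...then take part in the room as an equal participant.
        network[room_id].add(self)

    def send(self, room_id, text, network):
        event = f"{self.name}: {text}"
        # Every participating server appends the event to its own copy.
        for server in network[room_id]:
            server.rooms[room_id].append(event)


network = {}  # room_id -> set of servers currently participating
a, b = Homeserver("A"), Homeserver("B")
b.create_room("!room", network)
b.send("!room", "hello", network)
a.join("!room", via=b, network=network)
a.send("!room", "hi from A", network)

# The originating server drops out; A still holds the history and carries on.
network["!room"].discard(b)
a.send("!room", "still here", network)
print(a.rooms["!room"])
```

In this sketch the room keeps working after server B disappears, which is the "rooms are abstracted from servers" property; it also shows why each participating server ends up storing its own copy of the room data.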

My experience of Matrix as a possible replacement for Discord has led me to believe it's mostly a disadvantage, since it leads to gross misalignments between the communities on top and the infrastructure providers underneath. I consider e.g. Discourse to be much healthier (although I would like to see an app for Discourse so that my Discourse communities behave more like Discord/Slack servers) and it's frustrating to me that a clear "Discourse for chat" hasn't emerged to replace Discord.

maelito - 2 days ago

I've been using Matrix for several years as a user. It works great. The problems decrypting messages have gone. X is becoming a good client. I'm deleting my WhatsApp and Telegram accounts in a few weeks after a painful week-long backup...

Edit: I wonder how easy it is to back up a Matrix account's data. Conversations and files.

nehal3m - 2 days ago

I’ve been running a Matrix server for about two years on a Proxmox host in a colo I rent for the purpose (plus some other hobby stuff, but mainly because I just think it’s cool). This playbook is awesome and it’s pretty easy to set up and keep running: https://github.com/spantaleev/matrix-docker-ansible-deploy

pferde - 2 days ago

Regarding the "Requires federation" section, that is not true. I've been running a small family-only homeserver for several years now, and had federation disabled on it from the very beginning, and there have been exactly zero issues related to (lack of) federation with it.

Aurornis - a day ago

> While technically, Synapse can work with a sqlite database (and which at first seems like an OK choice for having <10 users on the server), it WILL become corrupted.

Does anyone have any more information on this? Running Postgres is not a big deal, but I would expect SQLite to be fine given how well it works in my experience.

chrismorgan - 2 days ago

> While technically, Synapse can work with a sqlite database (and which at first seems like an OK choice for having <10 users on the server), it WILL become corrupted

I want to hear more about this. Is this because Synapse’s SQLite support is half-baked? What sort of corruption are we talking about?

mcluck - 2 days ago

I've tried off and on to actually use Matrix. I was a bit of a loud supporter in the early days. Unfortunately, it looks like it still hasn't grown past the fundamental issues I was having then. It might be time to try something else.

jimkleiber - 2 days ago

As someone who has looked into forking Matrix for a new type of chat service, I'm grateful to see a more in-depth look at running it behind the scenes. Thank you.

df0b9f169d54 - 2 days ago

OT: I have some very big groups in Telegram (7 years or more, with a lot of pictures). Can Matrix (Rocketchat or alternatives) have similar storage features (with some migration scripts)? Thanks

24t - 2 days ago

To add another data point, I've been hosting a (tiny) matrix server for a few months. I'm pretty comfortable with self-hosting using docker, so I opted not to use the ansible scripts in the hope that it'd keep my setup simpler and more maintainable. Somehow I didn't find any mentions of ESS until Synapse was already up and running, but Kubernetes would have been a dealbreaker for similar reasons.

In this short time I've run a database migration (sqlite is the default, but MAS requires postgres), tried and failed to migrate to MAS (required to use Element X) and have lost a couple of days messing around with coturn and eturnal with nothing to show for it -- my calls still don't connect when NAT is involved. I have to tell new users to ignore the recommendations to install Element X until I get MAS working.

There's a lot of room for foundational improvements here, even updating docs to point would-be server admins to the recommended setup du jour would help.

eTomte - a day ago

I've been running Synapse on a small VPS for the past few years. I got some of the bridges working too. Def bumps along the way, but it's still the daily driver for me and about five other people.

Recently I spun up the new ESS Community Edition on a new VPS. It was much easier to get up and running, and I was delightfully surprised. However, since it uses Kubernetes and other things I'm not familiar with, getting bridges and the other things I've become used to working is going to require more learning on my part. Since ESS is so new, there aren't a lot of newbie-friendly how-tos out there yet.

I remain optimistic.

Barathkanna - 2 days ago

TLDR Self-hosting isn’t dying because people stopped caring. It’s dying because the complexity has gotten out of hand.

This post highlights how something that used to be a fun, lightweight hobby has turned into a full-time maintenance burden. Systems like Matrix are powerful, but they’ve become so intricate that even skilled engineers struggle to run them reliably. The result is a slow drift back toward centralized platforms, not out of preference, but because convenience keeps winning over autonomy. It’s a reminder of the growing gap between the ideal of a user-owned internet and the realities of modern software.

ekjhgkejhgk - 2 days ago

And I thought that XMPP felt broken...

JadedBlueEyes - a day ago

> Synapse is the only choice that supports bridges

This is not true, at least today. Continuwuity, which is an alternative server implementation, and its predecessors support bridges very well.

pelzatessa - a day ago

Been selfhosting Synapse for about 1.5 years in a docker compose setup using bunkerweb (formerly "bunkerized nginx", which better explains its premise) as a reverse proxy, eturnal for TURN, and Postgres. I also recently added livekit and MAS for Element Call and Element X compatibility. All of that runs on a small 2 vCore/4 GB VPS, and it runs pretty good. I experience a server crash about every half a month, but that may be caused by the fact that bunkerweb isn't the most lightweight solution (they actually require 8 GB RAM minimum, so I'm already beneath the limit), and also because I run some other software (mailserver, ebook server, plex, etc.).

My experience as an administrator has been pretty good. Perhaps it's because I was optimistic from the beginning; it suited my needs, as I wanted a selfhosted, modern and fairly convenient communication platform. From what I recall, most problems during configuration were caused by bunkerweb (or rather by my inability to set it up to proxy requests correctly and not hijack the 4xx and 5xx HTTP codes). Synapse itself has been a pleasure to maintain, but bear in mind that I did not tinker with it; I basically set it up, let it run for about a year, and then added MAS and livekit.

Yeah, disk usage sucks: for about 5-10 active users and 1.5 years of usage, my postgres "schemas" folder clocks in at 10 GiB. That doesn't include the media_store directory where Synapse keeps media (images, videos). The homeserver is federated and I joined a couple of big rooms in the past. The mechanisms mentioned in the links below do help, though:

https://matrix-org.github.io/synapse/v1.40/admin_api/purge_h...

https://github.com/matrix-org/rust-synapse-compress-state
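For anyone wanting to script the first link, here is a minimal sketch of building a purge-history request in Python. It assumes the endpoint path `/_synapse/admin/v1/purge_history/<room_id>` and the `purge_up_to_ts` body field (milliseconds since the epoch) as I understand them from the Synapse admin API docs, so check them against the docs for your Synapse version. The base URL, room ID, token, and the `build_purge_request` helper are all placeholders made up for this example, and the request is only built, not sent.

```python
import json
import time
import urllib.parse
import urllib.request


def build_purge_request(base_url, room_id, admin_token, keep_days=90, now=None):
    """Build (but do not send) a purge-history request for one room.

    Hypothetical helper: endpoint and body field are assumptions taken from
    the Synapse admin API docs; verify them for your version.
    """
    now = time.time() if now is None else now
    # Events older than this cutoff (in milliseconds) are candidates for purging.
    cutoff_ms = int((now - keep_days * 86400) * 1000)
    url = (
        f"{base_url}/_synapse/admin/v1/purge_history/"
        f"{urllib.parse.quote(room_id, safe='')}"
    )
    body = json.dumps({"purge_up_to_ts": cutoff_ms}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {admin_token}",  # server admin's access token
            "Content-Type": "application/json",
        },
    )


# Placeholder values; a fixed `now` keeps the example reproducible.
req = build_purge_request(
    "http://localhost:8008", "!abc:example.org", "ADMIN_TOKEN",
    keep_days=90, now=1_700_000_000,
)
print(req.get_method(), req.full_url)
print(req.data.decode())
# To actually start the purge: urllib.request.urlopen(req)
```

Purging history is destructive, so it is worth trying this against a throwaway room first and taking a database dump beforehand.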

Clientwise also sucks, but I think enough has already been said on this matter. But it's good enough to keep my nontechnical friends using it. They do hate it, but not enough to kick me in the arse. Would love to say that this proves that element clientside is usable, but I also have to admit that my friends are just hella good guys who would even write pigeon mail to me if I stopped using anything else for communication :) for me as a techie, element is obviously alright. Clunky, but works. I think clients simply need more time.

What irritates me is the Matrix Authentication Service (MAS). It's kind of a separate service for Matrix homeservers that specifically handles logins. You can't use Element X without it. However, when it's enabled, you cannot log in from your client; instead a web browser opens and shows the login panel where you have to authorize, and then it should return to the client. Except in my case it simply doesn't :( I observed that for some reason Chromium-based browsers won't redirect back to the Element app, so it doesn't know that the authorization has been granted. I managed to bypass it by copying the URL and opening it in Firefox, but in one instance even that didn't work.

But other than those MAS problems, everything has been fine from an administration standpoint. I think it simply needs more time, as it already has traction: I see that a lot of new projects include a Matrix room in their social/communication channels, and frequently it's the only option besides the bugtracker. And I'm willing to wait patiently :)

edit: added links for people who also struggle with disk space usage

nonamesleft - 2 days ago

Ran synapse for a few months, figured out all the tui clients are either abandonware or broken (originally thought i could use bitlbee, and did the install before realizing it was unusable).

Looked at current tui offerings now some years later, situation seems to be unchanged, the only client that ran back then was gomuks, and that has received a rewrite that hasn't reached feature parity yet.

I am probably the type of person referred to in the last part of xkcd 1782.

alexnewman - a day ago

I've been developing open source code for over 25 years. I've deployed hugely complicated systems like Hadoop. I've never seen anything as hard to run as this + the bridges.

udev4096 - 2 days ago

I have been hosting Synapse for 2 years now and it's been smooth sailing. I don't recall having any major breaking changes; most updates are smooth. The Element client itself is definitely a PITA, but it's getting better.

CyberDildonics - a day ago

Wouldn't this just be called hosting?

jchw - 2 days ago

I've mentioned this here before, but it bears repeating. A couple years ago or so, I made the catastrophic choice to use Dendrite as my homeserver software. It seemed like a safe bet: it was supposedly lighter weight than Synapse, being written in Go instead of Python and with everything reengineered from the ground up. It didn't support everything, but nothing in the disclaimers made it seem like it was about to abruptly become defunded and essentially unmaintained. Alas, that's exactly what happened not too long after I made that choice. Despite showing no interest in maintaining it, New Vector still found it necessary to relicense their abandonware under the more restrictive AGPL license. Good priorities. Then, when a security patch was needed, a new release was rushed out that included not just a security fix, but also a bug that caused Dendrite to completely stop processing messages for minutes at a time multiple times a day. (This only got fixed months later by a volunteer.) Joining large federated rooms on Dendrite took so long that I thought it was just broken; it could take hours to days to complete the operation! There was even a brief period when Element actually didn't even support Dendrite, leaving everyone locked out. Dendrite has never supported Element X, and the old sliding sync proxy was never updated to support the new simplified sliding sync either, which means you're stuck with the old slow sync and no support for things that require it. Also, most appservers still don't work right on Dendrite either. I got Mautrix-Discord working, but only for DMs.

I legitimately could go on and I'm sure I've forgotten things. It's amazing how quickly my experience with Dendrite went from pretty good to nightmare.

I realized that nobody in charge at the Matrix Foundation or New Vector really cared enough about leaving people stranded on a completely broken server to actually do anything about it (and trust me, I'm not alone. In every single federated room I've ever been in, I've always seen hostnames with dendrite subdomains. I could see them pass by in the logs while joining servers was taking hours.) I honestly considered just leaving the Matrix ecosystem, but I wasn't alone on my home server, so I decided to do my best to fix the problem. I wrote a tool that attempts to migrate the data from Dendrite to Synapse. This is a complicated operation that really took a huge amount of effort to get working, but after a couple of months of failed attempts, I had a test where I was able to seamlessly perform the migration and have clients continue to work and stay in federated rooms. So after getting it "close enough", I went ahead and gave the migration a shot in production and of course, it didn't work very well. All of the user accounts were intact, but a lot of stuff was broken. People indeed stayed in federated rooms, but my room state migration was definitely not 100% correct. Despite this, though, after manually cleaning up the database a bit more, hackishly while live, it was mostly a success. I believe I am probably the first person to directly move from Dendrite to Synapse.

So now that I am on Synapse, have my thoughts on Matrix changed? Yes. It's significantly better using Synapse, without question. The ecosystem is still a mess, but everything about Synapse is less broken than Dendrite. There are so many features Dendrite just doesn't do, like URL previews.

Why not contribute to Dendrite? Honestly, I don't want to. Their CLA sucks and they're not going to change it for me, and I don't think they're really going to spend time reviewing PRs given the circumstances. If I'm going to contribute to a project without retaining my rights I'd prefer to be on payroll. That's not something you should get from a community member. Either change the CLA to guarantee the project must stay open, or don't expect any free contributions.

Why not post my migration tool? Well honestly, for starters, it's not a very high quality tool. I could probably do some good for the Matrix ecosystem if I could get this tool in much better shape and have it migrating complex room states correctly, but I don't even know if I want to help anyways. This should've never been my problem. I will fully admit that it was my bad choice that got me here, but I really think it can be forgiven: nothing I saw suggested to me that Dendrite was on the way out. On the contrary, everything suggested it was the future, and just simply not ready for large scale usage yet. I'm bitter. I spent a lot of hours on this problem and I feel like hours spent on the Matrix ecosystem won't be repaid.

I hate to be this cynical, but it's just how it is. It's a mess. I didn't bother going into the other messes that still exist when using Synapse, like the seemingly many different ways that VoIP can work in Element and Element X, and the fact that Element X seems to only support a newer VoIP protocol that Element on desktop does not. (Surprise! There is no Element X on desktop...)

Matrix has some other downsides, that I think are tolerable but definitely make me a bit bummed. It leaks quite a lot of metadata to the homeservers, which is kind of alright, but I do think it's a bit sad; even room names are not encrypted, clearly it would be possible to do better. The ecosystem of clients is sad; Element is the only one that is feature complete and while I think it has improved quite a lot I still would prefer a native application over a web view. (You kind of need a webview if you want feature parity though, since group A/V in Element desktop seems to just use a Jitsi iframe...)

The upside is that it is federated and at least messages and files are E2EE in DMs and optionally for groups. I do like that.

Personally though the federation thing feels a bit off to me. I know it's a pipe dream, but it just feels like 1-on-1 and small group DMs should be roughly peer-to-peer. Servers should be for chatrooms and relays. The problems I see are mobile notifications, offline messaging, and discovery. I have wondered if a model like AT proto could get you there for DMs. I would like to try to prototype something some day, but I know that at this point the XKCD 927 count for IM software is pretty insane, so if I'm going to throw my hat into the ring it better be worth it.

Maybe some day I will be less bitter. I mean, Matrix is free, how much can I really be angry if it didn't work how I wanted it to. But, it's hard. I tried to buy in hard, and wound up making a lot of trouble for me and a small group of people that I inadvertently looped into my own mess. I trusted Matrix because it seemed to be the leading option, but definitely I will now be much, much more careful before adopting an ecosystem into my life and the life of people around me. For example, I still have no ActivityPub server... Maybe it's better if I just wait and see what happens there before jumping in, if I'm going to.

Arathorn - a day ago

There's a bunch of outdated info in here, unfortunately:

> Does not have an admin panel

The admin panel is at https://github.com/element-hq/element-admin (but it's relatively new, so many folks haven't noticed it exists)

> No image captions
>
> This is silly, but while (official?) bridges support image captions, official Element app does not.

Element Web & X support captions. (Element Web doesn't currently support authoring them, but can display them - obviously this is on the todo list though).

Element Classic has basically not been touched in 2.5 years, unfortunately, and so yes - doesn't support them. But at this point the old app is just not being developed; we don't have bandwidth to do both.

> Slow notifications

This sounds like it might be overloaded server problems ftr.

> Element X is Slower
>
> Somehow, it is slower. Clicking on a conversation takes 0.5-1.0 seconds to load it, compared to almost instant load on Classic.

This was an Android specific bug which was fixed a few weeks ago; EXA should now be instant as it should be, at least on nightlies: it was stuff surrounding https://github.com/matrix-org/matrix-rust-sdk/pull/5841 and https://github.com/matrix-org/matrix-rust-sdk/pull/5854. Unsure if the fix has made it into a build yet.

EDIT: the fix shipped in stable Element X Android a few weeks ago.

> Conversations are sorted by... who knows. It is not recent nor alphabetical.

It should be by recency.

> Onboarding is bad

Sorry, but you have to run an auth server (matrix-authentication-service) if you want Element X to work.

EDIT: if I've got any of the above wrong please just say, rather than downvote :D