By default, Signal doesn't recall
signal.org | 564 points by feross 3 days ago
I'm happy to have this setting. It's a great setting and I appreciate Signal adding it.
However, if an attacker has the ability to directly query the Recall database, they almost certainly have access to read all your Signal messages on your device. The locations where Recall files live are even more protected and isolated than your %APPDATA%\Signal directory is.
Everything running as you on your computer has full control of all your Signal messages and your identity assigned to the device. This is untrue of your Recall data, which from last I saw required a lot of finagling to get the permissions right for you to access it raw.
It's kind of funny that this argument always comes up when talking about Windows security.
What if I told you that botched sandboxing by default is not the standard we should accept? And that Windows' lack of competence to isolate processes isn't even what the NT Kernel envisioned (see e.g. ReactOS)?
I'd never run Windows as a host system, given the track record of how Microsoft deals with RCEs and privilege escalation issues that have been unfixed for decades at this point.
* Arguing about what should be doesn't alter the facts of the current situation
* It's unfair to single out Windows here, when other platforms are not much better. Mechanisms for stronger sandboxing and storage firewalling do exist on all platforms, but in practice these are barely used on desktops, and this is true across all three major OS families. E.g. Flatpaks exist, but I believe they still represent the minority of actual application installs
And ironically Wayland also gets frequent, heavy criticism for doing the right thing here - treating screen or input capture as a privileged operation, rather than a default right of any random application you have running (though I agree they should have standardized an escape hatch earlier)
I do not want stronger sandboxing for my apps. I'll only allow it to run on my machine if I trust it anyway. I'm not worried about my apps being able to read my data. If a piece of software doesn't do what I want safely and securely, then I don't try to duct-tape it into a sandbox that's regularly broken out of anyway. I remove and use other software.
If flatpak/appimage are the only way to get a piece of software, I won't even waste my time evaluating it.
Do you also run everything as root?
No, I don't. Because it doesn't need to. But not running as root already provides all the protection I want. System files are not my data. My data lives in my home. I want my apps to have access to my stuff, not the whole system. But more importantly, I do want them to be able to talk to each other and effortlessly open files written by one another. "Isolating" them from each other is pointless if I then proceed to punch holes in everything just so it can work.
"This thing isn't working..." "Oh... it turns out it was missing a permission, should I give it that permission? What's it for? Fuck if I know..."
Or the other way round:
"This app seems like it's working properly, but can I restrict this particular permission for it? Fuck if I know. I'll just try and see if anything is broken"
Or I can just run the application normally and have everything always work.
You should dig into ~/.local and what happens there. I'd never store my keepassxc database file in my home folder if I were you.
Apps need sandboxing, because the Linux/POSIX philosophy of separating processes through users and groups doesn't really fit how graphical software works today.
Firejail's approach comes close to "sane" user sandboxes, but technically that's the job of the init daemon (PID 1); there's just no easily usable GUI for systemd sandboxes yet.
Podman is also really nice as a user-facing sandboxing daemon.
I know what happens in there. Shit that I install because I want to goes in there. And my keepassxc database is protected by a strong password and a hardware token. These databases are specifically designed so you can safely store them anywhere (e.g. cloud backup), so I don't see why you brought that specific example up.
Well, I agree with the Wayland and copy/paste clusterfvck.
I was more referring towards Qubes. I think Qubes does a lot of things right, I just wish its user facing settings and tools were easier to use in a graphical manner.
It's the same security for Signal as on macOS and Linux. Your user has full control over it; generally all processes running as you can see it and mess with it.
Recall is encrypted with a key in the TPM. But getting access to that encrypted sqlite db is a yawn.
Getting the key is harder, but possible. You can breach into Microsoft's group with a particular set of GPOs - if you can run a particular set of server commands on the local network.
Signal data is encrypted at rest. The key is stored in the OS store - usually meaning the TPM.
However, the key isn't in Microsoft's main grouping. To date, no one has extracted the Signal key this way. Other exploits are required.
Signal being smaller than the whole of Microsoft reduces the attack surface.
If Signal uses the Windows Data Protection API for saving/encrypting the key (and some data online suggests that it does), then it’s trivial to fetch it back with the same APIs if you’re running as the same user. (I use `keyring` on Windows to access the key(s) VMware Workstation uses to encrypt Windows 11 VM vTPMs)
It’s kept secure by a chain of keys that may be backed by the TPM, but the security boundary is the user, not the app identity. IIRC Store/UWP apps may get their own boundary for credentials (due to how .appx is implemented).
> no one has extracted the Signal key this way
This is incorrect. Any process running as your user can trivially get the key.
At least this gives forward secrecy, so if someone takes control of your computer they can only spy on signal messages AFTER that point, and can't access prior messages that Recall has captured.
This is only forward secrecy for messages that were deleted and would have been captured by Recall and are still within the snapshot history which has a maximum number of days.
All the messages you've previously synced to the device exist in that Signal AppData directory and can be trivially searched and read by any application running as your user account. And all attachments are also just sitting there.
For example:
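As a rough sketch of that point (Python, standard library only): the function below walks a directory and lists every file the current process can open for reading, with no elevation involved. `default_signal_dir` is a hypothetical helper, and the `%APPDATA%\Signal` location is an assumption about a typical Windows install; the demo runs against a throwaway temp directory instead of a real profile.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def readable_files(root: Path) -> List[Path]:
    """List every file under root that the current process may read."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if os.access(path, os.R_OK):
                found.append(path)
    return sorted(found)


def default_signal_dir() -> Path:
    """Hypothetical helper: assumed Signal Desktop data directory on Windows."""
    return Path(os.environ.get("APPDATA", "")) / "Signal"


# Demo on a temp directory standing in for the real profile folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "attachments").mkdir()
    (root / "config.json").write_text("{}")
    (root / "attachments" / "photo.bin").write_bytes(b"\x00")
    for path in readable_files(root):
        print(path.relative_to(root))
```

This only demonstrates the file-permission boundary (anything running as you can see these files). It says nothing about decrypting the database, which other comments in the thread cover.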
Given how popular the disappearing messages feature is, a lot of messages are going to be deleted where Recall is a second, less tested copy of things which the user believes are gone. Given the past history of AI tools it seems dangerously likely that someone would find that their Recall history was retained longer than necessary or could be retrieved through some creative misuse of Recall which doesn’t require system access.
> Given how popular the disappearing messages feature is,
I don't think it's really that popular or extensively used. Most people I know who use Signal use it pretty rarely. I'll turn it on when I'm about to send something sensitive, but generally it's not enabled. I've been using Signal since 2015 and I've probably only sent or received a hundred or so disappearing messages. I've sent and received many, many thousands of messages. And I mean even in this HN thread tons of people are talking about how they wish the iOS app would have better backup and transfer functions. Something tells me they're not itching to transfer all those already deleted messages.
And sure, maybe Recall ends up saving things longer than it was set for. Maybe Signal does as well. And once again, accessing your entire Signal database doesn't even require system privileges, just your local user account.
Your browser can access all your Signal messages. Your chat app can access all your Signal messages. Your email client can access all your Signal messages. Your calculator app can access all your Signal messages. That videogame made by Tencent can access all your Signal messages. They don't even have to screengrab, they can just read them.
You should be able to access everything on your own computer - that is a good thing.
The real problem with Recall is that Microsoft will access the data to apply some algorithmic secret sauce. The product manager already probably has all kinds of ideas for the future: targeting ads, upselling licenses, or making MS more attractive to law enforcement.
Yes, there is a benefit for the user, like a nicer search or something, but that is relatively minor... because Microsoft is not your friend. This feature is born from a harvesting mindset.
Regarding disappearing messages: I have two chats on WhatsApp where disappearing messages are turned off, and maybe a handful on Signal.
It is actually less common to disable them on WhatsApp than on Signal, mainly because Signal forbids delete windows longer than four weeks. That is not long enough sometimes.
I have no idea how our usage here ends up being so dramatically different. I don't even tend to talk about the feature with most contacts.
> Microsoft will access the data
Incorrect. This runs locally. Microsoft is accessing the data just as much as when you open the file in Notepad or browse a folder in Explorer.
And once again I just point to all the people complaining about a lack of backup and weak transfer ability. They're not looking to back up a nearly empty history.
> Incorrect. This runs locally.
It does not matter whether Recall runs locally. Microsoft controls the OS, the feature, and the update pipeline. If they decide tomorrow to start syncing Recall data to the cloud - for any reason - they can. The local processing angle is just an implementation detail, not a meaningful protection.
What matters is that - with this feature turned on - MS is structuring and indexing your private data at the system level. That is not a neutral act. Once the data is structured and accessible, uploading it is trivial. And given Microsoft's cloud-first direction, the trajectory seems clear to me.
I understand your point, but the theoretical similarity between unstructured local data and the Recall database is not useful in practice. It's like telling a farmer that it doesn't matter whether the grain is in the barn or still in the field because he can access it either way.
> Microsoft controls the OS, the feature, and the update pipeline
So then this is true whether or not Recall exists, because Microsoft could have gathered this data either way. They could decide tomorrow to have Explorer siphon off that data, they could have Edge siphon off that data, they could have Windows update siphon off that data. Microsoft could have silently been doing this the whole time.
If you don't trust Microsoft with Recall, you shouldn't trust Microsoft with any of it. And you probably should have moved off Windows a long time before.
> MS is structuring and indexing your private data at the system level
This has been going on for a long time. Once again, if you don't like the idea of Microsoft running software in a Microsoft operating system to read your files you really shouldn't be running Windows, and shouldn't have been running Windows for decades.
signal desktop keeps database keys in the os keystore via electron safestorage api
on linux that’ll be kwallet or something, on mac it’s the keychain. it’s as secure as your password manager
edit: okay, you’re right, on windows it’s useless lol https://www.electronjs.org/docs/latest/api/safe-storage
> [on Windows] content is protected from other users on the same machine, but not from other apps running in the same userspace.
It's not different on Linux, every App can access any key in kwallet. To make this not possible the os would need to generate some kind of unique app id that can access only what it stored. This would probably result in a lot of lost passwords for normal users.
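To make the per-user versus per-app distinction concrete, here is a toy model in Python. Both classes and all names in it are hypothetical; this is not how kwallet, DPAPI or any real keystore is implemented, just a sketch of the two scoping rules and of why an app-id scheme can strand secrets.

```python
from typing import Dict, Optional, Tuple


class UserScopedStore:
    """Toy model: secrets are keyed by user, so the caller's app id is ignored."""

    def __init__(self) -> None:
        self._secrets: Dict[Tuple[str, str], str] = {}

    def put(self, user: str, app_id: str, name: str, value: str) -> None:
        self._secrets[(user, name)] = value

    def get(self, user: str, app_id: str, name: str) -> Optional[str]:
        return self._secrets.get((user, name))


class AppScopedStore:
    """Toy model: secrets are keyed by (user, app id); only the storing app can read."""

    def __init__(self) -> None:
        self._secrets: Dict[Tuple[str, str, str], str] = {}

    def put(self, user: str, app_id: str, name: str, value: str) -> None:
        self._secrets[(user, app_id, name)] = value

    def get(self, user: str, app_id: str, name: str) -> Optional[str]:
        return self._secrets.get((user, app_id, name))


user_store = UserScopedStore()
user_store.put("alice", "signal", "db-key", "s3cret")
print(user_store.get("alice", "calculator", "db-key"))  # prints s3cret

app_store = AppScopedStore()
app_store.put("alice", "signal", "db-key", "s3cret")
print(app_store.get("alice", "calculator", "db-key"))          # prints None
print(app_store.get("alice", "signal-reinstalled", "db-key"))  # prints None
```

The last line is the "lost passwords" case: once the app id changes (reinstall, rename, repackaging), the app can no longer reach what it stored.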
Right, this feels like an entry to Raymond Chen's "it rather involved being on the other side of this airtight hatchway"!
I have nothing against Signal implementing screenshot hiding. But it's not exactly fixing a gaping security hole.
I agree with Signal here and love their commitment. Strangely (to me) they do 'recall' things in other ways:
* They have a message retention setting, 'Disappearing messages'; it works on message correspondents' devices too (if Ali sets 'Disappearing messages' to '1 day' for the chat with Barry, and then texts Barry, 1 day later Signal deletes the message on both Ali's and Barry's devices).
However, 'Disappearing messages' applies only to text messages. For every voice and video call, Signal retains a record of the date and time and the participants, and Signal saves it on the devices of each participant. Beyond a doubt, Signal's developers are well aware of the value of such metadata - as valuable as call content, in different ways - and the need for confidentiality (if you aren't familiar with that particular issue, I promise that every security professional is).
I'm shocked that they do it. What about a human rights dissident who is arrested - or whose phone is stolen - their phone won't show any sign of the text messages but it shows everyone they called and when, implicating all those other people and putting them at risk, and also evidence against the phone's owner. And even if they are disciplined and manually delete each of those records - afaik you can delete each call record one at a time - the other call participants' phones still retain the records. There is nothing someone can do to protect themself.
Better security here doesn't seem hard to implement. Also, I think having different settings for text messages and for voice/video calls makes retention settings more confusing for users. Many will believe they are safe without realizing the risk of this metadata - they trust the experts at Signal to understand these things and keep them safe - and many will assume everything disappears. Just have one setting for all data and metadata in the chat.
* Also, afaik if you delete the entire correspondence with someone - delete their entire chat history and delete them from the Signal address book - Signal retains information on them, such as settings for that chat. It seems that an attacker could identify all the deleted correspondents; again, there's no way to protect yourself.
> Better security here doesn't seem hard to implement.
You seem to assume it would be very simple to implement this — how do you come to this conclusion? My priors would suggest that the vast amount of effort that went into the Signal protocol renders low-hanging fruit regarding privacy fairly unlikely.
The GP is actually right here, Signal keeps the call log in the message history (deleting the call entry from the message history deletes it from the call log), but the disappearing messages setting doesn't get applied to the call log.
It's weird to see a bunch of messages, a call, more messages, and a day later the messages around are gone, but the call remains in the history. They could have just applied the disappearing messages settings to the call entries too, as it would be natural to do, and this problem wouldn't exist.
I don't think it's malicious, because what the server knows is independent of what the UI shows, but it's a very odd UI issue that does reduce privacy.
> Signal keeps the call log in the message history
Do you mean in the UI or do you mean in the underlying database, or in both?
They keep it in the UI, therefore I assume in the database as well. If you delete a call entry in the message history (like you delete a message), it gets removed from the "call history" tab as well.
The UI could combine data from two db tables. Anyway, that part is just a curiosity.
> vast amount of effort that went into the Signal protocol
If it requires protocol development, I'd agree. I expect - knowing no more than Signal's blog posts - that it has two components:
* Local database: These records need a retention period column, somehow - however they implement it with text messages. That seems straightforward.
* 'Distributed retention' - implementing the retention period setting on the remote devices of other call participants. I expect they would do it the same way they do with text messages, and I would guess it's just a field in a packet somewhere; e.g., establish a secure connection and then in the call's initial packet,
  time = 2025-05-21T22:13:11Z
  call.from = lblume
  call.to = mmooss
  retention.period = 1440 minutes
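The local-database half of that idea could look roughly like the Python sketch below. `CallRecord` and `purge_expired` are hypothetical names made up for illustration; this is not Signal's schema or protocol, only the "retention period column" idea applied to call entries.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass(frozen=True)
class CallRecord:
    started_at: datetime
    caller: str
    callee: str
    retention: Optional[timedelta]  # None means "keep forever"


def purge_expired(records: List[CallRecord], now: datetime) -> List[CallRecord]:
    """Return only the records whose retention period has not yet elapsed."""
    return [
        r for r in records
        if r.retention is None or r.started_at + r.retention > now
    ]


call = CallRecord(
    started_at=datetime(2025, 5, 21, 22, 13, 11, tzinfo=timezone.utc),
    caller="lblume",
    callee="mmooss",
    retention=timedelta(minutes=1440),  # one day, as in the packet sketch
)
one_second_early = call.started_at + timedelta(minutes=1440) - timedelta(seconds=1)
print(len(purge_expired([call], one_second_early)))                     # prints 1
print(len(purge_expired([call], call.started_at + timedelta(days=1))))  # prints 0
```

Each participant's device would run the same purge against its own copy, which is the "distributed retention" part: the period travels with the call, the deletion happens locally.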
Correct. Signal also saves changes to the disappearing messages timer by default.
> but it shows everyone they called and when
Let's not forget that Signal uses real life phone numbers as identifiers, making the secret police's job even easier.
It can use usernames now. I don't have any of my Signal contacts in my contact list, and I can't see their phone numbers any more since they introduced the usernames. Not sure if by digging in the database files I could extract the numbers or not.
Can you sign up without a phone number?
I think it's required to sign up. It's never used after that (unless you want to use it).
> I think it's required to sign up.
Ok, now where are Signal's servers hosted? You're not safe from any secret police from those countries and countries friendly to the hosting countries.
> It's never used after that (unless you want to use it).
As in there's no way to accidentally leak your phone number to your contacts on, say, a new installation that comes with the option to make it visible by default?
Edit: You are making one uninformed assertion after another. Stop making endless errors and just look up these things at signal.org. They are very open about it.
> Ok, now where are Signal's servers hosted? You're not safe for any secret police from those countries and countries friendly to the hosting countries.
Signal is very open about what information they collect, which is all they can produce: a phone number, and "the date and time a user registered with Signal and the last date of a user’s connectivity to the Signal service".
https://signal.org/bigbrother/eastern-virginia-grand-jury/
> As in there's no way to accidentally leak your phone number to your contacts on, say, a new installation that comes with the option to make it visible by default?
Is there? What are you claiming, and based on what? There are infinite speculative security risks.
I wonder if 2025 will be the year of Linux.
Windows has turned itself into spyware. Apple is too expensive and going the same way.
Meanwhile the user experience of Linux has dramatically improved. Put on a good skin and most people wouldn't notice the difference. You don't need to reply that you can, I know you can. You're on HN. But most people just use their computer for the browser and most people can't tell Chrome from Firefox. Most people get their lock-in from their tech friend or child. Really, Microsoft's only lock-in remains Office.
It won't be a complete shift, but the signs of a growing userbase are there. Would be a huge win for open source! If you haven't tried Linux in a few years, try giving something like PopOS a go, or if you want to say you use Arch then try EndeavourOS. Both are very stable, the latter slightly less.
Edit: enfuse was right, I should have suggested EndeavourOS instead of Manjaro.
The problem is, until laptops sold at Walmart or Best Buy start coming with Linux pre-installed as an option, adoption will never happen. Installing an aftermarket OS is just an incredibly unrealistic expectation from the average user.
Microsoft knows this, and they will do everything they can to prevent OEMs from shipping anything other than Windows. Apple of course, forget it. Their profit comes from leeching off FOSS and selling it, they would never allow distribution of it directly.
> until laptops sold at Walmart or Best Buy start coming with Linux pre-installed as an option
This is a circular problem though. They'll do it if Linux starts becoming more popular. If you want to see this, make sure your browser's user agent is broadcasting Linux[0]. Make sure you're using Steam on Linux.
But right now Steam has Linux at <3%[1]. It is more than OSX, but not enough. I do think above 5% and it'll start to be taken seriously, and 10% we'll start seeing moves. Linux doesn't need 90% of the marketshare to dramatically change the world. 10% is more than enough. Even 20% would be momentous and force both Microsoft and Apple to change strategies. Don't feel like there's no hope. Just because it is an unrealistic expectation today doesn't mean it will be tomorrow. And your actions today change the odds of what happens tomorrow. So don't give up.
You don't have to change the world overnight. But you do need to make steps in the right direction, even if small, to make the world move.
[0] You can even do this while using Windows! Hell, you can use Chrome and tell people you're using Firefox on Linux if you believe in those things but just are unwilling to make the switch yourself. The signaling still does something (it is better than nothing).
[1] https://store.steampowered.com/hwsurvey/Steam-Hardware-Softw...
> But right now Steam has Linux at <3%[1].
I think the overwhelming majority of this is Steam Deck usage. While that's certainly a feather in the cap for Linux, I don't think it really counts toward Linux momentum as we're using the term here. Nobody is going to start investing in polished desktop Linux software because there are a lot of Steam Deck buyers.
> I think the overwhelming majority of this is Steam Deck usage
Please click the link and then the OS tab for a breakdown, as your conjecture is falsifiable[0].
  MOST POPULAR                              PERCENTAGE   CHANGE
  -------------------------------------------------------------
  Linux                                     2.27%        -0.06%
  -------------------------------------------------------------
  "Arch Linux" 64 bit                       0.21%        -0.02%
  Linux Mint 22.1 64 bit                    0.14%        +0.02%
  Ubuntu Core 22 64 bit                     0.10%         0.00%
  Ubuntu 24.04.2 LTS 64 bit                 0.10%         0.00%
  "Manjaro Linux" 64 bit                    0.06%         0.00%
  "EndeavourOS Linux" 64 bit                0.06%         0.00%
  Debian GNU/Linux 12 (bookworm) 64 bit     0.05%         0.00%
We do know that SteamOS is Arch based. So yeah, it is the dominant player there. I'm not entirely surprised, but I don't think anyone was. But important to note, there's only a 0.07 point difference between Arch (0.21%) and Mint (0.14%). It's important to note because
1) Arch is incredibly popular and we can't guarantee all users in the Arch category are SteamOS users
2) Mint is currently the most popular distro[1]
> Nobody is going to start investing in polished desktop Linux software because there are a lot of Steam Deck buyers.
Maybe not, but also polishing of the Linux desktop has happened regardless of this. In fact, it is what drove SteamOS. Please refer to the items on [1], as literally the top 8 distros were developed for this explicit purpose (making Linux more user friendly).
[0] We can determine it to be true or false.
Where did you pull this data from? I get extremely different results myself:
I literally just googled "Steam hardware survey"
Btw, for some reason I can't view that image. Tried in 3 browsers on my phone...
https://store.steampowered.com/hwsurvey/Steam-Hardware-Software-Survey-Welcome-to-Steam
I came back and found the difference: you clicked "Linux Only". But I'm glad you did, because it gives us additional information that helps answer the previous question more accurately. It strongly falsifies the earlier conjecture that Linux users were mostly on SteamOS. We can see only a third are; 2/3rds of Linux gamers are NOT using SteamOS (and definitely a subset of SteamOS users are also not using a Steam Deck).
  "SteamOS Holo" 64 bit                             33.78%   -0.70%
  "Arch Linux" 64 bit                                9.45%   -0.23%
  Freedesktop SDK 24.08 (Flatpak runtime) 64 bit     6.41%   +0.15%
  Linux Mint 22.1 64 bit                             6.20%   +0.89%
  Ubuntu Core 22 64 bit                              4.62%   +0.23%
  Ubuntu 24.04.2 LTS 64 bit                          4.44%   +0.26%
  "Manjaro Linux" 64 bit                             2.61%   -0.05%
  "EndeavourOS Linux" 64 bit                         2.46%   +0.06%
  Debian GNU/Linux 12 (bookworm) 64 bit              2.27%   -0.08%
  Pop!_OS 22.04 LTS 64 bit                           2.23%   +0.02%
  Other                                             25.54%   -0.53%
https://store.steampowered.com/hwsurvey/Steam-Hardware-Software-Survey-Welcome-to-Steam?platform=linux
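The two views of the survey can be reconciled with a bit of arithmetic. A quick Python sketch, using only the figures quoted in this thread (the function name is made up for illustration):

```python
LINUX_SHARE_OF_STEAM = 2.27  # percent of all survey respondents on Linux


def overall_share(share_of_linux: float,
                  linux_share: float = LINUX_SHARE_OF_STEAM) -> float:
    """Convert a share of Linux respondents into a share of all respondents."""
    return share_of_linux * linux_share / 100


steamos_overall = overall_share(33.78)   # "SteamOS Holo" in the Linux-only view
arch_overall = overall_share(9.45)       # "Arch Linux" in the Linux-only view
non_steamos_within_linux = 100 - 33.78   # Linux respondents not on SteamOS

print(round(steamos_overall, 2))           # prints 0.77
print(round(arch_overall, 2))              # prints 0.21
print(round(non_steamos_within_linux, 2))  # prints 66.22
```

The 0.21% for Arch lines up with the all-platforms table above, which suggests the two tables are the same data at different scales and that SteamOS is counted separately from Arch.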
With the number of Steam Decks sold estimated at 3-4 million, and the number of monthly active Steam users being around 130 million, I think it's safe to say that 0.21% does not represent the SteamOS install base. As far as I know, SteamOS doesn't show as Arch, but rather as its own thing.
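Back-of-the-envelope version of that estimate in Python. The inputs are the rough figures from the comment above, not measured data, and units sold are not the same thing as monthly active survey respondents, so treat the result as a loose upper bound:

```python
MONTHLY_ACTIVE_USERS = 130_000_000        # rough figure quoted above
DECKS_SOLD_LOW, DECKS_SOLD_HIGH = 3_000_000, 4_000_000  # rough estimate quoted above


def percent_of_active(units: int, active: int = MONTHLY_ACTIVE_USERS) -> float:
    """Units expressed as a percentage of monthly active users."""
    return units / active * 100


low = percent_of_active(DECKS_SOLD_LOW)
high = percent_of_active(DECKS_SOLD_HIGH)
print(round(low, 2), round(high, 2))  # prints 2.31 3.08
```

Either end of that range is an order of magnitude above 0.21%, which is the point being made: the Arch row cannot be the Steam Deck population.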
The way to make Linux takeover is get kids using it
To get kids using it, it needs to do lots of cool shit easily
Windows could play games easily when Linux could not even use a USB mouse
The time is right to make Linux do cool shit easily with local generative models that help iteratively create games
Replace all the desktop legacy with some blank canvas and local models that draw on the canvas. Ship some baked in models to generate shells of games to iterate from, boom.
This is exactly the fear of big SaaS and why VCs outside a key handful are done with it.
Apple Silicon is a glimpse of local compute future. Fanless laptops running models that generate entire coherent universes like Marvel and Star Wars. (Don’t need giant models just dense enough to get 80% and let users “zoom and enhance” with their own input)
Show that potential with local models on Linux and it’s over. Three options then; government demands hardware is locked down to preserve Hollywood/gaming/media, open compute wins, or both sides destroy the world over it.
In an interview with IGN during Covid lockdown Gabe Newell was describing generative AI as an existential threat to content creators. It could be temporary as the next gen grows up with a new normal and doesn’t obsess about a career in digital design or web dev, yt video production. It could end humanity as existential dread settles in for millions stuck in some narrative about their existence that no longer holds economic value.
Interesting times.
> The way to make Linux takeover is get kids using it
Agree!
> Windows could play games easily when Linux could not even use a USB mouse
I don't think I've ever had a USB mouse (or wireless mouse or keyboard) issue in the last 15 years.
Games? I'll give you that. But honestly, Steam has really made that almost a non-issue. Good guy Steam! (Their work has affected more than SteamOS.)
> Replace all the desktop legacy with some blank canvas and local models
This seems like the opposite of what you initially argued.
Models as in... LLMs or ML models? This seems like a great way to break things. I'd really encourage you to get these things to try to do what you're saying they should do.
> Apple Silicon is
Where are you going with this?
> Show that potential with local models on Linux and it’s over.
I'm an ML researcher... these models are generally made and deployed on Linux systems, explicitly because they work better there and are easier to deal with.
> In an interview with IGN during Covid lockdown
Serious question: you okay? Did an LLM contribute to your comment? Did an LLM make the whole comment? GPT, can you describe to me Act 4 Scene 5 from Henry V but as told by a Pirate from the deep south? (American south)
Your last line sounds like it'll get some hilarious prompts. I'm going to try it.
> honestly, Steam has really made that almost a non-issue.
Not for online gaming.