I taught a bucket to speak Git

tigrisdata.com

89 points by xena 3 days ago


Eikon - 3 days ago

Most of the pain here is the typical set of issues people run into trying to make S3 a filesystem as-is, common with S3FS-family approaches.

ZeroFS (https://github.com/Barre/zerofs) is 9P/NFS/NBD over S3 on an LSM. Point stock go-git, or just /usr/bin/git, at a mount and skip the gymnastics. Rename is a metadata op in the keyspace, so you get it atomic on any S3, no Tigris-specific X-Tigris-Rename needed.

It's a different point on the spectrum, but less square-peg, and most probably much, much faster (it works great on Linux-sized repos) :)
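
To make the rename point concrete, here is a minimal Python sketch of why a rename can be cheap and atomic when file names live in a metadata keyspace rather than in the object keys themselves. It is a toy model only: `ToyFS`, `chunks`, and `names` are hypothetical names invented for illustration, and nothing here reflects ZeroFS's actual on-disk format or API.

```python
import uuid


class ToyFS:
    """Toy model: names are metadata, data lives under immutable chunk ids."""

    def __init__(self):
        self.chunks = {}  # stands in for S3 objects: chunk id -> bytes
        self.names = {}   # stands in for the metadata keyspace: path -> chunk id

    def write(self, path, data):
        chunk_id = uuid.uuid4().hex
        self.chunks[chunk_id] = data   # the data is stored once
        self.names[path] = chunk_id    # the path is only a pointer to it

    def read(self, path):
        return self.chunks[self.names[path]]

    def rename(self, src, dst):
        # Only the metadata entry moves; no chunk is copied or deleted.
        # If src is missing, pop() raises before dst is touched.
        self.names[dst] = self.names.pop(src)
```

Git leans on this pattern heavily: it writes a `.lock` file and then renames it into place, for example `refs/heads/main.lock` to `refs/heads/main`. In the toy model that rename touches one map entry and leaves `chunks` unchanged, whereas a filesystem that maps paths directly onto S3 keys has to copy the object and delete the original.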

nolist_policy - 2 days ago

If you want to store a git repo on S3, you can do that with git-annex[1] today. It can do client-side encryption and large files as well.

[1] https://git-annex.branchable.com

colechristensen - 2 days ago

I did something similar, though as a full reimplementation of a git and git-lfs library in Elixir. It's still a work in progress, as the S3 backend isn't quite complete and there are performance problems doing some git things through S3.

https://anvil.fangorn.io/fangorn/ex_git_objectstore

The documentation isn't quite correct, but it's getting there

manithree - 2 days ago

I've been using git-remote-s3 (https://github.com/awslabs/git-remote-s3) which I believe was inspired by this git-remote-s3 (https://crates.io/crates/git-remote-s3).

Not exactly the same thing, but along the same lines of "git and s3 are both object stores, why not use s3 as git storage?"
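
The "both are object stores" idea can be sketched in a few lines of Python: a git blob's ID is the SHA-1 of a small header plus the content, and that ID maps naturally onto an object key. The `object_key` layout below mirrors git's loose-object directory scheme and is an assumption for illustration; it is not necessarily how either git-remote-s3 project lays out its bucket. The sketch also assumes a SHA-1 repository, not a SHA-256 one.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID git assigns to a blob with this content (SHA-1 repos)."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def object_key(prefix: str, oid: str) -> str:
    """Hypothetical bucket key, mirroring git's objects/ab/cdef... layout."""
    return f"{prefix}/objects/{oid[:2]}/{oid[2:]}"


oid = git_blob_id(b"hello world\n")
print(oid)                          # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(object_key("myrepo.git", oid))
```

The printed ID matches what `echo 'hello world' | git hash-object --stdin` reports, so the same content always lands on the same key no matter which client uploads it.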

ctoth - 3 days ago

Came here for a five-gallon bucket hooked to Dulwich (archiving rain?). Slightly disappointed :)

Go Git and Dulwich and friends are indeed fun tech.

StarlaAtNight - 2 days ago

I’m not sure I understand what pain point this solves. What’s the value of doing this? Is it at a certain scale?

supriyo-biswas - 2 days ago

Great work, though I wish some of this work could be upstreamed to Gitea instead?

znnajdla - 2 days ago

This was really thought provoking — it made me realize that Git just happens to use a filesystem for persistence, but doesn’t necessarily have to. A POSIX filesystem might not even be the best way to store a git repo. Makes me wonder: what else could speak Git + POSIX? Redis? Postgres? IPFS is a fun one — it’s already content addressed.
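
As a rough illustration of that point, the Python sketch below stores git-style objects in anything that behaves like a key-value mapping. A plain `dict` stands in for the backend here; a Redis, Postgres, or S3 wrapper exposing the same mapping interface could in principle be dropped in. `ObjectStore` and its methods are hypothetical names, and the sketch covers only loose objects, not packfiles, refs, or locking.

```python
import hashlib
import zlib
from collections.abc import MutableMapping


class ObjectStore:
    """Content-addressed store for git-style objects over any mapping."""

    def __init__(self, backend: MutableMapping):
        self.backend = backend  # oid -> zlib-compressed object bytes

    def put(self, kind: str, body: bytes) -> str:
        # Same framing git uses for loose objects: "<type> <size>\0<body>".
        raw = kind.encode() + b" " + str(len(body)).encode() + b"\x00" + body
        oid = hashlib.sha1(raw).hexdigest()
        # Writing the same content twice is a no-op: the key is the hash.
        self.backend.setdefault(oid, zlib.compress(raw))
        return oid

    def get(self, oid: str):
        raw = zlib.decompress(self.backend[oid])
        header, body = raw.split(b"\x00", 1)
        kind, size = header.split(b" ")
        if int(size) != len(body):
            raise ValueError(f"corrupt object {oid}")
        return kind.decode(), body


store = ObjectStore({})
oid = store.put("blob", b"hello world\n")
print(oid, store.get(oid))
```

Because objects are immutable and keyed by their hash, the backend only needs get and put-if-absent for this part. The parts that still want filesystem-like guarantees are the mutable ones, such as refs, which need some form of atomic update.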