Show HN: I wrote a full text search engine in Go

github.com

126 points by novocayn 4 days ago


Xeoncross - 3 days ago

I really liked the README, that was a good use of AI.

If you're interested in the idea of writing a database, I recommend you check out https://github.com/thomasjungblut/go-sstables which includes sstables, a skiplist, a recordio format and other database building blocks like a write-ahead log.

Also https://github.com/BurntSushi/fst, which has a great blog post explaining its compression (and has been ported to Go). It is really helpful for autocomplete/typeahead when recommending searches to users or doing spelling correction for search inputs.

kaycey2022 - 3 days ago

I don't care that you vibe coded it. Run some benchmarks on it to show how it compares to other stuff.

We are soon entering the territory of "no one cares if you did it, but can you say something interesting?". "I created X software" is soon leaving the ranks of cool stuff.

eudoxus - 3 days ago

Would love to hear how this compares to another popular Go-based full text search engine (with a not too dissimilar name) https://github.com/blevesearch/bleve?

Copenjin - 3 days ago

Did you vibe code this? A few things here and there are a bit of a giveaway imho.

efilife - 2 days ago

You are avoiding the question of whether this was vibe coded or not. I see that almost every single project of yours was vibe coded down to the readmes. Why hide this?

wolfgarbe - 3 days ago

Can the index size exceed the RAM size (e.g., via memory mapping), or are index size and document number limited by RAM size? It would be good to mention those limitations in the README.

kdawkins - 4 days ago

This is very cool! Your readme is interesting and well written - I didn't know I could be so interested in the internals of a full text search engine :)

What was the motivation to kick this project off? Was it for learning, or are you using it somehow?

add-sub-mul-div - 4 days ago

Why did you create this new account if there are already 3 existing accounts promoting your stuff and only your stuff?

mwsherman - 3 days ago

Shameless plug, you may wish to do Lucene-style tokenizing using the Unicode standard: https://github.com/clipperhouse/uax29/tree/master/words

n_u - 4 days ago

Cool project!

I see you are using a positional index rather than doing bi-word matching to support positional queries.

Positional indexes can be a lot larger than non-positional ones. What is the ratio of the size of all documents to the size of the positional inverted index?
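
To make the trade-off concrete, here is a minimal sketch of both approaches. It is not the project's actual code: the type names `PositionalIndex` and `BiwordIndex`, the whitespace tokenizer, and the sample documents are all assumptions made for illustration. The positional index stores one entry per token occurrence (which is why it grows with the corpus), while the bi-word index stores only adjacent pairs and can report false positives on phrases longer than two words.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// PositionalIndex maps term -> document ID -> ascending token positions.
// Hypothetical type, not taken from the project under discussion.
type PositionalIndex map[string]map[int][]int

// Add tokenizes on whitespace (a simplification) and records every position.
func (idx PositionalIndex) Add(docID int, text string) {
	for pos, tok := range strings.Fields(strings.ToLower(text)) {
		if idx[tok] == nil {
			idx[tok] = make(map[int][]int)
		}
		idx[tok][docID] = append(idx[tok][docID], pos)
	}
}

// Phrase returns the IDs of documents that contain the (lowercase) terms
// at consecutive positions.
func (idx PositionalIndex) Phrase(terms ...string) []int {
	if len(terms) == 0 {
		return nil
	}
	var hits []int
	for docID, starts := range idx[terms[0]] {
		for _, start := range starts {
			if idx.matchesAt(docID, start, terms) {
				hits = append(hits, docID)
				break
			}
		}
	}
	sort.Ints(hits)
	return hits
}

// matchesAt checks that terms[i] occurs at position start+i for every i > 0.
func (idx PositionalIndex) matchesAt(docID, start int, terms []string) bool {
	for i := 1; i < len(terms); i++ {
		positions := idx[terms[i]][docID]
		j := sort.SearchInts(positions, start+i)
		if j == len(positions) || positions[j] != start+i {
			return false
		}
	}
	return true
}

// BiwordIndex maps "term1 term2" -> set of document IDs containing that pair.
type BiwordIndex map[string]map[int]bool

// Add records every adjacent pair of tokens in the document.
func (idx BiwordIndex) Add(docID int, text string) {
	toks := strings.Fields(strings.ToLower(text))
	for i := 0; i+1 < len(toks); i++ {
		key := toks[i] + " " + toks[i+1]
		if idx[key] == nil {
			idx[key] = make(map[int]bool)
		}
		idx[key][docID] = true
	}
}

// Phrase returns documents that contain every adjacent pair of the phrase.
// The pairs need not be contiguous, so longer phrases can match falsely.
func (idx BiwordIndex) Phrase(terms ...string) []int {
	if len(terms) < 2 {
		return nil
	}
	var hits []int
	for docID := range idx[terms[0]+" "+terms[1]] {
		ok := true
		for i := 1; i+1 < len(terms); i++ {
			if !idx[terms[i]+" "+terms[i+1]][docID] {
				ok = false
				break
			}
		}
		if ok {
			hits = append(hits, docID)
		}
	}
	sort.Ints(hits)
	return hits
}

func main() {
	pos, bi := PositionalIndex{}, BiwordIndex{}
	docs := map[int]string{
		1: "the quick brown fox",
		2: "brown fox met quick brown bear",
	}
	for id, text := range docs {
		pos.Add(id, text)
		bi.Add(id, text)
	}
	fmt.Println(pos.Phrase("quick", "brown", "fox")) // [1]
	fmt.Println(bi.Phrase("quick", "brown", "fox"))  // [1 2]
}
```

Document 2 contains both "quick brown" and "brown fox" but never the full phrase, so only the positional index rejects it. A production engine would typically compress the position lists (for example with delta encoding), which this sketch does not attempt.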

atrettel - 3 days ago

This is pretty interesting.

Could you explain more why you avoided parsing strings to build queries? Strings as queries are pretty standard for search engines. Yes, strings require you to write an interpreter/parser, but the power in many search engines comes from being able to create a query language to handle really complicated and specific queries.
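
For a sense of how small a string front end can start out, here is a rough sketch of a recursive-descent parser that turns a query string into the same kind of tree a builder API would construct by hand. Everything in it is hypothetical (the `Query`, `Term`, `And`, and `Or` types, the `Parse` function, and the grammar itself); it is not the project's API. The assumed grammar has bare terms joined by the uppercase keywords `AND` and `OR`, with `AND` binding tighter and no parentheses or phrases.

```go
package main

import (
	"fmt"
	"strings"
)

// Query is a node in a hypothetical query tree. Match reports whether a
// document, represented as a set of lowercase terms, satisfies the query.
type Query interface {
	Match(terms map[string]bool) bool
}

type Term string
type And struct{ Left, Right Query }
type Or struct{ Left, Right Query }

func (t Term) Match(terms map[string]bool) bool { return terms[string(t)] }
func (a And) Match(terms map[string]bool) bool {
	return a.Left.Match(terms) && a.Right.Match(terms)
}
func (o Or) Match(terms map[string]bool) bool {
	return o.Left.Match(terms) || o.Right.Match(terms)
}

// Parse builds a query tree from a string such as "go AND search OR lucene".
func Parse(input string) (Query, error) {
	q, rest, err := parseOr(strings.Fields(input))
	if err != nil {
		return nil, err
	}
	if len(rest) > 0 {
		return nil, fmt.Errorf("unexpected token %q", rest[0])
	}
	return q, nil
}

// parseOr handles: andExpr ("OR" andExpr)*
func parseOr(toks []string) (Query, []string, error) {
	left, toks, err := parseAnd(toks)
	if err != nil {
		return nil, nil, err
	}
	for len(toks) > 0 && toks[0] == "OR" {
		var right Query
		right, toks, err = parseAnd(toks[1:])
		if err != nil {
			return nil, nil, err
		}
		left = Or{Left: left, Right: right}
	}
	return left, toks, nil
}

// parseAnd handles: term ("AND" term)*
func parseAnd(toks []string) (Query, []string, error) {
	left, toks, err := parseTerm(toks)
	if err != nil {
		return nil, nil, err
	}
	for len(toks) > 0 && toks[0] == "AND" {
		var right Query
		right, toks, err = parseTerm(toks[1:])
		if err != nil {
			return nil, nil, err
		}
		left = And{Left: left, Right: right}
	}
	return left, toks, nil
}

// parseTerm consumes one token that is not an operator keyword.
func parseTerm(toks []string) (Query, []string, error) {
	if len(toks) == 0 {
		return nil, nil, fmt.Errorf("expected a term, got end of input")
	}
	if toks[0] == "AND" || toks[0] == "OR" {
		return nil, nil, fmt.Errorf("expected a term, got %q", toks[0])
	}
	return Term(strings.ToLower(toks[0])), toks[1:], nil
}

func main() {
	q, err := Parse("go AND search OR lucene")
	if err != nil {
		panic(err)
	}
	// The same tree, built by hand the way a builder-style API would:
	built := Or{Left: And{Left: Term("go"), Right: Term("search")}, Right: Term("lucene")}

	doc := map[string]bool{"go": true, "search": true}
	fmt.Println(q.Match(doc), built.Match(doc)) // true true
}
```

The parser is a thin layer over the tree types, so a string syntax and a builder API are not mutually exclusive. The real cost of a query language shows up later, in quoting, escaping, error messages, and precedence rules that users have to learn.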

cursedpikachu - 2 days ago

This is good for someone playing around with Go and data structures with vibe coding, but I just hope HN doesn't get flooded with vibe coded toy projects.

pstuart - 3 days ago

You'll need to license it if you want others to consider using it.

wolfgarbe - 4 days ago

Great work! Would be interesting to see how it compares to Lucene performance-wise, e.g. with a benchmark like https://github.com/quickwit-oss/search-benchmark-game

oldgregg - 3 days ago

Looks great! Would love to see a benchmark against bleve and a lightweight vector implementation.

coolThingsFirst - 3 days ago

AI slop.