Today I learned that bash has hashmaps (2024)

xeiaso.net

169 points by stefankuehnel 5 days ago


teddyh - 2 days ago

It’s amazing how much of a superpower merely reading the manual is nowadays.

<https://www.gnu.org/software/bash/manual/bash.html#Arrays>

syntacticbs - 2 days ago

Oh dear. I've been trying to get people to not use this feature for a while.

One thing that has bitten me in the past is that, if you declare your associative arrays within a function, that associative array is ALWAYS global. Even if you declare it with `local -A` it will still be global. Despite that, you cannot pass an associative array to a function by value. I say "by value" because while you can't call `foo ${associative_array}` and pick it up in foo with `local arg=$1`, you can pass it "by reference" with `foo associative_array` and pick it up in foo with `local -n arg=$1`, but if you give the passed-in dereferenced variable a name that is already used in the global scope, it will blow up, e.g. `local -n associative_array=$1`.
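
A minimal sketch of the "by reference" approach described above, assuming bash 4.3 or later for `local -n`. The names `print_ports` and `_ports_ref` are hypothetical; the nameref is deliberately given a name the caller is unlikely to use, to sidestep the collision mentioned above.

```shell
#!/usr/bin/env bash
# Sketch: pass an associative array "by reference" using a nameref (bash 4.3+).
# print_ports and _ports_ref are hypothetical names chosen for this example.

print_ports() {
  # Pick a nameref name that is unlikely to match the caller's variable name.
  local -n _ports_ref=$1
  local name
  for name in "${!_ports_ref[@]}"; do
    printf '%s=%s\n' "$name" "${_ports_ref[$name]}"
  done | LC_ALL=C sort   # key order is unspecified, so sort for stable output
}

declare -A ports=([http]=80 [https]=443)
print_ports ports
```

The function receives only the array's name, so the caller passes `ports`, not `"${ports[@]}"`.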

As a general rule for myself when writing bash, if I think one of my colleagues who has a passable knowledge of bash will need to get the man pages out to figure out what my code is doing, the bash-fu is too strong and it needs to be dumbed down or rewritten in another language. Associative arrays almost always hit this bar.

aunderscored - 5 days ago

Not only do they exist, but they have some fantastic foot guns! https://mywiki.wooledge.org/BashPitfalls#A.5B.5B_-v_hash.5B....

somat - 2 days ago

I love shell. I think its killer feature is pipes, and whoever figured out how to design it so you can pipe in and out of control structures (Doug McIlroy?) is a goddamn genius. However, after writing one too many overly clever shell scripts, I have a very clearly delineated point at which the script has become too complex and it is time to rewrite in a language better suited for the task. And that point is when I need associative arrays.

A lot of the sins of shell are due to its primary focus as an interactive language. Many of those features that make it so nice interactively really hurt it as a scripting language.
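
To illustrate piping in and out of a control structure, here is a small sketch in which a `while` loop sits in the middle of a pipeline. The function name `shout_sorted` and the input words are invented, and `${word^^}` assumes bash 4 or later.

```shell
#!/usr/bin/env bash
# Sketch: a while loop that reads from a pipe and feeds another pipe.
# shout_sorted and the sample words are made up for this example.

shout_sorted() {
  printf '%s\n' pear apple fig |
    while IFS= read -r word; do
      # Each input line arrives on the loop's stdin; whatever the body
      # prints flows on to the next stage of the pipeline.
      printf '%s\n' "${word^^}"
    done |
    LC_ALL=C sort
}

shout_sorted
```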

DSpinellis - 2 days ago

I advocate the following rules for when to write and when not to write a shell script.

# Write a shell script:

* Heavy lifting done by powerful tools (sort, grep, curl, git, sed, find, …)

* Script will glue diverse tools

* Workflow resembles a pipeline

* Steps can be interactively developed as shell commands

* Portability

* Avoid dependency hell

* One-off job

# Avoid shell scripting:

* Difficult to see the preceding patterns

* Hot loops

* Complex arithmetic / data structures / parameters / error handling

* Mostly binary data

* Large code body (> 500 LoC)

* Need a user interface

A need for associative arrays (implemented in Bash via hashmaps) moves the task to the second category (avoid shell scripting).

DiabloD3 - 2 days ago

No one else seems to have mentioned this, but POSIX sh does not include this feature.

Only once it does, and major POSIX shells have shipped with it for a decade, will the feature actually exist in a way that the average shell coder cares about. You're better off just shipping Rust/Go/etc. binaries and leaving sh for your simple glue tasks.

Even I've eventually switched to this, and I've written Bash scripts 10k+ lines long that followed best practices and passed shellcheck's more paranoid optional tests.

Use the right language for the right job.

adityaathalye - 16 hours ago

Bash Associative Arrays [1] are handy! Some examples of how I've used them:

- my site builder (for evalapply.org): inject metadata into page templates. e.g. https://github.com/adityaathalye/shite/blob/b4163b566f0708fd...

- oxo game (tic-tac-toe): reverse index lookup table for board positions: https://github.com/adityaathalye/oxo/blob/7681e75edaeec5aa1f...

- personal machine setup: associate name of installed application to its apt source name, so we can check for the app, and then install package https://github.com/adityaathalye/bash-toolkit/blob/f856edd30...

[1] I'd say "hashmap" is acceptable, colloquially. However, I don't think Bash computes hashes of the keys.


nielsbot - 2 days ago

Nice that these exist—but does anyone else absolutely abhor shell programming? The syntax is impossible to memorize, it’s incredibly easy to make mistakes, and debugging is a pain. I hate it more than C++ and AppleScript.

lars512 - 2 days ago

Unfortunately, macOS ships an earlier version of bash that does not include associative arrays, so they’re not as portable as you might like.
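
One way a script might guard against this is to check the major version before touching `declare -A`, since associative arrays arrived in bash 4.0. This is only a sketch; `has_assoc_arrays` is a hypothetical helper name and the sample array is invented.

```shell
#!/usr/bin/env bash
# Sketch: avoid associative arrays when the running bash predates 4.0.
# has_assoc_arrays is a hypothetical helper name.

has_assoc_arrays() {
  # BASH_VERSINFO[0] holds the major version of the running bash.
  (( ${BASH_VERSINFO[0]:-0} >= 4 ))
}

if has_assoc_arrays; then
  declare -A colors=([sky]=blue)
  echo "sky is ${colors[sky]}"
else
  echo "bash ${BASH_VERSION:-unknown} has no associative arrays" >&2
fi
```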

dghf - 2 days ago

Picky and probably pointless question: are they actually hashmaps? If I understand correctly, a hashmap isn’t the only way to implement an associative array.

oniony - 5 days ago

A couple of weeks ago I learnt that Bash on Mac does not have associative arrays. We worked around the issue by changing the script to run under Zsh, but beware.

PeterWhittaker - 2 days ago

bash associative arrays are fantastic, I've made heavy use of them, but be warned that there have been several memory leaks in the implementation. (Sorry, no version notes, once we realized this, we rewrote a key component in C.)

IIRC, the latest bash addresses all of these, but that doesn't help too much if you are stuck with an older, stable OS, e.g., RHEL 7 or 8; even 9 likely has a few remaining.

These leaks become an issue if you have a long running bash script with frequent adds and updates. The update leak can be mitigated somewhat by calling unset, e.g., unset a[b], before updating, but only partially (again, apologies, no notes, just the memory of the discovery and the need to drop bash).
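
For reference, the unset-before-update pattern could look like the sketch below. Whether it helps with memory at all depends on the bash version in use; the code only shows the mechanics. `set_entry` and `counters` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: the unset-before-update mitigation described above.
# set_entry and the counters array are hypothetical names.

declare -A counters

set_entry() {
  local key=$1 value=$2
  # Single-quote the argument so the shell does not try to glob counters[...].
  unset 'counters[$key]'
  counters[$key]=$value
}

set_entry hits 1
set_entry hits 2
echo "${counters[hits]}"
```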

I'd agree with the idea that bash wasn't the best choice for this purpose in the first place, but there was history and other technical debt involved.

biorach - 2 days ago

Every few years I rediscover this fact and every few years I do my best to forget it

rednafi - 2 days ago

I discover stuff like this every day, and it’s delightful. Sure, reading the manual would’ve saved me from the surprise, but it’s incredibly difficult for me to read them unless I have a specific goal in hand.

I found out about hashmaps in Bash a year ago[1], and it came as a surprise. This morning, I came across dynamic shell variables and the indirection syntax[2]. Each time I wrote about them and learned more than I would have if I had just passively grokked the manual.

[1]: https://rednafi.com/misc/associative_arrays_in_bash/

[2]: https://rednafi.com/misc/dynamic_shell_variables/
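
As a rough illustration of the indirection syntax, the sketch below builds a variable name at runtime and expands it with `${!name}`. The variable names and URLs are invented for the example.

```shell
#!/usr/bin/env bash
# Sketch: indirect expansion, where one variable holds another variable's name.
# All names and URLs here are made up for this example.

staging_url="https://staging.example.invalid"
production_url="https://www.example.invalid"

env_name=staging
var_name="${env_name}_url"

# ${!var_name} expands to the value of the variable whose name is in var_name.
echo "${!var_name}"
```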

pie_flavor - 2 days ago

I'm guilty of this. I knew zsh had them but since I can never remember the exact things zsh has that bash doesn't, I just assume anything remotely useful isn't compatible.

This policy comes from a six hour debugging session involving (somewhere) a script that manipulated a binary file - bash can't store zero bytes in variables and zsh can, but it's not like it'll tell you that it's pointlessly truncating your data and I never thought to look. So now every step in that script gets converted back and forth through xxd, and I don't trust Bash anymore.
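
A small sketch of the problem and the hex workaround. It uses `od` and `tr` as a stand-in for the xxd round trip mentioned above, which is an assumption made so the example relies only on standard tools.

```shell
#!/usr/bin/env bash
# Sketch: a NUL byte does not survive in a bash variable, but hex text does.
# od and tr stand in for the xxd round trip mentioned above (an assumption).

# Three bytes go in: 'a', NUL, 'b'. Recent bash versions print a warning here.
raw=$(printf 'a\0b')
echo "characters kept in variable: ${#raw}"

# Hex-encode before storing, so the variable only ever holds plain text.
hex=$(printf 'a\0b' | od -An -v -tx1 | tr -d ' \n')
echo "hex form: $hex"
```

The variable ends up with fewer characters than the three bytes written, while the hex string keeps all of them.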

bluedino - a day ago

Got burned on this on an interview question before.

In a related question I said something about switching to a language like Python when a script started to get "complicated".

Then the interviewer explained how his favorite language was bash, how he uses it for everything...

I did not get the job. Ironically, at my next job I did a bunch of Perl to Bash conversion.

nickjj - 2 days ago

One thing the article doesn't mention is how you can use indirect expansion to get a list of all keys.

For example: `${!myvar[@]}` would list all of the keys.

I've written about associative arrays in Bash a bit here: https://nickjanetakis.com/blog/associative-arrays-in-bash-ak...
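
A short sketch of key listing in practice, with made-up data. `list_capitals` is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Sketch: list keys with ${!map[@]} and look up each value by key.
# list_capitals and the capitals array are invented sample names and data.

declare -A capitals=([France]=Paris [Japan]=Tokyo)

list_capitals() {
  local country
  for country in "${!capitals[@]}"; do
    printf '%s -> %s\n' "$country" "${capitals[$country]}"
  done | LC_ALL=C sort   # iteration order is unspecified, so sort for stable output
}

list_capitals
```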

rixed - a day ago

When you feel the need for a hash in shell, or even an array, is also probably when you should rewrite your shell script in Python.

(I feel obliged to add that when you feel the need to add type annotations a bit later is when to abandon Python.)

lervag - 2 days ago

Note that other shells also have associative arrays, or at least zsh does. I've found hyperpolyglot [0] to be a nice Rosetta stone for translating syntax between e.g. bash and zsh.

[0]: https://hyperpolyglot.org/unix-shells#associative-arrays

harel - a day ago

Before I learned about SSH identities, I used to have a shell script sourced into my bashrc with a hashmap containing all my hosts and full connect string, so that I could `ssh hostname` like one does with SSH identities.
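
A setup like that might look roughly like the sketch below. Every alias and address is invented, `connect` is a hypothetical wrapper name, and the function only prints the command it would run.

```shell
#!/usr/bin/env bash
# Sketch of a host alias table; every alias and address below is invented.
declare -A ssh_hosts=(
  [web]="deploy@web.example.invalid -p 2222"
  [db]="admin@db.example.invalid"
)

# connect is a hypothetical wrapper: look the alias up, fall back to the raw argument.
connect() {
  local target=${ssh_hosts[$1]:-$1}
  # Left unquoted on purpose so "user@host -p 2222" splits into separate arguments.
  # Prints the command instead of running it; drop the echo to connect for real.
  echo ssh $target
}

connect web
connect other.example.invalid
```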

evgpbfhnr - 2 days ago

Don't loop on values with `*`; the difference between keys and values is the lack of `!` at the start of the expression; the `*` and `@` rules are the same as for `$@` and `$*`, and you almost never want `*`.
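
A quick sketch of the difference, using made-up values that contain spaces. `count_words` is a hypothetical helper that just reports how many arguments it received.

```shell
#!/usr/bin/env bash
# Sketch: "${map[@]}" yields one word per value, "${map[*]}" joins them into one.
# The sample data is made up; note that each value contains a space.

declare -A titles=([a]="Dr Who" [b]="Mr Bean")

count_words() { echo "$#"; }

count_words "${titles[@]}"   # one argument per value
count_words "${titles[*]}"   # all values joined into a single argument
count_words ${titles[@]}     # unquoted: values are split on whitespace
```

With two values that each contain one space and the default `IFS`, the three calls report 2, 1 and 4 arguments.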

agnishom - 2 days ago

> Q: How do I declare a Hashmap?

> A: You use the command `declare -A HASHMAP_NAME`

This is why I think Bash is a horrible language
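
For context, here is a minimal sketch of what follows the `declare -A` line in practice, assuming bash 4.0 or later. The array name and contents are invented.

```shell
#!/usr/bin/env bash
# Sketch: declare, fill, read and test an associative array (bash 4.0+).
# The array name and its contents are invented for illustration.

declare -A extensions
extensions[py]=Python
extensions[rs]=Rust

echo "${extensions[py]}"      # look up one key
echo "${#extensions[@]}"      # number of entries

# Membership test using the ${var+word} expansion.
if [[ -n ${extensions[go]+set} ]]; then
  echo "go is known"
else
  echo "go is missing"
fi
```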

ssahoo - a day ago

With that kind of syntax, wouldn't you just use Perl or Python instead?

oweiler - 2 days ago

They are also slow AF because a lookup takes linear time.

globular-toast - a day ago

Where are people learning that "hashmap" means associative array? It's obviously an easier word so I can see the natural language preferring that. Is this common? I can see it causing some communication problems when the details are important.

wwoessi - 2 days ago

bashmaps?

faragon - 2 days ago

LZ77 compression/decompression in pure Bash using hashmaps:

https://github.com/faragon/lzb/blob/master/lzb

j45 - 2 days ago

For some tasks, if as much as possible were coded in bash, it could be called from anywhere, by any programming language.

Now to add hashtables to that.

crabbone - 2 days ago

To me, this is a development in the wrong direction. Shell is great precisely because it's so minimal. The "everything is a string" rule is one that calms all of your type-induced fears.

Having to implement hash tables while still keeping up the appearance of everything being a string is the anti-pattern known as "string programming" (i.e. when an application develops a convention for storing type or structure information inside unstructured data, often strings).

I would love for there to be a different system shell with a small number of built-in types that include some type for constructing complex types. There are many languages that have that kind of type system: Erlang, for example. But, extending Unix Shell in this way is not going to accomplish this.