Losing language features: some stories about disjoint unions

graydon2.dreamwidth.org

111 points by Bogdanp 4 days ago

Back in the day, when memory wasn't as cheap as it is now, there was a strong belief that forcing the user to "waste bits" on a proper sum type was a non-starter for a "real" language. It was widely assumed that the reason you were "sharing memory" between two fields was to conserve space, because you were clever enough to have recognized that they couldn't both be used at the same time. But doing so, it was generally assumed, meant that you were space constrained and so anything that took away your precious savings was bad.

I'm not saying this was "right" in any sense, but it wasn't just foolish old timers not recognizing that a "better" solution was possible. When you grew up having every single bit of memory threaded by hand (and costing macroscopic amounts of money), you think about memory efficiency differently.

adrian_b - 10 hours ago

This kind of unsafe union was already present in the first FORTRAN version from 1956, as the EQUIVALENCE statement.
Presumably this was inspired by existing practices in assembly-language programs.
Many early programming languages followed FORTRAN's example.
In October 1964, John McCarthy (who not only led the development of the first LISP but was also a major contributor to ALGOL) proposed the keyword "union" and advocated safer implementations of disjoint union types. The first such implementation was in ALGOL 68, a few years later.
Unfortunately, the C language took the "union" keyword from ALGOL 68, but instead of implementing proper disjoint union types, it made the C "union" essentially identical to the FORTRAN EQUIVALENCE from 1956.
The "variant record" of Pascal was not really better, so it can also be counted as a failure to implement proper union types.
For a long time Pascal and C were among the most popular programming languages, creating bad habits in the use of unions.
- nyrikki - 5 hours ago
  
  > Unfortunately, the C language took the "union" keyword from ALGOL 68, but instead of implementing proper disjoint union types, it made the C "union" essentially identical to the FORTRAN EQUIVALENCE from 1956.
  On many word-addressed machines, the PDP-7 in particular, only the instruction determined how a word was interpreted: ADD was one's complement and TAD was two's complement.
  Remember we only got B/C because the PDP-7 didn't have enough RAM to implement FORTRAN.
  For similar reasons C's switch/case falls through: the PDP-7 SAD instruction forced it. Then they abused that fall-through to support lower case on the PDP-11, which would have been powerful enough for the type of union you are talking about.
  Midas/Macro assembly is tangentially related, but it is really a side effect of word-addressable accumulator/index machines.
  IIRC, Lisp is a good example for the difference between equality by value, reference, or even predicated equality.
  If you want to think about just how limited the PDP-7 was look at the instructions
  www.bitsavers.org/pdf/dec/pdp7/PDP7_Instruction_list.pdf
- pklausler - 6 hours ago
  
  `EQUIVALENCE` was (and is) a storage overlay of variables. It's not part of the type system, and one cannot define a derived type in standard Fortran that contains components with overlaid storage.
  - adrian_b - an hour ago
    
    While that is right, what you could do with EQUIVALENCE was pretty much equivalent to what you can do with "union" in C. For some things FORTRAN was more verbose, for other things C is more verbose.
    Accessing a value was more laconic in FORTRAN, because you omitted the union name and the "." that are required in C.
    The fact that a union is considered a type in C allows the use of a typedef, which removes the need to repeat the definitions for multiple variables, but that is about the only help that you get in C.
    In both C and FORTRAN you must know the current type of the stored value, and there is no protection against using the wrong type. If the current type cannot be determined statically, you must manually manage another variable storing a type tag, in both languages.
    
    pklausler - 15 minutes ago
    
    EQUIVALENCE is useful even when the types are identical.
  - MarkusQ - an hour ago
    
    Modern type systems grew out of the problems caused by such storage hacks (and the general issues of having a bunch of bits that have to be consistently interpreted in some way). Kvetching about the type system in FORTRAN is like complaining about the Wright Flyer's lack of overhead storage bins.
cbsmith - 19 hours ago

Yeah, I remember using untagged unions because the code "knew" which type was the right one to use at the right time and you wanted to save space (and wanted your structures to fit well in registers, and later cache lines).
- setr - 17 hours ago
  
  I suppose that’s what COBOL copybooks are; untagged unions, everywhere, all the time, all at once
rwmj - 10 hours ago

struct page in Linux is this taken to its logical conclusion.
https://github.com/torvalds/linux/blob/89be9a83ccf1f88522317...
Edit: A bit of background: https://lwn.net/Articles/565097/
- tux3 - 10 hours ago
  
  Fun fact: With the advent of folios, struct page is starting to undergo some major changes, with the eventual goal of shrinking it into a single 64-bit number.
  But not to worry, the unreadable mess of C unions is not going away. struct folio will eventually absorb all those fields, and more. The only difference is there's a single folio for a whole set of pages, so moving the data there will save a significant amount of memory.
  https://github.com/torvalds/linux/blob/89be9a83ccf1f88522317...
NooneAtAll3 - 15 hours ago

Looking at VSCode and browsers eating RAM as if it's nothing makes me think the modern approach isn't that good either
- eru - 14 hours ago
  
  Maybe, but it's not sum types that are to blame.

jchw - 19 hours ago

Lack of sum types is probably one of the worst things about working in Go, and I think it is a much bigger problem than the lack of generics ever was. Sadly, though, I don't think you can really just bolt sum types onto an already complete programming language design.

jerf - 6 hours ago
The major problem with "bolting on" sum types to Go is that it is way too redundant with interfaces and the interactions get weird, because this already works:
```
    type Color interface {
        isColor()
    }

    type Red struct {}
    func (r Red) isColor() {}

    type Green struct {}
    func (g Green) isColor() {}

    type RGB struct {
        Red byte
        Green byte
        Blue byte
    }
    func (r RGB) isColor() {}

    func PrintColor(color Color, text string) {
         switch c := color.(type) {
         case Red:
             // print in red here
         case Green:
             // print in green here
         case RGB:
             // handle the RGB; c has type RGB here
             _ = c // placeholder use: Go rejects a type-switch variable that no case uses
         }
    }
```
This doesn't prevent the zero value of a Color being 'nil' but it does make "a value of type Color" effectively a sum type because you can't do anything useful to it without unpacking it in a type-aware way. There is no way in Go to have a Red but act like you have an RGB, values always have their types.
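(There is a common misconception among Go programmers that this doesn't seal the type because you can implement an "isColor" method on your own types in other packages, but it won't work: an unexported method name belongs to the package that declares it, so a same-named method elsewhere is a different method. Try it if you want to be sure.)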
One could argue they've effectively already been bolted on, all that is missing is some syntax gloss like
```
    sum Color {
        Red{}
        Green{}
        RGB{
            Red   byte
            Green byte
            Blue  byte
        }
    }
```
that would simply desugar to the above.
- jchw - 5 hours ago
  
  Yes, this is all true. However, this approach has more downsides than just nil and cumbersome syntax imo. It takes more memory and pointer-chasing than "proper" sum types would require, and while you can "seal" the type, there's no exhaustiveness checking.
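  To make the missing exhaustiveness check concrete, here is a minimal sketch (not from the thread; `Describe` and the `R`, `G`, `B` field names are made up for illustration). The type switch leaves out the RGB case, and as far as I know the Go compiler accepts that without complaint, so the gap only shows up at run time:
```go
package main

import "fmt"

// Color is sealed in the sense described above: isColor is unexported,
// so only this package can add variants.
type Color interface{ isColor() }

type Red struct{}
type Green struct{}
type RGB struct{ R, G, B byte }

func (Red) isColor()   {}
func (Green) isColor() {}
func (RGB) isColor()   {}

// Describe forgets the RGB case. Nothing forces the switch to be complete.
func Describe(c Color) string {
	switch c.(type) {
	case Red:
		return "red"
	case Green:
		return "green"
	default:
		// Reached for RGB, and also for a nil Color.
		return "unhandled"
	}
}

func main() {
	fmt.Println(Describe(Red{}))
	fmt.Println(Describe(RGB{1, 2, 3}))
	fmt.Println(Describe(nil))
}
```
  The usual mitigation is a `default` branch that panics or returns an error, or an external linter; neither is part of the language itself.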
andrewflnr - 16 hours ago

I wonder if a version of OCaml that had a better concurrency story 10-15 years ago could have taken Go's place.
- dvdkon - 12 hours ago
  
  Sadly I don't think so, just because it looks "too functional".
  If comments on the web are anything to go by, being familiar to C and JS programmers is one of the main reasons for Go's success. I think it has plenty of its own specifics, it's not like e.g. C# which really did start as a straightforward Java clone, but OCaml has even more differences.
- pjmlp - 10 hours ago
  
  Go only enjoys widespread adoption thanks to the success of Docker and Kubernetes in the industry. Had that not been the case, we would be talking about it just like we talk about Limbo and Oberon-2, its influences.
  The kind of folks that dropped Python and Java for Go, would not have picked up OCaml even if it had a better concurrency story 10-15 years ago.
  - andrewflnr - 6 hours ago
    
    I guess part of the hypothetical is, "a version of OCaml supported and pushed by Google". But the same idea probably applies: Rob Pike et al (he was involved right?) would probably never go for a functional language, even a relatively pragmatic one.
    
    pjmlp - 6 hours ago
    
    That would be Kotlin, Google's C#, given how history goes.
    It has enough ML influence on it, as a simplified Scala, and also can compile to native in various ways.
    On the other hand you will notice that the only things the Android team cared to use Go for were the Soong build system and parts of the GPU debugger.
- noelwelsh - 12 hours ago
  
  I believe it could have, and OCaml didn't because of some bad predictions by the core team (when asked about better concurrency, the response was along the lines that we'll all have tens of CPUs soon, so better in process support wasn't necessary), and the restrictions of academia (there is little value given to engineering work).
- jchw - 14 hours ago
  
  Maybe, but I think focusing on concurrency is the wrong idea.
  I think Go + sum types could be good. Maybe. But, honestly, it's hard to say for sure. First-order effects are great: we have sum types and can use them to model problems. Second-order effects get muddy: We have sum types and they are the ideal solution for a bunch of problems, but without other features they're awkward to use. For example... now you can do a Result type... but if you want to return multiple values in a Result, you need a set/tuple type too. If you do that, is Go's multiple return values concept still a good idea? I could probably go on but hopefully this illustrates the point.
  I think a lot of people don't acknowledge why Go is truly so successful. Well OK, first elephant in the room, it's obviously successful because it's backed by Google, a company who can and did throw immense resources at making the implementation and standard library pretty damn good as well as helping to push it, but that alone wouldn't have propelled it to where it is today (I mean, compare it to Dart.)
  But beyond that, Go is successful because Go code is very simple to write and leaves the programmer with relatively few things to be concerned about, and yet in spite of that, Go code generally runs reasonably fast, uses relatively small amounts of memory and is generally not very crash-prone. (Some people would happily debate that, but I trust software like restic, esbuild, rclone and Syncthing every day without fail, among other smaller Go programs. I'm OK with making that assertion.)
  If you put in the effort to make really good Rust code, the effort is not wasted, but it is a lot of effort when the stupid Go code often does the trick. Consider Discord's presence service: they switched from Go to Rust and massively cut costs and improved latency. Rust wins, Rust better? But... most services will never scale to the point where marginal improvements in latency, throughput or RAM are going to be worth a lot of man-hours worth of programming and ongoing maintenance. Usually throwing a couple more VMs at the problem is just going to be the path of lesser resistance. This was always the case when comparing lower-level to higher-level, but Go amplifies it because the performance gap isn't as big, but the complexity gap remains very large or maybe gets larger.
  Is writing Rust code really that hard? No, not at all, it's more that writing good Go code is so relatively easy, the language is simple and the standard library is loaded with useful tools. Go's CLI flag parser is pretty unimpressive, but so many projects just need a very basic flag parser and it works totally fine for that, you just don't need to overthink it 99.99% of the time. Same for net/http, the built-in image and compression codecs, TLS stack, and more. Meanwhile, designing and maintaining good high-quality Rust crates is just relatively hard. You've got to worry about async vs sync, various cfg options like nostd and noalloc, and dealing with a lot of wrinkles in the language. Want to use a dyn trait? You'll need to make sure the trait doesn't have any functions with generic parameters or async; makes perfect sense, but adds tension; you want to avoid unnecessary runtime costs in the ideal case but still have the flexibility to use a dyn trait in other cases. Not to mention the borrow checker and how it interacts with a lot of these design choices. Go code is much dumber, but often times it's sufficient.
  And that's the crux of it. Go is simple in a stupid way, rather than an elegant way, but it really makes the most of it. If you want to supplant Go at what it does best, trying to make a better programming language overall may be a bit misguided. It's probably not that hard to come up with a programming language design that is "better" than Go. What's hard, IMO, is coming up with a programming language where the cognitive load increase relative to Go is met with a pay-off that people using Go would consider genuinely worth it. Something like guaranteed data race safety is definitely good enough if it's something someone needs or at least strongly wants for a given use case. Sum types, on the other hand, are a very nice thing to have that makes modelling data easier and less error-prone, but not having them isn't exactly the end of the world... In Go, people sometimes emulate them with interfaces and type-switches, and it's not fantastic, but it's usually sufficient.
  OCaml probably could/should be more successful, but I'm not sure it competes with Go; I think they're in entirely different wheelhouses. OCaml feels like it doesn't want to compete with Go, but then again, I only have relatively surface-level knowledge of OCaml, so I can't really say for sure.
  - noelwelsh - 12 hours ago
    
    I feel OCaml is as easy to write as Go, though perhaps that is not the case for someone with a typical imperative programmer background. I feel a good part of Go's success is that the core language is very similar to Python, which is obviously very widely known.
    Most of the issues you discuss with Rust are not issues in OCaml, as OCaml has GC. The language is simpler, in that programs are concerned with fewer concepts (e.g. no lifetimes), but less expressive in that they cannot formally talk about these concepts (though see the Jane Street work to add in some Rust-like concerns: https://blog.janestreet.com/oxidizing-ocaml-locality/)
  - alain_gilbert - 13 hours ago
    
    Check out this toy project I made.
    It's basically a fork of the Go lexer/parser that adds Result/Option/Tuple/Set... propagation operators (and more), and it compiles down to Go code.
    https://github.com/alaingilbert/agl
  - delifue - 12 hours ago
    
    Go with sum type: Borgo
    https://github.com/borgo-lang/borgo
eru - 14 hours ago

> Lack of sum types is probably one of the worst things about working in Go, and I think it is a much bigger problem than the lack of generics ever was.
Maybe, but once you have eg an Option or Either (a.k.a. Result) type, you typically really want to have some functions that work generically on all versions of them. (Though you could probably get away with using Go's version of void*, the empty interface, in a lot of cases?)
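For what "functions that work generically on all versions" might look like with the type parameters Go has had since 1.18, here is a rough sketch. The names `Option`, `Some`, `None`, `Map` and `OrElse` are invented for this example and are not a standard library API. It is still a struct with a flag rather than a real sum type, so nothing stops code inside the package from reading `value` while ignoring `ok`:
```go
package main

import "fmt"

// Option is a hypothetical generic optional value.
type Option[T any] struct {
	value T
	ok    bool
}

func Some[T any](v T) Option[T] { return Option[T]{value: v, ok: true} }
func None[T any]() Option[T]    { return Option[T]{} }

// Map is a free function rather than a method because it needs a second
// type parameter U, which a method on Option[T] cannot declare.
func Map[T, U any](o Option[T], f func(T) U) Option[U] {
	if !o.ok {
		return None[U]()
	}
	return Some(f(o.value))
}

// OrElse returns the contained value, or fallback when there is none.
func (o Option[T]) OrElse(fallback T) T {
	if o.ok {
		return o.value
	}
	return fallback
}

func main() {
	length := func(s string) int { return len(s) }
	fmt.Println(Map(Some("hello"), length).OrElse(-1))
	fmt.Println(Map(None[string](), length).OrElse(-1))
}
```
The empty-interface route mentioned above would also work, at the cost of type assertions at every use site.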
- jchw - 14 hours ago
  
  Basically, anything that makes it possible to do an Either/Result type blows the whole damn thing up. It calls everything into question, e.g. whether multiple returns really makes sense. It's kind of a shame, because I would really like to be able to model things better (and ideally more efficiently too) when working in Go.
  - eru - 13 hours ago
    
    The 'multiple return values' thing is really stupid in any case, even without sum-types: it's like "we have tuples, but only for this one special case."
    Just add first-class tuples to the language.
    
    dontlaugh - 9 hours ago
    
    It’s an actual liability, since you can’t uniformly compose the results of functions with different numbers of return values.
    
    aatd86 - 6 hours ago
    
    I think it could be retrofitted but would require a syntax to have a variable marked as being a tuple.
    If we look behind the curtains, Go functions somehow return tuples that are automatically unrolled into respective variables, sometimes. That's why if you have f(a, b, c){} and g() (b, c){} you can't do f(a, g()) currently.
    It does not unroll it automatically for some reason here. The result remains in "tuple" form?
    I think it may leave some space to have tuples if needed.
    In any case, I don't see it as technically infeasible.
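    A small sketch of the behaviour being described (the function names are placeholders). Go does forward a multi-value result when the call is the only argument and the values line up with the parameters; it refuses when the call is mixed with other arguments:
```go
package main

import "fmt"

func pair() (int, int) { return 2, 3 }

func add2(b, c int) int    { return b + c }
func add3(a, b, c int) int { return a + b + c }

func main() {
	// Allowed: a multi-value call as the sole argument of another call.
	fmt.Println(add2(pair()))

	// Not allowed: combining a multi-value call with other arguments.
	// fmt.Println(add3(1, pair())) // does not compile

	// The workaround is to unpack by hand.
	b, c := pair()
	fmt.Println(add3(1, b, c))
}
```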
    
    eru - 5 hours ago
    
    > I think it could be retrofitted but would require a syntax to have a variable marked as being a tuple.
    Like Perl's sigils? Or PHP's distinction between 'normal' variables with a $ and variables of function type which have no marking but are case insensitive?
    
    dontlaugh - 6 hours ago
    
    Go functions don’t actually semantically return tuples, multiple return is part of the calling convention.
    It can be changed more easily than in other languages because everyone statically links, but it’s not trivial.
    
    aatd86 - 5 hours ago
    
    They are not first class in the language but I think I remember the compiler having them as a representation for some language objects.
    Maybe https://go.dev/src/go/types/tuple.go
pjmlp - 10 hours ago
Personally, by now I would already be happy with Pascal-like enumerations instead of the stupid iota/const hack. Yes, it is a hack.
```
    type
        StatusCodes = (Success, Ongoing, Done)
```
or if you prefer, while bikeshedding what keyword to use,
```
    type
        StatusCodes enumeration (Success, Ongoing, Done)
```
Naturally, it is so much better to write
```
    type StatusCodes int

    const (
     Success StatusCodes = iota
     Ongoing
     Done
    )
```
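One concrete weakness of the iota version, sketched here with a hand-written `String` method (an addition for this example, not something the comment proposes): the named type is still just an int, so a value outside the declared set compiles and flows through like any other.
```go
package main

import "fmt"

type StatusCodes int

const (
	Success StatusCodes = iota
	Ongoing
	Done
)

// String has to be written (or generated) by hand; a built-in
// enumeration would know its own names.
func (s StatusCodes) String() string {
	switch s {
	case Success:
		return "Success"
	case Ongoing:
		return "Ongoing"
	case Done:
		return "Done"
	default:
		return fmt.Sprintf("StatusCodes(%d)", int(s))
	}
}

func main() {
	// An untyped constant converts silently, so 42 is a "valid" StatusCodes.
	var s StatusCodes = 42
	fmt.Println(Done, s)
}
```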

ivanjermakov - 20 hours ago

I'm surprised how many modern languages lack first-class sum type support, considering the amount of domain use cases for them.

js8 - 10 hours ago

Not only programming languages, SQL as well. Sum types are also kinda awkward to represent in relational algebra (it can be done similar to inheritance, your type constructor will be the ancestor relation and for each data constructor you have a descendant relation - this is because of the categorical duality between sum and product types, foreign key constraints are arrows).

Taniwha - 16 hours ago

Wirth was on the Algol68 committee - I'm sure he understood how those sorts of unions worked.

He also avoided a lot of the more advanced features of Algol68 when he designed Pascal; he thought it too complex.

adrian_b - 10 hours ago

I doubt that he understood how unions must work, because the "variant record" he put in Pascal in 1970 was really bad, worse than even the initial proposal of John McCarthy, while the "union" of ALGOL 68 was pretty decent.
Correctly implementing disjoint unions, i.e. allowing them to be used only exactly like an enumeration in the variable tested by a select/case/switch statement, and then only as the correct type in each of the alternatives, introduces a little overhead in a compiler, but it is certainly not a serious source of bloat in comparison with most other things that must be done by a compiler.
If the programming language designer had clearly in mind how disjoint union types must work, they would have been easy to implement even for the minicomputers and microcomputers of the seventies.
- pjmlp - 7 hours ago
  
  Niklaus Wirth eventually started designing for minimalism. While I greatly enjoy some languages designed by him, the ultimate minimalism that kept him busy in Oberon-07 was clearly not my point of view.
  Even Go v1.0's type system is more advanced than Oberon-07's final form.
  - adrian_b - 6 hours ago
    
    I agree.
    In my opinion, the most important contributions of Niklaus Wirth to programming languages have been in his early work on Euler, PL360 and ALGOL W, which introduced various features that were novel at that time.
    Starting with Pascal in 1970, his programming languages remained reasonably good for teaching programming and how to implement a compiler, due to their simplicity, but all of them were seriously behind contemporaneous languages.
    While Mesa of Xerox was a nice and innovative language, that cannot be said about Modula, Wirth's attempt to reimplement similar features after his sabbatical at Xerox, which was only mediocre.
    On the other hand the early languages of Wirth were very innovative, e.g. Euler was one of the first 2 languages with pointers, the other being CPL. In contrast with CPL which had implicit pointer dereferencing, Euler had explicit address-of and indirection operators, but it got their syntax right, not like in C, where the indirection operator has been mistakenly defined as prefix, instead of postfix.
    
    pjmlp - 3 hours ago
    
    However I still think the world would have been better with Modula-2 than C; unfortunately marketing was never Niklaus Wirth's strong point, and no mainstream OS made it unavoidable.
    Zig is basically Modula-2's type system in C clothing, plus comptime; if only people had been equally hyped back in the 1980s.
    The usual "Why Pascal..." fails flat in the presence of Modula-2, which was actually designed as systems language, and not as a language to learn about programming.
    Oberon (the 1992 original), for its simplicity, introduced a wider audience to the concept that system programming with automatic resource management isn't something out of this world, even though Cedar is more interesting in features.
    I was more interested in Component Pascal and Active Oberon, even though they were the work of other researchers at ETHZ.
    Nonetheless it was his work that inspired me to dig into everything Xerox PARC was doing, and to discover there was more happening there than only Smalltalk.
    I became amazed at the work done across Interlisp-D, Mesa and Cedar, and at how advanced their ideas were for what an IDE is supposed to look like, something many mainstream languages still can't offer.
    So in a sense that was also a contribution from Niklaus Wirth to everyone who got interested in his work and decided to go down the rabbit hole.
    
    suprtx - 2 hours ago
    
    Sorry, nit: "fails flat" should probably be "falls flat".
    
    pjmlp - 41 minutes ago
    
    Thanks.

Buttons840 - 14 hours ago

> But another thing Muratori points out is that Dahl and Nygaard copied the feature in safe working form into Simula, and Stroustrup knew about it and intentionally dropped it from C++, thinking it inferior to the encapsulation you get from inheritance. This is funny! Because of course C already had case #3 above -- completely unchecked/unsafe unions, they only showed up in 1976 C, goodness knows why they decided on that -- and the safe(ish) std::variant type has taken forever to regrow in C++.

This seems like a mistake. At the end of the day, a bunch of code and logic has to be written somewhere, and I think it's better done outside the data object, at least some of the time.

Imagine you have the classic Shape class / interface and someone wants to write some code to determine whether a Shape is happy or sad, based on their synesthesia, what are they supposed to do? I guess just add a happy_or_sad() method to the interface? Like, we're just going to pile--err, I mean, "encapsulate"--every possible thing that can be done with the data into the data object?

The OOP way is probably some Shape class hierarchy with a Shape superclass and a bunch of specific Square, Circle, Triangle, subclasses. So I guess you go and modify a dozen subclasses to add your happy_or_sad() method. And you're definitely going to have to fork the code because nobody wants to upstream your personal feelings about which Shapes are happy or sad.

It's better to have a sum type for your Shape and then everyone can put all their code and logic outside of the Shape itself, and the type system will ensure, at compile time, that no Shape variants have been missed, so refactoring is assisted by the type system.

taeric - 6 hours ago

Amusingly (to me, at least), I took a stab a long time ago to show how a visitor could get you most of this. As I state in the post, I make no claim that this is as convenient or powerful. It does do what you basically describe, though. https://taeric.github.io/sum-types.html
Note instead of my worse Optional class, what you are describing is a Shape class where there is a different function for each supported shape. If you add a new shape, everywhere that made a shape visitor would have to be updated to deal with the new shape. (A sibling post described this.)
Amusingly, I recall coworkers that did not want to use the "acceptVisitor" method and would force cast to call over to the methods directly. Caught me incredibly off guard.
eru - 14 hours ago

Inheritance can simulate sum-types. You can also simulate sum-types via what OOP people would probably call a visitor pattern: you hand me call-backs for what to do in all the different cases, and some trusted piece of code that you implement once for the type, does the case distinction and calls the right call-back. (You can either hand over the call-backs as functions / function pointers, or you can implement them as member methods pure OOP style.)
They are all equivalent in principle, but some of them are a lot more annoying to work with, especially when you want to do a pattern matching over multiple values at the same time, or match on nested patterns.
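A rough sketch of the callback flavour, reusing the Shape and happy_or_sad example from the parent comment. Everything here (`Shape`, `Circle`, `Rect`, `Match`, `Area`, `HappyOrSad`) is illustrative naming, and the two-variant layout is an assumption made to keep it short. `Match` is the "trusted piece of code" that does the case distinction once; callers must pass a callback for every case because no parameter is optional:
```go
package main

import (
	"fmt"
	"math"
)

// Shape keeps its representation private; outside this package the
// only way to look inside would be Match.
type Shape struct {
	isRect bool
	a, b   float64
}

func Circle(r float64) Shape  { return Shape{a: r} }
func Rect(w, h float64) Shape { return Shape{isRect: true, a: w, b: h} }

// Match does the case distinction in one place and calls the right callback.
func Match[R any](s Shape, circle func(r float64) R, rect func(w, h float64) R) R {
	if s.isRect {
		return rect(s.a, s.b)
	}
	return circle(s.a)
}

// Area and HappyOrSad live outside the Shape type and never touch its fields.
func Area(s Shape) float64 {
	return Match(s,
		func(r float64) float64 { return math.Pi * r * r },
		func(w, h float64) float64 { return w * h },
	)
}

func HappyOrSad(s Shape) string {
	return Match(s,
		func(float64) string { return "happy" },
		func(_, _ float64) string { return "sad" },
	)
}

func main() {
	fmt.Println(Area(Rect(2, 3)), HappyOrSad(Circle(1)), HappyOrSad(Rect(1, 1)))
}
```
Matching on two shapes at once already means nesting one `Match` inside another, which is the annoyance mentioned above.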
- pjmlp - 7 hours ago
  
  The way it goes is that sum types are closed for extension, while enjoying the comfort of pattern matching, whereas the OOP approach is open for extension, with the associated boilerplate in the visitor logic.
  In languages of ML lineage, we can combine both approaches with a mix of sum types and functors/type classes.
  - eru - 5 hours ago
    
    > [...] whereas the OOP approach is open for extension, with the associated boilerplate in the visitor logic.
    Yes, by default. But doesn't eg Java let you mark classes as final or something like that?
    > In languages of ML lineage, we can combine both approaches with a mix of sum types and functors/type classes.
    I think Rust allows something similar, but I'm not sure whether you'd call it 'ML lineage'?
    
    pjmlp - 3 hours ago
    
    Yes, you can mark classes final as closed for extension, however Java follows the wisdom at the time that by default classes are open for extension.
    Additionally, modern Java has sum types modeled the same way as in Scala.
    Rust is to some extent inspired by ML languages, hence why several OCaml and Haskell refugees also hang around Rust.
- pyrale - 8 hours ago
  
  > You can also simulate sum-types via what OOP people would probably call a visitor pattern
  You can also simulate it by using if/else everywhere.
  At this point, that means you have zero language feature supporting the use case, and type safety is up to the developers implementing patterns correctly everywhere.
- Buttons840 - 14 hours ago
  
  If I write my happy_or_sad() callback and pass it to the Shape, it would be nice if I could get some exhaustiveness checking, but like the blog says, C++ and other popular languages intentionally left it out.
  - layer8 - 12 hours ago
    
    With the visitor pattern, you have to implement an interface with one method for each choice, so (in statically typed languages) that implicitly does an exhaustiveness check, because if you forget to implement one of the interface methods, instantiating your implementing class won’t compile.
    One nice thing about the visitor pattern is that it doesn’t have to match the type hierarchy. For example, you could have a visitor interface method that is invoked for blue shapes, even if there is no BlueShape type. Similarly, the same type hierarchy can support multiple visitor interfaces, so that you can perform different case distinctions on the same value. This is something that sum types can’t do.
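    A minimal sketch of the interface-based variant (all type and method names are made up for this example). The exhaustiveness check comes from interface satisfaction: if `namer` lacked either method, it would no longer be a `ShapeVisitor` and the call to `Accept` would be rejected at compile time:
```go
package main

import "fmt"

// ShapeVisitor has one method per choice.
type ShapeVisitor interface {
	VisitCircle(r float64)
	VisitSquare(side float64)
}

type Shape interface{ Accept(v ShapeVisitor) }

type Circle struct{ R float64 }
type Square struct{ Side float64 }

func (c Circle) Accept(v ShapeVisitor) { v.VisitCircle(c.R) }
func (s Square) Accept(v ShapeVisitor) { v.VisitSquare(s.Side) }

// namer handles every case. Deleting one of its methods means *namer
// stops satisfying ShapeVisitor, so s.Accept(n) below fails to compile.
type namer struct{ name string }

func (n *namer) VisitCircle(float64) { n.name = "circle" }
func (n *namer) VisitSquare(float64) { n.name = "square" }

func Name(s Shape) string {
	n := &namer{}
	s.Accept(n)
	return n.name
}

func main() {
	fmt.Println(Name(Circle{R: 1}), Name(Square{Side: 2}))
}
```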
    
    eru - 11 hours ago
    
    > Similarly, the same type hierarchy can support multiple visitor interfaces, so that you can perform different case distinctions on the same value. This is something that sum types can’t do.
    Haskell's pattern synonyms should be able to handle this?
    The problem with the visitor pattern is that it doesn't compose well. So if you want to match on two values at once, or match deeper into a value, that works well for most pattern matching, but is annoying to piece together with the visitor pattern.
    
    layer8 - 7 hours ago
    
    I’m not really familiar with Haskell’s pattern synonyms, but maybe yes. What the visitor pattern gives you is that in principle the implementation structure can be orthogonal to the visitor case distinction. You can have objects encapsulate (and possibly hide) one structure while exposing different structural views via visitor. This also allows re-arranging the implementation of the alternatives without breaking client code. In functional languages, you’d probably rather have functions converting the source value to a different (sum) type to perform the case distinction on.
    I agree that syntactic sugar for visitor “matching” would be nice, and it’s something a language could add. The visitor pattern itself doesn’t prevent adding such syntax.
    
    pyrale - 8 hours ago
    
    A guard on your pattern match would be enough I guess.
    
    eru - 5 hours ago
    
    It depends on what you are trying to do. Eg guards can really replace or-patterns.
  - eru - 13 hours ago
    
    In what I described, you would need to hand callbacks for all possibilities, because no parameters are optional.
  - greener_grass - 10 hours ago
    
    The designer of the type must manually ensure exhaustive checking but then consumers will be forced to do exhaustive matching.
    It's not that bad in practice.

eru - 14 hours ago

Alas, this one gives a 403 Forbidden.

https://archive.is/oTbMW works though.

suprtx - 2 hours ago

While I only watched 25%-50% of the linked talk by Casey Muratori, spread out here and there, and fast-forwarded through the rest, I did not like his talk. And it reflects on this blog post as well. Casey Muratori obviously spent a lot of time on it, but programming and computer science is a huge field, and it is possible to spend a lifetime on even a part of one aspect of it.

In that talk, Casey Muratori refers to Simula, a PDF of it can be found at https://www.mn.uio.no/tjenester/it/hjelp/programvare/simula/... . You may want to use an OCR tool on that PDF, for instance ocrmypdf is available for a number of Linux distributions. I am not sure if it is the same version of Simula as what is being discussed, but it does have the "connection" statement, at PDF-page 56, which has "inspect" as the first syntactical element. That does look vaguely similar to the pattern matching of ML, but it does not AFAICT support a number of significant features that many love about modern pattern matching, such as nested patterns. Does it have field bindings as part of a pattern, and matching against specific constant values, or only matching the class type? I am not sure if it supports exhaustiveness checking. Does it mandate a finite number of possibilities, to help exhaustiveness checking? And the "connection" statement has two variants. AFAICT, it is the kind of abstraction that is primitive enough that one can get close to its functionality with "switch" in C++ together with a type-cast, and a far cry from what Standard ML (later?) supported. In that light, it might not be surprising that it was not included in C++.

When was pattern matching as we know it in modern times invented, or was it a gradual evolution? https://en.wikipedia.org/wiki/Hope_(programming_language) is cited as introducing https://en.wikipedia.org/wiki/Algebraic_data_type in the 1970s. And Hope had for instance this spin on just one aspect of pattern matching:

> Changing the order of clauses does not change the meaning of the program, because Hope's pattern matching always favors more specific patterns over less specific ones.

This is different from modern pattern matching, where the order (AFAIK generally across modern languages) does matter.
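A small Rust sketch of the first-match-wins behavior, as one example of a modern language (the function names are made up). Swapping the two arms changes the result, which per the quote above would not happen in Hope:

```rust
// Specific arm first: zero is recognized.
fn classify(n: i32) -> &'static str {
    match n {
        0 => "zero",
        _ => "other",
    }
}

// General arm first: the wildcard matches everything, so the `0` arm can
// never run. Rust only warns about this; the warning is silenced here.
#[allow(unreachable_patterns)]
fn classify_swapped(n: i32) -> &'static str {
    match n {
        _ => "other",
        0 => "zero",
    }
}

fn main() {
    println!("{} vs {}", classify(0), classify_swapped(0));
}
```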

I am not sure that Casey Muratori did a good job of researching this topic, but I am not sure if and how much I can fault him, since the topic is complex and huge and may require a lot of research. Researching the history of programming languages may be difficult, since it would both require a high technical level and also have to be focused on history. One could probably fill several full-time university positions with people who spend their time researching, documenting and describing the history of programming languages. And the topic is a moving target, with the professionals having to have a good understanding of multiple languages and of programming language theory in general, and preferably also some general professional software development experience.

All in all, the data types and pattern matching of the 1970s might be extremely different from the discriminated unions and pattern matching of the 1990s. C++ also does not have garbage collection, which complicates the issue. Rust, for instance, which also does not have garbage collection, has different binding modes for the bindings in pattern matches.
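To illustrate what binding modes look like in practice, here is a minimal Rust sketch with a hypothetical `Message` type. The same pattern `Message::Text(s)` binds `s` by shared reference, by mutable reference, or by move, depending on what is being matched, and `ref` forces a borrow explicitly:

```rust
// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum Message {
    Text(String),
    Empty,
}

// Matching on `&Message`: `s` is bound as `&String`.
fn len_by_ref(m: &Message) -> usize {
    match m {
        Message::Text(s) => s.len(),
        Message::Empty => 0,
    }
}

// Matching on `&mut Message`: `s` is bound as `&mut String`.
fn shout(m: &mut Message) {
    match m {
        Message::Text(s) => s.push('!'),
        Message::Empty => {}
    }
}

// Matching on an owned value with `ref`: `s` only borrows, so `m` survives.
fn peek(m: Message) -> (usize, Message) {
    let n = match m {
        Message::Text(ref s) => s.len(),
        Message::Empty => 0,
    };
    (n, m)
}

// Matching on an owned value without `ref`: the String is moved out.
fn take(m: Message) -> String {
    match m {
        Message::Text(s) => s,
        Message::Empty => String::new(),
    }
}

fn main() {
    let mut m = Message::Text("hi".to_string());
    println!("{}", len_by_ref(&m));
    shout(&mut m);
    let (n, m) = peek(m);
    println!("{n} {}", take(m));
}
```

In a garbage-collected language none of these distinctions need to be visible in the pattern syntax, which is part of why the feature is harder to bolt onto a language like C++.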

It is important to note that subtyping and inheritance are different. And even FP languages can use subtyping.

I think both Casey Muratori and Graydon Hoare (if he has not already read it) could be interested in reading the book Types and Programming Languages, even though that book is old by now and may not contain a lot of newer advancements and theory. I also think that Casey Muratori could have benefited (in regards to this talk, at least) from learning and using Scala and its sealed traits in regards to pattern matching. If I recall correctly, Scala had as one of its objectives to attempt to unify OOP and FP. I do agree that OOP can be abused, and personally I am lukewarm on inheritance, especially as direct modelling of a domain as discussed in the talk, without deeper thought about whether such an approach is good relative to other options and trade-offs. But subtyping, as well as objects that can be used as a kind of "mini-module", is typically more appealing than inheritance IMO. "Namespacing" of objects is also popular.

Some theory and terminology also discuss "open" and "closed" types.

And, after all, Haskell has type classes, which are not OOP but are relevant for ad-hoc polymorphism (is Casey Muratori familiar with type classes or ad-hoc polymorphism?). Rust has traits, which are not quite the same as type classes but are related. Scala has various kinds of implicits in regards to that. And Rust also has "dyn" traits, which are not so commonly used but are available.
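A rough Rust sketch of the two flavors, using a made-up `Describe` trait: the generic function is resolved per type at compile time (the type-class-like, ad-hoc polymorphism side), while the `dyn` version dispatches at runtime through a trait object, with no inheritance involved in either case.

```rust
// Hypothetical trait and types, for illustration only.
trait Describe {
    fn describe(&self) -> String;
}

struct Cat;
struct Dog;

impl Describe for Cat {
    fn describe(&self) -> String {
        "cat".to_string()
    }
}

impl Describe for Dog {
    fn describe(&self) -> String {
        "dog".to_string()
    }
}

// Ad-hoc polymorphism: an existing type gets an implementation after the fact.
impl Describe for i32 {
    fn describe(&self) -> String {
        format!("the integer {self}")
    }
}

// Static dispatch: a separate copy is compiled for each concrete `T`.
fn describe_static<T: Describe>(x: &T) -> String {
    x.describe()
}

// Dynamic dispatch: a mixed collection of trait objects, resolved at runtime.
fn describe_all(xs: &[Box<dyn Describe>]) -> Vec<String> {
    xs.iter().map(|x| x.describe()).collect()
}

fn main() {
    let zoo: Vec<Box<dyn Describe>> = vec![Box::new(Cat), Box::new(Dog)];
    println!("{} {:?}", describe_static(&42_i32), describe_all(&zoo));
}
```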

suprtx - 2 hours ago

I forgot to mention that even Haskell once had some strangeness in its pattern matching, which it later banned, as recently as 2010.
https://stackoverflow.com/questions/3748592/what-are-nk-patt...
> What do they mean by "n+k patterns"? I guess it's the second line, but I don't get what might be wrong with it. Could anyone explain what is the issue there? Why aren't these n + k patterns allowed any more in Haskell 2010?

ahaferburg - 9 hours ago

C++ is a gun with five triggers, one for each finger that holds the grip. That's what makes it so versatile and powerful.

voidUpdate - 6 hours ago

The gun is also permanently attached to your leg, pointing downwards

dontlaugh - 9 hours ago

And most of the triggers blow up the gun, unless you’re pulling multiple in a specific combination. Most combinations also blow up the gun.