I like Odin and hope for it to gain more momentum.
I have an important metric for new "systems" languages: does the language allow NULL for its commonly used pointer type? Rust by default does not (references may _not_ be NULL). Here is where I think Odin makes a mistake.
In the linked blog post Odin mentions ^os.File which means pointer to File (somewhat like *FILE in C). Theoretically the pointer can be NULL. In practice, maybe I would need to check for NULL or maybe I would not (I would have to scan the Odin function's documentation to see what the contract is).
In Rust, if a function returns &File or &mut File or Box<File> etc. I know that it will never be NULL.
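A rough Rust sketch of what I mean (hypothetical functions, just to show where the contract lives in the types):

    use std::fs::File;

    // `&File` can never be null, so the body needs no check and the caller
    // needs no documentation-reading to learn the contract.
    fn log_metadata(f: &File) {
        println!("{:?}", f.metadata());
    }

    // "Might not be there" has to be spelled out as Option, and the compiler
    // forces the caller to handle the None case before touching the file.
    fn cached_file(cache: &[File], idx: usize) -> Option<&File> {
        cache.get(idx)
    }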
So Odin repeats the famous "billion-dollar mistake" (Tony Hoare). Zig, in comparison, is a bit more fussy about NULL in pointers, so it wins my vote.
Currently this is my biggest complaint about Odin. While Odin packs in a lot of powerful programming idioms (and feels a bit higher level and more ergonomic than Zig), it makes the same mistake that Golang, C and others make by allowing NULL in the default pointer type.
One thing I think worth considering for systems languages on this point: if you don't want to solve every expressiveness issue downstream of Result/Option/etc from the outset, look at Swift, which has nullable types.
MyObject can't be null. MyObject? can be null. Handling nullability as a special thing might help with the billion-dollar mistake without generating pressure to have a fully fleshed out ADT solution and everything downstream of that.
To people who would dismiss ADTs as a hard problem in terms of ergonomics: Rust makes it less miserable thanks to things like the question-mark shorthand and a bazillion trait methods. Languages like Haskell solve it with monads + do syntax + operator overloading galore. Languages like Scala _don't_ solve it for Result/Option in any fun way and thus are miserable on this point IMHO.
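To make the question-mark shorthand concrete, a minimal Rust sketch (read_port and ConfigError are made up for illustration):

    use std::fs;
    use std::num::ParseIntError;

    #[derive(Debug)]
    enum ConfigError {
        Io(std::io::Error),
        BadPort(ParseIntError),
    }

    // Each `?` either unwraps the Ok value or returns early with the error,
    // so there's no if/else ladder after every fallible call.
    fn read_port(path: &str) -> Result<u16, ConfigError> {
        let text = fs::read_to_string(path).map_err(ConfigError::Io)?;
        let port = text.trim().parse::<u16>().map_err(ConfigError::BadPort)?;
        Ok(port)
    }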
I like to think about how many problems a feature solves to judge whether it's "worth it". I believe that sum types solve enough different problems that they're worth it, whereas nullability solves only one problem (the C-style or Java-style null object). Sum types can solve that with Option<T> and also provide error handling with Result<T, Err> and control flow with ControlFlow<Continue, Break>, among others, so that's already a better deal.
Nullability is a good retro-fit, like Java's type erased generics, or the DSL technology to cram a reasonable short-distance network protocol onto the existing copper lines for telephones. But in the same way that you probably wouldn't start with type erased generics, or build a new city with copper telephone cables, nullability isn't worth it for a new language IMO.
What I'd want instead is `Option` and `Result` as plain library types, plus:
- `?T` and `T!E` as type declaration syntax that desugars to them;
- and `.?` and `.!` operators so chains like `foo()?.bar()!.baz()` can be written and all the relevant possible return branches are inserted without a fuss.
Having `Option` and `Result` be simply normal types (and not special-casing "nullable") has benefits that are... obvious, I'd say. They're just _normal_. Not being special cases is great. Then, having syntactic sugars to make the very, _very_ common cases be easy to describe is just a huge win that makes correct typing more accessible to many more people by simply minimizing keystrokes.
The type declaration sugar is perhaps merely nice to have, but I think it really does change the way the average programmer is willing to write. The chaining operators, though... I would say I borderline can't live without those, anymore.
Chaining operators can change the SLOC count of some functions by as much as... say, 75%, if we consider a language like Go with its infamous "if err != nil" clause that is mandated to spread across three lines.
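To put numbers on that, here is roughly what one Rust `?` stands in for (a sketch; the real desugaring also goes through the `From`/`Try` machinery, and the function names are made up):

    use std::fs;
    use std::io;

    // With the chaining operator: one line per fallible step.
    fn read_config_short(path: &str) -> Result<String, io::Error> {
        Ok(fs::read_to_string(path)?.trim().to_owned())
    }

    // Roughly what that `?` expands to -- the shape Go makes you write out.
    fn read_config_long(path: &str) -> Result<String, io::Error> {
        let text = match fs::read_to_string(path) {
            Ok(t) => t,
            Err(e) => return Err(e),
        };
        Ok(text.trim().to_owned())
    }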
I mean, yeah, type erasure does give parametricity, but you can instead design your language so that you monomorphize but insist on parametricity anyway. If you write stable Rust your implementations get monomorphized but you aren't allowed to specialize them - the stable language doesn't provide a way to write two distinct versions of the polymorphic function.
And if you only regard parametricity as valuable rather than essential then you can choose to relax that and say OK, you're allowed to specialize but if you do then you're no longer parametric and the resulting lovely consequences go away, leaving it to the programmers to decide whether parametricity is worth it here.
I don't understand your first paragraph. Monomorphization and parametricity are not in conflict; the compiler has access to information that the language may hide from the programmer. As an existence proof, MLton monomorphizes arrays while Standard ML is very definitely parametric: http://www.mlton.org/Features
I agree that maintaining parametricity or not is a design decision. However, recent languages that break it (e.g. Zig) don't seem to understand what they're doing in this regard. At least I've never seen a design justification for this, but I have seen criticism of their approach. Given that type classes and their ilk (implicit parameters; modular implicits) give the benefits of ad-hoc polymorphism while maintaining parametricity, and are well established enough to the point that Java is considering adding them (https://www.youtube.com/watch?v=Gz7Or9C0TpM), I don't see any compelling reason to drop parametricity.
My point was that you don't need to erase types to get parametricity. It may be that my terminology is faulty, and that in fact what Rust is doing here does constitute "erasing" the types; in that case, what describes the distinction between, say, a Rust function which is polymorphic over a function to be invoked, and a Rust function which merely takes a function pointer as a parameter and then invokes it? I would say the latter is type erased.
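A small Rust sketch of the distinction I'm asking about, as I understand it:

    // Generic over the callee: monomorphized (one copy per concrete F), but
    // parametric -- on stable Rust the body cannot branch on what F really is.
    fn call_generic<F: Fn() -> i32>(f: F) -> i32 {
        f()
    }

    // Plain function pointer: a single compiled body, and the callee's type
    // is erased down to an address -- this is the "type erased" version.
    fn call_erased(f: fn() -> i32) -> i32 {
        f()
    }

    fn main() {
        let answer = || 42;
        println!("{} {}", call_generic(answer), call_erased(|| 42));
    }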
The Scala solution is the same as Haskell's: for-comprehensions are the same thing as do notation. The future is probably effect systems, so writing direct-style code instead of using monads.
It's interesting that effect system-ish ideas are in Zig and Odin as well. Odin has "context". There was a blog post saying it's basically for passing around a memory allocator (IIRC), which I think is a failure of imagination. Zig's new IO model is essentially pass around the IO implementation. Both capture some of the core ideas of effect systems, without the type system work that make effect systems extensible and more pleasant to use.
I personally don't enjoy the MyObject? typing, because it leads to edge cases where you'd like to have MyObject??, but it's indistinguishable from MyObject?.
E.g. if you have a list finding function that returns X?, then if you give it a list of MyObject?, you don't know if you found a null element or if you found nothing.
It's still obviously way better than having all object types include the null value.
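In Rust the two outcomes stay distinguishable precisely because the optional layers nest instead of collapsing (a tiny sketch):

    fn main() {
        let values: Vec<Option<i32>> = vec![Some(1), None, Some(3)];

        // Found an element, and that element happens to be the "null" one.
        let found_null: Option<&Option<i32>> = values.iter().find(|v| v.is_none());
        assert_eq!(found_null, Some(&None));

        // Nothing matched at all: the *outer* layer is None this time.
        let not_found: Option<&Option<i32>> = values.iter().find(|v| **v == Some(99));
        assert_eq!(not_found, None);
    }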
When you want to distinguish `MyObj??` then you'll have to distinguish the optionality of one piece of code (wherever your `MyObj?` in the list came from) from the other (the list find) before "mixing" them. E.g. by first mapping `MyObj?` to `MyObj | NotFoundInMyMap` (or similar polymorphic variant/anonymous sum types) and then putting it in a list. This could be easily optimized away or be a safe no-op cast.
Common sum types allow you to get around this, because they always do this "mapping" intrinsically by their structure/constructors when you use `Either/Maybe/Option` instead of `|`. However, it still doesn't always allow you to distinguish after "mixing" various optionalities - if find for Maps, Lists, etc all return `Option<MyObj>` and you have a bunch of them, you also don't know which of those it came from. This is often what one wants, but if you don't, you will still have to map to another sum type like above.
In addition, when you don't care about null/not found, you have the dual problem: you will need to flatten nested sum types, as the list find would return `Option<Option<MyObj>>`. `flatten`/`flat_map`/similar need to be used regularly and aren't necessary with anonymous sum types, which do this implicitly.
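In Rust terms, that flattening step looks like this (a minimal sketch):

    fn main() {
        let values: Vec<Option<i32>> = vec![None, Some(7)];

        // find on a Vec<Option<i32>> yields Option<&Option<i32>>; collapsing
        // the layers when you don't care which one was "missing" is explicit.
        let first_present: Option<i32> =
            values.iter().find(|v| v.is_some()).copied().flatten();
        assert_eq!(first_present, Some(7));
    }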
Both communicate similar but slightly different intent in the types of an API. Anonymous sum types are great for errors for example to avoid global definitions of all error cases, precisely specify which can happen for a function and accumulate multiple cases without wrapping/mapping/reordering.
Sadly, most programming languages do not support both.
> E.g. if you have a list finding function that returns X?, then if you give it a list of MyObject?, you don't know if you found a null element or if you found nothing.
This is a problem with the signature of the function in the first place. The fix is a signature where the _result_ is responsible for the return value wrapping (an optional result rather than a nullable element type). Making this not copy is a more advanced exercise that is bordering on impossible (safely) in C++, but Rust and newer languages have no excuse for it.
Here's the kind of dance Kotlin ends up doing for `Sequence<T>.last(predicate)`, with a nullable local plus a separate `found` flag, because a null `last` alone can't distinguish "no match" from "matched a null element":

    inline fun <T> Sequence<T>.last(predicate: (T) -> Boolean): T {
        var last: T? = null
        var found = false
        for (element in this) {
            if (predicate(element)) {
                last = element
                found = true
            }
        }
        if (!found) throw NoSuchElementException("Sequence contains no element matching the predicate.")
        @Suppress("UNCHECKED_CAST")
        return last as T
    }
A proper option type like Swift's or Rust's cleans up this function nicely.
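For comparison, a rough Rust version where the optionality lives in the return type instead of a sentinel-plus-flag (not real standard-library code, just a sketch):

    // No nullable local, no `found` flag, no unchecked cast: "no match" is
    // simply the None case of the return type.
    fn last_matching<T, I, P>(iter: I, mut predicate: P) -> Option<T>
    where
        I: IntoIterator<Item = T>,
        P: FnMut(&T) -> bool,
    {
        let mut last = None;
        for element in iter {
            if predicate(&element) {
                last = Some(element);
            }
        }
        last
    }

    fn main() {
        assert_eq!(last_matching(vec![1, 2, 3, 4], |x| x % 2 == 0), Some(4));
        assert_eq!(last_matching(vec![1, 3, 5], |x| x % 2 == 0), None);
    }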
Your example produces very distinguishable results: e.g. if Array.first finds a nil value it returns Optional<Type?>.some(.none), and if it doesn't find any value it returns Optional<Type?>.none.
The two are not equal, and only the second one evaluates to true when compared to a naked nil.
This is Swift, where Type? is syntax sugar for Optional<Type>. Swift's Optional is a standard sum type, with a lot of syntax sugar and compiler niceties to make common cases easier and nicer to work with.
Well, in a language with nullable reference types, you could use something like
    fn find<T>(self: List<T>) -> (T, bool)
to express what you want.
But exactly like Go's error handling via a (fake) unnamed tuple, it's very much error-prone (and the return value might contain absurd combinations like `(someInstanceOfT, false)`). So yeah, I also prefer a language with ADTs, which solves it via a sum type rather than being stuck with a product type forever.
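A quick sketch of the difference (hypothetical `find_*` helpers): with the product type nothing stops the absurd combination and a filler value is needed, while the sum type makes it unrepresentable:

    // Product encoding: the type happily allows "here's a value, but we
    // didn't find anything", and a junk value is needed on failure.
    fn find_tuple(haystack: &[i32], needle: i32) -> (i32, bool) {
        match haystack.iter().find(|&&x| x == needle) {
            Some(&x) => (x, true),
            None => (0, false), // 0 is a made-up filler
        }
    }

    // Sum-type encoding: no way to return a value and "not found" at once,
    // and no filler value exists.
    fn find_option(haystack: &[i32], needle: i32) -> Option<i32> {
        haystack.iter().find(|&&x| x == needle).copied()
    }

    fn main() {
        assert_eq!(find_tuple(&[1, 2, 3], 9), (0, false));
        assert_eq!(find_option(&[1, 2, 3], 9), None);
    }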
I like Go's approach of having a default value, which for a pointer (to a struct, say) is nil. I don't think I've ever cared about the difference between a null result and no result, as they're semantically the same thing (what I'm looking for doesn't exist).
I think the notion that "null" is a billion-dollar mistake is well overblown. NULL/nil is just one of many invalid memory addresses, and in practice most invalid memory addresses are not NULL. This is related to the drunkard’s search principle (a drunk man looks for his lost keys at night under a lamppost because he can see in that area). I have found that null pointers are usually very easy to find and fix, especially since most platforms reserve the first page of (virtual) memory to check for these errors.
In theory, NULL is still a perfectly valid memory address it is just that we have decided on the convention that NULL is useful for marking a pointer as unset.
Many languages (including Odin) now have support for maybe/option types or nullable types (monads), however I am still not a huge fan of them in practice as I rarely require them in systems-level programming. I know very well this is a "controversial" opinion, but systems-level programming languages deal with memory all the time and can easily get into "unsafe" states on purpose. Restricting this can actually make things like custom allocators very difficult to implement, along with other things.
n.b. Odin's `Maybe(^T)` is identical to `Option<&T>` in Rust in terms of semantics and optimizations.
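For anyone who hasn't seen it: "identical in terms of optimizations" refers to the niche optimization, where the forbidden all-zero bit pattern of the pointer is reused to encode the "no value" case, so the wrapper costs nothing. The Rust side of that claim is easy to check:

    use std::fs::File;
    use std::mem::size_of;

    fn main() {
        // Option<&T> reuses &T's never-null guarantee to store None,
        // so it is pointer-sized, exactly like a nullable pointer.
        assert_eq!(size_of::<Option<&File>>(), size_of::<&File>());
        assert_eq!(size_of::<Option<&File>>(), size_of::<*const File>());
    }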
> I have found that null pointers are usually very easy to find and fix, especially since most platforms reserve the first page of (virtual) memory to check for these errors.
This is true. However, you have done these fixes after noticing them at runtime. This means that you have solved the null problem for a certain control + data state in code but you don't know where else it might crop up again. In millions of lines of code, this quickly becomes a whack-a-mole.
When you use references in Rust, you statically prove that you cannot have a null error in a function, for all inputs the function might get. This static elimination is helpful. You also force programmers to distinguish between &T and Option<&T> and Result<&T, E> -- all of which are so common in systems code.
Today it is safe to assume that a byte is 8 bits. Similarly it is safe to assume that the first page in virtual memory is non-readable and non-writable -- why not make use of this foreknowledge?
> This is related to the drunkard’s search principle (a drunk man looks for his lost keys at night under a lamppost because he can see in that area).
This is a nice example and I do agree in spirit. But then I would offer the following argument: say a problem (an illegal/bad virtual address) is caused 60% by one culprit (NULL dereference) and 40% by a long tail of culprits (wrong virtual memory address, use after free, etc.). One can be a purist and say "Hey, using Rust-style references only solves the 60% case; addresses can be bad for so many other reasons!" Or one can be pragmatic and try to deal with the 60%.
I cringe every time I see *some_struct as a function argument/return in Linux kernel/system code. Does NULL indicate something semantically important? Do we need to check for NULL in code that consumes this pointer? All these questions arise every time I see a function signature. Theoretically I need to understand the whole program to truly know whether it is redundant or necessary to check for NULL. That is why I like what Rust and Zig do.
But to answer your general points here: Odin is a different language with a different way of doing things compared to others, so their "solutions" to specific "problems" do not usually apply to a language like Odin.
The problem is that solving the "60% case" means you have now introduced a completely different way of programming, which might solve that, but at the expense of so many other cases. It's a hard language design question, and people focusing on this specific case have not really considered how it affects anything else. Sadly, language features and constructs are rarely isolated from other things, even the architecture of the code the programmer writes.
As for C code, I agree it's bad that there is no way to know if a pointer uses NULL to indicate something or not, but that's pretty much not a problem in Odin. If people want to explicitly state that, they either use multiple return values, which is much more common in Odin (which is akin to Result<&T, E> in Rust, but of course not the same for numerous reasons), or they use `Maybe(^T)` (akin to Option<&T> in Rust).
I understand the problems that C programmers face, I am one, which is why I've tried to fix a lot of them without trying to be far from the general C "vibe".
> It's a hard language design question, and people focusing on this specific case have not really considered how it affects anything else. Sadly, language features and constructs are rarely isolated from other things, even the architecture of the code the programmer writes
And my suggestion is that Rust has got the balance right when it comes to pointers. In Rust I can use the traditional unsafe nullable pointers (*const SomeStruct / *mut SomeStruct) when null is possible, and use &SomeStruct when it cannot be NULL. Initialization can be a bit painful in Rust but the wins are worth it. Yes, initialization can be less efficient in Rust, but most of the time is spent in algorithms when they run, not during initialization of the data structure.
Rust has needless complexity when it comes to asynchronous programming. The complexity is off the charts and the language just becomes unusable. But the non-async subset of Rust feels consistent and well engineered from a programming theory perspective.
In summary, Rust has not compromised itself by using non-null references as the default pointer type, and neither would Odin if it took a different approach towards references. Take the example of OCaml - it also takes a very principled approach towards NULL. OTOH Java suffers from NULL problems, as every object reference could be null in disguise.
Nevertheless Odin is a remarkably clean and powerful language and I'm following its progress closely ! Thanks for building it !
see, this seems like something that's nice to actually put into the types; a Ptr<Foo> is a real pointer that all the normal optimizations can be done to, but cannot be null or otherwise invalid, and UnsafePtr makes the compiler keep its distance and allows whatever tricks you want.
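Rust's split is roughly that shape already, for what it's worth (a minimal sketch):

    struct Foo {
        x: i32,
    }

    // `&Foo` plays the `Ptr<Foo>` role: guaranteed non-null and valid, so
    // the optimizer can rely on it and the body needs no checks.
    fn read_checked(foo: &Foo) -> i32 {
        foo.x
    }

    // `*const Foo` plays the `UnsafePtr` role: it may be null or dangling,
    // and the compiler makes you own that in an `unsafe` block.
    unsafe fn read_unchecked(foo: *const Foo) -> i32 {
        unsafe { (*foo).x }
    }

    fn main() {
        let foo = Foo { x: 7 };
        assert_eq!(read_checked(&foo), 7);
        assert_eq!(unsafe { read_unchecked(&foo) }, 7);
    }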
Yes it's the burden of proof. That's why writing Rust is harder than C++. Or why Python is easier than anything else. As a user and customer, I'd rather pay more for reliable software though.
n.b. Sorry for the long reply, this is actually a really complex and complicated topic in terms of language design, trade-offs, and empirics.
It's a trade-off in design which is not immediately obvious. If you want to make pointers not have a `nil` state by default, this requires one of two things: requiring the programmer to test every pointer on use, or assuming pointers cannot be `nil`. The former is really annoying, and the latter requires something which I did not want to do (and which you will most likely not agree with, just because it doesn't _seem_ like a bad thing from the start): explicit initialization of every value everywhere.
The first option is solved with `Maybe(^T)` in Odin, and that's fine. It's actually rarely needed in practice, and when it is needed, it's either for documenting foreign code's usage of pointers (i.e. non-Odin code), or it's for things which are not pointers at all (in Odin code).
The second option is a subtle one: it forces a specific style and architectural practices whilst programming. Odin is designed around two things: to be a C alternative which still feels like C whilst programming, and to "try to make the zero value useful", and as such a lot of the constructs in the language and core library are structured around this. Odin is trying to be a C alternative, and as such it is not trying to change how most C programmers actually program in the first place. This is why you are allowed to declare variables without an explicit initializer, but the difference from C is that variables will be zero-initialized by default (you can do `x: T = ---` to make it uninitialized stack memory, so it is an opt-in approach).
Fundamentally, this idea of explicit individual-value-based initialization everywhere is a viral concept which does lead to what I think are bad architectural decisions in the long run. Compilers are dumb---they cannot do everything for you, especially know the cost of the architectural decisions throughout your code, which requires knowing the intent of your code. When people argue that a lot of the explicit initialization can be "optimized" out, this is only thinking from a local position of individual values, which does total up to being slower in some cases.
To give an example of what I mean, take `make([]Some_Struct, N)` in Odin: it just zeroes the memory because in some cases that is literally free (i.e. `mmap` must zero). However, when you need to initialize each value of that slice, you are now turning an O(1) problem into an O(N) problem. And it can get worse if each field in the struct also needs its own form of construction.
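The same trade-off is visible from the Rust side, too (a sketch; exact allocator behavior varies):

    // Asking for zeroed memory can often be satisfied almost for free
    // (freshly mapped pages are already zero), independent of length.
    fn zeroed_buffer(n: usize) -> Vec<u8> {
        vec![0u8; n]
    }

    // Per-element construction is unavoidably O(n) work, and gets worse if
    // each element's fields need their own construction logic.
    fn constructed_buffer(n: usize) -> Vec<u8> {
        (0..n).map(|i| (i % 251) as u8).collect()
    }

    fn main() {
        assert_eq!(zeroed_buffer(4), vec![0, 0, 0, 0]);
        assert_eq!(constructed_buffer(4), vec![0, 1, 2, 3]);
    }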
But as I said in my original comment, I do not think the `nil` pointer problem, especially in Odin since it has other array constructs, is actually an empirical problem in practice. I know a lot of people want to "prove" whatever they can at compile-time, but I still have to test a lot of code in the first place, and for this very, very specific example, it is a trivial one to catch.
P.S. This "point" has been brought up a lot before, and I do think I need to write an article on the topic explaining my position because it is a bit tiring rewriting the same points out each time.
P.P.S. I also find this "gotcha" is the most common one people bring up because it _seems_ like an obvious "simple" win, and I'd argue it's the exact opposite of both "simple" and a "win". Language design is all about trade-offs and compromises, as there is never going to be a perfect language for anyone's problem. Even if you designed the DSL for your task, you'll still have loads of issues, especially with specific semantics (not just syntax).
Odin’s design is informed by simplicity, performance and joy and I hope it stays that way. Maybe it needs to stay a niche language under one person’s control in order to do so since many people can’t help but try to substitute their own values when touring through a different language.
I wonder if some day we'll look back differently on the "billion-dollar mistake" thing. The key problem with null references is that it forces you either to check any given reference to see if it's null if you don't already know, or to have a contract that it can't be null. Not having null references really does solve that problem, but still, in everyday programs you often wind up with situations where you actually can know from the outside that some function will return a non-empty value, but the function itself is unable to make that guarantee in a way that the compiler can enforce; in those cases, you have no choice but to face the same dilemma. In Rust this situation plays out with `unwrap()`, and in practice most reasonably-sized codebases will end up with some. You could always forbid it, but this is only somewhat of an improvement, because in a lot of cases there also isn't anything logical to do once that invariant hasn't held. (Though for critical production workloads, it is probably a good idea to try to find something else to do other than let the program entirely crash in this event, even if it's still potentially an emergency.)
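Concretely, the Rust version of "I know this can't fail but the compiler doesn't" tends to look like this (hypothetical function, just to show the shape):

    fn busiest_hour(requests_per_hour: &[u32]) -> usize {
        assert!(!requests_per_hour.is_empty(), "caller must pass at least one hour");

        // The invariant was just checked, but the type system can't carry
        // that knowledge into `max_by_key`, so an expect/unwrap remains.
        requests_per_hour
            .iter()
            .enumerate()
            .max_by_key(|(_, count)| **count)
            .map(|(hour, _)| hour)
            .expect("non-empty slice always has a maximum")
    }

    fn main() {
        assert_eq!(busiest_hour(&[3, 9, 4]), 1);
    }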
In other words: after all this time, I feel that Tony Hoare framing null references as the billion-dollar mistake may be overselling it at least a little. Making references not nullable by default is an improvement, but the same problem still plays out so long as you ever have a situation where the type system is insufficient to guarantee the presence of a value you "know" must be there. (And even with formal specifications/proofs, I am not sure we'll ever get to the point where it is always feasible to prove.) The only real question is how much of the problem is solved by not having null references, and I think it's less than people acknowledge.
(edit: Of course, it might actually be possible to quantify this, but I wasn't able to find publicly-available data. If any organization were positioned to be able to, I reckon it would probably be Uber, since they've developed and deployed both NullAway (Java) and NilAway (Go). But sadly, I don't think they've actually published any information on the number of NPEs/panics before and after. My guess is that it's split: it probably did help some services significantly reduce production issues, but I bet it's even better at preventing the kinds of bugs that are likely to get caught pre-production even earlier.)
I think Hoare is bang on, because we know the analogous values in many languages are also problematic even though they're not related to memory.
The NaNs are, as their name indicates, not numbers. So the fact that this 32-bit floating point parameter might be NaN, which isn't even a number, is as unhelpful as finding that the Goose you were passed as a parameter is null (i.e. not actually a Goose at all).
There's a good chance you've run into at least one bug where oops, that's NaN and now the NaN has spread and everything is ruined.
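A tiny demo of that spreading, written in Rust but the behavior is the same in any IEEE-754 language:

    fn main() {
        let nan = f64::NAN;

        // NaN silently propagates through arithmetic...
        assert!((nan + 1.0).is_nan());
        assert!((nan * 0.0).is_nan());

        // ...and isn't even equal to itself, which is how it's usually
        // discovered several steps away from where it was produced.
        assert!(nan != nan);
    }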
The IEEE NaNs are baked into the hardware everybody uses, so we'll find it harder to break away from this situation than for the Billion Dollar Mistake, but it's clearly not a coincidence that this type problem occurs for other types, so I'd say Hoare was right on the money and that we're finally moving in the correct direction.
What I'm saying is, I disagree that "we know" these things. We know that there are bugs that can be statically prevented by having non-nullable types to enforce contracts, but that doesn't in itself make null the actual source of the problem.
A language with non-nullability-by-default in its reference types is no worse than a language with no null. I say this because, again, there will always be situations where you may or may not have a value. For example, grabbing the first item in a list; the list may be empty. Even if you "know" the list contains at least one item, the compiler does not. Even if you check the invariant to ensure that it is true, the case where it is false may be too broken to handle and thus crashing really is the only reasonable thing to do. By the time the type system has reached its limits, you're already boned, as it can't statically prevent the problem. It doesn't matter if this is a nullable reference or if its an Option type.
Because of that, we're not really comparing languages that have null vs languages that don't. We're comparing languages that have references that can be non-nullable (or functionally equivalent: references that can't be null, but optional wrapper types) versus languages that have references that are always nullable. "Always nullable" is so plainly obviously worse that it doesn't warrant any further justification, but the question isn't whether or not it's worse, it's how much worse.
Maybe not a billion dollars worse after all.
P.S.: NaN is very much the same. It's easy to assign blame to NaN, and NaN can indeed cause problems that wouldn't have existed without it. However, if we had signalling NaNs by default everywhere, I strongly suspect that we would still curse NaN, possibly even worse. The problem isn't really NaN. It's the thing that makes NaN necessary to begin with. I'm not defending null as in trying to suggest that it isn't involved in causing problems, instead I'm suggesting that the reasons why we still use null are the true root issue. You really do fix some problems by killing null, but the true root issue still exists even after.
It's overblown until it isn't. Hoare didn't pluck that number from thin air. This is now a solved problem in modern programming languages. If Odin doesn't have this and other essential memory safety features, it's certainly not worth the massive retooling effort.
Not an Odin user, but IIRC Odin also has Go-like zero values. There is no practical option unless you have null. A string, for example, can't be null; it's "at least" an empty string. But what's the base value for a pointer? A function? An interface? Either you wrap (OCaml style) or use null. It's pragmatism vs correctness, a balance as old as computing.
Odin's type system is just different to many other languages, and trying to compare it to others doesn't always work out very well.
`Maybe` does exist in Odin. So if you want a `nil` string either use `Maybe(string)` or `cstring` (which has both `nil` (since it is a pointer) and `""` which is the empty string, a non-nil pointer). Also, Odin doesn't have "interface"s since it's a strictly imperative procedural language.
As for your question about base values, I am not sure what you mean by this. Odin hasn't got an "everything is a pointer internally" approach like many GC'd languages. Odin follows in the C tradition, so the "base value" is just whatever the type is.
Odin offers a Maybe(T) type which might satisfy your itch. It's sort of a compromise. Odin uses multiple-returns with a boolean "ok" value for binary failure-detection. There is actually quite a lot of syntax support for these "optional-ok" situations in Odin, and that's plenty for me. I appreciate the simplicity of handling these things as plain values. I see an argument for moving some of this into the type-system (using Maybe) when it comes to package/API boundaries, but in practice I haven't chosen to use it in Odin.
Maybe(T) would be for my own internal code. I would need to wrap/unwrap Maybe at all interfaces with external code.
In my view a huge value addition from plain C to Zig/Rust has been eliminating the possibility of NULL in the default pointer type. Odin makes the same mistake as Golang did. It's not excusable IMHO in such a new language.
Both Odin and Go have the "zero is default" choice. Every type must have a default and that's what zero signifies for that type. In practice some types shouldn't have such a default, so in these languages that zero state becomes a sentinel value - a value notionally of this type but in fact invalid, just like Hoare's NULL pointer, which means anywhere you didn't check for it, you mustn't assume you have a valid value of that type. Sometimes it is named "null" but even if not it's the same problem.
Even ignoring the practical consequences, this means the programmer probably doesn't understand what their code does, because there are unstated assumptions all over the codebase because their type system doesn't do a good job of writing down what was meant. Almost might as well use B (which doesn't have types).