
> Note that Fortran supports arrays with an arbitrary starting index, in this case -2. So this table supports indices in the range [-2, 9].

That is such a useful feature! Surprised I haven’t seen that more often. So much fiddly code exists that’s just fixing offsets to conform to 0-based (or, for Lua, 1-based) indexing!



.NET supports this because [Visual] Basic supports it. This can be used from C# - and other languages - but there is no nice syntax supporting it.

  // This also supports multidimensional arrays, that is why the parameters are arrays.
  var array = Array.CreateInstance(elementType: typeof(Int32), lengths: [ 5 ], lowerBounds: [ -2 ]);

  // This does not compile: the variable's static type is System.Array, which has no indexer (the runtime type is Int32[*], not Int32[]).
  // Console.WriteLine(array[0]);

  array.SetValue(value: 42, index: -2);

  Console.WriteLine(array.GetValue(-2));


TIL thanks!


Unfortunately, Fortran's implementation of this has some inconsistencies. Doing certain operations will convert from the custom indexing back to 1-based indexing.

https://github.com/sourceryinstitute/fidbits/blob/master/src...

https://fortran-lang.discourse.group/t/just-say-no-to-non-de...


Worse than the pitfalls that can arise with a correct compiler is the fact that most Fortran compilers have bugs with non-default lower bounds -- and they're not always the same bugs, so portability is a real problem. The feature is fine as it stood with Fortran '77 dummy arrays.


Pascal (and Modula if I'm not mistaken) supports this too.


Ada too, which is of course Pascal-based. Like many programming language features, it feels like it was lost simply because C didn't have it and everyone wanted to copy C.


Reminds me of this fantastic talk: https://www.youtube.com/watch?v=wo84LFzx5nI


Many BASIC dialects as well.

Hence the whole base index discussion only became relevant in C-based languages.


In C, can't you just offset the pointer and then index with an arbitrary starting value?


Yes. I do this a lot when writing linear algebra stuff. All the math texts write things in 1-based notation for matrices. The closer I can make the code match the paper I'm implementing, the easier life gets. Of course there's a big comment at the beginning of the function when I modify the pointer to explain why I'm doing it.


Technically no: forming a pointer that points outside its array (other than one past the end) is undefined behaviour, even if it is never dereferenced. More importantly for this discussion, without support from the language it's not very ergonomic to work with. What happens when you need to call strlen, memcpy, or free?


It works in this case where you want to move the zero index forward a few cells to a valid offset. It is only UB for the general case where the offset may land outside valid memory. C has always supported negative indices, so moving index zero forward into the middle of the array is fine.
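A minimal sketch of that well-defined case, assuming a table with indices in [-2, 9] like the Fortran example quoted at the top. The names `storage`, `table`, and `fill_squares` are made up for illustration. The shifted pointer always stays inside the backing array, so negative indices through it are ordinary pointer arithmetic.

```c
#include <assert.h>

enum { LOWER = -2, UPPER = 9, CELLS = UPPER - LOWER + 1 };

/* Backing storage for the 12 logical indices LOWER..UPPER. */
int storage[CELLS];

/* Points at the cell that stands for logical index 0. Because
   storage + 2 lies inside the array, forming it is well-defined. */
int *const table = storage + (0 - LOWER);

/* Fill table[i] = i * i for every logical index i. */
void fill_squares(void)
{
    for (int i = LOWER; i <= UPPER; ++i) {
        /* table[i] is *(table + i); for i == LOWER that is storage[0]. */
        assert(table + i >= storage && table + i < storage + CELLS);
        table[i] = i * i;
    }
}
```

Note that anything that needs the real start of the allocation (memcpy, free, and so on) still has to be given `storage`, not `table`.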


Lua actually has arbitrary indexing, it's just that some iterator functions in the standard library assume arrays begin at 1.


It does and doesn't. You can have any arbitrary index, but that changes the table from being an array to being both an array and a dictionary, some real weird Frankenstein stuff.


> but that changes the table from being an array to being both an array and a dictionary

You're confusing the definition of the language with the implementation. In implementation you're right, most runtimes will treat arrays starting at 1 as a special case and optimize that access. The language itself doesn't make that distinction though. Here an array is simply any table indexed by integers. The documentation states it thus:

    You can start an array at index 0, 1, or any other value


The feature worked fine and was portable in Fortran ‘77, but its interactions with modern features are full of shocking pitfalls even when the compiler implements them correctly, which only two do, so it’s not really portable either.


I'd agree with 1-based indexing problems, but not 0-based, which seems very natural. And if you have -2-based, I'd argue that perhaps you don't want an array.


I think it's down to personal preferences/how you think. I haven't actually used any languages that didn't have 0 based indexing, but I remember it being very painful and super unintuitive to learn, it just didn't make sense at all (still doesn't, but it's not a problem for me anymore). I always thought 1 based would make a lot more sense and be way easier to learn.


The zero-based approach makes sense in C, where it is syntactic sugar and `a[i]` is equal to `*(a + i)`. Treating the first element as being at an offset of 0 is logical.

The more you go away from raw pointer semantics the less intuitive it gets.


For extra fun, in C you can write i[a], since addition is commutative: *(i+a) == *(a+i)
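A small sketch of that equivalence, for anyone who wants to see it compile. The helper name `same_element` is hypothetical. It compares addresses rather than values, so it checks that all three spellings name the same element.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a[i], *(a + i) and i[a] all designate the same element. */
int same_element(const int *a, int i)
{
    assert(a != NULL);
    return &a[i] == a + i      /* a[i] is defined as *(a + i) */
        && &i[a] == a + i;     /* i[a] is *(i + a), the same address */
}
```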


I wonder how many of the "off by one" issues/bugs we encounter in the wild are because of arrays typically using 0-based indexing vs 1-based indexing.


None, in my experience. 1-based is far, far more likely to introduce errors, as you have to keep adjusting the index to the algorithm and/or what is actually going on underneath.


> as you have to keep adjusting the index to the algorithm and/or what is actually going on underneath.

Yeah, that's mixing both of them. Wouldn't it work as well if they all used 1-based indexing or 0-based indexing? Sounds like the issue was that algorithms/stuff underneath wasn't 1-based.


I would say it is sensible. Given that an array index is an unsigned integer, what are you going to do with an index of zero?

Perhaps I've been influenced by writing a lot of code in assembler, way back when, but zero-based has always seemed completely natural to me, to the extent that I find it very hard to understand algorithms expressed in non-zero based code.


An array index can be signed with no problem. If you're worried about the address calculations, well, the address of an array doesn't have to be the address of its initial element, does it? It can be the address of the element with index 0 (even if the array does not have such an index at all):

    arr: array[-2..10] of integer;

    pointer(@arr) == pointer(@arr[0])

Or you can use descriptors (dope vectors), but that involves quite an overhead. The books on compilers from the 70s (e.g. Gries's "Compiler construction for digital computers") have rather extensive discussions on both approaches.
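A rough sketch of the descriptor approach, assuming a one-dimensional array of int. The struct layout and the names `int_array_desc` and `desc_at` are invented here, not taken from any particular compiler. The "address of element 0" trick is something a compiler can do internally; portable C cannot always form that pointer legally (for example when the lower bound is positive), so this sketch subtracts the lower bound on each access instead.

```c
#include <assert.h>

/* Hypothetical descriptor ("dope vector") for a 1-D int array. */
struct int_array_desc {
    int *base;   /* address of the first stored element */
    int  lower;  /* lowest valid logical index */
    int  upper;  /* highest valid logical index */
};

/* Translate a logical index into the address of its element.
   The extra subtraction and bounds check are the overhead. */
int *desc_at(const struct int_array_desc *d, int index)
{
    assert(index >= d->lower && index <= d->upper);
    return d->base + (index - d->lower);
}
```

For the `array[-2..10]` declaration above, the descriptor would carry `lower = -2`, `upper = 10`, and a base pointing at 13 cells.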


Yes, I'm familiar with Pascal indexes, but I don't find them very natural for the kind of programs I write. I want functions/classes to do any sort of translation.

> The books on compilers from the 70s (e.g. Gries's "Compiler construction for digital computers")

Yellow cover, I think? My then GF bought it for Xmas in about 1984 or so. Not the best book on the topic, IMHO.


I think encountering C arrays and pointers rewrote my brain so that 0 based indexing made more sense even though up to that point (~40 years ago) I'd only used languages that used 1 based indexing (Basic and Pascal).


0-based is much simpler for any mathematical calculation done with the indexes. The only cases I can think of where you need to adjust are getting the last index from the length of the array and interacting with the user. With 1-based you'll need to subtract or add 1 all over the place when doing almost anything.
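One common case of that index math is flattening a (row, column) pair into a single index for row-major storage. The sketch below compares the two conventions; the function names `flat_index0` and `flat_index1` are made up for illustration.

```c
#include <assert.h>

/* 0-based: row >= 0, col in [0, cols); result starts at 0. */
int flat_index0(int row, int col, int cols)
{
    assert(row >= 0 && col >= 0 && col < cols);
    return row * cols + col;
}

/* 1-based: row >= 1, col in [1, cols]; result starts at 1.
   Both inputs are shifted down and the result is shifted back up. */
int flat_index1(int row, int col, int cols)
{
    assert(row >= 1 && col >= 1 && col <= cols);
    return (row - 1) * cols + (col - 1) + 1;
}
```

The 1-based version needs three adjustments where the 0-based version needs none, which is the kind of bookkeeping the comment above is describing.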


Not sure why I would want that.

Now to get the 3rd element from the array, you have to know the start index, so that's another parameter to pass to the function.


Your issue is that you are trying to work out how to use this feature as a zero-based array.

EDIT (for more explanation): I have an input value from -10 to 100. I want to use this value to look up something in a table. In a zero-indexed world I have to know what the lowest value is and subtract that from the input value to get to zero (so "another parameter to pass to function").

With an arbitrary start index the array is just indexed from the lowest value (-10). There is nothing more needing to be passed in.
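For comparison, here is roughly what the zero-indexed version of that lookup looks like in C, assuming the -10 to 100 input range from the comment. The names `LOWEST`, `HIGHEST`, `lookup_table`, and `lookup` are hypothetical. The lowest value has to be known at every access, which is the extra piece of information an arbitrary start index would carry with the array itself.

```c
#include <assert.h>

enum { LOWEST = -10, HIGHEST = 100, SLOTS = HIGHEST - LOWEST + 1 };

/* Zero-based storage: slot 0 holds the entry for input value LOWEST. */
int lookup_table[SLOTS];

/* Every access translates the input value into a zero-based slot. */
int lookup(int value)
{
    assert(value >= LOWEST && value <= HIGHEST);
    return lookup_table[value - LOWEST];
}
```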


You might want the element at "position" 0 though (which with the origin at -2 would be the 3rd element). E.g. treat the array index as a coordinate in a 1D coordinate system with user-defined origin.


When you pass the array to a function, it is 1-indexed by default in the body of that function, unless that function sets a specific starting index.



