May 10th, 2024

Indexing In UCL


I've been thinking a little about how to support indexing in UCL, as in getting elements from a list or keyed values from a map.  There already exists an index builtin that does this, but I'm wondering if this can be, or even should be, supported in the language itself.

I've reserved . for this, and it'll be relatively easy to use it to get map fields. But I do have some concerns about supporting list element dereferencing using square brackets. The big one is that if I were to use square brackets the same way many other languages do, I suspect (although I haven't confirmed) that the parser would treat the value and the bracketed index as two separate arguments, the second being a list literal. This is because the scanner ignores whitespace, and there are no other syntactic indicators, like commas, to separate arguments to proc calls:

echo $x[4]      --> echo $x [4]
echo [1 2 3][2] --> echo [1 2 3] [2]
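
To make the concern concrete, here's a rough sketch in Go of a scanner that throws whitespace away. This is not the actual UCL scanner: the token set and the names scan and joined are made up for illustration. It only shows that once whitespace is dropped, both spellings produce the same tokens, so the parser has nothing to tell them apart with.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// scan splits src into tokens and discards whitespace.
// Hypothetical and heavily simplified: only brackets and bare words exist.
func scan(src string) []string {
	var toks []string
	rs := []rune(src)
	for i := 0; i < len(rs); {
		r := rs[i]
		switch {
		case unicode.IsSpace(r):
			i++ // whitespace is dropped; no trace of it reaches the parser
		case r == '[' || r == ']':
			toks = append(toks, string(r))
			i++
		default:
			// Read a bare word up to the next space or bracket.
			j := i
			for j < len(rs) && !unicode.IsSpace(rs[j]) && rs[j] != '[' && rs[j] != ']' {
				j++
			}
			toks = append(toks, string(rs[i:j]))
			i = j
		}
	}
	return toks
}

// joined renders the token stream with single spaces so two streams can be compared.
func joined(src string) string {
	return strings.Join(scan(src), " ")
}

func main() {
	fmt.Println(joined("echo $x[4]"))
	fmt.Println(joined("echo $x [4]"))
	fmt.Println(joined("echo $x[4]") == joined("echo $x [4]"))
}
```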

So I'm not sure what to do here. I'd like to add support for . for map fields, but it feels strange doing just that and having nothing for list elements.

I can think of three ways to address this.

Do Nothing: the first option is easy. Don't add any new syntax to the language and just rely on the index builtin. Tcl does this with lindex, as does Lisp with nth, so I'll be in good company here.

Use Only The Dot: the second option is to add support for the dot and not the square brackets. This is what the Go templating language does for map keys and struct fields. It also has an index builtin, which works with slice elements.
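
For reference, this is roughly what that looks like with Go's text/template package. The template text and the data (a "hello" key and an "items" slice) are made up for the example, and render is just a hypothetical helper; the dot syntax and the index builtin are the real thing.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render parses src as a template and executes it against data.
// A hypothetical helper for this example, not part of text/template.
func render(src string, data any) (string, error) {
	t, err := template.New("demo").Parse(src)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	data := map[string]any{
		"hello": "world",
		"items": []string{"a", "b", "c"},
	}
	// .hello uses the dot to look up a map key;
	// index handles the slice element, which the dot cannot reach.
	out, err := render(`{{.hello}} {{index .items 2}}`, data)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Note that index also accepts map keys, so the dot is really a shorthand for the map case only.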

I'd probably do something similar, but I may extend it to support list elements as well. Getting the value of a field would work as you'd expect, but to get an element of a list, the construct .(x) can be used:

echo $x.hello     # returns the "hello" field
echo $x.(4)       # returns the fourth element of a list

One benefit of this could be that the .(x) construct would itself be a pipeline, meaning that string and calculated values could be used as well:

echo $x.("hello")
echo $x.($key)
echo $x.([1 2 3] | len)
echo $x.("hello" | toUpper)

I can probably get away with supporting this without changing the scanner or compromising the language design too much. It would be nice to support ditching the dot completely when using the parentheses, as in BASIC, but I'd probably run into the same issues as with the square brackets if I did, so I think that's out.

Use Parentheses To Be Explicit: the last option is to use square brackets, and modify the grammar slightly to only allow suffix expansion within parentheses. That way, if you want to pass a list element as an argument, you have to use parentheses:

echo ($x[4])       # fourth element of $x
echo $x[4]         # $x, along with a list containing "4"

This is what you'd see in more functional languages like Elm and, I think, Haskell. I'll have to see whether this could work with changes to the scanner and parser if I were to go with this option. I think it may be achievable, although I'm not sure how.

An alternative might be to go the other way, and modify the grammar rules so that the square brackets bind more tightly to the preceding value, which would mean that a separate list argument following another value would need to be in parentheses:

echo $x[4]         # fourth element of $x
echo $x ([4])      # $x, along with a list containing "4"

Or I could modify the scanner to recognise whitespace characters, and use that as a guide to determine whether square brackets following a value are an element suffix or a separate list. No space before the square bracket would mean it's an element suffix, and at least one space would mean two separate values.
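
Here's a rough sketch in Go of how that whitespace rule might work, assuming the rule is that a bracket attached directly to the preceding value is an element suffix. None of this is the real UCL scanner: the token type, the spaceBefore flag and the bracketKinds helper are all invented for illustration.

```go
package main

import (
	"fmt"
	"unicode"
)

// token is a hypothetical token that remembers whether whitespace preceded it.
type token struct {
	text        string
	spaceBefore bool
}

// scan still drops whitespace, but records its presence on the next token.
func scan(src string) []token {
	var toks []token
	rs := []rune(src)
	space := false
	for i := 0; i < len(rs); {
		r := rs[i]
		switch {
		case unicode.IsSpace(r):
			space = true
			i++
		case r == '[' || r == ']':
			toks = append(toks, token{string(r), space})
			space = false
			i++
		default:
			// Read a bare word up to the next space or bracket.
			j := i
			for j < len(rs) && !unicode.IsSpace(rs[j]) && rs[j] != '[' && rs[j] != ']' {
				j++
			}
			toks = append(toks, token{string(rs[i:j]), space})
			space = false
			i = j
		}
	}
	return toks
}

// bracketKinds reports, for each "[" in src, whether it would be read as an
// "index" suffix (attached directly to a preceding value) or as a new "list".
func bracketKinds(src string) []string {
	var kinds []string
	toks := scan(src)
	for i, t := range toks {
		if t.text != "[" {
			continue
		}
		// A "[" directly after another "[" opens a nested list, not a suffix.
		if i > 0 && !t.spaceBefore && toks[i-1].text != "[" {
			kinds = append(kinds, "index")
		} else {
			kinds = append(kinds, "list")
		}
	}
	return kinds
}

func main() {
	fmt.Println(bracketKinds("echo $x[4]"))
	fmt.Println(bracketKinds("echo $x [4]"))
	fmt.Println(bracketKinds("echo [1 2 3][2]"))
}
```

The catch is that whitespace becomes significant, so a stray space silently changes what a command means.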

So that's where I am at the moment. I guess it all comes down to what works best for the language as a whole. I can live with option one, but it would be nice to have the syntax. I'd rather not go with option three, as I'd like to keep the parser simple (I'd rather not add to all the new-line complexities I already have).

Option two would probably be the least compromising to the design as a whole, even if the aesthetics are a bit strange. I can probably get used to them though, and I do like the idea of index expressions being pipelines themselves. I may give option two a try and see how it goes.

Anyway, more on this later.