April 11th, 2024

Tool Command Language

I have this idea for a tool command language. Something similar to TCL, in that it's designed chiefly to be used as an embedded scripting language, and mostly in an interactive context.

This idea has been in the back of my mind for a while, but I've now got the perfect use case for it. I've got a tool at work I use to do occasional admin tasks. At the moment it's implemented as a CLI tool, and it works. But the biggest downside is that it needs to form connections to the cluster to call internal service methods, and it always takes a few seconds to do so. I'd like to be able to use it to automate certain actions, but this delay would make doing so a real hassle.

Some other properties that I'm thinking of:

It should be able to support structured data, similar to how Lisp works.
It should be able to support something similar to pipes, similar to how the shell and Go's template language work.

Some of the trade-offs that come of it:

It doesn't have to be fast. In fact, it can be slow, so long as embedding it and operating it can be fast.
It may not be completely featureful. I'll go over the features I'm thinking of below, but I say upfront that you're not going to be building any cloud services with this. Administering cloud servers, maybe; but leave the real programs to a real language.

Some Notes On The Design

The basic concept is the statement. A statement consists of a command, and zero or more arguments. If you've used a shell before, then you can imagine how this'll look:

firstarg "hello, world"
--> hello, world

Each statement produces a result. Here, the hypothetical firstarg will return the first argument it receives, which will be the string "hello, world".

Statements are separated by newlines or semicolons. In such a sequence, the return value of the last statement is returned:

firstarg "hello" ; firstarg "world"
--> world

I'm hoping to have a similar approach to how Go works, in that semicolons will be needed if multiple statements share a line, but will otherwise be unnecessary. I'm using the Participle parser library for this, and I'll need to work out how to configure the scanner to do this (or even whether using the scanner is the right way to go).
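Here's a rough sketch, in Go, of the separator rule on its own. It deliberately doesn't use the parser library at all: `splitStatements` is a made-up helper that does a plain pre-pass over the source, treating newlines and semicolons the same unless they appear inside a double-quoted string. It doesn't handle escaped quotes or statements that span lines, so take it as an illustration of the rule rather than how the real scanner will work.

```go
package main

import (
	"fmt"
	"strings"
)

// splitStatements is a hypothetical helper. It splits source text into
// statements on newlines or semicolons, ignoring any separator that sits
// inside a double-quoted string. Empty statements (blank lines, trailing
// semicolons) are dropped. Escaped quotes are not handled.
func splitStatements(src string) []string {
	var out []string
	var cur strings.Builder
	inQuote := false

	// flush stores the statement collected so far, if it is non-empty.
	flush := func() {
		if s := strings.TrimSpace(cur.String()); s != "" {
			out = append(out, s)
		}
		cur.Reset()
	}

	for _, r := range src {
		switch {
		case r == '"':
			inQuote = !inQuote
			cur.WriteRune(r)
		case (r == ';' || r == '\n') && !inQuote:
			flush()
		default:
			cur.WriteRune(r)
		}
	}
	flush()
	return out
}

func main() {
	src := "firstarg \"hello\" ; firstarg \"world\"\necho \"a;b\"\n"
	for _, s := range splitStatements(src) {
		fmt.Println(s)
	}
}
```

The semicolon inside `"a;b"` is left alone, while the one between the two firstarg statements splits them.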

The return value of a statement can be used as an argument of another statement by wrapping it in parentheses:

echo (firstarg "hello") " world"
--> hello world

This is taken directly from TCL, except that TCL uses square brackets. I'm reserving the square brackets for data structures, but the parentheses are free. It also gives it a bit of a Lisp feel.
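To make the evaluation order concrete, here's a minimal sketch in Go of how statements and parenthesised sub-statements might be evaluated. The `Statement`, `Arg`, `Command`, and `Interp` types are all names I've made up for illustration, and results are plain strings to keep it short. The firstarg and echo builtins behave as in the examples above, with echo simply concatenating its arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// Arg is either a literal string or a nested statement, i.e. the
// parenthesised (command args...) form. Hypothetical type.
type Arg struct {
	Literal string
	Sub     *Statement // non-nil when the argument is a sub-statement
}

// Statement is a command name followed by zero or more arguments.
type Statement struct {
	Command string
	Args    []Arg
}

// Command is an assumed signature for a builtin.
type Command func(args []string) (string, error)

// Interp holds the registered commands.
type Interp struct {
	commands map[string]Command
}

// NewInterp registers the two example builtins.
func NewInterp() *Interp {
	return &Interp{commands: map[string]Command{
		"firstarg": func(args []string) (string, error) {
			if len(args) == 0 {
				return "", fmt.Errorf("firstarg: expected at least one argument")
			}
			return args[0], nil
		},
		"echo": func(args []string) (string, error) {
			return strings.Join(args, ""), nil
		},
	}}
}

// Eval evaluates any nested statements first, then calls the command
// with the resulting argument values.
func (in *Interp) Eval(st Statement) (string, error) {
	cmd, ok := in.commands[st.Command]
	if !ok {
		return "", fmt.Errorf("unknown command: %s", st.Command)
	}
	args := make([]string, 0, len(st.Args))
	for _, a := range st.Args {
		if a.Sub == nil {
			args = append(args, a.Literal)
			continue
		}
		v, err := in.Eval(*a.Sub)
		if err != nil {
			return "", err
		}
		args = append(args, v)
	}
	return cmd(args)
}

// EvalAll runs statements in order and returns the result of the last one.
func (in *Interp) EvalAll(sts []Statement) (string, error) {
	var last string
	for _, st := range sts {
		v, err := in.Eval(st)
		if err != nil {
			return "", err
		}
		last = v
	}
	return last, nil
}

func main() {
	in := NewInterp()
	// Equivalent to: echo (firstarg "hello") " world"
	st := Statement{Command: "echo", Args: []Arg{
		{Sub: &Statement{Command: "firstarg", Args: []Arg{{Literal: "hello"}}}},
		{Literal: " world"},
	}}
	out, err := in.Eval(st)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```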

Pipelines

Another way for commands to consume the output of other commands is to build pipelines. This is done using the pipe | character:

echo "hello" | toUpper
--> HELLO

Pipeline sources, that is, the commands on the left-most side, can either be commands that produce a single result or commands that produce a "stream". Both are objects, and there's nothing inherently special about a stream, other than that there's some special handling when one is used in a pipeline. Streams are also designed to be consumed once.

For example, one can consider a command which can read a file and produce a stream of the contents:

cat "taleOfTwoCities.txt"
--> It was the best of times,
--> it was the worst of times,
--> …

Not every command is "pipe savvy". For example, piping the result of a pipeline to echo will discard it:

echo "hello" | toUpper | echo "no me"
--> no me

Of course, this may differ based on how the builtins are implemented.
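A small Go sketch of how this could hang together, assuming a pull-style `Stream` interface and a `Stage` function type (both invented here, not the real design). A single result is just a stream of one item, toUpper is "pipe savvy" because it reads its input, and echo isn't because it ignores it.

```go
package main

import (
	"fmt"
	"strings"
)

// Stream yields values one at a time and can only be consumed once.
type Stream interface {
	Next() (string, bool)
}

// sliceStream is a simple Stream backed by a slice.
type sliceStream struct {
	items []string
	pos   int
}

func (s *sliceStream) Next() (string, bool) {
	if s.pos >= len(s.items) {
		return "", false
	}
	v := s.items[s.pos]
	s.pos++
	return v, true
}

// mapStream lazily applies fn to every item of the upstream stream.
type mapStream struct {
	in Stream
	fn func(string) string
}

func (m *mapStream) Next() (string, bool) {
	v, ok := m.in.Next()
	if !ok {
		return "", false
	}
	return m.fn(v), true
}

// Stage is one command in a pipeline. The input is nil for the
// left-most command.
type Stage func(in Stream) Stream

// echo ignores its input, so anything piped into it is discarded.
func echo(text string) Stage {
	return func(in Stream) Stream {
		return &sliceStream{items: []string{text}}
	}
}

// toUpper is pipe savvy: it transforms whatever is piped into it.
func toUpper() Stage {
	return func(in Stream) Stream {
		if in == nil {
			return &sliceStream{}
		}
		return &mapStream{in: in, fn: strings.ToUpper}
	}
}

// runPipeline feeds each stage's output into the next stage and
// collects everything the final stage produces.
func runPipeline(stages ...Stage) []string {
	var cur Stream
	for _, st := range stages {
		cur = st(cur)
	}
	var out []string
	if cur == nil {
		return out
	}
	for {
		v, ok := cur.Next()
		if !ok {
			return out
		}
		out = append(out, v)
	}
}

func main() {
	fmt.Println(runPipeline(echo("hello"), toUpper()))
	fmt.Println(runPipeline(echo("hello"), toUpper(), echo("no me")))
}
```

In the second pipeline the upper-cased stream is never read, which is what "discarding" amounts to in this model.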

Variables

Variables are treated much like TCL and shell, in that referencing them is done using the dollar sign:

set name "Josh"
--> Josh
echo "My name is " $name
--> My name is Josh

I'm not sure how streams will be handled with variables, but I'm wondering if they should be condensed down to a list. I don't like the idea of assigning a stream to a variable: streams can only be consumed once, and I feel like some confusion will come of it if I were to allow this.

Maybe I can take the Perl approach and use a different variable "context", where you have a variable with a @ prefix which will reference a stream.

set file (cat "eg.text")

echo @file
# Echo will consume file as a stream

echo $file
# Echo will consume file as a list

The difference is subtle but may be useful. I'll look out for instances where this would be used.

Attempting to reference an unset variable will result in an error. This may also change.
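One way the two contexts could work, sketched in Go under the assumption that a stream is always condensed to a list when it's assigned. The `Vars` type and its methods are invented for this example, as is `lines`, which stands in for a command like cat. `List` plays the role of `$name`, `Stream` plays the role of `@name`, and both return an error for an unset variable.

```go
package main

import "fmt"

// Vars stores every variable as a list, so a stream assigned to a
// variable is drained into a list at assignment time. Hypothetical type.
type Vars struct {
	values map[string][]string
}

func NewVars() *Vars {
	return &Vars{values: make(map[string][]string)}
}

// Set drains next (a one-shot stream) into a list and stores it.
func (v *Vars) Set(name string, next func() (string, bool)) {
	var items []string
	for {
		item, ok := next()
		if !ok {
			break
		}
		items = append(items, item)
	}
	v.values[name] = items
}

// List is the $name context: it returns the stored list, or an error
// if the variable has never been set.
func (v *Vars) List(name string) ([]string, error) {
	items, ok := v.values[name]
	if !ok {
		return nil, fmt.Errorf("variable %q is not set", name)
	}
	return items, nil
}

// Stream is the @name context: it returns a fresh one-shot iterator
// over the stored list, so each reference can be consumed independently.
func (v *Vars) Stream(name string) (func() (string, bool), error) {
	items, err := v.List(name)
	if err != nil {
		return nil, err
	}
	pos := 0
	return func() (string, bool) {
		if pos >= len(items) {
			return "", false
		}
		item := items[pos]
		pos++
		return item, true
	}, nil
}

// lines is a stand-in for a stream-producing command such as cat.
func lines(items ...string) func() (string, bool) {
	pos := 0
	return func() (string, bool) {
		if pos >= len(items) {
			return "", false
		}
		item := items[pos]
		pos++
		return item, true
	}
}

func main() {
	vars := NewVars()
	vars.Set("file", lines("It was the best of times,", "it was the worst of times,"))

	list, _ := vars.List("file")
	fmt.Println(len(list), "lines stored")

	if _, err := vars.List("missing"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Because the variable holds a list, the "consumed once" problem only applies to each individual `@` reference, not to the variable itself.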

Other Ideas

That's pretty much what I have at the moment. I do have some other ideas, which I'll document below.

Structured Data Support: Think lists and hashes. This language is to be used with structured data, so I think it's important that the language supports this natively. This is unlike TCL, which principally works with strings, and where the notion of lists feels a bit tacked on.

Both lists and hashes are created using square brackets:

# Lists. Not sure if they'll have commas or not
set l [1 2 3 $four (echo "5")]

# Hashes
set m [a:1 "b":2 "see":(echo "3") (echo "dee"):$four]

Blocks: Yep, containers for a group of statements. These will be used for control flow, as well as for defining functions:

set x 4
if (eq $x 4) {
  echo "X == 4"
} else {
  echo "X != 4"
}

foreach [1 2 3] { |x|
  echo $x
}

Here the blocks are just another object type, like strings and streams, and both if and foreach are regular commands which accept a block as an argument. In fact, it would be theoretically possible to write an if statement this way (not sure if I'll allow setting variables to blocks):

set thenPart {
  echo "X == 4"
}

if (eq $x 4) $thenPart

The block execution will exist in a context, which will control whether a new stack frame will be used. Here the if statement will simply use the existing frame, but a block used in a new function can push a new frame, with a new set of variables:

proc myMethod { |x|
  echo $x
}

myMethod "Hello"
--> Hello

Also note the use of |x| at the start of the block. This is used to declare bindable variables, such as function arguments or for loop variables. This will be defined as part of the grammar, and be a property of the block.
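Here's a sketch in Go of the frame handling, with the caveat that `Frame`, `Block`, `CallInline`, and `CallNewFrame` are all names I've invented for the example. A block carries its bindable parameter names and a body. A command like if runs the body in the caller's frame, while calling something defined with proc pushes a fresh frame and binds the arguments to the `|x|` names. I've assumed here that a new frame can't see the caller's variables at all, which may not be how it ends up.

```go
package main

import "fmt"

// Frame is one set of variables. Hypothetical type.
type Frame struct {
	vars map[string]string
}

func NewFrame() *Frame {
	return &Frame{vars: make(map[string]string)}
}

func (f *Frame) Set(name, value string) {
	f.vars[name] = value
}

// Get returns an error when the variable is unset in this frame.
func (f *Frame) Get(name string) (string, error) {
	v, ok := f.vars[name]
	if !ok {
		return "", fmt.Errorf("variable %q is not set", name)
	}
	return v, nil
}

// Block is a body plus the names declared between the pipes, e.g. |x|.
// The body is a Go function here; in the real thing it would be a
// parsed list of statements.
type Block struct {
	Params []string
	Body   func(f *Frame) (string, error)
}

// CallInline runs the block in the caller's frame, as if would.
func (b Block) CallInline(caller *Frame) (string, error) {
	return b.Body(caller)
}

// CallNewFrame pushes a new frame, binds args to Params, and runs the
// block there, as a call to a proc-defined function would.
func (b Block) CallNewFrame(args ...string) (string, error) {
	if len(args) != len(b.Params) {
		return "", fmt.Errorf("expected %d arguments, got %d", len(b.Params), len(args))
	}
	f := NewFrame()
	for i, p := range b.Params {
		f.Set(p, args[i])
	}
	return b.Body(f)
}

func main() {
	top := NewFrame()
	top.Set("x", "4")

	// Roughly: if (eq $x 4) { set seen "yes" ; echo "X == 4" }
	thenPart := Block{Body: func(f *Frame) (string, error) {
		f.Set("seen", "yes")
		return "X == 4", nil
	}}
	out, _ := thenPart.CallInline(top)
	fmt.Println(out)

	// Roughly: proc myMethod { |x| echo $x }
	myMethod := Block{Params: []string{"x"}, Body: func(f *Frame) (string, error) {
		return f.Get("x")
	}}
	out, _ = myMethod.CallNewFrame("Hello")
	fmt.Println(out)
}
```

After the inline call, `seen` is set in the top-level frame. The call to `myMethod` gets its own `x` and leaves the top-level `x` untouched.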

Anyway, that's the current idea.