How Do I Create My Own Scripting Language?

Background

We are currently developing an exciting new feature for Core: a built-in scripting language. Development is well underway, and we will be launching it before the end of the year. But it has led to some interesting questions and challenges along the way.

This blog post is not intended to be a detailed technical specification. It is more about the general approach to the problem and, hopefully, it will have some useful tips for anyone embarking on a similar journey.

Step 1 – Sketch out some Use Cases

Before we start on anything, it is normally a good idea to ask two questions:

  • What are we trying to achieve?
  • Why do we think it will be useful?

If you don’t have good answers to those questions, the probability of success will be low.

In our case, we are currently focused on how we can make Core better as a CRM solution. That means automating things users would otherwise have to do for themselves, which saves them time and (hopefully) leads to more happy customers.

So, with that in mind, we made a long list of things the initial feature set needs to support. The following are just a few examples that relate to Invoices:

  • Simple Arithmetic
    • Net Amount = Quantity x Unit Price
    • Tax Amount = Net Amount x Tax Rate
    • Gross Amount = Net Amount + Tax Amount
  • String Formatting
    • Invoice Number = “INV-000001” (Where the number is auto generated)
  • Date Calculations
    • Due Date = Invoice Date + Settlement Period
  • Aggregate Functions
    • Invoice Total = Sum of Invoice Item Amounts
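
To make those use cases concrete, here is a minimal sketch of the same calculations in Python. Python is only standing in for the scripting language here, and every name in it (InvoiceItem, invoice_number, due_date, invoice_total) is an illustrative assumption rather than anything from Core. It also assumes the settlement period is a number of days and that the invoice total sums gross amounts.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass
class InvoiceItem:
    """Hypothetical invoice line, used only for illustration."""
    quantity: Decimal
    unit_price: Decimal
    tax_rate: Decimal  # e.g. Decimal("0.20") for 20%

    @property
    def net_amount(self) -> Decimal:
        # Net Amount = Quantity x Unit Price
        return self.quantity * self.unit_price

    @property
    def tax_amount(self) -> Decimal:
        # Tax Amount = Net Amount x Tax Rate
        return self.net_amount * self.tax_rate

    @property
    def gross_amount(self) -> Decimal:
        # Gross Amount = Net Amount + Tax Amount
        return self.net_amount + self.tax_amount


def invoice_number(sequence: int) -> str:
    # String formatting: zero-pad the auto-generated number to six digits.
    return f"INV-{sequence:06d}"


def due_date(invoice_date: date, settlement_days: int) -> date:
    # Date calculation (assumes the settlement period is given in days).
    return invoice_date + timedelta(days=settlement_days)


def invoice_total(items: list[InvoiceItem]) -> Decimal:
    # Aggregate function (assumes the total is the sum of gross amounts).
    return sum((item.gross_amount for item in items), Decimal("0"))
```

Decimal is used instead of float so that money amounts do not pick up binary rounding errors.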

Once we have a feature set capable of supporting the initial use cases, there will be many more things a creative user can do with those features. There may be other use cases that are still not possible, but we can expand the feature set later as required.

Step 2 – Language Definition

Initially, we thought we could go straight from example use cases to creating a language definition document. But as soon as you start trying to write down a language definition, you realise just how many questions there are. Then you realise that many of those questions depend on answering some more general questions of principle.

This led us to a very helpful intermediate step:

Step 2A – Agree Some Language Principles

For this step, you need to get some people into the room. Yes, we still like the old-fashioned gather-round-the-whiteboard approach, but you could do it on Teams or Zoom if you prefer. Either way, it’s very important to involve the type of people the language is intended for.

In our case, we are very much aimed at smart but non-technical people. This is not a language for programming gurus, but nor is it scripting for dummies. So, we pulled in the smartest people we know, who are not developers or technical specialists.

Should we copy an existing language?

This is an interesting question. If we can assume some (maybe most) of our users are familiar with Excel, would it ease adoption if we copied that language? What about other languages like JavaScript or SQL?

Our conclusion was that most of our users might have some familiarity with other scripting languages but will probably not be totally proficient in any of them. Any short-term benefits in this area are likely to be swallowed up by added frustration whenever our language differs from the one it supposedly copies, or wherever we don’t support some specific feature the user loves.

So, unless you are going to make an exact subset or superset of another language, you should probably allow yourself to be influenced by a language you like, but don’t try to copy it too closely.

Should we use SHOUTY CAPS or trendy lower case?

Any scripting language needs to have keywords of some sort (unless you’re talking about APL, which only gets around it by inventing special keyboard characters). My view is that this is strictly a matter of taste, and if you ask three people you will get four different opinions.

We went for SHOUTY CAPS, because we are so trendy, we are ahead of the lower-case people and ready in advance for the next wave.

How many brackets do we need (and what kind)?

We want some kind of conditional logic (if this, then that, etc.), but do we want:

  • Excel style: IF(condition, do this, do that)
  • JavaScript style: if(condition){ do this } else { do that }
  • SQL style: IF condition THEN do this ELSE do that END

We went for the SQL style, which seemed most intuitive to the non-coders.

How difficult are dates?

Several of our use cases involve working with dates.

  • Date plus Interval (in days/months/years)
  • Difference between two dates (in days/months/years)

Do we want things that look like arithmetic:

  • Today + 3 Days
  • End Date – Start Date

Or things more like functions:

  • DATE_ADD(Today, 3, DAY)
  • DATE_DIFF(End Date, Start Date, DAY)

There were good arguments on both sides, but we went for the functional approach in the end, mostly on the principle that the plus and minus operators should have one and only one job. See below for more on that.
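
As a rough illustration of the functional style, here is what a DATE_ADD might look like if sketched in Python. This is not our implementation. The name date_add, the unit strings, and in particular the rule of clamping to the last day of a shorter month (31 January plus one month gives 28 February) are assumptions made for the example.

```python
import calendar
from datetime import date, timedelta


def date_add(start: date, amount: int, unit: str) -> date:
    """Hypothetical DATE_ADD(start, amount, unit); unit is DAY, MONTH or YEAR."""
    unit = unit.upper()
    if unit == "DAY":
        return start + timedelta(days=amount)
    if unit == "MONTH":
        months = amount
    elif unit == "YEAR":
        months = amount * 12
    else:
        raise ValueError(f"Unknown unit: {unit!r}")

    # Work with a zero-based month count so divmod also handles negative amounts.
    index = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1

    # Assumption: if the target month is shorter, clamp to its last day.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))
```

With a function, awkward decisions like the clamping rule live in one named place, rather than hiding behind a plus sign.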

Another question that cropped up about dates, and this one could keep you going forever: What do we mean when we ask for the difference in months between two dates?

Clearly: 14/02/2023 to 14/03/2023 is 1 month.

But what about: 14/02/2023 to 13/03/2023?

  • Zero
  • 0.88
  • 0.96
  • Something else?

What exactly is 0.96 of a month? Sorry, what was the question?

I don’t claim to have a good answer to this and certainly not “the” answer. I only have “an” answer. We interpret the question to mean: How many full months are there between the two dates? So:

  • 14/02/2023 to 13/03/2023 = 0 Months
  • 14/02/2023 to 14/03/2023 = 1 Month

It’s clear, it’s simple and if it’s not what you want, then please ask a different question.
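
The "full months" rule is short enough to sketch in a few lines of Python. The function name is hypothetical, and returning a negative count when the dates are passed in reverse order is an assumption for the example, not something we have decided above.

```python
from datetime import date


def full_months_between(start: date, end: date) -> int:
    """Count the complete months from start to end."""
    if end < start:
        # Assumption: reversed arguments give the negated result.
        return -full_months_between(end, start)

    months = (end.year - start.year) * 12 + (end.month - start.month)
    # The final month only counts once we reach the same day of the month.
    if end.day < start.day:
        months -= 1
    return months
```

For the two examples above, full_months_between(date(2023, 2, 14), date(2023, 3, 13)) gives 0 and full_months_between(date(2023, 2, 14), date(2023, 3, 14)) gives 1. One consequence worth knowing: under this sketch, 31 January to 28 February counts as 0 full months.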

How helpful should we try to be?

There are some things that seem similar at first but are quite different if you really think about it.

If we want to add together two numbers, we all agree it should look like this: 2 + 3.

So, if we want to add together two strings, shouldn’t it look like this: “ABC” + “DEF”?

Well maybe, but what about these:

  • “ABC” + 1
  • “ABC” + “1”
  • “1” + “1”

It’s not clear what some of those even mean. What does the user expect if they try to join a string and a number, or two strings that both contain numeric digits?

Am I the only one who thinks “1” + “1” = “11” might be slightly unexpected?

The fundamental choice is between:

  1. Try to be helpful (PHP I’m looking at you)
  2. Try to be an obstinate jobsworth (Java anyone?)

Don’t get me wrong, I love PHP, but trying to be helpful is not always, well, helpful. I guess it depends on whether you prefer something that appears to work but might be broken in hard-to-spot ways, or something that is obviously broken, with an in-your-face error message.

I know, we all want it to not be broken at all. But trust me, it will be broken until it’s fixed. The only question is how soon do you want to know that it’s broken so you can fix it?

I’m sorry Mrs Smith, I’m afraid the surgeon has removed the wrong kidney. He was supposed to be operating on Mr Jones, but the scheduling software seems to think that “9” comes after “11”, I can’t imagine why.

In conclusion, we have gone for a more strictly typed approach. The plus operator only does one thing: it adds together two numbers. If you try to add a string, you get an error. If you want to join strings, you need to use a different operator:

  • “ABC” & “DEF” = “ABCDEF”

If you are confident your string only contains numbers, then you can convert it to a number:

  • 1 + INT(“1”) = 2

Similarly, you can convert a number to a string:

  • STRING(1) & “1” = “11”
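
Here is a minimal Python sketch of those rules, with plain functions standing in for the operators. The names plus, concat, to_int and to_string are made up for the example, and the choice of which exception to raise is an assumption.

```python
def plus(left, right):
    """The + operator: numbers only, anything else is an error."""
    for value in (left, right):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"+ expects numbers, got {value!r}")
    return left + right


def concat(left, right):
    """The & operator: strings only, no silent conversion."""
    for value in (left, right):
        if not isinstance(value, str):
            raise TypeError(f"& expects strings, got {value!r}")
    return left + right


def to_int(text: str) -> int:
    """INT(...): explicit conversion; raises ValueError for text like "ABC"."""
    return int(text)


def to_string(value) -> str:
    """STRING(...): explicit conversion from a number to a string."""
    return str(value)
```

So plus(1, to_int("1")) gives 2 and concat(to_string(1), "1") gives "11", while plus("ABC", 1) fails immediately with an error. It also shows where the kidney mix-up comes from: compared as strings, "9" > "11" is true in Python, because the comparison goes character by character.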

It does seem a bit pedantic at first, but in the long run it saves a lot of time troubleshooting obscure bugs.

Step 2B – Go Back to Writing the Language Definition

By this stage, the language definition pretty much writes itself and there’s not too much else to say. The only hard questions left are of the form: do we need this feature in version 1, or can it wait until later?

Step 3 – Implement it all in Code

If you read my previous post about ChatGPT, you will know that implementing it in code is the easy part. You will have to wait for a later blog post to find out about some of the interesting challenges in implementation.

Step 4 – Release it to the Public

The moment we have all been waiting for. Release it into the wild and wait for the applause. What do you mean, bug reports, feature requests and mild disappointment that it doesn’t make your coffee for you? Have you ever released any software? That’s not how it works.

If you would like to discuss this further, book a chat with one of our friendly team, who would be happy to help you.
