01 Mar 24

Tools built on top of large language models can produce impressive-looking code in response to text prompts. Is this enough to enable non-technical people to produce real software?

Generative AI and large language models (LLMs) have attracted serious attention recently, and one of their many accomplishments is the ability to produce code. These technologies are still relatively new and have their fair share of challenges, including hallucinations and copyright disputes. Assuming all of these other problems can somehow be addressed, how does the prospect of using natural language as a kind of programming language really stack up?

On the face of it, natural language seems like the ultimate way to code. After all, most people over the age of four already have some level of proficiency. So in theory, nearly everybody should immediately be able to program - without lengthy training or having to memorise huge quantities of technical detail. That is quite a compelling prospect.

To an extent it seems to deliver. For example, entering a text prompt such as "give me the code for the factorial of every factorial in JavaScript" into a tool like Microsoft Copilot returns a plausible-looking solution - along with some nicely written narrative. You can also try different programming languages, ask for test cases and even get it to calculate results with ridiculously large return values.
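The kind of response such a prompt produces can be sketched as follows. This is an illustrative example, not Copilot's actual output - a plain JavaScript factorial using BigInt so that very large results do not overflow:

```javascript
// Illustrative sketch of the kind of code an LLM might return for a
// factorial prompt - not the actual output of any particular tool.
// BigInt arithmetic is used so very large results remain exact.
function factorial(n) {
  let result = 1n;                      // BigInt literal
  for (let i = 2n; i <= BigInt(n); i++) {
    result *= i;
  }
  return result;
}

console.log(factorial(5));   // 120n
console.log(factorial(25));  // 15511210043330985984000000n
```

To a non-programmer this looks entirely convincing - which is precisely the problem the next section explores.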

However, unless the person using the tool can code, they have no real idea what the proposed solution actually does. Yes, they can read the text description - if there is one - but there is no guarantee that this is correct or even relevant. The input prompt isn't much help either. Natural language can be vague and is often ambiguous - like the factorial example above, which can be interpreted in a number of ways. Language also tends to be laden with assumptions, which are by definition unstated. The only way you can be sure that you got the code you wanted - or even the code you asked for - is either to understand the code itself or to produce your own comprehensive set of test cases. If you were using Zoea you would only need to produce the test cases.
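Checking generated code against test cases need not involve reading the code at all. As a minimal sketch - assuming the intended meaning was an ordinary factorial function, and with `candidateFactorial` standing in for whatever code the tool produced - a non-programmer could specify expected input/output pairs and have them checked automatically:

```javascript
// Illustrative sketch: behaviour specified as input/output pairs lets a
// non-programmer check generated code without understanding it.
// candidateFactorial stands in for the tool's generated code.
function candidateFactorial(n) {
  return n <= 1 ? 1 : n * candidateFactorial(n - 1);
}

// The pairs a stakeholder writes down - no coding knowledge required.
const testCases = [
  { input: 0, expected: 1 },
  { input: 1, expected: 1 },
  { input: 5, expected: 120 },
  { input: 10, expected: 3628800 },
];

// Keep any cases where the generated code disagrees with the expectation.
const failures = testCases.filter(
  ({ input, expected }) => candidateFactorial(input) !== expected
);
console.log(failures.length === 0 ? "all tests pass" : failures);
```

Of course, the test cases are only as good as the stakeholder's understanding of what the software should do - which is where requirements analysis comes in.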

Nevertheless, non-technical humans have been causing software to be created - largely through the medium of natural language - for nearly as long as computers have existed. These people are the stakeholders in any software development project. Stakeholders are often experienced end users who know a lot about their own domain but usually know little about software, and - without help - they can even struggle to articulate software requirements.

Analysing software requirements is a skill that is different from, but just as important as, coding. It involves eliciting, capturing and structuring requirements, as well as identifying and resolving gaps and contradictions. Good analysts also understand the broader business and technological contexts - so they can spot problematic requirements that just don't work or that might jeopardise security, performance or usability. So while stakeholders are the source of most software requirements, the end result of requirements analysis is a synthesis of both stakeholder and analyst knowledge and experience.

People can interact with computers about software in one of two possible ways. They can talk about the software in software terms - in which case all parties need to be able to code. Alternatively, they can communicate in terms of the software requirements. This requires some form of explicit requirements model that all parties can examine, understand and modify. Modern AI makes extensive use of models, but these are highly abstract, mathematical and opaque. Requirements modelling, on the other hand, has more in common with classic approaches to AI - both in how knowledge is represented and in how it is reasoned with.

Natural language plays an important role in describing software but it is not always the best approach. Most non-trivial code has multiple execution paths that account for different scenarios or the various ways the software can be used. While it is certainly possible to describe all of this using text, the result quickly becomes long, tedious and impossible to comprehend - increasing the risk of errors and omissions. This is why people use diagrams and tabular notations to describe software - in conjunction with text. If anything, the prospect of AI being involved in requirements analysis and code generation only increases the need for such multi-channel communication.

There is also a social dimension to coding that needs to be considered. Most software development involves multiple stakeholders, analysts and developers. This requires multi-party communication - which is not the same thing as many separate one-to-one conversations. Indeed, social collaboration of this sort is a more fundamental problem than a domain-specific instance like software development. This implies it is a problem that needs to be solved first - it is not just something that can be bolted on afterwards.

Going forward, natural language clearly has a role to play, but on its own it is not really sufficient to allow non-technical people to develop real (non-trivial) software. The tools, for their part, also have to do more than just code. They will need to carry out requirements analysis, and this requires skills like the ability to reason - something that LLMs aren't particularly good at and indeed were never designed to do. The open question is how these gaps will be filled.