Posted on 14th May 2024

The design of applied AI systems

When computers first became commercially available, there was a rush among organisations that could afford to invest in them. These mysterious trophies that started to appear in all kinds of workplaces would be proudly displayed to important people at every opportunity. There was just one problem: many of these organisations had no idea what they were going to use them for, or how to operate them.

Some parallels can definitely be drawn with the AI gold rush that has been happening for a while now, although it is up for debate whether the impact of this wave of AI will be as significant. Such is the hype that there will undoubtedly be a lot of resources expended on aborted projects over the next few years. But much like the coming of the computer age, increasing numbers of useful applications will inevitably be discovered.

In my opinion this is a time when thorough research into applied AI will prove valuable in the long term. Whether it is carried out in academic settings, commercial environments or the home office of a curious 'amateur' tinkering with these things, true value will be derived from thoughtful experimentation and documentation of projects that incorporate AI. I'd like to draw a distinction between the kind of research I'm referring to and the all-in 'cult of AI' bonanza most of the tech industry has been indulging in lately.

A good example of promising research came onto my radar recently from one of the folks on the UK government's digital team (GDS). A side project by Tim Paul (detailed on his blog here) for generating forms from images resonated with me. Digitised forms are an area I've worked in for some time, and I have a particular interest in both the UX and the overall systems design involved in efficiently building usable forms to replace or supplement their paper counterparts.

[Image: A screenshot of Tim Paul's form extractor that uses AI to generate a web form from an image URL entered by a user]

Essentially this project takes an image of a form and translates it into a digital version, using the standardised GOV.UK form components.
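To make the shape of such a pipeline concrete, here is a minimal sketch in Python. None of these function names come from Tim Paul's actual code; the LLM call is stubbed out so the example is self-contained, and the schema and rendering are deliberately simplified.

```python
import json

def describe_form(image_url, schema, call_llm):
    """Stage 1: ask a model (injected as `call_llm`) to describe the
    pictured form as JSON conforming to a schema we control."""
    prompt = (f"List the questions on the form at {image_url} as JSON, "
              f"conforming to this schema: {json.dumps(schema)}")
    return json.loads(call_llm(prompt))

def render_form(fields):
    """Stage 2: turn the JSON description into (much simplified) HTML.
    A real implementation would use the design system's components."""
    rows = [f'<label>{f["label"]}</label>' for f in fields]
    return "<form>\n" + "\n".join(rows) + "\n</form>"

# Illustrative schema: each question must carry a label and a type.
SCHEMA = {"type": "array", "items": {"required": ["label", "type"]}}

def stub_llm(prompt):
    # Stands in for a real model call in this sketch.
    return '[{"label": "Full name", "type": "text"}]'

html = render_form(describe_form("form.png", SCHEMA, stub_llm))
print(html)
```

Even in this toy version, note that the LLM is one stage among several: the schema and the renderer are ordinary hand-written code.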

Applied AI project structure

Reading more about Tim's 'form extractor prototype' has helped me to understand the overall solution he created. As someone with an interest in (and a healthy amount of scepticism of) AI, I am always on the lookout for kernels of insight into the useful application of it.

In particular this project shines a light on how we can structure a system in order to integrate and direct AI. There are a few initial conclusions I've come to from analysing this and other AI implementations over the last year or so:

In some ways 'employing' AI is like employing a human: despite the knowledge it has, those managing it have to figure out how to get the best out of it, and there are many ways this can go wrong. I believe building useful software that incorporates AI is not an endeavour to embark on without experience of software design and engineering.

In short, AI is just a small piece of the puzzle in an applied AI system. It is not something that 'bolts on' like a gaming expansion pack, magically giving your product more functionality.

In many ways the design and architecture of applied AI systems is still in its infancy, despite what any software vendor will tell you about their expertise in this area.

Examining the design of an applied AI project

Looking at the example of the form extractor prototype, we can isolate a few parts of the system and try to tease out some useful insights from how they interact. The following sketch is one I drew to make sense of how the elements of this project fit together:

[Image: A diagram of the architecture of the form extractor project]

This is a basic representation of how this project works. Note that the AI (LLM) is just one piece of the overall system, and bear in mind that this is a prototype. While this drawing summarises the general approach, there are a few things missing. Firstly, the diagram doesn't show that several key components of the system are strongly dependent on one another to create the desired result.

Regardless of how you would implement this system (e.g. with JSON, an LLM-based AI, the GOV.UK web macros and so on), there are three key components that need to be designed and built, one way or another, to make it work, and each of them affects the others.
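The mutual dependency can be sketched in a few lines. The names below are mine, not the prototype's (though govukInput, govukRadios and govukDateInput are real GOV.UK Design System macro names): the same catalogue of question types shapes both the prompt sent to the LLM and the renderer, so changing one component forces changes in the others.

```python
# One shared catalogue: question type -> design system macro name.
QUESTION_TYPES = {
    "text": "govukInput",
    "radios": "govukRadios",
    "date": "govukDateInput",
}

def build_prompt(question_types):
    # The prompt depends on the catalogue: the model may only use these types.
    return ("Describe each question as JSON with a 'type' drawn from: "
            + ", ".join(sorted(question_types)))

def to_macro(field, question_types):
    # The renderer depends on the same catalogue to pick a macro.
    return f'{{{{ {question_types[field["type"]]}({{"label": "{field["label"]}"}}) }}}}'

prompt = build_prompt(QUESTION_TYPES)
macro = to_macro({"type": "text", "label": "Full name"}, QUESTION_TYPES)
print(macro)  # a Nunjucks-style macro call string
```

Add a new question type to the catalogue and both the prompt and the renderer change with it; that coupling is the point.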

However, there is one fundamental thing missing from the picture which ties the whole system together: the standardised set of web components from the government forms design system. Of course, if this were a form-generation application intended for a different purpose, it would be a different set of digital form components.

[Image: An updated diagram of the architecture of the form extractor project showing the key components as a 'stack' of dependencies]

The point is that the standardised components from the design system are the basis of the whole solution (including the schema sent to the LLM). If there were no design system to rely on, or if Tim hadn't had the idea to base his project on it, the design of the forms would be 'untethered', likely resulting in an undesirable (if not completely unusable) outcome.

Many people like to explain generative AI as a kind of 'autocomplete', using existing knowledge harvested from elsewhere plus some initial, incomplete input to take a best guess at what we want. When it comes to form design I think this is a useful analogy: here the system 'autocompletes' a form design based on the input image and an abstract schema derived from the design system. The latter is vital in constraining the AI's output.

Applied AI is not LEGO

Hopefully I've done a reasonable job of conveying the main point of this post, but in case it isn't clear I'll try to expand on it:

AI doesn't understand our intentions (yet), or the context in which we are asking it questions. In fact, LLMs don't 'understand' anything; they piece together information from what is essentially a vast probabilistic model built from the data used to train them.

To build a functioning system that utilises AI, we must design and build the relevant components that interact with it. The architecture of these components will depend on what it is we are actually trying to achieve and our understanding of the problem. These components' inter-dependencies and the external factors that the system will rely on have nothing directly to do with AI.

In many cases, the components we build to make an applied AI system work are not interchangeable with other, unrelated systems. They likely capture a lot of domain-specific knowledge, some of which (like a form component design system) may be pre-existing; in other cases the relevant knowledge may need to be discovered and documented before an effective system using AI can be built.

LLMs are not a substitute for design

In order to create the required elements that will form part of a functioning applied AI system, it is imperative we (humans) understand the domain the system is to operate within, and design the appropriate elements using human input. This means doing things like user research, contextual analysis, designing interactions, creating wireframes and prototypes, and manual testing.

We need to carry out research; we cannot simply jump into creating a solution without the necessary artefacts and design maturity. There are simply too many unknowns when it comes to AI. The fewer constraints and less direction we apply to our 'applied AI' solutions, the more likely they are to result in unsatisfactory systems which don't solve problems.

Conclusions about applied AI systems

The thrust of this blog post has been to explain why I believe applied AI systems require significant design, and that this design often includes components which encapsulate domain-specific knowledge based on thorough research. These components may have dependencies on each other and on external factors, and they are required to work together to get effective results from AI.

From examining Tim Paul's government forms prototype, which impressively takes an image of a paper form and turns it into an HTML form, we can see that this solution requires a carefully designed architecture. None of the components of the prototype are overly complex, but how they work together has clearly required careful thought and experimentation.

I believe if organisations concentrated a little more on investing in this kind of research, they would get better results out of their forays into integrating AI into their systems.

In this case a small amount of 'prompt engineering' was perhaps less important than recognising that the AI required a schema to constrain its response, and defining that schema appropriately. Through the development of a few relatively small pieces of code, several components were created that together formed a solution.
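The value of the schema over prompt wording can be illustrated with a small sketch (my own names, assuming nothing about the prototype's code): whatever the model returns, output that falls outside the agreed question types can be caught and rejected before it ever reaches the page.

```python
# The set of question types the design system can actually render.
ALLOWED_TYPES = {"text", "radios", "checkboxes", "date"}

def validate_response(fields):
    """Reject any field whose type the design system cannot render."""
    unknown = sorted({f["type"] for f in fields} - ALLOWED_TYPES)
    if unknown:
        raise ValueError(f"model returned unsupported types: {unknown}")
    return fields

# A well-behaved response passes through untouched...
ok = validate_response([{"type": "text", "label": "Full name"}])

# ...while an invented component type is caught, however the prompt was worded.
try:
    validate_response([{"type": "signature-pad", "label": "Sign here"}])
    caught = False
except ValueError:
    caught = True
```

However carefully the prompt is phrased, it is this kind of hard constraint, derived from the design system, that keeps the output usable.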

For this particular example the following components were required:

But crucially, the outcome of a form displayed in the desired format could not have been achieved if there were no design system for the form components in the first place. Though the AI has no direct knowledge of the design system, elements of the design system informed the schema that was supplied to it. It is not so much a prototype built on top of AI as a prototype built on top of a design system (where AI is used for the image recognition component).

This is a compelling prototype which got me thinking about the possibility of generating whole web apps for digitising forms simply by scanning an image of a form. But on examining what underpins this prototype, I don't think this is something we can reliably do yet, or possibly ever, without some bespoke design and development work. What the prototype does do is provide an example that paves the way for using AI for form generation when combined with a domain-specific architecture.

It is worth emphasising that while this prototype is impressive and could reduce the time needed to produce digital forms, it uses a schema of human-specified question types which are presented using a human-designed format, and the generated web forms would still require human validation and design finesse.

Ultimately, if there is one thing I've learnt from looking at the design of applied AI systems so far, it is that effective use of AI in software requires proper design and architecture by skilled people. This usually results in an ensemble of components that work together to deliver results using AI in a specific context. It also cements my belief that AI is being unnecessarily incorporated into many pieces of technology where it provides little or no value at all. The 'magic' of applied AI is not in the AI itself but in how we use it.