Artificial Intelligence (AI) can be traced back to the 1950s and is defined as intelligence demonstrated by machines such as computers. As technology has advanced, AI has become more capable. Machines (computers) can be trained by exposing them to huge amounts of data so that they learn to recognise patterns and themes, and can then make decisions based on what they have learned. This is Machine Learning (ML); examples include telling the difference between a cat and a dog, or detecting anomalies on X-rays and scans. All good.
What is Generative Artificial Intelligence?
aka GenAI
The most recent, and probably one of the biggest, technological advancements in decades is Generative AI (GenAI). This is the ability of machines to learn, make decisions and create outputs in the form of text, audio, video, movement and more. It is achieved by training models on much larger and more diverse data sets: Large Language Models (LLMs) are trained on text, and Large Multi-modal Models (LMMs) on a mix of text, imagery and sound. GenAI was brought to the public's attention in November 2022 by Microsoft/OpenAI with ChatGPT, followed by Google with Gemini and Meta with Llama 3, plus others like Claude, Grok, Mistral and Falcon, and many more are emerging. These are foundational models that have been built to create new products and services, and are also made available for other organisations to build on top of.
In fact, the images you see on this page were all created with GenAI. Each was produced from a 'prompt', which is an instruction a human writes to see what GenAI will create. Writing such instructions is now known as 'Prompt Engineering', which is ultimately the art of writing an instruction or question that produces the desired output. The prompt used to create one image on this page was "Create a photo of a cat and a dog riding a motorcycle". The image generated is far from perfect. Size and alignment are off, plus they're not wearing helmets. Maybe you can spot other flaws too? This indicates that further training and tuning will be needed to create the perfect image, which may not be possible. The prompt for the image to the right (or below if viewing on mobile) was "Create an image of a Large Language Model"; we don't know what it is either. Other examples of GenAI use cases can be found here.
Anyway, the point we have reached today, and where Generative AI is heading in the future, is only possible because of advances in computer technology. The models need to be trained on ever-expanding data to make them useful, and this takes an immense amount of processing. This is where the Graphics Processing Unit (GPU) comes in: the 'muscle' behind GenAI.
Prompt Engineering
Prompt Engineering doesn't mean engineering that is completed on time; it's actually the art of writing a well-formed instruction for GenAI to interpret. Jensen Huang, the CEO of NVIDIA, was quoted in early 2024 saying, "programming is not going to be essential for you to be a successful person." He said this knowing that Generative AI can already write and even debug code, so future success depends on knowing how to instruct GenAI to attain the desired outcome. And yes, that's another GenAI-created image; in fact, every image on this site was created by writing a prompt.
The quality of the generated code depends on the quality of the data used to train the models and the quality of the prompt written by the human. If you assume the quality of the models will keep improving (after all, such models are trained on data from repositories such as GitHub, GitLab and Bitbucket), then the quality of the prompt written by the human is what can make the difference. It's therefore worth knowing that there is a format and structure to good Prompt Engineering.
A good prompt can be broken down into 'elements', which help structure the instruction to get the desired GenAI outcome. Six elements are listed below, each with a description and examples.
- Task (T) - The specific action you want the GenAI model to perform. e.g. Summarize this article. Write a poem in the style of Shakespeare. Translate this sentence into Spanish.
- Context (C) - Background information relevant to the task. e.g. This article is about the history of artificial intelligence. The poem should be about love and loss.
- Examples (E) - Samples or references to guide the AI towards the desired output. e.g. This sentence is from a scientific paper on climate change. You can start the summary with "In this article...". Rhyming scheme: ABAB CDCD EFEF.
- Persona (P) - The intended audience for the generated content. e.g. This summary is for a general audience. The poem is for a romantic partner. This translation should be understandable by a scientist.
- Format (F) - The desired structure or layout of the output. e.g. The summary should be 3-5 sentences long. The poem should be a sonnet. The translation should be a complete sentence.
- Tone (T) - The emotional mood or style you want the AI to convey. e.g. The summary should be objective and neutral. The poem should be sentimental and melancholic. The translation should be formal and accurate.
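As a rough illustration, the six elements can be assembled into a single prompt with a few lines of Python. This is only a sketch: the `Prompt` class, its field names and the "Label: text" layout are assumptions made for this example, not a standard format that any particular GenAI model requires.

```python
from dataclasses import dataclass


@dataclass
class Prompt:
    """Hypothetical container for the six elements; only the task is required."""
    task: str
    context: str = ""
    examples: str = ""
    persona: str = ""
    output_format: str = ""
    tone: str = ""

    def render(self) -> str:
        """Join the non-empty elements into one labelled instruction."""
        parts = [
            ("Task", self.task),
            ("Context", self.context),
            ("Examples", self.examples),
            ("Persona", self.persona),
            ("Format", self.output_format),
            ("Tone", self.tone),
        ]
        return "\n".join(f"{label}: {text}" for label, text in parts if text)


# Example values taken from the element descriptions above.
prompt = Prompt(
    task="Summarize this article.",
    context="This article is about the history of artificial intelligence.",
    examples='You can start the summary with "In this article..."',
    persona="This summary is for a general audience.",
    output_format="The summary should be 3-5 sentences long.",
    tone="The summary should be objective and neutral.",
)
print(prompt.render())
```

The rendered text would then be pasted or sent to whichever GenAI tool you use. Elements left empty are simply omitted, so a task-only prompt still works.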
The AI & GenAI Relationship Hierarchy
Sometimes it's easier to visualise the Artificial Intelligence 'stack' and how each discipline is related. The diagram in this section shows Artificial Intelligence (AI) as the entire 'target', representing the overall goal of creating intelligent machines. Machine Learning (ML) is a ring within that target. It's a specific technique where machines learn from data without needing explicit programming. Deep Learning (DL) is a smaller ring further in. It's a kind of machine learning that uses complex artificial neural networks to tackle intricate problems, like image recognition or speech translation. Generative AI is the innermost ring, the bullseye. It's a specific application of deep learning that focuses on creating entirely new content, like realistic images or even music, based on the data it's been trained on. Then there are the Large Language and Multi-Modal Models, which fall within Deep Learning but are not exclusive to Generative AI. In short: AI encompasses everything, machine learning is a way to achieve AI, deep learning is a powerful kind of machine learning, and generative AI is the creative application of deep learning.
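For readers who think in code, the same nesting can be written down as a small lookup table. This is a sketch only: the `PARENT` mapping and the helper names `lineage` and `is_within` are made up for this illustration, and the structure simply mirrors the description above.

```python
# Each discipline maps to the broader discipline that contains it.
PARENT = {
    "Artificial Intelligence": None,
    "Machine Learning": "Artificial Intelligence",
    "Deep Learning": "Machine Learning",
    "Generative AI": "Deep Learning",
    # LLMs/LMMs sit within Deep Learning, not exclusively within Generative AI.
    "Large Language / Multi-Modal Models": "Deep Learning",
}


def lineage(discipline: str) -> list[str]:
    """Return the discipline followed by every broader ring that contains it."""
    chain = []
    current = discipline
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def is_within(inner: str, outer: str) -> bool:
    """True if `inner` is the same as, or nested inside, `outer`."""
    return outer in lineage(inner)


print(" -> ".join(lineage("Generative AI")))
```

Walking outwards from "Generative AI" passes through Deep Learning and Machine Learning before reaching Artificial Intelligence, which is the order of the rings in the diagram.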
The GenAI Software Stack
Understanding which software layers are required to deploy Generative AI applications, such as organisation-specific GPTs (Generative Pre-trained Transformers) or image and video creation applications, may not be as complex as you would think.
Envisage a 'stack' as shown here. The base of the stack manages the GenAI infrastructure, i.e. the GPUs, the networking, the orchestration of machines or instances, and the base-level operating system. On top of this come the models (LLMs/LMMs) you wish to use. Ideally, you take advantage of a foundational model that is already pre-trained and available, and then fine-tune it to create your own industry- and organisation-specific model. This gives you the specific intelligence to power your application, which resides on top of the model and moves you into inferencing, i.e. running the trained model to produce outputs.
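The layering described above can also be sketched as a simple ordered list, bottom layer first. The `Layer` class, the `STACK` list and the `describe` helper are hypothetical names used only for this illustration; they do not correspond to any vendor's product or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    responsibility: str


# Ordered from the base of the stack (index 0) to the top.
STACK = [
    Layer("Infrastructure", "GPUs, networking, orchestration and the base operating system"),
    Layer("Foundational model", "a pre-trained LLM or LMM that is already available"),
    Layer("Fine-tuned model", "the foundational model tuned for your industry and organisation"),
    Layer("Application", "uses the fine-tuned model for inferencing"),
]


def describe(stack: list[Layer]) -> str:
    """Render the stack top-down, numbering layers from the base upwards."""
    numbered = list(enumerate(stack, start=1))
    return "\n".join(
        f"{level}. {layer.name}: {layer.responsibility}"
        for level, layer in reversed(numbered)
    )


print(describe(STACK))
```

Printing top-down puts the application first and the infrastructure last, which matches how a stack diagram is usually read.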
Full software stacks are offered by the GPU manufacturers and the public clouds, and there are many open-source tools which can also be used. The choice is wide, and you can take advantage of all the great work already done by others to accelerate your GenAI initiatives.