Automatic Prompting Systems

A look at APS!

Jamie Horsnell

9/2/2024 · 2 min read

Prompting plays a crucial role in guiding Large Language Models (LLMs) to generate high-quality outputs. While LLMs can produce impressive results even with simple prompts, their true potential is unlocked when prompts are carefully crafted around the model's specific use case.

Achieving consistent and accurate responses, however, often requires advanced prompt engineering techniques. With a wide range of techniques to learn, this means a steep learning curve, time-consuming iteration, and inconsistent results.

Automatic prompting systems aim to address these issues by automatically crafting and selecting the most suitable prompts, improving LLM performance across tasks while requiring less time and fewer resources.
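One common pattern for "automatically selecting the most suitable prompts" is to cluster task examples, then keep the best-scoring candidate prompt per cluster. The sketch below is a toy illustration of that selection loop: `embed` and `llm_score` are hypothetical stand-ins for a real embedding model and a real LLM-based evaluator, not actual APIs.

```python
def embed(text):
    # Toy 1-D "embedding"; a real system would call an embedding model.
    return (len(text) % 5,)

def llm_score(prompt, examples):
    # Stub evaluator; a real system would run the LLM on held-out examples
    # and measure accuracy. Here, longer prompts simply score higher.
    return len(prompt) / (1 + len(examples))

def cluster(examples, k=2):
    """Group examples into k buckets by their (toy) embedding."""
    buckets = {}
    for ex in examples:
        buckets.setdefault(embed(ex)[0] % k, []).append(ex)
    return list(buckets.values())

def select_prompts(examples, candidate_prompts):
    """Core selection loop: keep the best-scoring candidate prompt per cluster."""
    selection = {}
    for i, group in enumerate(cluster(examples)):
        selection[i] = max(candidate_prompts, key=lambda p: llm_score(p, group))
    return selection
```

Swapping in a real embedding model and a real evaluation harness turns this toy loop into the selection stage of an automatic prompting system.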

One approach is Automatic Prompt Selection (APS), which clusters task data into groups, generates prompts for each group, and then picks the best-performing prompt for each cluster. A more common approach is Automatic Prompt Engineer (APE), where the system is built by:

Creating the Training Dataset

A diverse set of high-quality prompts is needed, which is created by an automated process (a pipeline) that gathers and organises prompts from different tasks.

Cleaning and Sorting Prompts

This pipeline uses embedding models to group similar prompts together and remove duplicates, then uses language models to rank the remaining prompts by how relevant and useful they are for specific tasks.
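The deduplication step can be sketched as embedding each prompt and dropping any prompt that is too similar to one already kept. The bag-of-words `embed` below is a deliberately simple stand-in for a real embedding model; the cosine-similarity filter is the part that carries over.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real pipeline would use an embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def dedupe(prompts, threshold=0.9):
    """Keep a prompt only if it is not near-identical to one already kept."""
    kept = []
    for p in prompts:
        if all(cosine(embed(p), embed(q)) < threshold for q in kept):
            kept.append(p)
    return kept
```

In practice the threshold is tuned on a sample, and clustering is used instead of the pairwise scan once the prompt pool grows large.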

Few-Shot Learning

For selected prompts, extra instructions are added to make them even more effective. These enhanced prompts are then tested by a language model, and any that don't meet the quality bar are sent back for further improvement.
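The augment-test-retry loop above can be sketched as follows. The `judge` function is a stub for the LLM-based quality check, and the augmentation (appending a reasoning instruction plus few-shot examples) is one illustrative choice, not the only one.

```python
def augment(prompt, examples):
    """Append an extra instruction and few-shot examples to a base prompt."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{prompt}\nThink step by step.\n{shots}"

def judge(prompt):
    # Stub quality gate; a real system would ask an LLM to grade the prompt.
    return "step by step" in prompt and "Q:" in prompt

def refine_until_accepted(prompt, examples, max_rounds=3):
    """Augment the prompt, test it with the judge, and retry failures."""
    candidate = prompt
    for _ in range(max_rounds):
        candidate = augment(candidate, examples)
        if judge(candidate):
            return candidate
    raise ValueError("prompt never passed the quality gate")
```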

Dataset Creation

By following this method, you can automatically generate a dataset with thousands of high-quality examples of augmented prompts. Keeping the dataset relatively small (around 8,000-10,000 prompts) keeps training fast and the resulting model adaptable.

Model Training

Using this dataset, you can fine-tune smaller language models, such as Llama-2-7B, to act as an APS model. In production, pairing it with larger models such as GPT-4 or Llama-3-70B yields significantly better results than the base models achieve without any prompt enhancement.
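The production flow described here, where a small fine-tuned model rewrites the prompt before a larger model answers it, can be sketched with hypothetical stubs standing in for both models:

```python
def aps_rewrite(prompt):
    # Stub for a small fine-tuned APS model (e.g. Llama-2-7B) that rewrites
    # the user's prompt; the rewrite shown here is purely illustrative.
    return f"You are an expert assistant. {prompt} Explain your reasoning."

def large_model(prompt):
    # Stub for the production model (e.g. GPT-4 or Llama-3-70B).
    return f"[answer to: {prompt}]"

def answer(user_prompt):
    """Production flow: the APS model enhances the prompt, the large model answers."""
    return large_model(aps_rewrite(user_prompt))
```

The key design point is the separation of concerns: the cheap small model absorbs the prompt-engineering work, so the expensive large model is only called once, on an already-optimised prompt.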

Automated prompting systems offer several advantages. They are efficient, exploring a far larger space of potential prompts than manual methods can. They are adaptable, adjusting prompts to different input types or task requirements. They also enable continuous integration pipelines in which prompts are continually optimised, user feedback is regularly incorporated, and prompts are automatically tested. This has led to APS systems achieving results that surpass human-crafted prompts.
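Automated prompt testing in a CI pipeline can be as simple as a regression suite: a list of (prompt template, input, expected substring) cases run against the model on every change. The `run_llm` stub below stands in for a real model call; the suite structure is the transferable part.

```python
def run_llm(prompt):
    # Stub LLM call; in CI this would hit the real model endpoint.
    return "Paris" if "capital of France" in prompt else "unknown"

PROMPT_SUITE = [
    # (prompt template, input question, substring the answer must contain)
    ("Answer concisely: {q}", "What is the capital of France?", "Paris"),
]

def run_prompt_tests(suite):
    """Return the failing cases; an empty list means every prompt still passes."""
    failures = []
    for template, question, expected in suite:
        out = run_llm(template.format(q=question))
        if expected not in out:
            failures.append((template, question, out))
    return failures
```

Failing the build when `run_prompt_tests` returns a non-empty list is what turns prompt changes from guesswork into a tested artifact.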

One of the key challenges in developing an APS, however, is creating the training dataset for the augmentation model. This can be tackled by building an automated pipeline that curates a diverse set of high-quality prompts across different tasks. The pipeline employs embedding models and clustering algorithms to identify and remove duplicates, followed by cutting-edge LLMs to classify prompts by quality and task relevance.

Overall, automated prompting techniques will play a big part in GenAIOps, allowing teams to focus on innovation rather than the intricacies of prompt crafting. Prompt engineering is a key pillar of building GenAI applications, and systems trained on a mixture of automated and manual feedback will enable more robust, reliable, and scalable AI solutions, driving the future of GenAI operations.