NSF EAGER Award Abstract #2454027

Demonstrating a Compact Foundation Model for Planning-Like (PL) Tasks for Next Generation Trusted Applications

Recent Updates

Project Overview

The current generation of Artificial Intelligence (AI) models has not only revolutionized real-world applications such as conversation but has also transformed AI itself: these models can serve as the basis, or foundation, for more specialized AI tasks if given sufficient additional training. However, building and training these models and specializing them for challenging tasks like planning is hard due to barriers including limited computing resources, limited non-proprietary knowledge, and even limitations on the basic assumptions used to build the models. This effort investigates an alternative path: it creates and trains a specialized AI model, using innovations on current methods, that can be applied to problems involving planning and chains of tasks. The resulting model has the potential to outperform the current generation of general-purpose Large Language Models at these types of tasks while being more efficient and more understandable. The project offers a unique demonstration opportunity to overcome resource barriers and democratize knowledge, especially for underserved communities. The model development process, resulting AI infrastructure, and accompanying outreach activities will engage research communities from a wide array of academic disciplines at multiple universities, including minority-serving institutions.

AI foundation models, optimized primarily for next-token prediction, excel at generating coherent and contextually relevant text, making them effective for natural language tasks and conversational agents. However, these models exhibit significant limitations when applied to real-world applications that require sequential decision making, reasoning, and other planning-like tasks. Examples include business processes, guidelines, instructions, programs, and workflows. Previous work on this topic has primarily focused on using or fine-tuning pre-trained, off-the-shelf foundation models, with limited success. This project will instead investigate a different approach: creating a comprehensive yet compact foundation model for planning-like tasks from scratch. This model will be based on innovative approaches to tokenization, training, and other steps that will enable the model to specialize in advanced planning. The project will follow a transparent, bottom-up methodology for building this new model that will open new avenues for research, education, and real-world applications.

Funding Information