NSF EAGER Award Abstract #2454027
Demonstrating a Compact Foundation Model for Planning-Like (PL) Tasks for Next Generation Trusted Applications
Recent Updates
- June 2025 – Release of the Python package gaico (GenAI Results Comparator, GAICo), which helps compare, analyze, and visualize outputs from Large Language Models (LLMs) consistently [Python, Foundation Models, Library].
- May 2025 – Release of the paper FABLE: A Novel Data-Flow Analysis Benchmark on Procedural Text for Large Language Model Evaluation, a benchmark for data-flow analysis of procedural text spanning plans, travel routes, and recipes, representing automated, semi-automated, and manual workflows, respectively.
- May 2025 – NAIRR compute resources awarded (NAIRR250014)
- Feb 2025 – Update to a web tool for tracking literature evolution in ‘LLM and planning’ [ICAPS 2025, ICAPS 2024]
- Jan 2025 – Award announced, Project kickoff
Project Overview
The current generation of Artificial Intelligence (AI) models has not only revolutionized real-world applications such as conversational agents; these models have also transformed AI itself, as they can serve as the basis, or foundation, for more specialized AI tasks if given sufficient additional training. However, building and training these models and specializing them for challenging tasks like planning is hard due to barriers including limited computing resources, limited non-proprietary knowledge, and even limitations in the basic assumptions used to build the models. This effort investigates an alternative path: it creates and trains a specialized AI model, using innovations on current methods, that can be applied to problems involving planning and chains of tasks. The resulting model has the potential to outperform the current generation of general-purpose Large Language Models at these types of tasks while being more efficient and more understandable. The project offers a unique demonstration opportunity to overcome resource barriers and democratize knowledge, especially for underserved communities. The model development process, resulting AI infrastructure, and accompanying outreach activities will engage research communities from a wide array of academic disciplines at multiple universities, including minority-serving institutions.
AI foundation models, optimized primarily for next-token prediction, excel at generating coherent and contextually relevant text, making them effective for natural language processing and conversational agents. However, these models exhibit significant limitations when applied to real-world applications that require sequential decision making, reasoning, and other planning-like capabilities. Examples include business processes, guidelines, instructions, programs, and workflows. Previous work on this topic has primarily focused on using or fine-tuning pre-trained, off-the-shelf models, with limited success. This project will instead investigate a different approach: creating a comprehensive yet compact foundation model for planning-like tasks from scratch. This model will be based on innovative approaches to tokenization, training, and other steps that will enable it to specialize in advanced planning. The project will follow a transparent, bottom-up methodology for building this new model that will open new avenues for research, education, and real-world applications.
Funding Information
- NSF, 2025-2027, NSF (EAGER) NAIRR Pilot: Demonstrating a Compact Foundation Model for Planning-Like (PL) Tasks for Next Generation Trusted Applications (see also: project page), Jan 2025 [Foundation Models, Planning, Demonstration].
- NSF, 2025-2027, Resource Request for NAIRR Pilot Demonstration Project: A Compact Foundation Model for Planning-Like (PL) Tasks for Next Generation Trusted Applications (NAIRR250014), to support computing for NAIRR project, May 2025 [Resources, Foundation Models, Planning, Demonstration].