FMXExpress has released LocalPal, a native Delphi CLI application for downloading, managing, and chatting with local AI models powered by llama.cpp.
Built on the foundation established by Embarcadero’s SimpleChatWithDownload sample, LocalPal takes the concept further by providing a more complete local AI experience with model management, chat workflows, and a reusable foundation for building desktop AI applications in Delphi. The DLLs bundled with this repo do CPU inference, and it’s not clear which llama.cpp version they were built from. I tried the most recent llama.cpp with these DLL bindings, and it seems the bindings might need to be updated to work with the latest version of llama.cpp.
The framework includes support for:
- Downloading GGUF models
- Local AI chat interfaces
- llama.cpp integration
- Streaming AI responses
- Offline AI workflows
- Local model management
- Native Delphi implementation
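As an illustration of what checking a downloaded GGUF model can involve, here is a minimal sketch in Python (chosen for brevity; this is not LocalPal's code). It reads the fixed-size header at the start of a GGUF file: the four magic bytes `GGUF`, a 32-bit version, then 64-bit tensor and metadata counts. It assumes a little-endian file using the GGUF version 2 or later layout, and the function name `read_gguf_header` is hypothetical.

```python
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) or raise ValueError.

    Assumes a little-endian file with the GGUF v2+ header layout.
    """
    with open(path, "rb") as f:
        header = f.read(24)  # 4 magic + 4 version + 8 tensors + 8 metadata
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # uint32 version, uint64 tensor count, uint64 metadata key/value count
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count
```

A check like this could run right after a download finishes, so that a truncated file or a saved HTML error page is rejected before it is handed to llama.cpp.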
Like the original Embarcadero sample, LocalPal runs models entirely on the user’s machine with no cloud APIs required, allowing developers to build private AI applications that keep data local while taking advantage of the growing ecosystem of GGUF-compatible models.
For Delphi developers interested in local-first AI, LocalPal provides a practical starting point for building ChatGPT-style experiences powered entirely by open-weight models running through llama.cpp.

