
Run Large Language Models Natively In Object Pascal

Developer BeRo1985 has released PasLLM, a high-performance LLM inference engine written entirely in Object Pascal. Rather than wrapping llama.cpp or another external runtime, PasLLM implements model loading, quantization, and inference natively, so large language models can run directly inside Delphi and FreePascal applications.

The framework includes support for:

PasLLM introduces its own optimized quantization formats, designed to reduce model size while preserving inference quality. The project supports a wide range of modern open-weight models and can be integrated directly into Object Pascal projects without requiring Python or external inference frameworks.
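To make the size-versus-quality trade-off concrete, here is a minimal sketch of generic block-wise 8-bit quantization. It is not PasLLM's actual format, which this post does not describe. The block length of 32, the function names, and the storage layout (one float32 scale plus one byte per weight) are all assumptions chosen for illustration. It is written in Python only for brevity; the same arithmetic carries over to Object Pascal.

```python
BLOCK_SIZE = 32  # hypothetical block length, chosen for illustration only


def quantize_block(values):
    """Map one block of floats to int8 codes plus a single float scale."""
    max_abs = max(abs(v) for v in values)
    # The largest magnitude in the block maps to +/-127; an all-zero block
    # gets a dummy scale of 1.0 so the division below stays defined.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from the codes and the block scale."""
    return [c * scale for c in codes]


def quantize(weights):
    """Split a flat weight list into blocks and quantize each one."""
    return [
        quantize_block(weights[i:i + BLOCK_SIZE])
        for i in range(0, len(weights), BLOCK_SIZE)
    ]


def dequantize(blocks):
    """Concatenate the dequantized blocks back into a flat list."""
    restored = []
    for scale, codes in blocks:
        restored.extend(dequantize_block(scale, codes))
    return restored


def packed_size(blocks):
    """Bytes needed if each block stores a float32 scale and int8 codes."""
    return sum(4 + len(codes) for _, codes in blocks)
```

Under these assumptions, 64 weights stored as float32 take 256 bytes, while the quantized form takes 72 bytes (two blocks of 4 + 32 bytes). Each restored weight differs from the original by at most half of its block's scale. Real formats typically go further, for example by using fewer bits per weight.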

Currently focused on CPU inference, PasLLM is aimed at developers who want complete control over local AI execution while staying entirely within the Delphi and FreePascal ecosystem. Future GPU acceleration is planned through the author’s PasVulkan framework.

Explore a pure Object Pascal approach to running modern LLMs locally.
