A typical PostgresML deployment consists of two parts: the PostgreSQL extension, and the dashboard web app. The extension provides all the machine learning functionality, and can be used independently. The dashboard provides a system overview for easier management, and notebooks for writing experiments.
The extension can be installed by compiling it from source, or if you're using Ubuntu 22.04, from our package repository.
!!! tip
If you're just looking to try PostgresML without installing it on your system, take a look at our Quick Start with Docker guide.
!!!
To get the source code for PostgresML, you can clone our GitHub repository:
git clone https://github.com/postgresml/postgresml
We provide a Brewfile that will install all the necessary dependencies for compiling PostgresML from source:
cd pgml-extension && \
brew bundle
Rust
PostgresML is written in Rust, so you'll need to install the latest compiler from rust-lang.org. Additionally, we use the Rust PostgreSQL extension framework pgrx, which requires some initialization steps:
cargo install cargo-pgrx --version 0.9.8 && \
cargo pgrx init
This step will take a few minutes. Perfect opportunity to get a coffee while you wait.
With all the dependencies installed, you can compile and install the extension:
cargo pgrx install
This will compile all the necessary packages, including Rust bindings to XGBoost and LightGBM, together with Python support for Hugging Face transformers and Scikit-learn. The extension will be automatically installed into the PostgreSQL installation created by the postgresql@15 Homebrew formula.
PostgresML uses Python packages to provide support for Hugging Face LLMs and Scikit-learn algorithms and models. To make this work on your system, you have two options: install those packages into a virtual environment (strongly recommended), or install them globally.
=== "Virtual environment"
To install the necessary Python packages into a virtual environment, use the virtualenv tool installed previously by Homebrew:
virtualenv pgml-venv && \
source pgml-venv/bin/activate && \
pip install -r requirements.txt && \
pip install -r requirements-xformers.txt --no-dependencies
=== "Globally"
Installing Python packages globally can cause issues with your system. If you wish to proceed nonetheless, you can do so:
pip3 install -r requirements.txt
===
We have one last step remaining to get PostgresML running on your system: configuration.
PostgresML needs to be loaded into shared memory by PostgreSQL. To do so, you need to add it to shared_preload_libraries.
Additionally, if you've chosen to use a virtual environment for the Python packages, we need to tell PostgresML where to find it.
Both steps can be done by editing the PostgreSQL configuration file postgresql.conf using your favorite editor:
vim /opt/homebrew/var/postgresql@15/postgresql.conf
Both settings can be added to the config, like so:
shared_preload_libraries = 'pgml,pg_stat_statements'
pgml.venv = '/absolute/path/to/your/pgml-venv'
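Before restarting, it can be worth confirming that both settings actually landed in the file. The following is a minimal sketch, not part of PostgresML: the helper names `check_pgml_conf` and `pgml_venv_path` are hypothetical, and it assumes the settings are written in the `name = 'value'` form shown above.

```shell
#!/usr/bin/env bash
# Sketch only: check_pgml_conf and pgml_venv_path are hypothetical helpers,
# not something shipped with PostgresML or PostgreSQL.

# Returns 0 if the config file has an uncommented
# shared_preload_libraries line that mentions pgml.
check_pgml_conf() {
  local conf="$1"
  grep -Eq '^[[:space:]]*shared_preload_libraries[[:space:]]*=.*pgml' "$conf"
}

# Prints the value of pgml.venv without its quotes, or nothing if unset.
# If the setting appears more than once, the last occurrence is printed.
pgml_venv_path() {
  local conf="$1"
  sed -nE "s/^[[:space:]]*pgml\.venv[[:space:]]*=[[:space:]]*'([^']*)'.*/\1/p" "$conf" | tail -n 1
}
```

For example, `check_pgml_conf /opt/homebrew/var/postgresql@15/postgresql.conf && echo "pgml is preloaded"` reports success only when the line is present and not commented out, and `pgml_venv_path` lets you confirm that the printed directory exists.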
Save the configuration file and restart PostgreSQL:
brew services restart postgresql@15
You should be able to connect to PostgreSQL and use our extension now:
!!! generic
!!! code_block time="953.681ms"
CREATE EXTENSION pgml;
SELECT pgml.version();
!!!
!!! results
psql (15.3 (Homebrew))
Type "help" for help.
pgml_test=# CREATE EXTENSION pgml;
INFO: Python version: 3.11.4 (main, Jun 20 2023, 17:23:00) [Clang 14.0.3 (clang-1403.0.22.14.1)]
INFO: Scikit-learn 1.2.2, XGBoost 1.7.5, LightGBM 3.3.5, NumPy 1.25.1
CREATE EXTENSION
pgml_test=# SELECT pgml.version();
version
---------
2.7.4
(1 row)
!!!
!!!
We like and use pgvector a lot, as documented in our blog posts and examples, to store and search embeddings. You can install pgvector from source pretty easily:
git clone --branch v0.4.4 https://github.com/pgvector/pgvector && \
cd pgvector && \
echo "trusted = true" >> vector.control && \
make && \
make install
Test pgvector installation
You can create the vector extension in any database:
!!! generic
!!! code_block time="21.075ms"
CREATE EXTENSION vector;
!!!
!!! results
psql (15.3 (Homebrew))
Type "help" for help.
pgml_test=# CREATE EXTENSION vector;
CREATE EXTENSION
!!!
!!!
!!! note
If you're looking to use PostgresML in production, try our cloud. We support serverless deployments with modern GPUs for startups of all sizes, and dedicated GPU hardware for larger teams that would like to tweak PostgresML to their needs.
!!!
For Ubuntu, we compile and ship packages that include everything needed to install and run the extension. At the moment, only Ubuntu 22.04 (Jammy) is supported.
Add our repository to your system sources:
echo "deb [trusted=yes] https://apt.postgresml.org $(lsb_release -cs) main" | \
sudo tee -a /etc/apt/sources.list
Update your package lists and install PostgresML:
export POSTGRES_VERSION=15
sudo apt update && \
sudo apt install postgresml-${POSTGRES_VERSION}
The postgresml-15 package includes all the necessary dependencies, including Python packages shipped inside a virtual environment. Your PostgreSQL server is configured automatically.
We support PostgreSQL versions 11 through 15, so you can install the one matching your currently installed PostgreSQL version.
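If you are unsure which version is installed, the value of POSTGRES_VERSION can be derived from `pg_config --version`. This is a rough sketch under a few assumptions: `pg_config` is on your PATH, and the helper names `pg_major_version` and `is_supported_pg` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: both helper functions are hypothetical names.

# Extracts the major version from a string such as "PostgreSQL 15.3 (Homebrew)".
pg_major_version() {
  printf '%s\n' "$1" | sed -nE 's/^PostgreSQL ([0-9]+).*/\1/p'
}

# Returns 0 when the major version falls in the supported 11..15 range.
is_supported_pg() {
  local major="$1"
  [ -n "$major" ] && [ "$major" -ge 11 ] && [ "$major" -le 15 ]
}

# Derive POSTGRES_VERSION from pg_config, if it is available.
if command -v pg_config >/dev/null 2>&1; then
  POSTGRES_VERSION="$(pg_major_version "$(pg_config --version)")"
  if is_supported_pg "$POSTGRES_VERSION"; then
    export POSTGRES_VERSION
    echo "Using postgresml-${POSTGRES_VERSION}"
  else
    echo "PostgreSQL ${POSTGRES_VERSION:-unknown} is outside the 11-15 range" >&2
  fi
fi
```

When `pg_config` is missing, the snippet does nothing, and you can still set POSTGRES_VERSION by hand as shown above.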
If you prefer to manage your own Python environment and dependencies, you can install just the extension:
export POSTGRES_VERSION=15
sudo apt install postgresql-pgml-${POSTGRES_VERSION}
pgvector, the extension we use for storing and searching embeddings, needs to be installed separately for optimal performance. Your hardware may support vectorized operation instructions (like AVX-512), which pgvector can take advantage of to run faster.
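To see whether your CPU reports such instructions, you can inspect the `flags` line in /proc/cpuinfo. The sketch below assumes an x86 Linux host (other architectures label this line differently), and `has_cpu_flag` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: has_cpu_flag is a hypothetical helper.

# Returns 0 if the cpuinfo-style file lists the given CPU flag exactly.
# The second argument defaults to /proc/cpuinfo.
has_cpu_flag() {
  local flag="$1" cpuinfo="${2:-/proc/cpuinfo}"
  grep -m1 -E '^flags[[:space:]]*:' "$cpuinfo" 2>/dev/null \
    | tr ' \t' '\n\n' \
    | grep -qx "$flag"
}

# Report a couple of vector instruction sets.
for flag in avx2 avx512f; do
  if has_cpu_flag "$flag"; then
    echo "$flag: reported by this CPU"
  else
    echo "$flag: not reported"
  fi
done
```

A flag being reported only tells you the hardware supports it; whether a given pgvector build uses it depends on how the extension was compiled.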
To install pgvector from source, you can simply:
git clone --branch v0.4.4 https://github.com/pgvector/pgvector && \
cd pgvector && \
echo "trusted = true" >> vector.control && \
make && \
make install
PostgresML will compile and run on pretty much any modern Linux distribution. For a quick example, you can take a look at what we do to build the extension on Ubuntu, and modify those steps to work on your distribution.
To get the source code for PostgresML, you can clone our GitHub repo:
git clone https://github.com/postgresml/postgresml
You'll need the following packages installed first. The names are taken from Ubuntu (and other Debian-based distros), so you'll need to change them to fit your distribution:
export POSTGRES_VERSION=15
build-essential
clang
libopenblas-dev
libssl-dev
bison
flex
pkg-config
cmake
libreadline-dev
libz-dev
tzdata
sudo
libpq-dev
libclang-dev
postgresql-${POSTGRES_VERSION}
postgresql-server-dev-${POSTGRES_VERSION}
python3
python3-pip
libpython3
lld
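On a Debian-based system, the list above can be turned into a single install command. This is a sketch rather than an official script: `pgml_build_packages` is a hypothetical helper, the package names are copied verbatim from the list, and the command is printed instead of executed so you can review it and adjust names for your distribution first.

```shell
#!/usr/bin/env bash
# Sketch only: pgml_build_packages is a hypothetical helper.

# Prints the package names from the list above, one per line, with the
# PostgreSQL major version (first argument, default 15) substituted in.
pgml_build_packages() {
  local v="${1:-15}"
  printf '%s\n' \
    build-essential clang libopenblas-dev libssl-dev bison flex \
    pkg-config cmake libreadline-dev libz-dev tzdata sudo libpq-dev \
    libclang-dev "postgresql-${v}" "postgresql-server-dev-${v}" \
    python3 python3-pip libpython3 lld
}

# Print (rather than run) the install command so it can be reviewed.
echo "sudo apt-get install -y $(pgml_build_packages "${POSTGRES_VERSION:-15}" | tr '\n' ' ')"
```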
Rust
PostgresML is written in Rust, so you'll need to install the latest compiler version from rust-lang.org.
We use the pgrx Postgres Rust extension framework, which comes with its own installation and configuration steps:
cd pgml-extension && \
cargo install cargo-pgrx --version 0.9.8 && \
cargo pgrx init
This step will take a few minutes since it has to download and compile multiple PostgreSQL versions used by pgrx for development.
Finally, you can compile and install the extension:
cargo pgrx install
The dashboard is a web app that can be run against any Postgres database which has the extension installed. There is a Dockerfile included with the source code if you wish to run it as a container.
To get our source code, you can clone our GitHub repo (if you haven't already):
git clone https://github.com/postgresml/postgresml && \
cd postgresml/pgml-dashboard
Use an existing database which has the pgml extension installed, or create a new one:
createdb pgml_dashboard && \
psql -d pgml_dashboard -c 'CREATE EXTENSION pgml;'
Create a .env file with the necessary DATABASE_URL, for example:
DATABASE_URL=postgres:///pgml_dashboard
The dashboard is written in Rust and uses the SQLx crate to interact with Postgres. Make sure to install the latest Rust compiler from rust-lang.org.
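Before running the migrations, you may want to confirm that the .env file contains the connection string you expect. A minimal sketch, assuming a plain `DATABASE_URL=value` line without quotes or an `export` prefix; `read_database_url` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: read_database_url is a hypothetical helper.

# Prints the DATABASE_URL value from a .env-style file (default: ./.env).
# If the variable is defined more than once, the last definition is printed.
read_database_url() {
  local env_file="${1:-.env}"
  sed -nE 's/^DATABASE_URL=(.*)$/\1/p' "$env_file" | tail -n 1
}
```

Since psql accepts a connection URI in place of a database name, `psql "$(read_database_url .env)" -c 'SELECT pgml.version();'` is one way to check that the URL points at a database with the extension installed.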
To set up the database, you'll need to install sqlx-cli and run the migrations:
cargo install sqlx-cli --version 0.6.3 && \
cargo sqlx database setup
The dashboard frontend uses Sass, which requires Node and the Sass compiler. You can install Node from Homebrew, your package repository, or by using Node Version Manager.
If using nvm, you can install the latest stable Node version with:
nvm install stable
Once you have Node installed, you can install the Sass compiler globally:
npm install -g sass
Finally, you can compile and run the dashboard:
cargo run
Once compiled, the dashboard will be available on localhost:8000.
The dashboard can also be packaged for distribution. You'll need to copy the static files along with the target/release directory to your server.