diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index a06488079fe..0139a2bdc6a 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -42,7 +42,7 @@ jobs: makepot: "true" services: postgres: - image: postgres:12.0 + image: pgvector/pgvector:pg12 env: POSTGRES_USER: odoo POSTGRES_PASSWORD: odoo diff --git a/README.md b/README.md index 4b62afca2e4..080d1f03aa0 100644 --- a/README.md +++ b/README.md @@ -55,6 +55,7 @@ addon | version | maintainers | summary [excel_import_export](excel_import_export/) | 18.0.1.0.0 | kittiu | Base module for developing Excel import/export/report [fetchmail_attach_from_folder](fetchmail_attach_from_folder/) | 18.0.2.0.0 | NL66278 | Attach mails in an IMAP folder to existing objects [fetchmail_notify_error_to_sender](fetchmail_notify_error_to_sender/) | 18.0.1.0.0 | | If fetching mails gives error, send an email to sender +[field_vector](field_vector/) | 18.0.1.0.0 | lmignon | New specialized field to store vector data [html_text](html_text/) | 18.0.1.0.0 | | Generate excerpts from any HTML field [iap_alternative_provider](iap_alternative_provider/) | 18.0.1.0.0 | sebastienbeau | Base module for providing alternative provider for iap apps [jsonifier](jsonifier/) | 18.0.1.1.1 | | JSON-ify data for all models diff --git a/field_vector/README.rst b/field_vector/README.rst new file mode 100644 index 00000000000..72ee2a9e6eb --- /dev/null +++ b/field_vector/README.rst @@ -0,0 +1,268 @@ +.. image:: https://odoo-community.org/readme-banner-image + :target: https://odoo-community.org/get-involved?utm_source=readme + :alt: Odoo Community Association + +============ +Field Vector +============ + +.. + !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! + !! This file is generated by oca-gen-addon-readme !! + !! changes will be overwritten. !! + !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! + !! 
source digest: sha256:cc9c0caa318b8abd50983092bbe0bed65b7903cba44089fe272c31e4c0b6ab32 + !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! + +.. |badge1| image:: https://img.shields.io/badge/maturity-Beta-yellow.png + :target: https://odoo-community.org/page/development-status + :alt: Beta +.. |badge2| image:: https://img.shields.io/badge/license-LGPL--3-blue.png + :target: http://www.gnu.org/licenses/lgpl-3.0-standalone.html + :alt: License: LGPL-3 +.. |badge3| image:: https://img.shields.io/badge/github-OCA%2Fserver--tools-lightgray.png?logo=github + :target: https://github.com/OCA/server-tools/tree/18.0/field_vector + :alt: OCA/server-tools +.. |badge4| image:: https://img.shields.io/badge/weblate-Translate%20me-F47D42.png + :target: https://translation.odoo-community.org/projects/server-tools-18-0/server-tools-18-0-field_vector + :alt: Translate me on Weblate +.. |badge5| image:: https://img.shields.io/badge/runboat-Try%20me-875A7B.png + :target: https://runboat.odoo-community.org/builds?repo=OCA/server-tools&target_branch=18.0 + :alt: Try me on Runboat + +|badge1| |badge2| |badge3| |badge4| |badge5| + +This addon provides a new field type called "Vector" that allows you to +store and manage vectors in your Odoo database. + +**Table of contents** + +.. contents:: + :local: + +Use Cases / Context +=================== + +The advent of large language models (LLMs) has highlighted the +importance of vector representation as a powerful way to encode data +and easily determine the similarity between different pieces of +information. Vector representation is a way of encoding information in a +numerical format that captures the semantic meaning of the data. This +allows for efficient similarity comparisons. + +Installation +============ + +To install this module, you need to ensure that the +`pgvector <https://github.com/pgvector/pgvector>`__ extension is +installed and available in your PostgreSQL instance. 
+ +Configuration +============= + +[ This file is not always required; it should explain **how to configure +the module before using it**; it is aimed at users with administration +privileges. + +Please be detailed on the path to configuration (eg: do you need to +activate developer mode?), describe step by step configurations and the +use of screenshots is strongly recommended.] + +To configure this module, you need to: + +- Go to *App* > Menu > Menu item +- Activate boolean… > save +- … + +Usage +===== + + | **⚠️ Warning** + | This addon is **not compatible** with the Python ``pgvector`` + library. Please ensure that you do not use this library alongside + the addon to avoid potential issues. This is mainly due to the fact + that numpy arrays can't be stored in the Odoo cache since they + are not comparable with the default '==' or '!=' operators. + +The module is a technical module providing a new field type called +"Vector". It's intended to be used by developers who want to store and +manage vector data in their Odoo database when they develop their own +modules. + +Field declaration +----------------- + +To declare a field of type vector, you can use the following syntax: + +.. code:: python + + + from odoo.addons.field_vector.fields import Vector + + + class YourModel(models.Model): + _name = 'your.model' + + vector_field = Vector(dimensions=3) + +The ``dimensions`` parameter is required and specifies the number of +dimensions of the vector. The field will be stored as a ``vector`` type +in PostgreSQL, the type provided by the pgvector extension. + +By default the field is declared with ``prefetch=False`` and +``autopad=True``. You can override these parameters by passing them as +arguments to the field: + +.. 
code:: python + + from odoo.addons.field_vector.fields import Vector + class YourModel(models.Model): + _name = 'your.model' + + vector_field = Vector(dimensions=3, prefetch=True, autopad=False) + +The ``prefetch`` parameter allows you to enable or disable prefetching +of the field when loading records. If set to ``True``, the field will be +prefetched when loading records, which can improve performance when +accessing the field frequently. If set to ``False``, the field will not +be prefetched, which can save memory and improve performance when +accessing the field infrequently (which would be the common case). + +The ``autopad`` parameter allows you to enable or disable automatic +padding of the vector when storing it in the database. If set to +``True``, the vector will be automatically padded with zeros to match +the specified dimensions. If set to ``False``, the vector will not be +padded, and an error will be raised if its length does not match the +specified dimensions. + +Field usage +----------- + +The vector field can be used like any other field in Odoo. When +accessing the field, it will always return an +``odoo.addons.field_vector.fields.VectorValue`` object, which is a +wrapper around the value stored in the database. This object provides a +convenient way to get the value of the vector as a numpy array. + +.. code:: python + + import numpy as np + from odoo.addons.field_vector.fields import VectorValue + + record = self.env['your.model'].create({ + 'vector_field': [1.0, 2.0, 3.0] + }) + + assert isinstance(record.vector_field, VectorValue) + assert isinstance(record.vector_field.value, np.ndarray) + +When setting the field, you can pass a list or tuple of values, a numpy +array, or a ``VectorValue`` object. The field will automatically +convert the value to a ``VectorValue`` and store it in the database in +the vector format. + +.. 
code:: python + + + record.vector_field = [1.0, 2.0, 3.0] + assert isinstance(record.vector_field, VectorValue) + + record.vector_field = np.array([1.0, 2.0, 3.0]) + assert isinstance(record.vector_field, VectorValue) + + record.vector_field = VectorValue([1.0, 2.0, 3.0]) + assert isinstance(record.vector_field, VectorValue) + +Plain SQL queries +----------------- + +When reading the field in plain SQL queries, the field will be returned +as a ``VectorValue`` object. You can use the ``value`` property to get +the value of the vector as a numpy array. + +.. code:: python + + + env.cr.execute('SELECT vector_field FROM your_model WHERE id = 1') + record = env.cr.fetchone() + vector_value = record[0] + assert isinstance(vector_value, VectorValue) + +When writing the field in plain SQL queries, you can pass a numpy array, +a list of values, or a ``VectorValue`` object as the value of the field +(tuples are not supported in this case). + +.. code:: python + + + env.cr.execute('UPDATE your_model SET vector_field = %s WHERE id = 1', (np.array([1.0, 2.0, 3.0]),)) + env.cr.execute('UPDATE your_model SET vector_field = %s WHERE id = 1', ([1.0, 2.0, 3.0],)) + env.cr.execute('UPDATE your_model SET vector_field = %s WHERE id = 1', (VectorValue([1.0, 2.0, 3.0]),)) + +Known issues / Roadmap +====================== + +- allow the use of specific operators in domain filters to search for + similar vectors. +- dedicated widget to display the vector in a more user-friendly way. +- evaluate removing the psycopg2 adapter (register.py) in favor of + explicit casting in convert_to_column/convert_to_cache. Currently the + adapter must be registered before any SQL query reads vector columns, + which creates an implicit dependency on ir.model.fields._register_hook + execution order. Without the adapter, plain SQL queries would return + raw strings instead of VectorValue objects. + +Bug Tracker +=========== + +Bugs are tracked on `GitHub Issues `_. 
+In case of trouble, please check there if your issue has already been reported. +If you spotted it first, help us to smash it by providing a detailed and welcomed +`feedback `_. + +Do not contact contributors directly about support or help with technical issues. + +Credits +======= + +Authors +------- + +* ACSONE SA/NV + +Contributors +------------ + +- Laurent Mignon laurent.mignon@acsone.eu (https://www.acsone.eu) + +Other credits +------------- + +The development of this module has been financially supported by: + +- `Alcyon Belux `__ + +Maintainers +----------- + +This module is maintained by the OCA. + +.. image:: https://odoo-community.org/logo.png + :alt: Odoo Community Association + :target: https://odoo-community.org + +OCA, or the Odoo Community Association, is a nonprofit organization whose +mission is to support the collaborative development of Odoo features and +promote its widespread use. + +.. |maintainer-lmignon| image:: https://github.com/lmignon.png?size=40px + :target: https://github.com/lmignon + :alt: lmignon + +Current `maintainer `__: + +|maintainer-lmignon| + +This module is part of the `OCA/server-tools `_ project on GitHub. + +You are welcome to contribute. To learn how please visit https://odoo-community.org/page/Contribute. diff --git a/field_vector/__init__.py b/field_vector/__init__.py new file mode 100644 index 00000000000..6d58305f5dd --- /dev/null +++ b/field_vector/__init__.py @@ -0,0 +1,2 @@ +from . import models +from .hooks import pre_init_hook diff --git a/field_vector/__manifest__.py b/field_vector/__manifest__.py new file mode 100644 index 00000000000..243e53c2272 --- /dev/null +++ b/field_vector/__manifest__.py @@ -0,0 +1,18 @@ +# Copyright 2025 ACSONE SA/NV +# License LGPL-3.0 or later (https://www.gnu.org/licenses/lgpl). 
+ +{ + "name": "Field Vector", + "summary": """New specialized field to store vector data""", + "version": "18.0.1.0.0", + "license": "LGPL-3", + "author": "ACSONE SA/NV,Odoo Community Association (OCA)", + "website": "https://github.com/OCA/server-tools", + "depends": ["base"], + "maintainers": ["lmignon"], + "installable": True, + "pre_init_hook": "pre_init_hook", + "external_dependencies": { + "python": ["numpy"], + }, +} diff --git a/field_vector/fields.py b/field_vector/fields.py new file mode 100644 index 00000000000..ca8ca4615ee --- /dev/null +++ b/field_vector/fields.py @@ -0,0 +1,241 @@ +# Copyright 2025 ACSONE SA/NV +# License LGPL-3.0 or later (https://www.gnu.org/licenses/lgpl). +from __future__ import annotations + +from operator import attrgetter + +import numpy as np + +from odoo import fields +from odoo.tools import sql + + +class VectorValue: + """ + Class to represent a vector value. + This class is a wrapper around the vector's list of values + to allow for easy manipulation and conversion to/from other formats. + + It's designed to be put in the record's cache and returned as the record's value. + It's also used when the database is queried to convert the value to/from + the database format in a transparent way. + """ + + def __init__(self, value: list | tuple | np.ndarray, dimensions=None, autopad=True): + if not isinstance(value, list | tuple | np.ndarray): + raise ValueError( + f"Invalid type '{type(value)}' for VectorValue: " + "Only list, tuple or np.ndarray are allowed." 
+ ) + if isinstance(value, np.ndarray): + if value.dtype != ">f4": + value = value.astype(">f4") + value = value.tolist() + self._value = value + if dimensions is not None and len(value) != dimensions and autopad: + self.pad(dimensions) + + def __repr__(self): + return f"VectorValue({self._value})" + + def __eq__(self, value: object, /) -> bool: + if isinstance(value, self.__class__): + return np.array_equal(self._value, value._value) + return False + + def __len__(self): + return len(self._value) + + def to_list(self): + """ + Convert the vector value to a list. + """ + return list(self._value) + + def pad(self, dimensions: int): + """ + Pad the vector value with zeros up to the given size. + """ + if len(self._value) < dimensions: + self._value = [*self._value, *([0] * (dimensions - len(self._value)))] + return self + + @property + def value(self): + """ + Return the value as a numpy array. + """ + return np.asarray(self._value, dtype=">f4") + + @property + def dimensions(self): + """ + Return the dimensions of the vector. + """ + return len(self._value) + + @classmethod + def _from_db(cls, value: str) -> VectorValue: + """ + Convert a text value from the database to a VectorValue. + """ + if value is None: + return None + return cls([float(v) for v in value[1:-1].split(",")]) + + @classmethod + def _to_db(cls, value: list | tuple | np.ndarray | VectorValue) -> str: + """ + Convert a VectorValue to a text value for the database. + """ + if value is None: + return None + if isinstance(value, list | tuple | np.ndarray): + value = cls(value) + if not isinstance(value, cls): + raise ValueError( + f"Invalid type '{type(value)}' for VectorValue: " + "Only list, tuple or np.ndarray or VectorValue are allowed." + ) + return "[" + ",".join([str(float(v)) for v in value.value]) + "]" + + +class Vector(fields.Field): + """ + Specialized field to store vector data. + This field is based on the pgvector extension for PostgreSQL. 
+ It allows storing and manipulating vector data efficiently. + + This field can be used to store vectors of any size. + The dimension of the vector is defined at the field level. + + By default, the field is not pre-fetched. + To ease the use of the field, shorter values are padded with zeros by default. + + + """ + + type = "vector" + dimensions = None + prefetch = False + autopad = True + + def __init__( + self, + dimensions=fields.SENTINEL, + string=fields.SENTINEL, + autopad=fields.SENTINEL, + **kwargs, + ): + super().__init__( + dimensions=dimensions, string=string, autopad=autopad, **kwargs + ) + + def vector_dimensions(self, record): + return self.dimensions + + def _setup_attrs(self, model_class, name): + res = super()._setup_attrs(model_class, name) + if self.store and ( + self.dimensions == fields.SENTINEL + or self.dimensions is None + or not isinstance(self.dimensions, int) + ): + raise ValueError( + "The size of the vector field must be an integer and cannot be None." + ) + return res + + @property + def column_type(self): + return ("vector", self._get_pg_type(self.dimensions)) + + def _get_pg_type(self, dimensions): + return f"vector({dimensions})" + + def get_current_vector_size(self, cr, table, column): + """Fetch the current vector size from pg_attribute.atttypmod""" + cr.execute( + """ + SELECT atttypmod + FROM pg_attribute + JOIN pg_class ON pg_class.oid = pg_attribute.attrelid + WHERE pg_class.relname = %s + AND pg_attribute.attname = %s + """, + (table, column), + ) + result = cr.fetchone() + if result and result[0]: + return result[0] + return None + + def update_db_column(self, model, column): + if column: + db_size = self.get_current_vector_size(model._cr, model._table, self.name) + if db_size is not None and db_size != self.vector_dimensions(model): + sql.convert_column( + model._cr, + model._table, + self.name, + self._get_pg_type(self.vector_dimensions(model)), + ) + return super().update_db_column(model, column) + + _related_dimensions = 
property(attrgetter("dimensions")) + _description_dimensions = property(attrgetter("dimensions")) + + def convert_to_export(self, value: VectorValue, record): + return value.to_list() if value else None + + def convert_to_cache(self, value, record, validate=True): + if value is None or value is False: + return None + if not isinstance(value, list | tuple | np.ndarray | VectorValue): + raise ValueError( + f"Invalid type '{type(value)}' for {self.name}: " + "Only np.ndarray or list of floats/int are allowed." + ) + if not isinstance(value, VectorValue): + value = VectorValue( + value, dimensions=self.vector_dimensions(record), autopad=self.autopad + ) + if self.autopad and value.dimensions < self.vector_dimensions(record): + value = value.pad(self.vector_dimensions(record)) + if validate and value.dimensions != self.vector_dimensions(record): + raise ValueError( + f"Invalid vector size for {self.name}: " + f"{value.dimensions} != {self.vector_dimensions(record)}" + ) + return value + + def convert_to_record(self, value, record): + if value is None or value is False: + return None + if not isinstance(value, list | tuple | np.ndarray | VectorValue): + raise ValueError( + f"Invalid type '{type(value)}' for {self.name}: " + "Only np.ndarray, list of floats/int or VectorValue are allowed." 
+ ) + if not isinstance(value, VectorValue): + value = VectorValue( + value, dimensions=self.vector_dimensions(record), autopad=self.autopad + ) + if self.autopad and value.dimensions < self.vector_dimensions(record): + value = value.pad(self.vector_dimensions(record)) + + if value.dimensions != self.vector_dimensions(record): + raise ValueError( + f"Invalid vector dimensions for {self.name}: " + f"{value.dimensions} != {self.dimensions}" + ) + return value + + def convert_to_read(self, value, record, use_name_get=True): + return self.convert_to_export(value, record) + + def convert_to_column(self, value, record, values=None, validate=True): + return self.convert_to_record(value, record) + + def convert_to_write(self, value, record): + return self.convert_to_column(value, record) diff --git a/field_vector/hooks.py b/field_vector/hooks.py new file mode 100644 index 00000000000..0421a32fead --- /dev/null +++ b/field_vector/hooks.py @@ -0,0 +1,27 @@ +# Copyright 2025 ACSONE SA/NV +# License LGPL-3.0 or later (https://www.gnu.org/licenses/lgpl). + + +def pre_init_hook(env): + """setup vector extension if not already setup""" + env.cr.execute("SELECT typname, oid FROM pg_type WHERE oid = to_regtype('vector')") + type_info = dict(env.cr.fetchall()) + if "vector" in type_info: + return {} + try: + env.cr.execute( + """ + CREATE EXTENSION IF NOT EXISTS vector; + """ + ) + except Exception: + import logging + + _logger = logging.getLogger(__name__) + _logger.warning( + "Could not automatically initialize pgvector support. " + "Database user may need superuser privileges and pgvector " + "extension must be installed. To manually prepare your " + "database, run as superuser:\n" + "CREATE EXTENSION vector;" + ) diff --git a/field_vector/i18n/field_vector.pot b/field_vector/i18n/field_vector.pot new file mode 100644 index 00000000000..59f5418dbad --- /dev/null +++ b/field_vector/i18n/field_vector.pot @@ -0,0 +1,34 @@ +# Translation of Odoo Server. 
+# This file contains the translation of the following modules: +# * field_vector +# +msgid "" +msgstr "" +"Project-Id-Version: Odoo Server 18.0\n" +"Report-Msgid-Bugs-To: \n" +"Last-Translator: \n" +"Language-Team: \n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: \n" +"Plural-Forms: \n" + +#. module: field_vector +#: model:ir.model.fields,field_description:field_vector.field_ir_model_fields__ttype +msgid "Field Type" +msgstr "" + +#. module: field_vector +#: model:ir.model,name:field_vector.model_ir_model_fields +msgid "Fields" +msgstr "" + +#. module: field_vector +#: model:ir.model.fields,field_description:field_vector.field_ir_model_fields__smart_search +msgid "Smart Search" +msgstr "" + +#. module: field_vector +#: model:ir.model.fields.selection,name:field_vector.selection__ir_model_fields__ttype__vector +msgid "Vector" +msgstr "" diff --git a/field_vector/i18n/it.po b/field_vector/i18n/it.po new file mode 100644 index 00000000000..d2f5683d65f --- /dev/null +++ b/field_vector/i18n/it.po @@ -0,0 +1,53 @@ +# Translation of Odoo Server. +# This file contains the translation of the following modules: +# * field_vector +# +msgid "" +msgstr "" +"Project-Id-Version: Odoo Server 16.0\n" +"Report-Msgid-Bugs-To: \n" +"PO-Revision-Date: 2025-11-10 09:29+0000\n" +"Last-Translator: mymage \n" +"Language-Team: none\n" +"Language: it\n" +"MIME-Version: 1.0\n" +"Content-Type: text/plain; charset=UTF-8\n" +"Content-Transfer-Encoding: \n" +"Plural-Forms: nplurals=2; plural=n != 1;\n" +"X-Generator: Weblate 5.10.4\n" + +#. module: field_vector +#. odoo-python +#: code:addons/field_vector/hooks.py:0 +#, python-format +msgid "" +"Error, can not automatically initialize vector support. Database user may have to be superuser and pgvector extensions to be installed. If you do not want Odoo to connect with a super user you can manually prepare your database. 
To dothis, open a client to your database using a super user and run:\n" +"CREATE EXTENSION vector;\n" +msgstr "" +"Errore, impossibile inizializzare automaticamente il supporto vettoriale. " +"L'utente del database potrebbe dover essere un superutente e le estensioni " +"pgvector devono essere installate. Se non si desidera che Odoo si connetta " +"con un superutente, è possibile preparare manualmente il database. Per " +"farlo, aprire un client per il database utilizzando un superutente ed " +"eseguire:\n" +"CREATE EXTENSION vector;\n" + +#. module: field_vector +#: model:ir.model.fields,field_description:field_vector.field_ir_model_fields__ttype +msgid "Field Type" +msgstr "Tipo campo" + +#. module: field_vector +#: model:ir.model,name:field_vector.model_ir_model_fields +msgid "Fields" +msgstr "Campi" + +#. module: field_vector +#: model:ir.model.fields,field_description:field_vector.field_ir_model_fields__smart_search +msgid "Smart Search" +msgstr "Ricerca intelligente" + +#. module: field_vector +#: model:ir.model.fields.selection,name:field_vector.selection__ir_model_fields__ttype__vector +msgid "Vector" +msgstr "Vettore" diff --git a/field_vector/models/__init__.py b/field_vector/models/__init__.py new file mode 100644 index 00000000000..4236f0a44c0 --- /dev/null +++ b/field_vector/models/__init__.py @@ -0,0 +1 @@ +from . import ir_model_fields diff --git a/field_vector/models/ir_model_fields.py b/field_vector/models/ir_model_fields.py new file mode 100644 index 00000000000..99ffd847dbb --- /dev/null +++ b/field_vector/models/ir_model_fields.py @@ -0,0 +1,30 @@ +# Copyright 2025 ACSONE SA/NV +# License LGPL-3.0 or later (https://www.gnu.org/licenses/lgpl). 
+from odoo import fields, models + +from ..register import register_vector + + +class IrModelFields(models.Model): + _inherit = "ir.model.fields" + + ttype = fields.Selection( + selection_add=[("vector", "Vector")], + ondelete={"vector": "cascade"}, + ) + + def init(self): + # This method is called when the module is installed, + # at installation time, to register the vector type with psycopg2. + # This is needed to ensure that the type is registered + # when running tests at installation time. + res = super().init() + register_vector(self.env.cr) + return res + + def _register_hook(self): + # This method is called when the module is loaded to + # register the vector type with psycopg2. + res = super()._register_hook() + register_vector(self.env.cr) + return res diff --git a/field_vector/pyproject.toml b/field_vector/pyproject.toml new file mode 100644 index 00000000000..4231d0cccb3 --- /dev/null +++ b/field_vector/pyproject.toml @@ -0,0 +1,3 @@ +[build-system] +requires = ["whool"] +build-backend = "whool.buildapi" diff --git a/field_vector/readme/CONFIGURE.md b/field_vector/readme/CONFIGURE.md new file mode 100644 index 00000000000..2fdb0e64a6f --- /dev/null +++ b/field_vector/readme/CONFIGURE.md @@ -0,0 +1,10 @@ +[ This file is not always required; it should explain **how to configure the module before using it**; it is aimed at users with administration privileges. + +Please be detailed on the path to configuration (eg: do you need to activate developer mode?), describe step by step configurations and the use of screenshots is strongly recommended.] 
+ + +To configure this module, you need to: + +- Go to *App* > Menu > Menu item +- Activate boolean… > save +- … diff --git a/field_vector/readme/CONTEXT.md b/field_vector/readme/CONTEXT.md new file mode 100644 index 00000000000..692befa8dc3 --- /dev/null +++ b/field_vector/readme/CONTEXT.md @@ -0,0 +1,4 @@ +The advent of large language models (LLMs) has highlighted the importance of vector +representation as a powerful way to encode data and easily determine the +similarity between different pieces of information. +Vector representation is a way of encoding information in a numerical format that captures the semantic meaning of the data. This allows for efficient similarity comparisons. \ No newline at end of file diff --git a/field_vector/readme/CONTRIBUTORS.md b/field_vector/readme/CONTRIBUTORS.md new file mode 100644 index 00000000000..8af73de7ea8 --- /dev/null +++ b/field_vector/readme/CONTRIBUTORS.md @@ -0,0 +1 @@ +- Laurent Mignon (https://www.acsone.eu) \ No newline at end of file diff --git a/field_vector/readme/CREDITS.md b/field_vector/readme/CREDITS.md new file mode 100644 index 00000000000..dc3c6118c1d --- /dev/null +++ b/field_vector/readme/CREDITS.md @@ -0,0 +1,4 @@ +The development of this module has been financially supported by: + +- [Alcyon Belux](https://www.alcyonbelux.be/) + diff --git a/field_vector/readme/DESCRIPTION.md b/field_vector/readme/DESCRIPTION.md new file mode 100644 index 00000000000..b5e83335ce4 --- /dev/null +++ b/field_vector/readme/DESCRIPTION.md @@ -0,0 +1 @@ +This addon provides a new field type called "Vector" that allows you to store and manage vectors in your Odoo database. 
\ No newline at end of file diff --git a/field_vector/readme/INSTALL.md b/field_vector/readme/INSTALL.md new file mode 100644 index 00000000000..ab8ddeaed1f --- /dev/null +++ b/field_vector/readme/INSTALL.md @@ -0,0 +1 @@ +To install this module, you need to ensure that the [**pgvector**](https://github.com/pgvector/pgvector) extension is installed and available in your PostgreSQL instance. diff --git a/field_vector/readme/ROADMAP.md b/field_vector/readme/ROADMAP.md new file mode 100644 index 00000000000..215bfbb5275 --- /dev/null +++ b/field_vector/readme/ROADMAP.md @@ -0,0 +1,7 @@ +- allow the use of specific operators in domain filters to search for similar vectors. +- dedicated widget to display the vector in a more user-friendly way. +- evaluate removing the psycopg2 adapter (register.py) in favor of explicit + casting in convert_to_column/convert_to_cache. Currently the adapter must be + registered before any SQL query reads vector columns, which creates an implicit + dependency on ir.model.fields._register_hook execution order. Without the adapter, + plain SQL queries would return raw strings instead of VectorValue objects. diff --git a/field_vector/readme/USAGE.md b/field_vector/readme/USAGE.md new file mode 100644 index 00000000000..0ebfd68be1a --- /dev/null +++ b/field_vector/readme/USAGE.md @@ -0,0 +1,93 @@ + +> **⚠️ Warning** +> This addon is **not compatible** with the Python `pgvector` library. Please ensure that you do not use this library alongside the addon to avoid potential issues. This is mainly due to the fact that numpy arrays can't be stored in the Odoo cache since they are not comparable with the default '==' or '!=' operators. + +The module is a technical module providing a new field type called "Vector". It's intended to be used by developers who want to store and manage vector data in their Odoo database when they develop their own modules. 
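As a rough mental model of what ends up in the database, a vector can be pictured as pgvector's text literal: a bracketed, comma-separated list such as `[1.0,2.0,0.0]`. The sketch below is a minimal, standalone illustration in plain Python of zero-padding a short vector and converting it to and from that text form. The helper names `pad`, `to_pg_text` and `from_pg_text` are hypothetical and not part of the addon, and the sketch leaves out the float32 conversion that `VectorValue` applies.

```python
def pad(values, dimensions):
    """Right-pad a short vector with zeros; longer vectors are returned unchanged."""
    values = list(values)
    return values + [0.0] * max(0, dimensions - len(values))


def to_pg_text(values):
    """Serialize a sequence of numbers to a bracketed, comma-separated literal."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def from_pg_text(text):
    """Parse a bracketed, comma-separated literal back into a list of floats."""
    return [float(v) for v in text[1:-1].split(",")]


padded = pad([1, 2], 3)       # hypothetical 3-dimensional field, 2 values given
literal = to_pg_text(padded)
print(literal)                # [1.0,2.0,0.0]
print(from_pg_text(literal))  # [1.0, 2.0, 0.0]
```

Keeping padding separate from serialization makes it easy to see that a length mismatch is a validation concern, while the text form itself carries no dimension information.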
+ +## Field declaration + +To declare a field of type vector, you can use the following syntax: + +```python + +from odoo.addons.field_vector.fields import Vector + + +class YourModel(models.Model): + _name = 'your.model' + + vector_field = Vector(dimensions=3) +``` + +The `dimensions` parameter is required and specifies the number of dimensions of the vector. The field will be stored as a `vector` type in PostgreSQL, the type provided by the pgvector extension. + +By default the field is declared with `prefetch=False` and `autopad=True`. +You can override these parameters by passing them as arguments to the field: + +```python +from odoo.addons.field_vector.fields import Vector +class YourModel(models.Model): + _name = 'your.model' + + vector_field = Vector(dimensions=3, prefetch=True, autopad=False) +``` + +The `prefetch` parameter allows you to enable or disable prefetching of the field when loading records. If set to `True`, the field will be prefetched when loading records, which can improve performance when accessing the field frequently. If set to `False`, the field will not be prefetched, which can save memory and improve performance when accessing the field infrequently (which would be the common case). + +The `autopad` parameter allows you to enable or disable automatic padding of the vector when storing it in the database. If set to `True`, the vector will be automatically padded with zeros to match the specified dimensions. If set to `False`, the vector will not be padded, and an error will be raised if its length does not match the specified dimensions. + +## Field usage + +The vector field can be used like any other field in Odoo. When accessing the field, it will always return an `odoo.addons.field_vector.fields.VectorValue` object, which is a wrapper around the value stored in the database. This object +provides a convenient way to get the value of the vector as a numpy array. 
+ +```python +import numpy as np +from odoo.addons.field_vector.fields import VectorValue + +record = self.env['your.model'].create({ + 'vector_field': [1.0, 2.0, 3.0] +}) + +assert isinstance(record.vector_field, VectorValue) +assert isinstance(record.vector_field.value, np.ndarray) + +``` + +When setting the field, you can pass a list or tuple of values, a numpy array, or a `VectorValue` object. The field will automatically convert the value to a `VectorValue` and store it in the database in the vector format. + +```python + +record.vector_field = [1.0, 2.0, 3.0] +assert isinstance(record.vector_field, VectorValue) + +record.vector_field = np.array([1.0, 2.0, 3.0]) +assert isinstance(record.vector_field, VectorValue) + +record.vector_field = VectorValue([1.0, 2.0, 3.0]) +assert isinstance(record.vector_field, VectorValue) + +``` + +## Plain SQL queries + +When reading the field in plain SQL queries, the field will be returned as a +`VectorValue` object. You can use the `value` property to get the value of the vector as a numpy array. + +```python + +env.cr.execute('SELECT vector_field FROM your_model WHERE id = 1') +record = env.cr.fetchone() +vector_value = record[0] +assert isinstance(vector_value, VectorValue) +``` + +When writing the field in plain SQL queries, you can pass a numpy array, a list of values, or a `VectorValue` object as the value of the field (tuples are not supported in this case). 
+ +```python + +env.cr.execute('UPDATE your_model SET vector_field = %s WHERE id = 1', (np.array([1.0, 2.0, 3.0]),)) +env.cr.execute('UPDATE your_model SET vector_field = %s WHERE id = 1', ([1.0, 2.0, 3.0],)) +env.cr.execute('UPDATE your_model SET vector_field = %s WHERE id = 1', (VectorValue([1.0, 2.0, 3.0]),)) + +``` diff --git a/field_vector/register.py b/field_vector/register.py new file mode 100644 index 00000000000..a730e732834 --- /dev/null +++ b/field_vector/register.py @@ -0,0 +1,37 @@ +# Copyright 2025 ACSONE SA/NV +# License LGPL-3.0 or later (https://www.gnu.org/licenses/lgpl). + +import numpy as np +from psycopg2.extensions import adapt, new_type, register_adapter, register_type + +from .fields import VectorValue + +_is_vector_type_registered = False + + +class VectorAdapter: + def __init__(self, value): + self._value = value + + def getquoted(self): + return adapt(VectorValue._to_db(self._value)).getquoted() + + +def cast_vector(value, cur): + return VectorValue._from_db(value) + + +def register_vector(cr): + global _is_vector_type_registered + if _is_vector_type_registered: + return + cr.execute("SELECT typname, oid FROM pg_type WHERE oid = to_regtype('vector')") + type_info = dict(cr.fetchall()) + if "vector" not in type_info: + raise ValueError("vector type not found in the database") + + vector = new_type((type_info["vector"],), "VECTOR", cast_vector) + register_type(vector) + register_adapter(np.ndarray, VectorAdapter) + register_adapter(VectorValue, VectorAdapter) + _is_vector_type_registered = True diff --git a/field_vector/static/description/icon.png b/field_vector/static/description/icon.png new file mode 100644 index 00000000000..3a0328b516c Binary files /dev/null and b/field_vector/static/description/icon.png differ diff --git a/field_vector/static/description/index.html b/field_vector/static/description/index.html new file mode 100644 index 00000000000..501cef37310 --- /dev/null +++ b/field_vector/static/description/index.html @@ -0,0 
+1,602 @@ + + + + + +README.rst + + + +
+ + + +Odoo Community Association + +
+

Field Vector

+ +

Beta License: LGPL-3 OCA/server-tools Translate me on Weblate Try me on Runboat

+

This addon provides a new field type called “Vector” that allows you to +store and manage vectors in your Odoo database.

+

Table of contents

+ +
+

Use Cases / Context

+

The advent of large language models (LLMs) has highlighted the +importance of vector representation as a powerful representation of data +to easily determine the similarity between different pieces of +information. Vector representation is a way of encoding information in a +numerical format that captures the semantic meaning of the data. This +allows for efficient similarity comparisons.
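As a rough illustration of what a similarity comparison between two vectors looks like, here is a minimal sketch of cosine similarity in plain Python. The function name `cosine_similarity` and the sample vectors are hypothetical and are not part of this addon.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional encodings of three pieces of information.
doc_a = [1.0, 2.0, 3.0]
doc_b = [2.0, 4.0, 6.0]   # same direction as doc_a
doc_c = [-3.0, 0.0, 1.0]  # orthogonal to doc_a

similar = cosine_similarity(doc_a, doc_b)
unrelated = cosine_similarity(doc_a, doc_c)
```

Vectors pointing in the same direction score close to 1, while orthogonal vectors score close to 0.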

+
+
+

Installation

+

To install this module, you need to ensure that the +pgvector extension is +installed and available in your PostgreSQL instance.

+
+
+

Configuration

+

[ This file is not always required; it should explain how to configure +the module before using it; it is aimed at users with administration +privileges.

+

Please be detailed on the path to configuration (eg: do you need to +activate developer mode?), describe step by step configurations and the +use of screenshots is strongly recommended.]

+

To configure this module, you need to:

+
    +
  • Go to App > Menu > Menu item
  • +
  • Activate boolean… > save
  • +
  • +
+
+
+

Usage

+
+
+
⚠️ Warning
+
This addon is not compatible with the Python pgvector +library. Do not use that library alongside +the addon. The main reason is +that numpy arrays cannot be stored in the Odoo cache, since +comparing them with ‘==’ or ‘!=’ yields an element-wise array rather than a single boolean.
+
+
+
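The comparison problem mentioned in the warning can be reproduced with numpy alone. The sketch below shows that `==` on arrays is element-wise, and how a wrapper can expose a single boolean instead. `ComparableVector` is a hypothetical stand-in written for this example; it is not the addon's `VectorValue` implementation.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 3.0])

# "==" compares element by element and returns an array, not a bool.
elementwise = a == b

# Using that array where a single truth value is expected fails.
try:
    bool(elementwise)
    ambiguous = False
except ValueError:
    ambiguous = True


class ComparableVector:
    """Hypothetical wrapper giving an array scalar equality semantics."""

    def __init__(self, values):
        self.value = np.asarray(values, dtype=float)

    def __eq__(self, other):
        if not isinstance(other, ComparableVector):
            return NotImplemented
        return bool(np.array_equal(self.value, other.value))

    # Defining __eq__ already makes instances unhashable; stated explicitly.
    __hash__ = None


same = ComparableVector(a) == ComparableVector(b)
```

Because `__eq__` returns a plain `bool`, code that checks whether a value has changed can use `==` and `!=` directly.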

The module is a technical module providing a new field type called +“Vector”. It’s intended to be used by developers who want to store and +manage vector data in their Odoo database when they develop their own +modules.

+
+

Field declaration

+

To declare a field of type vector, you can use the following syntax:

+
+from odoo import models
+from odoo.addons.field_vector.fields import Vector
+
+
+class YourModel(models.Model):
+    _name = 'your.model'
+
+    vector_field = Vector(dimensions=3)
+
+

The dimensions parameter is required and specifies the number of +dimensions of the vector. The field is stored as a vector column +in PostgreSQL, a type provided by the pgvector extension.

+

By default the field is declared with prefetch=False and +autopad=True. You can override these parameters by passing them as +arguments to the field:

+
+from odoo import models
+from odoo.addons.field_vector.fields import Vector
+
+
+class YourModel(models.Model):
+    _name = 'your.model'
+
+    vector_field = Vector(dimensions=3, prefetch=True, autopad=False)
+
+

The prefetch parameter allows you to enable or disable prefetching +of the field when loading records. If set to True, the field will be +prefetched when loading records, which can improve performance when +accessing the field frequently. If set to False, the field will not +be prefetched, which can save memory and improve performance when +accessing the field infrequently (which would be the common case).

+

The autopad parameter enables or disables automatic +padding of the vector when storing it in the database. If set to +True, a vector shorter than the specified dimensions is padded with zeros to match +them. If set to False, the vector is not +padded, and an error is raised if it is shorter than the specified +dimensions.
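The padding rule can be sketched in plain Python. `normalize_vector` is a hypothetical helper and not the addon's code; in particular, the rejection of vectors longer than `dimensions` is an assumption made for this sketch.

```python
def normalize_vector(values, dimensions, autopad=True):
    """Return ``values`` as a list of exactly ``dimensions`` floats."""
    vector = [float(v) for v in values]
    if len(vector) > dimensions:
        # Assumption: over-long vectors are rejected rather than truncated.
        raise ValueError("Invalid vector dimensions")
    if len(vector) < dimensions:
        if not autopad:
            raise ValueError("Invalid vector dimensions")
        # Pad with trailing zeros up to the declared size.
        vector.extend([0.0] * (dimensions - len(vector)))
    return vector


padded = normalize_vector([1, 2], dimensions=3)

try:
    normalize_vector([1, 2], dimensions=3, autopad=False)
    rejected = False
except ValueError:
    rejected = True
```

With `autopad=True` the two-element input gains one trailing zero; with `autopad=False` the same input is refused.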

+
+
+

Field usage

+

The vector field can be used like any other field in Odoo. When +accessing the field, it will always return an +odoo.addons.field_vector.fields.VectorValue object, which is a +wrapper around the value stored in the database. This object provides a +convenient way to get the value of the vector as a numpy array.

+
+import numpy as np
+from odoo.addons.field_vector.fields import VectorValue
+
+record = self.env['your.model'].create({
+    'vector_field': [1.0, 2.0, 3.0]
+})
+
+assert isinstance(record.vector_field, VectorValue)
+assert isinstance(record.vector_field.value, np.ndarray)
+
+

When setting the field, you can pass a list or tuple of values, a numpy array, +or a VectorValue object. The field +automatically converts the value to a VectorValue and stores it in the +database in the vector format.

+
+record.vector_field = [1.0, 2.0, 3.0]
+assert isinstance(record.vector_field, VectorValue)
+
+record.vector_field = np.array([1.0, 2.0, 3.0])
+assert isinstance(record.vector_field, VectorValue)
+
+record.vector_field = VectorValue([1.0, 2.0, 3.0])
+assert isinstance(record.vector_field, VectorValue)
+
+
+
+

Plain SQL queries

+

When reading the field in plain SQL queries, the field will be returned +as a VectorValue object. You can use the value property to get +the value of the vector as a numpy array.

+
+env.cr.execute('SELECT vector_field FROM your_model WHERE id = 1')
+record = env.cr.fetchone()
+vector_value = record[0]
+assert isinstance(vector_value, VectorValue)
+
+

When writing the field in plain SQL queries, you can pass a numpy array, +a list of values, or a VectorValue object as the value of the field +(tuples are not supported in this case).

+
+env.cr.execute('UPDATE your_model SET vector_field = %s WHERE id = 1', (np.array([1.0, 2.0, 3.0]),))
+env.cr.execute('UPDATE your_model SET vector_field = %s WHERE id = 1', ([1.0, 2.0, 3.0],))
+env.cr.execute('UPDATE your_model SET vector_field = %s WHERE id = 1', (VectorValue([1.0, 2.0, 3.0]),))
+
+
+
+
+

Known issues / Roadmap

+
    +
  • allow the use of specific operators in domain filters to search for +similar vectors.
  • +
  • dedicated widget to display the vector in a more user-friendly way.
  • +
  • evaluate removing the psycopg2 adapter (register.py) in favor of +explicit casting in convert_to_column/convert_to_cache. Currently the +adapter must be registered before any SQL query reads vector columns, +which creates an implicit dependency on ir.model.fields._register_hook +execution order. Without the adapter, plain SQL queries would return +raw strings instead of VectorValue objects.
  • +
+
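To illustrate the last roadmap item: without a registered adapter, a vector column read through plain SQL would arrive as text. A minimal sketch of explicit conversion follows, assuming the textual form `[1,2,3]` that pgvector uses for vector values. The helper names `vector_from_text` and `vector_to_text` are hypothetical and are not part of the addon.

```python
def vector_from_text(text):
    """Parse the textual form of a vector, e.g. '[1,2,3]', into floats."""
    stripped = text.strip()
    if not (stripped.startswith("[") and stripped.endswith("]")):
        raise ValueError(f"not a vector literal: {text!r}")
    body = stripped[1:-1].strip()
    return [float(item) for item in body.split(",")] if body else []


def vector_to_text(values):
    """Format an iterable of numbers as a vector literal."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


raw = "[1,2.5,3]"  # what a driver would hand back without a type caster
parsed = vector_from_text(raw)
round_trip = vector_from_text(vector_to_text(parsed))
```

Explicit helpers like these would make the conversion visible at each call site, at the cost of plain SQL results no longer being wrapped automatically.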
+
+

Bug Tracker

+

Bugs are tracked on GitHub Issues. +In case of trouble, please check there if your issue has already been reported. +If you spotted it first, help us to smash it by providing a detailed and welcomed +feedback.

+

Do not contact contributors directly about support or help with technical issues.

+
+
+

Credits

+
+

Authors

+
    +
  • ACSONE SA/NV
  • +
+
+ +
+

Other credits

+

The development of this module has been financially supported by:

+ +
+
+

Maintainers

+

This module is maintained by the OCA.

+ +Odoo Community Association + +

OCA, or the Odoo Community Association, is a nonprofit organization whose +mission is to support the collaborative development of Odoo features and +promote its widespread use.

+

Current maintainer:

+

lmignon

+

This module is part of the OCA/server-tools project on GitHub.

+

You are welcome to contribute. To learn how please visit https://odoo-community.org/page/Contribute.

+
+
+
+
+ + diff --git a/field_vector/tests/__init__.py b/field_vector/tests/__init__.py new file mode 100644 index 00000000000..4bea0213768 --- /dev/null +++ b/field_vector/tests/__init__.py @@ -0,0 +1,2 @@ +from . import test_field_vector +from . import test_field_vector_update diff --git a/field_vector/tests/models.py b/field_vector/tests/models.py new file mode 100644 index 00000000000..e9930069d28 --- /dev/null +++ b/field_vector/tests/models.py @@ -0,0 +1,22 @@ +# Copyright 2025 ACSONE SA/NV +# License LGPL-3.0 or later (https://www.gnu.org/licenses/lgpl). + +# DON'T IMPORT THIS MODULE IN INIT TO AVOID THE CREATION OF THE MODELS +# DEFINED FOR TESTS INTO YOUR ODOO INSTANCE +from odoo import models + +from ..fields import Vector + + +class TestModel(models.Model): + _name = "vector.model" + _description = "vector.model Fake Model" + + vector = Vector(dimensions=3, string="Default Vector") + no_autopad = Vector(dimensions=3, string="Vector not autopadded", autopad=False) + + +class TestModelUpgrade(models.Model): + _inherit = "vector.model" + + vector = Vector(dimensions=5) diff --git a/field_vector/tests/test_field_vector.py b/field_vector/tests/test_field_vector.py new file mode 100644 index 00000000000..6ad6e5198eb --- /dev/null +++ b/field_vector/tests/test_field_vector.py @@ -0,0 +1,147 @@ +# Copyright 2025 ACSONE SA/NV +# License LGPL-3.0 or later (https://www.gnu.org/licenses/lgpl). 
+import numpy as np +from odoo_test_helper import FakeModelLoader +from psycopg2.extensions import AsIs + +from odoo.addons.base.tests.common import BaseCommon + +from ..fields import VectorValue + + +class TestFieldVector(BaseCommon): + @classmethod + def setUpClass(cls): + res = super().setUpClass() + cls.loader = FakeModelLoader(cls.env, cls.__module__) + cls.loader.backup_registry() + cls.addClassCleanup(cls.loader.restore_registry) + + # pylint: disable=import-outside-toplevel + from .models import TestModel + + cls.loader.update_registry([TestModel]) + + cls.TestModel = cls.env[TestModel._name] + + return res + + def test_create_from_tuple(self): + record = self.TestModel.create({"vector": (1, 2, 3)}) + self.assertListEqual([1, 2, 3], record.vector.to_list()) + + def test_create_from_list(self): + record = self.TestModel.create({"vector": [1, 2, 3]}) + self.assertListEqual([1, 2, 3], record.vector.to_list()) + + def test_create_autopad(self): + record = self.TestModel.create({"vector": [1, 2]}) + self.assertListEqual([1, 2, 0], record.vector.to_list()) + + def test_create_no_autopad(self): + with self.assertRaisesRegex( + ValueError, + "Invalid vector dimensions", + ): + self.TestModel.create({"no_autopad": [1, 2]}) + + record = self.TestModel.create({"no_autopad": [1, 2, 3]}) + self.assertListEqual([1, 2, 3], record.no_autopad.to_list()) + + def test_from_db(self): + record = self.TestModel.create({"vector": [1, 2, 3]}) + record.flush_recordset() + record.invalidate_model() + new_record = self.TestModel.browse(record.id) + val = new_record.vector + self.assertIsInstance(val, VectorValue) + self.assertEqual(val.to_list(), [1, 2, 3]) + + def test_plain_sql_select(self): + record = self.TestModel.create({"vector": [1, 2, 3]}) + record.flush_recordset() + self.env.cr.execute( + "SELECT vector FROM %s WHERE id = %s", + ( + AsIs(record._table), + record.id, + ), + ) + val = self.env.cr.fetchone()[0] + # Even if we use plain SQL, the value is still a VectorValue + 
# because of the adapter registered for the vector type + # in the database. + self.assertIsInstance(val, VectorValue) + self.assertEqual(val.to_list(), [1, 2, 3]) + + def test_plain_sql_write(self): + record = self.TestModel.create({"vector": [1, 2, 3]}) + record.flush_recordset() + # as VectorValue + self.env.cr.execute( + "UPDATE %s SET vector = %s WHERE id = %s", + ( + AsIs(record._table), + VectorValue([4, 5, 6]), + record.id, + ), + ) + record.invalidate_model() + new_record = self.TestModel.browse(record.id) + val = new_record.vector + self.assertIsInstance(val, VectorValue) + self.assertEqual(val.to_list(), [4, 5, 6]) + + # as list + self.env.cr.execute( + "UPDATE %s SET vector = %s WHERE id = %s", + ( + AsIs(record._table), + [7, 8, 9], + record.id, + ), + ) + record.invalidate_model() + new_record = self.TestModel.browse(record.id) + val = new_record.vector + self.assertIsInstance(val, VectorValue) + self.assertEqual(val.to_list(), [7, 8, 9]) + + # as numpy array + self.env.cr.execute( + "UPDATE %s SET vector = %s WHERE id = %s", + ( + AsIs(record._table), + np.array([10, 11, 12]), + record.id, + ), + ) + record.invalidate_model() + new_record = self.TestModel.browse(record.id) + val = new_record.vector + self.assertIsInstance(val, VectorValue) + self.assertEqual(val.to_list(), [10, 11, 12]) + + def test_write(self): + record = self.TestModel.create({"vector": [1, 2, 3]}) + record.flush_recordset() + record.vector = [4, 5, 6] + value = record.vector + self.assertIsInstance(value, VectorValue) + self.assertEqual(value.to_list(), [4, 5, 6]) + record.flush_recordset() + record.invalidate_model() + new_record = self.TestModel.browse(record.id) + self.assertEqual(new_record.vector.to_list(), [4, 5, 6]) + record.vector = np.array([7, 8, 9]) + value = record.vector + self.assertIsInstance(value, VectorValue) + self.assertEqual(value.to_list(), [7, 8, 9]) + + def test_read(self): + record = self.TestModel.create({"vector": [1, 2, 3]}) + record.flush_recordset() + 
record.invalidate_model() + new_record = self.TestModel.browse(record.id) + val = new_record.read(["vector"])[0]["vector"] + self.assertEqual(val, [1, 2, 3]) diff --git a/field_vector/tests/test_field_vector_update.py b/field_vector/tests/test_field_vector_update.py new file mode 100644 index 00000000000..b8e4125c220 --- /dev/null +++ b/field_vector/tests/test_field_vector_update.py @@ -0,0 +1,39 @@ +# Copyright 2025 ACSONE SA/NV +# License LGPL-3.0 or later (https://www.gnu.org/licenses/lgpl). +from odoo_test_helper import FakeModelLoader + +from odoo.addons.base.tests.common import BaseCommon + + +class TestFieldVectorUpdate(BaseCommon): + def setUp(self): + res = super().setUp() + self.loader = FakeModelLoader(self.env, self.__module__) + self.loader.backup_registry() + self.addCleanup(self.loader.restore_registry) + + # pylint: disable=import-outside-toplevel + from .models import TestModel + + self.loader.update_registry([TestModel]) + + self.TestModel = self.env[TestModel._name] + + return res + + def test_update_db_column(self): + self.assertEqual( + self.TestModel._fields["vector"].get_current_vector_size( + self.env.cr, self.TestModel._table, "vector" + ), + 3, + ) + from .models import TestModelUpgrade + + self.loader.update_registry([TestModelUpgrade]) + self.assertEqual( + self.TestModel._fields["vector"].get_current_vector_size( + self.env.cr, self.TestModel._table, "vector" + ), + 5, + ) diff --git a/requirements.txt b/requirements.txt index 5d1fefa6f23..0f89476f74c 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,6 +1,7 @@ # generated from manifests external_dependencies cryptography dataclasses +numpy odoo_test_helper odoorpc openpyxl diff --git a/setup/_metapackage/pyproject.toml b/setup/_metapackage/pyproject.toml index 783bf6d717f..cb30beca583 100644 --- a/setup/_metapackage/pyproject.toml +++ b/setup/_metapackage/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "odoo-addons-oca-server-tools" -version = "18.0.20260520.0" +version = 
"18.0.20260601.0" dependencies = [ "odoo-addon-attachment_delete_restrict==18.0.*", "odoo-addon-attachment_queue==18.0.*", @@ -36,6 +36,7 @@ dependencies = [ "odoo-addon-excel_import_export==18.0.*", "odoo-addon-fetchmail_attach_from_folder==18.0.*", "odoo-addon-fetchmail_notify_error_to_sender==18.0.*", + "odoo-addon-field_vector==18.0.*", "odoo-addon-html_text==18.0.*", "odoo-addon-iap_alternative_provider==18.0.*", "odoo-addon-jsonifier==18.0.*",