superduperdb.backends.ibis package#

Subpackages#

Submodules#

superduperdb.backends.ibis.cursor module#

class superduperdb.backends.ibis.cursor.SuperDuperIbisResult(raw_cursor: Any, id_field: str, db: Datalayer | None = None, scores: Dict[str, float] | None = None, reference: bool = False, _it: int = 0)[source]#

Bases: SuperDuperCursor

SuperDuperIbisResult represents ibis query results, with options to unroll the results into other formats, e.g. pandas.

as_pandas()[source]#

superduperdb.backends.ibis.data_backend module#

class superduperdb.backends.ibis.data_backend.IbisDataBackend(conn: BaseBackend, name: str, in_memory: bool = False)[source]#

Bases: BaseDataBackend

build_artifact_store()[source]#

Build a default artifact store based on current connection.

build_metadata()[source]#

Build a default metadata store based on current connection.

create_ibis_table(identifier: str, schema: Schema)[source]#
create_model_table_or_collection(model: Model | APIModel)[source]#
create_table_and_schema(identifier: str, mapping: dict)[source]#

Create a schema in the data-backend.

disconnect()[source]#

Disconnect the client

drop(force: bool = False)[source]#

Drop the databackend.

get_table_or_collection(identifier)[source]#
insert(table_name, raw_documents)[source]#
url()[source]#

The databackend connection URL.

superduperdb.backends.ibis.db_helper module#

class superduperdb.backends.ibis.db_helper.Base64Mixin[source]#

Bases: object

convert_data_format(data)[source]#

Convert byte data to base64 format for storage in the database.

process_schema_types(schema_mapping)[source]#

Convert bytes to string in the schema.

recover_data_format(data)[source]#

Recover byte data from base64 format stored in the database.
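
The round trip described by convert_data_format and recover_data_format can be sketched with the standard library alone. This is only an illustration under stated assumptions: the function names to_base64 and from_base64 are hypothetical and are not the mixin's actual implementation.

```python
import base64


def to_base64(data):
    """Encode bytes as an ASCII base64 string; pass other values through."""
    if isinstance(data, bytes):
        return base64.b64encode(data).decode("ascii")
    return data


def from_base64(data):
    """Decode a base64 string back to bytes; pass other values through."""
    if isinstance(data, str):
        return base64.b64decode(data)
    return data


raw = b"\x00\x01binary payload"
stored = to_base64(raw)         # a str, safe to store in a string column
restored = from_base64(stored)  # the original bytes again
```

Encoding to a string is what makes a schema change from bytes to string (as in process_schema_types) workable for backends that handle string columns more readily than binary ones.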

class superduperdb.backends.ibis.db_helper.ClickHouseHelper(dialect)[source]#

Bases: Base64Mixin, DBHelper

match_dialect = 'clickhouse'#
process_before_insert(table_name, datas)[source]#
class superduperdb.backends.ibis.db_helper.DBHelper(dialect)[source]#

Bases: object

convert_data_format(data)[source]#
match_dialect = 'base'#
process_before_insert(table_name, datas)[source]#
process_schema_types(schema_mapping)[source]#
recover_data_format(data)[source]#
superduperdb.backends.ibis.db_helper.get_db_helper(dialect) DBHelper[source]#

Get the insert processor for the given dialect.
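
One plausible way to select a helper by dialect is to compare the requested dialect against each class's match_dialect attribute and fall back to the base helper. The sketch below uses hypothetical stand-in classes (BaseHelper, ClickHouseLikeHelper, pick_helper) and is an assumption about the pattern, not the library's code.

```python
class BaseHelper:
    """Hypothetical stand-in for a base helper class."""

    match_dialect = "base"

    def __init__(self, dialect):
        self.dialect = dialect


class ClickHouseLikeHelper(BaseHelper):
    """Hypothetical stand-in for a dialect-specific helper."""

    match_dialect = "clickhouse"


def pick_helper(dialect):
    """Return a helper whose match_dialect equals `dialect`, else the base helper."""
    for cls in BaseHelper.__subclasses__():
        if cls.match_dialect == dialect:
            return cls(dialect)
    return BaseHelper(dialect)


helper = pick_helper("clickhouse")  # dialect-specific helper
fallback = pick_helper("duckdb")    # no match, so the base helper is used
```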

superduperdb.backends.ibis.field_types module#

class superduperdb.backends.ibis.field_types.FieldType(identifier: str | ibis.expr.datatypes.core.DataType)[source]#

Bases: Serializable

identifier: str | DataType#
superduperdb.backends.ibis.field_types.dtype(x)[source]#

Ibis dtype used to represent basic data types in ibis, e.g. int, str, etc.

superduperdb.backends.ibis.query module#

exception superduperdb.backends.ibis.query.IbisBackendError[source]#

Bases: Exception

This error represents ibis query-related errors. Use this exception when an error occurs while executing an ibis query.

class superduperdb.backends.ibis.query.IbisCompoundSelect(table_or_collection: TableOrCollection, pre_like: Like | None = None, post_like: Like | None = None, query_linker: QueryLinker | None = None)[source]#

Bases: CompoundSelect

A query incorporating vector search and a standard ibis query.

A query with multiple parts:

like ----> select ----> like

Parameters:
  • table_or_collection – The table or collection that this query is linked to

  • pre_like – The pre_like part of the query (e.g. table.like(...)...)

  • post_like – The post_like part of the query (e.g. table.filter(...)....like(...))

  • query_linker – The query linker that is responsible for linking the query chain. E.g. table.filter(...).select(...).

  • i – The index of the query in the query chain
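
The difference between the pre_like and post_like orderings can be illustrated with plain Python lists. This is a conceptual sketch only: top_n_similar and adults are hypothetical helpers, the dot-product scoring is an assumption, and none of this reflects how the library executes vector search.

```python
def top_n_similar(rows, query_vec, n):
    """Rank rows by dot-product similarity to query_vec and keep the best n."""
    def score(row):
        return sum(a * b for a, b in zip(row["vec"], query_vec))
    return sorted(rows, key=score, reverse=True)[:n]


def adults(rows):
    """Stand-in for the standard (non-vector) part of the query."""
    return [r for r in rows if r["age"] > 25]


rows = [
    {"id": 1, "age": 30, "vec": [1.0, 0.0]},
    {"id": 2, "age": 20, "vec": [0.9, 0.1]},
    {"id": 3, "age": 40, "vec": [0.0, 1.0]},
]
query_vec = [1.0, 0.0]

# pre_like ordering: vector search first, then the standard query.
pre = adults(top_n_similar(rows, query_vec, n=2))
# post_like ordering: standard query first, then vector search.
post = top_n_similar(adults(rows), query_vec, n=2)
```

The two orderings can return different rows for the same inputs, because the filter sees a different candidate set in each case.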

add_fold(fold: str) Select[source]#
compile(db: Datalayer, tables: Dict | None = None)[source]#

Convert the current query to an ibis native query.

Parameters:
  • db – The superduperdb connection

  • tables – A dictionary of ibis tables to use for the query

execute(db, reference: bool = False)[source]#

Execute the compound query on the DB instance.

Parameters:

db – The DB instance to use

get_all_tables()[source]#
model_update(db, ids: List[Any], key: str, model: str, version: int, outputs: Sequence[Any], flatten: bool = False)[source]#

Update model outputs for a set of ids.

Parameters:
  • db – The DB instance to use

  • ids – The ids to update

  • key – The key to update

  • model – The model to update

  • outputs – The outputs to update

property output_fields#
outputs(**kwargs)[source]#

This method returns a query which joins a query with the outputs for a table.

Parameters:
  • key – The key on which the model was evaluated

  • model – The model identifier for which to get the outputs

  • version – The version of the model for which to get the outputs (optional)

>>> q = t.filter(t.age > 25).outputs('txt', 'model_name')
property primary_id#
property renamings#
select_ids_of_missing_outputs(key: str, model: str, version: int)[source]#

Query which selects ids where outputs are missing.

property select_table#
class superduperdb.backends.ibis.query.IbisInsert(table_or_collection: 'TableOrCollection', documents: Sequence[ForwardRef('Document')] = <factory>, verbose: bool = True, kwargs: Dict = <factory>)[source]#

Bases: Insert

execute(db)[source]#

Insert the data.

Parameters:

db – The DB instance to use for insertion

property select_table#
class superduperdb.backends.ibis.query.IbisQueryComponent(name: str, type: str = QueryType.ATTR, args: ~typing.Sequence = <factory>, kwargs: ~typing.Dict = <factory>)[source]#

Bases: QueryComponent

This class represents a component of an ibis query, for example filter in t.filter(t.age > 25).

It represents a single query object in an ibis query chain and is used to build a chain that can be executed on a database. Queries are executed in the order in which they are added to the chain.

Given a query chain like this:

query = t.select(['id', 'name']).limit(10)

there are two query objects, select and limit.

select, for instance, is wrapped with this class and added to the chain.

Parameters:
  • name – The name of the query

  • type – The type of the query, either query or attr

  • args – The arguments to pass to the query

  • kwargs – The keyword arguments to pass to the query
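
The idea of recording calls as components and replaying them in order can be sketched without ibis. Step, FakeTable, and run_chain below are hypothetical names introduced for illustration; the real class compiles against ibis tables and a Datalayer.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Sequence


@dataclass
class Step:
    """Hypothetical stand-in for one recorded call in a query chain."""

    name: str
    args: Sequence = ()
    kwargs: Dict = field(default_factory=dict)


class FakeTable:
    """Tiny in-memory table used only to make the sketch runnable."""

    def __init__(self, rows: List[dict]):
        self.rows = rows

    def select(self, columns):
        return FakeTable([{c: r[c] for c in columns} for r in self.rows])

    def limit(self, n):
        return FakeTable(self.rows[:n])


def run_chain(target: Any, chain: List[Step]) -> Any:
    """Replay the recorded steps on `target` in the order they were added."""
    for step in chain:
        target = getattr(target, step.name)(*step.args, **step.kwargs)
    return target


table = FakeTable([{"id": i, "name": f"n{i}", "age": 20 + i} for i in range(5)])
chain = [Step("select", args=(["id", "name"],)), Step("limit", args=(2,))]
result = run_chain(table, chain)
```

Keeping the chain as data, rather than calling the backend immediately, lets the same query be inspected or rewritten before it is executed.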

compile(parent: Any, db: Datalayer, tables: Dict | None = None)[source]#
get_all_tables()[source]#
property primary_id#
property renamings#
repr_() str[source]#
>>> IbisQueryComponent('__eq__(2)', type=QueryType.QUERY, args=[1, 2]).repr_()
class superduperdb.backends.ibis.query.IbisQueryLinker(table_or_collection: 'TableOrCollection', members: List = <factory>, primary_id: Union[str, List[str], NoneType] = None)[source]#

Bases: QueryLinker, _LogicalExprMixin

compile(db: Datalayer, tables: Dict | None = None)[source]#
execute(db)[source]#
get_all_tables()[source]#
property output_fields#
primary_id: str | List[str] | None = None#
property renamings#
repr_() str[source]#
property select_ids#
select_single_id(id)[source]#
select_using_ids(ids)[source]#
class superduperdb.backends.ibis.query.IbisQueryTable(identifier: str | Variable, primary_id: str = 'id')[source]#

Bases: _ReprMixin, TableOrCollection, Select

This is a symbolic representation of a table for building IbisCompoundSelect queries.

Parameters:

primary_id – The primary id of the table

add_fold(fold: str) Select[source]#
compile(db: Datalayer, tables: Dict | None = None)[source]#
execute(db)[source]#

Execute the query on the DB instance.

property id_field#
insert(*args, **kwargs)[source]#
model_update(db, ids: List[Any], key: str, model: str, version: int, outputs: Sequence[Any], flatten: bool = False, **kwargs)[source]#

Update model outputs for a set of ids.

Parameters:
  • db – The DB instance to use

  • ids – The ids to update

  • key – The key to update

  • model – The model to update

  • outputs – The outputs to update

outputs(**kwargs)[source]#

This method returns a query which joins a query with the model outputs.

Parameters:

model – The model identifier for which to get the outputs

>>> q = t.filter(t.age > 25).outputs('model_name', db)

The above query returns the outputs of the model_name model for the ids selected by t.filter().

primary_id: str = 'id'#
repr_()[source]#
property select_ids: Select#
select_ids_of_missing_outputs(key: str, model: str, version: int) Select[source]#
select_single_id(id)[source]#
property select_table: Select#
select_using_ids(ids: Sequence[Any]) Select[source]#
class superduperdb.backends.ibis.query.QueryType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: str, Enum

This class holds the type of a query:

  • query – This means Query and can be called

  • attr – This means Attribute and cannot be called
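
Because the class derives from both str and Enum, its members behave like plain strings. The sketch below uses a hypothetical mirror class named Kind with the same two values, so it runs without the library installed.

```python
from enum import Enum


class Kind(str, Enum):
    """Hypothetical mirror of a str-based enum with the same two values."""

    QUERY = "query"
    ATTR = "attr"


# Members compare equal to their plain string values ...
is_attr = Kind.ATTR == "attr"
# ... and can be looked up from a string.
from_value = Kind("query")
```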

ATTR = 'attr'#
QUERY = 'query'#
_generate_next_value_(start, count, last_values)#

Generate the next value when not given.

name: the name of the member

start: the initial start value or None

count: the number of existing members

last_values: the list of values assigned

class superduperdb.backends.ibis.query.RawSQL(query: str, id_field: str = 'id')[source]#

Bases: RawQuery

execute(db)[source]#

A raw query method which executes the query and returns the result

id_field: str = 'id'#
query: str#
class superduperdb.backends.ibis.query.Table(identifier: str, artifacts: dataclasses.InitVar[Optional[Dict]] = None, *, schema: Schema, primary_id: str = 'id')[source]#

Bases: Component

This is a representation of an SQL table in ibis, saving the important meta-data associated with the table in the superduperdb meta-data store.

Parameters:
  • identifier – A unique identifier for the component

  • schema – The schema of the table

  • primary_id – The primary id of the table

compile(db: Datalayer, tables: Dict | None = None)[source]#
insert(documents, **kwargs)[source]#
like(r: Document, vector_index: str, n: int = 10)[source]#
outputs(**kwargs)[source]#
pre_create(db: Datalayer)[source]#

Called the first time this component is created

Parameters:

db – the db that creates the component

primary_id: str = 'id'#
schema: Schema#
property table_or_collection#
to_query()[source]#
type_id: ClassVar[str] = 'table'#
class superduperdb.backends.ibis.query._LogicalExprMixin[source]#

Bases: object

Mixin class which holds '__eq__', '__or__', '__gt__', etc. comparison and logical operators. These methods are overloaded so that ibis logical expressions can be wrapped dynamically by superduperdb.

and_(other, members, collection)[source]#
eq(other, members, collection)[source]#
getitem(other, members, collection)[source]#
gt(other, members, collection)[source]#
lt(other, members, collection)[source]#
not_(members, collection)[source]#
or_(other, members, collection)[source]#
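
Overloading operators so that an expression is recorded rather than evaluated can be sketched as follows. Recorded is a hypothetical class written for this illustration; the `members` tuple only loosely mirrors the `members` argument in the signatures above and is an assumption about the pattern.

```python
class Recorded:
    """Hypothetical node that records an operator call instead of evaluating it."""

    def __init__(self, label, members=()):
        self.label = label
        self.members = tuple(members)

    def _extend(self, op, other):
        # Return a new node so the original expression stays unchanged.
        return Recorded(self.label, self.members + ((op, other),))

    def __gt__(self, other):
        return self._extend("__gt__", other)

    def __lt__(self, other):
        return self._extend("__lt__", other)

    def __eq__(self, other):
        return self._extend("__eq__", other)

    # Defining __eq__ disables the default hash, so restore identity hashing.
    __hash__ = object.__hash__


age = Recorded("age")
expr = age > 25  # no comparison happens; the call is stored for later
```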

superduperdb.backends.ibis.utils module#

superduperdb.backends.ibis.utils.get_output_table_name(model_identifier, version)[source]#

Get the output table name for the given model.

Module contents#