crandas.ctypes#

class crandas.ctypes.Bool(**kwargs)#

Bases: Integer

class crandas.ctypes.BytesCtype(max_length=None)#

Bases: CtypeBase

exception crandas.ctypes.ColumnBoundDerivedWarning#

Bases: Warning

class crandas.ctypes.Ctype#

Bases: object

Ctypes, or “crandas types”, are an extensible client-side type system that allows the user to provide additional type information beyond pandas/numpy dtypes.

Ctypes are represented as class instances, e.g. NullableInteger(). Some classes take arguments in their initialization, like Varchar(max_length=12). Each Ctype also has a string representation, like "varchar[12]". Either of these can be passed to the ctype kwarg of cd.DataFrame, for example:

>>> cd.DataFrame({"ints": [1, 2, 3], "strings": ["a", "bb", "ccc"]},
...              ctype={"ints": NullableInteger(), "strings": "varchar[5]"})

If a manual ctype is not specified, the appropriate ctype is automatically deduced using the pandas dtype. For details of how this is implemented, see the Ctype.for_series() classmethod.

Internal workings#

Each class (e.g. Integer) has CtypeBase as a base class and is decorated with @Ctype.register, which registers the class’s .dtype and .ctype properties so that the Ctype class can perform automatic ctype inference on pandas.Series objects.
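The registration pattern described here can be illustrated with a minimal, self-contained sketch. It is not the crandas implementation: the names MiniCtype, MiniCtypeBase, MiniInteger and MiniVarchar, the two lookup tables, and the ctype/dtype strings are all assumptions made for illustration.

```python
# Hypothetical sketch of a registering class decorator; not crandas's actual code.
class MiniCtypeBase:
    ctype = None  # name of the API type (illustrative)
    dtype = None  # name of the matching pandas dtype (illustrative)


class MiniCtype:
    # Lookup tables that the decorator below fills in.
    by_ctype = {}
    by_dtype = {}

    @classmethod
    def register(cls, ctype_cls):
        """Record ctype_cls under its .ctype and .dtype, then return it unchanged."""
        if ctype_cls.ctype is not None:
            cls.by_ctype[ctype_cls.ctype] = ctype_cls
        if ctype_cls.dtype is not None:
            cls.by_dtype[ctype_cls.dtype] = ctype_cls
        return ctype_cls


@MiniCtype.register
class MiniInteger(MiniCtypeBase):
    ctype = "int"
    dtype = "int64"


@MiniCtype.register
class MiniVarchar(MiniCtypeBase):
    ctype = "varchar"
    dtype = "object"
```

Because the decorator returns the class unchanged, registration has no effect on how the class is used; it only makes the class discoverable by dtype or ctype name.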

classmethod for_series(series, ctype_spec=None)#

Determine the Ctype for a pandas.Series object, based on the specified ctype_spec, the series.dtype, the ctype_cls.from_series() function, or the value_type (i.e. the type of next(iter(series))), in that order of precedence.
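The order of precedence can be pictured as a simple fallback chain. The sketch below is a stand-in that works on plain Python values rather than a pandas.Series; the function resolve_ctype, both tables, and the returned names are hypothetical, and the from_series() step is only marked by a comment.

```python
# Hypothetical stand-in for the lookup order; names and tables are illustrative only.
DTYPE_TABLE = {"int64": "int", "bool": "bool"}
# bool comes before int because isinstance(True, int) is also true.
VALUE_TYPE_TABLE = [(bool, "bool"), (int, "int"), (str, "varchar"), (bytes, "bytes")]


def resolve_ctype(values, dtype_name, ctype_spec=None):
    """Return a ctype name using the first rule that applies."""
    # 1. An explicit specification always wins.
    if ctype_spec is not None:
        return ctype_spec
    # 2. Otherwise try the dtype of the series.
    if dtype_name in DTYPE_TABLE:
        return DTYPE_TABLE[dtype_name]
    # 3. A per-class from_series() hook would be consulted here (omitted).
    # 4. Finally, fall back to the Python type of the first value.
    first = next(iter(values))
    for value_type, name in VALUE_TYPE_TABLE:
        if isinstance(first, value_type):
            return name
    raise TypeError(f"no ctype found for {type(first).__name__}")
```

For instance, resolve_ctype(["a", "bb"], "object") falls through to the last rule because "object" is not in the dtype table.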

classmethod from_spec(ctype_spec)#

Determine the Ctype based on a specification, which may be a ctype object (i.e. an instance of a subclass of CtypeBase), a string, or a Python type.
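For the string case, a specification such as "varchar[12]" has to be split into a type name and its arguments. The following is a minimal sketch of such a parser. The grammar is an assumption based only on the "varchar[12]" example, and the helper parse_spec_string is hypothetical. It handles integer positional arguments and does not attempt keyword arguments.

```python
import re

# Assumed grammar: a name, optionally followed by comma-separated integers in brackets.
SPEC_PATTERN = re.compile(
    r"^(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?:\[(?P<args>[^\]]*)\])?$"
)


def parse_spec_string(spec):
    """Split a spec like 'varchar[12]' into ('varchar', [12])."""
    match = SPEC_PATTERN.match(spec.strip())
    if match is None:
        raise ValueError(f"cannot parse ctype specification: {spec!r}")
    raw_args = match.group("args")
    # Positional arguments are interpreted as integers.
    args = [int(part) for part in raw_args.split(",")] if raw_args else []
    return match.group("name"), args
```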

class crandas.ctypes.CtypeBase#

Bases: object

ctype: str

name of the ctype; corresponds to the API type communicated to the server

dtype: str

the ctype corresponds to this pandas dtype

args: List[str]

names of positional arguments (that are interpreted to be of type int)

kwargs: List[str]

names of keyword arguments

value_types: List[object]

this Ctype applies to values of these types (i.e. isinstance(value, obj) where obj in value_types)

class crandas.ctypes.FixedPoint#

Bases: CtypeBase

class crandas.ctypes.FractionalInteger#

Bases: CtypeBase

class crandas.ctypes.Integer(nullable=None, min=None, max=None)#

Bases: CtypeBase

class crandas.ctypes.IntegerList(length=None, nullable=None, min=None, max=None)#

Bases: CtypeBase

class crandas.ctypes.NotNullInteger(**kwargs)#

Bases: Integer

ctype = None#

class crandas.ctypes.NullableInteger(**kwargs)#

Bases: Integer

ctype = None#

class crandas.ctypes.Varchar(max_length=None)#

Bases: CtypeBase

crandas.ctypes.column_crandas_to_pandas(col, elements, modulus, not_null)#

Converts a crandas JSON column to a set of values that can be used in the pandas DataFrame constructor. Takes the column, the unmasked element values and the modulus used for this column.

Parameters:
  • col ((JSON-serializable) object) – crandas JSON column

  • elements (numpy array) – unmasked element values in [0,modulus)

  • modulus (int) – modulus for the values

  • not_null (numpy array of bits, or None) – indicator bits, if nullable

Returns:

values found in col

Return type:

Set of values (int/str/bit)

Raises:

RuntimeError – Only works for columns of type int, str or bits
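As an illustration of what the elements, modulus and not_null parameters mean for an integer column, the sketch below decodes residues in [0, modulus) into signed integers and applies the null indicator bits. It assumes a centered representation, in which residues above modulus/2 stand for negative numbers. That convention and the helper decode_column are assumptions, not taken from the crandas source, and plain lists are used instead of numpy arrays.

```python
def decode_column(elements, modulus, not_null=None):
    """Map residues in [0, modulus) to signed ints; None where not_null is 0."""
    half = modulus // 2
    decoded = []
    for index, element in enumerate(elements):
        if not_null is not None and not not_null[index]:
            # The indicator bit says this entry is missing.
            decoded.append(None)
            continue
        # Residues above modulus/2 stand for negative numbers (assumed convention).
        decoded.append(element - modulus if element > half else element)
    return decoded
```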

crandas.ctypes.column_pandas_to_crandas(series, ctype_spec=None, auto_bounds=False)#

Convert a pandas column to a JSON representation for use in the “new” command. This function does not perform masking and instead sets col["elements"] to an iterable of integers.

Parameters:
  • series (pd.Series) – pandas column

  • auto_bounds (bool, default: False) – if True, do not warn about automatically derived column bounds

Returns:

if the column is nullable, a 2-tuple is returned where the second item is a column that should be uploaded as the not_null store

Return type:

1-tuple or 2-tuple of (JSON-serializable) dicts

Raises:

TypeError – Column type is not supported by crandas
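The 1-tuple versus 2-tuple return shape can be sketched as follows for an integer column. Only the "elements" key comes from the description above; the helper split_nullable, the placeholder value 0 for missing entries, and the layout of the second dict are assumptions for illustration.

```python
def split_nullable(values):
    """Return (column,) or (column, not_null_column) depending on whether None occurs."""
    not_null = [0 if value is None else 1 for value in values]
    # Missing entries get a placeholder 0 so that every element is an integer.
    elements = [0 if value is None else int(value) for value in values]
    column = {"elements": elements}
    if all(not_null):
        # No missing values: a single column suffices.
        return (column,)
    # Nullable: the second item carries the indicator bits.
    return (column, {"elements": not_null})
```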

crandas.ctypes.derive_int_bounds(series, spec_min_value, spec_max_value)#

Derive int bounds from series and max/min specification

If a maximum and/or minimum is specified, this range is used, and it is verified that the values in the series comply with the range.

Otherwise, an integer type is derived according to the following order of preference: uint8, int8, uint16, int16, uint24, int24, uint32, int32. These respective datatypes have ranges [0,255], [-127,127], etc. Note that, for signed types, -2**(bit_length-1) is not included, e.g., int8 does not contain -128. This is done so that the product of two int8 values fits in an int16, etc.; in particular, two int16 values can be multiplied with each other while still fitting in an int32, and thus without performing a field conversion.

In case an integer type is derived from data, a ColumnBoundDerivedWarning is given.
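The derivation order can be sketched in a few lines of plain Python. The helpers candidate_ranges and smallest_int_type are hypothetical, and the sketch covers only the data-driven path. Explicitly specified bounds and the warning are left out.

```python
def candidate_ranges():
    """Yield (name, low, high) in the documented order of preference."""
    for bits in (8, 16, 24, 32):
        yield f"uint{bits}", 0, 2**bits - 1
        # Signed ranges are symmetric: -2**(bits-1) is deliberately excluded.
        yield f"int{bits}", -(2 ** (bits - 1) - 1), 2 ** (bits - 1) - 1


def smallest_int_type(values):
    """Return the first (name, low, high) whose range contains all values."""
    low, high = min(values), max(values)
    for name, type_low, type_high in candidate_ranges():
        if type_low <= low and high <= type_high:
            return name, type_low, type_high
    raise ValueError("values do not fit in any supported integer type")
```

With symmetric signed ranges, the largest possible product of two int8 values is 127 * 127 = 16129, which lies within the int16 range [-32767, 32767]. Likewise 32767 * 32767 = 1073676289 lies within the int32 range.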

crandas.ctypes.encode_integers64(elements)#

Convert a set of values from the range (-session.modulus/2,session.modulus/2) to a np.array of type int64.