Hubit configuration, data and queries
Objects defined here will automatically be created by Hubit.
Therefore, the class definitions below simply document e.g. the attributes required
in a model config file or the required structure of a query path.
HubitQueryPath (_HubitPath)
References a field in the results data. The syntax follows general
Python syntax for nested objects. Only square brackets are allowed.
The content of the brackets is called an index specifier and must
comply with QueryIndexSpecifier.
To query, for example, the attribute weight in the 4th element of the list
wheels, which is stored on the object car, use the path car.wheels[3].weight.
The query path car.wheels[:].weight represents a list whose elements are the
weights of all wheels of the car.
If there are multiple cars stored in a list of cars, the query path
cars[:].wheels[3].weight represents a list where the elements would be the
weights for the 4th wheel for all cars. The query path cars[:].wheels[:].weight
represents a nested list where each outer list item represents a car and the
corresponding inner list elements represent the weights for all wheels for that car.
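The four query paths above can be mimicked with ordinary Python indexing and list comprehensions. The sketch below only illustrates what each path represents. The data dictionary and its values are made up, and no Hubit API is involved.

```python
# Made-up results data: two cars with four wheels each.
data = {
    "cars": [
        {"wheels": [{"weight": 10.0}, {"weight": 10.5}, {"weight": 11.0}, {"weight": 11.5}]},
        {"wheels": [{"weight": 20.0}, {"weight": 20.5}, {"weight": 21.0}, {"weight": 21.5}]},
    ]
}
car = data["cars"][0]

# car.wheels[3].weight -> a single value
fourth_wheel_weight = car["wheels"][3]["weight"]

# car.wheels[:].weight -> a list with one weight per wheel
wheel_weights = [wheel["weight"] for wheel in car["wheels"]]

# cars[:].wheels[3].weight -> the 4th wheel's weight for every car
fourth_wheel_weights = [c["wheels"][3]["weight"] for c in data["cars"]]

# cars[:].wheels[:].weight -> nested list: outer items are cars, inner items are wheels
all_wheel_weights = [[w["weight"] for w in c["wheels"]] for c in data["cars"]]
```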
HubitModelConfig
dataclass
Defines the hubit model configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
components | List[HubitModelComponent] | | required |
HubitModelComponent
dataclass
Represents one isolated task carried out by the function func_name
located at path. The function requires input from the paths defined in
consumes_input and consumes_results. The component delivers results to
the paths in provides_results.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | Path to the module responsible for the task. if | required |
func_name | str | The function name (entrypoint) that wraps the task. | 'main' |
provides_results | List[HubitBinding] | | required |
consumes_input | List[HubitBinding] | | <factory> |
consumes_results | List[HubitBinding] | | <factory> |
index_scope | dict | A map from the index identifiers to an index. Used to limit the scope of the component. If, for example, the scope is | <factory> |
is_dotted_path | bool | Set to True if the specified | False |
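To make the required fields and defaults in the table concrete, the sketch below mirrors them with two stand-in dataclasses. BindingSketch and ComponentSketch are hypothetical names introduced here, not Hubit classes, and the module path is made up. The field order differs from the table only because Python dataclasses require fields without defaults to come first.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BindingSketch:
    """Hypothetical stand-in for HubitBinding (not the Hubit class)."""
    path: str
    name: str


@dataclass
class ComponentSketch:
    """Hypothetical stand-in mirroring the parameter table above."""
    path: str                              # required
    provides_results: List[BindingSketch]  # required
    func_name: str = "main"                # default 'main'
    consumes_input: List[BindingSketch] = field(default_factory=list)
    consumes_results: List[BindingSketch] = field(default_factory=list)
    index_scope: dict = field(default_factory=dict)
    is_dotted_path: bool = False


component = ComponentSketch(
    path="components/price.py",  # made-up module path
    provides_results=[
        BindingSketch(path="cars[IDX_CAR].parts[:@IDX_PART].price", name="parts_price")
    ],
    consumes_input=[
        BindingSketch(path="cars[IDX_CAR].parts[:@IDX_PART].name", name="part_name")
    ],
)
```

Only path and provides_results have to be supplied. Every other field falls back to the default listed in the table.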
HubitBinding
dataclass
Binds the internal component attribute name to the field at path
in the shared data model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | HubitModelPath | | required |
name | str | Attribute name as it will be exposed in the component. | required |
HubitModelPath (_HubitPath)
References a field in the input or results data. Compared to a
HubitQueryPath, a HubitModelPath instance has different rules for index
specifiers (see ModelIndexSpecifier).
To illustrate the use of the index identifiers for index mapping
in a model path, consider a Hubit model component that consumes
the path cars[IDX_CAR].parts[:@IDX_PART].name (discussed under
ModelIndexSpecifier).
The component could use the part names for a database
lookup to get the price of each part. If we want Hubit to store
these prices in the results, one option would be to store them in a
data structure similar to the input. To achieve this behavior the
component should provide a path that looks something like
cars[IDX_CAR].parts[:@IDX_PART].price. Alternatively, the provided
path could be cars[IDX_CAR].parts_price[:@IDX_PART]. In both cases,
the index identifiers defined in the input path
(cars[IDX_CAR].parts[:@IDX_PART].name) allow Hubit to store the parts
prices for a car at the same car index and part index as where the
input was taken from. Note that the component itself is unaware of
which car (car index) the input represents.
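The index mapping described above can be sketched by hand in plain Python. This is not how Hubit is implemented. The input data, the PRICES table and the lookup_prices function are all made up for illustration. The point is that the component logic only sees the part names of one car, while the surrounding loop keeps track of the car and part indices.

```python
# Made-up input data and price list.
input_data = {
    "cars": [
        {"parts": [{"name": "wheel"}, {"name": "door"}]},
        {"parts": [{"name": "engine"}]},
    ]
}
PRICES = {"wheel": 120.0, "door": 300.0, "engine": 2500.0}


def lookup_prices(part_names):
    """Hypothetical component logic: receives the part names of one car only."""
    return [PRICES[name] for name in part_names]


# Results skeleton with the same list lengths as the input.
results = {"cars": [{"parts": [{} for _ in car["parts"]]} for car in input_data["cars"]]}

# Hand-written stand-in for the framework: iterate over IDX_CAR, collect
# parts[:@IDX_PART].name and store each price at the same (car, part) index,
# i.e. at cars[IDX_CAR].parts[:@IDX_PART].price.
for idx_car, car in enumerate(input_data["cars"]):
    part_names = [part["name"] for part in car["parts"]]
    for idx_part, price in enumerate(lookup_prices(part_names)):
        results["cars"][idx_car]["parts"][idx_part]["price"] = price
```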
Hubit infers indices and list lengths based on the input data
and the index specifiers defined for binding paths in the consumes_input
section.
Therefore, index identifiers used in binding paths in the consumes_results
and provides_results sections should always exist in binding paths in
consumes_input.
Further, to provide a meaningful index mapping, the index specifier
used in a binding path in the provides_results section should be
identical to the corresponding index specifier in the consumes_input
section.
The first binding in the example below has a more specific index specifier
(for the identifier IDX_PART) and is therefore invalid. The second
binding is valid.
provides_results:
  # INVALID
  - name: part_name
    path: cars[IDX_CAR].parts[IDX_PART].name # more specific for the part index
  # VALID: Assign a 'price' attribute to each part object in the car object.
  - name: parts_price
    path: cars[IDX_CAR].parts[:@IDX_PART].price # index specifier for parts is equal to consumes.input.path
consumes_input:
  - name: part_name
    path: cars[IDX_CAR].parts[:@IDX_PART].name
In the invalid binding above, the component consumes all indices
of the parts list and therefore storing the price data at a specific part
index is not possible. The bindings below are valid since IDX_PART is
omitted for the bindings in the provides_results section.
provides_results:
  # Assign a 'part_names' attribute to the car object.
  # Could be a list of all part names for that car
  - name: part_names
    path: cars[IDX_CAR].part_names # index specifier for parts omitted
  # Assign a 'concatenates_part_names' attribute to the car object.
  # Could be a string with all part names concatenated
  - name: concatenates_part_names
    path: cars[IDX_CAR].concatenates_part_names # index specifier for parts omitted
consumes_input:
  - name: part_name
    path: cars[IDX_CAR].parts[:@IDX_PART].name
Index contexts
In addition to defining the index identifiers, the input sections
also define index contexts. The index context is the order and hierarchy
of the index identifiers. For example, an input binding
cars[IDX_CAR].parts[IDX_PART].price would define both the index
identifiers IDX_CAR and IDX_PART as well as the index context
IDX_CAR -> IDX_PART. This index context shows that a part index exists
only in the context of a car index. Index identifiers should be used in a unique
context, i.e. if one input binding defines cars[IDX_CAR].parts[IDX_PART].price
then defining or using parts[IDX_PART].cars[IDX_CAR].price is not allowed.
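The uniqueness rule can be checked mechanically. The sketch below extracts the identifiers of a path in order of appearance and compares the order of the identifiers two paths share. The function names index_context and same_context are introduced here for illustration and are not part of Hubit. The sketch assumes every pair of square brackets contains an identifier.

```python
import re


def index_context(path):
    """Return the index identifiers of a model path in order of appearance."""
    identifiers = []
    for specifier in re.findall(r"\[([^\]]*)\]", path):
        # Drop an optional range ("range@") and an optional signed offset.
        identifier = specifier.split("@")[-1]
        identifier = re.sub(r"[+-]\d+$", "", identifier)
        identifiers.append(identifier)
    return identifiers


def same_context(path_a, path_b):
    """True if the identifiers shared by both paths appear in the same order."""
    ctx_a, ctx_b = index_context(path_a), index_context(path_b)
    shared = set(ctx_a) & set(ctx_b)
    return [i for i in ctx_a if i in shared] == [i for i in ctx_b if i in shared]


context = index_context("cars[IDX_CAR].parts[IDX_PART].price")
conflict = not same_context(
    "cars[IDX_CAR].parts[IDX_PART].price",
    "parts[IDX_PART].cars[IDX_CAR].price",
)
```

Here context holds the identifiers in the order IDX_CAR, IDX_PART, and conflict flags the reversed path from the example above.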
Query
dataclass
A Hubit query.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
paths | List[HubitQueryPath] | | required |
FlatData (dict, Generic)
A key-value pair data representation. Keys represent a path in the internal dotted-style. In a dotted-style Hubit path index braces [IDX] are represented by dots .IDX.
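As a small illustration of the dotted style, the conversion from a bracket-style path can be written as a single regular expression substitution. The helper name to_dotted is made up here and is not claimed to be Hubit's own conversion routine.

```python
import re


def to_dotted(path):
    """Rewrite every [IDX] in a bracket-style path as .IDX"""
    return re.sub(r"\[([^\]]+)\]", r".\1", path)


dotted = to_dotted("cars[57].price")
```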
as_dict(self)
Converts the object to a regular dictionary with string keys
inflate(self)
Inflates flat data to a nested dict. Lists are represented as dicts
to handle queries that do not include all list elements. For
example, if the query ["cars[57].price"] gives the flat data object
{"cars.57.price": 4280.0}, the inflated version is
{'cars': {57: {'price': 4280.0}}}. The access syntax
for the dictionary representation of lists is identical
to the access syntax had it been a list. Using dictionaries
we can, however, represent element 57 without adding empty
elements for the remaining list elements.
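The behavior described above can be sketched in a few lines of plain Python. The function inflate_sketch is a hypothetical re-implementation for illustration, not the FlatData.inflate method itself. It assumes that purely numeric key segments are list indices.

```python
def inflate_sketch(flat):
    """Turn {"cars.57.price": 4280.0} into {"cars": {57: {"price": 4280.0}}}."""
    nested = {}
    for key, value in flat.items():
        # Numeric segments are list indices and become integer dict keys.
        parts = [int(p) if p.isdigit() else p for p in key.split(".")]
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


inflated = inflate_sketch({"cars.57.price": 4280.0})
price = inflated["cars"][57]["price"]  # same access syntax as for a list
```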
QueryIndexSpecifier (_IndexSpecifier)
Index specifiers for HubitQueryPath.
Currently, index specifiers should be either a positive integer or
the character :. General slicing is not supported.
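A minimal validity check following the wording of this section could look as follows. The function name is made up, and the sketch treats anything other than an unsigned integer or the character : as invalid, which may be stricter or looser than Hubit's own validation.

```python
import re


def is_valid_query_specifier(specifier):
    """Accept ':' or an unsigned integer; reject general slices."""
    return specifier == ":" or re.fullmatch(r"[0-9]+", specifier) is not None
```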
ModelIndexSpecifier (_IndexSpecifier)
Index specifiers for HubitModelPath.
A model path index specifier is composed of three parts namely the
range, the identifier and the offset, in that order. The
structure of a model index specifier is "range @ identifier offset" with
spaces added to increase clarity.
- The identifier is used internally to map an index in input lists to
the equivalent index in the results. Can be any string of characters in
a-z, A-Z, digits as well as _ (underscore). An example could be
MYIDX, which would refer to one index in a list.
- The range must conform with PathIndexRange and may be used to control
the scope of an identifier. An example could be 0 or :.
- The offset is a signed integer that may be used to offset the affected
index. An example could be -1.
Using the examples above the model index specifier would be :@MYIDX-1.
The range and offset are optional. A non-empty range requires an empty
(i.e. zero) offset and vice versa.
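The three-part structure lends itself to a regular expression. The sketch below splits a specifier into range, identifier and offset. The names SPECIFIER, ParsedSpecifier and parse_model_specifier are made up for illustration. The sketch assumes the identifier is always present and only parses, so it does not enforce the rule that a non-empty range requires a zero offset.

```python
import re
from typing import NamedTuple

SPECIFIER = re.compile(
    r"^(?:(?P<range>[^@]*)@)?"        # optional range followed by '@'
    r"(?P<identifier>[A-Za-z0-9_]+)"  # identifier: letters, digits, underscore
    r"(?P<offset>[+-][0-9]+)?$"       # optional signed integer offset
)


class ParsedSpecifier(NamedTuple):
    index_range: str
    identifier: str
    offset: int


def parse_model_specifier(specifier):
    """Split 'range@identifier offset' into its three parts."""
    match = SPECIFIER.match(specifier)
    if match is None:
        raise ValueError(f"Cannot parse index specifier: {specifier!r}")
    return ParsedSpecifier(
        index_range=match.group("range") or "",
        identifier=match.group("identifier"),
        offset=int(match.group("offset") or 0),
    )


parsed = parse_model_specifier(":@MYIDX-1")
```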
To put index specifiers into some context,
consider a Hubit component that provides cars[IDX_CAR].parts[:@IDX_PART].name.
This path tells Hubit that the names of all parts of a specific car can
be provided.
Let us break down the path and take a closer look at the index specifiers in
square brackets.
The index specifier :@IDX_PART refers to all elements of the parts list
(using the range :) and defines the identifier (IDX_PART) to represent
elements of the parts list.
So in this case, the index specifier contains both a range and an
identifier, but no offset.
The left-most index specifier IDX_CAR only contains an
identifier that represents elements of the cars list. Since no range
is specified, the identifier refers to a specific car determined by
the query (e.g. cars[6].parts[:].name).
cars[IDX_CAR].parts[:@IDX_PART].name therefore references the names of
all parts of a specific car, which depends on the query specified by the user.
A component that consumes this path would have access to these names in a list.
The index specifier 0@IDX_PART would always reference element 0 of the
parts list irrespective of the query.
PathIndexRange (str)
In HubitModelPath and HubitQueryPath the supported ranges comprise
- Positive integers e.g. 0, 17.
- Negative integers e.g. -1, -2.
- The all-index character :.
Further, in the index_scope attribute of HubitModelComponent the following
ranges are also allowed
- d: where d is a positive integer e.g. 3:.
- :d where d is a positive integer e.g. :3.
- d1:d2 where d1 and d2 are positive integers and d1 < d2 e.g. 2:5.
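To see which list indices each kind of range might select, the sketch below expands a range string for a list of a given length. The function indices_in_range is made up for illustration. It assumes Python slice semantics, so the stop index in :d and d1:d2 is treated as exclusive, and negative integers are assumed to count from the end of the list. The text above does not confirm either assumption.

```python
def indices_in_range(index_range, length):
    """List the indices of a list of the given length that a range selects."""
    all_indices = list(range(length))
    if index_range == ":":
        return all_indices
    if ":" in index_range:
        # Forms 'd:', ':d' and 'd1:d2', interpreted like a Python slice (assumption).
        start, _, stop = index_range.partition(":")
        return all_indices[slice(int(start) if start else None, int(stop) if stop else None)]
    # A single integer; negative values count from the end (assumption).
    return [all_indices[int(index_range)]]


selected = indices_in_range("2:5", 6)
```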