flowmachine.features.subscriber.label_event_score¶
Class LabelEventScore¶
LabelEventScore(*, scores: Union[flowmachine.features.subscriber.scores.EventScore, flowmachine.features.subscriber.hartigan_cluster._JoinedHartiganCluster], labels: Dict[str, Dict[str, Any]] = {'evening': {'type': 'MultiPolygon', 'coordinates': [[[[1e-06, 0.5], [1e-06, 1], [1, 1], [1, 0.5]]], [[[1e-06, -1], [1e-06, -0.5], [1, -0.5], [1, -1]]]]}, 'day': {'type': 'Polygon', 'coordinates': [[[-1, -0.5], [-1, 0.5], [0, 0.5], [0, -0.5]]]}}, required: Optional[str] = None)
Represents a label event score class.
This class labels a table of scores according to a labelling dictionary, and allows one to specify a label which every subscriber must have. It is used to label locations based on scoring signatures in the absence of other automated labelling mechanisms. Returns the original query object, with an added label column.
Attributes¶
Parameters¶
-
scores
:typing.Union
A flowmachine.Query object. This represents a table that contains scores which are used to label a given location. This table must have a subscriber column (called subscriber).
-
labels
:typing.Dict
, default {'evening': {'type': 'MultiPolygon', 'coordinates': [[[[1e-06, 0.5], [1e-06, 1], [1, 1], [1, 0.5]]], [[[1e-06, -1], [1e-06, -0.5], [1, -0.5], [1, -1]]]]}, 'day': {'type': 'Polygon', 'coordinates': [[[-1, -0.5], [-1, 0.5], [0, 0.5], [0, -0.5]]]}}
A dictionary whose keys are the label names and whose values are GeoJSON shapes defined in score space, with the hour of day score on the x axis and the day of week score on the y axis. All scores are real numbers in the range [-1.0, +1.0].
-
required
:typing.Optional
, default None
Optionally specifies a label which every subscriber must possess independently of the score. This is used in cases where, for instance, we require that all subscribers must have an evening/home location.
Examples¶
es = EventScore(start="2016-01-01", stop="2016-01-05", spatial_unit=make_spatial_unit("versioned-site"))
es.head()
subscriber site_id version lon lat score_hour score_dow
0 ZYPxqVGLzlQy6l7n QeBRM8 0 82.914285 29.358975 1.0 -1.0
1 4oLKbnxm3vXqjMVx zdNQx2 0 87.265225 27.585096 -1.0 1.0
2 vKVLDx8koQWZ2ez0 LVnDQL 0 86.551302 27.245265 0.0 -1.0
3 DELmRj9Vvl346G50 m9jL23 0 82.601710 29.815919 1.0 -1.0
4 lqOknAJRDNAewM10 RZgwVz 0 84.623447 28.283523 -1.0 1.0
ls = LabelEventScore(
    scores=es,
    labels={
        "daytime": {
            "type": "Polygon",
            "coordinates": [[[-1.1, -1.1], [-1, 1.1], [1.1, 1.1], [1.1, -1.1]]],
        }
    },
)
ls.head()
label subscriber site_id version lon lat score_hour score_dow
0 daytime ZYPxqVGLzlQy6l7n QeBRM8 0 82.914285 29.358975 1.0 -1.0
1 daytime 4oLKbnxm3vXqjMVx zdNQx2 0 87.265225 27.585096 -1.0 1.0
2 daytime vKVLDx8koQWZ2ez0 LVnDQL 0 86.551302 27.245265 0.0 -1.0
3 daytime DELmRj9Vvl346G50 m9jL23 0 82.601710 29.815919 1.0 -1.0
4 daytime lqOknAJRDNAewM10 RZgwVz 0 84.623447 28.283523 -1.0 1.0
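To make the geometry of the labels dictionary more concrete, the following is a minimal standalone sketch of how a (score_hour, score_dow) pair could be matched against the default shapes. It is not flowmachine's implementation: the helper names (point_in_ring, point_in_geometry, label_for) are hypothetical, a plain even-odd ray casting test stands in for whatever geometry engine the library uses, and points lying exactly on a boundary may be treated differently by the real query.

```python
from typing import Any, Dict, List, Optional

# Default labels, copied from the LabelEventScore signature.
DEFAULT_LABELS: Dict[str, Dict[str, Any]] = {
    "evening": {
        "type": "MultiPolygon",
        "coordinates": [
            [[[1e-06, 0.5], [1e-06, 1], [1, 1], [1, 0.5]]],
            [[[1e-06, -1], [1e-06, -0.5], [1, -0.5], [1, -1]]],
        ],
    },
    "day": {
        "type": "Polygon",
        "coordinates": [[[-1, -0.5], [-1, 0.5], [0, 0.5], [0, -0.5]]],
    },
}


def point_in_ring(x: float, y: float, ring: List[List[float]]) -> bool:
    """Even-odd ray casting test of a point against one linear ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around, so unclosed rings also work
        # Count edges crossed by a horizontal ray running right from (x, y).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_in_geometry(x: float, y: float, geometry: Dict[str, Any]) -> bool:
    """Test a point against a GeoJSON Polygon or MultiPolygon."""
    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        raise ValueError(f"Unsupported geometry type: {geometry['type']}")
    for rings in polygons:
        exterior, holes = rings[0], rings[1:]
        in_hole = any(point_in_ring(x, y, hole) for hole in holes)
        if point_in_ring(x, y, exterior) and not in_hole:
            return True
    return False


def label_for(
    score_hour: float,
    score_dow: float,
    labels: Dict[str, Dict[str, Any]] = DEFAULT_LABELS,
) -> Optional[str]:
    """Return the first label whose shape contains the point, else None."""
    for label, geometry in labels.items():
        # x axis is the hour of day score, y axis is the day of week score
        if point_in_geometry(score_hour, score_dow, geometry):
            return label
    return None


print(label_for(0.5, 0.75))   # inside the upper "evening" polygon
print(label_for(-0.5, 0.0))   # inside the "day" polygon
print(label_for(0.5, 0.0))    # outside every default shape
```

With the default shapes, a pair with a positive hour of day score and a day of week score near zero falls outside every label, which is the kind of case the required parameter is described as covering.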
Methods¶
verify_bounds_dict_has_no_overlaps¶
verify_bounds_dict_has_no_overlaps(bounds: Dict[str, shapely.geometry.base.BaseGeometry]) -> bool
Check if any score boundaries overlap one another, and raise an exception identifying the ones that do.
Parameters¶
-
bounds
:typing.Dict
Dict mapping labels to score boundaries expressed as shapely geometries
Returns¶
-
bool
True if none of the bounds overlap. If any do overlap, an exception is raised rather than False being returned.
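The pairwise nature of this check can be illustrated with a simplified sketch. It swaps shapely geometries for axis-aligned boxes given as (min_x, min_y, max_x, max_y) tuples so that it runs with the standard library alone. The function names, the use of ValueError, and the choice to ignore boxes that merely touch are all assumptions made for illustration, not the library's documented behaviour.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share some area (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def check_no_overlaps(bounds: Dict[str, List[Box]]) -> bool:
    """Compare every pair of boxes belonging to different labels."""
    flat = [(label, box) for label, boxes in bounds.items() for box in boxes]
    for (label_a, box_a), (label_b, box_b) in combinations(flat, 2):
        if label_a != label_b and boxes_overlap(box_a, box_b):
            # Name the offending pair so the caller can fix the labels.
            raise ValueError(
                f"Bounds for '{label_a}' overlap those for '{label_b}'."
            )
    return True


# Bounding boxes of the default 'evening' and 'day' shapes: no overlap.
default_boxes = {
    "evening": [(1e-06, 0.5, 1, 1), (1e-06, -1, 1, -0.5)],
    "day": [(-1, -0.5, 0, 0.5)],
}
print(check_no_overlaps(default_boxes))
```

For arbitrary polygons the box test would need to be replaced with a real geometric intersection test.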
cache¶
cache
Returns¶
-
bool
True if caching is switched on.
column_names¶
column_names
Returns the column names.
Returns¶
-
typing.List
List of the column names of this query.
column_names_as_string_list¶
column_names_as_string_list
Get the column names as a comma separated list
Returns¶
-
str
Comma separated list of column names
dependencies¶
dependencies
Returns¶
-
set
The set of queries which this one is directly dependent on.
fully_qualified_table_name¶
fully_qualified_table_name
Returns a unique fully qualified name for the query to be stored as under the cache schema, based on a hash of the parameters, class, and subqueries.
Returns¶
-
str
String form of the table's fqn
index_cols¶
index_cols
A list of columns to use as indexes when storing this query.
Returns¶
-
ixen
:list
By default, returns the location columns if they are present and self.spatial_unit is defined, and the subscriber column.
Examples¶
daily_location("2016-01-01").index_cols
[['name'], '"subscriber"']
is_stored¶
is_stored
Returns¶
-
bool
True if the table is stored, and False otherwise.
query_id¶
query_id
Generate a uniquely identifying hash of this query, based on the parameters of it and the subqueries it is composed of.
Returns¶
-
str
query_id hash string
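As a rough illustration of what "a hash based on the parameters and the subqueries" can mean, the sketch below derives a deterministic identifier from a class name, a parameter dictionary, and the identifiers of any subqueries. The function name sketch_query_id, the JSON serialisation, and the choice of SHA-256 are assumptions for this example only. The actual flowmachine hashing scheme is not described on this page and may differ.

```python
import hashlib
import json
from typing import Any, Dict, Iterable


def sketch_query_id(
    class_name: str,
    params: Dict[str, Any],
    subquery_ids: Iterable[str] = (),
) -> str:
    """Hash a canonical serialisation of the query definition."""
    payload = json.dumps(
        {
            "class": class_name,
            "params": params,
            "subqueries": sorted(subquery_ids),
        },
        sort_keys=True,  # make the result independent of dict ordering
        default=str,     # fall back to str() for non-JSON values
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


scores_id = sketch_query_id(
    "EventScore", {"start": "2016-01-01", "stop": "2016-01-05"}
)
label_id = sketch_query_id(
    "LabelEventScore", {"required": None}, subquery_ids=[scores_id]
)
print(scores_id != label_id)
```

Because the subquery identifiers feed into the parent hash, changing any parameter of an upstream query changes the identifier of everything built on top of it.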
query_state¶
query_state
Return the current query state.
Returns¶
-
QueryState
The current query state
query_state_str¶
query_state_str
Return the current query state as a string
Returns¶
-
str
The current query state. The possible values are the ones defined in flowmachine.core.query_state.QueryState.
table_name¶
table_name
Returns a unique name for the query to be stored as, based on a hash of the parameters, class, and subqueries.
Returns¶
-
str
String form of the table's name