flowmachine.features.subscriber.label_event_score

Class LabelEventScore

LabelEventScore(*, scores: Union[flowmachine.features.subscriber.scores.EventScore, flowmachine.features.subscriber.hartigan_cluster._JoinedHartiganCluster], labels: Dict[str, Dict[str, Any]] = {'evening': {'type': 'MultiPolygon', 'coordinates': [[[[1e-06, 0.5], [1e-06, 1], [1, 1], [1, 0.5]]], [[[1e-06, -1], [1e-06, -0.5], [1, -0.5], [1, -1]]]]}, 'day': {'type': 'Polygon', 'coordinates': [[[-1, -0.5], [-1, 0.5], [0, 0.5], [0, -0.5]]]}}, required: Union[str, NoneType] = None)
Source: flowmachine/features/subscriber/label_event_score.py

Labels a table of scores according to a labelling dictionary. Each label is defined as a region of score space, and one label can optionally be marked as required so that every subscriber must have it. The class is intended for labelling locations based on scoring signatures in the absence of other automated labelling mechanisms. The result is the original query with an added label column.

Attributes

Parameters

  • scores: typing.Union[flowmachine.features.subscriber.scores.EventScore, flowmachine.features.subscriber.hartigan_cluster._JoinedHartiganCluster]

    A flowmachine.Query object. This represents a table that contains scores which are used to label a given location. This table must have a subscriber column (called subscriber).

  • labels: typing.Dict[str, typing.Dict[str, typing.Any]], default {'evening': {'type': 'MultiPolygon', 'coordinates': [[[[1e-06, 0.5], [1e-06, 1], [1, 1], [1, 0.5]]], [[[1e-06, -1], [1e-06, -0.5], [1, -0.5], [1, -1]]]]}, 'day': {'type': 'Polygon', 'coordinates': [[[-1, -0.5], [-1, 0.5], [0, 0.5], [0, -0.5]]]}}

    A dictionary whose keys are the label names and whose values are GeoJSON shapes defining regions of score space, with the hour of day score on the x axis and the day of week score on the y axis. All scores are real numbers in the range [-1.0, +1.0].

  • required: typing.Union[str, NoneType], default None

    Optionally specifies a label which every subscriber must possess independently of the score. This is used in cases where, for instance, we require that all subscribers must have an evening/home location.
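
To illustrate how a labels dictionary partitions score space, the sketch below looks up the label whose GeoJSON shape contains a given (score_hour, score_dow) pair. It is a standalone, standard-library illustration using ray casting, not the mechanism LabelEventScore itself uses; the helper names (point_in_ring, point_in_polygon, label_for_scores) are hypothetical, and points lying exactly on a boundary are not handled in any particular way.

```python
from typing import Any, Dict, Optional, Sequence

# Default labels copied from the LabelEventScore signature.
DEFAULT_LABELS: Dict[str, Dict[str, Any]] = {
    "evening": {
        "type": "MultiPolygon",
        "coordinates": [
            [[[1e-06, 0.5], [1e-06, 1], [1, 1], [1, 0.5]]],
            [[[1e-06, -1], [1e-06, -0.5], [1, -0.5], [1, -1]]],
        ],
    },
    "day": {
        "type": "Polygon",
        "coordinates": [[[-1, -0.5], [-1, 0.5], [0, 0.5], [0, -0.5]]],
    },
}


def point_in_ring(x: float, y: float, ring: Sequence[Sequence[float]]) -> bool:
    """Ray casting: toggle on every edge crossed by a ray going right from (x, y)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around so unclosed rings also work
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_in_polygon(x: float, y: float, rings: Sequence[Sequence[Sequence[float]]]) -> bool:
    """A point is in a polygon if it is inside the exterior ring and outside all holes."""
    exterior, *holes = rings
    return point_in_ring(x, y, exterior) and not any(
        point_in_ring(x, y, hole) for hole in holes
    )


def label_for_scores(
    score_hour: float,
    score_dow: float,
    labels: Dict[str, Dict[str, Any]] = DEFAULT_LABELS,
) -> Optional[str]:
    """Return the first label whose shape contains the point, or None if unlabelled."""
    for name, geom in labels.items():
        if geom["type"] == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom["type"] == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            raise ValueError(f"Unsupported geometry type: {geom['type']}")
        # x axis is the hour of day score, y axis is the day of week score
        if any(point_in_polygon(score_hour, score_dow, p) for p in polygons):
            return name
    return None


print(label_for_scores(0.5, 0.75))   # evening
print(label_for_scores(-0.5, 0.0))   # day
print(label_for_scores(0.5, 0.0))    # None
```

With the default labels, a positive hour of day score combined with a day of week score of magnitude at least 0.5 falls in the evening region, while a negative hour of day score with a day of week score near zero falls in the day region. Score pairs outside every shape receive no label in this sketch.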

Examples

es = EventScore(start="2016-01-01", stop="2016-01-05", spatial_unit=make_spatial_unit("versioned-site"))
es.head()
         subscriber site_id  version        lon        lat  score_hour  score_dow
0  ZYPxqVGLzlQy6l7n  QeBRM8        0  82.914285  29.358975         1.0       -1.0
1  4oLKbnxm3vXqjMVx  zdNQx2        0  87.265225  27.585096        -1.0        1.0
2  vKVLDx8koQWZ2ez0  LVnDQL        0  86.551302  27.245265         0.0       -1.0
3  DELmRj9Vvl346G50  m9jL23        0  82.601710  29.815919         1.0       -1.0
4  lqOknAJRDNAewM10  RZgwVz        0  84.623447  28.283523        -1.0        1.0
ls = LabelEventScore(
    scores=es,
    labels={
        "daytime": {
            "type": "Polygon",
            "coordinates": [[[-1.1, -1.1], [-1, 1.1], [1.1, 1.1], [1.1, -1.1]]],
        }
    },
)
ls.head()
     label        subscriber site_id  version        lon        lat  score_hour  score_dow
0  daytime  ZYPxqVGLzlQy6l7n  QeBRM8        0  82.914285  29.358975         1.0       -1.0
1  daytime  4oLKbnxm3vXqjMVx  zdNQx2        0  87.265225  27.585096        -1.0        1.0
2  daytime  vKVLDx8koQWZ2ez0  LVnDQL        0  86.551302  27.245265         0.0       -1.0
3  daytime  DELmRj9Vvl346G50  m9jL23        0  82.601710  29.815919         1.0       -1.0
4  daytime  lqOknAJRDNAewM10  RZgwVz        0  84.623447  28.283523        -1.0        1.0

Methods

verify_bounds_dict_has_no_overlaps

verify_bounds_dict_has_no_overlaps(bounds: Dict[str, shapely.geometry.base.BaseGeometry]) -> bool
Source: flowmachine/features/subscriber/label_event_score.py

Check if any score boundaries overlap one another, and raise an exception identifying the ones that do.

Parameters
  • bounds: typing.Dict[str, shapely.geometry.base.BaseGeometry]

    Dict mapping each label to its score boundary, expressed as a shapely geometry

Returns
  • bool

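    True if none of the bounds overlap. Per the description above, overlapping bounds raise an exception identifying the offending labels rather than returning False.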
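
The kind of check described here can be sketched with shapely as a pairwise comparison of the label geometries. This is an illustration under assumptions, not the method's actual implementation: find_overlapping_labels is a hypothetical helper, it returns the overlapping pairs instead of raising, and it uses intersects, which also counts shapes that merely touch as overlapping.

```python
from itertools import combinations
from typing import Dict, List, Tuple

from shapely.geometry import shape
from shapely.geometry.base import BaseGeometry

# Default label shapes copied from the LabelEventScore signature,
# converted from GeoJSON-style dicts to shapely geometries.
DEFAULT_BOUNDS: Dict[str, BaseGeometry] = {
    "evening": shape(
        {
            "type": "MultiPolygon",
            "coordinates": [
                [[[1e-06, 0.5], [1e-06, 1], [1, 1], [1, 0.5]]],
                [[[1e-06, -1], [1e-06, -0.5], [1, -0.5], [1, -1]]],
            ],
        }
    ),
    "day": shape(
        {
            "type": "Polygon",
            "coordinates": [[[-1, -0.5], [-1, 0.5], [0, 0.5], [0, -0.5]]],
        }
    ),
}


def find_overlapping_labels(
    bounds: Dict[str, BaseGeometry]
) -> List[Tuple[str, str]]:
    """Return every pair of labels whose geometries intersect (hypothetical helper)."""
    return [
        (label_a, label_b)
        for (label_a, geom_a), (label_b, geom_b) in combinations(bounds.items(), 2)
        if geom_a.intersects(geom_b)
    ]


# The default shapes are disjoint: "evening" starts at x = 1e-06, "day" ends at x = 0.
print(find_overlapping_labels(DEFAULT_BOUNDS))

# A made-up "night" shape that cuts across both default regions.
night = shape(
    {
        "type": "Polygon",
        "coordinates": [[[-0.5, 0], [-0.5, 0.75], [0.5, 0.75], [0.5, 0]]],
    }
)
print(find_overlapping_labels({**DEFAULT_BOUNDS, "night": night}))
```

Because intersects treats shared edges as an intersection, the small 1e-06 offset in the default evening shape is what keeps it from touching the day shape along x = 0 in this sketch.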

cache

cache
Source: flowmachine/core/query.py

Returns
  • bool

    True if caching is switched on.

column_names

column_names
Source: flowmachine/features/subscriber/label_event_score.py

Returns the column names.

Returns
  • typing.List[str]

    List of the column names of this query.

column_names_as_string_list

column_names_as_string_list
Source: flowmachine/core/query.py

Get the column names as a single comma-separated string.

Returns
  • str

    Comma separated list of column names

dependencies

dependencies
Source: flowmachine/core/query.py

Returns
  • set

    The set of queries which this one is directly dependent on.

fully_qualified_table_name

fully_qualified_table_name
Source: flowmachine/core/query.py

Returns a unique, fully qualified name under which the query is stored in the cache schema, based on a hash of the parameters, class, and subqueries.

Returns
  • str

    String form of the table's fqn

index_cols

index_cols
Source: flowmachine/core/query.py

A list of columns to use as indexes when storing this query.

Returns
  • ixen: list

    By default, returns the location columns if they are present and self.spatial_unit is defined, and the subscriber column.

Examples
daily_location("2016-01-01").index_cols
[['name'], '"subscriber"']

is_stored

is_stored
Source: flowmachine/core/query.py

Returns
  • bool

    True if the table is stored, and False otherwise.

query_id

query_id
Source: flowmachine/core/query.py

Generate a uniquely identifying hash of this query, based on its parameters and the subqueries it is composed of.

Returns
  • str

    query_id hash string

query_state

query_state
Source: flowmachine/core/query.py

Return the current query state.

Returns
  • QueryState

    The current query state

query_state_str

query_state_str
Source: flowmachine/core/query.py

Return the current query state as a string.

Returns
  • str

    The current query state. The possible values are the ones defined in flowmachine.core.query_state.QueryState.

table_name

table_name
Source: flowmachine/core/query.py

Returns a unique name under which the query is stored, based on a hash of the parameters, class, and subqueries.

Returns
  • str

    String form of the table's name