
flowmachine.features.subscriber.per_subscriber_aggregate

Class PerSubscriberAggregate

PerSubscriberAggregate(*, subscriber_query: flowmachine.features.subscriber.metaclasses.SubscriberFeature, agg_column: str, agg_method: str = 'avg')
Source: flowmachine/features/subscriber/per_subscriber_aggregate.py

Query that performs per-subscriber aggregation of a column. Returns a column 'subscriber' containing unique subscribers and a column 'value' containing the aggregation.

Attributes

Parameters

  • subscriber_query: flowmachine.features.subscriber.metaclasses.SubscriberFeature
    A query with a `subscriber` column.
  • agg_column: str
    The name of the column in `subscriber_query` to aggregate. Cannot be 'subscriber'.
  • agg_method: {"count", "sum", "avg", "max", "min", "median", "stddev", "variance"}, default "avg"
    The method of aggregation to perform.
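
The parameter constraints above can be illustrated with a minimal sketch. This is not flowmachine's implementation: `build_aggregate_sql` and `VALID_AGG_METHODS` are hypothetical names, and the generated SQL is only an assumption about the kind of statement such a query might produce. PostgreSQL has no built-in `median` aggregate, so the sketch substitutes `percentile_cont(0.5)`.

```python
# Hypothetical helper, NOT part of flowmachine: illustrates the documented
# parameter rules and the general shape of a per-subscriber GROUP BY.
VALID_AGG_METHODS = {
    "count", "sum", "avg", "max", "min", "median", "stddev", "variance",
}


def build_aggregate_sql(subquery_sql: str, agg_column: str, agg_method: str = "avg") -> str:
    """Return a SQL string aggregating `agg_column` once per subscriber."""
    if agg_column == "subscriber":
        raise ValueError("'agg_column' cannot be 'subscriber'")
    if agg_method not in VALID_AGG_METHODS:
        raise ValueError(f"Unknown agg_method {agg_method!r}; expected one of {sorted(VALID_AGG_METHODS)}")

    if agg_method == "median":
        # Assumption: emulate median with an ordered-set aggregate.
        agg_expr = f"percentile_cont(0.5) WITHIN GROUP (ORDER BY {agg_column})"
    else:
        agg_expr = f"{agg_method}({agg_column})"

    return (
        f"SELECT subscriber, {agg_expr} AS value "
        f"FROM ({subquery_sql}) AS sub "
        f"GROUP BY subscriber"
    )


sql = build_aggregate_sql("SELECT subscriber, value FROM durations", "value", "max")
```

Validating both arguments before building the statement means a bad `agg_method` fails immediately rather than when the query is run.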
    

Examples

Gets the maximum call duration of each subscriber on 2016-01-01.

    >>> per_location_query = PerLocationSubscriberCallDurations("2016-01-01", "2016-01-02")
    >>> max_psa = PerSubscriberAggregate(
    ...     subscriber_query=per_location_query, agg_column="value", agg_method="max"
    ... )

               subscriber   value
    0    038OVABN11Ak4W5P  4641.0
    1    0Gl95NRLjW2aw8pW   876.0
    2    0gmvwzMAYbz5We1E  2214.0
    3    0MQ4RYeKn7lryxGa  3964.0
    4    0Ze1l70j0LNgyY4w  3368.0
    ..                ...     ...
    350  ZmPRjkQ74Xeql71V  2385.0
    351  ZQG8glazmxYa1K62  4238.0
    352  Zv4W9eak2QN1M5A7   337.0
    353  zvaOknzKbEVD2eME  2171.0
    354  ZYPxqVGLzlQy6l7n  4602.0

    [355 rows x 2 columns]
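
To show what the aggregation does without a database connection, the following sketch emulates it in pure Python. The function `per_subscriber_aggregate`, the `AGGREGATORS` mapping, and the sample rows are all made up for illustration; flowmachine itself performs the aggregation in SQL. The mapping of `stddev` and `variance` to the sample statistics follows PostgreSQL, where those names are aliases for `stddev_samp` and `var_samp`.

```python
from collections import defaultdict
import statistics

# Illustrative mapping from the documented method names to Python callables.
AGGREGATORS = {
    "count": len,
    "sum": sum,
    "avg": statistics.fmean,
    "max": max,
    "min": min,
    "median": statistics.median,
    "stddev": statistics.stdev,       # sample standard deviation
    "variance": statistics.variance,  # sample variance
}


def per_subscriber_aggregate(rows, agg_column, agg_method="avg"):
    """Group rows by 'subscriber' and aggregate `agg_column` within each group."""
    if agg_column == "subscriber":
        raise ValueError("'agg_column' cannot be 'subscriber'")
    aggregate = AGGREGATORS[agg_method]

    grouped = defaultdict(list)
    for row in rows:
        grouped[row["subscriber"]].append(row[agg_column])

    # One output row per unique subscriber, with the result in 'value'.
    return [
        {"subscriber": subscriber, "value": aggregate(values)}
        for subscriber, values in sorted(grouped.items())
    ]


# Made-up input: one row per (subscriber, location) pair.
rows = [
    {"subscriber": "A", "value": 10.0},
    {"subscriber": "B", "value": 5.0},
    {"subscriber": "A", "value": 30.0},
]
result = per_subscriber_aggregate(rows, agg_column="value", agg_method="max")
# result == [{'subscriber': 'A', 'value': 30.0}, {'subscriber': 'B', 'value': 5.0}]
```

Note that `statistics.stdev` and `statistics.variance` raise an error for a group with a single value, whereas a SQL aggregate would return NULL in that case.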

Methods

cache

cache
Source: flowmachine/core/query.py

Returns
  • bool

    True if caching is switched on.

column_names

column_names
Source: flowmachine/features/subscriber/per_subscriber_aggregate.py

Returns the column names.

Returns
  • typing.List[str]

    List of the column names of this query.

column_names_as_string_list

column_names_as_string_list
Source: flowmachine/core/query.py

Get the column names as a comma separated list

Returns
  • str

    Comma separated list of column names
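
As a rough sketch of the relationship between `column_names` and `column_names_as_string_list`, the string form can be thought of as the list joined with commas. The helper name and the `", "` separator are assumptions made for illustration, not taken from the flowmachine source.

```python
from typing import List


def column_names_as_string(column_names: List[str]) -> str:
    """Join column names into a single comma-separated string (assumed separator: ', ')."""
    return ", ".join(column_names)


# For this query the documented columns are 'subscriber' and 'value'.
columns_str = column_names_as_string(["subscriber", "value"])
```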

dependencies

dependencies
Source: flowmachine/core/query.py

Returns
  • set

    The set of queries which this one is directly dependent on.

fully_qualified_table_name

fully_qualified_table_name
Source: flowmachine/core/query.py

Returns a unique fully qualified name for the query to be stored as under the cache schema, based on a hash of the parameters, class, and subqueries.

Returns
  • str

    String form of the table's fqn

index_cols

index_cols
Source: flowmachine/core/query.py

A list of columns to use as indexes when storing this query.

Returns
  • ixen: list

    By default, returns the location columns if they are present and self.spatial_unit is defined, and the subscriber column.

Examples
>>> daily_location("2016-01-01").index_cols
[['name'], '"subscriber"']

is_stored

is_stored
Source: flowmachine/core/query.py

Returns
  • bool

    True if the table is stored, and False otherwise.

query_id

query_id
Source: flowmachine/core/query.py

Generate a uniquely identifying hash of this query, based on the parameters of it and the subqueries it is composed of.

Returns
  • str

    query_id hash string
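
The idea of an identifier derived from a query's parameters and subqueries can be sketched as follows. This is an illustration under stated assumptions, not flowmachine's hashing scheme: `sketch_query_id`, the JSON serialisation, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
import json


def sketch_query_id(class_name, params, subquery_ids=()):
    """Return a deterministic hex digest for a query description.

    Hypothetical scheme: serialise the class name, parameters, and the ids of
    any subqueries in a canonical order, then hash the result.
    """
    payload = json.dumps(
        {
            "class": class_name,
            "params": params,
            "subqueries": sorted(subquery_ids),
        },
        sort_keys=True,  # parameter order must not change the id
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


inner_id = sketch_query_id(
    "PerLocationSubscriberCallDurations", {"start": "2016-01-01", "stop": "2016-01-02"}
)
outer_id = sketch_query_id(
    "PerSubscriberAggregate",
    {"agg_column": "value", "agg_method": "max"},
    subquery_ids=[inner_id],
)
```

Because the subquery's id feeds into the outer hash, changing any parameter of the inner query also changes the id of the aggregate built on top of it, which is what makes such an id usable as a cache key.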

query_state

query_state
Source: flowmachine/core/query.py

Return the current query state.

Returns
  • QueryState

    The current query state

query_state_str

query_state_str
Source: flowmachine/core/query.py

Return the current query state as a string

Returns
  • str

    The current query state. The possible values are the ones defined in flowmachine.core.query_state.QueryState.

table_name

table_name
Source: flowmachine/core/query.py

Returns a unique name for the query to be stored as, based on a hash of the parameters, class, and subqueries.

Returns
  • str

    String form of the table's name