flowmachine.features.utilities.group_values

Source: flowmachine/features/utilities/group_values.py

Utility class that allows the user to iterate through arbitrary groups of fields and apply a Python function to the results.

Class GroupValues

GroupValues(group, value, start, stop, **kwargs)
Source: flowmachine/features/utilities/group_values.py

Query representing groups of certain columns, with the values of other columns returned as arrays.

Attributes

Parameters

  • group: str or list of strings

    Name of the column(s) that should be grouped, e.g. msisdn_from

  • value: str or list of strings

    Name of the column(s) that should be returned as an array

  • start, stop: str

    start and stop times of the analysis, in ISO-format

  • kwargs: dict

    Passed to flowmachine EventTableSubset

Examples

gv = GroupValues('msisdn_from', 'datetime')
for g,v in gv:
    print((g, str(max(v))))
('SubscriberA', '2016-01-01 23:00:01')
('SubscriberB', '2016-01-01 22:12:04')
...

Note

  • When the user passes more than one group or more than one value, the results will be an iterator of the following form: (group1, group2, array(value1), array(value2))

  • This class is mostly used through the method ColumnMap, which maps a user-defined Python function to the output of the iterator.
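
The tuple form described in the note can be illustrated without a database. The following is a minimal sketch in plain Python, not flowmachine's implementation (which performs the grouping in the database); the helper `group_rows` and the sample rows are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def group_rows(rows, group_idx, value_idx):
    """Yield (group1, ..., values1, ...) tuples from an iterable of row tuples.

    Hypothetical helper: group_idx and value_idx are the positions of the
    grouping columns and the value columns within each row.
    """
    # One list per value column, keyed by the tuple of group values.
    grouped = defaultdict(lambda: [[] for _ in value_idx])
    for row in rows:
        key = tuple(row[i] for i in group_idx)
        for slot, i in zip(grouped[key], value_idx):
            slot.append(row[i])
    for key, arrays in grouped.items():
        # Flatten into (group1, group2, array(value1), array(value2)).
        yield (*key, *arrays)


# Made-up rows: two group columns followed by two value columns.
rows = [
    ("A", "B", 1, 10),
    ("A", "B", 2, 20),
    ("A", "C", 3, 30),
]
result = list(group_rows(rows, group_idx=(0, 1), value_idx=(2, 3)))
```

Each yielded tuple starts with the group values and ends with one list per value column, mirroring the shape given in the note.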

Methods

ColumnMap

ColumnMap(self, fn)
Source: flowmachine/features/utilities/group_values.py

Maps a function to each of the returned arrays, and returns an iterator over the results.

Examples
def highest_min(date_list):
    return max([x.minute for x in date_list])
gv = GroupValues('msisdn_from', 'datetime')
cm = gv.ColumnMap(highest_min)
for c in cm:
    print(c)
('BKMy1nYEZpnoEA7G', 58)
('DzpZJ2EaVQo2X5vM', 56)
('Zv4W9eak2QN1M5A7', 55)
('NQV3J52PeYgbLm2w', 54)
...
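
To make the mapping step concrete, here is a small sketch in plain Python of what applying a function to each group's array amounts to. It assumes the single-group, single-value case and uses in-memory data; `column_map` and the sample groups are hypothetical stand-ins, not the library's code.

```python
from datetime import datetime


def column_map(groups, fn):
    """Lazily apply fn to the value array of each (group, values) pair."""
    for group, values in groups:
        yield (group, fn(values))


def highest_min(date_list):
    # Largest minute-of-the-hour among the datetimes in the list.
    return max(x.minute for x in date_list)


# Made-up data standing in for the output of a GroupValues iterator.
groups = [
    ("subscriber_a", [datetime(2016, 1, 1, 23, 0, 1), datetime(2016, 1, 1, 10, 58, 0)]),
    ("subscriber_b", [datetime(2016, 1, 1, 22, 12, 4)]),
]
mapped = list(column_map(groups, highest_min))
```

Because `column_map` is a generator, results are computed only as the caller iterates, which matches the description of ColumnMap returning an iterator.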

cache

cache
Source: flowmachine/core/query.py

Returns
  • bool

    True if caching is switched on.

column_names

column_names
Source: flowmachine/features/utilities/group_values.py

Returns the column names.

Returns
  • typing.List[str]

    List of the column names of this query.

column_names_as_string_list

column_names_as_string_list
Source: flowmachine/core/query.py

Get the column names as a comma-separated list.

Returns
  • str

    Comma-separated list of column names

dependencies

dependencies
Source: flowmachine/core/query.py

Returns
  • set

    The set of queries which this one is directly dependent on.

fully_qualified_table_name

fully_qualified_table_name
Source: flowmachine/core/query.py

Returns a unique fully qualified name under which the query is stored in the cache schema, based on a hash of the parameters, class, and subqueries.

Returns
  • str

    String form of the table's fully qualified name (fqn)

index_cols

index_cols
Source: flowmachine/core/query.py

A list of columns to use as indexes when storing this query.

Returns
  • ixen: list

    By default, returns the location columns if they are present and self.spatial_unit is defined, and the subscriber column.

Examples
daily_location("2016-01-01").index_cols
[['name'], '"subscriber"']

is_stored

is_stored
Source: flowmachine/core/query.py

Returns
  • bool

    True if the table is stored, and False otherwise.

query_id

query_id
Source: flowmachine/core/query.py

Generate a uniquely identifying hash of this query, based on its parameters and the subqueries it is composed of.

Returns
  • str

    query_id hash string
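
The idea of deriving an identifier from a query's class, parameters, and subqueries can be sketched as follows. This is an illustration only: the exact inputs and hash function flowmachine uses are not described here, so the choice of MD5, the `|` separator, and the function `sketch_query_id` are all assumptions.

```python
import hashlib


def sketch_query_id(class_name, params, subquery_ids=()):
    """Return a deterministic hex digest for a query description.

    Hypothetical scheme: the class name, the parameters sorted by name,
    and the sorted ids of any subqueries are joined and hashed.
    """
    parts = [class_name]
    parts += [f"{name}={params[name]!r}" for name in sorted(params)]
    parts += sorted(subquery_ids)
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


# Same parameters in a different order give the same id.
a = sketch_query_id("GroupValues", {"group": "msisdn_from", "value": "datetime"})
b = sketch_query_id("GroupValues", {"value": "datetime", "group": "msisdn_from"})
# A different parameter value gives a different id.
c = sketch_query_id("GroupValues", {"group": "msisdn_to", "value": "datetime"})
```

Sorting the parameters before hashing means two queries built with the same arguments map to the same identifier, which is the property a cache lookup needs.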

query_state

query_state
Source: flowmachine/core/query.py

Return the current query state.

Returns
  • QueryState

    The current query state

query_state_str

query_state_str
Source: flowmachine/core/query.py

Return the current query state as a string

Returns
  • str

    The current query state. The possible values are the ones defined in flowmachine.core.query_state.QueryState.

table_name

table_name
Source: flowmachine/core/query.py

Returns a unique name under which the query is stored, based on a hash of the parameters, class, and subqueries.

Returns
  • str

    String form of the table's name