flowmachine.features.subscriber.total_active_periods¶
Source: flowmachine/features/subscriber/total_active_periods.py
Definition of the TotalActivePeriodsSubscriber class, which breaks a time span into many smaller time spans and counts the number of distinct periods in which each subscriber is present in the data.
Note
The implementation of this algorithm was principally done in order to form an input to detection bias models (see [1]). However, it is not limited to that use-case.
Class TotalActivePeriodsSubscriber¶
TotalActivePeriodsSubscriber(start: str, total_periods: int, period_length: int = 1, period_unit: str = 'days', hours: Union[Tuple[int, int], NoneType] = None, table: Union[str, List[str]] = 'all', subscriber_identifier: str = 'msisdn', subscriber_subset: Union[flowmachine.core.query.Query, NoneType] = None)
Breaks a time span into distinct time periods (currently an integer number of days). For each subscriber, counts the total number of time periods in which that subscriber was seen. For instance, we might ask for a month's worth of data, break the month into ten 3-day chunks, and ask how many of these 3-day chunks each subscriber was present in.
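The counting described above can be sketched in plain Python. This is only an illustration of the idea, not flowmachine's implementation (which runs as a database query): the `total_active_periods` function and the `(subscriber, date)` event tuples are hypothetical, and only day-based periods are handled.

```python
from collections import defaultdict
from datetime import date


def total_active_periods(events, start, total_periods, period_length=1):
    """Count, per subscriber, the distinct periods containing at least one event.

    events: iterable of (subscriber, date) pairs (hypothetical input shape).
    start: ISO-format date string marking the start of the first period.
    """
    start_date = date.fromisoformat(start)
    seen = defaultdict(set)  # subscriber -> set of period indices they appear in
    for subscriber, event_date in events:
        offset = (event_date - start_date).days
        if offset < 0:
            continue  # event is before the analysis window
        period_index = offset // period_length
        if period_index < total_periods:
            seen[subscriber].add(period_index)
    return {subscriber: len(periods) for subscriber, periods in seen.items()}


events = [
    ("subscriberA", date(2016, 1, 1)),
    ("subscriberA", date(2016, 1, 2)),   # same 3-day period as the event above
    ("subscriberA", date(2016, 1, 4)),   # second period
    ("subscriberB", date(2016, 1, 30)),  # tenth (last) period
    ("subscriberB", date(2016, 1, 31)),  # outside the 30-day span, ignored
]
counts = total_active_periods(events, "2016-01-01", total_periods=10, period_length=3)
print(counts)
```

Several events falling in the same period count once, which is why a set of period indices is kept per subscriber rather than a running total.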
Attributes¶
Parameters¶
- start: str. ISO-format date, start of the analysis.
- total_periods: int. Total number of periods to break your time span into.
- period_length: int, default 1. Total number of days per period.
- period_unit: str, default days. Split this time frame into hours or days etc.
- subscriber_identifier: str, default msisdn. Either msisdn or imei, the column that identifies the subscriber.
- subscriber_subset: typing.Union[flowmachine.core.query.Query, NoneType], default None. If provided, a string or list of strings which are msisdns or imeis to limit results to; or a query or table which has a column with a name matching subscriber_identifier (typically msisdn) to limit results to.
- kwargs: passed to flowmachine.UniqueSubscribers.
Examples¶
TotalActivePeriodsSubscriber('2016-01-01', 10, 3).get_dataframe()
subscriber total_periods
subscriberA 10
subscriberB 3
subscriberC 7
.
.
.
Methods¶
cache¶
cache
Returns¶
- bool: True if caching is switched on.
column_names¶
column_names
Returns the column names.
Returns¶
- typing.List[str]: List of the column names of this query.
column_names_as_string_list¶
column_names_as_string_list
Get the column names as a comma-separated list.
Returns¶
- str: Comma-separated list of column names.
dependencies¶
dependencies
Returns¶
- set: The set of queries which this one is directly dependent on.
fully_qualified_table_name¶
fully_qualified_table_name
Returns a unique fully qualified name under which the query is stored in the cache schema, based on a hash of the parameters, class, and subqueries.
Returns¶
- str: String form of the table's fully qualified name (fqn).
index_cols¶
index_cols
A list of columns to use as indexes when storing this query.
Returns¶
- ixen: list. By default, returns the location columns if they are present and self.spatial_unit is defined, and the subscriber column.
Examples¶
daily_location("2016-01-01").index_cols
[['name'], '"subscriber"']
is_stored¶
is_stored
Returns¶
- bool: True if the table is stored, and False otherwise.
query_id¶
query_id
Generate a uniquely identifying hash of this query, based on its parameters and the subqueries it is composed of.
Returns¶
- str: query_id hash string.
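As a rough illustration of what a hash based on a query's parameters and subqueries could look like, the sketch below serialises them deterministically and hashes the result. The function `sketch_query_id`, the choice of JSON serialisation, and the use of MD5 are all assumptions made for this example; the source does not state how flowmachine actually computes query_id.

```python
import hashlib
import json


def sketch_query_id(class_name, params, subquery_ids=()):
    """Hypothetical helper: derive a stable id from class, parameters and subqueries."""
    payload = json.dumps(
        {
            "class": class_name,
            "params": params,
            "subqueries": sorted(subquery_ids),
        },
        sort_keys=True,  # key order must not affect the hash
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


a = sketch_query_id(
    "TotalActivePeriodsSubscriber",
    {"start": "2016-01-01", "total_periods": 10, "period_length": 3},
)
b = sketch_query_id(
    "TotalActivePeriodsSubscriber",
    {"period_length": 3, "total_periods": 10, "start": "2016-01-01"},
)
c = sketch_query_id(
    "TotalActivePeriodsSubscriber",
    {"start": "2016-01-01", "total_periods": 10, "period_length": 1},
)
print(a == b, a == c)
```

The property that matters for caching is that identical queries map to the same id regardless of argument order, while any change to a parameter yields a different id.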
query_state¶
query_state
Return the current query state.
Returns¶
- QueryState: The current query state.
query_state_str¶
query_state_str
Return the current query state as a string.
Returns¶
- str: The current query state. The possible values are the ones defined in flowmachine.core.query_state.QueryState.
table_name¶
table_name
Returns a unique name for the query to be stored as, based on a hash of the parameters, class, and subqueries.
Returns¶
- str: String form of the table's fqn.