flowmachine.features.subscriber.daily_location

Source: flowmachine/features/subscriber/daily_location.py

Calculates a subscriber's daily location using different methods. A daily location is a statistic representing where a subscriber is on a given day.

daily_location

daily_location(date, stop=None, *, spatial_unit: Union[flowmachine.core.spatial_unit.CellSpatialUnit, flowmachine.core.spatial_unit.GeomSpatialUnit, NoneType] = None, hours: Union[Tuple[int, int], NoneType] = None, method='last', table='all', subscriber_identifier='msisdn', ignore_nulls=True, subscriber_subset=None)

Return a query for locating all subscribers on a single day of data.

Parameters

  • date: str

    ISO format date for the day in question, e.g. 2016-01-01

  • stop: str

    Optionally specify a stop datetime in ISO format, e.g. 2016-01-02 06:00:00

  • spatial_unit: typing.Union[flowmachine.core.spatial_unit.CellSpatialUnit, flowmachine.core.spatial_unit.GeomSpatialUnit, NoneType], default None

    Spatial unit to which subscriber locations will be mapped. See the docstring of make_spatial_unit for more information.

  • hours: typing.Union[typing.Tuple[int, int], NoneType], default None

    Subset the result within certain hours, e.g. (4, 17). This restricts the query to these hours only, but across all specified days. Or set to 'all' to include all hours.

  • method: str, default 'last'

    The method by which to calculate the location of the subscriber. This can be either 'most-common' or 'last'. 'most-common' is simply the modal location of the subscriber, whereas 'last' is the location of the subscriber at the time of the final call in the data.

  • table: str, default 'all'

    Schema-qualified name of the table on which the analysis is based. If 'all', it will use all tables that contain location data, as specified in flowmachine.yml.

  • subscriber_identifier: {'msisdn', 'imei'}, default 'msisdn'

    The column that identifies the subscriber: either msisdn or imei.

  • subscriber_subset: flowmachine.core.Table, flowmachine.core.Query, list, str, default None

    If provided, either a string or list of strings giving the msisdns or imeis to limit results to, or a query or table with a column whose name matches subscriber_identifier (typically msisdn) to limit results to.

Note

  • A date without hours and minutes will be interpreted as midnight of that day, so to get data within a single day pass (e.g.) '2016-01-01', '2016-01-02'.

  • Use 24 hr format!
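
To make the difference between the two method values described above concrete, here is a minimal pure-Python sketch on toy data. It is not how flowmachine computes the result (the library builds database queries); the events list and the helpers last_location and most_common_location are hypothetical names introduced only for illustration.

```python
from collections import Counter

# Hypothetical toy events for one day: (subscriber, timestamp, location).
events = [
    ("subscriberA", "2016-01-01 08:00:00", "cell_1"),
    ("subscriberA", "2016-01-01 12:30:00", "cell_1"),
    ("subscriberA", "2016-01-01 21:15:00", "cell_2"),
    ("subscriberB", "2016-01-01 09:45:00", "cell_3"),
]


def last_location(events):
    """Location of each subscriber's final event (cf. method='last')."""
    latest = {}
    for subscriber, timestamp, location in events:
        # ISO-format timestamps sort chronologically as plain strings.
        if subscriber not in latest or timestamp >= latest[subscriber][0]:
            latest[subscriber] = (timestamp, location)
    return {s: loc for s, (_, loc) in latest.items()}


def most_common_location(events):
    """Modal location of each subscriber (cf. method='most-common')."""
    counts = {}
    for subscriber, _, location in events:
        counts.setdefault(subscriber, Counter())[location] += 1
    # Ties are not handled specially in this sketch.
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


last = last_location(events)
modal = most_common_location(events)
print(last)
print(modal)
```

In this toy data, subscriberA is seen most often at cell_1 but finishes the day at cell_2, so the two methods disagree for that subscriber and agree for subscriberB.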

locate_subscribers

locate_subscribers(start, stop, spatial_unit: Union[flowmachine.core.spatial_unit.CellSpatialUnit, flowmachine.core.spatial_unit.GeomSpatialUnit, NoneType] = None, hours: Union[Tuple[int, int], NoneType] = None, method='last', table='all', subscriber_identifier='msisdn', *, ignore_nulls=True, subscriber_subset=None)

Return an object representing the location of each subscriber. Finds the last or most-common location for every subscriber within the given time frame, mapped to the specified spatial unit. The calculation is selected with the method parameter.

Parameters

  • start, stop: str

    ISO format date range for the time frame, e.g. 2016-01-01 or 2016-01-01 14:03:01

  • spatial_unit: typing.Union[flowmachine.core.spatial_unit.CellSpatialUnit, flowmachine.core.spatial_unit.GeomSpatialUnit, NoneType], default None

    Spatial unit to which subscriber locations will be mapped. See the docstring of make_spatial_unit for more information.

  • hours: typing.Union[typing.Tuple[int, int], NoneType], default None

    Subset the result within certain hours, e.g. (4, 17). This restricts the query to these hours only, but across all specified days. Or set to 'all' to include all hours.

  • method: str, default 'last'

    The method by which to calculate the location of the subscriber. This can be either 'most-common' or 'last'. 'most-common' is simply the modal location of the subscriber, whereas 'last' is the location of the subscriber at the time of the final call in the data.

  • table: str, default 'all'

    Schema-qualified name of the table on which the analysis is based. If 'all', it will use all tables that contain location data, as specified in flowmachine.yml.

  • subscriber_identifier: {'msisdn', 'imei'}, default 'msisdn'

    The column that identifies the subscriber: either msisdn or imei.

  • subscriber_subset: flowmachine.core.Table, flowmachine.core.Query, list, str, default None

    If provided, either a string or list of strings giving the msisdns or imeis to limit results to, or a query or table with a column whose name matches subscriber_identifier (typically msisdn) to limit results to.

  • kwargs

    Eventually passed to flowmachine.spatial_metrics.spatial_helpers.

Examples

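from flowmachine.core.spatial_unit import CellSpatialUnit
from flowmachine.features.subscriber.daily_location import locate_subscribers

# Last known cell for each subscriber within the time frame
last_locs = locate_subscribers('2016-01-01 13:30:30',
                               '2016-01-02 16:25:00',
                               spatial_unit=CellSpatialUnit(),
                               method='last')
last_locs.head()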
            subscriber    |    cell
            subscriberA   |   233241
            subscriberB   |   234111
            subscriberC   |   234111
                    .
                    .
                    .
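
The hours parameter described above filters by hour of day on every day in the range, rather than narrowing the overall start and stop. The following sketch illustrates that idea on plain timestamps. It assumes a half-open interval (start hour included, end hour excluded), which this page does not state, and within_hours is a hypothetical helper rather than part of flowmachine.

```python
from datetime import datetime


def within_hours(timestamp, hours):
    """True if the timestamp's hour lies in [start, end).

    The half-open convention is an assumption made for this sketch.
    """
    start, end = hours
    hour = datetime.fromisoformat(timestamp).hour
    return start <= hour < end


# Hypothetical event times spanning two days.
timestamps = [
    "2016-01-01 03:59:00",
    "2016-01-01 04:00:00",
    "2016-01-02 16:59:59",
    "2016-01-02 17:00:00",
]

# Keep only events between 04:00 and 17:00, whichever day they fall on.
kept = [t for t in timestamps if within_hours(t, (4, 17))]
print(kept)
```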

Note

  • A date without hours and minutes will be interpreted as midnight of that day, so to get data within a single day pass (e.g.) '2016-01-01', '2016-01-02'.

  • Use 24 hr format!
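
The midnight rule in the note can be checked with the Python standard library alone. This sketch only shows how date-only ISO strings parse; it does not call flowmachine, and the variable names are illustrative.

```python
from datetime import datetime, timedelta

# A date-only ISO string parses to midnight at the start of that day.
start = datetime.fromisoformat("2016-01-01")
stop = datetime.fromisoformat("2016-01-02")

# Passing consecutive dates therefore covers exactly one full day.
span = stop - start
print(start, stop, span)

# Passing the same date for both bounds would cover no time at all.
empty_span = datetime.fromisoformat("2016-01-01") - start
print(empty_span)
```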