api_etl.utils_misc module

Module containing utility functions and classes shared by the other modules.

class api_etl.utils_misc.DateConverter(dt=None, api_date=None, normal_date=None, normal_time=None, special_date=None, special_time=None, force_regular_date=False)

Bases: object

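Class to convert dates between the API date format, our regular format,
and our special format. The examples below all denote the same instant;
the special format attaches times after midnight to the previous day, so
its hours can exceed 24: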
  • api_format: “16/02/2017 01:26”
  • normal date: “20170216”
  • normal time: “01:26:00”
  • special date: “20170215”
  • special time: “25:26:00”
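
As an illustration of how the formats above relate, here is a minimal sketch that converts an API date string into the special date and time. It is not the class's actual implementation: the function name `to_special` and the `cutoff_hour` parameter are hypothetical, and the value 3 is an assumption (the source does not say at which hour the special format stops rolling times back to the previous day).

```python
from datetime import datetime, timedelta


def to_special(api_date, cutoff_hour=3):
    """Convert an API date string into (special_date, special_time).

    cutoff_hour is an assumption: times strictly before that hour are
    attributed to the previous day, with 24 added to the hour.
    """
    dt = datetime.strptime(api_date, "%d/%m/%Y %H:%M")
    if dt.hour < cutoff_hour:
        # After midnight: keep the previous day's date and push the hour past 24.
        service_day = dt - timedelta(days=1)
        hour = dt.hour + 24
    else:
        service_day = dt
        hour = dt.hour
    special_date = service_day.strftime("%Y%m%d")
    special_time = "{:02d}:{:02d}:{:02d}".format(hour, dt.minute, dt.second)
    return special_date, special_time


print(to_special("16/02/2017 01:26"))  # ('20170215', '25:26:00')
```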

This class also has methods to compute delays.

compute_delay_from(dc=None, dt=None, api_date=None, normal_date=None, normal_time=None, special_date=None, special_time=None, force_regular_date=False)

Create another DateConverter and compare the two datetimes. Return the delay in seconds:
  • positive if this one > ‘from’ (delayed)
  • negative if this one < ‘from’ (in advance)

Parameters:
  • dc
  • dt
  • api_date
  • normal_date
  • normal_time
  • special_date
  • special_time
  • force_regular_date
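
The sign convention can be sketched with plain `datetime` objects. The helper `delay_in_seconds` is hypothetical and only mirrors the rule stated above; it does not reproduce how DateConverter builds its internal datetime.

```python
from datetime import datetime


def delay_in_seconds(this_dt, from_dt):
    """Seconds between two datetimes: positive if this_dt is later than from_dt."""
    return (this_dt - from_dt).total_seconds()


scheduled = datetime(2017, 2, 16, 1, 26)
observed = datetime(2017, 2, 16, 1, 29)

print(delay_in_seconds(observed, scheduled))  # 180.0 (delayed)
print(delay_in_seconds(scheduled, observed))  # -180.0 (in advance)
```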

class api_etl.utils_misc.S3Bucket(name, create_if_absent=False)

Bases: object

list_bucket_objects()
send_file(file_path, file_name=None, delete=False, ignore_hidden=False)
send_folder(folder_path, folder_name=None, delete=False, ignore_hidden=True)

Keeps the same names for the files inside the folder.

Note: S3 has no folders, only files whose names contain a path.

Parameters:
  • folder_path
  • folder_name
  • delete
  • ignore_hidden
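
Because S3 only stores objects whose names look like paths, sending a folder amounts to computing one object name per file. The sketch below shows that mapping without touching S3; `keys_for_folder` is a hypothetical helper, and the way it uses `folder_name` as a prefix and skips dot-files is an assumption about what `send_folder` does, not its actual code.

```python
import os


def keys_for_folder(folder_path, folder_name=None, ignore_hidden=True):
    """Return (local_path, object_key) pairs for every file under folder_path."""
    folder_path = os.path.abspath(folder_path)
    # Assumption: the remote "folder" defaults to the local folder's name.
    prefix = folder_name or os.path.basename(folder_path)
    pairs = []
    for root, dirs, files in os.walk(folder_path):
        if ignore_hidden:
            # Prune hidden directories in place so os.walk does not descend into them.
            dirs[:] = [d for d in dirs if not d.startswith(".")]
            files = [f for f in files if not f.startswith(".")]
        for name in sorted(files):
            local = os.path.join(root, name)
            rel = os.path.relpath(local, folder_path)
            # Object keys always use "/" whatever the local path separator is.
            key = "/".join([prefix] + rel.split(os.sep))
            pairs.append((local, key))
    return pairs
```

Each returned pair could then be handed to an upload call; file names inside the folder are preserved in the key.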

class api_etl.utils_misc.StationProvider

Bases: object

Class to easily get lists of stations in GTFS format (7 digits) or Transilien’s format (8 digits).

Warning: data sources have to be checked (“all” is ok, “top” is wrong).

get_station_ids(stations='all', gtfs_format=False)

Get stations ids either in API format (8 digits), or in GTFS format (7 digits).

Beware: this function needs more testing.

Beware: there are two formats:
  • 8-digit format to query the API
  • 7-digit format to query GTFS files

Parameters:
  • stations
  • gtfs_format
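
To make the two formats concrete, here is a minimal sketch of a conversion between them. It assumes the 7-digit id is the 8-digit id without its last digit (in UIC station codes the eighth digit is commonly a check digit); the source does not state how the two formats are related, and `api_to_gtfs_id` is a hypothetical helper, not part of this module.

```python
def api_to_gtfs_id(station_id):
    """Turn an 8-digit API station id into a 7-digit id.

    Assumption: the 7-digit form is the 8-digit form minus its final digit.
    """
    sid = str(station_id)
    if len(sid) != 8 or not sid.isdigit():
        raise ValueError("expected an 8-digit station id, got %r" % (station_id,))
    return sid[:-1]


print(api_to_gtfs_id("87686006"))  # 8768600
```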

get_stations_per_line(lines=None, uic7=False, full_df=False)

Get the stations of a given line (multiple lines possible).

Parameters:
  • lines
  • uic7
  • full_df

api_etl.utils_misc.build_uri(db_type, host, user=None, password=None, port=None, database=None)
api_etl.utils_misc.chunks(l, n)

Split a list into ‘n’ lists of nearly equal size (some may hold one more element than others) and yield them one by one.

Parameters:
  • l (list) – list you want to divide in chunks
  • n (int) – number of chunks you want to get
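
One possible way to get this behaviour is sketched below. The name `chunks_sketch` is hypothetical, and the real function may distribute the extra elements differently (here the first chunks receive them).

```python
def chunks_sketch(l, n):
    """Yield n consecutive slices of l whose lengths differ by at most one."""
    size, extra = divmod(len(l), n)
    start = 0
    for i in range(n):
        # The first `extra` chunks get one additional element.
        end = start + size + (1 if i < extra else 0)
        yield l[start:end]
        start = end


print(list(chunks_sketch(list(range(7)), 3)))  # [[0, 1, 2], [3, 4], [5, 6]]
```
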
api_etl.utils_misc.get_paris_local_datetime_now(tz_naive=True)

Return the current Paris local time (necessary when the code runs on machines set to other time zones).

Parameters:
  • tz_naive
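
A minimal sketch of the idea using the standard library's `zoneinfo` (Python 3.9+, with time zone data available on the system). The name `paris_now` is hypothetical, and the original function may rely on a different time zone library.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def paris_now(tz_naive=True):
    """Current wall-clock time in Paris, optionally without tzinfo attached."""
    now = datetime.now(ZoneInfo("Europe/Paris"))
    # Dropping tzinfo keeps the Paris wall-clock values but makes the object naive.
    return now.replace(tzinfo=None) if tz_naive else now


print(paris_now())                # naive datetime, Paris wall-clock time
print(paris_now(tz_naive=False))  # aware datetime with a +01:00 or +02:00 offset
```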

api_etl.utils_misc.get_responding_stations_from_sample(sample_loc=None, write_loc=None)

Extract the responding stations from a given “real_departures” sample and write them to a file, so that only the necessary stations are queried afterwards (which avoids spending API credits on unnecessary stations).

Parameters:
  • sample_loc
  • write_loc

api_etl.utils_misc.s3_ressource()
api_etl.utils_misc.set_logging_conf(log_name, level='INFO')