diff --git a/README.md b/README.md
index 6141d49..90654e8 100644
--- a/README.md
+++ b/README.md
@@ -81,12 +81,12 @@ cities(format='df')
 
 #### `Cities` parameters
 
-| Name | Required | Description | Type | Default value | Example |
-|---|---|---|---|---|---|
-| `state_id` | ❌ | ID of the state | string | `None` | `'b112ffbe-17b3-4ad0-8f2a-2038745d1d14'` |
-| `city_id` | ❌ | ID of the city | string | `None` | `'88959ad9-b2f5-4a33-a8ec-ceff5a572ca5'` |
-| `city_name` | ❌ | Name of the city | string | `None` | `'Rio de Janeiro'` |
-| `format` | ❌ | Format of the result | string | `'dict'` | `'dict'`, `'df'` or `'geodf'` |
+| Name | Required | Description | Type | Default value | Example |
+|-------------|----------|----------------------|--------|---------------|------------------------------------------|
+| `state_id` | ❌ | ID of the state | string | `None` | `'b112ffbe-17b3-4ad0-8f2a-2038745d1d14'` |
+| `city_id` | ❌ | ID of the city | string | `None` | `'88959ad9-b2f5-4a33-a8ec-ceff5a572ca5'` |
+| `city_name` | ❌ | Name of the city | string | `None` | `'Rio de Janeiro'` |
+| `format` | ❌ | Format of the result | string | `'dict'` | `'dict'`, `'df'` or `'geodf'` |
 
 ### Listing occurrences
 
@@ -114,37 +114,28 @@ occurrences('813ca36b-91e3-4a18-b408-60b27a1942ef', format='geodf')
 
 #### `Occurrences` parameters
 
-| Name | Required | Description | Type | Default value | Example |
-|---|---|---|---|---|--------------------------------------------------------------------------------------------------------------------------------|
-| `id_state` | ✅ | ID of the state | string | `None` | `'b112ffbe-17b3-4ad0-8f2a-2038745d1d14'` |
-| `id_cities` | ❌ | ID of the city | string or list of strings | `None` | `'88959ad9-b2f5-4a33-a8ec-ceff5a572ca5'` or `['88959ad9-b2f5-4a33-a8ec-ceff5a572ca5', '9d7b569c-ec84-4908-96ab-3706ec3bfc57']` |
-| `type_occurrence` | ❌ | Type of occurrence | string | `'all'` | `'all'`, `'withVictim'` or `'withoutVictim'` |
-| `initial_date` | ❌ | Initial date of the occurrences | string, `date` or `datetime` | `None` | `'2020-01-01'`, `'2020/01/01'`, `'20200101'`, `datetime.datetime(2023, 1, 1)` or `datetime.date(2023, 1, 1)` |
-| `final_date` | ❌ | Final date of the occurrences | string, `date` or `datetime` | `None` | `'2020-01-01'`, `'2020/01/01'`, `'20200101'`, `datetime.datetime(2023, 1, 1)` or `datetime.date(2023, 1, 1)` |
-| `max_parallel_requests` | ❌ | Maximum number of parallel requests to the API | int | `16` | `32` |
-| `format` | ❌ | Format of the result | string | `'dict'` | `'dict'`, `'df'` or `'geodf'` |
+| Name | Required | Description | Type | Default value | Example |
+|-------------------------|----------|------------------------------------------------|------------------------------|---------------|--------------------------------------------------------------------------------------------------------------------------------|
+| `id_state` | ✅ | ID of the state | string | `None` | `'b112ffbe-17b3-4ad0-8f2a-2038745d1d14'` |
+| `id_cities` | ❌ | ID of the city | string or list of strings | `None` | `'88959ad9-b2f5-4a33-a8ec-ceff5a572ca5'` or `['88959ad9-b2f5-4a33-a8ec-ceff5a572ca5', '9d7b569c-ec84-4908-96ab-3706ec3bfc57']` |
+| `type_occurrence` | ❌ | Type of occurrence | string | `'all'` | `'all'`, `'withVictim'` or `'withoutVictim'` |
+| `initial_date` | ❌ | Initial date of the occurrences | string, `date` or `datetime` | `None` | `'2020-01-01'`, `'2020/01/01'`, `'20200101'`, `datetime.datetime(2023, 1, 1)` or `datetime.date(2023, 1, 1)` |
+| `final_date` | ❌ | Final date of the occurrences | string, `date` or `datetime` | `None` | `'2020-01-01'`, `'2020/01/01'`, `'20200101'`, `datetime.datetime(2023, 1, 1)` or `datetime.date(2023, 1, 1)` |
+| `max_parallel_requests` | ❌ | Maximum number of parallel requests to the API | int | `16` | `32` |
+| `format` | ❌ | Format of the result | string | `'dict'` | `'dict'`, `'df'` or `'geodf'` |
+| `flat` | ❌ | Return nested columns as separate columns | bool | `False` | `True` or `False` |
 
-#### Flattening `Occurrences` columns
+##### About `flat` parameter
 
-The flatten function is designed to simplify the analysis of occurrence data by flattening nested information found in specific columns. Nested information is commonly found in columns such as `contextInfo`, `state`, `region`, `city`, `neighborhood`, and `locality`.
+Occurrence data often contains nested information in several columns. By setting the parameter `flat=True`, you can simplify the analysis by separating nested data into individual columns. This feature is particularly useful for columns such as `contextInfo`, `state`, `region`, `city`, `neighborhood`, and `locality`.
 
-So, to access information about the contexto of occurrences, for an instance, identify its main reason, one might need to access the `contextInfo` column and then the `mainReason` key. The flatten function simplifies this process by creating new columns with the nested information as suffixes.
+For example, to access detailed information about the context of occurrences, such as identifying the main reason, you would typically need to access the `contextInfo` column and then look for the mainReason key. With the `flat=True` parameter, this nested information is automatically split into separate columns, making the data easier to work with.
 
-##### Usages
-
-```python
-from crossfire.clients.occurrences import flatten
-flatten(data, nested_columns=["contextInfo"])
-```
-
-* `data`: The input data containing occurrence information.
-* `nested_columns`: A list of column names to be flattened. If no columns are specified, all columns containing nested information will be flattened. If the column name is not in the list of columns with nested information, the function will raise an `NestedColumnError`.
-
-The function returns occurrences with the flattened columns. Each flattened column retains the original column name as a prefix and nested column as a suffix. For example, the `contextInfo` column will be flattened into `contextInfo_mainReason`, `contextInfo_complementaryReasons`, `contextInfo_clippings`, `contextInfo_massacre`, and `contextInfo_policeUnit`.
+When `flat=True` is set, the function returns occurrences with the flattened columns. Each new column retains the original column name as a prefix and the nested key as a suffix. For instance, the `contextInfo` column will be split into the following columns: `contextInfo_mainReason`, `contextInfo_complementaryReasons`, `contextInfo_clippings`, `contextInfo_massacre`, and `contextInfo_policeUnit`.
 
-##### Example
+###### Example
 
 ```python
 from crossfire import occurrences
@@ -153,11 +144,12 @@ from crossfire.clients.occurrences import flatten
 occs = occurrences('813ca36b-91e3-4a18-b408-60b27a1942ef')
 occs[0].keys()
 # dict_keys(['id', 'documentNumber', 'address', 'state', 'region', 'city', 'neighborhood', 'subNeighborhood', 'locality', 'latitude', 'longitude', 'date', 'policeAction', 'agentPresence', 'relatedRecord', 'contextInfo', 'transports', 'victims', 'animalVictims'])
-flattened_occs = flatten(occs, nested_columns=['contextInfo'])
+flattened_occs = occurrences('813ca36b-91e3-4a18-b408-60b27a1942ef', flat=True)
 occs[0].keys()
 # dict_keys(['id', 'documentNumber', 'address', 'state', 'region', 'city', 'neighborhood', 'subNeighborhood', 'locality', 'latitude', 'longitude', 'date', 'policeAction', 'agentPresence', 'relatedRecord', 'transports', 'victims', 'animalVictims', 'contextInfo', 'contextInfo_mainReason', 'contextInfo_complementaryReasons', 'contextInfo_clippings', 'contextInfo_massacre', 'contextInfo_policeUnit'])
 ```
 
+By using the `flat=True parameter`, you ensure that all nested data is expanded into individual columns, simplifying data analysis and making it more straightforward to access specific details within your occurrence data.
 ### Custom client