update README adding `flat` as Occurrences parameter and clarification about its usage.
FelipeSBarros committed May 19, 2024
1 parent 22ef83d commit efd25bd
Showing 1 changed file (README.md) with 23 additions and 31 deletions.

#### `Cities` parameters

| Name | Required | Description | Type | Default value | Example |
|-------------|----------|----------------------|--------|---------------|------------------------------------------|
| `state_id` | | ID of the state | string | `None` | `'b112ffbe-17b3-4ad0-8f2a-2038745d1d14'` |
| `city_id` | | ID of the city | string | `None` | `'88959ad9-b2f5-4a33-a8ec-ceff5a572ca5'` |
| `city_name` | | Name of the city | string | `None` | `'Rio de Janeiro'` |
| `format` | | Format of the result | string | `'dict'` | `'dict'`, `'df'` or `'geodf'` |


### Listing occurrences

#### `Occurrences` parameters

| Name | Required | Description | Type | Default value | Example |
|-------------------------|----------|------------------------------------------------|------------------------------|---------------|--------------------------------------------------------------------------------------------------------------------------------|
| `id_state` || ID of the state | string | `None` | `'b112ffbe-17b3-4ad0-8f2a-2038745d1d14'` |
| `id_cities` || ID of the city | string or list of strings | `None` | `'88959ad9-b2f5-4a33-a8ec-ceff5a572ca5'` or `['88959ad9-b2f5-4a33-a8ec-ceff5a572ca5', '9d7b569c-ec84-4908-96ab-3706ec3bfc57']` |
| `type_occurrence` || Type of occurrence | string | `'all'` | `'all'`, `'withVictim'` or `'withoutVictim'` |
| `initial_date` || Initial date of the occurrences | string, `date` or `datetime` | `None` | `'2020-01-01'`, `'2020/01/01'`, `'20200101'`, `datetime.datetime(2023, 1, 1)` or `datetime.date(2023, 1, 1)` |
| `final_date` || Final date of the occurrences | string, `date` or `datetime` | `None` | `'2020-01-01'`, `'2020/01/01'`, `'20200101'`, `datetime.datetime(2023, 1, 1)` or `datetime.date(2023, 1, 1)` |
| `max_parallel_requests` || Maximum number of parallel requests to the API | int | `16` | `32` |
| `format` || Format of the result | string | `'dict'` | `'dict'`, `'df'` or `'geodf'` |
| `flat` || Return nested columns as separate columns | bool | `False` | `True` or `False` |
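As the table above shows, `initial_date` and `final_date` accept strings in several layouts as well as `date` and `datetime` objects. A minimal sketch of how such inputs could be normalised to a single `datetime.date` (illustrative only; `parse_occurrence_date` is a hypothetical helper, not part of the package):

```python
from datetime import date, datetime


def parse_occurrence_date(value):
    """Normalise the date inputs listed above to a datetime.date.

    Illustrative sketch only, not the package's actual parser.
    """
    # datetime must be checked first, since it is a subclass of date
    if isinstance(value, datetime):
        return value.date()
    if isinstance(value, date):
        return value
    for fmt in ("%Y-%m-%d", "%Y/%m/%d", "%Y%m%d"):
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unsupported date format: {value!r}")


parse_occurrence_date("2020-01-01")  # datetime.date(2020, 1, 1)
```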


##### About `flat` parameter

Occurrence data often contains nested information in several columns. By setting the parameter `flat=True`, you can simplify the analysis by separating nested data into individual columns. This feature is particularly useful for columns such as `contextInfo`, `state`, `region`, `city`, `neighborhood`, and `locality`.

For example, to access detailed information about the context of occurrences, such as identifying the main reason, you would typically need to access the `contextInfo` column and then look for the `mainReason` key. With the `flat=True` parameter, this nested information is automatically split into separate columns, making the data easier to work with.
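The difference can be illustrated with a toy record (field names follow this README; the values are made up, not real API data):

```python
# Nested shape, as returned without flat=True: the main reason is
# buried inside the contextInfo column.
nested = {"id": "abc", "contextInfo": {"mainReason": {"name": "Police operation"}}}
reason = nested["contextInfo"]["mainReason"]["name"]

# Flat shape, as returned with flat=True: the same value is exposed
# under a prefixed top-level column.
flat_record = {"id": "abc", "contextInfo_mainReason": {"name": "Police operation"}}
reason_flat = flat_record["contextInfo_mainReason"]["name"]

assert reason == reason_flat == "Police operation"
```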

When `flat=True` is set, the function returns occurrences with the flattened columns. Each new column retains the original column name as a prefix and the nested key as a suffix. For instance, the `contextInfo` column will be split into the following columns: `contextInfo_mainReason`, `contextInfo_complementaryReasons`, `contextInfo_clippings`, `contextInfo_massacre`, and `contextInfo_policeUnit`.
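The prefix-and-suffix naming convention above can be sketched in plain Python (an illustrative one-level flatten, not the package's actual implementation):

```python
def flatten_record(record, nested_columns):
    """Expand one level of nesting into '<column>_<key>' entries.

    Illustrative sketch of the naming convention only; the crossfire
    package applies this convention internally when flat=True is set.
    """
    out = dict(record)
    for column in nested_columns:
        value = record.get(column)
        if isinstance(value, dict):
            for key, nested_value in value.items():
                # e.g. contextInfo + mainReason -> contextInfo_mainReason
                out[f"{column}_{key}"] = nested_value
    return out


record = {"contextInfo": {"mainReason": "shooting", "massacre": False}}
flatten_record(record, ["contextInfo"])
# keeps 'contextInfo' and adds 'contextInfo_mainReason' and 'contextInfo_massacre'
```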


###### Example

```python
from crossfire import occurrences
occs = occurrences('813ca36b-91e3-4a18-b408-60b27a1942ef')
occs[0].keys()
# dict_keys(['id', 'documentNumber', 'address', 'state', 'region', 'city', 'neighborhood', 'subNeighborhood', 'locality', 'latitude', 'longitude', 'date', 'policeAction', 'agentPresence', 'relatedRecord', 'contextInfo', 'transports', 'victims', 'animalVictims'])
flattened_occs = occurrences('813ca36b-91e3-4a18-b408-60b27a1942ef', flat=True)
flattened_occs[0].keys()
# dict_keys(['id', 'documentNumber', 'address', 'state', 'region', 'city', 'neighborhood', 'subNeighborhood', 'locality', 'latitude', 'longitude', 'date', 'policeAction', 'agentPresence', 'relatedRecord', 'transports', 'victims', 'animalVictims', 'contextInfo', 'contextInfo_mainReason', 'contextInfo_complementaryReasons', 'contextInfo_clippings', 'contextInfo_massacre', 'contextInfo_policeUnit'])
```

By using the `flat=True` parameter, you ensure that all nested data is expanded into individual columns, simplifying data analysis and making it more straightforward to access specific details within your occurrence data.

### Custom client

