Go packages in protos use incorrect repo #16

Closed
tims opened this issue Dec 26, 2018 · 1 comment

tims commented Dec 26, 2018

The `go_package` options in the protos still point at gojektech:
option go_package = "github.com/gojektech/..."

Should be:
option go_package = "github.com/gojek/..."
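
A minimal sketch of how the rename could be applied across proto files, assuming the only change needed is the organisation segment of the import path. The function name `fix_go_package` and the sample package path are hypothetical; the real paths are elided in this issue.

```python
import re

# Matches the org segment of a go_package option that still points at gojektech.
# The lookahead keeps the trailing "/" so the rest of the path is left untouched.
_GO_PACKAGE = re.compile(
    r'(option\s+go_package\s*=\s*"github\.com/)gojektech(?=/)'
)


def fix_go_package(proto_text: str) -> str:
    """Rewrite go_package options from github.com/gojektech/ to github.com/gojek/."""
    return _GO_PACKAGE.sub(r"\g<1>gojek", proto_text)


# Hypothetical proto line, for illustration only.
sample = 'option go_package = "github.com/gojektech/example/pkg";'
print(fix_go_package(sample))
```

Lines without a matching `go_package` option pass through unchanged, so the function can be run over whole files.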

zhilingc commented Jan 3, 2019

Resolved.

zhilingc closed this as completed Jan 3, 2019
Yanson pushed a commit to Yanson/feast that referenced this issue Jul 29, 2020
…-ingestion

Closes KE-609 - read data from Kafka
Closes KE-636 - write data into ADLS Gen2
Closes KE-655 - write data into Redis

Added a Spark ingestion job reading from Kafka into Delta Lake storage and Redis.

Created an integration test that runs against local Kafka, Redis and Spark, ingests 128 random features into Redis and 128 random features into Delta with different data types, and checks the result.

Because Spark only runs on Java 8, the integration test is skipped in the CI build (which runs under Java 11) but runs in the e2e tests, when the emulator container (including the ingestion jar) is built.

Delta tables are automatically created and partitioned by day.
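
As a rough illustration of the day partitioning, the sketch below derives a partition directory name of the form `event_timestamp_day=YYYY-MM-DD` from an event timestamp. The helper `partition_dir` is hypothetical, and it assumes the day is taken from the timestamp as-is, with no timezone conversion; the actual job may differ.

```python
from datetime import datetime


def partition_dir(event_timestamp: datetime,
                  column: str = "event_timestamp_day") -> str:
    """Return a Hive-style partition directory name for the timestamp's day."""
    # Assumption: the partition value is the calendar date of the timestamp.
    return f"{column}={event_timestamp.date().isoformat()}"


ts = datetime(2020, 6, 8, 7, 47, 4, 931000)
print(partition_dir(ts))
```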


```bash
# run SparkIngestionTest.java
$ ls
/var/folders/67/59_hhx6d5lz0wg__x35g8wbw0000gn/T/junit9193296653579758585/bXlwcm9qZWN0/bXlwcm9qZWN0L2ZlYXR1cmVfc2V0X2Zvcl9kZWx0YQ==/event_timestamp_day=2020-06-08/part-00002-abd1fbe1-c773-4d83-972d-5193c75885e5.c000.snappy.parquet
/var/folders/67/59_hhx6d5lz0wg__x35g8wbw0000gn/T/junit9193296653579758585/bXlwcm9qZWN0/bXlwcm9qZWN0L2ZlYXR1cmVfc2V0X2Zvcl9kZWx0YQ==/event_timestamp_day=2020-06-08/part-00004-440cca76-fdc1-437a-93f2-d6739296cfe4.c000.snappy.parquet
/var/folders/67/59_hhx6d5lz0wg__x35g8wbw0000gn/T/junit9193296653579758585/bXlwcm9qZWN0/bXlwcm9qZWN0L2ZlYXR1cmVfc2V0X2Zvcl9kZWx0YQ==/event_timestamp_day=2020-06-08/part-00005-bb670850-3001-493b-9000-30b9b41f81e6.c000.snappy.parquet
...
# bXlwcm9qZWN0 is base64 for "myproject"
# bXlwcm9...9kZWx0YQ== is base64 for "myproject/feature_set_for_delta"
```
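
The directory names above can be decoded with a few lines of Python. This is a sketch that assumes the standard base64 alphabet and UTF-8 names; the helper names are hypothetical.

```python
import base64


def decode_segment(segment: str) -> str:
    """Decode a base64 path segment back to its readable name."""
    return base64.b64decode(segment).decode("utf-8")


def encode_segment(name: str) -> str:
    """Encode a readable name as a base64 path segment."""
    return base64.b64encode(name.encode("utf-8")).decode("ascii")


print(decode_segment("bXlwcm9qZWN0"))
print(decode_segment("bXlwcm9qZWN0L2ZlYXR1cmVfc2V0X2Zvcl9kZWx0YQ=="))
```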

```python
>>> import pandas as pd
>>> df=pd.read_parquet("/var/folders/67/59_hhx6d5lz0wg__x35g8wbw0000gn/T/junit9193296653579758585/bXlwcm9qZWN0/bXlwcm9qZWN0L2ZlYXR1cmVfc2V0X2Zvcl9kZWx0YQ==")
>>> df.iloc[1]
event_timestamp        2020-06-08 07:47:04.931000
created_timestamp      2020-06-08 07:47:43.915000
ingestion_id                              testjob
entity_id_primary                     -2101962939
entity_id_secondary               iAkKJCqcry6NTS4
f_BYTES                        b'Smts8audYXVOYTw'
f_STRING                          gK1chlMFi2Btbdd
f_INT32                                -951335350
f_INT64                       3659296699309908912
f_DOUBLE                                 0.697594
f_FLOAT                                  0.879062
f_BOOL                                       True
f_STRING_LIST                   [Xx92gwGy2PUQCLl]
f_INT32_LIST                        [-1400465827]
f_INT64_LIST                [6251409099521094342]
f_DOUBLE_LIST               [0.11672805668860675]
f_FLOAT_LIST                          [0.6432213]
f_BOOL_LIST                               [False]
event_timestamp_day                    2020-06-08
Name: 1, dtype: object
```

The spark-ingestion uses several classes copied and adapted from feast-ingestion and feast-storage-connector-redis. To reduce merge conflicts downstream, I've kept those classes as close as possible to the original. When we approach PR submission into public Feast, we can work on creating shared projects for both ingestion modes.

A few data types are disabled in the tests because they fail the equality check, although on manual inspection the values look correct. This still needs debugging.