Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

CUQ-5: Mortgage example using gQuant. #9

Merged

Conversation

avolkov1
Copy link
Contributor

Draft pull request for review. I still need to clean up the code and finish up the jupyter notebook. But implementation wise it's all there.

@avolkov1
Copy link
Contributor Author

avolkov1 commented Jul 19, 2019

Review my current progress so far.
https://github.com/avolkov1/gQuant/tree/CUQ-5-Mortgage-Example-Using-qQuant/notebook/mortgage_e2e_gquant

Jupyter notebook:

  • mortgage_e2e_gquant.ipynb

modules:

  • mortgage_common.py
  • mortgage_gquant_plugins.py

scripts to run (python <script>):

  • mortgage_run_workflow_local.py
  • mortgage_run_workflow_daskdistrib.py

TODO before ready to merge:
1.
Investigate the TypeError being thrown in node.py
https://github.com/rapidsai/gQuant/blob/b94c987516f2b60118b4147f179541ec86574247/gquant/dataframe_flow/node.py#L215

Type checking not being handled correctly for a corner case with cudf dataframes when cudf.read_csv is used. The cudf.read_csv differs a bit from pandas version hence I think type mismatch.

Finish the notebook. Add a conclusion/summary, etc.

Just go over the code and add docstrings, comments, etc.

Copy link
Collaborator

@yidong72 yidong72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are some little things that need to be fixed like qquant and it is better to remove those commented out code for readability.

Copy link
Contributor Author

@avolkov1 avolkov1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will push the latest edits very soon. Been working on the various corner cases with logger in dask distributed workers when running in jupyter lab/notebook.

@avolkov1 avolkov1 force-pushed the CUQ-5-Mortgage-Example-Using-qQuant branch from b94c987 to 8317304 Compare July 23, 2019 23:00
Check category and date types for cudf dataframes.
Add mortgage workflow implemented with gQuant.
Finish jupyter notebook example.
Use input references in workflow definition.
@avolkov1 avolkov1 force-pushed the CUQ-5-Mortgage-Example-Using-qQuant branch from 8317304 to 3276b57 Compare July 23, 2019 23:19
@avolkov1
Copy link
Contributor Author

I squashed and pushed all the latest changes.
avolkov1@3276b57

If you cloned my CUQ-5-Mortgage-Example-Using-qQuant branch, you might have to delete and re-clone.

I finished the TODOs I listed for myself.

  1. Investigate the TypeError being thrown in node.py

https://github.com/avolkov1/gQuant/blob/3276b578dadedad64809ae4171cfb83644224b71/gquant/dataframe_flow/node.py#L198-L221

You have to be careful comparing types. Comparing a pandas dtype with numpy dtype throws a type error. Convert to common comparison type i.e. string then compare.

  1. Finished the notebook.
    https://github.com/avolkov1/gQuant/blob/CUQ-5-Mortgage-Example-Using-qQuant/notebook/mortgage_e2e_gquant/mortgage_e2e_gquant.ipynb

  2. Just go over the code and add docstrings, comments, etc.
    I still need/should add docstrings to various functions and classes.

@avolkov1 avolkov1 marked this pull request as ready for review July 23, 2019 23:29
Copy link
Collaborator

@yidong72 yidong72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me now.

I am going to merge it.

@yidong72 yidong72 merged commit 65e2d02 into NVIDIA:develop Jul 24, 2019
@avolkov1
Copy link
Contributor Author

Thanks!

Going over the code today adding docstrings to various functions and classes, and cleaning up the logger logic a bit. I'll make another PR on another branch when I finish.

@avolkov1 avolkov1 deleted the CUQ-5-Mortgage-Example-Using-qQuant branch July 24, 2019 21:42
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants