Thanks for the contribution! Reviews inline.
# Fix random seed for reproducibility
np.random.seed(7)

start = time.time()
It's better to time only the training, as the data-generation code is different in pure MXNet (there is an extra step to create iterators).
Good point, fixed.
from scipy.sparse import csc_matrix

# Fix random seed for reproducibility
np.random.seed(7)
Can we create a common file to run the benchmark that generates a random dataset instead of using a fixed seed? Something like `run_sparse_benchmark.py`, where the logic would be:
1. Generate a random dataset
2. Pass the dataset to `keras_tf_model.py` to train; it returns acc, loss, and time to train
3. Pass the dataset to `mxnet_sparse_model.py` to train; it returns acc, loss, and time to train
This way you can make sure the same data is passed to both, and each run uses freshly generated random data (see the sketch after this list).
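A minimal sketch of what such a driver could look like, assuming both model files expose a `train(data, labels)` entry point returning (acc, loss, time to train); the function name, dataset shape, and density here are illustrative assumptions, not the actual module APIs:

```python
# run_sparse_benchmark.py - hypothetical driver sketch
import numpy as np
from scipy import sparse

import keras_tf_model      # assumed to expose train(data, labels)
import mxnet_sparse_model  # assumed to expose train(data, labels)


def run_benchmark(n_samples=10000, n_features=100, density=0.01):
    # Generate one random sparse dataset, shared by both frameworks
    train_data = sparse.random(n_samples, n_features, density=density, format="csr")
    train_label = np.random.rand(n_samples, 1)

    for name, module in [("keras-tf", keras_tf_model), ("mxnet", mxnet_sparse_model)]:
        acc, loss, train_time = module.train(train_data, train_label)
        print("%s: acc=%.4f loss=%.4f time=%.2fs" % (name, acc, loss, train_time))


if __name__ == "__main__":
    run_benchmark()
```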
Done
A few minor comments.
start = time.time()
model.fit(train_data, train_label,
          epochs=1000,
The epoch count can be a configurable param.
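For example, a minimal argparse sketch; the flag names match the run command quoted later in this thread, and the defaults are illustrative:

```python
import argparse

# Hypothetical CLI wiring for the benchmark script
parser = argparse.ArgumentParser()
parser.add_argument("--batch", type=int, default=128)
parser.add_argument("--epochs", type=int, default=1000)
args = parser.parse_args()

# model, train_data, and train_label are from the snippet above;
# use the parsed values instead of the hard-coded epochs=1000
model.fit(train_data, train_label, epochs=args.epochs, batch_size=args.batch)
```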
Done
def invoke_benchmark(batch_size, mode):
    # Fix random seed for reproducibility
    np.random.seed(7)
Now we can remove this fixed seed, since the same data is passed to both modes.
Done
### Results
| Instance Type | GPUs | Batch Size | MXNet (Time/Epoch) | Keras-TensorFlow (Time/Epoch) |
We can mention here that if users want to reproduce the exact result, they can use the fixed seed 7. In general, we should observe consistent/similar results with different random datasets.
Or we can show an average of a few runs (a sketch follows).
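A small sketch of what the averaging could look like; `run_once` is a hypothetical callable that runs one benchmark pass on a fresh random dataset and returns the time to train:

```python
import numpy as np

def average_runs(run_once, n_runs=5):
    # Each run uses a fresh random dataset, so the mean/std describe
    # typical performance rather than one seed-specific result
    times = [run_once() for _ in range(n_runs)]
    return np.mean(times), np.std(times)
```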
Fixed
@@ -124,4 +124,4 @@ script:
  PYTHONPATH=$PWD:$PYTHONPATH py.test tests/test_documentation.py;
else
  PYTHONPATH=$PWD:$PYTHONPATH py.test tests/ --ignore=tests/integration_tests --ignore=tests/test_documentation.py --ignore=tests/keras/legacy/layers_test.py --cov-config .coveragerc --cov=keras tests/;
fi
fi
Keep this the same as the dev branch.
It is the same as dev - for some reason it's showing that there is a diff.
The diff is that an extra line at the end of the file was deleted.
LGTM! Please fix the `.travis.yml` diff before merging.
Thanks for your contributions, Kalyanee!
Is any metric on memory consumption applicable?
metric = mx.metric.MSE()
mse = model.score(eval_iter, metric)
print("Achieved {0:.6f} validation MSE".format(mse[0][1]))
assert model.score(eval_iter, metric)[0][1] < 0.01001, \
    "Achieved MSE (%f) is larger than expected (0.01001)" % mse[0][1]
Nit: why assert in a benchmark script?
Same comment for the TF script above.
Also, we are generating random data, so accuracy/loss doesn't add any value here?
For the first comment - I followed the style of the MXNet linear regression benchmarking script: https://mxnet.incubator.apache.org/tutorials/python/linear-regression.html
As we have pre-defined train and test sets, even if they are synthetic, loss/accuracy can still help us understand how the model is performing.
Didn't track memory usage, as we just wanted to compare the speed of MXNet vs Keras-TF on sparse tensors.
Agreed that accuracy/loss has no value for synthetic data; even if the accuracy increases, it does not mean the model is performing better.
You can extend this in the future if you want to include a real dataset, and only check acc/loss on the real dataset.
About memory consumption: it's a good value-add, as sparse tensors save a lot of memory, and it's very easy to report. Just run on a GPU instance and report the output of nvidia-smi.
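For example, a hedged sketch that samples GPU memory through nvidia-smi's CSV query interface (this assumes `nvidia-smi` is on the PATH, as it is on standard CUDA GPU instances):

```python
import subprocess

def gpu_memory_used_mib():
    # Query per-GPU used memory in MiB, one line per GPU
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"])
    return [int(line) for line in out.decode().strip().splitlines()]

# Call before and after training to estimate the model's footprint
print(gpu_memory_used_mib())
```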
Yes, that's a good point - I will add memory usage in another PR. This one only tracks CPU performance.
### Configuration
| Dataset | Synthetic (randomly generated) |
| :--------------- | :----------------------------------------------------------- |
| Keras | v2.2.0 |
2.2.2?
Updated.
| C5.18X Large | 0 | 128 | 5.72 sec | 9.86 sec |

### Note
For reproducing above results set seed to `7` by adding this line in the `run_sparse_benchmark` script - `np.random.seed(7)`
Why don't we add this line rather than asking the user to add it?
We wanted to avoid the impression that this benchmark is reproducible on only one dataset. So, as per comments by @roywei, we moved it into the README as a note on how to reproduce these exact values.
Or we can report an average of multiple runs. My point is that better performance on one fixed seed is not convincing enough.
### Note
For reproducing above results set seed to `7` by adding this line in the `run_sparse_benchmark` script - `np.random.seed(7)`

Run the file as `python run_sparse_benchmark.py --batch=128 --epochs=1000`
Why 1000 epochs? 25 should be good enough?
25 epochs doesn't converge well for either backend when the MSE limit of 0.01001 is set: with 25 epochs, the MXNet model results in MSE 0.020521 and Keras-TF results in MSE 0.017107.
Results below show the performance comparison of linear regression with MXNet vs Keras-TensorFlow using sparse tensors

### Configuration
Please add a comment about the sparse type - CSR.
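For instance, a one-line comment next to the dataset generation would cover it; this is only a sketch, and the shape/density values are illustrative:

```python
from scipy import sparse

# Sparse type used in this benchmark: CSR (compressed sparse row)
train_data = sparse.random(10000, 100, density=0.01, format="csr")
```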
Done.
from scipy import sparse


def invoke_benchmark(batch_size, epochs):
Nit: it would be useful going forward to have two functions, invoke_tf_keras_benchmark() and invoke_mxnet_benchmark(), and maybe take an option for the framework name from the CLI? A sketch is below.
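A sketch of the suggested split; the function bodies are elided, and the `--framework` flag is a hypothetical addition to the existing CLI:

```python
import argparse

def invoke_tf_keras_benchmark(batch_size, epochs):
    ...  # build and train the Keras-TF model, report time to train

def invoke_mxnet_benchmark(batch_size, epochs):
    ...  # build and train the MXNet model, report time to train

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--framework", choices=["keras-tf", "mxnet", "both"],
                        default="both")
    parser.add_argument("--batch", type=int, default=128)
    parser.add_argument("--epochs", type=int, default=1000)
    args = parser.parse_args()
    if args.framework in ("keras-tf", "both"):
        invoke_tf_keras_benchmark(args.batch, args.epochs)
    if args.framework in ("mxnet", "both"):
        invoke_mxnet_benchmark(args.batch, args.epochs)
```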
Yes, that was there in earlier revisions - we decided to go forward with a single function that invokes both benchmarks.
Summary
Added primary benchmarking model for sparse tensors with synthetic data