This repository was archived by the owner on Nov 16, 2023. It is now read-only.

Set of new timeseries transforms #475

Open
wants to merge 44 commits into
base: master
Changes from 1 commit
284fcd7
Use latest ML.Net dev packages from MachineLearning feed.
Jan 17, 2020
ad00b70
Re-enable the default nuget.org feed. It does not appear to cause
Jan 17, 2020
258a799
Add whitespace change to restart CI build. Linux timed out.
Jan 21, 2020
c542c1d
Fix build issue when using pip version >= 20.0.0
Jan 21, 2020
4c5bac1
Merge branch 'master' into nuget_update
Jan 21, 2020
5423d6a
Merge branch 'master' into nightly
actions-user Jan 24, 2020
324d379
Merge branch 'master' into nightly
actions-user Jan 24, 2020
b3ed66b
Merge branch 'master' into nightly
actions-user Jan 25, 2020
5924fdc
Merge branch 'master' into nightly
ganik Mar 25, 2020
5feb56d
preview3
ganik Mar 25, 2020
fed9aa2
fix signing
ganik Mar 26, 2020
039356a
run ep only if VerifyManifest
ganik Mar 26, 2020
cbe0e75
draft of timeseries transforms
ganik Mar 31, 2020
6a2a913
Updated with latest changes
ganik Apr 19, 2020
7ddbba5
Merge branch 'master' into tsaml
ganik Apr 19, 2020
7ae2fa9
add unit tests
ganik Apr 19, 2020
e3196c7
Add timeseries transforms to onnx suite test.
ganik Apr 20, 2020
d6ae18f
Add automl ONNX tests
ganik Apr 20, 2020
7244760
0.4.0 version for Featurizers
ganik Apr 20, 2020
f333452
Featurizer Onnx Export tests (#484)
angryjinyan Apr 27, 2020
ac5ce11
Add tests for DateTimeSplitter with country (#486)
angryjinyan Apr 27, 2020
0d5e594
install ort-featurizers
ganik Apr 28, 2020
fa35f9f
Merge branch 'tsaml' of https://github.com/microsoft/NimbusML into tsaml
ganik Apr 28, 2020
634597d
fix feed
ganik Apr 28, 2020
7819fdb
Merge branch 'master' into tsaml
ganik Apr 28, 2020
f223671
update version for ort-featurizers
ganik Apr 28, 2020
1e601e7
Merge branch 'tsaml' of https://github.com/microsoft/NimbusML into tsaml
ganik Apr 28, 2020
ca0eaa8
fix tests
ganik Apr 28, 2020
f234b7c
skip ts checks
ganik Apr 28, 2020
9a0bec9
fix tests
ganik Apr 28, 2020
82f831b
fix test
ganik Apr 28, 2020
2b7accb
MLFeatur version
ganik Apr 28, 2020
7a7bef7
exclude test for Mac
ganik Apr 29, 2020
1315d4d
do mv to save space
ganik Apr 29, 2020
99b0e71
Make more space for build
ganik Apr 30, 2020
7878982
more space
ganik Apr 30, 2020
b3797e2
more space
ganik Apr 30, 2020
f2e81d3
more space
ganik Apr 30, 2020
9252d1e
more space
ganik Apr 30, 2020
b088277
fix build
ganik Apr 30, 2020
2fc6a1f
more space
ganik Apr 30, 2020
bea3ec3
check in (#487)
angryjinyan May 1, 2020
ba376b9
Fix shape (#488)
angryjinyan May 1, 2020
e9b6b87
Merge branch 'master' into tsaml
ganik May 12, 2020
Add tests for DateTimeSplitter with country (#486)
* Add callstack field to BridgeRuntime exception (#483)

* Add exception stack to the error message

* Add callstack field

Co-authored-by: Gani Nazirov <ganaziro@microsoft.com>

* revert "Add callstack field to BridgeRuntime exception (#483)"

This reverts commit 569ea7b.

* add in DateTimeSplitter tests for country

* add in big tests for more featurizers

Co-authored-by: Gani Nazirov <ganinz@hotmail.com>
Co-authored-by: Gani Nazirov <ganaziro@microsoft.com>
3 people authored Apr 27, 2020

commit ac5ce115d190c78b92301385eeadf8f7810944b2
52 changes: 51 additions & 1 deletion src/python/tests_extended/test_dft_based.py
@@ -25,6 +25,8 @@
 TEST_CASES = {
     'DateTimeSplitter_Simple',
     'DateTimeSplitter_Complex',
+    'DateTimeSplitter_Canada_1day_before_christmas',
+    'DateTimeSplitter_Czech_non_english_holiday',
     'ToKey_SimpleFloat',
     'ToKey_SimpleDouble',
     'ToKey_SimpleString',
@@ -41,6 +43,8 @@
     'ShortGrainDropper',
     'RollingWin_Pivot_Integration',
     'Laglead_Pivot_Integration',
+    'Big_Test1',
+    'Big_Test2'
 }
 
 INSTANCES = {
@@ -62,6 +66,8 @@
             'dtAmPmLabel', 'dtDayOfWeekLabel', 'dtIsPaidTimeOff','dtHolidayName'
         ])
     ]),
+    'DateTimeSplitter_Canada_1day_before_christmas' : DateTimeSplitter(prefix='dt', country='Canada') << 'tokens1',
+    'DateTimeSplitter_Czech_non_english_holiday' : DateTimeSplitter(prefix='dt', country='Czech') << 'tokens1',
     'ToKey_SimpleFloat': ToKeyImputer(),
     'ToKey_SimpleDouble': ToKeyImputer(),
     'ToKey_SimpleString': ToKeyImputer(),
@@ -122,6 +128,34 @@
                         horizon=2),
         ForecastingPivot(columns_to_pivot=['colA1'])
     ]),
+    'Big_Test1': Pipeline([
+        TimeSeriesImputer(time_series_column='ts',
+                          filter_columns=['c', 'grain'],
+                          grain_columns=['grain'],
+                          impute_mode='ForwardFill',
+                          filter_mode='Include'),
+        DateTimeSplitter(prefix='dt') << 'ts',
+        LagLeadOperator(columns={'c1': 'c'},
+                        grain_columns=['dtMonthLabel'],
+                        offsets=[-2, -1],
+                        horizon=1),
+        ForecastingPivot(columns_to_pivot=['c1']),
+        ColumnSelector(drop_columns=['dtHolidayName'])
+    ]),
+    'Big_Test2': Pipeline([
+        TimeSeriesImputer(time_series_column='ts',
+                          filter_columns=['c', 'grain'],
+                          grain_columns=['grain'],
+                          impute_mode='ForwardFill',
+                          filter_mode='Include'),
+        DateTimeSplitter(prefix='dt', country = 'Canada') << 'ts',
+        RollingWindow(columns={'c1': 'c'},
+                      grain_column=['grain'],
+                      window_calculation='Mean',
+                      max_window_size=2,
+                      horizon=2),
+        ForecastingPivot(columns_to_pivot=['c1'])
+    ])
 }
 
 DATASETS = {
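Note on the `offsets=[-2, -1]` argument used in `Big_Test1`: the snippet below is a minimal sketch of what negative offsets usually mean for lag features, assuming offset `-k` refers to the value `k` rows earlier within one grain. It is not NimbusML's `LagLeadOperator` implementation; `lag_features` is a hypothetical helper, and horizon handling and the pivoted output layout are left out.

```python
import math


def lag_features(values, offsets):
    """Hypothetical helper: map each offset to a shifted copy of `values`.

    Negative offsets look back, positive offsets look ahead.
    Positions with no source row are filled with NaN.
    """
    n = len(values)
    shifted_by_offset = {}
    for off in offsets:
        shifted = []
        for i in range(n):
            j = i + off
            shifted.append(values[j] if 0 <= j < n else math.nan)
        shifted_by_offset[off] = shifted
    return shifted_by_offset


# Column 'c' from the Big_Test1 dataset, treated here as a single grain.
c = [10.0, 11.0, 12.0, 13.0]
lags = lag_features(c, [-2, -1])
for off, column in lags.items():
    print(off, column)
```

Under that assumption, the `-1` column is the series shifted down by one row and the `-2` column is shifted down by two, with NaN where no earlier row exists.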
@@ -131,6 +165,12 @@
     'DateTimeSplitter_Complex': pd.DataFrame(data=dict(
         tokens1=[217081624, 1751241600, 217081625, 32445842582]
     )),
+    'DateTimeSplitter_Canada_1day_before_christmas': pd.DataFrame(data=dict(
+        tokens1=[157161599]
+    )),
+    'DateTimeSplitter_Czech_non_english_holiday': pd.DataFrame(data=dict(
+        tokens1=[3911760000, 3834432000, 3985200000]
+    )),
     'ToKey_SimpleFloat': pd.DataFrame(data=dict(
         target=[1.0, 1.0, 1.0, 2.0]
     )).astype({'target': np.float64}),
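Note on the `tokens1` values: assuming they are Unix epoch seconds in UTC (the diff does not state this), a quick standard-library check shows which dates the new cases exercise. `to_utc` is a hypothetical helper written for this note; it uses `timedelta` arithmetic rather than `datetime.fromtimestamp` so that far-future values do not depend on platform limits.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_utc(epoch_seconds):
    """Hypothetical helper: epoch seconds -> timezone-aware UTC datetime."""
    return EPOCH + timedelta(seconds=epoch_seconds)


# Value from 'DateTimeSplitter_Canada_1day_before_christmas'.
canada_case = to_utc(157161599)
print(canada_case.isoformat())  # 1974-12-24T23:59:59+00:00

# Values from 'DateTimeSplitter_Czech_non_english_holiday'.
for seconds in [3911760000, 3834432000, 3985200000]:
    print(seconds, to_utc(seconds).date())
```

Read this way, the Canada value is the last second of December 24, 1974, which is consistent with the test name.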
@@ -196,7 +236,17 @@
     'Laglead_Pivot_Integration': pd.DataFrame(data=dict(
         colA=[1.0, 2.0, 3.0, 4.0],
         grainA=["one", "one", "one", "one"]
-    ))
+    )),
+    'Big_Test1': pd.DataFrame(data=dict(
+        ts=[217081624, 217081625, 217081627, 217081629],
+        grain=[1970, 1970, 1970, 1970],
+        c=[10, 11, 12, 13]
+    )).astype({'ts': np.int64, 'grain': np.int32, 'c': np.double}),
+    'Big_Test2': pd.DataFrame(data=dict(
+        ts=[0, 86400, 172800],
+        grain=[1970, 1970, 1970],
+        c=[10, 11, 12]
+    )).astype({'ts': np.int64, 'grain': np.int32, 'c': np.double})
 }
 
 def get_file_size(file_path):
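Note on the `Big_Test1` data: the `ts` column skips 217081626 and 217081628, which is presumably what `impute_mode='ForwardFill'` is meant to exercise. The sketch below shows generic forward-fill imputation on that data, assuming a fixed one-second step. It is not the `TimeSeriesImputer` implementation: `forward_fill` is a hypothetical helper, and the real transform may infer the frequency and handle grains differently.

```python
def forward_fill(rows, step=1):
    """Hypothetical helper: fill gaps in a sorted (timestamp, value) series.

    Missing timestamps are inserted at `step` spacing and take the last
    observed value. Each output row is (timestamp, value, was_imputed).
    """
    filled = []
    prev_ts, prev_val = None, None
    for ts, val in rows:
        if prev_ts is not None:
            t = prev_ts + step
            while t < ts:
                filled.append((t, prev_val, True))
                t += step
        filled.append((ts, val, False))
        prev_ts, prev_val = ts, val
    return filled


# 'ts' and 'c' columns from the Big_Test1 dataset (a single grain, 1970).
ts = [217081624, 217081625, 217081627, 217081629]
c = [10.0, 11.0, 12.0, 13.0]
filled = forward_fill(list(zip(ts, c)))
for row in filled:
    print(row)
```

With a one-second step, the four input rows become six: the rows for 217081626 and 217081628 are inserted and carry the values 11.0 and 12.0 respectively.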