Expected Behavior
Auto preprocessing should work correctly, and the pipeline should be fitted.
Current Behavior
FEDOT fails to fit a catboostreg model when the use_auto_preprocessing=True option is enabled.
PS C:\Users\nnikitin-user\Desktop\automl_may> & C:/Users/nnikitin-user/AppData/Local/Programs/Python/Python310/python.exe c:/Users/nnikitin-user/Desktop/automl_may/flood_1.py
2024-05-16 13:16:58,812 - ApiDataProcessor - Preprocessing data
2024-05-16 13:16:58,812 - ApiDataProcessor - Train Data (Original) Memory Usage: 452.05 MB Data Shapes: ((1117957, 53), (1117957, 1))
2024-05-16 13:22:54,236 - ApiDataProcessor - Train Data (Processed) Memory Usage: 1.05 GB Data Shape: ((1117957, 126), (1117957, 1))
2024-05-16 13:22:54,236 - ApiDataProcessor - Data preprocessing runtime = 0:05:55.423210
2024-05-16 13:22:55,149 - AssumptionsHandler - Initial pipeline fitting started
2024-05-16 13:23:21,260 - PipelineNode - Trying to fit pipeline node with operation: catboostreg
2024-05-16 13:23:22,181 - AssumptionsHandler - Initial pipeline fit was failed due to: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 894365 and the array at index 1 has size 1117957.
Traceback (most recent call last):
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\api\api_utils\assumptions\assumptions_handler.py", line 71, in fit_assumption_and_check_correctness
pipeline.fit(data_train, n_jobs=eval_n_jobs)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\pipelines\pipeline.py", line 197, in fit
train_predicted = self._fit(input_data=copied_input_data)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\pipelines\pipeline.py", line 112, in _fit
train_predicted = self.root_node.fit(input_data=input_data)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\pipelines\node.py", line 200, in fit
self.fitted_operation, operation_predict = self.operation.fit(params=self._parameters,
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\operations\operation.py", line 87, in fit
self.fitted_operation = self._eval_strategy.fit(train_data=data)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\operations\evaluation\boostings.py", line 33, in fit
operation_implementation.fit(train_data)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\operations\evaluation\operation_implementations\models\boostings_implementations.py", line 28, in fit
input_data = input_data.get_not_encoded_data()
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\data\data.py", line 628, in get_not_encoded_data
new_features = np.hstack((num_features, cat_features))
File "<__array_function__ internals>", line 200, in hstack
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\core\shape_base.py", line 370, in hstack
return _nx.concatenate(arrs, 1, dtype=dtype, casting=casting)
File "<__array_function__ internals>", line 200, in concatenate
ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 894365 and the array at index 1 has size 1117957
Traceback (most recent call last):
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\api\api_utils\assumptions\assumptions_handler.py", line 71, in fit_assumption_and_check_correctness
pipeline.fit(data_train, n_jobs=eval_n_jobs)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\pipelines\pipeline.py", line 197, in fit
train_predicted = self._fit(input_data=copied_input_data)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\pipelines\pipeline.py", line 112, in _fit
train_predicted = self.root_node.fit(input_data=input_data)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\pipelines\node.py", line 200, in fit
self.fitted_operation, operation_predict = self.operation.fit(params=self._parameters,
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\operations\operation.py", line 87, in fit
self.fitted_operation = self._eval_strategy.fit(train_data=data)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\operations\evaluation\boostings.py", line 33, in fit
operation_implementation.fit(train_data)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\operations\evaluation\operation_implementations\models\boostings_implementations.py", line 28, in fit
input_data = input_data.get_not_encoded_data()
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\core\data\data.py", line 628, in get_not_encoded_data
new_features = np.hstack((num_features, cat_features))
File "<__array_function__ internals>", line 200, in hstack
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\core\shape_base.py", line 370, in hstack
return _nx.concatenate(arrs, 1, dtype=dtype, casting=casting)
File "<__array_function__ internals>", line 200, in concatenate
ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 894365 and the array at index 1 has size 1117957
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\Users\nnikitin-user\Desktop\automl_may\flood_1.py", line 85, in <module>
auto_model.fit(features=train, target="FloodProbability")
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\api\main.py", line 181, in fit
self.current_pipeline, self.best_models, self.history = self.api_composer.obtain_model(self.train_data)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\api\api_utils\api_composer.py", line 63, in obtain_model
initial_assumption, fitted_assumption = self.propose_and_fit_initial_assumption(train_data)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\api\api_utils\api_composer.py", line 107, in propose_and_fit_initial_assumption
assumption_handler.fit_assumption_and_check_correctness(deepcopy(initial_assumption[0]),
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\api\api_utils\assumptions\assumptions_handler.py", line 86, in fit_assumption_and_check_correctness
self._raise_evaluating_exception(ex)
File "C:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\api\api_utils\assumptions\assumptions_handler.py", line 94, in _raise_evaluating_exception
raise ValueError(advice_info)
ValueError: Initial pipeline fit was failed due to: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 894365 and the array at index 1 has size 1117957. Check pipeline structure and the correctness of the data
PS C:\Users\nnikitin-user\Desktop\automl_may>
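The ValueError at the bottom of the trace comes from the np.hstack call in get_not_encoded_data: the numeric feature block has only 894365 rows while the categorical block still has all 1117957, so the column-wise concatenation fails. A minimal sketch reproducing the same NumPy error (the array sizes here are illustrative, scaled down from the report):

```python
import numpy as np

num_features = np.zeros((894, 100))   # numeric block, row count already reduced
cat_features = np.zeros((1117, 26))   # categorical block, full row count

try:
    # hstack concatenates along axis 1, so row counts must match exactly
    np.hstack((num_features, cat_features))
except ValueError as err:
    print(err)
```

This confirms the failure is a plain shape mismatch between the two feature blocks, not a CatBoost-specific problem.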
Possible Solution
Some features are deleted during auto preprocessing, so the numeric and categorical feature blocks end up with different row counts. This may be related to the handling of categorical features. Debug the following breakpoints to find and fix the problem.
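One hypothesized mechanism (variable names here are illustrative, not FEDOT internals): if preprocessing drops rows from the numeric block, e.g. while removing NaNs, without applying the same row mask to the categorical block, the two blocks diverge in length and the later hstack fails exactly as in the log above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows = 1000
numeric = rng.normal(size=(n_rows, 3))
categorical = rng.integers(0, 5, size=(n_rows, 2))

# Inject NaNs into the numeric block only (200 distinct rows)
numeric[rng.choice(n_rows, size=200, replace=False), 0] = np.nan

# Buggy pattern: filter NaN rows from the numeric block but not the categorical one
mask = ~np.isnan(numeric).any(axis=1)
numeric_clean = numeric[mask]
print(numeric_clean.shape[0], categorical.shape[0])  # prints: 800 1000

# Correct pattern: apply the same mask to both blocks before concatenating
categorical_clean = categorical[mask]
combined = np.hstack((numeric_clean, categorical_clean))  # shape (800, 5)
```

Whatever step in auto preprocessing reduces 1117957 rows to 894365 on the numeric side presumably needs to apply the same reduction to the categorical side (or vice versa).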
Steps to Reproduce
Fit a FEDOT regression model with use_auto_preprocessing=True on data containing categorical features.
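A reproduction sketch based on the traceback from flood_1.py (the CSV path is illustrative; the target column name is taken from the report):

```python
import pandas as pd
from fedot.api.main import Fedot

# Kaggle flood-prediction training data (path is a placeholder)
train = pd.read_csv("train.csv")

auto_model = Fedot(problem="regression", use_auto_preprocessing=True)
auto_model.fit(features=train, target="FloodProbability")  # fails in get_not_encoded_data
```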
Context [OPTIONAL]
Participating in a Kaggle competition.