Can you share a copy of the data? Is it a TSV file?
While summarizing, the function typically uses pandas.to_datetime to try to convert a column to a datetime object. If the values are not in a format it can convert, it raises an error.
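To illustrate the behaviour described above, here is a minimal sketch of that kind of check. It is not lida's actual code: `can_parse_as_datetime` is a hypothetical helper and the sample values are copied from the dataframe in this issue. It only shows that `pandas.to_datetime` accepts date strings but rejects `decimal.Decimal` objects, which is the error reported in the traceback.

```python
from decimal import Decimal

import pandas as pd


def can_parse_as_datetime(values: pd.Series) -> bool:
    """Hypothetical helper: report whether pandas.to_datetime accepts the values."""
    try:
        pd.to_datetime(values, errors="raise")
        return True
    except (TypeError, ValueError):
        # Decimal objects end up here, as in the traceback from this issue.
        return False


# Date strings, as in the ContributionDate column.
dates = pd.Series(["2021-05-13", "2024-12-23"])
# Decimal objects, as a database driver may return for decimal columns.
shares = pd.Series([Decimal("883.43"), Decimal("626.79")], dtype=object)

print(can_parse_as_datetime(dates))
print(can_parse_as_datetime(shares))
```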
I extract the data from an Azure SQL DB using a pyodbc cursor.
The conversion raises an error when the data is of the decimal data type. Once I manually convert those columns to float in the Azure DB, the summarize function works fine.
The error is raised when I do not exclude the EmployeeShare, EmployerShare and TotalContribution columns.
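As an alternative to changing the column types in the database, the decimal values could be converted to float on the pandas side before summarizing. The sketch below shows two possible ways, under some assumptions: the sample tuples stand in for rows fetched through the pyodbc cursor, and `decimals_to_float` is a hypothetical helper, not part of lida or pandas. The `coerce_float` parameter of `DataFrame.from_records` is an existing pandas option intended for SQL result sets.

```python
from decimal import Decimal

import pandas as pd
from pandas.api.types import is_object_dtype

# Sample rows standing in for the cursor results (assumption for illustration).
columns = ["ContributionID", "ContributionMonth", "EmployeeShare"]
records = [
    (1, "May", Decimal("883.43")),
    (2, "December", Decimal("626.79")),
]

# Option 1: ask pandas to coerce Decimal values to float while building the frame.
coerced = pd.DataFrame.from_records(records, columns=columns, coerce_float=True)


def decimals_to_float(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: cast object columns holding only Decimal values to float."""
    out = df.copy()
    for col in out.columns:
        if not is_object_dtype(out[col]):
            continue
        non_null = out[col].dropna()
        if len(non_null) > 0 and non_null.map(lambda v: isinstance(v, Decimal)).all():
            out[col] = out[col].astype(float)
    return out


# Option 2: clean an already-built frame whose decimal columns have object dtype.
raw = pd.DataFrame.from_records(records, columns=columns)
cleaned = decimals_to_float(raw)

print(coerced.dtypes)
print(cleaned.dtypes)
```

Either frame could then be passed to lida.summarize in the same way as the original df. Note that converting Decimal to float gives up exact decimal precision, which may or may not matter for a summary.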
Is anyone else facing this issue?
I am trying to summarize a dataframe and end up with a data type issue.
Can you please advise on this?
df = pd.DataFrame.from_records(data, columns=columns)
data_summary = lida.summarize(df, summary_method="llm", textgen_config=textgen_config)
df:
   ContributionID  MemberID  EmployerID  ContributionMonth  EmployeeShare  EmployerShare  TotalContribution  ContributionDate
0               1        27          15                May         883.43         536.68            1420.11        2021-05-13
1               2        44           2           December         626.79         368.82             995.61        2024-12-23
2               3         1          17            January         732.94         716.15            1449.09        2021-01-03
3               4        28          15          September         149.57         258.10             407.67        2022-09-27
4               5        49          15          September         616.06         519.45            1135.51        2022-09-09
5               6        45           8           February         154.46         840.50             994.96        2022-02-25
6               7        41          16             August         941.70         990.86            1932.56        2020-08-17
7               8         2           3               July         707.85         960.77            1668.62        2021-07-08
8               9         2           8                May         186.81         349.01             535.82        2021-05-16
9              10        22           7               June         558.11         585.05            1143.16        2022-06-30
error log:
\lida\components\manager.py:131, in Manager.summarize(self, data, file_name, n_samples, summary_method, textgen_config)
[128] data = read_dataframe(data)
[130] self.data = data
--> [131] return self.summarizer.summarize(
[132] data=self.data, text_gen=self.text_gen, file_name=file_name, n_samples=n_samples,
[133] summary_method=summary_method, textgen_config=textgen_config)
\lida\components\summarizer.py:130, in Summarizer.summarize(self, data, text_gen, file_name, n_samples, textgen_config, summary_method, encoding)
[128] # modified to include encoding
[129] data = read_dataframe(data, encoding=encoding)
--> [130] data_properties = self.get_column_properties(data, n_samples)
[132] # default single stage summary construction
...
File tslib.pyx:596, in pandas._libs.tslib.array_to_datetime()
File tslib.pyx:588, in pandas._libs.tslib.array_to_datetime()
TypeError: <class 'decimal.Decimal'> is not convertible to datetime, at position 0