Description:
When using dropDuplicateRows to eliminate duplicate entries from a table, I observed that duplicates were still present in the output. Upon investigation, the root cause was identified in the isDuplicate function. This function is designed to iterate over rows that share a hash with the row being evaluated to determine if it is a duplicate. However, it incorrectly returns false (indicating the row is unique) during the first iteration if the first checked row does not match, without examining the remaining rows.
Expected Behavior:
The isDuplicate function should only return false after all rows with the matching hash have been checked and none are found to be identical to the row being evaluated. This ensures that a row is only considered unique if it has been verified against all potential duplicates.
Actual Behavior:
The function returns false prematurely, after comparing against only the first row that shares a hash. Any matching row later in that group is never examined, so the duplicate remains in the table.
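To make the failure mode concrete, here is a minimal Python sketch of the pattern described above. It is not the project's actual code: `is_duplicate_buggy`, `drop_duplicate_rows`, the bucket-per-hash layout, and the `hash_fn` parameter are all hypothetical stand-ins, and the constant hash function exists only to force every row into the same bucket.

```python
from collections import defaultdict


def is_duplicate_buggy(row, bucket):
    """Hypothetical reconstruction of the faulty check."""
    for candidate in bucket:
        if candidate == row:
            return True
        # Bug: this return sits inside the loop, so the function gives up
        # after the first candidate instead of checking the rest.
        return False
    return False  # empty bucket: nothing to compare against


def drop_duplicate_rows(rows, is_duplicate, hash_fn=hash):
    """Keep the first occurrence of each row, grouping rows by hash."""
    buckets = defaultdict(list)
    kept = []
    for row in rows:
        bucket = buckets[hash_fn(row)]
        if not is_duplicate(row, bucket):
            bucket.append(row)
            kept.append(row)
    return kept


rows = [("a", 1), ("b", 2), ("b", 2)]
# A constant hash (assumption, for illustration) puts all rows in one bucket.
result = drop_duplicate_rows(rows, is_duplicate_buggy, hash_fn=lambda r: 0)
print(result)  # the second ("b", 2) is still present
```

In this sketch the bug only shows up when a bucket holds more than one distinct row, which would explain why most duplicates are still removed and only some slip through.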
Resolution:
The issue was resolved by modifying isDuplicate to complete its iteration over all rows with a matching hash before deciding that the row is not a duplicate. This change ensured that dropDuplicateRows correctly removed all duplicates from the table.
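The shape of the fix can be sketched in Python as follows. This is an illustration under assumptions, not the actual patch: `is_duplicate`, `drop_duplicate_rows`, the bucket-per-hash layout, and the `hash_fn` parameter are hypothetical names chosen to mirror the functions named in this issue.

```python
from collections import defaultdict


def is_duplicate(row, bucket):
    """Return True if any row in the bucket equals `row`."""
    for candidate in bucket:
        if candidate == row:
            return True
    # Only reached after every candidate has been compared.
    return False


def drop_duplicate_rows(rows, hash_fn=hash):
    """Keep the first occurrence of each row, grouping rows by hash."""
    buckets = defaultdict(list)
    kept = []
    for row in rows:
        bucket = buckets[hash_fn(row)]
        if not is_duplicate(row, bucket):
            bucket.append(row)
            kept.append(row)
    return kept


rows = [("a", 1), ("b", 2), ("b", 2)]
# A constant hash (assumption, for illustration) forces every row to collide,
# which is the case where an early return would miss the duplicate.
deduped = drop_duplicate_rows(rows, hash_fn=lambda r: 0)
print(deduped)  # [('a', 1), ('b', 2)]
```

The only behavioral change from an early-returning version is that the `return False` moves out of the loop, so the "not a duplicate" verdict is reached only once the whole bucket has been scanned.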