Null characters, should we sanitize? #60

Open
visch opened this issue Jan 11, 2023 · 1 comment

Comments

@visch
Member

visch commented Jan 11, 2023

Right now you'll get an error such as `ValueError: A string literal cannot contain NUL (0x00) characters` or
`sqlalchemy.exc.DataError: (psycopg2.errors.UntranslatableCharacter) unsupported Unicode escape sequence`, because Postgres doesn't allow NUL characters in text values. See https://www.postgresql.org/docs/current/functions-string.html#:~:text=chr(0)%20is%20disallowed%20because%20text%20data%20types%20cannot%20store%20that%20character.

Should we sanitize the data, i.e. something like `data.replace("\u0000", "")`, or leave the offending record as-is?
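For illustration, here is a minimal sketch of what the sanitizing approach could look like when records contain nested values. The helper name `strip_nul` and the sample record are hypothetical and not taken from this project; the sketch only removes NUL characters from string values (not from dictionary keys).

```python
from typing import Any

NUL = "\u0000"


def strip_nul(value: Any) -> Any:
    """Return a copy of `value` with NUL characters removed from every string.

    Hypothetical helper: recurses into dicts and lists, leaves other types alone.
    """
    if isinstance(value, str):
        return value.replace(NUL, "")
    if isinstance(value, dict):
        # Only values are sanitized; keys are passed through unchanged.
        return {key: strip_nul(item) for key, item in value.items()}
    if isinstance(value, list):
        return [strip_nul(item) for item in value]
    return value


# Made-up record with NUL characters at several nesting levels.
record = {
    "id": 1,
    "name": "ab\u0000c",
    "tags": ["x\u0000", "y"],
    "meta": {"note": "\u0000ok"},
}
clean = strip_nul(record)
```

Note that this silently changes the data, which is the trade-off the question above is about.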

@visch visch changed the title Null characters, should we santizie? Null characters, should we sanitize? Jan 11, 2023
@williamlfish

Running into this as well. I think it would be nice to at least have the option to 😅

github-merge-queue bot pushed a commit that referenced this issue Dec 10, 2024
Null characters are currently passed as-is to Postgres despite being
unsupported.

If one is encountered, the sink fails, as noted in #60, with an error
like `ValueError: A string literal cannot contain NUL (0x00) characters.`

This PR introduces a new option called `sanitize_null_text_characters`
which enables sanitization of these characters.
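As a rough sketch of how an opt-in flag like this might gate the behaviour: only the option name `sanitize_null_text_characters` comes from the commit message. The function `process_record`, the plain-dict config, and the default of `False` are assumptions for illustration, not the sink's actual implementation.

```python
def process_record(record: dict, config: dict) -> dict:
    """Hypothetical hook: strip NUL characters only when the option is enabled."""
    # Assumption: the option defaults to off, so existing behaviour is unchanged.
    if not config.get("sanitize_null_text_characters", False):
        return record
    return {
        key: value.replace("\x00", "") if isinstance(value, str) else value
        for key, value in record.items()
    }


row = {"id": 7, "comment": "hello\x00world"}

untouched = process_record(row, {})
sanitized = process_record(row, {"sanitize_null_text_characters": True})
```

With the flag off the record is passed through unchanged, so Postgres would still reject it. With the flag on, the NUL is dropped before the value reaches the database.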

---------

Co-authored-by: Edgar Ramírez Mondragón <16805946+edgarrmondragon@users.noreply.github.com>
Co-authored-by: Edgar Ramírez-Mondragón <edgarrm358@gmail.com>