exception: "CharacterNotInRepertoire": "invalid byte sequence for encoding "UTF8": 0x00

link管理

链接快照平台

输入网页链接，自动生成快照
标签化管理网页链接

相关文章推荐

1548603420 · 速汇金汇款取消政策与提款流程：重要注意事项与 ...· 2 天前 ·

体贴的小刀 · Amadeus vs Sabre: ...· 1 月前 ·

傻傻的地瓜 · 2.8. 历史穿梭 — GotGit· 3 月前 ·

大力的八宝粥 · 超次元偶像彦希本名女朋友黄婷婷个人资料整容黑 ...· 3 月前 ·

爱健身的芒果 · python截取文件名后三位_mob64ca ...· 4 月前 ·

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement . We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fist write a file :

# try:
            with open(filename, "w", newline="", encoding="utf-8") as output_file:
                writer = csv.writer(output_file, delimiter=delimiter)
                writer.writerows(list_of_records)
Then read the same file, and pushing it to :
**connection = psycopg2.connect(target_db_dsn)
        # open connection and copy the file
            with connection:
                with connection.cursor() as cursor:
                    f = open(filename, "r", encoding="utf-8", errors='ignore')
                    cursor.copy_from(f, self.load_full_name,
                                     sep=delimiter, null='')
                    f.close()**
2: what you expected to happen
all records would be read including:
Verzoek omtrent melding nr. M190901853\09 Vervangen Mitel door Mitel. Uw referentie|50000029735|50000073709||||50001123220|50000754894|2019-11-18T14:36:59Z|False|False|False||[]|[]|[]|False|2019-11-14T09:33:14Z|3024|1|5|1|2019-11-13T09:33:14Z|2019-12-03T09:09:06Z|50001123220|['[email protected]']|Incident|"<table align=""center"" width=""100%""><td class=""bodytemplate""><table align=""center"" border=""0"" cellpadding=""0"" cellspacing=""0"" class=""container""><table class=""text_header""><td class=""header_bar"" id=""header_title""> <td class=""header_bar"" id=""incident_nr"">M190901853\09 <td class=""text_co"|        M190901853\09Beste Voysone,Om een melding van ons te verwerken verzoeken wij het onderstaande.Korte Omschrijving      Vervangen Mitel door MitelLocatie/Object:        Toelichting:        Verzoek van de klant:Graag had ik besteld ;20 stuks MiVoice Business Enterprise User op Group Application Record ID (GARID) 7025768Voor vragen of opmerkingen zijn wij be||||||GET https://voys.freshservice.com/api/v2//tickets|2021-10-14 09:01:19
3: what happened instead
exception: "CharacterNotInRepertoire": "invalid byte sequence for encoding "UTF8": 0x00
NB. the offending issue was the '' in the string. The issue disappeared when at the stage of writing I do a replace of the '/'.
I would have expected that a '' was allowed (and have a lot of other records that do load that show that it is). SO maybe the '\0' or '\09  is something special?
'replace('\\', '_')'
** elif column_value.__class__.__name__ in [key for key, value in self.TYPE_MAPPING.items() if value == 'text']:
                        # flatten lists to string
                        column_value = str(column_value)
                        # truncate long lines
                        column_value = column_value[:400]
                        # remove newlines from column values
                        column_value = ''.join(column_value.splitlines())
                        # replace delimiter
                        if delimiter is not None:
                            column_value = column_value.replace(
                                delimiter, '_')
                        # replace nulls
                        column_value = column_value.replace(
                            '\x00', '').replace('\u0000', '').replace('\\', '_')
                        # append modified string
                        record.append(column_value)
If possible, provide a script reproducing the issue.
if you need the full code I can send it.
          Your file must be conform to the COPY format specified by PostgreSQL. The format and the escape rules are documented.
It would be easier for you to use Psycopg 3, which allows to copy from Python objects without the need to generate a file. See https://www.psycopg.org/psycopg3/docs/basic/copy.html#copy
          Hi Daniele,
Thanks for your advice. The export to file was intentional in my case, although I might try a psycopg3 copy directly from the data structure later as well.
In the meantime I indeed found out that this must have been more of a PostgreSQL issue than an issue with psycopg2.
I still do not fully understand why a string that contains a \0 character sequence in a CSV that is to be inserted into a text field in the database does:
leads to issues at all
apparently leads to issues for some but not all records.
Happy with my workaround for now, though. I also found it hard to find good solution posts when searching for one, so hoping that this conversation will be indexed for others to review :-)
One of the best related posts that I eventually found, was this one
cheers,