UTF8 Surrogates Not Allowed
rsbohn commented
Something in the text returned from GPT-4o can't be logged to the database.
  File "C:\tools\hudson\Lib\site-packages\sqlite_utils\db.py", line 3310, in insert_all
    self.insert_chunk(
  File "C:\tools\hudson\Lib\site-packages\sqlite_utils\db.py", line 3068, in insert_chunk
    result = self.db.execute(query, params)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\tools\hudson\Lib\site-packages\sqlite_utils\db.py", line 524, in execute
    return self.conn.execute(sql, parameters)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'utf-8' codec can't encode character '\udc81' in position 14511: surrogates not allowed
Workaround: disable logs and run the prompt again.
PS> cat .\transcript.csv | llm -m 4o -s "Extract each place name."
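If disabling logs is not an option, another possible approach is to replace the unencodable characters before the text is written. This is only a rough sketch: `strip_surrogates` is a hypothetical helper, not part of llm or sqlite_utils, and nothing in this thread says where such a call would belong in the logging path.

```python
def strip_surrogates(text: str) -> str:
    """Replace characters that UTF-8 cannot encode (lone surrogates) with '?'.

    Hypothetical helper for illustration; not an llm or sqlite_utils API.
    """
    # errors="replace" substitutes '?' for each unencodable character,
    # so the round trip always yields a string that encodes cleanly.
    return text.encode("utf-8", errors="replace").decode("utf-8")


cleaned = strip_surrogates("Paris\udc81, Lyon")
cleaned.encode("utf-8")  # no longer raises UnicodeEncodeError
```

Using `errors="backslashreplace"` instead would keep a visible `\udc81` escape in the stored text rather than a `?`, which may be preferable when the goal is to track down the offending input.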
AlexanderYastrebov commented
Would be nice to have a small reproducer file.
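A possible starting point for a reproducer, using only the standard library `sqlite3` module rather than llm itself. It assumes the `'\udc81'` in the traceback is a lone surrogate of the kind produced when a stray `0x81` byte is decoded with `errors="surrogateescape"`; the thread does not confirm that this is how the character got into the text, and the `responses` table name is made up for the example.

```python
import sqlite3

# Assumption: build a string containing the lone surrogate '\udc81' by
# decoding an invalid byte with surrogateescape. The real source is unknown.
bad_text = b"Place: \x81".decode("utf-8", errors="surrogateescape")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (response TEXT)")  # hypothetical table

try:
    # Binding the parameter requires encoding it as UTF-8, which fails
    # for a lone surrogate.
    conn.execute("INSERT INTO responses (response) VALUES (?)", (bad_text,))
    error = None
except (UnicodeEncodeError, sqlite3.Error) as exc:
    # The traceback in this issue shows UnicodeEncodeError; sqlite3.Error is
    # caught too in case another Python version wraps the failure.
    error = exc

print(repr(error))
```

If this fails the same way on the reporter's machine, the remaining question is which step (the piped CSV input or the model response) introduces the surrogate.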