FileFinder flow crashes due to Incorrect string value error
alexgumo7 opened this issue · 6 comments
Environment
- How did you install GRR? N/A
- What GRR version are you running?: 3.2.4.2
- What operating system does the GRR server run on? N/A
- What operating system does the affected GRR client run on, if applicable? Windows 10 10.0.19041SP0
Describe the issue
When using FileFinder to search for a specific set of files, the flow crashes with an encoding error caused by unsupported characters in some filenames. I have tried the OS, TSK and NTFS path types, all with the same result.
Error logs
Here I attach the backtrace of the error:
Traceback (most recent call last):
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/flow_base.py", line 685, in RunStateMethod
    method(responses)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/flows/general/filesystem.py", line 535, in _ProcessEntry
    response, matching_components, base_wildcard=True)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/flows/general/filesystem.py", line 555, in _ProcessResponse
    self.GlobReportMatch(response)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/flows/general/file_finder.py", line 126, in GlobReportMatch
    super(FileFinder, self).GlobReportMatch(response)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/flows/general/filesystem.py", line 399, in GlobReportMatch
    WriteStatEntries([stat_response], client_id=self.client_id)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/flows/general/filesystem.py", line 95, in WriteStatEntries
    _FilterOutPathInfoDuplicates(path_infos))
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/databases/db.py", line 3402, in WritePathInfos
    return self.delegate.WritePathInfos(client_id, path_infos)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/databases/mysql_paths.py", line 194, in WritePathInfos
    self._MultiWritePathInfos({client_id: path_infos})
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/databases/db_utils.py", line 51, in Decorator
    result = f(*args, **kwargs)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/databases/mysql_utils.py", line 241, in Decorated
    return self._RunInTransaction(Closure, readonly)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/databases/mysql.py", line 559, in _RunInTransaction
    result = function(connection)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/databases/mysql_utils.py", line 239, in Closure
    return func(self, *args, **new_kw)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/databases/mysql_paths.py", line 269, in _MultiWritePathInfos
    cursor.executemany(query, path_info_values)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/databases/mysql_pool.py", line 227, in executemany
    return self._forward(self.cursor.executemany, query, args)
  File "/usr/share/grr-server/lib/python3.6/site-packages/grr_response_server/databases/mysql_pool.py", line 166, in _forward
    return method(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/MySQLdb/cursors.py", line 283, in executemany
    self.rowcount = sum(self.execute(query, arg) for arg in args)
  File "/usr/lib/python3/dist-packages/MySQLdb/cursors.py", line 283, in <genexpr>
    self.rowcount = sum(self.execute(query, arg) for arg in args)
  File "/usr/lib/python3/dist-packages/MySQLdb/cursors.py", line 253, in execute
    self._warning_check()
  File "/usr/lib/python3/dist-packages/MySQLdb/cursors.py", line 155, in _warning_check
    warn(self.Warning(*w[1:3]), stacklevel=3)
_mysql_exceptions.Warning: (1366, "Incorrect string value: '\\xD9\\x88\\xD9\\x83\\xD8\\xB0...' for column 'path' at row 1")
Is there any solution to this issue?
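For readers trying to identify the offending filename, the bytes quoted in the warning can be inspected directly. The snippet below is a minimal sketch that decodes only the six bytes visible before the elision in the error message; the rest of the path is not shown in the log, so nothing can be concluded about it from this.

```python
# Bytes copied from the warning text: '\xD9\x88\xD9\x83\xD8\xB0...'
# Only the visible prefix is used; the elided remainder is unknown.
raw = bytes.fromhex("d988d983d8b0")

# The prefix is well-formed UTF-8, so decoding succeeds without an error handler.
text = raw.decode("utf-8")

# List the code points so the characters can be looked up.
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)
```

The three code points fall in the Unicode Arabic block, which is consistent with the report that the failing paths contain non-Latin filenames.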
What happens if you try to use client-side file-finder? I don't expect it to really work, since the issue seems to be in the database backend but it doesn't cost much to try.
Hi @panhania, I just tried it with client-side file-finder, it still gives the same error.
Yeah, I think this is going to require some effort to fix. We will try to have it patched in the next GRR release. For now, I can only advise using flows that do not write any path records to the database. For example, use the timeline flow to list the contents and metadata of many files, filter the output manually (e.g. using grep) for what is interesting to you, and then use the multi-get file flow to collect those files.
Note that the "old UI" for the timeline flow does not make it easy to get the results; you will have to use the API shell and the GetCollectedTimelineBody method for this.
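The manual filtering step could be scripted instead of done with grep. The following is a rough sketch that assumes the exported timeline uses the Sleuth Kit style body layout, with pipe-separated fields and the path in the second field; `filter_body_paths` and the sample lines are hypothetical and not part of GRR.

```python
import re
from typing import Iterable, Iterator


def filter_body_paths(lines: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield paths from body-format lines whose path matches the regex."""
    regex = re.compile(pattern)
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 2:
            continue  # Skip blank or malformed lines.
        path = fields[1]  # Assumed layout: MD5|name|inode|mode|UID|GID|size|...
        if regex.search(path):
            yield path


# Made-up sample lines standing in for a downloaded timeline body file.
sample = [
    "0|C:/Users/a/report.docx|1|-rw-|0|0|10|0|0|0|0",
    "0|C:/Users/a/notes.txt|2|-rw-|0|0|20|0|0|0|0",
    "0|C:/Users/b/\u0648\u0643\u0630.docx|3|-rw-|0|0|30|0|0|0|0",
]
matches = list(filter_body_paths(sample, r"\.docx$"))
print(matches)
```

The resulting list of paths is what would then be passed to the multi-get file flow. A path that itself contains a pipe character would be split incorrectly by this sketch.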
Yes, I can still use the timeline to identify the files. However, the objective is to run a search and then download the files. Doing this for a single machine is still viable, but for more than one it gets more difficult and doesn't scale well. Moreover, as far as I know, to run MultiGetFile when downloading from the Virtual FileSystem you need to run a ListDirectory flow first, and that flow also crashes because of the unsupported characters.
In that case, if you are in a hurry, don't want to wait for the next GRR release, and are fine with some hacking, you can work around the issue. Modify the line responsible for inserting paths in the mysql_paths.py file. Replace:
path = mysql_utils.ComponentsToPath(path_info.components)
with:
path = mysql_utils.ComponentsToPath(path_info.components)
path = path.encode("utf-8", "replace").decode("utf-8")
This will cause all characters that cannot be encoded as UTF-8 (such as lone surrogates) to be replaced with a question mark (?). Note that this is a lossy conversion, so you lose some data this way. Instead of replace mode you can also use backslashreplace, which writes \u???? where ???? is the hexadecimal code point of the offending character.
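To see what the two error handlers do before patching a server, the round trip can be tried in isolation. This is a minimal sketch using only the standard library; `sanitize_path` is a hypothetical helper that mirrors the one-line workaround, and the sample filename with a lone surrogate is made up. In Python 3, the replace handler substitutes a question mark when encoding.

```python
def sanitize_path(path: str, errors: str = "replace") -> str:
    """Round-trip a path through UTF-8, applying the given error handler."""
    return path.encode("utf-8", errors).decode("utf-8")


# A lone surrogate cannot be encoded as UTF-8, so the handler kicks in.
bad = "report_\udc80.txt"

replaced = sanitize_path(bad)                     # lossy: '?' substituted
escaped = sanitize_path(bad, "backslashreplace")  # keeps the code point as text

# Well-formed text, including non-Latin characters, passes through unchanged.
clean = sanitize_path("\u0648\u0643\u0630.txt")

print(replaced)
print(escaped)
```

With backslashreplace the original code point can still be read off the stored path, which may make later manual recovery easier than with replace.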
Thanks! I will modify this if I find this error again.