Problem with Starting SemaSCDG for Mirai Executable Analysis
lt-deng opened this issue · 4 comments
Thank you for developing such an excellent malware analysis tool. I encountered an issue while using it. When I execute the following command:
python3 SemaSCDG.py -- CDFS ./databases/malware-linux/mirai/mirai002
The program keeps printing the following logs in a loop. I'm unsure what the problem might be. Could you provide insight into what is causing this, and do you have any suggestions for resolving it?
... ...
... ...
<SimulationManager with 2 active>
INFO | 2024-05-08 23:24:55,520 | SemaExplorerCDFS | pause stash len :3
INFO | 2024-05-08 23:24:55,521 | SemaExplorerCDFS | fork_stack : 1 0x804fd3c || 0x804fd03
INFO | 2024-05-08 23:24:55,523 | SemaExplorerCDFS | Hey new addr !
INFO | 2024-05-08 23:24:55,655 | SemaExplorerCDFS | Hey new addr !
INFO | 2024-05-08 23:24:55,928 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:24:55,928 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:24:55,928 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_read'}
INFO | 2024-05-08 23:24:55,928 | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
INFO | 2024-05-08 23:24:56,255 | SemaExplorerCDFS | Hey new addr !
INFO | 2024-05-08 23:24:56,383 | SemaExplorerCDFS | Hey new addr !
INFO | 2024-05-08 23:24:56,773 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:24:56,773 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:24:56,773 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_write'}
INFO | 2024-05-08 23:24:56,773 | CustomSimProcedure | Syscall found: write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]
Error in string resolv
INFO | 2024-05-08 23:24:58,100 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:24:58,100 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:24:58,100 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_read'}
INFO | 2024-05-08 23:24:58,100 | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
INFO | 2024-05-08 23:24:59,016 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:24:59,016 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:24:59,016 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_write'}
INFO | 2024-05-08 23:24:59,016 | CustomSimProcedure | Syscall found: write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]
Error in string resolv
INFO | 2024-05-08 23:24:59,683 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:24:59,683 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:24:59,683 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_read'}
INFO | 2024-05-08 23:24:59,683 | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
INFO | 2024-05-08 23:25:00,534 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:00,534 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:00,534 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_write'}
INFO | 2024-05-08 23:25:00,534 | CustomSimProcedure | Syscall found: write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]
Error in string resolv
INFO | 2024-05-08 23:25:01,200 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:01,200 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:01,200 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_read'}
INFO | 2024-05-08 23:25:01,200 | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
INFO | 2024-05-08 23:25:02,052 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:02,052 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:02,052 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_write'}
INFO | 2024-05-08 23:25:02,052 | CustomSimProcedure | Syscall found: write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]
Error in string resolv
INFO | 2024-05-08 23:25:03,432 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:03,432 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:03,433 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_read'}
INFO | 2024-05-08 23:25:03,433 | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
INFO | 2024-05-08 23:25:04,346 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:04,346 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:04,346 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_write'}
INFO | 2024-05-08 23:25:04,346 | CustomSimProcedure | Syscall found: write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]
Error in string resolv
INFO | 2024-05-08 23:25:05,009 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:05,009 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:05,009 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_read'}
INFO | 2024-05-08 23:25:05,009 | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
INFO | 2024-05-08 23:25:05,861 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:05,861 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:05,862 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_write'}
INFO | 2024-05-08 23:25:05,862 | CustomSimProcedure | Syscall found: write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]
Error in string resolv
INFO | 2024-05-08 23:25:06,520 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:06,520 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:06,520 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_read'}
INFO | 2024-05-08 23:25:06,521 | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
INFO | 2024-05-08 23:25:07,364 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:07,364 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:07,364 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_write'}
INFO | 2024-05-08 23:25:07,365 | CustomSimProcedure | Syscall found: write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]
Error in string resolv
INFO | 2024-05-08 23:25:08,795 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:08,795 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:08,796 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_read'}
INFO | 2024-05-08 23:25:08,796 | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
INFO | 2024-05-08 23:25:09,724 | CustomSimProcedure | syscall detected
INFO | 2024-05-08 23:25:09,724 | CustomSimProcedure | <BV32 0x80>
INFO | 2024-05-08 23:25:09,724 | CustomSimProcedure | {'return_type': 'long', 'num_args': 3, 'name': 'sys_write'}
INFO | 2024-05-08 23:25:09,724 | CustomSimProcedure | Syscall found: write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]
Error in string resolv
... ...
... ...
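A quick way to confirm that this output is a repeating cycle rather than progress is to tally identical "Syscall found:" lines. The following is only a minimal sketch: the helper name `summarize_syscalls` is hypothetical and not part of SemaSCDG, and the regular expression assumes the log line format shown above.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
SYSCALL_RE = re.compile(r"Syscall found: (\w+)\[(.*)\]\s*$")


def summarize_syscalls(log_text):
    """Count identical (name, args) syscall lines in a log excerpt."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SYSCALL_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = """\
INFO | 2024-05-08 23:24:55,928 | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
INFO | 2024-05-08 23:24:56,773 | CustomSimProcedure | Syscall found: write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]
Error in string resolv
INFO | 2024-05-08 23:24:58,100 | CustomSimProcedure | Syscall found: read[<BV32 0x4>, <BV32 0x7ffefcf4>, <BV32 0x80>]
INFO | 2024-05-08 23:24:59,016 | CustomSimProcedure | Syscall found: write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]
"""

summary = summarize_syscalls(sample)
for (name, args), count in summary.most_common():
    print(count, name, args)
```

If only a couple of distinct entries account for nearly all lines, the exploration is most likely stuck in a loop in the analyzed binary.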
Hello,
Mirai or GhostRAT ?
Thanks
Hello!
My issue pertains to the extraction of SCDG from a Mirai sample.
I have updated my initial issue after observing that the extraction process is indeed successful for a Gh0stRAT sample when allowed a longer processing time.
Thank you for your attention to this matter.
Hello,
I am investigating now, sorry for the wait.
First of all, we can see the string "Error in string resolv" right after the call to "write[<BV32 0x3>, <BV32 0x7ffefcf4>, <BV32 0x80>]".
The basic debugging/analysis consists of checking:
- The write() SimProcedure (for Linux in this case):

    class write(angr.SimProcedure):
        # pylint:disable=arguments-differ
        def run(self, fd, src, length):
            simfd = self.state.posix.get_fd(fd)
            if simfd is None:
                return -1
            # length = self.state.solver.eval(length)
            return simfd.write(src, length)

  Seems normal.
- Where the string "Error in string resolv" is produced, i.e. the function "add_call()" of the toolchain (this might have changed by now, as we are doing a big refactor). This function is basically called after each execution of a SimProcedure and is used for the SCDG creation; it tries to resolve string arguments into concrete strings:

    state.inspect.b("simprocedure", when=angr.BP_AFTER, action=self.syscall_to_scdg_builder.add_call)

- Then let's see what the traces look like to understand the behavior (--keep_inter_scdg):
"status": "active",
"trace": [
{
"name": "main",
"args": [
"mirai002",
"<BV128 arg0_0_128>"
],
"addr": 134513000,
"ret": "symbolic",
"addr_func": 134513000
},
{
"name": "ioctl",
"args": [
0,
21505,
""
],
"addr_func": "0x8300036",
"addr": "0x80549e5",
"ret": "<BV32 unconstrained_ret_ioctl_98303_32{UNINITIALIZED}>"
},
{
"name": "ioctl",
"args": [
1,
21505,
""
],
"addr_func": "0x8300036",
"addr": "0x80549e5",
"ret": "<BV32 unconstrained_ret_ioctl_98305_32{UNINITIALIZED}>"
},
{
"name": "write",
"args": [
1,
"MIRAI\n",
6
],
"addr_func": "0x8300004",
"addr": "0x804f69a",
"ret": 6
},
{
"name": "open",
"args": [
"do",
577,
511
],
"addr_func": "0x8300005",
"addr": "0x804f6ee",
"ret": "0x3"
},
{
"name": "socket",
"args": [
2,
1,
0
],
"addr_func": "0x8300066",
"addr": "0x804f72a",
"ret": 4
},
{
"name": "connect",
"args": [
4,
2147417692,
16
],
"addr_func": "0x8300066",
"addr": "0x804f777",
"ret": 0
},
{
"name": "write",
"args": [
4,
"GET /do/do.x86 HTTP/1.0\r\n\r\n",
27
],
"addr_func": "0x8300004",
"addr": "0x804f79f",
"ret": 27
},
{
"name": "read",
"args": [
4,
2147417766,
1
],
"addr_func": "0x8300003",
"addr": "0x804fd1b",
"ret": 1
},
{
"name": "read",
"args": [
4,
2147417766,
1
],
"addr_func": "0x8300003",
"addr": "0x804fd1b",
"ret": 1
},
{
"name": "read",
"args": [
4,
2147417766,
1
],
"addr_func": "0x8300003",
"addr": "0x804fd1b",
"ret": 1
},
{
"name": "read",
"args": [
4,
2147417766,
1
],
"addr_func": "0x8300003",
"addr": "0x804fd1b",
"ret": 1
},
{
"name": "read",
"args": [
4,
2147417316,
128
],
"addr_func": "0x8300003",
"addr": "0x804fd75",
"ret": 128
},
{
"name": "write",
"args": [
3,
2147417316,
128
],
"addr_func": "0x8300004",
"addr": "0x804fd58",
"ret": 128
},
{
"name": "read",
"args": [
4,
2147417316,
128
],
"addr_func": "0x8300003",
"addr": "0x804fd75",
"ret": 128
},
{
"name": "write",
"args": [
3,
2147417316,
128
],
"addr_func": "0x8300004",
"addr": "0x804fd58",
"ret": 128
},
{
"name": "read",
"args": [
4,
2147417316,
128
],
"addr_func": "0x8300003",
"addr": "0x804fd75",
"ret": 128
},
...
It seems that it fetches the resource at "GET /do/do.x86" and then tries to read data from it.
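The repeating read/write tail of such a trace can also be spotted mechanically. Below is a minimal sketch, assuming the trace is available as a list of dictionaries with the "name" and "addr" keys shown above; `find_repeating_tail` and its parameters are hypothetical helpers, not part of SemaSCDG.

```python
def find_repeating_tail(trace, max_period=8, min_repeats=3):
    """Return (period, repeats) if the trace ends in a repeated call pattern.

    Calls are compared by (name, addr) only, so changing buffers or return
    values do not hide the loop. Returns None when no pattern of length
    <= max_period repeats at least min_repeats times at the end.
    """
    keys = [(call["name"], call["addr"]) for call in trace]
    for period in range(1, max_period + 1):
        if len(keys) < period * min_repeats:
            break
        pattern = keys[-period:]
        repeats = 1
        end = len(keys) - period
        # Walk backwards while the previous window equals the pattern.
        while end >= period and keys[end - period:end] == pattern:
            repeats += 1
            end -= period
        if repeats >= min_repeats:
            return period, repeats
    return None


# Shortened version of the trace above (call names and addresses only).
trace = [
    {"name": "connect", "addr": "0x804f777"},
    {"name": "write", "addr": "0x804f79f"},
    {"name": "read", "addr": "0x804fd1b"},
    {"name": "read", "addr": "0x804fd75"},
    {"name": "write", "addr": "0x804fd58"},
    {"name": "read", "addr": "0x804fd75"},
    {"name": "write", "addr": "0x804fd58"},
    {"name": "read", "addr": "0x804fd75"},
    {"name": "write", "addr": "0x804fd58"},
]
print(find_repeating_tail(trace))

# The decimal buffer address in the trace is the hex value seen in the logs.
print(hex(2147417316))
```

On this shortened trace the sketch reports a pattern of two calls (read at 0x804fd75, write at 0x804fd58) repeated three times. The buffer address 2147417316 is 0x7ffefce4, which is the same buffer that appears in the "Syscall found: write" log line of the full-log run.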
- Then let's see with Ghidra what is concretely in the Mirai binary at this address. With full logs we can retrieve this address:
INFO | 2024-05-21 08:23:16,801 | angr.sim_manager | Stepping active of <SimulationManager with 5 active>
INFO | 2024-05-21 08:23:16,816 | angr.engines.engine | Ticked state: <IRSB from 0x8053c86: 1 sat 1 unsat>
INFO | 2024-05-21 08:23:16,838 | angr.engines.engine | Ticked state: <IRSB from 0x8053c64: 1 sat>
INFO | 2024-05-21 08:23:16,851 | angr.engines.engine | Ticked state: <IRSB from 0x804fd5d: 1 sat>
INFO | 2024-05-21 08:23:16,857 | angr.engines.engine | Ticked state: <IRSB from 0x8053c95: 1 sat>
INFO - 2024-05-21 08:23:16,861 - SyscallToSCDG - Syscall found: write[<BV32 0x3>, <BV32 0x7ffefce4>, <BV32 0x80>]
Error in string resolv
That address is 0x8053c95. We can also look for the request string "GET /do/do.x86", which gives us:
if (iVar10 < 0) {
  FUN_08053c64(4,1,&DAT_08057150,4);
}
else {
  puVar11 = (undefined *)
            FUN_08053c64(4,uVar9,"GET /do/do.x86 HTTP/1.0\r\n\r\n",puVar14 + -0x805711b);
  if (puVar14 + -0x805711b == puVar11) {
    uVar15 = 0;
    do {
      iVar10 = FUN_08053c64(3,uVar9,&local_1a,1);
      if (iVar10 != 1) goto LAB_0804f7ac;
      uVar15 = uVar15 << 8 | (int)(char)local_1a;
    } while (uVar15 != 0xd0a0d0a);
    while( true ) {
      uVar20 = 0x80;
      puVar11 = local_1dc;
      iVar10 = FUN_08053c64(3,uVar9,local_1dc,0x80);
      if (iVar10 < 1) break;
      FUN_08053c64(4,iVar8,local_1dc,iVar10);
    }
    FUN_08053c64(6,uVar9,puVar11,uVar20);
    FUN_08053c64(6,iVar8);
    FUN_08055ee1("./do & 2>&1");
    FUN_08053c64(4,1,&DAT_0805717d,4);
  }
}
We can now see the infinite loop. It is probably a fetch routine on the open socket that downloads the "./do" binary. The loop stops only when the read function returns 0 (the decompiled check is iVar10 < 1). So we can circumvent it by modifying the Linux read function (not optimal; we are working on automating this, which is hard with RATs):
    import angr

    class read(angr.SimProcedure):
        def run(self, fd, dst, length):
            simfd = self.state.posix.get_fd(fd)
            if simfd is None:
                return -1
            # Always report end-of-file so that the download loop exits.
            return 0  # simfd.read(dst, length)
This allows us to discover more of the malware's behavior, but it will also hide some other behavior.
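To make the effect of the patch concrete, the two decompiled loops can be modeled in plain Python. This is only a sketch under assumptions: `download`, `always_full_reader` and `max_steps` are hypothetical names, the "always full" reader merely imitates a modeled read that returns the requested length every time (as the "ret": 128 entries in the first trace suggest), and the byte handling ignores the sign extension done by the C code.

```python
import io

HEADER_END = 0x0D0A0D0A  # "\r\n\r\n" packed big-endian, as in the decompiled check


def download(read, write, max_steps=1000):
    """Python model of the decompiled download routine.

    read(n) returns up to n bytes (b"" models a return value of 0);
    write(data) stores a chunk. Returns (outcome, steps), where steps
    counts read calls and max_steps stands in for an analysis budget.
    """
    steps = 0
    window = 0
    # First loop: consume the HTTP header one byte at a time.
    while window != HEADER_END:
        if steps >= max_steps:
            return "step limit", steps
        steps += 1
        byte = read(1)
        if len(byte) != 1:
            return "aborted", steps  # the `goto LAB_0804f7ac` path
        # Simplification: the C code sign-extends the byte before OR-ing.
        window = ((window << 8) | byte[0]) & 0xFFFFFFFF
    # Second loop: copy 0x80-byte chunks until read returns less than 1.
    while steps < max_steps:
        steps += 1
        chunk = read(0x80)
        if len(chunk) < 1:
            return "done", steps
        write(chunk)
    return "step limit", steps


def always_full_reader():
    """Hypothetical stand-in for a modeled read that always succeeds."""
    header = bytearray(b"\r\n\r\n")

    def read(n):
        if header:
            out = bytes(header[:n])
            del header[:n]
            return out
        return b"A" * n

    return read


sink = []
looping = download(always_full_reader(), sink.append, max_steps=50)
patched = download(lambda n: b"", sink.append)
finite = download(io.BytesIO(b"\r\n\r\n" + b"x" * 200).read, sink.append)
print(looping, patched, finite)
```

With a reader that never reports end-of-file, the model runs until its step budget is exhausted. With a reader that always returns nothing, it leaves through the goto path after a single read, which matches the single read with "ret": 0 followed by unlink in the patched trace. The cost is that the copy loop and the "./do & 2>&1" call are never reached, which is one concrete example of behavior hidden by the patch.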
This modification allows us to get deeper:
"trace": [
{
"name": "main",
"args": [
"mirai002",
"<BV128 arg0_0_128>"
],
"addr": 134513000,
"ret": "symbolic",
"addr_func": 134513000
},
{
"name": "ioctl",
"args": [
0,
21505,
""
],
"addr_func": "0x8300036",
"addr": "0x80549e5",
"ret": "<BV32 unconstrained_ret_ioctl_98303_32{UNINITIALIZED}>"
},
{
"name": "ioctl",
"args": [
1,
21505,
""
],
"addr_func": "0x8300036",
"addr": "0x80549e5",
"ret": "<BV32 unconstrained_ret_ioctl_98305_32{UNINITIALIZED}>"
},
{
"name": "write",
"args": [
1,
"MIRAI\n",
6
],
"addr_func": "0x8300004",
"addr": "0x804f69a",
"ret": 6
},
{
"name": "open",
"args": [
"do",
577,
511
],
"addr_func": "0x8300005",
"addr": "0x804f6ee",
"ret": "0x3"
},
{
"name": "socket",
"args": [
2,
1,
0
],
"addr_func": "0x8300066",
"addr": "0x804f72a",
"ret": 4
},
{
"name": "connect",
"args": [
4,
2147417692,
16
],
"addr_func": "0x8300066",
"addr": "0x804f777",
"ret": 0
},
{
"name": "write",
"args": [
4,
"GET /do/do.x86 HTTP/1.0\r\n\r\n",
27
],
"addr_func": "0x8300004",
"addr": "0x804f79f",
"ret": 27
},
{
"name": "read",
"args": [
4,
2147417766,
1
],
"addr_func": "0x8300003",
"addr": "0x804fd1b",
"ret": 0
},
{
"name": "unlink",
"args": [
"mirai002"
],
"addr_func": "0x830000a",
"addr": "0x804f7bc",
"ret": -2
},
{
"name": "rt_sigprocmask",
"args": [
0,
2147417188,
0,
8
],
"addr_func": "0x83000af",
"addr": "0x804f7eb",
"ret": 0
},
{
"name": "rt_sigaction",
"args": [
17,
2147415496
],
"addr_func": "0x83000ae",
"addr": "0x805680f",
"ret": "0x0"
},
{
"name": "rt_sigaction",
"args": [
5,
2147415496
],
"addr_func": "0x83000ae",
"addr": "0x805680f",
"ret": "0x0"
},
{
"name": "open",
"args": [
"/dev/watchdog",
2,
0
],
"addr_func": "0x8300005",
"addr": "0x804f82a",
"ret": "0x5"
},
{
"name": "ioctl",
"args": [
5,
2147768068,
"\u0001"
],
"addr_func": "0x8300036",
"addr": "0x804f85d",
"ret": "<BV32 unconstrained_ret_ioctl_98308_32{UNINITIALIZED}>"
},
{
"name": "close",
"args": [
5
],
"addr_func": "0x8300006",
"addr": "0x804f86d",
"ret": 0
},
{
"name": "chdir",
"args": [
"/"
],
"addr_func": "0x830000c",
"addr": "0x804f883",
"ret": 0
},
{
"name": "socket",
"args": [
2,
2,
0
],
"addr_func": "0x8300066",
"addr": "0x8054cb6",
"ret": 5
},
{
"name": "connect",
"args": [
5,
2147415936,
16
],
"addr_func": "0x8300066",
"addr": "0x8054adb",
"ret": 0
},
{
"name": "getsockname",
"args": [
5,
2147415936,
2147415952
],
"addr_func": "0x8300066",
"addr": "0x8054b06",
"ret": 0
},
{
"name": "close",
"args": [
5
],
"addr_func": "0x8300006",
"addr": "0x80539f4",
"ret": 0
}
]
},
"3": {
"status": "active",
"trace": [
{
"name": "main",
"args": [
"mirai002",
"<BV128 arg0_0_128>"
],
"addr": 134513000,
"ret": "symbolic",
"addr_func": 134513000
},
{
"name": "ioctl",
"args": [
0,
21505,
""
],
"addr_func": "0x8300036",
"addr": "0x80549e5",
"ret": "<BV32 unconstrained_ret_ioctl_98303_32{UNINITIALIZED}>"
},
{
"name": "ioctl",
"args": [
1,
21505,
""
],
"addr_func": "0x8300036",
"addr": "0x80549e5",
"ret": "<BV32 unconstrained_ret_ioctl_98305_32{UNINITIALIZED}>"
},
{
"name": "write",
"args": [
1,
"MIRAI\n",
6
],
"addr_func": "0x8300004",
"addr": "0x804f69a",
"ret": 6
},
{
"name": "open",
"args": [
"do",
577,
511
],
"addr_func": "0x8300005",
"addr": "0x804f6ee",
"ret": "0x3"
},
{
"name": "socket",
"args": [
2,
1,
0
],
"addr_func": "0x8300066",
"addr": "0x804f72a",
"ret": 4
},
{
"name": "connect",
"args": [
4,
2147417692,
16
],
"addr_func": "0x8300066",
"addr": "0x804f777",
"ret": 0
},
{
"name": "write",
"args": [
4,
"GET /do/do.x86 HTTP/1.0\r\n\r\n",
27
],
"addr_func": "0x8300004",
"addr": "0x804f79f",
"ret": 27
},
{
"name": "read",
"args": [
4,
2147417766,
1
],
"addr_func": "0x8300003",
"addr": "0x804fd1b",
"ret": 0
},
{
"name": "unlink",
"args": [
"mirai002"
],
"addr_func": "0x830000a",
"addr": "0x804f7bc",
"ret": -2
},
{
"name": "rt_sigprocmask",
"args": [
0,
2147417188,
0,
8
],
"addr_func": "0x83000af",
"addr": "0x804f7eb",
"ret": 0
},
{
"name": "rt_sigaction",
"args": [
17,
2147415496
],
"addr_func": "0x83000ae",
"addr": "0x805680f",
"ret": "0x0"
},
{
"name": "rt_sigaction",
"args": [
5,
2147415496
],
"addr_func": "0x83000ae",
"addr": "0x805680f",
"ret": "0x0"
},
{
"name": "open",
"args": [
"/dev/watchdog",
2,
0
],
"addr_func": "0x8300005",
"addr": "0x804f82a",
"ret": "0x5"
},
{
"name": "ioctl",
"args": [
5,
2147768068,
"\u0001"
],
"addr_func": "0x8300036",
"addr": "0x804f85d",
"ret": "<BV32 unconstrained_ret_ioctl_98309_32{UNINITIALIZED}>"
},
{
"name": "close",
"args": [
5
],
"addr_func": "0x8300006",
"addr": "0x804f86d",
"ret": 0
},
{
"name": "chdir",
"args": [
"/"
],
"addr_func": "0x830000c",
"addr": "0x804f883",
"ret": 0
},
{
"name": "socket",
"args": [
2,
2,
0
],
"addr_func": "0x8300066",
"addr": "0x8054cb6",
"ret": 5
},
{
"name": "connect",
"args": [
5,
2147415936,
16
],
"addr_func": "0x8300066",
"addr": "0x8054adb",
"ret": 0
},
{
"name": "getsockname",
"args": [
5,
2147415936,
2147415952
],
"addr_func": "0x8300066",
"addr": "0x8054b06",
"ret": 0
},
{
"name": "close",
"args": [
5
],
"addr_func": "0x8300006",
"addr": "0x80539f4",
"ret": 0
}
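The numeric arguments in these traces are easier to read once decoded. The sketch below is an interpretation aid only: the helper names are hypothetical, the flag values are the Linux x86 ones hard-coded by assumption, and the ioctl field layout is the generic Linux _IOC encoding.

```python
# Linux x86 open(2) flag values, hard-coded so the sketch does not depend
# on the host operating system.
OPEN_FLAG_BITS = [
    (0o1, "O_WRONLY"),
    (0o2, "O_RDWR"),
    (0o100, "O_CREAT"),
    (0o1000, "O_TRUNC"),
    (0o2000, "O_APPEND"),
]


def decode_open_flags(flags):
    """Return a symbolic form of an open() flags argument."""
    names = [name for bit, name in OPEN_FLAG_BITS if flags & bit]
    return "|".join(names) if names else "O_RDONLY"


def decode_ioctl_request(request):
    """Split an ioctl request number into its _IOC fields."""
    return {
        "dir": (request >> 30) & 0x3,
        "size": (request >> 16) & 0x3FFF,
        "type": chr((request >> 8) & 0xFF),
        "nr": request & 0xFF,
    }


print(decode_open_flags(577), oct(511))       # open("do", 577, 511)
print(decode_open_flags(2))                   # open("/dev/watchdog", 2, 0)
print(hex(21505), decode_ioctl_request(21505))
print(hex(2147768068), decode_ioctl_request(2147768068))
print(hex(134513000))                         # "addr" of main
```

Read this way, open("do", 577, 511) creates or truncates the file "do" for writing with mode 0o777. The request 21505 is 0x5401, which appears to correspond to TCGETS on stdin and stdout, and 2147768068 is 0x80045704, a type 'W' request number 4 with a 4-byte argument, which appears to correspond to the watchdog WDIOC_SETOPTIONS request issued on /dev/watchdog.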
More refinement of the SimProcedure is needed. You can check out the Windows version of read(), which is more evolved. Note that the Linux part is less developed than the Windows one.
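One possible direction for such a refinement, shown purely as an illustration and not as what the Windows version does, is to let a few reads succeed on each descriptor before reporting end-of-file, so that both the copy loop and the code after it get explored. `BoundedReadPolicy` and its `budget` parameter are hypothetical names.

```python
class BoundedReadPolicy:
    """Decide what a modeled read() should return for each descriptor.

    The first `budget` reads on a descriptor return the requested length;
    every later read returns 0, i.e. end-of-file.
    """

    def __init__(self, budget=8):
        self.budget = budget
        self.calls = {}  # fd -> number of reads seen so far

    def next_return(self, fd, length):
        seen = self.calls.get(fd, 0)
        self.calls[fd] = seen + 1
        return length if seen < self.budget else 0


policy = BoundedReadPolicy(budget=2)
results = [policy.next_return(4, 128) for _ in range(4)]
print(results)
```

In a symbolic execution setting the counter would probably have to live in per-state storage (angr's `state.globals` is one candidate) rather than in a shared object, because states fork and each path needs its own count. That wiring is left out here.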
Thank you very much for the detailed step-by-step guidance. I followed your process carefully and obtained the result you described, which resolves my issue for now. As you suggest, I will try to extend the number of function summaries to cover more basic blocks.
Many thanks once again!