The Loongson 3A4000 (link in Chinese) has built-in crypto acceleration support, but unfortunately, no public documentation has been released despite repeated community requests.
This repository serves to consolidate the community effort of 3A4000 crypto acceleration reverse-engineering.
- The MIPS-provided Codescape GNU Toolchain 2019.09-02
- Loongnix's binutils 2.28-13.fc21.loongson.5 binary
- A Loongson 3A4000-powered system to experiment with
The Loongson-modified binutils package has no source available at time of this writing. The Codescape SDK has full sources available, though.
Mnemonic | Instruction Description |
---|---|
aes128.enc |
AES-128 Encrypt One Block |
aes128.dec |
AES-128 Decrypt One Block |
aes192.enc |
AES-192 Encrypt One Block |
aes192.dec |
AES-192 Decrypt One Block |
aes256.enc |
AES-256 Encrypt One Block |
aes256.dec |
AES-256 Decrypt One Block |
sha1.hash.4r |
SHA-1 Hash, 4 Rounds |
sha1.ms.1 |
SHA-1 Message Schedule, First Half |
sha1.ms.2 |
SHA-1 Message Schedule, Second Half |
sha256.hash.2r |
SHA-256 Hash, 2 Rounds |
sha256.ms.1 |
SHA-256 Message Schedule, First Half |
sha256.ms.2 |
SHA-256 Message Schedule, Second Half |
There are some other crypto-related instructions present in the Codescape SDK binutils. Behavior of these instructions are not fully known yet.
OPCODE WS WD MINOR_OP
011110 00000 00000 xxxxx xxxxx 010111
Format: aes128.enc wd, ws
Purpose: AES-128 Encrypt One Block
Operation:
plaintext <- WR[wd]
key <- WR[ws]
WR[wd] <- AES-128-ECB-Encrypt(plaintext, key)
OPCODE WS WD MINOR_OP
011110 00000 00001 xxxxx xxxxx 010111
Format: aes128.dec wd, ws
Purpose: AES-128 Decrypt One Block
Operation:
ciphertext <- WR[wd]
key <- WR[ws]
WR[wd] <- AES-128-ECB-Decrypt(ciphertext, key)
OPCODE WT WS WD MINOR_OP
011110 00010 xxxxx xxxxx xxxxx 010111
Format: aes192.enc wd, ws, wt
Purpose: AES-192 Encrypt One Block
Operation:
plaintext <- WR[wd]
key <- WR[ws] || WR[wt]_63..0
WR[wd] <- AES-192-ECB-Encrypt(plaintext, key)
OPCODE WT WS WD MINOR_OP
011110 00011 xxxxx xxxxx xxxxx 010111
Format: aes192.dec wd, ws, wt
Purpose: AES-192 Decrypt One Block
Operation:
ciphertext <- WR[wd]
key <- WR[ws] || WR[wt]_63..0
WR[wd] <- AES-192-ECB-Decrypt(ciphertext, key)
OPCODE WT WS WD MINOR_OP
011110 00100 xxxxx xxxxx xxxxx 010111
Format: aes256.enc wd, ws, wt
Purpose: AES-256 Encrypt One Block
Operation:
plaintext <- WR[wd]
key <- WR[ws] || WR[wt]
WR[wd] <- AES-256-ECB-Encrypt(plaintext, key)
OPCODE WT WS WD MINOR_OP
011110 00101 xxxxx xxxxx xxxxx 010111
Format: aes256.dec wd, ws, wt
Purpose: AES-256 Decrypt One Block
Operation:
ciphertext <- WR[wd]
key <- WR[ws] || WR[wt]
WR[wd] <- AES-256-ECB-Decrypt(ciphertext, key)
OPCODE WT WS WD MINOR_OP
011110 10000 xxxxx xxxxx xxxxx 010111
Format: sha1.hash.4r wd, ws, wt
Purpose: SHA-1 Hash, 4 Rounds
Description:
Performs 4 rounds of SHA-1. Round function and constant is built-in, selectable by input operand.
Operation:
w0 <- WR[wt]_31..0
w1 <- WR[wt]_63..32
w2 <- WR[wt]_95..64
w3 <- WR[wt]_127..96
a <- WR[ws]_31..0
b <- WR[ws]_63..32
c <- WR[ws]_95..64
d <- WR[ws]_127..96
e <- WR[wd]_31..0
sel <- WR[wd]_65..64
dont_rotate_e <- WR[wd]_66
if not dont_rotate_e then
e <- e rol 30
endif
if sel == 0 then
f <- lambda b, c, d: (b and c) or ((not b) and d)
k <- 0x5A827999
elseif sel == 1 then
f <- lambda b, c, d: b xor c xor d
k <- 0x6ED9EBA1
elseif sel == 2 then
f <- lambda b, c, d: (b and c) or (b and d) or (c and d)
k <- 0x8F1BBCDC
elseif sel == 3 then
f <- lambda b, c, d: b xor c xor d
k <- 0xCA62C1D6
endif
temp <- (a rol 5) + f(b, c, d) + e + k + w0
e <- d
d <- c
c <- b rol 30
b <- a
a <- temp
temp <- (a rol 5) + f(b, c, d) + e + k + w1
e <- d
d <- c
c <- b rol 30
b <- a
a <- temp
temp <- (a rol 5) + f(b, c, d) + e + k + w2
e <- d
d <- c
c <- b rol 30
b <- a
a <- temp
temp <- (a rol 5) + f(b, c, d) + e + k + w3
e <- d
d <- c
c <- b rol 30
b <- a
a <- temp
WR[wd]_0..31 <- a
WR[wd]_63..32 <- b
WR[wd]_95..64 <- c
WR[wd]_127..96 <- d
OPCODE WT WS WD MINOR_OP
011110 11000 xxxxx xxxxx xxxxx 010111
Format: sha1.ms.1 wd, ws, wt
Purpose: SHA-1 Message Schedule, First Half
Operation:
WR[wd]_31..0 <- WR[wd]_31..0 xor WR[wd]_95..64 xor WR[wt]_31..0
WR[wd]_63..32 <- WR[wd]_63..32 xor WR[wd]_127..96 xor WR[wt]_63..32
WR[wd]_95..64 <- WR[wd]_95..64 xor WR[ws]_31..0 xor WR[wt]_95..64
WR[wd]_127..96 <- WR[wd]_127..96 xor WR[ws]_63..32 xor WR[wt]_127..96
OPCODE WS WD MINOR_OP
011110 00000 11001 xxxxx xxxxx 010111
Format: sha1.ms.2 wd, ws
Purpose: SHA-1 Message Schedule, Second Half
Operation:
WR[wd]_31..0 <- (WR[wd]_31..0 rol 1) xor (WR[ws]_63..32 rol 1)
WR[wd]_63..32 <- (WR[wd]_63..32 rol 1) xor (WR[ws]_95..64 rol 1)
WR[wd]_95..64 <- (WR[wd]_95..64 rol 1) xor (WR[ws]_127..96 rol 1)
WR[wd]_127..96 <- (WR[wd]_127..96 rol 1) xor (WR[wd]_31..0 rol 1)
Notes:
- The opcode of this instruction on Loongson 3A4000 is swapped with
aes.sr.dec
in upstream MIPS code.
OPCODE WT WS WD MINOR_OP
011110 10010 xxxxx xxxxx xxxxx 010111
Format: sha256.hash.2r wd, ws, wt
Purpose: SHA-256 Hash, 2 Rounds
Description:
Performs 2 rounds of SHA-256. K constants are not built in, these have to be manually added prior to invocation.
Operation:
a <- WR[ws]_31..0
b <- WR[ws]_63..32
e <- WR[ws]_95..64
f <- WR[ws]_127..96
c <- WR[wd]_31..0
d <- WR[wd]_63..32
g <- WR[wd]_95..64
h <- WR[wd]_127..96
w0 <- WR[wt]_31..0
w1 <- WR[wt]_63..32
S0 <- lambda a: (a ror 2) xor (a ror 13) xor (a ror 22)
S1 <- lambda e: (e ror 6) xor (e ror 11) xor (e ror 25)
ch <- lambda e, f, g: (e and f) xor ((not e) and g)
maj <- lambda a, b, c: (a and b) xor (a and c) xor (b and c)
temp1 <- h + S1(e) + ch(e, f, g) + w0
temp2 <- S0(a) + maj(a, b, c)
h <- g
g <- f
f <- e
e <- d + temp1
d <- c
c <- b
b <- a
a <- temp1 + temp2
temp1 <- h + S1(e) + ch(e, f, g) + w1
temp2 <- S0(a) + maj(a, b, c)
h <- g
g <- f
f <- e
e <- d + temp1
d <- c
c <- b
b <- a
a <- temp1 + temp2
WR[wd]_0..31 <- a
WR[wd]_63..32 <- b
WR[wd]_95..64 <- e
WR[wd]_127..96 <- f
OPCODE WT WS WD MINOR_OP
011110 11010 xxxxx xxxxx xxxxx 010111
Format: sha256.ms.1 wd, ws, wt
Purpose: SHA-256 Message Schedule, First Half
Operation:
a <- WR[ws]_31..0
b <- WR[ws]_63..32
e <- WR[ws]_95..64
f <- WR[ws]_127..96
e <- WR[wt]_31..0
s0 <- lambda x: (x ror 7) xor (x ror 18) xor (x >> 3)
WR[wd]_31..0 <- a + s0(b)
WR[wd]_63..32 <- b + s0(c)
WR[wd]_95..64 <- c + s0(d)
WR[wd]_127..96 <- d + s0(e)
OPCODE WT WS WD MINOR_OP
011110 11011 xxxxx xxxxx xxxxx 010111
Format: sha256.ms.1 wd, ws, wt
Purpose: SHA-256 Message Schedule, Second Half
Operation:
x1 <- WR[ws]_63..32
x2 <- WR[ws]_95..64
x3 <- WR[ws]_127..96
y0 <- WR[wt]_31..0
y2 <- WR[wt]_95..64
y3 <- WR[wt]_127..96
z0 <- WR[wd]_31..0
z1 <- WR[wd]_63..32
z2 <- WR[wd]_95..64
z3 <- WR[wd]_127..96
s1 <- lambda x: (x ror 17) xor (x ror 19) xor (x >> 10)
z0 <- z0 + x1 + s1(y2)
z1 <- z1 + x2 + s1(y3)
z2 <- z2 + x3 + s1(z0)
z3 <- z3 + y0 + s1(z1)
WR[wd]_31..0 <- z0
WR[wd]_63..32 <- z1
WR[wd]_95..64 <- z2
WR[wd]_127..96 <- z3
This document is placed under CC0.