geerlingguy/sbc-reviews

LattePanda Mu

lattepanda-mu

Basic information

  • Board URL (official): https://www.lattepanda.com/lattepanda-mu
  • Board purchased from: (Provided for review by LattePanda)
  • Board purchase date: 2024-05-02
  • Board specs (as tested): 8G RAM / 64G eMMC
  • Board price (as tested): $139

Linux/system information

# output of `neofetch`
jgeerling@lattepanda-mu:~/tinymembench$ neofetch
            .-/+oossssoo+/-.               jgeerling@lattepanda-mu 
        `:+ssssssssssssssssss+:`           ----------------------- 
      -+ssssssssssssssssssyyssss+-         OS: Ubuntu 22.04.2 LTS x86_64 
    .ossssssssssssssssssdMMMNysssso.       Host: ADL-N 
   /ssssssssssshdmmNNmmyNMMMMhssssss/      Kernel: 6.5.0-28-generic 
  +ssssssssshmydMMMMMMMNddddyssssssss+     Uptime: 30 mins 
 /sssssssshNMMMyhhyyyyhmNMMMNhssssssss/    Packages: 1641 (dpkg), 9 (snap) 
.ssssssssdMMMNhsssssssssshNMMMdssssssss.   Shell: bash 5.1.16 
+sssshhhyNMMNyssssssssssssyNMMMysssssss+   Resolution: 1920x1080 
ossyNMMMNyMMhsssssssssssssshmmmhssssssso   Terminal: /dev/pts/1 
ossyNMMMNyMMhsssssssssssssshmmmhssssssso   CPU: Intel N100 (4) @ 3.400GHz 
+sssshhhyNMMNyssssssssssssyNMMMysssssss+   GPU: Intel Device 46d1 
.ssssssssdMMMNhsssssssssshNMMMdssssssss.   Memory: 966MiB / 7640MiB 
 /sssssssshNMMMyhhyyyyhdNMMMNhssssssss/
  +sssssssssdmydMMMMMMMMddddyssssssss+                             
   /ssssssssssshdmNNNNmyNMMMMhssssss/                              
    .ossssssssssssssssssdMMMNysssso.
      -+sssssssssssssssssyyyssss+-
        `:+ssssssssssssssssss+:`
            .-/+oossssoo+/-.

# output of `uname -a`
Linux lattepanda-mu 6.5.0-28-generic #29~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Apr  4 14:39:20 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Benchmark results

CPU

Power

  • Idle power draw (at wall): 5.5 W
  • Maximum simulated power draw (stress-ng --matrix 0): 22.1 W
  • During Geekbench multicore benchmark: 25 W
  • During top500 HPL benchmark: 25 W

Disk

Built-in 64GB eMMC storage

Benchmark                     Result
iozone 4K random read         26.33 MB/s
iozone 4K random write        32.84 MB/s
iozone 1M random read         271.25 MB/s
iozone 1M random write        105.48 MB/s
iozone 1M sequential read     270.56 MB/s
iozone 1M sequential write    106.40 MB/s

wget https://raw.githubusercontent.com/geerlingguy/pi-cluster/master/benchmarks/disk-benchmark.sh
chmod +x disk-benchmark.sh
sudo MOUNT_PATH=/ TEST_SIZE=1g ./disk-benchmark.sh

Run benchmark on any attached storage device (e.g. eMMC, microSD, NVMe, SATA) and add results under an additional heading.

Also consider running PiBenchmarks.com script.

Network

iperf3 results:

  • iperf3 -c $SERVER_IP: 942 Mbps
  • iperf3 --reverse -c $SERVER_IP: 838 Mbps
  • iperf3 --bidir -c $SERVER_IP: 937 Mbps up, 544 Mbps down
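
For context on the 942 Mbps figure, a rough ceiling for TCP throughput over a 1 Gbps link can be worked out from per-frame overhead. This is only a sketch: it assumes a standard 1500-byte MTU, IPv4 without options, and TCP timestamps enabled, none of which were recorded in the results above. The function name `tcp_goodput_mbps` is made up for illustration.

```shell
# Estimate the TCP payload rate achievable on an Ethernet link.
# Assumptions (not from the test results): 1500-byte MTU, IPv4 without
# options, TCP timestamps enabled (12 bytes of TCP options).
tcp_goodput_mbps() {
  LC_ALL=C awk -v link="$1" 'BEGIN {
    mtu     = 1500             # IP packet bytes carried per frame
    l2      = 8 + 14 + 4 + 12  # preamble+SFD, Ethernet header, FCS, inter-frame gap
    headers = 20 + 20 + 12     # IPv4 header, TCP header, TCP timestamp option
    printf "%.1f\n", link * (mtu - headers) / (mtu + l2)
  }'
}

tcp_goodput_mbps 1000   # prints 941.5
```

Under those assumptions the forward result is essentially line rate, while the 838 Mbps reverse result leaves some headroom.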

(Be sure to test all interfaces, noting any that are non-functional.)

GPU

glmark2-es2 results:

=======================================================
    glmark2 2021.02
=======================================================
    OpenGL Information
    GL_VENDOR:     Intel
    GL_RENDERER:   Mesa Intel(R) Graphics (ADL-N)
    GL_VERSION:    OpenGL ES 3.2 Mesa 22.2.5
=======================================================
[build] use-vbo=false: FPS: 2202 FrameTime: 0.454 ms
[build] use-vbo=true: FPS: 2251 FrameTime: 0.444 ms
[texture] texture-filter=nearest: FPS: 2441 FrameTime: 0.410 ms
[texture] texture-filter=linear: FPS: 2426 FrameTime: 0.412 ms
[texture] texture-filter=mipmap: FPS: 2440 FrameTime: 0.410 ms
[shading] shading=gouraud: FPS: 2052 FrameTime: 0.487 ms
[shading] shading=blinn-phong-inf: FPS: 2042 FrameTime: 0.490 ms
[shading] shading=phong: FPS: 1830 FrameTime: 0.546 ms
[shading] shading=cel: FPS: 1802 FrameTime: 0.555 ms
[bump] bump-render=high-poly: FPS: 1450 FrameTime: 0.690 ms
[bump] bump-render=normals: FPS: 2463 FrameTime: 0.406 ms
[bump] bump-render=height: FPS: 2417 FrameTime: 0.414 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 1742 FrameTime: 0.574 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 990 FrameTime: 1.010 ms
[pulsar] light=false:quads=5:texture=false: FPS: 2049 FrameTime: 0.488 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 846 FrameTime: 1.182 ms
[desktop] effect=shadow:windows=4: FPS: 1337 FrameTime: 0.748 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 763 FrameTime: 1.311 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 1022 FrameTime: 0.978 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 1030 FrameTime: 0.971 ms
[ideas] speed=duration: FPS: 1807 FrameTime: 0.553 ms
[jellyfish] <default>: FPS: 1374 FrameTime: 0.728 ms
[terrain] <default>: FPS: 194 FrameTime: 5.155 ms
[shadow] <default>: FPS: 1624 FrameTime: 0.616 ms
[refract] <default>: FPS: 477 FrameTime: 2.096 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 1969 FrameTime: 0.508 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 1967 FrameTime: 0.508 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 1969 FrameTime: 0.508 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 1968 FrameTime: 0.508 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 1965 FrameTime: 0.509 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 1820 FrameTime: 0.549 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 1951 FrameTime: 0.513 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 1972 FrameTime: 0.507 ms
=======================================================
                                  glmark2 Score: 1716 
=======================================================
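
As a sanity check on the numbers above, the reported frame times are simply 1000 / FPS, and the final score appears to match the truncated mean of the per-scene FPS values. The helpers below are a minimal sketch for checking that; the function names are hypothetical and not part of glmark2.

```shell
# Mean of the per-scene FPS values in glmark2 output read from stdin,
# truncated to an integer.
glmark2_mean_fps() {
  LC_ALL=C awk '
    { for (i = 1; i < NF; i++) if ($i == "FPS:") { sum += $(i + 1); n++ } }
    END { if (n > 0) printf "%d\n", int(sum / n) }'
}

# Frame time in milliseconds for a given FPS value.
frame_time_ms() {
  LC_ALL=C awk -v fps="$1" 'BEGIN { printf "%.3f\n", 1000 / fps }'
}

frame_time_ms 2202   # [build] use-vbo=false, prints 0.454
frame_time_ms 194    # [terrain], prints 5.155
```

Saving the output above to a file (say `glmark2.txt`, a hypothetical name) and running `glmark2_mean_fps < glmark2.txt` averages the 33 scenes to 1716, the same as the reported score.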

Note: This benchmark requires an active display on the device. Not all devices may be able to run glmark2-es2, so in that case, make a note and move on!

TODO: See this issue for discussion about a full suite of standardized GPU benchmarks.

Memory

tinymembench results:

tinymembench v0.4.10 (simple benchmark for memory throughput and latency)

==========================================================================
== Memory bandwidth tests                                               ==
==                                                                      ==
== Note 1: 1MB = 1000000 bytes                                          ==
== Note 2: Results for 'copy' tests show how many bytes can be          ==
==         copied per second (adding together read and writen           ==
==         bytes would have provided twice higher numbers)              ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
==         to first fetch data into it, and only then write it to the   ==
==         destination (source -> L1 cache, L1 cache -> destination)    ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
==         brackets                                                     ==
==========================================================================

 C copy backwards                                     :   8198.4 MB/s
 C copy backwards (32 byte blocks)                    :   8208.1 MB/s
 C copy backwards (64 byte blocks)                    :   8261.9 MB/s
 C copy                                               :   7887.9 MB/s
 C copy prefetched (32 bytes step)                    :   4901.8 MB/s (0.2%)
 C copy prefetched (64 bytes step)                    :   5062.2 MB/s
 C 2-pass copy                                        :   6949.3 MB/s
 C 2-pass copy prefetched (32 bytes step)             :   3770.8 MB/s
 C 2-pass copy prefetched (64 bytes step)             :   3773.3 MB/s
 C fill                                               :  10764.4 MB/s (0.1%)
 C fill (shuffle within 16 byte blocks)               :  10727.3 MB/s
 C fill (shuffle within 32 byte blocks)               :  10721.1 MB/s
 C fill (shuffle within 64 byte blocks)               :  10724.3 MB/s
 ---
 standard memcpy                                      :  10790.2 MB/s
 standard memset                                      :  11019.0 MB/s
 ---
 MOVSB copy                                           :   8201.1 MB/s
 MOVSD copy                                           :   8202.8 MB/s
 SSE2 copy                                            :   8201.2 MB/s
 SSE2 nontemporal copy                                :  10945.0 MB/s
 SSE2 copy prefetched (32 bytes step)                 :   6560.1 MB/s
 SSE2 copy prefetched (64 bytes step)                 :   6840.2 MB/s
 SSE2 nontemporal copy prefetched (32 bytes step)     :   7616.0 MB/s (0.1%)
 SSE2 nontemporal copy prefetched (64 bytes step)     :   8108.0 MB/s (0.2%)
 SSE2 2-pass copy                                     :   6572.6 MB/s
 SSE2 2-pass copy prefetched (32 bytes step)          :   4806.6 MB/s
 SSE2 2-pass copy prefetched (64 bytes step)          :   4963.1 MB/s
 SSE2 2-pass nontemporal copy                         :   3277.9 MB/s
 SSE2 fill                                            :  11019.0 MB/s
 SSE2 nontemporal fill                                :  19722.2 MB/s

==========================================================================
== Memory latency test                                                  ==
==                                                                      ==
== Average time is measured for random memory accesses in the buffers   ==
== of different sizes. The larger is the buffer, the more significant   ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM      ==
== accesses. For extremely large buffer sizes we are expecting to see   ==
== page table walk with several requests to SDRAM for almost every      ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest).                                         ==
==                                                                      ==
== Note 1: All the numbers are representing extra time, which needs to  ==
==         be added to L1 cache latency. The cycle timings for L1 cache ==
==         latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
==         two independent memory accesses at a time. In the case if    ==
==         the memory subsystem can't handle multiple outstanding       ==
==         requests, dual random read has the same timings as two       ==
==         single reads performed one after another.                    ==
==========================================================================

block size : single random read / dual random read, [MADV_NOHUGEPAGE]
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    2.5 ns          /     3.7 ns 
    131072 :    3.8 ns          /     4.6 ns 
    262144 :    5.1 ns          /     6.0 ns 
    524288 :    6.4 ns          /     7.2 ns 
   1048576 :    7.0 ns          /     7.5 ns 
   2097152 :    7.6 ns          /     7.9 ns 
   4194304 :   13.7 ns          /    16.8 ns 
   8388608 :   28.6 ns          /    40.5 ns 
  16777216 :   77.1 ns          /   110.5 ns 
  33554432 :  108.5 ns          /   136.6 ns 
  67108864 :  120.0 ns          /   134.1 ns 

block size : single random read / dual random read, [MADV_HUGEPAGE]
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    2.5 ns          /     3.7 ns 
    131072 :    3.8 ns          /     4.6 ns 
    262144 :    4.4 ns          /     4.9 ns 
    524288 :    4.7 ns          /     5.0 ns 
   1048576 :    4.9 ns          /     5.0 ns 
   2097152 :    5.1 ns          /     5.1 ns 
   4194304 :   11.2 ns          /    14.3 ns 
   8388608 :   25.2 ns          /    37.1 ns 
  16777216 :   71.5 ns          /   103.7 ns 
  33554432 :   99.3 ns          /   124.5 ns 
  67108864 :  112.1 ns          /   131.2 ns 
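
One way to read the latency tables is to look for the block sizes where latency first appears or jumps sharply, since those tend to line up with cache boundaries. The sketch below flags such rows; the function name `latency_steps` and the 1.7x threshold are arbitrary choices for illustration, not part of tinymembench.

```shell
# Print block sizes where single-read latency first becomes non-zero or
# grows by at least the given ratio over the previous row.
# The default ratio of 1.7 is an arbitrary threshold for illustration.
latency_steps() {
  LC_ALL=C awk -v ratio="${1:-1.7}" '
    $2 == ":" && $1 ~ /^[0-9]+$/ {
      cur = $3 + 0
      if ((prev == 0 && cur > 0) || (prev > 0 && cur / prev >= ratio))
        print $1
      prev = cur
    }'
}

# Rows copied from the MADV_NOHUGEPAGE table above.
latency_steps <<'EOF'
     32768 :    0.0 ns          /     0.0 ns
     65536 :    2.5 ns          /     3.7 ns
    131072 :    3.8 ns          /     4.6 ns
    262144 :    5.1 ns          /     6.0 ns
    524288 :    6.4 ns          /     7.2 ns
   1048576 :    7.0 ns          /     7.5 ns
   2097152 :    7.6 ns          /     7.9 ns
   4194304 :   13.7 ns          /    16.8 ns
   8388608 :   28.6 ns          /    40.5 ns
  16777216 :   77.1 ns          /   110.5 ns
  33554432 :  108.5 ns          /   136.6 ns
  67108864 :  120.0 ns          /   134.1 ns
EOF
# prints 65536, 4194304, 8388608 and 16777216, one per line
```

Those steps would be consistent with a 32 KiB L1 data cache, a 2 MiB L2, and a last-level cache of a few MiB before accesses fall through to DRAM. The cache sizes are my reading of the data, not something stated in the benchmark output.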

sbc-bench results

Results: https://sprunge.us/uHzXI7

wget https://raw.githubusercontent.com/ThomasKaiser/sbc-bench/master/sbc-bench.sh
sudo /bin/bash ./sbc-bench.sh -r

Phoronix Test Suite

Results from pi-general-benchmark.sh:

  • pts/encode-mp3: 9.357 sec
  • pts/x264 4K: 6.82 fps
  • pts/x264 1080p: 29.64 fps
  • pts/phpbench: 740012
  • pts/build-linux-kernel (defconfig): 444.886 sec

A few notes from unboxing and assembling this SBC with its Lite Carrier ($40) and the official Aluminum Active Cooler ($12):

  • Custom 260-pin SO-DIMM edge pinout, BIOS, board layout examples, mechanical drawings, etc. are provided in the LattePanda-Mu GitHub project. It is provided under the MIT license, so examples can easily be modified for your own purposes.
    • Note: This is great, especially having the official carrier boards' design files included, because it immediately gives devs who can use KiCAD enough data to go on to customize an existing reference design, just like Raspberry Pi did with their CM4 carrier board.
  • The full kit looks like it includes a nice printed manual (as documented at 11:40 in Technically Unsure's video).
  • The Lite board includes an open-ended PCIe Gen 3 x4 slot, which apparently is only powered when using the 12V DC barrel jack, not the 15V USB-C PD input.
  • The Lite board includes 2x USB 3.0, 2x USB 2.0, Ethernet (1 Gbps), HDMI, and 12V barrel plug or 15V USB-C PD power input.
  • The active cooler fits over the rather large (compared to typical SBCs) N100 chip using three spring-loaded screws. I would not crank the screws down 100% tight, as there is ever-so-slight board flex if you attempt that.
  • DFRobot intends to make a higher-RAM variant at some point in the future (16GB was mentioned).
  • The official web page for the product lists a few potential future expansion boards: a NAS Carrier with 10 GbE (4x?) and 8 NVMe M.2 2280 slots, a Router Carrier with 4x Ethernet jacks with PoE power output (maybe?), a Graphics Carrier with MXM, SXM, OAM, or general PCIe cards, and a Cluster Carrier for multiple Mu boards. The illustrations show extremely basic designs lacking networking, power circuitry, etc., so they are likely quick mockups of potential future products, based on what's stuck for things like the Pi Compute Module 4.
  • Spitballing... this module might be the easiest way for someone to build their own custom Intel-based PC, and with decent baseline specs. It'd be awesome if DFRobot can commit to building up the hardware ecosystem around the Mu, and maybe commit to long-term production (I didn't see anything about that on their website).

As MKBHD says (paraphrased), "don't buy a product based on future promises" — but is the Mu good enough to stand on its own merits, and could an ecosystem develop around this form factor like it has around the Compute Modules?

First boot:

  • When you plug it in, it doesn't boot automatically (this may be configurable in BIOS); instead, you press PWR to turn it on.
  • I booted it from the eMMC the first time; it rebooted three times, and during the middle boot it reported a CMOS checksum error (no doubt because there was no CMOS battery installed; the board came with one, which I installed after setting up the board).
  • It booted into Windows 11 Home (no license attached)
  • In Windows: playing back 4K video on YouTube, the Intel UHD Graphics iGPU goes up to 30-35% utilization, and CPU maxes out at 100% utilization, but only a couple frames are dropped overall, no real issues there.
  • In Windows: 8 GB creates a lot of memory pressure. Just one browser tab opened to YouTube pumped up utilization to 5 GB. Two more browser tabs (CNN and Microsoft), and we hit 6 GB!
  • I flashed Ubuntu 22.04 to a USB drive, and installed a Kioxia BG4 2230-size M.2 NVMe SSD in the M.2 M-key slot, and will attempt installing Ubuntu on the LattePanda Mu

NVMe notes:

  • With the USB drive plugged in, I pressed DEL to enter the AMI BIOS.
  • Ubuntu's installer didn't see the Kioxia BG4, so I popped out of the Installer and launched Terminal.
  • I pulled the BG4 and installed a WD SN520 instead.
  • ...and BIOS didn't see that either.
  • I've emailed LattePanda to ask if there's anything special I need to do to get the onboard M.2 M-key slot working.
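
When a drive doesn't show up in the installer, a quick check from Terminal is whether an NVMe controller was enumerated on the PCIe bus at all. The snippet below is a minimal sketch; `has_nvme_controller` is a hypothetical helper name, and it relies on `lspci` labelling NVMe devices as "Non-Volatile memory controller".

```shell
# Succeeds if lspci-style output on stdin lists an NVMe controller.
has_nvme_controller() {
  grep -qi 'non-volatile memory controller'
}

# Only run the live check when lspci is installed.
if command -v lspci >/dev/null 2>&1; then
  if lspci | has_nvme_controller; then
    echo "NVMe controller visible on the PCIe bus"
  else
    echo "No NVMe controller enumerated; check slot wiring and BIOS settings"
  fi
fi
```

If the controller is missing here, the problem sits below the operating system (slot, firmware, or the drive itself) rather than in the installer.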

Ubuntu installation:

  • I decided to just overwrite the Windows installation on the 64 GB built-in eMMC.
  • Installation was simple and straightforward once I went with the eMMC.
  • Power usage fluctuated between 9-22W (with some peaks hitting 25W), in both Windows and Ubuntu—the BIOS seemed to have all turbos enabled, so no chip limits in place.
Screenshot 2024-05-03 at 2 39 47 PM

Power consumption during Geekbench 6 run:

Screenshot 2024-05-03 at 3 27 14 PM

Pibenchmarks.com test results:

     Category                  Test                      Result      
HDParm                    Disk Read                 291.80 MB/sec            
HDParm                    Cached Disk Read          281.63 MB/sec            
DD                        Disk Write                103 MB/s                 
FIO                       4k random read            26323 IOPS (105295 KB/s) 
FIO                       4k random write           21222 IOPS (84891 KB/s)  
IOZone                    4k read                   54314 KB/s               
IOZone                    4k write                  44254 KB/s               
IOZone                    4k random read            31812 KB/s               
IOZone                    4k random write           44036 KB/s               

                          Score: 10891                                       
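
The FIO rows pair an IOPS figure with a throughput figure; the two are related by the block size. A minimal sketch of that arithmetic, using a hypothetical helper `iops_to_kbps` and assuming 4 KB blocks as the test names suggest:

```shell
# Throughput in KB/s implied by an IOPS figure at a given block size in KB.
iops_to_kbps() {
  echo $(( $1 * $2 ))
}

iops_to_kbps 26323 4   # FIO 4k random read, prints 105292
iops_to_kbps 21222 4   # FIO 4k random write, prints 84888
```

Both values land within a few KB/s of the reported 105295 KB/s and 84891 KB/s, presumably because the tool rounds the IOPS figure.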

I don't see them yet on https://pibenchmarks.com/user/geerlingguy/ though.

Interestingly the device shares the 'product name' ADL-N with this Weibu Mini PC.

Power consumption while off (after I did a sudo shutdown now and waited a few hours):

Screenshot 2024-05-05 at 6 55 53 PM

Based on what I read from their schematics and Discord plus my tinkering,

  • The Lite board includes 2x USB 3.0, 2x USB 2.0, Ethernet (TODO — 1 Gbps? 2.5?), HDMI, and 12V barrel plug or 15V USB-C PD power input.

The Ethernet on the lite carrier is provided by a good old gigabit Realtek RTL8111H.

  • DFRobot intends to make a higher-RAM variant at some point in the future (16GB was mentioned).

From the "manual" included, DFRobot also has a 24GB variant listed.

  • I pulled the BG4 and installed a WD SN520 instead.
    ...and BIOS didn't see that either.

If you are using the lite carrier, that M.2 M-key slot is wired to PCIe x1. (Edited: I originally wrote that it only has the SATA signal wired.)

If you are using the lite carrier, that M.2 M-key slot only has SATA signal wired.

Oh! That's a bit crazy, I don't think I've even seen a 2230-sized SATA M.2 drive. But I guess they must exist...

[Edit: A few minutes searching Amazon... can't find one]

@geerlingguy Sorry, my fault, I stand corrected. I dug out their lite carrier schematic and checked again, because a 2230 SATA SSD doesn't make a lot of sense... (I have seen 2242 "NGFF" SATA SSDs in the late 2010s, but never 2230.)

The M-key slot is indeed connected to a PCIe x1 signal.

2024-05-08_100108
2024-05-08_100154

I received an email back from LattePanda saying:

The issue likely stems from some unknown factors causing confusion in the BIOS settings area. Numerous options are in an uncertain state, leading to abnormal operation. Specifically, for PCIe interfaces, this results in a complete halt of the PCIe REFCLK, preventing PCIe devices from functioning properly.

The solution they sent across was:

  1. Download the correct version of the BIOS branch and reflash it. Set the AFUWIN v5.16 software as shown in the image below.
  2. After the reflash is complete, shut down the computer, unplug the power cord, and remove the RTC battery. Wait 30 seconds to ensure the RTC is completely powered down and the BIOS settings are reset to default.
  3. Reinstall the RTC battery, plug in the power, and start the computer normally.

image

Were you able to get the SSD working finally? Don't think you went over that in the video. The M-key is rated for PCI-E x1.

The standards for M.2 are:

M-key is for PCIe x4 / NVMe x4 or SATA.

B-key is for PCIe x2 / NVMe x2 or SATA.

E-key is for wifi like you said.

So if I understand this right, an M.2 SATA or NVMe SSD shouldn't work in that slot.

I am also wondering what you mean by "correct version of the BIOS branch".

From their GitHub page I can see that they have this,

image

and it says that the PCIe version, which makes full use of the PCIe 3.0 lanes, is "coming soon"

image

I have the same issue, but with an older LattePanda.