I posted about ZRAM before, but I'm posting again because of my totally unscientific experiment, my personal experience, and the common question of which Linux to run on potatoes…

First, a caveat: I tweaked ZRAM for my use case(s) on my hardware; these settings might not be right for your use cases or your hardware!

My hardware is a netbook with an Intel Celeron N4120 and 4G RAM (3.64G usable).

When I recently played around with ZRAM settings, it felt like the zstd algorithm made my netbook noticeably more sluggish. It never felt sluggish with lzo-rle or lz4.

In a totally unscientific way, I rebooted the computer several times (after a complete update of everything), executed my backup script several times, and measured the last 3 executions. (I didn’t touch the netbook during the runs.) ZRAM should not be the bottleneck of the backup script, but it is a reproducible workload that I could execute and measure.
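A minimal sketch of the kind of timing loop I mean; `workload` here is just a stand-in that moves some bytes around, where I used my actual backup script:

```shell
# Unscientific benchmark loop: run a reproducible workload a few times
# and print wall-clock time per run. 'workload' is a placeholder.
workload() { dd if=/dev/zero of=/dev/null bs=1M count=64 2>/dev/null; }

for run in 1 2 3; do
  start=$(date +%s%N)              # nanoseconds since epoch (GNU date)
  workload
  end=$(date +%s%N)
  echo "run $run: $(( (end - start) / 1000000 )) ms"
done
```
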

To my surprise, I could measure a performance difference for my backup script: lz4 was consistently the fastest in real and sys time without tweaks to vm.page-cluster!

Setting vm.page-cluster to 0 further increased the speed for lz4, but with this one toggle zstd suddenly became as fast as lz4 in my benchmark, and ran with a more consistent runtime.

Setting vm.swappiness to 180 decreased the speed for lz4, to my surprise.

Obviously the benchmarks are not 100% clean, but the trend for my workload was clearly in favor of lz4/zstd.

To the best of my knowledge, I ended up with nearly the same tweaks that Google makes for ChromeOS:

  • zstd as algorithm (I think ChromeOS uses lzo-rle)

  • 2*RAM as zram size

  • vm.page-cluster = 0

  • Install/enable systemd-oomd

vm.page-cluster = 0 seems like a no-brainer when using ZRAM; on my netbook it is literally the switch for ‘fast’ mode.
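For reference, this is roughly how the above could be persisted on a systemd distro using zram-generator (the file paths and generator syntax are assumptions from my setup; check your distro’s docs):

```shell
# sysctl drop-in for the swap read-ahead tweak
printf 'vm.page-cluster = 0\n' | sudo tee /etc/sysctl.d/99-zram.conf
sudo sysctl --system   # reload all sysctl drop-ins

# zram-generator config: zstd algorithm, device sized at 2x RAM
# (zram-size accepts arithmetic over the 'ram' variable)
printf '[zram0]\nzram-size = ram * 2\ncompression-algorithm = zstd\n' \
  | sudo tee /etc/systemd/zram-generator.conf

# the userspace OOM killer mentioned above
sudo systemctl enable --now systemd-oomd.service
```
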

In summary: ZRAM makes my netbook totally usable for everyday tasks, and after tweaking the above settings I run Gnome 3, VS Code and Firefox/Evolution without trouble. (Of course, Xfce4 on the same machine is still noticeably more performant.)

I wonder if we should recommend that people asking for a lightweight distribution for potatoes check/tweak their ZRAM settings by default.

Anyway, I would be interested in experiences from other people:

  1. Any other tweaks to my ZRAM or sysctl settings for potatoes which made a measurable difference for you?
  2. Any other tips to improve quality of life on potato machines? (Besides switching to KDE, LXDE, Xfce, etc. ;-))
  3. Any idea why vm.swappiness didn’t improve my measurements? To my understanding it should basically have kept more of my file cache in RAM by swapping anonymous pages to ZRAM, making the backup run faster. Instead it even slowed the backup down, which I don’t understand.

Edit:

  • zstd beats lz4 on my machine for my benchmark when vm.page-cluster=0!
  • seaQueue@lemmy.world

    I wrote this years ago when I was doing a bunch of work with low-RAM (1 GB) potato SBCs, and I use it everywhere, including my 32/64 GB SFF Proxmox nodes: https://github.com/foundObjects/zram-swap

    You might find the comments re: swap sizing and compression ratios handy. I’ve found that lz4 approximates a 2.5:1 compression ratio during most workloads. On your 4 GB potato I’d run something like ~2 GB of lz4 zram backing, which would work out to a ~5 GB zram device. I never bothered with sysctl tuning; you generally don’t need to.
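    The sizing arithmetic, as I understand it (assuming the ~2.5:1 lz4 ratio above; the numbers are just the 4 GB example):

```shell
# zram device size ~= RAM set aside for compressed pages * compression ratio.
# The ratio is kept in tenths so plain shell integer math works.
ram_mib=4096                          # the 4 GB potato
backing_mib=$((ram_mib / 2))          # ~2 GB of RAM backing the zram
ratio_tenths=25                       # lz4 ~= 2.5:1
zram_mib=$((backing_mib * ratio_tenths / 10))
echo "${zram_mib} MiB zram device"    # 2048 * 2.5 = 5120 MiB ~= 5 GB
```
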

    Edit: just about every Chromebook under the sun, and like 90%+ of all Android devices, runs lzo/lzo-rle zram swap at ~(½ramsize*3). Change that to *2.5 for lz4 and you’re set.

    • wolf@lemmy.zipOP

      Thank you for your answer and your insights.

      In my unscientific tests, sysctl/vm.page-cluster made a measurable difference (15% faster when set to 0), and it seems everyone else (PopOS, ChromeOS) tweaks at least this setting with ZRAM. I would assume the engineers at PopOS/ChromeOS also did some benchmarks before choosing these settings.

      Now I would really be interested in whether you would measure a difference on your 1gb potato SBCs, because IMHO it should have an even bigger impact for them. (Of course, your workload/use cases might make any difference irrelevant, and of course potato SBCs have other bottlenecks like WiFi/IO, which might make this totally irrelevant.)

      • seaQueue@lemmy.world

        I don’t have my potato lab up and running at the moment but my android devices and sff hypervisors are all using page-cluster=0. That’s the default setting on android and ChromeOS I think, I probably tuned it on the proxmox machines years ago and forgot about it.

        Edit: that’s basically swap read-ahead, right? I.e., the number of pages to read from swap at a time.

        • wolf@lemmy.zipOP

          To my understanding it is swap read-ahead, and the number is an exponent with base 2, so the default reads 2^3 = 8 pages ahead. According to what I read, the default of 3 was set in the age of rotating discs and was never adapted for RAM swap devices.
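A quick illustration of the exponent (the values besides the default 3 are just examples):

```shell
# pages read per swap fault = 2^page-cluster
for pc in 0 1 2 3; do
  echo "vm.page-cluster=$pc -> $((1 << pc)) page(s) per swap read"
done
```
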

          • seaQueue@lemmy.world

            Yeah, that’s my understanding of that sysctl too. If IOPS are cheap (and they are when dealing with ram or high IOPS NVMe) there’s no real point in performing extra read ahead.

    • wolf@lemmy.zipOP

      Thanks a lot! You are right, I saw this already.

      I can confirm the findings with my benchmarks: zstd has the best compression, lz4 is the fastest.

      • Samueru@lemmy.ml

        Here is what I ended up using for my sysctl conf; IIRC I got some of these from the PopOS default config:

        vm.swappiness = 180
        vm.page-cluster = 0
        vm.watermark_boost_factor = 0
        vm.watermark_scale_factor = 125
        vm.dirty_bytes = 268435456
        vm.dirty_background_bytes = 134217728
        vm.max_map_count = 2147483642
        vm.dirtytime_expire_seconds = 1800
        vm.transparent_hugepages = madvise
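        A hedged sketch of persisting the above (file name is my choice; note that transparent hugepages are not an actual sysctl key on my kernels, they are set through /sys or the kernel command line, so that line goes separately):

```shell
# Persist the vm.* settings as a sysctl drop-in and reload.
sudo tee /etc/sysctl.d/99-vm-tuning.conf >/dev/null <<'EOF'
vm.swappiness = 180
vm.page-cluster = 0
vm.watermark_boost_factor = 0
vm.watermark_scale_factor = 125
vm.dirty_bytes = 268435456
vm.dirty_background_bytes = 134217728
vm.max_map_count = 2147483642
vm.dirtytime_expire_seconds = 1800
EOF
sudo sysctl --system

# THP is controlled through sysfs, not sysctl:
echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
```
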
        
        • wolf@lemmy.zipOP

          Could you ELI5 the last five settings? I saw that Chrome OS sets vm.overcommit_memory = 1; it seems to make sense but is missing here.

          • Samueru@lemmy.ml

            I really don’t know lol

            Increasing max_map_count is needed for some Steam games; IIRC Arch is now doing this by default.

            IIRC the dirty_bytes settings prevent the system from hanging if there is too much disk IO.

            And setting transparent_hugepages to madvise was something I did when archlinux had this bug in the kernel: https://old.reddit.com/r/archlinux/comments/1atueo0/higher_ram_usage_since_kernel_67_and_the_solution/

            It was eventually fixed but I later ran into the issue again and I decided to keep it on madvise.

            • wolf@lemmy.zipOP

              Nice, thanks a lot! The dirty_bytes settings are especially interesting to me, because I experience hangs with too much disk IO :-P.

              Cheers!

  • GustavoM@lemmy.world

    I’m definitely not a “potato expert”, but what I use (on my Orange Pi Zero 3 with 1 GiB of RAM, at least) is simply:

    zram size = 100% of available RAM, zstd, priority set to 100. Apparently, if there’s more zram swap than available RAM, it can lead to memory leaks and/or slowdowns.
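    For what it’s worth, a rough manual sketch of that setup (run as root; a generator or init service normally does this at boot, so this is just to show the moving parts):

```shell
# Load the zram module and create one device sized at 100% of RAM.
modprobe zram
ram_kib=$(awk '/^MemTotal/ {print $2}' /proc/meminfo)
dev=$(zramctl --find --size "${ram_kib}KiB" --algorithm zstd)

# Format it as swap and enable it with a high priority so it is
# preferred over any disk-backed swap.
mkswap "$dev"
swapon --priority 100 "$dev"
```
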