
Hi all, I’ve been a happy Manjaro user for many years now, running on a ThinkPad X1 Carbon Gen 6. Many thanks to all those who work on this distro!

However, since the most recent kernel updates (I have tried both 6.8.5-1 and 6.6.26-1 LTS), I have been having unstable behavior. I have experienced several kernel panics and other kernel issues that eventually require a reboot or hard shutdown to resume normal system operation. I have not changed anything on the system recently and have not experienced a single such issue over the years.

Issues in the log are usually along the lines of:
BUG: scheduling while atomic: kworker/6:0/171277/0x00000002
watchdog: BUG: soft lockup - CPU#6 stuck for 160s! [kworker/6:1:168392] (multiple repeated)

Other strange symptoms I have experienced since this update are NetworkManager segfaulting (dumping core) and Bluetooth connections that sometimes cannot be managed via the GUI (neither Plasma nor blueman).

My first thought was a hardware failure. I have run all of Lenovo’s hardware tests and they all pass. I have also updated the BIOS to the latest version.

I realise this is all quite vague. However, the issues appear random. I guess I’m wondering whether:

  • there have been any recent changes that could lead to such issues; and
  • anyone can offer suggestions on how to troubleshoot further? (Since I don’t see other topics on the forum about this, it’s more likely something specific on my end.)

Thanks!

    Difficulty: ★☆☆☆☆ Strong of its many members, the Manjaro support forum can provide you help whenever you have an issue with your Manjaro installation. But in order to work efficiently, we shall also ask you to follow three major points. Provide context Simply signaling an issue is rarely enough to understand how it occurred. It is thus important to provide details on how it happened: Detail prior actions leading to the issue. List solutions and guides you already tried, with links when…

    inxi -Farz output is:

    System:
      Kernel: 6.6.26-1-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
        clocksource: tsc avail: acpi_pm
        parameters: BOOT_IMAGE=/boot/vmlinuz-6.6-x86_64
        root=UUID=9c4bef01-d0e8-4736-819d-d44fc11a9434 rw quiet
        msr.allow_writes=on
        cryptdevice=UUID=453a0272-7f94-469c-b655-f58eee4a1ce3:luks-453a0272-7f94-469c-b655-f58eee4a1ce3
        root=/dev/mapper/luks-453a0272-7f94-469c-b655-f58eee4a1ce3
        resume=/dev/mapper/luks-453a0272-7f94-469c-b655-f58eee4a1ce3
        resume=UUID=955e1562-7d18-4778-97d4-b148a14a03ad intel_iommu=on
      Desktop: KDE Plasma v: 5.27.11 tk: Qt v: 5.15.12 info: frameworks
        v: 5.115.0 wm: kwin_wayland with: krunner vt: 2 dm: SDDM Distro: Manjaro
        base: Arch Linux
    Machine:
      Type: Laptop System: LENOVO product: 20KHCTO1WW v: ThinkPad X1 Carbon 6th
        serial: <superuser required> Chassis: type: 10 serial: <superuser required>
      Mobo: LENOVO model: 20KHCTO1WW v: SDK0J40709 WIN
        serial: <superuser required> part-nu: LENOVO_MT_20KH_BU_Think_FM_ThinkPad
        X1 Carbon 6th uuid: <superuser required> UEFI: LENOVO v: N23ET88W (1.63 )
        date: 02/28/2024
    Battery:
      ID-1: BAT0 charge: 26.7 Wh (68.1%) condition: 39.2/57.0 Wh (68.7%)
        power: 39.5 W volts: 12.8 min: 11.6 model: LGC 01AV494 type: Li-poly
        serial: <filter> status: charging cycles: 1568
    CPU:
      Info: model: Intel Core i7-8550U bits: 64 type: MT MCP arch: Coffee Lake
        gen: core 8 level: v3 note: check built: 2017 process: Intel 14nm family: 6
        model-id: 0x8E (142) stepping: 0xA (10) microcode: 0xF4
      Topology: cpus: 1x cores: 4 tpc: 2 threads: 8 smt: enabled cache:
        L1: 256 KiB desc: d-4x32 KiB; i-4x32 KiB L2: 1024 KiB desc: 4x256 KiB
        L3: 8 MiB desc: 1x8 MiB
      Speed (MHz): avg: 550 high: 700 min/max: 400/4000 scaling:
        driver: intel_pstate governor: powersave cores: 1: 400 2: 700 3: 400 4: 400
        5: 700 6: 700 7: 400 8: 700 bogomips: 32012
      Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
      Vulnerabilities:
      Type: gather_data_sampling mitigation: Microcode
      Type: itlb_multihit status: KVM: VMX disabled
      Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT
        vulnerable
      Type: mds mitigation: Clear CPU buffers; SMT vulnerable
      Type: meltdown mitigation: PTI
      Type: mmio_stale_data mitigation: Clear CPU buffers; SMT vulnerable
      Type: reg_file_data_sampling status: Not affected
      Type: retbleed mitigation: IBRS
      Type: spec_rstack_overflow status: Not affected
      Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
        prctl
      Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
        sanitization
      Type: spectre_v2 mitigation: IBRS; IBPB: conditional; STIBP: conditional;
        RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
      Type: srbds mitigation: Microcode
      Type: tsx_async_abort status: Not affected
    Graphics:
      Device-1: Intel UHD Graphics 620 vendor: Lenovo driver: i915 v: kernel
        arch: Gen-9.5 process: Intel 14nm built: 2016-20 ports: active: eDP-1
        off: DP-1 empty: DP-2,HDMI-A-1,HDMI-A-2 bus-ID: 00:02.0 chip-ID: 8086:5917
        class-ID: 0300
      Device-2: IMC Networks Integrated Camera driver: uvcvideo type: USB
        rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-8:3 chip-ID: 13d3:56b2
        class-ID: 0e02
      Display: wayland server: X.org v: 1.21.1.12 with: Xwayland v: 23.2.6
        compositor: kwin_wayland driver: X: loaded: intel dri: i965 gpu: i915
        display-ID: 0
      Monitor-1: eDP-1 res: 1920x1080 size: N/A modes: N/A
      API: EGL v: 1.5 hw: drv: intel iris platforms: device: 0 drv: iris
        device: 1 drv: swrast surfaceless: drv: iris wayland: drv: iris x11:
        drv: iris inactive: gbm
      API: OpenGL v: 4.6 compat-v: 4.5 vendor: intel mesa v: 24.0.2-manjaro1.1
        glx-v: 1.4 direct-render: yes renderer: Mesa Intel UHD Graphics 620 (KBL
        GT2) device-ID: 8086:5917 memory: 15.03 GiB unified: yes display-ID: :0.0
      API: Vulkan v: 1.3.279 layers: 4 device: 0 type: integrated-gpu name: Intel
        UHD Graphics 620 (KBL GT2) driver: mesa intel v: 24.0.2-manjaro1.1
        device-ID: 8086:5917 surfaces: xcb,xlib,wayland
    Audio:
      Device-1: Intel Sunrise Point-LP HD Audio vendor: Lenovo
        driver: snd_hda_intel v: kernel alternate: snd_soc_skl,snd_soc_avs
        bus-ID: 00:1f.3 chip-ID: 8086:9d71 class-ID: 0403
      API: ALSA v: k6.6.26-1-MANJARO status: kernel-api with: aoss
        type: oss-emulator tools: alsactl,alsamixer,amixer
      Server-1: sndiod v: N/A status: off tools: aucat,midicat,sndioctl
      Server-2: PipeWire v: 1.0.3 status: active with: 1: pipewire-pulse
        status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
        4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
    Network:
      Device-1: Intel Ethernet I219-V vendor: Lenovo driver: e1000e v: kernel
        port: N/A bus-ID: 00:1f.6 chip-ID: 8086:15d8 class-ID: 0200
      IF: enp0s31f6 state: down mac: <filter>
      Device-2: Intel Wireless 8265 / 8275 driver: iwlwifi v: kernel pcie:
        gen: 1 speed: 2.5 GT/s lanes: 1 bus-ID: 02:00.0 chip-ID: 8086:24fd
        class-ID: 0280
      IF: wlp2s0 state: up mac: <filter>
      Info: services: NetworkManager, sshd, systemd-timesyncd, wpa_supplicant
    Bluetooth:
      Device-1: Intel Bluetooth wireless interface driver: btusb v: 0.8 type: USB
        rev: 2.0 speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-7:2 chip-ID: 8087:0a2b
        class-ID: e001
      Report: btmgmt ID: hci0 rfk-id: 4 state: up address: <filter> bt-v: 4.2
        lmp-v: 8 status: discoverable: no pairing: no class-ID: 6c010c
    Drives:
      Local Storage: total: 476.94 GiB used: 268.9 GiB (56.4%)
      SMART Message: Unable to run smartctl. Root privileges required.
      ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung
        model: MZVLB512HAJQ-000L7 size: 476.94 GiB block-size: physical: 512 B
        logical: 512 B speed: 31.6 Gb/s lanes: 4 tech: SSD serial: <filter>
        fw-rev: 5L2QEXA7 temp: 41.9 C scheme: GPT
    Partition:
      ID-1: / raw-size: 453.23 GiB size: 445.05 GiB (98.19%)
        used: 268.7 GiB (60.4%) fs: ext4 dev: /dev/dm-0 maj-min: 254:0
        mapped: luks-453a0272-7f94-469c-b655-f58eee4a1ce3
      ID-2: /boot/efi raw-size: 260 MiB size: 256 MiB (98.46%)
        used: 29.6 MiB (11.6%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
    Swap:
      Kernel: swappiness: 60 (default) cache-pressure: 100 (default) zswap: yes
        compressor: zstd max-pool: 20%
      ID-1: swap-1 type: partition size: 23.44 GiB used: 167.7 MiB (0.7%)
        priority: -2 dev: /dev/nvme0n1p5 maj-min: 259:3
    Sensors:
      System Temperatures: cpu: 47.0 C pch: 48.0 C mobo: N/A
      Fan Speeds (rpm): fan-1: 0
    Repos:
      Packages: pm: pacman pkgs: 2775 libs: 563 tools: octopi,pamac,yay
        pm: flatpak pkgs: 0
      Active pacman repo servers in: /etc/pacman.d/mirrorlist
        1: https://mirror.aarnet.edu.au/pub/manjaro/stable/$repo/$arch
    Info:
      Memory: total: 16 GiB note: est. available: 15.39 GiB used: 9.8 GiB (63.7%)
      Processes: 353 Power: uptime: 18h 38m states: freeze,mem,disk
        suspend: deep avail: s2idle wakeups: 2 hibernate: platform avail: shutdown,
        reboot, suspend, test_resume image: 6.15 GiB
        services: org_kde_powerdevil,upowerd Init: systemd v: 255
        default: graphical tool: systemctl
      Compilers: clang: 16.0.6 gcc: 13.2.1 Shell: Bash v: 5.2.26
        running-in: tmux: inxi: 3.3.34
    

    I’ve had NetworkManager crash multiple times today, but the system itself has stayed functional. I’ve now tried disabling the associated systemd service and running it directly from a terminal via NetworkManager --no-daemon --debug, in case that leads to more useful information.

    Update: This setup has remained stable for the entire day. No kernel errors such as those in the original post.

    However, trying to restart NetworkManager.service via systemd resulted in multiple errors as shown, and finally a kernel panic (system completely unresponsive and flashing Caps-Lock key):

    Apr 16 18:47:04 kernel: BUG: scheduling while atomic: NetworkManager/201980/0x00000002
    Apr 16 18:47:04 kernel: BUG: scheduling while atomic: NetworkManager/201980/0x00000000
    Apr 16 18:47:05 systemd-coredump[202009]: Process 201980 (NetworkManager) of user 0 dumped core.
    Apr 16 18:47:05 systemd[1]: Failed to start Network Manager.
    Apr 16 18:47:05 kernel: BUG: scheduling while atomic: NetworkManager/202045/0x00000002
    Apr 16 18:47:05 kernel: BUG: scheduling while atomic: NetworkManager/202045/0x00000000
    Apr 16 18:47:05 NetworkManager[202045]: <error> [1713257225.3343] platform-linux: netlink[rtnl]: read: failed to retrieve incoming events: Bad address (-14)
    Apr 16 18:47:05 NetworkManager[202045]: <error> [1713257225.3344] platform-linux: netlink[rtnl]: read: failed to retrieve incoming events: Bad address (-14)
    Apr 16 18:47:05 systemd-coredump[202052]: Process 202045 (NetworkManager) of user 0 dumped core.
    Apr 16 18:47:05 systemd[1]: Failed to start Network Manager.
    Apr 16 18:47:05 NetworkManager[202061]: <error> [1713257225.8595] platform-linux: netlink[rtnl]: read: failed to retrieve incoming events: Bad address (-14)
    Apr 16 18:47:05 NetworkManager[202061]: <error> [1713257225.8596] platform-linux: netlink[rtnl]: read: failed to retrieve incoming events: Bad address (-14)
    Apr 16 18:47:05 kernel: BUG: scheduling while atomic: NetworkManager/202061/0x00000002
    Apr 16 18:47:05 kernel: BUG: scheduling while atomic: NetworkManager/202061/0x00000000
    Apr 16 18:47:05 kernel: BUG: scheduling while atomic: Link Monitor/1903/0x00000002
    Apr 16 18:47:05 kernel: BUG: scheduling while atomic: Link Monitor/1903/0x00000000
    Apr 16 18:47:06 kernel: BUG: scheduling while atomic: conky/111712/0x00000002
    Apr 16 18:47:06 kernel: BUG: scheduling while atomic: conky/111712/0x00000000
    Apr 16 18:47:06 kernel: BUG: scheduling while atomic: Qt bearer threa/1934/0x00000002
    Apr 16 18:47:06 kernel: BUG: scheduling while atomic: Qt bearer threa/1877/0x00000002
    Apr 16 18:47:06 kernel: BUG: scheduling while atomic: Qt bearer threa/1934/0x00000000
    Apr 16 18:47:06 kernel: BUG: scheduling while atomic: Qt bearer threa/1934/0x00000002
    Apr 16 18:47:06 kernel: BUG: scheduling while atomic: Qt bearer threa/1877/0x00000000
    Apr 16 18:47:06 kernel: BUG: scheduling while atomic: Qt bearer threa/1934/0x00000000
    Apr 16 18:47:06 systemd-coredump[202066]: Process 202061 (NetworkManager) of user 0 dumped core.
    Apr 16 18:47:06 systemd[1]: Failed to start Network Manager.
    Apr 16 18:47:06 systemd-coredump[202073]: Process 111704 (conky) of user 1000 dumped core.
    Apr 16 18:47:06 kernel: BUG: scheduling while atomic: NetworkManager/202083/0x00000002
    Apr 16 18:47:06 kernel: BUG: scheduling while atomic: NetworkManager/202083/0x00000000
    Apr 16 18:47:06 NetworkManager[202083]: <error> [1713257226.5571] platform-linux: netlink[rtnl]: read: failed to retrieve incoming events: Bad address (-14)
    Apr 16 18:47:06 NetworkManager[202083]: <error> [1713257226.5571] platform-linux: netlink[rtnl]: read: failed to retrieve incoming events: Bad address (-14)
    Apr 16 18:47:06 systemd-coredump[202094]: Process 202083 (NetworkManager) of user 0 dumped core.
    Apr 16 18:47:06 systemd[1]: Failed to start Network Manager.
    Apr 16 18:47:08 kernel: BUG: scheduling while atomic: kworker/1:4/194948/0x00000002
    Apr 16 18:47:08 kernel: BUG: workqueue leaked lock or atomic: kworker/1:4/0x7fffffff/194948
    Apr 16 18:47:08 kernel: BUG: scheduling while atomic: kworker/1:4/194948/0x00000000
    Apr 16 18:47:08 kernel: BUG: scheduling while atomic: Link Monitor/1903/0x00000002
    Apr 16 18:47:08 kernel: BUG: scheduling while atomic: NetworkManager/202109/0x00000002
    Apr 16 18:47:08 kernel: BUG: scheduling while atomic: Link Monitor/1903/0x00000000
    Apr 16 18:47:08 kernel: BUG: scheduling while atomic: NetworkManager/202109/0x00000000
    Apr 16 18:47:09 kernel: BUG: scheduling while atomic: Qt bearer threa/1522/0x00000002
    Apr 16 18:47:09 kernel: BUG: scheduling while atomic: Qt bearer threa/1522/0x00000000
    Apr 16 18:47:09 systemd-coredump[202144]: Process 202109 (NetworkManager) of user 0 dumped core.
    Apr 16 18:47:09 systemd[1]: Failed to start Network Manager.
    Apr 16 18:47:09 systemd[1]: Failed to start Network Manager.
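
To see at a glance which tasks are hit by the "scheduling while atomic" reports in a log like the one above, the lines can be tallied per task name. This is only a minimal sketch: the function name `tally_atomic_bugs` and the regular expression are illustrative assumptions based on the message format shown in this thread (`task/pid/0xflags`), not part of any existing tool.

```python
import re
from collections import Counter

# Assumed format, taken from the log lines in this thread:
#   "BUG: scheduling while atomic: <task name>/<pid>/0x<hex value>"
# The task name may itself contain slashes (e.g. "kworker/1:4"),
# so the pattern anchors on the trailing "/<pid>/0x<hex>" part.
ATOMIC_BUG = re.compile(
    r"BUG: scheduling while atomic: "
    r"(?P<comm>.+)/(?P<pid>\d+)/0x(?P<flags>[0-9a-fA-F]+)\s*$"
)


def tally_atomic_bugs(lines):
    """Count 'scheduling while atomic' reports per task name."""
    counts = Counter()
    for line in lines:
        match = ATOMIC_BUG.search(line)
        if match:
            counts[match.group("comm")] += 1
    return counts


# A few lines copied from the log above, plus one unrelated line.
sample = [
    "Apr 16 18:47:04 kernel: BUG: scheduling while atomic: NetworkManager/201980/0x00000002",
    "Apr 16 18:47:05 kernel: BUG: scheduling while atomic: Link Monitor/1903/0x00000002",
    "Apr 16 18:47:06 kernel: BUG: scheduling while atomic: conky/111712/0x00000002",
    "Apr 16 18:47:06 kernel: BUG: scheduling while atomic: Qt bearer threa/1934/0x00000002",
    "Apr 16 18:47:08 kernel: BUG: scheduling while atomic: kworker/1:4/194948/0x00000002",
    "Apr 16 18:47:05 systemd[1]: Failed to start Network Manager.",
]

for comm, count in tally_atomic_bugs(sample).most_common():
    print(f"{count:3d}  {comm}")
```

On a real system the input could be the saved output of `journalctl -k -b -1 --no-pager` (kernel messages from the previous boot). In the excerpt above the reports are not limited to NetworkManager: conky, the Qt bearer threads, and a kworker are hit as well.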
    

    Today’s experience leads me to believe that this is not a kernel issue but something related to systemd and/or NetworkManager.service. I’ve edited the original title to reflect this.

    Any ideas why NetworkManager would behave so badly when run via systemd but totally fine when run via NetworkManager --no-daemon --debug?
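
One detail in the log that may help narrow this down is the trailing `(-14)` on the NetworkManager netlink errors. The sketch below assumes that this number is a negated Linux errno value (an assumption about NetworkManager's log format, not something confirmed in this thread) and decodes it with Python's standard `errno` and `os` modules. The helper name `decode_trailing_errno` is hypothetical.

```python
import errno
import os
import re

# Matches a trailing "(-N)" code at the end of a log line.
TRAILING_CODE = re.compile(r"\((?P<code>-\d+)\)\s*$")


def decode_trailing_errno(line):
    """Return (symbolic name, message) for a trailing '(-N)' code, or None.

    Assumes N is an errno value as defined on the machine running this code.
    """
    match = TRAILING_CODE.search(line)
    if match is None:
        return None
    number = -int(match.group("code"))
    return errno.errorcode.get(number, "unknown"), os.strerror(number)


# Line copied from the log in this thread (timestamp prefix dropped).
line = (
    "NetworkManager[202045]: <error> [1713257225.3343] platform-linux: "
    "netlink[rtnl]: read: failed to retrieve incoming events: Bad address (-14)"
)
print(decode_trailing_errno(line))
```

On Linux, errno 14 is `EFAULT`, whose message is "Bad address", matching the text in the log. That error is returned by the kernel for an invalid memory address passed to a system call. Together with the kernel BUG reports hitting unrelated tasks, this hints at, but does not prove, a kernel-side problem rather than something specific to how systemd starts the service.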

    Question: Could this kind of strange intermittent behaviour be explained by intermittent connection to the wifi card? I am always on wifi (not ethernet), and it always seems to be NetworkManager crashing that leads to the subsequent kernel errors.

    I’ve had the laptop since new (approx. 6 years) and, whilst I look after it, it has experienced a few accidental bumps in its time. I’m wondering whether it is worth opening it up and re-seating the card. (I would rather not open it up if it’s completely infeasible for this to be the cause.)

    IMHO it’s the same issue:

    https://bbs.archlinux.org/viewtopic.php?id=294828

    6.8.5 and 6.8.6 (now in testing) are affected by it.

    I had to downgrade to 6.8.4.