[5.4.y] [arm64] spi cs seems to be broken #3355

Closed
asavah opened this issue Nov 28, 2019 · 26 comments

Comments

@asavah

asavah commented Nov 28, 2019

Describe the bug
SPI (possibly CS) is broken on 5.4.y

To reproduce
Attach an sh1106 or ssd1306 spi (4wire) display like in https://luma-oled.readthedocs.io/en/latest/hardware.html
Install luma.core and luma.oled python modules.
Try drawing something on the display via a Python script,
for example something from https://github.com/rm-hull/luma.examples/tree/master/examples
like this:

python3 clock.py --display sh1106 --interface spi --spi-device 0 --spi-port 0

Expected behaviour
There should be an image or text on the screen.

Actual behaviour
Screen is black.
Wire CS pin of the display to ground instead of CE0 (BCM8) pin - works.

System
Own OS

  • Which model of Raspberry Pi? e.g. Pi3B+, PiZeroW
    Pi 4B
  • Which OS and version (cat /etc/rpi-issue)?
    Own pet project
  • Which firmware version (vcgencmd version)?
    Nov 19 2019 16:42:26
    Copyright (c) 2012 Broadcom
    version aeebba4c03968ede49097db077673eadc2888a22 (clean) (release) (start_x)
  • Which kernel version (uname -a)?
    Linux rpi4 5.4.0-k6 #1 SMP Thu Nov 28 00:43:25 EET 2019 aarch64 GNU/Linux

dtparam=spi=on is set in config.txt.

Logs
No mentions of spi or any other scary errors in dmesg, can provide additional info if requested.

Additional context
The clue may be that if I wire display CS pin to ground it starts working like a champ.
This setup (scripts, wiring) has been working like a champ since Pi 2 days.
Tried using spi 0 device 1 with CS pin swapped - same behavior.
Disconnected everything from the pi except the display - same behavior.
Checked and redid display wiring - same behavior.
Checked with a smaller ssd1306 spi display - same behavior.
Checked python-smbus, luma.oled and luma.core versions - nothing changed for months.
Tried reverting a bunch of spi-bcm2835.c related commits to the state said file was in 5.3.y - same ...
Tried playing with spi0-cs and spi0-hw-cs overlays - more of the same.

Sadly I don't own a scope so I can't properly peek at what's happening with CS.
My crappy multimeter shows a constantly and quickly fluctuating voltage on the active CS pin, in the range 0.005 V to 0.07 V (this measurement is not to be trusted); it's never 0. The inactive CS pin is a stable 3.3 V.

Kernel 5.3.y with same firmware version worked fine earlier today.
However I can't test with 5.3 ATM because I rebuilt the whole thing and glibc was foolishly built with --enable-kernel=5.4.0.
If needed I can rebuild for testing, takes a bit of time tho.

Edit: if needed I can easily and quickly rebuild the kernel to test any patches on the fly.
Edit2: all other peripherals I have (i2c rtc, temperature sensor, buttons, leds, etc) work properly, this was tested with everything but the display physically disconnected.

@pelwell
Contributor

pelwell commented Nov 28, 2019

Yes, it's definitely broken. Out with the logic analyser...

@pelwell
Contributor

pelwell commented Nov 28, 2019

The CS line is inverted on 5.4. An spi_set_cs(1) results in the GPIO driver being asked to set GPIO 8 to 0 (correct - it's active low) on 4.19 and 1 (incorrect) on 5.4. I'm pretty sure it's triggered by 3bd158c, but I haven't yet worked out whether the error is in the code or the device tree CS declarations.
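
The polarity problem described above can be illustrated with a toy model. This is a simplified sketch, not kernel code: the function `physical_level` and its `inversions` parameter are hypothetical, and the double-inversion scenario is only an illustrative assumption, since the exact mechanism had not been established at this point in the thread. It shows how an active-low line ends up at the wrong level if the active-low flip is applied twice instead of once.

```python
def physical_level(asserted: bool, active_low: bool, inversions: int = 1) -> int:
    """Return the pin level (0 or 1) for a logical chip-select state.

    `inversions` is a hypothetical knob: how many software layers apply
    the active-low flip. One flip is the intended behaviour.
    """
    level = 1 if asserted else 0
    if active_low:
        for _ in range(inversions):
            level ^= 1  # each layer flips the level once
    return level


# Intended: asserting an active-low CS drives the pin low.
intended = physical_level(asserted=True, active_low=True)

# Assumed fault: the flip is applied twice, so the pin is driven high.
faulty = physical_level(asserted=True, active_low=True, inversions=2)

print(intended, faulty)
```

With a single flip the asserted line is driven to 0, which matches the 4.19 behaviour reported above; with two flips it is driven to 1, which matches what was observed on 5.4.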

@pelwell
Contributor

pelwell commented Nov 29, 2019

It turns out to indeed be a side effect of 3bd158c, but the fault actually lies in the SPI-Py library. See my PR (lthiery/SPI-Py#25) for an explanation of the problem.

Download the patch (https://github.com/lthiery/SPI-Py/pull/25/commits/298c90d007ab740fbdbc6872d47b26011c25c554.patch), apply it to your tree (git am 298c90d*.patch), rebuild (sudo python setup.py install) and reboot - you should find it works again.

@asavah
Author

asavah commented Nov 29, 2019

Thanks but the library I use https://github.com/rm-hull/luma.core uses https://github.com/doceme/py-spidev under the hood.

@pelwell
Contributor

pelwell commented Nov 29, 2019

Try setting the spi_cs_high option when you instantiate the luma.core instance. I'm not saying that ought to be required, but if that works then at least you have a workaround for now and we'll know that it is a different aspect of the same issue.

@asavah
Author

asavah commented Nov 29, 2019

Thanks, will do and report back once I get home so I can see the actual output on display =)

@asavah
Author

asavah commented Nov 29, 2019

Confirmed - with cs_high=True the display works OK.

A stripped down example of how I initialize luma device for anyone with the same problem led here by google:

from luma.core.interface.serial import spi
from luma.oled.device import sh1106

serial = spi(port=0, device=0, cs_high=True)
device = sh1106(serial, width=128, height=64, rotate=2)

@pelwell
Contributor

pelwell commented Nov 29, 2019

That's good news. One way forward would be to modify the library so that an absent cs_high value means "don't change it". I can also raise the question with the GPIO and SPI maintainers to see whether the current behaviour is expected.
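
The "absent means don't change it" idea could look roughly like the sketch below. It is not luma.core's actual code: `FakeSpiDev` is a hypothetical stand-in so the example runs without hardware, and `configure_cs` is an illustrative helper name. The only assumption about the real device object is that it exposes a writable `cshigh` attribute, as the py-spidev based code in this thread does.

```python
from typing import Optional


class FakeSpiDev:
    """Hypothetical stand-in for an SPI device; counts writes to cshigh."""

    def __init__(self) -> None:
        self._cshigh = False      # polarity the driver already has
        self.cshigh_writes = 0    # how often the polarity was written

    @property
    def cshigh(self) -> bool:
        return self._cshigh

    @cshigh.setter
    def cshigh(self, value: bool) -> None:
        self.cshigh_writes += 1
        self._cshigh = bool(value)


def configure_cs(dev, cs_high: Optional[bool] = None) -> None:
    """Write the CS polarity only when the caller explicitly asked for one."""
    if cs_high is not None:
        dev.cshigh = cs_high


dev = FakeSpiDev()
configure_cs(dev)                # default: leave the driver's setting alone
configure_cs(dev, cs_high=True)  # explicit request: write it once
print(dev.cshigh_writes, dev.cshigh)
```

Using `None` as the default separates "caller did not care" from "caller wants active-low", which a plain `False` default cannot express.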

@HiassofT
Contributor

Not 100% sure if that's related but I noticed that on 5.4.1 GPIOs 7/8 are claimed according to pinmux-pins (with dtparam=spi=on on RPi4)

pin 7 (gpio7): fe204000.spi pinctrl-bcm2835:7 function gpio_out group gpio7
pin 8 (gpio8): fe204000.spi pinctrl-bcm2835:8 function gpio_out group gpio8

On 5.3.12 only the HW CS pins were configured, GPIOs were unclaimed

pin 7 (gpio7): fe204000.spi (GPIO UNCLAIMED) function gpio_out group gpio7
pin 8 (gpio8): fe204000.spi (GPIO UNCLAIMED) function gpio_out group gpio8

Did SPI switch to GPIO CS instead of HW CS?
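
For comparing the two kernels, the pinmux-pins lines quoted above can be parsed mechanically. The sketch below is a rough helper written only against the line format shown in this comment; the name `gpio_claims` is hypothetical, and lines in any other format are simply skipped rather than handled.

```python
import re

# Matches only the line shape quoted above; other pinmux-pins formats are ignored.
PIN_RE = re.compile(
    r"^pin (?P<pin>\d+) \((?P<name>\w+)\): (?P<owner>\S+) "
    r"(?P<gpio_owner>\(GPIO UNCLAIMED\)|\S+) "
    r"function (?P<function>\S+) group (?P<group>\S+)$"
)


def gpio_claims(text: str) -> dict:
    """Map pin number to its GPIO owner string, or None if the GPIO is unclaimed."""
    claims = {}
    for line in text.splitlines():
        match = PIN_RE.match(line.strip())
        if match is None:
            continue
        owner = match.group("gpio_owner")
        claims[int(match.group("pin"))] = None if owner == "(GPIO UNCLAIMED)" else owner
    return claims


KERNEL_5_4 = """\
pin 7 (gpio7): fe204000.spi pinctrl-bcm2835:7 function gpio_out group gpio7
pin 8 (gpio8): fe204000.spi pinctrl-bcm2835:8 function gpio_out group gpio8
"""

KERNEL_5_3 = """\
pin 7 (gpio7): fe204000.spi (GPIO UNCLAIMED) function gpio_out group gpio7
pin 8 (gpio8): fe204000.spi (GPIO UNCLAIMED) function gpio_out group gpio8
"""

print(gpio_claims(KERNEL_5_4))
print(gpio_claims(KERNEL_5_3))
```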

@HiassofT
Contributor

Scratch my last comment: the only difference in pinctrl between 5.4 and before is that the GPIOs are now properly claimed. The function selection of the CS pins (gpio_out) is the same.

@asavah
Author

asavah commented Dec 13, 2019

I see that this commit has landed into stable tree https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.4.y&id=39552e0e71c0109fd83284e28aaa5e8691a36fc9
39552e0#diff-3db5dfeeff40a7afb4339663ae674301

From the looks of it this should fix the issue, will report back once I'm able to get to my pi and test it at runtime.

@pelwell
Contributor

pelwell commented Dec 13, 2019

It's odd that Mark Brown didn't mention that patch when I raised the issue upstream, but I'm happy the issue might have been fixed. I notice that somebody was complaining about a regression, and they were pointed to a followup patch (https://patchwork.kernel.org/patch/11209839/ - the link is down for me at the moment).

@asavah
Author

asavah commented Dec 13, 2019

Nope, works only with cs_high, sorry for the noise.
Right now for me kernel.org (except git) is down too.
I'll take a look and try building with the followup patch once it's up again.

@asavah
Author

asavah commented Dec 13, 2019

The followup patch was merged too, so I already had it.

@l1k
Contributor

l1k commented Mar 10, 2020

Fixed in v5.6-rc5 with torvalds/linux@138c9c3, now also queued for v5.4-stable.

@asavah
Author

asavah commented Mar 10, 2020

@l1k thanks for letting us know, will test and report back once 5.6 is released, sadly I don't have time right now to play with -rc.

@asavah
Author

asavah commented Mar 10, 2020

@l1k I applied aforementioned commit on top of rpi-5.5.y, patch applied cleanly,
but sadly it does NOT fix the issue.

Mar 10 22:41:35 rpi4 keypad.py[541]:     serial = spi(port=SCREEN_PORT, device=SCREEN_DEVICE, cs_high=False)
Mar 10 22:41:35 rpi4 keypad.py[541]:   File "/usr/lib/python3.8/site-packages/luma.core-1.13.0-py3.8.egg/luma/core/interface/serial.py", line 281, in __init__
Mar 10 22:41:35 rpi4 keypad.py[541]:     self._spi.cshigh = cs_high
Mar 10 22:41:35 rpi4 keypad.py[541]: SystemError: error return without exception set

Setting cs_high to False or omitting it (default is False - low) did not make a difference.
The error happens here https://github.com/rm-hull/luma.core/blob/master/luma/core/interface/serial.py#L281 .

No errors in dmesg.
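
One defensive pattern for the failure in the traceback above is to treat a rejected `cshigh` write as non-fatal. This is only a sketch under stated assumptions: `FailingSpiDev` is a hypothetical stand-in that mimics the `SystemError`, `try_set_cshigh` is an illustrative helper, and whether it is safe to carry on after the write fails depends on the attached device.

```python
import warnings


class FailingSpiDev:
    """Hypothetical stand-in that rejects cshigh writes like the traceback above."""

    @property
    def cshigh(self) -> bool:
        return False

    @cshigh.setter
    def cshigh(self, value: bool) -> None:
        raise SystemError("error return without exception set")


def try_set_cshigh(dev, cs_high: bool) -> bool:
    """Attempt to set the CS polarity; return False instead of crashing on failure."""
    try:
        dev.cshigh = cs_high
    except (SystemError, OSError) as exc:
        warnings.warn(f"could not set cshigh={cs_high}: {exc}")
        return False
    return True


ok = try_set_cshigh(FailingSpiDev(), False)
print(ok)
```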

@SilvanRehm

maybe related to this: mcp2515 worked with this kernel:

4.19.118-v7l+ #1311 SMP Mon Apr 27 14:26:42 BST 2020 armv7l GNU/Linux

but fails after an upgrade to
5.4.51-v7+ #1327 SMP Thu Jul 23 10:58:46 BST 2020 armv7l GNU/Linux

[    6.774540] spi spi0.0: setting up native-CS0 to use GPIO
[    6.774835] spi spi0.1: setting up native-CS1 to use GPIO
[    9.075741] mcp251x spi0.0: MCP251x didn't enter in conf mode after reset
[    9.075797] mcp251x spi0.0: Probe failed, err=16
[    9.075875] mcp251x: probe of spi0.0 failed with error -16

when downgrading to 4.19 with
sudo rpi-update e1050e94821a70b2e4c72b318d6c6c968552e9a2

works fine again.

@pelwell
Contributor

pelwell commented Jul 30, 2020

[    6.774540] spi spi0.0: setting up native-CS0 to use GPIO
[    6.774835] spi spi0.1: setting up native-CS1 to use GPIO

You must be using an old or custom DTB file - these lines show that whatever you are using is requesting hardware/native CS lines, which doesn't work in 5.4.

A request for hardware chip selects looks like:

        cs-gpios = <0>, <0>;

whereas normal software-driven chip selects are:

        cs-gpios = <&gpio, 8, 1>, <&gpio, 7, 1>;
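
To check a boot log for the symptom described above, the "setting up native-CS" messages can be picked out of dmesg output. The sketch below is written only against the two log lines quoted in this thread; the function name `native_cs_requests` is hypothetical, and the message wording is assumed to match what is shown here.

```python
import re

# Pattern taken from the dmesg lines quoted in this thread.
NATIVE_CS_RE = re.compile(
    r"spi (?P<dev>spi\d+\.\d+): setting up native-CS(?P<cs>\d+) to use GPIO"
)


def native_cs_requests(dmesg_text: str) -> list:
    """Return (device, chip-select index) for each native-CS message found."""
    return [
        (match.group("dev"), int(match.group("cs")))
        for match in NATIVE_CS_RE.finditer(dmesg_text)
    ]


SAMPLE = """\
[    6.774540] spi spi0.0: setting up native-CS0 to use GPIO
[    6.774835] spi spi0.1: setting up native-CS1 to use GPIO
[    9.075741] mcp251x spi0.0: MCP251x didn't enter in conf mode after reset
"""

print(native_cs_requests(SAMPLE))
```

A non-empty result suggests the loaded device tree is still requesting hardware chip selects, for example through an old DTB or an overlay.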

@SilvanRehm

I did add the following lines to /boot/config.txt to activate the mcp2515:

dtoverlay=mcp2515-can0,oscillator=16000000,interrupt=25
dtoverlay=spi0-hw-cs

This is found in most guides on how to get CAN working with the RPi. Is there a way to fix this internally for 5.4?

@pelwell
Contributor

pelwell commented Jul 30, 2020

Try removing or commenting out the second line.
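
For anyone scripting this change, commenting out the overlay line could be sketched as below. The helper name `comment_out_overlay` is hypothetical, and the function deliberately works on a string rather than touching /boot/config.txt, so reading and writing the file is left to the caller.

```python
def comment_out_overlay(config_text: str, overlay: str = "spi0-hw-cs") -> str:
    """Return config text with any active 'dtoverlay=<overlay>' line commented out."""
    result = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("dtoverlay="):
            # The overlay name ends at the first comma; parameters follow it.
            name = stripped[len("dtoverlay="):].split(",", 1)[0].strip()
            if name == overlay:
                line = "#" + line
        result.append(line)
    # Preserve a trailing newline if the input had one.
    trailing = "\n" if config_text.endswith("\n") else ""
    return "\n".join(result) + trailing


CONFIG = (
    "dtoverlay=mcp2515-can0,oscillator=16000000,interrupt=25\n"
    "dtoverlay=spi0-hw-cs\n"
)

print(comment_out_overlay(CONFIG), end="")
```

Lines that are already commented out do not start with `dtoverlay=`, so running the helper twice leaves the text unchanged after the first pass.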

@SilvanRehm

Try removing or commenting out the second line.

worked, thanks a lot!

@pelwell
Contributor

pelwell commented Jul 30, 2020

No, thank you - it's a handy reminder to delete that overlay.

@pelwell
Contributor

pelwell commented Jul 30, 2020

It's deleted now. Although rpi-update won't delete the existing overlay (I'm not sure about apt upgrade, etc.), there's an entry in the new overlay_map file that will prevent it from being loaded.

popcornmix added a commit to raspberrypi/firmware that referenced this issue Jul 31, 2020
See: raspberrypi/linux#3765

kernel: overlays: Delete spi0-hw-cs
See: raspberrypi/linux#3355

kernel: backlight: gpio: Explicitly set the direction of the GPIO
See: raspberrypi/linux#3767

kernel: overlays: Add maxtherm overlay for MAX6675/31855
See: raspberrypi/linux#3763

firmware: arm_loader: Knock 1.7 seconds off boot time
See: #1375

firmware: Imx477 external sync signals
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this issue Jul 31, 2020
See: raspberrypi/linux#3765

kernel: overlays: Delete spi0-hw-cs
See: raspberrypi/linux#3355

kernel: backlight: gpio: Explicitly set the direction of the GPIO
See: raspberrypi/linux#3767

kernel: overlays: Add maxtherm overlay for MAX6675/31855
See: raspberrypi/linux#3763

firmware: arm_loader: Knock 1.7 seconds off boot time
See: raspberrypi/firmware#1375

firmware: Imx477 external sync signals
@popcornmix
Collaborator

This should be resolved in latest rpi-update kernel.

@asavah
Author

asavah commented Oct 13, 2020

The explanation given in #3745 pretty much makes this issue invalid, I forgot I had this one open.
The cs_high capability was dropped from luma.core, which was the problem that was affecting me.
rm-hull/luma.led_matrix#225
https://github.com/rm-hull/luma.core/pull/195/files

@asavah asavah closed this as completed Oct 13, 2020
Khamull added a commit to Khamull/Vintage_Radio that referenced this issue Jan 25, 2021
serial = spi(device=0, port=0, cs_high=True)  # cs_high=True is a workaround for this issue: raspberrypi/linux#3355
This new option fixes an issue that was introduced in an update of the system