ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
<alyssa> note to self: the infra for backwards opt passes has a serious footgun -- if you grab sources from the later instruction, it'll fuse them in earlier than the instructions producing those sources
<alyssa> propagating clamps and conversions doesn't hit this, but a new rule I tried does
<alyssa> the obvious way of fixing this may hurt scheduling
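A toy sketch of the footgun alyssa describes above, written in Python rather than the compiler's own C. The `Instr` type and the helpers `fuse_backward` and `first_use_before_def` are hypothetical names made up for illustration; none of this is Mesa code. It assumes a straight-line block in which a backward pass rewrites the producer in place and deletes the later user. Fusing a clamp adds no new sources, so it is safe, while a rule that pulls the user's other sources up to the producer's position can read a value before it is defined.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str
    dest: str
    srcs: tuple


def first_use_before_def(block, live_in=()):
    """Return (index, source) for the first source read before its definition, else None."""
    defined = set(live_in)
    for i, ins in enumerate(block):
        for s in ins.srcs:
            if s not in defined:
                return (i, s)
        defined.add(ins.dest)
    return None


def fuse_backward(block, producer_idx, user_idx, fused_op):
    """Hypothetical backward fusion: the fused instruction takes the producer's
    slot, inherits the user's remaining sources, and the user is deleted."""
    prod, user = block[producer_idx], block[user_idx]
    extra = tuple(s for s in user.srcs if s != prod.dest)
    out = list(block)
    out[producer_idx] = Instr(fused_op, user.dest, prod.srcs + extra)
    del out[user_idx]  # user_idx > producer_idx, so the index is still valid
    return out


live_in = ("a", "b", "c", "d")

# Clamp propagation: the user only reads the producer's result.
clamp_block = [
    Instr("fmul", "t", ("a", "b")),
    Instr("fadd", "x", ("c", "d")),
    Instr("clamp", "r", ("t",)),
]
print(first_use_before_def(fuse_backward(clamp_block, 0, 2, "fmul.sat"), live_in))
# prints None

# A rule that grabs a source from the later instruction: "x" is defined
# after the producer, so the fused instruction reads it too early.
fma_block = [
    Instr("fmul", "t", ("a", "b")),
    Instr("fadd", "x", ("c", "d")),
    Instr("fadd", "r", ("t", "x")),
]
print(first_use_before_def(fuse_backward(fma_block, 0, 2, "fma"), live_in))
# prints (0, 'x')
```

One way out is to place the fused instruction at the user's position instead, which moves the producer's work later and may be what "hurt scheduling" refers to; that reading is an assumption.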
chewitt has quit [Read error: Connection reset by peer]
rasterman has quit [Quit: Gettin' stinky!]
camus has joined #panfrost
camus1 has quit [Ping timeout: 480 seconds]
JulianGro has quit [Remote host closed the connection]
chewitt has joined #panfrost
camus1 has joined #panfrost
camus has quit [Ping timeout: 480 seconds]
camus1 has quit [Remote host closed the connection]
camus has joined #panfrost
AstrovSky has joined #panfrost
AstrovSky has quit []
chewitt has quit [Read error: Connection reset by peer]
anholt has quit [Read error: Connection reset by peer]
anholt has joined #panfrost
JulianGro has joined #panfrost
Elemental has joined #panfrost
chewitt has joined #panfrost
chewitt has quit []
hyrc has joined #panfrost
Elemental has left #panfrost [#panfrost]
Elemental has joined #panfrost
<Elemental> I have a Lenovo Duet with a Mali G72. I installed Linux on it using the Cadmium project (Debian based), Kernel 5.16.0-rc3, and am currently using Gnome. I'm getting several lockups per day. Some of them are hard lockups and others "soft" where I can still SSH in. Lots of panfrost errors in the Kernel log: gpu sched timeout, AS_ACTIVE bit stuck, page faults, etc. Any tips on troubleshooting this would be appreciated.
<Elemental> Trying to figure out if it is a hardware problem or not, since I just got it recently and can still return it. I did get at least one hard freeze in ChromeOS prior to installing Linux.
soreau has quit [Remote host closed the connection]
soreau has joined #panfrost
AreaScout has joined #panfrost
MajorBiscuit has joined #panfrost
atler has joined #panfrost
camus1 has joined #panfrost
camus has quit [Remote host closed the connection]
<tomeu> Elemental: hmm, not sure, we do test G72 in CI, but might not hit the same codepaths as gnome-shell and some apps
<tomeu> maybe macc24 will know something about it
camus has joined #panfrost
camus1 has quit [Read error: Connection reset by peer]
AreaScout_ has quit [Remote host closed the connection]
rasterman has joined #panfrost
hexdump0815 has joined #panfrost
<hexdump0815> i can confirm the observations of Elemental - in real world usage i also see quite a few panfrost related issues
<hexdump0815> this is with debian bullseye and a 5.15 kernel with mostly the cadmium mt8183 patches and mesa 21.3.2 - will have to move to sid and mesa head to see if this changes things a bit
<macc24> hmm
<hexdump0815> can it maybe be related to the gpu frequency scaling?
<macc24> no clue
<macc24> i've been running with gpu devfreq on mt8183 for some time and never seen issues
<macc24> next week i'll be able to test panfrost from mesa-main on my krane
<hexdump0815> i see the exact same kinds of errors in the logs as Elemental and in my case it's in xorg xfce (wm compositing is disabled) and some light opengl apps now and then
<macc24> >xorg
<hexdump0815> macc24: my plan is to move to sid and mesa head and maybe play with setting min=max freq for the gpu and see if i can find anything reproducible
<macc24> hexdump0815: check if cci devfreq impacts it too
<macc24> since that's not in ci either
<hexdump0815> macc24: it should be possible to build a kernel without the cci patches - right?
<macc24> hexdump0815: just set the governor to powersave/performance and it'll be fine
<hexdump0815> ok will try
camus1 has joined #panfrost
<macc24> bbl
camus has quit [Remote host closed the connection]
AreaScout_ has joined #panfrost
AreaScout_ has quit []
AreaScout_ has joined #panfrost
Major_Biscuit has joined #panfrost
MajorBiscuit has quit [Ping timeout: 480 seconds]
camus1 has quit []
nlhowell has joined #panfrost
hyrc has quit []
<Elemental> I woke up this morning thinking along the same lines, in regards to potential issues with frequency scaling or temperature.
<Elemental> One of the logged messages on boot is "[drm:panfrost_devfreq_init] Failed to register cooling device". There isn't a fan, but maybe there is some sort of other cooling method which isn't being used? Is there a temperature sensor I could monitor?
<Elemental> I'll try playing with the min/max frequency scaling as well (first limit it to minimum only) as hexdump0815 suggested.
hyrc has joined #panfrost
rasterman has quit [Ping timeout: 480 seconds]
Major_Biscuit has quit []
rasterman has joined #panfrost
Elemental has quit [Remote host closed the connection]
nlhowell is now known as Guest10568
nlhowell has joined #panfrost
Guest10568 has quit [Remote host closed the connection]
<hexdump0815> Elemental: i think it's not a temperature issue - as i understand it there is maybe sometimes a moment of instability when switching between frequencies if not everything is set up perfectly, which might result in those timeouts, errors etc.
Elemental has joined #panfrost
<Elemental> hexdump0815: I think you're right about that. It has been running fine now for a while with max_freq set to min_freq (300 MHz).
<Elemental> If it is still working fine by later today, I'll try setting min_freq and max_freq to 800 MHz, to see if it is related to the higher speed or the act of switching, as you suggest.
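The min=max experiment discussed above can be sketched as a small Python helper, assuming the GPU is exposed through the standard devfreq sysfs files `min_freq` and `max_freq` with values in Hz. The function name `pin_devfreq` is hypothetical, and the actual node directory under `/sys/class/devfreq/` is board-specific and not stated in the log. The write order is chosen so that min never exceeds max between the two writes, which some kernels may reject.

```python
from pathlib import Path


def pin_devfreq(devfreq_dir, freq_hz):
    """Pin a devfreq device to one frequency by setting min_freq == max_freq.

    devfreq_dir is the device's sysfs directory (an assumption: the caller
    looks it up under /sys/class/devfreq/). Returns the order of the writes.
    """
    d = Path(devfreq_dir)
    current_max = int((d / "max_freq").read_text().strip())
    if freq_hz <= current_max:
        # Target is within the current range: raise the floor, then lower the ceiling.
        order = ("min_freq", "max_freq")
    else:
        # Target is above the current ceiling: raise the ceiling first.
        order = ("max_freq", "min_freq")
    for name in order:
        (d / name).write_text(f"{freq_hz}\n")
    return order
```

Run as root, something like `pin_devfreq("/sys/class/devfreq/<gpu-node>", 300_000_000)` would correspond to the 300 MHz test, and `800_000_000` to the planned 800 MHz test; `<gpu-node>` is a placeholder.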
cphealy has quit [Quit: Leaving]
cphealy has joined #panfrost
rasterman has quit [Quit: Gettin' stinky!]
Bennett has joined #panfrost
* alyssa pretends to be a kernel developer
Elemental has quit [Remote host closed the connection]
* macc24 pretends to not be a kernel developer
* HdkR plays some Foo Fighters for the mood
AreaScout has quit [Quit: Leaving]
<alyssa> interesting...
<alyssa> soft_reset() works the first time it's used, but not the second
<alyssa> (Did the GPU hang in between?)
<macc24> alyssa: poke the gpu to go back into the reset vector, 0xffff f000
<macc24> eh, 0xffff fff0
<urja> -.-
<alyssa> Wondering if this is where the dummy job workaround comes in
<macc24> oh
<macc24> valhall
<alyssa> Yes
<alyssa> Assuming this is the dummy job thing .. guess I have a pile of code to write then
nlhowell has quit [Ping timeout: 480 seconds]
Bennett has quit [Remote host closed the connection]
JulianGro has quit [Ping timeout: 480 seconds]
xdarklight has quit [Quit: ZNC - https://znc.in]
JulianGro has joined #panfrost
oftc has joined #panfrost
oftc is now known as Guest10585
Guest10585 has left #panfrost [#panfrost]
xdarklight has joined #panfrost