ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
vliaskov has quit [Remote host closed the connection]
jhli has quit []
macromorgan has joined #dri-devel
macromorgan has quit [Remote host closed the connection]
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold has joined #dri-devel
mbrost has joined #dri-devel
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
mbrost_ has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
yyds has joined #dri-devel
Kayden has quit [Quit: -> sky]
iive has quit [Quit: They came for me...]
davispuh has quit [Ping timeout: 480 seconds]
u-amarsh04 has quit []
mbrost_ has quit [Ping timeout: 480 seconds]
cef has quit [Quit: Zoom!]
cef has joined #dri-devel
u-amarsh04 has joined #dri-devel
Company has quit [Quit: Leaving]
Leopold has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
mbrost has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
Dorc has joined #dri-devel
OftenTimeConsuming has quit [Remote host closed the connection]
heat has quit [Read error: Connection reset by peer]
heat has joined #dri-devel
OftenTimeConsuming has joined #dri-devel
u-amarsh04 has quit [Quit: Konversation terminated!]
new-amarsh04 has joined #dri-devel
Dorcas has joined #dri-devel
<Lynne>
what are the build instructions for nvk these days? I only used it in pre-rust days
<Lynne>
meson errors out because syn is not installed, but it's not a binary package, so it cannot be installed manually
Dorc has quit [Ping timeout: 480 seconds]
<psykose>
there's a .wrap for it and a meson.build in subprojects/packagefiles/syn/meson.build
<psykose>
same for the other three deps
<psykose>
without meson fetching it you have to fetch them yourself into subprojects/
<psykose>
syn, quote, proc-macro2, unicode-ident
<psykose>
see the .wrap files for the version/url
<psykose>
if you don't have something like --wrap-mode nofallback/nodownload then meson does it automatically, with nodownload you have to download it first, with nofallback i have no idea how you'd make it work (never tried)
<Lynne>
subprojects where? it's not in mesa, and I don't see it in meson's wrap list
<airlied>
yeah it should just happen at meson time unless you turn it off
<Lynne>
have to say, build systems are generally the worst pieces of software ever written
<psykose>
meson is pretty good
<Lynne>
I'm sure there were a lot of bad options, and the least bad was to have some unholy amalgamation of meson and cargo
<psykose>
you don't need cargo for this pretty sure
heat has quit [Ping timeout: 480 seconds]
<tjaalton>
gfxstrand-web: yes?
mbrost has quit [Read error: Connection reset by peer]
Dorcas has quit [Ping timeout: 480 seconds]
Leopold_ has quit [Read error: Connection reset by peer]
Leopold_ has joined #dri-devel
mbrost has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
bmodem has joined #dri-devel
Leopold_ has joined #dri-devel
mbrost has quit [Read error: Connection reset by peer]
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
<airlied>
mripard: CC [M] drivers/gpu/drm/msm/msm_debugfs.o
<airlied>
CC [M] drivers/gpu/drm/msm/dp/dp_debug.o
<airlied>
/home/airlied/devel/kernel/dim/src/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c: In function ‘sun4i_hdmi_connector_atomic_check’:
<airlied>
/home/airlied/devel/kernel/dim/src/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c:191:17: error: implicit declaration of function ‘drm_atomic_get_new_connector_state’; did you mean ‘drm_atomic_helper_connector_reset’? [-Werror=implicit-function-declaration]
<airlied>
/home/airlied/devel/kernel/dim/src/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c:191:17: warning: initialization of ‘struct drm_connector_state *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
<airlied>
cc1: some warnings being treated as errors
<airlied>
seeing that after merging drm-next MR
<airlied>
I'm guessing a missing include
<gfxstrand>
tjaalton: Just wanted to make sure you saw we dropped the -experimental from NVK so you should plan to turn it on as soon as you pick up Mesa 24.1.
yyds has quit [Read error: Connection reset by peer]
Kayden has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
yyds has joined #dri-devel
glennk has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold has joined #dri-devel
Leopold has quit [Remote host closed the connection]
Leopold has joined #dri-devel
yyds has quit []
yyds has joined #dri-devel
<tjaalton>
gfxstrand: yep, noted
KetilJohnsen has joined #dri-devel
KetilJohnsen has quit []
qflex_ has joined #dri-devel
qflex has quit [Read error: No route to host]
tzimmermann has joined #dri-devel
qflex_ has quit [Ping timeout: 480 seconds]
yyds has quit []
yyds has joined #dri-devel
mbrost has joined #dri-devel
sima has joined #dri-devel
<dolphin>
airlied, sima: duh, I was living wrong day of the week yesterday, will send the drm-intel-fixes PR
<mripard>
airlied: which config are you using? I don't see it with drm-misc-arm
sghuge has quit [Remote host closed the connection]
mbrost has quit [Read error: Connection reset by peer]
sghuge has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
<mripard>
airlied: you have a mail, I assume you will merge it in drm/next directly?
jkrzyszt has joined #dri-devel
fab has joined #dri-devel
warpme has joined #dri-devel
qflex has joined #dri-devel
hansg has joined #dri-devel
mvchtz has quit [Ping timeout: 480 seconds]
mvchtz has joined #dri-devel
JohnnyonFlame has quit [Read error: Connection reset by peer]
glennk has quit [Ping timeout: 480 seconds]
rgallaispou has joined #dri-devel
lynxeye has joined #dri-devel
Duke`` has joined #dri-devel
<airlied>
mripard: yes I'll grab it
<jfalempe>
sima: regarding https://paste.debian.net/hidden/9be7656c/, would it be better to have the lock in the plane struct instead? so if your card has multiple outputs, they will have separate locks and won't slow each other down?
<sima>
jfalempe, yeah that would be an option and might indeed be cleaner
<jfalempe>
Also I find it simpler to have the get_scanout_buffer() function in struct drm_plane_funcs instead of the device modeconfig.
<jfalempe>
that allows handling multiple outputs cleanly.
<sima>
since iterating over planes or whatever we need to look at in struct drm_device is safe if we register/unregister the panic notifier in drm_dev_register/unregister
<sima>
so you don't need the spinlock for those parts
<sima>
jfalempe, yeah agreed the hooks are best in the plane functions
<sima>
jfalempe, I'm typing some more words in the commit message for john ogness and then I'll send it out as an rfc, does that sound good?
<jfalempe>
sima: yes that sounds good.
<jfalempe>
also for the debugfs test, since using the notifier is a bit clumsy, another way to do it would be to loop through all drm devices, and all planes with a get_scanout_buffer() function ?
<sima>
jfalempe, my personal feel is that the notifier feels better, but we should move this testing infrastructure into common panic.c code
<sima>
maybe behind a Kconfig
<sima>
I've also been annoying john ogness whether he can create something like that for the panic flow
<sima>
since with his new console_lock replacement you can limit to only the safe panic lock takeovers, and so we could safely run the entire panic code as part of ci
<sima>
jfalempe, so maybe split out that patch standalone and submit it as an rfc for how to best do that?
<sima>
since it's not really drm specific at all right now and could be put into kernel/panic.c
<sima>
and if that holds it up for too long I think we could create a per drm_device trigger in debugfs, which would be entirely drm specific
<sima>
as a stop-gap solution
<jfalempe>
sima: yes, I can split it from the rest, but also I'm not sure what the other panic notifiers are doing, so it may not make sense for them to be called from debugfs.
<sima>
hm yeah ...
<sima>
otoh testing is good, and there's a real effort to make the panic code not so fragile
<sima>
so my suggestion would be to split the panic notifier testing patch out as standalone and move the code to kernel/panic.c with a Kconfig or so
<jfalempe>
but, if there is a solution to test the panic in ci, that would be very good.
<sima>
and have the discussion with experts how we best simulate panics for testing
<sima>
and in parallel we do a debugfs on drm_device in the drm debugfs directories?
<sima>
and that drm trigger just walks over all planes and does the panic handling on each of them
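A minimal sketch of the per-device trigger described above, assuming a plane carries an optional get_scanout_buffer() hook. The struct layouts, draw_panic(), and trigger_panic_test() are hypothetical stand-ins for illustration and are not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's struct definitions. */
struct scanout_buffer {
	unsigned char *pixels;
	size_t size;
};

struct plane {
	/* Optional hook; NULL when the plane has no panic support. */
	int (*get_scanout_buffer)(struct plane *plane, struct scanout_buffer *sb);
	unsigned char fb[16];
};

struct device {
	struct plane *planes;
	size_t num_planes;
};

/* Example hook: expose the plane's own framebuffer memory. */
static int example_get_scanout_buffer(struct plane *plane, struct scanout_buffer *sb)
{
	sb->pixels = plane->fb;
	sb->size = sizeof(plane->fb);
	return 0;
}

/* Stand-in for drawing the panic screen: fill the buffer with one value. */
static void draw_panic(struct scanout_buffer *sb)
{
	for (size_t i = 0; i < sb->size; i++)
		sb->pixels[i] = 0xff;
}

/* Walk every plane of one device; returns how many planes were drawn to. */
static size_t trigger_panic_test(struct device *dev)
{
	size_t drawn = 0;

	for (size_t i = 0; i < dev->num_planes; i++) {
		struct plane *plane = &dev->planes[i];
		struct scanout_buffer sb;

		if (!plane->get_scanout_buffer)
			continue; /* plane did not opt in */
		if (plane->get_scanout_buffer(plane, &sb) != 0)
			continue; /* no usable buffer right now */
		draw_panic(&sb);
		drawn++;
	}
	return drawn;
}
```

Planes without the hook are simply skipped, which matches the point that drivers have to opt in explicitly.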
<jfalempe>
sima: so we can trigger the panic code for each device independently?
<sima>
jfalempe, yeah
<sima>
jfalempe, the rfc with core folks would also be to have the discussion what we need to wrap that call with to simulate panic contexts the best
<jfalempe>
yes, that sounds good
rasterman has joined #dri-devel
<sima>
like we definitely want to disable hardirq handling
<sima>
ideally even get into nmi context, since that's the worst panic context
<sima>
to make sure our panic code _really_ works in the worst case panic situation
<sima>
if we just run it from the debugfs write function in full process context there's a lot of issues we won't catch
<sima>
like sleep, or taking locks accidentally and all that
<sima>
we want to make sure any sleep or even mutex_trylock blows up in test when kernel debugging is enabled
<jfalempe>
sima: yes that would be the best way to test it reliably
<sima>
so even if the core panic debugfs doesn't go anywhere, we need the rfc to have that discussion
<sima>
jfalempe, ^^ can you include that open question in the patch commit message to get this started?
<jfalempe>
sima: yes let me start a thread about this.
<sima>
well two opens: is a core panic test infra a good idea? and what is the best way to simulate panic context (ideally nmi) without actually panicking the system, so that it can be used in ci?
<sima>
jfalempe, thanks a lot!
<vsyrjala>
is there some idea how to make the panic stuff work if the hardware is in the middle of a commit when the panic occurs?
<sima>
vsyrjala, probably not
<sima>
but I think it can be made to work with my rfc patch
<sima>
if the driver opts to protect the mmio writes to the scanout registers with the raw spinlock
<vsyrjala>
at least on i915 the mmio writes will not even be done by the cpu in the future
<sima>
and then reads back the actual register state to figure out where the current fb is that the hw actually scans out
<sima>
ofc, if you can't trylock the spinlock then you're screwed and it's best to not do anything, since the hw is in an ill-defined state
Leopold has quit []
<sima>
and the commit work might be running in parallel, wreaking havoc
<sima>
vsyrjala, that case is easier I think, since the fw/gpu won't die in panic()
Leopold has joined #dri-devel
<sima>
so you can limit yourself to looking at sw state (with a minimal race window protected by the raw spinlock)
<sima>
safe in the knowledge that maybe the display doesn't show the new buffer yet, but once the fw has done its job, it will
<sima>
even when the kernel is long dead at that point
<sima>
but yeah fundamentally there's a race, and my rfc has a fairly big window, but you can make it much smaller with driver code
<sima>
but it's never going to be zero
<jfalempe>
yes the panic handling is a best effort approach, we can't guarantee the panic screen will be displayed 100% of the time.
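The trylock-or-give-up rule discussed above can be sketched in userspace C. This is only an analogy under stated assumptions: a C11 atomic_flag stands in for the raw spinlock, a plain variable stands in for the scanout address register, and struct panic_dev, flip(), and panic_get_scanout() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model; a userspace analogy, not kernel code. */
struct panic_dev {
	atomic_flag panic_lock; /* stands in for the single raw spinlock */
	uint32_t scanout_addr;  /* stands in for the scanout address register */
};

/* Normal flip path: hold the lock only around the register write. */
static void flip(struct panic_dev *dev, uint32_t new_addr)
{
	while (atomic_flag_test_and_set_explicit(&dev->panic_lock,
						 memory_order_acquire))
		; /* spinning is acceptable outside panic context */
	dev->scanout_addr = new_addr;
	atomic_flag_clear_explicit(&dev->panic_lock, memory_order_release);
}

/*
 * Panic path: exactly one trylock attempt, never spin or wait.
 * Returns true and stores the current scanout address on success.
 * Returns false if the lock is held; the caller then does nothing.
 */
static bool panic_get_scanout(struct panic_dev *dev, uint32_t *addr)
{
	if (atomic_flag_test_and_set_explicit(&dev->panic_lock,
					      memory_order_acquire))
		return false;
	*addr = dev->scanout_addr;
	atomic_flag_clear_explicit(&dev->panic_lock, memory_order_release);
	return true;
}
```

When the trylock fails, the sketch reports failure instead of retrying, which is the best-effort behaviour described: no panic screen in that case, but also no added damage.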
<airlied>
as long as some shitty mga or ast driver doesn't stop my serial port from getting it :-P
<vsyrjala>
i think the only safe way would be to have the panic handler wait for the hardware to finish its commit. otherwise you could get all kinds of funny mmio faults and whatnot when the two register updates fight each other
<vsyrjala>
*iommu faults
<sima>
airlied, yeah that's really the primary goal, and why I think the standard design should lean _extremely_ heavily towards safety
<sima>
vsyrjala, pls no, hw can hang
<sima>
no, absolutely no waiting or spinning in panic context
dorcaslitunya has joined #dri-devel
<sima>
because on any reasonable system there's a bunch of other ways to dump out panics, so really good chances you just make it much, much worse
<sima>
for real console there's two steps actually for this reason: 1. only do the absolute safe stuff, over all panic outputs
<sima>
2. try harder and pray
<sima>
unfortunately panic notifiers aren't that great yet, but could be added easily
dorcaslitunyaVM has joined #dri-devel
<vsyrjala>
i guess we just don't use this then
<sima>
but 2 must be done only after all of 1 has finished
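The two-step ordering described above (safe attempts on every output first, risky attempts only afterwards) can be sketched as follows. Everything here is hypothetical: struct panic_output, its hooks, and panic_emit() are illustrative names, not an existing kernel interface, and the three example hooks only record the order in which they are called.

```c
#include <stddef.h>

/* Hypothetical panic output; names and fields are illustrative only. */
struct panic_output {
	int (*safe_write)(const char *msg);  /* must never hang, wait, or crash */
	int (*risky_write)(const char *msg); /* last-ditch attempt, may be NULL */
	int done;                            /* set once the message got out */
};

/* Call log so the ordering of the two stages can be inspected. */
static int call_log[8];
static size_t call_count;

static int serial_safe(const char *msg)
{
	(void)msg;
	call_log[call_count++] = 1;
	return 0; /* succeeds */
}

static int display_safe(const char *msg)
{
	(void)msg;
	call_log[call_count++] = 2;
	return -1; /* e.g. the trylock failed */
}

static int display_risky(const char *msg)
{
	(void)msg;
	call_log[call_count++] = 3;
	return 0;
}

/* Returns the number of outputs that got the message out. */
static size_t panic_emit(struct panic_output *outs, size_t n, const char *msg)
{
	size_t delivered = 0;

	/* Stage 1: only the safe hooks, over every output. */
	for (size_t i = 0; i < n; i++) {
		if (outs[i].safe_write && outs[i].safe_write(msg) == 0) {
			outs[i].done = 1;
			delivered++;
		}
	}

	/* Stage 2: only after all of stage 1, try harder on the rest. */
	for (size_t i = 0; i < n; i++) {
		if (!outs[i].done && outs[i].risky_write &&
		    outs[i].risky_write(msg) == 0) {
			outs[i].done = 1;
			delivered++;
		}
	}
	return delivered;
}
```

With a display listed before a serial output, the display's risky hook still runs after the serial output's safe hook, so a risky attempt cannot prevent a safe output from getting its chance.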
pcercuei has joined #dri-devel
vliaskov has joined #dri-devel
<sima>
vsyrjala, seems a bit drastic take when I just typed out what it'd take to make it happen like you want ...
<vsyrjala>
don't see how to make it work when the hardware is busy writing registers in parallel. we'd potentially just create more explosions. we could do it when we know the thing is idle though
<sima>
vsyrjala, panic code by default doesn't touch any display state at all
<sima>
we just overwrite whatever is currently being scanned out
<sima>
exactly because touching display state is pretty much impossible
<sima>
so the new panic code has code to write into yuv and could also write into tiled buffers
<sima>
so that you don't have to touch any fifo state or anything really tricky like that
<sima>
and the only hard part is making sure you pick the right buffer
<vsyrjala>
and actually having cpu access to said buffer
<sima>
amd has peek/poke registers
<sima>
shit hw is shit hw, can't help that
<sima>
ofc writing a few mb with peek/poke is going to be extremely slow, but that doesn't matter
<sima>
if you have a gart, I guess you could reserve one pte
<sima>
and probably need to protect the tlb flush with the panic spinlock to avoid lolz
<vsyrjala>
hmm. yeah, i suppose that could work
<sima>
if you have nothing, well it just sucks then
<vsyrjala>
going to be a slight pita to write all the manual tiling stuff though
<sima>
yeah
<sima>
and ccs clearing
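To give a feel for the manual tiling work mentioned above, here is a sketch of addressing one pixel in a tiled buffer. The layout is an assumption made purely for illustration (4x4-pixel tiles stored row-major, pixels row-major inside each tile, 32-bit pixels) and does not correspond to any particular hardware format. TILE_W, TILE_H, tiled_index(), and put_pixel_tiled() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: 4x4-pixel tiles stored row-major, pixels stored
 * row-major inside each tile. Not any real hardware's format.
 */
#define TILE_W 4
#define TILE_H 4

/* Index (in pixels) of (x, y) in a tiled buffer that is width_px wide. */
static size_t tiled_index(unsigned x, unsigned y, unsigned width_px)
{
	/* Assumes width_px is a multiple of TILE_W. */
	unsigned tiles_per_row = width_px / TILE_W;
	unsigned tile = (y / TILE_H) * tiles_per_row + (x / TILE_W);
	unsigned in_tile = (y % TILE_H) * TILE_W + (x % TILE_W);

	return (size_t)tile * (TILE_W * TILE_H) + in_tile;
}

/* Write one 32-bit pixel through the tiled addressing above. */
static void put_pixel_tiled(uint32_t *fb, unsigned width_px,
			    unsigned x, unsigned y, uint32_t color)
{
	fb[tiled_index(x, y, width_px)] = color;
}
```

Real formats add more steps on top of this (swizzling, compression metadata such as ccs), which is part of why a test trigger for this code path is valuable.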
<sima>
but we've tried the other approach of trying to reprogram hw state to be easier, and that defo doesn't work well enough beyond tech demo
<sima>
the entire thing being real pita for a ccs tiled buffer is also why I really want the debugfs interface
<vsyrjala>
yeah. i thought it was still just some kind of 'let's just update just the scanout address' approach
<sima>
vsyrjala, you could do a bit of a mix, like clear the tiling bits
<sima>
I think amdgpu folks want to do that
<sima>
but no-no when the gpu fw pushes out the flips ofc :-(
<vsyrjala>
yeah
<sima>
but anything more my gut feeling is that it's just too easy to kill the hw because you programmed terrible watermarks
flynnjiang has quit [Quit: flynnjiang]
<sima>
vsyrjala, also like I said, if we improve the panic notifiers to have the same feature set as john ogness is adding for full blown consoles
Leopold has quit [Remote host closed the connection]
<sima>
then you get a lot of nifty tools to take over from the driver and a 2nd attempt where you can go risky
<vsyrjala>
psr/fbc/etc. might also be a pain. but i think we should have sufficient ways to kick those somewhat safely
<sima>
ofc the complexity should still be as close to taking over an uart, because this code runs in the absolute worst context
Leopold has joined #dri-devel
<sima>
yeah
<sima>
vsyrjala, the biggest issue with all of these is that beyond the panic raw spinlock that common code will trylock for you
<sima>
you cannot take any locks
<sima>
even spin_trylock is no-go because of -rt and nmi context
<sima>
jfalempe, btw just realized that per-plane spinlock might not be a good idea
<sima>
for hw with global resources like the peek/poke register, where the spinlock needs to be for the entire device
<sima>
so I'm leaning towards spinlock per drm_device again more
frankbinns1 has joined #dri-devel
<jfalempe>
sima: the lock is to protect access to the state framebuffer, device should have its own lock for its resources ?
frankbinns1 is now known as frankbinns
<sima>
jfalempe, they can't
<sima>
panic you get one, and only one raw spinlock, that you trylock
<sima>
otherwise we deviate too much from the new console_lock design, and I think we don't want that because there's a bunch of good reasons to make panic notifiers more like panic-only consoles
<sima>
see the entire discussion above, plus what I've just added to my rfc
<jfalempe>
but that means the driver will need to take the panic_lock each time it programs the hw, wouldn't that be too much lock contention ?
<sima>
jfalempe, the example I have is for protecting the peek/poke registers that e.g. amd has
<sima>
which is strictly for debugging only, and an _extremely_ slow way to access vram
<sima>
so adding a raw spinlock doesn't matter
<sima>
jfalempe, another example would be protecting the go bit, or the scanout address register
<sima>
which should just be one mmio write per display flip
<jfalempe>
if it's used only by the panic code, there shouldn't be a race condition for peek/poke.
<sima>
so again entirely ok, the mmio will be much slower than the raw spinlock/unlock anyway
<sima>
jfalempe, it's for debug in general
<sima>
iirc they expose it through debugfs too as a debug tool
<sima>
it's good to figure out issues when you don't trust your gpu pagetables
frankbinns2 has quit [Ping timeout: 480 seconds]
glennk has joined #dri-devel
<jfalempe>
sima: if the panic occurs when you are already messing up with gpu vram from userspace, that's kind of a corner case, I'm not sure we can support anyway.
<sima>
jfalempe, sure, but the drm_panic_lock around that will make sure it won't blow up badly
<sima>
I really want to make sure that this new panic design is really safe, so we need to have an idea how to make these things work too
<sima>
ofc if you race really badly, then no panic output for you
<sima>
but also the console takeover lock would help with these
<jfalempe>
sima: may it crash the machine, or it may just corrupt the output ?
<sima>
jfalempe, crash, no
heat has joined #dri-devel
<sima>
because then an output later on that would work doesn't
<sima>
and that's the case we absolutely need to avoid
<sima>
failing to print is ok, corrupted display screen is ok, crashing or killing the hw, not ok at all
<jfalempe>
sima: that's what I think too.
<sima>
that's why we need the drm_panic_lock so that drivers can protect these additional pieces they might need in their panic code
<sima>
like when there's no way to write into the buffer reliably because unmapped vram
<sima>
except with these peek/poke registers
<jfalempe>
ok and it won't be practical to trylock all planes panic_lock in this case.
<sima>
yeah
<sima>
or would just add more potential failure paths and issues
<sima>
or people trying to use spin_trylock because hey it works in hardirq context (but not in nmi)
<jfalempe>
sima: ok so I will leave the panic_lock at device level.
dorcaslitunyaVM has quit [Read error: Connection reset by peer]
dorcaslitunya has quit [Read error: Connection reset by peer]
dorcaslitunya has joined #dri-devel
Jeremy_Rand_Talos__ has quit [Remote host closed the connection]
qyliss has quit [Quit: bye]
qyliss has joined #dri-devel
cmichael has joined #dri-devel
KetilJohnsen has joined #dri-devel
apinheiro has joined #dri-devel
rossy_ has quit []
rossy has joined #dri-devel
sgruszka has joined #dri-devel
rossy has quit []
rossy has joined #dri-devel
dorcaslitunyaVM has joined #dri-devel
glennk has quit [Ping timeout: 480 seconds]
Leopold has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
davispuh has joined #dri-devel
CounterPillow has quit [Ping timeout: 480 seconds]
CounterPillow has joined #dri-devel
bl4ckb0ne has quit [Remote host closed the connection]
Nefsen402 has quit [Remote host closed the connection]
emersion has quit [Remote host closed the connection]
emersion has joined #dri-devel
Nefsen402 has joined #dri-devel
bl4ckb0ne has joined #dri-devel
sgruszka has quit [Quit: Powered by WinIRC]
cmichael has quit [Remote host closed the connection]
rasterman has quit [Remote host closed the connection]
rasterman has joined #dri-devel
dorcaslitunya has quit [Read error: Connection reset by peer]
dorcaslitunyaVM has quit [Read error: Connection reset by peer]
ninjaaaaa has joined #dri-devel
simondnnsn has joined #dri-devel
dorcaslitunya has joined #dri-devel
dorcaslitunyaVM has joined #dri-devel
glennk has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
cmichael has joined #dri-devel
<pq>
Is drm_fixp2int_round() really ok? What is it supposed to do?
<pq>
in kernel
<pq>
it's certainly not rounding the way I understand rounding
guludo has joined #dri-devel
yyds has quit [Remote host closed the connection]
glennk has quit [Ping timeout: 480 seconds]
dorcaslitunya has quit [Remote host closed the connection]
dorcaslitunyaVM has quit [Read error: Connection reset by peer]
<tnt>
pq: DRM_FIXED_POINT_HALF looks weird to me. Should be DRM_FIXED_POINT AFAICT.
<pq>
that would make more sense
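The rounding concern above can be illustrated with a small standalone sketch. It assumes a signed 32.32 fixed-point format (32 fractional bits). The names below are illustrative and are not the kernel's helpers, and the "small bias" variant is only an assumed example of a bias that is far smaller than one half.

```c
#include <stdint.h>

/* Assumed format: signed 32.32 fixed point. Names are illustrative. */
#define FIXP_FRAC_BITS 32
#define FIXP_ONE ((int64_t)1 << FIXP_FRAC_BITS)

static int64_t int2fixp(int a)
{
	return (int64_t)a * FIXP_ONE;
}

/*
 * Truncating conversion: drops the fractional bits.
 * Right-shifting a negative value is implementation-defined in C;
 * common compilers perform an arithmetic shift here.
 */
static int fixp2int(int64_t a)
{
	return (int)(a >> FIXP_FRAC_BITS);
}

/* Round to nearest: add one half (FIXP_ONE / 2), then truncate. */
static int fixp2int_round(int64_t a)
{
	return fixp2int(a + FIXP_ONE / 2);
}

/*
 * A bias of 1 << 15 is tiny compared with one half (1 << 31) in this
 * format, so the result is effectively still truncation.
 */
static int fixp2int_round_small_bias(int64_t a)
{
	return fixp2int(a + ((int64_t)1 << 15));
}
```

For 1.5 in this format, adding a true half rounds up to 2, while the small bias leaves the result at 1, which is the kind of behaviour that does not look like rounding.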
bmodem has quit [Ping timeout: 480 seconds]
<dolphin>
mripard: I think your MUA might have some problem (or mine), you seem to have replied to a mail that I initially never got and it appears as a reply to HDMI connector thread
fab has quit [Read error: Connection reset by peer]
fab has joined #dri-devel
vliaskov has quit [Remote host closed the connection]
<mripard>
dolphin: yeah, I screwed up on wednesday
<mripard>
it shouldn't be a problem anymore, but all the mails I've sent then have the same msg-id
<dolphin>
right, that explains the mayhem in 'alot' view
cmichael has quit [Remote host closed the connection]
cmichael has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
<mripard>
jani: thank you so much for the b4 integration in dim :)
cmichael has quit []
jsa has joined #dri-devel
davispuh has joined #dri-devel
apinheiro has quit [Quit: Leaving]
vals_ has quit [Ping timeout: 480 seconds]
guludo has quit [Ping timeout: 480 seconds]
guludo has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
bmodem has joined #dri-devel
cmichael has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
bolson has quit [Remote host closed the connection]
jsa has quit [Read error: Connection reset by peer]
padovan4 has joined #dri-devel
glennk has joined #dri-devel
Calandracas has quit [Remote host closed the connection]
jsa has joined #dri-devel
Calandracas has joined #dri-devel
zhiwang1 has joined #dri-devel
jsa has quit [Read error: Connection reset by peer]
tango_ has joined #dri-devel
jsa has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
simon-perretta-img has joined #dri-devel
Net147 has quit [Quit: Quit]
glennk has quit [Ping timeout: 480 seconds]
vliaskov has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
Net147 has joined #dri-devel
<sima>
jfalempe, I'm not sure we need any panic KCONFIG
<sima>
like even if people enable both fbcon and drm_panic the mess should be limited
Net147 has quit []
Net147 has joined #dri-devel
<sima>
for one, fbcon might not run on all drm drivers, or it's disabled because a compositor is running and so won't show anything
<sima>
on the other side, drivers need to write explicit support for drm-panic anyway
<jfalempe>
sima: ok that's fine. just asking because in an ideal world if you don't enable drm panic, you don't want to lock/unlock when updating the plane state.
<sima>
so in practice I don't expect many conflicts
<sima>
and even if you get them, panic goes through all of them in a loop
<sima>
so in the end, either drm-panic or fbcon wins, even if they both write into the same fb
<sima>
I think at least ...
<sima>
jfalempe, I don't think you can measure that, and we shouldn't add complexity with no benefit
<sima>
and every kconfig we add has a cost
<jfalempe>
I have a small workaround to disable fbcon when drm_panic runs, to avoid the graphic mess.
<jfalempe>
but I prefer to have a clean drm_panic merged first.
ninjaaaaa has quit [Read error: Connection reset by peer]
simondnnsn has quit [Read error: Connection reset by peer]
<sima>
jfalempe, graphical session should be enough to disable fbcon, or not?
<jfalempe>
sima: I didn't have issues with a graphical session, but only tested with matrox and simpledrm
<jfalempe>
sima: I tested the conflict between fbcon and drm_panic on my arm device, with the imx driver. (and this one doesn't have a graphical session).
ninjaaaaa has joined #dri-devel
simondnnsn has joined #dri-devel
junaid has joined #dri-devel
glennk has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
simon-perretta-img has joined #dri-devel
<DemiMarie>
sima: When it comes to panic outputs, I think your expectations for “reasonable” don’t match with reality.
<sima>
... what are my expectations? thus far I only looked at the locking, not once what it actually shows
<DemiMarie>
That there will be other ways to get the data out.
<sima>
uh that's not my expectation
<sima>
but we absolutely need to make sure that if there are other options, we don't go crash&burn in the drm panic handler and prevent those others from even having a chance
<DemiMarie>
Fair
<sima>
this is why the new console panic code is two stage, first it does everything which is safe
vliaskov has quit [Remote host closed the connection]
<sima>
and then it goes for the last ditch options
<sima>
and then it goes to fbcon to burn it all down :-)
<DemiMarie>
So on a typical client system you have two options: EFI pstore and DRM panic.
<sima>
only issue is that the panic notifiers drm-panic will use currently only have the first stage, but that can be fixed
<sima>
DemiMarie, netcon for desktops is pretty good too
<DemiMarie>
sima: not in my world, where the host is running completely offline
<sima>
or if it's network on a thunderbolt extension box for laptops
<sima>
DemiMarie, I think _that_ is unusual :-)
<sima>
plus you can just airgap a second machine like a rpi just to record netcon
<sima>
been there, done that
<DemiMarie>
sima: Unusual? Sadly, yes. Unreasonable? No. And we have both network and USB assigned to guests.
<sima>
also uart over usb cables works pretty well I've heard
kzd has joined #dri-devel
KetilJohnsen has quit [Ping timeout: 480 seconds]
<DemiMarie>
USB is assigned to guests too
<sima>
yeah that one needs special setup
<sima>
and special cable
<sima>
and an uart dongle to another machine
<sima>
but if all else fails, it tends to work very well, since it's just dead slow mmio writes
<DemiMarie>
So what I am saying is that your code may well be the only way to get messages off.
<sima>
yeah, but also: I've seen way too many pstore dumps that just show drm fbcon dying in panic
<sima>
so we really have to avoid that as the first goal
fab has quit [Read error: Connection reset by peer]
<sima>
DemiMarie, but otherwise I'm fully on board with you, which is why the new drm panic should be able to get stuff out even when you watch a video with yuv scanout
<sima>
the old one was just crash&burn in that case
Company has joined #dri-devel
<DemiMarie>
sima: A crash kernel would be awesome, *if* it could be made to work with LUKS-encrypted storage.
jsa has quit [Read error: Connection reset by peer]
<sima>
DemiMarie, hm didn't mjg59 do some very fancy demo with tpm secret shuffling to make that work?
<sima>
but extremely far away from where I have clue
<DemiMarie>
sima: could there be special handling for cases where the panic was in process context?
<DemiMarie>
I’m thinking of stuff like, “Attempted to kill init!”, which is a userspace bug.
<DemiMarie>
vsyrjala: why will the firmware be doing the writes?
Haaninjo has joined #dri-devel
<vsyrjala>
which firmware?
<sima>
jfalempe, oh btw, why are you not using kmsg_dump_register? that looks a lot more like the thing we want ...
<DemiMarie>
vsyrjala: whatever does the MMIO writes
hansg has quit [Quit: Leaving]
<sima>
jfalempe, I guess I forgot why we're picking the panic notifier and not kmsg_dumper? since the latter is what pstore also uses ...
<sima>
DemiMarie, not sure what you'd gain in process context, the scheduler refuses service anyway?
<sima>
you can trylock more locks, but that's about it I think
<DemiMarie>
sima: maybe that particular panic (and others that are not actual kernel bugs) should do some stuff first.
<DemiMarie>
But that is getting off-topic
jsa has joined #dri-devel
<sima>
yeah, maybe it could oops first and then panic
<vsyrjala>
DemiMarie: oh that. it's just a small dma engine thingy. it doesn't have firmware. though eventually there will probably be firmware pain also added to the whole mix
<sima>
and with kmsg_dump we could have a knob so that we also print stuff on oops
<sima>
jfalempe, oh I've found your mail, I think we should switch over to kmsg_dumper
<sima>
I totally forgot about that again
<sima>
noralf's og drm-panic also used kmsg_dumper, so you should be able to steal code from there
<sima>
jfalempe, the other reason for kmsg_dumper is that at that point the panic output isn't even complete, so we definitely have to use that and not the notifier
<DemiMarie>
How does Windows display its BSODs?
<sima>
tbh no idea, but the kernels are fairly fundamentally different in so many ways I don't think the design would translate at all
rgallaispou has quit [Quit: Leaving.]
jkrzyszt has quit [Ping timeout: 480 seconds]
<tleydxdy>
does anyone know why occasionally (10s-1min) drm would stop putting any jobs onto the HW for ~20ms? from gpuvis I still see that ioctls are coming in but no jobs are being scheduled and no dma_fence is coming back
padovan43 has joined #dri-devel
opotin65 has joined #dri-devel
warpme has quit []
<tleydxdy>
just tried, this can be easily reproduced by tracing vkcube with VK_PRESENT_MODE_IMMEDIATE_KHR
apinheiro has joined #dri-devel
orbea1 has joined #dri-devel
orbea has quit [Read error: Connection reset by peer]
orbea1 has quit []
orbea has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
OftenTimeConsuming has quit [Remote host closed the connection]
mripard has quit [Remote host closed the connection]
OftenTimeConsuming has joined #dri-devel
<jenatali>
gfxstrand: Is there a generic pass that removes out-of-bounds loads/stores? vars_to_ssa does it for locals, and loop unrolling attempts to do it in some cases
<jenatali>
And if I was going to write one, should that be backend-specific? Seems like it should be general
<gfxstrand>
No, there's nothing general.
<gfxstrand>
What are you thinking?
<gfxstrand>
For NVK, we need something that gives us some sort of bounds checking behavior for indirect scratch access because the GPU just faults and kills your context and lots of apps seem to be hitting that.
<jenatali>
gfxstrand: For function_temp and shared, our backend wants them as derefs, since DXIL uses LLVM GEPs which are basically the same thing
<jenatali>
But if you have a GEP with a literal out-of-bounds index, that fails to validate, even if it's in code that never executes, so I need to remove those
jsa has quit []
<gfxstrand>
ah
<gfxstrand>
Yeah, that pass doesn't exist
<jenatali>
Worth being general for pre-io-lowering?
<gfxstrand>
IDK
<jenatali>
That sounds like a no. Makes my life easier. We can always move it later if someone else wants it
<gfxstrand>
Sounds good
kts has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
glennk has quit [Ping timeout: 480 seconds]
cmichael has quit [Quit: Leaving]
kts has quit [Ping timeout: 480 seconds]
lynxeye has quit [Quit: Leaving.]
fab has joined #dri-devel
mbrost has joined #dri-devel
tanty has quit [Quit: Ciao!]
<alyssa>
jenatali: going with no... statically invalid but unreachable code is very much a layered driver specific problem
<alyssa>
so unless zink wants it, probably doesn't matter
<jenatali>
Yep, fair enough
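The kind of pass discussed above can be sketched on a toy representation. This is not NIR: struct array_access and remove_const_oob() are hypothetical, and the sketch only drops accesses whose literal index is out of bounds, whereas a real pass would also have to give removed loads a replacement value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mini-IR, not NIR: each access indexes an array variable. */
struct array_access {
	size_t array_len;    /* declared length of the array being indexed */
	bool index_is_const; /* true when the index is a literal */
	size_t const_index;  /* valid only when index_is_const is true */
};

/*
 * Remove accesses whose literal index is statically out of bounds.
 * Compacts the list in place, preserves order, returns the new count.
 * Accesses with a non-constant index are left alone.
 */
static size_t remove_const_oob(struct array_access *acc, size_t n)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (acc[i].index_is_const &&
		    acc[i].const_index >= acc[i].array_len)
			continue; /* statically invalid: drop it */
		acc[kept++] = acc[i];
	}
	return kept;
}
```

Only literal indices are checked, matching the case described where a statically out-of-bounds index fails validation even in code that never executes.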
warpme has joined #dri-devel
oneforall2 has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
heat is now known as Guest1512
Guest1512 has quit [Read error: Connection reset by peer]
heat has joined #dri-devel
tanty has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
Sachiel has quit [Ping timeout: 480 seconds]
mbrost_ has joined #dri-devel
<jfalempe>
sima: with kmsg_dumper, you don't even have the panic reason. so you need to parse the kmsg to retrieve some useful output, which is not great at all.
<jfalempe>
also I target drm_panic at the average user of a Linux distribution, so I want to have only one or two lines of text. All debug info can then go in a qr_code, so you can open a bug and have some info directly there.
<jfalempe>
I find it better than a blurry picture of an fbcon output.
mbrost has quit [Ping timeout: 480 seconds]
dorcaslitunya has joined #dri-devel
dorcaslitunyaVM has joined #dri-devel
Dr_Who has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
dorcaslitunya has quit [Ping timeout: 480 seconds]
dorcaslitunyaVM has quit [Ping timeout: 480 seconds]
padovan4 has quit []
padovan4 has joined #dri-devel
Sachiel has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
zhiwang1 has quit [Quit: Connection closed for inactivity]
Dr_Who has joined #dri-devel
Duke`` has joined #dri-devel
glennk has joined #dri-devel
mbrost_ has quit [Ping timeout: 480 seconds]
anujp has joined #dri-devel
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
flto has quit [Remote host closed the connection]
flto has joined #dri-devel
gouchi has joined #dri-devel
gouchi has quit [Remote host closed the connection]
anujp has quit [Ping timeout: 480 seconds]
raoul^ has joined #dri-devel
anujp has joined #dri-devel
dorcaslitunya has joined #dri-devel
anujp has quit [Ping timeout: 480 seconds]
soreau has quit [Ping timeout: 480 seconds]
soreau has joined #dri-devel
Dr_Who has quit []
anujp has joined #dri-devel
rasterman has quit [Remote host closed the connection]
dviola has quit [Ping timeout: 480 seconds]
rasterman has joined #dri-devel
Leopold_ has quit [Remote host closed the connection]
Leopold_ has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
simon-perretta-img has joined #dri-devel
anujp has quit [Ping timeout: 480 seconds]
Dr_Who has joined #dri-devel
anujp has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
iive has joined #dri-devel
qflex has quit []
dorcaslitunya has quit [Remote host closed the connection]
mbrost has joined #dri-devel
anujp has quit [Ping timeout: 480 seconds]
junaid has quit [Remote host closed the connection]
Duke`` has quit [Ping timeout: 480 seconds]
mbrost has quit [Remote host closed the connection]
mbrost has joined #dri-devel
apinheiro has quit [Remote host closed the connection]
anujp has joined #dri-devel
sima has quit [Ping timeout: 480 seconds]
Leopold_ has quit [Remote host closed the connection]
HI has joined #dri-devel
Leopold has joined #dri-devel
anujp has quit [Ping timeout: 480 seconds]
HI has quit [Remote host closed the connection]
anujp has joined #dri-devel
vliaskov has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
oneforall2 has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
oneforall2 has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
oneforall2 has quit [Remote host closed the connection]