ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
dakr has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
linearcannon has joined #dri-devel
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
TMM has joined #dri-devel
co1umbarius has joined #dri-devel
columbarius has quit [Ping timeout: 480 seconds]
Thymo_ has joined #dri-devel
Thymo has quit [Ping timeout: 480 seconds]
<linearcannon> is there any technical reason that a fairly simple framebuffer driver like "ast" could not support PRIME?
<airlied> in theory it doesn't have hw accel so it's a lot of CPU overhead
<linearcannon> for context, i'm doing some research and development work involving pure software-rendered graphics, on a SuperMicro board which uses that driver for the onboard graphics. i want to run Sway, which currently seems to require PRIME support, and i want to be able to test vgem.
<linearcannon> if possible, i'd rather do some kernel hacking and let my (rather beefy) cpu handle it, than grab a GPU that will barely actually be used
<airlied> I think there are some patches posted for ast to enable it
<airlied> no idea how it works in practice
<linearcannon> ah, so there are! somehow i missed that in my initial round of searching, i'll have to give that a shot
kts has quit [Ping timeout: 480 seconds]
<alyssa> jekstrand: We need to land https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16676 or something like it
<alyssa> I think any backend that wants OpenCL needs some flavour of that pass ... It's blocking OpenCL on Valhall at any rate
<alyssa> for naming "lower_mem_width" is my preferred bikeshed flavour (and I have my own version of !16676 with some panfrost fixes), I don't really care as long as we figure something out
<alyssa> You put a preliminary... de facto nak? on there for 2 reasons
<alyssa> #2 was about nir_opt_load_store_vectorize, I don't see any connection tbh
<alyssa> ld/st vectorize is fundamentally about combining instructions, this is fundamentally about splitting them up
<alyssa> (In other words, an optimized OpenCL implementation will want to use both)
<alyssa> #1 ("It's Intel specific") is the bigger issue ... I don't really know how to square this, because I don't know what other backends want
linearcannon has quit [Remote host closed the connection]
pallavim has joined #dri-devel
camus has joined #dri-devel
Jeremy_Rand_Talos__ has quit [Remote host closed the connection]
Jeremy_Rand_Talos__ has joined #dri-devel
camus has quit [Remote host closed the connection]
camus has joined #dri-devel
pallavim has quit [Ping timeout: 480 seconds]
nchery_ has joined #dri-devel
pallavim has joined #dri-devel
nchery__ has quit [Ping timeout: 480 seconds]
ella-0 has joined #dri-devel
ella-0_ has quit [Read error: Connection reset by peer]
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
<jekstrand> alyssa: Yeah...
<jekstrand> alyssa: That wasn't a nak, it was a "this needs clean-up on its way to core".
<jekstrand> alyssa: It pretty much just needs a callback function which says how wide to go.
<jekstrand> And we'll have to figure out how to make the wider load hack work.
<jekstrand> alyssa: Very much not a nak
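The "callback function which says how wide to go" idea can be sketched roughly as follows. This is a minimal stand-alone model, not the pass from !16676: the names `toy_mem_width_cb`, `toy_lower_mem_width` and `toy_width_aligned_pow2` are hypothetical, and the example policy (naturally aligned 1-, 2- and 4-byte accesses) is an assumption chosen only for illustration. It does not attempt the "wider load hack".

```c
#include <stddef.h>

/* Hypothetical callback type: given the current byte offset and the bytes
 * still to be accessed, return the widest access (in bytes) the backend
 * can issue as a single instruction. */
typedef unsigned (*toy_mem_width_cb)(unsigned offset, unsigned bytes_left);

/* Split one access of `size` bytes at `offset` into backend-sized chunks.
 * Chunk widths are written to `widths`; the return value is the number of
 * chunks produced (at most `max_chunks`). */
size_t toy_lower_mem_width(unsigned offset, unsigned size,
                           toy_mem_width_cb cb,
                           unsigned *widths, size_t max_chunks)
{
    size_t n = 0;

    while (size > 0 && n < max_chunks) {
        unsigned w = cb(offset, size);

        /* Clamp defensively so a misbehaving callback cannot stall the
         * loop or run past the end of the access. */
        if (w == 0)
            w = 1;
        if (w > size)
            w = size;

        widths[n++] = w;
        offset += w;
        size -= w;
    }
    return n;
}

/* Example policy (an assumption, not any real backend's rule):
 * 1-, 2- and 4-byte accesses, each naturally aligned. */
unsigned toy_width_aligned_pow2(unsigned offset, unsigned bytes_left)
{
    for (unsigned w = 4; w > 1; w /= 2) {
        if (bytes_left >= w && offset % w == 0)
            return w;
    }
    return 1;
}
```

Under this example policy, a 7-byte access at offset 1 is split into 1 + 2 + 4 bytes. Keeping the policy in a callback is what would let each backend state its own limits while the splitting loop stays common.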
oneforall2 has quit [Ping timeout: 480 seconds]
<alyssa> :-D
oneforall2 has joined #dri-devel
LexSfX has quit []
LexSfX has joined #dri-devel
slattann has joined #dri-devel
nchery_ has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
bmodem has joined #dri-devel
bmodem has quit []
danvet has joined #dri-devel
Duke`` has joined #dri-devel
itoral has joined #dri-devel
nchery_ has joined #dri-devel
adavy has joined #dri-devel
Namarrgon has quit [Ping timeout: 480 seconds]
tzimmermann has joined #dri-devel
fab has joined #dri-devel
dviola has joined #dri-devel
Namarrgon has joined #dri-devel
nchery_ has quit [Ping timeout: 480 seconds]
mszyprow has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
srslypascal is now known as Guest914
srslypascal has joined #dri-devel
Guest914 has quit [Ping timeout: 480 seconds]
srslypascal has quit []
srslypascal has joined #dri-devel
everfree has quit [Quit: leaving]
everfree has joined #dri-devel
dviola has quit [Quit: WeeChat 3.6]
kts has joined #dri-devel
dviola has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
fab has quit [Quit: fab]
itoral_ has joined #dri-devel
itoral has quit [Ping timeout: 480 seconds]
nchery_ has joined #dri-devel
jfalempe has joined #dri-devel
<tzimmermann> javierm, did you ever understand the purpose of the line at https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_simple_kms_helper.c#L112 ?
fab has joined #dri-devel
<pq> daniels, javierm, with weston's kiosk-shell, don't you need to explicitly configure the apps to run in weston.ini? Simply running a Wayland client manually won't work? Or does it?
itoral__ has joined #dri-devel
Lucretia has quit []
jkrzyszt has joined #dri-devel
Lucretia has joined #dri-devel
itoral_ has quit [Ping timeout: 480 seconds]
sergi3 has quit []
sergi has joined #dri-devel
sergi has quit []
sergi has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
lemonzest has joined #dri-devel
<javierm> pq: I have no idea :)
nchery_ has quit []
<javierm> it certainly didn't in my tests. The foot terminal started and I could see it in the scene-graph but wasn't displayed
<javierm> daniels: oh, it seems the paste expires too quickly
nchery has joined #dri-devel
<javierm> tzimmermann: the drm_atomic_add_affected_planes() call ?
<tzimmermann> yes
<javierm> I'm reading that function now
<tzimmermann> it adds the planes for all of the state's crtcs
vliaskov has joined #dri-devel
sdutt has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
MajorBiscuit has joined #dri-devel
<javierm> tzimmermann: hmm, it seems it's already done by drm_atomic_helper_check() so maybe it isn't needed for drivers whose struct drm_mode_config_funcs .atomic_check is set to that helper?
kts has quit []
<tzimmermann> javierm, calling it results in an atomic_update for all planes. but atomic_check for crtcs runs after atomic_check for planes. so these atomic_updates run without atomic_check. that is suspicious :/
dviola has quit [Quit: WeeChat 3.6]
<tzimmermann> javierm, my next thought was that it triggers the simplekms update helper at https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_simple_kms_helper.c#L243
<tzimmermann> but that should happen in any case
itoral_ has joined #dri-devel
<tzimmermann> javierm, i copied the call into several drivers and i saw that ssd130x also has it. but i think it's not required. we should be able to leave it out until we figure out what it does
<tzimmermann> i thought you might have seen its purpose
<javierm> tzimmermann: no, in ssd1306 it's cargo culting from looking at what other drivers do
<tzimmermann> :)
lynxeye has joined #dri-devel
<tzimmermann> these simplekms drivers only have one plane, so the call seems unnecessary. when i recently worked on ast, which uses two planes, it didn't do what i expected.
<javierm> tzimmermann: agreed that for simplekms it's not needed
<javierm> nor should be needed for ssd130x and simpledrm that also have 1 plane
itoral__ has quit [Ping timeout: 480 seconds]
<tzimmermann> melissawen, please see the discussion above. do you know the purpose of the call to add_affected_planes at https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/vkms/vkms_crtc.c#L190 ?
<javierm> tzimmermann: there's also a drm_atomic_add_affected_connectors() but that's much less used by drivers
<javierm> both are called by drm_atomic_helper_check_modeset()
<tzimmermann> i know, but i've not come across it much
<javierm> the question is why the atomic state needs to be recalculated for all planes in a CRTC check? Is it because you could change the CRTC and it needs to add all planes associated with that CRTC?
<javierm> and wouldn't this be done already by drm_atomic_helper_check_modeset() if that helper is used by the driver?
<tzimmermann> from what i can tell, the calls make sense in check_modeset. connectors and planes are atomic_checked afterwards in the codepath
digetx has quit [Remote host closed the connection]
<javierm> tzimmermann: yes, that's my thinking too
<javierm> but calling it in the CRTC check seems superfluous to me
<tzimmermann> javierm, for example: if you enable a crtc, it most likely needs an active primary plane. adding the planes will guarantee that
<tzimmermann> so it's in check_modeset()
digetx has joined #dri-devel
<javierm> tzimmermann: yes, I understand why it's done by drm_atomic_helper_check(), and makes sense because after that it calls drm_atomic_helper_check_planes(dev, state)
<javierm> but don't see why drivers would need to call it again in their crtc atomic_check handler
thaytan has quit [Ping timeout: 480 seconds]
<javierm> in any case they should do it in their drm_mode_config_funcs .atomic_check if they have a custom one
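The ordering concern raised above can be modelled with a small toy, assuming the check sequence is "modeset check, then plane checks, then CRTC checks" and that the modeset step only pulls in planes for CRTCs that need a modeset. All types and functions here (`toy_plane`, `toy_state`, `toy_add_affected_planes`, `toy_atomic_check`) are hypothetical stand-ins, not kernel API; the sketch only shows why planes added from a CRTC check would miss their own plane check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; these are not the kernel's structs. */
struct toy_plane {
    int crtc; /* index of the CRTC this plane is bound to, or -1 */
};

struct toy_state {
    uint32_t crtc_mask;    /* CRTCs that are part of the atomic state */
    uint32_t modeset_mask; /* subset of crtc_mask that needs a modeset */
    uint32_t plane_mask;   /* planes that are part of the atomic state */
};

/* Pull every plane bound to `crtc` into the state. */
void toy_add_affected_planes(struct toy_state *state,
                             const struct toy_plane *planes, int num_planes,
                             int crtc)
{
    for (int i = 0; i < num_planes; i++) {
        if (planes[i].crtc == crtc)
            state->plane_mask |= UINT32_C(1) << i;
    }
}

/* Run the three assumed steps and return the mask of planes that ended up
 * in the state without ever going through a plane check. */
uint32_t toy_atomic_check(struct toy_state *state,
                          const struct toy_plane *planes, int num_planes,
                          int num_crtcs, bool add_in_crtc_check)
{
    /* Step 1: modeset check adds planes for CRTCs that need a modeset. */
    for (int c = 0; c < num_crtcs; c++) {
        if (state->modeset_mask & (UINT32_C(1) << c))
            toy_add_affected_planes(state, planes, num_planes, c);
    }

    /* Step 2: plane checks run for everything in the state right now. */
    uint32_t checked = state->plane_mask;

    /* Step 3: CRTC checks; a driver may add more planes here. */
    if (add_in_crtc_check) {
        for (int c = 0; c < num_crtcs; c++) {
            if (state->crtc_mask & (UINT32_C(1) << c))
                toy_add_affected_planes(state, planes, num_planes, c);
        }
    }

    return state->plane_mask & ~checked;
}
```

With two planes on one CRTC (as on ast) and an update that touches only the first plane without a modeset, the toy reports the second plane as added but unchecked when the CRTC check adds planes, and reports nothing when it does not.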
garrison has joined #dri-devel
i-garrison has quit [Read error: Connection reset by peer]
garrison has quit []
i-garrison has joined #dri-devel
thaytan has joined #dri-devel
garnet has joined #dri-devel
jkrzyszt has quit [Remote host closed the connection]
jkrzyszt has joined #dri-devel
pcercuei has joined #dri-devel
bmodem has joined #dri-devel
<Venemo> karolherbst: who the hell is this guy who's still trolling on the rusticl MR?
bmodem1 has joined #dri-devel
pa- has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
garnet has quit [Remote host closed the connection]
JohnnyonFlame has joined #dri-devel
pallavim has quit [Remote host closed the connection]
pallavim has joined #dri-devel
<MrCooper> Venemo: curro is the clover maintainer
<Venemo> MrCooper: how is that possible? he hasn't made a single commit to clover for many years
bmodem has joined #dri-devel
YuGiOhJCJ has joined #dri-devel
bmodem1 has quit [Ping timeout: 480 seconds]
<MrCooper> even so, I doubt anyone else would want to claim that title :)
<Venemo> hehe
rsalvaterra has quit []
rsalvaterra has joined #dri-devel
lemonzest has quit [Remote host closed the connection]
devilhorns has joined #dri-devel
<daniels> pq: nope, you can just start clients externally - worked fine for me
<daniels> javierm: can you please start weston as weston --log=/path/to/foo.log --logger-scopes=log,proto,drm-backend, and attach the foo.log
<pq> alright
pa has joined #dri-devel
lemonzest has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
pallavim has quit [Remote host closed the connection]
pallavim has joined #dri-devel
chip_x has joined #dri-devel
chipxxx has quit [Read error: Connection reset by peer]
<javierm> to make sure that this time won't go away :)
kts has quit [Quit: Konversation terminated!]
<javierm> I've also copied there the systemd unit files I'm using to test starting weston and foot on boot
lemonzest has quit [Quit: WeeChat 3.5]
mvlad has joined #dri-devel
<javierm> pq: about seatd vs systemd the other day, I noticed that with XDG_SEAT=seat0 and PAMName=login weston can be started by systemd
<melissawen> tzimmermann, javierm, for the vkms case, we are using it to have a link between all plane_state to crtc_state and get active planes for the planes composition that the driver does and compute crc in the end
<melissawen> well... afaik
kts has joined #dri-devel
<melissawen> this commit explains the context a little: https://cgit.freedesktop.org/drm/drm-misc/commit/?id=8b1865873651d
<pq> javierm, I think XDG_SEAT should be already set by systemd. But yeah, the running Weston doc has lots about having Weston in a service unit.
<melissawen> maybe danvet has more thoughts
<pq> javierm, ideally there would be a specific PAMName for weston only, with the appropriate PAM configuration to go with it. I've been told "login" is not quite right.
<pq> javierm, but that's all more in the distro integration domain.
<emersion> i think it's to be able to configure special auth rules for weston only
<pq> yeah
<emersion> so yeah, you'd need to ship a weston PAM file as well, which includes login
<javierm> pq: AFAICT is not in the docs yet, but in an open MR https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/439
<emersion> /etc/pam.d/weston with `auth include login`
<javierm> pq: and yes, I read that specifying the seat shouldn't be needed anymore but I guess is because I'm trying to not run it from logind ?
<pq> javierm, wait, so are you using seatd with weston in a systemd service unit?
<tzimmermann> melissawen, ok. thanks for your answer
<dolphin> javierm: I think you would prefer to run an user session and start foot from there
<javierm> dolphin: probably yes. I'm currently just experimenting
rasterman has joined #dri-devel
<dolphin> I'm actually doing much of the same but with sway and cog
<pq> there are three ways: weston as root from system service, weston as a user from system service, and weston as a user service (and you need to arrange that user to auto-login)
fahien has joined #dri-devel
<javierm> pq: there's a seat0 already that's started by systemd-user-sessions.service IIUC
<javierm> I'm using neither seatd nor logind
<javierm> pq: currently option 1 from your list
<dolphin> I have found the path of least resistance is to add a systemd override for getty@tty1 to autologin
<pq> javierm, yes. PAMName ideally starts a new session for the named user, activates all user services for that user, and then runs weston.
<javierm> pq: I see. Let me remove the explicit seat0 but I think that was failing without it...
<pq> what PAMName actually does depends on how PAM is configured for that PAMName
dviola has joined #dri-devel
<pq> javierm, you might be missing the service directives that get a seat and user for the service.
<javierm> Sep 19 12:35:02 fedora weston[1432]: [12:35:02.670] [libseat/backend/logind.c:704] The sd_session_get_seat() failed: -61
<pq> or it might even be dependent on VTs
<javierm> Sep 19 12:35:02 fedora weston[1432]: [12:35:02.670] [libseat/libseat.c:76] Backend 'logind' failed to open seat, skipping
<javierm> pq: well, I'm trying to run without VTs :)
<pq> I know, that's why I said the normal procedure might not work.
<javierm> pq: ah, sorry. Misunderstood
<javierm> so yeah, probably is TTYPath=/dev/tty7 what ties it with the seat
<pq> possibly, yes
<javierm> pq: wonder then if no tty should mean seat0 by default to weston
bmodem has quit [Ping timeout: 480 seconds]
<pq> I don't think it's Weston who needs XDG_SEAT set, it defaults to seat0 anyway.
<pq> it might be any of the other components
<pq> like a PAM plugin
<javierm> pq: I see. Thanks for the pointer
<javierm> I've so many knowledge gaps in the user-space graphics stack...
<pq> it's not even that, this all is session management stuff
<pq> I have huge gaps too
<kennylevinsen> javierm: want logind or seatd?
<javierm> pq: and it seems there are too many assumptions about a VT / tty? being always present in the system
<javierm> kennylevinsen: logind
<pq> also since stuff like PAM configs are distribution-specific mostly, it's even more vague
<javierm> yeah
<javierm> pq: so it seems I need to do some reading before doing more random experiments :P
<kennylevinsen> for logind you need the service to use a PAM stack (through PAMName) that calls out to pam_systemd.so
<javierm> kennylevinsen: yes, but I thought PAMName=login would be enough
<javierm> although it seems that relies on the tty to figure out the seat
<kennylevinsen> If on a systemd distro it should be enough. Try to make the service start bash instead and check loginctl output
<javierm> kennylevinsen: Ok, thanks
<kennylevinsen> then you can see what logind thinks about things, show-session is useful and sometimes you get more info by looking in /run/systemd stuff
<javierm> got it. I'll figure out the session management part then
<pq> kennylevinsen, the quirk is, javierm wants this with CONFIG_VT=n, and I don't know how to define which seat a service should take over so that logind would be happy to let it take control.
tobiasjakobi has joined #dri-devel
<kennylevinsen> XDG_SEAT I believe
itoral__ has joined #dri-devel
<kennylevinsen> Systemd-specific of course, as the name suggests >.>
<kennylevinsen> And requires udev tags set appropriately
<pq> so you need XDG_SEAT set before pam_systemd.so runs?
<kennylevinsen> yes, although I only remember faintly
<pq> well, it's the default seat, so no udev tags needed
<kennylevinsen> Also seat0 should still be default regardless
<kennylevinsen> I can have a look at it later if it still fails, need to go to a meeting
<pq> but with CONFIG_VT=n, you cannot assign a TTY to it
pallavim_ has joined #dri-devel
pallavim_ has quit [Remote host closed the connection]
<pq> pam_env.so could be used to set environment variables...
pallavim_ has joined #dri-devel
<emersion> pq, it's deprecated
<kennylevinsen> just use Environment= in the unit
<pq> kennylevinsen, does Environment apply before or after PAM stack?
<pq> since this env var is needed in the PAM stack and not in weston
<kennylevinsen> It's my best bet before I look at source. I think you can pass a debug flag to the systemd module ("debug")
itoral_ has quit [Ping timeout: 480 seconds]
tobiasjakobi has quit []
<javierm> kennylevinsen: using Environment="XDG_SEAT=seat0" in the unit file is what I did but thought that's a hacky way to do it
<pq> It's not hacky if it actually convinced logind to give control on that seat.
<pq> the service necessarily needs to take one specific seat
pallavim has quit [Ping timeout: 480 seconds]
<pq> I guess what confused me is that "normally" pam_systemd.so uses a heuristic to *set* XDG_SEAT rather than *use* XDG_SEAT.
<javierm> pq: it convinced and it starts weston on seat0, but thought that there would be another place to put that policy
<javierm> i.e: pam or logind setting a default XDG_SEAT=seat0 if no TTY or something like that
<pq> and I suppose setting the TTY is a way to trigger that heuristic.
<daniels> javierm: that's strange - are you using a packaged version of foot or git? I was running from git
<pq> I don't think "give ownership of the default seat" is something one would do by default.
<daniels> I can see that foot is making all the right requests, but for some reason it's just disappearing into the void
<pq> like if you log in via ssh, it's not seat0 - it's not any seat, IIRC
<javierm> daniels: from the f36 package: foot-1.13.1-1.fc36.x86_64
<javierm> but since it was working with the weston desktop shell, I thought that foot wouldn't be to blame
<daniels> javierm: do you mind trying from git please? it just built ootb for me at least
<daniels> yeah
<javierm> daniels: sure, one min
<daniels> thanks!
ahajda has joined #dri-devel
<javierm> daniels: same result with latest HEAD, commit https://gitlab.com/dnkl/foot/-/commit/debf1b8453ada57e69ec86fcb3fcb9ebf140d218
<javierm> daniels: this is a VM with virtio GPU in case that matters
<javierm> pq: got it. Makes sense then to explicitly set that for this service file
<pq> javierm, if you suspect GPU might have a problem, pass --use-pixman to Weston. Then it will use software rendering and DRM dumb buffers.
<pq> that then causes foot to use software rendering as well, if it wasn't already.
bmodem has joined #dri-devel
<javierm> pq: thanks, same result with pixman backend. Need to dig further but the strange thing is that foot is happy running but just not displayed
<javierm> anyways, need to work on more boring stuff now :) thanks to all folks for your assistance
<pq> no prob
<pq> it could be just kiosk-shell deciding to not show foot for some reason, if it's not present in the scenegraph.
<daniels> javierm: oh, I see the issue ...
<daniels> [atomic] drmModeAtomicCommit
<daniels> [repaint] flushed pending_state 0x20a2260
<daniels> somehow we've got ourselves into a state where we haven't committed anything, but seem to still be expecting a repaint event
<daniels> (this exact usecase wfm on i915 btw)
<daniels> hmm, maybe
vliaskov has quit [Remote host closed the connection]
vliaskov has joined #dri-devel
<vsyrjala> airlied: danvet: could you lay down some law for christian koenig? really getting fed up with him constantly pushing untested patches and breaking i915
Daanct12 is now known as Danct12
bmodem1 has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
pallavim_ has quit [Ping timeout: 480 seconds]
pallavim_ has joined #dri-devel
dakr has joined #dri-devel
abws has joined #dri-devel
pallavim_ has quit [Remote host closed the connection]
pallavim_ has joined #dri-devel
itoral__ has quit [Remote host closed the connection]
bmodem has joined #dri-devel
<kennylevinsen> pq, javierm: pam_systemd.so reads a lot of environment variables: https://github.com/systemd/systemd/blob/main/src/login/pam_systemd.c#L762
<kennylevinsen> and it has a bunch of "fun" heuristics, like seeing if there's an X11 server running, figuring out its controlling tty, converting that to a VT number and deciding the seat from that
<kennylevinsen> telling it explicitly is *definitely* better than relying on all that magic :P
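Pulling together the pieces mentioned in this thread, a service unit fragment for the "weston as root from a system service, no VT" case might look roughly like the sketch below. It uses only directives named in the discussion (`Environment=`, `PAMName=`) plus `ExecStart=`; the binary path and log path are assumptions, and whether `login` is the right PAM stack depends on the distribution's PAM configuration, as noted above.

```ini
[Service]
# Name the seat explicitly so pam_systemd.so does not have to infer it
# from a TTY, which does not exist with CONFIG_VT=n.
Environment="XDG_SEAT=seat0"

# Run through a PAM stack that calls pam_systemd.so. "login" is what was
# used in the discussion; a dedicated /etc/pam.d/weston was suggested as
# the cleaner long-term option.
PAMName=login

# Assumed install path and log location; adjust for the actual system.
ExecStart=/usr/bin/weston --log=/tmp/weston.log
```

If the session still does not get the seat, starting bash from the same unit and inspecting `loginctl` output, as suggested earlier, is a reasonable way to see what logind decided.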
bmodem1 has quit [Ping timeout: 480 seconds]
fahien has quit [Ping timeout: 480 seconds]
<javierm> kennylevinsen: yes, agree
<javierm> kennylevinsen: I guess this is similar to when we disabled all fbdev drivers in favour of simpledrm in fedora, a bunch of stuff broke and needed fixes because they were assuming that an early fbdev would exist
<javierm> gdm, plymouth, etc
fahien has joined #dri-devel
<pq> kennylevinsen, that makes sense now. Before I had no idea you even could tell pam_systemd the seat. :-)
<kennylevinsen> The More You (never wanted to) Know™
saurabhg has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
fab has joined #dri-devel
jewins has joined #dri-devel
mbrost has joined #dri-devel
abws has quit [Quit: abws]
mattst88 has quit [Read error: Connection reset by peer]
mattst88 has joined #dri-devel
Company has joined #dri-devel
rgallaispou has quit [Quit: Leaving.]
kts has quit [Ping timeout: 480 seconds]
<karolherbst> sooo.. to get iris pass the CL CTS I need !15811 !16442 and !18670
bmodem has quit [Read error: Connection reset by peer]
<karolherbst> would be cool if somebody could take a look to review/merge those
bmodem has joined #dri-devel
rgallaispou has joined #dri-devel
kts has joined #dri-devel
pcercuei has quit [Read error: Connection reset by peer]
pcercuei has joined #dri-devel
fxkamd has joined #dri-devel
chip_x has quit [Read error: No route to host]
Dr_Who has joined #dri-devel
anarsoul has quit [Quit: ZNC 1.8.2 - https://znc.in]
mbrost has quit [Read error: Connection reset by peer]
anarsoul has joined #dri-devel
fab has quit [Quit: fab]
fxkamd has quit []
sinatosk has joined #dri-devel
sdutt has joined #dri-devel
<DemiMarie> Could drmlog be used on panic? Windows at least manages to get text out on BSOD.
lemonzest has joined #dri-devel
<karolherbst> there were some ideas on showing a QR code on panics, but not sure where that went
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
sinatosk has quit []
sinatosk has joined #dri-devel
<DemiMarie> Another thought I had was for userspace to provide the kernel with an asymmetric encryption key during boot. In the case of a panic, Linux would use that to encrypt a crash dump to swap.
<karolherbst> there are ways of doing that already, but nothing of that is really user friendly atm
<DemiMarie> yeah
sinatosk has quit []
<karolherbst> but doing anything fs-related once you crashed the kernel is also quite dangerous
<karolherbst> what if you trash the fs?
<DemiMarie> I was thinking swap partition
<DemiMarie> but kexec might be the better solution, as is so often the case.
<karolherbst> well... you can have memory corruptions or weirdo locks taken
<karolherbst> yeah, kexec already solved some of the issue
<karolherbst> just that it's a mess if the graphics driver crashed :)
<karolherbst> or the GPU driver not able to use the GPU after kexec
<karolherbst> DemiMarie: there is kdum btw
<karolherbst> *kdump
rgallaispou has quit [Read error: Connection reset by peer]
<DemiMarie> karolherbst: yes, and as far as I can tell it is unencrypted so nobody shipping to end-users actually uses it
<DemiMarie> Encryption would make it possible to use in practice
<DemiMarie> karolherbst: How common is this?
<karolherbst> it's more confusing than a real problem
<karolherbst> so when kdump takes the dump, users might force reboot because black screen and such
<DemiMarie> What is?
<karolherbst> crashed GPU driver
alyssa has left #dri-devel [#dri-devel]
alyssa has joined #dri-devel
JohnnyonFlame has joined #dri-devel
<alyssa> jekstrand: So, panfrost needs to disable some optimizations for compute kernels if shared memory is used
<alyssa> (workgroup local memory)
<alyssa> namely, if workgroup local memory is used, then the hardware cannot split up or merge together workgroups that are too small or too large
fahien1 has joined #dri-devel
<alyssa> so we need a way to detect whether shared memory is used
<alyssa> nominally, `nir->info.shared_size > 0` is that check ... but that doesn't work!
<alyssa> because in OpenCL, `nir->info.shared_size` can be 0, with a variable amount of shared memory allocation at enqueue time!
<alyssa> so I guess I have 2 options here
<alyssa> one is to extend nir_gather_info to statically check for any shared memory intrinsics and just report a bool of "can this shader possibly use shared memory?"
os369510 has joined #dri-devel
fahien has quit [Ping timeout: 480 seconds]
<alyssa> the other is to pinky-promise not to ever look at nir->info.shared_size at compile-time and move the "can shared memory be used?" check to enqueue-time when we actually have the shared size.
<alyssa> the latter is a lot lazier and should work fine for my hardware
<alyssa> unsure whether we want this solved in NIR more properly, though, because nir->info.shared_size is a huge footgun for every driver that wants to eventually support CL
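The two options can be contrasted with a small model. This is a sketch under assumptions: `toy_op`, `toy_may_use_shared` and `toy_may_merge_workgroups` are hypothetical names, the op list stands in for a walk over a shader's intrinsics, and the "merge or split allowed only when no shared memory is in use" rule is taken from the description above rather than from driver code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified instruction kinds. */
enum toy_op {
    TOY_OP_ALU,
    TOY_OP_LOAD_GLOBAL,
    TOY_OP_STORE_GLOBAL,
    TOY_OP_LOAD_SHARED,
    TOY_OP_STORE_SHARED,
};

/* Option 1 (compile time): scan for any shared-memory access and report
 * whether the shader can possibly use shared memory. */
bool toy_may_use_shared(const enum toy_op *ops, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (ops[i] == TOY_OP_LOAD_SHARED || ops[i] == TOY_OP_STORE_SHARED)
            return true;
    }
    return false;
}

/* Option 2 (enqueue time): decide once the real size is known. The
 * statically declared size alone is not enough, because OpenCL can add a
 * variable amount of shared memory when the kernel is enqueued. */
bool toy_may_merge_workgroups(unsigned static_shared_size,
                              unsigned enqueue_shared_size)
{
    return static_shared_size == 0 && enqueue_shared_size == 0;
}
```

A kernel with a static size of 0 but 64 bytes supplied at enqueue time shows the footgun: a compile-time `shared_size > 0` test would say "no shared memory", while both checks above correctly say the optimization must be disabled.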
cphealy has joined #dri-devel
rgallaispou has joined #dri-devel
fahien1 is now known as fahien
anarsoul has quit [Read error: Connection reset by peer]
anarsoul has joined #dri-devel
<jekstrand> alyssa: Yeah...
<jekstrand> alyssa: Does the workgroup merging stuff involve re-compiles?
<alyssa> No, not for Mali at least
<alyssa> ir3 is the only hw in-tree that actually requires recompiles based on shared size
kts has quit [Ping timeout: 480 seconds]
xroumegue has quit [Ping timeout: 480 seconds]
mszyprow has quit []
<karolherbst> alyssa: this is totally fine though, CL allows you to report the max workgroup size based on what the shader has
<karolherbst> but yeah.. having a variable amount of shared mem can be problematic, as you have to assume the worst case I think
<karolherbst> there is one thing I am still wondering about: apparently you can declare a sized local mem array in a CL kernel, but I don't actually know what the compiler is making out of this
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
xroumegue has joined #dri-devel
<karolherbst> ehh maybe not
<karolherbst> alyssa: I think it might make sense to add another info field: has_variable_shared_mem and nir_gather_info could set it
<karolherbst> it's a bit tricky to figure that out though
Duke`` has joined #dri-devel
mbrost has joined #dri-devel
<karolherbst> the frontend knows this though, so we could make it part of the gallium API
bmodem has quit []
bmodem has joined #dri-devel
* alyssa shrugs
<alyssa> karolherbst: The commit I linked from the low-overhead MR deals with my panfrost problem
<karolherbst> yeah, I already saw it
<alyssa> I don't *want* to kick the can down the road but also I am quickly running out of OpenCL time for the month :-p
<karolherbst> I think it's fine to make runtime decisions at runtime and not compile time
<karolherbst> what does that merging affect btw, block size?
<karolherbst> I am planning to rework all the workgroup info stuff based on my MR to actually allow drivers to report back runtime info so I don't have to assume the worst case (subgroup size)
<karolherbst> and I also want to hook up last_block :)
<alyssa> nod
bmodem has quit [Ping timeout: 480 seconds]
mbrost has quit [Ping timeout: 480 seconds]
slattann has quit []
rgallaispou has left #dri-devel [#dri-devel]
os369510 has quit [Remote host closed the connection]
rgallaispou has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
<pinchartl> sravn: "[PATCH v1 0/12] drm bridge updates" looks very nice. sorry for not noticing it earlier. I only have a small comment on 05/12. I think you can apply patches 01/12 to 11/12 (excluding 07/12 as the issue it addresses has already been fixed in drm-misc)
ybogdano has joined #dri-devel
anarsoul has quit []
anarsoul has joined #dri-devel
rgallaispou has left #dri-devel [#dri-devel]
<jenatali> alyssa: sounds like what you want is a scan for whether there's any ops on shared memory
<alyssa> jenatali: that's the gather_info change I suggested
<jenatali> (our backend has to do that currently because DXIL is invalid if it declares unused shared memory...)
<alyssa> but then I realized instead of typing out a 50 line patch to common code I can just do a 1 line patch to panfrost and ignore the whole mess :D
<jenatali> So if you did put it in common I could use it instead of our current scan, but it's not complicated so w/e
<alyssa> your scan seems to be all _dxil ops
<jenatali> Oh, sure, but those are just the same as the common ones but with uint offsets iirc
<jenatali> Instead of byte offsets
<alyssa> nod
<karolherbst> jenatali: it will become complicated once we don't inline everything
<alyssa> That reminds me I need to introduce load_global_agx and friends..
<jenatali> karolherbst: Yeah, DXIL requires everything to be inlined currently though (I think) so if the frontend didn't, I'd do it in the backend for now
<karolherbst> but it's really easy to actually know this: if the kernel has local mem args, there is variable shared mem
<karolherbst> end of story
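The frontend-side rule stated here can be sketched in a few lines. The `toy_kernel_arg` description and `toy_has_variable_shared_mem` are hypothetical; a real frontend would read the address space from its own kernel argument metadata.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one kernel argument. */
enum toy_addr_space {
    TOY_AS_PRIVATE,
    TOY_AS_GLOBAL,
    TOY_AS_CONSTANT,
    TOY_AS_LOCAL,
};

struct toy_kernel_arg {
    bool is_pointer;
    enum toy_addr_space space; /* only meaningful when is_pointer is set */
};

/* A kernel has a variable amount of shared memory if any of its arguments
 * is a pointer into the local address space, since the size of that
 * allocation is only supplied when the argument is set. */
bool toy_has_variable_shared_mem(const struct toy_kernel_arg *args,
                                 size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (args[i].is_pointer && args[i].space == TOY_AS_LOCAL)
            return true;
    }
    return false;
}
```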
<alyssa> actually what I want is a formatted memory load for AGX
<karolherbst> huh?
<alyssa> karolherbst: AGX's memory loads are formatted
<karolherbst> formatted how?
<alyssa> i8, i16, f16, i32, rgba8, rgb10a2, ...
<karolherbst> ahh images
<alyssa> no
<robclark> karolherbst: re: crashes vs fs.. chromebooks have console-ramoops so most cases we can get previous dmesg after a (warm) reboot.. which along w/ suzyq is a pretty big improvement vs debugging windows laptops ;-)
<alyssa> just memory
<karolherbst> alyssa: like.. you have to load differently depending on the datatypes?
<karolherbst> so loading a f32 with a i8v4 load would get you different results or something?
<karolherbst> robclark: sure.. but samsung (or some OEM) had buggy UEFI so people don't rely on that in distributions...
<karolherbst> there is actually UEFI storage for this purpose
<karolherbst> but no
<karolherbst> one had to screw it up
<alyssa> output_type load(base_address, offset, extra_shift, format) {
<alyssa> format *array = (format *) base_address;
<alyssa> return array[offset << extra_shift] as output_type;
<alyssa> }
<alyssa> that is roughly the hardware behaviour
<karolherbst> okay.. so the format doesn't matter
<alyssa> yes, it does
<alyssa> there's a format conversion from the memory format to the register format
<karolherbst> would it break stuff if you load all 32 bit types as u32?
<alyssa> never mind
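The pseudocode above can be made concrete for a few integer formats. This is only a sketch of the described behaviour, not the AGX instruction: `toy_format`, `toy_format_size` and `toy_load` are hypothetical names, only signed 8/16/32-bit formats are modelled, and sign extension into a 32-bit result is an assumption about what "conversion to the register format" means for those types.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of memory formats. */
enum toy_format { TOY_FMT_I8, TOY_FMT_I16, TOY_FMT_I32 };

/* Size in bytes of one element of the given memory format. */
unsigned toy_format_size(enum toy_format format)
{
    switch (format) {
    case TOY_FMT_I8:  return 1;
    case TOY_FMT_I16: return 2;
    case TOY_FMT_I32: return 4;
    }
    return 0;
}

/* Treat base_address as an array of `format` elements, index it with
 * (offset << extra_shift), and convert the element to a 32-bit register
 * value. memcpy avoids alignment and aliasing problems. */
int32_t toy_load(const void *base_address, uint32_t offset,
                 unsigned extra_shift, enum toy_format format)
{
    size_t index = (size_t)offset << extra_shift;
    const uint8_t *elem =
        (const uint8_t *)base_address + index * toy_format_size(format);

    switch (format) {
    case TOY_FMT_I8: {
        int8_t v;
        memcpy(&v, elem, sizeof v);
        return v; /* sign-extended to 32 bits */
    }
    case TOY_FMT_I16: {
        int16_t v;
        memcpy(&v, elem, sizeof v);
        return v; /* sign-extended to 32 bits */
    }
    case TOY_FMT_I32: {
        int32_t v;
        memcpy(&v, elem, sizeof v);
        return v;
    }
    }
    return 0;
}
```

Because the index is scaled by the element size of the format, the same `offset` addresses different bytes depending on `format`, which is one reason the format is more than a hint.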
<karolherbst> the heck... the luxmark v3.1 C++ impl is faster than CL even on intel :D
heat has joined #dri-devel
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
aravind has quit [Ping timeout: 480 seconds]
<zmike> is there like a msaa version of glxgears somewhere?
<zmike> or some other very simple msaa-using app?
<karolherbst> zmike: glxgears -samples ?
<zmike> oh wow
<zmike> incredible
<alyssa> wild
TMM has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
<dj-death> OMG
TMM has joined #dri-devel
<karolherbst> okay.. so luxmark v3.1 ranking: 1. pocl 2. iris 3. llvmpipe 4. intel NEO: it crashes
lynxeye has quit [Quit: Leaving.]
<karolherbst> uhm.. well their C++ is even faster than pocl, but that's not CL, so it's clearly sheating
<karolherbst> *cheating
<karolherbst> also.. a good data point on the state of CL in general
<anholt> would be curious to see clvk in that mix
ybogdano has quit [Ping timeout: 480 seconds]
<karolherbst> yeah... wouldn't be surprised to be faster than intels stack actually, but it's crashing and I have no idea why...
jhli has quit [Remote host closed the connection]
saurabh_1 has joined #dri-devel
fab has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
saurabhg has quit [Ping timeout: 480 seconds]
<MrCooper> karolherbst: FYI, F37 has LLVM 15 but still spirv-llvm-translator 14, so rusticl (or opencl-spirv) doesn't build
<karolherbst> uhhh.. :(
<karolherbst> guess we'll have to make sure the toolchain people keep an eye on that and don't break it :/
<karolherbst> MrCooper: though there are llvm-14 packages around, no??
<karolherbst> well.. llvmpipe is busted with llvm-15 anyway
<karolherbst> maybe only for CL
<MrCooper> indeed, there are llvm14-* packages
ybogdano has joined #dri-devel
<kisak> Debian/Ubuntu also has strange spirv-llvm tooling
danvet has quit [Read error: No route to host]
heat_ has joined #dri-devel
<tjaalton> kisak: strange how?
heat has quit [Read error: No route to host]
devilhorns has quit []
danvet has joined #dri-devel
<alyssa> panfrost/rusticl needs llvm 15 for conformance
<karolherbst> yeah... though I think llvm-14 is enough
saurabh_1 has quit [Ping timeout: 480 seconds]
<karolherbst> actually...
<karolherbst> yeah.. should be
<alyssa> I think there was some bug fix in -15 we needed
<karolherbst> not directly
<karolherbst> the fix we needed was when dropping opencl-c.h
<karolherbst> but I gated that with llvm-15
<karolherbst> so older llvm should be fine, it just takes longer to compile kernels
<alyssa> oh right yes
<karolherbst> atm I am on llvm-14, because that's where llvmpipe isn't broken and it seems like none of the fails are llvm related
<kisak> tjaalton: llvm-toolchain-# needs to be built for spirv-llvm-translator-# to be built to *rebuild* llvm-toolchain-# with all the bits needed for libclc and it's not using the same release iteration between the two.
<karolherbst> kisak: well.. it hardly matters from which version libclc is from though
<turol> zmike: I don't see any way to make vk_dispatch_table const in vk_device
<turol> the compiler won't allow it even with egregious abuse of pointer casts
<DavidHeidelberg[m]> Should we start looking for LLVM 15 in CI?
<kisak> karolherbst: right, what matters is that there's *.spv bits non-deterministically missing from the llvm build, depending on whether there was a test rebuild of the distro release before reaching production. This is quite nasty in a PPA environment where I want to get rid of the retired llvm version, but it's technically needed for the older llvm-spirv-#
<zmike> turol: I wonder if we're going about that wrong and should instead collapse the table so it isn't a pointer
ngcortes has joined #dri-devel
<turol> i think it's not possible to do it this way because of c aliasing rules
<karolherbst> DavidHeidelberg[m]: after llvmpipe is fixed https://gitlab.freedesktop.org/mesa/mesa/-/issues/6735
<turol> since the calls to vulkan functions are external, it doesn't matter how much restrict or const we put there
<zmike> it shouldn't matter whether they're local or external, only whether the pointer gets reevaluated
<turol> the compiler is not allowed to treat the pointer as hoistable
<zmike> adding to just dev and the function pointers themselves wasn't enough, I assume?
iive has joined #dri-devel
<turol> i did not try changing the function pointers
<turol> adding restrict to the function pointer members is apparently not allowed
<zmike> oh huh
<turol> since vk_dispatch_table is directly part of vk_device it should have worked when changing the definition of dev if it was going to
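A small C sketch of the hoisting problem turol and zmike are discussing. It is not Mesa code: `struct device`, `struct dispatch_table`, and the `draw_*` functions are made up for illustration. It assumes the general C rule that a callee the compiler cannot see into may legally modify the table, so the function pointer has to be reloaded after every call unless the caller hoists the load by hand.

```c
#include <assert.h>

/* Hypothetical stand-ins for a device with an embedded dispatch table. */
struct dispatch_table { int (*draw)(void *dev); };
struct device { struct dispatch_table table; int calls; };

static int draw_b(void *dev)
{
    struct device *d = dev;
    d->calls++;
    return 2;
}

/* This entry point rewrites the table it was called through, which is
 * exactly what the compiler must assume an opaque callee might do. */
static int draw_a(void *dev)
{
    struct device *d = dev;
    d->calls++;
    d->table.draw = draw_b;
    return 1;
}

/* The function pointer is re-read from memory on every iteration. */
static int call_reloading(struct device *dev, int n)
{
    assert(dev != 0);
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += dev->table.draw(dev);
    return sum;
}

/* Manual hoist: the pointer is read once, so later table changes are ignored. */
static int call_hoisted(struct device *dev, int n)
{
    assert(dev != 0);
    int (*draw)(void *) = dev->table.draw;
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += draw(dev);
    return sum;
}
```

The two loops give different results when the callee swaps the entry, which is why the compiler cannot turn the first loop into the second on its own.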
anholt has quit [Ping timeout: 480 seconds]
anholt has joined #dri-devel
<swick> emersion: am I blind or is the DRA format layout not documented?
pallavim_ has quit [Ping timeout: 480 seconds]
<airlied> MrCooper: someone is fixing f37, just had a build dep to sort out
jkrzyszt has quit [Ping timeout: 480 seconds]
jhli has joined #dri-devel
<karolherbst> airlied: how does it look like with the coro stuff btw?
<karolherbst> Venemo: I think it's best to ignore the rusticl MR now.. at least I won't respond because it's really just a waste of time :(
<Venemo> karolherbst: yeah the dude seems to be just trolling now
<Venemo> at least it is difficult to believe that he truly doesn't understand what everyone else there is saying
<karolherbst> Venemo: the sad part I think is, that I don't think it's trolling... :(
<karolherbst> just a huge disconnect between "technical reasons" and social aspects + project governance/maintenance
<Venemo> yeah but his 'technical concerns' were already discussed to death
<eric_engestrom> might not be intended as trolling, but it's basically indistinguishable from it now
<karolherbst> if one thinks "technical" points stand above all you get such a discussion
<karolherbst> anyway.. it's pointless and I just hope that not similar damages are done in the intel compiler space caused by the same person :/
<alyssa> Venemo: FWIW I'm inclined to nak rust gfx frontends for the "bindings hell" reason
<alyssa> then again I'm also of the boring school of thought that gallium is for gl+cl and nothing else :-p
<karolherbst> I was so close 🤏 of just saying the "damage" which was done was that I wasn't willing to put up the emotional strength to convince him of any important clover change, but then I was thinking: why should I even bother now
<Venemo> yeah
<karolherbst> alyssa: let's.... talk about this once somebody suggests it :D
<alyssa> karolherbst: glide
<karolherbst> jo.. right
<karolherbst> though not sure if we really want to take that one :D
<airlied> karolherbst: I got distracted building llvm yesterday, but it looks like something I fixed previously which makes me wonder if some version confusion is happening
<alyssa> Venemo: I'm also very reluctant around binding NIR to Rust, which is a shame because Rust has a lot of nice features for backend compilers
<ajax> i just had the thought "do i need to become the glide maintainer so we can stop arguing about this"
<alyssa> (ADTs/match alone is a Big thing. I guess C++ can do that these days.)
<ajax> and had horrible vertigo from being so glad to finally get to _stop_ shipping glide
<alyssa> ajax: wait which side are you on? :-p
<Venemo> alyssa: I see
<karolherbst> well.. nir bindings is a problem for future us and I'd keep it like this until we cross that bridge :)
<alyssa> Yeah
<karolherbst> I do use some stuff of nir and I suspect it might become more and more over time
<karolherbst> and we'll get a better feeling around stuff over time
<alyssa> Practically I want to leave rusticl in tree but not merge new Rust components in the medium term, so we can figure out as a community what the actual pain points are instead of the theoretical ones
<karolherbst> anyway.. until bindgen supports static inlines it's way too painful anyway
<karolherbst> correct
<alyssa> and after rusticl has survived a few "rip up all of mesa and rewrite it" MRs from zmike or mareko, we'll have a lot more data to work with for the future :P
<karolherbst> also I am sure it will make more sense in the future if more people are used to seeing, reading and changing rust code
<zmike> got one of those coming in hot
<karolherbst> joooo
<zmike> bout to end this frontend's whole career
<karolherbst> though rusticls surface area is really not that huge
<karolherbst> :D
<karolherbst> that reminds me.. I still wanted to wire up spirv and the more I think about it the less I am convinced it's not a huge pita
<alyssa> zmike: gogogogo
<alyssa> karolherbst: good luck lol
<alyssa> for zink+rusticl to replace clvk, or?
<karolherbst> so one of the biggest advantages of doing all that funny stuff in nir is, that I can do proper DCE of kernel params
<anholt> novice question: I've got a dlopen("libEGL.so"), and LD_LIBRARY_PATH at the point of the call is pointing to my mesa build dir that does have a libEGL.so, and yet /home/anholt/src/angle/out/arm64-Release/libEGL.so gets loaded. what could get in the way of my LD_LIBRARY_PATH?
<karolherbst> but with spir-v.... that might become more of an issue
<karolherbst> though I _could_ do some optimizations on a spir-v level
<anholt> (this is all in service of doing angle vs zink shootout on real workloads)
<zmike> anholt: not seeing icd params?
<zmike> setting*
<anholt> what do you mean by icd params?
<zmike> like __EGL_VENDOR_LIBRARY_FILENAMES
<karolherbst> alyssa: yes... and the idea was that we just pass in the CL SPIRV into the vulkan runtime
<anholt> zmike: dlopen() doesn't look at that
<alyssa> anholt: strace?
<zmike> ohh I see
<jenatali> karolherbst: I really don't think that's a good idea, personally
<karolherbst> I'd be inclined to ignore that DCE issue, but I know kernels where like 80% of the params are actually dead :(
<zmike> if it's directly linked?
<karolherbst> yes
<zmike> otherwise I'd check LD_DEBUG=all
<zmike> which I'd guess you've done
<alyssa> or that. usually I just use strace because I am a horse who only knows 1 trick :-p
<zmike> so...I'm out of ideas if it's not any of those
<karolherbst> ahh.. different problem
<anholt> zmike: info sharedlibrary shows it not loaded before the call. it shows the wrong egl loaded after the call.
<jenatali> karolherbst: You've seen how massive/complex some of those kernels can be, and you'd just be pushing the burden down to the Vk driver to deal with it instead of handling it in zink
<alyssa> (and strace shows exactly what paths get tried and the errnos)
<karolherbst> anyway... being able to cut the input buffers size by quite a lot is a huge advantage
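A sketch of the dead-parameter compaction karolherbst is describing, under several assumptions: the `kernel_param` struct and `layout_live_params` are hypothetical, liveness is assumed to be known already (for example from a NIR pass), and alignment requirements are ignored. Live params are packed back to back and dead ones get no slot in the input buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one kernel parameter. */
struct kernel_param {
    uint32_t size;  /* bytes the param occupies in the input buffer */
    bool used;      /* false if every use was eliminated as dead code */
};

#define PARAM_DEAD UINT32_MAX

/* Assign input-buffer offsets to live params only.
 * Dead params get PARAM_DEAD; the return value is the compacted size. */
static uint32_t layout_live_params(const struct kernel_param *params,
                                   uint32_t *offsets, size_t count)
{
    assert(params != NULL && offsets != NULL);
    uint32_t cursor = 0;
    for (size_t i = 0; i < count; i++) {
        if (params[i].used) {
            offsets[i] = cursor;
            cursor += params[i].size;
        } else {
            offsets[i] = PARAM_DEAD;
        }
    }
    return cursor;
}
```

With params of 8, 64, and 4 bytes where only the first and last are live, the buffer shrinks from 76 bytes to 12.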
<zmike> the exe might just be directly linked to angle
<zmike> I've had issues with that in the past
<karolherbst> jenatali: yeah....
<anholt> alyssa: yeah, strace shows it looking in exactly that directory.
<zmike> not sure I ever resolved it
<karolherbst> but zink has to convert that nir to spir-v which might be fine actually
<anholt> zmike: again, ldd and gdb's info sharedlibrary show it not linked to it at the point of dlopen() being called.
Haaninjo has joined #dri-devel
<alyssa> zmike: anholt is talking about .so's, I don't think you can directly link dynamic libraries (not .a's)
<jenatali> Yeah you get a round trip through nir, but as a result you get one place to deal with all of CL's craziness
<karolherbst> now that I replaced load_kernel_input by load_ubo it's not even a huge deal anymore, except load_global lowering
<alyssa> anholt: How did you teach your system about that ANGLE path in the first place?
<karolherbst> _but_
<alyssa> (I've never built ANGLE)
fahien has quit [Ping timeout: 480 seconds]
<karolherbst> maybe we can just use a few CL features in vk spv and write a vk_spv_cl_instructions extension or something...
<anholt> alyssa: the angle path is getting injected somehow by the angle build system.
<karolherbst> and that's limited to deal with global loads and other trivial things
<karolherbst> so we don't need ssbo lowering
<jenatali> That sounds like a better idea
<alyssa> anholt: dealing with bazel is above my pay grade
* alyssa taps out
<zmike> yeah now the ptsd is coming back to me
<karolherbst> also.. now that I create the CSO when the kernel is created, all that conversion overhead doesn't even matter :)
<zmike> I don't think I ever solved this issue when I ran into it previously
<alyssa> karolherbst: you lower load_kernel_input to load_ubo and then agx backend will lower load_ubo to load_global_constant and then agx backend pass 2 will lower load_global_constant to load_global_constant_agx ... layers, lol
<karolherbst> :P
Haaninjo has quit [Read error: Connection reset by peer]
<alyssa> (I guess the latter lowerings could be combined, meh)
<karolherbst> I want to get rid of load_global_constant anyway I think
Haaninjo has joined #dri-devel
<alyssa> why?
<alyssa> replace it with access flags on load_global?
camus1 has quit [Read error: Connection reset by peer]
<karolherbst> because it's literally load_global, just promising the data won't change
camus has joined #dri-devel
<alyssa> the load_global -> load_global_agx lowering will be needed to handle 8-bit at any rate
<anholt> ah. rpath is the answer. and today I learned that rpath beats ld_library_path.
<alyssa> AGX doesn't do 8-bit at all
rasterman has quit [Quit: Gettin' stinky!]
<jenatali> alyssa: DXIL doesn't either :(
<karolherbst> do we have an access flag for constant data?
<alyssa> nir's alu bitsize lowering can get rid of all the 8-bit ALU except for u2u8/u2u16
<karolherbst> guess we could add one...
<alyssa> but you're still expected to handle the conversions and be able to do 8-bit loads/stores
<karolherbst> anyway
<karolherbst> load_global_constant should really become load_ubo as well
<karolherbst> rusticl could even keep track of what buffers are accessed the most and lower some to load_global if we get above limits
<alyssa> AGX doesn't do 8-bit loads/stores either in terms of registers -- but it has an i8 memory format!
<turol> question for radv developers
<karolherbst> or it's all indirect ubo
<alyssa> OCOC
<turol> checks wd_switch_on_eop
<turol> but it's set after that on line 914
<karolherbst> I don't have a good idea on how to deal with constants, because a driver can literally bind the same constant buffer unlimited times
<turol> i would expect first all things which set it, then things which check it
<turol> bug or intentional?
<alyssa> so can lower `8 ssa_1 = load_global` to `16 ssa_0 = load_global format i8; 8 ssa_1 = i2i8 ssa_0`, and then the optimizer can clean up the conversions
<alyssa> I think.
<jenatali> alyssa: Stores are harder, you can't lower to 16bit stores
<alyssa> jenatali: Yes, I can. Because it's a formatted store.
<karolherbst> heh wait...
<jenatali> Because you can't modify the neighboring bytes
<karolherbst> actually... the limit is args.. not buffers
<karolherbst> and it's 8 or more
<karolherbst> well.. some hardware doesn't support 8 :(
<alyssa> jenatali: We can lower to a formatted store. The memory format is i8, but the register format is i16.
<jenatali> Oh I see, that makes sense
<alyssa> Yep
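The difference between jenatali's concern and alyssa's formatted store can be modelled in C. This is a sketch with made-up function names, and memory is modelled as a little-endian byte array. A plain 16-bit store overwrites the neighbouring byte, while a store with an i8 memory format and a 16-bit register format writes only the one target byte.

```c
#include <assert.h>
#include <stdint.h>

/* Formatted store: the register is 16 bits wide, but the memory format
 * is i8, so exactly one byte is written. */
static void store_i8_from_r16(uint8_t *base, uint32_t index, uint16_t reg)
{
    assert(base != 0);
    base[index] = (uint8_t)(reg & 0xff);
}

/* Naive widening to a plain 16-bit store at the same byte address:
 * the byte after the target is clobbered with the register's high half. */
static void store_16_plain(uint8_t *base, uint32_t index, uint16_t reg)
{
    assert(base != 0);
    base[index] = (uint8_t)(reg & 0xff);
    base[index + 1] = (uint8_t)(reg >> 8);
}
```

Storing `0x00AB` at index 1 of `{0x11, 0x22, 0x33, 0x44}` leaves `0x33` intact with the formatted store and zeroes it with the plain one.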
<jenatali> As long as you can still specify a byte-aligned address even with an i16 register?
<karolherbst> jenatali: did you play around with using actual ubos for constant buffers?
<jenatali> karolherbst: Yeah, we didn't use it though because D3D drivers (notably our software driver (WARP)) had bugs when dynamically selecting a ubo
<karolherbst> heh...
<jenatali> So we just lower it to ssbo the same as global
<karolherbst> why selecting one dynamically though?
<karolherbst> so I see that CL_DEVICE_MAX_CONSTANT_ARGS has to be at least 8, which is quite low
<karolherbst> so you can make it all static
<karolherbst> and if you need a pointer to inside a constant buffer you get an idx/offset vec2, which should be fine.. unless that's what's also causing you issues
<karolherbst> I think I'll play around with this and see how it goes
<jenatali> karolherbst: I mean in a shader, an app doing `foo ? constant1[x] : constant2[x]`
<karolherbst> okay.. mhhh
<karolherbst> annoying
<jenatali> Or other types of insidious things like dynamically computing a constant pointer that could point to one buffer or another
<karolherbst> don't have indirect ubos in d3d?
<jenatali> Smuggling it through shared or global memory to break pointer tracking, etc
<jenatali> We do, but they're apparently so rarely used that there's driver bugs
<karolherbst> uhhh... well.. it still fits in 64 bit unless you have 32 bit pointers :(
<karolherbst> ahh
<karolherbst> makes sense ...
<karolherbst> I suspect though it matters for performance :(
<airlied> vsyrjala: should we be requesting he cc intel-gfx for CI?
<Venemo> turol: as far as I see it is set a few lines above the highlighted line
<airlied> or did he do that and ignore the results?
<jenatali> Yeah we use 32bit buffer index, 32bit buffer offset, but if you can't track down a literal constant for that 32bit index due to the pointer smuggling, then you hit issues
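The index/offset encoding jenatali and karolherbst are talking about can be sketched as a 64-bit value split into two 32-bit halves. The names below and the choice of which half holds the buffer index are assumptions for illustration, not the actual rusticl or D3D12 layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "fat" constant pointer: which buffer, and where inside it. */
struct const_ptr {
    uint32_t buffer_index;
    uint32_t offset;
};

/* Pack into the 64 bits a CL pointer occupies (index in the high half). */
static uint64_t const_ptr_pack(struct const_ptr p)
{
    return ((uint64_t)p.buffer_index << 32) | p.offset;
}

static struct const_ptr const_ptr_unpack(uint64_t v)
{
    struct const_ptr p = { (uint32_t)(v >> 32), (uint32_t)v };
    return p;
}

/* Pointer arithmetic only ever changes the offset half. */
static uint64_t const_ptr_add(uint64_t v, uint32_t bytes)
{
    struct const_ptr p = const_ptr_unpack(v);
    assert(p.offset <= UINT32_MAX - bytes); /* must not carry into the index */
    p.offset += bytes;
    return const_ptr_pack(p);
}
```

Once such a value has been stored to shared or global memory and read back, the compiler sees only an opaque 64-bit integer, so the buffer index is no longer a compile-time constant. That is the pointer-smuggling case described above.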
<turol> Venemo: yes but also below
<karolherbst> yeah...
<turol> that looks suspicious to me
<karolherbst> iuf you can't use indirect ubos, because of bugs you really are in a world of pain there :(
<karolherbst> s/iuf/if/
<Venemo> turol: so what is your question?
<turol> the question is: is this intentional or a bug
<Venemo> I don't know
<karolherbst> jenatali: the other annoying part is... you can set the constant* arg to NULL and one also needs to encode that
<turol> neither do I, hence asking for radv developers
<Venemo> I'm a radv developer, hence tried to answer
<karolherbst> so I suspect we always want to encode the idx/offset in the input buffer, and just have a static access to be an optimization or something.. how annoying
<alyssa> jenatali: Yes. AGX requires addresses (and offsets) to be aligned to the alignment of the memory format, not the register format.
<karolherbst> which insane hw doesn't :P
<Venemo> turol: that code looks like it was copied from radeonsi a long time ago, best would be to check how radeonsi handles that now and see if it matches. but i wouldn't worry about it unless you suspect that this causes issues on affected hw
<alyssa> Venemo: Speaking of, any interest in standardizing on formatted loads/stores in NIR?
<alyssa> and on VS input lowering?
<alyssa> not 100% sure what that would look like yet
<turol> there appears to be a mismatch so this is probably a bug
<Venemo> alyssa: same answer as last time, I'm open to suggestions :)
<Venemo> turol: feel free to open a bug report against radv on the mesa gitlab then
<alyssa> Venemo: fair enough
<alyssa> okay, how about this: I'll do something that makes sense for AGX and vendor it, and if you think you can use it in radv, you'll extend it and move it to common and drop the aco lowering? :)
<Venemo> alyssa: my plan was to add something like load/store_buffer_amd and add a format field to it
<alyssa> yeah, same here
<alyssa> load_global_agx
<Venemo> this would have a constant offset, scalar offset, vector offset, and a vector index
<alyssa> load_global_agx taking two parameters "base address" and "offset" with FORMAT, SHIFT, MASK immediates I guess
<Venemo> I think the mismatching requirements for the sources may be an issue here
<Venemo> alyssa: we would most likely want to use the vector index for these (we would pass the vertex id or instance id)
<alyssa> yeah, same here
<Venemo> does agx also have an index src?
<Venemo> you didn't say so
<alyssa> that's my offset source
<alyssa> offset in units of alignment(FORMAT)
heat has joined #dri-devel
heat_ has quit [Read error: No route to host]
<Venemo> okay, so your offset is not the same as our offset?
<alyssa> probably not
<alyssa> in practice it probably is? or maybe my offset is your vector index?
<Venemo> I think it is
<Venemo> do you not have a normal byte offset?
<alyssa> nope
<Venemo> or that's the base address?
<alyssa> well, I guess the base address
<alyssa> as I wrote earlier today, it's literally:
<alyssa> format_t *array = (format_t *) base_address;
<alyssa> return array[index << extra_shift];
<Venemo> well, one thing we could do is have all srcs and you would emit an extra add in your backends
<alyssa> there's no add needed in the usual case
<alyssa> E.g. for an rgba8 vertex buffer at base address B with stride #S and no instancing, the load is a single instruction:
<Venemo> I mean when the intrin has both the scalar and vector offset, you'd add those in your backend and we'd emit those as part of the instr
gouchi has joined #dri-devel
<alyssa> load B, vertex ID << (log2(S / 4)), rgba8
<alyssa> Oh, er, right ok
<Venemo> sorry I can't type on my phone...
<DavidHeidelberg[m]> I was thinking about how to avoid restricted traces. One of the ideas is that instead of just asking developers of games/benchmarks/apps permission to use trace, we could offer them something like certification that Mesa3D supports (is tested with) their product. How does that sound?
<alyssa> right, if the stride is not power of two or there's an extra constant offset, we end up emitting an extra imad instruction, sure
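The single-instruction vertex fetch alyssa describes can be checked with a little address arithmetic in C. This sketch assumes a 4-byte rgba8 memory format, a power-of-two stride of at least 4 bytes, and no instancing; `vertex_fetch_address` and `log2_u32` are hypothetical helpers, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(x)) for x >= 1. */
static unsigned log2_u32(uint32_t x)
{
    unsigned n = 0;
    while (x >>= 1)
        n++;
    return n;
}

/* Address read by "load B, vertex ID << log2(S / 4), rgba8":
 * the index is in units of the 4-byte format, so shifting the vertex ID
 * by log2(S / 4) steps through the buffer one stride at a time. */
static uint64_t vertex_fetch_address(uint64_t base, uint32_t vertex_id,
                                     uint32_t stride)
{
    const uint32_t format_size = 4; /* rgba8 */
    assert(stride >= format_size && (stride & (stride - 1)) == 0);

    unsigned shift = log2_u32(stride / format_size);
    uint64_t index = (uint64_t)vertex_id << shift;
    return base + index * format_size;
}
```

For these strides the result equals `base + vertex_id * stride`. A stride that is not a power of two cannot be expressed as a shift, which is where the extra multiply-add mentioned above comes in.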
Dr_Who has quit [Ping timeout: 480 seconds]
<Venemo> alyssa: maybe it's also OK to keep vendored intrinsics and once we see how the backends look we can decide if we can make them common
<alyssa> Nod
<alyssa> I want to say "sticking in an extra constant offset is just one ssa_scalar_chase away" but I guess you want to keep down compile times as always
<Venemo> well, the intrinsic would have a base, which would be the constant offset
<Venemo> similar to load_buffer_amd
<alyssa> nod
<Venemo> or do you mean you need an extra one besides that?
<alyssa> maybe this won't work out nicely then
<alyssa> could you write out pseudo-C code for what the AMD instruction does with all the sources? like I wrote for AGX? thanks
<Venemo> uhhh
<Venemo> alyssa: it's complicated, can you look at the RDNA2 shader isa chapter 8.1?
<Venemo> 8.1.5 describes the addressing, and 8.1.1 shows a simplified version of the formula
Haaninjo has quit [Quit: Ex-Chat]
Haaninjo has joined #dri-devel
<Venemo> alyssa: I would rather not try to type it on my phone sorry
<alyssa> fair enough, will try to remember to look when I get a chance
heat has quit [Read error: No route to host]
heat has joined #dri-devel
oneforall2 has quit [Remote host closed the connection]
oneforall2 has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
Nimr-alIslam has joined #dri-devel
fab has quit [Quit: fab]
Nimr-alIslam has quit [autokilled: This host violated network policy. Contact support@oftc.net for further information and assistance. (2022-09-19 20:47:12)]
mvlad has quit [Remote host closed the connection]
ngcortes has quit [Read error: Connection reset by peer]
lemonzest has quit [Quit: WeeChat 3.5]
ngcortes has joined #dri-devel
danvet has quit [Ping timeout: 480 seconds]
gouchi has quit [Remote host closed the connection]
mbrost has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
ahajda has quit [Ping timeout: 480 seconds]
ybogdano has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
ybogdano has joined #dri-devel
ybogdano is now known as Guest966
Guest966 has quit [Read error: Connection reset by peer]
ybogdano has joined #dri-devel
vliaskov has quit [Remote host closed the connection]
ybogdano has quit [Ping timeout: 480 seconds]
paulk-bis has joined #dri-devel
paulk has quit [Ping timeout: 480 seconds]
Haaninjo has quit [Quit: Ex-Chat]
pcercuei has quit [Quit: dodo]
mbrost has quit [Ping timeout: 480 seconds]
Weiss-Fder[m] has joined #dri-devel
iive has quit [Quit: They came for me...]