ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<karolherbst> nir
<karolherbst> and llvm
<karolherbst> bnieuwenhuizen: we always inline, so we end up with nir shaders with 100k values
ybogdano has quit [Ping timeout: 480 seconds]
lygstate has quit [Remote host closed the connection]
<Plagman> would glXMakeCurrent hang forever if passed a drawable that's now-unmapped/destroyed?
<Plagman> the code just calls glXGetCurrentDisplay, glXGetCurrentDrawable, glXGetCurrentContext, does some stuff, then calls MakeCurrent with the return values of the calls above
<Plagman> and that hangs
<zmike> ajax: look, it's our favorite deadlock^
<Plagman> uh oh
<HdkR> Charge up the pain train
<Plagman> is this the thing that got fixed upstream recently?
<zmike> I don't think so
<Plagman> any workarounds?
<zmike> I'm not sure we're tracking this exact variant of it
<zmike> do you have a reliable way to trigger it?
<Plagman> yeah
<Plagman> especially if you have a steam deck, but with a bit of setup it can happen on desktop
<Plagman> needs prerelease steam bits though, i can send you the package
<zmike> yeah, hook me up and I'll look tomorrowish
<Plagman> we're probably going to work around it today by not calling the offending code at all, so i'll have to put together a repro for you later
<zmike> alrighty
<karolherbst> airlied: unorm CL_FILTER_LINEAR CL_ADDRESS_CLAMP_TO_EDGE fails for 2D :(
<karolherbst> I am sure it's something similar, but sounds more like something luxmark would hit
<Plagman> zmike: my overall read on it is that we had a window with gl rendering, we unmapped/destroyed it, then some code did the sequence above, got the old window ID
<Plagman> and it hung on makecurrent
<zmike> hm
<zmike> can try hacking it into a piglit pretty easily to test I think
<zmike> really feeling like xcb_wait_for_special_event is going to be the cause of a new xcb release at this rate
columbarius has joined #dri-devel
<HdkR> Let me know if that happens, I'll need to update some things when a new xcb release comes up :)
<zmike> still trying to avoid it
<zmike> but the last fix that seemed like it was working for a related issue broke perf and had to be reverted
<zmike> and xcb_wait_for_special_event_with_timeout is much easier than rewriting the whole stack
co1umbarius has quit [Ping timeout: 480 seconds]
anholt has quit [Ping timeout: 480 seconds]
<karolherbst> :O
<karolherbst> jekstrand: that shit even works on iris :O
<karolherbst> rendering is even less broken
<karolherbst> 58% vs 63% :D
<karolherbst> but yeah.. something is very wrong with the rendering
<karolherbst> using both works as well :)
nchery is now known as Guest802
nchery has joined #dri-devel
HankB_ has quit []
Guest802 has quit [Ping timeout: 480 seconds]
HankB_ has joined #dri-devel
<HankB_> mripard: I submitted the bug report #1008692 for refcount_t: underflow; use-after-free. in dmesg output
<HankB_> If there is something I can do to help move that along, let me know. Thanks!
<karolherbst> mhh seems like that stuff just renders 2 minutes and verifies the result, and if it's crappy it's crappy
<karolherbst> _wow_
lygstate has joined #dri-devel
<karolherbst> enabling shader caching speeds up luxmark v3.1 by a factor of like 50
<karolherbst> no wonder the result was so broken
<karolherbst> the output is perfect with iris :)
<karolherbst> next I wire up nouveau and then I can run this with three devices
<karolherbst> jenatali: soo.. driver compiling shit starved the pipeline, so everything was super slow and the result broken. now it's perfect :)
<karolherbst> I just forgot that luxmark had this finite runtime
<jenatali> Oh that makes sense
<jenatali> I still need to add a source -> spirv cache at some point
<karolherbst> it actually works, nice
<karolherbst> jenatali: ahh I already have that
<karolherbst> I just use the drivers cache for that
<karolherbst> I think...
<jenatali> If I had unlimited time, I'd get to it eventually
<karolherbst> ohh no, I only cache the libclc
<jenatali> But I've got even less time than I had before now :)
<karolherbst> I thought I cached everything
<karolherbst> oh well
<karolherbst> yeah.. that's fair I guess
<karolherbst> I have to rework compiling stuff anyway
<karolherbst> I really need to compile down to the driver way earlier
<karolherbst> but at least it's much simpler to do that than it was in clover
anholt has joined #dri-devel
<karolherbst> jenatali: perfecting that local group size situation is kind of the next target
Lucretia-backup has quit [Ping timeout: 480 seconds]
fxkamd has quit []
camus has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
nchery has joined #dri-devel
jewins has quit [Ping timeout: 480 seconds]
<karolherbst> I think we have to perf optimize nir :D
<HdkR> Everyone loves more NIR performance optimizations
<karolherbst> ehh I meant the CPU side of things
<HdkR> Me too
<airlied> karolherbst: not inlining everything is also valid :-P
<karolherbst> :D
<karolherbst> noooo
<karolherbst> although I am not sure if the CPU impl or rusticl is slower here...
<karolherbst> but yeah.. I think we have to stop inlining everything
<karolherbst> how can I make iris report shader stats?
<karolherbst> it's probably all regs and 10 times as much spilled
<karolherbst> yeah so their c++ code is a bit faster than iris on rusticl, mhhh
<karolherbst> rusticl sys reqs: 64GB of RAM
<zmike> so you're targeting zink is what you're saying
<karolherbst> wait...
<karolherbst> zmike: does clover run on top of zink? although I guess not
<zmike> it does not currently, no
<karolherbst> mind implementing set_global_binding real quick?
<zmike> I think you'd have much bigger problems than whatever that is :D
<karolherbst> I am sure I won't
<karolherbst> but honestly, why should I?
<karolherbst> I mean.. compiling from nir to spirv and uhhmm.. ufff
<karolherbst> did I tell you that those kernels generate nirs with like 200k values?
<zmike> zink doesn't do >32 sampled images, it doesn't handle indirect buffer access, and it can't translate any of the advanced cl spirv
<karolherbst> do you think I currently support any of that?
<karolherbst> and what do you mean by indirect buffer access?
<zmike> for (i) var = ubo[i]
<karolherbst> I don't do ubos
<karolherbst> problem solved
<zmike> 🤔
<karolherbst> and those 4 new nir intrinsics you can probably implement easily
<zmike> was jekstrand wrong?
<karolherbst> with what?
<zmike> maybe this could work after all
<karolherbst> well
<karolherbst> global memory is painful
<zmike> well I was told that there's no possible way that cl could ever work on zink and don't even bother trying it's not going to happen
<karolherbst> so you kind of have to support 64 bit pointers
<karolherbst> but microsoft implemented that on top of ssbos
<karolherbst> soo....
<karolherbst> zmike: yeah.. I mean.. you'll some limitations at some point
<karolherbst> *hit
<airlied> jekstrand: what no cl on zink on vk?
* airlied forgets the reasoning
<karolherbst> but if your pointers don't exceed 4GB I think you're fine :p
<airlied> I also started writing it once
<zmike> airlied: he said it was impossible
<zmike> that's that
<karolherbst> zmike: MS implemented CL on top of d3d12
<karolherbst> mhhh
lemonzest has joined #dri-devel
<airlied> zmike: so I just have to forget he said it, and try again :-P
<airlied> "This is mostly garbage." I write the best commit msgs
<karolherbst> airlied: I only need the top commit, no?
<karolherbst> ehhh
<karolherbst> wait a second
<karolherbst> zink_set_global_binding doesn't do anything!
<airlied> yeah I stopped once I fell over zmikes spirv code :-P
<karolherbst> :D
<zmike> see?
<airlied> like clearly the existence of clvk means there is some path forward :-P
<zmike> I work on a demand-based system
<karolherbst> a sad one
<zmike> the demand is currently that I want more cts tests passing
<airlied> karolherbst: should reboot the amd nir paths, though never got images to work
<zmike> nobody else is telling me they need cl so I'm not doing it
<karolherbst> airlied: well.. now you don't need to blame the runtime as it's proven to work
<airlied> zmike: dschuermann would like it
<airlied> zmike: so he can test compute kernels on aco
<zmike> seems unlikely
<Sachiel> zmike: cl has cts tests that could be passing
<zmike> he's got so many zink bugs to fix
<airlied> zmike: barter system
<zmike> Sachiel: that's the wrong kind of cts
<karolherbst> Sachiel: I always mention the cl CTS ironically honestly
<zmike> airlied: that just sounds like making more work for him
<karolherbst> that doesn't deserve the name CTS, but here we are
<airlied> the main reason I wanted cl on vk is to see how aco would do, maybe aco/radeonsi will beat me
<zmike> wouldn't want to overload anyone
<zmike> 🤔
<karolherbst> the CTS is good enough to get things rolling
<karolherbst> although
<karolherbst> it made me run luxmark without having to debug it more than once
<karolherbst> so maybe not that bad
<zmike> maybe once I land all the stuff I have pending in the 2 remaining weeks before 22.1
<airlied> karolherbst: now for darktable?
<karolherbst> uhm...
<karolherbst> do they have benchmarks?
<airlied> nope, but it does load images
<karolherbst> let's see
<karolherbst> I bet I have to enable CL first
<karolherbst> hey
<karolherbst> it doesn't allow me to enable CL
<karolherbst> probably because iris is not advertised as GPU
<karolherbst> here we go
<karolherbst> airlied: ehh.. it doesn't work on fedora as it seems
<karolherbst> probably disabled
<airlied> pretty sure I got it working once from the packaged version, but it was a long time ago
<airlied> not sure if it logged some info on startup
<karolherbst> it just tells me it's unavailable
<karolherbst> ahh there is -d opencl
<karolherbst> "fatal error: 'common.h' file not found" duh...
<karolherbst> airlied: uhhhh... I bet we have to do magic things for include paths?
<karolherbst> ahh no
<karolherbst> that dir just doesn't exist
<karolherbst> ehh wrong machine
sagar__ has quit [Remote host closed the connection]
sagar__ has joined #dri-devel
nchery has quit [Ping timeout: 480 seconds]
<karolherbst> airlied: seems like the " in the args confused our stack
<karolherbst> now it enables CL :)
<karolherbst> and crashes
<karolherbst> yay
<karolherbst> it failed to compile a nir
<karolherbst> fun
<airlied> Lynne: pushed updated radv video branch
<karolherbst> "due to a slow GPU the opencl flag has been set to OFF." hey
<airlied> hehe lols
<karolherbst> btw I found a nice solution for USE_HOST_PTR when ptrs are not aligned: just don't create a mem object and let clients deal with that
<karolherbst> shadow buffering in clover was a mistake
lygstate has quit [Read error: Connection reset by peer]
slattann has joined #dri-devel
<slattann> Test Msg :)
shankaru has joined #dri-devel
ppascher has joined #dri-devel
anholt has quit [Quit: Leaving]
aravind has joined #dri-devel
sagar__ has quit [Remote host closed the connection]
sagar__ has joined #dri-devel
elongbug has quit [Read error: Connection reset by peer]
heat has quit [Ping timeout: 480 seconds]
naveenk2 has joined #dri-devel
ngcortes has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
slattann has quit [Ping timeout: 480 seconds]
slattann has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
kts has joined #dri-devel
sdutt has quit [Read error: Connection reset by peer]
nchery has joined #dri-devel
tzimmermann has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
tales_ has quit [Remote host closed the connection]
mszyprow has joined #dri-devel
Guest742 is now known as DrNick
itoral has joined #dri-devel
mbrost has quit []
paulk1 has quit [Ping timeout: 480 seconds]
i-garrison has quit []
i-garrison has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
paulk1 has joined #dri-devel
mvlad has joined #dri-devel
jkrzyszt_ has joined #dri-devel
sigmaris has quit [Server closed connection]
sigmaris has joined #dri-devel
jfalempe has quit [Quit: Leaving]
zrusin has joined #dri-devel
zackr has quit [Read error: Connection reset by peer]
ahajda has joined #dri-devel
danvet has joined #dri-devel
moony has quit [Server closed connection]
moony has joined #dri-devel
jfalempe has joined #dri-devel
alanc has quit [Remote host closed the connection]
maxzor has joined #dri-devel
alanc has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.4]
<mripard> ukleinek: yeah, I think I see what the issue is. I've been trying to address it for the last couple of weeks, but it's taking more time than I expected
tjaalton has quit [Server closed connection]
tjaalton has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
<mripard> ukleinek: the basic idea is that prior to the RPi4, we had the GPU integrated in the same device and thus had support for it in vc4
<mripard> in the RPi4, the GPU is a separate device, with a separate driver
<mripard> but we screwed up a bit and are still using and exposing some parts of the GPU code on the RPi4
<mripard> I believe it's what you're seeing
<ukleinek> HankB_: ^
Arsen has quit [Server closed connection]
Arsen has joined #dri-devel
MajorBiscuit has joined #dri-devel
<ukleinek> mripard: HankB_ is the reporter, and I think in the meantime he managed to register to Nickserv such that he can talk here, too.
camus has quit [Read error: Connection reset by peer]
<mripard> like I said, I've been working on it recently but splitting it out leads to another corruption / inconsistency I haven't been able to figure out yet
camus has joined #dri-devel
<ukleinek> mripard: I didn't say that to urge you, just FYI that you have a contact if there are questions where he might help.
<ukleinek> so it's just to take me out as a proxy
lynxeye has joined #dri-devel
tursulin has joined #dri-devel
<mripard> ukleinek: ok, I'll post on that bug report when it's ready
<mripard> thanks
rbrune has joined #dri-devel
rkanwal has joined #dri-devel
<emersion> ah, for hotplug handling not sure i have something
<emersion> ah, my talk has one slide about it
<emersion> with minimal example code
robertfoss has quit [Server closed connection]
robertfoss has joined #dri-devel
pendingchaos has quit [Server closed connection]
pendingchaos has joined #dri-devel
Terman has joined #dri-devel
shashanks has quit [Read error: Connection reset by peer]
shashanks has joined #dri-devel
Lucretia has joined #dri-devel
hikiko has quit [Server closed connection]
jagan_ has joined #dri-devel
hikiko has joined #dri-devel
slattann has quit [Ping timeout: 480 seconds]
pcercuei has joined #dri-devel
RSpliet has quit [Quit: Bye bye man, bye bye]
RSpliet has joined #dri-devel
tzimmermann has joined #dri-devel
LexSfX has quit [Remote host closed the connection]
<tzimmermann> danvet, airlied, last week's PR for drm-misc-fixes got lost? https://lore.kernel.org/dri-devel/YjwkvPp6UnePy4Q8@linux-uq9g.fritz.box/
apinheiro has joined #dri-devel
LexSfX has joined #dri-devel
crabbedhaloablut has quit [Server closed connection]
crabbedhaloablut has joined #dri-devel
icecream95 has quit [Ping timeout: 480 seconds]
LexSfX has quit [Remote host closed the connection]
mceier has quit [Ping timeout: 480 seconds]
LexSfX has joined #dri-devel
slattann has joined #dri-devel
<slattann> Can someone help me understand how a user space app/compositor can register for events sent via "drm_send_event" from the kernel?
nchery has quit [Ping timeout: 480 seconds]
<emersion> eh
<emersion> what kind of event is that?
<emersion> pageflip?
DPA has quit [Server closed connection]
lemonzest has joined #dri-devel
DPA has joined #dri-devel
dliviu has quit [Server closed connection]
dliviu has joined #dri-devel
itoral has quit [Remote host closed the connection]
ukleinek has quit [Server closed connection]
ukleinek has joined #dri-devel
rasterman has joined #dri-devel
idr has quit [Ping timeout: 480 seconds]
guru_ has joined #dri-devel
maxzor has quit [Ping timeout: 480 seconds]
oneforall2 has quit [Ping timeout: 480 seconds]
MajorBiscuit has quit [Quit: WeeChat 3.4]
MajorBiscuit has joined #dri-devel
ppascher has quit [Ping timeout: 480 seconds]
tobiasjakobi has joined #dri-devel
tobiasjakobi has quit []
<Lynne> airlied: thanks, I'll get on it right away
devilhorns has joined #dri-devel
ccr has quit [Server closed connection]
ccr has joined #dri-devel
chema has quit [Quit: Reconnecting]
chema has joined #dri-devel
jasuarez has quit [Quit: Reconnecting]
jasuarez has joined #dri-devel
<dj-death> is there a NIR pass that puts barriers outside of loop/if/else blocks by splitting them?
jasuarez has quit []
jasuarez has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.4]
<dj-death> sounds a lot like the nir_lower_shader_calls actually
flacks has quit [Quit: Quitter]
slattann has quit [Ping timeout: 480 seconds]
flacks has joined #dri-devel
<danvet> mripard, if you nuke legacy_cursor_path yourself instead of the helper changes
<danvet> then you also nuke the async plane update :-)
<danvet> imo we need my patch
<danvet> or you need to wire through a new flag
<danvet> ofc if things still don't work with my patch then something is iffy
slattann has joined #dri-devel
<danvet> hm and I think msm should/could probably drop its hacks too
bl4ckb0ne has quit [Server closed connection]
bl4ckb0ne has joined #dri-devel
maxzor has joined #dri-devel
shankaru has quit [Quit: Leaving.]
jani has quit [Server closed connection]
jani has joined #dri-devel
<danvet> mripard, if you can supply a t-b/r-b on my patch I think I'll just land it since we're early in the merge window
shankaru has joined #dri-devel
<mripard> danvet: like I said, I don't get what it involves at all, but if you trust that it makes things better then you definitely have my acked-by :)
<danvet> mripard, well I need a tested-by really
<danvet> and maybe some links to add for the vc4 problem
<mripard> ok, let me test it then
<slattann> Test Msg
<slattann> @Emersion: basically the i915 driver is calling this drm_send_event to pass an event to userspace; the event could be anything from Display HW, e.g. DE has processed the FB and is ready for further steps.
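For context on the question above, a hedged sketch of the userspace side: events queued by the kernel are picked up by reading the DRM file descriptor, and each one starts with the `struct drm_event` header (a 32-bit type and a 32-bit length that includes the header). The parser below works on a plain byte buffer so it can run anywhere; the function name is hypothetical, and which events a client actually receives depends on what it requested from the driver.

```python
import struct

# Header layout of struct drm_event from the DRM UAPI: u32 type, u32 length,
# where length covers the header plus the payload.
DRM_EVENT_HEADER = struct.Struct("=II")
DRM_EVENT_VBLANK = 0x01
DRM_EVENT_FLIP_COMPLETE = 0x02


def parse_drm_events(buf):
    """Split a buffer as returned by read() on a DRM fd into (type, payload) pairs."""
    events = []
    pos = 0
    while pos + DRM_EVENT_HEADER.size <= len(buf):
        ev_type, length = DRM_EVENT_HEADER.unpack_from(buf, pos)
        if length < DRM_EVENT_HEADER.size or pos + length > len(buf):
            break  # malformed or truncated event, stop parsing
        payload = buf[pos + DRM_EVENT_HEADER.size:pos + length]
        events.append((ev_type, payload))
        pos += length
    return events
```

On a real system the buffer would come from reading the DRM fd once poll/select reports it readable; libdrm's drmHandleEvent does the equivalent parsing for the event types it knows about.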
<mripard> danvet: it seems to work great
<danvet> mripard, want me to resend first or you dig out the old version?
dv_ has quit [Server closed connection]
dv_ has joined #dri-devel
camus1 has joined #dri-devel
Company has joined #dri-devel
<mripard> I dug out the version you linked to in the mail
camus has quit [Remote host closed the connection]
<mripard> there was a bit of fuzz but it was still applying
<danvet> mripard, maybe just resend it then since you've handled that already :-)
<danvet> and I guess I should poke robclark for an ack too
<danvet> and somewhen intel needs to get off its hand rolled atomic commit
`join_su1line has quit []
`join_subline has joined #dri-devel
shankaru has quit [Quit: Leaving.]
FLHerne has quit [Server closed connection]
FLHerne has joined #dri-devel
V has quit [Server closed connection]
V has joined #dri-devel
kts has joined #dri-devel
digetx has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
famfo has quit [Server closed connection]
famfo has joined #dri-devel
digetx has joined #dri-devel
karolherbst has quit [Server closed connection]
karolherbst has joined #dri-devel
illwieckz has quit [Server closed connection]
illwieckz has joined #dri-devel
apinheiro has quit [Ping timeout: 480 seconds]
jagan_ has quit [Remote host closed the connection]
mattst88 has quit [Ping timeout: 480 seconds]
qyliss has quit [Quit: bye]
mclasen has quit [Ping timeout: 480 seconds]
qyliss has joined #dri-devel
shankaru has joined #dri-devel
maxzor has quit [Ping timeout: 480 seconds]
<mripard> danvet: I just sent it
mclasen has joined #dri-devel
<mripard> danvet: I think the preliminary patches of https://lore.kernel.org/all/20220221134155.125447-1-maxime@cerno.tech/ (patches 1-7) still have some benefits, could you give a look if you have some time?
<danvet> robclark, [PATCH v4] drm/atomic-helpers: remove legacy_cursor_update hacks <- can you pls whack an ack onto that
<danvet> mripard, I'm a bit out of the loop, but no other display patch series you could cross-ack to land it all?
<danvet> mripard, for patch 6, did you look at DRM_PLANE_COMMIT_ACTIVE_ONLY instead?
slattann has quit [Quit: Leaving.]
<mripard> danvet: as far as I know DRM_PLANE_COMMIT_ACTIVE_ONLY isn't passed down to the crtc atomic_flush?
<danvet> mripard, active only skips the commit on not-active crtc
<danvet> but it also skips the plane commits so maybe not what you want
<danvet> just figured I bring this up
kts has quit [Quit: Konversation terminated!]
jewins has joined #dri-devel
jagan_ has joined #dri-devel
dllud_ has quit [Server closed connection]
dllud has joined #dri-devel
sdutt has joined #dri-devel
<mripard> ah thanks, it might work then, I'd need to check
<mripard> thanks :)
<marex> danvet: mripard: robertfoss: should I wait for 5.18-rc1 and MW closed before applying series with full set of AB/RB , or is now a good time too ?
<marex> tc358767 looks ready for example
<marex> on drm-misc-next ... that is
<robertfoss> marex: I think now is fine
ahajda_ has joined #dri-devel
<robertfoss> marex: can you link me the series on lkml?
<robertfoss> ah, that's fine too :)
<marex> robertfoss: should I just CC you on the bridge stuff in general ?
ahajda has quit [Ping timeout: 480 seconds]
<marex> robertfoss: https://patchwork.freedesktop.org/series/101981/ could use review ;-)
<robertfoss> Yes, that'd be good. Just found an issue with my ML subscriptions, so I've missed this series until just now
ahajda__ has joined #dri-devel
rbrune has quit [Ping timeout: 480 seconds]
agd5f_ has quit []
agd5f has joined #dri-devel
ahajda_ has quit [Ping timeout: 480 seconds]
<lynxeye> marex: Looking at the patchwork I just noticed that you missed to add my Tb to the last patch of the tc358767 series. No big deal.
<marex> lynxeye: that was deliberate, since I did minor change to that one last patch
<marex> lynxeye: if you want to retest it real quick and send that one last TB, please do
<marex> I am still not entirely sure when to drop AB/RB/TB tags from patches , it feels wrong to just collect them, it just feels wrong to waste reviewer time by dropping the tag when the change is very small
fxkamd has joined #dri-devel
<ajax> Plagman: how... did you manage that
<ajax> ugh someone messed with window lifetimes recently didn't they
mbrost has joined #dri-devel
<robertfoss> for #100372, i fixed a checkpatch --strict warning & pushed it to drm-misc-next
<lynxeye> marex: Just tested it again in DSI to DPI mode. I'm not able to redo the DPI to eDP test right now, but I trust that your minor change didn't break that one from the last time I tried.
<marex> lynxeye: correction, the minor change was that dropped hw init patch you asked for ...
<lynxeye> marex: Yep, I wouldn't expect any new issues from dropping that patch and rebasing the last one.
<lynxeye> marex: Bored.... hahaha. You might have noticed that I still haven't sent out the HDMI stuff. That's not because it's all ready and shiny and I'm just holding it back. ;) But seriously, I'll get you feedback on the first 2 mentioned series as soon as I sorted some more HDMI issues.
<lynxeye> Not sure when I'll be able to grab some HW to test the LDB stuff.
alyssa has joined #dri-devel
<alyssa> Today on "do stencil results come in .r or .g?" ...
<marex> lynxeye: I figured I shouldn't interrupt your boredom, hence I didn't even ask about the HDMI stuff
<MTCoster> alyssa: Switch it up, hide them in .b where they can jump out and surprise people
<marex> lynxeye: there should still be plenty of time until like rc5, no ?
<alyssa> MTCoster: why is none of this documented
<MTCoster> alyssa: Too many annoying things to throw documentation at, too few hours to find where you threw it
LexSfX has quit []
<lynxeye> marex: Yep, I don't particularly like a lot of the downstream HDMI PHY handling and I'm not sure if we get this all sorted until the next merge window, but we should at least manage to get the LCDIF and power domain stuff in.
<marex> lynxeye: the PDs and LCDIF would be nice indeed
<marex> lynxeye: I had a look at the HDMI myself briefly and decided to let you have all the "fun" with it
<marex> the DSI and LVDS were just ... easier
<marex> oh and I have plenty of hardware to test both, which is also nice
<marex> jagan_: are you making any progress on the MX8M DSIM btw ?
<alyssa> Panfrost just replicates stencil to all channels and everyone seems happy with that...
Guest647 has quit [Server closed connection]
heat has joined #dri-devel
leah has joined #dri-devel
leah is now known as Guest860
heat has quit [Read error: No route to host]
heat has joined #dri-devel
Guest670 is now known as dreda
LexSfX has joined #dri-devel
ella-0_ has joined #dri-devel
ella-0 has quit [Read error: Connection reset by peer]
<robclark> danvet: as long as cursor movement doesn't break 60fps webgl aquarium I think it's good.. I'll try to find some time to test it.. but I guess you should be able to test it yourself ;-)
znullptr[m] has quit [Server closed connection]
znullptr[m] has joined #dri-devel
<alyssa> 0:7(3): error: explicit member locations are not allowed in blocks declared as arrays fragment shader
<alyssa> on dEQP-GLES31.functional.program_interface_query.program_input.location.interface_blocks.in.block_array.var_explicit_location
<alyssa> ...is this a botched test?
<alyssa> It's in gles31-spec-issues.txt and nowhere else so... yes :)
bluebugs has quit [Read error: Connection reset by peer]
bluebugs has joined #dri-devel
khfeng has quit [Ping timeout: 480 seconds]
shashank_sharma has joined #dri-devel
shashank_s has joined #dri-devel
<alyssa> next up: apparently vertex id is no longer zero based
<alyssa> or maybe it depends on shader stage
<alyssa> fun
<alyssa> *whether the new or old flow is used
<alyssa> although other stuff depends on that too so... meh...
shashanks has quit [Ping timeout: 480 seconds]
<alyssa> grumble grumble grumble
<alyssa> I guess I can lower in the backend
<alyssa> it's a trivial lowering anyway
shashank_sharma has quit [Ping timeout: 480 seconds]
<alyssa> pass, onwards!
<danvet> daniels, thx for elaborating a bit on the "how to hotplug change connector type" therad
<danvet> *thread even
<emersion> when possible please include the CONNECTOR prop
<emersion> in the uevent
naveenk2 has quit []
shankaru has quit [Quit: Leaving.]
<danvet> robclark, afaict you've fixed msm already meanwhile, so it should be good
<danvet> robclark, also the box isn't set up for testing :-(
imre has quit [Server closed connection]
imre has joined #dri-devel
<danvet> marex, drm-misc-next is always open for business, there's no block out for merging features during the merge window
<marex> danvet: I still don't feel like I should eagerly merge my own patches, even when they are AB, so I was waiting a bit
<danvet> mripard, I've sent out my v4 which should be fixed
<danvet> mripard, pls also ack
jkrzyszt_ has quit [Ping timeout: 480 seconds]
alatiera9 is now known as alatiera
gawin has joined #dri-devel
minecrell has quit [Read error: Connection reset by peer]
minecrell has joined #dri-devel
rcf has quit [Remote host closed the connection]
mceier has joined #dri-devel
q66 has quit [Server closed connection]
q66 has joined #dri-devel
ds` has quit [Server closed connection]
ds- has joined #dri-devel
ds- is now known as ds`
mbrost has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
pjakobsson_ has joined #dri-devel
jagan_ has quit [Remote host closed the connection]
pjakobsson has quit [Ping timeout: 480 seconds]
MajorBiscuit has quit [Quit: WeeChat 3.4]
libv has quit [Server closed connection]
libv has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
mclasen has quit []
mclasen has joined #dri-devel
paulk1 has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
<demarchi> dim: WARNING: issues in commits detected, but continuing
<demarchi> To ssh://git.freedesktop.org/git/drm/drm-intel
<demarchi> error: failed to push some refs to 'ssh://git.freedesktop.org/git/drm/drm-intel'
<demarchi> ! [rejected] topic/core-for-CI -> topic/core-for-CI (non-fast-forward)
<demarchi> danvet: rodrigovivi jani did something change in the server configuration for drm-intel remote? we used to be able to rebase and push a non-ff update in topic/core-for-CI
<danvet> huh maybe mistakenly?
<danvet> daniels, ^^
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
ybogdano has joined #dri-devel
mattst88 has joined #dri-devel
mszyprow has quit [Ping timeout: 480 seconds]
gawin has joined #dri-devel
paulk1 has joined #dri-devel
dreda has quit [Server closed connection]
dreda has joined #dri-devel
dreda is now known as Guest871
<ukleinek> demarchi: without knowing any details, did you specify -f or --force-with-lease
<ukleinek> ?
<jekstrand> zmike: Vulkan threading stuff merged. You can review !15651 at any time.
<zmike> ANY time?!
<zmike> how generous of you
<zmike> I'm currently reviewing how stupid I was last year, so maybe when I'm done with that
sravn has quit [Quit: WeeChat 3.3]
<demarchi> ukleinek: danvet oh... that looks like my fault
<demarchi> we have dim -f, but that force is for something else
<demarchi> should have been dim push -f, not dim -f push
<demarchi> let me try again
<karolherbst> what do I need to pass to INTEL_DEBUG= to get only shader stats?
<demarchi> ukleinek: now the push went through, thanks
<demarchi> but there are unresolved conflicts in drm-misc-next: drivers/gpu/drm/mediatek/mtk_dsi.c
<jekstrand> mattst88: Do you still have that script for counting reviewed-by tags?
lemonzest has joined #dri-devel
<mattst88> hmm, probably. let me see
<jekstrand> mattst88: I'm curious what my write/review ratio is. I suspect I review more than I write but I'm not sure.
<mattst88> jekstrand: yep, it's here: https://cgit.freedesktop.org/~mattst88/stats/
<mattst88> needs a lot of email updates :)
<mattst88> lots of R-b tags don't get recorded in git history these days too, which will make things difficult
<demarchi> robertfoss: 1d0b53630445 ("drm: bridge: mtk_dsi: Switch to devm_drm_of_get_bridge")... is this a conflict you're working on?
leandrohrb has joined #dri-devel
<karolherbst> shaders from hell: SIMD32 shader: 10218 instructions. 3 loops. 220376 cycles. 758:1556 spills:fills, 3 sends, scheduled with mode lifo. Promoted 22 constants. Compacted 163488 to 143152 bytes (12%)
<jekstrand> mattst88: Now that we have a .mailmap, the e-mail updates shouldn't be needed, maybe?
<mattst88> yeah, probably true!
<karolherbst> SIMD32 shader: 4078 instructions. 1 loops. 661526 cycles. 213:543 spills:fills, 3 sends, scheduled with mode lifo. Promoted 8 constants. Compacted 65248 to 57712 bytes (12%) uhh
<jekstrand> mattst88: Hrm... Git isn't smart enough to apply mailmap to tags. :-/
<jekstrand> Maybe there's an easy mailmap command we can stuff through?
<mattst88> oh dang, heh
<jekstrand> mattst88: git-check-mailmap. :D
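A rough sketch of the counting discussed above, not the script mattst88 links to: it tallies Reviewed-by trailers from commit message text and folds old addresses into canonical ones. The `mailmap` dict is a hypothetical stand-in for whatever `.mailmap` or `git check-mailmap` would resolve, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches trailer lines such as "Reviewed-by: Name <email>".
TAG_RE = re.compile(r"^\s*Reviewed-by:\s*(.*?)\s*<([^>]+)>\s*$",
                    re.IGNORECASE | re.MULTILINE)


def count_reviews(commit_messages, mailmap=None):
    """Count Reviewed-by tags per canonical e-mail address.

    mailmap is a hypothetical dict {old_email: canonical_email}.
    """
    mailmap = mailmap or {}
    counts = Counter()
    for message in commit_messages:
        for _name, email in TAG_RE.findall(message):
            email = email.lower()
            counts[mailmap.get(email, email)] += 1
    return counts
```

Comparing these counts against authored-commit counts per address would give the write/review ratio jekstrand asks about, with the caveat raised above that many reviews never land as tags in git history.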
<karolherbst> do we have somewhere a nice "post everything opt loop" I can look at? Like something st/mesa or so is doing right before stuff gets handed into a driver
<alyssa> karolherbst: 758:1556 spills:fills, oof
<karolherbst> alyssa: it's CL
<karolherbst> we have to stop inlining everything
<mattst88> jekstrand: git always impresses me!
<karolherbst> but CL is soo memory heavy, that stuff doesn't really matter all that much
<mattst88> jekstrand: should I move that repo to gitlab so you can make an MR? :)
<alyssa> karolherbst: spills = memory
<bnieuwenhuizen> try some RT, inlining all the shaders there can get fun (1 million instruction shaders starts getting into trouble with branch offsets etc.)
<jekstrand> mattst88: I'm not sure how motivated I am. I was kind-of hoping I'd get you curious enough to fix it. :P
<karolherbst> alyssa: sure, but all those kernels are doing is writing to memory anyway
<mattst88> bnieuwenhuizen: yeah -- do you guys have a solution for dealing with branch offsets that are too large?
heat has quit [Ping timeout: 480 seconds]
<mattst88> there's an outstanding bug on intel/gfx7 like this that I don't think anyone has thought about too deeply, but it suffers from that problem
<pendingchaos> mattst88: we use an addition and the s_setpc_b64 instruction instead of a branch
<alyssa> karolherbst: Yeah but spills/fills == moar memory
<pendingchaos> +s_getpc_b64
<karolherbst> yeah... I kind of need to get stuff more optimized
<karolherbst> probably dealing with fma is a good idea
<mattst88> pendingchaos: cool, that just allows you to set a 64-bit instruction pointer?
<pendingchaos> yes
<alyssa> mali has JUMP_EX
<pendingchaos> during assembly, we replace branches with this sequence until all branches can be encoded
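The "replace branches until all branches can be encoded" step pendingchaos describes can be sketched as a fixed-point loop. This is only an illustration under stated assumptions: the instruction sizes, the reachable distance, and the names `struct instr`, `instr_offset` and `relax_branches` are hypothetical and are not taken from any real assembler or ISA encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes and range in abstract units, not real ISA encodings. */
enum { SHORT_BRANCH_SIZE = 1, LONG_JUMP_SIZE = 5, MAX_SHORT_DISTANCE = 8 };

struct instr {
   bool is_branch;
   bool is_long;   /* branch already replaced by the long jump sequence */
   size_t target;  /* index of the target instruction, branches only */
   long size;      /* current encoded size */
};

/* Offset of instruction idx given the current sizes. */
static long instr_offset(const struct instr *prog, size_t idx)
{
   long off = 0;
   for (size_t i = 0; i < idx; i++)
      off += prog[i].size;
   return off;
}

/* Expand out-of-range branches until none is left; returns how many
 * branches were expanded. Each branch expands at most once, so this ends. */
static int relax_branches(struct instr *prog, size_t count)
{
   int expanded = 0;
   bool progress = true;

   while (progress) {
      progress = false;
      for (size_t i = 0; i < count; i++) {
         if (!prog[i].is_branch || prog[i].is_long)
            continue;
         assert(prog[i].target < count);

         /* Distance is measured from the end of the branch. */
         long dist = instr_offset(prog, prog[i].target) -
                     (instr_offset(prog, i) + prog[i].size);
         if (dist < 0)
            dist = -dist;

         if (dist > MAX_SHORT_DISTANCE) {
            /* Growing this branch can push other branches out of range,
             * hence the outer loop. */
            prog[i].is_long = true;
            prog[i].size = LONG_JUMP_SIZE;
            expanded++;
            progress = true;
         }
      }
   }
   return expanded;
}
```

The outer loop matters because expanding one branch lengthens the code between some other branch and its target, so a branch that fit on the first pass may no longer fit on the second.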
<mattst88> pendingchaos: how do you handle non-uniform control flow?
<karolherbst> that thing doesn't even use fma :(
<mattst88> or does that instruction respect execution masks, etc?
LexSfX has quit []
<bnieuwenhuizen> mattst88: all our branches are effectively uniform branches
<bnieuwenhuizen> the rest is lowered to masking logic before
<pendingchaos> doesn't matter, that's handled before assembly
<pendingchaos> non-uniform control flow is implemented earlier using predication and uniform branches
<mattst88> ah, interesting
<mattst88> on intel, for an if statement you have to specify the location of the else instruction and also the endif instruction :|
<alyssa> if/else/endif instructions are so funny to me i don't know
<bnieuwenhuizen> on AMD we basically end up with a bunch of masking logic and then a "if mask = 0, skip some blocks"
jagan_ has joined #dri-devel
<bnieuwenhuizen> (+ loop branches ofc.)
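The "masking logic plus skip the block if the mask is 0" scheme can be emulated on the CPU to make it concrete. This is a toy model, assuming an 8-lane wave and a made-up `run_if_else` helper; it does not reflect how any real compiler emits the masks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8

/* Emulates `out[i] = cond[i] ? a[i] + 1 : a[i] * 2` for one 8-lane wave.
 * Returns how many of the two blocks were actually executed. */
static int run_if_else(const int cond[LANES], const int a[LANES],
                       int out[LANES])
{
   assert(cond != NULL && a != NULL && out != NULL);

   int blocks_run = 0;
   const uint8_t entry_exec = 0xff; /* all lanes active on entry */

   /* The per-lane condition becomes a bitmask. */
   uint8_t then_mask = 0;
   for (int i = 0; i < LANES; i++) {
      if (cond[i])
         then_mask |= (uint8_t)(1u << i);
   }

   /* "then" side: a uniform branch skips it when no lane is active. */
   uint8_t exec = (uint8_t)(entry_exec & then_mask);
   if (exec != 0) {
      for (int i = 0; i < LANES; i++) {
         if (exec & (1u << i))
            out[i] = a[i] + 1;
      }
      blocks_run++;
   }

   /* "else" side: same idea with the inverted mask. */
   exec = (uint8_t)(entry_exec & ~then_mask);
   if (exec != 0) {
      for (int i = 0; i < LANES; i++) {
         if (exec & (1u << i))
            out[i] = a[i] * 2;
      }
      blocks_run++;
   }

   return blocks_run;
}
```

When every lane agrees on the condition only one block runs; when lanes disagree both blocks run, each with part of the wave masked off.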
rcf has joined #dri-devel
<alyssa> Mali has divergence hw, I feel lucky :p
<mattst88> makes sense, thanks
<bnieuwenhuizen> still gotta figure out how to make that C++ forward progress compliant wrt locking in the same subgroup
LexSfX has joined #dri-devel
<alyssa> what
<mattst88> that sounds scary
<bnieuwenhuizen> alyssa: what if two invocations in the same subgroup/wave/etc want to lock the same lock? How do you ensure the subgroup isn't stuck for the second thread waiting until the first thread unlocks
<bnieuwenhuizen> (your typical GPU lock being a spinlock. Not sure offhand if there were also forward progress guarantees on atomics themselves)
<bnieuwenhuizen> I hear nvidia might have a solution to it, but I'm not sure how that works even algorithmically
<mattst88> 8-wide atomic CAS on intel is guaranteed to only work for at most one
<bnieuwenhuizen> yeah, the question is how do you know to move forward in the subgroup and try revisiting later
<bnieuwenhuizen> (in a CAS loop)
<mattst88> ah, I see
<alyssa> bnieuwenhuizen: there is a blessed implementation of locks for Mali, some magic code sequence that is written in a bizarre way but ensures forward progress
<alyssa> however I don't know what API needs that
devilhorns has quit []
<bnieuwenhuizen> alyssa: CUDA and other C++ stuff :)
<mattst88> right, I think on Intel the loop channel gets disabled once the CAS has succeeded for the channel that got the CAS completed
<mattst88> and so the loop runs for e.g. all 8 channels the first time with one channel executing the CAS successfully, then it runs again for 7 channels, etc
<bnieuwenhuizen> and those 7 channels might all keep looping in the case the CAS was for a mutex
<bnieuwenhuizen> which would prevent forward progress on the 1 channel?
<mattst88> ah, I see!
<mattst88> yeah, I see why that's tricky now
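The problem bnieuwenhuizen raises can be shown with a small lockstep model. It is a sketch under loud assumptions: lanes are simulated sequentially on the CPU, a lane that leaves the spin loop is assumed to idle until all lanes have left it, and `cas`, `run_spin_then_work` and `run_work_inside_loop` are hypothetical names. It is not the Mali or NVIDIA sequence mentioned in the discussion, whose details are not given here.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LANES 8

/* Toy compare-and-swap; lanes are modelled sequentially, so of the lanes
 * trying in the same step only the first one sees the expected value. */
static bool cas(int *word, int expected, int desired)
{
   if (*word != expected)
      return false;
   *word = desired;
   return true;
}

/* Spin until the lock is taken, critical section placed after the loop.
 * Model assumption: a lane that left the loop idles until every lane has
 * left it. Returns how many lanes finished their critical section. */
static int run_spin_then_work(int lanes, int max_iters)
{
   assert(lanes >= 1 && lanes <= MAX_LANES);

   int lock = 0;
   int remaining = lanes;
   bool in_loop[MAX_LANES];
   for (int l = 0; l < lanes; l++)
      in_loop[l] = true;

   for (int iter = 0; iter < max_iters && remaining > 0; iter++) {
      for (int l = 0; l < lanes; l++) {
         if (in_loop[l] && cas(&lock, 0, 1)) {
            in_loop[l] = false; /* the owner leaves the loop, then idles */
            remaining--;
         }
      }
   }

   if (remaining > 0)
      return 0; /* the owner never got to run its unlock */

   /* Every lane left the loop; in this model that only happens with one
    * lane, which now runs its critical section and unlocks. */
   lock = 0;
   return lanes;
}

/* Critical section and unlock placed inside the loop body, so the winner
 * releases the lock before the next iteration. */
static int run_work_inside_loop(int lanes, int max_iters)
{
   assert(lanes >= 1 && lanes <= MAX_LANES);

   int lock = 0;
   int finished = 0;
   bool done[MAX_LANES] = { false };

   for (int iter = 0; iter < max_iters && finished < lanes; iter++) {
      int winner = -1;
      /* All unfinished lanes issue the CAS in the same step. */
      for (int l = 0; l < lanes; l++) {
         if (!done[l] && cas(&lock, 0, 1))
            winner = l;
      }
      /* Only the winner is active for the body: work, then unlock. */
      if (winner >= 0) {
         done[winner] = true;
         finished++;
         lock = 0;
      }
   }
   return finished;
}
```

In the first variant the seven losing lanes keep spinning on a lock whose owner is parked at the loop exit, so nothing completes. In the second variant one lane completes per iteration.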
<jani> who's got mad C preprocessor skills? is there a way to come up with an #if conditional expression that checks if a macro is equal to . (just the period, no quotes). the macro may be defined to basically anything, typically a path
<jekstrand> jani: Can you stringify it and then check for [0] == '.' and [1] == '\0'?
<jekstrand> Not sure if that'c constant-fold well enough, though.
<jekstrand> *that'd
<jani> jekstrand: I don't think that's possible in preprocessor
<jani> (tried it anyway)
<karolherbst> jani: he meant in C code, hoping the compiler constant-folds it away
<bnieuwenhuizen> just a word of warning that IIRC according to C spec the source before the preprocessor still needs to tokenize correctly, I don't think that would necessarily succeed for a path
<karolherbst> normally I would do as little as possible in the preprocessor anyway
<karolherbst> compilers are smart enough to constant fold these days
<jekstrand> The v3d uAPI is all syncobj. That's convenient....
<jani> so this is specifically about TRACE_INCLUDE_PATH. you're not supposed to set it to . but people do anyway. it only gets used at the preprocessor level.
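For reference, jekstrand's suggestion looks roughly like this in C code. It is not the `#if` conditional jani asked for, since string indexing is not available to the preprocessor. It is a runtime expression that a compiler will typically constant-fold. The `PATH_*` macros are hypothetical stand-ins for whatever TRACE_INCLUDE_PATH might be set to, and `EXPANDS_TO_DOT` and `is_single_dot` are made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Two-level stringification so the argument is macro-expanded first. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* True when the string is exactly ".". */
static inline bool is_single_dot(const char *s)
{
   assert(s != NULL);
   return s[0] == '.' && s[1] == '\0';
}

/* True when macro m expands to a single '.' and nothing else. */
#define EXPANDS_TO_DOT(m) is_single_dot(STR(m))

/* Hypothetical stand-ins for a path-like macro. */
#define PATH_DOT .
#define PATH_PARENT ..
#define PATH_DIR ../../drivers/gpu/drm/foo

static bool dot_case(void)    { return EXPANDS_TO_DOT(PATH_DOT); }
static bool parent_case(void) { return EXPANDS_TO_DOT(PATH_PARENT); }
static bool dir_case(void)    { return EXPANDS_TO_DOT(PATH_DIR); }
```

As bnieuwenhuizen notes, this relies on the macro body being made of valid preprocessing tokens, which holds for these examples but is not guaranteed for every possible path.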
<Danct12> can anyone here take a look at https://gitlab.freedesktop.org/mesa/mesa/-/issues/6233 ?
jagan_ has quit [Remote host closed the connection]
jagan_ has joined #dri-devel
<alyssa> bnieuwenhuizen: CUDA, ooof
<alyssa> oh right OpenCL C++ is Arm's fault isn't it
<Plagman> ajax: lmk if it's not obvious to repro, i can try to revert my workaround and put something together for you
<zmike> Plagman: would be great if you could give us a repro scenario; we're reasonably sure what's happening but not how
<zmike> or if you could hook me up with whatever steam build that'd work too
<Plagman> let me look
<jekstrand> alyssa: Pretty sure Intel helped. :-/
iive has joined #dri-devel
<karolherbst> what's with cuda?
<bnieuwenhuizen> karolherbst: talking about forward progress and mutexes
<karolherbst> ehhh
<karolherbst> I know that the hardware got a sleep instruction, but they had something like yield for quite some time as well
ngcortes has joined #dri-devel
<Plagman> zmike: it got worked around in the ui by not calling the offending code on top of my own workaround, will have to jump through a few more hoops to get you a consistent repro, but will do
<zmike> Plagman: cool, just lmk
ngcortes has quit [Remote host closed the connection]
<karolherbst> mhh I might want to try to optimize scratch memory a little
<alyssa> jekstrand: Womp
<karolherbst> if somebody is a little bored, I could use some help figuring out what opt passes to run to make this a little better: https://gist.githubusercontent.com/karolherbst/c3822f3d8d65177918ecc87e89216e14/raw/34c93da724482381e1c8b5a3a5ac1fa0c5d8a8ee/gistfile1.txt
<karolherbst> btw, I like how that thing had like 31 kernel input params, but most of them went away. jekstrand ^^
agners has joined #dri-devel
<karolherbst> they do add tons of -D flags, so I guess that is to be expected somewhat
<karolherbst> still weird they still have so many dead inputs
aravind has quit [Ping timeout: 480 seconds]
<karolherbst> those long bcsel look odd
<karolherbst> *chains
Namarrgon has joined #dri-devel
<jekstrand> karolherbst: What specifically do you want to see shrink?
<karolherbst> mhhh, not quite sure, kind of hoped something obvious would stand out
Haaninjo has joined #dri-devel
<jekstrand> karolherbst: Sometimes a lot of math is just a lot of math. :(
<karolherbst> yeah...
<karolherbst> I do wonder about the scratch memory though
<alyssa> at a quick glance you're lowering int64 which might make it worse..?
<karolherbst> perf is bad because we spill like hell, but still
<karolherbst> alyssa: could be
<karolherbst> but I only lower what the driver wants me to lower
<alyssa> er wait no
<karolherbst> and int64 lowering is quite late
<karolherbst> but those bcsel make me wonder
<alyssa> honestly the NIR looks reasonable
<karolherbst> yeah.. it probably is
<alyssa> Not sure what makes you think it's not
<alyssa> It just seems like a big boy
<karolherbst> well it's one of the smaller ones
<jekstrand> karolherbst: nir_opt_copy_prop_vars may help
<karolherbst> I already call that one
<jekstrand> karolherbst: What about nir_opt_dead_write_vars()?
angerctl has quit [Ping timeout: 480 seconds]
<karolherbst> that I don't
<jekstrand> Yeah, that'll help
<karolherbst> when should I run it?
<jekstrand> after copy_prop_vars()
<karolherbst> okay
<jekstrand> They're complementary passes
<karolherbst> okay, let's see what that changes
<alyssa> I wonder about dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.const_expression_vertex
<alyssa> Do drivers that do binning shaders pass that...?
<karolherbst> jekstrand: that actually helps in a few kernels quite a bit
<jekstrand> alyssa: Presumably. They're GLES conformant. The question is if they "naturally" pass it or if some driver dev has hacked it to death to barely pass. :)
<karolherbst> one went from 250k cycles down to 220k
<jekstrand> karolherbst: \o/
<karolherbst> that's still not a big one though :D
<alyssa> jekstrand: Nod..
<karolherbst> a few went up, but not a lot
<alyssa> is this llvmpipe or iris?
<jekstrand> karolherbst: That's not incredibly surprising. It'll let the compiler move things around a bit and then iris will suck at scheduling.
<alyssa> oh and separate shaders fail. joy.
<karolherbst> there is one beast kernel with 669126 cycles
<alyssa> how the heck do other drivers do linking
<karolherbst> anyway.. we need a solution for inlining :(
<karolherbst> jekstrand: I do wonder how difficult it would be to support branching for drivers
<karolherbst> I mean.. in the terms of jumps
<karolherbst> not calls
<jekstrand> karolherbst: I'm not sure what you mean
<karolherbst> like we have to support calling functions, but I'd like to start with an impl for hardware which doesn't even have a call stack, so all you've got is a plain jump, no register saving, no return, nothing
<karolherbst> of course that kind of requires labels
<jekstrand> The real hard problems with functions are mostly RA in the back-end
<karolherbst> or would it be better to let drivers handle all of that?
<jekstrand> Regardless of whether or not there's a call instruction
<karolherbst> mhh yeah..
<karolherbst> that's true
<karolherbst> nouveau kind of supports calling functions, and we just save registers the called functions would overwrite
alyssa has left #dri-devel [#dri-devel]
<jekstrand> On Intel, I think there's a couple ways to do the actual jump.
<jekstrand> The hard part is the save/restore
<karolherbst> they got rid of everything on nv hardware :)
<jekstrand> And there's a few different ways to go about that.
<karolherbst> you can jump, that's all
<karolherbst> we have a relative, absolute and absolute indirect jump, and no return
<karolherbst> (on newest hardware that is)
<jekstrand> sure
<jekstrand> All you need is indirect jump
<jekstrand> return with a value is dumb
<karolherbst> we do have a "fake" return, but that's just a plain jump with a different bit so the debugger knows
<jekstrand> sure
<jekstrand> I mean, you could have a return instruction with a magic return addr register but then you have to save/restore that reg which is super annoying.
<karolherbst> yeah wel
<karolherbst> l
<karolherbst> we had a stack once where we push to/pop from, but they got rid of it
<karolherbst> jekstrand: btw, I am wondering how much I should focus on getting the local size situation worked out. How much of a benefit would it be to launch kernels with n * subgroup_size threads vs 1 * subgroup_size
<karolherbst> atm I report the subgroup size as the max threads for a compiled kernel, because I am sure that will always work, but not quite sure if launching bigger ones does have a real benefit or not
rbrune has joined #dri-devel
idr has joined #dri-devel
mclasen has quit [Ping timeout: 480 seconds]
<jekstrand> idk
<jekstrand> It does if they're using workgroup memory
<karolherbst> workgroup memory as in shared mem, right?
mclasen has joined #dri-devel
<karolherbst> fun... luxmark compiles the kernel with a static work group size if the device is not a GPU
<jekstrand> yup
<karolherbst> as if on GPUs that wouldn't matter
gouchi has joined #dri-devel
nchery has joined #dri-devel
jagan_ has quit [Remote host closed the connection]
ngcortes has joined #dri-devel
apinheiro has joined #dri-devel
shankaru has joined #dri-devel
<jekstrand> Ugh... We never implemented DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT for the non-timeline ioctl. I'm pretty sure I asked for it at the time but no...
<jekstrand> :sob:
<jekstrand> I guess I get to spin on DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD until it hands me a sync_file.
<jekstrand> This is so gross....
icecream95 has joined #dri-devel
mszyprow has joined #dri-devel
anujp has joined #dri-devel
anujp has quit []
anujp has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
<dj-death> jekstrand: what??
<dj-death> jekstrand: seems to be accepted as a flag for non timeline wait
<jekstrand> dj-death: Yup. It's only supported by timeline_wait_ioctl() which returns -ENOTSUPP if DRIVER_SYNCOBJ_TIMELINE isn't supported.
<jekstrand> wait_ioctl() rejects the flag.
Duke`` has quit [Ping timeout: 480 seconds]
<jekstrand> Oh, I meant WAIT_AVAILABLE
<jekstrand> WAIT_FOR_SUBMIT has been around for forever
<dj-death> ah yeah
<jekstrand> And I need WAIT_AVAILABLE for v3dv threaded submit to be correct and... I don't have it. :-(
<dj-death> ah sad
rkanwal has quit [Ping timeout: 480 seconds]
<dj-death> it was introduced at the same time
<jekstrand> So I'm going to spin on HANDLE_TO_FD asking for a sync_file
<jekstrand> Yeah, but timelines require driver support. waits could be independent.
<jekstrand> I should probably write and send the 1-line kernel patch once I've got v3dv working.
rkanwal has joined #dri-devel
ppascher has joined #dri-devel
lanodan has quit [Quit: WeeChat 3.4.1]
lynxeye has quit [Quit: Leaving.]
lanodan has joined #dri-devel
danvet has quit [Ping timeout: 480 seconds]
mvlad has quit [Remote host closed the connection]
<karolherbst> uhh.. iris shader variants on work group size
<karolherbst> now I got rid of all shader stats not ending up on the hardware and things look way better
mclasen has quit [Ping timeout: 480 seconds]
graphicalidiot has joined #dri-devel
gouchi has quit [Remote host closed the connection]
<graphicalidiot> hello guys
<graphicalidiot> hows everyone
<graphicalidiot> this is the mesa chat right
<karolherbst> yes it is
<graphicalidiot> okay
<graphicalidiot> i was wondering, how the hell can glxgears reach 2000fps but xonotic runs at like 2 on llvmpipe
<airlied> because gears mostly measure how fast you can clear the framebuffer
<graphicalidiot> oh
<graphicalidiot> okay
<graphicalidiot> also, any idea why zink won't load on kgsl turnip?
<airlied> that I don't, though not sure how supported kgsl turnip is for buffer sharing etc that zink needs
<graphicalidiot> well
<graphicalidiot> i heard other people got it running
agd5f has quit [Read error: Connection reset by peer]
<graphicalidiot> but well, i'm not on android anymore, running mainline on my phone with pure dreedreno
<graphicalidiot> *freedreno
<graphicalidiot> also, how can i compile turnip, and zink only (and the icds)
<karolherbst> jekstrand: ehhh.. I found something very stupid we are doing
<graphicalidiot> huh
<karolherbst> type* some_func(type* base, ...) { return &base[offset] } -> ends up with a local copy and gets turned into load/store scratch
<airlied> -Dgallium-drivers=zink -Dvulkan-drivers=turnip should work
agd5f has joined #dri-devel
<jekstrand> karolherbst: We should be able to propagate that eventually
rbrune has quit [Ping timeout: 480 seconds]
<karolherbst> yeah... I am currently checking out what fails
<graphicalidiot> also how do i reply on irc
<airlied> graphicalidiot: you use name
<airlied> name:
<graphicalidiot> ok
<graphicalidiot> airlied: ok got it lemme compile
graphicalidiot has quit [Quit: Page closed]
<karolherbst> jekstrand: there is a memcpy in between
<karolherbst> so it memcpys to function_temp .. uh, maybe I should try to get rid of memcpies earlier?
graphicalidiot has joined #dri-devel
<graphicalidiot> h
<karolherbst> mhh nope
rkanwal has quit [Quit: rkanwal]
<graphicalidiot> oh wow sway died again
frankbinns has quit [Quit: Leaving]
<karolherbst> jekstrand: would copy prop work if things get casted to nonsense types?
<karolherbst> mhhh
gawin has quit [Ping timeout: 480 seconds]
<karolherbst> jekstrand: uhh.. is this annoying. Okay, imagine a struct { int a, b, c} S; then we get memcpies like: memcpy((uint8_t*)&global_S.a, (uint8_t*)&function_S, 0xc)
<karolherbst> we might want to opt those memcpies into simple store_deref
<jekstrand> karolherbst: opt_memcpy tries
<jekstrand> And it can handle quite a lot of cases
<jekstrand> Also, I think there's an MR outstanding to improve it
<karolherbst> ahh, I don't use opt_memcpy either :)
mszyprow has quit [Ping timeout: 480 seconds]
<graphicalidiot> fuck
<jekstrand> karolherbst: Yeah, you want opt_memcpy
<jekstrand> karolherbst: LLVM REALLY likes memcpy and it does a decent job of eating them up.
<karolherbst> when is it best to call it? before copy_prop or after opt_deref?
<graphicalidiot> why the hell is vkcube segfaulting
<karolherbst> I loop anyway, but roughly
<graphicalidiot> oh fuck
<graphicalidiot> running glxgears with zink yields:
<graphicalidiot> smth about optimal tiling and failed to assert
<graphicalidiot> and that it doesnt support base zink requirements feats.features.WideLines
<graphicalidiot> wow
<jekstrand> karolherbst: idk. Do whatever I did in intel_clc?
<karolherbst> ahh
dreda has joined #dri-devel
Guest871 has quit [Read error: Connection reset by peer]
dreda is now known as Guest886
ahajda__ has quit []
<karolherbst> jekstrand: yeah.. with that MR those things are going away as it seems
rasterman has quit [Remote host closed the connection]
<karolherbst> mhh, but some are still there.. let's see
<jekstrand> :)
<karolherbst> jekstrand: okay.. yeah soo.. a bunch of casts are gone and now it looks like: memcpy(&global_S.a, (uint8_t*)&function_S, 0xc)
shankaru has quit []
<karolherbst> it improved in other places though
* karolherbst has no idea why we even get weirdo code like this out of llvm
<jekstrand> LLVM is an enigma
<karolherbst> I have an idea what we could fix to make this work here..
<karolherbst> let's see
<karolherbst> memcpy(&struct.a, ...) -> memcpy(&struct, ...) if the field is the first one?
ybogdano has joined #dri-devel
<karolherbst> jekstrand: I assume the pass can handle memcpies of the size of the base?
<jekstrand> karolherbst: IDK
<jekstrand> karolherbst: Maybe?
<karolherbst> it looks like it
<jekstrand> Not sure exactly what you mean
<jekstrand> cool
<karolherbst> ehh.. maybe only for vecs and scalars though
* karolherbst starts reading some code
<jekstrand> It tries
<jekstrand> It does suffer quite a bit if there are any holes in the struct
<karolherbst> yeah.. but here it's just three ints
<karolherbst> first step is to get rid of this pointless struct_deref anyway
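The rewrite karolherbst proposes rests on a plain C fact: the first member of a struct sits at offset 0, so a copy that starts at the first field and spans the whole struct is the same as copying the struct. A minimal sketch, assuming the three-int struct from the discussion. `copy_covers_whole_struct`, `copy_via_first_field` and `copy_whole_struct` are hypothetical helpers and not NIR functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct S { int a, b, c; };

/* The rewrite is only safe when the copy starts at offset 0 and spans the
 * whole struct. Hypothetical helper, not a NIR function. */
static bool copy_covers_whole_struct(size_t field_offset, size_t copy_size)
{
   return field_offset == 0 && copy_size == sizeof(struct S);
}

/* Byte-wise copy addressed through the first field, like the memcpy in the
 * log: the destination is the address of `a`, the size is the whole struct. */
static void copy_via_first_field(struct S *dst, const struct S *src)
{
   assert(dst != NULL && src != NULL);
   uint8_t *field_addr = (uint8_t *)dst + offsetof(struct S, a);
   memcpy(field_addr, src, sizeof(struct S));
}

/* What the memcpy could be turned into: one whole-struct store. */
static void copy_whole_struct(struct S *dst, const struct S *src)
{
   assert(dst != NULL && src != NULL);
   *dst = *src;
}
```

With 4-byte ints and no padding, `sizeof(struct S)` is 12, the 0xc seen in the memcpy calls above.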
anujp has quit [Ping timeout: 480 seconds]
Haaninjo has quit [Quit: Ex-Chat]
mbrost has quit [Ping timeout: 480 seconds]
<karolherbst> jekstrand: type_is_tightly_packed returns a weird size. It should be 0xc for a struct with three ints, right?
iive has quit []
<karolherbst> ehh wait
<karolherbst> I have to run it after nir_lower_vars_to_explicit_types, no?
tzimmermann_ has joined #dri-devel
pcercuei has quit [Quit: dodo]
tzimmermann has quit [Ping timeout: 480 seconds]
mclasen has joined #dri-devel
<karolherbst> jekstrand: okay... I messed up :) but now it's good
<karolherbst> multiple kernels dropped like 15% cycles :)
graphicalidiot has quit [Quit: Page closed]
<karolherbst> okay.. all memcpies are gone :)
<mattst88> karolherbst: nice. what GPU is this on?
<karolherbst> iris
<karolherbst> ehh
<karolherbst> CometLake-H GT2
<karolherbst> maybe I need to check with the official intel CL stuff to see how bad or good the stack is :D
<airlied> install ubuntu first :-P
<airlied> though maybe there are fedora rpms for it somewhere
<karolherbst> ehhh
<karolherbst> airlied: they have tarballs with binaries as it seems?
<karolherbst> ahh no, just debs
<karolherbst> *sigh*
<karolherbst> and I just want them to tell me that their stack gets 10 times as many points
<karolherbst> airlied: they have rhel 8.4 stuff.. maybe I just try that?
<karolherbst> airlied: " Platform Name Intel(R) OpenCL HD Graphics" seems to work?
<airlied> karolherbst: cool
<karolherbst> "free(): invalid pointer" ehhh
<karolherbst> I have questions
<karolherbst> of course it crashes inside their llvm fork :)
<karolherbst> guess I win this round
tursulin has quit [Read error: Connection reset by peer]
<karolherbst> guess I'll boot rhel at some point and try it there
<airlied> dcbaker: any reason 3bbd404457e6e3278afd78f6721be9e174c6b777 didn't land in 22.0? can it land for 22.0.2 so I can submit lavapipe for conformance :)
<airlied> happy to open an MR if needed, but just seemed cc stable was on it
<dcbaker> Probably there was a issue I haven’t sorted out yet :) I’ve got a fairly large pile of patches to sort through right now