ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
Bill has joined #dri-devel
Bill is now known as Guest1818
Guest1818 has quit []
Company has quit [Remote host closed the connection]
sarnex has quit [Read error: Connection reset by peer]
sarnex has joined #dri-devel
mareko has quit [Ping timeout: 480 seconds]
dri-logger has quit [Ping timeout: 480 seconds]
glisse has quit [Ping timeout: 480 seconds]
quantum5 has quit [Quit: ZNC - https://znc.in]
quantum5 has joined #dri-devel
mareko has joined #dri-devel
dri-logger has joined #dri-devel
glisse has joined #dri-devel
<alyssa> karolherbst: I gave some more thought to the "62-bit generic pointers make a mess if you don't inline"
<alyssa> Here's a crude take
<alyssa> If a function has any _private arguments, we really want to inline, because we can likely copy prop away the scratch memory access entirely, and eliminating scratch should outweigh the cost of inlining
<alyssa> If a function has any local arguments, we probably also want to inline, since local mem access behaves differently from global/scratch mem in terms of performance characteristics (e.g. on AGX there's no need to insert a wait after reading from local mem)
<alyssa> also, I suspect that passing in a pointer to local mem to a generic ptr function is.. probably pretty rare in practice
<alyssa> so it might not matter what we do
<karolherbst> that all sounds fine in theory, but I have shaders with millions of SSA values
<alyssa> So then the heuristic is
<alyssa> inline if any arguments are private or local ptrs, and then whatever you didn't inline assume everything is _global
<karolherbst> we can't always inline everything, because there will be situations where we can't
<karolherbst> what if a 500 loc function is called 100 times
<alyssa> (I imagine it doesn't /quite/ work like that because generic ptrs mean we can't necessarily know the address spaces at compile-time. So might need to template stuff, but. shrug)
<alyssa> then you have a 50,000 line kernel \shrug/
dri-logg1r has joined #dri-devel
<karolherbst> well.. sure, but the point I was making here, I already have shaders with _millions_ of SSA values
<karolherbst> literally
<alyssa> I mean
<alyssa> if you have shaders executing millions of instructions I kinda feel like you're already toast?
<karolherbst> it's not millions of instructions
<karolherbst> it's just control flow
<alyssa> oh
<alyssa> meh?
<karolherbst> and very nasty one
<karolherbst> the thing is.. those shaders usually run if LLVM compiles them to AMD
<karolherbst> but on mesa you OOM your system
<karolherbst> so any heuristic where we always inline functions based on argument types won't work
glisse has quit [Ping timeout: 480 seconds]
<karolherbst> because what if that function is called bazillion times?
<karolherbst> then we are again toast
mareko has quit [Ping timeout: 480 seconds]
<karolherbst> some shaders also do switches on type parameters to call into certain functions and other funky bits
dri-logger has quit [Ping timeout: 480 seconds]
<karolherbst> like hand rolled function tables
<karolherbst> some of the compute kernels are just massive and wild
<karolherbst> but if we allow function calls, we can also just duplicate functions with generic arguments and call the variant we actually need
<karolherbst> might be better than if-else-ladders resolving generic pointers
<karolherbst> but we also kinda want to make use of hardware supporting generic pointers natively
<karolherbst> which is the best case and solves a lot of the pain points here
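The two positions above can be put side by side in a small sketch: alyssa's crude rule (inline when any pointer argument is private or local) and karolherbst's objection (a 500-line function called 100 times becomes a 50,000-line kernel). This is only an illustration, not Mesa code: the `Callee` type, the `size` and `call_sites` fields, and the `budget` threshold are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AddrSpace(Enum):
    GLOBAL = auto()
    LOCAL = auto()
    PRIVATE = auto()


@dataclass
class Callee:
    name: str
    arg_spaces: list   # address space of each pointer argument (hypothetical model)
    size: int          # rough body size, e.g. lines or instructions
    call_sites: int    # number of places the function is called from


def wants_inline(fn):
    """The crude rule: inline if any pointer argument is private or local."""
    return any(s in (AddrSpace.PRIVATE, AddrSpace.LOCAL) for s in fn.arg_spaces)


def inlined_size(fn):
    """Total code emitted if every call site gets its own copy of the body."""
    return fn.size * fn.call_sites


def should_inline(fn, budget):
    """The crude rule plus a size guard; `budget` is an assumed tunable."""
    return wants_inline(fn) and inlined_size(fn) <= budget


big = Callee("big_helper", [AddrSpace.PRIVATE], size=500, call_sites=100)
print(inlined_size(big))                  # 500 * 100 copies worth of code
print(should_inline(big, budget=10_000))  # the guard rejects it
```

Anything the guard rejects still needs some other way to resolve its pointer arguments, which is where the rest of the discussion goes.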
mareko has joined #dri-devel
yyds has joined #dri-devel
columbarius has joined #dri-devel
glisse has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
jewins has quit [Ping timeout: 480 seconds]
ungeskriptet0 has joined #dri-devel
<alyssa> sure
<alyssa> I suspect they're in the minority, though?
<alyssa> I mean, Mali does but it's deeply terrible
ungeskriptet has quit [Ping timeout: 480 seconds]
<alyssa> and honestly i'd be tempted to do 62-bit on mali
flynnjiang has quit [Remote host closed the connection]
flynnjiang has joined #dri-devel
<alyssa> karolherbst: FWIW, Apple claims that they force inline functions that read stack or constant mem
<alyssa> citing "SROA, Buffer preloading"
<DemiMarie> karolherbst: is this some sort of numerical algorithm or scientific computing code? If so, this would not surprise me at all.
quantum5 has quit [Quit: ZNC - https://znc.in]
flynnjiang has quit [Remote host closed the connection]
flynnjiang has joined #dri-devel
quantum5 has joined #dri-devel
flynnjiang has quit [Remote host closed the connection]
flynnjiang has joined #dri-devel
sassefa has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
<youmukonpaku1337> hey guys
<youmukonpaku1337> so im trying to do a bit of trickery with my mainlined ebook and get usb display
<youmukonpaku1337> and it WORKS but its using llvmpipe instead of lima and i get this
<youmukonpaku1337> libGL error: failed to load driver: gud
<youmukonpaku1337> libGL error: MESA-LOADER: failed to open gud: /usr/lib/dri/gud_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/arm-linux-gnueabihf/dri:\$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
<youmukonpaku1337> am i missing something? am using mesa from debian repos
<youmukonpaku1337> oh and the way i got GUD is very fucky (compiling out of tree with kernel headers) but it seems fine
<youmukonpaku1337> i have the module and theres a drm device at card1
<youmukonpaku1337> oh
<youmukonpaku1337> right
<youmukonpaku1337> i probs need gl4es lol
flynnjiang has quit [Ping timeout: 480 seconds]
flynnjiang has joined #dri-devel
<youmukonpaku1337> es2gears also works but uhh
<youmukonpaku1337> same err
youmukon1 has joined #dri-devel
youmukonpaku1337 is now known as Guest1834
youmukon1 is now known as youmukonpaku1337
Guest1834 has quit [Ping timeout: 480 seconds]
<kode54> I have to test a regression in ANV since 23.1.6
<kode54> It renders that game, The Spirit and The Mouse, into a colorful and flickery mess
ayaka_ has joined #dri-devel
dviola has quit [Quit: WeeChat 4.0.4]
sassefa has quit [Quit: sassefa]
crabbedhaloablut has joined #dri-devel
yuq825 has joined #dri-devel
yyds has quit [Quit: Lost terminal]
yyds has joined #dri-devel
yyds has quit []
yyds has joined #dri-devel
yyds has quit []
yyds has joined #dri-devel
Daanct12 has joined #dri-devel
flynnjiang has quit [Ping timeout: 480 seconds]
flynnjiang has joined #dri-devel
Duke`` has joined #dri-devel
ohmltb^ has quit [Remote host closed the connection]
ayaka_ has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
anarsoul has quit [Ping timeout: 480 seconds]
yyds has quit [Quit: Lost terminal]
flynnjiang has quit [Ping timeout: 480 seconds]
flynnjiang has joined #dri-devel
yyds has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
flynnjiang has quit [Ping timeout: 480 seconds]
flynnjiang has joined #dri-devel
camus has joined #dri-devel
flynnjiang has quit [Remote host closed the connection]
flynnjiang has joined #dri-devel
fab has joined #dri-devel
anarsoul has joined #dri-devel
junaid has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
junaid has quit [Remote host closed the connection]
youmukonpaku1337 has quit [Quit: WeeChat 4.0.4]
kzd has quit [Ping timeout: 480 seconds]
sima has joined #dri-devel
ayaka_ has joined #dri-devel
* airlied fails to get a gitlab container to run deqp tests locally, the docs don't seem to be up to date, or just don't tell you how to run deqp/piglit tests against a build
<airlied> or at least the docs explain builds, but not how to test already built artifacts
youmukonpaku1337 has joined #dri-devel
rasterman has joined #dri-devel
tzimmermann has joined #dri-devel
fab has quit [Ping timeout: 480 seconds]
mszyprow has joined #dri-devel
<airlied> okay hacked it around, and now the tests don't hit the assert in my container they hit in the CI one
<youmukonpaku1337> huh lima DOES have desktop gl
<youmukonpaku1337> why is mesa looking for a gud.so though
<youmukonpaku1337> *gud_dri.so
<youmukonpaku1337> what did i mess up lmao
<Sachiel> what the hell is gud?
<youmukonpaku1337> generic usb display
<youmukonpaku1337> essentially a way to get display output with a pi turned into a usb gadget
<youmukonpaku1337> it *works* (kinda) but mesa freaks out and spits this: libGL error: failed to load driver: gud
<youmukonpaku1337> libGL error: MESA-LOADER: failed to open gud: /usr/lib/dri/gud_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/arm-linux-gnueabihf/dri:\$${ORIGIN}/dri:/usr/lib/dri,
<Sachiel> oh, if that's using a specific kernel driver, then something that doesn't recognize it might be trying to find a userspace driver matching the name, thus the failed search for gud_dri.so
<youmukonpaku1337> suffix _dri)
<youmukonpaku1337> ah
<youmukonpaku1337> any way to make it not do that?
<youmukonpaku1337> but yea theres no userspace driver
<Sachiel> try MESA_LOADER_DRIVER_OVERRIDE=whateveryouexpecttowork
<youmukonpaku1337> oh true
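Sachiel's suggestion amounts to setting one environment variable before launching the GL client, so the loader stops looking for a userspace driver named after the kernel driver. A minimal sketch of doing that from a script follows. The override value `lima` and the `es2gears` command are assumptions taken from the surrounding conversation; Sachiel only gave a placeholder, and which driver name actually works on this setup is not established in the log.

```python
import os
import subprocess


def env_with_driver_override(driver, base_env=None):
    """Return a copy of the environment with MESA_LOADER_DRIVER_OVERRIDE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_LOADER_DRIVER_OVERRIDE"] = driver
    return env


def run_with_override(driver, argv):
    """Launch a GL client with the override applied to that process only."""
    return subprocess.run(argv, env=env_with_driver_override(driver), check=False)


# Example call (not executed here); "lima" is an assumed driver name:
# run_with_override("lima", ["es2gears"])
```

Setting the variable per process keeps the rest of the session on the default loader behaviour, which makes it easier to compare runs with and without the override.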
<youmukonpaku1337> also for some reason even with an override es2gears and glxgears run at 15fps
<youmukonpaku1337> ;-;
mszyprow has quit [Ping timeout: 480 seconds]
<youmukonpaku1337> i doubt the mali 400 is *that* bad
<youmukonpaku1337> maybe i should test with wayland instead of X
mszyprow has joined #dri-devel
<youmukonpaku1337> anyway i guess ill test once im home lol
<youmukonpaku1337> still kinda cool that im able to get any display output at all on an ebook
<youmukonpaku1337> using wifi pins for usb lol
<youmukonpaku1337> i changed out the about-to-short usb port for a breakout board now but its still about as cursed
<youmukonpaku1337> also had to compile GUD out of tree because it isnt enabled in linux-image-armmp :(
mszyprow has quit [Ping timeout: 480 seconds]
yyds has quit [Remote host closed the connection]
yyds has joined #dri-devel
<kode54> cool
<kode54> I found the bad commit, or commits
<kode54> it outright crashes on them
<kode54> I'm building a full debug build now to produce proper backtraces
<kode54> the thing I hate about debug builds of mesa is that this full build results in about a 2GB install footprint
<kode54> most of which is the debugging symbols package
An0num0us has joined #dri-devel
<kode54> it also takes upwards of 10-15 minutes for the strip/objcopy process that pulls the debug data off the binaries and stuffs it into a debug package
<kode54> the default mesa-tkg-git package config and script, the PKGBUILD hardcodes b_ndebug=true, and the config file defaults to --strip --buildtype release
itoral has joined #dri-devel
sghuge has quit [Remote host closed the connection]
<kode54> crap
<kode54> doesn't crash in debug build
sghuge has joined #dri-devel
<kode54> but it does have the rendering bugs
fab has joined #dri-devel
flynnjiang has quit [Ping timeout: 480 seconds]
flynnjiang has joined #dri-devel
mwk_ has quit [Ping timeout: 480 seconds]
youmukon1 has joined #dri-devel
youmukonpaku1337 has quit [Read error: Connection reset by peer]
Ahuj has joined #dri-devel
flynnjiang has quit [Ping timeout: 480 seconds]
flynnjiang has joined #dri-devel
mwk has joined #dri-devel
flynnjiang has quit [Ping timeout: 480 seconds]
flynnjiang has joined #dri-devel
Omax has quit [Ping timeout: 480 seconds]
Omax has joined #dri-devel
frieder has joined #dri-devel
swalker_ has joined #dri-devel
youmukonpaku1337 has joined #dri-devel
swalker_ is now known as Guest1860
swalker__ has joined #dri-devel
An0num0us has quit [Ping timeout: 480 seconds]
youmukon1 has quit [Read error: Connection reset by peer]
<karolherbst> DemiMarie: ray tracer
Guest1860 has quit [Ping timeout: 480 seconds]
<karolherbst> alyssa: I'm sure we could do it for _most_ functions, but we have to be mindful about how we do it all. There will be situations where we can't simply inline certain types of functions, because it would blow up the kernel. If inlining works for 99% of the applications, good, but we need a fallback for the 1%
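One fallback karolherbst mentioned earlier is to keep real function calls but duplicate a function per combination of argument address spaces and call the variant that is actually needed. The sketch below models only the bookkeeping: the number of copies is bounded by distinct address-space combinations rather than by call sites. The `Specializer` class and the name-mangling scheme are hypothetical, invented for illustration.

```python
class Specializer:
    """Hands out one clone name per (function, address-space combination)."""

    def __init__(self):
        self.variants = {}

    def variant_for(self, name, arg_spaces):
        key = (name, tuple(arg_spaces))
        if key not in self.variants:
            # Hypothetical mangling: append the address spaces to the name.
            self.variants[key] = name + "__" + "_".join(arg_spaces)
        return self.variants[key]


spec = Specializer()
# 100 call sites, but only two distinct address-space combinations.
for i in range(100):
    spaces = ["global", "private"] if i % 2 == 0 else ["global", "global"]
    spec.variant_for("trace_ray", spaces)

print(len(spec.variants))  # number of clones actually needed
```

This only covers call sites where the address spaces are known at compile time; truly generic pointers would still need a runtime check or native hardware support, as noted in the discussion.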
vliaskov has joined #dri-devel
youmukon1 has joined #dri-devel
mripard has joined #dri-devel
lynxeye has joined #dri-devel
<pq> youmukonpaku1337, I don't think that USB display drivers (that is, *not* USB-C DP alt mode) would support hardware rendered content (dmabuf), which is the reason why you'd get software rendering on GUD. There could be hardware+display specific exceptions, but I don't know about those.
mripard has quit []
youmukonpaku1337 has quit [Ping timeout: 480 seconds]
mripard has joined #dri-devel
<pq> youmukonpaku1337, a Wayland compositor could implement hardware rendering and then do a CPU copy into GUD's buffers, but I don't know if anyone implemented that.
<pq> oh right, Mutter does at least
<pq> youmukonpaku1337, a USB display driver is probably always going to shovel pixels with the CPU, so that will always hurt.
donaldrobson has joined #dri-devel
rgallaispou has joined #dri-devel
samuelig_ has quit []
samuelig has joined #dri-devel
Haaninjo has joined #dri-devel
youmukon1 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
flynnjiang has quit [Ping timeout: 480 seconds]
flynnjiang has joined #dri-devel
kts has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
danylo has quit [Quit: Ping timeout (120 seconds)]
danylo has joined #dri-devel
<youmukonpaku1337> pq: oh yea i see (btw is there a way to use mutter alone without gnome)
<pq> umm... mutter can run without gnome-shell, but I'm not sure how useful that is
<pq> other than testing
<youmukonpaku1337> pq: though i probs gotta test weston too, might work
<youmukonpaku1337> ~~as long as it doesnt use waaaay too much ram~~
<pq> who knows, maybe you could configure even Xorg to render with lima and copy to GUD...
<pq> I don't remember Weston having such copy you'd need, but it has had some multi-DRM-device patches I haven't really looked into what they do.
<pq> for Xorg, if you can get it to recognize both rendering and GUD devices, playing with xrandr --setprovideroutputsource / --setprovideroffloadsink
<pq> ..might do something maybe
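The xrandr experiment pq describes could be scripted roughly as below. The two option names come from pq's message and `--listproviders` is the standard way to see what Xorg recognized; the provider identifiers are placeholders, and whether the GUD and lima devices show up as providers at all is exactly the open question.

```python
import subprocess


def provider_commands(display_provider, render_provider):
    """Build the xrandr invocations; identifiers are whatever --listproviders reports."""
    return [
        ["xrandr", "--listproviders"],
        # Use render_provider as the source of images shown by display_provider.
        ["xrandr", "--setprovideroutputsource", display_provider, render_provider],
        # Use render_provider as a render offload device for display_provider.
        ["xrandr", "--setprovideroffloadsink", render_provider, display_provider],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Placeholder provider indices; not executed here.
# run_all(provider_commands("1", "0"))
```

If `--listproviders` shows only one provider, the other two commands have nothing to work with.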
Haaninjo has quit [Quit: Ex-Chat]
ced117 has quit [Ping timeout: 480 seconds]
flynnjiang has quit [Read error: Connection reset by peer]
ap51 has joined #dri-devel
flynnjiang has joined #dri-devel
turol_ has joined #dri-devel
<turol_> is it possible for non-developers to get rights to add tags to issues?
<turol_> labels, whatever gitlab calls them
kts has quit [Ping timeout: 480 seconds]
flynnjiang has quit [Ping timeout: 480 seconds]
simon-perretta-img has joined #dri-devel
tristan has joined #dri-devel
tristan is now known as Guest1874
<DavidHeidelberg[m]> to everyone who is manually running pipelines for testing their MR: We currently have too many rootfs images hitting the caches, please always rebase before running a pipeline, if you can.
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
pekkari has joined #dri-devel
alkisg_irc has joined #dri-devel
alkisg_irc is now known as alkisg
Company has joined #dri-devel
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
bmodem has joined #dri-devel
bbhtt- is now known as bbhtt
An0num0us has joined #dri-devel
padovan has joined #dri-devel
<alyssa> DavidHeidelberg[m]: rebase on upstream to pick up the latest image?
<turol_> alyssa: the nir if condition change also seems to apply to loops
<turol_> that caused a regression
<turol_> was it intended to apply to loops?
<turol_> issue 9750 if you want more details
<alyssa> uh oh
ayaka_ has quit [Remote host closed the connection]
ayaka_ has joined #dri-devel
<turol_> it triggered unrolling of a loop that previously wasn't
<alyssa> turol_: What's the regression?
<alyssa> Being able to unroll more loops is a good thing..
<turol_> causing increased register pressure and lowered subgroups per SIMD
<turol_> not when there's a texture read inside the loop
<alyssa> ok, but that's a deficiency in the loop unrolling heuristic then (deciding to unroll loops when it's not beneficial)
<alyssa> not the fault of last night's patch
<alyssa> and also, unrolling loops with a texture read inside may *still* be a win in practice?
<alyssa> you get lower occupancy, but you get more ILP to hide the latency, and might come out ahead despite the pipeline stats
<DavidHeidelberg[m]> alyssa: if you have an older MR which is using a .gitlab-ci/image-tags.yml that was produced a long time ago, the CI (until assigned to Marge) will use the old images
<DavidHeidelberg[m]> and the old images aren't usually that much cached
<alyssa> DavidHeidelberg[m]: +1, got it
<alyssa> thx
<alyssa> turol_: see the discussion in https://gitlab.freedesktop.org/mesa/mesa/-/issues/7161
<turol_> just tried, 142 fps unrolled, 146 fps not unrolled
<alyssa> OK. That's a more interesting statistic then
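To put turol_'s numbers in frame-time terms, the conversion is simple arithmetic. Note these are whole-frame rates from a single run as reported in the chat, so the difference is a rough indication rather than a measurement of the shader in isolation.

```python
def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


unrolled = frame_time_ms(142)   # loop unrolled
rolled = frame_time_ms(146)     # loop not unrolled
delta_ms = unrolled - rolled
slowdown_pct = (unrolled / rolled - 1.0) * 100.0

print(f"{unrolled:.2f} ms vs {rolled:.2f} ms, "
      f"+{delta_ms:.2f} ms ({slowdown_pct:.1f}% longer frames)")
# prints "7.04 ms vs 6.85 ms, +0.19 ms (2.8% longer frames)"
```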
<turol_> it's the slowest shader of SMAA
<turol_> the others are pretty simple
<turol_> it's actually a little bit infamous for causing issues in both spirv-tools and spirv-cross
<pendingchaos> I'm not sure this particular form of loop unrolling is a good idea (it's that weird nested if form) since it usually doesn't overlap iterations
<pendingchaos> but we would need LICM/GCM to fully replace it
<pendingchaos> it's not very beneficial for this particular shader (just doing LICM for a descriptor load)
youmukon1 has joined #dri-devel
<pendingchaos> (complex_unroll() in nir_opt_loop_unroll.c, I think)
youmukonpaku1337 has quit [Ping timeout: 480 seconds]
<turol_> on nvidia proprietary driver unrolling or not affects the binary size but not register count
<turol_> fps seems identical
<turol_> don't have other amds to easily test
<turol_> does someone have instructions for setting up a chroot/vm for compiling mesa for the steam deck?
<pendingchaos> you can use RADV_FORCE_FAMILY and Fossilize to look at how shaders compile for other gpus
<turol_> but that doesn't let me test the fps
<alyssa> pendingchaos: will NIR ever grow a dedicated LICM? or is that purely part of nir_opt_gcm?
<alyssa> (Every time I look at opt_gcm, it blows up my reg pressure and slows things down)
<pendingchaos> no idea
<turol_> and like i mentioned in the issue while i can fix this shader for myself that doesn't help everyone else who's used it in their proprietary game
<turol_> on the other hand in a more complicated render it's proportionately less important
<pendingchaos> maybe nir_opt_gcm can be modified so that it can only do LICM
<alyssa> pendingchaos: fair
<alyssa> the other case that comes up is duplicated stuff on both sides of an if
<alyssa> another case that's not nearly as problematic as gcm's usual thing of "move EVERYTHING!!"
<alyssa> but opt_gcm seems like a blunt hammer, idk
youmukon1 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
ayaka_ has quit [Ping timeout: 480 seconds]
pekkari has quit [Ping timeout: 480 seconds]
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
alkisg has left #dri-devel [#dri-devel]
yyds has quit [Remote host closed the connection]
Guest1874 has quit [Remote host closed the connection]
tristan has joined #dri-devel
tristan is now known as Guest1914
itoral has quit [Remote host closed the connection]
youmukonpaku1337 has quit [Remote host closed the connection]
hansg has joined #dri-devel
youmukonpaku1337 has joined #dri-devel
hansg has quit []
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
<pq> swick[m], what property setting ioctls did you refer to in the email?
<swick[m]> pq: DRM_IOCTL_MODE_SETPROPERTY, etc
<pq> why would you use those?
<swick[m]> to set the property of a connector?
<swick[m]> it's all hidden in libdrm
<pq> no, that's atomic commit ioctl
<pq> let's see...
<swick[m]> mhh, is it?
<pq> it wouldn't be atomic, if each property was set with a separate ioctl
<swick[m]> oh, you're right...
<swick[m]> I mean, it could still be atomic
<pq> atomic commit ioctl argument is struct drm_mode_atomic, and it seems to contain the whole lot.
<swick[m]> yes, it actually only issues one ioctl
<swick[m]> my bad
<pq> I thought I missed something :-)
<swick[m]> just saying, that's not a requirement for it to be atomic, just like in wayland where we built up state in the compositor and then start using it on a commit message
<pq> right, if DRM_IOCTL_MODE_SETPROPERTY staged stuff
mauld has quit [Ping timeout: 480 seconds]
<zamundaaa[m]> Please don't reinvent the atomic API in worse
youmukonpaku1337 has quit [Remote host closed the connection]
<pq> we're not
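The distinction pq and swick[m] settle on can be shown with a toy model: with the atomic API, property changes are accumulated in a userspace request (in libdrm, via `drmModeAtomicAddProperty`) and reach the kernel in a single commit ioctl (`drmModeAtomicCommit`), whereas a per-property ioctl applies each change on its own. The classes below are an illustration only and do not mirror the real libdrm or kernel data structures; the object and property names are made up.

```python
class ToyDevice:
    """Stand-in for the kernel side; counts how many 'ioctls' it receives."""

    def __init__(self):
        self.state = {}
        self.ioctl_count = 0

    def set_property(self, obj, prop, value):
        """Per-property path: one ioctl per change, applied immediately."""
        self.ioctl_count += 1
        self.state[(obj, prop)] = value

    def atomic_commit(self, request):
        """Atomic path: one ioctl carrying the whole staged request."""
        self.ioctl_count += 1
        self.state.update(request.staged)


class ToyAtomicRequest:
    """Userspace-side staging area; nothing reaches the device until commit."""

    def __init__(self):
        self.staged = {}

    def add_property(self, obj, prop, value):
        self.staged[(obj, prop)] = value


dev = ToyDevice()
req = ToyAtomicRequest()
req.add_property("connector-1", "Colorspace", "BT2020_YCC")
req.add_property("crtc-1", "ACTIVE", 1)
req.add_property("plane-1", "FB_ID", 42)
print(dev.ioctl_count, len(dev.state))   # nothing applied yet
dev.atomic_commit(req)
print(dev.ioctl_count, len(dev.state))   # one ioctl, all three properties
```

swick[m]'s side remark corresponds to a variant where `set_property` only stages and a later call applies everything, which is how Wayland surface state works.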
youmukonpaku1337 has joined #dri-devel
mauld has joined #dri-devel
Daanct12 has quit [Quit: WeeChat 4.0.4]
turol_ has quit [Quit: Leaving]
<pq> Is it so that KMS has no way of choosing BT.2100 ICtCp as video stream colorimetry?
<pq> not in v6.5 it seems
<swick[m]> do sinks support that?
<pq> I dunno, but CTA-861 defines it
<pq> no hits in linuxhw/EDID, so I guess not
<swick[m]> oh, in 861-H
<swick[m]> pretty new then
<pq> oh, yeah, I'm reading H, and wasn't there I already too?
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
<swick[m]> only YCbCr in 861-G
<pq> there is no RGB variant of it, is there?
<pq> or you mean BT2020_YCC?
<pq> swick[m], this reminds me, should the new color pipeline UAPI replace the automatic RGB/YCC selection from the start?
<swick[m]> yeah, bt 2020 YCC is defined in CTA-861-G already but not ICtCp
<swick[m]> it's only for the plane right now, so I don't think so
<pq> right, memory is slowly coming back
<pq> and it can be added later with "auto"
<pq> were the diagrams supposed to appear as rendered images in https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html ? I see source, e.g. Overview.
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
yyds has joined #dri-devel
ced117 has joined #dri-devel
Danct12 has quit [Read error: Connection reset by peer]
fab has quit [Quit: fab]
heat has joined #dri-devel
JTL has quit []
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
fab has joined #dri-devel
JTL has joined #dri-devel
hansg has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
Guest1914 has quit [Remote host closed the connection]
xzhan34_ has joined #dri-devel
xzhan34 has quit [Remote host closed the connection]
mripard has quit [Quit: mripard]
mripard has joined #dri-devel
agd5f has quit [Remote host closed the connection]
heat has quit [Remote host closed the connection]
ap51 has quit [Ping timeout: 480 seconds]
kzd has joined #dri-devel
agd5f has joined #dri-devel
<mareko> llvmpipe-traces times out randomly: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/48524383
<zmike> yeah something to do with new infra
tristan has joined #dri-devel
<zmike> being discussed in #freedesktop
tristan is now known as Guest1928
<karolherbst> jasuarez: I'm kinda looking into compute stuff for v3d, but I'm running into issues with fences. At least it looks like some aren't signaled and I wonder what's the best approach here to debug it
fab has quit [Ping timeout: 480 seconds]
mszyprow has joined #dri-devel
jewins has joined #dri-devel
pekkari has joined #dri-devel
<jasuarez> I never dealt with such issues so not sure what's the best approach
<jasuarez> I don't remember having anything special for that
<karolherbst> mhh.. maybe I'm doing something incorrectly, but I also don't see the GPU faulting, or at least nothing in dmesg
mripard has quit [Quit: mripard]
yuq825 has quit []
<karolherbst> jasuarez: do you know if all memory (a.k.a. pipe_resources) needs to be referenced before work can be launched/waited on or something odd like that? I'm currently not doing this, so maybe I want to figure out how to properly do it in v3d
Guest1928 has quit [Remote host closed the connection]
<karolherbst> but then again it's a bit odd to not see any errors
<karolherbst> yeah mhh.. doesn't seem to be it either
alpalcone has quit [Quit: WeeChat 3.8]
pekkari has quit [Quit: Konversation terminated!]
ap51 has joined #dri-devel
<jasuarez> Pretty sure Iago dealt with them when developing v3dv, but he is not connected now. I could ping him tomorrow
<karolherbst> cool
agd5f has quit [Remote host closed the connection]
tzimmermann has quit [Quit: Leaving]
agd5f has joined #dri-devel
pekkari has joined #dri-devel
yyds has quit [Remote host closed the connection]
frieder has quit [Remote host closed the connection]
tristan has joined #dri-devel
tristan is now known as Guest1931
pekkari has quit [Ping timeout: 480 seconds]
jessica_24 has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
bmodem has joined #dri-devel
junaid has joined #dri-devel
hansg has quit [Remote host closed the connection]
hansg has joined #dri-devel
<mareko> rustcuda when
<karolherbst> I wonder if layering HIP on CL is good enough here, at least that's my hope and that the project in question works out :D
<karolherbst> but I also don't know if AMD plans to stay compatible with CUDA forever or not
<youmukonpaku1337> amd is compatible with cuda??
<youmukonpaku1337> the hell
<mareko> youmukonpaku1337: it's called HIP
<karolherbst> well.. HIP is basically `s/cu/hip/` + some mistakes or something
<youmukonpaku1337> i see
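karolherbst's `s/cu/hip/` quip can be made concrete with a toy rename over CUDA runtime API identifiers. This is only an illustration of why the two APIs look so alike; it is not how AMD's hipify tools work, and real code has plenty of names and behaviours that do not map this mechanically (the "some mistakes" part).

```python
import re

# Matches CUDA runtime-style identifiers such as cudaMalloc or cudaMemcpyHostToDevice.
_CUDA_RUNTIME_NAME = re.compile(r"\bcuda([A-Z]\w*)")


def toy_hipify(source):
    """Rename cudaFoo identifiers to hipFoo; everything else is left untouched."""
    return _CUDA_RUNTIME_NAME.sub(r"hip\1", source)


snippet = "cudaMalloc(&p, n); cudaMemcpy(d, s, n, cudaMemcpyHostToDevice);"
print(toy_hipify(snippet))
# prints "hipMalloc(&p, n); hipMemcpy(d, s, n, hipMemcpyHostToDevice);"
```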
<mareko> I'm hearing nobody uses OpenCL
<karolherbst> yeah, hearing that a lot from AMD people
<youmukonpaku1337> yep thats true
<youmukonpaku1337> most stuff uses cuda
<karolherbst> yeah, but the reason is, that all the CL stacks were horrible in the past :D
<karolherbst> but yeah..
<karolherbst> at least there are a couple of companies still invested in CL.. anyway.. I think layering CUDA/HIP/whatever on top of CL or whatever is probably the best strategy here
<karolherbst> and such projects already exists
<youmukonpaku1337> yep that could work
<youmukonpaku1337> ~~the zink of opencl~~
<karolherbst> HIP on CL on zink on....
<mareko> .. glide
<zmike> nope shut it down
<karolherbst> *layers weren't supposed to be layered on top of layers*
<mareko> or zink on r600
<zmike> don't encourage them
<karolherbst> uhhh
<karolherbst> anyway... all those HIP on CL layers require insane extensions
<karolherbst> e.g. SVM
<karolherbst> :')
kts has joined #dri-devel
<youmukonpaku1337> anyway unrelated but why the hell does an e reader need a dedicated video encoder/decoder chip on the soc LOL
<karolherbst> mhh
<karolherbst> copyrighted embedded videos?
<youmukonpaku1337> am not complaining but its kind of funny
<youmukonpaku1337> nope
<karolherbst> (with DR)
<karolherbst> *DR
<youmukonpaku1337> the reader never plays any videos of sort
<karolherbst> ... my M is stuck
<karolherbst> huh.. weird
<karolherbst> adds?
<karolherbst> :D
<karolherbst> ehh
<karolherbst> ads
<youmukonpaku1337> its just the allwinner a13 has a cedar VPU and they couldnt be bothered to get another soc lol
<karolherbst> uhh
<youmukonpaku1337> ads are impossible, this device doesnt have wifi (by default at least)
<youmukonpaku1337> HOWEVER
<youmukonpaku1337> you can uh
<youmukonpaku1337> do something so utterly cursed
<karolherbst> like using the sound card?
<youmukonpaku1337> that its just plain insane
<youmukonpaku1337> karolherbst: no sound card to speak of
<mareko> lavacuda would be interesting, zink can help I'm sure
<karolherbst> sooo... there is this cuda driver API we could potentially implement
<karolherbst> which is libcuda.so
<karolherbst> but I have no idea how painful that would be
yyds has joined #dri-devel
<youmukonpaku1337> karolherbst: check out this monstrosity i made https://youmu.i-am-in-your.systems/MtWKwDaqmTye https://youmu.i-am-in-your.systems/fnpkpGMuwCpB
<youmukonpaku1337> so technically this system doesnt have wifi *right*
<youmukonpaku1337> BUT the usb interface used for it works and is very easy to use
<youmukonpaku1337> so its like a nice usb interface
fab has joined #dri-devel
<youmukonpaku1337> i should probs boost the 3.3v it provides to 5v
<youmukonpaku1337> instead of external power
<karolherbst> cursed
Guest1931 has quit [Ping timeout: 480 seconds]
<youmukonpaku1337> very
<youmukonpaku1337> but it works (horribly)
<youmukonpaku1337> i love compiling wifi drivers for an hour
<youmukonpaku1337> best pastime
<youmukonpaku1337> thank god i didnt have to cross comp mesa and lima is included in stock debian mesa lol
<youmukonpaku1337> i just hope that i can get desktop gl lima to work
swalker__ has quit [Remote host closed the connection]
<youmukonpaku1337> because if so this makes this actually workable instead of pure hell
Duke`` has joined #dri-devel
haagch has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
haagch has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
tristan has joined #dri-devel
Jeremy_Rand_Talos__ has quit [Remote host closed the connection]
Jeremy_Rand_Talos__ has joined #dri-devel
tristan is now known as Guest1936
Jeremy_Rand_Talos__ has quit [Remote host closed the connection]
Jeremy_Rand_Talos__ has joined #dri-devel
<gildekel> @pq @emersion Hi! I am currently going through review with Intel on a series that suggests a fix for complete link-training failures: in these cases, the effective bandwidth of a connector is set to 0Gbps, which will cause all its modes to be pruned in the next probing. The risk here is introducing a change that userspaces are not expecting. The intuition suggests that connectors without modes should be ignored...
<gildekel> I would love to get your input as weston/sway maintainers (hope I got it right)
<gildekel> And, needless to say, anyone else here who feels like this change is relevant to their product stability
Danct12 has joined #dri-devel
yyds has quit [Remote host closed the connection]
Guest1936 has quit [Ping timeout: 480 seconds]
<zamundaaa[m]> For KWin connectors with zero modes would be fine; this already happened in the past (don't remember in what circumstances though) so we have a workaround in place
<gildekel> That's good. The approach here is that upon link-training failure, userspace will get a uevent in which it will see the failed connector is "sterile", so ignoring it, or marking it in a bad state is the goal. At least that's what we would like to see in ChromeOS.
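The userspace behaviour gildekel is hoping for (and that zamundaaa[m] says KWin already has a workaround for) boils down to treating a connected connector with an empty mode list as unusable when re-probing after a uevent. A minimal sketch of that filter follows; the `Connector` type and its fields are hypothetical stand-ins, not the libdrm structures.

```python
from dataclasses import dataclass, field


@dataclass
class Connector:
    name: str
    connected: bool
    modes: list = field(default_factory=list)   # empty after all modes were pruned


def usable_connectors(connectors):
    """Keep only connectors that are connected and still expose at least one mode."""
    return [c for c in connectors if c.connected and c.modes]


probed = [
    Connector("DP-1", connected=True, modes=[]),            # link training failed
    Connector("HDMI-A-1", connected=True, modes=["1920x1080"]),
    Connector("DP-2", connected=False),
]
print([c.name for c in usable_connectors(probed)])
```

A compositor might instead keep the mode-less connector around and flag it as being in a bad state, which is the other option mentioned above.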
lynxeye has quit [Quit: Leaving.]
jljusten has quit [Quit: WeeChat 3.8]
jljusten has joined #dri-devel
kxkamil has quit []
anarsoul|2 has joined #dri-devel
anarsoul has quit [Read error: No route to host]
rasterman has joined #dri-devel
kxkamil has joined #dri-devel
alanc has quit [Remote host closed the connection]
Ahuj has quit [Ping timeout: 480 seconds]
agd5f has quit [Read error: Connection reset by peer]
agd5f has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
flto has quit [Read error: Connection reset by peer]
flto has joined #dri-devel
ungeskriptet0 has quit []
ungeskriptet0 has joined #dri-devel
mszyprow has quit [Ping timeout: 480 seconds]
alanc has joined #dri-devel
gouchi has joined #dri-devel
Haaninjo has joined #dri-devel
ap51 has quit [Ping timeout: 480 seconds]
<airlied> karolherbst: ptx parser in rust?
<HdkR> Don't even need to parse PTX, plenty of Switch emulators proved you can just take the raw ISA and translate it :P
<karolherbst> airlied: why not tho...
<airlied> HdkR: that assumes you have raw isa though
<karolherbst> I just think it makes more sense to have an open ecosystem besides CUDA, so opting into supporting CUDA is kinda a double-edged sword here
<airlied> but yeah sass to nir translator
<karolherbst> HdkR: right... we could even pattern match common lowerings, it shouldn't be all too hard
<karolherbst> for compute almost none of this exists anyway
<karolherbst> though cuda supports texgrad and other evil things :')
<airlied> also does nvidia have a documented calling convention to their sass kernels?
<karolherbst> uhm... no idea
<karolherbst> airlied: you mean sass kernels as in normal compute shaders?
<karolherbst> the elf binaries usually document all the constant buffers
<karolherbst> but not sure how flexible they are with that
<karolherbst> but there doesn't really exist any kind of calling convention here besides some internal data passed in via const buffers
<karolherbst> at fixed locations
<airlied> dont they have libs you link against, or is it just pre made kernels?
<karolherbst> they have some internal binaries and I'm sure there is some kind of calling convention for those, but I don't actually know what they are doing there
<karolherbst> anyway, nvidia does not want you to target SASS
<karolherbst> so I doubt they document anything
<karolherbst> and I'm sure I won't even be allowed to help out writing a SASS parser....
<karolherbst> or at least that might put me in an icky legal situation
rauji___ has joined #dri-devel
rasterman has quit [Quit: Gettin' stinky!]
fab has quit [Quit: fab]
junaid has quit [Remote host closed the connection]
youmukonpaku1337 has quit [Remote host closed the connection]
youmukonpaku1337 has joined #dri-devel
<jtatz[m]> cuobjdump can dump SASS, and the ISA is somewhat documented https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#instruction-set-reference
<jtatz[m]> Also for JIT kernels you can use CUPTI to grab it at runtime
<alyssa> considerably more docs than I was expecting, neat
<youmukonpaku1337> okay this is not working out
<youmukonpaku1337> cant change even resolution
<youmukonpaku1337> no gles2 in es2info with lima
<youmukonpaku1337> no desktop gl either
<youmukonpaku1337> and X is buggy as all hell
<youmukonpaku1337> and gud + lima arent in xrandr providers
<DemiMarie> youmukonpaku1337: try Wayland
<youmukonpaku1337> weston literally locks up the system
<youmukonpaku1337> so uh
<youmukonpaku1337> anything else
iive has joined #dri-devel
<youmukonpaku1337> DemiMarie: can i run mutter separate from gnome? it seems to support this kind of trickery
mszyprow has joined #dri-devel
<karolherbst> "somewhat documented" :D
<karolherbst> yeah...
<DemiMarie> youmukonpaku1337: Try Sway or KWin.
<youmukonpaku1337> kwin... definitely no
<youmukonpaku1337> am gonna try mutter first because pq mentioned it having support for trickery like what im doing
<DemiMarie> And report a kernel bug, because Weston should not lock up the system in a way that killing it cannot correct.
<youmukonpaku1337> i mean, system is more or less alive but it crashes GUD it seems
<DemiMarie> GUD?
Duke`` has quit [Ping timeout: 480 seconds]
melonai3 has quit []
<youmukonpaku1337> Generic USB Display
melonai3 has joined #dri-devel
melonai3 has quit []
melonai3 has joined #dri-devel
hansg has quit [Quit: Leaving]
crabbedhaloablut has quit []
<DemiMarie> Ah
<DemiMarie> Probably a kernel bug; I would report it to the relevant mailing lists.
<youmukonpaku1337> hmm
<youmukonpaku1337> how can i run mutter with PRIME
youmukon1 has joined #dri-devel
<youmukon1> youch
<youmukon1> 4fps under mutter with es2gears wayland
<youmukon1> also permission denied with kmscube
<youmukon1> oh nvm
<youmukon1> okay so i can get 2fps in kmscube if i run it on card0 which is software accelerated (and badly at that) GUD
<youmukon1> question is how can i use lima to render to card0
<youmukon1> if i do mesa loader override to use lima it just throws an invalid modeset argument error
youmukonpaku1337 has quit [Ping timeout: 480 seconds]
<Sachiel> mesa drivers expect to talk to their corresponding kernel driver, not just some random out of tree thing, so if you are having issues with some random out of tree thing, go ask their authors for support. I don't think you'll find much help here with that
<youmukon1> GUD is in mainline lol
<zmike> sounds like bugs
<youmukon1> ehhh
<youmukon1> its probably intended and it uses sw accel by default
<youmukon1> question is how do i make it offload to lima
<airlied> might have to hack gud kmsro, not sure if that would help
<youmukon1> what's kmsro
<youmukon1> as long as it isn't *too* difficult im fine with a little trickery
<airlied> kmsro is mesa internal thing to link accel and display drivers
<youmukon1> aha
<youmukon1> i see
<airlied> i think you might need to write code in mesa, but that is close to the limit of what i know about it
<youmukon1> oh fuck
<youmukon1> i have almost 0 knowledge of how to write C lol
<karolherbst> does setting `DRI_PRIME=1` help or did you already try that?
<youmukon1> tried that
<youmukon1> nope
<youmukon1> also
<youmukon1> kmscube shows renderer as mali400
<youmukon1> but uhh i somehow doubt that's right
<karolherbst> why not?
<youmukon1> 2.5 frames per second
<karolherbst> there might be a different reason it's so slow
<karolherbst> maybe it's CPU overhead
<youmukon1> hm
<karolherbst> the content of the frames kinda need to be copied over to the display driver
<karolherbst> and if there is no accelerated path for that the performance is kinda toast
<airlied> also copied over usb
<karolherbst> try LIBGL_ALWAYS_SOFTARE=1 and see if that changes anything
<youmukon1> thats possible but also es2 info and glxinfo list driver as llvmpipe
<youmukon1> oh
<youmukon1> will test
<youmukon1> in a sec
<karolherbst> LIBGL_ALWAYS_SOFTWARE=1 I mean
<karolherbst> but kmscube is kinda special
<karolherbst> there might be a different way for kmscube to use llvmpipe
<youmukon1> karolherbst: libgl always software makes kmscube throw "failed to set mode: invalid argument"
<karolherbst> fair
<youmukon1> it does show that it's using llvmpipe before that
<youmukon1> hm
<youmukon1> i would go the kmsro route but i have absolutely no idea how to program C so i suppose thats not an option lol
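The environment-variable experiments above (`DRI_PRIME=1`, `LIBGL_ALWAYS_SOFTWARE=1`) can be scripted so that each run gets exactly one override. This is only a rough sketch: `run_with_env` is a hypothetical helper, and the probe command merely echoes the variable back as a stand-in for something like kmscube.

```python
import os
import subprocess
import sys


def run_with_env(cmd, **overrides):
    """Run cmd with the current environment plus the given overrides (hypothetical helper)."""
    env = {**os.environ, **overrides}
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in command: print the variable the child process actually sees.
    probe = [
        sys.executable,
        "-c",
        "import os; print(os.environ.get('LIBGL_ALWAYS_SOFTWARE'))",
    ]
    result = run_with_env(probe, LIBGL_ALWAYS_SOFTWARE="1")
    print(result.stdout.strip())
```

Replacing the probe with the real test program, and comparing runs with and without each variable, keeps the comparison free of leftover shell state.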
<youmukon1> is there a way to test rendering speed headlessly?
<karolherbst> I think it's already working as intended
<karolherbst> it's just that the kernel driver doesn't provide what we need for proper offloading here
<karolherbst> at least that's my working theory
<karolherbst> did you check the CPU load?
<youmukon1> am about to do that
<karolherbst> and where it spends most of the CPU cycles at
<karolherbst> or rather what process uses most of the CPU
<youmukon1> kmscube was not using much cpu at all
<karolherbst> yeah.. so it's indeed not software rendering, or if it is, the bottleneck is something else
<youmukon1> and how would i go about finding it i guess?
<karolherbst> is your CPU busy nonetheless?
<youmukon1> what do you consider busy
<youmukon1> around 10% with htop open
<karolherbst> mhh, that's not much
<karolherbst> like total or is one core at 100%?
<youmukon1> theres a single core lol
<karolherbst> heh
<karolherbst> yeah.. so I guess something with the GUD driver and usb and... other things is why it's slow
<daniels> there’s no magic bullet here - the only possible solution (pipelining rendering) absolutely requires knowing C
<youmukon1> ugh
<daniels> fundamentally, you have a very slow GPU rendering to system memory, then copying back out over USB, and waiting for this to take effect, every frame
youmukon1 has quit [Read error: Connection reset by peer]
youmukonpaku1337 has joined #dri-devel
<youmukonpaku1337> yeah i can see why itd be slow...
<youmukonpaku1337> wait
<youmukonpaku1337> how can i check usb bandwidth?
<youmukonpaku1337> i may have a hunch something went horribly wrong and im running over usb1.1 bandwidth
<karolherbst> uhh.. that would be terrible indeed
<youmukonpaku1337> very
<karolherbst> but would the bandwidth be enough for displaying anything?
<glennk> lsusb -t should show the theoretical bandwidth for each port
<youmukonpaku1337> karolherbst: for a tty should be lol
<glennk> i'm also guessing this platform is stuck with single channel memory too?
<youmukonpaku1337> OH
<youmukonpaku1337> LMFAO
<youmukonpaku1337> it is running at 12mbit bandwidth
<youmukonpaku1337> the entire hub
<karolherbst> RIP
<youmukonpaku1337> ugh
<youmukonpaku1337> guess im gonna have to somehow use the main port
<glennk> all pixels, line up in single file...
<youmukonpaku1337> glennk: i mean i doubt it would have dual channel 256mb ram lol
<youmukonpaku1337> and no its single channel
<youmukonpaku1337> but yea
<glennk> gpu + cpu + usb memory access
<youmukonpaku1337> i think i found the problem lmao
<youmukonpaku1337> lemme try uh
<youmukonpaku1337> the main micro b port
<youmukonpaku1337> it didnt work before but you never know
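To put the 12 Mbit/s figure in perspective, a back-of-the-envelope bound on full-frame updates is sketched below. The 320x240 resolution and 16 bits per pixel are assumptions for illustration only (the log does not state the panel's mode), and the estimate ignores USB protocol overhead and any compression the display driver may apply.

```python
def max_fps(width, height, bits_per_pixel, link_mbit_per_s):
    """Upper bound on full-frame updates per second over a link.

    Ignores protocol overhead and compression, so real numbers will differ.
    """
    bits_per_frame = width * height * bits_per_pixel
    return (link_mbit_per_s * 1_000_000) / bits_per_frame


if __name__ == "__main__":
    # Hypothetical 320x240 panel at 16 bits per pixel.
    for name, mbit in [("12 Mbit/s (full speed)", 12), ("480 Mbit/s (high speed)", 480)]:
        print(f"{name}: at most {max_fps(320, 240, 16, mbit):.1f} fps")
```

Under those assumptions the full-speed link tops out at roughly ten uncompressed frames per second before any rendering or copy cost is counted, so a hub stuck at 12 Mbit/s is a plausible bottleneck on its own.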
mszyprow has quit [Remote host closed the connection]
mszyprow has joined #dri-devel
glennk has quit [Remote host closed the connection]
glennk has joined #dri-devel
<youmukonpaku1337> oh GREAT
<youmukonpaku1337> guess we're back to the roots of this
<youmukonpaku1337> ;-;
iive has quit [Quit: They came for me...]
<youmukonpaku1337> how can i check mode of a usb port?
<youmukonpaku1337> i do have dr_mode set to host in DT but i dont think it works lol
gouchi has quit [Quit: Quitte]
<glennk> cat /sys/bus/usb/devices/<device>/speed and version is one way
<youmukonpaku1337> mode as in peripheral or host
<youmukonpaku1337> not speed
<youmukonpaku1337> hm
rauji___ has quit []
<youmukonpaku1337> it should be host
<youmukonpaku1337> this is quite weird
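The sysfs check glennk mentions above can be scripted to list every device at once. This is a minimal sketch: `usb_speeds` is a hypothetical helper, and it relies only on the `speed` attribute under `/sys/bus/usb/devices`, which reports the negotiated rate in Mbit/s. It says nothing about host versus peripheral mode.

```python
from pathlib import Path


def usb_speeds(sysfs_root="/sys/bus/usb/devices"):
    """Map each USB device entry to its negotiated speed in Mbit/s (hypothetical helper)."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return {}  # not Linux, or sysfs not mounted
    speeds = {}
    for dev in sorted(root.iterdir()):
        speed_file = dev / "speed"
        # Interface entries such as "1-1:1.0" carry no speed attribute.
        if not speed_file.is_file():
            continue
        try:
            speeds[dev.name] = float(speed_file.read_text().strip())
        except (OSError, ValueError):
            continue  # unreadable or unexpected content; skip this entry
    return speeds


if __name__ == "__main__":
    for name, mbit in usb_speeds().items():
        print(f"{name}: {mbit:g} Mbit/s")
```

A device reporting 12 here is running at full speed, matching the `lsusb -t` result earlier in the log; 480 would indicate high speed.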
<alyssa> karolherbst: hmm, this is a spicy problem
<alyssa> I /really/ want read-only buffers to be read with load_global_constant, not load_global
<alyssa> Is that even a thing in CL? I guess _constant?
<alyssa> doesn't work with generic ptrs, though
<alyssa> ok I guess I can just not use generic ptrs, fine
sassefa has joined #dri-devel
sassefa has quit []
sassefa has joined #dri-devel
sassefa has quit []
sassefa has joined #dri-devel
sassefa has quit []
sassefa has joined #dri-devel
sassefa has quit []
sassefa has joined #dri-devel
sassefa has quit []
sassefa has joined #dri-devel
sassefa has quit []
sassefa has joined #dri-devel
ngcortes has joined #dri-devel
sassefa has quit [Read error: Connection reset by peer]
sassefa has joined #dri-devel
sassefa has quit []
<alyssa> OK, yeah, using constant does what I need. Cool
<alyssa> thanks
sassefa has joined #dri-devel
sassefa has quit []
sassefa has joined #dri-devel
sassefa has quit []
<karolherbst> alyssa: yeah.. I want to use ubos for those things in the near future
<alyssa> I don't =D
sassefa has joined #dri-devel
* alyssa flexes her agx hardware
sassefa has quit []
<karolherbst> ehh...
sassefa has joined #dri-devel
<karolherbst> it should at least use load_global_constant though I think
<alyssa> yeah I just wrote that patch
sassefa has quit []
<karolherbst> but some hardware benefits from those being actual ubos
<karolherbst> yeah... I think I have a patch like that somewhere as well
sassefa has joined #dri-devel
An0num0us has quit [Ping timeout: 480 seconds]
<karolherbst> I suspect you lower ubos to load_global_constant in agx?
<alyssa> yes
sassefa has quit []
<alyssa> in the gl driver
sassefa has joined #dri-devel
<karolherbst> do you load the descriptor at runtime?
sassefa has quit []
<alyssa> there is no descriptor
<karolherbst> I mean.. the actual address
sassefa has joined #dri-devel
<alyssa> it's pushed in
<karolherbst> mhhh
sassefa has quit []
<alyssa> AGX is literally the CL model
<karolherbst> right...
<alyssa> pass __constant pointers in and read em
sassefa has joined #dri-devel
sassefa has quit []
<karolherbst> yeah, that's fair, I'm just wondering if there is a significant overhead when using ubos and if we want to make that optional
sassefa has joined #dri-devel
<karolherbst> on nvidia it really should be an ubo e.g.
sassefa has quit []
sassefa has joined #dri-devel
<karolherbst> I probably don't even need a new cap for it, if the constant buffer size is below 1M it's probably a hardware thing :D
sassefa has quit []
sassefa has joined #dri-devel
sassefa has quit []
sassefa has joined #dri-devel
<karolherbst> `PIPE_CAP_MAX_SHADER_BUFFER_SIZE_UINT` is what I'm currently reporting as constant memory size
sassefa has quit []
sima has quit [Ping timeout: 480 seconds]
<karolherbst> alyssa: the thing on nvidia is, that you can literally use ubos as sources for alu instructions
<karolherbst> and they are as fast as gprs
<karolherbst> but I know that some drivers really just map them to raw memory
<karolherbst> so I guess this situation all warrants a flag (which st/mesa might also do in the far future if at all)
<karolherbst> though I think there are also robustness arguments, so drivers are supposed to bound check them?
Haaninjo has quit [Quit: Ex-Chat]
<alyssa> i can literally use arbitrarily complicated expressions on UBOs as sources for alu instructions as fast as gpr :~)
<karolherbst> sounds cursed
<alyssa> nir_opt_preamble
mszyprow has quit [Ping timeout: 480 seconds]
soreau has quit [Ping timeout: 480 seconds]
soreau has joined #dri-devel
<karolherbst> right.. but I guess you don't have like 1MB of space there
<karolherbst> anyway, you still bind it as a normal buffer
<karolherbst> not as some special ubo thing
<alyssa> 1KiB, so not quite as big as you ;)
<karolherbst> right.. I'm just wondering if it makes sense to add complexity to bind constant buffers as ubos or normal global memory depending on what the driver wants or if it's good enough to always bind them as ubos in the future
<alyssa> always UBOs is fine
<karolherbst> okay, cool
<emersion> gildekel: yeah i think 0 modes is a kernel bug
vliaskov has quit []