ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
ngcortes has quit [Remote host closed the connection]
<Lynne> airlied: okay, fixed all validation layer issues again except the ycbcr conversion one, which I hope isn't required by your radv implementation
<Lynne> I remember creating a sampler like that was like fighting with bureaucracy
<Lynne> currently, I get a segfault in radv_CmdDecodeVideoKHR
<airlied> Lynne: yeah it's pretty likely things would need a ycbcr conversion if you want to sample from the image at all
<airlied> Lynne: do you call CmdBeginVideoCoding
<airlied> and CmdControlVideoCoding before starting
<Lynne> actually no, I'll change that
<Lynne> no way we're using that awful GPU yuv conversion
<Lynne> that's the whole point of not using multi-plane textures, we're free to implement it our own way
<airlied> Lynne: okay as long as you don't want vulkan to sample it it's fine for now :)
<Lynne> wait, aren't all of those only needed for encoding?
<airlied> Lynne: nope
<airlied> you have to begin and control for decode as well
<airlied> you also have to pass the control reset flag on the first frame
<karolherbst> printf :( :( :(
<Lynne> but vulkan can already sample it
<airlied> Lynne: VUID-VkVideoCodingControlInfoKHR-flags-06518
<Lynne> why is it asking me for references and reference slots when it already knows about them from the main decode info?
<airlied> Lynne: because it's vulkan
<airlied> in theory the decode info can be a subset of the begin set
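The ordering rules airlied lists above can be written down as a small check. This is a sketch in Python, not real Vulkan: the command names are shortened strings, and `check_decode_stream` and the `reset` and `slots` keys are hypothetical. It encodes only the three rules stated in the conversation: begin before control and decode, a reset before the first decode, and decode slots being a subset of the begin slots.

```python
def check_decode_stream(cmds):
    """Return a list of problems found in a recorded command sequence.

    Each command is a (name, info) tuple. The names are shortened stand-ins
    for vkCmdBeginVideoCodingKHR, vkCmdControlVideoCodingKHR,
    vkCmdDecodeVideoKHR and vkCmdEndVideoCodingKHR; the dict keys are
    hypothetical and exist only for this sketch.
    """
    problems = []
    in_scope = False      # True between begin and end
    was_reset = False     # a control with the reset flag has been seen
    begin_slots = set()   # reference slots declared at begin time

    for name, info in cmds:
        if name == "begin":
            in_scope = True
            begin_slots = set(info.get("slots", ()))
        elif name == "control":
            if not in_scope:
                problems.append("control outside begin/end")
            if info.get("reset"):
                was_reset = True
        elif name == "decode":
            if not in_scope:
                problems.append("decode outside begin/end")
            if not was_reset:
                problems.append("decode before the session was reset")
            extra = set(info.get("slots", ())) - begin_slots
            if extra:
                problems.append(f"decode uses slots not in begin: {sorted(extra)}")
        elif name == "end":
            in_scope = False
    return problems


good = [("begin", {"slots": [0, 1]}),
        ("control", {"reset": True}),
        ("decode", {"slots": [0]}),
        ("end", {})]
bad = [("decode", {"slots": [2]})]

print(check_decode_stream(good))
print(check_decode_stream(bad))
```

The `bad` sequence trips all three rules at once, which is roughly the state Lynne's first attempt was in.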
<Lynne> okay, added all
<Lynne> segfaults in radv_CmdControlVideoCodingKHR now
iive has quit []
<Lynne> odd, because that doesn't take any pNext even
<Lynne> unless it's in encoding mode
<airlied> Lynne: okay you haven't bound memory yet
tzimmermann has quit [Ping timeout: 480 seconds]
<airlied> vkGetVideoSessionMemoryRequirementsKHR
<airlied> BindVideoSessionMemoryKHR
<airlied> Lynne: so vulkan has the user do the memory allocations for large things in the driver, at the moment radv hacks the DPB in via this mechanism and some others, will have to remove it
<Lynne> how many memoryrequirements should I ask for
<Lynne> (it's not an API-used managed heap, right?)
<Lynne> *API-user
<Lynne> oooh, it takes a pointer to a number
<airlied> Lynne: you call get function twice
<airlied> once to get the count, then again to fill it out
<airlied> then you allocate the memory, then you call the bind
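The count-then-fill convention airlied describes can be modeled in a few lines. This is a Python sketch only: `FakeVideoSession`, its methods and the sizes are made up for illustration and stand in for the real `vkGetVideoSessionMemoryRequirementsKHR` / bind entry points, whose exact structures are not reproduced here.

```python
class FakeVideoSession:
    """Stand-in for a driver object; the sizes are made-up numbers."""

    def __init__(self):
        self._reqs = [{"bind_index": 0, "size": 4096, "type_bits": 0b0011},
                      {"bind_index": 1, "size": 65536, "type_bits": 0b0100}]
        self.bound = {}

    def get_memory_requirements(self, out=None):
        """Mimic the count-then-fill convention.

        With out=None only the count is returned. Otherwise at most len(out)
        entries are written into out and the number written is returned.
        """
        if out is None:
            return len(self._reqs)
        n = min(len(out), len(self._reqs))
        for i in range(n):
            out[i] = dict(self._reqs[i])
        return n

    def bind_memory(self, binds):
        """Record which allocation backs each bind index."""
        for bind_index, memory in binds:
            self.bound[bind_index] = memory


session = FakeVideoSession()
count = session.get_memory_requirements()      # first call: how many?
reqs = [None] * count                          # the caller owns the array
session.get_memory_requirements(reqs)          # second call: fill it out
allocs = [(r["bind_index"], bytearray(r["size"])) for r in reqs]  # "allocate"
session.bind_memory(allocs)                    # then bind
```

The point of the two calls is that the caller, not the driver, owns the array being filled.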
tursulin has quit [Read error: Connection reset by peer]
<karolherbst> airlied: is clCreateProgramWithIL required in CL 3.0?
<karolherbst> ehh wait.. I have to supply the func pointer regardless
<karolherbst> nvm then
<Lynne> airlied: is VkVideoGetMemoryPropertiesKHR.memoryBindIndex equal to VkMemoryAllocateInfo.memoryTypeIndex?
<Lynne> never heard of a memory bind index before
<Lynne> or are users meant to search for a memory index via the VkVideoGetMemoryPropertiesKHR.requirements.typebits
mbrost has joined #dri-devel
garrison has joined #dri-devel
i-garrison has quit [Read error: Connection reset by peer]
<Lynne> oh, I get it
<Lynne> this is so ridiculously verbose
<airlied> Lynne: welcome to vulkan
<airlied> the memoryBindIndex is for passing back into the bind index
<Lynne> wow, it got even better, each VkVideoGetMemoryPropertiesKHR needs to have its own allocated pMemoryRequirements
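On the type-bits question: in Vulkan, a requirements structure normally carries a bitmask in which bit i means "memory type i is acceptable", and the application picks an index from it. A minimal Python sketch of that search follows; `pick_memory_type` and the `memory_types` list are hypothetical, and only the two flag values are taken from the real `VK_MEMORY_PROPERTY_*` enum.

```python
def pick_memory_type(type_bits, memory_types, wanted_flags):
    """Return the first memory type index allowed by type_bits whose
    property flags include every bit of wanted_flags, or None."""
    for index, flags in enumerate(memory_types):
        allowed = type_bits & (1 << index)
        if allowed and (flags & wanted_flags) == wanted_flags:
            return index
    return None


DEVICE_LOCAL = 0x1   # same value as VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
HOST_VISIBLE = 0x2   # same value as VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT

# Hypothetical device with three memory types.
memory_types = [HOST_VISIBLE, DEVICE_LOCAL, DEVICE_LOCAL | HOST_VISIBLE]

print(pick_memory_type(0b110, memory_types, DEVICE_LOCAL))  # 1
print(pick_memory_type(0b001, memory_types, DEVICE_LOCAL))  # None
```

The bind index is a separate number: it only identifies which of the session's memory requirements a given allocation satisfies.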
<Lynne> airlied: okay, I allocate and bind memory successfully, still a segfault in control
mbrost has quit [Ping timeout: 480 seconds]
<Lynne> wow, it's requesting 1.5 gigabytes of memory for video decoding, all in bind index 0
<Lynne> that's sorta large for a 1920x1080 video
<airlied> Lynne: did you call Begin before Control?
<Lynne> yup, calling begin before control, segfault in radv_CmdDecodeVideoKHR
<airlied> there is a large hw context and a dpb
camus1 has quit []
camus has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
co1umbarius has joined #dri-devel
<airlied> Lynne: okay so the dst image view is crashing it now
columbarius has quit [Ping timeout: 480 seconds]
<Lynne> huh, nothing odd about that from what I can see
<karolherbst> for CL 3.0: Pass 2011 Fails 145 Crashes 14 Timeouts 1: 100% :)
<airlied> frame_info->dstPictureResource.imageViewBinding
<karolherbst> Did I surpass clover already?
<airlied> karolherbst: run a clover run :-P
<karolherbst> airlied: before I do that, I'd implement printf and just get 55 more passes :D
<karolherbst> non_uniform_work_group uhh
<Lynne> the imageview should exist, don't know why it would trip the driver up
<Lynne> getting very late here, going to get some sleep
<airlied> Lynne: cool, pick up again sometime
<Lynne> tomorrow, though likely all of the day will be eaten away trying to install nvidia's drivers, only to probably end up in failure
* airlied is Saturday tomorrow :-)
<Lynne> err, later today is what I meant
<airlied> Lynne: also are you building mesa with debug symbols? because you can see easily what crashes :-)
mclasen has quit [Ping timeout: 480 seconds]
<Lynne> rvcn_dec_message_decode
<Lynne> looks like it's past the vulkan section
khfeng has joined #dri-devel
<airlied> Lynne: you never fill in the dstPictureResource
<airlied> that has to be there along with the dpb info
<deathmist> building https://github.com/KhronosGroup/VK-GL-CTS on my target device and it seems LTO is enabled which makes it impossible for me to link it with ~5.5 GB of free memory, any idea how to disable this?
soreau has quit [Read error: Connection reset by peer]
soreau has joined #dri-devel
<mattst88> deathmist: how is LTO getting enabled? maybe try a debug build?
<Lynne> airlied: bit further now, segfaults in get_h264_msg
<airlied> Lynne: video session parameters done?
<airlied> pBeginInfo->videoSessionParameters
<karolherbst> ehh we have PIPE_COMPUTE_CAP_SUBGROUP_SIZE but it's not used by anything outside CL
<karolherbst> but how every driver always sets those caps
<Lynne> airlied: NULL, the spec said they may be NULL if they're not "applicable"
<airlied> Lynne: they contained the sps/pps params so definitely applicable for h264 :-)
<airlied> VkVideoDecodeH264SessionParametersAddInfoEXT
nchery has quit [Ping timeout: 480 seconds]
illwieckz has quit [Ping timeout: 480 seconds]
<Lynne> oh, so that's where all of that info goes
<Lynne> looks like I'll have to make a new one per frame
illwieckz has joined #dri-devel
mclasen has joined #dri-devel
<Lynne> nice, both of them use fully standard names from the spec
Danct12 has joined #dri-devel
illwieckz has quit [Ping timeout: 480 seconds]
<deathmist> mattst88: not quite sure, debug build doesn't change it from trying to LTO e.g. deqp-vk towards the end and get ld SIGKILLed
illwieckz has joined #dri-devel
Danct12 has quit [Quit: Quit]
<airlied> daniels: I don't think cts itself turns lto on
<airlied> oops
<airlied> deathmist: ^
<airlied> must be from distro or some dependency
LexSfX has quit []
LexSfX has joined #dri-devel
<Lynne> airlied: seems to be working now
<Lynne> no crashes at least, having some hanging issues with our threaded decoder
<Lynne> (we thread hardware decoders too because they're just a thin layer on top of our regular parser/decoder)
mbrost has joined #dri-devel
<airlied> Lynne: nice!
mhenning has quit [Quit: mhenning]
mbrost has quit [Remote host closed the connection]
mbrost has joined #dri-devel
garrison has quit []
i-garrison has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
sdutt has quit [Read error: Connection reset by peer]
sdutt has joined #dri-devel
Duke`` has joined #dri-devel
pnowack has joined #dri-devel
mclasen has quit [Ping timeout: 480 seconds]
heat has quit [Ping timeout: 480 seconds]
thellstrom has joined #dri-devel
rgallaispou has quit [Read error: Connection reset by peer]
thellstrom has quit [Ping timeout: 480 seconds]
LexSfX has quit []
Duke`` has quit [Ping timeout: 480 seconds]
illwieckz_ has joined #dri-devel
itoral has joined #dri-devel
illwieckz has quit [Ping timeout: 480 seconds]
danvet has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
ahajda has joined #dri-devel
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
lplc has quit [Remote host closed the connection]
mvlad has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
lplc has joined #dri-devel
itoral has quit [Remote host closed the connection]
nchery has joined #dri-devel
lplc has left #dri-devel [#dri-devel]
itoral has joined #dri-devel
lplc has joined #dri-devel
tursulin has joined #dri-devel
jkrzyszt has joined #dri-devel
illwieckz_ has quit [Ping timeout: 480 seconds]
frankbinns has joined #dri-devel
illwieckz_ has joined #dri-devel
lynxeye has joined #dri-devel
rasterman has joined #dri-devel
kj has joined #dri-devel
tzimmermann has joined #dri-devel
illwieckz_ has quit [Ping timeout: 480 seconds]
rgallaispou has joined #dri-devel
illwieckz_ has joined #dri-devel
ppascher has quit [Quit: Gateway shutdown]
kts has joined #dri-devel
ppascher has joined #dri-devel
ppascher has quit [Read error: Connection reset by peer]
ppascher has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
MajorBiscuit has joined #dri-devel
pcercuei has joined #dri-devel
sdutt has quit [Ping timeout: 480 seconds]
<danvet> tzimmermann, mripard_ mlankhorst_ just pushed another patch to drm-misc-fixes
<danvet> probably needs some merging if there's no -rc9 to make sure it's not lost
<danvet> s/just pushed/will push shortly/ the script is still running :-)
<airlied> worth bouncing to Linus?
<danvet> nah
<danvet> minimal fix for a panel driver
<danvet> which landed in 5.15
<danvet> if I get it right it's more actually new hw support, it fixes a case where a vcc is optional
<danvet> so if your dt has that, nothing goes boom at all
flacks has quit [Quit: Quitter]
flacks has joined #dri-devel
nchery is now known as Guest2464
nchery has joined #dri-devel
dliviu has quit [Ping timeout: 480 seconds]
rkanwal has joined #dri-devel
dliviu has joined #dri-devel
Guest2464 has quit [Ping timeout: 480 seconds]
lemonzest has joined #dri-devel
rgallaispou has quit [Read error: Connection reset by peer]
illwieckz_ has quit [Ping timeout: 480 seconds]
<tzimmermann> danvet, ok
rgallaispou has joined #dri-devel
illwieckz_ has joined #dri-devel
tobiasjakobi has joined #dri-devel
mclasen has joined #dri-devel
sagar__ has quit [Remote host closed the connection]
sagar__ has joined #dri-devel
JohnnyonFlame has joined #dri-devel
tobiasjakobi has quit []
rgallaispou has quit [Ping timeout: 480 seconds]
sagar__ has quit [Remote host closed the connection]
sagar__ has joined #dri-devel
pH5 has quit [Ping timeout: 480 seconds]
pH5 has joined #dri-devel
<bbrezillon> jekstrand: any good reason for not having a generic GetPhysicalDeviceQueueFamilyProperties -> GetPhysicalDeviceQueueFamilyProperties2 wrapper in vk_physical_device.c ?
ahajda_ has joined #dri-devel
ahajda has quit [Read error: Connection reset by peer]
rkanwal has quit [Quit: rkanwal]
rkanwal has joined #dri-devel
rkanwal has quit []
rkanwal has joined #dri-devel
itoral has quit [Remote host closed the connection]
rkanwal has quit []
rkanwal has joined #dri-devel
rgallaispou has joined #dri-devel
<marex> mripard_: hey, can you look at the remaining two icn6211 patches so we can wrap it up and finally have a driver which does not burn displays ?
mclasen has quit []
mclasen has joined #dri-devel
krushia has quit [Ping timeout: 480 seconds]
mclasen has quit []
mclasen has joined #dri-devel
<jekstrand> bbrezillon: I've not typed it yet?
<jekstrand> bbrezillon: I can whack one out quick. It'll only take a few minutes.
sdutt has joined #dri-devel
alarumbe has quit [Remote host closed the connection]
fxkamd has joined #dri-devel
illwieckz__ has joined #dri-devel
mripard_ has quit []
mripard has joined #dri-devel
<mripard> marex: I asked for two changes, you did neither. What kind of review do you expect?
<mripard> and if those patches are that important, shouldn't they have at least a fixes tag?
<karolherbst> ehh memory leaks :(
alarumbe has joined #dri-devel
<mripard> marex: and if you really don't like my review, feel free to ask somebody else
illwieckz_ has quit [Ping timeout: 480 seconds]
<bbrezillon> jekstrand: already did
<jekstrand> bbrezillon: Oh... I just finished typing mine and changes for all the drivers. :-/
<jekstrand> bbrezillon: Is yours in a branch somewhere?
<bbrezillon> not yet
<bbrezillon> and I didn't patch all drivers anyway
<bbrezillon> so go ahead with yours, I'll catch up
fxkamd has quit []
<jekstrand> Doing a full CI run now but I expect it'll be fine
<jekstrand> Yeah, crucible runs. It's fine.
mattrope has joined #dri-devel
<jekstrand> Also found and fixed a lavapipe bug while I was at it for bonus goodness. :D
<bbrezillon> jekstrand: thx
mbrost has joined #dri-devel
Haaninjo has joined #dri-devel
<karolherbst> is there a way to figure out how much memory a nir uses?
pnowack has quit [Remote host closed the connection]
pnowack has joined #dri-devel
<jekstrand> karolherbst: nope
<karolherbst> but I think I found my issue
<jekstrand> karolherbst: Well, what do you mean by how much memory?
<karolherbst> I never call nir_sweep :O
<jekstrand> oops
<karolherbst> got a few really high memory spikes in the tests doing libclc stuff :D
<jekstrand> Yeah, good idea to call that every now and again. :D
<jekstrand> Oh, yeah, you want to sweep right after inlining libclc.
rgallaispou has quit [Read error: Connection reset by peer]
<ccr> wouldn't nir_vacuum() be more efficient?
<karolherbst> jekstrand: yeah.. currently looking into when it makes sense
<jekstrand> ccr: Nah, it damages the rugs.
<karolherbst> 2.9 GB :O
<ccr> nir_dedust() for that gentle touch
<karolherbst> 2.9 GB -> 2.4GB with a nir_sweep after all passes mhhh
khfeng has quit [Ping timeout: 480 seconds]
<jekstrand> karolherbst: That's entirely possible depending on how big the shader is. Does seem a bit on the large side. Maybe you're leaking something?
<karolherbst> not really
<karolherbst> maybe
<karolherbst> dunno
<karolherbst> valgrind doesn't report much
<karolherbst> "possibly lost: 112,689,646 bytes in 639,007 blocks" soo maybe? dunno
<karolherbst> but that could be ralloc
<karolherbst> jekstrand: ahhh...
<karolherbst> I found it
<karolherbst> running nir_sweep before/after every pass does make a huge difference
<karolherbst> 2.9 GB -> 190MB
<karolherbst> ehh
<karolherbst> 1.9 GB
<karolherbst> but it goes back to 190MB later in the process
<karolherbst> just have to figure out when to run nir_sweep
* jekstrand should probably look at rusticl again today
<jekstrand> I've been saying I'll do it tomorrow all week
paulk has joined #dri-devel
<karolherbst> jekstrand: yeah.. seems like once after inlining is enough :)
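Why a single sweep after inlining helps can be shown with a toy model. This Python sketch is not how Mesa implements `nir_sweep` (the real pass works on ralloc contexts); `Arena` and the block names are hypothetical, and it only models the idea that everything allocated under the shader stays alive until something walks the live data and frees the rest.

```python
class Arena:
    """Toy allocation context: everything allocated stays alive until swept."""

    def __init__(self):
        self.blocks = set()

    def alloc(self, name):
        self.blocks.add(name)
        return name

    def sweep(self, roots, edges):
        """Keep only blocks reachable from roots via edges; free the rest."""
        live, stack = set(), list(roots)
        while stack:
            block = stack.pop()
            if block in live:
                continue
            live.add(block)
            stack.extend(edges.get(block, ()))
        freed = self.blocks - live
        self.blocks &= live
        return freed


arena = Arena()
for name in ["entry", "acos_inlined", "acos_original", "wrapper"]:
    arena.alloc(name)

# After inlining, only the entry point and its inlined body are referenced.
edges = {"entry": ["acos_inlined"]}
freed = arena.sweep(roots=["entry"], edges=edges)
print(sorted(freed))         # ['acos_original', 'wrapper']
print(sorted(arena.blocks))  # ['acos_inlined', 'entry']
```

Without the sweep, the no-longer-referenced copies stay allocated for the lifetime of the context, which matches the memory staying high until later in the process.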
* karolherbst hopes systemd-oomd won't reap ssh sessions anymore
<karolherbst> jekstrand: we still have a massive peak but I doubt we can do much about it
<karolherbst> so in htop that one test moves between 1% and 8%
<karolherbst> 1% after done with compiling
<karolherbst> but at least it goes down now :)
<karolherbst> but lets see what valgrind has to say now
<jekstrand> :)
<karolherbst> jekstrand: maybe we want to specify an entry point when doing inlining
<karolherbst> so it only inlines inside this one function
<karolherbst> or have a flag for "only inside the entry point"
<jekstrand> karolherbst: Or dead code everything else before we get to that point.
<karolherbst> jekstrand: I think the issue is, that we inline everything multiple times
<karolherbst> so even if something is called in the entry point, the copy might also have inlined everything already
<karolherbst> so we have that twice
<karolherbst> and I do DCE everything
<jekstrand> karolherbst: Typically, you inline then delete everything except the one entrypoint you care about.
<karolherbst> sure.. but those shaders can be huge
<jekstrand> If we do CLC after that, then there shouldn't be a problem.
<karolherbst> and can have a deep call tree
<karolherbst> and if we inline everything multiple times?
<Lynne> airlied: I got it working, but I'm getting mojibake output: https://0x0.st/oNZM.jpg
<karolherbst> ohhh wait
<karolherbst> huh
<jekstrand> You can lose either way. If you have a single function called N times which has a libclc call in it, then doing libclc first means you only do the libclc inline once. On the other hand, if you've got piles of dead functions, doing libclc late means you don't inline into all those.
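jekstrand's trade-off can be put into numbers with a deliberately crude model. In this Python sketch, `libclc_inline_ops` is hypothetical, call trees are assumed to be one level deep, and the only thing counted is how many times a libclc body gets pasted somewhere.

```python
def libclc_inline_ops(functions, order):
    """Count how many times a libclc body gets inlined.

    functions maps a name to (calls_from_entry, has_libclc_call); a function
    with calls_from_entry == 0 is dead. order is "early" (inline libclc into
    every function first) or "late" (inline user functions into the entry
    point, drop the rest, then inline libclc).
    """
    ops = 0
    for calls, has_libclc in functions.values():
        if not has_libclc:
            continue
        if order == "early":
            ops += 1        # once per function, dead or not
        else:
            ops += calls    # once per inlined copy; dead functions cost nothing
    return ops


hot = {"f": (8, True)}                   # one helper called 8 times
mostly_dead = {"f": (1, True), "d1": (0, True),
               "d2": (0, True), "d3": (0, True)}

print(libclc_inline_ops(hot, "early"), libclc_inline_ops(hot, "late"))
print(libclc_inline_ops(mostly_dead, "early"), libclc_inline_ops(mostly_dead, "late"))
```

The first case favors doing libclc early (1 inline versus 8), the second favors doing it late (4 versus 1), which is the "you can lose either way" point.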
<karolherbst> no
<karolherbst> that won't solve the issue
<karolherbst> or well..
<karolherbst> it solves a different one
<karolherbst> the test only has a single entry point to begin with
<karolherbst> jekstrand: I don't have dead functions
<karolherbst> it's one kernel, doing one single alu call
<karolherbst> but the thing getting inlined from libclc can be massive
<jekstrand> Yeah....
<karolherbst> I will print the shader after inlining and see what we have there
<karolherbst> but I suspect it's just multiple inlines having inlined the same thing, just less deep in the call stack
<karolherbst> _wow_
<karolherbst> that's massive
sdutt has quit []
sdutt has joined #dri-devel
<jekstrand> karolherbst: It may help to do more optimizations on libclc before we use it for lowering.
<karolherbst> yeah
<karolherbst> anyway, the nir is 17k lines :)
<karolherbst> just for acos
<jekstrand> That's big but it's not 2.5G big
<jekstrand> Oof
<jekstrand> Is acos using a lookup table?
<karolherbst> well, that's just the vec16 fp32 version
<karolherbst> it compiles for every vec + double + relaxed fp32
<karolherbst> uhm
<karolherbst> *
<karolherbst> jekstrand: yeah
<jekstrand> karolherbst: Are we lowering that lookup table to a giant pile of bcsel?
<karolherbst> not sure
<karolherbst> I just know that libclc uses lookup tables for some stuff
<karolherbst> "decl_var INTERP_MODE_NONE float phi@1116"
<karolherbst> I am not sure if I should cry or be very impressed
<karolherbst> jekstrand: ahh...
<karolherbst> so ehh
<karolherbst> I was partly right
<karolherbst> the wrapper contains everything and the original func
<karolherbst> so we just double the stuff
<karolherbst> mhh
<karolherbst> maybe we should do inline once, then repeate with clc
<karolherbst> *repeat
<karolherbst> I totally forgot about the wrapper
<dschuermann> jekstrand: karolherbst: any objections on renaming ffma->fmad for glsl fma? for opencl, I think we should introduce a separate nir opcode (then ffma) and the current implementation aligns more with the glsl version
<karolherbst> dschuermann: there is no difference between glsl and CL fma really. glsl just allows using fmad
<karolherbst> so for glsl it's more like ffma or fmad and for CL you have to use ffma
<dschuermann> then we need to split precision and consistency for our "exact" ALU flag
<karolherbst> but the CL one isn't more precise than the glsl one
<karolherbst> dschuermann: yeah.. something like that
<karolherbst> but I don't think nir is the problem here, there was some funkiness in glsl land if I remember correctly
<karolherbst> maybe not
<karolherbst> not sure
<jekstrand> dschuermann: I've thought about it and maybe even proposed it once.
<dschuermann> guess we just have to decide if we go with separate flags or separate opcodes
<jekstrand> dschuermann: IIRC, LLVM does something like that. They have one opcode that's sloppy and can become fma and another that's actual fma
<jekstrand> dschuermann: I don't want to plumb through a new flag if we can avoid it
<dschuermann> but the alternative potentially duplicates a lot
<jekstrand> yeah
<karolherbst> jekstrand: soo uhm.. inlining without libclc crashes currently :/
<karolherbst> ohh
<karolherbst> I remember
<karolherbst> glsl inexact fma -> mad, glsl exact fma -> fma
<karolherbst> I think that was the idea I or somebody else had
<karolherbst> not sure if we do that already or if somebody wrote a patch for it
<jekstrand> dschuermann: For the stuff in opt_algebraic that's duplicated between fma and fmad, we could do a `for op in ['ffma', 'fmad']:` and de-dup it.
<jekstrand> dschuermann: For a lot of other stuff like mad-fusion, we only want it to operate on fmad anyway.
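The de-duplication jekstrand suggests would look roughly like the following. This is a Python sketch in the tuple style of NIR's algebraic pattern tables, not a copy of them: the rules are illustrative only, float corner cases such as NaN and Inf are ignored, and `fmad` is the opcode being proposed in this conversation, not an existing one.

```python
a, b, c = 'a', 'b', 'c'   # pattern variables are plain strings in this sketch

# Rules that are valid for both the fused and the "either way" opcode are
# generated once per opcode instead of being written out twice.
shared = []
for op in ['ffma', 'fmad']:
    shared += [
        ((op, 0.0, a, b), b),                  # 0*a + b -> b
        ((op, a, 0.0, b), b),                  # a*0 + b -> b
        ((op, a, 1.0, b), ('fadd', a, b)),     # a*1 + b -> a + b
    ]

# Fusing a separate multiply and add only targets the sloppy opcode.
fmad_only = [
    (('fadd', ('fmul', a, b), c), ('fmad', a, b, c)),
]

for search, replace in shared + fmad_only:
    print(search, '->', replace)
```

The loop keeps the shared rules in one place, while anything that would change rounding stays out of the strict `ffma` path.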
<karolherbst> there was also some hardware around where using fmad is faster than ffma
<karolherbst> the thing about exact in glsl is, that it only demans consistency
<karolherbst> *demands
<dschuermann> yes, I proposed that in your MR ;) but giving it a second thought, not sure if the idea glsl exact fma -> ~fma does any good
<karolherbst> so you can replace all exact ffmas with fmads if you are always doing it
<karolherbst> so maybe the driver should be able to tell a preference for "glsl fma" and we just go with it?
<jekstrand> dschuermann: which MR?
<dschuermann> if we say that glsl fma is always fmad, then the backend can just pick whatever as long as consistent
<jekstrand> yes and no
<jekstrand> It depends on how you handle fma
<jekstrand> ugh...
<karolherbst> ohh right, I had an MR
<dschuermann> hm, the issue is probably exact a*b+c which could be fmad if the backend emits unfused :/
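The reason fused versus unfused matters for "exact" is that the two can give different answers for the same inputs. A small Python sketch: it uses doubles, not the fp32 being discussed, and emulates a fused multiply-add with exact rational arithmetic so it does not depend on any particular `fma` library call. The function names are hypothetical.

```python
from fractions import Fraction


def mad_unfused(a, b, c):
    """Multiply, round, then add and round again (two roundings)."""
    return a * b + c


def fma_fused(a, b, c):
    """Compute a*b + c exactly, then round once to the nearest double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Exactly, a*b = 1 - 2**-60. The unfused version rounds that product to 1.0
# before adding c, so the small term is lost.
print(mad_unfused(a, b, c))               # 0.0
print(fma_fused(a, b, c) == -2.0**-60)    # True
```

If a backend picks one behavior in one place and the other elsewhere, two evaluations of the same expression can disagree, which is what the consistency requirement is about.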
rgallaispou has joined #dri-devel
<karolherbst> yeah...
<karolherbst> well.. some hw even has both
<karolherbst> some.. only one
<karolherbst> I think nvidia always had only one, but they changed what it means
<dschuermann> AMD has every possible combination depending on the generation ;)
<dschuermann> of fast ffma, slow ffma, fast fmad, no fmad
<karolherbst> :D
<karolherbst> yeah, but slow ffma is only really useful for CL
<dschuermann> yeah, we don't use it all currently, but who knows what brings the future :)
<karolherbst> jekstrand: ohh.. now I know why I didn't catch that earlier.. I've disabled asserts and stuff to speed things up, but now I don't hit a bug anymore :)
<karolherbst> math_brute_force/test_bruteforce fract has some... bugs
<karolherbst> test_bruteforce: ../src/compiler/nir/nir_deref.c:990: opt_restrict_deref_modes: Assertion `parent->modes & deref->modes' failed.
<jekstrand> dschuermann: Glad they remembered to include the slow ones. :D
<karolherbst> jekstrand: well that's for hardware without a fast ffma imho
<karolherbst> ahh, afaik
<dschuermann> probably still better than software ffma ;)
<karolherbst> _yes_
<karolherbst> I saw libclc's software ffma
<karolherbst> it's terrible
<karolherbst> it looks only 1% as fast as ffmad
<karolherbst> *fmad
<dschuermann> I think AMDs slow ffma is 1/4 rate or something like that
<karolherbst> yeah, sounds familiar
<dschuermann> and the new generations dropped fmad °°
<karolherbst> so you might see why AMD thought it's a good idea to put a slow ffma into hw :)
<dschuermann> uhm.... I guess that's without DP either
<karolherbst> yeah, just fp32
<karolherbst> ohh.. libclc lowering for fp64 ffma is bonkers
* jekstrand doesn't want to know what that expands to on HW without fp64 support
<karolherbst> pain and suffering?
<jekstrand> That's all fp64 on such hardware
<dschuermann> such kernels are probably better off on the CPU
<karolherbst> but there is worse
<Lynne> airlied: patched the validation layer to go further, but it gives conflicting information
<Lynne> vkCmdPipelineBarrier2KHR can't be run on a video queue
<Lynne> but using VK_ACCESS_2_VIDEO_DECODE_WRITE_BIT_KHR/VK_PIPELINE_STAGE_2_VIDEO_DECODE_BIT_KHR can't be done on a graphics/compute queue
<karolherbst> vec1 64 ssa_14 = deref_ptr_as_array &(*ssa_6)[ssa_10] (global double) /* &(*(double *)ssa_3)[ssa_10] */
<karolherbst> vec1 64 ssa_24 = deref_cast (double *)ssa_14 (function_temp double) /* ptr_stride=0, align_mul=0, align_offset=0 */
<karolherbst> :(
<karolherbst> oh well
<karolherbst> not in the mood of such nir compiler shenanigans. I have to leave something for jekstrand :P
<jekstrand> lol
rkanwal has quit []
<jekstrand> karolherbst: That does seem a bit on the bogus side. Not sure what's going on.
<karolherbst> not sure either
<marex> mripard: I did the changes you asked for in v4 ?
<marex> mripard: the DSI stuff no longer depends on the port
<karolherbst> it's the fract test in math_bruteforce
<karolherbst> I think I might implement printf...
rkanwal has joined #dri-devel
rkanwal has quit []
ngcortes has joined #dri-devel
* jekstrand is debating if he should read patches or just pull the branch and play
<jekstrand> karolherbst: more includes. :joy:
<karolherbst> jekstrand: :D
<karolherbst> well.. I am sure you have enough time to read all the patches
<karolherbst> I've created an MR btw
<jekstrand> Yeah, I saw
<jekstrand> karolherbst: I think, if it's happening from meson, you can get the builddir programmatically
<karolherbst> at this point I am more focused on getting stuff to work than clean code anyway, but... well... I think I just can't judge what is proper rust code and what is wrong, so I am just waiting on that rust pro to tell me what's wrong and rework stuff then :D
<karolherbst> jekstrand: mhhhh... true
<karolherbst> there seems to be meson.current_build_dir()
<jekstrand> karolherbst: Actually, you can just use a relative path to src/compiler/nir and it'll include both the source and build dirs
<karolherbst> ehh
<karolherbst> meson doesn't let me use it ... wait...
<karolherbst> jekstrand: it does?
<karolherbst> doesn't seem that way
ahajda_ has quit []
<jekstrand> hrm...
<jekstrand> karolherbst: That's what meson told me
<jekstrand> Maybe it lied?
Duke`` has joined #dri-devel
<karolherbst> dunno
<karolherbst> I include nir's dir, but that didn't work
ybogdano has joined #dri-devel
<karolherbst> and "include_directories(nir_build_dir)" doesn't work because it's absolute :(
<karolherbst> but yeah.. I see meson printing something...
<karolherbst> weird
<jekstrand> karolherbst: I think it's something with bindgen
<karolherbst> yeah
<karolherbst> I've talked with dcbaker about this issue
<karolherbst> mentioned some bigger rework is needed to fix it
<jekstrand> Yeah, I think the problem is that rust.bindgen isn't properly expanding include dirs
MajorBiscuit has quit [Ping timeout: 480 seconds]
<karolherbst> potentially
<mripard> marex: port@0 is still mandatory, and the data-lanes property is still in the endpoint. This was clearly stated in my mail here: https://lore.kernel.org/dri-devel/20220311162956.vm7qsrzauw7asosv@houat/
<mripard> as far as I'm concerned, those are the two blockers
<mripard> and you didn't address them
<marex> mripard: ah ... ok ... so then there is a completely new requirement for this entire series ?
<marex> I mean, the series only attempts to fix the fact that the driver as-is right now physically destroys displays and adds I2C configuration support
<marex> I adjusted it so that it does not depend on port@0 in DSI mode and so that even bindings which are invalid and don't pass DT validation would still work with the fixed driver
<marex> wasn't that what you wanted ?
<mripard> no, the binding was plain wrong to start with. It should never have had that requirement on port@0 in the first place
<marex> mripard: I can also very well drop the lane count addition and we can debate that later
<mripard> and the number of lanes it has doesn't have anything to do with DCS vs i2c, it's always going to be there, so it needs to be usable for DCS too
<mripard> so we must A) make the binding consistent with every other bridge out there (but the TC358762), and then have a property that can express the number of lanes without using the OF-Graph
<marex> mripard: you know what, I will drop that lane count for now so the patches can be applied and no more displays end up with permanent damage
<marex> then I will send this as separate patch and we can debate that
<marex> does that work ?
<mripard> there's numerous examples in KMS bindings, like adi,dsi-lanes
<mripard> (well, maybe not numerous)
<mripard> if you want
<mripard> but really, it takes like 5 minutes to address, and there's not much to debate really
<marex> yes, I do, I think permanent hardware damage is a problem
<marex> mripard: look at ti,sn65dsi83 , that's where the lane count came from
<marex> mripard: there it is the same thing, port@0 , with data-lanes
<marex> mripard: that's where I pulled this from ...
<mripard> yes, and it's not using DCS
<marex> mripard: what makes you think this ICN6211 is using DCS ?
<marex> mripard: the DCS registers which used to be listed in the driver were all wrong
heat has joined #dri-devel
<mripard> the binding, the datasheet and the driver?
<marex> the register description in the original driver was completely wrong, there are no registers named DCS in the ICN6211, the ICN6211 Specification V0.9 does not mention DCS even once either
<karolherbst> annoying.. llvmpipe doesn't properly align buffers :(
<karolherbst> airlied: do you have a patch for that? I am getting alignment fails on double16
<marex> mripard: or do you actually have a full datasheet for this chip which does mention DCS ?
<mripard> but even then, it's not really the point. The binding describes a DCS bridge, and there's no reason to have that port@0 requirement for a DCS bridge
<mripard> and it's inconsistent with the rest of the other bridges
<mripard> for that discussion, the state of the driver is completely irrelevant
<marex> the link ... does not work
<marex> mripard: could it be the bridge is not really a DCS bridge just like the driver which used MIPI_DCS_* macros to describe completely different registers ?
<mripard> marex: DCS is the bus
<mripard> the MIPI_DCS_* macros are not requirements afaik
<mripard> (and even if it is a requirement to implement those commands, it's ignored by half the vendors)
<mripard> maybe they are, it's not clear without the spec. The datasheet mentions that it needs a generic short write to access the device, which isn't a DCS command but a MIPI-DSI packet. It's still sent through the MIPI-DSI link so the point remains
<marex> mripard: I tested both modes, except the controller I use also has port@0 , so that's how I linked this bridge and the controller
<marex> mripard: seems that's what pinchartl suggested too
rgallaispou has quit [Remote host closed the connection]
<mripard> yeah, I mean it makes sense to have port@0 to accommodate controllers that indeed need it
<mripard> but there's no requirement on the OF graph support for MIPI-DSI controllers
<mripard> so we need to accommodate those too
<mripard> hence why I said that we shouldn't make it mandatory
<mripard> not that we should remove it entirely
<mripard> and the point that pinchartl was making that we should make it mandatory for everyone is a separate debate
<dcbaker> jekstrand: can you send me a sha of what's failing? karolherbst and I have talked about teaching bindgen to take a `dependency` object, and I'm working on that, but `include_directories()` should work, so if that's not working I'll fix that ASAP
<marex> I would much rather prefer consistency and one way of modeling it
<marex> i.e. port / endpoint
<marex> why should we have two inconsistent ways of modeling the same ?
<mripard> port / endpoint is the data stream
<mripard> our whole discussion was about the bus we use to control the device
<karolherbst> check src/gallium/frontend/rusticl/meson.build
<mripard> and if you want consistency, just do like every other panel or bridge is doing then?
<karolherbst> I have there "include_directories('../../../../build/src/compiler/nir/')" as a workaround
<karolherbst> but "inc_nir" or include_directories('../../../compiler/nir/') should work instead, but they don't
<marex> mripard: uh, I did it exactly like that ... with port and endpoint
<marex> mripard: maybe I was misled that this was the right approach
<mripard> so it's not consistent
<dcbaker> karolherbst: oh, that's bad. Yeah, I'll fix that ASAP so we can get that fix into 0.62 and 0.61.4 if that happens
<karolherbst> okay, cool
<karolherbst> dcbaker: I think we also have to declare dependencies, and I think that's what you want to work on later, correct?
<dcbaker> karolherbst: yeah, I just need to get https://github.com/mesonbuild/meson/pull/10122 landed (after the current freeze), and then passing dependencies to bindgen will be trivial
<karolherbst> cool
ybogdano has quit [Ping timeout: 480 seconds]
anarsoul|2 has joined #dri-devel
anarsoul has quit [Read error: Connection reset by peer]
<dcbaker> gitlab's REST API is so cool. I now have a script that can automatically apply commits to the stable branch, then poll gitlab for the CI status and act on it.
alyssa has joined #dri-devel
<alyssa> daniels: uhhhhh don't @ me but why is panfrost-g52-piglit-gl secretly deqp-gles2 https://gitlab.freedesktop.org/mesa/mesa/-/jobs/19950046
macc24 has quit [Quit: ZNC 1.7.5+deb4 - https://znc.in]
frankbinns has quit [Remote host closed the connection]
ybogdano has joined #dri-devel
<jekstrand> karolherbst: I'm seeing a finalize_nir being called on an invalid screen
macc24 has joined #dri-devel
sdutt_ has joined #dri-devel
<karolherbst> jekstrand: mhhhh
<karolherbst> I guess iris doesn't expose something I rely on
<karolherbst> let me try with iris
<jekstrand> It does. It's just getting passed the wrong screen pointer
<karolherbst> huh?
<jekstrand> karolherbst: rusticl/iris adds support for clear_buffer
<jekstrand> Well, partial support anyway
<karolherbst> okay sure, but what does it have to do with finalize_nir getting the wrong pointer?
<jekstrand> karolherbst: You need that if you're going to test iris. :)
<karolherbst> ahh
mdnavare_ has joined #dri-devel
<jekstrand> karolherbst: found it
<jekstrand> karolherbst: finalize_nir is passing s instead of self.screen
jkrzyszt has quit [Ping timeout: 480 seconds]
<karolherbst> ehh wait
<karolherbst> jekstrand: move that let s into the unsafe thing
<karolherbst> and remove the inner unsafe
<karolherbst> then it should work
<karolherbst> ehh wait...
<karolherbst> annoying
Ryback_[WORK] has joined #dri-devel
<daniels> alyssa: ha … git blame? (wasn’t me)
<karolherbst> put a big unsafe around all of it
anarsoul|2 has quit []
<karolherbst> jekstrand: I think the problem is that the &mut has to happen inside the block or outside? I don't know, something is odd about scoping there
anarsoul has joined #dri-devel
<karolherbst> I hit such issues sometimes
nchery has quit [Ping timeout: 480 seconds]
mattrope has quit [Ping timeout: 480 seconds]
mdnavare has quit [Ping timeout: 480 seconds]
sdutt has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
ybogdano has quit [Ping timeout: 480 seconds]
Ryback_ has quit [Ping timeout: 480 seconds]
<jekstrand> karolherbst: Top commit on rusticl/iris
<jekstrand> I can repro the opt_restrict_deref_modes fail now
<karolherbst> cool
<karolherbst> mind running test_basic?
<karolherbst> just wondering how much is broken
iive has joined #dri-devel
<jekstrand> karolherbst: Ugh... CL doesn't have the NIR_PRINT macro so I can't get debug logs
<karolherbst> jekstrand: sorry.. but you can put a nir.print() inside the pass functions... I am still not sure how to deal with that
<karolherbst> like calling passes in general
<karolherbst> check src/gallium/frontends/rusticl/mesa/compiler/nir.rs
<karolherbst> ehh self.print()
<karolherbst> jekstrand: I was thinking about doing a macro for having a nicer interface, but that nir print stuff evolved a lot... would be messy to pull it in :/
<Lyude> danvet: btw, regarding the issue with private object states not being safe due to parallel modesets, I'm curious why we wouldn't be able to fix that by just adding the appropriate DRM connectors to the state? I had looked into hooking up drm_crtc_commit structs, but it seems like drm_atomic_helper_setup_commit() already goes through and waits on any drm_crtc_commits for connectors
<Lyude> in the atomic state. Which makes it seem like we could just fix the problem by adding the appropriate DRM connectors to the atomic state, which seems like it'd trigger the appropriate CRTC waits
<danvet> Lyude, the connectors are hotpluggable
lynxeye has quit []
<danvet> so finding all the right ones gets tricky
<danvet> the root drm_connector never has anything plugged into it, so is never part of the modeset
<danvet> iirc at least
<danvet> heck there's often not even a common drm_encoder since we tend to have fake ones for mst (at least in some drivers)
<danvet> well in all of them
<danvet> since for each real mst encoder we need at least one per crtc
<danvet> otherwise we can't drive multiple outputs
<Lyude> We do have references to every mst port in the atomic state though, which also holds a reference to the connector, so we can always just add some code to go through all the atomic payload structs and add the relevant DRM connectors for each one to the state
mattrope has joined #dri-devel
ybogdano has joined #dri-devel
ngcortes has joined #dri-devel
<Lyude> I suppose connectors can become unregistered though
<karolherbst> jekstrand: test_basic on iris: FAILED 6 of 78 tests. :)
<karolherbst> seems stuff isn't entirely broken on a real GPU either
<jekstrand> yup
<karolherbst> I kind of feared that the event stuff isn't entirely correct, but maybe that's fine. As long as we don't do out of order it's also not that complicated to get right
<karolherbst> out of order will be a huge task
<Lyude> And -technically- it does seem like we could lose a drm connector for a drm_dp_mst_port - but I think we would be able to probably figure out how to fix that pretty easily by moving around where we destroy the connector
<Lyude> danvet: if I wrote up some code for this do you think you'd be able to tell if it looks sensible or not?
pcercuei has quit [Ping timeout: 480 seconds]
<danvet> Lyude, not sure that's really worth it
<danvet> since it kinda tracks things very indirectly
<danvet> and I'm not sure with hotplug/unplug and changing crtc and connectors we really always get all the right connectors to sync against
<karolherbst> jekstrand: do we have a good write up on how printf lowering works? Like how should I interpret whatever the nir lowering makes the GPU write into the buffer?
nchery has joined #dri-devel
<marex> mripard: I sent v5 without the lane count, 9/11 is the only patch missing AB/RB
nchery has quit [Ping timeout: 480 seconds]
<Lyude> danvet: fwiw: I actually designed things so that there's some rules with connector creation/removal: port->connector can only go from NULL to non-NULL, so we never need to worry about port having more than one connector or even losing its connector
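The "NULL to non-NULL only" rule Lyude describes is a write-once invariant. As an illustration only (the kernel code is C, and the `Port` and `Connector` types below are hypothetical stand-ins, not DRM structures), the same rule can be sketched in Rust with `std::sync::OnceLock`, which refuses a second assignment:

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for a connector object.
struct Connector {
    name: String,
}

/// Hypothetical stand-in for an MST port whose connector slot is write-once.
struct Port {
    connector: OnceLock<Arc<Connector>>,
}

impl Port {
    fn new() -> Self {
        Port { connector: OnceLock::new() }
    }

    /// Attach a connector. Returns false if one was already attached,
    /// so the slot can only ever go from empty to filled.
    fn attach(&self, c: Arc<Connector>) -> bool {
        self.connector.set(c).is_ok()
    }

    /// Take a new reference to the connector, if one has been attached.
    fn connector(&self) -> Option<Arc<Connector>> {
        self.connector.get().cloned()
    }
}
```

Because the slot is never cleared or replaced, anyone holding a reference obtained from `connector()` keeps seeing the same object for the lifetime of the port.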
tzimmermann has quit [Quit: Leaving]
tzimmermann has joined #dri-devel
ngcortes has quit [Ping timeout: 480 seconds]
nchery has joined #dri-devel
nchery has quit [Remote host closed the connection]
nchery has joined #dri-devel
<Lyude> I guess I'm not really sure what the issue is beyond that then, as if we make sure we're holding a reference to the drm connector from the payload state (with drm_connector_get()), then we're pretty much guaranteed the ability to find every active drm connector in an MST topology. The only exception for that I see is if we needed to protect against the possibility of a topology
<Lyude> being removed, and then a modeset being attempted on its previous primary connector before the MST related commit has finished. But we probably could solve that by just having a reference to the primary MST connector in the MST topology state that we init/deinit from drm_dp_mst_topology_set_mst (which is used to init the mst atomic state in my atomic only branch)
gouchi has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
ella-0_ has joined #dri-devel
<alyssa> anholt: ..Any chance you understand the new NIR XFB stuff?
ella-0 has quit [Read error: Connection reset by peer]
<anholt> not thoroughly, I've been thinking I should take a look at it for https://gitlab.freedesktop.org/mesa/mesa/-/issues/2880
<anholt> (the reason *that's* been on my mind is a3xx gpu hangs that got in the way of the hangcheck timeout bump)
<airlied> karolherbst: don't remember seeing that, worth checking if clover fails the same
<jekstrand> karolherbst: nope
* Lyude wishes she had a way of verifying these atomic questions with mst
<alyssa> anholt: ack
<alyssa> I don't see how it can be used with any driver that's not radeonsi.
<anholt> do you have any nir xfb stuff laying around to steal?
<alyssa> No
<alyssa> Well
<alyssa> Maybe soon
<alyssa> but
<alyssa> It relies on running lower i/o in mesa/st
<alyssa> each driver has driver-specific pre-i/o lowering passes to run
<alyssa> which means that series just adds the radeonsi pass set as a generic sounding option
<imirkin> also note that intel/gen6 has an xfb-from-shader situation as well
gouchi has quit [Remote host closed the connection]
<imirkin> oh, but that's all in backend IR too
<alyssa> anholt: I think I've been nerdsniped into banging out a basic lower_xfb pass that's good enough for ES3.1
<alyssa> (nir_lower_xfb, I mean, using the new infra)
<alyssa> (Lacking support for GS or TS for obvious reasons)
<dcbaker> jekstrand, karolherbst: well that was an easy one: https://github.com/mesonbuild/meson/pull/10148 should fix the generated headers issue
<karolherbst> dcbaker: cool
sarnex has quit [Read error: Connection reset by peer]
sarnex has joined #dri-devel
<dcbaker> I literally just didn't tell it to include the builddir, so... do that?
<dcbaker> 🤦‍♂️
luckyxxl has joined #dri-devel
<anholt> alyssa: you can @ me to review it when you do.
<alyssa> anholt: cool
<karolherbst> mhh, something is odd about clover's printf impl...
<alyssa> I hear Apple does XFB on the CPU so I guess I'll use it there too ... :-p
<alyssa> at least until Asahi does GS and we're screwed? :-p
<airlied> karolherbst: I was going to say the printf impl was tractable to read, but then I read it again :P
<karolherbst> airlied: LOL
<alyssa> xD
<marex> is there going to be asahi talk at embedded/kernel recipes ?
<karolherbst> airlied: what I am mostly wondering about is when clover actually flushes out the buffer...
<alyssa> is that the one in France? no
<karolherbst> isn't that like way too early?
<karolherbst> or is "q.pipe->memory_barrier(q.pipe, PIPE_BARRIER_GLOBAL_BUFFER)" like doing a hard sync on everything?
<marex> alyssa: that's the one
<alyssa> marex: wake me up when it's in Montreal ;)
<karolherbst> although that's probably fine
<airlied> karolherbst: printf.cpp just calls printf
<airlied> oh you mean before printf happens
<karolherbst> yeah
<karolherbst> like what makes sure that the kernel finished successfully before reading it out
ybogdano has joined #dri-devel
<marex> alyssa: I have doubts that's ever gonna happen :)
<airlied> karolherbst: it maps it for read
<marex> pity, I think that would be a great talk
<karolherbst> airlied: ohhh
<karolherbst> mhhh
<karolherbst> very annoying
<alyssa> marex: si la poutine ne suffit pas, j'sais pas que je te peux dire :P
<karolherbst> sounds like a pipeline stall to me
<karolherbst> I really don't want events to actually wait on the hw being done, just flush out commands for the GPU
<airlied> karolherbst: for printf you have to stall though
<marex> alyssa: I hope you're talking about the food ... else ... sigh
<karolherbst> airlied: are you sure?
<karolherbst> all I see is that clFinish is the only point where you are actually required to print it at the latest
<airlied> karolherbst: it happens on exec_context::unbind
<karolherbst> airlied: yeah, but that happens inside the event
<karolherbst> so the event itself stalls the queue
<karolherbst> I don't want that in rusticl
<airlied> it's not very well tested by CTS, so feel free to do whatever you can within the spec :-)
<airlied> like if you are using printf, you likely are debugging and just want to see the output asap
<karolherbst> yeah soo.. the spec even says, if you executed the kernel multiple times before clFinish, the lines can even intermix
<karolherbst> airlied: I think I'd rather have a second event depending on the kernel execution one and once we do out of order queues we can take into account what event deps are already ready and so on...
<karolherbst> not sure thouogh
<karolherbst> *though
<karolherbst> airlied: or I just attach the buffer to the queue and check on clfinish if there is anything stored in it
<karolherbst> and flush it out there
ngcortes has joined #dri-devel
pnowack has quit [Quit: pnowack]
<karolherbst> airlied: the wording in the spec is totally weird though
<karolherbst> "When the event that is associated with a particular kernel invocation is completed, the output of all printf() calls executed by this kernel invocation is flushed to the implementation-defined output stream. Calling clFinish on a command queue flushes all pending output by printf in previously enqueued and completed commands to the implementation-defined output stream."
<alyssa> marex: ...I was but I see how that could have been misinterpreted...
<alyssa> anholt: ...Do you know what this xfb2 thing is?
<karolherbst> anyway.. without clFlush you can't be sure when stuff gets done anyway
<karolherbst> so that's probably the most pragmatic thing to do
<alyssa> oh
<alyssa> it's a hack
<karolherbst> and once I read the buffer out, I reset the count
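The idea karolherbst floats here (attach the printf buffer to the queue, drain it on clFinish, then reset the count) can be sketched roughly as below. Everything in it is hypothetical: `Queue`, `kernel_writes` and `finish` are illustrative names, not rusticl code, and the buffer is simplified to hold already-formatted text rather than raw printf arguments.

```rust
/// Hypothetical stand-in for a command queue that owns a printf buffer.
struct Queue {
    printf_buf: Vec<u8>, // storage that kernels append their output to
    used: usize,         // number of valid bytes currently in the buffer
}

impl Queue {
    fn new(capacity: usize) -> Self {
        Queue { printf_buf: vec![0; capacity], used: 0 }
    }

    /// Stand-in for a kernel appending output; anything that does not fit is dropped.
    fn kernel_writes(&mut self, text: &str) {
        let free = self.printf_buf.len() - self.used;
        let n = text.len().min(free);
        self.printf_buf[self.used..self.used + n].copy_from_slice(&text.as_bytes()[..n]);
        self.used += n;
    }

    /// Called from the finish path: return all pending output and reset the count.
    fn finish(&mut self) -> String {
        let out = String::from_utf8_lossy(&self.printf_buf[..self.used]).into_owned();
        self.used = 0;
        out
    }
}
```

With this shape, output from several kernel launches simply accumulates until the next finish, which matches the remark that lines from multiple executions may intermix.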
<alyssa> got it
<anholt> alyssa: the second const? it's just where you store the remaining bits
<karolherbst> ehh wait.. I think we actually have to do it after the kernel is done.. mhhh
<karolherbst> ehhh
<karolherbst> okay whtvr
<jekstrand> karolherbst: I know why fract is busted. Not sure what I'm going to do about it just yet, though.
<karolherbst> jekstrand: something generic pointer related or something even more terrible?
<jekstrand> It's because libclc defines it to consume a function_temp and it gets passed a global
<karolherbst> ahh
<jekstrand> If we can get all the libclc functions to use generic pointers, problem solved.
<karolherbst> so.. a programming error
<Lynne> airlied: ping
<jekstrand> But I need to look at the sources to see how tractable that is
<jekstrand> So I'm now cloning LLVM *sigh*
<airlied> Lynne: hey, still got the garbage rendering?
<karolherbst> jekstrand: :(
<alyssa> karolherbst: all errors are programming errors
<karolherbst> jekstrand: but I think you won't have to clone LLVM for hacking on libclc
Peste_Bubonica has joined #dri-devel
<karolherbst> alyssa: :D
<karolherbst> true
<airlied> jekstrand: there are some llvm nightly copr if you want to avoid building it
<marex> alyssa: ah ... uh ... all right, I don't speak any French, sorry
<jekstrand> airlied: I don't want to build it. I want to look at the latest libclc sources
<jekstrand> airlied: I know better than to build it. :P
<airlied> jekstrand: ah cool that's not as bad a day :-P
mbrost has quit [Ping timeout: 480 seconds]
<karolherbst> ehh wait... that's rawhide only
<airlied> yeah that one
mbrost has joined #dri-devel
<karolherbst> doesn't have the translator though :(
<karolherbst> or mesa
<karolherbst> ohh
<karolherbst> it installs compatible versions.. mhh
<Lyude> danvet: hm, and thinking about the encoder issue: I think I follow now, are you referring to the fact that if one CRTC tries stealing the real (as in hw) encoder for the CRTC on the primary connector of the MST topology, we might not treat that as encoder theft and as a result could potentially miss adding CRTCs we need into the state to block on?
<airlied> ah you need to build translator, but that's not horrible
<karolherbst> yeah.. I was more concerned about mesa actually
<Lynne> airlied: yeah, still don't know which queue to switch layouts on either
<Lynne> I've checked all of the questionable SPS/PPS/params variables and they seem fine
<Lynne> a lot don't matter for keyframes, which are messed up as well
<Lyude> that one seems like it might not actually be that tough to solve either though, would just need to add a callback to drm_encoders that can be used to test encoder equivalence
<Lyude> that'd need to be implemented per-driver, but I don't think that'd be difficult at all
<Lynne> it looks like a regular h264 desync, but not sure how that's happening, my slices are all properly concat'd and the offsets are right
<airlied> Lynne: what command do I need to see something on my display?
<airlied> the one you gave me previously seems to not display anything
<karolherbst> airlied: okay, seems to work out alright :)
<Lynne> airlied: "./ffmpeg_g -f matroska -c:v h264 -init_hw_device "vulkan=vk:0,debug=0" -hwaccel vulkan -hwaccel_output_format vulkan -i test.mkv -loglevel trace -vf hwdownload,format=nv12 -y test.nut"
<Lynne> you can play the test.nut file with any media player (or change it to y4m for something more standard)
<Lynne> debug=1 switches on the validation layer
<danvet> Lynne, yup
<danvet> oops I mean Lyude ^^
<airlied> Lynne: I think I see a radv bug here
<airlied> oh no wasn't that
* karolherbst hopes there aren't any regressions with llvm-15
ngcortes has quit [Remote host closed the connection]
<karolherbst> airlied: somehow that is all bonkers
heat has quit [Remote host closed the connection]
pnowack has joined #dri-devel
<karolherbst> trivial cases seem to pass, but anything more complex crashes inside vtn :(
<Lyude> it's kind of amusing that there is both an active Lyude and Lynne in this channel Lol
<karolherbst> just sort it out between you :P
<Lyude> hah
<Lynne> think that's bad, there's someone with the nick 'another' on most channels I'm on in freenode
<Lynne> err, libera now
<airlied> Lynne: okay can't spot anything obviously wrong there, but can't dig in much further as it's Saturday, but I'll look first thing next week and see can I track it down
<Lynne> alright, thanks
ybogdano has quit [Ping timeout: 480 seconds]
<karolherbst> airlied: wait... CL_DEVICE_OPENCL_C_ALL_VERSIONS just returns "OpenCL C" as the string part?!?!
<karolherbst> how pointless is that...
<Lyude> danvet: anyway got it, starting to understand why just adding the connectors to the atomic state really wouldn't be enough here. I guess what I'll do is I'll see if I can figure out solutions for the issues pointed out here, and just keep a running list of the issues in general so I don't miss anything, and see if I can come up with some solutions that solve all of them
<Lynne> airlied: I'll try to get it working during the weekend, but before you go, which queue do I change the texture layout on?
<Lyude> at least assuming mripard doesn't respond before i'm done with that
<airlied> Lynne: probably doesn't matter right now to get that right, I don't think it's causing this, I'd expect that stuff needs some refining in the spec/driver
<airlied> karolherbst: yes pointless :-P
<airlied> karolherbst: "the name field is required to be "OpenCL C"
<karolherbst> yeah.. well..
<Lyude> lmao
<karolherbst> I mean.. duh
<karolherbst> yeah whatever
soreau has quit [Read error: Connection reset by peer]
<karolherbst> the runtime saves a bit of space, now clients need to do silly printfs still
<karolherbst> who even designed that
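For context on the "silly printfs": each entry returned by CL_DEVICE_OPENCL_C_ALL_VERSIONS pairs a fixed name with a packed numeric version, so a client that wants a string like "OpenCL C 3.0" has to format it itself. A minimal sketch of that, assuming the 10/10/12-bit major/minor/patch packing used by the OpenCL 3.0 headers; the function names are made up for the example:

```rust
// Bit widths of the packed version fields (major uses the remaining top bits).
const MINOR_BITS: u32 = 10;
const PATCH_BITS: u32 = 12;
const MINOR_MASK: u32 = (1 << MINOR_BITS) - 1;
const PATCH_MASK: u32 = (1 << PATCH_BITS) - 1;

/// Pack major.minor.patch into a single 32-bit version value.
fn make_version(major: u32, minor: u32, patch: u32) -> u32 {
    (major << (MINOR_BITS + PATCH_BITS)) | ((minor & MINOR_MASK) << PATCH_BITS) | (patch & PATCH_MASK)
}

fn version_major(v: u32) -> u32 {
    v >> (MINOR_BITS + PATCH_BITS)
}

fn version_minor(v: u32) -> u32 {
    (v >> PATCH_BITS) & MINOR_MASK
}

/// Build the "<name> <major>.<minor>" string a client has to assemble by hand.
fn describe(name: &str, version: u32) -> String {
    format!("{} {}.{}", name, version_major(version), version_minor(version))
}
```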
<Lynne> airlied: I know, but I'm curious, because IMO it's a drawing operation that needs to run on a queue that doesn't do drawing
soreau has joined #dri-devel
<Lynne> surely someone has thought of this when writing the spec, surely
<karolherbst> airlied: uhm.. OpenCL C 3 needs all CL C 2 features?
<karolherbst> I guess I shouldn't advertize it then
<alyssa> Fail (Result comparison failed)
<alyssa> Woo!
<alyssa> ..wait
<jekstrand> karolherbst: It looks like clang is doing something fishy
<karolherbst> ahh wait.. we have to
<jekstrand> karolherbst: No, it doesn't
<jekstrand> (require all the features, that is)
<airlied> Lynne: I think if you'd use the texture before on a graphics queue then you'd have to transition it on the graphics queue, but from allocated I expect it could be done on the decode queue, but that stuff usually means more reading than I'm capable of right now :-P
<airlied> karolherbst: CL C 3 has lots of #defines for the CL 2 features
<airlied> though there are some llvm bugs in the area
<airlied> esp with opencl-c.h
<karolherbst> airlied: sure.. but the CTS is doing weird things
ybogdano has joined #dri-devel
<karolherbst> airlied: so test_conformance/compiler/test_compiler opencl_c_versions fails because apparently we don't support CL 2.0 features :(
<karolherbst> and it checks for pipes
<karolherbst> and others
<jekstrand> karolherbst: I know what's going on now and it appears to be either a clang bug or an us invoking clang wrong bug
<karolherbst> :(
<karolherbst> could be the latter
<jekstrand> karolherbst: The CL C is fine. All its pointers are __global and there is a version of fract() which takes a __global output parameter.
<jekstrand> However, by the time we get the SPIR-V, someone has inserted an OpPtrCastToGeneric for no good reason
<karolherbst> mhhh
<karolherbst> weird
<jekstrand> So it's trying to call a version of fract() which doesn't work on generic pointers with a generic pointer.
<jekstrand> Well, I should back up a bit. I don't know for sure that fract isn't defined to work with generic pointers. It seems like it probably should have a version that does.
<jekstrand> But libclc doesn't have such a version.
<jekstrand> At least not as far as I can tell
<airlied> karolherbst: I think how you invoke clang and/or bugs on the clang opencl-c.h header might cause that
<karolherbst> it's all clc's doing
<jekstrand> Sadly, I think someone is going to have to patch libclc
<airlied> jekstrand: building libclc isn't hard though, it doesn't need all of llvm
<jekstrand> sure
<karolherbst> oh well.. I just deal with printf so at least that's done
<airlied> karolherbst: I'd have to reproduce my last wip setup which I don't have in front of me to tell you what to do there
<jekstrand> I see two fixes: One is to figure out how to get clang to stop casting to generic for now. Maybe a flag? Second is to fix libclc so it provides generic versions of those few functions that have pointer out parameters
<jekstrand> The latter is necessary for generic pointer support anyway
<jekstrand> From a quick grep, it looks like only fract, modf, and sincos likely need fixing
<alyssa> anholt: FWIW don't have anything working yet, but I'm feeling good about the direction
<alyssa> might be less optimal, maybe
<alyssa> but *so much simpler* than the "clever" path panfrost has
<clever> but my pan isnt frozen? lol
<jekstrand> karolherbst: With that, it stops erroring out deep inside of NIR and just complains that it can't find the right libclc function. Seems like a better failure mode to me.
Peste_Bubonica has quit [Quit: Leaving]
<dcbaker> karolherbst: does OCL_ICD_FILENAMES really point at a .so and not a .json like vulkan?
<karolherbst> jekstrand: ahh cool
<karolherbst> dcbaker: correct
<karolherbst> ehh wait
<karolherbst> no
<karolherbst> it points to a icd file
<karolherbst> which just contains a so file name
<dcbaker> so the documentation is wrong: https://github.com/KhronosGroup/OpenCL-ICD-Loader?
<karolherbst> dcbaker: I wouldn't be surprised if both work
<karolherbst> ehh wait
<dcbaker> I was reading your PR and noticed that devenv doesn't setup OpenCL and thought I'd fix that
JohnnyonFlame has quit [Read error: Connection reset by peer]
<karolherbst> dcbaker: ahh no
<karolherbst> OCL_ICD_VENDORS points to an ICD
<karolherbst> FILENAMES to the so
<alyssa> but *so much simpler* than the "clever" path panfrost has
<alyssa> oops
<karolherbst> jekstrand: if you want to have more fun, "test_conformance/compiler/test_compiler multiple_files" :(
<karolherbst> "SPIR-V id 4 is the wrong kind of value" maybe that's somewhat related?
<karolherbst> but it points to some global invoc stuff
<karolherbst> but no idea what's going wrong there
<dcbaker> karolherbst: okay, that makes no sense, but I see how it is :)
<jekstrand> karolherbst: Hrm...
<jekstrand> karolherbst: I doubt it's related but I can look
<karolherbst> yeah...
<karolherbst> something goes wrong when linking stuff
<karolherbst> it does work for the trivial tests though
<karolherbst> so not sure if that's my fault, clc's or something else's
ybogdano has quit [Read error: Connection reset by peer]
<jekstrand> karolherbst: I'm getting a harmless-looking LINK_PROGRAM_FAILURE
<karolherbst> jekstrand: ahh yeah.. you need llvm-14 and so on
<jekstrand> I say "harmless" because it's not blowing up in spirv_to_nir
<karolherbst> the translator can't even deal with it at 13
<jekstrand> karolherbst: Uh... maybe gonna pass?
<karolherbst> because of extern kernels
<karolherbst> jekstrand: dunno?
<karolherbst> :P
<karolherbst> use that copr
<karolherbst> :D
jfalempe has quit []
mhenning has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
<dcbaker> karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15471; You might be interested in that for your rusticl as well, since that saves manually setting the env var
ybogdano has joined #dri-devel
<karolherbst> dcbaker: mhh.. don't use devenv, but maybe I should
<dcbaker> it's super convenient for OpenGL and Vulkan, OpenCL seems a little easier to do ad-hoc (no custom icd file required, one envvar)
<karolherbst> yeah... but I already have all my scripts :D
<karolherbst> ahhhhh
<karolherbst> the test eats my stdout :(
<karolherbst> okay...
<karolherbst> nice
<karolherbst> jekstrand: at first I didn't like how I treat internal args, but adding new ones isn't a lot of work :)
<karolherbst> sooo.. do I reimplement printf parsing or....
<karolherbst> the clover one wasn't really written with reusability in mind
<Lyude> man, "one connector/CRTC per encoder" is really an assumption I wish DRM had refrained from making
mvlad has quit [Remote host closed the connection]
luckyxxl has quit [Quit: bye]
<vsyrjala> what would "multiple crtcs per encoder" even do?
* karolherbst reads phoronix
<karolherbst> oh boi
<karolherbst> "Could we please wait with getting unstandardized, unstable languages into critical system components until said language matures and becomes standardized?" :D
<karolherbst> you mean like C?
ppascher has quit [Ping timeout: 480 seconds]
<Lyude> vsyrjala: save us from having to have fake MST encoders :P
<Lyude> karolherbst: lol is someone complaining about rust not being mature
* karolherbst grabs popcorn
<anholt> karolherbst: really curious how they're using opencl in their critical system components currently.
<karolherbst> what should I answer.. :D
<karolherbst> anholt: yeah...
<Lyude> you already did :P
<karolherbst> ahh "Please enter a message with at least 5 characters"
<karolherbst> I went with a simple "nope :P" opens me up to no attack surface at all
<karolherbst> don't really want to discuss C
<karolherbst> :D
<airlied> karolherbst: by printf parsing you mean the code to put the strings into printf itself? totally should be done in rust safely
<airlied> instead of the C string parser we have in clover
<karolherbst> yeah.. I lean towards that as well
<karolherbst> good thing about relying on nir is, I don't have to parse whatever nir provides, but just use it :)
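A rough idea of what "doing the printf formatting in Rust safely" could look like is sketched below. It is only a sketch under stated assumptions: the chat does not describe the buffer layout the NIR lowering produces, so this assumes arguments are packed back to back as 4-byte native-endian values, and it handles only `%d`, `%i`, `%u`, `%x` and `%%`. The names `take4` and `format_printf` are invented for the example.

```rust
use std::convert::TryInto;

/// Pop the next 4 bytes off the front of `args`, or None if too few remain.
fn take4(args: &mut &[u8]) -> Option<[u8; 4]> {
    let slice: &[u8] = *args;
    if slice.len() < 4 {
        return None;
    }
    let (head, tail) = slice.split_at(4);
    *args = tail;
    head.try_into().ok()
}

/// Format `fmt` using raw argument bytes. Returns None on an unsupported
/// conversion or when the argument buffer runs out, instead of reading
/// out of bounds.
fn format_printf(fmt: &str, mut args: &[u8]) -> Option<String> {
    let mut out = String::new();
    let mut chars = fmt.chars();
    while let Some(c) = chars.next() {
        if c != '%' {
            out.push(c);
            continue;
        }
        match chars.next()? {
            '%' => out.push('%'),
            'd' | 'i' => out.push_str(&i32::from_ne_bytes(take4(&mut args)?).to_string()),
            'u' => out.push_str(&u32::from_ne_bytes(take4(&mut args)?).to_string()),
            'x' => out.push_str(&format!("{:x}", u32::from_ne_bytes(take4(&mut args)?))),
            // Flags, widths, vectors, %s and %f are out of scope for this sketch.
            _ => return None,
        }
    }
    Some(out)
}
```

The point of the shape is that a malformed format string or a short buffer yields `None` rather than undefined behaviour, which is the safety argument made in the chat.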
Haaninjo has quit [Quit: Ex-Chat]
<alyssa> oh, uff
<icecream95> puff? muff? huff?
danvet has quit [Ping timeout: 480 seconds]
<alyssa> icecream95: huff, I think
<alyssa> "Individual lines or triangles of a strip or fan primitive will be extracted and recorded separately."
<alyssa> ...how could that possibly have worked with our old scheme
<icecream95> It never did
<alyssa> *...how did we pass conformance anyway
<icecream95> No tests used it?
<alyssa> delightful.
<alyssa> at a cursory glance, angle is broken in the same way
<icecream95> Oh, so is it only desktop GL that this is a problem for?
<icecream95> We could always interpret the tiler heap to get an exact list of all the vertices
<icecream95> (All the ones that didn't get culled, that is...)
<alyssa> That text is from the GLES spec...
<icecream95> ES 3.2 then?
<alyssa> mm
<alyssa> --That text isn't in the ES3.0 spec..
<alyssa> I have so many questions
<alyssa> "Incomplete primitives are not recorded."
<alyssa> anholt: ^^ This is the bigger issue with doing it in the VS (bin shader)
<alyssa> While running the VS, you don't know if the primitive is complete, I don't think
<alyssa> This works on Panfrost only because we do a full index scan ahead of time (slow!)
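The two spec sentences quoted above pin down how many vertices transform feedback records for a draw: strips and fans are decomposed into independent lines or triangles, and a trailing incomplete primitive contributes nothing. A small sketch of that arithmetic follows; the `Prim` enum and function names are made up for illustration, and line loops and adjacency primitives are left out.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Prim {
    Points,
    Lines,
    LineStrip,
    Triangles,
    TriangleStrip,
    TriangleFan,
}

/// Number of complete primitives produced by `count` input vertices.
fn complete_prims(prim: Prim, count: u32) -> u32 {
    match prim {
        Prim::Points => count,
        Prim::Lines => count / 2,
        Prim::LineStrip => count.saturating_sub(1),
        Prim::Triangles => count / 3,
        Prim::TriangleStrip | Prim::TriangleFan => count.saturating_sub(2),
    }
}

/// Vertices written per recorded primitive once strips/fans are decomposed.
fn verts_per_prim(prim: Prim) -> u32 {
    match prim {
        Prim::Points => 1,
        Prim::Lines | Prim::LineStrip => 2,
        Prim::Triangles | Prim::TriangleStrip | Prim::TriangleFan => 3,
    }
}

/// Total vertices recorded: complete primitives only, each written separately.
fn xfb_vertices_recorded(prim: Prim, count: u32) -> u32 {
    complete_prims(prim, count) * verts_per_prim(prim)
}
```

For example, a 5-vertex triangle strip yields 3 triangles and therefore 9 recorded vertices, while a single-vertex LINES draw records nothing.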
<anholt> I think update_draw_stats() in freedreno_draw.c is part of how we handle it
<anholt> we also u_trim_pipe_prim in the draw function
<alyssa> right..
<alyssa> I guess you don't need the index scan
<alyssa> trim_pipe_prim makes the test pass, but I'm not at all convinced it's appropriate for an adult driver
<imirkin> alyssa: it's fine if no GS/Tess
<alyssa> imirkin: ok.. a vertex shader with side effects invoked as glDraw(LINES, 1) can be skipped?
<alyssa> anholt: alyssa/mesa:play-xfb-2 has WIP patches passing deqp-gles3 but very much unfit for merge
<karolherbst> why is from_ne_bytes not a trait method :(
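In std, `from_ne_bytes` is an inherent method on each numeric type, so generic code cannot call it through a bound. One common workaround is a small local trait implemented by macro, sketched below. `FromNeBytes`, `read_ne` and `read_value` are hypothetical names for this example, not anything from std or rusticl.

```rust
use std::convert::TryInto;

/// Hypothetical helper trait: read a native-endian value from the start of a byte slice.
trait FromNeBytes: Sized {
    fn read_ne(bytes: &[u8]) -> Option<Self>;
}

macro_rules! impl_from_ne_bytes {
    ($($t:ty),*) => {$(
        impl FromNeBytes for $t {
            fn read_ne(bytes: &[u8]) -> Option<Self> {
                const N: usize = std::mem::size_of::<$t>();
                // Fails cleanly if the slice is shorter than the type.
                let raw: [u8; N] = bytes.get(..N)?.try_into().ok()?;
                Some(<$t>::from_ne_bytes(raw))
            }
        }
    )*};
}

impl_from_ne_bytes!(u8, u16, u32, u64, i8, i16, i32, i64, f32, f64);

/// Generic entry point that is only possible because of the trait above.
fn read_value<T: FromNeBytes>(buf: &[u8]) -> Option<T> {
    T::read_ne(buf)
}
```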
<imirkin> alyssa: good q. dunno.
<alyssa> more mildly u_trim_pipe_prim breaks
<alyssa> tches passing deqp-gles3 but very much unfit for merge
<alyssa> ummmm
<alyssa> More mildly, u_trim_pipe_prim breaks primitive restart
<alyssa> (How could it not?)
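Why trimming on the total count cannot be right with primitive restart: each restart starts a fresh strip, so completeness has to be judged per segment. The sketch below compares the two counts for a triangle strip. It assumes 32-bit indices with 0xFFFFFFFF as the restart index, and the function names are invented; it does not reproduce what u_trim_pipe_prim itself does, only a trim that looks at the overall vertex count.

```rust
/// Assumed restart index for 32-bit index buffers.
const RESTART: u32 = u32::MAX;

/// Complete triangles in a triangle strip of `count` vertices.
fn strip_triangles(count: usize) -> usize {
    count.saturating_sub(2)
}

/// Correct count: split at every restart index and evaluate each segment alone.
fn triangles_with_restart(indices: &[u32]) -> usize {
    indices
        .split(|&i| i == RESTART)
        .map(|segment| strip_triangles(segment.len()))
        .sum()
}

/// Naive count: treat the whole index list as one strip, restart entries included.
fn triangles_ignoring_restart(indices: &[u32]) -> usize {
    strip_triangles(indices.len())
}
```

For the index list `[0, 1, 2, 3, RESTART, 4, 5]` the per-segment count is 2 (the trailing two-vertex segment is incomplete), while the naive count is 5.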
<imirkin> hm, yeah, that's probably a bigger problem. forgot about that.
<alyssa> not sure how angle deals
<imirkin> i forget when (if) ES gets primitive restart
<alyssa> ES3
<alyssa> same time as XFB
<imirkin> ah ok
<alyssa> overall I have serious questions about the viability of lowering XFB in a VS without hw support
<imirkin> in adreno, there's a special "vtxcnt" input
<imirkin> which tells you which "vertex number" you're on
<imirkin> not to be confused with something like vertexid
<alyssa> nod.. mali doesn't have that
<imirkin> basically you can use that to index into the thing
<alyssa> in general, starting to think splitting off a compute kernel dedicated to XFB is the only sane correct approach
<imirkin> you and i have very different ideas of "sane"
<alyssa> mmmh
<alyssa> otoh even that doesn't handle the desktop GL version of XFB
<alyssa> which maybe requires an index buffer walk..
<alyssa> starting to see why Apple bailed and did it on the CPU
<imirkin> and why i prefer the hw impl, no matter how inefficient it might be
<alyssa> i mean, there's no hw path on Mali (or presumably AGX)
<alyssa> there's just a different sw path that's broken in different ways
<imirkin> it's the wrong way ... but faster!
* alyssa wonders what arm did
<alyssa> (on valhall)
<icecream95> alyssa: If you need a compute kernel.. why not write it in OpenCL C?
<alyssa> icecream95: because by that point, suddenly wiring up the (known broken, but GLES31 conformant) Midgard path seems like the lesser evil by far
<icecream95> alyssa: And I guess having that path would make it easier to RE how the tiler works, so I can see what changes there were since Bifrost
<icecream95> Well, at least it would make it easier to replace the tiler with a software implementation so we can remove it later to make the driver faster
ngcortes has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
pnowack has quit [Quit: pnowack]
sdutt_ has quit [Ping timeout: 480 seconds]
<imirkin> alyssa: do you know if there are xfb + primrestart tests in cts/deqp?
pcercuei has joined #dri-devel
famfo has quit []
famfo has joined #dri-devel
<alyssa> imirkin: dunno
<alyssa> icecream95: remove.. the tiler? :-p
famfo has quit []
famfo has joined #dri-devel
famfo has quit []
famfo has joined #dri-devel
<karolherbst> ehhh... stuff is weird
nchery has quit [Read error: Connection reset by peer]
tursulin has quit [Read error: Connection reset by peer]