<tzimmermann>
javierm, hi. i'm just back from vacation. could take a bit
<javierm>
tzimmermann: yeah I figured. I'm summarizing the threads since there were many mails there
<eric_engestrom>
hakzsam: thanks! fyi I've hit this in 3 out of 3 runs of `vkcts-navi21-valve 3/3` (https://gitlab.freedesktop.org/mesa/mesa/-/jobs/40079004), while the other 2 worked on the first try; perhaps that can help narrow it down
<hakzsam>
yeah, we are working on it
<tzimmermann>
javierm, looking at it
Danct12 is now known as Guest11314
Danct12 has joined #dri-devel
<javierm>
tzimmermann: thanks, no rush
Guest11314 has quit [Ping timeout: 480 seconds]
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
lynxeye has joined #dri-devel
Danct12 has quit [Ping timeout: 480 seconds]
<tzimmermann>
javierm, thanks for looking at this bug report. i think we should try that fixed version you posted
jkrzyszt has joined #dri-devel
<tzimmermann>
err, it was posted by pierre
Danct12 has joined #dri-devel
<javierm>
tzimmermann: the max(max(max... ?
<javierm>
that's horrible IMO :)
<tzimmermann>
it is :) but it fixes the problem and is clear once you got it
<tzimmermann>
let me type a reply
<javierm>
tzimmermann: but if you think that's the correct version, I'm not going to argue since you looked at this issue more than me
swalker__ has joined #dri-devel
Zopolis4_ has joined #dri-devel
<tzimmermann>
javierm, it's been a while
<javierm>
tzimmermann: I think the question is what can be trusted and what can't
<tzimmermann>
the proposed fix is closer to the original code. IIRC i discarded lfb_depth because it didn't work with cases of 15-bit rgb. these cases used bpp=16 and one of the tests failed. adding it back in this max3() statement should still work
<javierm>
tzimmermann: yeah, you explained why lfb_depth can't be trusted and it seems there are BIOSes that don't report some channels (e.g. for xRGB, the filler bits)
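A minimal sketch of the idea behind the max3()-style calculation discussed here, assuming the bits per pixel are taken as the highest bit position covered by any reported channel, with lfb_depth added back in as one more candidate. The function name and the sample values are hypothetical and not taken from Pierre's patch; only lfb_depth and the notion of per-channel sizes and positions follow the conversation.

```python
def guess_bits_per_pixel(lfb_depth, channels):
    """Return the largest of lfb_depth and each channel's (size + pos).

    `channels` is an iterable of (size, pos) pairs for the red, green,
    blue and reserved (filler) channels. Hypothetical helper, for
    illustration only.
    """
    highest_bit = max((size + pos for size, pos in channels), default=0)
    return max(lfb_depth, highest_bit)


# Assumed case: xRGB8888 where the firmware does not report the filler
# bits. The color channels alone suggest 24 bits, lfb_depth says 32.
xrgb8888_no_filler = [(8, 16), (8, 8), (8, 0), (0, 0)]

# Assumed case: 15-bit RGB with one filler bit reported. lfb_depth says
# 15, but the channels cover 16 bits.
xrgb1555 = [(5, 10), (5, 5), (5, 0), (1, 15)]

print(guess_bits_per_pixel(32, xrgb8888_no_filler))  # 32
print(guess_bits_per_pixel(15, xrgb1555))            # 16
```

Taking the maximum means a missing filler channel or a too-small lfb_depth cannot pull the result below what the other source reports, which is the property the chat is relying on.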
swalker__ has quit [Ping timeout: 480 seconds]
<javierm>
but Pierre's patch is still assuming that either of those can be trusted
<javierm>
while my patch assumes that lfb_linelength and lfb_width can be trusted (we are relying on those anyways)
<javierm>
tzimmermann: if we go with Pierre's patch, then I think that we should also apply my v1 (recalculate lfb_linelength if BPP is calculated)
<javierm>
in other words, we should either trust lfb_linelength or recalculate it
<javierm>
if we trust it, then it can be used to calculate the BPP or if we don't, we shouldn't use it as provided
<tzimmermann>
javierm, IDK which information is trustworthy. i'll try to send pierre's patch through a number of systems
<javierm>
the commit message isn't accurate because that wasn't the cause after all, but we are calculating BPP and then blindly using lfb_linelength to create the I/O resource
<javierm>
tzimmermann: yeah, patch v2 worked for Pierre though
<javierm>
because BPP is calculated from lfb_linelength and lfb_width
<tzimmermann>
and then it still picks an incorrect format for the sysfb framebuffer. hence, there's a sysfb but with the wrong color format
<javierm>
tzimmermann: correct, but the stride matches the (wrong color format BPP) at least
pendingchaos_ has joined #dri-devel
<javierm>
that's why I think we need both, to prevent the calculated BPP and the reported stride from mismatching
<javierm>
tzimmermann: let me answer in the thread
pendingchaos has quit [Ping timeout: 480 seconds]
<tzimmermann>
javierm, i cannot follow 100%, but the stride has no effect on the internal pixel format: if you need xrgb8888, but select rgb888, each pixel will still look wrong. stride only affects the overall line length
<javierm>
tzimmermann: I know but if you pick a wrong color format (e.g. rgb888) then the line length will be bigger than format * resolution
<javierm>
it should match, regardless of whether it was picked correctly or not (of course ideally correct, but the selected pixel format and stride should match)
<MrCooper>
karolherbst: piglit/bin/cl-api-enqueue-fill-image fails an assertion in radeonsi because depth == 0
rsalvaterra has quit []
rsalvaterra has joined #dri-devel
jkrzyszt has quit [Remote host closed the connection]
APic has joined #dri-devel
<javierm>
tzimmermann: answered in the thread, maybe I'm missing something silly but I don't understand how we can't trust lfb_depth and then happily use lfb_linelength rather than a stride using the calculated BPP
<tzimmermann>
javierm, stride is an arbitrary value. it is only vaguely connected to the bpp.
<javierm>
tzimmermann: ah, I see. It's not that we don't trust lfb_depth, it's just that it's wrong to assume that's the BPP ?
<javierm>
so it's only a problem of format selection
<javierm>
IOW, always lfb_depth == (lfb_linelength * 8 / lfb_width) but the error is assuming that lfb_depth == bpp and that's why you used the color bits to calculate it ?
<javierm>
tzimmermann: perfect, then if you trust lfb_linelength and lfb_width, you can just do: bpp = (lfb_linelength * 8 / lfb_width)
<javierm>
I don't see why the calculation has to be so complicated with 3 max and using the color bits
<javierm>
tzimmermann: since the SIMPLEFB_FORMATS array has the bpp that can be used to match the format
<javierm>
we are trusting those anyways to calculate the I/O resource mem size as mentioned so I don't see why they can't be trusted for the BPP too. That was my rationale in v2
<javierm>
tzimmermann: but up to you, I'm OK with Pierre's patch too
<tzimmermann>
javierm, in an email reply, i've given the example of allocating 800x600, which gives a stride of 832 pixels (at 4 bytes per pixel): 832 × 4 × 8 ÷ 800 = 33.28
<tzimmermann>
that's 33 bits_per_pixel. that's what I meant with 'bpp and linelength are only vaguely connected'.
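The arithmetic in that example can be checked with a short sketch. It assumes, as in tzimmermann's example, an 800-pixel-wide mode whose lines are padded to 832 pixels at 4 bytes per pixel; the function name is hypothetical.

```python
def bpp_from_stride(linelength_bytes, width_pixels):
    """Naively derive bits per pixel from the line length and the width."""
    return linelength_bytes * 8 // width_pixels


width = 800            # visible pixels per line
bytes_per_pixel = 4    # a 32-bit format
padded_width = 832     # line padded for alignment (value from the chat)
linelength = padded_width * bytes_per_pixel  # 3328 bytes per line

# With padding, the naive formula lands on a value that matches no format.
print(bpp_from_stride(linelength, width))               # 33

# Only an unpadded line gives back the real 32 bits per pixel.
print(bpp_from_stride(width * bytes_per_pixel, width))  # 32
```

The line length is therefore an upper bound on the space a line needs, not a reliable way to recover the pixel format.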
<tzimmermann>
sorry for all the mess and confusion here
<javierm>
tzimmermann: got it. On the contrary, thanks for the clarifications and sorry for my confusion
devilhorns has joined #dri-devel
glennk has quit [Remote host closed the connection]
glennk has joined #dri-devel
frankbinns has joined #dri-devel
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
jkrzyszt has joined #dri-devel
frieder_ has joined #dri-devel
frieder has quit [Read error: Connection reset by peer]
JohnnyonFlame has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<karolherbst>
MrCooper: ohh interesting, maybe I should run piglit on radeonsi then
<karolherbst>
but normally that shouldn't happen :)
<MrCooper>
karolherbst: to be clear, that was without your !22506
<karolherbst>
right, but that one doesn't fix anything really
<karolherbst>
just adds more validation
<karolherbst>
the CL spec already requires the unused dimension to be set correctly, so I'm mostly curious what happens with piglit here
<karolherbst>
could be also a piglit bug
<MrCooper>
interesting
<tzimmermann>
javierm, can you review some fbdev patchsets?
<karolherbst>
could also be that the crash is when piglit checks for the error code to be returned :)
<javierm>
tzimmermann: sure, I didn't because I thought they were all reviewed by the drivers' maintainers
<karolherbst>
either way, that MR doesn't fix the problems I'm seeing with darktable. Interestingly enough, I do see them also on llvmpipe and older intel gens.. so might be some weird general problem. Maybe something annoying like clamping behavior
<javierm>
tzimmermann: I see that Marek only provided his tested-by but not ack/review for "[PATCH 0/5] drm/exynos: Convert fbdev to DRM client"
<tzimmermann>
javierm, exynos is on its way into drm-next
<tzimmermann>
the maintainer sent the PR today
<javierm>
tzimmermann: ah, Ok. I'll review armada then after a meeting
<tzimmermann>
thanks a lot
illwieckz has quit [Ping timeout: 480 seconds]
<karolherbst>
uhm.. I think I just realized I pushed to the wrong drm-misc branch. If I want to get in a fix for something that exists in 6.3-rc1, it should go through drm-misc-next-fixes, no?
<karolherbst>
or drm-misc-fixes?
<karolherbst>
or is next-fixes for before rc1 was tagged?
illwieckz has joined #dri-devel
anand has joined #dri-devel
kzd has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
pendingchaos_ is now known as pendingchaos
<eric_engestrom>
am I missing something, or is there zero CI coverage of the WSI right now?
<eric_engestrom>
from what I can tell, not a single vulkan driver has HWCI_START_WESTON or HWCI_START_XORG in its test jobs; is there another way to start them that I'm missing?
yuq825 has quit [Remote host closed the connection]
<javierm>
tzimmermann: .fb_destroy was the callback that was executed only when the last client closed the fbdev fb right ?
<javierm>
even after module removal / unregister
Dr_Who has joined #dri-devel
<tzimmermann>
javierm, yes, it's the cleanup after the fbdev device has been removed. it's called from framebuffer_release() IIRC
<javierm>
tzimmermann: yeah, just checked it. Just wanted to be sure that I remembered it correctly
<javierm>
tzimmermann: r-b patches #3 and #4, didn't for #1 and #2 because I see that Sui already did for those
<tzimmermann>
great, thank you so much. I'll leave the patchset around for a bit. maybe the maintainers still want to comment
<javierm>
tzimmermann: you are welcome
<javierm>
tzimmermann: any other patchset that you are missing a r-b or that is the last one from your series ?
<tzimmermann>
javierm, so let me introduce you to that 100+ patches series that i've been working on for a while....
<tzimmermann>
just kidding
Haaninjo has joined #dri-devel
<tzimmermann>
there's still a similar series for i915. jani didn't have the time to review it. but i got errors from the CI. i'd first have to look through them and see if they are related
<javierm>
tzimmermann: Ok
iive has joined #dri-devel
anand has quit [Remote host closed the connection]
Jasen has joined #dri-devel
pa- has joined #dri-devel
pa has quit [Ping timeout: 480 seconds]
aravind has quit [Ping timeout: 480 seconds]
dcz_ has joined #dri-devel
i509vcb has quit [Quit: Connection closed for inactivity]
<DavidHeidelberg[m]>
jenatali: nice, thank you much ;)
<DavidHeidelberg[m]>
I'll try it today
<jenatali>
Not sure if there's a better way to detect the use of the MinGW headers which is the real problem, 'cause there's also a Clang included in e.g. MSYS2
<jenatali>
There's other workarounds like changing the include order that I guess we could do instead
stuarts has joined #dri-devel
lynxeye has quit [Quit: Leaving.]
devilhorns has quit []
tarceri_ has joined #dri-devel
frieder_ has quit [Ping timeout: 480 seconds]
tarceri has quit [Ping timeout: 480 seconds]
<karolherbst>
I wonder, are fixes like this valid enough to push them through fixes? https://patchwork.freedesktop.org/series/116536/ I don't think any of those actually fix bugs, they just mostly clean up bad style code or resolve undefined behavior
<mripard>
karolherbst: yeah, if it doesn't actually fix anything there's no real reason to make them go through fixes
<mripard>
take extra care with Markus Elfring patches though, he has (or used to have) a very bad reputation
<karolherbst>
yeah
<karolherbst>
the patches are all fine tho
<karolherbst>
just dealing with &NULL->some_field things
<karolherbst>
mostly
<karolherbst>
not sure what's the reason for bad reputation, just if it's because of stuff like that, I wouldn't mind as much. Though I'd still drop Fixes tags unless people don't care either way
<karolherbst>
or rather, those patches just move such dereferences behind null pointer checks
<karolherbst>
and the other part is seq_printf -> seq_puts/putc
<MrCooper>
karolherbst: the bad reputation is because his patches tend to be trivial and some of them incorrect
<karolherbst>
not sure if that alone justifies bad rep
bmodem has quit [Ping timeout: 480 seconds]
<MrCooper>
indeed, he also tends to turn review feedback into pointless arguments
<karolherbst>
ah yeah, that's more of an issue
tursulin has quit [Ping timeout: 480 seconds]
i509vcb has joined #dri-devel
Jasen has quit []
<mdnavare>
daniels: Were you able to update the ssh keys for me so I can set up my dim?
<mdnavare>
daniels: My fd.o username is mdnavare, how do I change that to match my login name from my Google linux machine?
<mdnavare>
and I will upload the ssh keys in the plain text format to the ticket
gouchi has quit [Remote host closed the connection]
mbrost has joined #dri-devel
pa has joined #dri-devel
thellstrom has quit [Ping timeout: 480 seconds]
pa- has quit [Ping timeout: 480 seconds]
kzd has quit [Quit: kzd]
mbrost has quit [Ping timeout: 480 seconds]
kzd has joined #dri-devel
heat has quit [Remote host closed the connection]
rasterman has joined #dri-devel
heat has joined #dri-devel
mbrost has joined #dri-devel
sauce has quit []
heat has quit [Remote host closed the connection]
heat has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
<kchibisov>
What is :cs0 thread created by mesa responsible for?
<Kayden>
that's the radeonsi command submission thread created by util_queue in src/gallium/winsys/amdgpu
<Kayden>
radeonsi enqueues jobs, and that thread takes those and actually submits them to the kernel
<Kayden>
(assuming you're on radeonsi; maybe other drivers have "cs" threads too that I'm not aware of)
<kchibisov>
Yeah, I only have amdgpu hardware.
<Kayden>
at one point it was created to hide the latency of command submission in the kernel, but at this point threaded submission is assumed to exist for some fencing stuff, I think.
<kchibisov>
So it's basically a transmission channel of everything my application is doing to the kernel.
<Kayden>
not quite everything - radeonsi constructs batches of GPU commands (set state, draw, and so on) and asks the kernel to execute those batches. those batch submissions are what go through the :cs0 thread, IIRC.
<Kayden>
other things, like "allocate me a buffer" can just go directly to the kernel without going through that queue
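As a rough model of what Kayden describes, the sketch below uses one worker thread that drains a queue of command batches in order, while other requests would simply bypass the queue. It is a toy in Python, not radeonsi or util_queue code; every name in it (SubmissionQueue, submit_to_kernel, run_demo, the "demo:cs0" thread name) is hypothetical.

```python
import queue
import threading


class SubmissionQueue:
    """Toy model of a single command-submission worker thread."""

    def __init__(self, submit_to_kernel):
        self._jobs = queue.Queue()
        self._submit = submit_to_kernel
        self._worker = threading.Thread(
            target=self._run, name="demo:cs0", daemon=True
        )
        self._worker.start()

    def enqueue(self, batch):
        """Called by the driver thread; returns without waiting for submission."""
        self._jobs.put(batch)

    def _run(self):
        # The worker takes batches in FIFO order and submits each one.
        while True:
            batch = self._jobs.get()
            if batch is None:  # sentinel: no more work
                break
            self._submit(batch)

    def close(self):
        """Flush all pending batches, then stop the worker."""
        self._jobs.put(None)
        self._worker.join()


def run_demo():
    submitted = []  # stands in for "the kernel received this batch"
    cs = SubmissionQueue(submitted.append)
    cs.enqueue(["set_state", "draw"])
    cs.enqueue(["draw"])
    cs.close()
    return submitted


print(run_demo())  # [['set_state', 'draw'], ['draw']]
```

Because there is a single worker and a FIFO queue, batches reach the submit callback in the order they were enqueued, while the enqueueing thread is free to keep building the next batch.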
<Kayden>
having trouble with it?
<kchibisov>
I have frequent page faults with my pure gles2 boring applications.
<kchibisov>
Which mentions cs0 thread. Though, I have a feeling that GPU firmware is at fault.
<kchibisov>
At this point, I'm trying to reduce vector or maybe developing some workaround in the application(at least for myself).
<Kayden>
ah, GPU hangs?
<kchibisov>
They don't really hang.
<kchibisov>
It's more like "we try to handle" -> reset or recovered.
<Kayden>
right
<Kayden>
#radeon may be able to help too, though sounds like filing a bug against radeonsi at https://gitlab.freedesktop.org/mesa/mesa/-/issues with some way to reproduce it might be the way to go (if you haven't already)
<kchibisov>
Kayden: oh, there're bugs already for that.
<Kayden>
ah
<kchibisov>
But I have a feeling that unless you fix it yourself there's no luck.
* Kayden
doesn't actually work on amdgpu...just read through that code a bit trying to understand how other drivers handle certain things
smilessh has joined #dri-devel
<kchibisov>
I could easily make my program robust, though, the issue is that it won't help me :p
<kchibisov>
Because my compositor is not robust yet.
<kchibisov>
But it's not really a "solution".
<airlied>
if you are getting faults from a gl or gles program, there is either a bug in your app or the shader compiler
<kchibisov>
airlied: the app works fine for 2-3 years and crashes only for me on a specific GPU.
<kchibisov>
And the crashes are a recent thing.
<airlied>
okay so might be a bug in the shader compiler or just some undefined behaviour raising its head
<kchibisov>
I could share the shaders, they are sort of simple.
<kchibisov>
I know that I get the crash with gles2 shaders, OpenGL 3.3 shaders, and when using zink as well.
jdavies has joined #dri-devel
<kchibisov>
we could continue in #radeon, airlied if it's better.
jdavies is now known as Guest11395
Guest11395 has quit [Remote host closed the connection]