<danvet>
emersion, yeah I think many folks from an fbdev background approach this with "display should be easy"
<whald>
hi! when doing a gbm_bo_map with GBM_BO_TRANSFER_WRITE, am i supposed to write the whole mapped region or are partial updates ok? with intel iris partial updates seem ok, but i get horrible artifacts on amdgpu.
<MrCooper>
whald: if not writing to the whole specified rectangle, need to pass GBM_BO_TRANSFER_READ as well (which will have a performance cost)
<whald>
MrCooper, this is what i'm seeing, right. good to know that's expected. then i'll have to evaluate if mapping (write only) multiple smallish regions is faster than mapping and writing the whole thing. adding GBM_BO_TRANSFER_READ is out of the question for me because it's much too slow.
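For reference, a minimal C sketch of the rule above, assuming a 32bpp format and a caller-supplied source buffer (both hypothetical). GBM_BO_TRANSFER_WRITE alone is only safe when every byte of the mapped rectangle gets written:

    #include <stdint.h>
    #include <string.h>
    #include <gbm.h>

    /* Write an entire w x h rectangle at (x, y).  Because the whole mapped
     * rectangle is overwritten, GBM_BO_TRANSFER_WRITE alone is enough; if
     * some mapped pixels were left untouched, GBM_BO_TRANSFER_READ_WRITE
     * would be needed so a staging-buffer backend reads the old contents
     * back first. */
    static int
    write_full_rect(struct gbm_bo *bo, const uint8_t *src, uint32_t src_stride,
                    uint32_t x, uint32_t y, uint32_t w, uint32_t h)
    {
        uint32_t stride = 0;
        void *map_data = NULL;
        uint8_t *dst = gbm_bo_map(bo, x, y, w, h, GBM_BO_TRANSFER_WRITE,
                                  &stride, &map_data);
        if (!dst)
            return -1;

        for (uint32_t row = 0; row < h; row++)        /* assumes 4 bytes/pixel */
            memcpy(dst + row * stride, src + row * src_stride, w * 4);

        gbm_bo_unmap(bo, map_data);
        return 0;
    }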
<lynxeye>
MrCooper: Huh? Partial updates should be okay with a pure write mapping, as long as you don't tell the driver to discard the map range, which you can do through GBM.
<MrCooper>
lynxeye: if that is allowed, a driver which uses a staging buffer for the mapping always has to behave as if GBM_BO_TRANSFER_READ was specified
<lynxeye>
MrCooper: Right, that's why you have map flag DISCARD_RANGE in the gallium interface to allow the driver to skip the readback of GPU memory.
<pq>
MrCooper, that interpretation of the read/write flags is not how I would have understood the docs IIRC, but now that you said it, it's kind of obvious in hindsight.
<MrCooper>
lynxeye: not sure what you mean; how would the implementation know if / which parts can be skipped?
<pq>
lynxeye, there is no such discard flag in the GBM bo API.
<lynxeye>
MrCooper: Partial updates require the driver to read back the GPU memory into the staging buffer. If you specify DISCARD_RANGE, you promise not to do a partial update, or that you don't care about the content of the buffer regions you don't fill. So only with DISCARD_RANGE can the driver assume you won't do a partial update of the mapped region.
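A hedged sketch of the gallium-side interface being referred to here, using the Mesa-internal u_inlines.h helpers (helper and flag names from memory; upload_range is a hypothetical caller):

    #include <stdbool.h>
    #include <string.h>
    #include "util/u_inlines.h"

    /* Caller-side view of the flag semantics discussed above. */
    static void
    upload_range(struct pipe_context *pipe, struct pipe_resource *buf,
                 unsigned offset, unsigned size, const void *data,
                 bool whole_range_written)
    {
        struct pipe_transfer *xfer;
        unsigned access = PIPE_MAP_WRITE;

        if (whole_range_written)
            /* Promise: the whole mapped range is rewritten (or its old
             * contents don't matter), so no readback is needed. */
            access |= PIPE_MAP_DISCARD_RANGE;
        else
            /* Partial update: old data in the skipped bytes must survive. */
            access |= PIPE_MAP_READ;

        void *map = pipe_buffer_map_range(pipe, buf, offset, size, access, &xfer);
        if (!map)
            return;

        memcpy(map, data, size);
        pipe_buffer_unmap(pipe, xfer);
    }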
<MrCooper>
pq: yeah, the documentation could probably be more specific about the staging buffer case
<lynxeye>
pq: Right, which is why I'm puzzled by the amdgpu result. Partial update on a write mapping should work.
<pq>
lynxeye, seems to be unspecified - and also prohibitively slow for whald, it sounds like.
<pq>
seems that in the gbm_bo_map() API, discard is the default, and if that's a problem, you need to set TRANSFER_READ.
<pq>
if that's what the implementations that use a staging buffer have always done, then so be it - the doc is just missing
<MrCooper>
lynxeye: the implementation would need to copy any parts not written by the caller from the real buffer to the staging one first, otherwise the reverse copy on unmap copies garbage for them
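To illustrate the point, a toy sketch (not any real driver; the flag names are stand-ins and memcpy stands in for actual GPU readback/writeback) of how a staging-buffer mapping behaves:

    #include <stddef.h>
    #include <string.h>

    enum { XFER_READ = 1 << 0, XFER_WRITE = 1 << 1 };  /* stand-in flags */

    struct staging_map {
        unsigned char *gpu;      /* pretend GPU-resident storage   */
        unsigned char *staging;  /* CPU-visible scratch copy       */
        size_t size;
        unsigned flags;
    };

    static void *
    staging_map_range(struct staging_map *m, unsigned flags)
    {
        m->flags = flags;
        /* Readback happens only when READ was requested; without it,
         * any byte the caller never writes stays garbage in staging. */
        if (flags & XFER_READ)
            memcpy(m->staging, m->gpu, m->size);
        return m->staging;
    }

    static void
    staging_unmap_range(struct staging_map *m)
    {
        /* The whole range is copied back, touched or not, which is why
         * a partial write-only update can push garbage into the buffer. */
        if (m->flags & XFER_WRITE)
            memcpy(m->gpu, m->staging, m->size);
    }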
<lynxeye>
pq: No, at least that's not what the implementation does. The map flags are passed through to the dri frontend, which rewrites them to gallium flags. There is nothing adding the discard flag.
<MrCooper>
the discard flag is irrelevant
<MrCooper>
the staging buffer doesn't magically get the current contents of the real buffer
<lynxeye>
MrCooper: No, the driver copies the mapped range from real to staging on map if you don't specify discard.
<lynxeye>
Exactly because it doesn't know if you are going to do partial updates. By adding the discard flag, you promise to not do partial update, or don't care about getting garbage in the unwritten regions.
<emersion>
are there performance penalties when mmapping a dumb buffer with PROT_READ as well?
<MrCooper>
lynxeye: it depends on PIPE_MAP_READ as well (which corresponds more or less directly to GBM_BO_TRANSFER_READ)
<MrCooper>
not sure why you keep bringing up DISCARD_RANGE, since there's no such GBM flag
<whald>
and is seeing garbage on partial writes a good indicator that a staging buffer is used? i was hoping to get intel-like behaviour on the amd APU as well, because it's unified memory like intel as far as i know.
<MrCooper>
emersion: CPU reads can be extremely slow with dGPUs, other than that not sure
<lynxeye>
MrCooper: Because I would expect a GBM write mapping to behave as if you don't specify DISCARD_RANGE, which means from my interpretation of the gallium API that partial updates should work.
<emersion>
my main question is, is it a big deal to avoid setting PROT_READ if i'm not going to read the buffer?
<MrCooper>
lynxeye: Gallium resource contents are read back only if PIPE_MAP_READ is set
<MrCooper>
for the same reasons
<MrCooper>
emersion: I doubt it matters for performance; I'd rather be worried about implicit reads, e.g. for unaligned writes with some platforms (not sure that can actually happen though)
<emersion>
ok
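For context, a minimal sketch of the mapping being asked about: a DRM dumb buffer mapped write-only (drm_fd and the dumb-buffer handle are assumed to come from DRM_IOCTL_MODE_CREATE_DUMB elsewhere):

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/mman.h>
    #include <xf86drm.h>

    static void *
    map_dumb_write_only(int drm_fd, uint32_t handle, size_t size)
    {
        struct drm_mode_map_dumb arg = { .handle = handle };

        if (drmIoctl(drm_fd, DRM_IOCTL_MODE_MAP_DUMB, &arg) != 0)
            return MAP_FAILED;

        /* PROT_WRITE only: nothing here reads pixels back.  Per the
         * discussion, omitting PROT_READ probably doesn't matter much;
         * the thing to avoid is CPU reads of the mapping, which can be
         * extremely slow on discrete GPUs. */
        return mmap(NULL, size, PROT_WRITE, MAP_SHARED, drm_fd, arg.offset);
    }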
<MrCooper>
lynxeye: it may be a bit weird, but it's the only way (given the current GBM API) an implementation using a staging buffer can ever skip the copy from the real buffer to the staging one
<lynxeye>
MrCooper: Hm, yeah, I seem to be misremembering some details of the API here.
<imirkin>
i guess it depends which part of the stack you're looking to focus on
<danvet>
pq, emersion just wanted to thank you for all your dri-devel kms uapi review
<danvet>
you're bringing up all the nonsense so that by the time I catch up on mails there's nothing for me to do :-)
<imirkin>
wow, that bootlin graphics deck is pretty intense.
<imirkin>
i'm going to try to remember that one.
<jenatali>
Wow, that's an excellent presentation
<jenatali>
Wish I'd seen that a couple years ago
<ccr>
I had never heard of Xgl before reading that
<bnieuwenhuizen>
emersion: if you map VRAM on AMDGPU with read I think it will move it to system memory temporarily for the time of the map. Not sure what dumb buffers end up with
<emersion>
danvet: glad to help!
<mareko>
I just noticed that uploading to VRAM has higher CPU overhead than uploading to uncached GTT
<bnieuwenhuizen>
mareko: my testing says memcpy to VRAM is about 10% slower. Is that what you're seeing?
<mareko>
probably more
<mareko>
I'm talking about descriptor uploads, which are sparse
<airlied>
imirkin: does dEQP-GLES3.functional.* pass for you on i965?
<imirkin>
airlied: i'm on a SKL atm. and ... don't have dEQP here. sorry
<airlied>
imirkin: yeah was just wondering if you have any historical memory
<imirkin>
airlied: sorry, no. i never ran it personally on i965. it all passed on crocus, except xfb though
<imirkin>
and a couple of depth tests here and there
<airlied>
seeing dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler3d_fixed_fragment fail for me both 965 and crocus
<imirkin>
i don't think i saw that, but more generally iirc there used to be some problems with such tests that have been fixed in newer deqp?
<imirkin>
otoh, the deqp i was running was probably quite old. maybe the problems are new? dunno
<imirkin>
i'll try to check into it tonight
<imirkin>
airlied: does it fail on i965 for the intel guys too? i.e. is it in their fail lists in the intel ci?