ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
<anholt> gawin: we already convert unsupported operations in nir.
<gawin> do you remember in which file? i'd add a few more
<gawin> it'd be superb to move "transform_r300_vertex_CMP" into NIR
mlankhorst has quit [Ping timeout: 480 seconds]
<anholt> I guess you could make an r300-specific cmp, but I think most of our wins would come from sorting out has_fused_comp_and_csel for ntt
<anholt> or once we land the tgsi ra series, then we do some greedy csel fusing in ntt.
<anholt> note that "make an r300-specific cmp" would involve both a nir and tgsi opcode, and we don't really make new tgsi opcodes these days.
<anholt> so at that point you're probably actually talking about making yourself a copy of ntt as the frontend for r300's compiler.
pcercuei has quit [Quit: dodo]
ybogdano has joined #dri-devel
<zmike> airlied: well I filed the issue and got a pretty quick reply, so it looks like my compiler patch is correct and this is a longstanding bug that nobody else has managed to hit
<gawin> anholt: I was thinking about getting rid of all "transforms", which are introducing new temps (for example opcodes which are implemented by 3 or more opcodes). except moving into NIR, we could either create some garbage collector (implementing SSA) or manually cleanup after each "transform" (maybe it'd require rewriting allocator). I guess first option is simplest and easiest to do(?)
<gawin> iirc float_const_write_dynamic_loop_read_vertex is failing because "transform_r300_vertex_CMP" is grabbing too many temps
<anholt> register allocation should make an arbitrary number of cmps take just 1 more (at worst) total temp.
<anholt> if that's not the case, then you need to fix your RA.
<anholt> (spoilers: radeon does, in fact, need to fix its ra)
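A minimal sketch of anholt's point about register allocation, written in Python rather than taken from the r300 compiler. It assumes each lowered CMP defines one fresh temp at its first instruction and last reads it at its last instruction, so the live ranges of those temps never overlap and a liveness-aware allocator can map all of them to the same register. The three-instruction expansion and the names `live_ranges_for_lowered_cmps` and `allocate` are hypothetical, not Mesa APIs.

```python
def live_ranges_for_lowered_cmps(count, ops_per_cmp=3):
    """Return (first_def, last_use) instruction indices, one range per lowered CMP."""
    return [(i * ops_per_cmp, i * ops_per_cmp + ops_per_cmp - 1)
            for i in range(count)]


def allocate(ranges):
    """Greedy linear scan over live ranges.

    Returns (assignment, register_count), where assignment[i] is the
    register given to the i-th range in sorted order.
    """
    free = []        # registers released by ranges that have ended
    active = []      # (last_use, register) pairs for ranges still live
    assignment = []
    next_reg = 0
    for start, end in sorted(ranges):
        # Release every register whose range ended before this one starts.
        still_active = []
        for a_end, reg in active:
            if a_end < start:
                free.append(reg)
            else:
                still_active.append((a_end, reg))
        active = still_active
        # Reuse a released register if there is one, else take a new one.
        if free:
            reg = free.pop()
        else:
            reg = next_reg
            next_reg += 1
        active.append((end, reg))
        assignment.append(reg)
    return assignment, next_reg


if __name__ == "__main__":
    _, used = allocate(live_ranges_for_lowered_cmps(100))
    print("registers needed for 100 lowered CMPs:", used)
```

With non-overlapping ranges the allocator never needs a second register, which is the "1 more (at worst) total temp" behaviour described above. An allocator that hands out a new temp per transform without tracking liveness would need 100.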
<gawin> for now storing index and reusing all over is good idea?
ybogdano has quit [Ping timeout: 480 seconds]
italove31 has quit []
italove31 has joined #dri-devel
<airlied> zmike: I disagree with his assessment, since the GLSL spec doesn't apply, it's the GLSL ES spec that needs the language
Ristovski has joined #dri-devel
<zmike> airlied: I guess we'll see what he says
<airlied> I don't see that sort of language anywhere in the GLSL ES spec
Lucretia-backup has quit [Remote host closed the connection]
<airlied> adding an "if targeting OpenGL" to that spec would clarify it alright
Lucretia-backup has joined #dri-devel
ced117 has quit [Remote host closed the connection]
ced117 has joined #dri-devel
fxkamd has quit []
<maxzor> FLHerne, please don't sue me for plagiarism https://en.wikipedia.org/wiki/AMDgpu_(Linux_kernel_module)
<maxzor> one could argue that someone could have done it during these years.
<maxzor> probably won't last long due to lack of reliable sources...
<airlied> maxzor: amdgpu isn't developed under ROCm
<airlied> amdkfd is the rocm component of the amdgpu driver
<airlied> the driver is developed on the amd-gfx mailing list, and internally at AMD by whatever teams have requirements on it I suppose
<maxzor> do you still maintain this hard distinction within amdgpu.ko?
<airlied> amdkfd is now just part of amdgpu
<airlied> but there is still a large part of kfd specific code in there
columbarius has joined #dri-devel
<airlied> and you can turn off KFD completely still with CONFIG_HSA_AMD
<maxzor> airlied, do you know a place where one can read a summary of the difference between the designs of amdkfd and other graphics drivers? Read on the CRIU thread that amdkfd was subject to mid-term change/redesign also.
co1umbarius has quit [Ping timeout: 480 seconds]
<airlied> nope never seen it written down in a neat explanation
<airlied> amdkfd isn't a graphics driver
<maxzor> right
<airlied> it exists solely to provide userspace command submission to firmware provided compute queues
<maxzor> why is courbet not doing it ':)
<airlied> amdkfd was probably a bad idea in hindsight
<airlied> since they tried to move away from the fd model to a process vm model
<airlied> so multiple GPUs could share a process VM and have the kernel manage it all
<maxzor> knowing virtually nothing to the graphics pipeline I won't ask relevant questions on this part :<
<maxzor> thank you for inputs!
<airlied> also I don't think amdgpu is explicitly tested on fd.o gitlab
<airlied> fd.o gitlab is used for mesa driver testing, don't think anyone has graduated it to testing kernel drivers yet
<maxzor> have you been approached for rocm mesa interop yet?
<maxzor> hip has patches mentioning vulkan, not sure if it is reserved to amdvlk/pal
ngcortes has quit [Remote host closed the connection]
<maxzor> I started assembling the understanding yesterday that the design differences that you just mentioned could be a hindrance to this interop, and that they had not been worked through yet
V_ has quit [Remote host closed the connection]
<airlied> maxzor: I think they only care about pal
<maxzor> -.-
<airlied> zmike: btw do you have a gles build of the gtf test suite? I wonder if those tests get executed there as well
<airlied> granted you don't need GTF for latest gles conform, but older ones all used it
<zmike> airlied: no, I only build for gl
<zmike> I think you'd have to check the mustpass list anyway to find out?
<airlied> oh mustpass good plan, I'll find an old one
<airlied> yes those are in the gles3.2 mustpass
lplc has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
<zmike> according to jekstrand the atan tests can't pass as fp16, so it seems we're at an impasse
<graphitemaster> speaking about barriers, if you have a draw call which writes to a texture in an FBO and you read that texture in a compute shader (say with texelFetch), what barrier is actually necessary here?
<graphitemaster> The glMemoryBarrier spec is very clear about it only being valid for *writes from shaders* not writes via the framebuffer
<graphitemaster> Right now I just have FETCH barrier but that doesn't seem correct to me
<graphitemaster> FRAMEBUFFER barrier is when you have a draw call that will write to a texture (via FBO attachments) that was previously written via a shader (imageStore)
<imirkin> when you say "draw call which writes to a texture", you just mean you're drawing a tri onto some fb, right?
<imirkin> not like imageStore or whatever?
<graphitemaster> Yeah
<imirkin> iirc you don't need to do anything
<graphitemaster> imageStore specifically you'd use SHADER_IMAGE_ACCESS barrier for
<jekstrand> zmike: Sure... Blame it all on me. :-P
<graphitemaster> So writes from a draw call (to framebuffer attachments) are implicitly synchronized with compute dispatches that read those attachments?
<imirkin> graphitemaster: and later draws, yes
<graphitemaster> That's not true in Vulkan
<imirkin> forget compute for a second
<graphitemaster> What is the purpose of glTextureBarrier then
<imirkin> let's say you want to do a 2-pass render
<zmike> jekstrand: whoa whoa calm down buddy I'm just citing your expert assessment
<imirkin> that's when you're reading the framebuffer from the same shader as which is writing it
<zmike> I just wanna get this failboat uncapsized
<graphitemaster> This says something different so I'm really confused
<imirkin> the first part agrees with me :)
<imirkin> anyways... nha tends to know this stuff. he did at least part of the images bringup in mesa iirc
<graphitemaster> Wait how does it agree with you, am I reading it wrong
<imirkin> he says that glMemoryBarrier is not for you :)
<graphitemaster> Yeah that I know
<imirkin> that part agrees with me too :)
<graphitemaster> Okay well we're in agreement there
<graphitemaster> Do I need texture barrier though :D
<imirkin> ARB_texture_barrier has this text:
<imirkin> This extension relaxes the restrictions on rendering to a currently
<imirkin> bound texture and provides a mechanism to avoid read-after-write
<imirkin> hazards.
<imirkin> so this is about reading and writing to the same texture in the same shader
<graphitemaster> Oh
<imirkin> i.e. having a texture bound both for sampling and framebuffer
<imirkin> basically this is only allowed if you're only reading the "current" location
<graphitemaster> I didn't even think that was possible without something like fragment interlock
<imirkin> and glTextureBarrier() is a way to let you flush the texture cache since you just updated everything
<graphitemaster> Or framebuffer fetch
<imirkin> and this only works if there's no over-draw
<imirkin> i.e. it's for a very very very limited use-case
V has joined #dri-devel
<graphitemaster> So then in OpenGL every compute dispatch needs to complete before a draw can even begin
<graphitemaster> There's literally no room for overlap
<imirkin> no
<imirkin> other way. and only if the compute shader is sampling from the fbo
<graphitemaster> Or writing to it
<imirkin> no
<imirkin> compute shader can't "write" to the fbo in the same natural way
<graphitemaster> Oh but if it's writing to it you need FRAMEBUFFER barrier before the draw
<imirkin> it'd have to use imageStore/etc
<imirkin> exactly.
<graphitemaster> Makes sense
<imirkin> anyways ... this is my understanding.
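The rule of thumb that graphitemaster and imirkin converge on can be sketched as a small lookup, in Python for brevity. It models only the understanding stated in this conversation: ordinary framebuffer writes from a draw need no glMemoryBarrier before a later pass reads them, while writes made by a shader through imageStore need a barrier whose bit names how the data will be consumed next. The function `barrier_bits` and its string arguments are hypothetical. The three constants are the glMemoryBarrier bit values as I recall them from the GL headers, so treat them as an assumption to check.

```python
# glMemoryBarrier bits (values assumed from the OpenGL headers).
GL_TEXTURE_FETCH_BARRIER_BIT = 0x00000008
GL_SHADER_IMAGE_ACCESS_BARRIER_BIT = 0x00000020
GL_FRAMEBUFFER_BARRIER_BIT = 0x00000400

# How the next pass will consume the data -> barrier bit to request.
_BIT_FOR_CONSUMER = {
    "texture_fetch": GL_TEXTURE_FETCH_BARRIER_BIT,       # texelFetch / texture()
    "image_access": GL_SHADER_IMAGE_ACCESS_BARRIER_BIT,  # imageLoad / imageStore
    "framebuffer": GL_FRAMEBUFFER_BARRIER_BIT,           # used as an FBO attachment
}


def barrier_bits(written_by, consumed_as):
    """Return the glMemoryBarrier bits to issue between two passes.

    written_by:  "framebuffer" (normal draw output) or "image_store".
    consumed_as: "texture_fetch", "image_access" or "framebuffer".
    """
    if written_by == "framebuffer":
        # Framebuffer writes are ordered against later commands implicitly.
        return 0
    if written_by == "image_store":
        return _BIT_FOR_CONSUMER[consumed_as]
    raise ValueError("unknown writer: %r" % (written_by,))


if __name__ == "__main__":
    # Draw into an FBO, then texelFetch it from a compute shader: nothing needed.
    print(hex(barrier_bits("framebuffer", "texture_fetch")))
    # Compute shader imageStore, then draw with it attached to the FBO.
    print(hex(barrier_bits("image_store", "framebuffer")))
```

glTextureBarrier is deliberately left out of the table: as described above, it covers the separate case of sampling a texture that is also bound as the current render target.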
<HdkR> Time for Nouveau to add support for Maxwell PSI :P
<imirkin> PSI?
<HdkR> Pixel Shader Interlock
<imirkin> yeah. that'll happen.
<imirkin> NEVER
<graphitemaster> fragment interlock on NV is really slow
<HdkR> What, you don't want to RE the hundreds of instructions it takes to do PSI? :D
<graphitemaster> So you couldn't possibly do worse than the official driver
<imirkin> i'm going to do better
<imirkin> i'm going to give people the benefit of not providing that ext
<imirkin> and then they won't be tempted into doing stupid shit
<jekstrand> :)
<graphitemaster> In fact, interlock on the official driver turns my whole pass into the speed of nouveau on a modern NV GPU XD
<HdkR> haha
<jekstrand> It supposedly doesn't suck too bad on Intel
<airlied> zmike: I think they pass on llvmpipe :-P
<jekstrand> Once again, optimizing the things that don't matter...
<HdkR> ^ I've heard that as well
<airlied> if I turn off the constbuf stuff, but I haven't had time to check fully, maybe in a few hours I'll dig in
<graphitemaster> I think Intel has it for framebuffer fetch reasons
<imirkin> jekstrand: intel just serializes everything, so it's a no-op right? :p
<graphitemaster> And Intel has framebuffer fetch because they need to software blend the advanced blend equations
<graphitemaster> Since it's spec now right?
<imirkin> everyone has to do advanced blend in software
<graphitemaster> NV supports KHR_blend_equation_advanced in hardware
<HdkR> Everyone hates the photoshop modes
<zmike> airlied: I shouldn't have to disable that, and it still doesn't fix the atan tests
<imirkin> graphitemaster: huh? since when?
<graphitemaster> 3000?
<imirkin> ampere? ok, i can believe that.
<imirkin> or did you mean the year 3000? :p
<graphitemaster> haha
<graphitemaster> Oh god please tell me we're not using OpenGL in the year 3000
<jenatali> Is there any good lowering available already for those advanced blend modes?
<imirkin> jenatali: yea
<imirkin> you just have to implement a "fbfetch" op
<zmike> as long as you can do fbfetch
<jekstrand> jenatali: Yup. Just implement framebuffer_fetch_non_coherent and you get it "for free"
<jenatali> Excellent. In terms of framebuffer fetch or something else?
<imirkin> which you can lower into a texture load if you want
<jekstrand> If you can do coherent fbfetch, we automatically enable the coherent blend advanced extension
<imirkin> there isn't even support in mesa for a non-lowered impl :)
<graphitemaster> You can emulate fbfetch with another implicit R32UI and atomic increment the pixels on that then do a nice spinny boy on it
<imirkin> [would be addable, obviously, but it hasn't come up...]
<graphitemaster> That's what I have as a fallback :|
<graphitemaster> Spin locks in a fragment shader are always ... fun
<graphitemaster> I bet there is a less stupid way
<jekstrand> imirkin: I'm sure zmike will add it so he can hook it into VK_EXT_blend_equation_advanced to get at that sweet NV hardware... :-P
<imirkin> heheh
<graphitemaster> But I don't get to write drivers, just GL :|
<jenatali> I think we could do it with ROVs maybe. I'll get to that eventually
<imirkin> yeah, that nvidia does it in hw is completely news to me
<imirkin> but i haven't kept up with the latest
<zmike> jekstrand: I had actually forgotten about that, but it's on my list
<imirkin> insufficiently interested, tbh
<zmike> and I think gallium does support a non lowered impl
<zmike> it's got all the modes
<imirkin> it gets the info, but i thought it got the lowered thing. maybe someone made an option to pass through the originals ... svga maybe?
<graphitemaster> I didn't even see there is a BlendBarrierKHR, shrug
<airlied> zmike: not lowering precision pretty much disables that cap anyways
<imirkin> graphitemaster: theres a _coherent variant
<imirkin> which is supported by intel, naturally
<imirkin> (and all the arm drivers i think)
<graphitemaster> LOL
<graphitemaster> Intel is so weird
<imirkin> and then you don't have to do barriers
<jekstrand> And you can draw more than one triangle at a time.
<zmike> airlied: only for desktop
<imirkin> you can draw as many triangles as you want without _coherent
<zmike> and I don't have to pass atan in es
<imirkin> as long as there's no overdraw :)
<zmike> so I can figure out the mat ones and disable lowering for desktop gl and be fine
<graphitemaster> Would be nice if you could get access to the hardware blend units in compute shaders
<jekstrand> Let's start by figuring out the mat ones. Those look like they might be legit fails.
<jekstrand> zmike: ^^
<zmike> that's my plan during my meeting block tomorrow
<jekstrand> zmike: It's possible whatever bug is breaking those is affecting atan()
<airlied> is the problem I found in llvmpipe before with getuniform
<airlied> I was going to look into the matrix ones to see if it was similar
<HdkR> graphitemaster: That's just called a fragment shader :)
<jekstrand> zmike: But I still suspect atan() is something with fp16
<airlied> well we do lower atan in glsl to some stuff with 32-bit stuff in it
<graphitemaster> HdkR, okay but I need to do 3D voxel stuff and fragment shaders don't support 4096 layers, and using a geometry shader to emit those is sadge
<imirkin> graphitemaster: NV_viewport_array2?
* airlied would rather we at least have a root cause before we just go axing code paths
<jekstrand> sure
<zmike> I'm ok with this
<jekstrand> I'm generally a fan of figuring out why bugs exist rather than hacky workarounds.
<graphitemaster> imirkin, meh, also neh
* airlied assumes gles doesn't have glGetUniform
<imirkin> graphitemaster: or NV_geometry_shader_passthrough?
<airlied> oh it does
<graphitemaster> Can't set that many viewports
<imirkin> graphitemaster: oh yeah, good point. 16 viewports
<airlied> I suspect the fp16 constbuf stuff is buggy wrt GetUniform, and we just haven't tested it
<jekstrand> Yeah, when I looked at the mat test, the shaders looked fine. I suspect we've got a mismatch somewhere with uniform upload.
<jekstrand> But I don't know that code well at all.
* jekstrand doesn't do GL :-P
* imirkin definitely won't mention ARB_copy_image...
<airlied> yeah fixing the uniform upload to not lower to 16-bit fixes the mat tests
<jenatali> Oh there's a blend barrier. Okay that helps
<jenatali> I hadn't actually read that spec yet, just the brief and got scared
<zmike> airlied: nice!
<jekstrand> imirkin: My ARB_copy_image implementation no longer exists. :P
<imirkin> hehe
<imirkin> you piped it through core
<imirkin> that's still there
<jekstrand> jenatali: Yes, there's a barrier so you can do your flushes etc.
<jekstrand> imirkin: Yeah... There are a few lines in src/mesa/ that still blame to me. :P
<jenatali> Aka copy back or something
<graphitemaster> How much do you want to bet BlendBarrierKHR has the same function address as TextureBarrier on NV
<imirkin> graphitemaster: definitely does as i've implemented!
<graphitemaster> :D
<imirkin> just a texture flush
<imirkin> since the advanced blend thing is precisely the scenario that ARB_texture_barrier talks about
<graphitemaster> I wish over-draw was allowed and it was coherent
<graphitemaster> I'd have a sweet use for that
<jekstrand> Then you need Intel hardware!
<imirkin> with texture barrier?
<imirkin> or with advanced blend?
<graphitemaster> texture barrier
<imirkin> you can do that with ...
<imirkin> GL_EXT_shader_framebuffer_fetch
<imirkin> largely supported on arm
<graphitemaster> NV does not support it :P
<imirkin> right.
<jekstrand> Intel does!
<jekstrand> On Gen9+
<imirkin> like i said ... arm
<imirkin> :p
<imirkin> embedded
<imirkin> (j/k. mostly.)
<graphitemaster> The coverage is real bad, a whole 9.64%
<graphitemaster> Interlock is too slow too
<jekstrand> Not on Intel. :P
<graphitemaster> Okay but can Intel do NV_path_rendering
<graphitemaster> If we're talking weird shit for a moment
<jekstrand> CAN is an interesting word...
<HdkR> CAN? Probably. Want? Probably not.
<graphitemaster> Imagine putting a potscript and SVG parser in mesa
<graphitemaster> *postscript
<graphitemaster> potscript is funny tho
<Sachiel> displayport printers
<jekstrand> VK_KHR_printer_surface
<graphitemaster> You know, when NV_path_rendering came out it actually had something really cool, the paths rendered could be conservative - in the sense that even partial pixel overlap with a fragment could produce a fragment, ala conservative rasterization - before we had conservative rasterization extensions - so using it for shadow volumes would've been pretty cool
<jekstrand> I wonder how hard that'd be to implement... Can't be that hard to send an image to /dev/lp0
* airlied asked about fp16 glGetUniform interactions
<graphitemaster> My bad :| sorry - muddying up the IRC, I should get back to work.
<jekstrand> graphitemaster: No worries. This channel is 2/3 BS anyway. :)
<jekstrand> airlied: What interactions?
<jenatali> WGL had GDI metafile surfaces so that you could render to printable surfaces IIRC
<imirkin> NV_path_rendering supports SVG parsing in the GL. what could go wrong.
<graphitemaster> jekstrand, fine one last BS since it's topical https://github.com/graphitemaster/printer-display
<airlied> jekstrand: if we have 16-bit constant lowering, and you set a 32-bit uniform and read it back with glGetUniform we currently would lose precision
* airlied thinks PIPE_SHADER_CAP_FP16_CONST_BUFFERS might not be compliant as we have it implemented
<zmike> sure seems that way
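A small illustration of the precision loss airlied describes, assuming the uniform is stored as an IEEE 754 half float somewhere between glUniform and glGetUniform. It uses only Python's `struct` module; the helper names are hypothetical and nothing here calls into Mesa or GL.

```python
import struct


def through_fp32(value):
    """Round a Python float to IEEE 754 single precision and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def through_fp16(value):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


if __name__ == "__main__":
    uploaded = through_fp32(0.1)     # the 32-bit value the application set
    stored = through_fp16(uploaded)  # what survives a 16-bit constant buffer
    print("uploaded:", uploaded)
    print("read back:", stored)
    print("error:", abs(uploaded - stored))
```

Values such as 1.0 or 0.5 survive the round trip unchanged, so a test that only checks those would not notice the lowering. A value like 0.1 comes back visibly different.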
<jekstrand> graphitemaster: I love that you list "latency" as a reason not to use a printer. :D
<HdkR> Need some io_uring for that printer to remove some copies and lower latency </s>
<jekstrand> airlied: Ugh... I suspect you should be able to read it back exact...
<imirkin> airlied: if it's set as mediump in the shader? dunno
<jekstrand> HdkR: Blit via photocopier, anyone?
<imirkin> airlied: or if it's auto-lowered?
<graphitemaster> jekstrand, those 6 lines of C exist in every printer driver, fun fact.
<airlied> imirkin: if mediump
<imirkin> airlied: just like using GL_FLOAT uploads of texture data with RGBA8. you don't get the exact same thing back out ... dunno
<imirkin> airlied: what happens if you use glUniform1i with a float uniform? is that legal?
<airlied> imirkin: not sure actually
<imirkin> if it's legal, does it convert to float? and if it does, what happens if you stick something greater than 1<<23?
<imirkin> (you probably see where i'm going with this line of reasoning)
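The arithmetic behind imirkin's question can be checked without GL. The sketch below only shows what converting an integer to a 32-bit float does; whether glUniform1i on a float uniform is legal or converts at all is left open in the discussion. At 1<<23 the spacing between adjacent floats reaches 1, so integers are still exact there, and the first integer that cannot be represented is (1<<24) + 1. The helper `as_float32` is a hypothetical name.

```python
import struct


def as_float32(n):
    """Return the integer n after rounding to IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", float(n)))[0]


if __name__ == "__main__":
    for n in ((1 << 23) + 1, 1 << 24, (1 << 24) + 1):
        print(n, "->", as_float32(n))
```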
<airlied> imirkin: the spec is again vague :-P
<airlied> imirkin: though the interface for setting a float uniform takes a 32-bit value
<airlied> which is sorta different than setting an int and getting a float
<airlied> converting it to garbage on readback is different than converting it internally
<jekstrand> Do we actually implement GetShaderPrecisionFormat?
<airlied> yes from limits set in the state tracker
<jekstrand> So we do
<jekstrand> airlied: Yeah, I've got no idea what the spec means for glGetUniform()
<jekstrand> It says incredibly little
<imirkin> "get uniform" - how much more spec do you really need? it's in the function name!
<jekstrand> I expect that it should return the same value set through glUniform()
<jekstrand> And not lose precision
<jekstrand> But what do I know?
<imirkin> jekstrand: would you expert glGetTextureImage to return the same data you fed in via glTexImage?
<imirkin> expect*
<imirkin> coz you'd be sorely disappointed...
<jekstrand> No, but that's different, IMO
<jekstrand> For one thing, textures have an internal format that gives some idea of the precision
<jekstrand> I guess uniforms sort-of do
<jekstrand> It could go either way
<imirkin> yea
<imirkin> but think of it this way ... what's the benefit of glGetUniform? i.e. why would one ever do that?
<imirkin> other than writing a GTF test
<imirkin> tbh i can't think of any great use-cases. but ultimately you'd want to know the value that the shader was going to use
<jekstrand> imirkin: apitrace probably uses it
<imirkin> jekstrand: what for?
<jekstrand> To get uniforms, obviously. :P
<imirkin> it already captured the glUniform's as they were being "sent in"
<jekstrand> Yeah.
<imirkin> shouldn't need to "get" them, except maybe one of the "inspect" views
<jekstrand> So maybe not apitrace
<jekstrand> I do
<imirkin> in which case i think seeing the values the shader was using might be preferable?
<jekstrand> I don't know. I'm sure there's some app out there that found a "good" reason. :P
<jekstrand> ¯\_(ツ)_/¯
<imirkin> anyways, i'd argue for "return the real value"
<imirkin> the only counterargument is glGetCompressedTexture
<imirkin> where you're supposed to return the original data
<imirkin> if you e.g. decompressed it internally, you can't just recompress it (for a variety of reasons)
<jekstrand> ¯\_(ツ)_/¯
maxzor has quit [Ping timeout: 480 seconds]
mattrope has quit [Read error: Connection reset by peer]
nchery has quit [Read error: Connection reset by peer]
maxzor has joined #dri-devel
Duke`` has joined #dri-devel
jewins1 has joined #dri-devel
jewins has quit [Read error: Connection reset by peer]
<airlied> jekstrand, zmike : another data point is the float test passes, the vec2 one fails
<airlied> and I've built old gles conformance with gtf, it also fails
<airlied> and it's actually the reference shader failing
LexSfX has joined #dri-devel
mszyprow_ has joined #dri-devel
<airlied> jekstrand, zmike : submitted a fix for the atan gtf tests to kc-cts internal gitlab
LexSfX has quit []
mszyprow_ has quit [Ping timeout: 480 seconds]
LexSfX has joined #dri-devel
jewins1 has quit [Ping timeout: 480 seconds]
itoral has joined #dri-devel
sdutt has quit [Ping timeout: 480 seconds]
<airlied> jekstrand, zmike : the matrix fail seems to be lowering f16mat
<airlied> actually not sure how the matrix is going to be packed here
<airlied> seems to bug somewhere in mesa with lowering fmat ubo loads, vec3 vs vec4
Duke`` has quit [Ping timeout: 480 seconds]
tursulin has quit [Read error: Connection reset by peer]
mlankhorst has joined #dri-devel
danvet has joined #dri-devel
MajorBiscuit has joined #dri-devel
pnowack has joined #dri-devel
mvlad has joined #dri-devel
frieder has joined #dri-devel
gouchi has joined #dri-devel
frieder has quit [Remote host closed the connection]
frieder has joined #dri-devel
frieder has quit []
frieder has joined #dri-devel
gouchi has quit []
pnowack has left #dri-devel [#dri-devel]
pnowack has joined #dri-devel
tzimmermann has joined #dri-devel
Major_Biscuit has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
<mlankhorst> airlied: looks like just some DP_HELPER selects were missing
tursulin has joined #dri-devel
<danvet> tzimmermann, javierm I finally sent out my fbcon series including fbdev core maintainers entry patch
<danvet> acks welcome and all that
<javierm> danvet: yes, it's on my TODO to look at this morning
<tzimmermann> ok
<javierm> there are some easy ones that I can ack right away like your revert of the efifb hack. But for others I'll need to dig into the fbcon code
<javierm> danvet: btw, on the discussion about Rx formats. If we advertise that we will still need to convert to invert mono right ?
<danvet> javierm, invert mono?
* danvet lost
<danvet> javierm, well the revert I thought needs more work, or did I misunderstand the discussion with tzimmermann and you yesterday?
<danvet> I thought there's still a corner where sysfb can slip through and be resurrected
<danvet> that's why I put the revert and final cleanup at the very end
<javierm> danvet: yes, it needs more work but I'm happy to ack if we have a path to fix it correctly
<javierm> as long as that doesn't go to stable, it's OK to fix it before a release I think
<danvet> javierm, yeah but please include that
<danvet> so I don't go ahead and merge it, breaking stuff :-)
<javierm> danvet: Ok, gotcha
<javierm> let's wait to ack that then
<danvet> the other thing that bummed me out while doing these patches is that I think fbcon locking is unfixable
<danvet> without the threaded printk support
<danvet> and making fbcon always threaded
<danvet> you just can't take any kind of lock from random printk contexts
<danvet> this is the entire kgdb/oops printing path, once more
<javierm> right. But we don't know when that threaded printk support is going to land, right ?
<danvet> to be correct it'd need to be trylock&bail-out all the way down into every driver
<danvet> Soon(tm)
<danvet> it's more that -rt folks decided to not make everything threaded by default, because regression
<danvet> I think we don't get much of a choice really, it's just the correct thing
<javierm> danvet: yeah
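The "trylock & bail-out" pattern danvet mentions can be sketched in userspace Python as an analogy. This is not kernel code: `console_lock` and `emit` are hypothetical names, and `threading.Lock` merely stands in for whatever lock a console driver would take. The point is only that the caller never blocks: if the lock is busy, the message is dropped (or left for a later, threaded flush) instead of waiting.

```python
import threading

console_lock = threading.Lock()


def emit(message, sink):
    """Append message to sink if the lock is free; never block.

    Returns True if the message was written, False if we bailed out.
    """
    if not console_lock.acquire(blocking=False):
        return False  # lock is held elsewhere: bail out instead of waiting
    try:
        sink.append(message)
        return True
    finally:
        console_lock.release()


if __name__ == "__main__":
    out = []
    print(emit("hello", out), out)
    with console_lock:                    # simulate the lock being held
        print(emit("dropped", out), out)
```

Doing this correctly in fbcon would mean every lock on the path, down into each driver, follows the same rule, which is why it is described above as impractical without threaded printk.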
<danvet> ah
<danvet> so we need a pile of drm_fourcc with inverted bits?
<danvet> or just invertedR1?
<javierm> danvet: if we want to expose the correct format as supported by the panel, yes
<danvet> maybe call them Dx for darkness
* danvet bad at naming
<javierm> danvet: on the other hand is kind of nice that all user-space works with the advertised format
<javierm> would that be the case for {R,D}x ?
<javierm> I was able to run plymouth, gdm, etc on the monochromatic display
<danvet> javierm, oh we pretty much need the rgbx8888 fallback if you care about modern distro userspace
<danvet> it's more about the discussion to allow the efficient path
<danvet> especially the fbdev side of it
<javierm> danvet: got it. Then preferred reversed mono but rgbx8888 as fallback. Makes sense
<javierm> danvet: I think though that this could be done as a follow-up. There's no need to block drivers for this kind of HW due the missing bits
<emersion> user-space has no way to figure out that it should prefer to use reversed mono instead of xrgb
<danvet> yeah I think what we should do is {native formats in preference order} + rgbx8888 as sw fallback
<emersion> it would be nice to have a way to know
<danvet> emersion, the format list is sorted
<javierm> specially since we need to type that up in user-space anyways
<emersion> oh
<danvet> start at the top, pick first you like
<danvet> emersion, maybe another doc patch?
<emersion> :D
<danvet> so at least in my reviews for format lists I tried to make sure the sw fallback is the very last one
<danvet> heck I even started a patch series to make sure the first is really the preferred
<danvet> and fbdev emulation would pickt hat u
<danvet> *pick that up automatically
<danvet> so that we could get rid of the terrible preferred_depth confusion we have
<javierm> emersion: look how the (only) format supported is set as DRM_MODE_TYPE_PREFERRED
<danvet> but alas I got lost
<emersion> javierm: that's just the mode
<danvet> javierm, that's modes, not formats
<danvet> formats you only have xrgb8888
<danvet> so ideally there'd be an inverted mono first
<danvet> and then xrgb8888
<javierm> err, right. Sorry conflated format with mode. Still didn't have coffee :)
<imirkin> danvet: there's the somewhat upsetting situation that right now there's no good way to represent non-32-bit formats on BE. in practice the drivers for that hw don't support AddFB2 so it mostly works out.
<danvet> emersion, oh I just realized that I never sent out that patch set :-/
<javierm> danvet: so user-space should pick the first one listed as preferred then ?
<danvet> imirkin, be is sooooo dead :-P
<emersion> imirkin: DRM_FORMAT
<emersion> err
<emersion> imirkin: DRM_FORMAT_BIG_ENDIAN?
<imirkin> emersion: don't tell me about the BIG_ENDIAN flag
<imirkin> it doesn't work
<imirkin> at all :)
<emersion> :P
<javierm> danvet, emersion: or maybe have a DRM_MODE_FORMAT_PREFERRED ?
<emersion> javierm: not sure how that'd work
<danvet> javierm, don't like that as much
<emersion> also a sorted list is better
<danvet> we have the same problem with modifiers
<javierm> danvet, emersion: Ok
<imirkin> the drivers work with depth, and userspace works with depth, so it all works out
<danvet> also what emersion said, it's just binary, but maybe you have more preference
<emersion> so you can have R8 in-between if that's not as bad as XRGB8888 but not as good as R1
<javierm> emersion: yeah, agreed. Better to make it a convention that the first one is the preferred
<imirkin> but if you remove the "depth confusion" then that might no longer hold true
<danvet> like bochs wants C8 first, then rgb565, then xrgb8888 (because just not enough ram in the fake card)
<emersion> what is bochs again?
<javierm> emersion: right, because R8 would just need one conversion while XRGB8888 would need two -> greyscale -> reversed mono
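The two-step conversion javierm describes can be sketched per pixel as follows. The luma weights (3, 6, 1)/10 and the threshold of 128 are assumptions chosen for illustration, and which bit value counts as "lit" on an inverted panel is likewise assumed. The function names are hypothetical and do not refer to kernel helpers.

```python
def xrgb8888_to_gray8(pixel):
    """Step 1: reduce a 0xXXRRGGBB pixel to an 8-bit grey level."""
    r = (pixel >> 16) & 0xFF
    g = (pixel >> 8) & 0xFF
    b = pixel & 0xFF
    return (3 * r + 6 * g + b) // 10  # assumed integer luma approximation


def gray8_to_mono(gray, threshold=128):
    """Step 2: threshold grey to one bit, 1 meaning a bright pixel."""
    return 1 if gray >= threshold else 0


def gray8_to_inverted_mono(gray, threshold=128):
    """Same as gray8_to_mono, but with 0 meaning a bright pixel."""
    return 1 - gray8_to_mono(gray, threshold)


def xrgb8888_to_inverted_mono(pixel):
    """XRGB8888 needs both steps; an R8 source would skip the first."""
    return gray8_to_inverted_mono(xrgb8888_to_gray8(pixel))


if __name__ == "__main__":
    for px in (0x00FFFFFF, 0x00000000, 0x00FF0000):
        print(hex(px), "->", xrgb8888_to_inverted_mono(px))
```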
<imirkin> qemu, but much older
<emersion> eh
<airlied> it's also the qemu vga adapter
<airlied> mlankhorst: ++ will stick that in top of the merge
<javierm> emersion: is there user-space that supports Rx formats ?
<emersion> javierm: not right now, but i can type that up if you want
<emersion> inverted would be more work
<javierm> emersion: no need for now, was just curious. Because I think we need XRGB8888 anyways for the current versions
<danvet> mlankhorst, maybe backmerge that merge to get it into drm-misc-next then?
<emersion> yes, XRGB8888 is good to have anyways to run user-space without special Rx support
<javierm> emersion, danvet: but thanks for the explanations, I understand now why Rx and Dx (or whatever will be called) would be nice to have
<javierm> and make it Dx, Rx, XRGB8888 in that order
<emersion> yea
<danvet> +1
<danvet> also maybe I should resurrect my patch set to clean up this confusion
<emersion> D1, R1, R8, XRGB8888 even if you want
<javierm> right
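The convention discussed here, a format list sorted by preference with the software fallback last, makes the userspace side a one-loop affair. A minimal sketch, using plain strings instead of real fourcc codes: "D1" is the hypothetical inverted-mono name floated in this conversation and does not exist as a DRM format, and `pick_format` is a made-up helper.

```python
def pick_format(driver_formats, supported):
    """Return the first driver format that userspace can render, or None.

    driver_formats: list in the driver's preference order, best first.
    supported:      set of formats the compositor knows how to produce.
    """
    for fmt in driver_formats:
        if fmt in supported:
            return fmt
    return None


if __name__ == "__main__":
    panel = ["D1", "R1", "R8", "XRGB8888"]        # native first, fallback last
    print(pick_format(panel, {"XRGB8888"}))        # today's userspace
    print(pick_format(panel, {"R8", "XRGB8888"}))  # userspace with R8 support
```

Existing userspace that only knows XRGB8888 keeps working, and userspace that learns a cheaper format picks it up automatically without any extra "preferred" flag.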
<emersion> pixman supports R1 and R8 so should be easy to plumb to wlroots and/or weston
<javierm> libdrm also supports R8 AFAICT, didn't find R1 there
pcercuei has joined #dri-devel
<danvet> I guess I should resurrect that half-done preferred format series I've started
<danvet> but it's really messy :-(
<danvet> javierm, the idea was to sort all the format lists and then use that in drm_fbdev_generic_setup to compute the preferred depth
<danvet> and also to compute the preferred bpp getcap
<danvet> and also pave the way for moving fbdev over to fourcc format codes
<danvet> maybe was a bit too ambitious
<danvet> but thoughts on this direction?
<danvet> tzimmermann, ^^ anyone else who cares about smaller drivers and this stuff?
<danvet> 14c1e12ba605d8770cae3e8078e520365daca921 is essentially the motivation
soreau has quit [Read error: Connection reset by peer]
<tzimmermann> that sounds good
<danvet> but maybe also good prep work for adding more formats to fbdev emulation
soreau has joined #dri-devel
<tzimmermann> danvet, javierm, btw, i've been able to mmap gem-shmem buffers for fbcon without the extra shadow buffer for fbdefio. this will save memory and reduce latency
<danvet> nice
<tzimmermann> i also investigated the slowness of fbcon
<tzimmermann> and the reason why drm drivers have a slow console is...
<tzimmermann> fbdev!
<tzimmermann> *tadaa.wav*
<danvet> how?
<tzimmermann> the code in sys_fillrect is much slower than the same code in cfb_fillrect
<tzimmermann> the compiler doesn't do a good job
<tzimmermann> sys_fillrect takes ~3500 cycles to fill a single line with a pattern
<tzimmermann> cfb_fillrect takes ~700
<tzimmermann> all the native fbdev drivers use cfb_ functions, because they operate on I/O memory
<danvet> lolz
<danvet> that's pretty bad
<tzimmermann> drm uses the sys_ functions because we have a shadow buffer
<tzimmermann> yep: first lol, then facepalm. that was my reaction too
<tzimmermann> i have a simple patch that brings sys_fillrect down to ~300 cycles for filling a single line with a pattern
<tzimmermann> so it takes half of the time of cfb_ (as i would expect)
<javierm> danvet: the idea makes sense to me as well
<tzimmermann> i'll post a patchset after i've looked at sys_copyarea() and sys_imageblit(). but the issue is the same there
<danvet> tzimmermann, if your microbenchmark rewrites the same line over and over it should be much faster
<danvet> as long as the line fits in l2
<danvet> but yeah faster than cfb_ sounds good already
<javierm> tzimmermann: plot twist :)
Major_Biscuit has quit []
<tzimmermann> danvet, i directly measure the performance of fbcon while i use it: something like 'time find /usr/share/ -type '
<tzimmermann> with rdtsc
<tzimmermann> javierm, indeed. everyone's been blaming drm, then the problem is in fbdev helpers. the rest of the code paths make no difference AFAICT
<javierm> tzimmermann: cool
<tzimmermann> and drives home danvet's argument about how hard it is to write a fast 2d blitter
MajorBiscuit has joined #dri-devel
<tzimmermann> it might take a bit to get this finished. i'm having quite a bit of work to do ATM
<javierm> tzimmermann, danvet: a funny thing is that the original ssd1307fb driver didn't pass all the fbtests from geert but the emulated ssd1307 DRM driver did
<tzimmermann> :D
<javierm> also rmmod ssd1307fb caused a NULL pointer deref. Thought about digging more into these two issues but then decided that wasn't worth it
Company has joined #dri-devel
<MrCooper> tzimmermann: nice find!
<danvet> javierm, can we port the fbtest to igt perhaps?
<danvet> or are you not that bored :-)
<javierm> danvet :) I think it's a good idea, I can add it to my TODO in case I get some free time at some point
<javierm> especially if we plan to replace all the fbdev drivers, we should make sure that the emulated fbdev path does not regress
<pq> javierm, what's the problem with doing xrgb8888 -> reversed mono conversion in a single pass?
<emersion> more code to type?
<javierm> pq: you could optimize that but unsure if it will be worth it since we would like to advertise greyscale (Rx) too
<pq> I also don't think we are going to need Dx since we have Cx formats, but I'll elaborate that on the mailing list. Too much for irc.
<javierm> so you will need the Rx -> Dx code anyways
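A rough sketch of the two-step conversion javierm describes (XRGB8888 to an 8-bit greyscale value, then greyscale to packed 1 bit per pixel). This is only an illustration, not the driver's code: the integer luma weights, the threshold of 128, and the LSB-first bit order are all assumptions, and the function names are hypothetical. The "reversed" polarity pq mentions is not modelled.

```python
def xrgb8888_to_gray(pixel):
    """Reduce one XRGB8888 pixel (0xXXRRGGBB) to an 8-bit grey value."""
    r = (pixel >> 16) & 0xFF
    g = (pixel >> 8) & 0xFF
    b = pixel & 0xFF
    # Assumed integer luma weights; 77 + 150 + 29 == 256, so white stays 255.
    return (77 * r + 150 * g + 29 * b) >> 8


def gray_to_mono(gray_line, threshold=128):
    """Pack 8-bit grey values into 1 bit per pixel, 8 pixels per byte."""
    out = bytearray((len(gray_line) + 7) // 8)
    for x, value in enumerate(gray_line):
        if value >= threshold:
            # Assumption: pixel 0 lands in bit 0 (LSB-first packing).
            out[x // 8] |= 1 << (x % 8)
    return bytes(out)


def xrgb8888_to_mono(line, threshold=128):
    """Two-step conversion: XRGB8888 -> grey -> packed mono."""
    return gray_to_mono([xrgb8888_to_gray(p) for p in line], threshold)
```

Keeping the greyscale step separate is what makes it reusable if a driver later advertises an Rx format as well; a fused single-pass version would save one intermediate buffer at the cost of duplicating the packing logic.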
<javierm> pq: agreed. I've summed up the format discussions here in the ML
<javierm> danvet, tzimmermann, emersion, sven: I've two meta discussions 1) what to do with the DT binding and whether we should keep it compatible or use the latest and greatest DT conventions
<javierm> and 2) if the driver should be named ssd130x (more accurate) or ssd1307 (less accurate but consistent with ssd1307fb name)
<sven> wrong sven i think :)
<javierm> sven: sorry, I meant sravn I think
<danvet> javierm, ping robher, but I thought the idea is to go with latest&greatest and have the others maybe as fallback?
<danvet> or perhaps pinchartl has an opinion too
<danvet> also who cares about driver names
<danvet> i915 supports i915
<danvet> and also down to i830
<javierm> danvet: Ok, thanks
<danvet> and also like 10 more generations later
oneforall2 has quit [Quit: Leaving]
<danvet> javierm, I'm definitely very far from an authority wrt dt things
<pq> javierm, tzimmermann, I'm really happy to see what you're doing here :-D Of course danvet too.
rasterman has joined #dri-devel
<emersion> yes, thanks a lot javierm :)
<javierm> pq, emersion :)
<tzimmermann> MrCooper, pq, glad you like it :)
oneforall2 has joined #dri-devel
bnieuwenhuizen_ has joined #dri-devel
bnieuwenhuizen has quit [Ping timeout: 480 seconds]
<danvet> tzimmermann, gl has moved away from I/Y formats, they're officially deprecated
<danvet> so a lot of userspace moved towards R as the greyscale format
<tzimmermann> danvet, no big deal
<danvet> (reason is some technicality in the shaders, which is only available in legacy gl context with backwards compat cruft enabled)
<danvet> essentially R loads a (x, 0, 0, 0) in the shader
<tzimmermann> then i propose 'I' as in 'index' :)
<danvet> while I loads as (x, x, x, 1) iirc and Y as (x, x, x, x) or something like that
<pq> daniels, didn't you disagree with using R for grayscale?
<pq> ..in DRM pixel formats
xyene_ has joined #dri-devel
<pq> I just sent my Cx format proposition, FWIW.
<daniels> yeah I definitely prefer C to R
<emersion> a read-only palette would be… quite messy to handle from user-space
xyene has quit [Ping timeout: 480 seconds]
<emersion> especially if it's just grayscale…
<danvet> yeah read-only palette to confer meaning to Cx that "oh btw it's linear greyscale" seems like a funky interface
<danvet> like we also have yuv formats
<danvet> and some implied color space attached to them
<danvet> mixing up paletted with linear modes seems like a funky idea
<pq> I'd agree if the read-only palette was *only* for grayscale. But it's not.
<pq> emersion, what's messy about it?
<emersion> user-space rendering becomes a hell more complicated with a palette
<danvet> pq, also the main motivator for these seems to be "more efficient fbcon rendering"
<pq> emersion, I don't see it that way. Either you use the colors you can display, or you render your usual 8-bit sRGB and then convert with your preference, either trying to match color or intensity.
<danvet> so unless we rewrite fbdev/fbcon, whatever that thing is doing wins
<pq> emersion, so it wouldn't be changing rendering, but you just add a quantization step at the end.
JoshuaAs- has joined #dri-devel
<emersion> pq, what can i do to figure out that i can just use the pixman grayscale format if the kernel gives me C8 + a read-only LUT?
<pq> danvet, sure. My proposal was to make those panels more useful in general, not really for fbcon.
<emersion> i need to have an heuristic to guess that the LUT is linear
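The kind of heuristic emersion is objecting to could look roughly like the sketch below: given a read-only palette, test whether every entry lies on a linear grey ramp. The palette representation (a list of 8-bit `(r, g, b)` tuples), the `tolerance` parameter, and the function name are hypothetical; no real KMS property is being modelled here.

```python
def is_linear_gray_lut(lut, tolerance=0):
    """Return True if lut looks like an evenly spaced grey ramp from 0 to 255.

    lut is a list of (r, g, b) tuples with 8-bit components.
    """
    n = len(lut)
    if n < 2:
        return False
    for i, (r, g, b) in enumerate(lut):
        expected = round(i * 255 / (n - 1))
        if any(abs(c - expected) > tolerance for c in (r, g, b)):
            return False
    return True
```

User-space would have to run a check like this before it could safely substitute a plain greyscale render target for C8 plus a fixed LUT, which is the extra guesswork the Rx formats avoid.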
<ishitatsuyuki> sorry for interrupting, but do you know if vaapi is suitable for very low bitrate streams? I'm currently investigating a case where hwdec has much worse seeking performance than sw
<ishitatsuyuki> it's a h264 zoom recording of a lecture, with a very long I-frame interval (~15s) and full of P-frames in between. The bitrate is also very low at ~200kbps, and at that rate it looks like sending commands to my GPU is becoming a greater overhead
<ishitatsuyuki> afaik vaapi requires a flush for every frame (since flush is bound to render target), so there probably isn't much that can be done on the application side?
<pq> emersion, I don't think you'd do that. Either you use the palette explicitly (pixman has palettes), or you render to argb8888 and then quantize to the palette.
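The "render normally, then quantize to the palette" step pq describes can be sketched as a nearest-colour lookup. This is a minimal illustration under stated assumptions: the palette is a hypothetical list of 8-bit `(r, g, b)` tuples, distance is plain squared Euclidean distance in RGB (not a perceptual metric), and no dithering is done.

```python
def nearest_index(rgb, palette):
    """Return the index of the palette entry closest to rgb."""
    return min(
        range(len(palette)),
        key=lambda i: sum((c - p) ** 2 for c, p in zip(rgb, palette[i])),
    )


def quantize(pixels, palette):
    """Map rendered (r, g, b) pixels to palette indices (a Cx-style buffer)."""
    return [nearest_index(p, palette) for p in pixels]
```

For example, with a three-entry black/blue/white palette, `quantize([(20, 30, 200)], palette)` picks the blue entry. The per-pixel search over the palette is the extra CPU cost emersion goes on to complain about.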
<emersion> i don't want to do that
<pq> oh...kay...
<emersion> that's wasting a lot of CPU time doing unnecessary stuff
<emersion> why should i need a palette to just do grayscale?
<pq> that depends on what your source material is in
<emersion> why should i render to argb8888 to just display some grayscale?
<pq> the display may not be grayscale
<emersion> if it prefers R8, it is
<pq> then you do R8
<emersion> then why do you want to use C8?
<pq> my proposal is for Cx formats, particularly for displays that are NOT grayscale
<emersion> it's just complicating things for no good reason
<emersion> oh, but that's not the case here
<pq> Did I not mention white/blue LED displays?
<emersion> why is R1 bad?
<pq> ask daniels
<pq> this is my answer to "R1 is bad" but I do not think R1 is bad
<emersion> in the special case of C1, maybe that's okay
<emersion> but C8/C16/etc will be a pain to deal with
<emersion> (with a read-only LUT, that is)
<pq> anyway, I do not expect anyone to actually follow my proposal, I just wanted it out.
<danvet> pq, for me the difference is that with Cx you have to use paletted rendering/quantization/color mapping
<danvet> with Rx you can pretend it's just gamma ramp as usual and works out
<danvet> and for most cases that's good enough
<danvet> because at that color depth not many people care about accurate color grading
<danvet> and yeah C1 is a bit of a special case since if you don't care there's really nothing you can do better with R1
<pq> xrgb8888 is also quantized and mapped
<pq> and could benefit from dithering
<danvet> yeah but most people don't care about the difference
<danvet> there's a lot of people who are perfectly happy with output that's "not too shitty"
<danvet> ofc sliding scale
<danvet> so forcing everyone to go through a full quantization/color mapping pipeline just because it's a nice model feels like serious overkill
<pq> danvet, emersion, you know, I was in your shoes exactly when talking R vs. C for grayscale with daniels recently. So I blame daniels. :-)
<danvet> sure if they'd do, the result would be a lot better
<danvet> but reality is that a lot of the stuff running on top of drm drivers is very far from that ideal
<danvet> like all the converting we do in the kernel to give you xrgb8888 is very bad thing from that pov
<danvet> perfect world userspace would do this
<emersion> i mean, i'm not against allowing user-space to do fancy color stuff with a palette
<danvet> imperfect world that stance would result in a lot of black screens and disappointed users
<emersion> but please don't make it mandatory
<danvet> yup
<danvet> pretty much my point
<pq> emersion, I can't make it mandatory. It's the driver who decides which DRM formats it accepts.
<danvet> like with Rx you can also attach a color profile for the screen
<danvet> (do they support paletted screens?)
<pq> I presented a plan to make Cx formats useful, that's all.
<javierm> pq: thanks, it's appreciated
<danvet> pq, I think for stuff like vga16 your Cx proposal is solid
<emersion> pq, i was speaking from the driver PoV here :)
<danvet> since vga16 mode is very much not R4
<emersion> "please drivers, expose R1/R8, so that my user-space can be simple"
<pq> danvet, an ICC profile can have a LUT, but I think it is intended to be interpolatable, so not really.
<danvet> so C4 + fixed palette in srgb or so sounds good
<javierm> pq: my take is that these panels are really shitty, so I'm not sure that level of color accuracy is worth it
<javierm> pq: you can't even play doom because its minimum resolution supported is 320x200 :P
<danvet> anything can run doom assuming sufficient c3 have passed I thought
<daniels> my only objection to Rx is that r means red, not monochrome
Akari` has joined #dri-devel
<daniels> maybe Gx or Wx or whatever?
<danvet> daniels, yeah I think if we go with Rx then at least a doc patch would be good
<emersion> r means "coloR"
<pq> daniels, and I think pixel formats should *not* specify colorimetry at all. So I'm with danvet and emersion on R.
<danvet> but really we could also say they're for roughly a single color with some kind of probably gamma mapped intensity scale
Akari has quit [Remote host closed the connection]
<javierm> emersion: where color here is grey, right ?
<emersion> yeah, but could be any color
<danvet> whereas Cx is "you get all kinds of random things, and in many cases you can specify the palette through GAMMA_LUT"
<emersion> yea
<javierm> emersion: yes, I got that but wanted to say that Rx could match a greyscale based on your definition
<emersion> ideally a new prop would be better as mentioned before
<danvet> so vga16 is C4 with fixed palette
<danvet> but a green lcd with 4 bits roughly monotonic scale is R4
<emersion> yea
<pq> danvet, I very much agree with that.
<danvet> and yeah R1 probably makes no sense in that world and we should have C1 only
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
<daniels> pq: I understand that colorimetry is interesting, I just think that given we have no extant userspace for this, that ‘hey so red is actually grey sometimes!’ is too much of a cute sleight of hand, and that a new format token would be easier and more explicit
<danvet> I guess a FIXED_PALETTE_XXXX where XXXX is the fourcc for that palette could work out
itoral has quit [Remote host closed the connection]
<danvet> daniels, there's fbdev/fbcon, which I think is the main motivator for these
itoral has joined #dri-devel
<danvet> so making it too hard to support fbdev defeats the point
<daniels> fbdev could do G1 just as well as it could R1?
<daniels> hmm no, not G. that’s already taken isn’t it :)
<javierm> daniels, danvet: tbh for this particular driver/device the fake DRM_FORMAT_XRGB8888 is enough really
<javierm> the I2C bus is so slow and the panel only 128x64, that you could do a gazillion copies and transformations in RAM that wouldn't be a bottleneck
<javierm> but I understand that's important for monochromatic panels that have high speed busses and are used on slow machines
<emersion> my main motivation for a grayscale pixel format is to let user-space figure out that colors won't be displayed
<javierm> emersion: yes, agreed
<danvet> daniels, I'm also thinking this from the gl side, which can do Rx already
<danvet> and really does not want to import Ix or Yx because of all the historical baggage
<danvet> and at that point we're just bikeshedding a letter
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
<danvet> so R is as much a lie as any of the other letters, but it's a lie that at least is convenient for one of our most important userspace
<danvet> daniels, the other thing and I guess that was a misunderstanding, but I thought you or pq argued for only Cx for these
<danvet> and that would make it more tricky to distinguish the "does this thing have a palette or is it more monotonic single color" meaning
<danvet> which at least fbdev cares about and I think makes at least some sense
<daniels> not really, because you then have to manually broadcast the R channel out to the other colour channels - unless you do really want it to be red-only
<daniels> I agree that Cx + fixed palette is a bad idea
<danvet> well yeah but compositor needs to have these shaders anyway
<daniels> pq and emersion nicely illustrated that
<daniels> danvet: not really - for every other format, unless we need to do YUV conversion, we just assume that the sampled colour is not a lie
<danvet> hm
<danvet> but I don't expect gl folks to be happy if we inflict Ix on them
<daniels> It’s completely trivial driver-side to make Mx (or whatever) produce (x,x,x,1)
<danvet> but I guess it's all there already
<danvet> and yeah on the Cx side I wonder whether we need distinct fourcc for fixed vs user-controlled palette
<daniels> I’m just going on the general principle that explicit >>> surprising
<danvet> yeah
<danvet> daniels, so importing a yuv drm_fourcc is specc'ed to work like oes_image_external, or whatever that was again?
<daniels> yes if the driver accepts it
<daniels> if not, you open-code the conversation
<daniels> *conversion
<daniels> stupid phone
<danvet> yeah but with Ix I expect you'll have to do the Rx mapping anyway in the compositor, so why bother
flacks has quit [Quit: Quitter]
<danvet> and on the display I don't think we'll ever have a need for a real Rx format
<danvet> since rewriting luts and ctms to fit whatever the hw does is easy
flacks has joined #dri-devel
JohnnyonFlame has joined #dri-devel
itoral has quit [Remote host closed the connection]
<pq> daniels, if you think that a channel labeled "red" should also produce something you look at and think "it looks red", then that is colorimetry. If we start encoding anything about colorimetry in pixel formats, we'll just get an explosion of formats with no benefit that I can see. If you want to express that a single-channel panel is grayscale, then let's do that, but not with a pixel format.
itoral has joined #dri-devel
<pq> Otherwise you're going to need GRAY8 for grayscale, G8 for green shaded panels, O8 for orange shaded panels... there is no end to that
JohnnyonF has quit [Ping timeout: 480 seconds]
<daniels> shouldn’t we just start calling our channels numbers then, rather than something like ‘red’?
<pq> "red", "green" and "blue" are convenient channel names, because we also have the same named primaries in displays, and they match. When you have a display that has less than three primaries, the labelling is no longer intuitive but it also doesn't matter.
<pq> sure, that would work
<pq> or any arbitrary names
itoral has quit [Remote host closed the connection]
<daniels> I mean, as written, we can’t expect that XRGB [1.0, 1.0, 0.0, 0.0] to produce red - it could equally be a light blue
itoral has joined #dri-devel
<daniels> but that seems absurd tbh
<pq> right, because that display has all three channels, which are enough to represent a color volume.
<pq> if you have a two-channel display, what do you do then? and now we're talking about a single-channel display.
<pq> There is absolutely no difference between R8 and GRAY8 pixel formats: they are laid out in memory the same way, they encode the same single-channel values, they do sub-sampling the same way (i.e. don't)
<daniels> that’s a good argument that ‘red’ only is the wrong representation for greyscale, no?
<pq> if you want to rename R8 to GRAY8, that's fine with me. As long as we don't have both.
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
kts has joined #dri-devel
fxkamd has joined #dri-devel
devilhorns has joined #dri-devel
<FLHerne> pq: What you say about why G8, O8 etc. is why I find your palette suggestion confusing
<FLHerne> If you tell an application its display is a white/blue LCD, what use is it supposed to make of that information?
<FLHerne> I find it hard to imagine any application that could treat "white/blue monochrome" differently from "grayscale"
<pq> FLHerne, are you confident saying that no-one will ever need to know what the display is like? I wouldn't.
<pq> of course, it doesn't have to be KMS to deliver that information, it could be app settings too that the end user has to fiddle with
<pq> but, *if* we want KMS to be able to describe such displays, then this is my proposal for now.
<pq> otherwise, you probably use R8 or even XRGB8888 on that white/blue LCD, and hack your app colors so that they truncate to white/blue good enough to make sense on the display. Also taking into account how the driver happens to mangle XRGB8888 into 1-bit.
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
<pq> maybe that embedded device has a web UI for arranging things on screen, and the preview would be nice to use the actual colors.
<pq> it's a question of how automatically and where is the system configured
<FLHerne> pq: I just can't see what action an application could make based on that. If the display isn't the right shade of blue, or it wants to render in orange, then...tough luck, it can only render in that shade of blue
<FLHerne> and if it isn't designed to be monochrome, trying to quantize other colours based on their blue content will be less legible than simply using intensity as if it's greyscale
<pq> FLHerne, that's true. It might have different color themes to pick from.
<pq> there is no reason to pick just one channel of RGB even if the display is blue, you can always weight all components. That may result in gray or some other scale, whatever you prefer.
<pq> *weigh
shsharma has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
itoral has quit [Remote host closed the connection]
itoral has joined #dri-devel
Daanct12 has joined #dri-devel
Danct12 has quit [Ping timeout: 480 seconds]
kts has quit [Quit: Konversation terminated!]
itoral has quit [Remote host closed the connection]
Danct12 has joined #dri-devel
Daanct12 has quit [Ping timeout: 480 seconds]
off^ has joined #dri-devel
<zmike> airlied: nice, those fix everything except GTF-GL46.gtf21.GL2Tests.glGetUniform.glGetUniform
nchery has joined #dri-devel
<zmike> though as you noted in another ticket, that's just loss of precision
sdutt has joined #dri-devel
sdutt has quit []
sdutt has joined #dri-devel
JohnnyonF has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
pcercuei has quit [Ping timeout: 480 seconds]
jewins has joined #dri-devel
APic has joined #dri-devel
mattrope has joined #dri-devel
<glennk> what would a drm fourcc be for a 4 bit packed single channel pixel format? "R4" or some other pet bikeshed color? asking for a friend...
<daniels> oh god
<pq> R4
<pq> or C4 if you need a palette :-p
<pq> I mean, I always assumed Rx and Cx with x < 8 are tightly packed.
<pq> but if they are not, then call it e.g. XR44 for one byte per pixel
<pq> DRM_FORMAT_XC62 etc. for a C2 pixel in a byte.
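To make the "tightly packed" reading of R4 concrete, here is a small sketch that packs two 4-bit pixels per byte. The nibble order (first pixel in the high nibble) is purely an assumption for illustration; the discussion above does not settle it, and the helper names are hypothetical.

```python
def pack_r4(values):
    """Pack 4-bit single-channel pixels, two per byte."""
    out = bytearray((len(values) + 1) // 2)
    for i, v in enumerate(values):
        if not 0 <= v <= 15:
            raise ValueError("R4 pixel values must fit in 4 bits")
        # Assumption: even-numbered pixels occupy the high nibble.
        shift = 4 if i % 2 == 0 else 0
        out[i // 2] |= v << shift
    return bytes(out)


def unpack_r4(data, count):
    """Inverse of pack_r4: recover `count` 4-bit pixels."""
    pixels = []
    for i in range(count):
        shift = 4 if i % 2 == 0 else 0
        pixels.append((data[i // 2] >> shift) & 0xF)
    return pixels
```

The one-byte-per-pixel variant pq calls XR44 would simply store each value in its own byte with four padding bits, doubling the buffer size but avoiding the shifts.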
pcercuei has joined #dri-devel
<glennk> C4 sounds volatile
<pq> YC88_420 would be fun: full resolution 8-bit luminance plane with 8-bit paletted color on 2x2 sub-sampled chroma plane. Or something.
<glennk> at least its not a bit plane interleaved tile format
<pq> Could also do GC88, where green is given as-is, but red-blue are given via palette
<glennk> will probably go the route R8 with alternate RGBA8888 and just convert in the driver, to keep userspace a bit more sane
<pq> D should probably be reserved for depth, because XR is hot
kts has joined #dri-devel
Duke`` has joined #dri-devel
<danvet> glennk, pq I wonder whether we should solve the complete bonkers stuff with modifiers
<danvet> but I guess more fourcc can't hurt
shsharma has quit [Remote host closed the connection]
<danvet> complete bonkers = funny bit interleaved nonsense like banked vga
<glennk> i think "complete bonkers" already got used up by x14rgb666
<danvet> yeah maybe
<danvet> glennk, well we don't have that yet in drm_fourcc.h, maybe there is still hope left
<glennk> or GBRG3553 (byteswapped rgb565)
mbrost has joined #dri-devel
<zmike> anholt: have you tried the mold linker at all yet? seems noticeably faster for mesa and cts at least
<danvet> glennk, we have a big/little endian flag that everyone ignores
shsharma has joined #dri-devel
<danvet> so that should be RGB565 | BIG_ENDIAN
<danvet> yes fourcc have flags, it's entertaining
<glennk> its not really a big endian format, its just byteswapped relative to the host
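A sketch of why byteswapped RGB565 reads as "GBRG3553" when viewed as a 16-bit word. The `fourcc_code` packing and the `DRM_FORMAT_BIG_ENDIAN` bit (bit 31) mirror the definitions in `drm_fourcc.h`; the helper functions themselves are hypothetical and only illustrate the bit layout.

```python
def fourcc_code(a, b, c, d):
    """Pack four characters into a 32-bit fourcc, first character lowest."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


DRM_FORMAT_RGB565 = fourcc_code("R", "G", "1", "6")
DRM_FORMAT_BIG_ENDIAN = 1 << 31  # flag bit OR-ed into the fourcc


def pack_rgb565(r, g, b):
    """5 bits red, 6 bits green, 5 bits blue, red in the top bits."""
    return (r << 11) | (g << 5) | b


def swap16(word):
    """Swap the two bytes of a 16-bit word."""
    return ((word & 0xFF) << 8) | (word >> 8)


def decode_gbrg3553(word):
    """Decode a byteswapped RGB565 word: G[2:0] B[4:0] R[4:0] G[5:3]."""
    g_lo = (word >> 13) & 0x7
    b = (word >> 8) & 0x1F
    r = (word >> 3) & 0x1F
    g_hi = word & 0x7
    return r, (g_hi << 3) | g_lo, b
```

After the swap, the green field is split into two 3-bit halves on either side of blue and red, which is where the 3-5-5-3 name comes from. Whether such data should be described as `DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN` or as its own fourcc is exactly what the following messages argue about.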
kts_ has joined #dri-devel
<imirkin> danvet: the BIG_ENDIAN flag doesn't work btw
<imirkin> it's not that it's ignored -- it will just trigger failures
<imirkin> adding support in the driver isn't enough, all the core bits need help too
<danvet> imirkin, oh I know
<imirkin> vmwgfx (iirc) limps along by only supporting rgba8888, which just gets flipped into another format very early on in the pipeline
<danvet> yeah there isn't really anyone who handles this correctly
<imirkin> but like the format lookup logic in drm_fourcc will fail if that flag is set, etc
<danvet> and userspace definitely won't render byteswapped rgb formats
<imirkin> (at least that's my recollection last i looked)
kts has quit [Ping timeout: 480 seconds]
<danvet> imirkin, you'd need to include it as a format with that flag set
<danvet> since it's strictly speaking a different one (in most cases at least)
<imirkin> i was going to make nvidia hw "respect" it properly, since they can support both irrespective of host endian
<danvet> and ideally we'd have a canonicalize stage or something
<imirkin> but the core just had no support for piping that info through
<danvet> so if you have byteswap hw, double your format list
<imirkin> there is no format for RGB565|BIG_ENDIAN
<imirkin> if you just do it like that, ka-boom
<danvet> ah right the format info probably fails
<danvet> I guess should filter it out there
<danvet> instead of making the table twice the size
ybogdano has joined #dri-devel
<imirkin> the pre-nv50 hw can support both
<danvet> otoh if we just do it everywhere then format canonicalization becomes a mess
<imirkin> it's just a bit somewhere
<imirkin> whether to byteswap or not
<danvet> so maybe actually adding rgb565|BIG_ENDIAN makes more sense
<imirkin> yeah, it makes sense. but the format info is all off as a result
<danvet> yeah
<imirkin> not an infeasible task to fix, but ... hard to care
<imirkin> esp when "depth" works totally fine :)
<danvet> like with 8888 formats the ship kinda sailed, so there we should canonicalize
<danvet> but for the others I don't think allowing | BIG_ENDIAN blindly is a good idea
<danvet> e.g. all the ones which are just bytestreams like all the yuv stuff
<imirkin> actually the 8888 formats is where we already canonicalize -- there's logic which flips from RGBA8888 to ABGR8888
<danvet> only on the addfb1 -> addfb2 compat path
<danvet> not anywhere else
<imirkin> (based on a driver quirk)
<imirkin> ah yeah, probably
<danvet> so yeah we give you a canonical fourcc code
<danvet> but if userspace gives you an alias in addfb2, we don't do anything
<imirkin> in nouveau's case, iirc we disallow addfb2 on such configurations, and xf86-video-nouveau uses addfb1 anyways
iive has joined #dri-devel
<sravn> javierm: As long as the Kconfig entries are logical then the driver name is less important - but I like that it is not explicit
<sravn> javierm: For the binding the target group is small and they likely have to do some adaptions anyway - so here I suggest to go with a binding that describes the actual HW the best. And the backlight is conceptually separate - so describe it so.
gawin has joined #dri-devel
<sravn> The way to do so could be to deprecate the current pwms property and add the backlight node - so the name etc. is kept but backlight is described in a new way. And the drm driver only supports the non-deprecated way to specify backlight
<sravn> This models everything in the best and most logical way. Even in the fbdev world few drivers have built-in backlight support.
pcercuei has quit [Quit: Lost terminal]
pcercuei has joined #dri-devel
<glennk> backlight on an oled panel... is a concept...
<imirkin> well, brightness on a panel is a concept
<imirkin> (one which does nothing about the intelligence level of the content being viewed, unfortunately)
gawin has quit [Ping timeout: 480 seconds]
<javierm> sven: thanks for your review. Could you please mention in the ML ?
<javierm> sven: I don't really have a strong opinion. I do wonder what's the value of the backlight DT node since it won't have any properties...
<javierm> sven: but Geert and Andy seem to have stronger opinions on the topic so it would be good if you comment there and see what they think
heat has joined #dri-devel
<sven> wrong sven again ;)
<javierm> sven: arg, sorry again! I meant sravn ^
mszyprow_ has joined #dri-devel
<javierm> dianders: feel free to add my r-b for the revert patch if you want to push ASAP
<dianders> javierm: OK, done. I'll push it now.
<javierm> dianders: I didn't get that this was only for a test and had no value otherwise
<javierm> dianders: btw, this is a patch I wrote to add a debugfs entry also for a test, commit ("28af109a57d1 driver core: add a debugfs entry to show deferred devices")
<javierm> dianders: if you want to use it as a reference, it shouldn't be more code than what you typed for sysfs
<dianders> javierm: OK, thanks! I'll take a look at it.
<javierm> cwabbott: yw
<javierm> err, dianders ^ obviously I can't type today
gouchi has joined #dri-devel
mszyprow_ has quit [Ping timeout: 480 seconds]
<maxzor> airlied, Yes they do focus on pal only... http://ix.io/3Okb
<dianders> javierm: So adding something basic to debugfs isn't hard, the problem I ran into was figuring out where to put it and how to manage the hierarchy. I could put a top-level "edp-panel" in debugfs and then list panels underneath, but that felt slightly ugly. It also would need to get deleted not based on driver remove but on _module_ remove (so if someone rmmod's edp-panel then the top level dir goes away).
<dianders> javierm: sysfs had the nice property that everything was already organized by device, which is why I moved my code there. There did seem to be DRM stuff in debugfs but it was more about exposing general DRM properties. This isn't _really_ a general drm property but is just a quirk about this particular panel driver I wanted to expose...
ngcortes has joined #dri-devel
ngcortes has quit [Remote host closed the connection]
gawin has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
mbrost has quit [Read error: Connection reset by peer]
haagch has quit [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
haagch has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
MajorBiscuit has joined #dri-devel
karolherbst has quit [Remote host closed the connection]
<pendingchaos> compiled shaders, mostly to copy, clear or decompress images
karolherbst has joined #dri-devel
<jenatali> Ugh, using softfp64 is going to like quadruple the CI time for the d3d12 driver :(
<imirkin> you should see what adding support for 64-bit vertex attribs does
karolherbst has quit [Remote host closed the connection]
<imirkin> there's 1GB of shader tests.
<jenatali> Ridiculous
<imirkin> i think the CI containers normally delete them
karolherbst has joined #dri-devel
<imirkin> (the idea of the tests was to cover lots of cases, but the end result is that _no_ cases are covered)
<jenatali> I think I might see about adding some lowering for the double instructions that DXIL is missing, so we can at least get mul, div, add, sub, and fma without having to use the softfp64
<jenatali> I think it's just floor/ceil/frac and rounding
<imirkin> are you _sure_ you don't have those?
<imirkin> those are _pretty_ basic
<jenatali> Yes
<jenatali> Unfortunately
<imirkin> all hw (which does fp64) definitely supports that
<imirkin> pretty sure even G200 supported that (the tesla-era DX10 GPU which had fp64 support)
<jenatali> Yeah. I figured. But for whatever reason they were never added to shader model 5, and fp64 just hasn't changed since then...
<imirkin> i'd encourage you to take another look
<jenatali> I've looked
<imirkin> in case it's called something funny
fxkamd has quit [Remote host closed the connection]
<imirkin> do you have a link to the SM5 ref pages?
<imirkin> it's not that i don't believe you, but ... i don't believe you ;)
fxkamd has joined #dri-devel
<jenatali> Yeah. I know. I've complained to our compiler folks and they agree it's kind of ridiculous. Hopefully they'll get it added for a future shader model, but also, fp64 just isn't a priority so who knows
<imirkin> i see. they don't specify a new "round", it's inherited from SM4
<imirkin> and SM4 doesn't have doubles
<jenatali> Yeah, exactly
<imirkin> much sad.
<jenatali> In SM5, double instructions need to explicitly be different from floats. In SM6 they're just another overload, but the set of valid overloads was just inherited from which instructions got added to SM5
MajorBiscuit has quit [Ping timeout: 480 seconds]
<cwabbott> jenatali: you might want to check out nir_lower_double_ops.c
<jenatali> cwabbott: It lowers fract and floor to each other :(
<cwabbott> uhh, no it doesn't or else intel would be really busted
<cwabbott> the list of double ops supported sounds awfully like the list of ops Intel supported before gfx12...
<jenatali> Oh, wait
<jenatali> Oh DXIL doesn't have ftrunc either, that's what was missing
<jenatali> Maybe that's all I need to add then?
<cwabbott> and that pass was all we needed to get intel working
<cwabbott> the pass lowers ftrunc too
<jenatali> Oh, then it just needs to be run in a loop instead of relying on the GL frontend doing it for me, I see
<cwabbott> nope
<cwabbott> nir_function_impl_lower_instructions is smart
<jenatali> Then maybe I just missed setting a bit...
<jenatali> Let me try again
<cwabbott> it will re-run the callback on lowered instructions
<cwabbott> so you never need to call it in a loop
<jenatali> Ack. Guess I just saw the unsupported instruction, saw it came from the lowering, and assumed
<jenatali> Thanks for making me look again
<cwabbott> you're basically in exactly the same boat as iris, btw
<cwabbott> either intel inherited the list of double instructions from DX or vice-versa
<jenatali> Good to know
<cwabbott> until one of the gens when they decided to nuke 'em all
heat has quit [Ping timeout: 480 seconds]
<jenatali> cwabbott: Yeah I just somehow missed adding nir_lower_dfract into my bitmask :(
<cwabbott> that'll do it :)
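[Editor's note] The two points in the exchange above (lowered instructions get fed back through the callback, and a missing bit in the options mask leaves an op untouched) can be sketched with a toy model. This is not Mesa code: the flag names, the op names, and the expansions in RULES are all hypothetical and only illustrate the worklist idea, not what nir_lower_double_ops.c actually emits.

```python
from enum import IntFlag


class LowerDoubles(IntFlag):
    # Hypothetical option bits; they only mimic the idea of a lowering bitmask.
    DMOD = 1
    DFRACT = 2
    DFLOOR = 4
    DTRUNC = 8


# Toy rules: op -> (bit that enables the rule, ops it expands into).
# The expansions are made up for illustration.
RULES = {
    "mod": (LowerDoubles.DMOD, ["div", "fract", "mul"]),
    "fract": (LowerDoubles.DFRACT, ["floor", "sub"]),
    "floor": (LowerDoubles.DFLOOR, ["trunc", "cmp", "select"]),
    "trunc": (LowerDoubles.DTRUNC, ["unpack", "and", "pack"]),
}


def lower(ops, enabled):
    """Lower ops in a single call, revisiting whatever each rule emits."""
    out = []
    worklist = list(reversed(ops))  # used as a stack; pop() yields program order
    while worklist:
        op = worklist.pop()
        rule = RULES.get(op)
        if rule is not None and (enabled & rule[0]):
            # Push the replacement back so it is itself checked for lowering.
            worklist.extend(reversed(rule[1]))
        else:
            out.append(op)
    return out


ALL = (LowerDoubles.DMOD | LowerDoubles.DFRACT
       | LowerDoubles.DFLOOR | LowerDoubles.DTRUNC)
NO_DFRACT = LowerDoubles.DMOD | LowerDoubles.DFLOOR | LowerDoubles.DTRUNC

print(lower(["mod"], ALL))
print(lower(["mod"], NO_DFRACT))
```

With every bit set, one call reaches a fixed point and no outer loop is needed. With the DFRACT bit left out, the "fract" emitted by the "mod" rule survives, which looks like an unsupported instruction that "came from the lowering" even though the real cause is the mask.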
<jenatali> Pretty embarrassing
frieder has quit [Remote host closed the connection]
<javierm> dianders: I wasn't in front of my laptop and didn't want to keep typing from my phone and look silly :)
ybogdano has joined #dri-devel
<javierm> dianders: I see what you meant. There does seem to be a drm_debugfs_create_files() though, maybe you could use that?
<airlied> robclark: so why have we all the code to upload consts in 16-bit if the hw can do it?
<javierm> dianders: looking at other DRM drivers, that's what they use and expose the debugfs entries in a dir using the struct drm_minor of the struct drm_device .primary
<javierm> dianders: maybe you can follow that convention for the edp-panel ? Looking at the entries added by drivers, many are specific and not general DRM properties
<dianders> javierm: Thanks for the pointer! I'm pretty sure I missed that function when looking before. I'll look in detail after lunch.
<javierm> dianders: Ok!
<robclark> airlied: hmm, where? Somehow I guess we aren't hitting that for freedreno because we very much expect our const bufs to not be packed to fp16..
agjohnston has joined #dri-devel
<airlied> robclark: _mesa_uniform does it
<airlied> copy_uniforms_to_storage has copy_to_float16
<airlied> robclark: ah you don't enable the fp16 const cap
<airlied> weird, for some reason I thought you wrote that support
<robclark> I think mareko did
<airlied> must have been mareko
<zmike> hm radeonsi also exports PIPE_SHADER_CAP_GLSL_16BIT_CONSTS
<zmike> maybe this is the magical key
<airlied> mareko: so can you comment on how you see fp16 float mat3 packed?
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
shsharma has quit [Remote host closed the connection]
devilhorns has quit []
agjohnston has left #dri-devel [#dri-devel]
shsharma has joined #dri-devel
pnowack has quit [Quit: pnowack]
tzimmermann has quit [Quit: Leaving]
lemonzest has quit [Quit: WeeChat 3.4]
callen92 has joined #dri-devel
<airlied> mareko: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14817 and linked issue, you might have some idea how you saw it working
pnowack has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
<graphitemaster> O_o: Really surprised if (any(...) || any(...)) does not compile on NV, says there's no matching overload for "bor"
Haaninjo has joined #dri-devel
gouchi has quit [Remote host closed the connection]
ybogdano has joined #dri-devel
<sravn> javierm: Looked a bit more on the ssd130x stuff - typed my reply on ML.
<sravn> javierm: Forget about my uneducated rambling on irc, it helped reading a bit in the datasheets
ngcortes has joined #dri-devel
cworth has quit [Ping timeout: 480 seconds]
callen92 has quit []
mvlad has quit [Remote host closed the connection]
<javierm> sravn: thanks. Yes, I was also leaning towards fixing the DT, but then Geert's comment and reading the datasheets opened my eyes
<demarchi> javierm: did you face any issue regarding merging https://patchwork.freedesktop.org/patch/470882/?series=99030&rev=2 in drm-misc-next?
<cheako> For vkCreateImageView, is levelCount: 4294967295 a problem? This app, No Man's Sky, creates a bunch of smallish (960x120) images, creating two views for each.
<cheako> This is related to the video I've been sharing: https://youtu.be/QMBp0B9BCFQ
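[Editor's note] The question about levelCount went unanswered in the channel. For what it is worth, 4294967295 is 0xFFFFFFFF, which is the value of VK_REMAINING_MIP_LEVELS in the Vulkan headers, so this is most likely the "all remaining levels" sentinel and not a literal count. A minimal sketch of how that sentinel resolves, with the constant redefined locally and a hypothetical helper name:

```python
# Local copy of the Vulkan header constant (~0U as a 32-bit unsigned value).
VK_REMAINING_MIP_LEVELS = 0xFFFFFFFF


def effective_level_count(image_mip_levels, base_mip_level, level_count):
    """Hypothetical helper: number of mip levels a view actually covers."""
    if level_count == VK_REMAINING_MIP_LEVELS:
        # Sentinel: everything from base_mip_level to the last level.
        return image_mip_levels - base_mip_level
    return level_count


print(VK_REMAINING_MIP_LEVELS)
print(effective_level_count(1, 0, 4294967295))
```

For an image with a single mip level, a view created with this levelCount covers exactly one level.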
<javierm> demarchi: yes I did and mentioned to you here in the channel. Sorry, I thought you saw it
<javierm> 16:34 < javierm> | demarchi: sorry, got distracted and now looked at your patch
<javierm> 16:34 < javierm> | demarchi: doesn't apply cleanly because drm-misc-next is still based on v5.16-rc5 and there are changes in include/linux/string_helpers.h landed in v5.17-rc1
<demarchi> javierm: oh... I lost that message
<javierm> 16:35 < javierm> | I could easily resolve the merge conflict but my worry is that could cause issues down the road
<javierm> 16:35 < javierm> | probably better to wait until drm-misc-next is rebased on top of v5.17-rcx ?
<demarchi> thanks... Yeah, it needs to be on 5.17-rc1 because there were other patches touching that file
mszyprow_ has joined #dri-devel
<demarchi> but yes, may be better to wait for the backmerge
<demarchi> javierm: thanks for looking into that
<javierm> demarchi: no worries. I'll try to remember to push once drm-misc-next moves to 5.17-rc1, but please ping me in case I forget
<demarchi> will do
<javierm> demarchi: cool, thanks
ybogdano has quit [Ping timeout: 480 seconds]
mlankhorst has quit [Ping timeout: 480 seconds]
<airlied> anholt: in that pipeline in 13779 for the assert, where is the assert triggered?
gouchi has joined #dri-devel
gouchi has quit []
ngcortes has quit [Ping timeout: 480 seconds]
<dcbaker> Anyone interested in running mesa inside the build tree instead of installing it first might be interested in: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14826
<Sachiel> oh, that's interesting
<HdkR> Oh, that's cute. I love it.
mszyprow_ has quit [Ping timeout: 480 seconds]
shsharma has quit [Ping timeout: 480 seconds]
<jenatali> dcbaker: Ooh, cool. That sounds handy for running unit tests on Windows via meson devenv meson test
<jenatali> Don't have to copy binaries around then
<dcbaker> There's probably some more work to make it useful on windows
<jenatali> Oh I'm not talking about the Mesa-specific changes, just the feature in general :)
<jenatali> It adds all DLL-producing directories to PATH
<dcbaker> yup
<dcbaker> although I think we try to add DLL-producing paths when running tests as well? Or maybe we've only talked about it but never done it
<dcbaker> If we haven't done that, we should
<jenatali> Pretty sure it hasn't been done. At least last I checked
<jenatali> Maybe there's just an old version of meson running in CI though
MajorBiscuit has joined #dri-devel
ngcortes has joined #dri-devel
ybogdano has joined #dri-devel
<cheako> I've been looking at this app, NMS. I've dumped the Vulkan API calls and see that it's de-allocating and re-allocating a lot after receiving a VK_SUBOPTIMAL_KHR. I'm assuming I can write a layer that caches Vulkan primitives and skips over sending these calls? I'm terrible at explaining.
<cheako> I wonder if this may illuminate the "re-create" cycle causing my FPS randomization.
<cheako> Such a layer could be useful in other debugging situations?
Haaninjo has quit [Quit: Ex-Chat]
Duke`` has quit [Ping timeout: 480 seconds]
MajorBiscuit has quit [Ping timeout: 480 seconds]
ngcortes has quit [Ping timeout: 480 seconds]
MajorBiscuit has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
gawin has quit [Ping timeout: 480 seconds]
<cheako> At a glance it seems there is a leak; can the number of AllocateMemory and FreeMemory calls diverge? I assumed something like the awk script at https://pastebin.com/65fhiHGR would show allocs minus frees staying level, but instead this value increases.
<cheako> Validation layers don't show it.
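[Editor's note] The contents of the linked awk script are not shown in the log, so the following is only a sketch of the kind of count being described: a running total of allocations minus frees over an API dump. The log line format here is an assumption (any line mentioning vkAllocateMemory or vkFreeMemory counts once), and the function name is hypothetical.

```python
def outstanding_allocations(lines):
    """Return the running (allocs - frees) total after each matching line."""
    outstanding = 0
    history = []
    for line in lines:
        if "vkAllocateMemory" in line:
            outstanding += 1
        elif "vkFreeMemory" in line:
            outstanding -= 1
        else:
            continue  # unrelated API call
        history.append(outstanding)
    return history


# Made-up sample dump, not taken from the application discussed above.
sample = [
    "vkAllocateMemory(device, pAllocateInfo, NULL, pMemory)",
    "vkCreateImageView(device, pCreateInfo, NULL, pView)",
    "vkAllocateMemory(device, pAllocateInfo, NULL, pMemory)",
    "vkFreeMemory(device, memory, NULL)",
]
print(outstanding_allocations(sample))
```

A total that keeps rising is a hint, not proof, of a leak: memory that is still in use is legitimately outstanding, and a count like this does not tell whether a vkFreeMemory call was passed VK_NULL_HANDLE, which is allowed and frees nothing.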
pzanoni has quit [Quit: Coyote finally caught me]
tjaalton_ has joined #dri-devel
tjaalton has quit [Ping timeout: 480 seconds]
danvet has quit [Ping timeout: 480 seconds]
MajorBiscuit has quit [Ping timeout: 480 seconds]
ybogdano has joined #dri-devel
gawin has joined #dri-devel
maxzor has quit [Quit: Leaving]
pzanoni has joined #dri-devel
pzanoni has quit [Remote host closed the connection]
pzanoni has joined #dri-devel
<dianders> javierm: probably not your working hours anymore, but I did spend some time looking at this. Unfortunately, drm_panel is _really_ quite disconnected from the rest of drm. There is no drm_device anywhere near the panel these days. Even back when there used to be a drm_device owned by the panel it wasn't really a good fit...
nchery has quit [Ping timeout: 480 seconds]
<dianders> javierm: Ah, though maybe this is one of those cases where I need to move edp_panel to use a panel bridge or something? I'll poke there...
<dianders> ...no, that's not right. panel bridge is for the other direction (the client of a panel). Also not clear that would help me get into the DRM's debugfs...
ybogdano has quit [Ping timeout: 480 seconds]
pcercuei has quit [Quit: dodo]
<airlied> anholt: just acked nuking the assert