ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
LeviYun has quit [Ping timeout: 480 seconds]
bolson has joined #dri-devel
feaneron has quit [Ping timeout: 480 seconds]
vliaskov has quit [Ping timeout: 480 seconds]
Daanct12 has joined #dri-devel
flynnjiang has joined #dri-devel
<berberlightning>
it's simply bullshit what you do, and it's like even though I said it's bullshit and described how big a cow shit you do, it's like you keep on doing it, whereas I already showed how to do everything properly. the dependencies are handled fine, cause length is invariant, it is taken from the same next power and assembled as allocation based off value and index, so let's do it with numbers
<berberlightning>
then, so the fake dependencies have to work, or pseudo deps is a better name. what happens is you add a PC-compressed identifier sequence to the hash which has a size of 6k, the whole data bank of every variable is only 6 thousand decimal digits, you access that from the hash by adding it to the memory location as a sum. dudes, this is only less than two bytes, two bytes max value is 65536
<berberlightning>
units, and it has all the PCs enumerated as to where to query the lengths, and it does it multiaccess, so all lengths are queried together. the assembly of such a bank and all the compilation process is extremely fast, the runtime is extremely fast and low power as well. the battery would not last for 7 days like Linus's old employer Transmeta promised, but with such performance as
<berberlightning>
you offer now it would last for years. Of course it's a body blow for some companies, but for safety reasons me and Tom know that AMD cannot expose their clockless domain, we would beam each other into ashes soon enough. I simulated all the blocks of MIAOW and opened the GTKWave diagrams, timeline toggled in and out. AMD knows what Tom and I know, it cannot be exposed for
<berberlightning>
safety reasons. Because of that I made a new way, you cannot directly use my code to emit waves which harm people, I call them supersonic, ultrasonic, whatever the spectrum is, cause clockless can do that, and I still have those molds all over my body from when they emitted such waves on me. a correctly focused wave will crush your heart just like this, and only a wireless antenna is needed
<berberlightning>
which is equipped with any modern board. they emitted both versions to me, a short burst to my heart for demonstration as well as the bacteria-culturing out-of-spec wave, which made me very sick. it was some military battleground, yeah I nearly died but came out on top once again, eventually recovered through the miracle. what they did was hope until the last minute that I die, delivering the
<berberlightning>
needed energy from the sky with lasers which made floods happen at my location, the room was full of water, and I was very sick. however I am truly special in tissue strength, and such people are all donors in this world, cause they are labeled through the monster medicine mob, and they chip you without your approval, to take all the substances out as to what they need for vaccines etc.
<berberlightning>
Though the active low does not permit that, the reptilians do not care. however I am not an alien, those traits come in genetically mutant ways, and this is not going back several centuries like Nobel Prize fraud winner Svante Pääbo believes. It happens cause of military testing and the science of mass destruction weapons. For example, the energy of a nuclear bomb's wave goes three times
<berberlightning>
around all of the earth, and the so-called mushroom where there is enough proximity, and that is what my granny's location was when my dad was born. very strange things will happen one generation forward, and me as twin dominant got +4 in Kellar's table, and such are so-called gods or deities that nowadays are used as cows due to result fixings. My brain scan is alive as hell, it can
<berberlightning>
light up all the darkness, but they injured my joints and I have had issues. it's so active cause my skull is thicker than usual and I have a pronounced forehead, brain cells are more protected and in a bigger vacuum, so you just learn how to master your head. if a severely milked and injured and broken-off guy can do it, you can do it too. My reflexes as well as instinctive and fast thinking
berberlightning has quit [Remote host closed the connection]
berberlightning has joined #dri-devel
LeviYun has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
LeviYun has quit [Ping timeout: 480 seconds]
karenthedorf has joined #dri-devel
LeviYun has joined #dri-devel
LeviYun has quit [Ping timeout: 480 seconds]
flynnjiang has quit [Quit: flynnjiang]
<DemiMarie>
dwfreed: banhammer?
<airlied>
already done
flynnjiang has joined #dri-devel
berberlightning has quit [Remote host closed the connection]
yyds has quit [Remote host closed the connection]
yyds has joined #dri-devel
apinheiro has quit [Quit: Leaving]
yyds has quit [Remote host closed the connection]
yyds has joined #dri-devel
yyds has quit [Remote host closed the connection]
yyds has joined #dri-devel
yyds has quit []
yyds has joined #dri-devel
alane has joined #dri-devel
yyds has quit [Remote host closed the connection]
yyds has joined #dri-devel
LeviYun has joined #dri-devel
yyds has quit []
yyds has joined #dri-devel
LeviYun has quit [Ping timeout: 480 seconds]
vedranm_ has joined #dri-devel
yyds has quit []
yyds has joined #dri-devel
yyds has quit [Remote host closed the connection]
<alyssa>
how it started: I'm going to work on agx for a few minutes
<alyssa>
how it's going: this is a fundamental bug in core NIR affecting every driver
<alyssa>
:clown:
nerdopolis has quit [Ping timeout: 480 seconds]
yyds has joined #dri-devel
vedranm has quit [Ping timeout: 480 seconds]
Daaanct12 has joined #dri-devel
Daanct12 has quit [Read error: Connection reset by peer]
yyds has quit [Remote host closed the connection]
yyds has joined #dri-devel
LeviYun has joined #dri-devel
LeviYun has quit [Ping timeout: 480 seconds]
LeviYun has joined #dri-devel
<idr>
alyssa: I looked at that issue. That's... misery.
Calandracas_ has quit [Remote host closed the connection]
Calandracas has joined #dri-devel
hansg has joined #dri-devel
xroumegue has quit [Ping timeout: 480 seconds]
Hazematman has joined #dri-devel
<Hazematman>
Hey all, I've been working on an MR to improve llvmpipe & lavapipe Android support so it works without kms_swrast, as well as to improve the Mesa documentation for Android to cover an out-of-tree build into an Android image. I would appreciate any feedback on my MR https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29344 :)
xroumegue has joined #dri-devel
bmodem has joined #dri-devel
vliaskov has joined #dri-devel
fab has joined #dri-devel
fab has quit []
<alyssa>
idr: (:
<FL4SHK[m]>
So if I write my own Vulkan driver, do I still benefit from a Mesa backend?
itoral has quit [Quit: Leaving]
Company has joined #dri-devel
OftenTimeConsuming has quit [Remote host closed the connection]
epoch101 has joined #dri-devel
OftenTimeConsuming has joined #dri-devel
simon-perretta-img has quit [Read error: Connection reset by peer]
simon-perretta-img has joined #dri-devel
bmodem has quit [Ping timeout: 480 seconds]
<alyssa>
yes
nerdopolis has joined #dri-devel
<FL4SHK[m]>
neat
<FL4SHK[m]>
where does the integration with Mesa come in?
<FL4SHK[m]>
I'm a little confused about that
<FL4SHK[m]>
I have a GCC port, mostly written, for the CPU/GPU I'm developing (they will have similar instruction sets)
<FL4SHK[m]>
can I combine this with Mesa
<zmike>
DavidHeidelberg: are you back now?
<Hazematman>
FL4SHK[m]: Check out `src/vulkan` in Mesa; there is a bunch of common code in that folder to help implement a lot of Vulkan functionality. In your own driver you would call functions from that module.
<Hazematman>
Additionally you can look at the shader compiler backends for different platforms to see how they use common code for shader compilers in Mesa. That kind of goes outside Vulkan, but there is a lot of common code in `src/compiler/nir` for example.
<FL4SHK[m]>
I see
<FL4SHK[m]>
thanks
<FL4SHK[m]>
can I make use of my GCC port?
heat has joined #dri-devel
<Hazematman>
<FL4SHK[m]> "can I make use of my GCC port?" <- Not exactly sure what you're doing, but I assume you have a GCC port for a different cpu arch that you want to use in a cross compiler fashion? In that case yes, you just need to setup a cross compiler environment with meson. You can read this for instructions, but you should be able to setup a cross file that points to your custom gcc and use that to build mesa
<FL4SHK[m]>
I'm doing something a bit different for my CPU/GPU design
<FL4SHK[m]>
like I mentioned they will run similar instruction sets, though with some modifications if necessary
<FL4SHK[m]>
I can compile regular C/C++ code with the GCC port targeting the GPU
<Hazematman>
In that case I'm not exactly sure it would be as useful here... Mesa has its own compiler infrastructure, since Vulkan expects SPIR-V and OpenGL expects GLSL or SPIR-V. For drivers in Mesa, the common code typically will take SPIR-V or GLSL, compile it into Mesa's IR, which is called NIR, and then the drivers will ingest NIR and convert it to native machine code for their architecture.
<Hazematman>
So I'm not sure what exactly you're hoping to achieve by making use of Mesa if you have your own compiler infrastructure already.
<Hazematman>
There is common code for handling things like sync, swapchain, wsi, etc in Mesa that might be of interest to you
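A minimal sketch of the NIR-ingestion step Hazematman describes, assuming a driver living in the Mesa tree. The nir_foreach_* iterators and nir_instr_as_* casts are real src/compiler/nir API; emit_alu() and emit_intrinsic() are hypothetical backend hooks:

```c
#include "nir.h"

/* hypothetical backend hooks -- stand-ins for real instruction encoding */
static void emit_alu(nir_alu_instr *alu) { (void)alu; }
static void emit_intrinsic(nir_intrinsic_instr *intr) { (void)intr; }

/* Walk every instruction in the shader and hand it to the backend. */
static void
emit_shader(nir_shader *shader)
{
   nir_foreach_function_impl(impl, shader) {
      nir_foreach_block(block, impl) {
         nir_foreach_instr(instr, block) {
            switch (instr->type) {
            case nir_instr_type_alu:
               emit_alu(nir_instr_as_alu(instr));
               break;
            case nir_instr_type_intrinsic:
               emit_intrinsic(nir_instr_as_intrinsic(instr));
               break;
            default:
               break; /* tex, phi, jump, ... handled the same way */
            }
         }
      }
   }
}
```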
rgallaispou has quit [Read error: Connection reset by peer]
<FL4SHK[m]>
so, I could just use the GCC port for running regular C/C++ on the GPU, and also write a Mesa port.
<FL4SHK[m]>
what I want is to be able to run Vulkan and hopefully OpenGL on the machine
<FL4SHK[m]>
the whole system is hopefully going to be a many-core machine where CPU/GPU are hooked up via an interconnect inside the FPGA
<FL4SHK[m]>
I am unsure how many cores I can fit.
<FL4SHK[m]>
I have 500k logic cells but none of those have to be used to implement any kind of RAM
<FL4SHK[m]>
so there's a lot of space for cores if I keep them simple. One core might be 1000 or so logic cells
<FL4SHK[m]>
maybe more like 2000
fab has joined #dri-devel
<FL4SHK[m]>
oh, wait a minute
<FL4SHK[m]>
no, the cores will be more than 2000 logic cells
<FL4SHK[m]>
and there'd be fewer of them
<FL4SHK[m]>
the 2000 logic cell number is from when I didn't include vector instructions
<FL4SHK[m]>
but I want to include vector float instructions at least
<FL4SHK[m]>
32-bit floats that is
rgallaispou has joined #dri-devel
feaneron has joined #dri-devel
karenthedorf has quit [Remote host closed the connection]
<Hazematman>
Sounds like an ambitious project 😅 just an FYI then: if you want to run generic OpenGL or Vulkan applications you'll need to be able to ingest GLSL or SPIR-V, which I don't think GCC supports. So at some point you'll need to look at modifying GCC to support that or building a new compiler that can handle those.
<FL4SHK[m]>
well
<FL4SHK[m]>
yeah, it's an ambitious project
<FL4SHK[m]>
haha
<FL4SHK[m]>
the GCC port wasn't too bad
<FL4SHK[m]>
took a couple months
<FL4SHK[m]>
it's a long term project, mind you
<FL4SHK[m]>
at least I'm not writing an OS too
<FL4SHK[m]>
so I'll go ahead and write a SPIR-V compiler for the system
<FL4SHK[m]>
I heard that you can just do a basic translation from SPIR-V to the GPU instruction set
<FL4SHK[m]>
which is an assembly-to-assembly transpile
<FL4SHK[m]>
I might be oversimplifying it
fab has quit [Ping timeout: 480 seconds]
alih has joined #dri-devel
<DemiMarie>
Are similar CPU and GPU ISAs a bad idea?
<FL4SHK[m]>
possibly
<karolherbst>
yes
<FL4SHK[m]>
I'm gonna try it
<karolherbst>
GPUs by definition generally don't need vector instructions, because that's implicit in how they run things
<DemiMarie>
I suggest going with a conventional design.
simon-perretta-img has quit [Ping timeout: 480 seconds]
<FL4SHK[m]>
what I was going to try is hooking up a lot of small cores
<FL4SHK[m]>
since that's a novel idea mostly
<FL4SHK[m]>
apparently it's been tried with x86
<FL4SHK[m]>
in the 2010s
Net147 has quit [Quit: Quit]
<DemiMarie>
FL4SHK: tried and failed
<karolherbst>
yeah, and it was a bad idea
<FL4SHK[m]>
why did it fail?
simon-perretta-img has joined #dri-devel
<karolherbst>
because running the same CPU code on a GPU with the same ISA is a false promise
<DemiMarie>
FL4SHK: There are three areas in which I would like to see something new.
<FL4SHK[m]>
tell me
<FL4SHK[m]>
my design is absolutely not finalized right now
<FL4SHK[m]>
I could just go for the many-core thing for the CPU
<DemiMarie>
The first is fault isolation: If a context faults or must be reset, the GPU should guarantee that other contexts are not affected.
<DemiMarie>
The second is bounded-latency instruction-level preemption of all hardware, including fixed-function. This means that even a malicious shader cannot prevent itself from being preempted in a bounded amount of time, allowing the GPU to be used in real-time systems.
<alyssa>
FL4SHK[m]: for Mesa you really want to write a NIR backend
<alyssa>
it's not hard
<DemiMarie>
The third is security isolation: the GPU should guarantee that information cannot be leaked across contexts.
<FL4SHK[m]>
alyssa: sounds good
<FL4SHK[m]>
I still want to hook up the GPU cores directly to my on-chip interconnect
<FL4SHK[m]>
inside the FPGA
<Hazematman>
DemiMarie: Isn't this mostly covered by hardware GPU contexts and per-context virtual memory? Or is there something I'm missing?
<FL4SHK[m]>
that's what I thought too
<FL4SHK[m]>
per-context virtual memory is like virtual memory per-process on a CPU right?
<Hazematman>
1 & 2 would be a game changer for real-time GPU usage, especially in SC systems
<Hazematman>
FL4SHK[m]: yeah the same concept but applied to GPUs. A graphics context will have its own virtual address space and typically the kernel driver is responsible for mapping GPU virtual memory to physical memory
<FL4SHK[m]>
I see. Also, from the looks of it, the vectorization stuff that failed for LRB was for general purpose code right?
<FL4SHK[m]>
not as much for the shader code maybe?
<karolherbst>
FL4SHK[m]: the point is that once you rely on auto-vectorization for performance you are kinda screwed
<FL4SHK[m]>
even for shader code?
<karolherbst>
that's why GPUs are generally SIMT rather than SIMD
<FL4SHK[m]>
I see
<karolherbst>
so the ISA looks all scalar, but implicitly the same instruction is run on multiple threads/lanes/whatever you want to call it
<FL4SHK[m]>
gotcha
<FL4SHK[m]>
I can do that
<karolherbst>
and you either explicitly or implicitly manage thread masks
<karolherbst>
like e.g. on nvidia each instruction can be predicated to turn it off for the current thread
<karolherbst>
but it still executes on other threads with the predicate being true in the same warp/subgroup
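A toy C model of the SIMT execution karolherbst describes, with an assumed 32-wide warp (an illustration, not any vendor's ISA): one scalar instruction runs across all lanes, and a predicate mask decides which lanes commit their result.

```c
#include <stdint.h>

#define WARP_SIZE 32 /* assumed width, varies by hardware */

/* One "scalar" fadd the way a SIMT machine executes it: every lane runs
 * the same instruction on its own private register; lanes whose
 * predicate bit is clear simply don't commit a result. */
static void
simt_fadd(float dst[WARP_SIZE], const float a[WARP_SIZE],
          const float b[WARP_SIZE], uint32_t exec_mask)
{
   for (int lane = 0; lane < WARP_SIZE; lane++) {
      if (exec_mask & (1u << lane))
         dst[lane] = a[lane] + b[lane];
      /* predicated-off lanes keep their old value */
   }
}
```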
<FL4SHK[m]>
can I at least make the instruction set similar to the CPU in some ways?
meowmeow has quit [Remote host closed the connection]
<FL4SHK[m]>
like let's say I use some of the same instruction encoding
<FL4SHK[m]>
for stuff like, ALU ops
<karolherbst>
yeah, the issue isn't the ISA being the same, just often a GPU ISA is more specialized so you might as well
<FL4SHK[m]>
I see
<karolherbst>
but you could have certain instructions (e.g. vector ones) only work in the "CPU" mode
<karolherbst>
and in the GPU mode, scalar instructions just execute in an entire subgroup
<FL4SHK[m]>
I see
guludo has joined #dri-devel
<FL4SHK[m]>
So that's neat
<karolherbst>
each thread still has its own registers, but there is also the concept of "scalar" or "uniform" registers which are the same in each thread of a subgroup
<FL4SHK[m]>
I'm going to have to study GPUs further
<FL4SHK[m]>
so SIMT is like... partially SIMD right?
<FL4SHK[m]>
partitioned SIMD?
<FL4SHK[m]>
you have multiple SIMD units
<FL4SHK[m]>
many of them
<karolherbst>
well.. more like implicit SIMD
<FL4SHK[m]>
that was what I was thinking of emulating
<FL4SHK[m]>
hm
<FL4SHK[m]>
I thought there were actual SIMD engines in the hardware; that was what I learned in school
<karolherbst>
like.. e.g. those x86 SIMD instructions with a lane mask map directly to predicated scalar instructions in a SIMT ISA
<FL4SHK[m]>
small SIMD engines
<FL4SHK[m]>
Hm
<karolherbst>
yeah.. but the main difference is that the ISA is scalar
<karolherbst>
so.. because your ISA is scalar, you don't need auto vectorization to get great performance
<FL4SHK[m]>
that makes sense
<karolherbst>
and of course that also requires a different programming language, e.g. GLSL, where you describe what each SIMD lane/SIMT thread is doing
<karolherbst>
instead of looking at the entire group
<karolherbst>
nowadays anything vecN gets scalarized anyway
<FL4SHK[m]>
how do you get from a scalar ISA to telling the hardware what SIMD lanes to use?
<karolherbst>
(except for loads/stores, which some hardware can actually do as wide operations)
<FL4SHK[m]>
somehow that has to be figured out
<karolherbst>
all lanes execute the same instruction
<FL4SHK[m]>
then how do you get the data for them?
<cwabbott>
I'd say that it *is* possible to compile a SIMT language to a SIMD architecture, as long as predication is competent enough, but it requires a completely different compiler architecture
<lynxeye>
the important thing is that you can switch between threads when one of them is blocked on memory, which allows you to hide memory latency without large caches and sophisticated prefetchers
<karolherbst>
FL4SHK[m]: you mean between threads in the same group?
<FL4SHK[m]>
I think so
<FL4SHK[m]>
you have to somehow specify where the data comes from
<lynxeye>
and that's a direct consequence of the programming model, which you won't be able to realize with a SIMD model
<karolherbst>
there are subgroup operations e.g. shuffle where you can move values between threads
<cwabbott>
AMD's architecture, for example, is basically a SIMD core with a few goodies strapped on
<karolherbst>
but normally each thread is just pulling the data it needs directly
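Extending the same toy model, a sketch of the subgroup shuffle karolherbst mentions for moving values between lanes (semantics loosely follow Vulkan's subgroupShuffle, not any specific hardware):

```c
#include <stdint.h>

#define WARP_SIZE 32 /* same assumption as above */

/* Each active lane reads the value held by the lane it names; this is
 * the exception to "a lane only ever sees its own data". */
static void
simt_shuffle(float dst[WARP_SIZE], const float src[WARP_SIZE],
             const uint32_t index[WARP_SIZE], uint32_t exec_mask)
{
   for (int lane = 0; lane < WARP_SIZE; lane++) {
      if (exec_mask & (1u << lane))
         dst[lane] = src[index[lane] % WARP_SIZE];
   }
}
```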
<FL4SHK[m]>
cwabbott: ah, yeah, that's what I was reading
<FL4SHK[m]>
that AMD actually does have SIMD machines
<FL4SHK[m]>
well, SIMD cores
<cwabbott>
scalar registers are just normal registers, vector registers are SIMD registers, etc.
<FL4SHK[m]>
yeah that is actually the model I was going to emulate
<cwabbott>
subgroup ops are vector shuffles
<FL4SHK[m]>
ooh
<FL4SHK[m]>
so then
<FL4SHK[m]>
is what karol was talking about for nvidia then?
<FL4SHK[m]>
or is it applicable to AMD as well?
<cwabbott>
just nvidia
<karolherbst>
the concepts are similar, just mostly different terms being used
<karolherbst>
or different models :P
<karolherbst>
though AMD manages masks explicitly, no?
<FL4SHK[m]>
that sounds like putting the masks into the ISA
<cwabbott>
yes, AMD manages masks explicitly
<FL4SHK[m]>
I'd be happy to emulate that
<cwabbott>
the point i was trying to make is, it is possible to have a different model in the hardware that's more explicit
<cwabbott>
and more SIMD-like
<karolherbst>
yeah, fair
<cwabbott>
but the *compiler* still has to be SIMT
<FL4SHK[m]>
oh, I see
<cwabbott>
i.e. the register allocator has to still be aware of what the original control flow is
epoch101 has quit [Ping timeout: 480 seconds]
<FL4SHK[m]>
so if I emulate AMD, I need to translate from SIMT to the ISA's more SIMD-like arch?
<FL4SHK[m]>
in the compiler
<cwabbott>
yes, you would need to do that
<cwabbott>
for example, ACO does that when translating from NIR
<FL4SHK[m]>
okay, that sounds like something that could be a happy medium for my hardware design
<cwabbott>
but it also keeps a representation of the original control flow
<FL4SHK[m]>
I could still go with my original design idea?
<FL4SHK[m]>
I see
<FL4SHK[m]>
so if I port Mesa, can I still access that information in my backend?
<cwabbott>
and register allocates vector registers with that
<cwabbott>
so an existing backend that's written assuming a SIMD model would be mostly useless
<cwabbott>
it has to at least be aware of the higher-level representation
<FL4SHK[m]>
right, that's what I'm asking about. Can I access the higher-level representation from my backend?
<cwabbott>
Mesa's IR (NIR) explicitly only uses the higher-level representation
<FL4SHK[m]>
ah, got it
<karolherbst>
I think "vector registers" are kinda a dangerous terms, because you still don't get a vector reg as you'd get on a SIMD x86 ISA, right? It's still a thread private register you get, you just encode "this operates on different values per thread" or did I misunderstand how that's working on AMD?
<FL4SHK[m]>
I thought AMD actually did use CPU-like SIMD registers
<FL4SHK[m]>
based upon their documentation I read
<cwabbott>
i'd say it's more like CPU SIMD registers
<karolherbst>
I think it depends on how you look at it
<FL4SHK[m]>
excellent, that's exactly what I wanted to hear
<cwabbott>
yes, it's sort-of a difference in semantics
<FL4SHK[m]>
then my question is, if I go with CPU-like SIMD registers in my ISA, can I emulate AMD's model?
<karolherbst>
if you look at the entire thing as a SIMD group, yes it makes more sense to say it's SIMD like, but if you look at it from a "single thread" perspective it kinda doesn't
<cwabbott>
yes, but beware that you do have to explicitly design things to "be nice" with the SIMT model
pcercuei has quit [Quit: Lost terminal]
<cwabbott>
for example, 16-bit values have to go in the upper/lower half of 32-bit values
<FL4SHK[m]>
I am willing to change the hardware to better fit the SIMT model
<FL4SHK[m]>
since I do have control over the ISA and stuff
<cwabbott>
the "stride" for lack of a better word has to always be 32 bits
<karolherbst>
I think there were people having a common ISA on both sides, but it operated on SIMD lanes for GPU "code"
<cwabbott>
i.e. each vector lane must be 32 bits
<karolherbst>
and it was just a scalar ISA
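A sketch of cwabbott's stride rule under the same assumptions: every lane's register is 32 bits, and a 16-bit value lives in the low or high half of it (an illustration, not AMD's actual encoding):

```c
#include <stdbool.h>
#include <stdint.h>

/* A lane's register is always 32 bits; 16-bit values occupy one half,
 * so the per-lane "stride" stays 32 bits. */
static inline uint16_t
read_half(uint32_t reg, bool high)
{
   return high ? (uint16_t)(reg >> 16) : (uint16_t)(reg & 0xffff);
}

static inline uint32_t
write_half(uint32_t reg, uint16_t val, bool high)
{
   return high ? (reg & 0x0000ffffu) | ((uint32_t)val << 16)
               : (reg & 0xffff0000u) | val;
}
```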
<FL4SHK[m]>
oh, that's familiar to me
<FL4SHK[m]>
I don't mind making it 32-bit
<FL4SHK[m]>
so that would mean, you have 32-bit floats?
<FL4SHK[m]>
I was hoping to go with 32-bit floats
<cwabbott>
otherwise you get into a world of hurt if you try to use the higher-level control flow in your compiler
<FL4SHK[m]>
since those will use less hardware
<FL4SHK[m]>
oh
<karolherbst>
GPUs normally don't encode the "type" in registers, they use the same registers for int and float operations
<karolherbst>
but yeah.. they are most of the time 32 bit wide
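karolherbst's untyped-register point, illustrated in C: a register is raw bits, and only the consuming instruction gives them a type. On hardware the reinterpretation is implicit; memcpy is just the portable C way to show it:

```c
#include <stdint.h>
#include <string.h>

/* The same 32-bit register read as float by an FADD and as an integer
 * by an IADD -- no conversion happens, only reinterpretation. */
static inline float
reg_as_float(uint32_t reg)
{
   float f;
   memcpy(&f, &reg, sizeof(f));
   return f;
}

static inline uint32_t
reg_as_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}
```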
<FL4SHK[m]>
Okay well I'll keep that in mind
<FL4SHK[m]>
right actually I was thinking of doing that as well
<FL4SHK[m]>
already had that idea
<FL4SHK[m]>
for the scalar stuff as well
kxkamil2 has joined #dri-devel
<FL4SHK[m]>
and for the CPU as well :)
jkrzyszt_ has joined #dri-devel
<FL4SHK[m]>
I need to write up a version of my ISA spec for a 64-bit version of the CPU
<FL4SHK[m]>
or just modify the existing one
<karolherbst>
I think the entire reason it's split in x86 is because of legacy
<cwabbott>
GPUs tend to use the same cores for int and float operations
<FL4SHK[m]>
I see, that makes sense
<cwabbott>
because integer stuff just isn't as important
jsa1 has joined #dri-devel
<karolherbst>
nvidia has explicit int units these days
<FL4SHK[m]>
I've heard that before too
<karolherbst>
but yeah...
<cwabbott>
that naturally leads to using the same registers, whereas CPUs have the opposite tradeoff
<karolherbst>
but nvidia is weird anyway
<karolherbst>
if you do a float op on a result of an int ALU, you need to wait one cycle more
<karolherbst>
vs float -> float or int -> int
jsa has quit [Ping timeout: 480 seconds]
kxkamil has quit [Ping timeout: 480 seconds]
jkrzyszt has quit [Ping timeout: 480 seconds]
<mattst88>
karolherbst: how do texture operations handle sources where some arguments are floats and some are ints? do you have to move the ints to the float register file?
<karolherbst>
mattst88: it's all raw data
<karolherbst>
the instruction is responsible for interpreting the data
<mattst88>
right, but can the texture operation take some sources from the float reg file and some from the int file?
<karolherbst>
float vs int regs don't exist
<karolherbst>
it's all registers
<mattst88>
oh, they use the same register file. it's just that there's a different ALU unit and some additional latency when moving results from the int ALU to the fp ALU, etc
<karolherbst>
that's why NIR is also entirely untyped, because it just doesn't really make sense to have typed registers
<mattst88>
yeah, gotcha
<karolherbst>
mattst88: correct
<karolherbst>
though on nvidia it's all weird, because the scoreboarding/latency is done at compile time
<karolherbst>
so the compiler has to know those rules
<karolherbst>
and results just appear in a register at some defined time
<mattst88>
yeah, makes sense
<karolherbst>
(which also means that an instruction executed later can actually clobber the input of a previous instruction)
<mattst88>
so NVIDIA has to do the software scoreboarding stuff in the compiler, like recent Intel GPUs?
<mattst88>
and presumably has had that for much longer?
<karolherbst>
yeah, it's quite old
<karolherbst>
they experimented with that in kepler, but made it a full requirement with maxwell
<karolherbst>
so like over 10 years roughly?
<karolherbst>
it's quite complicated really. There are also instructions which read some inputs 2 cycles later and stuff 🙃
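A toy of what compile-time scoreboarding means: the compiler, not the hardware, tracks when each pending result becomes valid and encodes stall counts so nothing reads a register too early. The latency numbers are invented for illustration; the real Maxwell+ rules are far more involved:

```c
#include <stdint.h>

/* Invented latencies: cycles until a unit's result "appears" in its
 * destination register (plus the extra int->float cycle noted above). */
enum unit { UNIT_FLOAT, UNIT_INT };
static const int latency[] = { [UNIT_FLOAT] = 6, [UNIT_INT] = 6 };
#define CROSS_UNIT_PENALTY 1

/* ready[r] = first cycle at which register r's pending value is valid */
static void
record_write(int *ready, uint8_t reg, int cycle, enum unit u)
{
   ready[reg] = cycle + latency[u];
}

/* Stall cycles the compiler must encode before an instruction issued
 * at `cycle` may read register `reg`. */
static int
stalls_before_read(const int *ready, uint8_t reg, int cycle)
{
   return ready[reg] > cycle ? ready[reg] - cycle : 0;
}
```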
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
macromorgan has joined #dri-devel
blaztinn has quit [Remote host closed the connection]
blaztinn has joined #dri-devel
LeviYun has quit [Ping timeout: 480 seconds]
blaztinn has quit [Remote host closed the connection]
<DavidHeidelberg>
zmike: still on the around-the-world trip.. just occasionally doing something until I start at some new job :)
blaztinn has joined #dri-devel
Kayden has quit [Quit: -> JF]
lynxeye has quit [Quit: Leaving.]
LeviYun has joined #dri-devel
blaztinn has quit [Remote host closed the connection]
blaztinn has joined #dri-devel
dbrouwer has joined #dri-devel
Kayden has joined #dri-devel
ckinloch has joined #dri-devel
LeviYun has quit [Ping timeout: 480 seconds]
cyrinux30 has quit []
cyrinux30 has joined #dri-devel
<gfxstrand>
jenatali: Thinking about WDDM in mesa "for real" and I'm not sure I want to have a hard dependency on libdxg. Thoughts about marshalling things?
<jenatali>
Marshaling?
<jenatali>
In WSL you can rely on those entrypoints being available in the distro in a libdxcore.so, and in Windows they come from gdi32.dll
<gfxstrand>
Yeah but let's say Ubuntu is going to ship Mesa with WDDM enabled
<gfxstrand>
Does that mean Ubuntu also ships libdxg and it just doesn't do anything?
<gfxstrand>
I guess that's probably fine. It's tiny.
<jenatali>
That's an option, but you could also operate like we do with the d3d12/dozen driver, which ships enabled in Ubuntu AFAIK, and just dlopen
<jenatali>
If libdxcore.so isn't there at runtime then you don't have WDDM anyway
<gfxstrand>
Yeah, but then we have to dlsym everything
<gfxstrand>
That's kinda what I was asking for thoughts on
<jenatali>
Not necessarily, can't dlopen promote things into the global namespace?
<gfxstrand>
Maybe we can with weak symbols of some sort
<jenatali>
You'd have to allow unresolved symbols at link time though I guess for that to work
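A minimal sketch of the dlopen route being weighed here, assuming libdxcore.so may or may not exist at runtime. D3DKMTEnumAdapters2 is a real D3DKMT entry point, but the function-pointer signature is abbreviated for illustration:

```c
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* abbreviated -- the real signature takes a D3DKMT_ENUMADAPTERS2 * */
typedef int (*PFN_EnumAdapters2)(void *);

static PFN_EnumAdapters2 EnumAdapters2;

static bool
wddm_load(void)
{
   /* No hard link-time dependency: if libdxcore.so isn't present at
    * runtime, there's no WDDM on this system anyway. */
   void *lib = dlopen("libdxcore.so", RTLD_NOW | RTLD_LOCAL);
   if (!lib)
      return false;

   EnumAdapters2 = (PFN_EnumAdapters2)dlsym(lib, "D3DKMTEnumAdapters2");
   return EnumAdapters2 != NULL;
}
```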
<alyssa>
are Windows uapis stable ?
<jenatali>
Yes
<alyssa>
dang
<jenatali>
All APIs provided from any Windows DLL, whether it's a kernel-accessor API or just strictly usermode, are stable once they ship in a retail OS
<gfxstrand>
the pPrivateDatas, though, are anyone's guess.
<jenatali>
Right, those are generally not considered stable
<jenatali>
We require UMD and KMD to match because vendors have refused to commit to making those stable...
<gfxstrand>
Yeah, they're usually literally just a struct in a header in a perforce tree somewhere
fireburn has quit [Quit: Konversation terminated!]
jkrzyszt_ has quit [Ping timeout: 480 seconds]
i-garrison has quit []
LeviYun has joined #dri-devel
simon-perretta-img has quit [Read error: Connection reset by peer]
kts has quit [Quit: Konversation terminated!]
i-garrison has joined #dri-devel
LeviYun has quit [Ping timeout: 480 seconds]
<DemiMarie>
jenatali gfxstrand: does that mean that Mesa can be used as the UMD of a WDDM2 driver?
<jenatali>
If the driver actually has a stable KMD interface, yeah
<jenatali>
Or if the vendor is shipping Mesa as their UMD along with a matching KMD
<gfxstrand>
DemiMarie: Yes, in theory
<DemiMarie>
jenatali gfxstrand: Use-case is GPU acceleration in Windows guests on Qubes OS, which will require virtio-GPU native context support. That requires Mesa as the UMD and a proxy as KMD.
<DemiMarie>
So the KMD interface would just be a proxy for the host’s KMD.