tobiasjakobi has quit [Remote host closed the connection]
pcercuei has joined #dri-devel
coldfeet has joined #dri-devel
apinheiro has joined #dri-devel
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
Company has quit [Quit: Leaving]
riteo_ has joined #dri-devel
riteo has quit [Ping timeout: 480 seconds]
sima has joined #dri-devel
davispuh has joined #dri-devel
Haaninjo has joined #dri-devel
davispuh has quit [Ping timeout: 480 seconds]
phire has quit [Remote host closed the connection]
rasterman has joined #dri-devel
phire has joined #dri-devel
kzd has quit [Ping timeout: 480 seconds]
ced117 has quit [Ping timeout: 480 seconds]
ced117 has joined #dri-devel
phire has quit [Remote host closed the connection]
phire has joined #dri-devel
sukuna has quit [Remote host closed the connection]
vedranm has quit [Quit: leaving]
<karolherbst>
jenatali: how are you handling memory maps with OpenCL? Allocate a CPU-side buffer and synchronize accordingly? I'm considering no longer handing out pointers to mappings and just having host-side copies
kts has joined #dri-devel
Haaninjo has quit [Ping timeout: 480 seconds]
Company has joined #dri-devel
u-amarsh04 has quit []
rasterman has quit [Quit: Gettin' stinky!]
amarsh04 has joined #dri-devel
nerdopolis has joined #dri-devel
apinheiro has quit [Quit: Leaving]
simon-perretta-img has quit [Read error: Connection reset by peer]
simon-perretta-img has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
kts has joined #dri-devel
davispuh has joined #dri-devel
nerdopolis has quit [Ping timeout: 480 seconds]
maxzor has joined #dri-devel
<jenatali>
karolherbst: ALLOC_HOST_PTR resources are allocated out of sysmem. Everything else allocates a temp buffer and copies
maxzor has quit [Remote host closed the connection]
maxzor has joined #dri-devel
kts has quit [Ping timeout: 480 seconds]
<karolherbst>
jenatali: by "temp buffer", do you mean a buffer located in VRAM, or host memory mapped into the GPU address space?
<karolherbst>
or rather.. I'm considering synchronizing mapped memory like that (GPU copies VRAM -> mapped buffer via compute shaders)
<jenatali>
karolherbst: host memory
<karolherbst>
my only concern is that this is quite inefficient, because the application might just copy it to its own buffer somewhere, but oh well I guess (and it also uses RAM)
<karolherbst>
I wonder if I want to keep what I'm currently doing as an optimized path for single device contexts...
<karolherbst>
or something similar
coldfeet has quit [Quit: leaving]
nerdopolis has quit [Ping timeout: 480 seconds]
<jenatali>
karolherbst: if it's ALLOC_HOST_PTR then it should be placed in host memory
<karolherbst>
ehh, I meant the mapping part here
<jenatali>
If it's not in host memory then I always copy to host memory for mapping instead of mapping VRAM directly
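For reference, the mapping pattern under discussion as a minimal OpenCL sketch; context/queue setup and error handling are omitted, and CL_MEM_ALLOC_HOST_PTR is what asks the runtime to back the buffer with host-accessible memory:

    /* Minimal sketch: map/unmap a host-visible buffer.  Depending on the
     * implementation, clEnqueueMapBuffer either returns the host allocation
     * directly or copies the current contents into a host-side shadow first. */
    #define CL_TARGET_OPENCL_VERSION 120
    #include <CL/cl.h>
    #include <string.h>

    static void map_example(cl_context ctx, cl_command_queue q, size_t size)
    {
        cl_int err;
        cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR,
                                    size, NULL, &err);

        /* Blocking map: hand the application a CPU pointer to the contents. */
        void *ptr = clEnqueueMapBuffer(q, buf, CL_TRUE, CL_MAP_WRITE, 0, size,
                                       0, NULL, NULL, &err);
        memset(ptr, 0, size);

        /* Unmap hands the (possibly copied) contents back to the device. */
        clEnqueueUnmapMemObject(q, buf, ptr, 0, NULL, NULL);
        clFinish(q);
        clReleaseMemObject(buf);
    }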
<karolherbst>
right, I'm considering doing that as well now
<karolherbst>
atm for `CL_MEM_ALLOC_HOST_PTR` I'm using `PIPE_USAGE_STAGING`, but I wonder if I should allocate myself and use `resource_from_user_memory` instead, so the same allocation can be shared across devices
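A hedged sketch of the two gallium paths being weighed here, not rusticl's actual code; the PIPE_BIND_GLOBAL flag and the 4 KiB alignment are assumptions:

    #include <stdlib.h>
    #include "pipe/p_defines.h"
    #include "pipe/p_screen.h"
    #include "pipe/p_state.h"

    static void alloc_host_visible(struct pipe_screen *screen, size_t size)
    {
        struct pipe_resource templ = {
            .target     = PIPE_BUFFER,
            .format     = PIPE_FORMAT_R8_UNORM,
            .bind       = PIPE_BIND_GLOBAL,   /* assumed bind flag */
            .usage      = PIPE_USAGE_STAGING, /* current approach: per-screen staging */
            .width0     = (uint32_t)size,
            .height0    = 1,
            .depth0     = 1,
            .array_size = 1,
        };

        /* Path A: the driver allocates the staging memory itself, so every
         * screen in a multi-device context ends up with its own copy. */
        struct pipe_resource *per_dev = screen->resource_create(screen, &templ);

        /* Path B: allocate host memory once and wrap it per screen, so the
         * same CPU allocation could back the buffer on several devices.
         * Drivers are allowed to fail this, which is the gallium gap
         * mentioned above. */
        size_t aligned = (size + 4095) & ~(size_t)4095;
        void *host = aligned_alloc(4096, aligned);
        struct pipe_resource *shared =
            screen->resource_from_user_memory(screen, &templ, host);

        (void)per_dev;
        (void)shared;
    }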
<karolherbst>
but...
<karolherbst>
I think that would require gallium changes so that drivers can tell me when it won't fail
<karolherbst>
but anyway, that's going to be a fun rework
<jenatali>
karolherbst: staging seems like the right thing there
<karolherbst>
right.. but if you have three devices, they'll all allocate in system RAM individually
<karolherbst>
probably
<jenatali>
Yeah
<jenatali>
Is that a thing that people actually do?
<karolherbst>
doing what?
<jenatali>
Using ALLOC_HOST_PTR on multiple devices
<karolherbst>
I have no idea
<jenatali>
Then I wouldn't worry about it until you see it be a problem
<karolherbst>
though then I'd also have to do an additional allocation on the host for mapping
<karolherbst>
and then ALLOC_HOST_PTR would kinda perform terribly as well
nerdopolis has joined #dri-devel
<karolherbst>
I guess I should try a few things out here
coldfeet has joined #dri-devel
coldfeet has quit [Quit: Leaving]
coldfeet has joined #dri-devel
coldfeet has quit [Quit: Leaving.]
Haaninjo has quit [Quit: Ex-Chat]
halves has quit [Quit: o/]
coldfeet has joined #dri-devel
halves has joined #dri-devel
nerdopolis has quit [Ping timeout: 480 seconds]
davispuh has joined #dri-devel
kts has joined #dri-devel
kts has quit []
kts has joined #dri-devel
heat has joined #dri-devel
gouchi has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
Company has quit [Read error: Connection reset by peer]
_isinyaaa has quit []
_lemes has quit []
melissawen has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]
siqueira has quit [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]
isinyaaa has joined #dri-devel
simon-perretta-img has joined #dri-devel
lemes has joined #dri-devel
melissawen has joined #dri-devel
siqueira has joined #dri-devel
isinyaaa has quit [Quit: ZNC 1.8.2+deb3.1 - https://znc.in]
lemes has quit []
melissawen has quit []
siqueira has quit []
isinyaaa has joined #dri-devel
lemes has joined #dri-devel
melissawen has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
siqueira has joined #dri-devel
i-garrison has quit []
nerdopolis has joined #dri-devel
i-garrison has joined #dri-devel
melonai has quit []
epoch101 has joined #dri-devel
alyssa has quit [Quit: alyssa]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
nerdopolis has quit [Ping timeout: 480 seconds]
kzd has joined #dri-devel
KDDLB has joined #dri-devel
Peuc_ has joined #dri-devel
Peuc has quit [Ping timeout: 480 seconds]
kts has quit [Quit: Konversation terminated!]
feaneron has joined #dri-devel
feaneron has quit []
simon-perretta-img has joined #dri-devel
nerdopolis has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
<Mis012[m]>
> To continue rolling out new Google AI features to users at a faster and even larger scale, we’ll be embracing portions of the Android stack, like the Android Linux kernel and Android frameworks, as part of the foundation of ChromeOS.
<Mis012[m]>
what's next, they will stop respecting ownership and will blow fuses?
<Mis012[m]>
if anything, the android team should be more like cros
gouchi has quit [Remote host closed the connection]
epoch101 has quit []
melonai has quit []
<karolherbst>
Mis012[m]: please don't be like this. If you are displeased with Google's decision, maybe redirect your anger towards the people actually able to make those decisions and not just individual developers
<Mis012[m]>
this is not me directing anger at robclark, if anything I'm worried about his ability to keep his job
<Mis012[m]>
I doubt he supports this decision
<Mis012[m]>
I just wonder if he has some insight, though I'm not very optimistic about this being a misunderstanding
<karolherbst>
you don't know that, but yeah, I think it's fair to ask for his take on this
maxzor has quit []
Duke`` has quit [Ping timeout: 480 seconds]
<robclark>
it's tbd how much cros will rub off on android vs how much android will rub off on cros.. given the 10yr support window for cros I don't think the android kernel model actually works, and I think a lot of the cros kernel devs have the same opinion. Some of the leadership and PMs have magical thinking; we'll see how it plays out.
nerdopolis has joined #dri-devel
<DemiMarie>
robclark: For what it is worth, I agree with you.
davispuh has quit [Ping timeout: 480 seconds]
<DemiMarie>
Is it worth adding virtio-GPU native context support for nouveau, or should this wait until the Nova driver is ready?
coldfeet has quit [Remote host closed the connection]
<robclark>
if the new driver will support all the hw the old one does, then I guess probably wait? Tbh I've not thought about it much because nv isn't a thing we have to care about for cros ;-)
cyrinux has quit [Ping timeout: 480 seconds]
glennk has quit [Ping timeout: 480 seconds]
<karolherbst>
nova will be Turing+ only
<karolherbst>
or rather, it will be GSP only
<karolherbst>
DemiMarie: ^^
praneeth_ has quit []
<DemiMarie>
karolherbst: is that the case I should care about?
<karolherbst>
mhh?
<DemiMarie>
Is pre-GSP hardware worth caring about?
<karolherbst>
it depends
<DemiMarie>
what on?
<karolherbst>
on many things I suppose? I can't really make the decision for any distribution. Newer kernel drivers generally require somewhat newer hardware, because often the reason for a new kernel driver is that the hardware changed significantly and splitting things up makes it easier to maintain
<karolherbst>
such a decision isn't being made based on use of hardware
<karolherbst>
if you want to add virtio-gpu native context to nouveau, then that's kinda on you to decide what hardware to care about
<DemiMarie>
What is the attack surface like? I know that GSP firmware replaces much of the driver, but I don’t know how much of the code that was moved is userspace-accessible attack surface and how much is purely internal to the GPU.
<karolherbst>
it's just used to configure the GPU, e.g. display, hardware contexts, and probably other things. But the main purpose is hardware configuration. VM management is still done outside of GSP afaik
<Mis012[m]>
sadly it doesn't matter if the hardware changed if the firmware interface did and changing the firmware is not possible
<Mis012[m]>
unless you can go around the firmware I guess, which to my understanding you can't
<karolherbst>
GSP is not possible to use before Turing due to hardware reasons
<karolherbst>
anyway, it takes care of a lot of hardware specific programming
<DemiMarie>
I read that it improves performance. Is this because of fewer PCIe bus round-trips?
<karolherbst>
no
<karolherbst>
it reclocks the GPU
<karolherbst>
and does full power management
<DemiMarie>
This was for articles aimed at Windows users
<karolherbst>
including fan control and everything
nerdopolis has quit [Ping timeout: 480 seconds]
<DemiMarie>
So the comparison was to the proprietary driver, which would have already had these features.
<karolherbst>
I don't see why performance should change per se, however, the GPU can change its performance levels without having to wait for the kernel to tell it to do so
<karolherbst>
so maybe it just reacts quicker and that leads to higher power efficiency?
<DemiMarie>
Maybe?
<karolherbst>
as in.. you can run at lower clocks, because you can mitigate perf spikes quicker without having to poll too often
<karolherbst>
doing so in the kernel is just an objectively bad idea
<DemiMarie>
and without interrupt latency guarantees the hardware would need to be safe even if the kernel didn’t respond
<karolherbst>
yeah. but that's a different thing
<karolherbst>
you don't need to reclock in order to reduce power consumption
<karolherbst>
but the benefit is that you can lower clocks to bring down temps, which is more efficient than what the hardware was able to do before
<DemiMarie>
I see
<karolherbst>
there was a temperature-triggered clock divider, which could cut the clocks to 1/8 or something at high temperatures
<karolherbst>
which maybe brings down power by 50%
<karolherbst>
but at terrible perf
<karolherbst>
but that worked without kernel intervention since forever basically
<karolherbst>
and is pretty cheap to do in hardware
<karolherbst>
at some point I figured out how to program that part, because I wanted to know if nouveau has to do it or if something else does so already. Turns out, the vbios already programmed it in sane ways
<DemiMarie>
On Maxwell/Pascal/Volta, is Nouveau usable for anything interesting, or would users be just as well off using integrated graphics?
melonai has joined #dri-devel
<Mis012[m]>
robclark: well, hopefully the cros team will be able to convince the decision makers that their idea is absolutely insane...
<DemiMarie>
those don’t require native contexts or accelerated rendering
<karolherbst>
correct
<karolherbst>
the only gen which is somewhat worth caring about is kepler and 1st gen maxwell
<karolherbst>
but that's still manually reclocking
<DemiMarie>
why not pre-Kepler?
<karolherbst>
because on Fermi nouveau doesn't support reclocking either, and then you have Tesla, which are kinda old
<Mis012[m]>
I don't think there would be a massive performance hit for Linux controlling the reclocking the same way it does for the CPU; I always assumed that it's not able to access the required registers but the fw can
<karolherbst>
and won't support vulkan
<Mis012[m]>
and/or the registers are not documented and can't be RE'd either because the proprietary driver doesn't use them
<karolherbst>
Mis012[m]: it's a pointless discussion. Intel also moves more and more of its reclocking outside the kernel. It's where the industry is heading and it's a good thing
<karolherbst>
also on nvidia those registers literally can't be accessed by the host anyway
<karolherbst>
at least some of them, e.g. fan speed control and voltage regulation
<karolherbst>
heck, even your own written firmware won't be allowed to access those
<Mis012[m]>
my firmware won't be allowed to run, which is somehow not considered a property rights violation
<karolherbst>
it can run
<karolherbst>
it just can't access certain registers
<karolherbst>
anyway, nothing we discuss here will change that fact
<Mis012[m]>
if it can't run with the same privileges, it's more of a shader than a firmware
<karolherbst>
I'm getting tired of this nonsense
<Mis012[m]>
firmware is typically used for "software that you better not even think about changing"
<karolherbst>
look, I'm not happy either, but being angry on IRC won't change it either
<DemiMarie>
karolherbst: Why is it a good thing? Lower latency?
<Mis012[m]>
intel handling reclocking outside the kernel could maybe be a good thing if it's completely transparent; if it needs a driver anyway then it's very much not helping anything
<karolherbst>
DemiMarie: yeah, and also it doesn't rely on the kernel to be responsive. You can adjust the clocks way quicker. So instead of running at a 60% target, you might get away with running at 80% without risking fps drops
<karolherbst>
the lower voltage you can use, the more efficient your hardware runs
<karolherbst>
and the higher the clocks, the higher the voltage
<DemiMarie>
karolherbst: drops because the clocks could not react fast enough and the hardware thermal safety tripped?
<Mis012[m]>
and if the driver is hidden in ACPI then that's abuse of something that was supposed to be board-level in order to play pretend with x86 "just working"
<karolherbst>
DemiMarie: no, just if you want to keep your GPU at 80% load, you need lower clocks than if you'd keep it at 60%
<karolherbst>
but if you get a spike in load, you have to jump up the clocks quick enough
<karolherbst>
there are idle counters on the GPU telling you how busy the engines are
<DemiMarie>
which the kernel can’t do without polling too frequently, ruining CPU-side performance?
<karolherbst>
and the firmware can just read them out directly, instead of the kernel having to poll and waste IRQs on it
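As a heavily simplified illustration of that firmware-side loop (all register names, thresholds and helpers below are hypothetical stand-ins, since the real GSP interface is not public):

    #include <stdint.h>

    /* Hypothetical stand-ins for coprocessor register access and RTOS helpers. */
    extern uint32_t read_reg(uint32_t addr);
    extern void     set_clock_level(unsigned level);
    extern void     wait_timer_us(unsigned us);
    #define ENGINE_BUSY_TICKS  0x0  /* placeholder address */
    #define ENGINE_TOTAL_TICKS 0x4  /* placeholder address */
    #define MAX_LEVEL 7

    void dvfs_loop(void)
    {
        unsigned level = 0;
        for (;;) {
            /* Read the busy counters directly; no PCIe or kernel round-trip. */
            uint32_t busy  = read_reg(ENGINE_BUSY_TICKS);
            uint32_t total = read_reg(ENGINE_TOTAL_TICKS);
            unsigned load  = total ? (unsigned)(busy * 100ull / total) : 0;

            if (load > 80 && level < MAX_LEVEL)
                set_clock_level(++level);   /* load spike: clock up quickly */
            else if (load < 60 && level > 0)
                set_clock_level(--level);   /* mostly idle: clock back down */

            wait_timer_us(500);             /* coprocessor-local timer, no host IRQ */
        }
    }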
<Mis012[m]>
polling and using irqs are two separate things surely?
<karolherbst>
it depends on how you poll
<Mis012[m]>
if you poll in sw then the hw design is insane
<karolherbst>
you really don't want to do those things in the kernel, because that's just another polling loop keeping your CPU from idling
<Mis012[m]>
to make you do that
<karolherbst>
hence you do it in firmware
<DemiMarie>
The firmware runs on a much smaller processor than the ones Linux runs on. That means that the processor is slower at executing instructions, but it also means that it uses much less power and has much less state to be flushed when an interrupt comes in.
LeviYun has joined #dri-devel
<karolherbst>
and there is no PCIe bus in the way
<DemiMarie>
You really want your main CPU to go to sleep to save power, but it can take quite a while to wake it up from that low-power state.
<karolherbst>
anyway, the tldr is, it makes perfect sense to do that in firmware
<DemiMarie>
I suspect the firmware can wake up much, much faster, and given how small a processor it runs on, it might even be able to get away with busy-polling.
<karolherbst>
nvidia's firmware coprocessors did support timers
<karolherbst>
I'm sure the new ones also do
<Mis012[m]>
eh, using a timer for polling doesn't make it much less sad
<karolherbst>
at least I'm sure their RPC works like that. You send an IRQ and the firmware handles the RPC request. But it can also configure a timer for itself, not sure if that also uses an IRQ or not, but probably just handled on the chip itself
<karolherbst>
Mis012[m]: how else do you think those things work in hardware?
<DemiMarie>
The other part of the issue is that Linux can’t run on the coprocessors that the firmware runs on. That’s one reason the firmware is separate software from the OS.
<karolherbst>
stuff just magically wakes up after 1.5s because it just knows a value changed?
<karolherbst>
this isn't userspace programming
<Mis012[m]>
you could directly fire an irq when a value changes, that seems easy enough
<Mis012[m]>
to the mcu obviously
<karolherbst>
and what part would know that a value changed?
<karolherbst>
anyway, I'm done with your know-it-all attitude, good night
<Mis012[m]>
lots of things that don't contain MCUs can fire IRQs, I'm sure it's possible
<Mis012[m]>
but the MCU probably doesn't have much other stuff to worry about so it could be fine to use a timer
<DemiMarie>
Possible? Yes. Is it what they did? No, and presumably they have good reasons for that, such as reducing risk in the hardware design.
<DemiMarie>
It is much cheaper to fix an issue in firmware than to have to recall the silicon.
<DemiMarie>
So it makes sense to have as much as possible in the firmware, with the silicon only having what is necessary.
<Mis012[m]>
can always work around a hw issue in fw or even in the OS, vendors love this one weird trick
<DemiMarie>
exactly
<DemiMarie>
The more that is in FW instead of HW, the more likely they are to be able to do this.
LeviYun has quit [Ping timeout: 480 seconds]
<Mis012[m]>
fw being much easier to change also means that for one EC, for example, you end up needing dozens of different drivers
<Mis012[m]>
which is one reason why I prefer stuff being done in hw
<DemiMarie>
That isn’t the way things are going, though.
<Mis012[m]>
fw middleman is an option for standardizing a protocol but somehow that doesn't usually happen so it actually ends up worse
<DemiMarie>
And more importantly, this is not a channel about embedded hardware design, so this discussion is off-topic.
<Mis012[m]>
right
<DemiMarie>
This channel is about how things are, not how we would like them to be.
LeviYun has joined #dri-devel
<DemiMarie>
Unless “how we would like them to be” can be achieved by reasonable changes to Linux, Mesa, or another part of the open source graphics stack, it’s not relevant here.
<Mis012[m]>
reasonable people disagree about whether flashing custom fw would be reasonable, not that it's a possibility in this case
<DemiMarie>
This isn't the place for that discussion
<Mis012[m]>
it could qualify as a change to Linux
<DemiMarie>
no, because the signature checks are done either by hardware or by ROM
<Mis012[m]>
well, signature checks are not always done, but I don't know of anything graphics-related where they aren't
<Mis012[m]>
I heard something about AMD possibly opening something up
<Mis012[m]>
arguably standardizing a fw interface across a single vendor is absolutely useless
<Mis012[m]>
well, I got the answer to my original question, anything else is tangential