mattrope has quit [Remote host closed the connection]
lemonzest has joined #dri-devel
camus has joined #dri-devel
Kayden has joined #dri-devel
JohnnyonFlame has quit [Ping timeout: 480 seconds]
flto has quit [Remote host closed the connection]
aravind has joined #dri-devel
flto has joined #dri-devel
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
ybogdano has joined #dri-devel
flto has quit [Quit: Leaving]
ybogdano has quit []
flto has joined #dri-devel
Yuriy has joined #dri-devel
Yuriy has quit []
ybogdano has joined #dri-devel
ybogdano has quit []
ybogdano has joined #dri-devel
ybogdano is now known as Yuriy
Yuriy has quit []
ybogdano has joined #dri-devel
mbrost_ has joined #dri-devel
ybogdano has quit [Ping timeout: 480 seconds]
mbrost has quit [Ping timeout: 480 seconds]
slattann has joined #dri-devel
camus has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
Duke`` has joined #dri-devel
flto has quit [Remote host closed the connection]
flto has joined #dri-devel
mbrost_ has quit []
<Ristovski>
mareko: Missed your reply yesterday. The rather cursed mesa issue I was experiencing was due to me having forgotten to remove amdgpu.mcbp=1 from my kernel cmdline. It caused the gfx ring to timeout when running anything, including `glxinfo`. Since I am on GFX6, I assume it was trying to use MCBP even though it wasn't properly implemented?
<mareko>
Ristovski: I think MCBP doesn't even exist on gfx6
<Ristovski>
mareko: I see. Any clue why it would break with amdgpu.mcbp=1 then? Or does that override somehow present fake mcbp support to mesa?
<mareko>
gfx8 is the first hw with MCBP
<mareko>
that codepath is probably messed up everywhere
<mareko>
not all kernel options are supposed to work at all times
<Ristovski>
Figured, I guess it makes sense if mesa doesn't check for >= GFX8 and tries to use MCBP
<Ristovski>
I assume it gets the "is mcbp supported" bit from libdrm?
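A leftover module parameter like the one Ristovski describes can be spotted by inspecting the kernel command line. The following is a minimal Python sketch, not anything from the amdgpu or mesa code bases; the helper name `module_param` is made up for illustration, and treating the last occurrence as the effective one is an assumption.

```python
from typing import Optional


def module_param(cmdline: str, module: str, param: str) -> Optional[str]:
    """Return the value of `module.param=value` from a kernel command line, or None."""
    prefix = f"{module}.{param}="
    value = None
    for token in cmdline.split():
        if token.startswith(prefix):
            # Assumption: if the parameter is repeated, the last one is kept.
            value = token[len(prefix):]
    return value


# Usage: pass the contents of /proc/cmdline (the usual location on Linux), e.g.
#   with open("/proc/cmdline") as f:
#       print(module_param(f.read(), "amdgpu", "mcbp"))
```

A result of `None` means the parameter is not set on the command line at all; it says nothing about values set through modprobe configuration files.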
unsolo_ has joined #dri-devel
unsolo has quit [Ping timeout: 480 seconds]
Erandir has quit [Ping timeout: 480 seconds]
pnowack has joined #dri-devel
<CounterPillow>
Does anyone happen to know where the radeon driver has its PCI IDs? I have an Evergreen Cedar card "PCI edition" connected to a PCIe-to-PCI bridge (don't ask) and it just enumerates as "Non-VGA unclassified device: Comp. & Comm. Research Lab Device 8112 (rev 2a)" (1035:8112)
frieder has joined #dri-devel
<CounterPillow>
No clue whether I stumbled into some engineering sample or something or if an error is being read as the PCI id, because that is a very weird vendor id.
camus1 has quit [Remote host closed the connection]
Company has joined #dri-devel
unsolo_ has quit [Ping timeout: 480 seconds]
camus has joined #dri-devel
* thellstrom
Fixing drm-tip
<Ristovski>
CounterPillow: the device id is weird too, unless I'm missing something obvious
<CounterPillow>
8112 PEX8112 x1 Lane PCI Express-to-PCI Bridge
<CounterPillow>
that's an interesting one to have an ID of 8112. Not the bridge I use, but maybe one the card internally uses.
tursulin has joined #dri-devel
rgallaispou has joined #dri-devel
mbrost has quit [Read error: Connection reset by peer]
rasterman has joined #dri-devel
<airlied>
CounterPillow: you should still be able to see the VGA device
<airlied>
if the bridge is working and enumerated
<CounterPillow>
Hmmm, true.
unsolo has joined #dri-devel
<airlied>
I assume there's a PCI->PCIe bridge on the card
<airlied>
which is that device
tzimmermann has joined #dri-devel
lynxeye has joined #dri-devel
dongwonk has quit [Remote host closed the connection]
Ahuj has joined #dri-devel
rgallaispou has quit [Read error: Connection reset by peer]
robher has quit [Read error: Connection reset by peer]
SanchayanMaity has quit [Read error: Connection reset by peer]
SanchayanMaity has joined #dri-devel
robher has joined #dri-devel
lileo_ has quit [Read error: Connection reset by peer]
dianders_ has quit [Read error: Connection reset by peer]
lileo_ has joined #dri-devel
krh has quit [Read error: Connection reset by peer]
dianders_ has joined #dri-devel
krh has joined #dri-devel
jonmason has quit [Read error: Connection reset by peer]
jonmason has joined #dri-devel
aswar002 has quit [Quit: No Ping reply in 180 seconds.]
rsripada_ has quit [Remote host closed the connection]
aswar002 has joined #dri-devel
rsripada has joined #dri-devel
elongbug has quit [Ping timeout: 480 seconds]
<CounterPillow>
I think this really is a bridge device, just with a different vendor id. I shall try hacking in code to enable the bridge chip during probe at a later date.
rgallaispou has joined #dri-devel
shashanks has quit [Read error: Connection reset by peer]
shashanks has joined #dri-devel
rgallaispou has left #dri-devel [#dri-devel]
<CounterPillow>
so in short, I am on an arm64 board probing a Radeon HD 5450 through a PCIe<->PCI bridge connected to another PCIe<->PCI bridge. Things are going well.
<arnd>
CounterPillow: it's likely that there is a bug in your firmware or the PCIe host bridge driver that prevents bridges from getting probed right
<arnd>
which SoC is this?
<CounterPillow>
RK3566. I think it's the bridge actually, the device ID is referenced in oxygen_lib.c as a bridge, where it gets some manual configuration.
<CounterPillow>
the reason why I'm not PCIe-ing directly is because of a silicon bug on the SoC making it unable to satisfy the cache coherency requirements of PCIe.
gawin has joined #dri-devel
shashanks has quit [Remote host closed the connection]
shashanks has joined #dri-devel
<arnd>
CounterPillow: I don't see support for rk3566 in mainline yet, only rk3399. Are you using any out-of-tree patches for the SoC support, or does the host bridge claim compatibility with rockchip,rk3399-pcie?
<arnd>
CounterPillow: I don't know what the cache coherency requirements are here, but it seems unlikely that going through two extra bridges helps there, that usually only makes things worse ;-)
rgallaispou has joined #dri-devel
<arnd>
the rk3399 pcie support has no cache coherency at all, but that's how most arm64 SoCs operate and not considered a bug, it's just slow
frieder has quit [Ping timeout: 480 seconds]
<arnd>
it means all DMA that the device does needs to be done to uncached memory, or it needs additional cache flushes when accessed by the kernel
<CounterPillow>
It's a different controller than the rk3399, but it has mainline support that was merged in 5.15.
<arnd>
ok, got it. I haven't merged the dts changes for that yet, but I see the driver now
hansg has joined #dri-devel
<CounterPillow>
basically, what I'm doing is plugging an Analogix PCIe/PCI bridge into it, and then the PCI gpu into that. It's quite the unstable stack in all meanings of the word.
<CounterPillow>
err, not Analogix. ASMedia, sorry.
<CounterPillow>
ASM1083/1085, so notably one with special quirk handling. Might be part of the issue as well.
<arnd>
which also (unsurprisingly) has it as non-coherent, but (slightly more surprising) looks like it does not support any legacy IRQs either
<arnd>
Not sure if passing MSIs through your stack of bridges works as intended
<gawin>
in my experience bridges pci <-> pci express are very tricky (at least inside usb controllers), I wasn't able to get it running with VFIO (device was always busy)
<arnd>
though that wouldn't cause the probing to fail entirely
frieder has joined #dri-devel
<CounterPillow>
the device vendor id being something unexpected but the device id being one that also is a bridge from some other vendor makes me think someone did an acquisition, though do the bridges even have drivers that match against certain IDs?
<arnd>
CounterPillow: since this is now a dwc PCIe, I would at least expect the bridges to work, as the actual probing is done in the common dwc-pcie code, not in the rockchip specific parts
<CounterPillow>
oh yeah the PCI bridge I plug in works with other PCI devices (I used an ASUS Xonar PCI soundcard to test at one point)
<arnd>
bridges are meant to be probed by generic code looking at the device classes, not PCIe vendor/device IDs
<CounterPillow>
I see
<arnd>
CounterPillow: what does 'lspci -t -v' show? Do you see both bridges, but not the device behind the second bridge, or do you only see the first bridge?
<CounterPillow>
-[0000:00]---00.0-[01-ff]----00.0-[02]----00.0 Comp. & Comm. Research Lab Device 8112
<CounterPillow>
I see both bridges, but not the device behind the second bridge.
<CounterPillow>
Assuming that this still is a bridge
<CounterPillow>
it enumerates as a "Non-VGA unclassified device", so if it is a bridge then it's not saying that it's a bridge
<arnd>
try 'lspci -vv' as root for more detailed output about both bridges
rgallaispou has quit [Read error: Connection reset by peer]
unsolo has quit [Ping timeout: 480 seconds]
<arnd>
"Memory behind bridge: [disabled]" looks like a problem
<arnd>
so even the host bridge has no access to MMIO registers
<CounterPillow>
Oh well
<arnd>
CounterPillow: do you see any errors in the boot log for the PCIe probe?
<CounterPillow>
[ 1.280445] pci 0000:02:00.0: [1035:8112] type 00 class 0x060400
<CounterPillow>
[ 1.281060] pci 0000:02:00.0: ignoring class 0x060400 (doesn't match header type 00)
<CounterPillow>
[ 1.279344] pci_bus 0000:02: extended config space not accessible
<CounterPillow>
[ 1.279980] pci_bus 0000:02: scanning bus
<arnd>
ok, so the last line explains why it ignores the second bridge, it just doesn't know what this is
<arnd>
the first line might be the cause of that problem, it's possible that you can't configure the first bridge without extended config space (not sure, that's where my pcie knowledge definitely hits its limits)
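The "ignoring class 0x060400 (doesn't match header type 00)" message can be read by splitting the 24-bit class code: base class 0x06 is a bridge and subclass 0x04 is PCI-to-PCI, which normally goes with a type 01 header rather than type 00. The Python sketch below is a simplified model of that mismatch, not the kernel's actual code; the function names are illustrative.

```python
PCI_CLASS_BRIDGE_PCI = 0x0604  # base class 0x06 (bridge), subclass 0x04 (PCI-to-PCI)
HEADER_TYPE_NORMAL = 0x00      # ordinary endpoint header


def split_class(class_code: int):
    """Split a 24-bit class code into (base class, subclass, programming interface)."""
    return (class_code >> 16) & 0xFF, (class_code >> 8) & 0xFF, class_code & 0xFF


def class_matches_header(class_code: int, header_type: int) -> bool:
    """Return False for the one mismatch modeled here:
    a PCI-to-PCI bridge class reported on a type 00 header."""
    is_p2p_bridge = (class_code >> 8) == PCI_CLASS_BRIDGE_PCI
    if header_type == HEADER_TYPE_NORMAL and is_p2p_bridge:
        return False
    return True  # other combinations are not checked in this sketch


# Values taken from the boot log above: class 0x060400, header type 00.
print(split_class(0x060400), class_matches_header(0x060400, 0x00))
```

With the values from the log, the device claims to be a bridge by class but presents an endpoint-style header, which is consistent with it showing up as an unclassified device.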
<CounterPillow>
Oh well, this was a fun poke at things, I'll just write this one off :)
rgallaispou has joined #dri-devel
<arnd>
CounterPillow: do you know what the original problem was that prevents you from using a normal pcie card? That would likely be the easier problem to work out.
f11f12 has joined #dri-devel
<arnd>
I think the probing here can be fixed by digging into it, it certainly sounds like something wrong with the host bridge driver, but most likely after you fix that you end up in the same situation that you'd be in with a normal pcie card
<CounterPillow>
I think it was that the amd drivers will not support missing cache coherency
<arnd>
right, in that case, there is no hope
<arnd>
if the pcie host bridge is not coherent, then adding bridges would not make it any less broken
<CounterPillow>
oops
<arnd>
you might still be able to use it as a dumb framebuffer if you could get the bootloader to post the device, but there are probably better ways of getting a dumb framebuffer
<CounterPillow>
Well yeah, there's an integrated GPU that's just sitting there waiting for a vop2 driver
<CounterPillow>
It's not that I need to do this, it's just that I thought it would be funny if it ended up working
<arnd>
CounterPillow: I suppose there is a chance of the amdgpu driver getting fixed at some point. Samsung and AMD already announced a phone SoC based on a more modern AMD gpu, so if that is noncoherent as well (most phone chips are), they might have to fix the driver after all
<CounterPillow>
Does nouveau work with noncoherent devices? Since Tegra is an ARM SoC family.
<HdkR>
Or they just make it coherent out of sanity :)
<CounterPillow>
I guess I could test the coherency requirements of nouveau after lunch, I've got an 8800 GTX as well as a GTX 480 laying about that should be supported.
lynxeye has quit []
<tjaalton>
will mesa 21.3 branch tomorrow?
<tjaalton>
eric_engestrom: ^
<eric_engestrom>
tjaalton: yes, tomorrow at around 6pm UTC :)
<tjaalton>
eric_engestrom: cool, thanks
<arnd>
CounterPillow: maybe ask on #armlinux or #aarch64-laptops, I'm sure someone there has tried it before
<pinchartl>
danvet: can I (gently) ping you on "[GIT FIXES FOR v5.15] R-Car DU fix" ?
<danvet>
airlied, ^^
<pinchartl>
actually, scratch that, it seems I messed up
<pinchartl>
the same fix was included by mistake in my -next pull request
<pinchartl>
which has been merged already
<pinchartl>
so that will conflict in Linus' tree, not nice
<pinchartl>
I suppose it's best to skip the v5.15 fix and get it backported in the v5.15.x stable branch ?
<pinchartl>
airlied: ^^ I'll let you decide what's best
<gawin>
maybe a stupid question: how can I debug vram corruption? unfortunately there's no asan/valgrind for vram
<HdkR>
renderdoc? :)
<gawin>
thanks, gonna try
<gawin>
" In particular on desktop only modern GL is supported - legacy GL that is only available via the compatibility profile in OpenGL 3.2 is not supported." sad r300 noises
<gawin>
recently debugging d3d9 or gl2 became difficult (even if you're on app's side)
<gawin>
I mean even just getting tools is problematic (iirc amd has removed their older tools for windows)
camus has joined #dri-devel
slattann has quit []
camus1 has quit [Ping timeout: 480 seconds]
<MrCooper>
gawin: you can enable Option "TearFree" in xorg.conf without forcing the driver, or you can enable TearFree at runtime with xrandr
flacks has quit [Quit: Quitter]
flacks has joined #dri-devel
<gawin>
MrCooper: just this?
<gawin>
Section "Device"
<gawin>
Option "TearFree" "true"
<gawin>
EndSection
<MrCooper>
and an Identifier
<gawin>
it can be anything? or needs to match something?
<MrCooper>
anything
<gawin>
thanks
<MrCooper>
np
<hansg>
vsyrjala, can you review my "[PATCH 10/10] drm/i915: Add privacy-screen support (v3)" patch please ? I've addressed your request to move drm_privacy_screen_get() call to intel_ddi_init_dp_connector(). That is the last one of the series needing a review, then I can push the series.
<emersion>
hansg: btw, in case you've missed it, i've sent a RFC for the CLOSEFB stuff. feedback welcome!
<hansg>
emersion, yeah I've seen it. I've been out with the f
<hansg>
Ugh. What I wanted to write is I've been out with the flu for 10 days, so I'm currently catching up on the backlog. And I really want to get the drm-privacy stuff wrapped up before starting something new.
<emersion>
oh, no worries, glad you're back!
<hansg>
With that all said, I have looking at your v2 / CLOSEFB proposal on my to-do list.
<emersion>
yeah, no rush, just wanted to make sure it's not falling through the cracks :)
<hansg>
ack
imre has quit [Quit: leaving]
imre has joined #dri-devel
flto has quit [Read error: Connection reset by peer]
lynxeye has joined #dri-devel
vivijim has joined #dri-devel
gruetze_ has joined #dri-devel
flto has joined #dri-devel
sdutt has joined #dri-devel
sdutt has quit []
sdutt has joined #dri-devel
sukualam has joined #dri-devel
shashank_sharma has joined #dri-devel
camus1 has joined #dri-devel
camus has quit [Ping timeout: 480 seconds]
camus has joined #dri-devel
camus1 has quit [Read error: Connection reset by peer]
leandrohrb has quit [Read error: Connection reset by peer]
fxkamd has joined #dri-devel
unsolo has joined #dri-devel
unsolo_ has quit [Ping timeout: 480 seconds]
pushqrdx has quit [Ping timeout: 480 seconds]
camus1 has joined #dri-devel
camus has quit [Read error: Connection reset by peer]
FireBurn has quit [Quit: Konversation terminated!]
thellstrom has quit [Ping timeout: 480 seconds]
pushqrdx has joined #dri-devel
pushqrdx has quit [Ping timeout: 480 seconds]
mattrope has joined #dri-devel
<hwentlan>
Plagman, emersion: was off for canadian thanksgiving yesterday. i'm okay taking the chrome workaround upstream or for chrome guys to take it in the chrome tree. i hear chrome devs are working on a new compositor, so maybe in a year this won't be needed anyways
<airlied>
hwentlan, agd5f : https://paste.centos.org/view/raw/9ab2f8f4 amd vs intel display code in a race to 10000 :-P, though the intel one is after I've refactored 1000 more lines out
<airlied>
but 10,000 loc in one file seems a bit unwieldy
camus1 has joined #dri-devel
camus has quit [Read error: Connection reset by peer]
gawin has joined #dri-devel
<hwentlan>
airlied: yeah, we need to break up that file some more
JohnnyonFlame has joined #dri-devel
<emersion>
i always end up having to touch amdgpu_dm.c
<emersion>
was wondering if it was just bad luck, doesn't seem like it :P
ybogdano has quit [Ping timeout: 480 seconds]
flto has joined #dri-devel
pnowack has joined #dri-devel
camus has joined #dri-devel
camus1 has quit [Ping timeout: 480 seconds]
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
heat has joined #dri-devel
dianders_ has left #dri-devel [#dri-devel]
dianders has joined #dri-devel
lemonzest has quit [Quit: WeeChat 3.2]
unsolo has joined #dri-devel
pnowack has quit [Quit: pnowack]
pnowack has joined #dri-devel
gawin has quit [Ping timeout: 480 seconds]
hfink_ has quit []
hfink has joined #dri-devel
heat has quit [Ping timeout: 480 seconds]
mbrost has quit [Remote host closed the connection]
ybogdano has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
Ahuj has quit [Ping timeout: 480 seconds]
rasterman has quit [Quit: Gettin' stinky!]
<graphitemaster>
So the amdgpu driver does support mid command buffer preemption, it's just not default - loading the module with mcbp=1 does make AMD systems so much more responsive to long running compute kernels
<graphitemaster>
Why is this not default?
<HdkR>
Buggy, crashy, not fully validated, all of the above? :P
<graphitemaster>
Okay well someone here previously told me this is a hard problem requiring a re-plumb of the entire Linux graphics stack and apparently there's already a working implementation of it that has been here for quite awhile, just needs some massaging. I don't know who to believe anymore XD
<airlied>
mcbp doesn't fix any of the problems with the Linux stack replumbing
<airlied>
it's merely a component of the fix
gouchi has quit [Remote host closed the connection]
<bnieuwenhuizen>
I think if the compute work is rocm, and you want to preempt that the replumbing might actually not be needed, but that is quite a limited scope of work
<agd5f>
graphitemaster, all mcbp=1 does is allow preemption to work. Someone still has to actually request it.
<agd5f>
bnieuwenhuizen, ROCm can already preempt
vivijim has quit [Ping timeout: 480 seconds]
<pinchartl>
airlied: I've made a mistake and included the same fix in both "[GIT FIXES FOR v5.15] R-Car DU fix" (which hasn't been pulled yet) and in a branch that you have merged for v5.16. how would you like to proceed with that, dropping the v5.15 fix pull request (the fix, which is for a v5.15 regression, would then end up in the stable branch once v5.16 is out), or still merge the fix for v5.15 ?
<graphitemaster>
No, not ROCm, I was just doing my own research and investigation into it when I felt people here weren't that interested or serious in fixing what I believed was a fairly serious problem. Came across the only driver that had any mention of more granular preemption and decided to try it for myself and I can report on a regular X desktop install with a long running GLSL compute shader, the desktop is more responsive than without it.
<graphitemaster>
It's still not Windows good though. I'm curious to know what plumbing is needed when this already has a pretty obvious interactivity improvement?
<graphitemaster>
If it's just a part of the equation and more work is needed, this is already a pretty damn good start.
<agd5f>
graphitemaster, nothing actually gets preempted unless someone says preempt that job. that part is missing. On windows the OS does it. On Linux no one does.
<graphitemaster>
Is it so hard to hack that pre-empt request in the kernel/mesa?
<airlied>
pinchartl: how big is the fix? I think I can cherry-pick it back from the v5.16 branch into the v5.15 branch like we do for other fixes
<bnieuwenhuizen>
I'm just wondering what changed if there is no pre-emption happening yet
<airlied>
it shouldn't mess git up as much
<graphitemaster>
It doesn't need to be perfect here, a really gnarly criterion that just keeps the desktop interactive is infinitely better than the current "hah your desktop is hosed, let's hold the power button down because not even alt+sysrq+r works"
<karolherbst>
soooo.. I want to look into using marge for nouveau kernel patches (doing the rebase + adding Link: tags etc...)
<karolherbst>
is there anybody around to give me some pointers on this? :D
<karolherbst>
btw... how strong are we about those Link: tags inside the commits?
<pinchartl>
airlied: 187502afe87a in your tree
<airlied>
pinchartl: okay let me give this a go
<airlied>
karolherbst: we like the Link tags to exist, it's not end of the world if they don't
<airlied>
karolherbst: is marge going to get a signed-off-by line or how are you going to deal with that requirement?
* pinchartl
wonders why after all these years git..b doesn't support adding tags to commits
<karolherbst>
airlied: well, people can also just set them manually and point towards the MR
<karolherbst>
pinchartl: because you have to figure out which patch the tag applies to
<karolherbst>
does it apply to all? or just a few?
<pinchartl>
that's why I'd like git..b to support per-commit approval
<pinchartl>
instead of a single button to approve the whole request
<karolherbst>
airlied: atm I push the patches to drm-misc myself and I don't plan to automate this with marge
<pinchartl>
I'm certainly biased by the kernel work flow, but it seems such a core feature to me
<karolherbst>
marge just pushes to nouveau-next/fixes and then somebody would move it higher up
<karolherbst>
pinchartl: sure, but again.. how do you map language/comments to tags added to which commit
<pinchartl>
(and if per-commit review became a first-class citizen, it would also be good to have the ability to comment on the commit message itself)
<airlied>
pinchartl: per-commit is harder for gitlab/hub to track, esp as an MR evolves
<karolherbst>
yeah.. and that as well
<airlied>
gerrit does the crazy change-id thing
<pinchartl>
when reviewing a merge request, you get the list of commits, you can open them individually and it wouldn't be difficult in the UI to support adding tags
<airlied>
pinchartl: how do you track the tags across iterations though
<airlied>
the commit ids change
<airlied>
so do the subjects
<airlied>
and the contents
<karolherbst>
airlied: so, if a Link: tag points towards a merge request, that would be totally acceptable for everybody I guess?
<airlied>
karolherbst: yes as long as you can find discussion
<pinchartl>
that's why the author would likely need to pull the updated branch
<pinchartl>
to work on vn+1
* pinchartl
goes back to his e-mail client for reviews
<airlied>
yeah it's more the whole database behind gitlab tracking it, github used to choke on force pushes for the same reason
<airlied>
it could no longer match the existing commentary to the new push
<karolherbst>
yeah.. it's all not very nice atm
<karolherbst>
but it also never was with literally any other tooling
<karolherbst>
the only thing I think could work is, people approving it via UI and then it applies to all patches
<karolherbst>
and this marge could add
<karolherbst>
but anything beyond that?
<pinchartl>
for large patch series it's common to approve some of them only, I like tags to figure out what I've reviewed already
<pinchartl>
I need to start using b4 and public-inbox more seriously, there are very good ideas there
<karolherbst>
yeah, but I don't see a way to do that automatically
ybogdano has quit [Ping timeout: 480 seconds]
<airlied>
like email only works there if the original author remembers to add all tags manually before sending a v2
<karolherbst>
or use proper reply-to stuff...
<airlied>
which is no different than using gitlab and having the author manually add tags
<karolherbst>
point is: every solution sucks
<airlied>
and also the v1-Rb: type of thing
<airlied>
gets messy
<karolherbst>
yeah...
<airlied>
it's like how much change invalidates an r-b
<karolherbst>
it's a social issue you try to find technical solutions for :p
<pinchartl>
everything sucks, we should all go farm tomatoes in portugal (or herd goats in the French Larzac, it seems quite popular for people who are fed up with humanity :-))
<airlied>
there's a lot of guesswork and workarounds even for the glorious email system
<airlied>
pinchartl: they suck more
<airlied>
ever try to make a profit from farming?
<pinchartl>
maybe "profit" is what we should reconsider :-)
<airlied>
pinchartl: mostly just pointing out why "why can't g..* just do X" is because X is really hard to do right
<karolherbst>
or even a living wage without subsidies
<airlied>
pinchartl: you can't live on tomatoes
<airlied>
or goats
<pinchartl>
you can't eat patches :-D
<karolherbst>
pinchartl: it's not even enough to live unless you get subsidies :D
danvet has quit [Ping timeout: 480 seconds]
<pinchartl>
(while a goat meat stew with tomatoes...)
<karolherbst>
airlied: well.. actually...
<airlied>
so you are fed for one week a year when your tomatoes aren't eaten by random goats :-P
<karolherbst>
:P
<pinchartl>
hmmmm... right, goats eating tomatoes is an issue we'd have to solve
<karolherbst>
after that you eat the goats
<pinchartl>
I wonder if there's a project on git..b for that :-D
<karolherbst>
anyway..... for now I just want marge to add tags :D
<karolherbst>
another problematic thing would be s-by tags
<karolherbst>
should we add them from the person "accepting" the MR even though somebody else pushes it up to drm-misc?
<pinchartl>
agreed, we don't need to solve the tomatoes and goats issues as part of your tag handling problem. that would be an extreme case of yak shaving, or most likely goat shaving
<karolherbst>
atm I just want to script away the annoying bits of all of this
<airlied>
karolherbst: yes automated s-o-b tags are kinda legally dubious
<karolherbst>
*sigh*
<karolherbst>
airlied: well.. but we could say that the person accepting the MR ....
<karolherbst>
dim also adds the tag automatically
<karolherbst>
so I don't really see the big difference here
<karolherbst>
if a maintainer accepts patches they are not comfortable adding a s-by, then I'd question the maintainer why the patches were accepted in the first place :p
<pinchartl>
it may be fine, but it should be discussed with the kernel community I think
<karolherbst>
I mean... I can also add the s-by through dim...
<robclark>
karolherbst: what about marge pushing to `${driver}-next-staging` and then real human doing `git rebase --sign-off` and pushing to the real -next branch?
<karolherbst>
dim apply-from-gitlab $dest_Branch $src_branch
<karolherbst>
robclark: and then pushing to drm-misc again?
<karolherbst>
that's two steps
<airlied>
karolherbst: someone applying a patch is when the s-o-b from them is applied
<karolherbst>
so atm I am just talking about MR against nouveau-next/fixes and then somebody pushes against drm-misc-next/fixes
<robclark>
well, like marge would push to drm-misc-staging but then maintainer does the manual step of adding s-o-b and pushing to drm-misc
<airlied>
they've signed off that they are legally allowed to apply this patch
<airlied>
now having a bot do that makes me a bit skeptical
<airlied>
having a bot do it on behalf of someone when they click a merge button I suppose might be okay
<karolherbst>
airlied: yeah
<karolherbst>
that's my thinking
<airlied>
but I think there has to be a considered action on behalf of a user
<karolherbst>
why is marge different to dim
<karolherbst>
it's all triggered by a person
<airlied>
does marge know which person to apply them for?
<karolherbst>
I have no idea
<pinchartl>
there has to be a human action I think. git rebase --sign-off already automates adding the SoB line, having another tool doing it isn't very different, as long as it's triggered by a human
<karolherbst>
let's see
<pinchartl>
but it should still be discussed with the kernel community I think
<airlied>
dim is all done locally on a developers machine by them, it's just a wrapper around it
<karolherbst>
airlied: "Emma Anholt @anholt assigned to @marge-bot and unassigned @anholt 2 hours ago"
<karolherbst>
at least gitlab knows
<airlied>
yeah I'm guessing you'd have to fork marge-bot anyways to do this sort of thing
<karolherbst>
probably
<karolherbst>
okay
<karolherbst>
so uncontroversial is adding Link: tags to patches
<airlied>
yeah that should be fine
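Adding a `Link:` tag amounts to appending one trailer line to each commit message. The Python sketch below shows one way a bot could do that idempotently; it is not marge-bot's code, the function name `add_trailer` is illustrative, the merge request URL in the usage comment is a placeholder, and the trailer-block detection is deliberately naive.

```python
def add_trailer(message: str, key: str, value: str) -> str:
    """Append `key: value` to a commit message's trailer block, at most once."""
    trailer = f"{key}: {value}"
    lines = message.rstrip("\n").split("\n")
    if trailer in lines:
        return message  # already present, leave the message untouched

    # Collect the last paragraph (lines after the final blank line).
    last_para = []
    for line in reversed(lines):
        if not line.strip():
            break
        last_para.append(line)

    # Naive heuristic: it is a trailer block if it is not the whole message
    # and every line looks like "Key: value" with no spaces in the key.
    in_trailer_block = len(last_para) < len(lines) and all(
        ": " in ln and " " not in ln.split(": ", 1)[0] for ln in last_para
    )
    separator = "" if in_trailer_block else "\n"
    return "\n".join(lines) + "\n" + separator + trailer + "\n"


# Usage (placeholder URL):
#   add_trailer(msg, "Link", "https://gitlab.example.org/group/project/-/merge_requests/1")
```

Git also ships `git interpret-trailers`, which handles the corner cases this heuristic ignores and is probably the better building block for a real tool.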
<karolherbst>
and we could script adding the s-by tags in dim for now
<karolherbst>
having some "apply those patches to this branch" kind of thing checking all patches a last time or whatever
<robclark>
adding s-o-b is easy, git-rebase can do it
<robclark>
so rebase drm-misc-staging on drm-misc, and then push
<karolherbst>
yeah..
<karolherbst>
if people don't forget
<pinchartl>
if gitlab can't do it by itself, what's the process to capture review "tags" ? or is the plan to drop that information altogether, not recording review and test information in commit messages ?
<robclark>
true.. it is at least a step in the right direction because we get some CI and automation of most of the process
<karolherbst>
yeah.. I think it would be good to know what the kernel community thinks about adding those tags automatically, because then it's easy
<karolherbst>
pinchartl: you have to collect those tags yourself sadly :/
<karolherbst>
well.. at least something patchwork wasn't that terrible at
<pinchartl>
that doesn't differ from today, so it's not a regression, even if automation could be nice
<pinchartl>
but how do you get them in the first place if reviews happen on gitlab ?
<bnieuwenhuizen>
on mesa people just comment Rb or whatever and you just amend that manually
<agd5f>
graphitemaster, there is a debugfs file, amdgpu_preempt_ib, if you want to play with it.
<JoshuaAshton>
karolherbst: It'd be nice if we could make Marge collate strings like "Reviewed-by: " for every commit or "[sha] - Reviewed-by:" for a single commit
<JoshuaAshton>
It could even refer to the mail map contributors thing to work for just "Rb" or "[sha] - Rb"
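The collation JoshuaAshton describes could look roughly like the Python sketch below. It assumes a comment syntax where a bare `Reviewed-by: ...` line applies to every commit and a `<sha> - Reviewed-by: ...` line applies only to the commit whose id starts with that prefix. The accepted tag names, the regular expression, and the function name `collate_tags` are all assumptions for illustration, not an existing marge-bot feature.

```python
import re
from collections import defaultdict

# Optional "<sha> - " prefix, then a trailer key and the reviewer identity.
TAG_RE = re.compile(
    r"^(?:(?P<sha>[0-9a-f]{7,40})\s+-\s+)?"
    r"(?P<key>Reviewed-by|Acked-by|Tested-by):\s*(?P<who>.+)$"
)


def collate_tags(commits, comments):
    """Map each commit id to the trailers that the MR comments grant it."""
    tags = defaultdict(list)
    for comment in comments:
        for line in comment.splitlines():
            m = TAG_RE.match(line.strip())
            if not m:
                continue
            trailer = f"{m['key']}: {m['who'].strip()}"
            for commit in commits:
                # No sha means the tag applies to the whole series.
                if m["sha"] is None or commit.startswith(m["sha"]):
                    if trailer not in tags[commit]:
                        tags[commit].append(trailer)
    return dict(tags)
```

Only explicitly formatted lines are picked up; free-form approval comments are ignored, and tags tied to a sha stop matching once the commit is rewritten.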
<karolherbst>
JoshuaAshton: I'd try those things out inside mesa first though
<pinchartl>
it's indeed better to test the process first before pushing it towards the kernel community, or you'll risk some serious backlash
<robclark>
it is a bit hard for a script to decide if r-b applies to the whole series or just individual patches..
<karolherbst>
yeah..
<karolherbst>
I think it's fine to add Link: and s-o-b tags, but everything else should be heavily tested before, as this gets quite complicated real quick
<karolherbst>
airlied: do you want to start the discussion with the kernel folks?
<karolherbst>
also.. where to get marge? and how to deploy it and where? :D
<robclark>
I guess to start, it is submitter's responsibility to append r-b/t-b/etc tags and re-push the MR
<airlied>
karolherbst: not really sure where best to bring in kernel ppl, cc'ing lkml is pretty futile :-P
<airlied>
tbh I'm not sure you want to engage too much with the lkml community before you've got a proof of concept
* airlied
isn't sure where our marge-bot comes from or is hosted
<karolherbst>
airlied: I can work on the proof of concept part, but it's really not that much I guess? Just a bot adding s-o-b tags from people saying "merge it!". But I guess I _could_ make it work first
<airlied>
pinchartl: oh indeed I forgot about that
<pinchartl>
there's users@linux.kernel.org and tools@linux.kernel.org too
<pinchartl>
not sure which ones are the most appropriate
<airlied>
karolherbst: yeah a bot doing it under user direction seems like a proper answer
<karolherbst>
yeah, wouldn't do anything more and it's still not into drm-misc directly yet, so there is a person in between making sure everything is alright
camus has quit [Ping timeout: 480 seconds]
<karolherbst>
we could also say, MRs against drm-misc or drm _need_ those s-o-b tags
<karolherbst>
and what drivers do internally? nobody cares :p
<karolherbst>
but I guess we want to come to a situation where we only have one repo?
<karolherbst>
dunno
<karolherbst>
or well.. drm-misc accepts whatever
<graphitemaster>
agd5f, I don't really want to play with it, so much as I want to see the Linux desktop have it :P Basically need to take this from toy status to every desktop install in 2022 can actually just deal with long running compute shaders. WebGPU is approaching dangerously, so at minimum the denial of service attack bug reports are going to be piling up on both sides as regular users navigate to URLs that system lock their PC.
<karolherbst>
but drm requires s-o-b tags for all MRs
<bnieuwenhuizen>
graphitemaster: webgl can already do this, don't worry
<airlied>
karolherbst: linux needs s-o-b on every commit from everyone who handles it
<airlied>
until it lands in a git tree
<karolherbst>
airlied: sure, that's not what I mean
<karolherbst>
I mean until it gets to drm, bots/scripts can add those tags
<robclark>
bnieuwenhuizen: I did notice the other day that shadertoy seems to not try to compile all the shadertoys on its front page these days
<graphitemaster>
bnieuwenhuizen, Only a problem on Linux too.
<karolherbst>
but drm _requires_ those to exist before merging
<karolherbst>
ohh wait
<karolherbst>
then it needs yours or danvet's tag...
<karolherbst>
ehh
<karolherbst>
annoying
<karolherbst>
or doesn't it?
ngcortes has quit [Remote host closed the connection]
ybogdano has joined #dri-devel
<airlied>
karolherbst: when I merge an MR I only add my tag to the merge
<graphitemaster>
Hah, NV's 496.13 driver completely breaks all OpenGL applications.
<jekstrand>
woohoo
<jekstrand>
quality
<airlied>
karolherbst: I assume that whoever merged the patches or did intermediate rebases has added their s-o-b tags
adjtm has joined #dri-devel
<graphitemaster>
jekstrand, conspiracy to kill opengl by making bad gl drivers is amd's shtick
pnowack has quit [Quit: pnowack]
<jekstrand>
graphitemaster: Yeah, nvidia typically loves GL
<graphitemaster>
They still probably only test like one CAD suite and the original GLQuake
<graphitemaster>
If they stay working then GL QoL passes.
<graphitemaster>
I want the job of the guy who just plays GLQuake all day testing drivers.
<anholt>
mareko: are there any nice tools for tracking/assigning blame for gpu memory usage with radeonsi?
<HdkR>
graphitemaster: Their automated testing suite is definitely more encompassing than that.
<karolherbst>
airlied: mhh, okay
<karolherbst>
so I guess we could do it this way then
<karolherbst>
drm is doing real merges and you can add your tag yourself before sending it out or something.. but dunno
<karolherbst>
maybe you can also just merge locally :D
<karolherbst>
but the MR could verify that we have s-o-b tags from allowed people
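A rough sketch of the check karolherbst describes here: scan a commit message for Signed-off-by trailers and confirm that at least one comes from an allowed person. The function names, the allow-list, and the example addresses are all hypothetical; this is not how marge-bot or any existing drm tooling works, just an illustration of the idea.

```python
import re

# Matches a trailer line such as "Signed-off-by: Jane Dev <jane@example.org>".
SOB_RE = re.compile(
    r"^Signed-off-by:[ \t]*(?P<name>[^<>\n]+?)[ \t]*<(?P<email>[^<>\s]+)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def signed_off_by(message):
    """Return the lower-cased e-mail addresses of all Signed-off-by trailers."""
    return [m.group("email").lower() for m in SOB_RE.finditer(message)]


def has_allowed_signoff(message, allowed):
    """True if at least one Signed-off-by address is in the allow-list."""
    allowed_lower = {addr.lower() for addr in allowed}
    return any(addr in allowed_lower for addr in signed_off_by(message))


# Hypothetical commit message and allow-list, for illustration only.
EXAMPLE_MESSAGE = """drm/example: fix a thing

Longer description of the change.

Signed-off-by: Jane Dev <jane@example.org>
Signed-off-by: Sam Maintainer <sam@example.org>
"""
ALLOWED = {"sam@example.org"}

if __name__ == "__main__":
    print(signed_off_by(EXAMPLE_MESSAGE))
    print(has_allowed_signoff(EXAMPLE_MESSAGE, ALLOWED))
```

A bot would run something like this over every commit in the MR and refuse to merge if any commit fails the check; how the commit messages are fetched is left out on purpose.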
<bnieuwenhuizen>
anholt: what granularity?
<bnieuwenhuizen>
between processes or within a process?
<anholt>
bnieuwenhuizen: within a process
<bnieuwenhuizen>
none I think
<bnieuwenhuizen>
what do you need to trace it back to?
<anholt>
trying to run test_va_api, we're OOMing on grunt with what looks like 2.5GB of memory used between 4 test processes.
<anholt>
I've grabbed a few testcases and checked libc allocations with massif and it's ~16MB.
<graphitemaster>
HdkR, I was making a joke lol. I know the coverage of it is more than that XD
flto has quit [Remote host closed the connection]
<anholt>
bnieuwenhuizen: so, if there was some logging of BO allocation, I might be able to use that to find troublesome testcases, or find leaks.
<bnieuwenhuizen>
anholt: my random guess would be the encoder has a scratch buffer that IIRC was huge in the past due to no mid-stream resize capabilities
flto has joined #dri-devel
<bnieuwenhuizen>
assuming a new enough kernel /sys/kernel/debug/dri/0/amdgpu_vm_info might give you the current state
<bnieuwenhuizen>
just no really useful metadata per buffer object beside size and memory type
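Since the file only gives size and memory type per buffer object, one way to use it is to sum sizes per memory type. The sketch below assumes each BO line contains `<size> byte <placement>` (for example `2097152 byte VRAM`); that line format and the sample text are assumptions for illustration and may differ between kernel versions, so check the real output before relying on it.

```python
import re
from collections import defaultdict

# ASSUMPTION: each buffer-object line contains "<size> byte <placement>".
BO_LINE_RE = re.compile(r"(\d+)\s+byte\s+(\w+)")


def bo_bytes_by_placement(text):
    """Sum BO sizes in bytes, grouped by the placement word after 'byte'."""
    totals = defaultdict(int)
    for line in text.splitlines():
        match = BO_LINE_RE.search(line)
        if match:
            totals[match.group(2)] += int(match.group(1))
    return dict(totals)


# Made-up sample in the assumed format, not real driver output.
SAMPLE = """\
pid:1234 Process:example ----------
  Idle BOs:
    0x00000001:      2097152 byte VRAM
    0x00000002:      1048576 byte GTT
    0x00000003:      4194304 byte VRAM NO_CPU_ACCESS
"""

if __name__ == "__main__":
    for placement, size in sorted(bo_bytes_by_placement(SAMPLE).items()):
        print(f"{placement}: {size / (1024 * 1024):.1f} MiB")
```

On a real system the input would be the contents of /sys/kernel/debug/dri/0/amdgpu_vm_info, which normally requires root and a mounted debugfs.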
<airlied>
pinchartl: pushed that rcar fix to drm-fixes
<anholt>
bnieuwenhuizen: thanks, that should help me at least figure out if I'm on the right track with BOs
<anholt>
bnieuwenhuizen: yep. 587MB of BOs showing up in some subset of the tests.
tursulin has quit [Read error: Connection reset by peer]