ChanServ changed the topic of #dri-devel to: <ajax> nothing involved with X should ever be unable to find a bar
nchery is now known as Guest866
nchery has joined #dri-devel
Guest866 has quit [Ping timeout: 480 seconds]
hch12907 has joined #dri-devel
columbarius has joined #dri-devel
co1umbarius has quit [Ping timeout: 480 seconds]
stuart has quit []
ngcortes has quit [Remote host closed the connection]
icecream95 has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
cef has quit [Quit: Zoom!]
rexbcchen_ has joined #dri-devel
nchery has quit [Read error: Connection reset by peer]
rexbcchen has quit [Ping timeout: 480 seconds]
linearcannon has joined #dri-devel
<airlied> karolherbst: with a tree full of hacks, I've gotten a very simple image write test to pass on clover/aco
mbrost_ has quit [Read error: Connection reset by peer]
Daanct12 has joined #dri-devel
Daanct12 has quit [Remote host closed the connection]
kts has joined #dri-devel
kchibisov has quit [Read error: Connection reset by peer]
kchibisov has joined #dri-devel
sdutt has joined #dri-devel
kts has quit [Quit: Konversation terminated!]
heat has quit [Remote host closed the connection]
janesma has quit [Quit: Leaving]
mbrost has joined #dri-devel
Daanct12 has joined #dri-devel
cef has joined #dri-devel
Daanct12 has quit [Ping timeout: 480 seconds]
aravind has joined #dri-devel
nchery has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
Daanct12 has joined #dri-devel
bmodem has joined #dri-devel
karolherbst has quit [Ping timeout: 480 seconds]
karolherbst has joined #dri-devel
sarnex has quit [Read error: No route to host]
sarnex has joined #dri-devel
mclasen has quit [Ping timeout: 480 seconds]
YuGiOhJCJ has joined #dri-devel
Duke`` has joined #dri-devel
LexSfX has quit []
tzimmermann has joined #dri-devel
hch12907_ has joined #dri-devel
danvet has joined #dri-devel
hch12907 has quit [Ping timeout: 480 seconds]
hch12907 has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
hch12907_ has quit [Ping timeout: 480 seconds]
Company has quit [Quit: Leaving]
mbrost has joined #dri-devel
LexSfX has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
mszyprow has joined #dri-devel
Haaninjo has joined #dri-devel
cheako has quit [Quit: Connection closed for inactivity]
<bluepenquin> Hi, I'm trying to bisect to find a regression but building fails with `/usr/include/directx/d3d12video.h:1086:27: error: ‘NumTexture2Ds’ is not a type`, any idea? main/c4cec842315313a24342d1d9a4dbd4ad11fbdd6c (HEAD)
mbrost has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
mbrost has quit [Read error: Connection reset by peer]
i-garrison has quit []
mvlad has joined #dri-devel
tursulin has joined #dri-devel
MajorBiscuit has joined #dri-devel
<bluepenquin> Oh, it is already filed https://gitlab.freedesktop.org/mesa/mesa/-/issues/6511, guess I'll build directx-headers from git with makepkg now...
rgallaispou has joined #dri-devel
<MrCooper> feaneron: authentication is only required with /dev/dri/card* (and even then not with DRM master), not with /dev/dri/render*; so why iris doesn't work with the latter remains a mystery
lynxeye has joined #dri-devel
adavy has joined #dri-devel
rasterman has joined #dri-devel
i-garrison has joined #dri-devel
jkrzyszt has joined #dri-devel
lemonzest has joined #dri-devel
ahajda has joined #dri-devel
pcercuei has joined #dri-devel
MajorBiscuit has quit [Quit: WeeChat 3.4]
MajorBiscuit has joined #dri-devel
Daanct12 has quit [Ping timeout: 480 seconds]
ahajda_ has joined #dri-devel
ahajda has quit [Ping timeout: 480 seconds]
camus has quit [Remote host closed the connection]
camus has joined #dri-devel
garrison has joined #dri-devel
i-garrison has quit [Read error: Connection reset by peer]
simon-perretta-img has joined #dri-devel
sdutt_ has joined #dri-devel
MrCooper has quit [Remote host closed the connection]
MrCooper has joined #dri-devel
sdutt has quit [Ping timeout: 480 seconds]
<dj-death> danylo: I was hoping to drop the cs argument from the trace points
<dj-death> danylo: for Anv/Iris it's always the same, and we can get it back from the u_trace struct
<danylo> dj-death: But it's not true for Turnip
<dj-death> yeah, just saw that :)
<danylo> It should be useful only for tilers
tzimmermann has quit [Quit: Leaving]
tzimmermann has joined #dri-devel
mnadrian has quit [Ping timeout: 480 seconds]
lumag_ has quit [Remote host closed the connection]
lumag_ has joined #dri-devel
rkanwal has joined #dri-devel
jcwasmx86[m] has joined #dri-devel
<dolphin> airlied: danvet: The fix for https://lore.kernel.org/all/CAHk-=wj0gHsG6iw3D8ufptm9a_dvTSqrrOFY9WopObbYbyuwnA@mail.gmail.com/ got pushed yesterday after the last cherry-pick for -fixes PR
<dolphin> I have now scheduled another CI run for it and will send a late PR tomorrow assuming it passes CI
<karolherbst> airlied: nice
icecream95 has quit [Ping timeout: 480 seconds]
sdutt_ has quit [Ping timeout: 480 seconds]
MrCooper has quit [Quit: Leaving]
mclasen has joined #dri-devel
MrCooper has joined #dri-devel
flacks has quit [Quit: Quitter]
flacks has joined #dri-devel
apinheiro has joined #dri-devel
bmodem has quit []
devilhorns has joined #dri-devel
MajorBiscuit has quit [Ping timeout: 480 seconds]
MajorBiscuit has joined #dri-devel
MajorBiscuit has quit []
aravind has quit [Ping timeout: 480 seconds]
apinheiro has quit [Ping timeout: 480 seconds]
shadeslayer has left #dri-devel [Konversation terminated!]
mclasen has quit [Ping timeout: 480 seconds]
mclasen has joined #dri-devel
mbrost has joined #dri-devel
mahkoh has joined #dri-devel
<alyssa> airlied: Woo!
Company has joined #dri-devel
shadeslayer has joined #dri-devel
tzimmermann has quit [Quit: Leaving]
garrison has quit [Read error: Connection reset by peer]
garrison has joined #dri-devel
hch12907[m] has joined #dri-devel
simon-perretta-img_ has joined #dri-devel
hch12907[m] has quit [Quit: Reconnecting]
hch12907[m] has joined #dri-devel
simon-perretta-img has quit [Ping timeout: 480 seconds]
devilhorns has quit []
sdutt has joined #dri-devel
ella-0 has joined #dri-devel
ella-0_ has quit [Read error: Connection reset by peer]
mbrost has quit [Ping timeout: 480 seconds]
mbrost has joined #dri-devel
rexbcchen_ has quit [Ping timeout: 480 seconds]
simon-perretta-img_ has quit []
simon-perretta-img has joined #dri-devel
mbrost_ has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
mahkoh has quit [Quit: Page closed]
mszyprow has quit [Ping timeout: 480 seconds]
<jekstrand> Does gallium have any sort of a surface cache? Asking for a friend.
lumag_ has quit [Ping timeout: 480 seconds]
<jekstrand> Specifically, to cache the result of create_surface
<jekstrand> Sounds like something Zink might like
<alyssa> jekstrand: No, I don't think so. create_surface is expected to be cheap.
<jekstrand> :-/
Duke`` has joined #dri-devel
Peuc has quit [Quit: Peuc]
Peuc has joined #dri-devel
whald has quit [Remote host closed the connection]
YuGiOhJCJ has quit [Quit: YuGiOhJCJ]
<alyssa> anholt: let's play count the (preexisting) bugs !16437 has uncovered
<alyssa> decl_reg vec3 64 r0
<alyssa> decl_reg vec3 64 r1
<alyssa> but here's problems #1 and #2
<bbrezillon> jekstrand: hm, I think I got some vk_sync expectations wrong when implementing the ID3D12Fence wrapper. So, ID3D12Fence is basically a timeline-sync, but, AFAICT, it can also do binary (we can reset the fence and multi-wait is supported as well). The thing is, I'm not quite sure the vk_semaphore wrapper works as expected, unless I use the vk_sync_binary implementation to emulate
<bbrezillon> binary syncs over timeline ones. Looks like the sync object is not explicitly reset after a wait in that case
<bbrezillon> to sum up, when I use dzn_sync_type for binary syncs, I think I end up in a case where the vk_semaphore only works once, and once it's been signaled, it stays signaled forever. I suspect that's because I shouldn't declare dzn_sync_type as supporting the binary model, but I'd like to be sure I'm not missing something
<alyssa> we're up to at least 3, i think
<jekstrand> bbrezillon: I don't think you're missing anything.
<jekstrand> bbrezillon: I'm not sure how binary D3D12Fence works. It may be that it can be made to work as-is but it's not obvious to me that it can.
<jekstrand> bbrezillon: If not, just throw a vk_sync_binary around it and move on
<jekstrand> That's what vk_sync_binary is for
<jekstrand> bbrezillon: It really only matters if we care about fence/semaphore sharing of binary things
<bbrezillon> well, we can signal any value we want, so the binary implementation was just using 0 and 1, and the reset was implemented as a signal(0)
<jekstrand> hakzsam: I should close both of them.
<jekstrand> hakzsam: I don't think we can do that filtering reliably and the loader has already fixed it.
<jekstrand> hakzsam: yup
<hakzsam> great, thanks
<jekstrand> bbrezillon: Right... But there's no good way to pipeline that. So it works ok for fences where you have an explicit reset but there's no way to make it work for semaphores.
<bbrezillon> exactly
<bbrezillon> I'd need to know what the sync is used for
<jekstrand> bbrezillon: You could have a vk_sync_type which exports binary but not GPU_WAIT
<jekstrand> But I don't really see a point
<jekstrand> Shared fences are the least interesting of the sharable objects
<bbrezillon> or some sort of RESET_AFTER_WAIT flag, but I'm fine with the binary_sync wrapper ;)
<jekstrand> :)
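The binary-over-timeline problem discussed above can be sketched with a toy model. This is only an illustration, not Mesa or D3D12 code: `TimelineFence`, `NaiveBinary`, `WrappedBinary`, `next_point` and `consume` are hypothetical names. The naive variant uses payloads 0 and 1 with reset as signal(0); the wrapped variant advances the point that waiters look for, loosely in the spirit of throwing a binary wrapper around a timeline.

```python
class TimelineFence:
    """Toy stand-in for a timeline sync object holding a 64-bit payload."""

    def __init__(self):
        self.value = 0

    def signal(self, value):
        self.value = value

    def is_signaled(self, point):
        # A timeline wait completes once the payload reaches the point.
        return self.value >= point


class NaiveBinary:
    """Binary sync emulated with payloads 0 and 1; reset is signal(0)."""

    def __init__(self):
        self.fence = TimelineFence()

    def signal(self):
        self.fence.signal(1)

    def wait_ready(self):
        return self.fence.is_signaled(1)

    def reset(self):
        # Needs an explicit call; nothing does this after a semaphore wait.
        self.fence.signal(0)


class WrappedBinary:
    """Binary sync layered on a timeline with an advancing wait point."""

    def __init__(self):
        self.fence = TimelineFence()
        self.next_point = 1

    def signal(self):
        self.fence.signal(self.next_point)

    def wait_ready(self):
        return self.fence.is_signaled(self.next_point)

    def consume(self):
        # Once a wait has been queued, later waits target a fresh point.
        self.next_point += 1


naive = NaiveBinary()
naive.signal()
naive_first = naive.wait_ready()
naive_second = naive.wait_ready()    # still ready: nobody reset the payload

wrapped = WrappedBinary()
wrapped.signal()
wrapped_first = wrapped.wait_ready()
wrapped.consume()
wrapped_second = wrapped.wait_ready()  # not ready until the next signal

print(naive_first, naive_second, wrapped_first, wrapped_second)
```

In the naive model the second wait succeeds without a new signal, which matches the "stays signaled forever" symptom. Advancing the wait point avoids needing a reset that has to be ordered against GPU work.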
<jekstrand> Anyone want to help me get lavapipe built on Windows?
<jekstrand> I'm getting lots of "mismatch detected for 'RuntimeLibrary': value 'MTd_StaticDebug' doesn't match value 'MDd_DynamicDebug' in liblavapipe_st.a"
<dcbaker> are you using a pre-built llvm?
<bbrezillon> -Db_vscrt=mtd maybe?
<jekstrand> cwabbott: Mind taking a quick look at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16111 ? It's bitrotting on my main branch and does fix bugs.
<dcbaker> bbrezillon: that's what I was thinking too
<jekstrand> worth a try
<anholt> alyssa: I would recommend giving a little bit of a readme in there of how to use it? other than that, I'm +1 for just adding a tool like that.
<jekstrand> bbrezillon: Seems to have done the trick. Thanks!
<jekstrand> Now to see why vkcube doesn't work
jkrzyszt has quit [Ping timeout: 480 seconds]
<alyssa> anholt: sure, happy to add documentation for that. more making sure it's kosher given the function entrypoints etc
<jekstrand> And... now to build everything for x64 instead of x86...
<jekstrand> *sigh*
AndrewR has quit [Ping timeout: 480 seconds]
Duke`` has quit [Ping timeout: 480 seconds]
AndrewR has joined #dri-devel
* jekstrand has lavapipe gears!
<alyssa> Woo!
Duke`` has joined #dri-devel
<jekstrand> dcbaker: Does meson have a concept of functions or subroutines?
<alyssa> jekstrand: you're in too deep
<jekstrand> dcbaker: Wanting to wrap the ICD stuff
<jekstrand> We want --use-backslash whenever we're targeting windows
<dcbaker> jekstrand: user defined functions? no, that's an explicit design decision
<dcbaker> which part of the ICD stuff?
<jekstrand> dcbaker: The ICD generator that calls vk_icd_gen.py
<jekstrand> dcbaker: Currently, every caller is hand-rolling a bunch of stuff and that's not great
<jekstrand> It's all the same except for the version to embed in the json file and the .so name.
<jekstrand> except it's not because radv has windows fixes no one else has
<jekstrand> dcbaker: It doesn't even need to be a full function. Just a thing where i can define a custom generator to call later
<dcbaker> jekstrand: we've been talking about custom_target templates which would probably solve that issue
<dcbaker> I've also thought about something like python's function `partials`
<dcbaker> This is a known problem, it's just a hard problem to solve
<jekstrand> dcbaker: See also https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16612 (copied from RADV)
stuart has joined #dri-devel
mclasen has quit [Ping timeout: 480 seconds]
<jekstrand> dj-death: I need to try and finish that up this week.
<jekstrand> I need to do the other half
<jekstrand> dj-death: If you wanted to review the one patch that's there, that'd be good.
eukara has quit [Remote host closed the connection]
mclasen has joined #dri-devel
<dj-death> jekstrand: just asking in the context of vm_bind because it refers to it
<dj-death> jekstrand: I can have a look, are you going to add a lot to it?
heat has joined #dri-devel
<jekstrand> dj-death: Right now, it has stuff to wait on dma-bufs. I need to add the stuff to signal them.
<jekstrand> Which may refactor the wait stuff a bit, actually...
eukara has joined #dri-devel
eukara has quit [Remote host closed the connection]
eukara has joined #dri-devel
mbrost has joined #dri-devel
mbrost_ has quit [Read error: Connection reset by peer]
rasterman has quit [Quit: Gettin' stinky!]
stuart has quit [Ping timeout: 480 seconds]
mclasen has quit [Ping timeout: 480 seconds]
nchery is now known as Guest942
Guest942 has quit [Read error: Connection reset by peer]
nchery has joined #dri-devel
pendingchaos_ has joined #dri-devel
pendingchaos has quit [Ping timeout: 480 seconds]
lynxeye has quit [Quit: Leaving.]
X512 has joined #dri-devel
mclasen has joined #dri-devel
<X512> Anybody knowledgeable about Gallium?
<clever> [clever@amd-nixos:~]$ V3D_DEBUG=help MESA_LOADER_DRIVER_OVERRIDE=v3d glxinfo
<clever> name of display: :0.0
<clever> Killed
<clever> [172536.665184] BUG: kernel NULL pointer dereference, address: 00000000000001d8
<clever> [172536.665287] Call Trace:
<clever> [172536.665322] ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
<clever> that aint good!
<marex> indeed, the lack of information provided is not good
<marex> dmesg > paste.debian.net
<airlied> it's like fuzztesting by accident
<clever> marex: at a glance, it seems to be 2 problems, 1: the v3d driver is issuing ioctl's to an amdgpu card, 2: the amdgpu driver isnt validating the inputs
<clever> airlied: yep!
<airlied> yeah 2 definitely needs to be fixed there
<clever> yep, it could potentially be a security problem
<clever> checking the os packages, amdgpu is part of the kernel source tree
mbrost has quit [Remote host closed the connection]
mbrost has joined #dri-devel
<clever> this region
<clever> the backtrace doesnt say where it was called from properly
<clever> aha, thought so, tail-call
jcwasmx86[m] has left #dri-devel [#dri-devel]
<clever> and an ioctl list in each driver...
<airlied> yeah no need to go backwards
<airlied> the trick is to find where RIP: 0010:amdgpu_cs_ioctl+0x96/0x1ce0 is
alyssa has left #dri-devel [#dri-devel]
<airlied> the ? in the backtrace can often be junk
<clever> i can disasm the function as well
<airlied> yeah that might be interesting
<airlied> there is also some addr2line thing
<clever> [nix-shell:~/apps/rpi/lk-overlay]$ cat /run/current-system/kernel-modules/lib/modules/5.10.81/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.xz | unxz > amdgpu.ko
<clever> [nix-shell:~/apps/rpi/lk-overlay]$ objdump -d amdgpu.ko
<airlied> ah yeah having a look at +0x96 might point it out
<airlied> need amdgpu_cs function
<airlied> amdgpu_cs_ioctl function
<clever> > (0x000000000001c540 + 0x110).toString(16)
<clever> '1c650'
<clever> ah, i see what you mean by ?
<clever> its pointing to the byte right after the end of the function
<airlied> yeah just look at the RIP line
<airlied> is usually enough info
<clever> but for that, i need to also know how the kernel relocated the module
<marex> Documentation/admin-guide/bug-hunting.rst gdb section is a _very_ good read
<airlied> nope amdgpu_cs_ioctl+0x96 should be easy to find from a objdump
<marex> gdb l part is especially useful when decoding these splats
<marex> airlied: that's it indeed
<clever> airlied: gist updated again, amdgpu_cs_ioctl was too long to copy/paste out of the objdump, so i just shoved the entire thing (43mb) into the gist
<clever> that may have made github upset...
<clever> yeah, it doesnt like a 43mb file being uploaded, lol
<clever> > (0x000000000001c650 + 0x96).toString(16)
<clever> '1c6e6'
<clever> comparing asm to src...
<clever> ah, and i forgot `-dr`, relocations are making it a tad more muddy, with a call to __fentry__ prepended to every func
<airlied> 1c6e6:48 83 ba d8 01 00 00 cmpq $0x0,0x1d8(%rdx)
<clever> yep
<clever> [172536.665184] BUG: kernel NULL pointer dereference, address: 00000000000001d8
<clever> that also matches the 1d8 offset from null
<airlied> objdump -S might show some more info if you have debuginfo in there
<clever> so rdx must be null, and the reg dump agrees
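The arithmetic in the exchange above can be written out as a small sketch. The function start 0x1c650 is the value inferred in the chat for this particular objdump, and `parse_symbol_offset` is a hypothetical helper name, not a kernel tool.

```python
import re


def parse_symbol_offset(text):
    """Split a 'symbol+0xOFF/0xSIZE' reference into (symbol, offset, size)."""
    match = re.fullmatch(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text)
    if match is None:
        raise ValueError(f"unrecognised symbol reference: {text!r}")
    symbol, offset, size = match.groups()
    return symbol, int(offset, 16), int(size, 16)


# RIP line from the oops quoted in the discussion.
symbol, offset, size = parse_symbol_offset("amdgpu_cs_ioctl+0x96/0x1ce0")

# Start of amdgpu_cs_ioctl in the unrelocated objdump of amdgpu.ko
# (value taken from the chat; it differs for every build).
function_start = 0x1C650
faulting_insn = function_start + offset

# The instruction there is `cmpq $0x0,0x1d8(%rdx)`, so the accessed
# address is rdx + 0x1d8. The oops reports a fault at 0x1d8.
fault_address = 0x1D8
displacement = 0x1D8
rdx = fault_address - displacement

print(symbol, hex(faulting_insn), hex(rdx))
```

Since the offset is relative to the symbol, the module's load address is not needed to find the instruction in the disassembly.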
<clever> -S doesnt show anything more
<clever> not sure where it would have landed during my kernel build, *looks*
<clever> airlied: by chance, do you have the nix package manager installed?
<clever> or docker
mbrost has quit [Ping timeout: 480 seconds]
<airlied> nope not on where I'm at
<airlied> agd5f: ^ fyi
<airlied> clever: does addr2line work?
<clever> the directions in gistfile3 will give you the exact binaries i'm running, both kernel and modules
<clever> checking...
<airlied> though that's a pretty old kernel
<airlied> so it might be fixed
<clever> every change to that file, up to current master
<clever> my kernel is from sept 2020
<airlied> yeah it says 5.10.81 which is old by main tree terms
<airlied> not all fixes will end up in stable kernels
<clever> yeah
<airlied> we'd have to see if it reproduces on something a bit closer to Linus' tree to be sure
<clever> > If the kernel is compiled with ``CONFIG_DEBUG_INFO``, you can enhance the
<clever> [root@amd-nixos:~]# cat /proc/config.gz | gunzip | grep DEBUG_INFO
<clever> CONFIG_DEBUG_INFO=y
<clever> thats a good sign
Daanct12 has joined #dri-devel
Daanct12 has quit [Remote host closed the connection]
<airlied> okay it kills my 5.17.5 kernel
<feaneron> if i use a gbm_surface through EGLSurface, does that mean i cannot use functions like gbm_surface_lock_front_buffer() and gbm_surface_release_buffer() ?
<clever> WARNING! Modules path isn't set, but is needed to parse this symbol
<clever> [nix-shell:~/apps/rpi/lk-overlay]$ dmesg | bash ~/apps/rpi/linux/scripts/decode_stacktrace.sh /nix/store/hprwry55jwyd71ng7v7c2rhk3a3z1im8-linux-5.10.81/bzImage /nix/store/hprwry55jwyd71ng7v7c2rhk3a3z1im8-linux-5.10.81/lib/modules/
<clever> airlied: it doesnt like my modules being in a weird place, but if you can reproduce it, you may be able to get debug out of things faster
<daniels> feaneron: no, they’re there for exactly that and used by all commodities
<clever> i'm going to read the asm more, and trace where edx came from
<daniels> *compositors
<clever> rdx*
<clever> 1c6ca: 48 89 54 24 68 mov %rdx,0x68(%rsp)
<clever> it came off the stack, and i would need debug info or sizeof's to trace it easily
<feaneron> daniels: okay, thanks, this is useful to know
<feaneron> this is just a super odd issue. if i create a gbm_device from /dev/dri/card0, the app crashes at the first glDrawArrays() call
<feaneron> if i create a gbm_device from /dev/dri/renderD128, egl returns failure for many functions, but never crashes
<clever> airlied: it faulted after amdgpu_ras_intr_triggered() but before printk_ratelimit(), and amdgpu_cs_parser_init() was inlined, cant make out much more from reading asm by hand, to ghidra!
<airlied> I should be able to figure out how to get more info here
<airlied> though might be good to file an issue https://gitlab.freedesktop.org/drm/amd
<clever> ioctl(6, DRM_IOCTL_EXYNOS_GEM_GET or DRM_IOCTL_PANFROST_GET_PARAM or DRM_IOCTL_QXL_GETPARAM or DRM_IOCTL_TEGRA_SYNCPT_WAIT or DRM_IOCTL_V3D_GET_PARAM or DRM_IOCTL_VC4_MMAP_BO <unfinished ...>) = ?
<clever> what strace reported
<clever> given that i was using the v3d userland driver, DRM_IOCTL_V3D_GET_PARAM fits best
<airlied> yeah it's just an ioctl full of garbage
<airlied> as far as amdgpu is concerned it shouldn't trust any of it
<clever> yep
<clever> include/uapi/drm/v3d_drm.h:#define DRM_IOCTL_V3D_GET_PARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, struct drm_v3d_get_param)
<clever> include/uapi/drm/v3d_drm.h:#define DRM_V3D_GET_PARAM 0x04
<clever> its just starting from 0, so there is a high chance of overlap with amdgpu
<clever> #define DRM_AMDGPU_CS 0x04
<clever> #define DRM_IOCTL_AMDGPU_CS DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_CS, union drm_amdgpu_cs)
<clever> airlied: and bingo, DRM_V3D_GET_PARAM and DRM_AMDGPU_CS are the same ioctl#!
<clever> so amdgpu is expecting a drm_amdgpu_cs, but mesa gave it a drm_v3d_get_param
<clever> drm_v3d_get_param contains a 32bit, 32bit, and 64bit int
jfalempe has quit [Quit: Leaving]
<clever> drm_amdgpu_cs is a union between several larger structures, a 32/32/32/32/64 input, and a 64bit output
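A rough sketch of why the two requests collide, using the standard Linux `_IOWR` bit layout on x86. The sizes (16 and 24 bytes) follow from the field widths quoted above. The helper names are made up for illustration. The note that DRM picks the handler from the command number rather than the full encoded value is my understanding, not something stated in the chat.

```python
# Linux ioctl number layout (x86): nr in bits 0-7, type in bits 8-15,
# size in bits 16-29, direction in bits 30-31.
IOC_NRSHIFT, IOC_TYPESHIFT, IOC_SIZESHIFT, IOC_DIRSHIFT = 0, 8, 16, 30
IOC_WRITE, IOC_READ = 1, 2

DRM_IOCTL_BASE = ord("d")
DRM_COMMAND_BASE = 0x40


def drm_iowr(nr, size):
    """Encode a read/write DRM ioctl number for a payload of `size` bytes."""
    return (
        ((IOC_READ | IOC_WRITE) << IOC_DIRSHIFT)
        | (size << IOC_SIZESHIFT)
        | (DRM_IOCTL_BASE << IOC_TYPESHIFT)
        | (nr << IOC_NRSHIFT)
    )


def ioc_nr(cmd):
    return (cmd >> IOC_NRSHIFT) & 0xFF


def ioc_size(cmd):
    return (cmd >> IOC_SIZESHIFT) & 0x3FFF


DRM_V3D_GET_PARAM = 0x04
DRM_AMDGPU_CS = 0x04

# struct drm_v3d_get_param: 32 + 32 + 64 bits = 16 bytes.
v3d_cmd = drm_iowr(DRM_COMMAND_BASE + DRM_V3D_GET_PARAM, 16)
# union drm_amdgpu_cs: the larger member is 4 x 32 + 64 bits = 24 bytes.
amdgpu_cmd = drm_iowr(DRM_COMMAND_BASE + DRM_AMDGPU_CS, 24)

print(hex(v3d_cmd), hex(amdgpu_cmd))
print(ioc_nr(v3d_cmd) == ioc_nr(amdgpu_cmd), ioc_size(v3d_cmd), ioc_size(amdgpu_cmd))
```

The full encoded values differ only in the size field. The command number is identical, which is why a v3d request sent to an amdgpu node lands in the CS handler.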
<clever> airlied: at the simplest level, this should immediately fail, because the userland didnt supply something big enough to be a valid union drm_amdgpu_cs
lumag_ has joined #dri-devel
<airlied> we don't trust userspace length either, so things get 0 padded
pendingchaos_ is now known as pendingchaos
<clever> and that padding is likely where the null-ptr came from
mbrost has joined #dri-devel
Duke`` has quit [Ping timeout: 480 seconds]
frankbinns has quit [Remote host closed the connection]
<clever> re-reading the source, now that i know the input type...
<clever> airlied: yep, this data ptr on line 1292, is the same type the ioctl is expecting from userland
<clever> so the cs variable there, will have been zero padded
<clever> first thing it does is check how many chunks are in the request
<clever> thats the 3rd 32bit field, which collides with half of the 64bit "value" field in the v3d req
ahajda_ has quit []
<clever> so its mis-parsing the id of a config variable? as a count
<clever> then it allocates an array of the right size
<clever> and calls amdgpu_ctx_get on the ctx_id field (which is the v3d param)
<clever> thats likely to malfunction more
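The field overlap described above can be sketched with Python's `struct` module. The field names follow my recollection of the uapi headers and should be treated as assumptions. The fourth 32-bit field is left unnamed because its name differs between kernel versions. Little-endian layout is assumed, and `reinterpret_as_amdgpu_cs` is a hypothetical helper.

```python
import struct

# struct drm_v3d_get_param: u32 param, u32 pad, u64 value (16 bytes).
V3D_GET_PARAM = struct.Struct("<IIQ")
# Input side of union drm_amdgpu_cs: u32 ctx_id, u32 bo_list_handle,
# u32 num_chunks, one more u32, then u64 chunks (24 bytes).
AMDGPU_CS_IN = struct.Struct("<IIIIQ")


def reinterpret_as_amdgpu_cs(param, value):
    """Show how a v3d get-param request reads when parsed as an amdgpu CS."""
    user_bytes = V3D_GET_PARAM.pack(param, 0, value)
    # The kernel-side buffer is larger than what userspace sent, so the
    # tail is zero filled, as mentioned in the discussion.
    padded = user_bytes.ljust(AMDGPU_CS_IN.size, b"\0")
    ctx_id, bo_list_handle, num_chunks, fourth_u32, chunks = AMDGPU_CS_IN.unpack(padded)
    return {
        "ctx_id": ctx_id,
        "bo_list_handle": bo_list_handle,
        "num_chunks": num_chunks,
        "fourth_u32": fourth_u32,
        "chunks": chunks,
    }


# param 4 is the example value mentioned in the chat; value is whatever
# happened to be in the caller's struct.
zeroed = reinterpret_as_amdgpu_cs(param=4, value=0)
garbage = reinterpret_as_amdgpu_cs(param=4, value=0x0000000700000003)

print(zeroed)
print(garbage)
```

The v3d `param` lands on `ctx_id`, the low half of `value` becomes `num_chunks`, and the `chunks` pointer comes entirely from the zero padding.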
mszyprow has joined #dri-devel
<clever> ah, its a generic api to map an id handle back to a pointer
<clever> and assuming it starts at 0, and the value id's in v3d start near 0, collisions again
<clever> so v3d is fetching a random amd context, from early in the init
<clever> airlied: i'm assuming that this copy_from_user is the most likely thing to fault, but i would have expected it to have safeties against NULL
<airlied> clever: copy_from_user can't fault
<airlied> it's the whole point of it
<clever> exactly
<clever> so i'm starting to get a bit lost on where it could be malfunctioning
iive has joined #dri-devel
<clever> personally, next step would be to just spam printk's everywhere
<clever> but i cant be rebooting every 5 minutes
<airlied> yeah I might just do a local build
<clever> i suspect the fault can only occur if you have an amdgpu compatible card in your system
<clever> so a VM wont do either
<airlied> yeah I've got a test machine, it just doesn't have a local build kernel on it
<airlied> if (parser->job->uf_addr && ring->funcs->no_user_fence) seems to be the oops here
<clever> is it parser->job or ring->funcs that is null?
<airlied> though not sure I trust the objdump too much
<clever> DRM_IOCTL_V3D_GET_PARAM
<clever> from what i know of the hardware, this makes the most sense as to where the fault is coming from
<clever> userland is trying to query the ident0 and ident1 registers of the hardware
<clever> so it knows exactly what its dealing with
<clever> that param value, lands smack ontop of the amdgpu ctx value
mvlad has quit [Remote host closed the connection]
<clever> airlied: so its padding amdgpu a ctx_id of 4, which just by chance matches up to an existing valid ctx
<clever> but isnt this a security problem, where you can access a ctx belonging to another process?
<clever> s/padding/passing/
<airlied> pretty sure ctxs are per file descriptor
<clever> then i would assume the ctx_id is invalid, because the v3d client didnt issue the proper ioctl to create one?
<clever> and it should have EINVAL'd here
<airlied> but nchunks is 0
<airlied> so it never gets that far
danvet has quit [Ping timeout: 480 seconds]
<clever> airlied: num_chunks is part of the v3d value field, which is un-initialized userland stack
<clever> so it could be anything
<clever> compare drm_amdgpu_cs_in and drm_v3d_get_param in the headers
icecream95 has joined #dri-devel
<airlied> for me parser->job and parser->entity are both null when it crashes at the line above
<clever> amdgpu_cs_ioctl calls amdgpu_cs_parser_init which calls amdgpu_job_alloc
<clever> and passes it the addr of the job pointer
<clever> so you somehow used the job before allocating it
<airlied> yeah nchunks = 0 causes lots of things to fail
<airlied> but without failing properly
<clever> this feels like the userland should be telling the kernel, "yes i know you're amdgpu" before it can do anything
<clever> just to prevent a v3d client from talking to an amdgpu driver
<anholt> the kernel interface should already be fuzzed. and, if userspace wants to ignore the kernel telling it what driver it is talking to, you can't really stop them.
<clever> anholt: then how was i able to trigger a null-pointer in my amdgpu driver?
<clever> did the fuzzing miss it?
<anholt> the kernel probably wasn't fuzzed, which is a problem.
<clever> and why is the v3d client in mesa talking to an amdgpu driver
<jekstrand> Is there a way to allocate a dma-buf without opening a DRM node?
ahajda_ has joined #dri-devel
<jekstrand> I guess there's DRM_IOCTL_MODE_CREATE_DUMB
<jekstrand> I want to test if my new dma-buf ioctls exist without the VkDevice
alanc has quit [Remote host closed the connection]
alanc has joined #dri-devel
Haaninjo has quit [Quit: Ex-Chat]
apinheiro has joined #dri-devel
ahajda__ has joined #dri-devel
ahajda_ has quit [Read error: Connection reset by peer]
ahajda__ has quit []
sdutt has quit []
sdutt has joined #dri-devel
X512 has quit [Quit: Vision[]: i've been blurred!]
mszyprow has quit [Ping timeout: 480 seconds]
apinheiro has quit [Quit: Leaving]
tursulin has quit [Remote host closed the connection]
simon-perretta-img_ has joined #dri-devel
morphis has quit [Ping timeout: 480 seconds]
morphis has joined #dri-devel
mbrost has quit [Ping timeout: 480 seconds]
simon-perretta-img has quit [Ping timeout: 480 seconds]
lemonzest has quit [Quit: WeeChat 3.4]
pcercuei has quit [Quit: dodo]
rkanwal has quit [Read error: Connection reset by peer]
rkanwal has joined #dri-devel
krushia has joined #dri-devel
cheako has joined #dri-devel
iive has quit []