ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
JulianGro has quit [Remote host closed the connection]
digetx has quit [Read error: Connection reset by peer]
digetx has joined #panfrost
guillaume_g has joined #panfrost
q4a-sbc has joined #panfrost
q4a-sbc has quit [Quit: WeeChat 2.8]
q4a-sbc has joined #panfrost
q4a-sbc has quit []
q4a-sbc has joined #panfrost
q4a-sbc has quit []
q4a-sbc1 has joined #panfrost
q4a-sbc1 has quit []
q4a-sbc has joined #panfrost
q4a-sbc has quit []
guillaume_g has quit []
guillaume_g has joined #panfrost
rasterman has joined #panfrost
JulianGro has joined #panfrost
anarsoul|2 has joined #panfrost
anarsoul has quit [Read error: Connection reset by peer]
anarsoul|2 has quit [Ping timeout: 480 seconds]
anarsoul has joined #panfrost
MajorBiscuit has joined #panfrost
<tomeu> cphealy: don't know about the NPU, but would also like to know
alyssa has joined #panfrost
<alyssa> q4a: lavapipe and llvmpipe issues -> #dri-devel
<q4a> ok
q4a-sbc has joined #panfrost
q4a-sbc has quit [Quit: WeeChat 3.0]
rkanwal has joined #panfrost
WoC has quit [Remote host closed the connection]
WoC has joined #panfrost
rasterman has quit [Quit: Gettin' stinky!]
* alyssa swims in todos
<alyssa> let's see how many I can chomp down on today, though
pjakobsson has joined #panfrost
pjakobsson_ has quit [Ping timeout: 480 seconds]
guillaume_g has quit []
MajorBiscuit has quit [Quit: WeeChat 3.4]
<cphealy> tomeu: I did a little more research and it does appear to be the Verisilicon NPU IP. This Verisilicon NPU appears to be a cousin of the Vivante 3D GPU arch with some adjustments for NPU-specific use. It also appears to be used in other designs, such as the NXP i.MX8MP and various Rockchip SoCs.
<macc24> cphealy: "various rockchip socs"?
<cphealy> Let me rephrase that. I saw NPU in at least one Rockchip SoC.
<ndufresne> cphealy: I think starting from RK3399Pro and most newer
<ndufresne> It is alleged to be a Vivante NPU, which would make sense, since whenever they are missing something, RK goes to Vivante, we notice with the Hantro G1/H1, and today they picked VSI V9000D for AV1 decoding, the NPU brochure
<cphealy> I was looking at the RK3566 (which is newer.)
<cphealy> Oh nice, I wasn't aware of the V9000D for AV1 decoding. Now we just need stateless AV1 support... ;-)
<macc24> cphealy: rk3566 has npu? nice!
<cphealy> Yep, 0.5 TOPS according to pretty picture block diagrams on the internet...
<macc24> i wonder if that can be used to upscale video games on a Certain Handheld
<cphealy> Using NPU for upscaling buffers doesn't seem like a good fit.
<macc24> what if the upscaling algorithm was built around a neural network?
<macc24> like dlss but open source and usable on a 2w arm chip
<cphealy> RK3566 has a 2D engine which can probably do scaling more efficiently.
<macc24> eh
<cphealy> Mali G52-MP2 can probably do the scaling pretty efficiently too.
<macc24> i guess
<macc24> the reason for using npu would be to train the neural network for actual retro game images
<macc24> to have it look better
<cphealy> Wouldn't a nearest neighbor upscale make the most sense?
<macc24> it would only look good with certain pixel ratios in that case
<cphealy> ack
erle has quit [Ping timeout: 480 seconds]
<macc24> how are you going to upscale |# | to | |? both |## | and |# | will /not/ look good
greenjustin has joined #panfrost
<macc24> nearest neighbour to the nearest integer scale, then bilinear, looks the best *in my opinion*, unless it's close enough to not be that bad, like 2.1x scale
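The two-step scaling macc24 describes (nearest neighbour up to the closest integer factor, then bilinear to the final size) could be sketched roughly as follows. This is only an illustration for 8-bit single-channel images; the function names are hypothetical, and picking the integer factor from the width ratio alone is an assumption, not something stated in the channel.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Integer factor closest to dst/src, never less than 1. */
unsigned nearest_integer_scale(unsigned src, unsigned dst)
{
    unsigned k = (2 * dst + src) / (2 * src); /* round(dst / src) */
    return k ? k : 1;
}

/* Step 1: replicate each source pixel into a k x k block. */
void upscale_nearest(const uint8_t *src, unsigned w, unsigned h,
                     unsigned k, uint8_t *dst)
{
    for (unsigned y = 0; y < h * k; ++y)
        for (unsigned x = 0; x < w * k; ++x)
            dst[(size_t)y * w * k + x] = src[(size_t)(y / k) * w + x / k];
}

/* Step 2: bilinear resample from sw x sh to dw x dh, pixel centres aligned. */
void resize_bilinear(const uint8_t *src, unsigned sw, unsigned sh,
                     uint8_t *dst, unsigned dw, unsigned dh)
{
    for (unsigned y = 0; y < dh; ++y) {
        float fy = ((float)y + 0.5f) * (float)sh / (float)dh - 0.5f;
        if (fy < 0.0f) fy = 0.0f;
        unsigned y0 = (unsigned)fy;
        unsigned y1 = (y0 + 1 < sh) ? y0 + 1 : sh - 1; /* clamp at edge */
        float ty = fy - (float)y0;

        for (unsigned x = 0; x < dw; ++x) {
            float fx = ((float)x + 0.5f) * (float)sw / (float)dw - 0.5f;
            if (fx < 0.0f) fx = 0.0f;
            unsigned x0 = (unsigned)fx;
            unsigned x1 = (x0 + 1 < sw) ? x0 + 1 : sw - 1;
            float tx = fx - (float)x0;

            float top = src[(size_t)y0 * sw + x0] * (1.0f - tx) +
                        src[(size_t)y0 * sw + x1] * tx;
            float bot = src[(size_t)y1 * sw + x0] * (1.0f - tx) +
                        src[(size_t)y1 * sw + x1] * tx;
            dst[(size_t)y * dw + x] =
                (uint8_t)(top * (1.0f - ty) + bot * ty + 0.5f);
        }
    }
}

/* Both steps together. Returns 0 on success, -1 if allocation fails.
 * Assumption: the integer factor is chosen from the width ratio only. */
int scale_two_step(const uint8_t *src, unsigned sw, unsigned sh,
                   uint8_t *dst, unsigned dw, unsigned dh)
{
    unsigned k = nearest_integer_scale(sw, dw);
    uint8_t *tmp = malloc((size_t)sw * k * sh * k);
    if (!tmp)
        return -1;

    upscale_nearest(src, sw, sh, k, tmp);
    resize_bilinear(tmp, sw * k, sh * k, dst, dw, dh);
    free(tmp);
    return 0;
}
```

For a 2.1x target the integer step is 2x and the bilinear pass only has to cover the remaining 1.05x, which is why the blur stays small.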
erle has joined #panfrost
anarsoul|2 has joined #panfrost
rasterman has joined #panfrost
anarsoul has quit [Ping timeout: 480 seconds]
anarsoul|2 has quit [Ping timeout: 480 seconds]
anarsoul has joined #panfrost
<ndufresne> macc24: the question will be if 0.5 TOPS is enough for the task, gaming GPUs have much bigger NPUs
<macc24> ndufresne: gaming gpu would be used up by game when gaming
<macc24> and rk3566 has "just" mali g52
<ndufresne> I'm just saying that the NPU is slightly undersized for the display capability
WoC has quit [Remote host closed the connection]
* alyssa considers rewriting Valhall's linking code since it's broken
<alyssa> Or I could keep throwing more hacks on top of it... hmm
<macc24> alyssa: are you paid hourly?
<macc24> if yes then just throw more hacks on top of it
WoC has joined #panfrost
<macc24> ;D
<alyssa> Hmm..
<macc24> write good code and they'll only need you once, write spaghett and they'll need you forever to maintain it
* alyssa tries to figure out what a production Valhall linker would need
<alyssa> I guess abstractly for each varying, we need to know its offset and type.
<alyssa> There's not an obvious way to do this with separable shaders
<ndufresne> (sounds like ... it's broken for Vulkan ...)
<alyssa> ndufresne: other way around, no?
<alyssa> Vulkan gives you the whole pipeline at once?
<alyssa> no separable shaders to worry about
<ndufresne> I was thinking the other way around, is there a thing like separable shaders in GL?
<alyssa> I'm so lost
<alyssa> Separable shaders are a GL-only thing.
<ndufresne> oh really, I'm not super up to data
<ndufresne> date
<ndufresne> I mean that could be a handy thing for sure, but I can understand that it will impair the optimization
<ndufresne> sounds like you need some ABI
<alyssa> The SSO rules make this a bit easier: precision must match, so types (f32/f16) can be figured out independently
<alyssa> in theory. I seem to recall this being broken with some internal shaders forcing bifrost to carry a hack for this
<alyssa> oh, that was for interp modes
<alyssa> /* In principle we can do better for 16-bit. At the moment we require
<alyssa> * 32-bit to permit the use of .auto, in order to force .u32 for flat
<alyssa> * varyings, to handle internal TGSI shaders that set flat in the VS
<alyssa> * but smooth in the FS */
<alyssa> oh and a thing for XFB. but shrug.
<alyssa> so assume we know the precisions, we just need to look up the offsets
<alyssa> for GLES + highp, this 'can' be as easy as using the slot
<alyssa> for desktop GL, that won't work well because of the extra weird slots
<alyssa> though we can make a little lookup table that should be pretty cheap
<alyssa> base location -> offset
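A "base location -> offset" table of the kind alyssa mentions might look something like the sketch below. It is not Mesa or Panfrost code: the slot count, the `SLOT_UNUSED` sentinel, the function name, and the assumption that every varying is a 16-byte highp vec4 are all made up for illustration.

```c
#include <stdint.h>

#define MAX_SLOTS      64       /* hypothetical number of varying slots */
#define SLOT_UNUSED    0xFFFFu  /* hypothetical marker: slot is not written */
#define BYTES_PER_SLOT 16       /* assumption: every varying is a highp vec4 */

/* Pack the written slots tightly, in slot order, and record where each one
 * lands. Bit n of written_mask is set when slot n is written. */
void build_offset_table(uint64_t written_mask, uint16_t table[MAX_SLOTS])
{
    uint16_t offset = 0;

    for (unsigned slot = 0; slot < MAX_SLOTS; ++slot) {
        if (written_mask & (UINT64_C(1) << slot)) {
            table[slot] = offset;
            offset += BYTES_PER_SLOT;
        } else {
            table[slot] = SLOT_UNUSED;
        }
    }
}
```

The GLES + highp shortcut from the line above would correspond to skipping the table and using `slot * BYTES_PER_SLOT` directly; the table only pays off when the slots in use are sparse, as with the extra desktop GL slots.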
<alyssa> jekstrand: You know what this calls for? more GL-only sysvals!
<jekstrand> Why is dma_fence_release calling an IRQ handler?!?
<jekstrand> alyssa: Fun!
<alyssa> jekstrand: it's specifically out to get you
<jekstrand> alyssa: It's looking that way
<jekstrand> Then again, I never 100% trust kernel stack traces...
<alyssa> whoever wrote that code was just thinking,
<jekstrand> It doesn't help that this is a race on DRM file close. :-/
<alyssa> "Man, you know who I hate and want to make miserable? jekstrand! I should call an IRQ handler."
<jekstrand> probably
<jekstrand> I can take down the panfrost kernel in seconds, sadly. :(
<alyssa> that sounds about right yes
<alyssa> annoying isn't it
<alyssa> oh, erm, this won't work either will it.
<alyssa> shader key looks like the least bad option here.
<jekstrand> Ooh, this is extra fun! It crashes inside a spinlock
<jekstrand> Boom! Kernel go dead.
<jekstrand> maybe?
<jekstrand> hrm... no. It's in sched_entity_destroy
rkanwal has quit [Read error: No route to host]
rkanwal has joined #panfrost
robmur01__ has joined #panfrost
icecream95 has joined #panfrost
robmur01_ has quit [Ping timeout: 480 seconds]
WoC has quit [Remote host closed the connection]
WoC has joined #panfrost