ChanServ changed the topic of #panfrost to: Panfrost - FLOSS Mali Midgard & Bifrost - Logs https://oftc.irclog.whitequark.org/panfrost - <macc24> i have been here before it was popular
<macc24>
icecream95: do you have chromeos on your duet?
stano has quit [Read error: Connection reset by peer]
stano has joined #panfrost
<icecream95>
macc24: yes...
Putti has joined #panfrost
rasterman has joined #panfrost
enunes has joined #panfrost
<bbrezillon>
tomeu: can you try to add assert()s in pan_texture.c and pan_cs.c to make sure we're not passed X32_S8X24 or Z32_S8X24 formats
enunes has quit [Read error: Connection reset by peer]
enunes has joined #panfrost
<bbrezillon>
those should be converted to dual-plane formats (Z32 + S8X24) by the gallium driver
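A minimal sketch of the kind of assert() bbrezillon is asking for, assuming the check sits at the top of a common-code entry point. The enum, its values, and both function names are hypothetical stand-ins, not the identifiers used in pan_texture.c or pan_cs.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the formats named above; the real enum in
 * Mesa is much larger and uses different identifiers. */
enum sketch_format {
    SKETCH_FORMAT_Z32,
    SKETCH_FORMAT_S8X24,
    SKETCH_FORMAT_Z32_S8X24,
    SKETCH_FORMAT_X32_S8X24,
};

/* True for the interleaved depth/stencil formats that the gallium driver
 * is expected to have split into two planes before calling common code. */
static bool
sketch_is_interleaved_zs(enum sketch_format fmt)
{
    return fmt == SKETCH_FORMAT_Z32_S8X24 ||
           fmt == SKETCH_FORMAT_X32_S8X24;
}

/* Hypothetical common-code entry point. The assert documents the contract
 * and fails loudly in debug builds if a caller forgot the conversion. */
static int
sketch_emit_texture(enum sketch_format fmt)
{
    assert(!sketch_is_interleaved_zs(fmt));
    return (int)fmt; /* placeholder for the real descriptor emission */
}
```

Keeping the predicate separate from the assert lets the same check be reused in both files without duplicating the format list.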
<robmur01>
macc24: FWIW, compile-time pipeline scheduling is a fundamental of VLIW architectures which used to be popular for CPUs/DSPs, and was also one of the reasons for IA-64 being so successful...
wwilly has quit []
<urja>
... successful.
<robmur01>
the extent of its success, yes ;)
wwilly has joined #panfrost
camus has joined #panfrost
camus has quit [Ping timeout: 480 seconds]
<macc24>
robmur01: i also vaguely recall some nvidia cpu(denver?) being vliw
<robmur01>
Quite possibly - I know Transmeta Crusoe was, and Denver's the same kind of deal
ids1024 has quit [Ping timeout: 480 seconds]
ids1024 has joined #panfrost
camus has joined #panfrost
davidlt has joined #panfrost
camus has quit [Ping timeout: 480 seconds]
camus has joined #panfrost
Net147 has quit [Quit: Quit]
Net147 has joined #panfrost
camus has quit [Ping timeout: 480 seconds]
robmur01 has quit [Read error: Connection reset by peer]
robmur01_ has joined #panfrost
robmur01_ is now known as robmur01
nlhowell has joined #panfrost
indy has joined #panfrost
ezequielg has quit []
ezequielg has joined #panfrost
camus has joined #panfrost
nlhowell has quit [Ping timeout: 480 seconds]
ezequielg has quit []
ezequielg has joined #panfrost
nlhowell has joined #panfrost
<HdkR>
macc24: robmur01: Denver/Carmel is VLIW yes
<macc24>
HdkR: is there linux port for that without any hardware translation?
<HdkR>
You can't
<macc24>
why
<HdkR>
It's in violation of ARM's license to expose another ISA through the ARM path, so you just never get it exposed.
<macc24>
D:
<HdkR>
Also as itanium proved, you don't want VLIW in your CPU
<macc24>
why?
<HdkR>
Hand written VLIW is a major pita
<alyssa>
HdkR: Unless you're Apple
<macc24>
still, why vliw is bad
<HdkR>
VLIW good. Just bad when you have any form of developer interaction with it :P
<macc24>
i mean, there's a reason why everyone moves away from it lol
<macc24>
terascale amd gpus were vliw iirc
<robclark>
For general purpose CPUs, things are unpredictable enough that you don't really want vliw, you want more traditional superscalar and OoO.. the compiler doesn't have enough knowledge to dtrt with vliw
<robclark>
for more niche cases where the compiler *can* know enough, vliw is nicer
<HdkR>
You can also end up in a situation where, even if your VLIW ISA is backwards compatible, your hardware changes the scheduling model and makes your previously written code wildly inefficient or broken :D
<macc24>
wtf
<alyssa>
how about multi-issue in-order?
<HdkR>
alyssa: Whatcha mean? Issuing multiple VLIW packets in order?
<macc24>
alyssa: *whispers* midgard?
<alyssa>
macc24: AGX, we think
<alyssa>
HdkR: Encoding and assembly "looks" like a purely in-order scalar processor
<macc24>
ah, agxgard
<alyssa>
but if you have the same 'type' of instruction back to back, they can be executed faster
<HdkR>
ooo, cute. Doing the nvidia thing
<alyssa>
but the hw won't search out of order for instructions to execute
<robclark>
in-order makes sense for GPUs where you care more about "wide" than "in-order-fast"
<alyssa>
so the compiler /should/ reorder for optimal perf, but does not /have/ to for correctness
<alyssa>
(and in particular, if you can't fill slots you don't waste i-cache space padding with nops)
<HdkR>
Need those out of order GPUs to make compiler devs' jobs easier :P
<robclark>
sure, just get one of those riscv gpu things.. or did they switch to power?
<HdkR>
I think they are still planning riscv because of how easy it is to add instructions
wwilly has quit [Remote host closed the connection]
atler is now known as Guest2719
atler has joined #panfrost
Guest2719 has quit [Ping timeout: 480 seconds]
wwilly has joined #panfrost
<bbrezillon>
alyssa: I got rid of all _packed objects in panvk's non per-gen files, but it looks like we have the same issue in the gallium driver
<bbrezillon>
(pan_context.h)
<alyssa>
uhh
<alyssa>
i thought i fixed that
<alyssa>
uhm. guess not. I'll deal with it
<alyssa>
i assume you need review on the core bits in the mean time..?
<bbrezillon>
I'd really like to go through one full conversion before posting the MR
<alyssa>
Alright
<alyssa>
In that case -
<alyssa>
rasterizer, zsa, sampler -- all move to pan_cmdstream.c
<bbrezillon>
ack
<alyssa>
panfrost_sampler_view_destroy -- move to pan_cmdstream.c, and then sampler_view itself goes to pan_cmdstream.c
<alyssa>
you'll need forward decls
<alyssa>
panfrost_shader_state is the only tricky one, but given the RSD is partial anyway, probably easiest to just u32[] and STATIC_ASSERT and call it a day
<alyssa>
(or just leave it as is -- RSD is the same size from t604 through g76... will need to change for valhall but valhall needs more invasive changes anyway)
<alyssa>
that should be it for _packed
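The "just u32[] and STATIC_ASSERT" idea for panfrost_shader_state might look roughly like the sketch below. Everything here is an assumption for illustration: the struct names, the helper, and the 16-word size are placeholders, and C11 `_Static_assert` stands in for Mesa's STATIC_ASSERT macro.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder word count, not the real RSD size. */
#define SKETCH_RSD_WORDS 16

/* Generic (non per-gen) header: stores the partial RSD as opaque words,
 * so it does not need to see any generated _packed type. */
struct sketch_shader_state {
    uint32_t partial_rsd[SKETCH_RSD_WORDS];
};

/* Per-gen file: hypothetical stand-in for the generated packed type. */
struct sketch_rsd_packed {
    uint32_t opaque[16];
};

/* Compile-time check that the opaque storage matches the packed
 * descriptor, so a size change on a new generation breaks the build
 * instead of silently corrupting memory. */
_Static_assert(sizeof(struct sketch_rsd_packed) ==
               sizeof(((struct sketch_shader_state *)0)->partial_rsd),
               "generic RSD storage must match the packed descriptor size");

/* Per-gen code copies the packed words into the generic storage. */
static void
sketch_store_rsd(struct sketch_shader_state *state,
                 const struct sketch_rsd_packed *packed)
{
    memcpy(state->partial_rsd, packed, sizeof(state->partial_rsd));
}
```

The assert lives on the per-gen side because that is the only place where both the opaque array and the packed type are visible.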
<alyssa>
ack? π
<macc24>
alyssa: uh do you know if anyone used panfrost under android?
<alyssa>
macc24: globallogic has done so (with glodroid), haven't built it myself though
<bbrezillon>
alyssa: should be good
<bbrezillon>
thx
<alyssa>
π
<macc24>
alyssa: may or may not be doing similar thing
<macc24>
but with chromebooks
<alyssa>
macc24: Collabora will likely do a demo at some point but linux is more interesting π
<macc24>
alyssa: what demo
<alyssa>
as for chromebooks, er
<macc24>
ah
<alyssa>
tomeu: I think you had a chromeos panfrost build at some point? maybe?
<macc24>
alyssa: don't even think about demoing the dumpster fire i call cadmium xD
<alyssa>
macc24: I meant an android panfrost demo at some point
<macc24>
ah cool
<alyssa>
i'm scared of android but you know
<alyssa>
someone's gotta do it
<macc24>
alyssa: well... you should be scared xD
<macc24>
android games probably are doing stupid gles stuff xD
<alyssa>
Not worried about that, just about compiling AOSP on a chromebook ;p
<robclark>
alyssa: speaking of android, I found a way to get better fps/gfx on a bunch of android games ...
<robclark>
(a bunch of games are putting GPUs in performance brackets and when they see a GPU they don't recognize they put it in lowest bracket, artificially limiting framerate and gfx features)
<alyssa>
garbage :p
<robclark>
I'm working on a driconf so we can override GL_VENDOR/GL_RENDERER on a per-game basis
<robclark>
yes, it is
<macc24>
robclark: thanks i hate it
<robclark>
(I won't use mali, I'll use qc values, ofc.. that was just an experiment when I was comparing some game to krane)
<HdkR>
When Tegra was on the market, there were a BUNCH of games that check for tegra and only enable some effects in that case :(
<urja>
ha... i once hacked the GLESv2 driver on the jolla to report a different (less weird? don't remember if it was slightly up or down ...) Adreno for GTA Vice city to work ... had some bug workaround tied to just a couple specific adrenos ...
<robclark>
The funny thing, at least one game was failing to differentiate G72MP3 from G72MP12 and putting krane in a performance bracket where it totally didn't belong
<HdkR>
oops
<HdkR>
That bit of software not understanding how Mali GPUs scale :D
<robclark>
yeah
<HdkR>
G78MP24 totally the same perf bracket as G78MP7...
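A toy sketch of the bracketing mistake being described, assuming a game that keys only on the model number and ignores the core count. The name format ("G72MP3"), the bracket names, and all thresholds are hypothetical; real games and real GL_RENDERER strings will differ.

```c
#include <assert.h>
#include <stdio.h>

enum sketch_bracket { SKETCH_LOW, SKETCH_MID, SKETCH_HIGH };

/* Parse a name of the form "G<model>MP<cores>", e.g. "G72MP3".
 * Returns nonzero on success. */
static int
sketch_parse_mali(const char *name, int *model, int *cores)
{
    return sscanf(name, "G%dMP%d", model, cores) == 2;
}

/* Naive policy: looks only at the model number, so G72MP3 and G72MP12
 * land in the same bracket. Unrecognized GPUs get the lowest bracket. */
static enum sketch_bracket
sketch_bracket_naive(const char *name)
{
    int model, cores;
    if (!sketch_parse_mali(name, &model, &cores))
        return SKETCH_LOW;
    return model >= 72 ? SKETCH_HIGH : SKETCH_MID;
}

/* Core-aware policy: the same design scales with the number of shader
 * cores, so the core count decides the bracket (thresholds made up). */
static enum sketch_bracket
sketch_bracket_core_aware(const char *name)
{
    int model, cores;
    if (!sketch_parse_mali(name, &model, &cores))
        return SKETCH_LOW;
    if (cores >= 8)
        return SKETCH_HIGH;
    if (cores >= 4)
        return SKETCH_MID;
    return SKETCH_LOW;
}
```

With these made-up thresholds the naive policy rates "G72MP3" and "G72MP12" identically, while the core-aware one separates them.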
<alyssa>
robclark: what i don't understand is why it doesn't put all malis in the lowest perf bracket π
<macc24>
alyssa: mali g72 mp2137 would be pretty fast
<alyssa>
macc24: no, it would be tiler bottlenecked
<macc24>
alyssa: but not slow
<alyssa>
yes, slow
<alyssa>
for higher geometry counts
<robclark>
but it would be great at fullscreen quads :-P
<alyssa>
that.. yes, it would be :-p
<alyssa>
so i guess unironically ray tracing would be good
<macc24>
robclark: or 2 fullscreen triangles
<HdkR>
Just like Nvidia can't easily scale a design down to G57 levels, mobile GPUs can't easily scale up to Titan levels :>
<alyssa>
though you'll quickly be mobile bw bound
<robclark>
alyssa: the shadertoy edition ;-)
<alyssa>
there we go
<alyssa>
so i've been writing python to parse some xml and generate c code
<alyssa>
i think i've hit the cliff where it's faster to just write the c code and skip the python and xml ;v
atler has quit [Ping timeout: 480 seconds]
atler has joined #panfrost
wwilly_ has joined #panfrost
wwilly has quit [Ping timeout: 480 seconds]
atler is now known as Guest2728
atler has joined #panfrost
Guest2728 has quit [Ping timeout: 480 seconds]
davidlt has quit [Ping timeout: 480 seconds]
<macc24>
alyssa: how much do you think vr would kill mali g72?
<alyssa>
yes.
<alyssa>
meh. g72 with the ddk should be ok.
<alyssa>
unless you're sensitive to motion sickness..
<macc24>
hmmmmmmmmmmmmmm
<alyssa>
I seem to recall doing google cardboard style AR with a mali t628 phone in 2015.
<alyssa>
then again. do not recommend :-p
<macc24>
i was doing cardboard vr with adreno 330 phone xD
<alyssa>
actually maybe this was a330 as well. dunno