2024-01-15

<austriancoder> eiln: if you give me a hint what simple RE application you used on macOS to see how it interacts with ane I will look into it

2024-01-13

<austriancoder> eiln: I want to get the experiment working before I start looking into the userspace. From reading the ane docs and code it looks like it could fit the wip TensorFlow Lite delegate in mesa called teflon.
<eiln> I don't think there's any missing RE pieces to ane anymore. it's more of an implementation/userland integration issue

2024-01-12

<austriancoder> j`ey: thanks .. i cloned all the ane related repos from github

2023-12-15

<marcan> povik had a version bypassing AOP, but I'm really not comfortable with that after where we ended up with ANE

2023-11-19

<eiln> When ANE support happens, the first feature will be the ability to run any pre-compiled CoreML model imported from macOS. This is easy for us because all we need to do is do the kernel-side submission dance e.g. IOMMU-mapping. CoreML support means we're forced to write a bespoke layer for Linux, but there's no other way here.
<eiln> very difficult to expose ANE as a generic compute node.
<eiln> I'll admit my mind has been in video world and it's been some time since all this, but ANE is a FP16 MAC unit, it descended from the DSP block of the camera; it's basically ISP got out of hand. It can only do one thing -- and though it can do that one thing pretty efficiently -- it's in turn extremely inflexible and opinionated. As much as I'd like to, hardware limitations make it

2023-10-07

<eiln> I thought about using ane's chinook but it seems iboot locks down ctrr since 13.5

2023-09-28

<eiln> MLVNR just won't die. it's 248 (J413AP 720p) with SCL1, but 248 only has one camera config. either we bring back the platform_id hack or I RE and port ANE-ISP IPC (probably even more platform problems).

2023-09-26

<lina> ISP should then wait for ANE, if it can't find it on probe it needs to return EPROBE_DEFER
<lina> If you want to hook up ANE for real, what you do is set it up as a separate device and reference it with a custom phandle property from ISP, then you can get a reference to the device/drvdata that way and just make up whatever API you want between the two modules

2023-09-25

<eiln> as a last resort I think I can kick up the ane for this sake. we're not calling face detection, we just need the client up. not sure how the hell that's gonna work in v4l2 though
<lina> On the other hand initializing ANE has a nice trivial arbitrary u32 write... so if we ever need to mess with firmware memory and can't do it directly... that's one way... ^^;;
<lina> eiln: Fake initializing ANE doesn't work. I got it to get through the buffer allocs, but then it actually tries to initialize the ML model and times out on ANE not actually replying...

2023-09-19

<lina> j`ey: I mean, I hope one day I can use eiln's ANE stuff to run motion capture instead of an iPhone ^^

2023-09-14

<ydalton> eiln: pretty sure the ane driver will work on m2 out of the box; i dumped my adt and it lists the compatible string as being "ane,t8020", which is the same one as in your linux driver. haven't tested it yet though...

2023-09-12

<eiln> iiuc unk4 is platform_type (t8103: 0x1, t6000: 0x3) and pad_40[1]/pad_40[3]/pad_7c[3] are required? pad_7c[3]=0x3b54000 looks like the ANE-IPC endpoint (I hate that I know this by heart), can you try removing that? pad_40[1]=0x3128000/pad_40[3]=0x48000 look like u64 iova/size for some other heap, not sure what though. we don't allocate any differently

2023-09-10

<marcan> if code can be shared here, it might make sense to have some kind of libispfw or something, especially since both of these drivers are in media/ (ane OTOH, not sure, depends on how different it ends up being)
<eiln> it's fine, I figured out a hack during the ANE days

2023-09-08

<marcan> and yeah, for ane you're allowed to use the drm_mm stuff if necessary since that's in accel/ which is morally drm, so if it makes sense there I wouldn't worry about it, but for ISP it'd be nice (and probably simplify things quite a bit) to switch back to the regular iova allocator

2023-09-07

<eiln> quickly inverting the iovas in m1n1 does not break it. frankly, it was pasted from the ane code, and also the familiar iovas make debugging a lot easier. I will check on the bottom-up allocator though
<eiln> as for the ANE, DMA to/from the upper iovas (I think it was piodma_size <= iova <= vm_size) is noticeably slower, at least 10x, and their kext actually does tricks to avoid using this range. it seems loading/unloading an entire transformers model is cheaper
<eiln> also I can finally boot the ANE firmware with my cursed ISP knowledge. not sure what to do with this
<eiln> ISP is not the only one with a static non-iboot carveout. see ANE and AVE ;)
<eiln> marcan: following ANE I think "revision" means board type. it makes sense at that point in probe too. it's not in adt but there might be a register.

2023-08-21

<eiln> I recall reading something similar (for the ane) on google zero day, except this is much stupider

2023-07-20

<eiln> cam output is the same with or without ane. i'm pretty sure it just speeds up face detection.
<eiln> marcan: also the ane-isp ipc can be disabled. i trace with the ane firmware range zeroed out.

2023-07-15

<eiln> ChaosPrincess: i recall the ane dapf writes were slightly different so i played safe. this is much better, thank you

2023-07-14

<eiln> in w_tun_d() the ttbr of the first dart domain gets copied to the 2nd & 3rd domains. this is unique to the isp & ane. dma doesn't work w/o it
<eiln> i was poking at the isp yesterday. it's a beast but similar to the ane. my priority's the ane, but the isp seems stale, so i'll look at it in my downtime.

2023-07-10

<ydalton> hi, question, i heard eiln's ane driver bypasses rtkit, but does it still communicate with the coprocessor (i assume there's one for ane)? or does it bypass even the coprocessor altogether or something

2023-07-02

<dimilar> `lsmod | grep ane`, I got `ane 65536 0`
<dimilar> git remote add eiln https://github.com/eiln/linux && git fetch eiln ane-accel && git merge FETCH_HEAD, and I am on branch asahi-wip
<dimilar> Has anyone tried eiln's ANE driver? Today I compiled a kernel and successfully loaded the driver.

2023-05-05

<eiln> merged & commented out the pd patch in ane-t6000 branch https://github.com/eiln/m1n1.git

2023-05-04

<eiln> take 2, lmk if experiments/ane.py under ane-t6000 branch works (either ane0/ane2)
<eiln> the updated experiments/ane_t600x_power.py in the ane-power branch (https://github.com/eiln/m1n1.git) tries to manually powergate ane_sys_cpu before turning it on

2023-04-30

<eiln_> shot in the dark but does anyone with a t6000 mind running experiments/ane.py on ane-t6000 branch https://github.com/eiln/m1n1/tree/ane-t6000
<eiln_> t6000 has ane_sys_cpu elsewhere, and there's 6 too, so it goes 0xc000 - 0xc028
<eiln_> ane_sys_cpu is 0xc000 for t8103, and there's 6 more after to make it 0xc000 - 0xc030
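The two ranges quoted above are consistent with registers spaced 0x8 apart: seven entries from 0xc000 end at 0xc030 (t8103, ane_sys_cpu plus six more) and six entries end at 0xc028 (t6000). A minimal sketch of that arithmetic, assuming the 0x8 stride; the function name `reg_offsets` and the constant `STRIDE` are hypothetical and not taken from m1n1:

```python
# Assumption: each power-state register sits 0x8 bytes after the previous one.
STRIDE = 0x8


def reg_offsets(base, count, stride=STRIDE):
    """Return the offsets of `count` consecutive registers starting at `base`."""
    return [base + i * stride for i in range(count)]


# t8103: ane_sys_cpu at 0xc000 plus six more entries -> seven in total.
t8103 = reg_offsets(0xC000, 7)
# t6000: six entries starting at 0xc000 (ane_sys_cpu itself lives elsewhere).
t6000 = reg_offsets(0xC000, 6)

print([hex(o) for o in t8103])
print([hex(o) for o in t6000])
```

The last element of each list should match the upper bound given in the chat for that SoC.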
<jannau> also modified '"ane%d" % i*2' to '"ane%d" % (i*2)'. it looks like '%' takes precedence over '*'
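The effect jannau describes can be reproduced in a few lines. Strictly speaking, Python gives `%` and `*` the same precedence and evaluates them left to right, so the string is formatted first and the result is then repeated. The loop index value below is only an illustration:

```python
i = 1

# Parsed as ("ane%d" % i) * 2: format first, then repeat the string twice.
unparenthesized = "ane%d" % i * 2

# Parentheses force the multiplication to happen before formatting.
parenthesized = "ane%d" % (i * 2)

print(unparenthesized)  # ane1ane1
print(parenthesized)    # ane2
```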

2023-04-29

<eiln_> ig stacking two ane's didn't go as planned
<eiln_> interestingly t8103 ane & t6000 ane compile the same microseq
<marcan> ane is actually quite power hungry, it's a major component in the CLPC stuff and has its own power rail, so ane1 is probably not even wired to power on any machine
<marcan> so the double ane for max models was a planned feature that got cut and removed from the next iteration
<eiln> started the HW:ANE page

2023-04-26

<ChaosPrincess> there is some weird dart with 4 io areas used for ave, ane and isp

2023-04-12

<marcan> I think the only case where we've had a plausible "bypassing firmware is OK" story so far is ANE, though I still have my doubts about that one

2023-04-09

<marcan> even linux thinks it's ane_sys, wat
<marcan> [ 7.798483] apple-pmgr-pwrstate 28e080000.power-management:power-controller@260: PS ane_sys: pwrstate = 0xf: 0xf00000f
<marcan> [ 7.802556] apple-pmgr-pwrstate 28e080000.power-management:power-controller@260: PS ane_sys: pwrstate = 0x0: 0xf0000f0
<marcan> ok, linux is poking the ANE_SYS power domain and this makes *no* sense. either I screwed up something major (but I can't find it) or there is something very wrong in linux genpd
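When chasing this kind of power-domain toggling, it can help to pull the fields out of the kernel log mechanically. The sketch below parses the two `apple-pmgr-pwrstate` lines quoted above. The field names (`domain`, `state`, `reg`) are guesses at what the three values mean and are not taken from the driver source; the helper `parse_pwrstate` is hypothetical.

```python
import re

# The two log lines quoted in the chat, copied verbatim.
LOG_LINES = [
    "[ 7.798483] apple-pmgr-pwrstate 28e080000.power-management:power-controller@260: PS ane_sys: pwrstate = 0xf: 0xf00000f",
    "[ 7.802556] apple-pmgr-pwrstate 28e080000.power-management:power-controller@260: PS ane_sys: pwrstate = 0x0: 0xf0000f0",
]

# Assumption: the first hex value is the requested state and the second is
# the raw register value; the log format itself does not say so.
PATTERN = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+apple-pmgr-pwrstate\s+\S+:\s+"
    r"PS (?P<domain>\w+): pwrstate = (?P<state>0x[0-9a-f]+): (?P<reg>0x[0-9a-f]+)"
)


def parse_pwrstate(line):
    """Return (time, domain, state, reg) for a pwrstate log line, else None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return (
        float(m.group("time")),
        m.group("domain"),
        int(m.group("state"), 16),
        int(m.group("reg"), 16),
    )


events = [parse_pwrstate(line) for line in LOG_LINES]
for t, domain, state, reg in events:
    print(f"{t:.6f} {domain}: state={state:#x} reg={reg:#x}")
```

Filtering the parsed events by `domain` makes it easy to see which power domains change state during boot and in what order.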

2023-04-02

<marcan> ANE is very much a "side" thing right now so I'm comfortable leaving things up to you, I doubt it's going to become a critical piece of the ecosystem overnight (which is a good place to be when developing since you don't have to care about users as much :p)
<eiln> marcan: was wondering how the ane stuff should proceed?

2023-03-31

<eiln> there's no evidence multi-ane systems can "load balance" a single job (model)

2023-03-24

<eiln> coreml would agree with that; there's a size threshold to get compiled on the ane
<eiln> marcan: im gonna dump the ane device tree mess over at #asahi-re

2023-03-14

<eiln> marcan: created a pull in m1n1 with the ane tunables

2023-03-04

<eiln> the ane kext is still named h11 lol

2023-01-30

<eiln> updated t8103 dt bindings @ https://github.com/eiln/linux.git on "ane" branch

2023-01-27

<jannau> I haven't found a devicetree patch to add the ane node in that repo. do you have it somewhere else?
<jannau> eiln: RE ane power-domains. does it use more than 1 power-domain directly? if so it needs special handling in the driver. see apple_nvme_attach_genpd in drivers/nvme/host/apple.c for an example

2023-01-17

<nicolas17> ANE?

2023-01-16

<eiln> no ANE!
<j`ey> no thats the ANE

2023-01-14

<chadmed> makes sense re efficiency of the ANE. it's mostly used in the iphone for background image processing for things like text search indexing and post processing, neither of which need to be blisteringly fast
<marcan> dammit and here I thought ANE was going to be on my back burner forever, it's what i've been telling people :p
<eiln> ANE fully
<nicolas17> GPU, ANE?

2023-01-01

<eiln> "tethered" 1+2 on the ane https://bashify.io/images/n1Xz3i

2022-12-26

<eiln> so ane's DMA channels expect a continuous iova block

2022-12-11

<eiln> specifically ANE_SYS enables only a small subset of mmio
<eiln> ANE_SYS_V is actually a stream of sub-power domains @ 0x8 offsets that turns on the whole engine
<eiln> hence the adt parser doesn't detect it properly and ends up calling ANE_SYS only
<eiln> but every other field in ANE_SYS_V is zeroed out
<eiln> ANE_SYS_V lists ANE_SYS as its parent
<eiln> in adt ane's power-gates/clock-gates field indexes 302 which is ANE_SYS_V

2022-11-30

<amarioguy2> let's not forget a new challenger, ANE :)

2022-11-29

<eiln> ane power tunables are constant & retain state so I chucked them under tunables_static.c

2022-11-28

<alyssa> I do want to understand the complexity of the ANE instruction set
<alyssa> ("ANE done wrong" would be the downstream vendor approach to get some cool pics for Twitter/Mastodon and never end up getting shipped...)
<alyssa> as the closest prior art for "ANE done right"
<alyssa> which is more sensible if you're going to have an ANE at all
<alyssa> I think I'd like to understand the capabilities of the ANE more before I could give any advice
<marcan> wasn't there a new subsystem in linux for ane type stuff?
<povik> running opencl on ane when
<ChaosPrincess> isnt ane mostly for inference, not training?
<eiln> feel pretty lost on what direction to head with the ane

2022-11-27

<eiln> realistically what processes could leverage the ane

2022-11-23

<dsharshakov> chadmed: if it could be wired up to TensorFlow PluggableDevice API, then tf apps will be able to make use of ANE even in case supported op count is low. Unified memory gives us possibilities to not care about moving data between ANE and CPU being slower than benefits from initial ops implementation
<dsharshakov> eiln: what're your plans on UAPI to expose ANE to apps? How suitable does accel from 6.2 feel?
<eiln> oh sorry -- forgot to re-introduce myself. my nickname before was "lamlam", i.e. ane guy. my real name is eileen (handles use eiln).

2022-11-20

<lamlam> was about to commit ane patch rn and github soft-banned me?? "Your account has been flagged. Because of that, your profile is hidden from the public"
<lamlam> THE ANE WORKS

2022-11-17

<lamlam> i just did a convolution on the ane

2022-11-04

<lamlam> not sure which is the culprit but am seeing some *really* detailed isp logs by ane (log show --last 5m --debug --info)... might help
<lamlam> also i boot guest w/ these ane-specific args added:
<lamlam> would not be surprised if it's the same team behind it. there's an entire apfs file-sharing protocol between the two. first ane wake is actually called by isp to init apfs lol.

2022-11-03

<lamlam> i finally booted the ane

2022-11-01

<j`ey> oh ANE, jamie/R were looking at AVE
<lamlam> i just kicked read dma in the ane for the first time

2022-09-08

<M4t64[m]> may be related, but ane is referred to as sne internally

2022-08-22

<qcyrdqcpzg[m]> is the apple cli tool "powermetrics" reliable? specifically the ANE metric?

2022-04-20

<millenialhacker> but it has some interesting stuff related to I2C/GPIO and ANE. I feel I was going in the wrong direction all this time. but at least that's some progress, isn't it?

2022-01-20

<marcan> krbtgt: someone actually wanted me to mentor their master's thesis on reversing ANE... but unfortunately I don't trust myself with that level of commitment :(

2021-11-10

<squags_> I've seen that geohot look at stuff yeah, I'm taking a look at some of the ANE Mach-O files
<marcan> you're familiar with the existing userspace side ANE reversing?
<squags_> yeah my vague plan atm is to get a minimal ANE example going in MacOS, dump traces, then try getting it to run in m1n1/similar so I can fuzz out specifics
<squags_> might take a stab at some ANE RE bc it at least looks interesting compared to neural accelerators I've worked with before

2021-10-21

<alyssa> also, do you count AGX? what about the ANE?

2021-10-20

<chadmed> the ANE, that is
<arnd> alyssa: on a related note, do you think it would be possible to have an OpenCL based BLAS implementation on top of your gallium driver, and use that for machine learning on the GPU instead of the ANE?
<TheLink> emulate the cuda api with ane :B
<alyssa> I am genuinely unsure why ANE would matter on Linux outside of some niche spaces
<arnd> what is ANE?
<marcan> hm, since someone mentioned ANE again... I'm wondering *how* that would even be implemented?