ChanServ changed the topic of #cedrus to: Cedrus development channel (http://linux-sunxi.org/Cedrus) | *only registered users can talk* | Logs at https://oftc.irclog.whitequark.org/cedrus/
Daanct12 has joined #cedrus
Daanct12 has quit [Ping timeout: 480 seconds]
Daanct12 has joined #cedrus
Daanct12 has quit [Ping timeout: 480 seconds]
<ndufresne> linkmauve: for codec, you must set the OUTPUT (bitstream) format before doing VIDIOC_ENUM_FMT, {index=0, type=V4L2_BUF_TYPE_VIDEO_CAPTURE
<ndufresne> The enumeration of the capture format applies to the format being decoded
<ndufresne> right now, it will just get you a list of the default selected format, which can be anything
<ndufresne> linkmauve: second thing, before you call VIDIOC_G_FMT, {type=V4L2_BUF_TYPE_VIDEO_CAPTURE,
<ndufresne> You should set codec specific headers, otherwise the driver does not know if you need 8bit/10bit, 420/422, etc.
<ndufresne> it does default to 8bit and may work, but that isn't guarantied
<ndufresne> munmap(0xb65bc000, 3907527) -> You should keep you memory mapped, doign mmap/munamp every time will be a huge overhead
<ndufresne> linkmauve: final note, you forgot to queue de capture buffer, so that driver does not know it can use it
<linkmauve> So many mistakes… And thanks for the review, I’ll fix them!
<linkmauve> Re munmap/mmap, I still have to do it if the JPEG ever changes resolution right?
Danct12 has quit [Remote host closed the connection]
Danct12 has joined #cedrus
<linkmauve> Woohoo! [246808.281253] Unable to handle kernel NULL pointer dereference at virtual address 00000000
<linkmauve> My driver is now called!
<linkmauve> [246808.315619] PC is at v4l2_jpeg_parse_header+0x34/0x420
<linkmauve> I probably did something very wrong. :)
<linkmauve> Heh, vb2_plane_vaddr(src_buf, 0) is NULL. ^^'
<ndufresne> linkmauve: if resolution changes, you will streamoff and call reqbuf again, any mmap() needs to be munmap() to avoid leaking, and the new set of bufs needs querybuf/mmap again
<ndufresne> note that I'm not 100% certain of the behaviour when using the mmap offsets, I only use DMABuf (EXPBUFS) these days
<linkmauve> Using a dumb buffer containing the JPEG in my case?
<linkmauve> What would the benefit be here?
<linkmauve> I’ll want to test that usecase at some point, I already noticed USERPTR doesn’t work on this kernel, not sure why yet as I haven’t debugged that part.
<linkmauve> I think USERPTR might avoid one copy if the data happens to be contiguous and allocated in the lower 256 MiB of physical memory, compared to MMAP where it is always copied.
<jernej> linkmauve: you can forget on userptr, iommu is needed for that
<jernej> it should be possible to support it on H6, H616 and others, but not on old SoCs like A64
<linkmauve> Ok.
<jernej> ndufresne: setting codec specific headers doesn't apply here, because jpeg header is parsed in kernel and thus no controls are present
<ndufresne> oh, it stateful of course
<ndufresne> yeah, need to re-read with sateful hat on ;-D
<linkmauve> Speaking of which, I can’t figure out how to make the kernel “k”mmap() the buffer I provide so that vb2_plane_vaddr() returns non-NULL.
<ndufresne> linkmauve: its a tad more complex, after you queued and streamon(OUTPUT), you have to wait for the SRC_CH event before you allocate the capture buffer (Strictly speaking)
<linkmauve> Because I can’t know its size before that?
<jernej> linkmauve: I'm responsible for that one, just a sec
<ndufresne> correct, in stateful, we set a control the tells the driver the info, in stateful, the driver needs to process the header first to get all the requirement info to allocate
<ndufresne> * in stateless ... said sateful twice here
<linkmauve> So far I’ve written my userland assuming ST12 will be used, but of course linear NV12 will also happen on newer SoCs (I’ll test on my A64 phone after it works on this A20 sbc only).
<jernej> remove above line
<linkmauve> jernej, heh. :)
<ndufresne> normally JPEG decoder are special
<jernej> it's memory optimization for 32-bit SoCs, because they have very limited virtual address space in kernel
<ndufresne> haha, lol, can't parse in kernel if you have no kernel mapping
<linkmauve> *taps head*
<jernej> but we'll probably need to relax that also due to H265 issue
<linkmauve> Which issue is that?
<jernej> I think capture buffers should be marked with that, since they are bigger in general
<linkmauve> Only for decoding right, not for encoding.
<jernej> sure
<linkmauve> I haven’t done any benchmarking yet, but I think I’d like to tackle JPEG encoding too once I’m done with decoding.
<jernej> h265 frame has header + padding + slice data
<jernej> naturally, you would expect that HW expects pointer at slice data
<jernej> but Cedrus expects pointer at the beginning of padding data
<linkmauve> :/
<jernej> so driver will have to take slice data pointer and read a byte back and calculate new offset
<jernej> this padding is at most 8 bits, so it's mostly annoyance
<linkmauve> Hmm, despite having commented out that line (the DMA_ATTR_NO_KERNEL_MAPPING one), I still get NULL when I try to access the buffer.
<linkmauve> [ 55.317637] Hello world! 00000000 3907527 with the first number being vb2_plane_vaddr(src_buf, 0).
<jernej> context for those numbers would be useful :)
<jernej> ah
<jernej> second?
<linkmauve> vaddr=00000000 len=3907527
<linkmauve> And by len I mean vb2_get_plane_payload(src_buf, 0).
<linkmauve> So bytesused in V4L2.
<linkmauve> I’ve reduced the stack buffer from 2048 bytes to 256 bytes btw.
<jernej> can you push your code somewhere?
<linkmauve> Sure.
<linkmauve> ndufresne, I’m not dead set on making a stateful decoder btw, would it make sense to do a stateless JPEG uAPI still?
<linkmauve> I think it would avoid this issue.
<linkmauve> As we then would receive the tables through ctrls instead, and could keep the buffer unmapped.
<jernej> which issue?
<jernej> ah, it doesn't matter much, because it will be needed anyway
<linkmauve> Of having to map the JPEG buffer, mostly.
<linkmauve> Oh, why?
<jernej> as I said, for H265
<linkmauve> Right.
<jernej> but I already tested that localy and it worked for me
<jernej> so not sure why it doesn't work for you
<linkmauve> Uh, I rebuilt the kernel and now it worked. oO
<linkmauve> Still stuck in poll(), but at least now it does things!
<jernej> so, is header parsed successfully?
<linkmauve> Yes.
<jernej> btw, doing return in setup function is not a good thing
<jernej> currently errors are not handled well, so try to fill as much data as possible
<linkmauve> Header parsed successfully, MPEG engine configured successfully, JPEG trigger sent successfully, but then no IRQ, and poll() really hangs in IOwait mode.
<jernej> ok, then you proably didn't configure everything correctly :)
<linkmauve> What do you mean by as much data as possible?
<linkmauve> Should I try to finish the setup, even though I know it won’t decode properly?
<jernej> yes
<linkmauve> Wouldn’t that risk doing bad things?
<jernej> cedrus can hang if bullshit data is given
<jernej> less bad things can happen if configuration is just slightly off than totally off :)
<linkmauve> I have a TODO about making setup return an error on failure, wouldn’t that be better for the long term?
<linkmauve> My cedrus really seems hung already. :(
<jernej> imo setup function should return bool (successfully configured or not) and caller should not call trigger if it failed and at the same time mark decoding job as finished and buffer as decoding error
<linkmauve> Yes, that would also be my expectation.
<jernej> do you have any reference implementation? at this point the easiest thing to do is to run reference implementation, dump registers and compare them with yours
<linkmauve> I have the cedrus poc, but that requires a /dev/ve and /dev/disp and I have no kernel for that I think.
<linkmauve> Perhaps I could dig a super old 3.4 from Allwinner.
<linkmauve> But I’m almost certain my userland wouldn’t work on it.
<linkmauve> Especially systemd, which requires somewhat recent kernels usually.
<jernej> by cedrus poc you mean kernel or userspace?
<linkmauve> Userspace, this one: https://github.com/jemk/cedrus
<linkmauve> /dev/cedar_dev sorry, not /dev/ve.
<jernej> this is BSP driver, so you can use any distro with old kernel
<jernej> I think some old Armbian images should still be somewhere
<gamiee> also, there are various ports of cedarx kernel module to upstream, so you can also use it
<linkmauve> Oh, that could be easier to use.
<jernej> well, you don't even need that
<gamiee> or you can just mmap /dev/mem the VE ofsfet
<jernej> just print values to stdout directly, buffer addresses will be different anyway
<linkmauve> Oh right.
<jernej> linkmauve: btw, jemk uses only VE_DEC_MPEG_TRIGGER_HW_JPEG_VLD for trigger
<jernej> which seems right
<linkmauve> Oh, doing DQBUF without poll() before it does something else!
<linkmauve> EINVAL saying [ 159.138599] videobuf2_common: [out-14556b43] vb2_mmap: MMAP invalid, as it would overflow buffer length
<linkmauve> Perhaps it just worked?
<jernej> no, don't do that
<jernej> first try with trigger value suggested above
<linkmauve> That’s what I’m doing now, sorry I forgot to mention it.
<linkmauve> And poll() isn’t stuck in IO wait but just sleeping.
<linkmauve> So it seems that was the issue?
<jernej> you can add some printk's here https://elixir.bootlin.com/linux/latest/source/drivers/staging/media/sunxi/cedrus/cedrus_hw.c#L135 in order to check if decoding finished and if it is, if successfully or not
<linkmauve> It’s the request fd I must poll() right? Not the video fd?
<linkmauve> If I poll() the video fd, I do receive a POLLIN event.
<linkmauve> Ok. :)
<jernej> I forgot all userspace details already, but there is kernel documentation for that
<linkmauve> I followed the kernel documentation pretty closely while writing my userland, so it should be correct-ish probably maybe.
<linkmauve> [ 59.814660] cedrus_irq: 5 (error=6 done=5)
<linkmauve> Woohoo! \o/
<jernej> congrats! it's nice to see that, but don't be disappointed if actually resulting frame is all green :)
<linkmauve> Of course. ^^
<linkmauve> At least the hardware says it has decoded *something*. ^^
<jernej> usual progression is: 1. don't crash, 2. give least some colours, 3. make decoding mostly correct, 4. pixel perfect decoding
<linkmauve> I’m a bit worried about 1. on my phone, with both VP8 and H.264 the phone reboots at the end of the stream, unless it got a failure in the middle.
<linkmauve> But that’s another issue altogether.
<jernej> which documentation did you follow? for request api or ordinary stateful drivers?
<linkmauve> Oh, the request API.
<jernej> ok, then it should be fine
<jernej> we're using select() instead of poll()
<linkmauve> Ok, so my mmap() issue was that I forgot to include the 0x40000000 m.offset I got in QUERYBUFS, and never after.
<linkmauve> jernej, actually between 1. and 2., there is getting at least something decoded. ^^'
<linkmauve> Because it seems I implemented a wonderful… DMA copy engine. :°)
<jernej> I just spotted another issue
<jernej> you should also add VE_DEC_MPEG_TRIGGER_JPEG flag to trigger register
<linkmauve> Ah, that hung the whole ssh connection it seems.
<linkmauve> Perhaps a kernel panic.
<jernej> well, push all your kernel changes again, so I can be sure I'm not talking BS
<linkmauve> Wow, even the serial console hung!
<linkmauve> Here, pushed.
<linkmauve> The LEDs stay on, that’s the only indication the computer still exists somewhat.
<jernej> It seems that VE_DEC_MPEG_TRIGGER_MB_BOUNDARY is not needed, however, it resets JPEG DC component to default value (1024)
<jernej> this can be one of those things you can try last, when you're almost there
<jernej> it's entirely possible that Cedrus writes to parts of memory it shouldn0t
<linkmauve> Right.