Sunday, February 16, 2020

Open Source IMX219 Camera MIPI CSI-2 Receiver Verilog HDL Lattice FPGA MachXO3 Raspberry Pi Camera

This post is another part of the camera projects previously published on this blog. In the last post I described the Raspberry Pi camera board I made around the Sony IMX219 with a 4-lane MIPI CSI interface.

This post covers what is needed to get data out of a MIPI camera and then feed that data into a Cypress FX3 USB 3.0 controller.






Full Frame 8Mpixel Capture 



Lattice MachXO3 FPGA Cypress FX3 Interconnect PCB

The camera module connects directly to the FPGA board, and the FPGA board connects to the Cypress FX3 dev board through this interconnect. There is not much on this PCB: only a few regulators for VCCIO and a length-matched 32-bit 100 MHz bus for the FPGA to FX3 connection.

The bank I/O voltage for the LVDS MIPI HS receiver should be set to 2.5 V, and for LP it should be 1.2 V. The Lattice FPGA board does not have a 2.5 V bank regulator, so the 2.5 V regulator mounted on this interconnect board provides the bank I/O voltage.




A blank Lattice MachXO3 dev board to Cypress FX3 dev board interconnect PCB, without any components, is available for purchase.
Buy now: 9.99 EUR, free shipping worldwide.


If you use my 4-lane MIPI CSI-2 camera board version 1.1 1952, note that there is a bug in the interconnect PCB v1.1 2001: the camera I2C SDA and SCL lines are interchanged. When used with that specific camera PCB, the interconnect board must be patched. You can see two thin blue wires coming out of the header holes in one of the images of the back of the setup. The camera PCB is available on the camera page.




Camera Board


Details of the camera module are published in a previous post.




MIPI


What is MIPI? You can google it to find out, but basically it is an interface specification for connecting displays and camera sensors to an application processor.


The image below shows the block diagram of MIPI. On one side there is the application processor and on the other side the peripheral. When the peripheral is a camera, CSI applies. MIPI is a closed specification, which means one has to be a member of the MIPI consortium to gain access to the full specification, and membership comes with a big price tag for individuals. Luckily the full specifications are already available just a right-keyword web search away: DCS, CCS, DSI, CSI and DPHY can all be found with a few minutes of searching.



DPHY

The basic transmit-only block diagram looks like the image below. On the receiving end there is a differential receiver for high-speed data, two single-ended LP receivers and a CD receiver.

The control logic consists of lane management (if more than one lane is used), detecting when HS is activated and when it is deactivated, and other control logic. The block diagram only shows one lane. You will have a minimum of one clock lane and one data lane, though I have not seen any camera with fewer than two lanes.


MIPI DPHY Signal

The image, which I got from Google, shows the signal levels for MIPI. The HS signal is driven by a differential driver and swings from -200 mV to +200 mV at an offset of 200 mV, while the LP signal is 1.2 V LVCMOS.


There are two different modes of transmission, HS mode and LP mode. HS mode is for high-speed payload data, while LP mode is for low-power transmission.


The receiver must detect when the transmitter has gone into HS mode and when it has exited HS mode.
The image below shows how the transmitter enters HS mode.


Stage 0: The LP-11 state in the image is the idle LP state.
Stage 1: To get into HS mode, the driver drives LPdp low for Tlpx (minimum 50 ns) and stays in LP-01 (the HS driver is tristated in LP-01).
Stage 2: The driver drives LPdn low for Ths-prepare (minimum 95 ns) and stays in LP-00. Somewhere in the middle of this stage the target device activates its 100 Ω termination resistor.
Stage 3: Now the target is in HS mode. The driver activates its HS driver and starts sending the mandatory zeros.
Stage 4: The driver sends the mandatory 0xB8 sync byte and then the payload.
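The stages above can be sketched as a small state tracker. This is a Python illustration only, not the Verilog from the repo. The names `LaneState`, `track_hs_entry` and the `settle` parameter are made up for this example, and timing is counted in samples instead of nanoseconds.

```python
from enum import Enum


class LaneState(Enum):
    STOP = "LP-11"        # Stage 0: both lines high
    HS_REQUEST = "LP-01"  # Stage 1: Dp low, Dn high
    HS_PREPARE = "LP-00"  # Stage 2: both lines low, termination gets enabled
    HS_ACTIVE = "HS"      # Stage 3/4: HS receiver in use


def track_hs_entry(samples, settle=2):
    """Follow (dp, dn) LP receiver samples and return the state after each one.

    `settle` is a hypothetical number of LP-00 samples to wait before the
    HS receiver output is trusted.
    """
    state = LaneState.STOP
    lp00_count = 0
    trace = []
    for dp, dn in samples:
        if (dp, dn) == (1, 1):
            # LP-11 always brings the lane back to the stop state.
            state, lp00_count = LaneState.STOP, 0
        elif state is LaneState.STOP and (dp, dn) == (0, 1):
            state = LaneState.HS_REQUEST
        elif state in (LaneState.HS_REQUEST, LaneState.HS_PREPARE) and (dp, dn) == (0, 0):
            lp00_count += 1
            state = LaneState.HS_ACTIVE if lp00_count >= settle else LaneState.HS_PREPARE
        # During HS the LP receivers keep reading low, so HS_ACTIVE is held.
        trace.append(state)
    return trace
```

During HS transmission the signal levels stay below the LP input threshold, so the LP receivers keep reporting LP-00 until the transmitter returns to LP-11.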

CSI

CSI describes the packet structure: how exactly bytes are packed onto the wire in different lane configurations.
The image below shows the packet structure.

There are two types of packets:
Short Packet: 4 bytes (fixed length)
Long Packet: 6 - 65541 bytes (variable length)

MIPI Short Packet Structure

MIPI Long Packet Structure
Endianness

Each byte is sent least significant bit first, and multi-byte fields in the packet are sent least significant byte first.
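To make the packet layout and byte order concrete, here is a small Python sketch that splits a 4-byte packet header into its fields. The function names are made up for this example, and the ECC byte is only extracted, not checked.

```python
def parse_packet_header(header: bytes) -> dict:
    """Split a 4-byte CSI-2 packet header into its fields.

    Layout: data identifier, 16-bit word count (low byte first), ECC.
    The ECC byte is returned as-is; this sketch does not verify it.
    """
    if len(header) != 4:
        raise ValueError("a CSI-2 packet header is exactly 4 bytes")
    data_id, wc_low, wc_high, ecc = header
    return {
        "virtual_channel": data_id >> 6,   # upper 2 bits of the data identifier
        "data_type": data_id & 0x3F,       # lower 6 bits, 0x2B is RAW10
        "word_count": wc_low | (wc_high << 8),
        "ecc": ecc,
    }


def long_packet_length(word_count: int) -> int:
    """Header (4 bytes) + payload + 2-byte checksum footer."""
    return 4 + word_count + 2


def bits_on_wire(byte: int) -> list:
    """Bit order of one byte as it appears on a lane (LSB first)."""
    return [(byte >> i) & 1 for i in range(8)]
```

For example, one 1920 pixel RAW10 line carries 1920 * 10 / 8 = 2400 payload bytes, so its header bytes would be 0x2B, 0x60, 0x09 followed by the ECC byte. The largest word count, 65535, gives the 65541 byte maximum long packet.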


CCS

A very important difference between CCS and DCS: CCS describes the command interface as I2C, while with DCS the commands are sent over the same HS lines as the data itself.
So in the case of a camera, as per the MIPI spec, CCS is implemented over an extra I2C bus.

CSI Single Frame

A single frame from the camera is shown in the image below.

The camera sends a frame start packet.
Then it sends the embedded line information, which tells the receiver about the stream.
Then it sends the image data line by line.


Test Video






Scope Screenshots of Raspberry Pi talking to IMX219

The image below shows the overall MIPI data coming from the camera to the Raspberry Pi.
Streaming 1920x1080 @30FPS
Streaming 1920x1080 @30FPS Start of a frame
Streaming 1920x1080 @30FPS Embedded line info and first camera line
Streaming 1920x1080 @30FPS embedded line in start of frame
Lattice MachXO3 FPGA Decoded in 4 Lane mode.
CH3 DP, CH2 DN, CH1 CLKP. The bus decode of the first lane shows data type 0x2B (RAW10).


What makes this camera sensor different from camera modules



The IMX219 is a bare bone camera sensor. What I mean by bare bone is that there is not much image processing going on on the camera die itself. The camera sensor is a sensor array with a Bayer filter on it, a 10-bit ADC, a clock system, a MIPI output driver and an I2C-controllable system control block.


What does this mean for us as camera sensor implementers? My final goal is to interface this camera to USB 3.0 UVC with raw YUV. This camera does not output YUV. Forget about YUV, it will not even output RGB. The camera output is the absolute raw 10-bit ADC conversion result from the Bayer-filtered sensor array.

So to first get RGB output from the Bayer raw data, a debayer or demosaic needs to be performed. Once the demosaic is done we will have RGB ready to be converted to YUV. And once we have YUV it can be transmitted over USB to be displayed.

The next thing this camera does not have is any automatic control over exposure, because the camera has no intelligence to know how dark or bright the scene is. The solution the Raspberry Pi implements is to update the analog gain register over I2C regularly, on each frame, adjusting the gain according to how bright or dark the scene is.

This camera does not have any white balance control either, so the host must apply the correct white balance compensation to get correct colors out of the image.

FPGA module Block Diagram 


The FPGA block diagram is shown in the image below. This diagram describes how the overall system is implemented and what the key components are. What it does not describe is the control signals and other miscellaneous stuff.






Byte Aligner: Receives raw unaligned bits from the DDR RX module and outputs aligned bytes. Bytes on a MIPI lane do not have any defined byte boundary, so this module looks for the always-constant first byte 0xB8 on the wire. Once 0xB8 is found, the byte boundary offset is determined, output valid is set active and the module starts outputting correctly aligned bytes. It stays in reset while the data lanes are in the MIPI LP state.
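The idea behind the byte aligner can be sketched in a few lines of Python. This is an illustration of the technique on a list of bits, not the Verilog module. The function names are invented for the example, and it assumes the bit list starts inside the run of zeros that precedes the sync byte.

```python
SYNC_BYTE = 0xB8


def to_wire_bits(data: bytes) -> list:
    """Serialize bytes the way a lane does: least significant bit first."""
    return [(byte >> i) & 1 for byte in data for i in range(8)]


def bits_to_byte(bits) -> int:
    """Rebuild one byte from 8 bits received LSB first."""
    return sum(bit << i for i, bit in enumerate(bits))


def align_bytes(bits):
    """Find the first 0xB8 in an unaligned bit stream.

    Returns (bit_offset_of_sync, bytes_after_sync), or None when no sync
    byte is present. Assumes the stream begins with the leading zeros.
    """
    for offset in range(len(bits) - 7):
        if bits_to_byte(bits[offset:offset + 8]) == SYNC_BYTE:
            payload = bits[offset + 8:]
            usable = len(payload) - len(payload) % 8  # drop a trailing partial byte
            aligned = bytes(
                bits_to_byte(payload[i:i + 8]) for i in range(0, usable, 8)
            )
            return offset, aligned
    return None
```

Because the sync byte is preceded by zeros, no window that starts earlier than the real byte boundary can match 0xB8, so the first match gives the correct offset.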

Lane Aligner: Receives byte-aligned data for multiple lanes from the MIPI RX byte aligners @ mipi byte clock and outputs lane-aligned data. On a multi-lane MIPI bus, data on different lanes may appear at different offsets, so this module waits until all of the lanes have valid output and then starts outputting lane-aligned data, so that byte x from all the lanes is output at the same time.
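A rough Python model of the lane alignment step is shown below. It is a sketch under simple assumptions, not the Verilog module: each lane is a list with one entry per byte clock, `None` marks a cycle where that lane's byte aligner output is not valid, and valid bytes are contiguous once a lane has started. The name `align_lanes` is made up for this example.

```python
def align_lanes(lanes):
    """Line up per-lane byte streams that start on different clock cycles.

    lanes: list of lists, one entry per byte clock, None = not valid yet.
    Returns (cycle_when_all_lanes_are_valid, list_of_aligned_byte_tuples).
    Every lane is assumed to contain at least one valid byte.
    """
    # First valid cycle of each lane.
    starts = [
        next(i for i, value in enumerate(lane) if value is not None)
        for lane in lanes
    ]
    # Output can only begin once the slowest lane has started.
    first_output_cycle = max(starts)
    # Only emit as many words as the shortest lane can supply.
    depth = min(len(lane) - start for lane, start in zip(lanes, starts))
    aligned = []
    for k in range(depth):
        word = tuple(lane[start + k] for lane, start in zip(lanes, starts))
        if any(value is None for value in word):
            break  # one lane ran out of valid data
        aligned.append(word)
    return first_output_cycle, aligned
```

In hardware the same effect is achieved by delaying the early lanes by a few byte clocks until the last lane reports valid data.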

MIPI CSI Packet Decoder: Basically a packet stripper that removes the header and footer from a packet. It takes lane-aligned data from the lane aligner @ mipi byte clock and looks for a specific packet type, in this case RAW10 (0x2B). The module outputs the stripped bytes in exactly the order they were received. It also fetches the packet length, and output_valid is active as long as the input data is valid and the number of bytes received is still within the packet length.

MIPI CSI RAW10 Depacker: Receives 4-lane raw MIPI bytes from the packet decoder and rearranges the bytes to output 4 pixels of 10 bits each. The output is delayed by one clock cycle. Because of the way MIPI RAW10 is packed, output comes in groups of 5x40bit chunks, and output_valid_o remains active only while a 20 pixel chunk is being output.
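RAW10 packs every four pixels into five bytes: four bytes carry the upper 8 bits of each pixel and the fifth byte carries the four 2-bit remainders. A minimal Python sketch of the unpacking, working on a plain byte string instead of the 4-lane bus the FPGA module sees (the name `unpack_raw10` is made up for this example):

```python
def unpack_raw10(data: bytes) -> list:
    """Convert RAW10 packed bytes into a list of 10-bit pixel values.

    Every group of 5 bytes holds 4 pixels: bytes 0..3 are the 8 most
    significant bits of pixels 0..3, byte 4 holds the 2 least significant
    bits of each pixel (pixel 0 in bits 1:0, pixel 3 in bits 7:6).
    """
    if len(data) % 5 != 0:
        raise ValueError("RAW10 data length must be a multiple of 5 bytes")
    pixels = []
    for base in range(0, len(data), 5):
        low_bits = data[base + 4]
        for n in range(4):
            high = data[base + n] << 2
            low = (low_bits >> (2 * n)) & 0x3
            pixels.append(high | low)
    return pixels
```

This is also why a RAW10 line length in bytes is always width * 10 / 8.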

Debayer / demosaic: Takes 4x10bit pixels from the RAW10 depacker module @ mipi byte clock and outputs 4x24bit RGB, one RGB value for each pixel. The output is delayed by 2 lines. It implements a basic debayer filter. Debayering needs information from neighboring pixels, which may be on the next or previous line, so the input data is written into RAM. Only 4 lines are stored in RAM at one time and only three of them are readable at any given time, because the RAM that is currently being written cannot be read. Once there is enough data in RAM, 4 10-bit pixels are converted to 4x24bit RGB output. The first line is expected to be BGBG and the second line GRGR, basically BGGR format.
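As an illustration of what a basic debayer does, here is a small Python sketch using plain 3x3 neighbor averaging on a BGGR image. It is not the filter from the repo, only a simple stand-in: it works on a whole frame held in a list of lists, and it does no scaling from 10 bits to 8 bits. The function names are made up for this example.

```python
def bayer_color(row: int, col: int) -> str:
    """Colour of a BGGR sensor pixel: even rows B G B G, odd rows G R G R."""
    if row % 2 == 0:
        return "B" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "R"


def debayer_bggr(raw):
    """Return an image of (R, G, B) tuples from a BGGR raw image.

    A pixel keeps its own colour sample; the two missing colours are the
    integer average of the matching samples in its 3x3 neighbourhood.
    The image must be at least 2x2.
    """
    height, width = len(raw), len(raw[0])
    out = []
    for r in range(height):
        line = []
        for c in range(width):
            sums = {"R": 0, "G": 0, "B": 0}
            counts = {"R": 0, "G": 0, "B": 0}
            for rr in range(max(r - 1, 0), min(r + 2, height)):
                for cc in range(max(c - 1, 0), min(c + 2, width)):
                    color = bayer_color(rr, cc)
                    sums[color] += raw[rr][cc]
                    counts[color] += 1
            rgb = {k: sums[k] // counts[k] for k in sums}
            rgb[bayer_color(r, c)] = raw[r][c]  # own sample is used directly
            line.append((rgb["R"], rgb["G"], rgb["B"]))
        out.append(line)
    return out
```

Only the previous, current and next line are ever read for one output pixel, which is why a few line buffers in RAM are enough in the FPGA.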

RGB to YUV Color Space Converter: Receives 4 pixel 120bit RGB from the debayer filter and outputs 64bit 4 pixel YUV422. The calculation is based on the integer YUV formula from the YUV wiki page.
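For reference, the integer formula from the YUV wiki page looks like this in Python. This sketch assumes 8-bit R, G and B inputs, and the Y0 U Y1 V byte order plus the averaging of U and V over each pixel pair are assumptions for the example, not necessarily what the FPGA module does.

```python
def rgb_to_yuv(r: int, g: int, b: int) -> tuple:
    """Integer RGB to YUV conversion for 8-bit inputs (wiki formula)."""
    y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16
    u = ((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128
    v = ((112 * r - 94 * g - 18 * b + 128) >> 8) + 128
    return y, u, v


def pack_yuv422(pixels) -> bytes:
    """Pack pairs of RGB pixels into 4 bytes each: Y0, U, Y1, V.

    U and V are shared by the two pixels of a pair (here: their average).
    A trailing unpaired pixel is ignored.
    """
    out = bytearray()
    for i in range(0, len(pixels) - 1, 2):
        y0, u0, v0 = rgb_to_yuv(*pixels[i])
        y1, u1, v1 = rgb_to_yuv(*pixels[i + 1])
        out += bytes([y0, (u0 + u1) // 2, y1, (v0 + v1) // 2])
    return bytes(out)
```

Two pixels take 32 bits in this format, so four pixels fit the 64bit output word and the average cost is 2 bytes per pixel.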

Output reformatter: Takes 64bit 4 pixel YUV input from the rgb2yuv module @ mipi byte clock and outputs 32bit 2 pixel YUV @ output_clk_i. output_clk_i must be generated the same way as the mipi byte clock and must be exactly double the mipi byte clock. This implementation of the output reformatter produces data meant to be sent out of the system to a 32-bit receiver. Depending on requirements, it will need to be adapted to the receiver.

The Debayer / demosaic module needs a little more attention than the other modules. The IMX219 datasheet incorrectly mentions the output to be either GBRG or RGGB.

But after wasting lots of time, it turned out the camera outputs BGGR. The IMX219 only outputs BGGR, as defined by the IMX219 driver in the Linux repo with MEDIA_BUS_FMT_SBGGR10_1X10. The camera datasheet incorrectly defines the output as RGGB and GBRG, so the datasheet is wrong in this case.
To test my debayer I was using the built-in camera test patterns. One key thing about the IMX219 is that the Bayer filter type does affect the test pattern as well. It seems that in test pattern mode it outputs RGGB instead of BGGR, so the test pattern will have the R and B channels swapped when the real image has the right colors.

Update: I have discussed this issue with Raspberry Pi. It turned out that flipping the image seems to be the solution: once the image is flipped, the Bayer output is correct for both sensor data and the test pattern, because flipping the image does not affect the Bayer order of the test pattern.



RAW10 Packet Format


Streaming camera test patterns, compared with the camera datasheet





Test image 






Cypress FX3 Firmware



Firmware implementation with the FX3 was quite easy. I have put all the resolutions and frame rates in the USB descriptor. As described earlier, this type of camera sensor is quite bare bone: all it has is the sensor element, PLLs and ADC. So this camera sensor does not have any automatic control over exposure, white balance or even brightness. I have implemented manual control over the USB UVC control channel, so it is possible to completely control camera exposure and brightness.


A few things to keep in mind: the Cypress FX3 clock frequency needs to be set to 400 MHz mode to allow the full 100 MHz 32-bit GPIF DMA transfer.

One more thing: although the Cypress CYUSB3014 has 512 KB of RAM, only 224 KB plus an additional 32 KB is available for DMA buffers.

Having large buffer chunks is really important because on every DMA chunk CPU intervention is needed to insert the UVC header. As this is a high-performance application, the less often the CPU has to intervene the better. So I have set the DMA chunk / individual UVC packet size to 32 KB.
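To put rough numbers on the CPU load, the Python sketch below estimates how many DMA chunks, and therefore header insertions, are needed per second. The 12-byte header size and the assumption that each chunk is filled completely are illustrative, and the function name is made up for this example.

```python
import math


def dma_chunks_per_second(width, height, fps, chunk_bytes,
                          header_bytes=12, bytes_per_pixel=2):
    """Estimate how many DMA chunks per second a video mode produces.

    header_bytes=12 is an assumed UVC payload header size, and every
    chunk except the last one of a frame is assumed to be full.
    """
    frame_bytes = width * height * bytes_per_pixel  # YUV422: 2 bytes per pixel
    payload_per_chunk = chunk_bytes - header_bytes
    chunks_per_frame = math.ceil(frame_bytes / payload_per_chunk)
    return chunks_per_frame * fps


print(dma_chunks_per_second(1920, 1080, 60, 16 * 1024))  # 16 KB chunks
print(dma_chunks_per_second(1920, 1080, 60, 32 * 1024))  # 32 KB chunks
```

Under these assumptions, doubling the chunk size from 16 KB to 32 KB halves the number of CPU interventions for 1920x1080 at 60 FPS.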

In the scope captures below, Channel 13 shows the individual DMA packets and Channel 12 shows the CPU DMA-finish interrupt.

These two scope captures show the difference between 16 KB DMA and 32 KB DMA.

16KB DMA Size, CH13 DMA packet , CH12 CPU interrupt
32KB DMA Size, CH13 DMA packet , CH12 CPU interrupt

Tests 

Currently I have made the Cypress FX3 firmware support:
3280x2464 15FPS
1920x1080 60FPS
1920x1080 30FPS
1280x720 120FPS
1280x720 60FPS
1280x720 30FPS
640x480 200FPS
640x480 30FPS
640x128 682FPS
640x80 1000FPS

UVC is implemented to support on the fly switching to any listed frame rate and resolution.
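As a quick sanity check, the payload rate of each listed mode can be compared with what the 32-bit 100 MHz bus can carry. This Python sketch assumes YUV422 at 2 bytes per pixel and ignores UVC headers and blanking. The names are made up for this example.

```python
# (width, height, frames per second) for every mode listed above
MODES = [
    (3280, 2464, 15), (1920, 1080, 60), (1920, 1080, 30),
    (1280, 720, 120), (1280, 720, 60), (1280, 720, 30),
    (640, 480, 200), (640, 480, 30), (640, 128, 682), (640, 80, 1000),
]

# 32-bit bus at 100 MHz: 4 bytes per clock
BUS_BYTES_PER_SECOND = 4 * 100_000_000


def payload_rate(width, height, fps, bytes_per_pixel=2):
    """Video payload in bytes per second, assuming YUV422 output."""
    return width * height * fps * bytes_per_pixel


for width, height, fps in MODES:
    rate = payload_rate(width, height, fps)
    share = 100 * rate / BUS_BYTES_PER_SECOND
    print(f"{width}x{height} @ {fps} FPS: {rate / 1e6:.1f} MB/s ({share:.0f}% of bus)")
```

Under these assumptions the heaviest modes are 1920x1080 at 60 FPS and the full frame at 15 FPS, both around 245 MB/s, well below the 400 MB/s the bus could move if it were fully utilized.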

Video Quality Test Video






Full Frame 8Mpixel Capture @ 15FPS

1920x1080 @ 60 FPS

1280x720 @ 120FPS


640x480 200FPS 2x Binning

640x126 @ 682 FPS 2x binning

640x78 @ 1000 FPS 2x binning



Further Performance Optimization

The lowest hanging fruit for optimization is the output reformatter module in the FPGA. Right now the FPGA YUV output is running at the full 100 MHz (which is derived from the 200 MHz MIPI clock / 2), but the output is fragmented and does not take full advantage of the 100 MHz FX3 bus. If I put a FIFO in the reformatter, the MIPI clock and the FPGA output can be decoupled, which makes it possible to run the MIPI clock faster than 200 MHz. That ultimately makes it possible to get even faster frame rates: I have seen the frame rate at 640x80 go up to 1500 FPS when I ran the MIPI clock faster, from 270 to 320 MHz. The hardware in its current state may hinder reaching high MIPI frequencies, and a custom FPGA board with correctly terminated CSI lanes may be needed.
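The clock relationships can be worked through with a short Python sketch. It assumes 4 lanes, RAW10 input, 2 bytes per pixel YUV422 output and a DDR MIPI clock (two bits per lane per clock cycle), and it ignores blanking and packet overhead. The function name is made up for this example.

```python
def link_budget(mipi_clock_hz, lanes=4, bits_per_pixel=10, out_bytes_per_pixel=2):
    """Peak rates for a DDR MIPI link feeding a YUV422 output stage."""
    bit_rate_per_lane = 2 * mipi_clock_hz       # DDR: one bit on each clock edge
    byte_clock_hz = bit_rate_per_lane // 8      # one byte per lane per byte clock
    pixels_per_second = lanes * bit_rate_per_lane // bits_per_pixel
    return {
        "byte_clock_hz": byte_clock_hz,
        "output_clock_hz": 2 * byte_clock_hz,   # reformatter runs at 2x byte clock
        "peak_pixels_per_second": pixels_per_second,
        "peak_output_bytes_per_second": pixels_per_second * out_bytes_per_pixel,
    }


print(link_budget(200_000_000))
print(link_budget(320_000_000))
```

At 200 MHz this gives a 50 MHz byte clock and a 100 MHz output clock, matching the numbers above. At 320 MHz a directly derived output clock would be 160 MHz, more than the 100 MHz bus runs at, which is why the FIFO is needed to decouple the two clock domains.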

The second optimization would again be in the same reformatter, but it is a little more complicated: buffer a lot more data and utilize 100% of the 32-bit FX3 bus. In theory one could use some other USB controller and get all the bandwidth that USB 3.0 allows.

The next part of this project is going to be the next Raspberry Pi camera, the IMX477, to FPGA.
Source Files

The PCB and schematic sources are available in the GitHub repo:

https://github.com/circuitvalley/mipi_csi_receiver_FPGA

6 comments:

  1. Dear Gaurav, great great work!
    I have ordered the camera PCB. Is it possible also to order the
    "Lattice MachXO3 FPGA Cypress FX3 Interrconnect Inter Connect PCB"
    best Andi

    Replies
    1. Yes it possible to buy interconnect PCB as well
      I have posted paypal button.

      Regards

  2. Dear Gaurav,
    This is a really nice body of work. I will bookmark this for sure as you have worked through a lot necessary interfacing that any camera-to-FPGA experimenter will need. Thank You.
    Coop, aa1ww

  3. Very cool! Booked marked for future project!

  4. Very nice work Gaurav. I want to build a Sony IMX264 based camera that I can interface to RPI4B+. It looks like Lattice has the IP to convert from the 4ch LVDS to MIPI CSI2. Have you used their IP? Is it free? I'm just looking into this and wondered if you want to consult on this project.

    Replies
    1. There is no such direct IP available from lattice , as far as i know xilinx or intel also does not have it. you are free to contact me if you feel like.
