/videocoreiv

Tools and information for the Broadcom VideoCore IV (RaspberryPi)


Disclaimer:

This is an independent documentation project based on a combination of static analysis
and trial and error on real hardware.  This work is 100% independent from and not
sanctioned by or connected with Broadcom or its agents.

No Broadcom documents or materials were used beyond those publicly available
(see Referenced Materials).

This work was undertaken and the information provided for non-commercial use on the
expectation that hobbyists of all ages will find the details useful for understanding 
and working with their Raspberry Pi hardware.

The hope is that Broadcom will be flattered by the interest in the device and
understand the benefits of opening up understanding to a larger audience of 
potential customers and developers.

Broadcom should be commended for making their SoC available for a project as
exciting as the Raspberry Pi.

The intent is that no copyrighted materials are contained in this repository.  

Introduction

Purpose of this repo: documentation and samples for the VideoCore IV instruction set as used in the Broadcom SoC found in the Raspberry Pi. As of early 2016, Broadcom has yet to release public information on the VPU, so we hope you find this repo useful.

The BCM2835 SoC (System on a Chip) in the original Raspberry Pi has the following significant computation units:

  • (ARM) ARM1176JZF-S 700 MHz processor, which acts as the "main" processor and typically runs Linux.
  • (VPU) Dual-core VideoCore IV CPU @ 250 MHz with SIMD Parallel Pixel Units (PPU), which runs scalar (integer and float) and vector (integer only) programs. It runs the ThreadX OS and generally coordinates all functional blocks such as video codecs, power management and video out.
  • (ISP) Image Sensor Pipeline providing lens shading, statistics and distortion correction.
  • (QPU) QPUs, which provide 24 GFLOPS of compute performance for coordinate, vertex and pixel shaders. The QPU was originally undocumented, but Broadcom released documentation and source code for it in 2014.
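
The 24 GFLOPS figure quoted for the QPUs can be reproduced with a little arithmetic. The sketch below is illustrative only. The unit counts (12 QPUs, 4 physical SIMD lanes per QPU, one add and one multiply issued per lane per cycle) are assumptions taken from the public VideoCore IV 3D Architecture Reference Guide listed under Data Sheets, and the 250 MHz clock is assumed to match the VPU clock given above. The constant and function names are made up for this example.

```python
# Rough peak-throughput estimate for the QPU array.
# All four constants are assumptions (see the note above), not measurements.
QPU_COUNT = 12                 # assumed: 3 slices of 4 QPUs
LANES_PER_QPU = 4              # assumed: 4 physical SIMD lanes per QPU
FLOPS_PER_LANE_PER_CYCLE = 2   # assumed: one add plus one multiply per cycle
CLOCK_HZ = 250_000_000         # assumed: 250 MHz

def peak_gflops(qpus=QPU_COUNT, lanes=LANES_PER_QPU,
                flops=FLOPS_PER_LANE_PER_CYCLE, clock_hz=CLOCK_HZ):
    """Return the theoretical peak in GFLOPS for the given configuration."""
    return qpus * lanes * flops * clock_hz / 1e9

if __name__ == "__main__":
    print(peak_gflops())
```

This is a theoretical peak; real shaders or compute kernels will usually fall short of it because of memory bandwidth and instruction scheduling.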

Newer Raspberry Pi models mix things up with faster and more modern ARM cores, but the VPU information here is still relevant.

For more information on the Raspberry Pi, see the foundation's site at http://raspberrypi.org, or the embedded linux wiki at http://elinux.org/R-Pi_Hub.

Active discussions take place on IRC (freenode) on #raspberrypi-internals, #raspberrypi-osdev, #raspberrypi-dev, and #raspberrypi.

There is a raspberrypi-internals mailing list; you can subscribe on the mailing list page at freelists.org.

We are at a very early stage of understanding the device. At this stage we only have serial IO and GPIO for flashing things like the status LED. You will need to attach a terminal to the Mini UART on the GPIO connector. For more details see "Getting started" below.

It is now possible to use VideoCore Kernels from Userland / Linux, see https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV-Kernels-under-Linux. Our understanding of the VideoCore processor is nearing completion, and it is an excellent target for integer SIMD and DSP kernels. Essentially, it can be used for 16-way SIMD processing of 8-, 16- and 32-bit integer values.
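
To give a feel for what 16-way integer SIMD means, here is a plain Python emulation of a lane-wise add over 16 elements at a selectable lane width. It is a sketch only: it is not VPU code, `simd_add` is a hypothetical helper rather than anything in this repo or the instruction set, and the wrap-around overflow behaviour is an assumption chosen for simplicity.

```python
LANES = 16  # one vector operation touches 16 elements at once

def simd_add(a, b, bits=8):
    """Lane-wise add of two 16-element vectors of unsigned integers.

    Each lane is `bits` wide (8, 16 or 32). Overflow wraps around;
    wrapping is an assumption made for this illustration.
    """
    if len(a) != LANES or len(b) != LANES:
        raise ValueError("expected exactly 16 lanes per operand")
    if bits not in (8, 16, 32):
        raise ValueError("lane width must be 8, 16 or 32 bits")
    mask = (1 << bits) - 1
    return [(x + y) & mask for x, y in zip(a, b)]

if __name__ == "__main__":
    pixels = list(range(240, 256))   # 16 unsigned 8-bit values
    bias = [10] * LANES
    print(simd_add(pixels, bias, bits=8))    # high lanes wrap modulo 256
    print(simd_add(pixels, bias, bits=16))   # same inputs, no wrap at 16 bits
```

On the real hardware the whole 16-element operation is a single vector instruction; see the Programmers Manual linked under Documentation for the actual instruction forms.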

Videocore IV Community and Resources:

I recommend starting with Julian's GNU toolchain, at https://github.com/itszor/vc4-toolchain

Documentation:

  1. Getting started: https://github.com/hermanhermitage/videocoreiv/wiki/Getting-Started
  2. Instruction set: https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV-Programmers-Manual
  3. Hardware regs:
  4. Kernels from Linux: https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV-Kernels-under-Linux
  5. Performance Issues: https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV-Performance-Considerations
  6. 3d Pipeline Overview: https://github.com/hermanhermitage/videocoreiv/wiki/VideoCore-IV-3d-Graphics-Pipeline
  7. QPU Shader Processors (24 GFLOPS): https://github.com/hermanhermitage/videocoreiv-qpu

Methodology:

All information here has been obtained solely by a combination of:

  1. Static analysis.
  2. Experimentation on a Raspberry Pi.
  3. Discussions on #raspberrypi-osdev and #raspberrypi-internals.

All activities were undertaken on a Raspberry Pi running Debian.

If you are interested in the legal issues surrounding reverse engineering, please review:

  1. https://www.eff.org/issues/coders/reverse-engineering-faq
  2. http://www.chillingeffects.org/reverse/faq.cgi
  3. http://en.wikipedia.org/wiki/Reverse_engineering

We neither accept nor publish materials relating to DRM or its circumvention.

Referenced Materials

Software and Binaries

Official RasPi firmware and blobs

Available at https://github.com/raspberrypi/firmware/tree/master/boot. Releases after May 10th, 2012 are accompanied by a LICENSE.broadcom readme file containing a copyright notice, a disclaimer and guidelines for use. Prior to this date the readme was not present.

Debian "Squeeze" Distribution

The distribution debian6-19-04-2012.zip from http://www.raspberrypi.org/downloads was used as a development platform for the majority of the work you find here.

Data Sheets

  1. BCM2835 ARM Peripherals data sheet at http://www.raspberrypi.org/wp-content/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
  2. VideoCore® IV 3D Architecture Reference Guide at https://docs.broadcom.com/docs/12358545

Patents and Patent Applications

The original Alphamosaic patents and patent applications provide a wealth of information for understanding the structure of the VideoCore instruction set and architecture. Whilst the instruction encodings are different and only a limited range of instructions is indicated, they prove an invaluable reference for understanding the design space the engineers were exploring.

The newer Broadcom SoC patents and applications provide detailed information on how the VideoCore has been integrated into a broader platform setting. They are invaluable for gaining a deeper insight into the additional function units present in the BCM2835 and how they fit together.

Patent Applications on Broadcom SoC Method and Systems

  • US20060184987 Intelligent Dma in a Mobile Multimedia Processor Supporting Multiple Display Formats
  • US20080291208 Method and System for Processing Data Via a 3d Pipeline Coupled to a Generic Video Processing Unit
  • US20080292216 Method and System for Processing Images using Variable Sized Tiles
  • US20080292219 Method and System for an Image Sensor Pipeline on a Mobile Imaging Device
  • US20090232347 Method and System for Inserting Software Processing In a Hardware Image Sensor Pipeline
  • US20110148901 Method and System for Tile Mode Renderer With Coordinate Shader
  • US20110154307 Method and System for Utilizing Data Flow Graphs to Compile Shaders
  • US20110154377 Method and System for Reducing Communication During Video Processing Utilizing Merge Buffering
  • US20110216069 Method and System for Compressing Tile Lists Used for 3d Rendering
  • US20110221743 Method and System for Controlling a 3d Processor Using a Control List in Memory
  • US20110227920 Method and System for a Shader Processor With Closely Couple Peripherals
  • US20110242113 Method and System for Processing Pixels Utilizing Scoreboarding
  • US20110242344 Method and System for Determining How to Handle Processing of an Image Based Motion
  • US20110242427 Method and System for Providing 1080P Video with 32 Bit Mobile DDR Memory
  • US20110249744 Method and System for Video Processing Utilizing Scalar Cores and a Single Vector Core
  • US20110254995 Method and System for Mitigating Seesawing Effect During Autofocus
  • US20110261059 Method and System for Decomposing Complex Shapes Into Curvy RHTS For Rasterization
  • US20110261061 Method and System for Processing Image Data on a Per Tile Basis in an Image Sensor Pipeline
  • US20110264902 Method and System For Suspending Video Processor and Saving Processor State in SDRAM Utilizing a Core Processor
  • US20110279702 Method and System for Providing A Programmable and Flexible Image Sensor Pipeline For Multiple Input Patterns

Patents on the baseline Alphamosaic processor

  • US7028143 Narrow/Wide Cache
  • US7036001 Vector Processing System
  • US7457941 Vector Processing System
  • US7043618 System for Memory Access in a Data Processor
  • US7107429 Data Access in a Processor
  • US7069417 Vector Processing System
  • US7818540 Vector Processing System
  • US7080216 Data Access in a Processor
  • US7130985 Parallel Processor Executing an Instruction Specifying Any Location First Operand Register and Group Configuration in Two Dimensional Register File
  • US7167972 Vector/Scalar System With Vector Unit Producing Scalar Result from Vector Results According to Modifier in Vector Instruction
  • US7350057 Scalar Result Producing Method in Vector/Scalar System by Vector Unit from Vector Results According to Modifier in Vector Instruction
  • US7200724 Two Dimensional Access in a Data Processor
  • US7203800 Narrow/Wide Cache

Patent Applications on the baseline Alphamosaic processor:

Third Party Documents and Links

Some snippets of information appear in third party documents.