ossrs/srs

Streaming with DJI GO to SRS, RTMP playback is normal, but HLS cannot be played.

xiaoyaowx opened this issue · 14 comments

Using the "DJI GO" app (for DJI drones), streaming to SRS and playing through RTMP works fine. However, HLS playback is not possible as the video appears pixelated. At the same time, when testing streaming to a wowza server, the HLS playback generated on wowza is normal. Can you please explain the reason for this?

Enable ATC mode, use the SRS RTMP dump tool to record a segment of FLV, and upload the logs and the file so I can take a look.
Be careful to capture it correctly, because incorrect data is of no use.

[2016-12-08 10:26:36.416][trace][13946][227] dvr stream wifiwx-85 to file /storage/live_data/live/wifiwx-85.1481163996416.flv
[2016-12-08 10:26:36.426][trace][13946][227] set TCP_NODELAY 0=>1
[2016-12-08 10:26:36.427][trace][13946][227] start publish mr=0/350, p1stpt=20000, pnt=20000, tcp_nodelay=1, rtcid=230
[2016-12-08 10:26:36.530][trace][13946][227] got metadata, width=960, height=720, vcodec=7, acodec=10
[2016-12-08 10:26:36.530][trace][13946][227] 42B video sh, codec(7, profile=Baseline, level=3.1, 960x720, 0kbps, 0fps, 0s)
[2016-12-08 10:26:36.530][trace][13946][227] 7B audio sh, codec(10, profile=LC, 1channels, 0kbps, 44100HZ), flv(16bits, 2channels, 44100HZ)
[2016-12-08 10:26:40.036][warn][13946][227][11] hls: ts starts without IDR, first nalu=9, idr=0
[2016-12-08 10:26:40.794][trace][13946][227] -> HLS time=5000, stream dts=348930(3877ms), sno=2, ts=http://m.cdn.wifiwx.com/live/wifiwx-85-1481164000037-1.ts, dur=0.69, dva=0p
[2016-12-08 10:26:43.377][warn][13946][227][11] hls: ts starts without IDR, first nalu=9, idr=0
[2016-12-08 10:26:46.539][warn][13946][227][11] hls: ts starts without IDR, first nalu=9, idr=0
[2016-12-08 10:26:49.744][warn][13946][227][11] hls: ts starts without IDR, first nalu=9, idr=0
[2016-12-08 10:26:50.634][trace][13946][227] -> HLS time=15000, stream dts=1245870(13843ms), sno=5, ts=http://m.cdn.wifiwx.com/live/wifiwx-85-1481164009745-4.ts, dur=0.99, dva=0p
[2016-12-08 10:26:53.199][warn][13946][227][11] hls: ts starts without IDR, first nalu=9, idr=0
[2016-12-08 10:26:56.185][warn][13946][227][11] hls: ts starts without IDR, first nalu=9, idr=0
[2016-12-08 10:26:56.427][trace][13946][227] <- CPB time=0, okbps=1,0,0, ikbps=2096,0,0, mr=0/350, p1stpt=20000, pnt=20000
[2016-12-08 10:26:59.386][warn][13946][227][11] hls: ts starts without IDR, first nalu=9, idr=0
[2016-12-08 10:27:00.623][trace][13946][227] -> HLS time=25000, stream dts=2147490(23861ms), sno=8, ts=http://m.cdn.wifiwx.com/live/wifiwx-85-1481164019388-7.ts, dur=1.32, dva=0p

wifiwx-85.1481163996416.flv.zip

It must be non-standard NALUs in the H.264 stream. I will find time to check whether SRS can be made compatible with them. Poorly written encoders tend to send strange streams.

FFMPEG also reports problems with your file:

winlin:srs-plus winlin$ ffmpeg -i wifiwx-85.1481163996416.flv 
ffmpeg version 2.1.1 Copyright (c) 2000-2013 the FFmpeg developers
  built on Feb  5 2016 17:28:08 with Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
  configuration: --enable-gpl --enable-nonfree --yasmexe=/Users/winlin/Desktop/git/srs-plus/trunk/objs/ffmpeg.src/_release/bin/yasm --prefix=/Users/winlin/Desktop/git/srs-plus/trunk/objs/ffmpeg.src/_release --cc= --enable-static --disable-shared --disable-debug --extra-cflags='-I${ffmpeg_exported_release_dir}/include' --extra-ldflags='-L${ffmpeg_exported_release_dir}/lib -lm -ldl' --disable-ffplay --disable-ffprobe --disable-ffserver --disable-doc --enable-postproc --enable-bzlib --enable-zlib --enable-parsers --enable-libx264 --enable-libmp3lame --enable-libfdk-aac --enable-libspeex --enable-pthreads --extra-libs=-lpthread --enable-encoders --enable-decoders --enable-avfilter --enable-muxers --enable-demuxers
  libavutil      52. 48.101 / 52. 48.101
  libavcodec     55. 39.101 / 55. 39.101
  libavformat    55. 19.104 / 55. 19.104
  libavdevice    55.  5.100 / 55.  5.100
  libavfilter     3. 90.100 /  3. 90.100
  libswscale      2.  5.101 /  2.  5.101
  libswresample   0. 17.104 /  0. 17.104
  libpostproc    52.  3.100 / 52.  3.100
[h264 @ 0x7fe5ee018000] top block unavailable for requested intra4x4 mode -1 at 2 0
[h264 @ 0x7fe5ee018000] error while decoding MB 2 0
[h264 @ 0x7fe5ee018000] concealing 2700 DC, 2700 AC, 2700 MV errors in I frame
[h264 @ 0x7fe5ee018000] Missing reference picture, default is 0
    Last message repeated 1 times
[h264 @ 0x7fe5ee018000] ref 32 overflow
[h264 @ 0x7fe5ee018000] error while decoding MB 13 0
[h264 @ 0x7fe5ee018000] Cannot use next picture in error concealment
[h264 @ 0x7fe5ee018000] concealing 2700 DC, 2700 AC, 2700 MV errors in P frame
[h264 @ 0x7fe5ee018000] Missing reference picture, default is 2
[h264 @ 0x7fe5ee018000] ref 14 overflow
[h264 @ 0x7fe5ee018000] error while decoding MB 13 0
[h264 @ 0x7fe5ee018000] Cannot use next picture in error concealment
[h264 @ 0x7fe5ee018000] concealing 2700 DC, 2700 AC, 2700 MV errors in P frame
[h264 @ 0x7fe5ee018000] ref 9 overflow
[h264 @ 0x7fe5ee018000] error while decoding MB 13 0
[h264 @ 0x7fe5ee018000] Cannot use next picture in error concealment
[h264 @ 0x7fe5ee018000] concealing 2700 DC, 2700 AC, 2700 MV errors in P frame
[h264 @ 0x7fe5ee018000] ref 14 overflow
[h264 @ 0x7fe5ee018000] error while decoding MB 13 0
[h264 @ 0x7fe5ee018000] Cannot use next picture in error concealment
[h264 @ 0x7fe5ee018000] concealing 2700 DC, 2700 AC, 2700 MV errors in P frame
[h264 @ 0x7fe5ee018000] ref 32 overflow
[h264 @ 0x7fe5ee018000] error while decoding MB 13 0
[h264 @ 0x7fe5ee018000] Cannot use next picture in error concealment
[h264 @ 0x7fe5ee018000] concealing 2700 DC, 2700 AC, 2700 MV errors in P frame
[h264 @ 0x7fe5ee018000] ref 7 overflow
[h264 @ 0x7fe5ee018000] error while decoding MB 13 0
[h264 @ 0x7fe5ee018000] Cannot use next picture in error concealment
[h264 @ 0x7fe5ee018000] concealing 2700 DC, 2700 AC, 2700 MV errors in P frame
Input #0, flv, from 'wifiwx-85.1481163996416.flv':
  Metadata:
    encoder         : Lavf56.15.102
    service         : SRS/2.0.221(ZhouGuowen)
  Duration: 00:00:37.68, start: 0.000000, bitrate: 2144 kb/s
    Stream #0:0: Video: h264 (Constrained Baseline), yuv420p, 960x720, 1000 kb/s, 30.30 tbr, 1k tbn, 60 tbc
    Stream #0:1: Audio: aac, 44100 Hz, mono, fltp, 128 kb/s
At least one output file must be specified
winlin:srs-plus winlin$ 

The file recorded by srs rtmp dump (the original stream without modified data) can be found here: wifiwx-85.1481163996416.flv.zip.

Push to SRS:

ffmpeg -re -i wifiwx-85.1481163996416.flv -c copy -f flv -y rtmp://127.0.0.1/live/livestream

Server logs:

[2017-01-09 12:05:15.032][trace][67761][107] RTMP client ip=127.0.0.1, fd=10
[2017-01-09 12:05:15.032][trace][67761][107] srand initialized the random.
[2017-01-09 12:05:15.034][trace][67761][107] complex handshake success
[2017-01-09 12:05:15.034][trace][67761][107] connect app, tcUrl=rtmp://127.0.0.1:1935/live, pageUrl=, swfUrl=, schema=rtmp, vhost=__defaultVhost__, port=1935, app=live, args=null
[2017-01-09 12:05:15.034][trace][67761][107] client identified, type=fmle-publish, stream_name=livestream, duration=-1.00
[2017-01-09 12:05:15.060][trace][67761][107] source url=/live/livestream, ip=127.0.0.1, cache=1, is_edge=0, source_id=-1[-1]
[2017-01-09 12:05:15.061][trace][67761][107] hls: win=60.00, frag=10.00, prefix=, path=./objs/nginx/html, m3u8=[app]/[stream].m3u8, ts=[app]/[stream]-[seq].ts, aof=2.00, floor=0, clean=1, waitk=1, dispose=0
[2017-01-09 12:05:15.061][trace][67761][107] ignore disabled exec for vhost=__defaultVhost__
[2017-01-09 12:05:15.072][trace][67761][107] exec thread cid=110, current_cid=107
[2017-01-09 12:05:15.083][trace][67761][107] start publish mr=0/350, p1stpt=20000, pnt=20000, tcp_nodelay=0, rtcid=111
[2017-01-09 12:05:15.083][trace][67761][107] got metadata, width=960, height=720, vcodec=7, acodec=10
[2017-01-09 12:05:15.083][trace][67761][107] protocol in.buffer=0, in.ack=0, out.ack=2500000, in.chunk=60000, out.chunk=60000
[2017-01-09 12:05:15.083][trace][67761][107] 42B video sh,  codec(7, profile=Baseline, level=3.1, 960x720, 0kbps, 0fps, 0s)
[2017-01-09 12:05:15.083][trace][67761][107] 7B audio sh, codec(10, profile=LC, 1channels, 0kbps, 44100HZ), flv(16bits, 2channels, 44100HZ)
[2017-01-09 12:05:24.792][trace][67761][107] -> HLS time=10013, stream dts=874620(9718ms), sno=1, ts=livestream-0.ts, dur=9.72, dva=0p
[2017-01-09 12:05:25.279][warn][67761][107][35] hls: ts starts without IDR, first nalu=9, idr=0

The warning above was logged around the time the segment was cut.

bogon:live winlin$ cat livestream.m3u8 
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-ALLOW-CACHE:YES
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-TARGETDURATION:15
#EXT-X-DISCONTINUITY
#EXTINF:10.192, no desc
livestream-0.ts
#EXTINF:10.196, no desc
livestream-1.ts

The sliced ts files cannot be played in VLC, which shows only a green or gray screen. However, the RTMP stream plays, and VLC can also play the FLV file directly.

I have a feeling that the H.264 in this FLV file is not in Annex B format but in the MP4-style (length-prefixed) H.264 encapsulation. Would that cause any issues?

Analyze its SPS/PPS, which is sent as the first frame of the video. The content is as follows:

(lldb) p *msg
(SrsSharedPtrMessage) $7 = {
  timestamp = 0
  stream_id = 1
  size = 42
  payload = 0x00000001010034a0 "\x17"
  ptr = 0x0000000101003780
}
(lldb) x/42xb msg->payload
0x1010034a0: 0x17 0x00 0x00 0x00 0x00 0x01 0x42 0xc0
0x1010034a8: 0x1f 0xff 0xe1 0x00 0x16 0x67 0x42 0xc0
0x1010034b0: 0x1f 0xd9 0x00 0xf0 0x16 0xe8 0x40 0x00
0x1010034b8: 0x00 0x03 0x00 0x40 0x00 0x00 0x0f 0x03
0x1010034c0: 0xc6 0x0c 0x92 0x01 0x00 0x04 0x68 0xcb
0x1010034c8: 0x8e 0x20

The encapsulation is correct, and it can enter the function SrsAvcAacCodec::avc_demux_sps_pps to parse the content of SPS and PPS. The bytes of SPS and PPS are as follows:

(lldb) p avc_extra_size
(int) $25 = 37
(lldb) x/37xb avc_extra_data
0x1010037b0: 0x01 0x42 0xc0 0x1f 0xff 0xe1 0x00 0x16
0x1010037b8: 0x67 0x42 0xc0 0x1f 0xd9 0x00 0xf0 0x16
0x1010037c0: 0xe8 0x40 0x00 0x00 0x03 0x00 0x40 0x00
0x1010037c8: 0x00 0x0f 0x03 0xc6 0x0c 0x92 0x01 0x00
0x1010037d0: 0x04 0x68 0xcb 0x8e 0x20

The data of SPS is as follows:

(lldb) p sequenceParameterSetLength
(u_int16_t) $45 = 22
(lldb) x/22xb sequenceParameterSetNALUnit
0x101201550: 0x67 0x42 0xc0 0x1f 0xd9 0x00 0xf0 0x16
0x101201558: 0xe8 0x40 0x00 0x00 0x03 0x00 0x40 0x00
0x101201560: 0x00 0x0f 0x03 0xc6 0x0c 0x92

The data of PPS is as follows:

(lldb) p pictureParameterSetLength
(u_int16_t) $52 = 4
(lldb) x/4xb pictureParameterSetNALUnit
0x101003310: 0x68 0xcb 0x8e 0x20

It seems that SPS and PPS do not have any special features. The parsed SPS information is also correct:

[2017-01-09 15:46:21.531][trace][83301][107] 42B video sh,  
codec(7, profile=Baseline, level=3.1, 960x720, 0kbps, 0fps, 0s)

Comparing with FFMPEG's information:

 Stream #0:0: Video: h264 (Constrained Baseline), yuv420p, 960x720, 
1000 kb/s, 30.30 tbr, 1k tbn, 60 tbc
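
The layout of that 42-byte sequence header can be walked through in a few lines. The following is a minimal Python sketch, not the SRS code: the function name `parse_avc_sequence_header` and the returned dictionary keys are made up for illustration, and the input is the payload copied from the lldb dump above.

```python
import struct

# The 42-byte video sequence header from the lldb dump above.
PAYLOAD = bytes.fromhex(
    "17 00 00 00 00 "        # FLV video header: keyframe + AVC, sequence header, cts=0
    "01 42 c0 1f ff e1 "     # version, profile, constraints, level, length size, SPS count
    "00 16 67 42 c0 1f d9 00 f0 16 e8 40 00 00 03 00 40 00 00 0f 03 c6 0c 92 "
    "01 00 04 68 cb 8e 20 "  # PPS count, PPS length, PPS
)

def parse_avc_sequence_header(payload):
    """Hypothetical helper: split an FLV AVC sequence header into SPS/PPS."""
    if payload[0] & 0x0F != 7 or payload[1] != 0:
        raise ValueError("not an AVC sequence header")
    record = payload[5:]  # skip the 5-byte FLV video header
    info = {
        "profile_idc": record[1],
        "level_idc": record[3],
        "nalu_length_size": (record[4] & 0x03) + 1,
        "sps": [],
        "pps": [],
    }
    pos = 5
    for key, count_mask in (("sps", 0x1F), ("pps", 0xFF)):
        count = record[pos] & count_mask
        pos += 1
        for _ in range(count):
            (size,) = struct.unpack_from(">H", record, pos)  # 16-bit big-endian length
            pos += 2
            info[key].append(record[pos:pos + size])
            pos += size
    return info

info = parse_avc_sequence_header(PAYLOAD)
print(info["profile_idc"], info["level_idc"], len(info["sps"][0]), len(info["pps"][0]))
```

The `nalu_length_size` field is what tells a demuxer that every NALU in later video packets is prefixed with a 4-byte size.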

The first frame should be an IDR frame, and the data is as follows:

(lldb) x/128xb stream->p-4
0x102802e05: 0x00 0x00 0x00 0x02 0x09 0x30 0x00 0x00
0x102802e0d: 0x00 0x02 0x09 0x30 0x00 0x00 0x00 0x20
0x102802e15: 0x06 0x00 0x0d 0x80 0x80 0xd9 0x00 0x2e
0x102802e1d: 0xf1 0x80 0x80 0xd9 0x00 0x2e 0xf1 0xc0
0x102802e25: 0x01 0x09 0x00 0x20 0x08 0x24 0x68 0x00
0x102802e2d: 0x00 0x03 0x00 0x01 0x06 0x01 0xc4 0x80
0x102802e35: 0x00 0x00 0xee 0x52 0x21 0xb8 0x03 0x3f
0x102802e3d: 0xf9 0xd2 0x9c 0x11 0x2e 0x62 0x78 0x06
0x102802e45: 0xd2 0x65 0x0d 0x40 0xcd 0xda 0xad 0x49
0x102802e4d: 0x97 0x4d 0x4a 0x0b 0x5b 0x9c 0xab 0xea
0x102802e55: 0x8f 0x00 0x3d 0x95 0xbb 0x09 0x56 0x51
0x102802e5d: 0x30 0x47 0x9f 0x49 0x87 0x3f 0x99 0x9b
0x102802e65: 0x69 0x3c 0x8c 0x80 0x0b 0xd6 0x38 0x34
0x102802e6d: 0xed 0x08 0x81 0x66 0x38 0x99 0x0a 0x56
0x102802e75: 0x59 0x84 0xec 0xe3 0x06 0x29 0x44 0xd1
0x102802e7d: 0x9f 0x11 0x7f 0xbe 0xa7 0xe8 0x2f 0x5d

Put another way, it contains several NALUs, each in the ISO base media file format (IBMF) layout of a 4-byte size followed by the data:

(lldb) x/6xb stream->p-4
0x102802e05: 0x00 0x00 0x00 0x02 0x09 0x30

(lldb) x/6xb stream->p-4+6
0x102802e0b: 0x00 0x00 0x00 0x02 0x09 0x30

(lldb) x/36xb stream->p-4+6+6
0x102802e11: 0x00 0x00 0x00 0x20 0x06 0x00 0x0d 0x80
0x102802e19: 0x80 0xd9 0x00 0x2e 0xf1 0x80 0x80 0xd9
0x102802e21: 0x00 0x2e 0xf1 0xc0 0x01 0x09 0x00 0x20
0x102802e29: 0x08 0x24 0x68 0x00 0x00 0x03 0x00 0x01
0x102802e31: 0x06 0x01 0xc4 0x80

The last NALU has a length of 0xee52, which is 61010 bytes:
(lldb) x/128xb stream->p-4+6+6+36
0x102802e35: 0x00 0x00 0xee 0x52 0x21 0xb8 0x03 0x3f
0x102802e3d: 0xf9 0xd2 0x9c 0x11 0x2e 0x62 0x78 0x06
0x102802e45: 0xd2 0x65 0x0d 0x40 0xcd 0xda 0xad 0x49
0x102802e4d: 0x97 0x4d 0x4a 0x0b 0x5b 0x9c 0xab 0xea
0x102802e55: 0x8f 0x00 0x3d 0x95 0xbb 0x09 0x56 0x51
0x102802e5d: 0x30 0x47 0x9f 0x49 0x87 0x3f 0x99 0x9b
0x102802e65: 0x69 0x3c 0x8c 0x80 0x0b 0xd6 0x38 0x34
0x102802e6d: 0xed 0x08 0x81 0x66 0x38 0x99 0x0a 0x56
0x102802e75: 0x59 0x84 0xec 0xe3 0x06 0x29 0x44 0xd1
0x102802e7d: 0x9f 0x11 0x7f 0xbe 0xa7 0xe8 0x2f 0x5d
0x102802e85: 0x19 0x28 0x29 0x6c 0xe2 0xbd 0x05 0x6c
0x102802e8d: 0x9c 0x8e 0x9b 0xc7 0x1f 0x60 0x53 0x10
0x102802e95: 0x76 0x68 0x5d 0x1c 0xc0 0x1b 0x7e 0xc8
0x102802e9d: 0xa1 0x11 0x4d 0x23 0xc4 0xc5 0x48 0x6d
0x102802ea5: 0xd5 0xa1 0x08 0x69 0xc7 0x3d 0x25 0x31
0x102802ead: 0xad 0xe2 0x48 0x24 0xbe 0x74 0x6a 0x44

The NALUs, each counted together with its 4-byte size prefix, add up to 6+6+36+61014=61062 bytes. The video payload has a 5-byte FLV header plus these 61062 bytes, for a total of 61067 bytes.

(lldb) p size
(int) $38 = 61067
(lldb) x/16xb data
0x102802e00: 0x17 0x01 0x00 0x00 0x00 0x00 0x00 0x00
0x102802e08: 0x02 0x09 0x30 0x00 0x00 0x00 0x02 0x09

There are 4 NALUs (samples) parsed out.

(lldb) p *sample
(SrsCodecSample) $63 = {
  nb_sample_units = 4
  sample_units = {
    [0] = (size = 2, bytes = "\t0")
    [1] = (size = 2, bytes = "\t0")
    [2] = (size = 32, bytes = "\x06")
    [3] = (size = 61010, bytes = "!\xffffffb8\x03?")

These several NALUs are as follows:

AUD, 2 bytes (09 30)
AUD, 2 bytes (09 30)
SEI, 32 bytes (06 00 0d)
NonIDR, 61010 bytes (21 b8 03)
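
The size+data walk described above can be sketched as follows. This is an illustrative Python sketch rather than the SRS demuxer: `split_nalus` and `NALU_NAMES` are hypothetical names, and because the real 61010-byte slice is not available here, the last NALU is its first three bytes padded with zeros to the same length.

```python
NALU_NAMES = {1: "NonIDR", 5: "IDR", 6: "SEI", 7: "SPS", 8: "PPS", 9: "AUD"}

def split_nalus(data, length_size=4):
    """Split a size-prefixed (MP4-style) buffer into a list of NALUs."""
    nalus = []
    pos = 0
    while pos < len(data):
        if pos + length_size > len(data):
            raise ValueError("truncated NALU size")
        size = int.from_bytes(data[pos:pos + length_size], "big")
        pos += length_size
        if pos + size > len(data):
            raise ValueError("truncated NALU data")
        nalus.append(data[pos:pos + size])
        pos += size
    return nalus

def nalu_name(nalu):
    # nal_unit_type is the low 5 bits of the first byte.
    return NALU_NAMES.get(nalu[0] & 0x1F, "other")

aud = bytes.fromhex("09 30")
sei = bytes.fromhex(
    "06 00 0d 80 80 d9 00 2e f1 80 80 d9 00 2e f1 c0 "
    "01 09 00 20 08 24 68 00 00 03 00 01 06 01 c4 80"
)
# Placeholder slice: real header bytes, zero padding instead of the real data.
slice_nalu = bytes.fromhex("21 b8 03") + bytes(61010 - 3)

frame = b"".join(len(n).to_bytes(4, "big") + n for n in (aud, aud, sei, slice_nalu))
parsed = split_nalus(frame)
for n in parsed:
    print(nalu_name(n), len(n))
```

Note that 0x21 & 0x1f is 1, which is why the 61010-byte unit is classified as NonIDR even though the FLV tag marks the frame as a keyframe.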

If a frame has no IDR, SRS does not insert SPS and PPS, resulting in a garbled picture and decoding errors.

Video frame data sent.

The data of SPS is as follows:

(lldb) p sequenceParameterSetLength
(u_int16_t) $45 = 22
(lldb) x/22xb sequenceParameterSetNALUnit
0x101201550: 0x67 0x42 0xc0 0x1f 0xd9 0x00 0xf0 0x16
0x101201558: 0xe8 0x40 0x00 0x00 0x03 0x00 0x40 0x00
0x101201560: 0x00 0x0f 0x03 0xc6 0x0c 0x92

The data of PPS is as follows:

(lldb) p pictureParameterSetLength
(u_int16_t) $52 = 4
(lldb) x/4xb pictureParameterSetNALUnit
0x101003310: 0x68 0xcb 0x8e 0x20

First frame:

AUD, 2 bytes (09 30)
AUD, 2 bytes (09 30)
SEI, 32 bytes (06 00 0d)
NonIDR, 61010 bytes (21 b8 03)
frame_type = SrsCodecVideoAVCFrameKeyFrame
avc_packet_type = SrsCodecVideoAVCTypeNALU

(lldb) x/32xb SEI
0x102009415: 0x06 0x00 0x0d 0x80 0x80 0xd9 0x00 0x2e
0x10200941d: 0xf1 0x80 0x80 0xd9 0x00 0x2e 0xf1 0xc0
0x102009425: 0x01 0x09 0x00 0x20 0x08 0x24 0x68 0x00
0x10200942d: 0x00 0x03 0x00 0x01 0x06 0x01 0xc4 0x80

Parsing out the RBSP data leaves only 30 bytes (1 byte removed for the NALU header, and one 03 emulation prevention byte removed):
(lldb) x/30xb rbsp
0x100703ec0: 0x00 0x0d 0x80 0x80 0xd9 0x00 0x2e 0xf1
0x100703ec8: 0x80 0x80 0xd9 0x00 0x2e 0xf1 0xc0 0x01
0x100703ed0: 0x09 0x00 0x20 0x08 0x24 0x68 0x00 0x00
0x100703ed8: 0x00 0x01 0x06 0x01 0xc4 0x80
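
The 32-to-30 byte reduction can be reproduced with a short sketch. This assumes the usual H.264 rule that a 03 byte following two 00 bytes is an emulation prevention byte; the function name `nalu_to_rbsp` is hypothetical, and the input is the SEI copied from the dump above.

```python
def nalu_to_rbsp(nalu):
    """Drop the 1-byte NALU header and strip emulation prevention bytes."""
    rbsp = bytearray()
    zeros = 0  # number of consecutive 0x00 bytes seen so far
    for byte in nalu[1:]:
        if zeros >= 2 and byte == 0x03:
            zeros = 0  # 00 00 03: skip the 03 marker
            continue
        rbsp.append(byte)
        zeros = zeros + 1 if byte == 0x00 else 0
    return bytes(rbsp)

sei = bytes.fromhex(
    "06 00 0d 80 80 d9 00 2e f1 80 80 d9 00 2e f1 c0 "
    "01 09 00 20 08 24 68 00 00 03 00 01 06 01 c4 80"
)
rbsp = nalu_to_rbsp(sei)
print(len(sei), len(rbsp))
```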

Second frame:

- AUD, 2 bytes (09 10)
- AUD, 2 bytes (09 10)
- SEI, 14 bytes (06 01 09)
- NonIDR, 6640 bytes (21 e2 23)
frame_type = SrsCodecVideoAVCFrameInterFrame
avc_packet_type = SrsCodecVideoAVCTypeNALU

(lldb) x/14xb SEI
0x100814c15: 0x06 0x01 0x09 0x00 0x02 0x08 0x24 0x68
0x100814c1d: 0x00 0x00 0x03 0x00 0x01 0x80

Third frame:

- AUD, 2 bytes (09 30)
- AUD, 2 bytes (09 30)
- SEI, 14 bytes (06 01 09)
- NonIDR, 4600 bytes (21 e4 23)
frame_type = SrsCodecVideoAVCFrameInterFrame
avc_packet_type = SrsCodecVideoAVCTypeNALU

(lldb) x/14xb SEI
0x10201a015: 0x06 0x01 0x09 0x00 0x04 0x08 0x24 0x68
0x10201a01d: 0x00 0x00 0x03 0x00 0x01 0x80

Fourth frame:

- AUD, 2 bytes (09 30)
- AUD, 2 bytes (09 30)
- SEI, 14 bytes (06 01 09)
- NonIDR, 4685 bytes (21 e6 23)
frame_type = SrsCodecVideoAVCFrameInterFrame
avc_packet_type = SrsCodecVideoAVCTypeNALU

(lldb) x/14xb SEI
0x102802215: 0x06 0x01 0x09 0x00 0x06 0x08 0x24 0x68
0x10280221d: 0x00 0x00 0x03 0x00 0x01 0x80

At this point it is clear that the stream sent by DJI is open-GOP and starts with a NonIDR frame, whereas a closed-GOP stream usually starts with an IDR frame. SRS needs to handle this type of data correctly.

Reportedly, the latest FFMPEG can handle this stream correctly:

ffmpeg -re -i ~/Downloads/wifiwx-85.1481163996416.flv -c copy \
-flags global_header -f hls -hls_time 3 -hls_list_size 0 output.m3u8

Note: The FFMPEG bundled with SRS cannot handle it and also reports errors.

Analyzing the data generated by FFMPEG in TS format, the information is as follows:

00 00 00 01 09 30 // AUD
00 00 01 09 30 // AUD
00 00 01 06 00 ...... // SEI
00 00 01 21 b8 03 ...... // I Frame

00 00 00 01 09 30 // AUD
00 00 01 09 30 // AUD
00 00 01 06 00 ...... // SEI
00 00 01 21 e6 23 ...... // I Frame

Just follow the way FFMPEG handles NonIDR frames (frames that are not Instantaneous Decoder Refresh pictures): do not insert the default AUD NALU (Network Abstraction Layer Unit), and do not write SPS (Sequence Parameter Set) and PPS (Picture Parameter Set).
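
The byte pattern in the FFMPEG output above (a 4-byte start code on the first NALU of a frame, 3-byte start codes on the rest, nothing added) can be sketched like this. It only mirrors the bytes shown in this thread and is not FFMPEG's actual muxer code; `to_annexb` is a hypothetical name, and the SEI and slice are truncated to their first three bytes for illustration.

```python
def to_annexb(nalus):
    """Write NALUs with start codes, preserving their order and adding nothing."""
    out = bytearray()
    for index, nalu in enumerate(nalus):
        # First NALU of the frame gets 00 00 00 01, the others 00 00 01.
        out += b"\x00\x00\x00\x01" if index == 0 else b"\x00\x00\x01"
        out += nalu
    return bytes(out)

# First DJI frame: AUD, AUD, SEI (truncated), NonIDR slice (truncated).
frame = [
    bytes.fromhex("09 30"),
    bytes.fromhex("09 30"),
    bytes.fromhex("06 00 0d"),
    bytes.fromhex("21 b8 03"),
]
annexb = to_annexb(frame)
print(annexb.hex(" "))
```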

In summary:

  1. NonIDR (open GOP) mode: the first packet in the FLV file carries a Sequence Header, which in the FLV file above declares the Baseline profile. That is incorrect: the NALUs that follow carry the correct SPS and PPS, whose profile is High.
  2. IDR (closed GOP) mode: the SPS and PPS come from the Sequence Header in the first packet of the FLV file, and any SPS and PPS found in the NALUs are discarded.
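
The profile mismatch can be read straight off the first bytes of each SPS. Below is a small sketch under stated assumptions: `h264_profile` is a hypothetical helper that only knows a few common profile_idc values, the first input is the SPS from the Sequence Header, and the second is the three leading bytes of the in-band SPS (0x27 0x64 0) taken from the NALU log in this thread.

```python
PROFILES = {66: "Baseline", 77: "Main", 88: "Extended", 100: "High"}

def h264_profile(sps):
    """Name the profile from the bytes right after the SPS NALU header."""
    profile_idc = sps[1]
    constraint_set1 = bool(sps[2] & 0x40)  # second-highest bit of the flags byte
    if profile_idc == 66 and constraint_set1:
        return "Constrained Baseline"
    return PROFILES.get(profile_idc, "profile_idc=%d" % profile_idc)

header_sps = bytes.fromhex("67 42 c0 1f")  # from the FLV Sequence Header
inband_sps = bytes.fromhex("27 64 00")     # prefix of the SPS sent before each IDR

print(h264_profile(header_sps))
print(h264_profile(inband_sps))
```

The constraint flag also accounts for the difference between the SRS log line (Baseline) and the FFMPEG output (Constrained Baseline) for the same Sequence Header.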

The Avatar sample video under the SRS doc directory has the following NALU sequence:

[2017-01-10 17:25:10.082][warn][51431][107][35] NALU SEI, size=688, 0x6 0x5 0xff
[2017-01-10 17:25:10.082][warn][51431][107][35] NALU IDR, size=4436, 0x65 0x88 0x84
[2017-01-10 17:25:10.082][warn][51431][107][35] NALU parsed, open_gop=0, keyframe=1

[2017-01-10 17:25:10.083][warn][51431][107][35] NALU NonIDR, size=123, 0x41 0x9a 0x21
[2017-01-10 17:25:10.083][warn][51431][107][35] NALU parsed, open_gop=0, keyframe=2
......
[2017-01-10 17:25:14.994][warn][51431][107][35] NALU NonIDR, size=32, 0x41 0x9a 0x1c
[2017-01-10 17:25:14.994][warn][51431][107][35] NALU parsed, open_gop=0, keyframe=2

[2017-01-10 17:25:15.029][warn][51431][107][35] NALU IDR, size=227, 0x65 0x88 0x82
[2017-01-10 17:25:15.029][warn][51431][107][35] NALU parsed, open_gop=0, keyframe=1

[2017-01-10 17:25:15.065][warn][51431][107][35] NALU NonIDR, size=21, 0x41 0x9a 0x21
[2017-01-10 17:25:15.065][warn][51431][107][35] NALU parsed, open_gop=0, keyframe=2

Sequence:
SEI IDR
NonIDR ......
IDR
NonIDR ......

The open-GOP sequence pushed by DJI is as follows:

[2017-01-10 17:27:14.137][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x30 0
[2017-01-10 17:27:14.137][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x30 0
[2017-01-10 17:27:14.137][warn][51623][107][35] NALU SEI, size=32, 0x6 0 0xd
[2017-01-10 17:27:14.137][warn][51623][107][35] NALU NonIDR, size=61010, 0x21 0xb8 0x3
[2017-01-10 17:27:14.137][warn][51623][107][35] NALU parsed, open_gop=1, keyframe=1

[2017-01-10 17:27:14.138][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x10 0
[2017-01-10 17:27:14.138][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x10 0
[2017-01-10 17:27:14.138][warn][51623][107][35] NALU SEI, size=14, 0x6 0x1 0x9
[2017-01-10 17:27:14.138][warn][51623][107][35] NALU NonIDR, size=6640, 0x21 0xe2 0x23
[2017-01-10 17:27:14.138][warn][51623][107][35] NALU parsed, open_gop=1, keyframe=2

......

[2017-01-10 17:27:14.560][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x30 0
[2017-01-10 17:27:14.560][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x30 0
[2017-01-10 17:27:14.560][warn][51623][107][35] NALU SPS, size=47, 0x27 0x64 0
[2017-01-10 17:27:14.560][warn][51623][107][35] NALU PPS, size=4, 0x28 0xee 0x38
[2017-01-10 17:27:14.560][warn][51623][107][35] NALU SEI, size=32, 0x6 0 0xd
[2017-01-10 17:27:14.560][warn][51623][107][35] NALU IDR, size=62883, 0x25 0xb8 0x20
[2017-01-10 17:27:14.560][warn][51623][107][35] NALU parsed, open_gop=1, keyframe=1

[2017-01-10 17:27:14.592][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x10 0
[2017-01-10 17:27:14.592][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x10 0
[2017-01-10 17:27:14.592][warn][51623][107][35] NALU SEI, size=14, 0x6 0x1 0x9
[2017-01-10 17:27:14.592][warn][51623][107][35] NALU NonIDR, size=6643, 0x21 0xe2 0x23
[2017-01-10 17:27:14.592][warn][51623][107][35] NALU parsed, open_gop=1, keyframe=2

......

[2017-01-10 17:27:16.772][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x30 0
[2017-01-10 17:27:16.772][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x30 0
[2017-01-10 17:27:16.772][warn][51623][107][35] NALU SPS, size=47, 0x27 0x64 0
[2017-01-10 17:27:16.772][warn][51623][107][35] NALU PPS, size=4, 0x28 0xee 0x38
[2017-01-10 17:27:16.772][warn][51623][107][35] NALU SEI, size=32, 0x6 0 0xd
[2017-01-10 17:27:16.772][warn][51623][107][35] NALU IDR, size=63008, 0x25 0xb8 0x20
[2017-01-10 17:27:16.772][warn][51623][107][35] NALU parsed, open_gop=1, keyframe=1

[2017-01-10 17:27:16.805][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x10 0
[2017-01-10 17:27:16.805][warn][51623][107][35] NALU AccessUnitDelimiter, size=2, 0x9 0x10 0
[2017-01-10 17:27:16.805][warn][51623][107][35] NALU SEI, size=14, 0x6 0x1 0x9
[2017-01-10 17:27:16.805][warn][51623][107][35] NALU NonIDR, size=6625, 0x21 0xe2 0x23
[2017-01-10 17:27:16.805][warn][51623][107][35] NALU parsed, open_gop=1, keyframe=2

......

Sequence:
AUD AUD SEI NonIDR ......
AUD AUD SPS PPS SEI IDR
AUD AUD SEI NonIDR ......

In the end, NonIDR and IDR handling was unified. The strategy SRS uses when writing HLS is:

  1. An RTMP video packet may contain multiple NALUs. For example, the Avatar sample has SEI+IDR, NonIDR, or IDR packets, while the DJI stream has AUD+AUD+SEI+NonIDR or AUD+AUD+SPS+PPS+SEI+IDR, and so on.
  2. If the packet contains no AUD, insert a default AUD at the beginning.
  3. If the packet contains an IDR but no SPS and PPS, insert SPS and PPS before the IDR.
  4. Preserve the original NALU sequence, including AUD, SEI, SPS, and PPS, without discarding any of them.
  5. Apple's examples always start SPS and PPS with the 4-byte start code 00 00 00 01, but this appears to be unnecessary, since ffmpeg does not do it.
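
Rules 2 to 4 above can be sketched as a single pass over the NALUs of one video packet. This is an illustrative Python sketch, not the SRS implementation: `prepare_hls_nalus` is a hypothetical name, `DEFAULT_AUD` (09 f0) is an assumed placeholder value, and the NALUs are shortened to a few bytes each.

```python
DEFAULT_AUD = bytes([0x09, 0xF0])  # assumed default AUD, for illustration only

def nal_type(nalu):
    return nalu[0] & 0x1F

def prepare_hls_nalus(nalus, sps, pps):
    """Return the NALUs to write into TS for one video packet."""
    types = [nal_type(n) for n in nalus]
    has_params = any(t in (7, 8) for t in types)
    out = []
    if 9 not in types:
        out.append(DEFAULT_AUD)      # rule 2: no AUD, so prepend a default one
    for nalu in nalus:
        if nal_type(nalu) == 5 and not has_params:
            out.extend([sps, pps])   # rule 3: IDR without in-band SPS/PPS
            has_params = True
        out.append(nalu)             # rule 4: keep everything, in order
    return out

SPS = bytes.fromhex("67 42 c0 1f")
PPS = bytes.fromhex("68 cb 8e 20")

avatar_idr = [bytes.fromhex("06 05 ff"), bytes.fromhex("65 88 84")]
dji_nonidr = [bytes.fromhex("09 30"), bytes.fromhex("09 30"),
              bytes.fromhex("06 00 0d"), bytes.fromhex("21 b8 03")]

print([nal_type(n) for n in prepare_hls_nalus(avatar_idr, SPS, PPS)])
print([nal_type(n) for n in prepare_hls_nalus(dji_nonidr, SPS, PPS)])
```

With this shape the DJI packets pass through untouched, while a closed-GOP stream such as the Avatar sample still gets AUD, SPS, and PPS added ahead of each IDR.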

Streaming to SRS using the "DJI GO" app for DJI drones, the RTMP playback is normal, but the HLS playback is not working properly, with the video being all pixelated. However, when testing the streaming to a wowza server, the generated HLS playback on wowza is normal. Can you please explain the reason for this?

The streaming address I wrote is rtmp://192.168.1.130:1935/live/livestream, but it keeps showing "connecting". Can you please explain the reason for this?

I am using the DJI Cloud API to push live streams from the M300. The HLS playback is working fine, but the live stream fails to play in Chrome using the FLV player. Additionally, the FLV file recorded by the DVR cannot be played, although it can be played using VLC player after downloading. It seems that there is an issue with the format provided by DJI. How can this problem be resolved?
