zergon321/reisen

panic: buffered: len(pix) was 8294400 but must be 3686400

Closed this issue · 6 comments

Hi! Thank you for this great project! I've tried your player code with your test mp4 and it works as intended. But when I try running it with any other mp4 video file I consistently get the error panic: buffered: len(pix) was 8294400 but must be 3686400. I noticed the factor between those two numbers is exactly 2.25. Any idea what the issue might be here? Thank you!

Hi. 3,686,400 is 1280 * 720 * 4, i.e. the frame width multiplied by the frame height multiplied by 4, the size of a single pixel in bytes (one pixel consists of 4 one-byte components: R, G, B, A). Thus 3,686,400 bytes (or ~3.5 MB) is the size of a single video frame if the video resolution is 1280x720. Similarly, 8,294,400 bytes (or ~7.9 MB) is 1920 * 1080 * 4, the size of a video frame whose resolution is 1920x1080. 1920 / 1280 = 1080 / 720 = 1.5, but since we are effectively comparing the areas of the frames, the ratio becomes quadratic: 8,294,400 / 3,686,400 = 1.5^2 = 2.25.
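You can check the arithmetic directly:

	fmt.Println(1280 * 720 * 4)  // 3686400 — the frame size the player example expects
	fmt.Println(1920 * 1080 * 4) // 8294400 — the size of one 1920x1080 RGBA frame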

Apparently, you tried to supply a 1920x1080 video to the player. The video player example is quite basic and expects you to provide a 1280x720 video:

const (
	width                             = 1280
	height                            = 720
	frameBufferSize                   = 1024
	sampleRate                        = 44100
	channelCount                      = 2
	bitDepth                          = 8
	sampleBufferSize                  = 32 * channelCount * bitDepth * 1024
	SpeakerSampleRate beep.SampleRate = 44100
)

You just have to change the width and height constants, and that should do the trick. If you would like a more advanced video player, you could write one yourself: find the video stream of the media file, then read its dimensions:

// Width returns the width of the video
// stream frame.
func (video *VideoStream) Width() int {
	return int(video.codecParams.width)
}

// Height returns the height of the video
// stream frame.
func (video *VideoStream) Height() int {
	return int(video.codecParams.height)
}

After that you can create a sprite of the appropriate size to hold your frames.
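For example, here is a rough sketch of that approach. It assumes reisen.NewMedia and media.VideoStreams() behave as shown in the package README, so check the docs before relying on it:

package main

import (
	"image"

	"github.com/zergon321/reisen"
)

func main() {
	// Open the media file and find its first video stream.
	media, err := reisen.NewMedia("ella.mp4")
	if err != nil {
		panic(err)
	}

	videoStream := media.VideoStreams()[0]
	width, height := videoStream.Width(), videoStream.Height()

	// Size the frame image to the actual video resolution
	// instead of hard-coding 1280x720.
	frame := image.NewRGBA(image.Rect(0, 0, width, height))
	_ = frame
}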

You could also make the decoder supply video frames in a preferred resolution. Just use OpenDecode(width, height int, alg InterpolationAlgorithm) instead of Open() on the video stream (a short sketch follows the list below). The third parameter is the interpolation algorithm used to produce output in a resolution different from the original one. The default algorithm is InterpolationBicubic. Each of them returns a different result for the same set of size parameters:

[image: comparison of the outputs produced by the different interpolation algorithms]

And here's the set of algorithms supported by Reisen:

// InterpolationAlgorithm is used when
// we scale a video frame to a different resolution.
type InterpolationAlgorithm int

const (
	InterpolationFastBilinear    InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_FAST_BILINEAR)
	InterpolationBilinear        InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_BILINEAR)
	InterpolationBicubic         InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_BICUBIC)
	InterpolationX               InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_X)
	InterpolationPoint           InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_POINT)
	InterpolationArea            InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_AREA)
	InterpolationBicubicBilinear InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_BICUBLIN)
	InterpolationGauss           InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_GAUSS)
	InterpolationSinc            InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_SINC)
	InterpolationLanczos         InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_LANCZOS)
	InterpolationSpline          InterpolationAlgorithm = InterpolationAlgorithm(C.SWS_SPLINE)
)
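And here's the sketch promised above: requesting scaled frames through OpenDecode (assuming it returns an error the way Open() does):

// Ask the decoder to scale every frame to 1280x720 on the fly,
// whatever the source resolution is.
err := videoStream.OpenDecode(1280, 720, reisen.InterpolationBicubic)
if err != nil {
	panic(err)
}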

Thank you for your feedback! Please close the issue if everything works fine after performing the suggested actions.

Adjusting the dimensions like you suggested fixes the problem immediately and my videos play with sound as expected. I thought it must be something simple - thank you for your quick reply! Will consider what you said about scaling next.

For some reason my videos play upside down for now - probably another trivial issue :)

@boriwo As for your videos being played upside down - that shouldn't happen. I tried my own 1920x1080 video which I got from the Internet and it works as it should. Could you please post the output of the ffmpeg -i your_video.your_format command? I would also like to know where and how you obtained the video. Did you download it from the Internet or create it yourself? Did you record it with a camcorder or some 3rd-party software like OBS? Or maybe you even created it yourself using your own code in some programming language?

I used two mp4s, one recorded with a GoPro, the other recorded with my Samsung Android phone. They both play upside down with Reisen, but both play correctly in Apple QuickTime. Here is the ffprobe output for one of the videos:

ffprobe version 4.4 Copyright (c) 2007-2021 the FFmpeg developers
  built with Apple clang version 12.0.5 (clang-1205.0.22.9)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.4_2 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-avresample --enable-videotoolbox
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'ella.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.71.100
  Duration: 00:05:58.96, start: 0.000000, bitrate: 7348 kb/s
  Stream #0:0(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
  Stream #0:1(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080 [SAR 1:1 DAR 16:9], 7210 kb/s, 59.94 fps, 59.94 tbr, 220999 tbn, 119.88 tbc (default)
    Metadata:
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]

Notice how the index positions of the audio and video streams are swapped compared to your demo mp4 (I had to make a minor adjustment to your player code), but I don't think that has anything to do with the upside-down playback.
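For reference, the adjustment just picks streams by type instead of by fixed index - roughly this sketch (assuming media.Streams() returns values you can type-switch on as *reisen.VideoStream / *reisen.AudioStream):

var (
	videoStream *reisen.VideoStream
	audioStream *reisen.AudioStream
)

// Pick streams by type instead of relying on fixed indices,
// since the order differs between files.
for _, stream := range media.Streams() {
	switch s := stream.(type) {
	case *reisen.VideoStream:
		videoStream = s
	case *reisen.AudioStream:
		audioStream = s
	}
}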

Yeah, the indexing of streams doesn't matter. What really matters is the pixel order.

In OpenGL, image pixels are handled according to this coordinate system:

[image: coordinate system #1 — origin in the bottom-left corner, Y axis pointing up]

But RGBA images (including video frames) are handled according to this coordinate system:

[image: coordinate system #2 — origin in the top-left corner, Y axis pointing down]

The latter is simply more convenient for storing data in RAM. And this is the format Ebiten expects you to supply your images in to create sprites out of them. So the image pixels should be organised according to coordinate system #2. Ebiten takes the picture, reverses its rows so the pixels are ordered as in #1, and creates a correct sprite from it.

Apple QuickTime might somehow determine the order of pixels in your video. Unfortunately, I don't know which metadata parameter is responsible for it or how to read its value from the stream. What I can advise is to create a Reverse(pix []byte) []byte function that takes your frame pixels and converts them into format #2. It's also easily achievable with a library such as imaging. I think methods like imaging.Rotate180() and imaging.FlipV() might be what you need. But I strongly recommend writing your own pixel-reversal function because it would be more performant. For example, here's an algorithm I used long ago to convert image pixels from format #2 to format #1:

// pixToPictureData converts a top-down RGBA byte slice (format #2)
// into Pixel's bottom-up PictureData (format #1).
func pixToPictureData(pixels []byte, width, height int) *pixel.PictureData {
	picData := pixel.MakePictureData(pixel.
		R(0, 0, float64(width), float64(height)))

	for y := height - 1; y >= 0; y-- {
		for x := 0; x < width; x++ {
			picData.Pix[(height-y-1)*width+x].R = pixels[y*width*4+x*4+0]
			picData.Pix[(height-y-1)*width+x].G = pixels[y*width*4+x*4+1]
			picData.Pix[(height-y-1)*width+x].B = pixels[y*width*4+x*4+2]
			picData.Pix[(height-y-1)*width+x].A = pixels[y*width*4+x*4+3]
		}
	}

	return picData
}

It was created for the Pixel game library. pixel.PictureData consists of color.RGBA structs, and I read all the bytes of the source picture into the picData, converting them from format #2 to format #1. If you're going to create your own reverse function, the code above could serve as a starting point. But using imaging is the easiest solution to implement.
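For a raw []byte RGBA frame like the one Reisen produces, the same idea without the Pixel types could look like this sketch (flipVertical is a hypothetical helper, not part of Reisen):

// flipVertical reverses the row order of an RGBA pixel buffer
// in place, converting between formats #1 and #2.
func flipVertical(pix []byte, width, height int) {
	stride := width * 4 // bytes per row
	row := make([]byte, stride)

	for top, bottom := 0, height-1; top < bottom; top, bottom = top+1, bottom-1 {
		a := pix[top*stride : (top+1)*stride]
		b := pix[bottom*stride : (bottom+1)*stride]

		copy(row, a) // stash the top row
		copy(a, b)   // move the bottom row up
		copy(b, row) // drop the stashed row to the bottom
	}
}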

Also note that your audio stream's sample rate is 48,000 while the player demo uses 44,100. You'll need to replace 44,100 with 48,000 in the constants section of the player example.
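That is, in the constants block quoted earlier:

const (
	sampleRate                        = 48000
	SpeakerSampleRate beep.SampleRate = 48000
)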

Thanks again for your detailed response. Flipping the image works. For now I'm using the imaging library, and yes, there is a noticeable performance hit, so I will probably code my own. Will close this issue for now as at this point everything works and I got all my questions answered!