SWAPUGC
SWAPUGC: Software for Adaptive Playback of Geotagged User-Generated Content
This is the repository for SWAPUGC: a platform for creating applications that consume geotagged User-Generated Content (UGC), as presented at the 9th ACM Multimedia Systems Conference (MMSys '18).
DOI: 10.1145/3204949.3208142.
You can find the current release repository of SWAPUGC here. Example features implemented in the current release repository that are not available in the frozen MMSys repository:
- Multiple representations
- Sensor-based adaptation algorithms
- Simulation for network and/or stream quality updates
GoTo:
- About
- Architecture Flow
- Demo
- Generate/Record Compatible Files
- Known Issues
- Links/Contact
About
The about page is here.
TL;DR:
SWAPUGC is a platform for building web applications that consume geotagged UGC in a synchronous and adaptive manner. Some key features:
- Support for mixed adaptation policies. Stream adaptation can be based on any combination of spatial, cinematic, quality, and system criteria.
- Inter-stream synchronization between the geospatial data and the video.
- Inter-bundle synchronization between the different recordings.
- Extensibility for other timed data types (e.g. network metrics).
Architecture flow of the client
When the client is launched it does the following, in the corresponding order (a simplified sketch follows the list):
- Load items from the playlist.txt, containing the NAMEOFFILE of relevant recordings. Then, for each NAMEOFFILE entry:
  - Construct globalSetIndex, where all the information/data on the recordings is placed
  - Fetch the corresponding NAMEOFFILE_DESCRIPTOR.json file, containing information on the recording: its timing and the locations of its video / location / orientation files
  - Fetch the corresponding NAMEOFFILE_dash.mpd file
- Fetch NAMEOFFILE_LOC.json, containing the location data (placed in the globalSetIndex)
- Fetch NAMEOFFILE_ORIENT.json, containing the orientation data (placed in the globalSetIndex)
- With the acquired timed location/orientation pairs:
  - Place the markers on the map from the location/orientation pairs
  - Add the cues for updating the markers
- Fetch NAMEOFFILE_dash.mpd, with the information on the segmented files (placed in the globalSetIndex)
- Adjust MSE accordingly
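A minimal sketch of these loading steps, assuming the file-naming scheme above (globalSetIndex is the actual global described below; all other names are illustrative, not the demo's actual code):

var globalSetIndex = [];  // one entry per recording (see "Platform-specific data used")

fetch('playlist.txt')
  .then(function (r) { return r.text(); })
  .then(function (text) {
    text.split('\n').filter(Boolean).forEach(function (name, index) {
      var entry = { id: name, index: index, set: [] };
      globalSetIndex.push(entry);
      // Descriptor: timing + locations of the video / location / orientation files
      fetch(name + '_DESCRIPTOR.json')
        .then(function (r) { return r.json(); })
        .then(function (desc) { entry.descriptor = desc; });
      // Timed samples end up as Location/Sensor pairs in entry.set,
      // which also drive the map markers and their update cues
      fetch(name + '_LOC.json')
        .then(function (r) { return r.json(); })
        .then(function (locations) { /* pair with orientation samples, add cues */ });
    });
  });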
Demo
The current demo is available at https://emmanouil.github.io/SWAPUGC/. This demo is the up-to-date version and new scenarios are added constantly (example differences from the MMSys version: multiple representations, adaptation algorithms, etc.).
The MMSys '18 demo (described in the publication) is available at https://acmmmsys.github.io/2018-SWAPUGC/.
All the sample videos are encoded with H.264, in 1080p at 2000 kbps, with 2s-long segments. Because we are simulating a live scenario with dynamic switching, the buffer size is one segment, so a stable high-speed connection is required; if such a connection is not available, try running the demo locally.
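Since only one segment is buffered, switching views amounts to appending segments of the newly selected stream through MSE. A minimal sketch of such an append (the codec string and segment name are illustrative, not the demo's actual values):

var video = document.querySelector('video');
var mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);
mediaSource.addEventListener('sourceopen', function () {
  var sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001f, mp4a.40.2"');
  fetch('file_out_seg_1.m4s')                         // illustrative segment name
    .then(function (r) { return r.arrayBuffer(); })
    .then(function (buf) { sb.appendBuffer(buf); });  // one segment in the buffer at a time
});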
To run a local demo:
- Download and extract, or clone (git clone https://github.com/acmmmsys/2018-SWAPUGC.git) the repository
- Start an HTTP server in the top dir of your local repository copy (e.g. download, copy and run Mongoose, or run python -m http.server 8080 in the SWAPUGC folder)
- Open your browser and navigate to the location of index.html (e.g. http://localhost:8080/index.html for port = 8080)
To run a remote demo, repeat the same process on your server, replacing localhost with your server IP.
The demo works best with Chrome; it was tested and works with Firefox [3], and it does not work with Microsoft Edge or IE [4].
Generate Compatible Files
UGC recorder (video + sensors)
A compatible UGC recorder Android application is available here.
Generate DASH-compatible Segmented Video Files
For the demo we used MP4Box of the GPAC suite, but other tools (like ffmpeg) should work.
With MP4Box, an example command to generate the mpd file [1] and the associated 2s-long segments would be [2]:
MP4Box -dash 2000 -profile live -closest -segment-name file_out_seg_ file_in.mp4
(for live profile - recommended)
MP4Box -frag 2000 -dash 2000 -segment-name file_out_seg_ file_in.mp4
(for full profile)
NOTE: MP4Box does not do any transcoding of the media files. For that, we used ffmpeg. An example command for encoding a video with libx264 (audio AAC @ 48 kHz sample rate), at framerate = 30 fps, with a GOP size of 30 frames, at a 2 Mbps bitrate, scaled to height = 1080px, would be:
ffmpeg.exe -i 20140325_121238.webm -r 30 -preset slow -vf scale=-1:1080 -c:v libx264 -b:v 2000k -movflags +faststart -sc_threshold 0 -keyint_min 30 -g 30 -c:a aac -ar 48000 20140325_121238.mp4
(hint: if ffmpeg throws a scaling error you can use scale=-2:1080)
To analyze generated files you can use MP4Box as well (e.g. MP4Box -info 20140325_121238.mp4, or MP4Box -info 1 20140325_121238.mp4 for info only on the first track).
Format XML sensor data (compatible with ICoSOLE dataset)
The parser_single_file script, located inside the tool dir, will generate files like those used in the demo from XML files, such as the ones taken from the ICoSOLE project (project repository here).
Using The Parser
Run parser_single_file with the NAMEOFFILE as an argument (without extension). For example, for a file ABC123.mp4 in the folder 'parsing', it should be executed as python3 parser_single_file.py parsing/ABC123. Each entry should have at least a video file and an associated EBUCore timing file (in XML).
Parser Output
- NAMEOFFILE_DESCRIPTOR.json, containing information about the recording
- NAMEOFFILE_ORIENT.json, containing the timestamped orientation samples of the recording
- NAMEOFFILE_LOC.json, containing the timestamped location samples of the recording
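As a purely illustrative example of the descriptor (all field names below are hypothetical; check the parser source for the actual schema), a DESCRIPTOR file might look like:

{
  "id": "ABC123",
  "startTime": 1466187515309,
  "videoFile": "ABC123_dash.mpd",
  "locationFile": "ABC123_LOC.json",
  "orientationFile": "ABC123_ORIENT.json"
}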
Platform-specific data used
Global Pairs Holder
global variable name: globalSetIndex
description: an Array of recordings - the Location/Sensor Pair Objects of each recording are stored in the set field
{
id: "1234567_12345"
index: 0
set: Array[12]
textFile: "OUT_1234567_12345.txt"
textFileURL: "http://137.194.232.162:8080/parsing/OUT_1234567_12345.txt"
videoFile: "OUT_1234567_12345.mp4"
}
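For example, reading the first location sample of the first recording (indices illustrative; the pair format is shown below):

var rec = globalSetIndex[0];   // first recording in the index
var pair = rec.set[0];         // its first Location/Sensor Pair Object
console.log(rec.videoFile, pair.Location.Latitude, pair.Location.Longitude);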
Other globals:
- p - player information and settings
- map - map Object, created using the external API
- markers - references to the marker objects (when orientation available)
Location and Sensor Pairs
description: an Object holding Orientation and Location information for a POI
{
"id": 1,
"Sensor": {
"Y": -0.083974324,
"LocalTimestamp": 1466187515309,
"Type": "ORIENTATION",
"X": 2.5136049,
"Z": -1.4016464
},
"Location": {
"Time": 1466187920000,
"LocalNanostamp": 27814219216825,
"Longitude": 2.3506619881858737,
"Latitude": 48.83000039044928,
"Altitude": 111.77508694140864,
"Bearing": 213.30880737304688,
"Provider": "gps",
"Accuracy": 16,
"LocalTimestamp": 1466187515321,
"Velocity": 1.0693713426589966
}
}
SWAPUGC should work without orientation data, receiving just the location updates; in that case, when setting the markers, the default icon is used. This feature is implemented (incl. in the demo), but not actively tested.
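A sketch of that fallback (markerIconFor, rotatedIcon and defaultIcon are hypothetical names, not the demo's actual functions):

var defaultIcon = 'marker.png';   // hypothetical icon asset
function rotatedIcon(x) { return 'marker_' + Math.round(x) + '.png'; }  // hypothetical

function markerIconFor(pair) {
  // Rotate the marker only when an orientation sample is present
  if (pair.Sensor && pair.Sensor.Type === "ORIENTATION") {
    return rotatedIcon(pair.Sensor.X);  // the X axis drives the marker heading
  }
  return defaultIcon;                   // location-only recordings keep the default icon
}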
Events implemented
We are using VTTCue for new events (e.g. location updates). VTTCue is part of WebVTT and is an extension of the text track cue. We set the event type by specifying the id attribute. We currently implement three types (an example follows the list):
- Event - Generic Event (e.g. video end)
- OrientationUpdate - Orientation Update (currently using the X axis)
- LocationUpdate - Location Update (with the format {"lat": 123, "lng": 123})
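A minimal sketch of creating such a cue with the standard WebVTT API (timing and coordinate values are illustrative):

var video = document.querySelector('video');
var track = video.addTextTrack('metadata');   // hidden track; cue events still fire
var cue = new VTTCue(12.0, 12.1, JSON.stringify({lat: 48.83, lng: 2.35}));
cue.id = 'LocationUpdate';                    // event type carried in the id attribute
cue.onenter = function () {
  var pos = JSON.parse(this.text);            // e.g. move the corresponding map marker
};
track.addCue(cue);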
Known Issues
[1] MP4Box does not play nice when generating the mpd of the files. More specifically, the "mimeType" and "codecs" attributes of the mpds are extremely unreliable. It is recommended to completely delete the "codecs" attribute and change the "mimeType" to mimeType="video/mp4".
[2] Even though the official GPAC blog recommends using the "-rap" option when creating files for DASH with MP4Box, I strongly suggest omitting it, since it can misalign the timing of the MSE.
[3] For this demo, we are using non-aligned segments. This is a non-standardized edge case, but it is the only way to seamlessly switch between views. Chrome handles its buffers as expected, but Firefox keeps the audio of all fetched segments, even when newer ones have arrived, thus occasionally switching the video before the audio.
[4] The demo does not work on Microsoft Edge, because we are using VTTCues for the marker updates, which are not supported by Edge.
If an issue is not mentioned here, you can either contact us, or submit a New Issue.