ImageMonkey/imagemonkey-core

Add ImageMonkey integration to existing annotation tool 'labelme'?

bbernhard opened this issue · 15 comments

Over the past couple of weeks I've been evaluating whether it would be possible to integrate the ImageMonkey backend into the labelme annotation tool. The idea is to use the existing annotation functionality of labelme, but instead of loading the image and the annotation data directly from disk, the data is loaded from the ImageMonkey web service.

Originally I had planned to rework the application so that it's possible to choose between different backends (filesystem and ImageMonkey) in the labelme application. That way it would have been possible to merge the changes upstream and let the original maintainer handle the labeling/annotation part of the application. Unfortunately there isn't a clear separation between the filesystem backend and the business logic, so there's no easy way to swap out the filesystem backend for the ImageMonkey backend without some major refactoring. As this refactoring results in a huge diff (I already moved thousands of lines of code and added a bunch of abstraction layers for my PoC), I think it's unlikely that the original maintainer would accept it.
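To illustrate the seam I am trying to carve out, here's a rough sketch of the backend abstraction I have in mind (all class and method names are made up for illustration - this is not actual labelme code, and the ImageMonkey endpoints are hypothetical placeholders):

```python
# hedged sketch of a backend seam for labelme; names are invented,
# and the ImageMonkey API endpoints below are placeholders
import json
import os
from abc import ABC, abstractmethod

import requests

class AnnotationBackend(ABC):
    @abstractmethod
    def load_image(self, identifier: str) -> bytes: ...

    @abstractmethod
    def load_annotations(self, identifier: str) -> dict: ...

    @abstractmethod
    def save_annotations(self, identifier: str, data: dict) -> None: ...

class FilesystemBackend(AnnotationBackend):
    """Roughly what labelme does today: a JSON file next to each image."""

    def load_image(self, identifier):
        with open(identifier, "rb") as f:
            return f.read()

    def load_annotations(self, identifier):
        path = os.path.splitext(identifier)[0] + ".json"
        if not os.path.exists(path):
            return {}
        with open(path) as f:
            return json.load(f)

    def save_annotations(self, identifier, data):
        path = os.path.splitext(identifier)[0] + ".json"
        with open(path, "w") as f:
            json.dump(data, f, indent=2)

class ImageMonkeyBackend(AnnotationBackend):
    """Same interface, but backed by the web service (hypothetical URLs)."""

    BASE = "https://api.imagemonkey.io/v1"  # placeholder base URL

    def load_image(self, identifier):
        return requests.get(f"{self.BASE}/images/{identifier}").content

    def load_annotations(self, identifier):
        return requests.get(f"{self.BASE}/images/{identifier}/annotations").json()

    def save_annotations(self, identifier, data):
        requests.post(f"{self.BASE}/images/{identifier}/annotations", json=data)
```

With a seam like that, the rest of the application would only ever talk to an AnnotationBackend and wouldn't care where the data lives.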

So, I think if we want to go that way we would have to create a "hard fork" of the project and maintain the application ourselves (maybe we can cherry-pick some bug fixes from the original project). But I don't think that would be such a big problem, as labelme seems to be quite mature.

After spending some weeks with the source code, I think it should be possible to add an ImageMonkey backend to labelme. My PoC is already capable of loading images from the web. The next step is to fix all the stuff I broke during the refactoring (as it's a Python application, that's mostly "run the application", "press some buttons", "check where the application segfaults" & "fix the crash"). After that, the next (last?) big thing would be to load/persist the labels & annotations from/to the ImageMonkey service. At that point we should have a simple alternative to the unified mode (it lacks the properties system & browse functionality, but those could be added later).

For me the big question is now: Should I invest more time into that or is it not worth it?

@dobkeratops in case you have a minute to spare, I would really appreciate it if you could give labelme a try and play with it a bit. I would be really interested to hear what you think of it.

At the moment I am pretty unhappy with the code quality of the unified mode. The unified mode has definitely grown over the past years, and that's visible in the source code (i.e. a lot of spaghetti code; hard to extend & maintain; a bit slow when annotations with a lot of poly points are loaded; etc.). So, in the long run I would like to either replace/extend it with a more powerful annotation tool (targeted at power users) or completely rework the Web UI.

So you’ve tried to fork the labelme desktop tool to work with ImageMonkey; that does sound like a neat idea.

I’ve used labelme's web interface only - I’ll try and give their desktop version a go for comparison.

The best thing about it is the polygons being in a tree, e.g. parent = whole car outline, children = wheel, headlight, etc. (saves on repeating the parent label per part, and directly associates the parts).

As for how much effort you should put into it -

What I like about ImageMonkey is that the data is there online, and anyone can just contribute instantly... no need to download or install anything.

You can always interoperate with other services and tools by importing/exporting data in their formats - e.g. I do think an “ImageMonkey to labelme format” export option would be useful.

If you submitted ImageMonkey support as a fork of labelme (short of them accepting a pull request), might that go some way towards helping people discover your service?

Another idea for “different input to the ImageMonkey database” would be bitmap data - a way to associate a colour-coded overlay image. Any paint program with layers can be used as an annotation tool (GIMP, Photoshop, iPad “Procreate” with its pen)... a web tool could manage the colour coding (verify the colour mappings when you upload, as in the sketch below) and the associated files. Colour-coded annotations must be mutually exclusive, of course.
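A minimal sketch of what that upload-time verification could look like (the palette mapping here is a hypothetical example, and Pillow is assumed):

```python
# sketch: check a colour-coded overlay against a declared palette before
# accepting it; the colour -> label mapping below is a made-up example
from PIL import Image

PALETTE = {(255, 0, 0): "car", (0, 255, 0): "tree"}

def unknown_colours(overlay_path):
    img = Image.open(overlay_path).convert("RGB")
    # getcolors returns (count, colour) pairs; maxcolors is set high
    # enough that it never gives up and returns None for an RGB image
    colours = {c for _, c in img.getcolors(maxcolors=1 << 24)}
    # black is treated as "unlabelled" background here (an assumption)
    return colours - set(PALETTE) - {(0, 0, 0)}

# a non-empty result means the upload should be rejected, or the user
# asked to map the extra colours to labels
```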

I would personally rather complement imagemonkey with bitmap data versus using another annotation tool - because I have the pen device.

But I can independently look at converters for that sort of thing myself. I have focussed on ImageMonkey annotating because I like the data being in that publicly accessible, extendable form, and I can get to it from any web-connected device.

> I’ve used labelme's web interface only - I’ll try and give their desktop version a go for comparison.

I think the tool and the site do not have anything in common besides the name (at least that's my impression from the GitHub page). So the labelme I've linked to is actually just an offline annotation tool: it allows you to load images from the filesystem and annotate them. The annotations are stored alongside the image in a separate JSON file.
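For reference, such a file looks roughly like this (built here from Python; the field names match what recent labelme versions write, as far as I can tell, while the values are made up):

```python
# illustrative only: a minimal labelme-style annotation file
import json

annotation = {
    "version": "5.0.1",   # labelme version that wrote the file
    "flags": {},
    "shapes": [
        {
            "label": "car",
            "points": [[12.0, 34.0], [120.0, 34.0],
                       [120.0, 200.0], [12.0, 200.0]],
            "group_id": None,
            "shape_type": "polygon",
            "flags": {},
        }
    ],
    "imagePath": "example.jpg",
    "imageData": None,    # optionally the base64-encoded image itself
    "imageHeight": 480,
    "imageWidth": 640,
}

with open("example.json", "w") as f:
    json.dump(annotation, f, indent=2)
```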

What I am actually looking for is a way to offload the "boring part" (i.e. the actual drawing functionality: drawing polygons and rectangles, zooming in/out, moving poly points, copying/pasting polygons, etc.). I think designing a good annotation software/framework is almost a complete project on its own, so offloading that would help me focus more on the service aspects of ImageMonkey.

My "problem" is also, that I do not know Javascript/CSS/HTML well enough to be really productive with it. So, although I have some great ideas in mind (e.g I would really like to see a Photoshop like application where you can individually arrange the different toolboxes in order to optimize your workflow), I am often lacking the technical skills to implement that. So, I was thinking that maybe there are already some annotation tools out there that I could fork and build upon (instead of reinventing the wheels)

But I totally understand that offline applications aren't as accessible as online services (maybe it's possible to compile the desktop application to WebAssembly and make it accessible that way, but I haven't tried that yet). I also wouldn't see the offline application as a replacement for the Web UI, but rather as an extension for power users. So, in case you aren't satisfied with the unified mode because it doesn't give you enough flexibility, you could install the application and work on the ImageMonkey dataset that way.

At the moment it's just an idea - I am myself not really sure whether it's worth it or not. I think a native desktop application could have some benefits (probably faster than a web application; easier to extend, as one doesn't need to support that many devices and browsers; easier to customize via plugins & scripts; etc.), but it also has some drawbacks (external software dependencies, needs an installation, etc.). So, in case labelme is just slightly better than the existing unified mode, it's probably not worth adding an ImageMonkey integration. But in case the experience is much better than with the unified mode, I guess a port could be worth it.

Right now I haven't invested that much time into the labelme fork, so in case it turns out that an ImageMonkey integration isn't worth it, I can easily drop it and look into some other (web-based) alternatives.

Makes sense... you’re right that this kind of interactive geometric manipulation is indeed a project in its own right. But I think ImageMonkey does well by having a tool that’s “good enough” integrated.

I agree you should be able to get the best of both worlds by exchanging data with other applications. Perhaps you could reduce the UI burden in ImageMonkey by consolidating the features you ended up with, and focus on labelme interop instead of the idea of an alternative “next gen” interface.

As you know, I had that attempt at a JS labelling tool myself, but it wasn’t integrated with anything. Perhaps under this direction you’d just have a protocol, and someone like me could set up their own custom labelling UI, or use the labelme desktop tool as suggested.

There’s also more that can be done with visualising and exploring the data you’ve accumulated... it would definitely be worth opening it up for that. In light of the stories about error-ridden databases, good “explore + verify” would be useful.

OK, I have installed and trialled desktop labelme.
First impressions -
advantages:

  • well known, so it probably has more existing users (it's installable via pip and via apt on Ubuntu 19)
  • shows all the labels colour-coded, with individual visibility toggles
  • persistent common label palette kept between images (really great for street parts, person parts etc.)
  • you can use all the desktop tools for browsing and organising images on your hard drive; you can just drag images straight in, open a folder, etc.
  • the annotations are just there on your HD in JSON format
  • whilst drawing, it shows the current edge following the cursor before you click, which helps with approximating boundaries
  • the common format might already have training workflows set up in neural-net libraries
  • zoom works a bit better, i.e. drag-handle sizing (but still not perfect - no mouse-wheel zoom-to-point...)
  • can rename polygons after you draw them

disadvantages:

  • offline, so less motivating - when you work in ImageMonkey the data is instantly available; online gives you a greater sense of collaboration (so that's what your integration experiment would fix)
  • ImageMonkey kind of lets you tag images with the label lists first. I don't know where I'd put a “scene label” here; I guess I could just make a full-screen poly. I like the fact that ImageMonkey's “task list” doubles up as “image tags”, with un-annotatable labels (but you could certainly use directories as mutually exclusive scene groupings). It is useful to be able to just list all the objects even if you don't have time to annotate all of them.
  • ImageMonkey has its “curated label list”; labelme is 100% free labelling.
  • less suitable for casual users (you must download, install, and figure out your own file organisation)
  • without the tree working, a long label list does get a bit complex (needs a hide/show-all toggle)
  • does crash a bit (“attempt to open image folder”, “attempt to drag poly to make tree”, “delete polygons”). I've saved a couple of JSON files and one has a weird line padding the file to 4 MB for 10 polys; these files are in turn causing text editors (mousepad and gedit) to lock up, lol. EDIT: OK, this is just the whole image embedded in base64 format - I guess that saves figuring out file locations. There's an option to disable it (see the sketch after this list).
  • whilst the site has an “iPhone” version, I haven't found iPad support... and I rate the iPad + pen as the best annotating experience: easy to carry around and use casually, whilst the pen + screen is also very precise (I will look around... 1 million apps, maybe someone made something similar already...)
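(For anyone hitting the same 4 MB files: something like this should strip the embedded copy again - a quick untested sketch, assuming the standard labelme fields:)

```python
# strip the embedded base64 image from a labelme JSON file; labelme can
# re-read the image via "imagePath", so nothing is lost
import json
import sys

with open(sys.argv[1]) as f:
    doc = json.load(f)

doc["imageData"] = None  # this field holds the base64 payload

with open(sys.argv[1], "w") as f:
    json.dump(doc, f, indent=2)
```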

open questions:

  • I liked their web tool's “polygon tree”. So far any attempt to drag one poly onto another crashes it... but I think I've seen screenshots of that working.

So, to summarise:
If you wanted labelme interop instead of any “next-gen ImageMonkey UI”, that would work fine. It'll free you up from a certain type of UI work.

I’d certainly adapt to using this and submitting JSON files... I guess you could allow dragging them into your upload page?

Maybe you can adapt your site's export to use their JSON format? It’s nicer than the XML files I saw from their web tool. It looks very similar to the info you show in the explorer view.

I could see myself using both.

I have always been a fan of the online approach, which is what motivated me to keep doing a few annotations regularly in this public format... the fact it’s “live” seems more useful and inviting than just, say, dumping annotations in a GitHub repo.

Perhaps just supporting labelme JSON format upload and download gives you interop without needing to fork their tool? (You could link to labelme on the upload page and explain how to share.)

Many thanks for testing - very much appreciated!

> I’d certainly adapt to using this and submitting JSON files... I guess you could allow dragging them into your upload page?

But wouldn't that be pretty complicated? In order to annotate something that's already in the dataset, you would first need to download the image, open it in the labelme application, do your annotations and then upload the JSON file again.

My idea would be to really integrate that transparently into the labelme fork. So, when you open the labelme fork, it automatically loads a random image, together with all the labels and annotations from the ImageMonkey dataset, and renders it in the application. You do your annotations, and when you press the "Save" button the data is pushed back to the ImageMonkey service and the next image is loaded (so you are really working on "live data"; a rough sketch follows below). In a next iteration we could even add some additional features like:

  • auto-completion for labels (fetch all the existing labels from the ImageMonkey service and use them for auto-completion/label suggestions)
  • build a browse-based mode into the application (similar to the one on the website)

And when you annotate something via the labelme fork, it would still show up in the "activity chart" on the front page.
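Roughly, the client side of that loop could look like this (every endpoint below is a hypothetical placeholder, not the actual ImageMonkey API):

```python
# hedged sketch of the "live" workflow; URLs are made up
import requests

API = "https://api.imagemonkey.io/v1"  # placeholder base URL

def next_task():
    """Fetch a random image plus its existing labels/annotations."""
    meta = requests.get(f"{API}/images/random").json()        # hypothetical
    image = requests.get(f"{API}/images/{meta['uuid']}").content
    return meta, image

def push_annotations(uuid, shapes):
    """Called from the Save handler; shapes come from the labelme canvas."""
    requests.post(f"{API}/images/{uuid}/annotations", json={"shapes": shapes})
    # ...then call next_task() again, so the user keeps working on live data

def label_suggestions(prefix):
    """Feed the service's label list into labelme's auto-completion."""
    labels = requests.get(f"{API}/labels").json()             # hypothetical
    return [label for label in labels if label.startswith(prefix)]
```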

I agree with you that the current labeling and annotation tools are good enough for now, so there's no immediate pressure to replace/extend the existing solution. But the last time I tried to add a bigger feature (the "limb system") it was a real pain in the a** to work with the existing code base. It's the typical spaghetti code - grown over the years, no real structure, hard to extend and easy to break.

So, in the long run I would really like to slowly build up a replacement and put the unified mode into "maintenance mode" (i.e. just fix bugs, but not add any more features - at least not big ones). I guess we have several options here:

  • integrate into existing tools (either the tool you've been working on or some other open source tool)
  • create everything from scratch with "lessons learned"
  • clean up/refactor the existing unified mode codebase and continue with that one (we probably also need to check whether the existing approach scales; I've seen some big polygons with a lot of poly points which already slow down the browser a bit; not sure yet whether this is a problem with the drawing library (Fabric.js) or just slow rendering in the browser - see the sketch after this list)
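If the slowdown turns out to be raw point count, decimating polygons before rendering might already help. Here's the classic Ramer-Douglas-Peucker simplification, sketched in Python purely for illustration (the real fix would of course live in the JS frontend):

```python
# Ramer-Douglas-Peucker polyline simplification (illustrative sketch)
import math

def rdp(points, epsilon):
    """Drop points that deviate less than epsilon from the overall shape."""
    if len(points) < 3:
        return points
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy) or 1e-12
    # perpendicular distance of every interior point to the end chord
    dists = [abs(dy * (x - x1) - dx * (y - y1)) / norm
             for x, y in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] > epsilon:
        # split at the farthest point and simplify both halves
        return rdp(points[:i + 1], epsilon)[:-1] + rdp(points[i:], epsilon)
    return [points[0], points[-1]]

# e.g. rdp([(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6)], 1.0)
# collapses the nearly-straight first segment to its endpoints
```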

Could just use labelme for new images in the immediate future: scrape new photos, annotate, upload them with annotations. Downloading and having a local copy of the data would be handy. But you're right... integrating with the service would give you the best of all worlds.

Also, if you have duplicate image detection, you could just absorb new submissions (although this would admittedly waste upload bandwidth).
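(A perceptual hash would probably do for that - e.g. with the Python imagehash library. Just an illustration; I have no idea what ImageMonkey actually uses internally:)

```python
# sketch: near-duplicate detection via perceptual hashing; a small
# Hamming distance between pHashes survives re-encoding and resizing
from PIL import Image
import imagehash

def is_duplicate(candidate_path, known_hashes, max_distance=4):
    h = imagehash.phash(Image.open(candidate_path))
    # imagehash overloads "-" to return the Hamming distance
    return any(h - known <= max_distance for known in known_hashes)
```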

Another idea...
Imagine a daemon which could just sync a directory with the ImageMonkey database (a minimal sketch follows this list):

  • optionally grab some subset of the ImageMonkey DB, formatted as labelme JSON plus directories of images (collection = dirname, if those are mutually exclusive). The whole thing would be under 50 GB? Allow filling it via a certain search query?
  • then, whenever you save labelme work in this directory, the daemon uploads it to ImageMonkey
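Something like this, maybe (using the Python watchdog library; the upload endpoint is a made-up placeholder):

```python
# minimal sketch of the sync-daemon idea (pip install watchdog);
# the upload endpoint is hypothetical
import json
import time

import requests
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

API = "https://api.imagemonkey.io/v1"  # placeholder

class LabelmeUploadHandler(FileSystemEventHandler):
    def on_modified(self, event):
        # only react to labelme's saved annotation files
        if event.is_directory or not event.src_path.endswith(".json"):
            return
        with open(event.src_path) as f:
            doc = json.load(f)
        # push the freshly saved annotations to the service
        requests.post(f"{API}/uploads/labelme", json=doc)

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(LabelmeUploadHandler(), "imagemonkey_sync", recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
```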

advantages:

  • doesn’t need maintenance of a labelme fork; existing labelme users can just run this tool
  • having a local copy of the DB would be great for browsing (use the labelme rendering of multiple labels)

disadvantages:

  • installing and running this daemon process itself might be off-putting configuration work (daemons sound scary to casual users)
  • would it be more error-prone? E.g. if the user accidentally copies nonsense into the directory?

“Might as well just make the forked labelme tool do this” / “it’s a migration path, an easy way to test out how this will all work”

Personally, I think I’d still go for manual labelme JSON upload to the site as the first step.

That's an interesting idea!

We could go even one step further and write our own FUSE filesystem. Someone has implemented a filesystem for Twitter here: https://github.com/guilload/twitterfs - we could do something similar for ImageMonkey.
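A very rough sketch of how that could look with fusepy (all endpoints are hypothetical, pagination and error handling omitted):

```python
# read-only FUSE sketch (pip install fusepy); API URLs are placeholders
import errno
import stat

import requests
from fuse import FUSE, FuseOSError, Operations

API = "https://api.imagemonkey.io/v1"  # placeholder

class ImageMonkeyFS(Operations):
    def __init__(self):
        self.cache = {}  # image uuid -> raw bytes

    def readdir(self, path, fh):
        uuids = requests.get(f"{API}/images").json()  # hypothetical listing
        return [".", ".."] + [f"{u}.jpg" for u in uuids]

    def getattr(self, path, fh=None):
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
        return {"st_mode": stat.S_IFREG | 0o444, "st_nlink": 1,
                "st_size": len(self._fetch(path))}

    def read(self, path, size, offset, fh):
        return self._fetch(path)[offset:offset + size]

    def _fetch(self, path):
        uuid = path.lstrip("/").rsplit(".", 1)[0]
        if uuid not in self.cache:
            r = requests.get(f"{API}/images/{uuid}")  # hypothetical
            if r.status_code != 200:
                raise FuseOSError(errno.ENOENT)
            self.cache[uuid] = r.content
        return self.cache[uuid]

if __name__ == "__main__":
    FUSE(ImageMonkeyFS(), "/mnt/imagemonkey", foreground=True, ro=True)
```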

But I am not sure that would be a pleasant experience. The user would then have 100k images, each with a uuidv4 as filename, lying around in a folder. For every labeled/annotated image there would be a JSON file containing all the labels and annotations. One of the biggest problems is probably finding the images that need work. I guess you could grep the JSON files, sort them somehow, and then open the images individually in the labelme application, but I think that would get pretty annoying fast.

Another option would be, as you already suggested, to limit its scope and only allow the upload of new contributions (so that we don't sync changes back from the service to the filesystem). But I think that kills the collaboration spirit a bit.

> would it be more error-prone? E.g. if the user accidentally copies nonsense into the directory?

That's also a good point. I am also not sure whether labelme guarantees backwards compatibility of their JSON format. So we have to be careful that they don't break the JSON format at some point, leading to a bunch of corrupt file uploads.
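One cheap mitigation (sketched here, untested): validate the handful of fields we rely on before accepting an upload, so a format change fails loudly instead of silently corrupting the dataset.

```python
# hedged sketch: sanity-check the labelme JSON fields we depend on;
# field names follow recent labelme versions
def validate_labelme_json(doc: dict) -> list:
    """Return a list of problems; empty means the upload looks sane."""
    errors = []
    for key in ("shapes", "imageWidth", "imageHeight"):
        if key not in doc:
            errors.append("missing top-level key: %s" % key)
    for i, shape in enumerate(doc.get("shapes", [])):
        if "label" not in shape or "points" not in shape:
            errors.append("shape %d needs 'label' and 'points'" % i)
        elif not all(isinstance(p, list) and len(p) == 2
                     for p in shape["points"]):
            errors.append("shape %d: points must be [x, y] pairs" % i)
    return errors
```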

After thinking a bit more about it, I think it's also a strategic decision about what we are aiming for. Do we want to create a new labeling/annotation tool primarily for our own use case, i.e. ImageMonkey, or do we want to attract new contributors? For the latter, a sync daemon/FUSE filesystem implementation might do the trick. But to be honest, I am not sure I would trust some random guy on the internet to implement a proper file-sync mechanism. I would be a bit worried that, due to a bug or my own mistake, personal images (or other personal data) get uploaded that weren't supposed to be.

Short update: after some back and forth, I decided to rewrite the unified mode from scratch. Instead of using Semantic UI (which isn't maintained anymore) and jQuery as the frontend framework, I am now experimenting with Vue.js and Tailwind CSS. The first version will mostly be a rewrite of the existing functionality - I do not want to add any new features at this point. Until the new version is mature enough, both UIs will coexist - that makes it possible to fall back to the old UI in case the new one has a bug.

My main goals for the new UI are:

  • easier to maintain and extend
  • larger annotation area; better use of working space

In case you are interested, here's a short preview:

[screenshot: preview of the new UI]

At the moment everything is still in alpha state and a lot of the functionality isn't ported over yet. Any suggestions and improvements are really appreciated!

Looks nice - that screen layout with the toolbar on the side does seem to give more work area.

Short update:
[screenshot: the rewritten unified mode]

I think I am now almost done re-writing the unified mode from scratch. So far I am quite pleased with the result - I think it doesn't look too bad, and the code quality is now much, much better (which is a big win for me). At the moment I am doing some extensive testing and fixing the remaining bugs. :)

looks good..

Been continuing with my 3D experiments (got some character animation going), and doing a wasm port - eventually that will end up online somewhere.
I am itching to try rendering things to train with (I still want this material generator...). There's a lot of scope for deep learning in conjunction with animation systems (pose estimation, and perhaps rendering out a lot of synthetic data with limb labels generated...).

That sounds great - looking forward to hearing/seeing more about that! :)

Short update: a first version of the new UI is now online.
[screenshot: the new UI in production]

I've just pushed a new update to production:

  • Fixed a few small (cosmetic) issues
  • Added the possibility to show all annotations