jsherfey/dnsim

Remove duplicate functions

Opened this issue · 2 comments

(I will only be discussing code in dnsim/matlab/functions here). With the exception of MechanismBrowser.m, which has a LARGE number of GUI-related functions that are independent of the rest of the program, I've tried to identify all the functions currently in DNSIM, both those that have their own file and those that don't. (Yes, I know there're are automated ways to do this but I'm partially doing it for education about the codebase).

I've identified a number of cases where the same function is present, occasionally in slightly different forms, within different files, including sometimes its own file. I.e., there are duplicates of many functions. This is confusing for a number of reasons, like which version is MATLAB using (probably the "closest" one, i.e. the one inside a higher-order function's file instead of the version that is independent), but most importantly it's super confusing for developers.

For a project like this, I think it's generally a good idea to all functions inside a single file, that only contains that function -- with an exception for big GUI "function" definitions, which can include a ton of little callback/click-y functions that only make sense in their own context (MechanismBrowser.m and modeler.m are good examples of this). Even with GUI code segmentation, however, I don't think any two functions should be named the same and do two different things in different contexts. Yes, that is the definition of polymorphism (I think), but due to the fact we are restricted from using classes because one of the goals is to satisfy the MCR MATLAB compiler to make standalone versions (what @jsherfey just told me), and due to the size of the codebase, I don't think we should bring that into the mix. In simpler terms, I think the entire program should just be a bunch of files with single functions in them, with exceptions for complicated GUI behavior.

IMHO this is far easier for developers, and makes no difference to users, and makes it easier for developers to have cause to re-use others' internal functions.

I will go through these duplicated functions below and try to spell out what I think is best if there are differences between the versions, but @jsherfey is really the only person who understands anything about the history of different versions if he can remember. If they are not different, then I think we should just turn them into single files that just hold their own function.

  • the format is
    • function_name (same or different) [solution: do thingy]
      • file_where_function_is.m
      • other_file_where_function_is.m
  • duplicate functions with their own file
    • download_get_models (different) [solution: use standalone version, since is newest of three? only one that has IP address]
      • (standalone version).m
      • MechanismBrowser.m
      • modeler.m
      • modeler.m (download_models) [solution: change name since wrapper for download_get_models?]
      • dnsim.m (downloadmodel) [solution: just change name since just calls modeler]
    • combine_models (and accompanying function combine_models_pair) (different) [solution: use standalone version since seems to have more code / be more developed]
      • (standalone version).m
      • modeler.m
    • load_get_models (same)
      • (standalone version).m
      • modeler.m
      • modeler (load_models).m
      • dnsim (loadmodel).m
    • mech_spec2str (same except for MechanismBrowser) [solution: use standalone version/modeler/write_dnsim_scipt version, since seems to have a couple extra lines more than MechanismBrowser]
      • (standalone version).m
      • MechanismBrowser.m
      • modeler.m
      • write_dnsim_script.m
  • duplicated functions that do NOT currently have their own file
    • dat2str (same)
      • MechanismBrowser.m
      • mech_spec2str.m
      • modeler.m
      • write_dnsim_script.m
    • disperror (same)
      • biosimdriver.m
      • modeler.m
    • standardize_model_spec (different, combine_models newer?) [solution: merge?]
      • modeler.m
      • combine_models.m
    • correct_loadjson_arrstr (seem to be the same) [solution: use get_search_space.m version since has an extra ischar check]
      • get_search_space.m
      • loadspec.m
    • copyfields (different) [solution: use json2spec since seems to have more code?]
      • json2spec.m
      • spec2json.m
    • dat2str` (same)
      • MechanismBrowser.m
      • mech_spec2str.m
      • modeler.m
      • write_dnsim_script.m
    • simulate (modeler) (name confusing with, well, everything) [solution: rename]
    • RunSimStudy (modeler) (name confusing) [solution: rename]

After discussion with @jsherfey, the main GUI code in modeler.m and
MechanismBrowser.m both haven't been developed in a long time, and are also
harder to test because it takes time to click-through and test them. Therefore,
we are going to forego messing with functions in those files until the core
re-architecture is complete.

  • duplicate functions with their own file -- these will be deleted from the non-same-function-name-file and isolated into their own standalone file
    • mech_spec2str (same)
      • (standalone version).m
      • write_dnsim_script.m
  • duplicated functions that do NOT currently have their own file -- these will be extracted and put into their own file
    • dat2str (same)
      • mech_spec2str.m
      • write_dnsim_script.m
    • disperror (same)
      • biosimdriver.m
    • standardize_model_spec
      • combine_models.m
    • correct_loadjson_arrstr (same)
      • get_search_space.m
      • loadspec.m
    • copyfields (different) [solution: use json2spec since seems to have more code?]
      • json2spec.m
      • spec2json.m
    • dat2str (same)
      • mech_spec2str.m
      • write_dnsim_script.m

Update: copyfields in either of its two files are actually NOT identical, but rather they are reflections, since they're doing opposite things in either file. Since other functions can't call functions that are "subfunctions" (i.e. defined after a parent function in that parent function's file), no one else can call them, so we are leaving that alone too.

Also, I forgot to mention that the get_search_space.m version of correct_loadjson_arrstr has an additional logical && ischar(this) check, so I'm merging that into the single correct_loadjson_arrstr version. Otherwise, everything is proceeding as in the previous comment.