tud-hri/joan

[JOSS Review] Documentation & Functionality


Main review issue: openjournals/joss-reviews#4250

First of all, I want to say thank you for your efforts in developing and sharing this code. As someone who has worked with driving simulators in human factors research, I know that good open source solutions are extremely rare to come by. Your work is exactly what is needed in the field at the moment, so thank you for your contributions!

Below, I provide feedback on a number of issues specifically related to the repository and documentation themselves; this issue does not assess the software paper (I will create a second issue for that).

Apologies for the length of the review; I thought it was easier to keep everything in one place than to split it across multiple issues. I intend all my feedback to be as constructive as possible, and I am aiming to improve your project and make it accessible to as many individuals as possible!

1. A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?

  • 1.1 Could you provide an updated statement of need in the documentation and the README.md?

I thought the introduction you provide here https://joan.readthedocs.io/en/latest/introduction/ is the best description of JOAN. The GitHub repository README.md was quite vague, and due to the "high level" documentation home page it was difficult for me, as a naive user, to understand what your software did. I suggest including the description from the introduction in the README.md of your repository, as it would make the purpose of your work much clearer.

  • 1.2 Add a description of what Carla doesn't offer that requires JOAN.

I would also like to emphasise an issue that I think underpins most of my comments below. I think you really need a description of what Carla doesn't offer, to make it much clearer what JOAN does. At the moment, I can get an understanding of the features JOAN has, but it's difficult to assess how useful these are without knowing the limitations of Carla. I note you hint at this in the second paragraph of the introduction page, but I suggest expanding on it so that new users can easily understand how it all fits together.

2. Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.

I should note that I was unable to actually install Carla, so please keep that in mind.

  • 2.1 Consider how the build context could be made reproducible or verifiable

I note that you don't have an automated installation solution. I very much appreciate this is in part due to Carla requiring manual installation. However, I would like to ask whether you have any ideas for making it easier to validate the build context for Carla.

You have a lot of warnings about Python versions and versions of Visual Studio on the PATH, and I wondered if a simple script could be produced to check these (a rough sketch of what I mean is below). Again, this is not essential, but I'm wondering whether you had any thoughts in this regard.
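Purely for illustration, a minimal sketch of the kind of check I mean; the required Python version and the executables checked for here are placeholders, not JOAN's or Carla's actual requirements:

```python
# check_build_context.py -- illustrative sketch only; the required Python
# version and the executables checked for are placeholders, not the real
# JOAN/Carla build requirements.
import shutil
import sys


def check_python(required=(3, 7)):
    actual = sys.version_info[:2]
    ok = actual == required
    status = "OK" if ok else f"expected Python {required[0]}.{required[1]}"
    print(f"Python {actual[0]}.{actual[1]}: {status}")
    return ok


def check_on_path(executable, hint):
    found = shutil.which(executable) is not None
    status = "OK" if found else f"missing ({hint})"
    print(f"{executable} on PATH: {status}")
    return found


if __name__ == "__main__":
    results = [
        check_python(),
        # cl.exe on the PATH is a rough proxy for running inside a correctly
        # configured Visual Studio developer command prompt.
        check_on_path("cl", "run from a Visual Studio developer command prompt"),
        check_on_path("cmake", "install CMake and add it to the PATH"),
    ]
    sys.exit(0 if all(results) else 1)
```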

  • 2.2 Provide, on the relevant page, a single list of all dependencies that will be installed to build Carla.

I note that the dependencies are listed gradually, making it hard to know up-front what will be installed.

  • 2.3 Host all dependent files on a more stable long-term archive than Dropbox

The instructions in the compilation pages depend on externally hosted URLs: https://www.dropbox.com/s/6v35q307dosin55/120222_JOAN_Assets.zip?dl=0. I appreciate that large asset files don't belong on GitHub. However, it would be more appropriate to use a service like FigShare (https://figshare.com/), where the materials are hosted on an archived, long-term storage solution with citable DOIs. This is particularly important for anything that is crucial to the installation process.

  • 2.4 Improve clarity for why users are performing particular installation steps.

Currently, the documentation often doesn't describe why a user is being instructed to follow a set of steps. For example, on the Windows Carla page https://joan.readthedocs.io/en/latest/setup-carla-windows/, I was initially confused as to why a user couldn't download the pre-compiled binary. It would be good to have a statement there like:

"In order to use the UE Editor which is required to run JOAN experiments, you must compile Carla, as the binary release does not contain the editor. Furthermore, manually compiling ensures you can install all JOAN assets in the binary."

Similarly, it's not stated anywhere on the page that "Unreal Engine 4.24 is required in order to build Carla"; the user is just prompted to install it. These kinds of descriptions clarify why actions are being taken.

In summary, what I am suggesting is to look over the installation documentation and ensure that, for each step, you clearly state why the user is about to do it and foreshadow what is ahead.

  • 2.5 State where the .egg should be placed.

Could you clarify where the .egg file should be placed? The instruction "Include the CARLA .egg file in the JOAN project" does not state where it goes.
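For context, the usual pattern in CARLA projects (this is the generic CARLA convention, not necessarily JOAN's exact mechanism, and the directory name below is a placeholder) is to put the .egg on sys.path before importing carla, which is why the exact location matters:

```python
# Illustrative only: the common CARLA pattern of adding the .egg to
# sys.path before "import carla". The directory name is a placeholder,
# not JOAN's actual project layout.
import glob
import sys

eggs = glob.glob("carla_dist/carla-*.egg")  # placeholder location
if eggs:
    sys.path.append(eggs[0])

import carla  # fails if the .egg is not on sys.path (or carla not installed)
```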

3. Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).

  • 3.1 Perhaps provide a more complete example of the software

Excellent examples and animated gifs are provided in the documentation. I think users will appreciate the clarity of instructions provided for the software.

I did wonder, though, whether a more complete example of the software and its use could be provided somewhere? That is, almost like a brief methods section describing a complete example use case of JOAN. This should address questions like: what variables were captured and why, what input devices were used, and how the map was developed. Then, connect these details with the features of JOAN. This kind of information can help an individual form a more complete mental model of your software (and hopefully make them more likely to use and cite it).

As I noted below, I expected the documentation for JOAN under "First Steps" to cover this kind of introductory description, but the information on the page was instead highly technical (https://joan.readthedocs.io/en/latest/firststeps-joan-overview/). Perhaps it is here that you could introduce the kind of example I am talking about, shifting the technical overview information elsewhere?

To be clear, I don't think this needs to be very lengthy, but it might help a lot in attracting new users.

4. Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?

  • 4.1 Clarify documentation

Perhaps what is missing overall is a better description of what Carla would look like without JOAN.

In my view, it was quite difficult to ascertain what exactly the purpose of each JOAN module was. From a technical/developer perspective, I could see a lot of detail was placed into describing the API: https://joan.readthedocs.io/en/latest/firststeps-joan-overview/. However, for the average or naive user, it is unclear exactly what the modules are used for or why they are needed. For example, looking at the data recorder module https://joan.readthedocs.io/en/latest/modules-datarecorder/, I was unsure what exactly each variable you could select would record. What would the data plotter be used for in terms of experiments?

Take, for example, the data recorder module. Perhaps you could extend its description along the lines of:

"The data recorder module is used to record the data from an experiment run in JOAN (linked to Carla). Data available for capture include: Driver interaction behaviour (e.g., steering wheel, braking, throttle), etc... The data recorder module enables you to export varibales to a CSV document, and is essential if you are planning on using the [other modules] or [purpose]."
In addition to the other notes you have there.
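To make the payoff concrete, the documentation could even include a tiny analysis snippet showing what a researcher does with the exported CSV; the file name and column names here are purely hypothetical, just to illustrate the idea:

```python
# Hypothetical example of analysing a JOAN data-recorder CSV export.
# The file name and column names are made up for illustration only.
import pandas as pd

log = pd.read_csv("experiment_run_01.csv")

# e.g., summarise driver steering and throttle behaviour for this run
print(log["steering_angle"].describe())
print("mean throttle:", log["throttle"].mean())
```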

  • There are some features that are a bit difficult to understand, such as "storing trajectories with the data recorder".

5. Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?

I note that you provide some example usage and manual tests here: https://joan.readthedocs.io/en/latest/example-usage-and-testing/

  • As these are not automated tests, I was curious how one might validate that the outputs from a configuration match an expected value. That is, if there were any errors between the steering wheel input and what ends up in the CSV, how would this be checked? I'm not necessarily suggesting any changes here, but wondered whether you had any thoughts on this (a sketch of the kind of check I mean follows this list).

  • I recommend that you remove the dependence on the Dropbox-hosted test results.
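Purely for illustration, the kind of check I had in mind would be a small regression test comparing a fresh recording against a reference recording within some tolerance; the file and column names here are hypothetical:

```python
# Illustrative sketch of a tolerance-based regression check, not a proposed
# JOAN test suite. File and column names are hypothetical.
import pandas as pd


def steering_matches_reference(recorded_csv, reference_csv, tolerance=0.01):
    """Return True if recorded steering angles stay within tolerance of the reference."""
    recorded = pd.read_csv(recorded_csv)
    reference = pd.read_csv(reference_csv)
    diff = (recorded["steering_angle"] - reference["steering_angle"]).abs()
    return bool((diff <= tolerance).all())


if __name__ == "__main__":
    assert steering_matches_reference("recorded_run.csv", "reference_run.csv")
```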

6. Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

  • 6.1. I suggest the project could do with some added information related to community guidelines. You can check the status of your project here for a starting point: https://github.com/tud-hri/joan/community. I note that, while you seem to have issue templates in the repo, they don't appear for me when I click "New issue", so I would check that their configuration is correct.

  • 6.2. In particular, a contributing file can help prospective collaborators identify how they can contribute to your software. If you need some ideas on contributing guidelines, the team behind fMRIPrep have a very comprehensive set: https://github.com/poldracklab/fmriprep/blob/master/CONTRIBUTING.md

7. Minor Assorted Issues

  • 7.1. What does JOAN actually stand for? It might be good to include this in the paper or repository.
  • 7.2. "Depending on whether you want to use the CARLA simulation environment with JOAN, the CARLA interface module is one of the core modules."
    Depending on whether you want to use the CARLA simulation environment with JOAN, the CARLA interface module is one of the core modules. If you want to do
    . I found this somewhat confusing, as I wasn't sure how else JOAN would work.
  • 7.3. I note that the documentation has a few typos in it, e.g., "Ofcourse" on https://joan.readthedocs.io/en/latest/modules-hardwaremanager/. It could be worth running the markdown files through a spell checker.

Again, I thank the authors for their hard work on an important package. I hope you find this review process constructive, and I hope I can use your package one day in the future.

Hi @humanfactors,

Thanks for your comprehensive review, we very much appreciate it! We are working on your comments and will reply here on a rolling basis :-)

We've updated the statement of need in README.md and the documentation introduction (comments 1.1, 1.2), better indicating which 'gaps' in Carla JOAN addresses.

Hi @humanfactors,
As @niekbeckers already said, thanks for the many useful and constructive comments! By now, we have addressed most of them. I have tried to address and explain the last three open checkboxes below. Let me know if you agree or have any other questions.

The Build Context

We agree that building Carla is a tedious process with a lot of warnings and conditions. The problem here is that we don't fully know ourselves to what extent these specific versions of the dependencies really matter. The instructions to build Carla are heavily based on Carla's own documentation, and rather than developing extra tools for building Carla, we prefer to link users to that documentation.

Example of usage

I have updated the introduction page in the documentation to give more detail on how we think JOAN can best be used. I've also added a link to a research paper that actually used JOAN. Is that what you had in mind?

Tests

The testing is a difficult part because we are trying to validate whether a human-in-the-loop experiment is working as intended, so comparing (human input) values to a pre-recorded CSV is not going to be of any help. What we are actually testing here boils down to three things:

  • Does the connection with Carla work?
  • Does the hardware input work?
  • Does JOAN’s underlying structure (with settings, news, and a bunch of separate Python processes) work?

If any of these things fail, the user will get an exception or clearly unexpected behavior (e.g., if the human input steering angles are larger than expected). So, we think comparing values is not really necessary to test if JOAN works as expected.

Thanks again for the extensive review of these topics!

A few quick follow-ups:

Otherwise, the changes look good!

Hi @humanfactors, thanks for spotting these - we've converted all remaining Dropbox links to the 4TU data storage links and fixed the note overrun.

Thanks, all my concerns from there are resolved then :)