WeblateOrg/weblate

Allow personal email addresses to be omitted in commits

pappasadrian opened this issue · 31 comments

Is your feature request related to a problem? If so, please describe.

Email addresses used by translators are included in the commit files, with no option to opt out. This leaks personal information publicly by including the addresses in public code. This is not respectful towards translators who contribute, and is potentially illegal in some jurisdictions (especially considering that no specific consent is given: everyone is opted in by default, with no option to opt out).

Describe the solution you'd like

Add an option allowing translators' email addresses to be left out of commits.

Describe alternatives you've considered
Using a fake email is not an option, as a valid email is required for other features of Weblate.
Using a throwaway/single-use email is also not very viable, as it is cumbersome for users, and it is not an obvious option for people who don't want their personal information leaked.

Screenshots

Additional context

Relevant issue: #4988

This issue has been added to the backlog. It is not scheduled on the Weblate roadmap, but it eventually might be implemented.

In case you need this feature soon, please consider helping or push it by funding the development.

GitHub gives everyone an email address like bjorn3@users.noreply.github.com to be used as commit author. Alternatively, an option to use one of the secondary email addresses associated with a GitHub account rather than the primary one would be nice. For example, Zulip asks you which of the email addresses associated with your GitHub account to use. Personally, I have my private email address set as the primary email on my GitHub account, but a public email address as the secondary email.

nijel commented

The final solution should look like:

  • Prefer public address from GitHub, done in 90e1011
  • Fetch all verified addresses from GitHub and store them as VerifiedEmail, done in b0cbaa3
  • Add separate selection for email address to use in Git commits

phw commented

@nijel Thanks for getting into this discussion. Could this also be done independently of a user's GitHub profile? Not every user has one. I imagine something similar to how GitHub does it:

  • First have a separate "Committer e-mail" field the user can set, so the e-mail being used for commits is separate from the main communication e-mail. The user can fill in any value they want to have a different e-mail for committing. If they omit this, the primary e-mail address will be used.
  • Have an option "Keep my e-mail address private". Activating this will automatically use some e-mail address in the form of "username@weblate_host", let's say for example "phw@hosted.weblate.org"
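
The two bullets above can be sketched as a small resolution function. This is only an illustration of the proposal: `resolve_commit_email`, its parameters, and the default host value are hypothetical names, not part of Weblate's actual API.

```python
from typing import Optional


def resolve_commit_email(
    username: str,
    primary_email: str,
    commit_email: Optional[str] = None,
    keep_private: bool = False,
    weblate_host: str = "hosted.weblate.org",
) -> str:
    """Pick the address to write into commits (hypothetical helper)."""
    # The privacy option wins: build an address of the form username@weblate_host.
    if keep_private:
        return f"{username}@{weblate_host}"
    # Otherwise use the dedicated commit e-mail if the user set one.
    if commit_email:
        return commit_email
    # Fall back to the primary communication e-mail.
    return primary_email
```

With `keep_private=True`, the user "phw" would get "phw@hosted.weblate.org", matching the example in the second bullet.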

This is how it looks in GitHub:

[image]

I even think this option should be active by default. It's better that users have to activate their real e-mail than to have them discover later that they published their private e-mail address. E.g. Transifex also has an option to keep e-mail private, but it is disabled by default. We had some discussion about this on the MusicBrainz forums, where users were badly surprised that Transifex put their e-mail addresses into public translation files.

nijel commented

There are many questions related to this:

  • Should Weblate allow entering any address as committer e-mail without any verification?
  • Keeping the address private on Weblate side will avoid associating commits on GitHub, that will be surprising for many users if that would be a default. On the other side, I understand that privacy wise, there might be good reasons for that.
  • Keeping the address private requires some setup to avoid problems with the e-mail addresses (make sure these do not end up in some mailbox) - something we can easily do on Hosted Weblate, but on the other side, something we cannot realistically expect from thousands of other Weblate installations. So, this would have to be an opt-in feature enabled in the configuration, which is used only on few installations.

Should Weblate allow entering any address as committer e-mail without any verification?

Probably not.

Keeping the address private on Weblate side will avoid associating commits on GitHub, that will be surprising for many users if that would be a default. On the other side, I understand that privacy wise, there might be good reasons for that.

An option would be to use the GitHub user-specific noreply email as author for the commit and an email address forwarding to the author in the commit message (if you want to introduce any forwarding in the first place).

For me specifically, being able to pick any of the email addresses already verified by GitHub would be fine.

phw commented

@nijel Those are good points. So giving the user the ability to specify a separate commit e-mail from their verified e-mail addresses, plus importing the verified e-mails from GitHub as verified, is probably a good first start.

Having the ability to configure a Weblate instance to provide custom e-mails (maybe on a specified subdomain) would be a great addition I think, but it really complicates things. E.g. ideally the user would still be able to add this fake e-mail to their GitHub account.

Maybe an easier approach would be to allow anonymous commits, where the commits then get associated with a default user + e-mail provided by the Weblate instance (e.g. most setups could use the GitHub account they use for pushing).

This should clearly be an opt-in from the user, with the note that commits will then not be associated with their own GitHub user account. Also, this option should only be available to users if the instance is configured accordingly. Would this work?

UPDATE: On GitLab the user also has the option to separately specify the commit e-mail and the e-mail used for communication, but both need to be from the list of verified e-mails.

I assume 90e1011 is already deployed on Hosted Weblate. Even if the relevant checkboxes are set on GitHub, nothing changes on Weblate; the E-Mail is still exposed.

[image]

I assume it's not possible to get these values from GitHub?

nijel commented

That commit just makes Weblate prefer the public e-mail address if you have it configured on GitHub.

How do you set an e-mail address as public? I can't find any UI for it. The only option I see is to keep everything private or everything public.

nijel commented

When editing profile:

[image]

I see, thanks! Unfortunately it isn't compatible with the option to prevent accidentally leaking any other email addresses. Maybe temporarily unchecking that option while creating a weblate account would work?

nijel commented

Weblate picks up public or primary mail. You can add and verify any e-mail and use that.

Also, I've improved the GitHub integration in b0cbaa3 to fetch all verified e-mails, so that you can choose them without need for additional verification.

Does this allow setting the noreply E-Mail from GitHub now?

I think what's wanted here is to be able to have an invalid E-Mail in the translation files while still being able to receive Mails from Weblate.

nijel commented

No, Weblate needs a working e-mail and doesn't allow different e-mail for commit and notifications. That's why this issue is still open. The desired solution has been outlined in #6508 (comment), patches to implement that are welcome.

I don't know Python or the code base well enough, but do you have any pointers on how I could help out?
(I'm personally interested in getting this issue resolved)

nijel commented

In more detail, what is missing:

  • The VerifiedEmail model needs to track information whether e-mail is suitable for e-mail delivery
  • Extend get_all_user_emails to support filtering based on that, defaulting to deliverable e-mails
  • Add commit_email field to Profile and make it configurable in the settings (utilizing get_all_user_emails) to get valid selection, this time without the filter for deliverable e-mails
  • Adjust get_github_emails to not filter noreply e-mail, but flag it accordingly for storing as VerifiedEmail
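
The steps listed above could look roughly like the following plain-Python sketch. It deliberately avoids Django so that it runs on its own: the `VerifiedEmail` dataclass is a stand-in for the database model, `is_deliverable` and `flag_github_emails` are hypothetical names, and the real `get_all_user_emails` and `get_github_emails` take different arguments.

```python
from dataclasses import dataclass
from typing import List, Set

# Suffix of the GitHub-provided noreply addresses mentioned earlier in the thread.
GITHUB_NOREPLY_SUFFIX = "@users.noreply.github.com"


@dataclass
class VerifiedEmail:
    """Plain-Python stand-in for the database model."""

    email: str
    is_deliverable: bool = True  # hypothetical flag name


def flag_github_emails(addresses: List[str]) -> List[VerifiedEmail]:
    """Keep noreply addresses, but mark them as not deliverable."""
    return [
        VerifiedEmail(email=a, is_deliverable=not a.endswith(GITHUB_NOREPLY_SUFFIX))
        for a in addresses
    ]


def get_all_user_emails(
    stored: List[VerifiedEmail], deliverable_only: bool = True
) -> Set[str]:
    """Return the stored addresses, by default only those that can receive mail."""
    return {v.email for v in stored if v.is_deliverable or not deliverable_only}


stored = flag_github_emails(["me@example.com", "bjorn3@users.noreply.github.com"])
notification_choices = get_all_user_emails(stored)
commit_choices = get_all_user_emails(stored, deliverable_only=False)
```

Here `notification_choices` contains only the deliverable address, while `commit_choices` also offers the noreply address, which is what the commit e-mail selection in the profile would need.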

This issue seems to be a good fit for newbie contributors. You are welcome to contribute to Weblate! Don't hesitate to ask any questions you would have while implementing this.

You can learn about how to get started in our contributors documentation.

Thanks. I'll try to look into the code base, but no guarantees ;-)

The VerifiedEmail model needs to track information whether e-mail is suitable for e-mail delivery

That should be easy with just some field "is_deliverable" added to the place where E-Mails are stored (DB?)

Extend get_all_user_emails to support filtering based on that, defaulting to deliverable e-mails

I think we need to look where the method is called and change the behavior based on a parameter

Add commit_email field to Profile and make it configurable in the settings (utilizing get_all_user_emails) to get valid selection, this time without the filter for deliverable e-mails

I think that's the interesting part. Would we want to add a generic noreply@weblate.org E-Mail here for people not using GitHub but wanting privacy, or would we just disable committing an E-Mail at all (which I doubt is even possible on the git side).

After some thinking, wouldn't the cleanest way be to add another field in the user database containing the commit E-Mail? To me the code I read seems like it was targeted to one E-Mail only - and we need to save a possible commit E-Mail somewhere...

Also, in

    def set_committer(self, name, mail):

for Git, and similarly for the other VCS, we could use a different commit E-Mail.

nijel commented

I think that's the interesting part. Would we want to add a generic noreply@weblate.org E-Mail here for people not using GitHub but wanting privacy, or would we just disable committing an E-Mail at all (which I doubt is even possible on the git side).

There will currently be no option for such users. We definitely do not want to commit all changes without any authorship. And providing such an e-mail for every user needs some setup (see #6508 (comment)). So let's focus on selecting the commit e-mail from those we already have for now.

After some thinking, wouldn't the cleanest way be to add another field in the user database containing the commit E-Mail? To me the code I read seems like it was targeted to one E-Mail only - and we need to save a possible commit E-Mail somewhere...

The committer e-mail is typically noreply@weblate.org (it is configured by https://docs.weblate.org/en/latest/admin/config.html#default-commiter-email), what we want to configure here is author e-mail. That is retrieved here (note that this is currently also used in sending e-mail, so again, this will need to handle correct scope):

    def get_author_name(self) -> str:
        """Return formatted author name with e-mail."""
        return f"{self.get_visible_name()} <{self.email}>"
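
One possible way to handle the scope problem mentioned above is to let the caller state whether the formatted name is meant for a commit or for sending mail. The sketch below uses a stand-in `User` class; the `commit_email` attribute and the `for_commit` parameter are assumptions for illustration, not Weblate's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Stand-in for the real user model; field names are assumptions."""

    full_name: str
    email: str  # address used for notifications
    commit_email: str = ""  # hypothetical: used only for VCS authorship

    def get_visible_name(self) -> str:
        return self.full_name

    def get_author_name(self, for_commit: bool = False) -> str:
        """Return formatted author name with e-mail."""
        # Use the commit address only when asked for it and when one is set.
        use_commit = for_commit and bool(self.commit_email)
        address = self.commit_email if use_commit else self.email
        return f"{self.get_visible_name()} <{address}>"
```

Callers that send e-mail keep the default and continue to get the deliverable address; only the VCS code path would pass `for_commit=True`.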

Ok thanks. So we'd incorporate a separate "no-reply" author E-Mail only for GitHub users for now?
All other E-Mails need to be verified (can send+receive mail).

Ok. I've now looked at the database, and it seems as if we do in fact only save one E-Mail. So we'd need to add a field to weblate_auth_user named e.g. commit_email. Then we'd set it via GUI (with verification if needed) and use this one for all git/svn operations from this user?

I don't seem to understand how the logic in the social login works.

This is what I have tried so far: 69a6e96, still an erroneous draft.

What does VerifiedEmail do? get_all_user_emails doesn't seem to use it as I expected:

I'd expect to have some array of E-Mails or VerifiedEmail objects which I can extend with a new field and modify. However, this doesn't seem to be the case: What does

        emails = set(VerifiedEmail.objects.filter(**kwargs).values_list("email", flat=True))

actually do? Is it a database query? How should we get the data from GitHub there?

nijel commented

VerifiedEmail is a database model. It's a database query using Django ORM. It fetches "email" field from the matching object and creates a set from that.
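
For readers unfamiliar with the Django ORM, the query behaves roughly like the plain-Python filter below. This is an analogy only: the rows and the `user_id` field are made up, the helper `emails_matching` is hypothetical, and it handles simple equality filters rather than the full Django lookup syntax. The real call runs SQL against the VerifiedEmail table.

```python
from typing import Any, Dict, Iterable, Set


def emails_matching(rows: Iterable[Dict[str, Any]], **kwargs: Any) -> Set[str]:
    """Rough analogue of set(objects.filter(**kwargs).values_list("email", flat=True))."""
    return {
        # values_list("email", flat=True): take just the e-mail value of each row
        row["email"]
        for row in rows
        # filter(**kwargs): keep rows where every given field equals the given value
        if all(row.get(field) == value for field, value in kwargs.items())
    }


# Made-up rows standing in for the database table.
rows = [
    {"user_id": 1, "email": "a@example.com"},
    {"user_id": 1, "email": "a@example.com"},  # duplicates collapse in the set
    {"user_id": 1, "email": "b@example.com"},
    {"user_id": 2, "email": "c@example.com"},
]

print(sorted(emails_matching(rows, user_id=1)))  # ['a@example.com', 'b@example.com']
```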

The data needs to be fetched from GitHub during authentication and stored in the database (this already happens, it just needs to be extended to cover the deliverability flag). Then, in the settings, all of the user's verified e-mails are available for selection.

Ok. I think I've made some progress: https://github.com/ann0see/weblate/tree/feature/hideMail

GUI still needs to be done.

Ok. Now I get an error about a missing key after setting the commit E-Mail.

  File "/home/a0/weblate/weblate/accounts/models.py", line 307, in get_message
    message = ACCOUNT_ACTIVITY[activity]
KeyError: 'commit_email'

The database has the key and it is also updated.

nijel commented

The ACCOUNT_ACTIVITY defines verbose messages to notify user when attribute has changed. The audit log entries are created in UserForm.audit.
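
The KeyError above is therefore a missing dictionary entry, not a database problem. A minimal reproduction, assuming a simplified mapping (the keys and message strings below are illustrative and not copied from Weblate):

```python
# Stand-in for the real mapping; entries here are illustrative only.
ACCOUNT_ACTIVITY = {
    "email": "E-mail changed.",
    "full_name": "Full name changed.",
}


def get_message(activity: str) -> str:
    """Look up the verbose message for an audit-log activity."""
    return ACCOUNT_ACTIVITY[activity]


try:
    get_message("commit_email")
except KeyError as error:
    # The audit log references an activity that has no message yet.
    print(f"missing entry: {error}")

# Registering a message for the new attribute resolves the lookup.
ACCOUNT_ACTIVITY["commit_email"] = "Commit e-mail changed."
print(get_message("commit_email"))
```

In other words, adding a new audited profile field also requires adding a matching message for it.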

nijel commented

I've created #8268 so that I can easily comment on the code (and sorry for late response, I built a huge backlog on GitHub and I'm now finally processing that).

Thank you for your report; the issue you have reported has just been fixed.

  • In case you see a problem with the fix, please comment on this issue.
  • In case you see a similar problem, please open a separate issue.
  • If you are happy with the outcome, don’t hesitate to support Weblate by making a donation.

(This feature has already been implemented.)
Next: #8451 : Allow instance owner to set a default template for commit e-mail, hiding user e-mails from commits by default