So do I understand correctly that Google's AI assumes genders based on stereotypes when the Hungarian doesn't specify them?
I am pretty sure that was one of the problems she was researching and writing about when they fired her.
@mur2501 @pinkprius @bhaugen the data is not the only thing that matters. The objective the algorithm is trained on matters too: it's usually something like accuracy, meaning the model is rewarded for choosing the option with the highest prior probability (given that the next word is "working", the leading pronoun is statistically more likely to be male).
That approach breaks down when such models are applied outside of academia (where those metrics are currently everything).
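To make the point above concrete, here is a minimal sketch (toy data, hypothetical 60/40 gender split invented for illustration) of how an accuracy-maximizing objective erases the minority translation entirely: the model always picks the single most frequent target, so a 60/40 split in the training data becomes a 100/0 split in the output.

```python
from collections import Counter

# Hypothetical toy corpus of (gender-neutral Hungarian, English) pairs.
# The 60/40 split is invented purely for illustration.
corpus = (
    [("ő dolgozik", "he is working")] * 60
    + [("ő dolgozik", "she is working")] * 40
)

def most_likely_translation(source, pairs):
    """Return the target with the highest prior probability for `source`.

    This is what a pure accuracy objective rewards: always output the
    most frequent translation, never the alternatives.
    """
    counts = Counter(tgt for src, tgt in pairs if src == source)
    return counts.most_common(1)[0][0]

print(most_likely_translation("ő dolgozik", corpus))  # -> he is working
```

Even though 40% of the corpus says "she is working", an argmax decoder will output it 0% of the time, which is how a statistical tendency in the data turns into an absolute rule in the product.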
@bhaugen part of it may be user-submitted translations? If I remember correctly, Google Translate allows users to contribute translations and confirm others' translations, so the bias may lie with the volunteer contributors.
@pinkprius and I wouldn’t be surprised if the excuse was that the sexism was in the corpus, and it learned it from the corpus
and I also wouldn’t be surprised if that excuse is actually the case
…an example of why you have to be careful with your corpus
@uint8_t @bhtooefr imo this is fraud.
People expect computers to give correct, dictionary-like output. They would likewise expect a dictionary to give both answers as the translation of a word that allows both, not just one. So handing them a model that does whatever it wants but looks as if it works is hard for users to see through, and I fear they often aren't told that by using it they are reinforcing existing biases.
it'd be correct to translate all of these using "they", right? since the hungarian pronoun is gender-neutral?
@pinkprius or is it just because it's based on ML, so it just reflects how most content available so far has been translated?
@pinkprius Is this real or manipulated? If real and the data is machine learned, then the machine is not the problem.
@pinkprius That it’s just saying it like it is (according to its written data). If someone doesn’t like the results, then it’s a long road to change the narrative and thus the data that the machine analyzes.
I’m trying not to voice an opinion. Hopefully you’re not looking for one. 😉
@mansr @greypilgrim @pinkprius The original table has perfectly formed Hungarian sentences; however, yours is not correct (it lacks the object marker, the accusative -t suffix), and because of this the translation is wrong. Correctly, you can say:
Ő kutatót főz
Őt megfőzi a kutató
However, both of these seem to get their gender from the context. Even the incorrect sentence gets its gender from the context.
Not sexism. Maths. The AI just outputs whatever is more statistically prevalent in its data.
The sexism is elsewhere.
Thank you for clarifying that your intention was not implicitly accusing the algorithm of being sexist by design.
By the way, though it may be revealing of typical roles or representation, these differences do not necessarily trace back to sexism.
Your choice of phrases is also biased. Try "he is a slave", or "he committed suicide", or "he was killed". Gender differences do not necessarily mean sexism. Women and men differ on average, even with bias accounted for.
@jcast imagine a dictionary that translates "they commit suicide" to "he commits suicide". That is sexist as well, and also plainly wrong. Just like in my original toot, both fit the definition of sexism: https://en.wikipedia.org/wiki/Sexism
I get what you are trying to say, but I don't think "the model reproduces existing bias and trends" is an acceptable answer for a translation service. Translation doesn't mean "we guess a gender based on prejudice" wherever we don't know otherwise.