Text-linguistic evaluation of Twitter's auto-translation service

A thesis submitted in partial fulfillment of the requirements for the Master of Science Degree in Translation Studies in accordance with the requirements of Effat. Beladel, under the direction of her thesis supervisor and approved by her thesis committee, has been submitted and accepted by the Dean of Graduate Studies and Research on May 23, 2021 in partial fulfillment of the requirements for the degree of Master of Science in Translation Studies.

Aims and objectives of the study

Today, the need for translation is growing rapidly, and human translators alone may not be able to meet this demand quickly. Therefore, in recent years, Twitter has allowed users to join a translation community and, when they do so, to have a translation badge displayed on their profiles in recognition of their translation efforts.

Importance of the study

Human translators have not been able to keep up with this huge demand on social media platforms, including Twitter, not to mention the high prices customers had to pay to have their posts translated by human translators. Therefore, the world needs serious studies and research to develop MT services that can produce high-quality translation across different social media platforms.

Research questions

To explain, during the 2020 COVID-19 pandemic, companies, organizations, schools, etc., were forced to use social media to communicate with people while maintaining social distancing. So people followed the news of the US presidential election as if it were their countries' elections.

Hypotheses

In fact, US political affairs affect other countries in one way or another. An easy and quick way to get updates on the US presidential election would be to follow the US president's Twitter account.

Scope and Limitation of the Study

Outline and Content of the Study

Google Translate

Social media translation challenges

Studies have shown that access to culturally and linguistically diverse social media posts increases the level of knowledge and acceptance of others (Wankel, 2016, pp. 116-124, cited by Lim et al., 2018, p. 253). Additionally, some believe that MT is not providing efficient assistance to social media users due to poor translation quality resulting from the machine's inability to understand the culture and context behind these posts (Lim et al., 2017, pp. 281-297).

Donald Trump’s Twitter account

As mentioned above, social media posts contain a high level of noise as they follow fewer constraints, unlike books, articles or other types of texts. Sufficient knowledge of social media is essential to maintain its communication function in the translation process, especially in sensitive areas such as politics.

Previous Studies of Machine vs. Human Translation

Machine translation (Google Translate and Bing) showed a larger number of content and non-content words, a higher distribution of surface constructions (brackets and numbers), a higher distribution of person and location entities, and more three- and four-word n-grams. Human translation, by contrast, showed a higher distribution of surface constructions (punctuation), of organization entities and of two-word n-grams (pp. 101-102).
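To make the n-gram comparison concrete, the following is a minimal sketch of how word n-gram counts could be computed for a machine and a human translation of the same text. The whitespace tokenization and the function names `word_ngrams` and `ngram_profile` are assumptions made for illustration; they are not the procedure used in the cited study.

```python
from collections import Counter


def word_ngrams(text, n):
    """Count word n-grams in a text (assumes simple whitespace tokenization)."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_profile(text, sizes=(2, 3, 4)):
    """Map each n-gram size to the number of distinct n-grams in the text."""
    return {n: len(word_ngrams(text, n)) for n in sizes}
```

Running `ngram_profile` on a machine translation and on a human translation of the same source would give two profiles that can be compared size by size, which is the kind of comparison summarized above.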

Text-linguistics method of analysis

Cohesion
Coherence
Intentionality
Acceptability
Situationality
Informativity
Intertextuality

Therefore, the translator must follow the grammatical structure of the target text by restructuring the sentence (Graustein et al. 1977, p. 98176, cited in Neubert and Shreve, 1992, p. 113). Neubert and Shreve (1992) define coherence as a logical order of ideas that effectively directs the target reader to an easier understanding of the meaning of the text.

Research design

This chapter identifies translation issues caused by the tweet character limit (280 characters) and/or other issues related to the seven textuality standards. In addition, it describes the data collection process and the criteria for selecting tweets for analysis.

Methods of Data Collection

Twitter’s auto-translation service (MT)
Donald Trump’s Tweets

In summary, this chapter introduces the research design, namely the product-oriented approach, and presents the model chosen for data analysis, which is the text-linguistic model of Neubert and Shreve (1992): cohesion, coherence, intentionality, acceptability, situationality, informativeness and intertextuality. It also explains the process and criteria of data collection and how they serve the research goals. In this chapter the research findings are discussed and the research questions are answered based on the data analysis.

Solutions to those problems are also presented based on the analysis of Donald Trump's machine-translated tweets. Finally, the limitations encountered during the research process are mentioned, followed by some recommendations for further research. This is followed by the data analysis, in which the researcher analyzes the Twitter machine translation of the chosen tweets based on the seven text-linguistic standards (Neubert and Shreve, 1992): cohesion, coherence, intentionality, acceptability, situationality, informativeness and intertextuality.

Discussion and Findings

In addition, this chapter determines whether the hypotheses mentioned in the introduction can be confirmed through data analysis. The chapter also draws on some previous studies that have proposed solutions to the problems found in machine-translated social media content.

Observations Based on the Research Hypotheses

Hypothesis 1: MT is not yet developed enough to be relied on completely in
Hypothesis 2: MT could help in giving an idea about the content of tweets regardless

Based on the qualitative analysis of Trump's machine-translated tweets, most of the machine-translated tweets are fully understandable. Furthermore, issues related to acceptability can be annoying for readers, but readers can still get the message of the tweet (see 4.4 Acceptability). The analysis of the machine-translated tweets also shows that machines lack understanding of the context and culture in which the text takes place.

Machines tend to choose the first equivalent of a word in the dictionary, which is not always the correct procedure for a translator to follow. The translator must understand the context to select the appropriate equivalent that serves the ST message. In Table 27, Situationality Examples 4, 5 and 6, the machine translation selects the first match of the word "Dead" in the dictionary, "قتلى", which is not the appropriate match to convey the message of the tweet.
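The difference between taking the first dictionary equivalent and selecting an equivalent by context can be illustrated with a toy sketch. Everything in it is hypothetical: the lexicon `TOY_LEXICON`, its English glosses and cue words, and both function names are invented for illustration and do not describe how Twitter's or Google's translation systems actually work.

```python
# Hypothetical toy lexicon: each source word lists candidate senses in
# dictionary order, with a few context cue words per sense (all invented).
TOY_LEXICON = {
    "dead": [
        {"gloss": "killed", "cues": {"attack", "soldiers", "victims"}},
        {"gloss": "no longer valid", "cues": {"deal", "bill", "agreement"}},
    ],
}


def first_equivalent(word):
    """Always return the first listed equivalent, ignoring context."""
    return TOY_LEXICON[word][0]["gloss"]


def context_equivalent(word, sentence):
    """Return the equivalent whose cue words overlap most with the sentence.

    With no overlapping cues, max() keeps the first candidate, so the
    function falls back to the first dictionary equivalent.
    """
    tokens = set(sentence.lower().split())
    best = max(TOY_LEXICON[word], key=lambda sense: len(sense["cues"] & tokens))
    return best["gloss"]
```

For a sentence such as "The deal is dead", `first_equivalent("dead")` still returns the first sense, while `context_equivalent` picks the sense cued by "deal". Real disambiguation is far more involved; the sketch only shows why ignoring context produces the kind of mismatch described above.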

Observations Based on the Research Questions

Question 1: What translation issues result from using Google translate to render tweets
Question 2: Is it possible to produce machine translation of Donald Trump’s tweets

Twitter's character limit policy (280 characters per tweet) affected the quality of machine translation. Users are sometimes forced to violate the seven textuality standards to stay within the limit. These violations in the ST in turn make it difficult for the translator to understand the ST and convey the message to the target readers.

Thus, the machine makes translation errors because of violations in the ST. Moreover, the machine is not able to understand an implicit reference and carry it into the translation, as discussed in Table 20, Intentionality Example 2. However, translating tweets while respecting the seven standards is not impossible if machines are developed to process implicit information by linking tweets on the same topic or by creating a corpus of field-specific terminology.

Limitations of the Study

In addition, this policy results in issues related to intentionality, as the writer has limited scope to express his intentions explicitly (see 4.6.3 intentionality). Humans can understand the implicit intentions of a writer, but a machine cannot understand those intentions unless the writer spells them out sufficiently. The policy also creates problems with consistency (see 4.6.2 consistency), as English users omit some pronouns that English readers can infer.

Recommendations

According to BI Intelligence's cross-country analysis of Twitter data, 41% of Saudi Arabia's online population uses Twitter, the highest percentage internationally (Saudi Arabia Ministry of Communications and Information Technology, 2018). Therefore, it is important to improve the quality of the machine translation of tweets to keep the Arabic reader up to date on all areas and news in the world.

Data Analysis

Cohesion

Grammatical Cohesion

Coherence
Intentionality
Acceptability
Situationality
Informativity
Intertextuality

The problem in the machine translation is related to the literal translation of the preposition "in" (في). However, this anaphoric reference is not clear in the machine translation version due to the misuse of the Arabic pronouns. As described by Neubert and Shreve (1992), intentionality is the attitude of the writer in the text.

As in the case of the machine-translated version, the preposition "in" (في, fi) is repeated three times. This example is another case where the machine translation leaves some parts of the tweet untranslated. However, in the researcher's opinion, the repetition would distract the reader from the main message of the tweet (ST).

Another extension is made by the researcher in the last sentence of the tweet "HUGE CROWD!". The machine translation partially translates the last sentence of the tweet, and the other part is kept in English.

Conclusion

Because of the Trump administration, hospitals are now required to publish their REAL PRICES, which will generate. Before we even discuss the massive corruption that took place in the 2020 election, which gives us far more votes than we need to win all the Swing states (we only need three), it should be noted that state legislatures have not been in any way responsible for a huge.. 2021, January 2)...Just a tiny fraction of those votes give the USA a big and convincing victory in Georgia. Georgia Republicans need to be wary of political corruption in Fulton County, which is rampant.

IMPORTANT, OUR COUNTRY NEEDS THE CHAIRMANSHIP MORE THAN EVER BEFORE - THE VETO POWER. THE REPUBLICAN PARTY AND, MORE IMPORTANTLY, OUR COUNTRY, HAVE NEEDED THE LEADERSHIP MORE THAN EVER BEFORE - THE POWER OF VETO. TRANSPARENCY in medical pricing will be one of the biggest and most important things done for the American citizen.

Due to the Trump administration, hospitals are now required to publish their REAL PRICES with immediate effect, which will create competition and reduce costs. Before we even get into the massive corruption that took place during the 2020 election, which gives us far more votes than it takes to win all of the Swing States (only three are needed), it should be noted that the state legislatures have not been responsible for the masses.