May 27, 2021

Panel discussion: Machine translation quality: Quantifying the unquantifiable

LocFromHome

Panel discussion

As the MT-as-a-service market matures, its end users increasingly expect to be able to not only use it in their workflows but also assess how good or bad each specific engine and variation is. How, exactly, do they frame those expectations? What do language service providers have to offer to meet those expectations? And, last but not least, what technology exists that allows us to quantify machine translation quality in some reliable and repeatable way? We'll find out in this panel.

Transcription

Max Morkovkin 00:00 Our first panel discussion is about MT. MT has been a hot topic in our industry for quite some time; we're discussing it every year, and more and more insights keep coming. And today we'll also be discussing the topic of machine translation quality: quantifying the unquantifiable. The moderator of this panel discussion is Pavel Doronin, product manager at Intento. Pavel, are you already with us?

Pavel Doronin 00:36 Hi, guys, I'm here.

Max Morkovkin 00:38 You are. Nice jacket. How are you doing?

Pavel Doronin 00:41 This is one of my conference jackets, 2020 collection. Yeah. I also have a surprise for you, Max, especially for you, you see, I came prepared.

Max Morkovkin 00:52 Good. Good. Oh, I missed you. You know, this so unforgettable 2020. And who do we have as panelists today? Let's introduce them.

Pavel Doronin 01:05 I will introduce everyone, just to not spoil my moderation. So yeah, should we start, right, Max?

Max Morkovkin 01:15 Yeah, yeah, we can start. And I will just remind our attendees: guys, share with your network that LocFromHome is live now. We want to have more attendees, especially for these panel discussions, it's really hot. Okay, panelists, join us. So, Pavel, giving the mic to you. And yeah, we'll be waiting for questions in the Q&A, and we are all ears to listen to you guys.

Pavel Doronin 01:45 Thank you for the mic, Max. I missed you too. And yeah, you look great, by the way. So yeah, hi, everyone, and welcome to the panel discussion, the first panel discussion on machine translation quality, I have to say, this century, but no. So, quantifying the unquantifiable. Yeah, I also learned how to pronounce it without any interruptions. To manage your expectations for the discussion: we're going to focus mainly on the business perspective of this topic. And a small housekeeping note, repeating Max: you can leave your questions right here, probably in the chat box, and we'll try to reserve some time to answer them. If we don't have a chance to do that, we can just invite the panelists to the chat for further discussion. So yeah, let me introduce myself. Obviously, not everyone missed me, and most people probably see me for the first time. I'm Pavel, I'm a product manager at Intento, a hub for machine translation and AI services. And what you don't learn from my LinkedIn profile is that my father has been featured in both Multilingual magazine and Rolling Stone magazine. Yeah, and please meet the panelists, the true experts in machine translation and in the translation world. Let's start with Anna. Anna Zaretskaya is a senior machine translation implementation specialist at TransPerfect. But what you don't learn from her LinkedIn profile is that she once took a course in Catalan, taught to her by a native Catalan speaker, while she was studying general linguistics in Moscow, Russia. And ten years later, when she moved to Barcelona to join TransPerfect, she could surprise her local colleagues with the fact that Catalan is being taught in Russian universities, and probably with the perfect pronunciation, right?

Anna Zaretskaya 03:55 Not so perfect, but yeah, hopefully not too bad either. Thank you for the introduction.

Pavel Doronin 04:02 Glad you're here. And let me also introduce Luna Liu. She is a localization program manager at Alibaba Group. What you don't learn from her LinkedIn profile is that she writes stories and posts them online. So far, she has gathered 31 readers, and Luna claims they're mostly family and friends.
And she also mentioned two cats who probably ask her to read her stories aloud. Luna, you now have the opportunity to grow your reading audience, and I'm also glad you're here.

Guanghan (Luna) Liu 04:38 Thank you, Pavel, for the introduction. Yes, I really hope you can meet my cats. They're really cute. And thank you, everyone, for attending this meeting. I hope it all goes well. Thank you.

Pavel Doronin 04:54 Thank you, Luna. And yeah, you're wondering who is this handsome guy on my screen, in the right corner. Finally, this is Kirill Soloviev, co-founder and CEO of ContentQuo, a product for translation quality management, the topic of this discussion. What you don't learn from his LinkedIn profile is that instead of languages, in his spare time, of course, Kirill picks up new musical instruments and learns how to play them. He has been spotted playing things like guitar, ukulele, mandolin, drum set, hand percussion, and a lot of other things; he sent me a very long list. And really, honestly, so far he just doesn't play in front of you unless you're his close friend. And Kirill, I'm not asking you to play today, but that day will come. And I'm glad you're here, too.

Kirill Soloviev 05:58 Thanks for the intro, Pavel. You forgot my favorite instrument, the Irish transverse flute. I actually wanted to bring it with me on camera, but Smartcat didn't let me. Glad to be here, thanks for the invitation.

Pavel Doronin 06:12 Thank you, Kirill. And yeah, you see, we have a group of world-class professionals and, what is important, bright personalities. Let's get back to the topic. I'll just give a very wide context for this topic so that we can start our discussion. Historically, people do not trust machine translation. Even now, in 2020, they question machine translation quality and think human translation is better, which is a fair point. And clients, post-editors, and language companies all demonstrate more or less bias towards machine translation. On the other hand, the reality is that the world is moving insanely fast and the lifespan of the content is shrinking, and it actually turns out that purely human translation might not always meet modern business needs; the example from Robert, from the previous presentation, actually kind of proves that. So companies are investing in machine translation technologies and the implementation of machine translation, and at some moment some companies even start talking about machine translation and human translation parity. So how can we measure that? The thing is how to talk numbers, and if we talk numbers, what numbers we should talk; this will be the topic of this discussion. And everything is built around expectations of machine translation quality, from different angles. That is why we invited representatives from different areas of the translation industry: translation buyers, language companies, technology providers, to give the wider perspective. So let's start, probably, with Luna, because Luna represents a product company that is actually a translation buyer. And just to start shaping a context, can you give us an overview of what role machine translation plays in Alibaba's business in general, and how machine translation quality in particular impacts the business?

Guanghan (Luna) Liu 08:33 Hi, everyone.
I want to make sure everyone can hear me okay, because I just asked in the conversation window. Okay, great. So, hello, everyone. My name is Luna, and I'm from the Alibaba translation team, a localization program manager in charge of the machine translation quality and user experience evaluation program. Actually, Alibaba is using and developing machine translation to translate the large volume of products selling on the Alibaba platforms. The team has actually spent a couple of years developing and continuously optimizing a customized quality and experience evaluation mechanism. We perform quality and experience evaluations regularly, actually monthly, to monitor the performance of our machine translation. So for Alibaba Translate, the goal is to continuously improve the machine translation so it makes the users feel comfortable and confident about what they read on the platforms. So I have to say that what we care about most is the users' experience: how they feel about the content and what they like or dislike about it. Thank you.

Pavel Doronin 09:51 Thank you, Luna. And yeah, a kick-off question to Anna: how does it look from a language company's perspective? Do companies like Alibaba Group share their plans and expectations for the impact of machine translation with you, language companies?

Anna Zaretskaya 10:11 Yes, well, yes, of course. So of course, obviously, machine translation quality is one of the most important things for us. TransPerfect has been providing machine translation services for many years now, and this includes, you know, post-editing and machine translation tools. And of course, we need to stay competitive in this actually very competitive field, because nowadays there are a lot of companies that provide this service. There are also, as you know, MT systems that are available for free. And that's why we constantly monitor our quality and benchmark it against other systems that are available out there. And apart from that, we also work on improving the systems while they're already in production. So we want to not just, you know, provide the quality that we can, but also make sure that the experience of our clients improves on an ongoing basis. And in terms of quality expectations in general, I would distinguish two things. One is, you know, does the machine translation quality really help you achieve your goal as a customer? And another thing is, in the post-editing scenario, does it help the translators be more efficient? Right? So this is about minimizing the post-editing effort. But yeah, I mean, the bottom line is that quality is very important, and this is actually one of my main tasks at TransPerfect.

Pavel Doronin 12:15 Thank you, Anna. And yeah, we say customer expectations and user expectations, and Robert mentioned they measure the number of feedback items they receive from their website. But generally, Kirill, you probably work with your technology as a bridge between language companies, end users, and translation buyers. What do they think? What would be the right approach to measure it, and how much data do we need to measure it? What are the approaches? And do you think there are some approaches that Luna and Anna should use, because they're better, but they don't? Can you give us an overview of that?

Kirill Soloviev 13:03 Sure. Sure.
I would love to. Just this morning, I was talking to somebody not working in the translation industry, actually trying to explain what we do with translation quality, how companies try to measure it at all, and why it is useful. So I think, specifically with machine translation, and also trying to bridge what Luna was saying with what Anna was saying: the ultimate measure of quality in translation for me (and I used to be a buyer of translations before, and also worked at an LSP), the way I see it now, is this: does the translation achieve the impact for which it has been commissioned? Okay, so for Alibaba, machine translation is brought into existence to help Alibaba sell more products to an international customer audience, right, or help people buy more products, depending on how you want to phrase it. And that is actually the ultimate measure of whether your machine translation quality is good enough. Anna was referring to that as well, by saying that they're trying to help their customers achieve their goals with their machine translation. So I would call this type of quality metric the outcome metrics: does it do what we wanted it to do? Can people make buying decisions, and positive buying decisions, from us as Alibaba, with the sort of machine translation that we provide? And if the answer to that is yes, which means our conversion rate for, let's say, Spanish-speaking markets is at least the same, or even above that of, let's say, the Mandarin-speaking markets or other Chinese languages, then the MT is high quality; and if we're below the bar, the MT is low quality. However, the problem with this sort of outcome metric is kind of easy to see when you start thinking about it: they are only available after the fact. They are what we call lagging metrics. You actually have to publish your piece of machine-translated content to figure out whether it does its job or not, and even after publishing it, to wait for quite a while before you can gather enough data on the outcomes: do people buy this object with this level of machine translation quality in the description, or is the conversion rate lower than we've seen in the source language, right? So because those outcome metrics take a very long time to gather, we actually kind of have to use other things, right? And Luna mentioned this as well. So it's not only about the experience; yes, we want to get there, but we also have to look at something that we can measure earlier in the process of producing MT, that we can measure faster and cheaper, and try to use quality as a predictor to inform the next step: will the outcome likely be positive if we have reached this level on the quality measurements or not? So that's kind of the overall dichotomy. We want to get the perfect results with our MT quality, but we also want to learn about this as early as possible, preferably before we publish the MT, and that's why we also have a plethora of approaches to actually measure the quality of machine translation. But we can talk about those in more detail a bit later.

Pavel Doronin 16:42 Great, thank you, Kirill. While you were talking, I was wondering: Anna, do you actually have access, as a language company, to outcome measurement? Because most language companies work as a black box, you know; they provide services, but they don't have access to the last mile. I don't know if you have some challenges in this regard?
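A quick aside to make the outcome metrics Kirill describes more concrete: below is a minimal, purely illustrative sketch of comparing conversion rates between a source-language market and an MT-translated market with a two-proportion z-test. The figures and the use of statsmodels are assumptions of this example, not anything the panelists reported.

```python
# Minimal sketch (illustrative only): compare conversion rates between a
# source-language market and a machine-translated target-language market.
from statsmodels.stats.proportion import proportions_ztest

# Purchases and product-page visits per market (hypothetical numbers).
source_buys, source_visits = 5200, 100_000   # original-language listings
mt_buys, mt_visits = 4900, 100_000           # MT-translated listings

# Test whether the source market converts better than the MT market.
stat, p_value = proportions_ztest(
    count=[source_buys, mt_buys],
    nobs=[source_visits, mt_visits],
    alternative="larger",
)

src_rate = source_buys / source_visits
mt_rate = mt_buys / mt_visits
print(f"source: {src_rate:.2%}, MT: {mt_rate:.2%}, p = {p_value:.4f}")

# A small p-value would suggest the MT market converts measurably worse,
# with translation quality as one candidate cause. As Kirill notes, this is
# a lagging signal: the data exists only after the content has been live.
```

Because the data only exists after publication, this remains a lagging signal, which is exactly why the panel pairs it with faster proxies.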
Anna Zaretskaya 17:09 Yes, exactly. I mean, we don't have that access by default, for example, to our clients', you know, conversion rates before and after we implement MT. But of course, what we can do, and this is also our responsibility, is explain to our clients how best to measure the MT quality. And here I totally agree with everything that Kirill just said. So of course we can; this is a collaboration, right? The client can share the results with us, and based on that we can improve, if needed, and so on. And yeah, our role is basically to advise on how to approach it best, because not all clients know in advance how they're going to actually evaluate the results when they come to us for a machine translation solution.

Pavel Doronin 18:09 Okay, thank you. Luna, how closely do you work with your language providers, the language companies you work with? What part of the success do you share with them to give them feedback? How does it work? Do you listen to their advice? Does it make sense to you?

Guanghan (Luna) Liu 18:38 Yeah, absolutely. So as mentioned before, we perform both quality and user experience evaluations on machine translation, of course before we publish those machine translations online, as Kirill just said, because it's too late if you're evaluating the quality and the results are already online. So we actually do this before we publish those outputs. And for quality evaluations, we mostly, of course, work with external vendors, LSPs and freelancers. It's very important to make sure the external resources fully understand our customized evaluation rules and perform the tasks correctly. And sometimes the vendors actually give us very good feedback about what they don't understand about the mechanisms, or if they think something is not practical in those mechanisms, and we are always happy to listen to that feedback and see what we can do to improve the whole system. We also provide online tools to our vendors so they can work on our evaluation tasks, and sometimes the vendors also give us advice if they see something is not working in those online platforms. Our team is always happy to receive that feedback and see what we can do to improve our tools as well, to make it easier for the evaluators to work better. Yes.

Pavel Doronin 20:26 Speaking of tools, it's actually interesting. On one side we have customer feedback, in the form of "I will not buy this item because I don't understand it." On the other hand, there is academia, and they have their own ways to measure machine translation quality, like BLEU metrics and all these things. And probably, Anna and Kirill, you are in the middle between the two worlds. How do you map this feedback from the business, how do you approach evaluating quality, and what numbers should we talk?
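For readers who haven't met the "BLEU metrics" Pavel mentions, here is a minimal sketch of that automatic-metric layer using the open-source sacrebleu library. The sentences are invented, and none of the panelists' actual tooling is implied.

```python
# Minimal sketch (illustrative only): corpus-level BLEU with sacrebleu.
# pip install sacrebleu
import sacrebleu

# MT engine outputs and human reference translations (made-up examples).
hypotheses = [
    "The wireless headphones ship in three colors.",
    "Battery lasts up to 20 hours per charge.",
]
references = [[
    "The wireless headphones are available in three colors.",
    "The battery lasts up to 20 hours on a single charge.",
]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")

# Fast and cheap, but only loosely correlated with how humans judge a
# translation, which is why such scores are normally paired with human
# evaluation, as the panelists go on to describe.
```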
And whether they have reported issues in the quality of your empty or they didn't, this is a secondary level metric. Right? So outcome, lies and behave or lies one level above what people say. Right? And I think it's an important distinction to make, as we build up this chain of measurements approaches. Yes, of course, we all have to care what our customers say. But we also should observe what they do, because it's their actions that actually influence our company's top line in the short term, in the short term perspective, that is, and then user experience and, and user feedback comes off of that. And then we go into the weird and wonderful domain of actually crunching numbers, which I would love honor to talk more about if she can. Right, so one thing that I actually wanted to mention is that I'm really impressed with what Luna just said about their evaluation methods, I see that, you know, you have very well planned and methodology and the way you outsource the evaluation to, to other vendors, and so on. So, um, not all of our clients, or all of our MT users have this level of proficiency, you know, and and yeah, I definitely agree. Also, with what Carol said, in terms of Mt valuation and the data and and the methodology. I'm completely convinced that there is no one way to evaluate them t right. And our clients often come to us asking for to provide such recommendation and say something like, Well, do you have some kind of standard way? And usually, the answer is no, because there is no standard way. It always depends on your goal. But it's interesting that Pavel, you mentioned, academia and the methods that are used in academia and the practitioners, because they don't always correspond, let's say, in my opinion, we as practitioners have to look to into research for best practices, of course. However, it's not always possible. So if you look in the latest WM T, evaluation methods, so this is the conference, the main machine translation conference rate, they are very thorough, and they're very time consuming and costly. Of course, practitioners like us, we cannot always resort to this method, especially, you know, in urgent production scenarios, so we have to find a balance always, but our methods should always be based in on what we see in research. This is something that I'm definitely convinced about. But on the other hand, of course, we're always has tried to minimize cost for such evaluations? Absolutely. As they like to say, it's good, it's fast, or it's cheap. You can pick any two, right, but you cannot pick on free. So what we've seen working with companies like, you know, the likes of transperfect, and the likes of Alibaba, helping them power, their quality evaluation programs were empty here at Quantico is that there's one thing in common, no single company that constant quo has been deployed, it only uses one way to evaluate empty, that's completely not feasible. Instead, what we see is that they all have at least three different methods they use from very different categories. One is what Pasha, I think you have been referring to when you said academia, these are the automatic metrics for machine translation quality, they're fast, they're cheap, but they're not good. They're not really that predictive of how a human perceives them. So this is why all of the companies using automatic metrics, also pair them up with human evaluations, right. And there are too late two ways. 
Again, we can have a balance between good, fast, and cheap: something like adequacy and fluency measurements, where we would get high-level feedback from a translator on a piece of translation produced by the MT engine. They don't need to spend as much time, we get a general idea of what's working and what's not, but it's not super detailed. The upside, of course, is that it also costs much less. So, like Anna was saying, it's usable in a practical setting where you have deadlines, or you have budget constraints, and so on and so forth. So this would be the second layer of those evaluation techniques. And then the last one is the heavy hitters. This is for when you need to figure out each exact thing that has gone wrong in a piece of machine translation output, meticulously document it, categorize it, and make sure that you record the big problems separately from the small, tiny problems. These types of approaches usually revolve around error annotation: invaluable for debugging problems, but absolutely impractical to use on a daily basis. So usually it's about combining those three approaches in creative ways and balancing your budgets, your team, and your efforts along those three axes, going between good, fast, and cheap, and trying to strike the right balance. So at least three evaluation techniques, and there are many more, of course; those three are just the most popular ones we see at ContentQuo.

Pavel Doronin 27:49 Thank you, Kirill. Luna, does this correspond with the vision at Alibaba Group? Because you mentioned you have some internal technologies, and you also own a machine translation engine, your internal one. How do you work? What is your approach? Do you also use a combination of evaluation approaches, or something else?

Guanghan (Luna) Liu 28:16 Yes, exactly. As Kirill said, we actually use automatic evaluations to gather BLEU scores from the engines. It's an automatic score, it's not that accurate, and that's why we're also using human evaluations for quality: we use skilled linguists who can evaluate our machine translation and give it scores, so we know if the translation is good or not. And also, as I mentioned, we are doing user experience evaluations. For those programs, we don't require the evaluators to have any special skills; as long as they are real users in our target markets, we just ask simple questions, so they can let us know what they like or dislike about the translations and give us any insights about what we can do to further improve our content on the platforms.

Pavel Doronin 29:27 Are they actually active in this regard? I mean, do they actually leave feedback and feed you insights? Or is it just a side activity, just an extension of your internal team?

Guanghan (Luna) Liu 29:47 We actually run those programs regularly. We post both quality and experience evaluation tasks on our crowdsourcing platforms, and I have to say that if you are interested, please feel free to register with us. And if you would like to know more about what we are doing, please feel free to follow the official Alibaba Translate LinkedIn page.

Kirill Soloviev 30:15 Yeah, I would like to try that myself. I will definitely take up the opportunity and see what I can provide. I'm a regular customer of certain services provided by Alibaba.

Pavel Doronin 30:31 Great, I'm so glad you've met on this panel discussion. And yeah, but the question is still open. Okay.
The customers, for example, these last-mile end users, are not professional reviewers; they probably have no idea about accuracy, fluency, or anything else. And also, you have another layer, where the customer of machine translation in the global enterprise is not the end user, not the buyer. Probably, let's say, if Alibaba Group has a Japanese office or a Korean office, you also use machine translation to speed up the conversation inside the company, and they have another level of expectations. It's not the act of purchase, but the act of a message being transmitted, and so on, and the recipients are also not professional reviewers; they can say, "I do not understand this email or chat message." What would actually be the approach (probably a question for Anna or Kirill) to transform this feedback into something meaningful, so that you can do something with it?

Anna Zaretskaya 31:49 I think that it's not only about transforming the feedback, but it's more about the way of collecting it, the way you actually ask the question. So I know that on some websites you can have something like "Are you satisfied with this translation? Yes or no," or "Rate the quality of this translation." So if you want feedback that's more meaningful or more granular, the question has to be formulated in a way that the data you collect is actually useful, right? So of course, if the evaluators or the users that provide feedback are not trained in this, and there is no way you can just educate all the users and all the evaluators, then this is the only way, I believe. But maybe Kirill will have more input on this.

Kirill Soloviev: Great topic. I feel we could talk about this for hours, really, and we don't have that. So here's the deal. We talked about those different levels of getting information about MT quality, and one key concept that Anna was talking about is something that we call resolution, or granularity, here at ContentQuo. Some metrics give you a directional idea of whether it's good or bad, but don't help you in any way to proceed in figuring out where the exact problem is, and even less so how to actually fix it. So this is the most delicate and careful balancing act we have seen in any MT program: you need to invest as little effort as you can to figure out whether things are good or bad, and then really drill down to the core to find the root cause. Why is it actually going wrong? Why is my MT output not better? Why is it making mistakes? So you want to be as efficient as possible; as Anna was saying before, there are always practical considerations, budgets, timelines, when going from this thumbs-up-or-thumbs-down level, a very low granularity, a very low level of detail, to the exact place in the sentence translated by MT that the user felt was incorrect. But then there is the path you take to get there, and actually the path you take after that: once you've found the exact place that was causing problems, how do you train the machine to avoid it? This is probably a more difficult question, and that's the larger challenge in front of anybody who's managing the quality of anything, right? It's just one part of the job to find out how exactly things are; this is the evaluation part. But after that, you're usually expected to improve it or to fix it, and that's a completely separate story. How would you get a neural machine translation engine to stop making that mistake that you've spotted right now?
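To make the "heavy hitter" error-annotation layer Kirill described a bit more tangible, here is a minimal sketch of a severity-weighted error score in the spirit of MQM-style annotation. The categories, weights, and per-1,000-words normalization are hypothetical choices for illustration, not ContentQuo's or TransPerfect's actual scoring model.

```python
# Minimal sketch (hypothetical weights): turn annotated MT errors into a score.
from collections import Counter

# Severity weights and the annotated errors below are purely illustrative.
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}

annotated_errors = [  # (category, severity) recorded by a human reviewer
    ("terminology", "major"),
    ("fluency", "minor"),
    ("accuracy", "critical"),
    ("fluency", "minor"),
]
evaluated_word_count = 850  # size of the reviewed sample

penalty = sum(SEVERITY_WEIGHTS[sev] for _, sev in annotated_errors)
penalty_per_1000_words = 1000 * penalty / evaluated_word_count
by_category = Counter(cat for cat, _ in annotated_errors)

print(f"weighted penalty per 1000 words: {penalty_per_1000_words:.1f}")
print("most frequent categories:", by_category.most_common())

# The per-category breakdown is what makes this layer useful for debugging:
# it points at where the engine fails (e.g. terminology), not just how badly,
# which is the "root cause" drill-down Kirill is talking about.
```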
So I think it's like yin and yang, right? They come together; it's hard to separate them from each other. They're twisted in a way that you can't pull them apart. But you need to have both parts on the team: somebody figuring out what's wrong, and somebody figuring out how to act on this information, how to actually improve the quality of the MT. And this is not a task for, you know, a one-person team; you'll literally need an army, a small one perhaps, to actually tackle both sides of the story.

Anna Zaretskaya: Yeah, I definitely agree. And I think we really need to collect all this data that we get from the evaluations in order to really understand what we actually need to improve. There are some things that cannot be improved right now, but having all this data will allow us to focus our development efforts in a specific direction if we see, okay, these are the main issues, these are the most frequent issues. And I would say that collecting this data is pretty challenging.

Pavel Doronin 36:01 Collecting, and probably analyzing it, transforming it into something that could be passed to the machine translation specialists who actually improve it. Yeah. Luna, what is your approach? Because since you have your in-house machine translation engine, you probably have a good bridge to the developers of machine translation. How do you transform this feedback that you receive from crowdsourcers, language vendors, your in-house teams, and probably some automatic metrics, into what can be improved and how?

Guanghan (Luna) Liu 36:40 Right. The machine translation team in Alibaba is actually using different methods to improve the machine translation quality. For example, they will directly constrain the MT's output with terminology. We can customize the output by using technologies like translation intervention or interactive translation, or we can feed in high-accuracy corpora; those corpora might focus on a specific domain. And sometimes those are not even standard translations but transcreations, because that will help the users get more information and makes the translation more localized. Those kinds of corpora can be used to train a highly customized machine translation system. And also, the machine translation team will provide a set of linguistic rules or features to the engines to improve their translation quality. Yes, that's what we are doing here.

Pavel Doronin 37:48 All right. Yeah, Anna, I know TransPerfect also has internal machine translation technology, and you probably have a similar or the same approach. But the question is, how do you understand whether it improved or not, as Kirill said, in a cheap and fast way?

Anna Zaretskaya 38:10 Well, we have a big advantage, because we have internal linguists who can help us with linguistic evaluation and also point out the specific issues that they see in the output, right? And they're very experienced in it, so it's a very big difference from just casual MT users who don't know how it works. Some of the things they report, of course, we can improve, and some we cannot. And yeah, this is the challenge that I was mentioning before, right? It's seeing the global picture that should help us understand in which direction we should be improving; it can be, you know, terminology or fluency or accuracy and so on. And then, how do we measure if it has improved? So I think it's very related to the general question of MT evaluation.
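As a small, hypothetical illustration of the terminology constraints Luna mentions, one cheap and fast check is to verify that required glossary terms actually survive into the MT output. The glossary, language pair, and sentences below are invented; this is not Alibaba's internal tooling.

```python
# Minimal sketch (illustrative only): check that required target terms appear
# in the MT output whenever the corresponding source term is present.
glossary = {  # hypothetical source-term -> required target-term mapping
    "wireless earbuds": "auriculares inalámbricos",
    "fast charging": "carga rápida",
}

def missing_terms(source: str, mt_output: str) -> list[str]:
    """Return source terms whose required target term is absent from the MT output."""
    src, out = source.lower(), mt_output.lower()
    return [s for s, t in glossary.items() if s in src and t.lower() not in out]

source = "Wireless earbuds with fast charging and 24h battery."
mt_output = "Auriculares inalámbricos con carga veloz y batería de 24 h."
print(missing_terms(source, mt_output))  # -> ['fast charging']
```

A check like this only tells you that a constraint was violated, not why, so in practice it would feed into the broader evaluation layers discussed above rather than replace them.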
We do use automatic metrics, of course, just because often we want to benchmark, you know, different parameters in our systems, or a specific improvement that we added. But we do try to combine both human and automatic evaluation methods if, you know, the goal is just to improve the quality of our systems in general. In many cases, though, we want to improve specific things. For example, if it's a client of ours who is using our MT, and they want to improve the understandability, or they want to include specific terminology, and so on, then we're evaluating those specific things.

Kirill Soloviev: Anna is making a great point: there is no one-size-fits-all quality measure, right? The very definition of what constitutes good translation quality, be it MT or not MT, by the way, is very context-dependent. You have a specific purpose behind each piece of content, and that determines what's good enough, right, what is high quality. So, for instance, when you're reading a description of an item on one of Alibaba's platforms, you don't necessarily need the same surgical level of precision and accuracy that you would want to find, let's say, in an instruction manual for some kind of medical device that's used in the operating theatre. So obviously, completely different levels of quality and different aspects of quality are important in different settings. I actually wanted to latch onto something that Luna said before and try to generalize it a little bit. So Luna was talking about those custom engines, their training, and the customization efforts, and, you know, Anna was saying that different clients focus on different aspects of quality. What we here at ContentQuo have started to feel lately, more and more, is that managing machine translation engines is becoming very much like managing human translation vendors, or suppliers, because there is such a large multitude of engines you can be working with at any given time, with all the customization options, with all the engine vendors and whatnot. And you need to figure out who is best at what kind of translations, what aspects of quality they deliver better, and where they may be performing subpar. So we've started to see more and more analogies between the world of MT and the world of vendor management, for instance, and lots of good parallels in terms of practices actually emerge from the two. So I was interested to get some feedback from you both on that: Anna, Luna, how do you guys see that? Is it similar? Can we borrow stuff from how vendor managers manage human translators?

Pavel Doronin 42:12 That's actually a very good point.

Anna Zaretskaya 42:14 Yeah, this is an interesting observation, and I agree in the sense that we cannot measure all MT providers with one kind of method, just like we wouldn't simply say that this vendor or this LSP is better than that one. They can be better or worse in different aspects: for example, you know, the ability to customize, the ability to implement or fix specific issues, or even the speed, you know. And, of course, the same machine translation system can perform well for some content and worse for other content. So yeah, I definitely agree in this sense.
Guanghan (Luna) Liu 43:10 Right. And, of course, because we are customizing machine translation for different scenarios, I definitely agree that different machine translation engines are good for maybe one scenario, but maybe not so good for another. For example, in Alibaba, we are translating those listings of products online every day, so what's most important for us is that the machine translation can correctly interpret the product name; that's the most important thing we are evaluating here. So yeah, that's why we are seeing different kinds of quality mechanisms in this field.

Pavel Doronin 44:10 Thank you, guys. Max was right, this is a really hot topic, because we are suddenly almost out of time, and there are a lot of questions; I'm scrolling and there is no end. So people have a lot of questions, and I'll probably just take a couple of random ones. It's sad that we did not cover what technology can offer to improve things in a quite different way, to improve the machine translation output, but let that be a topic for the third LocFromHome conference. And yeah, one of the questions that is right now on my screen is actually from Robert, the previous presenter, and it's quite interesting, because it's probably one of the main challenges for Luna. Robert asks: what techniques would you recommend to make the source text more suitable for MT? And I think for Alibaba this is the trickiest one, because it's user-generated content that should be translated. So what do you do to adapt user-generated content to machine translation?

Guanghan (Luna) Liu 45:30 How do we process those source texts so they're more suitable for machine translation, right? Yep. Okay. Yeah, that's a really good question, because, you know, some of the source texts are not produced by native speakers on our platforms. But actually, we don't do that; I think it's more important for the machine to learn how to interpret those problematic sources and to translate them into proper target text. So that's what we are doing, and the machine translation team is doing a lot of work to help the machine better understand those kinds of special source texts. That's what we're doing here.

Pavel Doronin 46:29 All right. And yeah, to close our discussion on a positive note, or I don't know how it will turn out: I know there are a lot of smaller boutique agencies and freelance translators, and many questions. When they see really experienced machine translation specialists, they always ask: okay, when machine translation quality achieves this human parity, will human translators be removed? And what is it now, humans in the loop or machines in the loop? I have seen the recent discussion on Twitter about that. So could you share your short thoughts on that, on how far it might go with machine translation quality? Because obviously you're intending to increase it to a certain level, but what is the impact on people?

Anna Zaretskaya 47:33 I can start, maybe. I can only talk about the near future, because I don't know what will happen in a faraway future. But I don't think that human translators will be removed or replaced by MT or anything like that. This is something that people used to claim already in the fifties, and human translators are still here. But I think one important thing that language specialists have to have nowadays is to be able to adapt to all the technological developments that are happening.
And this is not only true for the translation industry, it's true for any industry. So it's staying up to date with the developments, educating yourself, and being adaptable. That's it.

Pavel Doronin 48:27 Thank you. And Kirill, what are your thoughts?

Kirill Soloviev 48:30 Yeah, I'll chime in just for a sec. So at ContentQuo, we're kind of at the intersection between the translation buyers and the translation vendors, so we try to look at both sides of the story. So here's what we see; I'll also probably focus on the short term. For now, there's definitely no shortage of need for human linguists who are being hired just to look at the output of machine translation, figure out if it's good enough or bad enough, and provide valuable advice, either to LSP teams like Anna's or to buyer teams like Luna's, to help them inform their next actions. And in that sense, the role of the translator has become even more important. So if you can be in the position where you are the judge of machine translation quality, then ultimately you have control over the machine, instead of the machine controlling you. So seriously, if you guys want to learn new skills, figure out more about how to do those machine translation quality evaluations; look for jobs, perhaps on Alibaba's crowdsourcing platform or with TransPerfect, to try and, you know, help provide this kind of crucial input. And then, going to the quality topic itself, the usual stuff that you guys might have heard about: MT is quickly becoming much more fluent, and it's actually becoming more difficult to spot the remaining types of quality problems in MT output. In particular, since the sense, the semantics of the text, and the semantics of the real world are still out of reach for machines, humans actually need to do much more work to find and fix the remaining machine translation errors. So fear not, you guys will always have the opportunity to prove your edge over MT, definitely. That's the short-term perspective; that's what we see.

Pavel Doronin 50:35 Thank you, Kirill. And Luna, just a final thought before we end, because the organizers are already pinging me in private messages.

Guanghan (Luna) Liu 50:47 Yeah, yeah, I just want to add that I totally agree with Kirill, because there are things that need validation in machine translations, and that's what we need real human translators to do. And also, we are feeding corpora to the machines so they can learn how to translate as humans do. So human translation is like the ultimate goal of machine translation, and we're doing what we can to keep improving our machines so we can finally reach that goal. I'm not sure when we will, but we can.

Pavel Doronin 51:30 Thank you. Thank you, guys. It was really hot, as Max announced, and it was my pleasure moderating this panel discussion. And yeah, as I promised on Monday during the test run, it was sure to be legendary. Yeah, thank you, guys. And dear translators, do not worry, you won't lose your jobs; the technology is not about that. And thank you, Max, I return the mic to you.

Max Morkovkin 52:01 Luna, Kirill, thank you very much. It was a great panel discussion.