AIR 083 | Dialogue with Huawei’s Noah’s Ark Lab Director Li Hang: Big data is often small data

Li Hang is Director of Noah's Ark Lab at Huawei Technologies Co., Ltd. and an adjunct professor at Peking University and Nanjing University. He graduated from the Department of Electrical and Electronic Engineering at Kyoto University in Japan and received a doctorate in computer science from the University of Tokyo. His research interests include information retrieval, natural language processing, statistical machine learning, and data mining. He has long been active in these fields, has published three academic monographs and hundreds of papers at top international conferences and in international journals, and holds 40 granted US patents.

Recently, at the CCF-GAIR conference hosted by Lei Feng Net (search for the "Lei Feng Net" public account), Li Hang spoke backstage with Lei Feng Net's AIR column and explained how Huawei's work on big data, machine learning, and artificial intelligence is connected internally.

What is the main focus of Huawei's big data work?

Huawei's big data work is mainly used to help Huawei itself and telecom operators improve their efficiency. Drawing on the data these companies have accumulated over many years, we use big data to solve all kinds of problems in business and operations and to make them more intelligent. Every line of business in the company should be able to connect with our laboratory.

Take one customer, Shanghai Unicom, as an example. It has 5 million subscribers. From the large volume of data produced by their mobile phones, we can determine how these people move around over the course of a day, and then we can do a lot of things with that......

The big data we usually talk about is often really small data, such as the data on your own mobile phone, which you don't want others to see. Add up all the small data and you get big data, but you can't simply take everyone's data out, because there are copyright, privacy, and other issues.

Training a model requires large-scale data. In this situation, we can only learn a general model, work out how to transfer it to each individual's data, and then continue learning from there. At present, there is no concrete case of this (applied transfer learning) yet.
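The idea of learning a general model and then adapting it to one person's small data can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not Huawei's method: the model is a one-variable linear regression, the data is synthetic, and the names `fit`, `mse`, and `make_user` are hypothetical. The general model is trained on pooled data, then a few extra gradient steps are taken on a single user's data, which never has to leave that user's device.

```python
import random

def mse(w, b, data):
    """Mean squared error of the line y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def fit(data, w=0.0, b=0.0, lr=0.1, steps=200):
    """Plain gradient descent on MSE, starting from the given (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def make_user(slope, n):
    """Synthetic data for one user: y = slope * x plus small noise (assumed)."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, slope * x + rng.gauss(0, 0.05)) for x in xs]

# "Big data": pooled samples from many users whose slopes cluster around 2.0.
pooled = [pt for _ in range(50) for pt in make_user(rng.gauss(2.0, 0.3), 20)]
general_w, general_b = fit(pooled)

# "Small data": one individual's few samples, with a different slope.
personal = make_user(2.6, 8)

# Transfer step: start from the general model and fine-tune briefly.
personal_w, personal_b = fit(personal, w=general_w, b=general_b, steps=20)

print("general model on personal data:", mse(general_w, general_b, personal))
print("fine-tuned model on personal data:", mse(personal_w, personal_b, personal))
```

Starting the second call to `fit` from the general weights, rather than from zero, is what makes this a transfer step: the small personal dataset only has to correct the general model, not learn everything from scratch.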

Concretely, what Huawei is doing with machine learning based on big data is the following:

First, searching photos by text or natural language. This approach does not assign a label to each photo in advance, whether manually or by machine learning, and then process photos through those labels. Instead, a deep learning model processes the photos directly and uses the content they carry to classify them naturally.

Second, neural machine translation.

Third, the neural response machine, an automatic generative system. It is the industry's first publicly known intelligent answering machine that generates replies automatically, rather than retrieving matched pairs from big data.

Compared with Tencent's consumer-facing (2C) data, what does Huawei need to pay attention to when doing machine learning on lower-level data?

The main driving force of our research is the company's business.

Some of this research is long-term and some is short-term.

For example, we ask what business direction the company will pursue over the next ten years, and then work backward to what technology is needed. (So the laboratory's research direction is the same as Huawei's other businesses, that is, customer-oriented?) Yes, you could say that.

How does your management style differ from that of the former director, Professor Yang Qiang?

Everyone has their own working style, but the general direction is the same. Everyone wants to do their own research well and push research and development forward. He is a scholar, and so am I.

Our backgrounds are different. His main direction is transfer learning; mine is natural language processing, information retrieval, and so on. The direction naturally relates to a person's research content and interests.

Four years ago, we founded Huawei's Noah's Ark Lab together. Each of us is still working in our own field today, and that will not change.

How does Noah's Ark Lab balance research and application in its day-to-day work?

There is no fixed ratio; the laboratory's goals still revolve around the company's business.

If a product is planned on a 10-year horizon, the point is to invest with the future in mind and then work backward to the areas we need to explore. However, if we only aim at the next 10 years, the goals easily become hollow.

If a product is planned on a 3-to-5-year horizon, there must be some phased results, and some projects even need to deliver phased results within a year or half a year. The timing is adjusted according to the situation, but the overall direction is always fairly clear: a focus on research in cutting-edge fields such as artificial intelligence, machine learning, and data mining.

Then, across these three directions, we must decide which areas to invest more in and which actual products to work with, and balance two types of projects: long-term research and practical application development. Relatively speaking, product development takes the larger share.

Huawei currently has two such products: one is the app market on Huawei mobile phones, and the other is Huawei's "mobile phone service." There are also some industry-leading collaborations related to deep learning and natural language processing. These are still at the research and development stage, but they are expected to succeed within a year or two.

Some big companies that used to do big data have rebranded themselves as artificial intelligence companies. What is the difference between artificial intelligence and big data?

The core technology of artificial intelligence today is machine learning. At present the two are almost equivalent. There may be better approaches in the future, but we haven't seen them yet.

Machine learning usually requires data, often big data, so it is closely tied to big data.

A lot of big data is garbage when it goes unused. If it can be used effectively, with machine learning doing something intelligent with it, that is artificial intelligence.

Artificial intelligence methods basically all follow this pattern and involve these three things (big data, machine learning, artificial intelligence). What you call it mainly depends on what you want to emphasize.
