Topic Advisor: Yan Zhitao

Vice President of R&D, TalkingData

Yan Zhitao is currently the vice president of research and development at TalkingData. He leads the research and development of the company's data management platform (DMP), data observatory, and other products, and is responsible for the company's big data computing platform. He is currently focused on building a big data computing platform that integrates multiple computing models and supports machine learning and data mining, with an emphasis on technologies such as Spark, Hadoop, HBase, and MongoDB. He has over 15 years of experience in the IT field, working on large-scale distributed computing systems, middleware, BI, and related areas.

He holds a master's degree in atmospheric physics from Peking University and also graduated from the North China Institute of Computing Technology, where his research focused on distributed computing systems. Prior to joining TalkingData, he served as a senior architect at IBM CDL, a senior middleware technology consultant for Oracle Asia Pacific, and a senior middleware technical consultant for BEA Asia Pacific. He has participated in a series of multinational and large-scale domestic middleware, BI, and other projects.

Topic: Streaming Big Data and Instant Interactive Analytics

Big data technology has gradually become standard in enterprises. Long waits for data analysis results are a thing of the past, and lower-latency streaming big data processing is making real-time analysis more and more important. In this forum, we will bring you talks on the industry's leading streaming big data and real-time interactive analysis technologies.

Speaker: Chen Feng

Suning.com IT Headquarters

Senior Technical Manager

Chen Feng works on the big data platform, where he is responsible for the construction of Suning.com Group's big data stream computing platform, including Storm, Spark Streaming, Flink, and other components. He has experienced the evolution of stream computing from componentization to platform services to intelligence. He has extensive experience with open source big data frameworks and has his own thinking and understanding of distributed computing architecture design and system optimization.

Topic: The Past and Present of Stream Computing at Suning

1. The development of the stream computing platform

From 2014 to the present, the platform has gone through the transition Storm -> Spark Streaming -> Flink; the migration to Flink is currently in progress.

Scale: Storm (roughly 4,000 virtual machine nodes), Flink and Spark Streaming (200+ physical nodes, running on YARN), plus the problems encountered and solutions adopted during the development of each engine.

2. The shortcomings of Storm and Spark Streaming, and why we chose Flink

(1) Consider both throughput and latency

(2) Efficient state management

(3) Exactly-once guarantees

(4) Event-Time

3. What did we do with Flink?

(1) Platform layer, rich functionality: richer SQL syntax (DISTINCT, stream-table joins), automatic scaling of operators up and down, connectors (MySQL, HBase, Kafka 1.0), sink slowdown

(2) Tool layer: unified log collection and display, unified monitoring and management platform (platform layer & business layer)

(3) Service layer: Dlink one-stop development platform.

4. Future prospects

Data integration, machine learning, CEP, etc.

Speaker: Huang Xiangwei

Senior Data R&D Engineer, NetEase

Huang Xiangwei has been engaged in big data research and development for seven years and is now responsible for the NetEase YEATION stream computing platform, data exchange platform, and machine learning platform. He studies the theory and implementation of distributed scheduling, in-memory computing, and stream computing, and has rich experience with related open source frameworks (e.g., Flink, Spark).

Topic: Flink-based Stream Computing Platform Architecture and Application Practice at NetEase

Stream computing technology is highly attractive because it responds to events quickly, and it has become indispensable in e-commerce platforms.

With the rapid development of open source stream computing frameworks in recent years, their ease of use and reliability have improved, making them simple to adopt widely in production environments.

NetEase's stream computing platform has been built from scratch over nearly two years of development and has greatly improved the efficiency of data output and decision-making.

At present, the platform is widely used within the company for scenarios such as monitoring, real-time counting, and risk control. This talk will introduce the architecture and implementation of the NetEase stream computing platform, along with hands-on experience from multiple lines of business at NetEase.

Speaker: Wang Chengguang

Chief Architect of Zhongdong New Media

Mr. Wang holds a master's degree and has worked as an architect and technical expert at Belle E-Commerce, Sohu, NetEase, and Yidianzixun. He has long been engaged in the design and R&D of search, data mining, and personalized recommendation systems, and has built complete search and recommendation systems. He open-sourced the lightweight distributed real-time computing framework light_drtc, and in 2016 published the book "Distributed Real-Time Computing Framework Principles and Practice Cases".

Topic: Application of Stream Computing in Content Information Recommendation Services

Stream computing has been a hot technical topic in recent years, and content information has been an area of Internet entrepreneurship consistently favored by investors for nearly 20 years. This talk mainly covers the application of stream computing in content information recommendation. It introduces the current mainstream information recommendation service process and the real-time updating of user profiles (user portraits), which is a typical application of stream computing.

Audience benefits:

1). Understand the content information recommendation service process

2). Understand user profiles

3). Understand the real-time update process for user profiles

Speaker: Xie Changsheng (Special Guest)

Professor, Wuhan National Research Center for Optoelectronics

Xie Changsheng is a professor at the Wuhan National Research Center for Optoelectronics. He has served as Director of the Key Laboratory of Information Storage Systems, Ministry of Education, and Deputy Director of the Wuhan National Laboratory for Optoelectronics. He has long been engaged in research and teaching on information storage theory and technology, and has undertaken national research projects including key projects of the National Natural Science Foundation, the National Major Basic Research (973) Program, and the National High Technology Development (863) Program. His technologies have been transferred to industry and have become core technologies in the independently developed products of Chinese enterprises. He has published more than 200 academic papers and holds more than 50 national patents. He has won the International Invention Gold Award and the National Invention Award. He has trained a large number of doctoral and master's students in the field of computer storage, many of whom have become core technical staff at well-known international and domestic enterprises, or professors and associate professors at well-known universities at home and abroad.

Topic: Magneto-Optical Fusion Big Data Long-Term Storage

The "big" in big data is a hot topic, but the "longevity" of big data has been overlooked. The current life expectancy of mainstream storage media and devices such as flash memory and hard disks is only about 5 years, while information about a person needs to be retained at least as long as that person's lifetime. Vint Cerf, a father of the Internet, worries that information will be lost over time, leaving future generations unable to understand the information humanity has produced and ushering in a digital dark age. The long-term preservation and reproduction of information is a major and challenging issue before us.

Optical storage has the longest retention time, but capacity and speed are its weak points. In recent years, breakthroughs past the optical diffraction limit and significant progress in multi-dimensional optical storage have offered new hope, and we can expect the emergence of new storage devices with both long retention time and huge capacity.

This presentation will analyze the key factors of long-term storage and set out the major issues to be resolved from two perspectives: physical longevity and protocol longevity. It will introduce the latest developments in ultra-long-term optical storage and the use of magneto-optical fusion technology to solve the combined problems of lifetime, performance, and cost encountered in long-term storage. Finally, it will introduce a new idea: using the metabolic principle of biological systems to build a storage system with an ultra-long life.
