The Synerise team among the winners at the KDD Cup 2021

The global success of Polish engineers in the field of Artificial Intelligence

Baidu, DeepMind, Synerise. Representatives of these companies were among the winners of the KDD Cup 2021. The competition, organized for 24 years, is considered by the technology industry to be one of the most prestigious AI and Machine Learning events in the world and is often called the World Championships in the field of Artificial Intelligence.

The competition was attended by representatives of the most technologically advanced companies and universities in the world. The results were announced on Twitter on June 17, 2021 by Jure Leskovec, an entrepreneur and computer science professor at Stanford University who is also the chief scientist at Pinterest.

“Congratulations to the winning teams from @BaiduResearch, @DeepMind, @Synerise, Harbin and Dalian Institutes of Technology and USTC!”

said Jure Leskovec.

Synerise defeated teams from around the world, including specialists from Intel Labs (manufacturer of computer processors), OPPO Research Topology Lab (manufacturer of OnePlus and Oppo phones) and Huazhong University of Science and Technology.

“It is a great joy to stand on the podium with giants such as Baidu Research or Google DeepMind. Much of the progress made in machine learning is made possible by the use of ever-increasing computing power. Technology giants are trying to outdo each other in the use of ever larger models with incredible capacity, but also very high training costs and a significant carbon footprint. At Synerise, we focus on a fundamental understanding of the mathematical phenomena underlying deep learning. Combined with engineering finesse, this allows us to compete with the best research centers in the world despite having only a fraction of the resources available to them,”

said Jacek Dąbrowski, Chief Artificial Intelligence Officer, Synerise S.A.

KDD Cup (International Knowledge Discovery and Data Mining Competition) is organized by ACM (Association for Computing Machinery), which is the most influential scientific and educational IT organization in the world.

Held since 1989, the KDD conference is the oldest and largest data mining event in the world. It is home to some of the first and most cited scientific articles in the fields that are now commonly known as "Big Data", "Data Science", and "Predictive Analytics." Innovations such as crowdsourcing, large-scale data science competitions, algorithms for personalizing advertisements (like Google), data mining (Facebook, LinkedIn) and recommendation systems (Netflix, Amazon, etc.) come largely from KDD.

In 2020, the KDD conference attracted over 3,900 leading researchers from both the commercial and university worlds. Among them were leading university researchers from Berkeley, Stanford, Oxford, and Tsinghua, who attended KDD to learn about and demonstrate cutting-edge advances in Data Science, Machine Learning, Artificial Intelligence, Predictive Analytics, and Big Data. KDD is a pioneer of applied data science. KDD attendees come from the most powerful technology companies in the world, such as Google, Alibaba, Facebook, Netflix, LinkedIn, Tencent, Microsoft, IBM, Spotify, and Amazon. The voices of state institutions such as NIH, NSF, and DARPA are also important to the KDD community, and representatives of these institutions can be met during the conference.

“I think graph mining and modelling is one of the most important issues in the data mining community. The KDD Cup takes them to a higher level in terms of the scale of these problems and their diversity. [...] Hopefully, the competition will encourage the community to develop new techniques and see what algorithms work on large-scale data.”

says Alex Beutel, KDD Cup Chair, Research Scientist & Team Leader at Google.

This year, almost 2,500 teams from around the world competed in three KDD Cup categories, with the top three teams in each category receiving awards. Synerise competed in the most difficult of them, organized by Stanford University, Facebook AI, Google and Intel, among others.

“Our big dream has always been to compete with the biggest tech companies. The path we have chosen is bumpy, ambitious and uncompromising. We are dealing with the largest technology companies in the world, and we want to win with knowledge, excellence and exclusivity of solutions supported by the latest scientific achievements, in particular in three market segments: Big Data, AI and automation,”

adds Jarosław Królewski, President of Synerise.

Michał Daniluk, AI Research Scientist at Synerise S.A., says:

“With our work, we want to prove that our AI team can compete with innovation leaders from around the world. This is our third victory in this arena, which confirms that the global aspirations of Polish technology companies are perfectly justified. I hope that our success will inspire other scientists and engineers in the country to compete with the best in the world.”

The competition task was to predict the subject of scientific publications on the basis of edges contained in a heterogeneous graph of papers, citations, authors and scientific institutions. The graph, of unprecedented size (~250 GB), contained 244,160,499 vertices of 3 types, connected by as many as 1,728,364,232 edges, which made it possible to verify the algorithms in terms of their readiness to operate on very large-scale data.

“Large heterogeneous graphs appear in many practical applications. The graph processed by us as part of the KDD Cup concerns academic publications, but data with a similar structure are also present in e-commerce (customer transaction graphs), large knowledge bases and document databases. Processing this type of data therefore leads to a specific business advantage in improving the quality of recommendations and information retrieval. I am glad that data on this type of practical problem are increasingly appearing in competitions at leading conferences,”

explained Barbara Rychalska, AI Research Scientist, Synerise.
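
To make the task more concrete, here is a minimal Python sketch of subject prediction on a heterogeneous graph. It is not the Synerise solution and does not use Cleora or EMDE: it is a naive majority-vote baseline, and every node name, relation name, and subject label in it is invented for illustration. The only figures taken from the article are the vertex and edge counts used in the scale check at the end.

```python
from collections import Counter, defaultdict

# Toy heterogeneous graph. All nodes, relations, and labels are hypothetical;
# the real competition graph is many orders of magnitude larger.
NODE_TYPES = {
    "p1": "paper", "p2": "paper", "p3": "paper", "p4": "paper",
    "a1": "author", "i1": "institution",
}
EDGES = [
    ("p1", "cites", "p2"),
    ("p1", "cites", "p3"),
    ("p4", "cites", "p1"),
    ("a1", "writes", "p1"),
    ("a1", "writes", "p4"),
    ("a1", "affiliated_with", "i1"),
]
# Subjects that are already known; "p1" is the paper we want to label.
KNOWN_SUBJECTS = {"p2": "ML", "p3": "ML", "p4": "statistics"}


def build_adjacency(edges):
    """Index typed edges in both directions so neighbours are easy to look up."""
    adjacency = defaultdict(list)
    for source, relation, target in edges:
        adjacency[source].append((relation, target))
        adjacency[target].append((relation + "_rev", source))
    return adjacency


def predict_subject(node, adjacency, node_types, known_subjects):
    """Majority vote over the known subjects of directly linked papers."""
    votes = Counter(
        known_subjects[neighbour]
        for _relation, neighbour in adjacency[node]
        if node_types[neighbour] == "paper" and neighbour in known_subjects
    )
    return votes.most_common(1)[0][0] if votes else None


adjacency = build_adjacency(EDGES)
prediction = predict_subject("p1", adjacency, NODE_TYPES, KNOWN_SUBJECTS)

# Scale check using the vertex and edge counts quoted in the article.
edges_per_vertex = 1_728_364_232 / 244_160_499
print(prediction, round(edges_per_vertex, 2))
```

A baseline like this only looks one hop away and ignores authors and institutions entirely. At the scale quoted above, practical approaches have to summarize much larger neighbourhoods, which is where graph embedding methods come in.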

Unlike most teams, which improved existing algorithms, the Polish team, including Jacek Dąbrowski, Michał Daniluk, Barbara Rychalska, and Konrad Gołuchowski, used proprietary machine learning methods: Cleora and EMDE. The methods developed by the Synerise team have so far allowed them to win the SIGIR Rakuten Data Challenge 2020 and WSDM Data Challenge 2021 competitions. These methods are also a key element of the personalization system (including recommendations and search results) available to Synerise customers. The solution of the Polish team has already been published on the Stanford University website.

“KDD Cup competitions are an opportunity for our team to test and develop our technologies used in the company. It is satisfying that the algorithms developed for the needs of our products successfully compete with solutions prepared by technological giants. Competition with the best universities and companies like Baidu, Intel or Google gives additional motivation to continue working on improving our solutions,”

says Konrad Gołuchowski, AI Research Lead at Synerise.

Synerise is a Polish technology company that produces a Big Data and AI platform that allows users to process data in real time from various sources, based on proprietary database systems and artificial intelligence algorithms, as well as automated business scenario execution methods for segments such as retail, banking, telecommunications and e-commerce. Synerise's clients include CCC, Carrefour, Żabka, Orange, mBank, and SharafDG.