BDU 2.0 will focus on creating a Big Data culture of success and on exploring topics such as doing things differently than you or your organization have ever done before.


Machine Learning & Artificial Intelligence
Deep Learning
Big Data in the Cloud
Predictive Analytics
IoT (Internet of Things)


BDU 2017 in numbers


    Keynote speaker

  • The travel industry is an amazing field for Data Science and Data Scientists. In this talk you will hear all about our success stories, failures, challenges, anti-patterns, good practices, and everything we learned over the years while turning petabytes of data into awesome trips.

    Lucas Bernardi is a Senior Data Scientist. He worked for many years on recommendations and personalization, applying machine learning and large-scale data science on a daily basis. Today his focus is on scaling up the application of Machine Learning throughout the whole organization. His background is in Computer Science and Software Engineering.
  • Speakers

  • In recent years we have embraced Polyglot Persistence. This is a fancy term meaning that when storing data, it is best to use multiple data storage technologies, chosen based on the way the data is used. If we have polyglot persistence, we sometimes need polyglot operations as well. One of the most popular use cases in Big Data is search: almost all websites provide a search function so their users can find what they are looking for, usually an Apache Lucene-based solution such as Elasticsearch or Solr. I will show you how to enrich this kind of search with the power of graph-based queries, and implement a polyglot search function whose results come from the cooperation of a search engine and a graph-based real-time recommendation engine.

    I’m an engineer with both a business and a technical mindset, and a big fan of data analytics and big data techniques. A few years ago I started exploring different NoSQL solutions, and I finally found graph databases. It started with a few pet projects and meetups; now I am an ambassador of the market-leading graph database, Neo4j, and work as a senior consultant at the world's #1 Neo4j consultancy, GraphAware.
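As a minimal sketch of what such a polyglot search could look like (the function name, data, and weights below are illustrative assumptions, not the speaker's implementation), one can blend text-relevance scores from a search engine with affinity scores from a graph traversal:

```python
# Hypothetical polyglot search: re-rank full-text search hits with
# scores from a graph-based recommender. All names and weights are
# illustrative only.

def polyglot_search(text_hits, graph_scores, alpha=0.7):
    """Blend search-engine relevance with graph recommendation scores.

    text_hits:    list of (item_id, relevance), e.g. from Elasticsearch
    graph_scores: dict item_id -> affinity, e.g. from a Neo4j traversal
    alpha:        weight of text relevance in the final ranking
    """
    blended = []
    for item_id, relevance in text_hits:
        affinity = graph_scores.get(item_id, 0.0)
        blended.append((item_id, alpha * relevance + (1 - alpha) * affinity))
    return sorted(blended, key=lambda pair: pair[1], reverse=True)

# Example: raw "phone" search hits, plus affinities from the user's graph.
hits = [("case-1", 0.9), ("phone-7", 0.8), ("cable-3", 0.4)]
graph = {"phone-7": 1.0, "cable-3": 0.6}
print(polyglot_search(hits, graph))  # "phone-7" overtakes "case-1"
```

The blending weight is the knob that decides how strongly the recommendation graph is allowed to override pure text relevance.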
  • Open data is an intellectual treasure trove that has already helped many unexpected and often fruitful applications surface. There are many areas where open data is considered high value, and where examples of its use already exist. This session will show how easy it is to work (and play) with open data through a BI tool. The presentation will show you how to consolidate data from different sources into a single analysis, which lets you see the connections and answer questions. I will also give you hints and tips on where to find accessible open data sources and use cases.

    I am a data analyst at Nextent Informatics Co. My primary role is to demonstrate and present Qlik Sense and bring projects into the proof-of-concept phase, because I understand the main drivers within business intelligence. Since I have always been fascinated by the hidden values and information in data, I am honoured that my job is to create unique value for clients by delivering meaningful analysis based on the provided sources.
  • This talk is for the underdog. If you're trying to solve data-related problems with limited or no resources, whether it's time, money or skills, look no further. This talk points mostly to decades-old technology, free operating systems and cheap hardware where possible, but if it makes sense to spend a hundred bucks instead of tearing your hair out, we'll say so. This talk is opinionated. The cloud is somebody else's computer; use it if it makes economic sense and you believe that distributed computing is a solved problem. The stream consists of lots of unborn events that have not been acknowledged; don't cry if you lose them. Every abstraction layer can introduce an order of magnitude of slowdown in memory, processor and I/O, plus black boxes and undebuggable errors. Unfortunately, they usually do. Nobody ever got fired for grepping files from drives mounted in memory (aka MEMDISK). We mostly use bash, SQL and make, and maybe Python and Go if we really have to. This talk contains no made-up sample code and no false promises of fancy technology. I talk about stuff we use in production. Period. It's gonna be fine. Nothing from Apache, no MapReduce, no streaming. Long live James Mickens.

    A generalist data nerd and startup specialist. Specialties: cloud-agnostic data infrastructure and data-driven product development (Microsoft, 6Wunderkinder). Experienced co-founder who has built and hired teams of up to 30 people. Proven build-to-market capabilities.
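In the spirit of that talk, here is an editor's illustration (the sample data below is made up and is not from the talk itself) of how far decades-old SQL gets you with nothing but Python's built-in sqlite3 module, no cluster required:

```python
# Aggregations people often reach for MapReduce to do, done with plain
# SQL in an in-memory SQLite database. Table and values are invented
# for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user TEXT, action TEXT, ms INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("ann", "click", 120), ("bob", "click", 300), ("ann", "buy", 90)],
)

# Count and average latency per action type.
for row in conn.execute(
    "SELECT action, COUNT(*), AVG(ms) FROM events GROUP BY action ORDER BY action"
):
    print(row)
# prints ('buy', 1, 90.0) then ('click', 2, 210.0)
```

For data that fits on one machine (which is most data), this is debuggable, fast, and boring in the best sense.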
  • Two of the major expenses for tech companies are development time and computing resources. Andrei Alexandrescu, a famous C++ programmer, said that after he optimized 1% of the Facebook backend, the company saved more than 10 years' worth of his salary in electricity costs each month. To save costs, we need high-level tools that make developers productive, and we need those tools to perform well. The key to those tools is compiler technology. In this talk I will introduce how some of these tools (Apache Flink, Apache Spark, TensorFlow, etc.) can make your life easier by providing faster and more convenient runtime code generation.

    Gábor has a Master's degree in Computer Science and recently started his PhD studies. He has participated in compiler research since 2012. He is a long-time LLVM/Clang contributor, a regular speaker at conferences and meetups, took part in Google Summer of Code twice, and worked briefly for Apple. He also teaches C++ and Advanced Compiler Construction at ELTE.
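A toy sketch of the runtime code generation idea the talk attributes to engines like Spark and TensorFlow (this Python version is an editor's illustration, not the speaker's code): instead of interpreting a filter expression for every row, compile it once into a specialized function.

```python
# Runtime code generation in miniature: build Python source for a
# specialized predicate, compile it once, then call it per row.
# Inputs are assumed trusted; a real engine would validate them.

def compile_predicate(column, op, value):
    """Generate a specialized row filter, e.g. row['age'] > 30."""
    src = f"def pred(row):\n    return row[{column!r}] {op} {value!r}"
    namespace = {}
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace["pred"]

pred = compile_predicate("age", ">", 30)
rows = [{"age": 25}, {"age": 40}]
print([r for r in rows if pred(r)])  # prints [{'age': 40}]
```

The generated function has no interpretation overhead in its hot path, which is exactly the win query compilers chase at much larger scale.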
  • A huge volume of data is generated by telecom systems every day; respecting data privacy rules, we make it available to data-driven service developers.

    Mátyás Dobó is the head of New Business Directorate of Magyar Telekom, expert of the digitalization of the telecommunication sector, founder of the SMART conference, and guest editor at Forbes.
  • We represent a startup called DiabTrend, whose members are eager to solve real-life healthcare problems and are convinced that data science offers the best tools for it. Our focus is diabetes, one of the most common diseases of our time: according to the WHO, 422 million people were affected by it in 2014. There is still no cure for the disease, and for many people it is really hard to manage. We'll talk about how to use neural networks in medical science and about possible use cases. Through diabetes we want to show the performance of recurrent neural networks in predictive analysis on complex real-world problems, especially in healthcare, and why it is important to work on these technologies today. How can we compete with big companies, and why do we think the future lies in machine-learning-based projects?

    Marcell Havlik and Tamás Havlik are twin brothers. Both finished their studies in computer science and mechatronics engineering at Master's level, and both work in the field of data science. In the past they built their own machine learning library, and they have been following state-of-the-art Big Data algorithms ever since. Recently, as part of a team, they won a Data Science award at the Leading Data Hackathon organized by Telekom, and they also achieved good results in an international competition organized by Numerai.
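To make "recurrent neural networks for prediction on time series" concrete, here is a minimal, hypothetical forward pass of a one-unit recurrent cell in pure Python. The weights are fixed toy values; a real model (for example one predicting glucose levels) would be trained and far larger.

```python
# A single recurrent unit: the hidden state h carries a memory of
# earlier measurements, so the prediction depends on the whole sequence,
# not just the last value. Weights here are illustrative constants.
import math

def rnn_forward(sequence, w_in=0.5, w_rec=0.8, w_out=1.0):
    """Run a one-unit RNN over a sequence and return a prediction."""
    h = 0.0
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h)  # update hidden state
    return w_out * h  # linear readout of the next-value prediction

print(rnn_forward([0.1, 0.2, 0.4]))
```

The recurrence is what lets such models pick up trends in medical time series, which is why they suit predictive analysis on sequential health data.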
  • How to write a data pipeline from scratch using high-performance components that scale better than Hadoop, in both the technical and the financial sense. “Reactively” is a marketing and product analytics platform for online businesses. From data collection to visualization, it covers all aspects of data-driven marketing and product development.

    In my over 15 years with various Fortune 500 companies and startups, I have held a variety of increasingly responsible engineering positions, including systems and software engineering roles. I have managed changes in large-scale infrastructures without downtime while customers were actively using the system. Besides engineering, I have experience in managing onshore and offshore software teams delivering mission-critical systems. I also work with startups as a mentor and advisor.
  • Creating large-scale web crawling networks comes with numerous problems, ranging from graph theory to memory and network optimization. At SentiOne we managed to create a system which monitors and extracts content from 500,000 domains in 23 European languages. In my presentation I will explain the key challenges in web scraping and the way we overcame them, and will also speak about our failures and the solutions that didn't work for us.

    Michał is the founder and CTO of the startups SentiOne and SalesLift, and holds an MSc in Computer Science and a BA in Economics. He has been working with Big Data and Artificial Intelligence technologies for the last 7 years, combining DevOps and R&D skills. At SentiOne he is responsible for product development and system administration, and he is also the coordinator of a research grant funded by The National Centre for Research and Development.
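One building block the abstract hints at can be sketched as follows; the class and its policy are illustrative assumptions, not SentiOne's actual design. A crawl frontier deduplicates URLs and hands out at most one URL per domain per batch, so no single host is hammered:

```python
# A memory-friendly crawl frontier: per-domain queues for politeness,
# a seen-set for deduplication. Illustrative sketch only.
from collections import deque
from urllib.parse import urlparse

class Frontier:
    def __init__(self):
        self.queues = {}   # domain -> deque of pending URLs
        self.seen = set()  # ensures each URL is fetched at most once

    def add(self, url):
        if url in self.seen:
            return False
        self.seen.add(url)
        domain = urlparse(url).netloc
        self.queues.setdefault(domain, deque()).append(url)
        return True

    def next_batch(self):
        """Take at most one URL per domain, spreading load across hosts."""
        return [q.popleft() for q in self.queues.values() if q]

f = Frontier()
f.add("http://a.example/1")
f.add("http://a.example/1")  # duplicate, ignored
f.add("http://a.example/2")
f.add("http://b.example/1")
print(f.next_batch())  # one URL from a.example, one from b.example
```

At half a million domains, the real engineering is in making the seen-set and queues fit in memory, which is where the talk's graph-theory and optimization problems begin.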
  • You can use logging on your IoT device(s) for collecting usage statistics, monitoring, security or for debugging running applications. Logs can then be sent to various Big Data destinations for storage or for further analysis. Learn more about how you can solve both the logging agent and central server side using syslog-ng through a wide range of examples that include Amazon Kindle, BMW i3, and industrial devices of National Instruments.

    Peter is a system engineer working as a community manager at BalaBit, the company behind the syslog-ng logging daemon. He helps distributions to maintain the syslog-ng package, follows bug trackers, supports syslog-ng users, and talks regularly at conferences (SCALE, FOSDEM, Libre Software Meeting, LOADays, etc.). In his limited free time he is interested in non-x86 architectures, and works on one of his PPC or ARM machines.
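A minimal syslog-ng configuration sketch for the agent idea described above (the port, transport and file path are assumptions to adjust; a production setup might forward to a Big Data destination such as Elasticsearch or Kafka instead of a flat file):

```
# Collect logs sent by IoT devices over the network and write them
# to a local file. Illustrative values, not from the talk.
source s_iot {
    network(transport("udp") port(514));
};
destination d_file {
    file("/var/log/iot.log");
};
log { source(s_iot); destination(d_file); };
```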
  • The chatbot hype is over. Brands need real solutions for real conversations. This talk is about how machine learning can help human chat agents to be more productive, and how real conversations can help chatbots to get off the ground.

    János has spent more than 10 years in the advertising industry, developing digital campaigns and products for top clients like Coca-Cola and Telekom. After quitting the ad industry he co-founded the startup Clipdis, where the team worked together with Facebook Messenger and KIK Messenger to create a chatbot. János is now the founder of a startup whose team is researching ways to make chat-based customer care more productive with AI and data-driven technologies.
  • As part of the MOL Group Machine Learning Program, this project aims to develop automated advanced business analytics capabilities in the Danube Refinery. After a successful Proof of Concept (PoC) in 2016 on Delayed Coker Unit coke yield and steam eruption forecasting, the next steps are to:

    • set up a general Machine Learning technology and capability framework for the Danube Refinery (DR), ready to be rolled out later,
    • utilize OSIsoft PI system as an Industrial Internet of Things (IIOT) data store (130k sensors in DR, 400k+ in Group),
    • respond to crude diversification and support Coke feed homogenization project with Machine Learning.

    Data Innovation Team Lead at MOL Group. Piloting new IT concepts and technologies, like Big Data, Machine Learning, Streaming, In-Memory Databases, AR/VR, iBeacon and sensors.
  • The 3 V’s of Big Data are a quickly moving target. To shoot effectively, you need to plan for the unexpected. I will share best practices on how our customers use Qlik Sense in Big Data / Data Lake infrastructures. These best practices aim at high adoption of analytics among a wide variety of users with different skills (not only data scientists). Thanks to the APIs of the Qlik Platform, we can manage analytics and its metadata in a more automated way. The extensibility of the Qlik Platform enables custom data visualization, and thanks to the wide connectivity possibilities we can talk to NoSQL data sources, web services, REST APIs, R, Python and much more.

    I work as a Senior Solution Architect at Qlik Eastern Europe, mostly with customers from the Retail/FMCG/Logistics industry. I draw on retail experience gained over 10 years at Metro Group in different roles, where I was involved in designing and implementing ERP, Data Warehouse, BI, Shelf- and Store Management, and Loyalty systems. I am truly fascinated by data visualization and the Internet of Things, and I am also an active organizer of Qlik meetups and Makerspace in Warsaw.
  • We were new to Hadoop when we started building Prefixbox. Over the last three years we have gone through a lot of iterations to take our product this far. In this talk I will share key insights from what we learned: what went well and what did not.

    István Simon is the CEO and founder of Prefixbox (founded in 2014). Since then his team has helped 40+ e-commerce sites measure and optimize their site search experience. Previously, István worked as a Software Engineer on Microsoft's search product team in London.
  • Many companies are experimenting with Big Data use cases today. They set up Data Lakes to collect and manage data from a set of sources that fit the subject area, and begin to analyze the contents with carefully chosen (or just somehow already known) tools from the wide choice of Hadoop distributions, or elsewhere. Fine so far. But how can the analysts, who have vast knowledge of how business runs its course in a company, be involved in exploring that value?

    I began my professional life as a Research Engineer, working on projects that were hard to explain to an outsider. In contrast, during my more than 20 years with Oracle in various sales support positions in the analytical field, I have learned one thing: bringing simple tools to market is much easier. If users have a good understanding of what they are receiving and how exactly they will be able to use it in their routine, they are much more likely to support a buying decision. Big Data solutions are complex today, but with their increasingly widespread acceptance there is a significant need for simplification. My intention is to offer simplification to projects that intend to become productive in a shorter-than-average time.
  • In the 1930s one farmer produced enough agricultural product to feed 4 people. In the 1970s this number rose to 73, and by the 2010s one farmer produced enough food to feed 155 people. Behind these improved capacities and great performances lie complex trials, breeding programs, and huge amounts of analysed data. In 2017 we live in the era of modern technology: data can be collected easily, although the extent of the continuously collected data is becoming hard to manage. Effective data analysis is the key to maximizing the potential of the available production areas and to keep feeding the world's quickly growing population. The agricultural sector needs to find the best ways to analyse, systematize and interpret the available data in order to implement further long-term solutions for field production. Our challenge: ensure real-time data collection and provide visually manageable output based on data measured in the field.

    Business Improvement Consultant, focusing on corn and sunflower production processes in 9 countries in Europe. 79 successfully completed Lean projects and 54 Six Sigma projects at both regional and European level in 6 years, resulting in hard, recurring benefits with incremental impact. Leading role in implementing new methods and techniques of data collection, data management, data analysis, real-time data input and real-time reporting in the agricultural sector.
  • While technologies change rapidly, projects fail much as they always have. If we look at some Big Data project failures (few of which are seen publicly), we may find more reasons on the organization's side than in the technology. One pattern of project failure seems to be related to the expectation of lowering the cost of data and analytics with open source technologies.

    Erzsébet Erdélyi has 15+ years of experience in leading business transformation. She started leading analytics-based business transformation as part of Customer Relationship Management (CRM) programs. Since joining Teradata, she has been helping customers plan and lead data-driven business transformations as a program manager.



Data Science at

Data Janitor 101

Big Bang Theory and Telekom

Winners of Telekom Leading Data Hackathon pitches

From IoT to Big Data using syslog-ng

Chat automation doesn't start with chatbots


Let’s get practical. Qlik Sense design patterns for Big Data / Data Lake

Data Science to Fight Diabetes, Predictive Analysis


Project failure patterns when shooting for cheap data storage

The power of polyglot searching

Working with open data


Improving Operational Efficiency and Asset Health with Predictive Analytics in Downstream Process

Beyond Hadoop, a simple data pipeline


Compiler Technology: Key to the Performance

Data management in the field production sector: challenges, vision and effectiveness


Lessons learned while building Search Analytics pipeline using Hadoop on Azure

Hooray! We have a Data Lake! What can our Analysts do with it?


What can you expect from the seminars?

Workshops for exploring opportunities in the Era of Big Data

Deep dive into machine learning tools and appliances

Detailed exploration of distributed technologies

Hottest NoSQL approaches and tools

Best practices for using graph databases such as Neo4j

You Get To Meet ...

... data lovers, technical experts, Senior and C-level executives from leading innovators in the Data Science space. Executives from startups to large corporations will attend our conference.

The conference will feature internationally recognized speakers and it may be the most powerful event you attend in 2017.

Payment and tickets

The conference is FREE OF CHARGE but attendees must register via Eventbrite.
However, VIP tickets are also sold (35,000 HUF), giving you the excellent opportunity to spoil yourself with quality gourmet food and PaaS (Pálinka as a Service), and to leverage excellent networking possibilities.


Why sponsor the event? Well, Big Data Universe gives you as many opportunities as there are stars shining in the night sky.

We have developed convenient and customizable packages to help your organization meet its objectives and reach your target market in the Big Data industry.

We are dedicated to making you part of a truly great conference experience. For detailed information please download our sponsorship offer!

Presenter Opportunities

More than 15 leading experts in the Data Science area present at our conference regularly.
Please send us an email regarding speaking engagements.


We are keen to hear any ideas you may have to make BDU Conference 2.0 even more special! Send us your ideas.


By entering the event premises, you consent to interview(s), photography, video recording and its/their release, publication, exhibition, or reproduction to be used for news, media, or any other purpose by Big Data and its affiliates and representatives. You release Big Data Universe Conference 2.0, its partners and each and all persons involved from any liability connected with the taking, recording, digitizing, or publication of interviews, photographs, computer images, video and/or sound recordings.

By entering the event premises, you waive all rights you may have to any claims for payment or royalties in connection with any exhibition, streaming, webcasting, television, or other publication irrespective of whether a fee for admission or sponsorship is charged.

You have been fully informed of your consent, waiver of liability, and release before entering the event. If you would not like to be recorded, please notify our staff members upon registration.


Akvárium Klub

Erzsébet tér 12.,
1051 Budapest,


We would be happy to answer your questions and hear your suggestions!
Get in touch with us anytime @ or call us:


+36 30 560 2917


+36 70 386 9955

and the team, who helped us a lot:





