Source: Databricks

“Arkadium has always respected the importance of data-driven decision-making. Databricks has allowed us to significantly level up regarding data quality, processing speed, data-expense management, and, most importantly, the depth and impact of our analysis.”

— Greg Gallo, SVP, Arkadium


Arkadium’s mission is to make games that grown-ups love to play. Players are front and center in the decision-making process for every game the company makes. Arkadium believes in creating games that are inclusive, accessible and age-appropriate, and that players shouldn’t have to read the fine print to know what they’re getting. Arkadium has quickly become the No. 1 online gaming destination and community designed for players 35 and up, a demographic that represents 41% of all game players. From January 2019 to December 2021, Arkadium’s network grew by 164%, to over 15 million monthly players. To continue operating at this scale, Arkadium uses the Databricks Lakehouse Platform to collect data that helps drive user retention and grow their customer monetization and engagement efforts.

Data cost, scalability and stability all impact the ability to deliver value

Arkadium tracks events on their websites and applications to analyze users’ behavior in-game and as they interact with the gaming interface. This analysis directly supports the company’s efforts to grow customer monetization, engagement and retention.

Arkadium faced challenges storing and processing large amounts of unstructured and semi-structured data. The data team was struggling to scale their traditional SQL-based on-premises data warehouses, an architecture that couldn’t keep up with the volume of data Arkadium was processing. In a single pipeline, the team was batch processing close to 500GB of rapidly growing data, applying the transformations required for reporting and dashboarding. The Databricks platform gave Arkadium flexibility in cluster capacity and storage space, making ad hoc hypothesis checking and pattern finding easier and faster.

Technology evolution, from legacy to modern data stack

With the company’s hyper-growth, Arkadium’s data architecture has reached new levels of maturity, evolving from a traditional on-prem SQL data warehouse to a new solution with Databricks Lakehouse as the foundation. This modernization has had a direct impact across several areas of the company, with clear benefits for Arkadium’s customers: with Delta Lake, data reliability has improved, performance is faster and costs are lower. When migrating data from the traditional SQL database to Databricks, they saw a considerable reduction in data volume, mainly due to Delta’s optimizations, including version control and Parquet compaction. Arkadium’s volume of raw data has doubled over the last year, and limited storage on on-prem servers had become a major constraint. With Databricks, they store all data in object storage, without having to think about space in the data warehouse.

The lakehouse architecture has helped Arkadium’s data stack evolve through several processes: they have created a single source of truth by leveraging the medallion architecture’s Silver layer, where data from different sources is cleaned, ordered and joined together; they have leveraged Delta Lake as an open-format storage layer, delivering reliability, security and performance on the data lake; they have connected Databricks SQL to Power BI for fast execution of queries and dashboards; and with Databricks Repos, their experimentation and A/B testing has significantly improved, increasing developer productivity by 40% and letting them go to market more efficiently.
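The Silver-layer cleanup described above can be illustrated with a minimal, hypothetical sketch: raw events from different trackers are normalized, deduplicated and bound to a dimension table. The records, column names and game IDs below are invented for illustration; in production this step would run as a Spark job reading and writing Delta tables rather than plain Python.

```python
# Hypothetical sketch of a Silver-layer cleaning step (made-up records;
# a real pipeline would do this in Spark against Delta tables).

RAW_EVENTS = [  # "Bronze": raw events from different trackers
    {"user_id": "u1", "game": "solitaire", "ts": "2021-06-01T10:00:00"},
    {"user_id": "u1", "game": "solitaire", "ts": "2021-06-01T10:00:00"},  # duplicate
    {"user_id": "u2", "game": "Mahjongg ", "ts": "2021-06-01T11:30:00"},  # messy value
]

GAME_DIM = {  # dimension table binding game names to canonical IDs
    "solitaire": 101,
    "mahjongg": 102,
}

def to_silver(raw_events, game_dim):
    """Clean, deduplicate and join raw events with the game dimension."""
    seen = set()
    silver = []
    for event in raw_events:
        game = event["game"].strip().lower()       # normalize messy values
        key = (event["user_id"], game, event["ts"])
        if key in seen:                            # drop exact duplicates
            continue
        seen.add(key)
        silver.append({
            "user_id": event["user_id"],
            "game_id": game_dim[game],             # bind to the dimension
            "ts": event["ts"],
        })
    return silver

silver = to_silver(RAW_EVENTS, GAME_DIM)
```

The same shape — normalize, deduplicate, conform to dimensions — is what turns scattered raw events into a single source of truth that downstream queries and dashboards can trust.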

Winning big by delivering better insights and revenue

Arkadium has internal applications that interact with many supply-side platforms, marketing CRMs and several event trackers, and a number of internal tools that contain the master data. As a result, all data generated by these systems has to be connected and kept in one place. Although the company was primarily using an on-prem SQL warehouse, each team was independently storing its data in various locations, leading to a data swamp with inconsistent data. This was the biggest challenge they faced, and as a data-driven company they solved it by unifying all data: ingesting, cleaning and linking it to dimensions, a complex problem when data may arrive with different sets of columns in different orders.
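The column-alignment problem mentioned above can be sketched as follows. This is a hypothetical illustration in plain Python with invented column names; on Databricks, Delta Lake's schema evolution (`mergeSchema`) typically handles this. The idea is simply to project every incoming record onto one canonical column set, regardless of which columns it carries or their order.

```python
# Hypothetical sketch: unify records that arrive with different sets of
# columns in different orders by projecting onto a canonical schema.

CANONICAL_COLUMNS = ["user_id", "event", "revenue", "country"]

def align(record, columns=CANONICAL_COLUMNS, fill=None):
    """Project a record onto the canonical column set, filling gaps."""
    return {col: record.get(col, fill) for col in columns}

batch = [
    {"event": "ad_click", "user_id": "u7"},            # missing columns
    {"country": "US", "user_id": "u8",
     "revenue": 0.05, "event": "purchase"},            # different order
]

unified = [align(record) for record in batch]
```

After alignment, every record has the same columns in the same order, so batches from different trackers and tools can be appended to one table and joined consistently.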


Arkadium’s ability to unify their data with Databricks meant more data with more detail. Creating data sets is now fast and simple: daily data ingestion that used to take 5 hours now takes only 10 to 20 minutes. This has positively impacted Arkadium’s total cost of ownership. Leveraging Delta Lake with object storage has allowed them to reduce IT infrastructure costs by more than 30%: they are ingesting 2x more data while spending 30% less than they used to spend on on-prem servers.


“Immense respect to the technologists and engineers at both Databricks and Arkadium for their eloquent Lakehouse architecture and its seamless implementation,” says Greg Gallo, SVP at Arkadium. “We’re looking forward to our next chapter in partnering with Databricks as we apply machine learning to Arkadium’s massive data. In Databricks, we have a partner committed to extensive R&D and excellent client service.”