Investing in MOSTLY AI, a Synthetic Data Privacy Company
Nearly every enterprise talks about the need to become more data-driven, but even the most sophisticated data-collection initiatives often struggle to put that data to use. Privacy laws restrict the sharing of personally identifiable information (PII), making it difficult for teams to share data and collaborate effectively. If your office has developed a promising model and you’d like a data scientist in Germany to work with it, for example, you may not be allowed to share the underlying data. Or you may be able to work together on it, but only after a six-month approval process. Such restrictions slow progress in analytics, machine learning, and artificial intelligence. Data breaches, which affect thousands of companies and expose hundreds of millions of records every year, also discourage data sharing.
One way to protect PII is with hashing and similar data anonymization techniques. These can disguise a person’s name and other attributes, but are of limited use when individuals in a data set can be singled out using just a few variables. In fact, in one recent study researchers were able to identify 90% of individuals in a consumer credit card database from just four random pieces of information.
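To make the limitation concrete, here is a minimal Python sketch. It assumes a tiny, made-up table of records and is not drawn from the study mentioned above or from any MOSTLY AI product; the field names and the `hash_name` and `unique_fraction` helpers are hypothetical. It hashes each name and then counts how many records are still unique on the remaining attributes.

```python
import hashlib
from collections import Counter

# Hypothetical toy records: (name, zip_code, birth_year, gender)
records = [
    ("Alice Adams", "10001", 1985, "F"),
    ("Bob Brown", "10001", 1985, "M"),
    ("Carol Clark", "10002", 1990, "F"),
    ("Dan Davis", "10002", 1990, "F"),
    ("Eve Evans", "10003", 1972, "F"),
]

def hash_name(name: str) -> str:
    """Replace a name with a truncated SHA-256 digest."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()[:12]

def anonymize(rows):
    """Hash the name field and leave the other attributes untouched."""
    return [(hash_name(name), zip_code, year, gender)
            for name, zip_code, year, gender in rows]

def unique_fraction(rows):
    """Share of rows whose non-name attributes occur exactly once."""
    combos = Counter(row[1:] for row in rows)
    singled_out = sum(1 for row in rows if combos[row[1:]] == 1)
    return singled_out / len(rows)

hashed = anonymize(records)
fraction = unique_fraction(hashed)
print(f"{fraction:.0%} of hashed records are unique on three attributes")
# prints "60% of hashed records are unique on three attributes"
```

Even with every name hashed, three of the five toy records can be singled out by anyone who knows a person's zip code, birth year, and gender.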
MOSTLY AI strives to move beyond data anonymization and liberate private, sensitive data so it can be used across an organization, and even with third parties, while respecting individuals’ privacy and the rules and laws that govern it. The company does this by creating synthetic data: data that maps closely to an organization’s real data, but is not based on any particular individual. Synthetic data contains all the statistical information needed to build a model, but comes with none of the hurdles associated with PII. It also allows organizations to use full data sets without running into fields that are blocked because they are private, and makes it easier for teams to collaborate because they can be confident the data they’re using carries no PII-related legal or security risk.
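The general idea of synthetic data can be sketched in a few lines of Python. This is a deliberately simple illustration, not MOSTLY AI's method: it assumes a made-up two-column table (age and income) and swaps in a plain bivariate Gaussian as the generative model. The `fit` and `sample` helpers are hypothetical, and the toy model offers no formal privacy guarantee.

```python
import random
import statistics

def fit(rows):
    """Estimate means, standard deviations, and correlation of two columns."""
    xs, ys = zip(*rows)
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    std_x, std_y = statistics.pstdev(xs), statistics.pstdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / len(rows)
    return mean_x, mean_y, std_x, std_y, cov / (std_x * std_y)

def sample(params, n, rng):
    """Draw n new rows from a bivariate Gaussian with the fitted parameters."""
    mean_x, mean_y, std_x, std_y, rho = params
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + std_x * z1
        # Mixing z1 and z2 reproduces the fitted correlation rho.
        y = mean_y + std_y * (rho * z1 + (1 - rho ** 2) ** 0.5 * z2)
        rows.append((x, y))
    return rows

rng = random.Random(0)

# Hypothetical "real" data: (age, income) pairs with income rising with age.
ages = [rng.uniform(20, 65) for _ in range(2000)]
real = [(age, 20000 + 800 * age + rng.gauss(0, 5000)) for age in ages]

real_params = fit(real)
synthetic = sample(real_params, len(real), rng)
synth_params = fit(synthetic)

labels = ("mean age", "mean income", "std age", "std income", "correlation")
for label, r, s in zip(labels, real_params, synth_params):
    print(f"{label:>12}: real={r:.2f} synthetic={s:.2f}")
```

The synthetic rows match the real table's means, spreads, and correlation closely, yet none of them is a copy of a real row. A Gaussian is far too crude for production use (it can, for instance, generate ages outside the original range), which is why real synthetic data systems rely on much richer models.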
Synthetic data enables a balance between privacy and accuracy, and MOSTLY AI allows its customers to fine-tune that balance. If the data is to be shared with a third party, for example, the level of privacy can be set higher and the synthetic data will be a little further away from the underlying data. If the data is only to be shared internally, a team might want to prioritize accuracy and retain only the level of privacy necessary to meet regulatory and compliance guidelines.
Because synthetic data is so new, organizations are still learning how best to use it, and MOSTLY AI is at the forefront of developing new use cases. Synthetic data can be used to test a model for biases, to drive pilot projects, and to determine whether it’s worth the time and trouble to get the approvals to work with the real data. It can also help test the robustness of new technology: for example, if an organization is developing new software such as a banking application, it may want to confirm that the app can properly serve distinct customer profiles. Synthetic data allows the organization to create millions of unique profiles, each containing all the data it needs to test the software and find edge cases that weren’t initially identified.
MOSTLY AI was founded in 2017 by a team of Austrian data scientists headed by Michael Platzer, Klaudius Kalcher, and Roland Boubela. The company’s leadership combines strong technical abilities with an understanding of how large clients work: its CEO, Tobias Hann, is a former consultant who previously created a spin-off for a large corporation. In speaking with MOSTLY AI’s clients, we were impressed by the impact its synthetic data platform has had for them, as well as by the level of attention and service that accompanied it. MOSTLY AI’s customers include the City of Vienna, Telefonica, and Erste Group, Austria’s first savings bank.
We’re pleased to announce our investment in MOSTLY AI and are looking forward to supporting them in their mission.
For more information, email Ornit Shinar at ornit.shinar@citi.com or Avi Arnon at avi.arnon@citi.com.