How to Build Your Data Platform like a Product

Over the past few years, many companies have embraced data platforms as an effective way to aggregate, handle, and utilize data at scale. Despite the data platform’s rising popularity, however, little literature exists on what it actually takes to successfully build one.

Barr Moses, CEO & co-founder of Monte Carlo, and Atul Gupte, former Product Manager for Uber’s Data Platform Team, share advice for designing a data platform that maximizes the value and impact of data on your organization.

Your company likes data. A lot. Your boss requested additional headcount this year to beef up your data engineering team (Presto, Kafka, and Hadoop, oh my!). Your VP of Data frequently lurks in your company’s engineering Slack channel to see “how people feel” about migrating to Snowflake. Your CEO even wants to become data-driven, whatever that means. To say that data is a priority for your company would be an understatement.

To satisfy your company’s insatiable appetite for data, you may even be building a complex, multi-layered data ecosystem: in other words, a data platform.

At its core, a data platform is a central repository for all data, handling the collection, cleansing, transformation, and application of data to generate business insights. For most organizations, building a data platform is no longer a nice-to-have but a necessity, and many businesses stand out from the competition because of their ability to glean actionable insights from their data, whether to improve the customer experience, increase revenue, or even define their brand.

Much in the same way that many view data itself as a product, data-first companies like Uber, LinkedIn, and Facebook increasingly view data platforms as “products,” too, with dedicated engineering, product, and operational teams. Despite their ubiquity and popularity, however, data platforms are often spun up with little foresight into who is using them, how they’re being used, and what engineers and product managers can do to optimize these experiences.

Whether you’re just getting started or are in the process of scaling one, we share five best practices for avoiding these common pitfalls and building the data platform of your dreams:

Align your product’s goals with the goals of the business

It’s important to align your platform’s goals with the overarching data goals of your business. Image courtesy of John Schnobirch on Unsplash.

For several decades, data platforms were viewed as a means to an end versus “the end,” as in, the core product you’re building. In fact, although data platforms powered many services, fueling rich insights to the applications that power our lives, it is only recently that they have received the respect and attention they truly deserve.

When you’re building or scaling your data platform, the first question you should ask is: how does data map to your company’s goals?

To answer this question, you have to put on your data platform product manager hat. Unlike specific product managers, a data platform product manager must understand the big picture versus area-specific goals since data feeds into the needs of every other functional team, from marketing and recruiting to business development and sales.

For instance, if your business’s goal is to increase revenue (go big or go home!), how does data help you achieve that goal? For the sake of this experiment, consider the following questions:

  • What services or products drive revenue growth?
  • What data do these services or products collect?
  • What do we need to do with the data before we can use it?
  • Which teams need this data? What will they do with it?
  • Who will have access to this data or the analytics it generates?
  • How quickly do these users need access to this data?
  • What compliance or governance checks, if any, does the platform need to perform?

By answering these questions, you’ll have a better understanding of how to prioritize your product roadmap, as well as who you need to build for (often, the engineers) versus design for (the day-to-day platform users, including analysts). Moreover, this holistic approach to KPI development and execution strategy sets your platform up for a more scalable impact across teams.

Gain feedback and buy-in from the right stakeholders

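There is no question that getting buy-in up front and iterating on feedback throughout the product development process are both necessary parts of the data platform journey. What isn’t as widely understood is whose voice you should care about.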

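Yes, you need the ultimate sign-off from your CTO or VP of Data on the finished product, but their decisions are often informed by their trusted advisors: staff engineers, technical program managers, and other day-to-day data practitioners.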

While developing a new data cataloging system for her company, one product manager we spoke with at a leading transportation company spent 3 months trying to sell her VP of Engineering on her team’s idea, only to be shut down in a single email by his chief-of-staff.

Consider different tactics based on the DNA of your company. We suggest following these three concurrent steps:

  1. Sell leadership on the vision.
  2. Sell the brass tacks and day-to-day use case on your actual users.
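  3. Apply a customer-centric approach, no matter who you’re talking to. Position the platform as a means of empowering different types of personas in your data ecosystem, including both your data team (data engineers, data scientists, analysts, and researchers) and data consumers (program managers, executives, business development, and sales, to name a few categories). A great data platform lets technical users do their work easily and efficiently, while also allowing less technical personas to leverage rich insights or put together data-based visualizations without much help from engineers and analysts.

When building a data platform for your company, you must consider a wide variety of data personas, from engineers and data scientists to product managers, business function users, and general managers. (Image courtesy of Atul Gupte)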

At the end of the day, it’s important that this experience nurtures a community of data enthusiasts who build, share, and learn together. Since your platform has the potential to serve the entire company, everyone should feel invested in its success, even if that means making some compromises along the way.

Prioritize long-term growth and sustainability vs. short-term gains

Data solutions built with short-term usability in mind are often easier to get off the ground but, over time, end up being more costly than platforms built with sustainability in mind. (Image courtesy of Atul Gupte.)

Unlike other types of products, data platforms are not successful simply because they benefit from being “first-to-market.” Since data platforms are almost exclusively internal tools, we’ve found that the best data platforms are built with sustainability in mind rather than feature-specific wins.

Remember: your customer is your company, and your company’s success is your success. This is not to say that your roadmap won’t change several times over (it will), but when you do make changes, do it with growth and maturation in mind.

For instance, Uber’s big data platform was built over the course of five years, constantly evolving with the needs of the business; Pinterest has gone through several iterations of their core data analytics product; and leading the pack, LinkedIn has been building and iterating on its data platform since 2008!

Our suggestion: choose solutions that make sense in the context of your organization, and align your plan with these expectations and deadlines. Sometimes, quick wins as part of a larger product development strategy can help with achieving internal buy-in, as long as they aren’t shortsighted. Rome wasn’t built in a day, and neither was your data platform.

Sign-off on baseline metrics for your data and how you measure it

It doesn’t matter how great your data platform is if you can’t trust your data, but data quality means different things to different stakeholders. Consequently, your data platform will not be successful if you and your stakeholders aren’t aligned on this definition.

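To address this, it’s important to set baseline expectations for your data reliability, in other words, your organization’s ability to deliver high data availability and health throughout the entire data life cycle. Setting clear Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for software application reliability is a no-brainer. Data teams should do the same for their data pipelines.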
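As a rough illustration of what an SLO and SLI could look like for a data pipeline, the sketch below measures table freshness. Everything in it is an assumption made for the example: the `FreshnessSLO` class, the `freshness_sli` and `meets_slo` helpers, the six-hour staleness limit, and the 99% target are hypothetical, and your stakeholders may agree on entirely different indicators.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FreshnessSLO:
    """Hypothetical objective: data should be no older than `max_staleness`
    in at least `target` (a fraction between 0 and 1) of all checks."""
    max_staleness: timedelta
    target: float


def freshness_sli(check_times, last_update_times, slo):
    """Indicator: fraction of checks at which the data was fresh enough.

    `check_times[i]` is when the i-th check ran and `last_update_times[i]`
    is the most recent table update observed at that check.
    """
    if not check_times or len(check_times) != len(last_update_times):
        raise ValueError("need one last-update time per check")
    passed = sum(
        1
        for checked, updated in zip(check_times, last_update_times)
        if checked - updated <= slo.max_staleness
    )
    return passed / len(check_times)


def meets_slo(sli, slo):
    """True when the measured indicator satisfies the objective."""
    return sli >= slo.target


# Made-up observations: four hourly checks, the last one sees stale data.
slo = FreshnessSLO(max_staleness=timedelta(hours=6), target=0.99)
noon = datetime(2024, 1, 1, 12, 0)
checks = [noon + timedelta(hours=h) for h in range(4)]
updates = [noon - timedelta(hours=1)] * 3 + [noon - timedelta(hours=10)]

sli = freshness_sli(checks, updates, slo)
print(f"freshness SLI = {sli:.2f}, meets SLO: {meets_slo(sli, slo)}")
```

Separating the objective (`FreshnessSLO`) from the measurement (`freshness_sli`) mirrors the point above: the number you sign off on with stakeholders is the target, while the indicator is just what the pipeline reports.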

This isn’t to say that different stakeholders will have the same vision for what “good data” looks like; in fact, they probably won’t, and that’s OK. Instead of fitting square pegs into round holes, it’s important to create a baseline metric of data reliability and, as with building a new platform feature, gain sign-off on the lowest common denominator.

We suggest choosing a novel measurement (like this one for data downtime) that will help data practitioners across the company align on baseline quality metrics.
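To make a measurement like this concrete, here is a minimal sketch that treats data downtime as the total time during which data was wrong, partial, or missing. This definition, the `Incident` record, and the helper names are assumptions chosen for illustration, not a formula taken from this article; agree on your own definition with your stakeholders.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Incident(NamedTuple):
    """Hypothetical record of one data incident."""
    started: datetime   # when the data first became unreliable
    detected: datetime  # when someone (or something) noticed
    resolved: datetime  # when the data was trustworthy again


def data_downtime(incidents):
    """Total unreliable time, assuming incidents do not overlap.

    Overlapping incidents would be double-counted by this simple sum.
    """
    return sum((i.resolved - i.started for i in incidents), timedelta())


def mean_time_to_detection(incidents):
    """Average delay between an incident starting and being noticed."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((i.detected - i.started for i in incidents), timedelta())
    return total / len(incidents)


# Made-up incidents for one reporting period.
incidents = [
    Incident(datetime(2024, 1, 3, 8), datetime(2024, 1, 3, 9), datetime(2024, 1, 3, 11)),
    Incident(datetime(2024, 1, 9, 14), datetime(2024, 1, 9, 17), datetime(2024, 1, 9, 19)),
]

print("downtime:", data_downtime(incidents))
print("mean time to detection:", mean_time_to_detection(incidents))
```

Tracking detection time alongside total downtime helps show whether incidents last long because they are hard to fix or because nobody notices them.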

Know when to build vs. buy

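The first decision you’ll have to make is whether to build the platform from scratch or purchase the technology (or several supporting technologies) from a vendor.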

While companies like (you guessed it) Uber, LinkedIn, and Facebook have opted to build their own data platforms, often on top of open source solutions, building doesn’t always make sense for your needs. There isn’t a magic formula that will tell you whether to build or buy, but we’ve found that there is value in buying until you’re convinced that:

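  • The product needs to operate using sensitive/classified information (e.g., financial or health records) that cannot be shared with external vendors for regulatory reasons
  • Specific customizations are needed for it to work well with other internal tools/systems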
  • These customizations are niche enough that a vendor may not prioritize them
  • There is some other strategic value to building vs. buying (i.e., competitive advantage for the business or beneficial for hiring talent)

The VP of Data Engineering at a healthcare startup we spoke with said that in his 20s, he would have wanted to build. But now, in his late 30s, he would almost exclusively buy.

“I get the enthusiasm,” he says, “But I’ll be darned if I have the time, energy, and resources to build a data platform from scratch. I’m older and wiser now — I know better than to NOT trust the experts.”

When it comes to where you could be spending your time (and, more importantly, your money), it often makes more sense to buy a reliable solution backed by a dedicated team that can help you resolve any issues that arise.

What’s next?

Building a data platform is an exciting journey that will benefit from applying a product development perspective. Image courtesy of memegenerator.net.

Building your data platform as a product will help you ensure greater consensus around data priorities, standardize on data quality and other key KPIs, foster greater collaboration, and, as a result, bring unprecedented value to your company.

In addition to serving as a vehicle for effective data management, reliability, and democratization, the benefits of building a data platform as a product include:

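  • Guiding sales efforts (giving you insight into where to focus your efforts based on how prospective customers respond)
  • Driving application product road maps
  • Improving the customer experience (helping teams learn what your service’s pain points are, what’s working, and what’s not)
  • Standardizing data governance and compliance measures across the company (GDPR, CCPA, etc.)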

Building a data platform might seem overwhelming at first blush, but with the right approach, your solution has the potential to become a force multiplier for your entire organization.