居民如何通过蒙特卡罗减少90%的数据问题

Many data leaders tell us that their data scientists and engineers spend 40%或更多 他们的时间 tackling data issues instead of working on projects that actually move the needle.

没必要这样的. 这是居民的数据工程团队的做法, 一个直接面向消费者的家具品牌之家, 减少了90%的数据事件 s点的数据可观测性凯尔.

Direct-to-consumer mattress brands may not be the first category that comes to mind when discussing data-driven companies. 但是Daniel Rimon,数据工程的主管 居民, 这归功于他们在技术上的投资, data, 以及他们最近的成功营销:2020年, 《推荐一个正规滚球网站》实现了100%的增长,总销售额超过5亿美元, 并获得了一轮1.3亿美元的投资. 

“当你听到床垫的时候, 它不一定会告诉你数据或技术, 但这就是推荐一个正规滚球网站公司的宗旨,”丹尼尔说. She currently sits on a data team with four analysts and three engineers, led by a VP of Data. 

《推荐一个正规滚球网站》依靠数据来推动营销决策和支出, with over 20 marketing connections supporting lead and customer tracking, 分割, 和零售分析. 丹尼尔和她的团队现在管理着30多人,000 tables in Big Query and leverage a data tech stack that includes 蒙特卡罗, 谷歌云平台, 气流, 多溪流的, 和其他企业工具, conducting experiments and enabling the business to base decisions on data in real-time. 

推荐一个正规滚球网站有很多数据,”丹尼尔说. “这让事情变得非常复杂,也很难追踪. And if any of the data that powers our operations becomes missing or broken, 或者管道里的什么东西坏了, 这可能会导致整个行业的严重问题.”

The challenge: connecting disparate data sources and understanding relationships between data assets
When a map of your data assets starts looking like a jumble of rubber bands, 你知道你有麻烦了. 图片由Shutterstock提供,可通过其标准许可获得.

In 2019, 丹尼尔加入公司的时候, 居民一直在使用BigQuery来分析广告表现的数据, 跟踪营销花费, 商品成本, 和销售. 

“We had complicated queries, many duplications, inconsistent logic—it was quite a mess,”丹尼尔说. “We knew we were missing a lot of data and wanted to keep better track of our website through Google Analytics and Google Tag Manager. 推荐一个正规滚球网站没有任何监控, 关于客户成功方面的信息, and found out we didn’t have connections to a lot of our marketing sources.”

The company suffered from unreliable data and strained relationships between teams. “Stakeholders and executives weren’t able to access the most up-to-date data they needed to make decisions,”丹尼尔说. “这也对业务部门之间的关系产生了负面影响. 例如, 如果你从事数据工程, 它确实会让你与BI或分析团队的关系变得紧张.”

超越公司内部关系, 客户体验也受到了影响, with bad data leading to customers receiving emails that weren’t relevant to them. 

At the time, Daniel and her team attempted to handle data issues manually. “推荐一个正规滚球网站的策略将基于记忆, 一些基本工具, 推荐一个正规滚球网站的分析师和同事给了推荐一个正规滚球网站很多希望和帮助,”她说. 手工工作导致了重复的视图和流程, and the team didn’t have a clear understanding of what fields or tables affected downstream data consumers. 他们设置的唯一通知是基本的故障通知.

“我早上, 或者是我老板的早上, would start by opening every main dashboard to look for mistakes or stuff that doesn’t make sense,”丹尼尔说. “它非常原始,效率非常低,实际上令人沮丧.”

偶尔, 丹尼尔了, something would get stuck in the back-end process that injects data into her team’s Big Query. 除非她手动检查过仪表盘,否则她不会知道. “But sometimes the CEO of the company would Slack me and my boss and say, ‘What’s going on? 推荐一个正规滚球网站没有销售吗?’所以这就是我的噩梦——收到“数据不可靠”的信息. 数据被破坏了.没有上下文, 我不知道从何说起, 所以你必须进行调查并找到核心内容,这让人很沮丧. 当你刚到公司的时候,就没那么有趣了.” The time difference between the Israel-based data team and their US-based colleagues didn’t help matters.

丹尼尔的老板知道情况必须改变, so Daniel was asked to build an entire data performance process from scratch. With an existing platform that hadn’t been well-maintained and proved difficult to debug, 这是一项具有挑战性的工作.

我告诉了我的老板, ‘I wish there were someplace that I could take a table and see everything it connects to,”丹尼尔说. “我可以知道, 它会影响这个和这个, 但不是在这里, 这是最终的仪表盘. 我需要一张很大的地图. 肯定有这样的东西!”

解决方案:数据可观察性 

丹尼尔和她的团队开始明白他们需要什么, 而不是构建一个定制系统, 居民 began using 蒙特卡罗 to handle real-time monitoring and alerting, as well as lineage—that big map of data relationships that Daniel told her boss she needed. 

Outcome: Fixing data issues before they impact data consumers through real-time monitoring and alerting
图片由蒙特卡罗提供.

蒙特卡罗 uses machine learning to automatically set alerting parameters for tables, 发现所有住院医师的表格在新鲜度或容量上的异常, 据丹尼尔估计,这占了他们警报的90%. 

现在, 而不是来自CEO的恐慌性信息, Daniel and her team now receive 蒙特卡罗 alerts in Slack for issues like a table updating with fewer rows than expected, 或者没有在预期时间更新. 他们是第一个知道问题的人, 工程师可以很容易地更新问题状态以跟踪进展. 

Daniel还可以定义自定义警报, such as one she put in place for a particularly sensitive set of processes that relies on Google Analytics delivering data to their warehouse at a certain time every day. “它通常在以色列时间晚上7点左右到达,但这并不一致. 之后我还要安排具体的程序, 但如果我错过了, 我得重新来过. I start my process at 8pm, and the first step is to know that the Google Analytics update happened. If it’s missing, I get a Slack alert message so I can stop, figure out what happened, and fix it. 我必须在主持之前手动检查这个,这真的很烦人. 我花了20秒设置这个警报,而且非常简单.”

结果:通过自动沿袭防止“数据灾难” 
图片由蒙特卡罗提供.

自动化血统对丹尼尔和她的团队来说是一个巨大的优势. “我不需要任何东西来维持这个家族. 如果我想要一些高级监控, 我可以定义它, 但我不需要任何条件就能获得这种血统. 这是梦想.”

创建沿袭图, 蒙特卡罗 automatically parses the SQL out of the query logs in BigQuery, 并通过连接器扩展到Looker和Github. 沿袭会随着数据的发展而自动更新.

图片由蒙特卡罗提供.

Daniel最喜欢的用例是一个名为“管理订单”的表, 其中包含所有关于订单的信息. “这是我最喜欢的桌子, 我可以看到很多关于它行为的信息, 就像提供给它的其他表和视图一样.“她能理解过去发生了什么变化. She can also see the downstream views that will be affected by changes to Admin Orders, 以及Looker仪表盘最终连接到她的表格上的东西, so she can collaborate with her analysts and engineers to understand the impact of any changes. 

“因为那些景观是他们建的——不是我,”丹尼尔说. “We can collaborate together and make sure we prevent data disasters before we make them.” 

结果:在管道的每个阶段都信任数据

居民的数据小组发现了一个 减少90% 自实施蒙特卡罗以来,在数据问题. 随着高管和利益相关者对数据的信任恢复, 居民 plans to build more and more data products and further enrich their data. Daniel’s team is integral to the company’s future: ”There isn’t any process in the company that happens without us. And people trust and believe our data because it’s reliable and it’s good. Even if it’s complicated, we have the tools to monitor it in real-time and make it more reliable.”

数据可观察性对居民的影响

Among other benefits of data observability, 蒙特卡罗 has positively impacted 居民 by:

  • 手工流程自动化 so the data team can focus time and efforts on innovation and improvements, not fire-fighting
  • 缩短出现数据问题时的检测时间 通过自动和自定义警报
  • 防止下游刀具磨损 通过自动化的传承和改进的协作
  • 恢复对数据的信任,实现更好的决策 by ensuring executives have the accurate information they need to drive the business forward

“Before 蒙特卡罗, I was always on the watch and scared that I was missing something,”丹尼尔说. “现在我无法想象没有它的工作. 一年前的事故只有现在的10%. 推荐一个正规滚球网站的团队非常可靠,大家都依靠推荐一个正规滚球网站. I think every data engineer has to have this level of monitoring in order to do this work in an efficient and good way.”

Interested in learning how data observability can help you restore trust in your data and reclaim valuable engineering time? 接触 将知更鸟 剩下的 蒙特卡罗团队! 

特别感谢丹尼尔和住院医师!