Modern Data Stack Conference 2021

The second and final day of the Modern Data Stack Conference just wrapped up. If you missed it, check out our biggest takeaways from the event!

Also, be sure to check out our takeaways from Day 1!

1. You can't convince someone with reason. Apply psychology and appeal to their emotions.

Jonathan Haidt, author of The Righteous Mind, walked conference attendees through the research behind moral psychology. He showed how most people's moral judgments depend on how much weight they give to six values:

Care
Fairness
Liberty
Loyalty
Authority
Sanctity

If you want to convince someone of something, you need to identify how they weigh these different concerns, then acknowledge and appeal to those concerns. Otherwise, the findings that you present may go in one ear and out the other.

Want to know how you or your team value each of these different aspects? You can find out by taking this quiz.

2. Want to avoid data bias? Add in peer review.

As part of the same session, Jon made some great points about how "p-hacking" is becoming more common, with people seeking out ways to make data prove their point rather than presenting the data as is. This bias is all too common, but it can be drastically reduced by adding peer review to the process. Use GitHub reviews on SQL code to verify how the data is being pulled. Have others read the insights before they are presented. When people know that their work will be evaluated by others, they are far less likely to introduce bias. And if they do, even unconsciously, it still has a chance to be caught and corrected.

We thought it was a great reminder that analytic insight shouldn't happen in a silo or fall on one individual. Ensuring that analytics is performed as a team can actually result in more accurate insights overall.
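To make the p-hacking risk concrete, here is a minimal sketch of the arithmetic behind it. It is our own illustration, not something shown in the session, and it assumes a conventional 0.05 significance threshold and fully independent tests. The function name is hypothetical.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Chance that at least one of `num_tests` independent tests run on
    pure noise comes out 'significant' at threshold `alpha`.

    Assumptions (ours): tests are independent and there is no real effect.
    """
    return 1 - (1 - alpha) ** num_tests


# The more ways you slice the data, the likelier a spurious "finding" becomes.
for n in (1, 5, 20):
    print(n, round(chance_of_false_positive(n), 2))
```

Under those assumptions, one test gives a spurious result about 5% of the time, five tests about 23%, and twenty tests about 64%. A reviewer who asks "how many cuts of the data did you try?" is checking for exactly this.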

3. Break your data team out of the service trap by focusing on insights.

Emilie Schario (formerly of Netlify) had an infectious energy while teaching data teams how they could break out of the service trap. Many data teams fall into a vicious cycle where they're constantly fending off requests from other teams, delivering low-impact products with no time left for new or transformative work.

The easiest takeaway from this presentation? When building out a data team, follow the slide below from left to right. Focus on the plumbing first to make sure data lives in all the applications that business users work with. Only once that's in place can you move to the next step. You can't avoid service requests entirely, but by delaying them and delivering higher-value work first, the demand for small requests will start to dwindle.

The problem, as Emilie sees it, is that few people trust the data team because it's constantly producing low-impact work after saying "yes" to every service request that comes through the door. If you're already in the service trap, she recommends having your team block off 2-3 hours every week just to find a unique insight about the business. This could be as simple as going the extra step on a report to explore something that piqued your curiosity.

Once you find those insights, WORK LOUDLY by sharing them with the larger organization so they can see the cool, in-depth work being done. By doing this consistently, you'll build up a catalog of insights that serve as "appetizers" for the deeper work that your data team can take on.

4. ELT is still the future of data analytics.

While this may not be new to many folks working with a Modern Data Stack, ELT (extract, load, transform), as opposed to ETL (extract, transform, load), is the future of how teams will operate with their data.

The major emphasis in this shift is that your warehouse should be used to store raw data first. Don't worry about formatting it correctly - just make sure it's available for the data teams to work with.

Once the data exists in your warehouse, you can transform it again and again until you get it right. With ELT, analysts no longer have to worry about getting data right the first time. They don't need to spend 15 weeks breaking down how the data should be loaded in to correctly answer a question. This shift in operations makes it easier to flexibly change how the data teams build tables and define metrics to drive value with the available data.
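As a rough illustration of that workflow, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a warehouse. The table names, columns, and sample rows are all hypothetical. The raw data is loaded once, untouched, and the transformation is rebuilt from it when the metric definition changes.

```python
import sqlite3

# Hypothetical raw export from an external service, loaded as-is (all text).
RAW_ORDERS = [
    ("1001", "2021-09-01", "25.00", "paid"),
    ("1002", "2021-09-01", "40.00", "refunded"),
    ("1003", "2021-09-02", "15.50", "paid"),
]


def load_raw(conn):
    """The 'L' in ELT: store the data without cleaning or reshaping it."""
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id TEXT, order_date TEXT, amount TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", RAW_ORDERS)


def transform(conn, include_refunds):
    """The 'T' in ELT: rebuild the reporting table from the raw table."""
    where = "" if include_refunds else "WHERE status != 'refunded'"
    conn.execute("DROP TABLE IF EXISTS daily_revenue")
    conn.execute(
        "CREATE TABLE daily_revenue AS "
        "SELECT order_date, SUM(CAST(amount AS REAL)) AS revenue "
        f"FROM raw_orders {where} GROUP BY order_date"
    )
    return conn.execute(
        "SELECT order_date, revenue FROM daily_revenue ORDER BY order_date"
    ).fetchall()


conn = sqlite3.connect(":memory:")
load_raw(conn)

# First attempt at the metric: every order counts as revenue.
first_attempt = transform(conn, include_refunds=True)

# The definition changes: refunds should not count. Re-run the transform only.
second_attempt = transform(conn, include_refunds=False)

raw_row_count = conn.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0]
print(first_attempt)
print(second_attempt)
print(raw_row_count)
```

Because the raw table is never modified, changing the revenue definition only means re-running the transform, with no need to reload anything from the source system.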

5. When faced with build vs. buy... buy first, build later.

While no one explicitly stated this, the theme was prevalent across multiple sessions that we attended. If you need to make an impact with data quickly, you need to focus your efforts on things that matter.

Need to load data from an external service? Buy it.

Need to send data warehouse data to your other tools? Buy it.

Need to orchestrate one-off scripts? Buy it.

Need to make embedded dashboards? Buy it.

The thought process is that buying technology upfront can help reduce the amount of engineering overhead that it takes to get your data to a usable state. The end goal with data is to drive business impact with it. Executives don't really care how you accomplish that.

Everyone who walked through their data stack showed an evolution over time as the needs of the organization grew. Nobody gets it right the first time. It's OK to switch things around down the road. You don't know what you don't know until you're there.

In fact, some organizations even recommended "buying" a consultant if you need to get up to speed quickly. They've seen more data setups and more unique situations than your team probably has, so they'll be more effective at ensuring your organization's data infrastructure is set up properly.

Only once your organization grows and matures can you clearly see the unique aspects of your data situation that warrant building and managing proprietary solutions.

That marks the end of the Modern Data Stack Conference. Thanks to the Fivetran team for hosting such a great virtual event this year. We can't wait until the next one!

About Shipyard:
Shipyard is a modern data orchestration platform for data engineers to easily connect tools, automate workflows, and build a solid data infrastructure from day one.

Shipyard offers low-code templates that are configured using a visual interface, replacing the need to write code to build data workflows while enabling data engineers to get their work into production faster. If a solution can’t be built with existing templates, engineers can always automate scripts in the language of their choice to bring any internal or external process into their workflows.

The Shipyard team has built data products for some of the largest brands in business and deeply understands the problems that come with scale. Observability and alerting are built into the Shipyard platform, ensuring that breakages are identified before being discovered downstream by business teams.

With a high level of concurrency and end-to-end encryption, Shipyard enables data teams to accomplish more without relying on other teams or worrying about infrastructure challenges, while also ensuring that business teams trust the data made available to them.

For more information, visit www.shipyardapp.com or get started with our free Developer Plan.