Data & AI Report – Data Science Trends July 2024

August 5, 2024

Trends in data science have brought a fresh wave of excitement to the data and analytics landscape this July. We’re seeing major moves towards scalability, efficient governance, and AI capabilities. Additionally, Dr. Randy Olson shows us just how far creative data use can take you, literally! It turns out data science isn’t just about numbers; it can plan one epic road trip too!

Firstly, Discord’s transition to Dagster and dbt for data orchestration

This month, Discord announced a major overhaul of their data orchestration infrastructure, moving from their in-house system, Derived, to a combination of Dagster and dbt. As their platform and user base expanded, the need for enhanced self-service capabilities and robust observability became evident. This decision was driven by the necessity for declarative automation, a modern unified interface, reliability on Kubernetes, and seamless integration with existing tools. 

After evaluating open-source options like Argo and Prefect, Discord chose Dagster for orchestration and dbt for data modeling. This transition has enabled them to support over 2,000 dbt tables, enhancing their ability to deliver seamless service and insightful data analytics while scaling efficiently. 

Read about it here

Meta unveils Llama 3.1 

This month, Meta introduced Llama 3.1, a massive leap in open-source AI. The Llama 3.1 405B model brings unmatched flexibility and state-of-the-art capabilities, unlocking new workflows like synthetic data generation and model distillation. Additionally, Meta is enhancing the Llama ecosystem with new security tools and a reference system. Over 25 partners, including AWS and Google Cloud, will offer services from day one. 

 

Llama 3.1 models feature expanded context lengths to 128K, multilingual support, and strong performance across benchmarks. Upgraded 8B and 70B models enhance capabilities in general knowledge, tool use, and translation. 

Read Meta’s full update

Building a data-driven analytics team at DoorDash 

Jessica Lachs, DoorDash’s VP of Analytics & Data Science, shares insights on what it means to be truly data-driven and how to structure an analytics team. Having joined DoorDash as the first General Manager in 2014, Lachs has built the analytics team from the ground up and now leads global analytics, including the Wolt Analytics team post-acquisition. 

Lachs highlights that the term “analytics” can be ambiguous, encompassing data science, business intelligence, product analytics, machine learning, and BizOps. She also emphasises that, to build a data-driven organisation, founders should focus on desired outcomes rather than semantics. At DoorDash, the role of analytics has evolved with the company’s growth, shifting from gut-instinct decisions to data-centric strategies. Initially, DoorDash used quasi-experimental methods due to limited data, but as the company matured, it invested in scalable data models and advanced experimentation capabilities, expanding its analytics scope to drive better decision-making.

Read the full post here

Databricks’ migration to Unity Catalog for data governance

In a recent blog post, the Data Platform team at Databricks shared insights into their migration to Unity Catalog for enhanced data governance. As the company grows, establishing secure, compliant, and cost-effective data operations has become increasingly important. With thousands of employees analysing data, consistent governance standards are essential, which made the migration to Unity Catalog a top priority.

The blog outlines the challenges and benefits of migrating from the default Hive Metastore (HMS) to Unity Catalog (UC). While HMS lacked fine-grained access controls, lineage support, audit logs, and effective search integration, UC provides these features out of the box. The team chose a transformational approach, selectively migrating datasets to establish a structured governance framework. This strategy required more effort initially, but enabled clear data ownership, naming conventions, and intentional access, setting the stage for future governance policies.

Read the blog

Finally, some creative data use!

Dr. Randy Olson, a full-stack data scientist and AI researcher, used his expertise in machine learning to plan an optimal road trip across the U.S.

He framed the task as a Traveling Salesman Problem (TSP), which asks for the shortest route that visits each stop exactly once and returns to the starting point.
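To make the TSP idea concrete, here is a minimal sketch in Python. It is not Dr. Olson’s code or method: the five stops and their coordinates are made up, distances are straight lines rather than driving distances, and the two solvers (an exhaustive search and a simple nearest-neighbour heuristic) are just common textbook approaches chosen for illustration.

```python
import math
from itertools import permutations

# Hypothetical stops as (x, y) points on a flat plane. A real road trip
# would use actual landmarks and driving distances instead.
STOPS = {
    "A": (0.0, 0.0),
    "B": (3.0, 0.0),
    "C": (3.0, 4.0),
    "D": (0.0, 4.0),
    "E": (1.5, 6.0),
}


def tour_length(order, stops):
    """Length of the closed tour: visit stops in order, then return to the start."""
    return sum(
        math.dist(stops[order[i]], stops[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def brute_force_tsp(stops):
    """Exact answer by trying every ordering; only feasible for a handful of stops."""
    names = sorted(stops)
    start, rest = names[0], names[1:]
    best = min(
        ((start,) + perm for perm in permutations(rest)),
        key=lambda order: tour_length(order, stops),
    )
    return list(best), tour_length(best, stops)


def nearest_neighbour_tour(stops, start):
    """Greedy heuristic: always travel to the closest stop not yet visited."""
    unvisited = set(stops) - {start}
    order = [start]
    while unvisited:
        here = stops[order[-1]]
        # sorted() keeps the choice deterministic if two stops are equally close
        nxt = min(sorted(unvisited), key=lambda name: math.dist(here, stops[name]))
        order.append(nxt)
        unvisited.remove(nxt)
    return order, tour_length(order, stops)


if __name__ == "__main__":
    print(brute_force_tsp(STOPS))
    print(nearest_neighbour_tour(STOPS, "A"))
```

The exhaustive search stops being practical very quickly, because the number of possible orderings grows factorially with the number of stops. That is why routes with dozens of stops are usually tackled with heuristics that find a good route rather than a provably shortest one.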

Dr. Olson applied three specific restrictions. The trip must:

  1. Stop in all 48 contiguous U.S. states
  2. Visit only National Natural Landmarks, National Historic Sites, National Parks, or National Monuments
  3. Be taken entirely by car without leaving the U.S.

Want to take the trip? The route spans 13,699 miles and requires 224 hours (or 9.33 days) of driving, assuming no traffic. You can find the full itinerary here. 
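As a quick sanity check on those figures, the short sketch below redoes the arithmetic. The miles and hours come from the article; the eight hours of driving per day is purely our own assumption, used to show what the trip might look like for a driver who also wants to sleep.

```python
import math

MILES = 13_699        # route length quoted above
DRIVING_HOURS = 224   # driving time quoted above, assuming no traffic

# Non-stop driving time expressed in days
driving_days = DRIVING_HOURS / 24

# Average speed implied by the two quoted figures
average_mph = MILES / DRIVING_HOURS

# Hypothetical pace: 8 hours behind the wheel per day (our assumption)
HOURS_PER_DAY = 8
calendar_days = math.ceil(DRIVING_HOURS / HOURS_PER_DAY)

print(f"{driving_days:.2f} days of non-stop driving")
print(f"{average_mph:.1f} mph average speed")
print(f"{calendar_days} days at {HOURS_PER_DAY} hours of driving per day")
```

So the 9.33 days is pure driving time. At a more relaxed pace, the same route takes roughly four weeks.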

Dr. Randy Olson used data to design the optimum road trip across the U.S., showing just how useful data can be. Data science trends really are everywhere!

Olson’s epic road trip

To conclude 

July highlighted several key trends in data and analytics. The push for scalability is evident in Discord’s adoption of Dagster and dbt, and Databricks’ migration to Unity Catalog for better data governance. The importance of building effective data teams was underscored by DoorDash’s approach to analytics leadership. Another notable trend is the growing emphasis on enhanced self-service capabilities and robust observability in data platforms. These themes point towards a future focused on scalable infrastructure, efficient governance, structured teams, and innovative AI applications.

If you’re interested in how we can help scale your data team, get in touch.

International Girls in ICT Day – Breaking Barriers

April 25, 2024

Today, technology drives almost all innovations that shape our world. Yet, behind the scenes lies a harsh reality – in the Netherlands, only 16% of technical roles are filled by women (Teamrockstars). This International Girls in ICT Day, we’re shining a light on the challenges faced and the opportunities for progress within the Dutch tech space.

International Girls in ICT Day was set up by the United Nations’ specialised agency, the International Telecommunication Union (ITU), to encourage girls and young women to pursue STEM education and careers.

Infographic showing some of the statistics mentioned above.

The Numbers

  • 0.8% of all investments since 2008 went to startups with only female founders. (Techleap)
  • Only 15% of students taking technical courses in the Netherlands are women. (Teamrockstars)
  • Start-up companies run by women typically obtain approximately 20% less investment than those run by men. (OECD)
  • 3% of women say a career in technology is their first choice. (PwC)
  • 78% of students can’t name a famous woman working in technology. (PwC)

The Impact of Homogeneity 

A lack of diversity doesn’t just harm a company’s image; it can also be detrimental to its output.

With around 84% of the Dutch tech workforce being male, it is likely that a large majority of them come from similar backgrounds and have similar experiences and perspectives. This can lead to a narrow approach to innovation or problem solving.

Of course, it’s not as black and white as this, and the Dutch tech scene is incredibly diverse in other areas. But there is still work to be done…

We’ve got to start somewhere! 

Alarmingly, without intervention, the share of women in tech roles in Europe is projected to decline by 2027 (McKinsey).

There are lots of organisations and communities aiming to change that. Here are a handful:

S#E (pronounced She Sharp)  

An Amsterdam-based non-profit foundation that makes it easier for all women and non-binary people to enter, stay, and grow in the tech industry.

They have a large community on Slack, run events, and offer funding for people within the community to develop through training & courses.  

Women in Tech  

A global charity, with a Netherlands-based chapter, supporting women and girls in ICT & STEM and closing the digital divide. They donate computers & laptops to schools, establish learning centres, and provide digital literacy courses for girls in places like India, South Africa and Brazil. 

Tech She Can  

A European charity working together with industry, government, and schools to improve the ratio of women in technology roles. It educates, equips and inspires young people, especially girls, to study technology subjects and choose a career in technology.

How can I help? 

As an individual, the most important thing you can do is advocate, but here are a few other ways you or your company could help:  

  • Volunteer to become a mentor 
  • Join or support the organisations, donate to their causes and help fund training to encourage more young women to pursue a career in tech. 
  • Work with your company to set hiring targets, and communicate those to your chosen recruitment partner. 
  • Set up links with communities that work with women in ICT, so that women can find new roles. 
  • Look for opportunities to speak and inspire others to get into a career in tech. 

Hiring for your team? Get in touch with us to discuss your requirements, and gain access to our diverse talent pool. 
