June was a busy month for us in the tech world. The Snowflake Summit 2024 and Databricks Data + AI Summit 2024 kicked off only a few short weeks apart. Data-Sleek had the pleasure of attending both of these events along with a community of 300,000+ data teams and other attendees around the globe. Both summits provided many unique perspectives and lots of great ideas. But one thing was the star of the show: generative AI and what it means for the future of business analytics and data management.

What is Databricks?

Databricks is a San Francisco-based data, analytics, and AI company founded by the creators of Apache Spark. Its core offering includes a cloud-based solution allowing data management and consulting firms (like Data-Sleek) to integrate data warehouse and data lake capabilities seamlessly. This helps organizations like yours efficiently manage and analyze massive amounts of data for better decision making.

Databricks is not one of the largest big data platforms. It holds an estimated 8% of a market led by giants like Snowflake, AWS, and Microsoft Azure. Don’t be fooled, though: Databricks is also one of the fastest-growing companies in the sphere. The announcements at its recent Data + AI Summit may have secured its foothold. This is a company to keep on your radar.

Is Databricks or one of their competitors right for your data management needs? Schedule a free consultation with us today to find out.

Key Takeaways from the Databricks Data + AI Summit 2024

1. New Innovations in the Databricks Data Intelligence Platform

Ali Ghodsi, Databricks CEO, introduced several innovations to enhance the use of governed datasets on the Databricks Data Intelligence Platform. Some of the key announcements included:

Acquisition of Tabular: This aims to eliminate the need for users to choose between different table formats and to ensure seamless integration between Delta Lake and Iceberg. This is a big step towards seamless data operations. We believe the Databricks Data Intelligence Platform will reduce many integration complexities and enhance data accessibility even further.

Open-Sourcing Unity Catalog: Extending openness to metadata and governance, providing users more flexibility and control without vendor lock-in. 

2. New Unity Catalog Enhancements

CTO Matei Zaharia announced that Databricks is open-sourcing Unity Catalog, which includes:

Open Connectivity: Lakehouse Federation will connect external data sources to Unity Catalog, including Apache Hive and AWS Glue, expanding the scope of data-driven insights.

Unified Governance: Introduction of Lakehouse Monitoring, Attribute-based Access Control (ABAC), and Unity Catalog Metrics for standardized, governed business metrics.

Open Accessibility: Flexibility to access data and AI resources from any tool, compute engine, or platform using open standards and interfaces. 

3. Databricks Goes 100% Serverless

Ghodsi revealed that Databricks will go completely serverless in the coming months. This is a powerful move in the market that aims to take significant market share. For legacy Databricks users, the cluster is now considered deprecated. Anyone using the legacy system is encouraged to use the serverless option to keep up with new rollouts and updates.

4. Upgrades to the Mosaic AI Model Training and Tools

Databricks Engineer Patrick Wendell introduced several upgrades his team has been working on regarding Mosaic AI. Some of these highly anticipated upgrades include:

Model Training + Machine Learning: Allowing customization for general-purpose models targeting enterprise use cases. Over 200,000 AI models were trained last year as part of the project.

No-Code Fine-Tuning: This upgrade will allow users to fine-tune models easily, serving them with one click to Databricks’ serving infrastructure.

AI Tools Catalog: Fully integrated with Unity Catalog, facilitating the sharing of functions across organizations.

AI Agent Framework and Evaluation: This includes an SDK and libraries for serving agents and RAG applications as real-time endpoints. This upgrade was bound to happen at some point, and the already heavy demand for real-time analytics makes it even more appropriate.

5. New AI Gateway for Governance and Tracking

Mosaic AI Gateway was introduced to manage governance, permissions, guardrails, and runtime tracking. These features ensure secure and efficient AI operations within the Databricks platform. 

If you’ve been following any tech news in the last few months, much of the discussion has revolved around sensitive data and security with AI. Databricks is addressing concerns with centralized solutions to govern data and tracking frameworks to protect against some risks associated with AI deployment. 

6. The Elephant In the Room: Data Science Meets AI

Reynold Xin, Databricks computer scientist and engineer, presented the updated data warehousing capabilities. As mentioned earlier, this includes the Lakehouse concept. Some of the key highlights of his talk included:

AI-Powered Performance Enhancements: “Predictive I/O 2.0” to enable faster data processing and query performance.

Databricks AI/BI: A compound AI system providing access to a built-in Genie chatbot, enabling natural language data queries and learning from user feedback.

7. Partnerships with NVIDIA

Databricks and NVIDIA announced the integration of NVIDIA’s accelerated computing with Databricks Photon, enhancing performance and cost efficiency for Databricks SQL. The open-source model DBRX is now available as an NVIDIA microservice hosted on the NVIDIA API catalog. Snowflake also announced a partnership with NVIDIA at its summit earlier in the month.

Planning Your Data Management Strategy Based on the Databricks Data + AI Summit

Based on the Databricks Data + AI Summit, here’s what we recommend:

1. Evaluate all the data integration opportunities in your business.

There are so many options in the market that can integrate all of those disparate data sources you’re (painstakingly) dealing with. Look at your data infrastructure and see if there are any areas where you can do better in the efficiency and scalability department. 

2. Plan on empowering your AI capabilities, but not just yet.

Although we’re very excited about the possibility of AI in enterprise data, it’s still too early to jump in unless you’ve considered the costs. Security is still a major concern, and the models need further refinement before their impact can be measured reliably. That doesn’t mean throwing the idea out the window. It means keeping AI on your radar, because there will be a time in the very near future when it will be critical to compete.

3. Enhance data governance and security. 

Again, data security is still a big deal, with or without AI. Let this be a reminder to review your data infrastructure and identify any areas where your organization might be vulnerable. The market is changing rapidly, and it’s important to increase security audits. 

Does Your Data Have Your Back Up Against A Wall? Data-Sleek Can Help

Crafting a data management strategy isn’t easy, nor is it a one-time thing. The business intelligence and data landscapes are changing faster than at any other point in history. Sometimes, the best thing to do is to call on a professional who can create a process tailored to your company. 

When it comes to data, our data engineers and data scientists have seen it all. We offer end-to-end data management consulting, support, and implementation. Whether you need help solving a problem, achieving a business goal, or choosing the right platform for your business, we can help.

Your data management strategy is critical to the success of your business. Don’t go the road alone. Talk with one of our experts today, and let us streamline your data.
