Volcano Engine's Multi-Modal Data Lake Enhances Autonomous Driving Efficiency

As assisted driving technology develops rapidly, data has become central to algorithm iteration and scenario optimization. The data flywheel concept emphasizes an efficient loop of data collection, processing, and application. In practice, however, the explosion of connected-vehicle data, the heterogeneous processing that multi-modal assisted driving data requires, and the need for multi-team collaboration have created efficiency bottlenecks. Volcano Engine aims to improve data flow efficiency, reduce storage costs, and accelerate algorithm training through its full-modal data lake capability base.

On July 22, 2025, Zhang Weiliang, Senior Manager of Data Product Solutions at Volcano Engine, stated at the 8th Intelligent Assisted Driving Conference: 'The data flywheel is crucial in the field of assisted driving, but it faces the challenges of heavy engineering collaboration and extreme data processing efficiency. Volcano Engine's full-modal data lake capability base, through open-source compatibility and AI-native design, achieves efficient data flow and unlocks much of the data's potential value, allowing data to truly become an asset rather than a hidden liability.'

The presentation examined, from a data perspective, several key variables in the development of intelligent assisted driving and connected vehicles. The first is the explosion of data from intelligent connected vehicles. As data collection schemes have evolved, collection frequencies have risen into a 1Hz era, with some signals reaching 100Hz. The result is weaker schema constraints and rapidly growing data volumes, which pose severe challenges for cloud-based big data architectures. The second is the mass production of intelligent assisted driving, which had become an industry consensus by 2025 and has progressed at a pace that surprised the public.

Intelligent assisted driving inherently involves multi-modal data processing, with data volumes reaching hundreds of petabytes, which places high demands on the underlying processing engines. Zhang noted that while the data flywheel drives the development of assisted driving, it comes with pain points such as low R&D efficiency, chaotic versioning, and compliance risks. In discussions with clients, for instance, Volcano Engine found that when algorithm teams requested supplementary samples, the median data response delay reached T+3 days, because the underlying data was loosely organized and lacked service capabilities.

The full-modal data lake capability base is designed to be open and pluggable, with the aim of avoiding vendor lock-in at a fundamental level. The design integrates six key dimensions. Among those described: mainstream big data components are pre-integrated and continually iterated, compatibility with open-source ecosystems is kept at 100%, web-based tools significantly reduce the management burden, and a managed architecture with elastic scaling optimizes costs.

In one project with an automaker, Volcano Engine raised GPU utilization by decoupling data processing from the training clusters, which significantly improved efficiency and shortened training cycles.

Looking ahead, Zhang emphasized the need to strengthen the performance of multi-modal data lakes for intelligent driving and connected vehicles, so that data becomes a quantifiable asset rather than a liability, while also addressing challenges such as cold data storage costs and the timeliness of compliance responses.
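The article does not describe how data processing was decoupled from the training clusters. As a rough illustration of the general idea only, the following Python sketch uses a bounded queue between a preprocessing stage and a training stage, so that the consumer only ever handles ready-made batches. Every name here (`preprocess`, `train`, `run_pipeline`, the clip format) is hypothetical, and two threads stand in for what would be separate clusters; this is not Volcano Engine's implementation.

```python
import queue
import threading

SENTINEL = None  # marks the end of the batch stream


def preprocess(raw_clips, out_queue):
    """Hypothetical CPU-side stage: turn raw clips into training batches."""
    for clip in raw_clips:
        # Stand-in for decoding, cleaning, and feature extraction.
        batch = {"clip_id": clip["id"], "frames": [f * 2 for f in clip["frames"]]}
        out_queue.put(batch)  # blocks while the buffer is full (backpressure)
    out_queue.put(SENTINEL)


def train(in_queue):
    """Hypothetical training stage: consumes prepared batches only."""
    seen = []
    while True:
        batch = in_queue.get()
        if batch is SENTINEL:
            break
        seen.append(batch["clip_id"])  # stand-in for a training step
    return seen


def run_pipeline(raw_clips, buffer_size=4):
    """Run both stages concurrently, linked only by a bounded buffer."""
    buffer = queue.Queue(maxsize=buffer_size)
    producer = threading.Thread(target=preprocess, args=(raw_clips, buffer))
    producer.start()
    consumed = train(buffer)
    producer.join()
    return consumed


if __name__ == "__main__":
    clips = [{"id": i, "frames": [i, i + 1]} for i in range(10)]
    print(run_pipeline(clips, buffer_size=2))
```

Because the two stages share only the buffer, each can be scaled independently. In a real deployment the buffer would more likely be a message queue or a shared storage layer than an in-process queue.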

