Data Lake Insight

What's DLI

Ease of Use

DLI lets you explore terabytes of data in your data lake in seconds using standard SQL, with zero O&M burden.

One-stop Analysis

Fully compatible with Apache Spark, Flink, and openLooKeng; stream & batch processing and interactive analysis in one place.

Scalable Resources

On-demand, shared access to pooled resources, flexible scaling based on preset priorities.

Cross-source Connection

DLI datasource connections give you easy cross-source data access for collaborative analysis, with no need for data migration.

Functions

Full SQL Compatibility

Run big data analysis with plain SQL statements. DLI is fully compatible with ANSI SQL 2003 syntax.

Serverless Spark/Flink/openLooKeng

Seamlessly migrate your offline applications to the cloud with serverless technology. DLI is fully compatible with Apache Spark, Apache Flink, and Presto ecosystems and APIs.

Cross-source Analysis

Analyze your data across databases. No migration required. A unified view gives you a comprehensive understanding of your data and helps you innovate faster. There are no restrictions on data formats, cloud data sources, or whether the database was created online or offline.

Enterprise Multi-tenant

Manage compute or resource related permissions by project or by user. Enjoy fine-grained control that makes it easy to maintain data independence for separate tasks.

What's Right for You?

Compact Resources

64 compute units
SQL queues
Analysis of less than 1 TB of data in data lakes/warehouses; AI analysis; predictable costs; elastic resource scaling without O&M

Enterprise applications; limited commercial testing

From $3.65 USD/hour

Expanded Resources

256 compute units
SQL queues
Analysis of less than 10 TB of data in data lakes/warehouses where a large number of JOIN tasks and shuffles are involved

High-performance computing; cloud-native data lake

From $14.59 USD/hour
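
As a rough illustration of how the starting prices above translate into a bill, the sketch below multiplies a plan's hourly rate by the number of hours a queue is kept. The compute-unit counts and rates are the ones quoted on this page. The names `PLANS` and `estimate_cost`, and the assumption of a flat hourly rate with no discounts or data-scan charges, are illustrative only and do not describe DLI's actual billing logic.

```python
# Starting hourly prices quoted on this page (USD per hour).
# Treating them as flat rates is a simplifying assumption.
PLANS = {
    "compact": {"compute_units": 64, "usd_per_hour": 3.65},
    "expanded": {"compute_units": 256, "usd_per_hour": 14.59},
}


def estimate_cost(plan: str, hours: float) -> float:
    """Return a rough cost in USD for keeping a queue of `plan` for `hours` hours."""
    if plan not in PLANS:
        raise KeyError(f"unknown plan: {plan}")
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return round(PLANS[plan]["usd_per_hour"] * hours, 2)


if __name__ == "__main__":
    # Compare both plans for a hypothetical 8-hour working day.
    for name in PLANS:
        print(name, estimate_cost(name, 8))
```

Real charges depend on the billing mode and region, so treat the result as a lower-bound estimate based on the "from" prices.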

Application Scenarios

Database Analysis

Application data stored in relational databases needs analysis to derive more value. For example, big data from registration details helps with commercial decision-making.

Pain Points

Complicated queries are not supported for larger relational databases.
Comprehensive analysis is not possible because databases and table partitions are spread across multiple relational databases.
Business data analysis might overload available resources.

Advantages

SQL experience transferability

Hit the ground running with new services. DLI supports standard ANSI SQL 2003 relational database syntax so there is almost no learning curve.
Versatile, robust performance

Distributed in-memory computing models effortlessly handle complicated queries, cross-partition analysis, and business intelligence processing.

Related Services

CDM

Precision Marketing

Associative analysis combines information from multiple channels to improve conversion rates.

Advantages

Cross-source analysis

Advertisement CTR data stored in OBS and user registration data stored in RDS can be queried directly, without being migrated to DLI.
Only SQL needed

Interconnected data sources are mapped to tables created using just SQL statements, so they can be queried together.
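
The kind of associative query described here can be sketched as a plain SQL join. DLI's datasource-connection syntax is not shown on this page, so the example uses Python's built-in `sqlite3` module as a local stand-in: one in-memory table plays the role of the CTR data in OBS, the other the registration data in RDS. All table names, column names, and rows are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create two in-memory tables standing in for the OBS and RDS sources."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ad_clicks (user_id INTEGER, channel TEXT)")
    conn.execute("CREATE TABLE registrations (user_id INTEGER, plan TEXT)")
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO ad_clicks VALUES (?, ?)",
        [(1, "search"), (2, "search"), (3, "social"), (3, "social")],
    )
    conn.executemany(
        "INSERT INTO registrations VALUES (?, ?)",
        [(1, "basic"), (3, "premium")],
    )
    return conn


def conversion_by_channel(conn: sqlite3.Connection):
    """Join clicks with registrations and count distinct users per channel."""
    query = """
        SELECT c.channel,
               COUNT(DISTINCT c.user_id) AS clicked_users,
               COUNT(DISTINCT r.user_id) AS registered_users
        FROM ad_clicks AS c
        LEFT JOIN registrations AS r ON r.user_id = c.user_id
        GROUP BY c.channel
        ORDER BY c.channel
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in conversion_by_channel(build_demo_db()):
        print(row)
```

The `LEFT JOIN` keeps channels whose users never registered, which is what a conversion-rate comparison needs.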

Related Services

OBS

Log Analysis

Gaming companies need a quality data analysis platform to improve ad placement, new player retention, operations, and feedback for future game iterations.

Pain Points

Log analysis usually runs periodically, so resources sit idle and are wasted between tasks.

Advantages

Pay per use & auto scaling

Release idle resources with flexible scaling policies and save half the costs of exclusive clusters.
Converged analysis

Just a single copy of metadata works for real-time cleaning and offline ETL processing. The data processing result can be directly used in interactive analysis for data mining.

Related Services

MySQL

Permission Control

When multiple departments need to manage resources independently, fine-grained permissions management improves data security and operations efficiency.

Advantages

Easier permissions assignment

Grant permissions by column or by specific operation, such as INSERT INTO/OVERWRITE, and set metadata to read-only.
Unified management

A single IAM account handles permissions for all staff users.

Library Integration

Genome analysis relies on third-party analysis libraries, which are built on the Spark distributed framework.

Pain Points

High technical skills are required to install analysis libraries such as ADAM and Hail.
Every time you create a cluster, you have to install these analysis libraries again.

Advantages

Custom images

Instead of installing libraries in a technically demanding process, package them into custom images uploaded directly to the Software Repository for Container (SWR). When using DLI to create a cluster, custom images in SWR are automatically pulled so you don't have to reinstall these libraries.
Built-in base images

Huawei-enhanced Spark and Flink images (multiple versions) and open-source AI images (TensorFlow/Keras/PyTorch) are available for your convenience.

Related Services

SWR

Real-time Risk Control

Almost every aspect of financial services requires comprehensive risk management and mitigation.

Pain Points

There is very little tolerance for excessive latency when it comes to risk control.

Advantages

High throughput

Real-time data analysis in DLI with the help of an Apache Flink dataflow model keeps latency low. A single CPU processes 1,000 to 20,000 messages per second.
Ecosystem coverage

Save real-time data streams to multiple cloud services such as CloudTable and SMN for comprehensive application.
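
For capacity planning, the per-CPU throughput range quoted above (1,000 to 20,000 messages per second) can be turned into a rough CPU count. The sketch below assumes throughput scales linearly with the number of CPUs, which this page does not state; the function name `cpus_needed` is hypothetical.

```python
from typing import Tuple

# Per-CPU throughput range quoted on this page (messages per second).
MIN_MSGS_PER_CPU = 1_000
MAX_MSGS_PER_CPU = 20_000


def _ceil_div(a: int, b: int) -> int:
    """Integer ceiling division without floating-point rounding."""
    return -(-a // b)


def cpus_needed(target_msgs_per_sec: int) -> Tuple[int, int]:
    """Return (best_case, worst_case) CPU counts for a target message rate.

    Assumes linear scaling across CPUs, which is a simplification.
    """
    if target_msgs_per_sec <= 0:
        raise ValueError("target rate must be positive")
    best = _ceil_div(target_msgs_per_sec, MAX_MSGS_PER_CPU)
    worst = _ceil_div(target_msgs_per_sec, MIN_MSGS_PER_CPU)
    return best, worst


if __name__ == "__main__":
    # A hypothetical stream of 50,000 messages per second.
    print(cpus_needed(50_000))
```

The wide gap between the two numbers reflects the wide range in the quoted figure; actual throughput depends on message size and job logic.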

Related Services

SMN

DIS

Real-time Displays

With COVID-19 raging across the globe, governments need to be able to monitor key data the moment things happen.

Pain Points

Government employees do not necessarily have a background in big data. SQL is usually far more familiar.

Advantages

Millisecond responsiveness

Thanks to its powerful in-memory computing framework, the built-in openLooKeng engine optimizes query performance for interactive analysis on the spot.
Mainstream compatibility

DLI queries use SQL syntax, so staff don't need a big data background. This familiar syntax is fully compatible with standard ANSI SQL 2003.

Related Services

MySQL

DLV

CDM

Big Data Analysis

Massive data volumes include petabytes of satellite images of many types: structured remote sensing raster data, vector data, and unstructured spatial location data. Analyzing and mining all this data requires efficient tools.

Advantages

Spatial data analysis

Spark algorithm operators in DLI enable real-time stream processing and offline batch processing. They support a wide range of data types, including structured remote sensing image data, unstructured 3D modeling, and laser point cloud data.
CEP SQL functionality

SQL statements are all that is needed for yaw detection and geo-fencing.
Heavy data processing

Quickly migrate up to exabytes of remote sensing images to the cloud, then slice them to data sources for distributed batch processing.

Related Services

DIS

CDM

DES

DLI vs Self-built Hadoop

Item	Data Lake Insight	Self-built Hadoop System
Cost	Billed by the actual volume of data scanned or by the compute unit-hours (CUH) used, which reduces costs.	Billed by resources occupied. Long-term occupation is expensive and wasteful.
Elastic scalability	Intelligent scaling with container-based Kubernetes	N/A
O&M and availability	Out-of-the-box, serverless architecture with cross-AZ DR	Strong technical capabilities are required for configuration and O&M.
Learning cost	The optimization parameters are standardized based on 10 years' experience in thousands of projects. In addition, DLI provides a GUI for intelligent optimization.	Hundreds of tuning parameters need to be learned.
Supported data sources	Cloud: OBS/RDS/DWS/CSS/MongoDB/Redis; On-premises: self-built database/MongoDB/Redis	Cloud: OBS; On-premises: HDFS
Ecosystem compatibility	Data Lake Visualization (DLV), Tableau, Yonghong BI, and Fanruan BI	Big data ecosystem tools
Custom image	Supported. Dependencies can be added as required to meet diverse service requirements.	N/A
Workflow scheduling	Scheduling through Data Lake Factory (DLF) in DataArts Studio	Self-built scheduling tools, such as Airflow
Enterprise tenant permissions	Table-based with column-level granularity	File-based
Performance	Higher thanks to optimized software and hardware	Matches Hadoop open-source versions

New Features

January 2020: Flink streaming jobs
March 2020: Spark access to DLI metadata
April 2020: IAM fine-grained authorization
April 2020: Flink 1.10
May 2020: Data scanning package
May 2020: Global variables configuration
June 2020: Dual-AZ compute queues
June 2020: Spark job developer mode
August 2020: Package billed by storage
October 2020: Custom images
February 2021: Multiple data versions
February 2021: Flink 1.11
September 2021: Flink OpenSource SQL syntax for ClickHouse and user-defined tables
December 2021: API deprecation

Success Stories

Mengxiang.com

Mengxiang uses Huawei Cloud DLI and DataArts Studio to analyze real-time behavioral data and match products with customers.

This fast-growing company faced challenges with service stability during traffic peaks and promotions. The DLI+DataArts Studio solution provides Mengxiang with an elastic architecture and a high-performance data lake for integrated batch and stream processing.

DIANCHU Technology

DIANCHU used Huawei Cloud DLI and intelligent data lake DGC to establish a data analysis platform for games. The platform analyzes revenue, player retention rate, and payment rate in real time for activity planning, precise marketing, and decision-making.

Dragonest

Dragonest works with Huawei Cloud to query and analyze gaming data. The analysis is useful for different departments launching new services. Data applications are integrated, benefiting the entire organization.

Documentation

Getting Started

Getting Mastered

Developers