What's DLI

  • Ease of Use

    DLI lets you explore terabytes of data in your data lake in seconds using standard SQL, with zero O&M burden.

  • One-stop Analysis

    Fully compatible with Apache Spark, Flink, and openLooKeng; stream & batch processing and interactive analysis in one place.

  • Scalable Resources

    On-demand, shared access to pooled resources, flexible scaling based on preset priorities.

  • Cross-source Connection

    Easy cross-source data access for collaborative analysis with DLI datasource connections, no need for data migration.

Functions

  • Full SQL Compatibility

    Run big data analysis with plain SQL statements. DLI is fully compatible with ANSI SQL 2003 syntax.

  • Serverless Spark/Flink/openLooKeng

    Seamlessly migrate your offline applications to the cloud with serverless technology. DLI is fully compatible with Apache Spark, Apache Flink, and Presto ecosystems and APIs.

  • Cross-source Analysis

    Analyze your data across databases. No migration required. A unified view of your data gives you a comprehensive understanding of your data and helps you innovate faster. There are no restrictions on data formats, cloud data sources, or whether the database is created online or off.

  • Enterprise Multi-tenant

    Manage compute or resource related permissions by project or by user. Enjoy fine-grained control that makes it easy to maintain data independence for separate tasks.

Application Scenarios

Database Analysis

Application data stored in relational databases needs analysis to derive more value. For example, big data from registration details helps with commercial decision-making.

Pain Points

  • Complicated queries are not supported for larger relational databases.

  • Comprehensive analysis is not possible because database and table partitions are spread across multiple relational databases. Business data analysis might overload available resources.

Advantages

  • SQL experience transferability

    Hit the ground running with new services. DLI supports standard ANSI SQL 2003 relational database syntax so there is almost no learning curve.

  • Versatile, robust performance

    Distributed in-memory computing models effortlessly handle complicated queries, cross-partition analysis, and business intelligence processing.

Precision Marketing

Associative analysis combines information from multiple channels to improve conversion rates.

Advantages

  • Cross-source analysis

    Advertisement CTR data stored in OBS and user registration data stored in RDS can be queried directly, without migrating either to DLI.

  • Only SQL needed

    Interconnected data sources are mapped to a table that you create using just SQL statements.
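
The idea of querying two sources in place can be sketched locally. The example below is only an analogy, not DLI's API: it uses Python's built-in sqlite3 module and two separate in-memory databases as stand-ins for click data in OBS and registration data in RDS. The table names, column names, and sample rows are all hypothetical.

```python
import sqlite3


def conversion_by_channel():
    """Join click data and registration data that live in two separate databases."""
    # The main database stands in for ad click data stored in OBS (assumption).
    conn = sqlite3.connect(":memory:")
    # A second, separate in-memory database stands in for RDS (assumption).
    conn.execute("ATTACH DATABASE ':memory:' AS rds")

    conn.execute("CREATE TABLE ad_clicks (user_id INTEGER, channel TEXT)")
    conn.execute("CREATE TABLE rds.registrations (user_id INTEGER PRIMARY KEY)")

    # Hypothetical sample rows.
    conn.executemany(
        "INSERT INTO ad_clicks VALUES (?, ?)",
        [(1, "search"), (2, "search"), (3, "social"), (4, "social")],
    )
    conn.executemany(
        "INSERT INTO rds.registrations VALUES (?)", [(1,), (3,), (4,)]
    )

    # One SQL statement spans both databases; no rows are copied between them.
    rows = conn.execute(
        """
        SELECT c.channel,
               COUNT(*)         AS clicks,
               COUNT(r.user_id) AS registrations
        FROM ad_clicks AS c
        LEFT JOIN rds.registrations AS r ON r.user_id = c.user_id
        GROUP BY c.channel
        ORDER BY c.channel
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for channel, clicks, registrations in conversion_by_channel():
        print(channel, clicks, registrations)
```

In this toy data, each channel's registrations can be compared with its clicks in a single query, which is the kind of associative analysis the scenario describes.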

Log Analysis

Gaming companies need a quality data analysis platform to improve ad placement, new player retention, operations, and feedback for future game iterations.

Pain Points

  • Log analysis is usually performed periodically. During the idle periods between tasks, resources are wasted.

Advantages

  • Pay per use & auto scaling

    Release idle resources with flexible scaling policies and save half the costs of exclusive clusters.

  • Converged analysis

    Just a single copy of metadata works for real-time cleaning and offline ETL processing. The data processing result can be directly used in interactive analysis for data mining.
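
The cost argument for periodic workloads comes down to simple arithmetic. The sketch below compares paying only for busy hours with paying for an always-on cluster. The unit price, cluster size, and busy hours are hypothetical, and it assumes both options share the same price per compute-unit hour, which may not hold for real pricing.

```python
def monthly_costs(compute_units, busy_hours_per_day, price_per_unit_hour, days=30):
    """Return (pay_per_use_cost, always_on_cost) for one month.

    All inputs are illustrative assumptions, not published prices.
    """
    if not 0 <= busy_hours_per_day <= 24:
        raise ValueError("busy_hours_per_day must be between 0 and 24")
    # Pay per use: resources are released while no task is running.
    pay_per_use = compute_units * busy_hours_per_day * days * price_per_unit_hour
    # Always on: resources stay allocated around the clock.
    always_on = compute_units * 24 * days * price_per_unit_hour
    return pay_per_use, always_on


if __name__ == "__main__":
    used, reserved = monthly_costs(
        compute_units=16, busy_hours_per_day=12, price_per_unit_hour=0.5
    )
    print(f"pay per use: {used:.2f}, always on: {reserved:.2f}")
```

Under these assumptions, a job that keeps resources busy for 12 hours a day costs half as much as an always-on cluster of the same size. The saving shrinks as the busy fraction approaches 24 hours.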

Permission Control

When multiple departments need to manage resources independently, fine-grained permissions management improves data security and operations efficiency.

Advantages

  • Easier permissions assignment

    Grant permissions by column or by specific operation, such as INSERT INTO/OVERWRITE, and set metadata to read-only.

  • Unified management

    A single IAM account handles permissions for all staff users.
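
Column-level, per-operation grants can be pictured as a lookup from (user, table, operation) to a set of permitted columns. The following is a minimal sketch of that idea only. The class, method names, and operation labels are hypothetical and do not reflect how DLI or IAM store permissions.

```python
class ColumnPermissions:
    """Toy registry of per-operation, per-column grants (illustrative only)."""

    def __init__(self):
        # (user, table, operation) -> set of column names the user may touch
        self._grants = {}

    def grant(self, user, table, operation, columns):
        """Allow `user` to run `operation` on the given columns of `table`."""
        self._grants.setdefault((user, table, operation), set()).update(columns)

    def is_allowed(self, user, table, operation, columns):
        """True only if every requested column is covered by a matching grant."""
        allowed = self._grants.get((user, table, operation), set())
        return bool(columns) and set(columns) <= allowed


if __name__ == "__main__":
    perms = ColumnPermissions()
    perms.grant("analyst", "orders", "SELECT", ["order_id", "amount"])
    print(perms.is_allowed("analyst", "orders", "SELECT", ["amount"]))
    print(perms.is_allowed("analyst", "orders", "SELECT", ["amount", "phone"]))
    print(perms.is_allowed("analyst", "orders", "INSERT_INTO", ["amount"]))
```

Here the analyst can read the two granted columns, but a query touching an ungranted column or an ungranted operation is rejected.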

Library Integration

Genome analysis relies on third-party analysis libraries, which are built on the Spark distributed framework.

Pain Points

  • High technical skills are required to install analysis libraries such as ADAM and Hail.

  • Every time you create a cluster, you have to install these analysis libraries again.

Advantages

  • Custom images

    Instead of installing libraries in a technically demanding process, package them into custom images and upload the images directly to the Software Repository for Container (SWR). When you create a cluster in DLI, the custom images in SWR are pulled automatically, so you don't have to reinstall these libraries.

  • Built-in base images

    Huawei-enhanced Spark and Flink images (multiple versions) and open-source AI images (TensorFlow/Keras/PyTorch) are available for your convenience.

Real-time Risk Control

Almost every aspect of financial services requires comprehensive risk management and mitigation.

Pain Points

  • There is very little tolerance for excessive latency when it comes to risk control.

Advantages

  • High throughput

    Real-time data analysis in DLI with the help of an Apache Flink dataflow model keeps latency low. A single CPU processes 1,000 to 20,000 messages per second.

  • Ecosystem coverage

    Save real-time data streams to multiple cloud services such as CloudTable and SMN for comprehensive application.
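
The per-CPU throughput range quoted above can be turned into a rough sizing estimate. This sketch assumes throughput scales linearly with the number of CPUs, which real workloads may not do. The target rate is a hypothetical example.

```python
def cpus_needed(target_msgs_per_sec, msgs_per_cpu_per_sec):
    """Smallest whole number of CPUs that covers the target rate (linear scaling assumed)."""
    if target_msgs_per_sec < 0 or msgs_per_cpu_per_sec <= 0:
        raise ValueError("rates must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (target_msgs_per_sec + msgs_per_cpu_per_sec - 1) // msgs_per_cpu_per_sec


def cpu_range(target_msgs_per_sec, low_rate=1_000, high_rate=20_000):
    """Return (best_case, worst_case) CPU counts for the quoted per-CPU range."""
    return (
        cpus_needed(target_msgs_per_sec, high_rate),
        cpus_needed(target_msgs_per_sec, low_rate),
    )


if __name__ == "__main__":
    best, worst = cpu_range(100_000)  # hypothetical target of 100,000 messages/s
    print(f"between {best} and {worst} CPUs")
```

The wide gap between the best and worst case is a reminder that per-message cost depends heavily on the job, so a real estimate should be based on measurements of the actual workload.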

Real-time Displays

With COVID-19 raging across the globe, governments need to be able to monitor key data the moment things happen.

Pain Points

  • Government employees do not necessarily have a background in big data. SQL is usually far more familiar.

Advantages

  • Millisecond responsiveness

    Thanks to its powerful in-memory computing framework, the built-in openLooKeng engine optimizes query performance for interactive analysis on the spot.

  • Mainstream compatibility

    DLI queries use SQL syntax, so staff don't need a big data background. This familiar syntax is fully compatible with standard ANSI SQL 2003.

Big Data Analysis

The data volumes are massive, including petabytes of satellite images, and span many types: structured remote sensing raster data, vector data, and unstructured spatial location data. Analyzing and mining all this data requires efficient tools.

Advantages

  • Spatial data analysis

    Spark algorithm operators in DLI enable real-time stream processing and offline batch processing. They support a wide range of data types, including structured remote sensing image data, unstructured 3D modeling data, and laser point cloud data.

  • CEP SQL functionality

    SQL statements are all that is needed for yaw detection and geo-fencing.

  • Heavy data processing

    Quickly migrate up to exabytes of remote sensing images to the cloud, then slice them to data sources for distributed batch processing.
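
To make the geo-fencing use case concrete, here is a minimal Python sketch of the underlying logic: detecting when a tracked object crosses a circular fence. It is not DLI's CEP SQL syntax. The function names, the circular fence shape, and the sample track are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def fence_events(track, center, radius_km):
    """Emit (timestamp, 'enter' | 'exit') each time the track crosses the fence.

    `track` is an iterable of (timestamp, lat, lon) tuples ordered by time.
    """
    events = []
    inside = None  # unknown until the first position arrives
    for ts, lat, lon in track:
        now_inside = haversine_km(lat, lon, center[0], center[1]) <= radius_km
        if inside is not None and now_inside != inside:
            events.append((ts, "enter" if now_inside else "exit"))
        inside = now_inside
    return events


if __name__ == "__main__":
    # Hypothetical track near the equator; fence radius is 10 km around (0, 0).
    track = [(1, 0.0, 0.0), (2, 0.0, 0.05), (3, 0.0, 0.2), (4, 0.0, 0.01)]
    print(fence_events(track, center=(0.0, 0.0), radius_km=10.0))
```

A streaming SQL engine would express the same pattern declaratively as a match on consecutive events whose inside/outside state differs.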

DLI vs Self-built Hadoop

Cost

  • Data Lake Insight

    Billed by the actual volume of data scanned or used compute unit per hour (CUH). Costs are saved.

  • Self-built Hadoop System

    Billed by resources occupied. Long-term occupation is expensive and wasteful.

Elastic scalability

  • Data Lake Insight

    Intelligent with container-based Kubernetes

  • Self-built Hadoop System

    N/A

O&M and availability

  • Data Lake Insight

    Out-of-the-box, serverless architecture, and cross-AZ DR

  • Self-built Hadoop System

    Strong technical capabilities are required for configuration and O&M

Learning cost

  • Data Lake Insight

    The optimization parameters are standardized based on 10 years' experience in thousands of projects. In addition, DLI provides a GUI for intelligent optimization.

  • Self-built Hadoop System

    Hundreds of tuning parameters need to be learned.

Supported data sources

  • Data Lake Insight

    Cloud: OBS/RDS/DWS/CSS/MongoDB/Redis; On-premises: self-built database/MongoDB/Redis

  • Self-built Hadoop System

    Cloud: OBS; On-premises: HDFS

Ecosystem compatibility

  • Data Lake Insight

    Data Lake Visualization (DLV), Tableau, Yonghong BI, and Fanruan BI

  • Self-built Hadoop System

    Big data ecosystem tool

Custom image

  • Data Lake Insight

    Supported. Dependencies can be added as required to meet service diversity requirements.

  • Self-built Hadoop System

    N/A

Workflow scheduling

  • Data Lake Insight

    Scheduling through Data Lake Factory (DLF) in DataArts Studio

  • Self-built Hadoop System

    Self-built scheduling tools, such as Airflow

Enterprise tenant permissions

  • Data Lake Insight

    Table-based with column-level granularity

  • Self-built Hadoop System

    File-based

Performance

  • Data Lake Insight

    Higher thanks to optimized software and hardware

  • Self-built Hadoop System

    Matches Hadoop open-source versions

Success Stories

Mengxiang.com

Mengxiang uses Huawei Cloud DLI and DataArts Studio to analyze real-time behavioral data and match products with customers.

This fast grower faced challenges with service stability during traffic peaks and promotions. The DLI+DataArts Studio solution provides Mengxiang with an elastic architecture and high-performance data lake for integrated batch and stream processing.

DIANCHU Technology

DIANCHU used Huawei Cloud DLI and intelligent data lake DGC to establish a data analysis platform for games. The platform analyzes revenue, player retention rate, and payment rate in real time for activity planning, precision marketing, and decision-making.

Dragonest

Dragonest works with Huawei Cloud to query and analyze gaming data. The analysis is useful for different departments launching new services. Data applications are integrated, benefiting the entire organization.
