Primary language: Scala. License: Apache License 2.0 (Apache-2.0).

Linkis

English | 中文

Introduction

Linkis makes it easy to connect to various back-end computation and storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.

As a computation middleware, Linkis connects to computation and storage engines (Spark, Hive, Python and HBase), exposes REST and WebSocket interfaces, and executes multi-language jobs (SQL, PySpark, HiveQL and Scala).
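To make the middleware idea concrete, here is a minimal sketch of routing a job to an engine by its script type. It is not the Linkis API: `ENGINE_BY_RUN_TYPE`, `Job` and `route` are hypothetical names, and the mapping from script type to engine is an assumption chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical mapping from a job's script type to a back-end engine.
# These keys are illustrative and do not mirror Linkis configuration.
ENGINE_BY_RUN_TYPE = {
    "sql": "spark",
    "pyspark": "spark",
    "scala": "spark",
    "hql": "hive",
    "python": "python",
}


@dataclass(frozen=True)
class Job:
    user: str      # tenant submitting the job
    run_type: str  # script type, e.g. "sql" or "hql"
    code: str      # script text to execute


def route(job: Job) -> str:
    """Return the name of the engine that should execute the job."""
    try:
        return ENGINE_BY_RUN_TYPE[job.run_type.lower()]
    except KeyError:
        raise ValueError(f"unsupported run type: {job.run_type!r}") from None
```

A caller submits every job through the same entry point, for example `route(Job("alice", "hql", "show tables"))`, and never needs to know which engine sits behind it. That indirection is what lets a middleware add or swap engines without changing clients.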

Built on a microservices architecture, Linkis provides enterprise-level multi-tenant isolation, resource management and access control. It also offers convenient unified management of variables, UDFs, functions and resource files, along with full task/job lifecycle management designed for high-concurrency, high-performance and high-availability scenarios.

[Figures: linkis-intro-01, linkis-intro-03]

Building on the computation middleware architecture of Linkis, we have developed a large number of applications and systems on top of it.

  • Currently available open-source project: Scriptis - Data Development IDE Tool.

  • Upcoming open-source projects: Data Visualization Tool, Graphic Workflow Tool and Data Quality Tool.

More tools will be released as open-source projects. Please stay tuned!

Features

  • Unified Job Execution Services: A distributed REST/WebSocket service that processes script execution requests from users.

    Available computation engines so far: Spark, Python, TiSpark, Hive and Shell.

    Available languages so far: SparkSQL, Spark Scala, PySpark, R, Python, HQL and Shell.

  • Resource Management Services: Controls and limits resource usage in real time, by both amount and load, for both systems and users. Dynamic charts of resource statistics make it convenient to monitor and manage that usage.

    Available resource types so far: Yarn queue resources, server resources (CPU and memory), and number of concurrent instances per user.

  • Application Management Services: Manages all user applications, including offline batch, interactive query and real-time streaming applications. Offline and interactive applications are highly reusable, and complete lifecycle management automatically releases idle applications for users.

  • Unified Storage Services: A generic IO architecture that can quickly integrate with various storage systems and provides a unified entry point for invoking them. It also supports most common data formats and is easy to use.

  • Unified Context Services: Unifies the resource files of users and systems (JAR, ZIP, Properties). Arguments and variables for users, systems and engines are managed in one place, so a modification made anywhere is automatically reflected everywhere else.

  • Material Library: System-level and user-level material management, with support for sharing and transferring materials and for automatic lifecycle management.

  • Metadata Services: Real-time display of dataset table structure and partitions.
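The per-user limits described under Resource Management Services can be pictured as an admission check made before an engine instance starts. The sketch below is an illustration under stated assumptions, not Linkis code: `Quota`, `Usage`, `ResourceManager`, `try_acquire` and `release` are hypothetical names, and the model covers only server CPU, server memory and concurrent instances per user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    max_cpu_cores: int
    max_memory_gb: int
    max_instances: int  # concurrent engine instances allowed per user


@dataclass
class Usage:
    cpu_cores: int = 0
    memory_gb: int = 0
    instances: int = 0


class ResourceManager:
    """Hypothetical per-user admission control (not the Linkis implementation)."""

    def __init__(self, quotas):
        self._quotas = dict(quotas)  # user -> Quota
        self._usage = {}             # user -> Usage

    def try_acquire(self, user, cpu_cores, memory_gb):
        """Reserve resources for one new instance; return False if over quota."""
        quota = self._quotas[user]
        used = self._usage.setdefault(user, Usage())
        fits = (
            used.instances + 1 <= quota.max_instances
            and used.cpu_cores + cpu_cores <= quota.max_cpu_cores
            and used.memory_gb + memory_gb <= quota.max_memory_gb
        )
        if fits:
            used.instances += 1
            used.cpu_cores += cpu_cores
            used.memory_gb += memory_gb
        return fits

    def release(self, user, cpu_cores, memory_gb):
        """Give back the resources held by one finished or idle instance."""
        used = self._usage[user]
        used.instances -= 1
        used.cpu_cores -= cpu_cores
        used.memory_gb -= memory_gb
```

With `ResourceManager({"alice": Quota(8, 16, 2)})`, two requests of 4 cores and 8 GB each are admitted and a third is refused until one of the first two is released. Checking the quota and recording the reservation in a single step keeps the bookkeeping consistent; a real service would also need locking or a transactional store to stay correct under concurrent requests.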

Compared with similar systems

[Figure: comparison of Linkis with similar systems (image: introduction01)]

Documentation:

Linkis, make big data easier

Linkis Quick Deploy

Linkis Quick Start & Java SDK documentation

HTTP APIs for frontend applications

WebSocket APIs for frontend applications

How to adapt Linkis with a new computation or storage engine


Architecture:

[Figure: Linkis architecture diagram (image: introduction02)]


Communication

For a quick response, please raise an issue, or scan the QR code below with WeChat or QQ to join our group:
[QR code image: introduction05]

License

Linkis is under the Apache 2.0 license. See the LICENSE file for details.