Fluid incubation proposal #1337

Closed

RongGu
Contributor

@RongGu RongGu commented May 19, 2024

On behalf of the Fluid Steering Committee, we propose to move the Fluid project to CNCF Incubation stage.

Fluid is an open source Kubernetes-native Distributed Dataset Orchestrator and Accelerator for data-intensive applications, such as big data and AI applications. Fluid can convert distributed caching systems (such as Alluxio and JuiceFS) into observable caching services with self-management, elastic scaling, and self-healing capabilities, and it does so by supporting dataset operations. At the same time, using the location information of cached data, Fluid can provide data-affinity scheduling for applications that use datasets.

In summary, to address Kubernetes' lack of awareness of, and optimization for, application data, Fluid puts forward a series of innovative methods, such as co-orchestration, intelligent awareness, and joint optimization, to form an efficient supporting platform for data-intensive applications in cloud native environments.

Key Features of Fluid

  1. Application-oriented Dataset unified abstraction: A Dataset not only consolidates data from multiple storage sources, but also describes the data's portability and features, and provides observability, such as the total data volume of the Dataset, the current cache space size, and the cache hit rate. Users can evaluate whether a cache system needs to be scaled up or down according to this information.

  2. Lightweight but highly extensible Runtime plugins: A Dataset is an abstract concept, and the data operations are implemented by a Runtime. Different storage systems have different Runtime interfaces. Fluid's Runtimes fall into two categories: CacheRuntime, which accelerates data access (for example, AlluxioRuntime for S3 and HDFS, and JuiceFSRuntime for JuiceFS), and ThinRuntime, which provides a unified access interface to facilitate access to third-party storage.

  3. Automated data operations: Fluid provides data prefetch, migration, backup, and other operations via CRDs, and supports various trigger modes such as one-time, scheduled, and event-driven, so that users can integrate them into automated operation and maintenance systems.

  4. Data elasticity and scheduling: By combining distributed data caching technology with autoscaling, portability, observability, and affinity scheduling capabilities, Fluid improves data access performance through observable, elastically scalable caches and data-affinity scheduling.

  5. Runtime platform agnostic: Fluid supports diverse environments, such as native, edge, and Serverless Kubernetes clusters and Kubernetes multi-cluster setups, including those running on cloud platforms. It can run the storage client in different modes, choosing between the CSI Plugin and a sidecar according to the differences between environments.
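
As an illustration of feature 1, here is a minimal sketch of how a user might turn the Dataset's observability metrics into a scaling decision. It is not Fluid's API or its autoscaling logic: the `DatasetCacheStatus` type, the `scaling_advice` function, and both thresholds are hypothetical names and values chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class DatasetCacheStatus:
    """Hypothetical snapshot of the metrics a Dataset reports (not a Fluid type)."""
    total_bytes: int           # total data volume of the Dataset
    cache_capacity_bytes: int  # current cache space size
    cache_hit_rate: float      # fraction of reads served from cache, 0.0 to 1.0


def scaling_advice(status, low_hit_rate=0.6, high_hit_rate=0.95):
    """Return 'scale_up', 'scale_down', or 'hold'.

    The thresholds are illustrative assumptions, not values recommended by Fluid.
    """
    if status.total_bytes <= 0:
        return "hold"  # nothing to cache, so there is no basis for a decision

    # Ratio of cache space to data volume; below 1.0 the data cannot all fit.
    coverage = status.cache_capacity_bytes / status.total_bytes

    # The cache is smaller than the data and misses are frequent.
    if coverage < 1.0 and status.cache_hit_rate < low_hit_rate:
        return "scale_up"

    # The cache exceeds the data volume and nearly every read is a hit.
    if coverage > 1.0 and status.cache_hit_rate >= high_hit_rate:
        return "scale_down"

    return "hold"
```

For example, `scaling_advice(DatasetCacheStatus(100, 40, 0.3))` suggests scaling up, because the cache covers less than half of the data and most reads miss.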

@TheFoxAtWork
Contributor

#1317

@angellk
Contributor

angellk commented Jun 27, 2024

@RongGu please finish filling out the information in #1317 and then close this issue - #1317 is the correct template. Thank you!

@RongGu
Contributor Author

RongGu commented Jun 30, 2024

> @RongGu please finish filling out the information in #1317 and then close this issue - #1317 is the correct template. Thank you!

OK, got it. We will work on #1317 and then close this PR. Thank you, @angellk!

@angellk
Contributor

angellk commented Oct 28, 2024

thank you for working on #1317 - closing this one to de-dupe

@angellk angellk closed this Oct 28, 2024