@Evolving public interface Fileset extends Auditable
An interface representing a fileset under a schema Namespace. A fileset is a virtual
concept of a file or directory that is managed by Gravitino. Users can create a fileset object
to manage non-tabular data on FS-like storage. A typical use case is managing the
training data for AI workloads. The major difference compared to a relational table is that a
fileset is schema-free: its main property is the storage location of the
underlying data.
Fileset defines the basic properties of a fileset object. A catalog implementation
with FilesetCatalog should implement this interface.
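As a rough illustration of what implementing this interface involves, the sketch below uses a local stand-in, `FilesetSketch`, that mirrors the five methods listed on this page, plus a hypothetical immutable implementation, `SimpleFileset`. Neither name belongs to Gravitino. The enum constants `MANAGED` and `EXTERNAL` and the default return values (`null` for `comment()`, an empty map for `properties()`) are assumptions, and the audit information inherited from `Auditable` is left out for brevity.

```java
import java.util.Collections;
import java.util.Map;

/** Local stand-in for the documented interface; NOT the real Gravitino type. */
interface FilesetSketch {
  /** Assumed constants standing in for Fileset.Type. */
  enum Type { MANAGED, EXTERNAL }

  String name();

  /** Documented as @Nullable; returning null by default is an assumption. */
  default String comment() {
    return null;
  }

  Type type();

  String storageLocation();

  /** Returning an empty map by default is an assumption. */
  default Map<String, String> properties() {
    return Collections.emptyMap();
  }
}

/** Hypothetical minimal implementation: a name, a type, and a storage location. */
final class SimpleFileset implements FilesetSketch {
  private final String name;
  private final FilesetSketch.Type type;
  private final String storageLocation;

  SimpleFileset(String name, FilesetSketch.Type type, String storageLocation) {
    this.name = name;
    this.type = type;
    this.storageLocation = storageLocation;
  }

  @Override
  public String name() {
    return name;
  }

  @Override
  public FilesetSketch.Type type() {
    return type;
  }

  @Override
  public String storageLocation() {
    return storageLocation;
  }
}
```

Only `name()`, `type()` and `storageLocation()` need to be overridden here, which reflects the schema-free nature of a fileset: there is no column or schema definition to supply.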
Modifier and Type | Interface and Description |
---|---|
static class | Fileset.Type An enum representing the type of the fileset object. |


Modifier and Type | Method and Description |
---|---|
default java.lang.String | comment() |
java.lang.String | name() |
default java.util.Map<java.lang.String,java.lang.String> | properties() |
java.lang.String | storageLocation() Get the storage location of the file or directory path that is managed by this fileset object. |
Fileset.Type | type() |
java.lang.String name()
@Nullable default java.lang.String comment()
Fileset.Type type()
java.lang.String storageLocation()
The returned storageLocation is either the one specified when creating the fileset object, or the one specified at the catalog / schema level if the fileset object is created under that catalog / schema.
For a managed fileset, the storageLocation can be:
1) The one specified when creating the fileset object.
2) When the catalog property "location" is specified but the schema property "location" is not, the storageLocation will be "{catalog location}/schemaName/filesetName".
3) When the catalog property "location" is not specified but the schema property "location" is, the storageLocation will be "{schema location}/filesetName".
4) When both the catalog property "location" and the schema property "location" are specified, the storageLocation will be "{schema location}/filesetName".
5) When neither the catalog property "location" nor the schema property "location" is specified, and the storageLocation given when creating the fileset object is null, the configuration is illegal.
For an external fileset, the storageLocation can be:
1) The one specified when creating the fileset object.
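The resolution rules above can be sketched as a single function. This is an illustrative reading of the rules, not Gravitino's actual implementation: `StorageLocationSketch`, its `resolve` method, and the nested `Type` enum are hypothetical names, the constants `MANAGED` and `EXTERNAL` are assumed, and the sketch also assumes that an explicitly specified location takes priority over catalog and schema defaults and that an external fileset without one is rejected.

```java
import java.util.Map;

/** Hypothetical helper mirroring the documented rules; not part of the Gravitino API. */
final class StorageLocationSketch {

  /** Assumed stand-in for Fileset.Type. */
  enum Type { MANAGED, EXTERNAL }

  static String resolve(
      Type type,
      String explicitLocation,
      Map<String, String> catalogProps,
      Map<String, String> schemaProps,
      String schemaName,
      String filesetName) {
    // Rule 1 (managed and external): a location given at creation time is used as-is.
    if (explicitLocation != null) {
      return explicitLocation;
    }
    // Assumption: an external fileset has no fallback, so a missing location is an error.
    if (type == Type.EXTERNAL) {
      throw new IllegalArgumentException("external fileset requires a storage location");
    }
    String schemaLocation = schemaProps.get("location");
    String catalogLocation = catalogProps.get("location");
    // Rules 3 and 4: the schema-level location wins whenever it is set.
    if (schemaLocation != null) {
      return schemaLocation + "/" + filesetName;
    }
    // Rule 2: only the catalog-level location is set.
    if (catalogLocation != null) {
      return catalogLocation + "/" + schemaName + "/" + filesetName;
    }
    // Rule 5: nothing is specified anywhere, which is illegal.
    throw new IllegalArgumentException("no storage location specified at any level");
  }
}
```

For example, with a catalog `location` of `s3://bucket/catalog`, no schema `location`, schema name `sales` and fileset name `images`, the sketch yields `s3://bucket/catalog/sales/images`. It concatenates with a plain `/` and does not normalize trailing slashes.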
default java.util.Map<java.lang.String,java.lang.String> properties()