Version: 0.3.0

Manage metadata using Gravitino

This page describes how to manage metadata with Gravitino. Through Gravitino, you can create, edit, and delete metadata such as metalakes, catalogs, schemas, and tables. The sections below cover metalake, catalog, schema, and table operations in turn.

This document uses the Apache Hive catalog as an example to show how to manage metadata with Gravitino. Other catalogs work similarly but can differ, especially in catalog properties, table properties, and column types. For more details, please refer to the related doc.

The examples assume that Gravitino has just started and is reachable at http://localhost:8090.

Metalake operations

Create a metalake

You can create a metalake by sending a POST request to the /api/metalakes endpoint or just use the Gravitino Java client. The following is an example of creating a metalake:

curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{"name":"metalake","comment":"comment","properties":{}}' \
http://localhost:8090/api/metalakes
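
When the metalake name or comment varies, the request body can be generated instead of typed by hand. The following is a minimal sketch, not part of Gravitino: the function name metalake_payload is hypothetical, and it assumes the values contain no double quotes or backslashes because it does no JSON escaping.

```shell
#!/bin/sh
# metalake_payload NAME COMMENT
# Prints the JSON body used above to create a metalake with empty properties.
# Assumption: NAME and COMMENT contain no double quotes or backslashes,
# because this sketch performs no JSON escaping.
metalake_payload() {
  printf '{"name":"%s","comment":"%s","properties":{}}' "$1" "$2"
}

# Example call (requires a running Gravitino server, so it is left commented out):
# curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
#   -H "Content-Type: application/json" \
#   -d "$(metalake_payload metalake comment)" \
#   http://localhost:8090/api/metalakes
```

Calling metalake_payload metalake comment prints the same body as the curl example above.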

Load a metalake

You can load a metalake by sending a GET request to the /api/metalakes/{metalake_name} endpoint or just use the Gravitino Java client. The following is an example of loading a metalake:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" http://localhost:8090/api/metalakes/metalake

Alter a metalake

You can modify a metalake by sending a PUT request to the /api/metalakes/{metalake_name} endpoint or just use the Gravitino Java client. The following is an example of altering a metalake:

curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
"updates": [
{
"@type": "rename",
"newName": "new_metalake"
},
{
"@type": "setProperty",
"property": "key2",
"value": "value2"
}
]
}' http://localhost:8090/api/metalakes/metalake

Currently, Gravitino supports the following changes to a metalake:

Supported modification | JSON | Java
Rename metalake | {"@type":"rename","newName":"metalake_renamed"} | MetalakeChange.rename("metalake_renamed")
Update comment | {"@type":"updateComment","newComment":"new_comment"} | MetalakeChange.updateComment("new_comment")
Set a property | {"@type":"setProperty","property":"key1","value":"value1"} | MetalakeChange.setProperty("key1", "value1")
Remove a property | {"@type":"removeProperty","property":"key1"} | MetalakeChange.removeProperty("key1")
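
Several changes can be sent in one PUT request by placing them in the "updates" array, as the alter example shows. The following sketch assembles that body from individual change objects. The function name updates_body is hypothetical, and each argument is assumed to be a complete, valid JSON change object taken from the JSON column above.

```shell
#!/bin/sh
# updates_body CHANGE_JSON...
# Wraps the given JSON change objects in {"updates":[...]}.
# Each argument must already be a valid JSON object; nothing is validated here.
# Note: POSIX sh has no local variables, so body, sep and change are global.
updates_body() {
  body='{"updates":['
  sep=''
  for change in "$@"; do
    body="${body}${sep}${change}"
    sep=','
  done
  printf '%s]}' "$body"
}

# Example call (requires a running Gravitino server, so it is left commented out):
# curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
#   -H "Content-Type: application/json" \
#   -d "$(updates_body '{"@type":"updateComment","newComment":"new_comment"}' \
#                      '{"@type":"removeProperty","property":"key1"}')" \
#   http://localhost:8090/api/metalakes/metalake
```

The same wrapper applies to catalog, schema, and table changes, since their alter requests use the same "updates" array.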

Drop a metalake

You can remove a metalake by sending a DELETE request to the /api/metalakes/{metalake_name} endpoint or just use the Gravitino Java client. The following is an example of dropping a metalake:

curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" http://localhost:8090/api/metalakes/metalake

Note: Dropping a metalake only removes the metadata about the metalake and the catalogs, schemas, and tables under it in Gravitino. It doesn't remove the real schema and table data in Apache Hive.

List all metalakes

You can list metalakes by sending a GET request to the /api/metalakes endpoint or just use the Gravitino Java client. The following is an example of listing all metalakes:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" http://localhost:8090/api/metalakes

Catalog operations

Create a catalog

Tip: Users should create a metalake before creating a catalog.

The code below is an example of creating a Hive catalog. For other catalogs, the code is similar, but the catalog type, provider, and properties may be different. For more details, please refer to the related doc.

You can create a catalog by sending a POST request to the /api/metalakes/{metalake_name}/catalogs endpoint or just use the Gravitino Java client. The following is an example of creating a catalog:

curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
"name": "catalog",
"type": "RELATIONAL",
"comment": "comment",
"provider": "hive",
"properties": {
"metastore.uris": "thrift://localhost:9083"
}
}' http://localhost:8090/api/metalakes/metalake/catalogs

Currently, Gravitino supports the following catalog providers:

Catalog provider | Catalog property
hive | Hive catalog property
lakehouse-iceberg | Iceberg catalog property
jdbc-mysql | MySQL catalog property
jdbc-postgresql | PostgreSQL catalog property

Load a catalog

You can load a catalog by sending a GET request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name} endpoint or just use the Gravitino Java client. The following is an example of loading a catalog:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" http://localhost:8090/api/metalakes/metalake/catalogs/catalog

Alter a catalog

You can modify a catalog by sending a PUT request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name} endpoint or just use the Gravitino Java client. The following is an example of altering a catalog:

curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
"updates": [
{
"@type": "rename",
"newName": "alter_catalog"
},
{
"@type": "setProperty",
"property": "key3",
"value": "value3"
}
]
}' http://localhost:8090/api/metalakes/metalake/catalogs/catalog

Currently, Gravitino supports the following changes to a catalog:

Supported modification | JSON | Java
Rename catalog | {"@type":"rename","newName":"catalog_renamed"} | CatalogChange.rename("catalog_renamed")
Update comment | {"@type":"updateComment","newComment":"new_comment"} | CatalogChange.updateComment("new_comment")
Set a property | {"@type":"setProperty","property":"key1","value":"value1"} | CatalogChange.setProperty("key1", "value1")
Remove a property | {"@type":"removeProperty","property":"key1"} | CatalogChange.removeProperty("key1")

Drop a catalog

You can remove a catalog by sending a DELETE request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name} endpoint or just use the Gravitino Java client. The following is an example of dropping a catalog:

curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
http://localhost:8090/api/metalakes/metalake/catalogs/catalog

Note: Dropping a catalog only removes the metadata about the catalog and the schemas and tables under it in Gravitino. It doesn't remove the real data (tables and schemas) in Apache Hive.

List all catalogs in a metalake

You can list all catalogs under a metalake by sending a GET request to the /api/metalakes/{metalake_name}/catalogs endpoint or just use the Gravitino Java client. The following is an example of listing all catalogs in a metalake:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
http://localhost:8090/api/metalakes/metalake/catalogs

Schema operations

Tip: Users should create a metalake and a catalog before creating a schema.

Create a schema

You can create a schema by sending a POST request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas endpoint or just use the Gravitino Java client. The following is an example of creating a schema:

curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
"name": "schema",
"comment": "comment",
"properties": {
"key1": "value1"
}
}' http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas

The supported schema properties depend on the catalog provider:

Catalog provider | Schema property
hive | Hive schema property
lakehouse-iceberg | Iceberg schema property
jdbc-mysql | MySQL schema property
jdbc-postgresql | PostgreSQL schema property

Load a schema

You can load a schema by sending a GET request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name} endpoint or just use the Gravitino Java client. The following is an example of loading a schema:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema

Alter a schema

You can change a schema by sending a PUT request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name} endpoint or just use the Gravitino Java client. The following is an example of modifying a schema:

curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
"updates": [
{
"@type": "removeProperty",
"property": "key2"
}, {
"@type": "setProperty",
"property": "key3",
"value": "value3"
}
]
}' http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema

Currently, Gravitino supports the following changes to a schema:

Supported modification | JSON | Java
Set a property | {"@type":"setProperty","property":"key1","value":"value1"} | SchemaChange.setProperty("key1", "value1")
Remove a property | {"@type":"removeProperty","property":"key1"} | SchemaChange.removeProperty("key1")

Drop a schema

You can remove a schema by sending a DELETE request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name} endpoint or just use the Gravitino Java client. The following is an example of dropping a schema:

# cascade can be true or false
curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
"http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema?cascade=true"
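
Because the cascade parameter accepts only true or false, a script can reject other values before sending the request. The following is a small sketch under that assumption. The function name schema_drop_url and the GRAVITINO_URL variable are hypothetical and not part of Gravitino.

```shell
#!/bin/sh
# Base URL assumed on this page; override it through the environment if needed.
GRAVITINO_URL="${GRAVITINO_URL:-http://localhost:8090}"

# schema_drop_url METALAKE CATALOG SCHEMA CASCADE
# Prints the DELETE URL for a schema, or fails if CASCADE is not true/false.
schema_drop_url() {
  case "$4" in
    true|false) ;;
    *) echo "cascade must be true or false" >&2; return 1 ;;
  esac
  printf '%s/api/metalakes/%s/catalogs/%s/schemas/%s?cascade=%s\n' \
    "$GRAVITINO_URL" "$1" "$2" "$3" "$4"
}

# Example call (requires a running Gravitino server, so it is left commented out).
# The URL is quoted so the shell does not treat "?" as a glob character.
# curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \
#   -H "Content-Type: application/json" \
#   "$(schema_drop_url metalake catalog schema true)"
```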

List all schemas under a catalog

You can list all schemas under a catalog by sending a GET request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas endpoint or just use the Gravitino Java client. The following is an example of listing all schemas in a catalog:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas

Table operations

Tip: Users should create a metalake, a catalog, and a schema before creating a table.

Create a table

You can create a table by sending a POST request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/tables endpoint or just use the Gravitino Java client. The following is an example of creating a table:

curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
"name": "table",
"columns": [
{
"name": "id",
"type": "integer",
"nullable": true,
"comment": "Id of the user"
},
{
"name": "name",
"type": "varchar(2000)",
"nullable": true,
"comment": "Name of the user"
}
],
"comment": "Create a new Table",
"properties": {
"format": "ORC"
}
}' http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables

In order to create a table, you need to provide the following information:

  • Table column name and type
  • Table property
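
The "columns" array in the create request is a list of objects with name, type, nullable, and comment fields. The following sketch generates those objects and joins them into an array. The function names column_json and columns_array are hypothetical, and the values are assumed to contain no double quotes or backslashes because no JSON escaping is done.

```shell
#!/bin/sh
# column_json NAME TYPE NULLABLE COMMENT
# Prints one column definition in the shape used by the create-table request.
# NULLABLE must be the literal true or false; it is inserted unquoted.
column_json() {
  printf '{"name":"%s","type":"%s","nullable":%s,"comment":"%s"}' "$1" "$2" "$3" "$4"
}

# columns_array COLUMN_JSON...
# Joins column definitions into a JSON array.
# Note: POSIX sh has no local variables, so out, sep and col are global.
columns_array() {
  out='['
  sep=''
  for col in "$@"; do
    out="${out}${sep}${col}"
    sep=','
  done
  printf '%s]' "$out"
}

# Example: the two columns from the create-table request above.
# columns_array "$(column_json id integer true 'Id of the user')" \
#               "$(column_json name 'varchar(2000)' true 'Name of the user')"
```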

Gravitino table column type

Gravitino supports the following column types:

Type | Java | JSON | Description
Boolean | Types.BooleanType.get() | boolean | Boolean type
Byte | Types.ByteType.get() | byte | Byte type, indicates a numerical value of 1 byte
Short | Types.ShortType.get() | short | Short type, indicates a numerical value of 2 bytes
Integer | Types.IntegerType.get() | integer | Integer type, indicates a numerical value of 4 bytes
Long | Types.LongType.get() | long | Long type, indicates a numerical value of 8 bytes
Float | Types.FloatType.get() | float | Float type, indicates a single-precision floating point number
Double | Types.DoubleType.get() | double | Double type, indicates a double-precision floating point number
Decimal(precision, scale) | Types.DecimalType.of(precision, scale) | decimal(p, s) | Decimal type, indicates a fixed-precision decimal number
String | Types.StringType.get() | string | String type
FixedChar(length) | Types.FixedCharType.of(length) | char(l) | Char type, indicates a fixed-length string
VarChar(length) | Types.VarCharType.of(length) | varchar(l) | Varchar type, indicates a variable-length string; the length is the maximum length of the string
Timestamp | Types.TimestampType.withoutTimeZone() | timestamp | Timestamp type, indicates a timestamp without timezone
TimestampWithTimezone | Types.TimestampType.withTimeZone() | timestamp_tz | Timestamp with timezone type, indicates a timestamp with timezone
Date | Types.DateType.get() | date | Date type
Time | Types.TimeType.withoutTimeZone() | time | Time type
IntervalToYearMonth | Types.IntervalYearType.get() | interval_year | Interval type, indicates an interval of year and month
IntervalToDayTime | Types.IntervalDayType.get() | interval_day | Interval type, indicates an interval of day and time
Fixed(length) | Types.FixedType.of(length) | fixed(l) | Fixed type, indicates a fixed-length binary array
Binary | Types.BinaryType.get() | binary | Binary type, indicates an arbitrary-length binary array
List | Types.ListType.of(elementType, elementNullable) | {"type": "list", "containsNull": JSON Boolean, "elementType": type JSON} | List type, indicates a list of elements with the same type
Map | Types.MapType.of(keyType, valueType) | {"type": "map", "keyType": type JSON, "valueType": type JSON, "valueContainsNull": JSON Boolean} | Map type, indicates a map of key-value pairs
Struct | Types.StructType.of([Types.StructType.Field.of(name, type, nullable)]) | {"type": "struct", "fields": [JSON StructField, {"name": string, "type": type JSON, "nullable": JSON Boolean, "comment": string}]} | Struct type, indicates a struct of fields
Union | Types.UnionType.of([type1, type2, ...]) | {"type": "union", "types": [type JSON, ...]} | Union type, indicates a union of types

The related java doc is here.
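
The complex types nest: wherever the JSON column says "type JSON", another type can appear. The following sketch composes list and map types in that shape. The function names list_type and map_type are hypothetical, and the sketch assumes that a primitive type is written as a quoted JSON string such as "integer", as in the column definitions of the create-table example.

```shell
#!/bin/sh
# list_type ELEMENT_TYPE_JSON CONTAINS_NULL
# Prints a list type. ELEMENT_TYPE_JSON is inserted as-is, so a primitive
# type must be passed with its JSON quotes, for example '"integer"'.
list_type() {
  printf '{"type":"list","containsNull":%s,"elementType":%s}' "$2" "$1"
}

# map_type KEY_TYPE_JSON VALUE_TYPE_JSON VALUE_CONTAINS_NULL
# Prints a map type; both type arguments are inserted as-is.
map_type() {
  printf '{"type":"map","keyType":%s,"valueType":%s,"valueContainsNull":%s}' "$1" "$2" "$3"
}

# Example: a map from string to a list of integers.
# map_type '"string"' "$(list_type '"integer"' false)" true
```

The resulting JSON takes the place of the plain type string in a column definition.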

Table property and type mapping

The supported table properties and type mappings depend on the catalog provider:

Catalog provider | Table property | Type mapping
hive | Hive table property | Hive type mapping
lakehouse-iceberg | Iceberg table property | Iceberg type mapping
jdbc-mysql | MySQL table property | MySQL type mapping
jdbc-postgresql | PostgreSQL table property | PostgreSQL type mapping

In addition to the basic settings, Gravitino supports the following features:

Feature | Description | Java doc
Partitioned table | Equivalent to PARTITIONED BY in Apache Hive and other engines that support partitioning. | Partition
Bucketed table | Equivalent to CLUSTERED BY in Apache Hive; some engines may use different terms for it. | Distribution
Sorted order table | Equivalent to SORTED BY in Apache Hive; some engines may use different terms for it. | SortOrder

Tip: Not all catalogs support these features. Please refer to the related document for more details.

The following is an example of creating a partitioned, bucketed, and sorted table:

curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
"name": "table",
"columns": [
{
"name": "id",
"type": "integer",
"nullable": true,
"comment": "Id of the user"
},
{
"name": "name",
"type": "varchar(2000)",
"nullable": true,
"comment": "Name of the user"
},
{
"name": "age",
"type": "short",
"nullable": true,
"comment": "Age of the user"
},
{
"name": "score",
"type": "double",
"nullable": true,
"comment": "Score of the user"
}
],
"comment": "Create a new Table",
"properties": {
"format": "ORC"
},
"partitioning": [
{
"strategy": "identity",
"fieldName": ["score"]
}
],
"distribution": {
"strategy": "hash",
"number": 4,
"funcArgs": [
{
"type": "field",
"fieldName": ["score"]
}
]
},
"sortOrders": [
{
"direction": "asc",
"nullOrder": "NULLS_LAST",
"sortTerm": {
"type": "field",
"fieldName": ["name"]
}
}
]
}' http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables

Note: The code above is an example of creating a Hive table. For other catalogs, the code is similar, but the supported column types and table properties may differ. For more details, please refer to the related doc.

Load a table

You can load a table by sending a GET request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/tables/{table_name} endpoint or just use the Gravitino Java client. The following is an example of loading a table:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables/table
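
The endpoints on this page follow one nesting pattern: metalakes, then catalogs, then schemas, then tables. The following sketch builds the URL of an entity from its name parts. The function name entity_url and the GRAVITINO_URL variable are hypothetical, and names are assumed to need no URL encoding.

```shell
#!/bin/sh
# Base URL assumed on this page; override it through the environment if needed.
GRAVITINO_URL="${GRAVITINO_URL:-http://localhost:8090}"

# entity_url METALAKE [CATALOG [SCHEMA [TABLE]]]
# Prints the URL of the deepest entity named by the arguments.
# Note: POSIX sh has no local variables, so url is global.
entity_url() {
  url="$GRAVITINO_URL/api/metalakes/$1"
  if [ "$#" -ge 2 ]; then url="$url/catalogs/$2"; fi
  if [ "$#" -ge 3 ]; then url="$url/schemas/$3"; fi
  if [ "$#" -ge 4 ]; then url="$url/tables/$4"; fi
  printf '%s\n' "$url"
}

# Example call (requires a running Gravitino server, so it is left commented out):
# curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
#   -H "Content-Type: application/json" \
#   "$(entity_url metalake catalog schema table)"
```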

Alter a table

You can modify a table by sending a PUT request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/tables/{table_name} endpoint or just use the Gravitino Java client. The following is an example of modifying a table:

curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
"updates": [
{
"@type": "removeProperty",
"property": "key2"
}, {
"@type": "setProperty",
"property": "key3",
"value": "value3"
}
]
}' http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables/table

Currently, Gravitino supports the following changes to a table:

Supported modification | JSON | Java
Rename table | {"@type":"rename","newName":"table_renamed"} | TableChange.rename("table_renamed")
Update comment | {"@type":"updateComment","newComment":"new_comment"} | TableChange.updateComment("new_comment")
Set a table property | {"@type":"setProperty","property":"key1","value":"value1"} | TableChange.setProperty("key1", "value1")
Remove a table property | {"@type":"removeProperty","property":"key1"} | TableChange.removeProperty("key1")
Add a column | {"@type":"addColumn","fieldName":["position"],"type":"varchar(20)","comment":"Position of user","position":"FIRST"} | TableChange.addColumn(...)
Delete a column | {"@type":"deleteColumn","fieldName": ["name"], "ifExists": true} | TableChange.deleteColumn(...)
Rename a column | {"@type":"renameColumn","oldFieldName":["name_old"], "newFieldName":"name_new"} | TableChange.renameColumn(...)
Update the column comment | {"@type":"updateColumnComment", "fieldName": ["name"], "newComment": "new comment"} | TableChange.updateColumnComment(...)
Update the type of a column | {"@type":"updateColumnType","fieldName": ["name"], "newType":"varchar(100)"} | TableChange.updateColumnType(...)
Update the nullability of a column | {"@type":"updateColumnNullability","fieldName": ["name"],"nullable":true} | TableChange.updateColumnNullability(...)
Update the position of a column | {"@type":"updateColumnPosition","fieldName": ["name"], "newPosition":"default"} | TableChange.updateColumnPosition(...)
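
Column changes are sent in the same "updates" array as property changes. The following sketch builds the body for adding one column, using the addColumn shape from the table above. The function name add_column_change is hypothetical, and the values are assumed to contain no double quotes or backslashes because no JSON escaping is done.

```shell
#!/bin/sh
# add_column_change COLUMN TYPE COMMENT POSITION
# Prints an addColumn change object for a top-level column.
add_column_change() {
  printf '{"@type":"addColumn","fieldName":["%s"],"type":"%s","comment":"%s","position":"%s"}' \
    "$1" "$2" "$3" "$4"
}

# add_column_body COLUMN TYPE COMMENT POSITION
# Wraps the single change in the "updates" array expected by the PUT request.
add_column_body() {
  printf '{"updates":[%s]}' "$(add_column_change "$@")"
}

# Example call (requires a running Gravitino server, so it is left commented out):
# curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
#   -H "Content-Type: application/json" \
#   -d "$(add_column_body position 'varchar(20)' 'Position of user' FIRST)" \
#   http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables/table
```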

Drop a table

You can remove a table by sending a DELETE request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/tables/{table_name} endpoint or just use the Gravitino Java client. The following is an example of dropping a table:

curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables/table

List all tables under a schema

You can list all tables in a schema by sending a GET request to the /api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/tables endpoint or just use the Gravitino Java client. The following is an example of listing all tables in a schema:

curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" \
http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables