trait LayerDataFrameReader extends AnyRef
Custom Spark DataFrameReader for querying data from a given layer.
The layer type is inferred from the layer configuration, so the API for reading from an index layer is the same as for a versioned layer. To read from an index or versioned layer, your application must perform the following operations:
- Create an instance of a SparkSession.
- Use LayerDataFrameReader.SparkSessionExt.readLayer to create a LayerDataFrameReader.
- Call query to specify the query.
- Call load to create a DataFrame that contains the data. The method will infer the data format from the layer content type.
Below is an example written in Scala that demonstrates how to query data from an index layer:
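import com.here.platform.data.client.spark.LayerDataFrameReader.SparkSessionExt
import org.apache.spark.sql.{DataFrame, SparkSession}

// catalogHrn and indexLayer are assumed to be defined by the application.
val spark = SparkSession
  .builder()
  .appName(getClass.getSimpleName)
  .master("local[*]")
  .getOrCreate()

// Read from the layer, restricting the result with an RSQL query.
val dataFrame: DataFrame = spark
  .readLayer(catalogHrn, indexLayer)
  .query(
    "tileId=INBOUNDINGBOX=(23.648524, 22.689013, 62.284241, 60.218811) and eventType==SignRecognition")
  .load()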
Java developers should use com.here.platform.data.client.spark.javadsl.JavaLayerDataFrameReader#readLayer instead of spark.readLayer:
Dataset<Row> df =
    JavaLayerDataFrameReader.create(spark)
        .readLayer(catalogHrn, layerId)
        .query(
            "tileId=INBOUNDINGBOX=(23.648524, 22.689013, 62.284241, 60.218811) and eventType==SignRecognition")
        .load();
- Note
If the load method cannot correctly infer the data format from the layer content type, the application can enforce the data format by calling the format method before calling load.
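The order in which the data format is chosen (explicit format first, then the layer content type, then the Spark default) can be sketched in plain Scala. This is only an illustration of the documented behaviour, not the library's implementation: the names `FormatResolution`, `knownContentTypes`, and `resolveFormat` are hypothetical, and the only content type mapping taken from this documentation is application/x-parquet to parquet.

```scala
object FormatResolution {
  // Hypothetical lookup table. Only the application/x-parquet entry is
  // stated in the documentation; any further entry would be an assumption.
  val knownContentTypes: Map[String, String] = Map(
    "application/x-parquet" -> "parquet"
  )

  /** Picks a data source format in the documented order of precedence:
    * 1. the format set explicitly by the caller (the `format` method),
    * 2. the format inferred from the layer content type,
    * 3. the value of the spark.sql.sources.default property.
    */
  def resolveFormat(
      explicitFormat: Option[String],
      layerContentType: String,
      sparkDefault: String = "parquet"
  ): String =
    explicitFormat
      .orElse(knownContentTypes.get(layerContentType))
      .getOrElse(sparkDefault)
}
```

For example, `FormatResolution.resolveFormat(None, "application/x-parquet")` yields `parquet`, while passing `Some(...)` as the first argument models an application that called format before load.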
Abstract Value Members
- abstract def format(source: String): LayerDataFrameReader
  Specifies the format of the data stored in the layer.
- abstract def load(): DataFrame
  Retrieves the data in the user-defined format (see format) that satisfies the provided query (see query).
  If no format is set, the load method infers the input data source format from the layer content type. For example, if the layer content type is application/x-parquet, the load method uses the parquet data source format.
  If no format is set and the load method cannot infer the input data source format from the layer content type, the load method uses the default format defined in the spark.sql.sources.default Spark property, whose default value is parquet.
  returns: DataFrame with the data. The structure of the DataFrame depends on the format (see format) or on the optional user-provided schema (see schema).
  Exceptions thrown:
    com.here.platform.data.client.DataClientNonRetriableException in case of a non-retriable error
    com.here.platform.data.client.DataClientRetriableException in case of a retriable error
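Because load distinguishes retriable from non-retriable errors, a caller may want to retry only the former. The following is a minimal, library-independent sketch of that idea. `RetrySketch` and `withRetry` are hypothetical names, the retry policy (a fixed number of attempts, no back-off) is an assumption, and whether the client already retries internally is not stated here.

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

object RetrySketch {
  /** Runs `op` and returns its result. If `op` fails with an error accepted
    * by `isRetriable` and attempts remain, it is run again; any other
    * failure, or the last failure, is rethrown unchanged.
    */
  @tailrec
  def withRetry[T](maxAttempts: Int)(isRetriable: Throwable => Boolean)(op: () => T): T =
    Try(op()) match {
      case Success(value) => value
      case Failure(error) if maxAttempts > 1 && isRetriable(error) =>
        withRetry(maxAttempts - 1)(isRetriable)(op)
      case Failure(error) => throw error
    }
}
```

With the Data Client library on the classpath, the predicate could be `_.isInstanceOf[com.here.platform.data.client.DataClientRetriableException]` and the operation a function that calls load on a configured reader.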
- abstract def option(key: String, value: Enum[_]): LayerDataFrameReader
  Adds an input option for the underlying data source.
- abstract def option(key: String, value: String): LayerDataFrameReader
  Adds an input option for the underlying data source.
- abstract def option(key: String, value: Boolean): LayerDataFrameReader
  Adds an input option for the underlying data source.
- abstract def option(key: String, value: Long): LayerDataFrameReader
  Adds an input option for the underlying data source.
- abstract def option(key: String, value: Double): LayerDataFrameReader
  Adds an input option for the underlying data source.
- abstract def query(query: String): LayerDataFrameReader
  Specifies the query to use when querying the layer.
  query: Query string used to retrieve layer data. The query must follow the RSQL format. See https://github.com/jirutka/rsql-parser
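Query strings of the kind shown in the class-level example can be assembled from smaller parts. The sketch below is an assumption-laden illustration: `RsqlSketch` and its methods are hypothetical helpers, not part of the library, and they perform no quoting or escaping of arguments. The `=INBOUNDINGBOX=` operator and the field names are copied from the example above.

```scala
object RsqlSketch {
  /** Builds a single comparison such as `eventType==SignRecognition`. */
  def comparison(field: String, operator: String, argument: String): String =
    s"$field$operator$argument"

  /** Formats a parenthesised, comma-separated argument list. */
  def group(values: String*): String =
    values.mkString("(", ", ", ")")

  /** Joins comparisons with the RSQL logical AND keyword. */
  def and(conditions: String*): String =
    conditions.mkString(" and ")

  // Rebuilds the query string used in the class-level example.
  val exampleQuery: String = and(
    comparison(
      "tileId",
      "=INBOUNDINGBOX=",
      group("23.648524", "22.689013", "62.284241", "60.218811")),
    comparison("eventType", "==", "SignRecognition")
  )
}
```

The resulting string can then be passed to query. Coordinates are kept as strings here so that their textual form is exactly what the caller wrote.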
- abstract def queryMetadata(query: String): LayerDataFrameReader
  Specifies the query to use when querying the layer partitions metadata.
  query: Query string used to retrieve layer partitions metadata. The query must follow the RSQL format. See https://github.com/jirutka/rsql-parser
- abstract def schema(schema: StructType): LayerDataFrameReader
  Specifies the schema of the data stored in the layer. Some data formats, such as Apache Avro, can infer the schema automatically from the data. When the schema is specified here, the underlying data source can skip the schema inference step, which speeds up data loading.
Concrete Value Members
- final def !=(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- final def ##(): Int
  Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  Definition Classes: Any
- def clone(): AnyRef
  Attributes: protected[lang]
  Definition Classes: AnyRef
  Annotations: @throws( ... ) @native()
- final def eq(arg0: AnyRef): Boolean
  Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- def finalize(): Unit
  Attributes: protected[lang]
  Definition Classes: AnyRef
  Annotations: @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
  Definition Classes: AnyRef → Any
  Annotations: @native()
- def hashCode(): Int
  Definition Classes: AnyRef → Any
  Annotations: @native()
- final def isInstanceOf[T0]: Boolean
  Definition Classes: Any
- final def ne(arg0: AnyRef): Boolean
  Definition Classes: AnyRef
- final def notify(): Unit
  Definition Classes: AnyRef
  Annotations: @native()
- final def notifyAll(): Unit
  Definition Classes: AnyRef
  Annotations: @native()
- final def synchronized[T0](arg0: ⇒ T0): T0
  Definition Classes: AnyRef
- def toString(): String
  Definition Classes: AnyRef → Any
- final def wait(): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... ) @native()