trait LayerDataFrameReader extends AnyRef
Custom Spark DataFrameReader for querying data from a given layer.
The layer type is inferred from the layer configuration, so the API for reading from an index layer and from a versioned layer is the same. To read from an index or versioned layer, your application must perform the following operations:
- Create an instance of a SparkSession.
- Use LayerDataFrameReader.SparkSessionExt.readLayer to create a LayerDataFrameReader.
- Call query to specify the query.
- Call load to create a DataFrame that contains the data. The method will infer the data format from the layer content type.
Below is an example written in Scala that demonstrates how to query data from an index layer:
import com.here.platform.data.client.spark.LayerDataFrameReader.SparkSessionExt
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession
  .builder()
  .appName(getClass.getSimpleName)
  .master("local[*]")
  .getOrCreate()

val dataFrame: DataFrame = spark
  .readLayer(catalogHrn, indexLayer)
  .query(
    "tileId=INBOUNDINGBOX=(23.648524, 22.689013, 62.284241, 60.218811) and eventType==SignRecognition")
  .load()
Java developers should use com.here.platform.data.client.spark.javadsl.JavaLayerDataFrameReader#readLayer instead of spark.readLayer:
Dataset<Row> df =
    JavaLayerDataFrameReader.create(spark)
        .readLayer(catalogHrn, layerId)
        .query(
            "tileId=INBOUNDINGBOX=(23.648524, 22.689013, 62.284241, 60.218811) and eventType==SignRecognition")
        .load();
- Note
If the load method cannot correctly infer the data format from the layer content type, the application can enforce the data format by calling the format method before calling load.
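As a minimal sketch of the note above, the helper below enforces the parquet data source before loading. It assumes the HERE Data Client Spark library and Spark SQL are on the classpath. The names ExplicitFormatExample and loadAsParquet are hypothetical, and the reader argument is assumed to come from spark.readLayer as in the Scala example above.

```scala
import com.here.platform.data.client.spark.LayerDataFrameReader
import org.apache.spark.sql.DataFrame

object ExplicitFormatExample {
  // Hypothetical helper: `reader` is assumed to be the result of
  // spark.readLayer(catalogHrn, indexLayer).
  def loadAsParquet(reader: LayerDataFrameReader): DataFrame =
    reader
      .format("parquet") // enforce the data source format instead of relying on inference
      .query("eventType==SignRecognition")
      .load()
}
```

The call order between format and query is not significant in this sketch; what matters is that format is called before load.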
Abstract Value Members
- abstract def format(source: String): LayerDataFrameReader
Specifies the format of the data stored in the layer.
- abstract def load(): DataFrame
Retrieves the data that satisfies the provided query (see query), in the user-defined format (see format).
If no format is set, the load method infers the input data source format from the layer content type. For example, if the layer content type is application/x-parquet, the load method will specify the parquet data source format.
If no format is set and the load method cannot infer the input data source format from the layer content type, the load method will use the default format defined in the spark.sql.sources.default Spark property, whose default value is parquet.
- returns
DataFrame with the data. The structure of the DataFrame depends on the format (see format) or on the optional user-provided schema.
- Exceptions thrown
com.here.platform.data.client.DataClientNonRetriableException in case of non-retriable error
com.here.platform.data.client.DataClientRetriableException in case of retriable error
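The format precedence described for load can be illustrated with a small, self-contained sketch. This is not the library's implementation: FormatResolution, resolveFormat, and the content-type table are hypothetical, and only the application/x-parquet entry is taken from the description above.

```scala
// Illustrative only: mirrors the documented precedence, not the library's actual code.
object FormatResolution {
  // Hypothetical content-type table; only the parquet entry is stated in the documentation.
  private val inferredFormats: Map[String, String] = Map(
    "application/x-parquet" -> "parquet"
  )

  /** Picks the data source format in the documented order of precedence:
    * explicit format, then the format inferred from the layer content type,
    * then the value of the spark.sql.sources.default property.
    */
  def resolveFormat(
      explicitFormat: Option[String],
      layerContentType: String,
      sparkDefaultFormat: String = "parquet" // default value of spark.sql.sources.default
  ): String =
    explicitFormat
      .orElse(inferredFormats.get(layerContentType))
      .getOrElse(sparkDefaultFormat)
}
```

For instance, resolveFormat(None, "application/x-parquet") selects parquet through inference, while an explicit format always takes priority over the content type.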
- abstract def option(key: String, value: Enum[_]): LayerDataFrameReader
Adds an input option for the underlying data source.
- abstract def option(key: String, value: String): LayerDataFrameReader
Adds an input option for the underlying data source.
- abstract def option(key: String, value: Boolean): LayerDataFrameReader
Adds an input option for the underlying data source.
- abstract def option(key: String, value: Long): LayerDataFrameReader
Adds an input option for the underlying data source.
- abstract def option(key: String, value: Double): LayerDataFrameReader
Adds an input option for the underlying data source.
- abstract def query(query: String): LayerDataFrameReader
Specifies the query to use when querying the layer.
- query
Query string to retrieve layer data. The query should follow RSQL syntax. See https://github.com/jirutka/rsql-parser
- abstract def queryMetadata(query: String): LayerDataFrameReader
Specifies the query to use when querying the layer partitions metadata.
- query
Query string to retrieve layer partitions metadata. The query should follow RSQL syntax. See https://github.com/jirutka/rsql-parser
- abstract def schema(schema: StructType): LayerDataFrameReader
Specifies the schema of the data stored in the layer. Some data formats such as Apache Avro can infer the schema automatically from the data. By specifying the schema here, the underlying data source can skip the schema inference step, and thus speed up data loading.
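A minimal sketch of passing an explicit schema follows, assuming the HERE Data Client Spark library and Spark SQL are on the classpath. ExplicitSchemaExample, loadWithSchema, and the fields of eventSchema are hypothetical placeholders; the real column names and types depend on the data stored in your layer.

```scala
import com.here.platform.data.client.spark.LayerDataFrameReader
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

object ExplicitSchemaExample {
  // Hypothetical schema: field names and types are placeholders,
  // not taken from a real layer.
  val eventSchema: StructType = StructType(Seq(
    StructField("tileId", LongType, nullable = false),
    StructField("eventType", StringType, nullable = true)
  ))

  // `reader` is assumed to be the result of spark.readLayer(catalogHrn, indexLayer).
  def loadWithSchema(reader: LayerDataFrameReader): DataFrame =
    reader
      .schema(eventSchema) // lets the data source skip schema inference
      .query("eventType==SignRecognition")
      .load()
}
```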
Concrete Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)