package spark
Type Members
- trait ClientsFactory extends AnyRef
- class DataClientSparkContext extends DataClientContext
Context holder with shared resources used by DataClient.
The context is not serializable (contains threads and sockets) and should never be shared between master and workers.
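Because of this, the context should be resolved on the JVM that uses it rather than captured in a driver-side closure. A minimal sketch, assuming DataClientSparkContextUtils.context (listed under Value Members below) is the accessor for the shared context:

import org.apache.spark.sql.SparkSession
import com.here.platform.data.client.spark.DataClientSparkContextUtils

val spark = SparkSession.builder().appName("ctx").master("local[*]").getOrCreate()

spark.sparkContext
  .parallelize(1 to 4)
  .foreachPartition { _ =>
    // Resolve the context inside the task, on the executor JVM; capturing it
    // on the driver would force Spark to serialize threads and sockets.
    val ctx = DataClientSparkContextUtils.context // assumed accessor
    // ... use ctx-backed clients here ...
  }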
- class DefaultClientsFactory extends ClientsFactory
- class DefaultLayerDataFrameReaderFactory extends LayerDataFrameReaderFactory with Logging
- class DefaultLayerUpdaterFactory extends LayerUpdaterFactory
- class DefaultWritersFactory extends WritersFactory with Logging
- trait IndexDataFrameReader extends LayerDataFrameReader
LayerDataFrameReader to query data from an index layer.
- final class InteractiveMapDataFrameConstants extends AnyRef
- trait InteractiveMapDataFrameReader extends LayerDataFrameReader
LayerDataFrameReader to query data from an interactive map layer.
Currently, this trait has private access because it does not implement any methods beyond those of LayerDataFrameReader.
- trait LayerDataFrameReader extends AnyRef
Custom Spark DataFrameReader for querying data from a given layer.
The layer type is inferred from the layer configuration, so the API for reading from an index layer is the same as for a versioned layer. To read from an index or versioned layer, your application must perform the following operations:
- Create an instance of a SparkSession.
- Use LayerDataFrameReader.SparkSessionExt.readLayer to create a LayerDataFrameReader.
- Call query to specify the query.
- Call load to create a DataFrame that contains the data. The method will infer the data format from the layer content type.
Below is an example written in Scala that demonstrates how to query data from an index layer:
import com.here.platform.data.client.spark.LayerDataFrameReader.SparkSessionExt

val spark = SparkSession
  .builder()
  .appName(getClass.getSimpleName)
  .master("local[*]")
  .getOrCreate()

val dataFrame: DataFrame = spark
  .readLayer(catalogHrn, indexLayer)
  .query(
    "tileId=INBOUNDINGBOX=(23.648524, 22.689013, 62.284241, 60.218811) and eventType==SignRecognition")
  .load()
Java developers should use com.here.platform.data.client.spark.javadsl.JavaLayerDataFrameReader#readLayer instead of spark.readLayer:
Dataset<Row> df =
    JavaLayerDataFrameReader.create(spark)
        .readLayer(catalogHrn, layerId)
        .query(
            "tileId=INBOUNDINGBOX=(23.648524, 22.689013, 62.284241, 60.218811) and eventType==SignRecognition")
        .load();
- trait LayerDataFrameReaderFactory extends AnyRef
- trait LayerDataFrameWriter extends AnyRef
Custom Spark DataFrameWriter for writing data to a given layer.
The layer type is inferred from the layer configuration, so the API for writing to an index layer is the same as for a versioned layer. To write to an index or versioned layer, your application must perform the following operations:
- Have some data as a DataFrame.
- Use LayerDataFrameWriter.DataFrameExt.writeLayer to create a LayerDataFrameWriter.
- Call withDataConverter to specify how the data groupings should be merged into a single data file. This is not necessary if the data is stored as Avro, Parquet, or Protobuf, since the appropriate DataConverter will be inferred from the layer's content type (see the Note below).
- Call save to save the data stored in the DataFrame to the given layer.
Below is an example written in Scala that demonstrates how to write data to an index layer:
import com.here.platform.data.client.spark.LayerDataFrameWriter.DataFrameExt

val spark = SparkSession
  .builder()
  .appName(getClass.getSimpleName)
  .master("local[*]")
  .getOrCreate()

val dataFrame: DataFrame = ???

dataFrame
  .writeLayer(catalogHrn, indexLayer)
  .save()
Java developers should use com.here.platform.data.client.spark.javadsl.JavaLayerDataFrameWriter#writeLayer instead of dataFrame.writeLayer:
Dataset<Row> df = ???
JavaLayerDataFrameWriter.create(df)
    .writeLayer(catalogHrn, layerId)
    .save();

The batch size (the number of Rows) in a grouping can be restricted by setting the option olp.groupedBatchSize (e.g. 2 for 2 Rows in each group):

val dataFrame: DataFrame = ???
dataFrame
  .writeLayer(catalogHrn, indexLayer)
  .option("olp.groupedBatchSize", 2)
  .save()
- Note
If the save method cannot correctly infer the DataConverter from the layer content type, the application will be required to provide the DataConverter using the withDataConverter method.
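For illustration, a hedged sketch in Scala of providing the converter explicitly; the converter class and its construction are hypothetical stand-ins, not the library's actual converter API:

// Hypothetical: MyCsvDataConverter stands in for a concrete DataConverter
// implementation that merges each grouping of Rows into one CSV payload.
val csvConverter = new MyCsvDataConverter

dataFrame
  .writeLayer(catalogHrn, customLayer)   // customLayer: content type not inferable
  .withDataConverter(csvConverter)       // explicit converter, as the Note requires
  .save()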
- trait LayerUpdater extends AnyRef
Trait representing a layer updater that mutates a layer by deleting some of its partitions.
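A hedged usage sketch: the factory call and the deletion method below are hypothetical names chosen to illustrate the trait's purpose, not its actual members:

// Hypothetical sketch: member names are illustrative only.
val updater = layerUpdaterFactory.create(catalogHrn, layerId) // assumed factory method
updater.deletePartitions(Seq("partition-1", "partition-2"))   // assumed deletion method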
- trait LayerUpdaterFactory extends AnyRef
- trait WritersFactory extends AnyRef
Value Members
- object DataClientSparkContextUtils
Utility object to initialize and hold the shared resources required by DataClient.
- object InteractiveMapDataFrame
- object LayerDataFrameReader
Provides the implicit class LayerDataFrameReader.SparkSessionExt that simplifies the creation of LayerDataFrameReaders.
- object LayerDataFrameWriter
- object SparkSupport
Implicit helpers for accessing the API through the synchronous calls that Spark requires.
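Spark's execution model is synchronous while DataClient returns Futures, so these helpers block until a Future completes. A minimal sketch, assuming the import adds an awaitResult-style extension to Future (the member name is an assumption):

import scala.concurrent.Future
import com.here.platform.data.client.spark.SparkSupport._

val pending: Future[String] = Future.successful("partition metadata")

// Assumed implicit extension: blocks the calling thread until the Future
// completes, yielding its value.
val result: String = pending.awaitResult()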