
com.here.platform.data.processing.spark.partitioner

AdaptiveLevelingPartitioner

case class AdaptiveLevelingPartitioner(pattern: AdaptivePattern, fallbackPartitioner: Option[PartitionNamePartitioner] = None) extends PartitionNamePartitioner with Product with Serializable

A Partitioner for com.here.platform.data.processing.catalog.Partition.Keys that uses a precalculated com.here.platform.data.processing.leveling.AdaptivePattern.

Keys are distributed to Spark partitions strictly following the leveling points that the pattern specifies. Keys that the pattern does not aggregate are distributed among a disjoint set of Spark partitions by the fallback partitioner, if one is specified; otherwise they are uniformly distributed over the existing partitions.

The number of partitions used for aggregated keys is fixed and matches the number of leveling points of the pattern.

pattern

The adaptive leveling pattern that controls the partitioning.

fallbackPartitioner

The optional partitioner used for non-aggregated keys. If undefined, non-aggregated keys are uniformly distributed over the existing partitions.
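The dispatch described above can be sketched with a simplified, self-contained model. Note that `LevelingModel`, its `levelingPoints` map, and the `fallback` function are illustrative stand-ins for the real `AdaptivePattern` and `PartitionNamePartitioner` types, not the library's API:

```scala
// Hypothetical, simplified model of AdaptiveLevelingPartitioner's dispatch.
object LevelingModel {
  // Leveling points: keys the pattern aggregates, each mapped to a
  // dedicated Spark partition index in [0, levelingPoints.size).
  val levelingPoints: Map[String, Int] = Map("tile-a" -> 0, "tile-b" -> 1)

  // Optional fallback for keys the pattern did not aggregate.
  val fallback: Option[String => Int] = None

  // The partition count for aggregated keys is fixed and matches the
  // number of leveling points.
  val numPartitions: Int = levelingPoints.size

  def getPartition(key: String): Int =
    levelingPoints.get(key) match {
      case Some(p) => p                // strictly follow the pattern
      case None =>
        fallback match {
          case Some(f) => f(key)       // disjoint set via fallback partitioner
          case None =>                 // uniform over the existing partitions
            math.abs(key.hashCode % numPartitions)
        }
    }
}
```

In the real partitioner the same three-way split applies: pattern first, fallback second, uniform hashing last.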

Linear Supertypes
Product, Equals, PartitionNamePartitioner, Partitioner[KeyOrName], org.apache.spark.Partitioner, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new AdaptiveLevelingPartitioner(pattern: AdaptivePattern, fallbackPartitioner: Option[PartitionNamePartitioner] = None)

    pattern

    The adaptive leveling pattern that controls the partitioning.

    fallbackPartitioner

    The optional partitioner used for non-aggregated keys. If undefined, non-aggregated keys are uniformly distributed over the existing partitions.

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. val fallbackPartitioner: Option[PartitionNamePartitioner]
  8. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. def getPartition(key: Any): Int

    Implements the Spark org.apache.spark.Partitioner interface by forwarding calls to getPartitionForKey.

    If the object passed is not of type K or cannot be converted to it (e.g. java.lang.Integer to Int), an IllegalArgumentException is thrown. This indicates a bug and should not happen, because the processing library uses a Partitioner of type K only for RDDs that it knows to have keys of type K.

    Essentially, this function is a thin forwarder to getPartitionForKey; its purpose is to give the processing library a type-safe Partitioner.

    key

    the key for which the partition must be calculated

    returns

    the partition, identified by one scala.Int, in which the key should be located

    Definition Classes
    Partitioner → Partitioner
    Note

    This method is called by Spark and should not be called from developer code, as it is not type-safe.
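The forwarding pattern described above can be illustrated with a minimal, self-contained sketch. The `KeyOrName` case class and `SketchPartitioner` below are hypothetical stand-ins, not the library's implementation:

```scala
// Illustrative key type standing in for the library's KeyOrName.
final case class KeyOrName(name: String)

// Sketch of a partitioner with a type-safe entry point plus an
// untyped Spark-facing forwarder.
class SketchPartitioner(val numPartitions: Int) {
  // Type-safe entry point: this is what children implement.
  def getPartitionForKey(key: KeyOrName): Int =
    math.abs(key.name.hashCode % numPartitions)

  // Untyped Spark entry point: forwards after a runtime type check,
  // rejecting anything that is not a KeyOrName.
  def getPartition(key: Any): Int = key match {
    case k: KeyOrName => getPartitionForKey(k)
    case other =>
      throw new IllegalArgumentException(
        s"Expected KeyOrName, got ${other.getClass.getName}")
  }
}
```

The untyped overload exists only because Spark's Partitioner API takes `Any`; all application code should go through the typed method.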

  11. final def getPartitionForKey(key: KeyOrName): Int

    Gets the partition for a given key of type K. This is the function that must be implemented by children partitioners.

    key

    the key for which the partition must be calculated

    returns

    the partition, identified by one scala.Int, in which the key should be located

    Definition Classes
    PartitionNamePartitioner → Partitioner
  12. def getPartitionForName(name: Name): Int
  13. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  14. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  15. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  16. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  17. def numPartitions: Int
    Definition Classes
    AdaptiveLevelingPartitioner → Partitioner
  18. val pattern: AdaptivePattern
  19. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  20. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  21. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
