package deltasets
Type Members
- trait BaseSet extends AnyRef
The common base type for PublishedSet and DeltaSet.
- class BaseSetIdAssigner extends AnyRef
- final case class CanDetermine[+A](x: A) extends Determine[A] with Product with Serializable
A value that can be determined.
- sealed trait Changes[+K, +V] extends AnyRef
Represents the changes between two KeyValues. There may be either NoChanges or SomeChanges; SomeChanges are represented by replaced key-values and deleted keys.
- K
The type of the key
- V
The type of the value
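The idea of describing changes as replaced key-values plus deleted keys can be sketched with plain Scala collections. This is only an illustrative model: the names Diff, NoDiff, SomeDiff and diff are hypothetical and are not part of the library, and collapsing an empty result into NoDiff is an assumption of this sketch.

```scala
// Self-contained model of the "changes" idea; these are NOT the library's types.
object ChangesSketch {
  sealed trait Diff[+K, +V]
  case object NoDiff extends Diff[Nothing, Nothing]
  final case class SomeDiff[K, V](replaces: Map[K, V], deletes: Set[K]) extends Diff[K, V]

  /** Compares a previous and a current key-value snapshot. */
  def diff[K, V](previous: Map[K, V], current: Map[K, V]): Diff[K, V] = {
    // Keys that are new, or whose value differs from the previous snapshot.
    val replaces = current.filter { case (k, v) => !previous.get(k).contains(v) }
    // Keys that existed before but are absent now.
    val deletes = previous.keySet -- current.keySet
    // Assumption of this sketch: an empty result is reported as NoDiff.
    if (replaces.isEmpty && deletes.isEmpty) NoDiff else SomeDiff(replaces, deletes)
  }

  def main(args: Array[String]): Unit = {
    val before = Map("a" -> 1, "b" -> 2, "c" -> 3)
    val after  = Map("a" -> 1, "b" -> 20, "d" -> 4)
    println(diff(before, after))
    println(diff(before, before))
  }
}
```

In this model an added key and a changed key are treated the same way: both appear in replaces with their new value.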
- trait DeltaContext extends AnyRef
Provides access to a set of common resources which a user of DeltaSets may require, that can be used by com.here.platform.data.processing.driver.DeltaSetup.
- trait DeltaSet[K, V] extends BaseSet
The DeltaSet is the main processing abstraction to implement custom processing patterns.
Pipeline developers may implement their processing logic by applying transformations to DeltaSets.
The core of the incremental processing framework is the ability to expose not only the contents of a DeltaSet but also what has changed in those contents since a previous reference run. Transformed DeltaSets may use this information to expose only what has changed in turn, enabling differential processing across any DAG of transformations.
Each DeltaSet is characterized by the type of keys and values it contains. The format of the input data needed to produce the output data, when applicable, is not specified here and depends on the DeltaSet implementation.
Data in a DeltaSet is not only strongly typed but also strongly partitioned.
The provided DeltaContext exposes the input catalog as sources. When integrating with the com.here.platform.data.processing.driver.Driver via the com.here.platform.data.processing.driver.DeltaSetup interface and the related com.here.platform.data.processing.driver.DeltaDriverTaskBuilder, or directly using the com.here.platform.data.processing.driver.DeltaDriverTask, the result of processing must be exposed as a sink of fixed types.
DeltaSets are immutable, distributed, and de-duplicated.
DeltaSet transformations are incremental and lazy.
- case class DeltaSetConfig(intermediateStorageLevel: StorageLevel, validationLevel: DeltaSetConfig.ValidationLevel.Value, threads: Int, sorting: Boolean, incremental: Boolean, forceStateless: Boolean) extends Product with Serializable
- sealed trait Determine[+A] extends AnyRef
Represents a value that may or may not be determined.
Several Determine values can be combined using Determine.zip and Determine.reduce.
- A
The type of the wrapped value.
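How values that may or may not be determined could combine is sketched below with a small stand-alone model. The exact signatures and semantics of the library's zip and reduce are not shown on this page, so the types Det, Known and Unknown, and the rule that a combination is determined only when every input is determined, are assumptions made for illustration.

```scala
// Stand-alone model; Det, Known and Unknown are hypothetical names, not library types.
object DetermineSketch {
  sealed trait Det[+A] {
    /** Pairs two values; the result is determined only if both sides are. */
    def zip[B](other: Det[B]): Det[(A, B)] = (this, other) match {
      case (Known(a), Known(b)) => Known((a, b))
      case _                    => Unknown
    }
  }
  final case class Known[+A](x: A) extends Det[A]
  case object Unknown extends Det[Nothing]

  /** Combines all values with f; Unknown if the input is empty or any element is Unknown. */
  def reduce[A](values: Seq[Det[A]])(f: (A, A) => A): Det[A] =
    values.reduceOption[Det[A]] { (l, r) =>
      l.zip(r) match {
        case Known((a, b)) => Known(f(a, b))
        case Unknown       => Unknown
      }
    }.getOrElse(Unknown)

  def main(args: Array[String]): Unit = {
    println(Known(1).zip(Known("a")))
    println(reduce(Seq[Det[Int]](Known(1), Known(2), Known(3)))(_ + _))
    println(reduce(Seq[Det[Int]](Known(1), Unknown))(_ + _))
  }
}
```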
- final case class KeyValues[K, V](rdd: RDD[(K, V)], partitioner: Partitioner[K]) extends Product with Serializable
Uses a Spark RDD to store key-value pairs. Compared to a normal RDD, the use of this class asserts and, where possible, ensures that two conditions are met:
1. There are no duplicate keys.
2. The RDD is partitioned with the given partitioner.
- K
The type of the keys.
- V
The type of the values.
- rdd
The Spark RDD storing a set of key-value pairs. Must be partitioned using partitioner.
- partitioner
The partitioner.
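The two conditions can be made concrete without Spark by modelling the data as a vector of partitions. This is a minimal sketch under stated assumptions: HashPart and isValid are hypothetical helpers, and the hash-modulo placement rule stands in for whatever the library's Partitioner actually does.

```scala
// Spark-free model of the two KeyValues conditions; all names here are hypothetical.
object KeyValuesSketch {
  /** Stand-in for a partitioner: maps a key to a partition index by hash modulo. */
  final case class HashPart[K](numPartitions: Int) {
    def partitionOf(key: K): Int = Math.floorMod(key.##, numPartitions)
  }

  /** Checks both conditions for data laid out as partitions(i) = pairs stored in partition i. */
  def isValid[K, V](partitions: Vector[Seq[(K, V)]], partitioner: HashPart[K]): Boolean = {
    val keys = partitions.flatten.map(_._1)
    // Condition 1: no key occurs more than once across all partitions.
    val noDuplicates = keys.distinct.size == keys.size
    // Condition 2: every key is stored in the partition the partitioner assigns to it.
    val wellPartitioned = partitions.size == partitioner.numPartitions &&
      partitions.zipWithIndex.forall { case (pairs, i) =>
        pairs.forall { case (k, _) => partitioner.partitionOf(k) == i }
      }
    noDuplicates && wellPartitioned
  }

  def main(args: Array[String]): Unit = {
    val p = HashPart[Int](2)
    println(isValid(Vector(Seq(0 -> "a", 2 -> "b"), Seq(1 -> "c")), p))
    println(isValid(Vector(Seq(0 -> "a", 1 -> "b"), Seq(3 -> "c")), p))
  }
}
```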
- final case class Keys[K](rdd: RDD[(K, Unit)], partitioner: Partitioner[K]) extends Product with Serializable
Uses a Spark RDD to store a set of keys. Compared to a normal RDD, the use of this class asserts and, where possible, ensures that two conditions are met:
1. There are no duplicate keys.
2. The RDD is partitioned with the given partitioner.
- K
The type of the keys.
- rdd
The Spark RDD storing a set of keys. Must be partitioned using partitioner.
- partitioner
The partitioner.
- class ManyToMany[S, T] extends (S) ⇒ Iterable[T] with Serializable
Represents an m-to-n relation by pairing the function mapFn with its inverse function inverseFn. If the function represented by this class is applied to a value for which inverseFn is not the inverse of mapFn, an exception is thrown.
Note that inverseFn can be called by the Data Processing Library on values that are not produced by mapFn. Define inverseFn as a partial function to correctly restrict its domain to the set of keys for which an inverse can be defined.
- S
The domain of the function.
- T
The co-domain of the function.
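The consistency check described for m-to-n relations can be sketched as follows. CheckedManyToMany is a hypothetical class written for this illustration; how the library constructs a ManyToMany and which exception type it throws are not shown on this page, so require (which throws IllegalArgumentException) is only a stand-in.

```scala
// Hypothetical model of an m-to-n relation with a checked inverse; not the library's class.
object ManyToManySketch {
  final class CheckedManyToMany[S, T](
      mapFn: S => Iterable[T],
      inverseFn: PartialFunction[T, Iterable[S]]
  ) extends (S => Iterable[T]) {
    def apply(s: S): Iterable[T] = {
      val ts = mapFn(s)
      ts.foreach { t =>
        // Outside the inverse's domain we treat the inverse image as empty.
        val back = inverseFn.lift(t).getOrElse(Iterable.empty[S])
        require(back.exists(_ == s), s"inverseFn($t) does not lead back to $s")
      }
      ts
    }
  }

  def main(args: Array[String]): Unit = {
    // Each number belongs to two overlapping buckets; each bucket covers 20 numbers.
    val rel = new CheckedManyToMany[Int, Int](
      n => Seq(n / 10, n / 10 + 1),
      { case b if b >= 0 => ((b - 1) * 10) to (b * 10 + 9) }
    )
    println(rel(25))
  }
}
```

Declaring the inverse as a PartialFunction lets callers probe it safely with lift on values that the forward function never produces.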
- class ManyToOne[S, T] extends (S) ⇒ T with Serializable
Represents an n-to-1 relation by pairing the function mapFn with its inverse function inverseFn. If the function represented by this class is applied to a value for which inverseFn is not the inverse of mapFn, an exception is thrown.
Note that inverseFn can be called by the Data Processing Library on values that are not produced by mapFn. Define inverseFn as a partial function to correctly restrict its domain to the set of keys for which an inverse can be defined.
- S
The domain of the function.
- T
The co-domain of the function.
- class OneToMany[S, T] extends (S) ⇒ Iterable[T] with Serializable
Represents a 1-to-n relation by pairing a function flatMapFn with its inverse function inverseFn. If the function represented by this class is applied to a value for which inverseFn is not the inverse of flatMapFn, an exception is thrown.
Note that inverseFn can be called by the Data Processing Library on values that are not produced by flatMapFn. Define inverseFn as a partial function to correctly restrict its domain to the set of keys for which an inverse can be defined.
- S
The domain of the function.
- T
The co-domain of the function.
- class OneToOne[S, T] extends (S) ⇒ T with Serializable
Represents a 1-to-1 relation by pairing a function mapFn with its inverse function inverseFn. If the function represented by this class is applied to a value for which inverseFn is not the inverse of mapFn, an exception is thrown.
Note that inverseFn can be called by the Data Processing Library on values that are not produced by mapFn. Define inverseFn as a partial function to correctly restrict its domain to the set of keys for which an inverse can be defined.
- S
The domain of the function.
- T
The co-domain of the function.
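For the 1-to-1 case, the round-trip check and the advice to define the inverse as a partial function can be sketched like this. CheckedOneToOne and invert are hypothetical names for this illustration, the library's own constructor and exception type are not shown on this page, and require is used only as a stand-in for the failure.

```scala
// Hypothetical model of a 1-to-1 relation with a checked, partial inverse.
object OneToOneSketch {
  final class CheckedOneToOne[S, T](mapFn: S => T, inverseFn: PartialFunction[T, S])
      extends (S => T) {
    def apply(s: S): T = {
      val t = mapFn(s)
      // Round-trip check: inverseFn must be defined at t and map it back to s.
      require(inverseFn.lift(t).contains(s), s"inverseFn is not the inverse of mapFn at $s")
      t
    }

    /** Applies the inverse where it is defined; values outside its domain yield None. */
    def invert(t: T): Option[S] = inverseFn.lift(t)
  }

  def main(args: Array[String]): Unit = {
    val Tile = "tile-(\\d+)".r
    // The inverse is only defined for strings of the form "tile-<digits>".
    val f = new CheckedOneToOne[Int, String](n => s"tile-$n", { case Tile(n) => n.toInt })
    println(f(7))
    println(f.invert("tile-7"))
    println(f.invert("not-a-tile"))
  }
}
```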
- case class PartMapperByLevel(levels: Set[Int]) extends PublishedPartMapper with Product with Serializable
A PublishedPartMapper that assigns each key to a publish part based on its com.here.platform.data.processing.catalog.Partition.Name's level. Typically used with com.here.platform.data.processing.catalog.Partition.HereTile keys, to publish each zoom level independently.
- levels
The set of levels.
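Assigning keys to publish parts by level can be pictured as a grouping step. The sketch below uses a hypothetical Tile key and a hypothetical assignParts helper; how the library treats keys whose level is not in the configured set is not stated on this page, so returning them separately is an assumption.

```scala
// Illustrative grouping by level; Tile and assignParts are hypothetical, not library API.
object PartByLevelSketch {
  /** Stand-in for a tile key that carries a zoom level. */
  final case class Tile(level: Int, id: Long)

  /** Returns one part per configured level, plus the keys whose level is not configured. */
  def assignParts(keys: Seq[Tile], levels: Set[Int]): (Map[Int, Seq[Tile]], Seq[Tile]) = {
    val (mapped, unmapped) = keys.partition(k => levels.contains(k.level))
    (mapped.groupBy(_.level), unmapped)
  }

  def main(args: Array[String]): Unit = {
    val keys = Seq(Tile(10, 1L), Tile(12, 2L), Tile(10, 3L), Tile(14, 4L))
    val (parts, rest) = assignParts(keys, Set(10, 12))
    println(parts)
    println(rest)
  }
}
```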
- sealed trait PartitioningStrategy[-K] extends AnyRef
Indicates, for a transformation that transforms keys of type K1 to keys of type K2, whether the transformation will preserve the partitioning of the input DeltaSet or whether the data must be repartitioned with a partitioner.
- K
the type of output keys.
- trait PublishedPart extends PublishedSetLike
The result of publishing a DeltaSet to blobstore. Unlike a PublishedSet, a PublishedPart corresponds to a single part of the output layers only.
- trait PublishedPartMapper extends Serializable
An object that specifies how the output keys are partitioned in multi-part publishing.
- trait PublishedSet extends PublishedSetLike
The PublishedSet is the result of publishing a DeltaSet to blobstore.
- trait PublishedSetLike extends BaseSet
Base trait for classes that represent the result of publishing a DeltaSet to blobstore.
- case class RequiresRepartitioning[K](partitioner: Partitioner[K]) extends PartitioningStrategy[K] with Product with Serializable
Indicates that a transformation will not preserve the partitioning of the input DeltaSet. This means that an input key may be transformed into an output key in a different Spark partition, and therefore the data must be repartitioned.
- K
the type of output keys.
- partitioner
The partitioner to apply after the transformation.
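Whether a key transformation keeps every key in its original partition can be checked directly on a small model. The sketch below assumes a hash-modulo partitioner over Int keys; partitionOf and preservesPartitioning are hypothetical helpers, not library functions.

```scala
// Model of partition preservation for a key transformation; helper names are hypothetical.
object PartitioningSketch {
  /** Stand-in partitioner: non-negative key modulo the partition count. */
  def partitionOf(key: Int, numPartitions: Int): Int = Math.floorMod(key, numPartitions)

  /** True if every output key lands in the same partition as its input key. */
  def preservesPartitioning(keys: Seq[Int], f: Int => Int, numPartitions: Int): Boolean =
    keys.forall(k => partitionOf(f(k), numPartitions) == partitionOf(k, numPartitions))

  def main(args: Array[String]): Unit = {
    val keys = 0 until 20
    // Shifting by the partition count keeps each key in its partition.
    println(preservesPartitioning(keys, _ + 4, 4))
    // Shifting by one moves every key to a neighbouring partition.
    println(preservesPartitioning(keys, _ + 1, 4))
  }
}
```

In terms of this page, a transformation like the first would qualify for PreservesPartitioning, while one like the second would need RequiresRepartitioning with a suitable partitioner.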
- trait ResolutionStrategy[-K, -V] extends AnyRef
Defines a strategy that determines how metadata should be resolved.
- K
the key type of the subject DeltaSet (the one transformed by mapValuesWithResolver)
- V
the value type of the subject DeltaSet (the one transformed by mapValuesWithResolver)
- trait Resolver extends AnyRef
Interface to resolve keys to metadata. Provided to a mapping function used with mapValuesWithResolver and backed by one or more ResolutionStrategy instances.
- final case class SomeChanges[K, V](replaces: KeyValues[K, V], deletes: Keys[K]) extends Changes[K, V] with Product with Serializable
Represents changes between two KeyValues, which may be non-empty (as opposed to NoChanges, which always represents empty changes). Contains all keys with new values (added keys or keys with changed values), and all deleted keys.
- K
The type of the keys.
- V
The type of the values.
- replaces
All keys added or changed, with their new value.
- deletes
All keys deleted.
- trait StateManager extends AnyRef
Interface for retrieving the state from within a DeltaSet implementation.
- trait Transformations extends AnyRef
Value Members
- object BaseSet
- object CannotDetermine extends Determine[Nothing] with Product with Serializable
A value that cannot be determined.
- object DeltaContext
- object DeltaSetConfig extends Serializable
- object Determine
- object ManyToMany extends Serializable
- object ManyToOne extends Serializable
- object NoChanges extends Changes[Nothing, Nothing] with Product with Serializable
A value that has not changed.
- object OneToMany extends Serializable
- object OneToOne extends Serializable
- object PreservesPartitioning extends PartitioningStrategy[Any] with Product with Serializable
Indicates that a transformation will preserve the partitioning of the input DeltaSet. This means that output keys will reside in the same Spark partition as the input keys they are derived from.
Using this setting improves the performance of the transformation, because the data does not have to be repartitioned.
- object ResolutionStrategy
Default ResolutionStrategies to use in mapValuesWithResolver.
- object SomeChanges extends Serializable