Packages

class Publisher extends AnyRef

Publishes payloads, in the form of an RDD of com.here.platform.data.processing.catalog.Partition.Key and com.here.platform.data.processing.blobstore.Payload, to the Blob API and returns RDD of com.here.platform.data.processing.catalog.Partition.Commit to be committed to the Metadata API.

Note

both the full snapshot and the incremental one are implemented here the difference is mainly in the joining strategy adopted

,

metadata related to the published payloads need to be committed in a subsequent operation

Linear Supertypes
AnyRef, Any
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. Publisher
  2. AnyRef
  3. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Instance Constructors

  1. new Publisher(uploader: Uploader, retriever: Retriever, numThreads: Int, uniquePartitionLimitInBytes: Int = 0, statistics: Option[Map[Id, PublisherStats]] = None)

    Creates a new publisher.

    Creates a new publisher.

    uploader

    The com.here.platform.data.processing.blobstore.Uploader object.

    retriever

    The com.here.platform.data.processing.blobstore.Retriever object. Needed when, during an incremental snapshot in conjunction with a rebase scenario, an unchanged partition is changed in the base catalog and must be overridden again, but the payload is only available in the base version of the output catalog. See com.here.platform.data.processing.publisher.PublishingAction.RebaseUpload.

    numThreads

    The number of threads (per Spark partition) used to upload the data.

    uniquePartitionLimitInBytes

    the commit partition of payloads of size smaller or equal to this will be cached (locally to the Spark partition). In case a layer has many identical partitions, which would end up having the same data handle, a cache is faster than gracefully handling a large number of conflict exceptions. 0 means only the empty payload will be cached, and it is the recommended setting in case of layers that cannot have the same content in different partitions.

    statistics

    The optional statistics counters.

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @IntrinsicCandidate()
  6. val dataPublisher: DataPublisher
  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @IntrinsicCandidate()
  10. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @IntrinsicCandidate()
  11. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  12. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  13. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @IntrinsicCandidate()
  14. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @IntrinsicCandidate()
  15. def publishFullSnapshot(current: RDD[(Key, Meta)], candidates: RDD[(Key, Option[Payload])], baseChanges: Option[RDD[(Key, Change)]] = None): RDD[(Key, Commit)]

    Publishes a full snapshot of data.

    Publishes a full snapshot of data.

    current

    The RDD containing the catalog's current status.

    candidates

    The RDD containing the candidate operations to upload.

    baseChanges

    Optional RDD containing the changes inherited from the base catalog of all composite layers during a rebase. Should be defined if and only if the catalog has composite layers and if the dependency on the base catalog is being updated (rebase).

    returns

    the metadata RDD that were already published but not yet committed.

    Note

    Candidate operations are potential operations that can be done, depending on the status of the catalog. These operations depend on whether there is duplicated content or not, as duplicated content do not have to be published. This function deletes content in the current RDD that is not present in the candidates' RDD.

  16. def publishIncrementalSnapshot(current: RDD[(Key, Meta)], candidates: RDD[(Key, Option[Payload])], baseChanges: Option[RDD[(Key, Change)]] = None): RDD[(Key, Commit)]

    Publishes an increment snapshot of data.

    Publishes an increment snapshot of data.

    current

    The RDD containing the catalog's current status.

    candidates

    The RDD containing the candidate operations to upload.

    baseChanges

    Optional RDD containing the changes inherited from the base catalog of all composite layers during a rebase. Should be defined if and only if the catalog has composite layers and if the dependency on the base catalog is being updated (rebase).

    returns

    the metadata RDD that were already published but not yet committed.

    Note

    Candidate operations are potential operations that can be done, depending on the status of the catalog. These operations depend on whether there is duplicated content or not, as duplicated content do not have to be published.

  17. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  18. def toString(): String
    Definition Classes
    AnyRef → Any
  19. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  20. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  21. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated

Inherited from AnyRef

Inherited from Any

Ungrouped