
implicit final class KeyValueOpsWrapper[K, V] extends AnyVal

Adds some important operations to a key/value-based RDD.

K

the type of the keys

V

the type of the values

Linear Supertypes
AnyVal, Any

Instance Constructors

  1. new KeyValueOpsWrapper(rdd: RDD[(K, V)])

    rdd

    the RDD to process
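Because the wrapper is an implicit value class, its methods become available directly on the wrapped collection once the implicit is in scope. The sketch below illustrates that pattern on a local Seq instead of an RDD, so it runs without Spark. The names KeyValueOpsSketch and LocalKeyValueOps are hypothetical stand-ins, and the two methods only mimic the documented behaviour of stripValues and distinctKeys; they are not the library's implementation.

```scala
object KeyValueOpsSketch {
  // Hypothetical stand-in for KeyValueOpsWrapper, wrapping a local Seq
  // instead of an RDD. Extending AnyVal avoids allocating a wrapper object.
  implicit final class LocalKeyValueOps[K, V](val pairs: Seq[(K, V)]) extends AnyVal {

    // Local analogue of stripValues: keep every key, replace each value with Unit.
    def stripValues(): Seq[(K, Unit)] =
      pairs.map { case (k, _) => (k, ()) }

    // Local analogue of distinctKeys: keys only, duplicates removed.
    def distinctKeys(): Seq[(K, Unit)] =
      pairs.map(_._1).distinct.map(k => (k, ()))
  }
}
```

With `import KeyValueOpsSketch._` in scope, a call such as `Seq("a" -> 1, "a" -> 2).distinctKeys()` resolves through the implicit class, in the same way the documented methods are expected to resolve on an RDD[(K, V)].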

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    Any
  2. final def ##(): Int
    Definition Classes
    Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def distinctByKey(partitioner: Partitioner)(implicit kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, V)]

    Removes duplicate elements from the subject rdd efficiently. The operation is local on the workers; a shuffle is applied only if rdd is not partitioned or is partitioned differently from the specified partitioner.

    partitioner

    the partitioner applied to the returned RDD

    returns

    a new RDD with no duplicate elements

  6. def distinctByKey()(implicit kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, V)]

    Removes duplicate elements from the subject rdd efficiently. The operation is local on the workers; a shuffle is applied only if rdd is not partitioned.

    returns

    a new RDD with no duplicate elements

  7. def distinctKeys(partitioner: Partitioner)(implicit kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, Unit)]

    Efficiently calculates an RDD containing the keys of the subject rdd, with values stripped and duplicates removed.

    partitioner

    the partitioner applied to the returned RDD

    returns

    the keys of rdd without values and duplicates

  8. def distinctKeys()(implicit kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, Unit)]

    Efficiently calculates an RDD containing the keys of the subject rdd, with values stripped and duplicates removed.

    returns

    the keys of rdd without values and duplicates

  9. def flatMapKeysAndRepartition[K2](f: (K) ⇒ Iterable[K2], partitioner: Partitioner)(implicit arg0: ClassTag[K2], kt: ClassTag[K], vt: ClassTag[V]): RDD[(K2, V)]

    Flat maps the keys of a pair RDD, possibly duplicating the value.

    K2

    The type of the resulting key.

    f

    The function to transform the key.

    partitioner

    The partitioner applied to the returned RDD.

    returns

    An RDD with the same values, but new keys.

  10. def getClass(): Class[_ <: AnyVal]
    Definition Classes
    AnyVal → Any
  11. def intersectKeys[X](other: RDD[(K, X)], partitioner: Partitioner)(implicit arg0: ClassTag[X], kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, Unit)]

    Calculates the intersection of the keys of two RDDs. Distinct keys are returned; values are stripped.

    X

    any type; it is not used

    other

    the RDD whose keys are intersected with those of the subject rdd

    partitioner

    the partitioner applied to the returned RDD

    returns

    the keys of rdd intersected with those of the other RDD, without values and duplicates

  12. def intersectKeys[X](other: RDD[(K, X)])(implicit arg0: ClassTag[X], kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, Unit)]

    Calculates the intersection of the keys of two RDDs. Distinct keys are returned; values are stripped.

    X

    any type; it is not used

    other

    the RDD whose keys are intersected with those of the subject rdd

    returns

    the keys of rdd intersected with those of the other RDD, without values and duplicates

  13. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  14. def mapKeysAndRepartition[K2](f: (K) ⇒ K2, partitioner: Partitioner)(implicit arg0: ClassTag[K2], kt: ClassTag[K], vt: ClassTag[V]): RDD[(K2, V)]

    Maps the keys of a pair RDD, without changing the value.

    K2

    The type of the resulting key.

    f

    The function to transform the key.

    partitioner

    The partitioner applied to the returned RDD.

    returns

    An RDD with the same values, but new keys.

  15. def replaceAndDeleteByKey[X](replace: RDD[(K, V)], delete: RDD[(K, X)], partitioner: Partitioner)(implicit arg0: ClassTag[X], kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, V)]

    Replaces and deletes elements of rdd by key in a single, efficient operation. Comparison is done by key: if a key is being replaced/deleted, every element with that key already present in rdd is replaced/deleted as well.

    X

    any type; it is not used

    replace

    the keys and values of new elements that should replace existing elements in rdd

    delete

    the keys of elements that should be deleted from rdd

    partitioner

    the partitioner applied to the returned RDD

    returns

    an RDD with its elements replaced/deleted by key.

  16. def replaceAndDeleteByKey[X](replace: RDD[(K, V)], delete: RDD[(K, X)])(implicit arg0: ClassTag[X], kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, V)]

    Replaces and deletes elements of rdd by key in a single, efficient operation. Comparison is done by key: if a key is being replaced/deleted, every element with that key already present in rdd is replaced/deleted as well.

    X

    any type; it is not used

    replace

    the keys and values of new elements that should replace existing elements in rdd

    delete

    the keys of elements that should be deleted from rdd

    returns

    an RDD with its elements replaced/deleted by key.

  17. def replaceByKey(replace: RDD[(K, V)], partitioner: Partitioner)(implicit kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, V)]

    Replaces elements of rdd by key in a single, efficient operation. Comparison is done by key: if a key is being replaced, every element with that key already present in rdd is replaced as well.

    replace

    Keys and values of new elements that should replace existing elements in rdd

    partitioner

    The partitioner applied to the returned RDD

    returns

    An RDD with its elements replaced by key.

  18. def replaceByKey(replace: RDD[(K, V)])(implicit kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, V)]

    Replaces elements of rdd by key in a single, efficient operation. Comparison is done by key: if a key is being replaced, every element with that key already present in rdd is replaced as well.

    replace

    the keys and values of new elements that should replace existing elements in rdd

    returns

    an RDD with its elements replaced by key.

  19. def stripValues()(implicit kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, Unit)]

    Strips values.

    In many computations the values associated with keys in RDDs are not needed; in that case the value part of the key-value RDD can be set to scala.Unit.

    returns

    an RDD having the same keys as the input one and scala.Unit as values

    Note

    input partitioning is kept

  20. def toString(): String
    Definition Classes
    Any
  21. def unionKeys[X](other: RDD[(K, X)], partitioner: Partitioner)(implicit arg0: ClassTag[X], kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, Unit)]

    Calculates an RDD with the union of the keys of the subject rdd and of another RDD. No duplicate keys are returned.

    X

    any type; it is not used

    other

    the other RDD to perform the union with

    partitioner

    the partitioner applied to the returned RDD

    returns

    the keys of rdd and the other RDD without values and duplicates

  22. def unionKeys[X](other: RDD[(K, X)])(implicit arg0: ClassTag[X], kt: ClassTag[K], vt: ClassTag[V]): RDD[(K, Unit)]

    Calculates an RDD with the union of the keys of the subject rdd and of another RDD. No duplicate keys are returned.

    X

    any type; it is not used

    other

    the other RDD to perform the union with

    returns

    the keys of rdd and the other RDD without values and duplicates
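The by-key semantics described for replaceByKey and replaceAndDeleteByKey can be illustrated on local collections. The following is a minimal sketch over Seq rather than RDD, so it ignores partitioning and shuffling entirely. ReplaceDeleteSketch and the `base` parameter are hypothetical names. Two points are assumptions, since the documentation above does not settle them: replacement elements are added even when their key was not present before, and a key listed in both `replace` and `delete` ends up with the replacement elements.

```scala
object ReplaceDeleteSketch {

  // Local model of replaceAndDeleteByKey on Seq instead of RDD.
  // Assumption: when a key occurs in both `replace` and `delete`, this sketch
  // keeps the replacement elements; the documentation does not say which wins.
  def replaceAndDeleteByKey[K, V, X](
      base: Seq[(K, V)],
      replace: Seq[(K, V)],
      delete: Seq[(K, X)]): Seq[(K, V)] = {
    // Every key mentioned in either input evicts all existing elements with that key.
    val evicted: Set[K] = (replace.map(_._1) ++ delete.map(_._1)).toSet
    // Keep the untouched elements, then append the replacement elements.
    base.filterNot { case (k, _) => evicted.contains(k) } ++ replace
  }

  // In this model, replaceByKey is the special case with nothing to delete.
  def replaceByKey[K, V](base: Seq[(K, V)], replace: Seq[(K, V)]): Seq[(K, V)] =
    replaceAndDeleteByKey(base, replace, Seq.empty[(K, Unit)])
}
```

For instance, replacing key "a" and deleting key "b" in a sequence that holds two elements under "a" removes both old "a" elements, which matches the statement that every element with a replaced or deleted key is affected.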
