I had an interesting discussion with a friend who is a Haskell developer.

Onur: Hey. I have just written a function but I feel like it is more complicated than it should be. I’m trying to dedup a collection of maps based on two keys. Here is my version that I am not happy with:

(defn- dedup-by-dorry-and-brogy
  [records]
  (vec
    (reduce
      (fn [dedupped-records
           {:keys [dorry brogy] :as record}]
        (if (some
              #(and
                 (= dorry (:dorry %))
                 (= brogy (:brogy %)))
              dedupped-records)
          dedupped-records
          (conj dedupped-records record)))
      '()
      ;; reverse to keep the last occurrence
      (reverse records))))

Berk: Hmm… I’ll send you how I’d write it in Haskell.

Onur: Well, I just thought of a better version while explaining the problem to you… I guess sometimes I should just explain the problem out loud to myself. I can use group-by to group the records by the two keys, then map last over the groups. group-by keeps the records within each group in their original order, so last picks the latest occurrence.

Onur: Here it is:

(defn- dedup-by-dorry-and-brogy
  [records]
  (->> records
       (group-by #(select-keys % [:dorry :brogy]))
       vals
       (map last)
       (into [])))
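
As an aside, here is a small sketch of what the group-by version does on some made-up data. The sample records and the :id key are assumptions added only to tell the duplicates apart; they are not from the real code.

```clojure
;; Hypothetical sample data; :id only exists to tell the records apart.
(def records
  [{:dorry 1 :brogy "a" :id 1}
   {:dorry 2 :brogy "b" :id 2}
   {:dorry 1 :brogy "a" :id 3}])

;; A map from each {:dorry .. :brogy ..} key to the vector of records
;; sharing it; inside each vector the records keep their input order.
(def grouped
  (group-by #(select-keys % [:dorry :brogy]) records))

;; `last` on each group therefore picks the latest duplicate.
(def deduped
  (->> grouped
       vals
       (map last)
       (into [])))
```

Here the record with :id 1 is dropped in favour of the one with :id 3. One caveat: group-by returns an ordinary map, so the order of the groups themselves (and hence of the output) should probably not be relied on once there are many distinct keys.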

Berk: Yeah, this is indeed better… Though I’d write it like this in Haskell:

dedup :: [Rec] -> [Rec]
dedup = nubBy ((==) `on` (dorry &&& brogy)) . reverse

Onur: What does nubBy do?

Berk: nub removes duplicates from a list, keeping only the first occurrence of each element. The nubBy function behaves just like nub, except it uses a user-supplied equality predicate instead of the overloaded (==) function. Here is the documentation.

Onur: Hmm… We have distinct for nub in Clojure. I wonder if we have a distinct-by.

Berk: In Haskell, for functions like nub there’s generally a -by version too.

Onur: Similarly, Clojure has sort and sort-by too. But it seems there is no distinct-by yet, as this ticket is still open.

Onur: Though, there is this awesome de facto helper library. It’s already in our project dependencies and has distinct-by in it. I’ll just use it from there.

Onur: Here is what it looks like.

(defn- dedup-by-dorry-and-brogy
  [records]
  (->> records
       ;; reverse to retain last record
       reverse
       (distinct-by #(select-keys % [:dorry :brogy]))))
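
For readers without that library at hand, the behaviour relied on here can be sketched roughly as follows. This distinct-by is a hypothetical stand-in written for illustration, not the library's actual source, and the sample records are made up.

```clojure
;; Hypothetical stand-in for the library function, for illustration only.
(defn distinct-by
  "Returns a vector of the items in coll, keeping only the first item
  seen for each distinct value of (f item)."
  [f coll]
  (first
    (reduce
      (fn [[kept seen] item]
        (let [k (f item)]
          (if (contains? seen k)
            [kept seen]                          ; key already seen: skip item
            [(conj kept item) (conj seen k)])))  ; new key: keep item
      [[] #{}]
      coll)))

;; Made-up sample data; :id only tells the duplicates apart.
(def records
  [{:dorry 1 :brogy "a" :id 1}
   {:dorry 2 :brogy "b" :id 2}
   {:dorry 1 :brogy "a" :id 3}])

(def deduped
  (->> records
       reverse   ; so that "first seen" means "last in the input"
       (distinct-by #(select-keys % [:dorry :brogy]))))
```

With this sketch, deduped holds the records with :id 3 and :id 2, in that order. The result comes out reversed relative to the input, so a final reverse would be needed if the original order matters.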

Onur: Well, I can use juxt too. Where map applies one function to each element of a collection, juxt takes several functions and returns a single function that applies each of them, in order, to its arguments and returns the results as a vector. Here is the updated version.

(defn- dedup-by-dorry-and-brogy
  [records]
  (->> records
       ;; reverse to retain last record
       reverse
       (distinct-by (juxt :dorry :brogy))))
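
To make juxt a bit more concrete, here is a tiny sketch of the key function on its own. The sample record and the name dedup-key are made up for illustration.

```clojure
;; Made-up sample record.
(def record {:dorry 1 :brogy "a" :id 3})

;; juxt combines several functions into one that returns a vector of results.
(def dedup-key (juxt :dorry :brogy))

(dedup-key record)                    ;; => [1 "a"]
(select-keys record [:dorry :brogy])  ;; => {:dorry 1, :brogy "a"}
((juxt inc dec str) 10)               ;; => [11 9 "10"]
```

Both the vector from juxt and the map from select-keys work as a dedup key, since Clojure compares them by value; the juxt version is just shorter.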

Berk: This is pretty cool! You are brilliant, Onur, and also much more handsome than I am.

Onur: Thanks.

Berk: Though, why not use comp instead of the threading macro? I think this reads more naturally:

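(def dedup-by-dorry-and-brogy
  (comp
    ;; partial is used because distinct-by with only the key function does not
    ;; give a function of the collection (in medley, for instance, that arity
    ;; returns a transducer)
    (partial distinct-by (juxt :dorry :brogy))
    reverse))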

Onur: In this example, comp can indeed be used too, as both functions take one argument. But imagine you want to take the last 10 elements of the collection. With the threading macro, you can just add (take 10) at the end. If it were comp, you’d have to create a partial function…

Berk: Right, Haskell automatically curries so partial is not needed.
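
The difference can be sketched side by side. The distinct-by below is again a hypothetical stand-in for the library function, and the names last-10-threaded and last-10-composed are made up for this comparison.

```clojure
;; Hypothetical stand-in for the library's two-argument distinct-by.
(defn distinct-by [f coll]
  (first
    (reduce (fn [[kept seen] item]
              (let [k (f item)]
                (if (contains? seen k)
                  [kept seen]
                  [(conj kept item) (conj seen k)])))
            [[] #{}]
            coll)))

;; Threading macro: each step receives the collection as its last argument,
;; so adding (take 10) is a one-line change.
(defn last-10-threaded [records]
  (->> records
       reverse
       (distinct-by (juxt :dorry :brogy))
       (take 10)))

;; comp: every step must already be a one-argument function, so the extra
;; arguments have to be fixed up front with partial. Steps read bottom-up.
(def last-10-composed
  (comp (partial take 10)
        (partial distinct-by (juxt :dorry :brogy))
        reverse))
```

Both versions should return the same records for the same input; the comp version simply spells out the partial application that Haskell's currying gives for free.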

These were the last moments we enjoyed our lives before losing 3 games in a row in Dota.