I had an interesting discussion with a friend who is a Haskell developer.
Onur: Hey. I have just written a function but I feel like it is more complicated than it should be. I’m trying to dedup a collection of maps based on two keys. Here is my version that I am not happy with:
(defn- dedup-by-dorry-and-brogy
  [records]
  (vec
   (reduce
    (fn [dedupped-records
         {:keys [dorry brogy] :as record}]
      (if (some
           #(and
             (= dorry (:dorry %))
             (= brogy (:brogy %)))
           dedupped-records)
        dedupped-records
        (conj dedupped-records record)))
    '()
    ;; reverse to keep the last occurrence
    (reverse records))))
Berk: Hmm… I’ll send you how I’d write it in Haskell.
Onur: Well, I just thought of a better version while explaining the problem to you… I guess sometimes I should just explain the problem out loud to myself. I can just use group-by, grouping the records by the two keys, and then map last over the groups to get the last ones. group-by keeps the records inside each group in their original order, so it should be fine.
Onur: Here it is:
(defn- dedup-by-dorry-and-brogy
  [records]
  (->> records
       (group-by #(select-keys % [:dorry :brogy]))
       vals
       (map last)
       (into [])))
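To make the behaviour concrete, here is a small sketch that runs the group-by version on made-up data. The sample records and the :id key are assumptions added for illustration only, and the function is declared with defn instead of defn- just to keep the snippet easy to try in a REPL.

```clojure
;; Hypothetical sample data; :id exists only to tell the records apart.
(def records
  [{:id 1 :dorry "a" :brogy "x"}
   {:id 2 :dorry "b" :brogy "y"}
   {:id 3 :dorry "a" :brogy "x"}])

(defn dedup-by-dorry-and-brogy
  [records]
  (->> records
       ;; one group per distinct combination of :dorry and :brogy
       (group-by #(select-keys % [:dorry :brogy]))
       vals
       ;; within a group, records keep their input order, so last is the latest
       (map last)
       (into [])))

(def deduped (dedup-by-dorry-and-brogy records))
;; Record 1 is dropped in favour of record 3; record 2 is kept.
```

Note that this only relies on the order inside each group. The order of the groups themselves comes from a map, so it is safer not to depend on it.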
Berk: Yeah, this is indeed better… Though I’d write it like this in Haskell:
dedup :: [Rec] -> [Rec]
dedup = nubBy ((==) `on` (dorry &&& brogy)) . reverse
Onur: What does nubBy do?
Berk: nub removes duplicates from a list, keeping only the first occurrence of each element. The nubBy function behaves just like nub, except it uses a user-supplied equality predicate instead of the overloaded (==) function. Here is the documentation.
Onur: Hmm… We have distinct for nub in Clojure. I wonder if we have a distinct-by.
Berk: In Haskell, for functions like nub there’s generally a -by version too.
Onur: Similarly, Clojure has sort and sort-by too. But it seems there is no distinct-by yet, as this ticket is still open.
Onur: Though, there is this awesome de facto helper library. It’s already in our project dependencies and has distinct-by in it. I’ll just use it from there.
Onur: Here is what it looks like.
(defn- dedup-by-dorry-and-brogy
  [records]
  (->> records
       ;; reverse to retain last record
       reverse
       (distinct-by #(select-keys % [:dorry :brogy]))))
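For readers without that helper library at hand, a distinct-by with the same two-argument shape can be sketched in plain Clojure. This is a hypothetical stand-in, not the library's actual source: it is eager and returns a vector, whereas a library version may well be lazy.

```clojure
;; Hypothetical stand-in for the library function, not its real implementation.
(defn distinct-by
  "Returns a vector of the items in coll, keeping the first item seen
  for each distinct value of (f item)."
  [f coll]
  (first
   (reduce (fn [[kept seen] item]
             (let [k (f item)]
               (if (contains? seen k)
                 [kept seen]                          ; key already seen, skip
                 [(conj kept item) (conj seen k)])))  ; first time, keep it
           [[] #{}]
           coll)))

(distinct-by :dorry [{:dorry 1 :id 1} {:dorry 2 :id 2} {:dorry 1 :id 3}])
;; => [{:dorry 1, :id 1} {:dorry 2, :id 2}]
```

Because the first item per key wins, the reverse step in the version above is what makes the last record survive.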
Onur: Well, I can use juxt too. It is a bit like map turned around: where map applies one function to each element of a collection, juxt takes multiple functions and returns a single function that applies each of them, in order, to the args it is passed and collects the results in a vector. Or something like that. Here is the updated version.
(defn- dedup-by-dorry-and-brogy
  [records]
  (->> records
       ;; reverse to retain last record
       reverse
       (distinct-by (juxt :dorry :brogy))))
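A quick REPL-style sketch of what juxt produces here may help. The sample record, including its :id key, is made up for illustration.

```clojure
;; Hypothetical sample record.
(def record {:dorry "a" :brogy "x" :id 3})

;; juxt builds a function that returns a vector with one result per function.
(def dedup-key ((juxt :dorry :brogy) record))
;; dedup-key => ["a" "x"]

;; Vectors compare by value, so two records with the same :dorry and :brogy
;; produce equal keys even when their other fields differ.
(def same-key?
  (= dedup-key
     ((juxt :dorry :brogy) {:dorry "a" :brogy "x" :id 7})))
;; same-key? => true
```

Compared with select-keys, the key is a short vector instead of a map, which is all distinct-by needs for comparing records.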
Berk: This is pretty cool! You are brilliant, Onur, and also much more handsome than I am.
Onur: Thanks.
Berk: Though, why not use comp instead of the threading macro? I think this reads more naturally:
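(def dedup-by-dorry-and-brogy
  (comp
   ;; Clojure functions are not curried, so the key function has to be
   ;; fixed with partial to get a one-argument function for comp
   (partial distinct-by (juxt :dorry :brogy))
   reverse))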
Onur: In this example, comp can indeed be used too, as both functions take one argument. But imagine you want to take the last 10 elements of the collection. With the threading macro, you can just add (take 10) at the end. With comp, you’d have to wrap take in partial to get a one-argument function…
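The difference can be sketched side by side. To keep the snippet free of dependencies, it uses the core distinct function in place of distinct-by and takes 3 elements instead of 10. The function names are made up for this example.

```clojure
;; Threading macro: adding a step is just one more form at the end.
(defn last-three-distinct
  [xs]
  (->> xs
       reverse
       distinct
       (take 3)))

;; comp: every step must be a one-argument function, so take needs partial.
;; Note that comp applies its functions from right to left.
(def last-three-distinct-comp
  (comp (partial take 3) distinct reverse))

(last-three-distinct [1 2 3 2 4 1 5])
;; => (5 1 4)
(last-three-distinct-comp [1 2 3 2 4 1 5])
;; => (5 1 4)
```

Both versions do the same work. The threaded one reads top to bottom in execution order, while the comp one reads in the opposite direction.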
Berk: Right, Haskell curries functions automatically, so partial is not needed.
These were the last moments we enjoyed our lives before losing 3 games in a row in Dota.