Models are made of features. Features define transformations for data: they tell Antelope what to maintain as in-memory state, and how that in-memory state relates to the parameters of the model.
Building on top of a set of simple state primitives, we demonstrate a flexible series of feature implementations of varying degrees of complexity. Antelope aims to provide an expressive DSL-like experience for shaping raw streams of input events into the signals directly useful to machine learning. In doing so, we are always guided by the principle that concepts that are easy to explain should also be easy to implement.
A simple feature: Overall Popularity
A basic yet very useful feature is Overall Popularity, a measure of the all-time popularity of an item. Overall popularity proves to be a powerful feature in the Best Buy product search challenge.
Here is overall popularity as implemented in Antelope:
class OverallPopularityFeature[T <: ScoringContext](ide: IdExtractor)(implicit val s: State[T])
  extends Feature[T] {
  import co.ifwe.antelope.util._
  val ct = s.counter(ide)
  override def score(implicit ctx: T) = {
    id: Long => ct(id) div ct()
  }
}
Full source: OverallPopularityFeature.scala
OverallPopularityFeature implements the Feature interface, which requires just one method:
trait Feature[T <: ScoringContext] {
  def score(implicit ctx: T): Long => Double
}
The score method takes as input a context of type T. In our product search example the context includes the search query, whereas in the dating recommendations example it contains the identifier of the user requesting the recommendation. score returns the scoring function for this feature: a mapping from candidate identifiers to numeric (Double) values.
For the overall popularity feature the score implementation is simple, returning the fraction of observations that match the provided id. To keep track of this OverallPopularityFeature maintains a counter, provided by the State object.
Let’s look at the internals of State to see what a counter represents:
trait Counter {
  def increment: PartialFunction[Event, Unit]
  def apply(): Long
}

trait Counter1[T1] extends Counter {
  def apply(k: T1): Long
  def toMap: Map[T1, Long]
}

trait Counter2[T1,T2] extends Counter {
  def apply(k: (T1,T2)): Long
  def apply(k: T1): Long
  def mapAt(k: T1): Map[T2, Long]
}
Antelope’s state implements a “hierarchical counter” concept, supporting not just a single total count of events, but also providing a breakdown of counts according to key, or, if provided with a pair of keys, by either the first key alone or the two keys in combination.
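To make the hierarchical counter idea concrete, here is a minimal standalone sketch (a hypothetical illustration, not Antelope's actual implementation): a single structure that answers the total count, the count for the first key alone, and the count for a key pair.

```scala
import scala.collection.mutable

// Hypothetical two-level hierarchical counter, in the spirit of Counter2.
class PairCounter[K1, K2] {
  private var total = 0L
  private val byK1 = mutable.Map[K1, Long]().withDefaultValue(0L)
  private val byPair = mutable.Map[(K1, K2), Long]().withDefaultValue(0L)

  // One event updates every level of the hierarchy at once.
  def increment(k1: K1, k2: K2): Unit = {
    total += 1
    byK1(k1) += 1
    byPair((k1, k2)) += 1
  }

  def apply(): Long = total                           // total count of events
  def apply(k1: K1): Long = byK1(k1)                  // count by first key alone
  def apply(k1: K1, k2: K2): Long = byPair((k1, k2))  // count by the key pair
}
```

For example, counting (term, documentId) pairs with such a structure yields in one pass the quantities that the term popularity and TF-IDF features below divide by one another.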
Antelope uses Scala’s partial functions to bind state variables to incoming data. The domain of a partial function does not necessarily include all values of the domain’s type, allowing us to attach a feature’s state updates only to certain events.
The IdExtractor parameter of the OverallPopularityFeature class allows us to pass in a partial function describing which event types to count; leveraging Scala’s composability gives us the flexibility to re-use feature definitions in different models.
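The mechanism can be sketched in a few lines. The event types and extractor below are hypothetical stand-ins for Antelope's event stream and IdExtractor; the point is that a partial function is defined only on the events we care about, and everything else is skipped.

```scala
// Hypothetical event types standing in for Antelope's event stream.
sealed trait Event
case class PurchaseEvent(itemId: Long) extends Event
case class ViewEvent(itemId: Long) extends Event

// An "id extractor" in the spirit of IdExtractor: a partial function
// whose domain covers only the events we want to count.
val purchaseIds: PartialFunction[Event, Long] = {
  case PurchaseEvent(id) => id
}

// collect applies the partial function where defined and drops the rest.
val events: Seq[Event] = Seq(PurchaseEvent(1), ViewEvent(2), PurchaseEvent(1))
val counted: Seq[Long] = events.collect(purchaseIds)
```

Because partial functions compose (orElse, andThen), the same feature definition can be bound to different event types in different models simply by passing a different extractor.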
In the Best Buy product search model, the id extracted from the event provides the key to a map of individual, per-key counters.
Additional examples from the Best Buy product search model
Recent popularity
Online systems that need to adapt to changing circumstances can benefit from a bit of amnesia. A simple and clean solution is a counter that decays exponentially. Antelope’s State framework allows this implementation:
val ct = s.decayingCounter(ide, Math.log(2) / halfLife)
We can store and update decaying counters efficiently by simply storing a (count,timestamp) pair and updating accordingly as new events come in.
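The (count, timestamp) representation can be sketched as follows. This is a hypothetical standalone illustration of the idea, not Antelope's decayingCounter: on each update we decay the stored count forward to the event time and then add one; on each read we decay forward to the query time.

```scala
// Sketch of an exponentially decaying counter stored as (count, timestamp).
// Assumes events arrive with non-decreasing timestamps.
class DecayingCounter(decayRate: Double) {
  private var count = 0.0
  private var lastT = 0L

  // Decay the stored count up to time t, then record one observation.
  def increment(t: Long): Unit = {
    count = count * math.exp(-decayRate * (t - lastT))
    lastT = t
    count += 1.0
  }

  // Read the decayed count as of time t.
  def apply(t: Long): Double =
    count * math.exp(-decayRate * (t - lastT))
}
```

With decayRate = log(2) / halfLife, as in the line above, the count halves every halfLife time units: a single event at time 0 reads as 0.5 one half-life later.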
When scoring using the decaying counter we need to provide the current time as represented in the query context:
id: Long => ct(ctx.t, id) div ct(ctx.t)
Full source: RecentPopularityFeature.scala
Term popularity
Term popularity is a direct measure of the frequency of the document given the term. In our implementation we make use of a hierarchical counter state variable that allows us to get the count at each level:
val ct = s.counter(t.termsFromUpdate, ide)
The scoring function first extracts terms from the query, then returns a term-by-term product of the fraction of occurrences of the term that occur in the scored document:
val queryTerms = t.termsFromQueryContext(ctx)
id: Long => queryTerms.map(term => {
  ct(term, id) div ct(term)
}).product
Full source: TermPopularityFeature.scala
Naive Bayes popularity
One could assert that Naive Bayes rests on better foundations than the TermPopularityFeature, that it is similar but more principled. Yet Naive Bayes is usually applied in situations where its theoretical justifications do not apply. Our advice to the user: see what works best.
Here’s the scoring function used:
val queryTerms = t.termsFromQueryContext(ctx)
id: Long => queryTerms.map(term => {
  ct(id, term) div ct(id)
}).product
Full source: NaiveBayesPopularityFeature.scala
TF-IDF
Term frequency-inverse document frequency is a standard information retrieval measure. It is quite simple to implement using Antelope’s state primitives. Here ide represents the document identifier while t.termsFromUpdate represents a series of tokens from the update of indexed text:
val terms = counter(ide,t.termsFromUpdate)
val docs = set(ide)
val docsWithTerm = set(t.termsFromUpdate,ide)
The scoring function is defined as follows:
val queryTerms: Iterable[String] = t.termsFromQueryContext(ctx) // get the query terms from the query
val n = docs.size() // total number of documents
// scoring function returned
id: Long => (queryTerms map { t: String =>
  val tf = terms(id, t) // number of times the term occurs within the document
  val df = docsWithTerm.size(t) // number of documents that contain the term
  Math.sqrt(tf) * sq(1D + Math.log(n / (1D + df))) // TF-IDF as implemented in Lucene
}).sum
Full source: TfIdfFeature.scala
Additional examples from the dating simulation model
In the DatingModel we use an inline representation whereby features are defined in the model where they are used. We explore these features in detail here, whereas the full source is available in DatingModel.scala.
Binary indicators
Are users in the same region?
feature(new Feature[DatingScoringContext]() {
  val userRegion = map(userId, userRegionUpdate)
  override def score(implicit ctx: DatingScoringContext): (Long) => Double = {
    val srcRegion = userRegion(ctx.id)
    id: Long => if (srcRegion == userRegion(id)) 1 else 0
  }
})
Do regions share a border?
feature(new Feature[DatingScoringContext]() {
  val userRegion = map(userId, userRegionUpdate)
  override def score(implicit ctx: DatingScoringContext): (Long) => Double = {
    val srcRegion = userRegion(ctx.id)
    id: Long => if (Region.borders(srcRegion, userRegion(id))) 1 else 0
  }
})
Most recent value, with transformation relative to current time
When was the user most recently active?
feature(new Feature[DatingScoringContext]() {
  val lastActivity = map(userId, userActivityTime)
  override def score(implicit ctx: DatingScoringContext): (Long) => Double = {
    id: Long => {
      lastActivity.get(id) match {
        case Some(ts) => ctx.t - ts
        case None => 0D
      }
    }
  }
})
We need to pair the above feature with another that merely indicates whether we have an activity history for this user:
feature(new Feature[DatingScoringContext]() {
  val lastActivity = map(userId, userActivityTime)
  override def score(implicit ctx: DatingScoringContext): (Long) => Double = {
    id: Long => {
      lastActivity.get(id) match {
        case Some(_) => 1D
        case None => 0D
      }
    }
  }
})
Arithmetic features
The age difference between two users is one of the most straightforward features for dating recommendations:
feature(new Feature[DatingScoringContext]() {
  val userAge = map(userId, userAgeUpdate)
  override def score(implicit ctx: DatingScoringContext): (Long) => Double = {
    val srcAge: Int = userAge(ctx.id)
    id: Long => {
      val tgtAge = userAge(id)
      math.abs(srcAge - tgtAge)
    }
  }
})
We also introduce a feature for the square of the age difference:
feature(new Feature[DatingScoringContext]() {
  val userAge = map(userId, userAgeUpdate)
  override def score(implicit ctx: DatingScoringContext): (Long) => Double = {
    val srcAge = userAge(ctx.id)
    id: Long => {
      val tgtAge = userAge(id)
      (srcAge - tgtAge) * (srcAge - tgtAge)
    }
  }
})
Ratio features
For users with sufficient voting history we use the historical click rates to estimate how likely the user is to vote positively. Recommending less-selective users can increase match rates, though there are limits to how much inbound interest a user can respond to.
val thresholdVotes = 5
feature(new Feature[DatingScoringContext]() {
  class VoteRatio extends Updatable[Boolean] {
    var ct = 0
    var yesCt = 0
    override def update(x: Boolean): Unit = {
      ct += 1
      if (x) {
        yesCt += 1
      }
    }
  }
  val voteRatios = mapUpdatable(userId, userVote, new VoteRatio)
  override def score(implicit ctx: DatingScoringContext): (Long) => Double = {
    id: Long => {
      voteRatios.get(id) match {
        case Some(vr: VoteRatio) => if (vr.ct >= thresholdVotes) {
          vr.yesCt.toDouble / vr.ct.toDouble
        } else {
          0D
        }
        case None => 0D
      }
    }
  }
})
For features like the vote ratio, which are defined only for some users, we need to provide a complementary indicator feature signaling the availability of the ratio:
feature(new Feature[DatingScoringContext]() {
  val voteCounts = counter(userIdVoted)
  override def score(implicit ctx: DatingScoringContext): (Long) => Double = {
    id: Long => if (voteCounts(id) >= thresholdVotes) 1D else 0D
  }
})