Parceling is a technique often used in
analytical performance inference to help account for a “fuzzy” or
probabilistic 0/1 outcome for a given observation. It is also used in
model development, as an alternative to linear regression, to
convert a ratio variable to a 0/1 or binary target.
The problem in inference is to
determine how a specific “unknown” observation would have
performed had it been in the known population. For example:
- Lending or credit
  - How would a rejected applicant have performed had they been accepted?
  - How would an “unbooked” (accepted but walked away) applicant have performed had they taken the loan?
- Direct Mail
  - Would a potential customer have responded had they been mailed an offer?
  - If they had responded, would they have purchased something?
- Fraud
  - Would a credit application have been identified as fraudulent had it been investigated?
  - Would an insurance claim have been identified as fraudulent had it been investigated?
In this situation, the analysis of the
known population can be extrapolated (very carefully) into the
unknown population to derive a probability of the target performance
(for example, 1=Good or 0=Bad). This probability is then used to
divide an unknown observation into two separate observations, a Good
observation and a Bad observation. The Good observation is given a
weight equal to the probability of a 1, and the Bad observation
is given a weight equal to the probability of a 0.
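The split described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `parcel_unknown`, the dictionary layout, and the example probability of 0.8 are all assumptions made for this sketch, and the probability is taken as given (in practice it would come from the analysis of the known population).

```python
def parcel_unknown(obs, p_good):
    """Split one unknown observation into a weighted Good row and Bad row.

    obs    : dict of predictor values (hypothetical layout)
    p_good : estimated probability that the target is 1 (Good)
    """
    if not 0.0 <= p_good <= 1.0:
        raise ValueError("p_good must be between 0 and 1")
    # Same predictors on both rows; only the target and the weight differ.
    good_row = {**obs, "target": 1, "weight": p_good}
    bad_row = {**obs, "target": 0, "weight": 1.0 - p_good}
    return [good_row, bad_row]


# Hypothetical rejected applicant with an estimated 80% chance of being Good.
rejected = {"applicant_id": "R001", "score": 640}
parcels = parcel_unknown(rejected, p_good=0.8)
for row in parcels:
    print(row)
```

The two weights always sum to 1, so the parceled applicant still counts as a single observation.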
In model development, parceling is
applicable only under some very specific conditions. In general, if the
target can have different degrees of “Badness” or “Goodness,” then
parceling can be used. For example:
- Loan Collections – When a loan has been charged off, some or all of the money can be recovered. If none is recovered, it is Bad; if all is recovered, it is Good; if a portion is recovered, it is partially Good and partially Bad. The ratio here is % recovered.
- Insurance Risk – When an insurance policy has no claims, it is Good; when it has claims, it is somewhat Bad, depending on how large the claims are. The ratio here is the Loss Ratio, or Loss/(Premiums Paid).
- Profitability – An account in any sort of business could be classified as Good or Bad depending on the revenue generated from the account compared to the costs associated with the account. Accounts with no costs are Good; accounts with no revenue are Bad. The ratio here is Revenue/Cost.
Parceling is used in these ratio
examples in a similar way to the inference solution. Each partial
observation is duplicated, making a “Good” observation and a “Bad”
observation. The Good observations are given weights proportional to
their “Goodness” ($ recovered on the charged-off loan, insurance
premiums paid, revenue generated by the account) and the Bad
observations are given weights proportional to their “Badness” ($
owed on the charged-off loan, losses on the insurance due to
claim(s), costs associated with the account).
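As a rough sketch of the ratio case, the Python below duplicates a charged-off loan into a Good row and a Bad row. Several things here are assumptions made for illustration: the function name `parcel_ratio`, the record layout, the dollar figures, the choice to treat the unrecovered balance as the “Badness” amount, and the choice to rescale the two weights so they sum to 1 for each loan.

```python
def parcel_ratio(obs, good_amount, bad_amount):
    """Duplicate an observation into Good and Bad rows weighted by amounts.

    good_amount : measure of "Goodness" (here, $ recovered)
    bad_amount  : measure of "Badness" (here, $ not recovered; an assumption)
    """
    if good_amount < 0 or bad_amount < 0:
        raise ValueError("amounts must be non-negative")
    total = good_amount + bad_amount
    if total == 0:
        raise ValueError("at least one amount must be positive")
    # Weights are proportional to the amounts, rescaled to sum to 1.
    candidates = [(1, good_amount / total), (0, bad_amount / total)]
    # Fully Good or fully Bad observations need no duplicate row.
    return [{**obs, "target": t, "weight": w} for t, w in candidates if w > 0]


# Hypothetical loan: $1,000 charged off, $250 later recovered.
loan = {"loan_id": "C042"}
charged_off = 1000.0
recovered = 250.0
rows = parcel_ratio(loan, good_amount=recovered,
                    bad_amount=charged_off - recovered)
for row in rows:
    print(row)
```

Using the raw dollar amounts as weights instead of shares would also be proportional, but it would let larger accounts carry more influence; which is preferable depends on the modeling objective.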
For inference situations, parceling
allows a single observation with unknown performance to be split into
a “good” observation with a weight proportional to the estimated
probability that the observation would have been good and a “bad”
observation with a weight proportional to the estimated probability
that the observation would have been bad. These parceled observations are
then added to the known population to build a final model based on
the full through-the-door (TTD) population.
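A small sketch of that final step, assuming known observations carry a weight of 1 and the estimated probabilities are already available. All identifiers, outcomes, and probabilities below are hypothetical values chosen only to show the bookkeeping.

```python
# Known population: observed outcomes, each counted once (weight 1).
known = [
    {"applicant_id": "A1", "target": 1, "weight": 1.0},
    {"applicant_id": "A2", "target": 0, "weight": 1.0},
    {"applicant_id": "A3", "target": 1, "weight": 1.0},
]

# Unknown population: (id, estimated probability of Good), hypothetical values.
unknown = [("R1", 0.6), ("R2", 0.3)]

# Parcel each unknown observation into a Good row and a Bad row.
parceled = []
for applicant_id, p_good in unknown:
    parceled.append({"applicant_id": applicant_id, "target": 1, "weight": p_good})
    parceled.append({"applicant_id": applicant_id, "target": 0, "weight": 1.0 - p_good})

# The modeling data set is the known rows plus the parceled rows.
full_population = known + parceled

total_weight = sum(r["weight"] for r in full_population)
good_weight = sum(r["weight"] for r in full_population if r["target"] == 1)
print("rows:", len(full_population))
print("total weight:", round(total_weight, 6))
print("weighted good rate:", round(good_weight / total_weight, 4))
```

The total weight equals the number of applicants (three known plus two unknown), so each applicant still counts once. The combined rows can then be passed to a modeling routine that accepts observation weights.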