## Friday, February 15, 2013

### What is Parceling?

Parceling is a technique often used in analytical performance inference to help account for a “fuzzy” or probabilistic 0/1 outcome for a given observation. It is also used in model development, as an alternative to linear regression, to convert a ratio variable into a 0/1 (binary) target. Each use is described in turn:

#### Inference

The problem in inference is to determine how a specific “unknown” observation would have performed had it been in the known population.
• Lending or credit
  • How would a rejected applicant have performed had they been accepted?
  • How would an “unbooked” (accepted but walked away) applicant have performed had they taken the loan?
• Direct Mail
  • Would a potential customer have responded had they been mailed an offer?
  • If they had responded, would they have purchased something?
• Fraud
  • Would a credit application have been identified as fraudulent had it been investigated?
  • Would an insurance claim have been identified as fraudulent had it been investigated?

In this situation, the analysis of the known population can be extrapolated (very carefully) into the unknown population to derive a probability of the target performance (for example, 1 = Good or 0 = Bad). This probability is then used to divide an unknown observation into two separate observations, a Good observation and a Bad observation. The Good observation is given a weight equal to the probability of a 1, and the Bad observation is given a weight equal to the probability of a 0.
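
The split described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `parcel_by_probability`, the `p_good` field, and the sample applicants are all hypothetical, and the probabilities are assumed to come from a model built on the known population.

```python
def parcel_by_probability(unknown_rows, p_good_field="p_good"):
    """Split each unknown observation into a weighted Good row and a weighted Bad row."""
    parceled = []
    for row in unknown_rows:
        p_good = row[p_good_field]
        if not 0.0 <= p_good <= 1.0:
            raise ValueError(f"probability out of range: {p_good}")
        # Carry every other field over unchanged to both copies.
        features = {k: v for k, v in row.items() if k != p_good_field}
        # Good copy: target 1, weighted by P(Good).
        parceled.append({**features, "target": 1, "weight": p_good})
        # Bad copy: target 0, weighted by P(Bad) = 1 - P(Good).
        parceled.append({**features, "target": 0, "weight": 1.0 - p_good})
    return parceled


# Hypothetical rejected applicants with an estimated probability of being Good.
rejects = [
    {"id": "R1", "income": 42000, "p_good": 0.75},
    {"id": "R2", "income": 18000, "p_good": 0.25},
]
parceled_rejects = parcel_by_probability(rejects)
for r in parceled_rejects:
    print(r)
```

Each original observation still contributes a total weight of 1, so the parceled rows do not inflate the size of the unknown population.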

#### Ratios

This technique is only applicable in some very specific conditions. In general, if the target can have different degrees of “Badness” or “Goodness” then parceling can be used. For example:
• Loan Collections – When a loan has been charged off, some or all of the money can be recovered. If none is recovered, it is Bad; if all is recovered, it is Good; if a portion is recovered, it is partially Good and partially Bad. The ratio here is % recovered.
• Insurance Risk – When an insurance policy has no claims, it is Good; when it has claims, it is somewhat Bad, depending on how large the claims are. The ratio here is the Loss Ratio, or Loss/(Premiums Paid).
• Profitability – An account in any sort of business could be classified as Good or Bad depending on the revenue generated from the account compared to the costs associated with the account. Those accounts with no costs are Good, those accounts with no revenue are Bad. The ratio here is Revenue/Cost.

Parceling is used in these ratio examples in a similar way to the inference solution. Each partial observation is duplicated, making a “Good” observation and a “Bad” observation. The Good observations are given weights proportional to their “Goodness” (\$ recovered on the charged-off loan, insurance premiums paid, revenue generated by the account) and the Bad observations are given weights proportional to their “Badness” (\$ owed on the charged-off loan, losses on the insurance due to claim(s), costs associated with the account).
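
For the ratio case, a rough sketch using the loan collections example might look like the following. The function `parcel_by_amounts` and the field names are hypothetical, and the sketch assumes that “\$ owed” means the balance still unrecovered after collections.

```python
def parcel_by_amounts(rows, good_field, bad_field):
    """Duplicate each row into a Good copy and a Bad copy weighted by dollar amounts."""
    parceled = []
    for row in rows:
        good_amt, bad_amt = row[good_field], row[bad_field]
        if good_amt < 0 or bad_amt < 0:
            raise ValueError("amounts must be non-negative")
        features = {k: v for k, v in row.items() if k not in (good_field, bad_field)}
        # A zero-weight copy adds nothing, so fully Good or fully Bad rows stay single.
        if good_amt > 0:
            parceled.append({**features, "target": 1, "weight": good_amt})
        if bad_amt > 0:
            parceled.append({**features, "target": 0, "weight": bad_amt})
    return parceled


# Hypothetical charged-off loans (assumption: "unrecovered" is the amount still owed).
loans = [
    {"id": "L1", "recovered": 1000.0, "unrecovered": 0.0},   # fully recovered: Good only
    {"id": "L2", "recovered": 0.0, "unrecovered": 2000.0},   # nothing recovered: Bad only
    {"id": "L3", "recovered": 600.0, "unrecovered": 400.0},  # 60% recovered: split
]
parceled_loans = parcel_by_amounts(loans, "recovered", "unrecovered")

good_weight = sum(r["weight"] for r in parceled_loans if r["target"] == 1)
total_weight = sum(r["weight"] for r in parceled_loans)
weighted_good_rate = good_weight / total_weight
print(weighted_good_rate)
```

With dollar weights, the weighted Good rate of the parceled sample equals the overall share of dollars recovered, which also means large loans carry more influence than small ones.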

For inference situations, parceling allows a single observation with unknown performance to be split into a “good” observation, with a weight proportional to the estimated probability that the observation would have been good, and a “bad” observation, with a weight proportional to the estimated probability that the observation would have been bad. These parceled observations are then added to the known population to build a final model based on the full through-the-door (TTD) population.
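
As a small illustration of that last step, the sketch below appends parceled rows to a known population and computes a weighted Good rate over the combined sample. The names `build_ttd_sample` and `weighted_good_rate` are hypothetical, and giving each known observation a weight of 1 is an assumption, not something the technique prescribes.

```python
def build_ttd_sample(known_rows, parceled_rows):
    """Combine known observations with parceled (inferred) observations."""
    # Assumption: known rows keep their observed target and get a weight of 1.
    combined = [{**row, "weight": 1.0, "source": "known"} for row in known_rows]
    # Parceled rows already carry their own target and fractional weight.
    combined += [{**row, "source": "inferred"} for row in parceled_rows]
    return combined


def weighted_good_rate(rows):
    """Share of total weight that belongs to Good (target == 1) rows."""
    total = sum(r["weight"] for r in rows)
    good = sum(r["weight"] for r in rows if r["target"] == 1)
    return good / total


# Hypothetical known population with observed performance.
known = [
    {"id": "A1", "target": 1},
    {"id": "A2", "target": 1},
    {"id": "A3", "target": 0},
]
# One hypothetical rejected applicant, already parceled with P(Good) = 0.25.
parceled = [
    {"id": "R1", "target": 1, "weight": 0.25},
    {"id": "R1", "target": 0, "weight": 0.75},
]

ttd = build_ttd_sample(known, parceled)
print(weighted_good_rate(known_with_weights := build_ttd_sample(known, [])))
print(weighted_good_rate(ttd))
```

The combined rows can then be passed to any modeling routine that accepts per-observation weights.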