AI Underwriting Without Credit Bureaus
Most of the world's credit infrastructure is built on an assumption: the borrower has a credit file. A bureau has their repayment history. A score exists. A lender queries it.
In Nigeria — and across most of sub-Saharan Africa — that assumption breaks for a significant portion of small and medium businesses. They transact in cash or on mobile money. They haven't borrowed from a formal institution before. They're not in any bureau. Their creditworthiness is real, but it's invisible to the standard pipeline.
TrustRail is a BNPL underwriting platform I built for this context. A merchant offers "buy now, pay in instalments" to their business customers. The customer submits a bank statement. TrustRail reads it, produces a trust score, and makes an approval decision. No bureau required.
This article is about the architecture of that pipeline — the data model, the analysis engine, the two-tier LLM integration, and the specific design decisions driven by the Nigerian financial context.
Let's dig in.
The Input Signal
In the absence of a credit bureau, the bank statement is the richest available signal. A 3–6 month statement from a Nigerian bank (GTBank, Access, Zenith, Kuda, OPay, Moniepoint) tells you:
- Income patterns — frequency, amounts, source classifications (salary, business receipts, transfers)
- Spending behaviour — recurring obligations, loan repayments, utility payments, gambling activity
- Balance health — average balance, minimum balance, whether the account goes negative
- Behavioural signals — bounced transactions, failed direct debits, overdraft usage
- Debt exposure — identifiable loan repayments to lenders like Carbon, FairMoney, PalmCredit, Renmoney
None of these are as clean as a bureau-sourced credit score. But combined, they support a decision with meaningful predictive validity for the specific question being asked: "Can this business afford ₦X per month for the next N months?"
The challenge is that bank statements arrive as PDFs or CSVs in wildly different formats. GTBank's statement looks nothing like Kuda's. OPay's narrations are truncated differently from Access Bank's. A traditional rule-based parser that handles one bank's format doesn't handle another's without significant engineering for each.
This is where GPT-4o earns its place.
The Two-Engine Architecture
TrustRail has two analysis paths for a submitted statement, and the choice between them depends on what the applicant uploaded.
Primary Path: GPT-4o
When the application includes an OpenAI file ID (set during upload via the openaiService.uploadFileToOpenAI call), the background job uses GPT-4o to read the PDF directly:
```typescript
if (application.openai?.fileId) {
  const { analysisResult, fullResponse, fullPrompt } = await analyzeFileWithOpenAI(
    application.openai.fileId,
    application.installmentAmount,
    trustWallet.approvalWorkflow,
  );
  // ...
}
```
The model receives the PDF, the installment amount being applied for, and the approval thresholds configured by the merchant. It returns a structured TrustEngineAnalysisResult — but crucially, as discussed in the previous articles, it does not compute the trust score. The score is computed by the deterministic TypeScript engine after the model's extraction is complete.
Fallback Path: TypeScript CSV Engine
If the GPT-4o call fails, or if the application was submitted with a CSV buffer instead of a PDF file ID, the system falls back to a pure TypeScript CSV parser:
```typescript
if (application.bankStatementCsvData) {
  trustEngineOutput = await analyzeApplication(application.applicationId);
}
```
The CSV engine uses keyword-based transaction classification against a catalogue of known Nigerian financial institution narrations. It handles standard CSV exports from the major banks, though it lacks the format flexibility of the LLM path.
The fallback is intentional architecture, not an error-recovery afterthought. CSV analysis costs $0 per application. PDF analysis via GPT-4o costs ~$0.06–0.08 per application. For merchants with high application volumes and low average approval amounts, the CSV path may have better unit economics than the LLM path even when both are available.
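The routing between the two paths can be sketched as follows. This is a minimal illustration, not TrustRail's actual job code: the `StatementApplication` shape, the `analyzeWithFallback` name, and the injected `runLlm`/`runCsv` callbacks are hypothetical, and only the `openai.fileId` and `bankStatementCsvData` field names come from the snippets above.

```typescript
// Hypothetical application shape; only the field names are taken from the article.
interface StatementApplication {
  applicationId: string;
  openai?: { fileId?: string };
  bankStatementCsvData?: string;
}

type AnalysisPath = 'LLM_PDF' | 'CSV_ENGINE';

interface PathResult<T> {
  path: AnalysisPath;
  output: T;
}

// Try the LLM path when a file ID exists. If that call throws and CSV data
// is available, fall back to the CSV engine. With no file ID, go straight to CSV.
async function analyzeWithFallback<T>(
  application: StatementApplication,
  runLlm: (fileId: string) => Promise<T>,
  runCsv: (applicationId: string) => Promise<T>,
): Promise<PathResult<T>> {
  const fileId = application.openai?.fileId;
  if (fileId) {
    try {
      return { path: 'LLM_PDF', output: await runLlm(fileId) };
    } catch (err) {
      // Without CSV data there is nothing to fall back to, so surface the failure.
      if (!application.bankStatementCsvData) throw err;
    }
  }
  if (application.bankStatementCsvData) {
    return { path: 'CSV_ENGINE', output: await runCsv(application.applicationId) };
  }
  throw new Error('No analyzable statement attached to application');
}
```

Returning the chosen path alongside the output makes it possible to track how often each engine is used, which matters when the two paths have different per-application costs.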
The Classification Vocabulary
The analysis engine classifies every transaction against a domain-specific taxonomy built for the Nigerian context. This is not a generic financial classifier — it's tuned to the specific institutions, product names, and transaction narration patterns that appear on Nigerian bank statements.
Income Sources
```typescript
const CATEGORY_KEYWORDS = {
  salary: ['SALARY', 'SAL', 'WAGES', 'PAYROLL'],
  freelance: ['TRANSFER', 'REMITTANCE', 'UPWORK', 'FIVERR'],
  business: ['POS', 'PAYMENT FOR', 'SALES'],
};
```
The freelance category deliberately includes TRANSFER and REMITTANCE — in Nigeria, the vast majority of business-to-business and client-to-freelancer payments arrive as bank transfers with narrations that include these words. An overly conservative classifier that marks all transfers as non-income will systematically undercount income for self-employed applicants and freelancers.
This is the same classification problem TaxLens encounters with the Kuda format. The solution in TrustRail's CSV engine is the same: bias toward counting credits as income, and reserve the transfer classification for credits you're confident are not income (reversals, refunds, self-transfers).
Spending Categories
```typescript
const CATEGORY_KEYWORDS = {
  bills: ['PHCN', 'EKEDC', 'IKEDC', 'DSTV', 'GOTV', 'STARTIMES', 'AIRTEL', 'MTN', 'GLO', '9MOBILE'],
  loans: ['LOAN', 'REPAYMENT', 'INSTALLMENT', 'CARBON', 'BRANCH', 'FAIRMONEY', 'PALMCREDIT', 'RENMONEY'],
  gambling: ['BET', 'BETKING', 'SPORTYBET', 'NAIRABET', '1XBET', 'BET9JA', 'MSPORT', 'MERRYBET'],
};
```
The lender names in the loans category (Carbon, FairMoney, PalmCredit, Renmoney, Branch) are specifically Nigerian digital lenders. An applicant who is already repaying three of these simultaneously has a materially different risk profile from one with no active loans — even if both have the same average monthly income.
The gambling keywords are Nigerian-specific bookmakers. Frequent high-value transactions to betting platforms are a risk flag. They don't automatically trigger a decline — one bet per month isn't a signal — but above a threshold (₦10,000+ in gambling spend), a risk flag is raised.
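A keyword classifier over these lists might look like the sketch below. The keyword lists are copied from the article, but the matching strategy and the evaluation order are assumptions: the article does not show how the engine matches narrations. Whole-word matching is used here because plain substring matching would let `SAL` match `SALES` and `POS` match `DEPOSIT`. The names `classifyNarration` and `CATEGORY_RULES` are hypothetical.

```typescript
type Category =
  | 'salary' | 'freelance' | 'business'
  | 'bills' | 'loans' | 'gambling' | 'other';

// Escape regex metacharacters so keywords are matched literally.
const escapeRegExp = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Whole-word (or whole-phrase) matcher for one keyword.
const wordPattern = (kw: string): RegExp => new RegExp(`\\b${escapeRegExp(kw)}\\b`);

// Assumed evaluation order: risk-relevant categories first, generic TRANSFER last.
const CATEGORY_RULES: ReadonlyArray<[Category, RegExp[]]> = ([
  ['gambling', ['BET', 'BETKING', 'SPORTYBET', 'NAIRABET', '1XBET', 'BET9JA', 'MSPORT', 'MERRYBET']],
  ['loans', ['LOAN', 'REPAYMENT', 'INSTALLMENT', 'CARBON', 'BRANCH', 'FAIRMONEY', 'PALMCREDIT', 'RENMONEY']],
  ['bills', ['PHCN', 'EKEDC', 'IKEDC', 'DSTV', 'GOTV', 'STARTIMES', 'AIRTEL', 'MTN', 'GLO', '9MOBILE']],
  ['salary', ['SALARY', 'SAL', 'WAGES', 'PAYROLL']],
  ['business', ['POS', 'PAYMENT FOR', 'SALES']],
  ['freelance', ['TRANSFER', 'REMITTANCE', 'UPWORK', 'FIVERR']],
] as Array<[Category, string[]]>).map(
  ([category, keywords]): [Category, RegExp[]] => [category, keywords.map(wordPattern)],
);

// Return the first category whose keyword list matches the narration.
function classifyNarration(narration: string): Category {
  const upper = narration.toUpperCase();
  for (const [category, patterns] of CATEGORY_RULES) {
    if (patterns.some((p) => p.test(upper))) return category;
  }
  return 'other';
}
```

Because the first matching category wins, the order of `CATEGORY_RULES` is itself a design decision: a narration containing both `TRANSFER` and `CARBON` is treated as a loan repayment rather than freelance income.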
Bounce Detection
```typescript
const BOUNCE_KEYWORDS = [
  'INSUFFICIENT FUNDS',
  'REVERSAL',
  'DECLINED',
  'FAILED',
  'REJECTED',
];
```
Bounced transactions are identified by narration keyword matching. Bounce count feeds directly into the trust score (zero bounces: +5 points; 1–2 bounces: +2 points; 3+ bounces: -5 points). More than three bounces in a 3-month window triggers a FREQUENT_BOUNCES risk flag at HIGH severity.
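As a rough sketch of the detection step, the count and the flag condition could be computed like this. It assumes case-insensitive substring matching and counts each narration at most once, neither of which the article confirms. `countBounces` and `hasFrequentBounces` are hypothetical names.

```typescript
const BOUNCE_KEYWORDS: readonly string[] = [
  'INSUFFICIENT FUNDS',
  'REVERSAL',
  'DECLINED',
  'FAILED',
  'REJECTED',
];

// Count narrations that contain at least one bounce keyword.
// A narration matching several keywords still counts as one bounce.
function countBounces(narrations: string[]): number {
  return narrations.filter((narration) => {
    const upper = narration.toUpperCase();
    return BOUNCE_KEYWORDS.some((kw) => upper.includes(kw));
  }).length;
}

// FREQUENT_BOUNCES is raised for more than three bounces in the window.
function hasFrequentBounces(bounceCount: number): boolean {
  return bounceCount > 3;
}
```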
The Trust Score Formula
The trust score is a weighted sum across five independent dimensions. Each dimension is computed from structured transaction data. The score ranges from 0 to 100.
Score = Income Stability (30) + Spending Behaviour (25) + Balance Health (20)
+ Transaction Behaviour (15) + Affordability (10)
Income Stability (30 points max)
Two sub-components:
Income consistency (0–15): incomeConsistency × 15. The consistency metric is the ratio of months with meaningful credit activity to total months analyzed. An account with regular monthly credits close to the expected salary date scores near 1.0. An account with sporadic, irregular credits scores lower.
Income-to-installment ratio (0–15), computed as installment ÷ average monthly income:
- Ratio < 0.2: +15 (installment is less than 20% of monthly income)
- Ratio < 0.3: +10
- Ratio < 0.4: +5
- Ratio ≥ 0.4: +0
This ratio directly tests whether the requested credit is proportionate to the applicant's income. A ₦50,000/month installment for someone earning ₦500,000/month is a very different risk from the same installment for someone earning ₦100,000/month, even if the absolute income is "adequate."
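The two sub-components can be combined as in the sketch below. The function names are hypothetical, and the clamping of the consistency input to the 0–1 range is an assumption made so the result stays within the stated 0–30 range.

```typescript
// Points for the installment-to-income ratio, following the bands above.
function installmentRatioPoints(ratio: number): number {
  if (ratio < 0.2) return 15;
  if (ratio < 0.3) return 10;
  if (ratio < 0.4) return 5;
  return 0;
}

// Income stability: consistency (0–15) plus ratio points (0–15).
function incomeStabilityScore(
  incomeConsistency: number,
  installmentAmount: number,
  avgMonthlyIncome: number,
): number {
  // Assumption: consistency is clamped to [0, 1] before weighting.
  const consistency = Math.min(1, Math.max(0, incomeConsistency));
  // With no income, treat the ratio as unaffordable.
  const ratio = avgMonthlyIncome > 0 ? installmentAmount / avgMonthlyIncome : Infinity;
  return consistency * 15 + installmentRatioPoints(ratio);
}

// The two applicants from the example above, both with perfect consistency:
const highEarner = incomeStabilityScore(1, 50_000, 500_000); // ratio 0.1 → 30
const lowEarner = incomeStabilityScore(1, 50_000, 100_000);  // ratio 0.5 → 15
```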
Spending Behaviour (25 points max)
Debt ratio (0–10): max(0, 10 - debtRatio × 20). Existing loan repayments as a fraction of monthly income, penalised at 20× the ratio. An applicant already paying 40% of their income in loan repayments scores only 2 on this sub-component, and the score reaches 0 at 50%.
Gambling penalty (variable): If gambling spend is detected, the score is reduced by min(10, gamblingSpend / 1000). Up to 10 points can be lost here. The divisor (1,000) is denominated in naira — ₦10,000 in gambling spend removes the full 10 points.
Savings rate (0–15): min(15, savingsRate × 20). savingsRate = (avgMonthlyIncome - avgMonthlySpending) / avgMonthlyIncome. An applicant who saves 75% of their income scores 15. An applicant who spends everything they earn scores 0.
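Put together, the three sub-components might be computed as below. This is a sketch under stated assumptions: the savings points are floored at 0 to match the stated 0–15 range (the formula alone would go negative when spending exceeds income), zero income is treated as the worst case, and the total is not clamped. `SpendingInputs` and `spendingBehaviourScore` are hypothetical names.

```typescript
interface SpendingInputs {
  avgMonthlyIncome: number;
  avgMonthlySpending: number;
  existingLoanRepayments: number;
  gamblingSpend: number; // naira
}

interface SpendingBreakdown {
  debtPoints: number;
  savingsPoints: number;
  gamblingPenalty: number;
  total: number;
}

function spendingBehaviourScore(input: SpendingInputs): SpendingBreakdown {
  const income = input.avgMonthlyIncome;
  // Assumption: with no income, debt ratio is treated as 100% and savings rate as 0.
  const debtRatio = income > 0 ? input.existingLoanRepayments / income : 1;
  const savingsRate = income > 0 ? (income - input.avgMonthlySpending) / income : 0;

  const debtPoints = Math.max(0, 10 - debtRatio * 20);
  // Assumption: floored at 0 so overspending does not produce negative points.
  const savingsPoints = Math.min(15, Math.max(0, savingsRate * 20));
  const gamblingPenalty = Math.min(10, input.gamblingSpend / 1000);

  return {
    debtPoints,
    savingsPoints,
    gamblingPenalty,
    total: debtPoints + savingsPoints - gamblingPenalty,
  };
}
```

For example, an applicant earning ₦400,000, spending ₦100,000, repaying ₦40,000 in loans, and spending ₦5,000 on gambling gets 8 debt points, 15 savings points, and a 5-point gambling penalty, for 18 out of 25.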
Balance Health (20 points max)
Average balance vs. installment (0–10):
- Average balance > 2× installment: +10
- Average balance > 1× installment: +5
- Average balance ≤ installment: +0
Minimum balance vs. installment (0–10):
- Minimum balance > installment: +10
- Minimum balance > 0.5× installment: +5
- Minimum balance ≤ 0.5× installment: +0
The minimum balance test is particularly revealing. An account that has a good average balance but frequently drops near zero has a cash flow pattern inconsistent with reliable monthly payments. The minimum balance score penalises this specifically.
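The two balance checks translate directly into code. The sketch below follows the bands listed above; `balanceHealthScore` is a hypothetical name.

```typescript
// Balance health: average-balance points (0–10) plus minimum-balance points (0–10).
function balanceHealthScore(
  avgBalance: number,
  minBalance: number,
  installmentAmount: number,
): number {
  let avgPoints = 0;
  if (avgBalance > 2 * installmentAmount) avgPoints = 10;
  else if (avgBalance > installmentAmount) avgPoints = 5;

  let minPoints = 0;
  if (minBalance > installmentAmount) minPoints = 10;
  else if (minBalance > 0.5 * installmentAmount) minPoints = 5;

  return avgPoints + minPoints;
}

// A healthy average that repeatedly drops near zero earns only half the points:
const volatileAccount = balanceHealthScore(150_000, 5_000, 50_000); // 10 + 0
const steadyAccount = balanceHealthScore(200_000, 60_000, 50_000);  // 10 + 10
```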
Transaction Behaviour (15 points max)
Bounce count (+5/+2/-5): Described above.
Overdraft usage (+5/-5): An account that has gone negative receives -5 points. An account that has never gone negative receives +5. This tests the account's buffer — not just its average behaviour.
Transaction volume (+5/+2/+0): More than 30 transactions: +5. More than 15: +2. 15 or fewer: +0. A very low transaction count is a signal that the account may not be the applicant's primary account; they may be submitting a secondary account with selected favourable transactions.
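A minimal sketch of the three sub-components follows. It assumes that "has gone negative" is detected from a minimum balance below zero, which the article does not spell out, and `transactionBehaviourScore` is a hypothetical name. Note that this dimension can go negative.

```typescript
function transactionBehaviourScore(
  bounceCount: number,
  minBalance: number,
  transactionCount: number,
): number {
  // Bounces: 0 → +5, 1–2 → +2, 3 or more → -5.
  let bouncePoints = -5;
  if (bounceCount === 0) bouncePoints = 5;
  else if (bounceCount <= 2) bouncePoints = 2;

  // Assumption: overdraft usage is inferred from a negative minimum balance.
  const overdraftPoints = minBalance < 0 ? -5 : 5;

  // Volume: more than 30 → +5, more than 15 → +2, otherwise 0.
  let volumePoints = 0;
  if (transactionCount > 30) volumePoints = 5;
  else if (transactionCount > 15) volumePoints = 2;

  return bouncePoints + overdraftPoints + volumePoints;
}

const cleanAccount = transactionBehaviourScore(0, 1_000, 42);  // 5 + 5 + 5 = 15
const troubledAccount = transactionBehaviourScore(4, -200, 10); // -5 - 5 + 0 = -10
```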
Affordability (10 points max)
affordabilityRatio = installmentAmount / disposableIncome
disposableIncome = avgMonthlyIncome - (avgMonthlySpending + existingLoanRepayments)
- Ratio < 0.2: +10
- Ratio < 0.3: +7
- Ratio < 0.4: +4
- Ratio ≥ 0.5: canAffordInstallment = false → automatic DECLINED
The canAffordInstallment flag is a hard gate. If the installment is 50% or more of disposable income, the application is declined regardless of the trust score. This rule is evaluated before the score is used for the decision.
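The affordability check can be sketched as below. Two behaviours here are assumptions, since the bands above do not cover them: a ratio from 0.4 up to (but not including) 0.5 earns 0 points while still passing the gate, and a non-positive disposable income is treated as unaffordable. The `AffordabilityResult` shape and `assessAffordability` name are hypothetical.

```typescript
interface AffordabilityResult {
  disposableIncome: number;
  affordabilityRatio: number;
  points: number;
  canAffordInstallment: boolean;
}

function assessAffordability(
  installmentAmount: number,
  avgMonthlyIncome: number,
  avgMonthlySpending: number,
  existingLoanRepayments: number,
): AffordabilityResult {
  const disposableIncome = avgMonthlyIncome - (avgMonthlySpending + existingLoanRepayments);
  // Assumption: no disposable income means the installment cannot be afforded.
  const affordabilityRatio =
    disposableIncome > 0 ? installmentAmount / disposableIncome : Infinity;

  let points = 0; // Assumption: ratios in [0.4, 0.5) score 0 but are not gated.
  if (affordabilityRatio < 0.2) points = 10;
  else if (affordabilityRatio < 0.3) points = 7;
  else if (affordabilityRatio < 0.4) points = 4;

  return {
    disposableIncome,
    affordabilityRatio,
    points,
    canAffordInstallment: affordabilityRatio < 0.5, // hard gate at 50%
  };
}
```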
The Decision Gate
The merchant configures three thresholds per TrustWallet (a TrustWallet is a product offering — the merchant may have multiple with different terms):
- autoApproveThreshold: scores at or above this → APPROVED
- autoDeclineThreshold: scores below this → DECLINED
- minTrustScore: the floor — any score below this declines regardless of the other thresholds
The decision function:
```typescript
const makeDecision = (
  trustScore: number,
  approvalWorkflow: IApprovalWorkflow,
  affordabilityAssessment: AffordabilityAssessment,
): 'APPROVED' | 'FLAGGED_FOR_REVIEW' | 'DECLINED' => {
  if (!affordabilityAssessment.canAffordInstallment) return 'DECLINED';
  if (trustScore < approvalWorkflow.minTrustScore) return 'DECLINED';
  if (trustScore < approvalWorkflow.autoDeclineThreshold) return 'DECLINED';
  if (trustScore >= approvalWorkflow.autoApproveThreshold) return 'APPROVED';
  return 'FLAGGED_FOR_REVIEW';
};
```
Scores between autoDeclineThreshold and autoApproveThreshold are FLAGGED_FOR_REVIEW — they're routed to a human reviewer on the merchant's side. This band gives the merchant control over their risk tolerance without requiring them to binary-classify every application. A conservative merchant might auto-approve only at 80+ and manually review everything from 55–79. An aggressive merchant might auto-approve at 65+ and only manually review 45–64.
This is the right model for a system replacing a credit bureau: the algorithm provides a signal, not an edict. The merchant keeps the approval authority and calibrates the thresholds based on their actual loss rates over time.
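To see how the thresholds shape outcomes, here is a self-contained sketch that applies the same gate order to the two merchant profiles described above. The `Thresholds` type and `decide` function are simplified stand-ins for `IApprovalWorkflow` and `makeDecision`, and setting `minTrustScore` equal to `autoDeclineThreshold` is an assumption made for the example.

```typescript
type Decision = 'APPROVED' | 'FLAGGED_FOR_REVIEW' | 'DECLINED';

interface Thresholds {
  autoApproveThreshold: number;
  autoDeclineThreshold: number;
  minTrustScore: number;
}

// Same gate order as makeDecision: affordability, floor, auto-decline, auto-approve.
function decide(trustScore: number, canAfford: boolean, t: Thresholds): Decision {
  if (!canAfford) return 'DECLINED';
  if (trustScore < t.minTrustScore) return 'DECLINED';
  if (trustScore < t.autoDeclineThreshold) return 'DECLINED';
  if (trustScore >= t.autoApproveThreshold) return 'APPROVED';
  return 'FLAGGED_FOR_REVIEW';
}

// Assumed configurations matching the review bands 55–79 and 45–64.
const conservative: Thresholds = { autoApproveThreshold: 80, autoDeclineThreshold: 55, minTrustScore: 55 };
const aggressive: Thresholds = { autoApproveThreshold: 65, autoDeclineThreshold: 45, minTrustScore: 45 };

const underConservative = decide(70, true, conservative); // 'FLAGGED_FOR_REVIEW'
const underAggressive = decide(70, true, aggressive);     // 'APPROVED'
const gatedOut = decide(95, false, aggressive);           // 'DECLINED' despite the score
```

The same score of 70 lands in manual review for one merchant and is auto-approved for the other, while a failed affordability gate overrides even a very high score.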
Risk Flags
Beyond the score, the engine produces a set of named risk flags that the merchant can use to inform their manual review decisions:
| Flag | Severity | Condition |
|---|---|---|
| HIGH_GAMBLING_ACTIVITY | HIGH | Gambling spend > ₦10,000 |
| FREQUENT_BOUNCES | HIGH | Bounce count > 3 |
| OVERDRAFT_USAGE | MEDIUM | Account went negative |
| HIGH_DEBT_TO_INCOME | HIGH | Debt-to-income ratio > 40% |
| CANNOT_AFFORD_INSTALLMENT | HIGH | Affordability ratio ≥ 50% |
| INVALID_STATEMENT | HIGH | Document validity check failed |
A FLAGGED_FOR_REVIEW decision with HIGH_DEBT_TO_INCOME and FREQUENT_BOUNCES should inform a human reviewer differently than the same decision with only OVERDRAFT_USAGE. The flags add texture to the decision that the single-number score cannot.
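The flag conditions in the table can be expressed as a single pure function over already-computed metrics. This is an illustrative sketch: the `FlagInputs` shape and `deriveRiskFlags` name are hypothetical, and the flag names, severities, and thresholds are taken from the table.

```typescript
type Severity = 'HIGH' | 'MEDIUM';

interface RiskFlag {
  flag: string;
  severity: Severity;
}

// Hypothetical input shape: metrics assumed to be computed earlier in the pipeline.
interface FlagInputs {
  gamblingSpend: number;      // naira
  bounceCount: number;
  wentNegative: boolean;
  debtToIncomeRatio: number;  // 0.47 means 47%
  affordabilityRatio: number; // installment / disposable income
  statementValid: boolean;
}

function deriveRiskFlags(m: FlagInputs): RiskFlag[] {
  const flags: RiskFlag[] = [];
  if (m.gamblingSpend > 10_000) flags.push({ flag: 'HIGH_GAMBLING_ACTIVITY', severity: 'HIGH' });
  if (m.bounceCount > 3) flags.push({ flag: 'FREQUENT_BOUNCES', severity: 'HIGH' });
  if (m.wentNegative) flags.push({ flag: 'OVERDRAFT_USAGE', severity: 'MEDIUM' });
  if (m.debtToIncomeRatio > 0.4) flags.push({ flag: 'HIGH_DEBT_TO_INCOME', severity: 'HIGH' });
  if (m.affordabilityRatio >= 0.5) flags.push({ flag: 'CANNOT_AFFORD_INSTALLMENT', severity: 'HIGH' });
  if (!m.statementValid) flags.push({ flag: 'INVALID_STATEMENT', severity: 'HIGH' });
  return flags;
}
```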
The Approved Path: Mandate Creation
When an application is approved, TrustRail doesn't just send a webhook and stop. It creates a direct debit mandate via PayWithAccount/NIBSS so that installment collections can be automated without the customer having to manually make each payment.
The mandate creation runs immediately after the APPROVED decision:
```typescript
if (trustEngineOutput.decision === 'APPROVED') {
  application.status = 'APPROVED';
  await application.save();

  const mandateResult = await createMandate(
    {
      accountNumber: application.customerDetails.accountNumber,
      bankCode: application.customerDetails.bankCode,
      bvn: application.customerDetails.bvn,
      // ...
    },
    business.billerCode,
    application.totalAmount,
  );

  application.pwaMandateRef = mandateResult.mandateRef;
  application.status = 'MANDATE_CREATED';
  await application.save();
}
```
The mandate reference is then available for the merchant to initiate collection on each instalment due date. The underwriting pipeline produces not just a decision but a collection infrastructure reference.
BVN data is encrypted at rest using an AES-based scheme before storage. The encryption key is environment-specific. BVN is transmitted to the mandate provider but never stored in plaintext in the application document.
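The article does not specify which AES mode or storage layout is used, so the following is only one plausible shape for application-layer encryption, using AES-256-GCM from Node's built-in `crypto` module. The `encryptBvn`/`decryptBvn` names and the `iv.tag.ciphertext` string format are assumptions made for illustration.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto';

// Encrypt a BVN with AES-256-GCM. The key must be 32 bytes, e.g. loaded
// from an environment-specific secret. Output: base64 "iv.tag.ciphertext".
function encryptBvn(bvn: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh 96-bit nonce for every encryption
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(bvn, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((part) => part.toString('base64')).join('.');
}

// Reverse of encryptBvn. Throws if the payload is malformed, was tampered
// with, or the key is wrong.
function decryptBvn(payload: string, key: Buffer): string {
  const parts = payload.split('.').map((part) => Buffer.from(part, 'base64'));
  if (parts.length !== 3) throw new Error('Malformed encrypted BVN payload');
  const [iv, tag, ciphertext] = parts as [Buffer, Buffer, Buffer];
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}
```

An authenticated mode such as GCM means a modified ciphertext fails to decrypt rather than silently yielding a different BVN.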
Trade-offs and Honest Limitations
The classification keywords are a maintenance surface. The list of known lender names, utility providers, and gambling operators needs updating as the market changes. A new digital lender that isn't in the loans keyword list won't have its repayments counted against the debt ratio. This is a known gap. The LLM path mitigates it for PDF analysis — GPT-4o can identify "CARBON FINANCE REPAYMENT" as a loan repayment even without the keyword list — but the CSV fallback is keyword-dependent.
Transaction volume is a weak proxy for account completeness. Awarding the full +5 points only above 30 transactions per period is designed to penalise accounts with suspiciously few transactions. But a high-income individual who uses their corporate card for most expenses and only transfers their salary through the statement account might legitimately have 12 transactions in a month. The signal is directional, not definitive.
The income consistency metric is simplified. The current implementation approximates consistency as creditTransactions.length / (5 × monthsAnalyzed), assuming roughly 5 credits per month as a baseline. This works adequately for salary earners with regular deposits, but understates consistency for business owners who receive many smaller payments spread across the month. A more precise implementation would group transactions by calendar month and measure the variance in monthly credit totals — a project for v2.
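The contrast between the current approximation and the proposed v2 can be sketched as follows. The first function mirrors the formula quoted above, with a cap at 1.0 added as an assumption. The second is only one possible reading of "measure the variance in monthly credit totals": it scores consistency as 1 minus the coefficient of variation of the monthly totals, which is a choice made here, not something the article specifies. The `Credit` shape and both function names are hypothetical.

```typescript
interface Credit {
  date: string;   // ISO date, e.g. "2024-03-25"
  amount: number; // naira
}

// Current approximation: credit count against a baseline of 5 credits per month.
// Depends only on how many credits there are, not on their amounts or timing.
function approxConsistency(credits: Credit[], monthsAnalyzed: number): number {
  if (monthsAnalyzed <= 0) return 0;
  return Math.min(1, credits.length / (5 * monthsAnalyzed)); // cap is an assumption
}

// Possible v2: group credits by calendar month, then score how stable the
// monthly totals are. Months with no credits count as a total of zero.
function monthlyVarianceConsistency(credits: Credit[], monthsAnalyzed: number): number {
  const totals = new Map<string, number>();
  for (const credit of credits) {
    const month = credit.date.slice(0, 7); // "YYYY-MM"
    totals.set(month, (totals.get(month) ?? 0) + credit.amount);
  }
  const values = Array.from(totals.values());
  while (values.length < monthsAnalyzed) values.push(0);
  if (values.length === 0) return 0;

  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  if (mean <= 0) return 0;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / values.length;
  const coefficientOfVariation = Math.sqrt(variance) / mean;
  return Math.max(0, 1 - coefficientOfVariation);
}
```

Under this reading, identical monthly totals score 1.0 regardless of how many individual credits make them up, and a month with no income pulls the score down sharply.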
The trust score is not calibrated against outcomes. The weightings (30/25/20/15/10) and sub-component thresholds are based on financial domain knowledge, not trained against a labelled dataset of historical applications and their actual repayment outcomes. As TrustRail accumulates repayment data, those weightings should be revisited against actual loss rates per score band. The current score is a principled starting point, not a validated predictor.
BVN encryption is application-layer, not field-level. The BVN is encrypted before the application document is created, but it's stored as a single encrypted string in the document. A database-level field encryption with key rotation would be more robust — this is the kind of compliance detail that matters more as transaction volume grows.
Evaluation
The architecture produces several measurable properties relevant to the use case:
Format flexibility — The LLM path handles any Nigerian bank's PDF format without parser engineering for each institution. New bank statement formats are handled automatically, within the classification guidance in the system prompt.
Explainability — Every component of the trust score maps to a named sub-component with a formula. A declined applicant can be shown exactly why: "Your trust score is 44. Your income-to-installment ratio cost you 0/15 on income stability. Your debt-to-income ratio is 47% — above the 40% threshold — which triggered a HIGH_DEBT_TO_INCOME flag." This is not achievable with a black-box model score.
Merchant control — The three-threshold configuration (auto-approve, auto-decline, minimum score) gives merchants genuine control over their risk appetite without requiring them to understand the underlying formula. They calibrate based on their observed approval rates and loss rates, not on score internals.
Zero-bureau operation — The entire pipeline runs without querying any credit bureau, credit registry, or external data provider beyond the bank statement itself and the mandate creation API. This is the deliberate design choice that makes the system useful for the specific population of applicants who are invisible to conventional underwriting.
The Honest Conclusion
Bank statement underwriting is not a replacement for bureau-based credit scoring at scale. Bureaus aggregate repayment history across thousands of lenders and years of data. A 3-month bank statement is a much noisier signal.
What it does offer is a viable signal for a specific use case: a merchant offering BNPL to their existing customers, where the merchant has prior relationship context and the amounts involved are moderate. The trust score doesn't need to be as precise as a FICO score. It needs to be more informative than a gut feeling, and it needs to reach the population that the bureau pipeline cannot.
For that use case, in that market, with those constraints, the architecture described here does the job.
The credit bureau is not the only path to a lending decision. It's just the one the infrastructure was built around.