Migrating data using APIs

A guide

Prerequisites:

  1. A stable Fineract 1.x deployment and a running web-based community app (old or new) for testing.

  2. Configured Offices and Loan Products.

  3. The following data to migrate:

    1. Customer Data

    2. Loan Data

    3. Repayment transactions Data

Related Guides:

  1. For domain-specific documentation on DB migration, please refer to https://docs.fineract.net/database-migration/fineract-connector

Summary:

This guide first stages the work needed to import data feeds from a legacy Core Banking System (CBS) or a third-party transaction processor into an instance of Fineract.

The scope of this guide further covers a detailed set of steps to migrate historical records. Historical records are backdated data pertaining to a Financial Institution; in other words, loan records dated before an Apache Fineract instance went into production.

The readers of this guide will gain an understanding of:

  1. How to get updates on posted data import/export results

  2. The configuration mechanism and audit capability

  3. The bi-directional sync approach

Questions that this guide covers:

  1. What kind of mechanism is followed?

    1. Listening to replication logs

    2. Are we interested in listening to replication logs?

  2. How do we maintain data consistency?

    1. What are the different metrics?

      1. Financial Data Consistency using Accounting

      2. Several levels of consistency checks

        1. High-Level Checks such as Accounting provided by Fineract

        2. Matching of transactions

        3. Matching of Loan Balances

  3. Compute and verify hashes of the files at both ends of the transfer to ensure consistency.
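The hash verification above can be sketched as follows. This is a minimal illustration using Python's standard hashlib; `sha256_of_file` and `files_match` are hypothetical helper names, not part of Fineract.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 digest of a file in streaming fashion."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(source_path: str, exported_path: str) -> bool:
    """Compare digests computed at both ends of the transfer."""
    return sha256_of_file(source_path) == sha256_of_file(exported_path)
```

In practice the source system computes its digest before export and the receiving side recomputes it after transfer; any mismatch means the file must be re-exported.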

Data Parsing:

Say a file such as a CSV, or a database hosted on another platform, is provided. It might contain N records of data attributes spread across different rows and columns. In a generalized approach, the headers are provided with an association to a Fineract data model.

An important task is to evaluate the fields from the CSV for, say, a client or a loan, and map those fields to the corresponding Fineract attributes.
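A minimal sketch of this field mapping, assuming hypothetical legacy column names (cust_id, first_name, and so on) and a CSV source:

```python
import csv

# Illustrative mapping from legacy CSV headers to Fineract client fields;
# the legacy column names here are assumptions, not a fixed standard.
HEADER_MAP = {
    "cust_id": "externalId",
    "first_name": "firstname",
    "last_name": "lastname",
    "branch_id": "officeId",
}


def parse_clients(csv_path: str) -> list:
    """Read legacy rows and rename mapped columns to Fineract attribute names."""
    with open(csv_path, newline="") as f:
        return [
            {HEADER_MAP[col]: value for col, value in row.items() if col in HEADER_MAP}
            for row in csv.DictReader(f)
        ]
```

Columns absent from the map are dropped, which keeps the parser tolerant of extra legacy fields.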

Design Principles

Fineract has a modern data management system compared to the transaction processors and core banking systems that banks have traditionally used; in this guide, those systems are referred to as legacy core banking systems.

The purpose of this document is to understand Fineract Native APIs supporting external calls to legacy Core Banking Systems or a template database system with capabilities to export different File interchange formats.

The objective of this approach is to extract information using an ETL mechanism (Extract-Transform-Load) for the accounts/records such as loan, savings, and transaction data. This will then later be transformed to suit the data requirements necessary for Fineract APIs. In this guide, we will cover only the customer, loan and loan transaction data as examples.

Design Scenario and a few salient features:

Deployment assumptions: Apache Fineract is deployed in Kubernetes (Amazon EKS). This is to leverage its health monitoring and autoscaling when there is a high volume of traffic or stress in the workload of the deployment.

The data migration (depicted in the diagram) could be summarized in the following steps:

  1. Historical data from the legacy systems is fed to the ETL for preprocessing

  2. ETL will be used for cleaning and transforming the historical data to satisfy the data requirements needed by the Fineract APIs. (In this example, we leverage the usage of Google BigQuery)

  3. The migration script will then ingest the preprocessed data from the ETL and put it into Fineract using its native REST API calls. The scripts leverage asynchronous IO to boost performance by eliminating bottlenecks arising from blocking IO calls.
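The asynchronous-IO pattern in step 3 can be sketched as follows. The `post` coroutine is an injected placeholder standing in for the actual HTTP call (for example, an aiohttp request to a Fineract endpoint); the semaphore caps the number of in-flight requests:

```python
import asyncio
from typing import Awaitable, Callable


async def migrate_concurrently(
    records: list,
    post: Callable[[dict], Awaitable[dict]],
    max_in_flight: int = 20,
) -> list:
    """Post records concurrently, capping in-flight requests with a semaphore.

    `post` is an injected coroutine (e.g. an HTTP call to a Fineract API);
    it is a placeholder here so the pattern stays library-agnostic.
    """
    semaphore = asyncio.Semaphore(max_in_flight)

    async def bounded_post(record: dict) -> dict:
        async with semaphore:
            return await post(record)

    # gather() preserves input order in its results
    return await asyncio.gather(*(bounded_post(r) for r in records))
```

The cap matters: firing thousands of concurrent requests at once would stress the Fineract deployment rather than speed up the migration.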

Validation

Validating the data migration means checking whether the desired data was created in Fineract. Fineract returns a resourceId in the API response when it successfully creates a record, and the scripts check whether that resource ID is present. If a request violates a domain rule, the Fineract API returns a descriptive error response.

Sample Response:

{
    'defaultUserMessage': 'Errors contain reason for domain rule violation.',
    'developerMessage': 'Request was understood but caused a domain rule violation.',
    'errors': [{
        'args': [{'value': {'chronology': {'calendarType': 'iso8601', 'id': 'ISO'},
                            'dayOfMonth': 3,
                            'dayOfWeek': 'THURSDAY',
                            'dayOfYear': 154,
                            'era': 'CE',
                            'leapYear': False,
                            'month': 'JUNE',
                            'monthValue': 6,
                            'year': 2021}}],
        'defaultUserMessage': "The date on which a loan is submitted cannot be earlier than client's transfer date to this office",
        'developerMessage': "The date on which a loan is submitted cannot be earlier than client's transfer date to this office",
        'parameterName': 'id',
        'userMessageGlobalisationCode': 'error.msg.loan.submittal.cannot.be.before.client.transfer.date'
    }]
}
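A minimal sketch of the resourceId check described above; `check_migration_result` is an illustrative helper, not part of any Fineract client library:

```python
def check_migration_result(response: dict) -> int:
    """Return the created resourceId, or raise with Fineract's message.

    Fineract includes a resourceId on success and a descriptive error
    payload (defaultUserMessage plus an errors list) on domain failures.
    """
    if "resourceId" in response:
        return response["resourceId"]
    messages = [e.get("defaultUserMessage", "") for e in response.get("errors", [])]
    summary = response.get("defaultUserMessage", "Migration request failed")
    if messages:
        summary += ": " + "; ".join(messages)
    raise RuntimeError(summary)
```

Logging the raised message per record makes it straightforward to collect all domain-rule violations in one migration pass and fix them at the source.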

How to migrate customer Data:

To migrate the customer data, invoke POST https://DomainName/api/v1/clients for each customer record. This is done iteratively until all customer records are migrated.

Sample Request:

{
   "officeId": office_id,
   "firstname": first_name,
   "lastname": last_name,
   "middlename": middle_name,
   "externalId": external_id,
   "dateFormat": "dd MMMM yyyy",
   "locale": "en",
   "active": True,
   "activationDate": "08 September 2021",
   "submittedOnDate": "08 September 2021",
}

NOTE:

  • Data Mapping: We have used externalId here to back reference the customer ID we used in our existing legacy system. This will be used later to map the correct customer ID to a loan record correctly.

  • Data Redundancy: We also use the external ID to guard against the possibility that the migrated loan data will be re-uploaded, causing duplicates. Fineract returns an error when we try to re-upload data with the same external ID.

  • Here, we migrated only the minimum customer data needed by Fineract.

Sample Response:

{
    'clientId': 55393,
    'officeId': 418,
    'resourceId': 55393
}
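The per-customer loop described above can be sketched as follows, with hypothetical legacy field names and an injected `post` callable standing in for the HTTP call to /api/v1/clients:

```python
def build_client_payload(customer: dict) -> dict:
    """Map one legacy customer record to the Fineract /clients request body.

    The legacy keys (cust_id, first_name, ...) are illustrative.
    """
    return {
        "officeId": customer["branch_id"],
        "firstname": customer["first_name"],
        "lastname": customer["last_name"],
        "externalId": customer["cust_id"],  # legacy ID, kept for later mapping
        "dateFormat": "dd MMMM yyyy",
        "locale": "en",
        "active": True,
        "activationDate": customer["activation_date"],
        "submittedOnDate": customer["activation_date"],
    }


def migrate_clients(customers: list, post) -> dict:
    """POST each customer; `post` wraps the HTTP call to /api/v1/clients.

    Returns a map of legacy customer ID -> Fineract clientId, which the
    loan migration step uses to attach loans to the right borrower.
    """
    id_map = {}
    for customer in customers:
        response = post(build_client_payload(customer))
        id_map[customer["cust_id"]] = response["clientId"]
    return id_map
```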

How to migrate the loan Data:

Before migrating the loan data, do not forget to configure the loan products. In our case, we have a data model (please see the diagram below) in which one office corresponds to one loan product, and each historical loan record is always associated with one company. This mapping is then used to select the correct loan product when we migrate the loan records.

To correctly map the customer to the loan, the loan record contains a customer ID (saved as an external ID in the Fineract Client Object). This way, we could easily map the loan to the correct borrower.
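Assuming the external-ID map built during client migration is still available, resolving the Fineract clientId for a loan can be sketched as:

```python
def resolve_client_id(loan: dict, client_id_map: dict) -> int:
    """Look up the Fineract clientId via the legacy customer ID that was
    stored as externalId during client migration.

    The `customer_id` key on the loan record is an illustrative legacy
    field name.
    """
    legacy_customer_id = loan["customer_id"]
    try:
        return client_id_map[legacy_customer_id]
    except KeyError:
        raise KeyError(
            f"No migrated client for legacy customer {legacy_customer_id}; "
            "migrate clients before loans."
        ) from None
```

Failing fast here keeps ordering mistakes (loans migrated before their clients) from silently producing mis-attached loans.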

Then, we invoke POST https://DomainName/api/v1/loans per loan record to migrate each existing loan record to Fineract. This is done iteratively until all loan records are migrated.

Sample Request:

{
    "externalId": nlo_id,
    "clientId": client_id,
    "productId": loan_product_id,
    "disbursementData": [],
    "principal": principal,
    "loanTermFrequency": loan_term,
    "loanTermFrequencyType": 2,
    "numberOfRepayments": num_repayments,
    "repaymentEvery": 2,
    "repaymentFrequencyType": 5,
    "interestRatePerPeriod": interest_rate_per_period,
    "amortizationType": 1,
    "isEqualAmortization": False,
    "interestType": 0,
    "interestCalculationPeriodType": 1,
    "allowPartialPeriodInterestCalcualtion": False,
    "transactionProcessingStrategyId": 1,
    "rates": [],
    "charges": charges,
    "repaymentsStartingFromDate": "15 September 2021",
    "locale": "en",
    "dateFormat": "dd MMMM yyyy",
    "loanType": "individual",
    "expectedDisbursementDate": "10 September 2021",
    "submittedOnDate": "08 September 2021"
}

NOTE:

  • Mapping: We have used externalId to back-reference the Loan ID we used in our existing legacy database system. This will be used later to map the correct loan ID to a repayment transaction record.

  • Avoid data redundancy: We also use the external ID to guard against the possibility that the migrated loan data will be reuploaded, causing duplicates. Fineract returns an error when we try to re-upload data with the same external ID.

Sample Response:

{
    'clientId': 10672,
    'loanId': 69297,
    'officeId': 395,
    'resourceId': 69297
}

How to migrate loan transactions Data:

For migrating loan transactions, it is essential to map each transaction to the correct loan previously imported. We resolve the correct Fineract loan ID by looking up the external ID stored on the loan record in Fineract.

Then, we invoke POST https://DomainName/api/v1/loans/{loanId}/transactions?command=repayment per repayment transaction record to migrate the transaction to Fineract. This is done iteratively until all repayment transaction records are migrated.
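The per-transaction loop can be sketched as follows; the legacy field names and the `post(loan_id, payload)` callable are illustrative:

```python
def migrate_repayments(transactions: list, loan_id_map: dict, post) -> list:
    """POST each repayment to /loans/{loanId}/transactions?command=repayment.

    `loan_id_map` maps legacy loan IDs (stored as externalId on the loan)
    to Fineract loan IDs; `post(loan_id, payload)` wraps the HTTP call.
    """
    results = []
    for txn in transactions:
        loan_id = loan_id_map[txn["loan_id"]]
        payload = {
            "dateFormat": "dd MMMM yyyy",
            "locale": "en",
            "externalId": txn["txn_id"],  # guards against re-uploads
            "transactionDate": txn["date"],
            "transactionAmount": txn["amount"],
            "paymentTypeId": "1",
        }
        results.append(post(loan_id, payload))
    return results
```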

Sample Request:

{
    "dateFormat": "dd MMMM yyyy",
    "locale": "en",
    "externalId": external_id,
    "transactionDate": "10 September 2021",
    "transactionAmount": transaction_amount,
    "paymentTypeId": "1"  # This is just the default payment type id
}

NOTE:

  • We have used externalId to back reference the Transaction ID we used in our existing system.

  • Avoid redundancies: We also use the externalId to guard against the possibility that the migrated loan data will be reuploaded, causing duplicates. Fineract returns an error when we try to re-upload data with the same externalId.

Sample Response:

{
    'changes': {
        'dateFormat': 'dd MMMM yyyy',
        'locale': 'en',
        'paymentTypeId': '1',
        'transactionAmount': '1453.337646',
        'transactionDate': '24 March 2021'
    },
    'clientId': 18083,
    'loanId': 68618,
    'officeId': 189,
    'resourceId': 1384967
}

There are other aspects and suggestions to observe when a large-scale database migration or import sync takes place:

Syncing Data Using Log Level Approach

  1. Maintaining logs of the incoming information through streams of listeners

  2. Event-driven, message-oriented support

The role of a suitable event-sourcing middleware architecture is vital to:

  1. Observe changes and maintain a changelog

  2. Expose data transform API

  3. Expose API for the start and status of the job

  4. Finalise the data interchange API

  5. Mutate information API

These APIs should contain the following methods

  1. When processing a file or a new record directly with the job processing entity

    1. Make sure that we store the job unit details

    2. Return results and update listening parties to ensure consistent state updates

  2. Maintain distinct logs

    1. These logs can be read and written by other services

    2. Maintaining these logs should include a consensus approach so that, if any data point is missed (for example, because the service went down), the service can resume its function from the same point.

    3. Context awareness of the other services

  3. Fineract-related activities should be handled in separate queues even in the case of sending just a single record.
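A much-simplified sketch of the resume-from-the-same-point behaviour in item 2, using a file-based offset checkpoint in place of a consensus-backed store:

```python
import json
import os


def consume_changelog(log_entries: list, state_path: str, handle) -> None:
    """Process changelog entries, persisting the last handled offset.

    A restarted consumer resumes from the stored offset instead of
    replaying or skipping records. File-based checkpointing here is an
    illustrative stand-in for the consensus approach described above.
    """
    offset = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            offset = json.load(f)["offset"]
    for index, entry in enumerate(log_entries):
        if index < offset:
            continue  # already processed before the restart
        handle(entry)
        # checkpoint after every entry so a crash loses at most one unit
        with open(state_path, "w") as f:
            json.dump({"offset": index + 1}, f)
```

In a production setup the checkpoint would live in a replicated store so multiple consumers agree on the resume point.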

This post is contributed by Guest author Marnell Monteverde


Maintained by © Muellners Foundation. All Rights Reserved.