Replicate MongoDB to PostgreSQL in real time
Open source
Why would you want to replicate a database?
Sometimes there is some different reasons: improve permforance and reliability on distributed systems, to make a backup...
On this case we (RebellionPay) were using PowerBI to analyze and visualize our business data.
The tricky part was to feed PowerBI with the data from our MongoDB database and it doesn't have a way to do that. So we decided that a PostgreSQL database would be the Data Warehouse for our business data.
That is why we decided to create a CLI tool to replicate MongoDB to PostgreSQL.
But why in real time?
This type of work (replication) is done in nightly jobs, data was that big that it would take a long time to replicate. We needed to do it in real time. Listen to changes on MongoDB and replicate them to the PostgreSQL database.
Configuration
The user needs to provide the following configuration:
- MongoDB connection
- PostgreSQL connection
- MongoDB collection models
- Sync options: watchOperations
You could listen to a multiple operations on a collection:
ts
enum MongoOperation {INSERT = 'insert',REPLACE = 'replace',UPDATE = 'update',DELETE = 'delete',RENAME = 'rename',DROP = 'drop',DROP_DATABASE = 'dropDatabase',INVALIDATE = 'invalidate',}
Example configuration
json
{"mongo": {"connection": {"options": {"useNewUrlParser": true,"useUnifiedTopology": true,"auto_reconnect": true}},"collectionModels": {"users": {"_id": {"type": "TEXT","primary": true},"email": {"type": "TEXT"},"phone": {"type": "TEXT"},"phone_country": {"type": "TEXT"},"lang": {"type": "TEXT"},"name": {"type": "TEXT"},"createdAt": {"type": "TIMESTAMP"},"updatedAt": {"type": "TIMESTAMP"}},"postgres": {"connection": {"config": {"host": "psql.example.com","database": "database","user": "postgres","password": "****************************"}}},"sync": [{"collection": "users","watchOperations": ["INSERT", "UPDATE"]}]}}}
You can specify also if a column is primary or unique as shown.
As you can see, the configuration is pretty simple. We are going to sync all the INSERT
and UPDATE
operations on the users
collection.
Every INSERT on MongoDB is parsed and inserted on PostgreSQL. It happens the same way for UPDATE.
Running the tool
Setup the tool:
bash
git clone https://github.com/rebellionpay/mopo mopo && cd moponpm inpm run build
Listen and sync only users collection:
bash
dist/index.js --start .mopo.json --listen-only users --log-level verbose --strict-listen
Visit the github repository
Open source