Simple firestore data migrator
chrisjwwalker committed Jan 23, 2020
0 parents commit 14e0941
Showing 4 changed files with 1,428 additions and 0 deletions.
101 changes: 101 additions & 0 deletions .gitignore
@@ -0,0 +1,101 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# IntelliJ IDEA
.idea/
target/
**/*.iml


# VS Code
.vscode/

# Runtime data
pids
*.pid
*.seed
*.pid.lock

# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov

# Coverage directory used by tools like istanbul
coverage

# nyc test coverage
.nyc_output

# Grunt intermediate storage (http://gruntjs.com/creating-plugins#storing-task-files)
.grunt

# Bower dependency directory (https://bower.io/)
bower_components

# node-waf configuration
.lock-wscript

# Compiled binary addons (https://nodejs.org/api/addons.html)
build/Release

# Dependency directories
node_modules/
jspm_packages/

# TypeScript v1 declaration files
typings/

# Optional npm cache directory
.npm

# Optional eslint cache
.eslintcache

# Optional REPL history
.node_repl_history

# Output of 'npm pack'
*.tgz

# Yarn Integrity file
.yarn-integrity

# dotenv environment variables file
.env

# next.js build output
.next
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
/node_modules
/.pnp
.pnp.js

# testing
/coverage

# production
/build
.firebase

# misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local

npm-debug.log*
yarn-debug.log*
yarn-error.log*
/.idea/
/functions/.serverless/
/.metadata/


aws/fargate/kinesis-x-recognition/aws.properties
*.json
33 changes: 33 additions & 0 deletions README.md
@@ -0,0 +1,33 @@
# firestore-data-migrator

## What is this?
This is a simple tool that migrates data between two Cloud Firestore instances sitting under different GCP accounts.

## What do I need?
You need five things.

- Service account json for the database providing the data
- Service account json for the database receiving the data
- The name of the collection where the data currently sits
- The name of the collection you want to insert the data into
- The name of the firebase project you're exporting to

## Where do I get "Service account jsons"?
On the Google Cloud Platform [service account page](https://console.cloud.google.com/projectselector2/iam-admin/serviceaccounts?supportedpurview=project) select or create a new service account. Then go into the account, click edit at the top and click create key at the bottom. Choose json and a file will download.

## Got those, how do I run?

```shell
git clone git@github.com:Capgemini-AIE/firestore-data-migrator.git

yarn install

INPUT_CREDS="path/to/input/service/acc.json" \
INPUT_COLLECTION="input collection name" \
OUTPUT_CREDS="path/to/output/service/acc.json" \
OUTPUT_COLLECTION="output collection name" \
OUTPUT_APP_NAME="name of your second firebase app" \
node migrator.js
```
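The migrator reads these variables straight from `process.env`. As a minimal sketch, a fail-fast check could surface a missing variable before any Firebase app is initialised. Note that the `requiredEnv` helper below is hypothetical and not part of the repo:

```javascript
// Hypothetical helper (not in migrator.js): read a required environment
// variable, throwing a clear error if it is missing or empty.
const requiredEnv = (name) => {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
};

// Demo with a variable set in-process:
process.env.INPUT_COLLECTION = "users";
console.log(requiredEnv("INPUT_COLLECTION")); // prints "users"
```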

## Notes
If your collection contains a substantial number of documents, the migrator will split them into batches of 10 to prevent errors.
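The batching behaviour can be sketched as follows. This is a simplified variant of the `chunk()` helper in migrator.js, not the exact implementation:

```javascript
// Split an array of documents into sub-arrays of at most `size` elements.
const chunk = (array, size) => {
  const chunked = [];
  const copied = [...array];
  while (copied.length > 0) {
    chunked.push(copied.splice(0, size));
  }
  return chunked;
};

// 25 documents become batches of 10, 10 and 5.
console.log(chunk([...Array(25).keys()], 10).map(batch => batch.length)); // prints [ 10, 10, 5 ]
```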


## Authors
Created by Capgemini AIE London team.
68 changes: 68 additions & 0 deletions migrator.js
@@ -0,0 +1,68 @@

const admin = require("firebase-admin");

// Source database: initialised as the default app from the input service account.
const inputCreds = require(process.env.INPUT_CREDS);
const inputDatabase = admin.initializeApp({credential: admin.credential.cert(inputCreds)}).firestore();
const inputCollection = process.env.INPUT_COLLECTION;

// Destination database: initialised as a second, named app so both can coexist.
const outputCreds = require(process.env.OUTPUT_CREDS);
const outputAppName = process.env.OUTPUT_APP_NAME;
const outputDatabase = admin.initializeApp({credential: admin.credential.cert(outputCreds)}, outputAppName).firestore();
const outputCollection = process.env.OUTPUT_COLLECTION;

// Fetch every document in the collection and index it by document id.
const batchGet = (repo, collection) => {
  return repo.collection(collection).get().then(querySnap => {
    console.log(`Found ${querySnap.docs.length} documents`);
    return querySnap.docs.reduce((mappedDocs, doc) => {
      return Object.assign({}, mappedDocs, {
        [doc.id]: {
          id: doc.id,
          data: doc.data()
        }
      });
    }, {});
  });
};

// Write one chunk of documents to the output collection as a single batched
// write, so each chunk either fully succeeds or fully fails.
const batchWrite = (docs, repo, collection) => {
  const batch = repo.batch();
  docs.forEach(doc => {
    const ref = repo.collection(collection).doc(doc.id);
    batch.set(ref, doc.data);
  });
  console.log(`Inserting ${docs.length} documents`);
  return batch.commit();
};

// Compare document counts in the input and output collections after the copy.
const reconciliate = (inputRepo, outputRepo, inputCollection, outputCollection) => {
  return inputRepo.collection(inputCollection).get().then(inputSnap => {
    return outputRepo.collection(outputCollection).get().then(outputSnap => {
      const inputCount = inputSnap.docs.length;
      const outputCount = outputSnap.docs.length;
      console.log(`reconciliate => Found ${inputCount} in input collection`);
      console.log(`reconciliate => Found ${outputCount} in output collection`);
      console.log(`reconciliate => Do amount of docs match? => ${inputCount === outputCount ? "YES" : "NO"}`);
    });
  });
};

// Split an array into sub-arrays of at most `size` elements.
const chunk = (array, size) => {
  const chunkedArr = [];
  const copied = [...array];
  const numOfChunks = Math.ceil(copied.length / size);
  for (let i = 0; i < numOfChunks; i++) {
    chunkedArr.push(copied.splice(0, size));
  }
  return chunkedArr;
};

// Read everything from the input collection, copy it across in chunks of 10,
// then reconcile the document counts.
batchGet(inputDatabase, inputCollection).then(docMap => {
  const docArr = Object.values(docMap);
  const chunkedArr = chunk(docArr, 10);

  return Promise.all(chunkedArr.map(batchChunk => {
    return batchWrite(batchChunk, outputDatabase, outputCollection);
  })).then(() => {
    return reconciliate(inputDatabase, outputDatabase, inputCollection, outputCollection);
  });
}).catch(err => {
  console.error("Migration failed:", err);
  process.exit(1);
});