Firebase and Google Cloud Pitfalls

Overview

There is a term called Cloud Overflow (you may have heard of memory overflow or stack overflow). Here, Cloud Overflow simply means that things get out of control and you end up burning money on the cloud because you are using more resources than you expected.

In this article, one line of code can cost up to $30K in 72 hours: https://hackernoon.com/how-we-spent-30k-usd-in-firebase-in-less-than-72-hours-307490bd24d To be honest, the Avada team has also run into billing issues a couple of times because of coding mistakes in Firestore or Google Cloud in general. Below, we go through each case and the lessons learned from it.

Never get more information than you need from Firestore!

It is exactly what the title says. NEVER get more than you need from Firestore, and ALWAYS consider adding a filter to every query you make. For example, in Email Marketing, you want to send a welcome email to your contacts. You query all the contacts, then send out the welcome email whether or not a contact has an email address, and leave it to the email sending service to deal with null emails.

This could be a huge mistake if your query runs every minute or on a page that has high traffic. To avoid this, filter for contacts whose email is NOT NULL:

const toSendContacts = await db.collectionGroup('contacts').where('email', '!=', null).get();
toSendContacts.docs.forEach(contact => {
  // Send the email to this contact, or push it to Pub/Sub
});

With this simple line of code, you could save a huge number of unnecessary Firestore reads, which might otherwise cost a lot.

Firestore does not have SUM or COUNT aggregation (Not all the time)

You might be familiar with COUNT or SUM from the SQL world, where they work right out of the box. Then you are gobsmacked to find out that Firestore does not provide this feature. You go online and find a post on Stack Overflow suggesting that you read all the documents from Firestore and use the reduce function to calculate the total. Please DO NOT DO THIS, and again:

Do not read all the documents from Firestore!

Firestore charges you for every document you pull out of the database. If you have 1k documents, you need to read 1k documents to sum a field. Then n customers request the same information every second, which could lead to a huge number of reads in your system. Instead, keep a count field right on the parent document, for example a comments field that holds the number of comments on a post, and read that single document rather than querying all the comments in the system. See this article for more information: https://fireship.io/snippets/firestore-increment-tips/
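The counter-field idea can be sketched roughly as follows. This is a minimal sketch, not the article's own implementation: it assumes a firebase-admin Firestore instance (db) and its FieldValue class are passed in, and the posts/comments collection names, the commentCount field, and both function names are hypothetical.

```javascript
// Assumption: `db` is a firebase-admin Firestore instance and `FieldValue`
// is its FieldValue class. Collection and field names are illustrative only.
async function addComment(db, FieldValue, postId, comment) {
  const postRef = db.collection('posts').doc(postId);
  const commentRef = postRef.collection('comments').doc(); // auto-generated ID

  // Write the comment and bump the counter in one atomic batch.
  const batch = db.batch();
  batch.set(commentRef, comment);
  batch.set(
    postRef,
    { commentCount: FieldValue.increment(1) },
    { merge: true } // keep the other fields of the post untouched
  );
  await batch.commit();
}

// Reading the count now costs one document read,
// no matter how many comments the post has.
async function getCommentCount(db, postId) {
  const snap = await db.collection('posts').doc(postId).get();
  return snap.get('commentCount') || 0;
}
```

The trade-off is one extra write per comment in exchange for a constant single read whenever the count is displayed.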

Firestore - Pitfalls, Scale, and Cost Optimization technique

Do not use more resources for a cloud function than needed

By default, every Firebase Cloud Function gets 256MB of RAM and a 400MHz CPU. In some cases, you might find that your functions time out or exceed their memory limit, so you raise the available RAM to the maximum of 2GB:

import * as functions from 'firebase-functions';

export const myFunction = functions
  .runWith({memory: '2GB'})
  .https.onRequest(myHandler);

However, this can be wasteful: your function may not use all of those resources, yet you still pay Google the price of a 2GB instance:

Google Cloud Function Pricing Table

So the question is: how do I know how much of these resources my function actually uses?

To find out, go to Google Cloud Console > Functions and select the function you want to inspect.

Google Cloud Function Usage

If the memory you declared is well above what the function actually uses, lower the setting. If the function is hitting its limit, raise it only as far as it needs.
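One way to turn the observed usage into a setting is to pick the smallest memory tier that still leaves some headroom. The helper below is a hypothetical sketch: the function name, the 25% headroom factor, and the decision to stop at the 2GB tier mentioned in this article are all assumptions, not an official recommendation.

```javascript
// Memory tiers up to the 2GB maximum discussed in this article.
const MEMORY_TIERS = [
  { label: '128MB', mb: 128 },
  { label: '256MB', mb: 256 },
  { label: '512MB', mb: 512 },
  { label: '1GB', mb: 1024 },
  { label: '2GB', mb: 2048 },
];

// Hypothetical helper: returns the smallest tier that covers the peak
// usage seen in the console plus a safety margin (25% by default).
function suggestMemory(peakUsageMb, headroom = 1.25) {
  const needed = peakUsageMb * headroom;
  const tier = MEMORY_TIERS.find((t) => t.mb >= needed);
  // Fall back to the largest tier if nothing is big enough.
  return tier ? tier.label : MEMORY_TIERS[MEMORY_TIERS.length - 1].label;
}

console.log(suggestMemory(150)); // prints "256MB"
```

The returned label can then be passed as the memory option of runWith.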

Do not set your function timeout too long

For some functions, you need to make sure they run to completion rather than being cut off partway by a timeout. However, if the function keeps timing out and your timeout setting is 540s, like:

export const myFunction = functions
  .runWith({timeoutSeconds: 540})
  .https.onRequest(myHandler);

This could lead to a huge problem, because you pay Google for an instance that sits there doing nothing for 9 minutes on every invocation. Imagine you have 100,000 invocations that time out, billed at the default tier:

Memory: 256MB	CPU: 400MHz	Price: $0.000000463 per 100ms
0.000000463 * 5400 * 100000 ≈ $250

Here 5400 is the number of 100ms billing units in 540s.
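The same arithmetic can be written as a small helper so that other timeout settings are easy to compare. This is a rough sketch: the function name is made up, it uses the 100,000 invocations from the formula above, and it only covers compute time at the quoted price, not per-invocation or networking charges.

```javascript
// Price per 100ms for the 256MB / 400MHz tier, as quoted above.
const PRICE_PER_100MS_256MB = 0.000000463;

// Hypothetical helper: compute-time cost of invocations that run
// all the way to their timeout before being killed.
function wastedTimeoutCost(pricePer100ms, timeoutSeconds, invocations) {
  const billedUnits = timeoutSeconds * 10; // number of 100ms units per invocation
  return pricePer100ms * billedUnits * invocations;
}

const cost = wastedTimeoutCost(PRICE_PER_100MS_256MB, 540, 100000);
console.log(`$${cost.toFixed(2)}`); // prints "$250.02"
```

With a 60s timeout, the same 100,000 stuck invocations would be billed for a ninth of the time, which is why a tight timeout limits the damage of a bug.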

So why does the function keep timing out?

It could be because you are missing a try/catch in your code, or because you forgot to call the next function in your Node.js middleware:

export async function middleware(ctx, next) {
  try {
    // Doing my logic
    // BUG: the line below is missing, so the request never moves on to the
    // next middleware and the HTTP function never terminates.
    // return await next();
  } catch (e) {
    console.error(e);
    return (ctx.body = {
      success: false,
      message: e.message
    });
  }
}
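For contrast, a corrected version of the middleware above might look like the sketch below. It assumes Koa-style middleware, where ctx is the request context and next hands control to the next middleware; the source does not name the framework, so treat that as an assumption.

```javascript
// Assumption: Koa-style middleware signature (ctx, next).
async function middleware(ctx, next) {
  try {
    // ... request-specific logic goes here ...

    // Hand control to the next middleware so the request can complete.
    return await next();
  } catch (e) {
    // Any error still produces a response, so the function
    // ends right away and is not left running until the timeout.
    console.error(e);
    ctx.body = {
      success: false,
      message: e.message,
    };
  }
}
```

Either path ends the request: the happy path through next(), the error path by writing a response body.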