Evaluating API BaaS as a data store

API BaaS is not an RDBMS. It does not include RDBMS features such as count(*) and cross-table joins.

API BaaS is a graph database built on top of a key/value database (Cassandra). Although it is built on Cassandra, API BaaS is not simply Cassandra: for example, Cassandra may offer features that are not available in API BaaS. However, if you have experience with Cassandra, you will be in a better position to understand how API BaaS works and to use it effectively.

How would you be using API BaaS?

If you're considering using API BaaS for your data store needs, be sure to consider the following questions.

What will be your data access pattern?

This is the most important question. If you will be accessing data by an entity "key" (such as its UUID or name property), your requests will be very fast and will scale well.

However, if you intend to access the data using a query string -- such as ql=select * where color='red' and size='large' and shape='circle' and lineWeight='heavy' and lineColor='blue' and overlay='Circle' and other='things' -- your requests will be slower and will not scale as well.
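To make the difference concrete, the following minimal sketch builds the two kinds of request URL. The base URL, organization, application, and collection names are hypothetical placeholders, and the helper functions are illustrative only; they are not part of any API BaaS SDK. The sketch only constructs URLs and sends nothing over the network.

```python
from urllib.parse import quote, urlencode

# Hypothetical values; substitute the ones for your own deployment.
BASE_URL = "https://baas.example.com"
ORG, APP, COLLECTION = "my-org", "my-app", "shapes"


def entity_url(key: str) -> str:
    """URL for direct access by entity key (UUID or name): the fast path."""
    return f"{BASE_URL}/{ORG}/{APP}/{COLLECTION}/{quote(key, safe='')}"


def query_url(ql: str) -> str:
    """URL for access by query string: the slower path."""
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    return f"{BASE_URL}/{ORG}/{APP}/{COLLECTION}?{urlencode({'ql': ql}, quote_via=quote)}"


print(entity_url("circle_42"))
print(query_url("select * where color='red' and size='large'"))
```

If most of your reads can be expressed with the first helper rather than the second, your access pattern is likely a good fit.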

How will your data be uploaded or created?

API BaaS is ideal for transactional data that receives small updates addressed by an entity's UUID or name property, as you would typically find in a mobile or web app. However, as the size of the entities and the number of transactions per second grow, latency and scalability will suffer.

How large are your data entities?

API BaaS has a size limit which defaults to 1MB for JSON entities. While you can have the limit expanded, requiring a higher limit might mean that API BaaS isn't the best choice for your needs. Consider whether using another storage technology, such as Amazon S3, might be better for large entities.
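One way to find out early whether your entities approach the default limit is to measure their serialized size before uploading. The sketch below assumes that the limit applies to the UTF-8 encoded JSON body and that 1MB means 1024 * 1024 bytes; neither detail is confirmed here, so treat the threshold as approximate. The function names are illustrative.

```python
import json

# Assumption: the default limit is 1 MiB of UTF-8 encoded JSON.
DEFAULT_ENTITY_LIMIT_BYTES = 1024 * 1024


def entity_size_bytes(entity: dict) -> int:
    """Size of the entity when serialized as compact UTF-8 JSON."""
    body = json.dumps(entity, separators=(",", ":"), ensure_ascii=False)
    return len(body.encode("utf-8"))


def fits_default_limit(entity: dict, limit: int = DEFAULT_ENTITY_LIMIT_BYTES) -> bool:
    """True if the serialized entity is no larger than the limit."""
    return entity_size_bytes(entity) <= limit


small = {"name": "person_123", "lastName": "west"}
large = {"name": "report_1", "body": "x" * (2 * 1024 * 1024)}

print(entity_size_bytes(small), fits_default_limit(small))
print(entity_size_bytes(large), fits_default_limit(large))
```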

What is your total dataset size?

As of May 2015, the limit on API BaaS storage is 250GB per customer for new customers. Exceeding this threshold incurs additional cost.

Great uses for API BaaS

Here's a list of features and applications for which API BaaS is ideal.

Purpose | Description
Mobile apps | Support for social (Facebook) login and application-level user management. API BaaS also supports push notifications for iOS, Android, and Windows Phone.
Push notifications | API BaaS supports targeting push notifications to individual users or groups of users.
Social apps | A graph data model is ideal for social relationships, which are arbitrary in nature: there is not necessarily a predefined implicit relationship between any two entities.
Almost any "domain" of data | API BaaS is great for data "domains" such as user profile, stock/inventory, product catalog, user tokens, shopping cart, and order tracking. The data domain is less important than how the data is accessed and at what frequency.
Multi-region DN | API BaaS has multi-region support built on top of Cassandra replication and custom ElasticSearch replication.
Single-entity GETs at scale | As a NoSQL store, API BaaS is very fast and scalable when retrieving one of millions of entities for many concurrent consumers.
Small query result sets | Queries that return small result sets work well. If you need to return over 1,000 entities, a JSON document stored as a file via the API BaaS Assets API may be a better option.

Poor uses for API BaaS

If you anticipate needing any of the following features, API BaaS is likely not the best choice.

Purpose | Description
File/Bulk Uploads | These are not currently supported.
Large updates | If you have, for example, 500M entities that you need to upload every hour, API BaaS likely won't be a good solution. This type of load is not supported at this time.
Large Queries / Exports / Retrievals / Full Collection Iteration | API BaaS (and Cassandra) enables very fast retrieval of one entity out of a large number of entities (random I/O). It does not perform well when retrieving an entire collection of more than 1,000 entities.
Collection-to-Collection relationships | These kinds of relationships, such as cross-table JOINs, are an RDBMS concept that does not apply to API BaaS. In a graph data model such as the one in API BaaS, relationships run from one distinct entity to another distinct entity, and every relationship has to be established explicitly with an API call. There is no bulk option for this.
Cross-table joins | These are not supported in API BaaS. In other words:
  • Joins such as the following are not supported:
    SELECT a.name, b.cost FROM products a JOIN prices b on a.name=b.name
  • Aggregations such as the following are not supported:
    SELECT a.category, count(b.promotion) FROM products a JOIN promotions b on a.name=b.name GROUP BY a.category
HIPAA | API BaaS is not HIPAA-compliant at this time.
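Because joins and aggregations are unavailable, an application that needs them has to retrieve the entities and combine them itself. The sketch below shows roughly what that client-side work looks like for the two SQL statements above. The sample entities are invented for illustration and stand in for data already fetched from three separate collections. Given the caution about retrieving more than 1,000 entities, this approach only suits small collections.

```python
from collections import defaultdict

# Hypothetical entities, as if already retrieved from three collections.
products = [
    {"name": "widget", "category": "tools"},
    {"name": "gadget", "category": "tools"},
    {"name": "teapot", "category": "kitchen"},
]
prices = [
    {"name": "widget", "cost": 9.99},
    {"name": "teapot", "cost": 24.50},
]
promotions = [
    {"name": "widget", "promotion": "spring-sale"},
    {"name": "gadget", "promotion": "spring-sale"},
    {"name": "teapot", "promotion": None},
]


def index_by_name(entities):
    """Group entities by their name property so matches can be looked up quickly."""
    index = defaultdict(list)
    for entity in entities:
        index[entity["name"]].append(entity)
    return index


def join_costs(products, prices):
    """Client-side analogue of the products/prices JOIN on name."""
    prices_by_name = index_by_name(prices)
    return [
        (a["name"], b["cost"])
        for a in products
        for b in prices_by_name.get(a["name"], [])
    ]


def promotions_per_category(products, promotions):
    """Client-side analogue of count(b.promotion) ... GROUP BY a.category."""
    promos_by_name = index_by_name(promotions)
    counts = {}
    for a in products:
        for b in promos_by_name.get(a["name"], []):
            counts.setdefault(a["category"], 0)
            if b["promotion"] is not None:  # count(column) ignores NULLs
                counts[a["category"]] += 1
    return counts


print(join_costs(products, prices))
print(promotions_per_category(products, promotions))
```

If your application depends on many operations like these, that is a sign an RDBMS may fit better.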

Data storage in API BaaS

API BaaS offers the following three aspects of storage:

Storage Feature | Description
Key / Value storage | Provided by Cassandra. Reading by "key" (UUID or name) yields the fastest performance. Examples of key/value pairs:
  • Key : lastName / Value : west
  • Key : person_123 / Value : { "name": "Jeff West"}
Graph Data model / relationships | "Graph edges" are another name for "connections" in API BaaS. Each edge has a property that describes it, such as "likes" or "follows." For example, consider the following idea: "jeff likes apple."

Here, "jeff" and "apple" are entities, while "likes" is the "connection" -- in other words, the "graph edge".

Indexing for "contains" and range queries | Indexing and querying are provided by ElasticSearch. This allows you to query using a variety of operators. Querying is slower than direct key access.

Although these queries look like SQL, they do not support many SQL functions such as count(*) or cross-table joins (remember that API BaaS doesn't have tables at all).

Here are a couple of examples:

GET /org/app/collection?ql=select * where color='red'
GET /org/app/collection?ql=select * where description contains 'red'
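
The three storage aspects described in this section can be pictured with a toy in-memory model. This is only an illustration of the shape of each operation: it says nothing about how API BaaS, Cassandra, or ElasticSearch are implemented, and all names and sample entities are invented. Here a dictionary stands in for key/value storage, a set of triples for graph edges, and a linear scan for indexed queries.

```python
# Toy stand-ins for the three storage aspects; not a model of API BaaS internals.

# 1. Key/value storage: each read is a single lookup by key.
entities = {
    "jeff": {"type": "user", "name": "jeff", "description": "likes red apples"},
    "apple": {"type": "fruit", "name": "apple", "color": "red", "description": "a red fruit"},
    "pear": {"type": "fruit", "name": "pear", "color": "green", "description": "a green fruit"},
}

# 2. Graph edges ("connections"): (source, verb, target) triples.
edges = {("jeff", "likes", "apple")}


def get_by_key(key):
    """Direct access by key."""
    return entities[key]


def connected(source, verb):
    """Keys of entities reachable from source over edges labeled verb."""
    return sorted(t for s, v, t in edges if s == source and v == verb)


def where_equals(prop, value):
    """Rough analogue of: select * where <prop>='<value>'."""
    return sorted(k for k, e in entities.items() if e.get(prop) == value)


def where_contains(prop, word):
    """Rough analogue of: select * where <prop> contains '<word>'."""
    return sorted(k for k, e in entities.items() if word in str(e.get(prop, "")).split())


print(get_by_key("jeff"))
print(connected("jeff", "likes"))
print(where_equals("color", "red"))
print(where_contains("description", "red"))
```

Only get_by_key touches a single entity. The two query helpers have to consider every entity, which loosely mirrors why query access costs more than key access even with an index behind it.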