Hey backend developer,
Well, awesome dudes automated that job, at least for rather-simple projects (scroll to the end of this article for a note about complex APIs).
Congrats, you’re on the way of being a much more efficient backend developer.
THE TRUE POWER OF DATABASES
It is fascinating how lots of backend devs had to try no-SQL databases to fully understand why relational databases were awesome in the first place.
A lot a database systems in companies are badly managed. The data is in here, but it lacks tactics to ensure consistency and cleanliness of the data. Perhaps a lack of specific database skills? It is true that every database system has its own behaviour and syntax, which definitely does not help.
Therefore the constraints of format and security are handled in API servers. To simplify the developer experience, the wonderful anti-pattern of ORMs have been created. For those who do not know what an ORM is, it is basically a library you use in your server code that maps the database to virtual models.
The goal is to handle data differently in your server than in your database. But. You need to code for that. A lot.
Most often, ORMs are used to manipulate the database without having to use the SQL syntax. It can cache the rows you use often, etc. The use case is even worse here, because you are mounting a virtual database inside your app that still requests sometimes your real database. Complexity is your enemy.
And, you are not improving your database skills as you have to learn and use your ORM syntax (which are all specific).
Just quit ORMs.
YOUR DATABASE SHOULD BE YOUR SINGLE SOURCE OF TRUTH.
The database should handle the formatting. The database should handle the permissions (see next chapter). The database should handle the core business functions, like prices, reductions from promotions, etc.
I am not saying you will not have to duplicate some of the behaviour in the frontend to improve the UX of the user. But please, let the database have the last words.
Use triggers to react on changes. Use functions to improve development maintainability. Use views to restrict or aggregate tables.
HERE COMES POSTGREST.
Traditionally, you build an API server and mount an ORM just to manipulate the data for your basic API needs (CRUD = Create, Retrieve, Update or Delete rows). With postgREST, you don’t have to do that job anymore. It exposes a generic API from a PostgreSQL schema you provide.
It maps HTTP verbs to SQL operations:
• GET → SELECT …
• POST → INSERT INTO …
• PATCH → UPDATE …
• DELETE → DELETE …
It maps URLs routes to views, tables and functions in your schema:
• GET /users → SELECT * FROM schema.users;
• POST /rpc/login → SELECT schema.login();
See the full documentation to know about every details.
BUILT-IN SECURITY: POSTGRESQL ROLES
Often, authorization checking is done in the API server (when it’s done).
At each API route, you try to restrict what the issuer can access:
Can an anonymous visitor access this data?
Can a logged user access that data?
Can this particular user access these rows?
Bury that practice deep down. Then light it on fire.
1. You’re coding it → you are wasting time
2. If your database login gets compromised → the hacker has access to the full database
Isn’t it a good practice to restrict the access?
Of course it is. But not that way.
• you are coding these restrictions at each route
• you are coding these restrictions for each app that uses the DB
• testing is painful: for each route/app you have to make sure the restriction works correctly
I am pleased to introduce you to your new friends: PostgreSQL Roles!
Yes, the DB has already every tool you need to make your data safe.
• Privileges: for each database object (tables, views, functions, etc) specify what each role should have access to. Example: restrict an UPDATE query on a table for logged users only.
• Row-level security: you can configure a role to only have access to particular rows. A user should be able to only modify his own data. The data that don’t match the row-level security is not even returned by the DB. That means that even if a login gets compromised, worst case scenario is the hacker can see only the rows associated with that login. Insane.
Roles can belong to role groups to simplify your restriction declarations:
• users have access to this but authenticated users have only access to that
• my third-party apps can have access to this but only that one have access to these
DELEGATE THE COMPLEX: BUILD A STRONG STACK
Yes, you can do a lot with a PostgreSQL database. But no, there are still use cases where you need an external server:
This instruction allows PostgreSQL to send notifications. You can then listen to these notifications on another server and handle what needs to be handled.
• build a server that does one job: listen to notifications and pushes them in an asynchronous queue (rabbitMQ or a cloud queue like PubSubfor GoogleCloudPlatform).
• then build another server that listens on a specific queue and handles the data
• if your task server crashes, you won’t break the notification reception
TRANSACTIONS: TEST IN-PROD AND MIGRATE
Databases have the most powerful tool ever: transactions.
In a transaction, stuff happens without incidence in the real data. At the end of the stuff happening, you can decide to either COMMIT the transaction (that is, make it applied on the production data) or ROLLBACK it. If you rollback it, the real data won’t be affected by what happened in the transaction.
Which means …
You can test the production database by adding users, deleting clients, running functions, be messy with the database, then ROLLBACK. You will never ever harm the production data.
And THIS is powerful.
You can use libraries like pgtap to make testing easy. I did code my own testing framework for my needs, took me a couple of hours for basic features and great outcome.
I basically call the same procedures that a user would use (postgREST maps views, tables and procedures to urls). Then I check that the inner data is correct, that triggers are running correctly, that data delegated to my external server cluster is correct, etc. Then I rollback everything. And I am happy because I know the core features are rock solid on production.
Just be careful about structure queries, some of those aren’t running in a transaction (like creating types).
MASTER YOUR LOAD: SCALING UP
Well, well, well.
You did everything well. Strong structure and in-database data validation, finely tuned privileges restriction, perfectly delegated complex tasks and amazingly covered all that with automated tests while being in production.
Does all that scale waaay up when success is here and users are crazy?
subsecond response times for up to 2000 requests/sec on Heroku free tier.
2000 requests/sec on Heroku free tier.
on Heroku free tier.
If you are using interpreted languages, you cannot beat those metrics.
PostgREST is written in Haskell, and delegates much of the calculation to the PostgreSQL database.
If you are under heavy load, just scale postgREST by raising the number of instances.
If the servers handling the complex tasks delegation are suffering, just scale them up as well. Just make sure they are stateless (they should).
PostgreSQL databases are strong, they can handle gigs of relational data without shaking.
If you need to scale your DB, with or without postgREST you would have needed to do it anyway.
It involves master / slave replication (usually master for writing as it is often less intensive, then slaves for reading), sharding (splits your rows between your cluster of database nodes), and cluster management (what do you do if the master breaks down, do you promote a slave? Which one? How do the other slaves know who is the master now?).
→ Internet has a lot to offer: Replication, Clustering and Connection Pooling
QUICK NOTE FOR COMPLEX PROJECTS
When you are not dealing with an API that connects requests to database in a simple way, you might want to avoid postgREST. If your API is fetching data from an external source before giving it to your DB, you will need to build a server with the language of your choice. But you can still use postgREST to communicate internally, just don’t reinvent the wheel
And the same guys are building SubZero for doing GraphQL & REST with ease, in the same fashion than postgREST.