Best practices
When working with the Data Exports product, there are a handful of best practices that we recommend following for the simplest and most straightforward integration of the product and usage of the data.
Identifiers
The IDs on all associated resources point to the same identifiers in Orb’s system.
For example, the invoice_id
field in the Daily Line Item Revenue resource points to the row in the Invoice Metadata resource with that ID.
This ID also points to the same invoice in Orb’s system (e.g. through querying in the API).
Data updates
Updates in the data export resources are tracked with the updated_at
column on each row.
This timestamp is used internally by Orb’s system to track when data needs to be exported to data destinations.
This timestamp is not an indicator of when the underlying resource was updated in Orb’s system.
Deletion markers
Orb exports deletion markers to indicate that a resource has been deleted.
In general, the majority of Orb’s resources will have a marker, deleted_at
, which will be set to the time of deletion, or null if the resource has not been deleted.
Certain resources, are append-only and will never be deleted, these do not have a deleted_at
field present.
The following resources do not have a deletion marker:
- Credit Ledger Entry
- Daily Line Item Revenue
- Invoice
- Invoice Line Item Billing
Webhooks
To help gain better visibility into the operational status of data export process, Orb sends webhooks about transfer successes and failures. These webhooks are documented in Orb’s webhooks documentation and can be used to instruct downstream processes.
For example, you may need to run a downstream job on the raw Orb data to transform and join it against other data sources in your system.
This can easily be implemented by listening for the data_exports.transfer_success
webhook.
On the flip side, if you want to be notified when transfers fail, the corresponding data_exports.transfer_error
webhook can be used.
Revert and void timestamps
Both the Daily Line Item Revenue and Credit Ledger Entry resources contain a reverted_at
column which indicates when the underlying entry was reverted.
This can happen when changes occur in the past (e.g. a subscription cancellation is backdated) that undo operations that have been recorded.
In general, we recommend filtering out rows that were reverted when trying to understand the current state.
These columns can be used to construct point-in-time queries to understand the state at a given time, by filtering to rows that were active at the time in question.
The Invoice Line Item Billing column contains a similar column for voided data called voided_at
.
This column can be used to help understand data that has been voided.
This can happen when an invoice was originally sent but was later voided. As with reverted data, we recommend filtering out voided data for most queries, but your exact reporting needs may differ.
Append-only destinations
Orb currently supports S3 as a destination to send Orb’s data to. Since S3 is immutable, changes to data are sent as new records instead of updating the existing records because there is no concept of an “update” in S3, only a create. For customers using S3 as a data destination, this means that you will need to deduplicate the data on your end to avoid showing duplicate information.
This can be achieved by fetching the latest entry for a given id
based on the updated_at
timestamp.
Below is a sample snippet of SQL to achieve that:
Was this page helpful?