When working with the Data Exports product, there are a handful of best practices that we recommend following for the simplest and most straightforward integration of the product and usage of the data.

Identifiers

The IDs on all associated resources point to the same identifiers in Orb’s system. For example, the invoice_id field in the Daily Line Item Revenue resource points to the row in the Invoice Metadata resource with that ID. This ID also points to the same invoice in Orb’s system (e.g. through querying in the API).

Data updates

Updates in the data export resources are tracked with the updated_at column on each row. This timestamp is used internally by Orb’s system to track when data needs to be exported to data destinations. This timestamp is not an indicator of when the underlying resource was updated in Orb’s system.

Deletion markers

Orb exports deletion markers to indicate that a resource has been deleted. In general, the majority of Orb’s resources will have a marker, deleted_at, which will be set to the time of deletion, or null if the resource has not been deleted. Certain resources, are append-only and will never be deleted, these do not have a deleted_at field present. The following resources do not have a deletion marker:
  • Credit Ledger Entry
  • Daily Line Item Revenue
  • Invoice
  • Invoice Line Item Billing

Webhooks

To help gain better visibility into the operational status of data export process, Orb sends webhooks about transfer successes and failures. These webhooks are documented in Orb’s webhooks documentation and can be used to instruct downstream processes. For example, you may need to run a downstream job on the raw Orb data to transform and join it against other data sources in your system. This can easily be implemented by listening for the data_exports.transfer_success webhook. On the flip side, if you want to be notified when transfers fail, the corresponding data_exports.transfer_error webhook can be used.

Revert and void timestamps

Both the Daily Line Item Revenue and Credit Ledger Entry resources contain a reverted_at column which indicates when the underlying entry was reverted. This can happen when changes occur in the past (e.g. a subscription cancellation is backdated) that undo operations that have been recorded. In general, we recommend filtering out rows that were reverted when trying to understand the current state. These columns can be used to construct point-in-time queries to understand the state at a given time, by filtering to rows that were active at the time in question. The Invoice Line Item Billing column contains a similar column for voided data called voided_at. This column can be used to help understand data that has been voided. This can happen when an invoice was originally sent but was later voided. As with reverted data, we recommend filtering out voided data for most queries, but your exact reporting needs may differ.

Lock adjusted timestamps

Accounting period locks in Orb control which monthly time period revenue is recognized. As a result, relevant exported resources that affect revenue will include two timestamps for reporting purposes.
Field NameTypeDescription
timestampdatetimeThe true time that the entry was recognized.
lock_adjusted_timestampdatetimeThe time that the entry should be recognized with respect to the accounting period lock posture at the original time.
is_revertbooleanA flag indicating whether this resource represents the creation or reversion of the underlying entry.
The following resources contain these fields:
  • Credit Note Line Item Event
  • Committed Ledger Entry Event
  • Daily Line Item Revenue Event
  • Invoice Line Item Billing Event
When an underlying entry is created, one row with is_revert = False will be exported. If that underlying entry is later voided/reverted, another row with is_revert = True will be exported. Consequently, each underlying entry for these resources in Orb can map to up to two exported resource entries. To recognize revenue with respect to the accounting period lock posture at the time the resource was created or reverted, use the lock_adjusted_timestamp field. Based on the accounting period lock posture at timestamp, the original time of recognition, the lock_adjusted_timestamp may be shifted to the next open accounting period if the period that timestamp falls in was closed.

Created timestamps

The created_at field captures the exact moment when a resource (customer, subscription, invoice, etc.) was first created in Orb’s system, regardless of when it was last modified or when other business events occurred. How is it different from other timestamps?
  • created_at: When the resource was originally created in Orb
  • updated_at: When the resource was last modified
  • start_date/end_date: Business-specific dates (e.g., when a subscription starts/ends)
The created_at timestamp is particularly valuable for:
  • Historical analysis: Re-create analyses representative of a specific point in time
  • Backdated actions: Distinguish between when something was created vs. when it was backdated to take effect
  • Data auditing: Track the chronological order of resource creation
  • Cohort analysis: Group customers or subscriptions by their actual creation date, not their business effective date
For example, if you backdate a subscription to start January 1st but create it on February 15th:
  • start_date = January 1st (business effective date)
  • created_at = February 15th (when it was actually created in Orb)
This enables you to analyze “all subscriptions created in February” vs. “all subscriptions starting in January.”

Append-only destinations

Orb currently supports S3 as a destination for sending Orb’s data. Since S3 is immutable, changes to data are sent as new records instead of updating existing records, because there is no concept of an “update” in S3; only a create is supported. For customers using S3 as a data destination, this means that you will need to deduplicate the data on your end to avoid showing duplicate information. This can be achieved by fetching the latest entry for a given id based on the updated_at timestamp. Below is a sample snippet of SQL to achieve that:
Deduping data from S3
WITH ordered AS (
  SELECT *
       , ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) AS row_number

  FROM <resource>
)

SELECT *
FROM ordered
WHERE row_number = 1;

Deprecated resources

Where possible, we recommend avoiding using any deprecated resources.